I have 2 CSV files, "Data" and "Mapping":
Device_Name
GDN
Device_Type
Device_OS
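I know how to use a dict when only 2 columns are present (1 column needs to be mapped), but I don't know how to accomplish this when 3 columns need to be mapped.
Below is the code I used to try to map Device_Type: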
    x = dict([])
    with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
        file_map = csv.reader(in_file1, delimiter=',')
        for row in file_map:
            typemap = [row[0],row[2]]
            x.append(typemap)

    with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
        writer = csv.writer(out_file, delimiter=',')
        for row in csv.reader(in_file2, delimiter=','):
            try:
                row[27] = x[row[11]]
            except KeyError:
                row[27] = ""
            writer.writerow(row)
It raises an AttributeError.
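After some research, I think I need to create a nested dict, but I have no idea how to do this.
A nested dict is a dictionary within a dictionary. A very simple thing.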
    >>> d = {}
    >>> d['dict1'] = {}
    >>> d['dict1']['innerkey'] = 'value'
    >>> d['dict1']['innerkey2'] = 'value2'
    >>> d
    {'dict1': {'innerkey': 'value', 'innerkey2': 'value2'}}
You can also use a defaultdict from the collections package to facilitate creating nested dictionaries.
    >>> import collections
    >>> d = collections.defaultdict(dict)
    >>> d['dict1']['innerkey'] = 'value'
    >>> d  # currently a defaultdict type
    defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
    >>> dict(d)  # but is exactly like a normal dictionary.
    {'dict1': {'innerkey': 'value'}}
You can populate that however you want.
I would recommend something like the following in your code:
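As a small sketch of how this might apply to the device mapping in the question, the snippet below fills a defaultdict(dict) keyed by device name. The sample rows and their column order are invented for illustration. Note that merely indexing a missing key on a defaultdict creates an empty entry, so the lookup for an unknown device uses .get() instead.

```python
import collections

# Hypothetical sample rows: Device_Name, GDN, Device_Type, Device_OS
rows = [
    ["iPhone", "Yes", "Phone", "iOS"],
    ["Nexus 7", "No", "Tablet", "Android"],
]

mapping = collections.defaultdict(dict)
for name, gdn, device_type, device_os in rows:
    # the inner dict is created automatically on first access
    mapping[name]["GDN"] = gdn
    mapping[name]["Device_Type"] = device_type
    mapping[name]["Device_OS"] = device_os

# .get() avoids creating an empty entry for an unknown device
unknown = mapping.get("Kindle", {})
iphone_type = mapping["iPhone"]["Device_Type"]
```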
    d = {}  # can use defaultdict(dict) instead

    for row in file_map:
        # derive row key from something
        # when using defaultdict, we can skip the next step creating a dictionary on row_key
        d[row_key] = {}
        for idx, col in enumerate(row):
            d[row_key][idx] = col
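For the files in the question, one way to derive the row key would be to take the first column (the device name) and to use the header row for the inner keys instead of column indexes. A minimal sketch under those assumptions, written for Python 3, with an in-memory string standing in for the mapping file and invented sample values:

```python
import csv
import io

# Stand-in for the mapping file; the rows are invented sample data
mapping_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "iPhone,Yes,Phone,iOS\n"
    "Nexus 7,No,Tablet,Android\n"
)

file_map = csv.reader(mapping_csv)
headers = next(file_map)  # first row holds the column names

d = {}
for row in file_map:
    row_key = row[0]  # assumption: the first column is the key
    # inner dict maps each remaining header to its value in this row
    d[row_key] = dict(zip(headers[1:], row[1:]))
```

With this layout, d["iPhone"]["Device_Type"] looks up a single mapped value, and d.get(name, {}) handles device names that have no mapping.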
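According to your comment:
Maybe the code above is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i, j, k, l, and b.csv also has these columns. i is the key column for these CSVs. The j, k, l columns are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using "i" as the key column.
My suggestion would be something like this (without using defaultdict):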
    a_file = "path/to/a.csv"
    b_file = "path/to/b.csv"

    # read from file a.csv
    with open(a_file) as f:
        # skip headers
        next(f)
        # get first column as keys; build the set while the file is still open
        keys = set(line.strip().split(',')[0] for line in f)

    # create empty dictionary:
    d = {}

    # read from file b.csv
    with open(b_file) as f:
        # gather headers except first key header
        headers = next(f).strip().split(',')[1:]
        # iterate lines
        for line in f:
            # gather the columns
            cols = line.strip().split(',')
            # check to make sure this key should be mapped.
            if cols[0] not in keys:
                continue
            # add key to dict
            d[cols[0]] = dict(
                # inner keys are the header names, values are columns
                (headers[idx], v) for idx, v in enumerate(cols[1:]))
Please note, though, that for parsing CSV files there is a csv module.
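A rough sketch of the same a.csv/b.csv mapping using csv.DictReader and csv.DictWriter follows. It assumes Python 3 and that both files share a header row whose first column is the key. The function name fill_from_mapping is only an illustrative choice, and file-like objects are used so the example runs without touching the disk.

```python
import csv
import io


def fill_from_mapping(data_file, mapping_file, out_file):
    """Copy data_file to out_file, filling columns from mapping_file by key."""
    mapping_reader = csv.DictReader(mapping_file)
    key = mapping_reader.fieldnames[0]  # assumption: first column is the key
    # nested dict: key value -> {column name: value}
    lookup = {row[key]: row for row in mapping_reader}

    data_reader = csv.DictReader(data_file)
    writer = csv.DictWriter(out_file, fieldnames=data_reader.fieldnames)
    writer.writeheader()
    for row in data_reader:
        match = lookup.get(row[key], {})
        for column in data_reader.fieldnames:
            if column != key:
                # leave the cell empty when the key has no mapping
                row[column] = match.get(column, "")
        writer.writerow(row)


# Invented sample contents standing in for a.csv and b.csv
a_csv = io.StringIO("i,j,k,l\nx,,,\ny,,,\n")
b_csv = io.StringIO("i,j,k,l\nx,1,2,3\nz,7,8,9\n")
result = io.StringIO()
fill_from_mapping(a_csv, b_csv, result)
```

When working with real files in Python 3, the csv documentation recommends opening them with newline="" so the module can handle line endings itself.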