I have 2 CSV files, "Data" and "Mapping", with these columns:
Device_Name
GDN
Device_Type
Device_OS
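I know how to use a dict when only 2 columns are present (1 column needs to be mapped), but I don't know how to do this when 3 columns need to be mapped.
Here is the code I used to try to map Device_Type: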

    x = dict([])
    with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
        file_map = csv.reader(in_file1, delimiter=',')
        for row in file_map:
            typemap = [row[0],row[2]]
            x.append(typemap)

    with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
        writer = csv.writer(out_file, delimiter=',')
        for row in csv.reader(in_file2, delimiter=','):
            try:
                row[27] = x[row[11]]
            except KeyError:
                row[27] = ""
            writer.writerow(row)

It raises an AttributeError.
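After some research, I think I need to create a nested dict, but I have no idea how to do that.
A nested dict is a dictionary within a dictionary. It is a very simple thing.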

    >>> d = {}
    >>> d['dict1'] = {}
    >>> d['dict1']['innerkey'] = 'value'
    >>> d
    {'dict1': {'innerkey': 'value'}}

You can also use a defaultdict from the collections package to make creating nested dictionaries easier.

    >>> import collections
    >>> d = collections.defaultdict(dict)
    >>> d['dict1']['innerkey'] = 'value'
    >>> d  # currently a defaultdict type
    defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
    >>> dict(d)  # but is exactly like a normal dictionary.
    {'dict1': {'innerkey': 'value'}}

You can populate it however you want.
In your code, I would recommend something like the following:

    d = {}  # can use defaultdict(dict) instead

    for row in file_map:
        # derive row key from something
        # when using defaultdict, we can skip the next step creating a dictionary on row_key
        d[row_key] = {}
        for idx, col in enumerate(row):
            d[row_key][idx] = col

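To make the sketch above concrete, here is a small version that assumes the first column (Device_Name) is the row key and that the mapping columns appear in the order listed in the question. The sample rows and their values are made up for illustration, and the code is written for Python 3.

```python
# Hypothetical rows standing in for what csv.reader would yield from the
# mapping file. Assumed column order: Device_Name, GDN, Device_Type, Device_OS.
file_map = [
    ["iPhone 5", "Yes", "Phone", "iOS"],
    ["Nexus 7", "No", "Tablet", "Android"],
]

d = {}
for row in file_map:
    row_key = row[0]  # assumption: Device_Name is the lookup key
    d[row_key] = {}
    for idx, col in enumerate(row):
        d[row_key][idx] = col

# Look up Device_Type (column index 2), falling back to "" for unknown devices.
device_type = d.get("Nexus 7", {}).get(2, "")
missing = d.get("Unknown", {}).get(2, "")
```

Chaining `.get()` with an empty dict as the default plays the same role as the `try`/`except KeyError` in the question's code: an unknown device simply yields a blank value.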
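According to your comment:
"Maybe the code above is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i, j, k, l, and b.csv has these columns too. i is the key column for these CSVs. The j, k, l columns are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using i as the key column."
My suggestion would be something like this (without using defaultdict):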

    a_file = "path/to/a.csv"
    b_file = "path/to/b.csv"

    # read from file a.csv
    with open(a_file) as f:
        # skip headers
        next(f)
        # get first column as keys; build the set now, while the file is still open
        keys = set(line.strip().split(',')[0] for line in f)

    # create empty dictionary:
    d = {}

    # read from file b.csv
    with open(b_file) as f:
        # gather headers except first key header
        headers = next(f).strip().split(',')[1:]
        # iterate lines
        for line in f:
            # gather the columns
            cols = line.strip().split(',')
            # check to make sure this key should be mapped.
            if cols[0] not in keys:
                continue
            # add key to dict
            d[cols[0]] = dict(
                # inner keys are the header names, values are columns
                (headers[idx], v) for idx, v in enumerate(cols[1:]))

Please note, though, that for parsing CSV files there is the csv module.
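As a rough sketch of what the csv-module route might look like for the a.csv / b.csv case (Python 3): the in-memory strings below are hypothetical stand-ins for the two files, and the sample values are invented. With real files you would pass `open(path, newline="")` objects instead of `io.StringIO`.

```python
import csv
import io

# Hypothetical stand-ins for a.csv (j, k, l empty) and b.csv (populated).
a_csv = io.StringIO("i,j,k,l\nx,,,\ny,,,\nz,,,\n")
b_csv = io.StringIO("i,j,k,l\nx,1,2,3\ny,4,5,6\nw,7,8,9\n")

# Build the nested dict from b.csv: {key: {"j": ..., "k": ..., "l": ...}}
mapping = {}
for row in csv.DictReader(b_csv):
    key = row.pop("i")  # remove the key column, keep the rest as the inner dict
    mapping[key] = row

# Fill a.csv's empty columns from the mapping; rows without a match stay blank.
out = io.StringIO()
reader = csv.DictReader(a_csv)
writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
for row in reader:
    row.update(mapping.get(row["i"], {}))
    writer.writerow(row)

result = out.getvalue()
```

Using `csv.DictReader` means the inner keys come from the header row automatically, and quoted fields containing commas are handled correctly, which a plain `split(',')` does not do.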