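I want to read an Excel workbook with 15 fields and about 2000 rows, and convert each row to a dictionary in Python. I then want to append each dictionary to a list. I'd like each field in the top row of the workbook to be a key in each dictionary, with the corresponding cell value as the dictionary value. I have already looked at the examples here and here, but I'd like to do something a bit different. The second example works, but I feel it would be more efficient to loop over the top row once to populate the dictionary keys and then iterate through each row to get the values. My Excel file contains data from discussion forums and looks something like this (obviously with more columns):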

    id    thread_id    forum_id    post_time     votes    post_text
    4     100          3           1377000566    1        'here is some text'
    5     100          4           1289003444    0        'even more text here'

So, I'd like the fields id, thread_id, and so on to be the dictionary keys. I'd like my dictionaries to look like:

    {'id': 4, 'thread_id': 100, 'forum_id': 3, 'post_time': 1377000566, 'votes': 1, 'post_text': 'here is some text'}

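Initially, I had some code like this iterating through the file, but my scope is wrong for some of the for-loops and I'm generating far too many dictionaries. Here's my initial code: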

    import xlrd
    from xlrd import open_workbook, cellname

    book = open_workbook('forum.xlsx', 'r')
    sheet = book.sheet_by_index(3)

    dict_list = []

    for row_index in range(sheet.nrows):
        for col_index in range(sheet.ncols):
            d = {}

            # My intuition for the below for-loop is to take each cell in the top row of the
            # Excel sheet and add it as a key to the dictionary, and then pass the value of
            # current index in the above loops as the value to the dictionary. This isn't
            # working.

            for i in sheet.row(0):
                d[str(i)] = sheet.cell(row_index, col_index).value
                dict_list.append(d)

Any help would be greatly appreciated. Thanks in advance for reading.
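The idea is to first read the header row into a list. Then iterate over the sheet rows (starting from the row after the header), create a new dictionary from the header keys and the matching cell values, and append it to the list of dictionaries: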

    from xlrd import open_workbook

    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read header values into the list
    keys = [sheet.cell(0, col_index).value for col_index in range(sheet.ncols)]

    dict_list = []
    # start at row 1 so the header row is not turned into a dictionary
    for row_index in range(1, sheet.nrows):
        d = {keys[col_index]: sheet.cell(row_index, col_index).value
             for col_index in range(sheet.ncols)}
        dict_list.append(d)

    print(dict_list)

For a sheet containing:

    A    B    C    D
    1    2    3    4
    5    6    7    8

it prints (the key order within each dictionary may differ depending on the Python version):

    [{'A': 1.0, 'C': 3.0, 'B': 2.0, 'D': 4.0}, {'A': 5.0, 'C': 7.0, 'B': 6.0, 'D': 8.0}]

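To see why the loops in the question produce too many dictionaries, here is a rough sketch that replays both loop structures on plain lists instead of a worksheet. It assumes the `append` call in the question sits inside the innermost loop, and it uses plain strings as keys rather than `str()` of xlrd cell objects; the function names `nested_loop_version` and `header_first_version` are made up for this illustration.

```python
# Plain lists stand in for the worksheet; no Excel file is needed.
rows = [
    ["A", "B", "C", "D"],   # header row
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
]


def nested_loop_version(rows):
    """Mimic the question's loop structure (indentation is an assumption)."""
    dict_list = []
    for row in rows:                 # includes the header row
        for value in row:            # one pass per cell
            d = {}                   # a fresh dict for every cell
            for key in rows[0]:      # every header key gets the same value
                d[key] = value
                dict_list.append(d)  # appended once per header cell
    return dict_list


def header_first_version(rows):
    """Read the header once, then build one dict per data row."""
    keys = rows[0]
    return [
        {keys[i]: row[i] for i in range(len(keys))}
        for row in rows[1:]
    ]


too_many = nested_loop_version(rows)
expected = header_first_version(rows)

print(len(too_many))   # 3 rows * 4 columns * 4 header cells = 48
print(len(expected))   # 2, one per data row
```

Under that assumption, the nested version appends an entry for every combination of row, column, and header cell, and each of those dictionaries maps every key to a single cell's value. Reading the header once and looping only over the data rows gives one dictionary per row.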
UPD (expanding the dictionary comprehension):

    # inside the row loop, equivalent to the dictionary comprehension above
    d = {}
    for col_index in range(sheet.ncols):
        d[keys[col_index]] = sheet.cell(row_index, col_index).value

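As a further variation, the header keys can be paired with each row's values using `zip`, which avoids indexing by column altogether. The sketch below is only an illustration on in-memory data shaped like the forum sample from the question; `rows_to_dicts` is a hypothetical helper name, and with xlrd the rows could presumably be collected with `sheet.row_values(row_index)` for each row.

```python
def rows_to_dicts(rows):
    """Turn a header row plus data rows into a list of dictionaries."""
    keys = rows[0]
    # zip pairs each header key with the value in the same column
    return [dict(zip(keys, row)) for row in rows[1:]]


# In-memory stand-in for the worksheet contents shown in the question.
sample = [
    ["id", "thread_id", "forum_id", "post_time", "votes", "post_text"],
    [4, 100, 3, 1377000566, 1, "here is some text"],
    [5, 100, 4, 1289003444, 0, "even more text here"],
]

dict_list = rows_to_dicts(sample)
print(dict_list[0]["post_text"])  # here is some text
```

Note that `zip` stops at the shorter of its inputs, so a row with fewer cells than the header would silently yield a dictionary with missing keys.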