group_id, application_id, reading
1, a1, 0.1
1, a1, 0.2
1, a1, 0.4
1, a1, 0.3
1, a1, 0.0
1, a1, 0.9
2, b1, 0.1
2, b1, 0.2
2, b1, 0.4
2, b1, 0.3
2, b1, 0.0
2, b1, 0.9
.....
n, x, 0.3 (let's say)
I want to split this file by group_id, so that the output is n files, where n=group_id.
Output:

File 1
1, a1, 0.1
1, a1, 0.2
1, a1, 0.4
1, a1, 0.3
1, a1, 0.0
1, a1, 0.9

and

File 2
2, b1, 0.1
2, b1, 0.2
2, b1, 0.4
2, b1, 0.3
2, b1, 0.0
2, b1, 0.9
.....

File n
n, x, 0.3 (let's say)
How can I do this efficiently?
If the file is already sorted by group_id, you can do something like this:
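    import csv
    from itertools import groupby

    with open("foo.csv", newline="") as source:
        reader = csv.reader(source)
        next(reader, None)  # skip the header row (remove this line if there is none)
        # groupby yields one (key, rows) pair per run of consecutive equal group_ids
        for key, rows in groupby(reader, key=lambda row: row[0]):
            with open("%s.txt" % key, "w") as output:
                for row in rows:
                    output.write(",".join(row) + "\n")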
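If the file is not sorted by group_id, groupby starts a new group every time the key changes, so the same output file would be reopened and overwritten and earlier rows would be lost. Here is a minimal sketch for that case. It assumes the whole file fits in memory and that the first line is a header. The names split_by_group and output_dir are made up for illustration, and unlike the snippet above it strips the space after each comma.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_group(source_path, output_dir="."):
    """Split a CSV on its first column without requiring sorted input.

    Returns a dict mapping each group_id to the number of rows written.
    """
    rows_by_group = defaultdict(list)
    with open(source_path, newline="") as source:
        reader = csv.reader(source, skipinitialspace=True)
        next(reader, None)  # assumption: the first line is a header
        for row in reader:
            if row:  # ignore blank lines
                rows_by_group[row[0]].append(row)

    out = Path(output_dir)
    for group_id, rows in rows_by_group.items():
        # One file per group, named after the group_id (hypothetical naming scheme).
        with open(out / f"{group_id}.txt", "w", newline="") as output:
            csv.writer(output, lineterminator="\n").writerows(rows)
    return {group_id: len(rows) for group_id, rows in rows_by_group.items()}
```

Calling split_by_group("foo.csv") would write 1.txt, 2.txt, and so on into the current directory. If the file is too large to buffer, sorting it by group_id first and then using groupby is probably the better route.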