我有一个数据框,grouped具有以下的multiindex列:
grouped
import pandas as pd codes = ["one","two","three"]; colours = ["black", "white"]; textures = ["soft", "hard"]; N= 100 # length of the dataframe df = pd.DataFrame({ 'id' : range(1,N+1), 'weeks_elapsed' : [random.choice(range(1,25)) for i in range(1,N+1)], 'code' : [random.choice(codes) for i in range(1,N+1)], 'colour': [random.choice(colours) for i in range(1,N+1)], 'texture': [random.choice(textures) for i in range(1,N+1)], 'size': [random.randint(1,100) for i in range(1,N+1)], 'scaled_size': [random.randint(100,1000) for i in range(1,N+1)] }, columns= ['id', 'weeks_elapsed', 'code','colour', 'texture', 'size', 'scaled_size']) grouped = df.groupby(['code', 'colour']).agg( {'size': [np.sum, np.average, np.size, pd.Series.idxmax],'scaled_size': [np.sum, np.average, np.size, pd.Series.idxmax]}).reset_index() >> grouped code colour size scaled_size sum average size idxmax sum average size idxmax 0 one black 1031 60.647059 17 81 185.153944 10.891408 17 47 1 one white 481 37.000000 13 53 204.139249 15.703019 13 53 2 three black 822 48.352941 17 6 123.269405 7.251141 17 31 3 three white 1614 57.642857 28 50 285.638337 10.201369 28 37 4 two black 523 58.111111 9 85 80.908912 8.989879 9 88 5 two white 669 41.812500 16 78 82.098870 5.131179 16 78 [6 rows x 10 columns]
如何将列索引级别展平/合并为:“ Level1 | Level2”,例如size|sum,scaled_size|sum。等等?如果这不可能,是否有办法groupby()像我上面所做的那样不创建多索引列?
size|sum
scaled_size|sum
groupby()
您可以随时更改列:
grouped.columns = ['%s%s' % (a, '|%s' % b if b else '') for a, b in grouped.columns]