我有以下数据框:
> df1 id begin conditional confidence discoveryTechnique 0 278 56 false 0.0 1 1 421 18 false 0.0 1 > df2 concept 0 A 1 B
如何合并索引以获得:
id begin conditional confidence discoveryTechnique concept 0 278 56 false 0.0 1 A 1 421 18 false 0.0 1 B
我问是因为我的理解是merge()iedf1.merge(df2)使用列来进行匹配。事实上,这样做我得到:
merge()
df1.merge(df2)
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 4618, in merge copy=copy, indicator=indicator) File "/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.py", line 58, in merge copy=copy, indicator=indicator) File "/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.py", line 491, in __init__ self._validate_specification() File "/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.py", line 812, in _validate_specification raise MergeError('No common columns to perform merge on') pandas.tools.merge.MergeError: No common columns to perform merge on
合并索引是不好的做法吗?这是不可能的吗?如果是这样,我怎样才能将索引转移到一个名为“索引”的新列中?
使用merge,默认情况下是内连接:
merge
pd.merge(df1, df2, left_index=True, right_index=True)
或者join,默认情况下是左连接:
join
df1.join(df2)
或concat),默认为外连接:
concat
pd.concat([df1, df2], axis=1)
样品 :
df1 = pd.DataFrame({'a':range(6), 'b':[5,3,6,9,2,4]}, index=list('abcdef')) print (df1) a b a 0 5 b 1 3 c 2 6 d 3 9 e 4 2 f 5 4 df2 = pd.DataFrame({'c':range(4), 'd':[10,20,30, 40]}, index=list('abhi')) print (df2) c d a 0 10 b 1 20 h 2 30 i 3 40
# Default inner join df3 = pd.merge(df1, df2, left_index=True, right_index=True) print (df3) a b c d a 0 5 0 10 b 1 3 1 20 # Default left join df4 = df1.join(df2) print (df4) a b c d a 0 5 0.0 10.0 b 1 3 1.0 20.0 c 2 6 NaN NaN d 3 9 NaN NaN e 4 2 NaN NaN f 5 4 NaN NaN # Default outer join df5 = pd.concat([df1, df2], axis=1) print (df5) a b c d a 0.0 5.0 0.0 10.0 b 1.0 3.0 1.0 20.0 c 2.0 6.0 NaN NaN d 3.0 9.0 NaN NaN e 4.0 2.0 NaN NaN f 5.0 4.0 NaN NaN h NaN NaN 2.0 30.0 i NaN NaN 3.0 40.0