Suppose I have two DataFrames like this:

    import pandas as pd

    left = pd.DataFrame({'key1': ['foo', 'bar'], 'lval': [1, 2]})
    right = pd.DataFrame({'key2': ['foo', 'bar'], 'rval': [4, 5]})
I want to merge them, so I try something like this:

    pd.merge(left, right, left_on='key1', right_on='key2')

And I'm happy:

      key1  lval key2  rval
    0  foo     1  foo     4
    1  bar     2  bar     5
But I'm trying to use the join method, which I've been led to believe is pretty similar:

    left.join(right, on=['key1', 'key2'])

And I get this:

    //anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in _validate_specification(self)
        406             if self.right_index:
        407                 if not ((len(self.left_on) == self.right.index.nlevels)):
    --> 408                     raise AssertionError()
        409                 self.right_on = [None] * n
        410         elif self.right_on is not None:

    AssertionError:

What am I missing?
I always use join on indices:

    import pandas as pd

    left = pd.DataFrame({'key': ['foo', 'bar'], 'val': [1, 2]}).set_index('key')
    right = pd.DataFrame({'key': ['foo', 'bar'], 'val': [4, 5]}).set_index('key')
    left.join(right, lsuffix='_l', rsuffix='_r')

         val_l  val_r
    key
    foo      1      4
    bar      2      5
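As a rough sketch of how the two methods relate: a join on the index should give the same result as a merge with left_index=True and right_index=True, provided how is passed explicitly, since join defaults to a left join while merge defaults to an inner join. The variable names joined and merged below are only for illustration.

```python
import pandas as pd

left = pd.DataFrame({'key': ['foo', 'bar'], 'val': [1, 2]}).set_index('key')
right = pd.DataFrame({'key': ['foo', 'bar'], 'val': [4, 5]}).set_index('key')

# Index-on-index join; the suffixes disambiguate the two 'val' columns.
joined = left.join(right, lsuffix='_l', rsuffix='_r')

# The same operation spelled with merge. how='left' is given explicitly
# because merge would otherwise default to an inner join.
merged = left.merge(
    right,
    left_index=True,
    right_index=True,
    how='left',
    suffixes=('_l', '_r'),
)

print(joined)
print(merged)
```

With these two frames every key appears on both sides, so a left join and an inner join would return the same rows anyway; the explicit how only matters once the keys differ.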
The same functionality can be had by using merge on columns instead:

    left = pd.DataFrame({'key': ['foo', 'bar'], 'val': [1, 2]})
    right = pd.DataFrame({'key': ['foo', 'bar'], 'val': [4, 5]})
    left.merge(right, on='key', suffixes=('_l', '_r'))

       key  val_l  val_r
    0  foo      1      4
    1  bar      2      5
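Applied to the frames from the question, where the key columns have different names, one possible fix is to move key2 into the index of right and pass only key1 as on. This sketch assumes that the on argument of join names columns of the calling frame and matches them against the index of the other frame, which is how the pandas documentation describes it. The traceback in the question is consistent with that: it shows pandas comparing the number of names in on (two) with the number of index levels on right (one). The variable names joined and merged are illustrative.

```python
import pandas as pd

left = pd.DataFrame({'key1': ['foo', 'bar'], 'lval': [1, 2]})
right = pd.DataFrame({'key2': ['foo', 'bar'], 'rval': [4, 5]})

# 'key1' is a column of left; it is matched against the index of the
# other frame, so key2 has to become the index of right first.
joined = left.join(right.set_index('key2'), on='key1')

# The merge call from the question, for comparison.
merged = pd.merge(left, right, left_on='key1', right_on='key2')

print(joined)
print(merged)
```

Unlike the merge result, the joined frame has no key2 column, because key2 was used up as the index of right.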