Is there a way to replace all negative numbers in a DataFrame with zero?
If all of your columns are numeric, you can use boolean indexing:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

    In [3]: df
    Out[3]:
       a  b
    0  0 -3
    1 -1  2
    2  2  1

    In [4]: df[df < 0] = 0

    In [5]: df
    Out[5]:
       a  b
    0  0  0
    1  0  2
    2  2  1
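If mutating df in place is not wanted, the same result can be had with DataFrame.clip, which returns a new frame. This is a minimal sketch of an alternative, not the approach used in the session above; the name clipped is just an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

# clip(lower=0) raises every value below 0 up to 0 and returns a new
# DataFrame; the original df is left untouched.
clipped = df.clip(lower=0)
print(clipped)
```

Like the boolean-indexing version, this assumes every column is numeric.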
For the more general case, this answer shows the private method _get_numeric_data:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
       ...:                    'c': ['foo', 'goo', 'bar']})

    In [3]: df
    Out[3]:
       a  b    c
    0  0 -3  foo
    1 -1  2  goo
    2  2  1  bar

    In [4]: num = df._get_numeric_data()

    In [5]: num[num < 0] = 0

    In [6]: df
    Out[6]:
       a  b    c
    0  0  0  foo
    1  0  2  goo
    2  2  1  bar
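Since _get_numeric_data is private and relies on the returned object sharing data with df, a version built only on public API may be safer. The sketch below uses select_dtypes to find the numeric columns and assigns the clipped values back; numeric_cols is an illustrative name, and clip is swapped in for the boolean-indexing step.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar']})

# Pick out the labels of the numeric columns using public API only.
numeric_cols = df.select_dtypes(include='number').columns

# Replace negatives with 0 in those columns; column 'c' is not touched.
df[numeric_cols] = df[numeric_cols].clip(lower=0)
print(df)
```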
With timedelta columns, boolean indexing seems to work on individual columns but not on the whole DataFrame, so you can do the following:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
       ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

    In [3]: df
    Out[3]:
            a       b
    0  0 days -3 days
    1 -1 days  2 days
    2  2 days  1 days

    In [4]: for k, v in df.iteritems():
       ...:     v[v < 0] = 0
       ...:

    In [5]: df
    Out[5]:
           a      b
    0 0 days 0 days
    1 0 days 2 days
    2 2 days 1 days
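Note that DataFrame.iteritems was removed in pandas 2.0, so the loop above only runs on older versions. A hedged sketch of the same per-column idea for current pandas follows; it compares against pd.Timedelta(0) rather than the integer 0 and writes through df.loc so the change lands in df itself. The name zero is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], unit='D'),
                   'b': pd.to_timedelta([-3, 2, 1], unit='D')})

zero = pd.Timedelta(0)

# Handle one column at a time: build a boolean mask of the negative
# durations and overwrite exactly those rows with a zero Timedelta.
for col in df.columns:
    df.loc[df[col] < zero, col] = zero

print(df)
```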
Update: comparing against a pd.Timedelta works on the whole DataFrame:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
       ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

    In [3]: df[df < pd.Timedelta(0)] = 0

    In [4]: df
    Out[4]:
           a      b
    0 0 days 0 days
    1 0 days 2 days
    2 2 days 1 days
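The whole-frame comparison can also feed DataFrame.mask, which returns a new frame instead of assigning into df. This is a sketch of a variant rather than the code from the session above; it uses pd.Timedelta(0) as the replacement value so the columns keep their timedelta dtype, and result is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], unit='D'),
                   'b': pd.to_timedelta([-3, 2, 1], unit='D')})

zero = pd.Timedelta(0)

# mask() replaces the cells where the condition is True, so every
# negative duration becomes a zero Timedelta in the returned frame.
result = df.mask(df < zero, zero)
print(result)
```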