How do I create a DataFrame while preserving column order?

小编典典

python

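How can I create a DataFrame from multiple numpy arrays, Pandas Series, or Pandas DataFrames while preserving the column order?

For example, I have these two numpy arrays and want to combine them into a single Pandas DataFrame.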

import numpy as np
import pandas as pd

foo = np.array( [ 1, 2, 3 ] )
bar = np.array( [ 4, 5, 6 ] )

If I do this, the bar column comes first, because a dict does not preserve order.

pd.DataFrame( { 'foo': pd.Series(foo), 'bar': pd.Series(bar) } )

    bar foo
0   4   1
1   5   2
2   6   3

I can do the following instead, but it becomes tedious when I need to combine many variables.

pd.DataFrame( { 'foo': pd.Series(foo), 'bar': pd.Series(bar) }, columns = [ 'foo', 'bar' ] )

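EDIT: Is there a way to specify the variables to combine and set the column order in a single operation? That is, I don't mind using several lines of code for the whole operation, but I would rather not list the variables to combine more than once (I will be changing the code a lot, and repeating them is error-prone).

EDIT2: One more point. If I want to add or remove one of the variables to combine, I want to add or remove it in only one place.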

2021-01-20

1 answer

小编典典

Original solution: incorrect use of `collections.OrderedDict`

In my original solution, I suggested using OrderedDict from the collections module of the Python standard library.

>>> import numpy as np
>>> import pandas as pd
>>> from collections import OrderedDict
>>>
>>> foo = np.array( [ 1, 2, 3 ] )
>>> bar = np.array( [ 4, 5, 6 ] )
>>>
>>> pd.DataFrame( OrderedDict( { 'foo': pd.Series(foo), 'bar': pd.Series(bar) } ) )

   foo  bar
0    1    4
1    2    5
2    3    6
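Whether a plain dict literal keeps its order depends on the interpreter and library versions. The sketch below assumes Python 3.7 or later (where dict insertion order is a language guarantee) and pandas 0.23 or later (which keeps that order for dict input). On older versions the columns may still come out reordered. The name `column_order` is only for illustration.

```python
import numpy as np
import pandas as pd

foo = np.array([1, 2, 3])
bar = np.array([4, 5, 6])

# Assumption: Python 3.7+ and pandas 0.23+, so the dict literal's
# insertion order ('foo', then 'bar') carries through to the columns.
df = pd.DataFrame({'foo': foo, 'bar': bar})

# Inspect the resulting column order instead of relying on it silently.
column_order = list(df.columns)
print(column_order)
```

If the code has to run on older versions as well, the ordering should not be left to the dict literal.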

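Correct solution: pass key-value tuple pairs to preserve order

However, as noted, if a plain dict is passed to OrderedDict, the order may still not be preserved, because the order is already arbitrary by the time the dict is constructed (this applies to Python versions before 3.7, where dict insertion order is not guaranteed). A workaround is to build the OrderedDict from a sequence of key-value tuple pairs, as suggested on SO: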

>>> import numpy as np
>>> import pandas as pd
>>> from collections import OrderedDict
>>>
>>> a = np.array( [ 1, 2, 3 ] )
>>> b = np.array( [ 4, 5, 6 ] )
>>> c = np.array( [ 7, 8, 9 ] )
>>>
>>> pd.DataFrame( OrderedDict( { 'a': pd.Series(a), 'b': pd.Series(b), 'c': pd.Series(c) } ) )

   a  c  b
0  1  7  4
1  2  8  5
2  3  9  6

>>> pd.DataFrame( OrderedDict( (('a', pd.Series(a)), ('b', pd.Series(b)), ('c', pd.Series(c))) ) )

   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9
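To address the requirement of listing each variable only once, one option is to keep the key-value pairs in a single list and build the OrderedDict from it. This is a minimal sketch rather than a definitive implementation; the list name `columns` is an assumption made for illustration.

```python
from collections import OrderedDict

import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])

# The only place where variables are added, removed, or reordered.
# Each entry is a (column name, data) pair; list order is column order.
columns = [
    ('a', a),
    ('b', b),
    ('c', c),
]

# OrderedDict built from a list of pairs keeps the list's order
# on any Python version, unlike a dict literal on older interpreters.
df = pd.DataFrame(OrderedDict(columns))
print(df)
```

Adding or removing a column then means editing a single line of the list, and the column order follows the list order.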

2021-01-20