I have a large amount of data in a matrix x, and I need to analyze some of its submatrices.
I am using the following code to select a submatrix:
>>> import numpy as np
>>> x = np.random.normal(0,1,(20,2))
>>> x
array([[-1.03266826,  0.04646684],
       [ 0.05898304,  0.31834926],
       [-0.1916809 , -0.97929025],
       [-0.48837085, -0.62295003],
       [-0.50731017,  0.50305894],
       [ 0.06457385, -0.10670002],
       [-0.72573604,  1.10026385],
       [-0.90893845,  0.99827162],
       [ 0.20714399, -0.56965615],
       [ 0.8041371 ,  0.21910274],
       [-0.65882317,  0.2657183 ],
       [-1.1214074 , -0.39886425],
       [ 0.0784783 , -0.21630006],
       [-0.91802557, -0.20178683],
       [ 0.88268539, -0.66470235],
       [-0.03652459,  1.49798484],
       [ 1.76329838, -0.26554555],
       [-0.97546845, -2.41823586],
       [ 0.32335103, -1.35091711],
       [-0.12981597,  0.27591674]])
>>> index = x[:,1] > 0
>>> index
array([ True,  True, False, False,  True, False,  True,  True, False,
        True,  True, False, False, False, False,  True, False, False,
       False,  True], dtype=bool)
>>> x1 = x[index, :] #x1 is a copy of the submatrix
>>> x1
array([[-1.03266826,  0.04646684],
       [ 0.05898304,  0.31834926],
       [-0.50731017,  0.50305894],
       [-0.72573604,  1.10026385],
       [-0.90893845,  0.99827162],
       [ 0.8041371 ,  0.21910274],
       [-0.65882317,  0.2657183 ],
       [-0.03652459,  1.49798484],
       [-0.12981597,  0.27591674]])
>>> x1[0,0] = 1000
>>> x1
array([[  1.00000000e+03,   4.64668400e-02],
       [  5.89830401e-02,   3.18349259e-01],
       [ -5.07310170e-01,   5.03058935e-01],
       [ -7.25736045e-01,   1.10026385e+00],
       [ -9.08938455e-01,   9.98271624e-01],
       [  8.04137104e-01,   2.19102741e-01],
       [ -6.58823174e-01,   2.65718300e-01],
       [ -3.65245877e-02,   1.49798484e+00],
       [ -1.29815968e-01,   2.75916735e-01]])
>>> x
array([[-1.03266826,  0.04646684],
       [ 0.05898304,  0.31834926],
       [-0.1916809 , -0.97929025],
       [-0.48837085, -0.62295003],
       [-0.50731017,  0.50305894],
       [ 0.06457385, -0.10670002],
       [-0.72573604,  1.10026385],
       [-0.90893845,  0.99827162],
       [ 0.20714399, -0.56965615],
       [ 0.8041371 ,  0.21910274],
       [-0.65882317,  0.2657183 ],
       [-1.1214074 , -0.39886425],
       [ 0.0784783 , -0.21630006],
       [-0.91802557, -0.20178683],
       [ 0.88268539, -0.66470235],
       [-0.03652459,  1.49798484],
       [ 1.76329838, -0.26554555],
       [-0.97546845, -2.41823586],
       [ 0.32335103, -1.35091711],
       [-0.12981597,  0.27591674]])
But I would like x1 to be just a pointer or something similar. Copying the data every time I need a submatrix is too expensive for me. How can I do that?
EDIT: Apparently there is no solution with numpy arrays. From this point of view, would a pandas DataFrame be any better?
The information about your array x is summarized in its .__array_interface__ attribute:
In [433]: x.__array_interface__
Out[433]:
{'descr': [('', '<f8')],
 'strides': None,
 'data': (171396104, False),
 'typestr': '<f8',
 'version': 3,
 'shape': (20, 2)}
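It holds the array's shape, its strides (the default here, hence None), and a pointer to the data buffer. A view can point into the same data buffer (possibly at a later offset) while having its own shape and strides.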
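As a minimal sketch of what that means in practice, the snippet below takes a slice of a small array and checks that it shares the buffer. The array contents, the slice `5:15:2`, and the variable names are made up for illustration; they are not from the question.

```python
import numpy as np

# Illustrative data: 20 rows, 2 columns of float64 (8 bytes per item).
x = np.arange(40, dtype=float).reshape(20, 2)

# Basic slicing returns a view: same buffer, its own shape and strides.
v = x[5:15:2, :]
shares = np.shares_memory(x, v)

# The view's data pointer is the base pointer plus an offset
# (5 rows * 2 columns * 8 bytes).
base_ptr = x.__array_interface__['data'][0]
view_ptr = v.__array_interface__['data'][0]
offset = view_ptr - base_ptr

print(shares, v.shape, v.strides, offset)

# Writing through the view changes the original array.
v[0, 0] = 1000.0
print(x[5, 0])
```

Because a slice is fully described by an offset, a shape, and strides, no data has to be copied to create it.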
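But indexing with a boolean array cannot be summarized in those few numbers. Either the result has to carry the whole index array along with it, or the selected items have to be copied out of x's data buffer. numpy chooses to copy. What you can choose is when to apply index: right away, or further down the call stack.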
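A rough sketch of that choice follows, using made-up data and a hypothetical helper `column_mean` (neither comes from the question). It passes x and the mask around together and indexes only where the values are needed; note that the indexing inside the helper still creates a temporary copy, just a smaller and shorter-lived one.

```python
import numpy as np

# Illustrative data: column 1 is positive for rows 10..19.
x = np.arange(40, dtype=float).reshape(20, 2) - 20
index = x[:, 1] > 0

# Boolean indexing copies: x1 does not share memory with x.
x1 = x[index, :]
is_copy = not np.shares_memory(x, x1)

def column_mean(data, mask, col):
    """Hypothetical helper: apply the mask late, to one column only."""
    return data[mask, col].mean()

mean0 = column_mean(x, index, 0)

# Assigning through the mask writes into x itself; no x1 is needed.
before = x.copy()
x[index, 0] = 1000.0

print(is_copy, mean0, x[10, 0])
```

If the goal is to modify the selected rows, assigning through the mask as in the last step avoids keeping a separate submatrix at all.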