known_array:numpy数组; 仅由标量值组成;shape: (m, 1)
known_array
shape: (m, 1)
test_array:numpy数组; 仅由标量值组成;shape: (n, 1)
test_array
shape: (n, 1)
indices:numpy数组; shape: (n, 1); 对于in test_array中的每个值,查找in中最接近的值的索引known_array
indices
residual:numpy数组; shape: (n, 1); 对于in test_array中的每个值,查找与in中最接近的值的差known_array
residual
In [17]: known_array = np.array([random.randint(-30,30) for i in range(5)]) In [18]: known_array Out[18]: array([-24, -18, -13, -30, 29]) In [19]: test_array = np.array([random.randint(-10,10) for i in range(10)]) In [20]: test_array Out[20]: array([-6, 4, -6, 4, 8, -4, 8, -6, 2, 8])
def find_nearest(known_array, value): idx = (np.abs(known_array - value)).argmin() diff = known_array[idx] - value return [idx, -diff] In [22]: indices = np.zeros(len(test_array)) In [23]: residual = np.zeros(len(test_array)) In [24]: for i in range(len(test_array)): ....: [indices[i], residual[i]] = find_nearest(known_array, test_array[i]) ....: In [25]: indices Out[25]: array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) In [26]: residual Out[26]: array([ 7., 17., 7., 17., 21., 9., 21., 7., 15., 21.])
加快此任务的最佳方法是什么?Cython是一个选项,但是,我始终希望能够删除for循环并让代码保留为纯NumPy。
for
我做了一些小的基准测试来比较非矢量化和矢量化解决方案(可接受的答案)。
In [48]: [indices1, residual1] = find_nearest_vectorized(known_array, test_array) In [53]: [indices2, residual2] = find_nearest_non_vectorized(known_array, test_array) In [54]: indices1==indices2 Out[54]: array([ True, True, True, True, True, True, True, True, True, True], dtype=bool) In [55]: residual1==residual2 Out[55]: array([ True, True, True, True, True, True, True, True, True, True], dtype=bool) In [56]: %timeit [indices2, residual2] = find_nearest_non_vectorized(known_array, test_array) 10000 loops, best of 3: 173 µs per loop In [57]: %timeit [indices1, residual1] = find_nearest_vectorized(known_array, test_array) 100000 loops, best of 3: 16.8 µs per loop
加速约 10倍 !
known_array 未排序。
我按照下面@cyborg的答案运行了基准测试。
情况1 :如果known_array已排序
known_array = np.arange(0,1000) test_array = np.random.randint(0, 100, 10000) print('Speedups:') base_time = time_f('base') for func_name in ['diffs', 'searchsorted1', 'searchsorted2']: print func_name + ' is x%.1f faster than base.' % (base_time / time_f(func_name)) assert np.allclose(base(known_array, test_array), eval(func_name+'(known_array, test_array)'))
Speedups: diffs is x0.4 faster than base. searchsorted1 is x81.3 faster than base. searchsorted2 is x107.6 faster than base.
首先,对于大型数组,diffs方法实际上要慢一些,它还会占用大量RAM,而当我在实际数据上运行它时,系统就会挂起。
diffs
情况2 :何时known_array未排序;代表实际情况
known_array = np.random.randint(0,100,100) test_array = np.random.randint(0, 100, 100)
Speedups: diffs is x8.9 faster than base. AssertionError Traceback (most recent call last) <ipython-input-26-3170078c217a> in <module>() 5 for func_name in ['diffs', 'searchsorted1', 'searchsorted2']: 6 print func_name + ' is x%.1f faster than base.' % (base_time / time_f(func_name)) ----> 7 assert np.allclose(base(known_array, test_array), eval(func_name+'(known_array, test_array)')) AssertionError: searchsorted1 is x14.8 faster than base.
我还必须评论说,该方法还应该具有存储效率。否则我的8 GB RAM不足。在基本情况下,这很容易满足。
例如,您可以计算使用中的所有差异:
differences = (test_array.reshape(1,-1) - known_array.reshape(-1,1))
以及使用argmin和花式索引以及np.diagonal获得所需的索引和差异:
argmin
np.diagonal
indices = np.abs(differences).argmin(axis=0) residual = np.diagonal(differences[indices,])
因此对于
>>> known_array = np.array([-24, -18, -13, -30, 29]) >>> test_array = np.array([-6, 4, -6, 4, 8, -4, 8, -6, 2, 8])
一送一
>>> indices array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2]) >>> residual array([ 7, 17, 7, 17, 21, 9, 21, 7, 15, 21])