Python: Select array elements with variable index bounds in NumPy

Posted by z4bn682m on 2023-01-12 in Python

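This might not be possible, since the intermediate array would have variable-length rows. What I am trying to accomplish is assigning a value to the elements of an array whose column indices are delimited by my array of bounds. For example: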

bounds = np.array([[1,2], [1,3], [1,4]])
array = np.zeros((3,4))
__assign(array, bounds, 1)

After the assignment, this should result in

array = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1]
]

I have tried something like this, in various iterations, without success:

ind = np.arange(array.shape[0])
array[ind, bounds[ind][0]:bounds[ind][1]] = 1

I am trying to avoid loops, since this function will be called many times. Any ideas?

Source: https://stackoverflow.com/questions/63327849/select-array-elements-with-variable-index-bounds-in-numpy

3 Answers

zkure5ic #1

I am by no means a NumPy expert, but of the different array indexing options I could find, this is the fastest solution I could come up with:

bounds = np.array([[1,2], [1,3], [1,4]])
array = np.zeros((3,4))
for i, x in enumerate(bounds):
    cols = slice(x[0], x[1]) 
    array[i, cols] = 1

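Here we iterate over the list of bounds and reference the columns using slices.
I also tried the approach below, which first builds a list of column indices and a list of row indices, but it was much slower: on my laptop, for a 10000 x 10000 array, it took 10 seconds versus 0.04 seconds. I guess the slices make a huge difference.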

bounds = np.array([[1,2], [1,3], [1,4]])
array = np.zeros((3,4))
cols = []
rows = []
for i, x in enumerate(bounds):
    cols += list(range(x[0], x[1])) 
    rows += (x[1] - x[0]) * [i]

# print(cols) [1, 1, 2, 1, 2, 3]
# print(rows) [0, 1, 1, 2, 2, 2]

array[rows, cols] = 1
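
For anyone who wants to reproduce the comparison, a rough timing sketch might look like the following. The function names, the array size (much smaller than the 10000 x 10000 case mentioned above) and the random bounds are assumptions made for illustration, so the absolute numbers will differ from the ones quoted.

```python
import timeit
import numpy as np

def fill_with_slices(array, bounds):
    # One slice assignment per row (first approach above).
    for i, (start, stop) in enumerate(bounds.tolist()):
        array[i, start:stop] = 1
    return array

def fill_with_index_lists(array, bounds):
    # Build explicit row/column index lists, then assign once (second approach).
    rows, cols = [], []
    for i, (start, stop) in enumerate(bounds.tolist()):
        cols += list(range(start, stop))
        rows += (stop - start) * [i]
    array[rows, cols] = 1
    return array

n = 500  # assumed size, kept small so the sketch runs quickly
rng = np.random.default_rng(0)
stops = rng.integers(1, n + 1, size=n)
bounds = np.column_stack([np.zeros(n, dtype=int), stops])

# Both approaches should fill exactly the same cells.
same = np.array_equal(
    fill_with_slices(np.zeros((n, n)), bounds),
    fill_with_index_lists(np.zeros((n, n)), bounds),
)

t_slices = timeit.timeit(lambda: fill_with_slices(np.zeros((n, n)), bounds), number=3)
t_lists = timeit.timeit(lambda: fill_with_index_lists(np.zeros((n, n)), bounds), number=3)
print(f"same result: {same}, slices: {t_slices:.4f} s, index lists: {t_lists:.4f} s")
```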

5kgi1eie #2

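One issue with a pure NumPy approach to this problem is that there is no method to "slice" a NumPy array along an axis using bounds taken from another NumPy array. The expanded bounds therefore end up as a variable-length list of lists, such as [[1], [1, 2], [1, 2, 3]]. You can then use np.eye and np.sum over axis=0 to get the required output.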

bounds = np.array([[1,2], [1,3], [1,4]])

result = np.stack([np.sum(np.eye(4)[slice(*i)], axis=0) for i in bounds])
print(result)

array([[0., 1., 0., 0.],
       [0., 1., 1., 0.],
       [0., 1., 1., 1.]])

I tried various ways to slice np.eye(4) as [start:stop] over a NumPy array of starts and stops, but sadly you need an iteration to accomplish this.

EDIT: Another way to do this in a vectorized manner, without any loops (the code for this edit was not preserved in this copy).
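
A loop-free version can be sketched with broadcasting. This is not necessarily the code the answer originally showed; the helper name assign_in_bounds is hypothetical, and the sketch assumes non-negative bounds that lie within the number of columns. It compares every column index against each row's start and stop to build a boolean mask, then assigns through the mask.

```python
import numpy as np

def assign_in_bounds(array, bounds, value):
    # Hypothetical helper, not taken from the original answer.
    # Column indices 0..n_cols-1, shape (n_cols,).
    cols = np.arange(array.shape[1])
    # bounds[:, [0]] and bounds[:, [1]] have shape (n_rows, 1), so the
    # comparisons broadcast to a boolean mask of shape (n_rows, n_cols)
    # that is True where start <= column < stop for that row.
    mask = (cols >= bounds[:, [0]]) & (cols < bounds[:, [1]])
    array[mask] = value
    return array

bounds = np.array([[1, 2], [1, 3], [1, 4]])
array = np.zeros((3, 4))
assign_in_bounds(array, bounds, 1)
print(array)
```

The mask has the same shape as the target array, so this trades the Python loop for extra temporary memory.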

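EDIT: If you are looking for a super-fast solution but can tolerate a single for loop, then according to my simulations the fastest approach among all the answers in this thread is: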

def h(bounds):
    zz = np.zeros((len(bounds), bounds.max()))

    for z,b in zip(zz,bounds):
        z[b[0]:b[1]]=1
        
    return zz

h(bounds)

array([[0., 1., 0., 0.],
       [0., 1., 1., 0.],
       [0., 1., 1., 1.]])

avwztpqn #3

Using the numba.njit decorator

import numpy as np
import numba

@numba.njit
def numba_assign_in_range(arr, bounds, val):

  for i in range(len(bounds)):

    s, e = bounds[i]
    arr[i, s:e] = val
  
  return arr

test_size = int(1e6) * 2

bounds = np.zeros((test_size, 2), dtype='int32')
bounds[:, 0] = 1
bounds[:, 1] = np.random.randint(0, 100, test_size)

a = np.zeros((test_size, 100))

With numba.njit

CPU times: user 3 µs, sys: 1 µs, total: 4 µs
Wall time: 6.2 µs

Without numba.njit

CPU times: user 3.54 s, sys: 1.63 ms, total: 3.54 s
Wall time: 3.55 s
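
The timings above do not show how the call itself was measured. When reproducing them, keep in mind that the first call to an njit-compiled function includes compilation time, so a warm-up call before timing is a common precaution. The following is only a sketch under assumptions: the function name, the smaller test size and the use of time.perf_counter are illustrative, and the fallback decorator exists only so the code still runs where Numba is not installed.

```python
import time
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumed fallback: run as plain Python (no speed-up) if Numba is missing.
    def njit(func):
        return func

@njit
def assign_in_range(arr, bounds, val):
    # Same loop as in the answer, with explicit indexing of the bounds.
    for i in range(len(bounds)):
        s = bounds[i, 0]
        e = bounds[i, 1]
        arr[i, s:e] = val
    return arr

n_rows = 10_000  # assumed size, smaller than in the answer
rng = np.random.default_rng(0)
bounds = np.ones((n_rows, 2), dtype=np.int64)
bounds[:, 1] = rng.integers(1, 100, n_rows)
a = np.zeros((n_rows, 100))

# Warm-up call with the same argument types, so compilation is not timed.
assign_in_range(np.zeros((2, 100)), bounds[:2], 1.0)

start = time.perf_counter()
assign_in_range(a, bounds, 1.0)
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.6f} s")
```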
