Python: How to modify a NumPy array at arbitrary indices in a vectorized way?

Posted by khbbv19g on 2023-04-04 in Python


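Simplified story

Suppose I have an array arr and a list of indices idx. For each i that occurs in idx, I want to increase arr[i] by 1.
A non-vectorized approach looks like this: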

import numpy as np

arr = np.zeros(5)
idx = [0, 1, 1, 2, 0]

for i in idx:
    arr[i] += 1

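Is there a way to vectorize this?
Note that arr[idx] += 1 does not work here because of the repeated indices: it runs without error, but each repeated index is incremented only once.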

arr = np.zeros(1)
idx = [0, 0]
arr[idx] += 1  # arr becomes array([1.]), not array([2.])
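The behaviour above follows from how NumPy evaluates an in-place operation on a fancy-indexed array: arr[idx] += 1 is carried out as arr[idx] = arr[idx] + 1, so the right-hand side is computed once from the old values and a repeated index is just written several times with the same result. A minimal sketch of that equivalence (the names expected, buffered, and manual are illustrative and not from the original post):

```python
import numpy as np

idx = [0, 1, 1, 2, 0]

# Reference result from the explicit loop in the question.
expected = np.zeros(5)
for i in idx:
    expected[i] += 1

# Fancy-indexed in-place add: each distinct index is incremented only once.
buffered = np.zeros(5)
buffered[idx] += 1

# The same operation written out: read the old values, add 1, write back.
manual = np.zeros(5)
manual[idx] = manual[idx] + 1

print(expected.tolist())                 # [2.0, 2.0, 1.0, 0.0, 0.0]
print(buffered.tolist())                 # [1.0, 1.0, 1.0, 0.0, 0.0]
print(np.array_equal(buffered, manual))  # True
```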

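Of course, np.unique() would achieve the same thing in this 1D example. But I am actually working with 2D arrays, and I doubt that counting elements is the best solution.

Edit

np.unique does work, but it seems to add an unnecessary slowdown. I would like a faster approach, if one exists.
Below is an example with 2D indices for 10,000 points, without duplicates.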

arr = np.zeros((10000, 10000))
idx = np.stack([np.arange(10000), np.arange(10000)])

%timeit np.unique(idx, axis=1, return_counts=True)  # takes 1.93 ms

%timeit arr[idx[0], idx[1]] += 1  # takes 235 μs

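Clearly, the direct fancy-indexed update is roughly 8 times faster than the np.unique call alone (235 μs vs. 1.93 ms), although it only gives the right result here because the indices contain no duplicates.

Edit 2

The approach from @PaulS's answer, np.add.at, is faster than np.unique.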

%timeit np.add.at(arr, (idx[0], idx[1]), 1) # takes 925 μs

Edit 3

Below is an example that uses random indices to test the case with duplicate indices.

arr = np.zeros((10000, 10000))
ran = (np.random.rand(10000)*10).astype(int)
idx = np.stack([ran, ran])

%timeit np.unique(idx, axis=1, return_counts=True)  # takes 3.24 ms

%timeit np.add.at(arr, (idx[0], idx[1]), 1) # takes 859 μs
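Timings like these are only meaningful if both approaches produce the same accumulator, so it can help to check that first. The following is a sketch under assumptions, not the asker's benchmark: it uses a smaller 1000 x 1000 array, a seeded generator, and the hypothetical helper names with_add_at and with_unique. Each call also allocates a fresh array, so the numbers it prints are not directly comparable to the %timeit figures above.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the run is repeatable
shape = (1000, 1000)                     # smaller than the 10000 x 10000 array above
ran = rng.integers(0, 10, size=10000)    # values 0..9, so duplicates are guaranteed
idx = np.stack([ran, ran])

def with_add_at():
    arr = np.zeros(shape)
    np.add.at(arr, (idx[0], idx[1]), 1)  # unbuffered, duplicates accumulate
    return arr

def with_unique():
    arr = np.zeros(shape)
    uniq, cnt = np.unique(idx, axis=1, return_counts=True)
    arr[uniq[0], uniq[1]] += cnt         # safe: every position appears once
    return arr

# Both approaches must agree and account for every index pair.
a, b = with_add_at(), with_unique()
assert np.array_equal(a, b)
assert a.sum() == idx.shape[1]

for fn in (with_add_at, with_unique):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.3f} ms per call")
```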

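(Edit: fixed a typo.)

Detailed story

I am trying to implement the Hough line transform with NumPy. (I am not using cv2.HoughLines() because I want to get the result directly from point coordinates rather than from a binary array.)
Getting the curves in the (r, θ) plane is easy, but I am having trouble implementing the accumulator in a vectorized way. Currently I rely on flattening the 2D data to 1D. Is there a better, faster way to do the accumulation?
Thank you.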
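For reference, the flatten-to-1D accumulation described here can be sketched with np.ravel_multi_index and np.bincount. This is not the asker's code: the function name hough_accumulate, the parameters n_theta and n_r, and the way r is binned are all assumptions made for illustration.

```python
import numpy as np

def hough_accumulate(points, n_theta=180, n_r=200):
    """Vote in an (r, theta) accumulator for an (N, 2) array of x, y points."""
    points = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    # r = x*cos(theta) + y*sin(theta) for every point/angle pair: shape (N, n_theta).
    r = points[:, :1] * np.cos(thetas) + points[:, 1:] * np.sin(thetas)
    r_max = np.hypot(points[:, 0], points[:, 1]).max()
    if r_max == 0:
        r_max = 1.0                      # avoid dividing by zero for a lone origin point
    # Map r in [-r_max, r_max] onto integer bins 0 .. n_r - 1.
    r_bin = np.round((r + r_max) / (2 * r_max) * (n_r - 1)).astype(np.intp)
    r_bin = np.clip(r_bin, 0, n_r - 1)
    t_bin = np.broadcast_to(np.arange(n_theta), r_bin.shape)
    # Flatten the 2D bin coordinates and count all votes in one pass.
    flat = np.ravel_multi_index((r_bin.ravel(), t_bin.ravel()), (n_r, n_theta))
    acc = np.bincount(flat, minlength=n_r * n_theta).reshape(n_r, n_theta)
    return acc, thetas, r_max

points = [(0, 1), (1, 1), (2, 1), (3, 1)]    # four points on the line y = 1
acc, thetas, r_max = hough_accumulate(points)
r_idx, t_idx = np.unravel_index(acc.argmax(), acc.shape)
print(acc.max(), np.degrees(thetas[t_idx]))
```

With these four collinear points the strongest cell should collect all four votes at an angle close to 90 degrees. np.bincount only counts non-negative integers in 1D, which is why the 2D bin coordinates are flattened first.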

python

Source: https://stackoverflow.com/questions/75920451/how-to-modify-numpy-array-with-arbitrary-indices-in-vectorized-way

2 answers


Answer 1 (2w2cym1i)

1D array

Another possible solution:

np.add.at(arr, idx, 1)

Output:

[2. 2. 1. 0. 0.]
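np.add.at is not limited to adding a constant: the third argument can be an array with one value per index, which is useful when each hit carries its own weight. A small sketch (the weights values are made up for illustration):

```python
import numpy as np

arr = np.zeros(5)
idx = np.array([0, 1, 1, 2, 0])
weights = np.array([0.5, 1.0, 2.0, 4.0, 0.25])  # illustrative values, one per index

# Unbuffered add: repeated indices accumulate instead of overwriting.
np.add.at(arr, idx, weights)
print(arr.tolist())  # [0.75, 3.0, 4.0, 0.0, 0.0]

# np.bincount gives the same sums for 1D non-negative integer indices.
same = np.bincount(idx, weights=weights, minlength=arr.size)
print(np.array_equal(arr, same))  # True
```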

2D array

(Thanks, @mozway, for your example, which I am now using here.)

arr = np.zeros([3, 4], dtype=int)
idx = [[0, 0, 2, 0],
       [1, 1, 3, 1]]

np.add.at(arr, (idx[0], idx[1]), 1)

Output:

array([[0, 3, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]])
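When the indices are stored as a single (2, N) NumPy array, as in the question's np.stack example, they have to be split into one index array per axis before being passed to np.add.at. A short sketch of that, and of what happens otherwise (the flag name bare_index_failed is illustrative):

```python
import numpy as np

idx = np.array([[0, 0, 2, 0],
                [1, 1, 3, 1]])   # shape (2, N): row indices, then column indices

# tuple(idx) is equivalent to (idx[0], idx[1]): one index array per axis.
arr = np.zeros((3, 4), dtype=int)
np.add.at(arr, tuple(idx), 1)
print(arr.tolist())  # [[0, 3, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]

# Pitfall: a bare 2D index array is applied to the first axis only.
# Here that fails, because 3 is not a valid row of a 3 x 4 array.
bare_index_failed = False
try:
    np.add.at(np.zeros((3, 4), dtype=int), idx, 1)
except IndexError:
    bare_index_failed = True
print(bare_index_failed)  # True
```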

Posted 2023-04-04

Answer 2 (pcww981p)

Use numpy.unique to get the unique indices and their counts:

idx2, cnt = np.unique(idx, return_counts=True)

arr[idx2] += cnt

Updated arr:

array([2, 2, 1, 0, 0])

With nd-arrays (2D example):

arr = np.zeros([3, 4], dtype=int)
idx = [[0, 0, 2, 0],
       [1, 1, 3, 1]]

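idx2, cnt = np.unique(idx, axis=1, return_counts=True)
# tuple(idx2) also runs on Python < 3.11, where arr[*idx2] is a syntax error;
# += keeps any values already stored in arr.
arr[tuple(idx2)] += cnt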

Output:

array([[0, 3, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]])

If the indices are transposed:

arr = np.zeros([3, 4], dtype=int)
idx = [[0, 1], [0, 1], [2, 3], [0, 1]]

idx2, cnt = np.unique(idx, axis=0, return_counts=True)
arr[tuple(idx2.T)] += cnt
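The two index layouts can be handled by one small helper. The sketch below is an assumption-laden illustration rather than part of the answer: add_counts is a hypothetical name, and it expects the indices in (ndim, N) layout, so the transposed form is converted with .T first. It starts from a non-zero array to show that the counts are added to the existing values.

```python
import numpy as np

def add_counts(arr, idx):
    """Add 1 to arr at every column of idx (shape (arr.ndim, N)), duplicates included."""
    idx = np.asarray(idx)
    uniq, cnt = np.unique(idx, axis=1, return_counts=True)
    # After np.unique every position appears once, so a buffered += is safe.
    arr[tuple(uniq)] += cnt
    return arr

arr = np.ones((3, 4), dtype=int)             # non-zero starting values
idx = [[0, 0, 2, 0],
       [1, 1, 3, 1]]
add_counts(arr, idx)
print(arr.tolist())  # [[1, 4, 1, 1], [1, 1, 1, 1], [1, 1, 1, 2]]

# Transposed layout: one (row, col) pair per entry.
pairs = [[0, 1], [0, 1], [2, 3], [0, 1]]
other = np.ones((3, 4), dtype=int)
add_counts(other, np.asarray(pairs).T)
print(np.array_equal(arr, other))  # True
```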

Posted 2023-04-04
