NumPy: finding multiple occurrences in an array by index

Posted by fjaof16o on 2023-08-05 in Other

Given the following array:

array = [-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1,  2,  2, -1, -1,  1, -1]
 indexes  0   1   2   3   4   5  6  7   8  9  10  11  12  13  14  15  16  17

I need to find the indexes at which the same number occurs. In this example, that would return a list of lists like this:

list(list(), list(16), list(12, 13), list(6, 7, 9), list() etc...)
     0       1     \   2             3              4
     ^              \ 
      \              \ the index in the array at which "1" appears
       \ 
        \ the numbers in the array

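How can this be done in NumPy?
The number 1 appears at index 16.
The number 2 appears at indexes 12 and 13.
And so on.
Notes based on the comments:

-1 can be ignored; I am only interested in the rest.
The array has ~50 elements, with a maximum value of int(500).
The function will be called 6000+ times.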

numpy

Source: https://stackoverflow.com/questions/76770505/numpy-finding-multiple-occurrence-in-an-array-by-index

6 Answers

wd2eg0qa1#

For a solution in O(n) time*, use a dictionary to collect the indexes:

# collect positions per value for each item
d = {}
for i, x in enumerate(array):
    d.setdefault(x, []).append(i)

# sort the output (optional)
out = {k: d[k] for k in sorted(d)}

Output:

{-1: [0, 1, 2, 3, 4, 5, 8, 10, 11, 14, 15, 17],
  1: [16],
  2: [12, 13],
  3: [6, 7, 9]}

* plus O(k*log(k)), where k is the number of unique values, if a sorted output is needed
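Since the question notes that -1 can be ignored, the dictionary approach above can be wrapped in a small helper that skips that value. This is only a sketch: the function name `positions_by_value` and its `ignore` parameter are hypothetical and not part of the original answer, and `collections.defaultdict` is swapped in for `setdefault`.

```python
from collections import defaultdict


def positions_by_value(values, ignore=-1):
    """Map each value (except `ignore`) to the list of indexes where it occurs."""
    positions = defaultdict(list)
    for index, value in enumerate(values):
        if value != ignore:  # assumption: -1 marks "no value", as in the question
            positions[value].append(index)
    return dict(positions)


array = [-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1, 2, 2, -1, -1, 1, -1]
result = positions_by_value(array)
print(result)  # {3: [6, 7, 9], 2: [12, 13], 1: [16]}
```

The keys come out in order of first appearance; sort them afterwards if a sorted result is needed.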

For a list of lists:

out = [d.get(k, []) for k in range(min(d), max(d)+1)]

# or for only positive values
out = [d.get(k, []) for k in range(1, max(d)+1)]

Output:

[[0, 1, 2, 3, 4, 5, 8, 10, 11, 14, 15, 17],
 [],
 [16],
 [12, 13],
 [6, 7, 9]]

Alternative

Or a very simple approach, if you initialize the output in advance:

out = [[] for i in range(max(array)+1)]
for i, x in enumerate(array):
    if x >= 0:  # skip -1: a negative index would wrap around to the end of out
        out[x].append(i)

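Comparison of all approaches

Pure Python is the fastest.
The initial array was generated with np.random.randint(0, k, size=n).tolist(), where n is the length of the array and k is the exclusive upper bound of the values.
k=4:
(benchmark plot not available)
k=100: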

We can now see the quadratic behavior of the @TalhaTayyab/@Stitt/@PaulS approaches.

(benchmark plot not available)
k=10_000:

We can note that NumPy is a bit faster for relatively small arrays (when the values are likely to be unique).

(benchmark plot not available)
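Because the plots did not survive, a comparison of this kind can be reproduced roughly as follows. This is a minimal sketch, not the benchmark used in the answer: the function names `with_dict`, `with_where` and `benchmark` are hypothetical, `np.random.default_rng` is used in place of `np.random.randint`, and the defaults n=50 and k=500 are taken from the notes in the question.

```python
import timeit

import numpy as np


def with_dict(values):
    """Pure-Python approach: one pass, collecting indexes per value."""
    d = {}
    for i, x in enumerate(values):
        d.setdefault(x, []).append(i)
    return d


def with_where(values):
    """NumPy approach: one full scan of the array per unique value."""
    arr = np.asarray(values)
    return {int(x): np.where(arr == x)[0].tolist() for x in np.unique(arr)}


def benchmark(n=50, k=500, repeats=1000, seed=0):
    """Time both approaches on n random integers drawn from [0, k)."""
    rng = np.random.default_rng(seed)
    values = rng.integers(0, k, size=n).tolist()
    # Sanity check: both approaches must agree before timing them.
    assert with_dict(values) == with_where(values)
    return {
        "dict": timeit.timeit(lambda: with_dict(values), number=repeats),
        "np.where": timeit.timeit(lambda: with_where(values), number=repeats),
    }


timings = benchmark()
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Absolute timings depend on the machine and the NumPy version, so only the relative ordering for a given n and k is meaningful.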

2023-08-05

k0pti3hp2#

array = [-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1,  2,  2, -1, -1,  1, -1]
s = sorted(set(array))  # all the unique elements in the list
print(s)
# output:
# [-1, 1, 2, 3]

# pair each unique element with the list of indexes where it occurs
result = [([i for i, d in enumerate(array) if d == x], x) for x in s]
print(result)
# output, as [([indices], element), ...]:
# [([0, 1, 2, 3, 4, 5, 8, 10, 11, 14, 15, 17], -1),
#  ([16], 1),
#  ([12, 13], 2),
#  ([6, 7, 9], 3)]

2023-08-05

pkmbmrz73#

A NumPy solution:

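import numpy as np

l1 = np.array([-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1,  2,  2, -1, -1,  1, -1])

# np.argsort is not stable by default, so the order of indexes within a group is not guaranteed
unique, counts = np.unique(l1, return_counts=True)
print(dict(zip(unique, np.split(np.argsort(l1), np.cumsum(counts)))))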

Prints:

{
    -1: array([0, 15, 14, 11, 10, 8, 4, 3, 2, 1, 5, 17]),
    1: array([16]),
    2: array([12, 13]),
    3: array([6, 7, 9]),
}
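The indexes for -1 in the output above are not in ascending order because the default sort used by `np.argsort` is not stable. A possible variation, sketched here as an assumption rather than as part of the original answer, passes `kind="stable"` so that each group keeps its original index order, and drops the last split point so that no trailing empty group is produced. The names `order`, `groups` and `result` are illustrative.

```python
import numpy as np

l1 = np.array([-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1, 2, 2, -1, -1, 1, -1])

unique, counts = np.unique(l1, return_counts=True)

# A stable sort keeps equal values in their original order,
# so the indexes inside each group come out ascending.
order = np.argsort(l1, kind="stable")

# Split at the cumulative counts; the last boundary equals len(l1) and is dropped.
groups = np.split(order, np.cumsum(counts)[:-1])

# Convert to plain Python ints and lists for readable output.
result = {int(value): idx.tolist() for value, idx in zip(unique, groups)}
print(result)
# {-1: [0, 1, 2, 3, 4, 5, 8, 10, 11, 14, 15, 17], 1: [16], 2: [12, 13], 3: [6, 7, 9]}
```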

2023-08-05

dluptydi4#

Using an itertools.groupby + operator.itemgetter approach:

from itertools import groupby
from operator import itemgetter

# sort the (index, value) pairs by value, then group the indexes that share a value
[[k, [i[0] for i in g]] for k, g in groupby(sorted(enumerate(array), key=itemgetter(1)), key=itemgetter(1))]

2023-08-05

hiz5n14c5#

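@talha-tayyab's answer will work if you only need to consider the values that do appear in the array. However, if you need to start from value=0 and increment up to the highest value, this should work: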

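import numpy

array = numpy.array([-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1,  2,  2, -1, -1,  1, -1])
max_value = numpy.max(array)  # avoid shadowing the built-in max()

result = []
for i in range(max_value + 1):  # +1 so that the highest value is included
    # the comparison needs an ndarray; each entry is an (n, 1) array of indexes
    result.append(numpy.argwhere(array == i))

print(result)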
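To get exactly the list-of-lists shape asked for in the question (position 0 holds the indexes of value 0, position 1 those of value 1, and so on), the same loop could be written with `np.flatnonzero`, which returns flat index arrays directly. This is a sketch under the assumption that the input is non-empty; the function name `indexes_from_zero` is hypothetical.

```python
import numpy as np


def indexes_from_zero(values):
    """Return a list whose v-th entry lists the indexes where value v occurs."""
    arr = np.asarray(values)
    highest = int(arr.max())  # assumption: `values` is not empty
    # One boolean comparison per candidate value, from 0 up to the highest value.
    return [np.flatnonzero(arr == v).tolist() for v in range(highest + 1)]


array = [-1, -1, -1, -1, -1, -1, 3, 3, -1, 3, -1, -1, 2, 2, -1, -1, 1, -1]
print(indexes_from_zero(array))  # [[], [16], [12, 13], [6, 7, 9]]
```

Negative values such as -1 are skipped automatically, because the loop only starts at 0.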

2023-08-05

chhqkbe16#

Another possible solution:

arr = np.asarray(array)  # assumes `import numpy as np`; the comparison needs an ndarray, not a list
[(x, np.where(arr == x)[0].tolist()) for x in np.unique(arr)]

Output:

[(-1, [0, 1, 2, 3, 4, 5, 8, 10, 11, 14, 15, 17]),
 (1, [16]),
 (2, [12, 13]),
 (3, [6, 7, 9])]

2023-08-05
