Python: maximum consecutive count of each value in a numpy array

Posted by gcmastyq on 2023-10-19 in Python


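Given a one-dimensional numpy array, the goal is to count the maximum number of consecutive occurrences of each value. For example, given an array arr: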

arr=np.array([1,1,1,2,2,3,4,4,4,4,4,2,4,4])

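I want to return the length of the longest consecutive run for each number. The result is a two-dimensional array whose first column holds each number and whose second column holds that number's maximum consecutive count.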

result=np.array([[1,3],[2,2],[3,1],[4,5]])

numpy

Source: https://stackoverflow.com/questions/77008765/python-maximum-consecutive-count-of-each-value-in-numpy-array

2 answers


pw9qyyiw1#

This is easy to do with pandas:

import pandas as pd

s = pd.Series(arr)
out = (s.groupby(s.ne(s.shift()).cumsum(), sort=False) # group consecutive values
        .agg(['first', 'size'])                        # get value and count (list keeps column order fixed)
        .groupby('first', sort=False)['size'].max()    # max count per value
        .reset_index().to_numpy()                      # back to numpy
      )

With pure numpy, it is slightly more involved:

import numpy as np

arr = np.array([1,1,1,2,2,3,4,4,4,4,4,2,4,4])

# identify the consecutive values
idx = np.nonzero(np.diff(arr))[0]
# array([ 2,  4,  5, 10, 11])

# get single value of consecutive ones
i = np.r_[arr[idx], arr[-1]]
# array([1, 2, 3, 4, 2, 4])

# count the number of replicates
n = np.diff(np.r_[0, idx+1, arr.shape[0]])
# array([3, 2, 1, 5, 1, 2])

# sort by value and count
order = np.lexsort([n, i])
# array([0, 4, 1, 2, 5, 3])

i2 = i[order]
# array([1, 2, 2, 3, 4, 4])

m = np.r_[np.diff(i2)!=0, True]
# array([ True, False,  True,  True, False,  True])

# combine
out = np.vstack([i2[m], n[order][m]]).T

Output:

array([[1, 3],
       [2, 2],
       [3, 1],
       [4, 5]])
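
As an alternative to the sort-and-mask step, the per-value maximum can also be taken directly with np.unique(..., return_inverse=True) and np.maximum.at. The following is only a sketch and not part of the original answer: the function name max_run_lengths is hypothetical, and the empty-array handling is an added assumption.

```python
import numpy as np


def max_run_lengths(arr):
    """Return a 2-D array of [value, longest run length] rows (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.size == 0:
        # assumption: an empty input yields an empty (0, 2) result
        return np.empty((0, 2), dtype=int)

    # index of the last element of every run
    ends = np.r_[np.nonzero(np.diff(arr))[0], arr.size - 1]
    values = arr[ends]                    # value carried by each run
    lengths = np.diff(np.r_[-1, ends])    # length of each run

    # map every run to the slot of its value, then keep the maximum per slot
    uniq, inv = np.unique(values, return_inverse=True)
    best = np.zeros(uniq.size, dtype=lengths.dtype)
    np.maximum.at(best, inv, lengths)

    return np.column_stack([uniq, best])


print(max_run_lengths([1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4, 2, 4, 4]))
```

Because np.unique returns sorted values, the rows come out ordered by value, as in the lexsort version.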

Using pure Python and itertools.groupby:

from itertools import groupby

out = {}
for k, g in groupby(arr):
    out[k] = max(out.get(k, -1), len(list(g)))

out = list(out.items())

Output: [(1, 3), (2, 2), (3, 1), (4, 5)]

Timing comparison

Using random arrays as input (np.random.randint(1, 5, size=N)).
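
A comparison of this kind could be reproduced with the standard timeit module. The sketch below wraps the itertools.groupby and pure numpy approaches in functions and times them on random input. The function names, the array sizes, and the repeat count are illustrative assumptions, and no particular timings are implied.

```python
import timeit
from itertools import groupby

import numpy as np


def with_groupby(arr):
    """Pure-Python approach: longest run per value via itertools.groupby."""
    out = {}
    for k, g in groupby(arr):
        out[k] = max(out.get(k, -1), len(list(g)))
    return sorted(out.items())


def with_numpy(arr):
    """NumPy approach: run boundaries via np.diff, then lexsort by (value, length)."""
    idx = np.nonzero(np.diff(arr))[0]
    values = np.r_[arr[idx], arr[-1]]
    lengths = np.diff(np.r_[0, idx + 1, arr.shape[0]])
    order = np.lexsort([lengths, values])
    v = values[order]
    last = np.r_[np.diff(v) != 0, True]  # last entry per value is its longest run
    return list(zip(v[last].tolist(), lengths[order][last].tolist()))


def compare(sizes=(1_000, 10_000), number=5):
    """Return the average seconds per call for each approach and input size."""
    results = {}
    for n in sizes:
        arr = np.random.randint(1, 5, size=n)
        # sanity check: both approaches must agree before being timed
        assert with_groupby(arr) == with_numpy(arr)
        results[n] = {
            f.__name__: timeit.timeit(lambda: f(arr), number=number) / number
            for f in (with_groupby, with_numpy)
        }
    return results


for n, timings in compare().items():
    print(n, timings)
```

Larger values of N can be passed through the sizes argument to see how the two approaches scale.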


nafvub8i2#

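Iterate over the values while keeping a running count of consecutive equal values. When the value changes, update the maximum count for the previous value (if the run that just ended is longer), then reset the running count to 1. The maximum counts are stored in a defaultdict and then converted to a list of [value, count] pairs.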

from collections import defaultdict

import numpy as np

arr = np.array([1,1,1,2,2,3,4,4,4,4,4,2,4,4])

prev = None
max_consecutive_count = defaultdict(int)
running_count = 0

for val in arr:
    # "prev is None" matches only the first element; "not prev" would also match a value of 0
    if prev is None or val == prev:
        running_count += 1
    else:
        # the run of prev has ended: keep its length if it is the longest so far
        if running_count > max_consecutive_count[prev]:
            max_consecutive_count[prev] = running_count
        running_count = 1

    prev = val

# record the final run (skipped when arr is empty)
if prev is not None:
    if running_count > max_consecutive_count[prev]:
        max_consecutive_count[prev] = running_count

max_consecutive_count_arr = [[k,v] for k,v in max_consecutive_count.items()]
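
Wrapping the loop in a function makes it easier to check edge cases such as an empty input or an input containing 0, which is falsy in Python and therefore should not be used to detect the first element. The following is a sketch under those assumptions; the function name max_consecutive_counts and the sentinel _MISSING are hypothetical.

```python
from collections import defaultdict

_MISSING = object()  # hypothetical sentinel meaning "no previous value yet"


def max_consecutive_counts(values):
    """Return [[value, longest run length], ...] in order of first completed run."""
    best = defaultdict(int)
    prev = _MISSING
    running = 0

    for val in values:
        if prev is not _MISSING and val == prev:
            running += 1                      # still inside the same run
        else:
            if prev is not _MISSING:
                best[prev] = max(best[prev], running)  # close the previous run
            running = 1                       # start a new run
        prev = val

    if prev is not _MISSING:
        best[prev] = max(best[prev], running)  # close the final run

    return [[k, v] for k, v in best.items()]


print(max_consecutive_counts([1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4, 2, 4, 4]))
print(max_consecutive_counts([0, 0, 1, 0]))
print(max_consecutive_counts([]))
```

The result is ordered by when each value's first run ends, not by value; sort it if a value-ordered result is needed.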
