pandas: How to find the distance to the next non-NaN value in a NumPy array

Posted by xggvc2p6 on 2023-09-29 in Other


Consider the following array:

import numpy as np

arr = np.array(
    [
        [10, np.nan],
        [20, np.nan],
        [np.nan, 50],
        [15, 20],
        [np.nan, 30],
        [np.nan, np.nan],
        [10, np.nan],
    ]
)

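For every cell in each column of arr, I need the distance (in rows) to the next non-NaN value below it in the same column. Cells with no non-NaN value below them should be NaN. In other words, the expected result is: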

expected = np.array(
    [
        [1, 2],
        [2, 1],
        [1, 1],
        [3, 1],
        [2, np.nan],
        [1, np.nan],
        [np.nan, np.nan]
    ]
)
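To pin down the definition, here is a minimal brute-force sketch that scans each column from the bottom up. The function name `next_valid_distance_naive` is made up for illustration and is not part of the original question. It is meant as a readable reference to check faster solutions against, not as a performant implementation.

```python
import numpy as np


def next_valid_distance_naive(a):
    """Per column, distance in rows from each cell to the next non-NaN cell below it."""
    a = np.asarray(a, dtype=float)
    n_rows, n_cols = a.shape
    out = np.full(a.shape, np.nan)
    for col in range(n_cols):
        next_valid = None  # row index of the nearest non-NaN cell seen while scanning upward
        for row in range(n_rows - 1, -1, -1):
            if next_valid is not None:
                out[row, col] = next_valid - row
            if not np.isnan(a[row, col]):
                next_valid = row
    return out


arr = np.array(
    [
        [10, np.nan],
        [20, np.nan],
        [np.nan, 50],
        [15, 20],
        [np.nan, 30],
        [np.nan, np.nan],
        [10, np.nan],
    ]
)
expected = np.array(
    [
        [1, 2],
        [2, 1],
        [1, 1],
        [3, 1],
        [2, np.nan],
        [1, np.nan],
        [np.nan, np.nan],
    ]
)

result = next_valid_distance_naive(arr)
# assert_array_equal treats NaNs in matching positions as equal
np.testing.assert_array_equal(result, expected)
```

Note that a non-NaN cell does not count as its own "next" value: the distance is always measured to a row strictly below.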

pandas

Source: https://stackoverflow.com/questions/77172130/how-to-find-the-distance-to-next-non-nan-value-in-numpy-array

2 Answers


2w2cym1i1#

With pandas, you can compute a reversed cumcount per column, then mask it and shift it:

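import pandas as pd

# work on the reversed non-NaN mask, count rows since the last valid value,
# then shift by one row and restore the original order
out = (pd.DataFrame(arr).notna()[::-1]
         .apply(lambda s: s.groupby(s.cumsum()).cumcount().add(1)
                           .where(s.cummax()).shift()[::-1])
         .to_numpy()
      )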

Output:

array([[ 1.,  2.],
       [ 2.,  1.],
       [ 1.,  1.],
       [ 3.,  1.],
       [ 2., nan],
       [ 1., nan],
       [nan, nan]])
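To see why the chain above works, it may help to run the same steps on a single column, one at a time. The following is a sketch using the second column of arr; the intermediate names (`col`, `s`, `groups`, `count`, `dist`) are made up for illustration.

```python
import numpy as np
import pandas as pd

# second column of arr
col = pd.Series([np.nan, np.nan, 50, 20, 30, np.nan, np.nan])

# 1. non-NaN mask, reversed so the bottom row comes first
s = col.notna()[::-1]

# 2. every non-NaN value starts a new group when walking upward
groups = s.cumsum()

# 3. 1-based position inside each group = rows since the last valid value
count = s.groupby(groups).cumcount().add(1)

# 4. rows seen before the first valid value (the trailing NaN run at the
#    bottom of the column) have nothing below them, so blank them out
count = count.where(s.cummax())

# 5. shift by one so each row refers to rows strictly below it,
#    then reverse back to the original order
dist = count.shift()[::-1]

print(dist.to_numpy())
```

Step 4 is what turns the last three rows of this column into NaN; without it they would get spurious counts from group 0.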

2023-09-29

hc2pp10m2#

You may get some performance gains by combining binary search with a few NumPy functions:

box = []
for num in range(arr.shape[-1]):
    temp = arr[:, num]
    # this section gets the non-nan positions
    bools = ~np.isnan(temp)
    bools = bools.nonzero()[0]
    # this section locates every row index
    # with respect to the non-nan positions
    # note the use of side='right': it returns the first
    # non-nan position strictly after each row
    positions = np.arange(temp.size)
    bool_positions = bools.searchsorted(positions, side='right')
    # out of bound positions (no non-nan value below) are replaced with nan
    filtered = bool_positions != bools.size
    blanks = np.empty(temp.size, dtype=float)
    blanks[~filtered] = np.nan
    trimmed = bool_positions[filtered]
    indexer = positions[filtered]
    # subtract actual position from position of next non-nan
    blanks[indexer] = bools[trimmed] - indexer
    box.append(blanks)

np.column_stack(box)
array([[ 1.,  2.],
       [ 2.,  1.],
       [ 1.,  1.],
       [ 3.,  1.],
       [ 2., nan],
       [ 1., nan],
       [nan, nan]])
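If the Python-level loop over columns becomes a bottleneck, a different technique can handle all columns at once: a reversed running minimum over row indices instead of binary search. This is only a sketch and not part of the answer above; the function name `next_valid_distance` is made up, and it assumes a 2-D float-compatible input.

```python
import numpy as np


def next_valid_distance(a):
    """Vectorized per-column distance to the next non-NaN row (assumes 2-D input)."""
    a = np.asarray(a, dtype=float)
    n_rows = a.shape[0]
    rows = np.arange(n_rows)[:, None]  # column vector of row indices

    # own row index where the cell is valid, sentinel n_rows where it is NaN
    idx = np.where(np.isnan(a), n_rows, rows)

    # reversed running minimum: nearest valid row at or below each cell
    at_or_below = np.minimum.accumulate(idx[::-1], axis=0)[::-1]

    # shift up by one row so only rows strictly below are considered
    below = np.full_like(at_or_below, n_rows)
    below[:-1] = at_or_below[1:]

    dist = (below - rows).astype(float)
    dist[below == n_rows] = np.nan  # sentinel means no valid row below
    return dist


arr = np.array(
    [
        [10, np.nan],
        [20, np.nan],
        [np.nan, 50],
        [15, 20],
        [np.nan, 30],
        [np.nan, np.nan],
        [10, np.nan],
    ]
)
out = next_valid_distance(arr)
print(out)
```

Whether this is actually faster than the searchsorted loop depends on the array shape, so it is worth timing both on your own data.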

2023-09-29
