Python - Filter numpy array filter element values for all other element at the same index without loop

Posted by 8zzbczxx on 2023-10-19 in Python


Suppose I have a numpy array like the one below. The sample has shape (5, 3); the real array has more than 1 million rows and dtype object.
Sample array:

x = np.array([['A',1,10],['B',1,20],['C',2,80],['D',3,40],['E',2,50]])

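I want to achieve the following:
For each row, find the other rows whose column Y value is the same (comparing only against column Y across the whole array), then check column X and keep the rows whose X value is not equal.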

Col X  Y   Z

[['A' '1' '10'] ---> filter value '1' against all values of column Y in the entire array
 ['B' '1' '20'] ---> filter value '1' against all values of column Y in the entire array
 ['C' '2' '80'] ---> filter value '2' against all values of column Y in the entire array
 ['D' '3' '40'] ---> filter value '3' against all values of column Y in the entire array
 ['E' '2' '50']] ---> filter value '2' against all values of column Y in the entire array

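For example: when checking row 1, the column Y value is '1' and the column X value of the same row is 'A'. The first filter is therefore based on the column Y value '1', and the rows below satisfy it.

[['A' '1' '10']
 ['B' '1' '20']]
The second filter is then applied using the column X value of row 1: it must not equal the column X value of the other filtered rows.
In this case row 2 satisfies both conditions.
[['B' '1' '20']]
Note: this example shows two matching records, but in practice there can be one or more, and they can sit at any row position.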
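The two filters described for row 1 can be sketched with boolean masks. This is only an illustration for a single row; the names `i`, `same_y`, `diff_x` and `matches` are introduced here and are not part of the original post.

```python
import numpy as np

x = np.array([['A', 1, 10], ['B', 1, 20], ['C', 2, 80], ['D', 3, 40], ['E', 2, 50]])

i = 0                          # row 1 in the question's 1-based numbering
same_y = x[:, 1] == x[i, 1]    # first filter: same column Y value as row i
diff_x = x[:, 0] != x[i, 0]    # second filter: different column X value
matches = x[same_y & diff_x]   # rows passing both filters

print(matches)  # [['B' '1' '20']]
```

Repeating this for every row is exactly the loop the question wants to avoid, which is why the answers below compare all rows against all rows at once.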
Next, I want to append the selected records (row 2 in this example) to row 1.
Please advise.
I tried this code, but had no luck:

import numpy as np
x = np.array([['A',1,10],['B',1,20],['C',2,80],['D',3,40],['E',2,50]])
y = x
print(x)
print("Result is:",x[np.where(x[:,1] == y[:,1], np.where(x[:,0] != y[:,0][::-1]),False)])

Output of print(x):

[['A' '1' '10']
 ['B' '1' '20']
 ['C' '2' '80']
 ['D' '3' '40']
 ['E' '2' '50']]

Result is: empty
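Two details of the attempt above may be worth checking before anything else. The assignment `y = x` binds a second name to the same array instead of copying it, so comparing column Y of `x` with column Y of `y` compares every row with itself. Also, building the array from mixed strings and integers coerces everything to strings, as the printed output shows. A minimal sketch (the variable names `same_object` and `first_condition` are introduced here):

```python
import numpy as np

x = np.array([['A', 1, 10], ['B', 1, 20], ['C', 2, 80], ['D', 3, 40], ['E', 2, 50]])
y = x  # same object, not a copy

same_object = y is x
first_condition = x[:, 1] == y[:, 1]  # each row is compared with itself

print(same_object)            # True
print(first_condition.all())  # True
print(x.dtype.kind)           # 'U': the mixed str/int input became strings
```

Since the first condition is true for every row, it cannot single out rows that share a Y value with a different row; a row-against-row comparison is needed for that.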

numpy

Source: https://stackoverflow.com/questions/77270531/python-filter-numpy-array-filter-element-values-for-all-other-element-at-same

2 Answers

qlfbtfca #1

Is this really your expected result?

arr = [
    ["A", 1, 10],  # 0
    ["B", 1, 20],  # 1
    ["C", 2, 80],  # 2
    ["D", 3, 40],  # 3
    ["E", 2, 50],  # 4
    ["F", 1, 30],  # 5
    ["A", 1, 70],  # 6
]

expected_indexes = [
    (1, 5),
    (0, 5, 6),
    (4,),
    (),
    (2,),
    (0, 1, 6),
    (1, 5),
]

expected = [
    (["B", 1, 20], ["F", 1, 30]),
    (["A", 1, 10], ["F", 1, 30], ["A", 1, 70]),
    (["E", 2, 50]),
    (),
    (["C", 2, 80]),
    (["A", 1, 10], ["B", 1, 20], ["A", 1, 70]),
    (["B", 1, 20], ["F", 1, 30]),
]

If so, you can do the following:

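import numpy as np

arr = np.array(arr, dtype=object)  # the list above must be an array for 2-D slicing
X, Y = arr[:, :2].T                # column X and column Y as 1-D arrays
cond1 = Y[:, None] == Y[None, :]   # cond1[i, j]: rows i and j share the same Y
cond2 = X[:, None] != X[None, :]   # cond2[i, j]: rows i and j have different X
mask = cond1 & cond2               # both conditions hold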

>>> cond1
array([[ True,  True, False, False, False,  True,  True],
       [ True,  True, False, False, False,  True,  True],
       [False, False,  True, False,  True, False, False],
       [False, False, False,  True, False, False, False],
       [False, False,  True, False,  True, False, False],
       [ True,  True, False, False, False,  True,  True],
       [ True,  True, False, False, False,  True,  True]])

>>> cond2
array([[False,  True,  True,  True,  True,  True, False],
       [ True, False,  True,  True,  True,  True,  True],
       [ True,  True, False,  True,  True,  True,  True],
       [ True,  True,  True, False,  True,  True,  True],
       [ True,  True,  True,  True, False,  True,  True],
       [ True,  True,  True,  True,  True, False,  True],
       [False,  True,  True,  True,  True,  True, False]])

>>> mask
array([[False,  True, False, False, False,  True, False],
       [ True, False, False, False, False,  True,  True],
       [False, False, False, False,  True, False, False],
       [False, False, False, False, False, False, False],
       [False, False,  True, False, False, False, False],
       [ True,  True, False, False, False, False,  True],
       [False,  True, False, False, False,  True, False]])

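With the mask, the indices of the True values in each row correspond to expected_indexes.
From here I don't see how you can proceed without a for loop, but the heavy lifting is already done, and the result is no longer a regularly shaped array anyway: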

>>> indexes = [tuple(np.where(r)[0]) for r in mask]
>>> assert indexes == expected_indexes
>>>
>>> result = [tuple(arr[list(inds)]) for inds in indexes]
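One caveat for the array size mentioned in the question: a pairwise mask holds n × n booleans, so for 1 million rows it would need about 10^12 elements (roughly 1 TB at one byte each). A possible alternative, sketched here under the assumption that column Y holds integers and column X holds strings, is to sort by Y and only compare rows within each group of equal Y. The names `order`, `starts`, `groups` and `matches` are introduced for this sketch and do not come from the answer above; it still loops in Python, but never builds the full mask.

```python
import numpy as np

arr = np.array([
    ["A", 1, 10], ["B", 1, 20], ["C", 2, 80], ["D", 3, 40],
    ["E", 2, 50], ["F", 1, 30], ["A", 1, 70],
], dtype=object)

X = arr[:, 0].astype(str)       # assumption: column X holds strings
Y = arr[:, 1].astype(np.int64)  # assumption: column Y holds integers

order = np.argsort(Y, kind="stable")                # row indices sorted by Y
_, starts = np.unique(Y[order], return_index=True)  # where each Y group starts
groups = np.split(order, starts[1:])                # one index array per distinct Y

matches = [()] * len(arr)
for g in groups:        # loop over groups of equal Y
    for i in g:         # compare row i only with rows in its own group
        matches[i] = tuple(int(j) for j in g[X[g] != X[i]])

print(matches)
```

For the seven-row example this reproduces `expected_indexes`. Memory use is proportional to the number of rows plus the number of matches, although a single very large group of equal Y values would still be slow.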

2023-10-19

xwbd5t1u #2

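Your attempt nests two np.where calls, and the only one that yields usable indices is np.where(x[:,0] != y[:,0][::-1]). Be aware that it compares column X with its own reverse and never looks at column Y, so it returns the indices where X differs from the mirrored row.
To see which rows it selects, index x with those indices: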

print('The answer is', x[np.where(x[:,0] != y[:,0][::-1])])
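To make concrete what that expression selects on the sample array, here is a small sketch. The names `col_x`, `reversed_x` and `idx` are introduced here for readability.

```python
import numpy as np

x = np.array([['A', 1, 10], ['B', 1, 20], ['C', 2, 80], ['D', 3, 40], ['E', 2, 50]])

col_x = x[:, 0]
reversed_x = col_x[::-1]                # ['E' 'D' 'C' 'B' 'A']
idx = np.where(col_x != reversed_x)[0]  # positions where X differs from its mirror

print(idx)     # [0 1 3 4]
print(x[idx])  # rows A, B, D and E
```

Only the middle row 'C' is dropped, because it is compared with itself. Column Y plays no part in this selection, so it does not implement the two filters from the question.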

2023-10-19
