numpy: Subset/slice a dask array using a lower-dimensional boolean array

Posted by 368yc8dk on 2023-08-05 in Other

With numpy, I can use a 2D "mask" to subset a 3D array. The same operation on a dask array raises an IndexError. Is there any way to reproduce the numpy behaviour below with dask?

import numpy as np
import dask.array as da

# Create 3d arrays of random values and mask with shape matching second and third dimensions
y_da = da.random.random(size=(20, 100, 100))
y_np = np.random.rand(20, 100, 100)
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:80, 3:77] = 1

# Apply mask (flattens axes 1 and 2)
print(y_np[:,mask == 1].shape) # OK
print(y_da[:,mask == 1].shape) # IndexError

wbrvyc0a #1

In [204]: y_np = np.random.rand(20, 100, 100)
     ...: mask = np.zeros((100, 100), dtype=np.uint8)
     ...: mask[20:80, 3:77] = 1
In [205]: y_np.shape
Out[205]: (20, 100, 100)
In [206]: mask.shape
Out[206]: (100, 100)
In [207]: y_np[:,mask == 1].shape
Out[207]: (20, 4440)

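In numpy, indexing with a boolean mask is generally equivalent to indexing with the corresponding "advanced indexing" integer arrays. So: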

In [208]: I,J=np.nonzero(mask==1)
In [209]: I.shape
Out[209]: (4440,)
In [210]: y_np[:,I,J].shape
Out[210]: (20, 4440)
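
A quick check that the two forms return the same values and not just the same shape. This is only a sketch; the smaller shapes and the seed are illustrative assumptions, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
y = rng.random((4, 10, 10))
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:8, 3:7] = 1

# Boolean-mask indexing over the last two axes
by_mask = y[:, mask == 1]

# Equivalent integer ("advanced") indexing via nonzero
I, J = np.nonzero(mask == 1)
by_index = y[:, I, J]

# Compare values element by element, not only the shapes
same = np.array_equal(by_mask, by_index)
print(by_mask.shape, same)
```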


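See the boolean array indexing section of the indexing basics documentation:
https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing
You can apply the same idea to dask.
I had to unpack the result of nonzero into I, J. Passing the tuple directly treats it as a single index array on axis 1: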

In [211]: y_np[:,np.nonzero(mask==1)].shape
Out[211]: (20, 2, 4440, 100)

In [213]: y_np[(slice(None),*np.nonzero(mask==1))].shape
Out[213]: (20, 4440)
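
For the dask side, here is one possible sketch, not benchmarked. As far as I know, dask's plain `[]` indexing is more restrictive than numpy's when several index arrays are used at once. So this version merges the two masked axes into one with `reshape` and indexes that single axis with flat positions from `np.flatnonzero`. The helper name `select_masked` and the chunk sizes are made up for illustration, and the dask part only runs if dask is installed:

```python
import numpy as np


def select_masked(arr, mask):
    """Return the equivalent of arr[:, mask] for a 3D arr and a 2D mask.

    Hypothetical helper: it relies only on reshape and integer indexing
    along a single axis, so it does not need multi-array fancy indexing.
    """
    flat_idx = np.flatnonzero(mask)          # positions of nonzero entries, C order
    merged = arr.reshape(arr.shape[0], -1)   # (n, rows * cols)
    return merged[:, flat_idx]


y_np = np.random.rand(20, 100, 100)
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:80, 3:77] = 1

# Sanity check against numpy's own boolean-mask indexing
out_np = select_masked(y_np, mask == 1)
matches_numpy = np.array_equal(out_np, y_np[:, mask == 1])
print(out_np.shape, matches_numpy)

try:
    import dask.array as da
except ImportError:
    da = None  # dask is optional for this sketch

if da is not None:
    # Chunk only along axis 0 so that merging axes 1 and 2 needs no rechunking
    y_da = da.from_array(y_np, chunks=(5, 100, 100))
    out_da = select_masked(y_da, mask == 1)
    print(out_da.shape)
```

The mask stays a numpy array here, so the output shape is known up front. If the mask itself were a dask array, the result would have unknown chunk sizes until computed.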
