Python: How do I access the ith column of a NumPy multidimensional array?

Posted by 0wi1tuuw on 2023-04-19 in Python


Given:

import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

test[i] gives the ith row (e.g. [1, 2]). How can I access the ith column (e.g. [1, 3, 5])? Also, would this be an expensive operation?


Source: https://stackoverflow.com/questions/4455076/how-do-i-access-the-ith-column-of-a-numpy-multidimensional-array

9 Answers


oymdgrw71#

To access column 0:

>>> test[:, 0]
array([1, 3, 5])

To access row 0:

>>> test[0, :]
array([1, 2])

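This is covered in Section 1.4 (Indexing) of the NumPy reference. It is quick, at least in my experience. It is certainly much quicker than accessing each element in a loop.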
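
On the "is this expensive?" part of the question, here is a minimal sketch for an arbitrary column index. The variable names i and col are only illustrative. It suggests that a column slice is a view onto the same memory, so no element data is copied when it is created:

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

i = 1                 # hypothetical column index, chosen for illustration
col = test[:, i]      # every row, column i

print(col)                           # [2 4 6]
print(col.shape)                     # (3,)
print(np.shares_memory(col, test))   # True: the slice is a view, not a copy
```

Because the result is a view, writing into col would also change test.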


vatpfxk52#

>>> test[:,0]
array([1, 3, 5])

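This gives you a 1-D array of shape (3,) (loosely, a row vector). That is fine if you just want to loop over it, but if you want to hstack it with another array of dimension 3xN, you will get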

ValueError: all the input arrays must have same number of dimensions

whereas

>>> test[:,[0]]
array([[1],
       [3],
       [5]])

gives you a column vector of shape (3, 1), so you can use concatenate or hstack.
For example:

>>> np.hstack((test, test[:,[0]]))
array([[1, 2, 1],
       [3, 4, 3],
       [5, 6, 5]])
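
As a hedged aside, the list index is not the only way to keep the column two-dimensional. The sketch below compares three spellings (the variable names are made up for illustration); the slice and np.newaxis forms are basic indexing and return views, while the list form returns a copy. np.column_stack is another option that accepts the 1-D column directly:

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

col_list = test[:, [0]]               # list index: shape (3, 1), a copy
col_slice = test[:, 0:1]              # length-one slice: shape (3, 1), a view
col_newaxis = test[:, 0, np.newaxis]  # 1-D column plus a new axis: shape (3, 1), a view

print(col_list.shape, col_slice.shape, col_newaxis.shape)  # (3, 1) (3, 1) (3, 1)
print(np.shares_memory(col_list, test))    # False
print(np.shares_memory(col_slice, test))   # True

# hstack needs matching dimensions, so pass a 2-D column
stacked = np.hstack((test, col_slice))

# column_stack turns 1-D inputs into columns itself
stacked_1d = np.column_stack((test, test[:, 0]))

print(np.array_equal(stacked, stacked_1d))  # True
```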


hmtdttj43#

If you want to access more than one column at a time, you can do:

>>> test = np.arange(9).reshape((3,3))
>>> test
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> test[:,[0,2]]
array([[0, 2],
       [3, 5],
       [6, 8]])


ruarlubt4#

You can also transpose the array and take a row:

In [4]: test.T[0]
Out[4]: array([1, 3, 5])
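
A small sketch, under the assumption that test is the 3x2 array from the question, suggesting that the transposed row and the column slice are the same view in practice (the names col_t and col_s are illustrative):

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

col_t = test.T[0]     # row 0 of the transpose
col_s = test[:, 0]    # column 0 of the original

print(np.array_equal(col_t, col_s))     # True
print(col_t.strides == col_s.strides)   # True
print(np.shares_memory(col_t, test))    # True: transposing does not copy data
```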


5us2dqdw5#

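Although the question has already been answered, let me mention some nuances.
Say you are interested in the column at index 1 of the array (that is, the second column):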

arr = np.array([[1, 2],
                [3, 4],
                [5, 6]])

As you already know from the other answers, to get it as a "row vector" (an array of shape (3,)), you use slicing:

arr_col1_view = arr[:, 1]         # creates a view of column 1 of arr
arr_col1_copy = arr[:, 1].copy()  # creates a copy of column 1 of arr

To check whether an array is a view or a copy of another array, you can do the following:

arr_col1_view.base is arr  # True
arr_col1_copy.base is arr  # False

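See ndarray.base.
Besides the obvious difference between the two (modifying arr_col1_view will affect arr), the number of bytes stepped when traversing each of them also differs: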

arr_col1_view.strides[0]  # 8 bytes with a 4-byte integer dtype (16 with int64)
arr_col1_copy.strides[0]  # 4 bytes with a 4-byte integer dtype (8 with int64)

See strides and this answer.
Why is this important? Imagine that you have a very big array A instead of arr:

A = np.random.randint(2, size=(10000, 10000), dtype='int32')
A_col1_view = A[:, 1] 
A_col1_copy = A[:, 1].copy()

Suppose you want to compute the sum of all the elements of column 1, i.e. A_col1_view.sum() or A_col1_copy.sum(). Using the copied version is much faster:

%timeit A_col1_view.sum()  # ~248 µs
%timeit A_col1_copy.sum()  # ~12.8 µs

This is due to the different strides mentioned before:

A_col1_view.strides[0]  # 40000 bytes
A_col1_copy.strides[0]  # 4 bytes

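Although it may seem that using column copies is better, that is not always true, because making a copy also takes time and uses more memory (in this case it took me approximately 200 µs to create A_col1_copy). However, if we need the copy in the first place, or we need to do many different operations on a specific column of the array and we are fine with sacrificing memory for speed, then making a copy is the way to go.
If we are mostly interested in working with columns, it could be a good idea to create the array in column-major ('F') order instead of row-major ('C') order (which is the default), and then slice as before to get a column without copying it: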

A = np.asfortranarray(A)   # or np.array(A, order='F')
A_col1_view = A[:, 1]
A_col1_view.strides[0]     # 4 bytes

%timeit A_col1_view.sum()  # ~12.6 µs vs ~248 µs

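Now, performing the sum operation (or any other) on a column view is as fast as performing it on a column copy.
Finally, let me note that transposing an array and using row slicing is the same as using column slicing on the original array, because transposing is done by just swapping the shape and the strides of the original array. For the original C-ordered A:

A[:, 1].strides[0]    # 40000 bytes
A.T[1, :].strides[0]  # 40000 bytes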
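
The stride figures depend on the dtype and memory order, so here is a hedged sketch that pins both down explicitly. It assumes a smaller 1000x1000 int32 array than the one timed above, and the names col_view, col_copy and F are illustrative only:

```python
import numpy as np

# Explicit int32 so the byte counts do not depend on the platform default
A = np.zeros((1000, 1000), dtype=np.int32)   # row-major ('C') order by default

col_view = A[:, 1]           # strided view into A
col_copy = A[:, 1].copy()    # contiguous copy

print(col_view.strides)                  # (4000,): one full row between elements
print(col_copy.strides)                  # (4,): elements are adjacent
print(col_view.flags['C_CONTIGUOUS'])    # False
print(col_copy.flags['C_CONTIGUOUS'])    # True

# Column-major layout makes column views contiguous without copying the column
F = np.asfortranarray(A)
print(F[:, 1].strides)                   # (4,)
print(np.shares_memory(F[:, 1], F))      # True: still a view
```

The contiguous cases are the ones that were fast in the timings above; actual numbers will vary by machine.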


zzoitvuj6#

To get several independent columns, just:

>>> test[:, [0, 2]]

You will get columns 0 and 2.


ztigrdn87#

>>> test = np.arange(10).reshape((2, 5))
>>> test
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

>>> ncol = test.shape[1]
>>> ncol
5

Then you can select the 2nd to 4th columns (indices 1 to 3) this way:

>>> test[0:, 1:(ncol - 1)]
array([[1, 2, 3],
       [6, 7, 8]])
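
As a possible simplification, a negative stop index expresses "all but the first and last column" without computing ncol. A minimal sketch, assuming the same 2x5 array (the name inner is illustrative):

```python
import numpy as np

test = np.arange(10).reshape((2, 5))
ncol = test.shape[1]

inner = test[:, 1:-1]     # drop the first and last columns

print(inner)
# [[1 2 3]
#  [6 7 8]]
print(np.array_equal(inner, test[0:, 1:(ncol - 1)]))   # True
```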


bxjv4tth8#

This is not multidimensional; it is a 2-dimensional array. You can access the columns you want with a slice:

test = numpy.array([[1, 2], [3, 4], [5, 6]])
test[:, a:b]  # replace a and b with the start and stop column indices


byqmnocz9#

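This question has already been answered, but here is a note on views versus copies.
If an array is indexed with a scalar (regular indexing), the result is a view (x below), which means any change made to x is reflected in test, because x is just a different view of test.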

test = np.array([[1, 2], [3, 4], [5, 6]])
# select second column
x = test[:, 1]
x[:] = 100        # <---- this does affect test

test
array([[  1, 100],
       [  3, 100],
       [  5, 100]])

However, if an array is indexed with a list or array-like (advanced indexing), the result is a copy, which means any change to x does not affect test.

test = np.array([[1, 2], [3, 4], [5, 6]])
# select second column
x = test[:, [1]]
x[:] = 100        # <---- this does not affect test

test
array([[1, 2],
       [3, 4],
       [5, 6]])

In general, indexing with a slice returns a view:

test = np.array([[1, 2], [3, 4], [5, 6]])
x = test[:, :2]
x[:] = 100

test
array([[100, 100],
       [100, 100],
       [100, 100]])

but indexing with an array returns a copy:

test = np.array([[1, 2], [3, 4], [5, 6]])
x = test[:, np.r_[:2]]
x[:] = 100

test
array([[1, 2],
       [3, 4],
       [5, 6]])

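Regular indexing is very fast, and advanced indexing is much slower (that said, it is still nearly instantaneous and it certainly won't be the bottleneck of a program).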
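
If you would rather check than remember the rule, np.shares_memory reports whether a result still points into the original buffer. A short sketch covering the four indexing styles above (the dictionary name results is made up for illustration):

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

results = {
    "scalar index": test[:, 1],          # regular indexing
    "slice": test[:, :2],                # regular indexing
    "list index": test[:, [1]],          # advanced indexing
    "array index": test[:, np.r_[:2]],   # advanced indexing
}

for name, result in results.items():
    kind = "view" if np.shares_memory(result, test) else "copy"
    print(f"{name}: {kind}")
# scalar index: view
# slice: view
# list index: copy
# array index: copy
```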
