Optimizing vectorized transformation on NumPy arrays

Posted by xesrikrc on 2023-10-19 in Other


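I have a function that performs a matrix transformation on a NumPy array, and I would like to optimize its performance by eliminating the loop and making full use of NumPy's vectorized operations. The function transformation(current_tuple) takes a 1D tuple current_tuple and returns a 2x2 array.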

MWE:

import numpy as np

def transformation(current_tuple):
    return np.array([[current_tuple[0]  ,   current_tuple[1]],
                    [   0              ,   3.*current_tuple[0]]])

array_with_two_rows = np.random.randint(0, 10, size=(2, 6)) # Example 2D array

transformation_result = np.vstack([transformation(x) for x in array_with_two_rows.T])

print('true')

Currently, I use a for loop to apply transformation() to each column.

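**Question:** I am looking for a more efficient, vectorized alternative that achieves the same result without a loop.
**PS:** *The transformation logic is only a dummy, meant to show that the transformation returns a higher-dimensional matrix. The transformation logic may change.* The only thing that can be relied on is that it takes a row vector/array as input and returns a 2D array as output (which can be 2x2, 2x5, 3x7, etc.).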

numpy

Source: https://stackoverflow.com/questions/77267769/optimizing-vectorized-transformation-on-numpy-arrays

1 Answer


bjp0bcyl1#

Following up on my comment about using lists instead of arrays, let's define a version of the function that returns a list:

In [251]: def trans_list(x):
     ...:     return [[x[0], x[1]], [0, 3.*x[0]]]
     ...:

Your original calculation, using the array_with_two_rows from my previous (deleted) answer:

In [252]: np.vstack([transformation(x) for x in array_with_two_rows.T])
Out[252]: 
array([[ 9.,  9.],
       [ 0., 27.],
       [ 8.,  7.],
       [ 0., 24.],
       [ 9.,  0.],
       [ 0., 27.],
       [ 5.,  5.],
       [ 0., 15.],
       [ 1.,  1.],
       [ 0.,  3.],
       [ 6.,  5.],
       [ 0., 18.]])

In [253]: timeit np.vstack([transformation(x) for x in array_with_two_rows.T])
96.6 µs ± 2.19 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Trying the same array, but in list mode (tolist() is quite efficient at converting an array to a list):

In [254]: [trans_list(x) for x in array_with_two_rows.T.tolist()]
Out[254]: 
[[[9, 9], [0, 27.0]],
 [[8, 7], [0, 24.0]],
 [[9, 0], [0, 27.0]],
 [[5, 5], [0, 15.0]],
 [[1, 1], [0, 3.0]],
 [[6, 5], [0, 18.0]]]

In [255]: timeit [trans_list(x) for x in array_with_two_rows.T.tolist()]
5.63 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

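Much faster. For small inputs, Python list operations are often faster, especially when they avoid creating many small arrays. I can imagine rewriting this as a for loop that uses extend, to remove one level of list nesting.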
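The extend idea mentioned above could be sketched roughly as follows. This is only an illustration: the helper name stack_with_extend is hypothetical, and the example array is reconstructed from the values shown in Out[252] rather than taken from the original session.

```python
import numpy as np

def trans_list(x):
    # Same list-returning transformation as defined in In [251].
    return [[x[0], x[1]], [0, 3.0 * x[0]]]

def stack_with_extend(arr):
    """Collect every output row in one flat list, then convert to an array once."""
    rows = []
    for col in arr.T.tolist():
        # extend() appends the two rows of each 2x2 block individually,
        # so the result is a list of rows rather than a list of blocks.
        rows.extend(trans_list(col))
    return np.array(rows)

# Input reconstructed from Out[252] (an assumption about the original random data).
example = np.array([[9, 8, 9, 5, 1, 6],
                    [9, 7, 0, 5, 1, 5]])
result = stack_with_extend(example)
print(result.shape)
```

Because the rows are already flat, a single np.array call replaces np.vstack here. Whether that is faster for a given input size would need to be timed.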
Even if I add vstack back to the list version, it is still faster:

In [256]: np.vstack([trans_list(x) for x in array_with_two_rows.T.tolist()])
Out[256]: 
array([[ 9.,  9.],
       [ 0., 27.],
       [ 8.,  7.],
       [ 0., 24.],
       [ 9.,  0.],
       [ 0., 27.],
       [ 5.,  5.],
       [ 0., 15.],
       [ 1.,  1.],
       [ 0.,  3.],
       [ 6.,  5.],
       [ 0., 18.]])

In [257]: timeit np.vstack([trans_list(x) for x in array_with_two_rows.T.tolist()])
45.5 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The usual timing disclaimers apply; the alternatives may scale differently.
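One way to check how the two loop-based variants scale is to time them at more than one input size. The sketch below uses the standard timeit module instead of the IPython timeit magic. The sizes 6 and 600, the repeat counts, and the wrapper names array_loop and list_loop are arbitrary choices for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

def transformation(x):
    # Array-returning version from the question.
    return np.array([[x[0], x[1]], [0, 3.0 * x[0]]])

def trans_list(x):
    # List-returning version from the answer.
    return [[x[0], x[1]], [0, 3.0 * x[0]]]

def array_loop(arr):
    return np.vstack([transformation(x) for x in arr.T])

def list_loop(arr):
    return np.vstack([trans_list(x) for x in arr.T.tolist()])

rng = np.random.default_rng(0)
timings = {}
for n in (6, 600):  # number of columns; illustrative sizes only
    arr = rng.integers(0, 10, size=(2, n))
    timings[n] = {
        f.__name__: min(timeit.repeat(lambda: f(arr), number=20, repeat=3))
        for f in (array_loop, list_loop)
    }
    print(n, timings[n])  # seconds for 20 calls, best of 3 repeats
```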
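The array-multiplication approach I suggested earlier does well because it works directly with NumPy whole-array methods (x here appears to be array_with_two_rows.T, with shape (6, 2)). Note that, as written, the multiplier scales the lower-right entry by x[1] rather than x[0], so Out[258] differs from Out[252] wherever the two inputs differ (for example 21 instead of 24):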

In [258]: (x[:,None,:] * np.array([[1,1],[0,3]])).reshape(-1,2)
Out[258]: 
array([[ 9,  9],
       [ 0, 27],
       [ 8,  7],
       [ 0, 21],
       [ 9,  0],
       [ 0,  0],
       [ 5,  5],
       [ 0, 15],
       [ 1,  1],
       [ 0,  3],
       [ 6,  5],
       [ 0, 15]])

In [259]: timeit (x[:,None,:] * np.array([[1,1],[0,3]])).reshape(-1,2)
15.9 µs ± 21 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
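For the specific transformation in the question, a whole-array version that reproduces Out[252] can fill each slot of the 2x2 blocks directly. A single elementwise product with a constant 2x2 matrix cannot place x[0] in both columns, which is why Out[258] shows 3*x[1] in the lower-right entry. The following is a minimal sketch, not the answer author's method: the function name transformation_vectorized is hypothetical, and the example input is reconstructed from Out[252].

```python
import numpy as np

def transformation_vectorized(arr):
    """Apply the question's 2x2 transformation to every column of arr at once."""
    a, b = arr[0], arr[1]                 # the two input rows, each of length n
    out = np.zeros((arr.shape[1], 2, 2))  # one 2x2 block per column; [1, 0] stays 0
    out[:, 0, 0] = a
    out[:, 0, 1] = b
    out[:, 1, 1] = 3.0 * a
    return out.reshape(-1, 2)             # stack the blocks vertically, like np.vstack

# Input reconstructed from Out[252] (an assumption about the original random data).
example = np.array([[9, 8, 9, 5, 1, 6],
                    [9, 7, 0, 5, 1, 5]])
result = transformation_vectorized(example)
print(result)
```

The same pattern extends to other output shapes (2x5, 3x7, and so on) as long as each output entry can be written as an elementwise expression of the input rows. A transformation that is an arbitrary Python function would still need a loop.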

Answered 2023-10-19
