How to copy data from one numpy array to another

nwnhqdif  posted on 2023-05-17 in Other

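What is the fastest way to copy data from array b to array a without modifying the address of array a? I need this because an external library (PyFFTW) uses a pointer to my array, and that pointer must not change.
For example: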

import numpy

a = numpy.empty(n, dtype=complex)
for i in range(a.size):  # element-by-element copy in Python
    a[i] = b[i]

Is it possible to do this without a loop?

ctzwtxfj #1

I believe

a = numpy.empty_like(b)
a[:] = b

will copy the values quickly. As Funsi mentions, recent versions of numpy also have the copyto function.
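A minimal sketch of why slice assignment fits the constraint in the question: it writes into the existing buffer of a, so an address held by an external library should stay valid. The size n and the name addr_before are illustrative assumptions, not part of the original answer.

```python
import numpy as np

n = 8  # illustrative size
a = np.empty(n, dtype=complex)
b = np.arange(n, dtype=complex)

# Address of a's data buffer before the copy.
addr_before = a.ctypes.data

a[:] = b  # copies the values of b into a's existing buffer

# The buffer is the same one; only its contents changed.
same_address = (a.ctypes.data == addr_before)
print(same_address, np.array_equal(a, b))
```

Note that the address is only preserved when a already exists; creating a with empty_like allocates a new buffer first.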

cyvaqqii #2

NumPy 1.7 and later provide the numpy.copyto function, which does what you are looking for:
numpy.copyto(dst, src)
Copies values from one array to another, broadcasting as necessary.
See: https://docs.scipy.org/doc/numpy/reference/generated/numpy.copyto.html
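As a small illustration of the broadcasting behaviour mentioned above, the following sketch copies a 1-D source into a 2-D destination and then uses the optional where mask. The array shapes and values are assumptions chosen for the example.

```python
import numpy as np

dst = np.zeros((2, 3), dtype=complex)
src = np.array([1.0, 2.0, 3.0])  # float64, shape (3,)

# src is broadcast across both rows and cast to complex
# (the default casting rule is 'same_kind').
np.copyto(dst, src)

# The optional where mask restricts which elements are overwritten.
mask = np.array([True, False, True])
np.copyto(dst, -1.0, where=mask)

print(dst)
```

The destination keeps its own dtype and buffer throughout; only the selected elements are overwritten.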

envsm3lx #3

a = numpy.array(b)

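is even faster than the solutions suggested so far, at least up to numpy v1.6, and it also makes a copy of the array. However, I could not test it against copyto(a, b), since I don't have the most recent version of numpy.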
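One caveat worth checking against the original question: numpy.array(b) allocates a new array and rebinds the name a, so anything that still refers to the old buffer does not see the new values. The sketch below uses a second Python reference, held, as a hypothetical stand-in for the pointer an external library keeps.

```python
import numpy as np

b = np.arange(4, dtype=float)
a = np.zeros(4)
held = a  # hypothetical stand-in for a reference kept by an external library

a = np.array(b)  # allocates a new array and rebinds the name a

# The old buffer is untouched, and the new array does not share memory with it.
rebinding_visible = np.array_equal(held, b)
shares = np.shares_memory(a, held)
print(rebinding_visible, shares)

# Writing in place is what updates the buffer the holder sees.
np.copyto(held, b)
print(np.array_equal(held, b))
```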

az31mfrm #4

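To answer your question, I tried some variants and profiled them.
Conclusion: to copy data from one numpy array to another, use one of the built-in numpy functions numpy.array(src) or numpy.copyto(dst, src) wherever possible.

Update 2022-05: re-testing with numpy v1.22 and CPython v3.9 showed that src.astype(...) is currently the fastest almost consistently on my system. It is therefore best to run the provided code snippet yourself to get the numbers for your particular setup.

(But always choose numpy.copyto(dst, src) if the memory of dst is already allocated, so that the memory is reused. See the profiling at the end of this post.)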

Profiling setup

import timeit
import numpy as np
import pandas as pd
from IPython.display import display
    
def profile_this(methods, setup='', niter=10 ** 4, p_globals=None, **kwargs):
    if p_globals is not None:
        print('globals: {0}, tested {1:.0e} times'.format(p_globals, niter))
    timings = np.array([timeit.timeit(method, setup=setup, number=niter,
                                      globals=p_globals, **kwargs) for 
                        method in methods])
    ranking = np.argsort(timings)
    timings = np.array(timings)[ranking]
    methods = np.array(methods)[ranking]
    speedups = np.amax(timings) / timings

    # pd.set_option('html', False)
    data = {'time (s)': timings,
            'speedup': ['{:.2f}x'.format(s) if s != 1 else '' for s in speedups],
            'methods': methods}
    data_frame = pd.DataFrame(data, columns=['time (s)', 'speedup', 'methods'])

    display(data_frame)
    print()

Profiling code

setup = '''import numpy as np; x = np.random.random(n)'''
methods = (
    '''y = np.zeros(n, dtype=x.dtype); y[:] = x''',
    '''y = np.zeros_like(x); y[:] = x''',
    '''y = np.empty(n, dtype=x.dtype); y[:] = x''',
    '''y = np.empty_like(x); y[:] = x''',
    '''y = np.copy(x)''',
    '''y = np.array(x)''',  # appears in the results below
    '''y = x.astype(x.dtype)''',
    '''y = 1*x''',
    '''y = np.empty_like(x); np.copyto(y, x)''',
    '''y = np.empty_like(x); np.copyto(y, x, casting='no')''',
    # Pure-Python loop; kept last so it can be dropped for large n.
    '''y = np.empty(n)\nfor i in range(x.size):\n\ty[i] = x[i]'''
)

for n, it in ((2, 6), (3, 6), (3.8, 6), (4, 6), (5, 5), (6, 4.5)):
    profile_this(methods[:-1:] if n > 2 else methods, setup, 
                 niter=int(10 ** it), p_globals={'n': int(10 ** n)})

Results for Windows 7 on an Intel i7 CPU, with CPython v3.5.0 and numpy v1.10.1.

globals: {'n': 100}, tested 1e+06 times

     time (s) speedup                                            methods
0    0.386908  33.76x                                    y = np.array(x)
1    0.496475  26.31x                              y = x.astype(x.dtype)
2    0.567027  23.03x              y = np.empty_like(x); np.copyto(y, x)
3    0.666129  19.61x                     y = np.empty_like(x); y[:] = x
4    0.967086  13.51x                                            y = 1*x
5    1.067240  12.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
6    1.235198  10.57x                                     y = np.copy(x)
7    1.624535   8.04x           y = np.zeros(n, dtype=x.dtype); y[:] = x
8    1.626120   8.03x           y = np.empty(n, dtype=x.dtype); y[:] = x
9    3.569372   3.66x                     y = np.zeros_like(x); y[:] = x
10  13.061154          y = np.empty(n)\nfor i in range(x.size):\n\ty[...

globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                                            methods
0  0.666237   6.10x                              y = x.astype(x.dtype)
1  0.740594   5.49x              y = np.empty_like(x); np.copyto(y, x)
2  0.755246   5.39x                                    y = np.array(x)
3  1.043631   3.90x                     y = np.empty_like(x); y[:] = x
4  1.398793   2.91x                                            y = 1*x
5  1.434299   2.84x  y = np.empty_like(x); np.copyto(y, x, casting=...
6  1.544769   2.63x                                     y = np.copy(x)
7  1.873119   2.17x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  2.355593   1.73x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  4.067133                             y = np.zeros_like(x); y[:] = x

globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                                            methods
0  2.338428   3.05x                                    y = np.array(x)
1  2.466636   2.89x                              y = x.astype(x.dtype)
2  2.561535   2.78x              y = np.empty_like(x); np.copyto(y, x)
3  2.603601   2.74x                     y = np.empty_like(x); y[:] = x
4  3.005610   2.37x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  3.215863   2.22x                                     y = np.copy(x)
6  3.249763   2.19x                                            y = 1*x
7  3.661599   1.95x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  6.344077   1.12x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  7.133050                             y = np.zeros_like(x); y[:] = x

globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                                            methods
0  3.421806   2.82x                                    y = np.array(x)
1  3.569501   2.71x                              y = x.astype(x.dtype)
2  3.618747   2.67x              y = np.empty_like(x); np.copyto(y, x)
3  3.708604   2.61x                     y = np.empty_like(x); y[:] = x
4  4.150505   2.33x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  4.402126   2.19x                                     y = np.copy(x)
6  4.917966   1.96x           y = np.empty(n, dtype=x.dtype); y[:] = x
7  4.941269   1.96x                                            y = 1*x
8  8.925884   1.08x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  9.661437                             y = np.zeros_like(x); y[:] = x

globals: {'n': 100000}, tested 1e+05 times

    time (s) speedup                                            methods
0   3.858588   2.63x                              y = x.astype(x.dtype)
1   3.873989   2.62x                                    y = np.array(x)
2   3.896584   2.60x              y = np.empty_like(x); np.copyto(y, x)
3   3.919729   2.58x  y = np.empty_like(x); np.copyto(y, x, casting=...
4   3.948563   2.57x                     y = np.empty_like(x); y[:] = x
5   4.000521   2.53x                                     y = np.copy(x)
6   4.087255   2.48x           y = np.empty(n, dtype=x.dtype); y[:] = x
7   4.803606   2.11x                                            y = 1*x
8   6.723291   1.51x                     y = np.zeros_like(x); y[:] = x
9  10.131983                   y = np.zeros(n, dtype=x.dtype); y[:] = x

globals: {'n': 1000000}, tested 3e+04 times

     time (s) speedup                                            methods
0   85.625484   1.24x                     y = np.empty_like(x); y[:] = x
1   85.693316   1.24x              y = np.empty_like(x); np.copyto(y, x)
2   85.790064   1.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
3   86.342230   1.23x           y = np.empty(n, dtype=x.dtype); y[:] = x
4   86.954862   1.22x           y = np.zeros(n, dtype=x.dtype); y[:] = x
5   89.503368   1.18x                                    y = np.array(x)
6   91.986177   1.15x                                            y = 1*x
7   95.216021   1.11x                                     y = np.copy(x)
8  100.524358   1.05x                              y = x.astype(x.dtype)
9  106.045746                             y = np.zeros_like(x); y[:] = x

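Also, see the results for a variant of the profiling in which the destination's memory is already pre-allocated when the values are copied, because y = np.empty_like(x) is part of the setup: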

globals: {'n': 100}, tested 1e+06 times

   time (s) speedup                        methods
0  0.328492   2.33x                np.copyto(y, x)
1  0.384043   1.99x                y = np.array(x)
2  0.405529   1.89x                       y[:] = x
3  0.764625          np.copyto(y, x, casting='no')

globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                        methods
0  0.453094   1.95x                np.copyto(y, x)
1  0.537594   1.64x                       y[:] = x
2  0.770695   1.15x                y = np.array(x)
3  0.884261          np.copyto(y, x, casting='no')

globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                        methods
0  2.125426   1.20x                np.copyto(y, x)
1  2.182111   1.17x                       y[:] = x
2  2.364018   1.08x                y = np.array(x)
3  2.553323          np.copyto(y, x, casting='no')

globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                        methods
0  3.196402   1.13x                np.copyto(y, x)
1  3.523396   1.02x                       y[:] = x
2  3.531007   1.02x                y = np.array(x)
3  3.597598          np.copyto(y, x, casting='no')

globals: {'n': 100000}, tested 1e+05 times

   time (s) speedup                        methods
0  3.862123   1.01x                np.copyto(y, x)
1  3.863693   1.01x                y = np.array(x)
2  3.873194   1.01x                       y[:] = x
3  3.909018          np.copyto(y, x, casting='no')
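
The pre-allocated variant can be approximated with a much smaller script that needs neither pandas nor IPython. This is a sketch under assumptions: the array size n and the repetition count niter are illustrative and far smaller than in the tables above, so the absolute numbers will differ and should be measured on your own system.

```python
import timeit
import numpy as np

n = 10_000     # illustrative array size
niter = 1_000  # illustrative repetition count

x = np.random.random(n)
y = np.empty_like(x)  # destination allocated once, outside the timed statements

statements = ("np.copyto(y, x)", "y[:] = x", "y = np.array(x)")
timings = {
    stmt: timeit.timeit(stmt, number=niter, globals={"np": np, "x": x, "y": y})
    for stmt in statements
}

# Print the statements from fastest to slowest.
for stmt, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{seconds:.6f} s  {stmt}")
```

Only the first two statements write into the pre-allocated y; y = np.array(x) still allocates a new array on every call.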

roqulrg3 #5

You can simply use:

b = 1*a

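This is the fastest way, but it also has some pitfalls. If you don't explicitly define the dtype of a and also don't check the dtype of b, you can get into trouble. For example: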

a = np.arange(10)        # dtype = int64
b = 1*a                  # dtype = int64

a = np.arange(10.)       # dtype = float64
b = 1*a                  # dtype = float64

a = np.arange(10)        # dtype = int64
b = 1. * a               # dtype = float64

I hope I could make the point clear. Sometimes a single small operation is enough to change the data type.
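The dtype pitfall can be checked directly. The sketch below compares the arithmetic copies with an in-place assignment into a pre-allocated destination, which keeps the destination's dtype; the name dst and the float32 destination are illustrative assumptions.

```python
import numpy as np

a = np.arange(10)      # integer dtype (the exact width is platform dependent)
same = 1 * a           # multiplying by a Python int keeps the integer dtype
promoted = 1.0 * a     # multiplying by a Python float promotes to float64

# Copying into a pre-allocated destination keeps the destination's dtype.
dst = np.empty(10, dtype=np.float32)
dst[:] = a             # values are cast to float32 on assignment

print(same.dtype, promoted.dtype, dst.dtype)
```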

jmo0nnb3 #6

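You can do a lot of different things:

import copy  # needed for copy.deepcopy below

a = np.copy(b)
a = np.array(b)  # does the same as np.copy
a[:] = b  # a needs to be preallocated
a = b[np.arange(b.shape[0])]  # indexing with an index array returns a copy
a = copy.deepcopy(b)

Things that don't work:

a = b  # only binds a second name to the same array
a = b[:]  # a view that shares memory with b; this has given my code bugs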
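
To make the difference concrete, here is a short sketch that modifies the source after each kind of "copy" and checks which results are affected. The names alias, view and copied are illustrative.

```python
import numpy as np

b = np.arange(5)

alias = b          # same object, no data copied
view = b[:]        # new array object, but it shares b's buffer
copied = b.copy()  # independent buffer

b[0] = 99  # modify the source in place

print(alias[0], view[0], copied[0])
print(np.shares_memory(view, b), np.shares_memory(copied, b))
```

The alias and the view both follow the change to b, while the real copy keeps the original value.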

x7yiwoj4 #7

Why not use

a = 0 + b

I believe it is similar to the multiplication suggested earlier, but it might be simpler.

ugmeyewa #8

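Assuming the destination array a already exists, I can think of three options (two of which have already been mentioned in other answers):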

a[...] = b
a[:] = b
np.copyto(a, b)

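I have tested them for contiguous arrays. For large arrays, all of these methods run at roughly the same speed (the time is dominated by the actual copying, which is equally efficient in all three). For small arrays, the first seems slightly faster than the second, which in turn is slightly faster than the third.

In terms of readability, the first two are roughly equivalent to me (with a slight preference for the first, since it does not suggest a slightly confusing iteration over the first dimension of the destination array). I find the last one less readable, because I tend to forget whether it follows Intel (mov dst, src) or AT&T (mov src, dst) syntax, unless named arguments are used (np.copyto(dst=a, src=b)), which can be a bit verbose. It does not help that the destination comes first for copyto but last for ufuncs (for example, np.sin(b, a) is equivalent to a[...] = np.sin(b), except that it avoids creating a temporary array).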
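
The three options and the ufunc case can be put side by side. This is a minimal sketch with illustrative data; it writes the ufunc destination as the out= keyword, which should make the direction explicit, and checks that a's buffer address never changes.

```python
import numpy as np

b = np.linspace(0.0, 1.0, 5)
a = np.empty_like(b)
addr = a.ctypes.data  # address of a's buffer before any copy

a[...] = b        # Ellipsis assignment
a[:] = b          # slice assignment
np.copyto(a, b)   # destination first, source second

# For ufuncs the destination comes last; out= names it explicitly.
np.sin(b, out=a)  # same result as a[...] = np.sin(b), without a temporary

print(a.ctypes.data == addr)
```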
