Python: write a multi-dimensional NumPy array to many files

Posted by x8diyxa7 on 2022-11-21 in Python


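I'd like to know whether there is a more efficient way to do the following without using a loop.
I have a NumPy array of shape (i, x, y, z). Essentially, I have i elements, each of shape (x, y, z). I want to write each element to a separate file, so that I end up with i files, each containing the data from a single element.
In my case each element is an image, but I'm sure a solution can be format-agnostic.
I'm currently looping over each of the i elements and writing them out one at a time.
As i grows, this takes longer and longer. Is there a better approach, or a useful library, that would make this more efficient?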
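For reference, the loop described above might look like the following minimal sketch. It assumes each element is saved with numpy.save as a .npy file; the function name write_elements and the file naming scheme are hypothetical and not taken from the question.

```python
import os
import tempfile

import numpy as np


def write_elements(arr, out_dir):
    """Write each element arr[k] to its own .npy file and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for k, element in enumerate(arr):  # iterates over the first axis (the i elements)
        path = os.path.join(out_dir, f"element_{k:05d}.npy")  # hypothetical naming
        np.save(path, element)
        paths.append(path)
    return paths


if __name__ == "__main__":
    data = np.ones((4, 8, 8, 3), dtype=np.uint8)  # small stand-in for the real array
    with tempfile.TemporaryDirectory() as tmp:
        written = write_elements(data, tmp)
        print(f"wrote {len(written)} files")
```

The run time of this baseline grows linearly with i, which matches the behaviour the question describes.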

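Update

I tried the suggestion to use concurrent multiprocessing. I tried a thread pool first, and then a process pool. The code was simpler, but it took 4 times longer to complete.
In this case i is around 10,000, while x and y are around 750.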
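The thread pool attempt mentioned in the update is not shown in the post. A minimal sketch of what such an attempt could look like is given below, assuming concurrent.futures.ThreadPoolExecutor and numpy.save; the names save_one and write_elements_threaded are hypothetical. Threads share the parent's memory, so the array is not copied. With a process pool, by contrast, each element generally has to be pickled and sent to a worker, which may account for part of the slowdown reported above, although the post does not confirm the cause.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def save_one(arr, k, out_dir):
    """Save element k of arr to its own .npy file and return the path."""
    path = os.path.join(out_dir, f"element_{k:05d}.npy")  # hypothetical naming
    np.save(path, arr[k])
    return path


def write_elements_threaded(arr, out_dir, max_workers=4):
    """Write every element of arr in parallel using a pool of threads."""
    os.makedirs(out_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(lambda k: save_one(arr, k, out_dir), range(len(arr))))


if __name__ == "__main__":
    data = np.ones((8, 16, 16, 3), dtype=np.uint8)  # small stand-in for the real array
    with tempfile.TemporaryDirectory() as tmp:
        print(f"wrote {len(write_elements_threaded(data, tmp))} files")
```

Whether this beats the plain loop depends on the storage device and file format, so it is worth timing both on the real data.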

python

Source: https://stackoverflow.com/questions/74466262/write-multi-dimensional-numpy-array-to-many-files

1 answer

7z5jn7bk1#

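This sounds like a very good fit for multiprocessing, since the different elements need to be processed separately and can be saved to disk independently.
Python has a very useful package for this called multiprocessing, which offers various pools, processes, and other options.
Here is a simple (and commented) usage example: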

from multiprocessing import Process

import numpy as np


# This should be your existing function
def write_file(element):
    # write file
    pass


# You'll still be looping, of course, but in parallel over batches.
# This is a helper function for looping over one "batch".
def write_list_of_files(elements_list):
    for element in elements_list:
        write_file(element)


if __name__ == "__main__":  # Guard needed where child processes are spawned (e.g. Windows)
    # Your data goes here...
    all_elements = np.ones((1000, 256, 256, 3))

    num_procs = 10  # Depends on system limitations, number of CPU cores, etc.

    # Each process runs "write_list_of_files" on a different batch, thanks to
    # the "k::num_procs" indexing trick (every num_procs-th element, starting at k).
    procs = [
        Process(target=write_list_of_files, args=(all_elements[k::num_procs, ...],))
        for k in range(num_procs)
    ]

    for p in procs:
        p.start()  # Each process starts running independently

    for p in procs:
        p.join()  # Blocks until every process has finished

    print('All done!')  # Runs only once all procs are done, due to "p.join"
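To make the "k::num_procs" indexing trick concrete, here is a small sketch that shows which elements each batch receives. The helper name split_strided is hypothetical and only illustrates the slicing used in the answer.

```python
import numpy as np


def split_strided(elements, num_procs):
    """Split along the first axis: batch k gets elements k, k + num_procs, k + 2 * num_procs, ..."""
    return [elements[k::num_procs, ...] for k in range(num_procs)]


if __name__ == "__main__":
    demo = np.arange(10).reshape(10, 1)  # ten tiny "elements", labelled 0..9
    for k, batch in enumerate(split_strided(demo, 3)):
        print(k, batch.ravel().tolist())
    # 0 [0, 3, 6, 9]
    # 1 [1, 4, 7]
    # 2 [2, 5, 8]
```

Every element lands in exactly one batch, and the batch sizes differ by at most one, so the work is spread evenly across the processes.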

Answered 2022-11-21
