Using multiprocessing.Pool in Python with a function that returns a custom object

Posted by ioekq8ef on 2023-04-08 in Python

I am using multiprocessing.Pool to speed up a computation: I call one function many times and then collate the results. Here is a snippet of my code:

import multiprocessing
from functools import partial

def Foo(id: int, constant_arg1: str, constant_arg2: str):
    custom_class_obj = CustomClass(constant_arg1, constant_arg2)
    custom_class_obj.run()  # this changes some attributes of custom_class_obj

    if something:
        return None
    else:
        return [custom_class_obj]


def parallel_run(iters: int, a: str, b: str):
    pool = multiprocessing.Pool(processes=k)

    # create the partial function object before passing it to the pool
    partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)

    # create the list of variable ids
    iter_list = list(range(iters))
    all_runs = pool.map(partial_func, iter_list)

    return all_runs

This raises the following error inside the multiprocessing module:

multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'

How can I fix this?

Answer #1 (83qze16e)

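I was able to reproduce the error message with a minimal example of an unpicklable class. The error is saying that an instance of your class cannot be pickled because it holds a reference to a module, and modules are not picklable. You need to comb through CustomClass and make sure its instances do not hold things like open file handles or module references. If you do need those, you should use __getstate__ and __setstate__ to customize the pickling and unpickling process.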
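One possible way to comb through an instance is to try pickling each attribute on its own and collect the ones that fail. The sketch below is only an illustration: find_unpicklable_attrs and Example are hypothetical names, and Example stands in for CustomClass, whose real attributes are not shown in the question.

```python
import os
import pickle


def find_unpicklable_attrs(obj):
    """Return {attribute name: error text} for attributes that fail to pickle."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle can raise several exception types
            problems[name] = f"{type(exc).__name__}: {exc}"
    return problems


class Example:
    """Hypothetical stand-in for CustomClass (not the asker's real class)."""

    def __init__(self):
        self.value = 42
        self.module = os  # a module reference, which cannot be pickled


bad = find_unpicklable_attrs(Example())
print(bad)  # only the "module" attribute should be reported
```

This relies on vars(), so it only inspects attributes stored in the instance __dict__; a class that uses __slots__ would need a different approach.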
An example that reproduces your error:

from multiprocessing import Pool
from functools import partial

class klass:
    def __init__(self, a):
        self.value = a
        import os
        self.module = os  # this fails: a module can't be pickled and sent back to the main process

def foo(a, b, c):
    return klass(a+b+c)

if __name__ == "__main__":
    with Pool() as p:
        a = 1
        b = 2
        bar = partial(foo, a, b)
        res = p.map(bar, range(10))
    print([r.value for r in res])
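If the module reference really is needed, the __getstate__ / __setstate__ suggestion above could look roughly like the following. This is a minimal sketch, assuming the unpicklable attribute can simply be dropped before pickling and re-created afterwards; Klass is a hypothetical variant of the class above. A pickle round trip is used here instead of a Pool because the pool pickles return values the same way when sending them back to the main process.

```python
import os
import pickle


class Klass:
    """Hypothetical picklable variant of the failing class above."""

    def __init__(self, a):
        self.value = a
        self.module = os  # unpicklable attribute

    def __getstate__(self):
        # Copy the instance dict and drop the attribute that cannot be pickled.
        state = self.__dict__.copy()
        del state["module"]
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then re-create the module reference.
        self.__dict__.update(state)
        self.module = os


# Round-trip through pickle, as the pool does when returning results.
obj = pickle.loads(pickle.dumps(Klass(3)))
print(obj.value, obj.module is os)
```

With a class written this way, returning instances from the worker function in p.map should no longer hit the "cannot pickle 'module' object" error, provided no other attribute is unpicklable.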
