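I have defined this function:

    import random

    def writeonfiles(a, seed):
        random.seed(seed)
        f = open(a, "w+")
        for i in range(0, 10):
            j = random.randint(0, 10)
            # print j
            f.write(str(j))  # write() expects a string, not an int
        f.close()

where a is a string containing the file path and seed is an integer seed. I want to parallelize a simple program so that each core takes one of the paths I supply, seeds its random generator, and writes some random numbers to that file. For example, if I pass the vector

    vector = ["Test/file1.txt", "Test/file2.txt"]

and the seeds

    seeds = (123412, 989898)

the first available core should run

    writeonfiles("Test/file1.txt", 123412)

and the second should run the same function with different arguments:

    writeonfiles("Test/file2.txt", 989898)

I have looked through many similar questions on Stack Overflow, but I could not get any of the solutions to work. What I tried is: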

    def writeonfiles_unpack(args):
        return writeonfiles(*args)

    if __name__ == "__main__":
        folder = ["Test/%d.csv" % i for i in range(0, 4)]
        seed = [234124, 663123, 12345, 123833]
        p = multiprocessing.Pool()
        p.map(writeonfiles, (folder, seed))

which gives me TypeError: writeonfiles() takes exactly 2 arguments (1 given).
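
For context, that TypeError follows from how map hands out work: each element of the iterable becomes the single argument of one call. The sketch below uses the built-in map as a stand-in for Pool.map (an assumption made only to keep the example free of worker processes); describe is a hypothetical helper that returns whatever it receives.

```python
folder = ["Test/%d.csv" % i for i in range(0, 4)]
seed = [234124, 663123, 12345, 123833]


def describe(task):
    """Hypothetical helper: return the single argument map passed in."""
    return task


# (folder, seed) is a tuple with two elements, so map makes two calls,
# each receiving one whole list as its only argument.
wrong = list(map(describe, (folder, seed)))

# zip pairs the lists element by element, so map makes four calls,
# each receiving one (path, seed) tuple that still has to be unpacked.
right = list(map(describe, zip(folder, seed)))

print(len(wrong), len(right))
print(right[0])
```

Because each task arrives as one tuple, the target passed to map has to be a one-argument function such as writeonfiles_unpack, not writeonfiles itself.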
I also tried

    if __name__ == "__main__":
        folder = ["Test/%d.csv" % i for i in range(0, 4)]
        seed = [234124, 663123, 12345, 123833]
        p = multiprocessing.Process(target=writeonfiles, args=[folder, seed])
        p.start()

but this gives me

    File "/usr/lib/python2.7/random.py", line 120, in seed
        super(Random, self).seed(a)
    TypeError: unhashable type: 'list'
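
The Process attempt hands both whole lists to a single call, so random.seed receives a list instead of an integer. Below is a minimal sketch of one way around that, assuming it is acceptable to start one process per (path, seed) pair. The helper name run_all and the temporary output directory are illustrative and not part of the original post.

```python
import multiprocessing
import os
import random
import tempfile


def writeonfiles(path, seed):
    """Seed the generator, then write ten random integers, one per line."""
    random.seed(seed)
    with open(path, "w") as f:
        for _ in range(10):
            f.write("%d\n" % random.randint(0, 10))


def run_all(paths, seeds):
    """Hypothetical helper: start one Process per (path, seed) pair."""
    procs = [
        multiprocessing.Process(target=writeonfiles, args=(path, seed))
        for path, seed in zip(paths, seeds)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()  # wait until every file has been written


if __name__ == "__main__":
    # A temporary directory stands in for the Test/ folder of the question.
    out_dir = tempfile.mkdtemp()
    paths = [os.path.join(out_dir, "%d.csv" % i) for i in range(4)]
    seeds = [234124, 663123, 12345, 123833]
    run_all(paths, seeds)
    print(sorted(os.listdir(out_dir)))
```

This starts as many processes as there are files, which is fine for a handful of paths; for long lists a Pool limits the number of workers.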
Finally, I tried a contextmanager

    @contextmanager
    def poolcontext(*args, **kwargs):
        pool = multiprocessing.Pool(*args, **kwargs)
        yield pool
        pool.terminate()

    if __name__ == "__main__":
        folder = ["Test/%d" % i for i in range(0, 4)]
        seed = [234124, 663123, 12345, 123833]
        a = zip(folder, seed)
        with poolcontext(processes=3) as pool:
            results = pool.map(writeonfiles_unpack, a)

and that results in

    File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
        raise self._value
    TypeError: 'module' object is not callable

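Python 2.7 lacks the starmap pool method that Python 3.3+ provides. You can work around this by decorating the target function with a wrapper that unpacks the argument tuple and calls the target function: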

    import os
    from multiprocessing import Pool
    import random
    from functools import wraps


    def unpack(func):
        @wraps(func)
        def wrapper(arg_tuple):
            return func(*arg_tuple)
        return wrapper


    @unpack
    def write_on_files(a, seed):
        random.seed(seed)
        print("%d opening file %s" % (os.getpid(), a))  # simulate
        for _ in range(10):
            j = random.randint(0, 10)
            print("%d writing %d to file %s" % (os.getpid(), j, a))  # simulate


    if __name__ == '__main__':
        folder = ["Test/%d.csv" % i for i in range(0, 4)]
        seed = [234124, 663123, 12345, 123833]
        arguments = zip(folder, seed)
        pool = Pool(4)
        pool.map(write_on_files, iterable=arguments)
        pool.close()
        pool.join()
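
For comparison, on Python 3.3+ the wrapper is not needed, because Pool.starmap unpacks each tuple into positional arguments. The following is a minimal sketch under that assumption. Unlike the simulation above it writes real files, and it uses a temporary directory in place of Test/ plus a return value from write_on_files, both of which are illustrative choices.

```python
import os
import random
import tempfile
from multiprocessing import Pool


def write_on_files(path, seed):
    """Seed the generator, write ten random integers, return the path."""
    random.seed(seed)
    with open(path, "w") as f:
        for _ in range(10):
            f.write("%d\n" % random.randint(0, 10))
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the Test/ folder.
    out_dir = tempfile.mkdtemp()
    folder = [os.path.join(out_dir, "%d.csv" % i) for i in range(0, 4)]
    seed = [234124, 663123, 12345, 123833]

    # starmap calls write_on_files(path, seed) once per zipped pair and
    # blocks until all results are in, so leaving the with block is safe.
    with Pool(4) as pool:
        written = pool.starmap(write_on_files, zip(folder, seed))

    print(written)
```

Since each task reseeds the random module before drawing numbers, a given seed should produce the same file contents regardless of which worker picks it up.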