The call combined = future.result() blocks until the result is ready, so you never submit the next request to the pool before the previous one has finished. In other words, you never have more than one subprocess running at a time. At a minimum, you should change your code to:
# assumes the imports already in the question: ProcessPoolExecutor, reduce, add, Counter, tqdm
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = []
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures.append(future)  # save the future
    results = [f.result() for f in the_futures]  # all the results, in submission order
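For reference, here is a self-contained sketch of the same submit-then-collect pattern; the small item2item and sample_list values are made-up stand-ins for the data in the question:

from collections import Counter
from concurrent.futures import ProcessPoolExecutor
from functools import reduce
from operator import add

item2item = {"a": Counter(x=1), "b": Counter(x=2, y=1), "c": Counter(y=3)}  # made-up data
sample_list = [["a", "b"], ["b", "c"], ["a", "c"]]                          # made-up data

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        # submit every request first so the workers can run concurrently ...
        the_futures = [pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
                       for samples in sample_list]
        # ... and only then block on the results
        results = [f.result() for f in the_futures]
    print(results)

Because all the submit calls happen before the first result() call, the pool can keep several worker processes busy at once.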
Another way:
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = []
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures.append(future)  # save the future
    # you need: from concurrent.futures import as_completed
    for future in as_completed(the_futures):  # not necessarily in submission order
        result = future.result()  # do something with this result
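A quick, self-contained illustration of that ordering point, using a hypothetical sleep-based task in place of the question's reduce call:

import time
from concurrent.futures import ProcessPoolExecutor, as_completed

def slow_square(n):
    time.sleep(n)  # larger inputs finish later
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(slow_square, n) for n in (3, 1, 2)]
        for future in as_completed(futures):
            print(future.result())  # prints 1, 4, 9: completion order, not submission order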
Also, if you do not specify max_workers when constructing the ProcessPoolExecutor, it defaults to the number of processors on your machine; there is nothing to be gained by specifying a value greater than the number of processors you actually have.
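A quick way to check that default on your machine (on current CPython versions the executor consults os.cpu_count(), or a close equivalent, when max_workers is omitted):

import os
from concurrent.futures import ProcessPoolExecutor

print(os.cpu_count())         # processors visible on this machine
pool = ProcessPoolExecutor()  # with max_workers omitted, roughly this many workers are used
pool.shutdown()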
Update
If you want to process each result as soon as it completes and need a way to tie a result back to its original request, you can store the futures as keys in a dictionary whose values are the arguments of the corresponding request. In that case:
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = {}
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures[future] = samples  # map the future to its request
    # you need: from concurrent.futures import as_completed
    for future in as_completed(the_futures):  # not necessarily in submission order
        samples = the_futures[future]  # the request
        result = future.result()  # the result
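As a self-contained sketch of that future-to-request mapping (again with a hypothetical task in place of the reduce call):

from concurrent.futures import ProcessPoolExecutor, as_completed

def total(numbers):
    return sum(numbers)

if __name__ == "__main__":
    requests = [[1, 2, 3], [10, 20], [7]]  # made-up request arguments
    with ProcessPoolExecutor(max_workers=2) as pool:
        the_futures = {pool.submit(total, numbers): numbers for numbers in requests}
        for future in as_completed(the_futures):
            numbers = the_futures[future]          # the original request
            print(numbers, "->", future.result())  # e.g. [1, 2, 3] -> 6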