I have the following code:

    import threading
    import numpy as np

    def task1():
        for url in splitarr[0]:
            print(url)  # these are supposed to be scrape_induvidual_page(). print is just for debugging

    def task2():
        for url in splitarr[1]:
            print(url)

    def task3():
        for url in splitarr[2]:
            print(url)

    def task4():
        for url in splitarr[3]:
            print(url)

    def task5():
        for url in splitarr[4]:
            print(url)

    def task6():
        for url in splitarr[5]:
            print(url)

    def task7():
        for url in splitarr[6]:
            print(url)

    def task8():
        for url in splitarr[7]:
            print(url)

    splitarr = np.array_split(urllist, 8)

    t1 = threading.Thread(target=task1, name='t1')
    t2 = threading.Thread(target=task2, name='t2')
    t3 = threading.Thread(target=task3, name='t3')
    t4 = threading.Thread(target=task4, name='t4')
    t5 = threading.Thread(target=task5, name='t5')
    t6 = threading.Thread(target=task6, name='t6')
    t7 = threading.Thread(target=task7, name='t7')
    t8 = threading.Thread(target=task8, name='t8')

    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t5.start()
    t6.start()
    t7.start()
    t8.start()

    t1.join()
    t2.join()
    t3.join()
    t4.join()
    t5.join()
    t6.join()
    t7.join()
    t8.join()

It does produce the desired output, with no duplicates or anything:

    https://kickasstorrents.to/big-buck-bunny-1080p-h264-aac-5-1-tntvillage-t115783.html
    https://kickasstorrents.to/big-buck-bunny-4k-uhd-hfr-60fps-eng-flac-webdl-2160p-x264-zmachine-t1041079.html
    https://kickasstorrents.to/big-buck-bunny-4k-uhd-hfr-60-fps-flac-webrip-2160p-x265-zmachine-t1041689.html
    https://kickasstorrents.to/big-buck-bunny-2008-720p-bluray-x264-don-no-rars-t11623.html
    https://kickasstorrents.to/tkillaahh-big-buck-bunny-dvd-720p-2lions-team-t87503.html
    https://kickasstorrents.to/big-buck-bunny-2008-720p-bluray-nhd-x264-nhanc3-t127050.html
    https://kickasstorrents.to/big-buck-bunny-2008-brrip-720p-x264-mitzep-t172753.html

However, I feel the code is rather redundant with all the repeated def taskx(): definitions, so I tried to condense it into a single task:

    x = 0

    def task1():
        global x
        for url in splitarr[x]:
            print(url)
            x = x + 1

How do I properly increment x in a multithreaded program?
1 Answer

慕的地10843
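for url in splitarr[x]: creates an iterator for the sequence in splitarr[x]. Incrementing x later doesn't matter; the iterator has already been built. Since there is a print in there, it is likely that all of the threads will grab x while it is still zero and iterate over the same sequence.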
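A minimal single-threaded sketch of that point, assuming a small made-up list named chunks in place of splitarr (the names and data here are illustrative, not from the question). It shows that the expression after "in" is evaluated once, when the loop starts, so changing x inside the loop does not move the loop to a different sequence:

```python
# Hypothetical stand-in for splitarr: two small sequences.
chunks = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]

x = 0
seen = []

# chunks[x] is evaluated once, when the loop starts (x is 0 at that moment).
for item in chunks[x]:
    seen.append(item)
    x = x + 1  # changes x, but the loop keeps iterating chunks[0]

print(seen)  # ['a0', 'a1', 'a2']
print(x)     # 3
```

With several threads, each one runs the same evaluation of splitarr[x] at its own start, which is why they can end up sharing one sequence.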
One solution is to pass an incrementing value to task1 through the args parameter of threading.Thread. But a thread pool is easier.
    from multiprocessing.pool import ThreadPool

    # generate test array
    splitarr = []
    for i in range(8):
        splitarr.append([f"url_{i}_{j}" for j in range(4)])

    def task(splitarr_column):
        for url in splitarr_column:
            print(url)

    with ThreadPool(len(splitarr)) as pool:
        result = pool.map(task, splitarr)
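In this example, len(splitarr) is used to create one thread for each sequence in splitarr. Each sequence is then mapped to the task function. Because we created exactly enough threads to handle all of the sequences, they all run concurrently. When the map completes, the with clause exits and the pool is closed, terminating its threads.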
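For comparison, the other option mentioned in this answer, passing values through the args parameter of threading.Thread, might look roughly like the sketch below. It assumes the same made-up test array as above. The results list and the choice to pass both the index and the sequence itself are illustrative assumptions, and the append is only a placeholder for the real scraping call:

```python
import threading

# Same shape as the test array in the answer (made-up data).
splitarr = [[f"url_{i}_{j}" for j in range(4)] for i in range(8)]

# One slot per thread; each thread writes only to its own slot.
results = [None] * len(splitarr)

def task(index, urls):
    handled = []
    for url in urls:
        handled.append(url)  # placeholder for the real per-page work
    results[index] = handled

# Each thread receives its own index and sequence through args,
# so no shared counter is needed.
threads = [
    threading.Thread(target=task, args=(i, chunk), name=f"t{i + 1}")
    for i, chunk in enumerate(splitarr)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results[0])  # ['url_0_0', 'url_0_1', 'url_0_2', 'url_0_3']
```

Because every thread gets its sequence as an argument when it is created, nothing depends on a global x, and one function replaces the eight task definitions.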