3 Answers

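Contributed 1784 experience points · received 9+ likes
Trying again, since I only just noticed the bounty ;)
Basically, I think the error message means what it says: multiprocessing shared-memory Arrays can't be passed as arguments (by pickling). Serializing the data makes no sense, because the whole point is that the data lives in shared memory. So you have to make the shared array global. I think it's neater to make it an attribute of a module, as in my first answer, but leaving it as a global variable as in your example also works well. Taking on board your point about not wanting to set the data before the fork, here is a modified example. If you want more than one possible shared array (and that's why you wanted to pass toShare as an argument), you could similarly create a global list of shared arrays and just pass an index to count_it (whose loop would become for c in toShare[i]:).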
from multiprocessing import Pool, Array

def count_it(key):
    # toShare is a global that the workers inherit through fork
    count = 0
    for c in toShare:
        if c == key:
            count += 1
    return count

if __name__ == '__main__':
    # allocate shared array - want lock=False in this case since we
    # aren't writing to it and want to allow multiple processes to access
    # at the same time - I think with lock=True there would be little or
    # no speedup
    maxLength = 50
    toShare = Array('c', maxLength, lock=False)
    # fork
    pool = Pool()
    # can set data after fork
    # (bytes literals, so the same code runs on Python 2 and Python 3)
    testData = b"abcabcs bsdfsdf gdfg dffdgdfg sdfsdfsd sdfdsfsdf"
    if len(testData) > maxLength:
        raise ValueError("Shared array too small to hold data")
    toShare[:len(testData)] = testData
    print(pool.map(count_it, [b"a", b"b", b"s", b"d"]))
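As an aside, the pickling restriction described at the top of this answer can be observed directly. The following is only a small sketch for Python 3, not part of the original answer: pickling_error is a hypothetical helper name, and using ForkingPickler to stand in for what Pool does with task arguments is my reading of the library, not something the answer states.

```python
import pickle
from multiprocessing import Array
from multiprocessing.reduction import ForkingPickler

def pickling_error(dump, obj):
    """Return the RuntimeError message raised by dump(obj), or None."""
    try:
        dump(obj)
    except RuntimeError as exc:
        return str(exc)
    return None

locked = Array('c', 10)            # default lock=True: synchronized wrapper
raw = Array('c', 10, lock=False)   # lock=False: bare ctypes array in shared memory

# The synchronized wrapper refuses ordinary pickling outside process start-up.
print(pickling_error(pickle.dumps, locked))
# multiprocessing's own pickler refuses the raw shared array as well.
print(pickling_error(ForkingPickler.dumps, raw))
```

Both calls should report an error rather than produce a serialized copy, which is why the array is reached through a global instead of being passed to count_it.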
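Edit: The fork-based example above (the one that reads the global toShare) doesn't work on Windows, because Windows doesn't use fork. The version below does work on Windows while still using Pool, so I think it is the closest to what you want: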
from multiprocessing import Pool, Array
import mymodule  # any importable module (an empty mymodule.py works)

def count_it(key):
    count = 0
    for c in mymodule.toShare:
        if c == key:
            count += 1
    return count

def initProcess(share):
    # runs once in every worker: store the shared array on the module
    mymodule.toShare = share

if __name__ == '__main__':
    # allocate shared array - want lock=False in this case since we
    # aren't writing to it and want to allow multiple processes to access
    # at the same time - I think with lock=True there would be little or
    # no speedup
    maxLength = 50
    toShare = Array('c', maxLength, lock=False)
    # fork
    pool = Pool(initializer=initProcess, initargs=(toShare,))
    # can set data after fork
    # (bytes literals, so the same code runs on Python 2 and Python 3)
    testData = b"abcabcs bsdfsdf gdfg dffdgdfg sdfsdfsd sdfdsfsdf"
    if len(testData) > maxLength:
        raise ValueError("Shared array too small to hold data")
    toShare[:len(testData)] = testData
    print(pool.map(count_it, [b"a", b"b", b"s", b"d"]))
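I'm not sure why map won't pickle the array while Process and Pool will. I think perhaps it has to be transferred at the point of subprocess initialization on Windows. Note that the data is still set after the fork, though.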
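For the multi-array case mentioned earlier in this answer (a global list of shared arrays, with an index passed to the counting function), a rough sketch could look like the following. It is written for Python 3, and the names shared_arrays, init_worker, count_in and make_shared are made up for illustration. Unlike the examples above, it fills the arrays before creating the pool, purely to keep the sketch short.

```python
from multiprocessing import Pool, Array

# Hypothetical module-level holder; each worker fills it in via init_worker.
shared_arrays = []

def init_worker(arrays):
    """Pool initializer: remember the shared arrays in this process."""
    global shared_arrays
    shared_arrays = arrays

def count_in(task):
    """Count occurrences of key in the shared array selected by index."""
    index, key = task
    return sum(1 for c in shared_arrays[index] if c == key)

def make_shared(data):
    """Copy a bytes object into a new lock-free shared char array."""
    arr = Array('c', len(data), lock=False)
    arr[:] = data
    return arr

if __name__ == '__main__':
    arrays = [make_shared(b"abcabc"), make_shared(b"sssd")]
    # The list is handed over at worker start-up, not as a task argument.
    with Pool(2, initializer=init_worker, initargs=(arrays,)) as pool:
        tasks = [(0, b"a"), (0, b"c"), (1, b"s"), (1, b"d")]
        print(pool.map(count_in, tasks))
```

Only the small (index, key) tuples travel through map, so nothing that lives in shared memory has to be pickled per task.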

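Contributed 1895 experience points · received 3+ likes
If your data is read-only, just make it a variable in a module before the fork from Pool. Then all the child processes should be able to access it, and it won't be copied as long as you don't write to it.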
from multiprocessing import Pool
import myglobals  # anything (empty .py file)

myglobals.data = []

def count_it(key):
    count = 0
    for c in myglobals.data:
        if c == key:
            count += 1
    return count

if __name__ == '__main__':
    # set the data before the pool forks so the workers inherit it
    myglobals.data = "abcabcs bsdfsdf gdfg dffdgdfg sdfsdfsd sdfdsfsdf"
    pool = Pool()
    print(pool.map(count_it, ["a", "b", "s", "d"]))
If you do want to try using Array, though, you could try it with the lock=False keyword argument (it is True by default).
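To make the difference concrete, here is a small single-process sketch (Python 3, not from the original answer) of what Array returns with and without a lock. The variable names are arbitrary.

```python
import ctypes
from multiprocessing import Array

locked = Array('i', [1, 2, 3])            # lock=True is the default
raw = Array('i', [1, 2, 3], lock=False)

# With a lock, the result is a wrapper exposing get_lock() and get_obj().
with locked.get_lock():
    locked[0] += 10

# Without one, the result is the bare ctypes array, with no locking on access.
raw[0] += 10

print(type(locked).__name__, hasattr(locked, "get_lock"))
print(isinstance(raw, ctypes.Array), hasattr(raw, "get_lock"))
print(locked[:], raw[:])
```

For data that the workers only read, the lock-free form avoids taking a lock on every element access.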