I am trying to count the occurrences of value pairs. When I run the code below, the numpy version (pairs_frequency2) is more than 50% slower than the version that relies on collections.Counter, and the gap gets worse as the number of points grows. Can someone explain why? Is there a numpy rewrite that would perform better? Thanks in advance.

    import numpy as np
    from collections import Counter

    def pairs_frequency(x, y):
        counts = Counter(zip(x, y))
        res = np.array([[f, a, b] for ((a, b), f) in counts.items()])
        return res[:, 0], res[:, 1], res[:, 2]

    def pairs_frequency2(x, y):
        unique, counts = np.unique(np.column_stack((x, y)), axis=0, return_counts=True)
        return counts, unique[:, 0], unique[:, 1]

    x = np.random.randint(low=1, high=11, size=50000)
    y = x + np.random.randint(1, 5, size=x.size)

    %timeit pairs_frequency(x, y)
    %timeit pairs_frequency2(x, y)
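One possible direction, offered as a sketch rather than a definitive answer: np.unique with axis=0 has to sort the rows, while Counter hashes each pair, which is likely a large part of the gap. If x and y are non-empty arrays of small non-negative integers (true for the sample data above, but an assumption in general), each pair can be packed into a single integer key and counted with np.bincount, which avoids the sort. The name pairs_frequency3 is hypothetical and not part of the original code.

```python
import numpy as np

def pairs_frequency3(x, y):
    # Hypothetical helper. Assumes x and y are non-empty, equal-length arrays
    # of small non-negative integers, so that x * width + y neither overflows
    # nor produces an unreasonably large bincount array.
    x = np.asarray(x)
    y = np.asarray(y)
    width = int(y.max()) + 1           # every y value is strictly below width
    keys = x * width + y               # one unique integer per (x, y) pair
    counts = np.bincount(keys)         # counts[k] = occurrences of key k
    present = np.flatnonzero(counts)   # keys that actually occur, ascending
    # Decode each key back into its (x, y) pair.
    return counts[present], present // width, present % width
```

The results come back in the same order as pairs_frequency2 (sorted by x, then y), so the two can be compared directly. Whether this is actually faster on a given machine and data size is worth checking with %timeit, and for large or negative values the np.unique or Counter versions remain the safer choice.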