Yes, this is possible. The idea is to use np.repeat to create an array in which each item is repeated a variable number of times. Here is the code:
# The two following lines can be done only once if the indices are constant between iterations (precomputation)
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)
E_int = np.sum(np.repeat(new_vertices, counts, axis=0) - new_vertices[flatten_indices])
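To make the np.repeat trick concrete, here is a small worked sketch on a toy mesh. The three vertices and their neighbor lists are made up for illustration and are not from the original question; the sketch only checks that the vectorized expression matches a plain loop.

```python
import numpy as np

# Hypothetical toy data: 3 vertices, each with its own list of neighbor indices.
vertices = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 2.0, 0.0]])
vertex_neighbors = [np.array([1, 2]), np.array([0]), np.array([0, 1])]

# Number of neighbors per vertex, and all neighbor indices laid end to end.
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)

# Row j of `vertices` is repeated counts[j] times, so each repeated row
# lines up with exactly one entry of flatten_indices.
repeated = np.repeat(vertices, counts, axis=0)
vectorized = np.sum(repeated - vertices[flatten_indices])

# Reference: the straightforward per-vertex loop.
looped = sum(np.sum(vertices[j] - vertices[vertex_neighbors[j]])
             for j in range(len(vertices)))

print(repeated.shape, vectorized, looped)
```

Both computations sum the same set of differences, only in a different order, so they should agree up to floating-point rounding.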
Here is a benchmark:
import numpy as np
from time import time

# Random test data: n vertices, each with 1 to 9 random neighbors
n = 32768
vertices = np.random.rand(n, 3)
indices = []
count = np.random.randint(1, 10, size=n)
for i in range(n):
    indices.append(np.random.randint(0, n, size=count[i]))

def initial_version(vertices, vertex_neighbors):
    # Reference implementation: one NumPy call per vertex
    sommes = []
    for j in range(vertices.shape[0]):
        terme = vertices[j] - vertices[vertex_neighbors[j]]
        somme_j = np.sum(terme)
        sommes.append(somme_j)
    return np.sum(sommes)

def optimized_version(vertices, vertex_neighbors):
    # The two following lines can be precomputed
    counts = np.array([len(e) for e in vertex_neighbors])
    flatten_indices = np.concatenate(vertex_neighbors)
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

def more_optimized_version(vertices, vertex_neighbors, counts, flatten_indices):
    # Same as optimized_version, with counts and flatten_indices passed in
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

timesteps = 20

a = time()
for t in range(timesteps):
    res = initial_version(vertices, indices)
b = time()
print("V1: time:", b - a)
print("V1: result", res)

a = time()
for t in range(timesteps):
    res = optimized_version(vertices, indices)
b = time()
print("V2: time:", b - a)
print("V2: result", res)

a = time()
# Precomputation, done once outside the loop
counts = np.array([len(e) for e in indices])
flatten_indices = np.concatenate(indices)
for t in range(timesteps):
    res = more_optimized_version(vertices, indices, counts, flatten_indices)
b = time()
print("V3: time:", b - a)
print("V3: result", res)
Here are the benchmark results on my machine:
V1: time: 3.656714916229248
V1: result -395.8416223057596
V2: time: 0.19800186157226562
V2: result -395.8416223057595
V3: time: 0.07983255386352539
V3: result -395.8416223057595
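As you can see, the optimized version is about 18 times faster than the reference implementation, and the version with precomputed indices is about 46 times faster.
Note that the optimized version should require more RAM (especially if the number of neighbors per vertex is large), because it materializes one row per (vertex, neighbor) pair.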
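As a rough illustration of the memory cost, the sketch below measures the temporaries that the vectorized expression builds for data shaped like the benchmark's. The seed and variable names are assumptions made for this example; the actual peak usage on a given machine may differ.

```python
import numpy as np

# Assumed setup mirroring the benchmark: n vertices, 1 to 9 neighbors each.
rng = np.random.default_rng(0)
n = 32768
vertices = rng.random((n, 3))
counts = rng.integers(1, 10, size=n)
total_neighbors = int(counts.sum())
flatten_indices = rng.integers(0, n, size=total_neighbors)

# The two temporaries created before the subtraction. Each has one row
# per (vertex, neighbor) pair instead of one row per vertex.
repeated = np.repeat(vertices, counts, axis=0)
gathered = vertices[flatten_indices]

# The subtraction allocates a third array of the same size.
bytes_per_temp = repeated.nbytes
print("input array:     ", vertices.nbytes, "bytes")
print("each temporary:  ", bytes_per_temp, "bytes")
print("three temporaries:", 3 * bytes_per_temp, "bytes")
```

Each temporary grows with the total number of neighbor entries rather than with the number of vertices, which is why dense neighborhoods make the vectorized version more memory-hungry than the loop.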