
How can I speed up a "for" loop when using list indices? (Python)

RISEBY 2023-12-29 16:47:43
I am trying to use NumPy functions or vectorization instead of a for loop to speed up this code:

sommes = []
for j in range(vertices.shape[0]):
    terme = new_vertices[j] - new_vertices[vertex_neighbors[j]]
    somme_j = np.sum(terme)
    sommes.append(somme_j)
E_int = np.sum(sommes)

(It is part of an iterative algorithm and there are many vertices, so I think the for loop takes too long.)

For example, to compute "terme" for j = 0, I have:

In: new_vertices[0]
Out: array([ 10.2533888 , -42.32279717,  68.27230793])

In: vertex_neighbors[0]
Out: [1280, 2, 1511, 511, 1727, 1887, 759, 509, 1023]

In: new_vertices[vertex_neighbors[0]]
Out: array([[ 10.47121043, -42.00123956,  68.218715  ],
            [ 10.2533888 , -43.26905874,  62.59473849],
            [ 10.69773735, -41.26464083,  68.09594854],
            [ 10.37030712, -42.16729601,  68.24639107],
            [ 10.12158146, -42.46624547,  68.29621598],
            [  9.81850836, -42.71158695,  68.33710623],
            [  9.97615447, -42.59625943,  68.31788497],
            [ 10.37030712, -43.11676015,  62.54960623],
            [ 10.55512696, -41.82622703,  68.18954624]])

In: new_vertices[0] - new_vertices[vertex_neighbors[0]]
Out: array([[-0.21782162, -0.32155761,  0.05359293],
            [ 0.        ,  0.94626157,  5.67756944],
            [-0.44434855, -1.05815634,  0.17635939],
            [-0.11691832, -0.15550116,  0.02591686],
            [ 0.13180734,  0.1434483 , -0.02390805],
            [ 0.43488044,  0.38878979, -0.0647983 ],
            [ 0.27723434,  0.27346227, -0.04557704],
            [-0.11691832,  0.79396298,  5.7227017 ],
            [-0.30173816, -0.49657014,  0.08276169]])

The problem is that new_vertices[vertex_neighbors[j]] does not always have the same size. For example, for j = 7:

In: new_vertices[7]
Out: array([ 10.74106112, -63.88592276, -70.15593947])

In: vertex_neighbors[7]
Out: [1546, 655, 306, 1879, 920, 925]

Is this possible without a for loop? I have run out of ideas, so any help would be greatly appreciated!

1 answer

holdtom


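Yes, this is possible. The idea is to use np.repeat to build an array in which each row of new_vertices is repeated a variable number of times (once per neighbor), so that it lines up with the flattened list of neighbor indices. Here is the code: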


# The two following lines can be done only once if the indices are constant between iterations (precomputation)
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)

E_int = np.sum(np.repeat(new_vertices, counts, axis=0) - new_vertices[flatten_indices])
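To make the row alignment concrete, here is a minimal sketch on made-up toy data (the three vertices and their neighbor lists below are hypothetical, not taken from the question). It checks that the np.repeat expression gives the same total as the original loop.

```python
import numpy as np

# Hypothetical toy data: 3 vertices with 2, 1 and 3 neighbors respectively.
new_vertices = np.array([[0.0, 0.0, 0.0],
                         [1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])
vertex_neighbors = [[1, 2], [0], [0, 1, 1]]

counts = np.array([len(e) for e in vertex_neighbors])   # [2, 1, 3]
flatten_indices = np.concatenate(vertex_neighbors)      # [1, 2, 0, 0, 1, 1]

# Row j of new_vertices is repeated counts[j] times, so row i of `repeated`
# belongs to the vertex whose neighbor is stored at flatten_indices[i].
repeated = np.repeat(new_vertices, counts, axis=0)      # shape (6, 3)
E_vectorized = np.sum(repeated - new_vertices[flatten_indices])

# Reference: the loop from the question.
E_loop = sum(np.sum(new_vertices[j] - new_vertices[vertex_neighbors[j]])
             for j in range(new_vertices.shape[0]))

assert np.isclose(E_vectorized, E_loop)
```

Because the ragged neighbor lists are flattened into one index array, the subtraction runs as a single NumPy operation regardless of how many neighbors each vertex has.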

Here is a benchmark:


import numpy as np
from time import time

n = 32768
vertices = np.random.rand(n, 3)
indices = []

# Each vertex gets between 1 and 9 randomly chosen neighbors
count = np.random.randint(1, 10, size=n)

for i in range(n):
    indices.append(np.random.randint(0, n, size=count[i]))


def initial_version(vertices, vertex_neighbors):
    # Reference implementation: the loop from the question
    sommes = []
    for j in range(vertices.shape[0]):
        terme = vertices[j] - vertices[vertex_neighbors[j]]
        somme_j = np.sum(terme)
        sommes.append(somme_j)
    return np.sum(sommes)


def optimized_version(vertices, vertex_neighbors):
    # The two following lines can be precomputed
    # (use the vertex_neighbors argument, not the global `indices`)
    counts = np.array([len(e) for e in vertex_neighbors])
    flatten_indices = np.concatenate(vertex_neighbors)

    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])


def more_optimized_version(vertices, vertex_neighbors, counts, flatten_indices):
    # counts and flatten_indices are precomputed by the caller;
    # vertex_neighbors is unused and kept only so the signatures match
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])


timesteps = 20

a = time()
for t in range(timesteps):
    res = initial_version(vertices, indices)
b = time()
print("V1: time:", b - a)
print("V1: result", res)

a = time()
for t in range(timesteps):
    res = optimized_version(vertices, indices)
b = time()
print("V2: time:", b - a)
print("V2: result", res)

a = time()
counts = np.array([len(e) for e in indices])
flatten_indices = np.concatenate(indices)
for t in range(timesteps):
    res = more_optimized_version(vertices, indices, counts, flatten_indices)
b = time()
print("V3: time:", b - a)
print("V3: result", res)

Here are the benchmark results on my machine:


V1: time: 3.656714916229248
V1: result -395.8416223057596
V2: time: 0.19800186157226562
V2: result -395.8416223057595
V3: time: 0.07983255386352539
V3: result -395.8416223057595

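As you can see, the optimized version (V2) is about 18 times faster than the reference implementation, and the version with precomputed indices (V3) is about 46 times faster. The results differ only in the last digit, which is floating-point rounding caused by the different summation order.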


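Note that the optimized version should require more RAM, especially if the number of neighbors per vertex is large: np.repeat and the fancy-indexing expression each create a temporary array with one row per (vertex, neighbor) pair.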
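If the extra memory is a concern, one possible alternative (a different technique from the answer's, shown only as a sketch) is to rearrange the sum algebraically. Since only the grand total is needed, each vertex contributes its row sum once per neighbor it owns and is subtracted once per time it appears as someone else's neighbor. The function name e_int_low_memory and the random test data are assumptions made for illustration. This shortcut applies only because E_int is a plain sum of differences; it would not carry over to a nonlinear term such as squared distances.

```python
import numpy as np

def e_int_low_memory(vertices, counts, flatten_indices):
    """Total of (vertices[j] - vertices[k]) over all (vertex j, neighbor k) pairs,
    computed without building an array with one row per pair."""
    row_sums = vertices.sum(axis=1)  # sum of the coordinates of each vertex
    # Number of times each vertex appears as a neighbor of some vertex.
    occurrences = np.bincount(flatten_indices, minlength=vertices.shape[0])
    # Vertex j is added counts[j] times and subtracted occurrences[j] times.
    return np.dot(counts - occurrences, row_sums)

# Hypothetical random test data, mirroring the benchmark setup at a smaller size.
rng = np.random.default_rng(0)
n = 1000
vertices = rng.random((n, 3))
neighbors = [rng.integers(0, n, size=k) for k in rng.integers(1, 10, size=n)]
counts = np.array([len(e) for e in neighbors])
flatten_indices = np.concatenate(neighbors)

# Compare against the np.repeat formulation from the answer.
reference = np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])
assert np.isclose(e_int_low_memory(vertices, counts, flatten_indices), reference)
```

Here the only temporaries have one entry per vertex, so memory use no longer grows with the total number of neighbors.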


2023-12-29