2 Answers

Contributed 1809 experience points · Received over 8 likes
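This depends heavily on the data. It looks like you are trying to find an efficient way to return the first index of some value in an array. There is no efficient way to do that in numpy, because its vectorized operations always scan the whole array and cannot stop at the first match, but you can use numba instead to outperform numpy.
If you only need to sum a small part of the list, numpy is a good choice: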
zero_idx = np.where(data3==0)[0]
max_loc = np.searchsorted(zero_idx, np.argmax(data3))
start, end = zero_idx[max_loc - 1], zero_idx[max_loc]
total_sum = np.sum(data3[start:end])
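Note that the snippet above assumes there is a zero on both sides of the maximum. A minimal sketch of the same idea with that assumption relaxed might look like the following; the function name sum_between_zeros and the sample list are made up for illustration, and falling back to the array edges is just one possible choice:

```python
import numpy as np

def sum_between_zeros(data):
    """Sum the values around the maximum, bounded by the nearest zeros."""
    data = np.asarray(data)
    zero_idx = np.flatnonzero(data == 0)    # positions of all zeros
    peak = int(np.argmax(data))             # position of the first maximum
    pos = np.searchsorted(zero_idx, peak)   # number of zeros before the peak
    # Assumption: fall back to the array edges when one side has no zero.
    start = zero_idx[pos - 1] if pos > 0 else 0
    end = zero_idx[pos] if pos < len(zero_idx) else len(data)
    return int(data[start:end].sum())

sample = [0, 1, 2, 0, 3, 9, 4, 0, 5, 0]     # hypothetical data
total = sum_between_zeros(sample)
print(total)
```

Without the two guards, an index of max_loc - 1 equal to -1 would silently wrap around to the last zero in the array.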
Otherwise, use Python's list index method (or numba):
k = np.argmax(data3)
left_list = data3[k:].tolist()
right_list = data3[k::-1].tolist()
s1 = np.sum(data3[k: k + left_list.index(0)])
s2 = np.sum(data3[k - right_list.index(0): k])
total_sum = s1 + s2
Benchmarks. Timing both with the %timeit magic in a Jupyter Notebook, I found the first method to be about 20 times faster:
512 μs ± 34.5 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10.2 ms ± 146 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
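If you are not in a notebook, a comparison along these lines can be run with the standard timeit module. This is only a sketch: the wrapper names zeros_method and index_method and the test array are my own, and the timings will depend on your machine and on where the zeros sit in your data, so I am not quoting numbers for it.

```python
import timeit
import numpy as np

# Hypothetical test data: a ramp up to a peak and back down, zero at both ends.
data3 = np.concatenate([np.arange(0, 1000), np.arange(1000, -1, -1)])

def zeros_method(data):
    # First approach: locate every zero, then pick the pair around the peak.
    zero_idx = np.where(data == 0)[0]
    max_loc = np.searchsorted(zero_idx, np.argmax(data))
    start, end = zero_idx[max_loc - 1], zero_idx[max_loc]
    return int(np.sum(data[start:end]))

def index_method(data):
    # Second approach: list.index stops at the first zero on each side.
    k = int(np.argmax(data))
    left_list = data[k:].tolist()
    right_list = data[k::-1].tolist()
    s1 = np.sum(data[k: k + left_list.index(0)])
    s2 = np.sum(data[k - right_list.index(0): k])
    return int(s1 + s2)

# Time 200 calls of each function; timeit returns the total in seconds.
t_zeros = timeit.timeit(lambda: zeros_method(data3), number=200)
t_index = timeit.timeit(lambda: index_method(data3), number=200)
print(f"zeros_method: {t_zeros:.4f}s, index_method: {t_index:.4f}s")
```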

Contributed 1842 experience points · Received over 13 likes
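You can use a mask instead of a loop.
For data like this, where the positive values form a single run around the maximum, the masks [data3[max_index:] > 0] and [data3[:max_index] > 0] are equivalent to the slices [max_index:(max_index+j)] and [(max_index-k):max_index], except that you don't have to bother finding j and k.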
from contextlib import contextmanager
import numpy as np
import time
@contextmanager
def time_this_scope(name):
    """Handy context manager to time a portion of code."""
    t0 = time.perf_counter()
    yield
    print(f"{name} took {time.perf_counter() - t0}s.")
# Preparing the data.
data1 = [x for x in range(0, 100000, 1)]
data2 = [x for x in range(100000, -1, -1)]
data3 = data1 + data2
max_index = np.where(data3 == np.amax(data3))[0][0]
# Comparing the performance of both methods.
with time_this_scope("method 1"):
    j = 1
    k = 0
    while data3[max_index + j] > 0:
        j += 1
    while data3[max_index - k] > 0:
        k += 1
    summ1 = np.sum(data3[max_index:(max_index+j)])
    summ2 = np.sum(data3[(max_index-k):max_index])
    total_m1 = summ1 + summ2
with time_this_scope("method 2"):
    data3 = np.array(data3)
    summ1 = np.sum(data3[max_index:][data3[max_index:] > 0])
    summ2 = np.sum(data3[:max_index][data3[:max_index] > 0])
    total_m2 = summ1 + summ2
# Checking they do yield the same result.
print(total_m1 == total_m2)
>>> method 1 took 0.08157979999998588s.
>>> method 2 took 0.011274500000013177s.
>>> True
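One caveat about the mask approach above: it adds up every positive value on each side of the maximum, not only the run that touches it. The two methods agree on the test data because there is a single positive run. A small sketch of the difference, with hypothetical helper names (masked_sum, contiguous_sum) and made-up arrays:

```python
import numpy as np

def masked_sum(data, max_index):
    """Mask approach: sum all positive values on both sides of max_index."""
    data = np.asarray(data)
    right, left = data[max_index:], data[:max_index]
    return int(right[right > 0].sum() + left[left > 0].sum())

def contiguous_sum(data, max_index):
    """Sum only the run of positive values that contains max_index."""
    data = np.asarray(data)
    nonpos = np.flatnonzero(data <= 0)          # positions that end a run
    pos = np.searchsorted(nonpos, max_index)
    start = nonpos[pos - 1] + 1 if pos > 0 else 0
    end = nonpos[pos] if pos < len(nonpos) else len(data)
    return int(data[start:end].sum())

single_run = np.array([0, 1, 2, 5, 2, 1, 0])        # one positive run
two_runs = np.array([0, 3, 0, 1, 5, 2, 0, 4, 0])    # extra runs away from the peak

for arr in (single_run, two_runs):
    peak = int(np.argmax(arr))
    print(masked_sum(arr, peak), contiguous_sum(arr, peak))
```

If your data can contain several separate positive runs, the mask version will count the ones that are not adjacent to the maximum as well, so check which behavior you actually want.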