1 Answer

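Growing a Python list dynamically is much faster than growing a NumPy array dynamically (NumPy arrays are the underlying data structure of a pandas DataFrame). Appending to a list is amortized constant time, whereas appending to a NumPy array allocates a new array and copies the existing data every time. See here for a brief explanation. With that in mind: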
import pandas as pd
# Initialize input dataframe
raw_dataset = pd.DataFrame({
    'ID': ['a121', 'b142', 'cd3'],
    'start_date': ['2019-10-09', '2017-02-06', '2012-12-05'],
    'end_date': ['2020-01-30', '2019-08-23', '2016-06-18'],
})
# Create integer columns for start year and end year
raw_dataset['start_year'] = pd.to_datetime(raw_dataset['start_date']).dt.year
raw_dataset['end_year'] = pd.to_datetime(raw_dataset['end_date']).dt.year
# Iterate over input dataframe rows and individual years
id_list = []
active_years_list = []
for row in raw_dataset.itertuples():
    # Emit one (ID, year) pair for every year in the inclusive range
    for year in range(row.start_year, row.end_year + 1):
        id_list.append(row.ID)
        active_years_list.append(year)
# Create result dataframe from lists
desired_df = pd.DataFrame({
    'id': id_list,
    'active_years': active_years_list,
})
print(desired_df)
# Output:
#      id  active_years
# 0  a121          2019
# 1  a121          2020
# 2  b142          2017
# 3  b142          2018
# 4  b142          2019
# 5   cd3          2012
# 6   cd3          2013
# 7   cd3          2014
# 8   cd3          2015
# 9   cd3          2016
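To see the speed difference mentioned at the top for yourself, here is a rough timing sketch. The helper names grow_list and grow_array and the size n are made up for illustration, and the exact timings will depend on your machine and library versions, so treat the numbers as indicative only.

```python
import timeit

import numpy as np


def grow_list(n):
    """Build a sequence by appending to a Python list (amortized O(1) per append)."""
    out = []
    for i in range(n):
        out.append(i)
    return out


def grow_array(n):
    """Build the same sequence with np.append, which returns a new copy each call."""
    out = np.empty(0, dtype=np.int64)
    for i in range(n):
        out = np.append(out, i)  # allocates a new array and copies the old data
    return out


n = 10_000  # illustrative size, chosen arbitrarily
list_time = timeit.timeit(lambda: grow_list(n), number=3)
array_time = timeit.timeit(lambda: grow_array(n), number=3)
print(f"list.append: {list_time:.4f} s")
print(f"np.append:   {array_time:.4f} s")
```

Both helpers produce the same values; only the cost of growing differs. That is why the answer above collects the rows in plain lists and builds the DataFrame once at the end.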