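Your logic is fine as coded.
Using freq='S' makes no sense: you would generate as many rows as there are seconds between the start and end dates. After randomising the start times, use the current row and the next row as the lower and upper bounds of the random draw for the end time. This is done as a list comprehension.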
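To put a number on the freq='S' point, here is a rough sketch that counts the rows each range would produce without actually building it. `rows_for` is a hypothetical helper written for this illustration, not part of pandas; it assumes an inclusive range, which is the date_range default.

```python
import pandas as pd


def rows_for(start, end, step_seconds):
    """Hypothetical helper: number of rows an inclusive date_range would hold."""
    span = (pd.Timestamp(end) - pd.Timestamp(start)).total_seconds()
    return int(span // step_seconds) + 1


per_second = rows_for('5/18/2019', '7/22/2020', 1)   # the original per-second range
per_minute = rows_for('5/18/2019', '5/19/2019', 60)  # the one-day, per-minute range
print(per_second)  # 37238401
print(per_minute)  # 1441
```

Tens of millions of rows is a lot for a frame whose timestamps get overwritten with random values anyway, which is why the code below uses a much coarser range.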
Be a bit smarter about getting the UTC seconds at the start and end of the range:
import pandas as pd
import numpy as np
from datetime import datetime
# date_rng = pd.date_range(start='5/18/2019', end='7/22/2020', freq='S')
date_rng = pd.date_range(start='5/18/2019', end='5/19/2019', freq='min')
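# whole UTC seconds since the epoch at both ends of the range;
# cast to int because np.random.randint is documented for integer bounds
sec = [int((date_rng.min() - datetime(1970, 1, 1)).total_seconds()),
       int((date_rng.max() - datetime(1970, 1, 1)).total_seconds())]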
df = pd.DataFrame(date_rng, columns=['start_timestamp'])
df['start_timestamp'] = np.random.randint(sec[0],sec[1],size=(len(date_rng)))
df = df.sort_values(by="start_timestamp")
l = df["start_timestamp"].tolist() # get randomised start times
l[-1] = sec[1] # set last time to end of range
# randomise end time between two start times
df['end_timestamp'] = [np.random.randint(l[i], l[i+1]) if i < len(l)-1 and l[i] < l[i+1] else l[i]
                       for i in range(len(l))]
df['start_timestamp'] = pd.to_datetime(df['start_timestamp'],unit='s')
df['end_timestamp'] = pd.to_datetime(df['end_timestamp'],unit='s')
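If the list comprehension gets slow on larger frames, the same idea can probably be vectorised. The following is only a sketch under a few assumptions: `random_intervals` and its parameters are hypothetical names, it uses NumPy's `default_rng` generator instead of `np.random.randint`, and, unlike the code above, the last row's end time is also drawn at random (up to the end of the range) and an end time may coincide with the next start.

```python
import numpy as np
import pandas as pd


def random_intervals(start, end, n, seed=None):
    """Hypothetical helper: n sorted random start times in [start, end),
    each paired with an end time between it and the next start."""
    rng = np.random.default_rng(seed)
    epoch = pd.Timestamp(0)  # 1970-01-01
    lo = int((pd.Timestamp(start) - epoch).total_seconds())
    hi = int((pd.Timestamp(end) - epoch).total_seconds())

    starts = np.sort(rng.integers(lo, hi, size=n))
    # Upper bound per row: the next start, or the end of the range for the last row.
    upper = np.append(starts[1:], hi)
    # integers() samples from [low, high); the +1 makes the bound inclusive and
    # keeps high > low even when two start times coincide.
    ends = rng.integers(starts, upper + 1)

    return pd.DataFrame({
        'start_timestamp': pd.to_datetime(starts, unit='s'),
        'end_timestamp': pd.to_datetime(ends, unit='s'),
    })


df = random_intervals('5/18/2019', '5/19/2019', n=1441, seed=0)
```

Passing a seed makes the output reproducible, which helps when the generated data feeds a test.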