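You need to read in the data and convert it to datetime format; I read the data in from the clipboard and parsed the dates there. Next, you need to sort each dataframe by its key (here the key is 'Connect' for df1 and 'Start' for df2). After that, pandas merge_asof is enough. Note that the as-of match can only happen on a single key, not several: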
# Sort the dataframes
df1 = df1.sort_values(['Connect','Ended'])
df2 = df2.sort_values(['Start','End'])
# Merge the dataframes
merger = pd.merge_asof(df1, df2,
                       left_on='Connect',
                       right_on='Start',
                       tolerance=pd.Timedelta('20s'),
                       direction='forward')
merger
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
1 2020-04-01 22:00:05 2020-04-01 12:00:05                 NaT                 NaT
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
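To see why direction='forward' matters here, a small sketch can run the same merge with each of the three directions merge_asof accepts and count how many rows find a partner. The frames left and right are hypothetical stand-ins that reuse only the key values visible in the output above; the full original df2 is not shown, so treat this as an illustration rather than the original data.

```python
import pandas as pd

# Hypothetical stand-ins: only the key columns, using values visible in the output above.
left = pd.DataFrame({"Connect": pd.to_datetime([
    "2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"])})
right = pd.DataFrame({"Start": pd.to_datetime([
    "2020-03-31 11:00:10", "2020-04-06 13:15:21"])})

matches = {}
for direction in ("backward", "forward", "nearest"):
    out = pd.merge_asof(left, right,
                        left_on="Connect", right_on="Start",
                        tolerance=pd.Timedelta("20s"),
                        direction=direction)
    # Rows without a partner inside the tolerance get NaT in 'Start'.
    matches[direction] = int(out["Start"].notna().sum())

print(matches)
```

With these values, 'backward' (the default) only catches the exact match on the last row, because the first 'Start' falls two seconds after its 'Connect' rather than before it.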
From there it should be easy to pick out the matched and unmatched rows:
matched = merger.dropna()
matched
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
unmatched = merger.loc[merger.isna().any(axis=1)]
unmatched
              Connect               Ended               Start                 End
1 2020-04-01 22:00:05 2020-04-01 12:00:05                 NaT                 NaT
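For anyone who wants to run the whole thing without the clipboard step, here is a minimal end-to-end sketch. The two frames are rebuilt from the values visible in the output above, which is an assumption: the original df2 may have contained more rows. The date parsing uses pd.to_datetime instead of the clipboard reader.

```python
import pandas as pd

# Reconstructed from the printed output above; the real df2 may have had more rows.
df1 = pd.DataFrame({
    "Connect": ["2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"],
    "Ended": ["2020-03-31 11:00:10", "2020-04-01 12:00:05", "2020-04-06 14:05:18"],
})
df2 = pd.DataFrame({
    "Start": ["2020-03-31 11:00:10", "2020-04-06 13:15:21"],
    "End": ["2020-03-31 11:00:14", "2020-04-06 14:05:18"],
})

# Parse every column as datetime; merge_asof needs real datetimes, not strings.
df1 = df1.apply(pd.to_datetime)
df2 = df2.apply(pd.to_datetime)

# Both frames must be sorted by their merge key.
df1 = df1.sort_values("Connect")
df2 = df2.sort_values("Start")

# For each 'Connect', take the first 'Start' at or after it, within 20 seconds.
merger = pd.merge_asof(df1, df2,
                       left_on="Connect", right_on="Start",
                       tolerance=pd.Timedelta("20s"),
                       direction="forward")

# Rows with no partner carry NaT in the right-hand columns.
matched = merger.dropna()
unmatched = merger.loc[merger.isna().any(axis=1)]

print(matched)
print(unmatched)
```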
Hopefully that is enough... if you get stuck, the documentation has more examples to guide you.
