Given the description, I suggest using pd.concat or merge. Here is a test example:
import pandas as pd
#generating test data
index1 = pd.date_range('1/1/2000', periods=9, freq='D')
index2 = pd.date_range('1/4/2000', periods=9, freq='D')
series = range(9)
df1 = pd.DataFrame([index1,series]).T
df2 = pd.DataFrame([index2,series]).T
df1.columns = ['Time','Data']
df2.columns = ['Time','Data']
df1:
                  Time Data
0  2000-01-01 00:00:00    0
1  2000-01-02 00:00:00    1
2  2000-01-03 00:00:00    2
3  2000-01-04 00:00:00    3
4  2000-01-05 00:00:00    4
5  2000-01-06 00:00:00    5
6  2000-01-07 00:00:00    6
7  2000-01-08 00:00:00    7
8  2000-01-09 00:00:00    8
df2:
                  Time Data
0  2000-01-04 00:00:00    0
1  2000-01-05 00:00:00    1
2  2000-01-06 00:00:00    2
3  2000-01-07 00:00:00    3
4  2000-01-08 00:00:00    4
5  2000-01-09 00:00:00    5
6  2000-01-10 00:00:00    6
7  2000-01-11 00:00:00    7
8  2000-01-12 00:00:00    8
Note that the two DataFrames hold data for different date ranges: they overlap only from 2000-01-04 to 2000-01-09.
#convert Time to pandas datetime format
#df1['Time'] = pd.to_datetime(df1['Time']) # <- uncomment this for your case
#df2['Time'] = pd.to_datetime(df2['Time']) # <- uncomment this for your case
#making the time the index of the dataframes
df1.set_index(['Time'],inplace=True)
df2.set_index(['Time'],inplace=True)
#concatenating the dataframe column wise (axis=1)
df3 = pd.concat([df1,df2],axis=1)
print(df3)
Output:
           Data Data
Time
2000-01-01    0  NaN
2000-01-02    1  NaN
2000-01-03    2  NaN
2000-01-04    3    0
2000-01-05    4    1
2000-01-06    5    2
2000-01-07    6    3
2000-01-08    7    4
2000-01-09    8    5
2000-01-10  NaN    6
2000-01-11  NaN    7
2000-01-12  NaN    8
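The answer mentions merge as an alternative but only demonstrates pd.concat. A minimal sketch of the merge route, assuming the same test data and an outer join on Time; the suffixes '_1' and '_2' are hypothetical labels chosen here to tell the two Data columns apart:

```python
import pandas as pd

# Same test data as above, built from dicts so that Time gets a datetime dtype.
df1 = pd.DataFrame({
    'Time': pd.date_range('1/1/2000', periods=9, freq='D'),
    'Data': range(9),
})
df2 = pd.DataFrame({
    'Time': pd.date_range('1/4/2000', periods=9, freq='D'),
    'Data': range(9),
})

# An outer join on Time keeps every date that appears in either frame.
# The suffixes are illustrative names, not part of the original answer.
merged = pd.merge(df1, df2, on='Time', how='outer', suffixes=('_1', '_2'))

# Sort by date and use Time as the index, mirroring the concat result.
merged = merged.sort_values('Time').set_index('Time')
print(merged)
```

Unlike concat with axis=1, merge joins on a column rather than on the index, so set_index is not needed beforehand, and the suffixes avoid ending up with two columns both named Data.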
Handling missing values:
pd.concat aligns the two DataFrames on their index (Time). After combining, NaN marks the dates that are missing from one of them. These can be handled mainly with fillna (replacing NaN with a value) or dropna (dropping the rows that contain NaN). Here is an example of fillna (dropna is called the same way, but without the 0 argument):
#filling 0 in place of NaN. You can also use bfill(), ffill() or interpolate()
df3 = df3.fillna(0) # reassign the result; with inplace=True fillna returns None
#df3 = df3.bfill() # <- uncomment if you want to fill backward instead
#df3 = df3.ffill() # <- uncomment if you want to fill forward instead
Output:
            Data  Data
Time
2000-01-01     0     0
2000-01-02     1     0
2000-01-03     2     0
2000-01-04     3     0
2000-01-05     4     1
2000-01-06     5     2
2000-01-07     6     3
2000-01-08     7     4
2000-01-09     8     5
2000-01-10     0     6
2000-01-11     0     7
2000-01-12     0     8
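For comparison, the other options mentioned above can be sketched as follows. This is only an illustration under the same test data; the column names Data_1 and Data_2 are hypothetical and are set here just to make the two columns distinguishable:

```python
import pandas as pd

# Rebuild the combined frame with Time as the index.
df1 = pd.DataFrame({'Data': range(9)},
                   index=pd.date_range('1/1/2000', periods=9, freq='D'))
df2 = pd.DataFrame({'Data': range(9)},
                   index=pd.date_range('1/4/2000', periods=9, freq='D'))
df3 = pd.concat([df1, df2], axis=1)
df3.columns = ['Data_1', 'Data_2']  # illustrative names, not from the answer

# dropna keeps only the dates for which both frames have a value.
overlap = df3.dropna()

# ffill carries the last known value forward; NaN at the start of a column
# stays NaN because there is nothing earlier to copy from.
forward = df3.ffill()

# bfill copies the next known value backward; NaN at the end of a column
# stays NaN for the same reason.
backward = df3.bfill()

print(overlap)
```

Which one fits depends on the data: dropna discards dates, while the fill methods keep every date but invent values for the gaps.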