3 Answers

User has contributed 1847 experience points and received over 7 likes
First combine the two DataFrames with concat (or append), then merge the result into test:
dat = pd.concat([data1, data2], ignore_index=True)
Or, on pandas versions before 2.0 only (DataFrame.append was deprecated in 1.4 and removed in 2.0):
dat = data1.append(data2, ignore_index=True)
print (dat)
       key1      key2 new feature
0  0.365803  0.259112       info1
1  0.204009  0.669371       info2
# if the join columns have the same names in both DataFrames, the `on` parameter alone is enough
df = test.merge(dat, on=['key1', 'key2'], how='left')
print (df)
       key1      key2 new feature
0  0.365803  0.259112       info1
1  0.086869  0.589834         NaN
2  0.269619  0.183644         NaN
3  0.755826  0.045187         NaN
4  0.204009  0.669371       info2
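
For anyone who wants to run the concat-then-merge approach end to end, here is a minimal sketch. The contents of `test`, `data1`, and `data2` are assumptions reconstructed from the printed output, and the column name `new_feature` (with an underscore) is also assumed. The `indicator=True` argument is an optional extra that is not part of the answer; it adds a `_merge` column showing which rows found a match.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the printed output above.
test = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
})
data1 = pd.DataFrame({"key1": [0.365803], "key2": [0.259112], "new_feature": ["info1"]})
data2 = pd.DataFrame({"key1": [0.204009], "key2": [0.669371], "new_feature": ["info2"]})

# Stack the two lookup tables into one, renumbering the rows.
dat = pd.concat([data1, data2], ignore_index=True)

# Left join: every row of test is kept; unmatched rows get NaN in new_feature.
df = test.merge(dat, on=["key1", "key2"], how="left", indicator=True)

print(df)
```

Note that merging on float keys relies on exact equality, so this only matches rows whose key values are bit-for-bit identical in both DataFrames.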

User has contributed 1804 experience points and received over 8 likes
You can use pd.DataFrame.update instead:
# create new column and set index
res = test.assign(newfeature=None).set_index(['key1', 'key2'])
# update with new data sequentially
res.update(data1.set_index(['key1', 'key2']))
res.update(data2.set_index(['key1', 'key2']))
# reset index to recover columns
res = res.reset_index()
print(res)
       key1      key2 newfeature
0  0.365803  0.259112      info1
1  0.086869  0.589834       None
2  0.269619  0.183644       None
3  0.755826  0.045187       None
4  0.204009  0.669371      info2
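
The update approach can be tried with the self-contained sketch below. The sample values are assumptions reconstructed from the printed output, and the extra DataFrames are assumed to carry a column named `newfeature`, because `update` only touches columns that already exist in the target under the same name. The loop is just a compact way to write the two sequential `update` calls.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the printed output above.
test = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
})
data1 = pd.DataFrame({"key1": [0.365803], "key2": [0.259112], "newfeature": ["info1"]})
data2 = pd.DataFrame({"key1": [0.204009], "key2": [0.669371], "newfeature": ["info2"]})

keys = ["key1", "key2"]

# Create the empty target column and index by the key columns.
res = test.assign(newfeature=None).set_index(keys)

# update() modifies res in place, aligning on the index and on column names.
for extra in (data1, data2):
    res.update(extra.set_index(keys))

# Turn the keys back into ordinary columns.
res = res.reset_index()
print(res)
```

Because `update` works in place and returns None, its result should not be assigned back to `res`.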

User has contributed 1799 experience points and received over 6 likes
You can also give both DataFrames the same index and use a simple loc assignment:
df = df.set_index(["key1", "key2"])
df2 = df2.set_index(["key1", "key2"])
Then:
df.loc[:, "new_feature"] = df2['new_feature']
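
A runnable sketch of this index-alignment approach follows. It assumes `df` is the main table and `df2` is a lookup table holding `new_feature`; the sample values are hypothetical, chosen to match the other answers. The final `reset_index` is an addition that restores the keys as ordinary columns.

```python
import pandas as pd

# Hypothetical main table and lookup table.
df = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
})
df2 = pd.DataFrame({
    "key1": [0.365803, 0.204009],
    "key2": [0.259112, 0.669371],
    "new_feature": ["info1", "info2"],
})

# Give both DataFrames the same (key1, key2) index.
df = df.set_index(["key1", "key2"])
df2 = df2.set_index(["key1", "key2"])

# Assignment aligns on the index: keys missing from df2 become NaN.
df.loc[:, "new_feature"] = df2["new_feature"]

# Restore the keys as regular columns.
out = df.reset_index()
print(out)
```

This alignment requires the key pairs in `df2` to be unique; duplicated keys in the lookup table would make the reindexing step fail.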