
Merging multiple DataFrames in pandas

Python

手掌心 2021-03-30 13:14:36

I am trying to merge several new DataFrames into one main DataFrame. Suppose the main DataFrame is:

       key1      key2
0  0.365803  0.259112
1  0.086869  0.589834
2  0.269619  0.183644
3  0.755826  0.045187
4  0.204009  0.669371

I am trying to merge the following two datasets into the main one.

New data 1:

       key1      key2 new feature
0  0.365803  0.259112       info1

New data 2:

       key1      key2 new feature
0  0.204009  0.669371       info2

Expected result:

       key1      key2 new feature
0  0.365803  0.259112       info1
1  0.776945  0.780978         NaN
2  0.275891  0.114998         NaN
3  0.667057  0.373029         NaN
4  0.204009  0.669371       info2

What I tried:

test = test.merge(data1, left_on=['key1', 'key2'], right_on=['key1', 'key2'], how='left')
test = test.merge(data2, left_on=['key1', 'key2'], right_on=['key1', 'key2'], how='left')

The first merge works well, but the second does not. Instead of a single column I get two suffixed columns:

       key1      key2 new feature_x new feature_y
0  0.365803  0.259112         info1           NaN
1  0.776945  0.780978           NaN           NaN
2  0.275891  0.114998           NaN           NaN
3  0.667057  0.373029           NaN           NaN
4  0.204009  0.669371           NaN         info2

Thank you for your help!


3 Answers

aluckdog


First stack the two new DataFrames with concat (or append), then merge the combined frame once:

dat = pd.concat([data1, data2], ignore_index=True)

Or, on pandas versions before 2.0 (DataFrame.append was removed in 2.0):

dat = data1.append(data2, ignore_index=True)

print (dat)

key1 key2 new feature

0 0.365803 0.259112 info1

1 0.204009 0.669371 info2

# if the join columns have the same names in both frames, it is simpler to pass only the on parameter

df = test.merge(dat, on=['key1', 'key2'], how='left')

print (df)

key1 key2 new feature

0 0.365803 0.259112 info1

1 0.086869 0.589834 NaN

2 0.269619 0.183644 NaN

3 0.755826 0.045187 NaN

4 0.204009 0.669371 info2
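The concat-then-merge approach above can be sketched end to end as follows. This is a minimal, self-contained example that rebuilds the frames from the values shown in the question; the variable names test, data1 and data2 follow the question, and the literal values are typed in by hand for illustration.

```python
import pandas as pd

# Main frame, using the key values from the question.
test = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
})

# The two new frames, each carrying one row of "new feature".
data1 = pd.DataFrame({"key1": [0.365803], "key2": [0.259112], "new feature": ["info1"]})
data2 = pd.DataFrame({"key1": [0.204009], "key2": [0.669371], "new feature": ["info2"]})

# Stack the new frames first, so "new feature" exists in only one frame.
dat = pd.concat([data1, data2], ignore_index=True)

# A single left merge then has no clashing column names,
# so pandas has no reason to add _x / _y suffixes.
df = test.merge(dat, on=["key1", "key2"], how="left")
print(df)
```

The suffixes in the question appear because the second merge finds a "new feature" column on both sides. Note that the match here relies on the float keys being exactly equal; keys produced by separate computations may need rounding before the merge.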

2021-04-20

胡說叔叔


You can use pd.DataFrame.update instead:

# create the new column and set the index (update aligns on column labels, so the name must match the column in data1/data2)

res = test.assign(newfeature=None).set_index(['key1', 'key2'])

# update with new data sequentially

res.update(data1.set_index(['key1', 'key2']))

res.update(data2.set_index(['key1', 'key2']))

# reset index to recover columns

res = res.reset_index()

print(res)

key1 key2 newfeature

0 0.365803 0.259112 info1

1 0.086869 0.589834 None

2 0.269619 0.183644 None

3 0.755826 0.045187 None

4 0.204009 0.669371 info2
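A self-contained sketch of the update approach, assuming the incoming column is literally named "new feature" (with a space) as printed in the question. Because update aligns on both index and column labels, the placeholder column is created under that same name here; the frames are rebuilt by hand from the question's values.

```python
import pandas as pd

keys = ["key1", "key2"]

test = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
})
data1 = pd.DataFrame({"key1": [0.365803], "key2": [0.259112], "new feature": ["info1"]})
data2 = pd.DataFrame({"key1": [0.204009], "key2": [0.669371], "new feature": ["info2"]})

# Placeholder column with the same label as the incoming column,
# then index by the join keys so update can align rows.
res = test.assign(**{"new feature": None}).set_index(keys)

# update works in place and only overwrites with non-missing values,
# so each new frame fills in its own rows and leaves the rest alone.
for new in (data1, data2):
    res.update(new.set_index(keys))

# Turn the keys back into ordinary columns.
res = res.reset_index()
print(res)
```

This variant scales to any number of new frames by extending the tuple in the loop, at the cost of mutating res in place.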

2021-04-20

哈士奇WWW


You can also give both DataFrames the same index and use a simple loc assignment:

df = df.set_index(["key1", "key2"])

df2 = df2.set_index(["key1", "key2"])

Then:

df.loc[:, "new_feature"] = df2['new_feature']
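A minimal sketch of the index-alignment idea, under two assumptions not stated in this answer: df is the main frame from the question, and df2 already holds both new rows in one frame with a column named new_feature. Assigning a Series to a column aligns it on the index, so rows without a match become NaN.

```python
import pandas as pd

# Main frame (assumed to be the question's main DataFrame), indexed by the keys.
df = pd.DataFrame({
    "key1": [0.365803, 0.086869, 0.269619, 0.755826, 0.204009],
    "key2": [0.259112, 0.589834, 0.183644, 0.045187, 0.669371],
}).set_index(["key1", "key2"])

# New data (assumed to be already combined into one frame), same index.
df2 = pd.DataFrame({
    "key1": [0.365803, 0.204009],
    "key2": [0.259112, 0.669371],
    "new_feature": ["info1", "info2"],
}).set_index(["key1", "key2"])

# Column assignment aligns df2's Series on df's index;
# index entries missing from df2 are filled with NaN.
df["new_feature"] = df2["new_feature"]

out = df.reset_index()
print(out)
```

The alignment requires the index of df2 to be unique; duplicate key pairs in the new data would need to be resolved first.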

2021-04-20
