
Match a DataFrame column against list values and append the matching rows to the DataFrame

Python

RISEBY 2022-10-18 15:09:52

I have two different CSV files, which I read into two DataFrames. I want to match the column df1['building_type'] against df2['model'] and append the corresponding row to df1.

DataFrame 1:

data = [{'length': '34', 'width': '58.5', 'height': '60.2', 'building_type': ['concrete','wood','steel','laminate']},
        {'length': '42', 'width': '33', 'height': '23', 'building_type': ['concrete_double','wood_double','steel_double']}]
df1 = pd.DataFrame(data)
print(df1)

DataFrame 2:

data2 = [{'type': 'A1', 'floor': '2', 'model': ['wood','laminate','concrete','steel']},
         {'type': 'B3', 'floor': '4', 'model': ['wood_double','concrete_double','steel_double']}]
df2 = pd.DataFrame(data2)
print(df2)

Final DataFrame:

  length width height                                 building_type type floor
0     34  58.5   60.2             [concrete, wood, steel, laminate]   A1     2
1     42    33     23  [concrete_double, wood_double, steel_double]   B3     4


1 Answer

交互式愛情


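pd.merge looks like the right tool here, but the values in the key columns must be hashable. A list is mutable and therefore unhashable, so it cannot be used as a merge key. We can convert each list to a tuple or a frozenset; both are immutable and hashable, so both can serve as join keys. Since the sample output shows that element order does not matter, I chose frozenset.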
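As a quick illustration of the point above, the following sketch uses only plain Python to show which container types are hashable and how tuple and frozenset differ when the element order differs. The helper name is_hashable is introduced here for illustration and is not part of pandas.

```python
# The two lists from the question: same elements, different order.
building = ['concrete', 'wood', 'steel', 'laminate']
model = ['wood', 'laminate', 'concrete', 'steel']


def is_hashable(value):
    """Return True if hash(value) succeeds (hypothetical helper for this sketch)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# Lists are unhashable; tuples and frozensets of strings are hashable.
list_hashable = is_hashable(building)
tuple_hashable = is_hashable(tuple(building))
frozenset_hashable = is_hashable(frozenset(building))

# Tuples compare element by element, so a different order means "not equal".
tuples_equal = tuple(building) == tuple(model)
# Frozensets ignore order (and duplicates), so these two compare equal.
frozensets_equal = frozenset(building) == frozenset(model)
# Sorting first is another way to get an order-insensitive tuple key.
sorted_tuples_equal = tuple(sorted(building)) == tuple(sorted(model))

print(list_hashable, tuple_hashable, frozenset_hashable)
print(tuples_equal, frozensets_equal, sorted_tuples_equal)
```

If duplicates inside a list should count (for example ['wood', 'wood'] versus ['wood']), the sorted tuple is probably the safer key, because a frozenset collapses duplicates.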

Here is the code:

import pandas as pd

data = [{'length': '34', 'width': '58.5', 'height': '60.2', 'building_type': ['concrete','wood','steel','laminate']},
        {'length': '42', 'width': '33', 'height': '23', 'building_type': ['concrete_double','wood_double','steel_double']}]
df1 = pd.DataFrame(data)
print(df1)

data2 = [{'type': 'A1', 'floor': '2', 'model': ['wood','laminate','concrete','steel']},
         {'type': 'B3', 'floor': '4', 'model': ['wood_double','concrete_double','steel_double']}]
df2 = pd.DataFrame(data2)
print(df2)

# Note: merge fails when the key columns hold unhashable values such as lists.
# pd.merge(df1, df2, left_on='building_type', right_on='model')
# Produces `TypeError: unhashable type: 'list'`

# Convert the unhashable lists to a hashable, immutable type, then merge.
# `tuple` is best if order matters for you. I am assuming that the
# order doesn't matter based on the sample output, so `frozenset` is more
# appropriate.
df1['building_type'] = df1['building_type'].apply(frozenset)
df2['model'] = df2['model'].apply(frozenset)

# Now, merge. Note that since the column names are different, both
# 'building_type' and 'model' are retained. You can remove one of them.
final_df = pd.merge(df1, df2, left_on='building_type', right_on='model')
final_df = final_df.drop(columns='model')
print(final_df)

Output on my machine:

  length width height                                 building_type
0     34  58.5   60.2             [concrete, wood, steel, laminate]
1     42    33     23  [concrete_double, wood_double, steel_double]

  type floor                                         model
0   A1     2             [wood, laminate, concrete, steel]
1   B3     4  [wood_double, concrete_double, steel_double]

  length width height                                 building_type type floor
0     34  58.5   60.2             (laminate, wood, steel, concrete)   A1     2
1     42    33     23  (concrete_double, steel_double, wood_double)   B3     4
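In the output above, building_type ends up holding frozensets, whereas the expected result in the question still shows the original lists. One possible variation, sketched below under that assumption, builds the frozenset in a temporary key column (named _key here purely for illustration) and drops it after the merge, so the original list column is left untouched. The choice of how='left' is also an assumption: it keeps every row of df1 even when no match exists.

```python
import pandas as pd

df1 = pd.DataFrame([
    {'length': '34', 'width': '58.5', 'height': '60.2',
     'building_type': ['concrete', 'wood', 'steel', 'laminate']},
    {'length': '42', 'width': '33', 'height': '23',
     'building_type': ['concrete_double', 'wood_double', 'steel_double']},
])
df2 = pd.DataFrame([
    {'type': 'A1', 'floor': '2',
     'model': ['wood', 'laminate', 'concrete', 'steel']},
    {'type': 'B3', 'floor': '4',
     'model': ['wood_double', 'concrete_double', 'steel_double']},
])

# Build an order-insensitive, hashable key in a temporary column so that
# the original list columns are not overwritten.
left = df1.assign(_key=df1['building_type'].apply(frozenset))
right = df2.assign(_key=df2['model'].apply(frozenset)).drop(columns='model')

# A left join keeps all rows of df1; unmatched rows would get NaN in the
# 'type' and 'floor' columns.
final_df = left.merge(right, on='_key', how='left').drop(columns='_key')
print(final_df)
```

With this variation df1 and df2 themselves are not modified, since assign returns new DataFrames.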

2022-10-18
