2 Answers

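Contributor: 2041 experience points, 4+ likes
Here is one possible solution. However, you have to find out all possible keys beforehand. I suppose that could be done programmatically, but I have hard-coded them here. Also, if an item has more than one value, only the first one is used.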
import pandas as pd
import json
# original dataframe
df = pd.DataFrame({'x':['''[{"key":"Gender","value":["Men"]},
{"key":"Shoe Size","value":["M"]},
{"key":"Shoe Category","value":["Men's Shoes"]},
{"key":"Color","value":["Multicolor"]},
{"key":"Manufacturer Part Number","value":["8190-W-NAVY-7.5"]},
{"key":"Brand","value":["Josmo"]}]''',
'''[{"key":"Gender","value":["Women"]},
{"key":"Shoe Size","value":["M"]},
{"key":"Shoe Category","value":["Women's Shoes"]},
{"key":"Color","value":["Multicolor"]},
{"key":"Manufacturer Part Number","value":["8190-W-NAVY-7.5"]}]'''],
'y':['A','B']})
expanded_columns = ['Gender', 'Shoe Size', 'Shoe Category', 'Color',
'Manufacturer Part Number', 'Brand']
# function to create list of values from json text
def json_to_cols(s):
    l = json.loads(s)
    d = {i: None for i in expanded_columns}
    for row in l:
        d[row['key']] = row['value'][0]
    return list(d.values())
# Create new dataframe with expanded columns
df1 = df.apply(lambda row: pd.Series(json_to_cols(row['x']), index=expanded_columns),
               axis=1)
new_df = df.join(df1)
print(new_df)
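
The hard-coded key list could probably be built from the data instead. Here is a minimal sketch, assuming every cell in x is a valid JSON list of {"key": ..., "value": [...]} objects. The names collect_keys and demo are made up for illustration and are not part of the answer above.

```python
import json
import pandas as pd

def collect_keys(json_strings):
    """Return every distinct "key" found across the JSON strings, in first-seen order."""
    seen = {}  # dict used as an ordered set
    for s in json_strings:
        for pair in json.loads(s):
            seen.setdefault(pair["key"], None)
    return list(seen)

# Hypothetical two-row frame shaped like the example data
demo = pd.DataFrame({
    "x": [
        '[{"key":"Gender","value":["Men"]},{"key":"Brand","value":["Josmo"]}]',
        '[{"key":"Gender","value":["Women"]},{"key":"Color","value":["Multicolor"]}]',
    ],
    "y": ["A", "B"],
})

demo_columns = collect_keys(demo["x"])
print(demo_columns)  # ['Gender', 'Brand', 'Color']
```

The resulting list could then be used in place of expanded_columns, so that a key appearing in only one row still gets its own column.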

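Contributor: 1810 experience points, 5+ likes
It is not entirely clear what you want, but the following code produces a DataFrame whose column names are taken from y, whose index is taken from the keys in x, and whose values in each column are taken from the values in x, with NaN for any key that does not appear.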
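import ast

# Outer dict: one entry per input row, keyed by that row's 'y' label.
# Inner dict: key -> first value, parsed from the string in 'x'.
output_df = pd.DataFrame(
    {input_row[1]['y']:
        {
            pair['key']: pair['value'][0]
            for pair in ast.literal_eval(input_row[1]['x'])
        }
     for input_row in df.iterrows()
    }
)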
Output:
                                        A         B
Brand                               Josmo       NaN
Color                          Multicolor       NaN
Gender                                Men     Women
Heel Height                           NaN  1 Inches
Manufacturer Part Number  8190-W-NAVY-7.5       NaN
Shoe Category                 Men's Shoes       NaN
Shoe Size                               M       NaN
Size                                  NaN        XL
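
If you would rather have one row per y label and one column per key, a variation along these lines might work. This is only a sketch: it swaps ast.literal_eval for json.loads (on the assumption that x always holds valid JSON) and assumes the y labels are unique, since duplicate labels would overwrite each other. The names expand_key_value_column and demo are hypothetical.

```python
import json
import pandas as pd

def expand_key_value_column(frame, source="x", label="y"):
    """Build a wide frame: one row per label, one column per key (first value only)."""
    records = {
        row[label]: {pair["key"]: pair["value"][0] for pair in json.loads(row[source])}
        for _, row in frame.iterrows()
    }
    # orient="index" makes the outer dict keys the row labels;
    # keys missing from a row come out as NaN.
    return pd.DataFrame.from_dict(records, orient="index")

# Hypothetical two-row frame shaped like the example data
demo = pd.DataFrame({
    "x": [
        '[{"key":"Gender","value":["Men"]},{"key":"Brand","value":["Josmo"]}]',
        '[{"key":"Gender","value":["Women"]},{"key":"Color","value":["Multicolor"]}]',
    ],
    "y": ["A", "B"],
})

wide = expand_key_value_column(demo)
print(wide)
```

Calling output_df.T on the frame built above should give the same layout. The helper just makes the orientation explicit.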