
Extracting a nested dictionary from a Pandas column

Python

慕桂英3389331 2022-06-02 10:16:36

I'm trying to build a DataFrame from a nested dictionary stored in a column of my pandas DataFrame, but I can't get it to work.

My DataFrame:

    created_at                 selected
    2019-08-13T12:24:53+00:00  {"982813":false,"1786112":true,"3002218":false}
    2019-08-31T13:47:51+00:00  {"309279":true,"1903384":false}
    ...

I want to create a new df in which the data from the selected column is laid out like this:

    created_at                 ID       Value
    2019-08-13T12:24:53+00:00  982813   false
    2019-08-13T12:24:53+00:00  1786112  true
    2019-08-13T12:24:53+00:00  3002218  false
    2019-08-31T13:47:51+00:00  309279   true
    2019-08-31T13:47:51+00:00  1903384  false
    ...

I tried explode() and json_normalize() without success, so I decided to use pd.DataFrame.from_dict() with a for loop like the one below, but I ran into an error.

    x = {}
    for row in df.selected:
        pd.DataFrame.from_dict(row, orient='index')

This gives me the following error:

    AttributeError: 'str' object has no attribute 'values'

I'm still a Python beginner, so if anyone has an idea or an explanation, I'm all ears.


3 Answers

HUWWW

Contributed 1874 experience points, received over 12 likes

You want to use .apply(pd.Series), then stack(), and then rename your columns:

df.set_index('created_at')['selected'].apply(pd.Series).stack().reset_index().rename(columns={'level_1':'ID',0:'Value'})

                  created_at       ID  Value
0  2019-08-13T12:24:53+00:00   982813  False
1  2019-08-13T12:24:53+00:00  1786112   True
2  2019-08-13T12:24:53+00:00  3002218  False
3  2019-08-31T13:47:51+00:00   309279   True
4  2019-08-31T13:47:51+00:00  1903384  False

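By the way, for future reference: you will get answers faster if you provide code that reproduces your starting point. I spent most of my time working this out: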

df = pd.DataFrame({"created_at": ['2019-08-13T12:24:53+00:00', '2019-08-31T13:47:51+00:00'], "selected": [{"982813":False,"1786112":True,"3002218":False}, {"309279":True,"1903384":False}]})
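The AttributeError in the question suggests that the selected column may actually hold JSON strings rather than dicts. A minimal sketch, assuming that is the case: parse each cell with json.loads first, then reshape as above. The sample frame and the name long_df are illustrative, and the extra dropna() is only there in case the installed pandas version keeps missing values when stacking.

```python
import json

import pandas as pd

# Assumed starting point: "selected" holds JSON text, not dict objects.
df = pd.DataFrame({
    "created_at": ["2019-08-13T12:24:53+00:00", "2019-08-31T13:47:51+00:00"],
    "selected": [
        '{"982813":false,"1786112":true,"3002218":false}',
        '{"309279":true,"1903384":false}',
    ],
})

# Parse only the cells that are strings; cells that are already dicts pass through.
df["selected"] = df["selected"].map(
    lambda cell: json.loads(cell) if isinstance(cell, str) else cell
)

long_df = (
    df.set_index("created_at")["selected"]
    .apply(pd.Series)   # one column per dictionary key, NaN where a key is absent
    .stack()            # move the key columns into a second index level
    .dropna()           # discard key/row combinations that never existed
    .reset_index()
    .rename(columns={"level_1": "ID", 0: "Value"})
)
print(long_df)
```

If the strings are Python literals rather than valid JSON (for example with True/False capitalised), ast.literal_eval would be the analogous parser.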

Answered 2022-06-02

墨色風雨

Contributed 1853 experience points, received over 6 likes

Here is a tiny example to show you the idea. It is not recommended if your data volume is large:

import pandas as pd

df = pd.DataFrame([[1, {'abc': 11}], [2, {'def': 22, 'ghi': 33}]], columns=['id', 'dct'])

lst = []
for index, row in df.iterrows():
    for key, value in row['dct'].items():
        lst.append([row['id'], key, value])

new = pd.DataFrame(lst, columns=['id', 'string', 'value'])
print(new)
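The same idea can be written against the question's column names with a list comprehension over zip, which typically avoids much of the per-row overhead of iterrows(). This is only a sketch that assumes selected already holds dicts; the names records and long_df are illustrative.

```python
import pandas as pd

# Assumed starting point: "selected" already contains dict objects.
df = pd.DataFrame({
    "created_at": ["2019-08-13T12:24:53+00:00", "2019-08-31T13:47:51+00:00"],
    "selected": [
        {"982813": False, "1786112": True, "3002218": False},
        {"309279": True, "1903384": False},
    ],
})

# One (created_at, key, value) tuple per dictionary entry, in row order.
records = [
    (created_at, key, value)
    for created_at, dct in zip(df["created_at"], df["selected"])
    for key, value in dct.items()
]

long_df = pd.DataFrame(records, columns=["created_at", "ID", "Value"])
print(long_df)
```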

Answered 2022-06-02

白豬掌柜的

Contributed 1893 experience points, received over 10 likes

In your case, you can use explode, which is available in pandas from version 0.25.0 onward:

df.BB = df.BB.map(lambda x: list(x.items()))
s = df.explode('BB')
pd.concat([s, pd.DataFrame(s.BB.tolist(), index=s.index)], axis=1)

Out[93]:
   CC      BB  0  1
0   1  (1, 2)  1  2
0   1  (2, 1)  2  1
1   2  (2, 2)  2  2
1   2  (8, 3)  8  3
1   2  (4, 5)  4  5

Data:

df= pd.DataFrame({'CC':[1,2],'BB':[{1:2,2:1},{2:2,8:3,4:5}]})
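Applied to the question's columns, the explode approach might look like the sketch below. It assumes pandas 0.25.0 or later and that selected already holds dicts; the names exploded, pairs and long_df are illustrative. The index is reset before the key/value pairs are split out so that the two pieces line up on a simple, unique index.

```python
import pandas as pd

# Assumed starting point: "selected" already contains dict objects.
df = pd.DataFrame({
    "created_at": ["2019-08-13T12:24:53+00:00", "2019-08-31T13:47:51+00:00"],
    "selected": [
        {"982813": False, "1786112": True, "3002218": False},
        {"309279": True, "1903384": False},
    ],
})

# Turn each dict into a list of (key, value) pairs, then give each pair its own row.
exploded = (
    df.assign(selected=df["selected"].map(lambda d: list(d.items())))
    .explode("selected")
    .reset_index(drop=True)
)

# Split the (key, value) tuples into two named columns.
pairs = pd.DataFrame(exploded["selected"].tolist(), columns=["ID", "Value"])

long_df = pd.concat([exploded[["created_at"]], pairs], axis=1)
print(long_df)
```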

Answered 2022-06-02
