
Groupby with only selected columns


Python

慕村225694 2021-09-11 15:18:28

Here I read a file, "userdata.xlsx":

ID  Debt  Email              Age  User
1   7.5   [email protected]  16   John
2   15    [email protected]  15   John
3   22    [email protected]  15   John
4   30    [email protected]  22   David
5   33    [email protected]  22   David
6   51    [email protected]  61   Fred
7   11    [email protected]  25   Fred
8   24    [email protected]  19   Eric
9   68    [email protected]  55   Terry
10  335   [email protected]  55   Terry

I group by user, build a spreadsheet for each user, and write each one to its own .xlsx file, like this:

ID  Debt  Email              Age  User
1   7.5   [email protected]  16   John
2   15    [email protected]  15   John

This is the whole code:

#!/usr/bin/env python3
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import xlrd

df = pd.read_excel('userdata.xlsx')
grp = df.groupby('User')
for group in grp.groups:
    grouptofile = (grp.get_group(group))
    print(grouptofile)
    print(group)
    grouptofile.to_excel('%s.xlsx' % group , sheet_name='sheet1', index=False)

Now I want to save only selected columns for each user. Say I only want the "ID" and "Email" columns. I learned how to select only certain columns, like this:

selected = df[['ID','Email']]

So I thought it would make sense to add ID and Email here:

grp = df.groupby('User')

With "ID" and "Email" added:

grp = df[['ID', 'Email']].groupby('User')

Is it even possible to combine groupby with column selection?

#!/usr/bin/env python3
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import xlrd

df = pd.read_excel('userdata.xlsx')
grp = df[['ID', 'Email']].groupby('User')
for group in grp.groups:
    grouptofile = (grp.get_group(group))
    print(grouptofile)
    print(group)
    grouptofile.to_excel('%s.xlsx' % group , sheet_name='sheet1', index=False)


2 Answers

不負相思意

Contributed 1777 experience points, received over 10 likes

I think you need to select the columns from each group inside the loop:

cols = ['ID', 'Email']
for i, group in df.groupby('User'):
    group[cols].to_excel('{}.xlsx'.format(i), sheet_name='sheet1', index=False)
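A self-contained sketch of the same idea, assuming a small made-up DataFrame in place of userdata.xlsx (the e-mail addresses are placeholders, and `split_by_user` is a hypothetical helper, not a pandas function). It collects the per-user frames in a dict so they can be inspected before anything is written:

```python
import pandas as pd

# Made-up sample data standing in for userdata.xlsx; addresses are placeholders.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Debt": [7.5, 15, 22, 30, 33],
    "Email": ["a@example.com", "b@example.com", "c@example.com",
              "d@example.com", "e@example.com"],
    "Age": [16, 15, 15, 22, 22],
    "User": ["John", "John", "John", "David", "David"],
})


def split_by_user(frame, cols, key="User"):
    """Hypothetical helper: map each value of `key` to its rows, keeping only `cols`."""
    return {name: group[cols] for name, group in frame.groupby(key)}


parts = split_by_user(df, ["ID", "Email"])
for name, part in parts.items():
    print(name, list(part.columns), len(part))
    # To write the files as in the answer (needs an Excel writer such as openpyxl):
    # part.to_excel(f"{name}.xlsx", sheet_name="sheet1", index=False)
```

Grouping on the full DataFrame keeps `User` available as the key, and the column selection is applied only to what gets written.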

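If you get KeyError: 'User', it means you are trying to use a column that does not exist.

If you select only the ID and Email columns, the chained groupby cannot find the User column and raises the error: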

print(df[['ID', 'Email']])

   ID              Email
0   1  [email protected]
1   2  [email protected]
2   3  [email protected]
3   4  [email protected]
4   5  [email protected]
5   6  [email protected]
6   7  [email protected]
7   8  [email protected]
8   9  [email protected]
9  10  [email protected]
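The failure can be reproduced in a few lines. This is a minimal sketch on made-up data (the addresses are placeholders), showing that the key column is gone once the subset is taken:

```python
import pandas as pd

# Made-up sample data; addresses are placeholders.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Email": ["a@example.com", "b@example.com", "c@example.com"],
    "User": ["John", "John", "David"],
})

subset = df[["ID", "Email"]]       # 'User' is dropped here
print("User" in subset.columns)    # False

try:
    subset.groupby("User")         # the grouping key no longer exists
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)
```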

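So the column used by groupby must be part of the selection too (note that the User column is then written to each file as well):

for i, group in df[['ID', 'Email', 'User']].groupby('User'):
    group.to_excel('{}.xlsx'.format(i), sheet_name='sheet1', index=False)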

Or select the columns just before writing the file, as in the first solution, so that User is left out of the output:

for i, group in df[['ID', 'Email', 'User']].groupby('User'):
    group[cols].to_excel('{}.xlsx'.format(i), sheet_name='sheet1', index=False)

2021-09-11

MMMHUHU

Contributed 1834 experience points, received over 8 likes

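It is possible... but not the way you are doing it.

You are effectively dropping every column except two, and then trying to group by a third column that no longer exists. Instead, you need to group before you select the columns. (I was not sure whether grouping mutates the data; as far as I know, pandas groupby does not modify the original DataFrame, so no copy should be needed.)

(Possibly suboptimal) example: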

grp = df[['ID', 'Email', 'User']].groupby('User')[['ID', 'Email']]
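A sketch of grouping first and selecting columns afterwards, on made-up data (the addresses are placeholders). It assumes a recent pandas version, where iterating over a column-selected groupby yields only the selected columns; it is worth checking the printed column names on your own installation. Note that the selection uses lists, since a tuple would be treated as a single column label:

```python
import pandas as pd

# Made-up sample data; addresses are placeholders.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Debt": [7.5, 15, 30, 33],
    "Email": ["a@example.com", "b@example.com",
              "c@example.com", "d@example.com"],
    "User": ["John", "John", "David", "David"],
})

# Group on the full frame, then narrow to the columns to keep.
grp = df.groupby("User")[["ID", "Email"]]

frames = {}
for name, group in grp:
    frames[name] = group
    print(name, list(group.columns))
    # group.to_excel(f"{name}.xlsx", sheet_name="sheet1", index=False)
```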

2021-09-11
