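If you only want to do this by city:
# numeric_only=True skips text columns, which median() cannot handle in pandas 2.0 and later
df.groupby(by='City').median(numeric_only=True)
If you want to group by both city and job:
df.groupby(by=['City', 'Job']).median(numeric_only=True)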
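As a minimal sketch of the two groupings above, assuming the value of interest is the median of a numeric Age column. The sample frame mirrors the one used for the unemployment rate in this answer, and the names median_by_city and median_by_city_job are only illustrative. Selecting Age explicitly is another way to keep text columns out of the median:

```python
import pandas as pd

# Sample data (same shape as the frame used elsewhere in this answer)
df = pd.DataFrame({
    'User': ['A', 'B', 'C', 'D', 'E', 'F'],
    'City': ['x', 'x', 'x', 'y', 'y', 'y'],
    'Job': ['Unemployed', 'Student', 'Unemployed', 'Data Scientist', 'Unemployed', 'Student'],
    'Age': [33, 18, 27, 28, 45, 18],
})

# Median age per city: a Series indexed by City
median_by_city = df.groupby('City')['Age'].median()

# Median age per (City, Job) pair: a Series with a two-level index
median_by_city_job = df.groupby(['City', 'Job'])['Age'].median()

print(median_by_city)
print(median_by_city_job)
```

With this data the median age is 27 for city x and 28 for city y.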
To get the unemployment rate for each city:
import pandas as pd

df = pd.DataFrame({
    'User': ['A', 'B', 'C', 'D', 'E', 'F'],
    'City': ['x', 'x', 'x', 'y', 'y', 'y'],
    'Job': ['Unemployed', 'Student', 'Unemployed', 'Data Scientist', 'Unemployed', 'Student'],
    'Age': [33, 18, 27, 28, 45, 18],
})
df['count'] = 1  # helper column: each row counts as one user

# Number of users per (City, Job) pair
unmpl = df.groupby(by=['City', 'Job'])['count'].sum().reset_index()
# Keep only the unemployed rows (one per city that has any)
unmpl_by_city = unmpl[unmpl['Job'] == 'Unemployed'].reset_index(drop=True)
# Total number of users per city, indexed by City
count_by_city = df.groupby(by='City')['count'].sum()
# Look up each city's total by name so rows cannot misalign, then compute the percentage
totals = unmpl_by_city['City'].map(count_by_city)
unmpl_by_city['frac'] = unmpl_by_city['count'] * 100.0 / totals
unmpl_by_city
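A shorter route to the same percentages, offered as a sketch rather than the only way to do it: the mean of a boolean column is the fraction of True values, so grouping an is-unemployed flag by city gives the rate directly. The names is_unemployed and unemployment_rate are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'User': ['A', 'B', 'C', 'D', 'E', 'F'],
    'City': ['x', 'x', 'x', 'y', 'y', 'y'],
    'Job': ['Unemployed', 'Student', 'Unemployed', 'Data Scientist', 'Unemployed', 'Student'],
    'Age': [33, 18, 27, 28, 45, 18],
})

# True where the user is unemployed, False otherwise
is_unemployed = df['Job'].eq('Unemployed')

# Mean of the flag within each city = share of unemployed users; scale to percent
unemployment_rate = is_unemployed.groupby(df['City']).mean() * 100

print(unemployment_rate.round(2))
```

Unlike filtering on Job first, this keeps cities with no unemployed users in the result, with a rate of 0.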