I have this DataFrame:

import pandas as pd
from datetime import datetime

df = pd.DataFrame([
    {"_id": "1", "date": datetime.strptime("2020-09-29 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
    {"_id": "2", "date": datetime.strptime("2020-09-29 14:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "3", "date": datetime.strptime("2020-09-25 17:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
    {"_id": "4", "date": datetime.strptime("2020-09-17 09:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "5", "date": datetime.strptime("2020-09-19 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "6", "date": datetime.strptime("2020-09-19 08:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
]).set_index('date')

It looks like this:

                    _id   status
date
2020-09-29 07:00:00   1  started
2020-09-29 14:00:00   2      end
2020-09-25 17:00:00   3  started
2020-09-17 09:00:00   4      end
2020-09-19 07:00:00   5      end
2020-09-19 08:00:00   6      end

I am trying to group by day and count the occurrences of each status, but I want the status value to be part of the column name. This is the desired output:

                     status_started  status_end
date
2020-09-29 07:00:00               1           1
2020-09-25 17:00:00               1           0
2020-09-17 09:00:00               0           1
2020-09-19 07:00:00               0           2

I tried this:

df = df.groupby([pd.Grouper(freq='d'), 'status']).agg({'status': "count"})
df = df.reset_index(level="status")

out:

                    status
date       status
2020-09-17 end           1
2020-09-19 end           2
2020-09-25 started       1
2020-09-29 end           1
2020-09-29 started       1

But I could not reshape the result into the desired form.

2 Answers

qq_笑_17
Contributed 1818 experience points, received 7+ upvotes
You can try crosstab:
d = pd.crosstab(df.index.date, df['status'])\
.rename_axis('date').add_prefix('status_')
status      status_end  status_started
date
2020-09-17           1               0
2020-09-19           2               0
2020-09-25           0               1
2020-09-29           1               1
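For anyone who wants to run the crosstab approach end to end, here is a minimal self-contained sketch. It rebuilds the question's data in a more compact form (an assumption made for brevity, not the asker's exact constructor), and the final column reorder is an optional extra that only matches the column order shown in the desired output.

```python
import pandas as pd

# Same six rows as in the question, built compactly (illustrative rewrite).
df = pd.DataFrame(
    {
        "_id": ["1", "2", "3", "4", "5", "6"],
        "status": ["started", "end", "started", "end", "end", "end"],
    },
    index=pd.DatetimeIndex(
        [
            "2020-09-29 07:00:00",
            "2020-09-29 14:00:00",
            "2020-09-25 17:00:00",
            "2020-09-17 09:00:00",
            "2020-09-19 07:00:00",
            "2020-09-19 08:00:00",
        ],
        name="date",
    ),
)

# Count rows per calendar day and status, then prefix the column labels.
counts = (
    pd.crosstab(df.index.date, df["status"])
    .rename_axis("date")
    .add_prefix("status_")
)

# Optional: put the columns in the order shown in the question.
counts = counts[["status_started", "status_end"]]
print(counts)
```

Note that the resulting index holds plain dates (one row per day), not the original timestamps shown in the question's desired output.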

一只名叫tom的貓
Contributed 1906 experience points, received 3+ upvotes
You only need unstack:
df.groupby([pd.Grouper(freq='d'), 'status']).size().unstack('status', fill_value=0)
Output:
status      end  started
date
2020-09-17    1        0
2020-09-19    2        0
2020-09-25    0        1
2020-09-29    1        1
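The unstack result above uses the bare status values as column names. A possible way to get the status_ prefix the question asks for is to chain add_prefix, sketched below. The compact DataFrame construction and the variable name daily are assumptions for illustration, and rename_axis(columns=None) is an optional step that drops the leftover "status" label on the columns axis.

```python
import pandas as pd

# Same six rows as in the question, built compactly (illustrative rewrite).
df = pd.DataFrame(
    {
        "_id": ["1", "2", "3", "4", "5", "6"],
        "status": ["started", "end", "started", "end", "end", "end"],
    },
    index=pd.DatetimeIndex(
        [
            "2020-09-29 07:00:00",
            "2020-09-29 14:00:00",
            "2020-09-25 17:00:00",
            "2020-09-17 09:00:00",
            "2020-09-19 07:00:00",
            "2020-09-19 08:00:00",
        ],
        name="date",
    ),
)

daily = (
    df.groupby([pd.Grouper(freq="D"), "status"])  # one group per (day, status)
    .size()                                        # number of rows in each group
    .unstack("status", fill_value=0)               # statuses become columns, gaps become 0
    .add_prefix("status_")                         # end -> status_end, started -> status_started
    .rename_axis(columns=None)                     # drop the "status" columns-axis label
)
print(daily)
```

Unlike the crosstab version, this keeps a DatetimeIndex (midnight of each day), which can be convenient if the result is resampled or joined on dates later.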