2 Answers

Contributed 1801 experience points · received 8+ likes
This works fine for me. It looks like you did not assign the result back to a variable:
import pandas as pd

# mydata is the DataFrame from the question
mydata['state'] = pd.Categorical(mydata['state'],
                                 ["Delivered", "In-Transit", "Shipped", "Cancelled"],
                                 ordered=True)
# keep='first' is the default, so it can be omitted
mydata = mydata.sort_values('state').drop_duplicates(['ID', 'version'])
print(mydata)
    ID  version       Name       state
2  101        1        Nut   Delivered
3  101        2    Nut 2.0  In-Transit
5  102        1      Screw  In-Transit
6  102        2  Screw 2.0     Shipped
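For anyone who wants to run this end to end, here is a minimal sketch. The input frame is an assumption: the question's data is not shown in this thread, so the rows below are made up to be consistent with the output above. `kind="stable"` is also my addition, so that rows with equal state keep their original relative order.

```python
import pandas as pd

# Hypothetical sample data; the frame from the original question is not shown here.
mydata = pd.DataFrame({
    "ID":      [101, 101, 101, 101, 102, 102, 102],
    "version": [1, 1, 1, 2, 1, 1, 2],
    "Name":    ["Nut", "Nut", "Nut", "Nut 2.0", "Screw", "Screw", "Screw 2.0"],
    "state":   ["Shipped", "In-Transit", "Delivered", "In-Transit",
                "Shipped", "In-Transit", "Shipped"],
})

# Earlier categories sort first, so "Delivered" has the highest priority.
priority = ["Delivered", "In-Transit", "Shipped", "Cancelled"]
mydata["state"] = pd.Categorical(mydata["state"], categories=priority, ordered=True)

# Sort by priority, then keep the first row of each (ID, version) pair.
# Neither method works in place, so the result has to be assigned.
result = (
    mydata.sort_values("state", kind="stable")
          .drop_duplicates(["ID", "version"])
)
print(result)
```

With this made-up input, the surviving rows are the ones at index 2, 3, 5 and 6.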
Also, if you want the output sorted by ID and version, sort by multiple columns:
mydata['state'] = pd.Categorical(mydata['state'],
                                 ["Delivered", "In-Transit", "Shipped", "Cancelled"],
                                 ordered=True)
mydata = mydata.sort_values(['ID', 'version', 'state']).drop_duplicates(['ID', 'version'])
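Both snippets above depend on the categorical conversion: without it, the state column would sort alphabetically rather than by the intended priority. A small sketch of the difference, using a standalone Series that I made up for illustration:

```python
import pandas as pd

# Hypothetical values, just to compare the two sort orders.
states = pd.Series(["Shipped", "Cancelled", "Delivered", "In-Transit"])

# Plain strings sort alphabetically.
alphabetical = states.sort_values().tolist()

# A categorical sorts by the declared category order instead.
priority = ["Delivered", "In-Transit", "Shipped", "Cancelled"]
by_priority = (
    pd.Series(pd.Categorical(states, categories=priority, ordered=True))
    .sort_values()
    .tolist()
)

print(alphabetical)  # ['Cancelled', 'Delivered', 'In-Transit', 'Shipped']
print(by_priority)   # ['Delivered', 'In-Transit', 'Shipped', 'Cancelled']
```

With plain strings, a "Cancelled" row would sort first and be the one that drop_duplicates keeps.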

Contributed 1802 experience points · received 5+ likes
Use pd.Categorical with ordered=True to create a categorical variable, then sort_values on that variable, groupby on ID and version, and aggregate using first:
mydata['State'] = pd.Categorical(mydata['State'], ["Delivered", "In-Transit", "Shipped", "Cancelled"], ordered=True)
df = mydata.sort_values('State').groupby(['ID', 'version'], as_index=False).first()
Result:
    ID  version       Name       State
0  101        1        Nut   Delivered
1  101        2    Nut 2.0  In-Transit
2  102        1      Screw  In-Transit
3  102        2  Screw 2.0     Shipped
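One thing to keep in mind when choosing between the two answers: groupby(...).first() takes the first non-null value of each column separately, while drop_duplicates keeps one whole row. The two only differ when the data has missing values. A minimal sketch with a made-up two-row frame (the missing Name is an assumption for illustration, not something from the question):

```python
import pandas as pd

# Hypothetical frame: the highest-priority row has a missing Name.
df = pd.DataFrame({
    "ID": [101, 101],
    "version": [1, 1],
    "Name": [None, "Nut"],
    "State": ["Delivered", "Shipped"],
})
df["State"] = pd.Categorical(
    df["State"], ["Delivered", "In-Transit", "Shipped", "Cancelled"], ordered=True
)
ordered = df.sort_values("State")

# drop_duplicates keeps the whole first row, including its missing Name.
kept_row = ordered.drop_duplicates(["ID", "version"])

# first() skips nulls per column, so Name comes from the "Shipped" row
# while State still comes from the "Delivered" row.
first_non_null = ordered.groupby(["ID", "version"], as_index=False).first()

print(kept_row)
print(first_non_null)
```

If the columns have no missing values, both approaches pick the same values; the groupby version additionally resets the index and orders the output by the group keys.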