import pandas as pd

df = pd.DataFrame({'key1': ['a', 'a', 'a', 'b', 'b'],
                   'key2': ['c', 'd', 'c', 'c', 'd'],
                   'data': [1, 10, 2, 3, 30]})
>>> df
  key1 key2  data
0    a    c     1
1    a    d    10
2    a    c     2
3    b    c     3
4    b    d    30
Desired result:
  key1 key2  data  row_number
0    a    c     1           1
1    a    d    10           1
2    a    c     2           2
3    b    c     3           1
4    b    d    30           1
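I want to group by key1 and key2, sort by data within each group, and get each row's sequence number. How should I do this? I found the following method by searching, but it did not work: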
df['row_number'] = df['data'].groupby(df['key1','key2']).rank(ascending=True,method='first')
1 Answer

德瑪西亞99
Contributed 1770 experience points, received more than 3 likes
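def cumsum_seq(v):
    # Sort the group by data, then turn the column of 1s into a running count
    sub = v.sort_values('data')
    sub['seq'] = sub['seq'].cumsum()
    return sub.loc[:, ['data', 'seq']]

# Seed every row with 1 so the cumulative sum gives 1, 2, 3, ... within each group
df['seq'] = 1
df.groupby(['key1', 'key2']).apply(cumsum_seq).reset_index().drop(columns='level_2')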
Result:
| | key1 | key2 | data | seq |
|---|---|---|---|---|
| 0 | a | c | 1 | 1 |
| 1 | a | c | 2 | 2 |
| 2 | a | d | 10 | 1 |
| 3 | b | c | 3 | 1 |
| 4 | b | d | 30 | 1 |
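
As a possible shorter route, the attempt from the question seems to fail only because of how the grouping keys are passed: `df['key1','key2']` looks up a single column labelled `('key1', 'key2')` and raises a KeyError. The sketch below passes the column names as a list instead and keeps the original row order, which matches the desired result in the question. The column name `row_number_alt` is made up here for illustration, and the `sort_values` plus `cumcount` variant is an alternative technique, not the method from the answer above.

```python
import pandas as pd

df = pd.DataFrame({'key1': ['a', 'a', 'a', 'b', 'b'],
                   'key2': ['c', 'd', 'c', 'c', 'd'],
                   'data': [1, 10, 2, 3, 30]})

# Group by a list of column names, then rank 'data' within each group.
# method='first' breaks ties by order of appearance; rank() returns floats,
# so cast to int (safe here because 'data' has no missing values).
ranked = df.groupby(['key1', 'key2'])['data'].rank(method='first', ascending=True)
df['row_number'] = ranked.astype(int)

# Alternative (hypothetical column name): sort by 'data', then number the rows
# inside each group. cumcount() starts at 0, hence the +1. The assignment
# aligns on the index, so the original row order is preserved.
df['row_number_alt'] = (
    df.sort_values('data', kind='stable')
      .groupby(['key1', 'key2'])
      .cumcount() + 1
)

print(df)
```

Both columns should agree on this sample data. The `rank` form is probably the closest to what the question was trying to write.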