2 Answers

Contributed 1155 experience points, received 0 likes
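You can do this by first selecting one team's columns with filter, then stack the data and apply str.get_dummies, then groupby on level=0 (the rows of the original df) and sum. Use add_prefix on each team's result before you concat the columns, for example: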
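import pandas as pd

df_ = pd.concat([
    (df.filter(like=f'Pick_{i}').stack()   # team i's columns as one long Series
       .str.get_dummies()                  # one indicator column per distinct value
       .groupby(level=0).sum()             # back to one row per original row
       .add_prefix(f'T{i}_')
    ) for i in [1, 2]],
    axis=1)
print(df_)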
   T1_A  T1_B  T1_C  T1_D  T1_E  T1_F  T1_G  T1_M  T1_O  T2_A  T2_B  T2_Q  \
0     1     0     0     1     0     0     1     0     0     0     0     1
1     1     0     0     1     0     0     1     0     0     0     0     1
2     2     0     0     0     1     0     0     0     0     0     0     0
3     0     1     0     0     0     1     0     1     0     1     0     0
4     0     0     1     0     0     1     0     0     1     1     1     0

   T2_R  T2_S  T2_V  T2_W  T2_X
0     1     0     1     0     0
1     0     0     0     1     1
2     0     1     0     1     1
3     0     1     0     1     0
4     1     0     0     0     0
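
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The input frame and its column names (Pick_1_a, Pick_2_a, and so on) are made up for illustration and are not the asker's data, and encode_team is a hypothetical helper name that only unrolls the chained expression one step at a time.

```python
import pandas as pd

# Hypothetical input: two picks per team. Column names are assumptions,
# chosen only so that filter(like='Pick_1') / filter(like='Pick_2') match.
df = pd.DataFrame({
    "Pick_1_a": ["A", "A", "B"],
    "Pick_1_b": ["D", "A", "F"],
    "Pick_2_a": ["Q", "W", "S"],
    "Pick_2_b": ["R", "X", "W"],
})


def encode_team(frame, team):
    """Count how often each value appears in one team's pick columns, per row."""
    picks = frame.filter(like=f"Pick_{team}")   # keep only this team's columns
    stacked = picks.stack()                      # long Series indexed by (row, column)
    dummies = stacked.str.get_dummies()          # one indicator column per distinct value
    counts = dummies.groupby(level=0).sum()      # collapse back to one row per original row
    return counts.add_prefix(f"T{team}_")


df_ = pd.concat([encode_team(df, i) for i in [1, 2]], axis=1)
print(df_)
```

Because the last step is a sum, a value picked twice in the same row is counted twice (row 1 of the sample data has A in both team 1 columns, so T1_A is 2 there). That matches the 2 in the output above.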

Contributed 1836 experience points, received 4 likes
Use get_dummies together with an aggregation: max if you only need 1/0 values, or sum if you need counts:
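# Shorten each column name to its team label before encoding
renamed = df.rename(columns=lambda x: x.split('_', 2)[-1].replace('team', 'T'))
df_enc = (pd.get_dummies(renamed, dtype=int)
          # max(axis=1, level=0) was removed in pandas 2.0; group the transposed frame instead
          .T.groupby(level=0).max().T
          .sort_index(axis=1))
print(df_enc)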
   T1_A  T1_B  T1_C  T1_D  T1_E  T1_F  T1_G  T1_M  T1_O  T2_A  T2_B  T2_Q  \
0     1     0     0     1     0     0     1     0     0     0     0     1
1     1     0     0     1     0     0     1     0     0     0     0     1
2     1     0     0     0     1     0     0     0     0     0     0     0
3     0     1     0     0     0     1     0     1     0     1     0     0
4     0     0     1     0     0     1     0     0     1     1     1     0

   T2_R  T2_S  T2_V  T2_W  T2_X
0     1     0     1     0     0
1     0     0     0     1     1
2     0     1     0     1     1
3     0     1     0     1     0
4     1     0     0     0     0
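
To make the max versus sum choice concrete, here is a small sketch on made-up data. The column names (p1a, p1b, p2a, p2b) are assumptions. Instead of renaming the columns, it passes the team labels through the prefix argument of pd.get_dummies, which is a swapped-in variation and not the code from the answer. It also transposes before grouping, since DataFrame.max(level=...) is not available in pandas 2.0 and later.

```python
import pandas as pd

# Hypothetical input: p1a/p1b are team 1's picks, p2a/p2b are team 2's picks.
df = pd.DataFrame({
    "p1a": ["A", "A", "B"],
    "p1b": ["D", "A", "F"],
    "p2a": ["Q", "W", "S"],
    "p2b": ["R", "X", "W"],
})

# One prefix per input column; both columns of a team share the same prefix,
# so the encoded frame contains duplicate labels such as 'T1_A'.
dummies = pd.get_dummies(df, prefix=["T1", "T1", "T2", "T2"], dtype=int)

# Merge duplicate labels by grouping the transposed frame on its index.
flags = dummies.T.groupby(level=0).max().T    # 1 if the value appears at all
counts = dummies.T.groupby(level=0).sum().T   # how many times the value appears

print(flags)
print(counts)
```

On this sample the two results differ only in row 1, where A is picked twice for team 1: flags has T1_A equal to 1 and counts has T1_A equal to 2.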