Add a new column C to the DataFrame and assign it the ID 1 or 2, depending on which regular expression each row matches.
In [17]: df
Out[17]:
     A                                                B
0  NaN                               this has Color:Red
1  NaN                           Color: Blue,red, green
2  NaN                                    Color: Yellow
3  NaN  This has many colors. Color: green, red, Yellow
4  NaN             Filter oil Type: Synthetic Motor oil
5  NaN                Oil Type : High Mileage Motor oil
Build two boolean conditions:
In [18]: one = (df['B'].str.match('.*Color:.*') | df['B'].str.match('.*colorFUL:.*')) & df.A.isnull()
In [19]: one
Out[19]:
0 True
1 True
2 True
3 True
4 False
5 False
dtype: bool
In [20]: two = (df['B'].str.match('.*Type:.*')) & df.A.isnull()
In [21]: two
Out[21]:
0 False
1 False
2 False
3 False
4 True
5 False
dtype: bool
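To run the session above yourself, here is a minimal sketch. The DataFrame literal is reconstructed from the printed output. Using `str.contains` instead of `str.match`, and the `two_loose` mask with its `\s*` pattern, are assumptions that are not part of the original answer. They illustrate why row 5 ("Type :" with a space before the colon) is not matched by `'.*Type:.*'`.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the printed output (column A is all NaN).
df = pd.DataFrame({
    "A": [np.nan] * 6,
    "B": [
        "this has Color:Red",
        "Color: Blue,red, green",
        "Color: Yellow",
        "This has many colors. Color: green, red, Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "Oil Type : High Mileage Motor oil",
    ],
})

# str.contains searches anywhere in the string, so the leading and
# trailing '.*' used with str.match are not needed.
one = df["B"].str.contains(r"Color:|colorFUL:") & df["A"].isnull()

# Assumption: 'Type :' (whitespace before the colon) should also count.
two_loose = df["B"].str.contains(r"Type\s*:") & df["A"].isnull()

print(one.tolist())
print(two_loose.tolist())
```

Whether the looser pattern is wanted depends on the data. The original `two` condition deliberately or accidentally leaves row 5 unmatched.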
Here is one way to create the new column.
In [22]: df['C'] = 0
Use the boolean Series from the conditions to assign values to the matching rows.
In [23]: df.loc[one,'C'] = 1
In [24]: df.loc[two,'C'] = 2
In [25]: df
Out[25]:
     A                                                B  C
0  NaN                               this has Color:Red  1
1  NaN                           Color: Blue,red, green  1
2  NaN                                    Color: Yellow  1
3  NaN  This has many colors. Color: green, red, Yellow  1
4  NaN             Filter oil Type: Synthetic Motor oil  2
5  NaN                Oil Type : High Mileage Motor oil  0
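The three assignment steps (default 0, then 1, then 2) can also be collapsed into one call. The sketch below uses `numpy.select`, which is a swapped-in alternative and not what the answer itself does. One difference to keep in mind: `numpy.select` takes the first condition that is true for a row, whereas the sequential `.loc` assignments let the later one overwrite the earlier one. For this data the two conditions never overlap, so the result is the same.

```python
import numpy as np
import pandas as pd

# Same example frame as in the answer (column A is all NaN).
df = pd.DataFrame({
    "A": [np.nan] * 6,
    "B": [
        "this has Color:Red",
        "Color: Blue,red, green",
        "Color: Yellow",
        "This has many colors. Color: green, red, Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "Oil Type : High Mileage Motor oil",
    ],
})

# The two conditions exactly as written in the answer.
one = (df["B"].str.match(".*Color:.*") | df["B"].str.match(".*colorFUL:.*")) & df["A"].isnull()
two = df["B"].str.match(".*Type:.*") & df["A"].isnull()

# Pick 1 where `one` holds, 2 where `two` holds, and 0 everywhere else.
df["C"] = np.select([one, two], [1, 2], default=0)

print(df["C"].tolist())
```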
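If df is the input DataFrame and fd is the output DataFrame containing only the rows that match the pattern, how can the ID be assigned to fd directly, without a separate boolean check?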
fd = df.loc[one].copy()  # explicit copy, so the assignment below does not trigger SettingWithCopyWarning
fd['C'] = 1
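One possible way to do the selection and the assignment in a single expression is `DataFrame.assign`, which returns a new frame and leaves `df` untouched. This is a sketch under the assumption that a filtered copy with a constant `C` is what is wanted. Some mask (here `one`) is still needed to decide which rows go into `fd`.

```python
import numpy as np
import pandas as pd

# Same example frame as in the answer (column A is all NaN).
df = pd.DataFrame({
    "A": [np.nan] * 6,
    "B": [
        "this has Color:Red",
        "Color: Blue,red, green",
        "Color: Yellow",
        "This has many colors. Color: green, red, Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "Oil Type : High Mileage Motor oil",
    ],
})

# Condition `one` as written in the answer.
one = (df["B"].str.match(".*Color:.*") | df["B"].str.match(".*colorFUL:.*")) & df["A"].isnull()

# Select the matching rows and add the constant column in one step.
# assign() returns a new DataFrame, so df itself gets no column C.
fd = df.loc[one].assign(C=1)

print(fd)
```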