The DataFrame looks like this:

col_a
Python PY is a general purpose PY language
Programming PY language in Python PY
Its easier to understand PY
The syntax of the language is clean PY

I tried to do this with the code below, but I cannot get the expected output. Any help is appreciated. Here is the regex-based code I used:

    df['col_a'].str.extract(r"([a-zA-Z'-]+\s+PY)\b")

Expected output:

col_a                                      col_b_PY
Python PY is a general purpose language    Python PY purpose PY
Programming PY language in Python PY       Python PY Programming PY
Its easier to understand PY                understand PY
The syntax of the language is clean PY     clean PY
2 Answers

揚帆大魚
Contributed 1799 experience points · received more than 9 likes
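import re

def app(row):
    # Join every "<word> PY" match found in this row's col_a.
    return ' '.join(re.findall(r'\w+\s+PY', row.col_a))

# Apply the function row by row to build the new column.
df['col_b_PY'] = df.apply(app, axis=1)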
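You need to join all of the matches for each row inside the applied function. You could also do this with extractall, but I find this approach simpler and more direct.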
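For comparison, here is a minimal sketch of the extractall route mentioned above. It is not part of the original answer. The sample DataFrame is rebuilt from the question, and the intermediate names `matches` and `joined` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "col_a": [
        "Python PY is a general purpose PY language",
        "Programming PY language in Python PY",
        "Its easier to understand PY",
        "The syntax of the language is clean PY",
    ]
})

# str.extractall returns one row per match, indexed by
# (original row index, match number); the single capture group is column 0.
matches = df["col_a"].str.extractall(r"(\w+\s+PY)\b")

# Group by the original row index (level 0) and join that row's matches.
joined = matches[0].groupby(level=0).agg(" ".join)

# Rows without any match are missing from `joined`, so reindex and fill with "".
df["col_b_PY"] = joined.reindex(df.index, fill_value="")
```

Matches are joined in the order they occur in the text, so the second row yields "Programming PY Python PY". str.extract, as used in the question, keeps only the first match per row, which is why it cannot produce the multi-match column on its own.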