
How do I remove exact word matches from a DataFrame in Python?


Python

開心每一天1111 2023-03-30 16:20:27

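Suppose the following DataFrame has a column named game:

df:
   game
0  juegos blue
1  juego red
2  juegos yellow

I want to remove the words in this stop-word list:

stopWords = ['juego','juegos']

The expected result is:

df:
   game
0  blue
1  red
2  yellow

I tried:

df['game'] = df['game'].str.replace("|".join(stopWords ), " ")

This works, but it also removes "juego" from inside the entries containing "juegos", leaving only an "s":

df:
   game
0  s blue
1  red
2  s yellow

Is there a way to remove a word only when it matches exactly?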


2 Answers

www說

Contributed 1775 experience points, received more than 8 likes

You can do this with pandas DataFrame.replace():

In [1]: import pandas as pd
   ...: df = pd.DataFrame({'game': ['juegos blue', 'juego red', 'juegos yellow']})
   ...: stop_words = [r'juego\b', r'juegos\b']
   ...: df.replace(to_replace={'game': '|'.join(stop_words)}, value='', regex=True, inplace=True)
   ...: df
Out[1]:
     game
0    blue
1     red
2  yellow

In [2]: df = pd.DataFrame({'game': ['juegos blue', 'juego red', 'juegos yellow']})
   ...: stop_words = [r'juego\b']
   ...: df.replace(to_replace={'game': '|'.join(stop_words)}, value='', regex=True, inplace=True)
   ...: df
Out[2]:
            game
0    juegos blue
1            red
2  juegos yellow

This assumes each stop word is written to end at a word boundary, \b.
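If the stop words arrive as a plain list, as in the question, the word boundaries can be added programmatically instead of by hand. The following is a minimal sketch, assuming pandas is installed. The helper name remove_stop_words is hypothetical, and the whitespace cleanup at the end goes beyond what the answer above does.

```python
import re

import pandas as pd


def remove_stop_words(series, stop_words):
    """Remove whole-word occurrences of stop_words from a string Series.

    Hypothetical helper for illustration; not part of pandas.
    """
    # re.escape guards against stop words that contain regex metacharacters.
    # \b on both sides restricts each match to a whole word.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in stop_words) + r")\b"
    cleaned = series.str.replace(pattern, "", regex=True)
    # Collapse the whitespace left behind by the removed words.
    return cleaned.str.replace(r"\s+", " ", regex=True).str.strip()


df = pd.DataFrame({"game": ["juegos blue", "juego red", "juegos yellow"]})
stopWords = ["juego", "juegos"]
df["game"] = remove_stop_words(df["game"], stopWords)
print(df["game"].tolist())  # ['blue', 'red', 'yellow']
```

With a boundary on both sides, the order of the words in the alternation no longer matters: "juego" cannot match inside "juegos", so the regex engine falls through to the longer alternative.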

Answered 2023-03-30

明月笑刀無情

Contributed 1828 experience points, received more than 4 likes

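Plain Python string replacement will not work here, but the regular expression module will. You need to add some markers to the pattern so that the regex looks for whole words. For example, you might know a word is complete because it is followed by a period ., a comma ,, any kind of whitespace \s, or the end of the line $. \b is the regex pattern for a word boundary.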

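import re

s1 = df['game']
for sw in stopWords:
    # Apply re.sub to every string in the column; the trailing \b
    # keeps 'juego' from matching inside 'juegos'.
    s1 = s1.apply(lambda s: re.sub(r'{0}\b'.format(sw), '', s))
df['game'] = s1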
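To see what the word boundary changes, the patterns can be tried on plain strings first. This is a small sketch using only the standard re module; the sample strings, including "videojuego rojo", are made up for illustration.

```python
import re

text = "juegos blue"

# Without a boundary, "juego" matches inside "juegos" and leaves the "s".
no_boundary = re.sub(r"juego", "", text)

# With a trailing \b, "juego" no longer matches inside "juegos".
with_boundary = re.sub(r"juego\b", "", text)

# A trailing \b alone still matches at the end of a longer word;
# adding a leading \b as well restricts the match to the whole word.
trailing_only = re.sub(r"juego\b", "", "videojuego rojo")
both_sides = re.sub(r"\bjuego\b", "", "videojuego rojo")

print(repr(no_boundary))    # 's blue'
print(repr(with_boundary))  # 'juegos blue'
print(repr(trailing_only))  # 'video rojo'
print(repr(both_sides))     # 'videojuego rojo'
```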

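I am keeping the old code in case it is useful. This version also removes the space, comma, or period directly after the matched word, which is not what you asked for but may come in handy.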

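import re

s1 = df['game']
for sw in stopWords:
    # ([.,\s]|$) matches the word plus one trailing period, comma,
    # or whitespace character, or the word at the end of the string.
    s1 = s1.apply(lambda s: re.sub(r'{0}([.,\s]|$)'.format(sw), '', s))
df['game'] = s1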
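The difference between the two patterns shows up when a stop word is followed by punctuation or a space. A short sketch on made-up sample strings, again using only re:

```python
import re

samples = ["juego, red", "juego red", "juegos blue"]

# \b removes the word only; the comma or space after it stays.
word_only = [re.sub(r"juego\b", "", s) for s in samples]

# ([.,\s]|$) also consumes the single character that follows the word.
with_trailer = [re.sub(r"juego([.,\s]|$)", "", s) for s in samples]

print(word_only)     # [', red', ' red', 'juegos blue']
print(with_trailer)  # [' red', 'red', 'juegos blue']
```

Neither pattern touches "juegos", because the character after "juego" is an "s", which is neither a boundary nor one of the listed trailing characters.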

Answered 2023-03-30
