
Python regex: match a word only when it is preceded by a comma and a space, or when it is the first word of the string

Python

慕妹3242003 2023-09-12 20:09:30

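Given a string like the following:

'Rob and Amber Mariano, Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin'

I want to identify couples or other family members who could be referred to as "John and Jane Doe", and exclude cases such as "Jim Green and Nancy Brown". I only want to identify the following:

Rob and Amber Mariano, Jane and John Smith, Kiwan and Nichols Brady John, Todd and Sana Clegg

The groups in the regular expression below seem to capture most of the cases I want, but I'm having trouble excluding "Jim Green". The condition I want to impose is that the first word is a name, and that it either sits at the start of the string or is preceded only by a comma and a space. For some reason my expression doesn't work. I expected ([^|,\s']?) to capture this, but it doesn't seem to.

([^|\,\s]?)([A-Z][a-zA-Z]+)(\s*and\s*)([A-Z][a-zA-Z]+)(\s[A-Z][a-zA-Z]+)(\s[A-Z][a-zA-Z]+)?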
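A note on why the optional first group does not behave as expected: inside square brackets, | is a literal character and a leading ^ negates the class, so [^|\,\s]? means "optionally one character that is not |, a comma, or whitespace" rather than "start of string, or a comma and space". Because that group is optional and nothing anchors the match, the search is free to begin at "Green". A minimal sketch that reproduces this (the variable names and the shortened sample string are made up for the illustration):

```python
import re

# The pattern from the question, unchanged.
question_pattern = (
    r"([^|\,\s]?)([A-Z][a-zA-Z]+)(\s*and\s*)([A-Z][a-zA-Z]+)"
    r"(\s[A-Z][a-zA-Z]+)(\s[A-Z][a-zA-Z]+)?"
)

sample = "Melanie Carbone, Jim Green and Nancy Brown"
m = re.search(question_pattern, sample)

# Nothing ties the match to the start of a comma-separated item,
# so the search simply begins at "Green" with an empty first group.
leaked = m.group(0)
print(repr(m.group(1)), leaked)

# Inside [...] the characters ^ and | lose their anchor/alternation meaning:
# this class accepts a literal "x" but rejects a literal "|".
class_matches_x = re.fullmatch(r"[^|,\s]", "x") is not None
class_matches_pipe = re.fullmatch(r"[^|,\s]", "|") is not None
print(class_matches_x, class_matches_pipe)
```

An alternation outside the brackets, such as (?:^|,\s), is the usual way to express "start of string or after a comma".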


3 Answers

慕尼黑5688855

Contributed 1848 experience points, received more than 2 likes

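Let's break the answer into two simple steps.

Split the whole string into a list of candidate couple names.
Keep every piece that matches the requested pattern.

We are interested in couple names that follow this pattern:

<Name1> and <Name2> <Last-name> <May-or-may-not-be-words-separated-by-spaces>.

However, we only care about the <Name1> and <Name2> <Last-name> part of each matched string. Now that we have defined what we want to do, here is the code for it.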

import re

testStr = """Rob and Amber Mariano, Heather Robinson,
Jane and John Smith, Kiwan and Nichols Brady John,
Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown,
Todd and Sana Clegg with Tatiana Perkin
"""

# Pattern definition for the match (raw string, so \w and \s reach the regex engine intact)
regExpr = re.compile(r"^(\w+\sand\s\w+\s\w+)(\s\w+)*")

# Split on commas and strip the whitespace and newlines left around each piece
coupleList = [s.strip() for s in testStr.split(',')]

# match() returns a match object for pieces that fit the pattern and None for the rest
matchedList = [regExpr.match(s) for s in coupleList]

# Group 1 holds the "<Name1> and <Name2> <Last-name>" part of every matched piece
result = [s.group(1) for s in matchedList if s is not None]

print(result)
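If the trailing word should be kept as well (the question lists "Kiwan and Nichols Brady John" among the desired results), the same split-then-match approach could be adapted roughly as follows. The capitalized-word sub-pattern and the limit of two words after the second first name are assumptions borrowed from the question's own regex, not part of this answer.

```python
import re

text = ("Rob and Amber Mariano, Heather Robinson, Jane and John Smith, "
        "Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, "
        "Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin")

# Assumption: a "name" is one capital letter followed by more letters.
name = r"[A-Z][a-zA-Z]+"
# Name1 "and" Name2, then one or two further capitalized words.
couple = re.compile(rf"({name}\sand\s{name}(?:\s{name}){{1,2}})")

pieces = [p.strip() for p in text.split(",")]
# match() only succeeds at the start of each piece, so "Jim Green and ..." is skipped.
couples = [m.group(1) for m in map(couple.match, pieces) if m]
print(couples)
```

Lowercase words such as "with" end the match, which is why "Todd and Sana Clegg" stops before "with Tatiana Perkin".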

2023-09-12

慕婉清6462132

Contributed 1804 experience points, received more than 2 likes

A bit late, but this may be the simplest regex:

import re

regex = r"(?:, |^)(\w+\sand\s\w+\s\w+)"

test_str = "Rob and Amber Mariano, Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady, John, Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin"

matches = re.finditer(regex, test_str, re.MULTILINE)

# Print every capturing group of every match (this pattern only has group 1)
for matchNum, match in enumerate(matches, start=1):
    for groupNum in range(1, len(match.groups()) + 1):
        print(match.group(groupNum))

Output:

Rob and Amber Mariano

Jane and John Smith

Kiwan and Nichols Brady

Todd and Sana Clegg
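Because the prefix (?:, |^) is non-capturing and the pattern has a single capturing group, re.findall already returns the names as a flat list, so the nested loop is optional. A minimal sketch of that variant; dropping re.MULTILINE is my own simplification, on the assumption that the input is a single line:

```python
import re

regex = r"(?:, |^)(\w+\sand\s\w+\s\w+)"

test_str = ("Rob and Amber Mariano, Heather Robinson, Jane and John Smith, "
            "Kiwan and Nichols Brady, John, Jimmy Nichols, Melanie Carbone, "
            "Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin")

# With exactly one capturing group, findall returns that group for each match.
names = re.findall(regex, test_str)
print(names)
```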

2023-09-12

皈依舞

Contributed 1851 experience points, received more than 3 likes

Try this; it works as expected:

(,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(\s[A-Z][a-z]+)+)

Test script:

import re

# Raw string so that \s is passed to the regex engine rather than treated as a string escape
a = re.findall(r"(,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(\s[A-Z][a-z]+)+)", "Rob and Amber Mariano, Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin")

print(a)

Output:

[('', 'Rob and Amber Mariano', ' Mariano'), (', ', 'Jane and John Smith', ' Smith'), (', ', 'Kiwan and Nichols Brady John', ' John'), (', ', 'Todd and Sana Clegg', ' Clegg')]
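The tuples appear because re.findall returns every capturing group when a pattern contains more than one. If only the names are wanted, the prefix and the repeated last-name group can be made non-capturing. This is a small variation on the pattern above, not something the answer itself proposes:

```python
import re

# Same pattern, but only the full name is captured; (?:...) groups are not returned.
pattern = r"(?:,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)+)"

text = ("Rob and Amber Mariano, Heather Robinson, Jane and John Smith, "
        "Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, "
        "Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin")

names = re.findall(pattern, text)
print(names)
```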

2023-09-12
