
Regex pipe separator with groups

Python

炎炎設計 2023-07-05 11:10:45

My URL contains a second, unencoded URL. It looks like this:

https://myhost.mydomain.com/pnLVyL7HjrxMlxjBQkhcOMr2WUs=/400x400/https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png

My domain can be mydomain.com or mydomain.io. In addition, the /400x400/ part may vary, looking like /blahblah/XxY/blahblah, or it may be missing entirely. The image can be jpg, jpeg, or png.

I want to extract the second URL, the one at the end:

https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png

I have this regex:

https://myhost.mydomain.com/[a-zA-Z0-9=]*/.+[\/a-zA-Z0-9]?(/https://[a-zA-Z0-9=-]*.mydomain.(com|io)/images/[a-zA-Z0-9-]*.(png|jpg|jpeg))

It identifies this as 4 groups. However, I want to extract the second URL as a single group, that is, the whole of https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png

Can you help me fix my regex? Thanks!


3 Answers

慕少森


Try using:

import re

s = "https://myhost.mydomain.com/pnLVyL7HjrxMlxjBQkhcOMr2WUs=/400x400/https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png"

# The greedy .+ consumes as much as possible, so group 1 starts at the last "https".
m = re.search(r"https://.+(https.+)$", s)
if m:
    print(m.group(1))

Output:

https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png
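As a rough sketch of how this pattern behaves on the variants mentioned in the question (a missing size segment, a .io domain), the snippet below wraps it in a hypothetical helper named last_url. The shortened sample URLs are made up for illustration. Note that the pattern does not check the domain or the file extension; it only relies on the position of the last "https".

```python
import re

def last_url(s):
    """Return the trailing embedded URL, or None if there is no second URL."""
    m = re.search(r"https://.+(https.+)$", s)
    return m.group(1) if m else None

# Hypothetical inputs: with a size segment, without one, and with no embedded URL.
samples = [
    "https://myhost.mydomain.com/abc=/400x400/https://myhost.mydomain.com/images/a.png",
    "https://myhost.mydomain.io/abc=/https://myhost.mydomain.io/images/b.jpeg",
    "https://myhost.mydomain.com/images/c.jpg",
]

for s in samples:
    print(last_url(s))
```

The third input contains only one URL, so the helper returns None instead of a match.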

2023-07-05

隔江千里


I would suggest this approach:

https?(?!.*https?):\/\/.*\bmydomain\.(?:com|io).*

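This regex uses a negative lookahead to ensure that the URL we match is the last one in the input string. Sample script:

import re

inp = "https://myhost.mydomain.com/pnLVyL7HjrxMlxjBQkhcOMr2WUs=/400x400/https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png"

# (?!.*https?) rejects any "http(s)" that is followed by another one later in the string.
url = re.findall(r'https?(?!.*https?):\/\/.*\bmydomain\.(?:com|io).*', inp)[0]
print(url)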

This prints:

https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png
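One thing to keep in mind is that indexing the result of re.findall with [0] raises IndexError when nothing matches. A minimal sketch of a more defensive variant, assuming a hypothetical helper named extract_last and the same pattern, might look like this:

```python
import re

# Same pattern as above, compiled once; the lookahead keeps only the last URL.
LAST_URL = re.compile(r"https?(?!.*https?)://.*\bmydomain\.(?:com|io).*")

def extract_last(inp):
    """Return the last mydomain URL in inp, or None when there is none."""
    m = LAST_URL.search(inp)
    return m.group(0) if m else None

nested = (
    "https://myhost.mydomain.com/pnLVyL7HjrxMlxjBQkhcOMr2WUs=/400x400/"
    "https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png"
)
print(extract_last(nested))
print(extract_last("no url here"))
```

With a single, non-nested URL the helper returns that URL unchanged, since the lookahead finds no later "http" to reject it.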

2023-07-05

海綿寶寶撒


Since there are 2 links, you can match the first link and capture the second one in group 1.

https?://myhost\.mydomain\.(?:com|io)/\S*?(https?://myhost\.mydomain\.(?:com|io)/\S*\.(?:jpe?g|png))

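https?://myhost\.mydomain\.(?:com|io)/ matches the start of the first link
\S*? matches 0+ non-whitespace characters, non-greedy
( opens capture group 1

    https?://myhost\.mydomain\.(?:com|io)/ matches the start of the second link
    \S* matches 0+ non-whitespace characters
    \.(?:jpe?g|png) matches .jpg, .jpeg, or .png

) closes group 1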


For example:

import re

regex = r"https?://myhost\.mydomain\.(?:com|io)/\S*?(https?://myhost\.mydomain\.(?:com|io)/\S*\.(?:jpe?g|png))"
test_str = "https://myhost.mydomain.com/pnLVyL7HjrxMlxjBQkhcOMr2WUs=/400x400/https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png"

# Group 1 holds the second (embedded) URL.
matches = re.search(regex, test_str)
if matches:
    print(matches.group(1))

Output:

https://myhost.mydomain.com/images/98f9a734-52e2-4616-adf7-bf0165bbf738.png
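The breakdown given earlier in this answer can also be written into the pattern itself with re.VERBOSE, which ignores whitespace and allows inline comments. The following is only a sketch: the helper name embedded_url and the shortened test URLs are assumptions made for illustration, chosen to cover the .io domain and a missing size segment.

```python
import re

PATTERN = re.compile(r"""
    https?://myhost\.mydomain\.(?:com|io)/      # start of the first (outer) link
    \S*?                                        # anything in between, as little as possible
    (                                           # group 1: the embedded link
        https?://myhost\.mydomain\.(?:com|io)/  # start of the second link
        \S*\.(?:jpe?g|png)                      # path ending in .jpg, .jpeg, or .png
    )
""", re.VERBOSE)

def embedded_url(s):
    """Return the embedded image URL, or None if s does not contain one."""
    m = PATTERN.search(s)
    return m.group(1) if m else None

# Hypothetical inputs for illustration.
print(embedded_url("https://myhost.mydomain.io/sig=/fit/300x200/smart/https://myhost.mydomain.io/images/pic.jpeg"))
print(embedded_url("https://myhost.mydomain.com/sig=/https://myhost.mydomain.com/images/pic.jpg"))
print(embedded_url("https://myhost.mydomain.com/images/pic.png"))
```

Because the alternations use (?:...) rather than (...), the domain suffix and the file extension do not create extra groups, so the whole second URL comes back as group 1 alone.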

2023-07-05
