
Scrape text after blockquote with bs4


Python

小怪獸愛吃肉 2023-06-06 14:46:11

I have something like this in my HTML:

    <p align="left">
    <tt> some text:</tt><tt> (8/4)</tt><a href="some link"><tt>some other text</tt></a><tt>, (9/4)</tt><a href="some other link"><tt> some text:</tt><tt>, (19/6)</tt><tt>text after comment</tt></p></blockquote></blockquote><tt>, </tt><a href="link i want"><tt>text i want</tt></a><tt> ...

My Python code:

    import requests
    from bs4 import BeautifulSoup

    # site holds the URL of the page being scraped
    page = requests.get(site)
    soup = BeautifulSoup(page.content, 'html.parser')
    rounds = soup.find('p', align="left")
    matches_links = rounds.find_all('a')

I get all the links for the comments and text, but nothing after </blockquote></blockquote>. These two blockquote tags are not visible in the page source; I only see them in soup when I debug my Python code. soup holds all of the HTML, but rounds ends at <tt>text after comment</tt>. Is there any way to get "link i want" and "text i want"?
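The behaviour described in the question can be reproduced with a minimal sketch. It assumes the paragraph is closed by a `</p>` just before the stray `</blockquote>` tags, which is a reconstruction and not something visible in the quoted snippet; the names `hrefs_in_rounds` and `hrefs_in_soup` are illustrative only.

```python
from bs4 import BeautifulSoup

# Assumed markup: the <p align="left"> wrapper and its closing </p> are a
# reconstruction of the page, not text taken verbatim from the question.
html = (
    '<p align="left">'
    '<tt>some text:</tt><tt> (8/4)</tt>'
    '<a href="some link"><tt>some other text</tt></a><tt>, (9/4)</tt>'
    '<a href="some other link"><tt>some text:</tt><tt>, (19/6)</tt>'
    '<tt>text after comment</tt></p></blockquote></blockquote>'
    '<tt>, </tt><a href="link i want"><tt>text i want</tt></a>'
)

soup = BeautifulSoup(html, 'html.parser')
rounds = soup.find('p', align='left')

# Compare the links inside the paragraph with the links in the whole document.
hrefs_in_rounds = [a['href'] for a in rounds.find_all('a')]
hrefs_in_soup = [a['href'] for a in soup.find_all('a')]

print(hrefs_in_rounds)
print(hrefs_in_soup)
```

Under that assumption, the wanted link shows up only in the document-wide search, because it sits after the paragraph and not inside it.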


1 Answer

開滿天機

Contributed 1786 experience points, received over 13 likes

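If you look at the HTML, you will see there is a closing </p> before </blockquote></blockquote>. That means your variable rounds does not contain the link you want. Search for the next <a> after this tag:

    from bs4 import BeautifulSoup

    # Sample markup from the question. The <p align="left"> element is closed
    # before the stray </blockquote></blockquote> tags.
    txt = '''
    <p align="left">
    <tt>
    some text:</tt><tt> (8/4)</tt><a href="some link"><tt>some other text</tt></a><tt>, (9/4)</tt><a href="some other link"><tt>
    some text:</tt><tt>, (19/6)</tt><tt>text after comment</tt></p></blockquote></blockquote><tt>, </tt><a href="link i want"><tt>text i want</tt></a><tt>
    ...
    '''

    soup = BeautifulSoup(txt, 'html.parser')

    # "~" is the CSS general sibling combinator: select the first <a>
    # that follows the paragraph at the same nesting level.
    matched_link = soup.select_one('p[align="left"] ~ a')

    print(matched_link)

Prints:

    <a href="link i want"><tt>text i want</tt></a>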
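Since the question asks for both the link and its text, one possible follow-up is to start from `rounds` and read the two values off the next sibling `<a>`. This is a sketch that swaps the CSS selector for `find_next_sibling`, not the answer's own method; the shortened markup and the names `wanted`, `link`, and `text` are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Shortened, assumed markup: a paragraph followed by stray closing tags
# and then the link that lives outside the paragraph.
html = (
    '<p align="left"><a href="some link"><tt>some other text</tt></a>'
    '<tt>text after comment</tt></p></blockquote></blockquote>'
    '<tt>, </tt><a href="link i want"><tt>text i want</tt></a>'
)

soup = BeautifulSoup(html, 'html.parser')
rounds = soup.find('p', align='left')

# First <a> that follows the paragraph at the same nesting level.
wanted = rounds.find_next_sibling('a')

# Guard against None in case the page has no such link.
link = wanted.get('href') if wanted is not None else None
text = wanted.get_text(strip=True) if wanted is not None else None

print(link)
print(text)
```

If the wanted link were nested deeper on the real page, and not a direct sibling of the paragraph, `rounds.find_next('a')` would be the looser alternative, since it searches everything that comes later in the document.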

Answered 2023-06-06
