
Unexpected result when trying to fetch metadata with BeautifulSoup

Python

慕無忌1623718 2021-10-12 17:49:17

OK, here is what I'm trying to do. I'm still new to Python and only just getting the hang of it. With this small tool I'm trying to extract data from a page. In this case, I want the user to enter a URL and have it return

<meta content=" % Likes, % Comments - @% on Instagram: “post description []”" name="description" />

but with each % replaced by the post's number of likes, comments, and so on. Here is my full code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
import re

url = "https://www.instagram.com/p/BsOGulcndj-/"
page2 = requests.get(url)
soup2 = BeautifulSoup(page2.content, 'html.parser')
result = soup2.findAll('content', attrs={'content': 'description'})
print (result)

But whenever I run it, I get []. What am I doing wrong?

2 answers

ITMISS

The correct way to match these tags is:

result = soup2.findAll('meta', content=True, attrs={"name": "description"})
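To see why the query in the question returns [] while the one above matches, here is a minimal sketch that runs both against a small inline document. SAMPLE_HTML is a made-up stand-in, not real Instagram markup, and the snippet is small and uses explicit self-closing tags, so Python's built-in parser is assumed to be enough for this illustration. find_all is the current spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; not real Instagram markup.
SAMPLE_HTML = """
<html><head>
<meta charset="utf-8"/>
<meta content="12 Likes, 3 Comments - @example_user on Instagram: sample post" name="description"/>
<title>Sample</title>
</head><body></body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The first argument is a tag name, so this looks for <content> tags,
# and the document contains none.
original = soup.find_all("content", attrs={"content": "description"})

# This looks for <meta> tags with name="description" that also carry
# a content attribute.
fixed = soup.find_all("meta", content=True, attrs={"name": "description"})

print(original)             # empty result, as in the question
print(fixed[0]["content"])  # the text of the content attribute
```

Indexing the matched tag with ["content"] gives just the attribute text, which is usually what you want rather than the whole tag.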

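However, html.parser does not parse the <meta> tags correctly. It does not realise that they are self-closing, so the result includes most of the rest of <head>. I switched to the html5lib parser (a separate package, installed with pip install html5lib):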

soup2 = BeautifulSoup(page2.content, 'html5lib')

The result of the search above is then:

[<meta content="46.3m Likes, 2.6m Comments - EGG GANG ?? (@world_record_egg) on Instagram: “Let’s set a world record together and get the most liked post on Instagram. Beating the current…”" name="description"/>]
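Since the goal in the question is the like and comment counts rather than the whole tag, one possible next step is to pull them out of the content string with a regular expression. This is only a sketch: the pattern and the parse_description helper are my own, and the layout they assume is inferred from the single sample above, so it may not hold for other posts.

```python
import re

# Assumed layout, inferred from the one sample above:
# "<likes> Likes, <comments> Comments - <display name> (@<user>) on Instagram: ..."
DESCRIPTION_RE = re.compile(
    r"(?P<likes>\S+) Likes, (?P<comments>\S+) Comments - "
    r".*?@(?P<user>[\w.]+)\)? on Instagram"
)


def parse_description(content):
    """Return likes, comments and user as strings, or None if the layout differs."""
    match = DESCRIPTION_RE.search(content)
    return match.groupdict() if match else None


sample = (
    "46.3m Likes, 2.6m Comments - EGG GANG ?? (@world_record_egg) on Instagram: "
    "“Let’s set a world record together and get the most liked post on Instagram. "
    "Beating the current…”"
)

print(parse_description(sample))
# {'likes': '46.3m', 'comments': '2.6m', 'user': 'world_record_egg'}
```

The counts stay as strings such as "46.3m" because that is how they appear in the page; converting them to numbers would need a further rule for the suffix.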

Answered 2021-10-12
