

How do I extract JSON from a script tag using Beautiful Soup?


慕田峪7331174 2023-12-25 15:55:10
I want to extract reviewCount from a script tag using Beautiful Soup. I have tried several approaches without success.
<script type="application/json" data-initial-state="review-filter">{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}</script>

3 Answers

jeck貓

Contributed 1909 experience points, received more than 7 likes

This should work, although I am quite sure there is a more elegant way:


import json

from bs4 import BeautifulSoup

html = '''
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
'''

soup = BeautifulSoup(html, 'html.parser')
# Take the first <script> tag in the document.
res = soup.find('script')
# Its only child is the JSON text; parse it into a dict.
json_object = json.loads(res.contents[0])

for language in json_object['languages']:
    print('{}: {}'.format(language['displayName'], language['reviewCount']))

Output:


Toutes les langues: 573
français: 567
English: 6
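
A possibly tidier variant of the code above, offered only as a sketch: instead of taking the first `<script>` on the page, match the tag by its `type` and `data-initial-state` attributes and read its body through `.string`. The helper name `extract_initial_state` and the extra unrelated `<script>` in the sample are made up for this illustration, and the sketch assumes at most one matching tag per page.

```python
import json

from bs4 import BeautifulSoup

html = '''
<script>console.log("unrelated script");</script>
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
'''


def extract_initial_state(markup, state_name):
    """Hypothetical helper: parsed JSON of the matching <script>, or None if absent."""
    soup = BeautifulSoup(markup, 'html.parser')
    # Match on attributes so other <script> tags on the page are ignored.
    tag = soup.find('script', attrs={'type': 'application/json',
                                     'data-initial-state': state_name})
    if tag is None or tag.string is None:
        return None
    return json.loads(tag.string)


state = extract_initial_state(html, 'review-filter')
counts = [language['reviewCount'] for language in state['languages']]
print(counts)
```

Returning `None` when nothing matches lets the caller decide how to handle a page whose markup has changed, rather than failing with an `AttributeError`.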


Answered 2023-12-25
慕無忌1623718

Contributed 1744 experience points, received more than 4 likes

Import json, load the JSON data, then iterate over it to get every reviewCount.


import json

from bs4 import BeautifulSoup  # import was missing in the original snippet

html = '''<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>'''

soup = BeautifulSoup(html, "html.parser")
# Select the script tag by its data attribute and read its text content.
item = soup.select_one('script[data-initial-state="review-filter"]').text
jsondata = json.loads(item)

# Use a separate loop variable so the JSON text in `item` is not shadowed.
for language in jsondata['languages']:
    print(language['reviewCount'])

Output:


573
567
6
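
Note that the reviewCount values are JSON strings, not numbers. If they are needed for arithmetic, one option is to convert them while iterating. The sketch below uses only the `json` module and starts from the JSON text copied from the question; the name `counts_by_language` is illustrative, and it assumes every reviewCount is a plain digit string as in the sample.

```python
import json

# JSON text as it would come out of the <script> tag (copied from the question).
raw = '{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}'

data = json.loads(raw)

# Map each isoCode to its review count, converted from str to int.
counts_by_language = {
    entry['isoCode']: int(entry['reviewCount'])
    for entry in data['languages']
}
print(counts_by_language)

# Sum the per-language entries, skipping the "all" entry.
per_language_total = sum(
    count for code, count in counts_by_language.items() if code != 'all'
)
print(per_language_total)
```

In this sample the per-language sum equals the "all" entry, which suggests "all" is an aggregate, though the source does not say so.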


Answered 2023-12-25
慕妹3242003

Contributed 1824 experience points, received more than 6 likes

import re

html = '''<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>'''

# Capture the quoted value that follows each reviewCount key in the raw HTML.
match = [item.group(1) for item in re.finditer(r'reviewCount":"(.+?)"', html)]

print(match)

Output:


['573', '567', '6']
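
The pattern above depends on the exact spacing of the serialized JSON (`reviewCount":"` with nothing in between). As a sketch of a slightly more tolerant variant, the pattern below allows whitespace around the colon and captures digits only. The `spaced` sample string is made up for illustration, and the approach still assumes the counts are plain digit strings.

```python
import re

html = '''<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>'''

# Optional whitespace around the colon; capture one or more digits.
pattern = re.compile(r'"reviewCount"\s*:\s*"(\d+)"')

compact_matches = pattern.findall(html)
print(compact_matches)

# Hypothetical pretty-printed input with spaces around the colon.
spaced = '{"reviewCount" : "42"}'
spaced_matches = pattern.findall(spaced)
print(spaced_matches)
```

For anything beyond a quick one-off, parsing the tag content with `json.loads` as in the other answers is likely the safer route, since a regular expression does not understand JSON escaping or nesting.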


Answered 2023-12-25