
Reading an XML file from a URL in Python

Python

米琪卡哇伊 2023-02-22 15:50:17

I want to read the integers inside the count tags. This is the code I wrote:

    import xml.etree.ElementTree as ET
    import urllib.request, urllib.parse, urllib.error
    from bs4 import BeautifulSoup
    import ssl

    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    url = 'http://py4e-data.dr-chuck.net/comments_42.xml'
    content1 = urllib.request.urlopen(url, context = ctx).read()
    soup = BeautifulSoup(content1, 'html.parser')
    tree = ET.fromstring(soup)
    tags = tree.findall('count')
    print(tags)

It throws an error:

    Traceback (most recent call last):
      File "C:\Users\Name\Desktop\Py4e\Me\Assi_15_01.py", line 15, in <module>
        tree = ET.fromstring(soup)
      File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 1320, in XML
        parser.feed(text)
    TypeError: a bytes-like object is required, not 'BeautifulSoup'

What can I do? More information: http://py4e-data.dr-chuck.net/comments_42.xml


2 Answers

SMILET

Contributed 1796 experience points, received 4+ upvotes

There is no need to use xml.etree. Just use BeautifulSoup to select all the <count> tags:

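    import requests
    from bs4 import BeautifulSoup

    url = 'http://py4e-data.dr-chuck.net/comments_42.xml'

    # Download the document and parse it with the built-in html.parser
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')

    # Print the integer inside every <count> tag
    for c in soup.select('count'):
        print(int(c.text))

This prints each count as an integer, one per line.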

2023-02-22

白衣非少年

Contributed 1155 experience points, received 0+ upvotes

I don't think you need ElementTree. Just switch BeautifulSoup to the lxml parser (change 'html.parser' to 'lxml') and call the find_all method on the soup instead of on the tree (i.e. soup.find_all('count')).
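A minimal sketch of this suggestion, not a definitive implementation. It runs on a small inline sample instead of the real URL, so `SAMPLE_XML` and its layout are assumptions, and `read_counts` is a hypothetical helper name. The BeautifulSoup method is spelled `find_all` (with an underscore). The fallback to 'html.parser' when lxml is not installed is an addition for convenience and is not part of the answer.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample standing in for the downloaded document; the real
# file's layout is assumed to nest <count> tags in a similar way.
SAMPLE_XML = b"""<commentinfo>
  <comments>
    <comment><name>A</name><count>3</count></comment>
    <comment><name>B</name><count>5</count></comment>
  </comments>
</commentinfo>"""


def read_counts(markup):
    """Return the integer inside every <count> tag found in markup."""
    try:
        # The parser named in the answer; requires the lxml package.
        soup = BeautifulSoup(markup, 'lxml')
    except FeatureNotFound:
        # Assumption: fall back to the built-in parser if lxml is missing.
        soup = BeautifulSoup(markup, 'html.parser')
    # find_all searches the whole document, so nesting depth does not matter.
    return [int(tag.text) for tag in soup.find_all('count')]


print(read_counts(SAMPLE_XML))  # [3, 5]
```

To use it on the real document, pass the downloaded bytes (for example `requests.get(url).content`) to `read_counts` in place of the sample.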

2023-02-22
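For reference, the ElementTree approach from the question can also be made to work without BeautifulSoup. The sketch below rests on two observations: `ET.fromstring` expects bytes or a string (the traceback shows it was handed a BeautifulSoup object), and `findall('count')` only matches direct children of the root, so nested tags need the path './/count'. `SAMPLE_XML` is a hypothetical stand-in whose layout is an assumption, and `fetch` and `read_counts` are illustrative names.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the downloaded bytes; the real
# file's layout is assumed to nest <count> elements below the root.
SAMPLE_XML = b"""<commentinfo>
  <comments>
    <comment><name>A</name><count>3</count></comment>
    <comment><name>B</name><count>5</count></comment>
  </comments>
</commentinfo>"""


def fetch(url):
    """Download url and return the raw response bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def read_counts(xml_bytes):
    """Return every <count> value in the document as an int."""
    # Pass the raw bytes straight to ElementTree; no BeautifulSoup step.
    root = ET.fromstring(xml_bytes)
    # './/count' searches all descendants of the root element.
    return [int(element.text) for element in root.findall('.//count')]


print(read_counts(SAMPLE_XML))  # [3, 5]
```

With network access, `read_counts(fetch(url))` would apply the same logic to the URL from the question.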
