
How to use html.parser


Python

慕碼人8056858 2023-01-04 16:51:46

Hi everyone, I am new to Python and trying to use Python's html.parser module. I want to scrape this website and use html.parser to get the URL, deal name and price, which sit inside li tags: https://www.mcdelivery.com.pk/pk/browse/menu.html

After getting the URLs, I want to append them to the base URL and fetch the deals with their prices from that site.

    import urllib.request
    import urllib.parse
    import re
    from html.parser import HTMLParser

    url = 'https://www.mcdelivery.com.pk/pk/browse/menu.html'
    values = {'daypartId': '1', 'catId': '1'}
    data = urllib.parse.urlencode(values)
    data = data.encode('utf-8')  # data should be bytes
    req = urllib.request.Request(url, data)
    resp = urllib.request.urlopen(req)
    respData = resp.read()
    list1 = re.findall(r'<div class="product-cost"(.*?)</div>', str(respData))
    for eachp in list1:
        print(eachp)

I was using a regular expression to get the class, but I failed. Now I am trying to figure out how to do it with html.parser. I know the job gets easier with beautifulsoup and scrapy, but I am trying to do this with bare Python, so please skip third-party libraries. I really need help. I am stuck.

html.parser code (updated):

    from html.parser import HTMLParser
    import urllib.request
    import html.parser

    # Import HTML from a URL
    url = urllib.request.urlopen("https://www.mcdelivery.com.pk/pk/browse/menu.html")
    html = url.read().decode()
    url.close()

    class MyParser(html.parser.HTMLParser):
        def __init__(self, html):
            self.matches = []
            self.match_count = 0
            super().__init__()

        def handle_data(self, data):
            self.matches.append(data)
            self.match_count += 1

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "div":
                if attrs.get("product-cost"):
                    self.handle_data()
            else:
                return

    parser = MyParser(html)
    parser.feed(html)
    for item in parser.matches:
        print(item)


1 Answer

侃侃爾雅


Here is a good start that will probably need some site-specific adjustments:

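    import html.parser

    class MyParser(html.parser.HTMLParser):
        def __init__(self):
            super().__init__()
            self.matches = []        # text found inside matching <div> elements
            self.match_count = 0
            self.in_cost = False     # True while inside <div class="product-cost">

        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs; "product-cost" is a value
            # of the class attribute, not an attribute name
            classes = (dict(attrs).get("class") or "").split()
            if tag == "div" and "product-cost" in classes:
                self.in_cost = True

        def handle_endtag(self, tag):
            # Simplification: any closing </div> ends the match (no nested divs)
            if tag == "div":
                self.in_cost = False

        def handle_data(self, data):
            # The parser calls this for every text node; keep only the wanted ones
            if self.in_cost and data.strip():
                self.matches.append(data.strip())
                self.match_count += 1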
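It can help to see the order in which HTMLParser invokes its callbacks: the text of an element does not arrive with the start tag, it is delivered separately to handle_data. A minimal sketch, assuming a made-up fragment (EventTracer and SAMPLE are illustrative names, and the real page's markup may differ):

```python
from html.parser import HTMLParser


class EventTracer(HTMLParser):
    """Records every parser callback in the order it fires."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text nodes
            self.events.append(("data", data.strip()))


# Hypothetical fragment, invented for illustration only.
SAMPLE = '<li><div class="product-cost">Rs. 450</div><span>Deal</span></li>'

tracer = EventTracer()
tracer.feed(SAMPLE)
tracer.close()
for event in tracer.events:
    print(event)
```

The price text shows up as its own "data" event between the div's "start" and "end" events, which is why a parser subclass normally remembers which element it is currently inside.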

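Usage of MyParser is along the lines of:

    # the_request_method is a placeholder for whatever fetches the page as a str,
    # e.g. urllib.request.urlopen(url).read().decode() as in the question
    request_html = the_request_method(url, ...)
    parser = MyParser()
    parser.feed(request_html)
    for item in parser.matches:
        print(item)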
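To get the URL, deal name and price per li that the question asks for, something like the following might work. It is only a sketch under assumptions: the h5 tag, the product-title class, the sample markup, and the deal names, links and prices are all invented for illustration; only the product-cost class and the base URL come from the question. Relative links are joined to the base URL with urllib.parse.urljoin from the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.mcdelivery.com.pk/pk/browse/menu.html"


class DealParser(HTMLParser):
    """Collects one dict (url, name, price) per <li> that contains a link."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.deals = []
        self.current = None  # dict for the <li> being parsed, if any
        self.field = None    # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "li":
            self.current = {"url": None, "name": "", "price": ""}
        elif self.current is None:
            return  # ignore everything outside an <li>
        elif tag == "a" and attrs.get("href") and self.current["url"] is None:
            # Resolve relative links against the page URL
            self.current["url"] = urljoin(self.base_url, attrs["href"])
        elif "product-title" in classes:  # hypothetical class name
            self.field = "name"
        elif "product-cost" in classes:   # class name from the question
            self.field = "price"

    def handle_data(self, data):
        if self.current is not None and self.field:
            self.current[self.field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "li" and self.current is not None:
            if self.current["url"]:
                self.deals.append(self.current)
            self.current = None
        self.field = None  # simplification: any closing tag ends the field


# Invented sample markup; the real page will need its own tag and class checks.
SAMPLE = """
<ul>
  <li><a href="/pk/menu/deal-1.html">
        <h5 class="product-title">Sample Deal A</h5></a>
      <div class="product-cost">Rs. 100</div></li>
  <li><a href="deal-2.html">
        <h5 class="product-title">Sample Deal B</h5></a>
      <div class="product-cost">Rs. 200</div></li>
</ul>
"""

parser = DealParser(BASE_URL)
parser.feed(SAMPLE)
parser.close()
for deal in parser.deals:
    print(deal["url"], deal["name"], deal["price"])
```

For the live site, SAMPLE would be replaced by the decoded response body, and the tag and class checks adjusted to whatever the page actually uses.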

2023-01-04
