Scrapy request gets some responses, but not all

Python

達令說 2023-12-08 16:44:37

I am scraping a page that has 36 hrefs under the same div of the XPath, but when I try to fetch them, even in the Scrapy shell, I always get only the same 12 hrefs, which is not what I expect. This is the expression I am using:

response.xpath('/html/body/div[1]/div[2]/section/div/div[3]/div[2]/div/div[2]//div//article//div[1]//a[re:test(@href,"pd")]//@href').getall()

It comes from the following page: https://www.lowes.com/pl/Bottom-freezer-refrigerators-Refrigerators-Appliances/4294789499?offset=36

1 Answer

吃雞游戲

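It looks like part of the HTML is loaded dynamically, so Scrapy does not see it. The data itself is present in the HTML, inside a JSON structure. You can try to get it like this: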

import json

# get the script with the data
json_data = response.xpath('//script[contains(text(), "__PRELOADED_STATE__")]/text()').extract_first()

# load the data in a python dictionary
dict_data = json.loads(json_data.split('window.__PRELOADED_STATE__ =')[-1])
items = dict_data['itemList']
print(len(items))  # prints 36 in my case

# go through the dictionary and get the product_urls
for item in items:
    product_url = item['product']['pdURL']
    ...
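
A small variation on the same idea, in case json.loads complains about extra text after the JSON (for example a trailing semicolon or further statements in the same script tag). This is only a sketch under assumptions: SAMPLE_HTML is a made-up, heavily trimmed stand-in for the page source, the product paths in it are invented, and extract_preloaded_state is a hypothetical helper name, not part of Scrapy. It uses only the standard library, so it can be tried outside a spider.

```python
import json

# Hypothetical, heavily trimmed stand-in for the page source. The real page's
# JSON is much larger, and its exact layout is an assumption here.
SAMPLE_HTML = """
<html><body>
<script>window.__PRELOADED_STATE__ = {"itemList": [
  {"product": {"pdURL": "/pd/example-fridge-a/1000000001"}},
  {"product": {"pdURL": "/pd/example-fridge-b/1000000002"}}
]};
window.someOtherVar = 1;</script>
</body></html>
"""

MARKER = "window.__PRELOADED_STATE__"


def extract_preloaded_state(html: str) -> dict:
    """Return the object assigned to window.__PRELOADED_STATE__ in html."""
    marker_pos = html.find(MARKER)
    if marker_pos == -1:
        raise ValueError("marker not found in page source")
    # Jump to the first "{" after the marker, i.e. the start of the JSON object.
    start = html.find("{", marker_pos)
    if start == -1:
        raise ValueError("no JSON object found after marker")
    # raw_decode parses exactly one JSON value starting at `start` and
    # ignores whatever comes after it (semicolon, other statements, tags).
    state, _end = json.JSONDecoder().raw_decode(html, start)
    return state


state = extract_preloaded_state(SAMPLE_HTML)
product_urls = [item["product"]["pdURL"] for item in state["itemList"]]
print(len(product_urls))
```

Inside a spider callback or the Scrapy shell, the same helper could presumably be called as extract_preloaded_state(response.text). The pdURL values appear to be site-relative paths, so response.urljoin(product_url) would turn them into absolute URLs before following them.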

Answered 2023-12-08
