2 Answers

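This user has contributed 1799 experience points and received more than 6 likes
You are passing the wrong function as the callback. Your self.parse function only works on the login page, so the request for the next page should use self.start_scraping instead: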
if next_page is not None:
    yield response.follow(next_page, callback=self.start_scraping)

This user has contributed 1869 experience points and received more than 4 likes
This is from your execution log:
  File "C:\Users\Robert\Documents\Demos\vstoolbox\scrapytest\scrapytest\spiders\quotes_spider.py", line 15, in parse
    yield scrapy.FormRequest(url=self.login_url,formdata={
  File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 31, in __init__
    querystr = _urlencode(items, self.encoding)
  File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 71, in _urlencode
    values = [(to_bytes(k, enc), to_bytes(v, enc))
  File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 71, in <listcomp>
    values = [(to_bytes(k, enc), to_bytes(v, enc))
  File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\utils\python.py", line 104, in to_bytes
    raise TypeError('to_bytes must receive a str or bytes '
TypeError: to_bytes must receive a str or bytes object, got NoneType
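In short, it tells you that one of the values in the formdata argument is None, where "a str or bytes object" is expected. Your formdata has three fields and only one of them is a variable, so token must be coming back as None: extract_first() returns None when the selector matches nothing.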
...
token = response.css('input[name="csrf_token"]::attr(value)').extract_first()
yield scrapy.FormRequest(url=self.login_url, formdata={
    'csrf_token': token,
    'username': 'roberthng',
    'password': 'dsadsadsa'
}, callback=self.start_scraping)
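One way to confirm this diagnosis is to check the form fields for None before building the request. The helper below is a hypothetical sketch in plain Python (none_fields is not part of Scrapy, and the token is set to None by hand to mimic a page without the CSRF input). It only reports which keys would trigger the TypeError shown in the log.

```python
def none_fields(formdata):
    """Return the names of form fields whose value is None.

    Hypothetical helper, not part of Scrapy: a None value is what
    makes to_bytes() raise the TypeError seen in the traceback.
    """
    return [key for key, value in formdata.items() if value is None]


# Mimics a page without the CSRF input: the selector matched nothing,
# so token is None (set by hand here for illustration).
token = None
formdata = {
    'csrf_token': token,
    'username': 'roberthng',
    'password': 'dsadsadsa',
}

missing = none_fields(formdata)
if missing:
    print('Not a login page? Empty form fields:', missing)
```

In the spider, a check like this could log response.url and return early instead of yielding the FormRequest, which would show which page parse was actually called on.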
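On the login page itself, though, your selector returns the value correctly. My assumption is that when you define the request for the next page, you set its callback to your parse method (or do not set it at all, in which case parse is used by default). I say assumption because you did not post that part of the code. Your code sample stops here: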
#Go to Next Page:
next_page = response.css('li.next a::attr(href)').get()
if next_page is not None:
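So make sure the request you yield after this point has its callback set correctly, that is, to the method that scrapes the page rather than to parse.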
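The effect of the callback choice can be sketched without Scrapy at all. The simulation below is only an illustration under stated assumptions: PAGES is a made-up stand-in for the site (dicts instead of HTML), crawl and use_correct_callback are hypothetical names, and the queue loop merely mimics how each scheduled request is paired with a callback.

```python
from collections import deque

# Hypothetical stand-in for the site: each "page" is a dict, not real HTML.
PAGES = {
    '/login': {'csrf_token': 'abc123'},
    '/': {'quotes': ['q1', 'q2'], 'next': '/page/2/'},
    '/page/2/': {'quotes': ['q3'], 'next': None},
}


def crawl(use_correct_callback):
    """Run a toy crawl and return the scraped quotes."""
    items = []
    queue = deque()

    def parse(page):
        # Login callback: it needs the CSRF token, which only '/login' has.
        token = page.get('csrf_token')
        if token is None:
            raise TypeError('csrf_token is None: parse ran on a non-login page')
        queue.append(('/', start_scraping))

    def start_scraping(page):
        # Scraping callback: collect quotes, then schedule the next page.
        items.extend(page['quotes'])
        next_page = page.get('next')
        if next_page is not None:
            callback = start_scraping if use_correct_callback else parse
            queue.append((next_page, callback))

    queue.append(('/login', parse))
    while queue:
        url, callback = queue.popleft()
        callback(PAGES[url])
    return items


print(crawl(use_correct_callback=True))
try:
    crawl(use_correct_callback=False)
except TypeError as exc:
    print('Fails like the traceback:', exc)
```

With the scraping callback on the next-page request, the crawl finishes normally. With parse as the callback, the second page has no token and the run fails in the same way as the log above.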