Trying to scrape a page, but something is missing

Python

慕雪6442864 2023-09-26 14:31:21

index_cd = 'KPI200'
page_n = 1
naver_index = 'http://finance.naver.com/sise/sise_index_day.nhn?code' + index_cd + '&page=' + str(page_n)

from urllib.request import urlopen
source = urlopen(naver_index).read()

import bs4
source = bs4.BeautifulSoup(source, 'lxml')

td = source.find_all('td')
len(td)

# /html/body/div/table[1]/tbody/tr[3]/td[1] # this is XPath
source.find_all('table')[0].find_all('tr')[2].find_all('td')[0]

I expected the output to be:

<td class="date">2020.09.29</td>

but what I actually get is:

<td class="date"> </td>

There is a '\xa0' (a non-breaking space) between <td class="date"> and </td>. I need to extract the date. How can I fix this?

1 Answer

繁華開滿天機

Contributed 1816 experience points, received more than 4 likes

The problem is in the URL you provided: you left out the = after code.

Change

naver_index = 'http://finance.naver.com/sise/sise_index_day.nhn?code' + index_cd + '&page=' + str(page_n)

to

naver_index = 'http://finance.naver.com/sise/sise_index_day.nhn?code=' + index_cd + '&page=' + str(page_n)
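If you want to make this kind of typo harder to repeat, one option is to let the standard library assemble the query string. This is only a sketch: the helper name build_index_url is made up for illustration, and it assumes the page takes just the code and page parameters shown in the question.

```python
from urllib.parse import urlencode


def build_index_url(index_cd, page_n):
    """Build the daily index URL (hypothetical helper, not part of any library)."""
    base = 'http://finance.naver.com/sise/sise_index_day.nhn'
    # urlencode inserts the '=' and '&' separators itself,
    # so a missing '=' after 'code' cannot happen here.
    query = urlencode({'code': index_cd, 'page': page_n})
    return base + '?' + query


naver_index = build_index_url('KPI200', 1)
print(naver_index)
# http://finance.naver.com/sise/sise_index_day.nhn?code=KPI200&page=1
```

The resulting string can be passed to urlopen exactly like the hand-built one.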

Here is the working code:

from urllib.request import urlopen

import bs4

index_cd = 'KPI200'
page_n = 1
naver_index = 'http://finance.naver.com/sise/sise_index_day.nhn?code=' + index_cd + '&page=' + str(page_n)

source = urlopen(naver_index).read()
source = bs4.BeautifulSoup(source, 'lxml')

td = source.find_all('td')
len(td)

# /html/body/div/table[1]/tbody/tr[3]/td[1] # this is XPath
print(source.find_all('table')[0].find_all('tr')[2].find_all('td')[0])

Output:

<td class="date">2020.09.29</td>

If you only want to display the date, change the last line to:

print(source.find_all('table')[0].find_all('tr')[2].find_all('td')[0].text)

Output:

2020.09.29
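As a guard against the original symptom (a cell that holds only '\xa0'), you could filter the cell texts before using them. The sketch below is an assumption-laden illustration: extract_dates is a hypothetical helper, the YYYY.MM.DD pattern is inferred from the single date shown above, and the input is a plain list of strings such as [td.text for td in source.find_all('td', class_='date')].

```python
import re

# Pattern assumed from the sample output '2020.09.29'
DATE_RE = re.compile(r'\d{4}\.\d{2}\.\d{2}')


def extract_dates(cell_texts):
    """Return only the cell texts that look like YYYY.MM.DD dates."""
    dates = []
    for text in cell_texts:
        # str.strip() treats the non-breaking space '\xa0' as whitespace,
        # so placeholder cells become empty strings here.
        cleaned = text.strip()
        if DATE_RE.fullmatch(cleaned):
            dates.append(cleaned)
    return dates


sample = ['2020.09.29', '\xa0', ' 2020.09.28 ']
print(extract_dates(sample))
# ['2020.09.29', '2020.09.28']
```

An empty result is then a quick hint that the request went wrong, for example because of a malformed URL.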

Hope this helps!

Answered 2023-09-26
