Scraping a specific table from a web page with Python 3 (the page has multiple tables)

Python

HUWWW 2023-04-25 15:22:38

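I am trying to extract data from one specific table on a web page. The page contains several tables, so I am trying to use the table ID to pull out only the one I need.

URL: https://basketball.realgm.com/player/Luke-Nelson/Summary/50483

My code so far is below.

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pandas as pd
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# URL input
url = 'https://basketball.realgm.com/player/Luke-Nelson/Summary/50483'
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

table = soup.find('table', id='table-1696')
print(table)

I assumed the print statement would print the table's HTML (this worked before on a page with a single table), but when I run the program I get the following output:

[terminal output]

Ultimately my goal is to recreate the table in Python and export it to Excel, but I can't get past this first hurdle!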

3 Answers

冉冉說

Contributed 1877 experience points, received more than 1 upvote

Use pandas to read the table tags, and use the id attribute to select the one you want:

import pandas as pd

url = 'https://basketball.realgm.com/player/Luke-Nelson/Summary/50483'

df = pd.read_html(url, attrs={'id':'table-1696'})[0]
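To see how the `attrs` filter behaves without hitting the live site, here is a minimal sketch on a made-up two-table page. The HTML, the ids `table-1` and `table-2`, and the column names are assumptions for illustration only; `pd.read_html` also needs an HTML parser such as lxml installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-table page standing in for the real one
html = """
<table id="table-1">
  <tr><th>Season</th><th>PTS</th></tr>
  <tr><td>2017</td><td>10</td></tr>
</table>
<table id="table-2">
  <tr><th>Season</th><th>AST</th></tr>
  <tr><td>2017</td><td>4</td></tr>
</table>
"""

# attrs keeps only the <table> elements whose attributes match
tables = pd.read_html(StringIO(html), attrs={"id": "table-2"})
df = tables[0]
print(df)
```

If no table carries the requested id, `read_html` raises a `ValueError`, which is worth catching if the ids on the real page turn out to change between requests.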

Answered 2023-04-25

尚方寶劍之說

Contributed 1788 experience points, received more than 4 upvotes

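You can use pandas:

import pandas as pd

url = 'https://basketball.realgm.com/player/Luke-Nelson/Summary/50483'
tables = pd.read_html(url)  # list of DataFrames, one per table on the page
print(len(tables))  # 29 when this answer was written

You can then pick the table you want from the list.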
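Picking by position (for example the 5th table) can break if the page layout shifts, so one option is to pick by column names instead. This is only a sketch: `pick_table` is a hypothetical helper, and the two small DataFrames stand in for the list that `pd.read_html(url)` would return.

```python
import pandas as pd


def pick_table(tables, required_columns):
    """Return the first DataFrame whose columns include every required name.

    Hypothetical helper for illustration, not part of pandas.
    """
    wanted = set(required_columns)
    for df in tables:
        if wanted.issubset(set(map(str, df.columns))):
            return df
    raise LookupError(f"no table with columns {sorted(wanted)}")


# Stand-in for the list returned by pd.read_html(url)
tables = [
    pd.DataFrame({"Season": ["2017"], "Team": ["X"]}),
    pd.DataFrame({"Season": ["2017"], "GP": [5], "PTS": [42]}),
]

stats = pick_table(tables, ["GP", "PTS"])
print(stats)
```

Once you have the right DataFrame, `stats.to_excel('stats.xlsx', index=False)` should cover the Excel export mentioned in the question, provided an Excel engine such as openpyxl is installed.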

Answered 2023-04-25

一只名叫tom的貓

Contributed 1906 experience points, received more than 3 upvotes

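The table IDs are assigned dynamically, so I suggest locating your table another way. Say you want the "NBA Summer League Stats - Totals" table; try:

import re

table_heading = 'NBA Summer League Stats - Totals'

# soup is the BeautifulSoup object built in the question's code.
# Find the heading text, step up to the tag that contains it,
# then take the next sibling tag, which is the table.
table = (soup.find(string=re.compile(table_heading))
         .find_parent()
         .find_next_sibling())
print(table)

You can change table_heading to any of the other table headings on the page. Let me know if that helps.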
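For anyone who wants to try the heading-based lookup offline, here is a small self-contained sketch. The HTML snippet, the ids `table-111` and `table-222`, and the numbers in it are invented; it also assumes each heading sits immediately before its table as a sibling, which may not match the real page exactly.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical page fragment: each heading is followed by its table
html = """
<div>
  <h2>NBA Regular Season Stats - Totals</h2>
  <table id="table-111"><tr><th>GP</th><th>PTS</th></tr><tr><td>1</td><td>2</td></tr></table>
  <h2>NBA Summer League Stats - Totals</h2>
  <table id="table-222"><tr><th>GP</th><th>PTS</th></tr><tr><td>5</td><td>42</td></tr></table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
table_heading = "NBA Summer League Stats - Totals"

# Locate the heading text, climb to its tag, then move to the next <table>
heading = soup.find(string=re.compile(re.escape(table_heading)))
table = heading.find_parent().find_next_sibling("table")

# Flatten the table into a list of rows (header row first)
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table.find_all("tr")
]
print(rows)
```

Passing `"table"` to `find_next_sibling` skips any non-table tag that might sit between the heading and the table. From there, `pandas.DataFrame(rows[1:], columns=rows[0])` gives a DataFrame you can export.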

Answered 2023-04-25
