
I am working on a large web scraping project in which every page has a slightly different HTML structure. I want to scrape the product description from each page, and I am using the BeautifulSoup package. For example, the product description I am trying to scrape can be stored in any of these HTML structures:

    <div class="product-description">
      <p> "Title" </p>
      <p> "Some content" </p>
      <p> "Product description" </p>
    </div>

    <div class="product-description">
      <p> "Title" </p>
      <p> "Product description" </p>
    </div>

    <div class="product-description">
      <p> "Title" </p>
      <p> "Some content" </p>
      <p> "Some content" </p>
      <p> "Product description" </p>
    </div>

    <div class="product-description">
      <p> "Title" </p>
      <p> "Some-content" </p>
      <p> "Some-content" </p>
      <p> "Some-content" </p>
      <p> "Product description" </p>
    </div>

I wrote a for loop that pulls the data out of the product-description div depending on the page structure. My sample code snippet:

    requests = (grequests.get(url) for url in urls)
    responses = grequests.imap(requests, grequests.Pool(1000))
    for response in responses:
        html_soup = BeautifulSoup(response.text, 'html.parser')
        if html_soup.find('div',class_='product_description').next_element.next_sibling.next_sibling.next_sibling.next_sibling:
            product_description = html_soup.find('div',class_='product_description').next_element.next_sibling.next_sibling.next_sibling.next_sibling.text
        elif html_soup.find('div', class_='product-description').next_element.next_sibling.next_sibling.next_sibling:
            product_description = html_soup.find(
                'div', class_='product_description').next_element.next_sibling.next_sibling.next_sibling.text
        elif html_soup.find('div', class_='product-description').next_element.next_sibling.next_sibling:
            product_description = html_soup.find(
                'div', class_='product_description').next_element.next_sibling.next_sibling.text
        else:
            product_description = html_soup.find(
                'div', class_='product_description').next_element.next_sibling.text

I expected each if condition to check whether a sibling exists at the current HTML level and, if not, to fall through to the next condition. However, after about 3000 iterations I get an AttributeError with the message "'NoneType' object has no attribute 'next_sibling'".

I know there must be a simpler way to handle this dynamic page structure. Any help would be greatly appreciated. Thanks in advance!
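One possible direction, sketched here under assumptions rather than as a definitive fix: in all four sample structures the product description is the last <p> inside the div, so the chain of .next_sibling calls may not be needed at all. A chained lookup raises AttributeError as soon as one link returns None, and that happens while the if condition itself is being evaluated, so the later branches are never reached. The helper name extract_product_description is hypothetical, and the sketch assumes the class is spelled product-description (with a hyphen, as in the sample HTML) and that the description is always the final paragraph.

```python
from bs4 import BeautifulSoup


def extract_product_description(html):
    """Return the text of the last <p> in div.product-description, or None.

    Assumption: the description is always the final <p> in the container,
    as in the sample structures from the question.
    """
    soup = BeautifulSoup(html, "html.parser")

    # find() returns None when no matching div exists, so check before using it.
    container = soup.find("div", class_="product-description")
    if container is None:
        return None

    # find_all() returns a (possibly empty) list, so no sibling chaining is needed.
    paragraphs = container.find_all("p")
    if not paragraphs:
        return None

    # get_text(strip=True) drops the surrounding whitespace.
    return paragraphs[-1].get_text(strip=True)
```

Inside the original loop this would be called as extract_product_description(response.text), with a None result treated as "no description found on this page" instead of an exception.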


Web scraping dynamic HTML page structures
