I am trying to scrape a web page and store the results in a CSV/Excel file. I am using Beautiful Soup.

I am trying to extract data from the soup with the find_all function, but I am not sure how to capture the data held in the field name or title.

The HTML has the following format:

    <h3 class="font20">
      <span itemprop="position">36.</span>
      <a class="font20 c_name_head weight700 detail_page" href="/companies/view/1033/nimblechapps-pvt-ltd" target="_blank" title="Nimblechapps Pvt. Ltd.">
        <span itemprop="name">Nimblechapps Pvt. Ltd. </span></a>
    </h3>

This is my code so far. I am not sure how to proceed from here:

    from bs4 import BeautifulSoup as BS
    import requests

    page = 'https://www.goodfirms.co/directory/platform/app-development/iphone?page=2'
    res = requests.get(page)
    cont = BS(res.content, "html.parser")
    names = cont.find_all(class_='font20 c_name_head weight700 detail_page')
    names = cont.find_all('a', attrs={'class': 'font20 c_name_head weight700 detail_page'})

I also tried the following:

    Input:  cont.h3.a.span
    Output: <span itemprop="name">Nimblechapps Pvt. Ltd.</span>

I want to extract just the company name: "Nimblechapps Pvt. Ltd."
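One possible way to get from the matched tags to the bare company name is to call get_text() on the inner span instead of printing the tag itself. The following is only a minimal sketch: it parses the HTML snippet quoted in the question rather than the live page (whose markup may differ), and the helper names extract_companies and to_csv are hypothetical, introduced here for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# The HTML snippet from the question, used here instead of a live request.
SAMPLE_HTML = """
<h3 class="font20">
  <span itemprop="position">36.</span>
  <a class="font20 c_name_head weight700 detail_page"
     href="/companies/view/1033/nimblechapps-pvt-ltd"
     target="_blank" title="Nimblechapps Pvt. Ltd.">
    <span itemprop="name">Nimblechapps Pvt. Ltd. </span></a>
</h3>
"""


def extract_companies(html):
    """Return a list of (name, href) tuples, one per company link."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # class_ matches a tag if any one of its classes equals the value,
    # so a single distinctive class is enough here.
    for link in soup.find_all("a", class_="c_name_head"):
        name_span = link.find("span", itemprop="name")
        if name_span is None:
            continue  # skip links without a name span
        # get_text(strip=True) returns the text only, without the tag
        # and without surrounding whitespace.
        rows.append((name_span.get_text(strip=True), link.get("href", "")))
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "href"])
    writer.writerows(rows)
    return buffer.getvalue()


companies = extract_companies(SAMPLE_HTML)
print(companies)
print(to_csv(companies))
```

With the real page, the same function could be given res.content from the requests call in the question. To write a file rather than a string, the csv.writer can wrap a file opened with open("companies.csv", "w", newline="").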