I am trying to scrape prayer times from the website www.hujjat.org. This is the HTML for the section I am interested in (you may notice that all 4 prayers share the same class attributes):

<table width="100%">
  <tbody>
    <tr>
      <td class="NamaazTimes">
        <div class="NamaazTimeName">Fajr</div>
        <div class="NamaazTime">04:42</div>
      </td>
      <td class="NamaazTimes">
        <div class="NamaazTimeName">Sunrise</div>
        <div class="NamaazTime">06:32</div>
      </td>
      <td class="NamaazTimes">
        <div class="NamaazTimeName">Zohr</div>
        <div class="NamaazTime">13:02</div>
      </td>
      <td class="NamaazTimes">
        <div class="NamaazTimeName">Maghrib</div>
        <div class="NamaazTime">19:33</div>
      </td>
    </tr>
  </tbody>
</table>

So far I have written the following code:

# import libraries
import json
import urllib2
from bs4 import BeautifulSoup

# specify the url
quote_page = 'http://www.hujjat.org/'

# query the website and return the html to the variable 'page'
page = urllib2.urlopen(quote_page)

# parse the html using Beautiful Soup and store in variable 'soup'
soup = BeautifulSoup(page, 'html.parser')

table = soup.find("div", class_="NamaazTimeName", text="Fajr").find_previous("table")
for row in table.find_all("tr"):
    a = row.find_all("td")
    # print(row.find_all("td"))
print (a)

My result is:

[<td class="NamaazTimes">\n<div class="NamaazTimeName">Fajr</div>\n<div class="NamaazTime">04:42</div>\n</td>, <td class="NamaazTimes">\n<div class="NamaazTimeName">Sunrise</div>\n<div class="NamaazTime">06:32</div>\n</td>, <td class="NamaazTimes">\n<div class="NamaazTimeName">Zohr</div>\n<div class="NamaazTime">13:02</div>\n</td>, <td class="NamaazTimes">\n<div class="NamaazTimeName">Maghrib</div>\n<div class="NamaazTime">19:33</div>\n</td>]

All I want from my code is the time of each prayer. For example, for the "Fajr" prayer the output should be "04:42". I then want to save this "04:42" to a text file. Can anyone help me?
3 Answers

慕工程0101907
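from bs4 import BeautifulSoup
import pandas as pd

# html must hold the page source, e.g. the <table> snippet from the question
data = BeautifulSoup(html, 'html.parser')
NamaazName = data.find_all('div', {'class': 'NamaazTimeName'})
NamaazTime = data.find_all('div', {'class': 'NamaazTime'})

# Pair each prayer name with its time; both lists are in document order
coll = {}
for i in range(len(NamaazName)):
    coll[NamaazName[i].text] = NamaazTime[i].text

# Build a two-column table from the collected names and times
master_data = pd.DataFrame()
master_data['NamaazName'] = list(coll.keys())
master_data['NamaazTime'] = list(coll.values())
print(master_data)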
Output

  NamaazName NamaazTime
0       Fajr      04:42
1    Sunrise      06:32
2       Zohr      13:02
3    Maghrib      19:33
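
The question also asks for a single time (for example "04:42" for Fajr) saved to a text file, which the snippet above does not cover. Here is a minimal sketch of one way to do that, without pandas. The function names `get_prayer_times` and `save_prayer_time`, the `HTML` constant, and the file path are assumptions made for illustration. The sample markup is copied from the question, so the live page may differ.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; in practice this would be the
# page source fetched from the site (assumption: same structure).
HTML = """
<table width="100%"><tbody><tr>
<td class="NamaazTimes"><div class="NamaazTimeName">Fajr</div><div class="NamaazTime">04:42</div></td>
<td class="NamaazTimes"><div class="NamaazTimeName">Sunrise</div><div class="NamaazTime">06:32</div></td>
<td class="NamaazTimes"><div class="NamaazTimeName">Zohr</div><div class="NamaazTime">13:02</div></td>
<td class="NamaazTimes"><div class="NamaazTimeName">Maghrib</div><div class="NamaazTime">19:33</div></td>
</tr></tbody></table>
"""


def get_prayer_times(html):
    """Return a dict mapping each prayer name to its time string."""
    soup = BeautifulSoup(html, "html.parser")
    times = {}
    # Each <td class="NamaazTimes"> holds one name div and one time div.
    for cell in soup.find_all("td", class_="NamaazTimes"):
        name = cell.find("div", class_="NamaazTimeName")
        time = cell.find("div", class_="NamaazTime")
        if name is not None and time is not None:
            times[name.get_text(strip=True)] = time.get_text(strip=True)
    return times


def save_prayer_time(html, prayer, path):
    """Write the time of one prayer to a text file and return it."""
    value = get_prayer_times(html)[prayer]  # raises KeyError if the prayer is missing
    with open(path, "w", encoding="utf-8") as f:
        f.write(value + "\n")
    return value
```

For example, `save_prayer_time(HTML, "Fajr", "fajr.txt")` would write the Fajr time to a file named `fajr.txt` (a hypothetical file name). Reading the name and time from the same `td` cell keeps each pair together even if one of the cells on the page is incomplete.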