I am using code to web-scrape customer reviews. Everything works the way I want it to, but I cannot get the right class or attribute for the ratings, so the code always returns blank results for the Ratings column. Can someone help me find the correct attribute and fix the Ratings line of code?

from bs4 import BeautifulSoup
import requests
import pandas as pd
import json
print ('all imported successfuly')

# Initialize an empty dataframe
df = pd.DataFrame()
for x in range(1, 37):
    names = []
    headers = []
    bodies = []
    ratings = []
    published = []
    updated = []
    reported = []

    link = (f'https://www.trustpilot.com/review/fabfitfun.com?page={x}')
    print (link)
    req = requests.get(link)
    content = req.content
    soup = BeautifulSoup(content, "lxml")
    articles = soup.find_all('article', {'class':'review'})
    for article in articles:
        names.append(article.find('div', attrs={'class': 'consumer-information__name'}).text.strip())
        headers.append(article.find('h2', attrs={'class':'review-content__title'}).text.strip())
        try:
            bodies.append(article.find('p', attrs={'class':'review-content__text'}).text.strip())
        except:
            bodies.append('')

        try:
            #ratings.append(article.find('div', attrs={'class':'star-rating star-rating--medium'}).text.strip())
            ratings.append(article.find('div', attrs={'class': 'star-rating star-rating--medium'})['alt'])
        except:
            ratings.append('')
        dateElements = article.find('div', attrs={'class':'review-content-header__dates'}).text.strip()

        jsonData = json.loads(dateElements)
        published.append(jsonData['publishedDate'])
        updated.append(jsonData['updatedDate'])
        reported.append(jsonData['reportedDate'])
1 Answer

交互式愛情
Just change this one line in your code:
ratings.append(article.find_all("img", alt=True)[0]["alt"])
df.Rating then outputs:
0 1 star: Bad
1 5 stars: Excellent
2 5 stars: Excellent
3 5 stars: Excellent
4 5 stars: Excellent
5 5 stars: Excellent
6 5 stars: Excellent
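It seems easier to find the img tag inside each article and read its alt text. The rating text is held in the img tag's alt attribute, so the original lookup on the div fails, and the bare except turns that failure into an empty string.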
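If a numeric rating is more convenient than the raw alt text, the strings shown in the output above could be split into a star count and a label. This is only a minimal sketch: the helper name parse_rating and the pattern are hypothetical, and they assume the alt text always looks like "5 stars: Excellent", which is based solely on the sample output here.

```python
import re

# Assumed alt-text format, inferred from the sample output: "<n> star(s): <label>"
RATING_PATTERN = re.compile(r"^\s*(\d+)\s+stars?\s*:\s*(.+?)\s*$")


def parse_rating(alt_text):
    """Return (stars, label) for text like '5 stars: Excellent'.

    Returns (None, '') when the text is missing or does not match the
    assumed format, mirroring the empty-string fallback in the question.
    """
    match = RATING_PATTERN.match(alt_text or "")
    if match is None:
        return None, ""
    return int(match.group(1)), match.group(2)


# Hypothetical sample values copied from the output in the answer
sample = ["1 star: Bad", "5 stars: Excellent", ""]
parsed = [parse_rating(text) for text in sample]
print(parsed)
```

In the scraping loop, the same helper could be applied to the value returned by article.find_all("img", alt=True)[0]["alt"] before appending it to ratings.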