I can't get Pandas to export some web-scraped data in the format I want. I want to visit each of the URLs in URLs, get various elements from that page, and put them into an Excel spreadsheet with the specified column names. I then want to visit the next URL in URLs and put its data on the next row of the Excel sheet, so that I end up with an Excel sheet with 6 columns and three rows of data, one for each plant (each plant is at a separate URL).

Currently I get an error that says

    ValueError: Length mismatch: Expected axis has 18 elements, new values have 6 elements

because the new records are being placed horizontally alongside each other rather than on a new row in Excel, and Pandas is not expecting that. Can anyone help? Thanks.

    import csv
    import pandas as pd
    from pandas import ExcelWriter
    from pandas import ExcelFile
    import numpy as np
    from urllib2 import urlopen
    import bs4
    from bs4 import BeautifulSoup

    URLs = ["http://adbioresources.org/map/ajax-single/27881",
            "http://adbioresources.org/map/ajax-single/27967",
            "http://adbioresources.org/map/ajax-single/27880"]

    mylist = []
    for plant in URLs:
        soup = BeautifulSoup(urlopen(plant), 'lxml')
        table = soup.find_all('td')
        for td in table:
            mylist.append(td.text)
        heading2 = soup.find_all('h2')
        for h2 in heading2:
            mylist.append(h2.text)
        para = soup.find_all('p')
        for p in para:
            mylist.append(p.text)

    df = pd.DataFrame(mylist)
    transposed_df = df.T
    transposed_df.columns = ['Status', 'Type', 'Capacity', 'Feedstock', 'Address1', 'Address2']

    writer = ExcelWriter('Pandas-Example.xlsx')
    transposed_df.to_excel(writer, 'Sheet1', index=False)
    writer.save()
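For reference, the 18 in the error message is consistent with all three pages' values ending up in one flat list: pd.DataFrame(mylist) then has 18 rows and 1 column, the transpose has 1 row and 18 columns, and 6 column names cannot be assigned to 18 columns. One possible way around it, as a minimal sketch, is to collect one list per page and hand pandas a list of rows. The names build_rows, to_frame and sample_pages are hypothetical, and the sample values are made-up stand-ins for the scraped text, not real data from the site. It also assumes each page yields its six values in column order.

```python
COLUMNS = ['Status', 'Type', 'Capacity', 'Feedstock', 'Address1', 'Address2']


def build_rows(pages, width=len(COLUMNS)):
    """Turn per-page value lists into fixed-width rows, one row per page."""
    rows = []
    for values in pages:
        # Strip whitespace and keep at most `width` values per page
        row = [v.strip() for v in values][:width]
        # Pad pages that produced fewer values so every row has the same length
        row += [None] * (width - len(row))
        rows.append(row)
    return rows


def to_frame(rows):
    """Build a DataFrame with one row per page (requires pandas)."""
    # Imported here so that build_rows above has no third-party dependency
    import pandas as pd
    return pd.DataFrame(rows, columns=COLUMNS)


# Hypothetical stand-in for the text scraped from three pages;
# in the real script each inner list would be filled inside `for plant in URLs:`
sample_pages = [
    [' Operational ', 'Farm-fed', '500 kW', 'Maize', '1 Farm Lane', 'AB1 2CD'],
    ['Operational', 'Waste-fed', '1 MW', 'Food waste', '2 Mill Road'],
    ['Planned', 'Farm-fed', '250 kW', 'Slurry', '3 High Street', 'EF3 4GH'],
]

rows = build_rows(sample_pages)
```

With pandas and an Excel engine such as openpyxl installed, to_frame(rows).to_excel('Pandas-Example.xlsx', index=False) should then write three rows under the six headers. In the original loop, that would mean creating a fresh list at the top of each iteration of `for plant in URLs:` and appending that list to an outer list once the page has been parsed, instead of appending every value to mylist directly.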