Putting a web-scraped table into Excel (Selenium, Python)


慕碼人8056858 2023-04-25 17:37:43
I want to put a table and its headers into Excel. I have tried many things, but I can't figure out how to get it to display correctly in Excel. There is also a picture below showing how I would ideally like it to look. Thank you in advance.

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome("drivers/chromedriver")
driver.get("https://web3.ncaa.org/hsportal/exec/hsAction")
Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, "state")))).select_by_visible_text("New Hampshire")
driver.find_element_by_xpath("//input[@id='city']").send_keys("Moultonborough")
driver.find_element_by_xpath("//input[@id='name']").send_keys("Moultonborough Academy")
driver.find_element_by_xpath("//input[@value='Search']").click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@name='hsCode']"))).click()

print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='approvedCourseTable_1']//th[@class='header']")))])

table = ([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table#approvedCourseTable_1.tablesorter")))])

with open('out.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(table)

For some reason, when I use table#approvedCourseTable_1.tablesorter, scraping the table into Excel shows only "Course" and nothing else. When I separate the headers from the table contents, I can scrape each of them into Excel on its own, but not both together. Also, when I do manage to get the data into Excel, the table contents are not lined up correctly.

x = ([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table#approvedCourseTable_1 th.header")))])
y = ([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table#approvedCourseTable_1 td")))])

If possible, I would like it to be displayed like this:

[Image: the desired Excel layout]
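A likely reason for the single "Course" cell: the selector `table#approvedCourseTable_1.tablesorter` matches only the `<table>` element itself, so the list passed to `writer.writerow` has one item holding the whole table's text, and `csv` writes it as one cell. The flat `td` list has the opposite problem, since it has no row boundaries. The sketch below shows one way to regroup such a flat list into rows. It assumes every row has exactly as many `td` cells as there are headers, which the real page may not guarantee; `chunk_rows`, `table_to_csv_text`, and the sample values are hypothetical stand-ins, not part of the original code.

```python
import csv
import io


def chunk_rows(cells, n_cols):
    """Split a flat list of cell texts into rows of n_cols items each."""
    if n_cols <= 0:
        raise ValueError("n_cols must be positive")
    return [cells[i:i + n_cols] for i in range(0, len(cells), n_cols)]


def table_to_csv_text(headers, cells):
    """Return CSV text: one header row followed by the regrouped data rows."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(headers)
    writer.writerows(chunk_rows(cells, len(headers)))
    return buffer.getvalue()


# Hypothetical stand-ins for the x (th.header) and y (td) lists in the question.
x = ["Title", "Notes", "Disability Course"]
y = ["AMERICAN LITERATURE", "", "No", "PUBLIC SPEAKING", "", "No"]
print(table_to_csv_text(x, y))
```

Each inner list becomes one spreadsheet row, so headers and contents stay aligned column by column.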

1 Answer

米脂


I got this working with Selenium/Python. Try the code sample below:


from selenium import webdriver
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import csv

# Selenium 3 style constructor; newer Selenium 4 releases take a Service object instead of a path string.
driver = webdriver.Chrome("drivers/chromedriver")
driver.get("https://web3.ncaa.org/hsportal/exec/hsAction")

# Fill in the search form and open the matching school.
Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, "state")))).select_by_visible_text("New Hampshire")
driver.find_element(By.XPATH, "//input[@id='city']").send_keys("Moultonborough")
driver.find_element(By.XPATH, "//input[@id='name']").send_keys("Moultonborough Academy")
driver.find_element(By.XPATH, "//input[@value='Search']").click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@name='hsCode']"))).click()

# Wait until the headers of the first course table are visible before scraping.
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='approvedCourseTable_1']//th[@class='header']")))])

# newline='' keeps the csv module from writing blank lines between rows on Windows.
with open('out.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)

    table_header = driver.find_element(By.XPATH, "(//table[@id='NcaaCrs_ApprovedCategory_All']//td[@class='hs_tableHeader'])[1]")
    print(table_header.text)
    # writerow expects a sequence of cells; a bare string would be split into single characters.
    writer.writerow([table_header.text])

    # Find all approved categories
    approved_categories = driver.find_elements(By.XPATH, "//div[contains(@id,'NcaaCrs_ApprovedCategory_')]")

    for i in range(1, len(approved_categories) + 1):
        category_header = driver.find_element(By.XPATH, f"//div[contains(@id,'NcaaCrs_ApprovedCategory_{i}')]//td[@class='hs_tableHeader']")
        print(category_header.text)
        writer.writerow([category_header.text])

        # Find the course table's header cells and write them as one row
        course_headers = driver.find_elements(By.XPATH, f"//table[contains(@id,'approvedCourseTable_{i}')]/thead//th")
        header_val = [header.text for header in course_headers]
        print(header_val)
        writer.writerow(header_val)

        # Write one CSV row per table row
        course_rows = driver.find_elements(By.XPATH, f"//table[@id='approvedCourseTable_{i}']//tbody/tr")
        for course_row in course_rows:
            row_val = [cell.text for cell in course_row.find_elements(By.XPATH, "./td")]
            print(row_val)
            writer.writerow(row_val)

driver.quit()
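The header cells scraped above contain embedded line breaks (for example `'Course\nWeight'`), which turn into multi-line cells in Excel. As a rough sketch, the writing step could be separated from the scraping and the text normalized first. The helper names `clean` and `write_sections`, the `(category, headers, rows)` tuple shape, and the blank separator row are assumptions made for illustration; the sample data is copied from the printed output.

```python
import csv
import io


def clean(text):
    """Turn embedded line breaks and repeated spaces into single spaces."""
    return " ".join(text.split())


def write_sections(out, page_title, sections):
    """Write a title row, then each category's name, header row, and data rows.

    sections is a list of (category_name, headers, rows) tuples.
    """
    writer = csv.writer(out)
    writer.writerow([clean(page_title)])
    for category, headers, rows in sections:
        writer.writerow([clean(category)])
        writer.writerow([clean(h) for h in headers])
        writer.writerows([[clean(cell) for cell in row] for row in rows])
        writer.writerow([])  # blank row between categories


# Sample data shaped like the values the scraper prints.
sections = [
    ("English",
     ["Course\nWeight", "Title", "Notes", "Max\nCredits", "OK\nThrough", "Disability\nCourse"],
     [["", "AFRICAN LITERATURE", "", "", "", "No"],
      ["", "AMERICAN LITERATURE", "", "", "", "No"]]),
]

buffer = io.StringIO(newline="")
write_sections(buffer, "Approved Courses", sections)
print(buffer.getvalue())

# To write a real file instead (utf-8-sig adds a BOM so Excel detects UTF-8):
# with open("out.csv", "w", newline="", encoding="utf-8-sig") as f:
#     write_sections(f, "Approved Courses", sections)
```

Keeping the writer independent of the browser session also makes it possible to check the CSV layout without launching Chrome.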

The output will look like this:


['Course\nWeight', 'Title', 'Notes', 'Max\nCredits', 'OK\nThrough', 'Disability\nCourse']
Approved Courses
English
['Course\nWeight', 'Title', 'Notes', 'Max\nCredits', 'OK\nThrough', 'Disability\nCourse']
['', 'AFRICAN LITERATURE', '', '', '', 'No']
['', 'AMERICAN LITERATURE', '', '', '', 'No']
['', 'AP ENGLISH LANGUAGE & COMPOSITION', '', '', '', 'No']
['', 'AP ENGLISH LITERATURE & COMPOSITION', '', '', '', 'No']
['', 'COLLEGE COMPOSITION', '', '', '', 'No']
['', 'ENGLISH 9 (ENG 091/092/093)', '', '', '', 'No']
['', 'ENGLISH 9/H', '', '', '', 'No']
['', 'PUBLIC SPEAKING', '', '', '', 'No']
['', 'WORLD STUDIES', '', '', '', 'No']
['', 'WORLD STUDIES HBC', '', '', '', 'No']
Social Science
['Course\nWeight', 'Title', 'Notes', 'Max\nCredits', 'OK\nThrough', 'Disability\nCourse']
['', 'AP WORLD HISTORY', '', '', '', 'No']
['', 'ECONOMICS', '', '', '', 'No']
['', 'GOVERNMENT', '', '', '', 'No']
['', 'PSYCHOLOGY', '', '', '', 'No']
['', 'US HISTORY', '', '', '', 'No']
['', 'US HISTORY/AP', '', '', '', 'No']
['', 'WORLD STUDIES', '', '', '', 'No']
['', 'WORLD STUDIES HBC', '', '', '', 'No']
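Looking up every row with its own XPath query costs one WebDriver round trip per row. A possible alternative, sketched here with only the standard library, is to fetch a table's markup once (for example with Selenium's `element.get_attribute("outerHTML")`) and split it into rows locally. The `TableRows` class, the `rows_from_html` helper, and the sample HTML are made up for illustration; the real page's markup may differ, and nested tables or `colspan` cells are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        elif tag == "br" and self._cell is not None:
            self._cell.append(" ")  # treat a line break inside a cell as a space

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse whitespace runs to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_from_html(table_html):
    """Return a list of rows, each a list of cell texts."""
    parser = TableRows()
    parser.feed(table_html)
    parser.close()
    return parser.rows


# Made-up markup standing in for one table's outerHTML.
sample_html = """
<table id="approvedCourseTable_1">
  <thead><tr><th>Course<br>Weight</th><th>Title</th><th>Disability<br>Course</th></tr></thead>
  <tbody>
    <tr><td></td><td>AMERICAN LITERATURE</td><td>No</td></tr>
    <tr><td></td><td>AP ENGLISH LANGUAGE &amp; COMPOSITION</td><td>No</td></tr>
  </tbody>
</table>
"""
for row in rows_from_html(sample_html):
    print(row)
```

The returned lists can be passed directly to `csv.writer(...).writerows`, one list per spreadsheet row.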


Answered 2023-04-25