2 Answers

User with 1824 experience points, over 5 likes
Try using Python's standard urllib.request instead of requests. The requests module seems to have trouble opening this page:
import urllib.request
from bs4 import BeautifulSoup
url='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'
html_content = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html_content, "lxml")
url_course_main='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='
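# strip=True drops the whitespace around the option text, so the URL carries no trailing space
url_course=url_course_main+soup.find_all('option')[1].get_text(strip=True)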
html_content_course=urllib.request.urlopen(url_course).read()
soup_course=BeautifulSoup(html_content_course,'lxml')
for j in soup_course.find_all('td'):
    print(j.get_text(strip=True))
Output:
2019-2020 Yaz D?nemi AKM Kodlu Derslerin Ders Program?
...

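User with 1816 experience points, over 6 likes
The problem is that get_text() returns 'AKM ' with a space at the end, and requests sends the URL with this space included. The server cannot find a file for 'AKM ' with the trailing space.
I wrapped the value in > and < using '>{}<'.format(param) to make the space visible: >AKM <. Without the > < markers the output looks fine.
The code needs get_text(strip=True) or get_text().strip() to remove this space.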
import requests
from bs4 import BeautifulSoup
url = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'
html_content = requests.get(url).text
soup = BeautifulSoup(html_content, 'lxml')
url_course_main = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='
param = soup.find_all('option')[1].get_text()
print('>{}<'.format(param)) # I use `> <` to show spaces
param = soup.find_all('option')[1].get_text(strip=True)
print('>{}<'.format(param)) # I use `> <` to show spaces
url_course = url_course_main + param
html_content_course = requests.get(url_course).text
soup_course = BeautifulSoup(html_content_course, 'lxml')
for j in soup_course.find_all('td'):
    print(j.get_text())
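The trailing-space problem can also be reproduced without touching the network. The following is only a minimal sketch: the helper name build_course_url and the hard-coded sample value 'AKM ' are assumptions made for illustration, and the call to urllib.parse.quote is an extra defensive step that is not part of the answer above. It uses repr() as an alternative to the '>{}<' trick for making the space visible.

```python
from urllib.parse import quote

# Base URL taken from the answer above.
BASE = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='


def build_course_url(option_text):
    """Hypothetical helper: strip surrounding whitespace, then append the code."""
    code = option_text.strip()   # same effect as get_text().strip()
    return BASE + quote(code)    # quote() is an added precaution, an assumption here


# Sample value standing in for what get_text() returned in the answer.
raw = 'AKM '

print(repr(raw))                    # repr() shows the quotes, so the space is visible
print(repr(BASE + raw))             # URL built without stripping still ends in a space
print(repr(build_course_url(raw)))  # URL built after stripping has no trailing space
```

In the scraper itself, the stripped value could be obtained directly with get_text(strip=True), as shown in the answer, and then passed to a helper like this one.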