2 Answers

Contributor with 1876 experience points, more than 6 upvotes
It may be simpler than what you are doing. Let's try to simplify it:
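import requests
from bs4 import BeautifulSoup as bs

boston_url = 'https://www.mass.gov/service-details/request-for-proposal-rfp-notices'
# Send a browser-like User-Agent header with the request
hdr = {'User-Agent': 'Mozilla/5.0'}
req = requests.get(boston_url, headers=hdr)
# The 'lxml' parser requires the lxml package to be installed
soup = bs(req.text, 'lxml')
# Text of the first <p> that is a direct child of the rich-text div
soup.select('main main div.ma__rich-text>p')[0].text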
Output:
'PERAC has not reviewed the RFP notices or other related materials posted on this page for compliance with M.G.L. Chapter 32, section 23B. The publication of these notices should not be interpreted as an indication that PERAC has made a determination as to that compliance.'
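
One caveat with the snippet above: indexing with [0] raises an IndexError if the selector matches nothing, for example if the page layout changes. A minimal sketch of a more defensive variant using select_one follows. The SAMPLE_HTML fragment and the first_paragraph helper are assumptions made up for illustration; the fragment only imitates the nesting the selector expects and is not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure the selector expects;
# the real page may differ.
SAMPLE_HTML = """
<main><main>
  <div class="ma__rich-text"><p>PERAC has not reviewed the RFP notices.</p><p>Second paragraph.</p></div>
</main></main>
"""


def first_paragraph(html, selector="main main div.ma__rich-text > p"):
    """Return the text of the first element matching selector, or None."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one(selector)  # returns None when nothing matches
    return node.get_text(strip=True) if node is not None else None


print(first_paragraph(SAMPLE_HTML))
print(first_paragraph("<p>no match here</p>"))
```

Returning None lets the caller decide how to handle a missing paragraph instead of crashing inside the scraper.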

Contributor with 1813 experience points, more than 2 upvotes
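You can use bs.find('p', string=re.compile('PERAC')) to extract that paragraph (older Beautiful Soup versions call this argument text):
from bs4 import BeautifulSoup
import requests
import re

# Browser-like User-Agent header
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/83.0.4103.61 Safari/537.36'
}
boston_url = (
    'https://www.mass.gov/service-details/request-for-proposal-rfp-notices'
)
resp = requests.get(boston_url, headers=headers)
# Name the parser explicitly to avoid the "no parser specified" warning
bs = BeautifulSoup(resp.text, 'html.parser')
# First <p> whose text matches the regular expression
bs.find('p', string=re.compile('PERAC'))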
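
As far as I know, the regex filter in find is compared against the tag's .string, which is None when the paragraph contains nested tags such as <b> or <a>, so a paragraph like that would be skipped. A minimal sketch of a fallback that matches on the full text instead is shown below. The SAMPLE_HTML fragment and the find_paragraph helper are hypothetical names introduced only for this example.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical fragment: the second paragraph contains a nested <b> tag.
SAMPLE_HTML = """
<div>
  <p>Unrelated notice.</p>
  <p>PERAC has <b>not</b> reviewed the notices.</p>
</div>
"""


def find_paragraph(html, pattern):
    """Return the first <p> whose combined text matches pattern, or None."""
    soup = BeautifulSoup(html, "html.parser")
    regex = re.compile(pattern)
    for p in soup.find_all("p"):
        # get_text() joins the text of all descendants, nested tags included
        if regex.search(p.get_text()):
            return p
    return None


match = find_paragraph(SAMPLE_HTML, "PERAC")
print(match.get_text() if match is not None else "not found")
```

This trades a little speed for robustness, since every paragraph's text is assembled before the regex is applied.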