2 回答

TA貢獻1858條經驗 獲得超8個贊
使用正則表達式。
import re
html='''<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Third Party Liability</span>
<span>€756.62</span>
</li>
<li>
<span>Fire & Theft</span>
<span>€15.59</span>
</li>
</ul>
</div>
<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Fire & Theft</span>
<span>€756.62</span>
</li>
<li>
<span>Third Party Liability</span>
<span>€15.59</span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, "html.parser")
for item in soup.find_all(class_="sk-expander-content"):
for span in item.find_all('span',text=re.compile("€(\d+).(\d+)")):
print(span.find_previous_sibling('span').text)
print(span.text)
輸出:
Third Party Liability
€756.62
Fire & Theft
€15.59
Fire & Theft
€756.62
Third Party Liability
€15.59
更新:如果您想獲取第一個節點值。然后使用find()而不是find_all()。
import re
html='''<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Third Party Liability</span>
<span>€756.62</span>
</li>
<li>
<span>Fire & Theft</span>
<span>€15.59</span>
</li>
</ul>
</div>
<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Fire & Theft</span>
<span>€756.62</span>
</li>
<li>
<span>Third Party Liability</span>
<span>€15.59</span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, "html.parser")
for span in soup.find(class_="sk-expander-content").find_all('span',text=re.compile("€(\d+).(\d+)")):
print(span.find_previous_sibling('span').text)
print(span.text)

TA貢獻1850條經驗 獲得超11個贊
from bs4 import BeautifulSoup
import re
html = """
<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Third Party Liability</span>
<span>€756.62</span>
</li>
<li>
<span>Fire & Theft</span>
<span>€15.59</span>
</li>
</ul>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')
target = soup.select("div.sk-expander-content")
for tar in target:
data = [item.text for item in tar.findAll("span", text=re.compile("€"))]
print(data)
輸出:
['€756.62', '€15.59']
注意:我使用了selectwhich returnResultSet來查找所有div.
- 2 回答
- 0 關注
- 251 瀏覽
添加回答
舉報