
BeautifulSoup: getting the text between elements

Python

蠱毒傳說 2022-06-14 16:33:37

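I have something like this:

foo: bar baz:YES spam eggs: ham

Now I want to get all of the strings between the <b> elements. I can do something like this:

from bs4 import BeautifulSoup

# get the html here
soup = BeautifulSoup(content, 'html.parser')
for element in soup.find_all('b'):
    print(element.next_sibling)

This works, but only for text that is not wrapped in another tag. So I get bar and ham, but I do not get YES and, unexpectedly, I do not even get spam. Is there a way to parse this without using regular expressions?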


2 answers

臨摹微笑

Contributed 1982 experience points, received over 2 likes

You can call find_all() with no arguments to visit every tag, then branch on each tag's name. Use next_element to reach the text that follows the tag.

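from bs4 import BeautifulSoup

# Sample HTML; the loop below expects <b> labels and <font>-wrapped values.
html = '''foo: bar
baz:
YES spam
eggs: ham
'''

soup = BeautifulSoup(html, 'lxml')

# find_all() with no arguments yields every tag in document order.
for item in soup.find_all():
    if item.name == 'font':
        # Text inside <font>, then the text node right after the tag.
        print(item.text.strip())
        print(item.next_element.next_element.strip())
    if item.name == 'b':
        # Text node right after </b>; skip it when it is only whitespace.
        if item.next_element.next_element.strip() != '':
            print(item.next_element.next_element.strip())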

Output:

bar
YES
spam
ham
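The double next_element hop above depends on exactly which node follows each tag. A more defensive variant of the same idea, sketched here under assumptions, walks next_elements in document order and keeps every non-empty string until the next <b>. The markup in SAMPLE and the helper name texts_after are hypothetical: only the visible text (foo, bar, baz, YES, spam, eggs, ham) comes from the question, the tags themselves are a guess.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical markup: the tags are an assumption, only the text is from the question.
SAMPLE = (
    '<b>foo:</b> bar<br>'
    '<b>baz:</b> <font color="red">YES</font> spam<br>'
    '<b>eggs:</b> ham'
)


def texts_after(tag, stop_name="b"):
    """Return the non-empty strings that follow `tag` in document order,
    stopping at the next tag named `stop_name` (helper name is illustrative)."""
    found = []
    for node in tag.next_elements:
        if isinstance(node, Tag):
            if node.name == stop_name:
                break  # the next label starts here
            continue   # other tags are skipped; their strings are visited later
        if isinstance(node, NavigableString):
            # Skip strings that live inside `tag` itself (the label text).
            if any(parent is tag for parent in node.parents):
                continue
            text = node.strip()
            if text:
                found.append(text)
    return found


soup = BeautifulSoup(SAMPLE, "html.parser")
for b in soup.find_all("b"):
    print(b.get_text(strip=True), texts_after(b))
```

Because next_elements descends into nested tags, text wrapped in <font> (or any other tag) is picked up without a special case for each tag name.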

Posted 2022-06-14

PIPIONE

Contributed 1829 experience points, received over 9 likes

I gave it a try. Hope it works.

from bs4 import BeautifulSoup

# get the html here (content holds the HTML string)
soup = BeautifulSoup(content, 'html.parser')
all_b = soup.find_all('b')
for b in all_b:
    print(b.get_text())            # the label inside <b>
    next_b = b.find_next('b')      # the next label marks where this one ends
    for sibling in b.next_siblings:
        if sibling is next_b:
            break                  # reached the next label
        if isinstance(sibling, str):
            print(sibling)         # bare text node
        else:
            print(sibling.get_text())  # text wrapped in another tag
    print("new")                   # separator between labels
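If you would rather collect the values than print them, the same sibling walk can be wrapped in a function that returns a dict. This is only a sketch: the markup in SAMPLE and the function name fields_by_label are hypothetical (the tags are a guess, only the text is from the question), and it assumes the <b> labels and their values are siblings under the same parent.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical markup: the tags are an assumption, only the text is from the question.
SAMPLE = (
    '<b>foo:</b> bar<br>'
    '<b>baz:</b> <font color="red">YES</font> spam<br>'
    '<b>eggs:</b> ham'
)


def fields_by_label(html):
    """Map each <b> label to the text of the siblings that follow it,
    up to the next <b> (function name is illustrative)."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for b in soup.find_all("b"):
        parts = []
        for sibling in b.next_siblings:
            if isinstance(sibling, Tag) and sibling.name == "b":
                break  # the next label starts here
            # Text nodes are str subclasses; tags expose their text via get_text().
            text = sibling if isinstance(sibling, str) else sibling.get_text()
            text = text.strip()
            if text:
                parts.append(text)
        label = b.get_text(strip=True).rstrip(":")
        fields[label] = " ".join(parts)
    return fields


print(fields_by_label(SAMPLE))
```

Joining the parts with a space keeps "YES" and "spam" together under one label; return the parts list instead if you need them separately.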

Posted 2022-06-14
