
Python Beautiful Soup - Deleted tags still affect the output

Python

侃侃爾雅 2023-07-11 15:38:06

Hello,

from bs4 import BeautifulSoup

html = 'This <i>is</i> a test.'
soup = BeautifulSoup(html, 'lxml')
print(soup)

for tag in soup.find_all('i'):
    tag.replace_with('is')

print(soup)
print("\n")
print(soup.prettify())
print("\n")

for string in soup.stripped_strings:
    print(string)

This program prints the following:

<html><body><p>This <i>is</i> a test.</p></body></html>
<html><body><p>This is a test.</p></body></html>

<html>
 <body>
  <p>
   This
   is
   a test.
  </p>
 </body>
</html>

This
is
a test.

Why is that? Why is the string still split into three parts, as if the deleted tag were still there? If I instead start from the markup printed after the replacement ('This is a test.' with no <i> tag), everything works fine. What am I doing wrong? Thanks in advance.


1 Answer

守著星空守著你


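It looks like replace_with swaps the <i>is</i> tag for the string 'is' but does not merge it with the neighbouring text: 'is' remains a separate node in the tree.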
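To see this for yourself, a minimal sketch along these lines should work. It assumes the original markup contained an <i> tag around 'is', and it uses the built-in 'html.parser' instead of 'lxml' only so that nothing extra needs to be installed; the variable names are just for illustration.

```python
from bs4 import BeautifulSoup

# Assumption: 'html.parser' is swapped in for 'lxml' to avoid the extra dependency.
# With html.parser no <html>/<body> wrapper is added, so the <p> is written explicitly.
soup = BeautifulSoup("<p>This <i>is</i> a test.</p>", "html.parser")

for tag in soup.find_all("i"):
    tag.replace_with("is")

# The serialized markup reads like a single sentence...
markup = str(soup)
print(markup)

# ...but the <p> element still holds three separate text nodes.
children = list(soup.p.contents)
print(children)
```

So the serialized output hides the node boundaries, while stripped_strings walks the tree node by node and still sees three strings.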

To get the text as a single node in the tree, you have to convert the tree to a string and parse it again.

html = str(soup)

#print(html)

soup = BeautifulSoup(html, 'lxml')
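As a rough end-to-end check of that round trip, something like the following could be used. Again this is only a sketch: it assumes the <i> markup from the question and uses 'html.parser' rather than 'lxml', and the names before and after are made up for the example.

```python
from bs4 import BeautifulSoup

# Assumption: 'html.parser' replaces 'lxml' here so the sketch runs without lxml.
soup = BeautifulSoup("<p>This <i>is</i> a test.</p>", "html.parser")

for tag in soup.find_all("i"):
    tag.replace_with("is")

# Before the round trip: three adjacent text nodes.
before = list(soup.stripped_strings)

# Serialize the tree and parse the result again; the parser sees plain text
# inside <p> and builds a single text node.
soup = BeautifulSoup(str(soup), "html.parser")
after = list(soup.stripped_strings)

print(before)
print(after)
```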

If you want the text as a single string, you can try get_text(strip=True, separator=" ")

from bs4 import BeautifulSoup

html = 'This <i>is</i> a test.'

soup = BeautifulSoup(html, 'lxml')

print(soup.get_text(strip=True, separator=" "))
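Another option that may be worth trying, not mentioned above, is smooth(), which merges adjacent text nodes in place without re-parsing. The sketch below assumes Beautiful Soup 4.8.0 or newer (where smooth() is available), the <i> markup from the question, and 'html.parser' in place of 'lxml'.

```python
from bs4 import BeautifulSoup

# Assumptions: bs4 >= 4.8.0 (for smooth()), and 'html.parser' instead of 'lxml'.
soup = BeautifulSoup("<p>This <i>is</i> a test.</p>", "html.parser")

for tag in soup.find_all("i"):
    tag.replace_with("is")

# Merge neighbouring text nodes throughout the tree, in place.
soup.smooth()

merged = list(soup.stripped_strings)
node_count = len(soup.p.contents)
print(merged)
print(node_count)
```

This keeps the existing tree, which might matter if other code still holds references to tags in it.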

Answered 2023-07-11
