Extracting innerHtml from the body tag using jsoup

Java

拉風的咖菲貓 2019-04-26 17:15:38

I am parsing HTML with jsoup and want to extract the inner HTML of the body tag. So far I have tried document.body().children().outerHtml(), but it returns only the HTML elements and skips the floating text inside the body (text that is not wrapped in any HTML tag).

private String getBodyTag(final Document document) {
    return document.body().children().outerHtml();
}

Input:

<!DOCTYPE html>
<html lang="de">
 <head>
  <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <link rel="stylesheet" type="text/css" href="assets/style.css">
 </head>
 <body>
  <div>questions to improve formatting and clarity.</div>
  <h3>Guided Mode</h3> some sample raw/floating text
 </body>
</html>

Expected:

<div>questions to improve formatting and clarity.</div><h3>Guided Mode</h3> some sample raw/floating text

Actual:

<div>questions to improve formatting and clarity.</div><h3>Guided Mode</h3>

2 Answers

小唯快跑啊

1863 experience points, 2+ likes

Use this instead:

private String getBodyTag(final Document document) {
    // html() serializes every child node of <body>, text nodes included
    return document.body().html();
}
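
To see the difference between the two calls on the question's input, here is a minimal sketch. It assumes jsoup is on the classpath; the class name BodyInnerHtmlDemo, the helper method names, and the condensed INPUT string are illustrative and not part of the original post. The exact whitespace and line breaks in the printed HTML depend on jsoup's pretty-printing settings, so no literal output is shown.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class BodyInnerHtmlDemo {
    // The question's input, condensed (head contents omitted for brevity).
    static final String INPUT =
        "<!DOCTYPE html><html lang=\"de\"><head></head><body>"
        + "<div>questions to improve formatting and clarity.</div>"
        + "<h3>Guided Mode</h3> some sample raw/floating text "
        + "</body></html>";

    // Original attempt: serializes the child *elements* of <body> only.
    static String childElementsHtml(Document document) {
        return document.body().children().outerHtml();
    }

    // Suggested fix: serializes every child node of <body>, text included.
    static String bodyInnerHtml(Document document) {
        return document.body().html();
    }

    public static void main(String[] args) {
        Document document = Jsoup.parse(INPUT);
        System.out.println("children().outerHtml():");
        System.out.println(childElementsHtml(document));
        System.out.println("body().html():");
        System.out.println(bodyInnerHtml(document));
    }
}
```

The first printout should stop after the h3 element, while the second should also contain the floating text. If the added indentation is unwanted, calling document.outputSettings().prettyPrint(false) before serializing is one option.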

2019-05-15

慕后森

1802 experience points, 5+ likes

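You can try returning document.body().html(), which gives you everything inside the body tag, including text that sits outside any tag. (jsoup has no innerHtml property; html() is its equivalent.)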

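As far as I know, your approach does not work because children() returns only child elements. The "raw text" is a text node rather than an element, so it is not counted among the children.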
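
The element versus text node distinction can be made visible by walking childNodes() instead of children(). The following is a rough sketch under the assumption that jsoup is available; the class BodyNodeKinds and the method describeBodyChildren are hypothetical names chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

public class BodyNodeKinds {
    // Labels each direct child node of <body> as "element:<tag>" or "text:<content>".
    static List<String> describeBodyChildren(String html) {
        Document document = Jsoup.parse(html);
        List<String> kinds = new ArrayList<>();
        // childNodes() includes text nodes; children() would skip them.
        for (Node node : document.body().childNodes()) {
            if (node instanceof Element) {
                kinds.add("element:" + ((Element) node).tagName());
            } else if (node instanceof TextNode) {
                String text = ((TextNode) node).text().trim();
                if (!text.isEmpty()) { // ignore whitespace-only text nodes
                    kinds.add("text:" + text);
                }
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        String html = "<body><div>a</div><h3>Guided Mode</h3>"
            + " some sample raw/floating text</body>";
        describeBodyChildren(html).forEach(System.out::println);
    }
}
```

If only the floating text is needed rather than the full inner HTML, Element.textNodes() returns just the direct text node children of an element.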

2019-05-15
