2 Answers

Contributed 1799 experience points, received more than 8 likes
Another solution, using the simplified_scrapy library.
from simplified_scrapy import SimplifiedDoc
html = '''<Segment StartTime="639.752" EndTime="642.270" Participant="fe016">
But I bet it's a good <Pause/> superset of it.
</Segment>'''
doc = SimplifiedDoc(html)
print(doc.Segment)
print(doc.Segment.text)
Output:
{'StartTime': '639.752', 'EndTime': '642.270', 'Participant': 'fe016', 'tag': 'Segment', 'html': "\n But I bet it's a good <Pause /> superset of it.\n"}
But I bet it's a good superset of it.
More examples are available here: https://github.com/yiyedata/simplified-scrapy-demo/blob/master/doc_examples

Contributed 1815 experience points, received more than 13 likes
xml = '''<Segment StartTime="639.752" EndTime="642.270" Participant="fe016">
But I bet it's a good <Pause/> superset of it.
</Segment>'''
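# Solution using ElementTree from the standard library
from xml.etree import ElementTree as ET
root = ET.fromstring(xml)
# <Pause/> is an empty child element; the text that follows it is stored in its .tail
pause = root.find('./Pause')
# root.text holds the text before the first child, pause.tail the text after <Pause/>
print(root.text + pause.tail)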
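The snippet above assumes exactly one <Pause/> child. If a segment might contain several child tags, or none, a slightly more general sketch is to join everything yielded by `itertext()`, which walks the text and tail of every element in document order. The helper name `segment_text` is made up for this illustration, and collapsing whitespace is an assumption about the desired output.

```python
from xml.etree import ElementTree as ET


def segment_text(xml_string):
    """Return the text inside the root element with all child tags dropped."""
    root = ET.fromstring(xml_string)
    # itertext() yields root.text, then each descendant's text and tail in order.
    joined = "".join(root.itertext())
    # Collapse the line breaks and double spaces left where tags used to be.
    return " ".join(joined.split())


xml = '''<Segment StartTime="639.752" EndTime="642.270" Participant="fe016">
But I bet it's a good <Pause/> superset of it.
</Segment>'''
print(segment_text(xml))
```

Unlike `root.text + pause.tail`, this does not raise an error when `find` would return `None`, at the cost of also including the text of any nested elements.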