
How do I iterate over a list of strings in Python and join the strings that belong to a tag?

Python

SMILET 2022-11-01 17:13:51

When iterating over a list of elements in Python 3, how can I "isolate" the content between the elements of interest? I have a list:

    list = ["<h1> question 1", "question 1 content", "question 1 more content", "<h1> answer 1", "answer 1 content", "answer 1 more content", "<h1> question 2", "question 2 content", "<h> answer 2", "answer 2 content"]

In this list there are elements with an <h> tag and others without one. The idea is that an element with the tag is a "header", and the elements that follow it, up to the next tag, are its content. How can I join the list elements that belong to a header so that I end up with two lists of equal size?

    headers = ["<h1> question 1", "<h1> answer 1", "<h1> question 2", "<h> answer 2"]
    content = ["question 1 content question 1 more content", "answer 1 content answer 1 more content", "question 2 content", "answer 2 content"]

Both lists have the same length, in this case 4 elements each. I was able to separate the parts, but I could use some help finishing:

    list = ["<h1> question 1", "question 1 content", "question 1 more content", "<h1> answer 1", "answer 1 content", "answer 1 more content", "<h1> question 2", "question 2 content", "<h> answer 2", "answer 2 content"]
    headers = []
    content = []
    for i in list:
        if "<h1>" in i:
            headers.append(i)
        if "<h1>" not in i:
            tempContent = []
            tempContent.append(i)
            content.append(tempContent)

Any ideas on how to combine these texts so that they correspond one to one? Thanks!


2 Answers

catspeake

Contributed 1111 experience points · received 0+ likes

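Assuming that all elements after each header are that header's content, and that the first element is always a header, you can use itertools.groupby.

The key can be whether the element has a header tag, so that each header's content is grouped right after it: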

    from itertools import groupby

    lst = ["<h1> question 1", "question 1 content", "question 1 more content", "<h1> answer 1", "answer 1 content", "answer 1 more content", "<h1> question 2", "question 2 content", "<h> answer 2", "answer 2 content"]

    headers = []
    content = []

    # The key is True for header lines and False for content lines, so each
    # run of content lines following a header forms a single group.
    for key, values in groupby(lst, key=lambda x: "<h" in x):
        if key:
            # Relies on the assumption above: each header group holds exactly one item.
            headers.append(*values)
        else:
            content.append(" ".join(values))

    print(headers)
    print(content)

This gives:

['<h1> question 1', '<h1> answer 1', '<h1> question 2', '<h> answer 2']

['question 1 content question 1 more content', 'answer 1 content answer 1 more content', 'question 2 content', 'answer 2 content']
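To see why this works, it may help to look at the groups that groupby produces before anything is joined. The snippet below is only an illustrative sketch on a shortened copy of the list; the name `groups` is introduced here for the example and is not part of the answer above.

```python
from itertools import groupby

# Shortened copy of the list from the question, for illustration only.
lst = ["<h1> question 1", "question 1 content", "question 1 more content",
       "<h1> answer 1", "answer 1 content"]

# groupby starts a new group every time the key function's value changes,
# so header lines (True) and content lines (False) come out as alternating groups.
groups = [(key, list(values)) for key, values in groupby(lst, key=lambda x: "<h" in x)]

for key, values in groups:
    print(key, values)
```

Note that groupby only groups consecutive items, so two headers in a row would land in the same group; that is why the assumption that every header is followed by content matters here.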

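The problem with your current approach is that you always add only a single item to content. What you want is to accumulate a temp_content list until the next header is encountered, and only then append it and reset: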

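    headers = []
    content = []
    temp_content = None

    for i in lst:
        if "<h" in i:
            # A new header starts: flush the content collected for the previous one.
            if temp_content is not None:
                content.append(" ".join(temp_content))
            temp_content = []
            headers.append(i)
        else:
            temp_content.append(i)

    # Flush the content of the last header, which no later header triggers.
    if temp_content is not None:
        content.append(" ".join(temp_content))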
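For reuse, the same accumulate-and-flush idea could be wrapped in a function. This is a minimal sketch, not a definitive implementation: the function name `split_sections` and its `marker` parameter are hypothetical, and it assumes that lines appearing before the first header should simply be ignored. The content of the last header is flushed after the loop, since no later header triggers it.

```python
def split_sections(items, marker="<h"):
    """Split items into parallel lists of headers and joined content (hypothetical helper)."""
    headers = []
    content = []
    current = None  # content lines collected for the current header
    for item in items:
        if marker in item:
            # A new header starts: flush what was collected for the previous one.
            if current is not None:
                content.append(" ".join(current))
            headers.append(item)
            current = []
        elif current is not None:
            current.append(item)
        # Assumption: lines before the first header are ignored.
    # Flush the content of the last header.
    if current is not None:
        content.append(" ".join(current))
    return headers, content


lst = ["<h1> question 1", "question 1 content", "question 1 more content",
       "<h1> answer 1", "answer 1 content", "answer 1 more content",
       "<h1> question 2", "question 2 content", "<h> answer 2", "answer 2 content"]

headers, content = split_sections(lst)
print(headers)
print(content)
```

With this version a header that has no content lines gets an empty string in content, so the two lists always stay the same length.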

Answered 2022-11-01

慕勒3428872

Contributed 1848 experience points · received 6+ likes

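You can collect the headers and their content into a collections.defaultdict while iterating over the list, then split the keys and values into the final headers and content lists. A header can be detected by simply checking whether the string starts with "<h", using str.startswith.

I also use a continue statement to move on to the next iteration as soon as a header is found. You could just use an else statement here instead.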

    from collections import defaultdict

    lst = [
        "<h1> question 1",
        "question 1 content",
        "question 1 more content",
        "<h1> answer 1",
        "answer 1 content",
        "answer 1 more content",
        "<h1> question 2",
        "question 2 content",
        "<h> answer 2",
        "answer 2 content",
    ]

    header_map = defaultdict(list)
    header = None

    for item in lst:
        if item.startswith("<h"):
            header = item
            continue
        header_map[header].append(item)

    headers = list(header_map)
    print(headers)

    content = [" ".join(v) for v in header_map.values()]
    print(content)

Output:

['<h1> question 1', '<h1> answer 1', '<h1> question 2', '<h> answer 2']

['question 1 content question 1 more content', 'answer 1 content answer 1 more content', 'question 2 content', 'answer 2 content']
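The else variant mentioned above might look like the following. It is a sketch on a shortened copy of the list and behaves the same as the continue version for this input.

```python
from collections import defaultdict

# Shortened copy of the list from the question, for illustration only.
lst = ["<h1> question 1", "question 1 content", "question 1 more content",
       "<h1> answer 1", "answer 1 content"]

header_map = defaultdict(list)
header = None

for item in lst:
    if item.startswith("<h"):
        # Remember the most recent header; its content lines follow.
        header = item
    else:
        # Accessing a missing key creates an empty list for that header.
        header_map[header].append(item)

headers = list(header_map)
content = [" ".join(v) for v in header_map.values()]

print(headers)
print(content)
```

One thing to keep in mind with a dict-based approach: keys are unique, so two sections with exactly the same header text would be merged into one entry, and a header with no content lines never gets a key at all. If either case can occur in your data, the loop-based or groupby approach may be a better fit.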

Answered 2022-11-01
