
Accessing the x+1 element when using "for x in list" in Python

Python

慕標琳琳 2021-12-26 10:33:31

I'm trying to parse a newline-delimited text file into blocks of lines, which are appended to a .txt file. I'd like to be able to grab x lines after my end string. The content of those lines varies, so setting an "end string" to try to match them would miss lines.

Example of the file:

"Start"
"..."
"..."
"..."
"..."
"---" ##End here
"xxx" ##Unique data here
"xxx" ##And here

Here is the code:

first = "Start"
first_end = "---"

with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
    copy = False
    for line in infile:
        if line.strip().startswith(first):
            copy = True
            outfile.write(line)
        elif line.strip().startswith(first_end):
            copy = False
            outfile.write(line)
            ##Want to also write next 2 lines here
        elif copy:
            outfile.write(line)

Is there any way to do this using for line in infile, or do I need a different type of loop?


3 Answers

肥皂起泡泡

Contributed 1829 experience points · received 6+ upvotes

You can use next, or readline (in Python 3 and above), to retrieve the next line from the file:

elif line.strip().startswith(first_end):
    copy = False
    outfile.write(line)
    # write the two lines that follow the end marker
    outfile.write(next(infile))
    outfile.write(next(infile))

or

#note: not compatible with Python 2.7 and below
elif line.strip().startswith(first_end):
    copy = False
    outfile.write(line)
    # write the two lines that follow the end marker
    outfile.write(infile.readline())
    outfile.write(infile.readline())

Either way, the file pointer advances by two extra lines, so the next iteration of for line in infile: skips the two lines you have just read.
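A related option, sketched here under stated assumptions rather than taken from the answer, is itertools.islice. It pulls up to N items from the same iterator and stops quietly if the file ends early, whereas next raises StopIteration in that case. The function name copy_blocks and the in-memory sample data are made up for illustration.

```python
import io
from itertools import islice

def copy_blocks(infile, outfile, start="Start", end="---", extra=2):
    """Copy each start..end block plus up to `extra` trailing lines."""
    copying = False
    for line in infile:
        stripped = line.strip()
        if stripped.startswith(start):
            copying = True
            outfile.write(line)
        elif stripped.startswith(end):
            copying = False
            outfile.write(line)
            # islice consumes at most `extra` lines from the same iterator
            # and simply yields fewer if the input runs out.
            outfile.writelines(islice(infile, extra))
        elif copying:
            outfile.write(line)

# In-memory stand-ins for testlog.log and parsed.txt (illustrative data).
source = io.StringIO("noise\nStart\na\n---\nx\ny\nz\nStart\nb\n---\nlast\n")
result = io.StringIO()
copy_blocks(source, result)
print(result.getvalue())
```

The second block in the sample has only one line after its end marker, and the loop still finishes normally.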

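An additional terminology nitpick: a file object is not a list, and the ways of accessing the x+1th element of a list may not work for accessing the next line of a file, and vice versa. If you do want to access the next item of a proper list object, you can use enumerate so that you can do arithmetic on the list's index. For example: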

seq = ["foo", "bar", "baz", "qux", "troz", "zort"]

#find all instances of "baz" and also the first two elements after "baz"
for idx, item in enumerate(seq):
    if item == "baz":
        print(item)
        print(seq[idx+1])
        print(seq[idx+2])

Note that, unlike readline, indexing does not advance the iterator, so for idx, item in enumerate(seq): will still iterate over "qux" and "troz".
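One caveat with index arithmetic is that seq[idx+1] raises IndexError when the match sits at the end of the list. A minimal sketch of a slice-based variant follows; the helper name with_following and the extended sample list are assumptions for illustration only.

```python
seq = ["foo", "bar", "baz", "qux", "troz", "baz", "zort"]

def with_following(items, target, count=2):
    """Return [target, following items...] for every occurrence of target."""
    groups = []
    for idx, item in enumerate(items):
        if item == target:
            # Slicing never raises IndexError; near the end of the list
            # it just returns fewer items.
            groups.append(items[idx:idx + count + 1])
    return groups

groups = with_following(seq, "baz")
print(groups)
```

Here the second "baz" has only one element after it, so its group is shorter instead of causing an error.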

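An approach that works for any iterable is to use an additional variable to track state across iterations. The advantage is that you don't need to know how to advance the iterator manually; the disadvantage is that the logic inside the loop is harder to reason about, because it carries an extra piece of mutable state.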

first = "Start"
first_end = "---"

with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
    copy = False
    num_items_to_write = 0
    for line in infile:
        if num_items_to_write > 0:
            outfile.write(line)
            num_items_to_write -= 1
        elif line.strip().startswith(first):
            copy = True
            outfile.write(line)
        elif line.strip().startswith(first_end):
            copy = False
            outfile.write(line)
            num_items_to_write = 2
        elif copy:
            outfile.write(line)
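To illustrate the "any iterable" point, here is the same counter logic as a sketch wrapped in a generator and fed from a generator expression, which cannot be indexed or read with readline. The name select_lines and the sample words are hypothetical.

```python
def select_lines(lines, start="Start", end="---", extra=2):
    """Yield start..end blocks plus `extra` lines after each end marker."""
    copying = False
    remaining = 0  # lines still owed after the most recent end marker
    for line in lines:
        stripped = line.strip()
        if remaining > 0:
            remaining -= 1
            yield line
        elif stripped.startswith(start):
            copying = True
            yield line
        elif stripped.startswith(end):
            copying = False
            remaining = extra
            yield line
        elif copying:
            yield line

# A generator expression supports only forward iteration.
stream = (word for word in ["skip", "Start", "a", "---", "x", "y", "z"])
selected = list(select_lines(stream))
print(selected)
```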

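In the specific case of extracting repeated groups of data from a delimited file, it may be appropriate to skip iteration entirely and use a regular expression. For data like yours, that might look like this: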

import re

with open("testlog.log") as file:
    data = file.read()

pattern = re.compile(r"""
    ^Start$         #"Start" by itself on a line
    (?:\n.*$)*?     #zero or more lines, matched non-greedily
                    #use (?:) for all groups so `findall` doesn't capture them later
    \n---$          #"---" by itself on a line
    (?:\n.*$){2}    #exactly two lines
    """, re.MULTILINE | re.VERBOSE)

#equivalent one-line regex:
#pattern = re.compile("^Start$(?:\n.*$)*?\n---$(?:\n.*$){2}", re.MULTILINE)

for group in pattern.findall(data):
    print("Found group:")
    print(group)
    print("End of group.\n\n")

When run on a log that looks like this:

Start
foo
bar
baz
qux
---
troz
zort
alice
bob
carol
dave
Start
Fred
Barney
---
Wilma
Betty
Pebbles

...this produces the output:

Found group:
Start
foo
bar
baz
qux
---
troz
zort
End of group.


Found group:
Start
Fred
Barney
---
Wilma
Betty
End of group.
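If the markers or the number of trailing lines need to vary, the pattern can be built from parameters. This is a sketch, not part of the answer: build_pattern is a hypothetical helper, and like the regex above it requires each marker to sit alone on its line, which is stricter than the startswith checks in the loop versions.

```python
import re

def build_pattern(start, end, extra):
    """Compile a block pattern; markers must be alone on their lines."""
    return re.compile(
        # re.escape keeps markers such as "---" literal;
        # {{{extra}}} renders as a {N} repetition count.
        rf"^{re.escape(start)}$(?:\n.*$)*?\n{re.escape(end)}$(?:\n.*$){{{extra}}}",
        re.MULTILINE,
    )

data = "noise\nStart\na\n---\nx\ny\nz\nStart\nb\n---\nw\nv\nu\n"
matches = build_pattern("Start", "---", 2).findall(data)
for block in matches:
    print(block)
    print("--")
```

A block followed by fewer than `extra` lines does not match at all with this pattern, so truncated logs would need separate handling.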

Posted 2021-12-26

慕村225694

Contributed 1880 experience points · received 4+ upvotes

The simplest way is to write a generator function that parses infile:

def read_file(file_handle, start_line, end_line, extra_lines=2):
    start = False
    while True:
        try:
            line = next(file_handle)
        except StopIteration:
            return

        if not start and line.strip().startswith(start_line):
            start = True
            yield line
        elif not start:
            continue
        elif line.strip().startswith(end_line):
            yield line
            try:
                for _ in range(extra_lines):
                    yield next(file_handle)
            except StopIteration:
                return
            # stop copying until the next start marker is seen
            start = False
        else:
            yield line

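The inner try-except clause is not needed if you know every file is well-formed. The outer one should stay: since Python 3.7 (PEP 479), a StopIteration that escapes a generator body is turned into a RuntimeError, so that clause is what ends the generator cleanly at end of file.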
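A minimal sketch of what happens when next is called inside a generator with no guard, assuming Python 3.7 or later. The generator name unguarded is made up for this illustration.

```python
def unguarded(lines):
    """Hypothetical generator that calls next() with no try-except."""
    iterator = iter(lines)
    while True:
        # When the input runs out, next() raises StopIteration inside the
        # generator body, which Python 3.7+ converts to RuntimeError.
        yield next(iterator)

try:
    collected = list(unguarded(["a\n", "b\n"]))
    outcome = "finished cleanly"
except RuntimeError:
    outcome = "RuntimeError"

print(outcome)
```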

You can use this generator like this:

if __name__ == "__main__":
    first = "Start"
    first_end = "---"

    with open("testlog.log") as infile, open("parsed.txt", "a") as outfile:
        output = read_file(
            file_handle=infile,
            start_line=first,
            end_line=first_end,
            extra_lines=2,  # the question asks for the two lines after the end marker
        )
        outfile.writelines(output)

Posted 2021-12-26

紅顏莎娜

Contributed 1842 experience points · received 13+ upvotes

With a tri-state variable and less code duplication:

first = "Start"
first_end = "---"
# Lines to read after end flag
extra_count = 2

with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
    # Do not copy by default
    copy = 0

    for line in infile:
        # Strip once only
        clean_line = line.strip()

        # Enter "infinite copy" state
        if clean_line.startswith(first):
            copy = -1

        # Copy the end-flag line plus the extra amount
        elif clean_line.startswith(first_end):
            copy = extra_count + 1

        # If in a "must-copy" state
        if copy != 0:
            # One less line to copy if end flag passed
            if copy > 0:
                copy -= 1
            # Copy current line
            outfile.write(line)
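To make the three states easier to follow (0 for idle, -1 for copying until the end flag, a positive number for counting down), here is a sketch that records the variable after each line instead of writing to a file. The function trace_copy_state and its sample input are illustrative assumptions.

```python
def trace_copy_state(lines, first="Start", first_end="---", extra_count=2):
    """Return (line, was_written, copy_after) for each input line."""
    copy = 0
    trace = []
    for line in lines:
        clean = line.strip()
        if clean.startswith(first):
            copy = -1                 # copy indefinitely
        elif clean.startswith(first_end):
            copy = extra_count + 1    # this line plus the extra lines
        written = copy != 0
        if copy > 0:
            copy -= 1                 # count down towards idle
        trace.append((clean, written, copy))
    return trace

rows = trace_copy_state(["skip", "Start", "a", "---", "x", "y", "z"])
for row in rows:
    print(row)
```

One design consequence is that an end flag with no preceding start flag still triggers copying, which matches the loop in the question.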

Posted 2021-12-26
