3 Answers

Contributor with 1982 experience points, over 2 upvotes
You can use a recursive form of breadth-first search:
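def overlap(a, b) -> bool:
    # True if a's end falls inside b: at or after b's start, before b's end
    return a[-1] >= b[0] and a[-1] < b[-1]

def group(d, _c, seen):
    # Pair _c with the intervals in d that contain _c's end; recurse on any already in seen
    return [_c,
            [i if i not in seen else group(d, i, seen+[i]) for i in d if overlap(_c, i)]]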
r = {'5ykw.pdb': [[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]}
new_data = [group(r['5ykw.pdb'], i, []) for i in r['5ykw.pdb'] if not any(overlap(c, i) for c in r['5ykw.pdb'])]
final_data = [a if not b else [a[0], max(h for _, h in b)] for a, b in new_data]
Output:
[[10, 22], [33, 51], [63, 71], [94, 105]]
This also works for input with more than two overlapping intervals:
r = {'5ykw.pdb':[[15, 20], [18, 21], [19, 30]]}
new_data = [group(r['5ykw.pdb'], i, []) for i in r['5ykw.pdb'] if not any(overlap(c, i) for c in r['5ykw.pdb'])]
final_data = [a if not b else [a[0], max(h for _, h in b)] for a, b in new_data]
Output:
[[15, 30]]
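To try the same logic on other inputs, the two comprehensions can be wrapped in a function. This is only a sketch: the name merge_via_group is hypothetical and not part of the answer, while overlap and group are repeated unchanged so the block runs on its own.

```python
def overlap(a, b) -> bool:
    return a[-1] >= b[0] and a[-1] < b[-1]

def group(d, _c, seen):
    return [_c,
            [i if i not in seen else group(d, i, seen + [i]) for i in d if overlap(_c, i)]]

def merge_via_group(intervals):
    # Hypothetical wrapper: a root is an interval that no other interval's end falls inside
    roots = [group(intervals, i, []) for i in intervals
             if not any(overlap(c, i) for c in intervals)]
    # Extend each root to the largest end among the intervals grouped with it
    return [a if not b else [a[0], max(h for _, h in b)] for a, b in roots]

print(merge_via_group([[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]))
# [[10, 22], [33, 51], [63, 71], [94, 105]]
print(merge_via_group([[15, 20], [18, 21], [19, 30]]))
# [[15, 30]]
print(merge_via_group([[1, 5], [4, 8], [7, 10]]))
# [[1, 8]]
```

The last call shows a limit of this approach: only intervals that overlap a root directly are collected, so in a chain where [7, 10] overlaps [4, 8] but not [1, 5], the result stops at 8 rather than 10.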

Contributor with 1850 experience points, over 11 upvotes
You can use reduce with a custom merge function to build the new list:
from functools import reduce

def merge(acc, curr):
    # assumes the intervals are already sorted by start
    if not len(acc) or acc[-1][1] < curr[0]:
        acc.append(curr)
        return acc
    acc[-1][1] = max(acc[-1][1], curr[1])  # extend last element in accumulator, never shrink it
    return acc
data = {'5ykw.pdb': [[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]}
data['5ykw.pdb'] = reduce(merge, data['5ykw.pdb'], [])
print(data)
# {'5ykw.pdb': [[10, 22], [33, 51], [63, 71], [94, 105]]}
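The reduce approach compares each interval only with the last merged one, so it depends on the input being sorted by start. Below is a minimal sketch for the case where the input may be unsorted and the caller's lists should stay untouched. The names merge_sorted and merge_intervals are hypothetical, chosen for illustration.

```python
from functools import reduce

def merge_sorted(acc, curr):
    # Start a new interval when acc is empty or curr begins after the last one ends
    if not acc or acc[-1][1] < curr[0]:
        acc.append(list(curr))  # copy so the caller's lists are not modified
    else:
        # Overlap: extend the last merged interval, never shrink it
        acc[-1][1] = max(acc[-1][1], curr[1])
    return acc

def merge_intervals(intervals):
    # Sorting by start means each interval only needs checking against the last merged one
    return reduce(merge_sorted, sorted(intervals), [])

print(merge_intervals([[39, 51], [10, 22], [33, 40], [94, 105], [63, 71]]))
# [[10, 22], [33, 51], [63, 71], [94, 105]]
print(merge_intervals([[1, 10], [2, 5]]))
# [[1, 10]]
```

The second call covers an interval fully contained in the previous one, where taking the max of the two ends keeps the merged interval at [1, 10].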

Contributor with 1839 experience points, over 15 upvotes
You can simply use this function to "de-overlap" the list:
def deoverlap(lst):
    if not lst:
        return []
    lst = [sorted(pair) for pair in lst]  # sort pairs (leave out if not needed)
    lst = sorted(lst)  # sort by first item (breaking ties by second item)
    out = []
    prev = lst[0]
    for pair in lst[1:]:
        if prev[1] >= pair[0]:
            if prev[1] < pair[1]:
                prev[1] = pair[1]
        else:
            out.append(prev)
            prev = pair
    out.append(prev)
    return out
dct = {'5ykw.pdb': [[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]}
dct['5ykw.pdb'] = deoverlap(dct['5ykw.pdb'])
print(dct) # prints {'5ykw.pdb': [[10, 22], [33, 51], [63, 71], [94, 105]]}
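The only assumption here is that the input to deoverlap() is a list of pairs of a comparable type (usually numbers), where each pair is a list of length 2.
The pairs are sorted internally and the list is then sorted by first item; a pair is merged into the previous one when the previous pair's maximum ≥ the current pair's minimum. If merging should not happen when the two are equal, the ninth line of deoverlap() should become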
if prev[1] > pair[0]:
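Rather than editing that line by hand, the choice between ≥ and > could be exposed as a parameter. The following is a sketch under that assumption; the function name deoverlap_opt and the merge_touching flag are hypothetical and do not appear in the original function.

```python
def deoverlap_opt(lst, merge_touching=True):
    if not lst:
        return []
    # Sort each pair internally, then sort the pairs by first item
    lst = sorted(sorted(pair) for pair in lst)
    out = []
    prev = lst[0]
    for pair in lst[1:]:
        # merge_touching=True uses >= (equal endpoints merge); False uses >
        joins = prev[1] >= pair[0] if merge_touching else prev[1] > pair[0]
        if joins:
            if prev[1] < pair[1]:
                prev[1] = pair[1]
        else:
            out.append(prev)
            prev = pair
    out.append(prev)
    return out

print(deoverlap_opt([[1, 5], [5, 8]]))                        # [[1, 8]]
print(deoverlap_opt([[1, 5], [5, 8]], merge_touching=False))  # [[1, 5], [5, 8]]
```

With the default the two pairs that share the endpoint 5 are merged; with merge_touching=False they are kept apart.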