2 回答

TA貢獻1911條經驗 獲得超7個贊
我的選擇:
wordlist = ['air', 'pack', 'port', 'hard', 'back', 'bag', 'disk', 'ground', 'play']
word = 'playground'
lenw, minlen = len(word), min([len(w) for w in wordlist])
pairs = [(word[:n], word[n:]) for n in range(1,lenw) if (n >= minlen and n < lenw-minlen+1) ]
found = False
for w1, w2 in pairs:
if w1 in wordlist and w2 in wordlist:
print('Found ' + word + ' as: ' + w1 + ' + ' + w2)
found = True
break
if not found: print('No words found')
#=> Found playground as: play + ground
pairs是將詞一分為二的映射,其中兩個子詞不小于詞表中最小的詞。這減少了查找次數。
打印出來看看:
print(pairs)
#=> [('pla', 'yground'), ('play', 'ground'), ('playg', 'round'), ('playgr', 'ound'), ('playgro', 'und')]
如果是大量單詞,我建議按起始字母(作為詞匯表)分組,然后僅查找單詞字母和起始單詞集之間的交集內的單詞。這里不完整的代碼:
letters = set(word)
print(letters) #=> {'r', 'a', 'u', 'g', 'l', 'n', 'd', 'o', 'y', 'p'}
alphabet = {}
for word in wordlist:
alphabet.setdefault(word[0], set()).add(word)
print(alphabet)
#=> {'a': {'air'}, 'p': {'port', 'play', 'pack'}, 'h': {'hard'}, 'b': {'back', 'bag'}, 'd': {'disk'}, 'g': {'ground'}}
所以交集是:{'g', 'p', 'd', 'a'} 然后建立查找列表:
lookuplist = []
for i in intersection:
for word in alphabet[i]:
lookuplist.append(word)
lookuplist #=> ['air', 'disk', 'ground', 'port', 'pack', 'play']
所以用lookuplist代替wordlist
用一些方法在抽屜里訂購
def vocabulary(wordlist):
res = {}
for word in wordlist:
res.setdefault(word[0], set()).add(word)
return res
def lookuplist(vocabulary, word):
vocabulary_alphabet = set(vocabulary.keys())
word_letters = set(word)
intersection = vocabulary_alphabet.intersection(word_letters)
lookuplist = []
for i in intersection:
for word in vocabulary[i]:
lookuplist.append(word)
return lookuplist
def find_word(word, lookuplist):
lenw, minlen = len(word), min([len(w) for w in lookuplist])
pairs = [(word[:n], word[n:]) for n in range(1,lenw) if (n >= minlen and n < lenw-minlen+1) ]
for w1, w2 in pairs:
if w1 in lookuplist and w2 in lookuplist: return (word, w1, w2)
return []
您可以按如下方式使用:
wordlist = ['air', 'pack', 'port', 'hard', 'back', 'bag', 'disk', 'ground', 'play']
word = 'playground'
vocabulary = vocabulary(wordlist) # run once then store the result
lookuplist = lookuplist(vocabulary, word)
found_word = find_word(word, lookuplist)
print(found_word)
#=> ('playground', 'play', 'ground')

TA貢獻1836條經驗 獲得超13個贊
不是一個完整的解決方案,但一個好主意可能是將單詞存儲在字典中,例如,其中鍵是單詞的長度,值是一組單詞。然后創建一個長度列表來迭代它,而不是迭代輸入詞 ( s),例如:
words = ['toothpaste',
'hard-to-find',
'economic',
'point',
'food',
'seal',
'outrageous',
'motionless',
'ice',
'tow',
'boot',
'cruel',
'peanut',
'butter']
index = {}
for word in words:
index.setdefault(len(word), set()).add(word)
lengths = sorted(index)
def break_into_spaces(s):
s_length = len(s)
for length in lengths:
if length < s_length:
left = s[length:]
right = s[:length]
if left in index[length] and s_length - length in index and right in index[s_length - length]:
print("{} {}".format(right, left))
print("Found them!")
else:
break
break_into_spaces('peanutbutter')
輸出
peanut butter
Found them!
這樣做是否可以節省時間:
它避免了遍歷整個輸入詞,想象一下輸入詞比字典中的所有詞都短的情況,這將立即打破循環并且不打印任何內容。
通過將單詞存儲在相同長度的集合中,您只需要檢查是否存在相同長度的匹配單詞,而不是檢查所有單詞。請注意,這可能毫無意義,因為字典是哈希表,因此理論上檢查包含情況是O(1).
添加回答
舉報