3 Answers

Contributor with 1830 experience points, received over 3 likes
OK, I have updated my answer based on your edit:
from functools import reduce

sent = "terras ipsius Azar vocatas Ta Ta Zagra Ta Zagra Xellule et Ginen Chagem in contrata Deyr Issafisaf cum iuribus suis omnibus"
places = [ 'Ras il Huichile', 'Ras il Hued', 'Ta Richardu', 'Roma', 'Russilion', 'La Rukiha', 'Irrukiha ta il Bayada',
'Casalis Milleri', 'Ta Sabat', 'Casalis Zebug', 'Ta Zagra', 'Sagra in Ras il Hued', 'Ta Isalme', 'Ta Xellule', 'Ginen Chagem',
'Deyr Issafisaf']
# Map each place name to its words, each tagged as 'PLACE'
places_map = {p: [('PLACE', w) for w in p.split()] for p in places}

def find_places(sent, places):
    # Base case: no places left to look for, so every remaining word is 'O'
    if len(places) == 0:
        return [('O', w) for w in sent.split()]
    place = places[0]
    remaining_places = places[1:]
    # Split on the current place, tag each piece recursively with the
    # remaining places, then rejoin the pieces with the tagged place between them
    sent_splits = sent.split(place)
    return reduce(lambda a, b: a + places_map[place] + b,
                  [find_places(s, remaining_places) for s in sent_splits])

print(find_places(sent, places))
The output is:
[('O', 'terras'), ('O', 'ipsius'), ('O', 'Azar'), ('O', 'vocatas'), ('O', 'Ta'), ('PLACE', 'Ta'), ('PLACE', 'Zagra'), ('PLACE', 'Ta'), ('PLACE', 'Zagra'), ('O', 'Xellule'), ('O', 'et'), ('PLACE', 'Ginen'), ('PLACE', 'Chagem'), ('O', 'in'), ('O', 'contrata'), ('PLACE', 'Deyr'), ('PLACE', 'Issafisaf'), ('O', 'cum'), ('O', 'iuribus'), ('O', 'suis'), ('O', 'omnibus')]
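So I used a recursive approach: split the sentence on the first place name, tag that place in the required format, process each remaining piece of the sentence recursively against the remaining places, and finally merge all the results back together. Note that the split works on the raw string, so a place name would also match inside a longer word.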
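For comparison, here is a minimal non-recursive sketch of the same idea that works on word lists instead of raw strings, so a place name can only match whole words. The function name `tag_places` and the shortened `sent` and `places` below are my own illustration, not part of the answer above, and trying longer names first is an assumption about how overlapping names should be resolved.

```python
def tag_places(sentence, places):
    """Tag each word as 'PLACE' or 'O', matching whole place names only."""
    words = sentence.split()
    # Tokenised place names, longest first so longer names win over shorter ones.
    # Blank entries are skipped because an empty phrase would match everywhere.
    phrases = sorted((p.split() for p in places if p.strip()),
                     key=len, reverse=True)
    tagged, i = [], 0
    while i < len(words):
        for phrase in phrases:
            if words[i:i + len(phrase)] == phrase:
                # A full place name starts here: tag all of its words.
                tagged.extend(('PLACE', w) for w in phrase)
                i += len(phrase)
                break
        else:
            # No place name starts at this word.
            tagged.append(('O', words[i]))
            i += 1
    return tagged


# Shortened, hypothetical inputs in the style of the example above
sent = "terras ipsius Azar vocatas Ta Ta Zagra Ta Zagra Xellule et Ginen Chagem"
places = ['Ta Zagra', 'Ta Xellule', 'Ginen Chagem', 'Roma']
print(tag_places(sent, places))
```

On this shortened input the stray 'Ta' and the lone 'Xellule' stay tagged 'O', the same way they do in the recursive version.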

Contributor with 1828 experience points, received over 3 likes
Try something like this:
res = []
# assumes sent is a list of words (use sent.split() if it is a string)
for x in sent:
    # add 'PLACE' if the word is part of any place name
    if any(x in place.split() for place in places):
        res.append(('PLACE', x))
    else:
        # add 'O' if we find nothing
        res.append(('O', x))
print(res)

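Contributor with 1811 experience points, received over 4 likes
Here is a suggestion based purely on list comprehensions, for those who like comprehensions: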
sent = ['terras', 'ipsius', 'Azar', 'vocatas', 'Ta', 'Xellule', 'et', 'Ginen', 'Chagem', 'in', 'contrata', 'Deyr', 'Issafisaf']
places = ['Ta Xellule', 'Ginen Chagem', 'Deyr Issafisaf']
p = [i for place in places for i in place.split()]
result = [('PLACE',word) if word in p else ('O',word) for word in sent]
print(result)
# [('O', 'terras'), ('O', 'ipsius'), ('O', 'Azar'), ('O', 'vocatas'), ('PLACE', 'Ta'),
# ('PLACE', 'Xellule'), ('O', 'et'), ('PLACE', 'Ginen'), ('PLACE', 'Chagem'),
# ('O', 'in'), ('O', 'contrata'), ('PLACE', 'Deyr'), ('PLACE', 'Issafisaf')]
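One caveat with per-word membership is worth seeing on a small case: any word that appears in some place name gets tagged, even when it stands alone. The sketch below uses a shortened, made-up `sent` and `places` plus a set called `place_words` (my naming) to show a stray 'Ta' being tagged 'PLACE'. Whether that matters depends on your data.

```python
# Shortened, hypothetical inputs: the first 'Ta' is not part of a place name
sent = "vocatas Ta Ta Zagra et Ginen Chagem".split()
places = ['Ta Zagra', 'Ginen Chagem']

# Every individual word that occurs in any place name (a set for fast lookup)
place_words = {w for place in places for w in place.split()}

result = [('PLACE', w) if w in place_words else ('O', w) for w in sent]
print(result)
# [('O', 'vocatas'), ('PLACE', 'Ta'), ('PLACE', 'Ta'), ('PLACE', 'Zagra'),
#  ('O', 'et'), ('PLACE', 'Ginen'), ('PLACE', 'Chagem')]
```

If a lone 'Ta' should stay 'O', the words have to be matched as whole phrases rather than one at a time.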