How can I get a list of semantically similar words from two embedded documents in Python?

Python

阿波羅的戰車 2022-10-05 17:00:46

I am working on text embedding in Python. I measured the similarity between documents with a Doc2Vec model. The code is as follows:

    # Assumes `model` (a trained Doc2Vec) and `train_corpus` (a list of
    # TaggedDocument objects) are defined earlier.
    # Note: gensim 4.x renamed `model.docvecs` to `model.dv`.
    for doc_id in range(len(train_corpus)):
        # Infer a vector for this document from its words.
        inferred_vector = model.infer_vector(train_corpus[doc_id].words)
        # Rank every trained document vector by similarity to the inferred vector.
        sims = model.docvecs.most_similar([inferred_vector], topn=len(model.docvecs))

        print('Document ({}): «{}»\n'.format(doc_id, ' '.join(train_corpus[doc_id].words)))
        print('SIMILAR/DISSIMILAR DOCS PER MODEL %s:\n' % model)
        for label, index in [('MOST', 0), ('SECOND-MOST', 1), ('MEDIAN', len(sims) // 2), ('LEAST', len(sims) - 1)]:
            print('%s %s: «%s»\n' % (label, sims[index], ' '.join(train_corpus[sims[index][0]].words)))

Now, given two of these embedded documents, how can I extract a set of semantically similar words from those particular documents? Please help.

1 Answer

www說

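Only certain Doc2Vec modes also train word vectors: dm=1 (the default), or dm=0, dbow_words=1 (DBOW doc-vectors, with skip-gram word-vector training added). If you used one of these modes, the word vectors are available in the model.wv property.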

Calling model.wv.similarity(word1, word2) then gives you the pairwise similarity of any two words.

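So you could iterate over all the words in doc1, collect each word's similarity to every word in doc2, and report the single highest similarity for each word.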
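A minimal sketch of that loop follows. To keep it runnable without a trained model, it takes the similarity function as a parameter and is demonstrated with hand-written two-dimensional toy vectors. The names best_matches, toy_vectors, toy_similarity, and toy_has_vector are hypothetical helpers introduced for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def best_matches(doc1_words, doc2_words, similarity, has_vector):
    """For each doc1 word that has a vector, find its most similar doc2 word.

    Returns a dict mapping doc1 word -> (best doc2 word, similarity score).
    Words without a vector are skipped.
    """
    candidates = [w for w in set(doc2_words) if has_vector(w)]
    results = {}
    for w1 in set(doc1_words):
        if not has_vector(w1) or not candidates:
            continue
        best = max(candidates, key=lambda w2: similarity(w1, w2))
        results[w1] = (best, similarity(w1, best))
    return results

# Toy vectors standing in for trained word vectors (made up for the demo).
toy_vectors = {
    "cat": [1.0, 0.1],
    "kitten": [0.9, 0.2],
    "car": [0.1, 1.0],
    "truck": [0.2, 0.9],
}

def toy_similarity(a, b):
    return cosine(toy_vectors[a], toy_vectors[b])

def toy_has_vector(word):
    return word in toy_vectors

matches = best_matches(
    ["cat", "car", "unknown"], ["kitten", "truck"], toy_similarity, toy_has_vector
)
for word, (match, score) in sorted(matches.items()):
    print(word, "->", match, round(score, 3))
```

With a gensim model trained in one of the word-vector modes, the same function could presumably be called as `best_matches(doc1.words, doc2.words, model.wv.similarity, lambda w: w in model.wv)`. A word that appears in both documents will match itself with a score of 1.0, so you may want to filter shared words out first or treat them as trivially similar.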

2022-10-05
