Storing topic model keywords in a list, taking maximum occurrence counts into account

Python

江戶川亂折騰 2021-08-14 15:57:19

I am doing topic modeling and use the function below to get the top keywords for each topic.

    def getTopKWords(self, K):
        results = []
        """
        returns top K discriminative words for topic t
        ie words v for which p(v|t) is maximum
        """
        index = []
        key_terms = []
        pseudocounts = np.copy(self.n_vt)
        normalizer = np.sum(pseudocounts, (0))
        pseudocounts /= normalizer[np.newaxis, :]
        for t in range(self.numTopics):
            topWordIndices = pseudocounts[:, t].argsort()[-1:-(K+1):-1]
            vocab = self.vectorizer.get_feature_names()
            print (t, [vocab[i] for i in topWordIndices])
            ## Code for storing the values in a single list
        return results

The function prints the following:

    0 ['computer', 'laptop', 'mac', 'use', 'bought', 'like', 'warranty', 'screen', 'way', 'just']
    1 ['laptop', 'computer', 'use', 'just', 'like', 'time', 'great', 'windows', 'macbook', 'months']
    2 ['computer', 'great', 'laptop', 'mac', 'buy', 'just', 'macbook', 'use', 'pro', 'windows']
    3 ['laptop', 'computer', 'great', 'time', 'battery', 'use', 'apple', 'love', 'just', 'work']

This is the result of the loop running 4 times, printing the topic index and all the keywords for each topic. Now I want the function to return a list of the form:

    return [keyword1, keyword2, keyword3, keyword4]

where keyword1/2/3/4 are the words that occur most often across the keyword lists printed for indices 0, 1, 2 and 3.

1 Answer

臨摹微笑

You can use collections.Counter:

    from collections import Counter

    a = ['computer', 'laptop', 'mac', 'use', 'bought', 'like',
         'warranty', 'screen', 'way', 'just']
    b = ['laptop', 'computer', 'use', 'just', 'like', 'time',
         'great', 'windows', 'macbook', 'months']
    c = ['computer', 'great', 'laptop', 'mac', 'buy', 'just',
         'macbook', 'use', 'pro', 'windows']
    d = ['laptop', 'computer', 'great', 'time', 'battery', 'use',
         'apple', 'love', 'just', 'work']

    def get_most_common(*iterables):
        """Accepts iterables, feeds all into Counter and returns the Counter instance"""
        counter = Counter()
        for it in iterables:
            counter.update(it)
        return counter

    # get all words ordered from most to least common
    mc = get_most_common(a, b, c, d).most_common()

    # print top 4 keys
    top4 = [k for k, v in mc[0:4]]
    print(top4)

Output:

['computer', 'laptop', 'use', 'just']
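A more compact variant of the same idea, offered as a sketch rather than the only way to do it: flatten the lists with itertools.chain.from_iterable and let most_common(4) do the truncation. The name keyword_lists is made up for this example. Words with equal counts come back in the order they were first seen.

```python
from collections import Counter
from itertools import chain

# The four keyword lists printed in the question (hypothetical variable name).
keyword_lists = [
    ['computer', 'laptop', 'mac', 'use', 'bought', 'like',
     'warranty', 'screen', 'way', 'just'],
    ['laptop', 'computer', 'use', 'just', 'like', 'time',
     'great', 'windows', 'macbook', 'months'],
    ['computer', 'great', 'laptop', 'mac', 'buy', 'just',
     'macbook', 'use', 'pro', 'windows'],
    ['laptop', 'computer', 'great', 'time', 'battery', 'use',
     'apple', 'love', 'just', 'work'],
]

# Count every word across all lists in one pass.
counts = Counter(chain.from_iterable(keyword_lists))

# most_common(n) returns (word, count) pairs; keep only the words.
top4 = [word for word, _ in counts.most_common(4)]
print(top4)
```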

In your function, collect each topic's keyword list and pass them all to get_most_common:

    some_results = []  # one keyword list per topic

    for t in range(self.numTopics):
        topWordIndices = pseudocounts[:, t].argsort()[-1:-(K+1):-1]
        vocab = self.vectorizer.get_feature_names()
        print(t, [vocab[i] for i in topWordIndices])
        some_results.append([vocab[i] for i in topWordIndices])

    mc = get_most_common(*some_results).most_common()
    return [k for k, v in mc[0:4]]
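To try the whole thing without the surrounding class, here is a minimal standalone sketch. It assumes n_vt is a (vocabulary size x number of topics) count matrix, as the indexing in the question suggests. The function name top_keywords_across_topics, the n_keywords parameter and the toy data are all invented for illustration. The counts are converted to float first, because in-place division on an integer NumPy array raises an error.

```python
from collections import Counter

import numpy as np


def top_keywords_across_topics(n_vt, vocab, K, n_keywords=4):
    """Return the n_keywords words that appear most often among the
    top-K words of every topic (hypothetical helper, not from the question).

    n_vt  : array of shape (vocab_size, num_topics) with word/topic counts
    vocab : sequence of words, vocab[i] names row i of n_vt
    """
    # Float copy so the in-place division also works for integer counts.
    p_vt = np.array(n_vt, dtype=float)
    p_vt /= p_vt.sum(axis=0)[np.newaxis, :]  # normalise each column: p(v | t)

    counts = Counter()
    for t in range(p_vt.shape[1]):
        # Indices of the K largest entries in column t, largest first.
        top = p_vt[:, t].argsort()[::-1][:K]
        counts.update(vocab[i] for i in top)

    return [word for word, _ in counts.most_common(n_keywords)]


# Toy data (made up): 5 words, 3 topics.
vocab = ["apple", "battery", "computer", "laptop", "screen"]
n_vt = np.array([
    [1, 0, 9],
    [0, 8, 1],
    [9, 7, 2],
    [8, 9, 8],
    [2, 1, 0],
])

result = top_keywords_across_topics(n_vt, vocab, K=2, n_keywords=2)
print(result)
```

With this toy matrix, 'laptop' is in the top 2 of all three topics, so it comes first.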

Answered 2021-08-14
