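I fitted a TF-IDF vectorizer from sklearn on my training data. When I apply the learned vocabulary to new data, I get a KeyError, because the new data contains tokens that were not seen during fitting. How can I fix this? Here is my code.

```
import os

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Methods of the feature-extraction class (class definition not shown).

def feature_engineering(self, inputs):
    # Tokenize each document into n-grams with the saved analyzer.
    x = [self.analyser(seq) for seq in inputs]
    return x

def fit(self, inputs):
    # Fit only if the vocabulary and analyzer are not already available.
    if not (self.vocabulary and self.analyser):
        vectorizer = TfidfVectorizer(
            ngram_range=(self.config_dict["min_n_gram"], self.config_dict["max_n_gram"]),
            lowercase=False,
            stop_words=None,
            min_df=2,
        )
        vectorizer.fit(inputs)
        self.analyser = vectorizer.build_analyzer()
        self.vocabulary = vectorizer.vocabulary_
        save_object(os.path.join(self.feature_extraction_folder, "analyzer.pickle"), self.analyser)
        save_object(os.path.join(self.feature_extraction_folder, "vocabulary.pickle"), self.vocabulary)

def transform(self, inputs):
    vocab_size = len(self.vocabulary)
    inputs = self.feature_engineering(inputs)
    # This line raises the KeyError for tokens missing from the vocabulary.
    inputs = [[self.vocabulary[x] for x in l] for l in inputs]
    return np.array(inputs)
```

1 Answer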

慕少森
I solved my problem by adding an if condition that skips tokens missing from the vocabulary:

```
inputs = [[self.vocabulary[x] for x in l if x in self.vocabulary] for l in inputs]
```
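To make the effect of that filter concrete, here is a minimal sketch using a hand-written dictionary in place of `vectorizer.vocabulary_` and pre-tokenized lists in place of the analyzer output. The helper `encode` and its `unk_index` parameter are hypothetical names introduced only for illustration; they are not part of the original code or of sklearn. The sketch also shows one possible alternative, mapping unseen tokens to a reserved index instead of dropping them.

```python
def encode(tokens, vocabulary, unk_index=None):
    """Map tokens to vocabulary indices.

    Unknown tokens are dropped when unk_index is None;
    otherwise they are mapped to unk_index.
    """
    if unk_index is None:
        # Same idea as the answer above: skip tokens not in the vocabulary.
        return [vocabulary[t] for t in tokens if t in vocabulary]
    # Alternative: keep the position and use a reserved "unknown" index.
    return [vocabulary.get(t, unk_index) for t in tokens]


# Hypothetical vocabulary, standing in for vectorizer.vocabulary_.
vocabulary = {"good": 0, "movie": 1, "bad": 2}

# Hypothetical analyzer output for two new documents with unseen tokens.
new_docs = [["good", "movie", "tonight"], ["terrible", "bad"]]

dropped = [encode(doc, vocabulary) for doc in new_docs]
mapped = [encode(doc, vocabulary, unk_index=len(vocabulary)) for doc in new_docs]

print(dropped)  # [[0, 1], [2]]
print(mapped)   # [[0, 1, 3], [3, 2]]
```

Either way, the rows can end up with different lengths, so passing them straight to `np.array` may not give a regular 2-D array; padding or truncating to a fixed length first is probably needed.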