
spaCy is_stop does not identify stop words?


Python

慕少森 2021-06-14 13:09:11

When I use spaCy to identify stop words, it does not work with the en_core_web_lg model, but it does work with en_core_web_sm. Is this a bug, or am I doing something wrong?

import spacy

nlp = spacy.load('en_core_web_lg')
doc = nlp(u'The cat ran over the hill and to my lap')
for word in doc:
    print(f' {word} | {word.is_stop}')

Result:

 The | False
 cat | False
 ran | False
 over | False
 the | False
 hill | False
 and | False
 to | False
 my | False
 lap | False

However, when I change this line to use the en_core_web_sm model, I get different results:

nlp = spacy.load('en_core_web_sm')

 The | False
 cat | False
 ran | False
 over | True
 the | True
 hill | False
 and | True
 to | True
 my | True
 lap | False


2 Answers

湖上湖


Try from spacy.lang.en.stop_words import STOP_WORDS; you can then explicitly check whether a word is in the set.

from spacy.lang.en.stop_words import STOP_WORDS
import spacy

nlp = spacy.load('en_core_web_lg')
doc = nlp(u'The cat ran over the hill and to my lap')

for word in doc:
    # Have to convert Token type to String, otherwise types won't match
    print(f' {word} | {str(word) in STOP_WORDS}')

This outputs the following:

The | False
cat | False
ran | False
over | True
the | True
hill | False
and | True
to | True
my | True
lap | False

This looks like a bug to me. However, this approach also gives you the flexibility to add words to the STOP_WORDS set if you need to.
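As a rough sketch of the set-membership idea above: spaCy's English STOP_WORDS entries are lowercase, which is why the capitalized "The" comes out False in the output shown. Lowercasing before the lookup addresses that, and extra words can be merged into a copy of the set. The helper names (is_stop_word, flag_stop_words) and the tiny SAMPLE_STOP_WORDS set are hypothetical stand-ins so the snippet runs without spaCy installed; in practice you would presumably pass STOP_WORDS from spacy.lang.en.stop_words and token.text values instead.

```python
def is_stop_word(text, stop_words):
    """Case-insensitive membership test against a set of lowercase stop words."""
    return text.lower() in stop_words


def flag_stop_words(tokens, stop_words, extra=()):
    """Pair each token with a stop-word flag, optionally adding custom words."""
    # Build a new set so the caller's stop-word set is not mutated.
    combined = set(stop_words) | {w.lower() for w in extra}
    return [(tok, is_stop_word(tok, combined)) for tok in tokens]


# Hypothetical stand-in for spacy.lang.en.stop_words.STOP_WORDS (assumption:
# only the handful of words needed for this sentence are listed here).
SAMPLE_STOP_WORDS = {"the", "over", "and", "to", "my"}

tokens = "The cat ran over the hill and to my lap".split()

# Treat "lap" as an additional, user-defined stop word.
results = flag_stop_words(tokens, SAMPLE_STOP_WORDS, extra=["lap"])

for tok, flag in results:
    print(f"{tok} | {flag}")
```

With spaCy available, the same call would take the form flag_stop_words([t.text for t in doc], STOP_WORDS), which sidesteps is_stop entirely.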

2021-06-15
