How to find the frequency of repeated sentences in a file

Python

臨摹微笑 2022-07-12 14:38:17

I have a DataFrame and need to find the top 20 repeated sentences using Python. Please let me know how to do this.

Column A
Hello How are you?
This ticket is not valid
How are things at you end?
Hello How are you?
How can I help you?
Please help me with tickets
This ticket is not valid
Hello How are you?

Expected output

Column A                    Frequency of Repeated sentence
Hello How are you?          3
This ticket is not valid    2
How can I help you?         1
...

Code so far

df = pd.read_csv("C:\\Users\\aaa\\abc\\Analysis\\chat.csv", encoding="ISO-8859-1")
df['word_count'] = df['Column A'].apply(lambda x: len(str(x).split(" ")))
df[['Column A','word_count']].head()
for i, g in df.groupby('Column A'):
    print ('Frequency of repeating sentence : {}'.format(g['Column A'].duplicated(keep=False).sum()))

I need the result in a DataFrame that can be written to a CSV, with "Column A" and "Frequency" columns in the final result.

4 Answers

郎朗坤

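Contributed 1921 experience points, received more than 9 likes

Here is one approach, using .value_counts():

df['Column A'].value_counts()

To add the frequency as a column, you can do the following:

df['Frequency'] = df['Column A'].map(df['Column A'].value_counts())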
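
As a rough sketch of how the counts could become the two-column result the question asks for: the example below assumes the column is named "Column A" as in the question, reuses the question's sample rows in place of the real CSV, and writes to a made-up file name (sentence_frequency.csv).

```python
import pandas as pd

# Sample rows copied from the question; in practice df would come from pd.read_csv(...).
df = pd.DataFrame({
    "Column A": [
        "Hello How are you?",
        "This ticket is not valid",
        "How are things at you end?",
        "Hello How are you?",
        "How can I help you?",
        "Please help me with tickets",
        "This ticket is not valid",
        "Hello How are you?",
    ]
})

# value_counts() returns a Series sorted by count, highest first;
# head(20) keeps at most the 20 most frequent sentences.
top = df["Column A"].value_counts().head(20)

# Convert the Series into a DataFrame with explicit column names.
result = top.rename_axis("Column A").reset_index(name="Frequency")

# Hypothetical output file name, chosen only for this example.
result.to_csv("sentence_frequency.csv", index=False)
```

Unlike the .map() version, this keeps one row per distinct sentence, which is the shape shown in the expected output.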

Posted 2022-07-12

隔江千里

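Contributed 1906 experience points, received more than 10 likes

Try this:

# Count rows per sentence; size() yields one row per distinct sentence
counts = df.groupby('Column A').size().reset_index(name='count')

# sort_values returns a new DataFrame, so the result has to be assigned
counts = counts.sort_values(by='count', ascending=False)

print(counts.head(20))

Posted 2022-07-12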

慕的地8271018

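Contributed 1796 experience points, received more than 4 likes

df['count'] = df.groupby('Sentence')['Sentence'].transform('count')

df = df.sort_values(by='count', ascending=False)

df.head(20)

This adds a "count" column to the original DataFrame containing the frequency of each row's sentence. transform() returns a Series aligned with the original DataFrame's index. Here 'Sentence' stands for the text column; in the question it is 'Column A'.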
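
To illustrate what "aligned" means here, the sketch below runs the same transform on the question's sample rows (the column is named 'Sentence' to match this answer). Because every original row keeps its own count, head(20) on the sorted frame can show the same sentence several times; the drop_duplicates step is an assumption about what the asker wants, added to get one row per distinct sentence.

```python
import pandas as pd

# Sample rows copied from the question, under the column name used in this answer.
df = pd.DataFrame({
    "Sentence": [
        "Hello How are you?",
        "This ticket is not valid",
        "How are things at you end?",
        "Hello How are you?",
        "How can I help you?",
        "Please help me with tickets",
        "This ticket is not valid",
        "Hello How are you?",
    ]
})

# transform('count') returns one value per original row (same index as df),
# which is why it can be assigned directly as a new column.
df["count"] = df.groupby("Sentence")["Sentence"].transform("count")

# Keep one row per distinct sentence, most frequent first, at most 20 rows.
top = (
    df.drop_duplicates(subset="Sentence")
      .sort_values(by="count", ascending=False)
      .head(20)
      .reset_index(drop=True)
)
print(top)
```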

Posted 2022-07-12

慕哥9229398

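Contributed 1877 experience points, received more than 6 likes

df['count'] = df.groupby('Sentence')['Sentence'].transform('count')

df = df.sort_values(by='count', ascending=False)

df.head(20)

This adds a "count" column to the original DataFrame containing the frequency of each row's sentence. transform() returns a Series aligned with the original DataFrame's index.

Posted 2022-07-12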
