
Is there a way to analyze a text file to check these criteria?

當年話下 2023-07-11 10:32:46
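I need to create a program that analyzes a passage of text in a file and then counts:

  • how many words there are
  • the average length of the words
  • how many times each word occurs
  • how many words start with each letter of the alphabet

So far I have managed to complete the first two points (shown below):

fileName = open(input('Please enter the full name of the file: '), 'r')

w = [len(word) for line in fileName for word in line.rstrip().split(" ")]
total_w = len(w)
avg_w = sum(w) / total_w

print('The total number of words in this file is:', total_w)
print('The average length of the words in this file is:', avg_w)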

1 Answer

幕布斯6054654

Contributed 1876 experience points, received over 7 likes

collections.Counter makes this relatively simple. I used re.findall(r'[\w]+', data) to find the words (a word here is any run of letters, digits, and underscores). Adjust as needed.
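As a rough illustration of why the choice of tokenizer matters, the sketch below runs three approaches on a made-up string: the question's split(" ") per line, a plain split(), and the regular expression mentioned above. The sample text and the variable names are assumptions for illustration only, not part of the original code.

```python
import re
from collections import Counter

# Made-up sample text; the blank line and the punctuation are deliberate.
sample = "The cat sat.\n\nThe dog, too!"

# Splitting each line on a single space, as in the question.
by_space = [w for line in sample.split("\n") for w in line.rstrip().split(" ")]

# Splitting on any run of whitespace.
by_whitespace = sample.split()

# Runs of letters, digits, and underscores, as in the answer.
by_regex = re.findall(r"[\w]+", sample)

print(by_space)
print(by_whitespace)
print(by_regex)
print(Counter(by_regex))
```

With split(" "), the blank line contributes an empty string, which would be counted as a word of length 0 and pull the average down. split() avoids that but leaves punctuation attached to the words, while the regular expression drops the punctuation.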

import re
from collections import Counter

fn = input('Please enter the full name of the file: ')
with open(fn, 'r') as f:
    words = Counter(re.findall(r'[\w]+', f.read()))
    # use words = Counter(f.read().split()) if everything split by spaces
    # adjust regular expression depending on whether you want or don't want
    # stuff like numbers to be counted as "words"

print('Total number of words:', sum(words.values()))
# this is weighted by word occurrence, not sure whether this is correct
print('Average length of words:',
      sum(len(w) * o for w, o in words.items()) / sum(words.values()))
print('Word occurrence:', words)
# this only shows letters that actually occur. If you need all letters of
# the alphabet, you have to add the rest
print('Start letter occurrence', Counter(w[0] for w in words.elements()))
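The last comment in the code notes that letters with no words are not shown. One possible way to add the rest is sketched here, assuming only the 26 ASCII letters matter and that upper and lower case should be counted together (the original code is case-sensitive). The sample words Counter is hypothetical and stands in for the one built from the file.

```python
import string
from collections import Counter

# Hypothetical word counts standing in for the `words` Counter built from the file.
words = Counter(["the", "The", "cat", "dog", "42", "the"])

# Count first characters case-insensitively, weighted by word occurrence.
starts = Counter(w[0].lower() for w in words.elements())

# Report every ASCII letter, including those with no words. A Counter
# returns 0 for missing keys, so no special-casing is needed. Words that
# start with a digit or underscore are left out of this report.
by_letter = {letter: starts[letter] for letter in string.ascii_lowercase}

print(by_letter)
```

Looking up a missing key in a Counter gives 0 without raising KeyError, which is what keeps the dictionary comprehension short.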


Answered 2023-07-11