Counting word frequency in a text string with Python
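This article shows, by example, how to count how often each word occurs in a text string using Python. It is shared here for reference. The implementation is as follows: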
# word frequency in a text
# tested with Python24    vegaseat    25aug2005

# Chinese wisdom ...
str1 = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""

print "Original string:"
print str1
print

# create a list of words separated at whitespaces
wordList1 = str1.split(None)

# strip any punctuation marks and build modified word list
# start with an empty list
wordList2 = []
for word1 in wordList1:
    # last character of each word
    lastchar = word1[-1:]
    # use a list of punctuation marks
    if lastchar in [",", ".", "!", "?", ";"]:
        word2 = word1.rstrip(lastchar)
    else:
        word2 = word1
    # build a wordList of lower case modified words
    wordList2.append(word2.lower())

print "Word list created from modified string:"
print wordList2
print

# create a word frequency dictionary
# start with an empty dictionary
freqD2 = {}
for word2 in wordList2:
    freqD2[word2] = freqD2.get(word2, 0) + 1

# create a list of keys and sort the list
# all words are lower case already
keyList = freqD2.keys()
keyList.sort()

print "Frequency of each word in the word list (sorted):"
for key2 in keyList:
    print "%-10s %d" % (key2, freqD2[key2])
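The listing above targets Python 2 (print statements, and a `keys()` result that is sorted in place). For readers on Python 3, the same idea can be sketched roughly as follows. This is an illustrative rewrite, not the original author's code: the function name `word_frequencies` is hypothetical, `collections.Counter` replaces the hand-built dictionary, and `rstrip(",.!?;")` removes any run of the listed trailing punctuation marks rather than only the last character.

```python
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count.

    Hypothetical helper for illustration; not part of the original script.
    """
    words = []
    for raw in text.split():
        # strip trailing punctuation (same marks as the original list)
        word = raw.rstrip(",.!?;").lower()
        if word:
            words.append(word)
    return Counter(words)


text = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""

freq = word_frequencies(text)

# print the words in alphabetical order with their counts
for word in sorted(freq):
    print("%-10s %d" % (word, freq[word]))
```

`sorted(freq)` returns a new sorted list of the keys, which avoids the Python 2 pattern of calling `.sort()` on the result of `keys()`; in Python 3 that result is a view object without a `sort` method.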
I hope this article is helpful for your Python programming.