tr命令在统计英文单词出现频率中的妙用
tr命令我们很清楚,可以删除替换,删除字符串。在英文中我们要经常会经常统计英文中出现的频率,如果用常规的方法,用设定计算器一个个算比较费事,这个时候使用tr命令,将空格分割替换为换行符,再用tr命令删除掉有的单词后面的点号,逗号,感叹号。先看看要替换的this.txt文件
TheZenofPython,byTimPeters
Beautifulisbetterthanugly.
Explicitisbetterthanimplicit.
Simpleisbetterthancomplex.
Complexisbetterthancomplicated.
Flatisbetterthannested.
Sparseisbetterthandense.
Readabilitycounts.
Specialcasesaren'tspecialenoughtobreaktherules.
Althoughpracticalitybeatspurity.
Errorsshouldneverpasssilently.
Unlessexplicitlysilenced.
Inthefaceofambiguity,refusethetemptationtoguess.
Thereshouldbeone--andpreferablyonlyone--obviouswaytodoit.
Althoughthatwaymaynotbeobviousatfirstunlessyou'reDutch.
Nowisbetterthannever.
Althoughneverisoftenbetterthan*right*now.
Iftheimplementationishardtoexplain,it'sabadidea.
Iftheimplementationiseasytoexplain,itmaybeagoodidea.
Namespacesareonehonkinggreatidea--let'sdomoreofthose!
上面的文本文件,如果要文中出现次数的最多的10个单词统计出来,可以使用下面的命令
[root@linux~]#catthis.txt|tr'''\n'|tr-d'[.,!]'|sort|uniq-c|sort-nr|head-10 10is 8better 8than 5to 5the 3of 3Although 3never 3be 3one
可谓非常方便!
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作具有一定的参考学习价值,谢谢大家对毛票票的支持。如果你想了解更多相关内容请查看下面相关链接