Count (group by) number of line occurrences in file
Here’s a neat Ruby script to group and count the number of occurrences of each line in a given file.
count.rb
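# Read every line; chomp so a final line without a newline still matches.
list = IO.readlines(ARGV[0]).map(&:chomp)
# Count occurrences of each distinct line.
h = Hash.new(0)
list.each { |item| h[item] += 1 }
# Sort by count, highest first; ties are ordered alphabetically.
h = h.sort_by { |line, count| [-count, line] }
# The optional second argument limits how many results are printed.
limit = ARGV[1] ? ARGV[1].to_i : h.size
h.first(limit).each do |line, count|
  puts "(#{count}) #{line}"
end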
Usage is:
ruby count.rb <file> [show_at_most_n]
If you have a file named “faren.txt” containing the following lines:
test
test
awesome
dude
dude
dude
faren
meran
Running the script with:
ruby count.rb faren.txt
Will yield the following result:
(3) dude
(2) test
(1) awesome
(1) faren
(1) meran
In addition, if you pass the “show_at_most_n” parameter (a number), it will only print that number of results. For example, passing 2 will show the following:
(3) dude
(2) test
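For comparison, here is a more compact sketch of the same idea. It assumes Ruby 2.7 or later, where Enumerable#tally is available, and it wraps the logic in a helper named count_lines, which is a made-up name for illustration and not part of count.rb. Ties are broken alphabetically so the output is deterministic.

```ruby
# Sketch only: count_lines is a hypothetical helper, not part of count.rb.
# Requires Ruby 2.7+ for Enumerable#tally.
def count_lines(lines, limit = nil)
  # tally builds a hash of { line => number of occurrences }.
  counts = lines.map(&:chomp).tally
  # Highest count first; equal counts fall back to alphabetical order.
  sorted = counts.sort_by { |line, count| [-count, line] }
  limit ? sorted.first(limit) : sorted
end

sample = %w[test test awesome dude dude dude faren meran]
count_lines(sample, 2).each { |line, count| puts "(#{count}) #{line}" }
```

To run it against a file, pass IO.readlines(path) as the first argument instead of the sample array.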
sort faren.txt | uniq -c | sort -nr
Ha, thanks for that. Command line power
You only missed the ability to limit the number of returned lines