word frequencies 1

Guest_imported · May 21, 2002

Hi all,

I need a (gawk) script that counts how often each word occurs in different text files, sorts the words from most frequent (left) to least frequent (right), and writes the result to a new file. Suppose the first file - named sport.txt - looks like this:

bird sport basket tennis basket guard victory basket guard victory

This becomes:

basket guard victory bird sport tennis

and should be automatically copied to a file named sport.key (so the extension "txt" should be replaced with "key").

Suppose the second file - named war.txt - looks like this:

guns food defence shield defence victory guns victory rebels

This becomes:

guns defence victory food shield rebels

and should be copied to a new file, called "war.key".

I hope you can help me with this. Many thanks in advance!

sunny

ps: before I forget: if two or more words have the same frequency of occurrence, the order in which these words appear is of no importance.

bigoldbulldog · May 21, 2002

Something like this can get you started. You just have to arrange the saving part. It's not all awk, as you can see. Doing it all in awk would make writing the output files very easy, but it would require writing sorting and formatting routines, which takes time. This looks like a college project, but we'll help anyway.

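awk '{
    # count every word on every input line
    for (i = 1; i <= NF; i++) {
        A[$i]++
    }
}
END {
    # print "count word" pairs once, after all input has been read
    for (item in A) {
        print A[item], item
    }
}' sport.txt |
sort -k1,1nr |
sed -e 's/^[0-9]*[ ]*//;1h;1!{x;G;$!x;}' -e 's/\n/ /g;$!d;'

Cheers,
ND [smile]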

bigoldbulldog@hotmail.com
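
For the "saving part", here is a minimal sketch of one way to do it, not necessarily what the poster had in mind. It assumes a POSIX shell, that the input files sit in the current directory, and that each one ends in ".txt". The function name make_key and the variable names are made up for illustration. The counting loop prints from an END block so that files with more than one line are handled, ties are broken alphabetically only to make the output repeatable, and a second awk stands in for the sed step to join the words onto one line.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this sketch): reads NAME.txt and
# writes its words, most frequent first, on a single line to NAME.key.
make_key() {
    src=$1
    dst="${src%.txt}.key"    # strip the .txt suffix, append .key

    # 1) count each word, 2) sort by count (descending), then by word,
    # 3) drop the counts and join the words with single spaces.
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print count[w], w }' "$src" |
    sort -k1,1nr -k2,2 |
    awk '{ printf "%s%s", (NR > 1 ? " " : ""), $2 }
         END { if (NR) print "" }' > "$dst"
}

# Process every .txt file in the current directory.
for f in *.txt; do
    [ -e "$f" ] || continue    # skip the literal pattern if nothing matches
    make_key "$f"
done
```

Run in a directory holding sport.txt and war.txt, this should leave sport.key and war.key next to them. Words with equal counts come out in alphabetical order, which is fine here since the order among ties does not matter.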

Guest_imported · May 22, 2002

Problem solved indeed! And no, this is not a school project.

Many thanks!
