Hi,
I'm struggling to put together something that pulls details out of many log files created over a 24-hour period and counts how many times per hour each name appears.
I want to run through every hour (i.e. 00, 01, 02 ... 23) and print out how many times name 'x' was found.
The only fields I'm interested in are:
The hour is in field 2
The name is in field 11
My input file(s) are comma-delimited and look like this:
2013/07/13,04:54:33,333,444,555,666,777,888,999,101010,John,121212,131313,141414,151515,161616,171717,181818,191919
I would like my output to look like (NAME HOUR - No. of Entries)
John 00 - 123
Marc 00 - 101
John 01 - 96
Marc 01 - 300
etc...
I've got this far with the code, but the totals it prints for each hour are the GRAND TOTAL of entries per name. So, using the example above, 'John' would show 219 for hours 00 AND 01 (123+96).
My code looks like this:
awk -F"," '{ split($2,DTM,":");a[DTM[1]]++;count[$11]++}END{for(i in a) {for(name in count) printf "%30s - %d\n", name" " i, (count[name]) | "sort -k1" }}'
And its output looks like this:
John 00 - 219
Marc 00 - 401
John 01 - 219
Marc 01 - 401
I know it's probably a simple change, but I've hit a wall and can't find the right syntax, so any help you can give would be much appreciated.
Thanks.
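For reference, one possible fix, offered as a sketch rather than a definitive answer. The code above keeps two independent arrays (hours in a, names in count), so count[name] can only ever hold the total for that name across all hours. Counting on the combined (name, hour) key keeps the hours separate. The function name count_per_hour and the file pattern *.log are made up for illustration, and the sort step assumes names contain no spaces.

```shell
# Hypothetical helper: counts entries per name per hour.
# Reads the files given as arguments, or standard input if none are given.
count_per_hour() {
    awk -F',' '
    {
        split($2, t, ":")       # t[1] is the hour, e.g. "04"
        count[$11, t[1]]++      # one counter per (name, hour) pair
    }
    END {
        for (key in count) {
            split(key, part, SUBSEP)   # part[1] = name, part[2] = hour
            printf "%s %s - %d\n", part[1], part[2], count[key]
        }
    }' "$@" | sort -k2,2 -k1,1         # order by hour, then by name
}
```

It could then be run as count_per_hour *.log. Only name/hour combinations that actually occur in the input are printed, so an hour with no entries for a name produces no line rather than a zero.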