FrancisMMM
IS-IT--Management
Hello, I just wrote my second awk script, and it gets slower and slower as it runs.
It parses requests from a Tomcat server log that contains 448478 of them (in a 246 MB file):
Code:
...10000 / 448478 (83 secs)
...20000 / 448478 (91 secs)
...30000 / 448478 (90 secs)
...40000 / 448478 (87 secs)
...50000 / 448478 (86 secs)
...60000 / 448478 (87 secs)
...70000 / 448478 (88 secs)
...80000 / 448478 (90 secs)
...90000 / 448478 (94 secs)
...100000 / 448478 (98 secs)
...110000 / 448478 (94 secs)
...120000 / 448478 (119 secs)
...130000 / 448478 (134 secs)
...140000 / 448478 (153 secs)
...150000 / 448478 (188 secs)
...160000 / 448478 (211 secs)
...170000 / 448478 (226 secs)
...180000 / 448478 (240 secs)
...190000 / 448478 (260 secs)
...200000 / 448478 (253 secs)
...210000 / 448478 (259 secs)
Here is the awk script:
Code:
awk -F'[][]' -v serv="$host" '
BEGIN { cur="dummy" ; c=0 ; num="%06d" }
{
    # new request on this thread: increment its counter
    if ( $0 ~ / startstring /) {
        cur=$4 ;
        f[cur]++ ;
        c++;
        fn=serv"/"cur"-"sprintf(num,f[cur]) ;
    # other lines
    } else {
        if (length($4) > 4 ) {
            cur=$4 ;
            fn=serv"/"cur"-"sprintf(num,f[cur])
        }
        # last line of the request: print the file name
        if ( $0 ~ / endstring /) {
            print fn
        }
    }
    # every line is written to the file of the current request
    print $0 > fn
}
END { print "#TotalRequests="c > "/dev/stderr" }' "$hlog"
The script collects the log lines between the start and end strings, then prints the file name, which is piped to a bash script that runs some small tests and removes the file.
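To make the consuming side concrete, here is a rough sketch of what such a loop could look like. The real bash script is not shown in this post, so everything here is an assumption: the two sample files stand in for the per-request files written by awk, and the "small test" is reduced to a non-empty check.

```shell
#!/bin/sh
# Sketch only: hypothetical stand-ins for the per-request files written by awk.
dir=$(mktemp -d)
printf 'line 1\nline 2\n' > "$dir/http-thread-89-000001"
: > "$dir/http-thread-89-000002"        # an empty request file

ok=0
bad=0
# A here-document keeps the loop in the current shell, so the counters survive it.
while IFS= read -r fn; do
    # Placeholder for the real tests: only check that the file is non-empty.
    if [ -s "$fn" ]; then
        ok=$((ok + 1))
    else
        bad=$((bad + 1))
    fi
    rm -f -- "$fn"                      # the file is removed once it has been tested
done <<EOF
$dir/http-thread-89-000001
$dir/http-thread-89-000002
EOF

echo "ok=$ok bad=$bad"
```

In the real setup the file names would come from the awk output through a pipe (`awk ... | while IFS= read -r fn; do ...; done`) rather than from a here-document.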
$4 is the thread name, something like "http-thread-89".
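For illustration, this is how `-F'[][]'` splits a line so that the thread name ends up in $4. The log line below is made up (the real Tomcat log format is not shown here); it only assumes that the thread name is the second bracketed item on the line.

```shell
#!/bin/sh
# Hypothetical log line, for illustration only.
line='2011-03-01 10:15:42 [INFO] [http-thread-89] startstring GET /index'

# -F'[][]' makes both "[" and "]" field separators, so the fields are:
#   $1 = timestamp, $2 = "INFO", $3 = " ", $4 = thread name, $5 = rest of the line
thread=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $4 }')
echo "$thread"
```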
Sometimes it just stalls, and I don't understand why.
The array f has about 200 entries.
And I don't think it has to do with the bash script, since it is faster to do the whole thing in bash alone!
Since I am a beginner, any help would be appreciated.
Cheers!
PS (edit):
The same awk script with the bash script removed:
Code:
...10000 / 448478 (9 secs)
...20000 / 448478 (8 secs)
...30000 / 448478 (14 secs)
...40000 / 448478 (22 secs)
...50000 / 448478 (35 secs)
...60000 / 448478 (53 secs)
...70000 / 448478 (59 secs)
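Since the slowdown shows up even without the bash script, one thing that may be worth testing is the number of open output files. The script never calls close(), and awk keeps each file opened by `print ... > fn` open until close() is called or the program exits. Whether this is the actual cause here is only an assumption. The sketch below is a variant that closes each request file when its end string is seen; the sample log and the temporary directory are made up for the demonstration.

```shell
#!/bin/sh
# Sketch only: a tiny synthetic log in a temporary directory.
host=$(mktemp -d)
hlog="$host/sample.log"
cat > "$hlog" <<'EOF'
t [INFO] [http-thread-1] startstring req A
t [INFO] [http-thread-1] body A
t [INFO] [http-thread-1] endstring req A
t [INFO] [http-thread-1] startstring req B
t [INFO] [http-thread-1] endstring req B
EOF

files=$(awk -F'[][]' -v serv="$host" '
{
    if ($0 ~ / startstring /) {
        cur = $4
        f[cur]++
        fn = serv "/" cur "-" sprintf("%06d", f[cur])
    } else if (length($4) > 4) {
        cur = $4
        fn = serv "/" cur "-" sprintf("%06d", f[cur])
    }
    print $0 > fn
    if ($0 ~ / endstring /) {
        close(fn)    # flush and release the file before announcing it
        print fn
    }
}' "$hlog")

echo "$files"
```

Unlike the original, this variant writes the last line and closes the file before printing its name, so a consumer reading the names should always find a complete file.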