nawk syntax difficulty

uksnowman · Apr 11, 2001

Hi people,

I am parsing a log file, removing blank lines and certain lines that contain useless information. I can do each of these things separately, but not when I put them together in one script! Could someone please suggest the correct syntax so that the whole log file is parsed correctly? The program is below.

When the script runs, it reads the log file. The while loop works fine and deletes the backspace characters, but blank lines and lines with text in field $2 are not deleted for some reason.
I have also tried to remove non-alpha characters from field $4 at the end of the script. This does not work correctly inside the script, although the same command works fine on its own.

Example log file:-

111.222.333.444 555.666.777.888 21/tcp <unknown>
"starting up"
111.222.333.444 555.666.777.888 21/tcp "guest"

Output file:-

DESTINATION SOURCE PORT USERNAME
111.222.333.444 555.666.777.888 21/tcp unknown
111.222.333.444 555.666.777.888 21/tcp guest

Prog:-

#! /bin/nawk -f

BEGIN {
    printf "\n%15s\t%15s\t%10s\t%10s\n\n", "DESTINATION", "SOURCE", "PORT", "USERNAME"
}

{
    if ( $0 ~ /^$/ ) {} # Do not print blank lines
    elseif ( $2 !~ /[A-Za-z]/) # Ignore any lines with text in field 2
    {
        while ( $4 ~ /\^H/ || $4 ~ /\?/ )
        {
            gsub ( /\"/,"" )
            if ( $4 ~ /^\^H/ )
                sub ( /\^H/, "" )
            sub ( /.\^H/, "" )
            sub ( /\^\?/, "" )
            $4 = "\""$4"\""
        }
        ++num
    }
}
{
    printf "%15s\t%15s\t%10s\t%10s\n", $2, $5, $3
    gsub( /[^A-Za-z0-9<> \t]/, "" ); # Remove all non-alpha chars from username field $4
    printf "%10s\n", $4
}

END {
    print NR, "sessions read."
    print num, "Modifications made."
}

Thanks in advance,
uksnowman

Krunek · Apr 11, 2001

Hi, uksnowman!

Try putting this block

{
    printf "%15s\t%15s\t%10s\t%10s\n", $2, $5, $3
    gsub( /[^A-Za-z0-9<> \t]/, "" ); # Remove all non-alpha chars from username field $4
    printf "%10s\n", $4
}

inside the first block of the main loop.

You can also use a pattern like this (awk has no <> operator, so the field-count test is written with ==)

NF == 4 && $2 !~ /[a-zA-Z]/ { actions }

and your awk program can be simpler.

I haven't tested these solutions, but I hope this helps.

Bye!

KP.
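
Krunek's pattern idea can be sketched roughly as follows. This is only a minimal sketch: it assumes every session record has exactly four fields, as in the example log from the first post, and it writes the test as NF == 4 && ... because awk has no <> operator. The shell variables `log` and `filtered` and the sample data are made up for illustration. Plain `awk` is used here; `nawk` should accept the same program.

```sh
# Hypothetical sample data based on the example log in the first post.
log='111.222.333.444 555.666.777.888 21/tcp <unknown>
"starting up"

111.222.333.444 555.666.777.888 21/tcp "guest"'

# Only records with exactly four fields and no letters in field 2
# reach the action; blank lines and the "starting up" line are skipped.
filtered=$(printf '%s\n' "$log" | awk '
NF == 4 && $2 !~ /[A-Za-z]/ {
    gsub(/[^A-Za-z0-9]/, "", $4)    # strip quotes and angle brackets from the username
    print $1, $2, $3, $4
}')

printf '%s\n' "$filtered"
```

On the sample data this prints the two session lines, with the quotes and angle brackets stripped from the username. Because the pattern does the filtering, no if/else is needed inside the action.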

flogrr · Apr 12, 2001

#! /bin/nawk -f

BEGIN {
    printf "\n%15s\t%15s\t%10s\t%10s\n\n", "DESTINATION", "SOURCE", "PORT", "USERNAME"
}

/^$/ {next} # Do not print blank lines
$2 ~ /[A-Za-z]/ {next} # Ignore any lines with text in field 2

{
    while ( $4 ~ /\^H/ || $4 ~ /\?/ )
    {
        gsub ( /\"/,"" )
        if ( $4 ~ /^\^H/ ) sub ( /\^H/, "" )
        sub ( /.\^H/, "" )
        sub ( /\^\?/, "" )
    }
    gsub( /[^A-Za-z]/,"",$4 ) # Remove non-alpha chars
    $4 = "\""$4"\""
    ++num
}
{
    printf "%15s\t%15s\t%10s\t%10s\n", $2, $5, $3, $4
}

END {
    print NR, "sessions read."
    print num, "Modifications made."
}

flogrr
flogr@yahoo.com
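
For what it's worth, one likely reason the original script never skipped anything is that awk has no `elseif` keyword. Written as one word followed by a space, it is parsed as an unset variable concatenated with the parenthesised test. That forms a complete statement of its own, so the braces on the next line run for every record. The sketch below shows the difference on two made-up lines, one blank and one with text in field 2. The names `sample`, `broken` and `fixed` are only for illustration.

```sh
# Two-line sample (hypothetical data): one blank line and one data line.
sample='
a b'

# "elseif" is just an unset variable here, so the braces after it always run.
broken=$(printf '%s\n' "$sample" | awk '{
    if ($0 ~ /^$/) {}
    elseif ($2 !~ /[A-Za-z]/)
    { n++ }
}
END { print n + 0 }')

# With "else if" the block is skipped for the blank line and for "a b".
fixed=$(printf '%s\n' "$sample" | awk '{
    if ($0 ~ /^$/) {}
    else if ($2 !~ /[A-Za-z]/)
    { n++ }
}
END { print n + 0 }')

echo "elseif: $broken  else if: $fixed"
```

The count should come out as 2 for the `elseif` version and 0 for the `else if` version. The second, unconditional block in the original also prints every record, which is why the version above uses next to drop unwanted lines before any printing happens.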
