Need help in writing a file reading script

Status
Not open for further replies.

SDCSA

Programmer
May 8, 2003
22
US
Hello All!

I am very new to shell scripting.


I want to write a script that reads a log file and outputs the result


A snippet of the log file is shown below. 333.333.333.33 is the IP address. IP addresses are always preceded by AAD@.

------------------
10:54:11 [crelay_child-106762-0]: Reading local IPC message...
10:54:11 [crelay_child-106762-0]: Writing 220 bytes : LOGIN_MSG_RSP[-726510107](1) secu_server-0@222.222.22.22->AAD@333.333.333.33
10:54:12 [crelay_child-106762-0]: Scheduled send message to client on sfd 5
11:54:01 [crelay_child-312350-0]: Read 173 bytes : LOGIN_MSG[-934384907](1) AAD@333.333.333.33->secu_server@222.222.222.22
11:54:01 [crelay_child-312350-0]: Writing 213 bytes : LOGOUT_MSG[-726510107](1) secu_server-0@222.222.222.22->AAD@333.333.333.33


------------------

The log file has many such entries as shown above. The script needs to read only secu_server entries. The line with LOGIN_MSG
represents a login and the line with LOGOUT_MSG represents a logout.

And after everything, Count(Logins) - Count(Logouts) should give the
number of current users.


It would be great if it can also print the login ip addresses and logout ip addresses.
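A minimal sketch of the counting described above, assuming the message type and the addresses sit on the same line as in the snippet. The helper name count_sessions is hypothetical, and matching on "LOGIN_MSG[" is an assumption made here so that LOGIN_MSG_RSP lines are not counted as logins:

```shell
#!/bin/sh
# Sketch only: assumes the message type and the addresses share one line.
# count_sessions is a hypothetical helper name, not something from the log format.
count_sessions() {
    awk '
    # Return the address following "AAD@" on the line, or "" if there is none.
    function aad_ip(line) {
        if (match(line, /AAD@[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/))
            return substr(line, RSTART + 4, RLENGTH - 4)
        return ""
    }
    # Requiring the "[" right after the name excludes LOGIN_MSG_RSP[...] lines.
    /secu_server/ && /LOGIN_MSG\[/  { logins++;  print "login " aad_ip($0) }
    /secu_server/ && /LOGOUT_MSG\[/ { logouts++; print "logout " aad_ip($0) }
    END { printf("current users: %d\n", logins - logouts) }
    ' "$1"
}

# Usage: count_sessions myLogfile
```

This counts every login and logout line, so an address that logs in twice is counted twice.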

I feel this is a tough question. Your help will be greatly appreciated.

Thanks a lot in anticipation!!!

chatguy2020.




 
Something like this should be a good start:

nawk -f parseLog.awk myLogfile

#-------------------- parseLog.awk
BEGIN {
    # split fields on "[" and "]" so that $2 is the crelay_child id
    FS="(\\[)|(\\])"

    # the dots are double-escaped because the pattern is a string, not a regex literal
    PAT_IP="AAD@[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
}

/LOGIN_MSG/ {
    loginID=$2
    logoutID=""
    currA[loginID]++;
    #print "IN->[" $2 "]"
}

/LOGOUT_MSG/ {
    logoutID=$2
    loginID=""
    delete currA[logoutID];
    #print "OUT->[" $2 "]"
}

/secu_server/ {
    if (match($0, PAT_IP)) {
        # skip the 4 characters of "AAD@" to keep only the address
        ipAddr=substr($0, RSTART+4, RLENGTH-4);
        #print "ipAddr->[" ipAddr "]"
        if (logoutID != "")
            arrO[ipAddr]++;
        else
            arrI[ipAddr]++;
    }
}
END {
    for (i in currA)
        printf("currently logged with id->[%s]\n", i);

    printf("\nlogIN ipAddrsses\n");
    for (i in arrI)
        printf("\t[%s] [%d time(s)]\n", i, arrI[i]);

    printf("logOUT ipAddrsses\n");
    for (i in arrO)
        printf("\t[%s] [%d time(s)]\n", i, arrO[i]);
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Thanks a lot vgersh for the script!! It runs and produces the following output:

currently logged with id->[crelay_child-3080230-0]
currently logged with id->[crelay_child-2244630-0]

login ipAddresses
logout ipAddresses

Sorry to say but it is incorrect.

There are some changes in the file.

The lines:
secu_server-0@222.222.22.22->AAD@333.333.333.33
AAD@333.333.333.33->secu_server@222.222.222.22
always come on a fresh line and start with a tab (5th character in vi).

Just above each IP address line is the line that has the message (LOGIN_MSG, LOGOUT_MSG).

i.e. it's like this:

----------------------------------------
10:54:11 [crelay_child-106762-0]: Reading local IPC message...
10:54:11 [crelay_child-106762-0]: Writing 220 bytes : LOGIN_MSG_RSP[-726510107](1)
secu_server-0@222.222.22.22->AAD@333.333.333.33
10:54:12 [crelay_child-106762-0]: Scheduled send message to client on sfd 5
11:54:01 [crelay_child-312350-0]: Read 173 bytes : LOGIN_MSG[-934384907](1)
AAD@333.333.333.33->secu_server@222.222.222.22
11:54:01 [crelay_child-312350-0]: Writing 213 bytes : LOGOUT_MSG[-726510107](1)
secu_server-0@222.222.222.22->AAD@333.333.333.33


------------------



Also, there are messages like LOGIN_MSG_RESPONSE and LOGOUT_MSG_RESPONSE. But we need only LOGIN_MSG and LOGOUT_MSG.

A single IP address may participate in multiple logins and multiple logouts (that's the way the file is!). We need to count distinct logins and distinct logouts and take the difference.

i.e. the result is COUNT(DISTINCT(LOGIN)) - COUNT(DISTINCT(LOGOUT)).
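A rough sketch of that distinct count for the two-line layout, assuming each address line directly follows its message line and that a distinct login or logout means a distinct address after AAD@. The helper name distinct_diff is hypothetical:

```shell
#!/bin/sh
# Sketch only: assumes each address line directly follows its message line.
# distinct_diff is a hypothetical helper name.
distinct_diff() {
    awk '
    # Remember which kind of message the previous line announced.
    /LOGIN_MSG\[/  { pending = "in";  next }
    /LOGOUT_MSG\[/ { pending = "out"; next }

    # Address line right after a LOGIN_MSG or LOGOUT_MSG line.
    pending != "" && /secu_server/ && match($0, /AAD@[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
        ip = substr($0, RSTART + 4, RLENGTH - 4)
        if (pending == "in"  && !(ip in seen_in))  { seen_in[ip] = 1;  n_in++ }
        if (pending == "out" && !(ip in seen_out)) { seen_out[ip] = 1; n_out++ }
    }

    # Any other line ends the pairing, so response messages never count.
    { pending = "" }

    END {
        printf("distinct logins: %d\n", n_in)
        printf("distinct logouts: %d\n", n_out)
        printf("difference: %d\n", n_in - n_out)
    }
    ' "$1"
}

# Usage: distinct_diff myLogfile
```

Requiring the "[" immediately after LOGIN_MSG or LOGOUT_MSG is what keeps the response variants out of the count.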

It may be complex. But I would greatly appreciate the help.

Thanks in anticipation,
sdcsa.


 
OK, I can see the problem here.
Before I post the solution, what identifies a DISTINCT login/logout session?

In a line like this:

11:54:01 [crelay_child-312350-0]: Read 173 bytes : LOGIN_MSG[-934384907](1)

What's the message 'id'? Is it
"-934384907"

or is it

"crelay_child-312350-0"

You need this to map matching login/logout messages.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 

Hello vgersh99:

The distinct logins are nothing but distinct ip addresses as each ip address represents a distinct user.

So in AAD@333.333.333.33, the ip address represents a distinct user. And there is the message line (with LOGIN_MSG, or LOGOUT_MSG) on top of it.

Thanks again.
 
ok, try that one:

#-------------------- parseLog.awk
BEGIN {
    #FS="(\\[)|(\\])"

    # the dots are double-escaped because the pattern is a string, not a regex literal
    PAT_IP="AAD@[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
}

match($0, /LOGIN_MSG[[][-0-9]+[]]/) {
    loginID=$2;
    logoutID=""
    currA[loginID]++;
    #print "IN->[" loginID "]"
}

match($0, /LOGOUT_MSG[[][-0-9]+[]]/) {
    logoutID=$2;
    loginID=""
    delete currA[$2];
    #print "OUT->[" logoutID "]"
}

/secu_server/ {
    if (match($0, PAT_IP)) {
        ipAddr=substr($0, RSTART+4, RLENGTH-4);
        #print "ipAddr->[" ipAddr "]"
        # keep the first address seen for each logout/login id
        if (logoutID != "" && !(logoutID in arrO))
            arrO[logoutID]=ipAddr;
        if (loginID != "" && !(loginID in arrI))
            arrI[loginID]=ipAddr;
    }
}
END {
    for (i in currA)
        printf("currently logged with id->[%s] from ip->[%s]\n", i, arrI[i]);

    printf("\nlogIN ipAddrsses\n");
    for (i in arrI)
        printf("\tlogged IN from [%s] logged OUT from->[%s]\n", arrI[i], arrO[i]);
}


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
vgersh99:

It gave me the following output.

currently logged with id->[[crelay_child-2818-0]:] from ip->[]
currently logged with id->[[crelay_child-40056-0]:] from ip->[]
currently logged with id->[[crelay_child-26910-0]:] from ip->[]
currently logged with id->[[crelay_child-29826-0]:] from ip->[]
currently logged with id->[[crelay_child-36958-0]:] from ip->[]
currently logged with id->[[crelay_child-48582-0]:] from ip->[]

logIN ipAddrsses
logged IN from [] logged OUT from->[]
logged IN from [] logged OUT from->[]
logged IN from [] logged OUT from->[]
logged IN from [] logged OUT from->[]
logged IN from [] logged OUT from->[]
logged IN from [] logged OUT from->[]

I think we don't need to track the crelay.... we just have to track the ip addresses and the messages above them.

Will appreciate if you can make it better.

Thanks again!!
 
#-------------------- parseLog.awk
BEGIN {
    #FS="(\\[)|(\\])"

    # the dots are double-escaped because the pattern is a string, not a regex literal
    PAT_IP="AAD@[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
}

match($0, /LOGIN_MSG[[][-0-9]+[]]/) {
    loginID=$2;
    logoutID=""
    currA[loginID]++;
    #print "IN->[" loginID "]"
}

match($0, /LOGOUT_MSG[[][-0-9]+[]]/) {
    logoutID=$2;
    loginID=""
    delete currA[$2];
    #print "OUT->[" logoutID "]"
}

/secu_server/ {
    if (match($0, PAT_IP)) {
        ipAddr=substr($0, RSTART+4, RLENGTH-4);
        #print "ipAddr->[" ipAddr "]"
        # keep the first address seen for each logout/login id
        if (logoutID != "" && !(logoutID in arrO))
            arrO[logoutID]=ipAddr;
        if (loginID != "" && !(loginID in arrI))
            arrI[loginID]=ipAddr;
    }
}
END {
    for (i in currA)
        printf("currently from ip->[%s]\n", arrI[i]);
}


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
vgersh99:

Sorry for the trouble.. The latest script doesn't output anything. It would be great if you could look into it.

Thanks a lot again!
 
Given your latest sample file, you're right: the script outputs nothing. That means there are NO current 'LOGIN' sessions, which is correct.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Hello again!!

With a bigger file, it now outputs this:

currently from ip->[]
currently from ip->[]
currently from ip->[]

Could this be changed? As I said, we need to track LOGIN_MSG and LOGOUT_MSG. Again, the file looks like this:

----------------------------------------
10:54:11 [crelay_child-106762-0]: Reading local IPC message...
10:54:11 [crelay_child-106762-0]: Writing 220 bytes : LOGIN_MSG_RSP[-726510107](1)
secu_server-0@222.222.22.22->AAD@333.333.333.33
10:54:12 [crelay_child-106762-0]: Scheduled send message to client on sfd 5
11:54:01 [crelay_child-312350-0]: Read 173 bytes : LOGIN_MSG[-934384907](1)
AAD@333.333.333.33->secu_server@222.222.222.22
11:54:01 [crelay_child-312350-0]: Writing 213 bytes : LOGOUT_MSG[-726510107](1)
secu_server-0@222.222.222.22->AAD@333.333.333.33


------------------

We need to track the IP addresses on lines that are preceded by a line containing LOGIN_MSG (for logins) or LOGOUT_MSG (for logouts).

There is a change in the technique. The script has to track IP addresses. If a new IP address is found
and the line above it has LOGIN_MSG, then the user is logged in. If the same IP address
is found again (with LOGIN_MSG above it), that makes no difference (multiple LOGIN_MSGs can exist for one address). Similarly, when an IP address is found with LOGOUT_MSG on the line above it,
that user is no longer logged in. It's like deleting the user from the list.

Finally, we just need to print the distinct logged-in IP addresses and the count.
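The add-on-login, delete-on-logout idea can be sketched as below. It assumes every address line directly follows its message line and contains secu_server, and that a logout for an address that never logged in is simply ignored. The helper name active_users is hypothetical:

```shell
#!/bin/sh
# Sketch only: keeps a set of addresses, adding on login and deleting on logout.
# active_users is a hypothetical helper name.
active_users() {
    awk '
    # Remember which kind of message the previous line announced.
    /LOGIN_MSG\[/  { pending = "in";  next }
    /LOGOUT_MSG\[/ { pending = "out"; next }

    # Address line right after a LOGIN_MSG or LOGOUT_MSG line.
    pending != "" && /secu_server/ && match($0, /AAD@[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
        ip = substr($0, RSTART + 4, RLENGTH - 4)
        if (pending == "in")
            active[ip] = 1      # a repeated login changes nothing
        else
            delete active[ip]   # harmless if the address is not in the set
    }

    # Any other line ends the pairing, so response messages never count.
    { pending = "" }

    END {
        n = 0
        for (ip in active) { print ip; n++ }   # order is unspecified in awk
        printf("logged in: %d\n", n)
    }
    ' "$1"
}

# Usage: active_users myLogfile
```

Because the set is keyed by address, repeated LOGIN_MSG entries for one address collapse automatically, and the final count is just the number of keys left.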


Thanks very much for all the help till now and it would be great if you can resolve this
 