awk pattern matching

Status
Not open for further replies.

SpongeBob1

Hi,

I need to use several pattern matches such as

/pattern/
in my script. However, it seems that when I write the first one, then write some awk statements, then open the second one, I get an error. How is it possible to use several patterns?

P.S. I am able to do:

/pattern1/ { print }
/pattern2/ { print }

however when things get more complex I have problems...
Thanks.
 

You should be able to use multiple patterns. Post a script that gets an error. CaKiwi
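
A minimal sketch of that point, assuming made-up input lines and the placeholder patterns alpha and beta (neither comes from the original script): awk tests every pattern-action rule against each input line in turn, so a line matching both patterns triggers both actions.

```shell
# Hypothetical sample input; alpha and beta are placeholder patterns.
out=$(printf '%s\n' 'alpha one' 'beta two' 'alpha beta three' |
awk '
/alpha/ { print "rule 1:", $0 }   # runs for every line matching alpha
/beta/  { print "rule 2:", $0 }   # runs independently for every line matching beta
')
printf '%s\n' "$out"
```

The last input line is printed twice, once by each rule, which shows that several patterns can coexist in one script.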
 
This is what I am testing with:

/patten1/ { for (i=1; i < NF; ++i);
split ($0, a, " ")
print a[6]

}

The next step here is to do another pattern search to find a new value, but when I modify the script to be something like

/patten1/ { for (i=1; i < NF; ++i);
split ($0, a, " ")
print a[6]

}
/patten2/ { for (i=1; i < NF; ++i);
split ($0, b, " ")
print b[6]

}

It does not return a value... I'm not sure how to do this....
 

How are you running it and what is the input? CaKiwi
 
I'm running it as...

awk -f awkscript log

The input is completely segmented... Basically I wish to get one value, as I'm doing in this part:

/uid:/ { for (i=1; i < NF; ++i);
split ($0, a, " ")
print a[6]
}

Then I wish to use the value in this array a[6] to cross reference another line in the log file, such as....

[02/10/2002:15:15:17 : 'kelly' failed

I wish to get this user ID ( kelly ) in the above part of script & then cross reference in the part of the log;

[02/10/2002:15:15:17 : 'kelly' failed

by the pattern /failed/

 

I'm still a bit lost here.

Firstly, the for statement does nothing.

Secondly, there is no string uid: in your input.

Thirdly, even if a line matched, there are only 4 fields, so a[6] will not contain anything after the split.

What am I missing here? CaKiwi
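
The first and third points can be checked with a small sketch. It assumes the sample line posted above, and it matches on /failed/ rather than /uid:/ because that line contains no uid: string.

```shell
# The sample line from the post above.
line="[02/10/2002:15:15:17 : 'kelly' failed"
out=$(printf '%s\n' "$line" | awk '
/failed/ {
    for (i = 1; i < NF; ++i);      # trailing semicolon: the loop body is empty
    n = split($0, a, " ")          # a single space splits on runs of whitespace
    print "fields=" n
    print "a[3]=" a[3]
    print "a[6]=[" a[6] "]"        # no sixth field, so this is the empty string
}')
printf '%s\n' "$out"
```

The user ID is in a[3] (still wrapped in its quotes), and a[6] is empty because split() only filled four elements.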
 
OK, I can see where you're coming from... apologies. Firstly, the for was for testing purposes....

so in reality I am using;

/uid:/ {
split ($0, a, " ")
print a[6]
}

The pattern /uid/ filters out all of the uids in the log.file ( this is a really big file); so I get a list of names such as;

kelly
brown
etc.

( these are stored in a[6] )

From this I am hoping to use the array - in this case a[6] and use this to reference the second requirement in my script; that is to get the line containing;

[02/10/2002:15:15:17 : 'kelly' failed

which are 'kelly' - uid
and the pattern - failed.

This is the part I am unsure how to do, all I know is that since I have the uid, I hope to match it to failed somehow, as these are good search materials in this case....

 

Post a larger sample of your input.
Does it contain 'uid: ... kelly' followed by all the 'kelly failed' records, followed by a 'uid: ... brown' record, followed by all the 'brown failed' records, etc.? If not, does the uid always come before the failed records for each user?
CaKiwi
 
From what sense I can make of this, I would say that "awk" is not the best tool to use. Try this:
Code:
#! /bin/ksh
grep 'uid:' useridfile | cut -d' ' -f6 | sort -u | while read uid
do
   grep "'$uid' failed" logfile
done

The grep | cut combo provides the 6th field of lines containing "uid:", which I think is what you were doing with the split subroutine. Feeding the results into "sort -u" means you will get only one copy of each user id. The while loop reads them in one at a time and searches for any "failed" lines associated with them.
 
daedalus,
While you may be able to speak authoritatively for your
abilities with awk, and what you would do rather than use awk, I think you do the language a disservice by supposing
that the above code can do something that gawk/nawk is not capable of doing, or do it better.

You should probably take a look at cakiwi's posts, and vgersh's for the level of expertise here and what problems
gawk/nawk has been applied to successfully.
 
Hi,


The log file I'm working with appears as follows;

[07/Oct/2002: 12:01:03] uid : 'kelly'
[07/Oct/2002: 12:01:03] Login from server 'qa1sa'
[07/Oct/2002: 12:01:03] IP > 100 : "101.101.101.101"
[07/Oct/2002: 12:01:04] result : 'kelly' Failed.

This is an excerpt..... Basically it gives the uid value each time, then follows it with the Failed value....
So I'm thinking, if I can get the /uid/ value, store it in an array as in the above code;

/uid:/ {
split ($0, a, " ")
print a[6]
}


I can use the array value to match against


[07/Oct/2002: 12:01:04] result : 'kelly' Failed.

the value 'kelly' and the /Failed/. I hope this makes it clearer to everyone, and I appreciate any help I have been receiving.... Perhaps there is an easier way of doing this, but I think 'getline' is perhaps out of the question as the log is so broken up.....


Thanks
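
One possible single-pass approach, sketched under assumptions rather than offered as a definitive answer: remember each user ID when its uid line is read, then print a Failed line only if the quoted name on it was seen earlier. The first four log lines are the excerpt above. The two 'brown' lines, including the "OK." result text, are hypothetical additions to show a user who does not fail. The array names seen and part are illustrative. Because the excerpt writes uid followed by a space and a colon, the pattern allows optional spaces.

```shell
# Sample log: the first four lines are the excerpt from the post;
# the two "brown" lines are hypothetical, added to show a non-failing user.
log=$(mktemp)
cat > "$log" <<'EOF'
[07/Oct/2002: 12:01:03] uid : 'kelly'
[07/Oct/2002: 12:01:03] Login from server 'qa1sa'
[07/Oct/2002: 12:01:03] IP > 100 : "101.101.101.101"
[07/Oct/2002: 12:01:04] result : 'kelly' Failed.
[07/Oct/2002: 12:02:10] uid : 'brown'
[07/Oct/2002: 12:02:11] result : 'brown' OK.
EOF

out=$(awk '
/uid *:/ {                          # uid line: remember the quoted user ID
    split($0, part, "\047")         # \047 is the single-quote character
    seen[part[2]] = 1
    next
}
/Failed/ {                          # failure line: report it if its user ID was seen
    split($0, part, "\047")
    if (part[2] in seen)
        print part[2] ": " $0
}' "$log")
rm -f "$log"
printf '%s\n' "$out"
```

Splitting on the quote character takes the name from between the quotes, so the position of the field on the line does not matter, and no getline is needed because each rule only looks at the current line.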
 
Marsd: Perhaps you speak a different version of English than I do, in which the phrase "not the best tool to use" means "it cannot be done that way", but that is not at all what I meant. I am quite aware of the abilities of the awk family, and freely acknowledge that the programming abilities of CaKiwi, Vgersh, and others exceed my own in awk, and in Unix scripting. However, it seemed to me that it was EASIER to do this in the few short lines of Unix script I used above than to pull out the big guns of awk for it. If I have misinterpreted the problem, I apologize for wasting everyone's time, but if the script accomplishes the purpose, why do you object to my posting it?
 
Daedalus,
You have misunderstood, I did say "do it better."

When you say that "awk is not the best tool to use",
and then proffer a shell script the logical assumption
is that you feel the shell is superior to an awk solution
for the problem you address. Your solution works, the benefit over awk is unclear.

As for "pulling out the big guns of awk.." I am afraid
I don't follow you. Awk is as easy to use as the shell
and the syntax is simpler and more "traditional" than
most shells IMO. Awk is even easier to use in conjunction
with the shell replacing "cut" and "grep" with more
concentrated efficiency for the problem at hand.
 
For a detailed discussion in a similar thread, see

thread271-376879

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
 
Yeah, I read it, interesting, but inconclusive.
Some of the arguments were silly though, sed is no more
a part of the shell than awk or ed or vi, etc..

 
yeah, I guess the 'jury' is still deliberating.
Interesting discussion though regardless of what might sound silly, but... interesting nevertheless.

Just to prove that nothing is black&white anymore ;) vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
 
Quite frankly, I don't see the point in arguing what is or is not a part of shell. To me they are all just tools to be pulled out as needed. As for what I feel is "superior" about my script: After coming to a conclusion about what I thought Spongebob1 was after (I make no claim that I was right on that), I was able to immediately write it out, based on a few tools, and it is conceptually straightforward. While it is possible to do the same thing within awk, every method I can think of is either longer, or relies on sly tricks (such as the very neat one involving "|" that Vlad gives in another Spongebob1 thread). I admit that the "grep | cut" combo could have been replaced with "awk '/uid:/ {print $6}'" to get something even more straightforward, but I do not apologize for giving the script as it was. (I chose the "grep | cut" combo merely to show that there are other ways to accomplish this task of searching and printing the 6th field.) Nor do I apologize for offering my opinion, though you seem to think one so lowly as I should not ever deign to do so! And until I see a simpler solution in awk alone, I will still hold to that opinion.
 
I never attacked you or your skills personally; you are reading a lot into my opinion that you are the best expert
where you are concerned.

This is an awk forum.

I did not remember whether or not you had posted before
and thought that you could benefit from seeing the types
of solutions that awk programmers on the forum had dealt
with using awk, rather than using the shell for whatever
reason.
I apologize for my post's brusqueness.
As far as sequential searches after a match go, how's this
for simple?

function searchafter(pat) {
while (getline) {
if ($0 ~ pat) {
print $0
}
}
print "Done with function search."
return
}

/stuff/ { searchafter("1") }
/33/ { searchafter("pauline") }
/12/ { searchafter("ed") }

 
Marsd: I apologise for being so thin-skinned. It's not been a good week. I have a question about your routine, and had to wait until I had time to test it. What I find is that the first time searchafter is called, it reads in the rest of the file, leaving nothing for the rest of the routine to act on, so the program ends after that first call. I don't see how you could reset the datastream either, without the routine falling into an infinite loop.
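
The behaviour described here can be reproduced with a small sketch. The input lines and patterns are made up for illustration, and the function is a simplified variant of searchafter, not the exact routine posted above. A plain getline reads the next record of the main input, so the loop inside the function runs to end of file and later rules never see the records it consumed.

```shell
# Hypothetical four-line input; the second rule would match "pauline 33".
out=$(printf '%s\n' 'stuff' 'ed 12' 'pauline 33' 'last line' |
awk '
function searchafter(pat) {
    while ((getline) > 0)           # each plain getline reads the NEXT record of the main input
        if ($0 ~ pat) print "match: " $0
    print "input exhausted at record " NR
}
/stuff/ { searchafter("ed") }
/33/    { print "second rule ran" } # never runs: that record was consumed by getline
END     { print "END after " NR " records" }
')
printf '%s\n' "$out"
```

The record "pauline 33" is read inside the loop while the first rule is still running, so the /33/ rule never gets a chance to act on it.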

As for this being an awk forum, I don't see that this should prevent the occasional suggestion that a different approach should work. I have seen a few posts on the UNIX Scripting forum giving Perl scripts, and never thought that this was inappropriate.
 
On more mature consideration, and after running my head into a brick wall on this one, your solution IS probably the easiest.

You are right about the function call:
One way to reset it is to make FNR = 1 and use: while (getline var) { if (var ~ pat) { print; break } }. However, this invokes some weird behavior.

So second best is this, still not too
complicated, but it rereads the file
'n' times.

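# Read fname into arra, then print the first line (in file order) matching pat.
# arra, x and f are local variables.
function findarray(fname, pat,    arra, x, f) {
    print "Looking for match to:", pat
    x = 0
    while ((getline < fname) > 0) {
        arra[x++] = $0
    }
    close(fname)    # lets the next call reread the file from the start

    # Index numerically: "for (f in arra)" visits elements in unspecified order.
    for (f = 0; f < x; f++) {
        if (arra[f] ~ pat) {
            print arra[f]
            break
        }
    }
    return
}

BEGIN {
    fname = ARGV[1]
    pat = ARGV[2]
    a = split(pat, loc)    # split() fills loc[1] .. loc[a]
    quit = 1
    while (quit <= a) {
        findarray(fname, loc[quit++])
    }
}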

Output:
>awk -f s.awk code_sam* "for proc proc"
Looking for match to: for
[format %d "0x$val$out"]
Looking for match to: proc
proc createtable {arr list1 list2} {
Looking for match to: proc
proc createtable {arr list1 list2} {

real 0m0.008s
user 0m0.010s
sys 0m0.000s

Pretty fast still.
 