
grab a string in a line 1

Status
Not open for further replies.

dvknn

IS-IT--Management
Mar 11, 2003
94
US
Hi,

I have a bunch of lines that look like this:

2002.12.11 11:33:05 [Thread-421] gtm1dev.nts.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (saree) - FILTER: 1

2003.04.07 12:01:50 [Thread-8637] toa1.sac.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (NEW SOFTWARE) - FILTER: 1

2003.04.07 12:02:15 [Thread-8435] toa1.sac.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (metal edging strips) - FILTER: 1


How can I extract the string between the "(" and ")"?

Thank You
 
nawk -F'[()]' '{ print $2 }' myFile.txt

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
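For anyone wondering why that one-liner works, here is a minimal sketch. It assumes a plain POSIX awk (on Solaris, substitute nawk as above); the sample line is shortened from the question and the hostname "host" is made up. The field separator is the bracket expression [()], so both "(" and ")" split the line, and the text between them lands in $2.

```shell
# Sample line in the format from the question (hostname shortened; hypothetical).
line='2003.04.07 12:02:15 [Thread-8435] host [5] gemstone Server.keywordSearch NO KEYWORD MATCH: (metal edging strips) - FILTER: 1'

# FS is the bracket expression [()], so "(" and ")" each split the record:
#   $1 = text before "(", $2 = text between "(" and ")", $3 = text after ")".
extracted=$(printf '%s\n' "$line" | awk -F'[()]' '{ print $2 }')
printf '%s\n' "$extracted"
```

Square brackets elsewhere in the line, such as [Thread-8435], are not separators, so they do not affect the result.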
 
Thanks, man.

I have a bigger problem. I have a file which has data like this:

2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (234eqewqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1


Now, I would like to extract the DATE and the word between "(" and ")" (example: abc).

If you look closely, the word is repeated, so I would like to avoid the repetitions.

The result should go into a file. The output file should look like:

2002.12.11, saree
2002.12.11, abc
2002.12.12, 234eqewqe
2002.12.12, 1234wqe
2003.04.07, fiber preform

How could I do that?

I would REALLY appreciate it if you could help me with this. (I am not at all a shell script person, so I need your help.)

Thank You,
 
Something like this would do it:

nawk -f a.awk myText.txt | uniq

#-------------- a.awk
BEGIN {
  FS="[()]"   # split each line at "(" and ")"
  OFS=","
}

{
  split($1, a, " ");   # a[1] is the date
  print a[1],$2        # $2 is the text inside the parentheses
}
vlad
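One thing worth knowing about the pipeline above: uniq only collapses duplicate lines that are adjacent. The sample data happens to be grouped that way, but if repeats can be scattered through the file, the duplicates can be dropped inside awk instead. The sketch below is only an illustration under assumptions: it uses plain awk rather than nawk, a scratch directory from mktemp, a made-up reordering of the sample data, and the output file name NKM_ERR.csv borrowed from later in the thread.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cat > "$workdir/NKM.err" <<'EOF'
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
EOF

awk -F'[()]' '
{
    split($1, a, " ")            # a[1] is the date, the first word before "("
    key = a[1] ", " $2           # e.g. "2002.12.11, saree"
    if (!(key in seen)) {        # print each date/keyword pair only once
        seen[key] = 1
        print key
    }
}' "$workdir/NKM.err" > "$workdir/NKM_ERR.csv"

cat "$workdir/NKM_ERR.csv"
```

The seen array remembers every pair already printed, so the second "saree" line is skipped even though an "abc" line sits between the two.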
 
Could you please explain?

I tried to run:

#!/usr/bin/ksh -p

#. $HOME/.profile

nawk -f a.awk NKM.err | uniq

#-------------- a.awk
BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

But I did not get any results.
 
1. Did you save the awk script in a file named a.awk?
2. Do you have 'nawk' on your system? [What OS are you running?]
3. Did you run 'nawk -f a.awk yourTextFile'?
4. Do you have 'uniq' on your system?
5. Please post your NKM.err file again.

vlad
 
1. Yes, I did.
2. I am running this script on a Solaris box. (I use 'nawk' in some other scripts, and they run OK.)
3. I ran the script as you mentioned:

#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

#-------------- a.awk
BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

4. What is uniq?

5. My NKM.err:

2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (234eqewqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1

Thanks,
 
Mine works fine with your sample data file and your script.
I hope you didn't paste the content of a.awk into your ksh script, did you?

Your ksh script should look like:
#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

vlad
 
OK, I saved this as 'kwdtest':

#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

Then I had another file, a.awk, with:

BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

I get:

-p^M: bad option(s)

(I removed the -p option and executed kwdtest. I got no results at all.)

Thanks
 
Strange, I have it working here just fine.

Let's try running it from the shell prompt:

nawk -f a.awk NKM.err

What do you see? Anything?
I noticed a ^M (carriage return) in the error from your ksh script. Make sure there are no stray control characters in either a.awk or NKM.err.

vlad
 
I get ...

nawk: syntax error at source line 1
context is
<<< BEGIN >>> {
nawk: illegal statement at source line 1
nawk: syntax error at source line 2
 
REMOVE the control characters from a.awk: you have trailing ^M (carriage return) characters in the awk script source.

If I put a trailing ^M in a.awk, I get the same error. Remember to transfer your files from the PC to UNIX with ftp in 'ascii' mode.

vlad
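If re-transferring the file is not convenient, the carriage returns can also be stripped on the UNIX side. This is a sketch only: the a.awk below is a made-up stand-in written with DOS line endings into a scratch directory, and the output name a.awk.unix is hypothetical. It relies on nothing beyond printf, grep and tr.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)

# Simulate an a.awk that was saved with DOS (CRLF) line endings.
printf 'BEGIN {\r\nFS="[()]"\r\nOFS=","\r\n}\r\n' > "$workdir/a.awk"

# Count the lines that contain a carriage return (the ^M from the error).
cr=$(printf '\r')
before=$(grep -c "$cr" "$workdir/a.awk" || true)

# Delete every carriage return and write a cleaned copy.
tr -d '\r' < "$workdir/a.awk" > "$workdir/a.awk.unix"
after=$(grep -c "$cr" "$workdir/a.awk.unix" || true)

echo "lines with CR before: $before, after: $after"
```

The same tr command applies to the ksh script and to NKM.err if they came from the PC the same way.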
 
I am SOOOOOOOO thankful to you for patiently working through this..

Now, I gotta move this script UP in our environment (to production) and test it there again..
 
Glad to be of help.

Just be aware that this awk script makes certain assumptions about the data source file. One of them is that the desired value is the FIRST expression inside parentheses on the line. If a line has MULTIPLE parenthesized expressions, you'll have to change the a.awk source.

There might be other ways to achieve the same results even within awk - I can think of at least 2 more.
vlad
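To illustrate the multiple-parentheses caveat, here is one possible way to collect every parenthesized expression on a line rather than only the first. It is a sketch under assumptions, not a change to a.awk itself: the sample line with two expressions is invented, the "|" joiner is an arbitrary choice, and plain awk stands in for nawk.

```shell
# Hypothetical line with two parenthesized expressions.
line='2003.04.07 Search NO KEYWORD MATCH: (form thread) - NOTE: (retry) - FILTER: 1'

all=$(printf '%s\n' "$line" | awk '
{
    rest = $0
    out = ""
    # Repeatedly find the next "(...)" and keep the text inside it.
    while (match(rest, /\([^)]*\)/)) {
        inner = substr(rest, RSTART + 1, RLENGTH - 2)
        out = (out == "" ? inner : out "|" inner)
        rest = substr(rest, RSTART + RLENGTH)
    }
    print $1 ", " out
}')
printf '%s\n' "$all"
```

Each pass through the loop consumes the line up to the end of the match, so the next call to match() starts after the previous closing parenthesis.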
 
Vlad, thanks a bunch! Could you please show me the other 2?

I also have a script that does the same thing as yours, but it was cumbersome, and it did not take into account that multiple words can appear within the (), as in (form thread). It assumed there would always be only one word within the ().

Thanks a bunch again.
 
OK, here's one:

#-------------------- a1.awk
BEGIN {
  OFS=","
  FLDvalue=6   # field number holding "(word)" in the short sample format
}

{
  # strip the parentheses; this only handles a single-word value
  gsub("[()]", "", $FLDvalue)
  print $1,$FLDvalue
}


# here's the other [perverted] - a2.awk
BEGIN {
  OFS=","
  PATvalue="\\([^)]*\\)"   # first "(...)" on the line
}

{
  value=""
  if (match($0, PATvalue))
    # drop the surrounding parentheses but keep spaces inside the value
    value=substr($0, RSTART+1, RLENGTH-2)
  print $1,value
}

vlad
 
Or you can just pass your file to the cut command:

cut -d"(" -f2 "$file" | cut -d")" -f1

I'm not an eloquent scripter, but cut at least doesn't care where the parentheses are in the line, nor does it care how many words are inside them. The final result will have no parentheses.
 