grab a string in a line 1

dvknn · Apr 8, 2003

Hi,

I have a bunch of lines that look like this:

2002.12.11 11:33:05 [Thread-421] gtm1dev.nts.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (saree) - FILTER: 1

2003.04.07 12:01:50 [Thread-8637] toa1.sac.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (NEW SOFTWARE) - FILTER: 1

2003.04.07 12:02:15 [Thread-8435] toa1.sac.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (metal edging strips) - FILTER: 1

How can I take the string between the "(" and ")" ??

Thank You

vgersh99 · Apr 8, 2003

nawk -F'[()]' '{ print $2}' myFile.txt

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
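
To spell out what that one-liner does: -F'[()]' makes either parenthesis a field separator, so everything before the "(" is $1 and the text between the parentheses is $2. A small sketch, assuming a POSIX awk is installed as awk (on Solaris that would be nawk, as above). The function name extract_keyword is made up for the illustration.

```shell
# extract_keyword: print the text inside the first (...) on each input line.
# Assumes "awk" is a POSIX awk; on Solaris use nawk instead.
extract_keyword() {
    awk -F'[()]' '{ print $2 }'
}

# One of the log lines from the question; this prints: metal edging strips
printf '%s\n' '2003.04.07 12:02:15 [Thread-8435] toa1.sac.fedex.com [5] gemstone DocumentDeterminationServer.keywordSearch NO KEYWORD MATCH: (metal edging strips) - FILTER: 1' | extract_keyword
```

The square brackets elsewhere in the line do not matter, because only "(" and ")" are in the separator set.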

dvknn · Apr 8, 2003

Thanks maan,

I have a bigger problem. I have a file which has data like this:

2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (234eqewqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1

Now, I would like to extract the DATE and the word between "(" and ")" (example : abc).

If you observe, the word is being repeated. So, I would like to avoid the repetitions.

And output into a file. The output file should look like:

2002.12.11, saree
2002.12.11, abc
2002.12.12, 234eqewqe
2003.04.07, fiber preform

How could I do that?????

I would REALLY appreciate it if you could help me with this. (I am not at all a shell script person. So, I need your help)

Thank You,

vgersh99 · Apr 8, 2003

something like this would do it:

nawk -f a.awk myText.txt | uniq

#-------------- a.awk
BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}
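
Two details of the script above are worth knowing. With OFS="," the output is "2002.12.11,saree" (no space after the comma), and uniq only drops duplicate lines that are adjacent, which is fine here because the log is in date order. A sketch of a variant that does the de-duplication inside awk and writes ", " as asked in the question. It assumes a POSIX awk installed as awk (nawk on Solaris); the names date_keyword, key and seen are made up for the example.

```shell
# date_keyword: print "DATE, keyword" once per distinct pair, in first-seen order.
# Assumes "awk" is a POSIX awk (nawk on Solaris).
date_keyword() {
    awk -F'[()]' '
    {
        split($1, a, " ")          # a[1] is the date, the first word of $1
        key = a[1] ", " $2         # e.g. "2002.12.11, saree"
        if (!(key in seen)) {      # print each pair only the first time it appears
            seen[key] = 1
            print key
        }
    }' "$@"
}

# usage (output file name is only an example):
#   date_keyword NKM.err > NKM_ERR.csv
```

Because the pairs are remembered in an array, duplicates are dropped even when they are not on consecutive lines.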

dvknn · Apr 8, 2003

could you please explain??

I tried to run:

#!/usr/bin/ksh -p

#. $HOME/.profile

nawk -f a.awk NKM.err | uniq

#-------------- a.awk
BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

But, I did not get any results..

vgersh99 · Apr 8, 2003

1. you saved the awk script in a file a.awk?
2. you have 'nawk' on your system? [what OS are you running?]
3. you ran 'nawk -f a.awk yourTextFile'?
4. you have 'uniq' on your system?
5. post your NKM.err file again pls!

dvknn · Apr 8, 2003

1. Yes, I did
2. I am running this script on a Solaris box (I have some 'nawk' in some other scripts..they run OK..)
3. I ran the script as you have mentioned..

#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

#-------------- a.awk
BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

4. What is uniq??

5.My NKM.err

2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (saree) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.11 Search NO KEYWORD MATCH: (abc) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (234eqewqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2002.12.12 Search NO KEYWORD MATCH: (1234wqe) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1
2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1

Thanks,

vgersh99 · Apr 8, 2003

mine works fine with your data sample file and your script.
I hope you didn't paste the content of a.awk into your ksh script, did you?

your ksh script should look like:
#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

dvknn · Apr 8, 2003

OK..I saved as 'kwdtest'

#!/usr/bin/ksh -p
#. $HOME/.profile
if [[ -a NKM_ERR.csv ]]
then
rm -f NKM_ERR.csv
elif [[ -a NKM1_CSV.csv ]]
then
rm -f NKM1_CSV.csv
fi

nawk -f a.awk NKM.err | uniq

then, I had another file a.awk with

BEGIN {
FS="[()]"
OFS=","
}

{
split($1, a, " ");
print a[1],$2
}

I get a

-p^M: bad option(s) (i removed the -p option and executed kwdtest. I got no results at all...)

Thanks

vgersh99 · Apr 8, 2003

strange I have it working here just fine.

let's try to run it from the shell prompt as:

nawk -f a.awk NKM.err

What do you see? Anything?
I've noticed you had some ^M in your ksh script. Make sure you don't have any stray control chars (such as ^M) in either a.awk or NKM.err

dvknn · Apr 8, 2003

I get ...

nawk: syntax error at source line 1
context is
<<< BEGIN >>> {
nawk: illegal statement at source line 1
nawk: syntax error at source line 2

vgersh99 · Apr 8, 2003

REMOVE any control chars from a.awk - you have trailing ^M in the awk script source.

If I put trailing ^M in a.awk - I get the same error. Remember to ftp transfer your files from PC to UNIX in 'ascii' mode.
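
The ^M is a carriage return left over from DOS/Windows line endings. If re-transferring in ascii mode is not convenient, the carriage returns can also be stripped on the UNIX side. A sketch using the standard tr command; the function name strip_cr and the .unix file name are made up for the example.

```shell
# strip_cr: copy stdin to stdout with every carriage return (\r, shown as ^M) removed.
strip_cr() {
    tr -d '\r'
}

# usage (file names are only examples):
#   strip_cr < a.awk > a.awk.unix && mv a.awk.unix a.awk
```

The same cleanup applies to the ksh script itself, which is where the "-p^M: bad option(s)" message came from.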

dvknn · Apr 8, 2003

I am SOOOOOOOO thankful to you for patiently working through this..

Now, I gotta move this script UP in our environment (to production) and test it there again..

vgersh99 · Apr 8, 2003

glad to be of help.

Just be aware that this awk script makes certain assumptions about the data source file. One of them is that the desired value is the FIRST expression inside parentheses on the line. If a line has MULTIPLE parenthesized expressions you'll have to change the a.awk source.

There might be other ways to achieve the same results even within awk - I can think of at least 2 more.
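
To illustrate the caveat about multiple parenthesized expressions: with FS="[()]" the first group is always $2, the second is $4, and so on. If the wanted value were the LAST group instead, one possible change is to count back from the end of the line. This is only a sketch, assuming a POSIX awk installed as awk (nawk on Solaris) and no nested parentheses; the function name last_keyword is made up.

```shell
# last_keyword: print the text inside the LAST (...) pair on each line.
# Lines with no complete (...) pair have fewer than 3 fields and are skipped.
last_keyword() {
    awk -F'[()]' 'NF >= 3 { print $(NF-1) }'
}

# prints: second
printf '%s\n' 'MATCH: (first) then (second) - FILTER: 1' | last_keyword
```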

dvknn · Apr 8, 2003

Vlad....Thanks a bunch...Could you please show me other 2??

I too have a script which did the same as yours, but it was cumbersome, and it did not take into consideration that multiple words can exist within the () like (form thread). It had assumed that there would always be only one word within the ().

Thanks a bunch again...

vgersh99 · Apr 9, 2003

ok, here's one:

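#-------------------- a1.awk
BEGIN {
OFS=","
# position of the "(keyword)" field when splitting on whitespace
FLDvalue=6
}

{
# note: a multi-word keyword such as "(fiber preform)" spans several
# fields, so only its first word survives here
gsub("[()]", "", $FLDvalue)
print $1,$FLDvalue
}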

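# here's the other [perverted] - a2.awk
BEGIN {
OFS=","
# match from the colon through the first parenthesized group;
# the parens are doubly escaped because this is a string, not a /regex/
PATvalue=":[^(]*\\([^)]*\\)"
}

{
value=""
num=match($0,PATvalue)
if (num)
value=substr($0, RSTART, RLENGTH)
# strip the leading ": (" and the trailing ")" but keep spaces inside the keyword
gsub("^[^(]*\\(|\\)$", "", value)
print $1,value
}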

caallison · Apr 9, 2003

Or you can just pass your file into the cut command.

cat $file | cut -d"(" -f2 | cut -d")" -f1

I'm not an eloquent scripter. Cut at least doesn't care where the parentheses are in the line though nor does it care how many words are bounded by the parentheses. The final result will have no parentheses however.
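
One thing to watch with the cut approach: by default cut passes through, unchanged, any line that does not contain the delimiter, so log lines without parentheses would show up whole in the result. The standard -s option suppresses such lines. A sketch, with the function name keyword_with_cut made up for the example:

```shell
# keyword_with_cut: extract the text between the first "(" and the next ")" using cut only.
# -s drops lines that contain no "(" at all; without it cut prints them unchanged.
keyword_with_cut() {
    cut -s -d'(' -f2 | cut -d')' -f1
}

# prints: fiber preform
printf '%s\n' '2003.04.07 Search NO KEYWORD MATCH: (fiber preform) - FILTER: 1' | keyword_with_cut
```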
