Printing a certain string from multiple locations

Status
Not open for further replies.

preaves (Programmer), US, Jun 18, 2007
Hello, I have a very large document that I need to extract a certain string from. The only problem is that the string is located in multiple fields, like $12, $17 and so on. I want to print only that string from each record. Is there any way to do this?
 
Hi

I'm not sure what you want. A few attempts:
Code:
awk '$12~/whatever/ || $17~/whatever/' /input/file

awk 'BEGIN{split("12 17",a," ")}{for(i in a)if($a[i]~/whatever/){print;next}}' /input/file

awk '$12~/whatever/{print $12} $17~/whatever/{print $17}' /input/file

awk 'BEGIN{split("12 17",a," ")}{for(i in a)if($a[i]~/whatever/)print $a[i]}' /input/file

Feherke.
 
OK, I will try to be more specific. Sorry about that.

I have a document with different separators, " " and "/", that looks something like this but is much larger.

dksfjdlfkjsldf 0000000000000 skdlfjsidfljskdfj/sfkdfjslf/ GN078390/kdljfsidfjksdfjs
sdfsdlfskjlkdjdfs GN032090334 des/sdklk dfkjsikk sdflkidjlk cooooo000000/ lsikflsi
GN045445 dlksjdfkdsjldkfjs/sdfksjlfk 99998989 lkdfus8okdfj

All I am interested in is the GN0 followed by the 4 numbers. They are not in the same columns, so I cannot pick them out as $1, $2, $3 etc., and I'm not sure how to select only the field that contains the GN0 followed by the 4 numbers when they are scattered throughout the document. I just want to be able to write code that will spit out

GN034567
GN087439
GN092342
GN056372

I hope I made myself more clear.
 
A starting point:
awk 'BEGIN{FS="[ /]"}{for(i=1;i<=NF;++i)if($i~/^GN0[0-9][0-9][0-9][0-9]/)print substr($i,1,8)}' /path/to/input
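
If the separators turn out to be less regular than " " and "/", one alternative is to ignore fields entirely and scan each line with match(). The following is only a sketch: the function name extract_gn is hypothetical, and it keeps the eight-character width used in the one-liner above. Unlike the field-based version, it also reports a GN0 token that sits in the middle of a longer word, which may or may not be what you want.

```shell
# Hypothetical helper: print each GN0-plus-digits token found on stdin.
extract_gn() {
  awk '{
    line = $0
    # match() sets RSTART and RLENGTH for the leftmost match in line
    while (match(line, /GN0[0-9][0-9][0-9][0-9]/)) {
      # print 8 characters, the same width as the field-based one-liner
      print substr(line, RSTART, 8)
      # continue scanning after the matched text
      line = substr(line, RSTART + RLENGTH)
    }
  }'
}
```

It reads standard input, so it would be called as extract_gn < /path/to/input.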

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
A small variation:
- Use any non-alphanumeric character as field separator.
- Strings are displayed only once.
Code:
awk -F '[^[:alnum:]]' '
   {
      for (f=1; f<=NF; f++) {
         if ($f ~ /^GN0[0-9][0-9][0-9][0-9]/)
            ++strings[substr($f, 1, 8)];
      }
   }
   END {
      for (s in strings)
         print s;
   }
' mloc.txt
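
Note that "for (s in strings)" returns the keys in an unspecified order. If the order of first appearance matters, a possible sketch is to print each string the first time it is seen instead of waiting for END. The function name unique_gn and the array name seen are hypothetical, the eight-character width is assumed from the sample output in the question, and the separator is spelled as [^A-Za-z0-9] in case the awk at hand does not support POSIX character classes.

```shell
# Hypothetical helper: print each distinct GN0 string once, in input order.
unique_gn() {
  awk -F '[^A-Za-z0-9]' '{
    for (f = 1; f <= NF; f++) {
      if ($f ~ /^GN0[0-9][0-9][0-9][0-9]/) {
        key = substr($f, 1, 8)
        # seen[key]++ evaluates to 0 only the first time key appears
        if (!seen[key]++)
          print key
      }
    }
  }'
}
```

It reads standard input, for example unique_gn < mloc.txt.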

Jean-Pierre.
 