getline and zcat

Status
Not open for further replies.

malpa

Technical User
Feb 8, 2004
122
CO
Hi


Is it possible to run a zcat command from within awk, and then compare the contents of each compressed file (zcat file.Z) with the "file_1" file?

file_1 contains the data that I want to search for in the records of each compressed file.Z. The compressed files are listed in list_of_files_Z.


find -name "*.Z" -mtime .... > list_of_files_Z


awk ' BEGIN{
while( getline "file_1"){ field[1]=1}
while( getline "list_of_files_Z" ) { file[1]=1 }


for ( i in file){ # list_of_files.Z
while( "zcat file" | getline var )
split (var,k,";")
if ( k[1] in field )
print $field[k[1]]
}
}'




I want to read file_1 and then compare it with the data from the compressed files (zcat files.Z). The .Z files are selected with the find command. There are many .Z files in the filesystem, but I only want to process the ones that fall between two dates, for example 2008-01-01 ... 2008-01-02.
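A minimal sketch of the date-range selection described above, assuming GNU find (the -newermt and -printf options are not in POSIX). The function name list_between and its argument order are hypothetical, chosen only for illustration. It keeps files modified after the start of the first date and no later than the start of the second date, and writes name;date lines in the style of list_of_files_Z.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.
# $1 = directory to search
# $2 = start date (files must be modified after 00:00 of this day)
# $3 = end date   (files must be modified no later than 00:00 of this day)
# Assumes GNU find: -newermt and -printf are GNU extensions.
list_between() {
    find "$1" -type f -name '*.Z' \
        -newermt "$2" ! -newermt "$3" \
        -printf '%p;%TY-%Tm-%Td\n'
}
```

For the example dates, something like list_between . 2008-01-01 2008-01-03 > list_of_files_Z would cover January 1st and 2nd. The %p directive prints the path as find saw it, so zcat can open the file later from the same working directory.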




Thanks malpa
 
You should be able to read the output of zcat fine in awk.

Personally, though, I would use a standard shell script for this task, as it could get complicated in awk.

Annihilannic.
 

Hi

This is the structure of the files.


registers

file_1
xxxx;yyyy;78860380;20051118;zzzz;9890;85141
xxxx;yyyy;78860335;20051124;zzzz;9890;85141
xxxx;yyyy;78860390;20051123;zzzz;9890;85141

----
I used this command line to find the compressed files by date:

find -name "*.Z" -mtime .... > list_of_files.Z

list_of_files.Z

file000.Z;2008-01-01
file002.Z;2008-01-02
..
..
file999.Z;2008-01-02
------


zcat filexxx.Z

078860335,5555555,,,20071130,134054,1170,,21119602,2,1,1196,4997,0,TTTT,1,RRRR,1,,5,,1,238,,4,81001,137,26,7,,,,,,,,,,,,,,,,,1009,,700
..
..
..

The idea is to search efficiently for the 3rd field of file_1 in the first field of each compressed filexxx.Z.


My first idea was:

for i in `cat list_of_files.Z`
do
zcat $i | awk ......
done

but this takes a long time, because file_1 (300,000 records) has to be read and loaded into memory every time the shell processes one compressed file (zcat filexxx.Z).

The best solution would be to read file_1 only once and keep it in memory, then read the compressed files and compare them with the file_1 data held in memory.

For this reason I want to use the zcat command and the getline function in awk, but I don't know whether that is possible.
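The "read file_1 once, then stream the compressed data" idea can also be sketched without getline, by decompressing every listed file into a single awk process. This is only an illustration under stated assumptions: the function name match_stream is hypothetical, file_1 is assumed to be non-empty (the FNR == NR test relies on that), the key is the 3rd ';'-separated field, and the compressed records are assumed to be comma-separated with the key in field 1, possibly with leading zeros, as in the samples above.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.
# $1 = key file  (';'-separated, key in field 3, assumed non-empty)
# $2 = list file (one "name;date" entry per line)
match_stream() {
    cut -d';' -f1 "$2" |
    while IFS= read -r archive; do
        # skip blank entries, decompress everything else to the pipe
        if [ -n "$archive" ]; then
            zcat "$archive"
        fi
    done |
    awk '
        # First input (the key file): remember field 3 of every record.
        FNR == NR { split($0, k, ";"); wanted[k[3]] = 1; next }
        # Second input (stdin): compare field 1, ignoring leading zeros.
        {
            split($0, r, ",")
            id = r[1]
            sub(/^0+/, "", id)
            if (id in wanted) print id ";" $0
        }
    ' "$1" -
}
```

Called as match_stream file_1 list_of_files_Z, this starts awk once and loads the keys once, however many archives are listed. The trade-off is that the output no longer says which archive a record came from.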


Thanks for your assistance

malpa
 
How did you end up with the dates in list_of_files_Z?

When you find a matching 3rd field in the compressed file, what do you want to do with it? Just print out that single field?

I haven't tested this, but there are a few problems I notice in your original awk code:

Code:
awk -F';' ' BEGIN{
       # remember the 3rd field of every file_1 record as a lookup key
       while( (getline < "file_1") > 0 ){ field[$3]=1 }
       close("file_1")
       # the file name is the 1st field of each list entry
       while( (getline < "list_of_files_Z") > 0 ) { file[$1]=1 }
       close("list_of_files_Z")
      
          for ( i in file){             # list_of_files.Z
              cmd="zcat " i             # the index i is the file name
              # "cmd | getline var" fills var only, so split it by hand
              while( (cmd | getline var) > 0 ) {
                split(var, f, ",")      # compressed records use commas
                if ( f[1] in field )
                   print f[1]
              }
              close(cmd)
          }
      }'

For efficiency you should drop out of the while ( cmd | getline var ) loop as soon as you find a match, but I'll leave that as an exercise for you!
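One possible shape for that exercise, as a rough sketch rather than a tested production script: break out of the inner loop on the first hit and let close() discard the rest of the zcat output. The function name first_match_per_file is hypothetical, the key is assumed to be the 3rd ';'-separated field of the key file, and leading zeros in the first comma-separated field of the compressed records are assumed to be padding.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.
# $1 = key file  (';'-separated, key in field 3)
# $2 = list file (one "name;date" entry per line)
# Prints "name;key" for the first matching record of each archive.
first_match_per_file() {
    awk -v keys="$1" -v list="$2" '
    BEGIN {
        # Load the keys once.
        while ((getline line < keys) > 0) {
            split(line, k, ";")
            wanted[k[3]] = 1
        }
        close(keys)

        while ((getline line < list) > 0) {
            split(line, l, ";")
            if (l[1] == "") continue
            cmd = "zcat " l[1]
            while ((cmd | getline rec) > 0) {
                split(rec, r, ",")
                id = r[1]
                sub(/^0+/, "", id)       # assumed zero padding
                if (id in wanted) {
                    print l[1] ";" id
                    break                # stop reading this archive
                }
            }
            close(cmd)                   # ends zcat even after a break
        }
        close(list)
    }'
}
```

This only helps when knowing that an archive contains some match is enough. If every matching record is needed, the loop has to run to the end of each archive.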

Annihilannic.
 
Hi

Thanks a lot, Annihilannic. I love awk. This script is much faster than the original idea.

I made some changes at the end of your program, and this is the final script.

Using crontab:

find . -mtime 1 | awk -F"[/]" ' { print $2 } ' | xargs -n1 -I{} ls -l --time-style=full-iso {} | awk ' { print $9";"$6 } ' > list_of_files_Z


awk -F';' ' BEGIN{
    # read file_1 once and remember the 3rd field of every record
    while( (getline < "file_1") > 0 ){ field[$3]=1 }
    close("file_1")
    while( (getline < "list_of_files_Z") > 0 ) { file[$1]=1 }
    close("list_of_files_Z")

    for ( i in file ){ # list_of_files.Z
        cmd="zcat " i                 # the index i is the file name
        while( (cmd | getline var) > 0 ) {
            split(var,var1,",")
            sub(/^0/,"", var1[1])     # strip the leading zero only
            if ( var1[1] in field )
                print field[var1[1]]";"var
        }
        close(cmd)
    }
}'



Thanks

Malpa
 