How to read ids from 1 file and grab those ids from 2nd file

rsteffler · Apr 25, 2002

I am trying to write a compact AWK script to read a file that contains a list of ids like this:

444
234
236

I have a second file that is pipe delimited, where the 6th field is this id, like this:

name|address1|address2|hphone|wphone|444|

I want to print out to the screen any record in the second file whose id is in the first file. It's essentially automating (and hopefully streamlining) a process similar to: grep 444 file2, and then making sure the match is exactly 444 and that it's in the 6th field.

Is there a simple AWK script that can do this? Please keep in mind that my second file, the file to be searched, has 8 million records, so greps are slow.

Thanks,
Robert

vgersh99 · Apr 25, 2002

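BEGIN {
  FS = "|"
  CONF_FILE = "searchFile1.txt"   # file holding one id per line
  fld2search = 6                  # field of the big file that holds the id

  # parenthesize getline so that "> 0" tests its return value
  while ((getline < CONF_FILE) > 0) {
    # skip comments in the config file
    if ($0 ~ /^#/) continue
    arrConfs[$0]                  # referencing the element creates the key
  }
  close(CONF_FILE)
}

# print records of the big file whose id field is a key in arrConfs
$fld2search in arrConfs { print }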
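For anyone who would rather pass both files on the command line, here is a minimal sketch of the same idea using the common NR == FNR idiom. The sample data and file names (ids.txt, big.txt) are made up for illustration, and the idiom assumes the id file is not empty.

```shell
#!/bin/sh
# Hypothetical sample data; the file names are placeholders.
tmp=$(mktemp -d)
printf '444\n234\n236\n' > "$tmp/ids.txt"
cat > "$tmp/big.txt" <<'EOF'
alice|1 Main St||555-0100|555-0101|444|
bob|2 Oak Ave||555-4440|555-0103|999|
carol|3 Elm Rd||555-0104|555-0105|236|
EOF

# While reading the first file (NR == FNR), remember each id as an array key.
# While reading the second file, print records whose 6th field is such a key.
matches=$(awk -F'|' 'NR == FNR { ids[$0]; next } $6 in ids' \
    "$tmp/ids.txt" "$tmp/big.txt")
printf '%s\n' "$matches"
rm -rf "$tmp"
```

Only the alice and carol records should come out; bob has 444 inside a phone number but 999 in field 6, so he is skipped.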

rsteffler · Apr 25, 2002

Thanks! That's exactly what I needed.

Robert

CaKiwi · Apr 25, 2002

Any solution is going to take some time to process 8 million records, but hopefully this awk program won't take too long.

Code:

BEGIN {
  FS = "|"
  while ((getline < "-") > 0) a[++ix] = $0
}
{
  for (i=1;i<=ix;i++) {
    if ($6 == a[i]) {print; next}
  }
}

Run it by entering:

Code:

awk -f awk-script big-file < id-file

Hope this helps. CaKiwi

vgersh99 · Apr 25, 2002

I believe using the "in" construct will be faster in discriminating records - one lookup instead of the "iterative" lookup.

vlad
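To check the two approaches against each other, here is a rough sketch that runs the "in" lookup and the loop over a[i] on generated data and confirms they select the same records. The data sizes and file names are arbitrary assumptions; prefix either awk command with time to compare speed on your own machine.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Made-up data: 500 ids (even numbers 2..1000) and 5000 records
# whose 6th field runs from 1 to 5000.
awk 'BEGIN { for (i = 2; i <= 1000; i += 2) print i }' > "$tmp/ids.txt"
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "n|a1|a2|h|w|" i "|" }' > "$tmp/big.txt"

# One array-membership test per record.
awk -F'|' 'NR == FNR { ids[$0]; next } $6 in ids' \
    "$tmp/ids.txt" "$tmp/big.txt" > "$tmp/hash.out"

# Up to one comparison per id for every record.
awk -F'|' 'NR == FNR { a[++n] = $0; next }
    { for (i = 1; i <= n; i++) if ($6 == a[i]) { print; next } }' \
    "$tmp/ids.txt" "$tmp/big.txt" > "$tmp/scan.out"

if cmp -s "$tmp/hash.out" "$tmp/scan.out"; then same=yes; else same=no; fi
count=$(wc -l < "$tmp/hash.out" | tr -d ' ')
echo "same=$same matches=$count"
rm -rf "$tmp"
```

The loop does work proportional to the number of ids for each non-matching record, so the gap should widen as the id list grows.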

CaKiwi · Apr 25, 2002

Vlad,

You beat me to it with a better solution. I agree the "in" construct is probably faster. CaKiwi

bigoldbulldog · Apr 26, 2002

Since I'm a sed-o-holic and a speed freak, this is about 3 times faster than the awk example with 'in' (which is about 3 times faster than '==', on my machine). The caveat is that sed takes only 100 commands, so the config file may only supply 98 items.

#! /bin/sh
eval `sed '
1s/.*/sed -e &/
:loop
N
s/\n.*/ -e &/
s/\n//
$!b loop
' $1 |
sed "
s/[0-9][0-9]*/'\/&\/b'/g
s/$/ -e d $2/
"`

script id-file big-file
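As far as I can trace it, for the sample ids the script builds a command along the lines of sed -e '/444/b' -e '/234/b' -e '/236/b' -e d big-file, so each id is matched anywhere in the line rather than only in the 6th field. The sketch below shows the difference on made-up data; the anchored pattern is my own assumption of how one might pin the match to field 6, not part of the script above.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cat > "$tmp/big.txt" <<'EOF'
alice|1 Main St||555-0100|555-0101|444|
bob|2 Oak Ave||555-4440|555-0103|999|
EOF

# Shape of the generated command for a single id, 444:
# the pattern can match anywhere, including inside a phone number.
unanchored=$(sed -e '/444/b' -e d "$tmp/big.txt")

# Assumed variant: require exactly five pipe-terminated fields before the id.
anchored=$(sed -e '/^\([^|]*|\)\{5\}444|/b' -e d "$tmp/big.txt")

echo "unanchored:"; printf '%s\n' "$unanchored"
echo "anchored:";   printf '%s\n' "$anchored"
rm -rf "$tmp"
```

With this data the unanchored form keeps both records, while the anchored form keeps only alice.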

Cheers,
ND [smile]
