awk, sort array unique 1

madasafish · Feb 17, 2007

I have two files, file1 and file2, and am trying to sort them uniquely by the second column and then find the entries in file2 that do not appear in file1. I have used an excellent snippet of code from here that gets very close to what I want. The only thing missing is the "sort unique" routine.

Any help would be greatly appreciated,
Many Thanks,
Madasafish

file1
1,aa
2,ab
3,ac
4,aa
5,ab

file2
1,ab
2,aa
3,aa
4,ac
5,dd
6,dd

Snippet of code
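awk -F"," '
BEGIN {
    # Read file1 into an array keyed on the second column.
    # The "> 0" test stops the loop at end of file and also
    # when file1 cannot be read (getline returns -1 then).
    while ((getline < "file1") > 0) {
        users[$2]=1
    }
}
{
    # For every line in file2, test for presence and
    # display result
    if (users[$2])
        print $2 " does exist in file1"
    else
        print $2 " does NOT exist in file1"
}
' file2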

Result....
ab does exist in file1
aa does exist in file1
aa does exist in file1
ac does exist in file1
dd does NOT exist in file1
dd does NOT exist in file1

As you can see above, "aa" and "dd" are each shown twice.

PHV · Feb 17, 2007

Why not simply pipe the output to the sort command?
awk -F"," '
...
' file2 | sort -u
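
For anyone who wants to try the pipe suggestion end to end, here is a minimal sketch that recreates the sample data from the question in a scratch directory and runs the full command. The scratch directory and the `result` variable are my own additions for illustration, and the `> 0` test on getline is a defensive habit rather than part of the original snippet. Note that sort -u returns the lines in sorted order, not in the order they appear in file2.

```shell
# Recreate the sample data from the question in a scratch directory
# (the directory is only for illustration).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 1,aa 2,ab 3,ac 4,aa 5,ab > file1
printf '%s\n' 1,ab 2,aa 3,aa 4,ac 5,dd 6,dd > file2

result=$(awk -F"," '
BEGIN {
    # Read file1 into an array keyed on the second column
    while ((getline < "file1") > 0)
        users[$2] = 1
}
{
    if ($2 in users)
        print $2 " does exist in file1"
    else
        print $2 " does NOT exist in file1"
}
' file2 | sort -u)     # sort -u drops the duplicate report lines

printf '%s\n' "$result"
```

With the sample files this should leave four lines: one each for aa, ab, ac and dd.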

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886

PHV · Feb 17, 2007

And if you want all in awk only:
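awk -F"," '
BEGIN {
    # Read file1 into an array; "> 0" avoids an endless loop if file1 is unreadable
    while ((getline < "file1") > 0) users[$2]=1
}
{
    # Skip any second-column value that has already been reported
    if ($2 in done) next; done[$2]
    if (users[$2])
        print $2 " does exist in file1"
    else
        print $2 " does NOT exist in file1"
}
' file2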
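
If the end goal is only the list of values that are missing from file1 (as the original question describes), a shorter variant is possible. This is a sketch of a different technique, not the code above: it passes both files on the command line and uses the common `NR == FNR` idiom to tell the first file from the second. The `seen` array, the `missing` variable and the scratch directory are names I made up for illustration. One caveat: the idiom misbehaves if file1 is empty, because file2 is then treated as the first file.

```shell
# Sample data from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 1,aa 2,ab 3,ac 4,aa 5,ab > "$tmpdir/file1"
printf '%s\n' 1,ab 2,aa 3,aa 4,ac 5,dd 6,dd > "$tmpdir/file2"

missing=$(awk -F"," '
# First file (NR equals FNR only while reading it): remember column 2
NR == FNR { users[$2]; next }
# Second file: print each value absent from file1, once only
!($2 in users) && !seen[$2]++ { print $2 }
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$missing"
```

Here `seen[$2]++` is zero the first time a value turns up, so the line is printed once and skipped afterwards.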

Hope This Helps, PH.

madasafish · Feb 17, 2007

PHV, thank you! It worked a treat!

Can I throw out another question?

I would like to set file1 and file2 as external (shell) variables that are referenced within the awk program.
I have tried using -v, but without success.

What I am trying to do is....

#!/bin/ksh
file1=/tmp/myfile1
file2=/tmp/myfile2

awk -F"," '
BEGIN {
# Read file1 into an array
while (getline < "$file1") {
users[$2]=1 ........

Once again, thank you for your help.
Madasafish

PHV · Feb 18, 2007

"I have tried using the -v but without success"
Which code?
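
The thread does not show the -v attempt, so the following is only a guess at what was intended: a minimal sketch of passing a shell variable into awk with -v. Inside the single-quoted awk program the shell never expands "$file1", so the string is taken literally; -v copies the value into an awk variable that is then used without quotes or a dollar sign. The names `f1` and `report` are hypothetical, and the files live in a scratch directory here rather than at /tmp/myfile1 and /tmp/myfile2 so that nothing existing is overwritten. The same command should work unchanged under ksh.

```shell
# Stand-ins for the two data files, created in a scratch directory.
tmpdir=$(mktemp -d)
file1=$tmpdir/myfile1
file2=$tmpdir/myfile2
printf '%s\n' 1,aa 2,ab 3,ac 4,aa 5,ab > "$file1"
printf '%s\n' 1,ab 2,aa 3,aa 4,ac 5,dd 6,dd > "$file2"

# -v f1="$file1" is expanded by the shell BEFORE awk starts;
# file2 is simply passed as the input file argument.
report=$(awk -F"," -v f1="$file1" '
BEGIN {
    # f1 is an awk variable: no quotes, no $ sign
    while ((getline < f1) > 0)
        users[$2] = 1
    close(f1)
}
{
    if ($2 in done) next      # report each value once
    done[$2]
    if ($2 in users)
        print $2 " does exist in file1"
    else
        print $2 " does NOT exist in file1"
}
' "$file2")

printf '%s\n' "$report"
```

One thing to keep in mind: awk processes backslash escape sequences in -v values, so a path containing backslashes would need extra care.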
