madasafish
Technical User
I have two files, file1 and file2. I am trying to sort file2 uniquely by its second column and then find the entries that do not appear in file1. I have used an excellent snippet of code from here that gets very close to what I want. The only thing missing is the "sort unique" routine.
Any help would be greatly appreciated,
Many Thanks,
Madasafish
file1
1,aa
2,ab
3,ac
4,aa
5,ab
file2
1,ab
2,aa
3,aa
4,ac
5,dd
6,dd
Snippet of code
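awk -F"," '
BEGIN {
    # Read the second column of file1 into an array.
    # The "> 0" test stops the loop at end of file and also
    # if file1 cannot be read (getline returns -1 in that case).
    while ((getline < "file1") > 0) {
        users[$2] = 1
    }
}
{
    # For every line in file2, test for presence and
    # display the result
    if (users[$2])
        print $2 " does exist in file1"
    else
        print $2 " does NOT exist in file1"
}
' file2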
Result:
ab does exist in file1
aa does exist in file1
aa does exist in file1
ac does exist in file1
dd does NOT exist in file1
dd does NOT exist in file1
As you can see from the result above, "aa" and "dd" are each shown twice.
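For reference, here is a minimal sketch of how the missing "sort unique" step might be added, assuming it is acceptable to run file2 through sort before awk sees it. The function name report_unique and the awk variable ref are made up for illustration and are not part of the original snippet. The message text still says "file1" literally, as in the snippet above.

```shell
# Hypothetical helper: $1 is the reference file (file1),
# $2 is the file whose second column is checked (file2).
report_unique() {
    # sort -u with a key keeps one line per distinct second-column value
    sort -t, -k2,2 -u "$2" | awk -F"," -v ref="$1" '
    BEGIN {
        # Load the second column of the reference file into an array
        while ((getline line < ref) > 0) {
            split(line, f, ",")
            users[f[2]] = 1
        }
        close(ref)
    }
    {
        # Each remaining line carries a distinct second-column value
        if ($2 in users)
            print $2 " does exist in file1"
        else
            print $2 " does NOT exist in file1"
    }'
}
```

Called as report_unique file1 file2 with the sample data above, this should print one line each for aa, ab, ac and dd, with only dd reported as NOT existing in file1. If the original order of file2 matters more than sorted output, an alternative is to drop the sort and skip repeats inside awk with a guard such as !seen[$2]++.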