Jan 6, 2004 #1 hugheskbh Programmer Dec 18, 2002 37 US
How can I identify records that have duplicate values in a field (positions 2 through 6)? I need to identify duplicate employee numbers in a file.
Thanks,
Ken
Jan 6, 2004 #2 PHV MIS Nov 8, 2002 53,708 FR
Try something like this:
Code:
awk '{++a[substr($0,2,5)]} END{for(i in a)if(a[i]>1)printf "%s x %d\n",i,a[i]}' /path/to/inputfile
You can also take a look at
Code:
man sort
and
Code:
man uniq
Hope this helps.
PH.
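For readers following the pointer to sort and uniq, here is a minimal sketch of how those two tools could do the same job. The helper names `dup_keys` and `dup_key_counts` are hypothetical, and the sketch assumes the employee number contains no whitespace.

```shell
# dup_keys: print each value from columns 2-6 that occurs more than once.
# (Hypothetical helper name; not part of the thread.)
dup_keys() {
    # cut -c2-6 keeps characters 2 through 6 of every line,
    # sort brings equal keys together,
    # uniq -d prints one copy of each key that is repeated.
    cut -c2-6 "$@" | sort | uniq -d
}

# dup_key_counts: same idea, but report how often each duplicate occurs,
# in the "key x count" layout used by the awk one-liner.
dup_key_counts() {
    cut -c2-6 "$@" | sort | uniq -c | awk '$1 > 1 { print $2 " x " $1 }'
}
```

Both helpers accept one or more file names, for example `dup_keys /path/to/inputfile`.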
Jan 6, 2004 #3 vgersh99 Programmer Jul 27, 2000 2,146 US
What does a sample record look like?
vlad
+----------------------------+
|   #include<disclaimer.h>   |
+----------------------------+
Jan 6, 2004 Thread starter #4 hugheskbh Programmer Dec 18, 2002 37 US
An example looks like this:
File1
A22333test1
A25555test2
A88888test3
File2
A33333test1
A54444test2
A22333test3
Note that 22333 appears in both files.
Jan 6, 2004 #5 PHV MIS Nov 8, 2002 53,708 FR
Try something like this:
Code:
awk '{
 k=substr($0,2,5);i=++a[k]
 b[k","i]=FILENAME":"$0
}
END{
 for(k in a)if(a[k]>1)
  for(i=1;i<=a[k];++i)print b[k","i]
 printf "\n"
}' File1 File2
Hope this helps.
PH.
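A commented variant of the same approach may make the logic easier to follow. This is only a sketch: the function name `show_dups` and the array names are hypothetical, and it differs from the post in one respect, namely that braces are added so a blank line closes every duplicate group rather than being printed once at the end.

```shell
# show_dups: print every record whose key (columns 2-6) occurs more than
# once across the given files, prefixed with the file it came from.
# (Hypothetical helper name; not part of the thread.)
show_dups() {
    awk '
    {
        k = substr($0, 2, 5)           # key: characters 2 through 6
        n = ++count[k]                 # how many times this key has been seen
        line[k, n] = FILENAME ":" $0   # remember the record and its file
    }
    END {
        for (k in count) {
            if (count[k] > 1) {
                for (n = 1; n <= count[k]; n++) print line[k, n]
                print ""               # blank line after each group
            }
        }
    }' "$@"
}
```

Called as `show_dups File1 File2` on the sample data, it should list the two A22333 records. Note that awk does not guarantee the order in which `for (k in count)` visits keys, so groups may appear in any order when there are several duplicates.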
Jan 7, 2004 #6 Ygor Programmer Feb 21, 2003 623 GB
You could use a pattern file with grep...
[tt]
cut -c2-6 file1 >tmpfile
grep -f tmpfile file2
[/tt]
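One caveat with the pattern-file approach: grep matches each pattern anywhere in the line, so a value such as 22333 would also match a record that merely contains those digits outside positions 2 through 6. The sketch below anchors each pattern to the right columns. The function name `match_by_position` and its argument order are hypothetical, and it assumes the keys contain no regular-expression metacharacters and that every line of the first file has at least six characters.

```shell
# match_by_position KEYFILE TARGET
# Print the lines of TARGET whose characters 2-6 equal characters 2-6
# of some line in KEYFILE. (Hypothetical helper; not part of the thread.)
match_by_position() {
    patterns=$(mktemp) || return 1
    # Turn each key, e.g. 22333, into the anchored pattern ^.22333 :
    # "^" pins the match to the start of the line and "." skips column 1.
    cut -c2-6 "$1" | sed 's/^/^./' > "$patterns"
    grep -f "$patterns" "$2"
    status=$?
    rm -f "$patterns"
    return $status
}
```

For example, `match_by_position file1 file2` prints only the records of file2 whose employee number also appears in file1.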