need to find unique column entries in a large file

Status
Not open for further replies.

preserver3

Programmer
Feb 21, 2006
1
US
Hi,

I have a 600,000-line file with 14 columns of data separated by "|".

I need only the unique combinations of Column 3 and Column 7, along with the matching Column 1 value so I can find the ID number.

awk -F"|" '{print $1} {print $3} {print $7}' foo.txt | ???

Do I need to sort this or uniq it? How?

Thanks.
 
Is Column 1 unique for each combination of Column 3 and Column 7?
If yes:
awk -F'|' '{print $3,$7,$1}' foo.txt | sort -u
If not, then for the 1st instance:
awk -F'|' '!a[$3,$7]{print $3,$7,$1;++a[$3,$7]}' foo.txt
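
To see how the two commands differ, here is a small sketch on made-up data. The file name sample.txt and its contents are assumptions for illustration only (7 columns instead of the real 14); the two awk commands are the ones given above.

```shell
# Hypothetical sample data: 7 pipe-separated columns (the real file has 14).
cat > sample.txt <<'EOF'
101|x|red|x|x|x|small
102|x|red|x|x|x|small
103|x|blue|x|x|x|large
104|x|red|x|x|x|large
EOF

# Variant 1: every distinct (col3, col7, col1) triple, in sorted order.
# "red small" shows up twice here because IDs 101 and 102 differ.
awk -F'|' '{print $3,$7,$1}' sample.txt | sort -u

# Variant 2: only the first line seen for each (col3, col7) pair,
# in the order the pairs appear in the file.
awk -F'|' '!a[$3,$7]{print $3,$7,$1;++a[$3,$7]}' sample.txt
```

On this sample, variant 1 prints four lines and variant 2 prints three (red small 101, blue large 103, red large 104). Both print the fields separated by a space; if the data itself may contain spaces, adding -v OFS='|' to awk keeps the output pipe-delimited.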

Hope This Helps, PH.
 
[tt]

Hi, could you please explain how the awk syntax below works:

awk -F'|' '!a[$3,$7]{print $3,$7,$1;++a[$3,$7]}' foo.txt

[/tt]
 
man awk

look at associative arrays

for every line of file foo.txt
loop
  if (combination of fields 3 and 7 not yet "found")
  then
    print fields 3, 7 and 1
    set combination of fields 3 and 7 to "found"
  endif
endloop
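
The same logic can also be written out long-hand in awk. This is only a sketch: first_per_pair, key and seen are names made up for illustration, and the three input lines are invented test data. In awk, a subscript written as a[$3,$7] is stored as the two fields joined by the built-in SUBSEP character, which is what the key line spells out.

```shell
# first_per_pair is a hypothetical helper name, not something from this thread.
# It reads the files given as arguments, or standard input if there are none.
first_per_pair() {
    awk -F'|' '
    {
        key = $3 SUBSEP $7        # the same key awk builds for a[$3,$7]
        if (!(key in seen)) {     # combination not yet "found"
            print $3, $7, $1      # print fields 3, 7 and 1
            seen[key] = 1         # set the combination to "found"
        }
    }' "$@"
}

# Invented three-line input: the second line repeats the pair X/Y.
printf '%s\n' '1|a|X|b|c|d|Y' '2|a|X|b|c|d|Y' '3|a|Z|b|c|d|Y' | first_per_pair
```

This prints "X Y 1" and "Z Y 3". The one-liner does the same thing more compactly: !a[$3,$7] is true while the counter for that pair is still zero or unset, and ++a[$3,$7] then makes it non-zero so later lines with the same pair are skipped.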


HTH,

p5wizard
 