Pure logic puzzle 1

Status
Not open for further replies.

dickiebird

Programmer
Feb 14, 2002
758
GB
Hi Guys
With a flat file consisting of :
0001,123
0002,023
0002,989
0003,123
0004,093
0005,030
0005,987
0006,030
0007,939 etc etc

I want to print one line for each unique field 1, but where field 1 is duplicated, only the second line listed.
So the above would end up like :
0001,123
0002,989
0003,123
0004,093
0005,987
0006,030
0007,939 etc etc
The first listed 0002 has gone, as has the first 0005.
I got this far and then cried :

awk ' BEGIN {FS = "," }
{
if(NR==1) # 1st line is OK to print anyway
{
print $0;
next;
}
if (f1sav=$1)
{
print $0;
}
else
{
print f0save
}
f1sav=$1;
f0save=$0;
}' allext > allexta

Whaddya think I need - apart from a brain implant ????
DB :)
Dickie Bird
db@dickiebird.freeserve.co.uk
 
If the output order does NOT matter...

BEGIN {
FS=","
}

{ arr[$1]=$0; }

END {
for (i in arr)
printf("%s\n", arr[i]);
}
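
If the order DOES matter, one possible variant is sketched below. It is not part of the reply above: the function name last_per_key and the helper names order and n are made up for illustration. It assumes the goal is "last line per field 1, keys in the order they first appear", and it relies on the fact that awk does not specify the order in which "for (i in arr)" visits the elements.

```shell
# Hypothetical helper: keep the last line for each field-1 value,
# emitting keys in the order they first appeared in the input.
last_per_key() {
    awk -F, '
    !($1 in arr) { order[++n] = $1 }   # new key: remember its first-seen position
    { arr[$1] = $0 }                   # a later line overwrites an earlier one
    END { for (i = 1; i <= n; i++) print arr[order[i]] }
    ' "$@"
}
```

Called as "last_per_key allext > allexta", this should make the separate sort step unnecessary, at the cost of holding one line per key in memory.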
 
Hi vgersh99
Your method worked fine - I ran a sort on it to restore the order. How does the loading of an array work, though ?
DB :)
PS I also got it with this eventually
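awk ' BEGIN {FS = "," }
{ # when field 1 changes, the saved line was the last one for the previous key
if (NR>1 && f1sav!=$1)
print f0save;
f0save=$0;
f1sav=$1;
}
END { if (NR>0) print f0save; # flush the final saved line
}' allext > allexta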

Dickie Bird
db@dickiebird.freeserve.co.uk
 
Hi,

awk uses associative arrays. The array is loaded using the first column as its index and the entire record as its content. Any subsequent record with a repeated first column overwrites the existing entry in the array (if one exists).

The output is produced by iterating through the array content.
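
To watch the loading happen, a small tracing sketch can help. The function name trace_load and the message wording are invented for this illustration; only the arr[$1] = $0 assignment comes from the script discussed here.

```shell
# Hypothetical demo: report what each input line does to the array.
trace_load() {
    awk -F, '
    ($1 in arr)  { printf "overwrite arr[%s]: %s -> %s\n", $1, arr[$1], $0 }
    !($1 in arr) { printf "new arr[%s] = %s\n", $1, $0 }
    { arr[$1] = $0 }   # runs last, so the two tests above see the old state
    ' "$@"
}
```

Running "trace_load allext" prints one "new" line the first time a field 1 value is seen and one "overwrite" line for each later record with the same value, which is why only the last record per key survives to the END block.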

HTH

vlad
 
If this is your input (with 3 repeats of one term)

0001,123
0002,023
0002,989
0002,999
0003,123
0004,093
0005,030
0005,987
0006,030
0007,939
0007,939

and your output is

0001,123
0002,999
0003,123
0004,093
0005,987
0006,030
0007,939

then this sed script works

sed -n '
1{
${p;q;}
h
}
1!{
x
G
/^\([^,]*\),.*\n\1,/{
s/.*\n//
$p
b
}
$!s/\n.*//
p
}
' allext > alltexta

Cheers,
ND [smile]
 
Thanks guys - your help is very much appreciated
DB
;-)
Dickie Bird
db@dickiebird.freeserve.co.uk
 