Tek-Tips is the largest IT community on the Internet today!


pattern counting

Status
Not open for further replies.

IMAUser

Technical User
May 28, 2003
121
CH
Hi,

I have a data file; a sample of it is below.

Nope,REDEMP_VAL,591894209
Nope,PREV_CPN_DT,591894209
Nope,CUR_CPN,591894209
Nope,CPN,591894209
Nope,AMT_OUTSTANDING,591894209
Nope,REDEMP_VAL,591894308
Nope,PREV_CPN_DT,591894308
Nope,CUR_CPN,591894308
Nope,CPN,591894308
Nope,AMT_OUTSTANDING,591894308
Nope,CPN,591894506
Nope,AMT_OUTSTANDING,591894506
Nope,REDEMP_VAL,591894605
Nope,CUR_CPN,591894605
Nope,CPN,591894605
Nope,AMT_OUTSTANDING,591894605

The file is sorted on the third (last) column, the id, which is a character column. Ideally there should be five records for each id, but for some reason certain ids have fewer than five records. I need to identify the valid sets (five records each), write them to a file, and load that file into my db.

Any ideas, folks, on how to go about this?
Many thanks,
 
Thanks for your reply, guggach, but I think I failed to explain the question.

All the records have three fields. But if you look at the records, the first five all have the id 591894209, and the next five have the id 591894308. These are the records I need to pick up, and not 591894506, because 591894506 has fewer than five records.

So I think I need to scan through the file, pick up the first id, and compare it to the id on the next record. If it is the same, I increment a counter. If the counter reaches 5, I know I have a full set. If the next id does not match, I reset the counter. Something like that. I was not sure how to do that using awk.


Thanx
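
Before filtering, the counting idea described above can be checked by tallying how many records each id has. The following is only a sketch: the file name sample.csv and the variable names are assumptions made for illustration, and the rows are copied from the sample in the question. It relies on the file being sorted on the id column, as stated.

```shell
# Hypothetical sample file; the rows are copied from the question.
cat > sample.csv <<'EOF'
Nope,REDEMP_VAL,591894209
Nope,PREV_CPN_DT,591894209
Nope,CUR_CPN,591894209
Nope,CPN,591894209
Nope,AMT_OUTSTANDING,591894209
Nope,REDEMP_VAL,591894308
Nope,PREV_CPN_DT,591894308
Nope,CUR_CPN,591894308
Nope,CPN,591894308
Nope,AMT_OUTSTANDING,591894308
Nope,CPN,591894506
Nope,AMT_OUTSTANDING,591894506
Nope,REDEMP_VAL,591894605
Nope,CUR_CPN,591894605
Nope,CPN,591894605
Nope,AMT_OUTSTANDING,591894605
EOF

# Print "id count" for each run of equal ids (field 3), in input order.
counts=$(awk -F',' '
$3 != id { if (NR > 1) print id, n; id = $3; n = 0 }   # id changed: report previous run
{ n++ }                                                # count the current record
END { if (NR > 0) print id, n }                        # report the final run
' sample.csv)

printf '%s\n' "$counts"
```

On the sample rows this reports 5 records for 591894209 and 591894308, 2 for 591894506, and 4 for 591894605, so only the first two ids form complete sets.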
 
so so... an array is needed, something like:

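# run as: awk -F',' -f thisfile input
{
    # id changed: print the buffered set if it held exactly 5 records
    if ($3 != last) {
        if (pos == 5)
            for (i = 0; i < pos; i++)
                printf("%s\n", arr[i]);
        last = $3;
        pos = 0;
    }
    # buffer the current record (including the first one of a new set)
    arr[pos++] = $0;
}
END {
    # the last set has no following id change, so flush it here
    if (pos == 5)
        for (i = 0; i < pos; i++)
            printf("%s\n", arr[i]);
}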

NOT TESTED !!




:) guggach
 
Something like this?
awk -F',' '
a3!=$3{
if(n==5)for(i=1;i<=n;++i)print t[i]
n=0;a3=$3
}
{t[++n]=$0}
END{if(n==5)for(i=1;i<=n;++i)print t[i]}
' /path/to/input >/path/to/valid_ids

Hope This Helps, PH.
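
If the file could not be relied on to be sorted by id, a two-pass variant might be used instead: read the file once to count records per id, then read it again and print only the records whose id was seen exactly five times. This is a sketch under assumptions, not something posted in the thread: the names sample.csv and valid_ids.csv are hypothetical, the rows are a subset of the sample in the question, and "valid" is taken to mean exactly five records.

```shell
# Hypothetical sample file; rows taken from the question (one full set, two short sets).
cat > sample.csv <<'EOF'
Nope,REDEMP_VAL,591894209
Nope,PREV_CPN_DT,591894209
Nope,CUR_CPN,591894209
Nope,CPN,591894209
Nope,AMT_OUTSTANDING,591894209
Nope,CPN,591894506
Nope,AMT_OUTSTANDING,591894506
Nope,REDEMP_VAL,591894605
Nope,CUR_CPN,591894605
Nope,CPN,591894605
Nope,AMT_OUTSTANDING,591894605
EOF

# The same file is named twice so awk reads it in two passes.
valid=$(awk -F',' '
NR == FNR { count[$3]++; next }   # pass 1: tally records per id
count[$3] == 5                    # pass 2: print records whose id has exactly 5
' sample.csv sample.csv)

# Write the complete sets to the (hypothetical) output file.
printf '%s\n' "$valid" > valid_ids.csv
```

The trade-off is that the input is read twice and one counter per distinct id is kept in memory, whereas the single-pass versions above only buffer the current set but depend on the sort order.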
 