3 records per key

Status
Not open for further replies.

normeus

Programmer
Oct 23, 2005
11
US
I deal with names and addresses all the time, and I use some old COBOL programs. I wanted to do this in awk, but my COBOL background shows. Anyway, the program reads a file (about 20,000 records) of fixed-length records, so I use substr() to get the key. The trick is that we want to keep the records in an array and, as soon as we have 3 records with the same key, print the array to a file; if the next record has the same key, just keep printing. If the key changes and there are fewer than 3 records in the array, print the array to a different file.
I use awk on Windows XP with EditPlus, which is a pleasure to use. This is what I started with. It does not work yet and I will fix it, but I wanted some tips on arrays, maybe?
Code:
BEGIN {
       print "BEG TIME......... "strftime("%H:%M:%S")
save_table[0]=0
      }
{ 
#begin_newfilename:////// add "-A" filename in windows//////
if (NR < 2 ) 
  {begfile=fname=FILENAME
   if(match(begfile,/\..+/)){endfile=substr(fname,RSTART,RLENGTH)}
   gsub(/\..+/,"",begfile)
   fname=begfile"-A"endfile
  }
#end_newfilename://////

key=substr($0,77,22)

#save the first record
if (NR < 2){
   save_table[0]=0
   okey=key}
if (key != okey){
   cnt=1
   if (save_table[0]==0){
       save_table[0]++
       next}
   okey=key
 save_table[save_table[0]]=$0
}
if (save_table[0] < 3) 
   {save_table[save_table[0]]=$0
    save_table[0]++
   okey=key
   }

#//// TODO ////
cntsel++
}
END {print "INPUT FILENAME... " FILENAME
     print "SELECTED RECORDS. " cntsel+0
     print "RECORDS  ........ " NR
     print "END TIME ........ " strftime("%H:%M:%S")
    }

Remember this program doesn't do anything yet.

Norm,
Thanx.
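
For readers following the array question, here is a minimal sketch of the buffering idea described in the post. It is not the poster's program. The shell function name `split_by_key`, its argument order, the variable names, and the key position `substr($0, 1, 4)` are all assumptions made for illustration; the shell wrapper is only there so the awk program can be run in one step, and the input is assumed to be sorted by key.

```shell
# Hypothetical helper: split_by_key INPUT MIN ENOUGH_FILE FEW_FILE
split_by_key() {
    awk -v min="$2" -v enough="$3" -v few="$4" '
    # Write the n buffered records to "file" and empty the buffer.
    function dump(file,   i) {
        for (i = 1; i <= n; i++) print buf[i] > file
        n = 0
    }
    {
        key = substr($0, 1, 4)            # assumed key position, adjust to your layout
        if (NR > 1 && key != prev) {      # key changed: close the previous group
            if (!qualified) dump(few)     # fewer than min records were buffered
            n = 0
            qualified = 0
        }
        prev = key
        if (qualified) {                  # group already reached min: keep printing
            print > enough
            next
        }
        buf[++n] = $0                     # still collecting records for this key
        if (n == min) {                   # threshold reached: write the buffer out
            dump(enough)
            qualified = 1
        }
    }
    END { if (!qualified && n > 0) dump(few) }   # the last group has no key change after it
    ' "$1"
}
```

The `qualified` flag replaces the separate "equal to", "greater than" and "less than" cases: once a key has reached the threshold, the buffer is no longer needed and records go straight to the output file. The `END` rule handles the final group, which is easy to forget because no key change follows it.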
 
Well, as it turns out, this program will not work. I figured I would take about an hour to fix it, but looking at the logic I will actually have to make a flowchart!
Let me fix it first; then you can post your fixes to the working program.

Norm.
 
Here is a working program. I didn't flowchart it like I should have, but the program works. It takes a file and splits it into two.
You only have to change "minnum" (short for minimum number;
I know, it was a stupid name) and the "key" field. The output will be "file1" and "file2".

Code:
# This program takes a file and splits it into 2 files.
# File one holds the keys that have at least "minnum" records; file two gets the rest.
# At the U.S. post office, if you have 10 records per Carrier Route you get a discount, so first I sort the records by route,
# then I run this program to get at least 10 per route
# (restrictions apply, etc.).
BEGIN {FS="\t"
       OFS="\t"
  print "BEG TIME......... "strftime("%H:%M:%S")
  minnum = 10 # minimum number of records per key to qualify (change me)
  file1 = "c:/file1.dat"
  file2 = "c:/file2.dat"
      }
{ 
# I use fixed records but you could use tab or comma delimited as long
# as you define a key

key=substr($0,150,4)   #change me

#key=$9   #or a field  

if (NR <2) #first record save it 
   {okey=key
   tosave[savedex=1]=$0
   next
   }
if (okey==key && savedex < minnum) 
   {tosave[++savedex]=$0
    next
   }
else{ 
   if (okey==key && savedex == minnum) 
      {for (x1=1;x1<=savedex ;x1++){print tosave[x1] > file1 }
      print $0 > file1
      savedex++
      cnt1=cnt1+savedex
      next
      }
   if (okey==key && savedex > minnum) 
      {  print $0 > file1
      cnt1++
      savedex++
      next
      }
   if (savedex < minnum) 
      {for (x1=1;x1<=savedex ;x1++){print tosave[x1] > file2 }
      savedex=1
      tosave[savedex]=$0
      okey=key
      next
      }
   if (savedex == minnum) 
      { for (x1=1;x1<=savedex ;x1++){print tosave[x1] > file1 }
      cnt1=cnt1+savedex
      savedex=1
      tosave[savedex]=$0
      okey=key
      next
      }
   if (savedex > minnum) 
      {savedex=1 #reset index
      tosave[savedex]=$0
      okey=key
      next
      }
   print "LOGIC ERROR"
   exit
}
}
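END {# flush the last group: no key change follows it, so the main block never writes it
     if (savedex > 0 && savedex <= minnum)
        {lastfile = (savedex == minnum) ? file1 : file2
        for (x1=1;x1<=savedex ;x1++){print tosave[x1] > lastfile }
        if (savedex == minnum) {cnt1=cnt1+savedex}
        }
     print "INPUT FILENAME... " FILENAME
     print "SELECTED RECORDS. " cnt1+0
     print "RECORDS  ........ " NR
     print "END TIME ........ " strftime("%H:%M:%S")
    }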
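
As an alternative to buffering, the same split can be sketched with two passes over the file: count the records per key first, then route each record by its key's count. This is a different technique from the program above, not a rewrite of it. The function name `split_two_pass`, its argument order, and the key position `substr($0, 1, 4)` are assumptions for illustration.

```shell
# Hypothetical helper: split_two_pass INPUT MIN FILE1 FILE2
split_two_pass() {
    awk -v min="$2" -v file1="$3" -v file2="$4" '
    { key = substr($0, 1, 4) }              # assumed key position, adjust to your layout
    NR == FNR { count[key]++; next }        # pass 1: count the records for each key
    { print > (count[key] >= min ? file1 : file2) }   # pass 2: route by the count
    ' "$1" "$1"                             # the input file is named twice, once per pass
}
```

Because the counts live in an associative array, the input does not need to be sorted, and there is no special case for the last group. The trade-off is that the file is read twice and that a key is counted over the whole file rather than per consecutive run, which only differs from the buffered version when the input is unsorted.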
 