
script help 2

Status
Not open for further replies.

mdarsot

I have a file like the one below:

44309,266,01/22/2009 06:03:31,44224477173,00042000156022508735
44309,266,01/22/2009 06:04:07,44224223732,00042000156022509112
44309,266,01/22/2009 06:04:07,44224536952,00042000156022510250
44309,266,01/22/2009 06:04:07,44224642506,00042000156022510285
44309,266,01/22/2009 06:09:37,44224470142,00042000156022508745
44309,266,01/23/2009 06:03:31,44224477173,00042000156022508735



The serials in the last field repeat many times in the file; however, I only want to select the very last instance of each serial, based on date and time. How can I do that?

In the above example the first record will be dropped, because the exact same code (field 5) repeats in the very last record with a different date.
 
The output will look exactly the same as the input, except that records whose serial repeats with an earlier timestamp are removed.

So in the above case the output will look like:
44309,266,01/22/2009 06:04:07,44224223732,00042000156022509112
44309,266,01/22/2009 06:04:07,44224536952,00042000156022510250
44309,266,01/22/2009 06:04:07,44224642506,00042000156022510285
44309,266,01/22/2009 06:09:37,44224470142,00042000156022508745
44309,266,01/23/2009 06:03:31,44224477173,00042000156022508735


It will drop the first record because field 5 repeats at the end, and the last one has the most recent date and timestamp.
 
Hi

I think I got it.
Code:
awk -F , 'FNR==NR{l[$5]=d($3);next}l[$5]<=d($3);function d(w){return substr(w,7,4)"/"substr(w,1,5)substr(w,11)}' /input/file /input/file
Tested with [tt]gawk[/tt] and [tt]mawk[/tt].

Note: the /input/file parameter is passed twice on purpose; it is not a typo.
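
For readers unfamiliar with the trick of passing the same file twice, here is a minimal sketch of how [tt]FNR==NR[/tt] separates the two passes. The two-line sample file and the "pass 1" / "pass 2" labels are made up for illustration only.

```shell
#!/bin/sh
# Hypothetical two-line sample file, created only for this illustration.
tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"

# FNR restarts at 1 for each input file while NR keeps counting across files,
# so FNR==NR is true only while the first copy of the file is being read.
out=$(awk 'FNR==NR { print "pass 1: " $0; next } { print "pass 2: " $0 }' "$tmp" "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Every line shows up once per pass, which is what lets the first pass collect information that the second pass then uses to filter.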

Feherke.
 
Great!! It works.

Can you please explain the code? I'd appreciate that.

Thanks
 
Hi

Code:
awk
-F ,           [gray]# set the input field separator to ,[/gray]
'    
FNR==NR {      [gray]# if processing the first input file[/gray]
  l[$5]=d($3)  [gray]# put the time in the array l at index $5[/gray]
  next         [gray]# advance to next record skipping further code[/gray]
}

               [gray]# processing the second input file[/gray]
l[$5]<=d($3)   [gray]# if current record's time is maximum or greater (*)[/gray]
               [gray]# perform default action ( print current record )[/gray]

function d(w)  [gray]# declare the time formatting function[/gray]
{              [gray]# makes it yyyy/mm/dd hh:nn:ss for comparison[/gray]
  return substr(w,7,4)"/"substr(w,1,5)substr(w,11)
}
'
/input/file    [gray]# pass 1 : record each serial's last-seen time[/gray]
/input/file    [gray]# pass 2 : print records at or after that time[/gray]
[gray](*)[/gray] - [tt]<=[/tt] is quite pointless, [tt]==[/tt] would be enough
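
To see why the time gets reformatted before any [tt]<=[/tt] comparison, the sketch below runs the same [tt]d()[/tt] function on two timestamps. The values of a and b are made up, chosen to straddle a year boundary; awk compares them as strings, so the raw mm/dd/yyyy form orders them the wrong way round.

```shell
#!/bin/sh
# d() copies the substr() calls from the post; the timestamps a and b are
# hypothetical values chosen to straddle a year boundary.
out=$(awk '
function d(w) { return substr(w,7,4) "/" substr(w,1,5) substr(w,11) }
BEGIN {
  a = "12/31/2008 23:59:59"
  b = "01/01/2009 00:00:00"
  print d(a)
  raw  = (a < b) ? "earlier" : "later"
  conv = (d(a) < d(b)) ? "earlier" : "later"
  print "raw compare: a is " raw
  print "converted compare: a is " conv
}')
printf '%s\n' "$out"
```

With the year moved to the front, plain string comparison agrees with chronological order.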

If we change the comparison to [tt]==[/tt], there is no need to format the time :
Code:
awk -F , 'FNR==NR{l[$5]=$3;next}l[$5]==$3' /input/file /input/file
But one question: are the records always in chronological order, so that the maximum time is always the last?

Feherke.
 
Thanks

No, the records are not in chronological order; they could be in random order, with the same serial appearing with a different time or date anywhere in the file.
 
Hi

Then a bit of modification would be better, to ensure the maximum times are picked up wherever they are.
Code:
awk -F , 'FNR==NR{if(l[$5]<(t=d($3)))l[$5]=t;next}l[$5]==d($3);function d(w){return substr(w,7,4)"/"substr(w,1,5)substr(w,11)}' /input/file /input/file

Feherke.
 
Thanks

However, would it check the date and time and pick out the most recent?
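
A quick way to check this kind of question is to run the two-pass command on deliberately unsorted data. The sketch below is the same logic spread over several lines; the test file is hypothetical, built from three records of the original sample and reordered so that the newest record comes first.

```shell
#!/bin/sh
# Hypothetical test file: three records from the thread, reordered so the
# newest record for the serial ending in 8735 comes first instead of last.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
44309,266,01/23/2009 06:03:31,44224477173,00042000156022508735
44309,266,01/22/2009 06:04:07,44224223732,00042000156022509112
44309,266,01/22/2009 06:03:31,44224477173,00042000156022508735
EOF

out=$(awk -F, '
function d(w) { return substr(w,7,4) "/" substr(w,1,5) substr(w,11) }
FNR == NR {                 # pass 1: keep the largest time seen per serial
  t = d($3)
  if (l[$5] < t) l[$5] = t
  next
}
l[$5] == d($3)              # pass 2: print records carrying that largest time
' "$tmp" "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

On this input the older 01/22/2009 06:03:31 record is dropped even though it is the last line of the file. If two records for one serial share the same newest timestamp, both are printed.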
 
If preserving the original sequence is not required, you don't need to read the file twice:
Code:
awk -F, '{t=substr($3,7,4)substr($3,1,5)substr($3,12);if(x[$5]<t){x[$5]=t;l[$5]=$0}}END{for(i in l)print l[i]}' /path/to/input

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
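
One caveat on the single-pass approach: awk does not define the order in which [tt]for(i in l)[/tt] visits the array, so the output order can differ between awk implementations. A possible variation, sketched below, prints the kept records in the order in which each serial first appeared. The names order and n and the three-record test file are assumptions added for illustration; they are not part of the posted command.

```shell
#!/bin/sh
# Hypothetical test file: three records from the thread in non-chronological order.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
44309,266,01/23/2009 06:03:31,44224477173,00042000156022508735
44309,266,01/22/2009 06:04:07,44224223732,00042000156022509112
44309,266,01/22/2009 06:03:31,44224477173,00042000156022508735
EOF

out=$(awk -F, '
{
  t = substr($3,7,4) substr($3,1,5) substr($3,12)   # sortable time key
  if (!($5 in x)) order[++n] = $5                   # remember first-seen order
  if (!($5 in x) || x[$5] < t) { x[$5] = t; l[$5] = $0 }
}
END { for (i = 1; i <= n; i++) print l[order[i]] }  # one record per serial
' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

The order follows the first appearance of each serial, which is not necessarily the position of the record that is finally kept. On a timestamp tie the first record seen wins, since the comparison is a strict less-than.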
 