Extract by recent date timestamp

Status
Not open for further replies.

Zahier

Hi techies,
I have a comma-delimited file in this format:

aa,001,2003-08-14-10.57.39.807324,aa,
aa,001,2003-08-15-10.57.39.807324,aa,
aa,003,2003-08-17-11.57.39.807324,aa,
aa,003,2003-08-17-10.57.39.807324,aa,
aa,003,2003-08-17-10.59.39.807324,aa,

Field 3 is a timestamp. For each distinct value of field 2, I want to keep only the record with the most recent timestamp in field 3. For example, from the data above I want to end up with only this in my output:

aa,001,2003-08-15-10.57.39.807324,aa,
aa,003,2003-08-17-11.57.39.807324,aa,

I am getting stuck on finding the most recent date.

 
Try

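BEGIN { FS = "," }
{
  # First record seen for this key (field 2): keep it
  if (!($2 in a)) { a[$2] = $0; next }
  # Split the stored and the current timestamp (field 3) into
  # year, month, day (a1/b1) and hour, minute, second, microseconds (a2/b2)
  split(a[$2], a0, ",")
  split(a0[3], a1, "-")
  split(a1[4], a2, ".")
  split($3, b1, "-")
  split(b1[4], b2, ".")
  # Compare from most to least significant part; keep the newer record
  if (a1[1] > b1[1]) next
  if (a1[1] < b1[1]) { a[$2] = $0; next }
  if (a1[2] > b1[2]) next
  if (a1[2] < b1[2]) { a[$2] = $0; next }
  if (a1[3] > b1[3]) next
  if (a1[3] < b1[3]) { a[$2] = $0; next }
  if (a2[1] > b2[1]) next
  if (a2[1] < b2[1]) { a[$2] = $0; next }
  if (a2[2] > b2[2]) next
  if (a2[2] < b2[2]) { a[$2] = $0; next }
  if (a2[3] > b2[3]) next
  if (a2[3] < b2[3]) { a[$2] = $0; next }
  # Same second: let the microseconds decide
  if (a2[4] < b2[4]) a[$2] = $0
}
END {
  for (j in a) print a[j]
}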

CaKiwi
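
Every part of the timestamp is zero-padded and ordered from most to least significant (year, month, day, hour, minute, second, microseconds), so a plain string comparison of field 3 should give the same ordering as comparing the pieces one by one. A shorter sketch under that assumption follows; the function name latest_per_key is only illustrative, and memory use still grows with the number of distinct values in field 2:

```shell
# Keep, for each value of field 2, the record with the greatest field 3.
# Assumption: field 3 is always a fixed-width, zero-padded timestamp.
latest_per_key() {
    awk -F, '
        # Appending "" forces a string comparison rather than a numeric one.
        !($2 in ts) || ($3 "") > (ts[$2] "") { ts[$2] = $3; rec[$2] = $0 }
        # The order of "for (k in rec)" is unspecified, so sort afterwards if needed.
        END { for (k in rec) print rec[k] }
    ' "$@"
}

# Sample data from the question; sort only makes the output order predictable.
printf '%s\n' \
    'aa,001,2003-08-14-10.57.39.807324,aa,' \
    'aa,001,2003-08-15-10.57.39.807324,aa,' \
    'aa,003,2003-08-17-11.57.39.807324,aa,' \
    'aa,003,2003-08-17-10.57.39.807324,aa,' \
    'aa,003,2003-08-17-10.59.39.807324,aa,' |
    latest_per_key | sort
```

On the sample data this keeps the 2003-08-15 record for 001 and the 11.57.39 record for 003.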
 
Another way:
sort -t , -k 2,2 -k 3r /path/to/input | awk -F, '!a[$2]++'

Hope This Helps, PH.
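
To see why the one-liner works: sort groups the records by field 2 and, within each group, puts the largest field 3 first, and the awk pattern is then true only the first time a given field 2 value appears. Below is a small demonstration on the sample data from the question. The wrapper name newest_first is hypothetical, LC_ALL=C is added only to make the ordering locale-independent, and -k 3,3r limits the key to field 3 (on this data it orders the same way as -k 3r):

```shell
# Sort by field 2, then by field 3 descending, and keep the first
# record of each field 2 group (the newest one).
newest_first() {
    LC_ALL=C sort -t , -k 2,2 -k 3,3r "$@" | awk -F, '!seen[$2]++'
}

# The here-document stands in for /path/to/input.
newest_first <<'EOF'
aa,001,2003-08-14-10.57.39.807324,aa,
aa,001,2003-08-15-10.57.39.807324,aa,
aa,003,2003-08-17-11.57.39.807324,aa,
aa,003,2003-08-17-10.57.39.807324,aa,
aa,003,2003-08-17-10.59.39.807324,aa,
EOF
```

Note that the seen array still holds one entry per distinct field 2 value, so awk's memory use grows with the number of keys.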
 
Appreciate the help CaKiwi and PHV.

I tested your script, CaKiwi, and it works perfectly on a small sample of about 1,000 records. But when I run it on a 7.5 million record file, it fails with "out of memory" messages.

PHV's script also works on a small sample but fails with memory errors on the big file.

What I've done is ask the developers to give me the file in sorted order, first by field 2 and then by field 3 ascending. I then used this script to detect when field 2 changes and print only the last record of each group, which is the most recent one because the dates are in ascending order.

awk 'BEGIN { FS = "," }
{
  # Field 2 changed: the saved line is the last (newest) record of the previous group
  if (var != $2 && NR != 1)
    print line
  var = $2
  line = $0
}
END { print line }' "$inpfile"

Again, thanks for the help. STAR for each of you :)
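
If the file cannot be delivered pre-sorted, the same last-record-per-group idea can be combined with a sort step. The following is a sketch only, with last_per_group as an illustrative name. The awk part holds just the previous key and line, so its memory use does not grow with the file. Whether the sort step copes with 7.5 million records depends on the sort implementation and the available temporary space, which the thread does not say.

```shell
# Sort by field 2, then field 3 ascending, and print the last record
# of each field 2 group.
last_per_group() {
    LC_ALL=C sort -t , -k 2,2 -k 3,3 "$@" |
    awk -F, '
        # Key changed: the saved line is the newest record of the previous group.
        NR > 1 && $2 != key { print line }
        { key = $2; line = $0 }
        # Guard against empty input before printing the final group.
        END { if (NR > 0) print line }
    '
}

# Unsorted sample data from the question.
printf '%s\n' \
    'aa,001,2003-08-14-10.57.39.807324,aa,' \
    'aa,001,2003-08-15-10.57.39.807324,aa,' \
    'aa,003,2003-08-17-11.57.39.807324,aa,' \
    'aa,003,2003-08-17-10.57.39.807324,aa,' \
    'aa,003,2003-08-17-10.59.39.807324,aa,' |
    last_per_group
```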
 