retrieving a date from somewhere in a free-format field

Status
Not open for further replies.

ianholmes

Programmer
Mar 24, 2001
61
GB
I am being passed data in a 140-character string containing a date in the format DD xxxx YY, where xxxx can be JANUARY, FEBRUARY, etc.
The date can be anywhere in the string.
Is there any way of identifying it and turning it into YYMMDD format?
I had thought of getting the first character of the month and then locating the leading and trailing spaces around it, but I am not clever or skilled enough.
 
awk is your friend.
A sample, pls.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
How about
Code:
#!/bin/awk -f
#
BEGIN {
    m = ""
    m = m "JANUARY,FEBRUARY,MARCH,APRIL,MAY,JUNE,"
    m = m "JULY,AUGUST,SEPTEMBER,OCTOBER,NOVEMBER,DECEMBER"
    split(m,arr,",")

    # setup months_re=(JANUARY|FEBRUARY|...)
    # this is the regular expression which matches any month
    months_re = "("
    for ( i = 1 ; i <= 12 ; i++ ) {
        months_re = months_re arr[i]
        if ( i < 12 ) months_re = months_re "|"
    }
    months_re = months_re ")"

    # setup name2num["JANUARY"] = 1
    for ( i = 1 ; i <= 12 ; i++ ) name2num[arr[i]] = i

    # the regex to match the whole thing
    # basically, a pair of digits either side of the months
    date_re = "[0-9][0-9] " months_re " [0-9][0-9]"
}

# find the date in the line
match($0,date_re) > 0 {
    split(substr($0,RSTART,RLENGTH),arr)
    printf("%02d%02d%02d\n",arr[3],name2num[arr[2]],arr[1])
}

Example
Code:
hi there 01 JANUARY 02 there there
fee fi foe fum 02 FEBRUARY 02

produced
020101
020202
 
Hi vlad


Here is such a record.


RECEIVED 6,500 SHS abc CORP (XYZ GROUP) COM USD2.5 ON 29 MAY 03 AT 12.3066 THRU mnop GROUP INC, USA COMMISSIONS & OTHER MKT CHARGES 65.07 TRADE REF#:20030599 CLIENT REF:Z35

But subsequent records will have the date elsewhere in the character string.
 
or a variation on Salem's theme:


BEGIN {
    mon = "JANUARY|FEBRUARY|MARCH|APRIL|MAY|JUNE|JULY|AUGUST|SEPTEMBER|OCTOBER|NOVEMBER|DECEMBER"
    monN = split(mon, monA, "|")
    # re-key the array by month name: monA["JANUARY"] = 1, ...
    for (i = 1; i <= monN; i++) {
        monA[monA[i]] = i
        delete monA[i]
    }

    # one or more digits, a month name, one or more digits
    monMatch = "[0-9][0-9]* " "(" mon ")" " [0-9][0-9]*"
}

match($0, monMatch) {
    split(substr($0, RSTART, RLENGTH), a)
    # print year, month, day to give the requested YYMMDD
    printf("%02d%02d%02d\n", a[3], monA[a[2]], a[1])
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
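
For anyone wanting to try this on the record posted above, here is a minimal sketch that wraps the same match/split idea in a shell function and feeds it that record. The function name extract_yymmdd and the array names are made up for illustration and are not from either post; it assumes a POSIX awk with match(), RSTART and RLENGTH, and that the day and year are always two digits.

```shell
#!/bin/sh
# extract_yymmdd is a hypothetical helper name, not from the thread.
# It reads records on stdin and prints YYMMDD for each record that
# contains a "DD MONTH YY" date; records without one print nothing.
extract_yymmdd() {
    awk '
    BEGIN {
        n = split("JANUARY FEBRUARY MARCH APRIL MAY JUNE JULY AUGUST SEPTEMBER OCTOBER NOVEMBER DECEMBER", names, " ")
        # build (JANUARY|FEBRUARY|...) and the name -> number lookup
        months_re = "("
        for (i = 1; i <= n; i++) {
            name2num[names[i]] = i
            months_re = months_re names[i]
            if (i < n) months_re = months_re "|"
        }
        months_re = months_re ")"
        # two digits, a month name, two digits
        date_re = "[0-9][0-9] " months_re " [0-9][0-9]"
    }
    match($0, date_re) {
        # part[1] = day, part[2] = month name, part[3] = year
        split(substr($0, RSTART, RLENGTH), part, " ")
        printf("%02d%02d%02d\n", part[3], name2num[part[2]], part[1])
    }'
}

record='RECEIVED 6,500 SHS abc CORP (XYZ GROUP) COM USD2.5 ON 29 MAY 03 AT 12.3066 THRU mnop GROUP INC, USA COMMISSIONS & OTHER MKT CHARGES 65.07 TRADE REF#:20030599 CLIENT REF:Z35'
printf '%s\n' "$record" | extract_yymmdd
```

With that record it should print 030529. Only the first date found in each record is reported, since match() returns the leftmost match.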
 
Hi vlad & salem
Many thanks
I'll need to get myself sent on an awk course.
Regards
 