Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to pass value of datastep variable to a SAS macro 1

Status
Not open for further replies.

Jubinell

Technical User
Oct 21, 2008
14
JP
Hi,

I'm trying to create a macro that returns the number of months elapsed between two dates. The macro would be called in a datastep pulling in the dates from two variables and returning the months difference into a new variable.

Here is what I have so far:

*************************

%macro months_elapsed(date1, date2);
%let year1 = year(&date1);
%let year2 = year(&date2);
%let month1 = month(&date1);
%let month2 = month(&date2);
%let day1 = day(&date1);
%let day2 = day(&date2);

%let month_diff = &year2*12 + &month2 - &year1*12 - &month1;
%if &day1<&day2 %then
%let month_diff = &month_diff - 1;

&month_diff.;
%mend;

data test2;
set test;
month_diff = %months_elapsed(firstdate,seconddate);
run;


*************************

Everything works splendidly up until the if statement, where SAS cannot resolve my macro variable to be the value of the variables firstdate and seconddate, but instead thinks that &day1 = "firstdate" and &day2 = "seconddate". How do I make sure that the values of firstdate and seconddate are probably carried through into the macro?
 
Hi Jubinell,

SAS already has a function for calculating time differences between dates called intck, so a custom macro is overkill.

The usage of a macro to assign a value to a datastet variable is a common misunderstanding of macros, usually indicative of a programmer coming from another programming language - myself included.

You can create a "function" and use it as you intended provided that only datastep code is used as I have shown below.

As macro code is executed prior to SAS code you need to use the resolve function to ensure the macro execution is carried out at datastep run time. Again, I can only stress that this is not a good technique. I have also made some changes to your macro code: use the macro function %sysfunc to make datastep functions availiable to the macro; use the %eval function to perform integer arithmetic in macros; and don't add a semi-colon to the name of the variable you want returned.

HTH

Code:
%macro months_elapsed(date1, date2);
%let year1 = %sysfunc(year(&date1));
%let year2 = %sysfunc(year(&date2));
%let month1 = %sysfunc(month(&date1));
%let month2 = %sysfunc(month(&date2));
%let day1 = %sysfunc(day(&date1));
%let day2 = %sysfunc(day(&date2));
%let month_diff = %eval(&year2*12 + &month2 - &year1*12 - &month1);
&month_diff
%mend;

%macro month_diff(startdate,enddate);
   intck('month',&startdate,&enddate)
%mend;

data;
   input (firstdate seconddate)(:date9.) @@;
   format firstdate seconddate date9.;
   cards;
01/Jan/07 03/Mar/08 08/Sep/06 09/Aug/07
01/Feb/05 07/Mar/06 18/Aug/01 21/Aug/01
   ;run;
proc print;run;
data;
   set data1;
   month_diff_macro = resolve('%months_elapsed('||firstdate||','!!seconddate!!')');
   month_diff_func  = intck('month',firstdate,seconddate);
   month_diff_wmacro= %month_diff(firstdate,seconddate);
proc print;run;

 
Dear kdt82,

That was extremely useful. I'm glad that right after joining the forum I was exposed to such quality responses from you.

I've used your code with a slight modification that differentiates the interval depending on whether the second day of the month is greater or less than the first day of the month (01/15/2000 - 02/14/2000 should yield a different result than 01/15/2000 - 02/16/2000). This is something that the intck() function cannot do because the subperiod for the month interval of intck() can only be months, not days of the month.

A final note, why is this bad practice? Do you mean it is inefficient? As far as I can see there are no extraneous variables being created or unnecessary steps. Please enlighten me on this matter.

***

%macro months_elapsed(date1, date2);
%let year1 = %sysfunc(year(&date1));
%let year2 = %sysfunc(year(&date2));
%let month1 = %sysfunc(month(&date1));
%let month2 = %sysfunc(month(&date2));
%let day1 = %sysfunc(day(&date1));
%let day2 = %sysfunc(day(&date2));
%let month_diff = %eval(&year2*12 + &month2 - &year1*12 - &month1);
%if &day1<&day2 %then
%let month_diff = %eval(&month_diff - 1);
&month_diff
%mend;

******
 
Hi Jubinell,

Yes, the use of macros in this way really is quite inefficient. Let's take a trivial example of performing a simple addition on two variables and storing it in a third using the normal datastep vs passing each observation to a macro over a million records.

Code:
data test;
do i=1 to 1000000;
   x = ceil(ranuni(0)*100);
   y = ceil(ranuni(0)*50);
   output;
end;
run;

data one;
set test;
z = x + y;
run;

%macro add(val1,val2);
%eval(&val1 + &val2)
%mend;

data two;
set test;
z =resolve('%add('!!x!!','!!y!!')');
run;

Over 3 runs the datastep version averages 0.51 seconds whereas the macro takes 1:44.46 - that's over 200 times longer. There are also other potential pitfalls associated with using macros for performing calculations as macros process strings, therefore even numeric parameters are converted to strings and back to numbers when using %eval, and then back to strings again.

For your particular example, I would stick to the datastep or simply use the macro as a wrapper for the datastep code. This can be quite succintly written as:

Code:
%macro month_diff(date1, date2);
year(&date2)*12 + month(&date2)  - year(&date1)*12 - 
         month(&date1) - (day(&date1) < day(&date2));
%mend;

data have;
   input (firstdate seconddate)(:date9.) @@;
   format firstdate seconddate date9.;
   cards;
01/Jan/07 01/Feb/07 08/Sep/06 09/Aug/07
01/Feb/05 07/Mar/06 18/Aug/01 21/Aug/01
   ;run;
proc print;run;
data;
   set have;
   month_diff_wmacro= %month_diff(firstdate,seconddate);
proc print;run;

This type of macro is only executed once to create the underlying datastep code, and therefore has a performance overhead of next to nil.

And once again, if you are used to creating functions, then I have been assured by someone from SAS that 9.3 will allow you to create your own functions.

HTH
 
A macro here might actually be overkill, I think it can be done in one line of code.
Also, please not guys that the INTCK function does not count the number of months between 2 dates per se. What it actually does is cound the number of times the 1st has rolled around between the 2 dates, so if date 1 is the 30th of August, and date2 is the 2nd of September, it'll return a value of 1 for
months = INTCK('MONTH','30AUG2008'd,'02SEP2008'd);

However, SAS had a really good article up on their support site a while back (it's probably still there but I can't find it). The article showed how to calculate a persons age accurately in years in 1 line.

age = floor((intck('month',birth,somedate)- (day(somedate) < day(birth))) / 12);

I think if you take out the /12 bit at the end, and drop the "floor() function), you should get your value...
Test it out, see how you go.


Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
One thing to note, and this bit took me a while to understand, is the second half of that statement.

months = intck('month',birth,somedate)- (day(somedate) < day(birth));


"- (day(somedate) < day(birth))"

This performs a logical test (is day of the month of "somedate" less than the day of the month of "birth") and returns a TRUE or FALSE. A TRUE actually equates to 1 and a FALSE to 0, so, if it is true then 1 is subtracted from the result, otherwise nothing happens.
This corrects for the INTCK quirk I explained earlier.

I hope that this helps.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Hi Chris,

This does indeed look to solve the OP's problem. I had actually presented both halves of that equation, just never thought to combine then :(

I still maintain however, that it is perfectly acceptable to encapsulate that code in a macro to treat as a function. In fact, Ian Whitlock commented in a talk that "this may be overkill, but it is a good habit".
Also a good point about the boolean logic. I find that I find it particularly good for arithmetic operations or you would like to negate an if then statement. Some other examples are:

Code:
data _null_;
a = 17;
b = 17;
c = -16;
is_zero_a     = ^a; * is missing or zero;
is_zero_c     = ^c; * negative integer is also logically true;
is_not_zero_a = ^^a; * is not zero or missing;
is_equal      = a = b; * equivelence testing ;
complex_bool  = (a = b or (a gt b and a gt c));
put 'NOTE: '(_all_)(=);
run;
 
Cool. I'll try to remember that if I ever need it. First time I came across that a few years back I couldn't for the life of me work out what it was doing. I ended up flagging it as a coding error before the author put me straight. :)

Did you know that in SAS 9.2 you'll be able to compile your SAS code for calculating AGE into a SAS Function rather than a macro? Cool huh? Apparently it'll be a little more efficient.


Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Isn't SAS 9.3 already released, or parts of it are? Can't wait for the ability to write UDFs!
 
No, you might be thinking of 9.1.3, which is the current version. 9.2 is currently out in the wild at a few sites, but it's not been released to everyone yet apparently.

Some really nice enhancements in 9.2, I went to a launch presentation a couple of weeks ago.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top