Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Avoiding data set merging problems when by-variable has different leng

Status
Not open for further replies.

Biguidance

Instructor
Jan 23, 2012
5
BE
When merging 2 tables on a common by-variable you run the risk of having a different length on this (or these) by variable(s). If this is the case, SAS will casually inform you with the following warning:



WARNING: Multiple lengths were specified for the BY variable mergevar by input data sets. This may cause unexpected results.



On first sight you might be inclined to ignore this pretty warning but SAS will do exactly as it states: it may cause unexpected results. In some cases the merge might be successful, other times the merge is partial or just not executed at all, leaving you with an empty table.



Take the following example:


Code:
DATA table1;

          length mergevar $10;

          mergevar = "blabla";

          table1varIwant = "interesting stuff";

RUN;

 

DATA table2;

          length mergevar $11;

          mergevar = "blabla";

          table2varIwant = "more interesting stuff";

RUN;

 

PROC SORT DATA=table1;

          BY mergevar;

RUN;

 

PROC SORT DATA=table2;

          BY mergevar;

RUN;

 

DATA combinedtable;

          MERGE table1 table2;

          BY mergevar;

RUN;


The warning will be shown in the log.



WARNING: Multiple lengths were specified for the BY variable mergevar by input data sets. This may cause unexpected results.



Hopefully you will get the correct result but if you have a lot of observations to merge on, changes are… you won’t.



One way to solve this is by first determining the longest length for the ‘mergevar’ and using that information in a new length statement just before the actual merge.







The PROC SQL puts the longest length in a macro variable called ‘maxlength’.




Code:
PROC SQL;

          SELECT max(length) INTO :maxlength

                   FROM sashelp.vcolumn

                             WHERE libname='WORK' 

                                      and memname in

                                      ("table1", "table2")

                                      and upcase(name)="mergevar";

QUIT;


This macrovariable is than used to set the LENGTH of the mergevar.


Code:
DATA combinedtable;

          LENGTH mergevar $ &maxlength;

          MERGE table1 table2;

          BY mergevar;

RUN;


This merge runs without warning and you should have the expected result.

BIGuidance
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top