Proc Means of Questions Sorted by Dimension

smalek · Feb 19, 2009

Hi,
I need some help with a problem I'm facing. I have list of questions sorted by dimension and service area. I would like to use proc means to output the list of questions with mean of questions within each dimension and service area. For example:
Dimension Service Area Question Score
Access I 56
Access I 78
Access I 22
Info/rights I 55
Info/rights I 44
Info/rights I 66
Access E 68
Access E 35
Access E 95
Info/rights E 75
Info/rights E 77
Info/rights E 20

Desired output:
Dimension Service Area Question Score
Access I 56
Access I 78
Access I 22
Access Av 52
Info/rights I 55
Info/rights I 44
Info/rights I 66
Info/rights Av 55
Access E 68
Access E 35
Access E 95
Access Av 66
Info/rights E 75
Info/rights E 77
Info/rights E 20
Info/rights Av 57.3

Is it possible to acheive this setup with proc means?
Any advice is much appreciated.
Thanks

kdt82 · Feb 19, 2009

Hi Smalek,

I don't think you can do this in proc means/summary directly. Usually storing this type of structure in a dataset is a bad idea, as creating subtotals is usually only for display purposes and is much better handled by proc report.

That said, if you *need* to store it this way, you could try calculating the averages using proc summary/means and sticking them in the right places. As I assume you will want to keep the categories together, it will make your life easier if you create some kind of id variable as a place holder for the subtotals.

Code:

data have ;
   input Dimension :$20. Service_Area :$1.   Question_Score ;
   _id = _n_ ; * Create ID variable ;
   cards ;
Access       I               56
Access       I               78
Access       I               22
Info/rights  I               55
Info/rights  I               44
Info/rights  I               66
Access       E               68
Access       E               35
Access       E               95
Info/rights  E               75
Info/rights  E               77
Info/rights  E               20
;
proc summary data=have mean nway ;
   class dimension service_area ;
   var question_score ;
   id _id ;
   output out = totals(drop=_freq_ _type_) mean= ;
   run ;
data want ;
   set have totals (in=is_total) ;
   if is_total then do ;
      _id=_id + 0.5 ;
      dimension = catx(' ',dimension, 'Av') ;
      question_score = round(question_score,.1) ;
      call missing(service_area) ;
      end ;
   run ;
proc sort data=want out=want(drop=_:);
   by _id ;
   run ;

Alternatively, this type of dataset can be easily created using a data structure called the DOW loop. If you use this, you can get your results in one step, and there is no need to create an id variable.

Code:

 data want (drop=_:);
   do _n_=1 by 1 until(last.service_area) ;
      set have ;
      by dimension service_area notsorted ;
      _tot = sum(0,_tot,question_score) ;
      output ;
      end ;
   dimension = catx(' ',dimension, 'Av') ;
   call missing(service_area) ;
   question_score=round(divide(_tot, _n_),.1) ;
   output ;
   run ;

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Proc Means of Questions Sorted by Dimension

smalek

Programmer

kdt82

Programmer

Similar threads

Part and Inventory Search

Sponsor