Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

export to multiple excel tabs 3

Status
Not open for further replies.

aybooth

MIS
Feb 6, 2004
9
US
How can I export from one sas data set to multiple excel tabs on one spreadsheet. I want to segment the dataset using a variable I have created which denotes which department the records belong to.

Thanks

Anthony
 
I'm not sure if there is a direct method, but SAS help has examples of exporting to a specific sheet and exporting based on a criteria. So, you could adapt the examples like i've done below, ie get a unique list of values that use are segmenting by (in this example its 'M' & 'F' (sex)) and store them in a macro variable. Then iterate over the values and use proc export to export to excel, using the value as the filter.

it assumes the values you are segmenting by are valid names for excel sheets.

Code:
%macro export_to_excel();
%local varlist idx var;

proc sql noprint;
  select distinct sex into: varlist separated by '||'
  from sashelp.class;
quit;

%let idx = 1;
%do %while ( %scan(&varlist, &idx, %str(||)) ne %str() ); 
    %let var=%scan(&varlist, &idx, %str(||));
    proc export data=sashelp.class (where=(sex="&var"))
       outfile='c:\temp\Femalelist.xls'
       dbms=excel; 
       sheet="&var";
    quit;
    %let idx = %eval(&idx + 1);
%end;
%mend export_to_excel;

%export_to_excel;
 
Check out the ExcelXP tagset, that'll do the trick for you, assuming you have SAS v9

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

Thanks for the suggestion. I gave it a go and it almost worked. That was enough inspiration for me search other techniques. Many were very close to working, but each suffered from the same problem. Randomly throughout the document carriage returns and line feeds would appear. Unfortunately, these would appear in the middle of text - both data text and tag text. I just tried the CHTML method, which incorporates minimal formatting. That was helpful, and most of the data came over. Still, with tags like this:
Code:
<t
h align="right">
It's hard for Excel to recognize that this is the opening of a th tag and data cells were a bit mashed.

I suspect these work just fine when using SAS for Windows. However, I'm not. Have you run across examples of using these solutions for SAS 9.1.3 for z/OS?

Thanks again for the inspiration,
Larry
 
Sorry, I'm a Windows or Unix man, never played with z/OS I'm afraid.
It might be worth checking the SAS Support site to see if there are any examples for z/OS, they're usually pretty good, though having just had a quick scan myself I can't see anything.
If you want me to take a look at it, could you post your code, and a sample of what that data looks like that generates that problem, and what the full output file looks like. Best bet is to pare the file down to just a few example records and take out any identifying fields and double check that it still produces the problem (ie produce as small a set of data as you can, with as short a piece of code you can, while still producing the same effect). If it does, post it here, and I'll run it on my machine here, and that'll allow us to see if it's a z/OS issue or a tagset/data issue.
One thing I've found with writing data out in ExcelXP is that you do sometimes find special characters hidden in the original Data which SAS ignores quite happily, but which Excel then tries to act on...

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

Thanks for taking the time to reply. You've seen the same thing I've seen - lots of good examples, but nothing for the mainframe.

I had already tried the data filtering technique. The original output was over 25k lines - too many to work with. I pared it down to about 200 and was able to spot what was going on - where I was getting carriage returns & line feeds that were breaking tags. I manually removed them with a simple text editor and the file came up perfectly.

The data is coming from DB2 and there's no characters that are problematic for Excel. Our method is output the data into a comma-separated file, saved as text. When Excel opens the file, it brings up an import wizard where we're able to dictate cell format. This works better than creating a csv file. While Excel opens csv files natively, it also takes over formatting and is wrong on 99% of the files we generate (e.g. text fields with leading zeros stripped). We also use Microsoft Access to query DB2 and export to Excel. Once again, data's good there. The culprit lies somewhere with ODS and the mainframe. It could be SAS, it could be JCL, it could be the z/OS file format. Who knows?

Anyway, thanks again for your suggestions.

Oh, and love the titles: Code Monkey, Data Wrangler. Sounds like my job!
 
Are the carriage returns and line feeds in the original data, or are they being introduced by the output step?
If it's in the original data (and it wouldn't surprise me if it is) you should be able to remove them using the COMPRESS function ie
Code:
   field = compress(field,'0D'x '0A'x);
If it's the output step, there might be options that you can set if you're using say Proc Report, to sort out line wrapping.

There's also the option of making changes to the tagset to do what you want it to. There's a discussion on this here:-


Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

No, they're not in the original data. In fact, for the most part the output data is just fine. As you can see from the example in my original post, the line feed/carriage return was put in the table header tag. It actually split the 't' and 'h' in the tag. Most th tags came across just fine, but this one didn't. Another that I remember was
Code:
<td align=ri
ght>
Of course, this *should* have been
Code:
<td align=right>
Naturally, the former was not processed as a valid tag.

Now that I think about it, it, I'm pretty sure ODS is not the culprit. It wouldn't make sense that SAS would stick spurious carriage returns & line feeds *inside* the html tags their program is generating. Outside, maybe, but not inside. This has to be a problem with z/OS, the JCL, or possibly character encoding. Well, unless I'm wrong of course...

- Larry
 
Could be a line length issue then, is your output file defined in the JCL? If so, maybe you need to set the line/record length of the output file. Maybe even in the SAS options, though I've never seen an ODS file affected by the SAS line length...

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

Had to get back to work for a spell, so couldn't persue this for a spell. Anyway, I found cause of the problem. I had set the records to fixed block. When I changed to variable block, the random line feeds disappeared. Perfectly logical now, but I was focused on SAS instead of the JCL.

Nevertheless, your idea of working with the record length helped me look in the right area.

I'll start a new thread with my next set of questions.

Thanks for the assist,
Larry
 
Oh, I should mention that this works for ODS CHTML, but not for ODS TAGSETS.EXCELXP. Not sure why FB vs VB works for the former and not the latter, but that's what it's doing.
 
My guess is that it gets to the end of the line and adds in a linefeed to wrap it, but some of the responses from SAS seem kind of arcane at times... Glad I was able to help. I've seen similar issues before when outputting from SAS, there are also line length options within SAS which might need to be expanded. A good way to check current settings is to run :-
Code:
proc options;
run;
which will list all the options currently set in your log. Peruse the list, find the line related options and see if any of those might be interfering.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

I found the answer: As soon as I added RS=NONE, everything fell into place and I no longer had oddly placed carriage returns:
Code:
ODS CHTML FILE=EXCEL RS=NONE                               
  HEADTEXT="<STYLE> TD {MSO-NUMBER-FORMAT:\@}</STYLE>";    
  TITLE;                                                   
  PROC PRINT DATA=EMPS_IN NOOBS;                           
  RUN;                                                     
ODS CHTML CLOSE;

By default, ODS places record separator characters in stream and writes the file as one big file stream. If the stream is larger than the block size defined in your JCL (32k is the max allowed), then the job inserts a line break. Apparently, when you set record selectors to none, ODS writes each line separately. As long as the block size is larger than the longest single record, then the record won't span blocks and won't have those unexpected line feeds.

Hopefully this post will help others because the solution is far from intuitive and took quite awhile to find.

- Larry
 
Great, thanks for posting the solution. I may have to check out that CHTML tagset. I always use the EXCELXP tagset, with some tweaks of my own that I included, I should check out your way to see if it performs any better, though I've had no problems with ExcelXP.
Cheers.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

I did work with the MSOFFICE2K tagset a bit yesterday. My biggest beef was the format. I couldn't figure out how to create a spreadsheet with 10 pt Arial text, cells bordered, and header row bold. This is the format the CHTML tagset uses.

I tried adding styles to the head text. Unfortunately, ODS embeds styles after the head text, so mine were overwritten. I did find that I could pass a predefined style and that was accepted. I found one that was close enough - definitely better than the default big text on dark grey, with dark blue headers. Good grief who came up with that?

Anyway, here's what I have now:
Code:
ODS TAGSETS.MSOFFICE2K FILE=FILENAME_HERE RS=NONE STYLE=SASDOCPRINTER
  HEADTEXT="<STYLE> TD {MSO-NUMBER-FORMAT:\@}</STYLE>";      
  TITLE;                                                     
  PROC PRINT DATA=EMPS_IN NOOBS;                             
  RUN;                                                       
ODS TAGSETS.MSOFFICE2K CLOSE;

One thing about the CHTML tagset is that it can be a bit smaller since it has no style information embedded. I created a small test file. With MSOFFICE2K, it's 56k. With CHTML, it's 14k. When I tested with a slightly larger file, the MSOFFICE2K version was 2M and the CHTML version was 920k. When I tested on a large file, the MSOFFICE2K version was 25M and the CHMTL version was 11M. That'd be a pretty good reason to use the CHTML version - if formatting is not important.
 
Ah, well, if you use the ODS TAGSETS.EXCELXP file=... method, you can set up your styles using Proc Template, I'd recommend doing it that way as you can set them up once and never need to worry about them again.
I use this to submit my reports...
Code:
ods tagsets.excelxp file="P:\Clients\Active\xxxx\reports\xxxx_&RUNDT..xls"
    options(absolute_column_width='10,10,10,10,8,8,8,8,8,8,8,8,8,8'
            row_repeat='header'
            embedded_titles='yes'
            frozen_headers='yes'
            scale='100' 
            orientation='landscape'
            sheet_name='Report'
           )
   style=styles.psexcel;

  ... report...

ODS tagsets.excelxp close;
The important bit there is the "style=styles.psexcel".

Here's the proc template step I used to create the psexcel stylesheet...
Code:
*Set ODS Path to only store stylesheet for the session. *;
*ods path work.templat(update) sashelp.tmplmst(read) ;
ods path sashelp.tmplmst(update) ;


* Stylesheet for standard colours etc. *;
proc template;
   define style Styles.PSExcel;
      parent = styles.default;
      style SystemTitle from SystemTitle /
         background = #FFFFFF
         foreground = #000000
         font_style = Roman
         font_weight = Bold
         font_size = 4
         font_face = "Arial, Helvetica, Sans Serif";

      style SystemFooter from SystemFooter /
         background = #FFFFFF
         foreground = #000000
         font_style = Italic
         font_weight = Bold
         font_size = 3
         font_face = "Arial, Helvetica, Sans Serif";

      style SysTitleAndFooterContainer from SysTitleAndFooterContainer /
         borderwidth = 0
         background = #FFFFFF
         foreground = #FFFFFF
         font_style = Italic
         font_weight = Bold
         font_size = 3
         font_face = "Arial, Helvetica, Sans Serif";

      style Body from Body /
         rightmargin = 8px
         leftmargin = 8px
         background = #FFFFFF
         foreground = #000000
         font_style = Roman
         font_weight = Medium
         font_size = 2
         font = ("Arial, Helvetica, Sans Serif")
         topmargin=0.75in
         bottommargin=0.75in
         leftmargin=0.75in
         rightmargin=0.75in;;

      style RowHeader from RowHeader /
         background = #99CCFF
         foreground = #000000
         font_style = Roman
         font_weight = Bold
         font_size = 2
         font = ("Arial, Helvetica, Sans Serif")
         vjust = Top
         borderwidth = 2px
         bordercolor = #000000;

      style Header from Header /
         background = #99CCFF
         bordercolor = #000000
         borderwidth = 2px
         foreground = #000000
         font_style = Roman
         font_weight = Bold
         font_size = 2
         just = center
         vjust = bottom
         font = ("Arial, Helvetica, Sans Serif");

      style Data from Data /
         borderwidth = 2px
         bordercolor = #000000
         background = #FFFFFF
         foreground = #000000
         font_style = Roman
         font_weight = Medium
         font_size = 2
         font = ("Arial, Helvetica, Sans Serif");

      style dataemphasis from dataemphasis /
         borderwidth = 3px
         bordercolor = #000000
         background = #99CCFF
         foreground = #000000
         font_style = Roman
         font_weight = BOLD
         font_size = 2
         font = ("Arial, Helvetica, Sans Serif");

      style Table from Table /
         cellspacing = 1px
         bordercolor = #000000
         borderwidth = 1px
         background = #FFFFFF
         foreground = #000000
         font_style = Roman
         font_weight = Medium
         font_size = 2
         font = ("Arial, Helvetica, Sans Serif");
   end;
run;

One thing to remember is that Excel has a very small pallette, if you specify colours that don't fit in the pallette, Excel will pick one that it thinks is closest, so you may not get out what you expect colour-wise.
Also, this template doesn't work so good for outputting to HTML, just Excel.

Enjoy.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Chris,

Another problem I had with EXCELXP is that it stripped leading zeros, even from strings. Also, the file size is the same is MSOFFICE2K. I had looked for a way to override the datatype declaration, but couldn't find one. Here's an example of the offending line:
Code:
<Cell ss:StyleID="Data" ss:Index="1"><Data ss:Type="Number">012345678</Data></Cell>

Here's what I want:
Code:
<Cell ss:StyleID="Data" ss:Index="1"><Data ss:Type="String">012345678</Data></Cell>

Both CHTML and MSOFFICE2K have no problems correctly handling this string as a string. Only EXCELXP changes my string into a number.

While I will likely use CHTML due to the significantly reduced file size, there is one situation where EXCELXP would come in handy - the [tt]OPTIONS (SHEET_INTERVAL = 'BYGROUP')[/tt] statement. Having the ability to programmatically segregate data by sheets in such an easy way would be nice.

Have you seen any method to correct EXCELXP's data type overrides?
 
Chris,

I accidently bumped into the answer on how to override the format ( It took quite a few wrong turns during my Google searches, but I stumbled upon TAGATTR in a document about how to create a text baloon for a cell. I'm still not sure how I found the answer there, because I'm not interested in adding text baloons.

Anyway, after fiddling around a bit with their format, I found that the following works:
Code:
VAR FIELD_NAME / STYLE={TAGATTR='000000000'};

Alas, as soon as I have one var statement on a column, I have to list all my columns in a var statement. I tried cheating by using [tt]VAR _ALL_;[/tt] just after my card above, but SAS saw right through that. It printed said field twice - once from the formatted line and again as part of the _ALL_ line. So, each column has to be listed.

Oh well, I guess that's part of the price of using EXELXP. Still, it does give me multiple worksheets for those rare times when users request them.

- Larry
 
Wow, that's great, heaps better than my method. I actually made some changes to the ExcelXP tagset to add in an alternative TAGATTR to force the data to be treated as text when I tagged it as such.
The problem as it was explained to me by the guy who wrote the EXCELXP tagset is that most of the output procedures output the data only, no metadata, so the ODS tagsets have to try to determine what the data is before they convert it. If there are no characters in the field other than numerics, then ODS has to take it as a numeric. I'll check out the method that you found, seems easier than making amendments to the tagset.

The post which led to me makign a bunch of changes to the ExcelXp tagset is here:-

There is however, an updated version of the ExcelXP tagset released in January this year I believe, so not allt he changes I made are necessary anymore...

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Thanks!

You ever think that you're soooo close to a better solution?

ChrisW75 said:
The problem as it was explained to me by the guy who wrote the EXCELXP tagset is that most of the output procedures output the data only, no metadata, so the ODS tagsets have to try to determine what the data is before they convert it. If there are no characters in the field other than numerics, then ODS has to take it as a numeric.

I understand what you mean about metadata. However, both CHTML and MSOFFICE2K were able to identify which fields were numeric and which ones were strings of numbers. All I needed to do was add [tt]HEADTEXT="<STYLE> TD {MSO-NUMBER-FORMAT:\@}</STYLE>";[/tt]. Those items that were strings in the database were left as strings and the leading zeros were retained. Those items that were numeric in the database were treated as numbers. So, somewhere in the stream, something is being passed that identifies strings of numbers. I wonder what the CHTML and MSOFFICE2K tagsets have that EXCELXP doesn't have that allows them to parse correctly.

It seems we're so close to a better implementation, yet it's beyond me to understand how to bridge the gap.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top