Importing issues with CSV files

Status
Not open for further replies.

SASnoobie

Programmer
Feb 5, 2009
2
US
Hello,

I've run into two issues while importing a CSV file into SAS.

First, after the file has been imported (via proc import), I noticed that ALL my variables are character type, including those that need to be numeric. Is this normal? I know I can convert them to numeric afterwards, but is there a way to import them with the proper types?

Second, I also noticed that some of the variable lengths were truncated after importing. For example, a numeric variable with valid values of 0-100 would need a length of at least 3 digits, but it was cut to 1 after importing.

I'm not sure if this makes sense. Any help would be greatly appreciated. Thanks in advance!
 
OK, first of all, there is an option in Proc Import that tells SAS how many records to read when guessing the data types and lengths of the variables. I think it's GUESSINGROWS=, but you'll be able to find it in the documentation for Proc Import pretty easily. Set it to the maximum value (which I believe is 32767) and that will hopefully fix your length issues.
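
As a rough sketch of what that might look like: the path is the same placeholder used in the data step example in this thread, and the output name work.indat is made up. GUESSINGROWS is a statement for delimited files in recent SAS releases, but check the Proc Import documentation for your version, as both its availability and its maximum value vary.

```sas
/* Sketch only: file path and output dataset name are placeholders */
proc import datafile="c:\data.csv"
            out=work.indat
            dbms=csv
            replace;
  /* Scan more rows before deciding the type and length of each column */
  guessingrows=32767;
run;
```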

Personally, I very rarely use Proc Import to read in data. It's just not reliable enough, for the reasons you've shown above: you can't guarantee what you're going to get out.
I almost always use a data step to read in the data instead, like this:
Code:
data indat;
  * Tell SAS where the file is, and what it looks like*;
  infile "c:\data.csv"
         recfm=v lrecl=2056 dsd dlm=',' missover firstobs=2;

  * Specify the lengths (and types) of the variables *;
  length recid  $20
         var1   8
         var2   4
         var3   $30
         date1  8
         ;
  * Specify any informats... *;
  informat date1 ddmmyy10.;

  * And formats... *;
  format date1 ddmmyy10.;

  * And input the data *;
  input recid
        var1
        var2
        var3
        date1
        ;
run;
There are lots of different ways of reading in the data, but I've found that the method above is the most bulletproof and the most generally applicable to a variety of files.
NB - ALWAYS be careful about specifying numeric variable lengths. Read the SAS documentation about numeric lengths and make sure you understand the ramifications of numeric truncation.
You'll see in your log after running Proc Import that it actually generates code like that above, but not quite so tidy (I think).
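
To illustrate the warning about numeric lengths: a numeric LENGTH in SAS is a number of bytes of storage (3 to 8 on Windows and UNIX), not a number of digits, so values of 0-100 do not need "3 digits" of length. Below is a minimal sketch, with made-up dataset and variable names, of how a short length can silently change a value once it is written to a data set.

```sas
/* Sketch only: trunc_demo, n3 and n8 are made-up names */
data trunc_demo;
  length n3 3 n8 8;   /* 3 bytes of storage vs. the default 8 bytes */
  n3 = 8193;
  n8 = 8193;
run;

/* Truncation happens when the value is stored, so read it back to see it */
proc print data=trunc_demo;
  format n3 n8 best12.;
run;
```

On Windows and UNIX, n3 would be expected to come back as 8192 rather than 8193, since 8192 is the largest integer the SAS documentation lists as exactly representable in 3 bytes on those platforms; n8 should be unaffected. Fractional values lose precision in the same way, which is why leaving numeric variables at length 8 is usually the safe default.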

Hope that this helps.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Thank you for your help Chris.

You're right. Infiling is definitely bulletproof, but cumbersome. I thought proc import, by default, checked the variable types and lengths and imported the data properly. I guess not.

I've been reading many of your helpful posts/threads. Thank you and keep up the great work!
 
No worries.
Proc Import does check; however, it only checks the first few rows. Also, you need to tell it whether the file has a header row. It might be that your data file has a header and SAS is reading that in as a data line, and is therefore interpreting the columns as character fields.
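
If you do stay with Proc Import, a sketch of spelling out the header handling explicitly might look like the following. The path and output name are placeholders; GETNAMES= and DATAROW= are the Proc Import statements for delimited files, and Proc Contents is used afterwards only to confirm what types and lengths were assigned.

```sas
/* Sketch only: file path and output dataset name are placeholders */
proc import datafile="c:\data.csv"
            out=work.indat
            dbms=csv
            replace;
  getnames=yes;   /* first row holds the variable names */
  datarow=2;      /* data values start on the second row */
run;

/* Check which types and lengths were actually assigned */
proc contents data=work.indat;
run;
```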

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 