Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Input Statement with multiple observations per record 1

Status
Not open for further replies.

vassilisef

Technical User
Dec 14, 2008
14
0
0
GR
Hey ,

Just to let you know that I am not an experienced user.

My input data - in a comma delimited form - is the following :

Date,Price,Volume,Price,Volume
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500

My desired input format would be :

Date Price Volume
25/03/2005 15.78 1000
25/03/2005 1.88 1500
05/09/2006 10.88 2000
05/09/2006 0.77 2500

My best attempt so far, would be:

data work.prices;
infile datalines dlm=',';
input date ddmmyy10.@;
input price volume @@;
format date ddmmyy10.;
datalines;
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500
;
run;

However, this is wrong, because, as expected, when the Data step starts to iterate for the 2nd time, it attempts to read a date again - which does not exist.

I have a feeling that a retain statement for the date could be useful, but i cannot see where exactly it comes into play.

Thanks in advance,
Vassilis



 
I'm no wiz with compound input statements, but the following will work:

data prices(drop=price2 volume2);
set prices(keep=date price volume)
prices(in=b keep=date price2 volume2);
if b then do;
price = price2;
volume = volume2;
end;
run;
 
sorry...here's the full code:


data prices;
input date mmddyy10. price: 8.2 volume: 8. price2: 8.2 volume2: 8.;
cards;
03/25/2005 15.78 1000 1.88 1500
09/05/2006 10.88 2000 0.77 2500
;
run;

data prices(drop=price2 volume2);
set prices(keep=date price volume)
prices(in=b keep=date price2 volume2);
if b then do;
price = price2;
volume = volume2;
end;
run;

proc print;
run;
 
Thank you very much for your response.

If i had just 2 pairs of Price/Volume per record, then your written code would do the job.

However, in my data set, each line (record) holds HUNDREDS of observations (pairs of Price/Volume):

25/03/2005,15.78,1000,1.88,1500........,33.52,2000(200th Obs)
05/09/2006,10.88,2000,0.77,2500........,34.66,2500(200th Obs)


That is why, in the indication-code that i have written above, i am using the Double Trailing sign (@@). This makes it more complicated than it looks in the first place.

I am sorry for not being clear enough from the start.
 
Hi Vassilisef,

Your original code was a good effort. The problem is whether you know exactly how many repitions there are per line. You can then code this number into a loop to bring in the records as needed. If there are a variable number of repitions, then it will be a bit trickier.

Code:
data work.prices;
   infile datalines dlm=',';
   input date ddmmyy10.@;
   do _n_=1 to 2 ;
      input price volume @;
      output ;
      end ;
   format date ddmmyy10.;
   datalines;
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500
;
run;

 
Hey kdt,

your code does resolve my case, since i know exactly the number of observation (repetitions) per line.

In fact, there is a second level/problem in my query but I will give it a try first , based upon your guidance above.

In the case that I cannot find it on my own, i will let you know.

Really grateful for the time that you expended for my shake,
Vassilis
 
Hey again,

after quite of effort.. i managed to solve my 2nd problem on my own (Beginner's luck..).
I am providing all of the details for your reference:


Available input data in a csv file:

DATE Price,Volume,Price,Volume

ABK,EFG,ABK,EFG
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500
11/06/2007,11.42,3000,0.89,3500


Namely, i have trading Price & Volume data for 3 dates for 2 companies.
In reality, i have hundreds of companies and dates but for illustration purposes i have restricted the amount of date for these few.

Desired Output

date name price volume
25/03/2005 ABK 15.78 1000
25/03/2005 EFG 1.88 1500
05/09/2006 ABK 10.88 2000
05/09/2006 EFG 0.77 2500
11/06/2007 ABK 11.42 3000
11/06/2007 EFG 0.89 3500


Based upon the suggestion by "kdt82", the Code that i have used is:

Code:
data work.inputwithloops;
infile datalines dlm=',';

input date :ddmmyy10.@;
length name $3;
	Do name='ABK','EFG';
			Do;
			input price volume @;
			output;
			end;
	end;
format date ddmmyy10.;
datalines;
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500
11/06/2007,11.42,3000,0.89,3500
;
run;


Thank you all for your help,
Vassilis
 
Very nice Vassilis, it's always good to see people trying for themselves and this is a tough problem for a beginner :)

The most minor of comments: You don't need the inner do.
Code:
data work.inputwithloops;
infile datalines dlm=',';

input date :ddmmyy10.@;
length name $3;
    Do name='ABK','EFG';
            input price volume @;
            output;
    end;
format date ddmmyy10.;
datalines;
25/03/2005,15.78,1000,1.88,1500
05/09/2006,10.88,2000,0.77,2500
11/06/2007,11.42,3000,0.89,3500
;
run;

 
Hey kdt82,

thanks for the tek-tip!

I have started using SAS as a by-product of my research work, with no expertise help whatsoever.

Initially, i was deterred by the fact that my only support/help would be the Manuals of SAS Institute which lack coverage/variety in their examples given.

When i found this Forum, i hoped that it would help me overcome frustrating... moments while experimenting with SAS coding.

Now, with your responses, i am convinced that i am not alone.. in my "heavy" task of learning how to program!

Thank you very much,
Vassilis
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top