Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Date command to fill the date

Status
Not open for further replies.

wundergal2000

Technical User
Jun 14, 2005
10
US
Hi,

Is there any auto fill in date command in SAS?I have a file like this:

Name Date
B -2
B -1
B 20030901
B 1
B 2

Now the middle row has the date, and I need for SAS to give me the date for the previous dayS where there is -1 and -2 and for the next days where there is 1 and 2. Is there any auto date fill in command because iam working with about 40000 observations whose dates need to be filled in like this.

Thank you.
 
Also I need to execute this in a loop. For example the file would be:

Name Date
B -2
B -1
B 20030901
B 1
B 2
C -2
C -1
C 20040709
C 1
C 2

Thanks.
 
If this doesn’t work, I can match this with another file:

File A

Name Date
B -2
B -1
B 20030901
B 1
B 2
C -2
C -1
C 20040709
C 1
C 2

FILE B:

Date Value
20030102 798392.892
20030103 79429.8020
20030104 74992183.79329

Now is there a way to merge file A and B such that, whenever the event date of File A corresponds to the same event date of File B, then Take 2 days before the event date from file B and match it with the 2 days before in File A and similarly 2 days after too? That way basically what I want to do is, merge the two files by the common event date, and from file B take the days before and after dates so that file A uses the same before and after dates instead of the –2,-1,1,2. The problem is, File A has event days in specific areas (and the 2 days before and after are just integers instead of dates) and the before and after days are just days and not specific dates. I need the –2, -1, 1, 2 to be converted to specific dates from File B. File B is a master file with all the dates and corresponds to the days of file A. So SAS should basically be able to match them and plus this needs to be done in a loop for each of the names. I have several names such as B, C, D, E, F such that each one has a date and 2 days before and after (in the format –2,-1,1,2) in file A.

I would be very very grateful for help on this matching and merging. Thank you so much.
 
wundergal2000,
I would break you problem into two parts.
1) Normalize your file A data (so that all values the date column are just that, Dates.)
2) Then focus on the merge with file B.

Try this to accomplish task 1.
Code:
** I JUST ASSUMED THIS IS A TEXT FILE IT CAN BE ANY FORMAT ;
data test;
infile "c:\data.txt" lrecl=2000 ;

input  Name $ date $ ;
** IF THE CONVERTED STRING VALUE IS GREATER THAN 100 (MEANING ITS A DATE) CONVERT THE STRING AS DATE VALUE;
if (input(date,8.)) gt 100 then
  newdate = input(date, yymmdd10.);
else
  newdate = input(date,8.);
** READ FROM LINE 2 (NO HEADER) ;
  if _n_ gt 1 then
   output;
run;
** SORT THE DATA ;
proc sort
  data = test
  out  = test2;
  by name decending newdate;
run;
** RETAIN THE FIRST DATE VALUE NOW AT THE TOP;
data test3;
  set test2;
  by name ;
 retain  date1;
  ** AT THE FIRST TIME THE NAME CHANGES SET THE DATE FIELD;
  if first.name then 
    date1 = newdate;
** IF THE FIELD IS NOT A DATE CHANGE TO PROPER DATE BY SUMMING THE VALUE AGAINST THE MASTER DATE;
  if newdate lt 100 then
    date2 = sum(date1,newdate);
  else
    date2 = newdate;
    format date2 yymmdd10.;
run;
Sort file A (now test3) by date2.
This should make it real easy to merge against file B once that file is in a SAS ds and sorted by date.
I hope that this has helped you.
Klaz
 
Hi klaz,

Thank you very very much for your help. However, there is one hitch, I cannot just sum the dates because there are days in between like weekends and holidays that this file does not count. And since it is a large data set for a large project, it would be difficult to get a lookup file with the weekend-dates and special holidays for this particular task. However, file B as I mentioned has only the dates that are required (does not count weekends and holidays etc). So is there some way for me to match the dates and then be able to pull the dates so that when the merge is done, the correct absolute date corresponds to the values in file A and file B?
Thanks again!
 
Are you saying that you can't just take the -1 off of date as it might take you to a Sunday when you would actually want to go back to the Friday in that situation? So what you actually want is to go back one record in B instead?

I can't see any easy way of doing that.

 
Hi what I am trying to accomplish is this:
File A

ID Number Date Value
1 -2 .4792
1 -1 0.3791
1 20030906 0.31791
1 1 0.3191
1 2 0.3810
2 -2 0.3819
2 -1 0.31810
2 20020503 0.380937
2 1 0.283617
2 2 0.32819

File B
Date Value
20010330 0.012110
20010402 -0.015547
20010403 -0.035918
20010404 -0.003298
20010405 0.045934
20010406 -0.020144
20010409 0.009219
20010410 0.028975
20010411 -0.000798
20010412 0.015925

The task is to take each absolute date in the Day column of File A and to go back two days (as indicated by -1 and -2) and compute the average of the values column. Similarly I should be able to go down two days as indicated by (1, 2) and compute the average of the values for each of the ID numbers. ID 1 is one ID and so on. This I am hoping to do with a BY VAR (or should I try to loop through by ID number?). However, the problem comes with file B.

For each corresponding absolute date value found in the Day column of file A, I should be able to identify the corresponding absolute date value in File B and similarly, in File B, go back two days from that absolute date and compute the average of the values and go down two days and compute the average of the values from that absolute date. File B is all in absolute dates. File A has absolute dates as the 3rd record (sort of like day zero and before day zero there are 2 days and after day zero there are 2 days) when sorted by each ID number. The problem I face is trying to match the absolute date from File A in File B and computing the average values(before and after) in file B.

Again, thank you very very much for your help!




 
?
Sorry, I'm even more confused than I was before. It might help if you showed a better sample, ie, the records in file B that would match to File A.

It sounds like what you want to achieve is to join only the listed dates in File A to the dates in File B to get that records Value, then calculate the Value for the +/-1 and 2, or is value joined on from File B only. You started talkign about averages as well which is confusing the issue.

I have a possible way forwards to what I think you are doing. Assuming that what you are wanting is the absolute dates matched exactly with File B, and then the relative dates matched to the relative record on File B (ie, the -1 record would match to the record before the absolute date).

I think that if you put a sequential number onto FILEB then you can do this.

Code:
proc sort data=fileb;  by date;  run;
data fileb2;
  set fileb;

  seq_nb_b = _n_;
run;

* put sequence on file A so we can return it to it's original order afterwards *;
data filea2;
  set filea;

  seq_nb_a = _n_; 
run;

proc sort data=filea2;  by date; run;

* Join on sequence from file B *;
data merge1;
  merge filea2(in=in1)
        fileb2(in=in2 keep =(date seq_nb_b));
  by date;
  if in1;
run;

* order file so correctly dated record comes first for each group *;
proc sort data=merge1;
  by name date;
run;

data filea3;
  set merg1;
  by name date;

  retain seq;

  * get sequence number from file B for the dated record *;
  if first.name then seq = seq_nb_b;
  * calculate sequence number for relevant fileB record for the +/- 1/2 records *;
  else seq_nb_b = seq + date;
run;

proc sort data=filea3;   by seq_nb_b;  run;
proc sort data=fileb2;   by seq_nb_b;  run;

data all;
  merge filea3(in=in1)
        fileb2(in=in2 keep=seq_nb_b value);
  by seq_nb_b;
  if in1;
run;

proc sort data=all;
  by seq_nb_a;
run;

That should give you all the correct values. Any averages you need to calculate should be done next using proc summary/proc means.

Hope this gives you the answer you're looking for.


 
Hi Chris,

I tried the above code and unfortunately it didn't work. To give you a better sample:
File A

ID NUMBER DAY COUNT VALUESFORA
1 -2 10
1 -1 20
1 20020908 5
1 1 4
1 2 3
2 -1 100
2 -2 8
2 20030102 19
2 1 19
2 2 18

File B

DATE VALUESFORB
20020905 40
20020906 100
20020907 38
20020908 82
20020909 373
20020910 37
20010911 37

Ok take the example of ID 1 in file A. The absolute date there is 20020908. Now I find the average values of the two days (-1 and -2) before which is 10+20/2=15. Similarly i find the average for day 1 and day 2 after the absolute date of 20020908 which is 4+3/2=3.5. Now in File B, you have a corresponding absolute date namely 20020908. Now in file B, I should be able to find the average values for 2 days before 20020908 namely 100+38/2=69. Again I should be able to find the average of two days after 20020908 which is 373+37/2. My problem comes in trying to correspond the two before and after days from file A to the two before and after days from file B so that I can compute the average of the values in file A and in file B. Thus going by ID number, I should be able to identify the absolute date in File A, compute the average of the values 2 days before and after the absolute date in File A. Next come to file B, match the absolute date in file B with that of File A, and compute the average of the values in file B 2 days before and after the absolute date. The values in File A and B are different.
Thank you so much Chris for your help. You guys in this forum are awesome! I hope this gives a clearer picture of the problem I have.
 
wondergal2000,

While you prob can do everything you just described using PROC SQL, you can do this using the methods I mapped out above. First you need to realize that you have two tasks. one, merge the data by (on) the correct date value. Two, you need to compute your stats - analysis on your data.

So why not merge the data the way I showed above then compute your data. The weekend can be handled after you get the data together. You could modify the analysis algorithm to check if that date is a weekend day. You can use the WEEKDAY(date) function to check if the day is a Sun or Sat (1 or 7).
You prob need the RETAIN statement as well.

Let me know if this helped you.
 
Hi Klaz,
Thank you very much for your help.
Unfortunately it is not only a weekend problem. This is all data on stock trading and there are several days other than weekends when stocks don't trade (such as special holidays or after Sept 11 when trading was halted). I tried getting a list of holidays to create a look-up file but that seems very very hard to get(ive tried getting this at every avenue including contacting NYSE!).And this data goes back by 50 years or so. So this is the problem I face because I just cannot look to see if it is a weekend..
Many many thanks again....I can merge the data using Proc SQL but I am lost about how to calculate the values as i mentioned above after the merge..
 
Hiya. As Klaz says, you need to break this down and not try to do it all in one go. That can be one of the biggest problems in trying to do something complex. Work out the steps, then go through and do each step in turn, testing at each stage to make sure it's working.
I think that my code (which I didn't test) should get you most of the way there. It should get you the correct records on file A joined to the correct records on File B. You may need to tweak it a little though. I unfortunately don't have much time to examine this as I'm snowed under at work myself. Once you get the records from B attached to the records from A, you should be able to summarise to get the averages. Append flags to show whether the DATE field is a - a plus or an absolute date, then summarise by ID and this flag.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top