Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to Create New observation in SAS

Status
Not open for further replies.

smalek

Programmer
Jan 15, 2009
28
CA
Hi
I have the following table in SAS:

Var1 Var2 Var3
TJones Chem 01042008
0356 spec
0444 test
SJohn Hem 02042008
0112 test
0411 spec

Desired Output:
Var1 Var2 Var3 TestCode Testname
TJones Chem 01042008 0356 spec
TJones Chem 01042008 0444 test
SJohn Hem 02042008 0112 test
SJohn Hem 02042008 0411 spec
I need to create one observation per person with unique test codes.

Thanks
 

I'm not sure if I have understood your question, but
perhaps yo can use 'retain' to repeat var1, var2, var3.

and with code like

if first.var1 then do;
or
if last.var2 then do;

control the output of the rest of variables.

Hope it helps...
 
Hi
Actually what I currently have is a table with the following format:
Name Type Chart No.
TJones Chem 01042008
0356 spec
0444 test

I would like to create a new data set having one observation per person with a unique test code according to the following output:

Name Type Chart No. TestCode Testname
TJones Chem 01042008 0356 spec
TJones Chem 01042008 0444 test

Is this possible?
Thanks
 
Hi, this is my first post in answering a question. My solution works but it is not every efficient. I tried to find materials on retain but I didn't find anything good on it. I think juanvg is right, retain should work better. There should be a more efficient way of making this work. If you got a better answer please let me know.

Here is what I did. It is really inefficient but works. Basically, you have to read in the data twice and then use the set statement to combine the 2 data sets.

data one;
input #1 name $ type $ chart $ #2 testcode $ testname $;
datalines;
TJones Chem 01042008
0356 spec
0444 test
run;

data two;
input #1 name $ type $ chart $ #3 testcode $ testname $;
datalines;
TJones Chem 01042008
0356 spec
0444 test
run;

data three;
set one two;
run;
 
Thanks cosmid. My data table does contain over 100,000 entries. I think I will give the retain suggestion by juanvg a try.

Hope it works
 
Smalek is there any kind of identifier that Shows that these three lines belong together?

Name Type Chart No.
TJones Chem 01042008
0356 spec
0444 test


As in how do I know that 0356 should be associated with Tjones and not with Sjohn?
 
Can you please let me know the solution using a retain statement or a more efficient way of doing this? Because the way I did it only works but in a very inefficient way. If you need the result, the code I put above, you only need 9 lines of it. If you have the raw data available, all you need to do is copy and paste them twice under each data step and then use the set statement. Or read in the data using infile, works the same. But I really want to know the right way of getting the result. Please post your answer when you have it. Thanks in advance!
 
Hi Cosmid,

Congratulations on your first solution, but I must admit, you picked a difficult problem to start with.

Smalek,

The following code will work, but it is not very flexible (or scalable for many variables as you would have to do a lot of renaming), it should however, be pretty efficient.

The actual format of your dataset should probably be changed as this will only present problems for you in future analysis. You should try limiting the number of blank fields as much as possible as eliminating repeats is something proc report does pretty well. I would also suggest name gets its own column and the ids a separate one.

HTH
Code:
data want(drop=_:);
   if 0 then set have;
   do _n_ =1 to 3;
      set have(rename=(name=testcode type=testname chartno=_x));
      if _n_ = 1 then do;
         name=testcode;
         type=testname;
         chartno=_x;
         end;
      else output;
      end;
   run;
 
Hi All
First off I would like to apologize for the delay in posting the solution. I would also0 like to thank you all for your invaluable contribution. I ended up using juanvg1972 suggestion. Below you will find the code's framework:
data cl.Lab2;
set cl.Lab1;
format var1 $10.
var2 $10.
var3 $10.
var4 comma6.0
var5 $9.

if var5 ne '' then do;
retain var1 var2 var3 var4
run;

Cheers
 
hm...I can hardly understand your code kdt82. I need to read more SAS books. I wish I can be 1/2 as good as you guys in SAS.

Thanks for the solution smalek. Now I got a better idea in using retain statement.
 
That's a pretty poor data structure. Usually data like that has some sort of record type identifier at the start of each record so you know what to do with it.

Smalek - Are you sure that your solution is working correctly? I was under the impression that the retain statement is a non-executable statement so it gets run straightaway.

I would probably have coded it similarly, but slightly different.
Code:
data dset;
  infile blah ....;

  retain name ''
        type '' 
        chartno '';

  input var1
        var2
        var3;
  if var3 ne '' then
  do;
    name = var1;
    type = var2;
    chartno = var3;
  end;
  else 
  do;
    testcode = var1;
    testname = var2;
    output;
  end;
run;
You'd need to add a length statement for the different variables as well, but this way should be pretty robust I think.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Just re-reading my post and wanted to clarify that I'm not criticising Smaleks code or data, just pointing out that he's been dealt some badly formatted data to work with.

My version of the solution I think gives an added advantage that you can micro-manage the data in a more obvious way, but it is more long winded and possibly not as clever.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Thanks ChrisW75. I will give your version of the code a try within the next couple of days and keep you updated.
Cheers
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top