Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Counting variables in SAS 1

Status
Not open for further replies.

DOYLE1

Technical User
Jan 11, 2011
3
IE
Hi Everyone,

I really need some help with this. I am doing research on twinning and I want to identify mothers who had more than one set of twins and calculate how many sets of twins they had. My data looks similair to this:

Mother Father d.o.b. idtwin single/twin
010 210 15.3.99 01021015399 1
034 165 16.3.87 03416516387 1
025 379 10.3.87 02537910387 2
025 379 10.3.87 02537910387 2
056 852 01.2.95 05685201295 1
025 379 10.4.89 02537910489 2
025 379 10.4.89 02537910489 2
056 245 18.2.94 05624518294 1

where I have a mother code, a father code and a date of birth which I compress to make teh column 'idtwin' and use to find the number of twins as twins will have the same idtwin. My question is how to to find the mothers who have multiple sets of twins over a number of years?

Any help would be greatly appriciated as I'm really stuck.
 
Hi, I hope I understood the request correctly - the code does the following after reading in an advanced set of test data with single kids, twins and triplet: using proc sort create a table containing a unique list of mothers and their non-single kids (WORK.twinMom). From that table get the mothers and twin IDs where a mother has more than one.

No SQL, I love the flexibility of first/last control to play with any data group combination ;-)

[TT]data work.birthRec;
length Mother 8
Father 8
dt $8
idtwin $11
cnt 8;
infile cards DLM=' ' dsd missover;
input Mother Father dt idTwin cnt;
cards;
010 210 15.3.99 01021015399 1
034 165 16.3.87 03416516387 1
025 379 10.3.87 02537910387 2
025 379 10.3.87 02537910387 2
056 852 01.2.95 05685201295 1
025 379 10.4.89 02537910489 2
025 379 10.4.89 02537910489 2
056 245 18.2.94 05624518294 1
066 555 11.1.11 11111111111 3
066 555 11.1.11 11111111111 3
066 555 11.1.11 11111111111 3
055 777 12.2.12 22222222222 1
055 777 12.3.13 33333333333 2
055 777 12.3.13 33333333333 2
077 666 11.1.21 11111111111 3
077 666 11.1.21 11111111111 3
077 666 11.1.21 11111111111 3
077 666 12.2.22 22222222222 1
077 666 12.3.23 33333333333 2
077 666 12.3.23 33333333333 2
;
run;

proc sort
data=work.birthRec (keep=Mother cnt idTwin
where=(cnt gt 1))
out=work.twinMom (drop=cnt)
nodup;
by Mother;
run;

data work.multiTwinMom;
set work.twinMom;
by Mother;
if not first.Mother
or not last.Mother
then
output;
run;[/TT]

Cheers,
Matthias
 
Hi,

Thanks very much for that, it was really helpful and I'm close to having it figured out. I am only dealing with twins births so triples etc have already been deleted from the data set. Because each twin has a '2' in the single/twin coulmn (count) when I use the code you gave me I am still getting mother whom only had one set of twins. Any idea how I could get only the mulitiple twin birth mothers in a data set. Thanks again for your help.
 
This works for any non-single due to only keeping "cnt gt 1" in the PROC SORT. In your case this will be all twins as the count is 2. The double entries will be single entries in the result of PROC SORT because option "nodup" is specified. So output has moms and their twins.

The following datastep sorts out the moms that only had a single twin birth. if a mom ID and a single twin ID exists, the first.mother is true and last.mother as well, so the "IF" statement is false -> no output. If a mom ID had multiple twin births, the mother ID cannot be first and last at the same time, so all mother/twinID pairs will be written to the result table. That should be it already in the code above.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top