Subscribe Bookmark RSS Feed

New column

jstewart

Community Trekker

Joined:

Jan 10, 2012

I have a large databse of patients hospitalized for a certain condition.  I have been trying to make a new column to indicate the "years prior to death".  See below:

This is what my database looks like:

YEARpatientMortality_in365days
1990221
1991221
1991221
1993221
1995221*
2000789
2001789
2002789*
2001965
2005965*

This is what I am trying to create:

YEARpatientMortality_in365daysDeath yearYears before death
1990221
19955
1991221
19954
1991221
19954
1993221
19952
1995221*19950
2000789
20022
2001789
20021
2002789*20020
2001965
20054
2005965*20050

A star in mortality_in365days column means they died within the last 365 days.  I know the years before death will not be an exact calculation since depending on how long ago they died (30 days versus 365 days) it may mean they died the year prior.  I had set up JMP to make the death year column, and then make a simple subtraction formula to calculate years before death, but then I run into problems.  I had done it manually using the update feature since I am not very good at scripting yet.  If you can help me either with manual instructions or scripting please let me know. 

Thanks.

2 REPLIES
hai_kuo

Community Trekker

Joined:

Jan 25, 2012

Know little about JMP, however, it seems to me a 2X DUL will do the job as well:

data have;

infile cards missover;

input YEAR     patient $     Mortality_in365days $;

cards;

1990     221

1991     221

1991     221

1993     221

1995     221     *

2000     789

2001     789

2002     789     *

2001     965

2005     965     *

;

data want;

do until (last.patient);

set have;

by patient year;

Death_year=ifn(last.patient,year,death_year);

retain death_year;

end;

do until (last.patient);

set have;

by patient year;

Years_Before_death=death_year-year;

output;

end;

run;

proc print;run;

Regards,

Haikuo

ms

Super User

Joined:

Jun 23, 2011

The maxiumum year for each patient obviously correspond to year of death. Manually you can then do a Summary with max year grouped by Patient. Then Update the original table with the summary using Patient as matching columns in both tables. Finally a simple subtraction (Death year - Year) would give the the last column.

Here is one slightly different way way to do it by script

dt = Data Table( "database" );

Summarize( p = by( :Patient ), maxyear = Max( :Year ) );

aa = Associative Array( Eval( p ), As List( maxyear ) );

col1 = dt << New Column( "Death year", numeric );

col2 = dt << New Column( "Years before death", numeric );

For Each Row(

          col1[] = aa[Char( :Patient[] )];

          col2[] = col1[] - :Year[];

);