Data are by subject over time, across a row. See the following example:
columns _1 _2 _3 _4 _5 _8
subject(001) data -0.069965723 0.492749371 0.955245597 1.346963522 1.701118239 2.523141195
I have thousands of such rows where the slope is needed.
All I really need is the slope for each series of numbers in the row.
A new additional column following each series should contain the slope value for that particular series.
I've asked on a SAS forum if SAS can handle this. Wondering if JMP can?? Any help would be greatly appreciated.
Nicholas
hi nkormanik,
the answer is yes.
in general you would need to transpose the data and do the calculations. assuming the data table is too big for that i think it is easier to transpose a matrix. therefore, the process in general involves extracting the data into a matrix, transposing it and calculating the slopes. have a look at this. it does all the dfferent steps for illustration purposes. it is not very robust (if you have missing values) or anything funny.
best,
ron
// create a data table
dt = new table ("data in rows",
add rows (100),
new column ("Subject", formula (row())),
new column ("slope"),
new column ("R2"),
add multiple columns ("column", 20, numeric)
);
// set values for the data columns
for (i=4, i<=23, i++,
:column (i) << formula ((i/2) + Random Normal()) << eval formula << delete formula ;
);
// start working
// get the table as a matrix
datamatrix = dt<<GetAllColumnsAsMatrix;
// remove the columns not used for regression calculations (columns 1-4)
datamatrix [ 0 , 1::3] = [];
// transpose the matrix
data = datamatrix`;
// make an x variable for time.
X = (1 :: ncol(datamatrix))`;
// add a column of 1's for the intercept term
X = J(Nrow(X),1) || X;
//now we have all the data we need for the calculations in matrix form
for (irow=1, irow<=nrows(dt), irow++, // for each row
// extract individual data one by one
Y = data [0, irow ];
// compute least squares estimates
β = Inv(X`*X)*X`*Y;
dt:slope[irow] = β[2];
// compute R2
N = NRows(Y);
Ybar = Mean(Y);
R2 = (β`*X`*Y - N*Ybar^2)/(Y`*Y - N*Ybar^2);
dt:R2[irow] = R2;
);
Here is an example of a column formula to calculate the slope across columns. It is not clear what the X values are, but I assume they are defined by the column names as in your example. As you can see the the X matrix is now hard coded. The formula could be made more general if the X values were held in e.g. the first row (or in a table variable).
// Example table
dt = New Table("Slope_Across_Columns.jmp",
Add Rows(2),
New Column("Subject", Character, Values({"001", "002"})),
New Column("_1", Values([-0.069965723, 2])),
New Column("_2", Values([0.492749371, 4])),
New Column("_3", Values([0.955245597, 6])),
New Column("_4", Values([1.346963522, 8])),
New Column("_5", Values([1.701118239, 10])),
New Column("_8", Values([2.523141195, 16]))
);
//Formula expression
F = Eval Expr(
Local({i = 1, n = Expr(N Col(Current Data Table()) - 1), X = J(n, 1, 1) || [1, 2, 3, 4, 5, 8], Y},
Y = Matrix(Repeat({i++ ; Column(i)[]}, n));
(Inverse(X` * X) * X` * Y)[2];
)
);
//Add formula column
dt << New Column("Slope", Numeric, Formula(Name Expr(F)));
Here is another method. It has only one advantage, in that the script just duplicates what you could do running JMP interactively. I have attached a sample data table that I developed the script around.
// Set a pointer to the sample data
names default to here(1);
dt = current data table();
// Stack the 6 data columns from each row
stacked = dt << Stack(
columns( :_3, :_4, :_5, :_6, :_7, :_8 ),
Source Label Column( "Label" ),
Stacked Data Column( "Data" )
);
// Create a new column that represents the time factor
stacked << new column("time",formula(Num(Substr(:Label, 2, 1)) - 2));
// Run the analysis on the data, by each row(student)
biv=stacked<<Bivariate(invisible,
Y( :Data ),
X( :time ),
Fit Line( {Line Color( {213, 72, 87} )} ),
by(_1)
);
// Create a data table from the estimates output
dtslope=report(biv[1])["Parameter Estimates"][1]<<make combined data table;
// Get rid of the intercept data
dtslope<<select where(:term == "Intercept");
dtslope<<delete rows;
// Select the columns that are not wanted and delete them
dtslope:x<<set selected(1);
dtslope:y<<set selected(1);
dtslope:term<<set selected(1);
dtslope:Name("~bias")<<set selected(1);
dtslope:std error<<set selected(1);
dtslope:t ratio<<set selected(1);
dtslope:Name("prob>|t|")<<set selected(1);
dtslope<<delete columns;
// Rename the Estimate column to Slope
dtslope:Estimate<<set name("Slope");
// Add the slope to the original data table
dt<< Update(
With( dtslope ),
Match Columns( :_1 = :_1 )
);