Extract factor range from bivariate graph

pauldeen · Oct 5, 2021 11:09 AM

How do I efficiently extract the crossing points of this curve and its 95% CI with an arbitrary Y value.

dt = Open( "$sample_data/Growth.jmp" );
biv = Bivariate(
	Y( :age ),
	X( :ratio ),
	Fit Line( {Line Color( {204, 121, 41} )}, save studentized residuals ),
	SendToReport(
		Dispatch(
			{},
			"Bivar Plot",
			FrameBox,
			{Grid Line Order( 3 ), Reference Line Order( 4 )}
		)
	)
);
biv << (Curve[1] << Save Studentized Residuals);

biv2 = Bivariate(
	Y( :Studentized Residuals age ),
	X( :ratio ),
	Fit Polynomial( 2, {Confid Shaded Fit( 1 ), Line Color( {212, 73, 88} )} ),
	SendToReport(
		Dispatch(
			{},
			"ratio",
			ScaleBox,
			{Min( 0.2796875 ), Max( 1.2666259765625 ), Inc( 0.2 ), Minor Ticks( 1 )}
		),
		Dispatch(
			{},
			"Studentized Residuals age",
			ScaleBox,
			{Add Ref Line( 1.96, "Solid", "Black", "", 1 ),
			Add Ref Line( -1.96, "Solid", "Black", "", 1 )}
		)
	)
);

So I would like to get the Ratio values where the curve and its 95%CI crosses +/- 1.96 (so 6 values in total). Are there any better solutions than the brute force I can think of: dumping the predicteds and the mean confidence limit formula into a table and adding rows with values to the table to see when it crosses the limits?

Craige_Hales · Oct 8, 2021 7:28 AM

minor variation

offset = 1.96;

// (I do not know how to pick these without a visual examination of the result.)
startingPoint = 1.1; // or 0.5 (trial and error. pick something close and check.)

// example data set
dt = Open( "$sample_data/Growth.jmp" );

// fit first-order polynomial model to response
biv = Bivariate( Y( :age ), X( :ratio ), Fit Line );

// save residuals for next step
biv << (Curve[1] << Save Studentized Residuals);
biv << Close Window();

// fit second-order polynomial model to residuals
biv = Bivariate( Y( :Studentized Residuals age ), X( :ratio ), Fit Polynomial( 2, {Confid Shaded Fit( 1 ), Line Color( {212, 73, 88} )} ) );

// save formulas for lower / upper 95% confidence bound
biv << (Curve[1] << save predicteds);
biv << (Curve[1] << Mean Confidence Limit Formula);
// get formulas, using order columns were saved
fupper = Column( dt, N Cols( dt ) ) << Get Formula;
flower = Column( dt, N Cols( dt ) - 1 ) << Get Formula;
fpredicted = Column( dt, N Cols( dt ) - 2 ) << Get Formula;

// replace predictor name with x
fsupper = Substitute( Name Expr( fupper ), Expr( :ratio ), Expr( x ) );
fslower = Substitute( Name Expr( flower ), Expr( :ratio ), Expr( x ) );
fspredicted = Substitute( Name Expr( fpredicted ), Expr( :ratio ), Expr( x ) );

fsupper2 = Substitute( Expr( a - offset ), Expr( a ), Name Expr( fsupper ) ); // find zero at offset 
fslower2 = Substitute( Expr( a - offset ), Expr( a ), Name Expr( fslower ) ); // find zero at offset 
fspredicted2 = Substitute( Expr( a - offset ), Expr( a ), Name Expr( fspredicted ) ); // find zero at offset 

// initialize x, could get x that predicts mean residual = 0
x = startingPoint;
// minimize expression that is Abs( residual ), or mean = 0
faupper = Insert( Expr( Abs() ), Name Expr( fsupper2 ) );
Minimize( faupper, {x} );
xupper = x;
Show( xupper, Eval( faupper ) );

// initialize x, could get x that predicts mean residual = 0
x = startingPoint;
// minimize expression that is Abs( residual ), or mean = 0
falower = Insert( Expr( Abs() ), Name Expr( fslower2 ) );
Minimize( falower, {x} );
xlower = x;
Show( xlower, Eval( falower ) );


// initialize x, could get x that predicts mean residual = 0
x = startingPoint;
// minimize expression that is Abs( residual ), or mean = 0
fapredicted = Insert( Expr( Abs() ), Name Expr( fspredicted2 ) );
Minimize( fapredicted, {x} );
xpredicted = x;
Show( xpredicted, Eval( fapredicted ) );

// add lines on plot
Report( biv )[axisbox( 2 )] << {Add Ref Line( xpredicted, "Solid", "Black", "P", 1 ), Add Ref Line( xupper, "Solid", "Black", "U", 1 ),
Add Ref Line( xlower, "Solid", "Black", "L", 1 )};
Report( biv )[axisbox( 1 )] << {Add Ref Line( offset, "Solid", "Black", "", 1 )};
Report( biv )[framebox( 1 )] << Frame Size( 752, 513 );

Low sideHigh side

Craige

ih · Nov 9, 2021 4:40 AM

@Craige_Hales, to solve the initial value problem, if you minimize the whole function first you could use that ratio as a limit for the minimize search and be guaranteed to find the values on each side. For this problem that could be 0.8:

Minimize( fa, { x([very small number],[min ratio - found from minimizing the function]) } );

//For this example -10 is arbitrary (could use -1E6 or something), and 0.8 is
//roughly the overall function minimum
Minimize( fa, { x(-10,0.8) } );

pauldeen · Oct 12, 2021 6:32 AM

I tried desperatly to come up with a algorithm for the starting point, for example I used the quadratic formula to calculate the predicted curve crossing with 1.96 and even if you plug that number in for the starting condition you get fairly unpredictable behavior:

So thanks for showing how it can be used and I did learn a lot from this example, but it is not robust enough yet.

Craige_Hales · Oct 12, 2021 7:12 AM

I think the problem is with the absolute value function making a curve that doesn't have a nice derivative, making minimize likely to fail. I was hoping @ih idea would force minimize to stay in the right neighborhood, but maybe not.

I don't have time to puzzle through it right now; it seems likely there is a standard alternative to abs() for this problem...yielding a smoother function. (maybe squaring it? something like that.)

Edit: and your result agrees with mine, which is why I had the comment about visual inspection. The abs function makes a corner in the function to minimize, and the slope near that corner is used by the minimize function, and it expects a continuous (2nd?) derivative on each side of the answer. Since the slope instantly flips on each side of the true answer, it is amazing it gets close answers sometimes. I think the message in the log may be because of this.

Craige

ih · Oct 20, 2021 5:11 AM

Expanding on @Craige_Hales and @Mark_Bailey 's code, you could do something like what is below to find the overall minimum and then use that as a bound for subsequent searches on each side, that way you are not as reliant on the initial value. Also, if you use a square instead of an absolute value gets rid of the corner. In practice though I've had good luck using the absolute value and other functions with step changes in the profiler's optimizer.

Names Default To Here( 1 );

// example data set
dt = Open( "$sample_data/Growth.jmp" );

// fit first-order polynomial model to response
biv = Bivariate(
	Y( :age ),
	X( :ratio ),
	Fit Line
);

// save residuals for next step
biv << (Curve[1] << Save Studentized Residuals);
biv << Close Window();

// fit second-order polynomial model to residuals
biv = Bivariate(
	Y( :Studentized Residuals age ),
	X( :ratio ),
	Fit Polynomial( 2, {Confid Shaded Fit( 1 ), Line Color( {212, 73, 88} )} )
);

// save formulas for lower / upper 95% confidence bound
biv << (Curve[1] << Mean Confidence Limit Formula);

// get formulas
fUpper = Column( dt, N Cols( dt ) ) << Get Formula;
fLower = Column( dt, N Cols( dt ) - 1 ) << Get Formula;

// replace predictor name with x
fsUpper = Substitute( Name Expr( fUpper ), Expr( :ratio ), Expr( x ) );
fsLower = Substitute( Name Expr( fLower ), Expr( :ratio ), Expr( x ) );

// initialize x, could get x that predicts mean residual = 0
x = 0.5;

// find the minimum of the whole expression, one solution will fall above and one below this value
fOverallMin = Name Expr( fsLower );
Minimize( fOverallMin, { x } );
xOverallMin = x;

// find where the value cross the x axis on each side of the overall min
x = xOverallMin - 1;
fMinCrossingTarget = Insert( Expr( Power() ), Insert( Expr( add(-1.96) ), Name Expr( fsLower ) ) );
Minimize( fMinCrossingTarget, { x(-1000, xOverallMin) } );
xMinLower = x;

x = xOverallMin + 1;
Minimize( fMinCrossingTarget, { x(xOverallMin, 1000) } );
xMinUpper = x;

x = xOverallMin - 1;
fMaxCrossingTarget = Insert( Expr( Power() ), Insert( Expr( add(-1.96) ), Name Expr( fsUpper ) ) );
Minimize( fMaxCrossingTarget, { x(-1000, xOverallMin) } );
xMaxLower = x;

x = xOverallMin + 1;
Minimize( fMaxCrossingTarget, { x(xOverallMin, 1000) } );
xMaxUpper = x;

show(xOverallMin, xMinLower, xMinUpper, xMaxLower, xMaxUpper);

Re: Extract factor range from bivariate graph

Re: Extract factor range from bivariate graph

Re: Extract factor range from bivariate graph

Re: Extract factor range from bivariate graph

Re: Extract factor range from bivariate graph