uday_guntupalli · Dec 30, 2017 07:20 PM

All,

I have some data that fits well with the "Power Fit" in Excel.

I found some helpful information in this post (https://community.jmp.com/t5/Discussions/Power-Fit-Equation-for-JMP/td-p/5783)

I would greatly appreciate some more guidance on this topic:

1. When the power fit is performed on the data in Excel, the fit seems fairly reasonable.
2. To perform the power fit on the data in JSL, theta1 and theta2 are the two tunable parameters. Through trial and error I have found values of theta1 and theta2 that make the model fit the data well.

How can a user determine the best values of theta1 and theta2 for a given data set without overfitting the model?

Please kindly provide an example if possible.

Best
Uday

dale_lehman · Dec 31, 2017 09:02 AM

I don't have an answer, but this may start some interesting discussion. Attached is a small data set with some observations of automobile speed and stopping distance. A power curve appears to fit well. I used the Nonlinear platform in JMP with the estimator column, with initial parameter guesses of 1, 1 (not too close to the right values but not very far off). What is saved in the data set is the best fit from the Nonlinear platform. In partial answer to your question, JMP will compute theta1 and theta2, so you don't need to worry about "overfitting", but whether JMP converges on a solution can depend on the values you use as starting guesses.

I also added a column using the formula that Excel gives for the Power fit trendline. Note that they are not the same! The JMP fit is better than the Excel fit. This is not shocking, as Excel often does not perform the best statistical analysis. However, I was surprised at how different the two solutions are for this fairly simple data.

I believe both JMP and Excel can run into issues when fitting a power relationship to data. JMP may or may not converge to a solution; if it converges, I believe the solution is reliable. Excel just gives you a fit. I don't know whether it reports an error when it can't fit, or what happens otherwise; perhaps someone can shed light on how Excel fits the power function to the data. I think "overfitting" is a red herring here: it pertains to whether the power relationship is the right relationship or not, and not to the particular power relationship that is being fit. Since the algorithm is choosing the theta1 and theta2 values, they will be the values that fit the data best. Overfitting is always a possibility, but it is a property of the function you are using to fit the data and not of the particular parameter values the algorithm chooses. Perhaps a clearer way of saying this is that if this particular data is not a good sample of the data you are trying to model (perhaps a different type of car), then fitting the power relationship to this data runs the risk of overfitting.

The Excel vs JMP question is a different one and I am interested if anyone can shed light on the differences between the two.
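
One likely source of the difference, offered as background and not as something confirmed for the attached data: Excel's power trendline is obtained by ordinary least squares after taking logarithms of both variables, whereas a least-squares fit in the Nonlinear platform works on the original scale of the response. Writing the model as y = theta1 * x^theta2, the two criteria can be sketched as follows (the index i runs over the observations):

```latex
% Linearized (log-log) criterion, the kind used for a power trendline:
\min_{\theta_1,\theta_2} \sum_i \left( \ln y_i - \ln\theta_1 - \theta_2 \ln x_i \right)^2

% Least-squares criterion on the original scale:
\min_{\theta_1,\theta_2} \sum_i \left( y_i - \theta_1 x_i^{\theta_2} \right)^2
```

The first criterion penalizes relative errors, so small and large responses count about equally; the second penalizes absolute errors, so the largest responses dominate. The two generally give different theta1 and theta2 unless the data lie exactly on a power curve.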

uday_guntupalli · Jan 1, 2018 02:05 PM

@dale_lehman, @susan_walsh1
Thank you for your suggestion.

I am trying to put together an example to understand the workflow.
When I launch the Nonlinear platform, I specify the following:

Y - variable
X - variable
Nonlinear model

I then save the prediction formula, and the values are saved to the data table.

I am presuming that, because I am not setting any starting values for the parameters, JMP automatically chooses what it considers the best values for the model parameters and saves the predicted values as a column in the data table.

However, if I run a second model on the same data table, the column that gets added after launching the Nonlinear platform the second time has the old model's equation. Please see the sample code below:


Clear Log(); 
Clear Globals(); 

dt = Open( "$SAMPLE_DATA/Nonlinear Examples/US Population.jmp" );

/* Run Predictions using different models*/ 
PowerFitPredModel = Nonlinear(Y(:pop),X(:Name("X-formula")),"Model E(2P)");
PowerFitPredModel << Save Prediction Formula ; 
PowerFitPredModel << Close Window; 

Wait(0.1);
CPredModel = Nonlinear(Y(:pop),X(:Name("X-formula")),"Mechanistic Growth Model(3P)");
CPredModel << Save Prediction Formula ; 
CPredModel << Close Window;

Now, a couple of things that I don't follow:

1. When I launch the Nonlinear platform multiple times, why is only the first model's equation applied in the above script?
2. How do I determine whether the fit converges from the parameter values that JMP starts with?

Best
Uday

dale_lehman · Jan 1, 2018 02:25 PM

My understanding of how the Nonlinear platform works is a bit different. To run that platform, you create a formula predictor column using parameters, and I believe you must supply initial values for these parameters as guesses. When you click "Go", the platform will tell you whether it converged to a solution. If it converges and you save the solution, it places those values in the predictor column you created, but the formula is still the same as the original. In other words, the predictor column contains parameters which you initialize and which then need to be optimized. Unlike other JMP platforms, saving does not create a new column; it only places the solution values from the nonlinear optimization in the formula column you created.

My understanding may be wrong, so someone can correct me.

uday_guntupalli · Jan 3, 2018 08:20 PM

In order to provide closure, I am posting the response I received from Technical Support.

Essentially, launching the Nonlinear platform via JSL has certain limitations. The best way to use nonlinear models from a script is the following:

1. Define a column whose formula encodes the model.

2. Pass that formula column as an input to the Nonlinear platform call.

Although this requires initial starting values, it is the best answer I have found so far.

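// Open the sample table used in the earlier script.
dt = Open( "$SAMPLE_DATA/Nonlinear Examples/US Population.jmp" );

// One formula column per model; Parameter() lists the starting values.
dt << New Column( "Model E", Numeric, Continuous,
    Formula( Parameter( {theta1 = 0.03, theta2 = 1.25}, theta1 * :year ^ theta2 ) )
);

dt << New Column( "Exponential 3", Numeric, Continuous,
    Formula( Parameter( {a = -20, b = 0.01, c = 0.01},
        a + b * Exp( c * :year ) ) )
);

dt << New Column( "Mechanistic Growth", Numeric, Continuous,
    Formula( Parameter( {a = -40, b = 0.01, c = -0.01},
        a * (1 - b * Exp( -c * :year )) ) )
);

// Evaluate the new formula columns before fitting.
dt << Run Formulas;

// Fit one model by passing its formula column as X.
fc = Nonlinear( Y( :pop ), X( :Name( "Exponential 3" ) ) );
fc << Finish;

// Write the fitted parameter values back as a prediction formula.
fc << Save Prediction Formula;
//End Script;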

Best
Uday
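
One way to come up with the starting values that the formula-column approach needs is to fit the log-transformed data first and use that result as the initial guess. The following is a minimal Python sketch of the idea and not JMP's algorithm: the data values are made up, the helper names `loglog_start`, `sse`, and `gauss_newton` are hypothetical, and the Gauss-Newton loop with step halving only stands in for whatever optimizer the Nonlinear platform uses. The `converged` flag illustrates the kind of convergence check asked about earlier in the thread.

```python
import math

# Synthetic, made-up (x, y) pairs used only for illustration.
xs = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
ys = [2.0, 4.1, 12.5, 18.7, 33.0, 43.2, 63.5, 78.9]


def loglog_start(xs, ys):
    """Least squares on (ln x, ln y); requires all x and y to be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    slope = sxy / sxx
    return math.exp(my - slope * mx), slope  # (theta1, theta2)


def sse(xs, ys, t1, t2):
    """Sum of squared residuals of y = t1 * x**t2 on the original scale."""
    return sum((y - t1 * x ** t2) ** 2 for x, y in zip(xs, ys))


def gauss_newton(xs, ys, t1, t2, tol=1e-10, max_iter=100):
    """Refine (t1, t2) by Gauss-Newton with step halving.

    Returns (t1, t2, converged); converged is False if max_iter is reached.
    """
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            p = x ** t2
            r = y - t1 * p             # residual
            j1 = p                     # derivative of the model w.r.t. t1
            j2 = t1 * p * math.log(x)  # derivative of the model w.r.t. t2
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            return t1, t2, False
        # Solve the 2x2 normal equations (J'J) d = J'r.
        d1 = (a22 * g1 - a12 * g2) / det
        d2 = (a11 * g2 - a12 * g1) / det
        # Halve the step until the sum of squares no longer increases.
        step = 1.0
        base = sse(xs, ys, t1, t2)
        while step > 1e-8 and sse(xs, ys, t1 + step * d1, t2 + step * d2) > base:
            step /= 2.0
        t1 += step * d1
        t2 += step * d2
        small1 = abs(step * d1) <= tol * (abs(t1) + tol)
        small2 = abs(step * d2) <= tol * (abs(t2) + tol)
        if small1 and small2:
            return t1, t2, True
    return t1, t2, False


t1_start, t2_start = loglog_start(xs, ys)
t1, t2, converged = gauss_newton(xs, ys, t1_start, t2_start)

print("log-log start:", t1_start, t2_start, "SSE:", sse(xs, ys, t1_start, t2_start))
print("refined fit:  ", t1, t2, "SSE:", sse(xs, ys, t1, t2), "converged:", converged)
```

The starting pair printed on the first line is the kind of value that could be typed into the Parameter( {theta1 = ..., theta2 = ...} ) list of a formula column before launching the platform.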

Implementing "Power Fit" Model in JMP and tuning the model