Selecting the proper analysis method to develop nonlinear predictive formula

Kathorvath · Jun 8, 2023 5:32 PM

I would like to use simulation data to develop a predictive formula that would relate physical dimensions (independent variables with a range of dimension values) to the resulting field that is produced (dependent variable). I have 3 independent variables and I'm measuring the generated field at 3 locations. I would like to develop a set of formulas that allow me to enter the desired field at the 3 different locations and solve for the 3 physical dimensions that would yield those field values.

Example of desired formula:

Desired field at position 1 = a*(dimension 1) + b*(dimension 2) + c*(dimension 3) + d

Using Fit Model (Standard Least Squares, Effect Leverage), I have generated 3 linear prediction formulas. In theory these should let me accomplish my task, but they only represent the resulting field across a very narrow range of the physical dimensions. I believe there is some non-linear behavior that these linear formulas are not capturing.

Could anyone suggest an analysis method for generating non-linear predicted formulas?

Thank you!

statman · Apr 26, 2021 02:30 PM

First, welcome to the community. If possible, it would be helpful if you could attach the JMP data set you are looking at (it is fine to code the actual values if they are sensitive). I'm trying to get my head around your situation, but have some questions:

1. Are the 3 dimensions actually independent?

2. How much do those dimension values vary in the study?

3. What models did you use? If your dimension data has more than 2 levels, you can certainly add non-linear terms to the model. In the Fit Model platform, you can select the 3 dimensions in the list and select Macros>Factorial to Degree and have JMP write appropriate models (Degree=2 is quadratic, Degree=3 is cubic, etc.). I don't know what data you have available.

4. For the analysis you performed, how well did the models predict the Y? R-squares, p-values, RMSE, etc. What do the residuals look like? Often you can see departure from linear in residual plots.

5. Field is your Y? I don't know what this is. A magnetic field or an electrical charge? It sounds like you did not get much variation in the field in your data set, so your models are only valid over a limited region. This may be an inference space issue.

6. If I understand you correctly, you want an equation Y=D1 + D2 + D3 and then you want to enter a Y and solve for D1-D3. Do you know how to algebraically solve simultaneous equations?
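To illustrate point 6 for the purely linear case: once three linear prediction formulas are fitted, entering three target fields and solving for the three dimensions is a 3x3 linear system. The sketch below is in Python rather than JMP, and every coefficient, intercept, and target value in it is made up for illustration, not taken from any real data set.

```python
def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    # Build the augmented matrix [A | y]; copying keeps the inputs unmodified.
    M = [[float(v) for v in A[i]] + [float(y[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("equations are not independent; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from all rows below the pivot row.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution, starting from the last unknown.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


# Hypothetical fitted formulas: field_k = a_k*D1 + b_k*D2 + c_k*D3 + d_k
coeffs = [[0.5, -2.0, 4.0],
          [0.3, 1.0, -3.0],
          [-0.2, 0.5, 1.0]]
intercepts = [10.0, 5.0, 8.0]
targets = [15.5, 9.0, 8.5]  # desired field at the 3 locations (made up)

# Move the intercepts to the right-hand side, then solve for D1..D3.
rhs = [t - d for t, d in zip(targets, intercepts)]
dims = solve_linear(coeffs, rhs)
print(dims)
```

If the three formulas are nearly proportional to one another, the system has no unique solution and the function raises an error, which would suggest the three field locations do not carry independent information about the dimensions.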

Perhaps others have a better understanding or a different interpretation of your situation.

"All models are wrong, some are useful" G.E.P. Box

Kathorvath · Apr 26, 2021 05:44 PM

Hello, thanks for taking the time to look over my question. Let me preface this by saying that I'm relatively new to using JMP and I don't have much background in statistics (part of why I've been struggling to figure this out). From a practical perspective, I want to use this set of equations to design a physical structure that is capable of producing my desired electric field. Attached is my dataset.

1. The independent variables are dimensions of an object: the width, height, gap, and thickness. So technically there are 4 independent variables, but the thickness factor doesn't seem to be very significant.

2. The dimension ranges are as follows: width 10-25 mm, height 1-7 mm, gap 0.1-3 mm, and thickness 1-7 mm (ideally with the constraint height > gap).

3. I've been using Fit Model and, I believe, have been running the analysis with Macros>Factorial to Degree already set to 2.

4. I saved the fit model analysis to the dataset table

5. My Y is a measurement of a simulated electric field and I actually get A LOT of variation - perhaps too much. I don't think the inference space is an issue but I'm also not very familiar with analyzing this concept.

6. Yes, that is correct, and yes, given three equations and three unknown variables (width, height, and gap) I should be able to solve for these values.
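Once the prediction formulas contain interaction or squared terms, the three equations can no longer be solved with plain linear algebra, but they can still be solved numerically. Below is a minimal sketch in Python using Newton's method with a finite-difference Jacobian. The `model` function and all of its coefficients are hypothetical stand-ins for saved prediction formulas, not values from the attached dataset.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(J, r):
    """Solve the 3x3 system J * step = r with Cramer's rule."""
    d = det3(J)
    if abs(d) < 1e-14:
        raise ValueError("singular Jacobian; try a different starting point")
    step = []
    for k in range(3):
        Jk = [row[:] for row in J]
        for i in range(3):
            Jk[i][k] = r[i]  # replace column k with the right-hand side
        step.append(det3(Jk) / d)
    return step


def model(x):
    """Hypothetical second-order formulas for the field at 3 locations."""
    w, h, g = x  # width, height, gap
    return [
        2.0 + 1.5 * w + 0.2 * h - 0.5 * g + 0.02 * w * h,
        1.0 + 0.1 * w + 2.0 * h - 0.3 * g + 0.05 * h * h,
        4.0 - 0.2 * w + 0.3 * h + 3.0 * g + 0.10 * g * g,
    ]


def invert(targets, start, tol=1e-10, max_iter=50):
    """Find dimensions x such that model(x) matches the target fields."""
    x = list(start)
    for _ in range(max_iter):
        f = model(x)
        resid = [t - fi for t, fi in zip(targets, f)]
        if max(abs(v) for v in resid) < tol:
            return x
        # Approximate the Jacobian d(field_i)/d(dimension_k) by forward differences.
        J = [[0.0] * 3 for _ in range(3)]
        for k in range(3):
            dk = 1e-6 * max(1.0, abs(x[k]))
            xp = list(x)
            xp[k] += dk
            fp = model(xp)
            for i in range(3):
                J[i][k] = (fp[i] - f[i]) / dk
        # Newton update: move x by the step that cancels the linearized residual.
        dx = solve3(J, resid)
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("Newton iteration did not converge")


# Made-up targets, generated from known dimensions so the answer can be checked.
targets = model([15.0, 4.0, 1.5])
solution = invert(targets, start=[12.0, 3.0, 1.0])
print(solution)
```

A nonlinear system can have several solutions or none, so any result would still need to be checked against the allowed ranges (width 10-25 mm, height 1-7 mm, gap 0.1-3 mm) and the height > gap constraint.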

Additional thoughts:

A. The Eh values from the predicted formulas should never be <0 - is there a way to constrain this?

B. These predicted formulas do not seem to provide an accurate prediction of the Y across the full range of the independent variables (width, height, and gap). Is there a way to analyze the data with a segmented or split approach and generate multiple sets of predicted formulas that represent the simulated results better?
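On point A, one common way to keep predictions from going below zero is to fit the model to the logarithm of the response and back-transform with the exponential, which is positive by construction. The sketch below, in Python, uses a single predictor and invented field values purely to keep it short. Whether a log-scale model actually suits the simulated data is an assumption that would need checking against the residuals.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented example data: field strength falling off as the gap widens.
gap = [0.1, 0.5, 1.0, 2.0, 3.0]        # mm
field = [40.0, 22.0, 11.0, 2.5, 0.6]   # arbitrary units, all positive

# Straight-line fit on the raw response: nothing stops it from going negative.
a0, a1 = fit_line(gap, field)
raw_prediction_at_3mm = a0 + a1 * 3.0  # negative for this data

# Fit on log(field) instead, then back-transform with exp().
b0, b1 = fit_line(gap, [math.log(v) for v in field])


def predict_field(g):
    """Back-transformed prediction; exp() keeps the result above zero."""
    return math.exp(b0 + b1 * g)


print(raw_prediction_at_3mm, predict_field(3.0))
```

The same idea extends to several predictors with interaction terms: only the response column is transformed before fitting, and the saved formula is exponentiated afterwards. It requires every observed field value to be strictly positive.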

Any additional thoughts or suggestions would be most appreciated.
