I'm not even sure this is the right place to ask, but here goes.
Right now I'm working with multiple regression in JMP. I have Y as the abundance of bees in a certain area, and several X variables such as temperature, land use area, amount of flowers and so forth.
I know the values of Y and X, and I'm trying to narrow the X variables down so that only the 3-5 significant variables determining bee abundance remain.
Will I be able, via JMP, to create a model where I can input new values and then get the corresponding Y (number of bees)? Say that I would like to examine the effect of a rise in temperature. I would like to be able to punch in the new temperature, and then get the new Y value. Sort of a prediction tool for bee abundance.
Once you have found the model that you think best fits your data, you can choose Save Columns > Prediction Formula under the red triangle in the report (Fit Model platform).
A new column containing the prediction formula for your model is created. If you add new rows and enter X values, the corresponding Y values are calculated automatically (however, the model is not automatically re-fitted to the new data).
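Outside JMP, the same idea (fit once, then reuse the fixed formula on new X values) can be sketched in Python with NumPy. This is only an illustration, not what JMP runs internally: the data are made up, and the names `predict`, `temperature` and `flowers` are hypothetical.

```python
import numpy as np

# Hypothetical data: 6 sites, two predictors (temperature in deg C, flower count).
X = np.array([
    [15.0, 120.0],
    [17.0, 150.0],
    [18.5,  90.0],
    [20.0, 200.0],
    [22.0, 170.0],
    [24.0, 210.0],
])
y = np.array([30.0, 41.0, 33.0, 58.0, 55.0, 66.0])  # made-up bee counts

# Fit the model once by ordinary least squares (intercept + two slopes).
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
coef_at_fit = coef.copy()

def predict(temperature, flowers):
    """Apply the saved formula; the coefficients are NOT re-estimated here."""
    return coef[0] + coef[1] * temperature + coef[2] * flowers

# "Add a new row": same flower count as site 5, but 1 degree warmer.
new_prediction = predict(23.0, 170.0)
print("Predicted bee abundance for the new row:", new_prediction)
```

Calling `predict` with new values only evaluates the stored formula. To let new observations influence the coefficients, the model has to be fitted again with those rows included.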
The only problem with this method: the predicted column doesn't match the actual Y values that I have. Perhaps this question reflects my JMP knowledge, but shouldn't they be the same? I mean, if I use my Y and X values to create a formula that can predict new Y values from new X values, shouldn't the formula be able to predict the old Y values from the old X values?
In regression, Y is regarded as a random variable, i.e. it has some error, whereas the X's are typically treated as fixed constants (although this is often not the case in real problems). The prediction formula represents the "best" fit of the model to your actual data under the current model assumptions (e.g. linearity). The formula will rarely predict any of the original Y's exactly. Think of a simple linear regression: the line runs somewhere through the middle of the observed x-y pairs, and only by chance does a point fall exactly on the line.
In summary, the prediction formula should be able to predict your Y's reasonably well, but don't expect it to reproduce them exactly. However, if the prediction is way off for the majority of the old Y's, then the model is not good.
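A small Python sketch of the point above, using invented numbers: the fitted line passes through the middle of the points, so each observed Y differs from its prediction by a residual, even though the residuals balance out overall.

```python
import numpy as np

# Invented x-y pairs that are close to, but not exactly on, a straight line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Simple linear regression: np.polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(x, y, 1)

fitted = slope * x + intercept   # what the "prediction formula" gives for the old X's
residuals = y - fitted           # observed minus predicted

# R^2: share of the variation in y that the line accounts for.
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean())**2)

print("Observed: ", y)
print("Predicted:", fitted)
print("Residuals:", residuals)
print("R^2:", r_squared)
```

Small residuals scattered around zero are expected. Large residuals for most rows, or a low R^2, suggest the model is missing something.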
Thank you for clarifying that for me. It could be that my explanatory variables are somewhat weak, since a portion of them are based on human assessment, grouping certain areas. These groups might not be the same from a bee's perspective.
Nice with the profiler tool. :) I always find it useful to visualize my results. I will definitely be working some more with that.
"Will I be able, via JMP, to create a model where I can input new values and then get the corresponding Y (number of bees)?"
The answer to this is yes. In the Fit Model platform, from the main hot spot (red triangle) for the model, you can select Factor Profiling > Profiler.
This will give you interactive graphs showing the effect of each X on Y.
There is a vertical dotted red line on each graph; you can drag it to change the X value for the prediction that shows on the Y axis. The current X value is shown as a red number just below the X axis; you can type over this number to enter directly the value of X for which you want to make the prediction.
The prediction is based on the model, and is only as good as the model, which in turn depends on the X's you have in the model (i.e. whether your data can actually act as predictors) and on the structure of the model (e.g. interactions between predictors).
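For readers who want to see the idea behind a profiler trace outside JMP, here is a rough Python analogue. It is not JMP's implementation: the data are made up, the model has main effects only (no interactions), and `profile_trace` is a hypothetical helper. It moves one X across a grid while holding the other X at a baseline value and reports the predicted Y.

```python
import numpy as np

# Hypothetical data: temperature (deg C) and flower count for 6 sites.
X = np.array([
    [15.0, 120.0],
    [17.0, 150.0],
    [18.5,  90.0],
    [20.0, 200.0],
    [22.0, 170.0],
    [24.0, 210.0],
])
y = np.array([30.0, 41.0, 33.0, 58.0, 55.0, 66.0])  # made-up bee counts

# Main-effects model fitted by ordinary least squares.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def profile_trace(coef, baseline, index, grid):
    """Predicted Y as predictor `index` moves over `grid`, others held at `baseline`."""
    rows = np.tile(baseline, (len(grid), 1))  # one copy of the baseline per grid point
    rows[:, index] = grid                     # overwrite the predictor being varied
    return coef[0] + rows @ coef[1:]

baseline = X.mean(axis=0)                                   # hold other X's at their means
temp_grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 5)    # stay within the observed range
trace = profile_trace(coef, baseline, 0, temp_grid)

for t, pred in zip(temp_grid, trace):
    print(f"temperature={t:.2f} -> predicted abundance={pred:.2f}")
```

The grid deliberately stays inside the observed temperature range; predictions for temperatures outside the data are extrapolations and deserve extra caution.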