Hi everyone,
I tried to find an answer to my question in the community and in JMP help, but I couldn't. So, I would really appreciate it if anyone could point me in the right direction here or let me know what basic concept I am missing.
I am fitting a model to my response (dependent variable, DV), which is continuous, using 3 independent variables (IVs) [2 categorical and 1 continuous] and the interaction between 2 of them (for the third, I am only using the main effect). I first used Least Squares, as you can see in the screenshot below.
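In case it helps to make the model concrete, here is a rough sketch of the same specification outside JMP, using Python's statsmodels with placeholder column names (DV, Cat1, Cat2, Cont stand in for my actual variables):

```python
# Sketch of the model spec: DV ~ Cat1 + Cat2 + Cat1:Cat2 + Cont
# Column names (DV, Cat1, Cat2, Cont) and the file name are
# hypothetical placeholders, not my real data.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("my_data.csv")  # hypothetical data file

# C(Cat1) * C(Cat2) expands to both main effects plus their
# interaction; Cont enters as a main effect only.
model = smf.ols("DV ~ C(Cat1) * C(Cat2) + Cont", data=df).fit()
print(model.summary())
```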
However, my residual-by-predicted plot does not look very good (there are some outliers). No matter which other factors I added to the model, or which transformations I tried, I could not improve the residual plot. The best result came from transforming my response, which is the version already shown above.
Next, I thought about using a penalized regression method (specifically Ridge) through the Generalized Regression personality. I read that it can be more robust to outliers (though it introduces some bias).
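For reference, my understanding of what Ridge does, sketched in Python with scikit-learn (again with placeholder column names, and with the interaction term omitted for brevity):

```python
# Sketch of Ridge: ordinary least squares plus an L2 penalty on the
# coefficients (alpha), which shrinks estimates toward zero and
# trades some bias for lower variance. Placeholder column names.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("my_data.csv")  # hypothetical data file
X = pd.get_dummies(df[["Cat1", "Cat2", "Cont"]], drop_first=True)
y = df["DV"]

# Standardize predictors first, since the penalty is scale-sensitive.
ridge = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
ridge.fit(X, y)
print(ridge.named_steps["ridge"].coef_)
```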
I started by running the Generalized Regression with the Standard Least Squares estimation method. According to some videos I have seen, this should produce the same results as the standard Least Squares regression.
Also, according to JMP help (https://www.jmp.com/support/help/en/15.2/index.shtml#page/jmp/training-and-validation-measures-of-fi...), the Generalized RSquare should be between 0 and 1, and it simplifies to the traditional RSquare for continuous normal responses in the standard least-squares setting. However, as you can see from the screenshot below, although the Generalized RSquare matches the RSquare for the Training set, it does not match for the Validation set (and it is negative there). The other statistics (e.g., RASE) do match for both sets (Training and Validation).
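One thing I noticed while digging: the traditional RSquare, computed as 1 - SSE/SST, is only guaranteed to be non-negative on the data the model was fit to. On a holdout set it can go below zero whenever the model predicts worse than simply using the validation mean. I am not sure this is exactly what JMP computes for the validation column, but here is a small Python sketch with made-up numbers showing how such a statistic can turn negative:

```python
# Illustration with made-up numbers (not my data): R^2 = 1 - SSE/SST
# is negative on a validation set whenever the model's predictions
# are worse than using the holdout mean as the prediction.
import numpy as np

y_valid = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical holdout responses
y_pred = np.array([4.0, 3.5, 1.0, 0.5])   # hypothetical (poor) predictions

sse = np.sum((y_valid - y_pred) ** 2)             # error around the model
sst = np.sum((y_valid - y_valid.mean()) ** 2)     # error around the mean

r2 = 1 - sse / sst
print(r2)  # negative here, because SSE > SST
```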
Any ideas about what may be happening here?
Thank you in advance for any ideas and hints!
PS: For everything I am doing in this post, I used the Fit Model platform in JMP.