I am trying to perform a multivariate partial least squares on my dataset in JMP. This dataset consists of 10 rows and 59740 columns (FT-IR data, which has many thousands of "discrete" measurements to create a continuous curve). I get the following error when I try to run the PLS, setting 59738 columns to the X factors and two columns to the Y response.
This dataset is structured in the same way as the Baltic sample dataset but with many more measurements.
Does anyone know how to resolve this? It works if I select only a small number of the X factors rather than all 59738 of them. Have I exceeded the amount of data JMP can handle?
Probably best to email email@example.com referencing this thread.
From a methodological point of view, have you plotted the data? Do you expect all regions of wavelength/frequency to discriminate between the responses?
I agree with Ian on contacting support, but there are a couple of other options you can try along the way.
It looks as if you are using JMP Pro. Have you tried Generalized Regression? You can limit the number of important intensities using a penalized regression approach like Elastic Net.
Also a good way to look at spectral data is to use Functional Data Explorer using Rows as Functions. Using your outputs of concentrations of imines and amines as "Z Supplementary" will allow you to use the Functional DOE capability (JMP Pro 15) to get a Generalized Regression model of your data. Make sure to put your sample ID column in to fit the individual curves and use a P-Spline to fit the model. This might take a little bit with 59,740 columns, but I believe it is worth a shot.
Another question for you: are the 'zero' responses really numerically zero, or just placeholders for a 'missing value'? Zeros look odd to me in the context of the other values in each column. JMP needs to know whether they are truly zero (which is how it looks like your analysis is treating them so far) or missing. If they are missing, how would you like to handle the 'missingness'?
OK, so the zeros are actual numeric data. Based on the error message, I'm wondering whether somewhere in the long list of over 50,000 predictor variables you've got one or more columns that consist entirely of missing values. If I'm interpreting the error message correctly, the platform is replacing missing values with the column mean. Have you run any of the missing value exploratory platforms to see whether this is in fact the root cause of the error? If a column consists entirely of missing values, its mean can't be calculated. If missing values aren't the issue, then I'm at a loss and think, along with @ian_jmp and @bill_worley, that reaching out to JMP Technical Support might be the best recourse.
During my tenure as a JMP systems engineer, I once had a customer who was encountering a similar 'error' type message. The root cause was missing values scattered throughout the data table, which made it impossible to run the analysis platform she was trying to use.
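If it helps, the check for entirely missing columns described above can also be sketched outside JMP. This is only a rough illustration in Python, assuming the table has been exported to CSV. The column names (`w1`, `w2`, `w3`, `amine`), the tiny inline sample, and the set of tokens treated as missing are all made up for the example and are not taken from the actual dataset.

```python
import csv
import io
import math

# Tokens treated as "missing" here are an assumption; adjust to match the export.
MISSING_TOKENS = {"", ".", "NA", "NaN", "nan"}

def is_missing(cell):
    """Return True for empty/placeholder cells or NaN; a numeric zero is NOT missing."""
    text = cell.strip()
    if text in MISSING_TOKENS:
        return True
    try:
        return math.isnan(float(text))
    except ValueError:
        return False  # non-numeric text (e.g. a sample ID) counts as present

def missing_report(header, rows):
    """Count missing cells per column and flag columns that are entirely missing."""
    counts = [0] * len(header)
    n_rows = 0
    for row in rows:
        n_rows += 1
        for i in range(len(header)):
            cell = row[i] if i < len(row) else ""  # short rows count as missing
            if is_missing(cell):
                counts[i] += 1
    if n_rows == 0:
        return {"all_missing": [], "partly_missing": {}}
    return {
        "all_missing": [h for h, c in zip(header, counts) if c == n_rows],
        "partly_missing": {h: c for h, c in zip(header, counts) if 0 < c < n_rows},
    }

# Hypothetical miniature table standing in for the exported spectra.
sample_csv = """sample_id,w1,w2,w3,amine
A,0.12,,0.0,1.5
B,0.15,,0.31,0
C,,,0.29,2.1
"""

reader = csv.reader(io.StringIO(sample_csv))
header = next(reader)
report = missing_report(header, reader)
print("Entirely missing columns:", report["all_missing"])
print("Partly missing columns:", report["partly_missing"])
```

In this made-up sample, `w2` has no values at all, so a column mean could not be computed for it, while the zeros in `w3` and `amine` are counted as real data. For the real table, the inline sample would be replaced by the exported file opened with `open(...)`.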