Matheus
Level II

Imputation for Nominal Data

Hello, everyone. 

 

I would like to know if it's possible to impute missing values for nominal data.

I know how to do it for continuous data, but for categorical data I could not find anything in the software.

[Attachment: Imagem1.jpg]

 

Thanks in advance! 

 
 

 


Re: Imputation for Nominal Data

JMP does impute missing categorical values. The methods for imputation might depend on the analysis or model of the data, so JMP does not provide a separate imputation function for the data table. Imputation is performed by the analysis platforms. Which platform do you intend to use?

Learn it once, use it forever!
Matheus
Level II

Re: Imputation for Nominal Data

Thank you, @markbailey 

I had no idea! 

 

Actually, I have several options for analysing my data, and I would like to compare the results given by different platforms, such as:

Fit Model 

Predictive Model - Neural

Specialized Modeling - Gaussian Process

 

Do you know if it's possible to impute using these platforms?

If yes, please tell me how.  

 

Thanks again! 


Re: Imputation for Nominal Data

Fit Model is not a platform. It is a user dialog to specify the launch of many different platforms. You can start here in the JMP Help system and find the platform that you intend to use. It should indicate either in the section to launch it or the section about options how it deals with missing values.

 

The Neural platform in JMP Pro supports 'informative missing' instead of imputation. The platform is documented here.

 

The Gaussian Process platform in JMP Pro allows for categorical predictors but it does not provide either imputation or informative missing, so those observations are excluded.

Matheus
Level II

Re: Imputation for Nominal Data

Thank you for the reply, @markbailey 

 

So, within the Fit Model dialog, I would probably use these:

- simple and multiple linear regression

- nominal and ordinal logistic regression

- partial least squares

 

My JMP version is not the Pro one. Can JMP 15.1 impute missing nominal values in the platforms listed above?

 

Note: I got your point about dialog versus platform, but the Help section you shared says:

"Using the Fit Model platform, you can specify complex models efficiently".

 

Best regards


Re: Imputation for Nominal Data

I do not know for sure the differences between JMP and JMP Pro with regard to support for missing value imputation or handling. I am sure about what I posted previously. You can contact JMP Technical Support (support@jmp.com) for more detailed answers.

 

You might want to see this add-in, too.

 

Note: clearly I did not write the documentation!

P_Bartell
Level VI

Re: Imputation for Nominal Data

I agree with everything @markbailey has shared. To add my two cents: if you don't have JMP Pro, here's an idea to consider. You don't have to invoke informative missing in a platform launch dialog, but you get the moral equivalent of it in the modeling analysis. The 'missingness' will be somewhat hidden and not as neatly displayed in tables or data visualizations, but it will be in there.

 

First, a bit of explanation of how JMP handles 'missingness' for nominal variables/effects. In general, the concept is called 'informative missing' throughout the JMP ecosystem. The primary presumption with respect to informative missing is that there is INFORMATION within the system of study that leads to the observation being missing. Maybe an underlying cause? A commonality? Or a lurking common variable that leads to the missing observation?

 

For example, suppose you survey a group of fishermen about how many fish they caught, and you are trying to model the effects that influence the number of fish caught. One of the survey questions is 'fishing method', with three options for respondents to pick from: 1. trolling, 2. spin casting, 3. still fishing. Respondents are allowed to leave a field blank, in which case it is 'missing' within the observation. Now suppose that, for some reason, the respondents who tend to have the highest catch counts are also those who chose NOT TO ANSWER THE QUESTION. And this makes some sense, doesn't it? Fishermen are notorious for keeping their secrets, so there is information with respect to the study when respondents leave out the answer. One would then want to understand the underlying mechanism behind why the data are missing, to better understand the entire system.

 

Now here's my suggestion within JMP. If you are willing to presume there is information in the 'missingness' (and this is a big presumption), then you can replace the missing cells with a pseudo nominal value that is the same in every missing cell. For analysis purposes, informative missing in JMP/JMP Pro generally treats the missingness as another level of the nominal variable. Hence it will be included in any effects that include that term. Here's the JMP documentation on informative missing in the Partition platform, which does NOT require JMP Pro. So make sure you closely examine the launch dialog of each modeling platform you are thinking of using. You may NOT have to mess with your data table after all:

 

https://www.jmp.com/support/help/en/15.2/#page/jmp/informative-missing-2.shtml
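
For anyone who wants to see the recoding idea outside JMP, here is a minimal sketch in plain Python. It is not JMP's implementation of informative missing, only an illustration of "treat missing as its own level". The survey numbers are made up, and the names `recode_missing`, `mean_by_level`, and the label "Missing" are hypothetical choices for this example. In JMP itself the same recode would be done directly in the data table.

```python
from collections import defaultdict

# Hypothetical label for the extra level; any value not already used works.
MISSING_LEVEL = "Missing"


def recode_missing(rows, column, label=MISSING_LEVEL):
    """Return copies of the rows with None or blank values in `column`
    replaced by a single common label, so missingness becomes a level."""
    recoded = []
    for row in rows:
        new_row = dict(row)  # copy so the original data are not modified
        value = new_row.get(column)
        if value is None or str(value).strip() == "":
            new_row[column] = label
        recoded.append(new_row)
    return recoded


def mean_by_level(rows, column, response):
    """Mean of `response` for each level of the nominal `column`."""
    totals = defaultdict(lambda: [0.0, 0])  # level -> [sum, count]
    for row in rows:
        acc = totals[row[column]]
        acc[0] += row[response]
        acc[1] += 1
    return {level: s / n for level, (s, n) in totals.items()}


# Made-up survey data in the spirit of the fishing example above.
survey = [
    {"method": "trolling", "fish": 3},
    {"method": "spin casting", "fish": 4},
    {"method": None, "fish": 9},
    {"method": "still fishing", "fish": 2},
    {"method": "", "fish": 11},
    {"method": "trolling", "fish": 5},
]

recoded = recode_missing(survey, "method")
means = mean_by_level(recoded, "method", "fish")
for level, mean in means.items():
    print(level, mean)
```

With these invented numbers the "Missing" level has the highest mean catch, which is exactly the kind of pattern that would be lost if rows with a blank fishing method were simply excluded from the analysis.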