Discussions

SophieCuvillier · Dec 4, 2023 11:01 AM

Hello,

I have a file with missing values. I want to impute these missing values in place with JMP Pro's ADI method, for columns X2, x10, and X12-X16 (see attached file).

However, when I apply the method, not all missing values are imputed. What's the reason behind this?

Imputed values are highlighted in blue, and non-imputed values are circled in red below.

Victor_G · Dec 5, 2023 5:08 AM

Hi @SophieCuvillier,

Can you provide a snapshot of the ADI platform when you launch it, and more details about the columns used?
I was not able to reproduce the errors you obtained. With the default recommended settings of the ADI platform, all missing values can be imputed. The problem might come from a platform setting that prevents it from imputing all missing values (for example, an inappropriate dimensional space), or from the selection of columns.

Here are the steps I followed:

Selection of the columns in the platform for data imputation:
Default recommended settings for ADI:
Results obtained:

I attached the data table with imputed values and the script for ADI.

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

SophieCuvillier · Dec 5, 2023 08:43 AM

Hello @Victor_G,

Thank you for your response. Like you, I kept the default configuration of the ADI platform, with a random seed of 0.

The only difference between what you did and what I did is that I wanted to impute the missing values only for the X2, x10, and X12-X16 columns, not the others. So, before clicking on the ADI imputation button, I first selected these columns via the imputation report (as shown below) and then launched ADI imputation.

It's true that if I don't select anything beforehand it works, but I naively thought that I could select the columns where I wanted to impute.

Victor_G · Dec 5, 2023 6:32 AM

Hi @SophieCuvillier,

The column selection may explain the difference between our results. If you don't want imputed values in some columns, another option is "Save Scoring Formula to Current Data Table", which creates new columns containing the imputed values. You can then assess the relevance of the imputed values, pick the imputed columns you want for visualization or modeling, and better assess the impact (and benefits?) of the imputation method on the modeling results.
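To illustrate the idea of writing imputed values to new columns instead of overwriting the originals, here is a minimal Python sketch. It is not JMP or ADI code: the function name `impute_to_new_columns`, the `_imputed` suffix, and the use of a simple column mean as the fill value are all assumptions made for illustration only.

```python
from statistics import mean

def impute_to_new_columns(table, columns, suffix="_imputed"):
    """Return a copy of `table` with one extra imputed column per requested column.

    `table` maps column names to lists in which None marks a missing value.
    A column mean stands in for the real imputation model (assumption).
    """
    result = {name: list(values) for name, values in table.items()}
    for name in columns:
        observed = [v for v in table[name] if v is not None]
        fill = mean(observed)  # raises StatisticsError if the column is entirely missing
        result[name + suffix] = [fill if v is None else v for v in table[name]]
    return result

table = {"X1": [1.0, 2.0, None], "X2": [None, 4.0, 6.0]}
scored = impute_to_new_columns(table, ["X2"])

print(scored["X2"])          # original column, missing value left untouched
print(scored["X2_imputed"])  # new column with the gap filled
```

Because the original columns are untouched, you can compare each column with its imputed counterpart and keep only the imputed columns you actually want.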

Unlike other imputation methods, ADI benefits from having more columns, because it uses the information from the other covariates to impute missing values. To avoid data leakage (information from the validation/test sets being used in training and data imputation), use a validation column when launching this platform.
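The data leakage point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ADI works internally: a column mean stands in for the imputation model, `is_training` plays the role of the validation column, and the function names are hypothetical.

```python
from statistics import mean

def fit_fill_values(rows, columns, is_training):
    """Learn one fill value per column from the training rows only."""
    fills = {}
    for col in columns:
        observed = [row[col] for row, train in zip(rows, is_training)
                    if train and row[col] is not None]
        fills[col] = mean(observed)
    return fills

def apply_fill_values(rows, fills):
    """Fill missing values in every row using the values learned on training rows."""
    return [
        {col: (fills[col] if val is None and col in fills else val)
         for col, val in row.items()}
        for row in rows
    ]

rows = [{"X2": 1.0}, {"X2": 3.0}, {"X2": None}, {"X2": 100.0}, {"X2": None}]
is_training = [True, True, True, False, False]  # last two rows are held out

fills = fit_fill_values(rows, ["X2"], is_training)
imputed = apply_fill_values(rows, fills)
print(fills)
print(imputed)
```

The held-out value 100.0 never influences the fill value, so the validation rows are imputed without contributing any information to the fit.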

A presentation about this platform is available here: Automated Data Imputation: A Versatile Tool in JMP® Pro 14 for Handling Missing ... - JMP User Commu...

More information about the method can be found here: The Missing Value Report

I hope this complementary answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

SophieCuvillier · Dec 5, 2023 10:39 AM

Thank you very much, Victor!

Some missing values are not imputed with ADI (Automated Data Imputation)