Quick normality test on many parameters
Hello,
I have lots of parameters (100 to 2000), each with 200 to 2500 values in the sample. For each one, I want to run a statistical test to check whether the distribution is normal.
My problem is that it takes a long time (about 6 minutes): I have to run the Distribution > Fit Normal > Goodness of Fit platform for each parameter and retrieve the p-value from the Anderson-Darling test, for example, to draw my conclusion.
Isn't there a faster way? I'd like to avoid using Python as much as possible, because we have JMP 17 and we're afraid it will cost a lot to migrate the code when we switch to JMP 19 in 6 months' time, because it changes a lot.
Best regards,
Accepted Solutions
Re: Quick normality test on many parameters
Hi @SophieCuvillier ,
Got you, I've had a try at putting something together for you that will run the make combined data table in a script - this was a fun challenge to get the platform to run all the columns into one window instead of individual windows per column.
Let me know how that works for you.
//Random Column Generator v1 - Inspired by https://www.linkedin.com/feed/update/urn:li:activity:7241062805292929025/
Names Default To Here( 1 );
//Get current data table, clear previous selections
dt = Current Data Table();
dt << Clear Column Selection();
//Launch dialog for columns - adapted from https://community.jmp.com/t5/JMPer-Cable/Prompting-for-columns-totally-modally/ba-p/189647
nw = New Window( "Launch Dialog",
	<<Modal,
	V List Box( Align( "right" ),
		H List Box(
			Panel Box( "Select Columns",
				//Filter for continuous data only
				clb = Filter Col Selector( dt, All, <<continuous( 1 ), <<ordinal( 0 ), <<nominal( 0 ) )
			),
			Panel Box( "Cast Selected Columns for Distribution",
				Lineup Box( N Col( 2 ), Spacing( 5 ),
					Button Box( "Columns",
						clbY << Append( clb << Get Selected )
					),
					//MaxItems( 20 ) caps the cast list at 20 columns; raise it for larger runs
					clbY = Col List Box(
						"Numeric",
						MinItems( 1 ),
						MaxItems( 20 ),
						nlines( 5 )
					),
					Button Box( "Remove",
						clbY << Remove Selected
					)
				)
			)
		),
		H List Box(
			Button Box( "OK",
				cols = clbY << Get Items( "Column Reference" );
			),
			Button Box( "Cancel" )
		)
	)
);
//Test for cols names
Print( cols );
N = N Items( cols ); //count the number of cols
distList = ""; //create a storage location to compile the script commands for each distribution
For( i = 1, i <= N, i++,
	Insert Into( distList,
		"Continuous Distribution(column(" || Char( cols[i] ) || "), Fit Normal(Goodness of Fit(1))),"
	)
);
finalScript = "Distribution(" || distList || ");"; //combine the distList of distribution cols with the Distribution command
distribrep = Eval( Parse( finalScript ) );
//Target the first Goodness-of-Fit Test table (Shapiro-Wilk / Anderson-Darling results)
firstTable = Report( distribrep )[Char( cols[1] ), "Fitted Normal Distribution", "Goodness-of-Fit Test", Table Box( 2 )];
//Make a combined data table with the results
firstTable << Make Combined Data Table;
Re: Quick normality test on many parameters
@matth1 wrote:
Unfortunately it's no help while you're using JMP 17, but I posted a JSL script which uses Python to do what you're looking for (I think). Hopefully it might be helpful when you move to JMP 18/19.
@SophieCuvillier actually I forgot I had created a non-Python version of this script - I only changed because the Python version was much faster for large data volumes.
Please see attached, I hope it helps. Note that everything below line 107 is just for creating the results display window - if all you want is the table of comparative p-values, feel free to delete that code.
I only have JMP 18 to try it with, but it should be OK with v17.
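For anyone comparing the two routes, here is a minimal standalone sketch of what a Python-side Anderson-Darling normality check over many parameters could look like. It is not the attached script: the function names (`anderson_darling_normal`, `normality_table`) are hypothetical, it uses only the Python standard library, and the p-value comes from the D'Agostino and Stephens piecewise approximation, so it may not match JMP's reported p-value exactly.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def anderson_darling_normal(values):
    """Return (adjusted A-squared, approximate p-value) for a normality test.

    Mean and standard deviation are estimated from the data. The p-value
    uses the D'Agostino & Stephens piecewise approximation (an assumption
    of this sketch, not necessarily what JMP computes).
    """
    x = sorted(values)
    n = len(x)
    if n < 8:
        raise ValueError("need at least 8 values")
    mu, sd = fmean(x), stdev(x)
    if sd == 0:
        raise ValueError("all values are identical")
    std_normal = NormalDist()
    eps = 1e-12  # keep the CDF away from 0 and 1 so log() stays finite
    cdf = [min(max(std_normal.cdf((v - mu) / sd), eps), 1 - eps) for v in x]
    # A^2 = -n - (1/n) * sum (2i-1) * [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    s = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - s / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n**2)  # small-sample correction
    if a2_adj < 0.2:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj**2)
    elif a2_adj < 0.34:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj**2)
    elif a2_adj < 0.6:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj**2)
    elif a2_adj < 10:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj**2)
    else:
        p = 3.7e-24  # approximation floor for very large statistics
    return a2_adj, p


def normality_table(columns, alpha=0.05):
    """Test every column; return rows of (name, A2, p, looks_normal)."""
    rows = []
    for name, values in columns.items():
        a2, p = anderson_darling_normal(values)
        rows.append((name, a2, p, p >= alpha))
    return rows


if __name__ == "__main__":
    rng = random.Random(1)
    # Made-up demo data: one normal and one skewed parameter
    demo = {
        "param_normal": [rng.gauss(10, 2) for _ in range(500)],
        "param_skewed": [rng.expovariate(1.0) for _ in range(500)],
    }
    for name, a2, p, ok in normality_table(demo):
        print(f"{name}: A2={a2:.3f} p={p:.3g} normal={ok}")
```

Each parameter costs one sort plus one pass over the data, which is why a loop like this scales comfortably to a couple of thousand columns.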
Re: Quick normality test on many parameters
Have you used JSL to do this? Can you provide an example of the script you have used? There may be optimizations that make it faster, which might already be enough.
Re: Quick normality test on many parameters
Hello,
In the end, I used the scripts provided by @matth1 and @Ben_BarrIngh (thank you very much!). To improve speed, we thought of taking a random sub-sample of each parameter instead of using the full sample.
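The sub-sampling idea can be sketched as follows. This is only an illustration in standard-library Python with a hypothetical helper name (`subsample`), not part of either script above. Keep in mind that a smaller sample makes any normality test less able to detect departures from normality, so the sub-sample size is a trade-off between speed and sensitivity.

```python
import random


def subsample(values, k, seed=None):
    """Return at most k values drawn without replacement.

    If the parameter has k values or fewer, all of them are returned.
    A fixed seed makes the draw reproducible between runs.
    """
    values = list(values)
    if len(values) <= k:
        return values
    return random.Random(seed).sample(values, k)


if __name__ == "__main__":
    full_sample = list(range(2500))  # stand-in for one parameter's values
    reduced = subsample(full_sample, 500, seed=42)
    print(len(reduced))
```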
Re: Quick normality test on many parameters
Hi @SophieCuvillier,
Have you tried holding Ctrl (Cmd on macOS) while using the red triangle to broadcast the commands across all of the individual distributions? You could then right-click the Anderson-Darling table and choose 'Make Combined Data Table'.
Thanks,
Ben
Re: Quick normality test on many parameters
Hello Ben
Thank you for your answer. I would like to do this by scripting, because it is an analysis we want to incorporate into an add-in.
Best regards
Re: Quick normality test on many parameters
I don't know why, but unfortunately I don't see the results of the normality test; there is no output. Maybe my JMP version is too old (V14.3). I will try a newer version.
Can you make it into a jmpaddin file, so that it is easier to use?
I would like to see only the results of the normality test, without the capability analysis.
This picture is the result from MINITAB. Can you make it look like this?
Re: Quick normality test on many parameters
You can create a normal quantile plot comparing multiple parameters by stacking the data and using the Fit Y by X (Oneway) platform. For example, see the simple code below:
dt = Open( "$SAMPLE_DATA/Semiconductor Capability.jmp" );
dt_stack = dt << Stack(
	columns( :PNP1, :PNP2, :NPN2, :PNP3 ),
	"Non-stacked columns"n( Keep( :lot_id, :wafer, :Wafer ID in lot ID, :SITE ) ),
	Output Table( "Stack" )
);
dt_stack << Color or Mark by Column(
	:Label,
	Color Theme( "JMP Default" ),
	Marker( 0 )
);
ow = dt_stack << Oneway(
	Y( :Data ),
	X( :Label ),
	All Graphs( 0 ),
	Line of Fit( 1 ),
	Plot Quantile by Actual( 1 )
);
You could then build a custom window and include the A-D results table from the previous script as a report next to the normal quantile plot. If you're happy with scripting then this might help:
https://www.jmp.com/support/help/en/18.1/#page/jmp/construct-display-boxes-for-new-windows.shtml