Solved: Iterate over elements of report

teoten · Jun 9, 2023 4:34 PM

Hi, I'm new to JMP, but I find it very useful and practical, and so far I'm enjoying it a lot. I'm learning JSL because I need to automate some of the analyses for my teammates and for efficiency. I am starting with something simple, but I need help to move forward.

I need to customize the results of the Distribution platform to perform a normality test on several columns of a table, so that my colleagues and I can repeat it as often as needed. I managed to write a simple script that achieves this. However, I am having trouble editing particular elements of the report when a condition is met.

Here is the script that returns the values from the Shapiro-Wilk test:

// Select variables to analyze 
Names Default to Here( 1 );

dt = Current Data Table();

dialog = New Window( "Select Columns", << Modal,
	Panel Box( "Column Selection",
		Lineup Box( N Col( 1 ),
			Text Box( "Select columns for distribution" ),
			y col list = Col List Box( All )
		)
	),
	H List Box(
		Button Box( "OK",
			y col = y col list << Get Selected;
		),
		Button Box( "Cancel" )
	)
);

If( dialog["Button"] == -1,
	Throw( "User cancelled" )
);

// Create distribution
dist = dt << Distribution(Columns(evalList(y col)),Histograms Only);

// Add parameters
dist << Normal Quantile Plot( 1 );
dist << Fit Distribution(Normal(Goodness of Fit(1)));

How can I make JMP automatically show customized summary statistics (including Skewness and Kurtosis) when the p-value (Prob<W) is < 0.05?

When I try to edit something in the report, for example to remove the parameter estimates:

distr["Parameter Estimates"] << Delete;

the change is applied only to the first element of the report.

I would appreciate any help on how to manipulate elements of the report, both in general (for example, removing all the parameter estimates) and conditionally (for example, only when P < 0.05).

Thanks!

MathStatChem · Aug 13, 2020 10:05 AM

When you use Distribution() with a list of columns, the report window contains the output from many 'independent' distribution analyses. Each of those analyses is a separate object. Interacting with those objects can be different from interacting with the report layer in the report window. When you do

dist << Normal Quantile Plot(1);

that message is broadcast to all of the distribution objects that are being displayed in the report window. But when you do

distr=Report(dist);
distr["Parameter Estimates"] << Delete;

what JMP does is go to the first report layer object that can be referenced using the "Parameter Estimates" string, which in this case is the Outline Box under the Fitted Normal results. To do what you want, you would need to iterate through each part of the report, get the Shapiro-Wilk result from the report layer, and then decide whether to delete parts of the output report.
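
A minimal sketch of that "iterate through each part of the report" route, for comparison. It assumes a JMP version whose display boxes accept the << XPath message, and it uses the Big Class sample table and its height and weight columns purely as stand-ins; the name peBoxes is made up for the illustration. It is not taken from the original script.

```jsl
Names Default To Here( 1 );

// Assumption: Big Class ships with JMP's sample data; any table with
// numeric columns would serve the same purpose.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// One Distribution launch covering two columns, as in the question.
dist = dt << Distribution( Column( :height, :weight ), Histograms Only );
dist << Fit Distribution( Normal( Goodness of Fit( 1 ) ) );

// Report layer for the whole window.
distr = Report( dist );

// XPath returns a list of every outline box with this title,
// rather than only the first match that distr["..."] would give.
peBoxes = distr << XPath( "//OutlineBox[text()='Parameter Estimates']" );

// A message sent to a list of display boxes is applied to each item.
peBoxes << Close( 1 );
```

This handles the "do it everywhere" case in one step, but a per-column condition such as P < 0.05 still means walking the matches one by one, which is why launching one analysis per column is easier here.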

I think in your case it would be easier to run the distribution analysis for each column one at a time, appending the results to a new report window. Here is some example code that shows how to do that:

// Select variables to analyze 
Names Default To Here( 1 );

dt = Current Data Table();

dialog = New Window( "Select Columns",
	<<Modal,
	Panel Box( "Column Selection",
		Lineup Box( N Col( 1 ),
			Text Box( "Select columns for distribution" ),
			y_col_list = Col List Box( All )
		)
	),
	H List Box(
		Button Box( "OK", y_col = y_col_list << Get Selected ),
		Button Box( "Cancel" )
	)
);

If( dialog["Button"] == -1,
	Throw( "User cancelled" )
);


// Create new empty report window
nw = New Window( "Distribution Analysis", hb = H List Box() );

// iterate through each column

For( ii = 1, ii <= N Items( y_col ), ii++, 
	// Create distribution, display in report window by appending
	// to the H List Box
	hb << Append(
		dist = dt << Distribution( Columns( y_col[ii] ), Histograms Only )
	);
	
	// get reference to the report layer for the individual distribution analysis
	distr = Report( dist );
	
	// turn on Normal Quantile Plot and add the Normal distribution
	// fit with the goodness of fit test
	dist << Normal Quantile Plot( 1 );
	dist << Fit Distribution( Normal( Goodness of Fit( 1 ) ) );
	
	// instead of deleting the parameter estimates, just close the 
	// outline box that contains them
	distr["Parameter Estimates"]<<Close;

	// get the p-value for the Shapiro-Wilk test from the report layer
	// CAUTION: the Goodness of Fit test output depends on
	// the number of observations and other conditions, so this may not be
	// robust to all scenarios

	/* the report element that contains the p-value is in the second
	Number Col Box in the first Table Box underneath the 
	"Goodness-of-Fit Test" outline box. Using << get on that report 
	object returns a list, and use the [1] index to get the first element
	of that list*/

	swpval = (distr["Goodness-of-Fit Test"][Table Box( 1 )][Number Col Box( 2 )] <<
	get)[1];
	
	// check to see if GOF test p value is <0.05 and if so add custom summary stats
	If( swpval < 0.05,
		dist << Summary Statistics( 1 );
		dist << Customize Summary Statistics( Skewness( 1 ), Kurtosis( 1 ) );
	);
);
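
Given the CAUTION comment in the script, one hedged way to keep it from stopping when the expected boxes are not where the subscript looks for them is to wrap the lookup in Try(). This is only a sketch: it assumes the failed subscript raises an error that Try() can catch, and it again uses the Big Class sample table and its height column as placeholders.

```jsl
Names Default To Here( 1 );

// Assumption: Big Class ships with JMP's sample data.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dist = dt << Distribution( Column( :height ), Histograms Only );
dist << Fit Distribution( Normal( Goodness of Fit( 1 ) ) );
distr = Report( dist );

// Try() returns its second argument when the first one throws, so a
// missing outline box gives a missing value instead of halting the script.
swpval = Try(
	(distr["Goodness-of-Fit Test"][Table Box( 1 )][Number Col Box( 2 )] << Get)[1],
	.
);

If(
	// nothing usable was found: report it and leave the output alone
	Is Missing( swpval ),
		Print( "No p-value found; check the report's tree structure." ),
	// p-value found and below the threshold: add the extra statistics
	swpval < 0.05,
		dist << Summary Statistics( 1 );
		dist << Customize Summary Statistics( Skewness( 1 ), Kurtosis( 1 ) );
);
```

The guard only covers a lookup that fails outright. If the table layout differs so that a different number sits in that position, the value still has to be checked against the tree structure by hand.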

teoten · Aug 13, 2020 01:17 PM

Thank you very much, it fulfills its function and also teaches me a lot. It is a great help.

I'd like to ask you something to help my understanding; I hope it won't be much trouble. I noticed that the first step you took in the For loop was to append the distribution to the hb variable, and only afterwards did you add the details (Q-Q plot, Shapiro-Wilk test, etc.). Based on my logic and programming experience, I would usually add the details first and then append everything to the right place (that is exactly where I was having trouble). My question is: how does the hb created inside New Window work? Is it some kind of blank page to which all subsequent commands are added, with the window size adjusted accordingly? Will it append everything that has to do with the variable dist? And how does the new variable created with distr = Report( dist ) influence what is being appended?

About your comment on the goodness-of-fit test, thank you very much; that is actually the reason I am creating the script. I started a new job a month ago, and so far my teammates have been using ONLY the p-value from the Shapiro-Wilk test to accept or reject normality. I am trying to give them more tools to evaluate whether the data follow a normal distribution, and why.

MathStatChem · Aug 14, 2020 09:38 AM

The H List Box() is just a container in the report window. JMP report windows can be thought of as a collection of nested display containers and the display elements inside those containers. To see the display tree structure, right-click the gray triangle next to an Outline Box and select Edit > Show Tree Structure.
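
The same tree view can also be requested from a script. A small sketch, with the Big Class sample table and its height column as placeholder inputs:

```jsl
Names Default To Here( 1 );

// Assumption: Big Class ships with JMP's sample data.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dist = dt << Distribution( Column( :height ), Histograms Only );

// Opens a window listing the nested display boxes of this report,
// which helps when working out subscripts such as
// [Table Box( 1 )][Number Col Box( 2 )].
Report( dist ) << Show Tree Structure;
```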

In this case, I don't think you can do it the other way around, as you describe. Running an analysis platform script command creates its display window as one of its results, and the <<Append() message for the H List Box() expects a display object as its argument.

If you don't place dist into the display tree as I did in the script, each Distribution call will just create its own new window, so there will be a separate window for each selected column.
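
To see the container behavior on its own, without any analysis platform involved, here is a minimal sketch. The window title and box contents are invented for the illustration; nw and hb follow the names used in the script above.

```jsl
Names Default To Here( 1 );

// An empty horizontal container inside a new window.
nw = New Window( "Container Demo", hb = H List Box() );

// Each Append adds one more child to the right of the existing ones.
hb << Append( Text Box( "first child" ) );
hb << Append( Outline Box( "second child", Text Box( "nested content" ) ) );
```

The hb reference stays attached to the live window, so anything appended later shows up there. In the script above the same holds for dist: it refers to the analysis already placed inside hb, so later messages to dist update that report in place.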

teoten · Aug 17, 2020 10:03 AM

Thanks!!!
