shampton82
Level VII

This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

So here's what I'm hoping for:

When you click Fit All in the Distribution platform, you get a lot of fits:

shampton82_0-1724034774388.png

However, if the AICc of the best fit is within 5 of the Normal distribution's AICc, you might as well use the Normal fit. So, is there a way to script something that would go through a bunch of columns that have already had the best fit run, and then change the selected fit (assuming it is non-normal) to Normal if its AICc is within 5 of the Normal distribution's? Bonus points for an input box to enter the AICc delta you are willing to live with. Double bonus would be to remove Student's t, Cauchy, and ExGaussian from the selection options, as you can't calculate process capabilities on these distributions (and that will be the next step to run after this clean-up script is run).
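For the input box part of the request, a minimal sketch of a modal dialog that asks for the AICc delta might look like the following. The window title, the prompt text, and the names dlg, neb, and delta are placeholders, and the default of 5 is simply the value mentioned in the post.

```jsl
Names Default To Here( 1 );

// Ask for the largest AICc gap at which Normal is still preferred.
// Title, prompt, and variable names are placeholders for illustration.
dlg = New Window( "AICc tolerance",
	<<Modal,
	<<Return Result,
	Text Box( "Use Normal when its AICc is within this much of the best fit:" ),
	neb = Number Edit Box( 5 ),
	H List Box( Button Box( "OK" ), Button Box( "Cancel" ) )
);

// With <<Return Result, the modal window returns a list that holds the
// edit box value and which button was pressed (1 = OK, -1 = Cancel).
If( dlg["Button"] == 1,
	delta = dlg["neb"],
	Throw( "Cancelled by user" )
);
Show( delta );
```

The resulting delta could then replace a hard-coded threshold in whatever script does the actual selection.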

 

I've tried and can't get it to work, any help would be greatly appreciated!!

 

Steve

jthi
Super User

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?


 if there a way to script something that would go through a bunch of columns that have already had the best fit ran


Is the Distribution platform still open with all the results, or where are the results stored? Do you need to leave Compare Distributions open, or is it enough that one distribution has been fit?

-Jarmo
shampton82
Level VII

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

Hey @jthi !

Yeah, the platform would still be open and all the best-fit comparisons are still there. Ideally the comparison box stays, but I can live with just the best fit if that is a lot easier.

 

So starting here

shampton82_0-1724075513653.png

For something like this fit, Normal would be chosen. However, if Normal were way down the list, then SHASH would be chosen.

shampton82_1-1724075601642.png

Here normal would be chosen as it is within 5 of Johnson Sb's AICc value.
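The rule described in these examples can be sketched as a small JSL function that works on plain lists, independent of the report. This is only an illustration under stated assumptions: pick_fit is a hypothetical helper, the AICc values are made up, and the excluded names are the three distributions the original post wants to drop.

```jsl
Names Default To Here( 1 );

// Hypothetical helper: given parallel lists of distribution names and
// AICc values (lower is better), return the name of the fit to keep.
pick_fit = Function( {names, aicc, delta, excluded},
	{Default Local},
	best_name = "";
	best_aicc = .;
	normal_aicc = .;
	For( i = 1, i <= N Items( names ), i++,
		// Skip excluded distributions and rows without an AICc
		If( Contains( excluded, names[i] ) == 0 & !Is Missing( aicc[i] ),
			If( names[i] == "Normal", normal_aicc = aicc[i] );
			If( Is Missing( best_aicc ) | aicc[i] < best_aicc,
				best_aicc = aicc[i];
				best_name = names[i];
			);
		)
	);
	// Prefer Normal when it is within delta of the best remaining fit
	If( !Is Missing( normal_aicc ) & normal_aicc - best_aicc <= delta,
		"Normal",
		best_name
	);
);

// Made-up AICc values, for illustration only
names = {"Johnson Sb", "SHASH", "Normal", "Student's t", "Cauchy"};
aicc = {100.0, 101.2, 103.5, 99.0, 150.0};
excluded = {"Student's t", "Cauchy", "ExGaussian"};

Show( pick_fit( names, aicc, 5, excluded ) );
Show( pick_fit( names, aicc, 2, excluded ) );
```

With these numbers Normal is 3.5 above Johnson Sb, so a delta of 5 picks Normal and a delta of 2 keeps Johnson Sb. Student's t has the lowest AICc here but is ignored because it is excluded. Applying the choice to the report's check boxes is a separate problem.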

shampton82_2-1724076209144.png

 

Hope this helps and thanks for looking at this!

 

Steve

jthi
Super User

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

There are two related difficulties here: you cannot apply the default sort to a table box with JSL (to my knowledge, and this causes issues in many platforms), and this makes it difficult to set the check boxes correctly (they are not in the same order as the other values you see; most likely they don't always get sorted).

 

Because of this, I'm not sure this will work in every case (I tried doing it without the For loop, but the orders changed too much):

Names Default To Here(1); 

dt = Open("$SAMPLE_DATA/Cities.jmp");

dt << Distribution(
	Continuous Distribution(Column(:"pop- m"n), Process Capability(0)),
	Continuous Distribution(Column(:POP), Process Capability(0)),
	Continuous Distribution(Column(:Max deg. F Jan), Process Capability(0)),
	Continuous Distribution(Column(:OZONE), Process Capability(0)),
	Continuous Distribution(Column(:CO), Process Capability(0)),
	Continuous Distribution(Column(:SO2), Process Capability(0)),
	Continuous Distribution(Column(:NO), Process Capability(0)),
	Continuous Distribution(Column(:PM10), Process Capability(0)),
	Continuous Distribution(Column(:Lead), Process Capability(0)),
	Continuous Distribution(Column(:X), Process Capability(0)),
	Continuous Distribution(Column(:Y), Process Capability(0)),
	Fit All
);


// Script basically starts from here
Names Default To Here(1);


obs = Current Report() << XPath("//OutlineBox[@helpKey='Distrib']");
If(N Items(obs) > 0,
	dist = obs[1] << Get Scriptable Object;
);

obs = Report(dist) << XPath("//OutlineBox[text()='Compare Distributions']");

For Each({ob}, obs,
	tb = ob[TableBox(1)];
	vals = tb << get;

	// Order might be incorrect in these
	dists = vals["Distribution"];
	normal_idx = Contains(dists, "Normal");
	normal_aic = vals["AICc"][normal_idx];
	m_aic = Matrix(vals["AICc"]);
	
	
	// This comparison might require some fixes
	ifs = (ob << parent) << XPath("//IfBox");
	Remove From(ifs, 1, 2); // might require more robust method
	fit_titles = (ifs << child) << get title;


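	// Switch to Normal only if it is not already the first entry (normal_idx > 1)
	// and its AICc is within 20 of the lowest AICc (use 5 for the threshold in the question)
	If(Min(m_aic) >= normal_aic - 20 & normal_idx > 1,
		For(i = 1, i <= N Items(dists), i++,
			tb[1] << Set All(0);
			tb[1] << Set(i);
			idx = Contains(ifs << get, 1);
			// Contains() returns 0 when no IfBox reports 1, and idx can exceed the
			// number of titles; skip the lookup in both cases to avoid a subscript error
			If(idx > 0 & idx <= N Items(fit_titles),
				If(fit_titles[idx] == "Fitted Normal Distribution",
					Break()
				)
			);
		);
	);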
);

Write();


 

-Jarmo
shampton82
Level VII

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

Thanks for the script @jthi !

I tried running it and got this error 

shampton82_0-1724087148090.png

From the log:

/*:
Subscript Range in access or evaluation of 'fit_titles[ /*###*/idx]' , fit_titles[/*###*/idx]

at line 54 in C:\Users\steve.hampton\OneDrive - Precision Castparts Corp\Documents\Script 3.jsl

 

jthi
Super User

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

You have to start adding debug prints and check where it goes wrong. You might have to change the loop from N Items(dists) to N Items(fit_titles) or something.
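As a sketch of what such debug prints could look like: one possible cause of a Subscript Range error is that Contains() finds no match and returns 0. The lists below are made-up stand-ins for the IfBox states and the fit titles, not values taken from the report.

```jsl
Names Default To Here( 1 );

// Made-up stand-ins: 0/1 states of the IfBoxes and the matching titles
states = {0, 0, 0};
titles = {"Fitted SHASH Distribution", "Fitted Normal Distribution", "Fitted Johnson Sb Distribution"};

// Contains() returns 0 when the item is not found in the list
idx = Contains( states, 1 );
Show( N Items( states ), N Items( titles ), idx );

// Only subscript the list when idx is a valid position
If( idx == 0 | idx > N Items( titles ),
	Write( "No usable index; skipping the title lookup\!N" ),
	Show( titles[idx] )
);
```

Printing the list lengths next to idx on every pass makes it easier to see whether the lists are misaligned or whether no box was enabled at all.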

-Jarmo
txnelson
Super User

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

I attempted a solution similar to Jarmo's, but with one difference: I chose to use the check boxes in the Table Box under the Compare Distributions Outline Box. I reasoned (correctly) that if the Normal AICc is within 5 of the AICc of whatever distribution is actually selected, then all the JSL needs to do is unselect the check box for the JMP-chosen distribution and select the check box for the Normal distribution. If this unselection and selection is made in JSL, the correct distribution is displayed on the histogram. However, it appears that the usual messages for manipulating a check box from JSL do not return or set these check boxes correctly, unless I do not have a complete understanding of the check boxes displayed in a table box.

See below for what I found

names default to here(1);
dt =
// Open Data Table: semiconductor capability.jmp
// → Data Table( "semiconductor capability" )
Open( "$SAMPLE_DATA/semiconductor capability.jmp" );

dist = dt << Distribution(
	Continuous Distribution(
		Column( :INM2 ),
		Process Capability(0),
		fit all(1)
	)
);

// From the current report, find all of the Compare Distribution boxes
//dist =( current report() << xpath( "//OutlineBox[text()='Compare Distributions']" ));

// There is a bug in JMP where the check boxes return a standard order, not the order displayed
checkBoxOrder = {
		"Normal",
		"Cauchy",
		"Student's t",
		"Lognormal",
		"Exponential",
		"Gamma",
		"Johnson Sb",
		"SHASH",
		"ExGaussian",
		"Normal 2 Mixture",
		"Normal 3 Mixture",
		"Weibull"
};

distr = dist<<report;
// Fit All selects Johnson Sb as the best distribution.
// Getting the selected indices for the check boxes
// indicates that the 7th check box is selected
mat=distr[1][CheckBoxBox(1)]<<get selected indices as matrix;
show(mat);

// I am not sure if this is just a false artifact, but if one 
// looks into the checkBoxOrder list, the 7th item is Johnson Sb
show(checkBoxOrder[loc(mat,1)]);

// However, if one wants to unselect the Johnson Sb check box
// it is the 9th check box item that has to be selected
distr[1][CheckBoxBox(1)]<<set(9,0);

// Setting the 7th check box as selected selects the Normal
// distribution
distr[1][CheckBoxBox(1)]<<set(7,1);
// or
// Setting the loc CheckBoxOrder index value will set the 
// Normal distribution to selected 
// distr[1][CheckBoxBox(1)]<<set(loc(mat,1), 1 );

// If column PNP3 is used instead, the same pattern appears:
// the selected distribution is Lognormal, and
// get selected indices as matrix returns with the
// 4th element set to 1.
// If you specify <<set(4,1), the Normal distribution is selected,
// but to turn off the Lognormal selection the 8th check box needs
// to be set to 0

I am hoping that someone can clue me in as to my error in logic or usage. If no such enlightenment is available, I will turn this over to JMP Support.

Jim
jthi
Super User

Re: This might be a big ask, but can someone help with a script to try and select a normal distribution when it is pretty close to the best fitted distribution?

My guess is that the check boxes aren't really tied to the table box, so when you sort the table box, the check boxes do not get sorted properly. On top of that, the table box is already sorted in an order other than the "default" one when it first appears. So you end up in a very weird situation where you have no idea which index sets which item, as they are not aligned. That is why I went with the very complicated For loop and checked which distribution's IfBox was enabled.

 

If you use the "default" order (which you cannot set with JSL, and which also causes issues in other platforms such as Process Screening), you get this:

jthi_0-1724166055918.png

These are the "correct" indices most of the time.

 

I haven't yet tried whether you could calculate which index is which given the current sort order (or knowing which column was used for sorting).

-Jarmo