Add CDFs to Graph Builder

Recommend adding the ability to create CDF plots within Graph Builder.  

 

Add a new "Connection" type -- maybe "Extended Step", that would draw the first step from the left edge starting at 0, and from the last point to the right edge to represent a true CDF.  Then, document the approach, and consider adding these settings as some type of preset.
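To make the "Extended Step" request concrete, here is a minimal sketch of the vertices such a connection type might draw. It is plain Python rather than JSL, the function name `extended_step_cdf` and the `left_edge`/`right_edge` parameters are hypothetical, and it is not how Graph Builder is implemented; it only illustrates the geometry described above.

```python
def extended_step_cdf(values, left_edge, right_edge):
    """Return the (x, y) vertices of a step CDF that starts at 0 on the
    left axis edge and runs at 1 out to the right axis edge."""
    ordered = sorted(values)
    n = len(ordered)
    vertices = [(left_edge, 0.0)]        # first step starts at 0 on the left edge
    level = 0.0
    for i, x in enumerate(ordered, start=1):
        vertices.append((x, level))      # horizontal run up to x at the old level
        level = i / n                    # each row adds the same weight 1/n
        vertices.append((x, level))      # vertical jump at x
    vertices.append((right_edge, 1.0))   # extend the last step to the right edge
    return vertices

# Made-up data and axis limits, for illustration only
points = extended_step_cdf([4, 1, 2], left_edge=0, right_edge=5)
print(points[0], points[-1])  # (0, 0.0) (5, 1.0)
```

Without the first and last vertices the line would start at the smallest observation and stop at the largest, which is the gap the suggestion is meant to close.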

35 Comments
TorP
Level I

That's a pity. Distribution graphs are the main reason we still use Minitab in our department.

Currently it still produces such graphs faster and more easily than JMP.

(Minitab has far fewer evaluation options in its reports, but that's secondary if you need the distribution graphs first.)

 

Example with data from Big Class.jmp

Minitab: Automatic settings, 3 clicks in dialogue window

TorP_0-1721041532788.png

 

JMP (no JSL, just standard options)

1. Graph Builder

Automatic settings  

TorP_1-1721041675921.png 

y-axis manually set to Probability Normal and adjusted to 0.01 and 0.99

TorP_2-1721041818647.png

 

2. Lifetime Distribution

Automatic settings. Nice graphics, but

- No panel option

- No info about distribution type (Normal, Weibull,...) and CI

- Legend missing when copied w/ "Copy Graph" (available in original report)

TorP_3-1721042156595.png

Summary: Such graphs in Graph Builder, with the above issues fixed, would really be appreciated.

And it seems like most of the pieces are already available, just in different places/platforms?

 

hogi
Level XII

It's even worse than that.
For Graph Builder, the cumulative probability has to be calculated manually (for every subplot!)

 

Cumulative Percentage is not an option (!)
edit: actually, it IS - see below.
sorry for the confusion.
I missed the point that for a line plot a user can select Cumulative Percent without specifying a column for the Y axis - and then JMP just uses "1s" as values. So no issues with negative and different "values".

 

See: Airline Delays (with some negative values)

Open( "$SAMPLE_DATA/Airline Delays.jmp" );
Graph Builder( Variables( X( :Arrival Delay ), Wrap( :Month ), Overlay( :Airline ) ) );


Minitab as reference - this is what the plot should look like:

hogi_0-1721072004056.png

 

JMP / Graph Builder via Cumulative Percent "trick":
hogi_2-1721072143442.png

pro: months are in the right order

con: woah! completely wrong plot!!!

Bottom line:
cumulative percent
IS NOT
cumulative probability

hogi
Level XII

"Add CDF/Cumulative Probability  to Graph Builder Statistics":
tiny effort compared to e.g. 🙏 X group: restrict the values on the axis to the respective group 

Maybe we can get @XanGregg's support to get the functionality into JMP 19?

Hi everyone, thank you for the discussion and feedback! We are still looking into adding this in a future release, but our developers are currently focused on other features and functionality improvements in JMP 19, including other wish list requests. In the meantime, you can achieve similar results using Platform Presets (see steps below); however, we understand this will not be sufficient in all cases.

 

Steps: Put a numeric variable on the X axis, select the line element, then change the settings in the line panel to match the image. Make any other changes you see fit and save the result as a preset. The next time you open Graph Builder, add a numeric variable to X and select the preset you created.

CDF.png

hogi
Level XII
BHarris
Level VI

@Sarah-Sylvestre: Is that behavior strictly equivalent to running Fit Y by X where GB X = FYX Y and GB Overlay = FYX X? I just did a couple of spot checks and it looks the same, but I'm worried there might be some important differences, e.g. see Add CDFs to Graph Builder - JMP User Community.

@BHarris, I would be skeptical that it's the same in every case, even if it appears the same in the cases that you have tested. My suggestion would be to use either Fit Oneway or Life Distribution if you want classically "correct" CDF plots, at least for the time being. See @hogi's commentary in that other post you referenced. If you have a specific example that you'd like us to look at more carefully, we'd be happy to do so; you can email us at support@jmp.com.

 

@hogi Thanks for making me aware. You may consider discussing this one (again if not already) with the R&D team next time you have a chance to make it to JMP Discovery.  In most decisions to 'table' an idea for possible future implementation there are both prioritization considerations and technical constraints (which also affect prioritization). 

 

Cheers, Patrick (JMP Technical Support) 

hogi
Level XII

@PatrickGiuliano , I agree: use Fit Oneway or Life Distribution if you want classically "correct" CDF plots.

DON'T USE Graph Builder / Cumulative Percent ! *)

 

The difference:
Cumulative PROBABILITY counts each row with  THE SAME weight.
Cumulative PERCENT (Sum %) counts each row with the corresponding value (negative values count negative -> easy to show the difference)

hogi_0-1724918834690.png

 

 

*)  Cumulative Percent is OK for some rare cases (1+2):
1) ALL VALUES ARE POSITIVE
2) ALL VALUES ARE VERY CLOSE TO EACH OTHER (approximation holds:   all values ~ THE SAME)
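A small numeric sketch of the difference, in plain Python rather than JSL so the arithmetic can be checked outside JMP. The delay values are made up, the function names are hypothetical, and it assumes the same column supplies both the sort order and the values being summed:

```python
from itertools import accumulate

def cumulative_percent(values):
    """cusum(y) / sum(y) over ascending values: each row is weighted by its value."""
    ordered = sorted(values)
    total = sum(ordered)
    return [s / total for s in accumulate(ordered)]

def cumulative_probability(values):
    """cumulative_count(x) / count(x): each row has the same weight 1/n."""
    n = len(values)
    return [(i + 1) / n for i in range(n)]

delays = [-20, -10, 0, 10, 120]  # made-up delays, some negative

cum_pct = cumulative_percent(delays)
cum_prob = cumulative_probability(delays)
print(cum_pct)   # [-0.2, -0.3, -0.3, -0.2, 1.0]
print(cum_prob)  # [0.2, 0.4, 0.6, 0.8, 1.0]
```

With negative values the cumulative percent dips below zero and is not even monotone, while the cumulative probability climbs in equal steps of 1/n.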

 

Examples with

Open( "$SAMPLE_DATA/Airline Delays.jmp" );

Life Distribution:

hogi_0-1724913318848.png

50% → 0

 

Analysis/Distribution:

hogi_2-1724913478076.png
hogi_3-1724914126022.png

 

Fit Y By X:

hogi_4-1724914291483.png
hogi_5-1724914340528.png

 

Graph Builder / Cumulative Percent:

hogi_7-1724914450151.png

 50% → 100

Thanks Patrick! My suggestion was going to be the same but Hogi has provided some great explanation on the differences. I have passed all of that feedback on to the developers.

XanGregg
Staff

I've seen various discussions related to this topic, and I've likely missed some nuance, but I don't readily see any mention of using Cumulative Percent without a Y. Cumulative Percent will calculate cusum(y)/sum(y). CDF wants cumulative_count(x)/count(x). You can turn the former into the latter by replacing y with count, which can be achieved either by leaving out Y completely or replacing it with a freq variable (possibly all ones). Here's an example of the first way.

 

NB this computation matches CDF plots, but the Cumulative Probability transform matches the Normal Quantile plot and is slightly different.

XanGregg_0-1725377457514.png

Open( "$SAMPLE_DATA/Airline Delays.jmp" );
Graph Builder(
	Size( 650, 597 ),
	Show Control Panel( 0 ),
	Variables(
		X( :Arrival Delay ),
		Wrap( :Month, Show Title( 0 ) ),
		Overlay( :Airline )
	),
	Elements( Line( X, Legend( 7 ), Summary Statistic( "Cumulative Percent" ) ) ),
	SendToReport(
		Dispatch( {}, "Arrival Delay", ScaleBox,
			{Min( -40 ), Max( 240 ), Inc( 50 )}
		),
		Dispatch( {}, "", ScaleBox,
			{Min( 0 ), Max( 1 ), Inc( 0.25 ), Minor Ticks( 0 )}
		)
	)
);
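
The identity behind this approach, cusum(y)/sum(y) turning into cumulative_count(x)/count(x) when y is replaced by ones, can be checked with a short sketch. This is plain Python, not JSL; the helper name `cusum_ratio` and the X values are hypothetical, and it only mirrors the formulas quoted above rather than Graph Builder's internals.

```python
from itertools import accumulate

def cusum_ratio(y):
    """Cumulative Percent as described above: cusum(y) / sum(y)."""
    total = sum(y)
    return [s / total for s in accumulate(y)]

x = sorted([12, -5, 30, 0])   # made-up X values, in axis order
ones = [1] * len(x)           # stand-in for "no Y" or an all-ones column

ecdf = cusum_ratio(ones)      # equals cumulative_count(x) / count(x)
print(list(zip(x, ecdf)))     # [(-5, 0.25), (0, 0.5), (12, 0.75), (30, 1.0)]
```

Because every row contributes 1, the sign and size of the X values no longer affect the heights, only the order in which the steps occur.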