Byron_JMP
Staff
One Way ANOVA Figure for Scientists

Screen Shot 2020-05-08 at 6.23.25 AM.png

 

Ever since scientists started using ANOVA, the results have been represented with a bar chart showing one bar for each treatment and a one-sided whisker above each bar to represent the 95% confidence interval*. This makes comparing the treatment groups easy. If the next bar is higher or lower than the whisker, then it's significantly different from the control (usually on the left). Each bar is then labeled with the connecting letters from the Tukey-Kramer (HSD) test to make it even easier to tell which treatments are the same and which are different.

 

*Standard Error of the Mean gets used a lot too. Heck, if my 95% CI was taller than the bar I'd choose that too; it's almost 2x skinnier and makes the figure look way better.
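
To see where the "almost 2x" in the footnote comes from, here is a small JSL sketch that computes both whisker lengths for one group. The readings in y are made-up numbers, not data from this post, and the sketch assumes Std Dev() accepts a matrix argument the way Mean() does.

```jsl
Names Default To Here( 1 );
// Hypothetical readings for a single treatment group (made-up values)
y = [0.52, 0.61, 0.58, 0.49, 0.55];
n = N Rows( y );
// Standard error of the mean: group standard deviation over sqrt(n)
sem = Std Dev( y ) / Sqrt( n );
// Two-sided 95% t multiplier with n - 1 degrees of freedom
tcrit = t Quantile( 0.975, n - 1 );
// Half-width of the 95% confidence interval, i.e. the whisker length
half = tcrit * sem;
Show( Mean( y ), sem, tcrit, half );
```

The CI whisker is the SEM whisker scaled by the t multiplier. That multiplier gets close to 1.96 for large groups and is larger for small ones, so the smaller the group, the more the SEM flatters the figure.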

 

How to make the figure.

1. Start with Fit Y by X. Add your treatment variable (nominal or ordinal) and your continuous response.

2. From the Red Triangle Menu (RTM) pick Means and Std Dev

3. From the RTM, pick Compare Means, then pick the right test, likely All Pairs, Tukey HSD if you have more than two groups.

4. Right click on the connecting letters report, pick, Make into Data Table

5. Right click on the Means Std Dev report, pick, Make into Data Table

6. Make a new column in the connecting letters table, manually construct the letters in one column from all the letters columns. Use lower case.

7. Use Tables, Update to join your new letters column back to the Means Std Dev table. Match column, on Level.  

8. Turn on Labels for the letter column

9. Rename the Level column to the name of the treatment variable.

10. Rename the Mean column to Mean Response (your response) and 95% CI. e.g. "Mean Absorbance and 95% CI", or something like that.

11. Start Graph Builder.

12. Treatment goes on the x-axis

13. The mean column (renamed in step 10) and the Upper 95% column go on the y-axis.

14. Choose the Bar graph element (looks wrong, but wait)

15. Make the bar graph type be Interval (still looks wrong, but wait)

16. Right click inside the graph, Add, Another Bar chart.

17. In the properties of the second bar chart, from the Label drop down, choose Label by Row.

18. Label the graph Response by Treatment. 

19. Change the Graph Builder Title Bar to read One Way ANOVA

20. Click on the legend, and change the bar color to black.

21. From RTM, Save Script to Data Table.

22. From RTM in the tables box, Copy Table script.

23. In your original source data table's Table Box RTM, pick New Script.

24. In the dialog, give the script a title, like "Figure 1", and paste the script in the lower window.

25. If your data changes, do all this over again, because the script you just saved isn't dynamic. At least it's a road map, kind of, to how you made the figure.

 

The reason only scientists use this figure is that it practically takes a PhD to keep all the steps straight and get the figure right. (Note: it's way easier to do this with a Sharpie and graph paper.)

 

If the 25-step process is just a little much, or left you stuck on some vague parts that probably needed their own 25-step sub-processes, just run the script below. Hopefully the annotation helps if you want to customize it for your frequent figures.

 

Names Default To Here( 1 );
//make sure there is a table open, if not open one
dtlist=Get Data Table List();
// picker prompts for a JMP file and opens it; it only runs when no table is open
picker = Expr(
	file = Pick File( "Select JMP File", "$SAMPLE_DATA", {"JMP Files|jmp;", "All Files|*"}, 1, 0, "" );
	dt = Open( file );
);
If( N Items( dtlist ) == 0, picker, dt = Current Data Table() );
dtname=dt<<get name;dtname=(item(1, dtname, ".")); //pick up table name for later
// Column dialog to pick variables
cd=Column Dialog(
	xxxx = ColList( "Choose Categorical X Variable",
		Max Col( 1 ),
		Modeling Type( {"Nominal", "Ordinal"} )
	),
	yyyy = ColList( "Choose Continuous Response",
		Max Col( 1 ),
		Modeling Type( "Continuous" )
	)
);
show(cd); // this is what the column dialog returns.
eval(cd[1]);eval(cd[2]);show(yyyy);show(xxxx); //variables yyyy and xxxx don't exist outside the list from the column dialog until they are evaluated.
// xxxx and yyyy are lists with one item so reference them as xxxx[1]

//Run the ANOVA using Fit Y by X
platform = dt << Oneway(
	Y( yyyy[1]),
	X( xxxx[1] ),
	All Pairs( 1 ),
	Means and Std Dev( 1 ),
	Mean Error Bars( 1 ),
	Std Dev Lines( 1 )
);
//Get the Connecting Letters Report as a table
Wait( 0 );
dt1 = Report( platform )[Outline Box( "Comparisons for all pairs using Tukey-Kramer HSD" )][Outline Box( "Connecting Letters Report" )][Table Box( 1 )] << Make Into Data Table;
//Data wrangling to get the letters as a list of strings (Huge Thanks to Jeff P.!)
cols = dt1 << get column names( string );
concat_vals = {};
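For( j = 1, j <= N Row( dt1 ), j++,
	str = "";
	// Concatenate the letter columns, skipping the first and last columns of the table
	For( i = 2, i <= N Col( dt1 ) - 1, i++,
		str = str || Column( dt1, i )[j]
	);
	// Insert a comma between adjacent letters, working from right to left
	ncomma = Length( str ) - 1;
	While( ncomma > 0,
		Insert Into( str, ",", ncomma + 1 );
		ncomma--;
	);
	concat_vals[j] = str;
);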
Show( concat_vals );/// The letters are now a list in the order of the levels from table 1
/// Put the letters back into the table
dt1 << New Column( "Letters", Character, "Nominal", Set Values( concat_vals ) );
//
//Get the means and confidence intervals from the report as a table
dt2=Report( platform )[Outline Box( "Means and Std Deviations" )][Table Box( 1 )] << Make Into Data Table;
dt2<<set name(dtname||" Summary Stats and Connecting Letters");
Report( platform ) << Close Window;//all done with the report 
//Update the means stdev table with the letters
dt2 << Update(
	With( dt1 ),
	Match Columns( :Level = :Level ),
	Add Columns from Update Table( :Letters ),
	Replace Columns in Main Table( None ),
	Ignore missing
);
close(dt1, no save); //done with the letters table
//make the letters lower case
dt2 << Begin Data Update;
dt2 << Recode Column( dt2:Letters, {Lowercase( _rcNow )}, Target Column( :Letters ) );
dt2 << End Data Update;
//new column formatted to work in graph builder. The leading spaces keep the letters from crashing into the whisker
dt2 << New Column( "Letter Labels",
	Character,
	"Nominal",
	Formula( "          " || :Letters )
);
dt2 << Set Label Columns( Name( "Letter Labels" ) );

// Make the graph
// You may want specific formatting and labels. Replace this script with yours
dt2<<
Graph Builder(
	Size( 500, 400 ),
	Show Control Panel( 0 ),
	Show Legend( 0 ),
	Variables( X( :Level ), Y( :Mean ), Y( :Upper 95%, Position( 1 ) ) ),
	Elements(
		Bar( X, Y( 1 ), Y( 2 ), Legend( 5 ), Bar Style( "Interval" ) ),
		Bar( X, Y( 1 ), Legend( 6 ), Label( "Label by Row" ) )
	),
	SendToReport(
		Dispatch( {}, "Graph Builder", OutlineBox, {Set Title( "One Way ANOVA" ), Image Export Display( Normal )} ),
		Dispatch( {}, "400", ScaleBox, {Legend Model( 6, Properties( 0, {Fill Color( 0 )}, Item ID( "Mean", 1 ) ) )} ),
		Dispatch( {}, "graph title", TextEditBox, {Set Text( xxxx[1] || " by " || yyyy[1] )} ),
		Dispatch( {}, "X title", TextEditBox, {Set Text( xxxx[1] )} ),
		Dispatch({},"Y title",TextEditBox,{Set Text( "Mean "||yyyy[1]||" and 95% CI" )})
	)
);
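
If the comma-inserting While loop in the script above is hard to follow, the same label can be built by collecting the non-blank letters into a list and joining them with Concat Items(). This is only a sketch of an alternative, not what the script does: the letters list below is a hypothetical stand-in for one row of the connecting letters table, where a blank cell means the level does not carry that letter.

```jsl
Names Default To Here( 1 );
// Hypothetical row from a connecting letters table: letter columns A, B, C
letters = {"A", "", "C"};
kept = {};
For( i = 1, i <= N Items( letters ), i++,
	// Keep only the cells that actually contain a letter
	If( Trim( letters[i] ) != "",
		Insert Into( kept, Trim( letters[i] ) )
	)
);
// Join with commas and switch to lower case for the figure label
label = Lowercase( Concat Items( kept, "," ) );
Show( label ); // label = "a,c";
```

Doing the lower-casing here would also make the separate Recode Column step unnecessary, at the cost of changing a script that already works.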

 

 

 

 

 

2 Comments
Level VI

TBH, I am a scientist and I have never seen the results of an ANOVA presented this way. I don't find the chart intuitive, and the scale is all wrong. In ANOVA you really want the scale of the data, for visualization, to be the range of the data, or a +/- k standard deviation range around the data (k somewhere in the 2-4 range). The plot above mostly focuses on proportional differences by showing a bar for the mean, with zero on the y-axis.

 

It also looks like you are using the standard deviation from each group to construct the error bars, but Tukey HSD uses the pooled standard deviation, doesn't it? That's also what the diamonds in the Oneway plot use, and the width of each diamond varies with the number of observations in the group.

 

I think it is better, for the plot you constructed, to stretch the Y-axis so that the focus is on the location of the mean result.

 

Is this just a way to get around having to explain the "circles" in the Oneway plot that are used to show which groups are similar or statistically different?

 

How about this plot instead?

MathStatChem_0-1589167274396.png

 

I played around with your resulting table, created two columns to indicate if the row was in group "a" or "b" and then created this dashboard.  

MathStatChem_1-1589168428666.png

 

 

 

Staff

It's definitely the 95% confidence intervals. Take a look at the table that the graph is run against.

Published scientific papers use this format frequently.