Re: post hoc tests for nonparametric data

Sep 10, 2010 10:13 AM

Hi

I have a sinking feeling there is no quick fix for this, but does anybody know how to run post hoc tests for nonparametric data in JMP? I am trying to compare 3 groups for differences in mean values such as visual acuity, and neither the data nor the residuals are normally distributed in every instance. I have been running Kruskal-Wallis analyses and then excluding groups so as to compare 2 groups at a time. The problem is that repeating the pairwise tests in this way inflates the type I error rate; if the data were normally distributed, I would use a post hoc test to control for this.

What options are there to do this in JMP - if any?

Pulling my hair out with frustration at present.

Thanks for your time.

Neil

Sep 19, 2010 07:21 AM

Hi Neil - this looks as if it'll need some scripting, I'm afraid. I've had a go at it myself and come up with the script below, which should produce the same Kruskal-Wallis P values as the Fit Y by X platform for any nominal factor X and any number of Y variables, followed by the P values for every pairwise comparison of the X levels. I've tried it on the Big Class.JMP sample data set, using either Age or Sex as the X factor and Height and/or Weight as the Y variable(s), and it seems to work, though I've no doubt it could be tidied up a lot. Note that the script doesn't adjust the P values from the pairwise comparisons for multiple testing: the best I can suggest there is to apply the Bonferroni correction to them.


// Start of program;
 
DataFile = PickFile("Select Raw Data Table for Kruskal-Wallis Test:",, {"JMP Files|JMP"});
 
dt = open( DataFile );
dt << minimize window;
 
VarList = ColumnDialog(Title("Assign Roles:"),
	Xfac = ColList("Factor ID",   Min Col(1), Max Col(1)),
	Yvar = ColList("Response(s)", Min Col(1), DataType(Numeric)),
	);
 
Column(dt, Char(VarList["Xfac"][1])) << data type("Character");
Column(dt, Char(VarList["Xfac"][1])) << set modeling type("Nominal");
 
xCol = Column(dt, VarList["Xfac"]);
 
Response_List = VarList["Yvar"];
yCols = {};
for(i=1, i<=nItems(VarList["Yvar"]), i++,
	insert into(yCols, Column(dt, Response_List[i]))
	);
 
summarize(xLev=by(xCol)); // We'll need a list of the factor levels later, so create it now;
 
dt << select all rows;
dts = dt << subset(visible);
 
dto = Oneway(
	Y( eval list( yCols ) ), X( eval list( xCol ) ),
	Wilcoxon Test( 1 ), Box Plots( 0 ), Mean Diamonds( 0 ), invisible
	);
 
dtoreport = dto << report;
 
// DTOREPORT will only be a list if there are at least two response variables;
 
if(islist(dtoreport),
	chisqd = dtoreport[1][NumberColBox(5)] << MakeCombinedDataTable,
	chisqd = dtoreport[NumberColBox(5)]    << MakeCombinedDataTable
	);
 
dtoreport << close window;
 
chisqd << add multiple columns("With",    1, Before First, Character);
chisqd << add multiple columns("Compare", 1, Before First, Character);
 
for(i=1, i<=nrow(chisqd), i++,
	column(chisqd, "Compare")[i] = "All"; column(chisqd, "With")[i] = "All"
	);
 
/*
	We'll get a different set of summary statistics depending on whether xLev has more
	than two levels or not, because if xLev > 2 then it's a chi-squared test, whereas
	if there are only two levels the 2-sample test returns a Z statistic.
*/
 
if(nItems(xLev) > 2,
	chisqd << delete columns({"ChiSquare", "DF"});
	column(chisqd, "Prob>ChiSq") << set name("P Value");
	,
	chisqd << delete columns({"s", "Z"});
	column(chisqd, "Prob>|Z|") << set name("P Value");
	);
 	
chisqd << select all rows;
Significances = chisqd << subset(visible);
close(chisqd, nosave);
Significances << set name("Significances");
column(Significances, "P Value") << format("Fixed Dec", 20, 4);
 
close(dts, nosave);
 
/*
	Now run every pairwise comparison of all the levels of Xfac,
	and append each of them in turn to Significances;
*/
 
if(nItems(xLev) > 2,
 
	xName = Column(dt, VarList["Xfac"]) << get name;
 		
	for(k=1, k<=nItems(xLev)-1, k++,
		for(l=k+1, l<=nItems(xLev), l++,
 
			dt << select all rows;
			TextToParse = "dt << select where((:" || xName || "==\!"" || xLev[k]
				|| "\!")|(:" || xName || "==\!"" || xLev[l] || "\!"));";
//			show(TextToParse);
			eval(parse(TextToParse));
			dts = dt << subset(invisible);
			dts << set name("DTS");

			xCol = Column(dts, VarList["Xfac"]);
 
			Response_List = VarList["Yvar"];
			yCols = {};
			for(i=1, i<=nItems(VarList["Yvar"]), i++,
				insert into(yCols, Column(dts, Response_List[i])) // take the Y columns from the subset, as xCol does
				);
		
			dto = dts << Oneway(
				Y( eval list( yCols ) ), X( eval( xCol ) ),
				Wilcoxon Test( 1 ), Box Plots( 0 ), Mean Diamonds( 0 ), invisible
				);
 
			dtoreport = dto << report;
 
			// DTOREPORT will only be a list if there are at least two response variables;
 
			if(islist(dtoreport),
				chisqd = dtoreport[1][NumberColBox(5)] << MakeCombinedDataTable,
				chisqd = dtoreport[NumberColBox(5)]    << MakeCombinedDataTable
				);
 
			chisqdi = chisqd << subset(invisible);
			close(chisqd, nosave);
			
			chisqdi << delete columns({"s", "Z"});
			column(chisqdi, "Prob>|Z|") << set name("P Value");
			column(chisqdi, "P Value") << format("Fixed Dec", 20, 4);
 		
			close(dts, nosave);
 		
			chisqdi << add multiple columns("With",    1, Before First, Character);
			chisqdi << add multiple columns("Compare", 1, Before First, Character);
 
			for(i=1, i<=nrow(chisqdi), i++,
				column(chisqdi, "Compare")[i] = xLev[k]; column(chisqdi, "With")[i] = xLev[l]
				);
 
			Significances << Concatenate( chisqdi, Append to First Table );
			close(chisqdi, nosave);
			)
		)	
	);
 
// Add a column of significance codes for P <= 0.05 = "*", P <= 0.01 = "**", P <= 0.001="***";
 
Significances << new column("Sig", Character(10));
for(i=1, i<=nrow(Significances), i++,
	column(Significances, "Sig")[i] = if(column(Significances, "P Value")[i] <= 0.001, "***",
		if(column(Significances, "P Value")[i] <= 0.01, "**",
			if(column(Significances, "P Value")[i] <= 0.05, "*", "")
			)
		)
	);
 	
close(dt, nosave);
 
// End of program;
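
The Bonferroni correction mentioned above could be sketched roughly as follows. This is only an illustration, not part of the tested script: it assumes the lines are appended to the end of the program, so that xLev and Significances are still defined, and the column name "Bonferroni P" is made up for the example. Each pairwise P value is multiplied by the number of pairwise comparisons and capped at 1; the overall ("All") test is left unadjusted.

```jsl
// Sketch only: Bonferroni adjustment of the pairwise P values.
// Assumes xLev and Significances from the script above are still defined.

// Number of pairwise comparisons per response variable: k(k-1)/2 for k levels;
nPairs = nItems(xLev) * (nItems(xLev) - 1) / 2;

// "Bonferroni P" is a hypothetical column name chosen for this example;
Significances << new column("Bonferroni P", Numeric, Continuous);

for(i=1, i<=nrow(Significances), i++,
	p = column(Significances, "P Value")[i];
	column(Significances, "Bonferroni P")[i] =
		if(column(Significances, "Compare")[i] == "All",
			p,                 // overall test: leave unadjusted
			min(1, p * nPairs) // pairwise test: multiply by the number of pairs, cap at 1
			)
	);

column(Significances, "Bonferroni P") << format("Fixed Dec", 20, 4);
```

With 3 groups there are 3 pairwise comparisons, so this is equivalent to judging each unadjusted pairwise P value against 0.05/3 (about 0.0167) rather than 0.05. Note that the "Sig" column in the script is still based on the unadjusted P values.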


Sep 19, 2010 07:53 AM

Another thought: an alternative way it might be possible to tackle it would be to try to find a multiple comparison procedure calculated directly from the rank sums within the overall comparison, as opposed to essentially constructing a new set of pairwise tests from scratch, which is what I've done above. I don't actually have a copy of the book any more, so do check me out on this first if you consider ordering a copy, but I think "Nonparametric Statistical Methods" by Myles Hollander & Douglas A. Wolfe (http://www.amazon.com/Nonparametric-Statistical-Methods-Myles-Hollander/dp/product-description/0471190454 ) includes a number of post-hoc pairwise tests for several nonparametric procedures including the Kruskal-Wallis test - and if so, perhaps one of those could be scripted up instead.

Regards, David
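
As an illustration of what a comparison "calculated directly from the rank sums" can look like, one well-known procedure of this kind is Dunn's test. Whether it is among the procedures in Hollander & Wolfe is not confirmed here. It is shown in its large-sample form without a correction for ties; the symbols are introduced for this example only: all N observations from the k groups are ranked together, and group i has n_i observations with mean rank \bar{R}_i.

```latex
z_{ij} \;=\; \frac{\bar{R}_i - \bar{R}_j}
               {\sqrt{\dfrac{N(N+1)}{12}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}},
\qquad
\text{declare groups } i \text{ and } j \text{ different if }
|z_{ij}| \;\ge\; z_{\,1 - \alpha / (k(k-1))}
```

The critical value is the standard normal quantile for a two-sided test at level \alpha, Bonferroni-adjusted over the k(k-1)/2 pairs. Unlike the script above, which re-ranks the data within each pair of groups, this uses the ranks from the overall Kruskal-Wallis ranking.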

Sep 23, 2010 09:21 AM

JMP 9, coming out next month, will include nonparametric multiple comparisons. These methods are, of course, nonparametric tests, and they control the Type I error rate.