
post hoc tests for nonparametric data

Hi

I have a sinking feeling there is no quick fix for this, but does anybody know how to run post hoc tests for nonparametric data in JMP? I am trying to compare 3 groups for differences in values such as visual acuity, and neither the data nor the residuals are normally distributed in every instance. I have been running Kruskal-Wallis analyses and then excluding groups so that I can compare 2 groups at a time. The problem is that repeating the test for each pair inflates the overall type I error rate; if the data were normally distributed, I would use a post hoc test to control for this.

What options are there to do this in JMP - if any?

Pulling my hair out with frustration at present.

Thanks for your time.

Neil
3 REPLIES
Hi Neil - this looks as if it'll need some scripting, I'm afraid. I've just had a go at it myself and come up with the script below, which should produce the same Kruskal-Wallis P values as the Fit Y by X platform for any nominal factor X and any number of Y variables, followed by the P values for every pairwise comparison of the X levels. I've tried it out on the Big Class.JMP sample data set, using either Age or Sex as the X factor and Height and/or Weight as the Y variable(s), and it seems to work, though I've no doubt it could be tidied up a lot. The script doesn't adjust the P values from the pairwise comparisons for multiple testing: the best I can suggest there is to apply the Bonferroni correction to them, i.e. multiply each pairwise P value by the number of pairwise comparisons (3 for your 3 groups) and cap the result at 1.



// Start of program;

DataFile = PickFile("Select Raw Data Table for Kruskal-Wallis Test:",, {"JMP Files|JMP"});

dt = open( DataFile );
dt << minimize window;

VarList = ColumnDialog(Title("Assign Roles:"),
Xfac = ColList("Factor ID", Min Col(1), Max Col(1)),
Yvar = ColList("Response(s)", Min Col(1), DataType(Numeric)),
);

Column(dt, Char(VarList["Xfac"][1])) << data type("Character");
Column(dt, Char(VarList["Xfac"][1])) << set modeling type("Nominal");

xCol = Column(dt, VarList["Xfac"]);

Response_List = VarList["Yvar"];
yCols = {};
for(i=1, i<=nItems(VarList["Yvar"]), i++,
insert into(yCols, Column(dt, Response_List[i]))
);

summarize(xLev=by(xCol)); // We'll need a list of the factor levels later, so create it now;

dt << select all rows;
dts = dt << subset(visible);

dto = Oneway(
Y( eval list( yCols ) ), X( eval list( xCol ) ),
Wilcoxon Test( 1 ), Box Plots( 0 ), Mean Diamonds( 0 ), invisible
);

dtoreport = dto << report;

// DTOREPORT will only be a list if there are at least two response variables;

if(islist(dtoreport),
chisqd = dtoreport[1][NumberColBox(5)] << MakeCombinedDataTable,
chisqd = dtoreport[NumberColBox(5)] << MakeCombinedDataTable
);

dtoreport << close window;

chisqd << add multiple columns("With", 1, Before First, Character);
chisqd << add multiple columns("Compare", 1, Before First, Character);

for(i=1, i<=nrow(chisqd), i++,
column(chisqd, "Compare")[i] = "All"; column(chisqd, "With")[i] = "All"
);

/*
We'll get a different set of summary statistics depending on whether xLev has more
than two levels or not, because if xLev > 2 then it's a chi-squared test, whereas
if there are only two levels the 2-sample test returns a Z statistic.
*/

if(nItems(xLev) > 2,
chisqd << delete columns({"ChiSquare", "DF"});
column(chisqd, "Prob>ChiSq") << set name("P Value");
,
chisqd << delete columns({"s", "Z"});
column(chisqd, "Prob>|Z|") << set name("P Value");
);

chisqd << select all rows;
Significances = chisqd << subset(visible);
close(chisqd, nosave);
Significances << set name("Significances");
column(Significances, "P Value") << format("Fixed Dec", 20, 4);

close(dts, nosave);

/*
Now run every pairwise comparison of all the levels of Xfac,
and append each of them in turn to Significances;
*/

if(nItems(xLev) > 2,

xName = Column(dt, VarList["Xfac"]) << get name;

for(k=1, k<=nItems(xLev)-1, k++,
for(l=k+1, l<=nItems(xLev), l++,

dt << select all rows;
TextToParse = "dt << select where((:" || xName || "==\!"" || xLev[k]
|| "\!")|(:" || xName || "==\!"" || xLev[l] || "\!"));";
// show(TextToParse);
eval(parse(TextToParse));
dts = dt << subset(invisible);
dts << set name("DTS");

xCol = Column(dts, VarList["Xfac"]);

Response_List = VarList["Yvar"];
yCols = {};
for(i=1, i<=nItems(VarList["Yvar"]), i++,
insert into(yCols, Column(dts, Response_List[i])) // take the responses from the subset, not the full table
);

dto = dts << Oneway(
Y( eval list( yCols ) ), X( eval( xCol ) ),
Wilcoxon Test( 1 ), Box Plots( 0 ), Mean Diamonds( 0 ), invisible
);

dtoreport = dto << report;

// DTOREPORT will only be a list if there are at least two response variables;

if(islist(dtoreport),
chisqd = dtoreport[1][NumberColBox(5)] << MakeCombinedDataTable,
chisqd = dtoreport[NumberColBox(5)] << MakeCombinedDataTable
);

dtoreport << close window; // close the invisible report, as is done after the overall test

chisqdi = chisqd << subset(invisible);
close(chisqd, nosave);

chisqdi << delete columns({"s", "Z"});
column(chisqdi, "Prob>|Z|") << set name("P Value");
column(chisqdi, "P Value") << format("Fixed Dec", 20, 4);

close(dts, nosave);

chisqdi << add multiple columns("With", 1, Before First, Character);
chisqdi << add multiple columns("Compare", 1, Before First, Character);

for(i=1, i<=nrow(chisqdi), i++,
column(chisqdi, "Compare")[i] = xLev[k]; column(chisqdi, "With")[i] = xLev[l]
);

Significances << Concatenate( chisqdi, Append to First Table );
close(chisqdi, nosave);
)
)
);

// Add a column of significance codes for P <= 0.05 = "*", P <= 0.01 = "**", P <= 0.001="***";

Significances << new column("Sig", Character(10));
for(i=1, i<=nrow(Significances), i++,
column(Significances, "Sig")[i] = if(column(Significances, "P Value")[i] <= 0.001, "***",
if(column(Significances, "P Value")[i] <= 0.01, "**",
if(column(Significances, "P Value")[i] <= 0.05, "*", "")
)
)
);

close(dt, nosave);

// End of program;
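
The Bonferroni adjustment suggested for the pairwise P values is small enough to sketch in JSL. This is only an illustration: the function name bonferroni and the raw P values are made up and are not part of the script above. The one assumption is that each raw pairwise P value is multiplied by the number of pairwise comparisons and capped at 1.

```jsl
// Hypothetical helper (not part of the script above):
// Bonferroni-adjust one P value for m comparisons, capped at 1.
bonferroni = Function( {p, m},
	Min( 1, p * m )
);

// Made-up raw P values for the 3 pairwise comparisons among 3 groups.
rawP = {0.012, 0.030, 0.400};
nComp = N Items( rawP );

// Build the list of adjusted P values.
adjP = {};
For( i = 1, i <= nComp, i++,
	Insert Into( adjP, bonferroni( rawP[i], nComp ) )
);

Show( adjP );
```

In the script itself, the same adjustment could be applied to the rows of Significances where Compare is not "All", with the number of comparisons taken as nItems(xLev) * (nItems(xLev) - 1) / 2.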



Message was edited by: ForumAdmin@sas
Another thought: an alternative approach would be to find a multiple comparison procedure that is calculated directly from the rank sums of the overall comparison, rather than constructing a new set of pairwise tests from scratch, which is essentially what I've done above. I don't have a copy of the book any more, so do check this before ordering one, but I think "Nonparametric Statistical Methods" by Myles Hollander & Douglas A. Wolfe (http://www.amazon.com/Nonparametric-Statistical-Methods-Myles-Hollander/dp/product-description/0471190454 ) includes a number of post hoc pairwise tests for several nonparametric procedures, including the Kruskal-Wallis test. If so, perhaps one of those could be scripted instead.
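
As a rough illustration of what a rank-based procedure of this kind can look like, here is a sketch of the large-sample comparison usually attributed to Dunn, which works from the mean ranks of the single joint ranking used by the Kruskal-Wallis test. Whether this is one of the procedures in the book is not confirmed here. The mean ranks and group sizes are made-up numbers, there is no correction for ties, and the Bonferroni step at the end is one possible choice of adjustment.

```jsl
// Made-up inputs: mean rank and sample size for each of 3 groups,
// all taken from one joint ranking of the N observations.
meanRank = {10.5, 16, 21.5};
grpN = {10, 11, 10};
nGrp = N Items( grpN );

// Total sample size N.
nTotal = 0;
For( g = 1, g <= nGrp, g++,
	nTotal = nTotal + grpN[g]
);

// Number of pairwise comparisons.
nComp = nGrp * (nGrp - 1) / 2;

For( a = 1, a <= nGrp - 1, a++,
	For( b = a + 1, b <= nGrp, b++,
		// Standard error of the difference in mean ranks (no tie correction).
		se = Sqrt( nTotal * (nTotal + 1) / 12 * (1 / grpN[a] + 1 / grpN[b]) );
		z = (meanRank[a] - meanRank[b]) / se;
		// Two-sided P value from the standard normal distribution.
		p = 2 * (1 - Normal Distribution( Abs( z ) ));
		// Bonferroni adjustment across all pairwise comparisons.
		pAdj = Min( 1, p * nComp );
		Show( a, b, z, p, pAdj );
	)
);
```

Because every comparison reuses the ranks from the overall test, nothing has to be re-ranked or re-run for each pair, which is the main difference from the subset-and-retest approach in the script above.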

Regards, David
JMP 9, coming out next month, will include nonparametric multiple comparisons. These methods are themselves nonparametric tests, and they control the Type I error rate.
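
For anyone reading this with JMP 9 or later, a launch along the following lines may be all that is needed. Treat it as a sketch under assumptions: the option name Steel-Dwass All Pairs is taken from the Nonparametric Multiple Comparisons menu of the Oneway platform and is assumed to be accepted in JSL in this form, and the Big Class sample table stands in for the real data.

```jsl
// Assumes JMP 9 or later and the Big Class sample table shipped with JMP.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dt << Oneway(
	Y( :height ),
	X( :age ),
	Wilcoxon Test( 1 ),          // overall Kruskal-Wallis test, as in the script above
	Steel-Dwass All Pairs( 1 )   // assumed option name: all pairwise comparisons
);
```

If the option is not recognized by a given version, the JMP log should say so, and the same comparisons can be requested from the red triangle menu of the Oneway report instead.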