I want to process data contained in a table like the attached example. The table consists of 100 columns plus one column serving as a control measure, and has 30 to 80 rows.
The aim is to apply a statistical test in an iterative manner starting with control vs column 51, control vs column 52, control vs column 53, etc… in order to detect when a significant difference appears, how long it is present on successive columns and when the significance is lost.
For example, a difference might be detected from column 63 to column 78, then lost from c79 to c86, and then reappear from c87 to c98.
I don't know how to write a script that automatically analyzes this type of table and reports the points at which differences occur.
Any suggestions?
Here is what I learned from our chat:
Definition of a difference: Wilcoxon test comparing values in the Ctrl column to values in each of the 100 other columns (repeating comparisons).
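For readers who want to see what that repeated comparison amounts to outside JMP, here is a minimal Python sketch. It assumes the Ctrl column and each test column are independent samples, so it uses the Wilcoxon rank-sum (Mann-Whitney) form with a normal approximation, a tie correction, and no continuity correction. If the rows are actually paired, the signed-rank test would be the counterpart. The function names and the shape of the input (a dict of column name to values) are hypothetical, and no multiple-comparison correction is applied.

```python
import math

def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # Mann-Whitney U for sample x
    mean_u = n1 * n2 / 2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical: no evidence of a difference
        return 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

def compare_to_control(control, columns):
    """Return [(column_name, p_value), ...] in the order the columns are given."""
    return [(name, rank_sum_pvalue(control, vals)) for name, vals in columns.items()]
```

With 30 to 80 rows per column the normal approximation is usually considered reasonable, but an exact test from a statistics package would be the safer choice for a real analysis.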
The example Table is only for giving an idea of the data structure. The behavior described above does not apply to this table.
The aim would be to sort the onset and offset times of a difference for each analyzed table of this kind.
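Once a p-value is available for each column, extracting the onsets and offsets is a matter of finding consecutive runs below a threshold. A minimal Python sketch, assuming the p-values arrive as (column, p) pairs in column order. The function name, the 0.05 threshold, and the example p-values are all invented for illustration (they mimic the c63 to c78 / c87 to c98 pattern from the question, not real data).

```python
def significant_runs(pvalues, alpha=0.05):
    """Return (onset, offset) column pairs for each consecutive run with p < alpha.

    pvalues: list of (column_id, p_value) tuples, already in column order.
    """
    runs = []
    start = None  # first column of the run currently open, if any
    prev = None   # most recent significant column seen
    for col, p in pvalues:
        if p < alpha:
            if start is None:
                start = col
            prev = col
        elif start is not None:
            runs.append((start, prev))  # significance lost: close the run
            start = None
    if start is not None:  # a run still open at the last column
        runs.append((start, prev))
    return runs

# Invented p-values: significant on c63-c78 and c87-c98, not elsewhere.
example = [(c, 0.01 if 63 <= c <= 78 or 87 <= c <= 98 else 0.5)
           for c in range(51, 101)]
print(significant_runs(example))
```

Each tuple in the result is one episode: the column where the difference first appears and the last column before it is lost.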
It would help a lot if you could provide another table, together with some analysis results you would like to see.
Hi Volker and thank you for taking care of my problem,
Here is an already analyzed data table in which the Ctrl / test comparison revealed significant differences (assessed with a t test, although a Wilcoxon test would be more appropriate).
Each row below gives the comparison between the Ctrl data (control column) and a given test column (numbered from 100 to 200)
Col nbr P value
thank you, that helped a lot.
It seems that scripting is not mandatory here. You could do the following:
1. Tables > Stack: Get a new data table with the data from your control column and all measurement columns stacked. This will create a column Label with your column names, and a second column with all your data.
The attached file shows my result.
2. Analyze > Fit Y by X, with X=Label and Y=Data > Hotspot > Compare Means > With Control, Dunnett's: Choose your control column and you will get all pairwise tests. You can right-click the table under LSD Threshold Matrix, e.g. to sort it or to make it into a data table (maybe for graphing p-values in Graph Builder).
For Dunnett's test see also Compare Means.
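In case the stacking in step 1 is unclear, it is just a wide-to-long reshape. A tiny Python sketch of the idea, purely illustrative: the function name, the column names Ctrl and C51, and the values are made up, and this is not how JMP implements Tables > Stack.

```python
def stack(table):
    """Turn {column_name: [values]} into a list of (label, value) rows."""
    return [(label, value) for label, values in table.items() for value in values]

# Hypothetical two-column table in wide form.
wide = {"Ctrl": [1.0, 2.0], "C51": [3.0]}
for label, value in stack(wide):
    print(label, value)
```

The stacked form has one Label column and one Data column, which is the layout Fit Y by X expects in step 2.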
Hope that helps,
Someone should probably mention that the approach y'all are taking might cause a sizable number of dead preeminent statisticians to spin like lathes in their respective graves.
I cannot discern exactly what decision you're trying to make from your description, but it sounds more like changepoint detection to me. If so, analyzing a number of sequentially-aggregated p-values from any two-sample test could be considered poor form at best.
Thanks for your message Kevin,
I don’t want to disturb preeminent statisticians whether they are alive or dead.
I would be pleased to know what your suggestion is.
Thanks Kevin, any suggestion is appreciated.
As I understand it, Paul had not chosen a test yet, but was asking for a way to run a sequence of two-sample tests (one sample always the same) in JMP, given his original data set. My point was that scripting would not be necessary in this case.
Thanks again for your contributions.
Without more information, I'm not doing much more than shooting in the dark.
But if, as I suspect, you are searching through sequentially-gathered data in an attempt to discern a change in the data's generative process, a changepoint approach might be more justified.
There are many ways to detect changepoints, and a rich trove of references going back many years.
CRAN has a changepoint package that implements several recently-researched methods in R. JMP has some neat R integration, in which you can execute R code on JMP datasets and get the results back into JMP. Try the Pruned Exact Linear Time (PELT) method referenced in R. Killick, P. Fearnhead & I. A. Eckley (2012), Optimal Detection of Changepoints With a Linear Computational Cost, Journal of the American Statistical Association, 107:500, 1590-1598.
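To give a feel for what such a method computes, here is a minimal Python sketch of penalized optimal partitioning for shifts in the mean, using a squared-error segment cost. This is the exact O(n²) dynamic program that PELT speeds up by pruning candidates. It is not PELT itself and not the changepoint package's code. The function name, the penalty value, and the data are invented for illustration, and the penalty is a tuning parameter you would have to choose for real data.

```python
def changepoints_mean(data, penalty):
    """Penalized optimal partitioning for changes in mean (squared-error cost).

    Returns the sorted indices at which a new segment starts (0-based).
    """
    n = len(data)
    # Prefix sums of x and x^2 give each segment cost in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(data):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a, b):
        """Sum of squared deviations of data[a:b] from its own mean."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: minimal penalized cost of data[:t]
    best[0] = -penalty      # so the first segment carries no penalty
    last = [0] * (n + 1)    # last[t]: start index of the final segment of data[:t]
    for t in range(1, n + 1):
        best[t], last[t] = min(
            (best[s] + cost(s, t) + penalty, s) for s in range(t)
        )

    # Walk back through the stored segment starts.
    cps = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            cps.append(s)
        t = s
    return sorted(cps)

# Invented series: the mean jumps from 0 to 5 at index 20.
series = [0.0] * 20 + [5.0] * 20
print(changepoints_mean(series, penalty=10.0))
```

A larger penalty yields fewer changepoints, a smaller one yields more, so the result should be read alongside the penalty used.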
That’s a constructive way to exchange advice! Thanks for that, Kevin. It is interesting to know that we can go back and forth between R and JMP. I will surely take a closer look at changepoint procedures.