JackLEE
Level I

Small sample size, how to find correlations between parameters correctly?

Hi,

Unfortunately, I have only 3 samples left. Around 10 different characteristics were measured on those samples, so I now have a table with 3 rows and 10 columns. What I want to find out is whether there are one-to-one trends (correlations) between these parameters, or, in other words, whether some of these parameters correlate strongly with others.

 

Therefore, I performed a multivariate analysis with JMP, and it worked. But the question is: is this sample size too small to draw a solid conclusion? If so, how should I correctly calculate the sample size?

 

Someone said that if JMP can run the analysis, my sample size must be okay, but I doubt this. What do you think?

 

Thank you in advance for your input or comments! 

3 REPLIES
statman
Super User

Re: Small sample size, how to find correlations between parameters correctly?

First, welcome to the community. If you have multiple columns of data, JMP can perform a multivariate analysis. It may not be useful, but it won't fail to "work". The more challenging question is how confident you are that the results of this study will hold in the future (assuming this is what you mean by "solid conclusion"). Seldom can this be assessed by statistics alone (I know some may argue with this). The question is how representative the samples you have in hand are of future conditions. Hopefully your SMEs (subject matter experts) can provide some guidance as to how representative your 3 samples are of "all samples". I would be suspicious. How were the 3 samples acquired?

 

 “Unfortunately, future experiments (future trials, tomorrow’s production) will be affected by environmental conditions (temperature, materials, people) different from those that affect this experiment…It is only by knowledge of the subject matter, possibly aided by further experiments to cover a wider range of conditions, that one may decide, with a risk of being wrong, whether the environmental conditions of the future will be near enough the same as those of today to permit use of results in hand.”

Dr. Deming

 

There are ways to be efficient in your sampling that can help to increase the inference space of the study and thereby increase the likelihood your study will be applicable in the future. 

"All models are wrong, some are useful" G.E.P. Box
JackLEE
Level I

Re: Small sample size, how to find correlations between parameters correctly?

Thank you for your swift reply and the welcome! As you said, whether the results will still be valid in the future does not depend on the statistical results alone; the practical meaning also has to be considered. What I was trying to figure out when I asked this question is whether, from a statistical point of view, 3 samples x 10 characteristics are enough to obtain correlations between those characteristics. Should I believe the correlation R² and start interpreting the practical meaning of the results? (Not sure if I have made my question clear... sorry.)

 

The 3 samples were prepared in the same way, just with a gradually increasing protein content. 
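
To put rough numbers on the statistical side of this question, here is a minimal Python sketch (not JMP output, and not taken from anyone's data). It assumes the usual two-sided t-test for a Pearson correlation with a 5% significance level, and it assumes independent, normally distributed variables when it computes how often pure noise produces a large correlation. The names `r_crit` and `prob_abs_r_exceeds` are made up for the illustration.

```python
import math

N_SAMPLES = 3    # rows in the table
N_COLUMNS = 10   # measured characteristics
ALPHA = 0.05     # assumed two-sided significance level

# The t-test for a correlation has n - 2 degrees of freedom, so 1 here.
# A t distribution with 1 df is a standard Cauchy distribution, whose
# quantile function is tan(pi * (p - 0.5)).
df = N_SAMPLES - 2
t_crit = math.tan(math.pi * ((1 - ALPHA / 2) - 0.5))

# Invert t = r * sqrt(df) / sqrt(1 - r^2) to get the smallest |r|
# that the test would call significant.
r_crit = t_crit / math.sqrt(t_crit**2 + df)


def prob_abs_r_exceeds(r0):
    """P(|r| > r0) for n = 3 when the two variables are independent normals.

    For n = 3 the null density of r is 1 / (pi * sqrt(1 - r^2)),
    which gives P(|r| <= r0) = (2 / pi) * asin(r0).
    """
    return 1 - (2 / math.pi) * math.asin(r0)


# Number of distinct column pairs in a 10-column table.
n_pairs = math.comb(N_COLUMNS, 2)

print(f"critical |r| at alpha={ALPHA}: {r_crit:.4f}")
print(f"P(|r| > 0.9) from noise alone: {prob_abs_r_exceeds(0.9):.3f}")
print(f"column pairs examined: {n_pairs}")
print(f"expected noise pairs with |r| > 0.9: {n_pairs * prob_abs_r_exceeds(0.9):.1f}")
```

Under these assumptions a single correlation needs |r| of about 0.997 to reach significance with 3 rows, and roughly 29% of unrelated column pairs would show |r| > 0.9 by chance, which is about 13 of the 45 pairs. Characteristics that all follow the protein content will of course correlate for a real reason, but 3 rows cannot separate that from noise.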

 

dale_lehman
Level VII

Re: Small sample size, how to find correlations between parameters correctly?

I don't believe any amount of statistical wizardry will overcome the fact that you only have 3 samples and thus must be skeptical of any relationship you observe.  You might try the bootstrap so you get a more concrete feel for what the confidence intervals (they should match fairly well) look like for the correlations.  I suspect they will be quite wide, and that is an indication of just how unsure you should be of the relationships.  So, you can have JMP estimate the relationships along with the uncertainties, but ultimately the issue is what you want to use the results for.  I think it would be dangerous to use the point estimates for anything - but along with some simulation of the uncertainties, it might prove useful.  Ultimately, you will need more data to have any confidence in what you find, in my opinion.
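
As a rough illustration of why the bootstrap intervals end up so wide, here is a minimal Python sketch (not JMP's bootstrap feature) that enumerates every possible resample of 3 rows instead of drawing them at random. The `protein` and `viscosity` values are hypothetical numbers invented for the example, and `pearson_r` is a small helper written here, not a library function.

```python
import itertools
import math


def pearson_r(xs, ys):
    """Pearson correlation, or None when either variable is constant."""
    if len(set(xs)) < 2 or len(set(ys)) < 2:
        return None
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements, NOT data from this thread.
protein = [1.0, 2.0, 3.0]
viscosity = [10.0, 14.0, 15.0]

# With 3 rows there are only 3^3 = 27 ordered resamples with replacement,
# so the whole bootstrap distribution can be listed exactly.
counts = {"undefined": 0, "plus_minus_one": 0, "original": 0}
for idx in itertools.product(range(3), repeat=3):
    xs = [protein[i] for i in idx]
    ys = [viscosity[i] for i in idx]
    r = pearson_r(xs, ys)
    if r is None:
        counts["undefined"] += 1        # same row drawn three times
    elif abs(r) > 1 - 1e-9:
        counts["plus_minus_one"] += 1   # only two distinct rows: a perfect line
    else:
        counts["original"] += 1         # all three rows: the observed r again

print(f"observed r: {pearson_r(protein, viscosity):.4f}")
print(counts)
```

For this made-up data, 18 of the 27 resamples give a correlation of exactly 1 or -1, 3 give no correlation at all, and the remaining 6 simply reproduce the observed value. The resampling has almost nothing to work with, which is another way of seeing how little 3 rows can say about a relationship.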