Discussions

Novice_Hector

I wanted to know if there is a way to run models and perform data analysis when the data are only available as means and standard deviations. Normally, JMP calculates the mean and standard deviation from the raw data and then uses them in your chosen method. Unfortunately, my partners only have the data available as the means and SDs per formulation. Thank you very much, and have a great day. I appreciate your time and expertise.

statman · Posted in reply to message from Novice_Hector · 01-19-2026

I'm sorry if I don't understand your question. I think it is appropriate to look at the raw data before summarizing and describing it with enumerative statistics (e.g., mean and standard deviation). Are those the appropriate statistics to use? Once you have the summary statistics, you can certainly model them. I would first look to see if they correlate, then run separate fit models to assess model effects.

"All models are wrong, some are useful" G.E.P. Box

Potcner · Jan 21, 2026 3:26 PM

Yes, you can do the analysis on the means and std devs much as you would if you had the raw data. In your example, however, 'Formulation' isn't a possible factor to use, as that's just an ID for each set of data values. You would just want to use 'Percent'. That's a factor that has 3 levels, with 2 values at 0, 3 at 12.71, and 4 at 30.

If you have 'Percent' as a categorical variable, then the Fit Y by X platform will run ANOVAs. Note: Some people like to use Log(SD) as the response instead of just the SD, as that transforms the values so they're more normally distributed. That isn't necessary, though, unless you're trying to do some very precise inference. You're probably fine as is.
If you have 'Percent' as a continuous variable, then the Fit Y by X platform will fit a linear regression.
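For readers working outside JMP, here is a minimal sketch of the same two analyses in plain Python. It is not what the Fit Y by X platform does internally, only the textbook one-way ANOVA F ratio and least-squares line. The response values are hypothetical, invented for illustration; only the design (2, 3, and 4 formulations at the three 'Percent' levels) comes from the thread. The function names `one_way_anova_f` and `least_squares_line` are also made up for this example.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL summary data: (percent, formulation mean).
# The numbers are invented; only the 2/3/4 design is from the thread.
rows = [
    (0.0, 10.1), (0.0, 9.7),
    (12.71, 12.0), (12.71, 12.6), (12.71, 11.8),
    (30.0, 15.2), (30.0, 14.7), (30.0, 15.9), (30.0, 15.0),
]

def one_way_anova_f(rows):
    """Return (F, df_between, df_within), treating the first column as categorical."""
    groups = defaultdict(list)
    for level, y in rows:
        groups[level].append(y)
    grand = mean(y for _, y in rows)
    # Between-group sum of squares: group size times squared distance to grand mean.
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    # Within-group sum of squares: squared distance of each value to its group mean.
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between = len(groups) - 1
    df_within = len(rows) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

def least_squares_line(rows):
    """Return (intercept, slope), treating the first column as continuous."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in rows)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

f_ratio, df_between, df_within = one_way_anova_f(rows)
intercept, slope = least_squares_line(rows)
```

To model the variation instead, the same functions could be called on rows of (percent, SD) or (percent, `math.log(sd)`) for the Log(SD) variant mentioned above.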

Just make sure when interpreting results you do so knowing the data analyzed are the means and std deviations and not individual data values for whatever attribute was measured. A nice graph to do with this kind of data is to plot the means on one axis, the std devs on the other, and then use a symbol or color to show the three 'Percent' levels. This allows you to simultaneously view the central tendency and variation for each of the three levels.

I attached a .jmp file with the analyses and graphs.

dlehman1 · Posted in reply to message from Potcner · 01-21-2026

Your response confuses me a bit. I'll admit I hadn't thought of doing the analysis you suggest, and it might be what Hector is looking for. But my initial reaction was that more information would be required, at least something about the sample sizes that lie behind these summary statistics. My inclination would have been to say that some simulated data would be necessary. If you make an assumption about the distribution of the individual data points behind these means and standard deviations, then you can create a simulated data set (perhaps several, based on different distributional assumptions) to analyze rather than just focusing on these available summary statistics.
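A minimal sketch of that simulation idea, in plain Python, might look like the following. Everything specific here is an assumption: the normal distribution, the sample size `ASSUMED_N` (the thread gives none), and the three summary rows, which are hypothetical numbers. The helper `simulate_group` is an illustrative name, and rescaling the draws so the sample mean and SD match the reported values exactly is one possible design choice, not something proposed in the thread.

```python
import random
from statistics import mean, stdev

def simulate_group(target_mean, target_sd, n, rng):
    """Draw n normal values, then rescale so the sample mean and SD match exactly.

    Requires n >= 2 so that a sample standard deviation exists.
    """
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m, s = mean(draws), stdev(draws)
    # Standardize the draws, then shift and scale to the reported summaries.
    return [target_mean + target_sd * (d - m) / s for d in draws]

# HYPOTHETICAL summaries: (formulation ID, mean, SD). Invented for illustration.
summaries = [
    ("F1", 10.1, 0.8),
    ("F2", 12.0, 1.1),
    ("F3", 15.2, 1.5),
]
ASSUMED_N = 5  # assumption: sample sizes are not stated in the thread

rng = random.Random(2026)  # fixed seed so the sketch is reproducible
simulated = {
    name: simulate_group(m, sd, ASSUMED_N, rng) for name, m, sd in summaries
}
```

Repeating this with other values of `ASSUMED_N`, or with a different generator in place of `rng.gauss`, gives a rough sense of how sensitive any conclusion is to the assumptions behind the simulated data.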

Now, based on your suggestion, I see that such an analysis might be unnecessary. However, it seems to me that your analysis might be making some implicit assumptions about the data distributions behind these summary statistics. Can you comment on this?

Analysis of Data Using Mean and SD