
Simple Question..?

I'm a new user and not a great statistician! But as part of my learning I have been posed the following question:

"A process that previously had a standard deviation of 4 has been significantly improved to have a standard deviation of 2. Each sample statistic comes from a normal distribution and is based on a sample size of 10. Do you agree? Why?"

I'm sure the answer is simple but I'm really struggling with it. Can anyone help me with this one?



Jun 23, 2011

Not sure what this has to do with JMP? There are too many unknowns to state a reasonable case. What process are we talking about? What is the measurement error? How do you know the underlying distribution is normal? With a sample size of 10, what normality tests were done? Is the process consistent? Not sure how you would know with 10 samples. Certainly 2 is smaller than 4!
Thanks for your feedback Statman - I was wondering if a valid approach to this question is to use the 'Sample Size & Power' option under 'DOE' and 'One Standard Deviation'?
I tried entering 0.05 for Alpha, 4 as my hypothesized SD and 'Smaller' as the alternative; then 'Difference to Detect' = -2 and 'Power' = 0.90, which generated a sample size of 12.
I also tried entering the same data but set 'Sample Size' = 10, which generated a Power of 0.8505, in an attempt to draw some conclusions about the validity of the claim in the original question.
Any comments would be appreciated..!
I would have thought that answering a statistical question is perfectly relevant to JMP.

There are 2 approaches, one academic, and one pragmatic.

The academic approach is to ask what sample size is required to differentiate between the 2 standard deviations. That is the power test that you performed. As you said, this yields an answer of 12 for a power of 0.9. If you select the "?" icon from the toolbar and click on the dialog window it will give you an example of its usage to give you confidence that you used it correctly.
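For anyone who wants to check the dialog's numbers outside JMP, here is a minimal Python sketch of the same kind of calculation. It assumes the dialog is doing a one-sided chi-square test of a single standard deviation (null SD = 4, true SD = 2, alpha = 0.05); that is my reading of the settings, not something confirmed from JMP's documentation. The function names are my own, and the chi-square CDF is hand-rolled from the incomplete gamma series so that nothing beyond the standard library is needed.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_ppf(p, df):
    """Chi-square quantile found by bisection on chi2_cdf."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 10.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def power_sd_smaller(n, sigma0, sigma1, alpha=0.05):
    """Power of the one-sided test H0: sigma = sigma0 vs H1: sigma < sigma0
    when the true SD is sigma1 (assumed setup, see the text above)."""
    df = n - 1
    # Reject H0 when (n-1) * s^2 / sigma0^2 falls below this critical value.
    crit = chi2_ppf(alpha, df)
    # Under the true SD, (n-1) * s^2 / sigma1^2 is chi-square with df degrees of freedom.
    return chi2_cdf(crit * sigma0 ** 2 / sigma1 ** 2, df)


def smallest_n(sigma0, sigma1, alpha=0.05, target=0.90):
    """Smallest sample size whose power reaches the target."""
    n = 2
    while power_sd_smaller(n, sigma0, sigma1, alpha) < target:
        n += 1
    return n


if __name__ == "__main__":
    print("power at n = 10:", round(power_sd_smaller(10, 4.0, 2.0), 4))
    print("smallest n for power 0.90:", smallest_n(4.0, 2.0))
```

If the assumption about the test is right, the two printed values should land close to the 0.8505 and 12 quoted above.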

The pragmatic approach is to simulate the problem.

Create a new table. When you create a new column you can initialise the data: specify 10 rows, select random normal, and specify the mean and a standard deviation of 4.

Do the same again for a second column but specify a standard deviation of 2.

To analyse the data you want to use Fit Y by X but the shape of the data is wrong. You need to select Tables>Stack and stack the 2 columns on top of each other. By default this will give you a new table with columns "label" and "data".

Perform Fit Y by X with "data" as your Y and "label" as your X.

From the platform hotspot, select the option "Unequal Variances". This will perform a statistical test to see whether there is a statistically significant difference between the variances of the 2 columns of simulated data.

You will see that the results of 5 different tests are shown. This is an indication that in fact the question you are trying to ask is not so simple! Again you can use the "?" to get more background to the tests.

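Bear in mind that a single simulation like this is subject to the same risks that the power calculation describes. For a sample size of 10 the power was 0.85, so even the one-sample test against a known standard deviation of 4 has a 15% chance of failing to detect a real difference. The Unequal Variances tests estimate both standard deviations from 10 observations each, so their chance of missing the difference is higher still.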
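To get a feel for how often a simulation like the one above would flag the difference, the two-column experiment can be repeated many times in a few lines of Python. This is only a sketch under my own assumptions: it uses a one-sided F test on the ratio of the two sample variances, which is not necessarily any of the tests JMP reports, and the critical value 3.18 is the upper 5% point of F(9, 9) taken from standard tables. The function names, seed and number of trials are arbitrary choices.

```python
import random
import statistics

# Upper 5% point of the F distribution with (9, 9) degrees of freedom,
# taken from standard tables (assumption: one-sided F test at alpha = 0.05).
F_CRIT_9_9 = 3.18


def variance_ratio(rng, n=10, sd_old=4.0, sd_new=2.0):
    """Simulate one 'old' and one 'new' column of n values; return s_old^2 / s_new^2."""
    old = [rng.gauss(0.0, sd_old) for _ in range(n)]
    new = [rng.gauss(0.0, sd_new) for _ in range(n)]
    return statistics.variance(old) / statistics.variance(new)


def detection_rate(trials=20000, seed=1):
    """Share of simulated experiments in which the F ratio exceeds the critical value."""
    rng = random.Random(seed)
    hits = sum(variance_ratio(rng) > F_CRIT_9_9 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(f"share of runs flagged as significant: {detection_rate():.3f}")
```

The printed share is an estimate of the power of this particular two-sample test. Because both variances are estimated from only 10 values, expect it to come out noticeably lower than the 0.85 from the one-sample calculation.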

Statistics is never black and white so the final answer will depend on the context of the question. There is sufficient statistical evidence for you to justify that there is a significant reduction in the variation based on the information that you have available. But if you wanted to argue to the contrary there is always scope to argue that you need more data simply by demanding a higher value for the power or lower values of alpha (i.e. reducing the risk of Type I or Type II error). Your tolerance to these risks will depend on the consequences of the wrong type of conclusion.

Hope this helps
Thanks stig; I'll follow your logic - it makes sense.