MikeKim
Level III

How can we easily assume the 'independence of error term(and so many other factors)??

I was once asked, "How can we so easily assume independence of the error term (and so many other things)?"

 

After all, in many fields the data are necessarily autocorrelated.

 

For example, take the control chart, a typical tool that assumes normality and no autocorrelation: the data plotted on it are definitely autocorrelated (only the extent varies).

Obviously, production processes are affected by outside temperature, the operators' condition, humidity, and so on. All of these are in turn affected by seasonal change and/or some other kind of continuous pattern.

So, under the influence of these factors, the response variables necessarily follow a specific pattern.

Of course, the extent would be small, since manufacturers always focus on keeping conditions consistent, so the effect of all of those factors can be neglected (but it cannot be zero).

Likewise, many other things necessarily follow some pattern.

 

So, am I thinking about this the wrong way?

Can a small amount of autocorrelation be accepted and treated as if there were no autocorrelation?

 

Accepted Solutions
Phil_Kay
Staff

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Hi @MikeKim ,

I love this! A very philosophical question. I am fairly sure that I am not qualified to provide a definitive answer. But here goes...

Like many assumptions in scientific reasoning, I think this is one that does not have to be 100% true to be useful. And you might even say that the nature of assumptions is that you know that they are not 100% true. Otherwise, they would be "givens" not "assumptions."

I agree that the idea of 2 consecutive observations on a control chart being completely independent is absurd. There will definitely be some factors that are somewhat consistent from one observation to the next.

However, if we can assume that this autocorrelation is small and not important, then we can use control charts to separate common cause from special cause variation, identify if the process is trending out of control - all the good stuff we use SPC for.
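One rough way to get a feel for what "small and not important" might mean is to simulate it. The sketch below is only an illustration under stated assumptions, not a recommended procedure: it assumes the process can be mimicked by an AR(1) series with coefficient `phi` (a hypothetical stand-in for real process data), estimates the lag-1 autocorrelation, and counts how many points fall outside individuals-chart limits computed as mean ± 2.66 × average moving range. The function names are made up for this example.

```python
import random
import statistics


def simulate_ar1(n, phi, seed=0):
    """Simulate x[t] = phi * x[t-1] + e[t] with standard normal noise e[t]."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = statistics.fmean(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


def xmr_limits(x):
    """Individuals-chart limits: mean +/- 2.66 * average moving range."""
    center = statistics.fmean(x)
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    half_width = 2.66 * statistics.fmean(moving_ranges)
    return center - half_width, center + half_width


def fraction_outside(x):
    """Share of points falling outside the individuals-chart limits."""
    lower, upper = xmr_limits(x)
    return sum(1 for v in x if v < lower or v > upper) / len(x)


# Compare no, weak, moderate, and strong autocorrelation (assumed values).
for phi in (0.0, 0.1, 0.5, 0.9):
    series = simulate_ar1(5000, phi, seed=1)
    print(phi,
          round(lag1_autocorrelation(series), 3),
          round(fraction_outside(series), 4))
```

Running this for your own range of `phi` shows how quickly, or slowly, the share of points outside the limits grows as the autocorrelation increases, which is one way to judge whether the amount in your process is worth worrying about.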

I spend most of my time thinking about DOE. In sequential DOE we screen out the "inactive" factors from the "active" factors. This is similarly absurd. All factors will be active in some way. But if we can classify some as unimportant/inactive, we can focus our limited effort on the most important factors and understand, optimise, and control our process to yield consistently high-quality output.

Assumptions enable us to build useful models to better understand our processes. All models are wrong - and that is partly because the assumptions are not 100% true - but some are useful.

I hope that helps!

Phil 

MikeKim
Level III

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Thank you so much. It really helps, and it is similar to what I had assumed. I am happy.
statman
Super User

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Actually, the assumptions you suggest for control charts are wrong. There are no assumptions of normality, or of autocorrelation or lack thereof. Please read Wheeler and Shewhart. And Phil, you should note the author of the original quote: "All models are wrong, some are useful" is from G.E.P. Box.

 

Here's one paper to start:

https://www.spcpress.com/pdf/DJW088.pdf
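If it helps to see this with numbers, here is a small sketch. It is not taken from the paper above; it is only an illustration that assumes simulated, independent data from three distributions and the usual individuals-chart limits (mean ± 2.66 × average moving range). The function names are invented for this example. It computes the limits without any normality check and reports the share of points that fall inside them.

```python
import random
import statistics


def xmr_limits(values):
    """Individuals-chart limits: mean +/- 2.66 * average moving range."""
    center = statistics.fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * statistics.fmean(moving_ranges)
    return center - half_width, center + half_width


def fraction_inside(values):
    """Share of points falling inside the individuals-chart limits."""
    lower, upper = xmr_limits(values)
    return sum(lower <= v <= upper for v in values) / len(values)


# Simulated data (assumed for illustration): symmetric, flat, and skewed.
rng = random.Random(42)
n = 5000
samples = {
    "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
    "uniform": [rng.uniform(0.0, 1.0) for _ in range(n)],
    "exponential": [rng.expovariate(1.0) for _ in range(n)],
}

for name, values in samples.items():
    print(name, round(fraction_inside(values), 4))
```

The point of the comparison is that the limits are computed the same way in every case, so you can see how much the coverage changes when the data are clearly not normal.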

"All models are wrong, some are useful" G.E.P. Box
MikeKim
Level III

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Ohh...

I read the paper. It was quite hard for me.

I keep thinking:

"The basic concept of the control chart actually comes from the normal distribution, and now its inventor denies the normality assumption?"

Quite confusing.

But I got the point. Good point, thank you.

I am very sorry that I unintentionally posted something misleading. Thank you for the correction.

 

But anyway, my question was not really about the control chart but about all the statistical tools that assume a normal distribution.

 

MikeKim
Level III

Re: How can we easily assume the 'independence of error term(and so many other factors)??

By the way, the I-chart (individuals control chart) needs the normality assumption, according to:

https://www.isixsigma.com/tools-templates/normality/dealing-non-normal-data-strategies-and-tools/ 

Hmm... confusing. OTL.

 

"Normally distributed data is needed to use a number of statistical tools, such as individuals control charts, Cp/Cpk analysis, t-tests and the analysis of variance (ANOVA). If a practitioner is not using such a specific tool, however, it is not important whether data is distributed normally. The distribution becomes an issue only when practitioners reach a point in a project where they want to use a statistical tool that requires normally distributed data and they do not have it."

 

FYI.

statman
Super User

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Not to be disrespectful, but isixsigma is not a source I would be quoting.  You can find all sorts of disinformation on the net. 
MikeKim
Level III

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Thank you for enlightening me.
You have added to my knowledge.
It is you who actually convinced me that the normality assumption is unnecessary when using a control chart, which was the starting point of all the questions I raised.
But for my own intellectual curiosity, the accepted answer was more satisfying.
I admire you.
statman
Super User

Re: How can we easily assume the 'independence of error term(and so many other factors)??

Here is another well-written paper by a distinguished author:

Woodall, William H. (2000), "Controversies and Contradictions in Statistical Process Control", Journal of Quality Technology, Vol. 32, No. 4, October 2000.

 

If you can get ahold of that issue of JQT, there is an informative discussion of the appropriate use of control chart methods.

 