JerryFish
Staff
Learning from my mistakes -- Part 3: Asking the complete question

What else should I consider when planning my test?

For this third installment of my "Learning from my mistakes" blog series, I’d like to wrap up my thoughts on the preliminaries of planning a test. We’ve already talked about what we can expect from a t-test in general. So, we now know that we need to establish a confidence level, a target for improvement compared to the current condition, and a direction for the improvement (greater than or less than the current condition). Now let’s talk about some other test considerations, and then wrap up with “Ask the Complete Question.”

Other Considerations Prior to Test Planning

Recall from the original scenario that the engineer decided to draw three samples from each of the populations to determine whether the population mean changed between the New Treatment and the Control. Why did he choose three samples? Why not five? Or 10? Or 2? Perhaps it was because “We’ve always done it that way.” Perhaps there were other considerations that haven’t yet been described.

From a purely statistical/mathematical perspective, more is always better when choosing a sample size. After all, if we could sample ALL parts from a population, then we wouldn’t have to make any assumptions about the mean or variance of the population. We would know it!
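To put a rough number on "more is better," here is a minimal sketch of how the standard error of a sample mean shrinks as the sample size grows. It assumes independent measurements and a known part-to-part standard deviation. The value SIGMA = 1.0 and the function name standard_error are hypothetical, chosen only for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


# Hypothetical part-to-part standard deviation (an assumption, not from the scenario).
SIGMA = 1.0

for n in (2, 3, 5, 10, 40, 100):
    print(f"n = {n:3d}  standard error = {standard_error(SIGMA, n):.3f}")
```

The standard error falls with the square root of n, so quadrupling the sample size only halves it.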

Unfortunately, there are many practical considerations in our daily work that affect sample size selection. And even more unfortunately, I have myself been guilty of not considering all of the constraints and impacts of executing a given test. It would be much wiser to consider these factors before beginning the test, rather than waste time out of haste.

Below I offer a few things that might affect how you plan your test:

  • What is the budget for this test?
    • Management is always interested in budget!
  • When do we need an answer?
    • If an answer is needed by the end of the week, but the test will take at least a month to complete, then we need to rethink the question that we are trying to answer, and the methodology that we are using to answer it.
  • How many test samples can we afford to build?
    • This might involve cost of the parts themselves, as well as time and scheduling to build them. Are we shutting down a production line to build these parts?
  • Is “Batch” going to be important to us (assuming parts are made in batches)?
    • If so, consider sampling across multiple batches. Of course this impacts the time to produce the initial test parts, and the time until we have our final results.
  • How many samples can we actually test?
    • Is the lab that tests the samples busy with other things? Does it cost money to send the samples out to an external lab?
  • Are we really planning to test the right parameter, in the right fashion?
    • In our scenario, we want to increase the “strength” of our parts. There are many ways to measure strength (e.g., pull tests, repeated cycling, impact testing). Or perhaps we really need to be measuring elastic modulus, or durometer, or some other parameter. Don’t just rush in and start measuring!
  • Can we even measure the “important difference” that we want to detect, using our current test equipment?
    • A Measurement System Analysis helps understand your gauge variation. I’ll cover more about this in my next blog post, including what to do if your gauge isn’t good enough.
  • Do you know the normal part-to-part strength variation in your Control parts? Do you expect the new parts to have the same variation? Is the variation expected to be normally distributed?
    • These questions play an important role when we plan the test and analyze results.
  • Besides improving the average part strength, is maintaining existing part-to-part variation important to the product? Does the variation also need to be improved? Conversely, can it be expanded?
    • If we improve the average strength of the parts but increase the spread (variation) of the parts, will we actually be making more scrap?
    • Usually we are worried about processes that are out of control, and/or making bad parts. But if we have an in-control process that is making consistently good parts, how much are we spending to maintain that control?  Could we tolerate larger variation in the New Treatment parts and save some money in process controls?
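
The scrap question in the list above can be made concrete with a small sketch. It assumes strength is normally distributed and that a part is scrapped when its strength falls below a lower spec limit. Every number here (means, standard deviations, the spec limit) is hypothetical, as is the function name scrap_fraction.

```python
from statistics import NormalDist


def scrap_fraction(mean: float, sd: float, lower_spec: float) -> float:
    """Fraction of parts expected to fall below the lower spec limit,
    assuming strength is normally distributed."""
    return NormalDist(mu=mean, sigma=sd).cdf(lower_spec)


# All numbers below are hypothetical, for illustration only.
LOWER_SPEC = 8.0
control = scrap_fraction(mean=10.0, sd=1.0, lower_spec=LOWER_SPEC)
new_treatment = scrap_fraction(mean=11.0, sd=2.0, lower_spec=LOWER_SPEC)

print(f"Control scrap fraction:       {control:.4f}")
print(f"New Treatment scrap fraction: {new_treatment:.4f}")
```

With these made-up numbers, the New Treatment is stronger on average yet produces roughly three times as much scrap (about 6.7% versus 2.3%), because the wider spread pushes more parts below the limit.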

All of these considerations fold into selecting the sample size for our hypothetical t-test. (I’m sure you can think of many more considerations that need to be discussed before the test is planned as well.) 

So, I’ll call this mistake: Not adequately considering all factors that could affect test planning.

What Is the Complete Question That We Are Trying to Answer?

You can mitigate the mistakes discussed above by setting aside time to “Ask the Complete Question.” (Here, I need to give a shout-out to Bill Kappele, my mentor and friend, who taught this method to me many years ago.)

“Ask the Complete Question” means going through all of the discussion above (and in Parts 1 and 2) and ensuring that everything has been considered, so we are measuring the right quantity, defining constraints on our testing, defining how much difference is important to us, etc.

In our scenario, let’s say that we have talked to all interested parties and stakeholders, and we rewrite the question as follows:

“With 95% confidence, can we say that Treatment A improves the average strength of the part by more than 1 unit, as compared to the current process, without increasing variation of the process?”

Following this, we would list any constraints. A few of these might be:

  • We can afford to build and test a total of 40 parts (Treatment A + Control ≤ 40)
  • We can ignore the effect of Batch
  • We need the answer in one month (just enough time to complete this blog series!)
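
As a rough check on whether these constraints can answer the question, the sketch below estimates the chance of detecting an improvement of more than 1 unit at 95% confidence. It rests on several assumptions that are not part of the scenario: the 40 parts are split evenly (20 per group), both groups share a common standard deviation sigma, and a normal approximation stands in for the t-test. The sigma and true-improvement values are hypothetical, as is the function name approx_power. The "without increasing variation" part of the question is not addressed here.

```python
import math
from statistics import NormalDist


def approx_power(n_per_group: int, sigma: float, true_improvement: float,
                 margin: float = 1.0, confidence: float = 0.95) -> float:
    """Approximate power of a one-sided, two-sample comparison of means.

    Tests H0: improvement <= margin against H1: improvement > margin,
    using a normal approximation with a common, known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(confidence)
    # Standard error of the difference between two group means.
    se_diff = sigma * math.sqrt(2.0 / n_per_group)
    shift = (true_improvement - margin) / se_diff
    return 1.0 - std_normal.cdf(z_crit - shift)


# 40 parts in total, assumed to be split evenly between Treatment A and Control.
N_PER_GROUP = 20

# Hypothetical values: neither sigma nor the true improvement is given in the scenario.
for true_improvement in (1.0, 1.5, 2.0):
    power = approx_power(N_PER_GROUP, sigma=1.0, true_improvement=true_improvement)
    print(f"true improvement = {true_improvement:.1f}  approx. power = {power:.3f}")
```

When the true improvement exactly equals the 1-unit margin, the approximate power is just the 5% false-alarm rate. It only climbs as the true improvement exceeds the margin by an amount that is large relative to the standard error. A real t-test with an estimated sigma would have somewhat lower power than this approximation suggests.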

Confirm That We Have Asked the Complete Question

Over my career, I’ve seen engineers spend a lot of time and resources in the lab trying to get to the bottom of some issue, only to find that no one cares about the results. The problem is not in the initial thinking and test planning. The problem is not in the test execution. The problem is that the engineers didn’t check to make sure ahead of time that the right/complete question was being answered.

Does the boss agree with your plan?

So, once we’ve done all this forward thinking and test planning, it is a great idea to do one last thing: Ask your manager if he/she agrees that this is the question that needs to be answered. In fact, if the test takes a good bit of time to run, it’s a good idea to periodically check in with management to make sure conditions haven’t changed. No one wants to waste their time running a useless test!

Coming Up

I’m planning to touch on all of the following subjects as we go along. Hope you can join me for the whole series!

3 Comments
Staff

Hey Jerry,

Loving your series! As someone who has previous experience in industry, I'm excited to see someone covering the same topics I dealt with (and overcoming some of the same mistakes as well). Can't wait to see what's next!

I do have a quick comment and hopefully it's not too nitpicky :-/. You mention that from a statistical/mathematical perspective, more samples are always better. I would certainly agree...up to a point. That point would be where I start considering diminishing returns. How much more information does/do the extra sample(s) provide toward answering my question? Is it worth the extra effort in time and/or money to get that extra bit of data?

Perhaps an extreme example might help illustrate my point. Suppose I offered you the choice between the entire population of interest and a subset of that population. If I left it at that, you would probably choose the entire population as a no-brainer. But what if I told you (in my best Laurence Fishburne voice and cool sunglasses) that the subset consisted of all but one member of the population? At this point, the choice would probably be more difficult. You might start asking "Well, what's the difference between the two really? Do I really need the one extra member to answer my complete question?" (which you will have determined using the guiding questions in the above blog post :-) ). Maybe you do, maybe you don't. But the point is that the choice is not so clear-cut anymore. It reminds me of a famous paradox about when a heap of sand stops being a heap as you remove grains one by one. If you think of it in a reverse manner, will adding more sand (data) provide me with valuable information or will it just give me a slightly bigger heap than the one I already have?

Now, obviously, most people's situations fall into the category where having more data clearly means having more information. Usually you don't need to worry about this until you get into the large sample situations. But I would argue that if you've ever used a statistical power calculation to help determine sample sizes, you've implicitly used this diminishing return argument. After all, a power calculation simply tells you the minimum sample size required to achieve your statistical goals (assuming it's used properly), with the implication that extra data is just icing on the cake.

(Hopefully I haven't stolen any thunder from your upcoming posts :-/. Or been waaaayyyy too nitpicky :-/.)

Staff

Not nitpicky at all, @calking! Happy to have the comments.

Yes, sample sizes do have tradeoffs, and you make several good points. And you aren't stealing my thunder... just whetting appetites for what lies ahead on selecting proper sample sizes. Thanks for the thoughts and inputs!

Community Member

A very good approach to asking the right question.

In 1993, Doug Montgomery and I simultaneously devised a "master guide to DOE". He published first (Technometrics, 1993). I've been using the guide ever since. It asks about the same questions as you do.

I've got a copy on my website (http://cawseandeffect.com/doe-master-guide/). Feel free to take a look.