Please feel free to completely ignore my post below. No reply is asked for. I am hesitant about posting this, and no offense is intended. I have copied part 1 (of 3) of an earlier post by Fausto, describing the situation and motivation. I have added questions.
- Data were provided from a large hospital system concerned with a very high rate of hospital-acquired urinary tract infections (UTIs).
What is “a very high rate”? How is it measured? Has the measurement system been studied (e.g., determining it is a UTI and not some other infection or health issue)? How do they determine it was hospital-induced/acquired and not an infection the patient already had, perhaps undetectable or not measured before coming to the hospital, or one contracted from visitors to the patient?
Specifically, the hospital would like to track the frequency of patients being discharged who had acquired a UTI while in the hospital as a way to quickly identify an increase in infection rate or, conversely, monitor whether forthcoming process or material changes result in fewer infections.
Why do they want to track this? I would think they want to reduce it. The main questions answered by control charting are:
- How should the investigation proceed (special or common?) and
- Where should you focus your investigative efforts (e.g., which components of variation dominate)?
I assume their intention is to understand causal structure, perhaps running experiments or sampling to uncover causes and then ultimately suggesting potential solutions. It seems what they lack is a measurement system capable of detecting the infection quickly. How does plotting the current data lead to creating a “better” measure? I would think, based on the information provided in 2 (from the previous post), they want to understand and reduce the number of hospital-induced/acquired health problems (of which UTIs are one). They don’t want to reduce UTIs by replacing that infection with another health issue.
Because the root cause often differs based on gender, male and female patients are charted separately and this example focuses on males.
This is a hypothesis. Perhaps it has been studied? How was it studied (one factor at a time?)? There are many noise and hidden factors involved. How was the noise accounted for in their studies? It seems directed sampling would be quite useful to separate components of variation.
The data, which can be seen in the appendix, appear to satisfy the distributional assumption for use of the t chart, with the mean time between male UTI patients at 0.21 days or about 5 hours.
Is this the best way to measure the phenomenon? Does the severity of the UTI matter? Can the amount of infection be measured? Are there other “Y’s” of interest (e.g., vitals, other infections, other health issues)? How frequently are the patients tested to determine when, exactly, they got the infection? Could other measures correlate with or predict infection?
They really care about the hospital-induced infection rate. While mean time between UTI patients is a measure of the rate, it seems each event could be considered independent (each data point is a different patient with different underlying health conditions). It also seems there may be some similarities due to potential hospital effects (both common AND special causes). How will plotting the metric distinguish between these different sources of variation?
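As a quick check on the quoted figure of 0.21 days between male UTI patients, here is a small Python sketch that converts it to hours and to a rate. It also shows what probability limits for the time between events could look like IF the gaps were exponentially distributed. That exponential assumption is mine for illustration only; the quoted post does not say which distributional assumption the proposed t chart uses, and the function name is hypothetical.

```python
import math

# Mean time between male UTI patients, taken from the quoted post (days).
mean_days = 0.21

mean_hours = mean_days * 24   # the "about 5 hours" in the quoted post
rate_per_day = 1 / mean_days  # the same information expressed as a rate


def exponential_limits(mean, alpha=0.0027):
    """Probability limits for the time between events.

    ASSUMPTION (not from the quoted post): gaps are exponential with the
    given mean. alpha = 0.0027 matches the false-alarm rate of 3-sigma
    limits on a normal variable.
    """
    lower = -mean * math.log(1 - alpha / 2)  # quantile at alpha/2
    center = mean * math.log(2)              # median of the exponential
    upper = -mean * math.log(alpha / 2)      # quantile at 1 - alpha/2
    return lower, center, upper


lower, center, upper = exponential_limits(mean_days)
print(f"mean gap: {mean_hours:.2f} hours, rate: {rate_per_day:.2f} per day")
print(f"limits (days): lower={lower:.5f}, center={center:.4f}, upper={upper:.4f}")
```

Under this assumption, a run of very short gaps would suggest the infection rate has risen, and a gap above the upper limit would suggest it has fallen. It says nothing about why, which is the point of the questions above.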
The data were plotted using the proposed t chart method and demonstrate statistical control.
What is meant by statistical control? Is a consistent, predictable number of UTIs a good thing? How are the control limits “created”? Control limits are a function of the x’s (variables) changing at the designed sampling frequency.
Shewhart’s approach is to select a subset of the x’s and group them into a subgroup (for argument’s sake, relatively short term in time). Choose another set of x’s that will vary over a longer term (or some other rational condition). Use the first subset of x’s as the basis for comparison. First, is the within-subgroup subset of x’s consistent (range chart)? If it is, then a comparison can be made to the other set of x’s that vary in the study. This is done using the X-bar chart. The control limits are a function of the range (the within-subgroup x’s), and the dots plotted on the X-bar chart are a function of the x’s changing between subgroups. If this chart is “in-control”, then the within-subgroup x’s dominate the effect on the plotted Y. If this chart is “out-of-control”, then the between-subgroup sources dominate.
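The X-bar and range chart logic described above can be sketched in a few lines of Python. This is only an illustration: the subgroup values are made up (they are not the hospital data), the function names are hypothetical, and A2, D3 and D4 are the standard tabulated chart constants for subgroups of size 5.

```python
# Standard Shewhart chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Compute R chart and X-bar chart limits from subgroups of size 5."""
    if any(len(s) != 5 for s in subgroups):
        raise ValueError("constants above are only valid for subgroups of 5")
    means = [sum(s) / len(s) for s in subgroups]   # between-subgroup signal
    ranges = [max(s) - min(s) for s in subgroups]  # within-subgroup variation
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "means": means,
        "ranges": ranges,
        # X-bar limits are built from the WITHIN-subgroup range.
        "xbar_lcl": grand_mean - A2 * r_bar,
        "xbar_ucl": grand_mean + A2 * r_bar,
        "r_lcl": D3 * r_bar,
        "r_ucl": D4 * r_bar,
    }


def out_of_control(values, lcl, ucl):
    """Return the indices of points falling outside the limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Made-up data: the last subgroup is shifted, the within-subgroup spread is not.
subgroups = [
    [10, 11, 9, 10, 10],
    [9, 10, 11, 10, 10],
    [10, 10, 9, 11, 10],
    [14, 15, 13, 14, 14],
]
chart = xbar_r_limits(subgroups)
r_signals = out_of_control(chart["ranges"], chart["r_lcl"], chart["r_ucl"])
xbar_signals = out_of_control(chart["means"], chart["xbar_lcl"], chart["xbar_ucl"])
print("range chart signals:", r_signals)
print("X-bar chart signals:", xbar_signals)
```

In this made-up case the range chart is consistent, so the comparison is fair, and the X-bar chart flags the last subgroup: the between-subgroup x’s dominate. What the chart shows depends entirely on which x’s were allowed to vary within versus between subgroups, which is why the subgrouping question matters.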
“The engineer who is successful in dividing his data initially into rational subgroups based on rational theories is therefore inherently better off in the long run. . .”
Shewhart
"All models are wrong, some are useful" G.E.P. Box