I have daily patient volume data and need to forecast the daily volume for the next 14 days so that appropriate staffing (doctors, nurses, etc.) can be scheduled.
So my question is: what is the best time series method to use during the COVID variation phase? Before COVID I was forecasting with a seasonal ARIMA model.
I have daily data since July 2019. Is it a good idea to include all data since July 2019? Or, since we have had about a 35% reduction in daily volume, is it better to use only the data from March 2020 onward to predict 14 days? Or just the past 3 weeks to predict 14 days?
If the daily patient volume is consistent and stable (as verified using control charts), then you would be able to extend the limits of the control chart as a prediction of future time periods. If, however, the patient volume is inconsistent, any method of prediction will be poor. Since the COVID outbreak, my guess is that the data from before the event is not representative of the current situation. You could try modeling only the "recent" data that may be representative of the next 14 days (I don't know what specific data that is). But, in any case, if the process is unstable, any prediction will be poor. It might be useful to find leading indicators to help understand the patient volume.
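One hedged way to decide which "recent" window is representative is to backtest each candidate window on the same forecast origins and compare the errors. The sketch below is only an illustration: the data is synthetic (a made-up weekly pattern with a level drop, not the poster's data), the forecaster is a plain same-weekday average standing in for whatever model is actually used, and the names `dow_mean_forecast` and `backtest_mae` are hypothetical.

```python
from statistics import mean

def dow_mean_forecast(train, horizon, period=7):
    """Forecast each future day as the mean of past values at the same
    position in the weekly cycle (a simple stand-in model, not ARIMA)."""
    n = len(train)
    forecast = []
    for h in range(horizon):
        phase = (n + h) % period
        same_phase = [v for i, v in enumerate(train) if i % period == phase]
        forecast.append(mean(same_phase))
    return forecast

def backtest_mae(series, window, first_origin, horizon=14, period=7):
    """Rolling-origin backtest: mean absolute error over all origins,
    training only on the last `window` days before each origin."""
    errors = []
    for origin in range(first_origin, len(series) - horizon + 1, period):
        train = series[max(0, origin - window):origin]
        pred = dow_mean_forecast(train, horizon, period)
        actual = series[origin:origin + horizon]
        errors.extend(abs(a - p) for a, p in zip(actual, pred))
    return mean(errors)

# Synthetic data (an assumption for illustration): 10 weeks around 160
# visits/day, then 8 weeks around 110, with a fixed weekly pattern.
pattern = [15, 5, 0, 0, -5, -10, -5]
series = ([160 + pattern[d % 7] for d in range(70)]
          + [110 + pattern[d % 7] for d in range(70, 126)])

# Evaluate both windows on the same origins so the comparison is fair.
mae_3_weeks = backtest_mae(series, window=21, first_origin=84)
mae_all = backtest_mae(series, window=84, first_origin=84)
print(f"MAE, last 3 weeks only: {mae_3_weeks:.2f}")
print(f"MAE, all history:       {mae_all:.2f}")
```

On this synthetic series the short window wins because the old level no longer applies; on real data the answer depends on how stable the recent period actually is, which is what the control charts are meant to tell you.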
Thank you Statman.
I used the past 4 weeks of data to predict the next 3 weeks. The forecast trends sharply upward, but the actual volume has not risen nearly that much. This has caused staffing problems, because we staffed for 165 patients but are actually seeing volumes in the 120s.
Ideally the forecast should follow the red line in the graph below.
Is there a way to give more weight to the last week while maintaining the weekly pattern (high on Mondays, low on Fridays, etc.)?
Which method would that be? Or is there a better suggestion?
I have attached the file with both actual and predicted values (from a seasonal ARIMA time series model, (1,1,0)(0,1,2)7).
I used a seasonal period of 7 because we have weekly variation, such as high volume on Mondays.
I used 3/20 to 5/09 in the Actual column to predict 5/10 to 5/23; the predictions are in columns C to G.
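For the question above about weighting the last week more heavily while keeping the weekly pattern, one simple option is an exponentially weighted same-weekday average. This is a minimal sketch, not the seasonal ARIMA fit described in this post: the function name `weighted_weekly_forecast`, the `decay` value, and the sample data are all assumptions chosen for illustration.

```python
def weighted_weekly_forecast(history, horizon=14, period=7, decay=0.5):
    """Forecast each future day as a weighted mean of past values on the
    same weekday. The most recent observed week gets weight 1, the week
    before gets `decay`, then `decay**2`, and so on."""
    n = len(history)
    if n < period:
        raise ValueError("need at least one full week of history")
    forecast = []
    for h in range(horizon):
        num = den = 0.0
        weeks_back = 0
        idx = n + h - period  # same weekday, one week before the target
        while idx >= 0:
            if idx < n:  # skip positions that are still in the future
                weight = decay ** weeks_back
                num += weight * history[idx]
                den += weight
                weeks_back += 1
            idx -= period
        forecast.append(num / den)
    return forecast

# Illustrative data (an assumption): three weeks near 160, then one week
# near 120, with the same weekday pattern throughout.
pattern = [15, 5, 0, 0, -5, -10, -5]
history = ([160 + pattern[d % 7] for d in range(21)]
           + [120 + pattern[d % 7] for d in range(21, 28)])

weighted = weighted_weekly_forecast(history, decay=0.5)
unweighted = weighted_weekly_forecast(history, decay=1.0)
print([round(v, 1) for v in weighted[:7]])
print([round(v, 1) for v in unweighted[:7]])
```

A smaller `decay` makes the forecast follow the latest week more closely; `decay=1.0` reduces to a plain same-weekday average. Either way the weekday shape is preserved, but the forecast is a flat repeat of one weekly profile, so it cannot anticipate a further level change.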
Why don't you attach the JMP file?
I have created MR and Individuals charts from your actual data. The MR chart answers the question: is the variation between consecutive days (short-term variation) consistent and stable? The Individuals chart answers the question: which sources of variation are greater, the short-term or the long-term?
As you can see, you have "special cause" data as identified by the MR chart. These "events" are worthy of investigation, although they happened 4/26-4/27 and 5/3-5/4, so by now you are left hypothesizing about what happened on those dates. Since the MR is not consistent, you must be careful interpreting the control limits on the Individuals chart (the MR is used to calculate those limits, and we know it to be inconsistent). There do appear to be long-term sources of variation over and above the sources found day to day.
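For readers who want to reproduce the chart arithmetic outside JMP, the sketch below computes Individuals and Moving Range limits with the standard constants for a moving range of two points (2.66 for the Individuals limits, 3.267 for the MR upper limit). The function names and the sample values are assumptions for illustration, not the data from the attached file.

```python
from statistics import mean

def xmr_limits(values):
    """Individuals (X) and moving range (MR) chart limits, n=2 ranges."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,   # 2.66 = 3 / d2, with d2 = 1.128
        "ucl": center + 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,        # D4 for subgroups of size 2
        "moving_ranges": moving_ranges,
    }

def flag_mr_signals(values):
    """Return indices of days whose jump from the previous day exceeds
    the MR upper control limit (candidate special causes)."""
    lim = xmr_limits(values)
    return [i + 1 for i, mr in enumerate(lim["moving_ranges"])
            if mr > lim["mr_ucl"]]

# Made-up daily counts with one abrupt spike at index 9.
visits = [100, 102, 99, 101, 100, 103, 98, 101, 100, 140, 101, 99, 100, 102]
limits = xmr_limits(visits)
signals = flag_mr_signals(visits)
print(round(limits["lcl"], 1), round(limits["center"], 1), round(limits["ucl"], 1))
print("days with out-of-limit moving range:", signals)
```

Note that the spike inflates the average moving range and therefore widens the Individuals limits, which is exactly why the limits need careful interpretation when the MR chart shows signals.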
Thank you for drawing my attention to those high-variation dates. I have started investigating them.
As you requested, I have attached the data. I am planning to forecast 5/24 through 6/6, since that is the pay period we are planning staffing for.
Below is the control chart built from the attached JMP file. Before COVID, a little variation was normal; now the forecast needs some adjustment to account for this uncertainty.
Given this data, could you please recommend the best option for a 3-week forecast?
I would be grateful for your guidance.
Thanks for attaching a JMP file.
I don't know what a visit "costs" in terms of risk/reward. How precise does the prediction need to be? Within how many visits? If that number is small (≤ 10 visits), then the variation in your process right now is huge by comparison.
You need to understand those particular jumps in the data. It looks to me as though there is a 7-day cycle. I'm not sure whether the actual number of visits "counted" is somehow influenced by the day of the week. Perhaps the time or day the visits are counted is affected by shift changes or by how the data is logged. It seems to correlate with Sunday/Monday; perhaps all of the weekend visits are tallied on Sunday or Monday? Until your measurements or accounting are consistent, it will be difficult to be confident in any prediction.
Looking at the additional data, the average shifted at around March 19. The average before March 19 was about 157 (121-193). The average from 3/19 to 5/3 was about 102 (66-140). So you had more total visits before COVID than after. In terms of planning, you'll have to decide how much cost is associated with staffing enough to handle the extreme prediction, versus the risk associated with not planning for enough visits.
I agree that we have clear weekly variation. As for your question about how much variation is acceptable, I would say anything less than 10 visits. The forecast is for daily emergency department visits, and we staff based on the volume we anticipate for the next two weeks.
I used data from March through day t - 1 to forecast 3 weeks. As you observed, volume was in the 160s before March 15, then dropped significantly for a few weeks, and has recently come back up to the 120s.
I have followed these steps (attached).
This is the forecast I was producing, but actual volume is still in the 120s to 130s, so by using it we have overstaffed, because the forecast shows volume going into the 160s.
I am wondering if there is anything I need to do differently here.
You are trying to create a model with no understanding (or accounting for) variables that might impact the response variable of Visits. The only predictor variable you are putting in the model is time. IMHO, in order to have a prediction model that is useful you need to understand what factors affect the response variable, Y=f(x). Once you understand the input variables (x's) that are influential, you can consider how to control, manage, measure those x's to achieve the desired output (note some x's will affect the mean and some will affect variability). As I have already pointed out, your data suggests there are x's, yet to be identified, that are causing instability in the number of actual visits. No mathematical modeling can alleviate this. "Garbage in, garbage out." I suggest you:
1. map the process,
2. identify x's that may be influential,
3. develop hypotheses as to why those x's may or may not be influential,
4. collect data to provide insight into your hypotheses.
Yes, this is challenging work and requires resources to study the process.
"There is no instant pudding". Deming
Then you can start to think about using mathematical models to predict.