I have a data table which presents the sales of year 2015 by sales date.
After adding a Cumulative Summary column I have the final amount of year 2015 "total revenue".
Depending on this data (I also have the same for 2012, 2013, 2014 if that helps a healthier prediction) how will I predict the total revenue of 2016 by filling the "Sales for 2016" column day by day?
Message was edited by: Sait Copuroglu
Second data table added " Predicting Future Values Years Weeks"
It's a great question: How will you predict Total Revenue in 2016?
Here's what I recommend:
I ignored the missing data, made three models, averaged them, and predicted 5.4M euros in 2015 and 7.1M euros in 2016. Better data will support better models. Better models will hopefully support better predictions.
Niels Bohr is reputed to have said "Prediction is difficult, especially about the future."
Thanks for your attention and for your reply!
Actually I feel I am very lucky with this prediction, last 4 years data is moving within a 85% confidence interval (Please see inserted pic).
I'd much rather be lucky than good. I'm not going to share my work with you...I am going to try to help you by consulting with you, not by doing the job for you.
Zero is not missing! Egyptian accountants had a symbol for zero in 1740 BC. We should honor prehistory by using it in place of the missing data in your file.
But your graph of the 2014 data shows a monotonic increase that doesn't jive with the large numbers of zeroes (used to be missing) in the Revenue data. So I guess I am even more confused with this new information.
The functional shapes on the graph all look similar year-over-year, and you may do better by just using the average of the functions at each day. That would certainly be more parsimonious than my approach, and it may even be more accurate. Remember, all you gave us was two partial years...
I know very little about the resort hotel business. But it makes some intuitive sense to me that, for instance, weekends would be more lucrative than weekdays, that summer vacationtime would be more lucrative than winter, that holidays like New Year's Eve would make more money than other days, etc. So I made some variables like DayOfYear, DayOfWeek, WeekOfYear, Year, etc. I then fit "decent" Neural Net, Bootstrap Forest, and Boosted Tree models with the response of Revenue with those variables as Factors, and saved the prediction formulae. I made a variable that summarized the mean of those three model predictions. Then I incremented the dates through 2016, made the daily predictions, and summed the means by Year. It's a very naive approach that could probably be improved dramatically with some subject knowledge.
Assuming that you got the missing data issue Kevin showed, this would likely be a good choice for using the Time series platform - under modeling. You can take out isolate seasonal effects from linear trends in revenue. This also gives the ability to do forecasting.