Hi Dale
Unfortunately, even when using the difference as the response variable, none of the regression models was able to return a good result.
So at this point maybe I need to assume that our X variables are not good predictors of the response.
Anyway, the doubt remains whether, for data like ours (sensor data), we have to split training/test by taking the time axis into account, or whether we can use the random split technique instead. In the literature I found several papers that use a random training/test split for sensor data. But if we do the same, the results seem really good until we deploy the model to production, only to discover that the model is completely wrong. So where is the truth?
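To make the two options concrete, here is a minimal sketch of both splits in plain Python. The row layout (a timestamp "t" and a single sensor value "x") and the function names are made up for illustration; they are not our real data or any library's API.

```python
import random

def random_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then cut: train and test end up interleaved in time."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def chronological_split(rows, test_fraction=0.25):
    """Sort by timestamp, then cut: no test row is earlier than any train row."""
    ordered = sorted(rows, key=lambda r: r["t"])
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

# Hypothetical sensor log: one row per timestamp t with one sensor reading x.
rows = [{"t": t, "x": random.Random(t).random()} for t in range(100)]

rand_train, rand_test = random_split(rows)
chrono_train, chrono_test = chronological_split(rows)

# Chronological split: the test set lies entirely in the "future" of the train set.
print(max(r["t"] for r in chrono_train), min(r["t"] for r in chrono_test))
# Random split: check whether some test rows fall between training rows in time.
print(min(r["t"] for r in rand_test) < max(r["t"] for r in rand_train))
```

With the chronological split the model is always evaluated on rows recorded after everything it was trained on, which is closer to what happens in production.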
Attached you can find a paper that applies machine learning techniques to a semiconductor tool (an etcher). The authors attempt predictive maintenance by looking at the sensor data, so the data should be very similar to ours. If you look at the part of the paper highlighted in green, it looks like they split the data by taking the time axis into account, treating the data as time series.
I hope that more people besides Dale can have their say and help us to solve this dilemma.
Are data like ours multiple time series or not?
It is true that the predictors X and the response Y depend on time. But if we want to predict Y from the X values, why should the time variable be so important? To me, Y should depend on the X variables, which are set automatically by the machine during the process phase. Of course this happens chronologically, but the X values depend on the recipe and the machine state. So I have some difficulty accepting the dependency on time.
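One scenario where time could still matter, even if Y is driven by X, is a slow drift in the machine that none of the X variables record. The sketch below is purely synthetic and only an assumption about how such a drift might look: the coefficients, the linear drift, and the helper names are invented for illustration and say nothing about our actual data.

```python
import random

rng = random.Random(42)
n = 200
# Synthetic data: y depends on x, plus a slow drift that no x variable records.
t = list(range(n))
x = [rng.random() for _ in t]
y = [2.0 * xi + 0.01 * ti for xi, ti in zip(x, t)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def rmse(idx_train, idx_test):
    """Fit on the training indices, return root mean squared error on the test indices."""
    a, b = fit_line([x[i] for i in idx_train], [y[i] for i in idx_train])
    errs = [(y[i] - (a * x[i] + b)) ** 2 for i in idx_test]
    return (sum(errs) / len(errs)) ** 0.5

idx = list(range(n))
shuffled = idx[:]
rng.shuffle(shuffled)

# Random split: 50 shuffled rows for test, the remaining 150 for training.
rmse_random = rmse(shuffled[50:], shuffled[:50])
# Chronological split: first 150 rows for training, last 50 for test.
rmse_chrono = rmse(idx[:150], idx[150:])

print(f"random split RMSE:        {rmse_random:.2f}")
print(f"chronological split RMSE: {rmse_chrono:.2f}")
```

In this setup the model never sees the time variable, yet the chronological split reports the larger error, because the random split lets the model learn the average drift level from rows that surround the test rows in time. Whether our machines behave like this is exactly the open question.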
Happy Easter to the whole community.
Felice