I'm looking at some process data from a manufacturing plant. Specifically, I have data every minute from up to 10 meters that should read the same, but which are slightly different in reality. The meters are in different parts of the plant and are sometimes taken offline, which produces nonsense values, so I've filtered those out. This means that my data set has lots of gaps in each individual meter reading. There's also noise in some of the meters, related (I think) to startup/shutdown of lines, that I haven't managed to filter out completely.
What I'd like to do is:
Identify the "most likely" reading. Currently, I'm taking the median of the active values (the mean is more strongly impacted by individual line disturbances). Is there something smarter I could do?
Identify meters that have "drifted" - picked up a constant (or proportional) offset from the "true" value. At the moment, my best idea is to compare an individual meter to the median of all the others and see if there's a consistent offset. An alternative thought would be to build parity plots for the overall median against each individual meter and see if the slope of the line through (0,0) is not equal to 1. This works, and I guess I could try to script it, but it seems like there should be a smarter way to tackle this...
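For reference, the two comparisons described above could be scripted roughly as follows. This is only a sketch using the Python standard library: it assumes each time step is a list of readings with None for the filtered-out gaps, and the function names (consensus, drift_estimates), the min_points cutoff and the sample values are made up for illustration. The offset is taken as the median difference from the median of the other meters, and the slope is a least-squares fit through (0,0).

```python
from statistics import median

def consensus(row, skip=None):
    """Median of the non-missing readings in one time step.

    `skip` optionally leaves one meter (by column index) out of the median.
    """
    vals = [v for i, v in enumerate(row) if v is not None and i != skip]
    return median(vals) if vals else None

def drift_estimates(rows, meter, min_points=3):
    """Compare one meter with the median of all the other meters.

    Returns (offset, slope), or None if there are too few usable time steps.
    offset: median of (meter - median of others), a constant-offset estimate.
    slope:  least-squares slope of the parity line forced through (0,0).
    """
    pairs = []
    for row in rows:
        y = row[meter]
        x = consensus(row, skip=meter)
        if y is not None and x is not None:
            pairs.append((x, y))
    if len(pairs) < min_points:
        return None
    offset = median([y - x for x, y in pairs])
    sxx = sum(x * x for x, _ in pairs)
    slope = sum(x * y for x, y in pairs) / sxx if sxx else None
    return offset, slope

# Made-up readings for four meters; None marks a filtered-out gap.
rows = [
    [10.0, 10.1, 9.9, 12.0],
    [20.0, None, 20.1, 22.0],
    [30.1, 30.0, 29.9, 32.0],
    [None, 40.0, 40.1, 42.1],
]
for m in range(4):
    print(m, drift_estimates(rows, m))
```

One thing to watch: a constant offset also pulls the through-origin slope away from 1, so the two numbers need to be read together before calling a drift "proportional".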
Any thoughts or suggestions would be gratefully accepted!
Can you send any commands to the meters? Can you tell them to disconnect and get a true zero value? Or change the scale by a factor of 10? You might get insight into a constant vs proportional error.
Do the meters use batteries? Could battery age account for drift? Do you know when batteries are replaced? Records over time might be useful.
I'd suspect the physical connections to the equipment as well as the meter itself. Depending on the environment (electrical noise, chemicals, vibration, temperature, etc.) you might need other engineering solutions to lock it down better.
Is the drift cyclical? Could it be time-of-day or temperature related? Is the drift a step change or a slow change over time?
Thanks for your thoughts. No, I can't interact with the system directly, but I can manipulate the data after the fact, so I can amplify the numbers if required.
I agree with your suggested root causes and the fact that the drifting will develop over time. I do have the historical data, but what I'm trying to do is mine that data to see if there is a "fault" in one of the meters. Now that I've found a drift in at least one of the meters, I can get that one fixed - I'm trying to plan for the next issue and come up with a quicker way to identify this type of problem.
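One way to catch the next drift sooner might be a rolling check on each meter's residual (its reading minus the median of the other meters). The sketch below is an assumption-laden illustration, not a tested monitoring scheme: the function name first_drift_alarm, the window length, the threshold and the min_valid count are all hypothetical and would need tuning against the historical data. It uses the median over a sliding window so that short startup/shutdown spikes and gaps (None) don't trigger it.

```python
from collections import deque
from statistics import median

def first_drift_alarm(residuals, window=60, threshold=0.5, min_valid=30):
    """Return the index of the first time step at which the rolling median
    of the residuals exceeds `threshold` in absolute value, or None.

    residuals: one value per time step (meter minus median of the others),
               with None where the meter or the reference was unavailable.
    window:    number of most recent time steps to look at.
    min_valid: minimum number of non-missing values required in the window.
    """
    recent = deque(maxlen=window)  # oldest entries fall off automatically
    for t, r in enumerate(residuals):
        recent.append(r)
        valid = [v for v in recent if v is not None]
        if len(valid) >= min_valid and abs(median(valid)) > threshold:
            return t
    return None

# Made-up residual series: in agreement, then a gap, then a step offset.
series = [0.0] * 50 + [None] * 5 + [1.0] * 50
print(first_drift_alarm(series, window=20, threshold=0.5, min_valid=10))
```

Because the alarm only fires once more than half of the valid values in the window are offset, a longer window gives fewer false alarms but a slower response. A slow drift shows up later than a step change of the same final size.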