Interesting, but I am confused.
My thoughts: How can run order be a factor? Let's say it is significant, and the 3rd level gives the best results. How can you possibly operate with this knowledge? How would you run this process? Accordingly, you also could optimize the interaction of X1 and X2 in a practical sense. Now, if you say factor X2 is time, then you have a continuous factor with multiple levels. Understandably, you would have a restriction on randomization, as X2 can't be randomized. What is changing in time?
Another thought is to just run each level of factor X1 4 times (in order) for a total of 12 runs. Essentially an OFAT (one-factor-at-a-time) study over time.
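To make the 12-run idea concrete, here is a rough sketch of what the run sheet could look like. I am assuming X1 has 3 levels (the labels "A", "B", "C" are made up) and that "in order" means cycling through the levels at each of 4 time points. If you meant running all 4 repeats of one level before moving to the next, swap the two loops.

```python
from itertools import product

def ofat_over_time(levels=("A", "B", "C"), n_times=4):
    """Build a fixed-order (non-randomized) run sheet.

    levels  : hypothetical labels for the levels of X1
    n_times : number of time points at which every level is run
    Returns a list of (run_number, time_point, level) tuples.
    """
    runs = []
    # The outer index is the time point, the inner index is the X1 level,
    # so every level is run once per time point, always in the same order.
    pairs = product(range(1, n_times + 1), levels)
    for run_number, (time_point, level) in enumerate(pairs, start=1):
        runs.append((run_number, time_point, level))
    return runs

plan = ofat_over_time()
for run_number, time_point, level in plan:
    print(f"run {run_number:2d}  time {time_point}  X1 = {level}")
```

Since nothing is randomized here, any drift over time is tangled up with the position of each level within a time point, which is worth keeping in mind when you look at the results.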
Another thought is to do sampling (vs. DOE) of each level over time, not restricted to 4 specific times.
I'm sure others will chime in.
"All models are wrong, some are useful" G.E.P. Box