There really isn't enough information to give specific advice, but here are some options. There is always the question: are you trying to explain variation in the output, or to develop a causal understanding for predictive purposes?
Just curious: how do you know there is an interaction between sample thickness and the other factors? Or is this a hypothesis? What data do you have to support it?
Have you done any sampling? Sampling the preparation process could give you an understanding of its consistency and how much it varies.
Since you are "preparing the sample," can you experiment on how you prepare it, purposely varying the factors associated with preparation? If so, you can run an experiment on preparation as the whole plot (split each sample from this experiment into pieces) and then run the experiment on the other factors in the sub-plots of a split-plot design. A split-plot design gives increased precision for the sub-plot factors and for the whole-plot-by-sub-plot interactions, at the cost of reduced precision for the whole-plot factors.
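To make the randomization restriction concrete, here is a minimal sketch of how such a split-plot layout might be generated. The factor names (prep temperature and prep time as hypothetical preparation factors, factor_A and factor_B as hypothetical process factors) are placeholders, not anything from your process: the point is that whole plots are randomized as units, and the sub-plot combinations are randomized separately within each whole plot.

```python
import itertools
import random

random.seed(1)

# Hypothetical whole-plot (preparation) and sub-plot (process) factors,
# each at two coded levels. All names are placeholders.
whole_plot_factors = {"prep_temp": [-1, 1], "prep_time": [-1, 1]}
sub_plot_factors = {"factor_A": [-1, 1], "factor_B": [-1, 1]}

# Each preparation run is one whole plot; the whole plots are randomized
# as units (the restriction on randomization that defines a split plot).
whole_plots = list(itertools.product(*whole_plot_factors.values()))
random.shuffle(whole_plots)

# Within each whole plot the prepared sample is split into sub-plots,
# and the sub-plot treatment combinations are randomized separately.
runs = []
for wp_id, wp in enumerate(whole_plots):
    sub_runs = list(itertools.product(*sub_plot_factors.values()))
    random.shuffle(sub_runs)  # fresh randomization inside every whole plot
    for sp in sub_runs:
        runs.append({"whole_plot": wp_id,
                     **dict(zip(whole_plot_factors, wp)),
                     **dict(zip(sub_plot_factors, sp))})

# 4 whole plots x 4 sub-plot runs = 16 runs total
print(len(runs))
```

The run sheet in `runs` can then be executed in order; the `whole_plot` id is what later enters the analysis as the whole-plot error term.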
Another option, as you imply, is to make one experimental unit for each treatment, consisting of multiple samples. These samples are not independent of the treatments, but averaging them will damp the sample-thickness effect, and you can also take the variance of the samples as a second response variable to see whether your controllable factors affect the within-treatment variation (due mostly to sample thickness and measurement error).
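A sketch of that two-response idea, on simulated placeholder data (the treatment means, sample counts, and noise level are all invented for illustration): the per-treatment mean becomes the location response, and the within-treatment variance becomes a second, dispersion response.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical: 4 treatments, 5 non-independent samples measured per
# treatment. true_means and the noise level are made-up illustration values.
n_treatments, n_samples = 4, 5
true_means = np.array([10.0, 12.0, 11.0, 9.5])
y = true_means[:, None] + rng.normal(0, 0.5, size=(n_treatments, n_samples))

# Response 1: the treatment average, which damps the sample-thickness noise.
mean_response = y.mean(axis=1)

# Response 2: the within-treatment variance, analyzed as its own response
# to see whether the controllable factors affect within-treatment variation.
# Log-variance is often modeled instead, since variances are right-skewed.
var_response = y.var(axis=1, ddof=1)
log_var_response = np.log(var_response)

print(mean_response.round(2))
print(var_response.round(3))
```

You would then fit one model to `mean_response` and a second to `log_var_response` against the same factor settings, looking for factors that move the average versus factors that move the spread.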
Another option: if it is possible to sample to identify the two extremes of thickness, grab enough material at each extreme to run replicates and confound thickness with a block (low thickness = block -1, high thickness = block +1). You can then treat block as a fixed effect and add the block and block-by-factor interaction terms to the model. This depends greatly on your confidence in knowing what is confounded with the block. If the block effect or the block-by-factor interactions are significant, there are options for disaggregating the block. Otherwise, you can treat block as a random effect and use it as the basis for the statistical tests.
There is nothing wrong with measuring the thickness and including the actual measurement as a covariate, unless the thickness changes in response to the other factors. The measurement doesn't necessarily have to be taken before the experiment to use it as a covariate. The one potential issue I see is that you can only use one covariate value per treatment. If thickness varies within a sample, which value do you use? You could average the thickness measurements, but that may not be useful.
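The covariate route is just analysis of covariance. A minimal sketch on simulated placeholder data (effect sizes and the thickness distribution are invented), with thickness drawn independently of the treatment, which is the key assumption called out above:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical simulation: a two-level treatment plus measured thickness as
# a covariate. Thickness is measured, not set, and here it is generated
# independently of treatment (i.e., thickness does not respond to the
# other factors). All numeric values are made-up illustration values.
n = 100
treatment = rng.choice([-1.0, 1.0], size=n)
thickness = rng.normal(2.0, 0.2, size=n)       # one measured value per run
y = 3.0 + 1.2 * treatment + 2.5 * thickness + rng.normal(0, 0.4, size=n)

# Analysis of covariance: adjust the treatment comparison for thickness
# by including the measured value as a regressor.
X = np.column_stack([np.ones(n), treatment, thickness])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta.round(2))  # intercept, adjusted treatment effect, thickness slope
```

If thickness did change with the other factors, the adjusted treatment coefficient would absorb part of the thickness pathway, which is exactly why the caveat above matters.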
There is no one right way to run the experiment. Many options exist, each with its relative pluses and minuses. My advice is to design multiple options, consider what each lets you learn (e.g., model effects, randomization restrictions, aliasing), and weigh that against the associated resource requirements and constraints. Predict all possible outcomes and what the next iteration would be in each case. Then pick one and prepare to iterate.
"All models are wrong, some are useful" G.E.P. Box