Just another point of view on some of your comments, Victor. In particular:
For Type I and Type II studies, the selection of the product and corresponding batches (for Type II) is very important : for Type I, the product chosen should be representative of the future samples measured by the equipment, and for Type II the batches should all be conform and cover the specification range (from LSL to USL in order to be able to see some part-to-part variation, meaning your equipment is able to differentiate quite precisely your batches).
I agree completely that the selection of "samples" for your study is of particular importance; however, that selection depends on what questions you are trying to answer. For example, if you want to be able to detect within-batch variation, you will need multiple measurements within each batch in your study. Limiting samples to just the spec limit range may not adequately describe measurement system capability (don't assume the specs have anything to do with reality, as they are often derived independently of actual variation). If you want to understand whether the measurement system is adequate for providing insight into your hypotheses, the samples should be collected as a function of those hypotheses.
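To make the within-batch point concrete, here is a minimal sketch, assuming a balanced design (same number of repeated measurements in every batch) and a simple one-way random-effects model. The function name `variance_components` and the numbers in the usage note are my own illustration, not anything from Victor's study. Note that the "within" component here lumps together measurement repeatability and any real within-batch differences; separating those needs another level of nesting in the study.

```python
from statistics import mean


def variance_components(batches):
    """Method-of-moments estimates from a balanced one-way random-effects ANOVA.

    batches: dict mapping a batch label to a list of repeated measurements.
    Returns (between_batch_variance, within_batch_variance).
    Hypothetical helper for illustration only.
    """
    groups = list(batches.values())
    k = len(groups)          # number of batches
    if k < 2:
        raise ValueError("need at least two batches")
    n = len(groups[0])       # measurements per batch
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    if n < 2:
        # One measurement per batch leaves nothing to estimate
        # within-batch variation from.
        raise ValueError("need repeated measurements within each batch")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)

    # Mean squares between and within batches.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # Negative estimates are truncated at zero.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between, ms_within
```

With made-up data such as `{"A": [10.0, 10.2], "B": [11.0, 11.2], "C": [12.0, 12.2]}`, this gives a between-batch variance of roughly 0.99 and a within-batch variance of roughly 0.02. Pass it a single measurement per batch and it refuses to run, which is the whole point: the study design decides which questions you can answer.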
Keep in mind that conclusions about measurement system capability are conditional on how the study is performed and what comparisons are being made. Change either of those and the conclusions change with them. Measurement studies are seldom a one-shot event.
"All models are wrong, some are useful" G.E.P. Box