I'll give it a shot. Please excuse my "lecture".
There is no "one way" to build models. Some methods may be more effective or efficient in a given situation, but it all depends on the situation. Let me also suggest that the model must be useful.
"A good model is an approximation, preferably easy to use, that captures the essential features of the studied phenomenon and produces procedures that are robust to likely deviations from assumptions."
- G.E.P. Box
I use the Scientific Method (i.e., an iterative process of induction and deduction; examples include Deming's PDSA and Box's iterative learning process) as the basis for all investigations.
The impetus or motivation (e.g., reduce cost, improve performance, respond to a customer complaint, add function) should be stated, but take care: the stated problem may simply be a symptom. A decision should be made as to whether the study is meant to explain what is already occurring or to understand causal structure (I am biased toward understanding causality). In addition, the constraints (e.g., resources, sense of urgency) must be understood. Start with situation diagnostics (e.g., asking questions, observation, data mining (which may include different regression-type procedures), collaboration, potential solutions, measurement systems). What are the appropriate responses? Is there interest in understanding central tendency, variation, or both? Do the response variables appropriately quantify the phenomenon? I have seen countless cases where the problem was variation, but the response was a mean. That will never work.
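To make that last point concrete, here is a minimal sketch (made-up data, assuming NumPy) of two process settings that a mean response cannot distinguish; only a variation response exposes the problem:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical illustration: two process settings with the same mean but
# very different variation. A mean-based response is blind to the problem.
setting_a = rng.normal(loc=100.0, scale=1.0, size=50)
setting_b = rng.normal(loc=100.0, scale=8.0, size=50)

for name, y in [("A", setting_a), ("B", setting_b)]:
    print(f"setting {name}: mean = {y.mean():.1f}, sd = {y.std(ddof=1):.2f}")
# Means agree (~100); only the sd response reveals the variation problem.
```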
Develop hypotheses. Once you have a set of hypotheses, get data to provide insight into those hypotheses. The appropriate data collection strategy is situation dependent. If there are many hypotheses or factors (e.g., >15), the most efficient approach may be directed sampling studies (e.g., components of variation, MSE). The data should help indicate which components have the most leverage. As the list of "suspects" becomes smaller, perhaps experimentation can accelerate the learning. This will likely follow a sequence of iterations, but the actual "best" sequence is likely unknown at the start. Err on the side of starting with a large design space (e.g., lots of factors and bold levels). This will likely mean some sort of fractional experiment over that design space. Realize that the higher-order effects (e.g., interactions and non-linear terms) are still present; they may just be confounded. Let the data, and your interpretation of the insight the data provides, be your guide.
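As a small sketch of what confounding means in practice (hypothetical factors A-D, assuming NumPy), here is a 2^(4-1) fraction where the aliased interaction columns are literally the same column of the design matrix:

```python
import numpy as np
from itertools import product

# Hypothetical 2^(4-1) fractional factorial: run a full factorial in
# A, B, C and generate D = A*B*C, giving the defining relation I = ABCD.
base = np.array(list(product([-1, 1], repeat=3)))  # 8 runs in A, B, C
A, B, C = base.T
D = A * B * C
design = np.column_stack([A, B, C, D])
print(design)

# The higher-order effects have not disappeared -- they are confounded.
# With I = ABCD, the AB interaction is aliased with CD:
print(np.array_equal(A * B, C * D))  # True: AB and CD cannot be separated
```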
Regarding noise (e.g., incoming materials, ambient conditions, measurement error, operator techniques, customer use), I think this is the most often missed opportunity. Intuitively, our bias is to hold untested factors constant while we experiment. This is completely wrong: it creates a narrow inference space and reduces the likelihood the results will hold true in the future. Understanding noise is a close second to understanding a first-order model. Think about the largest experiment you have ever designed. Think about how many factors were in the study. What proportion of ALL the factors did that experiment cover? Unfortunately, the software is not really capable of providing direction on how to appropriately study the noise.
"Block what you can, randomize what you cannot", G.E.P. Box
Much is spent on the most efficient way to study the design factors (e.g., custom designs, optimality criteria, DSDs), but little is spent on incorporating noise into the study. I'm not sure what a complex model does for you if you are in the wrong place. I'm not sure this can be "automated" in a software program; it requires critical thinking.
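In the spirit of Box's advice above, here is a minimal sketch (hypothetical factor and lot names, assuming NumPy) of one way to incorporate noise: deliberately vary a noise factor as a block instead of holding it constant, and randomize run order within each block:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
# Hypothetical sketch: cross a 2^2 design in the controllable factors
# with a deliberately varied noise factor (two material lots as blocks).
design = list(product([-1, 1], repeat=2))  # controllable factors X1, X2
for lot in ["lot_A", "lot_B"]:             # noise varied, not held constant
    for i in rng.permutation(len(design)): # randomize within the block
        print(lot, design[i])
```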
Regarding your specific questions:
- Should the screening evaluate both main effects and interaction effects simultaneously or should you start with main effects and then follow-up with 2nd order interactions?
My advice is to develop models hierarchically. Your question might not be "evaluate" but rather "estimate" or "separate"? It all depends on what you know and what you don't. Err on the side of assuming you don't know much!
- If a main effect is found "insignificant" does this mean there is no potential for interaction effects as well?
Absolutely not. I give you F = ma: the response is driven entirely by the interaction of m and a, yet neither main effect may appear significant on its own.
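A tiny sketch of this (coded units, assuming NumPy): fit main effects and the interaction to a response that is pure interaction, and watch the main effects vanish while the interaction carries everything.

```python
import numpy as np

# Hypothetical demo of a pure-interaction response: F = m * a, with both
# factors at symmetric coded levels (-1, +1) in a 2x2 factorial.
m = np.array([-1.0, -1.0, 1.0, 1.0])
a = np.array([-1.0, 1.0, -1.0, 1.0])
F = m * a

# Fit intercept, main effects, and the interaction by least squares.
X = np.column_stack([np.ones(4), m, a, m * a])
coefs, *_ = np.linalg.lstsq(X, F, rcond=None)
print(coefs)  # ~[0, 0, 0, 1]: both main effects are zero, yet the
              # interaction carries the entire response
```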
- Should you add center-points to detect curvature at the screening phase?
Situation dependent. Do you suspect a non-linear effect? Why? Are you in a region of the design space where the non-linear effect is useful?
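If you do add center points, here is a minimal sketch (made-up numbers, assuming NumPy) of the standard curvature check: compare the center-point average against the factorial-point average.

```python
import numpy as np

# Hypothetical curvature check: a gap between the center-point mean and
# the factorial (corner) mean, well beyond the noise, suggests curvature.
y_factorial = np.array([12.1, 14.0, 13.8, 15.9])  # corner runs (made up)
y_center = np.array([16.2, 16.0, 16.4])           # center-point replicates

curvature = y_center.mean() - y_factorial.mean()
print(f"curvature estimate: {curvature:+.2f}")    # far from 0 -> suspect
                                                  # a quadratic term
```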
- Should you add replicates at this phase? (I've heard this may be unnecessary when large effects are being evaluated rather than low-variance differences.)
- When should replicates be added/focus toward variance reduction?
You need a noise strategy; replication is one of many. Again, the question is whether the noise in the design space is similar to the noise near where you want to end up. If so, study it now.
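If replication is part of that strategy, here is a small sketch (made-up responses, assuming NumPy) of pooling the within-condition variation into a pure-error estimate that effects can be judged against:

```python
import numpy as np

# Hypothetical replicated 2^2 design. Pooling replicate variation across
# conditions gives a pure-error estimate of sigma.
replicates = {
    (-1, -1): [10.2, 10.6],
    (-1, +1): [12.1, 11.8],
    (+1, -1): [13.0, 13.5],
    (+1, +1): [15.2, 14.7],
}
ss, df = 0.0, 0
for runs in replicates.values():
    y = np.asarray(runs)
    ss += ((y - y.mean()) ** 2).sum()
    df += len(y) - 1
print(f"pure-error sd: {np.sqrt(ss / df):.2f} on {df} df")
```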
The exact standardization of experimental conditions, which is often thoughtlessly advocated as a panacea, always carries with it the real disadvantage that a highly standardized experiment supplies direct information only in respect to the narrow range of conditions achieved by the standardization. Standardization, therefore, weakens rather than strengthens our ground for inferring a like result, when, as is invariably the case in practice, these conditions are somewhat varied.
- R. A. Fisher (1935), The Design of Experiments (pp. 99-100)
"All models are wrong, some are useful" G.E.P. Box