Welcome to the community. You are correct that having a nominal response poses some particular challenges, and Pete's advice is right on. One of those challenges is that the response may be the result of multiple failure mechanisms; a simple pass/fail outcome lacks the discrimination to separate those mechanisms (which might have different causal structures).
Of course you can use experimentation to better understand potential causal relationships between the independent variables and the dependent response variable(s). This is true regardless of the response variable. Now, experimentation may not be very resource-efficient for nominal responses, but it will be effective nonetheless. My first piece of advice would be to consider how to quantify the "phenomenon" (response). What does a response of yes or pass mean? Pass what? As an example, Dr. Box describes an experiment on cracked springs in one of his papers ("The Scientific Context of Quality Improvement"). He was able to use the % good as a response and discover ways to improve the spring performance. Are there alternative ways to quantify the phenomenon? As suggested, if you are able to detect gradations or categories of "goodness", then you might be able to use an ordinal scale. In my discussions with Dr. Taguchi, he always emphasized the importance of developing appropriate response variables...in many cases creating a new response variable to provide insight into the problem at hand.
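To make the % good idea concrete, here is a minimal sketch of how main effects fall out of a two-level factorial when the response is fraction good, in the spirit of Box's cracked-spring example. The factor labels and all numbers below are hypothetical, invented purely for illustration; they are not Box's actual spring data.

```python
# Hypothetical 2^3 factorial: three factors (A, B, C) coded at -1/+1
# ("bold" levels), response = fraction of good (uncracked) parts per run.
runs = [
    (-1, -1, -1, 0.67),
    (+1, -1, -1, 0.79),
    (-1, +1, -1, 0.61),
    (+1, +1, -1, 0.75),
    (-1, -1, +1, 0.59),
    (+1, -1, +1, 0.90),
    (-1, +1, +1, 0.52),
    (+1, +1, +1, 0.87),
]

def main_effect(runs, factor_index):
    """Average response at the high level minus average at the low level."""
    high = [r[3] for r in runs if r[factor_index] == +1]
    low = [r[3] for r in runs if r[factor_index] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

for name, i in [("A", 0), ("B", 1), ("C", 2)]:
    print(f"Effect of {name}: {main_effect(runs, i):+.3f}")
```

With a large effect (factor A here), even a crude % good response can point at the dominant lever, which is exactly what made the spring study work.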
In any case, let's assume you only have the nominal response. What failure rate currently exists? If the failure rate is high (perhaps >10%), then you ought to be able to detect a change in that rate fairly easily with experimentation. If the failure rate is low (<5%), then experimentation could be inefficient, as you might need large sample sizes to detect differences. Also, given the nature of the response, make sure the design space is large (lots of potential factors set at bold levels).
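A rough sketch of why low failure rates are so expensive: the sample size needed per experimental condition to detect a shift between two proportions can be approximated with the standard normal-approximation formula. The specific rates below are illustrative choices, not numbers from your process, and the formula assumes a two-sided test at alpha = 0.05 with 80% power.

```python
import math

def n_per_condition(p1, p2):
    """Approximate units needed per condition to detect a shift in failure
    rate from p1 to p2 (two-sided two-proportion z-test, alpha = 0.05,
    power = 0.80, normal approximation)."""
    z_a = 1.959964  # z for alpha = 0.05, two-sided
    z_b = 0.841621  # z for 80% power
    pbar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Halving a high failure rate (10% -> 5%) takes roughly 435 units per condition;
print(n_per_condition(0.10, 0.05))
# halving a low one (2% -> 1%) takes well over 2000 units per condition.
print(n_per_condition(0.02, 0.01))
```

The same proportional improvement costs several times the sample when the baseline rate is small, which is the practical argument above for bold factor levels and a large design space.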
"All models are wrong, some are useful" G.E.P. Box