Thank you for your comprehensive answer and explanations. They have been very helpful to me. I will try to clarify any uncertainties and answer the questions you asked. To avoid confusion, I have numbered my answers.
1. In the Evaluate Design section I entered 0.05 for the significance level, 1 for the anticipated coefficients and 1 for the RMSE. I entered these values before running the experiments, since I had no information about what RMSE or coefficients I could expect. I obtained a power of 0.92 (see the photo Evaluate design below). I am aware that this power would apply only if the RMSE really were around 1; I just wanted to demonstrate that the design looked good before the experiments. Of course, if I enter the values I actually got for the RMSE and the coefficients, the power is much lower (see the photo Evaluate design 2 below).
2. I don't know for certain, but I expect around four main effects to be active. However, Mr. Jones and Mr. Nachtsheim stated in an article from 2016 (Bradley Jones & Christopher J. Nachtsheim (2016): Effective Design-Based Model Selection for Definitive Screening Designs, Technometrics) that in the analysis of a DSD with 6 factors and 4 additional runs (using the new approach for DSD analysis) all 6 main effects could be detected with a power of 0.99. Looking at that number, I don't see a problem with having many active main effects. Or am I missing something?
3. Yes, this experiment contains mixture components. One of the factors is the ratio of non-solvent volume to final volume (the final volume consists of non-solvent and water). I added the solution to the non-solvent with a peristaltic pump. Do you think it would be better to keep that ratio constant? I suspect this could be the reason for the large RMSE. However, I cannot afford to run many more experiments, so I need to mitigate this problem (if that is indeed the cause), if possible.
4. Yes, this experiment was fully randomised. I have no hard-to-change factors, and I had relatively good control of the factors during the experiment.
5. I am aware that there are also other factors that could affect the response. However, I tried to control every factor that I could, so I don't see a big problem in my experiment here.
6. You are correct that the run with the highest yield is an outlier. That is also very clear from the Studentised Residuals (see the photo Residuals below). However, when I look at the residual analysis of the other responses, the outlier is not present, but the RMSE is still large. So I think there must be an additional reason for the large RMSE.
7. I also have a new question. While experimenting with the Fit Definitive Screening analysis, I noticed that the RMSE is lower if I include the fake factors in the analysis (see the photo Analysis with fake factors below; the left side shows the estimates I got, and the right side shows the selection of response and factors). Am I allowed to do this?
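To illustrate why the power in point 1 falls when a larger RMSE is entered, here is a minimal sketch in Python (not JMP). It uses a large-sample normal approximation for a two-sided test on a single main effect whose column is orthogonal to the other model terms. The function name `approx_power` and the value of 10 nonzero ±1 entries in the factor column are hypothetical and chosen only for illustration. JMP bases its power calculation on the actual design and its error degrees of freedom, so this sketch will not reproduce the 0.92 reported above and will tend to overstate power when few error degrees of freedom are available.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(coef, rmse, n_nonzero, alpha=0.05):
    """Approximate two-sided power for one main effect.

    Assumes the factor column has n_nonzero entries of +/-1, is orthogonal
    to the other model terms, and that the error degrees of freedom are
    large enough for a normal approximation (an optimistic assumption
    for small screening designs).
    """
    std_normal = NormalDist()
    se = rmse / sqrt(n_nonzero)            # standard error of the estimate
    delta = abs(coef) / se                 # effect size in standard-error units
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands in either rejection region
    return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)


# Hypothetical column with 10 nonzero entries and a coefficient of 1
for rmse in (1.0, 2.0, 4.0):
    print(f"RMSE = {rmse}: approximate power = {approx_power(1.0, rmse, 10):.3f}")
```

Only the ratio of the coefficient to the RMSE enters the calculation, so doubling the RMSE has the same effect on power as halving the anticipated coefficient.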
Thank you.
Danijel