CEFOG
Level I

Which Parameter(s) affects the response variable the most?

Hi

I'm quite new to JMP Pro and therefore not very familiar with how to deal with the following:

CEFOG_0-1675955027553.png

(I have also inserted the <JSL> below)

 

 

I have investigated 3 parameters of a fermentation in different combinations (my DoE)
and quantified the protein concentration (that is my response variable).

Is it possible to see what parameters are the most significantly affecting the protein concentration?

 

pH     Temperature    Dilution Rate    Protein Concentration
5.5    32             0.1              0.002989482
6.5    32             0.1              0.003129305
4.5    26             0.1              0.002990589
5.5    26             0.2              0.00247863
5.5    37             0.1              0.002470013
6.5    37             0.2              0.000480857
4.5    32             0.2              0.002055102
5.5    32             0.2              0.003187044

 

Thank you in advance.

best regards,

 

Carl

 

 

3 REPLIES
statman
Super User

Re: Which Parameter(s) affects the response variable the most?

Welcome to the community.  In the future, it is best if you just add the JMP table.  

First question: How much of a change in the Protein Concentration is meaningful in a scientific or engineering sense?  This is called practical significance.  Does the response variable vary enough to be interesting?  What is the smallest increment of change you would be interested in?  Are you trying to maximize, minimize or hit a target?

Second, you have a fractional factorial with no direct estimate of MSE (e.g., no replication).  Two factors are 3-level (and one of those is unbalanced?) and the third is 2-level.  What model are you interested in? In the initial analysis of an unreplicated design there will be no p-values, so look at the Pareto plot, Normal plot, Prediction Profiler and interaction plots for interesting effects.  Note: You have confounded 2nd order linear effects with quadratic terms.  Also, it is always good to keep track of run order.
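For readers who want to see the arithmetic behind a Pareto-style ranking, here is a minimal sketch in plain Python rather than JMP. It is not the Fit Model script attached to this reply. It assumes a main-effects-only model, codes each factor so that its lowest and highest settings map to -1 and +1 (an assumed scaling), and ranks the least-squares coefficients by absolute size. The helper names `code` and `solve` are hypothetical.

```python
# Data copied from the table in the question:
# (pH, temperature, dilution rate, protein concentration)
runs = [
    (5.5, 32, 0.1, 0.002989482),
    (6.5, 32, 0.1, 0.003129305),
    (4.5, 26, 0.1, 0.002990589),
    (5.5, 26, 0.2, 0.00247863),
    (5.5, 37, 0.1, 0.002470013),
    (6.5, 37, 0.2, 0.000480857),
    (4.5, 32, 0.2, 0.002055102),
    (5.5, 32, 0.2, 0.003187044),
]
names = ["pH", "Temperature", "Dilution Rate"]


def code(values):
    """Scale a factor so its lowest setting is -1 and its highest is +1."""
    lo, hi = min(values), max(values)
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return [(v - mid) / half for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Model matrix: intercept plus the three coded main effects (assumed model).
coded = [code([run[j] for run in runs]) for j in range(3)]
y = [run[3] for run in runs]
X = [[1.0] + [coded[j][i] for j in range(3)] for i in range(len(runs))]

# Normal equations (X'X) b = X'y give the least-squares coefficients.
xtx = [[sum(row[p] * row[q] for row in X) for q in range(4)] for p in range(4)]
xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(4)]
coef = solve(xtx, xty)

# Rank the main effects by the absolute size of their coded coefficients.
effects = dict(zip(names, coef[1:]))
ranking = sorted(effects, key=lambda k: abs(effects[k]), reverse=True)
for name in ranking:
    print(f"{name:14s} {effects[name]:+.6f}")
```

With only 8 unreplicated runs this ranking is a screening aid, not a significance test, and the ordering still has to be judged against what counts as a practically meaningful change.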

 

I have added your data table with the following scripts: Design Evaluation (for estimating the aliasing, power, etc.) and Fit Model (for initial assessment).  Simply click on the green arrows in the upper left zone.  The outputs need interpretation by a subject-matter expert (SME).
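As a rough stand-in for the aliasing part of a design evaluation, the sketch below (plain Python, not the JMP script) computes pairwise correlations between candidate model columns built from the 8 runs in the question. The -1/+1 coding, the choice of candidate terms and the helper names `code` and `corr` are assumptions made for illustration. Dilution Rate has only two settings, so its squared term is constant after coding and is left out.

```python
from itertools import combinations
from math import sqrt

# Factor settings copied from the question: (pH, temperature, dilution rate)
design = [
    (5.5, 32, 0.1), (6.5, 32, 0.1), (4.5, 26, 0.1), (5.5, 26, 0.2),
    (5.5, 37, 0.1), (6.5, 37, 0.2), (4.5, 32, 0.2), (5.5, 32, 0.2),
]


def code(values):
    """Scale a factor so its lowest setting is -1 and its highest is +1."""
    lo, hi = min(values), max(values)
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return [(v - mid) / half for v in values]


def corr(u, v):
    """Pearson correlation between two equal-length columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


ph, temp, dil = (code([run[j] for run in design]) for j in range(3))

# Candidate terms of a second-order model (assumed list, for illustration).
terms = {
    "pH": ph,
    "Temp": temp,
    "Dil": dil,
    "pH*Temp": [a * b for a, b in zip(ph, temp)],
    "pH*Dil": [a * b for a, b in zip(ph, dil)],
    "Temp*Dil": [a * b for a, b in zip(temp, dil)],
    "pH^2": [a * a for a in ph],
    "Temp^2": [b * b for b in temp],
}

# Sort term pairs by absolute correlation; large values flag partial aliasing.
pairs = sorted(
    ((abs(corr(terms[a], terms[b])), a, b) for a, b in combinations(terms, 2)),
    reverse=True,
)
for r, a, b in pairs[:6]:
    print(f"|r| = {r:.2f}  {a} vs {b}")

# Intercept plus these terms is more parameters than there are runs.
print(len(terms) + 1, "parameters vs", len(design), "runs")
```

Pairs near the top of the list cannot be cleanly separated with these runs. The last line shows why the full second-order model cannot be fit at all: it needs 9 parameters and there are only 8 runs.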

"All models are wrong, some are useful" G.E.P. Box

Re: Which Parameter(s) affects the response variable the most?

Where did you obtain this design? Did you use one of the DOE platforms in JMP to make it?

 

Your question, "Is it possible to see what parameters are the most significantly affecting the protein concentration?" suggests that you are screening factors or effects at this stage. You are not necessarily attempting to model the response to optimize it through factor settings. An important question in such a case is, what is my chance of detecting a real effect? A power analysis can answer this. Here is the analysis of your design:

 

power.PNG

 

Power is the chance that you will detect a real effect. The desired level of power is subjective, but higher power is good. It depends on the effect size, the variance in the response, the significance level for decisions, and the number of runs. My analysis reveals that your chance of finding a real effect is around 80-97% IF the effect is about 20 times greater than the standard deviation.

P_Bartell
Level VIII

Re: Which Parameter(s) affects the response variable the most?

To add a bit to @Mark_Bailey and @statman's sage advice: for questions such as yours, the very first thing you should do is plot the data in a variety of forms. I suggest, in no particular order: the responses in the order of experimentation, a simple histogram of the responses, and Fit Y by X for each x and y. Here's why:

 

1. Plotting in order of experimentation is your best pre-modeling shot at seeing whether some lurking factor crept in to wreak havoc with what you are really after.

2. A histogram of the responses helps with identifying the center, spread and shape of the distribution of the responses. Does the histogram look reasonable given your process knowledge? Are there any responses that might make subsequent modeling problematic? Outliers? Nonsense values? Missing values? etc.

3. Lastly, the Fit Y by X plots can give you some hints regarding the structure of the model. In addition, if there appear to be relationships, do they make sense with respect to your a priori process knowledge? What I always used to say to the engineers I worked with when examining these plots was something like, "If the plot suggests water runs uphill, we've got a problem and should stop and pause to reconcile this apparent conclusion."

 

Only after doing the above and being satisfied with the results would I proceed to modeling.
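The three checks above are meant to be done graphically in JMP. As a text-only stand-in, here is a minimal Python sketch that lists the responses in table order, summarizes their center and spread, and prints the mean response at each setting of each factor (a crude numeric substitute for the Fit Y by X plots). It assumes the row order in the question is the run order, which the post does not confirm.

```python
from collections import defaultdict
from statistics import mean, stdev

columns = ("pH", "Temperature", "Dilution Rate")

# Data copied from the table in the question; the last value is the response.
runs = [
    (5.5, 32, 0.1, 0.002989482),
    (6.5, 32, 0.1, 0.003129305),
    (4.5, 26, 0.1, 0.002990589),
    (5.5, 26, 0.2, 0.00247863),
    (5.5, 37, 0.1, 0.002470013),
    (6.5, 37, 0.2, 0.000480857),
    (4.5, 32, 0.2, 0.002055102),
    (5.5, 32, 0.2, 0.003187044),
]
y = [run[3] for run in runs]

# 1. Responses in table order (ASSUMED to be the run order).
for i, value in enumerate(y, start=1):
    print(f"run {i}: {value:.6f}")

# 2. Center and spread of the responses.
print(f"n={len(y)} mean={mean(y):.6f} sd={stdev(y):.6f} "
      f"min={min(y):.6f} max={max(y):.6f}")

# 3. Mean response at each setting of each factor.
level_means = {}
for j, name in enumerate(columns):
    groups = defaultdict(list)
    for run in runs:
        groups[run[j]].append(run[3])
    level_means[name] = {lvl: mean(vals) for lvl, vals in sorted(groups.items())}
    for lvl, avg in level_means[name].items():
        print(f"{name} = {lvl}: mean response {avg:.6f} (n={len(groups[lvl])})")
```

Because the design is unbalanced and the factors are partly correlated, these level means are only hints. They are no substitute for looking at the plots and asking whether the pattern makes sense for the process.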