Marie-Eve
Level I

Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

Hello,

I'm new to JMP and I'm learning about statistical analysis. I would like some pointers on what I'm trying to achieve. 

 

Let's say I am working on expressing a protein in bacteria.


I have one response : Yield (mg/mL)

I have multiple (nominal) elements regulating the expression in the bacteria :

 

- Type of expression promoters : PromA, PromB, PromC, PromD, ... etc (I have 20+ choices) 

- Type of terminator : Term1, Term2, Term3, Term4, Term5

- Type of bacteria :  BactX, BactY, BactZ.

 

I have expressed the protein using many of the possible combinations of promoters, terminators and bacteria, and have measured the yield for each trial. Most combinations were tested several times (3 to 50 replicates per combination).

 

I already have a lot of data, and I would like to screen for important effects and relationships between the variables to understand how to maximize yield. I did not use the DOE feature to create the experiment; this is historical data.

 

Any hints on how I should proceed? I will be happy to watch videos or read about your suggestions.

Thank you !

statman
Super User

Re: Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

Welcome to the community @Marie-Eve !  Just a few thoughts:

1. How confident are you in the measurement system?

2. Are the multiple results for similar combinations repeats or replicates (are they independent data points or not)?

3. I might look at those data points first and then possibly summarize them to look at the factor combination effects.

4. Usually with historical data sets I start with Regression (standard Least Squares or Logistic):

https://www.jmp.com/support/help/en/16.0/?os=mac&source=application&utm_source=helpmenu&utm_medium=a...

 

https://www.jmp.com/support/help/en/16.0/?os=mac&source=application&utm_source=helpmenu&utm_medium=a...

Do you have a model in mind? Have you thought about what factors/combinations make sense?  Have you rank ordered the model effects? I'd start with first order models and then augment with interaction effects.

5. After you have "tortured the data", I would plan designed experiments to replicate the results.
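
For readers without JMP at hand, the idea in point 4 of rank ordering effects before building a model can be roughed out in plain Python. This is only a sketch under stated assumptions: the rows are made-up numbers, the column layout is hypothetical, and the score (eta-squared, the share of total variation explained by one factor taken alone) is a simple one-factor-at-a-time stand-in, not the Standard Least Squares fit mentioned above. It ignores interactions and unbalanced data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (promoter, terminator, bacteria, yield in mg/mL).
# The values are invented purely to illustrate the calculation.
rows = [
    ("PromA", "Term1", "BactX", 1.0),
    ("PromA", "Term2", "BactY", 1.2),
    ("PromB", "Term1", "BactX", 3.0),
    ("PromB", "Term2", "BactY", 3.2),
    ("PromA", "Term1", "BactY", 0.8),
    ("PromB", "Term2", "BactX", 2.8),
]

# Factor name -> column index in each row (assumed layout).
FACTORS = {"Promoter": 0, "Terminator": 1, "Bacteria": 2}


def eta_squared(rows, col, resp=3):
    """Share of total variation in the response explained by one factor alone."""
    grand = mean(r[resp] for r in rows)
    ss_total = sum((r[resp] - grand) ** 2 for r in rows)
    groups = defaultdict(list)
    for r in rows:
        groups[r[col]].append(r[resp])
    # Between-level sum of squares: level size times squared distance
    # of the level mean from the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    return ss_between / ss_total if ss_total else 0.0


# Rank factors from most to least variation explained.
ranking = sorted(
    ((eta_squared(rows, col), name) for name, col in FACTORS.items()),
    reverse=True,
)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

A factor that scores high here is a reasonable candidate for the first-order model; a low score does not rule it out, since it may still matter through an interaction.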

"All models are wrong, some are useful" G.E.P. Box
Marie-Eve
Level I

Re: Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

Hi statman,

 

1. There are a lot of uncontrolled variables because we work with living systems. The measurement itself is very accurate, but expression can vary from batch to batch.

2. They are independent data points from repeats that were conducted separately, week after week.

3. I have summarized the data and am currently working on the yield average for each combination. I will look further into this before dealing with each data point.

4. I have very little experience with statistical analysis and don't really understand the different models, so I have a lot of learning to do. I will start with the links you shared. From what I've seen, it makes sense to start there!

5. Good tip, I will certainly want to use DOE to replicate / confirm the results.

 

Thanks ! 

statman
Super User

Re: Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

One other bit of advice... Before a statistical examination of the data, make sure the data meets a practical significance threshold. Is the variation in the measurement enough to warrant any further analysis? If it is, first LOOK at the data. Use Graph Builder to look for patterns in the data and possible associations with the independent variables. Also examine the repeats for consistency before summarizing them (try simple range charts).
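
The consistency check on repeats can also be sketched outside JMP. The snippet below is a minimal illustration with invented yields and a hypothetical key layout (promoter, terminator, bacteria); it computes the count, mean and range for each combination, which are the numbers a simple range chart would plot. It does not compute control limits.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical replicate yields (mg/mL), keyed by (promoter, terminator, bacteria).
data = [
    (("PromA", "Term1", "BactX"), 1.0),
    (("PromA", "Term1", "BactX"), 1.5),
    (("PromA", "Term1", "BactX"), 1.25),
    (("PromB", "Term2", "BactY"), 3.0),
    (("PromB", "Term2", "BactY"), 5.0),
    (("PromB", "Term2", "BactY"), 3.5),
]


def replicate_summary(data):
    """Return replicate count, mean and range (max - min) per combination."""
    groups = defaultdict(list)
    for combo, y in data:
        groups[combo].append(y)
    return {
        combo: {"n": len(ys), "mean": mean(ys), "range": max(ys) - min(ys)}
        for combo, ys in groups.items()
    }


summary = replicate_summary(data)

# List combinations with the widest spread first, so inconsistent repeats stand out.
for combo, s in sorted(summary.items(), key=lambda kv: kv[1]["range"], reverse=True):
    print("/".join(combo), f"n={s['n']} mean={s['mean']:.2f} range={s['range']:.2f}")
```

Combinations whose range is much larger than the rest are worth a closer look before their replicates are averaged.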
"All models are wrong, some are useful" G.E.P. Box

Re: Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

Hi,

 

The video here describes a situation very similar to yours.  It uses the Model Screening platform in JMP Pro 16, but even if you don't yet have JMP Pro, this may help get you started in your analysis.

 

 

Marie-Eve
Level I

Re: Screen for important effects from multiple nominal variables with 1 continuous response, using historical data

Hi @HadleyMyers ,

 

This is EXACTLY what I'm trying to achieve. I'm a little bummed that I can't use JMP Pro, but it certainly helps to see how they process the data. Thanks !