
How to prepare data for calculation of odds ratios?

irinastl

Occasional Contributor

Joined:

May 9, 2017

Hello,

I have a large genomics database on two groups of patients (continuous variables). One group developed a complication at some point during observation; this group has 4 repeated measures of gene expression collected. The control group has only 3 time points collected. I would like to know whether expression of some genes at baseline (week 0) predicts outcomes: complication (coded yes/no) or death (coded 1 or 2). How do I prepare the data for analysis, and which model should I choose? Do I need to compare LSMeans per subgroup? In this data I have genes in columns and patient IDs in rows.

5 REPLIES
jiancao

Staff

Joined:

Jul 7, 2014

You can start with logistic regression. The data should consist of a cross-section of patients with complication and death columns (coded either as 1/0 or Y/N and modeling type set as nominal), a treatment/control group indicator (coded as 1/0), and columns for gene expression at baseline and other time points, as well as any other variables that you have available such as patient demographics.
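As a rough sketch of that layout (outside JMP, in plain Python), the snippet below reduces repeated measures to one row per patient with baseline expression and a 1/0 outcome. The patient IDs, gene names (`GENE_A`, `GENE_B`), week values, and expression numbers are all made up for illustration; only the idea of "one row per patient, baseline columns, binary outcome" comes from the description above.

```python
# Hypothetical records: one row per patient per time point (values are made up).
records = [
    {"patient_id": "P1", "week": 0, "GENE_A": 5.2, "GENE_B": 7.1, "complication": "yes"},
    {"patient_id": "P1", "week": 4, "GENE_A": 5.9, "GENE_B": 6.8, "complication": "yes"},
    {"patient_id": "P2", "week": 0, "GENE_A": 4.1, "GENE_B": 7.4, "complication": "no"},
    {"patient_id": "P2", "week": 4, "GENE_A": 4.0, "GENE_B": 7.9, "complication": "no"},
]


def baseline_table(rows, genes, baseline_week=0):
    """Keep one row per patient: baseline expression plus a 1/0 outcome."""
    table = {}
    for row in rows:
        if row["week"] != baseline_week:
            continue  # drop later time points for the baseline-only question
        entry = {f"{gene}_wk{baseline_week}": row[gene] for gene in genes}
        entry["complication"] = 1 if row["complication"] == "yes" else 0
        table[row["patient_id"]] = entry
    return table


baseline = baseline_table(records, genes=["GENE_A", "GENE_B"])
for patient_id, row in baseline.items():
    print(patient_id, row)
```

The later time points could be kept as extra columns in the same way (for example `GENE_A_wk4`) if they are to be used as model effects.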

You can then fit a nominal logistic regression with Complication (and Death) as the nominal outcome and these columns as model effects: Treatment/Control, Gene Expression at different time points, and other variables. You might want to consider interactions such as Treatment/Control*Gene Expression at Baseline, etc.

JMP Documentation on odds ratios from logistic regression

http://www.jmp.com/support/help/13/Odds_Ratios_Nominal_Responses_Only.shtml#65579
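To make the link between a logistic fit and an odds ratio concrete, here is a minimal sketch in plain Python. It is not JMP's implementation: it fits a single-predictor model by Newton-Raphson and reports exp(slope) as the odds ratio per one-unit increase in baseline expression. The expression values and outcomes are invented for illustration.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up baseline expression of one gene and a 1/0 complication outcome.
x = [4.1, 4.5, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)  # odds ratio per one-unit increase in expression
print(f"slope = {b1:.3f}, odds ratio per unit = {odds_ratio:.3f}")
```

Note that this toy data is deliberately not perfectly separated by expression level. When the classes are perfectly separated, the estimates drift toward infinity and the fit does not converge, which is one common cause of convergence errors in logistic regression.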

irinastl

Occasional Contributor

Joined:

May 9, 2017

My response variable has 3 levels of treatment complication: 1. early complications, 2. late complications, and 3. no complications. So logistic regression probably won't work: I am getting a "failed to converge (step-halving limit)" error. The treatment type is the same in both groups. Has anybody used Partial Least Squares to predict disease outcome from biomarkers?
chris_kirchberg

Joined:

May 28, 2014

Hi,

If you have a large genomics database, I am assuming that you may have tens of thousands of genes and thus tens of thousands of columns. If so, you may want to take a look at JMP Genomics:

 

https://www.jmp.com/en_us/software/jmp-genomics.html

 

Also, you can look at a similar use case by Matthew J. Wongchenko et al., where they looked at different treatment groups and progression-free survival. Link to the paper below.

 

http://clincancerres.aacrjournals.org/content/early/2017/05/23/1078-0432.CCR-17-0172

 

If you have not used JMP Genomics before, then take a look at this short video:

 

https://www.youtube.com/watch?v=DmaKz4NOURk

 

Let me know if you would like to know more.

 

Best,

Chris Kirchberg

irinastl

Occasional Contributor

Joined:

May 9, 2017

Our university does not have a subscription to JMP Genomics, but thanks for the article.
Ted

Community Trekker

Joined:

Mar 29, 2016

In JMP you can run logistic regression on a response with more than two levels:

https://community.jmp.com/t5/Discussions/Logistic-regression-with-multiple-outcome-variables/td-p/49...

but I would start with a two-level response on aggregated data and see what happens:

1. Complications Yes ( Complications Early and Late)

2. Complications No
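A small sketch of that aggregation in plain Python, assuming the three-level outcome is stored with the hypothetical labels "Early", "Late", and "None" (the labels and the example values are made up):

```python
# Hypothetical three-level outcome, one label per patient (values are made up).
outcomes = ["Early", "Late", "None", "None", "Early", "Late", "None"]

# Early and Late complications both map to 1 (Yes); no complication maps to 0 (No).
TWO_LEVEL = {"Early": 1, "Late": 1, "None": 0}


def collapse(labels):
    """Recode the three-level outcome as 1/0; unknown labels raise KeyError."""
    return [TWO_LEVEL[label] for label in labels]


binary = collapse(outcomes)
counts = {"Yes": sum(binary), "No": len(binary) - sum(binary)}
print(binary)
print(counts)
```

Checking the Yes/No counts before fitting is worthwhile, since a very small group in either level can make a logistic fit unstable.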