Discussions

irinastl · Jul 3, 2017 12:22 PM

Hello,

I have a large genomics database on two groups of patients (continuous variables). One group developed a complication at some point during observation; this group has 4 repeated measures of gene expression. The control group has only 3 time points. I would like to know whether expression of some genes at baseline (week 0) predicts outcomes: complication (coded yes/no) or death (coded 1 or 2). How do I prepare the data for analysis, and which model should I choose? Do I need to compare LSMeans per subgroup? In this data, genes are in columns and patient IDs are in rows.

jiancao · Jul 10, 2017 05:46 PM

You can start with logistic regression. The data should consist of a cross-section of patients with complication and death columns (coded either as 1/0 or Y/N and modeling type set as nominal), a treatment/control group indicator (coded as 1/0), and columns for gene expression at baseline and other time points, as well as any other variables that you have available such as patient demographics.
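As a rough illustration of that layout, here is a minimal Python sketch that turns one-record-per-patient-per-time-point data into one row per patient. The field names (`patient_id`, `week`, `GENE_A`, `GENE_B`, `group`) and all values are made up for the example and are not from the original data; in JMP the same reshaping would be done through the table tools rather than code.

```python
# Hypothetical long-format input: one record per patient per time point.
records = [
    {"patient_id": "P01", "week": 0, "GENE_A": 5.2, "GENE_B": 7.1},
    {"patient_id": "P01", "week": 4, "GENE_A": 5.9, "GENE_B": 6.8},
    {"patient_id": "P02", "week": 0, "GENE_A": 4.1, "GENE_B": 7.9},
    {"patient_id": "P02", "week": 4, "GENE_A": 4.0, "GENE_B": 8.2},
]

# Hypothetical per-patient columns, keyed by patient ID (1 = yes, 0 = no).
outcomes = {
    "P01": {"group": 1, "complication": 1, "death": 0},
    "P02": {"group": 0, "complication": 0, "death": 0},
}


def to_cross_section(records, outcomes):
    """Return one wide row per patient: outcome columns plus GENE_wkN columns."""
    by_patient = {}
    for rec in records:
        pid = rec["patient_id"]
        # Start each patient's row with the ID and the outcome columns.
        row = by_patient.setdefault(pid, {"patient_id": pid, **outcomes[pid]})
        for name, value in rec.items():
            if name not in ("patient_id", "week"):
                # One column per gene per time point, e.g. GENE_A_wk0.
                row[f"{name}_wk{rec['week']}"] = value
    return list(by_patient.values())


table = to_cross_section(records, outcomes)
for row in table:
    print(row)
```

A patient with fewer time points simply ends up with fewer `_wkN` columns filled in, which matches the situation where one group has 4 measurements and the other has 3.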

You can then fit a nominal logistic regression with Complication (and Death) as the nominal outcome and these columns as model effects: Treatment/Control, Gene Expression at different time points, and other variables. You might also want to consider interactions such as Treatment/Control*Gene Expression at Baseline.

JMP Documentation on odds ratios from logistic regression

http://www.jmp.com/support/help/13/Odds_Ratios_Nominal_Responses_Only.shtml#65579
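To make the link between the fitted model and an odds ratio concrete: for a continuous effect, the odds ratio per one-unit increase is the exponential of its logistic regression coefficient. Below is a minimal pure-Python sketch, assuming a single baseline-expression predictor and a 0/1 complication outcome. The data values are invented, and the hand-written Newton-Raphson fit is only an illustration, not how JMP estimates the model internally.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Invented example: baseline expression of one gene and complication (1/0).
expression = [4.1, 4.8, 5.0, 5.5, 5.9, 6.3, 6.6, 7.2]
complication = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(expression, complication)
odds_ratio = math.exp(b1)  # odds ratio per one-unit increase in expression
print(f"b0={b0:.3f}  b1={b1:.3f}  odds ratio per unit={odds_ratio:.3f}")
```

An odds ratio above 1 here would mean that higher baseline expression goes with higher odds of a complication. Note that if the two outcome groups do not overlap at all on the predictor, the estimates do not settle and a sketch like this will not converge.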

irinastl · Jul 11, 2017 02:20 PM

I have 3 types of role variables, that is, treatment complications: 1. Early complications, 2. Late complications, and 3. No complications. So logistic regression probably won't work: I am getting a "failed to converge (step-halving limit)" error. The treatment type is the same in both groups. Has anybody used Partial Least Squares to predict disease outcome from biomarkers?

Chris_Kirchberg · Jul 10, 2017 10:35 PM

Hi,

If you have a large genomics database, I am assuming that you may have tens of thousands of genes and thus tens of thousands of columns. If so, you may want to take a look at JMP Genomics:

https://www.jmp.com/en_us/software/jmp-genomics.html

Also, you can look at a similar use case by Matthew J. Wongchenko et al., where they looked at different treatment groups and Progression Free Survival. Link to the paper below.

http://clincancerres.aacrjournals.org/content/early/2017/05/23/1078-0432.CCR-17-0172

If you have not used JMP Genomics before, then take a look at this short video:

https://www.youtube.com/watch?v=DmaKz4NOURk

Let me know if you would like to know more.

Best,

Chris Kirchberg

Chris Kirchberg, M.S.²
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: [email protected]
www.jmp.com

irinastl · Jul 11, 2017 02:20 PM

Our university does not have a subscription to JMP Genomics, but thanks for the article.

Ted · Jul 11, 2017 04:00 PM

In JMP you can conduct multi-level logistic regression:

https://community.jmp.com/t5/Discussions/Logistic-regression-with-multiple-outcome-variables/td-p/49...

but I would start with a two-level outcome using aggregated data (and see what happens):

1. Complications Yes (Complications Early and Late)

2. Complications No
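The aggregation above amounts to recoding the three-level column into a two-level one before fitting. A minimal Python sketch of that recode follows; the level labels `Early`, `Late`, and `None` are assumed for illustration and may differ from the actual coding in the data table.

```python
from collections import Counter

# Assumed labels for the three original levels (hypothetical coding).
COLLAPSE = {"Early": "Yes", "Late": "Yes", "None": "No"}


def collapse_outcome(levels):
    """Map Early/Late/None complication levels to a Yes/No outcome."""
    return [COLLAPSE[level] for level in levels]


# Invented example column, one entry per patient.
complication_type = ["Early", "None", "Late", "None", "Early", "None"]

complication_yes_no = collapse_outcome(complication_type)
print(Counter(complication_yes_no))
```

Checking the counts per level after the recode is a quick way to see whether either class is too small to support a model with many gene columns.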

how to prepare data for calculation of odd ratios?
