Thierry_S
Super User

JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

Hi JMP Community,

I need to assess the correlation between ~10,500 biomarkers measured in 150 patients at baseline. I tried a simple JSL script based on Fit Y by X and the Multivariate Correlation platform. Both approaches take a long time to compute on my machine (Windows 10 Pro, Intel i7 2.50 GHz, 32 GB RAM, JMP 16). Is there a faster way to generate and retrieve Pearson's r and the associated p-values? Of note, while the matrix-based approach seems attractive, it does not appear to return the p-values.

 

I appreciate your help.

 

Best,

TS

Thierry R. Sornasse
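
Editorial note on the missing p-values: once a correlation matrix is available, each p-value can be recovered from r and the sample size alone. The usual two-sided test of zero correlation uses t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, which is equivalent to p = I_{1 - r^2}((n - 2) / 2, 1/2), where I is the regularized incomplete beta function. The sketch below is plain Python rather than JSL, and it assumes complete data (no missing values) and that this standard test is the p-value wanted. The function names are hypothetical, and the loop-based code only illustrates the conversion; it is not tuned for 10,500 columns.

```python
import math

_TINY = 1e-300


def _guard(v):
    """Keep a value away from zero so it can be safely divided by."""
    return v if abs(v) > _TINY else _TINY


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 / _guard(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        # Even-numbered term of the continued fraction.
        num = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        d = 1.0 / _guard(1.0 + num * d)
        c = _guard(1.0 + num / c)
        h *= d * c
        # Odd-numbered term of the continued fraction.
        num = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        d = 1.0 / _guard(1.0 + num * d)
        c = _guard(1.0 + num / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences without missing values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_p_value(r, n):
    """Two-sided p-value for H0: correlation = 0, given r and n complete pairs."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    x = min(1.0, max(0.0, 1.0 - r * r))
    return reg_inc_beta((n - 2) / 2.0, 0.5, x)


# Made-up illustrative data, not taken from the thread.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
print(r, pearson_p_value(r, len(x)))
```

Because the p-value depends only on r and n, the same conversion can be applied to every entry of a correlation matrix after the matrix itself has been computed.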
1 ACCEPTED SOLUTION

Accepted Solutions

Re: JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

@Thierry_S ,

 

How long does it take now? Have you compared it to Response Screening? Put all the biomarkers in both Y and X and use the Corr (Pearson's product-moment correlation) advanced option.

Chris Kirchberg, M.S.2
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com


5 REPLIES

Thierry_S
Super User

Re: JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

Hi Chris,

 

I did not think of that approach. Let me try ASAP.

 

Thanks,

 

TS

Thierry R. Sornasse
Thierry_S
Super User

Re: JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

Hi @Chris_Kirchberg ,

 

Beautiful. The processing time has been cut 5- to 10-fold.

Best,

TS

Thierry R. Sornasse

Re: JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

Yeah! I had the p-value table on launch option turned on, so maybe that is why I saw so much memory use.

Chris Kirchberg, M.S.2
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com

Re: JMP > Multivariate Analysis > Correlation> How to Handle Large Datasets?

@Thierry_S ,

 

I have a data set of 18,856 mRNA relative abundances and 65 samples. Response Screening in JMP Pro 16 seems to take 120 GB of memory (I only have 64 GB, so caching is occurring). It might not be a good idea after all, now that I have mentioned it.

 

UPDATE: I cut that down to about 9,000 and it took less than a minute. So there is definitely a threshold on the number of biomarkers you will be able to use.

Chris Kirchberg, M.S.2
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com
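
Editorial note on the threshold: putting the same N columns in both Y and X asks for N × N combinations, so the workload grows with the square of the number of biomarkers. The short Python sketch below counts those combinations for the sizes mentioned in the thread. The memory extrapolation in it is purely an assumption: it scales the single reported figure (about 120 GB for 18,856 variables) quadratically, which the thread does not confirm, and the function names are hypothetical.

```python
def pair_counts(n_vars):
    """Return (all Y-by-X combinations, unique unordered pairs) for n_vars columns."""
    return n_vars * n_vars, n_vars * (n_vars - 1) // 2


def scaled_memory_gb(n_vars, ref_vars=18856, ref_gb=120.0):
    """Extrapolate memory from one reported data point.

    ASSUMPTION: memory grows with n_vars ** 2. This is not confirmed in the
    thread and ignores option settings such as the p-value table on launch.
    """
    return ref_gb * (n_vars / ref_vars) ** 2


for n in (18856, 10500, 9000):
    combos, unique = pair_counts(n)
    print(f"{n:>6} variables: {combos:,} Y-by-X combinations, "
          f"{unique:,} unique pairs, "
          f"~{scaled_memory_gb(n):.0f} GB under the assumed quadratic scaling")
```

Going from 18,856 to about 9,000 variables cuts the number of combinations to a bit under a quarter, which is consistent with the direction of the improvement reported above. The actual memory behaviour of Response Screening may differ from this simple scaling.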