
JMP Genomics - Collapsing Isoform Level Differential Expression Data to Gene Level

I have an RNA-Seq data set where reads are mapped to the transcript level, resulting in multiple unique transcript identifiers sharing common gene IDs. I am performing differential gene expression in JMP Genomics at the level of transcript identifiers, but I would like to collapse that information to the gene level in a statistically meaningful way. Most of the downstream data analysis tools (gene ontology, pathway analysis) are going to require gene level information and expression data. At the moment, I do not have a software solution that allows me to map to the gene level prior to JMP Genomics. Is there a solution that already exists for this in JMP Genomics?


Re: JMP Genomics - Collapsing Isoform Level Differential Expression Data to Gene Level

Hi @LouisAltamura ,

 

Do you have an annotation file for the RNA-seq data? Sometimes when the data is mapped to the transcript level, an annotation file is created that also maps each transcript to the gene level (gene symbol or Entrez Gene ID). If you only have the transcript IDs, how do you know which transcripts belong to which gene? Maybe that source can be used as the annotation file or, better yet, that column of information can be merged into the main transcript data file.

 

You can always run the differential expression (DEG) analysis at the gene level, if that is what you are looking for. There is a mechanism for that in JMP Genomics: add the gene ID as the By Variables item instead of the transcript ID, and put the transcript ID in the row-level category variables section. The transcripts will be averaged for each gene, and the gene-level values will then be used for DEG.

 

If you are only looking to have JMP Genomics export the results with the transcript data along with a Gene ID and let the next software do the averaging for you, then an annotation file will be needed.

 

Another thought: if you have the public IDs for the transcripts, then something like the UCSC Genome Browser can be used to get the transcript IDs and the gene IDs for that species. That table can then be merged into the transcript data. After that, use the gene ID in the By Variables section if you are using the ANOVA platform.
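If it helps to see the merge step spelled out, here is a minimal sketch in Python with pandas, done outside JMP Genomics. The table contents and the column names (`transcript_id`, `gene_id`, `sample_A`, `sample_B`) are made up for illustration; substitute whatever your expression file and your transcript-to-gene export actually contain.

```python
import pandas as pd

# Hypothetical transcript-level expression table (one row per transcript).
expr = pd.DataFrame({
    "transcript_id": ["tx1", "tx2", "tx3", "tx4"],
    "sample_A": [10.0, 30.0, 5.0, 8.0],
    "sample_B": [12.0, 28.0, 7.0, 9.0],
})

# Hypothetical transcript-to-gene map, e.g. exported from an annotation source.
tx2gene = pd.DataFrame({
    "transcript_id": ["tx1", "tx2", "tx3"],
    "gene_id": ["geneX", "geneX", "geneY"],
})

# Left join keeps every transcript row; validate raises an error if the
# map lists the same transcript more than once.
merged = expr.merge(tx2gene, on="transcript_id", how="left",
                    validate="many_to_one")

# Transcripts without a gene assignment show up with a missing gene_id.
unmapped = merged.loc[merged["gene_id"].isna(), "transcript_id"].tolist()
print(merged)
print("Unmapped transcripts:", unmapped)
```

Checking the unmapped list before going further is worthwhile, since any transcript with a missing gene ID would otherwise drop out of a gene-level summary.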

 

If you are using the RNA-Seq workflow, then you would not be able to specify the gene ID as the level of the analysis. However, there is a utility called Statistics for Columns under SAS Data Set Utilities > Columns where you can use the gene ID as the Variables By Which to Summarize and the samples as the Variables to Be Summarized. You also get to choose how to summarize in the Options tab: mean, median, etc. This does assume you have the transcript ID and gene ID in the same data set.
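For anyone who wants to picture what that summarize-by-gene step does, here is a rough equivalent in Python with pandas. It is only an illustration of the idea, not what the Statistics for Columns utility runs internally, and the column names, the `sample_` prefix, and the `summarize_by_gene` helper are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical table with transcript ID and gene ID in the same data set.
expr = pd.DataFrame({
    "transcript_id": ["tx1", "tx2", "tx3"],
    "gene_id": ["geneX", "geneX", "geneY"],
    "sample_A": [10.0, 30.0, 5.0],
    "sample_B": [12.0, 28.0, 7.0],
})


def summarize_by_gene(table, stat="mean"):
    """Collapse transcript rows to one row per gene using the given statistic."""
    # Assumes sample columns share a "sample_" prefix (hypothetical convention).
    sample_cols = [c for c in table.columns if c.startswith("sample_")]
    return table.groupby("gene_id")[sample_cols].agg(stat)


gene_mean = summarize_by_gene(expr, "mean")      # average of transcripts per gene
gene_median = summarize_by_gene(expr, "median")  # median of transcripts per gene
print(gene_mean)
```

The grouping column plays the role of Variables By Which to Summarize, and the sample columns play the role of Variables to Be Summarized.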

Chris Kirchberg, M.S.2
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com

Re: JMP Genomics - Collapsing Isoform Level Differential Expression Data to Gene Level

Thanks Chris.

So, after more review of our sequence analysis software (Geneious Prime), we were able to recalculate expression at the gene level. This greatly reduced the amount of gene-level redundancy, but there are still a few hundred rows in our input data tables with duplicate gene names that have unique accession numbers for these variants. I plan to use these unique accession numbers for the differential expression, but is there a way to carry over both the gene name and the accession number to annotate things like volcano plots, ANOVA output, etc.? Thanks!

Re: JMP Genomics - Collapsing Isoform Level Differential Expression Data to Gene Level

Why yes, there is. The By Variables and Variables by Which to Merge Primary Annotation Data (if using a separate annotation file) items should be set to the accession number. You can put the gene name into the Label Variable item if it is in the same file as the expression data and the accession number. See the screenshot for where to put it.

Capture.JPG

 


Re: JMP Genomics - Collapsing Isoform Level Differential Expression Data to Gene Level

BTW, this also works for the Basic Expression workflow and the Basic RNA-Seq Workflow.
