I am trying to use JMP (14.0 or 14.2) to analyze a coral transcriptome: 12 samples x 390,000 genes (=12 rows by 390,000 columns). Even the simplest tasks can almost crash my 8-GB RAM MacBook Air, so I am looking for a simpler way to standardize my data. In other words, I want to convert the raw data for my 390,000 genes to z-scores ((value - gene mean)/std. dev.). With smaller datasets, I can select the columns I want standardized and then use "formula column" and select "standardize." This builds new columns with standardized data (which is a great feature of JMP, BTW). However, I can only do a few hundred columns at a time before JMP crashes, meaning it could take days to manually standardize my transcriptome! Can anyone think of an easier way to transform such a huge dataset? I can do this in Excel, but I am trying to find a way to do it solely in JMP.
PS: I AM aware that the data can be standardized prior to analyses when selecting columns for whichever analytical test/algorithm (e.g., selecting columns for PCA), but this also crashes JMP!
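For readers who want to see exactly what the standardization computes, here is a minimal Python sketch (outside JMP, so it does not answer the "solely in JMP" part). It assumes the z-score uses the sample standard deviation (n - 1 denominator) and processes one gene at a time so that memory use stays flat. The function names and the toy data are hypothetical, not taken from the coral dataset.

```python
import statistics

def standardize_column(values):
    """Return z-scores for one gene: (x - mean) / sample std. dev."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample std. dev. (n - 1 denominator)
    if sd == 0:
        # A gene with identical expression in every sample has no defined z-score.
        return [float("nan")] * len(values)
    return [(x - mean) / sd for x in values]

def standardize_by_gene(columns):
    """Yield (gene_name, z_scores) one gene at a time instead of building all columns at once."""
    for name, values in columns:
        yield name, standardize_column(values)

# Hypothetical toy data: 2 genes x 4 samples (the real table would be 390,000 genes x 12 samples).
toy = [
    ("gene_a", [1.0, 2.0, 3.0, 4.0]),
    ("gene_b", [10.0, 10.0, 10.0, 10.0]),  # constant gene, z-score undefined
]
result = dict(standardize_by_gene(toy))
```

Because each gene is handled independently, the same loop could write results to disk gene by gene rather than holding 390,000 new columns in memory.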
It sounds like you are running standard JMP and not JMP Pro? The reason I ask is that, in a general JMP design sense, for many platforms (PCA being one) JMP developers have included a 'wide' or 'sparse' type option within the platform's specification window. These options use more computationally efficient techniques for data such as yours, which might not crash your computer but still give you the insights you seek. So maybe, just maybe, you can accomplish what you want analytically (not necessarily standardizing) using these options within the platform?
Here's a bit more detail on the PCA platform:
Thank you for your response. In fact, it is you who gave me the idea of using PLS (and JMP in general) to analyze my transcriptomic data. When using JMP standard (I don't have Pro), you CAN, as you mention, use the "wide" option with transcriptomic datasets under the discriminant analysis (DA) option (which is a sort of categorical version of PLS, right?). The good thing about DA is that it has an option for standardizing first. So, for DA/PLS, I am all set.
Also, to other readers out there who are interested in trying this: PCA, hierarchical clustering, and even MDS CAN be done in standard JMP with huge transcriptomic datasets in a relatively short amount of time (minutes to hours, seconds, and seconds, respectively). This leads me to believe JMP is doing some sort of singular value decomposition by default, at least for the latter two approaches, which finish too quickly to be considering every pairwise relationship among the 390,000 columns. So, the weakest link is the actual standardization step itself.
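To illustrate why a wide table can be analyzed quickly, here is a minimal Python sketch of one standard shortcut: with n samples and p genes (n much smaller than p), the leading principal component can be obtained from the n x n matrix of dot products between centered samples instead of the p x p covariance matrix. This is only an illustration of the general idea, not a claim about what JMP does internally. All function names and the toy data are hypothetical.

```python
import math

def center_columns(rows):
    """Subtract each gene (column) mean; each row is one sample."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in rows]

def gram_matrix(rows):
    """n x n matrix of dot products between samples; its size ignores the gene count."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in rows] for ri in rows]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def top_eigenpair(matrix, iterations=200):
    """Power iteration for the largest eigenvalue of a symmetric positive semidefinite matrix."""
    n = len(matrix)
    # Non-constant start: the constant vector lies in the null space of a centered Gram matrix.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))  # Rayleigh quotient
    return eigenvalue, vec

# Hypothetical toy data: 4 samples x 5 genes (the real table would be 12 x 390,000).
toy = [
    [1.0, 2.0, 0.0, 4.0, 1.0],
    [2.0, 4.0, 1.0, 8.0, 0.0],
    [3.0, 6.0, 0.0, 12.0, 1.0],
    [4.0, 8.0, 1.0, 16.0, 0.0],
]
centered = center_columns(toy)
gram = gram_matrix(centered)             # 4 x 4, regardless of the number of genes
eigenvalue, sample_vec = top_eigenpair(gram)
total = sum(gram[i][i] for i in range(len(gram)))  # trace = (n - 1) * total variance
share_pc1 = eigenvalue / total           # fraction of total variance on the first component
```

For 12 samples the matrix being decomposed is only 12 x 12, so the cost is dominated by one pass over the data to build it.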
I am trying one more time to see if I can have JMP generate 390,000 new columns with standardized data (two hours running, so far). Afterwards, I will see if I have success standardizing within the PCA, hierarchical clustering, and MDS platforms at the same time as using a singular value decomposition (if available). I am starting to wonder if standardizing will even make a huge difference with 390,000 genes. A few high-expression genes are unlikely to exert THAT much of a pull on such an enormous dataset (though I could be wrong on this).
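One way to check the "pull" question directly is to ask what fraction of the total variance each gene contributes before and after standardizing, since an unstandardized PCA works on those raw variances. The Python sketch below does this on made-up numbers. The gene names and values are hypothetical and say nothing about the actual coral data, where the answer depends on how the 390,000 variances are distributed.

```python
import statistics

def variance_shares(columns):
    """Fraction of the total variance contributed by each gene."""
    variances = {name: statistics.variance(vals) for name, vals in columns.items()}
    total = sum(variances.values())
    return {name: v / total for name, v in variances.items()}

def standardize(values):
    """z-scores using the sample std. dev. (assumes the gene is not constant)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

# Hypothetical toy data: one high-expression gene and two low-expression genes.
raw = {
    "high_gene": [1000.0, 3000.0, 2000.0, 4000.0],
    "low_1": [1.0, 3.0, 2.0, 4.0],
    "low_2": [4.0, 2.0, 3.0, 1.0],
}
shares_raw = variance_shares(raw)
shares_z = variance_shares({name: standardize(vals) for name, vals in raw.items()})
```

In this toy case the high-expression gene accounts for nearly all of the raw variance, while after standardizing every gene contributes an equal share. Running the same calculation on the real per-gene variances would show whether a handful of genes dominate.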
@abmayfield: Yes, I do remember our initial conversations on PLS and Discriminant Analysis. @chris_kirchberg is also contributing here; he's our internal expert on all these types of subjects and an SME on JMP's Life Sciences applications, of which JMP Genomics is one. So you are in good hands there, especially if you can find the wherewithal for JMP Genomics. It sure sounds like its capabilities are much more relevant to your specific use case here than JMP Pro would be.
Another thought is what will you be doing next? After standardizing the data, will you be doing PCA and for what purpose? Differential Gene Expression? Clustering? Modeling?
JMP Genomics was designed to deal with data of this size and type for this very reason: in some cases, you would never be able to add enough memory to standardize or even analyze this number of genes. Take a look here:
If you are planning on doing any of what you see on this link, let me know and we can discuss further if you like.
Those JMP Genomics graphics look beautiful, and the types of analyses supported (namely differential gene expression from RNA-Seq data) are exactly what I need to do. Can it carry out any of these network analyses that are all the rage these days? I am on several manuscripts featuring such a network analysis with transcriptomic data, but I have no idea how the calculations work and could currently NOT replicate the findings on my own. My main reason for not already purchasing JMP Pro or JMP Genomics is financial, as I am an independent researcher essentially making a living off small, soft-money grants. That being said, I think my next grant has a "computer" budget whose funds I might be able to use for software.
It does do Gene Set Enrichment analysis but not network analysis like iPathway Guide, Ingenuity Pathway Analysis or other such software.
One thing to note: it does do partial correlation diagrams, which are network-like diagrams based on correlations in the data. See this link for more details:
If you need some idea of the cost of JMP Genomics, it is the same as JMP Pro. That should help if you are able to put it into the computer budget. If you need an exact number (and don't already have it), your local salesperson can get it for you.
Great! I will look into it in more detail. I do have a quick question about JMP Genomics that others may be interested in knowing, as well:
1. Can it run on a standard personal laptop with 8-16 GB of RAM or do most people have a dedicated work computer for running it?
2. Does it have all the features of standard JMP (for non-genomic analyses), or would I need both standard JMP AND JMP Genomics to do both standard statistical analyses and bioinformatic stuff?
1) Yes, it can run on a Windows-based OS with 8 to 16 GB of RAM.
2) Yes, it currently comes with JMP 13, which is an integral part of JMP Genomics. If you are taking advantage of a JMP 14 feature that does not exist in JMP 13, then you would need both, and they would not interact with each other (they are installed in separate locations in the Program Files folder).
Hope that helps.