A JMP(R) Add-in for Dimension-Reduction and Visualization Using t-SNE and UMAP (2019-US-EPO-158)
Aug 27, 2019 12:39 PM | Last Modified: Oct 22, 2019 12:04 PM
Meijian Guan, JMP Research Statistician Developer, SAS
High-dimensional data sets can be very difficult to visualize. Recently, nonlinear dimension reduction and visualization algorithms, most notably t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), have been widely applied in research areas such as image processing, text mining and genomics. They differ from linear dimension reduction methods such as principal components analysis (PCA) in that they preserve only small pairwise distances, or local similarities, whereas PCA is concerned with preserving large pairwise distances to maximize variance.
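To make "preserving local similarities" concrete, here is a minimal sketch of the Gaussian neighbor weighting that t-SNE applies in the input space. It is an illustration only, not the add-in's code: the function name `conditional_affinities`, the toy points and the fixed bandwidth `sigma` are assumptions made for this example (real t-SNE tunes a separate bandwidth per point from the perplexity setting).

```python
import math


def conditional_affinities(points, i, sigma=1.0):
    """Return Gaussian-normalized similarities p(j|i) for every j != i.

    Simplification: sigma is fixed here, whereas t-SNE calibrates one
    bandwidth per point so that each row matches a target perplexity.
    """
    weights = {}
    for j, other in enumerate(points):
        if j == i:
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], other))
        weights[j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Hypothetical toy data: point 0 has two nearby neighbors and one distant point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 1.0), (6.0, 6.0)]
p = conditional_affinities(points, i=0)

for j, value in sorted(p.items()):
    print(f"p({j}|0) = {value:.3g}")
```

In this sketch the two nearby points share almost all of the probability mass, while the distant point receives a weight close to zero. Large distances therefore contribute very little to what the embedding tries to reproduce, in contrast to PCA, where they dominate the variance being maximized.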
While JMP does not implement t-SNE or UMAP in current releases, its R/Python interface gives users quick and easy access to these open-source packages. In this presentation, we will demonstrate a JMP add-in that provides access to both the t-SNE and UMAP algorithms. It offers a user-friendly interface that supports data table navigation, data quality control, sparsity handling, intuitive parameterization and interactive interpretation of results.
The MNIST handwritten digits data and a mouse single-cell sequencing data set consisting of more than 100,000 cells from 20 organs will be used in the demonstration. Results from PCA, t-SNE and UMAP will be visualized and compared. We will also showcase how JMP's high interactivity facilitates scientific discovery.