MJ (Staff)

Data visualization with t-SNE and UMAP

Description

Recently, non-linear dimension-reduction and visualization algorithms, most notably t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), have been widely applied in research areas such as image processing, text mining, and genomics. This add-in provides access to both the t-SNE and UMAP R packages. It offers a user-friendly interface for data table navigation, data quality control, sparsity handling, intuitive parameterization, and interactive interpretation of results.

 

Usage Example

Here is a screenshot of the interface with MNIST data loaded. Under Model Specifications, I selected the label column as Label and all the pixels as predictors. I chose both t-SNE and UMAP as the algorithms.

 

Embedding_Interface.png

 

The next screenshot shows the results of both t-SNE (top) and UMAP (bottom).

 

tSNE&UMAP.png

 

This add-in also supports some basic quality control options, including missing value checking, distribution, and sparsity calculation. You can find these options under Quality Control Options on the interface.

 

Update: The JMP R interface on Mac has versioning issues. If you are a Mac user, please downgrade R to version 3.3.3 or earlier and use t-SNE only.

 

Changelog:

v.1.2 <3/14/2019> Fixed an issue in the Rtsne package where a large number of columns caused a stack overflow.
v.1.1 <3/8/2019> Fixed a bug that could potentially produce “issues found in R, memory exhausted?” error message. Added a submenu for the MNIST example dataset.
v.1.0 <2/26/2019> Initial version.

Comments
mujahida

Hi MJ,
In the images above, should each mass/group of points be treated as having the same attribute, or should points of different colors be treated as having different attributes?

MJ
Hi Mujahida,
The colors indicate the true labels for this dataset, so data points with the same color should have similar attributes. The clusters were estimated by t-SNE and UMAP, so you should expect to see some color mismatches.
marxx

Hi MJ, 

 

I am getting an error message when I try this add-in. It looks like it is not recognizing my installation of R. Could you help me figure out how to get your add-in to recognize my R installation, if that is indeed the issue?

 

I am including a screenshot showing that R is open but not being recognized, and a screenshot of my R install location (not program files) and also copying some of the error message below.

 

This looks like a very exciting tool and I am hoping to use it. I've previously found t-SNE in R to be useful, and it would be of great value to be able to do this right in JMP. Any help you might provide is greatly appreciated.

 

Thanks!

 

Screenshots of error and open R instance, then screenshot of R install location

t-sne addin not recognizing R install.png
r install location.png

 

Error code

 

"

An installation of R cannot be found on this system. JMP R support requires R version 2.9.1 or higher in access or evaluation of 'Glue' , write2lastRun( pb2, _addinPath_ );
algr = cbb << get selected();
Print( "Got algr!" );
Try(
dim1 = dimBx1 << get;
per1 = perBx1 << get;
iter1 = iterBx1 << get;
);
Try(
dim2 = dimBx2 << get;
per2 = perBx2 << get;
iter2 = iterBx2 << get;
dist = distbox << get;
);
Print( "Got parameters!" );
Try( predictor = selectedX << getitems );
If( Length( predictor ) < 1,
Throw( "Please specify Predictors" )
);
Try( labelY = selectedY << getitems );
If( N Items( labelY ) > 0,
labelY1 = labelY[1];
grphVars = Eval Insert( "X( :Y2 ), Y( :Y1 ), Color( :^labelY1^ ) " );
,
labelY1 = "";
grphVars = "X( :Y2 ), Y( :Y1 )";

***** Text Truncated *****"

 

 

MJ
Hi marxx,
Your problem was likely caused by multiple installations of R on your machine, so JMP couldn't decide which one to use. Please set R_HOME as an environment variable as follows and try the add-in again. Open a Windows CMD console and type: setx R_HOME "This PC\Documents\R\R-3.5.2". Also make sure the Rtsne and umap packages are installed in this version of R. Please let me know if this solves your problem.
Also, we have a few threads on the community talking about this issue that you can check out.
https://community.jmp.com/t5/Administration-Discussions/Help-JMP-find-R-installation/td-p/6357
https://community.jmp.com/t5/Discussions/Setting-path-to-R-location/td-p/59764
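Before running setx, it can help to sanity-check the directory you plan to use. The Python sketch below is only an illustration: it assumes that an R home directory contains a bin folder holding the R launcher (R.exe on Windows, R on macOS/Linux), which is how standard R installs are laid out. The function name looks_like_r_home is made up for this example and is not part of JMP or R, and passing the check does not guarantee that JMP will accept the directory.

```python
import os
from pathlib import Path


def looks_like_r_home(path):
    """Return True if `path` has the layout of an R home directory.

    Assumption: a standard R install keeps its launcher in <R_HOME>/bin
    (R.exe on Windows, R on macOS/Linux). This only catches obviously
    wrong values, such as a folder that merely contains a shortcut.
    """
    root = Path(path)
    return any((root / "bin" / name).is_file() for name in ("R.exe", "R"))


if __name__ == "__main__":
    r_home = os.environ.get("R_HOME")
    if r_home is None:
        print("R_HOME is not set")
    elif looks_like_r_home(r_home):
        print(f"R_HOME looks valid: {r_home}")
    else:
        print(f"R_HOME does not contain bin/R or bin/R.exe: {r_home}")
```

Note that setx only affects processes started afterwards, so run the check from a new console window and restart JMP after changing R_HOME.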


ngalphie

Hello @MJ,

I tried running your sample data, but had an access violation crash. In the log file, it looks like the R side ran successfully. Any ideas how I can address the issue?

 

Thanks,

Al

 

Embedding - JMP window.jpg
JMP error message.jpg
log file.jpg

MJ

Hi @ngalphie, this looks like a problem with the older version of JMP.

1. I noticed that you are using JMP 13. It might help to update to the latest version (JMP 14.3) and try this add-in again.

2. Send the crash report saved at C:\Users\username\AppData\Local\Temp\JmpCrashReports\13 to tech support (https://support.sas.com/ctx/supportform/createForm?ctry=us_JMP) or to me directly (Meijian.Guan@jmp.com), and we can dig into it.

3. There are a few posts on the community that also discuss this issue. You can take a look to see if anything they mention helps.

https://community.jmp.com/t5/Discussions/JMP-has-performed-an-access-violation-and-will-shut-down-wh...

https://community.jmp.com/t5/Discussions/JMP-Access-Violation-Likely-Causes/m-p/5648#M5647

FN

Thank you very much for providing an interface for JMP.

 

I wonder if these add-ins could include the R/Python executables so we can run them directly.

 

 

MJ

@FN Hey there, thank you for your suggestions. It would be a great option to include R/Python executables but due to our policies and legal concerns, I did not include them. Please let me know if you have any problems regarding R installations or versions when using this add-in. 

 

FN

Thank you. To be honest, I am not sure what the best way to install R is. I am used to managing Python installations with conda/Anaconda, which now also includes R.

 

This is the path where I have R installed.

 

(base) C:\Users\john_doe>where r
C:\Users\john_doe\AppData\Local\Continuum\miniconda3\Scripts\R.exe

 

I guess I need to install these packages

https://anaconda.org/conda-forge/r-tsne

https://anaconda.org/conda-forge/r-umap

 

To make JMP able to find my R installation, I execute this (or change the PATH manually):

setx R_HOME "C:\Users\john_doe\AppData\Local\Continuum\miniconda3\Scripts\"

 

If there is a detailed guide on how to do this better, please let me know.

 

 

FN

I managed to run umap but not via Anaconda/conda.

 

I think I am installing the wrong package for t-SNE. Can you provide the URL on CRAN?

 

Here is the step by step.

 

Install R from https://cran.r-project.org/

Install the RStudio community edition from https://www.rstudio.com/products/rstudio/download/#download

Use RStudio to install tsne and umap.

Rinstallation.png

 

MJ

Hello @FN, it looks like you installed a different t-SNE package. Could you please try installing Rtsne through RStudio instead? The GitHub repository for this package is here: https://github.com/jkrijthe/Rtsne. Also, please make sure your R_HOME path points to the R version that has umap and Rtsne installed. Let me know if you have further questions.
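One rough way to confirm that both packages ended up in the R version that R_HOME points to is sketched below in Python. It assumes the packages were installed into R's default system library, that is, the library folder directly under the R home directory, and relies on the fact that every installed R package folder contains a DESCRIPTION file. Packages installed into a per-user library elsewhere will be reported as missing. The function name missing_r_packages is hypothetical.

```python
import os
from pathlib import Path


def missing_r_packages(r_home, packages=("Rtsne", "umap")):
    """List the packages that have no folder in <r_home>/library.

    Assumption: packages live in R's default system library. A package
    counts as installed if <r_home>/library/<name>/DESCRIPTION exists.
    Note that "Rtsne" and "tsne" are two different packages.
    """
    library = Path(r_home) / "library"
    return [p for p in packages if not (library / p / "DESCRIPTION").is_file()]


if __name__ == "__main__":
    r_home = os.environ.get("R_HOME")
    if r_home:
        print("Missing packages:", missing_r_packages(r_home))
    else:
        print("R_HOME is not set")
```

An empty list means both folders were found under that R version; it does not verify that the packages load correctly.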

Pat1

Dear @MJ 

Thanks for this nice add-in. I used it on a Mac: t-SNE with R version 3.3.3 worked fine, but UMAP did not.

Now I have checked it on Windows using the latest R version, 3.6.1, with RStudio. I installed the umap and Rtsne packages, but unfortunately neither umap nor Rtsne worked. I use JMP 14.3.

Here is what the log says when I try the MNIST data after PCA (using 2 PCs as predictors):

 

{"UMAP"}
{2, 3, 200, 0.1}
"Got algr!"
"Got parameters!"
"Dim of inData2R is: "
10000
3
"Start backend!"
"UMAP is selected!"

TKIntRJMP.R version 14.0
label is: label
dim of inDataUniq is: 10000 3
Ready for Run
We are running UMAP
An exception of type c0000005 occurred at address 6c910ef2 while processing the submitted R statements. This address is at offset 10ef2 into module "C:\Program Files\R\R-3.6.1\bin\i386\R.dll"
An exception of type c0000005 occurred at address 6c910ef2 while processing the submitted R statements. This address is at offset 10ef2 into module "C:\Program Files\R\R-3.6.1\bin\i386\R.dll"
Issues found in R, could be caused by unsuccessful installation of Rtsne/umap packages or limited memory.

{"t-SNE"}
{2, 5, 500}
"Got algr!"
"Got parameters!"
"Dim of inData2R is: "
10000
3
"Start backend!"
"t-SNE is selected!"

TKIntRJMP.R version 14.0
label is: label
dim of inDataUniq is: 10000 3
Ready for Run
We are running t-SNE
dim of inDataTsne is: 10000 2
Read the 10000 x 2 data matrix successfully!
OpenMP is working. 1 threads.
Using no_dims = 2, perplexity = 5.000000, and theta = 0.500000
Computing input similarities...
Building tree...
- point 10000 of 10000
Done in 0.45 seconds (sparsity = 0.001695)!
Learning embedding...
Iteration 50: error is 120.349392 (50 iterations in 2.08 seconds)
Iteration 100: error is 103.025177 (50 iterations in 1.94 seconds)
Iteration 150: error is 94.995130 (50 iterations in 1.70 seconds)
Iteration 200: error is 90.828601 (50 iterations in 1.75 seconds)
Iteration 250: error is 87.964178 (50 iterations in 1.81 seconds)
Iteration 300: error is 4.069311 (50 iterations in 1.82 seconds)
Iteration 350: error is 3.474366 (50 iterations in 1.83 seconds)
Iteration 400: error is 3.044447 (50 iterations in 1.84 seconds)
Iteration 450: error is 2.719790 (50 iterations in 1.83 seconds)
Iteration 500: error is 2.466420 (50 iterations in 1.84 seconds)
Fitting performed in 18.44 seconds.
[,1] [,2]
[1,] -18.4173747 30.133875
[2,] -23.5082997 -17.985255
[3,] 0.7376548 -4.148267
[4,] -5.6881024 6.385878
[5,] 8.8457460 22.245393
[6,] -10.8212058 -17.013940
[1] "Analysis done!"
An exception of type c0000005 occurred at address 6c910ef2 while processing the submitted R statements. This address is at offset 10ef2 into module "C:\Program Files\R\R-3.6.1\bin\i386\R.dll"
Issues found in R, could be caused by unsuccessful installation of Rtsne/umap packages or limited memory.

 

It would be great if you had an idea of what I should try. Thanks in advance and best regards, Patrick
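The exception lines in the log above name the module in which the access violation (c0000005) occurred, and the module path shows which R build was loaded: bin\i386 is the folder R for Windows uses for its 32-bit build, while bin\x64 holds the 64-bit build. Whether this is related to the crash is not established in this thread. The Python sketch below only pulls that folder name out of a log so it can be compared against the installed JMP; the function name r_dll_arch is invented for illustration.

```python
import re

# Matches the module path quoted at the end of the exception line.
MODULE_RE = re.compile(r'into module "([^"]+)"')


def r_dll_arch(log_text):
    """Return the folder that holds the first R.dll named in the log
    (for example 'i386' or 'x64'), or None if no R.dll path is found."""
    for match in MODULE_RE.finditer(log_text):
        parts = re.split(r"[\\/]", match.group(1))
        if len(parts) >= 2 and parts[-1].lower() == "r.dll":
            return parts[-2]
    return None


# Exception line copied from the log in this comment.
LOG_LINE = (
    "An exception of type c0000005 occurred at address 6c910ef2 while "
    "processing the submitted R statements. This address is at offset 10ef2 "
    'into module "C:\\Program Files\\R\\R-3.6.1\\bin\\i386\\R.dll"'
)

print(r_dll_arch(LOG_LINE))  # i386
```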

AR_RAHMAN

Hi Meijian,

 

Good afternoon. I tried to use t-SNE and UMAP on the sample data. I use the latest version of RStudio and JMP 13, but I got the following error:

ERROR JMP.png

 

Please let me know how I can fix it. Thanks for your help.

PS. I won't be able to upgrade to JMP 14 at this moment.

 

Sincerely,

Arif

raewdy

Dear Sir,

The add-in does not run.


 

t-sne addin not recognizing R install.png

MJ

Dear @raewdy, I believe I have responded above regarding this issue. It's likely that the JMP R interface had trouble finding your R installation.

Please set R_HOME as an environment variable as follows and try the add-in again. Open a Windows CMD console and type: setx R_HOME "Path to R". Also make sure the Rtsne and umap packages are installed in this version of R. Please let me know if this solves your problem.

If you are using a Mac, you need to downgrade R to version 3.3.3 and use only t-SNE.
Also, we have a few threads on the community talking about this issue that you can check out.
https://community.jmp.com/t5/Administration-Discussions/Help-JMP-find-R-installation/td-p/6357
https://community.jmp.com/t5/Discussions/Setting-path-to-R-location/td-p/59764

 

Hope that helps!

raewdy

Dear Sir,

After installing the tsne, Rtsne, and umap packages and typing the script below in CMD, the add-in runs ✌

Thank you very much.

2019-09-18.png

MJ

@raewdy Thank you for letting me know. Very glad to hear it!

DBhattaram

Dear @MJ,

When attempting to run the program on the test data set, I got an error saying "Issues found in R, could be caused by unsuccessful installation of Rtsne/umap packages or limited memory."

 

I'm pretty sure I downloaded everything I need (the packages and R, too), so if you could clue me in on why this isn't working, that would be much appreciated.

 

Regards,

Dhruv Bhattaram

MJ

Hi @DBhattaram, it's possible that JMP didn't find the right R version, or that there is an R versioning issue with the JMP R interface. Could you please open the log file (CTRL+Shift+L) when you see the error message and send the detailed log info to me at Meijian.Guan@jmp.com? I'd be happy to take a look.

 


Pat1
Are you using a Mac? I faced the same problem! Unfortunately, UMAP did not work, but Rtsne worked nicely.

Best wishes
Pat
DBhattaram

Hi, @MJ 

 

I sent the log of all my failed attempts at getting it to run the program. Hopefully, that will be of some use

 

Regards,

Dhruv

DBhattaram

@Pat1 This is on Windows for me

MJ

Thank you @DBhattaram, it looks like you didn't have the Rtsne and umap packages installed in the R version JMP is talking to. If you have multiple versions of R, make sure you set R_HOME as an environment variable as follows: open a Windows CMD console and type: setx R_HOME "your Path to R". Also make sure the Rtsne and umap packages are installed in this version of R. Let me know if this solves your problem.