How do I declutter K Means Cluster in JMP Scripting


Oct 13, 2017 3:58 PM

Hello,

I have a JMP script that matches a pair of Cartesian coordinate data sets using K Means Cluster. The script works well for data sets that have an equal number of coordinates. I am trying to expand the script to handle scenarios where one of the data sets has missing coordinates or fewer coordinates than the other. I can do this manually through the K Means Cluster dialog window by first using 'Declutter', then selecting the outliers and excluding them. I would like to automate this process through the JMP script. My plan is to run the 'Declutter' function with the number of nearest neighbors limited to 1, use 'Save NN Distances' to save the distances to a column, and exclude the rows that fall outside 3 sigma of the mean. I would then run the cluster analysis on the remaining coordinates.

I am not sure how to do this through JMP scripting. The action of identifying and excluding the outliers by nearest neighbors must happen before the cluster function begins. This is where I am stuck.
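One way to get that ordering without depending on a platform message is to compute the nearest-neighbor distances directly in JSL before launching K Means Cluster. The sketch below is an illustration under stated assumptions, not the Declutter algorithm itself: it assumes the current data table has numeric columns named X and Y with no missing values, uses a brute-force search over all pairs of points (adequate for small coordinate sets), and applies the mean plus 3 sigma cutoff described in the plan.

```
// Minimal sketch. Assumes the current data table has numeric columns X and Y.
dt = Current Data Table();
x = Column( dt, "X" ) << Get Values;
y = Column( dt, "Y" ) << Get Values;
n = N Rows( x );

// Brute-force distance from each point to its single nearest neighbor.
nn = J( n, 1, . );
For( i = 1, i <= n, i++,
	best = .;
	For( j = 1, j <= n, j++,
		If( j != i,
			d = Sqrt( (x[i] - x[j]) ^ 2 + (y[i] - y[j]) ^ 2 );
			If(
				Is Missing( best ), best = d,
				d < best, best = d
			);
		)
	);
	nn[i] = best;
);

// Flag rows whose nearest-neighbor distance exceeds mean + 3 sigma.
cutoff = Mean( nn ) + 3 * Std Dev( nn );
outlierRows = Loc( nn > cutoff );

// Exclude the flagged rows so the clustering step ignores them.
If( N Rows( outlierRows ) > 0,
	dt << Select Rows( outlierRows );
	dt << Hide and Exclude( 1 );
	dt << Clear Select;
);
```

Because the exclusion happens on the data table itself, the K Means Cluster launch can follow immediately afterwards and should only see the remaining rows.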

Would it be easier to leave the K Means Cluster dialog up with the Declutter plot and allow the user to highlight the outliers, exclude them, then run the clustering algorithm? If so, is it possible for the script to pause while the user performs these actions then continue after the cluster function is complete? I have additional actions that are performed on the cluster result.
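A common pattern for "pause until the user is done" is to put the rest of the script behind a button in a non-modal window, since a modal window would block interaction with the report. The following is a rough sketch of that idea, not a tested solution: `finishClustering` and `promptWin` are hypothetical names, `nClusters` is assumed to be defined earlier in the script, and whether `Go` respects rows excluded after the platform was launched is assumed from the manual workflow described in the post.

```
// Launch the platform and show the Declutter plot, as in the original snippet.
obj = K Means Cluster(
	Y( :X, :Y ),
	Number of Clusters( nClusters ),
	Columns Scaled Individually( 0 )
);
obj << Declutter( 1, 1 );

// Hypothetical continuation: everything that should run once the user is done.
finishClustering = Function( {},
	obj << Go;
	// ...additional actions on the cluster result go here...
);

// Non-modal prompt, so the report and data table stay interactive.
promptWin = New Window( "Exclude outliers",
	V List Box(
		Text Box( "Select the outliers in the Declutter plot, exclude them, then click Continue." ),
		Button Box( "Continue",
			promptWin << Close Window;
			finishClustering();
		)
	)
);
```

The script itself finishes as soon as the prompt window opens; the clustering and follow-up steps only run when the user clicks Continue.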

Below is a snippet of the K Means Cluster function as I have it now. The nClusters variable is set to the number of rows in the data set with the fewest coordinates.

    obj = K Means Cluster(
        Y( :X, :Y ),
        Number of Clusters( nClusters ),
        Columns Scaled Individually( 0 )
    );
    obj << Declutter( 1, 1 );
    obj << Go;

Any help would be greatly appreciated.

-Ry

4 REPLIES


In JMP 13 I believe you can do this using the screening platform:

```
// Use sample data
dt = Open( "$SAMPLE_DATA/Cytometry.jmp" );

// Find outliers using KNN
outliers = dt << Explore Outliers(
	Y( :CD3, :CD8 ),
	Name( "Multivariate k-Nearest Neighbor Outliers" )(K( 1 ))
);

// Save the distance to the nearest point
outliers << Save NN Distances;

// Make a column indicating that the point is an outlier.
// You could skip this and select the points over a certain value directly.
dt << New Column( "Is Outlier",
	Numeric, "Nominal",
	Formula(
		If(
			:Nearest 1 Distance > Col Mean( :Nearest 1 Distance ) + 3 *
			Col Std Dev( :Nearest 1 Distance ),
			1,
			0
		)
	),
	Value Labels( {0 = "No", 1 = "Yes"} ), Use Value Labels( 1 )
);

// Show which points will be excluded
dt << Graph Builder(
	Show Control Panel( 0 ),
	Variables(
		X( Transform Column( "Row", Formula( Row() ) ) ),
		Y( :Nearest 1 Distance ),
		Color( :Is Outlier )
	),
	Elements( Points( X, Y, Legend( 14 ) ) )
);

// Uncomment to hide and exclude outliers
//dt << select where( :Is Outlier == 1 );
//dt << hide and exclude;
```
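The comment in the script mentions skipping the indicator column and selecting the points directly. A minimal sketch of that shortcut might look like the following; it assumes `dt` is the table from the script and that a column named `Nearest 1 Distance` already exists in it (the name used in the formula there), so it only works if the Save NN Distances step actually produced that column.

```
// Assumes dt is the current data table and already contains a
// "Nearest 1 Distance" column.
cutoff = Col Mean( :Nearest 1 Distance ) + 3 * Col Std Dev( :Nearest 1 Distance );

// Select the rows beyond the cutoff, exclude them, then clear the selection.
dt << Select Where( :Nearest 1 Distance > cutoff );
dt << Hide and Exclude( 1 );
dt << Clear Select;
```

The cutoff is computed once, as a number, before any rows are excluded, so it does not shift as rows drop out.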


Hello ih,

I'm not sure this function is available in JMP 12 (what I am using). When I test the code you listed, there is no result for the "Explore Outliers" step. Saving NN Distances does not generate a column.

Regards,

Ry


I don't remember how that looked in JMP 12, but John Sall referenced it here, so I expect it can be done. Can you check Cols->Modeling Utilities? Hopefully you can work through the same analysis using that platform.


Hello ih,

Thank you. I will explore this function further.

Best regards,

Ry