Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Re: find values and categorize

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Feb 1, 2018 9:43 AM
(5031 views)

Hello,

I have the problem that I need to categorize some data. The selection of the values to be categorized would be much easier, if there was an automatized process. The problem is such: Table with 3 columns (1 category, 2 depths measurments, 3 number of data points).

1) I would like to get those that are below 0,5 (see example file column 2) for each lake respectively. This means, 0,5 should not be included in the evaluation for selection.

2) In addition, I would like to sort those values found in column 2 after the ones with most data points (N, column 3) by selecting the three deepest values found which contain the most data, respectively for each name of column 1.

3) To categorize them, I would like to create a new column and add a 1/2/3 for the ones with most data found to the ones with least data of those three. If two depths found to be equal in number of data points the deeper one should be categorized as 1 or 2.

See example file for some better understanding :-).

Thanks in advance for any solution, if that is possible with jsl.

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

When I run the script and then make a subset of the selected rows with

```
dt << Subset( Copy formula( 0 ), Selected Rows( 1 ), Selected columns only( 0 ) )
```

I get

All the columns are indeed ranked, but within unique values of Stationsnamn and isBelow05. That's controled by the second and third argumenrs to Col Rank(). If you want the stations ranked together, you would remove that by variable from the Col Rank() arguments.

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

Something like this should help if I understand correctly.

```
dt = Current Data Table();
dt << New Column( "isbelow05", formula( :Profildjup <= 0.5 ) );
dt << New Column( "rank", formula( Col Rank( -(:n rows + :Profildjup), :isbelow05, :Stationsnamn ) ) );
dt << select where( :isbelow05 & :rank <= 3 );
```

If by depth "below" you mean numerically greater, then change the "<= 0.5" to ">= 0.5". Thanks for including the example file.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

Hello Greg,

thanks for the reply and the script. Unfortunately, it does not work with my data. The results are two new columns with no data in the fist column and only one entry in the first cell of column 2 ("4,20533670002011e-315").

Any idea what could have gone wrong?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

The script that Xan included works great with the sample data you attached. What is different between your real data and the Example data table you attached?

Jim

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

mhh, that´s strange. I just downloaded the sample file from here again (the difference was that I had made a selection with one additional column), however, even the sample file I uploaded shows me the same error I posted before. However, now it marked surface values with a one in the below 0,5 column. (see file attached)

I could simplify the question as well. Then the values just need to be selected and categorized according to their rank in highest numbers for each lake name section.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

I have not sure what is causing your issue. When I open the data table you supplied in your last response, I see the you issue. I was able to correct the issue by Right Clicking on the column header for Rank, and selecting Column Info. It brings up the Column Info window. Select "Suppress Eval" as shown below, and then Click on the "Apply" button. Unselect the "Suppress Eval" and then Click on the "Apply" button again, and see if it corrected the values in the column

The other thing I would like you to try, is to delete the "isbelow05" and "rank" columns, and run the following script.

```
dt = Current Data Table();
dt << New Column( "isbelow05", formula( :Profildjup <= 0.5 ) );
wait(.5);
dt << New Column( "rank", formula( Col Rank( -(:n rows + :Profildjup), :isbelow05, :Stationsnamn ) ) );
dt << select where( :isbelow05 & :rank <= 3 );
```

If it works, what the issue may be is that your "isbelow05" formula isn't being completed before the "rank" column is being generated, and it is causing issues. The

`wait(.5);`

is giving it time to complete the creation of column "isbelow05"

Jim

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

When I run the script and then make a subset of the selected rows with

```
dt << Subset( Copy formula( 0 ), Selected Rows( 1 ), Selected columns only( 0 ) )
```

I get

All the columns are indeed ranked, but within unique values of Stationsnamn and isBelow05. That's controled by the second and third argumenrs to Col Rank(). If you want the stations ranked together, you would remove that by variable from the Col Rank() arguments.

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: find values and categorize

Hi @Pacco,

Are you using JMP 12 (or earlier) by chance? Col Rank() changed in JMP 13 to allow the additional arguments Xan is using.

From the docs:

JMP 12:

`Col Rank(column)`

JMP 13:

`Col Rank(column, <ByVar, ...>, <<tie("average"|"arbitrary"|"row"|"minimum")`