cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
Pacco
Level III

find values and categorize

Hello,

 

I have the problem that I need to categorize some data. The selection of the values to be categorized would be much easier, if there was an automatized process. The problem is such: Table with 3 columns (1 category, 2 depths measurments, 3 number of data points).

 

1) I would like to get those that are below 0,5 (see example file column 2) for each lake respectively. This means, 0,5 should not be included in the evaluation for selection.

2) In addition, I would like to sort those values found in column 2 after the ones with most data points (N, column 3) by selecting the three deepest values found which contain the most data, respectively for each name of column 1.

3) To categorize them, I would like to create a new column and add a 1/2/3 for the ones with most data found to the ones with least data of those three. If two depths found to be equal in number of data points the deeper one should be categorized as 1 or 2.

 

See example file for some better understanding :-).

 

Thanks in advance for any solution, if that is possible with jsl.

1 ACCEPTED SOLUTION

Accepted Solutions
XanGregg
Staff

Re: find values and categorize

When I run the script and then make a subset of the selected rows with

dt << Subset( Copy formula( 0 ), Selected Rows( 1 ), Selected columns only( 0 ) )
	

I getisbelow05.png

 

 

All the columns are indeed ranked, but within unique values of Stationsnamn and isBelow05. That's controled by the second and third argumenrs to Col Rank(). If you want the stations ranked together, you would remove that by variable from the Col Rank() arguments.

 

View solution in original post

9 REPLIES 9
XanGregg
Staff

Re: find values and categorize

Something like this should help if I understand correctly.

 

dt = Current Data Table();
dt << New Column( "isbelow05", formula( :Profildjup <= 0.5 ) );
dt << New Column( "rank", formula( Col Rank( -(:n rows + :Profildjup), :isbelow05, :Stationsnamn ) ) );
dt << select where( :isbelow05 & :rank <= 3 );

If by depth "below" you mean numerically greater, then change the "<= 0.5" to ">= 0.5". Thanks for including the example file.

Pacco
Level III

Re: find values and categorize

Hello Greg,

 

thanks for the reply and the script. Unfortunately, it does not work with my data. The results are two new columns with no data in the fist column and only one entry in the first cell of column 2 ("4,20533670002011e-315").

 

Any idea what could have gone wrong?

txnelson
Super User

Re: find values and categorize

The script that Xan included works great with the sample data you attached.  What is different between your real data and the Example data table you attached?

Jim
Pacco
Level III

Re: find values and categorize

mhh, that´s strange. I just downloaded the sample file from here again (the difference was that I had made a selection with one additional column), however, even the sample file I uploaded shows me the same error I posted before. However, now it marked surface values with a one in the below 0,5 column. (see file attached)

 

I could simplify the question as well. Then the values just need to be selected and categorized according to their rank in highest numbers for each lake name section.

txnelson
Super User

Re: find values and categorize

I have not sure what is causing your issue.  When I open the data table you supplied in your last response, I see the you issue.  I was able to correct the issue by Right Clicking on the column header for Rank, and selecting Column Info.  It brings up the Column Info window.  Select "Suppress Eval" as shown below, and then Click on the "Apply" button.  Unselect the "Suppress Eval" and then Click on the "Apply" button again, and see if it corrected the values in the column

 

suppress.PNG

The other thing I would like you to try, is to delete the "isbelow05" and "rank" columns, and run the following script.

dt = Current Data Table();
dt << New Column( "isbelow05", formula( :Profildjup <= 0.5 ) );
wait(.5);
dt << New Column( "rank", formula( Col Rank( -(:n rows + :Profildjup), :isbelow05, :Stationsnamn ) ) );
dt << select where( :isbelow05 & :rank <= 3 );

If it works, what the issue may be is that your "isbelow05" formula isn't being completed before the "rank" column is being generated, and it is causing issues.  The 

wait(.5);

is giving it time to complete the creation of column "isbelow05"

Jim
Pacco
Level III

Re: find values and categorize

Thanks for the ideas, the "suppress eval" selecteing and deselcting helped. However, the wait 5 did not change anything!

 

 error message.jpg

Pacco
Level III

Re: find values and categorize

May I ask how the result looks like? Mine ranks the entire column. However, I am looking for ranking each section (each batch of same name, see column Stationsnamn). 

 

XanGregg
Staff

Re: find values and categorize

When I run the script and then make a subset of the selected rows with

dt << Subset( Copy formula( 0 ), Selected Rows( 1 ), Selected columns only( 0 ) )
	

I getisbelow05.png

 

 

All the columns are indeed ranked, but within unique values of Stationsnamn and isBelow05. That's controled by the second and third argumenrs to Col Rank(). If you want the stations ranked together, you would remove that by variable from the Col Rank() arguments.

 

julian
Community Manager Community Manager

Re: find values and categorize

Hi @Pacco,

Are you using JMP 12 (or earlier) by chance? Col Rank() changed in JMP 13 to allow the additional arguments Xan is using. 

 

From the docs:

JMP 12: 

Col Rank(column)

 

JMP 13:

Col Rank(column, <ByVar, ...>, <<tie("average"|"arbitrary"|"row"|"minimum")