joshua
Level III

getting specific percentile rows from data

Hi,

I'm trying to figure out how to get the rows that fall in the top 25th percentile of values in my data. Basically, I want to subset the original data to the rows with the top 25th percentile of values.

 

I tried this:

 

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

top_25th_perc = dt << Summary(
	Group( :sex ),
	Quantile( :height, .25 ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 1 ),
	output table name( "top_25th_perc" )
);

This yields the table below.

[Image: joshua_0-1610849174124.png]

What I need is a subset of dt containing only the top 25th percentile rows from both groups.

How can I do that?

 

A follow-up question: how can we sample from those 25th percentile rows, that is, draw a sample from that distribution? Again, a summary data table is what I need.

 

3 REPLIES
txnelson
Super User

Re: getting specific percentile rows from data

I may be a little off with my solution. You indicated that you wanted the "top" 25th percentile, but your script calculates the 25th percentile, which as a cutoff would give you the top 75% of the rows. The code below selects the rows for the males and the females that are in the top 25th percentile of the data, using the 75th percentile within each group as the cutoff:

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dt << select where( :height >= Col Quantile( :height, .75, :sex ) );

dtTop = dt << subset( selected rows( 1 ), selected columns( 0 ) );
Jim
joshua
Level III

Re: getting specific percentile rows from data

This is working great. Thanks, txnelson!

Is there a way to sample that 25th percentile data before moving on to the final data set?

txnelson
Super User

Re: getting specific percentile rows from data

Here is how I would sample from the top 25%:

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );

top25pRows = dt << get rows where( :NPN1 >= Col Quantile( :NPN1, .75, :site ) );

dt << clear selected row states;

// Select roughly 5% of the top 25th percentile rows:
// each row is selected independently with probability 0.05
For( i = 1, i <= N Rows( top25pRows ), i++,
	If( Random Uniform( 0, 1 ) <= .05,
		Row State( top25pRows[i] ) = Selected State( 1 )
	)
);

// Create the output data table
dtTop = dt << subset( selected rows( 1 ), selected columns( 0 ) );
Jim
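
A note on the sampling loop above: because each row is picked independently with probability 0.05, the number of sampled rows varies from run to run. If a fixed sample size is wanted instead, one possible variation is sketched below. It is not part of the replies above; the names nSample, shuffled, sampleRows, and dtSample are made up for illustration, and the 5% fraction is simply carried over from the reply. It assumes at least one row meets the cutoff.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );

// Row numbers in the top 25% of NPN1 within each site (same cutoff as above)
top25pRows = dt << Get Rows Where( :NPN1 >= Col Quantile( :NPN1, .75, :site ) );

// Assumption: take a fixed 5% of those rows, but never fewer than one row
nSample = Max( 1, Round( 0.05 * N Rows( top25pRows ) ) );

// Shuffle the row numbers and keep the first nSample of them
shuffled = Random Shuffle( top25pRows );
sampleRows = shuffled[1 :: nSample];

// Select only the sampled rows, then subset them into a new table
dt << Clear Select;
dt << Select Rows( sampleRows );
dtSample = dt << Subset( Selected Rows( 1 ), Selected Columns( 0 ) );
```

Shuffling the matrix of row numbers and taking the first nSample entries gives a sample without replacement of exactly that size, whereas the loop in the reply gives a sample whose size is only about 5% on average.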