vince_faller
Super User (Alumni)

Getting Values for shortest half range

Is there a way to get the actual values from the shortest half vis in the outlier box plot?

Vince Faller - Predictum
9 REPLIES
txnelson
Super User

Re: Getting Values for shortest half range

I don't see any way to retrieve that information. I tried a really simple method, looking at the ranges of each 50% of the data from lowest to highest to see if one would match up, but it failed. It would be a nice enhancement to be able to "<< get shorterhalf values".

I would pass it on to support@jmp.com
Jim

Re: Getting Values for shortest half range

Yes, but it is not particularly easy.  These values present themselves as the last two numbers in the adjacents portion of the journal.  First you need to get the journal for a box plot.

 

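// Open the sample table and launch Distribution on height.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
obj = dt << Distribution( Y( :height ) );
rpt = obj << Report;
// Get the journal text for the frame that holds the box plot.
jrn = rpt[FrameBox( 2 )] << Get Journal;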

Next, you will have to parse jrn for adjacents.  Perhaps someone with more JSL experience can help you with that if you do not already know how to do that.  I'm not familiar with the ins and outs of parsing.

 

Re: Getting Values for shortest half range (Accepted Solution)

Tonya's solution is clever and right to the point. The rest of the script is actually not that bad. The variable jrn contains a string representation of the Distribution platform (not just the Outlier Box Plot), so let's use string functions to suss out the ends of the red bracket. We find the adjacents argument with Contains() and extract the numbers with a succession of calls to Word():

 

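dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dist = dt << Distribution( Y( :weight ) );
rpt = dist << Report;
jrn = rpt[FrameBox( 2 )] << Get Journal;

// Locate "adjacents" (case-insensitive), then take its parenthesized argument list.
pos = Contains( Upper Case( jrn ), "ADJACENTS" );
adj = Word( 2, Sub Str( jrn, pos ), "()" );

// The last two of the four comma-separated numbers are the ends of the red bracket.
lo = Num( Word( 3, adj, "," ) );
hi = Num( Word( 4, adj, "," ) );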

Re: Getting Values for shortest half range

Thanks for finishing my script, Mark!  I've never been the most efficient with the character functions.

vince_faller
Super User (Alumni)

Re: Getting Values for shortest half range

Hi Tonya,

Any reason you could think of that I'd get a value of adjacents(0,0,0,0)?

It's definitely not 0-0 as the shortest range. Sorry, but I can't post the actual data. I'll see if I'm allowed to anonymize it.

*Edit* Never mind. For some reason my adjacents went to FrameBox(1), and all the FrameBox(2) values were 0s.

Vince Faller - Predictum
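
Since the frame that carries the adjacents can apparently move between FrameBox(1) and FrameBox(2), one way to avoid hard-coding the index is to scan every frame in the report. The following is only a rough sketch, not a tested fix: it assumes a JMP version that supports the << XPath message on reports, and it assumes that an all-zero pair means "wrong frame" (which would be wrong if more than half of the data really are 0). The variable names frames, a, and b are made up for illustration.

```jsl
Names Default To Here( 1 );

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dist = dt << Distribution( Y( :weight ) );
rpt = dist << Report;

// Collect every FrameBox in the report instead of assuming FrameBox(2).
frames = rpt << XPath( "//FrameBox" );

lo = .;
hi = .;
For( k = 1, k <= N Items( frames ), k++,
	jrn = frames[k] << Get Journal;
	pos = Contains( Uppercase( jrn ), "ADJACENTS" );
	If( pos > 0,
		adj = Word( 2, Substr( jrn, pos ), "()" );
		a = Num( Word( 3, adj, "," ) );
		b = Num( Word( 4, adj, "," ) );
		// Assumption: adjacents(0,0,0,0) is a placeholder, so keep looking.
		If( a != 0 | b != 0,
			lo = a;
			hi = b;
			Break()
		)
	)
);

// lo and hi stay missing if no frame had usable adjacents.
Show( lo, hi );
```

If lo and hi come back missing, no frame contained a non-zero adjacents entry, and the journal text itself is worth inspecting by hand.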

Re: Getting Values for shortest half range

This isn't really an answer, but maybe useful in some cases. If you just wanted to see the shortest half values, you could save as Interactive HTML and hover over the red bracket.

(Image: Interactive HTML Box Plot Shortest Half Values)

When viewing in a mobile browser, just tap the red bracket.

 

ms
Super User (Alumni)

Re: Getting Values for shortest half range

Thanks for sharing the <<Get Journal idea. Really clever and new to me. 

Here's an attempt to get the limits by raw calculation. In my limited testing it compares well with the Journal numbers. It seems to handle missing values and ties as expected (but I may have missed something).

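// Function for calculating the lower and upper limits of
// Shortest Half Range of a numeric column.
// Note: lo and hi are not in the local list, so they are set as globals.

shorth = Function({col}, {M, n, d, i = 0},
    M = col << get values;
    M = Sort Ascending(M[Loc Nonmissing(M)]);
    // Window size: half of the nonmissing rows, rounded up to an integer.
    n = Ceiling(N Row(M) / 2);
    // Compare all N - n + 1 windows of n consecutive sorted values.
    d = Loc Min(J(1, N Row(M) - n + 1, M[(i++) + n] - M[i]));
    lo = M[d];
    hi = M[d + n - 1];
);

// Test
dt = Open("$SAMPLE_DATA/Big Class.jmp");
shorth(Column(dt, "weight"));
Show(lo, hi);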
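
To make the window arithmetic in the function above easier to follow, here is the same idea on a small made-up vector, written with an explicit loop. It is a sketch only, and it assumes that "half" means Ceiling(N / 2) observations; JMP's internal definition for the red bracket may differ. The data values and the names nwin and widths are invented for illustration.

```jsl
Names Default To Here( 1 );

// Made-up data, sorted to 1, 2, 3, 10, 11, 12, 13, 30.
M = Sort Ascending( [12, 1, 30, 3, 10, 2, 13, 11] );

n = Ceiling( N Row( M ) / 2 );  // 4 values per window
nwin = N Row( M ) - n + 1;      // 5 candidate windows

// Width of each window of n consecutive sorted values.
widths = J( nwin, 1, . );
For( k = 1, k <= nwin, k++,
	widths[k] = M[k + n - 1] - M[k]
);

// The narrowest window is the shortest half.
d = Loc Min( widths );
lo = M[d];
hi = M[d + n - 1];
Show( widths, lo, hi );
```

The five widths work out to 9, 9, 9, 3, and 19, so the narrowest window is the fourth one, running from 10 to 13.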

 

ron_horne
Super User (Alumni)

Re: Getting Values for shortest half range

Hi @vince_faller 

A broader version of this question was asked later: https://community.jmp.com/t5/Discussions/Finding-the-X-range-that-will-include-a-certain-portion-of-...

My solution is as follows. Could anyone check if it is robust? Thank you.

Names Default To Here( 1 );

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// this is just in case we want to bring the data back to original row order later.
rowcol = New Column("Row", Numeric, "Continuous", Format("Best", 12), Formula(Row()));
dt << run formulas();
rowcol << suppress eval( true );

// now we start working
dt << Sort( By( :weight ), Order( Ascending ), replace table );

// here is where we define the share of included range (0.5)
difcol = New Column("dif", Numeric, "Continuous", Format("Best", 12), Formula(Abs(:weight - Lag(:weight, -(N Rows() * 0.5)))));
dt << run formulas();
difcol << suppress eval( true );

start  = (dt<<get rows where(Col minimum (:dif)==:dif))[1];
// here we also mention the share of included range (0.5)
end = start + nrows(dt)*0.5 -1;

// new binary column for in or out the range
dt << New Column("inrange", Numeric, "Ordinal");
for each row (:inrange = if (and (row() >= start, row()<=end),1 ,0  ));