pcarroll1
Level IV

Column quantiles or probabilities by ...

I need JSL code that will save the probabilities of a given column distribution, by other columns.

 

I know how to do this unscripted: take a Distribution of the column by the other columns, then hold Ctrl while clicking the red action button and select Save Prob Scores. But I don't know how to script this.

I looked through other posts but didn't find an answer on how to script the above or how to do a more direct calculation.

 

The only thing I could think of is to iterate over each "by" group: taking the subset, sorting it, error checking for a null table, calculating estimated quantiles using the row number (not sure how to get this), and updating the original table with each result. That seems like a lot of coding for something that could maybe be done in one line with the right function.

 

I'm sure someone has a better idea.

Kevin_Anderson
Level VI

Re: Column quantiles or probabilities by ...

Hi, pcarroll1!

 

I think there are many ways to calculate and script what you wish.

 

Using a method from Blom, G. (1958), Statistical Estimates and Transformed Beta-Variables, New York: Wiley, you could calculate the quantiles of, for instance, height by sex in the Big Class data table using the formula:

(Col Rank( :height, :sex ) - 3 / 8) / (Col Number( :height, :sex ) + 1 / 4)

I can think of several more sophisticated ways to do what I think you want, but making a new column with a formula (or Set Values) in it is pretty easy in JSL...
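
As a minimal sketch of the new-column approach, assuming the Big Class sample table sits at its usual $SAMPLE_DATA location (the column name "Blom Prob height by sex" is a hypothetical choice, not from the original post):

```jsl
// Open the Big Class sample table that ships with JMP
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Add a formula column: Blom-style probability of height within each sex.
// The column name is hypothetical; pick whatever suits your table.
dt << New Column( "Blom Prob height by sex",
	Numeric,
	"Continuous",
	Formula(
		(Col Rank( :height, :sex ) - 3 / 8) / (Col Number( :height, :sex ) + 1 / 4)
	)
);
```

Col Rank and Col Number both take the grouping column as a trailing argument here, so the rank and the count are computed within each sex rather than over the whole table.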

 

Good luck!

 

 

pcarroll1
Level IV

Re: Column quantiles or probabilities by ...

Kevin,

    Thanks.  This works, though I don't understand the 3/8 and 1/4.  I've seen the equation as:

(Col Rank( :height, :sex ) ) / (Col Number( :height, :sex ) + 1)

Is this wrong?

 

The other issue for me is constructing the equation for given column name inputs.

The responses I get from JMP are driving me crazy, several times forcing me to kill JMP with the Task Manager.

 

Pat

Kevin_Anderson
Level VI

Re: Column quantiles or probabilities by ...

Hi, Pat!

 

"Wrong" may be an overstatement!

 

The reference from Blom is specifically regarding Inverse Normal Transformations, which are of the form

[Image: Region Capture.JPG]

The parenthetical argument is the quantile.  Blom recommends that c=3/8;  Van der Waerden recommends c=0; Bliss and others recommend different c's.
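
A sketch of the general form this reply appears to describe, assuming the commonly cited rank-based inverse normal transformation. The symbols $r_i$ (rank of observation $i$ within its group), $n$ (number of non-missing values in the group), and $\Phi^{-1}$ (inverse standard normal CDF) are notation introduced here, not taken from the original image:

```latex
z_i = \Phi^{-1}\!\left( \frac{r_i - c}{\,n - 2c + 1\,} \right)

% Blom, c = 3/8:
\frac{r_i - \tfrac{3}{8}}{\,n - \tfrac{3}{4} + 1\,} = \frac{r_i - \tfrac{3}{8}}{\,n + \tfrac{1}{4}\,}

% Van der Waerden, c = 0:
\frac{r_i}{\,n + 1\,}
```

Under that assumption, the two column formulas in this thread are the same expression with different constants: c = 3/8 gives the version with 3/8 and 1/4, and c = 0 gives Rank / (Number + 1).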

 

No less than John W. Tukey suggests, in "The Future of Data Analysis" (Ann. Math. Stat. 1962; 33: pp. 1–67), that there is only a trivial difference between any of these methods and the Expected Normal Scores (ENS), so it really doesn't matter which one you pick. I chose Blom because I've studied it and I'm confident that it closely matches ENS even with small sample sizes.

 

Crazy is a short drive for me, but would you like some assistance with constructing the equation for given column name inputs? 

pcarroll1
Level IV

Re: Column quantiles or probabilities by ...

Thanks for the offer. 

I took a pill and then finally decided to construct my code as a string and use Eval( Parse() ) to convert it into an expression and run it. In the end it worked!
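
The code from this post was not shared, so the following is only a sketch of what such a string-building approach might look like. The variable names (yName, byName, cmd), the new column's name, and the use of the Big Class sample table are all assumptions for illustration, and the column names are assumed to be simple enough to write directly after a colon:

```jsl
// Hypothetical inputs: the table, the column to score, and the grouping column
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
yName = "height";
byName = "sex";

// Build the New Column call as text; Eval Insert replaces each ^name^
// with the current value of that variable. \!" is an escaped double quote.
cmd = Eval Insert(
	"dt << New Column( \!"Prob ^yName^ by ^byName^\!", Numeric, \!"Continuous\!",
		Formula(
			(Col Rank( :^yName^, :^byName^ ) - 3 / 8) /
			(Col Number( :^yName^, :^byName^ ) + 1 / 4)
		)
	);"
);

// Convert the text into an expression and run it
Eval( Parse( cmd ) );
```

Building the text first also makes debugging easier: writing cmd to the log with Show( cmd ) before evaluating it shows exactly what JMP is being asked to run.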

 

Pat