Extract prediction formula without having to save ...

Jan 7, 2016 10:29 PM
(3054 views)

I'm writing a script that fits a set of generalized linear models to a data set and then extracts the AIC, etc., to do model averaging. Because of the way JMP codes categorical variables in the design matrix for generalized linear models (JMP uses a sum-to-zero coding rather than a corner-point coding), it is difficult to model-average the predictions using the estimated coefficients reported in the fit: one of the categories is never shown, and I would need to do extensive programming to recreate it. It looks like it would be easiest to simply use the prediction formula from the platform and then apply the model weights to these predictions in one large formula.

However, as far as I can see, I need to save the prediction formula to the data table for EACH model, and then I can extract the formula and save for use elsewhere using something along the lines of

modelfit << prediction formula;

modelpredformula = Char( Column( data, "P(PA) Formula" ) << Get Formula() );

where "P(PA) Formula" is the predicted formula. I can then save the model prediction formula (in character form) in my summary AIC table and eventually create a super formula which is a weighted sum of the individual formulae ....
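
The "super formula" step described above could be sketched roughly as follows. This assumes Akaike weights (the post does not say which weights are used), and the AIC values and formula strings are invented purely for illustration; in the real script they would come from the summary AIC table.

```jsl
// Hypothetical inputs: one AIC and one prediction formula (as text) per fitted model
aic = [212.4, 210.1, 215.9];
formulas = {":height * 0.01", ":weight * 0.002", "0.5"};

// Akaike weights (an assumption; any weights that sum to 1 could be used instead)
delta = aic - Min( aic );
w = Exp( -0.5 * delta );
w = w / Sum( w );

// Assemble "w1 * (f1) + w2 * (f2) + ..." as a single string
terms = {};
For( i = 1, i <= N Items( formulas ), i++,
	Insert Into( terms, Char( w[i] ) || " * (" || formulas[i] || ")" )
);
superformula = Concat Items( terms, " + " );
Show( superformula );
```

Note that Char() does not necessarily keep every digit of each weight, so the string form is an approximation of the numeric weights.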

However, this slows the script down considerably because the predictions are computed for the entire data table and I have to add and remove the column for each model fit.

So, is there a way to extract the actual prediction formula from a model fit WITHOUT having to first save the column to the data table and have JMP actually do the computations?

Thanks

Carl Schwarz

1 ACCEPTED SOLUTION

Accepted Solutions

Jan 8, 2016 1:18 AM
(5200 views)

Solution

I'm not sure there is a way to do what you want. The approach I take is to suppress formula evaluation prior to saving the prediction formulae:

dt << Suppress Formula Eval(1)

-Dave
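
A minimal sketch of how this suggestion might fit into the loop from the question. The names data, modelfit, and "P(PA) Formula" are taken from the original post; the ordering of the steps is an assumption, not something confirmed in the thread.

```jsl
// Assumes `data` is the data table and `modelfit` is the fitted model object
data << Suppress Formula Eval( 1 );    // formula columns are created but not evaluated

modelfit << Prediction Formula;        // saves the "P(PA) Formula" column
modelpredformula = Char( Column( data, "P(PA) Formula" ) << Get Formula );

data << Delete Columns( "P(PA) Formula" );
data << Suppress Formula Eval( 0 );    // restore normal formula evaluation
```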

11 REPLIES

Jan 8, 2016 1:22 AM
(2600 views)

Jan 8, 2016 9:21 PM
(2600 views)

Thanks Ian and Dave. I could not find the prediction formula anywhere in the ShowProperties tree, so I guess it isn't computed until requested by a save column action.

I tried my script with and without suppressing the formula evaluation. Without suppressing the formula, it took 45 seconds. With suppressing the formula it took 44 seconds. I guess I was wrong in assuming that formula evaluation took a large amount of time compared to the actual model fittings, extracting from the report, and saving information into a new (summary) data table.

It hadn't occurred to me to profile the script. I've just done so, and here are the results:

11% of the time spent in fitting the models

13% of the time spent in creating the reports

47% of the time spent on Deleting the column that was used for each model for the predictions using: data << delete columns("P(PA) Formula");

23% of the time spent in closing the report window after I extract the information from it using: modelreport << close window;

I found this surprising!

Carl
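
For readers without a profiler to hand, individual steps can be timed crudely with Tick Seconds(). This is only a sketch of that alternative, not what was used to produce the percentages above; data and modelreport are the names from the post.

```jsl
// Time the column deletion
t0 = Tick Seconds();
data << Delete Columns( "P(PA) Formula" );
tDelete = Tick Seconds() - t0;

// Time closing the report window
t0 = Tick Seconds();
modelreport << Close Window;
tClose = Tick Seconds() - t0;

Show( tDelete, tClose );   // elapsed seconds for each step
```

Inside a loop over models, the two differences would be accumulated rather than overwritten.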

Jan 9, 2016 5:22 AM
(2600 views)

Jan 9, 2016 11:07 AM
(2600 views)

The reports from the fit never appear on the screen, but I never explicitly made the reports invisible. I see that there is an 'invisible' option for launching fits....

When I specify the invisible option on the platform launch, it reduces the time from 45 seconds to 30 seconds.

The original data table (into which the prediction formulae are successively saved) and the final summary data table (to which the results from each model fit are successively added) were both visible. I tried making the original data table invisible, and that really speeds things up (now down to 15 seconds), even with adding and removing the prediction columns.

The final results table was built up row by row so that I could watch the rows being added. I made that invisible as well, and the time dropped again, to 8 seconds!

So the bottleneck appears to be screen operations rather than underlying CPU time.
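
The three changes described above might look roughly like this. It is a sketch using a JMP sample table and the Distribution platform as stand-ins; the actual script uses a different table and a generalized linear model fit, and the summary table layout here is hypothetical.

```jsl
// Open the source table without a window
dt = Open( "$SAMPLE_DATA/Big Class.jmp", Invisible );

// Summary table that collects the results, also without a window (hypothetical columns)
summary = New Table( "Model summary", Invisible,
	New Column( "Model", Character ),
	New Column( "AIC", Numeric )
);

// Launch a platform without showing its report window
dist = dt << Distribution( Invisible, Column( :height ) );

// ... extract whatever is needed from the report here ...

dist << Close Window;
Close( summary, No Save );
Close( dt, No Save );
```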

Now the revised timings (on the final speeded up script) are

56% model fitting

13% generating model reports

12% closing the model report (even though it is invisible?)

When looking at my log, my line

modelfit << prediction formula;

triggered a warning message about a missing boolean operator. If I switch it to

modelfit << prediction formula(1);

the script still runs fine without a warning message. However, I can't seem to find any documentation in the JSL Syntax Reference or Scripting Guide on what a

modelfit << prediction formula(0);

would do? I'll try writing a small script to see if the latter would also not evaluate the prediction formula.

I'm now trying to modify the script so that the "invisible" option can be set on the fly. I want to do something like

runsilent=1;

data << open("blah", if(run silent, "Invisible"));

but JMP always halts at that point... I always have problems working out how to make messages dynamic when "hard-wired" keywords are needed. Back to the books....

Carl.

Jan 11, 2016 6:08 AM
(2600 views)

Invisible is a key word so it's not so easy to set as a parameter. I usually just use IF statements:

My Distribution = Function( {dt, col, doInvisible}, {Default Local},
    If( doInvisible,
        // launch with the Invisible keyword
        dt << Distribution( Invisible, Column( Eval( col ) ) ),
        // launch normally
        dt << Distribution( Column( Eval( col ) ) )
    )
);

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

My Distribution( dt, "height", 1 );

If you want to set it as a parameter, you can build the command as a string, then parse and execute it:

My Distribution = Function( {dt, col, doInvisible}, {Default Local},
    If( doInvisible,
        keyword = "Invisible",
        keyword = ""
    );
    Eval( Parse( Eval Insert( "\[
        dt << Distribution( ^keyword^, Column( "^col^" ) )
    ]\" ) ) );
);

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

My Distribution( dt, "height", 1 );

-Dave

Jan 11, 2016 9:17 AM
(2600 views)

Thanks for the suggestions... I also came up with

// The runsilent flag below is useful for debugging. Set it to 0 to make everything visible. 1 to run silently

runsilent = 0;

Eval( Parse( "data = Open( \!"SPPI_ Data_JMP_Revised.jmp\!"" || If( runsilent, ",\!"invisible\!"", " " ) || ");" ) );

Note that I had to include the comma inside the ",invisible" string so that the expression still works when runsilent=0; otherwise the call would contain an "empty" argument.

Carl.

Jan 12, 2016 7:21 AM
(2600 views)

slightly different, and no parsing

runsilent = 1;

invis = Expr( invisible ); // this isn't a string or a list, and it isn't evaluated

dt = Expr( Open( "$SAMPLE_DATA/Big Class.jmp" ) );

If( runsilent == 1, Insert Into( dt, Name Expr( invis ) ) ); // use Name Expr to return the expression's symbol without evaluating it

Jan 13, 2016 4:43 PM
(2600 views)

Neat... but I don't really understand how 'insert into' knows where to insert the "invis" expression into the "dt" expression. I guess that Expr parses the expression into some sort of tree, and then 'insert into' simply adds the "invis" argument as another leaf in the tree?

I don't have a good enough understanding of how arguments are passed to functions in JMP. I come from an R background, where it is either positional or argument=value, and I find JMP's use of things like Invisible in the Open() function a bit weird (because it isn't a string). Any good references for this way of calling functions?

I haven't tried this yet, but I assume you need to evaluate the expression 'dt' to open the actual data table.
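
A small sketch that may help with both points. It reuses the approach from the previous reply but stores the expression under the hypothetical name openExpr, so that dt can hold the table itself. The inspection calls are there only to show that the stored expression is a call whose argument list Insert Into appends to.

```jsl
runsilent = 1;

invis = Expr( Invisible );                                // the bare keyword, unevaluated
openExpr = Expr( Open( "$SAMPLE_DATA/Big Class.jmp" ) );  // a stored call; nothing is opened yet

Show( N Arg( openExpr ) );        // number of arguments before the insert

If( runsilent == 1,
	Insert Into( openExpr, Name Expr( invis ) )           // appended as a new last argument
);

Show( N Arg( openExpr ) );        // number of arguments after the insert
Show( Name Expr( openExpr ) );    // prints the stored call without running it

dt = Eval( openExpr );            // only now is the table actually opened
Close( dt, No Save );
```

Keeping the unevaluated call and the resulting table under separate names makes it clearer which one is being passed around later in the script.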