Consistent handling of excluded rows

Hi,

I contacted JMP support because I assumed I had found a bug in JMP. Instead, I learned the following:

- If rows are marked as "excluded" (whether or not they are also hidden), they are left out of many computations (e.g. a histogram ignores excluded rows).

- If rows are marked as "excluded" (whether or not they are also hidden), platforms like Boosted Tree or Bootstrap Forest use them as the validation set when no other validation set is specified.

Apparently this is intended behavior. Personally, I found it confusing, and I was fortunate to learn about it from support before publishing my results: in my case the excluded rows are invalid data that I keep only for tracking purposes.

So, finally, my wish: could JMP be fully consistent in how it handles excluded rows?

Here's a quote that JMP support sent me:

"The Bootstrap Forest and several other platforms in JMP Pro have a feature that if some rows are excluded, and you do not otherwise specify a Validation set, those rows are used as the Validation set. To avoid those rows from being included at all, you could:

1) Subset the data table so it doesn't include those rows, then re-run the Bootstrap Forest

2) Use a different Validation method (Holdback or Validation Column)

In general, it is often a good idea to devote some rows to a Validation set. This would give you the ability to use the Early Stopping option, which can help avoid overfitting."
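
For workaround 1 in the quote above, something along these lines might work in JSL. This is only a sketch, assuming the table of interest is the current data table; the names dt and dtClean are made up for illustration and do not come from support.

```jsl
// Sketch only: dt and dtClean are illustrative names.
dt = Current Data Table();

// Select every excluded row, then flip the selection so that
// only the non-excluded rows are selected.
dt << Select Excluded;
dt << Invert Row Selection;

// Copy the selected (non-excluded) rows, with all columns, into a new table.
dtClean = dt << Subset( Selected Rows( 1 ), Selected Columns Only( 0 ) );

// Clear the selection that this script made in the original table.
dt << Clear Select;
```

Running Bootstrap Forest on the subset table should then leave no excluded rows for the platform to pick up as a validation set.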

Tracking Number:

Defect ID: S1559592

1 Comment
Community Manager
Status changed to: Delivered

Starting in JMP 15, excluded rows are not used for Validation unless you submit the following JSL:

Use Excluded Rows for Validation(1)