Re: Partition Models: different results for each run


Jun 12, 2017 3:25 PM
(1250 views)

Hi All!

I am running partition models (bootstrap forest, decision tree, and boosted tree) and would really appreciate some suggestions on how to solve the following problem. For example, I've run bootstrap forest on my data several times, and the validation RSquare ranges from -0.2 to 0.8. The ranking of variables under column contributions also changes a lot from run to run. I am not sure what causes the unstable results (small data set?).

Here is a short description of my data set. I also attach the data file.

48 observations, 5 continuous predictor variables

Response variable (i.e. the one to be predicted) is continuous

Does anyone have advice that would help solve this problem? Thanks!

Shenxuan

3 REPLIES


This is not a problem. It is inherent in the methods that you are using. They rely on a random assignment, so each run will necessarily be different.

You can set the random seed to the same value before each run and you will always get the same results. But why is a particular sample better than another?
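To illustrate the point about seeds outside JMP, here is a minimal Python sketch (not JSL, and not how JMP implements it). The function name `split_rows`, the 25% holdout fraction, and the seed values are assumptions made up for the example; only the 48 rows come from the question.

```python
import random

def split_rows(n_rows, validation_fraction, seed):
    """Randomly assign row indices to training and validation sets."""
    rng = random.Random(seed)  # private generator: the seed fully determines the split
    rows = list(range(n_rows))
    rng.shuffle(rows)
    n_valid = round(n_rows * validation_fraction)
    return sorted(rows[n_valid:]), sorted(rows[:n_valid])

# 48 rows as in the question; the 25% validation fraction is an assumption
train_a, valid_a = split_rows(48, 0.25, seed=123)
train_b, valid_b = split_rows(48, 0.25, seed=123)
train_c, valid_c = split_rows(48, 0.25, seed=456)

print(valid_a == valid_b)  # same seed, so the same validation rows
print(valid_a == valid_c)  # different seed, so very likely different rows
```

With only 12 validation rows, which rows happen to land in the holdout set can change the validation RSquare a lot, which is why fixing the seed makes the result repeatable but not more trustworthy.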

Learn it once, use it forever!


Do you know about the methods that you are using? You might find your answer by selecting **Help** > **Books** > **Predictive and Specialized Modeling**. There are chapters devoted to the methods that you mention.

The JMP guides are not meant to replace an education about predictive modeling but they are still valuable and informative resources.

Learn it once, use it forever!


Oh, and yes, 46 observations is a small data set for these methods.

Have you done your preliminary analysis prior to modeling to understand the data? Distribution of predictors and response? Multivariate survey (e.g., scatter plot matrix)? What did you find? Outliers? Collinearity among predictors? Are transformations suggested?

Have you tried other methods like penalized regression (**Fit Model** > **Generalized Regression**)? They do not involve a random assignment unless you use cross-validation. (And if you do use it, you must use K-fold cross-validation or leave-one-out cross-validation with only 46 observations and 10-15 potential terms in the linear predictor.)
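For readers unfamiliar with the two cross-validation schemes mentioned above, here is a rough Python sketch of how the rows get divided (again, not JSL and not JMP's implementation). The helper `k_fold_indices` and the choice of 5 folds are assumptions for illustration; leave-one-out is just the special case where the number of folds equals the number of rows.

```python
import random

def k_fold_indices(n_rows, k, seed=None):
    """Yield (train, validation) index lists; every row is held out exactly once."""
    rows = list(range(n_rows))
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield sorted(train), sorted(held_out)

n = 46
five_fold = list(k_fold_indices(n, k=5, seed=1))
leave_one_out = list(k_fold_indices(n, k=n))  # k = n: each fold holds a single row

print([len(valid) for _, valid in five_fold])  # → [10, 9, 9, 9, 9]
print(len(leave_one_out))                      # → 46
```

Because every row is used for validation exactly once, the estimate depends much less on one lucky or unlucky holdout sample than a single random split does.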

Learn it once, use it forever!