Data mining with Small samples...Regression with Small Samples?

May 9, 2013 12:00 PM
(3465 views)

Dear JMP experts,

I know this kind of discussion might be both very silly and a hot issue at the same time, but I thought it would be worth raising: which types of linear AND non-linear regression does JMP offer when you have a small sample size (e.g. 50-70 records) and want to predict a continuous outcome from 5-7 predictors (both categorical and continuous)?

Can k-fold cross-validated stepwise linear regression OR Partial Least Squares regression be a remedy for this problem? Are boosted trees OR neural networks just insane even to think about?

Your responses are GREATLY WELCOMED and VERY MUCH APPRECIATED.

Respectfully,
Chris

10 REPLIES

Whether these procedures will work or not at those sample sizes depends on signal-to-noise. If you have lots of signal and low noise, you probably won't even need 50 records. Other way around, low signal and high noise, and you're probably wasting your time.

Re: Data mining with Small samples...Regression with Small Samples?

Hi PaigeMiller

Would you please be so kind and elaborate a little bit more about signal and noise? (just one sentence)

Please accept my apologies but for me these two terms are pretty allegorical to my silly mind.

THANK YOU!

Chris

Re: Data mining with Small samples...Regression with Small Samples?

Would you please be so kind and elaborate a little bit more about signal and noise? (just one sentence) Please accept my apologies but for me these two terms are pretty allegorical to my silly mind.

Most statistical modeling attempts to determine whether there is a signal (a real relationship) that is larger than the variability of the errors (noise). Each modeling technique that I am aware of gives a measure of this signal-to-noise ratio; for example, in standard regression it would be the overall F-test.

There's no reason you can't have a very large signal and very low noise in 50 data points. It depends on the data.
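
To make the signal-to-noise idea concrete, here is a minimal Python sketch (not JMP output) that computes the overall F statistic for a one-predictor regression on two simulated 50-record data sets. The slopes, noise levels, seed, and the helper names `overall_f` and `simulate` are all assumptions chosen for illustration; nothing here comes from the poster's data.

```python
import random


def overall_f(x, y):
    """Overall F statistic for the simple linear regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    # Signal: variation explained by the model (1 degree of freedom).
    ss_model = sum((fi - mean_y) ** 2 for fi in fitted)
    # Noise: residual variation (n - 2 degrees of freedom).
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return (ss_model / 1) / (ss_error / (n - 2))


def simulate(slope, noise_sd, n=50, seed=1):
    """Hypothetical data: y = slope * x + Gaussian noise, with n records."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, noise_sd) for xi in x]
    return x, y


# Same sample size, very different signal-to-noise.
f_strong = overall_f(*simulate(slope=2.0, noise_sd=0.5))
f_weak = overall_f(*simulate(slope=0.05, noise_sd=5.0))
print(f"strong signal, low noise:  F = {f_strong:.1f}")
print(f"weak signal, high noise:   F = {f_weak:.2f}")
```

Both data sets have 50 records; only the ratio of signal to noise differs, and that is what drives the F statistic.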

Re: Data mining with Small samples...Regression with Small Samples?

That's much clearer. Thank you VERY VERY much

Kind Regards

C

The first thing I do with such skimpy datasets is a regression tree analysis. It is a quick and intuitive way to explore the relationship between your dependent and predictor variables. The first three or four nodes are generally meaningful. - PG

Re: Data mining with Small samples...Regression with Small Samples?

First of all, thank you very much for your prompt reply. I really appreciate it.

PGStats, do you mean Classification and Regression Tree (CART) analysis?

Alternatively, do you think that multivariate adaptive regression splines (MARS) could also work for a small sample size?

Many thanks again!

Re: Data mining with Small samples...Regression with Small Samples?

P.S. I know that MARS is not offered in JMP, but it is offered in SAS, so a script might help.

Yes, I meant CART-type analysis, called Partition in JMP. When the dependent variable is continuous, the result is a regression tree; otherwise, it's a classification (or decision) tree. I think regression splines eat up too many degrees of freedom to be applicable to a dataset of your size. - PG
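
For readers unfamiliar with regression trees, the core step can be sketched in a few lines of Python. This is only an illustration of a single CART-style split on one continuous predictor, choosing the threshold that minimizes the within-node sum of squared errors; it is not JMP's Partition implementation, and the function names and toy data are made up for the example.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse, left_mean, right_mean) for the best split."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yi for _, yi in pairs[:i]]
        right = [yi for _, yi in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total,
                    sum(left) / len(left), sum(right) / len(right))
    return best


# Toy data (hypothetical): the response jumps once x exceeds 5.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.0, 1.2, 0.9, 1.1, 1.0, 3.0, 3.1, 2.9, 3.2, 3.0]

threshold, total_sse, left_mean, right_mean = best_split(x, y)
print(f"split at x < {threshold}: "
      f"left mean {left_mean:.2f}, right mean {right_mean:.2f}")
```

A full tree repeats this search within each resulting node and across all predictors. Each node's prediction is just the mean of the records that fall into it.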

Re: Data mining with Small samples...Regression with Small Samples?

Thanks PG, this helps. I will check it out.
