how do I create a subset that is not a SRS but follows a given distn (specify the mean and SD)

Feb 5, 2016 12:00 PM
(1440 views)

I have a large dataset and I need to subsample it to form two groups. One is an SRS (that is easy). The other is a purposeful sample that will match a given distribution. I have the data for the distribution I am trying to match, and I know its mean and SD. How can I do this?

4 REPLIES

Re: how do I create a subset that is not a SRS but follows a given distn (specify the mean and SD)

I can't think of a way to do this without scripting it. Even then, I don't know what strategy or algorithm you might use. Do you have any references or examples of how this might be accomplished? With that the Community might be able to provide some direction on some JSL.

-Jeff

Re: how do I create a subset that is not a SRS but follows a given distn (specify the mean and SD)

I think I figured it out.

Thank you!

Re: how do I create a subset that is not a SRS but follows a given distn (specify the mean and SD)

I'm glad to hear it, Sarah.

Can you share what method you ended up with?

-Jeff

Re: how do I create a subset that is not a SRS but follows a given distn (specify the mean and SD)

A is approx bimodal; this is the group I want to match. B is a much larger (10x larger) population with a strong left skew. The mean of A << the mean of B. I was trying to get a subset of B that matches A.

Get a frequency table for group A.

Get a histogram of group B, with the bin size matched to group A.

For the first bin, take an SRS from the group B rows in that bin, with n equal to the group A count for that bin, so the sample of B has the same n in that bin as A.

Repeat for each bin across the histogram in order to build the complete subset of B.
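
For anyone who wants to script this rather than select rows by hand, here is a minimal sketch of the bin-by-bin steps in Python (not JSL). It is not the exact procedure I ran in JMP: the function names `bin_index` and `match_by_bins`, the equal-width bins over A's range, and the default of 10 bins are all assumptions for illustration. If B has fewer values in a bin than A does, the sketch simply takes what is available.

```python
import random
from collections import defaultdict

def bin_index(x, lo, width, n_bins):
    """Index of the equal-width bin containing x, clamped to the valid range."""
    i = int((x - lo) // width)
    return min(max(i, 0), n_bins - 1)

def match_by_bins(a, b, n_bins=10, seed=0):
    """Draw a subset of b whose per-bin counts follow those of a (hypothetical helper)."""
    rng = random.Random(seed)
    lo, hi = min(a), max(a)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all of a is identical

    # Frequency table for group A.
    a_counts = [0] * n_bins
    for x in a:
        a_counts[bin_index(x, lo, width, n_bins)] += 1

    # Pool group B values into the same bins; values outside A's range are skipped.
    b_bins = defaultdict(list)
    for x in b:
        if lo <= x <= hi:
            b_bins[bin_index(x, lo, width, n_bins)].append(x)

    # SRS without replacement within each bin, sized to A's count for that bin.
    subset = []
    for i, n in enumerate(a_counts):
        pool = b_bins[i]
        subset.extend(rng.sample(pool, min(n, len(pool))))
    return subset
```

Fixing `seed` makes the draw repeatable, which helps when checking that the right rows were selected.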

I used a two-sample t-test to make sure the means weren't too different between groups.
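
As a rough illustration of that check, the sketch below computes a two-sample t statistic in Python using only the standard library. It assumes the Welch (unequal variance) form, which may not be the variant used in JMP, and the name `welch_t` is hypothetical. It returns the statistic and approximate degrees of freedom only; a p-value would need a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    # Squared standard error contributed by each group (sample variance / n).
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df
```

A t value near zero says only that the two means are close; comparing the histograms of A and the subset of B is still worthwhile.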

This took a while to build and I had to be careful not to accidentally select the wrong rows. But it worked.