Are these equivalent ways to model the response variable?

Dec 8, 2016 3:19 PM
(4170 views)

Hello,

My example involves a repeated measures experiment where a sample data point is collected from each subject every 10 minutes, for a total of 10 measurements (t0, t1, ..., t9).

As a simple example, the response variable of interest is: the interval of time between consecutive eye blinks.

My question... are the following two definitions of the response variable equivalent?

1) The number of eye blinks recorded in each 10-minute time window

2) The average time between consecutive blinks in each 10-minute time window

Also, would these be classified as Poisson, binomial, or are they normally distributed?

Thank you,

JP

1 ACCEPTED SOLUTION

Accepted Solutions

I think you should not treat those two responses the same. Just compare the two scenarios:

Person 1 has a very regular frequency of blinking.

Person 2 blinks at a higher frequency, but on one occasion, for whatever reason, she had a long gap between two blinks.

Both might end up with the same number of blinks but they have different mean times between blinks.
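A minimal sketch of this comparison, using hypothetical blink timestamps (the 100-second window and the 10 s, 5 s and 40 s gaps are illustrative values, loosely based on the figures quoted in the follow-up reply, not measured data):

```python
from statistics import mean


def summarize(blink_times_s, window_s):
    """Return (blink count, mean gap in seconds) for blinks inside one window."""
    inside = sorted(t for t in blink_times_s if 0 <= t <= window_s)
    # Gaps between consecutive blinks that both fall inside the window.
    gaps = [later - earlier for earlier, later in zip(inside, inside[1:])]
    return len(inside), mean(gaps)


WINDOW_S = 100  # hypothetical window length, shortened from 10 minutes

# Person 1: perfectly regular, one blink every 10 seconds.
person_1 = [10 * k for k in range(1, 11)]
# Person 2: one blink every 5 seconds, then a single 40-second pause.
person_2 = [5 * k for k in range(1, 10)] + [85]

count_1, gap_1 = summarize(person_1, WINDOW_S)
count_2, gap_2 = summarize(person_2, WINDOW_S)

print("Person 1:", count_1, "blinks, mean gap", gap_1, "s")
print("Person 2:", count_2, "blinks, mean gap", gap_2, "s")
```

Both subjects produce the same count for the window, yet the mean gap differs, so the two response definitions carry different information.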

The number of blinks is probably Poisson distributed.

The average time between blinks might be approximated by a normal distribution, but it is probably closer to a Weibull or gamma distribution, as times cannot be negative.
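To see how the two distributions relate, here is a small simulation sketch. It assumes blinks arrive independently at a constant rate (a homogeneous Poisson process), which is a simplifying assumption and not something the experiment guarantees; the rate of 0.25 blinks per second and the number of simulated windows are made-up values.

```python
import random
import statistics

RATE_PER_S = 0.25   # assumed blink rate (hypothetical)
WINDOW_S = 600      # one 10-minute window
N_WINDOWS = 2000    # number of simulated windows (hypothetical)


def simulate_window(rate_per_s, window_s, rng):
    """Return blink times in [0, window_s) with exponential gaps between blinks."""
    times = []
    t = rng.expovariate(rate_per_s)
    while t < window_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


rng = random.Random(2016)  # fixed seed so the run is repeatable
windows = [simulate_window(RATE_PER_S, WINDOW_S, rng) for _ in range(N_WINDOWS)]

# Response definition 1: number of blinks per window.
counts = [len(w) for w in windows]
# Response definition 2: mean time between consecutive blinks per window.
mean_gaps = [
    statistics.mean(b - a for a, b in zip(w, w[1:]))
    for w in windows
    if len(w) >= 2
]

count_mean = statistics.mean(counts)
count_var = statistics.variance(counts)
print("count mean:", count_mean, "count variance:", count_var)
print("smallest mean gap:", min(mean_gaps), "largest:", max(mean_gaps))
```

Under this assumption the counts have roughly equal mean and variance, as expected for a Poisson variable, while the mean gaps are strictly positive and continuous. Real blink data may be overdispersed or autocorrelated, so it is worth checking both before choosing a distribution.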

5 REPLIES

Thank you for your reply.

Person 1 could likely have more than 10 blinks, as there are 15 seconds (9*5 + 40 = 85s vs. 10*10 = 100s) unaccounted for if the test interval was 100s. This assumes that the potential 11th blink for person 1 doesn't come more than 15 seconds after the 10th, in which case it would contribute to the next time interval.

That being said, your point was clear and makes a lot of sense.

I agree that the two methods are not the same. Here is an example that may be a little extreme, but is useful: Suppose you are tracking recordable injuries per month at your company location via process behavior charts. Some months there are zero injuries, some months there are 1, 2, 3, etc. If the average number of recordable injuries per month is low (< 6), then the data are "chunky" (see Dr. Donald Wheeler's articles on chunky data), and the control chart method does not yield useful results or conclusions. HOWEVER, if you track "days between recordable injuries", the control chart works quite well.

What is the distribution of the data? What does it matter? The I-MR chart does not require the data to have a particular distribution!

Steve
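For readers who have not built an I-MR (individuals and moving range) chart by hand, the limits can be sketched as follows. The "days between injuries" values are invented for illustration, and the constants 2.66 and 3.268 are the usual scaling factors for moving ranges of two consecutive points; JMP's own control chart platform may differ in details.

```python
from statistics import mean


def imr_limits(values):
    """Return center line and limits for an individuals / moving-range chart."""
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": center,
        "mr_bar": mr_bar,
        "lower": center - 2.66 * mr_bar,   # lower natural process limit
        "upper": center + 2.66 * mr_bar,   # upper natural process limit
        "upper_range": 3.268 * mr_bar,     # upper limit for the moving ranges
    }


# Hypothetical days between consecutive recordable injuries.
days_between = [12, 30, 7, 45, 21, 16, 38, 9]

limits = imr_limits(days_between)
for name, value in limits.items():
    print(name, round(value, 2))
```

A negative lower limit has no meaning for a time between events, so in practice only the upper limit is informative for data like this.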

Dec 9, 2016 12:42 PM
(4119 views)
| Posted in reply to message from Steven_Moore 12/09/2016 03:00 PM

Thank you. I will look into the literature of 'chunky' data.

If you are ultimately trying to build regression models with lots of zeros in the response variable, there is a family of regression techniques known generically as 'zero-inflated'. Zero-inflated modeling is part of JMP Pro in the Fit Model -> Generalized Regression personality, Distribution: ZI Binomial (and others). Here is a link to the relevant section of the JMP online documentation as well:

http://www.jmp.com/support/help/13/Distribution.shtml
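As a rough illustration of what 'zero-inflated' means, the sketch below writes out a zero-inflated Poisson probability function in plain Python. This is not JMP's implementation, and it uses the Poisson variant rather than ZI Binomial because it fits count data like blinks per window; the values of `LAM` and `PI_ZERO` are arbitrary.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing k events."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson: with probability pi_zero the count is forced to 0,
    otherwise it is drawn from a Poisson distribution with mean lam."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


LAM = 2.0      # hypothetical Poisson mean
PI_ZERO = 0.3  # hypothetical share of "structural" zeros

p0_plain = poisson_pmf(0, LAM)
p0_zip = zip_pmf(0, LAM, PI_ZERO)
total = sum(zip_pmf(k, LAM, PI_ZERO) for k in range(60))

print("P(0) plain Poisson:", p0_plain)
print("P(0) zero-inflated:", p0_zip)
print("sum of ZIP probabilities:", total)
```

The extra mass at zero is what lets such a model fit responses with more zeros than a plain Poisson would predict.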