DOE analysis in JMP, confused


Apr 20, 2009 7:10 AM
(1936 views)

I used DOE ----> Screening Design to set up a DOE study for one output (height, to be maximized) with 3 factors (rate, temp, speed, all continuous). I chose a full factorial design (8 runs + 2 center points).
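For readers following along without JMP, the design described above can be written out in coded units. This is only a minimal sketch: the function name and the -1/0/+1 coding are assumptions for illustration, and JMP would normally also randomize the run order.

```python
from itertools import product

def full_factorial_with_centers(n_factors=3, n_centers=2):
    """Return runs in coded units: -1 = low, +1 = high, 0 = center."""
    # All 2^k combinations of low/high settings (the factorial corners).
    corners = [list(levels) for levels in product((-1, 1), repeat=n_factors)]
    # Replicated center points, with every factor at its mid level.
    centers = [[0] * n_factors for _ in range(n_centers)]
    return corners + centers

design = full_factorial_with_centers()
for run, (rate, temp, speed) in enumerate(design, start=1):
    print(run, rate, temp, speed)
```

The result has 8 corner runs plus 2 identical center runs, 10 rows in total, which matches the design table in the question.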

After I got the results of all 10 runs, I typed them into the design table.

This is where I got confused.

Now I have two ways to analyze the data.

1. Analyze ---> Fit Model.

In this way, "height" is automatically picked in "Y", and "rate", "temp", "speed", "rate*temp", "rate*speed" and "temp*speed" are automatically picked in "Construct Model Effects". "Effect Screening" is automatically picked as "Emphasis".

After I run the model, none of the main effects or interactions is significant, because no p-value is smaller than 0.05.

2. Analyze ---> Modeling ---> Screening

In this way, I put "height" in "Y". I put "rate", "temp" and "speed" in "X" and hit "OK".

In the "Contrasts" window and "Half Normal Quantile" plot, JMP says "speed", "rate", "speed*speed", "speed*rate" and "speed*rate*temp" are all significant (because their "Individual p-Value"s are smaller than 0.05).

Then I hit "Run Model", and again the main effects and interactions mentioned above are significant. However, I got different p-values for those significant factors and interactions.

Here are my questions:

1. Why do these two ways give such different results?

2. Which way is the right way to analyze the data?

3. If the first way is the right one, and I see that there is no significant factor in my design, what is the next step to find the significant ones? Pick other possible factors? Change the range of the chosen factors?

Thanks a lot for your help!


5 REPLIES


I would not use the model fit by your #2 in this case. It has a speed*speed term, but your design doesn't really support quadratic effects. It also has a three-way interaction, so I think the model fit in case #2 is quite inappropriate here. (Of course, you didn't tell us what model was fit in case #2, you just told us what the significant terms were...)

Regarding question 3: there is no significant effect here. There could be many reasons, and depending on your actual experiment and the underlying phenomena, there are many ways to proceed. Changing the range of the factors, reducing the noise of the experiment, and looking at different factors are all possibilities.



Hi Paige,

Thanks for your reply.

For my case #2, instead of hitting "Run Model" directly at the bottom of the "Half Normal Quantile" plot, if I hit "Make Model", I see "height" in "Y". I also see "rate", "speed", "speed*speed", "speed*rate", "speed*rate*temp" in "Construct Model Effects". Basically all of them have positive contrast values in the "Contrasts" window. And "Effect Screening" is automatically picked as "Emphasis".

I still do not understand: if my design does not really support quadratic effects, why did JMP include the speed*speed term and the three-way interaction when I used option #2?

And what is really the difference (in terms of analytical method) between my case #1 and #2 for the same set of data? The results are so different.

Thanks a lot.



I have version 5 of JMP. I do not know what your case 2 is, as it is not an option in my version of JMP. Nor can I explain how it constructed the model effects. Your design does not support quadratic effects; it does support three-way interactions, although they should be used sparingly if at all. I wouldn't recommend a three-way interaction in a model unless there was a good theoretical reason for including it.

The difference between 1 and 2 is that 1 is appropriate for your design, and 2 is not. Therefore, you ignore results from 2, and believe results from 1.
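The statement that this design does not support quadratic effects can be checked numerically. The sketch below assumes the 2^3 factorial plus 2 center points is coded as -1/0/+1 (an assumption; the real factor ranges are not given in the thread) and shows that the three squared columns are identical, so rate², temp² and speed² cannot be told apart.

```python
import numpy as np
from itertools import product

# 8 factorial corners plus 2 center points, in coded units (assumed coding).
corners = np.array(list(product((-1.0, 1.0), repeat=3)))
centers = np.zeros((2, 3))
X = np.vstack([corners, centers])  # columns: rate, temp, speed

# Squaring any factor gives 1 at every corner and 0 at every center point.
squares = X ** 2
all_squares_identical = (
    np.array_equal(squares[:, 0], squares[:, 1])
    and np.array_equal(squares[:, 1], squares[:, 2])
)

# Intercept plus three squared columns spans only 2 dimensions:
# one for the intercept and one for a single pooled curvature term.
M = np.column_stack([np.ones(len(X)), squares])
rank = np.linalg.matrix_rank(M)

print("squared columns identical:", all_squares_identical)
print("rank of [1, rate^2, temp^2, speed^2]:", rank)
```

In other words, the center points can detect overall curvature, but a term labeled speed*speed would be indistinguishable from rate*rate or temp*temp with this design.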



In the first case JMP uses the default model (main effects and two-way interactions) that was scripted to your data table when you set up your full factorial design. All main effects and interactions can be estimated separately because it is a full factorial design. Your center-point replicates give you a lack-of-fit test, in which pure error and error due to curvature (lack of fit) are partitioned. Based on your description, it sounds like there is some lack of fit in your model, which could be why none of your main effects and interactions appear significant under the default model. It could also be that many of your terms can be removed by fitting your model with personality = Stepwise rather than the default personality = Standard Least Squares.
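To make the partition of residual error concrete, here is a rough sketch of the default model fit on a 10-run design of this kind. The response values are simulated and purely hypothetical (the poster's data are not in the thread), and the variable names are illustrative; the point is only how the degrees of freedom split.

```python
import numpy as np
from itertools import product

# Coded design: 8 corners followed by 2 center points (assumed coding).
corners = np.array(list(product((-1.0, 1.0), repeat=3)))
design = np.vstack([corners, np.zeros((2, 3))])
rate, temp, speed = design.T

# Default model: intercept, main effects, two-way interactions (7 terms).
M = np.column_stack([np.ones(10), rate, temp, speed,
                     rate * temp, rate * speed, temp * speed])

# Hypothetical responses, simulated only to have numbers to work with.
rng = np.random.default_rng(0)
height = 50 + 3 * rate + 2 * speed + rng.normal(0, 1, 10)

beta, *_ = np.linalg.lstsq(M, height, rcond=None)
resid = height - M @ beta
ss_resid = float(resid @ resid)
df_resid = 10 - np.linalg.matrix_rank(M)

# Pure error comes only from the replicated center points (rows 8 and 9).
center_y = height[8:]
ss_pure = float(((center_y - center_y.mean()) ** 2).sum())
df_pure = len(center_y) - 1

# Lack of fit is whatever residual variation is not pure error.
ss_lof = ss_resid - ss_pure
df_lof = df_resid - df_pure
f_lof = (ss_lof / df_lof) / (ss_pure / df_pure)

print("residual df:", df_resid, "pure error df:", df_pure, "lack-of-fit df:", df_lof)
print("lack-of-fit F ratio:", f_lof)
```

With 10 runs and 7 model terms there are only 3 residual degrees of freedom (1 pure error, 2 lack of fit), so the t-tests in this fit have very little power. That may be part of why nothing looks significant in case #1.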

In the second case you are using a new platform introduced in JMP 7 (Analyze>Modeling>Screening). The bottom line is that it considers all main effects, two-way interactions and polynomial terms, and uses a Monte Carlo simulation to judge which terms most accurately describe your data. I believe that if you try the stepwise platform mentioned previously and enter all of the terms to be considered for a model, namely main effects, two-way interactions and polynomial terms, you may get a result similar to the one you got with method 2.

In all cases you should be looking at your adjusted R square to see which model describes your situation to help settle on a model. Remember... "All models are wrong but some are useful" (G. Box) So you should verify your model by running a verification experiment.

Below is a description taken from the online documentation on the Analyze>Modeling>Screening Platform...

The Screening platform has a carefully defined order of operations.

First, the main effect terms enter according to the absolute size of their contrast. All effects are orthogonalized to the effects preceding them in the model. The method assures that their order is the same as it would be in a forward stepwise regression. Ordering by main effects also helps in selecting preferred aliased terms later in the process.

After main effects, all second-order interactions are brought in, followed by third-order interactions, and so on. The second-order interactions cross with all earlier terms before bringing in a new term. For example, with size-ordered main effects A, B, C, and D, B*C enters before A*D. If a factor has more than two levels, square and higher-order polynomial terms are also considered.
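One way to read the ordering rule for second-order interactions is sketched below. This is an interpretation of the quoted documentation, not JMP's actual code, and the function name is hypothetical: each newly introduced main effect is crossed with all earlier ones before the next main effect is brought in.

```python
def second_order_entry_order(main_effects):
    """List two-way interactions in the order implied by the quoted rule.

    `main_effects` is assumed to be already sorted by absolute contrast size.
    """
    order = []
    for j in range(1, len(main_effects)):
        # Cross the j-th effect with every effect that ranks ahead of it.
        for i in range(j):
            order.append(f"{main_effects[i]}*{main_effects[j]}")
    return order

print(second_order_entry_order(["A", "B", "C", "D"]))
# ['A*B', 'A*C', 'B*C', 'A*D', 'B*D', 'C*D']
```

In this ordering B*C comes before A*D, which agrees with the example in the documentation.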

An effect that is an exact alias for an effect already in the model shows in the alias column. Effects that are a linear combination of several previous effects are not displayed. If there is partial aliasing, i.e. a lack of orthogonality, the effects involved are marked with an asterisk.

The process continues until n effects are obtained, where n is the number of rows in the data table, thus fully saturating the model. If complete saturation is not possible with the factors, JMP generates random orthogonalized effects to absorb the rest of the variation. They are labeled Null n where n is a number. This situation occurs, for example, if there are exact replicate rows in the design.

**Screening as an Orthogonal Rotation**

Mathematically, the Screening platform takes the n values in the response vector and rotates them into n new values that are mapped by the space of the factors and their interactions.

**Contrasts = T' × Responses**

where T is an orthonormalized set of values starting with the intercept, main effects of factors, two-way interactions, three-way interactions, and so on, until n values have been obtained. Since the first column of T is an intercept, and all the other columns are orthogonal to it, these other columns are all contrasts, i.e. they sum to zero. Since T is orthogonal, it can serve as X in a linear model, but it doesn't need inversion, since T' is also T⁻¹ and also (T'T)⁻¹T', so the contrasts are the parameters estimated in a linear model.
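The rotation can be illustrated with a small numerical sketch. To keep T easy to write down, it assumes a plain unreplicated 2^3 design (8 runs, no center points) and a made-up response vector, which is simpler than the 10-run design discussed in this thread.

```python
import numpy as np
from itertools import product

# Unreplicated 2^3 factorial in coded units (a simplifying assumption).
corners = np.array(list(product((-1.0, 1.0), repeat=3)))
a, b, c = corners.T

# Saturated model: intercept, main effects, two-way and three-way interactions.
X = np.column_stack([np.ones(8), a, b, c, a * b, a * c, b * c, a * b * c])

# The columns of X are mutually orthogonal with length sqrt(8),
# so scaling by 1/sqrt(8) makes them orthonormal.
T = X / np.sqrt(8)

# Made-up responses, only for illustration.
rng = np.random.default_rng(1)
y = rng.normal(size=8)

contrasts = T.T @ y                 # Contrasts = T' x Responses
ls_estimates = np.linalg.solve(T, y)  # parameters of the model y = T @ beta

print(np.allclose(T.T @ T, np.eye(8)))       # T' acts as the inverse of T
print(np.allclose(contrasts, ls_estimates))  # contrasts equal the estimates
print(np.isclose(contrasts @ contrasts, y @ y))  # a rotation preserves length
```

Because T is orthogonal, the 8 responses are simply re-expressed as 8 contrasts with the same total sum of squares, and no matrix inversion is needed to obtain the estimates.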

If no effect in the model is active after the intercept, the contrasts are just an orthogonal rotation of random independent variates into different random independent variates with the same variance. To the extent that some effects are active, the inactive effects still represent the same variation as the error in the model. The hope is that the effects and the design are strong enough to separate which are active from which are random error.

**Lenth's Pseudo-Standard Error**

At this point, Lenth's method (Lenth, 1989) identifies inactive effects from which it constructs an estimate of the residual standard error--the Lenth Pseudo Standard Error (PSE).

The value for Lenth's PSE is shown at the bottom of the Screening report. From the PSE, t-ratios are obtained. To generate p-values, a Monte Carlo simulation of 10,000 runs of n - 1 purely random values is created and Lenth t-ratios are produced from each set. The p-value is the interpolated fractional position among these values in descending order. The simultaneous p-value is the interpolation along the max(|t|) of the n - 1 values across the runs. This technique is similar to that in Yee and Hamada (2000).
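A rough sketch of this calculation is given below. The PSE formula follows Lenth (1989); the p-value step is a simplified reading of the description above (a plain empirical fraction rather than JMP's interpolation), and the function names and example contrasts are hypothetical.

```python
import numpy as np

def lenth_pse(contrasts):
    """Lenth's pseudo standard error for a vector of contrasts."""
    abs_c = np.abs(np.asarray(contrasts, dtype=float))
    s0 = 1.5 * np.median(abs_c)
    # Re-estimate using only the contrasts that look inactive.
    return 1.5 * np.median(abs_c[abs_c < 2.5 * s0])

def lenth_p_values(contrasts, n_sim=10_000, seed=0):
    """Monte Carlo individual and simultaneous p-values for Lenth t-ratios."""
    c = np.asarray(contrasts, dtype=float)
    t_obs = np.abs(c) / lenth_pse(c)

    # Null reference: sets of purely random values, one set per simulation.
    rng = np.random.default_rng(seed)
    sims = rng.normal(size=(n_sim, c.size))
    pse_null = np.array([lenth_pse(row) for row in sims])
    t_null = np.abs(sims) / pse_null[:, None]

    # Individual: position of each |t| among all simulated |t| values.
    individual = np.array([(t_null >= t).mean() for t in t_obs])
    # Simultaneous: position among the per-simulation maxima of |t|.
    max_null = t_null.max(axis=1)
    simultaneous = np.array([(max_null >= t).mean() for t in t_obs])
    return t_obs, individual, simultaneous

# Made-up contrasts for a 10-row table (n - 1 = 9 effects after the intercept).
example = [8.0, -6.5, 0.4, -0.3, 0.6, 0.2, -0.5, 0.1, 0.3]
pse = lenth_pse(example)  # about 0.45 for these numbers
t_obs, p_ind, p_sim = lenth_p_values(example)
print("PSE:", pse)
print("individual p-values:", p_ind)
print("simultaneous p-values:", p_sim)
```

Note that these p-values come from a simulated reference distribution built around the PSE, not from the residual mean square of a fitted model. That is one reason the Screening report and a subsequent Fit Model run can report different p-values for the same terms.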

Message was edited by: Lou V@JMP


Lou, I'm not familiar with the underlying statistics here; I never heard of a method that keeps adding terms in this fashion until you get a saturated model. Where can I read more about this?