High VIF in constrained design space

Nov 21, 2011 7:30 AM

Hi there,

I'm working on analyzing a design space that is constrained. There are 3 factors, all between 0 and 1, whose sum should not exceed 1. Note that this is not a mixture design, as the sum may be less than 1 but not above.

I've created a design space with 20 experiments. As the outcome of experiments comes from a lengthy CFD computation, there are no replicates (replicates would give exactly the same answer).

I am able to fit a nice quadratic model to the data, but I find Variance Inflation Factor (VIF) values that are very high, from 7 to the low hundreds. My textbooks advise that these should never exceed 5-10, so I'm a bit worried about this.
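For reference, the usual textbook definition of the VIF for model term j is given below. It is not taken from this thread; the symbol R_j^2 is assumed to mean the coefficient of determination obtained when term j is regressed on all other terms in the model. The two equivalences show what the 5-10 rule of thumb means in terms of R_j^2.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j = 5 \iff R_j^2 = 0.8,
\qquad
\mathrm{VIF}_j = 10 \iff R_j^2 = 0.9
```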

I suspect that the high VIFs are caused by the constraint on the design space, because in unconstrained design spaces for similar problems I don't find such high VIF values.

Any thoughts on why the VIFs are so high? Is it indeed due to the constraint? I've attached some example data FYI.

1 ACCEPTED SOLUTION


Nov 21, 2011 3:13 PM

Solution

Certainly, applying constraints may induce some collinearity, which could manifest itself as high VIF values. However, when I took a look at your data and fitted a response surface, the VIF values were all below 10 (screenshot omitted).

So the question is: why the difference? The data from the DOE will have column properties, e.g. Coding ... I am wondering if somewhere along the line these have been mis-specified, causing an artificial collinearity due to strange ranges on the coding property?

I've just changed the coding of the A, B, C columns to have ranges 0 to 1, and now I see much higher VIF values (screenshot omitted).

My interpretation is that this is being artificially induced by the coding property. If all my factors are in the same numerical range, then I am happy to just remove the coding. Alternatively, you could specify the coding to be based on the actual data range, 0 to 0.667, which brings the values below the "10 threshold".
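To illustrate how the coding range alone can move the VIFs, here is a minimal Python sketch (not JMP, and not the attached data). It rests on several assumptions: the design is a hypothetical grid of step 1/3 with A + B + C <= 1, the coding property is assumed to map [low, high] onto [-1, 1] before the interaction and squared terms are formed, and the VIF is computed as the diagonal of the inverse correlation matrix of the model terms. The helper names `code`, `quadratic_terms`, `invert` and `vifs` are made up for this example.

```python
from itertools import product

# Hypothetical design (NOT the poster's data): all points on a grid of
# step 1/3 with A + B + C <= 1, which gives 20 runs.
design = [(i / 3, j / 3, k / 3)
          for i, j, k in product(range(4), repeat=3) if i + j + k <= 3]


def code(x, low, high):
    """Map [low, high] onto [-1, 1] (assumed behaviour of the coding)."""
    return (x - (low + high) / 2) / ((high - low) / 2)


def quadratic_terms(point, low, high):
    """Coded columns A, B, C, A*B, A*C, B*C, A*A, B*B, C*C (no intercept)."""
    a, b, c = (code(x, low, high) for x in point)
    return [a, b, c, a * b, a * c, b * c, a * a, b * b, c * c]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(rows):
    """VIF_j = j-th diagonal element of the inverse correlation matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [sum(r[j] ** 2 for r in centered) ** 0.5 for j in range(p)]
    corr = [[sum(r[i] * r[j] for r in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]
    inv = invert(corr)
    return [inv[j][j] for j in range(p)]


names = ["A", "B", "C", "A*B", "A*C", "B*C", "A*A", "B*B", "C*C"]
results = {}
for low, high in [(0.0, 1.0), (0.0, 2 / 3)]:
    rows = [quadratic_terms(pt, low, high) for pt in design]
    results[(low, high)] = vifs(rows)
    print(f"coding {low:.3f} to {high:.3f}")
    for name, v in zip(names, results[(low, high)]):
        print(f"  {name:4s} VIF = {v:8.2f}")
```

Rescaling a column does not change its VIF, so any difference between the two runs comes from where the coding puts the center (0.5 versus 0.333) before the products and squares are formed. The numbers printed here belong to the hypothetical grid and should not be expected to match the values seen in JMP.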

To understand the nature of the collinearity, you can look at the correlations of A, B, C under Multivariate Methods > Multivariate, and also within Fit Model under Estimates > Correlation of Estimates.
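Outside JMP, the same kind of check on the factor correlations can be sketched in a few lines of Python. The two designs below are hypothetical (a grid of step 1/3, with and without the A + B + C <= 1 constraint) and are not the attached data; `pearson` and `pairwise` are helper names invented for this example.

```python
from itertools import product


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pairwise(design):
    """Correlations between the factor columns A, B, C of a design."""
    a, b, c = zip(*design)
    return {"A,B": pearson(a, b), "A,C": pearson(a, c), "B,C": pearson(b, c)}


grid = list(product(range(4), repeat=3))

# Hypothetical designs: same grid, with and without the sum constraint.
constrained = [(i / 3, j / 3, k / 3) for i, j, k in grid if i + j + k <= 3]
unconstrained = [(i / 3, j / 3, k / 3) for i, j, k in grid]

for label, design in [("constrained", constrained),
                      ("unconstrained", unconstrained)]:
    print(label, {k: round(v, 3) for k, v in pairwise(design).items()})
```

On this particular grid the constraint alone turns zero pairwise correlations into correlations of -1/3, which is the sort of loss of orthogonality that feeds into the VIFs.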

Dave


3 REPLIES



Nov 22, 2011 5:35 AM

It figures that adding a constraint like 0 <= A+B+C <= 1 will make the design space much less orthogonal, and therefore increase the VIFs.

Just one thing: you say that the data range is 0 to 0.667, but it clearly is 0 to 1. Entries 2-4 have a value of 1, one for each of the three factors.


Nov 22, 2011 5:54 AM