DOE when the Y response is a vector (ex: Y is a curve)

quentin_dehaine

Community Trekker

Joined:

Sep 14, 2015

I'm working with a simple 3² DOE to see the influence of 2 variables on the shape of a curve. So my response Y is not a single value but a curve, i.e. Y here is a vector (containing the y-coordinates of the points that make up the curve).

I was wondering whether there is a way in JMP to use the Y vector as a whole response instead of running the classic analysis script on each point of the curve separately, which is laborious.

Thank you for your help!

Accepted Solution

Recently I had a similar problem. I don't think there is any straightforward solution. We did the following:

1. Fit a nonlinear model to each individual run using the Nonlinear platform. In our case a Michaelis-Menten fit worked fairly well.

2. Then we used the two parameters of the Michaelis-Menten fit as response variables for our DOE.

This way we had a kind of hierarchical model: X => nonlinear fit parameters = Y (representing the relevant characteristics of the curve).

If you are not in a hurry, you might want to visit the JMP Discovery Summit 2016 in Amsterdam, where my boss will present exactly this project (assuming, of course, that his talk is accepted).

Sebastian
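
For readers working outside JMP, the two-step idea above can be sketched in Python. This is only an illustration under stated assumptions: the run data are synthetic, the names `fit_michaelis_menten`, `runs` and `doe_table` are made up for the example, and a small hand-written Gauss-Newton loop stands in for JMP's Nonlinear platform.

```python
def michaelis_menten(x, vmax, km):
    """Michaelis-Menten curve: y = vmax * x / (km + x)."""
    return vmax * x / (km + x)


def fit_michaelis_menten(xs, ys, iterations=50):
    """Least-squares fit of (vmax, km) by Gauss-Newton with step halving."""
    vmax = max(ys)           # crude starting values (assumption)
    km = xs[len(xs) // 2]

    def sse(v, k):
        return sum((y - michaelis_menten(x, v, k)) ** 2 for x, y in zip(xs, ys))

    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J'J) d = J'r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for x, y in zip(xs, ys):
            j1 = x / (km + x)                    # d f / d vmax
            j2 = -vmax * x / (km + x) ** 2       # d f / d km
            r = y - michaelis_menten(x, vmax, km)
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            b1 += j1 * r
            b2 += j2 * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-300:
            break
        d_v = (a22 * b1 - a12 * b2) / det
        d_k = (a11 * b2 - a12 * b1) / det

        # Halve the step until the fit does not get worse and km stays positive.
        current = sse(vmax, km)
        step = 1.0
        while step > 1e-8:
            new_v, new_k = vmax + step * d_v, km + step * d_k
            if new_k > 0 and sse(new_v, new_k) <= current:
                vmax, km = new_v, new_k
                break
            step /= 2
        else:
            break  # no acceptable step found: stop iterating
    return vmax, km


# Hypothetical data: (factor 1, factor 2) -> curve sampled at fixed x-values.
x_values = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
runs = {
    (-1, -1): [michaelis_menten(x, 10.0, 2.0) for x in x_values],
    (+1, +1): [michaelis_menten(x, 14.0, 3.5) for x in x_values],
}

# Step 1: one fit per run. Step 2: the fitted parameters become DOE responses.
doe_table = []
for (f1, f2), curve in runs.items():
    vmax, km = fit_michaelis_menten(x_values, curve)
    doe_table.append((f1, f2, vmax, km))
```

Each row of `doe_table` holds the factor settings and the two fitted parameters, which can then be analysed like any ordinary pair of responses.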

6 REPLIES
Steven_Moore

Super User

Joined:

Jun 4, 2014

Quentin, it sounds like you actually have two response variables for your DOE: the x and the y values of your vector response. You can easily model each response variable with the results of your DOE. There may be some difficulty if there is a strong interaction between the x and y variables, in which case you might include the x value in the predictive model for y and vice versa. I'm not sure if this is totally "kosher", but that's what I would do to see if it works.

Steve
quentin_dehaine

Community Trekker

Joined:

Sep 14, 2015

Dear smoore2, thank you for your help, but I think I have not been clear enough.

I'm actually trying to model the effect of 2 experimental factors on the shape of a curve only, and this curve cannot easily be modelled by a y=f(x) function (it is the particle size distribution of a material after its treatment by the studied apparatus). Thus the x-values are fixed and are not a variable; only the corresponding y-values vary.

The only solution I have found for the time being is to lay out the data for my 11 runs this way:

  • First column: values (+1, 0 or -1) for factor #1,
  • 2nd column: values (+1, 0 or -1) for factor #2,
  • Columns 3 to 42:
    • Column header = x-value; cells = the corresponding y-values for that x.

This is very laborious, as I have to treat every x-value separately. I would like to treat all x-values at once, to see the influence of my factors on the global y=f(x) curve and not on each individual (xi, yi) pair.

I hope my explanation is clear and that someone has an idea of what I could do!

Thank you for your help!
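
As an aside for readers: the column-by-column analysis described above does not have to be done by hand, because the same model matrix serves every response column. Below is a minimal Python sketch, assuming a full quadratic model in two coded factors and a nine-run 3² grid; the three synthetic response columns stand in for the 40 real ones, and every name here (`model_row`, `solve`, `fit_all_columns`) is hypothetical. It still fits each x-value separately and only automates the loop.

```python
def model_row(x1, x2):
    """Terms of a full quadratic model in two coded factors."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2]


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_all_columns(factors, responses):
    """Least-squares coefficients for every response column (normal equations)."""
    rows = [model_row(x1, x2) for x1, x2 in factors]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    coefficients = []
    for k in range(len(responses[0])):
        xty = [sum(r[i] * y[k] for r, y in zip(rows, responses)) for i in range(p)]
        coefficients.append(solve(xtx, xty))
    return coefficients


# Hypothetical 3x3 grid of coded factor settings (extra runs would add rows).
factors = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]

# Synthetic responses: one row per run, one column per fixed x-value.
responses = [
    [5 + 2 * x1, 1 - x2 + 0.5 * x1 * x2, 3 + x1 ** 2]
    for x1, x2 in factors
]

coefs = fit_all_columns(factors, responses)
# coefs[k] lists intercept, x1, x2, x1*x2, x1^2, x2^2 for response column k.
```

Plotting each coefficient against the x-value of its column would then show where along the curve each factor has an effect.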

quentin_dehaine

Community Trekker

Joined:

Sep 14, 2015

Dear Sebastian,

Thank you for your help. Indeed, I think this is the only solution for the time being. However, it implies that you can fit a model to your data, which is not always possible.

Anyway I'll give it a try!

Thanks again!

ian_jmp

Staff

Joined:

Jun 23, 2011

I think a workable answer will depend somewhat on the quality of your data and what you want to use the results of the analysis for.

You could try to model the Y values that make up the functional response using PLS, or you might try some kind of feature extraction which is then the subject of further analysis. Sebastian's suggestion above could be considered an example of the latter, but it does assume you can find a useful model, ideally with parameters that have some scientific or technological relevance. Another 'feature' might be the area under the curve (possibly between some relevant limits), or a quantile regression (which might be appropriate for a sample size distribution).

See:

Partial Least Squares Models

http://www.jmp.com/support/help/Distribution_Option_in_Generalized_Regression.shtml
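
To illustrate the "area under the curve between some relevant limits" feature mentioned above, here is a small Python sketch. It assumes the curve is piecewise linear between the measured points and that the x-values are strictly increasing; the function names and the toy data are made up for the example.

```python
def interpolate(xs, ys, x):
    """Linear interpolation of the curve at x (xs strictly increasing)."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            return ys[i] + slope * (x - xs[i])
    raise ValueError("x is outside the measured range")


def area_under_curve(xs, ys, lower=None, upper=None):
    """Trapezoidal area under the curve, optionally restricted to [lower, upper]."""
    lower = xs[0] if lower is None else lower
    upper = xs[-1] if upper is None else upper
    # Integration grid: both limits plus every measured x strictly between them.
    grid = [lower] + [x for x in xs if lower < x < upper] + [upper]
    values = [interpolate(xs, ys, x) for x in grid]
    return sum(
        (grid[i + 1] - grid[i]) * (values[i] + values[i + 1]) / 2
        for i in range(len(grid) - 1)
    )


# Toy curve y = x sampled at four points (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]

full_area = area_under_curve(xs, ys)               # whole measured range
window_area = area_under_curve(xs, ys, 0.5, 2.5)   # between chosen limits
```

One such number per run gives a single scalar response that can be modelled against the DOE factors in the usual way.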

greg_stockdale

Community Member

Joined:

Sep 18, 2015

Quentin,

If the real world were not so complicated you could simply fit a normal distribution to the PSD. Then you would have only two (instead of 40) parameters that completely describe the PSD. However, the distributions probably don't fit a parametric distribution so you will need to focus on non-parametric measures.

My hunch is that there are very few features of the PSD you need to worry about. Even if you used a PLS model, it would probably reduce the 40 "X" values down to two or three essential properties, such as the center of the distribution and how peaked or flat it is. These can all be captured by the 10th, 50th, and 90th percentiles and the differences between them, which essentially describe the major features of the particle size distribution. These measures are also likely to be more familiar to the scientists and engineers working on the problem (KISS).

It is also likely that the PSD is not the only attribute (response) of concern. Properties such as flow, surface area, moisture content, etc. may be even more important to resolving the problem. This is where PLS or MANOVA might be interesting.
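
A rough Python sketch of the percentile idea follows, assuming each run's PSD is stored as the fraction of material in each size class and that the cumulative curve can be interpolated linearly between class sizes. The data and the function name `psd_percentile` are invented for illustration.

```python
def psd_percentile(sizes, fractions, p):
    """Size below which p percent of the material lies.

    sizes     : increasing size-class values (the fixed x-values)
    fractions : amount of material in each class (any consistent unit)
    """
    total = sum(fractions)
    cumulative = []
    running = 0.0
    for f in fractions:
        running += f
        cumulative.append(running / total)

    target = p / 100
    if target <= cumulative[0]:
        return sizes[0]
    for i in range(1, len(sizes)):
        if cumulative[i] >= target:
            c0, c1 = cumulative[i - 1], cumulative[i]
            # Linear interpolation between neighbouring size classes.
            return sizes[i - 1] + (sizes[i] - sizes[i - 1]) * (target - c0) / (c1 - c0)
    return sizes[-1]


# Hypothetical PSD for one run: five size classes instead of forty.
sizes = [10.0, 20.0, 30.0, 40.0, 50.0]
fractions = [10.0, 20.0, 40.0, 20.0, 10.0]

d10 = psd_percentile(sizes, fractions, 10)
d50 = psd_percentile(sizes, fractions, 50)
d90 = psd_percentile(sizes, fractions, 90)
spread = d90 - d10  # one simple width measure
```

Computed per run, `d10`, `d50`, `d90` and `spread` replace the 40 columns with a handful of responses.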

Greg