Discussions

PhamBao · Jun 23, 2025 05:40 AM

Hi team,
Currently, I would like to predict Y from X, where the relationship between X and Y is nonlinear.
In the image below, values of Y > 100 or X > 550 are considered bad in reality. However, I do not know whether data outside the contour, or lying in the areas marked with question marks, should be considered bad or good. I would like to ask a few questions:
1. Is there any way to make predictions for data points that lie in the question-mark areas or beyond the contours, but do not exceed the limits (red lines)? Since the relationship between X and Y is nonlinear, I tried fitting some nonlinear models, but I am still stuck.

2. I would like to draw a new boundary with a buffer zone at some distance around the data, and then classify data points as follows:
- Outside the new boundary --> bad
- Inside the new boundary, but exceeding the red limits --> bad
- Else: good

How could I define an appropriate distance?

Hopefully I can get some advice from you.
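The three-way rule in question 2 can be sketched in a few lines. This is only an illustration under stated assumptions, not a JMP feature: the "new boundary" is approximated as "within a buffer distance of some observed point", and the buffer is picked as a high quantile of the nearest-neighbour distances in the observed data. The function names, the 0.95 quantile, and the sample points are all hypothetical; the red limits (X > 550, Y > 100) are the ones stated in the post. Raw Euclidean distance mixes the units of X and Y, so the axes may need rescaling first.

```python
import math

def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest point in `others`."""
    return min(math.dist(point, q) for q in others)

def suggest_buffer(observed, quantile=0.95):
    """Heuristic buffer: a quantile of nearest-neighbour distances.

    Assumption for illustration only; needs at least two observed points.
    """
    dists = sorted(
        nearest_distance(p, observed[:i] + observed[i + 1:])
        for i, p in enumerate(observed)
    )
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return dists[idx]

def classify(point, observed, buffer, x_max=550.0, y_max=100.0):
    """Apply the rule from the post: outside the buffered region -> bad,
    inside but beyond a red limit -> bad, otherwise good."""
    x, y = point
    if nearest_distance(point, observed) > buffer:
        return "bad"
    if x > x_max or y > y_max:
        return "bad"
    return "good"

# Hypothetical (X, Y) observations, invented for illustration only.
observed = [(150.0, -20.0), (160.0, -15.0), (170.0, -10.0), (180.0, -5.0)]
buffer = suggest_buffer(observed)
print(buffer, classify((165.0, -12.0), observed, buffer))
```

A rule like this only says whether a new point resembles data already seen. It does not establish that points outside the observed region are actually bad.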

Ben_BarrIngh · Jun 23, 2025 08:44 AM

Hi @PhamBao ,

How have you generated this data? Is it from experimentation? Is the coloured area only displaying areas you've tested?

If you've only tested within the areas where you are showing the response, it might not be appropriate to extrapolate beyond them into the 'question mark' zones. It is entirely possible to generate a model in JMP and extrapolate outside of your tested regions, but doing so carries a great deal of risk. In many cases you might want to use 'Extrapolation Control' in the Prediction Profiler to stop yourself from making predictions that are unfounded.

If you can do more experimentation, why not test over a wider range (using Custom Design or Augment Design in the DOE options) to gain more information on your system, so that your tests cover the areas with the question marks?

If you could provide the dataset that would be useful as well.

Thanks,

Ben

“All models are wrong, but some are useful”

PhamBao · Jun 23, 2025 11:20 AM

Hi @Ben_BarrIngh

I got the data from my workplace. The data is taken from different machines (12 machines, collected over around 6 weeks) using the same recipe. Y is subsequent data, measured after X. Usually, I try to control single parameters such as X and Y individually.
X limits: upper control limit 550, lower control limit 100 --> a product is considered bad on this single parameter if it is not within the control limits
Y limits: upper control limit 100, lower control limit -170 --> a product is considered bad on this single parameter if it is not within the control limits
Flag3: 0 = good product, 1 = bad product
However, we found that controlling single parameters during the process is not really effective. Therefore, I plotted X and Y together, as the chart above shows, to see how Y responds to X, so that I can control X and Y at the same time. Plotting X and Y with 6 weeks of data from different machines gives the pattern shown in the image above. I have one question:
1. With a large number of data points from 6 weeks and different machines, is it appropriate to extrapolate outside of the area?
If not, could you give me instructions or a reference on how I could use DOE to test a wider range and gain more information to cover the question-mark areas?
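For reference, the single-parameter limits listed above can be written as a small rule. This is a sketch that assumes Flag3 is derived purely from the control limits, which the post does not confirm, and that values exactly on a limit count as good; the function and constant names are made up for illustration.

```python
# Control limits as stated in the post.
X_LCL, X_UCL = 100.0, 550.0
Y_LCL, Y_UCL = -170.0, 100.0

def flag3(x, y):
    """Return 0 (good) if both X and Y are within their control limits,
    else 1 (bad). Inclusive limits are an assumption."""
    x_ok = X_LCL <= x <= X_UCL
    y_ok = Y_LCL <= y <= Y_UCL
    return 0 if (x_ok and y_ok) else 1

# Hypothetical measurements, invented for illustration only.
for x, y in [(300.0, 0.0), (600.0, 0.0), (300.0, -200.0)]:
    print(x, y, flag3(x, y))
```

Written this way, the flag is a step function of X and Y: it flips the moment a limit is crossed and carries no information about the regions where no data was collected.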

Ben_BarrIngh · Jun 24, 2025 03:41 AM

Hi @PhamBao ,

Because you are only looking at a binary response/flag of good/bad, it would not really be appropriate to extrapolate (it rarely is in any case). Similarly these markers are only being considered good/bad when they violate the control limit rules, not because they've actually been measured and placed as 'good' or 'bad' (correct me if I'm wrong). Your application of the contour is not appropriate or even really needed: in the case of your data the change from good to bad isn't gradual, it's sudden, as the X or Y values cross their control limits.

If the latter is true, then I would suggest a few things:

- Find a quality attribute from this process that defines whether or not the product is good or bad (e.g. product purity, defect rate)

- Measure the associated values of that attribute against your X and Y (although these can both be considered X's)

If you then need to use experimentation to fill that gap, look at these DoE resources: https://www.jmp.com/doeintrokit/en/index.html .

Thanks,

Ben

“All models are wrong, but some are useful”

PhamBao · Jun 24, 2025 10:10 AM

Hi @Ben_BarrIngh

Is there any other resource that I could learn from, since some of the video links are dead?

"Similarly these markers are only being considered good/bad when they violate the control limit rules, not because they've actually been measured and placed as 'good' or 'bad' (correct me if I'm wrong)"
--> Actually, my doubt is this: if I look at the shape of the data, the pattern looks constant. That means there may be no chance of data falling at X from 0 to 50, regardless of the Y value. Similarly, there are no data points with Y from -200 to -100 where X is from 0 to 550.

"- Measure the associated values of that attribute against your X and Y (although these can both be considered X's)" --> Is there any function in JMP where I could add more parameters and adjust them to see how X and Y respond to each parameter?

Thanks

Ben_BarrIngh · Jun 24, 2025 11:24 AM

Hi @PhamBao ,

I'm not sure I fully understand your first point. This may be something you want to share with a consultant (such as a JMP Partner) to better flesh out your challenge so that you can get assistance; I think this may be beyond the scope of what's possible here.

For the second point, you can use the Fit Model platform to look at the relationship between parameters.

Thanks,

Ben

“All models are wrong, but some are useful”

How to deal with data prediction with non-linear model?