Solved: How to use a benchmark model for Model driven multivariate continued process verification purpose

Riaz90Nawaz · Jul 6, 2023 12:28 AM

Hi All,

In continuation to the post (link to which is provided below) -

https://community.jmp.com/t5/Discussions/Help-in-JSL-scripting-of-Model-Driven-Multivariate-Control-...

I would like to extend that script into one generic script along the lines suggested earlier, but this time the Model Driven Multivariate Control Chart (MDMVCC) should use a benchmark model (either PCA or PLS) to fix the control limit. For continued process verification (CPV), once the limit is fixed (calculated from the historical dataset with outliers removed), future batches have to be monitored against that limit. Currently, every time the script is run it generates a new control limit, because the dataset differs.

So, the first part is to fix the control limit so that all of the historical/golden data falls below it. Is it possible to develop a model with cross-validation to finalize the number of components so that the data points fall below the control limit (of course, after removing the outliers)? The tricky part is that, even after the so-called outliers are removed, the MDMVCC flags one data point or another as beyond the control limit on every run, because the dataset keeps changing. This has to be fixed.

The second part is to run this script (with the fixed control limit) on subsequently manufactured batches to identify any excursions in the variables as part of the CPV activity.

The overall idea is to have a generic MDMVCC script (based on a benchmark model) that can identify outliers/variations with good accuracy.

I would highly appreciate it if anyone from the community could guide me or shed some more light on this request.

Thanks in advance !!

KR,

Nawaz

yuichi_katsumur · Jul 10, 2023 09:41 PM

Hello @Riaz90Nawaz

For your second question (fixed control limits), here is one idea:
- Add the historical/golden data to the current dataset/future batch
- Set the "Historical Data End at Row" option for MDMVCC
Here is a sample script for your reference.

Names Default To Here( 1 );
historical_dt = Open( "$SAMPLE_DATA/Quality Control/Aluminum Pins Historical.jmp" );
current_dt = Open( "$SAMPLE_DATA/Quality Control/Aluminum Pins Current.jmp" );

//Get the number of rows for the historical data
historical_data_end_row = N Row( historical_dt );
//Combine the data tables
dt = historical_dt << Concatenate( current_dt );
//Close
Close( historical_dt, nosave );
Close( current_dt, nosave );

dt << Clear Row States;
cols_of_interest = dt << get column names( Continuous, "String" );

Eval(
	Substitute(
			Expr(
				obj = dt <<
				Model Driven Multivariate Control Chart(
					Process( cols_of_interest ),
					Historical Data End at Row( historical_data_end_row )
				)
			),
		Expr( cols_of_interest ), Eval( cols_of_interest ),
		Expr( historical_data_end_row ), Eval( historical_data_end_row )
	)
);

obj << T Square Plot( Save Values );
new_cols = Associative Array( dt << get column names( Continuous, "String" ) );
new_cols << Remove( Associative Array( cols_of_interest ) ); // T2 column

// get UCL (should be made more robust)
ucl_value = ((Report( obj )[Outline Box( "T² Limit Summaries" )] << Child)[3] << Get)[1];
alarm_rows = Loc( dt[0, new_cols << get keys] > ucl_value );

dt << Select Rows( alarm_rows );
Wait( 0 );
obj << T² Plot( Contribution Plot for Selected Samples( alarm_rows ) );
dt << Clear Select;

Hope it works for your problem.
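
For readers unfamiliar with the Eval( Substitute( Expr( ... ) ) ) pattern used in the script above, here is a minimal sketch of what Substitute does before anything is evaluated. The platform name Some Platform, the placeholder names, and the values are hypothetical and chosen only for illustration; the sketch prints the filled-in expression rather than launching anything.

```jsl
Names Default To Here( 1 );

// Hypothetical values standing in for cols_of_interest and historical_data_end_row
cols = {"height", "weight"};
end_row = 30;

// A template expression; the placeholder names are never evaluated on their own
template = Expr(
	Some Platform( Process( cols_placeholder ), Historical Data End at Row( row_placeholder ) )
);

// Substitute swaps each placeholder name for the current value of the variable
filled = Substitute( Name Expr( template ),
	Expr( cols_placeholder ), cols,
	Expr( row_placeholder ), end_row
);

// Print the resulting expression without evaluating it
Show( Name Expr( filled ) );
```

In the script above, the same substitution is wrapped in Eval( ), so the filled-in Model Driven Multivariate Control Chart call is launched immediately with literal values in place of the variable names.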

Riaz90Nawaz · Jul 11, 2023 02:40 AM

Hi @yuichi_katsumur ,

Thank you so much for your guidance. I highly appreciate it.

To the same code, I am trying to add a multivariate scatterplot matrix to understand the correlations among the variables. I added the following code, but it seems I am doing it the wrong way, as it's not working:

Multivariate(
Y(cols_of_interest),
Variance Estimation( "Row-wise" ),
Scatterplot Matrix( Density Ellipses( 1 ) )
);

I would appreciate it if you could shed some light on this aspect as well.

Thanks and KR,

yuichi_katsumur · Jul 11, 2023 04:12 AM

Hello @Riaz90Nawaz ,

One idea is the following code. I think this will work.

Multivariate(
	Y( eval(cols_of_interest) ),
	Variance Estimation( "Row-wise" ),
	Scatterplot Matrix( Density Ellipses( 1 ) )
);
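
A minimal sketch of the same fix on a different table, assuming the Big Class sample table that ships with JMP and a hypothetical two-column list. Platform launch arguments such as Y( ) are generally not evaluated as ordinary variables, so a variable holding a list of names is wrapped in Eval( ) to hand the platform the list itself.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Hypothetical list of column names stored in a variable
cols = {"height", "weight"};

// Eval( cols ) passes the list contents, not the variable name, to the platform
dt << Multivariate(
	Y( Eval( cols ) ),
	Scatterplot Matrix( Density Ellipses( 1 ) )
);
```

Sending the launch to dt explicitly (dt << Multivariate) also avoids depending on which table happens to be current.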

Riaz90Nawaz · Jul 11, 2023 08:05 AM

Hi @yuichi_katsumur ,

Thank you so much for the prompt support. It worked.

One general query, which I also raised in my main post: can't we show this fixed limit (generated from the historical dataset) on the current dataset visually, through scripting? I see that the script provides contributions only for those rows that breach the historical control limit, but it would be ideal to have this reference line for the current dataset as well. At present, the control limit for the current dataset is shown as a dashed red line. I don't want this to appear (for obvious reasons); instead, the fixed control limit from the historical data should continue across the current/future dataset as well. This would give a more correct visual representation.

KR

yuichi_katsumur · Jul 11, 2023 08:56 PM

Hi @Riaz90Nawaz ,

As you may know, the Upper Control Limit (UCL) for historical data is based on the Beta distribution, whereas the UCL for current data is based on the F distribution. https://www.jmp.com/support/help/en/17.1/#page/jmp/statistical-details-for-limits.shtml#ww345612
I'm not sure it makes sense to use the same limit for both sets of data, but one idea is to use the Exclude Row option. Here is a sample script for your reference.

Names Default To Here( 1 );
historical_dt = Open( "$SAMPLE_DATA/Quality Control/Aluminum Pins Historical.jmp" );
current_dt = Open( "$SAMPLE_DATA/Quality Control/Aluminum Pins Current.jmp" );

//Get the number of rows for the historical data
historical_data_end_row = N Row( historical_dt );
//Combine the data tables
dt = historical_dt << Concatenate( current_dt );
//Close
Close( historical_dt, nosave );
Close( current_dt, nosave );

dt << Clear Row States;
cols_of_interest = dt << get column names( Continuous, "String" );

dt << Multivariate(
	Y( Eval( cols_of_interest ) ),
	Variance Estimation( "Row-wise" ),
	Scatterplot Matrix( Density Ellipses( 1 ) )
);

//Exclude current data (the rows after the last historical row)
data_end_row = N Row( dt );
selected_row = dt << select rows( Index( historical_data_end_row + 1, data_end_row ) );
selected_row << Exclude;

obj = dt << Model Driven Multivariate Control Chart( Process( Eval( cols_of_interest ) ) );
//Show excluded row
obj << Show Excluded Rows( 1 );

obj << T Square Plot( Save Values );
new_cols = Associative Array( dt << get column names( Continuous, "String" ) );
new_cols << Remove( Associative Array( cols_of_interest ) ); // T2 column

// get UCL (should be made more robust)
ucl_value = ((Report( obj )[Outline Box( "T² Limit Summaries" )] << Child)[3] << Get)[1];
alarm_rows = Loc( dt[0, new_cols << get keys] > ucl_value );

dt << Select Rows( alarm_rows );
Wait( 0 );
obj << T² Plot( Contribution Plot for Selected Samples( alarm_rows ) );
dt << Clear Select;

//Add reference line 
report = obj << report;
report[Picture Box( 1 )][axis box( 2 )] << Add Ref Line(
	{0, eval(historical_data_end_row)},
	"Solid",
	"Light Red",
	"Historical",
	1,
	0.25
);
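
To make the Beta versus F remark at the top of this post concrete, here is a minimal sketch that computes both limits from the standard textbook forms for Hotelling's T² with individual observations. The values of n, k and alpha are hypothetical, and whether JMP uses exactly these forms should be checked against the help page linked above.

```jsl
Names Default To Here( 1 );

n = 50;        // hypothetical number of historical rows
k = 2;         // hypothetical number of components (or variables)
alpha = 0.05;  // hypothetical significance level

// Historical (Phase I) limit: ((n-1)^2 / n) * Beta quantile with k/2 and (n-k-1)/2
ucl_historical = ((n - 1) ^ 2 / n) * Beta Quantile( 1 - alpha, k / 2, (n - k - 1) / 2 );

// Current (Phase II) limit: k(n+1)(n-1) / (n(n-k)) * F quantile with k and n-k
ucl_current = (k * (n + 1) * (n - 1) / (n * (n - k))) * F Quantile( 1 - alpha, k, n - k );

Show( ucl_historical, ucl_current );
```

The two values differ for any finite n, which is why the chart draws a separate limit for the current rows unless those rows are excluded.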

Riaz90Nawaz · Jul 12, 2023 02:05 AM

Hi @yuichi_katsumur ,

Thank you so much once again for your prompt support !!

Yes, I am aware of the difference in the underlying distributions for the historical and current datasets. However, as part of Continued Process Verification (CPV), we can't use different limits for subsequent monitoring. There has to be one fixed control limit (derived from the historical dataset) against which subsequent batches are monitored, essentially to study any drifts/shifts in the process. I hope that clarifies what I am trying to fix.

I appreciate your support on this, but can't we do this without excluding rows? I just tried the script provided. In terms of the T2 control chart visualization, it serves the purpose I wanted to achieve. But when I looked at the T2 contribution plots, the script gives charts for all the points (especially for the current dataset), irrespective of whether a specific point breaches the UCL or not. The script requests contribution plots specifically for "alarm_rows", but that doesn't seem to be working, possibly because we have excluded the current dataset rows. I am not sure about it.

Hope you understood what I am trying to achieve.

Thanks in advance for your support,

Kind Regards,

yuichi_katsumur · Jul 12, 2023 04:53 AM

Hi @Riaz90Nawaz , Thank you for your reply. I understand what you want. I'm sorry, there is a mistake in my script. Please add the following line to the script (in line 26).

//Exclude current data (the rows after the last historical row)
data_end_row = N Row( dt );
selected_row = dt << select rows( Index( historical_data_end_row + 1, data_end_row ) );
selected_row << Exclude;
dt << Clear Select; //Please add this line!!

To exclude the rows, I first selected them, and they stayed selected afterwards, so you need to deselect them. I think this will work.
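
A minimal sketch of the select, exclude, deselect sequence on its own, assuming the Big Class sample table and a hypothetical split point. It only reports how many rows end up excluded and how many remain selected.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Clear Row States;

// Hypothetical split: treat rows 31 onward as the "current" data
first_current_row = 31;

dt << Select Rows( Index( first_current_row, N Row( dt ) ) );
dt << Exclude;       // applies to the rows that are currently selected
dt << Clear Select;  // without this, the rows stay selected after being excluded

// Count excluded rows and rows still selected
Show( N Rows( dt << Get Excluded Rows ), N Rows( dt << Get Selected Rows ) );
```

Leaving the rows selected is a plausible reason the contribution plots appeared for every current row, since the plot is requested for selected samples.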

Riaz90Nawaz · Jul 12, 2023 06:33 AM

Hi @yuichi_katsumur ,

Thank you so much for the updated script. It's working now.

Appreciate your prompt support !!

Have a Great Day !!

Best Regards,
