MichaelO
Level II

How to find every test to screen a unit

I'm fairly new to JMP and JSL and am currently working on screening out a unit in a very large dataset. My goal is to get an output table containing the name of every test in which the unit (identified by its ECID) has a z-score greater than 3 (or some other value I can choose). These z-scores need to be based on each test's distribution of passing units. Is there JMP functionality for this, or an existing script that does something along these lines? Below is my very loose pseudocode for a script.

 

Pseudocode:

Requires large enough dataset to have a normal distribution for different tests (minimum 30 good units)

Subset base data table into only bin 1 units

Calculate mean and std deviation for each test using only passing units

Calculate z-score in relation to each test's mean and std deviation

Create output table with all tests and scores that are above 3
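
For reference, the calculation in the pseudocode can be written out as below. This is only a sketch of the usual z-score definition; the symbols are assumptions introduced here, not names from any script: x_{u,t} is the value of test t for unit u, P_t is the set of passing (bin 1) units with a value for test t, n_t is the size of P_t, and c is the chosen limit (3 in this post). The sample standard deviation (n_t - 1 in the denominator) is assumed.

```latex
\mu_t = \frac{1}{n_t} \sum_{i \in P_t} x_{i,t}, \qquad
\sigma_t = \sqrt{\frac{1}{n_t - 1} \sum_{i \in P_t} \left( x_{i,t} - \mu_t \right)^2}, \qquad
z_{u,t} = \frac{x_{u,t} - \mu_t}{\sigma_t}

\text{test } t \text{ is reported for unit } u \text{ when } \left| z_{u,t} \right| > c \text{ and } n_t \ge 30
```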

txnelson
Super User

Re: How to find every test to screen a unit

Here is a sample script that creates the Z scores and identifies the values that are 3 or greater. Is this in the direction of what you want?

Names Default To Here( 1 );

// Open the Process Measurements sample data table
dt = Open( "$SAMPLE_DATA/Process Measurements.jmp" );

// Create a Bin column
dt << New Column( "Bin",
	modeling type( nominal ),
	set each value( If( Random Uniform( 0, 1 ) <= .8, 1, Random Integer( 2, 7 ) ) )
);

// Get column names for all continuous columns
colNames = dt << get column names( string, continuous );
For( i = 1, i <= N Items( colNames ), i++,
	mean = .;
	stddev = .;
// Create the Z scores for each test 
	dt << New Column( colNames[i] || " Z Score",
		formula(
			If( Row() == 1,
				Mean = Col Mean( If( :Bin == 1, As Column( dt, colNames[i] ), . ) );
				Stddev = Col Std Dev( If( :Bin == 1, As Column( dt, colNames[i] ), . ) );
			);
			// Z score of the current test column
			Abs( As Column( dt, colNames[i] ) - Mean ) / Stddev;
		)
	);
	Column( dt, N Cols( dt ) ) << delete formula;

	targetRows = dt << get rows where( As Column( dt, N Cols( dt ) ) >= 3 );
	Try( As Column( dt, N Cols( dt ) ) << color cells( "red", As List( targetRows ) ) );
);
Jim
MichaelO
Level II

Re: How to find every test to screen a unit

This is a very good starting point. Thank you for your help. I would like to adapt this script so that I can run it only on units with a specific ECID within a large dataset. If I run this script on every data point/die, JMP will unfortunately crash. How would you recommend adapting the script to allow for this?

txnelson
Super User

Re: How to find every test to screen a unit

I would just create a subset of the original data table and run the analysis on it.

dt = current data table();
ECIDsToTest = { "413257", "56739",...........};

dt << select where( contains( ECIDsToTest, :ECID ) );

dtToTest = dt << subset( selected columns( 0 ), selected rows( 1 ) );

// Run the Z Score analysis on the new subset
Jim
MichaelO
Level II

Re: How to find every test to screen a unit

Hi Jim,

 

I wasn't clear about my previous issue. I would like to find a single unit's z-score for every test based on the entire dataset's distribution. Because of this, I can't just subset the ECIDs of interest. Secondly, I'm not sure the original script is working correctly. Upon further review, I've noticed that many of the z-scores are implausible: some are nearing triple digits.

 

[Screenshot: MichaelO_0-1643128128554.png]

 

txnelson
Super User

Re: How to find every test to screen a unit

Please provide a sample input data table, and a sample of what you are expecting for your output.

Jim
MichaelO
Level II

Re: How to find every test to screen a unit

I've modified the Probe sample dataset and attached it as my input data table. From there, I would like to create a script that outputs a table listing the tests (column names) in the original dataset for which an individual unit's parametric data (the unit is identified by its ECID) has a Z-score above 3 (or some other user-controlled value). I've attached a sample output table below. It is not based on the parametric data; it is only an example of what I would like. In this example, I am screening for all tests in which the first unit in the Probe dataset (ECID: Z1J4H_24_2_1) has parametric values more than 3 sigma from the respective test's mean. Please let me know if I can clarify anything!

txnelson
Super User

Re: How to find every test to screen a unit

Here is an example of one way of handling your issue. I have set it up so that you select the ECID rows you want to examine in your input data table and then run the script. Check it out and see if it points you in a direction where you can make the finishing changes that you need.

Names Default To Here( 1 );

// Open Data Table: Probe.jmp
// → Data Table( "Probe" )
//dt = Open( "$SAMPLE_DATA/Process Measurements.jmp" );
dt = Data Table( "Modified_Probe_Input" );

// Only run if at least one ECID row has been selected
If( N Rows( dt << get selected rows() ) > 0, 

// Create a new data table to place the results in
	dtOutput = dt << subset( selected rows( 1 ), selected columns( 0 ) );
	selRows = dt << get selected rows;
	
// Get all column names, then drop the first 9 so only the test columns remain
	colNames = dt << get column names( string );
	Remove From( colNames, 1, 9 );	
	
	dtSum = dt << Summary(
		invisible,
		Group( :Bin ),
		Mean( colNames ),
		Std Dev( colNames ),
		Freq( "None" ),
		Weight( "None" ),
		link to original data table(0)
	);
	dtSum << select where( :Bin != 1 );
	Try( dtSum << delete rows );
		
	zLimit = 3; // flag Z scores at or above this value
	For( i = 1, i <= N Items( colNames ), i++,
		// Mean and standard deviation of the passing (Bin 1) units for this test
		testMean = Column( dtSum, "Mean(" || colNames[i] || ")" )[1];
		testStdDev = Column( dtSum, "Std Dev(" || colNames[i] || ")" )[1];
		// Keep a column reference so the column can still be addressed after it is renamed
		col = Column( dtOutput, colNames[i] );
		// Overwrite each raw value with its absolute Z score and note the rows to flag
		targetRows = {};
		For( k = 1, k <= N Rows( dtOutput ), k++,
			col[k] = Abs( col[k] - testMean ) / testStdDev;
			If( col[k] >= zLimit, Insert Into( targetRows, k ) );
		);
		Try( col << color cells( "red", targetRows ) );
		col << set name( colNames[i] || " Z Score" );
	);
	// Close the summary table inside the If() so it is only referenced when it was created
	Close( dtSum, NoSave );
);
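
The script above writes Z scores into a copy of the selected rows and colors the large ones. The original request was for a table that lists the tests themselves, so here is a minimal, untested sketch of that step. It assumes the table name and ECID value mentioned in this thread, a numeric Bin column where 1 means pass, and that the first 9 columns are not tests (as in the script above). The names targetECID, zLimit, minGood, outTest, and outZ are made up for illustration.

```jsl
Names Default To Here( 1 );

// Assumptions: table name and ECID are taken from this thread; adjust as needed
dt = Data Table( "Modified_Probe_Input" );
targetECID = "Z1J4H_24_2_1";
zLimit = 3;   // user-chosen Z score limit
minGood = 30; // minimum number of passing units, per the pseudocode

// Rows of passing units (reference distribution) and rows of the unit to screen
passRows = dt << Get Rows Where( :Bin == 1 );
unitRows = dt << Get Rows Where( :ECID == targetECID );
If( N Rows( passRows ) < minGood,
	Throw( "Not enough passing units to estimate the distributions" )
);

// Assumption: the first 9 columns are not tests, everything after is numeric test data
testNames = dt << Get Column Names( String );
Remove From( testNames, 1, 9 );

outTest = {};
outZ = {};
For( i = 1, i <= N Items( testNames ), i++,
	// Statistics from the passing units only, computed on the full table
	passVals = dt[passRows, testNames[i]];
	mu = Mean( passVals );
	sigma = Std Dev( passVals );
	For( k = 1, k <= N Rows( unitRows ), k++,
		z = Abs( dt[unitRows[k], testNames[i]] - mu ) / sigma;
		If( !Is Missing( z ) & z > zLimit,
			Insert Into( outTest, testNames[i] );
			Insert Into( outZ, z );
		);
	);
);

// One row per flagged test for the chosen unit
dtOut = New Table( "Tests flagged for " || targetECID,
	New Column( "Test", Character, "Nominal", Set Values( outTest ) ),
	New Column( "Z Score", Numeric, "Continuous", Set Values( outZ ) )
);
```

Because the mean and standard deviation come from the full table while only the rows for one ECID are scored, this avoids creating a Z score column for every die in a large dataset.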
Jim
MichaelO
Level II

Re: How to find every test to screen a unit

Hi Jim,

 

I've looked at this script and it seems to work on the sample dataset I sent you. For some reason, when I modify just a few lines to fit my use case, I get the following error: "Scoped data table access requires a data table column or variable{1}". I changed lines 6, 17, 21, and 28 to fit my needs. Other than that, this is the same script that you provided above. Any ideas?

txnelson
Super User

Re: How to find every test to screen a unit

Can you supply a sample data table that is resulting in an error?

Jim