Neo
Level VI

How to group/separate a correlation plot by a variable?

The following example is similar to my actual data. I am interested in the correlation between two process variables that are known to depend on each other. In particular, I would like to understand how the correlation depends on wafer number.

In my actual data, unlike this example, the x-axis variable is bimodal with respect to wafer number: the first n wafers, say, sit on one mode and the rest on the other.

What is the best way to analyse the data to find out whether there is a statistically significant correlation between the two variables on a per-wafer basis?

Names Default To Here(1);
clear log ();
dt = Open("$SAMPLE_DATA/Semiconductor Capability.jmp");
Bivariate( Y( :PNP1 ), X( :PNP2), Fit Line( {Line Color( {212, 73, 88} )} ) );

I admit that my approach may not be the best way to look at this problem, so alternative routes are welcome. However, I would like to keep the analysis simple to start with and avoid deep statistics unless they are unavoidable.

 

When it's too good to be true, it's neither
2 REPLIES
Victor_G
Super User

Re: How to group/separate a correlation plot by a variable?

Hi @Neo,

 

If I understand your problem correctly, you might be interested in the Response Screening platform (jmp.com).

You want to check whether there are significant correlations between PNP1 and PNP2 for each wafer separately:

  1. Launch the Response Screening platform and specify "PNP1" as Y, "PNP2" as X, and "Wafer ID in lot ID" as the grouping variable:
    [Image: Victor_G_0-1715604691791.png]
  2. The platform then reports the results, which you can sort by p-value, effect size, RSquare, and so on.

    [Image: Victor_G_1-1715604803716.png]

    By right-clicking on the results and choosing "Make Combined Data Table", you can export the results to a JMP data table and process them further or differently if needed.

Names Default To Here(1);
clear log ();
dt = Open("$SAMPLE_DATA/Semiconductor Capability.jmp");
// Launch platform: Response Screening
Data Table( "Semiconductor Capability" ) << Response Screening(
	Y( :PNP2 ),
	X( :PNP1 ),
	Grouping( :Wafer ID in lot ID ),
	PValues Table on Launch( 0 )
);

 

Another option could be to look at correlations with the Multivariate platform.

Specify your variables, put the wafer ID in the "By" role, and you can look at the correlations for each wafer. You can also right-click on the Correlations values and choose "Make Combined Data Table" to export them to a data table for further processing:

[Image: Victor_G_0-1715610748606.png]
[Image: Victor_G_1-1715610783930.png]
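
As a script, the Multivariate launch with a "By" variable could look roughly like the sketch below. It assumes the same sample table and column names used in the other scripts in this thread, and leaves all platform options at their defaults:

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Semiconductor Capability.jmp" );

// One Multivariate report (with its own correlation matrix) per wafer.
// The By column is assumed to be the same one used for grouping above.
dt << Multivariate(
	Y( :PNP1, :PNP2 ),
	By( :Wafer ID in lot ID )
);
```

With many wafers this produces one report per wafer, so exporting the correlations via "Make Combined Data Table" is usually easier to read than scrolling through the window.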

You can also simply use Graph Builder to visualize each X-Y pair per wafer by using the wafer ID as "Page":

Graph Builder(
	Size( 534, 99956 ),
	Show Control Panel( 0 ),
	Variables( X( :PNP2 ), Y( :PNP1 ), Page( :Wafer ID in lot ID ) ),
	Elements(
		Points( X, Y, Legend( 29 ) ),
		Line Of Fit( X, Y, Legend( 31 ), R²( 1 ), F Test( 1 ) )
	)
);

This gives you the R² for each wafer, and you can also display an F test to check for statistical significance:

Victor_G_0-1715614389671.png
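
In the same spirit, the Bivariate script from the original question can be split per wafer by adding a "By" role. This is only a minimal sketch, assuming the same sample table and the "Wafer ID in lot ID" column used above; the line colour is copied from the original script:

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Semiconductor Capability.jmp" );

// Same fit as in the question, but one Bivariate report per wafer
dt << Bivariate(
	Y( :PNP1 ),
	X( :PNP2 ),
	By( :Wafer ID in lot ID ),
	Fit Line( {Line Color( {212, 73, 88} )} )
);
```

Each report then carries its own fit summary, so the per-wafer RSquare and significance can be compared directly.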

 

Does this answer your needs, or did I misunderstand your question?

Victor GUILLER
Scientific Expertise Engineer
L'Oréal - Data & Analytics
Neo
Level VI

Re: How to group/separate a correlation plot by a variable?

@Victor_G  Thanks for your suggestions. 

With the Response Screening platform, for my actual data, I do not seem to get any more information than I get by simply plotting box-plot trends of the two variables one above the other: each mode of my bimodal variable correlates well with the dependent variable, as expected. In other words, both parameters depend on wafer number. But this is for a small data set.

For a very large data set, I would like JMP to tell me whether the expected correlation exists, as it is no longer visually apparent. Perhaps I need to understand the various numbers that JMP outputs in the Response Screening platform. But plotting them by wafer number shows me the same trend as the box plots do. I will try to understand the numbers; it is work in progress anyway.

 

I have already looked at the correlation matrix; unfortunately, it's not what I am looking for.

 

Can factor analysis via Fit Model help in my case? If yes, how do I include wafer number in the analysis?

 
