Agustin
Level IV

Confusion matrix colour and count based on different variables

Hi, 

I'm working on a confusion matrix using Graph Builder. I have two columns: one is the known class (y-axis) and the other is the predicted class (x-axis). I want the numbers inside the boxes to show the number of samples in each box, as shown in the picture below.

 

Agustin_1-1663762133816.png

 

However, instead of the colour gradient, I would like the main diagonal (true positives) to be one colour and the rest of the boxes a different colour. I have a column that is either 1 or 0 depending on whether the sample is a true positive or not. However, when I add this column to the colour option in Graph Builder, this happens:

Agustin_2-1663762335690.png

Is there a way I can have the colours as in the second picture, but the labels of the first picture?

 

Thank you! 

 

 

1 ACCEPTED SOLUTION

ih
Super User (Alumni)

Re: Confusion matrix colour and count based on different variables

Sorry about that, I missed saving that step. Here again I summarized the table first and then drew the graph using row labels instead of the counts, but this time I actually defined the row labels (the dtSum:N << Label( 1 ) line):

 

ih_0-1663773802794.png

 

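Names Default To Here( 1 );

dt = Open( "$Sample_data/big class.jmp" );

// Build a second, shuffled copy of age to stand in for a predicted class.
random reset(2);
dt << New Column("Age 2", Numeric, "Ordinal", Format("Fixed Dec", 5, 0), Formula(:age[Col Shuffle()]));
// Make sure the formula has been evaluated before the table is summarized.
dt << Run Formulas;
dt:age << Set Name("Age 1");
// Count the rows for each (Age 2, Age 1) pair and turn the result into a data table.
dtSum = (sum = dt << Tabulate(
	Add Table(
		Column Table( Statistics( N ) ),
		Row Table( Grouping Columns( :"Age 2"n, :"Age 1"n ) )
	)
)) << Make Into Data Table;

// 1 on the main diagonal (Age 1 equals Age 2), 0 everywhere else.
dtSum << New Column("Is Diagonal", Numeric, "Nominal", Format("Best", 12), Formula(:Age 1 == :Age 2));

// Mark the count column as a label so "Label by Row" shows the counts in the cells.
dtSum:N << Label( 1 );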

gb = dtSum << Graph Builder(
	Size( 525, 450 ),
	Show Control Panel( 0 ),
	Variables( X( :Age 1 ), Y( :Age 2 ), Color( :Is Diagonal ) ),
	Elements( Heatmap( X, Y, Legend( 5 ), Label( "Label by Row" ) ) ),
	SendToReport(
		Dispatch(
			{},
			"400",
			ScaleBox,
			{Legend Model(
				5,
				Properties( 0, {Fill Color( -15790017 )}, Item ID( "0", 1 ) ),
				Properties( 1, {Fill Color( 5 )}, Item ID( "1", 1 ) )
			)}
		)
	)
);
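
For the table described in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the columns are named "Known class" and "Predicted class" (names taken from the question text, so adjust them to the real table), the table is the current data table, and Summary is swapped in for Tabulate to get the counts. Summary names its count column "N Rows".

```jsl
Names Default To Here( 1 );

// Assumption: the active table has one row per sample and two columns
// named "Known class" and "Predicted class" (hypothetical names).
dt = Current Data Table();

// One row per (known, predicted) pair; Summary adds an "N Rows" count column.
dtSum = dt << Summary(
	Group( :Known class, :Predicted class ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

// 1 on the main diagonal (known equals predicted), 0 everywhere else.
dtSum << New Column( "Is Diagonal", Numeric, "Nominal",
	Formula( :Known class == :Predicted class )
);

// Mark the count column as a label so "Label by Row" prints it in each cell.
Column( dtSum, "N Rows" ) << Label( 1 );

dtSum << Graph Builder(
	Show Control Panel( 0 ),
	Variables( X( :Predicted class ), Y( :Known class ), Color( :Is Diagonal ) ),
	Elements( Heatmap( X, Y, Legend( 5 ), Label( "Label by Row" ) ) )
);
```

The two fill colours can then be set with the same Legend Model dispatch used in the script above. Class combinations that never occur should produce no summary row, so those cells would likely appear empty rather than showing 0.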

 


3 REPLIES 3
ih
Super User (Alumni)

Re: Confusion matrix colour and count based on different variables

You could summarize the table first and then draw your graph using row labels instead of the counts:

 

ih_1-1663772650455.png

 

 

Names Default To Here( 1 );

dt = Open( "$Sample_data/big class.jmp" );

random reset(2);
dt << New Column("Age 2", Numeric, "Ordinal", Format("Fixed Dec", 5, 0), Formula(:age[Col Shuffle()]));
dt:age << Set Name("Age 1");
dtSum = (dt << Tabulate(
	Add Table(
		Column Table( Statistics( N ) ),
		Row Table( Grouping Columns( :"Age 2"n, :"Age 1"n ) )
	)
)) << Make Into Data Table;

dtSum << New Column("Is Diagonal", Numeric, "Nominal", Format("Best", 12), Formula(:Age 1 == :Age 2));

dtSum << Graph Builder(
	Size( 525, 450 ),
	Show Control Panel( 0 ),
	Variables( X( :Age 1 ), Y( :Age 2 ), Color( :Is Diagonal ) ),
	Elements( Heatmap( X, Y, Legend( 5 ), Label( "Label by Row" ) ) ),
	SendToReport(
		Dispatch(
			{},
			"400",
			ScaleBox,
			{Legend Model(
				5,
				Properties( 0, {Fill Color( -15790017 )}, Item ID( "0", 1 ) ),
				Properties( 1, {Fill Color( 5 )}, Item ID( "1", 1 ) )
			)}
		)
	)
);

 

Agustin
Level IV

Re: Confusion matrix colour and count based on different variables

Thank you for your answer, but I don't think this gives the right result; see below:

Agustin_0-1663773226396.png

 

The left image shows the values that should be displayed; the right shows what the provided script produces.
