## JSL summarize with a cross with a zero count

Super User

Joined:

Feb 10, 2013

Ok, JSL pros.  I want to summarize data that you would find in a contingency table.  Diagnosis has 2 items, TestRslt has 2 items (I know what they are if that helps so Diagnosis ={"pos", "neg"}).  The following works great if my count is at least 1 in each cell of the contingency table.  I get my c = [34, 35, 2, 3].  However, if I have a zero in a cell then I get c =[34, 35, 2] when I really need to have c = [34, 35, 2, 0] as I need to know I have a zero for a combination.

summarize (cross = by (Column( R["Diagnosis"] ) , Column(  R["TestRslt"])),

c = count,

);

1 ACCEPTED SOLUTION

Accepted Solutions

Super User

Joined:

Jun 23, 2011

Solution

cat=Categorical(

X( :Diagnosis ),

Responses( :TestRslt ),

Share Of Responses( 0 ),

Share Chart( 0 ),

Legend( 0 )

);

// The display box type poorly implemented/documented in jsl, but <<GetXML often works...

xml = Report( cat )[Gridmulticellbox( 2 )] << get xml;

// Extraxt the numbers as list of strings

c = XPath Query( xml, "//GridMultiCellBoxItem/text()" );

// Make into matrix

c = matrix( repeat( i = 0; { i++; num( c[i] ) }, nitems( c )));

6 REPLIES

Super User

Joined:

Jun 23, 2011

One way would be to add dummy rows, one for each possible combination, and then correct for the hack in the end with c = c - 1.

Another way is to use the built-in Contingency platform, e.g.:

// Example tables

x1 = {"pos", "neg"}; x2 = {"A", "B"};

dt1 = New Table( "All there", add rows( 12 ), New Column( "Diagnosis", character ), New Column( "TestRslt", character, ) );

dt2 = New Table( "One missing", add rows( 12 ), New Column( "Diagnosis", character ), New Column( "TestRslt", character ) );

For Each Row(

dt1:Diagnosis[] = x1[Count( 1, 2, 2, 5 )];

dt2:Diagnosis[] = x1[Count( 1, 2, 2, 6 )];

dt1:TestRslt[] = x2[Count( 1, 2, 2, 3 )];

dt2:TestRslt[] = x2[Count( 2, 1, 2, 7 )];

);

// Count combination frequency by Summarize()

s_counts = Function( {dt},

Current Data Table( dt );

Summarize( cross = by( Column( "Diagnosis" ), Column( "TestRslt" ) ), c = count, );

c;

);

// Count combination frequency by Contigency()

c_counts = Function( {dt},

Current Data Table( dt );

c = (Contingency(

Y( :Diagnosis ),

X( :TestRslt ),

Contingency Table(

Count( 1 ),

Total %( 0 ),

Col %( 0 ),

Row %( 0 ),

Expected( 0 ),

Deviation( 0 ),

Cell Chi Square( 0 ),

Col Cum( 0 ),

Col Cum %( 0 ),

Row Cum( 0 ),

Row Cum %( 0 )

),

Mosaic Plot( 0 ),

Tests( 0 ),

Make Into Data Table

) << close window()) << get as matrix << close window();

Shape( c`, 4, 1 ); // just a reshape to be comparable here with the summarize vector

);

// Compare!

Write( "Continency, all there=", c_counts( dt1 ), ", One missing=", c_counts( dt2 ), "\!r", " Summarize, all there=", s_counts( dt1 ), ", One missing=", s_counts( dt2 )) ;

Super User

Joined:

Feb 10, 2013

Ok, the hack option is out.  Trying to improve upon what I have and I can't be messing up my data tables.  The contingency table is an option. The reason I was trying to use a summary rather than a platform was because the specific number is critical, as in I need to know which number goes with the pos/pos outcome.  If the ordering of the labels is different in the columns then the cell for the pos/pos could change.  However, I could control for that with by setting the ordering in the column. Currently, rather than the contingency platform I use the categorical platform.  So what would by your preferred method to capture the data from the frequency table?

dt2 = New Table( "One missing",

New Column( "Diagnosis", character ),

New Column( "TestRslt", character )

);

For Each Row(

dt1:Diagnosis[] = x1[Count( 1, 2, 2, 5 )];

dt2:Diagnosis[] = x1[Count( 1, 2, 2, 6 )];

dt1:TestRslt[] = x2[Count( 1, 2, 2, 3 )];

dt2:TestRslt[] = x2[Count( 2, 1, 2, 7 )];

);

Categorical(

X( :Diagnosis ),

Responses( :TestRslt ),

Share Of Responses( 0 ),

Share Chart( 0 ),

Legend( 0 )

);

Super User

Joined:

Jun 23, 2011

Solution

cat=Categorical(

X( :Diagnosis ),

Responses( :TestRslt ),

Share Of Responses( 0 ),

Share Chart( 0 ),

Legend( 0 )

);

// The display box type poorly implemented/documented in jsl, but <<GetXML often works...

xml = Report( cat )[Gridmulticellbox( 2 )] << get xml;

// Extraxt the numbers as list of strings

c = XPath Query( xml, "//GridMultiCellBoxItem/text()" );

// Make into matrix

c = matrix( repeat( i = 0; { i++; num( c[i] ) }, nitems( c )));

Super User

Joined:

Feb 10, 2013

Thank you!  I will give that a try when I get a chance and then I will be back with the next issue.

Staff

Joined:

Jun 23, 2011

Why not use the Contingency platform, which gets it right? You want to access the powerful Cross Tab Box display box object that is responsible for the presentation of the actual Contingency Table. You can find its protocol in the Help > Scripting Index under Display Boxes. I have used it many times as a 'work horse.'

Generally, always start with the high-test level object ( Contingency ) instead of the lowest level code ( Summarize() function ).

Learn it once, use it forever!

Super User

Joined:

Feb 10, 2013

Thanks Mark,

I did end up using the contingency for extracting the values I needed but stayed with the categorical for the display.