cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Lino
Level III

Summary Data Table by Category

Hi,

 

I have data with around 5000 rows.

There three columns (Group, Score and Count)

Value for Group: A, B

Value for Score ranging 0.5 - 1

Value for Count is 1 for each row.

 

Lino_0-1606919316800.png

 

I would like to summary the data by group A and B and group the Score value in few category and then sum the count for each score category , as below format.

 

Lino_1-1606919681918.png

 

I am seeking help from the community how to create JSL script to perform this task?

 

I attach data here as well.

 

 

Thanks

 

 

 

 

 

2 ACCEPTED SOLUTIONS

Accepted Solutions
Michael_MXAK
Level IV

Re: Summary Data Table by Category

I would use an interactive approach to capture a script for this. First, let's start with the data table:

Michael_MXAK_0-1606922913008.png

I left Count off because we won't need it but it won't hurt if it's there.

 

I selected the Score column, then navigated to Cols, Utilities, Make Binning Formula:

Michael_MXAK_1-1606922971171.png

 

And set Offset to 0.5, and bin width to 0.1:

Michael_MXAK_2-1606923036446.png

Right click the new column (Score Binned) and Copy Column Properties:

Michael_MXAK_3-1606923089990.png

 

Then open a new script window and paste (ctrl-V), which results in this JSL:

Add Column Properties(
	Value Labels(
		{-0.5 = "-0.5 — -0.4", -0.4 = "-0.4 — -0.3", -0.3 = "-0.3 — -0.2", -0.2 =
		"-0.2 — -0.1", -0.1 = "-0.1 — 0", 0 = "0 — 0.1", 0.1 = "0.1 — 0.2", 0.2 =
		"0.2 — 0.3", 0.3 = "0.3 — 0.4", 0.4 = "0.4 — 0.5", 0.5 = "0.5 — 0.6", 0.6 =
		"0.6 — 0.7", 0.7 = "0.7 — 0.8", 0.8 = "0.8 — 0.9", 0.9 = "0.9 — 1", 1 =
		"1 — 1.1", 1.1 = "1.1 — 1.2", 1.2 = "1.2 — 1.3", 1.3 = "1.3 — 1.4", 1.4 =
		"1.4 — 1.5", 1.5 = "1.5 — 1.6", 1.6 = "1.6 — 1.7", 1.7 = "1.7 — 1.8", 1.8 =
		"1.8 — 1.9", 1.9 = "1.9 — 2"}
	),
	Use Value Labels( 1 ),
	Formula( 0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 ) )
)

Two things here: we don't care about any scores less than 0.5 (they don't exist in your description) or greater than 1, so we can change the properties a bit and edit the label names to match what you want. I've also added an If statement to capture the edge case where a score is exactly 1

Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

(BTW those labels aren't technically accurate, since 0.599 would be labeled as between 0.5 and 0.59)

 

We can wrap some code around this snippet to make it reusable, where we use the current data table, add a new column called Score Binned, and add these properties to that column:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

Now, back on our table, Tables > Summary takes us where we want to go. We put Group and Score Binned in Group, and that's all we actually need:

Michael_MXAK_4-1606923615913.png

Michael_MXAK_5-1606923638599.png

Now, N Rows is what we want, it's just named wrong, so let's double-click the word "Source" on the left and copy the table creation script and replace the named table reference with "dt", our working table, and catch the new table in the variable "summ":

Michael_MXAK_6-1606923683769.png

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

Finally, rename N Rows to "Total_Count" as desired:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

summ:"N Rows"n << Set Name("Total_Count");

Now, close the summary table, delete the Score Binned column from the source table, and test the script:

Michael_MXAK_7-1606923867843.png

Final JSL again:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

summ:"N Rows"n << Set Name("Total_Count");

Tested on your attached data, which I literally just noticed

Michael_MXAK_8-1606923973365.png

 

View solution in original post

txnelson
Super User

Re: Summary Data Table by Category

Here are a couple of ways to do this.  But since the example below is really just based upon primary JMP functionality, I strongly suggest that you read the Discovering JMP, and the Using JMP documents, from the JMP Documentation Library, available under the Help pull down menu.

Names Default To Here( 1 );
dt = Current Data Table();

dt << New Column( "Score Category",
	character,
	formula(
		If(
			:Score < .6, "0.50-0.59",
			:Score < .8, "0.60-0.79",
			:Score < .9, "0.80-0.89",
			"0.90-1.00",
		)
	)
);

dtSum = dt << Summary(
	Group( :Group, :Score Category ),
	Sum( :Count ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

dtSum << delete columns( :N Rows );
dtSum:Name( "Sum(Count)" ) << set name( "Total_Count" );


// or

dt:score << set property(
	"value labels",
	{0.5 <= "0.50-0.59" < 0.6, 0.6 <= "0.60-0.79" < 0.8, 0.8 <= "0.80-0.89" < 0.9, 0.9
	 <= "0.90-1.00" <= 1}
);
	
dtSum = dt << Summary(
	Group( :Group, :Score ),
	Sum( :Count ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

dtSum << delete columns( :N Rows );
dtSum:Name( "Sum(Count)" ) << set name( "Total_Count" );
Jim

View solution in original post

3 REPLIES 3
Michael_MXAK
Level IV

Re: Summary Data Table by Category

I would use an interactive approach to capture a script for this. First, let's start with the data table:

Michael_MXAK_0-1606922913008.png

I left Count off because we won't need it but it won't hurt if it's there.

 

I selected the Score column, then navigated to Cols, Utilities, Make Binning Formula:

Michael_MXAK_1-1606922971171.png

 

And set Offset to 0.5, and bin width to 0.1:

Michael_MXAK_2-1606923036446.png

Right click the new column (Score Binned) and Copy Column Properties:

Michael_MXAK_3-1606923089990.png

 

Then open a new script window and paste (ctrl-V), which results in this JSL:

Add Column Properties(
	Value Labels(
		{-0.5 = "-0.5 — -0.4", -0.4 = "-0.4 — -0.3", -0.3 = "-0.3 — -0.2", -0.2 =
		"-0.2 — -0.1", -0.1 = "-0.1 — 0", 0 = "0 — 0.1", 0.1 = "0.1 — 0.2", 0.2 =
		"0.2 — 0.3", 0.3 = "0.3 — 0.4", 0.4 = "0.4 — 0.5", 0.5 = "0.5 — 0.6", 0.6 =
		"0.6 — 0.7", 0.7 = "0.7 — 0.8", 0.8 = "0.8 — 0.9", 0.9 = "0.9 — 1", 1 =
		"1 — 1.1", 1.1 = "1.1 — 1.2", 1.2 = "1.2 — 1.3", 1.3 = "1.3 — 1.4", 1.4 =
		"1.4 — 1.5", 1.5 = "1.5 — 1.6", 1.6 = "1.6 — 1.7", 1.7 = "1.7 — 1.8", 1.8 =
		"1.8 — 1.9", 1.9 = "1.9 — 2"}
	),
	Use Value Labels( 1 ),
	Formula( 0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 ) )
)

Two things here: we don't care about any scores less than 0.5 (they don't exist in your description) or greater than 1, so we can change the properties a bit and edit the label names to match what you want. I've also added an If statement to capture the edge case where a score is exactly 1

Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

(BTW those labels aren't technically accurate, since 0.599 would be labeled as between 0.5 and 0.59)

 

We can wrap some code around this snippet to make it reusable, where we use the current data table, add a new column called Score Binned, and add these properties to that column:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

Now, back on our table, Tables > Summary takes us where we want to go. We put Group and Score Binned in Group, and that's all we actually need:

Michael_MXAK_4-1606923615913.png

Michael_MXAK_5-1606923638599.png

Now, N Rows is what we want, it's just named wrong, so let's double-click the word "Source" on the left and copy the table creation script and replace the named table reference with "dt", our working table, and catch the new table in the variable "summ":

Michael_MXAK_6-1606923683769.png

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

Finally, rename N Rows to "Total_Count" as desired:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

summ:"N Rows"n << Set Name("Total_Count");

Now, close the summary table, delete the Score Binned column from the source table, and test the script:

Michael_MXAK_7-1606923867843.png

Final JSL again:

dt = Current Data Table();

dt << New Column("Score Binned", Character);

dt:"Score Binned"n << Add Column Properties(
	Value Labels(
		{0.5 = "0.5 — 0.59", 0.6 = "0.6 — 0.69", 0.7 = "0.7 — 0.79", 0.8 = "0.8 — 0.89", 0.9 =
		"0.9 — 1"}
	),
	Use Value Labels( 1 ),
	Formula(
		If(
			(0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )) == 1, 0.9,
			0.5 + 0.1 * Floor( (:Score - 0.5) / 0.1 )
		)
	)
);

summ = dt << Summary(
	Group( :Group, :Score Binned ),
	Freq( "None" ),
	Weight( "None" )
);

summ:"N Rows"n << Set Name("Total_Count");

Tested on your attached data, which I literally just noticed

Michael_MXAK_8-1606923973365.png

 

txnelson
Super User

Re: Summary Data Table by Category

Here are a couple of ways to do this.  But since the example below is really just based upon primary JMP functionality, I strongly suggest that you read the Discovering JMP, and the Using JMP documents, from the JMP Documentation Library, available under the Help pull down menu.

Names Default To Here( 1 );
dt = Current Data Table();

dt << New Column( "Score Category",
	character,
	formula(
		If(
			:Score < .6, "0.50-0.59",
			:Score < .8, "0.60-0.79",
			:Score < .9, "0.80-0.89",
			"0.90-1.00",
		)
	)
);

dtSum = dt << Summary(
	Group( :Group, :Score Category ),
	Sum( :Count ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

dtSum << delete columns( :N Rows );
dtSum:Name( "Sum(Count)" ) << set name( "Total_Count" );


// or

dt:score << set property(
	"value labels",
	{0.5 <= "0.50-0.59" < 0.6, 0.6 <= "0.60-0.79" < 0.8, 0.8 <= "0.80-0.89" < 0.9, 0.9
	 <= "0.90-1.00" <= 1}
);
	
dtSum = dt << Summary(
	Group( :Group, :Score ),
	Sum( :Count ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

dtSum << delete columns( :N Rows );
dtSum:Name( "Sum(Count)" ) << set name( "Total_Count" );
Jim
Lino
Level III

Re: Summary Data Table by Category

Thanks Jim and Michael.