Discussions

Sciguy_1 · Jun 20, 2019 01:06 PM

I want to count rows in my dataset if 3 columns (A-B-C) match. I want to enter the value in a new column (D). Here is an example. Thank you for your assistance!

Column A	Column B	Column C	Column D (Counted Tally)
7	Z	S	4
7	Z	S	4
1	A	P	3
1	A	P	3
1	A	P	3
7	Z	S	4
5	E	M	2
5	E	M	2
7	Z	S	4

julian · Jun 20, 2019 01:36 PM

Hi @Sciguy_1

There are a few ways you could go about this. I think a formula column approach is relatively straightforward, with two columns, first, one that counts the occurrences of a match incrementally, and then a second column that finds the max value of that count by group, which is what you're calling a counted tally here. Here's what that table looks like:

And here are the column formulas:

Counting in groups by the columns used for a match:

Col Cumulative Sum( 1, :Column A, :Column B, :Column C )

and the Max of that count column, also grouped by the columns used for a match

Col Maximum( :Count In Groups, :Column A, :Column B, :Column C )

Table is attached for your reference.

Alternatively, if you wish to create a table that lists unique rows only once, with the number of times they matched, you can use Tables > Summary. Enter in your columns as Group, and hit okay. Here's what that setup looks like:

This will return a summary table that shows the unique combinations of values in those three columns, along with a column showing the number of occurrences in the data table.

Once you have this table, you can select, and then drag the N Rows column from the summary table directly into your original table, and JMP will handle all the Matching for you to update the original table with those values. Here's what that looks like:

Finally, the script for making the table is also saved as a Source script to the summary table:

Data Table( "Example.jmp" ) << Summary(
	Group( :Column A, :Column B, :Column C ),
	Freq( "None" ), Weight( "None" )
)

I hope this helps!

@julian

View solution in original post

julian · Jun 20, 2019 01:36 PM

Hi @Sciguy_1

There are a few ways you could go about this. I think a formula column approach is relatively straightforward, with two columns, first, one that counts the occurrences of a match incrementally, and then a second column that finds the max value of that count by group, which is what you're calling a counted tally here. Here's what that table looks like:

And here are the column formulas:

Counting in groups by the columns used for a match:

Col Cumulative Sum( 1, :Column A, :Column B, :Column C )

and the Max of that count column, also grouped by the columns used for a match

Col Maximum( :Count In Groups, :Column A, :Column B, :Column C )

Table is attached for your reference.

Alternatively, if you wish to create a table that lists unique rows only once, with the number of times they matched, you can use Tables > Summary. Enter in your columns as Group, and hit okay. Here's what that setup looks like:

This will return a summary table that shows the unique combinations of values in those three columns, along with a column showing the number of occurrences in the data table.

Once you have this table, you can select, and then drag the N Rows column from the summary table directly into your original table, and JMP will handle all the Matching for you to update the original table with those values. Here's what that looks like:

Finally, the script for making the table is also saved as a Source script to the summary table:

Data Table( "Example.jmp" ) << Summary(
	Group( :Column A, :Column B, :Column C ),
	Freq( "None" ), Weight( "None" )
)

I hope this helps!

@julian

Sciguy_1 · Jun 21, 2019 11:54 AM

Thank you Julian the column approach addresses my application perfectly.

Sciguy_1 · Jun 21, 2019 03:06 PM

To go one step further- Lets say I wanted to add another column that showed the number of times a combination occurred within a subset. So in the example, in the "total in group" column there are 4-"4", 3-"3", and 2-"2". If all the rows (1 thru 9) represented the entire subset, the answer in this case would be 3 because a "4" occurred, a "3" occurred, and a "2" occurred. So in the new column "total in subset" every row would have a 3. I cant figure out how to get figures summed in that manner.

txnelson · Jun 21, 2019 03:24 PM

create a new column and use the following formula for the new column

summarize(byGroup=by(:Total in Group));
n items(byGroup)

Jim

julian · Jun 21, 2019 03:26 PM

Hi @Sciguy_1,

I believe I understand what you're asking, but let me reiterate back to make sure. You'd like a new column that has a count of the number of unique combinations, regardless of how many times those combinations occurred? In the subset you provided, there are 3 unique combinations of the three columns, and as you pointed out, one of the combinations occurred 4 times, another 3 times, and another 2 times. Another way to say this is how many unique rows do you have, as defined by the first three columns. If this is indeed what you're after, the following column formula will do, which essentially counts up the number of times a 1 occurs in the "Count In Groups" Columns, i.e., the number of unique groups:

As Constant(
	uniqueRowCount = Summation(
		i = 1,
		N Row(),
		If( :Count In Groups[i] == 1,
			1,
			0
		)
	)
);
uniqueRowCount;

I've attached the new table here as well. Is this what you were looking to do?

@julian

Sciguy_1 · Jun 21, 2019 04:40 PM

That is exactly what I am wanting to do, you explained it quite well. But upon applying the formula I am not getting the correct values. Let me supply a new example just to make sure we are on the same wavelength. This is a very large and quite dynamic data set. Hope this better explains and does not confuse the issue. Thank you.

TUBE	JAR	BUCKET	SOG	No. Times Tube occurs per JAR X BUCKET X SOG	No. unique times Jar occurs per bucket X SOG combination
ASX	ABE	WDNH	5001	3	2
CDV	ABE	WDNH	5001	3	2
KNL	ABE	WDNH	5001	3	2
NHS	BEN	WDNH	5001	2	2
SDC	BEN	WDNH	5001	2	2
MKD	CAT	DEFC	5004	4	1
KDN	CAT	DEFC	5004	4	1
AMD	CAT	DEFC	5004	4	1
CMD	CAT	DEFC	5004	4	1
MDN	DOG	DEFC	5007	2	2
VNF	DOG	DEFC	5007	2	2
RKE	EEL	DEFC	5007	5	2
LKD	EEL	DEFC	5007	5	2
KDM	EEL	DEFC	5007	5	2
WLD	EEL	DEFC	5007	5	2
KMD	EEL	DEFC	5007	5	2
FDL	FLY	SDER	5008	6	2
SMK	FLY	SDER	5008	6	2
KJM	FLY	SDER	5008	6	2
ALD	FLY	SDER	5008	6	2
NDE	FLY	SDER	5008	6	2
IEJ	FLY	SDER	5008	6	2
DFR	ABE	SDER	5008	3	2
TRE	ABE	SDER	5008	3	2
NFE	ABE	SDER	5008	3	2

julian · Jun 21, 2019 1:51 PM

It appears that every row in your table is unique, so I would expect "No. Times Tube occurs per JAR X BUCKET X SOG" to be 1 for the entire table. Am I misunderstanding this format?

Also - attaching your JMP data table, or a subset of it, with your column formulas intact will be much more helpful to diagnose what is happening, rather than pasting the table into the body of the message.

Sciguy_1 · Jun 24, 2019 12:55 PM

You are correct, every row is unique. Here is a subset of the real dataset. I created a couple columns representing what I am trying to accompolish. The column 'number of times BUCKET appears per SOG' I manually populated.The NULL value can be treated as its own BUCKET. The second column needs to show the number of times J1 appears per: SOG X BUCKET combination. Hope this makes things easier thank you for your assistance.

Sciguy_1 · Jun 24, 2019 12:55 PM

You are correct, every row is unique. Here is a subset of the real dataset. I created a couple columns representing what I am trying to accompolish. The column 'number of times BUCKET appears per SOG' I manually populated.The NULL value can be treated as its own BUCKET. The second column needs to show the number of times J1 appears per: SOG X BUCKET combination. Hope this makes things easier thank you for your assistance.

Discussions

Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Re: Counting rows based on multi condition match

Recommended Articles