cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
BHarris
Level VI

JMP equivalent of Excel's COUNTIF function

I'm trying to mimic this Excel-type behavior:

 

 AB
1

failureType

numFailuresThisType
2

f1

=countif($a$2:$a$999,a2)
3f2=countif($a$2:$a$999,a3)
4f1=countif($a$2:$a$999,a4)
5f1=countif($a$2:$a$999,a5)
6f2=countif($a$2:$a$999,a6)
.........

 

... in JMP.  Is this possible as a column formula?

 

I've tried "Col Sum( :failureType == :failureType[ Row() ] )" but that doesn't work and I don't understand why.  Sometimes ":ColumnName" returns a single value, and other times it returns the entire column's-worth of entries and auto-loops over them.  Is there a way to tell which it will do, perhaps from the Scripting Index?

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: JMP equivalent of Excel's COUNTIF function

 

Maybe Col Sum of 1 by failuretype?

Col Sum(1, :FailureType)

 

 

Here Column 6 has formula Col Sum(1, :age)

jthi_0-1724337733694.png

 

-Jarmo

View solution in original post

14 REPLIES 14
jthi
Super User

Re: JMP equivalent of Excel's COUNTIF function

 

Maybe Col Sum of 1 by failuretype?

Col Sum(1, :FailureType)

 

 

Here Column 6 has formula Col Sum(1, :age)

jthi_0-1724337733694.png

 

-Jarmo
BHarris
Level VI

Re: JMP equivalent of Excel's COUNTIF function

Yep, that'll probably work.

 

Why didn't my Col Sum(...) work?  Is Row() also being updated internally at each step?

jthi
Super User

Re: JMP equivalent of Excel's COUNTIF function

Create column with just Row() and you will see what it returns

jthi_0-1724339067129.png

You can do the same thing for

:failureType[Row()]

and for

:failureType

and finally for

:failureType == :failureType[Row()]

to see what you will end up with (you are comparing current row value of failure type to current row value of failure type with just different syntax, which is almost always 1).

-Jarmo
BHarris
Level VI

Re: JMP equivalent of Excel's COUNTIF function

This formula seems to work as well:

 

Names Default to Here(1);
n1 = :failureType;
Col Sum( :failureType == n1 )

It seems that within the Col Sum function it is expecting to iterate over lists, similar to doing a for loop in python over zipped variables, but in the 2nd line, it's just getting the value of :failureType from the current row.  These kinds of implicit behaviors are incredibly helpful/useful, but only if the user understands how they work (which I would like to).

hogi
Level XII

Re: JMP equivalent of Excel's COUNTIF function

Wow. Kind of surprising. I didn't expect it to work this way.

How did you find out?

 

Open( "$SAMPLE_DATA/Big Class.jmp" );

New Column( "ColSum1",
	Formula(
		Col Sum( :age == 12 );
	)
);

New Column( "ColSum2",
	Formula(
		twelve = 12;
		Col Sum( :age == twelve );
	)
);

New Column( "ColSum3",
	Formula(
		myAGE = :age;
		Col Sum( :age == myAGE );
	)
);

New Column( "ColSum4",
	Formula(
		myAGE = :age;
		myname= :name;
		Col Sum( show(myname,myAge,:name,:age,"-");:age == myAGE );
	)
);

 

hogi
Level XII

Re: JMP equivalent of Excel's COUNTIF function

related post with some parallels: Countif equivalent? 
but without @BHarris 's magic : )

BHarris
Level VI

Re: JMP equivalent of Excel's COUNTIF function

@hogi It's probably not as magical as it seems -- the first argument to Col Sum is an expression, and in this case since n1 is precalculated (likely being set to the value from the current row due to the context), and == expressions seem to return 1 or 0, therefore it's simply adding up a bunch of 1s where the column value equals n1 and 0 otherwise.

 

What's a little magical to me is that sometimes in column formulas, JMP will iterate over the rows in a column when entered as ":columnName" and other times it only pulls the value from that column in the current row.  This has been mysterious to me since the discussion on that other thread.

 

I'm guessing the Col Number() function is the most appropriate one to use in this case, probably caches more cleanly than this does.

 

Frankly, I didn't know Col Number() existed.  I spent most of my adult life in Excel, so I do wish there was a mapping of Excel functions to JMP functions somewhere.

hogi
Level XII

Re: JMP equivalent of Excel's COUNTIF function

New Column( "ColSum4",
	Formula(
		myAGE = :age;
		myname= :name;
		Col Sum( show(myname,myAge,:name,:age,"-");:age == myAGE );
	)
);

is quite illustrative. Col cumulative behaves a bit different from how I exected it to behave.
I guessed Col Sum evaluates the arguemt once and then groups/collects the results according to groupBy.

But actually:

Formula is evaluated for every row, in total 40x for rows RWx.
And with every iteration, Col Sum evaluates it's argument 40 times - for every row RWy.
I guess it tries to cache the result and won't evaluate the argument again for other RWx if it can be sure that it will be the same result.

But in the case of Show() or :age = myAGE, it has to evaluate again and again.
-> in total 40 x 40 times.

 

so ,e.g. for row RWx = 3 (myName = Jane) , it evaluates the argument 40 times.

And for 8 of the 40 RWy, :age == myAge results in 1. So Col Sum is 8 for row 3.
Accordingly, Col Sum is 3 for row 40.

hogi
Level XII

Re: JMP equivalent of Excel's COUNTIF function

 

 


@jthi wrote:

Even though it looks like it does work, I wouldn't use it (I think it should be fine if it required you to have :failureType as byvar). I would think that it is some sort of unintended behavior (even more than using 1 which shouldn't be allowed either according to JMP support, but let us hope that JMP never breaks that
Make using formula expressions in Set Each Value and using expressions as first argument in Col stat...).

 


I also feel much more confident if a functionality is documented - somehow protected against sudden removal ...

 

JMP has to find out if it's OK to cache the results. Who knows the logic?

Here, JMP thinks it's OK to use the cached data:

t0 = hptime();
New Column( "test",
	Formula(
		wait(0.1);
		Col Sum( hp time() )
	),
);

current data table() << run formulas();

show((hptime()-t0)/1000000)