Choose Language Hide Translation Bar
Highlighted
Level II

## How to analyze the numerical order of repetetive values in rows?

Hi,

I am one of the few persons in Sweden who uses JMP whereas most of my academic collegues use SPSS. However, I love JMP and have published 70 articles in surgery with it so it works fine with for me. I am Associate Professor and would describe myself as an advanced JMP amateur that can programme somewhat however there is one issue that I am having problems with. I have posted this question previously but did not exactly understand the answers on the forum. The problem is as follows:

I have a file called Results.

In this file I have a column of 5700 numbers that are sorted in order like:

Nr

3456

3666

3687

3687

3687

5444

5444

5445

I would appreciate information on how I should write a script or if I should write anything in an extra column in order to get the order of repetetive appearances of the numbers. The reason is that I am only interested in the first nr. and if I can get the order I could make a separate file consisting of only order 1.

like

Nr         Order

3456     1

3666     1

3687     1

3687     2

3687     3

5444     1

5444     2

5445     1

Since I am an amateur I would appreciate if you could write the answer in a simple way.

Sincerely yours

Lars Enochsson

Associate Professor

Umeå University

Sweden

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Super User

## Re: How to analyze the numerical order of repetetive values in rows?

Since the column Nr is in sorted order, ths can be done with a column formula. Before presenting the formula, you should look into Main Menu > Help > Scripting Index > Functions > Row or search for Lag.  I created a column called Count and added this formula.  By default the first row has a count of 1.  The second condition :Nr == Lag( :Nr, 1) is the condition, if the current row is equal to the previous row, then take the previous count, Lag( :Count, 1 ) and increment it.  Otherwise, the count is 1.

If(
Row() == 1, 1,
:Nr == Lag( :Nr, 1 ), Lag( :Count, 1 ) + 1,
1
)

Note JMP Lag works on text as well as numbers.  However, if I had a column of text, I'd recommend modifying the function to be

If(
Row() == 1, 1,
Trim( Lowercase( :txtCol ) ) == Trim( Lowercase (Lag( :txtCol, 1 ) ) ) , Lag( :Count, 1 ) + 1,
1
)

Hope that helps.

4 REPLIES 4
Highlighted
Super User

## Re: How to analyze the numerical order of repetetive values in rows?

Since the column Nr is in sorted order, ths can be done with a column formula. Before presenting the formula, you should look into Main Menu > Help > Scripting Index > Functions > Row or search for Lag.  I created a column called Count and added this formula.  By default the first row has a count of 1.  The second condition :Nr == Lag( :Nr, 1) is the condition, if the current row is equal to the previous row, then take the previous count, Lag( :Count, 1 ) and increment it.  Otherwise, the count is 1.

If(
Row() == 1, 1,
:Nr == Lag( :Nr, 1 ), Lag( :Count, 1 ) + 1,
1
)

Note JMP Lag works on text as well as numbers.  However, if I had a column of text, I'd recommend modifying the function to be

If(
Row() == 1, 1,
Trim( Lowercase( :txtCol ) ) == Trim( Lowercase (Lag( :txtCol, 1 ) ) ) , Lag( :Count, 1 ) + 1,
1
)

Hope that helps.

Highlighted
Level II

## Re: How to analyze the numerical order of repetetive values in rows?

Thanks. Excellent help. This solved the problem./Lars
Highlighted
Super User

## Re: How to analyze the numerical order of repetetive values in rows?

Try a column formula with the Col Rank() function with a by-variable It may require JMP 13 or later, not sure when support for by-variables in ColRank first was introduced.

Try the script below. It should give the same result as the previous solution. However, it does not depend on sorting.

Current Data Table() << New Column("Order", Formula(Col Rank(:Nr, :Nr)));

(Yet another JMP user in Sweden)

Highlighted
Super User

## Re: How to analyze the numerical order of repetetive values in rows?

Sweet!  ColRank() with a by variable was not available until JMP 13.

As an FYI, here is a script to check performance.  If there were 1,000,000 or more rows, I'd probably sort and use lags.

Names Default to Here(1);

dt = New Table("Test It", add rows(6000),
New Column("Nr", numeric, continuous,<<set each value(Random Integer(1,2500)) )
);

t0 = hptime();
dt << New Column("Count", <<Formula(ColRank(:Nr, :Nr)));
t1 = hptime();

dt:Count << delete formula;

t2 = hptime();
dt << Sort( By(:Nr), Order( Ascending ), Replace Table);
t3 = hptime();

t4 = hptime();
dt << New Column("Count2", <<Formula(If(
Row() == 1, 1,
:Nr == Lag( :Nr, 1 ), Lag( :Count, 1 ) + 1,
1
)));
t5 = hptime();

show(t1-t0, t3-t2, t5-t4); //Col Rank(), Sort(), using Lag()

For 6000 rows, using Col Rank() =  3.13*using lags
t1 - t0 = 18836;
t3 - t2 = 1863;
t5 - t4 = 4161;
For 1,000,000 rows, using Col Rank() =  2.45*using lags
t1 - t0 = 1521679;
t3 - t2 = 694813;
t5 - t4 = 621142;
*/