Count the Most Frequently Occurring Item in a Column

SpannerHead · Apr 24, 2025 10:03 AM

I want to readily identify the item that occurs most frequently, by row count, in a JMP column. I can easily determine the unique entries; how best can I get a count from there?

dt = Current Data Table();
summarize( lotz = by(:LOT));

Slán

SpannerHead

mmarchandFSLR · Apr 24, 2025 7:24 AM

This takes into account the possibility of a tie for most frequent:

Names Default To Here( 1 );
dt = Data Table( "Big Class" );
dt << Add Rows( {:name = "BARBARA", :age = 13, :sex = "F", :height = 100, :weight = 79} );
Summarize( namez = By( :name ) );
numberz = Transform Each( {v, i}, namez, Length( Where( dt, :name == v ) ) );
max_num = Max( numberz );	//2
big_name = namez[Where( numberz == max_num )];	//{"BARBARA", "ROBERT"}

SpannerHead · Apr 24, 2025 02:36 PM

This is a good-looking script, but for some reason it gives numberz as 5 for everything.

Thanks

Slán

SpannerHead
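
For comparison, here is a minimal sketch of a tie-aware count that lets Summarize do the counting instead of looking up rows per value. It assumes the Big Class sample table and reuses the extra BARBARA row from the post above; `top_idx` and `top_names` are illustrative names, not part of anyone's script in this thread.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Add a second BARBARA so that two names tie for most frequent (as in the post above).
dt << Add Rows( {:name = "BARBARA", :age = 13, :sex = "F", :height = 100, :weight = 79} );

// namez: list of distinct names; cnt: matrix of row counts in the same order.
Summarize( dt, namez = By( :name ), cnt = Count );

// Loc() returns every position where the comparison is true, so all tied names are kept.
top_idx = Loc( cnt == Max( cnt ) );
top_names = namez[top_idx];
Show( Max( cnt ), top_names );
```

Passing `dt` explicitly to Summarize avoids depending on whichever table happens to be current when the script runs.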

txnelson · Apr 24, 2025 11:30 AM

The Distribution Platform will give you a Mode.

It is under the red triangle for the Summary Statistics Paragraph. Just select Customize Summary Statistics and then select the additional statistics you want displayed.

Jim

hogi · Apr 24, 2025 12:05 PM

If you prefer a column formula:

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );
New Column( "N by age",
	Formula( Col Number( 1, :age ) )
)

and to identify the winner:

New Column( "max N by age",
	Formula( Col Number( 1, :age ) == Col Max( Col Number( 1, :age ) ) )
)

Alternatively, you can use Summarize:

Summarize( namez = By( :age ), cnt = count(:age) );

or tables/summary:

Data Table( "Big Class Families" ) << Summary(
	Group( :age )
);

SpannerHead · Apr 24, 2025 02:43 PM

The summarize suggestion is a good one. It gives me 2 dissociated lists. If I could somehow get that to be an associative array, I could use the max "cnt" value to identify the associated "namez" value.

Slán

SpannerHead

SpannerHead · Apr 24, 2025 03:07 PM

I did this and it seems to work.

Summarize( namez = By( :LOT ), cnt = count(:LOT) );

keys = namez;
values = Eval List(cnt);

AA = Associative Array(keys, values);

For( g = 1, g <= N Items( AA ), g++,
    If(values[g] == Max(cnt), mainLOT = keys[g]));

Slán

SpannerHead
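
If ties do not matter, the loop and the associative array may not be needed. A minimal sketch, assuming the Big Class sample table with `:age` standing in for `:LOT`; `top_key` is an illustrative name. It relies on Summarize returning the labels and the counts in the same order, which is also what the loop above depends on.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// keys: list of distinct values; cnt: matrix of counts in the same order.
// Substitute your own column (for example :LOT) for :age.
Summarize( dt, keys = By( :age ), cnt = Count );

// Loc Max() gives the position of the first maximum in cnt,
// which indexes straight into the list of keys.
top_key = keys[Loc Max( cnt )];
Show( top_key, Max( cnt ) );
```

With tied counts this keeps only the first of the tied keys.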

jthi · Apr 24, 2025 03:33 PM

The Summary platform is what I generally use for this. How I use it depends on the use case: whether ties matter, whether I need the count, how missing values should be handled, and so on.

Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Big Class.jmp");

dt_summary = dt << Summary(
	Group(:sex),
	Freq("None"),
	Weight("None"),
	output table name("res"),
	Private
);

// Using Loc Max
max_row = Loc Max(dt_summary[0, "N Rows"]);
max_val = dt_summary[max_row, 1];
show(max_val);

// Sorting
dt_summary << Sort(By(:N Rows), Replace Table, Order(Descending));
my_val = dt_summary[1, 1];
show(my_val);

Close(dt_summary, no save);

Also, just using the Mode function might be enough:

Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Big Class.jmp");
val = Mode(dt[0, "sex"]);

And there are plenty of other options (as you can see in this thread already).

-Jarmo

SpannerHead · Apr 24, 2025 04:38 PM

Jarmo

Mighty! This does the trick, and it includes non-numeric values. The only refinement I need is to have it ignore missing data somehow.

dt = Current Data Table();
val = Mode(dt[0, "process"]);

Slán

SpannerHead

SpannerHead · Apr 24, 2025 05:46 PM

I made this tweak and it appears to work.

dt = Current Data Table();

full_proc_rows = dt << get rows where( !Is Missing( :process ) );

proc_val = Mode(dt[full_proc_rows, "process"]);

Slán

SpannerHead
