cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
The Discovery Summit 2025 Call for Content is open! Submit an abstract today to present at our premier analytics conference.
Choose Language Hide Translation Bar
View Original Published Thread

Formula column to obtain unique values using grouping columns

FN
FN
Level VI

What would be a simple column formula to obtain unique values with large datasets (100k rows)? Is this possible?

 

For example, looking at Cars.jmp. The distribution platform gives already the number of levels (unique elements), but I couldn't find a formula column able to calculate these.

 

Make column --> 37 levels (different manufacturers).

 

This is often called cardinality, distinct, or number of unique values.

 

The formula should be able to use group columns so we will obtain, for example, the number of models per manufacturer.

 

Where we have Toyota, the value will be 11 models. Mitsubishi 6 models, etc.

 

This is also possible under JMP Query Builder, for example.

FN_0-1741777214992.png

SELECT t1.Make, COUNT(DISTINCT t1.Model) AS "Count DISTINCT-Model"
FROM Cars t1
GROUP BY t1.Make;

 

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Cars.jmp" );

Distribution(
	Nominal Distribution( Column( :Make ) ),
	Nominal Distribution( Column( :Model ) ),
	Local Data Filter(
		Add Filter(
			columns( :Make ),
			Where( :Make == "Toyota" ),
			Display( :Make, N Items( 15 ), Find( Set Text( "" ) ) )
		)
	),
	SendToReport(
		Dispatch( {"Make"}, "Frequencies", OutlineBox, {Close( 1 )} ),
		Dispatch( {"Model"}, "Distrib Nom Hist", FrameBox,
			{Frame Size( 110, 178 )}
		)
	)
);

 

 

3 REPLIES 3
txnelson
Super User


Re: Formula column to obtain unique values using grouping columns

Here is my crazy solution

txnelson_0-1741785943324.png

 

As Constant(
	t1 = Current Data Table();
	Query(
		t1,
		"
		SELECT t1.Make, COUNT(DISTINCT t1.Model) AS 'Count DISTINCT-Model'
FROM Cars t1
GROUP BY t1.Make;"
	);
	newdt = Current Data Table();
	makeList = newdt:Make << get values;
	countMat = Column( newdt, 2 ) << get values;
	Close( newdt, nosave );
);
countMat[Contains( makeList, t1:make )];
Jim
hogi
Level XII


Re: Formula column to obtain unique values using grouping columns

jthi
Super User


Re: Formula column to obtain unique values using grouping columns

I think this might provide correct result

Col Sum(Row() == Col Min(Row(), :Model, :Make), :Make)

But currently I would most likely go with Summary -> update and not use a formula

Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Cars.jmp");


// Formula option
newcol = dt << New Column("R", Numeric, Ordinal, Formula(
	Col Sum(Row() == Col Min(Row(), :Model, :Make), :Make)
));


// Summary + update
dt_summary = dt << Summary(
	Group(:Make),
	N Categories(:Model),
	Freq("None"),
	Weight("None"),
	statistics column name format("stat of column"),
	private
);


// Update data table
dt << Update(
	With(dt_summary),
	Match Columns(:Make = :Make),
	Add Columns from Update Table(:N Categories of Model),
	Replace Columns in Main Table(:N Categories of Model)
);

Close(dt_summary, no save);
-Jarmo