cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Check out the JMP® Marketplace featured Capability Explorer add-in
Choose Language Hide Translation Bar
john_madden
Level VI

Col Min() & Col Max() for Character columns

I'm interested in a function that would behave analogously to how Col Min()/Col Max() does, including allowing grouping variables; but instead of operating on numeric columns, it would operate on Character columns. For example:

  • "Col Min Str( :Data, :Grouping)" would return the alphabetically first entry from :Data, grouped by :Grouping.
  • "Col Max Str(:Data, :Grouping)" would return the alphabetically last entry from :Data, grouped by :Grouping.

Ideas?

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Col Min() & Col Max() for Character columns

I think this provides what you want

names default to here(1);
dt=open("$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
	Character,
	Nominal,
	Formula(
		Col Max Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			Eval(
				Parse(
					"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
					"\!" );"
				)
			);
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[N Items( myList )];
		);
		result = Col Max Str( "Name", "Sex" );
	)
);

dt << New Column( "Min Name by Sex",
	Character,
	"Nominal",
	Formula(
		Col Min Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			Eval(
				Parse(
					"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
					"\!" );"
				)
			);
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[1];
		);
		result = Col Min Str( "Name", "Sex" );
	)
);
Jim

View solution in original post

8 REPLIES 8
txnelson
Super User

Re: Col Min() & Col Max() for Character columns

I think this provides what you want

names default to here(1);
dt=open("$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
	Character,
	Nominal,
	Formula(
		Col Max Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			Eval(
				Parse(
					"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
					"\!" );"
				)
			);
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[N Items( myList )];
		);
		result = Col Max Str( "Name", "Sex" );
	)
);

dt << New Column( "Min Name by Sex",
	Character,
	"Nominal",
	Formula(
		Col Min Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			Eval(
				Parse(
					"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
					"\!" );"
				)
			);
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[1];
		);
		result = Col Min Str( "Name", "Sex" );
	)
);
Jim
john_madden
Level VI

Re: Col Min() & Col Max() for Character columns

Jim, 

Thanks, this seems to work just right. It does solve my issue.

But can you say a bit about how it works? It looks interesting, and I'm not familiar with the style, in particular what the Eval ( Parse … ) clause is doing.

Cheers, John

txnelson
Super User

Re: Col Min() & Col Max() for Character columns

The use of Eval(Parse()) methodology, just simply allows one to build a JSL statement(s) into a literal string, and then execute them.  I use this on occasion when JMP does not interpret a statement properly.  I initially tried the statement in question in this form

rr = dt << get rows where( Column( dt, by ) == testBy );

JMP did not like this.....or rather, JMP did not interpret this as I thought it would.  The statement did work if it was in an explicit form

rr = dt << get rows where( :sex == "F" );

Therefore, by generating that specific line as a literal string and using the Eval(Parse()) combined function, it generates the exact explicit form of the code I want and then executes it

Eval(Parse(
	"rr = dt << get rows where( :" || by || " == \!"" || testBy ||"\!" );"
	)
);

 

 

Jim
txnelson
Super User

Re: Col Min() & Col Max() for Character columns

@john_madden, your reply got me started about my use of the Eval(Parse()) method.  It is a powerful but confusing methodology.  So I took a look at the code once again, and by using an As Column() function the whole Eval(Parse()) thing can be removed

names default to here(1);
dt=open("$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
	Character,
	Nominal,
	Formula(
		Col Max Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			rr = dt << get rows where( as column(dt, by) == testBy  );
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[N Items( myList )];
		);
		result = Col Max Str( "Name", "Sex" );
	)
);
Jim
Isak_JMP
Level I

Re: Col Min() & Col Max() for Character columns

Jim,
Can be this jsl more generic?
for example for the table below

Try Why
A 1A
A 2C
A 2D
B 3B
B 2F
C 1S
C 1W
C 1R

I want to get the max of "Why" per Try i.e.
A 2D
B 3B
C 1W
txnelson
Super User

Re: Col Min() & Col Max() for Character columns

The Col Max Str() function is generic.  It works on whatever character data is passed to it.  Below is it's usage on your sample data table

names default to here(1);
dt=New Table( "Untitled 2",
	Add Rows( 8 ),
	New Column( "Try",
		Character,
		"Nominal",
		Set Values( {"A", "A", "A", "B", "B", "C", "C", "C"} )
	),
	New Column( "Why",
		Character,
		"Nominal",
		Set Values( {"1A", "2C", "2D", "3B", "2F", "1S", "1W", "1R"} )
	)
);

dt << New Column( "Max Why by Try",
	Character,
	Nominal,
	Formula(
		Col Max Str = Function( {data, by},
			dt = Current Data Table();
			testBy = Column( dt, by )[Row()];
			rr = dt << get rows where( as column(dt, by) == testBy  );
			myList = As List( Column( dt, data )[rr] );
			myList = Sort List( myList );
			myList[N Items( myList )];
		);
		result = Col Max Str( "Why", "try" );
	)
);
Jim
vince_faller
Super User (Alumni)

Re: Col Min() & Col Max() for Character columns

Here's another solution that uses one of my favorite properties of math.  

 

Names default to here(1);
dt = open("$SAMPLE_DATA\Big Class.jmp");
l = :name << Get Values;

r = rank(l);

rr = rank(r);
dt << new Column("Order", <<Set Values(rr));

dt << new Column("ColMin", Formula(ColMin(:Order, :age)));
dt << new Column("ColMax", Formula(ColMax(:Order, :age)));

dt << New Column("Col Character Min", character, <<Set Each Value(l[r][:ColMin]));
dt << New Column("Col Character Max", character, <<Set Each Value(l[r][:ColMax]));

 

The way it works is all based on rank.  

 

Rank of a list will give the order required to subset the original list to get an ordered list. 

l = {"B", "D", "C", "A"};
r = rank(l);
// r = [4, 1, 3, 2]
// l[r] = {"A", "B", "C", "D"}

The cool think is if you tank a rank() of a rank() it will give you the order required to bring it back to the original list when used to subscript the sorted list. 

rr = rank(r);
// rr = [2, 4, 3, 1]
// l[r][rr] = {"B", "D", "C", "A"}

Hope this works.

 

Vince Faller - Predictum
Isak_JMP
Level I

Re: Col Min() & Col Max() for Character columns

Both repliers-  thanks -  I'll  validate  this  solution and  reply  you

The  Question is if it  can  work  on ~70k  rows  and how long?  

BTW, I'm  using  very, very  strong (RAM, SSD)  machine on  AWS instance ...