It’s World Statistics Day! To honor the theme of the day, the JMP User Community is having conversations about the importance of trust in statistics and data. And we want to hear from you! Tell us the steps you take to ensure that your data is trustworthy.
Choose Language Hide Translation Bar
Highlighted

## Col Min() & Col Max() for Character columns

I'm interested in a function that would behave analogously to how Col Min()/Col Max() does, including allowing grouping variables; but instead of operating on numeric columns, it would operate on Character columns. For example:

• "Col Min Str( :Data, :Grouping)" would return the alphabetically first entry from :Data, grouped by :Grouping.
• "Col Max Str(:Data, :Grouping)" would return the alphabetically last entry from :Data, grouped by :Grouping.

Ideas?

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

## Re: Col Min() & Col Max() for Character columns

I think this provides what you want

``````names default to here(1);
dt=open("\$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
Character,
Nominal,
Formula(
Col Max Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
Eval(
Parse(
"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
"\!" );"
)
);
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList[N Items( myList )];
);
result = Col Max Str( "Name", "Sex" );
)
);

dt << New Column( "Min Name by Sex",
Character,
"Nominal",
Formula(
Col Min Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
Eval(
Parse(
"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
"\!" );"
)
);
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList;
);
result = Col Min Str( "Name", "Sex" );
)
);``````
Jim
8 REPLIES 8
Highlighted

## Re: Col Min() & Col Max() for Character columns

I think this provides what you want

``````names default to here(1);
dt=open("\$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
Character,
Nominal,
Formula(
Col Max Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
Eval(
Parse(
"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
"\!" );"
)
);
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList[N Items( myList )];
);
result = Col Max Str( "Name", "Sex" );
)
);

dt << New Column( "Min Name by Sex",
Character,
"Nominal",
Formula(
Col Min Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
Eval(
Parse(
"rr = dt << get rows where( :" || by || " == \!"" || testBy ||
"\!" );"
)
);
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList;
);
result = Col Min Str( "Name", "Sex" );
)
);``````
Jim
Highlighted

## Re: Col Min() & Col Max() for Character columns

Jim,

Thanks, this seems to work just right. It does solve my issue.

But can you say a bit about how it works? It looks interesting, and I'm not familiar with the style, in particular what the Eval ( Parse … ) clause is doing.

Cheers, John

Highlighted

## Re: Col Min() & Col Max() for Character columns

The use of Eval(Parse()) methodology, just simply allows one to build a JSL statement(s) into a literal string, and then execute them.  I use this on occasion when JMP does not interpret a statement properly.  I initially tried the statement in question in this form

``rr = dt << get rows where( Column( dt, by ) == testBy );``

JMP did not like this.....or rather, JMP did not interpret this as I thought it would.  The statement did work if it was in an explicit form

``rr = dt << get rows where( :sex == "F" );``

Therefore, by generating that specific line as a literal string and using the Eval(Parse()) combined function, it generates the exact explicit form of the code I want and then executes it

``````Eval(Parse(
"rr = dt << get rows where( :" || by || " == \!"" || testBy ||"\!" );"
)
);``````

Jim
Highlighted

## Re: Col Min() & Col Max() for Character columns

@john_madden, your reply got me started about my use of the Eval(Parse()) method.  It is a powerful but confusing methodology.  So I took a look at the code once again, and by using an As Column() function the whole Eval(Parse()) thing can be removed

``````names default to here(1);
dt=open("\$SAMPLE_DATA/big class.jmp");

dt << New Column( "Max Name by Sex",
Character,
Nominal,
Formula(
Col Max Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
rr = dt << get rows where( as column(dt, by) == testBy  );
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList[N Items( myList )];
);
result = Col Max Str( "Name", "Sex" );
)
);``````
Jim
Highlighted

## Re: Col Min() & Col Max() for Character columns

Jim,
Can be this jsl more generic?
for example for the table below

Try Why
A 1A
A 2C
A 2D
B 3B
B 2F
C 1S
C 1W
C 1R

I want to get the max of "Why" per Try i.e.
A 2D
B 3B
C 1W
Highlighted

## Re: Col Min() & Col Max() for Character columns

The Col Max Str() function is generic.  It works on whatever character data is passed to it.  Below is it's usage on your sample data table

``````names default to here(1);
dt=New Table( "Untitled 2",
New Column( "Try",
Character,
"Nominal",
Set Values( {"A", "A", "A", "B", "B", "C", "C", "C"} )
),
New Column( "Why",
Character,
"Nominal",
Set Values( {"1A", "2C", "2D", "3B", "2F", "1S", "1W", "1R"} )
)
);

dt << New Column( "Max Why by Try",
Character,
Nominal,
Formula(
Col Max Str = Function( {data, by},
dt = Current Data Table();
testBy = Column( dt, by )[Row()];
rr = dt << get rows where( as column(dt, by) == testBy  );
myList = As List( Column( dt, data )[rr] );
myList = Sort List( myList );
myList[N Items( myList )];
);
result = Col Max Str( "Why", "try" );
)
);``````
Jim
Highlighted

## Re: Col Min() & Col Max() for Character columns

Here's another solution that uses one of my favorite properties of math.

``````Names default to here(1);
dt = open("\$SAMPLE_DATA\Big Class.jmp");
l = :name << Get Values;

r = rank(l);

rr = rank(r);
dt << new Column("Order", <<Set Values(rr));

dt << new Column("ColMin", Formula(ColMin(:Order, :age)));
dt << new Column("ColMax", Formula(ColMax(:Order, :age)));

dt << New Column("Col Character Min", character, <<Set Each Value(l[r][:ColMin]));
dt << New Column("Col Character Max", character, <<Set Each Value(l[r][:ColMax]));``````

The way it works is all based on rank.

Rank of a list will give the order required to subset the original list to get an ordered list.

``````l = {"B", "D", "C", "A"};
r = rank(l);
// r = [4, 1, 3, 2]
// l[r] = {"A", "B", "C", "D"}``````

The cool think is if you tank a rank() of a rank() it will give you the order required to bring it back to the original list when used to subscript the sorted list.

``````rr = rank(r);
// rr = [2, 4, 3, 1]
// l[r][rr] = {"B", "D", "C", "A"}``````

Hope this works.

Vince Faller - Predictum
Highlighted

## Re: Col Min() & Col Max() for Character columns

Both repliers-  thanks -  I'll  validate  this  solution and  reply  you

The  Question is if it  can  work  on ~70k  rows  and how long?

BTW, I'm  using  very, very  strong (RAM, SSD)  machine on  AWS instance ...

Article Labels

There are no labels assigned to this post.