
JSL Matrix/List Programming Change for JMP 13

XanGregg

Staff

Joined:

Jun 23, 2011

Indexing into a matrix (or a list) with another matrix (or list) is a powerful technique for making scripts fast and small, but there is one edge case to worry about which we are fixing for JMP 13. Perhaps I should put "fixing" in quotation marks because any change is a potential disruption, which is why I want to explain it here in advance of JMP 13.


Here's an example of using a matrix as an index into a data column.


Open("$SAMPLE_DATA/Big Class.jmp");

tall rows = Loc(:height << Get Values() >= 67);  // [25, 27, 30, 37, 39, 40]

:height[tall rows]; // [68, 69, 67, 68, 68, 70]

N Rows(:height[tall rows]);  // 6


If you're not familiar with this style of programming, the second line is the key. ":height << Get Values()" returns the values of the height column as a matrix. The comparison ">= 67" returns a matrix of 0s and 1s corresponding to where the expression is false (0) or true (1). The Loc() function returns the locations of the 1s within the previous matrix.

In this case we get [25, 27, 30, 37, 39, 40] as the row numbers where the height column is greater than or equal to 67. That's pretty handy by itself, but the real power comes from being able to use that matrix as the index into a column or other matrix as in the third and fourth lines.


The alternative to this technique would be to write a For() loop to iterate through every row explicitly. Of course, JMP is still looping through the rows in either case, but it's much faster if done internally rather than in JSL. And the JSL is more compact without the explicit loop.
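For comparison, here is a minimal sketch of the explicit loop described above, assuming Big Class.jmp opens from the sample data folder. The name loop rows is only illustrative, and the loop collects the row numbers into a list rather than a matrix.

```jsl
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

loop rows = {};  // illustrative name; will hold the qualifying row numbers
For( i = 1, i <= N Rows( dt ), i++,
	// test one row at a time instead of the whole column at once
	If( :height[i] >= 67,
		Insert Into( loop rows, i )
	)
);
Show( loop rows );  // should list the same rows that Loc() found
```

The Loc() version expresses the same selection in a single line and leaves the row-by-row work to JMP.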

We can do the same thing with a matrix variable instead of a data column reference:

m = :height << Get Values();

tall rows = Loc(m >= 67);  // [25, 27, 30, 37, 39, 40]

m[tall rows]; // [68, 69, 67, 68, 68, 70]

N Rows(m[tall rows]);  // 6


All great so far. Let's look at the same examples with a different height cutoff (70 instead of 67):


tall rows = Loc(:height << Get Values() >= 70);   // [40]

:height[tall rows];   // [70]

N Rows(:height[tall rows]);  // 1

m = :height << Get Values();

tall rows = Loc(m >= 70);   // [40]

m[tall rows];   // 70

N Rows(m[tall rows]);  // error

What went wrong? Notice that m[tall rows] returned 70 instead of [70]. In many cases, a 1x1 matrix and a number are treated the same way, but not always. The N Rows() function is one example where it makes a difference. The indexing is too aggressively "simplifying" the 1x1 matrix into a number.

One work-around is to call Is Matrix() before using N Rows(). Another is to concatenate an empty matrix to the result, which will create a matrix whether the source is a number or a matrix:


m = m |/ []
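Both work-arounds might look like the following sketch. The matrix values and the names result and n tall are made up for illustration; only Is Matrix(), N Rows(), Loc(), and the |/ operator come from the discussion above.

```jsl
m = [60, 70, 65];            // made-up 3x1 column vector of heights
tall rows = Loc( m >= 70 );  // only one element qualifies
result = m[tall rows];       // a number before JMP 13, a 1x1 matrix in JMP 13

// Work-around 1: check the type before calling N Rows()
n tall = If( Is Matrix( result ), N Rows( result ), 1 );

// Work-around 2: concatenate an empty matrix so the result is always a matrix
result = result |/ [];

Show( n tall, N Rows( result ) );
```

Either guard should keep working after the change, since a 1x1 matrix passes the Is Matrix() test and concatenating an empty matrix leaves it as a matrix.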

In JMP 13, the 1x1 matrix is maintained, so the result of the index operation will be a matrix whenever the index itself is a matrix. The same now applies to list indexing; data column indexing already behaved this way.
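To see how the list case behaves in your own installation, a small check along these lines may help. The list contents and the name picked are made up for illustration, and the output is not shown because it depends on the JMP version: going by the description above, JMP 13 should keep a one-item list where earlier versions simplified it to the bare item.

```jsl
names = {"KATIE", "LOUISE", "JANE"};  // made-up list for illustration
picked = names[{2}];                  // the index is a one-item list

// Is List() reports whether the result was kept as a list or simplified
Show( Is List( picked ), picked );
```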

We expect the disruption to be rare since the trigger is not common and a 1x1 matrix and a number are often treated the same anyway. Nonetheless, the JMP 13 debugger will have the ability to identify places in your scripts that might be affected by the change.


If you have any feedback or concerns, let us know here or contact danielvalente to see if you're eligible for the JMP 13 Early Adopter program to try out the change on your own code.


8 REPLIES
msharp

Super User

Joined:

Jul 28, 2015

This is a great step in the right direction.

txnelson

Super User

Joined:

Jun 22, 2012

I appreciate the heads up.......this is the right direction.....

Jim
David_Burnham

Super User

Joined:

Jul 13, 2011

I have a lot of code in place that has to handle this exception.  This is a definite improvement.  Thanks for the heads-up, I'll take a look at how it will impact the current pattern of code that I used to handle this situation.

-Dave
markbailey

Staff

Joined:

Jun 23, 2011

We have a case in our advanced scripting course where the result is the empty matrix, but what is actually returned is [](0,1). This form appears to be harmless (that is, it behaves like a simple empty matrix), but it has always required an explanation. What does the (0,1) represent, and how would it be used? If it does not represent anything or serve any purpose, then please consider removing it.

Craige_Hales

Staff

Joined:

Mar 21, 2013

It distinguishes between different empty matrices with a non-zero dimension; sometimes the shape is important:


x00 = [];

x10 = [](1,0);

x01 = [](0,1);

x11 = [](1,1);

show(x00,nrows(x00),ncols(x00));

show(x10,nrows(x10),ncols(x10));

show(x01,nrows(x01),ncols(x01));

show(x11,nrows(x11),ncols(x11));

x00 = [];

N Rows(x00) = 0;

N Cols(x00) = 0;

x10 = [](1, 0);

N Rows(x10) = 1;

N Cols(x10) = 0;

x01 = [](0, 1);

N Rows(x01) = 0;

N Cols(x01) = 1;

x11 = [1];

N Rows(x11) = 1;

N Cols(x11) = 1;

Craige
mark_anawis

Community Trekker

Joined:

Nov 18, 2014

I agree that the new handling of the single value matrix is an improvement, but I also second Mark Bailey's request for elimination of the confusing empty matrix output [](0,1). Replacing it with [] would be easier for the user to understand and deal with. I usually use logic such as the following to handle this situation:

dt=current data table();

rows_to_capture=dt<<get rows where(:mycolumn==number);

if(nrows(rows_to_capture)==0,

expr_when_no_rows,

expr_when_there_are_rows);
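A runnable version of the same pattern might look like this sketch, which assumes the Big Class sample table, where no student is 99 years old, so the empty branch is taken. The name matching rows is illustrative.

```jsl
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// No row has age 99, so this returns an empty matrix
matching rows = dt << Get Rows Where( :age == 99 );

If( N Rows( matching rows ) == 0,
	Print( "No matching rows" ),
	Show( :height[matching rows] )
);
```

Because the test uses N Rows() rather than a comparison with [], it does not matter whether the empty result is reported as [] or [](0,1).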

mark_anawis

Community Trekker

Joined:

Nov 18, 2014

Great improvement. Here is another suggestion for matrices: simplify the indexing when you are using a conditional. This is best explained through an example. Let's say I have a matrix and want to return the values greater than 5:

mymatrix = [2,4,6,8,10];

newmatrix=mymatrix[loc(mymatrix>5)];

Compare this to similar code in R:

mymatrix = matrix(cbind(2,4,6,8,10));

newmatrix=mymatrix[mymatrix>5];

OK, the first statement in R is longer and the second statement returns a vector rather than a matrix, but the second statement is shorter. It would be nice if JMP could implement the simpler syntax in the second statement in future versions.
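In the meantime, the Loc() call can be tucked into a small user-defined function. This is only a sketch: select where is a hypothetical name, not a built-in JSL function.

```jsl
// Hypothetical helper: return the elements of mat where mask is non-zero
select where = Function( {mat, mask},
	mat[Loc( mask )]
);

mymatrix = [2, 4, 6, 8, 10];
newmatrix = select where( mymatrix, mymatrix > 5 );
Show( newmatrix );
```

The comparison is still written out at the call site, but the indexing detail lives in one place.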

Phil_Brown

Super User

Joined:

Mar 20, 2012

I agree, it's good to have uniform indexing. I do like the fact that one can create empty matrices with shape.

PDB