uday_guntupalli
Level VIII

Alternative to Lag() when working with matrices instead of data tables

All, 
Is there an alternative to Lag() that I can use when working with matrices instead of data tables? Essentially, I would like to look at the value in the previous row of a column in a matrix to determine the value of the current row. This is easily achieved with the Lag() function in a data table. What is an effective way, other than looping, to achieve this when working with matrices?

 

dt = New Table("Test"); 

dt << New Column("Random-1",Numeric,Continuous,<< set values(Random Index(10^3,10^2))); 

dt << New Column("AltToLag",Numeric,Continuous,Formula(If(Row()==1,0,Lag(:Name("Random-1")))));
Best
Uday
vince_faller
Super User (Alumni)

Re: Alternative to Lag() when working with matrices instead of data tables

Does this work for you?

names default to here(1);
lag_matrix = function({mat, lag}, {DEFAULT LOCAL},
	n = nrows(mat);
	lag_vector = 1::n;
	empty_mat = J(abs(lag), ncols(mat), .);
	if(lag>0, 
		lag_vector = lag_vector[1::n-abs(lag)];
		new_mat = empty_mat|/mat[lag_vector, 0];
	, //else
		lag_vector = lag_vector[abs(lag)+1::n];
		new_mat = mat[lag_vector, 0]|/empty_mat;
	);
	new_mat;
);
lag_matrix((14::28)`||(28::42)`, -1);
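
To see what the two lag directions produce, the same padding idea can be written out inline on a small matrix. This is a hedged sketch rather than part of the original reply; the matrix `m` and the names `pad`, `lagged`, and `led` are made up for illustration.

```jsl
Names Default To Here( 1 );

// Hypothetical 4x2 example matrix (commas separate rows)
m = [1 11, 2 12, 3 13, 4 14];
n = N Rows( m );

// One row of missing values, as wide as m
pad = J( 1, N Cols( m ), . );

// lag = 1: shift rows down, so row i holds the original row i-1
lagged = pad |/ m[1 :: n - 1, 0];   // [. ., 1 11, 2 12, 3 13]

// lag = -1: shift rows up, so row i holds the original row i+1
led = m[2 :: n, 0] |/ pad;          // [2 12, 3 13, 4 14, . .]

Show( lagged, led );
```

The result keeps the same number of rows as the input, which is what lets it line up element by element with the original matrix.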

 

Vince Faller - Predictum
uday_guntupalli
Level VIII

Re: Alternative to Lag() when working with matrices instead of data tables

@vince_faller
Thank you for the solution you have offered. Is there a way to modify this function to get just the previous row rather than the entire matrix?

 

Maybe the more appropriate question is: what is the vectorized way of looping over a matrix with a function, similar to the apply functions in R, which are much faster than loops?

 

 

Best
Uday
gzmorgan0
Super User (Alumni)

Re: Alternative to Lag() when working with matrices instead of data tables

Uday,

 

I don't know what you are asking. You can just reference row j-1. The script shows both lag1 and dif1 results.

Names Default to Here(1);

dt = Open("$sample_data/Big Class.jmp");

bcmat = dt << get as matrix; //Matrix(40,3)

for(j=2, j<=nrow(dt), j++,
 lag1 = bcmat[j-1,0];
 dif1 = bcmat[j,0] - bcmat[j-1,0];
 show( lag1, dif1)
);
uday_guntupalli
Level VIII

Re: Alternative to Lag() when working with matrices instead of data tables

@gzmorgan0
What I am asking for is a way to avoid the loop and rely on vectorized code. It is advisable to replace loops with matrix functions to make code more efficient (https://community.jmp.com/t5/JMP-Blog/JSL-Tip-Replace-Loops-with-Functions-on-Matrices/ba-p/29783). This is achieved in R through the apply family of functions and in MATLAB through matrix functions or custom functions. I am asking whether there is a way to implement such vectorized code on matrices, rather than writing a loop, to get the equivalent of Lag().


In the example you have provided, I understand that it is easy to write a loop and get the Lag() equivalent. However, as the size of your data sets or the number of iterations increases, this will slow your code down. I am looking for an efficient replacement, i.e., a vectorized equivalent for implementing repetitive operations on matrices in JSL.
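
A loop-free version of the lag1 and dif1 calculation from the loop above can be sketched with row-range subscripts. This is only an illustrative sketch, not part of the original posts; it assumes the Big Class sample table is available, and here `lag1` and `dif1` are whole matrices rather than per-row values.

```jsl
Names Default To Here( 1 );

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
bcmat = dt << Get As Matrix;   // numeric columns only
n = N Rows( bcmat );

// Rows 1..n-1 are the "previous row" values for rows 2..n
lag1 = bcmat[1 :: n - 1, 0];

// Row-to-row differences for rows 2..n, computed in one matrix subtraction
dif1 = bcmat[2 :: n, 0] - lag1;

Show( N Rows( lag1 ), N Rows( dif1 ) );   // both have n - 1 rows
```

Both results have one row fewer than the input; prepending a row of missing values with `|/` would restore the original row count if alignment with the source matrix matters.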

 

 

 

Best
Uday
uday_guntupalli
Level VIII

Re: Alternative to Lag() when working with matrices instead of data tables

@XanGregg , @Craige_Hales 
Xan or Craige, if it is possible, I would like to request a blog post that covers in detail how to perform vectorization in JSL, with numerous examples. This topic is covered at length for SAS, R, MATLAB and other environments, and I would love to see something similar for JMP. If something similar already exists, please point me to it.

 SAS Blogs - https://blogs.sas.com/content/iml/2013/05/15/vectorize-computations.html 

 

 

Best
Uday
vince_faller
Super User (Alumni)

Re: Alternative to Lag() when working with matrices instead of data tables

What are you actually trying to do? You're right that matrices are usually faster, but looping through them will (in my experience) usually lose any benefit. JMP 14 now has apply() and applyOver() functions, but I haven't had a lot of luck using them.

Vince Faller - Predictum
uday_guntupalli
Level VIII

Re: Alternative to Lag() when working with matrices instead of data tables

@vince_faller
Can you help me implement the following in a matrix without a loop?

 

dt  = New Table("Test"); 

dt << New Column("RandomList",Numeric,Continuous,<< Set Values(Random Index(10^3,10^2)))
   << New Column("DesFunc",Numeric,Continuous,Formula(If(Row() == 1,0,If(Mod(Lag(:RandomList),2)==0,:RandomList[Row()]+1,:RandomList[Row()]-1))));

Also, I couldn't find the apply() or applyOver() functions in my Scripting Index. I am using JMP 14. Is it a JMP Pro thing?

 


Best
Uday
vince_faller
Super User (Alumni)

Re: Alternative to Lag() when working with matrices instead of data tables

It's not exactly super user friendly, but even with all of these steps the vector version is still ~3 times faster (436:135 us) on my computer than the column formula. There are definitely optimizations you could make to speed it up further. When I put it to 1000 rows it was ~4 times faster (1057:227 us). Also, apparently they got rid of Apply, my bad.

dt = New Table( "Test" ); 

dt << New Column( "RandomList", Numeric, Continuous, <<Set Values( Random Index( 10 ^ 3, 10 ^ 3 ) ) );
time_column = HPTime();
dt << New Column( "DesFunc",
	Numeric,
	Continuous,
	Formula(
		If( Row() == 1,
			0,
			If( Mod( Lag( :RandomList ), 2 ) == 0,
				:RandomList[Row()] + 1,
				:RandomList[Row()] - 1
			)
		)
	)
);
time_column = HPTime()-time_column;


lag_matrix = Function( {mat, lag},
	{DEFAULT LOCAL},
	n = N Rows( mat );
	lag_vector = 1 :: n;
	empty_mat = J( Abs( lag ), N Cols( mat ), . );
	If( lag > 0,
		lag_vector = lag_vector[1 :: n - Abs( lag )];
		new_mat = empty_mat |/ mat[lag_vector, 0];
	, //else
		lag_vector = lag_vector[Abs( lag ) + 1 :: n];
		new_mat = mat[lag_vector, 0] |/ empty_mat;
	);
	new_mat;
);


v = Column( dt, "RandomList" ) << Get Values;
check_vector = Column(dt, "DesFunc") << Get Values;

time_vector = HPTime();
//lag the vector
lv = lag_matrix(v, 1);
//mod the vector
mod_vector = lv / 2 - floor(lv/2);
//find the mods you want
loc_vector = loc(mod_vector==0);
//invert to find the mods you don't want
invert_vector = J(nrows(v), 1);
invert_vector[loc_vector] = 0;
invert_vector = loc(invert_vector);
//assign values
v[loc_vector] = v[loc_vector] + 1;
v[invert_vector] = v[invert_vector] - 1;
//set first row to 0
v[1] = 0;
time_vector = HPTime()-time_vector;

show(all(v == check_vector));
show(time_column, time_vector);

Hope this helps a little.  

 

 

Vince Faller - Predictum
uday_guntupalli
Level VIII

Re: Alternative to Lag() when working with matrices instead of data tables

@vince_faller
Thanks for the solution. Could you kindly explain the thought process and how it works, if it is not too much to ask?

I see that you are using a combination of the lag_matrix function, the mod calculation, and Loc(), but I don't quite get the logic.

I would greatly appreciate a breakdown of the logic so I can learn the thought process (again, only if it is not too much to ask).

 

 

Best
Uday
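
The logic asked about here can be traced step by step on a five-element vector. The sketch below is not from the thread's authors: it inlines the lag step instead of calling lag_matrix, and the input values and the names `frac`, `even_rows`, `flag`, and `odd_rows` are made up for illustration.

```jsl
Names Default To Here( 1 );

// Hypothetical input: a 5x1 column vector
v = [4, 7, 2, 9, 6];
n = N Rows( v );

// Step 1: lag by one row, padding the top with a missing value -> ., 4, 7, 2, 9
lv = J( 1, 1, . ) |/ v[1 :: n - 1, 0];

// Step 2: fractional part of lv/2 is 0 when the previous value is even, 0.5 when odd
frac = lv / 2 - Floor( lv / 2 );

// Step 3: Loc() returns the row numbers where the test is true (missing is skipped)
even_rows = Loc( frac == 0 );

// Step 4: every other row (start with all ones, zero out the even rows, locate the rest)
flag = J( n, 1, 1 );
flag[even_rows] = 0;
odd_rows = Loc( flag );

// Step 5: update both groups with subscripted assignment, then fix row 1
v[even_rows] = v[even_rows] + 1;
v[odd_rows] = v[odd_rows] - 1;
v[1] = 0;

Show( v );   // values should be 0, 8, 1, 10, 5
```

Each step works on the whole vector at once: the lag turns "previous row" into an aligned vector, the comparison produces a 0/1 mask, and Loc() converts that mask into row numbers for the subscripted assignments that replace the If() branches of the column formula.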