nikles
Level VI

How to remove a set of values from a matrix/vector?

Hi.  I have a large vector (i.e. 1-D matrix) of values, from which I'd like to remove a subset of the values, where the subset contains a non-sequential set of values from the original vector.  For example:

 

largevect = [1,2,3,4,5]

subset = [2,5]

desired result = [1,3,4]

 

I'm wondering if a simple command or set of commands exists that could do this in JSL?  I could do this with a For Loop, but wondering if there's a more elegant or faster method?

Accepted Solution
Craige_Hales
Super User

Re: How to remove a set of values from a matrix/vector?

Probably need more information to answer this well.

 

If you are doing set operations, see JSL Set Operations  for a way to do this with associative arrays:

 

// this would be a little more natural using lists
// or associative arrays, but it is easy to make an
// associative array set from a matrix:
largevect = [20, 30, 40, 50, 10];
subset = [20, 50];

aaBigSet = Associative Array( largevect );
aaSmallSet = Associative Array( subset );

// remove the small set from the big set
aaBigSet << Remove( aaSmallSet );

// getkeys returns a list, make a matrix
result = Matrix( aaBigSet << getkeys );

// the original order is lost, but the set is correct
show(result); // [10, 30, 40];
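
If the original order matters, a mask-based approach avoids the sorting that the associative array introduces. The following is a minimal sketch, not part of the original reply. It assumes both inputs are column vectors (comma-separated literals), and the names mask, keep, and ordered are introduced here only for illustration:

```jsl
largevect = [20, 30, 40, 50, 10];
subset = [20, 50];

// start with every element marked as kept (1 = keep)
mask = J( N Rows( largevect ), 1, 1 );

// loop over the (usually small) subset, not the large vector;
// each comparison is elementwise and zeroes out matching positions
For( i = 1, i <= N Rows( subset ), i++,
	mask = mask :* (largevect != subset[i])
);

// Loc returns the indexes of the nonzero mask entries
keep = Loc( mask );
ordered = largevect[keep];

Show( ordered ); // [30, 40, 10];
```

The loop runs once per value in subset, so the cost is one vectorized comparison per removed value. If every element ends up removed, keep is empty, and that case may need its own check.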

 

 

It is unclear in your example whether subset is a list of values or the indexes of values. If it is the indexes, and you could generate the set of indexes to keep instead of the set to remove, then it is as simple as:

 

largevect = [20, 30, 40, 50, 10];
keepers = [1,3,4];
desiredResult = largevect[ keepers ];
show(desiredResult); // [20, 40, 50];

 

 

Craige

nikles
Level VI

Re: How to remove a set of values from a matrix/vector?

Thanks Craige.

Your first assumption was correct: I wish to remove the literal values of 2 and 5 from the vector containing 1,2,3,4,5.  Another example to help clarify, I wish to remove this subset vector from largevect:

 

largevect = [12, 189, 3, 78, -2]

subset = [189, -2]

desired result = [12, 3, 78]

 

I've not used AA's much, but I like this as a solution.  Thanks for the help! 

tsl
Level III

Re: How to remove a set of values from a matrix/vector?

@Craige_Hales ,

do you have any thoughts on removing rows from a large matrix efficiently?

I have a situation where I have a data matrix that is > 1 million rows x 9 cols (1,604,736 x 9 in the example before me right now).

I need to remove some rows from it, actually > 78k rows. Here's what I'm doing right now:

	rr = Loc(DataM[0,4] == 1); 
	DataM[rr,0] = [];

So I'm looking for cases where the 4th column = 1 and I want all those rows gone.

The Loc() line is super fast, ( < 0.05 sec ) but the following line is super slow ( 83 seconds )!

I'm using the JMP documentation where I read:

 

Deleting rows and columns is accomplished by assigning an empty matrix to that row or column.
A[k, 0] = []; // deletes the kth row
A[0, k] = []; // deletes the kth column

 

Maybe I'm just stuck because my data matrix is large.

It's not clear to me whether I could use the Associative Array trick, since I'm dealing with a matrix and not a vector.

 

Tom

 

Craige_Hales
Super User

Re: How to remove a set of values from a matrix/vector?

Yes. Get the opposite Loc(), using != instead of ==, so that rr holds the rows to keep, then
DataM = DataM[rr,0];
It should be about 100x faster.
I suspect the way you are doing it recreates the array for each deleted row.
Craige
Jeff_Perkinson
Community Manager

Re: How to remove a set of values from a matrix/vector?

// Test matrix: 8 columns of random normals plus a 0/1 flag column (about 7% ones)
DataM = J( 1067000, 8, Random Normal() ) || J( 1067000, 1, Random Binomial( 1, 0.07 ) );
DataM1 = DataM; // copy for matrix method 1
DataM2 = DataM; // copy for matrix method 2

// Data table method: convert to a table, delete the flagged rows, convert back
start = Tick Seconds();
foo = As Table( DataM, <<invisible );
foo << select where( :col9 == 1 ) << delete rows;
data2 = foo << get as matrix;
data_table_method = Tick Seconds() - start;

// Matrix method 1: assign an empty matrix to the rows to delete
start = Tick Seconds();
rr = Loc( DataM1[0, 9] == 1 );
DataM1[rr, 0] = [];
matrix_method1 = Tick Seconds() - start;

// Matrix method 2: index the rows to keep
start = Tick Seconds();
rr = Loc( DataM2[0, 9] != 1 );
DataM2 = DataM2[rr, 0];
matrix_method2 = Tick Seconds() - start;

Show( data_table_method, matrix_method1, matrix_method2 );

Results:

data_table_method = 1.0333333333333;
matrix_method1 = 49.166666666667;
matrix_method2 = 0.066666666666606;

@Craige_Hales' method (matrix_method2) is the fastest.

 

I was curious whether using a data table would be fast and, while it's 15x slower than matrix_method2, it's still pretty fast for this modestly-sized problem. I was then curious to see how long it would take if your data is already in a data table (avoiding the conversion to/from the data table) and it gets even faster.

 

foo = As Table( DataM, <<invisible );
start = Tick Seconds();

foo << select where( :col9 == 1 ) << delete rows;
//data2 = foo << get as matrix;

data_table_method2 = Tick Seconds() - start;

Results:

data_table_method2 = 0.316666666666606;
-Jeff
Craige_Hales
Super User

Re: How to remove a set of values from a matrix/vector?

Thanks Jeff! 1000x is pretty good. I actually didn't realize a row could be assigned an empty matrix to delete it.

I already suggested to tech support that the doc needed to change or the slow method needed to be sped up.

 

Craige