cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
shampton82
Level VII

"Get rows where" is taking way to long

I have a reference table where each column is a name from a main data table column and the values in the rows are the part numbers that have bad data for that column in the main table.  So for example, in the main table Column 1 is the part number and column 2 is "Temp".  Lets say that the row number equals part number so row 1 is part "1" and let say it has a value of 65 in the "Temp" column.  Part 2 has a temp of -100 and thus is bad data.  So in the reference table under the Column named "Temp" the value of row 1 is "2".  This is repeated for all the part numbers that have bad data in the main table and then is repeated for each column as well.

 

What this gives me is a bad data reference table I can then loop through to automatically remove the bad data each time I pull in the newest data for the main table.  I have the code that works however the line:

(where vlist contains all the part numbers from the reference table and dt2 is the main data table)

try(rowindex=dt2 << Get Rows Where(contains(vlist,:WORKORDERNO )));

is super slow.  It takes 22 seconds for it to complete.  As I have 196 columns to go through this takes way to long.  Any suggestions on how to speed this up would be appreciated!

 

 

The full code is here:

names default to here(1);

dt=open("J:\Public\Ti Process Control\Dashboard Files\JMP data for CC's\Limits\master_data_table_clean_up.jmp", invisible(1)); //this is the reference table
dt<<begin data update;


cols=dt<<get column names();

dt2=data table("Master data pull daily download");  //this is the main table
dt2<<begin data update;
dt2 << Clear Select();

cols2=dt2<<get column names();

vlist={};
for(i=1,i<=2 /*N items(cols)*/,i++,
		try(vlist=column(dt,cols[i])<<get values;);
				starttime = tick seconds();
				try(rowindex=dt2 << Get Rows Where(contains(vlist,:WORKORDERNO )));
				endtime= tick seconds();
				print( "run time",eval insert("Analysis took ^round(endtime - starttime)^ seconds"));
				try(column(dt2,cols[i]) [rowindex] = .);
);

dt2<<end data update;

dt<<end data update;

close(dt,nosave);
11 REPLIES 11
txnelson
Super User

Re: "Get rows where" is taking way to long

I suspect all of the blanks will cause your code to slow down.  Here is a work around for that.  Just replace your creation of the vlist with

Try( vlist = Associative Array( As Column( cols[i] ) ) << get keys );
If( vlist[1] == "",
	Remove From( vlist, 1, 1 )
);
Jim
shampton82
Level VII

Re: "Get rows where" is taking way to long

This worked great, thanks!

 

Thanks to @jthi as well!  Didn't dig into his code he posted above as the copy and paste of txnelson code worked perfectly but I appreciate him taking the time to make it.