cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
The Discovery Summit 2025 Call for Content is open! Submit an abstract today to present at our premier analytics conference.
See how to use to use Text Explorer to glean valuable information from text data at April 25 webinar.
Choose Language Hide Translation Bar
View Original Published Thread

Faster iteration

Jackie_
Level VI

Hi,


Trying to optimize the iterations.

 

I have a reference table where each column is a name from a main data table column. I want to iterate through each row of the reference data table and then add the values from Column(dt_main, "Values") to New A and New B Columns in the reference table if Column(dt_ref, "Nrows")>2 & Comment[i] == "A > B".

 

I am using an associative array because I am interested only in the last recorded values of each type A and B, mapped to the corresponding X/Y/Wafer ID.

 

The below script takes a very long time. Is there a way to optimize?

 

Names Default To Here( 1 );
start = Tick Seconds();
dt_main = Data Table( "maintable" );
dt_ref = Data Table( "referencetable" );


Try( dt_ref << delete columns( :New N, :New R ) );
dt_ref << New Column( "New A", "Numeric" );
dt_ref << New Column( "New B", "Numeric" );
nrows = [];

For( nn = 1, nn <= N Rows( dt_ref ), nn++, 
	
	Insert Into( nrows, nn )
);

dt_ref << begin data update;
For Each( {r, i}, nrows, 
	
	
	If( dt_ref:N Rows[i] > 2 & dt_ref:Comment[i] == "A > B", 
/// Create AA to get the last recorded values for each A and B to the corresponding X/Y/Wafer ID from dt_main
		countsessions = Associative Array(
			dt_main:Type[dt_main << get rows where( dt_main:X Coord == dt_ref:X Coord[i] & dt_main:Y Coord == dt_ref:Y Coord[i] & dt_main:Wafer ID == dt_ref:Wafer ID[i] )], 
			
			dt_main:Value[dt_main << get rows where( dt_main:X Coord == dt_ref:X Coord[i] & dt_main:Y Coord == dt_ref:Y Coord[i] & dt_main:Wafer ID == dt_ref:Wafer ID[i] )];
			
		);
		
		Try( dt_ref:New A[i] = countsessions["A"] );
		Try( dt_ref:New B[i] = countsessions["B"] );
		
	)
);

dt_ref << end data update;
end = Tick Seconds();

Show( Char( end - start ) );

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User


Re: Faster iteration

I took a different route, using the built in JMP Select, Join, Split and Update platforms.  It works very fast, and I appear to get the same results, except for the A>B rows that in your reference table have missing values....but the values I come up with match the N(A) and N(b) columns.  So I may be completely off base.

Names Default To Here( 1 );
start = Tick Seconds();
dt_main = Data Table( "maintable" );
dt_ref = Data Table( "referencetable" );


Try( dt_ref << delete columns( :New A, :New B ) );

// Create a subset of the maintable that contains only data for the rows in
// the reference table that have comment value of A>B
// Select only A>B rows in reference table
dt_ref << select where(:Comment == "A > B");
dtAB = dt_ref<<subset( selected rows(1), columns(:X Coord,:Y Coord,:Wafer ID));

// Use Join to create a new main table with only matching wafer, x, y
dtMainJoin = dt_main << Join(
	With( dtAB ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns(
		:X Coord = :X Coord, :Y Coord = :Y Coord, :Wafer ID = :Wafer ID
	),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( "Join of maintable with Subset of referencetable 2" )
);

// Clean up by deleting dtAB
close( dtAB, nosave );

// Get rid of duplicate data, saving only the last values.
// Use Select Duplicate Rows() to find the duplicates, but since it retains the 1st
// row found, reverse the data table and then apply Select Duplicate Rows()
dtMainJoin << New Column("RowNum", set each value(Row()));

dtMainJoin = dtMainJoin << sort(by(:RowNum), Order(descending), Replace table(1));
found = dtMainJoin << Select Duplicate Rows(match(:Type, :X Coord, :Y Coord, :Wafer ID));

If(found>0, dtMainJoin << delete rows);
dtMainJoin = dtMainJoin << sort(by(:RowNum), Order(ascending), Replace table(1));
dtMainJoin << delete columns(:RowNum);

// Split the table to create the New A and New B Columns
dtSplit = dtMainJoin <<
Split(
	Split By( :Type ),
	Split( :Value ),
	Group( :X Coord, :Y Coord, :Wafer ID ),
	Output Table(
		"Split of Join of maintable with Subset of referencetable 2 by Type"
	),
	Sort by Column Property
);

// Clean up dtMainJoin table by closing it
close(dtMainJoin, nosave);

// Rename the A and B columns
dtSplit:A << set name("New A");
dtSplit:B << set name("New B");

// Use Update to merge the dtSplit data into the reference table
dt_ref << Update(
	With(
		Data Table(
			dtSplit
		)
	),
	Match Columns( :X Coord = :X Coord, :Y Coord = :Y Coord, :Wafer ID = :Wafer ID )
);

// Clean up
close( dtSplit, nosave );

dt_ref << end data update;
end = Tick Seconds();

Show( Char( end - start ) );
Jim

View solution in original post

1 REPLY 1
txnelson
Super User


Re: Faster iteration

I took a different route, using the built in JMP Select, Join, Split and Update platforms.  It works very fast, and I appear to get the same results, except for the A>B rows that in your reference table have missing values....but the values I come up with match the N(A) and N(b) columns.  So I may be completely off base.

Names Default To Here( 1 );
start = Tick Seconds();
dt_main = Data Table( "maintable" );
dt_ref = Data Table( "referencetable" );


Try( dt_ref << delete columns( :New A, :New B ) );

// Create a subset of the maintable that contains only data for the rows in
// the reference table that have comment value of A>B
// Select only A>B rows in reference table
dt_ref << select where(:Comment == "A > B");
dtAB = dt_ref<<subset( selected rows(1), columns(:X Coord,:Y Coord,:Wafer ID));

// Use Join to create a new main table with only matching wafer, x, y
dtMainJoin = dt_main << Join(
	With( dtAB ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns(
		:X Coord = :X Coord, :Y Coord = :Y Coord, :Wafer ID = :Wafer ID
	),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( "Join of maintable with Subset of referencetable 2" )
);

// Clean up by deleting dtAB
close( dtAB, nosave );

// Get rid of duplicate data, saving only the last values.
// Use Select Duplicate Rows() to find the duplicates, but since it retains the 1st
// row found, reverse the data table and then apply Select Duplicate Rows()
dtMainJoin << New Column("RowNum", set each value(Row()));

dtMainJoin = dtMainJoin << sort(by(:RowNum), Order(descending), Replace table(1));
found = dtMainJoin << Select Duplicate Rows(match(:Type, :X Coord, :Y Coord, :Wafer ID));

If(found>0, dtMainJoin << delete rows);
dtMainJoin = dtMainJoin << sort(by(:RowNum), Order(ascending), Replace table(1));
dtMainJoin << delete columns(:RowNum);

// Split the table to create the New A and New B Columns
dtSplit = dtMainJoin <<
Split(
	Split By( :Type ),
	Split( :Value ),
	Group( :X Coord, :Y Coord, :Wafer ID ),
	Output Table(
		"Split of Join of maintable with Subset of referencetable 2 by Type"
	),
	Sort by Column Property
);

// Clean up dtMainJoin table by closing it
close(dtMainJoin, nosave);

// Rename the A and B columns
dtSplit:A << set name("New A");
dtSplit:B << set name("New B");

// Use Update to merge the dtSplit data into the reference table
dt_ref << Update(
	With(
		Data Table(
			dtSplit
		)
	),
	Match Columns( :X Coord = :X Coord, :Y Coord = :Y Coord, :Wafer ID = :Wafer ID )
);

// Clean up
close( dtSplit, nosave );

dt_ref << end data update;
end = Tick Seconds();

Show( Char( end - start ) );
Jim