I have 2 data tables that contain numerical columns of the same name.
I want run a script that will compare the column names across the tables and a unique identifier in a row. if they match then it should run a simple calculation ( % difference between the 2 numbers) and return another table with the results.
I have tried to use the compare tables but this doesn't allow for the calculations I need.
Has anyone done anything similar that I could use?
I have attached 2 sample files just in case anyone feels that they could help me.
not sure how often you have to do this work.
I would suggest just join these two table and using formula to do a minus calculation of each row of these 2 columns.
assuming your data is in fixed sequence of rows.
I have to do this quite often and I sometimes have to look at ~200 columns so some way to automate, even semi-automation, this procedure is preferred. If I join the two tables together the column names are then not the same which will probably make any automation more difficult.
I hope this works for you. I got if from jmp8 scripting guide so ...
Compare two data tables
Here is a script to open two data tables, compare them, and report all results to the log window. This
script makes heavy use of the For operator to loop operations over all the rows (loops with i), nested
with another For operator to loop operations over all the columns (loops with j).
/* COMPARE THE CONTENTS OF TWO DATA TABLES */
// First identify the two data tables to be compared.
// Since you don't specify arguments, Open() presents the
// host-system dialog box for opening a file.
caption("Please locate the first data table.",spoken(1));
caption("Please locate the second data table.");
// Check for differing sizes.
write("Differing number of rows. "),
write("Same number of rows. "));
if(ncol(dt1)!=ncol(dt2),write("Differing number of columns. "),
write("Same number of columns. "));
nr = min(nrow(dt1),nrow(dt2));
nc = min(ncol(dt1),ncol(dt2));
// Check for differing column names.
current data table(dt1);n1=char(ColumnName(j));
current data table(dt2);n2=char(ColumnName(j));
write("\!rDiffering column names in column "||char(j));
// check for differing values
v1 = Column(dt1,j);
v2 = Column(dt2,j);
write("\!rDiffering values at row "||char(i)||
" in column "||char(j));show(v1,v2);e++)));
write("Matching cells have the same values."),
write("\!rNumber of cells that exist in both tables ");
write("and have differing values: "||char(e)));
// check for differing row states
current data table(dt1);r1 = char(rowstate(i));
current data table(dt2);r2 = char(rowstate(i));
write("\!r\!rDiffering row states at row "||char(i));
//Observe how Current Data Table is used to switch between two open windows.
There are many ways to automate this procedure using jsl. Here is one attempt that should work for the two tables posted here. The names of tables and the ID column are hard-coded here but a dialog that lets the user choose tables could easily be implemented. The code should work for any number or naming of substances columns (however the number of substances columns and the order of these must be identical in the two tables).
// Define table variables, dt1, the table with least number of rows used as reference
dt1 = Data Table( "List Report -1.jmp" );
dt2 = Data Table( "List Report -2 rows.jmp" );
// Define id columns
idcol1 = Column( dt1, 1 );
idcol2 = Column( dt2, 1 );
//Get matching rows
m = J( N Row( dt1 ), 2, . );
For( i = 1, i <= N Row( dt1 ), i++,
mi = dt2 << get rows where( idcol2 == idcol1[i] );
If( N Row( mi ) > 0,
m[i, 2] = mi
m[i, 1] = i;
//Make matrix with differences for between matching rows for each substance (edit this formula to get e.g. %)
diff = (dt1 << get asmatrix)[m[0, 1], 0] - (dt2 << get asmatrix)[m[0, 2], 0];
// Put matrix into a new table
dt_diff = New Table( "Difference of matched", New Column( "Matched Substance Identifier", character, values( idcol1[m[0, 1]] ) ) );
dt_diff << set matrix( diff );
//Set column names
colnames = dt1 << get column names( string );
For( i = 2, i <= N Items( colnames ), i++,
Column( dt_diff, i ) << set name( colnames[i] )