cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
cdwhite
Level II

Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Scenario: need to append daily survey data files (that include all previous days) to a main table with only non-duplicative respondent ID records and without creating a new table (because I do not want to lose selected rows in the main table that are hidden or exluded

Two attempts:

Concatenate mainains the main table but brings over all records

Join brings only the new records but creates a new table so row characteristics (hide, exlcude) are lost

1 ACCEPTED SOLUTION

Accepted Solutions
cdwhite
Level II

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Thank you.  Added new column designated as row state data type to original data table and then used the copy to and from features in the row states menu (by clicking on row state column in column list box on left, not actual column header) to copy over and then re-apply the row states after joining.  The script for the row append join to bring only the new records from a survey data set (and ignore all previously appended cases) into the main data table:

Data Table( "Main survey data set" ) << Join(

  With( Data Table( "Secondary survey data set" ) ),

  Merge Same Name Columns,

  Copy second table formula( 0 ),

  Suppress main table formula evaluation( 0 ),

  By Matching Columns( :ResponseID = :ResponseID ),

  Drop multiples( 0, 0 ),

  Name( "Include non-matches" )(0, 1),

  Preserve main table order( 1 )

)

After joining the two files, copied/restored the row states from the row state column using the row state menu.  Worked perfectly.  Thank you!

View solution in original post

10 REPLIES 10
KarenC
Super User (Alumni)

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

I find joining and concatenating tables a bit of an art (i.e., trial and error until I get what I really want).  However, I would recommend that you look up the column type "row states" as you can save all of your current row states (i.e., your hidden and excluded row states) in a column that you can reapply as needed to your table.

cdwhite
Level II

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Thank you.  Added new column designated as row state data type to original data table and then used the copy to and from features in the row states menu (by clicking on row state column in column list box on left, not actual column header) to copy over and then re-apply the row states after joining.  The script for the row append join to bring only the new records from a survey data set (and ignore all previously appended cases) into the main data table:

Data Table( "Main survey data set" ) << Join(

  With( Data Table( "Secondary survey data set" ) ),

  Merge Same Name Columns,

  Copy second table formula( 0 ),

  Suppress main table formula evaluation( 0 ),

  By Matching Columns( :ResponseID = :ResponseID ),

  Drop multiples( 0, 0 ),

  Name( "Include non-matches" )(0, 1),

  Preserve main table order( 1 )

)

After joining the two files, copied/restored the row states from the row state column using the row state menu.  Worked perfectly.  Thank you!

pmroz
Super User

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

You can do set operations using associative arrays.  Couldn't find a slick way to append one table to another.  Had to plow through the data using JSL loops.

dt_original = New Table( "Original", Add Rows( 4 ),

    New Column( "Column ID", Numeric, Continuous, Format( "Best", 12 ),

           Set Values( [1, 2, 3, 4] )

     ),

    New Column( "Result", Character, Nominal, Set Values( {"a", "b", "c", "d"} ) )

);

dt_new = New Table( "NewData", Add Rows( 10 ),

    New Column( "Column ID", Numeric, Continuous, Format( "Best", 12 ),

           Set Values( [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] )

     ),

    New Column( "Result", Character, Nominal,

           Set Values( {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"} )

     )

);

// Determine what rows are in the new table that aren't in the old one

a_original = associative array(column(dt_original, "Column ID"));

a_new      = associative array(column(dt_new, "Column ID"));

a_new << remove(a_original);

new_rows = a_new << get keys;

// There has to be a faster way to append one table to another in place

ni = nrows(dt_original);

nr = nitems(new_rows);

dt_original << add rows(nr);

for (i = 1, i <= ncols(dt_original), i++,

    one_col = column(dt_original, i) << get name;

    for (k = 1, k <= nitems(new_rows), k++,

         m = new_rows[k];

         n = k + ni;

        column(dt_original, one_col)[n] = column(dt_new, one_col)[m];

    );

);

David_Burnham
Super User (Alumni)

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

When performing a join you can avoid creating a new table by using the  option "update the main table with data from second table". 

-Dave
pmroz
Super User

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

I tried using the update main table option but it still creates a new table.

Data Table( "Original" ) << Join(

     With( Data Table( "NewData" ) ),

     Update,

     By Matching Columns( :Column ID = :Column ID ),

     Drop multiples( 1, 1 ),

     Name( "Include non-matches" )(0, 1),

     Preserve main table order( 1 )

);

8791_Join-Update.png

robot
Level VI

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Hi cdwhite,

I wrote a script a while ago that may help.  The script below will delete duplicate rows while preserving the original data table, including formulas and row states.  Using the Concatenate function followed by this script may solve your problem.

Names Default To Here( 1 );

// Create data table and new data.

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dt << Select Randomly( 10 );

dt2 = dt << Subset( (Selected Rows), Output Table Name( "New Data" ) );

dt << Clear Select;

dt2 << Add Rows( {:name = "PETER", :age = 14, :sex = "M", :height = 61, :weight = 124} );

// Randomly set row states in orginial data.

dt << Select Randomly( 10 ) << Hide << Exclude << Clear Select;

// Add new data.

dt << Concatenate( dt2, Append to First Table );

Close( dt2, No Save );

// Delete duplicate rows.

dt = Current Data Table();

col_min = Parse( "Col Min( Row(), :Name(\!"" || Concat Items( dt << get column names( string ), "\!"), :Name(\!"" ) || "\!"))" );  // Create a JSL expression to determine the minimum row number by all other columns.

duplicates = [];  // Create matrix to store duplicate rows.

For Each Row( If( col_min != Row(), duplicates |/= Row() ) );  // Compare each row to "col_min" expression above.  If the current row does not match the minimum row by all other columns, that row will be added to the "duplicates" matrix.

dt << delete rows( duplicates ); // Delete duplicates.

nn = N Rows( duplicates );

New Window( "Delete Duplicate Rows", <<Modal, Text Box( "Done." ), Text Box( Concat( "Deleted ", Char( nn ), " rows." ) ), Button Box( "OK" ) );

Martin
Level V

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Hello Robot, could you explain what these two lines do?

Thx

1. col_min = Parse( "Col Min( Row(), :Name(\!"" || Concat Items( dt << get column names( string ), "\!"), :Name(\!"" ) || "\!"))" );

2. For Each Row( If( col_min != Row(), duplicates |/= Row() ) );

Martin
Level V

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

So I think I have figured it out.  Line 1 produced a string "Col Min( Row(), :name, :age, :sex, :height, :weight )" that will be used in (2) to determin the row number of a previous row where all columns match.  And, 2 will add the row number of the duplicate row to an array "duplicates".

This is quite cleaver.  I wonder what happens when there are triplicates.  Will it always keep the first instance and delete the second and third?  I will need to try it.

Thank you robot!

robot
Level VI

Re: Appending non-duplicate (single column ID) rows to a main table without creating a new table?

Hi Martin,

You are correct.  Those lines do exactly what you deduced.  I added some comments to help make it more clear to others.

I am glad you find it useful!