cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Join, Concat and Update

Every day we have new data that need to add/update to the existing data table  I need a simple operation that will be able to concatenate new row(s) and/or new column(s) to the existing table and also update the existing table if the new table show a change in the data value by matching columns criteria.

10 Comments
hogi
Level XI

Yes, would be nice: Update, concatenate and join with a single click
I wondered why the interest for such a function is not larger.
https://community.jmp.com/t5/JMP-Wish-List/universal-quot-update-quot-button-for-imported-data/idi-p...

Like "reread" in Cornerstone:
whatever is changed in the source data table, will be added or updated to the jmp data table.

What's possibe in jmp:
Open the new data table + "Table/Update" --> great for update and join - if no new rows were added.


For "concatenate", Martin wrote in his blog:
https://community.jmp.com/t5/Martin-Demel-s-Blog/Updating-a-JMP-data-table-from-a-spreadsheet-made-e...

the idea to read just the last N rows and concatenate with the existing table, which gives you access to the green and orange region.

A single additional button in the Update menu could do the universal job:"append additional rows"
... or for the case of "match columns" enabled: "append non-matched rows"

- if there are more rows in the update table than in the jmp table, they are added at the end of the table

- for "match tables": if there are non-matching rows in the update table, they are added at the end of the table

and if "add columns" is enabled in addition, data from the specified columns in the rectangle on the lower right will be added as well  

HolgerSp_1-1665750511825.png

 

 

 HolgerSp_0-1665749603327.png

 

 

hogi
Level XI

Could be that there are some problems with the "append additional rows" extension.

Nevertheless, the menu can be opened with the "append additional rows" option disabled - then there can't be any harm.

By the way: additional wish
Is it possible to start the menu with "add columns" and "replace columns" set to None as well?
It's rare that you have to use both options at the same time - and it's ultra-dangerous, if you focus on what you want to get (e.g. add columns) and miss the fact that the other option is on by default (replace all columns with matching column names).
It's funny how fast you learn that the first thing that you have to do in the update menu is:
disable "replace columns"

But every user has to learn this on his own -sooner or later.

hogi
Level XI

With the additional cases 
- rows in the update table were deleted

- the order of rows was changed

 

... a short manual procedure that can handle all "updates" except the yellow "join" area:

- delete all rows of the table

- run the scource script *)

- concatenate the original table and the newly loaded table.

This gives you the blue, green, red and orange region.

 

*) source script should include all manual changes to the table.

hogi_0-1674022082627.png

 

 

 

 

 

Status changed to: Acknowledged

@vhuac - Thank you for your suggestion! We have captured your request and will take it under consideration.

SamGardner
Staff
Status changed to: Needs Info

@vhuac @hogi thanks for the suggestions.  It would be helpful to have an actual data example that shows the challenges you face with doing this kind of data table work in JMP.  

 

I will say, based on my experience, that this type data table work is typically completed in several steps, regardless of the tool used to do the work, as each part is a separate type of operation (update data, add columns, add rows).  But having an example so we can see the pain would allow us to better understand and to think about ways we could make this work easier. 

hogi
Level XI

Hi @SamGardner ,

 

Let's say you started with a draft version of STIPS/Wafer Stacked Small.jmp- just lot 1  & 3 - let's call it firstDataSet:

 

 

dt = Open( "$SAMPLE_STIPS/Wafer Stacked Small.jmp" );
firstDataSet = dt << Select Where(
	:Lot == "1" | :Lot == "3"
) << Subset( Selected Rows( 1 ), columns( :Lot, :Wafer, :X_Die, :Y_Die, :Defects ) );
firstDataSet << New Script( "Script1", Print( 1 ) );
firstDataSet << New Script( "Script2", Print( 2 ) );
firstDataSet << New Script( "Script3", Print( 3 ) );
firstDataSet << New Column( "radius",Formula( Root( :X_Die ^ 2 + :Y_Die ^ 2 ) ))

2 days later, new lots were measured (#3 & #4) and new columns were added (Lot_Wafer Label).

Now it would be great to get all the new data into the existing firstDataSet  keeping the JSL Scrips #1 - #3 and the radius column.

hogi_0-1673388131179.png

 

The idea: use the existing Update function, use Lot, wafer x_Die and Y_Die as Match columns

--> blue area gets updated (just imagine that there are some updates), and the green area (new Lot_Wafer Label column) gets added.

Now, the wish:
There are some non-used rows left in the input data table (green, orange block). Such rows could be added as well  - if "append non-matched rows" is enabled.

 

Without "append non-matched rows" in the update function (as it is now), the user has to first split the new input data into a part that can be used for the update and a second part which can be "concatenated".
The new update function could follow the same idea:

View more...
firstDataSet   << New Column( "alreadyExisting",Set Each Value( 1 ));

// Update data tables
dt << Update(
	With( firstDataSet ),
	Match Columns( :Lot = :Lot, :Wafer = :Wafer, :X_Die = :X_Die, :Y_Die = :Y_Die ),
	Add Columns from Update Table( :alreadyExisting )
);

firstDataSet << Delete Columns( :alreadyExisting );

// split new input
dt << Select where( :alreadyExisting == 1 ) <<
Subset( Output Table( "existing" ), Selected Rows( 1 ), Selected columns only( 0 ) );
Data Table( "existing" ) << Delete Columns( :alreadyExisting );


dt << Invert Row Selection <<
Subset( Output Table( "newData" ), Selected Rows( 1 ), Selected columns only( 0 ) );
Data Table( "newData" ) << Delete Columns( :alreadyExisting );


// Update data tables
firstDataSet << Update(	With( Data Table( "existing" ) ),Match Columns( :Lot = :Lot, :Wafer = :Wafer, :X_Die = :X_Die, :Y_Die = :Y_Die ));

firstDataSet <<Concatenate( 	Data Table( "newData" ),Append to first table);

 

hogi
Level XI

@SamGardner : is this info OK?

hogi
Level XI

I just noticed that the option Merge same name columns of the Join Platform does more or less everything that's needed:

https://community.jmp.com/t5/Martin-Demel-s-Blog/Updating-a-JMP-data-table-from-a-spreadsheet-made-e... 

 

There is one drawback:

The Join platform always generate a new output data table, no possibility to update the main table with data from second table
  (nevertheless, there is an option Update main table with data from second table, but it generates a new data table with merged data from main and second data table)

hogi
Level XI

how about 

1) load the new data

2) remove all rows from the main table

3) append the new data via concatenate

 

dt = Current Data Table();
dtnew = dt << Run Script( "Source" );
dt << Select All Rows << Delete Rows;
dt << Concatenate( dtnew, Append to first table );
Close( dtnew, NoSave );

 

guillaumebugnon
Level V

Hello,

Agree, such functions would be very interesting for data manipulation.

 

It would be nice to have at least that I think, it will be a great time saver.

https://community.jmp.com/t5/JMP-Wish-List/Tables-gt-Update-improvment/idi-p/713783

 

Regards,

Guillaume