Share your ideas for the JMP Scripting Unsession at Discovery Summit by September 17th. We hope to see you there!
Choose Language Hide Translation Bar
Highlighted
michaelclarage
Level II

Merge data by time stamp

I need to merge two different data sets. The first, A, has data collected every 2 minutes. The second, B, has data collected only when a measurement crossed a threshhold. 

 

A-sample

2017-01-21:00:02:00, 1, 2, 3, 4

2017-01-21:00:04:00, 2, 3, 4, 5

2017-01-21:00:06:00, 3, 4, 5, 6

...

B-sample

2017-01-21:00:02:05, 4500

2017-01-21:00:05:14, 5700

...

It will be sufficient to put B data into the nearest A row. For example, B-data at 2017-01-21:00:05:14 could be placed into A-row=2017-01-21:00:06:00

 

I have about 170,000 rows in A; and about 4,000 rows in B.

 

Thanks

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
txnelson
Super User

Re: Merge data by time stamp

There were a couple of items that did not work with the latest version(s) of JMP in the script that was referenced.  I reworked them, and removed the requirement to run this as a function call.  So all you have to do is to open the 2 data tables and run the script below.  I ran it against your data and it worked well.  I am not sure the results are exactly what you want, but it will give you a data table you can work from.

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
             dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
             win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
                                        				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
                          		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append(
				col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
			);
		),   //see above
                          		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
                                        				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box(
			col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
		),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
                    	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
                          		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
	
       
      dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
      list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
       
      dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula( Loc Min( Abs( Column( rt1 )[Row()] - list_big ) ) )
);   //simple, but fast.  Very fast.
       
      dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
      dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
      list_columns = dt_join << get column names( string );
       
      For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
      dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
      Return( dt_join );
Jim

View solution in original post

8 REPLIES 8
Highlighted
vince_faller
Super User

Re: Merge data by time stamp

https://community.jmp.com/t5/Discussions/Join-Question-timestamp-range-kinda-sorta/m-p/20828

 

This will do it or at least get you on the right track.  And it's stupidly fast. 

Vince Faller - Predictum
Highlighted
michaelclarage
Level II

Re: Merge data by time stamp

thanks for the quick response.

I put a "datetime-in-seconds" column in each table, so all data points have a timestamp.

I run the script, it asks me to choose a column in each table, which I do by selecting the "datetime-in-seconds" column in each table. 

The end result is just an additional column in dt2 called "Loc Min" which is simply ascending integers 1,2,3,.. 

Is the function failing, or am I missing something on how to use it?

Highlighted
michaelclarage
Level II

Re: Merge data by time stamp

attaching subsets of the files I am using

Highlighted
txnelson
Super User

Re: Merge data by time stamp

There were a couple of items that did not work with the latest version(s) of JMP in the script that was referenced.  I reworked them, and removed the requirement to run this as a function call.  So all you have to do is to open the 2 data tables and run the script below.  I ran it against your data and it worked well.  I am not sure the results are exactly what you want, but it will give you a data table you can work from.

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
             dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
             win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
                                        				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
                          		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append(
				col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
			);
		),   //see above
                          		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
                                        				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box(
			col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
		),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
                    	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
                          		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
	
       
      dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
      list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
       
      dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula( Loc Min( Abs( Column( rt1 )[Row()] - list_big ) ) )
);   //simple, but fast.  Very fast.
       
      dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
      dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
      list_columns = dt_join << get column names( string );
       
      For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
      dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
      Return( dt_join );
Jim

View solution in original post

Highlighted
michaelclarage
Level II

Re: Merge data by time stamp

Jim,

Thanks so much. I think we are all set now.

Highlighted
gzhou
Level I

Re: Merge data by time stamp

How could you modify this JMP script to allow it join two data tables by matching two different variable? 

 

For example: In the data set below, I would like to match by Experiment # and then match by the local min of the timestamp to join values of data table 2 onto data table 1 such that values of data table 2 is in one row of the nearest timestamp in data table 1 with the correct experiment #.

A-sample

Experiment 1, 2017-01-21:00:02:00, 1, 2, 3, 4

Experiment 1, 2017-01-21:00:04:00, 2, 3, 4, 5

Experiment 2, 2017-01-21:00:06:00, 3, 4, 5, 6

...

B-sample

Experiment 1, 2017-01-21:00:02:05, 4500

Experiment 2, 2017-01-21:00:05:14, 5700

 

Any assistance would be helpful!

 

Thanks

Highlighted
txnelson
Super User

Re: Merge data by time stamp

Here is a modification of the script that I believe will give you what you want.  Now, it hasn't been fully tested, since you did not provide sample data tables, but I believe it should work, if the variable you are placing your experiment number into is called "Experiment".

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append( col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected ) );
		),   //see above
		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box( col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected ) ),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
//dt1=data table("AMPERE-concat-cleaned-subset (2)");
//dt2=data table("USGA data 2010 subset (2)");

       
dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
//list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
//rt2 = column( dt2, "date-time-in-seconds");
dt3 = dt2 << subset( selected rows( 0 ), columns( {"Experiment", "loc min", rt2} ) ); 

// Find out the number of experiments
Current Data Table( dt3 );
Summarize( dt3, experiments = by( :Experiment ) );
If( Column( dt3, "Experiment" ) << get data type == "Numeric",
	For( i = 1, i <= N Items( experiments ), i++,
		experiments[i] = Num( experiments[i] )
	)
);	

For( i = 1, i <= N Items( experiments ), i++,
	dt3 << New Column( "__exp__" || Char( experiments[i] ), set each value( . ) );
	theRows = dt3 << get rows where( :Experiment == experiments[i] );
	For( k = 1, k <= N Rows( theRows ), k++,
		Column( dt3, N Cols( dt3 ) )[theRows[k]] = rt2[theRows[k]]
	);
);

dt3 << delete columns( 1, 2, 3 );
list_big = dt3 << get as matrix;

Close( dt3, nosave );
     
dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula(
		whichExp = Loc( experiments, :Experiment );
		Loc Min( Abs( Column( rt1 )[Row()] - list_big[0, whichExp] ) );
	)
);   //simple, but fast.  Very fast.
       
dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
list_columns = dt_join << get column names( string );
       
For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
Return( dt_join );
Jim
Highlighted
gzhou
Level I

Re: Merge data by time stamp

Thank you so much txnelson! I will test it out.

-gzhou
Article Labels

    There are no labels assigned to this post.