cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
michaelclarage
Level II

Merge data by time stamp

I need to merge two different data sets. The first, A, has data collected every 2 minutes. The second, B, has data collected only when a measurement crossed a threshhold. 

 

A-sample

2017-01-21:00:02:00, 1, 2, 3, 4

2017-01-21:00:04:00, 2, 3, 4, 5

2017-01-21:00:06:00, 3, 4, 5, 6

...

B-sample

2017-01-21:00:02:05, 4500

2017-01-21:00:05:14, 5700

...

It will be sufficient to put B data into the nearest A row. For example, B-data at 2017-01-21:00:05:14 could be placed into A-row=2017-01-21:00:06:00

 

I have about 170,000 rows in A; and about 4,000 rows in B.

 

Thanks

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Merge data by time stamp

There were a couple of items that did not work with the latest version(s) of JMP in the script that was referenced.  I reworked them, and removed the requirement to run this as a function call.  So all you have to do is to open the 2 data tables and run the script below.  I ran it against your data and it worked well.  I am not sure the results are exactly what you want, but it will give you a data table you can work from.

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
             dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
             win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
                                        				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
                          		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append(
				col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
			);
		),   //see above
                          		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
                                        				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box(
			col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
		),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
                    	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
                          		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
	
       
      dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
      list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
       
      dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula( Loc Min( Abs( Column( rt1 )[Row()] - list_big ) ) )
);   //simple, but fast.  Very fast.
       
      dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
      dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
      list_columns = dt_join << get column names( string );
       
      For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
      dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
      Return( dt_join );
Jim

View solution in original post

8 REPLIES 8
vince_faller
Super User (Alumni)

Re: Merge data by time stamp

https://community.jmp.com/t5/Discussions/Join-Question-timestamp-range-kinda-sorta/m-p/20828

 

This will do it or at least get you on the right track.  And it's stupidly fast. 

Vince Faller - Predictum
michaelclarage
Level II

Re: Merge data by time stamp

thanks for the quick response.

I put a "datetime-in-seconds" column in each table, so all data points have a timestamp.

I run the script, it asks me to choose a column in each table, which I do by selecting the "datetime-in-seconds" column in each table. 

The end result is just an additional column in dt2 called "Loc Min" which is simply ascending integers 1,2,3,.. 

Is the function failing, or am I missing something on how to use it?

michaelclarage
Level II

Re: Merge data by time stamp

attaching subsets of the files I am using

txnelson
Super User

Re: Merge data by time stamp

There were a couple of items that did not work with the latest version(s) of JMP in the script that was referenced.  I reworked them, and removed the requirement to run this as a function call.  So all you have to do is to open the 2 data tables and run the script below.  I ran it against your data and it worked well.  I am not sure the results are exactly what you want, but it will give you a data table you can work from.

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
             dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
             win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
                                        				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
                          		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append(
				col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
			);
		),   //see above
                          		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
                                        				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box(
			col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected )
		),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
                    	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
                          		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
	
       
      dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
      list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
       
      dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula( Loc Min( Abs( Column( rt1 )[Row()] - list_big ) ) )
);   //simple, but fast.  Very fast.
       
      dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
      dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
      list_columns = dt_join << get column names( string );
       
      For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
      dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
      Return( dt_join );
Jim
michaelclarage
Level II

Re: Merge data by time stamp

Jim,

Thanks so much. I think we are all set now.

gzhou
Level I

Re: Merge data by time stamp

How could you modify this JMP script to allow it join two data tables by matching two different variable? 

 

For example: In the data set below, I would like to match by Experiment # and then match by the local min of the timestamp to join values of data table 2 onto data table 1 such that values of data table 2 is in one row of the nearest timestamp in data table 1 with the correct experiment #.

A-sample

Experiment 1, 2017-01-21:00:02:00, 1, 2, 3, 4

Experiment 1, 2017-01-21:00:04:00, 2, 3, 4, 5

Experiment 2, 2017-01-21:00:06:00, 3, 4, 5, 6

...

B-sample

Experiment 1, 2017-01-21:00:02:05, 4500

Experiment 2, 2017-01-21:00:05:14, 5700

 

Any assistance would be helpful!

 

Thanks

txnelson
Super User

Re: Merge data by time stamp

Here is a modification of the script that I believe will give you what you want.  Now, it hasn't been fully tested, since you did not provide sample data tables, but I believe it should work, if the variable you are placing your experiment number into is called "Experiment".

Names Default To Here( 1 );
Clear Symbols();
list_tables = {};


For( i = 1, i <= N Table(), i++,
	Insert Into( list_tables, Data Table( i ) << get name )
);  //sets up names of data tables to pick from
dt1 = Data Table( list_tables[1] );
dt2 = Data Table( list_tables[2] ); //Reasonable Starting point
win_tableselect = New Window( "Select Tables",
	<<modal,
	lup_box = Lineup Box( N Col( 2 ),
		Text Box( "Main Table (i.e. SubID List)", <<set font size( 10 ) ),
		Text Box( "Update-With Table (i.e. Tool Data)", <<set font size( 10 ) ),
		c1 = Combo Box(
			list_tables,
			s1 = c1 << getselected;
			dt1 = Data Table( s1 );
			vref1[1] << delete;
			vref1 << append( //Deletes and redraws
				col1 = Col List Box( Data Table( dt1 ), all, maxSelected( 1 ), rt1 = col1 << get selected )
			);
		),   //these parts change the following collistboxes and their fantastic syntax based upon the data table combo box selection
		c2 = Combo Box(
			list_tables,
			<<set( 2 ),
			s2 = c2 << getselected;
			dt2 = Data Table( s2 );
			vref2[1] << delete;
			vref2 << append( col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected ) );
		),   //see above
		vref1 = V List Box(
			col1 = Col List Box(
				Data Table( dt1 ), //still stoopdi
				all,
				maxSelected( 1 ),
				rt1 = col1 << get selected
			)
		),
		vref2 = V List Box( col2 = Col List Box( Data Table( dt2 ), all, maxSelected( 1 ), rt2 = col2 << get selected ) ),     //the point of the vrefs is to have something to delete without messing with the lineupbox's proclivity for left-right ordering
	),
	Text Box( "" ),
	H List Box(
		Button Box( "Ok",
			s1 = c1 << getselected;
			s2 = c2 << getselected;
			dt1 = Data Table( s1 );
			dt2 = Data Table( s2 );
			rt1 = Char( (col1 << get selected)[1] );
			rt2 = Char( (col2 << get selected)[1] );
		),   //not everyone has the << return result option yet if they're not in 11
		Button Box( "Cancel" ), 

	), 

);
If( win_tableselect["Button"] == -1,
	Throw( "Canceled by User" )
);
//dt1=data table("AMPERE-concat-cleaned-subset (2)");
//dt2=data table("USGA data 2010 subset (2)");

       
dt2 << New Column( "Loc Min", numeric, continuous, formula( Row() ) ); //indexing, basically, to match/update upon
//list_big = Column( dt2, rt2 ) << get values; //creates a vector of values for the following
//rt2 = column( dt2, "date-time-in-seconds");
dt3 = dt2 << subset( selected rows( 0 ), columns( {"Experiment", "loc min", rt2} ) ); 

// Find out the number of experiments
Current Data Table( dt3 );
Summarize( dt3, experiments = by( :Experiment ) );
If( Column( dt3, "Experiment" ) << get data type == "Numeric",
	For( i = 1, i <= N Items( experiments ), i++,
		experiments[i] = Num( experiments[i] )
	)
);	

For( i = 1, i <= N Items( experiments ), i++,
	dt3 << New Column( "__exp__" || Char( experiments[i] ), set each value( . ) );
	theRows = dt3 << get rows where( :Experiment == experiments[i] );
	For( k = 1, k <= N Rows( theRows ), k++,
		Column( dt3, N Cols( dt3 ) )[theRows[k]] = rt2[theRows[k]]
	);
);

dt3 << delete columns( 1, 2, 3 );
list_big = dt3 << get as matrix;

Close( dt3, nosave );
     
dt1 << New Column( "Loc Min",
	numeric,
	continuous,
	formula(
		whichExp = Loc( experiments, :Experiment );
		Loc Min( Abs( Column( rt1 )[Row()] - list_big[0, whichExp] ) );
	)
);   //simple, but fast.  Very fast.
       
dt_join = dt1 << Join(
	With( dt2 ),
	By Matching Columns( :Loc Min = :Loc Min ),
	Drop multiples( 0, 1 ),
	Name( "Include non-matches" )(0, 0),
	Preserve main table order( 1 )
);
     
dt1 << delete columns( "Loc Min" );
dt2 << delete columns( "Loc Min" );  //Removes pesky join column
list_columns = dt_join << get column names( string );
       
For( i = 1, i <= N Items( list_columns ), i++,
	If( Contains( list_columns, "Loc Min" ),
		Column( dt_join, list_columns ) << set selected( 1 )
	)
);  //the joined table will having lingering Loc Min-like columns.  This just hunts them down and removes them.
dt_join << delete columns;
dt_join << set name( Char( dt1 << get name ) || " & " || Char( dt2 << get name ) );  //Good enough
Return( dt_join );
Jim
gzhou
Level I

Re: Merge data by time stamp

Thank you so much txnelson! I will test it out.

-gzhou