cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Leyahans
Level I

Selecting latest Date for same ID

Hi all,

I have a problem here trying to figure out. I'm still new in JSL thus I'm stuck.
I have this table that gives me the ID of an item, the date it was processed and so on. For same ID, multiple reprocess could've been done and it causes the timestamp of DateTime to double/triple. I want to clean up my table and only maintain the ones with latest DateTime for same ID.

 

I know i should use loop for. but I'm stuck at how to make the ID find their partner in the list.

Note: The same ID might be in random line items.

 

Leyahans_0-1675392706064.png

 

Thank you for the help!

 

2 REPLIES 2
vince_faller
Super User (Alumni)

Re: Selecting latest Date for same ID

Colmax() has a by variable you can give to it.  I thik this does what you want.  You can even give multiple by columns if you want.  

 

Names default to here(1);
dt = New Table( "Untitled 101",
	Add Rows( 10 ),
	New Column( "ID",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 1, 2, 2, 3, 3, 3, 3, 1, 1] ),
		Set Display Width( 51 )
	),
	New Column( "Datetime",
		Numeric,
		"Continuous",
		Format( "m/d/y h:m:s", 23, 0 ),
		Input Format( "m/d/y h:m:s", 0 ),
		Set Selected,
		Set Values(
			[3758218580, 3758222180, 3758225780, 3758229380, 3758232980, 3758236580,
			3758240180, 3758243780, 3758247380, 3758250980]
		)
	),
);
dt << New COlumn("Max by ID", 		
	Format( "m/d/y h:m:s", 23, 0 ),
	Formula(Colmax(:Datetime, :ID))
)
Vince Faller - Predictum

Re: Selecting latest Date for same ID

Hi, if by "clean up my table and only maintain the ones with latest DateTime for same ID" you mean that you wish to keep only the most recent information, deleting all other rows, take a look at the "select duplicate rows" function. Your process then becomes:

1) sort the table by date--in this case, descending, as you want the most recent data on top

2) use the select duplicate rows function to select "duplicated" rows with respect to grouping columns of interest

3) delete these rows

 

To try this, run the first group of lines in the script below, then run the last 3 lines one by one.

 

Cheers,

Brady

 

Names default to here(1);
dt = astable (J(20,1,randominteger(1, 5)) || J(20,1,randominteger(3e9, 3.5e9) ),
	<< column names ({"group", "date"}));
dt:date <<  Format( "m/d/y h:m", 19 );

///////////////
dt << sort (by (:group, :date), order (ascending, descending), replacetable);
dt << Select Duplicate Rows( Match( :group ) );
dt << delete rows;