Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
Level I

isolating the smallest date difference

i have a large dataset that has a static date on a subject (date of chest CT scan) and i need to find the smallest date difference and keep that row (delete the others).  jmp 16.2.0


Example below - i want to keep rows 2 and 4 (the smallest day difference for subject 1 and 2). 


subj ID.    CT date.      lung function date.      day difference

1               8/30/18            1/29/18                          -213  

1               8/30/18            8/31/18                              1

1               8/30/18            3/4/19                              186

2              12/19/17           11/13/18                          329

2              12/19/17            9/3/19                             623

2              12/19/17           10/4/21                            1385

Super User

Re: isolating the smallest date difference

Hi @jsolo01 ,

How about adding a new formula column "Col Rank" to rank the day difference according to subj ID.?

Then, you can select row "Col Rank" not equal to 0 & delete the rows.


subj ID 1, the min day difference is the 2nd row.

subj ID 2, the min day difference is the 4th row


1) Create new formula column: Col Rank




2) Go to Rows > Row Selection > Select Where


3) Define the condition: Col Ran does not equal 1 > click "Add Condition"




4) Right click the selected rows > Delete Rows



5) Final result as below



Hope it helps.

If it's a repetitive analysis, you can automate the steps through workflow or JSL


Super User

Re: isolating the smallest date difference

Here is my take on a simple way to get what you are asking for;  Just a simple script

Names Default To Here( 1 );
dt = Current Data Table();
dt << select where( :day difference == Col Min( :day difference, :subject ID ) );

dt << subset( selected columns( 0 ), selected rows( 1 ) );
Super User

Re: isolating the smallest date difference

Just to extend a little on @WebDesignesCrow interactive solution:


If you don't already have the difference column, you can create it interactively (select both of your dates, right click on the column header and find Difference (reverse order). This will create a new column where the differences are in seconds. If you want to see them in yours/days you can double click on the created column and modify it as needed (such as add division by In Days(1)) but seconds work for this case


Then right click on this new column's header and make absolute value column out of it


After you have that column, right click on the subjectid column and make it grouping column


next you can right click on the date difference column and create rank formula (note the grouping)


from the final column, right click on one of the "ones", select matching cells


From here you can either create subset using selected rows


OR right click on the row list, invert selection and after inversion delete rows




Then if you need to repeat this you can either make a workflow or capture the script from enhanced log


Below is script created by Workflow builder. Only thing it is missing is opening the table as I don't have the table saved (and some column names are incorrect for the same reason) so I added table creation script (workflow is also attached).

View more...
Names Default To Here(1);

//Custom Action - Create table 'Untitled 2'
dt = New Table("Untitled 2",
	Add Rows(6),
	Compress File When Saved(1),
	New Column("subj", Numeric, "Continuous", Format("Best", 12), Set Values([1, 1, 1, 2, 2, 2])),
	New Column("day1",
		Format("m/d/y", 12),
		Input Format("m/d/y"),
		Set Values([3618432000, 3618432000, 3618432000, 3596486400, 3596486400, 3596486400])
	New Column("day2",
		Format("m/d/y", 12),
		Input Format("m/d/y"),
		Set Values([3600028800, 3618518400, 3634502400, 3624912000, 3650313600, 3716150400])

//New formula column: day2-day1
Data Table("Untitled 2") << New Formula Column(
	Operation(Category("Combine"), "Difference (reverse order)"),
	Columns(:day1, :day2)

//New formula column: Abs[day2-day1]
Data Table("Untitled 2") << New Formula Column(
	Operation(Category("Transform"), "Absolute Value"),

//New formula column: Rank[Abs[day2-day1]][subj]
Data Table("Untitled 2") << New Formula Column(
	Operation(Category("Distributional"), "Rank"),
	Group By(:subj)

//Select matching cells
Data Table("Untitled 2") << Select Where(:"Rank[Abs[day2-day1]][subj]"n == 1);

//Subset data table
Data Table("Untitled 2") << Select Where(:"Rank[Abs[day2-day1]][subj]"n == 1) <<
Subset(Selected Rows(1), Selected columns only(0));