cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar

Creating a script to select only certain values related to specific date range

I was given a file similar to this

PruningTheSpark_0-1734151299506.png

I was asked to create a formula that pulls all the codes that happened between the start and stop dates.  I'm unsure how to set this up as a formula.  Is this something that could be created as a script to pull all the codes that occur between the start and stop dates?  I have never set up a script before and hoping someone can walk me through how to create one.

Thanks!

 

7 REPLIES 7
txnelson
Super User

Re: Creating a script to select only certain values related to specific date range

Welcome to the Community.

The actual statements that will select rows within a given date range and create a new data table with just those selected rows included.

// Select the row that have the date range desired
dt << select where( :Start Date >= informat("01/10/2025","m/d/y") & 
	:End Date <= informat("04/30/2025","m/d/y"));
// Create the new table
dtSubset = dt << subset( selected columns(0), selected rows(1), output table("theSubset"));

Here is a full example that first creates an example data table, and then runs the above lines to select and create the data table.

(Note:  Below example was somehow left out of original posting. Thanks to @jthi for heads up on the omission )

Names Default To Here( 1 );

// Create an Example data table
dt = New Table( "Example",
	Add Rows( 20 ),
	New Column( "Start Date",
		Format( "m/d/Y" ),
		set each value( Today() + In Days( Random Integer( 1, 100 ) ) )
	),
	New Column( "End Date",
		Format( "m/d/Y" ),
		set each value( :Start Date + In Days( Random Integer( 1, 50 ) ) )
	)
);

// Select the row that have the date range desired
dt << select where(
	:Start Date >= Informat( "01/10/2025", "m/d/y" ) & :End Date <=
	Informat( "04/30/2025", "m/d/y" )
);
// Create the new table
dtSubset = dt << subset(
	selected columns( 0 ),
	selected rows( 1 ),
	output table( "theSubset" )
);

Another interesting, interactive way to create the new table is to use a Data Filter. The Data Filter allows the user to interactively select rows in the data table, and once the desired rows are selected, a new table from those rows can be created.

To to this: Once the data table is displayed

txnelson_1-1734155158308.png

Go to the Rows pull down menu and select "Data Filter"

txnelson_2-1734155406295.png

In the Data Filter window, select the Start and End Dates, and Click on the + sign

txnelson_3-1734155523668.png

The Filters will then be displayed.

Click on the Show and Include check boxes.  The indicate the action to take on the rows in the data table when the setting for the filters select the rows.

txnelson_4-1734155694846.png

Now the Filters can be set to make the selection.  The blue lines can be dragged to the desired Star and End Dates, or the values can be directly typed in in the displayed date area for each filter.

Please note how the rows in the data table become selected as the filters are changed

txnelson_5-1734156015619.png

Once the desired rows are selected a new data table can be created by selecting the Show Subset selection from the Data Filter red triangle's list of actions.  Just click on the red triangle beside the Data Filter title and select the option

txnelson_6-1734156239580.png

txnelson_7-1734156306378.png

 

Jim
jthi
Super User

Re: Creating a script to select only certain values related to specific date range

Could you provide example table to make it a bit easier to create an example?

 

I wouldn't most likely create a formula for this but rather a script. I assume you wish to extract all code_x values in which date_x falls between start date and end date for each row? Where should they be recorded? What should be done to duplicates? Can data format be changed (could the data be stacked)?

 

 

@txnelson did JMP community drop your full example?

-Jarmo

Re: Creating a script to select only certain values related to specific date range

This is an example table to use.  You are correct, I need to look at each row and get the codes for each date between the start and end dates.  It doesn't matter if there are duplicates.  Ultimately, I need to know if all the codes are the same, or if there are different codes in each time frame.  I need to know for each row if the codes that fall between the start and stop dates are all "2"s or all " 3"s or a combination of the codes.  

I think this must need to be some sort of script as I cannot think of a formula that will work.  I have no experience creating scripts so any help would be appreciated!!

jthi
Super User

Re: Creating a script to select only certain values related to specific date range

This can be most likely made much simpler with your final need in mind, but as there is no exact specification / result column I just created something I think should help you forward

Names Default To Here(1);

dt = Open("$DOWNLOADS/example table.jmp");
Try(Column(dt, "Date 1") << Set Name("date_1"));

date_cols = Filter Each({colname}, dt << Get Column Names("String"),
	Starts With(colname, "date_");
);
code_cols = Filter Each({colname}, dt << Get Column Names("String"),
	Starts With(colname, "code ");
);

dt << New Column("Result", Character, Multiple Response);

For Each Row(dt,
	dates = dt[Row(), date_cols];
	between_bin = :"Start date"n <= dates <= :"Stop date"n;
	between_idx = Loc(between_bin);
	If(N Items(between_idx) > 0,
		codes = dt[Row(), code_cols[between_idx]];
		valid = Loc(codes);
		If(N Items(valid) == 0,
			codes = {};
		,
			codes = As List(codes[valid`]);
		);
	,
		codes = {};
	);

	res = Transform Each({code}, codes,
		Char(code);
	);
	
	:Result = Concat Items(res, ", ");
);

jthi_0-1734254499524.png

 

-Jarmo
txnelson
Super User

Re: Creating a script to select only certain values related to specific date range

Jarmo's code is spectacular.  I enjoy seeing how he approaches and solves problems. But, for most of my life, I have been considered by my collogues, friends, spouse and children as being strange in thought.  On good days, I am referred to as a "Nerd".

So given my status in life, I have found out that most people do not think the way I do.  

@PruningTheSpark  states

"I think this must need to be some sort of script as I cannot think of a formula that will work.  I have no experience creating scripts so any help would be appreciated!!".

I believe that a majority of JMP users fall close to PruningTheSpark's position. PruningTheSpark may not think there is a non scripting solution to the problem, but JMP's interactive capabilities to manipulate one's data are really powerful.  So below I have put together the interactive steps it takes to solve PruningTheSparks issue.

 

Starting with the sample data table that was provided, I did modify it slightly.  Given the results from Jamo's script, it shows that none of the rows have more than one valid code.  Given that more than one code is a potential, I added in  code on row 2 that would show up for row 2 to have multiple codes.

txnelson_1-1734273699807.png

The first task is to find all of the codes that fall within the Start Date and Stop Date.  Since there are 6 different code columns, the easiest way to do this all at once is to stack the data .

     Tables=>Stack

txnelson_2-1734273947931.png

It is a Multiple series stack, staking the date and the code columns

Now I can get rid of all codes that have missing values since they will not figure into the results.  So I selected one cell that has a missing value in the Data 2 column which contains all of the codes and then right click and choose Select Mating Cells

txnelson_3-1734274329802.png

Which selects all of the rows that have missing values for column Data 2.

Right click in the RowState column and select Delete Rows

txnelson_4-1734274493918.png

and you are left with only the rows that have codes

 

txnelson_5-1734274620789.png

Now I need to determine which of these codes fall within the Start Date and the Stop Date.  For this I am just going to create a new  column, which I will call "in between" which has a formula of

txnelson_0-1734275157461.png

which has a 1 for rows where the column Data (it contains the Date from the original table) has a 1 for being in the range and a 0 for being outside of the range

txnelson_1-1734275281443.png

Using the same technique as when deleting the codes with a missing value, the rows with zeros can be selected and deleted

txnelson_2-1734275419804.png

To clean up the table, some of the no longer needed columns can be deleted

txnelson_3-1734275591364.png

Now the task is to get the codes back into single rows.  To do this, a simple Split can be done

     Tables=>Split

txnelson_4-1734275732612.png

txnelson_5-1734275823204.png

Now to combine the code columns into one column, a Column Utility called Combine Columns can be used.  Just select the columns that are to be combined and then 

     Cols=>Utility=>Combine Columns

txnelson_6-1734275995990.png

Which pops up an input box to fill in the name of the new column, and what delimiter to  place between items, in this case, a ","

txnelson_7-1734276158965.png

this results in the table having a new column

txnelson_8-1734276210733.png

Clean up on the table by deleting the no longer needed columns gives 

txnelson_9-1734276283410.png

Now all that has to be done, is to move the results back into the original data table.  To do this, the Update platform will do the trick.  Click on the original table and 

     Tables=>Update

txnelson_10-1734276458194.png

Fill in the table to get the update from, and select to match record_id of the original table with record_id from the Split table and click on OK and the original table will contain are solution

txnelson_11-1734276643313.png

No script was used, only 2 simple formula columns were required.

 

Now, as one goes through these steps, JMP places in the log, the JSL required to create the interactive steps that were run.  So one can simply copy those pieces of JSL into a script, and at the end have a script that when run will great the solution in the blink of an eye.

names default to here(1);
dt = data table( "Example Table" );

// Stack data table
// → Data Table( "Stacked" )
dtStacked = Data Table( "example table" ) << Stack(
	columns(
		:date_1, :code 1, :date_2, :code 2, :date_3, :code 3, :date_4, :code 4,
		:date_5, :code 5, :date_6, :code 6
	),
	Number of Series( 2 ),
	Output Table( "Stacked" )
);
wait(0);
// Delete selected rows
Data Table( "Stacked" ) << Select Where( Is Missing( :Data 2 ) ) << Delete Rows;

// Delete columns
Data Table( "Stacked" ) << Delete Columns( :Label, :Label 2 );
// New column: is between
Data Table( "Stacked" ) << New Column( "is between",
	Numeric,
	"Continuous",
	Format( "Best", 12 )
);
// Change column formula: is between
Data Table( "Stacked" ):is between << Set Formula(
	:Data >= :Start date & :Data <= :Stop date
);

// Delete selected rows
Data Table( "Stacked" ) << Select Where( :is between == 0 ) << delete rows;

// Delete column: is between
Data Table( "Stacked" ) << Delete Columns( :is between );

// Delete columns
Data Table( "Stacked" ) << Delete Columns( :Stop date, :Start date, :Data );

// Split data table
// → Data Table( "Split" )
Data Table( "Stacked" ) << Split(
	Split By( :Data 2 ),
	Split( :Data 2 ),
	Group( :record_id ),
	Output Table( "Split" ),
	Sort by Column Property
);

// Combine columns
Data Table( "Split" ) << Combine Columns(
	columns( :"1"n, :"2"n, :"3"n ),
	Column Name( "Result" ),
	Delimiter( "," )
);


// Delete columns
Data Table( "Split" ) << Delete Columns( :"1"n, :"2"n, :"3"n );

// Update data table
Data Table( "example table" ) << Update(
	With( Data Table( "Split" ) ),
	Match Columns( :record_id = :record_id )
);

One can also open a new workflow, turn on the recorder and step through the steps and you will end up with a workflow that can be used in the future to create the solution.

txnelson_0-1734293113511.png

 

 

 

 

 

Jim

Re: Creating a script to select only certain values related to specific date range

Thanks so much for the help!  The screenshots are very helpful!

Re: Creating a script to select only certain values related to specific date range

Thanks so much for your help!