cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
Ressel
Level VI

Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

I have received the formula I want to discuss almost 3 years ago from @txnelson, never let go of it since and thought about it many times before now, slowly coming somewhat more seriously to grips with scripting. In the meantime, the formula as saved me a lot of time.

In the example below, the same formula is added 3 times, each time with a little bit of variation. The issue I want to discuss are the different error messages thrown when copying and running the table script of the table including the newly added formula columns.

 

Names Default To Here( 1 );

// first, we need an example table
dt = New Table( "ExampleTable",
	Add Rows( 24 ),
	New Column( "BatchId",
		Character,
		"Nominal",
		Set Values(
			{"A", "A", "A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D",
			"D", "E", "E", "E", "F", "F", "F", "G", "G", "G"}
		)
	),
	New Column( "StorageCondition",
		Character,
		"Nominal",
		Set Values(
			{"25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "5Ā°C", "5Ā°C", "5Ā°C",
			"25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH",
			"25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH",
			"25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH",
			"25Ā°C/60%RH", "25Ā°C/60%RH", "25Ā°C/60%RH"}
		)
	),
	New Column( "Time",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values(
			[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
		)
	),
	New Column( "QcParameter",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values(
			[0.2, 0.3, 0.4, 0.2, 0.21, 0.22, 0.3, 0.9, 1.2, ., 0.2, 0.4, 0.1, ., 0.1,
			0, 0, 0, 0.1, 0, 0, 0, 0.1, 0]
		)
	)
);


// adding formula column via approach nr. 1
// this adds the formula alright, but when copying and executing the resulting table script will leave the formula column empty
// no need to discuss this, because this obviously missing the 'eval(eval expr(expr()))' wrapping (see screenshot below) dt << New Column( "QcParameter [% rel.]_1", numeric, formula( curBatch = :BatchId; curCondition = :StorageCondition; theRows = dt << get rows where( :BatchId == curBatch & :StorageCondition == curCondition & :Time == 0 ); d0Row = theRows[1]; d0Time = :QcParameter[d0Row]; (100 * :QcParameter / d0Time); ) ); // adding formula column via approach nr. 2
// this will fill the formula column as desired when copying and running the table script with the added formula columns
// but it will also throw an error message when new rows are added to the table then (see screenshot below) Eval( Eval Expr( dt << New Column( "QcParameter [% rel.]_2", numeric, formula( curBatch = :BatchId; curCondition = :StorageCondition; theRows = Expr( dt ) << get rows where( :BatchId == curBatch & :StorageCondition == curCondition & :Time == 0 ); d0Row = theRows[1]; // removed from approach nr. 3 d0Time = :QcParameter[d0Row]; (100 * :QcParameter / d0Time); ) ) ) ); // adding formula column via approach nr. 3 // this will fill the formula column as desired when copying and running the table script with the added formula columns
// and it will NOT throw an error message when new rows are added to the table then (see screenshot below) Eval( Eval Expr( dt << New Column( "QcParameter [% rel.]_3", numeric, formula( curBatch = :BatchId; curCondition = :StorageCondition; theRows = Expr( dt ) << get rows where( :BatchId == curBatch & :StorageCondition == curCondition & :Time == 0 ); d0Time = :QcParameter[theRows]; (100 * :QcParameter) / d0Time; ) ) ) );

 

Screenshot of error message for formula column "QcParameter [%r rel.]_1]" when rerunning table script with additional formula columns:

Ressel_0-1696856397171.png

 

Screenshot of error message for formula column "QcParameter [%r rel.]_2]" when rerunning table script with additional formula columns:

Ressel_1-1696856568605.png

 

Screenshot of error message for formula column "QcParameter [%r rel.]_3]" when rerunning table script with additional formula columns: 

Ressel_2-1696856674488.png

 

The reason why I am asking this lengthy question is that I assume there must be some reason why variable "d0Row" is used in approach nr. 2, no? For approach nr. 3 I have simply deleted this variable (see screenshot below) and the formula behaves "much more nicely" as a result, i.e., there is no error message when new rows are added ad-hoc by point & click to the data table.

Ressel_3-1696856846545.png

 

My question is therefore, if I am doing something wrong by deleting the "d0Row" variable. In real life I am working with larger data tables and won't be able to check each row individually for its accuracy.

2 ACCEPTED SOLUTIONS

Accepted Solutions
txnelson
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

You should be able to eliminate the error messages with the Try() function

Try(
	curBatch = :BatchId;
	curCondition = :StorageCondition;
	theRows = Current Data Table() << get rows where(
		:BatchId == curBatch & :StorageCondition == curCondition & :Time == 0
	);
	d0Row = theRows[1];
	d0Time = :QcParameter[d0Row];
	If( Row() == 11,
		Show( therows, d0Time )
	);
	(100 * :QcParameter) / d0Time;
);
Jim

View solution in original post

jthi
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

Not exactly related to this question, but you can "simplify" the formula with some Col-function magics. These suggestions should work, but you should verify them before replacing old working formula

dt << New Column("FormulaA", Numeric, Continuous, Formula(
	100 * :QcParameter / Col Sum(If(:Time == 0, :QcParameter, .),:BatchID, :StorageCondition) 
));

If the data is sorted in such a way that Time 0 is always first for the group, you can make it even easier to understand (maybe)

dt << New Column("FormulaB", Numeric, Continuous, Formula(
	100 * :QcParameter / QcParameter[Col Min(Row(), :BatchID, :StorageCondition)]
));

 

-Jarmo

View solution in original post

6 REPLIES 6
txnelson
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

I am not completely sure I am following your issue, but what I am seeing is that your reference to dt is when you copy formula 1, the reference to dt is missing.  In example 2 and 3, you end up with the data table reference pointing to data table "ExampleTable", and your data table name, as shown, is "Example Table 2".  Again, if I am understanding your issue,I think you can solve the issue by using "Current Daa Table()".

curBatch = :BatchId;
curCondition = :StorageCondition;
theRows = Current Data Table() << get rows where(
	:BatchId == curBatch & :StorageCondition == curCondition & :Time == 0
);
d0Row = theRows[1];
d0Time = :QcParameter[d0Row];
If( Row() == 11,
	Show( therows, d0Time )
);
(100 * :QcParameter) / d0Time;

 

Jim
Ressel
Level VI

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

Thanks! I'll try and summarize again:

  1. When adding the three formula columns to the example table previously without formula columns, then copying and executing the resulting table script of the table now containing the three formula columns, only examples 2 and 3 will populate the corresponding formula columns with the expected, relative values. The first formula column stays empty.
  2. Moreover, when adding new rows to the final table created under 1. (i.e. from the table script containing now all 3 formula columns), only the third example will not throw an error message when adding new, empty rows.

Because of this observation I removed the "d0Row" variable as part of example 3. Following this, no error message appears when adding new rows to the table and it reproduces the relative data as expected. The question therefore is if "d0Row" is at all required. I created example 3 specifically to test this hypothesis. (Removing "d0Row" makes the formula shorter, but my concern is that this variable is essential or that I am overlooking something. Am I?

 

I am not sure this is related to using or not using "current data table()" in the formula. I've used "current data table()" as a workaround for a long time now. This will still result in error messages when adding new rows to a table, but I still want to be able to add new rows to a table without having to constantly confirm error messages for every relative column in a table when a new row is added. There are dozens of relative columns in them and it costs time to repeat this operation. It is of course possible to suppress evaluation, but this also costs time. Apologies if this example is not clear, but I'm willing to explain further.)

txnelson
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

You should be able to eliminate the error messages with the Try() function

Try(
	curBatch = :BatchId;
	curCondition = :StorageCondition;
	theRows = Current Data Table() << get rows where(
		:BatchId == curBatch & :StorageCondition == curCondition & :Time == 0
	);
	d0Row = theRows[1];
	d0Time = :QcParameter[d0Row];
	If( Row() == 11,
		Show( therows, d0Time )
	);
	(100 * :QcParameter) / d0Time;
);
Jim
jthi
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

Not exactly related to this question, but you can "simplify" the formula with some Col-function magics. These suggestions should work, but you should verify them before replacing old working formula

dt << New Column("FormulaA", Numeric, Continuous, Formula(
	100 * :QcParameter / Col Sum(If(:Time == 0, :QcParameter, .),:BatchID, :StorageCondition) 
));

If the data is sorted in such a way that Time 0 is always first for the group, you can make it even easier to understand (maybe)

dt << New Column("FormulaB", Numeric, Continuous, Formula(
	100 * :QcParameter / QcParameter[Col Min(Row(), :BatchID, :StorageCondition)]
));

 

-Jarmo
Ressel
Level VI

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

This was actually exactly related to the discussion. The discussion is about eliminating error messages. Your suggestion does that. Thanks. Secondary goal was explanation of the necessity of "d0Row" variable in the original formula. This hasn't been resolved, but it really doesn't look as if it's required. Simplicity is desirable, of course.

jthi
Super User

Re: Optimizing a column formula for standardizing time resolved data to 100% rel. at t=0

Maybe d0Row is there just in case if there could be multiple places where the time is 0 for each of the groups? Using d0Row you would only take first of those (my formulas for examples do not take that into account (sorting thing kinda does)).

-Jarmo