Discussions

JasonWS · Jan 3, 2025 12:28 PM

I have some data that I would like to calculate the % Error between the first and all subsequent measurements for each unique ID. I haven't been able to figure out how to get data from one cell based on values from other rows-columns. Does this require JSL, or if I can just create a formula as I've done in Excel.

I'd like the 2 columns that I've highlighted to be calculated using a formula, but with thousands of rows of data in the actual set, I could understand if it has to be a script and the values static until the script is run again.

Any help would be appreciated.

I'm using JMP 16.

jthi · Jan 3, 2025 12:31 PM

You can get earliest measurement by using Col Min(Row(), :ID)

:Measurement[Col Min(Row(), :ID)]

You can then use this in separate column or just use this as a part of single formula which will calculate your result

-Jarmo

View solution in original post

jthi · Jan 3, 2025 12:31 PM

You can get earliest measurement by using Col Min(Row(), :ID)

:Measurement[Col Min(Row(), :ID)]

You can then use this in separate column or just use this as a part of single formula which will calculate your result

-Jarmo

JasonWS · Jan 3, 2025 01:14 PM

Thank you, that works.

I do have a concern though; what if I wanted to use the 2nd value for each ID?

So for ID == A I'd have 10.00001 in the "Earliest Measurement" column.

jthi · Jan 3, 2025 02:06 PM

There are quite a few different options but they can be a bit difficult to understand. "Simplest" idea is to combine Col statistical function with if statement and if statement returns either missing or the value you are looking for.

Col Min(If(Col Cumulative Sum(1, :ID) == 2, :Measurement, .), :ID)

Benefit of using Col statistical functions is that they work well with grouping columns and as formulas.

-Jarmo

jthi · Jan 4, 2025 02:55 AM

The method I would consider most "JMP like" is interactive solution and letting JMP create automated process (using enhanced log / workflow builder) and then fixing "issues" left by JMP created script:

As you already have Order column which I assume you can use, create a subset of just Order == 2. Right click -> select matching cells -> make subset

From the subset rename Measurement column to something else. Then go back to your original table and use Update to join the subset back to your table using ID as the match condition column

and now you have your measurement from order two in a column. And JMP will write you a script like this:

// Select matching cells
Data Table("JMP Example") << Select Where(:Order == 2);


// Subset data table
// → Data Table("Subset of JMP Example")
Data Table("JMP Example") << Select Where(:Order == 2) <<
Subset(Copy formula(0), Selected Rows(1), Selected columns only(0));


// Change column name: Measurement → Measurement_ORDER2
Data Table("Subset of JMP Example"):Measurement << Set Name("Measurement_ORDER2");


// Update data table
Data Table("JMP Example") << Update(
	With(Data Table("Subset of JMP Example")),
	Match Columns(:ID = :ID),
	Add Columns from Update Table(:Measurement_ORDER2),
	Replace Columns in Main Table(None)
);

It definitely isn't the best script but it does work and give an idea what is being done. You can also start workflow recording before you start this process and create a workflow to automate this (I personally do not use workflows and prefer enhanced log but they can be very helpful, especially if you wish to automate something). Script created by workflow builder could look something like this


Names Default To Here(1);


Workflow 6=function({}, 
    step_name = "Open Data Table: JMP Example.jmp";
    //Open Data Table: JMP Example.jmp
    Open("$DOWNLOADS/JMP Example.jmp");

    step_name = "Select matching cells";
    //Select matching cells
    Data Table("JMP Example") << Select Where(:Order == 2);

    step_name = "Subset data table";
    //Subset data table
    Data Table("JMP Example") << Select Where(:Order == 2) <<
    Subset(Selected Rows(1), Selected columns only(0));

    step_name = "Change column name: Measurement → Measurement2";
    //Change column name: Measurement → Measurement2
    Data Table("Subset of JMP Example"):Measurement << Set Name("Measurement2");

    step_name = "Update data table";
    //Update data table
    Data Table("JMP Example") << Update(
    	With(Data Table("Subset of JMP Example")),
    	Match Columns(:ID = :ID),
    	Add Columns from Update Table(:Measurement2),
    	Replace Columns in Main Table(None)
    );

    step_name = "Close Data Table: Subset of JMP Example";
    //Close Data Table: Subset of JMP Example
    Close(Data Table("Subset of JMP Example"), NoSave);
);
Workflow 6();

Generally workflow builder created scripts do have similar issues as enhanced log created ones but workflow does have extra benefits (if you wish to consider them like that, I consider them mostly annoyance if anything else than very simple things such as updating references). And here is how I would modify the JMP created script

Names Default To Here(1);

dt = Open("$DOWNLOADS/JMP Example.jmp");
dt << Clear Column Selection << Clear Select;

rows_of_interest = dt << Get Rows Where(:Order == 2);
dt_subset = dt << Subset(Rows(rows_of_interest), 
	Copy formula(0), 
	Selected columns only(0),
	Invisible
);

Column(dt_subset, "Measurement") << Set Name("Measurement_ORDER2");

dt << Update(
	With(dt_subset),
	Match Columns(:ID = :ID),
	Add Columns from Update Table(:Measurement_ORDER2),
	Replace Columns in Main Table(None)
);

Close(dt_subset, no save);

These can also be done using "more advanced" JSL but generally if you want to use formulas and avoid JSL, I wouldn't use them. Only case to utilize these is if you need speed and even in that case these might sometimes end up being slower unless you optimize them. Here is one option using JSL with some comments to explain

Names Default To Here(1);

dt = Open("$DOWNLOADS/JMP Example.jmp");

// Get all IDs into a list using data table subscripting
ids = dt[0, "ID"];

// Get all unique IDs using Summarize (there are many other methods)
// Note that Summarize will change numeric values to strings (doesn't matter in this case)
Summarize(dt, uniq_ids = By(:ID));

// We can loop over the unique IDs while collecting results
// I will store results to Associative Array as it is efficient method and easy to access
aa_idpos = Associative Array();
aa_meas = Associative Array();
idx_of_interest = 2;
For Each({cur_id}, uniq_ids,
	// We can use Loc to find specific ID from a list of IDs
	cur_idx = Loc(ids, cur_id);
	
	// From this we can pick second index for example and store it to our aa
	// Note: Usually you want to check that there is at least two items if you do this
	aa_idpos[cur_id] = cur_idx[2];
	
	// You could also get the Measurement value instead
	// (I prefer index as then I can access value from any of the columns easily)
	aa_meas[cur_id] = :Measurement[cur_idx[idx_of_interest]];
);
// Show(aa_idpos);

// Then you can basically do whatever you want with these
// Fill them into the table for example by using Formula OR For Each Row
// Formula tends to be faster if you just need one column
new_col = dt << New Column("Second Meas by ID", Numeric, Continuous, Formula(
	aa_meas[:ID]
));
// aa_meas WON'T be evaluated unless you specifically do it, so either remove the formula
// or evaluate it using eval(evalexpr()) / eval(substitute())
dt << run formulas;
new_col << Delete Formula();

/*
// or evaluation (one option, there many other options (local variables, not worrying about the scope, ...))
new_col = Eval(EvalExpr(
	dt << New Column("Second Meas by ID", Numeric, Continuous, Formula(
		aa = As Constant(Expr(aa_meas));
		aa[:ID]
	));	
));
*/

Names Default To Here(1); dt = Open("$DOWNLOADS/JMP Example.jmp"); // Get all IDs into a list using data table subscripting ids = dt[0, "ID"]; // Get all unique IDs using Summarize (there are many other methods) // Note that Summarize will change numeric values to strings (doesn't matter in this case) Summarize(dt, uniq_ids = By(:ID)); // We can loop over the unique IDs while collecting results // I will store results to Associative Array as it is efficient method and easy to access aa_idpos = Associative Array(); aa_meas = Associative Array(); idx_of_interest = 2; For Each({cur_id}, uniq_ids, // We can use Loc to find specific ID from a list of IDs cur_idx = Loc(ids, cur_id); // From this we can pick second index for example and store it to our aa // Note: Usually you want to check that there is at least two items if you do this aa_idpos[cur_id] = cur_idx[2]; // You could also get the Measurement value instead // (I prefer index as then I can access value from any of the columns easily) aa_meas[cur_id] = :Measurement[cur_idx[idx_of_interest]]; ); // Show(aa_idpos); // Then you can basically do whatever you want with these // Fill them into the table for example by using Formula OR For Each Row // Formula tends to be faster if you just need one column new_col = dt << New Column("Second Meas by ID", Numeric, Continuous, Formula( aa_meas[:ID] )); // aa_meas WON'T be evaluated unless you specifically do it, so either remove the formula // or evaluate it using eval(evalexpr()) / eval(substitute()) dt << run formulas; new_col << Delete Formula(); /* // or evaluation (one option, there many other options (local variables, not worrying about the scope, ...)) new_col = Eval(EvalExpr( dt << New Column("Second Meas by ID", Numeric, Continuous, Formula( aa = As Constant(Expr(aa_meas)); aa[:ID] )); ));*/

-Jarmo

txnelson · Jan 4, 2025 10:48 AM

If you know that your data are ordered, as they are in your sample data table, and assuming there will always be a second value, then

:Measurement[Col Min( Row(), :ID ) + 1]

is a very simple way to use the 2nd row in the series.

Jim

shampton82 · Jan 4, 2025 11:42 PM

Along the lines of what @txnelson suggested, you could use this as well:

:Measurement[Col Maximum( If( :Order == 2, Row() ), :ID )]

This would return the second measurement for each ID.

@JasonWS

I put together this PowerPoint to go over how to do things along the lines of vlookup in excel for new JMP users. Hope it can help as well.

Steve

Hope it helps!

hogi · Jan 5, 2025 12:33 PM

The topic gets more complicated if you want to aggregate several entries for the reference - or if there are excluded rows.
Besides that, if you apply such calculations often during daily work, you don't want to create the formula again and again / manually.
There is a Wish in the Wish List New Formula column/ Distributional: "Normalize" .
but the interest from other users is quite low - just 2 Kudos ...

So, I followed @mia_stephens advice

... and developed my own AddIn. It can be found in the Marketplace:
https://marketplace.jmp.com/appdetails/Normalization+GUI

The AddIn provides a user Interface to generate such calculations automatically.
Besides many other options, there is an option for "first", "last" and "N-1".

It is also possible to use the second entry for the normalization.
Just activate

and use a Rank/Order column like explained by @shampton82 :

At the end, the formula will look like this:

Very similar to the approach of @shampton82 :
It uses a Col ... aggregation to distribute the reference values to the other rows.

A subtle difference: it distributes the values not the row indices.

Discussions

Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Re: Equivalent to an Index(Match()) in Excel

Recommended Articles