Our World Statistics Day conversations have been a great reminder of how much statistics can inform our lives. Do you have an example of how statistics has made a difference in your life? Share your story with the Community!
Choose Language Hide Translation Bar
Highlighted
Level I

## Who can help to optimize a basic function for completing rows in a data table?

Hello All,

I am looking for some advise from more experienced JMP users for the following problem.

I frequently need to work with a data tables that contain up to 4 rows per unit. However some of the system parameters are only recorded for the first enry of an individual unit although applicable for the other rows associated to that unit as well.

In order to be able to analyze and group the data I need these system parameters to be completed for all rows of an unit.

As I have no chance to alter the data recording to the database I need to take care of this after data base extraction before any analysis.

I therefore created a workaround as shown below.

It provides the desired result but is very slow and actually not feasible for data tables with more than 1000 rows.

dt = Current Data Table();

dt<<Sort(By(:unit), Order(Ascending), Replace Table);

:param_1 << Formula (for each row(:param_1 [row()+1] = (if (:unit[row()] == :unit [row()+1],

:param_1 [row()], :param_1 [row()+1] )),

),

Eval Formula

);

I am sure there is a more efficient and elegant way to solve this problem, but I have to admit that my skills are very limited.

I appreciate any input from more experienced users.

Thanks,

Lutz

3 REPLIES 3
Highlighted
Super User

## Re: Who can help to optimize a basic function for completing rows in a data table?

Try this code.  It assumes that you have a two-column table with one column called Unit and the other called Param_1.  I didn't do any performance testing, but this might zip through your table pretty quickly.  Only sets values if it has to.

dt = New Table( "Unit-Param Example",

Add Rows( 16 ),

New Column( "Unit",

Character,

Nominal,

Set Values(

{"a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c", "d", "d",

"d", "d"}

)

),

New Column( "Param_1",

Numeric,

Continuous,

Format( "Best", 12 ),

Set Values( [32, 32, 32, 32, 25, ., ., ., ., 14, ., ., ., ., 13, .] )

)

);

// Sort by unit and parameter, so that if any are missing they're in positions 2, 3, 4 of a set

dt << Sort(By(:unit, :param_1), Order(Ascending, descending), Replace Table);

// Get the first row values

old_unit = dt:unit[1];

old_param = dt:param_1[1];

for (i = 2, i <= nrows(dt), i++,

one_unit  = dt:unit[i];

one_param = dt:param_1[i];

if (one_unit == old_unit & is missing(one_param),

dt:param_1[i] = old_param;

);

if (one_unit != old_unit,

old_unit = one_unit;

old_param = dt:param_1[i];

);

);

Highlighted
Super User

## Re: Who can help to optimize a basic function for completing rows in a data table?

Tested my code against a 40,000 row table and it returned in less than a second.

Highlighted
Level I

## Re: Who can help to optimize a basic function for completing rows in a data table?

Thanks a lot. Will test the code first thing in office today.

Article Labels

There are no labels assigned to this post.