Subscribe Bookmark RSS Feed

Who can help to optimize a basic function for completing rows in a data table?

lutze01_hotmail

Community Trekker

Joined:

Dec 15, 2014

Hello All,

I am looking for some advise from more experienced JMP users for the following problem.

I frequently need to work with a data tables that contain up to 4 rows per unit. However some of the system parameters are only recorded for the first enry of an individual unit although applicable for the other rows associated to that unit as well.

In order to be able to analyze and group the data I need these system parameters to be completed for all rows of an unit.

As I have no chance to alter the data recording to the database I need to take care of this after data base extraction before any analysis.

I therefore created a workaround as shown below.

It provides the desired result but is very slow and actually not feasible for data tables with more than 1000 rows.

dt = Current Data Table();

dt<<Sort(By(:unit), Order(Ascending), Replace Table);

:param_1 << Formula (for each row(:param_1 [row()+1] = (if (:unit[row()] == :unit [row()+1],

  :param_1 [row()], :param_1 [row()+1] )),

  ),

  Eval Formula

);

I am sure there is a more efficient and elegant way to solve this problem, but I have to admit that my skills are very limited.

I appreciate any input from more experienced users.

Thanks,

Lutz

3 REPLIES
pmroz

Super User

Joined:

Jun 23, 2011

Try this code.  It assumes that you have a two-column table with one column called Unit and the other called Param_1.  I didn't do any performance testing, but this might zip through your table pretty quickly.  Only sets values if it has to.

dt = New Table( "Unit-Param Example",

    Add Rows( 16 ),

    New Column( "Unit",

        Character,

        Nominal,

        Set Values(

            {"a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c", "d", "d",

            "d", "d"}

        )

    ),

    New Column( "Param_1",

        Numeric,

        Continuous,

        Format( "Best", 12 ),

        Set Values( [32, 32, 32, 32, 25, ., ., ., ., 14, ., ., ., ., 13, .] )

    )

);

// Sort by unit and parameter, so that if any are missing they're in positions 2, 3, 4 of a set

dt << Sort(By(:unit, :param_1), Order(Ascending, descending), Replace Table);

// Get the first row values

old_unit = dt:unit[1];

old_param = dt:param_1[1];

for (i = 2, i <= nrows(dt), i++,

    one_unit  = dt:unit[i];

    one_param = dt:param_1[i];

    if (one_unit == old_unit & is missing(one_param),

        dt:param_1[i] = old_param;

    );

    if (one_unit != old_unit,

        old_unit = one_unit;

        old_param = dt:param_1[i];

    );

);

pmroz

Super User

Joined:

Jun 23, 2011

Tested my code against a 40,000 row table and it returned in less than a second.

lutze01_hotmail

Community Trekker

Joined:

Dec 15, 2014

Thanks a lot. Will test the code first thing in office today.