Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Insert list of column names into a formula

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Oct 31, 2013 1:48 PM
(8287 views)

Hi,

I am using JMP10. I am trying to create a script to delete duplicate rows of data in a data table. An example of my attempt is as follows:

// Start.

// Create data table.

dt = open("$SAMPLE_DATA/Big Class.jmp");

dt << Select Randomly(5);

subdt = dt << Subset(Output Table Name("subset"), Invisible);

dt << Concatenate(subdt, Append to First Table);

Close(subdt, No Save);

dt << Clear Select;

// Find and delete duplicate rows for bigClass.

icol = dt << New Column( "Index", Numeric, Continuous, Formula( Sequence( 1, N Row( dt ), 1, 1 ) ) );

repcol = dt << New Column( "Repeat Data",

Numeric,

Continuous,

Formula( If( :Index == Col Minimum( :Index, :name, :age, :sex, :height, :weight ), 1, 0 ) )

);

dt << Select Where( As Column(repcol) == 0 );

dt << Delete Rows();

dt << Delete Columns( {icol, repcol} );

// Find and delete duplicate rows for generic data table.

colnames = dt << Get Column Names();

icol = dt << New Column( "Index", Numeric, Continuous, Formula( Sequence( 1, N Row( dt ), 1, 1 ) ) );

repcol = dt << New Column( "Repeat Data",

Numeric,

Continuous,

Formula( If( :Index == Col Minimum( :Index, /* Insert colnames here. */ ), 1, 0 ) )

);

dt << Select Where( As Column(repcol) == 0 );

dt << Delete Rows();

dt << Delete Columns( {icol, repcol} );

// End.

My problem is I would like to make this script run on a generic data table. How can I insert my colnames list into my formula? Is there a more clever way to do this?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Brilliant idea to use Col Min() to find duplicates!

Below is an alternative approach based on your idea that appears to work. It is simpler because there is no need for any temporary formula columns.

// Create example data table.

Open**(** "$SAMPLE_DATA/Big Class.jmp" **)** << Concatenate**(** Open**(** "$SAMPLE_DATA/Big Class.jmp" **)**, Append to First Table **)**;

// Delete duplicates from generic table

dt = Current Data Table**()**;

col_min = Parse**(** "Col Min( Row(), :Name(\!"" || Concat Items**(** dt << get column names**(** string **)**, "\!"), :Name(\!"" **)** || "\!"))" **)**;

duplicates = **[]**;

For Each Row**(** If**(** col_min != Row**()**, duplicates |/= Row**()** **)** **)**;

dt << delete rows**(** duplicates **)**;

5 REPLIES 5

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Brilliant idea to use Col Min() to find duplicates!

Below is an alternative approach based on your idea that appears to work. It is simpler because there is no need for any temporary formula columns.

// Create example data table.

Open**(** "$SAMPLE_DATA/Big Class.jmp" **)** << Concatenate**(** Open**(** "$SAMPLE_DATA/Big Class.jmp" **)**, Append to First Table **)**;

// Delete duplicates from generic table

dt = Current Data Table**()**;

col_min = Parse**(** "Col Min( Row(), :Name(\!"" || Concat Items**(** dt << get column names**(** string **)**, "\!"), :Name(\!"" **)** || "\!"))" **)**;

duplicates = **[]**;

For Each Row**(** If**(** col_min != Row**()**, duplicates |/= Row**()** **)** **)**;

dt << delete rows**(** duplicates **)**;

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Insert list of column names into a formula

Thanks MS! Very clever. I will still have to study exactly how it works.

Highlighted
##

I was too quick. The first posted version would fail for complex column names that require the :Name("e.g. ∆X") construct. I edited the code above and now it should work more generally.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Insert list of column names into a formula

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Insert list of column names into a formula

It is clever to use Col Min() with an index column to locate duplicates.

I usually use Tables->Summary to get a data table free of duplicates. Just Group by all of the columns that would define a duplicate. This has the added advantage of providing a column (N Rows) that tells you how many duplicate rows there were in the original table.

-Jeff

-Jeff

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Insert list of column names into a formula

Thanks Jeff.