ts2
Level III

Table Summary in JMP 14 takes a very long time compared to JMP 12

I have a table with 2 columns, DATETIME and MESSAGE. The MESSAGE column holds strings of about 60 characters.

The table has ~3 million rows, with many duplicate rows because this is a concatenation of many files with some of the same data. I am attempting to get rid of the duplicate rows as fast as possible, which worked fine in JMP 12.2.0.


In JMP 12.2.0 the Summary takes ~8 seconds.
In JMP 14.3.0 I don't know how long the Summary takes, because I killed the process after more than 30 minutes.

 

 

dt2 = dt << Summary( Group( :DATETIME, :MESSAGE ), Invisible, Link to original Data Table( 0 ), Table Name( "MessageTable" ) );
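
To compare the two versions on equal footing, the same call can be wrapped in a simple timer. This is only a sketch: it assumes dt already references the two-column table described above, and the variable names start and elapsed are made up for illustration.

// Assumption: dt already points at the table with DATETIME and MESSAGE.
start = Tick Seconds();
dt2 = dt << Summary( Group( :DATETIME, :MESSAGE ), Invisible, Link to original Data Table( 0 ), Table Name( "MessageTable" ) );
// Elapsed wall-clock time in seconds, written to the log.
elapsed = Tick Seconds() - start;
Show( elapsed );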
Accepted Solutions
ts2
Level III

Re: Table Summary in JMP 14 takes a very long time compared to JMP 12

I have found a workaround. It takes about 16 seconds, which is much better.

It appears that Summary works OK with only one grouping column. With more than one grouping column, performance degrades severely.

So I combined the two columns into a single key column, ran the Summary on that, then parsed the original columns back out afterward.

Pretty ugly, but it works.

 

// Build a single key column so Summary only has to group on one column.
dt << New Column( "KEY", Character, Formula( Char( :DATETIME ) || "@" || :MESSAGE ) );
dt2 = dt << Summary( Group( :KEY ), Invisible, Link to original Data Table( 0 ), TableName( "MessageTable" ) );
// Split the key back into its two parts; this assumes MESSAGE never contains "@".
dt2 << New Column( "DateTime", Numeric, Format( "y/m/d h:m:s" ), Formula( Num( Words( :KEY, "@" )[1] ) ) );
dt2 << New Column( "Message", Character, Formula( Words( :KEY, "@" )[2] ) );
// Convert the formula columns to static values before removing the helper columns.
dt2:DateTime << Delete Formula;
dt2:Message << Delete Formula;
dt2 << Delete Columns( {"KEY", "N Rows"} );
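
One caveat with this workaround: Words() splits on every "@", so a message that itself contains "@" would come back truncated. A quick check along these lines, run before building the key, could confirm the delimiter is safe. It is a sketch that assumes dt is the original table; the variable name bad is made up for illustration.

// Assumption: dt is the original table with a character MESSAGE column.
// Collect the row numbers of any messages that contain the "@" delimiter.
bad = dt << Get Rows Where( Contains( :MESSAGE, "@" ) > 0 );
// A count of 0 means "@" is safe to use as the separator.
Show( N Rows( bad ) );

If the count is not zero, a different separator that never appears in the data could be used in both the KEY formula and the Words() calls.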

 
