
Performance of Tables > Tabulate?

I've never really had trouble with the Tabulate platform until now. I'm tabulating a table that can have anywhere from 40,000 to 800,000 rows, with these columns:

Database | Drug | Event | Number of Events

The Database column takes values from 1 to 20. I'm creating a tabulation that looks like this:


Tabulate gets really slow when there are a lot of rows, perhaps because the row variables Database and Event are nested.

I compared the performance to doing a pivot table in Microsoft Excel, and it was no contest: Excel won hands down!  A 400,000 row table in JMP took 3-6 minutes, while the same thing in Excel took less than a second!

Here's a chart showing the performance using two different PCs.  JMP 9 vs 10 made no difference.

2013_Response Times.png

Are there any tricks to speeding up tabulate?  I've attached some code you can play around with.  Just random numbers but you'll get the idea.  The first program creates the table, and the second one runs the tabulation.



Super User mpb

Re: Performance of Tables > Tabulate?

I just took the briefest look at this. For reference, I generated a 500,000-row table and used Tables > Summary with grouping variables Database and Event, subgrouping variable Drug, and sum variable Drug. This took about 4 seconds on a T410 / i5 64-bit system running 32-bit JMP. I started the Tabulate script but cancelled it when I saw it was running long, so I don't have a result, but it would probably be as you said. I don't know whether a Summary-based solution would be helpful to you, but the difference is interesting to see. I wonder if the slowness of Tabulate is due to the formatting.
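
To make the aggregation concrete outside JMP, here is a rough Python sketch of what a Summary-style result would contain: one row per (Database, Event) pair, one column per Drug, and the summed event counts in the cells. It is only an illustration, not JMP's implementation. It assumes the summed column is Number of Events, and the sample rows and the `summarize` helper are made up.

```python
from collections import defaultdict

def summarize(rows):
    """Sum event counts per (Database, Event) group, with one entry per Drug.

    `rows` is an iterable of (database, drug, event, n_events) tuples,
    mirroring the four columns described in the original post.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for database, drug, event, n_events in rows:
        # Single pass over the data: one dictionary update per input row.
        totals[(database, event)][drug] += n_events
    # Convert to plain dicts, ordered by (database, event).
    return {key: dict(by_drug) for key, by_drug in sorted(totals.items())}

# Hypothetical sample data, not taken from the thread.
rows = [
    (1, "DrugA", "Headache", 3),
    (1, "DrugB", "Headache", 2),
    (1, "DrugA", "Headache", 4),
    (2, "DrugA", "Nausea", 1),
]

table = summarize(rows)
for (database, event), by_drug in table.items():
    print(database, event, by_drug)
```

The aggregation itself is a single pass over the rows, so its cost grows linearly with the row count. That fits the guess above that Tabulate's extra time goes somewhere other than the summing.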