Does anyone have any tips for making a large table faster to work with? I'm working with about 300M rows and I'm wondering whether there's anything I can do so it doesn't have to work as hard.
For instance, I tried turning a few condition columns (valued 1, 2, 3, 4,...) from character to numeric, thinking that ints would be smaller than chars, but it actually made the table slower and bigger. Are some data/modeling types intrinsically smaller than others? Or really, is there anything anyone has found that might make working with this thing easier?
I'm currently trying to bin my data columns with int values of 0-255 and just have a frequency column. But if I need the full resolution data, is there anything I can do?
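For illustration, the binning idea described above can be sketched in Python. This is not JMP code: the function name `bin_with_frequency`, the range limits `lo` and `hi`, and the sample values are all hypothetical, and equal-width bins are an assumption. It maps each value to a bin index from 0 to 255 and keeps one row per occupied bin with a frequency count.

```python
from collections import Counter

def bin_with_frequency(values, lo, hi, n_bins=256):
    """Map each value to an equal-width bin index (0..n_bins-1) and count rows per bin."""
    width = (hi - lo) / n_bins
    counts = Counter()
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so that v == hi (or out-of-range values) lands in an edge bin.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    # One (bin index, frequency) pair per occupied bin, sorted by bin index.
    return sorted(counts.items())

# Hypothetical sample data on the range 0 to 10.
sample = [0.0, 0.1, 0.1, 5.0, 9.99, 10.0]
table = bin_with_frequency(sample, lo=0.0, hi=10.0)
print(table)
```

The summarized table has at most 256 rows per binned column no matter how many rows the source has, which is where the saving comes from; the cost is that values inside a bin can no longer be told apart.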
jeff.perkinson's strategy is the right one to use. The key is to get the byte size correct, which "Compress Selected Columns" does for you. You can also reduce the file size on disk by using the "Compress File on Save" feature under the upper-left-most red triangle in the scripts area of the data table. A third option that I like for working with text-based categories is this add-in: https://community.jmp.com/docs/DOC-6111
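To see why byte size matters, here is a rough sketch in Python rather than JMP. It assumes that an uncompressed numeric column stores each value in 8 bytes while a compressed column of small integer codes can use 1 byte per value; the column contents and row count are hypothetical.

```python
from array import array

# Hypothetical condition column: small integer codes 1-4, one value per row.
codes = [1, 2, 3, 4] * 250_000  # 1,000,000 rows

as_double = array("d", codes)  # 8-byte floating point per value
as_int8 = array("b", codes)    # 1-byte signed integer per value (range -128..127)

bytes_double = as_double.itemsize * len(as_double)
bytes_int8 = as_int8.itemsize * len(as_int8)

print(bytes_double, bytes_int8)
```

Under those assumptions the same column takes eight times the storage at full width, which would be consistent with a table getting bigger after a character-to-numeric conversion that did not also pick a narrow byte size.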
I'm going to leverage the fact that I've seen some posts where you are scripting. If you're already familiar with the types of data present and have some ideas of the reports/analyses you wish to perform on the larger dataset, why not try the invisible option on open? If you open a data table invisible (and keep it invisible), that will save the I/O to the display.
Nate & Vince,
I have tried that out on a large data table previously. Apart from making a data table invisible, one can also try making the data table private for further optimization (again, there can be trade-offs).
A small part of the following webcast by Brady shows the same.