I have a large CSV file containing more than 50 million rows of data covering all 50 US states. How can I import only part of the data into JMP (e.g., the data for California only)?
One way to do it is to open the CSV file in JMP and then filter down to the California rows, but this takes forever. So I was wondering if there is a way to filter the data *before* importing it into JMP? I guess this can be done with the Query Builder (right?), but does that work on a CSV file? Please advise.
Give Multiple File Import a try if the CSV import is slow for you. It won't solve the filtering issue, but it might be quite a bit faster, even with just one file. The filtering should be quite fast once the data is loaded.
Or, keep the source script from the CSV import and use the script next time. CSV import (unlike multiple file import) makes an extra pass over the data to discover which columns are character and numeric. With the source script, it doesn't need to make the extra pass.
Also, if the data table is too big to fit in memory (8GB? 16GB?), then it is going to be slow, and adding memory might be the best choice. If your machine has a disk activity light, keep an eye on it, or use Task Manager (or a similar tool) to see whether the disk is 100% busy.
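If filtering before the import is really what you want, one option outside JMP is to stream the CSV through a small script that keeps only the rows you need and writes them to a smaller file, then import that file as usual. Below is a minimal Python sketch of that idea. It assumes the file has a header row, is UTF-8 encoded, and has a column literally named "State" whose values are spelled "California"; all of those are guesses about your data, and `filter_csv` is just a name made up for this example. It reads one row at a time, so the full file never has to fit in memory.

```python
import csv


def filter_csv(src_path, dst_path, column="State", value="California"):
    """Copy the header plus every row whose `column` equals `value`.

    The column name and value defaults are assumptions about the data;
    adjust them to match the real header. Returns the number of rows kept.
    """
    kept = 0
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader)        # first row is assumed to be the header
        col = header.index(column)   # raises ValueError if the column is missing
        writer.writerow(header)

        # Stream the rest of the file one row at a time.
        for row in reader:
            # Skip short or malformed rows that lack the target column.
            if len(row) > col and row[col] == value:
                writer.writerow(row)
                kept += 1
    return kept
```

You would call it as something like `filter_csv("all_states.csv", "california.csv")` (both file names are placeholders) and then open the smaller output file in JMP. It still has to read all 50 million rows once, so it mainly helps when memory is the bottleneck or when you need the same subset repeatedly.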