I am working on big files that eat up a lot of computer memory when I open even a single file, causing my PC to be sluggish and sometimes hang. I was wondering if there is a JSL solution where a single file can be chunked into smaller files before opening. For example, File_A = 2 GB would be chunked into 4 smaller files of 500 MB each. I would then automate opening and aggregating/computing each chunked file, close the open chunked file, then concatenate the results to the next chunked file to be processed, and so the loop goes on.
Thanks in advance.
Hello, this can certainly be done.
Look at the "Subset" function in the Scripting Index - it will help you form smaller subsets of your large data table. Here is an example. You can then use the "Concatenate" message to join the result tables.
// Let's start with the assumption that there are 1000 rows in the dt table
// and say you want to subset 250 rows in each iteration
For( i = 1, i <= 4, i++,
	LL = (i - 1) * 250 + 1; // first row of this chunk
	UL = i * 250;           // last row of this chunk
	dt << Select Rows( Index( LL, UL ) ); // assuming "dt" is your master table
	dt1 = dt << Subset( Private, Selected Rows( 1 ), Selected Columns( 0 ) );
	// ...process dt1 here, then close it before the next iteration...
);
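To tie this back to the original aggregate-and-concatenate workflow, here is a rough sketch: each chunk is subset, a per-chunk summary is computed, the chunk is closed to free memory, and the summaries are concatenated at the end. The column name "X" and the Mean statistic are placeholders I made up for illustration; substitute your own columns and computations.

```jsl
// Sketch only: assumes a master table dt and a numeric column "X" (hypothetical)
chunkSize = 250;
nChunks = Ceiling( N Rows( dt ) / chunkSize );
results = {}; // one summary table per chunk
For( i = 1, i <= nChunks, i++,
	LL = (i - 1) * chunkSize + 1;
	UL = Min( i * chunkSize, N Rows( dt ) );
	dt << Select Rows( Index( LL, UL ) );
	chunk = dt << Subset( Private, Selected Rows( 1 ), Selected Columns( 0 ) );
	summary = chunk << Summary( Mean( :X ) ); // per-chunk aggregate
	Insert Into( results, summary );
	Close( chunk, No Save ); // free memory before the next chunk
);
// Stack the per-chunk summaries into the first one
all = results[1];
For( i = 2, i <= N Items( results ), i++,
	all << Concatenate( results[i], Append to first table )
);
```

The key point is closing each chunk with No Save before opening the next one, so only one chunk plus the small summary tables are in memory at any time.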
I would also like to add that this is not the only approach that can speed up processing of this data. If you were to make the table "invisible" or "private", that can significantly speed up the process.
dt << Show Window( 0 ); // hide the data table window to avoid display overhead
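You can also open the table without a window in the first place, which avoids the display cost entirely. A small illustration (the sample-data path is just an example):

```jsl
// "Invisible" keeps the table listed in the Tables menu but draws no window;
// "Private" keeps it out of the GUI entirely (script access only)
dt = Open( "$SAMPLE_DATA/Big Class.jmp", Invisible );
// or: dt = Open( "$SAMPLE_DATA/Big Class.jmp", Private );
```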
If you don't want to load the whole table into memory before splitting it up, you can be creative with how you load the data. For flat files, specify the start position and the number of lines to read inside a loop:
dtProbe = Open( "$SAMPLE_DATA/Probe.jmp" );
dtProbe << Save( "$temp/deleteme123.txt" );
For( i = 1, i <= 2, i++,
	dt = Open(
		"$temp/deleteme123.txt",
		Columns(
			New Column( "security to create", Character, "Nominal" ),
			New Column( "resource", Character, "Nominal" ),
			New Column( "group", Character, "Nominal" ),
			New Column( "(ADM V-net)", Character, "Nominal" )
		),
		Import Settings(
			Fixed Column Widths( 19, 9, 6, 105 ),
			Strip Quotes( 0 ),
			Use Apostrophe as Quotation Mark( 0 ),
			Use Regional Settings( 0 ),
			Scan Whole File( 0 ),
			Treat empty columns as numeric( 0 ),
			CompressNumericColumns( 0 ),
			CompressCharacterColumns( 0 ),
			CompressAllowListCheck( 0 ),
			Labels( 1 ),
			Column Names Start( 1 ),
			Data Starts( 1 + 100 * (i - 1) ), // shift the 100-line window each pass
			Lines To Read( 100 ),
			Year Rule( "20xx" )
		)
	)
);
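One refinement: rather than hard-coding the loop limit (2 above), you can derive the number of chunks from the file's line count before looping. A sketch, assuming the flat file written above and 100-line chunks:

```jsl
// Count lines in the flat file to compute how many 100-line chunks exist
txt = Load Text File( "$temp/deleteme123.txt" );
nLines = N Items( Words( txt, "\!n" ) );
txt = ""; // release the raw text from memory
linesPerChunk = 100;
nChunks = Ceiling( (nLines - 1) / linesPerChunk ); // minus 1 for the header line
```

Note that Load Text File does pull the raw text into memory once to count lines; if even that is too large, you would need an external tool to count lines instead. Inside the loop, close each chunk table (Close( dt, No Save )) after aggregating it, so memory stays bounded by one chunk at a time.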