How can make JSL read binary files faster?
A binary file contains tabular data.
Each row is 144 bytes.
The first column is an 8-byte millisecond timestamp.
Each subsequent column is 4 bytes wide.
The key point is that I only need to extract some of the columns.
I wrote a JSL script that reads the data, but it is slow.
Could an expert help me speed it up?
co = 144; // bytes per row
bb = Load Text File( file, blob() ); // file holds the path to the binary file
z = Length( bb ) / co; // number of rows
gs = 9;
m = 0;
g = 1;
k = 1;
ar = J( z, g + gs, . );//20230815
For( k = 1, k <= z, k++,
	// 8-byte millisecond timestamp at the start of row k
	ff = Blob Peek( bb, (k - 1) * co + 0, 8 );
	a = "";
	Try( a = Blob To Matrix( ff, "uint", 8, "little", 8 )[1] );
	no = Length( Char( a ) );
	If( no > 0,
		Try(
			nn = Format( a / 1000 + Informat( "01/01/1970", "mm/dd/yyyy" ) + In Hours( 8 ), "yyyy-mm-ddThh:mm:ss" );
			b = Num( Munger( nn, 12, 2 ) || Munger( nn, 15, 2 ) || Munger( nn, 18, 2 ) ); // time of day as hhmmss
		);
		If( 91500 <= b < 92600,
			m = m + 1;
			ar[m, 1] = Num( Munger( nn, 1, 4 ) || Munger( nn, 6, 2 ) || Munger( nn, 9, 2 ) ); // date as yyyymmdd
			ar[m, 2] = b;
			ff = Blob Peek( bb, (k - 1) * co + 64, 4 );
			ar[m, g + 2] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*BM1*/;
			ff = Blob Peek( bb, (k - 1) * co + 68, 4 );
			ar[m, g + 3] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*BM2*/;
			ff = Blob Peek( bb, (k - 1) * co + 84, 4 );
			ar[m, g + 4] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*BL1*/;
			ff = Blob Peek( bb, (k - 1) * co + 88, 4 );
			ar[m, g + 5] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*BL2*/;
			ff = Blob Peek( bb, (k - 1) * co + 104, 4 );
			ar[m, g + 6] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*MM1*/;
			ff = Blob Peek( bb, (k - 1) * co + 108, 4 );
			ar[m, g + 7] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*MM2*/;
			ff = Blob Peek( bb, (k - 1) * co + 124, 4 );
			ar[m, g + 8] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*ML1*/;
			ff = Blob Peek( bb, (k - 1) * co + 128, 4 );
			ar[m, g + 9] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*ML2*/;
		);
	,
		Break()
	);
);
Thanks Experts!
Re: How can make JSL read binary files faster?
file
Re: How can make JSL read binary files faster?
What counts as faster? How fast should it be?
Re: How can make JSL read binary files faster?
Use the JSL profiler to find the hotspots.
The profiler will probably show the Blob Peek and Blob To Matrix calls in very bright red. It looks like they are called millions of times on 4-byte blobs.
Try this instead: make one giant Blob To Matrix call to convert the entire file into four-byte integers. That should take only a few seconds. Then use JMP's matrix operations to pull out the data you need. You can probably do the entire conversion without any For loops.
If needed, you can make another giant matrix of 4- or 8-byte ints, uints, or floats, and use only the columns that hold the matching binary data. I'd probably just combine two 4-byte ints to make the 8-byte value, something like x[row,1] + ( x[row,2]*(2^32) ). I did not study what you were doing with them, so that might not be what you need.
Also: when you use Blob To Matrix for 8-byte integers, JMP can keep only about 52 of the 64 bits, because a double-precision floating-point number has only 52 bits of fraction. Since bytes 6 and 7 are zero and are the high-order bytes of an 8-byte little-endian number, it will probably work out OK.
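As a rough illustration of the single-conversion idea, here is a minimal JSL sketch. It builds a tiny synthetic blob with Matrix To Blob so that it can run without the real file. The names raw, x, ts_ms, and bm1 are made up for the example, and the column numbers assume the layout from the original post (144 bytes per row, so byte offset 64 lands in column 64/4 + 1 = 17). It also assumes every value is non-negative, since the whole blob is read as "uint".

```jsl
co = 144;                                   // bytes per row, from the original post
nc = co / 4;                                // 36 four-byte values per row

// Synthetic stand-in for the file: two rows, built as one long row vector.
// With real data: bb = Load Text File( path, blob() );
raw = J( 1, 2 * nc, 0 );
raw[1] = 1000;                              // row 1: low timestamp word
raw[2] = 1;                                 // row 1: high timestamp word
raw[17] = 12345;                            // row 1: value at byte offset 64
raw[nc + 1] = 2000;                         // row 2: low timestamp word
raw[nc + 2] = 1;                            // row 2: high timestamp word
raw[nc + 17] = 67890;                       // row 2: value at byte offset 64
bb = Matrix To Blob( raw, "uint", 4, "little" );

// One conversion for the whole blob: N rows by 36 columns.
x = Blob To Matrix( bb, "uint", 4, "little", nc );

// Whole-column operations, no For loop.
ts_ms = x[0, 1] + x[0, 2] * (2 ^ 32);       // little-endian: low word comes first
bm1 = x[0, 17] / 1000;                      // byte offset 64 -> column 64/4 + 1

Show( N Rows( x ), N Cols( x ), ts_ms, bm1 );
```

If some columns can hold negative values, a second Blob To Matrix call over the same blob with "int" instead of "uint" is one option, taking only those columns from the second matrix.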
Re: How can make JSL read binary files faster?
Thanks, Craige!
Because the data at the beginning of the file is incomplete, I used the contents of the last line.
This code does two splits and two Blob To Matrix calls per row.
I guess this is what you meant.
Re: How can make JSL read binary files faster?
pa = "I:\E\20250127.dat";
bb = Load Text File( pa, blob() );
co = 144; // bytes per row
gs = 9;
m = 0;
g = 1;
k = 1;
z0 = Length( bb );
z = z0 / co; // number of rows
ar = J( z, g + gs, . );
For( k = 1, k <= z, k++,
	// first split: the 8-byte millisecond timestamp
	ff = Blob Peek( bb, (k - 1) * co + 0, 8 );
	a = Blob To Matrix( ff, "uint", 8, "little", 8 )[1];
	nn = Format( a / 1000 + Informat( "01/01/1970", "mm/dd/yyyy" ) + In Hours( 8 ), "yyyy-mm-ddThh:mm:ss" );
	b = Num( Munger( nn, 12, 2 ) || Munger( nn, 15, 2 ) || Munger( nn, 18, 2 ) );
	m = m + 1;
	ar[m, 1] = Num( Munger( nn, 1, 4 ) || Munger( nn, 6, 2 ) || Munger( nn, 9, 2 ) );
	ar[m, 2] = b;
	// second split: the remaining 136 bytes of the row (34 four-byte values)
	ff = Blob Peek( bb, (k - 1) * co + 8, co - 8 );
	br = Blob To Matrix( ff, "uint", 4, "little", 34 )[0, 1 :: 34];
	// z1 = Length( bb ); z2 = (z1 - / 4;  (incomplete in the original post; z1 and z2 are not used below)
	ar[m, g + 2] = br[1, 15] / 1000;
	ar[m, g + 3] = br[1, 16] / 1000;
	ar[m, g + 4] = br[1, 20];
	ar[m, g + 5] = br[1, 21];
	ar[m, g + 6] = br[1, 25] / 1000;
	ar[m, g + 7] = br[1, 26] / 1000;
	ar[m, g + 8] = br[1, 30];
	ar[m, g + 9] = br[1, 31];
);
Re: How can make JSL read binary files faster?
OK
Thanks, Craige!
Re: How can make JSL read binary files faster?
Nice! I bet it was a lot faster too! Not a single loop in JSL; all the looping happens internally in C++, where it is fast.
For anyone looking and wondering, by script line number:
2: reads the entire file into a blob.
3: converts the entire blob into a matrix. In particular, the matrix has 36 columns from the original binary data that had 144 bytes per line; the values are all 4-byte integers. 144/4=36.
4: z appears unused; I *think* it may be ~20 million.
5-16: makes the data table columns.
18-19: these innocent-looking statements are grabbing tall columns of data from the matrix.
20: attempts to make a 64-bit integer from two 32-bit integers, see comments previously about 52 bits. Again, this is an entire column in one step.
21-30: copy and transform entire columns from the matrix to the table.
This is pretty cool because there are no floating point or text values in the data. Floats could be handled with a second matrix of float 4 or 8. Text is harder but still possible; different choices depending on 7-bit, 8-bit, Unicode character sets and fixed length or variations of variable length strings.
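The walkthrough above ends with whole columns being filtered and copied into a data table. The following is not the poster's script; it is only a sketch of how that last part might look. ts_ms and bm1 are hypothetical stand-ins for columns already pulled from the big matrix, and the 09:15:00 to 09:26:00 window and the UTC+8 shift are taken from the first script in the thread.

```jsl
// Hypothetical sample columns (column vectors), as if taken from the big matrix.
ts_ms = [1737940500000, 1737940600000, 1737943600000]; // ms since 1970-01-01 UTC
bm1 = [1.5, 2.5, 3.5];

// Seconds since local midnight, computed for the whole column at once.
sec = Floor( ts_ms / 1000 ) + 8 * 3600;    // shift to UTC+8
tod = Mod( sec, 86400 );

// Row numbers inside the window 09:15:00 <= time < 09:26:00.
// 09:15:00 is 33300 seconds and 09:26:00 is 33960 seconds after midnight.
keep = Loc( (tod >= 33300) :* (tod < 33960) );

// Copy only the kept rows into a table, one Set Values per column.
dt = New Table( "filtered",
	New Column( "seconds since midnight", Numeric, Set Values( tod[keep] ) ),
	New Column( "BM1", Numeric, Set Values( bm1[keep] ) )
);
```

The elementwise product of the two comparisons acts as a logical AND, so Loc returns only the rows that satisfy both bounds. If no row falls inside the window, keep is empty and the subscripting would need a guard.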
Re: How can make JSL read binary files faster?
It is much faster.
Thanks for the expert advice.
The original 8-byte timestamp can be converted like this.
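A minimal sketch of one possible conversion, assuming the timestamp is milliseconds since 1970-01-01 UTC and the desired local time is UTC+8, as in the first script. JMP datetimes are seconds counted from 1904-01-01, so the 1970 epoch is added as an offset. ts_ms is a made-up sample vector, and epoch1970 and local are names chosen for the example.

```jsl
ts_ms = [1737940500000, 1737940600000];     // sample values, column vector
epoch1970 = Date MDY( 1, 1, 1970 );         // 1970-01-01 as a JMP datetime
local = ts_ms / 1000 + epoch1970 + In Hours( 8 ); // whole column in one step

// Format one value the same way the original loop did.
Show( Format( local[1], "yyyy-mm-ddThh:mm:ss" ) );
```

Because the arithmetic runs on the whole vector, the result can be written to a data table column directly and given a date-time format there, rather than formatting each row as text.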