
How can I make JSL read binary files faster?

lala
Level VIII

A binary file contains tabular data.
Each record (row) is 144 bytes.
The first column is an 8-byte millisecond timestamp.
Each subsequent column is 4 bytes wide.
The key point is that I only need to extract some of the columns.

2025-01-27_13-24-46.png
I wrote a JSL script that reads the data, but it is slow.
Could an expert help me speed it up?

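co = 144; // bytes per record
bb = Load Text File( file, blob() ); // file: path to the binary file
z = Length( bb ) / co; // number of records
gs = 9;
m = 0;
g = 1;
ar = J( z, g + gs, . ); // preallocated result matrix, filled row by row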
For( k = 1, k <= z, k++,
	ff = Blob Peek( bb, (k - 1) * co + 0, 8 );
	a = "";
	Try( a = Blob To Matrix( ff, "uint", 8, "little", 8 )[1] );
	no = Length( Char( a ) );
	If( no > 0,
		Try(
			nn = Format( a / 1000 + Informat( "01/01/1970", "mm/dd/yyyy" ) + In Hours( 8 ), "yyyy-mm-ddThh:mm:ss" );
			b = Num( Munger( nn, 12, 2 ) || Munger( nn, 15, 2 ) || Munger( nn, 18, 2 ) );
		);
		If( 91500 <= b < 92600,
			m = m + 1;
			ar[m, 1] = Num( Munger( nn, 1, 4 ) || Munger( nn, 6, 2 ) || Munger( nn, 9, 2 ) );
			ar[m, 2] = b;

			ff = Blob Peek( bb, (k - 1) * co + 64, 4 );
			ar[m, g + 2] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*BM1*/;
			ff = Blob Peek( bb, (k - 1) * co + 68, 4 );
			ar[m, g + 3] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*BM2*/;
			ff = Blob Peek( bb, (k - 1) * co + 84, 4 );
			ar[m, g + 4] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*BL1*/;
			ff = Blob Peek( bb, (k - 1) * co + 88, 4 );
			ar[m, g + 5] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*BL2*/;
			ff = Blob Peek( bb, (k - 1) * co + 104, 4 );
			ar[m, g + 6] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*MM1*/;
			ff = Blob Peek( bb, (k - 1) * co + 108, 4 );
			ar[m, g + 7] = Blob To Matrix( ff, "int", 4, "little", 1 )[1] / 1000/*MM2*/;
			ff = Blob Peek( bb, (k - 1) * co + 124, 4 );
			ar[m, g + 8] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*ML1*/;
			ff = Blob Peek( bb, (k - 1) * co + 128, 4 );
			ar[m, g + 9] = Blob To Matrix( ff, "int", 4, "little", 1 )[1]/*ML2*/;
		);
	,
		Break()
	);
);

Thanks Experts!

8 Replies
lala
Level VIII


Re: How can make JSL read binary files faster?

file

 

jthi
Super User


Re: How can make JSL read binary files faster?

What counts as faster? How fast should it be?

-Jarmo
Craige_Hales
Super User


Re: How can make JSL read binary files faster?

Use the JSL profiler to find the hotspots.

 

The profiler will probably show the blobpeek and blobtomatrix functions very bright red. It looks like they are called millions of times for 4-byte blobs.

Try this instead: make one giant Blob To Matrix call to convert the entire file into four-byte integers. That should take only a few seconds. Use JMP's matrix operations to take the data you need from that. You can probably do the entire conversion without any For loops.
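
A minimal sketch of that suggestion, not the script that was eventually used. It assumes `file` is the path variable from the original post and that the file length is an exact multiple of the 144-byte record size; the column numbers are derived from the byte offsets in the original loop.

```jsl
// Sketch only: "file" is the path variable from the original post, and the
// file length is assumed to be an exact multiple of the 144-byte record size.
bb = Load Text File( file, blob() );

// 144 bytes per record / 4 bytes per value = 36 columns, one row per record.
x = Blob To Matrix( bb, "int", 4, "little", 36 );

// A byte offset inside the record maps to matrix column (offset / 4) + 1.
// Row index 0 means "all rows", so each line pulls one whole column at once.
bm1 = x[0, 17] / 1000; // BM1, byte offset 64
bm2 = x[0, 18] / 1000; // BM2, byte offset 68
bl1 = x[0, 22];        // BL1, byte offset 84
bl2 = x[0, 23];        // BL2, byte offset 88
mm1 = x[0, 27] / 1000; // MM1, byte offset 104
mm2 = x[0, 28] / 1000; // MM2, byte offset 108
ml1 = x[0, 32];        // ML1, byte offset 124
ml2 = x[0, 33];        // ML2, byte offset 128

Show( N Rows( x ), N Cols( x ) );
```

Columns 1 and 2 of this matrix hold the two halves of the 8-byte timestamp and are not meaningful as signed 4-byte values on their own.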

 

If needed, you can make another giant matrix of 4- or 8- byte ints, uints, floats. Only use the columns that have the correct binary data. I'd probably just combine two 4-byte ints to make the 8-byte value, something like x[row,1] + ( x[row,2]*(2^32) ).  I did not study what you were doing with them, so that might not be what you need.
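
A hedged sketch of the two-halves idea, assuming the same `file` variable and 36-column layout as above. The matrix is read as unsigned here because the low 32 bits of a millisecond timestamp often have the top bit set and would come out negative as a signed int. The 8-hour offset is copied from the original post.

```jsl
// Sketch only: "file" is the path variable from the original post.
bb = Load Text File( file, blob() );

// Second whole-file conversion, this time as unsigned 4-byte values.
u = Blob To Matrix( bb, "uint", 4, "little", 36 );

// Little-endian: column 1 holds the low 32 bits, column 2 the high 32 bits.
ms = u[0, 1] + u[0, 2] * (2 ^ 32);

// JMP date-times are seconds since 1904, so shift from the Unix epoch.
// The 8-hour time-zone offset is the one used in the original script.
t = ms / 1000 + Date DMY( 1, 1, 1970 ) + In Hours( 8 );

// Spot-check the first record.
Show( Format( t[1], "yyyy-mm-ddThh:mm:ss" ) );
```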

 

Also: when you use Blob To Matrix for 8-byte integers, JMP will only be able to keep 52 of the 64 bits. Since columns 6 and 7 are zero and are the high bits in an 8-byte little-endian number, it will probably work out OK. (A double precision floating point number only has 52 bits of fraction.)
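
A quick check of that reasoning, as a sketch. With the 52 fraction bits plus the implicit leading bit, a double holds every integer up to 2^53 exactly. A millisecond timestamp from 2025 is roughly 1.7e12, far below both 2^48 (the limit when the top two bytes are zero) and 2^53.

```jsl
// Milliseconds from 1970-01-01 to 2025-01-01 (JMP dates are in seconds).
ms2025 = (Date DMY( 1, 1, 2025 ) - Date DMY( 1, 1, 1970 )) * 1000;

// 2^48: largest range when bytes 6 and 7 are zero.
// 2^53: integers up to here are exact in a double.
Show( ms2025, 2 ^ 48, 2 ^ 53 );

// 1 means the timestamp fits in the low six bytes, so no precision is lost.
Show( ms2025 < 2 ^ 48 );
```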

 

Craige
lala
Level VIII


Re: How can make JSL read binary files faster?

Thanks, Craige!

Because the data at the beginning of the file is incomplete, I used the contents of the last row.

This code uses two splits and two Blob To Matrix calls.

I think that is what you meant.

 

2025-01-28_17-03-43.png
2025-01-29_14-38-03.png

lala
Level VIII


Re: How can make JSL read binary files faster?

pa = "I:\E\20250127.dat";
bb = Load Text File( pa, blob() );
co = 144; // bytes per record
gs = 9;
m = 0;
g = 1;
z = Length( bb ) / co; // number of records
ar = J( z, g + gs, . );
For( k = 1, k <= z, k++,
	// 8-byte millisecond timestamp at the start of the record
	ff = Blob Peek( bb, (k - 1) * co + 0, 8 );
	a = Blob To Matrix( ff, "uint", 8, "little", 8 )[1];
	nn = Format( a / 1000 + Informat( "01/01/1970", "mm/dd/yyyy" ) + In Hours( 8 ), "yyyy-mm-ddThh:mm:ss" );
	b = Num( Munger( nn, 12, 2 ) || Munger( nn, 15, 2 ) || Munger( nn, 18, 2 ) );
	m = m + 1;
	ar[m, 1] = Num( Munger( nn, 1, 4 ) || Munger( nn, 6, 2 ) || Munger( nn, 9, 2 ) );
	ar[m, 2] = b;
	// remaining co - 8 = 136 bytes of the record as 34 four-byte values, in one call
	ff = Blob Peek( bb, (k - 1) * co + 8, co - 8 );
	br = Blob To Matrix( ff, "uint", 4, "little", 34 )[0, 1 :: 34];
	// z1 = Length( bb ); z2 = (z1 -  / 4;  (incomplete in the original post; neither value is used)
	ar[m, g + 2] = br[1, 15] / 1000;
	ar[m, g + 3] = br[1, 16] / 1000;
	ar[m, g + 4] = br[1, 20];
	ar[m, g + 5] = br[1, 21];
	ar[m, g + 6] = br[1, 25] / 1000;
	ar[m, g + 7] = br[1, 26] / 1000;
	ar[m, g + 8] = br[1, 30];
	ar[m, g + 9] = br[1, 31];
);
lala
Level VIII


Re: How can make JSL read binary files faster?

OK

Thanks, Craige!

2025-01-30_20-16-46.png

Craige_Hales
Super User


Re: How can make JSL read binary files faster?

Nice! I bet it was a lot faster too! Not a single loop in JSL; all the looping happens internally in C++, where it is fast.

For anyone looking and wondering:

2: reads the entire file into a blob.

3: converts the entire blob into a matrix. In particular, the matrix has 36 columns from the original binary data that had 144 bytes per line; the values are all 4-byte integers. 144/4=36.

4: z appears unused; I *think* it may be ~20 million.

5-16: makes the data table columns.

18-19: these innocent-looking statements are grabbing tall columns of data from the matrix.

20: attempts to make a 64-bit integer from two 32-bit integers, see comments previously about 52 bits. Again, this is an entire column in one step.

21-30: copy and transform entire columns from the matrix to the table.

This is pretty cool because there are no floating point or text values in the data. Floats could be handled with a second matrix of float 4 or 8. Text is harder but still possible; different choices depending on 7-bit, 8-bit, Unicode character sets and fixed length or variations of variable length strings.
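
The script walked through above was posted only as a screenshot, so here is a rough sketch of the same shape rather than lala's actual code. It assumes the `file` path variable and record layout from the original post, keeps only rows between 09:15:00 and 09:26:00 (the `91500 <= b < 92600` test from the first script), and builds a small table. The table and column names are made up for illustration.

```jsl
// Sketch only: "file" is the path variable from the original post.
bb = Load Text File( file, blob() );
x = Blob To Matrix( bb, "int", 4, "little", 36 );  // signed data columns
u = Blob To Matrix( bb, "uint", 4, "little", 36 ); // unsigned, for the timestamp halves

// Whole-column timestamp in JMP seconds (8-hour offset as in the original).
t = (u[0, 1] + u[0, 2] * (2 ^ 32)) / 1000 + Date DMY( 1, 1, 1970 ) + In Hours( 8 );

// Seconds since midnight; JMP's time zero falls on a midnight.
tod = t - 86400 * Floor( t / 86400 );

// Row numbers with 09:15:00 <= time of day < 09:26:00.
// :* is the elementwise product, used here as a logical AND of two 0/1 vectors.
keep = Loc( (tod >= 33300) :* (tod < 33960) );

// Assumes at least one row matched; an empty "keep" is not handled here.
dt = New Table( "extract (sketch)",
	New Column( "time", Numeric, "Continuous",
		Format( "yyyy-mm-ddThh:mm:ss" ), Set Values( t[keep] ) ),
	New Column( "BM1", Numeric, Set Values( x[keep, 17] / 1000 ) ),
	New Column( "BL1", Numeric, Set Values( x[keep, 22] ) )
);
```

The remaining columns would follow the same pattern, each one a single matrix expression with no For loop.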

Craige
lala
Level VIII


Re: How can make JSL read binary files faster?

It is much faster.

Thanks for the expert's advice.

The original 8-byte timestamp can be converted like this.