Ressel
Level VI

Preferred data format for tables with many rows?

Apologies, apparently a noob question (again...).

 

We are finally working towards performing some analyses on the online sensor ("tag") data we have been collecting for years in our manufacturing plant. The standard time resolution for these tags is 1 second, and there are about 1,400 tags to be sifted through for interesting information. But to get to that stage, I believe I need to suggest a data format to our Digitalization team. I believe they expect everyone who is not on their team (which I'm not) to work with CSV files, but I suspect this format would not cope with approximately 30 million rows, which roughly corresponds to the number of seconds in one year. While this time resolution may not be required, I think it is still worth approaching this as if it were. Even 1-minute granularity would mean approximately 0.5M rows per year, which still appears inconvenient to handle as CSV.

 

For handling large tables, i.e., importing them into JMP, is JSON the preferred approach? Or would it perhaps be better to connect to a database and load the data directly into JMP from there? Any feedback or comments I can relay to our Digitalization team on this issue are highly appreciated. Thank you very much in advance, fellow JMPers!

2 ACCEPTED SOLUTIONS
jthi
Super User

Re: Preferred data format for tables with many rows?

If possible, save the data to a proper database. It is then most likely easy to get it from the database into JMP.

-Jarmo
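
For reference, a minimal JSL sketch of pulling tag data from a database over ODBC into a JMP table; the DSN, credentials, and table/column names below are placeholders, not a specific recommendation:

// Connect via ODBC and pull one year of readings into a JMP data table
// "SensorDB", the credentials, and the table/column names are hypothetical
dt = Open Database(
    "DSN=SensorDB;UID=myuser;PWD=mypassword;",
    "SELECT tag_id, ts, value
     FROM tag_readings
     WHERE ts >= '2023-01-01' AND ts < '2024-01-01'",
    "Tag Readings 2023"
);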


hogi
Level XII

Re: Preferred data format for tables with many rows?

I often import data with millions of rows from CSV. The limit is just the data export of the source application, which is at 2 GB.

If I have to handle larger data, I split it into pieces and load several CSV files.

So, the 0.5 million rows (which you mention for the 1-minute granularity) are definitely no issue via CSV.
Curious to hear if you also manage to get the 30 million into JMP.

If not as a single piece, then perhaps in chunks of 5 million?
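
If it helps, here is a rough JSL sketch of that chunked import; the folder path is a placeholder, and it assumes all chunks share the same column layout:

// Open every CSV chunk in a folder and stack them into one JMP table
// "C:/data/tag_chunks/" is a hypothetical location
folder = "C:/data/tag_chunks/";
files = Files In Directory( folder );
tables = {};
For( i = 1, i <= N Items( files ), i++,
    If( Ends With( files[i], ".csv" ),
        Insert Into( tables, Open( folder || files[i], Invisible ) )
    )
);
// Append chunks 2..n onto the first chunk, closing each as it is consumed
dt = tables[1];
For( i = 2, i <= N Items( tables ), i++,
    dt << Concatenate( tables[i], Append to first table );
    Close( tables[i], No Save );
);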


3 REPLIES

Ressel
Level VI

Re: Preferred data format for tables with many rows?

@hogi , sorry, my bad. I thought that CSV had the same limitations as Excel, but this is not true. Thanks for pointing that out. I still think, though, that having to split a 30M-row dataset into chunks is a potential source of inconvenience, although I take it this is likely also something that can be handled.