EdgeFrog719
Level I

JMP Multiple Series Stack memory Issue

Hi, I'm trying to do a Multiple Series Stack on a data set (3.5M rows × 225 columns). It fails with an out-of-memory error on a machine with 32 GB of RAM. Is there a way to do this serially to avoid the memory issue and still get the expected data table?

1 ACCEPTED SOLUTION

Accepted Solutions
peng_liu
Staff

Re: JMP Multiple Series Stack memory Issue

A single data table with 3.5M rows × 225 columns = 787.5M cells costs me 12+ GB. The stacked table will have 3.5M × 225 = 787.5M rows and two columns (the second being the label), which will easily land somewhere around 20 GB, so the two tables won't co-exist in 32 GB of RAM.
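The arithmetic behind those estimates can be sketched as follows. This assumes 8 bytes per numeric cell, which is how JMP stores numeric data; the raw figures come out below the observed usage (12+ GB wide, ~20 GB stacked) because real tables carry per-table and per-column overhead on top.

```python
# Back-of-envelope memory estimate, assuming 8 bytes per numeric cell.
rows, cols = 3_500_000, 225
cells = rows * cols                       # 787,500,000 cells in the wide table
wide_gb = cells * 8 / 1e9                 # ~6.3 GB raw for the wide table

stacked_rows = cells                      # stacking yields one row per original cell
stacked_gb = stacked_rows * 2 * 8 / 1e9   # ~12.6 GB raw: data column + label column

print(cells, round(wide_gb, 1), round(stacked_gb, 1))  # 787500000 6.3 12.6
```

Even the raw stacked table alone is close to half of the 32 GB budget before overhead, which is why the wide and stacked tables cannot be held simultaneously.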

I suggest the following sequence:

  1. Take the original 3.5M-row × 225-column table and save each column to a separate file, one at a time, closing each new file after saving it. Every saved file has two columns: one is the series, the other is the indicator (label). Use a numeric type for the label; in your case, just use 1 through 225. (Optionally, try compressing the label column; a compressed column uses a 2-byte field.)
  2. Close the big original file.
  3. Open the first saved file. Then concatenate the other individual files, one at a time, with "Append to first table" checked, and close each individual file immediately after it has been appended.
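The sequence above can be sketched in Python with a toy table (in JMP itself you would do this interactively or in JSL; the file names and sizes here are purely illustrative). The point is the shape of the workflow: only one column slice or one part file is held alongside the growing result at any moment, never the full wide table and the full stacked table together.

```python
import csv
import os
import tempfile

workdir = tempfile.mkdtemp()

# Toy wide table standing in for the 3.5M-row x 225-column original.
wide = {"s1": [1, 2], "s2": [3, 4], "s3": [5, 6]}

# Step 1: save each column as its own two-column file (value + numeric label),
# closing each file as we go so only one slice is in flight at a time.
for i, (name, values) in enumerate(wide.items(), start=1):
    with open(os.path.join(workdir, f"part_{i}.csv"), "w", newline="") as f:
        csv.writer(f).writerows((v, i) for v in values)  # label is just 1..N

# Step 2: release the big wide table before building the stacked one.
del wide

# Step 3: stream the parts back, appending one file at a time and
# discarding each part as soon as it has been appended.
stacked = []
for i in range(1, 4):
    path = os.path.join(workdir, f"part_{i}.csv")
    with open(path, newline="") as f:
        stacked.extend([int(a), int(b)] for a, b in csv.reader(f))
    os.remove(path)

print(len(stacked))  # 6 rows = 2 original rows x 3 columns
```

In JMP, step 3 corresponds to Tables > Concatenate with "Append to first table" checked, which grows the first table in place instead of materializing a second full copy.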

I gave it a try, appending a 3.5M-row × 2-column small table to the 787.5M-row × 2-column big table. JMP's peak memory consumption sometimes jumped over 20 GB, but it survived, and dropped back to ~18 GB once the concatenation finished. I kept appending for another couple of rounds and observed similar behavior, and the memory saving is more substantial with compressed columns. Anyway, it seems worth a try. Good luck!
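The compressed-column saving is easy to bound. Assuming the uncompressed label column uses 8 bytes per cell and the compressed one uses the 2-byte field mentioned above, compressing the label column alone recovers several gigabytes:

```python
# Rough saving from compressing the label column of the stacked table,
# assuming 8-byte numeric cells uncompressed vs. a 2-byte compressed field.
rows = 3_500_000 * 225                # 787.5M rows in the stacked table
uncompressed_gb = rows * 8 / 1e9      # ~6.3 GB for the label column alone
compressed_gb = rows * 2 / 1e9        # ~1.6 GB with a 2-byte field
saved_gb = uncompressed_gb - compressed_gb
print(round(saved_gb, 1))             # ~4.7 GB saved
```

With labels of only 1 through 225, the values fit comfortably in a small field, so this saving comes essentially for free.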

BTW, how do you intend to analyze the data? What platform do you want to use?

