Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

- JMP User Community
- :
- Discussions
- :
- Copying many large formulas (neural nets) between data tables is very, very slow

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Copying many large formulas (neural nets) between data tables is very, very slow

Apr 22, 2020 8:12 PM
(639 views)

I have a dataset of 10000 rows x 100+ columns used for neural network modeling. I have made 75 different models of 20 nodes x 100 inputs/node and saved the formulas using fast formulas. I have 10 different tables (same dataset, different modeling), each with 75 models and I want to merge all of the models (formulas) into one master table, since I want to apply all of the models to any new data I collect.

Every method I try takes forever. I have used the Copy Formulas Add-in to copy each table's formulas to a single table and it takes about 2 days total to complete all of the table merges and with no indication of progress, it's hard to know if it is still working or has frozen.

I have used Join by matching rows with suppress formula evaluation to get them all into one table and this is fast. However, when I turn formula evaluation back on by manually selecting the formula columns and using Columns>Standardize Attributes>Column Properties>Formula>Suppress Eval, the recalculation of the 750 formulas takes forever. It has been running for 30 hours (maxed out at 12% processor utilization) and I have no idea what the progress is.

I have tried looping through the columns one at a time to turn formula evaluation on but even doing this for 1 column took over 5 minutes, which seems incredibly long.

Is there any way to speed this up? I know there are workarounds but this seems like it should be much easier and faster than this.

4 REPLIES 4

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Copying many large formulas (neural nets) between data tables is very, very slow

Created:
Apr 23, 2020 6:26 AM
| Last Modified: Apr 23, 2020 6:27 AM
(605 views)
| Posted in reply to message from jwalk 04-22-2020

With 'larger' problems it's hard to offer general advice about efficiency without knowing some or all of the details. If you haven't already, you could look at the impact of the 'Begin Data Update()' and 'End Data Update()' table messages, and/or controlling the way formulas are evaluated (or not). FWIW, my instinct tells me that you may do better by using the Formula Depot as an intermediary. I started to write some code to check this out but for problems of this size the modelling itself takes some time so I didn't continue down this path:

```
NamesDefaultToHere(1);
// Define the data volume
nr = 100;
nc = 10;
// Make some random data
dt = AsTable(J(nr, nc, RandomNormal(0, 1)));
dt << setName("Random Data");
// Fit neural and publish prediction formula
cols = dt << getColumnNames("String");
yCol = cols[NItems(cols)];
RemoveFrom(cols, NItems(cols));
nn = dt << Neural(Y(Eval(yCol)), X(Eval(cols)), Fit( NTanH( 3 ) ), Invisible);
nn(Fit(1)) << publishPredictionFormula;
Report(nn) << closeWindow;
```

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Copying many large formulas (neural nets) between data tables is very, very slow

Hi Ian,

Thanks for the rapid response. I just tried the Formula Depot method by Add Formula From Column with all of the formulas and the script creation was fast. When I tested Run Scripts (running 6 scripts) on a duplicate table with the Formula Columns removed, it was very fast. So far so good. When I then tried running all of the scripts on the same table, it started fast but then progressively slowed down as more columns were created (I could see the columns being created in the Columns pane). It eventually appeared to freeze (no new columns created in 5 minutes). Hmmmm. I am now going to try doing it in small batches to see if that helps.

Highlighted
##

I wonder if the formulas are all being recalculated simultaneously and are competing for processor time.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Copying many large formulas (neural nets) between data tables is very, very slow

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Copying many large formulas (neural nets) between data tables is very, very slow

I tried doing Run Script from the Formula Depot on 100 formulas at a time. The first 100 went reasonably fast. Second 100 slowed to a crawl and appeared to stop.

I tried copying the scripts for all formulas from Formula Depot, wrapping in <<Begin Data Update ... <<End Data Update.

I tried running Suppress Eval (0) on columns of formulas that were copied with Suppress Eval (1). I looped through the columns wrapped in <<Begin Data Update ....Col(i) Suppress Eval(0)......Wait(0) <<End Data Update. I tried on 10 columns. Didn't work.

So far, nothing I have tried has worked faster than manually join the tables by row with the formula evaluation turned on and that took days.....

P.S. Is there a way to Break a running script?

Article Labels

There are no labels assigned to this post.