Big neural network, can't save the formulas

Nov 17, 2016 9:02 AM

Hello,

I have a large dataset and I have been able to run a neural network on the data successfully after a long wait.

The data has thousands of input variables and about 700 observations.

What is the most efficient way of saving out the formulas? I tried "save formulas", but it was unable to produce a result after 24 hours of waiting, so I restarted JMP. I am going to try fast formulas or a SAS dataset next, but thought I would ask the question anyway.

I was also wondering how to recreate the neural network I have in JMP in Enterprise Miner. The model has just a single layer with 3 TanH activation nodes, is boosted (the model runs 40 times) with a learning rate of 0.6, and uses no penalty. If someone can get me going on how to repeat this in SAS (Base or EM), that would be very helpful. I guess alternatively I could run the JMP code in Enterprise Miner, if that is possible?
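For anyone trying to reproduce this structure outside JMP, here is a minimal Python sketch of how a boosted single-layer model of this shape could be scored. It assumes the final prediction is a learning-rate-scaled sum of 40 small networks with 3 tanh nodes each; how JMP actually folds the learning rate into its saved formula is an assumption here, the weights are random placeholders rather than fitted values, and `N_INPUTS` is a small stand-in for the real (much larger) input count.

```python
import math
import random

def make_component(n_inputs, n_nodes, rng):
    """Build one small single-layer network with placeholder (unfitted) weights."""
    return {
        "hidden_w": [[rng.gauss(0, 0.1) for _ in range(n_inputs)] for _ in range(n_nodes)],
        "hidden_b": [rng.gauss(0, 0.1) for _ in range(n_nodes)],
        "out_w": [rng.gauss(0, 0.1) for _ in range(n_nodes)],
        "out_b": rng.gauss(0, 0.1),
    }

def score_component(model, x):
    """Score one component: tanh hidden nodes followed by a linear output."""
    hidden = [
        math.tanh(b + sum(w_i * x_i for w_i, x_i in zip(w, x)))
        for w, b in zip(model["hidden_w"], model["hidden_b"])
    ]
    return model["out_b"] + sum(v * h for v, h in zip(model["out_w"], hidden))

def score_boosted(models, x, learning_rate):
    """Assumed combination rule: sum of learning-rate-scaled component outputs."""
    return sum(learning_rate * score_component(m, x) for m in models)

rng = random.Random(0)
N_INPUTS = 20  # hypothetical; the real data has thousands of inputs
models = [make_component(N_INPUTS, 3, rng) for _ in range(40)]
x = [rng.random() for _ in range(N_INPUTS)]

prediction = score_boosted(models, x, 0.6)
total_hidden_nodes = sum(len(m["hidden_b"]) for m in models)
print(total_hidden_nodes)  # 120
```

Seen this way, the boosted model is effectively one wide layer of 40 × 3 = 120 tanh nodes, which is the shape one would need to mirror in another tool.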

Thanks.

4 REPLIES

Nov 17, 2016 2:34 PM

Have you tried the SAS Data Step? I am not a SAS user, so I don't know much about it, but I would suggest you look into that option. If you have JMP 13 Pro, then the Formula Depot may help you. You can publish a formula to the depot and then, from the depot, create scoring code in a number of languages.

Nov 18, 2016 8:43 AM

I did a bit more testing with smaller Neural Networks, and that appears to be the most efficient computationally.

Thanks for the suggestion.

Nov 18, 2016 7:04 AM

With thousands of input variables and 40 boosting steps, I can imagine you have a massive and massively complicated formula.
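To put a rough number on "massive", here is a back-of-the-envelope count of the coefficients such a formula would carry. This is only a sketch: the input counts of 1000 and 5000 are hypothetical (the thread only says "thousands"), and it assumes each boosted component keeps its own hidden-node intercepts and output weights.

```python
def boosted_nn_coefficient_count(n_inputs, nodes_per_model, n_models):
    """Count weights and intercepts in a boosted single-layer network."""
    # Each hidden node has one weight per input plus an intercept.
    hidden = nodes_per_model * (n_inputs + 1)
    # Each component's output has one weight per hidden node plus an intercept.
    output = nodes_per_model + 1
    return n_models * (hidden + output)

for n_inputs in (1000, 5000):  # hypothetical input counts
    print(n_inputs, boosted_nn_coefficient_count(n_inputs, 3, 40))
# 1000 120280
# 5000 600280
```

The count grows linearly with the number of inputs, so halving the inputs roughly halves the formula.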

If I may, I would suggest using a variable reduction technique, such as Partition if you are using regular JMP, or Bootstrap Forest, Boosted Tree, or Generalized Regression if you have JMP Pro, to get the number of input variables down to a more manageable set of the most important factors. Once you have those factors, you can run a Neural model as before with the reduced set of input variables. I'm not saying the model won't still be large and/or complicated, but I would bet it would be much smaller than your current NN model. Also, if you use those other techniques, you can compare the models to see which one is best via Model Comparison, either in the Formula Depot as Karen suggests or with the standalone Model Comparison platform.
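The general "screen first, then model" workflow can be sketched outside JMP as well. The Python below is not Partition, Bootstrap Forest, Boosted Tree, or Generalized Regression; it swaps in a plain absolute-correlation screen purely to illustrate the idea of keeping only the top-ranked inputs. The toy data, the function names, and the choice of `k` are all hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def top_k_columns(rows, y, k):
    """Return the indices of the k columns most correlated (in absolute value) with y."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, y)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])

# Toy data: 200 observations, 50 inputs, response driven by columns 3 and 17.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(200)]
y = [2.0 * r[3] - 1.5 * r[17] + rng.gauss(0, 0.1) for r in rows]

kept = top_k_columns(rows, y, k=2)
reduced_rows = [[row[j] for j in kept] for row in rows]
print(kept, len(reduced_rows[0]))
```

A one-variable-at-a-time screen like this can miss inputs that only matter in combination, which is one reason the tree-based platforms named above are usually the better choice.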

Nov 21, 2016 7:39 AM

Yeah, I should try the dimension reduction approach you mention. I am a little hesitant to believe it will work, as the input data is similar to a time series with an undetermined lag, so the relationships between the input variables are a little "fuzzy". Input variable one is the first sample in the series for every observation, but I am using the NN to find patterns in the data instead of directly comparing the observations for each variable, if that makes sense. So I am not 100% sure PCA or the suggested Partition method would be suitable.

I think reducing my sample rate for each series and manually removing one series at a time may result in much less data and an equivalent model.
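A minimal Python sketch of that reduction, assuming (hypothetically) that each observation row is several equal-length series laid end to end. The function names, `series_length`, `step`, and `drop` are illustrative and not taken from the actual data.

```python
def split_series(row, series_length):
    """Cut one observation row into consecutive series of equal length."""
    return [row[i:i + series_length] for i in range(0, len(row), series_length)]

def reduce_row(row, series_length, step, drop=()):
    """Keep every step-th sample of each series, skipping series listed in drop."""
    reduced = []
    for index, series in enumerate(split_series(row, series_length)):
        if index in drop:
            continue  # this whole series is removed from the inputs
        reduced.extend(series[::step])
    return reduced

# Toy row: 3 series of 4 samples each -> [0..3], [4..7], [8..11]
row = list(range(12))
print(reduce_row(row, series_length=4, step=2))            # [0, 2, 4, 6, 8, 10]
print(reduce_row(row, series_length=4, step=2, drop={1}))  # [0, 2, 8, 10]
```

Keeping every `step`-th sample with no smoothing can alias fast-changing signals, so it is worth refitting after each reduction and checking that the model's performance holds up.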