Hi @ChristerMalm,
Yesterday's unsession on predictive modeling was very informative. I don't recall the exact discussion on that topic, but I would be cautious about using a prediction formula for an outcome as one of the inputs to a model for that same outcome. My understanding of ensemble modeling is that you build several different models, e.g., a neural network, a boosted tree, XGBoost, etc., and then average their predictions. You can do this through the Model Comparison platform using the red triangle menu, or you can write a column formula that averages the saved prediction columns for you. The ensemble should fit the data better than any single model, but that's not always the case. One thing to keep in mind: it really depends on what the end goal is and how important interpretability is for the model. If understanding the model equation and how inputs translate to outputs matters, then model ensembles and complicated non-linear models might not be the best way to go. A rough sketch of the averaging idea follows.
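Outside of JMP, that averaging step looks roughly like this in Python with scikit-learn. It's a minimal sketch: the synthetic data, the three model choices, and the 0.5 cutoff are all placeholder assumptions, not anything from your project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data standing in for your table.
X, y = make_classification(n_samples=500, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

models = [
    GradientBoostingClassifier(random_state=1),    # boosted tree
    RandomForestClassifier(random_state=1),        # bagged trees
    MLPClassifier(max_iter=1000, random_state=1),  # neural network
]

# Fit each model and average the predicted probabilities -- the same
# thing a column formula over saved prediction columns would do in JMP.
probs = [m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in models]
ensemble_prob = np.mean(probs, axis=0)
ensemble_pred = (ensemble_prob >= 0.5).astype(int)
print("ensemble accuracy:", np.mean(ensemble_pred == y_test))
```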
I think one of the things Russ Wolfinger talked about that's really important is finding the right validation scheme for your data. If the wrong split of training and validation data is chosen, you can get terrible (or misleadingly good) fits even when the data are good, like the Nature paper example he talked about. It would probably be best to spend time determining the right validation scheme first, and only then work on generating the model. The sketch below shows one common way a split goes wrong.
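Here's a generic Python illustration of that failure mode (not the actual setup from the Nature example, and the data is synthetic): each row belongs to a group, the label is a property of the group, and the features carry a group "fingerprint". A plain random split lets the model memorize fingerprints it will see again in validation; a group-aware split shows there is nothing generalizable to learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n_groups, per_group = 40, 10
groups = np.repeat(np.arange(n_groups), per_group)

# The label is a fixed property of each group, unrelated to any
# feature pattern that generalizes across groups.
group_label = rng.integers(0, 2, size=n_groups)
y = group_label[groups]

# Features carry a group-specific offset (a "fingerprint") plus noise.
group_offset = rng.normal(size=(n_groups, 5))
X = group_offset[groups] + 0.1 * rng.normal(size=(n_groups * per_group, 5))

model = RandomForestClassifier(random_state=0)
naive = cross_val_score(model, X, y, cv=KFold(5, shuffle=True, random_state=0))
honest = cross_val_score(model, X, y, cv=GroupKFold(5), groups=groups)

print(f"random split accuracy:   {naive.mean():.2f}")   # optimistically high
print(f"group-aware CV accuracy: {honest.mean():.2f}")  # near chance
```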
It also sounds like you might have an imbalanced data set, meaning you have many more observations of one class than the other. In that case, you'll definitely want to explore changing the logistic probability threshold to see if that improves your predictions. Also consider the profit matrix approach, which assigns costs to false positives and false negatives so the cutoff reflects their real-world consequences. Chris Gotwalt also demonstrated an interesting approach of using decision trees to better understand cutoffs with imbalanced data. A sketch of threshold tuning against a profit matrix follows.
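As a sketch of those two ideas together (synthetic imbalanced data, and the profit matrix numbers are made up; substitute your own costs):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Imbalanced placeholder data: ~90% negatives, ~10% positives.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

prob = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

# Hypothetical profit matrix: a missed positive (FN) costs 50,
# a false alarm (FP) costs 5, and a caught positive (TP) earns 20.
COST_FN, COST_FP, GAIN_TP = 50, 5, 20

def profit(threshold):
    pred = prob >= threshold
    tp = np.sum(pred & (y_te == 1))
    fp = np.sum(pred & (y_te == 0))
    fn = np.sum(~pred & (y_te == 1))
    return GAIN_TP * tp - COST_FP * fp - COST_FN * fn

# Scan cutoffs instead of assuming the default 0.5.
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=profit)
print(f"profit at default 0.50 cutoff: {profit(0.50)}")
print(f"best cutoff {best:.2f}, profit: {profit(best)}")
```

With heavy imbalance and an expensive false negative, the profit-maximizing cutoff usually lands well below 0.5, which is exactly the kind of adjustment the threshold exploration is after.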
Hope this helps,
DS