XGBoost training, validation and testing

Sburel — Sat, 06 Sep 2025 11:35:43 GMT

Hello,

I have a question about handling test datasets in XGBoost. Here's my current approach:

I'm using 10-fold cross-validation for model training and validation.
I keep a separate test dataset completely excluded during this training phase (I hide and exclude those rows in the table before launching the model).
After training, I manually generate predictions on this test dataset (I save the predicted values to the table and unhide the test rows).

My issue is: By handling the test data separately, I'm missing out on the automatic performance metrics that XGBoost can calculate. Is there a way to:

Keep the test data completely segregated during training (using hide and exclude)
BUT still have XGBoost automatically calculate performance metrics on this test data after training is complete?

Re: XGBoost training, validation and testing

frankderuyck — Tue, 01 Apr 2025 14:52:35 GMT

I was just about to launch the same question

Re: XGBoost training, validation and testing

Victor_G — Fri, 05 Sep 2025 16:48:15 GMT

Hi @Sburel,

I just saw your post now, sorry for the late reply.

To evaluate any model on a test set, you can use the Model Comparison platform (on the test rows only): simply select your XGBoost prediction formula as Y Predictor and you'll get Rsquare, RASE (= RMSE) and Average Absolute Error calculated automatically. Other metrics, such as correlation, can be calculated with other platforms: for correlation, you can use the Multivariate platform with the predicted and measured responses to get the correlation value along with the actual vs. predicted plot (the plot is also available in Model Comparison by default).
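
If it helps to see what these numbers represent, here is a minimal Python sketch of the usual textbook definitions, computed on the test rows only. It is not JMP's own code: the function name `holdout_metrics`, the `is_test` mask and the sample values are made up for illustration, and I am assuming Rsquare is computed as 1 - SSE/SST using the mean of the test rows.

```python
import numpy as np


def holdout_metrics(actual, predicted, is_test):
    """Compute fit statistics on the held-out (test) rows only.

    actual, predicted: measured and predicted responses, one value per row.
    is_test: boolean flags, True for rows that belong to the test set.
    """
    mask = np.asarray(is_test, dtype=bool)
    y = np.asarray(actual, dtype=float)[mask]
    y_hat = np.asarray(predicted, dtype=float)[mask]

    residuals = y - y_hat
    sse = np.sum(residuals ** 2)          # sum of squared errors
    sst = np.sum((y - y.mean()) ** 2)     # total sum of squares around the test mean

    return {
        # Assumption: Rsquare = 1 - SSE/SST on the test rows
        "Rsquare": 1.0 - sse / sst,
        # RASE: root average squared error (same formula as RMSE)
        "RASE": np.sqrt(np.mean(residuals ** 2)),
        # AAE: average absolute error
        "AAE": np.mean(np.abs(residuals)),
        # Pearson correlation between measured and predicted responses
        "Correlation": np.corrcoef(y, y_hat)[0, 1],
    }


# Hypothetical data: the last row is a training row and is ignored.
measured = [1.0, 2.0, 3.0, 4.0, 10.0]
predicted = [1.0, 2.0, 3.0, 5.0, 0.0]
test_flag = [True, True, True, True, False]

for name, value in holdout_metrics(measured, predicted, test_flag).items():
    print(f"{name}: {value:.4f}")
```

Note that Rsquare and the squared correlation are not the same thing on a test set: Rsquare penalizes bias in the predictions, while correlation only measures how linearly related predicted and measured values are.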

Hope this answer will help you or other members,