Sburel
Level IV

XGBoost training, validation and testing

Hello,

I have a question about handling test datasets in XGBoost. Here's my current approach:

  1. I'm using 10-fold cross-validation for model training and validation
  2. I keep a separate test dataset completely excluded during this training phase (the test rows are hidden and excluded in the data table before launching the model)
  3. After training, I manually generate predictions on this test dataset (I save the predicted values to the table and unhide the test data)

My issue is: By handling the test data separately, I'm missing out on the automatic performance metrics that XGBoost can calculate. Is there a way to:

  • Keep the test data completely segregated during training (using hide and exclude)
  • BUT still have XGBoost automatically calculate performance metrics on this test data after training is complete?
2 REPLIES
frankderuyck
Level VI

Re: XGBoost training, validation and testing

I was just about to ask the same question.

Victor_G
Super User

Re: XGBoost training, validation and testing

Hi @Sburel,

 

I just saw your post, sorry for the late reply.

To evaluate any model on a test set, you can use the Model Comparison platform restricted to the test rows: select your XGBoost prediction formula as the Y Predictor and you will get Rsquare, RASE (= RMSE) and Average Absolute Error calculated automatically. Other metrics, such as correlation, can be calculated with other platforms: for correlation, use the Multivariate platform with the predicted and measured responses to get the correlation value along with an actual vs. predicted plot (the plot is also available in Model Comparison by default).

 

Hope this answer will help you or other members,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
