Emma1
Level III

Confusion matrix "training"

Hello,

 

I have a question about the confusion matrix of a multinomial logistic regression model.

 

Here the response variable Y has 3 levels: 1, 2, 3.

 

[Screenshot: confusion matrix, Emma1_0-1626855734783.png]

 

Does "Training" mean that JMP automatically sets aside a training set to fit the model?

Indeed, when I add up the counts in my confusion matrix: 311+79+37+98+209+59+48+57+187 = 1085

whereas my data table has 1369 rows,

and 1085/1369 = 0.79.
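
The arithmetic above can be checked with a short Python sketch. The nine counts and the 1369-row total come from the post; the variable names are illustrative, and reading the leftover rows as "not in the training set" is an assumption, not something the output itself confirms.

```python
# Cell counts of the 3x3 confusion matrix, in the order listed in the post.
counts = [311, 79, 37, 98, 209, 59, 48, 57, 187]
total_rows = 1369  # rows in the full data table, as stated in the post

n_in_matrix = sum(counts)                 # observations counted in the matrix
fraction = n_in_matrix / total_rows       # share of the table that they represent
n_not_counted = total_rows - n_in_matrix  # rows that do not appear in the matrix

print(n_in_matrix, n_not_counted, round(fraction, 2))
```

So about 79% of the rows appear in the matrix, and 284 rows do not.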

Does this mean that my confusion matrix is computed on a training set that is about 80% of my data?

And how can I find out which rows were used to test the accuracy of my model, that is, which rows make up the test set?

 

Thank you


Re: Confusion matrix "training"

The data set is split into subsets. The 'training set' is used to fit, or train, the model. The 'validation set' is used to select the model, and performance on it should reflect performance with new data. Some people also use a 'test set', a third holdout subset, to evaluate the selected model.

 

You cannot tell which set each row was assigned to unless you define the assignment yourself. Create a validation column in your data table that indicates the set for each row, and assign it to the Validation modeling role when you launch your modeling platform.
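
To illustrate what such a validation column contains, here is a minimal sketch in Python rather than in JMP itself. The helper name `make_validation_column`, the 60/20/20 proportions, and the seed are all assumptions chosen for the example; only the 1369-row count comes from the question. The point is that once the assignment is stored as a column, you know exactly which rows belong to each set.

```python
import random

def make_validation_column(n_rows, fractions=(0.6, 0.2, 0.2), seed=1234):
    """Return one set label per row (hypothetical helper, not a JMP function)."""
    labels = ("Training", "Validation", "Test")
    n_train = round(n_rows * fractions[0])
    n_valid = round(n_rows * fractions[1])
    n_test = n_rows - n_train - n_valid  # remainder goes to the test set
    column = ([labels[0]] * n_train
              + [labels[1]] * n_valid
              + [labels[2]] * n_test)
    # A fixed seed makes the random assignment reproducible.
    random.Random(seed).shuffle(column)
    return column

validation = make_validation_column(1369)

# Row indices of each subset are now known explicitly.
test_rows = [i for i, label in enumerate(validation) if label == "Test"]
set_sizes = {label: validation.count(label)
             for label in ("Training", "Validation", "Test")}
print(set_sizes)
```

Because the labels are saved with the data, the confusion matrix for each set can be traced back to specific rows.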