Discussions

Emma1 · Jun 8, 2023 5:36 PM

Hello,

I have a question about the confusion matrix of a multinomial logistic regression model.

Here the response variable Y has 3 levels (modalities): 1, 2, 3.

Does "Training" mean that JMP automatically sets aside a training set to fit the model?

When I add up the observations in my confusion matrix, I get: 311+79+37+98+209+59+48+57+187 = 1085

But my data table has 1369 rows

And 1085/1369 = 0.79
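The arithmetic above can be double-checked with a short Python sketch. It only uses the nine cell counts and the row count quoted in this post. The variable names are illustrative, and nothing here is taken from JMP itself.

```python
# Cell counts exactly as listed in the post, in the order given there.
cells = [311, 79, 37, 98, 209, 59, 48, 57, 187]
total_rows = 1369  # rows in the full data table, per the post

in_matrix = sum(cells)              # observations counted in the confusion matrix
excluded = total_rows - in_matrix   # rows that do not appear in this matrix
fraction = in_matrix / total_rows   # share of the table that the matrix covers

print(in_matrix, excluded, round(fraction, 4))
```

The share comes out slightly under 0.80, which would fit a random split of about 80/20 rather than an exact one.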

Does this mean that my confusion matrix is computed on a training set that is about 80% of my data?

And how can I find out which rows were held out to test the accuracy of my model?

Thank you

Mark_Bailey · Jul 21, 2021 01:34 PM

The data set is split into subsets. The 'training set' is used to fit or train the model. The 'validation set' is used to select the model. It should reflect performance with new data. Some people also use a 'test set', a third hold-out subset, to evaluate the selected model.

You cannot tell which set each row was assigned to unless you define the assignment yourself. Create a validation column in your data table that indicates the set each row belongs to, and assign it to the Validation role when you launch your modeling platform.
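As a rough illustration of what such a validation column contains, here is a minimal Python sketch that assigns each of the 1369 rows to a set with a fixed seed, so the assignment is known and reproducible. The function name `make_validation_column`, the 80/20 proportion, and the label strings are assumptions made for this example. They are not JMP's API, and how JMP expects the column to be coded is not covered here.

```python
import random
from collections import Counter

def make_validation_column(n_rows, train_frac=0.8, seed=1):
    """Return one set label per row (hypothetical helper, not a JMP function)."""
    n_train = round(n_rows * train_frac)
    labels = ["Training"] * n_train + ["Validation"] * (n_rows - n_train)
    # Shuffle with a fixed seed so the same rows land in the same set every run.
    random.Random(seed).shuffle(labels)
    return labels

validation = make_validation_column(1369)  # 1369 rows, as in the question
counts = Counter(validation)
print(counts["Training"], counts["Validation"])
```

Because the labels are stored alongside the data, you can always look up which rows were used for fitting and which were held out.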
