Solved: Confusion Matrix is not seen when building a Bootstrap Forest with a Categorical Response

madhu · Nov 13, 2024 01:31 PM

Hi.

I have been working with the Equity file (JMP Database) in JMP Pro 18 to build a Bootstrap Forest with a Categorical Response (BAD). I followed the instructions given at https://www.jmp.com/support/help/en/17.1/?os=win&source=application#page/jmp/example-of-bootstrap-fo...

However, the report does not include the confusion matrix. Can anyone tell me whether I am missing a step?

Moreover, the beginning of this guide states that a "bootstrap forest model is built to predict whether a customer is a bad credit risk." From the analysis presented in the guide, how can I clearly determine the outcome (i.e., whether a customer is a bad credit risk)?

jthi · Nov 13, 2024 01:45 PM

Change the BAD column to the Nominal modeling type.

-Jarmo

Victor_G · Nov 13, 2024 01:52 PM

Hi @madhu,

That's strange: when I relaunch a Bootstrap Forest on this Equity dataset, I do get a confusion matrix.

I also checked, and the confusion matrix is displayed in the report by default. Its appearance is automatic and does not depend on the platform preferences.

You may want to contact technical support at support@jmp.com.

For your second question, the objective with this dataset is to predict, based on various descriptors (LOAN, MORTDUE, VALUE, ...DEBTINC), whether a person is a bad credit risk (level "Bad Risk" in response column "BAD") or an acceptable one (level "Good Risk").

The aim for a bank or insurance company is to lend money only to people who can pay it back. If a model built on these descriptors lets you reliably identify the people who will repay, you can more easily decide whether lending the money is worth the risk.
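As a rough illustration of how such a model turns descriptors into a decision, the Python sketch below averages hypothetical per-tree probabilities of "Bad Risk" for one applicant and applies a cutoff. The probabilities, the 0.5 cutoff, and the function name are assumptions made for illustration; they are not values or settings taken from JMP.

```python
def classify_applicant(tree_probabilities, cutoff=0.5):
    """Average per-tree probabilities of 'Bad Risk' and apply a cutoff."""
    forest_probability = sum(tree_probabilities) / len(tree_probabilities)
    label = "Bad Risk" if forest_probability >= cutoff else "Good Risk"
    return forest_probability, label


# Hypothetical outputs of five trees for a single applicant.
tree_probabilities = [0.75, 0.5, 1.0, 0.25, 0.5]

probability, label = classify_applicant(tree_probabilities)
print(probability, label)
```

Each tree sees a different bootstrap sample of the data, so the trees disagree; combining their outputs gives a single probability per applicant that can be compared with the chosen cutoff.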

I hope this helps,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

madhu · Nov 13, 2024 01:42 PM

Here is the Equity dataset

madhu · Nov 14, 2024 06:26 AM

Hi @jthi

This is correct. I have noticed that the BAD data were marked as "continuous". I have now changed it to "nominal" and the problem is now solved. Thank you for your support.
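For anyone hitting the same symptom, here is a minimal Python sketch (not JMP code) of why the modeling type matters. It assumes, as a simplification, that a tree leaf summarizes a continuous response by its mean and a nominal response by its class proportions; the leaf values and function names are hypothetical.

```python
from collections import Counter

# Hypothetical BAD values (0 or 1) for the rows that land in one tree leaf.
leaf_values = [0, 0, 1, 0, 1, 0, 0, 0]


def summarize_continuous(values):
    """Treat the response as a number: the leaf predicts the mean."""
    return sum(values) / len(values)


def summarize_nominal(values):
    """Treat the response as categories: the leaf predicts class proportions."""
    counts = Counter(values)
    total = len(values)
    proportions = {level: count / total for level, count in counts.items()}
    predicted_level = max(proportions, key=proportions.get)
    return predicted_level, proportions


mean_prediction = summarize_continuous(leaf_values)
predicted_level, proportions = summarize_nominal(leaf_values)

print(mean_prediction)
print(predicted_level, proportions)
```

A numeric mean is not a class label, so there is nothing to tabulate against the actual levels. Only the nominal summary yields a predicted level that a confusion matrix can count.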

madhu · Nov 14, 2024 07:08 PM

Dear @Victor_G

I have noticed that the BAD data were marked as "continuous". I have now changed it to "nominal" and the problem is now solved. Thank you for your support.

Regarding my second question, I understand that the bootstrap forest model is built to predict whether a customer is a bad credit risk.

The Example of Bootstrap Forest with a Categorical Response, at the link below, produces and interprets three reports, which I quote here.

https://www.jmp.com/support/help/en/17.1/?os=win&source=application#page/jmp/example-of-bootstrap-fo...

Overall Statistics Report

“The Overall report shows that the misclassification rates for the Validation and Test sets are about 11.4% and 9.9%, respectively. The confusion matrices suggest that the largest source of misclassification is the classification of bad risk customers as good risks.

The results for the Test set give you an indication of how well your model extends to independent observations. The Validation set was used in selecting the Bootstrap Forest model. For this reason, the results for the Validation set give a biased indication of how the model generalizes to independent data.”
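To make the two quantities in this report concrete, here is a minimal Python sketch of how a confusion matrix and a misclassification rate are computed from actual and predicted labels. The ten labels are hypothetical and are not taken from the Equity data; the function names are also just for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))


def misclassification_rate(actual, predicted):
    """Fraction of rows whose predicted label differs from the actual one."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical labels for ten customers.
actual = ["Good", "Good", "Good", "Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad"]
predicted = ["Good", "Good", "Good", "Good", "Good", "Bad", "Good", "Good", "Bad", "Bad"]

matrix = confusion_matrix(actual, predicted)
rate = misclassification_rate(actual, predicted)

for (a, p), count in sorted(matrix.items()):
    print(f"actual={a:4s} predicted={p:4s} count={count}")
print("misclassification rate:", rate)
```

In this toy example the largest off-diagonal cell is actual "Bad" predicted as "Good", which is the same kind of error the quoted report highlights.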

Column Contribution Report

[You are interested in determining which predictors contributed the most to your model.]

“The Column Contributions report suggests that the strongest predictor of a customer’s credit risk is DEBTINC, which is the debt to income ratio. The next highest contributors to the model are DELINQ, the number of delinquent credit lines, and VALUE, the assessed value of the customer.”

Missing Values Report

“The DEBTINC column contains 1267 missing values, which amounts to about 21% of the observations. Most other columns involved in the Bootstrap Forest analysis also contain missing values. The Informative Missing option in the launch window ensures that the missing values are treated in a way that acknowledges any information that they carry.”

However, I struggle to understand how the results of these three reports can be used to predict whether a customer is a bad credit risk.
