Classification of Breast Cancer Cells Using JMP
Marie Guadard, North Haven Group; Phil Ramsey, North Haven Group
This paper illustrates some of the features of JMP that support classification and data mining. We use the Wisconsin Breast Cancer Diagnostic Data Set, which is used to classify breast lumps as malignant or benign based on the values of 30 potential predictors. The predictors are obtained by measuring cell nuclei in fluid removed by fine needle aspiration. We begin by illustrating some visualization techniques that help build an understanding of the data set. After partitioning the data into a training set, a validation set, and a test set, we fit four models to the training data: a logistic model, a partition model, and two neural net models. We then compare the performance of these four models on the validation set and choose one. Finally, the test set is used to assess the performance of the chosen model.
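The workflow described above (partition the rows, compare candidate models on the validation set, then hold the test set back for the final assessment) can be sketched outside JMP. The Python below is only an illustrative stand-in and does not reproduce the JMP platforms used in the paper. The 60/20/20 split, the seed, the function names, and the choice of misclassification rate as the comparison criterion are all assumptions made for this sketch; 569 is the number of rows in the public Wisconsin diagnostic data.

```python
import random


def partition_rows(n_rows, fractions=(0.6, 0.2, 0.2), seed=1):
    """Randomly split row indices 0..n_rows-1 into three disjoint sets.

    The 60/20/20 default is an assumption for illustration; the paper
    does not state the proportions in this section.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(fractions[0] * n_rows)
    n_valid = round(fractions[1] * n_rows)
    return {
        "training": indices[:n_train],
        "validation": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],  # whatever remains
    }


def misclassification_rate(actual, predicted):
    """Fraction of rows whose predicted class differs from the actual class."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def choose_model(actual, predictions_by_model):
    """Return the name of the model with the lowest misclassification rate.

    `actual` and every prediction list should come from the validation
    rows only, so the test rows stay untouched until the final check.
    """
    return min(
        predictions_by_model,
        key=lambda name: misclassification_rate(actual, predictions_by_model[name]),
    )


# Partition the 569 rows of the diagnostic data set.
parts = partition_rows(569)
sizes = {name: len(rows) for name, rows in parts.items()}
print(sizes)
```

Only the chosen model would then be scored on the rows in `parts["test"]`. Because those rows play no role in fitting or in model selection, the resulting error rate is a fair estimate of performance on new cases.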