Boston Housing Case - Model Validation and Comparison

This page presents the Boston Housing analytics case study. The complete collection of analytics cases is available from Collection: Analytics Case Study Library.



Key ideas:

Model validation, stepwise regression, regression trees, neural networks, validation statistics and model comparison.


The objective of this study is to develop a model to predict the median value of homes in the Boston area. The data were originally collected and assembled in the mid-1970s (Harrison and Rubinfeld, 1978), so this example is a bit dated. However, it is typical of a socioeconomic data set that is used to inform economic or public policy decisions, and the data set is well-known throughout the data mining community.

The Task:

Our goal is to use the available data to build a model that makes accurate predictions about home values in the Boston area. To ensure that the model predicts well for data not used to build the model, we use model validation. We will build different models (e.g., multiple regression, regression tree and neural network) in JMP Pro, compare the performance of these models, and select the best-performing model.
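The case itself is worked in JMP Pro. For readers who want to see the idea of holdout validation outside JMP, the following is a minimal Python sketch and not the case study's method. It makes several assumptions: the data are synthetic (not the Boston housing data), the two candidate models are a mean-only baseline and a one-predictor linear regression (stand-ins for the regression, tree and neural network models named above), and every function name is hypothetical. It fits each candidate on a training portion, scores it on a held-out validation portion using root mean square error (RMSE), and picks the candidate with the lowest validation RMSE.

```python
import math
import random


def make_synthetic_data(n=200, seed=0):
    """Synthetic stand-in data (NOT the Boston data): y is linear in x plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(3.0, 9.0) for _ in range(n)]
    ys = [5.0 * x - 10.0 + rng.gauss(0.0, 2.0) for x in xs]
    return xs, ys


def split(xs, ys, validation_fraction=0.3, seed=1):
    """Randomly assign rows to a training set and a validation (holdout) set."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(idx) * validation_fraction)
    val_rows, train_rows = idx[:n_val], idx[n_val:]

    def pick(rows):
        return [xs[i] for i in rows], [ys[i] for i in rows]

    return pick(train_rows), pick(val_rows)


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Simple least-squares regression of y on a single predictor x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def rmse(model, xs, ys):
    """Root mean square prediction error of a fitted model on the given rows."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys))


def compare_models():
    """Fit each candidate on training rows only; score it on validation rows only."""
    xs, ys = make_synthetic_data()
    (x_train, y_train), (x_val, y_val) = split(xs, ys)
    candidates = {"mean baseline": fit_mean, "linear regression": fit_line}
    scores = {
        name: rmse(fit(x_train, y_train), x_val, y_val)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)  # lowest validation RMSE wins
    return scores, best


if __name__ == "__main__":
    scores, best = compare_models()
    for name, score in scores.items():
        print(f"{name}: validation RMSE = {score:.3f}")
    print("selected:", best)
```

The key design choice is that the validation rows play no part in fitting, so the validation RMSE estimates how each model would perform on new data. Selecting on training error instead would tend to favor the most flexible model, whether or not it generalizes.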



From Building Better Models with JMP® Pro, Chapter 8, SAS Press (2015). Grayson, Gardner and Stephens. Used with permission.