Visualizing the variability of a curve using bootstrapping
Dec 4, 2012 9:22 AM
This is part 4 of a series of blog posts on the new bootstrap feature in JMP Pro, Version 10, that began with an introduction to one-click bootstrapping.
In previous posts in this series, we looked at bootstrapping functions of the output and bootstrapping multiple tables simultaneously. In this post, we will use a bit more JSL to bootstrap a particular function (a regression curve) and display the bootstrapped curves graphically. This idea is briefly discussed in Efron & Tibshirani (1993).
This post will also highlight another new feature of JMP 10: the Fit Curve platform (also known as the "new Nonlinear" platform). You reach the Fit Curve platform interactively through the Nonlinear launch dialog (Analyze > Modeling > Nonlinear). Specify an X column that does not have a column formula, rather than one that does, and the launch dialog takes you to the new platform. The Fit Curve platform then lets you fit a variety of predefined curves to your bivariate data.
For the example today, we will be fitting curves to exponential growth data, specifically the US population data in the JMP Sample Data directory. We first fit a linear regression line to the data, bootstrap the slope and intercept estimates of that line, and plot the collection of lines produced by the bootstrap samples. Then we fit an exponential curve to the data and do the same thing. We can then compare these plots of "bootstrapped curves." In Figure 1, we can see that the bootstrapped linear regression lines (left panel) vary a lot more than the bootstrapped exponential curves (right panel).
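For readers who want to try the idea outside of JMP, here is a minimal Python sketch of the same comparison. It is not the JSL used for the figures. It rests on several assumptions: the data are synthetic exponential-growth values rather than the US population sample data, each bootstrap sample resamples (x, y) pairs with replacement, and the exponential curve is fitted by least squares on log(y) instead of by the nonlinear fit that Fit Curve performs. The names `fit_line`, `bootstrap_fits`, and `mean_spread` are made up for this illustration.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bootstrap_fits(xs, ys, n_boot=200, seed=1):
    """Resample (x, y) pairs with replacement and refit a line each time."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    while len(fits) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip a degenerate resample with a single distinct x
        fits.append(fit_line(bx, [ys[i] for i in idx]))
    return fits


def mean_spread(curves):
    """Average, over the grid points, of the standard deviation across curves."""
    return statistics.fmean([statistics.pstdev(col) for col in zip(*curves)])


# Synthetic exponential-growth data (an assumption, not the JMP sample data).
noise = random.Random(0)
xs = [10.0 * i for i in range(21)]
ys = [4.0 * math.exp(0.02 * x + noise.gauss(0.0, 0.03)) for x in xs]

# Linear model: one fitted line per bootstrap sample, evaluated at every x.
linear_curves = [[a + b * x for x in xs] for a, b in bootstrap_fits(xs, ys)]

# Exponential model: fit a line to log(y), then transform back to the y scale.
log_ys = [math.log(y) for y in ys]
exp_curves = [[math.exp(a + b * x) for x in xs]
              for a, b in bootstrap_fits(xs, log_ys)]

linear_spread = mean_spread(linear_curves)
exp_spread = mean_spread(exp_curves)
print(f"mean spread, linear fit:      {linear_spread:.2f}")
print(f"mean spread, exponential fit: {exp_spread:.2f}")
```

Plotting every list in `linear_curves` and `exp_curves` against `xs` gives two overlays of bootstrapped curves. The printed spread puts a single number on how much each overlay fans out.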
After I showed this example to one of my colleagues who develops the Fit Curve platform, he immediately wanted to see how bad a fit he could find for the bootstrapped curve. He came up with a pharmacokinetic example that might look like a quadratic curve to someone unaccustomed to this type of data. But Figure 2 shows that the quadratic is not a good fit for these data. A pharmacokinetic model gives a much better fit, as shown in the right panel of Figure 2.
For the interested reader, this is essentially "bagging" (bootstrap aggregation), described in Hastie et al. (2009): averaging the bootstrapped curves at each value of X gives the bagged estimate of the curve.
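To make the link to bagging concrete, here is a small Python sketch that averages bootstrapped regression lines pointwise. It assumes a straight-line model, resampling of (x, y) pairs, and synthetic data. The function names `fit_line` and `bagged_curve` are hypothetical and are not part of JMP or of any library.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bagged_curve(xs, ys, grid, n_boot=200, seed=1):
    """Average the bootstrapped fitted lines at each point of the grid."""
    rng = random.Random(seed)
    n = len(xs)
    curves = []
    while len(curves) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip a degenerate resample with a single distinct x
        a, b = fit_line(bx, [ys[i] for i in idx])
        curves.append([a + b * g for g in grid])
    # Pointwise mean across all bootstrapped curves is the bagged prediction.
    return [statistics.fmean(col) for col in zip(*curves)]


# Synthetic noisy straight-line data (an assumption, for illustration only).
noise = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + noise.gauss(0.0, 1.0) for x in xs]
grid = [0.0, 5.0, 10.0, 15.0]
print(bagged_curve(xs, ys, grid))
```

For a straight line, the bagged curve ends up very close to the ordinary least squares fit. Averaging tends to matter more for less stable fits, where the individual bootstrapped curves differ substantially from one another.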
I hope this series of use cases of the bootstrap feature in JMP Pro has gotten you interested in trying it out. As mentioned in the previous posts, please share any interesting ways that you are using the bootstrap feature. The slides and journal file of examples (including this one) from my Discovery Summit talk can be found on the Discovery website.