This script creates an interactive demonstration of simple linear regression. You can select data columns for the Y and X roles or you can use a randomly generated data sample. Simply open and run the script:
Select a data column for the dependent variable and click the Y, Response button. Select another data column for the independent variable and click the X, Predictor button. Click OK.
(Note that you do not have to use data columns for this demonstration. Simply click Cancel and the script will generate a random sample for you.)
The main feature of the demonstration is a scatter plot of the data. The blue line represents your candidate best-fit line. It starts out horizontal, representing the null hypothesis that the response is independent of the predictor (that is, that the slope is 0). You fit the blue line to the data by dragging its two handles (small squares). Place your pointer in the center of a square before dragging it, and release the square away from any data point so that the handle is easy to grab again later.
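The idea of fitting a line by hand can be sketched numerically. The following Python sketch is not the script's own code; it is a minimal illustration that assumes the line is defined by two handle positions and that the starting horizontal line sits at the mean of the response (the description above only says the slope is 0). The helper names and the simulated data are hypothetical.

```python
import random

def line_through(p1, p2):
    """Return (intercept, slope) of the line through two handle positions."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals of the data about the given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def least_squares(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Simulated sample (made up for illustration only).
random.seed(1)
xs = [float(i) for i in range(1, 11)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]

# Assumed starting position: both handles at the mean response, so slope = 0.
ybar = sum(ys) / len(ys)
null_line = line_through((min(xs), ybar), (max(xs), ybar))

# The least-squares line is the position of the handles that minimizes the SSE.
fit_line = least_squares(xs, ys)

sse_null = sse(xs, ys, *null_line)
sse_fit = sse(xs, ys, *fit_line)
print("SSE, horizontal line:   ", round(sse_null, 3))
print("SSE, least-squares line:", round(sse_fit, 3))
```

No other placement of the two handles can produce a smaller sum of squared residuals than the least-squares line, which is what dragging the line by hand lets an audience discover.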
Below the plot is a collection of buttons to turn various graphical features on and off. These features allow you to:
To the right of the scatter plot are a bar chart of the sum of squares of the residuals (error) and two outlines. The Model Order outline lets you switch between a first-order and a second-order linear regression model. The Regression outline presents the same reports that you would expect from the Bivariate or the Fit Least Squares (Minimal Report) platforms. The Regression outline depends on your line, so if you move the line, this report updates immediately.
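To make the comparison between model orders concrete, here is a minimal Python sketch (again, not the script's code) that fits first-order and second-order polynomial models by least squares and compares their error sums of squares. The data are made up, and the helpers solve, polyfit, and sse are hypothetical names written for this illustration.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def polyfit(xs, ys, order):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    k = order + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

def sse(xs, ys, coefs):
    """Sum of squared residuals about the fitted polynomial."""
    return sum(
        (y - sum(c * x ** i for i, c in enumerate(coefs))) ** 2
        for x, y in zip(xs, ys)
    )

# Made-up, visibly curved data for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 4.2, 8.8, 16.1, 26.0]

first = polyfit(xs, ys, 1)    # b0 + b1*x
second = polyfit(xs, ys, 2)   # b0 + b1*x + b2*x^2

sse_first = sse(xs, ys, first)
sse_second = sse(xs, ys, second)
print("SSE, first order: ", round(sse_first, 3))
print("SSE, second order:", round(sse_second, 3))
```

Because the first-order model is a special case of the second-order model, the second-order error sum of squares can never be larger; the interesting question for a class is whether the reduction is large enough to justify the extra term.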
A typical demonstration might follow these steps:
Further possibilities include changing the data. You can drag a data point, and you can click anywhere to add another data point. These features allow you to discuss leverage and influence. To illustrate leverage, move a data point vertically (a change in the response), first for an observation near the center of the data and then for one near either end. To illustrate influence, change the scale of the axes and add a new observation far away from the original data and out of line with the regression fit.
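The leverage discussion can be backed by a small calculation. The Python sketch below is an illustration under stated assumptions, not part of the script: it uses the standard simple-regression leverage formula h_i = 1/n + (x_i - xbar)^2 / Sxx, made-up data, and hypothetical helper names, and it compares how much the fitted slope changes when the same vertical shift is applied to a central point and to an end point.

```python
def fit(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def leverages(xs):
    """Leverage of each observation: 1/n + (x_i - xbar)^2 / Sxx."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]

def slope_after_shift(xs, ys, index, delta):
    """Slope after moving one point vertically by delta (as if dragged)."""
    shifted = list(ys)
    shifted[index] += delta
    return fit(xs, shifted)[1]

# Made-up data; x runs from 1 to 9, so index 4 is the center (x = 5).
xs = [float(i) for i in range(1, 10)]
ys = [1.2, 2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 8.2, 8.9]

h = leverages(xs)
base_slope = fit(xs, ys)[1]
center_change = abs(slope_after_shift(xs, ys, 4, 5.0) - base_slope)
end_change = abs(slope_after_shift(xs, ys, 8, 5.0) - base_slope)

print("leverage at center:", round(h[4], 3), " at end:", round(h[8], 3))
print("slope change, center point moved:", round(center_change, 3))
print("slope change, end point moved:   ", round(end_change, 3))
```

Moving the central point shifts the line up or down without tilting it, while the same move at the end of the range tilts the line, which matches what the audience sees when the points are dragged in the plot.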