Linear regression and the coefficient of determination in SPSS

Bro. David E. Brown, BYU-Idaho Dept. of Mathematics


February 2, 2012
To use the following instructions, your data need to be two numeric (Scale) variables.
1. Start SPSS and enter your data or open your data file.
2. Make any necessary adjustments in the Variable View. Pay particular attention to the Measure (measurement level) of each variable: both must be Scale for the following to work.
3. To make scatterplots. . .
In the Graphs menu, click Legacy dialogs. A submenu will appear.
In the submenu, click Scatter/Dot, which is near the bottom.
The Scatter/Dot selector box appears.
Select Simple Scatter and click Define. The Simple Scatterplot dialog appears.
Move the name of the variable you want on your vertical axis into the Y Axis: box. (For association and correlation, it does not matter which variable you put on the y-axis and which you put on the x-axis.)
Move the name of the variable you want on your horizontal axis into the X Axis: box.
If you want, you can click Titles... to add a title, subtitle, and so on; click Continue to return to the Simple Scatterplot dialog.
Click OK. The SPSS Output Viewer appears, with your scatterplot in it.
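
If you ever want to double-check the scatterplot outside SPSS, the short sketch below draws the same kind of plot in Python with pandas and matplotlib. The file name mydata.csv and the column names explanatory and response are only placeholders for your own data; none of this is required for the SPSS steps above.

    # A simple scatterplot, mirroring the Simple Scatterplot dialog above.
    # The file and column names are placeholders; substitute your own.
    import pandas as pd
    import matplotlib.pyplot as plt

    data = pd.read_csv("mydata.csv")

    plt.scatter(data["explanatory"], data["response"])
    plt.xlabel("explanatory")   # the X Axis: variable
    plt.ylabel("response")      # the Y Axis: variable
    plt.title("response versus explanatory")
    plt.show()
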
4. To calculate the regression coefficients (slope and intercept) and the coefficient of determination. . .
In the Analyze menu, click Regression. A submenu will appear.
In the submenu, click Linear... The Linear Regression dialog will appear.
Put in the Dependent: box the name of your response variable.
Put in the Independent(s): box the name of your explanatory variable.
Click Statistics.... The Linear Regression: Statistics dialog appears.
To get the regression coefficients (slope and intercept), make sure the box for Estimates has
a check in it.
To get the coefficients of correlation and determination, make sure the box for Model fit
has a check mark in it.
If you want the mean and standard deviation of the explanatory and response variables, put
a check mark next to Descriptives.
Click Continue. SPSS returns you to the Linear Regression dialog.
To get a plot of residuals versus fits. . .
Click Plots. The Linear Regression: Plots dialog appears.
Move *ZRESID to the Y: box. (ZRESID stands for standardized residuals. SPSS won't plot the actual residuals, but rather their z-scores, which is better in some ways.)
Move *ZPRED to the X: box. (Again, SPSS will plot the z-scores of the fitted (or predicted) values, not the fitted values themselves.)
You will see check boxes for Histogram and Normal probability plot. If your instructor requires you to check the requirements for linear regression, make sure there is a check mark next to these. SPSS will give you a histogram and a P-P plot (which is similar to a Q-Q plot) of the residuals. (You'll need these when deciding whether linear regression is appropriate for your data.)
Click Continue. SPSS returns you to the Linear Regression dialog.
If your instructor wants you to save the residuals. . .
Click Save.... The Linear Regression: Save dialog appears.
Put a check mark next to the type of residual your instructor has told you to save.
Click Continue. SPSS returns you to the Linear Regression dialog.
Click OK. The SPSS Output Viewer window appears.
First is the Descriptive Statistics table, with the means and standard deviations of your
response and explanatory variables.
Next is a table of Correlations.
The first set of rows gives you the correlation coefficients for your variables.
The next set of rows gives you the corresponding P-values.
The final set of rows tells you how many data points were used.
Next is a table called Variables Entered/Removed. Kindly ignore this table.
The Model Summary table is next.
Pearson's correlation coefficient is under the column heading R.
The coefficient of determination is r², and so is found in the R Square column.
Please ignore the Adjusted R Square column.
The standard error of the estimate is in the rightmost column.
An ANOVA table is next. Please ignore it.
Next is the Coefficients table.
Under Model you'll see that the first row is for the Constant, that is, the y-intercept, and the second row is for the explanatory variable. This really means that the second row is for the slope of the regression line.
Next is a pair of columns under Unstandardized Coefficients. The value of B in the
first row is the y-intercept of your regression line. The value of B in the second row is the
slope of the regression line.
Under Std. Error are the standard errors of the y-intercept and slope, respectively.
Please ignore the Standardized Coefficients column.
The t column gives the t-scores for the tests of H0 : intercept = 0 and H0 : slope = 0,
respectively.
The rightmost column gives the P-values for the hypothesis tests just mentioned.
Next, if you asked for it, is the Residual Statistics table, giving the min., max., mean,
and standard deviation of the predicted values, the residuals, and the standardized predicted
values and residuals.
Next is the histogram-with-normal-curve for the standardized residuals, if you asked for it.
The normal P-P plot of the standardized residuals is next, if you asked for it. (This is similar to a Q-Q plot.)
Finally, you'll see the plot of residuals versus fits (all standardized), if you requested it. It looks like a scatterplot because each point pairs a standardized predicted value with its standardized residual.
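
For readers who like to see where the numbers come from, here is a minimal sketch that reproduces the main pieces of the Model Summary and Coefficients tables outside SPSS, using Python's statsmodels library, and then draws the residuals-versus-fits plot. The file and column names are placeholders, and the standardization shown is a plain z-score, which may differ slightly from SPSS's own ZRESID and ZPRED definitions; treat this as a cross-check, not as part of the SPSS procedure.

    # Reproducing the regression output described above (a cross-check sketch).
    # The file and column names are placeholders; substitute your own.
    import pandas as pd
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    data = pd.read_csv("mydata.csv")
    X = sm.add_constant(data["explanatory"])   # adds the Constant (intercept) term
    model = sm.OLS(data["response"], X).fit()

    print(model.params)     # B column: intercept (const) and slope
    print(model.bse)        # Std. Error column: standard errors of intercept and slope
    print(model.tvalues)    # t column: tests of H0: intercept = 0 and H0: slope = 0
    print(model.pvalues)    # rightmost column: the corresponding P-values
    print(model.rsquared)   # R Square: the coefficient of determination

    # Residuals versus fits, both standardized with plain z-scores
    # (close to, but not necessarily identical to, SPSS's *ZRESID and *ZPRED).
    z_resid = (model.resid - model.resid.mean()) / model.resid.std()
    z_pred = (model.fittedvalues - model.fittedvalues.mean()) / model.fittedvalues.std()

    plt.scatter(z_pred, z_resid)
    plt.xlabel("Standardized predicted value")
    plt.ylabel("Standardized residual")
    plt.title("Residuals versus fits")
    plt.show()

For simple regression like this, the R value in the Model Summary is just the square root of the R Square value.
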

In my classes, we now stop to check all the requirements for linear regression. That's what the plots are for. We do not use the regression line for ANYTHING until we have verified that all the requirements are met.
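
If you would like the same kind of residual checks outside SPSS, here is a small sketch that draws a histogram of the residuals and a normal probability plot with scipy (scipy's probplot gives a Q-Q style plot, which serves the same purpose as SPSS's P-P plot). Again, the file and column names are placeholders.

    # Residual checks outside SPSS: histogram and normal probability plot.
    # The file and column names are placeholders; substitute your own.
    import pandas as pd
    import statsmodels.api as sm
    import matplotlib.pyplot as plt
    from scipy import stats

    data = pd.read_csv("mydata.csv")
    model = sm.OLS(data["response"], sm.add_constant(data["explanatory"])).fit()

    plt.hist(model.resid, bins=15)          # histogram of the residuals
    plt.xlabel("Residual")
    plt.title("Histogram of residuals")
    plt.show()

    stats.probplot(model.resid, plot=plt)   # normal probability (Q-Q style) plot
    plt.show()
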
As always, if you have questions, please ask them!
