
Introduction to SPSS

SPSS Books & Class Notes
Prof. A. S. Prabhu

SPSS Analysis without Anguish: Version 10, Coakes S.J., 519.502855369
Doing Statistics with SPSS, Kerr A.W., 519.50285
A Guide to Computing Statistics with SPSS Release 10, Howitt D., 300.2855369
Data Analysis Using SPSS Versions 8 to 10, Foster J., 519.50285

1.1 INTRODUCTION
Statistical Package for the Social Sciences (SPSS) is a very powerful Data Analysis Application. It provides a user-friendly tool for analysing Questionnaire Data and other Data Sets. It can also be used in conjunction with most standard spreadsheet and word processing packages to produce professional reports on the findings of any Analysis.

1.2 GETTING STARTED


Open SPSS as follows: Click on the Start button (bottom left hand corner). Go to Programs and then SPSS for Windows and then SPSS 12.0 for Windows.

1.3 SPSS DATA FILES


Once SPSS is activated the user is presented with the SPSS Screen. A new Data file is opened automatically. This file contains two Windows, a Data View and a Variable View.

(i) Data View

The Data View consists of a grid of Columns and Rows, similar to a spreadsheet. The Columns represent Variables and the Rows represent Cases. The numerical data from the questionnaire are typed in this Window.

(ii) Variable View

The Variable View also contains a grid of rows and columns. In this Window the rows represent the variables in the analysis and the columns are used to define the characteristics of each variable. We will discuss this in greater detail in the next section.

2. CODING DATA
When processing most questionnaires it is necessary to code each response to a given question. For example, in answer to the gender question each reply is assigned a code

Gender    Code
Male      1
Female    2

Each answer is assigned a numerical code to ease computer data entry.

Using the Variable View for coding variables

To code a variable in SPSS we need to enter the Variable View by clicking the Variable View tab.

1. Variable Name

In the first column enter the variable name. This can take a maximum of 8 characters. In this example we can use the name Gender in the first column of the first row. Note: The Variable Type, Width and Decimals will be entered automatically.

2. Variable Label

Enter a Variable Label in the box provided. The Variable Label provides a more detailed description of the question than the Variable Name, as it can contain many more characters. In our example we might enter Gender of Respondent as the Variable Label. If this variable is used in any analysis it is the Variable Label that appears in the Output. If no Variable Label is entered then the Variable Name is used in any Output.

3. Variable Values

Value labels are used to define the coding system for the variable. To define a coding system for a variable, click the grey box in the Values cell. Type the Value and the corresponding Value Label in the boxes provided (e.g. 1 as the Value and Male as the Value Label) and then click the Add button.

Repeat this process until all Values and Value Labels are entered.

4. Missing Values

Invariably, when data are collected using self-completed questionnaire forms there will be several questions left unanswered. This can happen for a variety of reasons, including simple carelessness, a lack of willingness on the part of the respondent to supply the desired information, or a lack of competence to answer the question. A question may also not be relevant to some of the respondents. In SPSS we must enter a code to represent missing data. In this example we will enter the value 0 to represent a non-response. To enter a missing value code, click the grey box in the Missing cell and select Discrete missing values. Enter 0 as the missing value and click OK.
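The same coding can also be applied from a syntax window. A minimal sketch, assuming the variable is named gender and coded as above:

    * Label the variable, define its value labels and set 0 as user-missing.
    VARIABLE LABELS gender 'Gender of Respondent'.
    VALUE LABELS gender 1 'Male' 2 'Female'.
    MISSING VALUES gender (0).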

All variables on the questionnaire can be coded in this way. It should be noted that most ratio/interval variables will not need value labels. For example, for the variable How many times have you been abroad?, there is no need to define value labels, because the answers will already be numerical values.

Codes for grade

Use 1 as the missing value for Maths and Stats marks.

Completed variable view

Completed data view

From the View menu select Value Labels.

Completed data view (value labels on)

3 DESCRIPTIVE STATISTICS
Descriptive Statistics are a group of techniques for describing the breakdown of a variable or variables. They include Frequency Tables, Mean Scores, Graphs and other Statistics.

3.1 Frequencies
The Frequencies procedure provides statistics and graphical displays that are useful for describing many types of variables. For a first look at your data, the Frequencies procedure is a good place to start.

To use the Frequencies procedure:

1. Select Descriptive Statistics from the Analyze menu.
2. Select Frequencies in the sub-group.
3. The Frequencies window contains a list of all the variables in the file, a Variables box, a Display frequency tables option and three pushbuttons, namely Statistics, Charts and Format.
4. Select the variables you are interested in from the variable list on the left and then place them in the Variables box on the right (using the arrow in the centre).
5. For a frequency distribution table, tick the Display frequency tables box.

6. If you require some variable statistics, use the Statistics button, select the appropriate statistic and click the adjoining box. Then click Continue.
7. For bar charts, pie charts and histograms, click the Charts button and select the relevant chart as above. Then press Continue.
8. When all statistics and plots have been chosen, click the OK button.
9. The results will be displayed in a separate output window.

Example: Measures of central tendency, dispersion and shape for Maths and Stats marks

1. Select Descriptive Statistics from the Analyze menu.
2. Select Frequencies in the sub-group.
3. Select the variables Exam mark for Maths and Exam mark for Stats and move them over to the Variables box on the right.
4. Make sure the Display frequency tables option is off.
5. Click the Statistics button.
6. Select some central tendency and dispersion measures.
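If you prefer syntax, the same request can be sketched as follows (the variable names maths and stats are assumed here; use whatever names appear in your Variable View):

    FREQUENCIES VARIABLES=maths stats
      /FORMAT=NOTABLE
      /STATISTICS=MEAN MEDIAN MODE STDDEV MINIMUM MAXIMUM SKEWNESS.

The /FORMAT=NOTABLE line suppresses the frequency table, matching step 4 above.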

Graphical Displays

Histogram
1. Select Histogram from the Graphs menu.
2. Place the Exam marks for Maths variable in the variable box and click the OK button.

Stem and Leaf Plots
1. Select Descriptive Statistics from the Analyze menu.
2. Select Explore in the sub-group.
3. Select the variables you are interested in from the variable list on the left and then place them in the Dependent List box on the right (using the arrow in the centre). In this example place the Exam marks for Stats variable in the box.
4. Click on the Plots button at the bottom of the dialog box.
5. Select the stem-and-leaf option.
6. Then press Continue.
7. Then click the OK button.

Box Plots
1. Select Boxplot from the Graphs menu.
2. Select Simple in the sub-group and Summaries of separate variables, and then click the Define button.
3. Select the variables you are interested in from the variable list on the left and then place them in the Boxes Represent box on the right (using the arrow in the centre). In this example place the Exam marks for Maths and Exam marks for Stats variables in the box.
4. Then click OK.
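Approximate syntax equivalents for these plots, again assuming the mark variables are named maths and stats:

    * Histogram of the Maths marks.
    GRAPH /HISTOGRAM=maths.
    * Stem-and-leaf plot of the Stats marks.
    EXAMINE VARIABLES=stats /PLOT STEMLEAF.
    * Boxplots of both variables side by side.
    EXAMINE VARIABLES=maths stats /PLOT BOXPLOT /COMPARE VARIABLES.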

Frequency Distribution

1. Select Descriptive Statistics from the Analyze menu.
2. Select Frequencies in the sub-group.
3. Select the variable Gender and place it in the Variables box.
4. For a frequency distribution table, tick the Display frequency tables box.
5. For a bar chart or pie chart, click the Charts button and select the relevant chart as above. Then press Continue.
6. Then click the OK button.

Bar Charts
1. Select Bar from the Graphs menu.
2. Select a Simple bar chart and Summaries for groups of cases.
3. Place Gender in the Category Axis box and press OK.

Clustered Bar Charts
1. Select Bar from the Graphs menu.
2. Select a Clustered bar chart and Summaries for groups of cases.
3. Place the overall grade variable in the Define Clusters by box and Gender in the Category Axis box.
4. Under Bars Represent select % of cases.
5. Click OK.
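A simple bar chart of gender can also be requested with syntax (a sketch; the clustered version is easiest to obtain by pasting the syntax from the dialog itself):

    GRAPH /BAR(SIMPLE)=COUNT BY gender.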

4. RE-CODING DATA
For analysis purposes it is often desirable to re-code a variable or to calculate a new variable using a combination of old variables.

4.1 Re-coding


A variable like Grade with values (1 = A, 2 = B, 3 = C, 4 = D, 5 = Fail) can be re-coded into a new variable with two values, say (1 = Grade C and higher, 2 = Grade D and less), using the following procedure:

1. In the Transform menu select Recode.
2. Select Into Different Variables.
3. Select the input variable (the variable to be re-coded) from the variable list and click the black arrow. In this example, use the Grade variable.
4. Enter the name of the new (output) variable, e.g. GradeCat.
5. Enter a label for the new variable (optional).
6. Click the Change button.
7. Click the Old and New Values button.
8. Enter the old value of the variable on the left side of the screen and the new value on the right, and click the Add button. For instance, in the example above a respondent who obtained an A was coded as 1, a B was coded as 2 and a C was coded as 3 in the old variable; all of these will be coded as 1 in the new variable. Those who obtained a D or failed were coded as 4 and 5 respectively in the old variable and will be coded as 2 in the new variable.
9. Click Continue.
10. Click OK. The new variable should appear at the end of the variable list.
11. Enter an appropriate label for this new variable and code the value labels appropriately. Use 1 as the value for the missing value.
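The same recode can be sketched in syntax, assuming the original variable is named grade:

    RECODE grade (1 thru 3=1) (4 thru 5=2) INTO gradecat.
    VARIABLE LABELS gradecat 'Grade category'.
    VALUE LABELS gradecat 1 'Grade C and higher' 2 'Grade D and less'.
    EXECUTE.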

4.2 Compute

The Compute command allows the user to form a new variable using a mathematical combination of several variables in the data set. In many surveys a researcher will ask several questions (on a Likert scale) on a particular topic and ask the respondent to rate their agreement. To gauge the respondent's overall view on the topic, these questions can be combined to calculate an average score. To do this, or to calculate any other combination of variables, use the following procedure:
1. In the Transform menu select Compute.
2. Enter a name for the new variable (Target Variable).
3. In the Numeric Expression window enter the formula for calculating the new variable, using the variable list and the calculator, e.g. Var1 + Var2 (added) or Var1 * Var2 (multiplied).
4. Click OK.

There are a large number of mathematical functions available for these calculations. The new variable will appear at the end of the variable list.

Example: Compute the mean mark for students who sat the Maths and Stats exams.

1. In the Transform menu select Compute and type MeanMark as the new variable in the Target Variable box.
2. In the Function group box, select Statistical.
3. In the Functions and Special Variables box double-click on the Mean function. The function MEAN(?,?) should now be in the Numeric Expression box.
4. Now double-click on the variable for Exam marks for Maths to replace the first ? in the MEAN expression. Remove the second ? in the MEAN expression and then double-click on the variable for Exam marks for Stats. You should now have the expression MEAN(Maths, Stats) in the expression box.
5. Click OK. You should now see a new variable MeanMark added in the Variable View window. In the Data View window, the MeanMark variable gives the mean mark for Maths and Stats for each student.
6. Enter an appropriate label for this new variable and use 1 as the value for the missing value.
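In syntax form this is simply (assuming the mark variables are named maths and stats):

    COMPUTE meanmark = MEAN(maths, stats).
    EXECUTE.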

Once you have the data, it is fairly simple to compute a one-sample t-test in SPSS. Choose, from the menu, ANALYZE / COMPARE MEANS / ONE-SAMPLE T TEST. You will get the dialog box that appears below:

Your test variable is amount; click this over into the test variable column. What is your test value? It is the value of the mean under the null, in this case, 2. Enter 2 into the test value box. You may want to check out the options box to see what is available to you. The box is below.

Note that you will automatically get the SPSS default 95% confidence interval for the data. The missing values command tells SPSS how to deal with missing data; the SPSS default is fine for most purposes (this will become more critical when you are dealing with more variables and have missing data in your set). Click OK once you have set up your dialog boxes correctly, and SPSS will run the test. The output that you will receive is below:
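The pasted syntax for this test looks roughly like this (assuming the test variable is named amount, as above):

    T-TEST /TESTVAL=2 /VARIABLES=amount.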

Take a moment to see what you have here. Of course, your critical information is the t value (-2.74), your degrees of freedom (9), and the significance of your test (.023; notice that you can get exact p-values in SPSS). You also get your sample mean, the standard deviation (this is the estimated standard deviation), and the estimated standard error of the mean. You could compute the t-test yourself with this information:

t = (1.177 - 2) / (.9499 / sqrt(10)) = -2.74

Also note that your 95% confidence interval of the difference ranges from -1.50 to -.14. It is strange that SPSS computes the confidence intervals this way, but you can easily get the confidence intervals around the mean as follows:

If you were to compute the confidence intervals by hand, you would calculate

Mean +/- t-crit (alpha = .05, 2-tailed) * (sigma-hat / sqrt(N))

or 1.177 +/- 2.262 * (.9499 / sqrt(10)), which gives a confidence interval of .4975 to 1.8565, because you use +/- .6795 to set the bounds of the interval. Using these bounds around the mean difference (-.8230) you get the SPSS values; using them around the mean (1.177) you get .4975 to 1.8565. These are the values we would use.

What would a confidence interval look like if you were using a lower alpha level (i.e., having a greater % confidence)? Let's try using a 99% confidence interval. Run the t-test as before, but select in the options that you want a 99% confidence interval.

You will get the table above. Notice that your confidence interval is now larger (this makes intuitive sense; to have a greater degree of confidence, you would need a larger range of possible values). This time, the confidence interval of the difference contains 0 (this is essentially the same as stating that the confidence interval around the mean contains 2; the confidence interval around the mean would be .201 to 2.153).

In other words, if you use an alpha level of .01, you will fail to reject the null hypothesis, and your confidence interval would contain the null value. You can see that your exact p-value is .023, too high to reject the null if alpha is .01.

Now let's do a paired-groups t-test. The example for this problem comes from the Implicit Association Test (IAT). The IAT method is used to assess automatic evaluations (i.e., evaluations that occur outside of conscious awareness and individuals' control). The purpose of this study was to see if the IAT method could be used to assess automatic associations toward specific people who play significant roles in our lives (e.g., mother, romantic partner). The hypothesis was that if the IAT is able to assess automatic attitudes, it should be able to discriminate between a positive significant person and a negative significant person. Subjects in the study were asked to nominate a positive person from their lives and a negative person, and then two IATs were performed to assess their automatic associations toward each person. The two variables represent the strength of positive automatic associations toward each person (larger numbers indicating more liking). Use a two-tailed test, alpha = .05.

Why is this a paired-samples t test (also called a repeated measures t test)? Because each participant is involved in both independent variable manipulations (in this case, both positive and negative significant person associations). The key to this type of test is that a line of data represents each participants performance. Thus, the two dependent variables are paired in that they apply to each participant.

To run this test, choose the commands

ANALYZE / COMPARE MEANS / PAIRED-SAMPLES T TEST


from the menu system. You will get the following dialog box:

Here, you need to click on both variables (the positive and negative dependent variables). Click one, hold down the SHIFT key, and click the other. Then select them to move to the test variable box. Your options are the same as they were for the one-sample t test. The output follows:
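The equivalent syntax, using the variable names pmsiat and nmsiat that appear later in this example, is roughly:

    T-TEST PAIRS=pmsiat WITH nmsiat (PAIRED).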

For this study, the null hypothesis is that there is NO DIFFERENCE between conditions (positive or negative), so H0: mu(positive) = mu(negative). The alternative hypothesis is H1: mu(positive) ≠ mu(negative). The first table of your output gives you the mean, standard deviation, and standard error of the mean of the two conditions. Don't worry about the second table for now, other than knowing that it shows how the two dependent variables correlate with one another. The third table is the t test itself. Notice that you have a t of 5.681, with 19 df, and that the significance is listed as .000. What kind of p-value is that? Well, since SPSS only displays three decimal places, you know that p is less than .0005. Your 95% confidence interval is constructed around the difference between the means. If the null hypothesis is that there is no difference, this value is 0. Since 0

is not in your 95% confidence interval, you can reject the null at the alpha level of .05 (actually even lower than that, as you know from the exact p-value). So, a paired-samples t test is essentially the same as a one-sample t test on the difference scores. Let's prove that you would get the same result if you were to run the test that way. First, compute a variable of the difference scores. Select: TRANSFORM / COMPUTE

You will get the following dialog box:

Specify that your target variable (diff) is equal to pmsiat - nmsiat. Then click OK. Now, run the ANALYZE / COMPARE MEANS / ONE-SAMPLE T TEST command again. This time, your test variable is diff. You will get the following:
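As a sketch in syntax, the two steps are:

    COMPUTE diff = pmsiat - nmsiat.
    EXECUTE.
    T-TEST /TESTVAL=0 /VARIABLES=diff.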

Notice that this is the same t, with the same df, significance level, and confidence intervals as when you ran the paired-samples t test.

Independent-samples t test: Now, let's get a new data set that applies to independent groups. In this study, the IAT is again used, this time to assess automatic associations with mother. The variable of interest is gender; that is, do males and females differ in their automatic associations with their mothers? H0: mu(males) = mu(females). The alternative hypothesis is H1: mu(males) ≠ mu(females). The data set is as follows:

Notice that you still have two variables, but here the first variable is the independent variable (sex) and the second is the dependent variable (momiatms, a measure of automatic associations to mother). This should look quite different from the paired-samples t test; put simply, nothing is paired in this study.
To run this t test, select:

ANALYZE / COMPARE MEANS / INDEPENDENT-SAMPLES T TEST

The dialog box that you will get follows:

Your test variable (the DV) is momiatms; the grouping variable is sex. You need to define the groups, which are 0 and 1 (what the experimenter has used, but you can use anything; here 1 is male and 0 is female). Once you click OK, you will get the following:
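The pasted syntax for this test is roughly:

    T-TEST GROUPS=sex(0 1) /VARIABLES=momiatms.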

Not much is new here; Levene's test is a measure that tests an important assumption of the independent-samples t-test, namely that the two groups have equal variances. Here we'll assume that the variances are equal (use the first line in the second table for your t). What can you say about this study? It looks like your t, with 24 degrees of freedom (N1 + N2 - 2), is 2.084. The p-value is .048. You can reject the null hypothesis at the alpha level of .05, but what do you conclude? It appears that males have more positive automatic associations with mother than females do.

Correlation
How does education influence the types of occupations that people enter? One way to think about occupations is in terms of occupational prestige. Your data set includes a variable, PRESTG80, in which a prestige score was assigned to respondents' occupations, where higher numbers indicate greater prestige. Let's hypothesize that as education increases, the level of prestige of one's occupation also increases. To test this hypothesis, click on "Analyze," "Correlate," and "Bivariate." The dialog box shown in Figure 7-1 will appear on your screen. Click on EDUC, and then click the arrow to move it into the box. Do the same with PRESTG80.

Figure 7-1

The most widely used bivariate test is the Pearson correlation. It is intended to be used when both variables are measured at either the interval or ratio level, and each variable is normally distributed. However, sometimes we do violate these assumptions. If you do a histogram of both EDUC (chapter 4) and PRESTG80, you will notice that neither is actually normally distributed. Furthermore, if you noted that PRESTG80 is really an ordinal measure, not an interval one, you would be correct. Nevertheless, most analysts would use the Pearson correlation because the variables are close to being normally distributed, the ordinal variable has many ranks, and because the Pearson correlation is the one they are used to. SPSS includes another correlation test, Spearman's rho, that is designed to analyze variables that are not normally distributed, or are ranked, as is PRESTG80. We will conduct both tests to see if our hypothesis is supported, and also to see how much the results differ depending on the test used; in other words, whether those who use the Pearson correlation on these types of variables are seriously off base. In the dialog box, the box next to Pearson is already checked, as this is the default. Click in the box next to Spearman. Your dialog box should now look like the one in Figure 7-2. Click OK to run the tests.
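The two tests can also be requested with syntax, approximately:

    CORRELATIONS /VARIABLES=educ prestg80 /PRINT=TWOTAIL NOSIG.
    NONPAR CORR /VARIABLES=educ prestg80 /PRINT=SPEARMAN TWOTAIL NOSIG.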

Figure 7-2

Your output screen will show two tables: one for the Pearson correlation, and one for Spearman's rho. The results of the Pearson correlation, which is called a correlation matrix, should look like the one in Figure 7-3:

Figure 7-3

The correlation coefficient may range from -1 to +1, where -1 or +1 indicates a perfect relationship. The further the coefficient is from 0, regardless of whether it is positive or negative, the stronger the relationship between the two variables. Thus, a coefficient of .453 is exactly as strong as a coefficient of -.453. Positive coefficients tell us there is a direct relationship: when one variable increases, the other increases. Negative coefficients tell us that there is an inverse relationship: when one variable increases, the other one decreases.

Notice that the Pearson coefficient for the relationship between education and occupational prestige is .520, and it is positive. This tells us that, just as we predicted, as education increases, occupational prestige increases. But should we consider the relationship strong? At .520, the coefficient is only about half as large as is possible. It should not surprise us, however, that the relationship is not perfect (a coefficient of 1). Education appears to be an important predictor of occupational prestige, but no doubt you can think of other reasons why people might enter a particular occupation. For example, someone with a college degree may decide that they really wanted to be a cheese-maker, which has an occupational prestige score of only 29, while a high-school dropout may one day become an owner of a bowling alley, which has a prestige score of 44. Given the variety of factors that may influence one's occupational choice, a coefficient of .520 suggests that the relationship between education and occupational prestige is actually quite strong.

The correlation matrix also gives the probability of being wrong if we assume that the relationship we find in our sample accurately reflects the relationship between education and occupational prestige that exists in the total population from which the sample was drawn (labeled as Sig. (2-tailed)). The probability value is .000 (remember that the value is rounded to three digits), which is well below the conventional threshold of p < .05. Thus, our hypothesis is supported. There is a relationship (the coefficient is not 0), it is in the predicted direction (positive), and we can generalize the results to the population (p < .05).

Recall that we had some concerns about using the Pearson coefficient, given that PRESTG80 is measured as an ordinal variable. Figure 7-4 shows the results using Spearman's rho. Notice that the coefficient, .523, is nearly identical to the coefficient obtained using the Pearson correlation. What do you conclude?

Figure 7-4

Regression
We can also analyze the relationship between education and occupational prestige using regression analysis. But first, let's look at the relationship graphically by creating a scatterplot. Click on "Graphs," "Scatter" and "Define" (we will use the default format, Simple). This will open up the dialog box shown in Figure 7-5. In the box on the left, click on EDUC, then on the arrow key that is pointing toward the box labeled "X Axis" (because it is the independent variable in our hypothesis). Next, click on PRESTG80 and move it into the box labeled "Y Axis" (because it is the dependent variable). Your dialog box should look like the one in Figure 7-5. Then, click OK.
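The corresponding syntax is roughly:

    GRAPH /SCATTERPLOT(BIVAR)=educ WITH prestg80.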

Figure 7-5 What you see is a plot of the number of years of education by the occupational prestige score for persons in the data set who have a job. Your scatterplot should look like the one in Figure 7-6:

Figure 7-6 You can edit your graph to make it easier to interpret. First, double-click anywhere in the graph. This will cause the graph to open in its own window. Then, double-click on the X-axis. A dialog box will open. In the Range section of the box, change the Minimum to 0. In the Major and Minor Divisions sections, change the Increments to 2. When you finish, it should look like the one in Figure 7-7. Then, click OK.

Figure 7-7 Now, on the Menu Bar, click on Chart, then Options. In the Fit Line section, click in the box next to Total. Then, click on the Fit Options button, and click in the box next to Display R-square in legend. Your boxes should look like those in Figures 7-8 and 7-9. Click Continue, then OK.

Figure 7-8

Figure 7-9

Your graph now looks like the one in Figure 7-10. Notice the Fit Line that is now drawn on the graph. Regression (and correlation) analyze linear relationships between variables, finding the line that best fits the data (i.e., it keeps the errors, the distances of points from the line, to a minimum). The Fit Line shows you the line that describes the linear relationship. Also notice the R-square statistic listed to the right of the graph. Multiplied by 100, this statistic tells us the percentage of the variation in the dependent variable (PRESTG80, on the Y-axis) that is explained by the scores on the independent variable (EDUC, on the X-axis). Thus, years of education predicts 27.03% of the variation in occupational prestige in our sample. Recall that the Pearson coefficient was .520. If you square the Pearson coefficient (.520 x .520), you get .2704, the same as the R-square (give or take some rounding)! Thus, by knowing the correlation coefficient, you can also know the amount of variance in one variable (dependent) that is explained by the other variable (independent) in a bivariate analysis.

Figure 7-10

Doing a regression analysis can help us to understand the Fit Line in more detail.

Figure 7-11

We can get more information about the regression line. Minimize the SPSS Chart Editor. Click on "Analyze," "Regression," and "Linear." This opens up the dialog box shown in Figure 7-11. Move PRESTG80 to the "Dependent" box, and EDUC to the "Independent(s)" box. Click OK. The results should look like those shown in Figure 7-12.
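The equivalent syntax is approximately:

    REGRESSION
      /DEPENDENT prestg80
      /METHOD=ENTER educ.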

Figure 7-12 The first table just shows the variables that have been included in the analysis. The second table, Model Summary, shows the R-square statistic, which is .270. Where have you seen this before? What does it mean?

The third table, ANOVA, gives you information about the model as a whole. ANOVA is discussed briefly in chapter 6. The final table, Coefficients, gives results of the regression analysis that are not available using only correlation techniques. Look at the Unstandardized Coefficients column. Two statistics

are reported: B, which is the regression coefficient, and the standard error. Notice that there are two statistics reported under B: one labeled as (Constant), the other labeled as EDUC. The statistic labeled as EDUC is the regression coefficient, which is the slope of the line that you saw on the scatterplot (note that in scholarly reports, it is conventional to refer to the regression coefficient using the lower case, b). The one labeled as (Constant) is not actually a regression coefficient, but is the Y-intercept (SPSS reports it in this column for convenience only). What do these numbers mean? You may recall from your statistics course that the formula for a line is:

Y = a + bX

Y refers to the value of the dependent variable for a given case, a is the Y-intercept (the point where the line crosses the Y-axis, listed as Constant on your output), b is the slope of the line which describes the relationship between the independent and dependent variables (B for EDUC), and X is the value of the independent variable for a given case.

We know that the linear relationship between X and Y (EDUC and PRESTG80) is not perfect. The correlation coefficient was not 1 (or -1), and the scatterplot showed plenty of cases that did not fall directly on the line. Thus, it is clear to us that knowing someone's education will not tell us without fail what their occupational prestige is, and furthermore, we are only analyzing a sample of cases and not the whole population to which we want to generalize our findings. It is clear that there is some error built into our findings (the reason that the Fit Line is usually called the Best Fit Line). For these reasons, it is conventional to write the formula for the line as

Ŷ = a + bX + e

where e refers to error. What can we do with this formula? One thing we can do is make predictions about particular values of the independent variable, using just a little arithmetic. All we have to do is plug the values from our output into the formula for a line (for our purposes, we will ignore the e):

Ŷ = 9.84 + 2.565X

9.84, the Y-intercept (or Constant), is interpreted as the average occupational prestige score (our dependent, or Y, variable), holding constant the effects of education (our independent, or X, variable). 2.565 is the slope of the line. That is, if you refer back to the scatterplot, if you move one unit to the right on the X-axis, then move 2.565 units upward, you will intersect with the regression line. (It is possible to have a negative coefficient. In that case, to intersect with the line, you would move one unit to the right, and then B units downward.)

What occupational prestige score would our results predict for a person who completed high school, but no higher education? All we have to do is enter 12 (as in twelve years of education) into our equation:

Ŷ = 9.84 + 2.565(12) = 40.62

We find that having 12 years of education is associated with an occupational prestige score of 40.62. But what of the error? We know that not every high school graduate has this exact prestige score. We acknowledge this when we discuss results by stating that, on average, those with 12 years of education will have occupations with prestige scores of 40.62. This language points out to our readers that it is likely that some of those respondents scored higher and some lower, but that 40.62 represents a central point. In sum, the error tells us about the distance between actual values of Y (the answers that the GSS survey respondents gave) and predicted values of Y (the ones you calculate based on the GSS respondents' information in the X variable). Thus, the error is the difference between a predicted value of Y for a given case and the actual value of Y for a given case (Ŷ - Y).

More generally, though, when we discuss regression results, we rarely compute predicted scores for particular values of the independent variable. Instead, in scholarly reports, we usually point out the general process at work. In our case, we would say that each additional year of education is associated with a 2.565 increase on the occupational prestige scale. Note that we refer to an additional year of education because our independent variable was measured as years of school completed. Thus, the unit of measurement is years. We say there was a 2.565 increase in prestige with a unit increase in education, because that is the distance we have to move, parallel to the Y-axis (which represents occupational prestige), to intersect with the regression line.

Confidence Intervals

Confidence intervals help us estimate parameters of a population from sample statistics. Confidence intervals are a range of possible values of a parameter. We express this interval with a specific degree of confidence. The degree of confidence tells a reader how confident we are that the population parameter falls within our stated interval. For this example we will use the variable How spiritual do you consider yourself? in our example data set made up of genetic counselors. The name of this variable in the data set is sprscale. This variable allowed the sample of genetic counselors to rank how spiritual they are on a scale of 1 to 10.

To compute a confidence interval in SPSS, you begin by selecting Analyze, then Descriptive Statistics, then Explore. Once the Explore window pops up, scroll down the list on the left until you find sprscale, then click on the arrow that will send it to the Dependent List:
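The equivalent syntax, sketched for sprscale with the default 95% level, is approximately:

    EXAMINE VARIABLES=sprscale
      /PLOT NONE
      /STATISTICS DESCRIPTIVES
      /CINTERVAL 95.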
