David Nadler
STAGE 1: Descriptive Statistics
Table 1-1
Table 1-2
Table 1-3
Table 1-4
FEV by Gender
                               Gender   N     Mean     Std. Deviation   Std. Error Mean
Forced Expiratory Volume (L)   Female   318   2.4512   .64574           .03621
                               Male     336   2.8124   1.00360          .05475
Table 1-5
Testing for Significant Mean Differences of FEV between Boys and Girls
Forced Expiratory Volume (L)
                                                            Equal variances   Equal variances
                                                            assumed           not assumed
Levene's Test for Equality of Variances   F                 68.592
                                          Sig.              .000
t-test for Equality of Means              t                                   -5.504
                                          df                                  575.753
                                          Sig. (2-tailed)                     .000
According to Tables 1-4 and 1-5, females had a lower mean FEV (M = 2.4512, SD = .64574)
than males (M = 2.8124, SD = 1.00360), t(575.753) = -5.504, p < .001. Because Levene's test
was significant (F = 68.592, p < .001), equal variances were not assumed.
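The t statistic and fractional degrees of freedom reported above can be approximately reproduced from the summary statistics in Table 1-4. The sketch below assumes the unequal-variance (Welch) form of the test with Welch-Satterthwaite degrees of freedom; the function name `welch_t` is illustrative and not part of the original analysis.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group's mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from Table 1-4 (females first, then males)
t, df = welch_t(2.4512, 0.64574, 318, 2.8124, 1.00360, 336)
print(f"t({df:.3f}) = {t:.3f}")
```

Small differences from the reported values are expected because the inputs are rounded summary statistics rather than the raw data.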
Table 1-6
Table 1-7
Testing for Significant Mean Differences of FEV between Smokers and Non-Smokers
Forced Expiratory Volume (L)
                                                            Equal variances   Equal variances
                                                            assumed           not assumed
Levene's Test for Equality of Variances   F                 1.596
                                          Sig.              .207
t-test for Equality of Means              t                 -6.464            -7.150
                                          df                                  83.273
                                          Sig. (2-tailed)   .000
According to Tables 1-6 and 1-7, smokers had a higher mean FEV (M = 3.2769, SD = .74999)
than non-smokers (M = 2.5661, SD = .85052), t(652) = -6.464, p < .001. Levene's test was not
significant (F = 1.596, p = .207), so equal variances were assumed.
Table 2-1
Correlations between FEV and Age, Height, Gender and Smoking Status
Variable         Statistic              Forced Expiratory Volume (L)
Age              Pearson Correlation    .756
                 Sig. (2-tailed)        .000
Height           Pearson Correlation    .868
                 Sig. (2-tailed)        .000
Gender           Spearman Correlation   .144
                 Sig. (2-tailed)        .000
Smoking Status   Spearman Correlation   .258
                 Sig. (2-tailed)        .000
According to the correlations in Table 2-1, the independent variables Age, Height, Gender,
and Smoking Status are all significantly correlated with FEV (all p < .001). The correlation
coefficients for Age (.756) and Height (.868) are much greater than those for Gender (.144) and
Smoking Status (.258), so Age and Height are more strongly correlated with FEV.
Correlations of Age and Height with FEV were calculated using the Pearson method
because these variables are on a continuous scale. Correlations of Gender and Smoking Status
with FEV were calculated using the Spearman method because these variables are nominal or
ordinal.
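To make the distinction between the two methods concrete, here is a minimal sketch in plain Python. It assumes Spearman's coefficient is computed as the Pearson correlation of the ranks, with tied values given their average rank, which matters for a two-category variable such as smoking status. The data values are hypothetical and are not taken from the FEV dataset.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical illustration data (not from the study)
height = [52.0, 57.5, 60.0, 63.5, 66.0, 69.0]
fev = [1.7, 2.1, 2.6, 2.9, 3.4, 4.1]
smoker = [0, 0, 0, 1, 0, 1]  # assumed coding: 0 = non-smoker, 1 = smoker

print(f"Pearson (height, FEV):  {pearson(height, fev):.3f}")
print(f"Spearman (smoker, FEV): {spearman(smoker, fev):.3f}")
```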
Table 2-2
Table 2-3
Adjusted R-square, the coefficient of determination adjusted for the number of predictors,
shows how much of the variance of the dependent variable is explained by the model. In this
case, 57.2% of the variance of FEV can be explained by this model.
Table 2-4
Table 2-5
Normality check
To check for the normality of distribution, we can first plot the Regression Standardized
Residual and see if it follows the shape of a normal curve. We plotted the histograms of the
Regression Standardized Residual of the independent variables on the dependent variable in
The points of a normal P-P plot should fall along the 45-degree line if the residuals are
normally distributed. Figure 2 shows that the points follow the 45-degree line, which further
supports the assumption of normality.
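The idea behind the P-P plot can be sketched in a few lines: sort the residuals, then pair each one's observed cumulative proportion with the cumulative probability expected under a fitted normal distribution. The sketch below assumes the plotting position (i - 0.5) / n, which may differ from the formula used by the statistics package, and the residual values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def pp_points(residuals):
    """Return (observed, expected) cumulative probabilities for a normal P-P plot."""
    n = len(residuals)
    fitted = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    # Observed: empirical cumulative proportion (assumed plotting position)
    observed = [(i - 0.5) / n for i in range(1, n + 1)]
    # Expected: cumulative probability under the fitted normal distribution
    expected = [fitted.cdf(r) for r in ordered]
    return list(zip(observed, expected))

# Hypothetical standardized residuals (not from the study)
residuals = [-1.8, -1.1, -0.7, -0.3, -0.1, 0.2, 0.4, 0.9, 1.2, 1.6]
points = pp_points(residuals)

# Points on the 45-degree line have observed == expected
max_gap = max(abs(o - e) for o, e in points)
print(f"largest departure from the 45-degree line: {max_gap:.3f}")
```

A small maximum gap corresponds to points that hug the diagonal, which is the visual pattern described for Figure 2.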
Figure 2. Normal P-P Plot for Forced Expiratory Volume
Linearity
When we check the homoscedasticity, or homogeneity of variance, of our model, we are
checking that the variance of the residuals is constant across the range of predicted values.
The null hypothesis for homoscedasticity is that the variance
is homogeneous throughout. According to Figure 3, the variance of the residuals is not constant
Collinearity
Table 1
Collinearity occurs when one independent variable is highly correlated with another
independent variable. It inflates the standard errors of the beta coefficients and makes it
difficult to understand how each independent variable affects the outcome. If the tolerance of a
variable is less than .2, we can assume that collinearity exists. In Table 1 above, we see that the
tolerances for age, height, gender and smoking status are all above .2, so we can state that these
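The tolerance check described above can be sketched as follows. Tolerance is the reciprocal of the variance inflation factor (VIF), and the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The correlation matrix below is hypothetical, since the predictor intercorrelations are not reported here, and the helper names are illustrative.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def tolerances(corr):
    """Tolerance of each predictor: 1 / VIF, with VIF from the inverse correlation matrix."""
    inverse = invert(corr)
    return [1.0 / inverse[i][i] for i in range(len(corr))]

# Hypothetical predictor correlation matrix (not from the study)
names = ["age", "height", "gender", "smoking status"]
corr = [
    [1.00, 0.79, 0.03, 0.40],
    [0.79, 1.00, 0.16, 0.25],
    [0.03, 0.16, 1.00, -0.08],
    [0.40, 0.25, -0.08, 1.00],
]
for name, tol in zip(names, tolerances(corr)):
    flag = "possible collinearity" if tol < 0.2 else "ok"
    print(f"{name}: tolerance = {tol:.3f} ({flag})")
```

With only two predictors, this reduces to tolerance = 1 - r^2, where r is the correlation between them.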
Stage 4. Multiple Regression
A multiple regression was run using forced entry, with forced expiratory volume as the
dependent variable (outcome) and age, height, gender and smoking status as the predictors. The
model summary gives an Adjusted R Square of .774, which means that 77.4% of the variance in
FEV can be explained by this model. The model is significant: F(4, 649) = 560.021, p < .001.
Age, height and gender are significant factors (all have p < .001). Smoking status was not a
Table 1 above presents the coefficients of the predictor variables on the outcome. For
example, a one-unit increase in age is associated with a .066 increase in forced expiratory
volume, holding the other predictors constant.
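The model summary reported for the forced-entry model can be cross-checked with two standard identities: R-squared can be recovered from the overall F statistic and its degrees of freedom, and adjusted R-squared follows from R-squared, the sample size, and the number of predictors. The sketch below assumes n = 654, which is implied by F(4, 649); the function names are illustrative.

```python
def r_squared_from_f(f_value, df_model, df_error):
    """Recover R-squared from an overall F statistic and its degrees of freedom."""
    return (f_value * df_model) / (f_value * df_model + df_error)

def adjusted_r_squared(r_squared, n, n_predictors):
    """Adjusted R-squared: R-squared penalized for the number of predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)

# Reported for the forced-entry model: F(4, 649) = 560.021, so n = 649 + 4 + 1 = 654
r2 = r_squared_from_f(560.021, 4, 649)
adj_r2 = adjusted_r_squared(r2, n=654, n_predictors=4)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

The adjusted value obtained this way rounds to .774, in line with the Adjusted R Square quoted in the model summary.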
In the stepwise method, the analysis determines which of the predictors are significant
enough to keep and removes the ones that are not. The same data are entered into this model as
with the simultaneous (forced entry) method, except that the method is changed to stepwise.
When height is the only predictor variable, it accounts for 75.3% of the variance in FEV. This
model is significant: F(1, 652) = 1994.731, p < .001. When age is then added as a predictor,
76.6% of the variance is accounted for. This model is significant: F(2, 651) = 1067.956,
p < .001. When gender is then added to age and height as a predictor, 77.4% of the variance is
accounted for. This model is
Table 2 shows how the predictors affect the outcome (forced expiratory volume) in the
stepwise model. Note that smoking status was excluded from this model. Model 3 shows how the
three significant predictor variables affect the outcome. For example, a one-unit increase in
height is associated with a .105 increase in forced expiratory volume, holding the other
predictors constant. We prefer the forced entry multiple regression model above because it
accounts for all four predictor variables, not just the ones that were shown to be significant
predictors.
Table 2
Unstandardized Coefficients   Standardized Coefficients
predictor variables (age, height and gender) had a significant effect on the outcome (forced
expiratory volume), using smoking status as a covariate. Table 3 below presents the output of
the ANCOVA analysis.
Table 3
Source               df    F          p
Corrected Model      343   9.567      .000
Intercept            1     7917.868   .000
smoke                1     .480       .489
age                  16    2.241      .004
height               54    7.210      .000
sex                  1     6.373      .012
age * height         174   1.220      .066
age * sex            10    1.225      .274
height * sex         35    1.232      .180
age * height * sex   42    .729       .893
Total                654
Corrected Total      653
We find that, with smoking status (smoke) as the covariate, age, height and gender (sex)
are all significant predictors of forced expiratory volume. The interactions between these three
In the hierarchical method, we add the independent variables to the analysis in separate
blocks. The first block consists of the control variables, which should be known predictors
from previous research studies. The second block of variables is then analyzed to see how it
contributes to the model beyond the control variables. In this case, our first block included
the control variable smoking status. The second block of
Approximately 6% of the variance is explained by the control variable. When age,
gender and height are added to the model, 77.4% of the variance is explained. The model with
all four predictors is significant: F(4, 649) = 95.160, p < .001. The model with only the control
variable is also significant: F(1, 652) = .594, p = .619. The only nonsignificant predictor
variable in the model is smoking status (p = .141).
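The "approximately 6%" figure for the control-only block can be cross-checked against the t-test in Table 1-7. The sketch assumes that the first block is a simple regression of FEV on smoking status alone; in that case the model F equals the square of the pooled-variance t statistic, and R-squared follows from F and the error degrees of freedom. The function name is illustrative.

```python
def single_predictor_summary(t_value, df_error):
    """For a one-predictor regression, F = t^2 and R^2 = F / (F + df_error)."""
    f_value = t_value ** 2
    r_squared = f_value / (f_value + df_error)
    return f_value, r_squared

# Pooled-variance t-test from Table 1-7: t(652) = -6.464
f_value, r_squared = single_predictor_summary(-6.464, 652)
print(f"F(1, 652) = {f_value:.3f}, R^2 = {r_squared:.3f}")
```

Under this assumption the implied R-squared is about .06, which agrees with the share of variance attributed to the control variable.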
The result from the hierarchical method of multiple regression can be put into the
Method         FEV =
Hierarchical   -4.457 - .087(Smoking Status) + .066(Age) + .157(Gender) + .104(Height)
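As an illustration of how the hierarchical equation is used, the sketch below turns it into a small prediction function. The coding of smoking status and gender (assumed 0/1 here) and the units of age and height are not stated in this section, so the example case is hypothetical.

```python
def predict_fev(smoking_status, age, gender, height):
    """Predicted FEV (L) from the hierarchical regression equation."""
    return (-4.457
            - 0.087 * smoking_status
            + 0.066 * age
            + 0.157 * gender
            + 0.104 * height)

# Hypothetical case; the 0/1 coding and the measurement units are assumptions
example = predict_fev(smoking_status=0, age=10, gender=1, height=60)
print(f"predicted FEV = {example:.3f} L")

# A one-unit increase in age changes the prediction by the age coefficient
one_year_older = predict_fev(smoking_status=0, age=11, gender=1, height=60)
print(f"change for one more unit of age = {one_year_older - example:.3f}")
```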