Childhood Respiratory Disease in East Boston:

A Measure of Forced Expiratory Volume

David Nadler
STAGE 1: Descriptive Statistics
Table 1-1

Descriptive Statistics for Continuous Variables


Variable                       Parameter        Statistic   Std. Error
Age                            Mean             9.93        .116
                               Median           10.00
                               Variance         8.726
                               Std. Deviation   2.954
Forced Expiratory Volume (L)   Mean             2.6368      .03390
                               Median           2.5475
                               Variance         .752
                               Std. Deviation   .86706
Height                         Mean             61.1436     .22302
                               Median           61.5000
                               Variance         32.530
                               Std. Deviation   5.70351

Table 1-2

Percent of Participants by Gender


Gender Frequency Percent
Female 318 48.6
Male 336 51.4
Total 654 100.0

Table 1-3

Percent of Participants by Smoking Status


Status Frequency Percent
Non-smoker 589 90.1
Smoker 65 9.9
Total 654 100.0

Table 1-4

FEV by Gender
                               Gender   N     Mean     Std. Deviation   Std. Error Mean
Forced Expiratory Volume (L)   Female   318   2.4512   .64574           .03621
                               Male     336   2.8124   1.00360          .05475

Table 1-5

Testing for Significant Mean Differences of FEV between Boys and Girls
Forced Expiratory Volume (L)
                                                            Equal variances   Equal variances
                                                            assumed           not assumed
Levene's Test for Equality of Variances   F                 68.592
                                          Sig.              .000
t-test for Equality of Means              t                                   -5.504
                                          df                                  575.753
                                          Sig. (2-tailed)                     .000

According to Tables 1-4 and 1-5, females had lower FEV measurements (M = 2.4512,
SD = .64574) than males (M = 2.8124, SD = 1.00360), t(575.753) = -5.504, p < .001.
Levene's test indicated unequal variances (F = 68.592, p < .001), so the t-test with equal
variances not assumed is reported.
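
The t statistic and fractional degrees of freedom above can be approximately reproduced from the summary statistics in Table 1-4 alone. The following is a minimal sketch of Welch's unequal-variances t-test; the function name `welch_t` is illustrative and not part of the original analysis, and small differences from the reported values are expected because the inputs are rounded.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from Table 1-4 (females first, then males)
t_gender, df_gender = welch_t(2.4512, 0.64574, 318, 2.8124, 1.00360, 336)
print(f"t = {t_gender:.3f}, df = {df_gender:.3f}")
```

The result should agree with Table 1-5 to within rounding of the tabulated means and standard deviations.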

Table 1-6

FEV by Smoking Status


                               Smoking Status   N     Mean     Std. Deviation   Std. Error Mean
Forced Expiratory Volume (L)   Non-smoker       589   2.5661   .85052           .03505
                               Smoker           65    3.2769   .74999           .09302

Table 1-7

Testing for Significant Mean Differences of FEV between Smokers and Non-Smokers
Forced Expiratory Volume (L)
                                                            Equal variances   Equal variances
                                                            assumed           not assumed
Levene's Test for Equality of Variances   F                 1.596
                                          Sig.              .207
t-test for Equality of Means              t                 -6.464            -7.150
                                          df                                  83.273
                                          Sig. (2-tailed)   .000

According to Tables 1-6 and 1-7, smokers had a higher FEV (M = 3.2769, SD = .74999) than
non-smokers (M = 2.5661, SD = .85052), t(652) = -6.464, p < .001. Levene's test was not
significant (F = 1.596, p = .207), so the t-test with equal variances assumed is reported.
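
Because equal variances are assumed here, the test uses a pooled variance estimate and n1 + n2 - 2 degrees of freedom. The sketch below recomputes the statistic from the summary values in Table 1-6; the function name `pooled_t` is illustrative, and the result matches the reported value only up to rounding of the inputs.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic with a pooled variance estimate,
    computed from group summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Summary statistics from Table 1-6 (non-smokers first, then smokers)
t_smoke, df_smoke = pooled_t(2.5661, 0.85052, 589, 3.2769, 0.74999, 65)
print(f"t({df_smoke}) = {t_smoke:.3f}")
```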

STAGE 2: Correlation and Simple Regression

Table 2-1

Correlations between FEV and Age, Height, Gender and Smoking Status
Variable         Statistic              Forced Expiratory Volume (L)
Age              Pearson Correlation    .756
                 Sig. (2-tailed)        .000
Height           Pearson Correlation    .868
                 Sig. (2-tailed)        .000
Gender           Spearman Correlation   .144
                 Sig. (2-tailed)        .000
Smoking Status   Spearman Correlation   .258
                 Sig. (2-tailed)        .000

According to the correlations in Table 2-1, the independent variables Age, Height, Gender,
and Smoking Status are all significantly correlated with FEV. The correlation coefficients for
Age and Height are much greater than those for Gender and Smoking Status, so Age and Height
are more strongly correlated with FEV.

Age and Height correlations with FEV were calculated using the Pearson method because all
of these variables are on a continuous scale. Gender and Smoking Status correlations with FEV
were calculated using the Spearman method because these variables are nominal or ordinal.
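
To make the difference between the two coefficients concrete, here is a minimal pure-Python sketch. Spearman's coefficient is simply Pearson's coefficient applied to ranks, with tied values given their average rank. The function names and the toy values are hypothetical and are not the study data.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy values: a 0/1 group code and a continuous outcome
group = [0, 0, 0, 1, 1, 1]
outcome = [2.1, 2.4, 2.6, 2.5, 3.0, 3.3]
print(round(spearman(group, outcome), 3))
```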

Table 2-2

Linear Regression Data Using Age as a Predictor of FEV (in liters)


                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1   (Constant)   .432     .078                                             5.541    .000
    Age          .222     .008                 .756                        29.533   .000

For each one-year increase in age, we expect to see a .222 liter increase in FEV.
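
As a worked illustration, the coefficients in Table 2-2 define the fitted line FEV = .432 + .222 × Age. The short sketch below evaluates it; the function name is illustrative, and predictions are only meaningful within the age range of the sample.

```python
# Coefficients taken from Table 2-2
INTERCEPT = 0.432   # (Constant)
SLOPE_AGE = 0.222   # liters of FEV per year of age

def predict_fev_from_age(age_years):
    """Predicted FEV in liters from the simple regression on age."""
    return INTERCEPT + SLOPE_AGE * age_years

fev_at_10 = predict_fev_from_age(10)
fev_at_11 = predict_fev_from_age(11)
print(round(fev_at_10, 3), round(fev_at_11 - fev_at_10, 3))
```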

Table 2-3

Model Summary of Age as a Predictor of FEV


Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .756a   .572       .572                .56753

R, the multiple correlation coefficient, shows the linear correlation between the observed
and model predicted values of the dependent variable. The larger the value, the stronger the
relationship.

R-square, the coefficient of determination, shows how much of the variation of the
dependent variable is explained by the model. In this case, 57.2% of the variance of FEV can
be explained by this model (adjusted R-square is also .572).

Table 2-4

Linear Regression Data Using Height as a Predictor of FEV (in liters)


                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t         Sig.
1   (Constant)   -5.433   .181                                             -29.939   .000
    Height       .132     .003                 .868                        44.662    .000

For each one-inch increase in height, we expect to see a .132 liter increase in FEV.

Table 2-5

Model Summary of Height as a Predictor of FEV


Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .868a   .754       .753                .43068

R, the multiple correlation coefficient, shows the linear correlation between the observed
and model predicted values of the dependent variable. The larger the value, the stronger the
relationship.
R-square, the coefficient of determination, shows how much of the variation of the
dependent variable is explained by the model. In this case, 75.4% of the variance of FEV can
be explained by this model (adjusted R-square = .753).
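
With a single predictor, R is the absolute value of the Pearson correlation, R-square is its square, and the slope's t statistic can be recovered from r and the sample size. The sketch below checks these relationships against Tables 2-1 through 2-5. It assumes n = 654 (from Table 1-2); the helper name `t_from_r` is illustrative, and agreement is only approximate because r is tabulated to three decimals.

```python
import math

N = 654  # number of participants (Table 1-2)

def t_from_r(r, n):
    """t statistic for the slope in simple regression, derived from r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# (predictor, Pearson r, reported R Square, reported t)
checks = [
    ("Age", 0.756, 0.572, 29.533),
    ("Height", 0.868, 0.754, 44.662),
]

results = {}
for name, r, reported_r2, reported_t in checks:
    results[name] = (r ** 2, t_from_r(r, N))
    print(f"{name}: r^2 = {results[name][0]:.3f} (reported {reported_r2}), "
          f"t = {results[name][1]:.2f} (reported {reported_t})")
```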

STAGE 3: Assumption Checking

Normality check

To check for normality, we can first plot the regression standardized residuals and see
whether they follow the shape of a normal curve. Figure 1 shows the histogram of the regression
standardized residuals from the regression of the dependent variable on the independent
variables. The histogram is approximately normal in shape.

Figure 1. Histogram of Standardized Residuals for Forced Expiratory Volume

If the residuals are normally distributed, the points on a P-P plot follow the 45-degree
line. In Figure 2 the points follow that line closely, which further supports the normality
assumption.

Figure 2. Normal P-P Plot for Forced Expiratory Volume

Linearity

Figure 3. Scatterplot to Show Linearity of Data

When we check the homoscedasticity, or homogeneity of variance, of our model, we are
checking that the variance of the dependent variable around the regression line is equal across
the range of the independent variable. The null hypothesis for homoscedasticity is that the
variance is homogeneous throughout. According to Figure 3, the variance of the residuals is not
constant, so the scatterplot shows heteroscedasticity.
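
The idea can be illustrated numerically as well as visually. The sketch below fits a least-squares line to synthetic data whose noise grows with x, then compares the residual variance in the upper and lower halves of the x range. This is a crude split-sample comparison in the spirit of the Goldfeld-Quandt test, not a formal test, and both the data and the function names are hypothetical rather than taken from the study.

```python
def ols_fit(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def sample_variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Synthetic, deterministic data: the noise amplitude is proportional to x
x = [float(i) for i in range(1, 41)]
y = [2.0 + 0.5 * xi + 0.05 * xi * (1 if i % 2 == 0 else -1)
     for i, xi in enumerate(x)]

intercept, slope = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

half = len(x) // 2  # x is already sorted, so this splits low vs. high x
variance_ratio = sample_variance(residuals[half:]) / sample_variance(residuals[:half])
print(f"upper/lower residual variance ratio: {variance_ratio:.2f}")
```

A ratio well above 1 means the residual spread widens as x increases, which is the fan-shaped pattern associated with heteroscedasticity.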

Collinearity

Table 1

Coefficient Data for the Multiple Regression Model

                     Unstandardized Coefficients   Standardized Coefficients                    Collinearity Statistics
Model                B        Std. Error           Beta                        t         p      Tolerance   VIF
1   (Constant)       -4.457   .223                                             -20.001   .000
    Age              .066     .009                 .223                        6.904     .000   .331        3.019
    Height           .104     .005                 .685                        21.901    .000   .353        2.830
    Gender           .157     .033                 .091                        4.731     .000   .943        1.060
    Smoking Status   -.087    .059                 -.030                       -1.472    .141   .827        1.210

Collinearity means that one independent variable is highly correlated with another
independent variable, which increases the standard errors of the beta coefficients and makes it
difficult to understand how each independent variable affects the outcome. If the tolerance of a
variable is less than .2, we can assume that collinearity exists. In Table 1 above, the tolerances
for age, height, gender and smoking status are all above .2, so we can state that these variables
are not highly correlated with each other.
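
Tolerance and VIF carry the same information, since VIF is the reciprocal of tolerance. The sketch below applies the .2 tolerance cutoff used above to the values in Table 1 and recomputes each VIF; the variable and function names are illustrative, and small differences from the reported VIFs come from rounding of the tabulated tolerances.

```python
TOLERANCE_CUTOFF = 0.2  # rule of thumb used in the text

# Tolerance and VIF values from Table 1
tolerance = {"Age": 0.331, "Height": 0.353, "Gender": 0.943, "Smoking Status": 0.827}
reported_vif = {"Age": 3.019, "Height": 2.830, "Gender": 1.060, "Smoking Status": 1.210}

def vif_from_tolerance(tol):
    """Variance inflation factor is the reciprocal of tolerance."""
    return 1.0 / tol

flagged = [name for name, tol in tolerance.items() if tol < TOLERANCE_CUTOFF]

for name, tol in tolerance.items():
    print(f"{name}: tolerance = {tol}, VIF = {vif_from_tolerance(tol):.3f} "
          f"(reported {reported_vif[name]})")
print("Flagged for collinearity:", flagged)
```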

STAGE 4: Multiple Regression

A multiple regression was run by forced entry, with forced expiratory volume as the
dependent variable (outcome) and age, height, gender and smoking status as the predictors. The
model summary gives an Adjusted R Square of .774, which means that 77.4% of the variance can
be explained by this model. The model is significant: F(4, 649) = 560.021, p < .001. Age, height
and gender are significant predictors (all p < .001). Smoking status was not a significant
predictor (p = .141).

Table 1 above already presents the coefficients of the predictor variables. For example, a
one-year increase in age is associated with a .066 liter increase in forced expiratory volume,
holding the other predictors constant.

In the stepwise method, the analysis determines which predictors are significant enough to
keep and removes the ones that are not. The same data are entered as in the forced entry
method; only the method is changed to stepwise. When height is the only predictor variable, the
model accounts for 75.3% of the variance. This model is significant: F(1, 652) = 1994.731,
p < .001. When age is then added as a predictor, 76.6% of the variance is accounted for. This
model is significant: F(2, 651) = 1067.956, p < .001. When gender is then added to age and
height as a predictor, 77.4% of the variance is accounted for. This model is also significant:
F(3, 650) = 744.634, p < .001.
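
The variance percentages and the F statistics quoted for these models are linked: R-square can be recovered from F and its degrees of freedom, and adjusted R-square follows from R-square. The sketch below performs that consistency check for the three stepwise models and for the forced entry model reported earlier. It assumes the quoted percentages are adjusted R-square values, which is stated explicitly only for the forced entry model, and the function names are illustrative.

```python
def r2_from_f(f, df_model, df_resid):
    """R-square implied by an overall regression F statistic."""
    return f * df_model / (f * df_model + df_resid)

def adjusted_r2(r2, df_model, df_resid):
    """Adjusted R-square; note that n - 1 equals df_model + df_resid."""
    return 1 - (1 - r2) * (df_model + df_resid) / df_resid

# (label, F, model df, residual df) as reported in the text
models = [
    ("stepwise 1: height", 1994.731, 1, 652),
    ("stepwise 2: height + age", 1067.956, 2, 651),
    ("stepwise 3: height + age + gender", 744.634, 3, 650),
    ("forced entry: all four predictors", 560.021, 4, 649),
]

adjusted = {}
for label, f, df_m, df_r in models:
    r2 = r2_from_f(f, df_m, df_r)
    adjusted[label] = adjusted_r2(r2, df_m, df_r)
    print(f"{label}: R2 = {r2:.3f}, adjusted R2 = {adjusted[label]:.3f}")
```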

Table 2 shows how the predictors affect the outcome (forced expiratory volume) in the
stepwise model. It should be noted that smoking status was excluded from this model. Looking
at Model 3, we can see how the three significant predictor variables affect the outcome. For
example, a one-inch increase in height is associated with a .105 liter increase in forced
expiratory volume. We prefer to use the forced entry multiple regression model above because it
accounts for all four predictor variables, not just the ones that were shown to be significant
predictors.

Table 2

Coefficient Data for the Stepwise Model

                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t         p
1   (Constant)   -5.433   .181                                             -29.939   .000
    Height       .132     .003                 .868                        44.662    .000
2   (Constant)   -4.610   .224                                             -20.558   .000
    Height       .110     .005                 .722                        23.263    .000
    Age          .054     .009                 .185                        5.961     .000
3   (Constant)   -4.449   .223                                             -19.952   .000
    Height       .105     .005                 .688                        21.986    .000
    Age          .061     .009                 .209                        6.766     .000
    Gender       .161     .033                 .093                        4.864     .000

Finally, a factorial analysis of covariance (ANCOVA) was conducted to determine whether the
predictor variables (age, height and gender) had a significant effect on the outcome (forced
expiratory volume), using smoking status as a covariate. Table 3 below presents the output of
the ANCOVA.

Table 3

ANCOVA Results with Smoking Status as Covariate


Dependent Variable: Forced Expiratory Volume (L)

Source               df    F          p
Corrected Model      343   9.567      .000
Intercept            1     7917.868   .000
smoke                1     .480       .489
age                  16    2.241      .004
height               54    7.210      .000
sex                  1     6.373      .012
age * height         174   1.220      .066
age * sex            10    1.225      .274
height * sex         35    1.232      .180
age * height * sex   42    .729       .893
Total                654
Corrected Total      653

With smoking status (smoke) as the covariate, age, height and gender (sex) are all
significant predictors of forced expiratory volume. None of the interactions among these three
predictors is significant (age * height, age * sex, etc.).

In the hierarchical method, we add the independent variables to the analysis in separate
blocks. The first block consists of the variables we use as control variables. These should be
known predictors from previous research studies. The second block of variables is then analyzed
to see how much it contributes to the model beyond the control variables. In this case, our
first block included the control variable smoking status. The second block of variables included
age, gender and height.

Approximately 6% of the variance is explained by the control variable. When age, gender
and height are added to the model, 77.4% of the variance is explained. The model with all four
predictors is significant: F(4, 649) = 95.160, p < .001. For the model with only the control
variable, the reported test is F(1, 652) = .594, p = .619, which is not significant at the .05
level. The only non-significant predictor variable in the full model is smoking status
(p = .141).

The result from the hierarchical method of multiple regression can be put into the
following equation to predict the outcome:

FEV (hierarchical method) = -4.457 - .087(Smoking Status) + .066(Age) + .157(Gender) + .104(Height)
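
The equation can be evaluated directly, as in the sketch below. The report does not state how the categorical variables were coded, so the 0/1 codings used here (smoker = 1, and gender = 1 for the group with the higher predicted FEV) are assumptions made for illustration, as is the function name. Height is in inches and age in years, matching the units used in the simple regressions.

```python
# Coefficients from the hierarchical (full) model
COEF = {
    "intercept": -4.457,
    "smoking": -0.087,
    "age": 0.066,
    "gender": 0.157,
    "height": 0.104,
}

def predict_fev(age_years, height_in, gender_code, smoker_code):
    """Predicted FEV in liters.

    gender_code and smoker_code are assumed to be 0/1 indicators;
    the actual coding in the data set is not given in the report.
    """
    return (COEF["intercept"]
            + COEF["smoking"] * smoker_code
            + COEF["age"] * age_years
            + COEF["gender"] * gender_code
            + COEF["height"] * height_in)

# Example: a 10-year-old, 61.5 inches tall, gender code 1, non-smoker
example_fev = predict_fev(10, 61.5, 1, 0)
print(round(example_fev, 3))
```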
