
4 Appendix

4.1 Statistical details:


A multiple regression model was used in this project. The explanatory variable sex is a binary variable, while
BMI is a continuous variable. The fitted model can be expressed as follows:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i \tag{1}$$


To find the prediction equation, we calculate the estimated intercept $\hat{\beta}_0$ and the estimated slopes
$\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize the total squared prediction error, $\sum_i (y_i - \hat{y}_i)^2$.
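The least-squares step can be illustrated with a minimal sketch for a model with two explanatory variables. It is written in Python with the standard library only and is not taken from the project's R script; the function name `fit_two_predictor_ols` and the toy data are hypothetical and chosen purely for illustration.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Return (b0, b1, b2) minimizing sum((y - b0 - b1*x1 - b2*x2)**2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Center every variable so the two slopes solve a 2x2 linear system.
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("explanatory variables are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # The intercept makes the fitted surface pass through the means.
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Illustrative toy data only (not the project's data):
# a binary variable coded 1/2 and a continuous variable.
sex = [1, 2, 1, 2, 1, 2]
bmi = [20, 25, 30, 35, 40, 22]
glucose = [70 + 5 * s + 0.5 * b for s, b in zip(sex, bmi)]
print(fit_two_predictor_ols(sex, bmi, glucose))
```

Because the toy outcome is an exact linear function of the two variables, the fit should recover the generating coefficients up to floating-point error.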

A t test was used to test whether each coefficient is 0. The test statistic is:

$$T = \frac{\hat{\beta}_j}{s_{\hat{\beta}_j}} \sim t_{n-(k+1)} \tag{2}$$

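where

$$s_{\hat{\beta}_j} = s \sqrt{\frac{1}{\sum_i (x_{ji} - \bar{x}_j)^2 \, (1 - R_j^2)}} \tag{3}$$

Here $s$ is the residual standard error, $x_{ji}$ is the value of explanatory variable $j$ for observation $i$, $R_j^2$ is the coefficient of determination from regressing $x_j$ on the other explanatory variables, $n$ is the number of observations, and $k$ is the number of explanatory variables.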

The significance level is $\alpha = 0.05$; we therefore conclude that a variable is a statistically
significant predictor of blood glucose level when $P(|T| > |t_{obs}|) < 0.05$.
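As a hedged illustration of the t statistic and its standard error, the sketch below handles the special case of two explanatory variables, where $R_j^2$ reduces to the squared correlation between them. The function names `coef_std_error` and `t_statistic` are hypothetical, and the residual standard error `s` is assumed to be supplied by the caller. The p-value is not computed here; it would come from a t distribution with $n-(k+1)$ degrees of freedom via a table or a statistics package.

```python
import math


def coef_std_error(xj, x_other, s):
    """Standard error of a slope when there are exactly two predictors.

    With two predictors, R_j^2 equals the squared correlation
    between xj and the other predictor.
    """
    n = len(xj)
    mj = sum(xj) / n
    mo = sum(x_other) / n
    sjj = sum((v - mj) ** 2 for v in xj)
    soo = sum((v - mo) ** 2 for v in x_other)
    sjo = sum((a - mj) * (b - mo) for a, b in zip(xj, x_other))
    r2_j = sjo ** 2 / (sjj * soo)
    if r2_j >= 1.0:
        raise ValueError("explanatory variables are perfectly collinear")
    return s * math.sqrt(1.0 / (sjj * (1.0 - r2_j)))


def t_statistic(b_j, xj, x_other, s):
    """Ratio of an estimated slope to its standard error."""
    return b_j / coef_std_error(xj, x_other, s)


# Illustrative numbers only (not the project's results).
print(t_statistic(5.0, [1, 2, 1, 2], [10, 10, 20, 20], 2.0))
```

The stronger the correlation between the two explanatory variables, the larger $R_j^2$ and hence the larger the standard error, which shrinks the t statistic for the same estimated slope.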

An F test was used to examine whether the model is significant, that is, whether at least one of the slopes $\beta_j$ is not 0.
The test statistic is:

$$F = \frac{MS(\text{reg})}{MS(\text{res})} = \frac{\sum_i (\hat{y}_i - \bar{y})^2 / k}{\sum_i (y_i - \hat{y}_i)^2 / (n - (k+1))} \sim F_{k,\,n-k-1} \tag{4}$$

If $f_{obs} > F_{k,\,n-k-1,\,\alpha=0.05}$, we conclude that the model explains a significant proportion of the variance of the
outcome variable.
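The F statistic can be computed directly from the observed and fitted values. The sketch below is a minimal illustration in Python; the function name `f_statistic` and the toy numbers are hypothetical, and the critical value $F_{k,\,n-k-1,\,\alpha}$ is assumed to be looked up separately.

```python
def f_statistic(y, y_hat, k):
    """F = MS(reg) / MS(res) for a model with k explanatory variables."""
    n = len(y)
    if n <= k + 1:
        raise ValueError("need more observations than estimated coefficients")
    y_bar = sum(y) / n
    ss_reg = sum((f - y_bar) ** 2 for f in y_hat)        # explained variation
    ss_res = sum((o - f) ** 2 for o, f in zip(y, y_hat))  # residual variation
    ms_reg = ss_reg / k
    ms_res = ss_res / (n - (k + 1))
    return ms_reg / ms_res


# Illustrative toy example with one binary predictor (k = 1):
# the least-squares fitted values are the two group means.
y = [2, 4, 6, 8]
y_hat = [3, 3, 7, 7]
print(f_statistic(y, y_hat, k=1))
```

For the model in this appendix, `k` would be 2, with `y_hat` taken from the fitted regression.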

4.2 R code used for the project


Please see computing project.R

Figures:

[Figure: scatterplot matrix of AID, H4BMI, GLUCOSE, and BIO_SEX4. Values printed in the panels, in extraction order: 0.01, 0.01, 0.01, 0.06, 0.00, -0.16.]

Figure 1: paired scatterplot for examined variables

[Figure: two normal Q-Q plots, titled "glucose" and "residuals", each plotting Sample Quantiles against Theoretical Quantiles.]

Figure 2: Q-Q plot for blood glucose level and residuals
