Assignment Two
Dr. LeSage
October 29, 2014
Introduction:
To further analyze selling prices for a sample of 200 homes in Toledo, Ohio, this assignment
addresses the possibilities of collinearity, heteroscedasticity, and spatial or serial correlation.
Because the selling prices form a cross-section rather than a time series, spatial correlation
rather than serial correlation is analyzed. For each of these effects, the nature of the problem,
its diagnostics, and its corrective procedures are discussed.
Part One Section One: Collinearity
Collinearity, also known as multicollinearity, violates the assumption under CLRM that
states, "There is no exact linear relationship among the regressors" (Gujarati). Therefore, if one
or more relationships are discovered among the regressors, it is likely that this sample of 200
houses exhibits a collinearity problem. Collinearity can cause OLS estimators to have large
variances and covariances, wider confidence intervals, and t-statistics that are more likely to be
insignificant, even when the R^2 value is high. Additionally, the presence of collinear variables
can change the coefficient values of other variables within the model. Collinearity can confound
multiple regressors and make it difficult to decipher an individual regressor's impact on the model.
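One common way to quantify how collinearity inflates estimator variance is the variance inflation factor (VIF). The sketch below is illustrative only: it is not the diagnostic used in this assignment, the data are made up, and the names pearson_r and vif_two_regressors are hypothetical. It relies on the fact that, with only two regressors, VIF = 1 / (1 - r^2), where r is their sample correlation.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """VIF for a model with exactly two regressors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical values, NOT the Toledo sample: rooms and beds move together,
# while age is only weakly related to rooms.
rooms = [5, 6, 7, 8, 9, 10]
beds = [2, 3, 3, 4, 4, 5]
age = [40, 12, 55, 8, 30, 21]

vif_rooms_beds = vif_two_regressors(rooms, beds)
vif_rooms_age = vif_two_regressors(rooms, age)
print(round(vif_rooms_beds, 2), round(vif_rooms_age, 2))
```

A common rule of thumb treats a VIF above 10 as a sign of serious collinearity. In this made-up example only the rooms and beds pair crosses that threshold.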
[Table fragment; only the "# half baths" column was recovered: 0.00, 0.00, 0.00, 0.00, 0.00, 0.22, 0.73, 0.05]
In Figure 1.1, shown below, we can observe that no two or more coefficients move together
to any significant degree. This is consistent with our previous conclusion that
collinearity is not present in this sample of selling prices.
Figure 1.1: Values of Regression Coefficients as a Function of the ridge parameter
[Plot. Vertical axis: regression coefficients, -5000 to 5000. Horizontal axis: value of the ridge parameter, 0.5 to 4 (x 10^-3); a vertical line shows the H-K value. Series: tla, lotsize, rooms, beds, full baths, half baths, age.]
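The logic of reading a ridge trace such as Figure 1.1 can be sketched in a few lines. The example below is an illustration under stated assumptions, not the routine that produced the figure: it uses two centered regressors with no intercept, made-up data, and the hypothetical helper names ridge_coefficients and max_shift. It solves (X'X + theta*I) b = X'y over a small grid of ridge parameters and reports how far the coefficients move.

```python
def ridge_coefficients(x1, x2, y, theta):
    """Solve (X'X + theta*I) b = X'y for two centered regressors, no intercept."""
    a = sum(v * v for v in x1) + theta
    d = sum(v * v for v in x2) + theta
    b = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * v for u, v in zip(x1, y))
    c2 = sum(u * v for u, v in zip(x2, y))
    det = a * d - b * b
    return (d * c1 - b * c2) / det, (a * c2 - b * c1) / det


def max_shift(x1, x2, y, thetas):
    """Largest range covered by either coefficient across the theta grid."""
    trace = [ridge_coefficients(x1, x2, y, t) for t in thetas]
    return max(
        max(row[j] for row in trace) - min(row[j] for row in trace)
        for j in range(2)
    )


# Hypothetical centered data, NOT the Toledo sample.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2_independent = [2.0, -1.0, -2.0, -1.0, 2.0]   # orthogonal to x1
x2_collinear = [-2.1, -0.9, 0.1, 1.1, 1.8]      # nearly a copy of x1
noise = [0.5, -0.5, 0.3, -0.6, 0.3]
thetas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]


def make_y(x2):
    """Response built as 3*x1 + 1*x2 + noise."""
    return [3.0 * u + v + e for u, v, e in zip(x1, x2, noise)]


shift_independent = max_shift(x1, x2_independent, make_y(x2_independent), thetas)
shift_collinear = max_shift(x1, x2_collinear, make_y(x2_collinear), thetas)
print(shift_independent, shift_collinear)
```

With the orthogonal pair the coefficients barely move as theta grows, which is the pattern the assignment reads as evidence against collinearity. With the nearly collinear pair both coefficients shift sharply.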
significance. The t-probability for tla is once again lower in the OLS results than in the Newey-West results, and the statistical significance of the other variables does not change.
Moving on to the Geweke robust regression, we see that, unlike in the White and Newey-West regressions, lotsize remains significant only at the 90% level. Its t-probability is 0.067979
in the Geweke regression, compared with 0.05548 in the OLS regression. It is also important to
note that the number of half baths is significant at the 90% level in the Geweke regression as
well as in the Newey-West regression, unlike in the OLS.
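The reason White's correction changes only the t-probabilities, and not the coefficients, can be seen in a small sketch. The code below is a minimal illustration, assuming a single regressor plus a constant, made-up data, and the hypothetical function name ols_with_white_se. It computes the usual OLS standard error of the slope alongside White's heteroscedasticity-consistent (HC0) standard error.

```python
import math


def ols_with_white_se(x, y):
    """Slope of y on x (with intercept), its OLS SE, and its White HC0 SE."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional SE: assumes one common error variance s^2.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_ols = math.sqrt(s2 / sxx)
    # White HC0 SE: weights each squared residual by its own (x - mean)^2.
    se_white = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, se_ols, se_white


# Hypothetical data, NOT the Toledo sample: errors are largest at the
# extremes of x, so the error variance is not constant.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
errors = [3.0, -1.0, -1.0, -2.0, -1.0, -1.0, 3.0]
y = [10.0 + 2.0 * xi + e for xi, e in zip(x, errors)]

slope, se_ols, se_white = ols_with_white_se(x, y)
print(slope, se_ols, se_white)
```

The slope is identical under both approaches; only its standard error, and therefore its t-statistic and t-probability, differs.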
There is a notable change in the Geweke coefficients for several variables, which points to an
outlier problem rather than a heteroscedasticity problem. The visual representation of residuals in
Figure 2.1 is consistent with the existence of an outlier problem, especially around the 200th
observation but also between the 40th and 90th observations. Figure 2.1 does not exhibit a
megaphone shape, which is consistent with our conclusion that we have an outlier problem rather
than heteroscedasticity. Figure 2.2, shown below, displays a dramatic spike in the vi estimate near
the 200th observation and smaller spikes throughout the sample, which adds to the evidence
that there is indeed an outlier problem.
Table 2.1
                OLS            OLS             White           Newey-West      Geweke Robust   Geweke Robust
Variable        Coefficient    t-probability   t-probability   t-probability   Coefficient     t-probability
constant        39289.352246   0.000000        0.000000        0.000000        41294.884579    0.000000
tla             15.451200      0.000001        0.000033        0.000044        16.921674       0.000000
lotsize         1.235406       0.05548         0.023303        0.030193        1.124557        0.067979
# rooms         -678.130925    0.473424        0.491529        0.484962        -942.270632     0.362778
# bedrooms      565.753718     0.700257        0.694302        0.675111        566.184302      0.708540
# full baths    3106.830399    0.212101        0.194091        0.194819        2937.587720     0.240276
# half baths    4523.058435    0.1308          0.119239        0.080833        3833.847406     0.078907
age             -427.801944    0.000000        0.000000        0.000000        -408.122795     0.000000

Figure 2.1: Residuals sorted by house size
[Plot. Vertical axis: residuals, -4 to 1 (x 10^4). Horizontal axis: observations sorted by house size, 1 to 200.]

Figure 2.2: Vi plot for outliers and hetero
[Plot. Vertical axis: Vi estimates, 0 to 8. Horizontal axis: observations sorted by house size, 1 to 200.]
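The vi estimates in Figure 2.2 come from the Geweke robust regression, which is not reproduced here. As a much simpler stand-in, and a different technique, the sketch below flags residuals that lie far from the median relative to a robust scale based on the median absolute deviation (MAD). The residual values, the cutoff k = 3, and the function name flag_outliers are all assumptions made for illustration.

```python
import statistics


def flag_outliers(residuals, k=3.0):
    """Indices of residuals more than k robust standard deviations from the median."""
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    scale = 1.4826 * mad
    return [i for i, r in enumerate(residuals) if abs(r - med) > k * scale]


# Hypothetical residuals, NOT taken from Figure 2.1: one very large negative value.
residuals = [1200, -800, 450, -300, 950, -1100, 600, -35000, 700, -500]

flagged = flag_outliers(residuals)
print(flagged)
```

Isolated flags scattered across the sample, rather than a spread that widens steadily with house size, would point toward outliers instead of heteroscedasticity, which is the distinction the assignment draws.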
SEM. The Robust SEM yields a spatial error coefficient estimate of 0.023972 with a t-statistic of 0.850692, which is not
significant and is consistent with our previous conclusion that spatial correlation is not present in
our sample. Again, the significance levels of the explanatory variables remain the same as those of
the OLS. The coefficients, however, differ notably from the OLS estimates for some variables,
which is consistent with our conclusion that there is an outlier problem in this sample.
Figure 3.1 provides a visual depiction of the spatial error model's vi estimates. This plot
reiterates our outlier problem in that there are erratic vi estimates for several observations.
Figure 3.1
[Plot. Vertical axis: Vi estimates, 1.5 to 3.5. Horizontal axis: observations (unsorted), 1 to 200.]
Conclusion
In comparing the OLS regression with two ridge regressions, we were able to determine
that our sample does not contain a collinearity problem. The variables that were possibly
involved in collinear relationships did not change in significance after the ridge regression;
such a change would be expected if collinearity were present.
Furthermore, our diagnostics revealed that this sample does not contain a
heteroscedasticity problem. At first it was unclear whether we had heteroscedasticity or
outliers, but upon running the Geweke robust regression, it became apparent that our sample
contains outliers and does not have a heteroscedasticity problem. This outlier problem was
further illustrated by a residual plot and two vi estimate plots. The lack of significance in our
spatial error coefficient estimates indicates that this sample does not contain a spatial
correlation problem. Given our conclusion that this sample does not contain problems of
collinearity, heteroscedasticity or spatial correlation, but does in fact have an outlier problem,
it is most appropriate to use a robust OLS regression model in analyzing this data set.
Works Cited
Gujarati, Damodar. Econometrics by Example. Houndmills, Basingstoke,
Hampshire: Palgrave Macmillan, 2011.