Erasmus School of Economics

MOOC Econometrics

Test Exercise 1

1. Scatter Plot

The scatter plot shows that there is an outlier, with the highest sales and the lowest
advertising. This is unusual compared with the rest of the data.

If I fit a regression line to my data, I would expect it to lie near the cluster of points
at the bottom of the plot, but it would not be very close to all of them: its slope would
be biased downwards because the outlier combines the lowest advertising with the
highest sales.
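
The pull that a single point exerts on a fitted line can be illustrated with a small sketch. The data below are hypothetical (they are not the exercise data), and `ols_line` is a helper name introduced here only for illustration; it applies the usual closed-form OLS formulas for a single regressor.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: sales rise mildly with advertising.
advertising = [8, 10, 12, 14, 16]
sales = [24, 25, 26, 27, 28]

# Same data plus one point with the lowest advertising and the highest sales.
advertising_out = advertising + [6]
sales_out = sales + [50]

a_clean, b_clean = ols_line(advertising, sales)
a_out, b_out = ols_line(advertising_out, sales_out)

print(f"without outlier: intercept={a_clean:.3f}, slope={b_clean:.3f}")
print(f"with outlier:    intercept={a_out:.3f}, slope={b_out:.3f}")
```

In this made-up example the extra point in the top-left corner of the plot raises the intercept and turns a positive slope into a negative one.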

2. Coefficient Estimation

$\text{Sales}_t = \beta_1 + \beta_2\,\text{Advertising}_t + e_t$

Using the OLS criterion, the estimated model is:

$\widehat{\text{Sales}}_t = 29.6269 - 0.3246\,\text{Advertising}_t$

To test whether $\beta_2 = 0$, I set up the two hypotheses:

$H_0: \beta_2 = 0$

$H_a: \beta_2 \neq 0$

In order to evaluate this, I compute a t-test. The t-value is:

$t = \frac{-0.3246}{0.4589} = -0.7073$

It is compared with the critical value $t_{0.025,\,18} = 2.1009$, using $n - 2 = 18$ degrees of
freedom (20 observations and two estimated coefficients).


Since the t-value lies between the negative and positive critical values, we cannot reject
the null hypothesis, so advertising would not be relevant in explaining weekly sales.
This problem appears because the outlier is biasing the regression, so it does not
reflect the positive relation between the level of advertising and sales, as shown
in the graph below.
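
The decision rule above can be sketched in a few lines. This is only an illustration: `t_test_two_sided` is a hypothetical helper name, the slope is entered with the negative sign implied by the discussion, and the critical value is an assumption taken from a standard t-table (5% two-sided, 18 degrees of freedom) rather than computed.

```python
def t_test_two_sided(coef, std_err, critical_value):
    """Return (t_value, reject_null) for H0: coefficient = 0."""
    t_value = coef / std_err
    # Reject only if the t-value falls outside [-critical, +critical].
    reject_null = abs(t_value) > critical_value
    return t_value, reject_null


# Slope and standard error as reported for the first regression.
slope = -0.3246
std_err = 0.4589

# Assumption: table value for a 5% two-sided test with 18 degrees of freedom.
critical_value = 2.1009

t_value, reject_null = t_test_two_sided(slope, std_err, critical_value)
print(f"t = {t_value:.4f}, reject H0: {reject_null}")
```

Passing the critical value as an argument keeps the sketch free of external libraries; a statistics package could supply it instead.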

3. Residuals

The histogram of the residuals shows that they are mostly near zero, but there is
one that is very large, so the regression line is far from that point (this might be the
outlier).

4. Identifying the Outlier

Once I realize that the observation (6, 50) came from a different event, I would remove
it from the sample, because it is not related to the other 19 observations and it is
just biasing my results.
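
To see how far this observation lies from the first fitted line, its residual can be computed from the coefficients reported in section 2. The sketch assumes the point is written as (advertising, sales), which matches the description of the outlier as having the lowest advertising and the highest sales; `fitted_sales` is a name introduced here for illustration.

```python
def fitted_sales(intercept, slope, advertising):
    """Predicted sales from a simple linear regression line."""
    return intercept + slope * advertising


# Assumption: the outlier is (advertising, sales) = (6, 50).
outlier_advertising = 6
outlier_sales = 50

# Coefficients of the first regression (outlier included).
fit_first = fitted_sales(29.6269, -0.3246, outlier_advertising)
residual_first = outlier_sales - fit_first

print(f"fitted sales = {fit_first:.4f}, residual = {residual_first:.4f}")
```

A residual of this size, roughly 22 units of sales, is consistent with the single large value seen in the histogram of residuals.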

5. Taking Down the Outlier

When I removed the outlier from the sample and ran the new regression, the line seemed
to fit the data better and reflected what I expected: the higher the level of advertising,
the more sales the shop makes.

$\widehat{\text{Sales}}_t = 21.125 + 0.3750\,\text{Advertising}_t$

In order to know whether the coefficient $\beta_2$ is significant in explaining sales,
a hypothesis test has to be done as follows:

$H_0: \beta_2 = 0$

$H_a: \beta_2 \neq 0$

The t-value is:

$t = \frac{0.3750}{0.0857} = 3.7868$

As the t-value is larger than the critical value $t_{0.025,\,17} = 2.1098$ ($n - 2 = 17$ degrees
of freedom with 19 observations), the null hypothesis is rejected, so the level of
advertising is relevant in explaining sales.
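
As a quick arithmetic check, the t-value is the ratio of the estimated slope to its standard error. Dividing the two reported figures gives roughly 4.376 rather than 3.7868, so one of the reported numbers may have been mistranscribed. Which one cannot be determined from the text alone, but either t-value exceeds the critical value, so the conclusion does not change.

```latex
t = \frac{b_2}{\mathrm{SE}(b_2)} = \frac{0.3750}{0.0857} \approx 4.376
```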

6. Differences

The main differences between these two parts are the slope and the relevance of
advertising.
In the first part, the level of advertising was not significant, and the estimated slope
implied that, on average, less advertising means more sales, which contradicts
what the data showed at the beginning.
In the second part, with the outlier removed, the estimate implied that, on average, more
advertising means more sales. As this agrees with what the data show, and the
coefficient $\beta_2$ is significant, advertising is relevant in explaining sales.

I have learnt that I have to be careful when sampling, in order to avoid
this kind of mistake, which may bias the regression and the whole analysis.
