
Modeling Lowe's sales

Forecasting sales is obviously of crucial importance to businesses. Revenue streams are
random, of course, but in some industries general economic factors would be expected to have
a great effect on sales. One such industry is the building supply industry, since contractor
work is a driving force for such purchases.
Is it possible to model sales of Lowe's Companies (the world's second largest home improvement retailer and the 14th largest retailer in the U.S.) as a function of generally available
economic factors related to the housing industry? The data studied here were gathered by
Mike Nannizzi, and refer to 79 consecutive quarters from the first quarter of 1983 through
the third quarter of 2002. We are interested in modeling Lowe's quarterly sales, in millions of
dollars, as a function of housing starts (in millions) and average mortgage rate (I also thank
Mike for some of the financial analysis quoted here). Examination of the revenue variable
shows that it is right-tailed; since it is a money variable, it is natural to take the target
variable as logged (base 10) sales. That is, we will fit a semilog model.

Recall, by the way, that these sales are in millions of dollars, so these quarterly sales are as
big as $7.5 billion. There's a lot of money in hammers and nails!
Here are scatter plots of logged sales versus housing starts and mortgage rate. As would
be expected, there is a direct relationship with housing starts (more new houses meaning
more building supplies), and an inverse relationship with mortgage rate (higher rates meaning
fewer purchases of houses, with the resultant fewer repairs). We also see evidence in both
plots of two distinct subgroups in the data, with apparently different relationships between
the variables. The group with flatter sales corresponds to the 1980s, while that with higher
sales corresponds to the 1990s.
© 2015, Jeffrey S. Simonoff

There is also a strong relationship between logged sales and time, reflecting an annual
proportional growth in sales. Once again we see evidence that the 1980s and 1990s correspond
to two distinct time periods. Why would that be? Unlike Home Depot, which was the market
leader in the (urban and suburban) home improvement industry, Lowe's spent the 1980s in
mostly rural markets, aiming to support local contractors. As the home improvement concept
became tremendously profitable into the 1990s, Lowe's changed its focus to compete more
directly with Home Depot.


Here are the results of fitting the model of logged revenue on the three predictors:

Regression Analysis: Log Sales versus Housing starts, Mortgage, Time

Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         3  11.8100  3.93665  4924.32    0.000
  Housing starts   1   0.3727  0.37271   466.23    0.000
  Mortgage         1   0.0138  0.01377    17.23    0.000
  Time             1   2.4666  2.46663  3085.49    0.000
Error             75   0.0600  0.00080
Total             78  11.8699

Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0282742  99.49%     99.47%      99.44%

Coefficients
Term                Coef   SE Coef  T-Value  P-Value   VIF
Constant          1.8700    0.0444    42.16    0.000
Housing starts   0.09847   0.00456    21.59    0.000  1.13
Mortgage         0.01551   0.00374     4.15    0.000  5.67
Time            0.018073  0.000325    55.55    0.000  5.44

Regression Equation
Log Sales = 1.8700 + 0.09847 Housing starts + 0.01551 Mortgage + 0.018073 Time

The regression fit is apparently very strong. The coefficients can be interpreted as follows.
An increase of one million housing starts in a quarter is associated with increasing sales by
25.5%, holding all else fixed ($10^{.0985} = 1.255$). The coefficient for mortgage rates is puzzling,
as it is positive; an increase in mortgage rate by one percentage point is associated with
an increase in sales of 3.6% ($10^{.01551} = 1.036$), holding all else fixed. In fact, this variable
adds little to the fit, as the model with it removed has $R^2 = .994$. Finally, given the other
variables, there is a 4.2% quarterly increase in sales ($10^{.01807} = 1.042$).
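The percentage interpretations above all come from the same back-transformation: with a base-10 logged response, a coefficient b corresponds to multiplying sales by 10^b per one-unit increase in the predictor. The following is a minimal sketch of that arithmetic in Python, using the rounded coefficient values quoted in the paragraph; the function name is my own choice for illustration and is not part of the original analysis.

```python
def pct_change_from_log10_coef(coef: float) -> float:
    """Percent change in the (unlogged) response per one-unit increase
    in a predictor, when the response was modeled as log base 10."""
    return (10.0 ** coef - 1.0) * 100.0


# Rounded coefficients as quoted in the text for the three-predictor model.
coefs = {"Housing starts": 0.0985, "Mortgage": 0.01551, "Time": 0.01807}

effects = {name: pct_change_from_log10_coef(b) for name, b in coefs.items()}

for name, pct in effects.items():
    print(f"{name}: {pct:+.2f}% per one-unit increase, holding all else fixed")
```

The same function applies to a negative coefficient, where it returns a percentage decrease.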
Unfortunately, there are problems with this model. There is apparently structure left in
the data, related to the time effect noted earlier. In addition, there is a strong effect that
sales in the third quarter are systematically lower than during the rest of the year.


We can try to address these model deficiencies by adding two more predictors: Time$^2$,
to address the parabolic pattern in the residuals related to time, and an indicator variable
identifying the third quarter. Here is the resultant regression output:

Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         5  11.8416  2.36831  6100.00    0.000
  Housing starts   1   0.2340  0.23395   602.58    0.000
  Mortgage         1   0.0010  0.00103     2.66    0.107
  Time             1   0.1370  0.13696   352.77    0.000
  Time sq          1   0.0103  0.01035    26.65    0.000
  Q3               1   0.0168  0.01681    43.29    0.000
Error             73   0.0283  0.00039
Total             78  11.8699

Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0197040  99.76%     99.74%      99.71%

Coefficients
Term                Coef   SE Coef  T-Value  P-Value    VIF
Constant          2.0649    0.0489    42.27    0.000
Housing starts   0.09386   0.00382    24.55    0.000   1.64
Mortgage         0.00517   0.00317     1.63    0.107   8.43
Time            0.014280  0.000760    18.78    0.000  61.17
Time sq         0.000037  0.000007     5.16    0.000  37.43
Q3              -0.03489   0.00530    -6.58    0.000   1.08

Regression Equation
Log Sales = 2.0649 + 0.09386 Housing starts + 0.00517 Mortgage + 0.014280 Time
+ 0.000037 Time sq - 0.03489 Q3

The collinearity between Time and Time$^2$ is to be expected, so we don't have to worry
about that. Apparently we don't need mortgage rate now, so that original positive coefficient
wasn't something to worry about anyway:

Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         4  11.8405  2.96013  7457.39    0.000
  Housing starts   1   0.2329  0.23293   586.82    0.000
  Time             1   0.2886  0.28855   726.94    0.000
  Time sq          1   0.0214  0.02136    53.82    0.000
  Q3               1   0.0165  0.01650    41.56    0.000
Error             74   0.0294  0.00040
Total             78  11.8699

Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0199233  99.75%     99.74%      99.71%

Coefficients
Term                Coef   SE Coef  T-Value  P-Value    VIF
Constant          2.1379    0.0197   108.65    0.000
Housing starts   0.09349   0.00386    24.22    0.000   1.63
Time            0.013331  0.000494    26.96    0.000  25.30
Time sq         0.000044  0.000006     7.34    0.000  25.25
Q3              -0.03453   0.00536    -6.45    0.000   1.08

Regression Equation
Log Sales = 2.1379 + 0.09349 Housing starts + 0.013331 Time + 0.000044 Time sq
- 0.03453 Q3
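The large VIF values for Time and Time sq in the outputs above are the expected consequence of squaring a steadily increasing index. As a rough illustration only, the sketch below computes the correlation between a quarter index 1, ..., 79 and its square, and the variance inflation factor that would result if those were the only two predictors. That is an assumption made for simplicity: the VIFs reported by Minitab also reflect the other predictors in the model, so the numbers will not match. The helper `pearson_r` is written here for the example.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Quarter index for the 79 quarters, and its square.
time = list(range(1, 80))
time_sq = [t * t for t in time]

r = pearson_r(time, time_sq)
# VIF in a hypothetical model containing only these two predictors.
vif_two_predictors = 1.0 / (1.0 - r ** 2)

# For comparison: centering the index before squaring removes the linear
# association, because the centered values are symmetric about zero.
mean_t = sum(time) / len(time)
centered = [t - mean_t for t in time]
centered_sq = [c * c for c in centered]
r_centered = pearson_r(centered, centered_sq)

print(f"corr(Time, Time sq) = {r:.3f}, two-predictor VIF = {vif_two_predictors:.1f}")
print(f"corr after centering = {r_centered:.3f}")
```

The high correlation comes from how the quadratic term is constructed, not from anything specific to the sales data.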

Given time, and whether it is the third quarter, one million additional housing starts is
associated with an expected 24.0% increase in Lowe's sales. Given time and the number of
housing starts, sales are 7.7% lower in the third quarter. Why would this be? We wouldn't
be surprised to see higher sales in the first part of the year, since that is the peak construction
season in the northern part of the country, but why wouldn't this affect the fourth quarter
as well? In fact, there is evidence that Lowe's sold goods at a steeper discount in the fourth
quarter, as its income as a percentage of sales is one-third lower than in any of the other
three quarters. This could, perhaps, reflect a desire to pump up end-of-year sales, so as to
meet analysts' sales expectations.
The time effect is a little trickier, since it is a quadratic relationship. Since the coefficient
for Time$^2$ is positive, we're seeing an increasing growth rate in sales over time, and a little
calculus can make that more specific. Given all else is held fixed, the expected rate of change
of the response as a function of a predictor $x$ when $x$ is in the model quadratically ($\beta_1 x + \beta_2 x^2$)
is just the partial derivative with respect to $x$, or $\beta_1 + 2\beta_2 x$. Thus, given all else is held
fixed, at the first quarter of 1983 the estimated expected time-related rate of sales growth is
3.1% ($.0133314 + (2)(.00004389)(1) = .0134$, and $10^{.0134} = 1.031$); on the other hand, given
all else is fixed, at the first quarter of 2002 the estimated expected time-related rate of sales
growth is 4.7% ($.0133314 + (2)(.00004389)(77) = .0201$, and $10^{.0201} = 1.047$). Thus, unless
economic conditions change, it seems that Lowe's sales can be expected to continue to rise.
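The derivative calculation above can be checked numerically. The sketch below evaluates the slope b1 + 2*b2*t on the log10 scale at a given quarter index t and converts it to a percentage growth rate, using the unrounded coefficients quoted in the paragraph. The constant and function names are mine, and Time = 1 is taken to be the first quarter of 1983, as the text's use of t = 77 for the first quarter of 2002 implies.

```python
# Coefficients of Time and Time sq as quoted in the text (log10 scale).
B_TIME = 0.0133314
B_TIME_SQ = 0.00004389


def quarterly_growth_pct(t: float) -> float:
    """Estimated time-related quarterly growth rate (in percent) at quarter
    index t, holding the other predictors fixed."""
    slope_log10 = B_TIME + 2.0 * B_TIME_SQ * t  # derivative of b1*t + b2*t^2
    return (10.0 ** slope_log10 - 1.0) * 100.0


growth_1983q1 = quarterly_growth_pct(1)    # first quarter of 1983
growth_2002q1 = quarterly_growth_pct(77)   # first quarter of 2002

print(f"1983 Q1: {growth_1983q1:.2f}% per quarter")
print(f"2002 Q1: {growth_2002q1:.2f}% per quarter")
```

Because the slope is linear in t, the implied growth rate rises steadily across the sample period, which is the "increasing growth rate" the positive quadratic coefficient describes.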
The model now seems to fit pretty well (although the plots of residuals versus housing
starts and time of year seem to hint at nonconstant variance).


Given the very high $R^2$, we can say that using housing starts and the time-related variables, we
can predict Lowe's sales very accurately. Indeed, the standard error of the estimate $s = .0199$
implies that 95% of the time Lowe's sales are predicted to within roughly 9-10% high or low
($10^{-.0398} = .912$; $10^{.0398} = 1.096$). Of course, that translates into as much as $750 million,
so we shouldn't get too excited!
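The rough 95% band quoted above is plus or minus two standard errors on the log10 scale, back-transformed into multiplicative limits on sales. A minimal sketch of that calculation follows; the function name and the default multiplier of 2 are my choices for illustration, matching the rule of thumb used in the text rather than an exact prediction interval.

```python
def multiplicative_band(s: float, z: float = 2.0) -> tuple[float, float]:
    """Back-transform +/- z*s on the log10 scale into lower and upper
    multipliers on the original (dollar) scale."""
    half_width = z * s
    return 10.0 ** (-half_width), 10.0 ** half_width


low, high = multiplicative_band(0.0199)

print(f"lower multiplier: {low:.3f}  ({(1 - low) * 100:.1f}% low)")
print(f"upper multiplier: {high:.3f}  ({(high - 1) * 100:.1f}% high)")
```

Note that the band is asymmetric in percentage terms: the upper limit is slightly further from 1 than the lower limit, which is why the text gives a range of percentages and not a single figure.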
Another potential approach we could have taken here is to split the data into pre-1990
and post-1990 groups, being consistent with the earlier scatter plots. We can do this using
the pooled / constant shift / full model approach we discussed earlier. Here is the full model
fit:

Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         9  11.8515  1.31683  4923.18    0.000
  Housing starts   1   0.0889  0.08893   332.48    0.000
  Time             1   1.2738  1.27379  4762.25    0.000
  Mortgage         1   0.0059  0.00586    21.90    0.000
  Q3               1   0.0155  0.01555    58.12    0.000
  1980s            1   0.0086  0.00864    32.29    0.000
  Housing80s       1   0.0001  0.00007     0.25    0.619
  Time80s          1   0.0138  0.01385    51.77    0.000
  Mortgage80s      1   0.0052  0.00523    19.56    0.000
  Q380s            1   0.0031  0.00313    11.70    0.001
Error             69   0.0185  0.00027
Total             78  11.8699

Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0163547  99.84%     99.82%      99.79%

Coefficients
Term                 Coef   SE Coef  T-Value  P-Value     VIF
Constant           1.8473    0.0394    46.88    0.000
Housing starts    0.08892   0.00488    18.23    0.000    3.87
Time             0.019151  0.000278    69.01    0.000   11.83
Mortgage          0.01631   0.00349     4.68    0.000   14.77
Q3               -0.04195   0.00550    -7.62    0.000    1.69
1980s              0.4004    0.0705     5.68    0.000  329.84
Housing80s       -0.00356   0.00714    -0.50    0.619   60.92
Time80s         -0.005737  0.000797    -7.20    0.000   12.17
Mortgage80s      -0.02239   0.00506    -4.42    0.000  233.21
Q380s             0.03226   0.00943     3.42    0.001    2.12

Regression Equation
Log Sales = 1.8473 + 0.08892 Housing starts + 0.019151 Time + 0.01631 Mortgage
- 0.04195 Q3 + 0.4004 1980s - 0.00356 Housing80s - 0.005737 Time80s
- 0.02239 Mortgage80s + 0.03226 Q380s

Separate slopes for the housing starts variable don't seem to be supported:

Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         8  11.8514  1.48142  5598.58    0.000
  Housing starts   1   0.1606  0.16057   606.82    0.000
  Time             1   1.5004  1.50040  5670.29    0.000
  Mortgage         1   0.0059  0.00586    22.16    0.000
  Q3               1   0.0158  0.01578    59.63    0.000
  1980s            1   0.0095  0.00950    35.92    0.000
  Time80s          1   0.0138  0.01380    52.14    0.000
  Mortgage80s      1   0.0052  0.00518    19.58    0.000
  Q380s            1   0.0031  0.00314    11.85    0.001
Error             70   0.0185  0.00026
Total             78  11.8699

Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0162667  99.84%     99.83%      99.80%

Coefficients
Term                 Coef   SE Coef  T-Value  P-Value     VIF
Constant           1.8501    0.0388    47.72    0.000
Housing starts    0.08726   0.00354    24.63    0.000    2.07
Time             0.019204  0.000255    75.30    0.000   10.10
Mortgage          0.01632   0.00347     4.71    0.000   14.76
Q3               -0.04140   0.00536    -7.72    0.000    1.62
1980s              0.3866    0.0645     5.99    0.000  279.51
Time80s         -0.005695  0.000789    -7.22    0.000   12.04
Mortgage80s      -0.02224   0.00503    -4.42    0.000  232.37
Q380s             0.03088   0.00897     3.44    0.001    1.94

Regression Equation
Log Sales = 1.8501 + 0.08726 Housing starts + 0.019204 Time + 0.01632 Mortgage
- 0.04140 Q3 + 0.3866 1980s - 0.005695 Time80s
- 0.02224 Mortgage80s + 0.03088 Q380s

This model implies predictions of sales to within 7-8%, roughly 95% of the time. The
model yields two fitted lines: for the 1980s,

Log Sales = 2.2368 + .0873 Housing starts + .01351 Time - .0059 Mortgage rate - .0105 Q3,

and for post-1990,

Log Sales = 1.8501 + .0873 Housing starts + .0192 Time + .0163 Mortgage rate - .0414 Q3.
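The two fitted lines follow directly from the coefficient table: the post-1990 line uses the base coefficients, and the 1980s line adds the 1980s indicator coefficient to the constant and each interaction coefficient (Time80s, Mortgage80s, Q380s) to the matching slope. The sketch below carries out that addition with the printed coefficients. The dictionary layout and names are mine, and because the inputs are the rounded values from the output, the last digit can differ slightly from the equations quoted in the text.

```python
# Base (post-1990) coefficients from the final model's coefficient table.
base = {
    "Constant": 1.8501,
    "Housing starts": 0.08726,
    "Time": 0.019204,
    "Mortgage": 0.01632,
    "Q3": -0.04140,
}

# Shifts that apply only when the 1980s indicator equals 1:
# the 1980s term shifts the constant, the interactions shift the slopes.
shift_1980s = {
    "Constant": 0.3866,     # 1980s
    "Time": -0.005695,      # Time80s
    "Mortgage": -0.02224,   # Mortgage80s
    "Q3": 0.03088,          # Q380s
}

# Housing starts has no interaction term, so its slope is common to both lines.
line_1980s = {term: coef + shift_1980s.get(term, 0.0) for term, coef in base.items()}
line_post1990 = dict(base)

# Implied time-related growth per quarter in each period, in percent.
growth_1980s = (10.0 ** line_1980s["Time"] - 1.0) * 100.0
growth_post1990 = (10.0 ** line_post1990["Time"] - 1.0) * 100.0

for term in base:
    print(f"{term:15s} 1980s: {line_1980s[term]: .5f}   post-1990: {line_post1990[term]: .5f}")
print(f"growth per quarter: {growth_1980s:.2f}% (1980s), {growth_post1990:.2f}% (post-1990)")
```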


The housing starts effect is very similar to that in the quadratic model, and the third
quarter effect was stronger in the later time period. Consistent with the increasing predictions
from the quadratic model, the estimated quarterly rate of change in sales (given the other
variables) was 3.2% in the earlier time period, and 4.5% in the latter time period, certainly
good news for Lowe's. Interestingly, a similar analysis to this one using Home Depot revenues
shows the opposite pattern, with the rate of change of Home Depot's revenues decreasing
in recent time periods. Perhaps this accounts for the relatively poor performance of Home
Depot stock; Home Depot's price dropped more than 50% from June 2002 to March 2003,
while that of Lowe's dropped only (?) 15%.
There are two other points worth mentioning here. These data form a time series, of
course, and even though the plot of standardized residuals versus time didn't show apparent
autocorrelation, there is, in fact, some autocorrelation in the residuals. It's not that important,
however; some basic time series remedies (which we will talk about later) only change
the standard error of the estimate from .0163 to .016. In addition, we should recognize that
part of the time trend effect that we are seeing is presumably an inflation effect; an analysis
that avoided that (uninteresting) effect could be accomplished by using constant dollar sales
(inflation-adjusted), rather than the actual (nominal) dollar sales.

Minitab commands
To create all K indicators for a categorical variable (like Quarter) click on Calc → Make
Indicator Variables and enter the variable name under "Indicator variables for:".
The program will choose default names for the indicators, but you can change them if
you wish.
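For readers working outside Minitab, the same indicator and derived columns can be built in a few lines of Python. This is only a sketch under stated assumptions: the data file itself is not shown in this handout, so the quarter index is reconstructed from the stated span (79 quarters starting in the first quarter of 1983). Quarters are assumed to be calendar quarters, and the 1980s indicator is assumed to flag years before 1990. All variable names are hypothetical.

```python
# Hypothetical reconstruction of the predictor columns (no sales data here).
N_QUARTERS = 79      # 1983 Q1 through 2002 Q3, as stated in the text
START_YEAR = 1983

time = list(range(1, N_QUARTERS + 1))             # Time = 1, 2, ..., 79
year = [START_YEAR + (t - 1) // 4 for t in time]
quarter = [(t - 1) % 4 + 1 for t in time]          # assumed calendar quarters

# All K = 4 indicators for the categorical variable Quarter.
indicators = {
    f"Quarter_{k}": [1 if q == k else 0 for q in quarter]
    for k in sorted(set(quarter))
}

# Derived predictors used in the models.
time_sq = [t * t for t in time]                    # Time sq
q3 = indicators["Quarter_3"]                       # Q3
is_1980s = [1 if y < 1990 else 0 for y in year]    # 1980s (assumed cutoff)

# Interaction columns: each predictor multiplied by the 1980s indicator.
time_80s = [t * d for t, d in zip(time, is_1980s)]
q3_80s = [q * d for q, d in zip(q3, is_1980s)]

print(f"last observation: {year[-1]} Q{quarter[-1]}")
print(f"third-quarter observations: {sum(q3)}, 1980s observations: {sum(is_1980s)}")
```

When all K indicators are created, only K - 1 of them can enter a regression that includes a constant. The models here use the single Q3 indicator.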
