
Regression Analysis Project

Pfizer Inc

In the last five years, pharmaceutical companies have been experimenting with new ways to market their products and to increase awareness and sales. Previously, product promotion primarily meant sending sales reps to doctors' offices to convince them to prescribe the company's product over the competition's. Such methods of product promotion have come under fire from the FDA and other citizen safety watch groups, and have been proving less and less productive. Pfizer Inc, along with other pharmaceutical companies, has begun investing more promotion dollars in television and direct mail advertising. The company is also offering discounts on a majority of its over-the-counter (OTC) drugs. Data collected over a two-year period show the amount of money Pfizer spent each month on television advertising, direct mail advertising, and discount offers, together with the revenue generated in that month. See Fig 1.1.

Fig 1.1 (Gross Revenue in $ Billions, Others in $ Millions)
Month   Gross Rev   TV advert   Discounts   Direct mail
1       4.1         17          4           6
2       3.9         14          4           6
3       5           20          4           8
4       4.8         18          4           7
5       4.7         16          3           7
6       4           16          3           6
7       4.6         16          3           7
8       5.1         19          4           8
9       4.4         15          4           8
10      4.4         15          3           8
11      4.8         16          2           8
12      5.1         17          2           8
13      5           17          2           8
14      4.8         17          3           8
15      5.2         17          3           10
16      4.6         16          3           9
17      4.8         16          3           8
18      5.2         17          3           9
19      5.3         17          3           9
20      5.5         18          3           9
21      5.2         17          3           8
22      4.9         17          3           8
23      5.4         18          2           10
24      4.8         17          2           9

Using regression analysis, we will attempt to identify which of the three new methods of promotion have a positive and significant impact on Pfizer's monthly revenue. Scatter plots of the monthly spending in each promotional category against the monthly revenue let us infer which variables are influential. The points for television advertising cluster at roughly equal distances around a line. This pattern suggests that television advertising is a valid independent variable that influences the company's monthly income (see Graph 1.A).

Graph 1.A

Correlation: 0.649

Graph 1.B (see below) depicts the relationship between the cost of discounting OTC drugs and the monthly revenue. This scatter plot does not show a clear relationship between the two variables; the correlation is weak and negative.

Graph 1.B

Correlation: -0.386

Graph 1.C (see below) depicts the relationship between the cost of direct mail advertising and the monthly revenue. This plot shows a fairly strong positive correlation between the two variables.

Graph 1.C

Correlation: 0.784
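
The three correlations quoted for Graphs 1.A to 1.C can be checked directly against the monthly figures in Fig 1.1. The following is a minimal sketch in plain Python, assuming the values transcribed from Fig 1.1; the function name `pearson_r` and the variable names are ours and are not part of the original analysis.

```python
from math import sqrt

# Monthly values transcribed from Fig 1.1
# (gross revenue in $ billions, spending in $ millions).
gross_rev = [4.1, 3.9, 5.0, 4.8, 4.7, 4.0, 4.6, 5.1, 4.4, 4.4, 4.8, 5.1,
             5.0, 4.8, 5.2, 4.6, 4.8, 5.2, 5.3, 5.5, 5.2, 4.9, 5.4, 4.8]
tv_advert = [17, 14, 20, 18, 16, 16, 16, 19, 15, 15, 16, 17,
             17, 17, 17, 16, 16, 17, 17, 18, 17, 17, 18, 17]
discounts = [4, 4, 4, 4, 3, 3, 3, 4, 4, 3, 2, 2,
             2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2]
direct_mail = [6, 6, 8, 7, 7, 6, 7, 8, 8, 8, 8, 8,
               8, 8, 10, 9, 8, 9, 9, 9, 8, 8, 10, 9]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


correlations = {
    "TV advert": pearson_r(tv_advert, gross_rev),
    "Discounts": pearson_r(discounts, gross_rev),
    "Direct mail": pearson_r(direct_mail, gross_rev),
}

for name, r in correlations.items():
    print(f"{name}: {r:.3f}")
```

The sign of each coefficient gives the direction of the relationship and its magnitude gives the strength, which is how the three scatter plots are being compared here.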

A time series plot would not be helpful in determining the relationship between our independent and dependent variables, as it only captures the fluctuation in revenue over the 24-month period. See Graph 1.D below.
Graph 1.D

Using the Regression procedure, we can identify the intercept and slope of the least squares line for income revenue as a function of television advertising (see Fig 1.2).

Fig 1.2
Summary   Multiple R   R-Square   Adjusted R-Square   StErr of Estimate   Durbin Watson
          0.6486       0.4207     0.3944              0.332286217         0.8674

ANOVA Table   Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio   p-Value
Explained     1                    1.764222466      1.764222466       15.9782   0.0006
Unexplained   22                   2.429110867      0.11041413

Regression Table   Coefficient   Standard Error   t-Value   p-Value   Lower Limit    Upper Limit
Constant           1.196597146   0.90817011       1.3176    0.2012    -0.686832387   3.080026679
TV advert          0.215587267   0.053933519      3.9973    0.0006    0.103735994    0.32743854

The intercept and slope are given in the Coefficient column of Fig 1.2. The equation for the least squares line is:

Predicted Income Revenue = 1.1966 + 0.2156(TV Advertising)

This regression equation can be interpreted to say that monthly gross revenue (in $ billions) tends to increase by about 0.22 for each 1-unit ($1 million) increase in TV advertising spending. The graph below (Graph 1.E) shows the distribution of fitted and residual values. The distribution supports the position that TV advertising has an impact on income revenue, as the residuals are small and scattered randomly around 0 with no apparent pattern.
Graph 1.E

The data sheet below (Fig 1.3) shows a few of the fitted and residual values alongside the observed revenue. Values were generated for all 24 monthly observations, but are abbreviated here for explanatory purposes only.

Fig 1.3

Graph Data   Gross Rev   Fit           Residual
1            4.1         4.861580681   -0.761580681
2            3.9         4.21481888    -0.31481888
3            5           5.508342481   -0.508342481
4            4.8         5.077167947   -0.277167947
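
The fitted and residual values in Fig 1.3 follow from the least squares line for TV advertising: the fit is the intercept plus the slope times that month's TV spending, and the residual is the observed revenue minus the fit. The sketch below, in plain Python, recomputes them under the assumption that the Fig 1.1 values are transcribed correctly; the helper name `fit_line` is ours.

```python
# Monthly values transcribed from Fig 1.1
# (gross revenue in $ billions, TV advertising in $ millions).
gross_rev = [4.1, 3.9, 5.0, 4.8, 4.7, 4.0, 4.6, 5.1, 4.4, 4.4, 4.8, 5.1,
             5.0, 4.8, 5.2, 4.6, 4.8, 5.2, 5.3, 5.5, 5.2, 4.9, 5.4, 4.8]
tv_advert = [17, 14, 20, 18, 16, 16, 16, 19, 15, 15, 16, 17,
             17, 17, 17, 16, 16, 17, 17, 18, 17, 17, 18, 17]


def fit_line(x, y):
    """Return (intercept, slope) of the simple least squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(tv_advert, gross_rev)

# Fitted value for each month, and residual = observed - fitted.
fitted = [intercept + slope * x for x in tv_advert]
residuals = [y - f for y, f in zip(gross_rev, fitted)]

print(f"Predicted Income Revenue = {intercept:.4f} + {slope:.4f}(TV Advertising)")
for month in range(4):
    print(month + 1, gross_rev[month],
          round(fitted[month], 4), round(residuals[month], 4))
```

With an intercept in the model, the least squares residuals sum to zero, which is why they scatter around 0 in a residual plot.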

The regression analysis of the Direct Mail promotion (Fig 1.4) shows that the standard error of estimate is higher using TV advertising (0.3323, Fig 1.2) than using direct mail (0.2708), making direct mail the better single predictor of income revenue. Another statistic we must observe in making an accurate analysis is the R-squared (coefficient of determination). In Fig 1.2 above, we observed an R-squared value of 0.4207, and in Fig 1.4 a value of 0.6153. The closer R-squared is to 1, the larger the share of the variation in revenue that the independent variable explains. Comparing these two cases, direct mail explains more of the variation in income revenue than TV advertising.

Fig 1.4
Summary   Multiple R   R-Square   Adjusted R-Square   StErr of Estimate   Durbin Watson
          0.7844       0.6153     0.5979              0.270771305         1.2463

ANOVA Table   Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio   p-Value
Explained     1                    2.580357143      2.580357143       35.1945   < 0.0001
Unexplained   22                   1.61297619       0.0733171

Regression Table   Coefficient   Standard Error   t-Value   p-Value    Lower Limit   Upper Limit
Constant           2.388095238   0.4130821        5.7812    < 0.0001   1.531415397   3.244775079
Direct mail        0.303571429   0.051170967      5.9325    < 0.0001   0.197449339   0.409693518
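
The comparison between TV advertising and direct mail rests on two summary statistics: R-squared, the share of the variation in revenue explained by the line, and the standard error of estimate, the square root of the unexplained sum of squares divided by its degrees of freedom (n - 2 for a single predictor). The following plain Python sketch computes both for each predictor, assuming the Fig 1.1 values; the function name `simple_fit_summary` is ours.

```python
from math import sqrt

# Monthly values transcribed from Fig 1.1
# (gross revenue in $ billions, spending in $ millions).
gross_rev = [4.1, 3.9, 5.0, 4.8, 4.7, 4.0, 4.6, 5.1, 4.4, 4.4, 4.8, 5.1,
             5.0, 4.8, 5.2, 4.6, 4.8, 5.2, 5.3, 5.5, 5.2, 4.9, 5.4, 4.8]
tv_advert = [17, 14, 20, 18, 16, 16, 16, 19, 15, 15, 16, 17,
             17, 17, 17, 16, 16, 17, 17, 18, 17, 17, 18, 17]
direct_mail = [6, 6, 8, 7, 7, 6, 7, 8, 8, 8, 8, 8,
               8, 8, 10, 9, 8, 9, 9, 9, 8, 8, 10, 9]


def simple_fit_summary(x, y):
    """R-squared and standard error of estimate for a one-predictor fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    total_ss = sum((b - mean_y) ** 2 for b in y)
    explained_ss = sxy ** 2 / sxx            # slope * sxy
    unexplained_ss = total_ss - explained_ss
    return {
        "r_squared": explained_ss / total_ss,
        "std_err": sqrt(unexplained_ss / (n - 2)),
    }


tv_summary = simple_fit_summary(tv_advert, gross_rev)
mail_summary = simple_fit_summary(direct_mail, gross_rev)

print("TV advert:  ", tv_summary)
print("Direct mail:", mail_summary)
```

A higher R-squared and a lower standard error of estimate point the same way here, since both are computed from the same unexplained sum of squares.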

Including both TV advertising and direct mail advertising as variables in the regression analysis yields the output in Fig 1.5. We observe that the R-squared value (0.7817) is greater than in either of the individual analyses. The standard error of estimate (0.2088) also drops below the individual values, which signifies a more accurate model for prediction. The new equation for the least squares line is:

Predicted Income Revenue = 0.4225 + 0.144(TV Advertising) + 0.247(Direct Mail)

Another statistic to observe in the output is the adjusted R-squared, which is calculated to help identify independent variables that add little to the prediction. If the adjusted R-squared drops when an independent variable is added, the variable adds little explanatory power and can be dropped. The adjusted R-squared increased in Fig 1.5, to 0.7609 from 0.3944 (TV advertising alone) and 0.5979 (direct mail alone).

Fig 1.5
Summary   Multiple R   R-Square   Adjusted R-Square   StErr of Estimate   Durbin Watson
          0.8841       0.7817     0.7609              0.208781651         1.3237

ANOVA Table   Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio   p-Value
Explained     2                    3.277948003      1.638974002       37.6000   < 0.0001
Unexplained   21                   0.91538533       0.043589778

Regression Table   Coefficient   Standard Error   t-Value   p-Value    Lower Limit    Upper Limit
Constant           0.42253141    0.585543884      0.7216    0.4785     -0.795173757   1.640236578
TV advert          0.14400991    0.035998492      4.0004    0.0006     0.069146946    0.218872873
Direct mail        0.246996107   0.041913977      5.8929    < 0.0001   0.159831221    0.334160993
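
For two predictors, the least squares coefficients can be written in closed form from the centered sums of squares and cross-products, and the adjusted R-squared is 1 - (1 - R-squared)(n - 1)/(n - k - 1) with k predictors. The sketch below works through that arithmetic in plain Python for the TV advertising and direct mail model, assuming the Fig 1.1 values; the helper `centered_sum` and the variable names are ours, and this is not the procedure used by the regression software that produced Fig 1.5.

```python
from math import sqrt

# Monthly values transcribed from Fig 1.1
# (gross revenue in $ billions, spending in $ millions).
gross_rev = [4.1, 3.9, 5.0, 4.8, 4.7, 4.0, 4.6, 5.1, 4.4, 4.4, 4.8, 5.1,
             5.0, 4.8, 5.2, 4.6, 4.8, 5.2, 5.3, 5.5, 5.2, 4.9, 5.4, 4.8]
tv_advert = [17, 14, 20, 18, 16, 16, 16, 19, 15, 15, 16, 17,
             17, 17, 17, 16, 16, 17, 17, 18, 17, 17, 18, 17]
direct_mail = [6, 6, 8, 7, 7, 6, 7, 8, 8, 8, 8, 8,
               8, 8, 10, 9, 8, 9, 9, 9, 8, 8, 10, 9]


def centered_sum(u, v):
    """Sum of products of deviations from the mean (sum of squares if u is v)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


n = len(gross_rev)
s11 = centered_sum(tv_advert, tv_advert)
s22 = centered_sum(direct_mail, direct_mail)
s12 = centered_sum(tv_advert, direct_mail)
s1y = centered_sum(tv_advert, gross_rev)
s2y = centered_sum(direct_mail, gross_rev)
syy = centered_sum(gross_rev, gross_rev)

# Solve the 2x2 normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
b_tv = (s1y * s22 - s12 * s2y) / det
b_mail = (s11 * s2y - s12 * s1y) / det
intercept = (sum(gross_rev) - b_tv * sum(tv_advert) - b_mail * sum(direct_mail)) / n

r_squared = (b_tv * s1y + b_mail * s2y) / syy
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2 - 1)
std_err = sqrt((1 - r_squared) * syy / (n - 2 - 1))

print(f"Predicted Income Revenue = {intercept:.4f} + {b_tv:.4f}(TV) + {b_mail:.4f}(Direct Mail)")
print(f"R-squared {r_squared:.4f}, adjusted {adj_r_squared:.4f}, StErr {std_err:.4f}")
```

Unlike R-squared, which can never fall when a predictor is added, the adjusted figure is penalized for each extra predictor, which is what makes it usable for comparing models of different sizes.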

To validate the effectiveness of the regression analysis tool, we create another analysis using all three variables. Based on the scatter plot in Graph 1.B, we expect that the Discount promotion variable will not have any positive impact on the income variable. Including it generates the output in Fig 1.6 below.

Fig 1.6
Summary   Multiple R   R-Square   Adjusted R-Square   StErr of Estimate   Durbin Watson
          0.8951       0.8012     0.7714              0.204149363         1.3839

ANOVA Table   Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio   p-Value
Explained     3                    3.359794086      1.119931362       26.8717   < 0.0001
Unexplained   20                   0.833539247      0.041676962

Regression Table   Coefficient    Standard Error   t-Value   p-Value   Lower Limit    Upper Limit
Constant           0.780351639    0.626907622      1.2448    0.2276    -0.527354745   2.088058023
TV advert          0.157005231    0.03640082       4.3132    0.0003    0.081074451    0.232936012
Discounts          -0.100460415   0.071687601      -1.4014   0.1764    -0.24999813    0.0490773
Direct mail        0.213187826    0.047557529      4.4827    0.0002    0.113984558    0.312391094
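
Whether adding Discounts is worthwhile can be judged by fitting the model with and without it and comparing the adjusted R-squared values. The following is a minimal sketch in plain Python that solves the normal equations by Gaussian elimination, assuming the Fig 1.1 values; the functions `solve` and `ols` are illustrative helpers of ours and do not reproduce the regression software's internals.

```python
# Monthly values transcribed from Fig 1.1
# (gross revenue in $ billions, spending in $ millions).
gross_rev = [4.1, 3.9, 5.0, 4.8, 4.7, 4.0, 4.6, 5.1, 4.4, 4.4, 4.8, 5.1,
             5.0, 4.8, 5.2, 4.6, 4.8, 5.2, 5.3, 5.5, 5.2, 4.9, 5.4, 4.8]
tv_advert = [17, 14, 20, 18, 16, 16, 16, 19, 15, 15, 16, 17,
             17, 17, 17, 16, 16, 17, 17, 18, 17, 17, 18, 17]
discounts = [4, 4, 4, 4, 3, 3, 3, 4, 4, 3, 2, 2,
             2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2]
direct_mail = [6, 6, 8, 7, 7, 6, 7, 8, 8, 8, 8, 8,
               8, 8, 10, 9, 8, 9, 9, 9, 8, 8, 10, 9]


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(predictors, y):
    """Least squares fit with intercept; returns (coefficients, R2, adjusted R2)."""
    n = len(y)
    rows = [[1.0] + [float(p[i]) for p in predictors] for i in range(n)]
    k = len(rows[0])  # number of coefficients, including the constant
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    unexplained = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    total = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - unexplained / total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    return beta, r2, adj_r2


beta2, r2_two, adj_two = ols([tv_advert, direct_mail], gross_rev)
beta3, r2_three, adj_three = ols([tv_advert, discounts, direct_mail], gross_rev)

print("Two predictors:  ", [round(b, 4) for b in beta2], round(adj_two, 4))
print("Three predictors:", [round(b, 4) for b in beta3], round(adj_three, 4))
```

In the three-predictor fit the coefficients are ordered constant, TV advert, Discounts, Direct mail, so `beta3[2]` is the Discounts coefficient and its sign can be read off directly.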

We observe in Fig 1.6 that, contrary to our visual interpretation of the scatter plot (Graph 1.B), including the Discount promotion variable does improve the fit: the R-squared and the adjusted R-squared both increased and the standard error of estimate decreased. The Discount coefficient itself, however, is negative (t-value -1.4014) and is not significant at the 5% level (p-value 0.1764). This result prompts us to take a closer look at the impact of the Discount variable. Regression analysis of the variable on its own (see Fig 1.7) shows that it has a very low R-squared value and a high standard error of estimate. Nonetheless, because it raises the adjusted R-squared, it can be included in the model for slightly more accurate prediction.

Fig 1.7
Summary   Multiple R   R-Square   Adjusted R-Square   StErr of Estimate   Durbin Watson
          0.3860       0.1490     0.1103              0.40274722          1.3600

ANOVA Table   Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio   p-Value
Explained     1                    0.624816223      0.624816223       3.8520    0.0625
Unexplained   22                   3.56851711       0.162205323

Regression Table   Coefficient   Standard Error   t-Value   p-Value    Lower Limit    Upper Limit
Constant           5.542965779   0.379081541      14.6221   < 0.0001   4.756798781    6.329132778
Discounts          -0.23878327   0.121663498      -1.9627   0.0625     -0.491097921   0.013531381
