
MINITAB Project

http://wps.prenhall.com/esm_walpole_probstats_7/55/14203/3636047.c...

STATISTICS EXPLORATION #12

ANALYSIS OF VARIANCE

PURPOSE - to use MINITAB to
- perform a one-way (one-factor) analysis of variance
- perform a two-way (two-factor) analysis of variance
- perform multiple comparisons of the means
- test for equality of variance

BACKGROUND INFORMATION

Analysis of Variance (ANOVA) is a method of testing the equality of three or more population means by analyzing sample variances. The ANOVA method uses the F-distribution.

The following assumptions apply when testing the hypothesis that three or more samples come from populations with the same mean:
- The populations from which the samples were selected are assumed to be normally distributed.
- The populations from which the samples were selected are assumed to have the same variance.
- The samples selected from the different populations are assumed to be random and independent of each other.

ANOVA is based on a comparison of two different estimates of the variance common to the different populations. The two estimates are:
- Variance between samples.
- Variance within samples.

The term one-way analysis of variance (one-factor or single-factor analysis of variance) is used because the sample data are separated into groups according to one characteristic or factor. A treatment (or level) is a property or characteristic that enables us to distinguish the different populations from each other. The null hypothesis for the one-way ANOVA is H0: The means for the different populations are equal, versus the alternative H1: The means are not all equal.

The term two-way analysis of variance (two-factor analysis of variance) is used because the sample data are separated into groups according to two characteristics or factors. For the two-way analysis of variance we have three hypotheses to test: two hypotheses involve the main effects (factors) and the third involves the interaction between the two factors.

If A and B are the two factors in a two-way ANOVA, we say there is an interaction effect between the factors A and B when the differences among the means for the levels of factor A depend on the level of factor B.

PROCEDURES

First, load the MINITAB (Windows version) software as described in Exploration #0. We will use MINITAB to help test hypotheses relating to one-way and two-way ANOVA.

1. ONE-WAY ANOVA

The MINITAB commands that will be used are as follows: Stat > ANOVA > One-way (Unstacked). This will enable us to perform hypothesis tests for equality of means for the different levels of the single factor. We will be using the Unstacked option because the responses will be in different columns.

Example 1: Sociologists often investigate the relationship between socioeconomic status (lower class, middle class, and upper class) and academic performance. A random sample of seven grade point averages (GPA) for freshmen college students from each of these three groups was selected. The data are given in the following table.

Lower Class    Middle Class    Upper Class
   2.87            3.23            2.25
   2.16            3.45            3.13
   3.14            2.78            2.44
   2.51            3.77            3.27
   1.80            2.97            2.81
   3.01            3.53            1.36
   2.16            3.01            2.53

Test at the 5% significance level to determine whether there is a difference among the true mean grade point averages for the three socioeconomic classes.

Let $\mu_1$ represent the mean GPA for the students classified as lower class.
Let $\mu_2$ represent the mean GPA for the students classified as middle class.
Let $\mu_3$ represent the mean GPA for the students classified as upper class.

Now the null hypothesis to be tested is that the three means are equal. Thus, we can write the null and the alternative hypotheses as

H0: $\mu_1 = \mu_2 = \mu_3$
H1: At least two of the population means differ.

In order to generate the results for the test, enter the data values in columns C1, C2, and C3. Label C1 as Lower, label C2 as Middle, and label C3 as Upper. Next select the MINITAB menu options Stat > ANOVA > One-way (Unstacked) and fill in the text boxes as displayed in Figure 12.1.


Figure 12.1: One-way Analysis of Variance dialog box with entries to help generate results for the one-way test

Now, to tell MINITAB that we would like to perform the test, click on the OK button and the Session window will provide us with the information as shown in Figure 12.2.

Figure 12.2: Session window output with information to help test H0: the three population means are equal ($\mu_1 = \mu_2 = \mu_3$)

Observe that the output gives results for the F-Test with a P-value of 0.025. As a consequence of these results, we can summarize the test as follows, using a P-value approach:

Null hypothesis: H0: The three population mean GPAs are equal ($\mu_1 = \mu_2 = \mu_3$).

Alternative hypothesis: H1: At least two of the population means differ from each other.

Test statistic: T.S.: P-value = 0.025

Decision rule: D.R.: For a significance level of 0.05, reject H0 if the P-value is less than 0.05.

Conclusion: Since 0.025 < 0.05, reject the null hypothesis. That is, at the 5% level of significance, we can conclude that the mean GPAs for the three socioeconomic classes are not all equal.
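The F statistic behind this P-value can also be checked by hand. The following is a minimal sketch in plain Python, not MINITAB's own implementation: it computes the between-samples and within-samples variance estimates for the Example 1 data and the upper-tail F probability. The helper names (`one_way_anova`, `f_upper_tail`, and the underscore-prefixed functions) are illustrative assumptions, and the tail probability uses a standard continued-fraction evaluation of the incomplete beta function.

```python
import math

def _guard(value, tiny=1e-300):
    """Keep a continued-fraction term away from zero."""
    return tiny if abs(value) < tiny else value

def _beta_cf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 / _guard(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_upper_tail(f_stat, df1, df2):
    """P(F > f_stat) for an F distribution with (df1, df2) degrees of freedom."""
    return _reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f_stat))

def one_way_anova(groups):
    """Return (F, P-value) for a one-way ANOVA on a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, f_upper_tail(f_stat, df_between, df_within)

# GPA data from Example 1 (columns Lower, Middle, Upper).
lower = [2.87, 2.16, 3.14, 2.51, 1.80, 3.01, 2.16]
middle = [3.23, 3.45, 2.78, 3.77, 2.97, 3.53, 3.01]
upper = [2.25, 3.13, 2.44, 3.27, 2.81, 1.36, 2.53]

f_stat, p_value = one_way_anova([lower, middle, upper])
print(f"F = {f_stat:.3f}, P-value = {p_value:.3f}")
```

The printed P-value should agree, after rounding, with the 0.025 quoted from the Session window. The same function accepts any number of groups, so it could also be used to cross-check homework problem 1.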

NOTE: We can write up an equivalent test using the critical region approach, as in Exploration #9, using the F statistic. The One-way Analysis of Variance dialog box generates both confidence intervals and hypothesis test information; for the time being, ignore the confidence intervals displayed in Figure 12.2.

If dot-plots or box-plots are needed to display the data, choose the Graphs option in Figure 12.1 and select the appropriate plots. Figure 12.3 shows the box-plots for the three variables superimposed on a single graph. Observe from this display that there are no outliers. Also, the plots suggest that the mean GPA for the middle class is larger than the mean GPAs for the lower and upper classes, and that the mean GPAs for the lower and upper classes are approximately equal to each other.
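The "no outliers" observation can be checked numerically. The sketch below assumes the usual box-plot convention of flagging points more than 1.5 times the interquartile range beyond the quartiles, and it assumes quartiles located at positions (n + 1)p, which is what Python's `statistics.quantiles` uses by default. MINITAB's exact convention is not stated in this handout, so treat both choices as assumptions.

```python
import statistics

def boxplot_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # default method="exclusive"
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# GPA data from Example 1.
samples = {
    "Lower": [2.87, 2.16, 3.14, 2.51, 1.80, 3.01, 2.16],
    "Middle": [3.23, 3.45, 2.78, 3.77, 2.97, 3.53, 3.01],
    "Upper": [2.25, 3.13, 2.44, 3.27, 2.81, 1.36, 2.53],
}

outliers = {name: boxplot_outliers(vals) for name, vals in samples.items()}
for name, found in outliers.items():
    print(name, "outliers:", found)
```

Under these assumptions each list comes back empty, which is consistent with the box-plots in Figure 12.3.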

Figure 12.3: Box-plots for the variables Lower, Middle, and Upper

Recall: For the one-way ANOVA one must assume that the populations for the variables of interest are normal or approximately normal. The normality assumption can be checked by producing a normal probability plot of the sample data by selecting Graph > Probability Plot. The procedure for carrying this out was discussed in Exploration #6 and will be addressed again for the data in Example 1. Recall that if the normal probability plot is approximately linear, then one can assume that the data were sampled from a normal or approximately normal distribution.

Example 2: Establish whether the three data sets given in Example 1 can be assumed to come from normal or approximately normal distributions. Test at the 1% significance level.

We can use the Anderson-Darling test to test for normality by selecting Stat > Basic Statistics > Normality Test and following Example 2 in Exploration #10. Figure 12.4 shows the Normal Probability Plot for the lower class values. Observe that the plotted points lie almost on a straight line, which suggests that the sample came from a normal distribution. In addition, observe that the P-value for the Anderson-Darling test is 0.544. This is large relative to the significance level of 0.01. Thus, one would not reject the null hypothesis H0: the distribution is normal. Hence one may assume normality for the population from which the sample was drawn.

Note: Similar tests for the middle class and upper class values would result in the same conclusion as for the lower class values. Verify.

Figure 12.4: Anderson-Darling test for normality for the Lower class values

As a consequence of these results, we can summarize the test as follows, using a P-value approach:

Null hypothesis: H0: The population of lower class values is normally distributed.

Alternative hypothesis: H1: The population of lower class values is not normally distributed.

Test statistic: T.S.: P-value = 0.544

Decision rule: D.R.: For a significance level of 0.01, reject H0 if the P-value is less than 0.01.

Conclusion: Since 0.544 > 0.01, do not reject the null hypothesis. That is, at the 1% level of significance, there is not sufficient evidence to conclude that the lower class values are not normally distributed.
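For readers who want to see what lies behind the Anderson-Darling P-value, here is a minimal sketch in plain Python. It computes the A-squared statistic for a normal distribution with estimated mean and standard deviation, applies the small-sample adjustment, and converts it to a P-value with the piecewise approximation usually attributed to D'Agostino and Stephens. That approximation is an assumption on our part about how the P-value is obtained; the function name `anderson_darling_normal` is illustrative only.

```python
import math

def anderson_darling_normal(values):
    """Return (A2, approximate P-value) for an Anderson-Darling normality test."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Fitted normal CDF evaluated at the ordered observations.
    cdf = [0.5 * (1.0 + math.erf((v - mean) / (sd * math.sqrt(2.0))))
           for v in sorted(values)]
    total = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    # Small-sample adjustment used when mean and variance are estimated.
    a2_adj = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1.0 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1.0 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2, p

# Lower class GPA values from Example 1.
lower = [2.87, 2.16, 3.14, 2.51, 1.80, 3.01, 2.16]

a2_lower, p_lower = anderson_darling_normal(lower)
print(f"A-squared = {a2_lower:.3f}, P-value = {p_lower:.3f}")
```

For the lower class values the approximate P-value should land close to the 0.544 shown in Figure 12.4, well above the 0.01 significance level. The same function can be applied to the Middle and Upper columns for the "Verify" exercise.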

Example 3: Establish whether the assumption of equal variances for the data sets given in Example 1 is reasonable. Test at the 5% significance level.

To test for equal variances we need to rearrange the data for the three classes. Figure 12.5 shows how the data must be arranged. We stack the values for the lower, middle, and upper classes in one column, which was renamed Response (C5). Next we associate each value in the Response column with its class. The class labels are listed in column C6, which is labeled Factors.
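The stacking step can be pictured with a few lines of Python. This is only an illustration of the data layout, not a MINITAB command; the list names `response` and `factor` mirror the Response and Factors columns described above.

```python
# Unstacked layout: one column per socioeconomic class (Example 1 data).
columns = {
    "Lower": [2.87, 2.16, 3.14, 2.51, 1.80, 3.01, 2.16],
    "Middle": [3.23, 3.45, 2.78, 3.77, 2.97, 3.53, 3.01],
    "Upper": [2.25, 3.13, 2.44, 3.27, 2.81, 1.36, 2.53],
}

# Stacked layout: one column of responses and one column of class labels.
response = []
factor = []
for label, values in columns.items():
    response.extend(values)
    factor.extend([label] * len(values))

for value, label in zip(response, factor):
    print(f"{value:5.2f}  {label}")
```

Each of the 21 rows pairs one GPA with the class it came from, which is the arrangement the Test for Equal Variances dialog box expects.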

Figure 12.5: Data window with rearranged values

Now, to test for equal variances, select Stat > ANOVA > Test for Equal Variances. Figure 12.6 shows the Test for Equal Variances dialog box with the appropriate entries to perform the test. Observe that the Confidence level is 95.


Figure 12.6: Test for Equal Variances

Click on the OK button and the results of the test will be generated. These are shown in Figure 12.7.

Figure 12.7: Results for the Test for Equal Variances

Figure 12.7 presents two test results: one for Bartlett's test and one for Levene's test. The P-values for both tests (0.395 and 0.540) are large relative to the conventional significance levels of 0.01, 0.05, and 0.1. Thus, one would not reject the null hypothesis H0: the variances are equal, for either Bartlett's or Levene's test. Hence one may assume that the population variances are equal to each other. In addition, from Figure 12.7, observe that the 95% confidence intervals for the standard deviations (sigmas) overlap each other. This is consistent with the conclusion that the variances are not significantly different at the 5% significance level.

As a consequence of these results, we can summarize the test as follows, using a P-value approach:


Null hypothesis: H0: The variances for the different populations are equal.

Alternative hypothesis: H1: At least two of the variances are different.

Test statistic: T.S.: P-value = 0.395 (using Bartlett's test)

Decision rule: D.R.: For a significance level of 0.05, reject H0 if the P-value is less than 0.05.

Conclusion: Since 0.395 > 0.05, do not reject the null hypothesis. That is, at the 5% level of significance, we cannot conclude that the variances are significantly different.
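Both equal-variance tests can be sketched in plain Python for the Example 1 data. The code below is a hedged illustration, not MINITAB's implementation: `bartlett` uses the textbook Bartlett statistic with a chi-square reference distribution, and `levene_median` assumes the median-based form of Levene's test (often called Brown-Forsythe). The closed-form F tail in `f_upper_tail_df1_2` is valid only when the numerator has 2 degrees of freedom, that is, for exactly three groups.

```python
import math
import statistics

def chi2_upper_tail(x, df):
    """P(chi-square with integer df > x), built up from the df = 1 or df = 2 case."""
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # df = 1
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def bartlett(groups):
    """Return (statistic, P-value) for Bartlett's test of equal variances."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [statistics.variance(g) for g in groups]
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = ((n_total - k) * math.log(pooled)
                 - sum((n - 1) * math.log(v) for n, v in zip(sizes, variances)))
    correction = 1.0 + (sum(1.0 / (n - 1) for n in sizes)
                        - 1.0 / (n_total - k)) / (3.0 * (k - 1))
    stat = numerator / correction
    return stat, chi2_upper_tail(stat, k - 1)

def levene_median(groups):
    """Return (W, df1, df2): one-way ANOVA F on absolute deviations from medians."""
    devs = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand = sum(sum(d) for d in devs) / n_total
    means = [sum(d) / len(d) for d in devs]
    ss_between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, means))
    ss_within = sum((x - m) ** 2 for d, m in zip(devs, means) for x in d)
    w = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return w, k - 1, n_total - k

def f_upper_tail_df1_2(f_stat, df2):
    """P(F > f_stat) when the numerator degrees of freedom equal 2 (three groups)."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

groups = [
    [2.87, 2.16, 3.14, 2.51, 1.80, 3.01, 2.16],  # Lower
    [3.23, 3.45, 2.78, 3.77, 2.97, 3.53, 3.01],  # Middle
    [2.25, 3.13, 2.44, 3.27, 2.81, 1.36, 2.53],  # Upper
]

bartlett_stat, bartlett_p = bartlett(groups)
levene_w, df1, df2 = levene_median(groups)
assert df1 == 2  # the closed-form tail below only covers three groups
levene_p = f_upper_tail_df1_2(levene_w, df2)
print(f"Bartlett: statistic = {bartlett_stat:.3f}, P-value = {bartlett_p:.3f}")
print(f"Levene:   statistic = {levene_w:.3f}, P-value = {levene_p:.3f}")
```

Under these assumptions the two P-values should round to values close to the 0.395 and 0.540 reported in Figure 12.7, so neither test rejects equality of variances at the 5% level.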

2. TWO-WAY ANOVA

The MINITAB commands that will be used are as follows: Stat > ANOVA > Two-way. This will enable us to perform hypothesis tests for equality of means for the different levels of the two factors, as well as for interaction between these factors.

Example 4: A videocassette recorder (VCR) repair service wished to study the effect of VCR brand and service center on the repair time measured in minutes. Three VCR brands (A, B, C) and three service centers were specifically selected for analysis. Each service center repaired two VCRs of each brand. The results are shown in the following table. (Source: Practical Statistics by Example Using Microsoft Excel; Sincich, Levine, and Stephan, Prentice Hall Publishing, 1999).

                         VCR BRANDS
Service Centers      A       B       C
       1            52      48      59
                    57      39      67
       2            51      43      61
                    58      52      64
       3            37      44      65
                    46      50      69


(a) Test at the 5% significance level to determine whether there is a main effect due to service centers.

Now the null hypothesis to be tested is that the three means for the service centers are equal. Thus, we can write the null and the alternative hypotheses as

H0: There is no difference among the means for the main effect due to Service Centers.
H1: At least two of the means for the main effect due to Service Centers are different.

In order to generate the results for the test, we need to enter the response values in one column, the levels for the row factor (Service Centers) in a second column, and the levels for the column factor (VCR Brands) in a third column. These values are entered in columns C1, C2, and C3 respectively. Label C1 as Response, label C2 as RowFactor, and label C3 as ColFactor. The data window should look like that in Figure 12.8.

Figure 12.8: Data entry window for Example 4

Next select the MINITAB menu options Stat > ANOVA > Two-way and fill in the text boxes as displayed in Figure 12.9. Observe that we also have selected the Display means options for both factors.


Figure 12.9: Two-way Analysis of Variance dialog box with entries to help generate results for the two-way test

Now, to tell MINITAB that we would like to perform the test, click on the OK button and the Session window will provide us with the information as shown in Figure 12.10.

Figure 12.10: Session window output with information to help test for the main effects means for Service Centers

Observe that the output gives results for both the RowFactor (Service Centers) and the ColFactor (VCR Brands). Note that the P-values for the RowFactor, ColFactor, and interaction are 0.617, 0.001, and 0.061 respectively.


As a consequence of these results, we can summarize the test among the means for the main effect due to Service Centers as follows, using the P-value approach:

Null hypothesis: H0: There is no difference among the means for the main effect due to Service Centers.

Alternative hypothesis: H1: At least two of the means for the main effect due to Service Centers are different.

Test statistic: T.S.: P-value = 0.617

Decision rule: D.R.: For a significance level of 0.05, reject H0 if the P-value is less than 0.05.

Conclusion: Since 0.617 > 0.05, do not reject the null hypothesis. That is, at the 5% level of significance, we cannot conclude that the mean repair times for the three Service Centers are significantly different from each other.

NOTE from Figure 12.10: The 95% confidence intervals for the RowFactor (Service Centers) overlap with each other. This supports the conclusion of the test.

(b) Test at the 5% significance level to determine whether there is a main effect due to VCR brands.

As a consequence of the results in Figure 12.10, we can summarize the test among the means for the main effect due to VCR Brands as follows, using the P-value approach:

Null hypothesis: H0: There is no difference among the means for the main effect due to VCR Brands.

Alternative hypothesis: H1: At least two of the means for the main effect due to VCR Brands are different.

Test statistic: T.S.: P-value = 0.001

Decision rule: D.R.: For a significance level of 0.05, reject H0 if the P-value is less than 0.05.

Conclusion: Since 0.001 < 0.05, reject the null hypothesis. That is, at the 5% level of significance, we can conclude that the mean repair times for the three VCR brands are not all equal.


NOTE from Figure 12.10: The 95% confidence intervals for the ColFactor (VCR Brands) do not all overlap with each other. The 95% confidence interval for brand C does not overlap with that for brand A or brand B. The 95% confidence interval for brand C indicates that the average repair time for this brand is significantly greater than that for brand A or brand B. This supports the conclusion of the test.

(c) Test at the 5% significance level to determine whether there is an interaction between the two factors of Service Centers and VCR brands.

As a consequence of the results in Figure 12.10, we can summarize the test for interaction between Service Centers and VCR Brands as follows, using the P-value approach:

Null hypothesis: H0: There is no interaction between the two factors Service Centers and VCR Brands.

Alternative hypothesis: H1: There is an interaction between the two factors.

Test statistic: T.S.: P-value = 0.061

Decision rule: D.R.: For a significance level of 0.05, reject H0 if the P-value is less than 0.05.

Conclusion: Since 0.061 > 0.05, do not reject the null hypothesis. That is, at the 5% level of significance, we cannot conclude that an interaction between the two factors is affecting average repair time.
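The three F tests in parts (a), (b), and (c) all come from one decomposition of the total variation. The sketch below shows that decomposition for a balanced two-factor layout in plain Python. The function name `two_way_anova` is illustrative, and the small 2 x 2 data set is made up purely to keep the arithmetic easy to follow; it is not the VCR data from Example 4.

```python
def two_way_anova(cells):
    """Balanced two-way ANOVA.

    cells[i][j] is the list of replicate responses for row level i and
    column level j. Returns a dict mapping each source of variation to
    (sum of squares, degrees of freedom, F ratio); the error entry has F = None.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(x for row in cells for cell in row for x in cell) / (a * b * r)
    cell_means = [[sum(cell) / r for cell in row] for row in cells]
    row_means = [sum(means) / b for means in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]

    ss_row = b * r * sum((m - grand) ** 2 for m in row_means)
    ss_col = a * r * sum((m - grand) ** 2 for m in col_means)
    ss_cells = r * sum((m - grand) ** 2 for means in cell_means for m in means)
    ss_int = ss_cells - ss_row - ss_col
    ss_err = sum((x - cell_means[i][j]) ** 2
                 for i in range(a) for j in range(b) for x in cells[i][j])

    df_row, df_col = a - 1, b - 1
    df_int, df_err = df_row * df_col, a * b * (r - 1)
    mse = ss_err / df_err
    return {
        "row": (ss_row, df_row, (ss_row / df_row) / mse),
        "col": (ss_col, df_col, (ss_col / df_col) / mse),
        "interaction": (ss_int, df_int, (ss_int / df_int) / mse),
        "error": (ss_err, df_err, None),
    }

# Hypothetical data: 2 row levels x 2 column levels, 2 replicates per cell.
example_cells = [
    [[10, 12], [14, 16]],
    [[20, 22], [18, 20]],
]

table = two_way_anova(example_cells)
for source, (ss, df, f_ratio) in table.items():
    print(source, ss, df, f_ratio)
```

Each F ratio would then be compared with an F distribution using that source's degrees of freedom and the error degrees of freedom, which is how P-values such as those in Figure 12.10 are obtained.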

We can use MINITAB to present interaction plots. To produce interaction plots, select Stat > ANOVA > Interactions Plot and fill in the appropriate text boxes for the Interactions Plot dialog box. The resulting interaction plots are shown in Figure 12.11.


Figure 12.11: Interaction Plots for Service Centers and VCR Brands

From Figure 12.11, one can observe that the lines intersect with each other. This would indicate that there is some interaction between these factors, which would mean that the effect of VCR Brands upon repair time depends upon the Service Center.

NOTE: Our test concluded that there was no significant interaction at the 5% significance level. This does not really contradict the results of the interaction plots. With a P-value of 0.061, the interaction would be significant at any significance level greater than 6.1%.

NOTE: The assumptions for these tests are:
- The population distribution of the observations for any factor-level combination is approximately normal.
- The variance of the probability distribution is constant and the same for all factor-level combinations.
- The treatments (factor-level combinations) are randomly assigned to the experimental units.
- The observations for each factor-level combination represent independent random samples.

The assumptions of normality and constant variance can be verified using previous procedures.

NOTES

EXPLORATION #12: HOMEWORK ASSIGNMENT

Name: _____________________ Date: ______________________
Course #: ___________________ Instructor: _________________

1. A psychologist wants to investigate the effect of social background on the time (in minutes) it takes freshmen to solve a puzzle. A random sample of students from different backgrounds was selected and given the puzzle to solve under the same conditions. The following table shows the time it took each student to solve the puzzle.

Inner City    Urban    Suburban    Rural
   16.5        10.9      18.6       14.2
    5.2         5.2       8.1       24.5
   12.1        10.8       6.4       14.8
   14.3         8.9       5.2       24.9
   14.3        16.1       7.5        5.1
    7.5        12.3      12.9       12.3
    8.9         6.9      15.1       10.9

a. Use MINITAB to test the hypothesis that social background has no effect on the time required to solve the puzzle. Test at the 5% significance level using the P-value approach.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

b. Do the confidence intervals support your claim in part (a)? Discuss.

c. Use the Anderson-Darling test to determine whether the normality assumption holds for the Inner City social classification.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

d. Repeat the Anderson-Darling test for the remaining social classes. Based on the plots, discuss whether the normality assumption holds for these classes. Note: Provide a hard copy of the normal probability plots with your work.

e. Establish whether the assumption of equal variances for the data sets is reasonable. Test at the 5% significance level and write up an appropriate hypothesis test.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

2. In the Journal of Nutrition (July 1995), researchers at the University of Georgia studied the impact of a vitamin-B supplement on the kidney. The experimental units were 28 Zucker rats, a species that tends to develop kidney problems. Half of the rats were classified as obese and the other half as lean. Within each group, half were randomly assigned to receive a vitamin-B supplement and the other half were given a regular diet free of vitamin-B. One of the response variables measured was the weight, in grams, of each rat's kidney at the end of a 20-week feeding period. The data are summarized in the following table.

                       DIET
Rat Size    Regular    Vitamin-B Supplement
Lean          1.68            1.51
              1.80            1.65
              1.71            1.45
              1.81            1.44
              1.47            1.63
              1.37            1.35
              1.71            1.66
Obese         2.35            2.93
              2.97            2.72
              2.54            2.99
              2.93            2.19
              2.84            2.63
              2.05            2.61
              2.82            2.64

a. Would you consider this as a two-factor ANOVA? Discuss.

b. Test at the 5% significance level to determine whether there is a main effect due to the size of the rats. Use the P-value approach.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

c. Test at the 5% significance level to determine whether there is a main effect due to the diet that was administered. Use the P-value approach.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

d. Test at the 5% significance level to determine whether there is an interaction effect between the size of the rats and the diet administered. Use the P-value approach.
   H0:
   H1:
   Test Statistic:
   Decision Rule:
   Conclusion:

e. Use the Anderson-Darling test to determine whether the normality assumption for the two factors was violated. Use the 5% level of significance. Based on the plots and the P-values for the tests, is it reasonable to assume that the populations are approximately normally distributed? Discuss. Note: You will need to perform four tests: two for the sizes of the rats (lean and obese) and two for the diets (regular and vitamin-B supplement). You will need to enter four columns of data into MINITAB to perform the tests. The four columns of data are values for the variables: lean, obese, regular, and vitamin-B. Provide hard copies of the graphs with your work, with an appropriate title for each graph.

f. Determine whether the assumption of equal variance for the two-factor experiment is valid. Based on the confidence interval plots and on Bartlett's and Levene's tests, is it reasonable to assume that the variances are equal? Discuss. Note: You will need to perform two tests: one for the sizes of the rats (lean and obese) and one for the diets (regular and vitamin-B supplement). You will need to enter four columns of data into MINITAB to perform the tests. The four columns of data are values for the variables: lean, obese, regular, and vitamin-B. Provide hard copies of the graphs with your work, with an appropriate title for each graph.

Copyright 1995 - 2010 Pearson Education . All rights reserved. Pearson Prentice Hall is an imprint of Pearson .
