
Hypothesis Testing

Outline
    

- Introduction to Hypothesis Testing
- Basic Concepts
- t-test for one and two samples
- ANOVA
- Chi-Square Test for Categorical data

Hypothesis Testing
 

- A technique of statistical inference
- Statistical inference is the process of obtaining (or verifying) information about a population on the basis of a sample drawn from that population
- Hypothesis testing is the process of testing (or verifying) a hypothesis (a conjecture, a belief, a statement, or a claim) regarding population parameter(s) on the basis of information contained in a sample drawn from that population

Hypothesis Testing
I believe the population mean age is 50 (hypothesis).

Population

Random sample: mean x̄ = 20

Reject hypothesis! Not close.

What's a Hypothesis?

1. A belief about a population parameter
- The parameter is a population mean, proportion, or variance
- It must be stated before analysis

"I believe the mean GPA of this class is 3.5!"


Steps in Hypothesis Testing


    

1. State H0
2. State H1
3. Choose α
4. Choose n
5. Choose test
6. Collect data
7. Compute test statistic
8. Compute p-value
9. Make statistical decision
10. Express decision

Basic Concepts
 

 

- Null Hypothesis (H0): the hypothesis tested for possible rejection (nullification)
- Alternative Hypothesis (H1): the hypothesis accepted if H0 is rejected. Also called the Research Hypothesis.
- A null hypothesis always assumes no difference or no effect
- The aim in hypothesis testing is usually to support the statement in H1

Example of H0 and H1
 

    

- For the Language teaching experiment, our null hypothesis would be: there is no difference between the means of the populations from which the two groups of scores were drawn
- That is, $\mu_A = \mu_B$ or $\mu_A - \mu_B = 0$
- The alternative hypothesis could be simply that $\mu_A \neq \mu_B$, or $\mu_A > \mu_B$, or $\mu_A < \mu_B$

Basic Concepts


- Significance Level ($\alpha$): probability of wrongly rejecting a true H0. Represents a kind of margin of error. Usually fixed at some small value such as 0.05 (5%) or 0.01 (1%)
- Test statistic: a statistic (calculated from sample data) used to test H0. Common test statistics are the t-statistic, the chi-square statistic, and the F-statistic
- The corresponding statistical test of hypothesis is then referred to as a t-test, chi-square test, or F-test


Basic Concepts: p-Value




1. Probability of obtaining a test statistic at least as extreme (≤ or ≥) as the actual sample value, given H0 is true
2. Called the observed level of significance
- Smallest value of α at which H0 can be rejected
3. Used to make the rejection decision
- If p-value ≥ α, do not reject H0
- If p-value < α, reject H0
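The rejection rule on this slide can be written as a small function. This is only a sketch: the name `decide` and the default α of 0.05 are illustrative choices, not part of the original slides.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the slide's rule: reject H0 only when p-value < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.03))        # p-value below alpha = 0.05
print(decide(0.05))        # p-value equal to alpha: H0 is not rejected
print(decide(0.03, 0.01))  # same p-value, stricter alpha
```

Note that the same p-value can lead to different decisions under different α, which is why α is chosen before the data are collected.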


 


t-test for one and two samples


  

- Called a t-test because the test statistic used is denoted by the symbol t
- A t-test is suitable when the data are measured on an interval or ratio scale
- In the one-sample case, the t-test is used to test a hypothesis regarding the value of a population mean
- For example, we may be interested in testing the hypothesis that the mean sentence length in a literary work is 20
- That is, H0: $\mu = 20$

t-test for one and two samples




 

- A two-sample t-test is used to compare two groups with regard to some (quantitative) characteristic
- For the Language teaching experiment, we would like to test H0: $\mu_A = \mu_B$ or H0: $\mu_A - \mu_B = 0$
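As a rough illustration of how a two-sample t statistic is formed, the sketch below uses the equal-variance (pooled) form of the test. The scores in `group_a` and `group_b` are hypothetical and are not the Language teaching experiment data, which these slides do not list; the function name `pooled_t` is also an illustrative choice.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical scores, for illustration only.
group_a = [12, 15, 11, 14, 13]
group_b = [10, 9, 12, 11, 8]

t_stat, df = pooled_t(group_a, group_b)
T_CRIT = 2.306  # two-sided 5% critical value of t with 8 df (from tables)
reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```

The pooled form assumes the two populations have equal variances; when that assumption is doubtful, statistical packages usually also report an unequal-variance version.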

Example: One-sample t-test


- Suppose there is a proficiency test for students of French as a second language in which students educated to British A level standard are expected to score a mean of 80 marks
- In a certain year, teaching activities at some schools are disrupted by selective strikes
- Ten students are chosen at random from those schools and are administered the test just before they are due to sit the A level examination
- The scores of the ten students were: 62 71 75 56 80 87 62 96 57 69 (Mean = 71.5, Standard Deviation = 13.18)
- Do you think their performance has been affected by the interruption of their studies?

Example: One-sample t-test


- One way to answer this question is to test whether the students seem to have achieved results as good as, or worse than, those achieved by the body of students who have taken the same proficiency test in previous years in their French language education
- The null hypothesis for this test will be that the students tested come from a population whose mean score is 80
- That is, we wish to test H0: $\mu = 80$ versus H1: $\mu < 80$
- The results are demonstrated in the SPSS session
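The test statistic for this example can also be reproduced by hand. The sketch below uses only the ten scores and the hypothesized mean of 80 given above; the critical value is the standard lower-tail 5% point of the t distribution with 9 degrees of freedom taken from tables, and the variable names are illustrative. It is a cross-check of the arithmetic, not a reproduction of the SPSS output.

```python
import math
from statistics import mean, stdev

scores = [62, 71, 75, 56, 80, 87, 62, 96, 57, 69]
MU_0 = 80  # hypothesized population mean under H0

n = len(scores)
x_bar = mean(scores)  # sample mean
s = stdev(scores)     # sample standard deviation (n - 1 in the denominator)

# One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))
t_stat = (x_bar - MU_0) / (s / math.sqrt(n))
df = n - 1

T_CRIT = -1.833  # lower-tail 5% critical value of t with 9 df (from tables)
reject_h0 = t_stat < T_CRIT

print(f"mean = {x_bar:.1f}, sd = {s:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "do not reject H0")
```

With these scores the statistic falls below the critical value, so at the 5% level the one-sided test rejects H0 in favour of a mean below 80.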


Example: Two (independent) samples t-test



 

     

- For the Language teaching experiment, our null hypothesis would be: there is no difference between the means of the populations from which the two groups of scores were drawn
- That is, $\mu_A = \mu_B$ or $\mu_A - \mu_B = 0$
- The alternative hypothesis could be simply that $\mu_A \neq \mu_B$, or $\mu_A > \mu_B$, or $\mu_A < \mu_B$
- Results for the test are in the SPSS session

Example: Paired t-test




- Error gravity scores of ten native English-speaking teachers and ten Greek teachers of English, for 32 sentences which appeared in the compositions of Greek-Cypriot learners of English, were recorded, with the results given in the table
- We wish to test the hypothesis that the two groups of teachers give the same error gravity scores on average
- Results are in the SPSS session

               Group 1 (English)   Group 2 (Greek)
Sample Mean    25.03               28.28
Sample SD      6.25                7.85
Sample Size    32                  32
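A paired t-test is a one-sample t-test on the per-item differences, so it needs the individual pairs (or the standard deviation of the differences); the summary table above is not sufficient on its own. The sketch below shows the mechanics with hypothetical scores for six sentences. The lists `english` and `greek` and the function name `paired_t` are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired t-test of H0: mean difference = 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # One-sample t statistic computed on the differences.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


# Hypothetical error gravity scores for six sentences (illustration only).
english = [22, 30, 18, 27, 25, 20]
greek = [26, 31, 23, 29, 30, 23]

t_stat, df = paired_t(english, greek)
T_CRIT = 2.571  # two-sided 5% critical value of t with 5 df (from tables)
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {abs(t_stat) > T_CRIT}")
```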

ANOVA (Analysis of Variance)


  

  

- The aim is to compare several groups (more than two)
- It is an extension of the two (independent) samples t-test
- The null hypothesis states that all the groups are the same (that is, there is no difference among the groups)
- Symbolically, H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$ vs. H1: not all $\mu_i$ are the same (that is, not all groups are the same)

Example: ANOVA test

ANOVA table

Source of Variation   Sum of Squares (SS)   df   Mean Squares (MS)   F                 p-value
Between Poems         63.48                 3    MSB = 21.16         MSB/MSW = 9.20    0.000 (from software)
Within Poems          105.80                46   MSW = 2.30
Total                 169.28                49
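The entries of the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below checks this using only the sums of squares and degrees of freedom from the table; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom taken from the ANOVA table.
ss_between, df_between = 63.48, 3
ss_within, df_within = 105.80, 46

ms_between = ss_between / df_between  # MSB
ms_within = ss_within / df_within     # MSW
f_stat = ms_between / ms_within       # F = MSB / MSW

# The Total row is the sum of the Between and Within rows.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MSB = {ms_between:.2f}, MSW = {ms_within:.2f}, F = {f_stat:.2f}")
print(f"Total SS = {ss_total:.2f}, total df = {df_total}")
```

The p-value in the table comes from the F distribution with 3 and 46 degrees of freedom and is normally read from software output, as the table itself notes.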


Chi-Square Test


- A common application of the chi-square test is to situations where we wish to test whether two characteristics are independent, or are associated in such a way that high frequencies of one tend to be coupled with high frequencies of the other
- Suppose we want to investigate the relationship between tense and aspect in a particular language, and have classified the verb phrases in a set of texts as present or past tense, and also as either progressive or non-progressive in aspect
- We can arrange our data in the form of a 2 × 2 table, commonly known as a contingency table (given on the next slide)

Classification of verb phrases for tense and aspect


Aspect            Past tense   Present tense   Total
Progressive       308          476             784
Non-Progressive   315          297             612
Total             623          773             1396

Expected Frequencies (E = (Row total × Column total) / Grand total)

Aspect            Past tense   Present tense   Total
Progressive       349.9        434.1           784
Non-Progressive   273.1        338.9           612
Total             623          773             1396
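The expected frequencies follow mechanically from the row and column totals of the observed table. The sketch below applies E = (Row total × Column total) / Grand total to the observed counts for tense and aspect; the function name `expected_frequencies` is an illustrative choice.

```python
# Observed counts: rows are Progressive / Non-Progressive,
# columns are Past tense / Present tense.
observed = [
    [308, 476],
    [315, 297],
]


def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 1) for e in row])
```

Because expected counts are built from the margins, each row and column of the expected table sums to the same totals as the observed table.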

Chi-square test
 

      

- Our null hypothesis is H0: there is no association between the two features, past/present and progressive/non-progressive
- The alternative hypothesis is H1: the two features are associated
- Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
- O = observed frequency
- E = expected frequency
- Reject H0 if p-value = $P(\chi^2 > \chi^2_{\text{calc}})$ is small
- That is, if p-value < 0.05 (5%)

Chi-square Test
O     E       (O - E)^2 / E
308   349.9   (308 - 349.9)^2 / 349.9 = 5.02
315   273.1   (315 - 273.1)^2 / 273.1 = 6.43
476   434.1   (476 - 434.1)^2 / 434.1 = 4.04
297   338.9   (297 - 338.9)^2 / 338.9 = 5.18

$\chi^2 = \sum (O - E)^2 / E = 20.67$
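The calculation above can be checked in a few lines. The sketch below uses the expected frequencies rounded to one decimal place, as on the slide; with unrounded expected frequencies the statistic comes out slightly lower (about 20.65). For a 2 × 2 table the statistic has 1 degree of freedom, and in that special case the upper-tail probability can be written with the complementary error function, which is what the code relies on. Variable names are illustrative.

```python
import math

observed = [308, 315, 476, 297]
expected = [349.9, 273.1, 434.1, 338.9]  # rounded, as on the slide

# Chi-square statistic: sum over cells of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 1 degree of freedom only: P(chi2 > x) = erfc(sqrt(x / 2)),
# because a chi-square variable with 1 df is a squared standard normal.
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}")
print(f"p-value < 0.0001: {p_value < 0.0001}")
```

The `erfc` shortcut does not apply to larger tables; for more than 1 degree of freedom the p-value is usually taken from software or from chi-square tables.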

Chi-Square Test


- p-value = 0.0000
- Conclusion: reject H0 and conclude that there is a significant association between Tense and Aspect
- Note that the chi-square test can be carried out for any r × c contingency table

Chi-Square Test for r × c tables

- A study was conducted to test a possible relationship between first language background and desire for a student-centered classroom in an advanced-level adult ESL class
- Apply the chi-square test at the 0.05 level of significance to test for association between L1 background and the response to the question
- H0: the two attributes (L1 background and stated opinion) are independent
- H1: the two attributes are associated

Chi-Square Test for rxc tables


                  Opinion
Group             For   Against   Undecided
Far East group    11    45        16
Spanish group     30    12        8
Mid East group    25    7         10
Others            21    11        11
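The same chi-square procedure extends to this 4 × 3 table. The sketch below assumes the table reads Far East 11/45/16, Spanish 30/12/8, Mid East 25/7/10 and Others 21/11/11 for For/Against/Undecided, which is a reconstruction of the slide layout. The function name `chi_square` is illustrative, and the critical value is the standard 5% point of the chi-square distribution with 6 degrees of freedom taken from tables.

```python
def chi_square(table):
    """Return (statistic, df) for a test of independence on an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Columns: For, Against, Undecided (assumed reading of the slide's table).
opinions = {
    "Far East": [11, 45, 16],
    "Spanish": [30, 12, 8],
    "Mid East": [25, 7, 10],
    "Others": [21, 11, 11],
}

stat, df = chi_square(list(opinions.values()))
CHI2_CRIT = 12.592  # 5% critical value of chi-square with 6 df (from tables)
print(f"chi-square = {stat:.2f}, df = {df}")
print("reject H0" if stat > CHI2_CRIT else "do not reject H0")
```

Under that reading of the table, the statistic is well above the critical value, so the test would reject independence at the 0.05 level.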
