
GEDS 902: Statistics Class Notes - Peter A. Boateng (PhD in Business Admin).

Summer 2012

GEDS 902: STATISTICS


POPULATION, SAMPLE, & HYPOTHESIS (Date: 28/5/12)

POPULATION - the totality of a group.
SAMPLE - a subset of the population. A sample is used to generalize or infer.
o Generalize/infer: take part of a population and investigate it. Conclusions drawn from studying the sample are then applied to the whole population.

Example of hypothesis testing: a housewife tastes a little of the soup she has prepared to see if it is good to eat. Population: the soup; Sample: the portion tasted.

Why Do We Take a Sample?
1. Cost
   a. It is expensive to study every object within a population. Studying a sample requires fewer resources.
2. Time
   a. Studying a whole population is time consuming compared with studying part of it.
3. Accuracy
   a. Results obtained from a sample can be more accurate than results obtained from a whole population, because a small sample can be studied with more focus and thoroughness. Resources may not be enough to supervise the study of an entire population.
4. Destructive Items
   a. Suppose a drug is manufactured and sold without testing. What will happen? Mass burial. The drug has to be tested on a sample before it is sold, to avoid destruction.
   b. A bulb manufacturer who wants to study the life span of its bulbs takes a sample, dismantles it (if needed) and studies it, instead of dismantling all the bulbs.

Hypothesis testing cannot be carried out without data.

METHODS OF DATA COLLECTION


Two Types of Data: Primary Data & Secondary Data.
1. Primary
   a. Data collected by the researcher himself.
   b. Data collected for a specific purpose, and used for that purpose.
   c. When the data is used for another purpose, it becomes secondary data.
2. Secondary
   a. Data obtained from elsewhere: bulletins, subscription agencies, journals.
   b. Extracting information from other sources.

Collection of Primary Data: obtained by
o Questionnaire Administration
   No research is better than its questionnaire.
   A faulty questionnaire means faulty research.
   A questionnaire is a form that contains a list of questions intended to obtain information.
   A questionnaire is administered by:
   Mail. Problems of using mail: the respondent may assign another person to fill out the questionnaire; it is difficult to judge the sincerity of the respondent; respondents may not return the mail; the postmaster may also fail to deliver the mail.
   Personal interview. The interviewer approaches the respondent and personally asks questions.

   Advantage (personal interview): the sincerity of the respondent can be judged, since responses can be compared with the environment and the respondent's gestures. The interviewer is also able to explain and convince the respondent of the need to cooperate.
   Problem: "they quoted me out of context". The interviewer could misinterpret the respondent's answers.
   Telephone. Seek responses from the respondent through the telephone.
   Problem: inconsistent network; respondents may not answer unknown callers; respondents may refuse, to avoid disturbances.
o Observation
   First-hand information; personal presence and recording of stories.
   Problem: risky; inability to tell the true story.
o Experiment
   Peculiar to the sciences: chemistry, biology, physics.
o Focus/group discussion
   Obtaining information by engaging a group in a discussion. This could be friends, members of a community, students, church members, etc.

SAMPLING TECHNIQUES
How do you select a sample?

Probability Sampling: every element in the population has an equal chance of being selected.
o Simple random sampling. It is the best. Every element in the population has an equal chance of being selected. This can be done through:
   Lottery: all elements are placed in a basket, giving all objects an equal chance of being selected (for information).
   Random table: elements are assigned numbers, placed in a table, then selected at random.
o Systematic sampling. Selection is done at every Kth interval. For example, you decide to select every person who falls on the 7th (Kth) position; this means in multiples of 7. This technique can introduce bias.
o Stratified sampling. Used where the population is heterogeneous. Separate the population into groups (strata), then use simple random sampling or systematic sampling to select the sample from each stratum.
o Cluster sampling. (Date: 4/6/12) This is selection from randomly chosen groups of neighbouring individuals. Cluster sampling is used when the population has not been listed. Having chosen a group, you may have to interview everybody in it; you can't select part of the group.
o Multi-stage sampling. Here, a few areas are selected which we believe are representative of the population as a whole. We then take a random sample within each of these areas. It is used where the population is spread out, particularly over a geographical area. This works like cascading. For example, pick states in Nigeria; then pick local governments from each state; you can then pick other groups from each local government as the sample.
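The first three probability techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a hypothetical list of 100 numbered people, the strata names are made up, and the function names are not from the notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Lottery method: every element has the same chance of being drawn."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k):
    """Pick the element at every k-th position (k-th, 2k-th, 3k-th, ...)."""
    return population[k - 1::k]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample separately inside each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: people numbered 1 to 100.
population = list(range(1, 101))

lottery = simple_random_sample(population, 10, seed=1)
every_7th = systematic_sample(population, 7)

# Hypothetical strata (a heterogeneous population split into two groups).
strata = {"undergraduate": population[:60], "postgraduate": population[60:]}
per_group = stratified_sample(strata, 5, seed=1)

print(lottery)
print(every_7th)
print(per_group)
```

Note that the systematic sample always lands on multiples of 7, which is why it can be biased if the list has a pattern with the same period.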

Non-probability Sampling: not all elements have an equal chance of being selected.
o Quota sampling. Very useful in market research. For example, looking around to interview only people who are wearing suits. A criterion is needed before interviewing. The enumerator is given instructions (for example, to interview 30 people who are corporately dressed during a graduation ceremony). This is done by specifying how many people or items within a certain group you want sampled (setting a quota), and then collecting data from anyone or anything fitting the required category until the quota is filled. If one respondent refuses to respond, the enumerator can look around for another person to fill the quota. Unlike in probability sampling, there is always room to fill blank spaces when a respondent refuses; in probability sampling, the enumerator cannot select another person/object as a replacement.
o Judgmental sampling (purposive sampling). An expert uses personal judgment to select what he or she believes is a truly representative sample. It involves human judgment, so there could be bias in this method.
o Convenience sampling. Select a sample at your own convenience.

Is there any difference between purposive and convenience sampling?


QUESTIONNAIRE DESIGN
For a researcher to achieve a good response rate with the required type of answers, a great deal of care must be taken in the choice and design of the questionnaire. Designing a good questionnaire requires the following as a guide:
1. Questions must be easily understood.
   a. Use simple English with clear expressions.
2. Questions should not be ambiguous.
   a. E.g. "Do you think that boys or girls have better dress sense, or is it simply the influence of their parents?"
3. *Questions should not require the respondent to perform calculations or decide upon classifications.
   a. E.g. asking a person to calculate how much interest he pays on a loan taken from a bank.
   b. Asking a person to indicate the number of bottles of soft drinks taken in a year.
4. Questions should be relevant to the enquiry.
   a. Don't ask questions that are not related to the subject of study.
5. Unless it is mandatory, avoid personal questions.
   a. Keep in mind that some personal questions may embarrass or offend the respondent; others make them feel uncomfortable. Respondents may end up not responding at all. E.g. "How old are you?"; "How many children do you have?"
6. Whenever possible, give people a set of answers to choose from (closed-ended questions).
   a. This will reduce the problem you will encounter when categorizing answers.
   b. You should, however, allow people the opportunity to give an answer other than those you specify, if they wish to (use "Others, please specify").
7. The questions should be as short as possible.
   a. Avoid long questions.
   b. Avoid leading questions (e.g. "Do you attend a university because of the certificate?" or "Are you against the death penalty?").
8. Questions should be asked in a logical sequence.
   a. Questions should be asked in their right order.
9. Do not ask questions that rely on memory.
   a. E.g. "When did the armed robbers attack the shop in your community?"


HYPOTHESIS TESTING
Definition of Terms:
1. Null Hypothesis
   a. Hypothesis of no change/difference; no effect.
   b. It is denoted by Ho.
   c. Always stated in this form:
      i. Ho: Babcock University is equal to University of Ibadan.
      ii. This means that there is no difference between the two schools.
2. Alternative Hypothesis
   a. This is the hypothesis of a change/difference.
   b. It is denoted by H1.
   c. It is the researcher who determines the H1.
   d. It could be in the form of:
      i. H1: Babcock University > Univ. of Ibadan
      ii. H1: BU < UI
      iii. H1: BU ≠ UI
   e. These statements must be taken one at a time; the researcher must determine whether the chosen statement is true or not. The null hypothesis is not what the researcher sets out to show, since it states that there is no difference; the test checks whether the data give enough evidence to reject Ho in favour of H1.

3. One-tailed Test
   a. This is where H1 is well specified (it states a direction).
      i. Ho: Maclean = Close up
      ii. H1: Maclean > Close up
      or
      iii. Ho: Maclean = Close up
      iv. H1: Maclean < Close up


Alpha (the level of significance) should not be too large. A higher level of significance increases the number of casualties (cases where a true Ho is wrongly rejected).

IF CALCULATED IS LESS THAN TABULATED, IT FALLS IN THE ACCEPTANCE REGION: ACCEPT Ho.
IF CALCULATED IS GREATER THAN TABULATED, IT FALLS IN THE REJECTION REGION: REJECT Ho.

4. Two-tailed Test
   a. It has two tails.
   b. It is the one where the H1 (alternative hypothesis) is not well specified (no direction is stated).
      i. Ho: Maclean = Close up
      ii. H1: Maclean ≠ Close up

11 June, 2012

HYPOTHESIS ERRORS: TYPE I & TYPE II ERRORS

DECISION     | Ho True          | Ho False
Reject Ho    | Type I error     | Correct decision
Accept Ho    | Correct decision | Type II error

TYPE I ERROR: It is unlikely that a test statistic would fall in the critical region when Ho is true, but it can happen. In that case we reject Ho and make an error in doing so. This is called a TYPE I ERROR (i.e. rejecting Ho when it is true). Example: rejecting the statement that men normally commit more crime than women, when the truth is that men do commit more crime than women.
TYPE II ERROR: This occurs when one fails to reject Ho when it is false. A Type II error will occur if the test statistic does not fall in the critical region when Ho is in fact false.

In the social sciences, alpha is usually set at 5%; 0.1% is used in other areas.


Critical Region: This is the subset of the sample space that leads to the rejection of the null hypothesis under consideration.
Significance Level: It is the probability of taking a wrong decision (i.e. the probability of making a Type I error), denoted by α:
   α = P(rejecting Ho when Ho is true)
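One way to see what the significance level means is a small simulation. The sketch below is an illustration, not part of the class notes: it assumes a hypothetical population that is normal with mean 100 and known standard deviation 15, so Ho is true by construction, and counts how often a two-tailed Z test at α = 5% rejects it anyway.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(trials=2000, n=30, mu0=100.0, sigma=15.0,
                        alpha=0.05, seed=0):
    """Share of samples that wrongly reject a true Ho (Type I errors)."""
    rng = random.Random(seed)
    # Two-tailed critical value, about 1.96 when alpha = 0.05.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Ho is true here: every sample really comes from a mean of mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) >= critical:
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # expected to be close to alpha = 0.05
```

The printed share should sit near 0.05, which is the sense in which α is the probability of rejecting Ho when it is true.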

GENERAL PROCEDURE FOR TEST OF HYPOTHESIS


1. Formulate the null and alternative hypotheses.
   a. This is the first thing to be done after the data has been collected. Ho and H1 must be specified. H1 is picked from the objectives of the study; H1 is the research hypothesis.
2. Determine the appropriate test statistic and compute its value.
   a. Determine whether it is Z or t.
   b. Z means it is normal.
   c. When n is large, i.e. n ≥ 30, we use Z:
      Z = (x̄ − μ) / (s / √n)
   d. When n is small, i.e. n < 30, we use the t distribution with (n − 1) degrees of freedom:
      t = (x̄ − μ) / (s / √n)
   Where: x̄ = sample mean; μ = population mean stated in Ho; s = standard deviation; n = sample size.
3. Choose the level of significance, i.e. α.
   a. Generally, 5% is used for management sciences.
4. Determine the critical region.
5. Make a statistical decision.
6. Interpret results.
   a. Interpret statistical statements.
   b. Explain in simple terms why Ho or H1 has been accepted or rejected.

***Determine the critical region

ONE TAIL TEST
The table measures 0.5 (half of the area under the curve). Take α = 5%; the tabulated Z value is 1.645.
This means that any value that is greater than or equal to (≥) 1.645 must be rejected (in other words, only ACCEPT values that are less than 1.645).

TWO TAILED TEST
Take α = 5%, so α/2 = 2.5% lies in each tail; the tabulated Z values are −1.96 and +1.96.
This means that any value that is ≤ −1.96 or ≥ +1.96 must be rejected (in other words, only ACCEPT values that are between −1.96 and +1.96).
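The tabulated figures can be checked without a printed table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper names are hypothetical and the one-tailed check assumes an upper-tail (H1: >) test.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Tabulated Z for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    return NormalDist().inv_cdf(1 - alpha / tails)

def in_rejection_region(calculated, alpha, tails):
    """True when the calculated Z falls in the rejection region."""
    tabulated = critical_z(alpha, tails)
    if tails == 2:
        return abs(calculated) >= tabulated
    return calculated >= tabulated  # upper-tail test assumed

one_tail = round(critical_z(0.05, 1), 3)
two_tail = round(critical_z(0.05, 2), 2)

print(one_tail)  # 1.645
print(two_tail)  # 1.96
```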

MEAN COMPUTATION
The mean is an average, denoted by x̄. Given X1, X2, X3, ..., Xn, the mean is
   x̄ = (X1 + X2 + X3 + ... + Xn) / n = ΣX / n
Where: X1, X2, X3, ..., Xn could be taken as the ages of students in a class, and n is the number of students.

Sample Mean:
Ages of students: 50, 52, 42, 35, 65, 54, 32, 40, 42, 41, 46, 49, 47, 61, 45, 35, 18, 40, 48, 32, 38

Number of students (n) = 21
Total of students' ages (ΣX) = 912
x̄ = 912 / 21 ≈ 43.43
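The same computation can be checked in Python, using the ages listed above. Only the variable names are introduced here.

```python
# Ages of the 21 students, as listed in the notes.
ages = [50, 52, 42, 35, 65, 54, 32, 40, 42, 41, 46,
        49, 47, 61, 45, 35, 18, 40, 48, 32, 38]

n = len(ages)              # number of students
total = sum(ages)          # total of the ages
sample_mean = total / n    # mean = total / n

print(n, total, round(sample_mean, 2))  # 21 912 43.43
```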



Example 1: A sports biologist claims that female distance runners tend to be taller on average than women in general, who have an average height of 64. To study this claim, she obtained a random sample of 40 female distance runners and recorded their heights; the sample mean is 65.6 and the standard deviation is 3.3. Using this result, test the claim at the 5% level of significance.


Where: n = 40; x̄ = 65.6; s = 3.3; μ = 64; α = 5%.

Solve by following the procedure for hypothesis testing.
1. Formulate the null and alternative hypotheses.
   Ho: μ = 64
   H1: μ > 64
2. Determine the appropriate test statistic and compute its value.
   a. Determine whether it is Z or t.
   b. Z means it is normal.
   c. When n is large, i.e. n ≥ 30, we use Z.
   d. When n is small, i.e. n < 30, we use the t distribution.
   Here we use Z since the sample size (n = 40) is more than 30.
   Z = (x̄ − μ) / (s / √n) = (65.6 − 64) / (3.3 / √40)
   ...continue solving till the final figure (this is the CALCULATED figure).
3. Choose the level of significance, i.e. α. Generally, 5% is used for management sciences.
4. Determine the critical region. It is a ONE TAIL TEST because the H1 is well specified. On the TABLE, Z at one tail for α = 5% is 1.645. This is the TABULATED figure, to be compared with the CALCULATED figure from step 2.
   Z = 1.645.
5. Make a statistical decision. Here you compare CALCULATED with TABULATED and decide whether to accept or reject the null hypothesis.
6. Interpret results.
   a. Interpret statistical statements.
   b. Explain in simple terms why Ho or H1 has been accepted or rejected.
====================
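The arithmetic left open in step 2 of Example 1 can be finished with a short Python sketch. The figures come from the example itself; the variable names are assumptions, and the tabulated value 1.645 is the one quoted in the notes for a one-tailed test at 5%.

```python
from math import sqrt

# Figures from Example 1.
n = 40               # sample size
sample_mean = 65.6   # mean height of the sampled runners
s = 3.3              # standard deviation
mu0 = 64             # mean height of women in general (Ho)
tabulated = 1.645    # one-tailed Z at the 5% level, from the table

# Step 2: CALCULATED figure.
calculated = (sample_mean - mu0) / (s / sqrt(n))

# Step 5: decision rule, calculated >= tabulated means reject Ho.
reject_h0 = calculated >= tabulated

print(round(calculated, 2))  # 3.07
print("Reject Ho" if reject_h0 else "Accept Ho")
```

Since the calculated figure is larger than 1.645, it falls in the rejection region, so the data support the claim that female distance runners are taller on average.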

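EXAMPLE 2: TWO TAIL TEST. The mean lifetime of a sample of 100 fluorescent light bulbs produced by a company is computed to be 1570 hours, with a standard deviation of 120 hours. If μ (the population mean) of all the bulbs produced by the company is 1600 hours, test whether there is a significant difference (two tail test) at the 5% level of significance.

Ho: μ = 1600
H1: μ ≠ 1600

CALCULATED: Z = (1570 − 1600) / (120 / √100) = −2.5
The corresponding figure on the table (TABULATED) is 1.96. Remember that this is a two tailed test, so the rejection regions lie below −Zα/2 = −1.96 and above +Zα/2 = +1.96.

[Diagram: two-tailed curve with critical values −1.96 and +1.96; the calculated value −2.5 lies in the left rejection region.]

Since −2.5 is less than −1.96, it falls in the rejection region. This means that Ho has to be rejected.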
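Example 2 can be reproduced in Python as a check. The figures are from the example; the variable names are assumptions, and the two-tailed p-value at the end is an extra that the notes do not compute.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Example 2.
n = 100
sample_mean = 1570   # hours
s = 120              # hours
mu0 = 1600           # hours, population mean under Ho
alpha = 0.05

calculated = (sample_mean - mu0) / (s / sqrt(n))
tabulated = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Two-tailed rule: reject Ho when |calculated| >= tabulated.
reject_h0 = abs(calculated) >= tabulated

# Extra (not in the notes): probability of a Z at least this extreme under Ho.
p_value = 2 * NormalDist().cdf(-abs(calculated))

print(calculated)            # -2.5
print(round(tabulated, 2))   # 1.96
print(reject_h0)             # True
print(round(p_value, 4))
```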


EXAMPLE 3: In an intelligence test on 10 students, the following scores were obtained: 105, 120, 90, 65, 130, 110, 120, 115, 125, and 100. The average score for the class before a special tutorial for the test was 105. Has the special tutorial improved the performance of the students, at the 1% level of significance?

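We use the "t" test in this case because n is less than 30. When n is small, that is, n < 30, we use the t distribution with (n − 1) degrees of freedom:
   t = (x̄ − μ) / (s / √n)
Where: x̄ = sample mean; s = sample standard deviation; μ = 105; n = 10.

Next, find x̄ and s.
A. Compute the mean - x̄.
   x̄ = ΣX / n
B. Compute the standard deviation - s.
   s = √( Σ(X − x̄)² / (n − 1) )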
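Steps A and B, and the t statistic that follows from them, can be worked through in Python. This is a sketch under stated assumptions: the hypotheses are taken to be Ho: μ = 105 against H1: μ > 105 (one-tailed, since the question asks about improvement), and the tabulated t of 2.821 is the standard t-table value for 9 degrees of freedom at the 1% level, which does not appear in the notes themselves.

```python
from math import sqrt
from statistics import mean, stdev

# Scores from Example 3.
scores = [105, 120, 90, 65, 130, 110, 120, 115, 125, 100]
mu0 = 105            # class average before the tutorial (Ho)
tabulated = 2.821    # assumed: one-tailed t, df = 9, alpha = 1% (t-table)

n = len(scores)
x_bar = mean(scores)          # A. sample mean
s = stdev(scores)             # B. sample standard deviation (n - 1 divisor)
df = n - 1

calculated = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = calculated >= tabulated

print(x_bar, round(s, 2), df)   # 108 19.32 9
print(round(calculated, 2))     # 0.49
print("Reject Ho" if reject_h0 else "Accept Ho")
```

Under these assumptions the calculated t is well below the tabulated figure, so it falls in the acceptance region: the data do not show that the tutorial improved performance at the 1% level.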
