CAYABYAB
MSC 6 TF 1:00-2:30
BSED/FILIPINO
PROF. MOGUER
1) Normal Distributions
Z Score Formulas
You may also see the z score formula written as z = (x - x̄) / s. This is exactly the same
formula as z = (x - μ) / σ, except that x̄ (the sample mean) is used instead of μ (the
population mean) and s (the sample standard deviation) is used instead of σ (the
population standard deviation). However, the steps for solving it are exactly the same.
Z Score Formula: Standard Error of the Mean
When you have multiple samples and want to describe the standard deviation of those
sample means (the standard error), you would use this z score formula:
z = (x̄ - μ) / (σ / √n)
This z-score will tell you how many standard errors there are between the sample mean
and the population mean.
Sample problem: In general, the mean height of women is 65 inches with a standard
deviation of 3.5 inches. What is the probability of finding a random sample of 50 women
with a mean height of 70 inches, assuming the heights are normally distributed?
z = (x̄ - μ) / (σ / √n)
= (70 - 65) / (3.5 / √50) = 5 / 0.495 = 10.1
The key here is that we're dealing with a sampling distribution of means, so we know we
have to include the standard error in the formula. We also know that about 99.7% of
values fall within 3 standard deviations of the mean in a normal probability distribution
(see the 68-95-99.7 rule). A z-score of 10.1 lies far beyond that range, so there is far less
than a 1% probability that a random sample of 50 women will have a mean height of 70
or more.
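The sample problem above can be checked with a short Python sketch. The function name `z_for_sample_mean` is only illustrative, and the tail probability is taken from the standard normal distribution in Python's `statistics` module.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """Number of standard errors between a sample mean and the population mean."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Heights of women: mean 65, standard deviation 3.5, sample of 50 with mean 70
z = z_for_sample_mean(70, 65, 3.5, 50)

# Probability of a sample mean at least this far above the population mean
p_upper = 1 - NormalDist().cdf(z)

print(round(z, 1))     # 10.1
print(p_upper < 0.01)  # True
```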
The t score formula is:
t = (x̄ - μ0) / (s / √n)
Where
x̄ = sample mean
μ0 = population mean
s = sample standard deviation
n = sample size
If you have only one item in your sample, the square root in the denominator
becomes 1. This means the formula becomes:
t = (x̄ - μ0) / s
In simple terms, the larger the t score, the larger the difference is between the
groups you are testing. It's influenced by many factors.
3) For a certain type of computer, the length of time between charges of the
battery is normally distributed with a mean of 50 hours and a standard deviation
of 15 hours. John owns one of these computers and wants to know the
probability that the length of time will be between 50 and 70 hours.
Answer:
Let x be the random variable that represents the length of time. It has a mean of 50 and
a standard deviation of 15. We have to find the probability that x is between 50 and 70,
or P(50 < x < 70).
For x = 50 , z = (50 - 50) / 15 = 0
For x = 70 , z = (70 - 50) / 15 = 1.33 (rounded to 2 decimal places)
P(50 < x < 70) = P(0 < z < 1.33) = [area to the left of z = 1.33] - [area to the left of z = 0]
= 0.9082 - 0.5 = 0.4082
The probability that John's computer runs between 50 and 70 hours between charges is
equal to 0.4082.
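As a cross-check, the same area can be computed without a z table. This is a minimal sketch using `NormalDist` from Python's standard library; the variable names are illustrative. The unrounded result differs slightly from 0.4082 because the worked answer rounds z to 1.33 first.

```python
from statistics import NormalDist

# Battery life between charges: mean 50 hours, standard deviation 15 hours
battery = NormalDist(mu=50, sigma=15)

# Area between x = 50 and x = 70, using the unrounded z = 1.3333...
p_exact = battery.cdf(70) - battery.cdf(50)

# Same area using z rounded to 1.33, as in the table lookup
p_table = NormalDist().cdf(1.33) - NormalDist().cdf(0)

print(round(p_exact, 4))  # 0.4088
print(round(p_table, 4))  # 0.4082
```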
4) Entry to a certain university is determined by a national test. The scores on
this test are normally distributed with a mean of 500 and a standard deviation of
100. Tom wants to be admitted to this university and he knows that he must score
better than at least 70% of the students who took the test. Tom takes the test and
scores 585. Will he be admitted to this university?
Answer:
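Let x be the random variable that represents the scores. x is normally distributed with a
mean of 500 and a standard deviation of 100. The total area under the normal curve
represents the total number of students who took the test. If we multiply the values of
the areas under the curve by 100, we obtain percentages.
For x = 585, z = (585 - 500) / 100 = 0.85
The area to the left of z = 0.85 is 0.8023, so Tom scored better than about 80.23% of the
students who took the test. Since this is more than 70%, he will be admitted.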
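Tom's position can be worked out numerically along these lines. This is a sketch using `NormalDist` from Python's standard library; the names `share_below` and `cutoff` are illustrative, and the cutoff score is an extra check not asked for in the problem.

```python
from statistics import NormalDist

# Test scores: mean 500, standard deviation 100
scores = NormalDist(mu=500, sigma=100)

# Fraction of students scoring below Tom's 585 (z = 0.85)
share_below = scores.cdf(585)

# Score needed to beat 70% of the students
cutoff = scores.inv_cdf(0.70)

print(round(share_below * 100, 2))  # 80.23
print(round(cutoff, 1))             # 552.4
print(share_below > 0.70)           # True, so Tom is admitted
```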
Image: UH.edu
Percentage of scores
Bottom 4%
Next bottom 7%
Middle 20%
Next top 7%
Top 4%
The mean lies in the middle of the fifth stanine, cutting the center 20% into two parts.
Loss of Information
Stanines are a very simple way of categorizing items into top, middle and bottom
percentages. This simplicity means that it's a very imprecise way to measure anything.
Everyone in the same stanine receives the same score. For example, a person at the
bottom of the 5th stanine is almost 20 percentage points below the person at the top of
the 5th stanine. Differences like these, which the stanine score hides, are what is called
loss of information.
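The loss of information can be illustrated with a small sketch that maps a percentile rank to a stanine. It assumes the standard stanine split of 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent (the 12% and 17% bands are not listed above), and the function name and boundary handling are illustrative choices.

```python
from bisect import bisect_right

# Cumulative upper bounds (percentile ranks) of stanines 1 to 8.
# Assumption: the standard 4-7-12-17-20-17-12-7-4 percent split.
STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile_rank):
    """Map a percentile rank (0 to 100) to a stanine from 1 to 9."""
    return bisect_right(STANINE_UPPER_BOUNDS, percentile_rank) + 1

# Two people almost 20 percentage points apart get the same stanine
print(stanine(41), stanine(59))  # 5 5
print(stanine(2), stanine(98))   # 1 9
```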
The percentile rank of a score is the percentage of scores in its frequency distribution
that are equal to or lower than it. For example, a test score that is greater than or equal
to 75% of the scores of people taking the test is said to be at the 75th percentile, where
75 is the percentile rank. In educational measurement, a range of percentile ranks, often
appearing on a score report, shows the range within which the test taker's true
percentile rank probably occurs. The true value refers to the rank the test taker would
obtain if there were no random errors involved in the testing process.
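The definition of percentile rank translates directly into a few lines of Python. This is a minimal sketch; the function name and the list of scores are hypothetical and only serve to show the calculation.

```python
def percentile_rank(scores, value):
    """Percentage of scores that are equal to or lower than the given value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

# Hypothetical test scores, for illustration only
sample_scores = [55, 60, 65, 70, 75, 80, 85, 90]

# 6 of the 8 scores are equal to or lower than 80
print(percentile_rank(sample_scores, 80))  # 75.0
```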
You could perform all these steps by hand. For example, you could find a critical value
by hand, or calculate a z value by hand.
This tests for a difference in proportions. A two proportion z-test allows you to compare
two proportions to see if they are the same.
The null hypothesis (H0) for the test is that the proportions are the same.
The alternate hypothesis (H1) is that the proportions are not the same.
Sample question: let's say you're testing two flu drugs A and B. Drug A works on 41
people out of a sample of 195. Drug B works on 351 people in a sample of 605. Are the
two drugs comparable? Use a 5% alpha level.
Step 1: Find the two proportions:
P1 = 41/195 = 0.21 (that's 21%)
P2 = 351/605 = 0.58 (that's 58%).
Set these numbers aside for a moment.
Step 2: Find the overall sample proportion. The numerator will be the total number of
positive results for the two samples and the denominator is the total number of people
in the two samples.
p = (41 + 351) / (195 + 605) = 0.49.
Set this number aside for a moment.
Step 3: Insert the numbers from Step 1 and Step 2 into the test statistic formula:
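Steps 1 to 3 can be sketched in Python as follows. The sketch assumes the usual pooled form of the two proportion test statistic, z = (p1 - p2) / √(p(1 - p)(1/n1 + 1/n2)), and a two-tailed comparison at the 5% alpha level; the function and variable names are illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two proportion z statistic (assumed standard form)."""
    p1 = x1 / n1                      # Step 1: first sample proportion
    p2 = x2 / n2                      # Step 1: second sample proportion
    pooled = (x1 + x2) / (n1 + n2)    # Step 2: overall sample proportion
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error  # Step 3: test statistic

# Drug A works on 41 of 195 people, drug B on 351 of 605
z = two_proportion_z(41, 195, 351, 605)

# Two-tailed critical value for a 5% alpha level (about 1.96)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_null = abs(z) > critical

print(round(z, 2))         # -8.99
print(round(critical, 2))  # 1.96
print(reject_null)         # True
```

Under these assumptions the z value falls far outside the range from -1.96 to 1.96, which points to rejecting the null hypothesis that the two proportions are the same.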
Example
Let's say you're interested in whether the average New Yorker spends more than the
average Kansan per month on movies.
You ask a sample of 3 people from each state about their movie spending. You might
observe a difference in those averages (like $14 for the average Kansan and $18 for the
average New Yorker). But that difference is not statistically significant; it could easily just
be random luck of which 3 people you randomly sampled that makes one group appear
to spend more money than the other. If instead you ask 300 New Yorkers and 300
Kansans and still see a big difference, that difference is less likely to be caused by the
sample being unrepresentative.
Note that if you asked 300,000 New Yorkers and 300,000 Kansans, the result would
likely be statistically significant even if the difference between the groups was only a
penny. The t-test's effect size complements its statistical significance, describing the
magnitude of the difference, whether or not the difference is statistically significant.
Definition
A statistically significant t-test result is one in which a difference between two groups is
unlikely to have occurred because the sample happened to be atypical. Statistical
significance is determined by the size of the difference between the group averages, the
sample size, and the standard deviations of the groups. For practical purposes,
statistical significance suggests that the two larger populations from which we sample
are actually different.
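The role of sample size can be seen in a small sketch of a two-sample t statistic. It uses the unequal-variance (Welch) form, which is one common choice and is an assumption here. The group averages of $18 and $14 come from the example above, while the standard deviation of $10 in each group is a hypothetical value chosen only for illustration.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic in the unequal-variance (Welch) form."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# Averages from the example; the standard deviation of 10 is hypothetical
t_small = two_sample_t(18, 10, 3, 14, 10, 3)      # 3 people per state
t_large = two_sample_t(18, 10, 300, 14, 10, 300)  # 300 people per state

print(round(t_small, 2))  # 0.49
print(round(t_large, 2))  # 4.9
```

With the same $4 difference and the same assumed spread, the t statistic grows about tenfold when each group goes from 3 to 300 people, which is why the larger sample is much more likely to give a statistically significant result.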