Then for each number: subtract the Mean and square the result
(the squared difference).
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
300mm.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
Answer:
$$\text{Mean} = \frac{600 + 470 + 170 + 430 + 300}{5} = \frac{1970}{5} = 394$$
so the mean (average) height is 394 mm. Let's plot this on the chart:
To calculate the Variance, take each difference, square it, and then
average the result:
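The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the five heights are the whole population (so the sum of squared differences is divided by the number of values, as the text describes); the variable names are illustrative only.

```python
import math

# Shoulder heights in millimetres, taken from the text above.
heights = [600, 470, 170, 430, 300]

# Step 1: the mean.
mean = sum(heights) / len(heights)

# Step 2: subtract the mean from each height and square the result.
squared_diffs = [(h - mean) ** 2 for h in heights]

# Step 3: the variance is the average of the squared differences.
variance = sum(squared_diffs) / len(heights)

# Step 4: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean={mean}, variance={variance}, std_dev={std_dev:.0f}")
```

Rounded to the nearest millimetre, the standard deviation comes out at the 147 mm quoted in the text.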
And the good thing about the Standard Deviation is that it is useful. Now
we can show which heights are within one Standard Deviation (147mm) of
the Mean:
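As a rough sketch of that check, the snippet below splits the heights into those inside and outside the band from Mean - SD to Mean + SD. It assumes the population standard deviation (`statistics.pstdev` from the Python standard library); the names `within` and `outside` are illustrative only.

```python
import statistics

# Shoulder heights in millimetres, taken from the text above.
heights = [600, 470, 170, 430, 300]

mean = statistics.mean(heights)
std_dev = statistics.pstdev(heights)  # population standard deviation

# The band that lies within one standard deviation of the mean.
lower, upper = mean - std_dev, mean + std_dev

within = [h for h in heights if lower <= h <= upper]
outside = [h for h in heights if not (lower <= h <= upper)]

print("within one SD:", within)
print("outside one SD:", outside)
```

Three of the heights (470, 430 and 300) fall inside the band; 600 and 170 fall outside it.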
$$\frac{4 + 4 + 4 + 4}{4} = 4$$
That looks good (and is the Mean Deviation), but what about this case:
$$\frac{|7| + |1| + |-6| + |-2|}{4} = \frac{7 + 1 + 6 + 2}{4} = 4$$
Oh no! It also gives a value of 4, even though the differences are more
spread out!
So let us try squaring each difference (and taking the square root at the
end):
$$\sqrt{\frac{4^2 + 4^2 + 4^2 + 4^2}{4}} = \sqrt{\frac{64}{4}} = 4$$
$$\sqrt{\frac{7^2 + 1^2 + 6^2 + 2^2}{4}} = \sqrt{\frac{90}{4}} = 4.74\ldots$$
That is nice! The Standard Deviation is bigger when the differences are
more spread out ... just what we want!
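The comparison between the two approaches can be reproduced with a short sketch. The second set of differences (7, 1, -6, -2) comes from the text; the signs in the first set are an assumption, since only the magnitudes (all 4) are shown. The function names are illustrative only.

```python
import math

def mean_deviation(diffs):
    """Average of the absolute differences."""
    return sum(abs(d) for d in diffs) / len(diffs)

def root_mean_square(diffs):
    """Square each difference, average, then take the square root."""
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

even_diffs = [4, 4, -4, -4]    # signs assumed; the text shows magnitudes only
spread_diffs = [7, 1, -6, -2]  # from the text

for name, diffs in [("even", even_diffs), ("spread", spread_diffs)]:
    print(name, mean_deviation(diffs), round(root_mean_square(diffs), 2))
```

Both sets have a mean deviation of 4, but only the squared version separates them: 4 for the even set and about 4.74 for the spread-out set.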
In fact this method is a similar idea to distance between points, just
applied in a different way.
And it is easier to use algebra on squares and square roots than absolute
values, which makes the standard deviation easy to use in other areas of
mathematics.
Describing Data: Why median and IQR are often better than mean
and standard deviation
Added by Jim Wahl, last edited by Jim Wahl on Mar 20, 2013
Grading on a curve in college instilled a habit for using mean and standard
deviation to describe a set of continuous data points. On any given
assessment, about 68 percent of students were within one standard
deviation of the mean. These were the "B" students. About 15 percent
were more than one standard deviation above; these students each received
an "A" and the rest were "other." At the end of the semester, there might be one
or two students in a 100-student freshman course who were two standard
deviations above the class average. These students would get a nice
letter from the head of the department.
The habit of using "mean" and "standard deviation" and the convenient
rule that 68 percent of samples are within one standard deviation of the
mean and 95 percent are within two standard deviations makes these
measures attractive. Unfortunately, mean and standard deviation are
trickier to use than you might remember.
For starters, these statistics only work well on normally distributed "bell
curve" data. Such things as test scores or heights / weights of a
population are all generally "normal." Errors on a broadband service,
network capacity estimates, and stability metrics are generally not
normal. Error distributions, for example, are often "positively skewed" with
the left side of the bell curve compressed (most lines have low error
counts) and with a long tail on the right side of the mean (a few outliers
have continuous errors).
Another problem is that mean and standard deviation are not robust
against outliers. Below are two groups of data with identical mean and
standard deviation. But they're not identical: Group I has a wider
distribution below the mean and Group II has a single high outlier.
Below are the Group I and II data with the IQR and median. Now the red
median bar makes it clear that Group I's values are typically higher than
Group II's, but Group I also has a wider spread, as indicated by the wider
IQR shaded bar.
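The sensitivity to outliers can be illustrated with a small sketch. The numbers below are hypothetical, made up for illustration, and are not the Group I and Group II data from the figures; `iqr` is likewise an illustrative helper, built on `statistics.quantiles` from the Python standard library.

```python
import statistics

def iqr(values):
    """Interquartile range, Q3 - Q1, using inclusive quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical error counts: a tight cluster, then the same cluster
# with one extreme outlier appended.
base = [10, 10, 11, 11, 12, 12, 12, 13]
with_outlier = base + [100]

for name, data in [("base", base), ("with outlier", with_outlier)]:
    print(
        f"{name:>12}: mean={statistics.mean(data):.2f} "
        f"sd={statistics.pstdev(data):.2f} "
        f"median={statistics.median(data)} iqr={iqr(data)}"
    )
```

In this made-up example a single outlier moves the mean and standard deviation substantially, while the median and IQR barely change.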
Statistic              Excel Function
minimum                MIN()
maximum                MAX()
quartiles Q1 and Q3    QUARTILE.INC(, 1) and QUARTILE.INC(, 3)
median                 MEDIAN() or QUARTILE.INC(, 2)
IQR                    Q3 - Q1
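For readers working outside Excel, the same statistics can be sketched in Python. This assumes that `statistics.quantiles` with `method="inclusive"` interpolates the same way as QUARTILE.INC, which is worth verifying against your own data. The function name `five_number_summary` and the sample values are illustrative only.

```python
import statistics

def five_number_summary(values):
    """Return min, Q1, median, Q3, max and IQR for a list of numbers."""
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }

# Sample values chosen for illustration.
summary = five_number_summary([600, 470, 170, 430, 300])
for key, value in summary.items():
    print(f"{key}: {value}")
```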