
A2 statistics page 1

Statistics for A2 Biology


In A2 biology we need to learn how to use statistics to analyse experimental results. Sometimes experimental results are not
very clear and so it is difficult to make a firm conclusion. In these cases an appropriate statistical test can help to clarify the
results so that a valid conclusion can be made. Statistics can therefore be used to extract the maximum amount of
information from experimental data, and are an essential tool of experimental biology.
There are three stages in using a statistical test: choosing a test; carrying it out; and making a conclusion. Before we choose
a test though, we need to learn a little about different kinds of test and different kinds of data.

What kinds of statistics tests are there?


Statistics
    Descriptive Statistics
        Return a value that summarises data,
        e.g. mean, median, standard deviation, population index, etc.
    Inferential Statistics
        Return a probability (P-value) used to test a null hypothesis.
        Comparative Statistics
            Tests for a difference between two or more sets of data,
            e.g. t-test, χ²-test, etc.
        Association Statistics
            Tests for an association between two sets of data,
            e.g. correlation, regression.

Descriptive statistics are used to summarise data so you can simplify them and plot a graph. This is what you did in AS,
and includes the mean and 95% confidence interval. For non-normal data you may instead have to calculate the median and
interquartile range.
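As a rough sketch of what these descriptive statistics involve, here is one way to calculate them in Python. The data values and function names are made up for illustration, and the 95% CI uses the large-sample normal value (about 1.96) as a simplifying assumption; with only a few replicates the proper t-based interval is a little wider.

```python
import math
import statistics

def mean_with_ci95(values):
    """Return (mean, half-width of an approximate 95% confidence interval)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    # Assumption: normal critical value (about 1.96), not the t value.
    z = statistics.NormalDist().inv_cdf(0.975)
    return mean, z * sem

def median_with_quartiles(values):
    """Return (median, lower quartile, upper quartile)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q2, q1, q3

# Made-up plant heights in cm, for illustration only.
heights = [12.1, 13.4, 11.8, 12.9, 13.0, 12.5, 11.9, 13.2, 12.7, 12.4]

mean, half_width = mean_with_ci95(heights)
median, q1, q3 = median_with_quartiles(heights)
print(f"mean = {mean:.2f} +/- {half_width:.2f} (approx. 95% CI)")
print(f"median = {median:.2f}, interquartile range = {q1:.2f} to {q3:.2f}")
```

The mean and CI suit normally-distributed data; the median and quartiles are the ones to report when the data are not normal.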
Inferential Statistics test a statement called the null hypothesis, and return a probability (called a P-value). Strictly, the
P-value is the probability of getting results at least as extreme as yours if the null hypothesis were true. The null hypothesis
is a mathematical statement and is fixed for a given test (the exact null hypothesis is given for each test on the next few
pages). It has nothing to do with (and can be quite different from) any scientific hypothesis you may be making about the
result of the experiment. The lower the P-value, the less consistent the data are with the null hypothesis, and in biology we
usually take 0.05 (or 5%) as the cut-off. This may seem very low, but it reflects the fact that biology experiments are
expected to produce quite varied results.
If P < 5% (0.05) then we reject the null hypothesis, and conclude that there is a significant difference or association.
If P ≥ 5% (0.05) then we reject nothing: we accept the null hypothesis, and conclude that there is no significant difference or association.
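The 5% rule can be written as a tiny Python function. This is only a sketch: the function name `conclude` and the wording of the returned messages are made up for illustration.

```python
def conclude(p_value, cutoff=0.05):
    """Apply the 5% rule to a P-value given as a fraction (0.02, not 2)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < cutoff:
        return "reject the null hypothesis: significant difference or association"
    return "accept the null hypothesis: no significant difference or association"

print(conclude(0.02))  # P = 2%
print(conclude(0.30))  # P = 30%
```

Note that the P-value goes in as a fraction, so a cell formatted as 2% corresponds to 0.02.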
There are two kinds of inferential statistics:
Comparative statistics are used to compare different sets of data to see if they are different (e.g. is this group bigger than
that group?). The null hypothesis states that there is no difference between the sets of data. You must also choose between
matched-sample experiments and independent sample experiments. In a matched-sample experiment each measurement
from the first group matches up with a corresponding measurement in the other groups, perhaps because they were all made
on the same subject (e.g. a "before and after" experiment). Otherwise you have an independent sample experiment. If it isn't
obvious which value matches with which, then it's probably not matched.
Association statistics are used to look for an association (or correlation) between two sets of data (e.g. if this goes up does
that go up?). The null hypothesis states that there is no association between the sets. A scatter graph (or mosaic chart for
categoric data) of one factor against the other (without a line of best fit) indicates the association. If both factors increase
together then there is a positive correlation; if one factor decreases when the other increases then there is a negative
correlation; and if the scatter graph has apparently random points then there is no correlation:

HGS Biology

NCM/01/06


[Figure: three scatter graphs of variable 2 against variable 1, labelled Positive Correlation, Negative Correlation and
No Correlation.]

The association is described by two statistics:


First there is a P-value, which tests the null hypothesis that there is no correlation. If P<5% then the null hypothesis is
rejected and there is a significant correlation (with the strength indicated by the correlation coefficient); but if P>5%
then we accept the null hypothesis that there is no correlation (and the value of the correlation coefficient is
meaningless).
Second there is a correlation coefficient, which gives the strength of the association. It varies from -1 (perfect negative
correlation) through 0 (no correlation) to +1 (perfect positive correlation). Positive values indicate a positive correlation
while negative values indicate a negative correlation. The larger the absolute value, the stronger the correlation.
A correlation does not necessarily mean that there is a causal relationship between the factors (i.e. changes in one factor
cause the changes in the other). The changes may both be caused by a third factor, or it could be just coincidence. Further
controlled studies would be needed to find out.
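To see how a correlation coefficient behaves, here is a minimal Python sketch of the standard Pearson formula. The function name and the three small data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the variables does not vary")
    return sxy / math.sqrt(sxx * syy)

variable_1 = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # goes up with variable 1
falling = [10, 8, 6, 4, 2]   # goes down as variable 1 goes up
unrelated = [1, 2, 1, 2, 1]  # no overall trend

print(pearson_r(variable_1, rising))     # close to +1
print(pearson_r(variable_1, falling))    # close to -1
print(pearson_r(variable_1, unrelated))  # close to 0
```

The coefficient only describes how closely the points follow a straight line; it says nothing about cause.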

What kinds of data are there?


There are different statistical tests designed for different kinds of data. Data can be classified as quantitative (numbers) or
qualitative (words).

Data
    Quantitative Data (numbers)
        normally distributed                 -> Normally Distributed Data: use parametric tests
        not normally distributed or ordinal  -> Ordinal Data: use non-parametric tests
    Qualitative Data (words)
        can be ranked                        -> Ordinal Data: use non-parametric tests
        cannot be ranked                     -> Nominal Data: use frequency tests

Most quantitative measurements (e.g. length, mass, temperature, rates, counts) are normally-distributed, especially if you
have a large number of repeats. For normally-distributed data you can use the most powerful Parametric Tests.
Other quantitative measurements are not normally-distributed. These include arbitrary scales like "1-5", calculated data
and data sets with extreme outliers. In this case the parametric tests are invalid, so choose Non-Parametric Tests.
Some qualitative data can be ranked, so can be replaced by numerical ranks (e.g. big, medium, small become 1, 2, 3).
These can then be analysed using Non-Parametric Tests.
Finally, the data can simply be categories that cannot be ranked (e.g. colours, shapes, species). We can't do maths on
categoric data, but we can count the numbers in each category to give frequencies, and then compare these observed
frequencies with some expected frequencies using Frequency Tests.
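The two ways of handling qualitative data described above can be sketched in Python. The names `SIZE_RANKS`, `to_ranks` and `frequencies`, and the example observations, are made up for illustration.

```python
from collections import Counter

# Ordinal data: words that can be ranked are replaced by numerical ranks.
SIZE_RANKS = {"big": 1, "medium": 2, "small": 3}

def to_ranks(labels, rank_of=SIZE_RANKS):
    """Replace each ranked category by its numerical rank."""
    return [rank_of[label] for label in labels]

def frequencies(labels):
    """Count how many observations fall into each category (nominal data)."""
    return Counter(labels)

sizes = ["big", "small", "medium", "big", "small"]
colours = ["red", "white", "red", "red", "white", "pink"]

ranks = to_ranks(sizes)        # ready for a non-parametric test
counts = frequencies(colours)  # ready for a frequency test
print(ranks)
print(dict(counts))
```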

All the different statistical tests you will come across are summarised in this table. Don't panic, you don't need to learn this!

                                        Parametric Tests      Non-Parametric Tests    Frequency Tests
                                        (for normal data)     (for ordinal data)      (for nominal data)

Descriptive Statistics
  Summarise Data                        Mean, Standard        Median, Quartiles
                                        Deviation, 95% CI
  Chart                                 Bar Chart             Box Plot                Pie or Bar Chart

Comparative Statistics
  Independent Samples, 2 groups         Unpaired t-Test       Mann-Whitney U-Test     Chi-Squared Test or G-Test
  Independent Samples, 2 or more groups Anova                 Kruskal-Wallis Test
  Matched Samples, 2 groups             Paired t-Test         Wilcoxon Test
  Matched Samples, 2 or more groups     Matched Anova         Friedman Test

Association Statistics
  Chart                                 Scatter Graph         Scatter Graph           Mosaic Chart
  Test for an Association               Pearson Correlation   Spearman Correlation    Chi-Squared Test of Association
  Details of Linear Relationship        Linear Regression
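The summary table can also be read as a simple lookup. The Python sketch below encodes it that way; the function names and the labels "normal", "ordinal" and "nominal" are made up for illustration, and it assumes that exactly two groups always use the two-group test.

```python
# (kind of data, matched samples?, more than two groups?) -> comparative test
COMPARATIVE_TESTS = {
    ("normal", False, False): "Unpaired t-Test",
    ("normal", False, True): "Anova",
    ("normal", True, False): "Paired t-Test",
    ("normal", True, True): "Matched Anova",
    ("ordinal", False, False): "Mann-Whitney U-Test",
    ("ordinal", False, True): "Kruskal-Wallis Test",
    ("ordinal", True, False): "Wilcoxon Test",
    ("ordinal", True, True): "Friedman Test",
}

ASSOCIATION_TESTS = {
    "normal": "Pearson Correlation",
    "ordinal": "Spearman Correlation",
    "nominal": "Chi-Squared Test of Association",
}

def choose_comparative_test(data_kind, matched, n_groups):
    """Pick a test for a difference between n_groups sets of data."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if data_kind == "nominal":
        return "Chi-Squared Test or G-Test"
    return COMPARATIVE_TESTS[(data_kind, matched, n_groups > 2)]

def choose_association_test(data_kind):
    """Pick a test for an association between two sets of data."""
    return ASSOCIATION_TESTS[data_kind]

print(choose_comparative_test("normal", matched=False, n_groups=2))
print(choose_comparative_test("ordinal", matched=True, n_groups=3))
print(choose_association_test("nominal"))
```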

The Six Steps in a Statistics-Based Investigation


1. Choose a Test
While planning your investigation, choose a suitable statistical test using Merlin's "Choose a Test" or the table above.

2. State the Null Hypothesis
Look up the test in this document. State the null hypothesis as precisely as you can for your experiment; e.g. "There is no
difference between the means of the plant heights in the two areas". This statement is what the statistical test will actually
test.

3. Obtain Results
Carry out the investigation and obtain results. For hypothesis-testing we need as many replicate measurements as possible,
and as a guide, aim for at least 10 replicates and preferably 20 in each set.

4. Present the Data
Present your raw data in a neat results table. Use Merlin to calculate descriptive statistics, like mean and 95% CI, and plot
an appropriate graph.

5. Carry out the Statistical Test
Type the Merlin formula for your chosen test into an empty cell e.g. =TTESTP(B3:B12,C3:C12). This will return the
P-value for that test. It's a good idea to format this cell as a percentage (Format menu > Cells > Number tab > Percentage),
so for example a P-value of 0.02 appears as 2%.

6. Make a Conclusion
State whether you accept or reject the null hypothesis, and write a sentence explaining exactly what that means in this case.
For example if P was < 5% then you would reject the null hypothesis and say that the plants in group A are significantly
taller than the plants in group B. Or if P was > 5% then you would accept the null hypothesis and say that as far as you can
tell from the data, there is no significant difference between the heights of the two groups of plants. The wording of the
conclusion is important, so use these examples as a guide and think carefully about what you are saying. Note that if
P > 5% we haven't proved the null hypothesis, but since our data are consistent with it, we accept it.

There are lots of examples of conclusions on the next five pages. This is a statistics reference guide, which describes each
of the tests in detail.


Unpaired t-Test

(t-test)
This test is used to compare two sets of data, and it
tests the null hypothesis that the two sets have the
same mean. The data must be normally-distributed and
there must be at least 10 replicates (and preferably many
more). If P<5% then the null hypothesis is rejected
and there is a significant difference between the two
means.
The Merlin function is =TTESTU(range1, range2). In
this example the effect of two fertilisers on yield of
potatoes is compared. The formula
=TTESTU(B3:B12,C3:C12) is typed into cell B15
and formatted as a percentage. The P value is >5% so
we accept the null hypothesis and conclude that there
is no significant difference between the two fertilisers.
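Merlin returns the P-value directly, but the t statistic behind an unpaired t-test can be sketched in a few lines of Python. This version assumes the classic pooled-variance form of the test (whether TTESTU does the same is not stated here), and the yields are made up for illustration. The standard library has no t distribution, so instead of a P-value the sketch compares t with the tabulated two-tailed 5% critical value for 18 degrees of freedom, 2.101.

```python
import math
import statistics

def unpaired_t(a, b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))  # standard error of the difference
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2

# Made-up potato yields (kg per plot) for two fertilisers, 10 replicates each.
fertiliser_a = [27, 20, 16, 18, 22, 19, 23, 21, 17, 19]
fertiliser_b = [28, 19, 18, 21, 24, 20, 25, 27, 29, 21]

CRITICAL_T_18_DF = 2.101  # two-tailed 5% critical value, 18 degrees of freedom

t, df = unpaired_t(fertiliser_a, fertiliser_b)
significant = abs(t) > CRITICAL_T_18_DF
print(f"t = {t:.2f} with {df} degrees of freedom; significant at 5%: {significant}")
```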

Paired t-Test
This test is used to compare two sets of paired
(matched) data, and it tests the null hypothesis that the
mean difference between the pairs is zero. The data
must be in pairs, they must be normally-distributed
and there must be at least 10 replicates (and preferably
many more). If P<5% then the null hypothesis is
rejected and there is a significant difference between
the two sets.
The Merlin function is =TTESTP(range1, range2).
This example compares pulse rate before and after
eating a large meal. Because each individual had their
pulse measured before and after the meal then the data
are paired. The formula =TTESTP(B3:B12,C3:C12) is
typed into cell B15 and formatted as a percentage. The
P value is <5% so we reject the null hypothesis and
conclude that pulse rate is significantly higher after a
meal.
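The paired test works on the differences within each pair, which the following Python sketch makes explicit. The pulse rates and the function name are made up for illustration. As the standard library has no t distribution, t is compared with the tabulated two-tailed 5% critical value for 9 degrees of freedom, 2.262, rather than being converted to a P-value.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired t-test on matched values."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of values in each set")
    diffs = [a - b for b, a in zip(before, after)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical, so t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Made-up pulse rates (beats per minute) for the same 10 people.
before_meal = [105, 79, 79, 103, 87, 74, 73, 83, 77, 70]
after_meal = [109, 87, 86, 109, 100, 82, 80, 90, 90, 78]

CRITICAL_T_9_DF = 2.262  # two-tailed 5% critical value, 9 degrees of freedom

t, df = paired_t(before_meal, after_meal)
significant = abs(t) > CRITICAL_T_9_DF
print(f"t = {t:.2f} with {df} degrees of freedom; significant at 5%: {significant}")
```

Because each person acts as their own control, variation between people drops out of the calculation.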

ANOVA (analysis of variance)


This test is used to compare two or more sets of data,
and it tests the null hypothesis that the sets have the
same mean. The data must be normally-distributed and
there must be at least 10 replicates (and preferably many
more). If P<5% then the null hypothesis is rejected,
and at least one of the sets is significantly different.
The Merlin function is =ANOVA(range), where each
column in the range is a different set. This example
compares the effect of 3 different colours of light on the
rate of photosynthesis in Elodea measured by length of
oxygen bubble produced in a given time. The formula
=ANOVA(B3:D12) is typed into cell B15 and
formatted as a percentage. The P value is <5% so we
reject the null hypothesis and conclude that at least
one of the colours is significantly different. From the
means, green must be significantly lower than the
others.
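ANOVA compares the variation between the sets with the variation within them, and the ratio of the two is called F. The Python sketch below calculates F for a one-way ANOVA; the bubble lengths and the function name are made up for illustration. Turning F into a P-value needs the F distribution, which the standard library does not provide, so the sketch stops at F and its degrees of freedom.

```python
import statistics

def anova_f(groups):
    """Return (F, df between sets, df within sets) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.mean(g) for g in groups]
    # Variation of the set means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual values around their own set mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up oxygen bubble lengths (mm) under three colours of light.
red = [22, 25, 24, 27, 23, 26, 25, 24, 26, 23]
green = [10, 12, 9, 11, 13, 10, 12, 11, 9, 10]
blue = [24, 27, 26, 25, 28, 24, 26, 27, 25, 26]

f, df_between, df_within = anova_f([red, green, blue])
print(f"F = {f:.1f} with {df_between} and {df_within} degrees of freedom")
```

A large F means the sets differ by much more than the scatter within each set would explain.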


Matched Samples ANOVA


This test is used to compare two or more sets of
matched data, and it tests the null hypothesis that the
mean difference between the sets is zero. The data
must be matched, they must be normally-distributed
and there must be at least 10 replicates (and preferably
many more). If P<5% then the null hypothesis is
rejected and at least one of the sets is significantly
different.
The Merlin function is =ANOVAM(range), where
each column in the range is a different set. This
example compares the mass of food eaten by 8 deer at
four different times of year. The formula
=ANOVAM(B3:E10) is typed into cell B13 and
formatted as a percentage. The P value is <5% so we
reject the null hypothesis and conclude that at least
one of the months is significantly different. From the
means, significantly less food is eaten in May and
August.

Mann-Whitney U-Test
This test is used to compare two sets of data, and it
tests the null hypothesis that the two sets have the
same median. The data can be any form so long as
they can be ranked and there must be at least 10
replicates (and preferably many more). If P<5% then
the null hypothesis is rejected and there is a significant
difference between the two medians.
The Merlin function is =UTEST(range1, range2).
This example compares the abundance of brown algae
(measured on a 1-5 score) on two different shores. The
formula =UTEST(B3:B12,C3:C12) is typed into cell
B13 and formatted as a percentage. The P value is
<5% (just!) so we reject the null hypothesis and
conclude that there are significantly more algae on the
sheltered shore.
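The U statistic is built from ranks rather than from the raw scores, as this Python sketch shows. The abundance scores and function names are made up for illustration. The P-value here uses the normal approximation without a correction for tied scores, which is an assumption of this sketch and may not match what UTEST reports.

```python
import math

def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return the smaller of the two U statistics for sets a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

def approx_p_value(u, n_a, n_b):
    """Two-tailed P-value from the normal approximation (no tie correction)."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

# Made-up abundance scores (1-5) on two shores.
exposed = [1, 2, 1, 3, 2, 1, 2, 3, 1, 2]
sheltered = [3, 4, 2, 5, 3, 4, 3, 2, 5, 4]

u = mann_whitney_u(exposed, sheltered)
p = approx_p_value(u, len(exposed), len(sheltered))
print(f"U = {u}, approximate P = {p:.3f}")
```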

Wilcoxon Matched-Pairs Test


This test is used to compare two sets of paired
(matched) data, and tests the null hypothesis that the
median difference between the pairs is zero. The data
must be in pairs, can be any form that can be ranked,
and there must be at least 10 replicates (and preferably
many more). If P<5% then the null hypothesis is
rejected and there is a significant difference between
the two sets.
The Merlin function is =WILCOXON(range1,
range2). This example compares memory scores
(number of words recalled from a text) before and
after drinking alcohol. The formula
=WILCOXON(B3:B12,C3:C12) is typed into cell C13
and formatted as a percentage. The P value is <5% so
we reject the null hypothesis and conclude that
memory is significantly worse after drinking alcohol.


Kruskal-Wallis Test
This test is used to compare two or more sets of data,
and it tests the null hypothesis that the sets have the
same median. The data can be any form that can be
ranked and there must be at least 10 replicates (and
preferably many more). If P<5% then the null
hypothesis is rejected and at least one of the sets is
significantly different.
The Merlin function is =KWTEST(range), where each
column in the range is a different set. This example
compares decay rates of three species of leaf,
measured by % of leaf area remaining after 8 weeks
burial. The formula =KWTEST(B3:D11) is typed into
cell B12 and formatted as a percentage. The P value is
<5% so we reject the null hypothesis and conclude that
there is a significant difference between at least two of
the leaf species. From the box plot, the beech leaves must
decay significantly more slowly than the other two
species.

Friedman Test
This test is used to compare two or more sets of
matched data, and it tests the null hypothesis that the
median difference between the sets is zero. The data
can be any form that can be ranked but must be
matched, and there must be at least 10 replicates (and
preferably many more). If P<5% then the null
hypothesis is rejected and at least one of the sets is
significantly different.
The Merlin function is =FRIEDMAN(range), where
each column in the range is a different set. This
example compares the symptoms of patients (on a
score system) before and after treatment with a drug.
The formula =FRIEDMAN(B3:D12) is typed into cell
B13 and formatted as a percentage. The P value is
<5% so we reject the null hypothesis and conclude that
at least one of the days is significantly different. From the
medians, there is a significant drop in symptoms after
treatment, so the drug works.

Chi-squared (χ²) Test
This test is used for frequencies of categoric data, and it tests the null hypothesis that there is no difference between the
observed and expected frequencies. If P<5% then the null hypothesis is rejected and there is a significant difference
between the frequencies. The Excel function is =CHITEST(obsrange, exprange). There are different ways of calculating
the expected frequencies:
Sometimes the expected frequencies can be calculated
from a quantitative theory such as Mendel's laws of
genetics. In this example the frequencies of flower
colours from a genetic cross are compared to an
expected 3:1 ratio. The expected frequencies can be
calculated from the total number of observations (929)
using simple Excel formulae. The formula
=CHITEST(B2:B3,C2:C3) is typed into cell C5 and
formatted as a percentage. The P value is >5% so we
accept the null hypothesis and conclude that the
observed data are consistent with Mendel's law.
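The arithmetic behind this kind of test is short enough to sketch in Python. The observed counts below (705 and 224, which add up to the 929 mentioned above) are Mendel's published figures for a flower-colour cross and are assumed here, since the handout's own spreadsheet values are not reproduced in the text. The P-value formula in the sketch is only valid for two categories (1 degree of freedom).

```python
import math

def chi_squared(observed, expected):
    """Chi-squared statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_two_categories(statistic):
    """P-value for 1 degree of freedom, i.e. two categories only."""
    return math.erfc(math.sqrt(statistic / 2))

# Assumed counts: 705 of one colour and 224 of the other (total 929).
observed = [705, 224]
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]  # the expected 3:1 ratio

chi2 = chi_squared(observed, expected)
p = p_value_two_categories(chi2)
print(f"chi-squared = {chi2:.3f}, P = {p:.1%}")
```

Note that the expected frequencies are calculated from the total, not typed in as whole numbers, so they need not be integers.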

Other times the expected frequencies can be calculated
by assuming that the frequencies in all the categories
should be the same. In this example the frequencies of
boys and girls born in a hospital over a period of time
are compared to an expected 1:1 ratio. The expected
frequencies can be calculated from the total number of
observations (445) using simple Excel formulae. The
formula =CHITEST(B2:B3,C2:C3) is typed into cell
C5 and formatted as a percentage. The P value is >5%
so we accept the null hypothesis and conclude that
there is no difference between male and female births.

G-Test
This test is also used for frequencies of categoric data, and it can be used whenever the chisquared test can be used. Many
statisticians prefer the G-test to the chisquared test. It has the same null hypothesis as the chisquared test. The Merlin
function is =GTEST(obsrange, exprange).
For the genetic cross example above GTEST gives the P-value 53.04%.
For the sex of baby example above GTEST gives the P-value 6.45%.
So in both cases the conclusion is the same.
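The G statistic uses logarithms of the observed-to-expected ratios instead of squared differences. Here is a minimal Python sketch, assuming Mendel's published counts of 705 and 224 (total 929) for the genetic cross; with those assumed counts the result lands close to the 53.04% quoted above. As before, the P-value formula only applies to two categories (1 degree of freedom).

```python
import math

def g_statistic(observed, expected):
    """G statistic: 2 * sum of O * ln(O / E) over categories with O > 0."""
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

def p_value_two_categories(statistic):
    """P-value for 1 degree of freedom, i.e. two categories only."""
    return math.erfc(math.sqrt(statistic / 2))

# Assumed counts for the genetic cross (they add up to 929).
observed = [705, 224]
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]  # the expected 3:1 ratio

g = g_statistic(observed, expected)
p = p_value_two_categories(g)
print(f"G = {g:.3f}, P = {p:.2%}")
```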

Pearson Linear Correlation


This test is used to test for a correlation between two
sets of normally-distributed data. The correlation
coefficient is called r and is calculated using the
function
=PEARSON(range1, range2).
The
corresponding P-value, which tests the null hypothesis
that r=0, is calculated using the function
=PEARSONP(range1, range2).
In this example the heights of 10 fathers are compared
with their sons. A scatter graph is plotted and the
formula =PEARSONP(B2:B12,C2:C12) is typed into
cell C13 and formatted as a percentage and the
formula =PEARSON(B2:B12,C2:C12) is typed into
cell C14. The P value is <5% so the null hypothesis is
rejected and we conclude that there is a significant
positive correlation with a strength of 0.72, so tall
fathers do have tall sons.

Spearman Rank Correlation


This test is used to test for a correlation between two
sets of non-normal data. The correlation coefficient is
called rs and is calculated using the function
=SPEARMAN(range1, range2). The corresponding
P-value, which tests the null hypothesis that rs=0, is
calculated using the function =SPEARMANP(range1,
range2).
In this example the social status of hens (measured by
their pecking order) is compared with their mass. A
scatter graph is plotted and the formula
=SPEARMANP(B2:B11,C2:C11) is typed into cell
C12 and formatted as a percentage and the formula
=SPEARMAN(B2:B11,C2:C11) is typed into cell
C13. The P value is <5% so the null hypothesis is
rejected and we conclude that there is a significant
negative correlation with a strength of -0.77, so big
hens are higher up the pecking order.
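Spearman's coefficient is simply the Pearson coefficient calculated on the ranks of the data, as this Python sketch shows. The hen data and function names are made up for illustration, and the sketch gives only rs, not its P-value.

```python
import math

def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rs(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: position in the pecking order (1 = top hen) and mass in kg.
pecking_order = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mass = [2.9, 2.7, 2.8, 2.4, 2.5, 2.2, 2.3, 2.0, 2.1, 1.9]

rs = spearman_rs(pecking_order, mass)
print(f"rs = {rs:.2f}")
```

With 1 as the top of the pecking order, a negative rs means the heavier hens tend to hold the higher positions.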


Chisquared Test of Association


This test is used to test for an association between two
factors that are measured by categoric data. The two
sets of categories are the rows and columns of a
contingency table, which contains the frequencies. The
association coefficient is called Cramer's V and is
calculated using the Merlin function
=CRAMER(range). The corresponding P-value
comes from the χ² test of association, which tests the null
hypothesis that there is no association (i.e. Cramer's
V=0). It is calculated using the Merlin function
=CHIASSOC(range).
This example investigates whether there are more
nests in birch or sycamore trees, in other words
whether there is an association between tree species
and birds' nests. The formula =CHIASSOC(B2:C3) is
typed into cell B4 and formatted as a percentage and
the formula =CRAMER(B2:C3) is typed into cell B5.
The P value is <5% so the null hypothesis is rejected
and we conclude that there is a significant association,
though it is only weak, with a strength of 0.22.
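For a contingency table the expected frequencies come from the row and column totals, and Cramer's V rescales the resulting chi-squared value to lie between 0 and 1. The Python sketch below shows the standard formulas; the nest counts and function names are made up for illustration, no continuity correction is applied (an assumption of this sketch), and the P-value formula only works for a 2 x 2 table.

```python
import math

def chi_squared_association(table):
    """Return (chi-squared, total count) for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramer's V: 0 means no association, 1 means perfect association."""
    chi2, n = chi_squared_association(table)
    smaller_dimension = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * smaller_dimension))

def p_value_2x2(chi2):
    """P-value for a 2 x 2 table only (1 degree of freedom)."""
    return math.erfc(math.sqrt(chi2 / 2))

# Made-up counts: rows are birch and sycamore, columns are nest / no nest.
nests = [[36, 64],
         [18, 82]]

chi2, n = chi_squared_association(nests)
print(f"chi-squared = {chi2:.2f}, P = {p_value_2x2(chi2):.2%}, V = {cramers_v(nests):.2f}")
```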

Linear Regression
This is used to describe a linear relationship between
two sets of data. It is used when you already know that
one variable causes the changes in the other variable
(i.e. there is a causal relationship). Regression fits a
straight line to the data, and gives the values of the
intercept and slope (or gradient) of that line (a and b in
the equation y = a + bx).
The Merlin function =REGRESS(xrange, yrange, flag)
returns values for the slope and intercept as well as
their 95% confidence intervals. The flag value
determines whether the intercept is fixed at zero.
REGRESS is an array function, so a square of 4 cells
is selected and the function is entered with Ctrl-Shift-Enter.
In this example the absorbance of a yeast cell
suspension is plotted against its cell concentration
from a cell counter. The formula
=REGRESS(A2:A12,B2:B12,0) was typed into cells
B15:C16 and entered as an array formula. The intercept
was fixed at zero because 0 cells have 0 absorbance.
The straight trendline was also plotted on the scatter
graph. The regression can then be used to make
quantitative predictions. For example, we could
predict that a sample with an absorbance of 1.37 has a
cell concentration of 9 × 10⁷ cells cm⁻³.
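The least-squares formulas behind a fitted line are short, and the Python sketch below shows both cases: a line forced through the origin and an ordinary line with an intercept. The calibration data and function names are made up for illustration, and the sketch does not calculate the 95% confidence intervals that REGRESS returns.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope b for the line y = bx (intercept fixed at zero)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + bx."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Made-up calibration data: cell concentration (x 10^7 cells per cm^3)
# against absorbance.
concentration = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
absorbance = [0.00, 0.16, 0.29, 0.47, 0.60, 0.77, 0.90, 1.08, 1.21, 1.38, 1.51]

b = slope_through_origin(concentration, absorbance)
# Read the line backwards to predict a concentration from an absorbance.
predicted = 1.37 / b
print(f"slope = {b:.4f}; absorbance 1.37 -> {predicted:.1f} x 10^7 cells per cm^3")
```

Fixing the intercept at zero is only sensible when, as here, there is a physical reason for the line to pass through the origin.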


Excel Tips

Take time to tidy all results tables, as Excel's default formatting isn't very good e.g. line up titles with values.
Format all numbers to an appropriate number of decimal places (Format menu > Cells > Number tab > Number).
Format P-values as percentages (Format menu > Cells > Number tab > Percentage). This automatically multiplies the
P-value by 100 and adds the % sign to make small P values easier to read and understand.
Use Merlin for charts, even if Excel provides the same chart (e.g. a scatter graph). The format of Merlin charts is better
for scientific data. Even so, take time to tidy up all charts, adjusting the size, shape, colour, font size, etc. to make the
chart clear.
Never use an Excel line chart.

Name That Test!


For each of these investigations, choose the best statistics test. In some cases there may be more than one equally good
answer, generally if you can't tell whether the dependent variable is normally distributed or not.
1. The leaf area of many plants (of the same species) growing in open and shaded areas is measured to investigate the
effect of light intensity on leaf area.
2. The number of mayfly larvae found in the middle and edge of a stream was compared at 12 stations along the stream.
3. To test whether the γ-radiation used to sterilise barley seeds affected the seeds themselves, batches of 100 seeds were
exposed to three different doses of radiation (none/low/high) and then planted to see if the seeds germinated or not.
4. Thirteen plants were kept in large sealed glass bottles and the carbon dioxide concentration inside was measured over a
period of 24h to see if the decrease in carbon dioxide per hour during the day was different from the increase in carbon
dioxide per hour during the night.
5. What is the relation between planting density of oat seedlings and yield of seeds per plant?
6. Is there a relation between the number of seeds produced by a plant and mean mass per seed?
7. Doctors were alarmed by the apparently high incidence of a childhood disease in their town. Given the known national
incidence of the disease, how can they tell if the frequency in their town is different?
8. Investigating the abundance of orchids in four different fields.
9. Investigating the effect of three different diets on Body Mass Index.
10. A student using lichens as a measure of air pollution measured lichen diversity at several locations in a town to see if
there was a relation between distance from town centre and diversity index.
11. In order to find the best preparation of a vaccine, three different forms of the vaccine were injected into the same 20
healthy volunteers on three separate occasions, and the concentration of antibodies in the blood was measured.
12. To investigate the effect of handling by humans on lab rats, the activity of 10 handled and 10 unhandled rats was
scored on a 1-5 scale (quiet to very active).
