
BIOMETRY

(SUMMARY)

The usual course of events for conducting scientific work

The Scientific Method

1) Observation
2) Develop a Working Hypothesis
3) Conduct an experiment or a series of controlled systematic observations
4) Appropriate statistical tests
5) Confirm or reject hypothesis
6) Reformulate or extend hypothesis

The Scientific Method: an example

1) Observation: in a group of crickets, small ones seem to avoid large ones
2) Develop a Working Hypothesis: there will be movement away from large crickets by small ones
3) Conduct an experiment or a series of controlled systematic observations: record the number of times that small crickets move away from small and large crickets
4) Appropriate statistical tests: chi-square test
5) Confirm or reject hypothesis: there is a significant difference in the number of times small crickets move away from large vs. small ones
6) Reformulate or extend hypothesis: avoidance may depend on previous experience
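
A minimal sketch of how the chi-square test in the cricket example could be computed by hand. The counts are hypothetical, invented purely for illustration, and the null hypothesis assumed here (small crickets move away from large and small crickets equally often) is one reasonable reading of the example, not something stated in the slides. The function name `chi_square_equal_expected` is also an assumption.

```python
# Hypothetical counts, invented for illustration only (not data from the slides).
observed = {"away_from_large": 30, "away_from_small": 10}


def chi_square_equal_expected(counts):
    """Chi-square goodness-of-fit statistic against equal expected frequencies."""
    total = sum(counts)
    expected = total / len(counts)  # assumed null: every category equally likely
    return sum((obs - expected) ** 2 / expected for obs in counts)


# Tabulated chi-square critical value for df = 1 at alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

chi2 = chi_square_equal_expected(list(observed.values()))
significant = chi2 > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {chi2:.2f}, significant at 0.05: {significant}")
```

With two categories there is one degree of freedom, so the statistic is compared against the df = 1 critical value.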

Imagine that you are collecting samples (i.e. a number of individuals) from a population of little ball creatures, Critterus sphericales.

Little ball creatures come in 3 sizes: small, medium and large.

You end up with a total of five samples: sample 1, sample 2, sample 3, sample 4 and sample 5

The real population (all the little ball creatures that exist)

Your samples

Each sample is a representation of the population, BUT no single sample can be expected to accurately represent the whole population

To be statistically valid, each sample must be:


1) Random:

Thrown quadrat??

Guppies netted from an aquarium?

Assign numbers from a random number table

[Figure: individuals labelled with numbers]
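
The idea of numbering individuals and picking them at random can be sketched in code. This is only an illustration: a pseudo-random generator stands in for the printed random number table, and the population of 15 numbered individuals and the helper name `draw_random_sample` are assumptions.

```python
import random


def draw_random_sample(individual_ids, sample_size, seed=None):
    """Pick sample_size distinct individuals, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(individual_ids, sample_size)


# Hypothetical population: 15 individuals labelled 1 to 15.
individuals = list(range(1, 16))
sample = draw_random_sample(individuals, sample_size=5, seed=42)
print(sorted(sample))
```

Every individual has the same chance of being picked, which is what separates this from netting whichever guppies are easiest to catch.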

To be statistically valid, each sample must be:


2) Replicated:

But not pseudoreplication

Not pseudoreplication: 10 samples from 10 different trees (sample size = 10)
Pseudoreplication: 10 samples from the same tree (sample size = 1)

TYPES OF DATA

RATIO DATA
- constant size interval
- a zero point with some reality
e.g. Heights, rates, time, volumes, weights

INTERVAL DATA
- constant size interval
- no true zero point (zero point depends on the scale used)
e.g. Temperature

ORDINAL SCALE
- ranked data
e.g. Grades, preference surveys

NOMINAL SCALE
e.g. Team numbers, Drosophila eye colour

The kind of data you are dealing with is one determining factor in the kind of statistical test you will use.

Statistics and Parameters

Measures of: Central tendency - mean, median, mode

Dispersion - range, mean deviation, variance, standard deviation, coefficient of variation


Central tendency - Mean


Central Tendency
1) Arithmetic mean

At population level: measuring the diameters of all the little ball creatures that exist

$\mu = \frac{\sum X_i}{N}$

$\mu$ - population mean
$X_i$ - every measurement in the population
$N$ - population size

Your samples

Sample mean

$\bar{X} = \frac{\sum X_i}{n}$

$\sum X_i$ - sum of all measurements in the sample
$n$ - sample size

If you have sampled in an unbiased fashion, each sample mean $\bar{X}$ roughly equals the population mean $\mu$
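
To make the relationship between the population mean and the sample means concrete, here is a small sketch. The population of diameters is hypothetical (randomly generated sizes 1, 2 and 3 in arbitrary units), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
from statistics import fmean

# Hypothetical population of diameters (arbitrary units), invented for illustration.
rng = random.Random(1)
population = [rng.choice([1.0, 2.0, 3.0]) for _ in range(1000)]

mu = fmean(population)  # population mean: sum of all X_i divided by N

# Five unbiased samples of n = 50, each drawn at random without replacement.
sample_means = [fmean(rng.sample(population, 50)) for _ in range(5)]

print(f"population mean = {mu:.3f}")
print("sample means    =", [round(m, 3) for m in sample_means])
```

The five sample means scatter around the population mean without matching it exactly, which is the point of the slide.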

Central tendency - Median
Median - middle value of a population or sample, with the values ranked in order

e.g. Lengths of Mayfly (Ephemeroptera) nymphs

Odd number of values: median = middle value, e.g. the 5th value (middle of 9)
Even number of values: median = sum of the two middle values, divided by 2

Or - to put it more formally

Odd number of values (i.e. n is odd)

$\text{Median} = X_{(n+1)/2}$

Even number of values (i.e. n is even)

$\text{Median} = \frac{X_{n/2} + X_{(n/2)+1}}{2}$
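
The two cases of the median rule can be sketched as a short function. The nymph lengths below are hypothetical values made up for illustration. The formulas on the slide count positions from 1, while Python lists count from 0, so the indices are shifted by one.

```python
def median(values):
    """Median of a list: middle ranked value, or mean of the two middle values."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ranked = sorted(values)  # values must be ranked before picking the middle
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counting from 1 is index n // 2 counting from 0.
        return ranked[middle]
    # Even n: average positions n / 2 and n / 2 + 1 (indices middle - 1 and middle).
    return (ranked[middle - 1] + ranked[middle]) / 2


# Hypothetical mayfly nymph lengths in mm, invented for illustration.
odd_lengths = [5.1, 4.8, 6.0, 5.5, 4.9, 5.3, 6.2, 5.0, 5.7]
even_lengths = odd_lengths[:-1]
print(median(odd_lengths), median(even_lengths))
```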

Central tendency - Mode
Mode - the most frequently occurring measurement

[Figure: frequency distribution with the mode marked. Axes: Frequency (= number of times each measurement appears in the population) and Values (= measurements taken)]
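
Finding the mode amounts to building the frequency table described above and reading off the tallest bar. A minimal sketch, with hypothetical measurements and an assumed helper name `modes` (it returns a list because more than one value can tie for the highest frequency):

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(values)  # frequency of each measurement
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical measurements, invented for illustration.
measurements = [3, 4, 4, 5, 5, 5, 6, 6, 7]
print(modes(measurements))
```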

Measures of Dispersion

Why worry about this? Because not all populations are created equal

Distributions of values in the populations are clearly different BUT means and medians are the same

[Figure: populations with different distributions sharing the same mean & median]

Measures of Dispersion
1. Range - difference between the highest and lowest values

Remember little ball creatures and the five samples

Range = highest value - lowest value

Range - crude measure of dispersion

Note - three samples do not include the highest value
and - two samples do not include the lowest

Measures of Dispersion
2. Mean Deviation

$\bar{X}$ is a measure of central tendency
Take the difference between each measure and the mean:

$X_i - \bar{X}$

BUT

$\sum (X_i - \bar{X}) = 0$

So this is not useful as it stands

Measures of Dispersion
2. Mean Deviation (contd)

But if you take the absolute value, you get a measure of dispersion:

$\sum |X_i - \bar{X}|$

and

$\frac{\sum |X_i - \bar{X}|}{n}$ = mean deviation
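
A short sketch of why the raw deviations are not useful while the absolute deviations are. The four data values are hypothetical, and the helper name `deviation_summary` is an assumption made for this example.

```python
def deviation_summary(values):
    """Return (sum of signed deviations, mean absolute deviation) about the mean."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    signed_sum = sum(deviations)  # positive and negative deviations cancel out
    mean_deviation = sum(abs(d) for d in deviations) / n
    return signed_sum, mean_deviation


# Hypothetical sample, invented for illustration: mean is 5.
data = [2, 4, 6, 8]
signed_sum, mean_dev = deviation_summary(data)
print(signed_sum, mean_dev)
```

For these values the deviations are -3, -1, 1 and 3, so they sum to zero while the mean deviation is 2.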

Measures of Dispersion
3. Variance - eliminate the sign from the deviation from the mean

Square the difference:

$(X_i - \bar{X})^2$

And if you add up the squared differences, you get the sum of squares:

$\sum (X_i - \bar{X})^2$

(hint: you'll be seeing this a lot!)

Measures of Dispersion
3. Variance (contd)

Sum of squares can be considered at both the population and sample level

Population: $SS = \sum (X_i - \mu)^2$
Sample: $ss = \sum (X_i - \bar{X})^2$

Measures of Dispersion
3. Variance (contd)

If you divide the sum of squares by the population size (or, for a sample, by n - 1), you get the mean squared deviation or VARIANCE

Population variance: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

Measures of Dispersion
3. Variance (contd)

Note something about the sample variance

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

The denominator, n - 1, is the degrees of freedom (df or $\nu$)

Measures of Dispersion
4. Standard Deviation - just the square root of the variance

Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
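
The chain from sum of squares to variance to standard deviation can be followed step by step in code. This is a sketch on a hypothetical data set; the same eight values are treated first as a whole population and then as a sample, only to show the effect of dividing by N versus n - 1.

```python
from math import sqrt
from statistics import pvariance, variance


def sum_of_squares(values, centre):
    """Sum of squared deviations of the values from the given centre."""
    return sum((x - centre) ** 2 for x in values)


# Hypothetical measurements, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

ss = sum_of_squares(data, mean)

population_variance = ss / n    # divide by N when data are the whole population
sample_variance = ss / (n - 1)  # divide by n - 1 (degrees of freedom) for a sample

population_sd = sqrt(population_variance)
sample_sd = sqrt(sample_variance)

print(f"SS = {ss}, population variance = {population_variance}, sample variance = {sample_variance:.3f}")
print(f"population SD = {population_sd}, sample SD = {sample_sd:.3f}")
```

The standard library functions `statistics.pvariance` and `statistics.variance` implement the population and sample versions respectively and can be used to check the hand calculation.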

Standard Deviation - very useful

In a normally distributed population, about 68% of the values lie within one standard deviation of the mean

NORMAL DISTRIBUTION

From previous slide show

Type of data: Discrete or Continuous

Discrete data:
- 2 categories & Bernoulli process: use a Binomial model to calculate expected frequencies
- > 2 categories: use a Poisson distribution to calculate expected frequencies

Other distributions

Now we're dealing with: continuous data

Normal Distribution - bell curve

Central Limit Theorem

Any continuous variable influenced by numerous independent random factors will tend to show an approximately normal distribution.
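
A small simulation can illustrate this idea. It is a sketch under stated assumptions: each individual's value is modelled as the sum of 12 independent random factors, each uniform between 0 and 1. The number of factors, their distribution and the population size are arbitrary choices, not taken from the slides.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)
NUM_FACTORS = 12          # assumed number of independent random influences
NUM_INDIVIDUALS = 10_000  # assumed number of simulated individuals

# Each value is the sum of many small, unpredictable contributions.
values = [
    sum(rng.random() for _ in range(NUM_FACTORS))
    for _ in range(NUM_INDIVIDUALS)
]

mean = fmean(values)
sd = pstdev(values)

# For a normal distribution roughly 68% of values fall within one SD of the mean.
within_one_sd = sum(1 for v in values if abs(v - mean) <= sd) / len(values)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, fraction within 1 SD = {within_one_sd:.3f}")
```

Although each factor on its own is uniformly distributed, the summed values pile up around the mean in a bell shape.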

Normal curve is used for:
2) Continuous random data

e.g. Weight, blood pressure, length, area, rates

Data points that would be affected by a large number of random (= unpredictable) events

e.g. Blood pressure is affected by physical activity, age, genes, smoking, diet and stress

Normal curves can come in different shapes

So, for comparison between them, we need to standardize their presentation in some way

Standardize by calculating a Z-Score

$Z = \frac{\text{value of a random variable} - \text{mean}}{\text{standard deviation}}$

or

$Z = \frac{X - \mu}{\sigma}$

Example of a z-score calculation

The mean grade on the Biometrics midterm is 78.4 and the standard deviation is 6.8. You got a 59.7 on the exam. What is your z-score?

$Z = \frac{X - \mu}{\sigma} = \frac{59.7 - 78.4}{6.8} = -2.75$
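
The midterm example can be reproduced with a few lines of code. The function name `z_score` is an assumption made for this sketch; the numbers are the ones given in the example.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / standard_deviation


# Numbers from the midterm example: grade 59.7, mean 78.4, SD 6.8.
z = z_score(59.7, mean=78.4, standard_deviation=6.8)
print(f"z = {z:.2f}")
```

A negative z means the grade lies below the mean, here by 2.75 standard deviations.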

If you look at the formula for z-scores:

$z = \frac{\text{value of a random variable} - \text{mean}}{\text{standard deviation}}$

z is also the number of standard deviations a value is from the mean

Each standard deviation away from the mean defines a certain area of the normal curve
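
As a rough illustration of the areas involved, the fraction of a normal curve lying within k standard deviations of the mean can be computed from the error function, using the standard identity that this area equals erf(k / sqrt(2)). The helper name `area_within` is an assumption made for this sketch.

```python
from math import erf, sqrt


def area_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

This gives roughly 68%, 95% and 99.7% of the area for one, two and three standard deviations.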
