
INTRODUCTION TO

BIOSTATISTICS

DR. M. Surya Durga Prasad


Asst. Professor,
SOMS, UOH
This session covers:

Origin and development of Biostatistics


Definition of Statistics and Biostatistics
Reasons to know about Biostatistics
Types of data
Graphical representation of data
Frequency distribution of data
STATISTICS
Definition:

Statistics is a field of study concerned with techniques or methods for the
collection of data, classification, summarizing, interpretation, drawing
inferences, testing of hypotheses and making recommendations.

Statistics is nothing but SCIENCE OF FIGURES


Statistics is the science which deals
with the collection, classification and
tabulation of numerical facts as the
basis for explanation, description
and comparison of phenomena.

------ Lovitt
Origin and development of
statistics in Medical Research
In 1929, a large paper on the application of
statistics was published in a physiology
journal by Dunn.
In 1937, 15 articles on statistical methods
by Austin Bradford Hill were published in
book form.
In 1948, an RCT of streptomycin for
pulmonary TB was published, in which
Bradford Hill had a key influence.
From 1952 to 1982, the use of statistics in
medicine showed an 8-fold increase.
C.R. Rao, Douglas Altman, Ronald Fisher, Karl Pearson, Gauss
BIOSTATISTICS
(1) Statistics arising out of biological
sciences, particularly from the fields of
Medicine and public health.
(2) The methods used in dealing with
statistics in the fields of medicine, biology
and public health for planning,
conducting and analyzing data which
arise in investigations of these branches.
Reasons to know about
biostatistics:
Medicine is becoming increasingly
quantitative.
The planning, conduct and interpretation
of much of medical research are
becoming increasingly reliant on the
statistical methodology.
Statistics pervades the medical literature.
Example: Evaluation of Penicillin (treatment
A) vs Penicillin & Chloramphenicol
(treatment B) for treating bacterial
pneumonia in children < 2 yrs.

What is the sample size needed to demonstrate
the significance of one group against the other?
Is treatment A better than treatment B, or
vice versa?
If so, how much better?
What is the normal variation in clinical
measurement (mild, moderate & severe)?
How reliable and valid is the measurement
(clinical & radiological)?
What is the magnitude and effect of laboratory
and technical error?
How does one interpret abnormal values?
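
The first question above (sample size) can be illustrated with a rough sketch. It assumes the outcome is a cure rate compared between two equal-sized groups with a two-sided test and the usual normal approximation. The cure rates of 70% and 85% are hypothetical values chosen only for illustration, not taken from any study, and the function name is likewise an assumption.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of patients needed per group to compare two
    proportions (two-sided test, equal groups, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # value matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical cure rates (assumed for illustration): 70% on A, 85% on B.
n_per_group = sample_size_two_proportions(0.70, 0.85)
print("Patients needed per group:", n_per_group)
```

A smaller expected difference between the treatments requires a larger sample, which is why this question has to be settled at the planning stage.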
CLINICAL MEDICINE

Documentation of medical history of diseases.
Planning and conduct of clinical studies.
Evaluating the merits of different
procedures.
Providing methods for the definition of
normal and abnormal.
PREVENTIVE MEDICINE

To provide the magnitude of any health
problem in the community, e.g. TB, cataract.
To find out the basic factors underlying
ill-health.
To evaluate the health programs which were
introduced in the community
(success/failure).
To introduce and promote health legislation.
WHAT DOES STATISTICS
COVER?
Planning
Design
Execution (Data collection)
Data Processing
Data analysis
Presentation
Interpretation
Publication
HOW CAN A BIOSTATISTICIAN
HELP?
Design of study
Sample size & power calculations
Selection of sample and controls
Designing a questionnaire
Data Management
Choice of descriptive statistics & graphs
Application of univariate and multivariate
statistical analysis techniques
INVESTIGATION

Data Collection
Data Presentation: Tabulation, Diagrams, Graphs
Descriptive Statistics: Measures of Location, Measures of Dispersion,
Measures of Skewness & Kurtosis
Inferential Statistics: Estimation (Point estimate, Interval estimate),
Hypothesis Testing, Univariate analysis, Multivariate analysis
Main sources for the collection of data:-

1. Experiments
2. Surveys
3. Records
Principles followed for the presentation of
statistical data:

Data should be presented in such a way that they
Become concise without losing details
Arouse interest in reading
Become simple and meaningful to form impressions
Need few words to explain
Define the problem and suggest the solution too, and
Become helpful in further analysis
For good presentation of data, full labelling,
simplicity and honesty are essential requirements.
TYPES OF DATA

QUALITATIVE DATA
DISCRETE QUANTITATIVE
CONTINUOUS QUANTITATIVE
Qualitative data:-
In such data there is no notion of magnitude or
size of the characteristic or attribute as the
same cannot be measured
There is only one variable i.e., frequency
They are classified by counting the individuals
having the same characteristic and not by
measurement
These data are DISCRETE in nature, such as the no.
of deaths in different years, the population of
different towns and so on.
QUALITATIVE

Nominal
Example: Sex ( M, F)
Exam result (P, F)
Blood Group (A,B, O or AB)
Color of Eyes (blue, green,
brown, black)
ORDINAL
Example:
Response to treatment
(poor, fair, good)
Severity of disease
(mild, moderate, severe)
Income status (low, middle,
high)
Quantitative data:-

There are two variables:
1. Magnitude
2. Frequency
The quantitative data obtained from the
characteristic variable are also called
continuous data, as each individual has one
measurement from a continuous spectrum or
range, such as body temperature from 35 °C to
42 °C.
QUANTITATIVE (DISCRETE)

Example: The no. of family members
The no. of heart beats
The no. of admissions in a day

QUANTITATIVE (CONTINUOUS)

Example: Height, Weight, Age, BP, Serum
Cholesterol and BMI
Discrete data -- Gaps between possible values
Example: Number of Children

Continuous data -- Theoretically, no gaps between possible values
Example: Hb
CONTINUOUS DATA grouped into DISCRETE DATA

Wt. (in kg): underweight, normal & overweight
Ht. (in cm): short, medium & tall
Table 1 Distribution of blunt injured patients
according to hospital length of stay

Hospital length of stay   Number   Percent
1-3 days                    5891      43.3
4-7 days                    3489      25.6
2 weeks                     2449      18.0
3 weeks                      813       6.0
1 month                      417       3.1
More than 1 month            545       4.0
Total                      13604     100.0

Mean = 7.85 SE = 0.10
Scale of measurement
Qualitative variable:
A categorical variable

Nominal (classificatory) scale
- gender, marital status, race

Ordinal (ranking) scale
- severity scale, good/better/best
Scale of measurement
Quantitative variable:
A numerical variable: discrete; continuous

Interval scale:
Data are placed in meaningful intervals and order. The units of
measurement are arbitrary.

- Temperature: the differences 37 °C - 36 °C and 38 °C - 37 °C are equal
- No implication of ratio (30 °C is not twice as hot as 15 °C)
Ratio scale:
Data are presented in a frequency distribution in
logical order. A meaningful ratio exists, and the
scale features an identifiable absolute zero point.

- Age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60
- A person with a weight of 80 kg is twice as heavy
as one with a weight of 40 kg.
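
The difference between the interval and ratio scales can be checked numerically. The sketch below is only an illustration: it re-expresses the slide's temperatures in kelvin and the weights in pounds (the conversion factor 2.20462 is approximate, and the helper names are assumptions) to show that a ratio survives a change of units only on a ratio scale.

```python
def celsius_to_kelvin(celsius):
    """Shift from the arbitrary Celsius zero to the absolute zero."""
    return celsius + 273.15


def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate factor)."""
    return kg * 2.20462


# Interval scale: the apparent ratio depends on where zero is placed.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)

# Ratio scale: the ratio is the same in any unit.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)

print("Temperature ratio in Celsius:", ratio_celsius, "in kelvin:", round(ratio_kelvin, 3))
print("Weight ratio in kg:", ratio_kg, "in lb:", round(ratio_lb, 3))
```

The temperature ratio changes once the zero point moves, while the weight ratio stays at 2 in either unit.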
Scales of Measure

Nominal - qualitative classification of
equal value: gender, race, color, city
Ordinal - qualitative classification
which can be rank ordered:
socioeconomic status of families
Interval - numerical or quantitative
data which can be rank ordered and whose
sizes can be compared: temperature
Ratio - quantitative interval data along
with a meaningful ratio: time, age.
Thank You
Frequency Distributions

data distribution: pattern of variability
the center of a distribution
the ranges
the shapes
simple frequency distributions
grouped frequency distributions
midpoint
Tabulate the hemoglobin values of 30 adult
male patients listed below

Patient No   Hb (g/dl)   Patient No   Hb (g/dl)   Patient No   Hb (g/dl)
 1           12.0        11           11.2        21           14.9
 2           11.9        12           13.6        22           12.2
 3           11.5        13           10.8        23           12.2
 4           14.2        14           12.3        24           11.4
 5           12.3        15           12.3        25           10.7
 6           13.0        16           15.7        26           12.5
 7           10.5        17           12.6        27           11.8
 8           12.8        18            9.1        28           15.1
 9           13.2        19           12.9        29           13.4
10           11.2        20           14.6        30           13.1
Steps for making a table
Step 1 Find the minimum (9.1) & maximum (15.7)

Step 2 Calculate the difference: 15.7 - 9.1 = 6.6

Step 3 Decide the number and width of the classes
(7 class intervals): 9.0-9.9, 10.0-10.9, ...

Step 4 Prepare a dummy table:
Hb (g/dl), Tally marks, No. of patients
DUMMY TABLE

Hb (g/dl)    Tally marks   No. of patients
 9.0-9.9
10.0-10.9
11.0-11.9
12.0-12.9
13.0-13.9
14.0-14.9
15.0-15.9
Total

TALLY MARKS TABLE

Hb (g/dl)    Tally marks   No. of patients
 9.0-9.9     l                   1
10.0-10.9    lll                 3
11.0-11.9    llll l              6
12.0-12.9    llll llll          10
13.0-13.9    llll                5
14.0-14.9    lll                 3
15.0-15.9    ll                  2
Total        -                  30

(llll denotes a crossed group of five tally marks)
Table Frequency distribution of 30 adult male
patients by Hb

Hb (g/dl)    No. of patients
 9.0-9.9          1
10.0-10.9         3
11.0-11.9         6
12.0-12.9        10
13.0-13.9         5
14.0-14.9         3
15.0-15.9         2
Total            30
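
The tabulation steps can be sketched in a few lines of Python. This is only an illustration of the counting, using the 30 Hb values listed earlier and assuming classes one g/dl wide that start at whole numbers, as in the slide; the variable names are assumptions.

```python
import math
from collections import Counter

# Hb values (g/dl) of the 30 adult male patients, in patient order.
hb_values = [
    12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,  # patients 1-10
    11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,   # patients 11-20
    14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1,  # patients 21-30
]

# Steps 1 and 2: minimum, maximum and their difference.
lowest, highest = min(hb_values), max(hb_values)
value_range = round(highest - lowest, 1)

# Step 3: each value falls in the class that starts at its whole-number part.
counts = Counter(math.floor(value) for value in hb_values)

# Step 4: fill the table, one row per class, keeping empty classes as zero.
frequency_table = {
    f"{lower}.0-{lower}.9": counts.get(lower, 0)
    for lower in range(math.floor(lowest), math.floor(highest) + 1)
}

for hb_class, n_patients in frequency_table.items():
    print(f"{hb_class:>9}  {n_patients}")
print("Total", sum(frequency_table.values()))
```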
Table Frequency distribution of adult patients by
Hb and gender:

Hb (g/dl)    Male   Female   Total
<9.0           0       2       2
 9.0-9.9       1       3       4
10.0-10.9      3       5       8
11.0-11.9      6       8      14
12.0-12.9     10       6      16
13.0-13.9      5       4       9
14.0-14.9      3       2       5
15.0-15.9      2       0       2

Total         30      30      60
Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes

Number - Table number for identification in a report

Title, place - Describes the body of the table, variables and
time period (what, how classified, where and when)

Column heading - Variable name, No., Percentages (%), etc.

Foot-note(s) - To describe some column/row headings,
special cells, source, etc.
Table II. Distribution of 120 (Madras) Corporation divisions
according to annual death rate based on registered deaths in
1975 and 1976

Death rate (/1000 per annum)   No. of divisions
 7.0-7.9                          4 (3.3)
 8.0-8.9                         13 (10.8)
 9.0-9.9                         20 (16.7)
10.0-10.9                        27 (22.5)
11.0-11.9                        18 (15.0)
12.0-12.9                        11 (9.2)
13.0-13.9                        11 (9.2)
14.0-14.9                         6 (5.0)
15.0-15.9                         2 (1.7)
16.0-16.9                         4 (3.3)
17.0-18.9                         3 (2.5)
19.0 +                            1 (0.8)
Total                           120 (100.0)

Figures in parentheses indicate percentages
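
The percentages in parentheses are each class count divided by the total number of divisions. A minimal sketch that recomputes them from the counts of Table II (the dictionary layout and variable names are assumptions made for illustration):

```python
# Number of divisions in each death-rate class (per 1000 per annum), from Table II.
divisions = {
    "7.0-7.9": 4, "8.0-8.9": 13, "9.0-9.9": 20, "10.0-10.9": 27,
    "11.0-11.9": 18, "12.0-12.9": 11, "13.0-13.9": 11, "14.0-14.9": 6,
    "15.0-15.9": 2, "16.0-16.9": 4, "17.0-18.9": 3, "19.0 +": 1,
}

total = sum(divisions.values())

# Percentage of all divisions falling in each class, rounded to one decimal.
percent = {
    rate_class: round(100 * count / total, 1)
    for rate_class, count in divisions.items()
}

for rate_class, count in divisions.items():
    print(f"{rate_class:>10}  {count:>3} ({percent[rate_class]})")
print(f"{'Total':>10}  {total:>3}")
```

Recomputing the percentages this way is a quick check that a published table is internally consistent.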


DIAGRAMS/GRAPHS

Discrete data
--- Bar charts (one or two groups)

Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
Example data

68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
Histogram

[Figure: histogram. Y axis: Frequency; X axis: Age, with class
midpoints 11.5, 21.5, 31.5, 41.5, 51.5, 61.5, 71.5]

Figure 1 Histogram of ages of 60 subjects


Polygon

[Figure: frequency polygon. Y axis: Frequency; X axis: Age, with class
midpoints 11.5, 21.5, 31.5, 41.5, 51.5, 61.5, 71.5]
Stem and leaf plot
Stem-and-leaf of Age N = 60
Leaf Unit = 1.0

6 1 122269
19 2 1223344555777788888
(11) 3 00111226688
13 4 2223334567999
5 5 01127
4 6 3458
2 7 49
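
A stem-and-leaf display like this one can be rebuilt from the 60 ages in the example data. The sketch below is an illustration, not the statistical package's own routine: it uses the tens digit as the stem and the units digit as the leaf (leaf unit = 1.0), and it prints the number of leaves in each row as the first column.

```python
from collections import defaultdict

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

# Group the units digits (leaves) under their tens digit (stem), in sorted order.
leaves_by_stem = defaultdict(list)
for age in sorted(ages):
    leaves_by_stem[age // 10].append(age % 10)

# One string of leaves per stem.
stem_leaf = {
    stem: "".join(str(leaf) for leaf in leaves)
    for stem, leaves in sorted(leaves_by_stem.items())
}

for stem, leaves in stem_leaf.items():
    print(f"{len(leaves):>3} {stem} {leaves}")
```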
Box plot

[Figure: box-and-whisker plot of Age; vertical axis from 10 to 80]
Descriptive statistics report:
Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean

- the skew of the distribution:
positive skew: mean > median & high-score whisker is longer
negative skew: mean < median & low-score whisker is longer
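
The quantities reported by a box plot can be computed directly from the 60 ages of the example data. This is a sketch using Python's standard statistics module. Quartile conventions differ between textbooks and software, so the quartiles printed here follow that module's default method and may differ slightly from those behind the plotted box.

```python
import statistics

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

minimum, maximum = min(ages), max(ages)
median = statistics.median(ages)
mean = statistics.mean(ages)

# quantiles(n=4) returns the three cut points; the first and last are the quartiles.
lower_quartile, _, upper_quartile = statistics.quantiles(ages, n=4)

# Rule from the slide: compare the mean with the median to judge the skew.
if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print("min", minimum, "max", maximum)
print("lower quartile", lower_quartile, "median", median, "upper quartile", upper_quartile)
print("mean", mean, "skew:", skew)
```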
Pie Chart
Circular diagram: total is 100%
Divided into segments, each representing a category
Decide adjacent category
The amount for each category is proportional to the slice of the pie

[Figure: pie chart with segments of 70%, 20% and 10%; categories:
Mild, Moderate, Severe]
The prevalence of different degrees of hypertension in the population
Bar Graphs
The height of each bar indicates frequency
Frequency on the Y axis and categories of the variable on the X axis
The bars should be of equal width and should not touch the other bars

[Figure: bar graph. Y axis: Number (0 to 25); X axis: Risk factor
(Smo, Alc, Chol, DM, HTN, No Exer, F-H); bar labels as extracted:
20, 20, 16, 12, 12, 9, 8]
The distribution of risk factors among cases with
cardiovascular diseases
HIV cases enrollment in
USA by gender
Bar chart

[Figure: bar chart. Y axis: Enrollment (hundred), 0 to 12; X axis: Year,
1986 to 1992; series: Men, Women]
HIV cases enrollment
in USA by gender
Stacked bar chart

[Figure: stacked bar chart. Y axis: Enrollment (Thousands), 0 to 18;
X axis: Year, 1986 to 1992; series: Women, Men]
Graphic Presentation of
Data
the frequency polygon
(quantitative data)

the histogram
(quantitative data)

the bar graph
(qualitative data)
General rules for designing
graphs
A graph should have a self-explanatory
legend
A graph should help the reader to understand
the data
Axes should be labeled, with units of
measurement indicated
Scales are important. Start with zero (otherwise
show a // break)
Avoid graphs with a three-dimensional
impression; they may be misleading (the reader
visualizes them less easily)
Any Questions
