
PRESENTED BY
DR. RAJ KUMAR SINGH (JR-1), DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY

SUPERVISOR
DR. SANJEEV KUMAR VERMA, CHAIRMAN, DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH

CO-SUPERVISOR
DR. MD. SAIF KHAN, LECTURER, DEPTT. OF PERIODONTICS, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH

Overview of seminar
Introduction to medical computing
Role of medical computing
Introduction to statistics
How to use statistics
Role of statistics
Conclusion

What does "computers in medicine" mean?

The Computer Meets Medicine and Biology: Emergence of a Discipline

After taking this course, you should know the answers to these questions:

Why is information management a central issue in biomedical research and clinical practice?

What are integrated information-management environments, and how might we expect them to affect the practice of medicine and biomedical research in coming years?

What do we mean by the terms medical computer science, medical computing, medical informatics, clinical informatics, nursing informatics, bioinformatics, and health informatics?

Why should health professionals and students of the health professions learn about medical-informatics concepts and informatics applications?

How has the development of mini-computers, microprocessors, and the Internet changed the nature of biomedical computing?

How is medical informatics related to clinical practice, biomedical engineering, molecular biology, decision science, information science, and computer science?

Role of computing
Medical decision making: probabilistic medical reasoning.
Patient care systems.
Patient monitoring systems.
Computer-aided surgery.
Computer-based patient record systems.
Clinical decision support systems.
The Internet.
Standards in medical informatics.
Imaging modalities.
Image management systems.
Telemedicine.
Bioinformatics.

Conventional data collection for a clinical trial


Medical records

Data sheets

Computer database

Analyses

Results

WHAT IS STATISTICS?

Introduction
Statistics is the science of collecting data, processing them into useful information, and using that information to make decisions with the least error.

Medical statistics
A collection of statistical procedures particularly well suited to the analysis of healthcare-related data.

Medicine is an empirical science that depends on observations.
Medical data are necessary for any medical decision, be it for diagnosis, treatment planning or prognosis; the prerequisite is that some information is available about the patient.
Data are also needed for medico-legal or ethical reasons, to justify decisions.
Thus, decisions must be evidence based.

Uncertainties in medicine arise mainly due to
1) biological variability
2) environmental variability
3) sampling fluctuations
4) chance variability
5) instrument variability

To deal with the enormous uncertainties that pervade all aspects of medical practice, a separate science has developed, called biostatistics.
It provides methods to measure uncertainties by probabilities, and helps to control their impact on medical practice by laying down principles for choosing decisions that judiciously combine probabilities with judgement.

How to use statistics
Develop an underlying question of interest
Generate a hypothesis
Design a study
Collect data
Analyze data: descriptive statistics, statistical inference

Hypothesis: a tentative assumption of the study, or the expected results of the study.
It should be very specific and limited to the piece of research in hand, because it has to be tested.
The role of the hypothesis is to guide the researcher by delimiting the area of research and keeping the work on the right track.

Develop study design
Research question
Study sample
Sample size
Enrollment/follow-up strategies
Ongoing monitoring

Sampling
A sample is that part of the target population which is actually enquired on or investigated.

Types of sampling:
1) Simple random
2) Systematic random
3) Stratified random
4) Cluster random
5) Multistage random
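The first three sampling types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient IDs, the male/female strata and the 10% sampling fraction are hypothetical, and the function names are not from any statistical package.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Simple random: every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Systematic random: random start, then every k-th unit."""
    k = len(population) // n                  # sampling interval
    start = random.Random(seed).randrange(k)  # random start within first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=0):
    """Stratified random: a simple random sample of the same fraction per stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

# Hypothetical sampling frame: patient IDs 1..100, split into two strata.
patients = list(range(1, 101))
strata = {"male": patients[:60], "female": patients[60:]}

srs = simple_random_sample(patients, 10)
syst = systematic_sample(patients, 10)
strat = stratified_sample(strata, 0.10)

print("simple random:", sorted(srs))
print("systematic:   ", syst)
print("stratified:   ", sorted(strat))
```

Cluster and multistage sampling follow the same idea, except that whole groups (for example hospitals) are drawn first and units are then taken from the selected groups.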

Existing data
Primary data are those that one elicits directly from individual patients, subjects or other units (such as hospitals or laboratories).
Secondary data are those that have been elicited by others.
Secondary data sources include disease-specific databases on the web, medical literature, and records of surveys and registrations done by the government.

Generation of new data
Existing data may be incomplete and insufficient to provide answers to specific questions.
For these, data are specially generated through new surveys and experiments.
Basically there are two types of studies to generate new data: descriptive and analytical.
In either setup, it is necessary that a sample of subjects is studied.

Data collection designs
Objective: descriptive, analytical
Method: surveys, observational, experimental
Time frame: prospective (cohort, cause to effect), retrospective (effect to cause), cross-sectional (one point in time)
Setting: animal trial

Describing data with tables
1) frequency table
2) relative and cumulative frequency
3) grouped frequency
4) open-ended groups
5) cross-tabulation

Frequency table

Mortality (%)    Tally        No. of ICUs
11.2-15.1        |||||||||    9
15.2-20.1        ||||||||     8
20.2-25.1        |||||        5
25.2-30.1        |||          3
30.2-35.1        |            1

Relative and cumulative frequency

Parity    No. of women    Percentage (relative frequency)    Cumulative percentage
0         5               12.5                               12.5
1         6               15                                 27.5
2         14              35                                 62.5
3         10              25                                 87.5
4         3               7.5                                95
7         1               2.5                                97.5
8         1               2.5                                100
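As a check on the arithmetic, the relative and cumulative percentages can be recomputed from the counts of women at each parity. This is a minimal sketch; the variable names are illustrative only.

```python
# Counts of women at each parity, taken from the table above.
parity = [0, 1, 2, 3, 4, 7, 8]
women = [5, 6, 14, 10, 3, 1, 1]

total = sum(women)  # 40 women in all

# Relative frequency: each count as a percentage of the total.
relative = [100 * n / total for n in women]

# Cumulative percentage: running sum of the relative frequencies.
cumulative = []
running = 0.0
for pct in relative:
    running += pct
    cumulative.append(running)

for p, n, r, c in zip(parity, women, relative, cumulative):
    print(f"{p:>6} {n:>6} {r:>8.1f} {c:>8.1f}")
```

Each of the last two rows holds one woman out of 40, so each contributes 2.5%, and the cumulative column ends at 100.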

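Cross tabulation
Two variables within a single group of individuals

                 2 or fewer children
Caries       Yes               No                Totals
Occlusal     21 (84%) (66)     11 (73%) (34)     32 (100)
Proximal     4 (16%) (50)      4 (27%) (50)      8 (100)
Totals       25 (100%)         15 (100%)         40

The first percentage in each cell is the column percentage; the second bracketed figure is the row percentage.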
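The two sets of percentages in the cross tabulation can be reproduced from the four cell counts. The sketch below assumes the layout in which rows are caries site (occlusal, proximal) and columns are Yes/No; the dictionary keys are illustrative names.

```python
# Cell counts from the cross tabulation (rows: caries site, columns: yes/no).
counts = {
    "occlusal": {"yes": 21, "no": 11},
    "proximal": {"yes": 4, "no": 4},
}

row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
col_totals = {
    col: sum(cells[col] for cells in counts.values()) for col in ("yes", "no")
}
grand_total = sum(row_totals.values())

# Column percentage: share of each column total found in that row.
col_pct = {
    row: {col: round(100 * n / col_totals[col]) for col, n in cells.items()}
    for row, cells in counts.items()
}
# Row percentage: share of each row total found in that column.
row_pct = {
    row: {col: round(100 * n / row_totals[row]) for col, n in cells.items()}
    for row, cells in counts.items()
}

print("row totals:", row_totals, "column totals:", col_totals, "n =", grand_total)
print("column %:", col_pct)
print("row %:   ", row_pct)
```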

Describing data with charts
1) Charting nominal data
(1) the pie chart
(2) the simple bar chart
(3) the clustered bar chart
(4) the stacked bar chart
2) Charting ordinal data
(1) the pie chart
(2) the bar chart
3) Charting discrete metric data
4) Charting continuous metric data
(1) the histogram

Pie chart
4-5 categories
One variable
Start at 0, in the same order as the table

[Figure: Pie chart of hair color of children receiving d-phenothrin. Segments: blonde 18 (18%), brown 55 (57%), red 4 (4%), dark 21 (21%)]

Simple bar diagram

Clustered bar diagram
[Figure: Clustered percentage bar chart of hair color of children receiving malathion and d-phenothrin. Malathion: blonde 16, brown 52, red 4, dark 28. d-phenothrin: blonde 18, brown 56, red 4, dark 22]

Histogram
[Figure: Exercise 3-5. Histogram of the percentage age distribution of pregnant women (thrombosis cases). Age-group labels: 19, 20-24, 25-29, 30-34, 35]

Step chart
[Figure: Exercise 3.8. Step chart of the cumulative percentage of infants. Plotted values: 6.67, 16.67, 36.67, 60, 100]

Charting cumulative ordinal or discrete metric data

Cumulative frequency curve
[Figure: Exercise 3.9, ogive. Percentage cumulative frequency curves of age for male suicide attempters and later succeeders. Series: attempting suicide, later successful. Age groups: 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, > 85]

Data collection, types and quality
Evidence-based decisions are only as good as the evidence itself.
Thus it is important that the data gathered for creating evidence are correct.
Methods such as interview, examination and investigations are available.
The investigator must decide which method is best for a particular piece of information.

Data can be either quantitative or qualitative.
Qualitative data can be on a nominal scale or an ordinal scale.
Quantitative data are on a metric scale.

Nominal scale data
Can be allocated to one of a number of categories.
Blood type, sex (male/female)
No meaningful order

Ordinal scale data
Can be allocated to one of a number of categories, and the categories can be put in a meaningful order.
Very satisfied, satisfied, neutral, unsatisfied, very unsatisfied.

Discrete metric data
Countable variables.
Integer form
Numbers of things
Age (in completed years), numbers of men

Continuous metric data
Measurable variables.
Often rounded to the nearest integer when recorded
kg, m, mmHg, hours, years

Quality of data is assessed in terms of the validity and reliability of the measurements, or of the tools used to obtain the data.
Validity: the ability of a tool to correctly measure the characteristic that it purports to measure.
For tests, this is assessed in terms of sensitivity and specificity, and positive and negative predictive values.
Reliability: the ability to give the same result when used repeatedly under identical conditions.
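The four validity measures named above can all be read off a 2x2 table of test result against true disease status. A minimal sketch follows; the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def validity_measures(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: diseased, test positive     fn: diseased, test negative
    fp: non-diseased, test positive tn: non-diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased correctly detected
        "specificity": tn / (tn + fp),  # non-diseased correctly cleared
        "ppv": tp / (tp + fp),          # positives that are truly diseased
        "npv": tn / (tn + fn),          # negatives that are truly disease free
    }

# Hypothetical counts: 100 diseased and 200 non-diseased subjects.
measures = validity_measures(tp=90, fn=10, fp=30, tn=170)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.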

Statistical analyses
Descriptive statistics: describe the sample
Inference: make inferences about the population
Inference is primarily performed in two ways:
Hypothesis testing
Estimation (more important!)

Prediction

Descriptive statistics
Descriptive statistics are a way of summarizing the complexity of the data, each with a single number.

A. For one variable ("univariate analysis"): measures of CENTRAL TENDENCY (averages) and of DISPERSION or variance around that average. Examples: means, modes, medians, standard deviation, quartiles.

B. For the strength of relationship between two variables (bivariate analysis) or among a set of variables (multivariate analysis): measures of ASSOCIATION or correlation.

Measure of central tendency
Nominal & Ordinal: frequencies, percents, medians (ordinal only), modes (all levels)
Interval & Ratio: means

Measure of dispersion
Nominal & Ordinal (qualitative): range, quartiles
Interval & Ratio (quantitative): standard deviation
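The measures of central tendency and dispersion listed above are available in Python's standard `statistics` module. The sketch below uses a small hypothetical data set; the dictionary names are illustrative.

```python
from statistics import mean, median, mode, quantiles, stdev

# Hypothetical metric data, for example pain scores of ten patients.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 10]

central = {
    "mean": mean(scores),      # arithmetic average
    "median": median(scores),  # middle value of the ordered data
    "mode": mode(scores),      # most frequent value
}

q1, q2, q3 = quantiles(scores, n=4)  # quartile cut points
dispersion = {
    "range": max(scores) - min(scores),
    "sd": stdev(scores),       # sample standard deviation (n - 1 divisor)
    "iqr": q3 - q1,            # inter-quartile range
}

print(central)
print(dispersion)
```

The single high value of 10 widens the range considerably but leaves the median unchanged, which is why the median and quartiles are preferred for skewed data.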

Measure of association
Nominal & Ordinal: cross-tabulation; non-parametric measures such as Phi, Gamma, Eta, Lambda, Tau-B, etc.
Interval & Ratio: Pearson's r

Measure of significance
Nominal & Ordinal: chi-square
Interval & Ratio: t-test, ANOVA (F-ratio)

Inferential statistics
These are measures of the SIGNIFICANCE of the relationship between two or more variables. Significance refers to the probability that the findings could be attributed to sampling error.
The appropriate statistic depends on the LEVEL OF MEASUREMENT OF THE DEPENDENT VARIABLE (and of the independent variable).

Parameters
Summary measures, such as the mean and standard deviation, can be obtained for a sample as well as for the entire population.
Summary measures, when obtained for the entire target population, are called parameters.
The values of parameters are hardly ever known, because nobody has the time and resources to study the entire population.
When parameter values are unknown, as is almost invariably the case, it becomes necessary to fall back on samples to get some tangible lead regarding the characteristics of the population.
Measures such as the mean and SD, when obtained for sample subjects, are called statistics.

Standard deviation and normal
[Figure not reproduced; the only surviving label is "mean"]

Parametric tests of significance
1) Student t-test: for comparison of means between 2 groups
2) ANOVA F-test: for comparison of means in three or more groups
(Both of the above tests require that the means follow a Gaussian distribution, and hence they are called parametric tests.)
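To make the two parametric tests concrete, the sketch below computes the pooled-variance Student t statistic and the one-way ANOVA F-ratio by hand. The three groups of scores are hypothetical and the function names are illustrative. Turning t or F into a P-value needs the t or F distribution (from tables or a statistics package), which is deliberately left out here.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Student t statistic for two independent groups (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2  # statistic and degrees of freedom

def anova_f(groups):
    """One-way ANOVA F-ratio: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, (k - 1, n - k)  # statistic and its two degrees of freedom

# Hypothetical pain scores under three treatments.
drug_a = [4, 5, 6, 5, 5]
drug_b = [7, 8, 6, 7, 7]
placebo = [9, 8, 9, 10, 9]

t, t_df = pooled_t(drug_a, drug_b)
f2, f2_df = anova_f([drug_a, drug_b])
f3, f3_df = anova_f([drug_a, drug_b, placebo])

print(f"t = {t:.3f} on {t_df} df")
print(f"F (two groups) = {f2:.3f} on {f2_df} df")
print(f"F (three groups) = {f3:.3f} on {f3_df} df")
```

With only two groups the F-ratio equals the square of t, which shows that ANOVA generalizes the t-test to three or more groups.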

Nonparametric tests
When the sample size is very small and the distribution is skewed, parametric tests cannot be used.
In such cases, non-parametric tests (less powerful than parametric tests) are used.
For paired data, the non-parametric tests commonly used are the sign test and the Wilcoxon signed-rank test.
For unpaired two-sample data, the non-parametric test is the Mann-Whitney test.
Another important non-parametric test is the chi-square test (used for nominal data), a test of proportions.
It is used to test the significance of the association between two or more qualitative characteristics.
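As an illustration of the chi-square test of association, the sketch below applies it to the four cell counts of the caries cross tabulation shown earlier (21, 11, 4, 4). The function name is illustrative; 3.841 is the standard 5% critical value of chi-square with 1 degree of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: occlusal, proximal caries; columns: yes, no.
table = [[21, 11], [4, 4]]
stat, df = chi_square(table)

CRITICAL_05_DF1 = 3.841  # 5% critical value, 1 degree of freedom
significant = stat > CRITICAL_05_DF1

print(f"chi-square = {stat:.3f} on {df} df, significant at 5%: {significant}")
```

For these counts the statistic is far below the critical value, so the association is not statistically significant. One expected count is below 5, so the chi-square approximation is rough for this table and an exact test would usually be preferred.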

Point estimation and standard error
Sampling variation is a reality: samples will in all likelihood differ from one another.
Even so, there is rarely a need for a second sample in scientific endeavours, provided the first is chosen with due precautions such as random selection and inclusion of a sufficient number of individuals.
In such cases, summary measures based on one sample alone are considered good estimates of the respective characteristics of the target population.
These are called point estimates.
Although point estimates obtained from a carefully drawn sample are fairly representative of population parameters, uncertainties arising out of sampling variation must be taken into account.
The standard error (SE) of the mean measures these uncertainties.
Point estimates are reliable only when the SE is small.
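Sampling variation and the standard error can be seen in a small simulation. The sketch below assumes a hypothetical population of 10,000 systolic blood pressure values (mean 120, SD 15), draws 200 samples of 50 subjects each, and compares the actual spread of the sample means with the SE estimated from a single sample.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: systolic blood pressure in mmHg.
population = [rng.gauss(120, 15) for _ in range(10_000)]

def standard_error(sample):
    """SE of the mean: sample SD divided by the square root of the sample size."""
    return stdev(sample) / len(sample) ** 0.5

# Draw many samples to see how much their means differ from one another.
samples = [rng.sample(population, 50) for _ in range(200)]
sample_means = [mean(s) for s in samples]

spread_of_means = stdev(sample_means)          # observed sampling variation
se_from_one_sample = standard_error(samples[0])  # estimate from one sample only

print(f"SD of 200 sample means: {spread_of_means:.2f}")
print(f"SE from first sample:   {se_from_one_sample:.2f}")
```

The two figures should be of similar size, which is the point of the SE: it estimates from one sample how far sample means would scatter if the study were repeated.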

Confidence interval
When the SE is large, an interval estimate should be obtained.
This is also called a confidence interval.
It is the range that is very likely to contain the parameter value.
This likelihood is called the confidence level.
Generally a 95% confidence level is used.
The 95% CI is obtained as statistic ± 2 SE of that statistic.
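The "statistic ± 2 SE" rule can be applied directly to a sample mean. The readings below are hypothetical, and the function name is illustrative. The multiplier 2 is the usual rounding of 1.96 for large samples; for a small sample such as this one, a somewhat larger multiplier from the t distribution is normally used.

```python
from statistics import mean, stdev

def ci95_mean(sample):
    """Approximate 95% confidence interval for the mean: mean ± 2 SE."""
    m = mean(sample)
    se = stdev(sample) / len(sample) ** 0.5  # standard error of the mean
    return m - 2 * se, m + 2 * se

# Hypothetical systolic blood pressure readings (mmHg) from ten patients.
readings = [118, 122, 125, 130, 115, 121, 127, 119, 124, 129]

lower, upper = ci95_mean(readings)
print(f"mean = {mean(readings):.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```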

Null hypothesis
It is the hypothesis that says that there is no difference, or that asserts the existing knowledge or claim, and it is tested for refutation by the study.
For example: newer drug B is not better than existing drug A for relieving toothache.
A null hypothesis is sought to be refuted by conducting a study.
A null hypothesis is either rejected or not rejected; it is never accepted.
The alternative hypothesis is the assertion that is accepted when the null is rejected.
Note that the alternative is accepted when the null is rejected, but nothing is accepted when the null is not rejected.

Evidence against the null
In medical studies, evidence is provided in terms of the results of a trial conducted on some patients, or observations regarding natural occurrences in one or more groups of people.
The evidence is considered sufficient against the null hypothesis if
1) the study is unbiased
2) there are no confounders that can affect the findings
3) the sample size is sufficient to inspire confidence in the results, and sampling fluctuations are minimal

Type I error and P-values
Type I error: a true null hypothesis is rejected because of misleading evidence provided by the data.
This is a serious error.
The probability of type I error, as calculated from the data, is called the P-value.
Thus, the P-value is the chance that a difference is concluded to be present when actually there is none.
It is this type I error that later forces a ban on some drugs after they are licensed for marketing.
The maximum tolerable probability of type I error is called the significance level.
It is denoted by α and is fixed in advance, generally at 0.05 (5 percent).
The P-value is calculated from the data, whereas α is fixed in advance.
When P < α, the null hypothesis is rejected and the result is stated to be statistically significant.
Type II error occurs when a false null is not rejected.
The probability of type II error is denoted by β.
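The link between the significance level and type I error can be checked by simulation. The sketch below repeatedly samples from a population in which the null hypothesis is true (mean 0, SD 1) and applies a two-sided z-test with known SD, chosen here only because its P-value can be computed with the standard library. The number of trials, the sample size of 30 and the function name are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # significance level, fixed before looking at any data

def z_test_p_value(sample, mu0, sigma):
    """Two-sided P-value for H0: population mean = mu0, with known SD sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 2000
false_rejections = 0
for _ in range(trials):
    # The null is TRUE here: data really come from a mean-0, SD-1 population.
    sample = [rng.gauss(0, 1) for _ in range(30)]
    if z_test_p_value(sample, mu0=0, sigma=1) < ALPHA:
        false_rejections += 1  # a type I error

type1_rate = false_rejections / trials
print(f"Observed type I error rate: {type1_rate:.3f} (alpha = {ALPHA})")
```

The observed rate should come out close to 0.05: when the rule "reject if P < α" is followed, a true null is wrongly rejected in about α of studies.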

Role of statistics
Define problem question and research aims
Review of literature
Develop a hypothesis
Design experiments or other tests
Collect and record data
Analyse and interpret data
Revise or modify protocol
Replication of results
Public understanding of results
Scientific impact of research
Peer review

Statistical software
There are many software packages for statistical analysis and visualization of data. Some of them are SAS (Statistical Analysis System), PASW, S-PLUS, R, Matlab, Minitab 16, BMDP, Stata, SPSS, StatXact, Statistica, LISREL, JMP, GLIM, HIL, MS Excel, etc.

Some useful websites for more information on statistical software:
http://www.galaxy.gmu.edu/papers/astr1.html
http://ourworld.compuserve.com/homepages/Rainer_Wuerlaender/statsoft.htm#archiv
http://www.R-project.org

Minitab 16
Menus and tools arranged logically, matching textbooks and training materials
Project Manager organizes analysis
ReportPad for generating reports
Easily export output to PowerPoint and Word
Clear, comprehensive Help system
StatGuide explains output
Tool-specific tutorials
Glossary of statistical terms
Methods and formulas used in calculations
Smart Dialog Boxes remember recent settings
Hundreds of sample data sets
Available in multiple languages
Increased speed and improved performance

THANK YOU
