Medical Computing and Statistics

PRESENTED BY DR. RAJ KUMAR SINGH (JR-1), DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY
SUPERVISOR: DR. SANJEEV KUMAR VERMA, CHAIRMAN, DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH
CO-SUPERVISOR: DR. MD. SAIF KHAN, LECTURER, DEPTT. OF PERIODONTICS, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH
Overview of seminar
Introduction to medical computing
Role of medical computing
Introduction to statistics
How to use statistics
Role of statistics
Conclusion
What do we mean by computers in medicine?
The Computer Meets Medicine and Biology: Emergence of a Discipline
After taking this course, you should know the answers to these questions:
Why is information management a central issue in biomedical research and clinical practice?
What are integrated information-management environments, and how might we expect them to affect the practice of medicine and biomedical research in coming years?
What do we mean by the terms medical computer science, medical computing, medical informatics, clinical informatics, nursing informatics, bioinformatics, and health informatics?
Why should health professionals and students of the health professions learn about medical-informatics concepts and informatics applications?
How has the development of mini-computers, microprocessors, and the Internet changed the nature of biomedical computing?
How is medical informatics related to clinical practice, biomedical engineering, molecular biology, decision science, information science, and computer science?
Role of computing
Medical decision making: probabilistic medical reasoning
Patient care systems
Patient monitoring systems
Computer-aided surgery
Computer-based patient record systems
Clinical decision support systems
The Internet
Standards in medical informatics
Imaging modalities
Image management systems
Telemedicine
Bioinformatics
Conventional data collection for clinical trial

Medical records
Data sheets
Computer database
Analyses
Results
WHAT IS STATISTICS
Introduction
Statistics is a science that comprises data collection methods, the processing of data into useful information, and the use of this information in taking decisions with the least error
Medical statistics:
A collection of statistical procedures particularly well suited to the analysis of healthcare-related data
Medicine is an empirical science that depends on observations
Medical data are necessary for any medical decision, be it for diagnosis, treatment planning or prognosis: the decision requires that some information is available for the patient
Data are also needed for medico-legal or ethical reasons, to justify the decisions taken
Thus, decisions must be evidence based
Uncertainties in medicine arise mainly due to
1) biological variability
2) environmental variability
3) sampling fluctuations
4) chance variability
5) instrument variability
To deal with the enormous uncertainties that pervade all aspects of medical practice, a separate science has developed, called biostatistics
It provides methods to measure uncertainties by probabilities, and helps to control their impact on medical practice by laying down principles for choosing decisions that judiciously combine the probabilities with judgement
How to use statistics

Develop an underlying question of interest
Generate a hypothesis
Design a study
Collect data
Analyze data: descriptive statistics, statistical inference
Hypothesis: a tentative assumption of the study, or the expected results of the study
It should be very specific and limited to the piece of research in hand, because it has to be tested
The role of the hypothesis is to guide the researcher by delimiting the area of research and keeping the work on the right track
Develop study design
Research question
Study sample
Sample size
Enrollment/follow-up strategies
Ongoing monitoring
Sampling
A sample is that part of the target population which is actually enquired on or investigated
Types of sampling:
1) Simple random
2) Systematic random
3) Stratified random
4) Cluster random
5) Multistage random
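The first three sampling types listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a hypothetical list of 100 patient numbers, the male/female strata are invented for the example, and the function names are not from the seminar.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Random start within the first interval, then every k-th unit."""
    k = len(population) // n          # sampling interval
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=0):
    """Simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

# Hypothetical population: patient numbers 1..100
patients = list(range(1, 101))
# Hypothetical strata: first 60 patients male, remaining 40 female
strata = {"male": patients[:60], "female": patients[60:]}

srs = simple_random_sample(patients, 10)
systematic = systematic_sample(patients, 10)
stratified = stratified_sample(strata, 0.1)
```

Cluster and multistage sampling follow the same idea, except that whole groups (for example hospitals) are sampled first and individuals are then drawn within the selected groups.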
Existing data
Primary data are those which one elicits directly from individual patients, subjects or other units (such as hospitals or laboratories)
Secondary data are those that are elicited by others
Secondary data sources include disease-specific databases on the web, the medical literature, and records of surveys and registrations done by the government
Generation of new data
Existing data may be incomplete and insufficient to provide answers to specific questions
For these, data are specially generated through new surveys and experiments
Basically there are two types of studies to generate new data: descriptive and analytical
In either setup, it is necessary that a sample of subjects is studied
Data collection designs
Objective: descriptive, analytical
Method: surveys, observational, experimental
Time frame: prospective (cohort, cause to effect), retrospective (effect to cause), cross-sectional (one point in time)
Setting: animal trial
Describing data with tables

1) Frequency table
2) Relative and cumulative frequency
3) Grouped frequency
4) Open-ended groups
5) Cross-tabulation
Frequency table
Mortality (%) | Tally | No. of ICUs
11.2-15.1 | 1, 1, 1, 1, 1, 1, 1, 1, 1 | 9
15.2-20.1 | 1, 1, 1, 1, 1, 1, 1, 1 | 8
20.2-25.1 | 1, 1, 1, 1, 1 | 5
25.2-30.1 | 1, 1, 1 | 3
30.2-35.1 | 1 | 1
Relative and cumulative frequency
Parity | No. of women | Percentage (relative frequency) | Cumulative percentage
0 | 5 | 12.5 | 12.5
1 | 6 | 15 | 27.5
2 | 14 | 35 | 62.5
3 | 10 | 25 | 87.5
4 | 3 | 7.5 | 95
7 | 1 | 2.5 | 97.5
8 | 1 | 2.5 | 100
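As a rough check on the arithmetic, the relative and cumulative percentages can be computed from the parity counts given on this slide. The sketch below assumes the counts 5, 6, 14, 10, 3, 1, 1 for parities 0, 1, 2, 3, 4, 7, 8 (40 women in all); the function name is illustrative only.

```python
def frequency_table(counts):
    """Return (value, frequency, percentage, cumulative percentage) rows."""
    total = sum(counts.values())
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]      # accumulate counts, not percentages
        rows.append((value,
                     counts[value],
                     100 * counts[value] / total,
                     100 * running / total))
    return rows

# Parity counts taken from the slide
parity_counts = {0: 5, 1: 6, 2: 14, 3: 10, 4: 3, 7: 1, 8: 1}
rows = frequency_table(parity_counts)

for value, freq, pct, cum_pct in rows:
    print(f"{value:>6} {freq:>4} {pct:>6.1f} {cum_pct:>6.1f}")
```

Accumulating the counts before converting to a percentage avoids rounding drift, so the last cumulative value is exactly 100.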
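Cross tabulation
Two variables within a single group of individuals
Caries | 2 or fewer children: Yes | 2 or fewer children: No | Totals
Occlusal | 21 (84%) (66) | 11 (73%) (34) | 32 (100)
Proximal | 4 (16%) (50) | 4 (27%) (50) | 8 (100)
Totals | 25 (100%) | 15 (100%) | 40
(In each cell the first percentage is the column percentage and the second figure is the row percentage)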
Describing data with charts

1) Charting nominal data
(1) the pie chart
(2) the simple bar chart
(3) the cluster bar chart
(4) the stacked bar chart
2) Charting ordinal data
(1) the pie chart
(2) the bar chart
3) Charting discrete metric data
4) Charting continuous metric data
(1) the histogram
Pie chart
4-5 categories
One variable
Start at 0 in the same order as the table
Pie chart: hair color of children receiving d-phenothrin
blonde, 18, 18%
brown, 55, 57%
red, 4, 4%
dark, 21, 21%
Simple bar diagram
Clustered bar diagram
[Figure: clustered percentage bar chart of hair color (blonde, brown, red, dark) of children receiving malathion and d-phenothrin; vertical axis 0 to 60%]
Histogram
[Figure (Exercise 3.5): histogram of the percentage age distribution of pregnant women (thrombosis cases); age groups 19, 20-24, 25-29, 30-34, 35; vertical axis 0 to 40%]
Step chart
[Figure (Exercise 3.8): step chart of the cumulative percentage of infants]
Charting cumulative ordinal or discrete metric data
Cumulative frequency curve

[Figure (Exercise 3.9): ogive showing percentage cumulative frequency curves of age for male suicide attempters and later succeeders; age groups 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, > 85]
Data collection, types and quality
Evidence-based decisions are only as good as the evidence itself
Thus it is important that the data gathered for creating evidence are correct
Methods such as interview, examination and investigations are available
The investigator must decide which method is best for the particular information sought
Data can be either quantitative or qualitative
Qualitative data can be on a nominal scale or an ordinal scale
Quantitative data are on a metric scale
Nominal scale data
Each value can be allocated to one of a number of categories
The categories have no meaningful order
Examples: blood type, sex (male/female)
Ordinal scale data
Each value can be allocated to one of a number of categories, and the categories can be put in a meaningful order
Example: very satisfied, satisfied, neutral, unsatisfied, very unsatisfied
Discrete metric data
Countable variables
Integer form
Numbers of things
Examples: age, number of men
Continuous metric data
Measurable variables
Rounded to the nearest integer
Examples: kg, m, mmHg, hours, years
Quality of data is assessed in terms of the validity and reliability of the measurements or of the tools used to obtain the data
Validity: the ability of a tool to correctly measure the characteristic that it purports to measure
For tests, this is assessed in terms of sensitivity and specificity, and positive and negative predictivities
Reliability: the ability to give the same result when used repeatedly in identical conditions
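The four validity measures named above are simple ratios from a 2x2 table of test result against true disease status. A minimal sketch follows; the counts (45 true positives, 10 false positives, 5 false negatives, 90 true negatives) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_validity(tp, fp, fn, tn):
    """Validity measures of a test from a 2x2 table.

    tp: diseased and test positive    fp: healthy and test positive
    fn: diseased and test negative    tn: healthy and test negative
    """
    return {
        # proportion of diseased subjects the test detects
        "sensitivity": tp / (tp + fn),
        # proportion of healthy subjects the test clears
        "specificity": tn / (tn + fp),
        # proportion of positive results that are truly diseased
        "positive_predictivity": tp / (tp + fp),
        # proportion of negative results that are truly healthy
        "negative_predictivity": tn / (tn + fn),
    }

# Hypothetical counts for illustration only
validity = diagnostic_validity(tp=45, fp=10, fn=5, tn=90)
for name, value in validity.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the two predictivities also depend on how common the disease is in the group tested.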
Statistical analyses
Descriptive statistics: describe the sample
Inference: make inferences about the population
Primarily performed in two ways:
Hypothesis testing
Estimation (more important!)
Prediction
Descriptive statistics
Descriptive statistics are a way of summarizing the complexity of the data with a single number.
A. For one variable ("univariate analysis"): measures of CENTRAL TENDENCY (averages) and of DISPERSION or variance around that average. Examples: means, modes, medians, standard deviations, quartiles
B. For the strength of the relationship between two variables (bivariate analysis) or among a set of variables (multivariate analysis): measures of ASSOCIATION or correlation
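The univariate measures named in point A are available in Python's standard `statistics` module. The sketch below uses a hypothetical list of nine ICU mortality percentages; the values are invented for illustration and are not the seminar's data.

```python
import statistics

# Hypothetical mortality percentages for nine ICUs
mortality = [12.4, 14.8, 15.6, 18.2, 18.2, 21.0, 23.5, 27.9, 31.3]

# Measures of central tendency
mean = statistics.mean(mortality)
median = statistics.median(mortality)
mode = statistics.mode(mortality)

# Measures of dispersion
sd = statistics.stdev(mortality)                   # sample standard deviation
q1, q2, q3 = statistics.quantiles(mortality, n=4)  # quartiles
value_range = max(mortality) - min(mortality)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"sd={sd:.2f} quartiles=({q1:.2f}, {q2:.2f}, {q3:.2f}) range={value_range:.1f}")
```

`statistics.stdev` divides by n - 1 (the sample formula); `statistics.pstdev` would be the choice if the nine values were the whole population.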
Measure of central tendency
Nominal & Ordinal: frequencies, percents, medians, modes (all)
Interval & Ratio: means
Measure of dispersion
Nominal & Ordinal (qualitative): range, quartiles
Interval & Ratio (quantitative): standard deviation
Measure of association
Nominal & Ordinal: cross-tabulation; non-parametric measures (Phi, Gamma, Eta, Lambda, Tau-B etc.)
Interval & Ratio: Pearson's r
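For interval and ratio data, Pearson's r can be computed directly from its definition. The following is a minimal sketch; the age and systolic blood pressure values are hypothetical and serve only to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: age (years) and systolic blood pressure (mmHg)
age = [25, 32, 41, 48, 55, 63]
systolic = [118, 121, 127, 131, 138, 145]

r = pearson_r(age, systolic)
print(f"r = {r:.3f}")
```

The coefficient lies between -1 and +1; values near either end indicate a strong linear relationship, and values near 0 indicate little or none.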
Measure of significance
Nominal & Ordinal: Chi-square
Interval & Ratio: t-test, ANOVA (F-ratio)
Inferential statistics
Inferential statistics are measures of the SIGNIFICANCE of the relationship between two or more variables
Significance refers to the probability that the findings could be attributed to sampling error
The appropriate statistic depends on the LEVEL OF MEASUREMENT OF THE DEPENDENT VARIABLE (and of the independent variable)
Parameters
Summary measures, such as the mean and standard deviation, can be obtained for a sample as well as for the entire population
Summary measures, when obtained for the entire target population, are called parameters
The values of parameters are hardly ever known, because nobody has the time and resources to study the entire population
When parameter values are unknown, as is almost invariably the case, it becomes necessary to fall back on samples to get some tangible lead regarding the characteristics of the population
Measures such as the mean and SD, when obtained for sample subjects, are called statistics
Standard deviation and normal

mean
Tests of parametric significance

1) Student t-test: for comparison of means between 2 groups
2) ANOVA F-test: for comparison of means in three or more groups
(Both of the above tests require that the means follow a Gaussian distribution, and hence they are called parametric tests)
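As an illustration of the Student t-test for two independent groups, the sketch below computes the pooled-variance t statistic and its degrees of freedom. The pain scores for drug A and drug B are hypothetical, and the sketch stops at the test statistic: the P-value would be read from a t table or obtained from a statistics package.

```python
import math
import statistics

def student_t(group_a, group_b):
    """Pooled-variance (equal variance) two-sample t statistic and df."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)   # sample variances (n - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df

# Hypothetical pain scores (0-10) after treatment
drug_a = [6, 5, 7, 6, 8, 5, 6, 7]
drug_b = [4, 5, 3, 4, 5, 4, 3, 4]

t_stat, df = student_t(drug_a, drug_b)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The pooled form assumes both groups have roughly the same variance; where that is doubtful, the unequal-variance (Welch) form of the test is the usual alternative.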
Nonparametric test
When the sample size is very small and the distribution is skewed, parametric tests cannot be used
In such cases, non-parametric tests (which are less powerful than parametric tests) are used
For paired data, the non-parametric tests commonly used are the sign test and the Wilcoxon signed rank test
For unpaired two-sample data, the non-parametric test is the Mann-Whitney test
Another important non-parametric test is the Chi-square test (used for nominal data), a test of proportions
It is used to test the significance of the association of two or more qualitative characteristics
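The Chi-square statistic compares the observed counts in a cross-tabulation with the counts expected if the two characteristics were unrelated. The sketch below applies it to the caries cross-tabulation shown earlier in this seminar (occlusal 21 and 11, proximal 4 and 4); the function name is illustrative, and no continuity correction is applied.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # count expected in this cell if rows and columns are independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows: occlusal, proximal caries; columns: 2 or fewer children yes, no
caries = [[21, 11],
          [4, 4]]

statistic, df = chi_square(caries)
print(f"chi-square = {statistic:.2f}, df = {df}")  # chi-square = 0.67, df = 1
```

With 1 degree of freedom the 5% critical value is 3.84, so a statistic of 0.67 gives no evidence of association. One expected count here is only 3, so the Chi-square approximation is rough for this table and an exact test would normally be preferred.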
Point estimation and standard error

Sampling variation is a reality: samples in all likelihood will differ from one another
Even so, there is rarely a need for a second sample in scientific endeavours, provided the first is chosen with due precautions such as random selection and inclusion of a sufficient number of individuals
In such cases, summary measures based on one sample alone are considered good estimates of the respective characteristics of the target population
These are called point estimates
Although point estimates obtained from a carefully drawn sample are fairly representative of the population parameters, uncertainties arising out of sampling variation must be taken into account
The standard error (SE) of the mean quantifies these uncertainties
Point estimates are reliable only when the SE is small
Confidence interval
When the SE is large, an interval estimate should be obtained
This is also called a confidence interval
This is the range that is very likely to contain the parameter value
This likelihood is called the confidence level
Generally a 95% confidence level is used
The 95% CI is obtained as statistic ± 2 SE of that statistic
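For a sample mean, the SE is the sample standard deviation divided by the square root of the sample size, and the rule above gives the approximate 95% CI as mean ± 2 SE. A minimal sketch follows, using eight hypothetical measurements; the multiplier 2 is the slide's approximation (1.96 for large samples, somewhat larger for small ones).

```python
import math
import statistics

def mean_with_ci(sample):
    """Point estimate, standard error and approximate 95% CI of the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # SE = SD / sqrt(n)
    return mean, se, (mean - 2 * se, mean + 2 * se)

# Hypothetical measurements for illustration only
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean, se, (lower, upper) = mean_with_ci(sample)
print(f"mean = {mean}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the SE shrinks with the square root of n, quadrupling the sample size halves the width of the interval.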
Null hypothesis
It is the hypothesis that says that there is no difference, or that asserts the existing knowledge or claim, and it is tested for refutation by the study
For example: newer drug B is not better than existing drug A for relieving toothache
A null hypothesis is sought to be refuted by conducting a study
A null hypothesis is either rejected or not rejected; it is never accepted
The alternative hypothesis is the assertion that is accepted when the null is rejected
Note that the alternative is accepted when the null is rejected, but nothing is accepted when the null is not rejected
Evidence against null
In medical studies, evidence is provided in terms of the results of a trial conducted on some patients, or observations regarding natural occurrences in one or many groups of people
The evidence against the null hypothesis is considered sufficient if
1) the study is unbiased
2) there are no confounders that can affect the findings
3) the sample size is sufficient to inspire confidence in the results, and sampling fluctuations are minimal
Type I error and P-values
Type I error: a true null hypothesis is rejected because of misleading evidence provided by the data
This is a serious error
The P-value is the probability, calculated assuming the null hypothesis is true, of obtaining a difference at least as large as the one observed
Thus, the P-value is read as the chance that the presence of a difference is concluded when actually there is none
It is this Type I error that later on forces a ban on some drugs after they are licensed for marketing
The maximum tolerable probability of a Type I error is called the significance level
It is denoted by α and is fixed in advance, generally at 0.05 (5 percent)
The P-value is calculated on the basis of the data, whereas α is fixed in advance
When P < α, the null hypothesis is rejected and the result is stated to be statistically significant
Type II error occurs when a false null is not rejected
The probability of a Type II error is denoted by β
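The decision rule can be illustrated with the sign test mentioned earlier, because its P-value is an exact binomial probability that needs no tables. In the sketch below the paired data are hypothetical: 10 patients are assumed to have tried both drugs, 9 reporting better relief with drug B and 1 with drug A. The function name and the constant ALPHA are illustrative.

```python
from math import comb

def sign_test_p_value(n_positive, n_negative):
    """Two-sided exact sign test P-value (ties already excluded).

    Under the null hypothesis each pair is equally likely to favour
    either treatment, so the count follows Binomial(n, 0.5).
    """
    n = n_positive + n_negative
    k = max(n_positive, n_negative)
    # probability of a split at least this lopsided in one direction
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

ALPHA = 0.05   # significance level, fixed before the data are seen

# Hypothetical result: 9 pairs favour drug B, 1 pair favours drug A
p_value = sign_test_p_value(9, 1)
reject_null = p_value < ALPHA

print(f"P = {p_value:.4f}, reject null: {reject_null}")  # P = 0.0215, reject null: True
```

Here P is below α, so the null hypothesis of no difference is rejected; had the split been 6 to 4, the same calculation would have left the null not rejected.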
Role of statistics
Define problem question and research aims
Review of literature
Develop a hypothesis
Design experiments or other tests
Collect and record data
Analyse and interpret data
Revise or modify protocol
Replication of results
Public understanding of results
Scientific impact of research
Peer review
Statistical software
There are many software packages for statistical analysis and visualization of data
Some of them are SAS (Statistical Analysis System), PASW, S-PLUS, R, Matlab, Minitab 16, BMDP, Stata, SPSS, StatXact, Statistica, LISREL, JMP, GLIM, HIL, MS Excel etc.
Some useful websites for more information on statistical software:
http://www.galaxy.gmu.edu/papers/astr1.html
http://ourworld.compuserve.com/homepages/Rainer_Wuerlaender/statsoft.htm#archiv
http://www.R-project.org
Minitab 16
Menus and tools arranged logically, matching textbooks and training materials
Project Manager organizes analysis
ReportPad for generating reports
Easily export output to PowerPoint and Word
Clear, comprehensive Help system
StatGuide explains output
Tool-specific tutorials
Glossary of statistical terms
Methods and formulas used in calculations
Smart Dialog Boxes remember recent settings
Hundreds of sample data sets
Available in multiple languages
Increased speed and improved performance
THANK YOU
