
Schulich School of Medicine and Dentistry

University of Western Ontario

Introduction to the Course

My goals are: (1) that you all get 90s, and (2) that you learn to love epidemiology as much as I do.

Unrealistic goals, perhaps, and to get there I'll need your help: if ever there is something that you don't understand, ask; if ever there is something you're curious to learn more about, ask.

From what I understand, the notes provided on WebCT are good enough for you to pass the course. I'm hoping that my notes will be good enough for you to ace the course. These are based on both the objectives and the content of Dr. El-Masri's lectures. They are not based on Dr. Champion's lectures, since I'm not in London. I've done my best to ensure that they are complete and cover the required material.

Good luck!

Aidan Findlater

aidan@aidanfindlater.com

SSMD Medicine, Class of 2015

Notes on the Notes

When a lecturer emphasized something that wasn't in the objectives, I've marked it as NiO (Not in Objectives) and will let you decide whether it's worth studying.

Text that looks like this paragraph summarizes the main points relating to the objective. It's the main stuff you need to know, based on the lecture, course notes, and my own knowledge.

Smaller, indented text that looks like this is extra information that might provide a better or more complete understanding of the objective. It's similarly based on the lecture, course notes, and my own knowledge.

[A: Italicized text in brackets is my own editorializing. This is where I'll put stuff when maybe I disagree with the lecturer or if I'm not sure that they'd agree with me. You should be able to safely ignore it when studying for the exam.]

Table of Contents

Intro to EBM (9 Jan 2012) 1
Define evidence-based medicine (EBM) 1
Describe the components of EBM 1
Describe the rationale for the use of EBM 2
Describe the evidence pyramid with respect to ranking of evidence sources 2
Understand basic study design principles [NiO] 2
Understand how background and foreground knowledge are used [NiO] 3
Provide an example of a proven therapy that is not used optimally 4
Provide an example of a harmful therapy that has been used in the past 4
Describe barriers to the use of EBM 4

Fundamentals of Epidemiology I (16 Jan 2012) 5
Describe the relevance of epidemiology (methods and results) to clinical methods 5
Define epidemiology (Last, 2001) and understand the components of the definition 5
Define, and apply to clinical information, the concepts of prevalence and incidence 5
Define, and apply to clinical information, the concept of risk (probability) 7
Define, and apply to clinical information, the concept of outcomes: case definition, case series, validity, and reliability 7
Define, and apply to clinical information, the concept of exposures 8
Define, and apply to clinical information, the contingency table 8
Define, and apply to clinical information, measures of association 9
Define and apply the concepts of epidemic and outbreak 11
Define and apply the concept of epidemic curves with different shapes 11
Define and apply the concept of surveillance 12
Define and apply the concept of alternative explanations for a finding 12

Fundamentals of Epidemiology II (23 Jan 2012) 15
Summary of study designs 15
Describe the case-control study design, including advantages and disadvantages 17
Describe the cohort study design, including advantages and disadvantages 17
Describe the RCT study design, including advantages and disadvantages 18
Describe the ecologic study design, including advantages and disadvantages 18
Describe the time-series study design, including advantages and disadvantages 18
Describe the cross-sectional study design, including advantages and disadvantages 18
Describe the natural experiment study design, including advantages and disadvantages 19
Describe the major sources of bias that may occur in etiologic studies in humans, and apply this information to interpretation of observational studies 19
Define causation 19
Describe approaches to determining causation 19
Describe measures of mortality [NiO] 21

Fundamentals of Biostatistics I (30 Jan 2012) 22
Define probability and odds 22
Classify different sampling approaches 22
Simple random sample (SRS) 23
Systematic sample with random start 23
Cluster sampling 24
Multistage sampling 24
Stratified sampling 25
Convenience sampling 25
Quota sampling 25
Differentiate between levels and types of measurement 25
Describe a normal distribution and compare to a skewed distribution 26
Understand the mean, median, mode, variance, standard deviation, and range, and be able to calculate the mean, median, mode, and range 27
Identify the most appropriate measurement for central tendency and dispersion for different levels of measurement, including interquartile range and standard deviation 28

Fundamentals of Biostatistics II (6 Feb 2012) 29
Distinguish between estimation and hypothesis testing 29
Interpret p-value and confidence interval 29
Define and interpret Type I error (alpha) 31
Define and interpret Type II error (beta) 31
Define and interpret power 31
Identify factors required for sample size and power calculations 31
Distinguish between a negative and an underpowered trial 31
Define and interpret statistical interaction [NiO] 32
Interpreting multiple/multivariate regression [NiO] 32

Are the Results Valid? I (13 Feb 2012) 33
Compare and contrast observational (cohort, case-control and case series) and experimental studies 33
Define study population and inclusion criteria 33
Define randomization and allocation concealment and differentiate between the two 33
Define block and stratified randomization 34
Define blinding, and recognize studies where blinding may not be possible 35
Define intention-to-treat analysis, and describe advantages 36
Understand what the CONSORT RCT reporting guidelines are 37

Are the Results Valid? II (27 Feb 2012) 39
Identify and interpret baseline data in a clinical trial 39
Identify attrition, and discuss possible effects on results of clinical trial 39
Compare and contrast efficacy and effectiveness, and internal and external validity 39
Define bias, and recognize different sources of bias in studies, including publication bias 40

RCTs: What are the Results? I (5 Mar 2012) 41
Describe the problem of multiplicity in analysis, and apply this information with respect to interpretation of subgroup analysis, multiple and secondary outcomes 41
Differentiate between primary and secondary outcomes, and apply this information to clinical trials 41
Define composite outcome 42
Describe the valid use of composite outcomes 42
Describe rationale for using composite outcomes 42
Describe potential problems with subgroup analyses 42
Describe criteria for valid subgroup analyses 42
Interpret subgroup analyses in a clinical trial 42
Define interim analysis 43
Describe reasons for early termination of clinical trials 43
Define surrogate outcome, and recognize use of surrogate outcomes in a clinical trial as well as potential drawbacks of the use of surrogates 43
Describe study phases in clinical trials 43
Describe problems of adverse event recognition including the use of the rule of three 43

RCTs: What are the Results? II (19 Mar 2012) 45
Differentiate between dichotomous and continuous outcomes 45
When provided with information from a clinical trial, develop a 2x2 table 45
When provided with information from a clinical trial, calculate and interpret the control event rate (CER) 47
When provided with information from a clinical trial, calculate and interpret the experimental event rate (EER) 47
When provided with information from a clinical trial, calculate and interpret the relative risk (RR) 48
When provided with information from a clinical trial, calculate and interpret the absolute risk reduction (ARR) 48
When provided with information from a clinical trial, calculate and interpret the relative risk reduction (RRR) 48
When provided with information from a clinical trial, calculate and interpret the number needed to treat (NNT) 48
When provided with information from a clinical trial, calculate and interpret odds ratio (OR) 49
Provide information regarding strengths and weaknesses of NNT 49
Critically appraise an article on therapy 50

Case-Control and Cohort Studies (26 Mar 2012) 51
Describe the purpose and structure of case-control and cohort study design 51
Describe the strengths and weaknesses of cohort and case-control studies 52
Recognize and describe types of bias that may occur 54
Recognize and describe confounding 55
Define and calculate relative risk and odds ratio 56
Critically appraise a case-control study 56
Critically appraise a cohort study 56

Prognosis (2 Apr 2012) 57
Differentiate between risk and prognostic factors 57
Describe the elements of prognostic studies 57
Interpret a survival curve 58
Recognize potential sources of bias in cohort studies of prognosis 60

Diagnosis (9 Apr 2012) 61
Discuss the use of diagnostic tests clinically 61
Describe the characteristics and definitions of normal and abnormal test results 61
Develop a 2x2 diagnostic test result table when provided with data from a study of a diagnostic test 62
Define and calculate sensitivity and specificity 62
Define and calculate positive and negative predictive value 62
Define and calculate prevalence 63
Apply the role of pretest probability or prevalence in interpretation of diagnostic test results 63
Interpret likelihood ratios 64
Interpret kappa 64
Interpret a receiver operating characteristic (ROC) curve 65
Critically appraise a study on a diagnostic test 65

Screening (16 Apr 2012) 66
Define and differentiate between the three levels of prevention (primary, secondary, and tertiary) 66
Differentiate between screening and case-finding 66
Differentiate between diagnostic and screening tests 66
Describe criteria for a screening program 66
When provided with information about a screening test, calculate sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and prevalence 67
Describe the impact of prevalence of disease on the results of diagnostic or screening tests 67
Apply the impact of prevalence of disease to clinical situations 67
Define and recognize lead-time, length-time and compliance bias 67
Discuss possible adverse effects of screening programs 68

The Interpretation of Statistical Results (23 Apr 2012) 69
Describe the difference between unadjusted and adjusted results 69
Interpret statistical findings and the level of measurement of the outcome variable in linear, logistic, and survival analyses 69
Describe the importance of describing sample characteristics in epidemiologic research 71
Describe various ways of selecting which variables should be included in a multivariate analysis 72
Non-regression statistical tests 72

Meta-Analysis (30 Apr 2012) 73
Define and compare/contrast review, systematic review, and meta-analysis 73
Summarize steps required for a systematic review, including framing a specific question for review 74
Summarize steps required for a systematic review, including identifying relevant literature 74
Summarize steps required for a systematic review, including assessing the quality of the literature 74
Summarize steps required for a systematic review, including summarizing the evidence 74
Recognize the possible bias due to publication bias and describe approach to identifying publication bias using a funnel plot 74
Interpret a forest plot 76
Describe benefits and limitations of a meta-analysis 77
Define heterogeneity 77
Recognize that heterogeneity may mean a meta-analysis is not feasible/valid 78
Interpret data from a cumulative meta-analysis 78
Describe the role of a sensitivity analysis 79

Communicating Risk (7 May 2012) 80
Describe effective risk communication as the basis for informed consent 80
Define health literacy 80
Define health numeracy 80
Describe patient perception of risk and the impact of health literacy and numeracy on patient risk perception and understanding 80
Describe cognitive biases that affect risk assessment and decision-making [NiO] 81
Outline the basic dimensions of risk 81
Identify techniques that have been shown to improve patient understanding of risk, such as verification techniques and the roles of qualitative and quantitative and graphic presentations of risk, and decision aids 82

References 83

Appendix I: The Student Guide to Research 85
Starting your project 85
Observational studies 86
Collecting the data 86
Analyzing the data 86
Writing it up 86

Intro to EBM

(9 Jan 2012)

Define evidence-based medicine (EBM)

Evidence-based medicine is "the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients" (1). EBM is not "cookbook medicine."

Put another way: the integration of best research evidence with clinical experience and patient values to

facilitate clinical decision-making (2).

The current best evidence mostly comes from medical research, although there are other sources. But remember: it's extremely important to consider patient values when making decisions. EBM uses clinical expertise to integrate the best research evidence, patient values, the health care system and available resources, and the clinical setting.

EBM requires self-directed, life-long learning. Reading journals and attending medical conferences are important for keeping up to date with current research, but the research must be evaluated critically. Just because it was peer-reviewed doesn't mean it was done well.

Describe the components of EBM

EBM is a process with four steps:

1. Asking an answerable question. Answerable questions generally follow the PICO format. Specify

the Population in which the study will be done, the Intervention or exposure that will be applied, the

Control or comparison group (if applicable), and the Outcomes that are being investigated.

2. Tracking down the best available evidence to answer the question.

3. Critically appraising the evidence for validity and interpreting the results.

4. Integrating information with clinical expertise and the individual patient.

When asking an answerable question, remember the PICO format. Compare "Is the nicotine patch effective?" to "Is the nicotine patch better for smoking cessation than counselling alone in heavy-smoking adult Canadian men?" The PICO wording of the research question often makes the best title for a research paper.

When tracking down the best available evidence, use guidelines published by national organizations (e.g. the Agency for Healthcare Research and Quality), EBM-focused journals (e.g. EBM, EB Cardiovascular Medicine, etc.),

Epidemiology I Course Notes

1

systematic reviews (e.g. the Cochrane Library), and finally the primary literature (e.g. PubMed, ProQuest, etc.). The systematic reviews and meta-analyses from the Cochrane Library probably provide the strongest form of evidence. [A: I agree. Cochrane is great.]

Describe the rationale for the use of EBM

"Between the health care we have and the care we could have lies not just a gap, but a chasm" (3). [A: EBM is presumably an attempt to bridge this chasm.]

Describe the evidence pyramid with respect to ranking of evidence sources

It can be useful to classify the reliability of studies based on their study designs. There are a variety of ways to do this, but the one discussed in class is:

[A: 0. Systematic reviews and meta-analyses]

1. Experimental studies

1. Randomized controlled double-blinded studies

2. Randomized controlled studies

2. Observational studies

1. Cohort studies

2. Case-control studies

3. Case series

4. Case reports

3. Basic research and expert opinion

1. Animal research

2. Ideas, editorials, opinions

3. In vitro (test tube) research

[A: "Double-blind" is a loose term because it doesn't specify who was blinded. Depending on the study, you can have triple- or quadruple-blind studies, and double-blind may not be enough.]

[A: Also, remember that the hierarchy is just a guideline! A well-designed cohort study is probably more reliable than a poorly-designed RCT, and the same goes for cohort versus case-control.]

Understand basic study design principles [NiO]

[A: He provided an overview of the basic classification of study designs. Each is discussed at greater length in later lectures, so I've formatted the following as optional.]

There are two types of research: experimental and observational.

If the researcher assigns the exposure (e.g. you choose who gets the medication and who gets placebo), then it is an experimental design. If the exposure was not determined by the researcher (e.g. smokers choosing to smoke; you didn't assign them to it), then the research is observational.

For observational studies, was there a comparison group? If yes, then you have a cohort study or case-control study; if no, then it's a descriptive study (generally a case series). Cohort studies compare outcomes between

Prepared by Aidan Findlater

2

exposed and unexposed groups (e.g. taking smokers and non-smokers and seeing who gets cancer). Case-control studies compare exposures between outcome groups (e.g. taking cancer and non-cancer patients and seeing who smoked). [A: For very good reasons that we will see later, case-control studies are considered to be a weaker form of evidence because they are more easily biased.]

Experimental studies can be randomized (which is good) and blinded (which is good). They can also be neither. Randomized clinical trials (RCTs) are not necessarily blinded. [A: You could also have a non-randomized trial, but that would be very strange and suggests that you actually have an observational, not experimental, study. Double-check who was determining the participants' exposure status.]
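The decision tree above (did the researcher assign the exposure? is there a comparison group?) can be sketched in code. This is my own illustrative sketch, not something from the lecture; the function name and labels are invented:

```python
def classify_design(researcher_assigned_exposure: bool,
                    has_comparison_group: bool = False,
                    randomized: bool = False) -> str:
    """Classify a study design using the decision tree above (a simplification)."""
    if researcher_assigned_exposure:
        # Experimental: the researcher decided who was exposed.
        return "randomized trial" if randomized else "non-randomized trial"
    # Observational: the exposure happened on its own.
    if has_comparison_group:
        return "cohort or case-control study"
    return "descriptive study (e.g. case series)"

# Smokers chose to smoke; we compare them with non-smokers:
print(classify_design(False, has_comparison_group=True))
# -> cohort or case-control study

# We assigned drug vs placebo at random:
print(classify_design(True, randomized=True))
# -> randomized trial
```

Whether a cohort or case-control study results then depends on whether you group by exposure or by outcome, as described above.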

Understand how background and foreground knowledge are used [NiO]

[A: I didn't understand what he was trying to say in lecture. The following mixes a bunch of sources, including the lecture. I'm not clear on whether this information is actually examinable.]

"Background" questions ask for general knowledge about a condition or thing. "What causes migraines?" They're about getting basic information, not about decision-making.

"Foreground" questions ask for specific knowledge to inform clinical decisions or actions. "In young children with acute otitis media, is short-term antibiotic therapy as effective as long-term antibiotic therapy?" They generally follow a PICO pattern.

Background questions are answered using your background knowledge: book learnin' and basic stuff you learn in textbooks and lectures. Foreground questions are answered using your background knowledge plus extra research and critical thinking. Compare "What is SARS?" (background) to "How can we diagnose and treat SARS?" (foreground).

When diagnosing, foreground thinking is evidenced by starting broad and then narrowing down with your investigations. Background thinking is more about starting with a specific diagnosis and then trying to prove that it's right (hypothesis-testing). Foreground thinking is much more open-minded and is favoured, but you will always mix both.


Provide an example of a proven therapy that is not used optimally

It is estimated that 30-40% of patients (or more) do not receive care according to current evidence.

[A: HAND-WASHING. What would Semmelweis do?]

Provide an example of a harmful therapy that has been used in the past

It is estimated that 20-25% of care that is provided is not needed or is harmful.

[A: Hormone-replacement therapy could be an example. Early studies suggested it was good for you, and people were prescribing it like Tic-Tacs. Later, better studies showed that it was actually harmful. Now it's used more carefully.]

Describe barriers to the use of EBM

There's a time lag between research being done and physicians being aware of the research.

It can be used in the wrong patients, in whom there is minimal benefit or even harm, especially when a practice is adopted quickly.

Changing behaviour is not easy.

The system can make it difficult to provide the best care, like insurance not covering the most effective treatment.

These problems may be at the individual, team, or system level.


Fundamentals of Epidemiology I

(16 Jan 2012)

Describe the relevance of epidemiology (methods and results) to clinical methods

It allows us to investigate the etiology, diagnosis, prognosis, and treatment of diseases.

E.g. Outbreak investigation, which is effectively research with an investigative approach.

Epidemiology can be descriptive or analytic. Descriptive epidemiology is about person, place, and time.

Analytic epidemiology is about studying causation.

Define epidemiology (Last, 2001) and understand the components of the definition

"The study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to the control of health problems" (4).

The components are:

distribution: can refer to people, places, or time periods

determinants: causal factors that affect health

health-related states: persist over time (e.g. depression, hypertension, quality of life)

health-related events: point-in-time occurrences (e.g. heart attack, injury, hospital admission)

specified populations: you must clearly define your study population

application and control: epidemiology is an applied science.

A clearer definition is perhaps: "The study of how disease is distributed in populations and the factors that influence or determine this distribution" (5).

[A: Or, as I tell people, epidemiologists try to figure out what makes us healthy or unhealthy.]

Define, and apply to clinical information, the concepts of prevalence and incidence

In order to compare between different countries, years, etc., we can't just count the number of cases. For the measurement to be meaningful, we must also know the size of the population. For example, consider


1,000 cases of HIV in Canada compared to 1,000 cases of HIV in the US. The prevalence in Canada will be

much higher than the prevalence in the US because the US has a much larger population.

Point prevalence measures the number of cases that exist at a specific point in time divided by the size of the population being studied (e.g. the percentage of the class who is currently, right now, experiencing a cold).

Period prevalence measures the number of cases that exist during a specific period of time divided by the size of the population being studied (e.g. the percentage of the class who has a cold at any point this week, including those who already have a cold as the week starts).

Lifetime prevalence measures the number of people who have the outcome at any point in their life divided by the size of the population being studied (e.g. the percentage of the class who will have a cold at any point in their life, which should be around 100%).
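The three prevalence measures can be sketched with a few lines of arithmetic. This is a minimal illustration, assuming an invented class of 120 students; all of the counts are hypothetical:

```python
# Hypothetical class of 120 students; all counts are invented for illustration.
class_size = 120
sick_right_now = 6             # cases existing at this instant
sick_any_time_this_week = 18   # cases existing at any point during the week
ever_sick = 120                # colds: essentially everyone, eventually

point_prevalence = sick_right_now / class_size            # 6/120  = 0.05 (5%)
period_prevalence = sick_any_time_this_week / class_size  # 18/120 = 0.15 (15%)
lifetime_prevalence = ever_sick / class_size              # 120/120 = 1.0 (100%)
```

Note that the denominator is the same in all three; only the window over which you count "existing cases" changes.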

When you think of prevalence, think of existing cases: both old cases that already exist and new cases that start during the study period. This means that the first time you screen a population for prevalence, it will be high, because you will see the already existing, old cases as well as the new ones.

When choosing a prevalence measure (point vs period vs lifetime), match it to the disease. For short-lived, acute diseases, we can only really use point prevalence. Period and lifetime prevalence are usually used for more chronic diseases. Prevalence cannot be used for studying etiology (disease causation). It is useful for monitoring diseases and measuring disease burden, and therefore helps us understand how well we are managing diseases, especially chronic diseases. It's a good statistic for politicians, policy-makers, and health statisticians. Comparing HIV prevalence in Canada to a developing country may show a higher prevalence in Canada simply because the HIV patients are living longer, and are therefore picked up in the prevalence estimate.

Cumulative incidence (or just incidence) measures the number of new cases that develop over a specific period divided by the size of the population at risk (e.g. the percent of people who get a cold at any time this week, out of those who have upper respiratory tracts). This can be displayed as a percentage.

That last part, "at risk," refers to the population who could become cases. It trips people up, even proper researchers, but it shouldn't. It just means that, if you're studying hysterectomies, you should only ever be dividing by the number of women, since the research is only applicable to that population. If you're studying uterine cancer, your population at risk only includes women who don't already have uterine cancer and who still have their uterus (i.e. have not had hysterectomies).

Incidence rate or incidence density measures the number of new cases that develop over a specific period divided by the total person-time. Person-time is a measure of the amount of time that each study participant contributes, since each one may enter or exit the study at different times. If not everyone in the study is followed equally, you must use the incidence rate.

For example, let's say that you are conducting a year-long study (1 Jan 2011 to 31 Dec 2011) of the risk of death after being diagnosed with colon cancer. Our population at risk is people with a colon cancer diagnosis, and our outcome is death. Each participant may enter the study (i.e. be diagnosed) at a different time and exit the study (i.e. either die or leave for another reason) at a different time, so we know that we need to use


incidence rate instead of cumulative incidence. In this hypothetical study, we have three participants: one who lives for three months after diagnosis and then dies; another who lives for six months after diagnosis and then dies; and a third who had colon cancer at the start of the study and lives until the end without dying. The participants will respectively have contributed three person-months, six person-months, and twelve person-months, for a total of 21 person-months. With two deaths, the incidence density will therefore be two new cases (i.e. deaths) divided by 21 person-months, or 0.095 deaths per person-month.

[A: The incidence rate is not the hazard rate. I'm not sure what our prof was saying about that, but feel free to ignore it for now. We'll see hazard rates again later.]

Incidence rates are often multiplied by an arbitrary factor, just to make the number easier to think about.

The above example would probably be better reported as 95.2 cases per 1,000 person-months.
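The person-time arithmetic from the hypothetical colon-cancer study above can be checked in a few lines (a sketch using exactly those numbers):

```python
# Person-months contributed by the three hypothetical participants above:
# died at 3 months, died at 6 months, survived the full 12 months.
person_months = [3, 6, 12]   # total: 21 person-months
deaths = 2

# Incidence rate (incidence density) = new cases / total person-time
incidence_rate = deaths / sum(person_months)     # 2/21, about 0.095 per person-month
per_1000_person_months = incidence_rate * 1000   # about 95.2 per 1,000 person-months
```

The multiplication by 1,000 is the arbitrary scaling factor mentioned above; it changes nothing about the rate itself.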

Prevalence is useful for measuring chronic health states and burden of disease, and for health care planning and funding. It's bad for studying causation.

Incidence is useful for measuring health-related events and for studying causality, but it can be very difficult to find new cases.

You should understand prevalence and incidence in order to understand health data that you will see. For

example, if prevalence of HIV in Canada is 2% and in South Africa is 1%, you should ask about the incidence.

Perhaps the cumulative incidence in Canada is 0.5% and in South Africa is 2.5%. In that case, you can

conclude that people with HIV in South Africa are dying at a far greater rate than those in Canada, and that

South Africa therefore has a greater HIV problem than Canada.

Prevalence can be increased by: increased incidence, decreased mortality, increased duration of disease, or increased case-finding (e.g. if you start a screening program).

[A: Prevalence ≈ Incidence × Duration, where duration is determined by the time until either cure or death. Everything else follows from this simple equation.]
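As a quick numeric illustration of this relation, with entirely invented numbers:

```python
# Invented numbers: 10 new cases per 1,000 person-years, and the disease
# lasts 5 years on average (until cure or death). At steady state, for a
# reasonably rare disease, prevalence is approximately incidence x duration.
incidence = 10 / 1000     # new cases per person-year
duration_years = 5

prevalence = incidence * duration_years   # 0.05, i.e. about 5%
```

Doubling either the incidence or the average duration doubles the steady-state prevalence, which is why longer survival (e.g. better HIV treatment) raises prevalence even when incidence is flat.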

With regards to increased case-finding, any country that starts a good screening program will have a higher prevalence than a country with poor screening. The higher prevalence in that case just means that the well-screened country is catching cases sooner than the poorly-screened country.

Define, and apply to clinical information, the concept of risk (probability)

Risk is "the probability that an event will occur, e.g., that an individual will become ill or die, within a stated period of time or age" (4). Incidence is a measure of risk for the disease.

Define, and apply to clinical information, the concept of outcomes: case definition, case series, validity, and reliability

A case definition is "a set of diagnostic criteria that must be fulfilled in order to identify a person as a case of a particular disease" (4). They are based on clinical signs/symptoms, laboratory tests, a combination of both, or a scoring system.


Case definitions often exist to identify or delineate cases based on the certainty of the diagnosis. They are usually broken down into definite/confirmed, probable, and possible/suspected cases (from strongest to weakest definition). For each, a patient must meet specific criteria for the given level of certainty. The first two are often used for research because they have a higher signal-to-noise ratio, while the last is more useful for making public health decisions, where you might want to err on the side of caution.

Case definitions should be valid (i.e. they measure what you think they measure) and reliable (i.e. they work the same when applied by different people [inter-rater reliability]).

Whenever you read a research report, the authors must describe how they defined a case. For example, a device for detecting atrial fibrillation requires a case definition of atrial fibrillation (is it 1 second, 5 seconds, etc.?). If you do not have a standardized case definition, you can't really share your results, because your colleagues cannot interpret them. For any research, a case definition has to be provided and has to be convincing. One useful source of case definitions for hospital complications is the CDC, which has specific criteria for each hospital-acquired infection. Using standardized definitions makes research into nosocomial infections more generalizable from one hospital to the next.

Define, and apply to clinical information, the concept of exposures

Exposure is any independent variable that could cause the disease. It can be physical (e.g. radiation), chemical (e.g. tobacco smoke), biological (e.g. genetics, infection), or sociological (e.g. gender, race/ethnicity, socioeconomic status). It can also be a combination of these, as in occupation.

Define, and apply to clinical information, the contingency table

The 2x2 table (two-by-two table) is the simplest way of investigating disease-exposure associations. To investigate the effect of an exposure on a disease outcome, take a bunch of people who don't have the disease and follow them. Some of the people you follow will have been exposed to whatever you're studying and some won't have been. After following them for a while, some will develop the disease or outcome that you're interested in. You can then break the population down into a 2x2 table.

             Disease +   Disease -
Exposure +       a           b       total exposed = a + b
Exposure -       c           d       total unexposed = c + d

total population = a + b + c + d

[A: Know this table. Love this table. Write it out yourself and get comfortable with it.]

The incidence of a disease is the number of people who develop the disease divided by the number of people who were at risk of developing it at the start of the study. We're interested in seeing if the incidence, or risk, is higher in the exposed group compared to the unexposed group. We can easily calculate the exposure-specific risks from our 2x2 table:

Prepared by Aidan Findlater


incidence (or risk) of disease in exposed = a / (a + b)

incidence (or risk) of disease in unexposed = c / (c + d)

incidence (or risk) of disease in total population = (a + c) / (a + b + c + d)

Sometimes, we want to look at the odds of a disease instead of the risk of it. In a casino, 2-to-1 odds of winning means that you will win 2 for every 1 you lose, or 2 positive outcomes for every negative outcome. This corresponds to a 2/3 probability of winning. Similarly, the odds of disease are calculated by dividing the number of people with the disease by the number of people without it. Again, we can easily calculate these numbers from the 2x2 table:

odds of disease in exposed = a / b

odds of disease in unexposed = c / d

[A: Odds are odd. We hate them, but they're a lot easier to work with for certain things. Note that, if the disease is rare, b will be much, much larger than a, and so a/(a+b) will be very close to a/b. The same also holds for the unexposed, so that if a disease is rare, d will be much, much larger than c, and c/(c+d) will be very close to c/d. That is to say, the odds approximate the risk, given that the disease is rare.]
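The 2x2 arithmetic above is easy to sanity-check in code. A minimal sketch (the counts are invented for illustration) computing the risks and odds and showing the rare-disease approximation:

```python
# Hypothetical 2x2 table counts (invented for illustration, not from the notes)
a, b = 5, 995    # exposed: diseased, not diseased
c, d = 2, 998    # unexposed: diseased, not diseased

risk_exposed = a / (a + b)        # incidence (risk) in the exposed
risk_unexposed = c / (c + d)      # incidence (risk) in the unexposed
odds_exposed = a / b              # odds of disease in the exposed
odds_unexposed = c / d            # odds of disease in the unexposed

# With a rare disease, b >> a and d >> c, so the odds approximate the risk:
print(risk_exposed, odds_exposed)      # 0.005 vs ~0.00503
print(risk_unexposed, odds_unexposed)  # 0.002 vs ~0.00200
```

Change a to something like 400 and you'll see the approximation fall apart, which is the "given that the disease is rare" caveat in action.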

Define, and apply to clinical information, measures of association

A measure of association is a calculation that we do to see how strongly an exposure correlates with

an outcome.

Relative risks: risk ratio and odds ratio

A risk ratio is a ratio of risks; an odds ratio is a ratio of odds. The ratios we're interested in are the risk or odds of disease in the exposed group compared to the risk or odds of disease in the unexposed group:

risk ratio or relative risk (RR) = [a / (a + b)] / [c / (c + d)]

odds ratio (OR) = [a / b] / [c / d] = ad / bc

Although they mean different things, the general interpretation is the same for both measures.

Epidemiology I Course Notes


If the relative risk or odds ratio is less than one, it indicates that people

who are exposed are less likely to get the outcome than those who

are not exposed. The exposure is protective.

If the relative risk or odds ratio is equal to one, it indicates that people

who are exposed are as likely to get the outcome as those who are

not exposed. The exposure has no effect on the outcome.

If the relative risk or odds ratio is greater than one, it indicates that people who are exposed are more

likely to get the outcome than those who are not exposed. The exposure is a risk factor.

For example, if the incidence of cancer in smokers is 20% and in non-smokers is 10%, the relative risk of cancer for smokers versus non-smokers is 2, and therefore smokers have twice the risk of cancer as non-smokers.
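To make that example concrete, here's a quick sketch (group sizes of 100 per arm are invented to match the 20% and 10% incidences):

```python
# Invented counts matching the 20% vs 10% incidences in the example
a, b = 20, 80   # smokers: cancer, no cancer
c, d = 10, 90   # non-smokers: cancer, no cancer

rr = (a / (a + b)) / (c / (c + d))   # risk ratio
odds_ratio = (a / b) / (c / d)       # odds ratio, algebraically ad/bc

print(rr)          # 2.0  -> smokers have twice the risk
print(odds_ratio)  # 2.25 -> the OR overstates the RR when the outcome isn't rare
```

Note how the OR (2.25) drifts away from the RR (2.0) because a 10-20% outcome is not rare.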

But for many studies, we can't study the whole population at risk, and therefore can't calculate risks or risk ratios. However, we can often still calculate odds ratios. [A: We will see more of this when we get to case-control studies.]

For example, studying HIV incidence in homeless people, we can't follow all homeless people (that would be impossible!). Instead, we get a sample of people who develop HIV and a sample of people who do not, and we find out how many were homeless in each outcome group. By doing this, we are fixing the number of people in each of the disease groups and our 2x2 table no longer represents the actual population. If we pick 100 diseased participants and 100 non-diseased participants, that doesn't mean that the risk of the disease is 50%, because we're fixing how many people are in each disease category. In this case, a+b and c+d are not the population sizes of the exposure groups. We can no longer calculate incidence from our 2x2 table.

Instead of incidence, we calculate an odds ratio given by the odds of exposure in the diseased participants

divided by the odds of exposure in the non-diseased participants. As it turns out, this gives exactly the same

result as calculating our normal OR.

odds of exposure in diseased group = a / c

odds of exposure in non-diseased (control) group = b / d

odds ratio = odds of exposure in diseased / odds of exposure in non-diseased = [a / c] / [b / d] = ad / bc

Notice how this is the same as if we were calculating the odds ratio based on the odds of disease for each exposure group. Wow! So cool! So it doesn't matter that we don't have a representative population sample; the odds ratio is the same as if we had.
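You can verify that symmetry numerically; both orderings reduce to ad/bc (the counts below are arbitrary):

```python
# Arbitrary 2x2 counts, purely for demonstration
a, b, c, d = 30, 70, 10, 90

or_disease  = (a / b) / (c / d)   # odds of disease, compared across exposure groups
or_exposure = (a / c) / (b / d)   # odds of exposure, compared across disease groups

# Both reduce algebraically to ad/bc, so they are identical:
print(or_disease, or_exposure, (a * d) / (b * c))
```

This is the algebraic trick that makes case-control studies work at all: you fix the outcome groups, yet still recover the same odds ratio.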

In any case-control study where you've fixed the number of people in the disease and non-disease groups, you cannot report a risk ratio, only an odds ratio. El-Masri and I will hunt you down if you do.

However, the interpretation of the OR is slightly different from that of the RR. An RR of 2 means that the risk is doubled in the exposed group compared to the unexposed group; an OR of 2 means that the odds are doubled. [A: Never say that an OR of 2 means that the exposure doubles your risk of an outcome; it doubles your odds of it.]

Interpreting relative risks:
OR < 1 or RR < 1 => protective
OR = 1 or RR = 1 => no effect
OR > 1 or RR > 1 => risk factor

[A: But now think about the fact that the odds approximate the risk for rare diseases (see my explanation in the 2x2 table section, above). Because of this, the OR approximates the RR when the outcome is rare.]

True rates and rate ratio

The rate here refers to the incidence rate. A rate ratio is therefore a ratio of incidence rates. [A: I prefer to be more specific and call these incidence rate ratios, or IRRs.]

rate ratio or incidence rate ratio = (incidence rate in exposed) / (incidence rate in unexposed)

The general interpretation of IRRs is the same as for RRs and ORs:

IRR < 1 => protective
IRR = 1 => no effect
IRR > 1 => risk factor

But again, the specific interpretation is a little different. An IRR of 2 means that the exposure doubles the incidence rate of the outcome, not the risk of it.
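A small sketch with invented person-time data shows the calculation:

```python
# Hypothetical person-time data (invented for illustration)
cases_exposed, person_years_exposed = 30, 1000
cases_unexposed, person_years_unexposed = 10, 1000

rate_exposed = cases_exposed / person_years_exposed        # 0.03 cases per person-year
rate_unexposed = cases_unexposed / person_years_unexposed  # 0.01 cases per person-year
irr = rate_exposed / rate_unexposed                        # incidence rate ratio

print(irr)  # 3.0 -> the exposure triples the incidence rate of the outcome
```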

Define and apply the concepts of epidemic and outbreak

Epidemic: "The occurrence in a community or region of cases of an illness, specific health related behavior, or other health-related events, clearly in excess of normal expectancy" (4).

You all know it: more disease than we expect.

Outbreak: "[A]n incident in which two or more individuals have the same disease, have similar symptoms, or excrete the same pathogens; and there is a time, place, and/or person association between these individuals." (FDA)

Basically, an epidemic just means more disease than expected, while an outbreak means that the cases of

disease are somehow related.

Define and apply the concept of epidemic curves with different shapes

"A graphical plotting of the distribution of cases by time of onset" (4). [A: When a case comes in, you ask when their symptoms started and use that. Time of onset, not time of report.]


Point source: steep left-hand slope; the outbreak lasts for about one incubation period. It's a one-time event, like a picnic, and only shows one peak.

Intermittent common source: shows several individual peaks. It's from a single source that is only exposing people intermittently, such as a contaminated cafeteria that's only open once every two weeks.

Continuous common source: the right-hand slope is gradual if the epidemic runs its course, or sudden if control measures are implemented. It's from a single source that is continuously exposing people, such as a contaminated cafeteria that's open all the time, so you'll see a constant, steady supply of sick people.

Propagated (progressive source): multiple peaks. Basically, it's a bunch of point source outbreaks as people become infected and then spread it to another group. Peaks are about one incubation period apart [A: although a large outbreak would have them all running together into a giant mess]. The easiest example is an STD like chlamydia.

Define and apply the concept of surveillance

"The systematic and continuous collection, analysis and interpretation of data, closely integrated with the timely and coherent dissemination of the results and assessment to those who have the right to know so that action can be taken" (6).

[A: That's a ridiculous definition. Porta is way too wordy.]

Surveillance is essentially descriptive epidemiology over a long time.

Define and apply the concept of alternative explanations for a finding

[A: Or, as I call it, "Everything you thought you knew is wrong." Observational studies are always subject to confounding, and even experimental studies can be explained away.]

Confounding and other biases

Bias is "the systematic deviation of results or inferences from the truth" (6).


Bias is systematic. If it's not systematic, it's not bias (it's random chance).

Confounding is a type of bias. Here's the awful definition given in the notes: "Distortion of the estimated effect of an exposure on an outcome, caused by the presence of an extraneous factor associated both with the exposure and the outcome, i.e. confounding caused by a variable that is a risk factor for the outcome among nonexposed persons, and is associated with the exposure of interest, but is not an intermediate step in the causal pathway between exposure and outcome" (4).

Breathe, guys. If an exposure is associated with an outcome, you have to consider the possibility that something else is causing both the exposure and the outcome to be positive in the same individuals (and that the exposure isn't actually affecting anything).

My favourite example is this: lighters (those things that make flames) are associated with lung cancer. Is this relationship causal? No! The relationship is confounded by smoking status. That is, smokers are more likely than non-smokers to own lighters and are also more likely to get lung cancer. Easy, right?

A slightly more nuanced example is that some researchers found an association between coffee consumption and heart disease. Gotta stop drinking coffee, right? Wrong! The results were later explained as confounding by smoking. A smoker is more likely than a non-smoker to drink coffee. A smoker is also more likely to get heart disease. As soon as you adjust for smoking status, the association between coffee consumption and heart disease disappears. You don't have to give up your coffee, just your cigarettes.

Confounding is a problem in all observational research. No matter what potential confounders you adjust for, there will always be residual confounding that you haven't accounted for. [A: Confounding is really only a problem when talking about causation. If you're developing predictive models, then who cares if the association is confounded or not? It's still a real association. If all you have is data on which people own lighters, you can still develop a model that predicts who will get lung cancer; it just won't be quite as good.]

[A: My favourite form of bias is called confounding by indication, which sort of means reverse causality.

Looking at data on aspirin use and risk of heart disease, you will see an association. Does this mean that

aspirin causes heart disease? No! People with heart disease often take aspirin. In this case, their heart

problems are causing the aspirin use, not the other way around. The aspirin merely indicates the presence of

existing problems, and is probably actually protective.]


Chance

All numbers we come up with in our studies are estimates, and no matter how high your estimate, it could always be due to chance. Some ways of assessing this possibility are p-values [A: which suck, never use them!] and confidence intervals.

Confidence intervals give you a measure of how precise your estimate is. Think of a newspaper giving a statistic, saying, "X plus or minus Y, 19 times out of 20." The confidence interval goes from X-Y to X+Y, and the 19 out of 20 means it's a 95% confidence interval. If you have a small confidence interval, you have a more precise estimate. [A: Remember, that's different from having an accurate estimate.] You can get a small confidence interval by either having a large sample size or a strong effect.
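As an illustration of how sample size drives precision, here is a sketch using the simple Wald interval for a proportion (this particular formula is my addition, not from the course notes):

```python
import math

def wald_ci_95(successes, n):
    """Rough 95% CI for a proportion: p +/- 1.96 * standard error.

    This is the simple Wald interval, used here only to illustrate
    precision; it behaves poorly for very small samples or extreme p.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)   # standard error of the proportion
    return p - 1.96 * se, p + 1.96 * se

# Same 20% estimate, but a tenfold larger sample gives a narrower interval:
print(wald_ci_95(20, 100))    # roughly (0.12, 0.28)
print(wald_ci_95(200, 1000))  # roughly (0.175, 0.225)
```

Both intervals are centred on 0.20; only the width (the precision) changes with n.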

A p-value is about hypothesis-testing. [A: I don't like p-values or hypothesis-testing and won't discuss them unless you guys want me to. El-Masri didn't, so hopefully that means it isn't on the test...]

Stratified analysis

When you stratify, you separate the data based on a potential confounder before doing your analysis.

Basically, you take your original 2x2 table and make two more: one of people who have the confounder

and one of people who do not. You then calculate two separate ORs or RRs, one for each of the strata.
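The lighter/lung-cancer example from earlier can be worked through numerically. The counts below are invented so that lighter ownership has no effect within either smoking stratum, yet the pooled (crude) table shows an association, which is confounding in miniature:

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc from a 2x2 table (a, b = exposed; c, d = unexposed)."""
    return (a * d) / (b * c)

# Invented counts: within each smoking stratum, lighter owners and
# non-owners have identical odds of lung cancer.
smokers = (18, 72, 2, 8)      # high risk, mostly lighter owners
nonsmokers = (1, 9, 9, 81)    # low risk, mostly non-owners

or_smokers = odds_ratio(*smokers)        # 1.0 -> no effect in this stratum
or_nonsmokers = odds_ratio(*nonsmokers)  # 1.0 -> no effect in this stratum

# Crude table: sum the strata cell by cell, ignoring smoking status
crude = tuple(s + n for s, n in zip(smokers, nonsmokers))  # (19, 81, 11, 89)
or_crude = odds_ratio(*crude)

print(or_smokers, or_nonsmokers, round(or_crude, 2))  # 1.0 1.0 1.9
```

The crude OR of about 1.9 is entirely an artifact of smoking being associated with both lighter ownership and lung cancer; stratifying reveals the true null effect.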


Fundamentals of Epidemiology II

(23 Jan 2012)

Summary of study designs


Study Design — Description / Advantages / Disadvantages

case-control — Find people with disease; find similar people without disease. Ask them about previous exposures and compare the groups' rates of the exposure. Advantages: Relatively fast and easy. Good for rare diseases. If done well, can approximate the results of a cohort study. Disadvantages: Easily biased. Only gives odds ratios.

cohort — Find people with exposure; find similar people without exposure. Follow the groups for a while and compare them for outcome. Advantages: Less risk of bias than case-control. Disadvantages: Confounding is a problem even in well-designed cohort studies. Often requires long follow-up.

RCT — Randomly split your participants, give one group the exposure and keep one group as controls. Compare the groups for outcome. Advantages: Causal inference is easy. No risk of confounding. Disadvantages: Can be expensive. Many exposures cannot be ethically randomized.

ecologic — Compares groups at an aggregate, population level. Advantages: Quick and easy. Disadvantages: Same as for cohort, with the additional risk of cross-level bias/ecologic fallacy.

time-series — Analyzing changes in a single measurement over time. [A: Silly definition.] Advantages: Quick and easy. Disadvantages: Can't test hypotheses. [A: Except you can!]

cross-sectional — Sample the population at a specific point in time. Advantages: Quick and easy. Disadvantages: Can't tell whether exposure happened before outcome, because it's all sampled at the same time.

natural experiment — Look for something where people or groups get different exposures due to circumstances outside their control. Compare them for outcomes. Advantages: Can be powerful if done right. Disadvantages: Not actually randomized, so still risks bias.

[A: I hate their definition of time-series. It's useless. Don't remember it except for the exam.]


Describe the case-control study design, including advantages and disadvantages

In a case-control study, you're finding people who have a disease (cases) and comparing them to people who don't (controls). You're grouping by outcome, and you're comparing the groups for differences in exposure rates. With case-control studies, you can only calculate odds ratios, never risk ratios.

For example, say you're studying how cell phone use affects risk of brain cancer. It's a rare outcome, so you probably won't do a cohort study since you'd only pick up a small number of brain cancer cases. You do a case-control study instead, since it's faster and works well for rare diseases. You go to the local cancer centre and actively recruit people who have brain cancer (the cases). For every case you recruit, you find one or two people who are similar in age and sex but who don't have brain cancer (the controls). [A: I won't give details on finding the controls because sampling controls to minimize bias is hard. Ask an epidemiologist.] Now you take the two groups and you assess their past cell phone use, asking them things like whether they have had a cell phone in the past, how long they've had it for, and how much they used it. If the people with brain cancer were more likely to have used a cell phone in the past than the controls, then you would conclude that cell phone use is associated with brain cancer.

[A: An odds ratio approximates a risk ratio when done properly. You can even do your control-group sampling

in such a way that the odds ratio approximates the incidence rate ratio. Cool, right?]

[A: And here's a bias that's huge with case-control studies: recall bias. Even if there's no effect, people with brain cancer are more likely to report cell phone use because (a) they care more than controls and therefore think longer and harder about their exposures, and (b) they suspect a link and are more likely to report positive exposure status even when they had minimal exposure, simply because they suspect that it was the cause of their disease.

Imagine someone is told that they have brain cancer and then asked about cell phone use. Compare that to

someone who just had a toe amputated and is asked about cell phone use. Which is more likely to report cell

phone use? The brain cancer person.

Recall bias isn't just theoretical, either. Epidemiologists have run studies to assess it, by comparing self-reported cell phone use to actual cell phone carrier data.]

Describe the cohort study design, including advantages and disadvantages

In a cohort study, youre nding people who have an exposure and comparing them to people who

dont. Youre grouping based on exposure, and comparing the groups for differences in outcome rates.

Youll take a population that includes both exposed and unexposed people who dont have the disease at

baseline, then following up after a while. Some of them will have developed the disease youre interested in,

and some wont. The analysis compares the rates of the outcome/disease between the exposed to

unexposed groups.

A famous example is the Framingham Heart Study, where researchers followed almost an entire town, periodically sending questionnaires to assess a bunch of different exposures, and waited for people to develop or die from cardiovascular disease. Then they compared those with CVD to those without it, for a whole bunch of the exposures they were tracking. As it turned out, smoking was bad for you! Cohort studies like Framingham provide some of the strongest evidence we can get for exposures like that, since you can't randomize people to smoke or not smoke.


Describe the RCT study design, including advantages and disadvantages

Randomize people to exposure or no exposure. Compare for differences in outcome rates. Pretty direct causal inference, if randomized and blinded properly, but RCTs are usually expensive and time-consuming.

[A: Randomizing and blinding can easily go awry, as can data analysis. Ask an epidemiologist.]

Describe the ecologic study design, including advantages and disadvantages

A study "in which the units of analysis are populations or groups of people, rather than individuals" (4).

For example, breast cancer rates have been positively correlated with per capita fat consumption across countries. A major limitation with ecologic studies is cross-level bias (or the ecologic fallacy), which would occur, for example, if women with breast cancer who lived in countries with a lot of fat consumption actually had low fat diets, and vice versa. Because there is no way to study these individual exposures and outcomes using group-level data, this design is better for generating than testing hypotheses.

Describe the time-series study design, including advantages and disadvantages

Analyzing an outcome as it changes over time. Data are collected continuously or periodically.

Because many health events are recorded by calendar time, it is possible to study trends in these health events. Time series are used descriptively to generate hypotheses. These analyses can indicate whether an outcome is increasing or decreasing.

For example, a Canadian study noted a decrease in hospital and Emergency Department (ED) visits for diabetes complications over a 5-year period. They aren't testing a hypothesis, they're simply noting a trend. Time-series data can also be examined for seasonal trends that might indicate a potential cause (e.g. due to viral exposures, seasonal dietary constituents, or air pollution due to coal burning).

They can also be compared before and after a specific date when a health policy change was instituted, as another piece of evidence that an exposure was a cause. For example, toxic shock syndrome cases before and after Rely tampons were withdrawn, and Reye's syndrome before and after warning labels were put on aspirin, both provide strong visual evidence that the outcome was reduced after exposure was reduced. Note that time series can also lead to erroneous conclusions. All changes in trends coincide with some event, many purely by chance.

[A: I've never heard of a "time-series study." What they're talking about here is simply a descriptive study using time series data, but you can also do an analytical study using time series data. To do that, you'd take two time series, one for exposure and one for outcome, and look for correlations. If the exposure goes up, does the outcome as well?]

Describe the cross-sectional study design, including advantages and disadvantages

A study "that examines the relationship between diseases (or other health related characteristics) and other variables of interest as they exist in a defined population at one particular time. ... The temporal sequence of cause and effect cannot necessarily be determined in a cross-sectional study" (4).


An obvious example of the limitations of a cross-sectional study would be if an association were found

between obesity and depression in a group of high school students, because it would be impossible to know

whether the obesity led to the depression, the depression to the obesity, or both were caused by a third factor.

Describe the natural experiment study design, including advantages and disadvantages

"Naturally occurring circumstances in which subsets of the population have different levels of exposure to a supposed causal factor, in a situation resembling an actual experiment where human subjects would be randomly allocated to groups" (4).

While not a true randomized experiment, there are instances where an exposure occurs to members of a population in an essentially random way. An area similar to the exposed area is then selected as the unexposed group. If the case can be made that the exposure was essentially at random, then the study can yield valuable information that would otherwise not be available. For example, the Nagasaki and Hiroshima A-bomb explosions exposed residents of those two cities to ionizing radiation. It is possible to estimate the radiation exposure of survivors based on where they were during the explosions. Then, cancer rates have been compared between survivors of these cities and similar cities chosen from elsewhere in Japan. If the case can be made that the exposed and unexposed cities had similar cancer rates before the bombs, the opportunity exists to learn about the long-term effects of ionizing radiation across an exposure gradient not usually seen. This is a very strong design if the circumstances allow it.

Describe the major sources of bias that may occur in etiologic studies in humans, and apply this information to interpretation of observational studies

[A: I'm not really sure what this objective is trying to get at.]

Broadly speaking, biases are either selection bias, from the way you're recruiting your participants, or measurement bias, from the way you're measuring the exposures or outcomes. Remember, it must introduce a systematic error to be a bias.

An example of selection bias would be if you were to use a breast cancer screening booth to estimate prevalence of breast cancer. People who are more worried about having breast cancer are more likely to self-select for participation, so your estimates of prevalence will be high.

An example of measurement bias is the recall bias that I described in the case-control section, above.

Define causation

There is no single absolute rule that can be used to decide if an exposure or event E causes an outcome O. The definition they ask us to think about is this:

E occurs. Later, O occurs. Had E been absent, O would not have occurred, all else being held constant.

Describe approaches to determining causation

Counterfactual

Counterfactual: "A measure of effect in which at least one of two circumstances in the definition of variables must be contrary to fact" (4). [A: What a BS definition.]


This just means that you imagine a world where you smoke and you compare it to a world in which you don't smoke; everything else is identical except for the smoking. Then you see what the differences are in outcome. If you get lung cancer in the world in which you smoke, then the smoking caused the lung cancer, since it was the only difference between the two worlds. Obviously, it is impossible to actually do this comparison, hence "counterfactual."

Causality in reality

Points that they want you to know:

You can't prove causation in an individual. For example, there's no way to prove that a specific person got their lung cancer by smoking.

RCTs are the best tool we have for establishing causation.

You can't ethically randomize someone to something that you think will cause harm, so RCTs are limited.

Observational studies are susceptible to bias.

Observational studies are often the first, and sometimes the only, evidence we have.

Observational studies often give the same results as RCTs.

In medicine, we must be practical. Even if we can't prove causation, we can still figure out what the potential risks and benefits are, and act accordingly. Even if we don't have perfect evidence, it might still be prudent to act.

Koch's postulates

Specific to infectious disease:

the suspected agent must be present in all cases

must not be found in other cases of disease

is capable of reproducing disease in experimental animals

must be recoverable from the experimental animals after the disease was reproduced

Bradford-Hill Criteria

These are guidelines, and are not always all necessary. Except for temporality.

1. Strength of association (relative risk, odds ratio): the larger the apparent effect, the more likely it is to be causal [A: This one is really debatable. We're often looking for effects that are very small.]

2. Consistency: do we see the same results over and over again, in different studies?

3. Specificity: it only produces one specific effect of interest [A: Again, very debatable. Smoking does a lot of bad stuff to you besides just lung cancer.]

4. Temporal relationship (temporality): the cause must come before its effect

5. Biological gradient (dose-response relationship): if you increase the exposure, does it increase risk of the outcome?

6. Plausibility (biological plausibility): there's a plausible pathophysiological mechanism of effect

7. Coherence: the effect is compatible with existing theories and knowledge [A: This is similar to biologic plausibility but broader.]

8. Experiment (reversibility): lowering the exposure lowers risk of the outcome

9. [A: Analogy: Could other things explain it? This one isn't in the notes, but was one of the original criteria.]

Describe measures of mortality [NiO]

death rate or mortality rate = (deaths from the disease in a time period) / (population at risk)

case-fatality rate = (deaths from the disease) / (number of people with the disease)

Note that the numerator counts only the cases who die from the disease. That is, the deaths must be attributable to the disease, and don't include those who were, say, run over by a bus.

proportionate mortality = (deaths from the disease) / (total deaths from all causes)

This is just telling you how much of the mortality is caused by a given disease. For example, of all the deaths in Hotel Dieu in 2011, how many (proportionally) were caused by CHF?
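A quick sketch with invented numbers to keep the three measures straight:

```python
# All numbers are invented for illustration
population = 100_000
deaths_all = 900            # all deaths in the population this year
cases = 500                 # people diagnosed with the disease
deaths_from_disease = 50    # deaths attributable to the disease

mortality_rate = deaths_from_disease / population          # 0.0005 = 50 per 100,000
case_fatality = deaths_from_disease / cases                # 0.10: 10% of cases die
proportionate_mortality = deaths_from_disease / deaths_all # ~5.6% of all deaths

print(mortality_rate, case_fatality, round(proportionate_mortality, 3))
```

Note the three different denominators: the whole population, the cases, and all deaths. That denominator is what distinguishes the measures.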


Fundamentals of Biostatistics I

(30 Jan 2012)

Define probability and odds

Probability is the proportion or percentage of successes out of the total number of trials. Odds are the

ratio of successes to failures.

Take a coin flip. You have a probability of getting heads of 1/2 (50% or 0.5), which corresponds to odds of 1/1 (or just 1). If you have a six-sided die, your probability of rolling a one is 1/6 while your odds of rolling a one are 1/5.

More mathematically, if you have N trials, with S successes and F failures, then:

N = S + F

probability of success = P(S) = S / N = S / (S + F)

odds of success = Odds(S) = S / F
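These definitions translate directly into code; a small sketch converting between the two, checked against the coin and die examples:

```python
def prob_to_odds(p):
    """Odds = p / (1 - p): successes per failure."""
    return p / (1 - p)

def odds_to_prob(o):
    """Probability = o / (1 + o): successes out of all trials."""
    return o / (1 + o)

print(prob_to_odds(0.5))    # coin flip: probability 1/2 -> odds of 1
print(prob_to_odds(1 / 6))  # die roll: probability 1/6 -> odds of 1/5 = 0.2
print(odds_to_prob(2))      # casino's 2-to-1 odds -> probability 2/3
```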

[A: Why are they asking you to define probability and odds so long after you were expected to know risk ratios and odds ratios? I don't know.]

Classify different sampling approaches

Sampling scheme — Description / Advantages / Disadvantages

Simple random sample — Every unit has the same, non-zero probability of being selected. Advantages: It's the ideal sampling technique and will not add any bias to the study. Disadvantages: It requires you to have a list of all units in the population, which isn't always feasible.

Systematic sample with random start — For some number N, take every Nth unit on the list, starting at a random unit between the first and Nth. Advantages: Easier than an SRS in certain situations. Disadvantages: If there's any clumping, your sample might not be representative.

Cluster sampling — You sample groups (like neighbourhoods), where you pick a random set of neighbourhoods and talk to everyone in each sampled neighbourhood. Advantages: Efficient for sampling larger populations, especially when you have to actually go out and collect the data (e.g. walk from block to block). Disadvantages: Since it's not a perfectly random sample of people, you'll need to include more people to get the same statistical power as a simple random sample. The number by which the sample size must be multiplied is known as the design effect.

Multistage sampling — You sample in multiple levels or stages. For example, a random sample of cities, then within each city, a random sample of neighbourhoods, then within each neighbourhood a random sample of houses, and finally within each house a random sample of the occupants. Advantages: Best for sampling really large populations. Disadvantages: Same as for cluster sampling. Multistage sampling also makes running and analyzing a study more complex. Analysis must use sampling weights.

Stratified sampling — Samples within defined strata of the population. Can oversample from certain strata. Advantages: Oversampling ensures that enough people are sampled from minorities of interest. Disadvantages: Requires you to be able to stratify the population. When oversampled, analysis requires weighting.

Convenience sampling — Take whomever you can. Advantages: Quick and easy. Disadvantages: Since it's a non-random or non-probability sample, you really can't generalize to the broader population.

Quota sampling — Like stratified sampling where the strata are sampled by convenience rather than randomly. Advantages: Quick and easy. Disadvantages: Since it's a non-random or non-probability sample, you really can't generalize to the broader population.

Simple random sampling is the ideal sampling method, and most statistical analyses assume that your samples were SRS. If you don't use SRS, you often have to adjust your analysis using sampling weights. The non-probability sampling methods are easily biased and often not generalizable.

Simple random sample (SRS)

This is your classic probability sampling technique. First, you enumerate all possible units; that is, you make a list of all the things in the population you're sampling from. This list is your sampling frame. Then you take a completely random sample of units from that list. Every unit has an equal and non-zero probability of being selected for the sample.

For example, drawing 10 students' names out of a hat would give a simple random sample of students. In a class of 38, everyone has the same 10/38 = 26% chance of being selected. The list of students' names that you've torn up and put in the hat is the sampling frame.

SRS is the ideal that all other sampling techniques are trying to approximate. All of our statistics assume that our samples are generated in this way, which is why we need things like the design effect and weighting for cluster or multistage samples.
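[A: A minimal sketch of SRS in Python, using a made-up sampling frame of 38 students (the hat example above, minus the hat):]

```python
import random

# Hypothetical sampling frame: an enumerated list of all 38 students.
frame = [f"student_{i}" for i in range(1, 39)]

# Simple random sample of 10: every unit has the same 10/38 chance
# of selection, and sampling is without replacement (no repeats).
sample = random.sample(frame, k=10)

print(len(sample))       # 10 distinct students from the frame
```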

Systematic sample with random start

As above, you start with your sampling frame. Then you pick a number, which I'll call N. Start sampling, on the list, at a random position between the first and Nth item. Then take every Nth unit after that until you get to the end of the list.

Epidemiology I Course Notes


E.g. Take a list of every student in the class (your sampling frame). Then take every sixth person, starting at random somewhere between the first and sixth person on the list.

This technique can be useful for things like sampling houses from a block. Instead of creating a list of all the houses and consulting a random number table, you simply roll a six-sided die, start at that house, and then take every sixth house after it.

If the sampling frame has any clustering or clumping of data, then your sample may not be representative of the population.
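[A: The die-roll version, as a sketch. The frame of 60 houses is made up; `frame[start::N]` is Python slice notation for "every Nth item from `start`":]

```python
import random

# Hypothetical sampling frame: 60 houses on a block.
frame = [f"house_{i}" for i in range(1, 61)]

N = 6                           # take every 6th house
start = random.randrange(N)     # random start among the first N items (the die roll)
sample = frame[start::N]        # every Nth unit after the random start

print(len(sample))              # 60 / 6 = 10 houses
```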

Cluster sampling

Instead of randomly sampling units, you randomly sample groups of units. For example, instead of sampling people, you randomly select neighbourhoods and include everyone in each of the sampled neighbourhoods. You might be sampling 20 neighbourhoods of 500 people each, but that's not the same thing as sampling 10,000 people at random. In order to get the same statistical power, you need to increase the sample size by some multiple, called the design effect. In a sample with a design effect of two, you would need to include twice as many people as if you had sampled them perfectly randomly.

For example, we're interested in studying med students in Windsor. We pick two years at random and interview everyone in those two years.
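[A: The design-effect adjustment is just a multiplication. A sketch, with made-up numbers:]

```python
import math

# Sketch: inflating a required sample size by the design effect (DEFF).
srs_n = 1000      # sample size needed under simple random sampling (hypothetical)
deff = 2.0        # hypothetical design effect from clustering

# Under cluster sampling you need deff times as many people for the
# same statistical power; round up, since people come in whole units.
cluster_n = math.ceil(srs_n * deff)
print(cluster_n)  # 2000
```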

Multistage sampling

This is like cluster sampling, except instead of including every unit in the sampled groups, we take a sample of them. This has the same design effect as cluster sampling, but analysis is much more complicated and requires sampling weights. Distrust any multistage-sampled study that does not report using sampling weights.

You can have several layers or stages in multistage sampling.

For example, we're interested in studying med students in Windsor. We pick two years at random and interview 30% of students in each of the two years.

Or, a more complex example: say you want to sample all Canadians. You select 50 cities at random, then within each city you select (at random) three census tracts, then four blocks within each census tract, and twelve houses within each block. By weighting your results appropriately, you can generalize from this sample to the entire Canadian population.

[A: I think it's important to point out that, for cluster and multistage sampling, you aren't picking your groups randomly, but rather are picking them with probabilities proportional to size. In the Windsor med student example above, the probability of picking the fourth-year class would be lower than the probability of picking the first-year class, because the first years have more people. If you don't do that, then your weighting won't work out and you can't generalize.]
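[A: A toy sketch of that probability-proportional-to-size (PPS) draw for the first stage. The class sizes are invented, and drawing one unit at a time and removing it is only one simple way to do PPS without replacement:]

```python
import random

# Hypothetical class sizes for four med-school years.
sizes = {"year1": 120, "year2": 110, "year3": 100, "year4": 90}

def pps_sample(units_with_sizes, k):
    """Pick k units without replacement, probability proportional to size."""
    pool = dict(units_with_sizes)
    chosen = []
    for _ in range(k):
        units = list(pool)
        weights = [pool[u] for u in units]           # bigger class, bigger chance
        pick = random.choices(units, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]                               # can't be drawn twice
    return chosen

picked = pps_sample(sizes, 2)
print(picked)   # two distinct years, larger years more likely
```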


Stratified sampling

Stratify the population by some characteristic (age, sex, or whatever) and sample randomly within each stratum. This allows you to oversample from specific strata, which will need to be accounted for by weighting the results in the analysis.

For example, if your study population only has a few young people but you want to be able to make inferences about them as a subgroup, you might stratify on age and sample 50% of the young stratum and 20% of the older stratum.

You can't just calculate a mean now, since your sample has proportionally more young people than the population you're sampling from, and any means that you calculate would be heavily skewed. To adjust for this, you use weighting.
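[A: A sketch of what that weighting looks like. The blood pressures and stratum sizes are invented; the point is that each stratum's mean is weighted by its share of the population, not its share of the (oversampled) sample:]

```python
# Sketch: correcting an oversampled stratified sample with weights.
# Hypothetical strata: young people were oversampled, older people were not.
strata = {
    "young": {"mean_bp": 115.0, "pop_size": 2_000},
    "old":   {"mean_bp": 135.0, "pop_size": 8_000},
}

# Weight each stratum's mean by its share of the *population*.
total = sum(s["pop_size"] for s in strata.values())
weighted_mean = sum(s["mean_bp"] * s["pop_size"] / total for s in strata.values())

print(weighted_mean)   # 131.0, not the naive midpoint of 125
```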

Convenience sampling

You take a sample of whomever you can get. It's a non-probability sampling technique, so it's really hard to generalize it to a larger population.

For example, walking around asking random people on the street to fill out a questionnaire, or going to a fertility clinic and recruiting the patients.

Convenience samples often suffer from volunteer bias, since people who volunteer for a study are likely to be systematically different from the broader population.

Quota sampling

"A method by which the proportions in the sample in various subgroups (according to criteria such as age, sex, and social status of the individuals to be selected) are chosen to agree with the corresponding proportions in the population. The resulting sample may not be representative of characteristics that have not been taken into account" (4).

This is just doing a convenience sample where you continue sampling until you have enough people in all subgroups or strata you're interested in. It's like a cross between a convenience sample and a stratified sample.

Differentiate between levels and types of measurement

Types of variables: continuous, discrete, or categorical

Continuous variables are attributes or characteristics that theoretically have infinitely fine gradations.

For example, weight. You aren't restricted to having weights in whole units (1 kg, 2 kg, etc.); you can have fractions (1.3 kg, 0.45928 kg). Therefore it's continuous.

Discrete variables only exist in distinct units and are expressed in integers or counts.

For example, 1 child, 2 children. You can't have 1.3403 children. Heart rate, in beats per minute, could be considered discrete, since you can't have half-beats. [A: Although that's technically true, most people think of heart rate as continuous.]


Categorical variables have natural categories.

For example, colours.

Dichotomous variables are categorical variables with two categories.

For example, alive or dead, male or female, stroke or not stroke.

Continuous variables are often turned into ordinal, categorical, or dichotomous variables in order to make them useful for regressions and other analyses that assume normal distributions. For example, if age is not normally distributed in your sample, you might dichotomize into old and young based on a certain cutoff, or into age categories like <20 years, 21-40 years, and >40 years.

Levels of variables: interval, ratio, ordinal, nominal

[A: I don't really know what "levels" means. Normally I'd just call all of these variables.]

The interval and ratio levels assume that each interval or unit on the scale is the same as every other interval (that is, equal intervals). The ratio level further assumes that a zero on the scale represents an absence of the phenomenon being measured. All continuous variables are measured at either the interval or ratio level.

For example, the Celsius scale is an interval scale. The difference between 20°C and 21°C is the same as the difference between 30°C and 31°C, but 0°C is an arbitrary designation. Weight is a ratio measure, because the difference between 20 g and 21 g is the same as the difference between 30 g and 31 g, and 0 g is an absence of weight. Because it's a ratio scale, you can also say that 4 g is twice as heavy as 2 g (hence "ratio" measure). Compare that to an interval measure like Celsius, where 4°C isn't twice as hot as 2°C.

The ordinal level represents variables as labels that have an order and can be ranked. Likert scales are a common example.

For example, rating your interest in this class, where 1 is "hate it", 2 is "dislike it", 3 is "don't care either way", 4 is "like it", and 5 is "love it". The distance between hating and disliking isn't necessarily the same as the distance between disliking and not caring, but there is definitely an order to them. A health-related example is stage or grade of cancer.

[A: Ordinal, as in order. Compare this to nominal, as in names.]

The nominal level represents variables as labels that don't have an order. Categorical variables are measured at the nominal level.

For example, colours. There's no inherent order to red and blue. A health-related example is type of cancer.

[A: Nominal, as in names. Compare this to ordinal, as in order.]

Describe a normal distribution and compare to a skewed distribution

A normal distribution is the classic bell-shaped distribution. It is defined by two parameters, the mean and the standard deviation. A skewed distribution, by contrast, has a long tail on one side: if it's right-skewed, it has a long right tail; if it's left-skewed, it has a long left tail.


[A: Understanding the following notes makes some stuff easier to understand but is not necessary for the course.]

There are two types of statistics: parametric and non-parametric. Parametric means that you are assuming that the data follows a parametric distribution, that is, any distribution that can be defined by one or more parameters. Non-parametric means that you aren't assuming the data follows any distribution at all.

The most common parametric distribution seen in the literature is the normal distribution, whose parameters are the mean and the standard deviation. Another common example is the chi-squared distribution, whose single parameter is the number of degrees of freedom.

Non-parametric statistics are less common. They assume nothing about the data, which makes them more robust but also less likely to reach statistical significance.

Understand the mean, median, mode, variance, standard deviation, and range, and be able to calculate the mean, median, mode, and range

You have a sample of data drawn from a population. You want to make inferences about the population. One thing you might be interested in is measures of central tendency, which are used to talk about the "typical" result. Another thing you might be interested in is the dispersion, or spread, of the data, which describes how far away the numbers are from the measure of central tendency you've used.

Assuming that the sample is a simple random sample from the population, the following measures are unbiased approximations of the population.

The mean is your traditional average, where you add all the numbers in your sample and divide by the sample size. The mean is easily skewed by outliers (data points that are really far away from the other data points and don't look like they fit in). With a dichotomous variable coded as 0 or 1, the mean corresponds to the proportion of 1s in the sample.

A percentile is the number below which some percentage of the sample lies. For example, the 95th percentile is the number in the sample that is higher than 95% of the data.

The median is what you get when you line up all the numbers in your sample and take the middle one. Another name for the median is the 50th percentile. The median is not skewed by outliers. [A: The median is usually better than the mean for working with more skewed or non-parametric data. It's basically the non-parametric equivalent of a mean.]

If you have an odd sample size, your median is just the middle number; if you have an even sample size, then your median is the average of the two most middle numbers.

The mode is the most common number in your sample, which corresponds to the highest peak on a histogram. The mode is not skewed by outliers.

A bi-modal distribution is one in which there are two modes, or two large peaks. It usually indicates that your sample has two underlying populations that are mashed together, like if your sample has both men and women but they respond very differently to treatment.
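[A: A quick sketch with an invented sample, showing how an outlier (40) drags the mean but not the median or mode. Python's standard library has all three:]

```python
import statistics

# Hypothetical sample with one outlier (40).
data = [2, 3, 3, 4, 5, 5, 5, 6, 40]

print(statistics.mean(data))     # 73/9, about 8.1: pulled upward by the outlier
print(statistics.median(data))   # 5: the middle of the sorted sample
print(statistics.mode(data))     # 5: the most common value
```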


The variance is the mean squared distance from the mean of the data. Standard deviation (SD) is the square root of variance. Both measure how spread out the data is; the smaller the variance (and hence standard deviation), the more closely your data are grouped around the mean.

Variance = s² = Σᵢ₌₁ⁿ (xᵢ − x̄)² / (n − 1)

Standard Deviation = s = √Variance

Although it's technically wrong, you can think of the SD as sort of like the average distance from the mean.

[A: You divide the sum of squared deviations by n − 1 instead of by n because you're working with a sample. There's a very nice mathematical reason for it, but all you need to know is that it makes sure that it's an unbiased estimate of the population variance.]

The interquartile range (IQR) is the difference between the third and the first quartile; that is, between the 75th and 25th percentiles. You find the smallest data point that is higher than 75% of the data and subtract the smallest data point that is higher than 25% of the data. [A: The IQR is basically the non-parametric equivalent of the standard deviation.]
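[A: The dispersion measures, sketched on an invented sample. Note that `statistics.variance` already divides by n − 1, and `statistics.quantiles` is one of several slightly different conventions for computing quartiles:]

```python
import statistics

data = [4, 8, 6, 5, 3, 7, 9, 5]   # hypothetical sample, mean = 5.875

var = statistics.variance(data)   # sample variance (divides by n - 1): 4.125
sd = statistics.stdev(data)       # square root of the variance, about 2.03

# Interquartile range: Q3 minus Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                     # 7.75 - 4.25 = 3.5 with this method

print(var, sd, iqr)
```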

Identify the most appropriate measurement for central tendency and dispersion for different levels of measurement including interquartile range and standard deviation

Nominal/categorical data: central tendency is the mode; there is no standard measure of dispersion.

Ordinal data: central tendency is the median or mode; dispersion is the IQR.

Interval or ratio data: central tendency is the mean, median, or mode; dispersion is the SD (or variance) or the IQR.

[A: The above list has the measures ordered by (my) preference.]


Fundamentals of Biostatistics II

(6 Feb 2012)

Distinguish between estimation and hypothesis testing

Hypothesis testing tests hypotheses. Your H0, or null hypothesis, is a falsifiable statement that is evaluated using p-values. Null hypotheses are usually statements about "no effect" or "no difference", which are set up in contrast to an alternate hypothesis that assumes some effect or difference.

Estimation gives estimates. Instead of asking a yes-or-no question, you're asking for a number that is often accompanied by a range (the confidence interval). [A: Hard stuff, right guys?]

It's the difference between asking "is there a difference in mean blood pressure between treated and control groups?" (hypothesis testing) and "how big is the difference in average blood pressure between treated and control groups?" (estimation). Generally, epidemiologists prefer estimates, for reasons that you will see below.

Interpret p-value and confidence interval

The p-value is used in hypothesis testing. You set your null and alternate hypotheses, and you gather your data. You can then calculate a p-value, which answers the following question: assuming that the null hypothesis is true, what is the probability of seeing data at least as unlikely as ours? That first part is bolded because it's important. You are not calculating the probability that your alternate hypothesis is true; only Bayesian statistics can do that. You are assuming that there is no effect or difference (the null hypothesis), and asking what percentage of a theoretically infinite number of trials would have found results at least as far out as yours.

Think about that for a minute: this is why our significance threshold (alpha) of p < 0.05 corresponds to a 5% false-positive (Type I) rate. The p-value corresponds to the proportion of results at least as extreme as ours that will be false positives.

The confidence interval (CI), in contrast, gives an idea of how precise an estimate is. The theoretical situation is this: if we could repeat the exact same experiment an infinite number of times, 95% of the calculated 95% confidence intervals will include the true population value. The true population value that we're estimating will be contained within 95% of those 95% CIs.

Think about that for a minute: what will affect the confidence interval? If we sample a larger proportion of the population, we'll be able to say that we're closer to the true population value, so larger sample sizes make tighter confidence intervals. But if the data points are all really different, then we'll be less confident saying that we're close to the true population value, so a larger spread (the variance or standard deviation) makes wider confidence intervals.

The null value is the value that corresponds to no effect or difference. For RRs, ORs, and IRRs, the null is 1. If a 95% CI contains the null value, it is not statistically significant at a 5% false positive rate (that is, we know that a corresponding p-value would be greater than 0.05 if we calculated it). If a 95% CI does not contain the null value, then it is statistically significant (a corresponding p-value would be less than 0.05).

And think for another minute: what happens if we're estimating a difference or effect where the true population value is the null value? Then 95% of the 95% CIs we calculate will contain the null value and 5% will not: a 5% false positive rate!
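[A: You can see the "95% of 95% CIs contain the truth" property directly by simulation. This sketch uses made-up numbers and assumes a known population SD, so the simple normal-based CI (mean ± 1.96 × SD/√n) applies:]

```python
import random
import statistics

random.seed(1)   # fixed seed so the sketch is reproducible

true_mean, sd, n, trials = 100.0, 15.0, 50, 2000
half_width = 1.96 * sd / n ** 0.5   # half-width of a 95% CI with known SD

covered = 0
for _ in range(trials):
    # "Repeat the exact same experiment": draw a fresh sample each time.
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    m = statistics.fmean(sample)
    # Does this trial's 95% CI contain the true population mean?
    if m - half_width <= true_mean <= m + half_width:
        covered += 1

coverage = covered / trials
print(coverage)   # close to 0.95, as advertised
```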

Confidence intervals are better than p-values. They can test all the same hypotheses as p-values (as we've just seen), but they also give us an idea of how precise the estimates are and how strongly we should interpret the statistical significance or non-significance.

Tight confidence intervals suggest that the sample size was appropriate; a negative result with a small confidence interval probably really is a true negative. Wide confidence intervals suggest that the sample size was inappropriate; a negative result could be a true negative but could also be due to the sample size being too small to detect the difference.

The following examples show how to interpret confidence intervals of a relative risk:

Estimate 2.1, 95% CI 0.8 to 3.1: Not statistically significant (includes the null value of 1), and a tight confidence interval. We should feel safe saying that there's no effect in this study (assuming no bias).

Estimate 2.1, 95% CI 2.0 to 3.6: Statistically significant and a tight confidence interval. We should feel safe saying that there is a true effect in this study (assuming no bias).

Estimate 2.1, 95% CI 1.4 to 37: Statistically significant and a wide confidence interval. We shouldn't feel too safe saying that there is a true effect in this study (assuming no bias).

Estimate 4.2, 95% CI 0.95 to 10.6: Not statistically significant (includes the null value of 1), with a wide confidence interval. Given that the confidence interval is wide and mostly on one side of the null, we're more cautious; it may still represent a clinically significant effect (assuming no bias).

[A: I added the bias thing in because I think that's important to think about. P-values and CIs only tell us about the statistical probabilities given a theoretically perfect situation where we have perfect simple random sampling and no bias.]

We can also do hypothesis testing for statistically significant differences between two groups by comparing their confidence intervals. If you are comparing two estimates (like blood pressure between two groups), then the difference is statistically significant at an alpha of 0.05 (5% false positive rate) if the two 95% confidence intervals do not overlap.

For example, if smokers have a mean blood pressure of 180 mmHg (95% CI 150-190) and non-smokers have a mean blood pressure of 120 mmHg (95% CI 100-130), the confidence intervals do not overlap, so a statistically significant difference exists at an alpha < 0.05 level.

Because CIs give you more information than p-values, you should always report confidence intervals instead of p-values. Remember to also consider whether the results are clinically significant, not just whether they are statistically significant.

Define and interpret Type I error (alpha)

Type I error is when you reject a null hypothesis (that is, conclude that there is a difference or effect) when there really isn't a true difference or effect. The alpha (α), or false positive rate, is the proportion of false positives that you're willing to accept (assuming that your results are only affected by random chance).

Type I = false positive

Define and interpret Type II error (beta)

Type II error is when you accept a null hypothesis (that is, conclude that there is not a difference or effect) when there really is a true difference or effect. The beta (β), or false negative rate, is the proportion of false negatives that you're willing to accept (assuming that your results are only affected by random chance).

Type II = false negative

Define and interpret power

Power = 1 − β. The most common power you'll see is 80%, or β = 0.2. People usually talk about power in the context of sample size calculations. For some reason, people talk about α and power instead of α and β.

Identify factors required for sample size and power calculations

The four components are: Type I error, Type II error, effect size, and sample size. Power corresponds to 1 − β. Power decreases with a more stringent α (lower Type I error), increases with a larger effect size, and increases with a larger sample size.

You want to detect a difference between two groups, such as control and intervention. The sample size that you need for an experiment is determined by your desired alpha and power and the minimum effect size you want to detect. For our purposes, sample size calculations refer to the size in each group.
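[A: A sketch of how those four pieces fit together, using the standard normal-approximation formula for comparing two means, n per group ≈ 2(z₁₋α/₂ + z₁₋β)²/d², where d is the standardized effect size (difference divided by SD). This is a back-of-the-envelope version, not a replacement for proper power software:]

```python
from statistics import NormalDist

def n_per_group(alpha=0.05, power=0.80, effect_size=0.5):
    """Approximate n per group for a two-sample comparison of means,
    given a standardized effect size (difference / SD)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided Type I error
    z_beta = z.inv_cdf(power)            # power = 1 - beta
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2

print(round(n_per_group()))   # 63 per group for a "medium" effect of 0.5
```

Note how the formula behaves the way the text says: a more stringent α or a smaller effect size pushes the required n up, and a larger effect size pulls it down.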

Distinguish between a negative and an underpowered trial

Corresponding to the last objective, there are a number of explanations for a negative (statistically non-significant) result:

- α too low;
- sample size too small;
- effect size too small;
- or there really is no effect!

Define and interpret statistical interaction [NiO]

Interaction is when the effect of an intervention or exposure is modified by a third variable. Interaction should always be considered when interpreting studies.

An example is asbestos exposure and lung cancer. Asbestos causes lung cancer; smoking causes lung cancer. The risk of lung cancer among smoking asbestos miners is expected to be high, and you can calculate what you would expect that risk to be. But people who are exposed to both have an even higher risk of lung cancer than what you calculate it should be. The effect of asbestos on lung cancer is increased in the presence of smoking. In this case, they're working synergistically to increase the risk of lung cancer above and beyond what one would naturally expect.

Interpreting multiple/multivariate regression [NiO]

Let's say that you're reading about a multivariate logistic regression that predicts CHF based on presence of ischemic heart disease and several other variables. Logistic regressions give you odds ratios. If the result of this logistic regression was that heart disease has an OR of 1.98, then you would say, "The odds of CHF are 1.98 times higher among patients with ischemic heart disease than those without, assuming that the other variables are held constant (i.e. adjusting for the other variables)."


Are the Results Valid? I

(13 Feb 2012)

Compare and contrast observational (cohort, case-control and case series) and experimental studies

We've already seen some of this. The main difference between observational and experimental studies is that, in an experimental study, we assign the exposure status to the participants.

The major problem with causal inference in cohort studies is confounding: the possibility that an association between our exposure and outcome exists only because of some third variable (the confounder) that's affecting both exposure and outcome. With observational studies, we can adjust for confounders that we know of and can measure. With RCTs, however, the randomization adjusts for all confounders, known and unknown, measurable and unmeasurable. Randomization is like magic that removes confounding.

According to the prof, only RCTs can establish causality. [A: I personally think our prof emphasizes this point way too much, since it's definitely possible to do causal inference without an RCT.]

Define study population and inclusion criteria

A trial requires inclusion and exclusion criteria for who you will include in the study. More restrictive criteria increase power but limit generalizability (also called external validity).

A homogeneous population, where everyone is very similar, will give you more statistical power but will limit the study's external validity (your ability to generalize beyond the study sample). A heterogeneous population will improve external validity but will add a lot of noise to the data and limit your ability to detect an effect. You need to find a balance between homogeneity and heterogeneity of your sample.

Define randomization and allocation concealment and differentiate between the two

Randomization is where the participants are randomly assigned to either the intervention/exposure group or the control group.

Allocation concealment means that you cannot predict who has been randomized to what group. It's to prevent the researchers from picking the "right" envelope for their patient. "Bobby, I really think you should wait until after this next person goes..." This concept is related to blinding, except allocation concealment is about the randomization process (making sure that the randomization is actually random) and blinding is about making sure that no one finds out after randomization.

Allocation concealment makes sure that randomization is actually random. It has to do with the people who are in charge of allocating people to one group or another. All studies need allocation concealment, but not all need blinding. Take a surgical intervention being compared to medical treatment. You can't blind anyone to the intervention in that case, since it's pretty obvious who's getting surgery. But you still need allocation concealment to prevent the person who's doing the randomization from knowing who they're sending where. If the person who's randomizing can check the next envelope to determine what group the next person will be randomized to, they may start sending people they think need surgery to the surgery group and people they think need medical care to the medical care group. The worst example of this that I remember hearing about was that the residents didn't want to do the experimental surgery without the attending surgeon, so when he wasn't there (usually at night), they randomized everyone to the control group.

You have to be extra careful with block randomization, since the allocator may be counting and will be able to predict the next random assignment.

Since this is a bit of a confusing concept, here are some more definitions of allocation concealment:

"[Allocation concealment is] a method of generating a sequence that ensures random allocation between two or more arms of a study without revealing this to study subjects or researchers. The quality of allocation concealment is enhanced by computer-based random allocation and other procedures to make the process impervious to allocation bias" (6).

"Allocation concealment helps to minimize selection bias by shielding the randomization code before and until the treatments are administered to subjects or patients, whereas blinding helps avoid observer bias by protecting the randomization code after the treatments have been administered" (7).

Randomization and allocation concealment are important because they ensure that, in aggregate, the two groups are more or less the same at the start of the trial. Any confounders are now distributed at random, so they can't bias our conclusions about the effect of our intervention or exposure. This accounts for all known and unknown confounders.

Define block and stratified randomization

Let's say we're assigning people to intervention or control using a random number table. If our allocation is completely random, we may randomize nearly everyone to the experimental arm just by random chance. Block randomization is where you're forcing equal numbers to be allocated to intervention and control groups. The way we do this is by breaking our random number table into blocks; then, within each block, we require there to be equal numbers of intervention and control assignments, but in a random order within the block.

For example, check out the following simple random allocation table. Oops, we've randomized almost everyone to the intervention!


Simple Randomization:
Intervention, Control, Intervention, Intervention, Intervention, Intervention, Intervention, Control, Intervention, Intervention

Compare this to block randomization, below, where we're forcing a 50-50 split. In the table below, our block size is 4, so that every four participants include two interventions and two controls.

Block Randomization:
Block 1: Intervention, Intervention, Control, Control
Block 2: Control, Intervention, Intervention, Control
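[A: A minimal sketch of block randomization with a block size of 4. The function name and the 50-50 Intervention/Control split are just illustrative:]

```python
import random

def block_randomize(n_participants, block_size=4):
    """Allocation list with equal Intervention/Control counts in every block."""
    assignments = []
    while len(assignments) < n_participants:
        # Each block holds exactly half Intervention, half Control...
        block = (["Intervention"] * (block_size // 2)
                 + ["Control"] * (block_size // 2))
        random.shuffle(block)   # ...in a random order *within* the block.
        assignments.extend(block)
    return assignments[:n_participants]

alloc = block_randomize(8)
print(alloc)
print(alloc.count("Intervention"))   # 4: the forced 50-50 split
```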

Stratified randomization makes sure that certain subgroups of interest are randomized equally between the intervention and control groups. We stratify based on our variable (e.g. gender), and then do block randomization within each stratum. We do this if we don't trust randomization to ensure equal representation of a very important variable between the two groups.

We should only stratify on factors that have a known and important effect on the outcome.

Define blinding, and recognize studies where blinding may not be possible

Blinding refers to making sure that the treatment assignment is not known. Single-blinded usually means that the participants don't know if they're in the treatment or the control arm. Double-blinded usually means that neither the participants nor the data collectors know that. If possible, the data analysts should also be blinded. Ideally, no one in the world knows who is in what arm until after the data has been analyzed, except for the members of the data and safety monitoring board (who will be discussed later, I hope).

Blinding is important. People who know that they're in the control arm are often more likely to drop out, and people who know that they're in the treatment arm are more likely to experience a placebo effect. People who see no effect, or who only experience bad side effects, are more likely to drop out. If people are dropping out differently because of their treatment/control allocation, that's a bias. If people are reporting results differently because of their treatment/control allocation, that's a bias. Bias sucks, so we need to keep everyone as blinded as possible.

People can unblind themselves in any number of ways. If you're comparing Vitamin C to placebo, you have to somehow mask the taste of the pill, because people know what Vit C tastes like. If you're comparing a drug to placebo where the drug causes side effects like a headache, people will know what arm they're in unless you use an "active placebo" that also causes those same side effects. It's tricky to blind people to treatment, and it takes a great deal of care to do properly.

But at the end of the study, how do we know if our blinding worked? The easiest, simplest (and only?) way to measure the effectiveness of blinding is by asking participants what arm they think they were allocated to. You then compare their responses to their actual allocation to see if there is statistical evidence that blinding was not effective. Of course, researchers and companies hate doing this. It's a lot easier to publish if you haven't disproved your own results. Ignorance is bliss. (Don't be like them; be good and try to measure blinding effectiveness.)

It's pretty intuitive when you can't blind studies. If you're randomizing to surgery or medical treatment, it's pretty hard to blind people.

Dene intention-to-treat analysis, and describe advantages

People drop out of studies, they die, and they don't adhere to their treatment. There are two ways of

dealing with drop-outs, deaths, and non-compliance: per-protocol analysis and intention-to-treat

analysis.

Per-protocol analysis only looks at people who fully completed the treatment or control that they were

allocated to. Intention-to-treat analysis includes all people who were randomized, regardless of how

compliant or alive they were.

Intention-to-treat is generally regarded as less biased, since it's better at dealing with loss to follow-up (death or dropouts). It's also a better measure of the real-life effectiveness of a treatment, since we can only write a prescription or referral (that is, allocate them to treatment) but we can't make sure that they actually comply.

If the treatment is so horrible that most patients stop taking it, intention-to-treat analysis will be much less biased than per-protocol. Consider that some people will get better and some will get worse, regardless of treatment. Since people who see a completely random improvement are more likely to continue taking their treatment in the face of horrible side effects, the only people left in the treatment arm (with side effects) are those who had random improvements. People in the control arm (with less awful side effects, since it would be unethical to give truly awful side effects) mostly adhere, including those who randomly improved and those who did not. A per-protocol analysis will then show that the treatment has an effect, even though it didn't do anything except cause the non-improvers to drop out! Awful. Always prefer intention-to-treat analysis, and be suspicious of studies that do not.

And if there's a huge difference between intention-to-treat and per-protocol analyses, then consider if there's a problem with the treatment, the implementation, the randomization, the allocation concealment, or the

blinding. There's something fishy going on there. Everyone who is randomized should be included in the analysis.

Intention-to-treat is also more reflective of the real-world effectiveness of the treatment. We can only allocate people to treatment or no treatment. You can only write a prescription or a referral; you can't actually force your patient to follow through. If there's a miracle drug that's so god-awful to take that no patient actually complies, then what's the point of writing a prescription for it? ITT analysis accounts for this, while per-protocol would show the amazing results for the 1 out of 100 people that actually followed through on treatment. That's almost useless to me as a physician.

ITT analysis sometimes underestimates the effect of a treatment. That is, it's a conservative estimate of the treatment's effect.

Understand what the CONSORT RCT reporting guidelines are

[A: I've added this in because I think that these reporting guidelines are hugely important. CONSORT is for RCTs, but similar guidelines exist for observational studies and meta-analyses, as well.]

CONSORT is a well-thought-out set of guidelines for the proper reporting of randomized clinical trials. It very clearly explains everything that you need to report (and therefore, that you need to have thought about). If you ever want to do a thorough analysis of any RCT, pull out the CONSORT checklist and see which items they aren't reporting.

One of the most basic things that CONSORT says is that Table 1 should be baseline demographics and Figure 1 should be the flowchart below. If you ever read an RCT that does not have those two things, throw it away; it's useless.

The lecturer consistently uses CONSORT to refer specifically to the CONSORT-recommended flowchart. I don't know why.

CONSORT 2010 Flow Diagram

From: Schulz KF, Altman DG, Moher D, for the CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c332.

For more information, visit www.consort-statement.org.

[The diagram itself did not survive into these notes; its boxes, top to bottom, are:]

Enrollment:
  Assessed for eligibility (n= )
  Excluded (n= )
    Not meeting inclusion criteria (n= )
    Declined to participate (n= )
    Other reasons (n= )
  Randomized (n= )

Allocation (one box per arm):
  Allocated to intervention (n= )
    Received allocated intervention (n= )
    Did not receive allocated intervention (give reasons) (n= )

Follow-up (one box per arm):
  Lost to follow-up (give reasons) (n= )
  Discontinued intervention (give reasons) (n= )

Analysis (one box per arm):
  Analyzed (n= )
  Excluded from analysis (give reasons) (n= )

Are the Results Valid? II

(27 Feb 2012)

Identify and interpret baseline data in a clinical trial

Due to randomization, the two groups should be no different in baseline characteristics, at least not statistically significantly different.

[A: This is one of the few times that I'll say it's okay to report p-values instead of confidence intervals.]

Identify attrition, and discuss possible effects on results of clinical trial

Attrition refers to the loss of participants during a trial. If people are leaving the trial for reasons related to their allocation to treatment or no treatment, then it introduces a bias. If the attrition is the same in both groups (that is, unrelated to allocation), then it doesn't bias the results but does reduce your sample size and therefore power.

Compare and contrast efficacy and effectiveness, and internal and external validity

Efficacy is the effect of the treatment under ideal conditions, with the patients chosen very carefully and lots of attention paid to them by their health-care providers. Effectiveness is the effect of the treatment under real-world conditions, where the criteria for treatment are a lot looser and health-care providers are busier.

Internal validity refers to the validity of the results of our study. Low bias means high internal validity.

Good study design means high internal validity.

If we're studying med students at UWO, we take a sample of med students from the university and want to generalize to all med students in the university. If we have good internal validity, we can do that.

External validity refers to the broader generalizability of our results to larger populations or different

situations.

If we're studying med students at UWO, we take a sample of med students from the university for our study. Since we probably shouldn't generalize our results to med students at McGill, our study has low external validity.

You can see that efficacy studies are designed for high internal validity but low external validity, since we want a really good answer within our population but don't care about generalizing from the efficacy study to other populations or situations. Effectiveness trials, in contrast, are designed for high external validity, since we want the results to be more generalizable.

Dene bias, and recognize different sources of bias in studies, including

publication bias

For a review of bias more generally, see Fundamentals of Epidemiology I (16 Jan 2012) and

Fundamentals of Epidemiology II (23 Jan 2012).

RCTs are less susceptible to bias than observational studies. Selection bias and confounding are

minimized by randomization and allocation concealment. Measurement bias is minimized by proper

blinding. However, RCTs are not immune to bias, and pharmaceutical companies love to calculate systematically inflated (or biased) estimates of the effect of their drugs. [A: Thank goodness for kind-hearted, honest epidemiologists. Right, guys?]

One really interesting bias is called publication bias, which refers to the fact that positive results are more likely than negative results to be published, especially in high-impact journals. If your study shows a huge effect of the intervention, you're gonna get published. If your study shows no effect, you'll have to fight to get published anywhere and are guaranteed not to get into a high-impact journal.

Imagine you're a pharmaceutical company. The only results that you want doctors to see are the ones that show a benefit, and hopefully a large benefit. You have an incentive to bury any negative trials and only publish the positive ones.

[A: PLoS ONE is a great journal trying to singlehandedly fight publication bias. It publishes all research submitted to it that is well-done and well-reported, regardless of the results.]

A way to reduce publication bias is to use a clinical trials registry, where all clinical trials must submit their protocol before they start and their results upon completion. That way you can find the results of all RCTs, even if they never got published.

When you're doing a systematic review of clinical trials, always search the clinical trials registries!

You can assess publication bias using a funnel plot. If you want to know more about those, Google it; I'm tired.

RCTs: What are the Results? I

(5 Mar 2012)

Describe the problem of multiplicity in analysis, and apply this information with

respect to interpretation of subgroup analysis, multiple and secondary outcomes

If you do enough statistical tests, you will eventually find something that is statistically significant just by random chance. The probability of getting a false positive result increases with the number of subgroups, outcomes, and time points being compared. This is referred to as multiplicity, multiple-hypothesis testing (if you're using p-values), or multiple comparisons.

If you do 20 tests at a 0.05 level of significance, you are saying that you expect one of those to be statistically significant by chance alone. Every subgroup you test, every outcome you test, increases your chance of getting such a false positive. If you test five subgroups for four different outcomes each, you expect at least one false positive. If you read a study that reports lots of p-values or confidence intervals, start to worry.

There are a couple of ways to statistically adjust for this issue, all of which really amount to making the alpha smaller (i.e. stricter). If you need to do this, ask a statistician.
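The arithmetic behind this is easy to check yourself. Here is a small sketch (mine, not from the lectures) of how the chance of at least one false positive grows with the number of independent tests, plus the simple Bonferroni correction that shrinks the alpha:

```python
def prob_false_positive(n_tests, alpha=0.05):
    # If each test has false-positive rate alpha and the tests are
    # independent, the chance that at least one comes up
    # "significant" by luck alone is 1 - (1 - alpha)^n.
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(n_tests, alpha=0.05):
    # Bonferroni correction: test each comparison at alpha / n to
    # keep the family-wise false-positive rate near alpha.
    return alpha / n_tests

# Twenty tests at the usual 0.05 level:
print(round(prob_false_positive(20), 2))  # 0.64
print(bonferroni_alpha(20))               # 0.0025
```

So with 20 tests you have roughly a two-in-three chance of at least one spurious "significant" result, which is exactly why undeclared subgroup hunting is so dangerous.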

There was an HIV vaccine trial once where the vaccine didn't work, but the researchers kept doing subgroup analyses until they found a subgroup that showed a statistically significant result. They hadn't defined the subgroups beforehand, and only studied them after the fact when they found them to be statistically significant. We call this fishing or data dredging, and the conclusions should only be considered to be hypothesis-generating. It sucks to get a negative result, and it can be tempting to keep looking until you get one, but be careful.

Differentiate between primary and secondary outcomes, and apply this

information to clinical trials

This is pretty intuitive. Remember our PICO format? The O is for primary outcome, the main thing we're interested in. If you could only measure a single outcome, the one you would choose is the primary outcome.

You can have any number of secondary outcomes, which are any other events or outcomes that you might be interested in. Due to the problem of multiple comparisons (as in the above objective), these secondary outcomes are usually used more for generating new hypotheses than for making strong conclusions.

In an RCT of some drug on heart failure, you probably want death to be the primary outcome. You may also wish to collect information on morbidity, though, like days spent in hospital or risk of hospitalization. Think of digoxin. It doesn't extend life, but thankfully they looked at secondary outcomes and found that it potentially improved quality of life.

Define composite outcome

A composite outcome combines several outcomes into a single variable. It includes people who have had any of the combined outcomes.

For example, a composite outcome for cardiovascular events might include people who have had either

myocardial infarction or stroke.

Composite outcomes make it harder to interpret the results and are sometimes combined when they shouldn't be.

For example, if a drug leads to a decrease in the composite outcome of death or chest pain, it could mean that there were fewer deaths and less chest pain, but it is also possible that the composite was driven by decreased chest pain and no change or even an increase in death.

Describe the valid use of composite outcomes

Composite outcomes should be pre-defined (i.e. defined a priori), clinically meaningful, important to patients, and biologically plausible. You should be careful that one of the components isn't skewing the results of your composite outcome.

The components of the composite outcome should also be individually defined as secondary outcomes.

Describe rationale for using composite outcomes

Composite outcomes increase statistical power. That is, you don't need as large a sample size to get statistical significance. They also allow you to combine several things that you expect to change, so that you can analyze all of them at once.

Describe potential problems with subgroup analyses

See multiple hypothesis testing, above. Subgroups have fewer participants, so they also have sample

size (i.e. power) issues.

Describe criteria for valid subgroup analyses

1. Subgroups should be defined a priori (i.e. before the study starts).

2. All subgroup analyses that are done should be reported.

3. The subgroup should be biologically different.

4. A difference in effect within the subgroup should be biologically plausible.

5. There should be statistical evidence of this difference in effect.

[A: I would say that these are in order of importance. 1 & 2 are necessary, the rest are nice to have.]

Interpret subgroup analyses in a clinical trial

They should be considered to be hypothesis-generating and should not change your clinical practice.

The professor believes that it's better to present your subgroup analyses in the guise of interaction analysis. I agree.

Define interim analysis

An interim analysis is when you analyze the data before the study has finished. You still do the final analysis after completing the study. Interim analyses are usually done by an independent data monitoring committee in order to determine if the study should be stopped early (see next objective).

Describe reasons for early termination of clinical trials

The criteria for early termination must be set before the trial starts. The main reason to end a study early is harm, if you discover that the intervention is actually hurting people. Alternately, you may stop because of benefit, if the treatment is helping so much that you can't ethically withhold it from the control group. The final reason is to stop because of futility, if you've gathered enough data to say conclusively that the intervention is not working.

Define surrogate outcome, and recognize use of surrogate outcomes in a clinical trial as well as potential drawbacks of the use of surrogates

It can sometimes be hard to measure what you really care about but easy to measure something that we think is a pathophysiological intermediate on the way to the outcome we care about. A surrogate outcome is what we measure when we can't measure the outcome we really care about.

Take hypertension treatments: it can be difficult to get funding to follow people until they die (what we care about), but it's easy and quick to measure blood pressure (our surrogate outcome).

Extrapolating from surrogate outcomes to the primary outcome that we're really interested in can be misleading.

Describe study phases in clinical trials

There are four phases:

I. Safety: Screening for safety

II. Efficacy: Establishing the testing protocol

III. Effectiveness: Final testing in a large-scale RCT

IV. Post-approval: Monitoring the drug as it's used in the population at large

RCTs are almost always phase III studies. You can identify harmful side effects in all phases. Phases I-

III are required for approval by Health Canada or the US FDA. Phase IV studies can lead to the drug

being pulled from the market, like with Vioxx.

Describe problems of adverse event recognition including the use of the rule of

three

Straight from Wikipedia: If no major adverse events occurred in a group of n people, there can be 95% confidence that the chance of major adverse events is less than one in n / 3. This means that the upper bound on the 95% confidence interval on the adverse event rate is approximately 3 / n.

Further from Wikipedia: For example, in a trial of a drug for pain relief in 1500 people, none have a major adverse event. The rule of three says we should have 95% confidence that the rate of adverse events is no more frequent than 1 in 500.

For example, if 14 people were treated and none of them developed an adverse event, we can be 95% confident that the true rate of adverse events is 3/14 = 0.214 = 21% or less.
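The rule is simple enough to sketch in a few lines of code (mine, not from the lectures):

```python
def rule_of_three(n):
    # With zero events observed in n participants, the upper bound of
    # the 95% confidence interval for the event rate is roughly 3 / n.
    if n <= 0:
        raise ValueError("n must be positive")
    return 3 / n

# 0 adverse events in 1500 people: rate is at most about 1 in 500
print(rule_of_three(1500))  # 0.002
# 0 adverse events in only 14 people: rate could still be about 21%
print(round(rule_of_three(14), 3))  # 0.214
```

The second call is the important one: a small trial with zero observed events tells you almost nothing about rare harms.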

RCTs: What are the Results? II

(19 Mar 2012)

Differentiate between dichotomous and continuous outcomes

[A: Just check Fundamentals of Biostatistics I (30 Jan 2012).]

Dichotomous outcomes are generally easier to analyze and are the basis of the two-by-two tables. For

our examples, well only be considering dichotomous outcomes.

Examples of dichotomous outcomes are death versus no death, stroke versus no stroke, and acute

myocardial infarction versus no acute MI. Examples of continuous outcomes are change in blood pressure,

change in BMI, and change in CD4+ count.

When provided with information from a clinical trial, develop a 2x2 table

[A: Y'all love the two-by-two tables, right?] You should be able to create a two-by-two table from the paper's flow diagram and tables. Remember that the two-by-two table will have different numbers if you're doing per-protocol analysis instead of intention-to-treat.

As a refresher, here's the standard RCT two-by-two table:

              Disease +   Disease -
Treatment     a           b
No treatment  c           d

I'll now work through a quick example, to make the distinction between intention-to-treat and per-protocol analysis clear. Feel free to skip it, I won't be hurt.

Below are Figure 1 and Table 3 from the PROactive study, which looked at the effect of pioglitazone on stroke (and several other cardiovascular outcomes).

As you can see from the flow diagram, they did an intention-to-treat (ITT) analysis. Good! ITT is calculated based on the participants' allocation, and does not consider whether or not they completed the protocol or

Published in: Lancet (2005), vol. 366, iss. 9493, pp. 1279-1289
Status: Postprint (Author's version)

All time-to-event analyses were done by fitting a proportional hazards survival model with treatment as the only covariate. The proportional hazards assumption was tested with the method described by Grambsch and Therneau.[12] Homogeneity of response was examined by testing for interaction in each of 25 prespecified sets of subgroups. We used linear models or logistic regression models for other endpoints, as appropriate. All analyses were by intention to treat.

This study is registered as an International Standard Randomised Controlled Trial, number NCT00174993.

Figure 1: Trial profile [the flow diagram image itself is not reproduced here]

Role of the funding source

The study was designed by the international steering committee, who also approved the protocol and amendments. The sponsors had two representatives on the international steering committee and the same two were also members of the executive committee. Data analysis, data interpretation, and writing of the report was done by the executive committee, with contributions from the international steering committee, the data and safety monitoring committee, and the endpoint adjudication committee. All the authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Table 3: Numbers of first events contributing to the primary composite and main secondary endpoints

                                     Primary composite endpoint     Main secondary endpoint
                                     Pioglitazone   Placebo         Pioglitazone   Placebo
                                     (n=2605)       (n=2633)        (n=2605)       (n=2633)
Any endpoint                         514            572             301            358
Death                                110            122             129            142
Non-fatal MI (excluding silent MI)   85             95              90             116
Silent MI                            20             23              NA             NA
Stroke                               76             96              82             100
Major leg amputation                 9              15              NA             NA
Acute coronary syndrome              42             63              NA             NA
Coronary revascularisation           101            101             NA             NA
Leg revascularisation                71             57              NA             NA

MI = myocardial infarction. NA = not applicable. This table describes the events that make up the primary composite endpoint, so if death is not the first event, it does not appear.

Table 4: Effect of pioglitazone and placebo on each component of the primary endpoint

                                     First events                                Total events
                                     Pioglitazone   Placebo   HR (95% CI)        Pioglitazone   Placebo
Death                                177            186       0.96 (0.78-1.18)   177            186
Non-fatal MI (including silent MI)   119            144       0.83 (0.65-1.06)   131            157
Stroke                               86             107       0.81 (0.61-1.07)   92             119
Major leg amputation                 26             26        1.01 (0.58-1.73)   28             28
Acute coronary syndrome              56             72        0.78 (0.55-1.11)   65             78
Coronary revascularisation           169            193       0.88 (0.72-1.08)   195            240
Leg revascularisation                80             65        1.25 (0.90-1.73)   115            92
Total                                                                            803            900

Data refer to first event of that particular type. MI = myocardial infarction.

Table 5: Hazard associated with relevant baseline characteristics* for the main secondary endpoint

                                                           HR (95% CI)        P
Age (year)                                                 1.05 (1.04-1.06)   <0.0001
Previous stroke                                            1.71 (1.40-2.08)   <0.0001
Current smoker (vs never smoker)                           1.70 (1.34-2.16)   <0.0001
Past smoker (vs never smoker)                              1.19 (1.00-1.42)   0.0512
Creatinine >130 μmol/L                                     1.67 (1.20-2.31)   0.0022
Previous myocardial infarction                             1.49 (1.25-1.78)   <0.0001
HbA1c >7.5                                                 1.48 (1.24-1.76)   <0.0001
Peripheral obstructive artery disease                      1.35 (1.10-1.65)   0.0036
Diuretic use                                               1.33 (1.13-1.57)   0.0007
LDL cholesterol >4 mmol/L (vs <3 mmol/L)                   1.33 (1.05-1.67)   0.0165
LDL cholesterol 3-4 mmol/L (vs <3 mmol/L)                  1.22 (1.01-1.46)   0.0357
Insulin use                                                1.32 (1.12-1.55)   0.0008
Percutaneous coronary intervention or CABG                 0.76 (0.63-0.93)   0.0083
Statin use                                                 0.83 (0.69-1.00)   0.0452
Allocation to pioglitazone                                 0.84 (0.72-0.98)   0.0309

*Resulting from stepwise selection procedure (other variables considered: sex, body-mass index, duration of diabetes [<5 vs 5 to 10 vs >10 years], use of metformin versus sulphonylureas, combined blood pressure [low risk vs high risk], triglycerides [low risk vs at risk vs high risk], HDL cholesterol [low risk vs at risk vs high risk], micral test results [positive vs negative], previous acute coronary syndrome, evidence of coronary artery disease, photocoagulation therapy, metabolic syndrome [present vs absent], use of β blockers, use of angiotensin-converting enzyme inhibitors).

were lost to follow-up. ITT is what you should do unless you have a good reason to do per-protocol, like your

supervisor telling you to do per-protocol. The two-by-two table for an ITT analysis of stroke would look like

this:

              Disease +   Disease -
Treatment     76          2605 - 76 = 2529
No treatment  96          2633 - 96 = 2537

A per-protocol analysis, on the other hand, is limited to those who are not lost to follow-up. The two-by-two

table for a per-protocol analysis of stroke would look like this:

              Disease +   Disease -
Treatment     76          2427 - 76 = 2351
No treatment  96          2447 - 96 = 2351

[A: By coincidence, the no-disease numbers are the same in both groups here. Don't get distracted by that; it's random chance.]

As practice, try calculating the RRs or ORs for both, and seeing how they differ.
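If you want to check your answers, here's a quick sketch (mine, not from the lectures) of the RR calculation for both analyses, using the stroke numbers above:

```python
def relative_risk(events_tx, n_tx, events_ctrl, n_ctrl):
    # RR = risk of the event in the treatment group divided by the
    # risk of the event in the control group
    return (events_tx / n_tx) / (events_ctrl / n_ctrl)

# Intention-to-treat: everyone randomized stays in the denominator
rr_itt = relative_risk(76, 2605, 96, 2633)
# Per-protocol: denominators shrink to those who completed follow-up
rr_pp = relative_risk(76, 2427, 96, 2447)

print(round(rr_itt, 3))  # 0.8
print(round(rr_pp, 3))   # 0.798
```

In this trial the two analyses happen to agree closely; the worked example in the ITT section above shows how badly they can diverge when drop-out is related to treatment.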

When provided with information from a clinical trial, calculate and interpret the

control event rate (CER)

The CER is the event rate in the control (no treatment) group.

CER = c / (c + d)

This is obviously the same as the incidence of the outcome in the no treatment group. Depending on the

study design, they may measure it as an incidence rate instead of an incidence, in which case you would have

to do an incidence rate calculation instead. For a review of incidence and incidence rate, see Fundamentals of

Epidemiology I (16 Jan 2012).

We use the CER to estimate the number (or rather, the incidence or incidence rate) of bad events that we would expect to happen in our treatment group if they hadn't got the treatment.

When provided with information from a clinical trial, calculate and interpret the

experimental event rate (EER)

The EER is the event rate in the intervention (treatment) group.

EER = a / (a + b)

This is obviously the same as the incidence of the outcome in the treatment group. Depending on the study

design, they may measure it as an incidence rate instead of an incidence, in which case you would have to do

an incidence rate calculation instead. For a review of incidence and incidence rate, see Fundamentals of

Epidemiology I (16 Jan 2012).

When provided with information from a clinical trial, calculate and interpret the

relative risk (RR)

The RR for a trial is the same as a normal RR. Since the CER and EER are usually just the incidences of

events, we get:

RR = EER / CER

When provided with information from a clinical trial, calculate and interpret the

absolute risk reduction (ARR)

The CER is the rate of events without intervention and the EER is the event rate with the intervention.

The absolute risk reduction tells us how many events the intervention is preventing, by taking the

difference of the CER and EER.

ARR = CER - EER

If our events are good events instead of bad, you may want to calculate the ARR as EER - CER. In the notes, they specify the ARR more generally, as the absolute value of the difference of CER and EER. [A: I don't agree with this, since you should be allowed to have a negative ARR if the intervention turns out to be harmful, as happens from time to time. A positive ARR reflects a reduction in risk, and risk is usually bad.]

When provided with information from a clinical trial, calculate and interpret the

relative risk reduction (RRR)

The relative risk reduction tells us the proportion or fraction of the events that we would have

expected (given by the CER) that have been prevented with the treatment (the ARR).

RRR = ARR / CER = (CER - EER) / CER

RRR is always higher than ARR, so many pharmaceutical companies choose to report the RRR instead of the ARR. Both RRR and ARR are important, but don't be fooled by a high RRR; always check the ARR, too.
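Pulling the last few objectives together, here's a minimal sketch (mine, not from the lectures) that computes all of these measures from the a/b/c/d cells of the two-by-two table:

```python
def trial_measures(a, b, c, d):
    # a = treated, event; b = treated, no event
    # c = control, event; d = control, no event
    eer = a / (a + b)  # experimental event rate
    cer = c / (c + d)  # control event rate
    return {
        "CER": cer,
        "EER": eer,
        "RR": eer / cer,           # relative risk
        "ARR": cer - eer,          # absolute risk reduction
        "RRR": (cer - eer) / cer,  # relative risk reduction
    }

# Stroke numbers from the PROactive ITT example above
m = trial_measures(76, 2529, 96, 2537)
print(round(m["RR"], 2))   # 0.8
print(round(m["ARR"], 4))  # 0.0073
print(round(m["RRR"], 2))  # 0.2
```

Note how the RRR (a 20% relative reduction) sounds far more impressive than the ARR (less than one stroke prevented per 100 patients treated); this is exactly the gap the warning above is about.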

When provided with information from a clinical trial, calculate and interpret the

number needed to treat (NNT)

The number needed to treat tells you how many people you need to treat in order for the treatment to

do something good for one of those people.

If the outcome is death or disease, it tells you how many people you need to treat in order to prevent a single harmful outcome. Alternately, if the outcome that you're measuring is some sort of improvement, the NNT tells you how many people you need to treat in order to see a single patient improve.

Remember, the NNT is talking about the effect of a specific drug or intervention; some people will get better or worse regardless of treatment.

[A: The NNT is probably the most important number to consider for clinical decision-making because it

tells us how useful a treatment is in terms that we can intuitively understand.]

The higher the NNT, the less useful the drug or intervention is, because it means that we need to treat more people in order to see any benefit. For example, take a trial for the effect of pioglitazone on preventing stroke. If the NNT is 143, that means you have to treat 143 patients with pioglitazone in order to avoid a single stroke.

The ideal NNT is 1, where each patient that we give the drug to is expected to improve because of it. That's rare to see.

The NNT is calculated unintuitively (to me) as:

NNT = ⌈1 / ARR⌉ = ⌈1 / (CER - EER)⌉

The fancy bars on either side mean to round up. [A: I'm teaching you math notation, too. That's a two-fer.]

The NNT can also be calculated from the odds ratio using a much more complicated formula that you can

look up if you ever need it.

There is also a number needed to harm (NNH), which is the same as the NNT except it's for the adverse effects of the treatment.

Using the pioglitazone example above, if it has an NNH for heart failure of 31 (not great), then 1 of every 31 patients you give the drug to is expected to get heart failure that they otherwise would not have gotten. Considering that the NNT is 143 for stroke, that's a lot of people getting heart failure for a very small number who are avoiding stroke.

Like most of the statistics we use, the NNT (and NNH) corresponds to a specific period of time (for example, death within one year of allocation).
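As a sketch (mine, not from the lectures), here's the same calculation in code, with the round-up applied. Feeding it the ITT stroke numbers from the earlier worked example gives an NNT of 138, in the same ballpark as the hypothetical 143 above:

```python
import math

def nnt(cer, eer):
    # Number needed to treat: round 1 / ARR up to the next whole
    # person (the "fancy bars", i.e. the ceiling function).
    return math.ceil(1 / (cer - eer))

# PROactive ITT stroke numbers: CER = 96/2633, EER = 76/2605
print(nnt(96 / 2633, 76 / 2605))  # 138
```

The same arithmetic gives the NNH: use the adverse-event rates, putting the larger (treatment) rate first so the difference comes out positive.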

When provided with information from a clinical trial, calculate and interpret odds

ratio (OR)

The odds ratio is, as always, a ratio of odds.

OR = (a × d) / (b × c)

Most trials are for rare outcomes, so the OR is a good approximation of the RR. [A: Most people treat the OR as though it were identical to the RR, but you know to be a little more cautious.]

If you need a review of odds ratios, see Fundamentals of Epidemiology I (16 Jan 2012).
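Here's a tiny sketch (mine, not from the lectures) showing the OR next to the RR, using the ITT stroke table from earlier; with an event this rare, the two are nearly identical:

```python
def odds_ratio(a, b, c, d):
    # OR = (a * d) / (b * c) from the standard two-by-two table
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    # RR = incidence in the treated / incidence in the controls
    return (a / (a + b)) / (c / (c + d))

print(round(odds_ratio(76, 2529, 96, 2537), 3))     # 0.794
print(round(relative_risk(76, 2529, 96, 2537), 3))  # 0.8
```

The approximation breaks down as the outcome gets common, which is why the caution above matters.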

Provide information regarding strengths and weaknesses of NNT

The strengths of the NNT are that it is intuitive, simple to interpret, easy to calculate, and takes into account people's baseline risk. The weakness is that you can't use it alone to make decisions, since you always need to consider things like a patient's age, adherence to therapy, costs, etc. [A: That's really a weakness of all statistics that we use, though.]

Critically appraise an article on therapy

Case-Control and Cohort Studies

(26 Mar 2012)

Describe the purpose and structure of case-control and cohort study design

[A: This should be mostly review at this point.]

Case-control studies

In a case-control study, you're finding people who have a disease (cases) and comparing them to people who don't (controls). You're grouping by outcome, and you're comparing the groups for differences in exposure rates.

Controls. Selection of controls is a huge source of bias in case-control studies since it is very easy to

introduce confounders.

Ideally, you can identify a hypothetical cohort from which the cases are drawn (such as all people within the hospital's catchment area), and then controls are sampled at random from this hypothetical cohort. It's much harder to do in real life.

If cases are people who show up to my hospital with mesothelioma, then the hypothetical cohort is people who would show up to my hospital if they had mesothelioma. You can see how it might be hard to randomly sample from that cohort. One common way to do it is to use random-digit dialling in the hospital's catchment area for the disease of interest (which, for a tertiary centre and a rare disease, could be a huge area). Even then, people who have land lines and are willing to act as controls in a research study are probably different sorts of people from the cases.

Matching. To reduce confounding, you can try to make each control match one of the cases for important variables (often age and sex). You can do 1:1 matching (one control for each case) or you can choose to match more than one control to each case in order to increase statistical power. In the end, if there's an old woman as a case, there would be one or more old women as controls who are matched to that case. If you match, you need to use more involved statistical analyses and you should not include the matched variable in any regression that you do.

There's not much added power beyond a ratio of 4 controls per case, which is why you don't often see more than that.


Frequency matching. Instead of matching each control to a specific case, you can frequency match, so that the overall demographics of the controls are close to the overall demographics of the cases. Controls will have just as many old people and just as many women as the cases, even if they don't specifically have the same number of old women.

Instead of trying to match controls to cases, you can use multivariate regression to adjust for the potential confounders after the fact. Matching increases statistical power but can be difficult and can sometimes bias the results.

Probably the best way to do case-control studies is called a nested case-control study. Imagine you're following a cohort of people along. Each time someone gets the outcome of interest, you take a random selection of four people in the cohort who don't have the outcome at that point in time (these are your controls). Later on, it's possible that one of those controls will eventually become a case (and will have four other controls selected for it who are outcome-free at that point in time). This sort of time-dependent sampling (known as incidence density sampling) means that the odds ratio you get approximates the incidence rate ratio rather than the relative risk.

One of the reasons that the nested case-control design results in high-quality case-control studies is that it's really only possible if you have a well-defined cohort from which you can get both your cases and your controls. Having a well-defined cohort will make any case-control study better, regardless of the method used to sample controls.

Cohort studies

In a cohort study, you're finding people who have an exposure and comparing them to people who don't. You're grouping based on exposure, and comparing the groups for differences in outcome rates.

Cohort studies can be prospective, where you pick a group of people and follow them up every month or year, or retrospective, where you assemble a group of people and follow them up through their past records. [A: Know these definitions for the quizzes, but don't obsess about them in real life.]

You could also take a retrospective cohort and contact all of its members to continue following them over time. This is referred to as an ambidirectional cohort. Fun, right?

Describe the strengths and weaknesses of cohort and case-control studies

[A: There's a good summary table at the end of the Champion notes for Case-Control and Cohort Studies.]

Case-control studies

The good:

Good for rare diseases, since a cohort may only pick up a few new cases each year

Good for diseases with long latency periods, for the same reason

Prepared by Aidan Findlater

52

The bad:

Recall bias is a type of measurement bias that's really only present in case-control studies

Recruitment of controls often results in selection bias

Cannot infer causation

Recall bias is a problem when cases are more likely than controls to report having had an exposure, either because they're thinking harder and remembering better, or because they've created an association in their heads that they're trying to validate, or because the researcher is grilling them harder on the exposure. Recall bias is not the same as poor recall, where people just can't remember their past exposure. Recall bias is differential recall or reporting between cases and controls.

Imagine you're studying the effect of aspartame on birth defects. Your cases are mothers who gave birth to children with congenital malformations, and your controls are women at the same hospital who gave birth to normal children. You interview the new mothers to assess their exposure to diet sodas. The mothers of children with the malformations have heard bad things about aspartame and are probably going to report diet soda consumption even if they only drank a single can fifteen years ago. The mothers of normal infants will be less likely to report such an exposure.

One study that's a decent learning example is "Recall bias in the assessment of exposure to mobile phones" by Vrijheid et al. (8). They checked reported cell phone usage against actual cell phone records and found both poor recall and recall bias.

You can reduce recall bias by making sure that the exposure assessment is identical between cases and controls, including making sure that interviews are standardized and interviewers are blinded to case/control status. You can also hide the exposure question within the questionnaire (hiding the aspartame question between questions about smoking status and what colour car they drive, for example).

[A: I'm not actually suggesting you ask participants about the colour of the car they drive. That was a joke.]

Cohort studies

The good:

The next best thing to an RCT when it's impossible or unethical to randomize participants

You can study as many outcomes and exposures as you want, which is why they're such a rich source of data

You can calculate odds ratios, relative risks, and incidence rate ratios

You can use them to study rare exposures by selecting an appropriate cohort


The bad:

Large and expensive

Retrospective cohorts are limited to the data that you can find in the records

Confounding! Confounding, confounding, confounding. For example, a given vitamin might only appear to protect against heart disease because health nuts tend to take vitamins, not because vitamins do anything. Coffee appears to cause heart disease because smokers are also more likely to be coffee-drinkers. And so on, forever and ever. All observational studies have confounding, though careful study design and analysis can minimize it.

Recognize and describe types of bias that may occur

Bias is a systematic error and reduces a study's internal validity. Broadly speaking, bias can be divided into selection bias and measurement bias (also called information bias).

Selection bias

Selection bias occurs when the way you choose your groups (cases and controls in a case-control study

or exposure groups in a cohort study) introduces a systematic error.

Incidence-prevalence bias. Remember how diseases that kill quickly will have very low prevalence even if their incidence is high? In practice, this means that a study of prevalent cases will miss all those incident cases where the person died before the study was done. For example, if you recruit people who are hospitalized for acute MI, you'll miss everyone who died before they even got to the hospital.

Detection bias. The exposure of interest makes it more likely that disease will be detected, even though it may not affect the actual risk of disease. For example, if HRT causes endometrial bleeding and such bleeding is an indication for testing for endometrial cancer, then women on HRT are more likely to be tested for cancer, and a spurious relationship between HRT and cancer will be found.

Non-respondent bias. People who respond to surveys are different from those who don't. For example, smokers are less likely to return questionnaires that include questions on smoking, so your sample will be biased towards non-smokers.

[A: Membership bias sounds way too much like confounding to me. I've never heard of it and it isn't in (6), so I'm going to ignore it.]

Measurement (information) bias

Measurement bias occurs when there's a systematic difference between the groups in the way that outcomes, exposures, or confounders are measured. Outcomes, exposures, and confounders should be collected in the exact same way for both cases and controls (in case-control studies) and for both exposed and non-exposed groups (in cohort studies).

Subject bias. The study subjects (or participants) in one group may be more likely to report symptoms or to falsely report compliance than those in the other group. [A: This is a pretty general term that seems to include most of the other biases.]


Recall bias. This was discussed in the case-control section, above.

Hawthorne effect. People who know that they're being studied often report more positive results for no apparent reason. It's kind of like a placebo effect. [A: Check Wikipedia for this one, it's cool.]

Detection bias. Data collectors may look more carefully for an outcome or exposure in one of the

groups than the other. This is protected against by strict training, adherence to interview or data

collection protocols, and, ideally, blinding of the data collectors.

Recognize and describe confounding

A confounder is a variable that differs between the comparison groups and is associated with the outcome. My favourite example is, as always, owning a lighter "causing" lung cancer; the apparent correlation is confounded by smoking status. Confounding arises because a person's exposure status is associated with a whole bunch of things. Coffee drinking is associated with a person's smoking status, their ethnicity, their age, their occupation, and many other things I can't possibly think of. This is why I say that observational studies always have confounding (although it's not always a problem).

Randomizing breaks the connection between the exposure and the confounders. If you randomize people to owning a lighter, suddenly it's no longer associated with their smoking status. Smokers are randomly and (hopefully) equally distributed between the two exposure groups, and the apparent relationship between lighter ownership and lung cancer disappears (since it was really the smoking causing the lung cancer). This is why RCTs are so great; they remove the effect of confounders, even those we can't measure.

You can try to remove confounding from observational studies by matching, stratification, or multivariate regression. When we use stratification or multivariate regression to remove confounders, we call the result an adjusted odds ratio or relative risk. However, these techniques aren't perfect. You have to start worrying about measurement bias of the confounders, not just of the exposure and outcome. More importantly, you can only adjust for (or match on, or stratify by) things that you can and do measure. Confounding that we don't measure is called residual confounding.

Confounding can be a problem even in RCTs, since randomization is random. Just by chance, more smokers may be assigned to lighter ownership, creating a spurious relationship. Larger sample sizes make this less likely. Stratified randomization can also ensure that both groups have the same major baseline characteristics.

Matching. Matching is usually only done for major confounders like age and sex. In case-control studies, controls are selected to match cases on suspected confounders. In cohort studies, non-exposed participants are matched to exposed participants on suspected confounders. [A: I don't think I've ever seen a matched cohort study.]

Stratification. The data are divided into strata based on a confounder, and the measure of effect (usually OR or RR) is calculated within each stratum. The stratum-specific results are then merged, if possible, to give an adjusted odds ratio or risk ratio that is free of any confounding by the variable that was stratified on.


For example, the lighters and lung cancer data shows an unadjusted odds ratio of 9. If we look at just the smokers and calculate an odds ratio, though, the OR is 1. We don't see any effect of lighters on lung cancer within the stratum, since the lighter owners and non-owners no longer differ by smoking status. If we look at just the non-smokers, the OR is also 1, for the same reason. Merging them back together gives an adjusted odds ratio of 1.
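The same pattern can be worked through numerically. The counts below are made up for illustration (they give a crude OR of about 21 rather than the 9 quoted above), but they show the key behaviour: a big crude OR that collapses to stratum-specific ORs of 1.

```python
# Sketch of confounding by smoking in the lighters-and-lung-cancer example.
# The 2x2 counts are invented for illustration, not from the lecture.

def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a=exposed cases, b=exposed controls,
    c=unexposed cases, d=unexposed controls."""
    return (a * d) / (b * c)

# Stratum 1: smokers (cases are mostly smokers, and smokers mostly own lighters)
smokers = (162, 18, 18, 2)        # OR = 1 within the stratum
# Stratum 2: non-smokers
non_smokers = (2, 18, 18, 162)    # OR = 1 within the stratum

# Crude table: collapse the two strata cell by cell
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

print(odds_ratio(*crude))         # large: inflated by confounding
print(odds_ratio(*smokers))       # 1.0
print(odds_ratio(*non_smokers))   # 1.0
```

The crude table mixes two populations with different smoking (and therefore lighter-owning) rates, which is exactly where the spurious association comes from.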

Multivariate regression. This is by far the most common thing you'll see. It uses fancy statistics, which you'll see again in The Interpretation of Statistical Results (23 Apr 2012). When people say that they "adjusted for confounders," this is usually what they mean.

Define and calculate relative risk and odds ratio

See Fundamentals of Epidemiology I (16 Jan 2012).

Critically appraise a case-control study

Consider sampling, measurement, and confounding. [A: Always mention recall bias if they assessed

exposure with a questionnaire.]

Critically appraise a cohort study

Consider the comparison being made, whether the comparison makes sense, whether there's any selection bias, and whether there's any confounding.


Prognosis (2 Apr 2012)

Differentiate between risk and prognostic factors

Risk factors predict who gets the disease. Prognostic factors predict what happens to them after they get it. There's usually a lot of overlap between the two, with things like age and sex being both major risk factors and major prognostic factors for lots of diseases.

            Risk factor         Prognostic factor
Patients    Start healthy       Start with disease
Outcomes    Onset of disease    Death, disability, etc.
Rates       Rare outcomes       Common outcomes
Factors     May overlap         May overlap

Describe the elements of prognostic studies

Prognostic studies use a prospective cohort where the cohort is defined by the presence of disease. They use a (hopefully random) sample of people with the disease who join the cohort at a defined inception time and are then followed up over time for the outcome(s) of interest. The zero time defines when they join the cohort, such as the time of diagnosis, when symptoms first appear, or when treatment is started.

Diseases have a natural history: the disease begins as a subclinical biologic process, it becomes detectable though it's still subclinical, symptoms start to appear, the patient sees a physician, the disease is diagnosed, and the disease is treated. When studying prognosis, we have to decide when a case actually starts. Studies can define their inception cohort as starting at any stage along the disease's history, like "a patient joins the cohort when they first feel symptoms" or "a patient joins the cohort when they're first diagnosed with the disease." Changing the zero time changes the prognosis, even for the same course of disease.

The cohort should constitute an unbiased sample of all people at the given stage of disease and the study

should collect data on baseline characteristics. The cohort must be followed up for long enough for clinically

important outcomes to occur.

The results of prognostic studies can be reported as 5- or 10-year survival, case-fatality rate, response to treatment, remission, or disease-specific mortality.


5-year survival rate: the percentage of cases that survive for at least 5 years after a diagnosis or a

treatment.

You can also calculate 10- and 20-year survival rates. For example, if the 10-year survival of a ductal carcinoma in situ is 98%, then 98% of people who are diagnosed with it will still be alive after 10 years. Remember that lead-time bias means the 5-year survival of cases detected by screening usually looks better than that of cases detected otherwise.

Case-fatality rate: the percentage of people with a disease who die from that disease within a given

period of time.

This is usually used more for acute diseases and outbreak investigations. If 100 people are diagnosed with

lung cancer and 15 people die from it within 10 years, then the case-fatality would be 15%.

Disease-specific mortality rate: the proportion of people in the population dying from the disease, often given in deaths per 10,000 people.

This mortality rate is different from the case-fatality rate, which considers only the people who already have the disease. For example, start with a population of 100,000 people, 200 of whom get the flu and 100 of those with the flu die from it. The case-fatality would be 100/200 = 50%, whereas the disease-specific mortality would be 100/100,000 = 10 flu deaths per 10,000 people.
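The flu arithmetic above is just two divisions with different denominators, which a couple of lines of code make explicit:

```python
# Case-fatality uses only people with the disease as the denominator;
# disease-specific mortality uses the whole population.
population = 100_000
cases = 200
deaths = 100

case_fatality = deaths / cases                      # 0.5, i.e. 50%
mortality_per_10k = deaths / population * 10_000    # 10 deaths per 10,000

print(case_fatality, mortality_per_10k)
```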

Response rate: the percentage of cases showing improvement following an intervention.

If 100 diabetics are given insulin therapy and 90 improve, then the response rate is 90%. It's possible that 89 of them would have improved anyway; the response rate doesn't account for that.

Remission rate: the percentage of cases whose disease becomes undetectable.

This is similar to the response rate, but the outcome is remission instead. Bear in mind that a person can go

into remission but later relapse.

Interpret a survival curve

If you're dealing with time-to-event data, then you want to do survival analysis.

Time-to-event data in prognostic studies comes from the inception cohort. People enter the cohort at the time that they meet the zero-time criterion, then they stay in the cohort until the event (like death) is reached. If you're studying death after lung cancer diagnosis, then a person will enter the cohort when they're diagnosed, live a few years or decades, and die. Survival analysis deals with this sort of data very well.

Note that you can do your usual cohort analysis with relative risks, but it's limited to a single point in time, like the RR for death at five years since diagnosis. Survival analysis, on the other hand, takes time into account, which is one reason that people prefer it.

Survival is usually displayed in a survival curve, which plots survival against time. The median survival

is the time on the x axis at which the curve crosses 50% survival on the y axis (see the image below).

Censoring (which will be discussed later in this lecture) is displayed using a tick mark on the curve to

indicate the point at which a person left the study.


The technical definition of survival is a little bit tricky. Imagine that the study follows people up for several months after they enter the cohort. Break the time up into months. Survival in any given month is the percentage of people who started the month alive and who end the month alive. Instead of months, break it into weeks, or days, or hours, or minutes. The survival function is what you get when the chunks of time become infinitely small, and that's what our survival curves are trying to approximate. One of the advantages of this odd definition is that it deals with censoring quite nicely (see below for a description of censoring). If someone leaves the study after 6 months, you can still include them in the survival analysis for those months, then remove them from the denominator afterward.

Because the sample size decreases over time, as people die or leave the study for other reasons (that is, they're censored), the precision of the estimate decreases over time. The estimates of survival at 1 year will have a tighter confidence interval than the estimates at 5 years, just because you're dealing with smaller numbers.

Censoring is a fancy way of saying that someone stopped being followed up before they got the outcome, either because they dropped out or because the study ended. You know they survived for at least as long as they were in the study, but you don't know what happened to them after they left it.

For example, consider a study of death after lung cancer diagnosis. Participants enter the study when they get diagnosed with lung cancer, and you follow them until they die. If someone is diagnosed, is followed for six months, then moves to another country, then you know that they survived for at least the first six months, but you don't know what happened to them afterward. Similarly, someone who enters the study two months before the study ends will only contribute two months before they're censored (assuming that they don't die during those two months). Those are both examples of censoring.
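The "shrinking denominator" logic above is exactly what the Kaplan-Meier estimator does. Here is a minimal sketch (real analyses would use a proper package such as lifelines); each subject is a (time, event) pair, where event=False means they were censored at that time:

```python
# Minimal Kaplan-Meier estimator (illustrative sketch only).
def kaplan_meier(subjects):
    """Return [(time, survival)] at each event (death) time."""
    n_at_risk = len(subjects)
    survival = 1.0
    curve = []
    for time, event in sorted(subjects):
        if event:
            # survival drops by the fraction of the risk set that died
            survival *= (n_at_risk - 1) / n_at_risk
            curve.append((time, survival))
        # censored subjects simply leave the risk set without an event
        n_at_risk -= 1
    return curve

# 6 patients: deaths at months 2, 5, 8; censored at months 3, 6, 10
data = [(2, True), (3, False), (5, True), (6, False), (8, True), (10, False)]
print(kaplan_meier(data))
```

Note how the censored patients at months 3 and 6 still contribute to the denominator while they are under observation, then drop out of it, exactly as described above.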

Survival analysis reports its results in hazard ratios (HRs). To quote a brilliant young epidemiologist who isn't me, "For all practical purposes, hazards can be thought of as incidence rates and thus the HR can be roughly interpreted as the incidence rate ratio" (9).

The hazard function gives the probability of dying at a given point in time, assuming that you had survived until

that point in time.

One way of interpreting the hazard ratio is as the odds that an individual with the higher hazard reaches the endpoint first. For example, an HR of 2 means that there's a 67% (2/3) chance of the treated patient dying first.
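That "who dies first" reading is just HR / (HR + 1), which is easy to check:

```python
# P(higher-hazard patient reaches the endpoint first) = HR / (HR + 1)
def p_dies_first(hr):
    return hr / (hr + 1)

print(p_dies_first(2))   # 2/3, the 67% quoted above
print(p_dies_first(1))   # 0.5: equal hazards, a coin flip
```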


The Cox proportional hazards model is the most common method for estimating the hazard ratio. It's a form of regression analysis, so you can use it to adjust for confounders.

Recognize potential sources of bias in cohort studies of prognosis

The only bias that was discussed specifically for prognostic studies comes from false cohorts, which arise when you don't use an inception cohort. If you just go out and gather a cohort of people with the disease, you can only include people who are still alive and have the disease. Your cohort won't include anyone who died before you started the study, so your results will be biased toward longer-living patients.


Diagnosis (9 Apr 2012)

Discuss the use of diagnostic tests clinically

Diagnostic tests are used clinically to diagnose disease and to monitor therapy.

Diagnostic tests should be reliable, feasible, and acceptable. A good diagnostic test is reliable, which means that it gives the same answer for the same patient regardless of the evaluator (that is, it's reproducible). Measures of inter-rater reliability include Cohen's kappa, which is discussed later in this lecture. Feasibility and acceptability include the patient's perspective and cost.

For a test to benefit patients, the result of the test and the resulting therapy must improve patient outcomes.

Describe the characteristics and definitions of normal and abnormal test results

[A: This doesn't seem to get much discussion in the notes or the slides.]

Normal test results are negative results (no disease, since that's the normal result). Abnormal test results mean that the test is positive.


Develop a 2x2 diagnostic test result table when provided with data from a study

of a diagnostic test

To evaluate a test, you compare it to a gold standard test, which is assumed to be perfectly sensitive and specific. Each person in the diagnostic study should be evaluated by both the gold standard and the new test (ideally with blinding to the results of the gold standard).

In most people, the results of the new test will be the same as those of the gold standard. In some people, though, the test will say they have disease when the gold standard says they don't (false positives). Other times, the test will say they don't have disease when the gold standard says they do (false negatives). These pairs of results are used to fill a 2x2 table from which you can calculate all of the statistics that we're interested in.

                     Gold standard positive   Gold standard negative
New test positive    a (true positives)       b (false positives)
New test negative    c (false negatives)      d (true negatives)

Define and calculate sensitivity and specificity

Sensitivity gives the probability of testing positive if you really have the disease. It does not tell you the probability that you have the disease if you test positive; that's the positive predictive value, below.

sensitivity = a / (a + c) = TP / (TP + FN) = detected true cases / all true cases

Specificity gives the probability of testing negative if you really don't have the disease. It does not tell you the probability that you're disease-free if you test negative; that's the negative predictive value, below.

specificity = d / (b + d) = TN / (FP + TN) = detected true non-cases / all true non-cases

A great mnemonic is SpPIn and SnNOut (spin and snout): a highly specific test, when positive, rules disease in; a highly sensitive test, when negative, rules disease out.
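Both formulas read straight off the 2x2 table. A quick sketch with hypothetical counts (the 90/30/10/170 table below is invented for illustration):

```python
# Sensitivity and specificity from the 2x2 cells: a=TP, b=FP, c=FN, d=TN.
def sensitivity(a, b, c, d):
    return a / (a + c)   # detected true cases / all true cases

def specificity(a, b, c, d):
    return d / (b + d)   # detected true non-cases / all true non-cases

# Hypothetical test results: 90 TP, 30 FP, 10 FN, 170 TN
print(sensitivity(90, 30, 10, 170))  # 0.9
print(specificity(90, 30, 10, 170))  # 0.85
```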

Define and calculate positive and negative predictive value

Positive predictive value (PPV) gives the probability that you really have the disease if you test positive.

PPV = a / (a + b) = TP / (TP + FP)

Negative predictive value (NPV) gives the probability that you really don't have the disease if you test negative.


NPV = d / (c + d) = TN / (TN + FN)

PPV and NPV are great. They're intuitive and understandable, and are exactly what we want when we think about the results of a diagnostic test. Unfortunately, they depend on the prevalence of disease in the population being tested, as we'll discuss next.

Define and calculate prevalence

Prevalence is the proportion of people with disease in the population being studied, based on the results of the gold standard test. Pretty straightforward.

prevalence = (a + c) / (a + b + c + d)

Sensitivity and specificity don't change with the prevalence. If you add more people with disease to your 2x2 table, you're basically adding to a + c. These extra people will go into the a and c cells based on how good the test is, which ends up leaving the sensitivity unchanged. But you can see that, if you're adding people into cells a and c, then your PPV will go up (toward 1) and your NPV will go down (toward 0). The opposite happens if you add more people who don't have disease; in that case, you're adding to b + d. Set up your own 2x2 table and play around with it to convince yourself.
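You can play around with exactly that in a few lines of code. This sketch assumes a hypothetical test with sensitivity and specificity both fixed at 0.9 and rebuilds the 2x2 table at different prevalences:

```python
# Build a 2x2 table for a given prevalence, with sensitivity and
# specificity held fixed (they're properties of the test, not the population).
def two_by_two(prevalence, sens, spec, n=10_000):
    diseased = prevalence * n
    healthy = n - diseased
    a = sens * diseased          # true positives
    c = diseased - a             # false negatives
    d = spec * healthy           # true negatives
    b = healthy - d              # false positives
    return a, b, c, d

for prev in (0.5, 0.1, 0.01):
    a, b, c, d = two_by_two(prev, 0.9, 0.9)
    ppv = a / (a + b)
    npv = d / (c + d)
    print(f"prevalence {prev}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

As the prevalence falls from 50% to 1%, the PPV collapses (from 0.90 to about 0.08) while the NPV creeps toward 1, even though the test itself never changed.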

Apply the role of pretest probability or prevalence in interpretation of diagnostic

test results

The sensitivity and specificity are inherent properties of the diagnostic test. The PPV and NPV, however, depend on the prevalence of disease in the population being studied. If you increase the prevalence, the PPV goes up and the NPV goes down. If you decrease the prevalence, the PPV goes down and the NPV goes up.

The pre-test probability is the probability that a person has the disease before you take into account the results of a test. It's how likely you think it is that they have the disease, based on things like signs, symptoms, and clinical judgment. The pre-test probability can also be thought of as the prevalence of the disease in patients like him or her.

Saying, "A patient with this symptom has a 20% chance of having the disease," is basically the same thing as saying, "Of all patients with this symptom, 20% have the disease."

Just like the PPV and NPV depend on the prevalence of disease, they can be thought of as depending on the pre-test probability. Having a high pre-test probability is like doing the test in a high-prevalence population. If you think someone's really likely to have the disease, then a positive test is very convincing (higher PPV) while a negative test is more likely to be a false negative (lower NPV). The inverse is true if you think someone is not likely to have the disease.

If the test was evaluated in a real-world setting, with patients similar to those you would expect to send for testing, then the PPV and NPV from the evaluation study may be useful to you when making decisions. If the population in which it was evaluated has a much higher prevalence (a high-risk population), then the PPV will be overestimated compared to your population and the NPV will be underestimated.


There's a very nice example in the Champion notes, under "The problem of prevalence."

Interpret likelihood ratios

Likelihood ratios (LRs) are ways of updating our pre-test probability based on the results of the test. They're calculated from the sensitivity and specificity, so they don't depend on the prevalence.

PPV and NPV are intuitive. They tell us how likely someone is to have disease based on the outcome of the

test, but they change based on the prevalence. LRs provide a way of quickly determining the PPV and NPV

(or post-test probability) for any prevalence (or pre-test probability).

An LR can be thought of as the odds of disease given a positive (LR+) or negative (LR−) test result divided by the odds of disease in the population (or pre-test odds). For example, with a test that has an LR+ of 20, a positive test result means that your odds of having the disease are 20 times higher than they were before testing.

An LR of 1 is useless. An LR of 10 means that the test is good at ruling in disease, and an LR of 0.1 means that it's good at ruling out disease.

You can very easily use LRs by taking advantage of a nomogram, shown on the right. Draw a straight line from your pre-test probability on the left through the LR in the middle (LR+ if the test was positive or LR− if negative) and keep going until you hit the post-test probability on the right.
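The nomogram is just doing odds arithmetic graphically: convert the pre-test probability to odds, multiply by the LR, and convert back. A sketch with illustrative numbers (sensitivity 0.9, specificity 0.95):

```python
# LR+ = sens / (1 - spec); LR- = (1 - sens) / spec.
def lr_pos(sens, spec):
    return sens / (1 - spec)

def lr_neg(sens, spec):
    return (1 - sens) / spec

# post-test odds = pre-test odds * LR, then convert odds back to probability
def post_test_prob(pre_test_prob, lr):
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

sens, spec = 0.9, 0.95
print(lr_pos(sens, spec))                        # 18
print(lr_neg(sens, spec))                        # about 0.105
print(post_test_prob(0.20, lr_pos(sens, spec)))  # pre-test 20% -> about 0.82
```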

Interpret kappa

Cohen's kappa is a measure of agreement between independent raters/observers/testers. It measures the agreement that cannot be explained by random chance. It varies from 0 (no agreement beyond chance) to 1 (perfect agreement).

I like Champion's example of chance agreement between two radiologists reading films. One rates all the films and finds 20% are abnormal. The other falls asleep on the "normal" button and rates all films as normal. Technically, the agreement is 80%.
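Kappa exposes that 80% as pure chance. With observed agreement p_o and chance-expected agreement p_e, kappa = (p_o − p_e) / (1 − p_e):

```python
# Cohen's kappa for the sleepy-radiologist example above.
def cohens_kappa(p_observed, p_expected):
    return (p_observed - p_expected) / (1 - p_expected)

# Rater A calls 20% of films abnormal; rater B calls everything normal,
# so they agree on the 80% that A also calls normal.
p_o = 0.80
# Chance agreement: P(both say normal) + P(both say abnormal)
p_e = 0.80 * 1.00 + 0.20 * 0.00
print(cohens_kappa(p_o, p_e))  # 0.0: all the agreement is chance
```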


Interpret a receiver operating characteristic (ROC) curve

Many tests are not dichotomous but give a range of values. In order to calculate sensitivity and specificity, though, we need a dichotomous result: positive or negative. To get one, we just take our range of values and say that everyone above a certain threshold value is positive and everyone below it is negative. Different thresholds will have different sensitivities and specificities.

For example, let's consider using serum creatinine as a test for renal failure. If the threshold is 0, our test will say that everyone has renal failure; it'll catch all the actual cases of renal failure (sensitivity = 1) but won't rule out renal failure when it isn't there (specificity = 0). On the other hand, if the threshold is infinity, our test will say that no one has renal failure; it'll correctly rule out renal failure when it isn't present (specificity = 1) but will fail to catch any actual cases (sensitivity = 0). All other thresholds will give values somewhere between those two extremes.

If we calculate a sensitivity and specificity for each possible threshold, we can plot them. The resulting plot is called an ROC curve, shown on the right. You can see that VA (whatever that is) is a generally better test than NE.

If we take an ROC curve and calculate the area under the curve (AUC), it can tell us how good the test is in general. The straight black line in the ROC above shows a useless test. It tells us nothing, and has an AUC of 0.5. The ideal test, with sensitivity and specificity of 100% across all threshold values, would have an AUC of 1. Normal tests, like those shown, fall somewhere between 0.5 and 1.
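The threshold-sweeping idea can be sketched directly. This toy example uses made-up creatinine-style values that separate cases from non-cases perfectly, so the AUC comes out at 1; real packages (e.g. scikit-learn's `roc_curve`) do the same thing with more care:

```python
# Sweep every threshold, collecting one (1 - specificity, sensitivity)
# point per threshold, then integrate with the trapezoid rule.
def roc_points(diseased, healthy):
    thresholds = sorted(set(diseased + healthy), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        sens = sum(x >= t for x in diseased) / len(diseased)
        fpr = sum(x >= t for x in healthy) / len(healthy)
        points.append((fpr, sens))
    points.append((1.0, 1.0))
    return points

def auc(points):
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid rule
    return area

diseased = [180, 200, 250, 300]   # hypothetical creatinine values in cases
healthy = [60, 80, 100, 120]      # values in non-cases
print(auc(roc_points(diseased, healthy)))  # 1.0: perfectly separated toy data
```

If the two groups overlapped completely, the curve would hug the diagonal and the AUC would come out at 0.5, the useless-test line described above.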

Critically appraise a study on a diagnostic test

Was the gold standard appropriate?

Does the test include the gold standard as a part of it? This is bad.

Was there an appropriate spectrum of patients that are similar to those you would want to test in your

clinic or hospital?

Was there verification bias? Is every test result compared to the gold standard, or only certain results?

Is there good intra- and inter-rater reliability?

What are the sensitivity, specificity, PPV, and NPV?

Are confidence intervals provided? Are the estimates precise?

How does the test compare to others?

Is it available, affordable, and accessible?

Will the results of the test change your management?


Screening (16 Apr 2012)

Define and differentiate between the three levels of prevention (primary, secondary, and tertiary)

Primary prevention prevents disease from ever occurring. Examples include health promotion, exercise,

smoking cessation programs, and immunization.

Secondary prevention tries to catch disease while it's still latent or subclinical and treat it before the disease becomes an illness. Examples include screening.

Tertiary prevention tries to reduce the impact of symptomatic disease. Examples include rehab

programs.

Differentiate between screening and case-finding

Screening is testing large numbers of asymptomatic people for disease. Case-finding is testing a small number of people (or even one) where there's a high suspicion of disease, such as presence of symptoms or recent contact with an infected person.

Differentiate between diagnostic and screening tests

Diagnostic testing is done in patients with suspected disease, and positive results tell you that the disease is probably present. Screening is done in patients without suspected disease, and positive results tell you that the disease may be present. Generally speaking, screening tests err on the side of being sensitive rather than specific, so that they pick up a lot of true cases even if they also pick up a lot of false positives.

Describe criteria for a screening program

Screening programs should only be implemented if people benefit from early detection of the disease. The following should be true:

The disease is serious with a clear natural history, has a known prevalence, and has an effective therapy if the disease is found early.

The screening test should be safe, should be cost-effective, and should have a known sensitivity, specificity, and ROC.

Prepared by Aidan Findlater

66

The healthcare system should clearly define the screened population. The identified cases should be followed up and offered an available and accessible treatment that is acceptable to the individual.

When provided with information about a screening test, calculate sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and prevalence

[A: This is identical to a diagnostic test. Literally.]

Describe the impact of prevalence of disease on the results of diagnostic or screening tests

[A: Same as diagnostic tests.]

Remember, screening is done in asymptomatic people. This means that there's a low prevalence, and that decreases the test's PPV. If the PPV is really low, then very few of the positives will be true positives and you'll be wasting everyone's time and money. That's why people only screen in higher-prevalence populations, like women over 55 for breast cancer. Screening a well-defined subpopulation can make a screening test more useful.
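A tiny sketch makes the prevalence effect obvious. The sensitivity and specificity of 0.9 below are invented for illustration; PPV is computed from Bayes' theorem.

```python
# Same hypothetical test (sensitivity 0.9, specificity 0.9) at two prevalences.
# PPV = true positives / all positives, computed from the population fractions.

def ppv(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Screening a general, low-prevalence population:
print(round(ppv(0.9, 0.9, 0.01), 3))   # ~0.083: most positives are false alarms
# Screening a higher-prevalence subpopulation:
print(round(ppv(0.9, 0.9, 0.20), 3))   # ~0.692: positives are mostly real
```

The test itself hasn't changed at all; only the population it is applied to has.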

Apply the impact of prevalence of disease to clinical situations

[A: Same as diagnostic tests.]

Again, screening is done in asymptomatic people and has a lower PPV than if that same test were used diagnostically.

Define and recognize lead-time, length-time and compliance bias

All three biases make tests seem beneficial even if they don't make a difference.

Lead-time bias. A screening test may add years lived with (diagnosed) disease without adding actual years of life, just because you've picked it up early. Screening will therefore appear to lengthen the lives of people with disease, but really it's just that you're picking it up earlier.

Length-time bias. People with slowly progressing disease will spend more time in the pre-clinical disease stage, and are therefore more likely to be picked up by a screening test. Put another way, people whose disease progresses quickly and kills them are less likely to be detected by screening tests than the slow-progressing cases. Your screened population will have more of these slow-progressing cases, so the years lived with (diagnosed) disease will, again, be greater; screening will appear to be beneficial.


Compliance bias. People who participate in screening programs make better patients and do better on treatment. This is a form of selection bias.

Discuss possible adverse effects of screening programs

Labelling or stigma

Complications from the test, including discomfort, radiation exposure, and chemical exposure

There are things you die of, and things you die with.


The Interpretation of Statistical Results

(23 Apr 2012)

[A: This is probably one of the more useful topics in the course, since almost every interesting study uses

some form of regression.]

Describe the difference between unadjusted and adjusted results

Unadjusted results are the simple, crude estimates for OR, RR, and IRR that we learned to calculate earlier in the course. They don't take into account any extra information besides the outcome and the exposure of interest.

That's usually fine for RCTs, but when we start getting data that has confounding, like data from observational studies, then our unadjusted estimates are misleading because they don't account for the confounders. Adjusted results are those that adjust for measured confounders using statistical techniques like multivariate regression, which we'll discuss below.

Let's bring this back to my favourite confounding example: the hypothetical analysis of the effect of lighters on risk of lung cancer. Our unit of analysis is an individual; our exposure of interest is whether they own a lighter; our outcome is whether they get lung cancer; our main confounder that we measure is whether they smoke. An unadjusted result would suggest that lighters cause cancer, since we get a strong relationship and a high odds ratio. Oops! Those results are horribly confounded. So let's rerun the analysis, using a multivariate logistic regression (explained below), adding smoking status to the model. Suddenly the estimated odds ratio for the effect of lighters on lung cancer plummets to 1: no effect. That's our odds ratio that has been adjusted for smoking status.

Interpret statistical findings and the level of measurement of the outcome variable in linear, logistic, and survival analyses

These are statistical models that calculate the effect of independent variables (or predictors) on a dependent variable (or outcome). After running the regression, you get a coefficient for each independent variable; these coefficients tell you the effect or contribution of that variable to the value of the dependent variable.


Your choice of model depends on the outcome variable. Linear regression is used for predicting

continuous outcome variables; logistic regression is used for predicting dichotomous or binary

outcome variables; and survival analysis is used for predicting time-to-event outcome variables.

The results of a simple linear regression tell you the direct effects of the independent variables on the outcome. If our outcome is blood pressure (in mmHg) and the coefficient of daily salt intake (in grams) is 0.5, that tells us that each gram of daily salt contributes 0.5 mmHg to the blood pressure. The model says that increasing your daily salt intake by 10 grams a day will raise your blood pressure by 5 mmHg.

It's a simple model, and assumes that this linear relationship exists everywhere. But really, will eating 20 kg of salt each day raise your blood pressure by 10,000 mmHg? The assumption probably isn't really true, but in order for the results to be useful, the assumption just has to be more-or-less true over a normal range of values.
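The coefficient interpretation above can be sketched in a few lines. The intercept of 110 mmHg is invented for illustration; the 0.5 mmHg-per-gram coefficient is the one from the text.

```python
# Linear model sketch: BP = intercept + 0.5 mmHg per gram of daily salt.
# The intercept (110 mmHg) is a made-up baseline, not from the course.

def predicted_bp(salt_grams, intercept=110.0, salt_coef=0.5):
    """Predicted blood pressure (mmHg) for a given daily salt intake (g)."""
    return intercept + salt_coef * salt_grams

# Each extra 10 g/day of salt adds 5 mmHg, exactly as the coefficient says:
print(predicted_bp(15) - predicted_bp(5))  # 5.0
```

Note that the coefficient's meaning doesn't depend on the intercept: it is the change in outcome per one-unit change in the predictor.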

The results of a logistic regression tell you the odds ratios for the effects of the independent variables on the outcome. They're interpreted like normal odds ratios.

Technically, the coefficient tells you the effect of the independent variable on the log odds, as I describe below. But everyone reports the odds ratio because it's very simple to calculate from the coefficient.

The results of a survival analysis tell you the hazard ratios or incidence rate ratios for the effects of the independent variables on the outcome. They're interpreted like normal hazard ratios or incidence rate ratios. Survival analysis is done using Kaplan-Meier estimates for bivariate models and Cox proportional hazards models for multivariate models.

Generalized linear models (GLMs) are regression models that follow the form f(Y) = β0 + β1 x X1 + β2 x X2 + ..., where the type of regression you're doing is defined by the link function f and the assumed statistical distribution of Y. The Xs in the equation are the independent variables, and Y is the dependent variable (which depends on the values of the independent variables). The βs are the coefficients of the independent variables.

Simple linear regression uses the identity function (f(Y) = Y) and assumes that Y follows a normal distribution (which means Y is treated as a continuous variable), so it reduces to Y = f(Y) = β0 + β1 x X1 + β2 x X2 + ... This is the sort of regression that we're taught in high school and undergrad. When you use a single independent variable, the equation becomes Y = β0 + β1 x X1, which is just Y = MX + B. Hopefully that's a familiar equation, since it's the general form of a simple line graph. Linear regression is basically drawing a straight line so as to minimize the distance from the line to the points.

Logistic regression uses a logit function (f(Y) = logit(Y) = ln(Y / (1-Y))) and assumes that Y follows a binomial distribution (which means Y is treated as a dichotomous variable), so it reduces to ln(Y / (1-Y)) = β0 + β1 x X1 + β2 x X2 + ... If the outcome is a risk (probability), then Y / (1-Y) is p / (1-p), which is the odds. This means that logistic regression predicts the log odds. Our regression coefficients tell us the direct, linear contribution of the independent variables to the log odds. If we take the exponential function of a coefficient, we get the odds ratio. It's possible to calculate a relative risk from the coefficient, but it's a lot harder. So now you know why most reported research results give the odds ratio.
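That coefficient-to-odds-ratio step is literally one line. The coefficient value below (0.693, for a hypothetical smoking variable) is invented for illustration:

```python
# Converting a logistic-regression coefficient (log odds) to an odds ratio.
import math

beta_smoking = 0.693  # hypothetical log-odds coefficient from a fitted model

odds_ratio = math.exp(beta_smoking)
print(round(odds_ratio, 2))  # ~2.0: this exposure roughly doubles the odds

# A coefficient of 0 corresponds to an OR of 1, i.e. no effect:
print(math.exp(0.0))  # 1.0
```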

Survival analysis is more complicated and I don't want to talk about it.


Here's a summary table:

Model               Outcome                Results
Linear              Continuous             Direct interpretation
Logistic            Dichotomous/binary     Odds ratio
Survival analysis   Time-to-event          Hazard ratio

What I mean by "direct interpretation" in the above table is that the coefficient we get out is directly interpretable as the variable's contribution to the outcome variable. See the paragraph on linear regression, above.

Univariate or bivariate regression refers to using a single independent variable to predict the dependent variable, and therefore gives you unadjusted results. The name is confusing because some people are referring only to the number of independent variables while others include the outcome variable in their count. It's unimportant; just know that when you read either "univariate" or "bivariate" in the literature, both refer to an unadjusted model.

Multivariate regression just means you're using more than one independent variable, so the effect measure (OR, IRR, etc.) of our exposure of interest is adjusted for the other independent variables in the model.

A regression is a way of predicting the value of one variable (the outcome) based on the value of one or more other variables (the exposure and confounders). Let's say you're predicting the risk of MI based on hypertension and family history of MI (using a logistic regression, since MI is a dichotomous outcome). The result will give you an odds ratio for the effect of hypertension on risk of MI (adjusted for family history) and also an odds ratio for the effect of family history on risk of MI (adjusted for hypertension).

It's pretty obvious at this point that "exposure of interest" is just a matter of perspective. Any one of the independent variables could be considered the exposure of interest, and the results for each one of them are adjusted for all the others.

Note that there is no one "true" model, even for a given data set. If you're looking at the effect of a new drug on all-cause mortality, you could do a logistic regression (where the outcome is dead or not dead) or you could do a survival analysis (where the outcome is time-to-death). Survival analysis is probably preferable in this case, since it takes into account the fact that some people will be in the trial for longer than others, but a logistic regression wouldn't exactly be wrong.

Describe the importance of describing sample characteristics in epidemiologic

research

It allows you to see if the study population is similar to the population that you are working with.

It allows you to identify possible confounders.


Describe various ways of selecting which variables should be included in a multivariate analysis

[A: This isn't an objective, and I think it's a little beyond what you need to know for this course, but the lecturer spent a lot of time on it so I'll discuss it.]

Non-regression statistical tests

[A: These aren't in the objectives, but the professor went through them and I think they're actually important. Even if you don't focus on them for studying, here's a table that you can refer to later.]


Meta-Analysis

(30 Apr 2012)

Define and compare/contrast review, systematic review, and meta-analysis

Reviews are summaries of the current state of research on a given topic. They come in two basic types: traditional narrative reviews (which the notes refer to simply as "review articles"), where an expert in the field writes their take on things using whichever sources they like best, and systematic reviews, where a team systematically searches, reviews, and reports on the current state of the entire body of literature. [A: The course uses "review" to refer exclusively to narrative reviews; I won't.] Statistically pooling the results of a bunch of studies is called a meta-analysis, and should only ever be done as part of a systematic review.

Narrative reviews are prone to bias, since they depend entirely on the sources, opinions, and views of the author. The author can be as selective as they wish with their references, focussing on a small number of studies and excluding relevant, valid research. It would be possible, for example, to write a review article that cherry-picked papers to conclude that smoking cures lung cancer. In order to trust the results of the review, you must trust the author of the review. Two people writing narrative reviews of the same topic can arrive at very different conclusions.

Systematic reviews try to minimize the reviewer's ability to bias the conclusions of the review by systematically searching for and summarizing all relevant published research (and sometimes the unpublished research, too). To achieve this lofty goal, reviewers write out in excruciating detail how they intend to find and interpret the relevant literature. Just like real science, systematic reviews have a methods section that allows anyone to replicate their results. Two people writing systematic reviews of the same topic should arrive at similar conclusions, assuming that their methods are sound.

A systematic review of narrative review articles found strong bias in the narrative reviews when compared to the current state of the literature at the time each article was written (10). Systematic reviews are the way forward. In Cochrane we trust.

If you're trying to publish original research, some journals now require you to do a systematic review of the topic to show that your research wasn't a waste of time and money.

A meta-analysis is a statistical pooling of the results of a systematic review. You take all the studies you found in your systematic review, extract the numerical results as risk ratios, odds ratios, or some other number, and then do a weighted average of them to give a single quantitative summary. The weights are generally based on the sample size, so that larger studies are given more weight.
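A quick sketch of that weighted average, using one standard weighting scheme (inverse-variance weights, which track sample size because bigger studies have smaller standard errors). The study numbers below are invented:

```python
# Fixed-effect pooled estimate: weighted average of study log odds ratios,
# weighted by inverse variance. Study results are invented for illustration.
import math

studies = [  # (odds ratio, standard error of the log OR)
    (1.8, 0.40),
    (2.4, 0.25),
    (2.1, 0.10),  # largest study: smallest SE, so the biggest weight
]

weights = [1 / se**2 for _, se in studies]
pooled_log_or = sum(w * math.log(or_) for (or_, _), w in zip(studies, weights)) / sum(weights)
pooled_or = math.exp(pooled_log_or)
print(round(pooled_or, 2))  # lands near the big study's estimate
```

Note how the pooled value sits between the individual estimates but is pulled toward the largest, most precise study.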

Summarize steps required for a systematic review, including framing a specific question for review

A proper systematic review requires a well-defined research question in the standard PICO format (see Intro to EBM (9 Jan 2012)). The population, exposure, and outcome will determine the search terms that you will use.

Summarize steps required for a systematic review, including identifying relevant literature

You must develop a well-defined search strategy, which includes the databases and sources that will be searched, the keywords that will be used, and the inclusion and exclusion criteria that you will use to determine which search results to include in your review.

The following are sources to consider:

MEDLINE database (US-based) and EMBASE database (EU-based)

clinical trials registers (including Cochrane's and the US government's)

foreign-language literature

references in primary sources

experts, who may have access to unpublished material

raw data from trials (by personal communication)

In a cohort study, you have to define how and whom you will recruit into the study. It's exactly the same with a systematic review, except that your unit of analysis is now research studies instead of people.

Summarize steps required for a systematic review, including assessing the quality of the literature

Reviewers will often apply standard tools that assess the methodological rigour of the articles being included in the review. These critical appraisal tools should be predefined.

Summarize steps required for a systematic review, including summarizing the evidence

Use tables and graphs, including forest plots, to summarize the results. Provide a conclusion, if possible, that answers the research question.

Recognize the possible bias due to publication bias and describe an approach to identifying publication bias using a funnel plot

Publication bias is what happens when authors and journals prefer to publish positive (statistically significant) results that are "interesting". The published research is therefore a biased sample of all the research that's been done.


When you do research, some of the positive (statistically significant) results will be false positives (you only see an effect because of random chance). If you're only publishing the positive results, you'll be publishing a lot of false positives without the true negatives that would normally balance them out. In the end, treatments and interventions that don't do anything end up looking effective just because the only studies published are those that were statistically significant from random chance alone.

Positive results are more likely to be:

published

published quickly

in English

in more than one journal (where one trial generates a bunch of papers; it's how careers are made!)

cited by others

Publication bias is more likely to be a problem with smaller trials, since a large and expensive trial is likely to be published regardless of the outcome.

A few years back, there was a nice analysis of publication bias in trials of antidepressants (11), a hugely profitable market for pharmaceutical companies. They found that 31% of trials on the topic were never published, and that the published literature was wildly skewed in favour of antidepressants.

A funnel plot is a graphical way to assess publication bias. It plots study size, power, or standard error (the inverse of precision) against their point estimates. Larger, more precise studies are higher on the plot than smaller, less precise studies. We expect the results of small studies to vary while the results of larger studies should converge toward the true estimate. We expect, therefore, that the dots on our plot will form a symmetrical triangle [A: Which, for some reason, is called a funnel. I guess it's like an upside-down funnel.]. If the triangle (or funnel) is truncated or asymmetrical, it's usually because the smaller negative trials were never published. That is, if the plot doesn't look like a triangle, then there's publication bias.

Standard error is basically the standard deviation of an estimate. It's proportional to the width of the confidence interval, so a wide confidence interval means a large standard error and a narrow confidence interval means a small standard error.
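As a quick numeric sketch of that relationship (the estimate and standard error values are invented), the usual approximate 95% CI is the estimate plus or minus 1.96 standard errors:

```python
# Approximate 95% confidence interval from an estimate and its standard error.

def ci95(estimate, se):
    """Return (lower, upper) bounds: estimate +/- 1.96 standard errors."""
    return (estimate - 1.96 * se, estimate + 1.96 * se)

lo, hi = ci95(2.0, 0.25)
print(round(lo, 2), round(hi, 2))   # 1.51 2.49
print(round(hi - lo, 2))            # width = 2 * 1.96 * SE; doubling the SE
                                    # doubles the width of the interval
```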

Compare the above funnel plot to one that shows publication bias, below:


Notice how it no longer has that symmetrical triangle shape, since only the more positive trials are

included. A meta-analysis of these trials will be biased.

Interpret a forest plot

A forest plot is a standard way of summarizing the numerical results (ORs, RRs, HRs, etc.) of a systematic review, instead of listing the results in a boring table. [A: I'm pretty sure its name comes from the expression, "can't see the forest for the trees."] Each study in the review is shown on the plot as a line with a square in the middle. The line shows the 95% confidence interval for the study's result, the square shows the actual point estimate, and the size of the square represents the sample size of the study. If a meta-analysis was done, the result of the meta-analysis will be shown as a diamond at the bottom of the plot, with the width of the diamond showing its 95% confidence interval.

In the above example plot, five studies are included in the review, ordered by date of publication. The results are presented as odds ratios. The result of a meta-analysis summarizing the five studies is also shown, as a diamond with a dashed vertical line at its point estimate. The Ng et al. study has the largest sample, the smallest confidence interval, and a point estimate (OR of 2.1) very close to the point estimate of the meta-analysis that pools all of the studies together (OR of 2.2). Note that all of the studies' confidence intervals include the pooled estimate (the dashed line).

Do you remember the proper definition of a confidence interval? If a study were repeated an infinite number of times and a confidence interval calculated each time, then 95% of the confidence intervals we calculate will include the true value that we're trying to estimate. Refresh your memory with Fundamentals of Biostatistics II (6 Feb 2012). That scenario is pretty close to what we're seeing on this forest plot, isn't it? We've repeated a study a bunch of times and now we're comparing them. Assuming that the studies in our systematic review are identical repetitions, differing only in their sample of people, then we expect that 95% of the confidence intervals will contain the true value, which we're assuming is our pooled estimate. That means that, if the confidence intervals are all over the place and a bunch of them do not include the pooled estimate, then we know something's wrong. Just by looking at a graphical summary of study results, we can tell whether the studies were done in similar ways!

It's also important to remember that a meta-analysis is a statistical pooling of the results of a systematic review, so a forest plot can be used in a systematic review even if the authors don't meta-analyze the results.

Describe benefits and limitations of a meta-analysis

Benefits:

Allows quantitative integration of multiple studies, including smaller studies that may have been inconclusive on their own

More precise estimates, because a larger sample size means more statistical power [A: MOAR POWER!]

A well-written meta-analysis will be transparent, so that you can easily assess its risk of bias

Limitations:

Publication bias can bias the result, usually in favour of the intervention being studied

Criteria for included studies are critical to understanding the meta-analysis [A: Not really sure what this means.]

It's often not appropriate to combine studies, such as when there's a lot of heterogeneity (see below)

Meta-analyses on the same topic can come to differing conclusions, mostly because of differences in search strategies and inclusion/exclusion criteria

The results of subsequent large RCTs may differ from the results of a meta-analysis, often due to publication bias

Meta-analysis provides a quantitative summary of the current state of our knowledge. If the state of our knowledge sucks, or we're getting a biased sample of the state of our knowledge, then our meta-analysis will suck too. Those are problems with systematic reviews, too. The important thing to remember is that it's not always advisable to statistically pool the results (that is, do a meta-analysis). This issue will be discussed in the next few objectives.

Define heterogeneity

Heterogeneity means the studies differ from each other. When reading a meta-analysis, we must think about clinical, methodological, and statistical heterogeneity.

Heterogeneity may be due to clinical differences in the population, intervention, or outcome.

For example: study location, age and sex of patients, type or dose of medication, and definition of outcome.

It may also be due to methodological differences in the study design, quality, duration, and analysis.


For example: cohort studies compared to RCTs, 3-year studies compared to 15-year studies, and intention-to-treat compared to per-protocol.

Finally, consider the idea of statistical heterogeneity, which is just a fancy statistical way of saying that the numbers don't add up. That is to say, the numerical results of the studies vary from each other more than you would expect from chance alone. You can usually see statistical heterogeneity just by looking at the forest plot. Formal tests also exist for it.

Formal tests include the simple chi-squared and the fancier I-squared. If statistical heterogeneity exists, it is inappropriate to statistically pool (i.e., meta-analyze) the study results.

Clinical and methodological heterogeneity means that the studies differ from each other in why, how, when, and where they were done (without considering the numerical results). Statistical heterogeneity means that the numerical results differ from each other (without considering how the studies themselves were done).

[A: Personally, I think that statistical heterogeneity is the most important concept here. If there's statistical heterogeneity, then it can usually be explained by the presence of clinical or methodological heterogeneity. If there's no statistical heterogeneity, then the clinical or methodological heterogeneity probably doesn't make a difference.]

Recognize that heterogeneity may mean a meta-analysis is not feasible/valid

If there's heterogeneity, then your studies are not measuring the same thing as each other. If there's heterogeneity (clinical, methodological, or statistical), then you do not meta-analyze. Assess differences in the studies' PICOs (population, intervention, control/comparison, outcome), assess differences in the studies' methodologies, and consider the quantitative variation.

If you read a systematic review where they meta-analyzed a bunch of studies without considering heterogeneity, they did bad. Question their results.

Interpret data from a cumulative meta-analysis

A cumulative meta-analysis is just a different way of making a forest plot, not a different type of meta-analysis. A forest plot usually orders the studies by date of publication, where the line and square beside each study name represent the result of that study. In a cumulative meta-analysis, the line and square instead represent a meta-analysis of all studies published up to that point in time. It allows you to see how the state of our knowledge has changed over time.

Here's a side-by-side comparison of a normal meta-analysis and a cumulative meta-analysis looking at the effect of streptokinase after acute MI (12):


By 1988, 33 studies had been done, most of which were not statistically significant. But if you pool the results with a meta-analysis, there was statistically significant evidence that streptokinase was beneficial after the eighth trial, done in 1973. That means that as many as 24 RCTs were done to answer a research question that had already been answered! Because no one bothered to meta-analyze the extant statistically non-significant trials until 1992, researchers wasted time and money and patients were not given an effective treatment.

Describe the role of a sensitivity analysis

A sensitivity analysis tells you how robust the results of a meta-analysis are to changes in the decisions and assumptions that were made. Basically, you repeat the meta-analysis a bunch of different ways and see if the results change.

For example: including or excluding studies with poor methods, including or excluding outliers, and pooling the studies using different methods (fixed- or random-effects).


Communicating Risk

(7 May 2012)

Describe effective risk communication as the basis for informed consent

Effective risk communication is the basis for informed consent.

Define health literacy

Health literacy is "the degree to which individuals have the capacity to obtain, process, and understand basic health information and services needed to make appropriate health decisions", according to the IOM (13).

Or, the definition that the WHO uses: "the cognitive and social skills which determine the motivation and ability of individuals to gain access to, understand and use information in ways which promote and maintain good health" (14).

Health literacy is worse among the elderly, minorities, and people with low SES. Health literacy can be measured using a number of tools, including the Test of Functional Health Literacy in Adults (TOFHLA).

Define health numeracy

The broadest definition of numeracy is "the ability to comprehend, use, and attach meaning to numbers" (15). Health numeracy is numeracy applied to health.

I think of health numeracy as the ability to understand and interpret medical numbers (like statistics) and to use them to make decisions.

An example is understanding what a "10% chance" of something means and how many people out of 100 that would correspond to, or knowing whether 2.9 per 1000 is a higher or lower risk than 8.2 per 1000. These seemingly simple statistics can really confuse people, even doctors.

Describe patient perception of risk and the impact of health literacy and numeracy on patient risk perception and understanding

Low health literacy is associated with (16):

more hospitalizations


People often confuse relative and absolute risks, and don't know when or how to apply them. People tend to overestimate benefits when presented with relative risk reductions.

The way that a problem is framed can impact how a patient, or doctor, understands the information. Framing the same information in different ways can emphasize the benefits (gain) or the costs (loss) of a given treatment.

The lecture slides have a great example, using breast cancer. Read and try to figure out the following two scenarios:

(1) The probability that a woman has breast cancer is 0.8%. If she has breast cancer, the probability that a mammogram will show a positive result is 90%. If a woman does not have breast cancer, the probability of a positive result is 7%. Take, for example, a woman who has a positive result. What is the probability that she actually has breast cancer?

(2) Eight out of every 1000 women have breast cancer. Of these eight women with breast cancer, seven will have a positive result on mammography. Of the 992 women who do not have breast cancer, some 70 will still have a positive mammogram. Take, for example, a sample of women who have positive mammograms. How many of these women actually have breast cancer?

Just reframing the question using natural frequencies instead of probabilities makes it simple to figure out the right answer.
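Working the numbers through (taken straight from the two scenarios above), both framings land on roughly the same answer:

```python
# Natural-frequency framing: of 77 women with a positive mammogram,
# only 7 actually have cancer.
positives_with_cancer = 7
positives_without_cancer = 70
ppv_frequencies = positives_with_cancer / (positives_with_cancer + positives_without_cancer)
print(round(ppv_frequencies, 3))  # ~0.091: fewer than 1 in 10 positives has cancer

# Probability framing via Bayes' theorem: 0.8% prevalence, 90% sensitivity,
# 7% false-positive rate.
ppv_bayes = (0.008 * 0.9) / (0.008 * 0.9 + 0.992 * 0.07)
print(round(ppv_bayes, 3))  # ~0.094: slightly different only because the
                            # frequency version rounds 90% of 8 down to 7
```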

Describe cognitive biases that affect risk assessment and decision-making [NiO]

Anchoring: perceived risk is based on the risk of some other event that's familiar to the patient

Confirmation bias: we hear things that confirm our suspicions and ignore those that don't fit with them

Dread factor: we dread cancer, and therefore see risks of cancer as greater

Habituation: everyday or usual activities seem less risky (we don't think about the risks of crossing the street)

Miscalibration: we're overly confident about the extent and accuracy of our knowledge

Optimism bias: we're optimistic about our own outcomes (risk denial)

Outline the basic dimensions of risk

Epidemiology I Course Notes


What value does the patient give to the risk? Does the patient perceive it as important?

Identify techniques that have been shown to improve patient understanding of risk, such as verification techniques and the roles of qualitative and quantitative and graphic presentations of risk, and decision aids

The basics of risk presentation (mostly from (17)) are:

Natural frequency may be better than percentage (but depends on the patient)

Use a consistent denominator (if the benefit is per-1000, then the risks should be, too)

NNT seems to be the most difficult conceptually

Acknowledge uncertainty
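To see why NNT trips people up, remember that it is just the reciprocal of the absolute risk reduction. A quick sketch with made-up event rates (these numbers are for illustration only, not from the course):

```python
# Made-up event rates for illustration:
control_risk = 0.10   # 10% of untreated patients have the event
treated_risk = 0.075  # 7.5% of treated patients have the event

arr = control_risk - treated_risk  # absolute risk reduction: 2.5 percentage points
rrr = arr / control_risk           # relative risk reduction: 25%
nnt = 1 / arr                      # number needed to treat: 40

print(f"ARR = {arr:.1%}, RRR = {rrr:.0%}, NNT = {nnt:.0f}")
```

A "25% relative risk reduction" and "treat 40 patients to prevent one event" describe exactly the same effect; the first framing just sounds far more impressive, which is why presenting the ARR or NNT alongside the RRR matters.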

Decision aids are pamphlets, videos, or websites that help patients and doctors to understand the risks, harms, and benefits of their healthcare options. The benefits of using decision aids (mostly from (18)) are:

Improved knowledge of the options

Decisions are more consistent with patient values

Patients participate more in decision-making

Check out http://www.thennt.com/ for some straightforward risk communication tools. There are a lot of resources out there; you just have to look. The Cochrane Review articles on risk communication (17) and decision aids (18) are good places to start.


References

1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ [Internet]. 1996 Jan 13;312(7023):71-2. Available from: http://www.bmj.com/content/312/7023/71.full

2. Sackett DL, Straus SE, Richardson WS, Rosenberg W, Haynes RB. Evidence-based medicine: How to practice and teach EBM. 2nd ed. New York: Churchill Livingstone; 2000.

3. Committee on Quality of Health Care in America. Crossing the Quality Chasm [Internet]. Washington, D.C.: The National Academies Press; 2001. Available from: http://www.nap.edu/openbook.php?record_id=10027

4. Last JM, editor. A Dictionary of Epidemiology. 4th ed. New York: Oxford University Press; 2000.

5. Gordis L. Epidemiology. Saunders; 2008.

6. Porta M, editor. A Dictionary of Epidemiology. 5th ed. New York: Oxford University Press; 2008.

7. Wang D, Bakhai A. Clinical trials: a practical guide to design, analysis, and reporting. London: Remedica; 2006.

8. Vrijheid M, Armstrong BK, Bédard D, Brown J, Deltour I, Iavarone I, et al. Recall bias in the assessment of exposure to mobile phones. J Expo Sci Environ Epidemiol. 2009 May 1;19(4):369-81.

9. Hernán MA. The hazards of hazard ratios. Epidemiology. 2010 Jan 1;21(1):13-5.

10. Schmidt LM, Gøtzsche PC. Of mites and men: reference bias in narrative review articles: a systematic review. J Fam Pract. 2005 Apr 1;54(4):334-8.

11. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008 Jan 17;358(3):252-60.


12. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med. 1992 Jul 23;327(4):248-54.

13. Institute of Medicine. Health Literacy: A Prescription to End Confusion. National Academy Press; 2004.

14. Nutbeam D. The evolving concept of health literacy. Soc Sci Med. 2008 Dec;67(12):2072-8.

15. Fischhoff B. Communicating Risks and Benefits: An Evidence-Based User's Guide [Internet]. Brewer NT, Downs JS, editors. Silver Spring: US Department of Health and Human Services, Food and Drug Administration; 2011. p. 240. Available from: http://www.fda.gov/ScienceResearch/SpecialTopics/RiskCommunication/default.htm

16. Berkman ND, Sheridan SL, Donahue KE, Halpern DJ, Crotty K. Low health literacy and health outcomes: an updated systematic review. Ann Intern Med. 2011 Jul 19;155(2):97-107.

17. Akl EA, Oxman AD, Herrin J, Vist GE, Terrenato I, Sperati F, et al. Using alternative statistical formats for presenting risks and risk reductions. Cochrane Database Syst Rev. 2011;3:CD006776.

18. Stacey D, Bennett CL, Barry MJ, Col NF, Eden KB, Holmes-Rovner M, et al. Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev. 2011;10:CD001431.


Appendix I

The Student Guide to Research

So you scored yourself a sweet research gig? Congratulations! Research can be interesting and even fun, but it can also be daunting and at times overwhelming. Here are a few pointers and resources (besides myself) that may help.

Starting your project

Your research supervisor has thrown a few medical-sounding words together and told you to research it. What do you do next? This section will help you to nail down the specifics that you'll need in order to actually do the research, and should apply to any type of epidemiological research you'll be doing.

First, nail down the specific research question you'll be answering. PICO is your best friend. Break your research question down into population, intervention or exposure, control or comparison, and outcome. Define each one as specifically as possible, using definitions that are consistent with the existing literature. If you can't tell me specifically what you mean by "second-hand smoke" or "adolescents", then you shouldn't be doing the research. Now that you understand exactly what you're researching, summarize your PICO into a one-sentence question that you can use when telling others about your research.
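As a sketch of what a fully nailed-down PICO looks like, here is one spelled out in Python (every term below is made up for illustration; for a real project, use definitions consistent with the existing literature):

```python
# Hypothetical PICO breakdown for an exposure-outcome question.
pico = {
    "population": "adolescents aged 12-18 living with at least one daily smoker",
    "exposure": "second-hand smoke exposure of one hour or more per day",
    "comparison": "no regular second-hand smoke exposure",
    "outcome": "physician-diagnosed asthma by age 18",
}

# Collapse the PICO into the one-sentence question you tell people about.
question = (
    f"Among {pico['population']}, does {pico['exposure']}, "
    f"compared with {pico['comparison']}, "
    f"increase the risk of {pico['outcome']}?"
)
print(question)
```

If any of the four values would make an epidemiologist ask "what exactly do you mean by that?", the definition isn't specific enough yet.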

Once you have your PICO figured out, think about confounders. What sorts of things might connect your exposure and outcome, even if there's no causal relationship between the two? Come up with a list of confounders that you think will significantly affect your research question, then try to figure out how you will measure each one. You'll need strict definitions for each of your confounders, just like you do for your exposure and outcome. Try to make sure that all your variables are defined so that your research can be compared to the existing research.

With your well-defined variables in mind, think about how you might analyze the data you'll get. I don't expect you to have a specific statistical model set in stone, but you need to have an idea of what kinds of numbers you'll be working with and how they might be made to answer your research question. Is the outcome dichotomous? How about the exposure and confounders? Are you going to get a risk ratio or odds ratio out of it, or are you comparing a continuous measure like blood pressure change? If you aren't sure, ask an epidemiologist, or consult with your supervisor (who will probably mumble something about p-values).
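For the dichotomous case, both the risk ratio and the odds ratio fall out of a simple 2x2 table. A minimal sketch, with all counts made up for illustration:

```python
# Hypothetical 2x2 table:
#                 outcome   no outcome
# exposed            30         70
# unexposed          15         85
a, b = 30, 70
c, d = 15, 85

risk_exposed = a / (a + b)      # 0.30
risk_unexposed = c / (c + d)    # 0.15

risk_ratio = risk_exposed / risk_unexposed  # 2.0: exposed have twice the risk
odds_ratio = (a * d) / (b * c)              # cross-product ratio, about 2.43

print(f"Risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```

Note that the odds ratio is further from 1 than the risk ratio whenever the outcome is common; a cohort design lets you report the risk ratio directly, while a case-control design only gives you the odds ratio.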


Observational studies

Most of the research projects you'll be doing are some sort of observational research, where bias and confounding run rampant. Go back and review sources of bias in case-control and cohort studies. After that refresher, think about the selection and measurement bias that your study will have, and how you can minimize it.

Consider how you are choosing who or what you will be including in your study. Is your sampling method going to give you a good sample of the population you're trying to study, or will it be a biased sample that's full of rich/poor/healthy/sick/worried/helpful people? Try to figure out the ways that the study population will fail to reflect the population you're actually interested in.

Consider how you are measuring your variables (exposure, outcome, and confounders). Does the status of any variable affect the way that any other variable is measured? For example, if they have the outcome, do you search for or measure the exposure differently? And so on, for all your variables.

Collecting the data

Ask your supervisor.

Analyzing the data

Ask your supervisor, or just ask me: aidan@aidanfindlater.com.

For simple stuff like means and standard deviations, Excel is fine. For more complicated stuff, you'll need to use statistical software. R is free, student pricing exists for Stata and SPSS, and SPSS and SAS can be accessed in your web browser through UWO's MyVLab (http://myvlab.uwo.ca/).
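As an aside, the "simple stuff" really is simple: Excel's built-in functions cover it, and so does the standard library of most languages. A minimal sketch in Python with made-up data:

```python
import statistics

# Made-up sample of systolic blood pressure changes (mmHg), for illustration:
bp_change = [-8, -12, -5, -15, -9, -11, -7, -10]

mean = statistics.mean(bp_change)
sd = statistics.stdev(bp_change)  # sample SD (n - 1 denominator)

print(f"mean = {mean:.1f} mmHg, SD = {sd:.1f} mmHg")
```

Save the statistical packages for regression, confounder adjustment, and anything else where the bookkeeping gets painful.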

Software notes and resources:

SPSS: point-and-click interface; popular with social scientists; available at UWO through MyVLab.
  http://ssnds.uwo.ca/helpnotes/spss.asp

Stata: point-and-click interface; popular with economists and epidemiologists; Aidan's second-favourite!
  http://ssnds.uwo.ca/helpnotes/stata.asp

R: it's all typing, no point-and-click; steeper learning curve; extremely flexible if you know basic programming; free and open source; Aidan's favourite!
  http://www.r-project.org/
  http://www.statmethods.net/

SAS: available at UWO; PROC WILL_DRIVE_YOU_INSANE; costs more than hiring a biostatistician; Aidan hates it!
  http://www.uwo.ca/its/sitelicense/sas/index.html
  http://ssnds.uwo.ca/helpnotes/sas.asp

Writing it up

There are standards for reporting for every type of research. Each comes with a checklist to make sure you're including everything that you should be, and many include templates for recommended flow charts and tables. Even if your research was awful, there's no reason that your research reporting can't be top-notch! Bear in mind that you might not have room in your article to fit in everything on the checklist. Use them as a guideline rather than an absolute standard.

Go download the checklist that applies to your project:

Study design and matching reporting guideline:

Case-control: STROBE (http://www.strobe-statement.org/)
Cohort: STROBE (http://www.strobe-statement.org/)
Cross-sectional: STROBE (http://www.strobe-statement.org/)
Randomized trials: CONSORT (http://www.consort-statement.org/)
Non-randomized trials: TREND (http://www.cdc.gov/trendstatement/)
Diagnostics: STARD (http://www.stard-statement.org/)
Systematic reviews and meta-analyses: PRISMA (http://www.prisma-statement.org/index.htm)

A much more thorough list of reporting standards is maintained by the EQUATOR Network at http://www.equator-network.org/. If you can't find an appropriate guideline on there, then it doesn't exist anywhere.

