
cartoon (Randy Glasbergen): motivational speaker with flip chart and pointer.

The Low-Carb Lecture on Clinical Trials

What the mind can conceive and believe, it can achieve. You can make a fortune, build an empire, change the world. But don't try to lose weight because that never works.

William D Heavlin bill.heavlin@comcast.net


revised August 2006

Objectives:
Target audience: clinical research associates and other research support staff. Time expected: ~2 hours.

cartoon (fg, 2005): doctor hands a woman goggles like the ones he is wearing; her dog already wears a pair; a cabinet holds a box of "truth goggles." Caption: "Just pop these on before we start, Mrs Foobar."

Topic: after this class you will see some key concepts for clinical trials and clinical trial statistics, and understand how they are reflected in the design of clinical trials.

Basic Outline:

cartoon (Randy Glasbergen, 2001): man with pie and flip chart talking to woman in business suit with briefcase. Caption: "I make my pie charts with real pie! Everyone who pays attention gets a slice at the end of the meeting."

- Clinical trials history and ethics
- Statistics worldview
- Sample size and confidence intervals
- Endpoint and design hierarchies
- Effects researchers worry about
- Random assignment
- Multiplicity
- Typical clinical trial practices and why

A selected history of clinical trials

1747  James Lind, scurvy: 6 paired patients, citrus
1863  Austin Flint, rheumatism: 1st use of a placebo
1901  Walter Reed, yellow fever: Clara Maass self-exposed to Aedes aegypti
1925  RA Fisher, agriculture: randomized experiments
1931  Amberson, tuberculosis: randomization by coin toss
1947  Nuremberg Code: 10 points, consent...benefit
1948  BMRC, tuberculosis: 1st use of random numbers
1961  FO Kelsey, sleep aid: teratogenic effect not tested
1993  HHS, WHO: contemporary guidelines

Nuremberg Code
1. voluntary consent
2. potential benefit
3. based on animal experiments and literature
4. avoids suffering, injury
5. no a priori reason to expect death
6. risk proportionate to benefit
7. proper facilities for care
8. research only by qualified scientists
9. subjects may withdraw consent at any time
10. scientist may terminate at any time

clinical trial ethics

Medical confidentiality: Subjects' medical records, and the fact of their participation in the clinical trial, are considered private. Trial records must anonymize all results, including patients' identities, whether or not the results are published.

Informed consent: Significant risks must be disclosed, along with any risks important to a given patient; the protocol must consider both the general (the reasonable patient) and the specific (this particular patient). Consent can be withdrawn at any time, and there can be no consequences to withdrawal of consent. Full disclosure to participating subjects limits the role for deception in clinical trials. Today, subjects are told the chances that they will receive the new treatment(s) being tested versus the control/placebo.

Balancing treatment vs. research objectives: When a standard treatment exists for a serious disease being studied, the standard treatment is used in place of a placebo.

Research benefit: A plausible chance of the new treatment benefiting the subject, and of the research being useful (sound protocol, sufficient sample size, likelihood of publication). Routinely breaking single-blind protocol therefore undermines the ethical basis of the trial.

Equipoise: Research is justified by the lack of consensus about the appropriate treatment; an intellectual position that recognizes the limits of one's opinions.

Clinical trial phases

cartoon: two men at a table with a scribbled-full tablecloth; one is yanking the tablecloth from an adjacent, occupied table. Caption: "And for phase two..."

phase I: safety, tolerability, pharmacokinetics, pharmacodynamics. Usually involves ramping doses.
phase II: expanded version of phase I; IIa for dosing, IIb for efficacy. When drugs fail, they usually fail in phase II.
phase III: definitive, hence expensive, assessment of efficacy; typically double-blind. Necessary for a regulatory submission.
phase IV: post-launch surveillance. Rarer adverse effects detected here.

The Statistics Worldview

uniform measurement: Data are numbers collected using the same method on all subjects.
random variation: Even when everything else is the same, different patient outcomes are expected.
generalizability: In spite of such variation, we can learn something about a larger group from data on a smaller group.
sample size: How much we learn depends on how much data we have (better precision from larger n),
experimental design: on how the data are gathered, and
effect size: on how big the effect is on what we measure.

figure: two bar charts of probability (0.0 to 0.5) against die roll (1 to 6). (a) usual six-sided die: uniform bars. (b) roll twice, pick the larger: bars rising toward 6.

after one observation, can you tell whether it is (a) or (b)? after 1000 observations? how much data, then?

Statistical Approach
Pick a statistic, e.g. the average die roll.
Pick a sample size, e.g. n=20 rolls.
Pick a rule, e.g. average roll > 4.

If the die is fair, the chance that the statistic exceeds the threshold should be low. If the die is not fair, that chance should be high.
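The statistic-plus-rule recipe above can be tried out with a short simulation (a sketch; the function name and the 20,000-trial count here are my own choices, not from the slides):

```python
import random

random.seed(1)

def mean_rolls(n, loaded=False):
    """Average of n die rolls; 'loaded' rolls twice and keeps the larger."""
    total = 0
    for _ in range(n):
        r = random.randint(1, 6)
        if loaded:
            r = max(r, random.randint(1, 6))
        total += r
    return total / n

trials = 20_000
fair = sum(mean_rolls(20) > 4 for _ in range(trials)) / trials
load = sum(mean_rolls(20, loaded=True) > 4 for _ in range(trials)) / trials
print(f"fair die:     P(average of 20 > 4) ~ {fair:.2f}")
print(f"larger of 2:  P(average of 20 > 4) ~ {load:.2f}")
```

The rule "average of 20 rolls > 4" fires rarely for the fair die and usually for the larger-of-two die, which is exactly the separation the test needs.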

Statistic: average. Sample size: 20. Rule: > 4.

figure: histograms of 100,000 simulated averages of 20 rolls. Left (D0n, 1 fair die): 20% > 4.0. Right (D1n, larger of 2 rolls): 88% > 4.0.

The Art of Sample Sizes (I)

In the previous example, n=20 is too small, while n=34 (next) is just large enough. These sample sizes depend on the distribution of a fair die (the null hypothesis) and on the distribution of the larger of two rolls (the alternative hypothesis).

Statistic: average. Sample size: 34. Rule: > 4.

figure: histograms of 100,000 simulated averages of 34 rolls. Left (D0n, 1 fair die): % > 4.0 = 5%. Right (D1n, larger of 2 rolls): 1 − β = 97.6% > 4.0.

The Art of Sample Sizes (II)

When the null hypothesis is really true (e.g. control is equivalent to treatment), the chance of concluding otherwise is called a type I error, with probability α. Typically α = 0.05. When the alternative is really true, the chance of concluding otherwise is called a type II error, with probability β. Typically β = 0.1 or 0.2. 1 − β is called power.
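Both error rates can be estimated for the n=34 die example by simulation (a sketch; the trial count and names are mine):

```python
import random

random.seed(2)

def mean_rolls(n, loaded=False):
    """Average of n die rolls; 'loaded' rolls twice and keeps the larger."""
    total = 0
    for _ in range(n):
        r = random.randint(1, 6)
        if loaded:
            r = max(r, random.randint(1, 6))
        total += r
    return total / n

trials, n = 20_000, 34
# null true (fair die): how often do we wrongly conclude "not fair"?
type1 = sum(mean_rolls(n) > 4 for _ in range(trials)) / trials
# alternative true (larger of 2): how often do we correctly conclude "not fair"?
power = sum(mean_rolls(n, loaded=True) > 4 for _ in range(trials)) / trials
print(f"estimated type I error {type1:.3f}, power {power:.3f}")
```

The estimates land near the slide's α ≈ 5% and power near 97%, which is why n=34 counts as "just large enough."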

Power Curves for 2 Designs

figure: power (0.0 to 1.0) plotted against the probability of an event within 3 months (.25 to .65). At the null hypothesis both curves start at the type I error, α = 0.05; the curve for Design A (better) rises to high power sooner than the curve for Design B.

Confidence Intervals
Sampling distribution: A die roll has a probability distribution of its values, and a statistic, e.g. the average of 20 rolls, has a distribution of its values. We observe only one observation from the sampling distribution.

figure: the probability distribution of a single die roll (uniform on 1 to 6) beside the relative-frequency histogram of 100,000 simulated averages of 20 rolls (D0n).

We observe only one observation from the sampling distribution, and so there are various true distributions that are consistent with this one value. A confidence interval is the range of true values that are (for a given probability) consistent with the observed statistic.

With n=20, observing a mean between 2.75 and 4.25 would be consistent with a true mean of 3.5.

figure: relative-frequency histogram of 100,000 simulated averages of 20 rolls (D0n), true value = 3.5, with the central 95% of the distribution spanning 2.75 to 4.25.
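The 2.75 to 4.25 range can be checked directly (a sketch; the 50,000-replication count is my choice): about 95% of sample means of 20 fair rolls should land inside it.

```python
import random

random.seed(3)

def mean20():
    """Average of 20 fair die rolls."""
    return sum(random.randint(1, 6) for _ in range(20)) / 20

trials = 50_000
inside = sum(2.75 <= mean20() <= 4.25 for _ in range(trials)) / trials
print(f"fraction of sample means in [2.75, 4.25]: {inside:.3f}")
```

This is the coverage idea behind a confidence interval: the stated probability describes how often the procedure captures the truth, not any single interval.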

figure: on a common 10 to 14 scale, the histogram of averages of 10 observations (narrow) next to the histogram of single observations (wide).

Confidence Intervals
- for a given probability, the range of true population parameters (e.g. population averages) that are consistent with the observed data.
- are the error bars of summary statistics (e.g. averages, and so on).
- give the sizes of treatment effects, hence are the focus of clinical trials.
- grow ever smaller as sample size increases.

Prediction Intervals
- for a given probability, the range of single values one can expect to observe from a population.
- are the error bars of single observations (i.e. averages of n=1).
- give the range of outcomes for individual subjects.
- converge to a constant size as the sample size increases.
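The contrast between the two columns can be illustrated numerically (a sketch under assumed normal data; the mean 100 and SD 15 are arbitrary choices): the confidence interval's half-width shrinks like 1/sqrt(n), while the prediction interval's does not.

```python
import random
from statistics import NormalDist, stdev

random.seed(4)
z = NormalDist().inv_cdf(0.975)          # ~1.96 for 95% intervals

halves = {}
for n in (25, 100, 400):
    sample = [random.gauss(100, 15) for _ in range(n)]
    s = stdev(sample)
    halves[n] = (z * s / n ** 0.5,       # CI half-width: error bar for the average
                 z * s)                  # PI half-width: approx. error bar for one subject
for n, (ci, pi) in halves.items():
    print(f"n={n:4d}  CI half-width {ci:5.2f}  PI half-width {pi:5.2f}")
```

As n grows, the CI half-width collapses toward zero while the PI half-width settles near z times the population SD.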

Confidence Intervals
Rules of Thumb:
Square root rule: To make the width of a confidence interval smaller by a factor of 2, the sample size must be increased by 2² = 4; smaller by a factor of 3, increased by 3² = 9.
50 percent power rule: If the sample size n is just large enough to exclude a particular alternative, the probability the alternative is in the confidence interval is 0.50.
90 percent power rule: n90 ≈ 2.75 × n50, where n50 is the sample size that gives 50% power.
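The 2.75 factor can be reproduced from normal theory (a sketch; it assumes a two-sided 5% test, and that at 50% power the true effect sits exactly on the critical value):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf          # standard normal quantile function

# Sample size scales with (z_alpha/2 + z_power)^2, so the 90%-power size
# relative to the 50%-power size (where z_power = 0) is:
ratio = ((z(0.975) + z(0.90)) / z(0.975)) ** 2
print(f"n90 / n50 ratio from normal theory: {ratio:.2f}")

# Square root rule, same scaling: width ~ 1/sqrt(n), so halving the width
# of a confidence interval multiplies the required sample size by 2^2 = 4.
```

The ratio comes out near 2.74, matching the slide's rule of thumb.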

Endpoint Hierarchies
1. Quantitative change from baseline
2. Quantitative outcome
3. Time to event (if any)
4. Count of events
5. Any event (yes/no)
6. Assessment scale
7. Clinician assessment
8. Self- or family report

(margin labels, top to bottom: objective lab measure; behavioral; nomothetic; ipsative)

Design Hierarchies
1. Compare subject to self, i.e. to baseline (ab), or to alternating on-off treatments (abab)  [local control]
2. ...to similar subjects, i.e. stratum based on clinical similarity  [stratified]
3. ...to those in similar care, i.e. stratum based on care environment, e.g. hospital, referring clinic, primary care physician  [stratified]
4. Compare one group to another  [unstratified]

Ways to Boost Power

- Increase sample size.
- Size groups equally.
- Study drug efficacy rather than effectiveness.
- Move up on the Design Hierarchy.
- Move up on the Endpoint Measurement Hierarchy.
- Use longer periods to observe time-to-event endpoints.

Effects Researchers Worry About

- Hawthorne effect
- Selection and self-selection bias
- Placebo effect
- Rosenthal effect
- Stratum effect
- Cohort and adjuvant care effects
- Treatment compliance

cartoon (S Harris): several surgeons around a patient on the operating table; two stand aside, one with his surgical mask down to speak: "We'll just mill around till he's asleep, and then send him back up. This operation is actually for a placebo effect."

Hawthorne effect
Named for the Chicago Hawthorne Works, once owned by Western Electric.

If a group knows they are being studied, their outcomes may be biased: participating in an experimental protocol, and having clinical outcomes measured, can itself improve outcomes. Patients may take more care than they would otherwise. We must be able to answer what would happen if the treatment had no effect, in spite of being part of an experiment; hence control groups.

cartoon (C Barsotti): two men at a bar, somewhat facing one another: "I can't explain it... it's just a funny feeling that I'm being Googled."

Selection & self-selection bias

cartoon (Wiley): couple at the maitre d' station; he states, "No, this isn't a men's club. It's just that no woman has ever been able to pass the dress code." A sign states coat and tie are required for male patrons; women are required to withstand the scrutiny of other women.

Selection bias: the tendency of subjects with better prognosis to be assigned to a particular treatment. Clinicians are more likely to refer patients for whom they are more hopeful, and work harder on patients when they have more hope.
Self-selection bias: patients who actively seek out new or experimental therapies may be different; they may manage their care better and comply better (a form of the Hawthorne effect).
To overcome these issues: randomized control groups.

Random Assignment
In an experiment, a method of assigning subjects to treatment and control groups that gives each subject a prescribed (usually equal) chance of being assigned to each group. Randomization is accomplished by tossing a coin, rolling a die, using a random number table, and so on. On average, it ensures the treatment groups are comparable before treatments begin, and it reduces treatment selection bias.
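A minimal sketch of 1:1 random assignment (the function and subject IDs are hypothetical; real trials use pre-generated, concealed randomization schedules rather than ad hoc code):

```python
import random

def randomize(subject_ids, seed=None):
    """Assign subjects to 'treatment' or 'control' by shuffling the list,
    so each subject has an equal chance and the arms end up equal in size."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {s: "treatment" for s in ids[:half]}
    assignment.update({s: "control" for s in ids[half:]})
    return assignment

arms = randomize([f"S{i:03d}" for i in range(1, 21)], seed=42)
print(sum(a == "treatment" for a in arms.values()), "treatment,",
      sum(a == "control" for a in arms.values()), "control")
```

Shuffling a fixed list (rather than tossing a coin per subject) is one common way to guarantee equally sized groups, which the earlier slide notes also boosts power.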

Placebo effect
The action of a drug or psychological treatment that is not attributable to any specific operations of the agent. When the patient is made worse, sometimes called the nocebo effect. Also called the subject-expectancy effect. For example, a tranquilizer can reduce anxiety both because of its special biochemical action and because the recipient expects relief. Good experimental practice blinds the patients as to which treatment group they are in: single-blind.

cartoon (Veyant): single man at a table with a plate of food, wine glass, and flowers, speaking to the waiter: "No, there's nothing wrong with the food. I just needed a little attention."

Rosenthal effect
The tendency for results to conform to experimenters' expectations unless stringent safeguards are instituted to minimize human bias. Also called the Pygmalion effect and the observer-expectancy effect. Robert Rosenthal, professor of psychology, UCLA, performed many of the first experiments revealing the problem. Good experimental practice blinds the clinicians, researchers, and follow-up interviewers to which group any patient belongs: double-blind.

Stratum effect
Certain treatment locations have generally better outcomes than others, perhaps because the treatment is better, or because the associated patient population is healthier. Good experimental practice has both treatment and control groups at each treatment location. Good analysis practice compares the treatment outcomes of a stratum only to its local control group, within the same stratum: stratified.

cartoon: the Guggenheim Museum in New York, looked upon by two men in a convertible: "Are they allowed to do that on Fifth Avenue?"

Cohort and adjuvant care effects

Usually, adjuvant care improves over time, as personnel become more experienced and additional treatments become available. Good experimental practice has treatment and control groups treated in the same time period, so historical control groups are less satisfactory; hence concurrent control groups.

cartoon: two bald men in a prison cell, on upper and lower bunks; a sign reads "Many years later": "My third felony was a smart move. Folks on the outside are still waiting for health care."

Treatment compliance
Once a patient is assigned to a treatment group, all subsequent behavior is in some sense an outcome also. Comparing only patients who adhered closely to their assigned treatments creates a selection bias. Good data analysis practice analyzes outcomes based on the group the patient was intended to receive: intention to treat (ITT). Compliance can sometimes be corrected for during data analysis.

cartoon (Mark Parisi): an acupuncture sign on the wall; one porcupine lies on its stomach, another stands poised beside the table: "Oh, jeepers... I've completely lost track here... Whadaya say I just randomly pull out some needles and we'll call it even?"

Multiplicity Issues
heads-tails, 2 out of 3 game

figure: a binary tree of coin tosses. First toss: H, T. Second toss: HH, HT, TH, TT. Third toss: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.

One toss is fair, and best of 3 is fair; but picking the number of tosses after one toss is not fair: the probability of winning becomes 5/8.
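The 5/8 can be verified by enumerating all eight equally likely toss sequences (a sketch; the stopping strategy is the one in the slide: quit if the first toss wins, otherwise finish best-of-three):

```python
from itertools import product

# Enumerate all 8 equally likely sequences of three fair coin tosses.
wins = 0
for tosses in product("HT", repeat=3):
    if tosses[0] == "H":
        wins += 1               # quit while ahead after the first toss
    elif tosses.count("H") >= 2:
        wins += 1               # first toss lost; won best-of-three anyway
print(wins, "of 8 sequences win:", wins / 8)   # 5 of 8 sequences win: 0.625
```

Choosing when to stop after seeing the data inflates the win probability from 1/2 to 5/8, which is exactly the danger of unplanned interim looks.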

Three Kinds of Multiplicity:

- Interim analyses,
- Multiple endpoints,
- Multiple comparisons of groups.

cartoon: man holding a baby, beside 3 kids and a pregnant wife, being addressed by several rabbits holding a sign: "Multiplying is Rabbit's Work!!!" "We'd like a word with you!"

Dealing with Multiplicity

Interim analyses: Schedule interim looks into the formal protocol; raise the threshold of significance to keep the type I error at its stated level.
Endpoints: Pick a single primary endpoint, so the others become only secondary; or form one composite endpoint by combining several.
Groups: One treatment vs. one control condition, avoiding analyses within strata. Isolate dose-finding to phase II.

cartoon: man in a gown on an examination table, being addressed by a doctor: "You've got one foot in the grave. Further testing will determine if it's your left or your right."
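The "raise the threshold of significance" step can be sketched with a Bonferroni-style correction (an assumption: the slide names no particular method, and real protocols often use group-sequential boundaries instead):

```python
# Bonferroni-style adjustment: with k planned tests or looks at the data,
# test each at alpha/k so the overall chance of any false positive stays
# at or below the stated alpha.
def adjusted_alpha(overall_alpha, k_looks):
    return overall_alpha / k_looks

alpha = 0.05
for k in (1, 2, 5):
    print(f"{k} test(s): per-test threshold {adjusted_alpha(alpha, k):.3f}")
```

The more looks the protocol schedules, the stricter each individual threshold must be to hold the overall type I error at its stated level.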

A Typical Phase III Design

- Control group: placebo or standard care
- Randomized, concurrent control groups, stratified by site
- Sample size gives 90% power and a type I error of 5%
- Placebo-controlled, single- and double-blind
- One treatment arm, one primary endpoint
- Any interim analyses scheduled up front
- All conforming to the planned protocol

some concluding remarks

The needs of the patient/subject come first, and the integrity of the research protocol is also key to ethical research. Philosophically, these considerations are represented by the concept of equipoise, and practically by the randomization, single-blinding, and double-blinding protocols. Continued recruitment from all strata improves the study beyond the sample size effect. As in any clinical setting, encourage compliance to treatment.
