
Generalized Causal Inference

IGS
Source: Times Daily 1998
Conditions for Causality

John Stuart Mill (1806-1873)

1. Temporal precedence (X precedes Y)

2. Covariation (Y changes with X)

3. No plausible alternative explanations


Causality

Are causes necessary and sufficient conditions for their effects?

J.L. Mackie would disagree ...

Consider: "Short circuits cause house fires"
A short circuit is not necessary for a fire to burn: houses may catch fire for many other reasons
A short circuit is not sufficient: other conditions (i.e., oxygen and inflammable material) must be present

Causality (cont'd)

INUS condition:
"An insufficient but non-redundant part of an unnecessary but sufficient condition" (Mackie 1974, p. 62)

Insufficient but
Necessary part of a condition which is itself
Unnecessary but
Sufficient for the result
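The house-fire example can be written as a small Boolean sketch. The function `house_fire` and the second pathway (a lightning strike) are illustrative assumptions, not part of Mackie's text; the sketch only shows how a short circuit can be an INUS condition.

```python
def house_fire(short_circuit, oxygen, flammable_material, lightning=False):
    """Toy model: a fire occurs if at least one sufficient set of conditions holds."""
    # Pathway A: a short circuit together with its background conditions.
    pathway_a = short_circuit and oxygen and flammable_material
    # Pathway B (hypothetical alternative cause): lightning plus the same background conditions.
    pathway_b = lightning and oxygen and flammable_material
    return pathway_a or pathway_b

# Insufficient: a short circuit without oxygen does not produce a fire.
insufficient = not house_fire(True, oxygen=False, flammable_material=True)
# Non-redundant: without the short circuit, pathway A no longer produces a fire.
non_redundant = house_fire(True, True, True) and not house_fire(False, True, True)
# Unnecessary: pathway B produces a fire without any short circuit.
unnecessary = house_fire(False, True, True, lightning=True)
# Sufficient: pathway A as a whole produces a fire.
sufficient = house_fire(True, True, True)

print(insufficient, non_redundant, unnecessary, sufficient)
```

Each of the four checks corresponds to one letter of the INUS acronym.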

Effect

Effect: the difference between what did happen and the counterfactual, i.e., something that is contrary to fact

Counterfactual: What would have happened to the same individuals who received a treatment if they simultaneously had not received that treatment?
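The definition can be made concrete with simulated data, where both outcomes of each unit are known by construction. This is only a sketch: the names `y0` and `y1`, the five units, and the constant effect of 5 are assumptions made for illustration.

```python
import random

random.seed(0)

# Hypothetical units: y0 is the outcome without treatment, y1 the outcome with treatment.
units = [{"y0": random.gauss(50, 10)} for _ in range(5)]
for u in units:
    u["y1"] = u["y0"] + 5.0                  # assumed constant treatment effect of 5
    u["treated"] = random.random() < 0.5
    # Only one outcome is ever observed; the other one is the counterfactual.
    u["observed"] = u["y1"] if u["treated"] else u["y0"]
    u["counterfactual"] = u["y0"] if u["treated"] else u["y1"]

# In a simulation the effect can be computed directly;
# with real data the counterfactual column is always missing.
effects = [u["y1"] - u["y0"] for u in units]
print(effects)
```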

"Two roads diverged in a yellow wood,
And sorry I could not travel both"
(Robert Frost)

Counterfactual

We cannot observe a counterfactual: "Acts demolish their alternatives, that is the paradox" (Salter 1975, p. 36)
Hence we try to:
Create reasonable but imperfect approximations to the counterfactual
Measure how these approximations differ from the counterfactual

Research Designs
Experiment: A study in which an intervention is deliberately
introduced to observe its effects.
Randomized Experiment: An experiment in which units are assigned
to receive the treatment or an alternative condition by a random
process such as the toss of a coin or a table of random numbers.
Quasi-Experiment: An experiment in which units are not assigned to
conditions randomly.
Natural Experiment: Not really an experiment because the cause
usually cannot be manipulated; a study that contrasts a naturally
occurring event such as an earthquake with a comparison condition.
Correlational Study: Usually synonymous with non-experimental or
observational study; a study that simply observes the size and
direction of a relationship among variables.
Source: Shadish et al. 2002, p. 12

The Origins of Experimentation

Empedocles (495 BC-430 BC): empirical demonstrations against Parmenides
Leonardo da Vinci (1452-1519)
William Gilbert (1544-1603)
Francis Bacon (1561-1626)
Hacking (1983) says of early experimenter Sir Francis Bacon: "He taught that not only must we observe nature in the raw, but that we must also 'twist the lion's tail', that is, manipulate our world in order to learn its secrets" (p. 149)
Galileo Galilei (1564-1642)
Scientific Revolution in the 17th century

Experimentation

Natural philosophy → Modern science

Part of our ordinary life
What happens to my grades if I study more?
What happens to my weight if I exercise more?
. . .

Natural Philosophy vs. Modern Science

Natural Philosophy | Modern Science
First principles → Theory → Observations to support theory | Theory → Observation → Correction of errors in theory
Passive observation | Deliberate interventions and observation of effects
No control of extraneous influences | Control of extraneous influences (e.g., laboratories)

Randomized Experiments

Random assignment creates two or more groups of units that are probabilistically similar to each other on the average.

Hence, any outcome differences between groups are very likely due to treatment.
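A short simulation can illustrate why randomly assigned groups are similar on average. This is a sketch under assumed values: the covariate `ages`, its distribution, and the sample size of 10,000 are hypothetical.

```python
import random
import statistics

random.seed(42)

# Hypothetical pre-treatment covariate (e.g., age) for 10,000 units.
ages = [random.gauss(40, 12) for _ in range(10_000)]

# Random assignment: a fair coin toss decides each unit's group.
treated, control = [], []
for age in ages:
    (treated if random.random() < 0.5 else control).append(age)

# With many units, the group means of the covariate should be close.
mean_gap = statistics.mean(treated) - statistics.mean(control)
print(f"treated n={len(treated)}, control n={len(control)}, mean age gap={mean_gap:.3f}")
```

The same logic applies to covariates that were never measured, which is what makes random assignment attractive.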

Causal Description vs. Causal Explanation

Randomized experiments are the "Cadillac" of research designs for describing the effects of manipulations (molar causation)

Randomized experiments do not help much in explaining the mechanisms through which and the conditions under which the causal relationship holds (molecular causation)

Mediators and Moderators

A mediator is a variable that accounts for all or some of the observed relationship between a predictor (the presumed cause) and an outcome.

A moderator is a variable that affects the strength or direction of the relationship between a predictor and an outcome; in other words, the effect of the predictor on the outcome depends on the level of the moderator.
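Moderation can be sketched with simulated data in which the slope of the outcome on the predictor differs by moderator level. The variables `x`, `m`, `y` and the slopes of 1.0 and 3.0 are assumptions chosen for illustration; the example covers moderation only, not mediation.

```python
import random

random.seed(1)

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Simulated data: the effect of x on y is 1.0 when m == 0 and 3.0 when m == 1.
data = []
for _ in range(5_000):
    x = random.gauss(0, 1)
    m = random.randint(0, 1)
    y = (1.0 + 2.0 * m) * x + random.gauss(0, 1)
    data.append((x, m, y))

# Estimate the slope separately at each level of the moderator.
slope_m0 = slope([x for x, m, _ in data if m == 0], [y for _, m, y in data if m == 0])
slope_m1 = slope([x for x, m, _ in data if m == 1], [y for _, m, y in data if m == 1])
print(f"slope when m=0: {slope_m0:.2f}, slope when m=1: {slope_m1:.2f}")
```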
Quasi-Experiments

Unlike with randomized experiments, in quasi-experiments assignment to conditions is by means of:
Self-selection: units choose treatment for themselves (e.g., individuals voluntarily enrolling in job training programs)
Selection by non-random mechanisms: the decision of who gets which treatment is based on some non-random criterion (e.g., a school principal allocating teachers to classes)
Treatment and control groups may differ in many systematic (non-random) ways other than the presence of the treatment

Quasi-Experiments (cont'd)

Systematic differences between treatment and control groups may constitute potential alternative explanations for the observed effect
We try to rule out alternative explanations using:
Theory
Logic
Design
Measurement

Quasi-Experiments (cont'd)

The ruling out of alternative explanations is related to a falsificationist logic (Popper 1959)
Many confirming observations are not sufficient to prove a hypothesis
One single disconfirming observation is sufficient to falsify a hypothesis
Scientists should try to falsify the conclusions they wish to draw

Natural Experiments

Study of a natural setting that appears to assign a treatment in a reasonably random manner
Often treatment is not manipulable (e.g., earthquake)
Example (Card and Krueger 1993)
In 1992, New Jersey raised its state-mandated minimum wage from $4.25 to $5.05.
RQ: What was the effect of a minimum wage increase on the employment of low-skill teenagers in the fast food industry?

Natural Experiments (cont'd)

Example (Card and Krueger 1993)
Treatment group: fast food workers in NJ
Control group: fast food workers in neighboring Pennsylvania, which did not increase its minimum wage
Rationale:
NJ and PA are similar states
Teenagers' families' decisions to live in one or the other are very unlikely to be correlated with NJ's decision to raise its minimum wage in 1992
Two sources of counterfactual evidence:
Treated vs. non-treated (NJ vs. Pennsylvania)
Before vs. after treatment (before vs. after the minimum wage increase)
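The two sources of counterfactual evidence can be combined in a difference-in-differences calculation, a name the slides themselves do not use. The sketch below uses invented employment figures purely for illustration; they are not Card and Krueger's estimates.

```python
# Hypothetical average employment per store (invented numbers, NOT the study's data).
employment = {
    ("NJ", "before"): 30.0,
    ("NJ", "after"): 32.0,
    ("PA", "before"): 28.0,
    ("PA", "after"): 29.0,
}

# Before vs. after change within each state.
change_nj = employment[("NJ", "after")] - employment[("NJ", "before")]
change_pa = employment[("PA", "after")] - employment[("PA", "before")]

# Difference-in-differences: change in the treated state minus change in the comparison state.
did_estimate = change_nj - change_pa
print(did_estimate)
```

The comparison state's change stands in for what would have happened in the treated state without the wage increase, which holds only if both states would otherwise have followed similar trends.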
(Card and Krueger) vs. (Neumark and Wascher)

Correlational Studies

Synonyms: non-experimental or passive-observational designs
Counterfactual inference is difficult due to the lack of:
Pretests and
Control groups
Causal claims are particularly problematic when:
We don't know all plausible alternative explanations
Alternative explanations cannot be measured
Statistical models are not well-specified

Confounds and Spurious Correlations

Source: http://tylervigen.com/spurious-correlations, last accessed on Sept. 6, 2016
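A confound can be sketched with simulated data in which a third variable drives two otherwise unrelated variables. The variables `z`, `x`, `y` and their interpretation (temperature, ice-cream sales, swimming accidents) are hypothetical and are not taken from the source cited above.

```python
import random

random.seed(7)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

n = 5_000
z = [random.gauss(0, 1) for _ in range(n)]      # confounder, e.g., temperature
x = [zi + random.gauss(0, 1) for zi in z]       # e.g., ice-cream sales
y = [zi + random.gauss(0, 1) for zi in z]       # e.g., swimming accidents

# x and y are correlated although neither causes the other.
raw_corr = pearson(x, y)

# Subtracting the confounder's contribution (known here by construction)
# leaves only independent noise, so the correlation disappears.
adj_corr = pearson([xi - zi for xi, zi in zip(x, z)],
                   [yi - zi for yi, zi in zip(y, z)])
print(f"raw: {raw_corr:.2f}, after removing z: {adj_corr:.2f}")
```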

Formal Statistical Inference

The process of drawing conclusions about a population based on sample data
Practical questions:
How much uncertainty is associated with sample data?
Do my results constitute strong evidence or just a lucky draw/chance finding?
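One way to approach the "lucky draw" question is a permutation test, shown here as a sketch rather than as the method the course prescribes. The two groups of scores are invented for illustration.

```python
import random

random.seed(3)

def mean(values):
    return sum(values) / len(values)

# Hypothetical outcome scores for two small groups.
treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5, 16.0, 14.4]
control = [11.0, 12.5, 11.8, 13.1, 12.2, 11.5, 12.9, 12.0]
observed_gap = mean(treated) - mean(control)

# If group labels were irrelevant, reshuffling them should often
# produce a gap at least as large as the observed one.
pooled = treated + control
n_treated = len(treated)
n_permutations = 10_000
extreme = 0
for _ in range(n_permutations):
    random.shuffle(pooled)
    gap = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
    if abs(gap) >= abs(observed_gap):
        extreme += 1

# Share of reshuffles at least as extreme as the observed difference.
p_value = extreme / n_permutations
print(f"observed gap={observed_gap:.3f}, permutation p-value={p_value:.4f}")
```

A small p-value indicates that a gap this large would rarely arise from chance relabeling alone.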

Validity

Validity is the approximate truth of an inference
Validity is a property of inferences, not a property of designs
We will study:
Four types of validity
Threats to each type of validity
Possible remedies

Four Types of Validity (SKC definitions)

Statistical Conclusion Validity: The validity of inferences about the correlation (covariation) between treatment X and outcome Y
Internal Validity: The validity of inferences about whether observed covariation between X (the presumed treatment) and Y (the presumed outcome) reflects a causal relationship from X to Y as those variables were manipulated or measured
Construct Validity: The validity of inferences about the higher order constructs that represent sampling particulars
External Validity: The validity of inferences about whether the cause-effect relationship holds over variation in persons, settings, treatment variables, and measurement variables
Source: Shadish et al. 2002, p. 38

Four Types of Validity (More intuitive)

Statistical Conclusion Validity: Is the use of statistics appropriate to infer whether X and Y covary?
Internal Validity: Does observed covariation between X and Y result from a causal relationship?
Construct Validity: Are we actually measuring the concepts that we want to measure?
External Validity: Does the causal relationship between X and Y hold over varied persons, treatments, outcome measures, and settings?

Random Sampling

Simple random sampling is the basic sampling technique where we select a group of subjects (a sample) for study from a larger group (a population)
Each individual is chosen by chance and each member of the population has an equal chance of being included in the sample
Every possible sample of a given size has the same chance of selection
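The claim that every possible sample has the same chance of selection can be checked empirically. The sketch below assumes a tiny hypothetical population of four units and uses `random.sample` from Python's standard library to draw samples without replacement.

```python
import random
from collections import Counter
from itertools import combinations

random.seed(11)

population = ["A", "B", "C", "D"]   # hypothetical population of four units
sample_size = 2

# Draw many simple random samples and count how often each distinct sample occurs.
draws = 60_000
counts = Counter(
    tuple(sorted(random.sample(population, sample_size))) for _ in range(draws)
)

# There are 6 possible samples of size 2; each share should be close to 1/6.
all_samples = list(combinations(population, sample_size))
shares = {s: counts[s] / draws for s in all_samples}
for s, share in shares.items():
    print(s, round(share, 3))
```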

Random Assignment

An aspect of an experimental design in which study participants are assigned to the treatment or control group using a random procedure
Random assignment creates two or more groups of units that are probabilistically similar to each other on the average
Hence, any outcome differences between groups are very likely due to treatment

Random Sampling vs. Random Assignment

| | RANDOM ASSIGNMENT | NO RANDOM ASSIGNMENT | |
| RANDOM SAMPLING | High Internal Validity, High External Validity | Low Internal Validity, High External Validity | Generalizability to Population |
| NO RANDOM SAMPLING | High Internal Validity, Low External Validity | Low Internal Validity, Low External Validity | No Generalizability to Population |
| | Causation | Correlation | |

Main Types of Data

Cross-sectional data: multiple cases (such as individuals, firms, countries, or regions) are observed at the same point in time, or without regard to differences in time
Panel data (longitudinal data or cross-sectional time series data): multiple cases (people, firms, countries, etc.) are observed at two or more time periods.
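The difference between the two data types shows up in how the records are laid out. The firms, years, and employee counts below are hypothetical and serve only to illustrate the structure.

```python
from collections import defaultdict

# Cross-sectional data: several cases, each observed once.
cross_section = [
    {"firm": "A", "employees": 120},
    {"firm": "B", "employees": 45},
    {"firm": "C", "employees": 300},
]

# Panel data: the same cases observed in two or more time periods.
panel = [
    {"firm": "A", "year": 2019, "employees": 110},
    {"firm": "A", "year": 2020, "employees": 120},
    {"firm": "B", "year": 2019, "employees": 50},
    {"firm": "B", "year": 2020, "employees": 45},
]

# Collect the periods in which each case was observed.
periods_per_firm = defaultdict(set)
for row in panel:
    periods_per_firm[row["firm"]].add(row["year"])

# A data set qualifies as a panel if every case has at least two periods.
is_panel = all(len(years) >= 2 for years in periods_per_firm.values())
print(is_panel)
```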

Variables, Attributes, Values

VARIABLE: GENDER

ATTRIBUTES: FEMALE, MALE

VALUES: 1, 2
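The relationship between attributes and values amounts to a coding scheme. A minimal sketch follows; the list of `responses` is hypothetical.

```python
# Coding scheme from the slide: each attribute of the variable GENDER gets a numeric value.
gender_codes = {"FEMALE": 1, "MALE": 2}

# Hypothetical survey responses, recorded as attributes.
responses = ["FEMALE", "MALE", "MALE", "FEMALE", "FEMALE"]

# Encode attributes as values, then decode them back into attributes.
encoded = [gender_codes[r] for r in responses]
labels = {value: attribute for attribute, value in gender_codes.items()}
decoded = [labels[v] for v in encoded]

print(encoded)
```

The numeric values here are only labels, so arithmetic on them (such as an average of 1.4) has no substantive meaning.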

Levels of Measurement

Nominal: no ordering is implied
e.g., party affiliation, gender, race, university major
Ordinal: the attributes can be rank-ordered, but distances between attributes do not have any meaning
e.g., pain measurement scale: no pain, mild, moderate, severe, worst pain imaginable
Interval: distance between attributes does have meaning, but there is no meaningful zero
e.g., temperature in Celsius: the difference between 80°C and 70°C is the same as the difference between 50°C and 40°C; however, 0°C does not mean "no heat"
Ratio: has all the properties of an interval variable and also has a clear definition of zero, i.e., when the variable equals zero there is none of that variable
e.g., height, weight, number of children in a family, temperature in Kelvin
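The interval versus ratio distinction can be checked with the temperature example. In this sketch the helper `celsius_to_kelvin` is an illustrative name; the conversion adds the standard offset of 273.15.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

# Interval scale: differences are meaningful and comparable.
diff_high = 80 - 70
diff_low = 50 - 40

# Ratios are not meaningful in Celsius: 80°C is not "twice as hot" as 40°C.
ratio_celsius = 80 / 40

# On the Kelvin scale, which has a true zero, the ratio is meaningful.
ratio_kelvin = celsius_to_kelvin(80) / celsius_to_kelvin(40)

print(diff_high == diff_low, ratio_celsius, round(ratio_kelvin, 3))
```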
UTOS

Units

Treatments
Observations made on the units

Setting of the experiment

