1 INTRODUCTION
Finding causal associations between two events or sets of events with relatively low frequency is very useful for various real-world applications. For example, a drug used at an appropriate dose may cause one or more adverse drug reactions (ADRs), although the probability is low. Discovering this kind of causal relationship can help us prevent or correct negative outcomes caused by its antecedents. However, mining these relationships is challenging due to the difficulty of capturing causality among events and the infrequent nature of the events of interest in these applications.
In this paper, we employ a knowledge-based approach to capture the degree of causality of an event pair within each sequence, since the determination of causality is often ultimately application or domain dependent. We then develop an interestingness measure that incorporates the causalities across all the sequences in a database. Our study was motivated by the need to discover ADR signals in postmarketing surveillance, even though the proposed framework can be applied to many different applications. ADRs represent a serious worldwide problem [1, 2]. They can complicate a patient's medical condition or contribute to increased morbidity, even death. Studies have shown that ADRs contribute to about 5% of all hospital admissions and represent the fifth most common cause of death in hospitals [3].
Even though premarketing clinical trials are required for all new drugs before they are approved for marketing, these trials are necessarily limited in sample size and duration, and thus are not capable of detecting rare ADRs. In general, an ADR cannot be recognized by these trials if its occurrence rate is less than 0.1% [1]. Therefore, drug safety depends heavily on postmarketing surveillance; that is, the monitoring of the impacts of medicines once they have been made available to consumers. In the U.S., current postmarketing surveillance methods primarily rely on the FDA's spontaneous reporting system, MedWatch™. Because ADR reports are filed at the discretion of the users of the system, there is gross underreporting
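$$
S_L(c_i, c_i') =
\begin{cases}
0, & \text{if } c_i \text{ or } c_i' \text{ is unknown} \\
\mathrm{overlap}(c_i, c_i'), & \text{if cue } i \text{ is nominal} \\
1 - \mathrm{normalized\_diff}(c_i, c_i'), & \text{if cue } i \text{ is quantitative} \\
\mathrm{fuzzy\_sim}(c_i, c_i'), & \text{if cue } i \text{ is a fuzzy value}
\end{cases}
\tag{1}
$$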
In the above definition, the cue similarity is set to 0 if either of the cue values is unknown. The function overlap() is defined as:
$$
\mathrm{overlap}(c_i, c_i') =
\begin{cases}
1, & \text{if } c_i = c_i' \\
0, & \text{otherwise}
\end{cases}
\tag{2}
$$
That is, for nominal cue values, the similarity is 1 if the cue values are equal; otherwise it is 0. Computing the similarity for quantitative cue values involves the distance measure normalized_diff:
$$
\mathrm{normalized\_diff}(c_i, c_i') = \frac{|c_i - c_i'|}{\Delta_i}
\tag{3}
$$
where $\Delta_i = a_i - b_i$ is employed to normalize the cue difference, and $a_i$ and $b_i$ are the maximum and minimum values for cue i, respectively.
The last expression in the above formula, $\mathrm{fuzzy\_sim}(c_i, c_i')$, deals with cues represented by fuzzy sets. Let cue i be defined on the universe of discourse X and x be an element of X. In this case, fuzzy cue values $c_i$ and $c_i'$ are two fuzzy sets whose membership functions are defined in terms of x. Their similarity could be defined on the basis of possibility measures on the [0, 1] interval [51]:
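$$
\mathrm{fuzzy\_sim}(c_i, c_i') =
\begin{cases}
\mathrm{poss}(c_i, c_i'), & \text{if } \mathrm{poss}(\bar{c}_i, c_i') < 0.5 \\
\left(1.5 - \mathrm{poss}(\bar{c}_i, c_i')\right) \, \mathrm{poss}(c_i, c_i'), & \text{otherwise}
\end{cases}
\tag{4}
$$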
where
$$
\mathrm{poss}(c_i, c_i') = \max_{x \in X} \left( \min\left( \mu_{c_i}(x), \mu_{c_i'}(x) \right) \right)
\tag{5}
$$
Here, $\mu_{c_i}(x)$ and $\mu_{c_i'}(x)$ are the membership functions of fuzzy sets $c_i$ and $c_i'$, respectively. The min() and max() are standard fuzzy operators, and hence $\max_{x}\left(\min\left(\mu_{c_i}(x), \mu_{c_i'}(x)\right)\right)$ computes the maximum of all the minimums between pairs $\mu_{c_i}(x)$ and $\mu_{c_i'}(x)$ for all the elements x. $\bar{c}_i$ is the complement of $c_i$ (i.e., $\mu_{\bar{c}_i}(x) = 1 - \mu_{c_i}(x)$).
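The local similarity functions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the universe of discourse X is approximated by a finite list of sample points, the fuzzy similarity is assumed to return the possibility measure when the possibility of the complement is below 0.5 and to scale it by (1.5 minus that possibility) otherwise, and the helper `triangular` as well as all function names are hypothetical and not taken from the paper.

```python
from typing import Callable, Iterable

Membership = Callable[[float], float]  # maps x in X to a grade in [0, 1]


def overlap(c, c_prime) -> float:
    """Nominal cues: 1 if the two values are equal, 0 otherwise."""
    return 1.0 if c == c_prime else 0.0


def normalized_diff(c: float, c_prime: float, a: float, b: float) -> float:
    """Quantitative cues: absolute difference scaled by the range a - b (max - min)."""
    return abs(c - c_prime) / (a - b)


def poss(mu_1: Membership, mu_2: Membership, xs: Iterable[float]) -> float:
    """Possibility measure: max over sampled x of min(mu_1(x), mu_2(x))."""
    return max(min(mu_1(x), mu_2(x)) for x in xs)


def fuzzy_sim(mu_c: Membership, mu_cp: Membership, xs: Iterable[float]) -> float:
    """Fuzzy cues (assumed form): poss if the complement's possibility is < 0.5."""
    xs = list(xs)
    p = poss(mu_c, mu_cp, xs)
    # Complement of c has membership 1 - mu_c(x).
    p_bar = poss(lambda x: 1.0 - mu_c(x), mu_cp, xs)
    return p if p_bar < 0.5 else (1.5 - p_bar) * p


def triangular(left: float, peak: float, right: float) -> Membership:
    """Hypothetical helper: a triangular membership function."""
    def mu(x: float) -> float:
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu
```

With this sketch, two identical triangular sets sampled on the integers 0 to 10 have similarity 1, while two disjoint sets have similarity 0. A finer sampling grid would approximate the maximum over X more closely.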
The similarity between two sets of cue values V and V' is called the global similarity $S_G(V, V')$. It is defined as the weighted sum of all the local similarities with respect to each pair of cue values. That is,
$$
S_G(V, V') = \frac{\sum_{i=1}^{n} w_i \, S_L(c_i, c_i')}{\sum_{i=1}^{n} w_i}
\tag{6}
$$
where $w_i \in [0, 1]$ is the weight for cue i, which represents the relative significance of the cue and is assigned by the user.
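As a small illustration of the global similarity, the following sketch computes the normalized weighted sum of local similarities. The function name and the choice to raise an error when all weights are zero are assumptions made for the example.

```python
from typing import Sequence


def global_similarity(local_sims: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of local similarities divided by the sum of the weights."""
    if len(local_sims) != len(weights):
        raise ValueError("one weight is required per cue")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption: a similarity with all-zero weights is treated as undefined.
        raise ValueError("at least one weight must be positive")
    weighted = sum(w * s for w, s in zip(weights, local_sims))
    return weighted / total_weight
```

For instance, local similarities 1.0, 0.0, and 0.5 with weights 1.0, 1.0, and 2.0 give (1.0 + 0.0 + 1.0) / 4.0 = 0.5.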
The above global similarity is used to find the best-matching experience, whose associated causality category can characterize the causal association of a pair of interest in a particular event sequence. We use this approach to obtain a similarity value between the current pair and each of the experiences. After that, these similarity values are normalized so that their sum is equal to 1. These normalized values are then used to represent the membership values of the corresponding categories for the pair of interest in a particular sequence. For example, if the normalized similarity value between a pair and the experience exhibiting "possible" association is 0.6, we would say the causality of the pair is "possible" with a membership value of 0.6 in that sequence. If we use d and s to represent a drug and a symptom, respectively, Table 2 gives a couple of sample drug-symptom pairs as well as their membership values related to each causality category for the patient cases that contain the pairs. Note that, given a particular drug-symptom pair, there may exist none, one, or multiple patient cases that contain this pair. On the other hand, the same patient case may contain multiple different pairs. Patient 4 in the table contains both <d1, s2> and <d3, s5>. C<X,Y> within one patient case is equal to the weighted sum of the membership values of all causality categories.
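The normalization step described above can be sketched as follows. The category label "probable" and the raw similarity values are placeholders invented for the example, and returning all-zero memberships when every similarity is zero is an assumption, since the text does not say how that case is handled.

```python
from typing import Dict


def to_memberships(similarities: Dict[str, float]) -> Dict[str, float]:
    """Scale per-experience similarity values so that they sum to 1."""
    total = sum(similarities.values())
    if total == 0:
        # Assumption: no experience matches at all, so every membership is 0.
        return {category: 0.0 for category in similarities}
    return {category: value / total for category, value in similarities.items()}


# Hypothetical raw global similarities between one pair and four experiences.
raw = {"very likely": 0.2, "probable": 0.4, "possible": 1.2, "unlikely": 0.2}
memberships = to_memberships(raw)
```

Here the "possible" category receives a membership of about 0.6, matching the kind of statement made in the text.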
TABLE 2
MEMBERSHIP OF EACH CAUSALITY CATEGORY FOR SAMPLE SIGNAL PAIRS (PID MEANS PATIENT IDENTIFICATION NUMBER)
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
6 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
In general, if there exist m experiences that classify the causality between X and Y into m distinct categories, the degree of causality is defined as
$$
C_{\langle X,Y \rangle} = \sum_{i=1}^{m} w_i \, \mu_i
\tag{7}
$$
where $\mu_i$ is the membership of the i-th causality category for the pair, and $w_i$ represents the corresponding weight when merging all the causality categories into one. Moreover, $\sum_{i=1}^{m} \mu_i = 1$.
The selection of $w_i$ is based on two considerations. First, causality categories representing stronger causal associations should have higher weights. That is, if we assume the causality levels represented by the m experiences are in decreasing order, then $w_m > \ldots > w_2 > w_1$ must be satisfied, where $w_m$ and $w_1$ correspond to the highest and lowest causality levels, respectively. Second, the range of C<X,Y> should be [0, 1]. That is, C<X,Y> should be 0 for the extreme situation $\mu = \{0, 0, \ldots, 1\}$, where the evidence in a sequence strongly shows unlikely association of the pair. If all the evidence in the patient case supports very likely association (i.e., $\mu = \{1, 0, \ldots, 0\}$), C<X,Y> should be 1. Otherwise, C<X,Y> is between 0 and 1. In general, the weight for the i-th causality category is calculated by
$$
w_i = \frac{i - 1}{m - 1}
\tag{8}
$$
where $1 \le i \le m$. For example, if there are 4 causality categories (i.e., m = 4), then the set of weights for them would be {1.0, 0.667, 0.333, 0.0}. One can see that the C<X,Y> computed using these weights satisfies the above two criteria.
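The weighting scheme can be checked numerically with a short sketch. It assumes that the weight formula has the form w_i = (i - 1) / (m - 1), which reproduces the example weights for m = 4 and gives the lowest category a weight of 0 and the highest a weight of 1. Memberships are passed in index order, from i = 1 (lowest causality level) to i = m (highest); the function names are illustrative only.

```python
from typing import List, Sequence


def category_weights(m: int) -> List[float]:
    """Assumed weights w_i = (i - 1) / (m - 1) for i = 1..m."""
    if m < 2:
        raise ValueError("at least two causality categories are required")
    return [(i - 1) / (m - 1) for i in range(1, m + 1)]


def degree_of_causality(memberships: Sequence[float]) -> float:
    """Weighted sum of memberships; memberships[0] is the lowest category."""
    weights = category_weights(len(memberships))
    return sum(w * mu for w, mu in zip(weights, memberships))
```

With four categories, a pair whose membership lies entirely in the lowest category gets a degree of 0, one whose membership lies entirely in the highest category gets 1, and any mixture falls in between.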
We introduce the causal association rule (CAR), denoted by $X \xrightarrow{C} Y$, meaning that X potentially causes Y. We define the support of a CAR, supp($X \xrightarrow{C} Y$), as the accumulated votes over all sequences. The vote of the j-th sequence is represented by $C^{j}_{\langle X,Y \rangle}$. That is,
$$
\mathrm{supp}(X \xrightarrow{C} Y) = \frac{\sum_{j=1}^{n} C^{j}_{\langle X,Y \rangle}}{N}
\tag{9}
$$
where n is the number of sequences that contain the pair and N is the total number of sequences in the database. The advantage of this definition is that the degree of causality is incorporated into the measure, such that sequences exhibiting higher causality contribute more to the measure.
Based on the above definition of the support of a CAR, we define an interestingness measure called causal-leverage:
$$
\text{causal-leverage}(X \xrightarrow{C} Y) = \mathrm{supp}(X \xrightarrow{C} Y) - \mathrm{supp}(\overset{C}{X}) \times \mathrm{supp}(Y)
\tag{10}
$$
where supp($\overset{C}{X}$) is the proportion of sequences whose votes are greater than 0 with respect to the pair <X,Y>, and supp(Y) is the proportion of sequences that contain Y. This measure extends the traditional leverage measure (i.e., leverage($X \rightarrow Y$) = supp($X \rightarrow Y$) − supp(X) supp(Y)) by incorporating the degree of causality of the pair among all sequences into it.
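To make the two definitions concrete, the sketch below computes the support of a CAR and its causal-leverage from a list of per-sequence votes. It assumes that causal-leverage subtracts the product of the two marginal supports from the CAR support, in the same way as traditional leverage; the argument names are hypothetical.

```python
from typing import Sequence


def car_support(votes: Sequence[float], n_sequences: int) -> float:
    """Accumulated votes of the sequences containing the pair, divided by N."""
    return sum(votes) / n_sequences


def causal_leverage(votes: Sequence[float], n_with_y: int, n_sequences: int) -> float:
    """CAR support minus (share of sequences with vote > 0) times supp(Y)."""
    supp_xy = car_support(votes, n_sequences)
    supp_x_causal = sum(1 for v in votes if v > 0) / n_sequences
    supp_y = n_with_y / n_sequences
    return supp_xy - supp_x_causal * supp_y
```

For example, with votes 1.0, 0.5, and 0.0 from three sequences containing the pair, 4 sequences containing Y, and N = 10, the support is 0.15 and the causal-leverage is 0.15 - 0.2 × 0.4 = 0.07.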
With a better capability of capturing causal associations, the causal-leverage measure can be employed to mine the causal association of pair <X,Y> given a collection of event sequences. However, its ability to discover the causal associations between drugs and their potential ADRs is still limited because of the complexity and special characteristics of electronic patient data. Recall that the degree of causal association of a pair within a particular patient case is determined by four cues: temporal association, rechallenge, dechallenge, and other explanations. One can see that three of them are time-related. For a particular pair, their values can be abstracted from a specific patient case using fuzzy rules, and these values collectively determine the degree of causal association of the pair in the patient case. However, the obtained cue values may not really represent the actual causal association of the pair due to the complexity of the electronic patient data. One of the complexities is that frequently prescribed drugs are easily temporally associated with any symptom by coincidence. For example, if the hazard period T is equal to 90 days and the start date of a drug appears once every two months within the same case, then the drug has a temporal association with any symptom that exists in the patient case. Similarly, a particular drug tends to coincidentally form temporal associations with symptoms (e.g., fever) caused by common diseases. Note that symptoms of a potential ADR cannot be differentiated from those of an underlying disease, both of which are encoded using the same ICD-9 codes in electronic patient databases. For a drug-symptom pair, the associated symptom is possibly caused either by a drug or by a disease after the drug is started. If both a drug and a symptom have high frequencies in a health database, they can even easily form rechallenge by coincidence. As a result, the obtained causal-leverage value can be high for the pair even though this value does not imply any degree of causality. Therefore, it is necessary to find an exclusion mechanism to reduce the undesirable effects caused by high-frequency events (e.g., either drugs or symptoms).
Intuitively, if a high-frequency event Y randomly occurs after another event X within time period T, then it should have the same chance of also occurring before X within the same period T among many event sequences. Case 1 in Fig. 2 demonstrates a pattern where the symptom s1 occurs before the drug d1 within time period T. A reasonable explanation of this pattern is that the underlying disease of the symptom s1 causes the prescription of the drug d1. Case 2 indicates a pattern where the symptom s1 occurs after the drug d1 within time period T. If we look at Case 2 only, a reasonable explanation could be that the symptom s1 represents a possible ADR caused by the drug d1, since they exhibit a reasonable temporal association. However, if we look at these two patient cases together, it is more likely that the symptom s1 simply occurs randomly. In this case, both the pair <d1, s1> and its reversed pair <s1, d1> exhibit temporal associations in various patient cases. In some cases, the above two patterns can coexist in the same patient case, as shown in Case 3. For such a pattern, there exist several possible explanations. First, d1 is caused by the first s1 and the second s1 occurs coincidentally. Second, the second s1 is caused by d1 and the first s1 occurs coincidentally. Third, s1 and d1 simply occur randomly without any causal association. A more realistic patient case can exhibit an even more complex pattern, as shown in Case 4. It denotes that d1 has temporal and rechallenge relationships with s1, since their temporal association occurs twice over time. However, this pattern can be explained differently. For example, the second d1 occurs after the first s1, which can be explained as the underlying disease of the symptom causing the prescription of the drug. The temporal association between s1 and d1 also occurs twice, and they also exhibit a rechallenge relationship. Considering all four types of patterns together, if drugs and/or symptoms occur randomly and frequently, they may falsely form temporal association and rechallenge relationships. However, the number of cases exhibiting these relationships for a pair <drug, symptom> and that for its reversed pair (i.e., <symptom, drug>) should be similar.
Fig. 2. Illustration of patient cases that exhibit different types of event patterns.
JI ET AL.: MINING INFREQUENT CAUSAL ASSOCIATIONS IN ELECTRONIC HEALTH DATABASES 7
Given the above observations, we define a reverse causal-leverage of a CAR, denoted as reverse causal-leverage($X \xrightarrow{C} Y$), with respect to pair <X,Y>:
$$
\text{reverse causal-leverage}(X \xrightarrow{C} Y) = \text{causal-leverage}(Y \xrightarrow{C} X) = \mathrm{supp}(Y \xrightarrow{C} X) - \mathrm{supp}(\overset{C}{Y}) \times \mathrm{supp}(X)
\tag{11}
$$
where supp($Y \xrightarrow{C} X$) is the support of the CAR $Y \xrightarrow{C} X$, supp($\overset{C}{Y}$) is the proportion of sequences whose votes are greater than 0 with respect to the pair <Y,X>, and supp(X) is the proportion of sequences that contain X.
The measure represents the degree of reverse causality between X and Y. That is, it denotes the strength with which Y causes X. Intuitively, if X and Y are randomly distributed among the patient cases in a database and do not have any causal relationship, causal-leverage($X \xrightarrow{C} Y$) should have the same or a similar value as reverse causal-leverage($X \xrightarrow{C} Y$), no matter how high the frequency of the pair is. Otherwise, if X causes Y, the former should be greater than the latter. On the other hand, if Y causes X, then the former should be smaller than the latter. Hence, we define an exclusive causal-leverage measure as below:
$$
\begin{aligned}
\text{exclusive causal-leverage}(X \xrightarrow{C} Y)
&= \text{causal-leverage}(X \xrightarrow{C} Y) - \text{reverse causal-leverage}(X \xrightarrow{C} Y) \\
&= \mathrm{supp}(X \xrightarrow{C} Y) - \mathrm{supp}(\overset{C}{X}) \times \mathrm{supp}(Y) - \left[ \mathrm{supp}(Y \xrightarrow{C} X) - \mathrm{supp}(\overset{C}{Y}) \times \mathrm{supp}(X) \right]
\end{aligned}
\tag{12}
$$
If this measure has a positive value, it denotes that X has causality with Y. The bigger the value, the stronger the causal association. If its value is negative, it indicates a reverse causality. A value close to zero is a sign of no causal association between X and Y. Thus, this measure not only provides a way to quantify the degree of causal association but also excludes undesirable effects caused by high-frequency events that occur randomly.
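The sign behavior described above can be illustrated with a minimal sketch. It assumes that the exclusive measure is the causal-leverage of the pair minus the causal-leverage of the reversed pair, with each causal-leverage computed as the CAR support minus the product of the two marginal supports. The vote lists and counts are made-up inputs, and all names are hypothetical.

```python
from typing import Sequence


def causal_leverage(votes: Sequence[float], n_consequent: int, n_sequences: int) -> float:
    """CAR support minus (share of sequences with vote > 0) times supp(consequent)."""
    supp_rule = sum(votes) / n_sequences
    supp_causal = sum(1 for v in votes if v > 0) / n_sequences
    return supp_rule - supp_causal * (n_consequent / n_sequences)


def exclusive_causal_leverage(votes_xy: Sequence[float], votes_yx: Sequence[float],
                              n_x: int, n_y: int, n_sequences: int) -> float:
    """Causal-leverage of <X,Y> minus causal-leverage of the reversed pair <Y,X>."""
    forward = causal_leverage(votes_xy, n_y, n_sequences)
    reverse = causal_leverage(votes_yx, n_x, n_sequences)
    return forward - reverse
```

When the votes and event counts are the same in both directions, the measure is 0; when only the forward direction collects votes, it is positive; when only the reverse direction does, it is negative.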
4.2 Mining ADR Signal Pairs from Electronic Patient Databases
Drugs and their associated ADRs have causal relationships. In this subsection, we examine how to search for potential ADR signal pairs from an electronic patient database using the above exclusive causal-leverage measure. We assume that patient data are stored in relational tables in a database and can be retrieved using a database language such as Structured Query Language (SQL). These tables are linked through patient identification numbers (PIDs). We also assume that the drug-related data and symptom-related data are stored in two tables called the Patient Drug Table and the Patient Symptom Table, respectively.
Fig. 3 shows an overall picture of the data mining algorithm. First, preprocessing is needed to get two types of information: 1) a list of all drugs D(d1, d2, …, dm) in the database and the support count for each drug; 2) a list of all symptoms S(s1, s2, …, sn) in the database and the support count for each symptom. The lists of drugs and symptoms are needed to form all possible drug-symptom pairs whose causal strengths will be assessed. Since a patient database normally only contains a subset of all drugs on the market and a subset of all symptoms, it is necessary to search the Patient Drug Table and the Patient Symptom Table to get the drugs and symptoms covered by the database. Using the drugs and symptoms discovered in the database (instead of all drugs on the market and all available symptoms) avoids forming unnecessary pairs, which reduces the computational complexity of the data mining algorithm. In addition, the support count for each drug or symptom will be used to calculate the exclusive causal-leverage value for related pairs. Hence, their values also need to be computed.
Fig. 3. Algorithm for mining causal association rules
Fig. 4. Data structures for storing drug/symptom support counts. (a) drug hash table, (b) symptom hash table
Algorithm 1 shows how to search a database (DB) for the list of drugs and the support count for each drug. The discovered drugs and their support counts are stored in a hash table, as shown in Fig. 4a. Initially, the hash table is empty. We use Dk to represent the set of drugs taken by the k-th patient pk in the database. For each patient, we first retrieve all the drugs taken by the patient. For each of these drugs, we then check whether the hash table contains the drug. If the hash table does not contain the drug, the drug's support count is set to 1, and both the drug and its support count are added to the hash table. Otherwise, the drug's support count is increased by 1. After all the patient cases are searched, the hash table is returned. Note that, when calculating the support count for a drug, it is counted only once for one patient case even if the drug appears several times in that patient case. One can see that the computational complexity of this searching process is affected by the total number of patients N in the database and the average number of drugs taken by a patient. The latter is normally determined by the type of patients in the database. For instance, elderly patients often have multiple diseases and thus may take multiple drugs either at the same time or different times.
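The support-counting step described above can be sketched with a Python dictionary playing the role of the hash table. The input format (a mapping from a patient identifier to the list of drugs recorded for that patient) and the function name are assumptions made for illustration; in the paper the data are read from a relational database.

```python
from typing import Dict, Iterable, Mapping


def count_drug_support(patient_drugs: Mapping[str, Iterable[str]]) -> Dict[str, int]:
    """Count, for each drug, the number of patient cases that contain it."""
    support: Dict[str, int] = {}          # the hash table, initially empty
    for _pid, drugs in patient_drugs.items():
        # set() ensures a drug is counted once per patient case,
        # even if it was dispensed several times.
        for drug in set(drugs):
            if drug not in support:
                support[drug] = 1         # first patient case with this drug
            else:
                support[drug] += 1
    return support


# Hypothetical records for two patients.
records = {
    "p1": ["enalapril", "enalapril", "pravastatin"],
    "p2": ["enalapril"],
}
drug_support = count_drug_support(records)
```

The same function could be applied to symptom records to build the symptom hash table.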
The algorithm used to search the Patient Symptom Table for a list of symptoms covered by the database, as well as the support count for every symptom (Fig. 4b), is similar to Algorithm 1. If users are only interested in mining the potential ADRs of a particular drug or a couple of drugs, they can specify the drugs of interest. Similarly, users can also specify the list of symptoms if they want to analyze which drugs can cause the symptoms of interest. In both cases, however, the Patient Drug Table and the Patient Symptom Table still need to be searched in order to get the support count for each drug or symptom.
After getting the list of drugs D(d1, d2, …, dm) and the list of symptoms S(s1, s2, …, sn), the next step is to generate all the possible pairs, each of which represents a CAR. Algorithm 2 shows the process for pair generation and evaluation. Please note that most existing data mining methods mine all interesting association rules that combine all possible events or items in a database. We are only interested in mining drug-symptom pairs, which can be easily generated given D and S. That is, we are not interested in other patterns such as drug-drug pairs, symptom-symptom pairs, or combinations of multiple drugs and symptoms. Thus, our algorithm generates far fewer candidate rules, which implies much lower complexity. The complexity of this pair generation process is O(mn), where m and n are the number of drugs and symptoms, respectively. In addition, since we are interested in mining infrequent patterns, it is inappropriate to prune pairs using the support measure (i.e., support > minsupp). However, in postmarketing surveillance, a signal pair generated by a data mining method is generally not considered valid if only one or two patient cases contain the pair [9]. Therefore, we utilize a minimum support count mincount = 3 (instead of minsupp) to further reduce the number of pairs that will be evaluated. Specifically, we retrieve the PIDs of those patient cases that contain both the drug and the symptom of each pair. This is done by sending a SQL statement to query the database. Modern database management systems can utilize optimization techniques such as indexing to speed up this process. If the number of PIDs that support a pair is greater than or equal to mincount (Line 4 of Algorithm 2), the pair is evaluated. To evaluate the strength of the causal association of the pair, its causal-leverage value is first computed. Then, the pair is reversed. That is, the pair <di, sj> becomes <sj, di>. The reverse causal-leverage value of pair <di, sj> is equal to the causal-leverage value of its reversed pair <sj, di>. After that, the exclusive causal-leverage value of the pair is computed by subtracting its reverse causal-leverage value from its normal causal-leverage value.
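The candidate generation and mincount filter can be sketched as follows. Instead of issuing a SQL query per pair as the paper does, this illustration assumes the PIDs for each drug and each symptom are already held in memory as sets, so the supporting PIDs of a pair are obtained by set intersection. All names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def candidate_pairs(drug_pids: Dict[str, Set[int]],
                    symptom_pids: Dict[str, Set[int]],
                    mincount: int = 3) -> List[Tuple[str, str, Set[int]]]:
    """Form all drug-symptom pairs and keep those with at least mincount shared PIDs."""
    kept = []
    # O(m * n) pairs for m drugs and n symptoms.
    for drug, symptom in product(drug_pids, symptom_pids):
        shared = drug_pids[drug] & symptom_pids[symptom]   # PIDs containing both events
        if len(shared) >= mincount:
            kept.append((drug, symptom, shared))
    return kept


# Hypothetical PID sets.
drugs = {"d1": {1, 2, 3, 4}, "d2": {1}}
symptoms = {"s1": {2, 3, 4, 5}, "s2": {1, 2}}
pairs = candidate_pairs(drugs, symptoms)
```

Only the pair (d1, s1) survives here, because it is the only one supported by three or more patient cases. Each surviving pair would then be scored in both directions to obtain its exclusive causal-leverage.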
Algorithm 3 shows how to compute the causal-leverage value of a general pair between events X and Y. Both X and Y could be either a drug event or a symptom event. First, the drug or symptom hash table is searched in order to get the support count for event Y. Then, for each PID that supports the pair, a process called cue abstraction is used to extract a set of cue values V from the related patient case. Specifically, a list of drug start dates and a list of symptom dates are retrieved from the Patient Drug Table and the Patient Symptom Table, respectively. Note that, since each patient case records the patient's history for a long period of time, the same drug may be prescribed many times, and thus multiple drug start dates may exist within one patient case. Similarly, there may exist multiple symptom dates for the symptom within the same patient case. Therefore, we must compare each start date of the drug with each symptom date of the symptom in order to obtain the temporal patterns of interest for the pair within the case. The complexity of this process is O(Ld Ls), where Ld and Ls are the length of the list of drug start dates and the length of the list of symptom dates, respectively. Ld and Ls often depend on the characteristics of the patient. For example, a patient with chronic diseases tends to take the same drug many times and to have repeated symptoms. After getting the temporal patterns, cue values for temporal association, rechallenge, and dechallenge are derived from these patterns using fuzzy rules. Note that, in order to make our algorithm more generic, the cue other explanations was not utilized in this study since it is drug-dependent. Readers are referred to our previous paper for the specific fuzzy rules that were used in this process [16].
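The pairwise date comparison can be sketched as a nested loop over the two date lists. The sketch treats a symptom date as temporally associated with a drug start date when it falls strictly after the start date and within the hazard period T; the exact boundary handling, the function name, and the sample dates are assumptions, and the fuzzy rules that turn such patterns into cue values are not reproduced here.

```python
from datetime import date, timedelta
from typing import List, Sequence, Tuple


def temporal_matches(drug_starts: Sequence[date],
                     symptom_dates: Sequence[date],
                     hazard_days: int = 90) -> List[Tuple[date, date]]:
    """Return (drug start, symptom date) pairs with the symptom within T days after the start."""
    window = timedelta(days=hazard_days)
    matches = []
    # Every start date is compared with every symptom date: O(Ld * Ls).
    for start in drug_starts:
        for onset in symptom_dates:
            if timedelta(0) < onset - start <= window:
                matches.append((start, onset))
    return matches


# Hypothetical dates from one patient case.
starts = [date(2008, 1, 1), date(2008, 6, 1)]
onsets = [date(2007, 12, 20), date(2008, 2, 1), date(2008, 7, 15)]
found = temporal_matches(starts, onsets)
```

In this example two matches are found, one after each drug start date, which is the kind of repeated pattern the rechallenge cue is concerned with.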
After the set of cue values V of the pair is extracted from the related patient case, similarity values are computed between V and each set of cue values V' in the experience knowledge base (EKB). This step is also called feature matching in the Fuzzy RPD model. These similarity values are then normalized so that they are transformed to the format shown in Table 2. After that, the degree of causality C<X,Y> within the current patient case is computed. If C<X,Y> is greater than 0, it is added to the accumulated votes, and the number of cases whose votes are greater than 0 is increased by one. After the above computation is done for all the supporting cases of the pair, the causal-leverage value of the pair is computed (refer to Eq. (10)) and returned.
Finally, after all these values are computed, we rank all the pairs in decreasing order according to their exclusive causal-leverage values. The higher the value, the more likely a causal association exists in the pair or CAR.
5 EXPERIMENTS
5.1 Experiment Data
Our experiment data were from the Veterans Affairs Medical Center in Detroit, Michigan. We were interested in two classes of drugs: statin drugs and ACEI (angiotensin-converting enzyme inhibitor) drugs. De-identified electronic data of all the patients who received at least one of the drugs of interest during the time period from September 4, 2007 to September 16, 2009 were retrieved. Event data such as dispensing of drugs, office visits, and certain laboratory tests were retrieved for all the patients. For each event, certain details were obtained. For example, the data for dispensing of a drug include the name of the drug, the quantity dispensed, the dose, the drug start date, the drug schedule, and the number of refills.
The retrieved data included 16,206 patients: 15,605 (96.3%) were male, and 601 (3.7%) were female. Their average age was 68.0. The drugs enalapril, pravastatin, and rosuvastatin were selected to test the proposed data mining framework. Pravastatin and rosuvastatin are both statin drugs, and enalapril is an ACEI drug. These three drugs allow us to examine the effectiveness of our data mining framework in identifying potential ADRs for drugs both in the same drug class and in different drug classes. The total numbers of patients who took enalapril, pravastatin, and rosuvastatin at least once are 1021, 1383, and 882, respectively. Note that multiple drugs might be prescribed for the same patient either on the same date or on different dates over time. The total number of distinct ICD-9 codes in the data was 3,954. Each code represented a potential ADR that might be causally associated with any drug.
The data were stored in five relational tables in a Microsoft Access database. These tables contained patients' demographic data, visit data, diagnostic data, drug-related data, and laboratory data. SQL was used to query the tables and obtain the desired information. The query results were returned to Java programs that implemented the data mining algorithm. Fuzzy sets and fuzzy rules were coded using FuzzyJess [52], a Java-based fuzzy inference engine.
5.2 Experiment Results
Table 3 gives the causal-leverage, reverse causal-leverage, and exclusive causal-leverage between the drug enalapril and each of seven ICD-9 codes: 276.7 (hyperkalemia), 366.02 (post subcaps polar cataract), 428.0 (congestive heart failure), 327.27 (central sleep apnea in conditions classified elsewhere), 305.60 (cocaine abuse, unspecified), 275.42 (hypercalcemia), and 525.9 (unspecified disorder of the teeth and supporting structures). The code 276.7 represents a well-known ADR caused by the drug. The table shows that its causal-leverage is much larger than its reverse causal-leverage, and thus its exclusive causal-leverage is close to its causal-leverage. The code 366.02 was considered by our physicians as neither an ADR of enalapril nor a medical condition for taking it. One can see that its causal-leverage and reverse causal-leverage are very close, even though each of them is larger than the causal-leverage of 276.7. The code 428.0 denotes a medical condition for taking enalapril. The table indicates that its causal-leverage is smaller than its reverse causal-leverage, which results in a negative exclusive causal-leverage value. The table also includes four other disease conditions that are not related to the drug enalapril. That is, they are neither ADRs caused by enalapril nor medical conditions that cause the prescription of enalapril. The code 327.27 has the smallest positive exclusive causal-leverage value among all those unrelated codes. Based on the codes 276.7 and 327.27, one can see that the positive range of exclusive causal-leverage values is 2.94E-7 to 1.36E-4 with respect to the drug enalapril. The code 305.60 has the largest positive exclusive causal-leverage value among those unrelated codes. Its reverse causal-leverage is 0, which means that no cases exhibit reasonable temporal patterns for the reverse pair. According to Table 4, the rank of this code is as high as 6. This indicates that some unrelated codes could still be ranked high due to the complexity of the data. The codes 275.42 and 525.9 are simply two randomly selected symptoms that are not related to the drug enalapril. They exhibit properties similar to those of codes 366.02 and 327.27. That is, their causal-leverage and reverse causal-leverage values are relatively large, but the absolute values of their exclusive causal-leverage are small. If these seven ICD-9 codes are ranked using their causal-leverage values, the code 428.0 demonstrates the strongest causal association with the drug enalapril. But if exclusive causal-leverage is used, the code 276.7 (i.e., the known ADR) has the strongest association with the drug. Therefore, the table indicates that our algorithm can indeed effectively remove some undesirable effects caused by frequent events.
TABLE 3
COMPARING THE DEGREE OF ASSOCIATION BETWEEN ENALAPRIL AND EACH OF SEVEN ICD-9 CODES UNDER DIFFERENT MEASURES (HAZARD PERIOD T = 90 DAYS)
For each of the three selected drugs, we computed the exclusive causal-leverage values for all the pairs between the drug and each of the 3,954 ICD-9 codes. We ranked the codes in decreasing order according to each pair's exclusive causal-leverage value. The top 10 codes were evaluated by our physicians to determine whether they were likely real ADRs associated with the drug. The physicians made decisions based on the Physicians' Desk Reference [53], literature reporting, and their clinical experience.
We also ranked the association between the drug and each ICD-9 code using three other measures: causal-leverage, traditional leverage without considering causality, and risk ratio. Risk ratio is defined as the probability that a case coexists with the ICD-9 code relative to the probability that a non-case coexists with the same ICD-9 code. Risk ratio and leverage represent two typical frequency-based interestingness measures. The hazard period was set to T = 90 days when causal-leverage and exclusive causal-leverage were calculated. The value of this parameter needs to be chosen carefully: if it is too large, some unrelated symptoms could form false temporal relationships with the drug; if it is too small, real signals may be excluded because some ADRs only happen after a drug has been taken for as long as tens of days. After balancing these two situations, our physicians considered 90 days an appropriate value for this parameter.
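The risk ratio used as a baseline above can be written as a ratio of two proportions. The sketch assumes that a "case" is a patient who took the drug and a "non-case" is a patient who did not; this reading, the function name, the counts, and the choice to return infinity when no non-case has the code are all assumptions for illustration.

```python
def risk_ratio(cases_with_code: int, total_cases: int,
               noncases_with_code: int, total_noncases: int) -> float:
    """Probability of the code among cases divided by its probability among non-cases."""
    p_case = cases_with_code / total_cases
    p_noncase = noncases_with_code / total_noncases
    if p_noncase == 0:
        # Assumption: report an unbounded ratio when the code never occurs in non-cases.
        return float("inf")
    return p_case / p_noncase
```

For example, if 10 of 100 cases and 5 of 500 non-cases have the code, the risk ratio is 0.1 / 0.01, that is, about 10. Because this measure depends only on co-occurrence counts, it does not use the temporal order of the drug and the symptom.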
Table 4 lists the top 10 ICD-9 codes identified by the exclusive causal-leverage measure for the drug enalapril. Eight of them were judged to be real ADRs associated with enalapril with high likelihood. Their ranks under the three other measures are also included in the table. One can see that neither risk ratio nor leverage ranked the eight ICD-9 codes highly. The table also indicates that the results generated by these two measures (i.e., risk ratio and leverage) are much more similar to each other than those generated by the other two measures. Judging by the ranks of the eight ICD-9 codes, the causal-leverage measure performed much better than the risk ratio and leverage measures since it could capture the causal relationship between a pair. Even though the traditional leverage measure obtained comparable or even better ranks for two of the eight ICD-9 codes (i.e., 276.7 and 233.7), it ranked the remaining six ICD-9 codes as low as around 1000. With its capability of reducing the undesirable effects of frequent ICD-9 codes, the exclusive causal-leverage measure achieved the best performance.
TABLE 4
COMPARING RANKS OF ICD-9 CODES GIVEN DRUG ENALAPRIL UNDER FOUR DIFFERENT INTERESTINGNESS MEASURES
TABLE 5
COMPARING RANKS OF ICD-9 CODES GIVEN DRUG PRAVASTATIN UNDER FOUR DIFFERENT INTERESTINGNESS MEASURES (NOS - NOT OTHERWISE SPECIFIED; NEC - NOT ELSEWHERE CLASSIFIED)
The experimental results for the drug pravastatin are given in
Table 5. Seven of the top 10 ICD-9 codes were identified as likely
ADRs by our physicians. The causal-leverage measure gives higher
ranks to all seven ICD-9 codes than both the risk ratio and
traditional leverage measures. Another interesting observation is
that the causal-leverage measure placed three of the seven ICD-9
codes in its own top 10 list. This implies that the measure can
discover some potential ADRs because of its capability of capturing
the causal relationship between a drug and a symptom. Consistently,
the exclusive causal-leverage measure again achieved the best
performance.
Table 6 gives the experimental results for the drug rosuvastatin.
Six of the top 10 ICD-9 codes were found to be likely ADRs of this
drug. In terms of the relative abilities of the four
interestingness measures to discover potential ADRs, these results
are consistent with those obtained for the other two drugs.
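The comparison carried out in Tables 4 to 6 amounts to ranking every ICD-9 code under each measure and then looking up the ranks of the codes of interest. A minimal Python sketch of that bookkeeping is given below. The measure names, code labels, and scores are placeholders invented for illustration; they are not values from the study, and ties are broken arbitrarily by sort order.

```python
def rank_codes(scores):
    """Map each code to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=lambda code: scores[code], reverse=True)
    return {code: position for position, code in enumerate(ordered, start=1)}


def compare_ranks(scores_by_measure, codes_of_interest):
    """For each code of interest, report its rank under every measure."""
    ranks = {name: rank_codes(s) for name, s in scores_by_measure.items()}
    return {
        code: {name: ranks[name][code] for name in ranks}
        for code in codes_of_interest
    }


# Placeholder scores for three hypothetical codes under two measures.
scores_by_measure = {
    "measure_1": {"code_a": 0.9, "code_b": 0.1, "code_c": 0.5},
    "measure_2": {"code_a": 0.2, "code_b": 0.8, "code_c": 0.3},
}
table = compare_ranks(scores_by_measure, ["code_a", "code_c"])
for code, row in table.items():
    print(code, row)
```

A code that a physician confirms as a likely ADR should ideally have a small rank number; a measure under which such codes fall near rank 1000 would rarely surface them for review.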
Since the length of the hazard period T affects the performance of
our exclusive causal-leverage measure, we tested the rankings of
three randomly selected known ADRs of the drug enalapril for
different T values (Table 7). Neither 30 nor 150 days seems to be
an appropriate value for T since some known ADRs were ranked very
low. Similar rankings were achieved when T was set to 60, 90, or
120 days.
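To make the role of T concrete, the Python sketch below counts how many drug-symptom event pairs fall inside the hazard period for several candidate values of T. It is only a crisp temporal filter, not the fuzzy causality computation used by the measure itself, and it rests on assumptions: the function names and dates are hypothetical, and the window is taken to be open at the drug start date and closed at T days.

```python
from datetime import date


def within_hazard_period(drug_start, event_date, t_days):
    """True if the event occurs after drug_start and within t_days of it."""
    delta = (event_date - drug_start).days
    return 0 < delta <= t_days


def count_pairs(records, t_days):
    """Count (drug_start, event_date) pairs inside the hazard period."""
    return sum(
        within_hazard_period(start, event, t_days) for start, event in records
    )


# Hypothetical (drug start, symptom date) pairs for four patients.
records = [
    (date(2010, 1, 1), date(2010, 1, 20)),   # 19 days after start
    (date(2010, 1, 1), date(2010, 3, 15)),   # 73 days after start
    (date(2010, 1, 1), date(2010, 5, 20)),   # 139 days after start
    (date(2010, 1, 1), date(2009, 12, 25)),  # before the drug was started
]
counts = {t: count_pairs(records, t) for t in (30, 60, 90, 120, 150)}
print(counts)
```

A short window drops the later events, while a long one admits more events that may be unrelated to the drug, which mirrors the trade-off described for choosing T.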
6 DISCUSSION
The above experimental results showed that our exclusive
causal-leverage measure outperformed traditional interestingness
measures like risk ratio and leverage because of its ability to
capture suspected causal relationships and exclude undesirable
effects caused by frequent events. However, due to the complexity,
incompleteness, and potential bias of the data (e.g., tolerable
ADRs may sometimes not be recorded), some ICD-9 codes may still be
falsely ranked high based on exclusive causal-leverage. That is,
false associations cannot be completely avoided using our method.
Note that, in postmarketing surveillance, data mining is primarily
used to shorten the long list of potential drug-ADR pairs. As with
other data mining methods, the signal pairs found by our approach
will be subject to further analysis (e.g., epidemiological study),
case review, and interpretation by drug safety professionals
experienced in the nuances of pharmacoepidemiology and clinical
medicine [54].
When compared with the ADRs reported in clinical trials, the ADRs
identified by the exclusive causal-leverage measure for the statins
(pravastatin and rosuvastatin) are unique and different. Clinical
trials revealed some similar ADRs for the two statins, including
nausea, headaches, myalgia, and dizziness. However, the ADRs
identified in this study, as listed in Tables 5 and 6, showed no
overlap between the two statins. One reason for this disparity
could be that we only evaluated the top 10 symptoms ranked by our
measure. The difference may also be accounted for by differences in
the setting and sampling of cases. Clinical trial data come from
double-blind, placebo-controlled studies, whereas the hospital data
represent a naturalistic study. In clinical trials, the
participants are individuals with hyperlipidemia as their only or
primary diagnosis, and they are not allowed to have any complicated
medication regimen or unstable medical condition. The data set in
our study was derived from hospitalized patients, who are primarily
male (96.3%) with a mean age of 68, and who normally have multiple
diagnoses of varying severity and take a variety of medications. As
a result, the ADRs derived from the exclusive causal-leverage
measure are more representative of real patients in naturalistic
clinical care. The method allows us to examine and evaluate
complicated samples of patients to draw useful causal information.
TABLE 6
COMPARING RANKS OF ICD-9 CODES GIVEN DRUG ROSUVASTATIN UNDER FOUR
DIFFERENT INTERESTINGNESS MEASURES
TABLE 7
RANKINGS OF THREE KNOWN ADRS OF DRUG ENALAPRIL UNDER DIFFERENT
LENGTHS OF HAZARD PERIOD T
The focus of this paper is on how to design a novel
interestingness measure and apply it to the important adverse drug
reaction problem (i.e., finding interesting ADR signal pairs).
Algorithm design and efficiency analysis become more important when
one studies how to efficiently mine all possible rare eventsets and
association rules based on minimal support. Another difference is
that, according to the literature, existing rare association rule
mining research assumes that all the data are stored in a single
table. In contrast, our work must use data from a relational
(medical) database composed of several related tables. Although
designing efficient SQL statements and data models to handle large
volumes of data in a relational database is beyond the scope of
this paper, we will consider developing an efficient algorithm
suitable for relational databases and analyzing its complexity and
efficiency in the future.
7 CONCLUSION
Mining the causal association between two events is very important
and useful in many real applications. It can help people discover
the causality of a type of event and avoid its potential adverse
effects. However, mining these associations is very difficult,
especially when the events of interest occur infrequently. We have
developed a new interestingness measure, exclusive causal-leverage,
based on an experience-based fuzzy RPD model. This measure can be
used to quantify the degree of association of a CAR. Moreover, the
measure was designed to mask the undesirable effects caused by
high-frequency events. We have applied this measure to detect the
causal associations between each of three drugs (i.e., enalapril,
pravastatin, and rosuvastatin) and the ICD-9 codes. A data mining
algorithm was developed to search a real electronic patient
database for potential ADR signals. Experimental results showed
that our algorithm could effectively rank known ADRs high among all
the symptoms in the database.
ACKNOWLEDGMENT
We would like to thank Ellison Floyd and Richard E. Miller for
retrieving the patient information from the VAMC Detroit's
databases. This work was supported in part by the National
Institutes of Health under grant 1 R21 GM082821-01A1.
REFERENCES
[1] J. Talbot and P. Waller, Stephens' Detection of New Adverse Drug Reactions, 5th ed.: John Wiley & Sons, 2004.
[2] L. T. Kohn, J. M. Corrigan, and M. S. Donaldson, To Err Is Human: Building a Safer Health System. Washington, DC: National Academy Press, 2000.
[3] J. Lazarou, B. H. Pomeranz, and P. N. Corey, "Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies," JAMA, vol. 279, pp. 1200-5, Apr 15 1998.
[4] K. Hartmann, A. K. Doser, and M. Kuhn, "Postmarketing safety information: how useful are spontaneous reports?," Pharmacoepidemiology & Drug Safety, vol. 8, pp. 65-71, 1999.
[5] L. Hazell and S. A. W. Shakir, "Under-Reporting of Adverse Drug Reactions: A Systematic Review," Drug Safety, vol. 29, pp. 385-396, 2006.
[6] K. E. Lasser, P. D. Allen, S. J. Woolhandler, D. U. Himmelstein, S. M. Wolfe, and D. H. Bor, "Timing of new black box warnings and withdrawals for prescription medications," Journal of the American Medical Association, vol. 287, pp. 2215-20, 2002.
[7] A. Szarfman, J. M. Tonning, and P. M. Doraiswamy, "Pharmacovigilance in the 21st century: new systematic tools for an old problem," Pharmacotherapy, vol. 24, pp. 1099-104, 2004.
[8] M. Lindquist, I. R. Edwards, A. Bate, H. Fucik, A. M. Nunes, and M. Stahl, "From association to alert - a revised approach to international signal analysis," Pharmacoepidemiology & Drug Safety, vol. 1, pp. 15-25, 1999.
[9] S. J. Evans, P. C. Waller, and S. Davis, "Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports," Pharmacoepidemiology & Drug Safety, vol. 10, pp. 483-6, 2001.
[10] W. DuMouchel, "Bayesian Data Mining in Large Frequency Tables, With an Application to the FDA Spontaneous Reporting System," Am Stat, pp. 177-190, 1999.
[11] P. Purcell and S. Barty, "Statistical techniques for signal generation: the Australian experience," Drug Safety, vol. 25, pp. 415-21, 2002.
[12] M. Hauben, "Early postmarketing drug safety surveillance: data mining points to consider," Annals of Pharmacotherapy, vol. 38, pp. 1625-30, Oct 2004.
[13] Y. Ji, H. Ying, M. S. Farber, J. Yen, P. Dews, R. E. Miller, and R. M. Massanari, "A Distributed, Collaborative Intelligent Agent System Approach for Proactive Postmarketing Drug Safety Surveillance," IEEE Transactions on Information Technology in Biomedicine, vol. 14, pp. 826-837, 2010.
[14] H. Jin, J. Chen, H. He, G. Williams, C. Kelman, and C. O'Keefe, "Mining unexpected temporal associations: applications in detecting adverse drug reactions," IEEE Transactions on Information Technology in Biomedicine, vol. 12, pp. 488-500, 2008.
[15] G. N. Norén, J. Hopstadius, A. Bate, K. Star, and I. R. Edwards, "Temporal pattern discovery in longitudinal electronic patient records," Data Mining and Knowledge Discovery, vol. 20, pp. 361-387, 2010.
[16] Y. Ji, H. Ying, P. Dews, M. S. Farber, A. Mansour, J. Tran, R. E. Miller, and R. M. Massanari, "A Fuzzy Recognition-Primed Decision Model-Based Causal Association Mining Algorithm for Detecting Adverse Drug Reactions in Postmarketing Surveillance," presented at the Proceedings of The IEEE International Conference on Fuzzy Systems, IEEE World Congress on Computational Intelligence, Barcelona, Spain, 2010.
[17] Y. Ji, H. Ying, P. Dews, A. Mansour, J. Tran, R. E. Miller, and R. M. Massanari, "A Potential Causal Association Mining Algorithm for Screening Adverse Drug Reactions in Postmarketing Surveillance," IEEE Transactions on Information Technology in Biomedicine, vol. 15, pp. 428-437, 2011.
[18] Y. Ji, R. M. Massanari, J. Ager, J. Yen, R. E. Miller, and H. Ying, "A Fuzzy Logic-Based Computational Recognition-Primed Decision Model," Information Sciences, vol. 177, pp. 4338-4353, 2007.
[19] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, 2005.
[20] L. Geng and H. J. Hamilton, "Interestingness Measures for Data Mining: A Survey," ACM Computing Surveys, vol. 38, 2006.
[21] A. K. H. Tung, H. Lu, J. Han, and L. Feng, "Efficient mining of intertransaction association rules," IEEE Transactions on Knowledge and Data Engineering, vol. 15, pp. 43-56, 2003.
[22] C. Marinica and F. Guillet, "Knowledge-Based Interactive Postmining of Association Rules Using Ontologies," IEEE Transactions on Knowledge and Data Engineering, vol. 22, pp. 784-797, 2010.
[23] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," presented at the Proceedings of the 20th International Conference on Very Large Databases, Santiago, Chile, 1994.
[24] P.-N. Tan and V. Kumar, "Interestingness Measures for Association Patterns: A Perspective," Department of Computer Science, University of Minnesota, 2000.
[25] W. Klosgen, "Explora: a multipattern and multistrategy discovery assistant," in Advances in Knowledge Discovery and Data Mining, U. M. Fayyad, et al., Eds., 1st ed. Cambridge, MA: MIT Press, 1996, pp. 249-271.
[26] N. Lavrac, P. Flach, and B. Zupan, "Rule Evaluation Measures: A Unifying View," presented at the Proceedings of the 9th International Workshop on Inductive Logic Programming, Bled, Slovenia, 1999.
[27] G. M. Weiss, "Mining with rarity: a unifying framework," SIGKDD Explor. Newsl., vol. 6, pp. 7-19, 2004.
[28] B. Liu, W. Hsu, and Y. Ma, "Mining association rules with multiple minimum supports," presented at the Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, United States, 1999.
[29] H. Yun, D. Ha, B. Hwang, and K. H. Ryu, "Mining association rules on significant rare data using relative support," J. Syst. Softw., vol. 67, pp. 181-191, 2003.
[30] Y.-H. Hu and Y.-L. Chen, "Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism," Decis. Support Syst., vol. 42, pp. 1-24, 2006.
[31] L. Szathmary, P. Valtchev, and A. Napoli, "Finding minimal rare itemsets and rare association rules," presented at the Proceedings of the 4th International Conference on Knowledge Science, Engineering and Management, Belfast, Northern Ireland, UK, 2010.
[32] L. Szathmary, A. Napoli, and P. Valtchev, "Towards Rare Itemset Mining," presented at the Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence - Volume 01, 2007.
[33] L. Troiano, G. Scibelli, and C. Birtolo, "A Fast Algorithm for Mining Rare Itemsets," presented at the Proceedings of the 2009 Ninth International Conference on Intelligent Systems Design and Applications, 2009.
[34] Y. S. Koh and N. Rountree, "Finding sporadic rules using Apriori-Inverse," vol. 3518 LNAI, ed, 2005, pp. 97-106.
[35] M. Adda, L. Wu, and Y. Feng, "Rare itemset mining," 2007, pp. 73-80.
[36] C. Zhang and S. Zhang, Association Rule Mining: Models and Algorithms. New York: Springer-Verlag, 2002.
[37] I. Guyon, D. Janzing, and B. Schölkopf, "Causality: Objectives and Assessment," JMLR Workshop and Conference Proceedings, vol. 6, pp. 1-38, 2010.
[38] J. Pearl, Causality: Models, Reasoning and Inference, 2nd ed.: Cambridge University Press, 2009.
[39] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, 1st ed.: The MIT Press, 2009.
[40] M. Hassoun, Fundamentals of Artificial Neural Networks: The MIT Press, 2003.
[41] G. A. Fink, Markov Models for Pattern Recognition: From Theory to Applications: Springer, 2010.
[42] D. Heckerman, "Bayesian Networks for Data Mining," Data Min. Knowl. Discov., vol. 1, pp. 79-119, 1997.
[43] G. F. Cooper, "A Simple Constraint-Based Algorithm for Efficiently Mining Observational Databases for Causal Relationships," Data Min. Knowl. Discov., vol. 1, pp. 203-224, 1997.
[44] A. Krogh, M. Brown, I. S. Mian, K. Sjolander, and D. Haussler, "Hidden Markov models in computational biology. Applications to protein modeling," J Mol Biol, vol. 235, pp. 1501-31, 1994.
[45] S. R. Eddy, "Profile hidden Markov models," Bioinformatics, vol. 14, pp. 755-63, 1998.
[46] G. A. Klein, "A recognition-primed decision making model of rapid decision making," Decision Making In Action: Models and Methods, pp. 138-147, 1993.
[47] L. A. Zadeh, "Fuzzy sets," Information and Control, vol. 8, pp. 338-353, 1965.
[48] H. Ying, Fuzzy Control and Modeling: Analytical Foundations and Applications: IEEE Press, 2000.
[49] I. R. Edwards and J. K. Aronson, "Adverse drug reactions: definitions, diagnosis, and management," Lancet, vol. 356, pp. 1255-9, 2000.
[50] Y. Ji, H. Ying, J. Yen, and R. M. Massanari, "A Fuzzy Logic-Based Computational Recognition-Primed Decision Model," Information Sciences, vol. 177, pp. 4338-4353, 2007.
[51] M. Cayrol, H. Farreny, and H. Prade, "Fuzzy Pattern Matching," Kybernetes, vol. 11, pp. 103-116, 1982.
[52] R. Orchard, "Fuzzy Reasoning in Jess: The FuzzyJ Toolkit and FuzzyJess," in Third International Conference on Enterprise Information Systems, Setubal, Portugal, 2001, pp. 533-542.
[53] P. Staff, Ed., Physicians' Desk Reference 2011. PDR Network, 2010.
[54] M. Hauben and X. Zhou, "Quantitative methods in pharmacovigilance: focus on signal detection," Drug Safety, vol. 26, pp. 159-86, 2003.
Yanqing Ji received the B.Eng. degree in industrial automation from
Qingdao University, Qingdao, China, in 1997, the M.Sc. degree in
physical electronics from the University of Science and Technology
of China, Hefei, China, in 2001, and the Ph.D. degree in computer
engineering from Wayne State University, Detroit, MI, in 2007. He
is currently an Assistant Professor with the Department of
Electrical and Computer Engineering, Gonzaga University, Spokane,
WA. His research interests include multiagent systems, parallel and
distributed systems, data mining, and their biomedical
applications. Dr. Ji is a member of the Program Committees of
various international conferences and workshops, including the 2009
International Workshop on Graphics Processing Unit Technologies and
Applications, and a reviewer for various international journals.
Hao Ying (S'88-M'90-SM'97-F'2012) received the B.S. degree in
electrical engineering and the M.S. degree in computer engineering
from Donghua University, Shanghai, China, in 1982 and 1984,
respectively, and the Ph.D. degree in biomedical engineering from
The University of Alabama, Birmingham, in 1990. He is currently a
Professor with the Department of Electrical and Computer
Engineering, Wayne State University, Detroit, MI. He has authored a
book entitled Fuzzy Control and Modeling: Analytical Foundations
and Applications (IEEE Press, 2000), over 90 peer-reviewed journal
papers, and more than 130 conference papers. Prof. Ying is an
Associate Editor or a member of the Editorial Board for 10
international journals and is a member of the Fuzzy Systems
Technical Committee of the IEEE Computational Intelligence Society.
He was Program Chair for The 2005 North American Fuzzy Information
Processing Society (NAFIPS) Conference and Co-Program Chair for the
2010 NAFIPS Conference as well as for The International Joint
Conference of NAFIPS Conference, the Industrial Fuzzy Control and
Intelligent System Conference, and the NASA Joint Technology
Workshop on Neural Networks and Fuzzy Logic held in 1994.
John Tran received his BS in Chemistry from Santa Clara University,
California, in 1990 and his MD from the School of Medicine, St.
Louis University, in 1995. John completed his Residency Training in
General Psychiatry at Washington University in St. Louis, Missouri,
in 1999 and his Fellowship Training in Geriatric Psychiatry at the
University of Washington (UW), Seattle, Washington, in 2000. He
completed his Post-Doctoral Research Fellowship in Geriatric
Psychiatry at UW in 2002. He is currently Board Certified in
General and Geriatric Psychiatry. John currently serves as the
Medical Director and Chief of Geriatric Psychiatry for Frontier
Behavioral Health. He is responsible for directing and providing
psychiatric services to adult and elderly patients in outpatient
psychiatric clinics as well as in long-term care settings,
including skilled nursing facilities, assisted living facilities,
and adult family homes. He currently supervises 11 psychiatrists
and 10 Advanced Registered Practice Nurses. John has great interest
in clinical research and teaching. He is the Director and Primary
Investigator for Frontier Institute, where he conducts NIH- and
industry-sponsored clinical trials of new medications for
psychiatric disorders including depression, anxiety, schizophrenia,
and dementias. He also serves as Associate Clinical Professor in
the School of Medicine, University of Washington, as well as
Adjunct Associate Clinical Professor in the School of Pharmacy and
School of Nursing, Washington State University, teaching and
training medical students, psychiatry residents, and pharmacy and
nursing students.
Peter Dews received his medical degree from Wayne State University
in 1986. He also holds a bachelor's degree in microbiology and a
master's degree in clinical research design and statistical
analysis from the University of Michigan. He is currently the Chief
Medical Officer and Vice President of Medical Affairs for St. Mary
Mercy Hospital, Trinity Health. He was the medical director of
ambulatory services for the Wayne State University Physician
Group's Department of Internal Medicine and had practiced general
medicine for nearly 15 years before he joined Trinity Health. Dr.
Dews is a member of the Society of General Internal Medicine, the
American College of Physicians, the Association of Health Services
Research, the American Medical Informatics Society, and the
American College of Physician Executives.
Ayman Mansour received his B.Sc. in Electrical and Electronics
Engineering from the University of Sharjah, Sharjah, U.A.E., in
2004 and his M.Sc. degree in Electrical Engineering from the
University of Jordan, Amman, Jordan, in 2006. He joined the
Department of Electrical and Computer Engineering at Wayne State
University, Detroit, Michigan, in 2008 as a Ph.D. student. His
research interests include multiagent systems, intelligent systems,
and data mining.
R. Michael Massanari graduated from the Feinberg School of
Medicine, Northwestern University, Chicago, IL, in 1967. He
received post-graduate training at the University of Pennsylvania
in Philadelphia and Northwestern University in Chicago. He also
received an M.S. degree in Preventive Medicine and Epidemiology in
1985 from the University of Iowa in Iowa City, Iowa. He is
currently Executive Director of the Critical Junctures Institute in
Bellingham, WA. He served as Director of the Center for Healthcare
Effectiveness Research (CHER) and Professor of Medicine and
Community Medicine at Wayne State University until his retirement
in 2009. Dr. Massanari is a Fellow of the American College of
Physicians and a member of the Academy of Health and American
Public Health Association.