Escolar Documentos
Profissional Documentos
Cultura Documentos
Marzena Kryszkiewicz()
Abstract. Certainty factor and lift are known evaluation measures of associa-
tion rules. These measures, nevertheless, do not guarantee accurate evaluation
of strength of dependence between rule’s constituents. In particular, even if
there is a strongest possible positive or negative dependence between rule’s
constituents X and Y, these measures may reach values quite close to the values
characteristic for rule’s constituents independence. In this paper, we first
re-examine both certainty factor and lift. Then, in order to better evaluate de-
pendence between rule’s constituents, we offer and examine a new measure –
a dependence factor. Unlike in the case of the certainty factor, when defining
our measure, we take into account the fact that for a given rule X → Y, the mi-
nimal conditional probability of the occurrence of Y given X may be greater
than 0, while its maximal possible value may less than 1. In the paper, a number
of properties and relations of all investigated measures are derived.
1 Introduction
Certainty factor and lift are known evaluation measures of association rules. The for-
mer measure was offered in the expert system Mycin [7], while the latter is widely
implemented in both commercial and non-commercial data mining systems [2]. Nev-
ertheless, they do not guarantee accurate evaluation of the strength of dependence
between rule’s constituents. In particular, even if there is a strongest possible positive
or negative dependence between rule constituents X and Y, these measures may reach
values quite close to the values indicating independence of X and Y. This might sug-
gest that one deals with a weak dependence, while in fact the dependence is strong. In
this paper, we first re-examine both certainty factor and lift. Then we offer and ex-
amine a new measure - a dependence factor - to evaluate dependence between rule’s
constituents more accurately than by means of the certainty factor or lift. Unlike in the
case of the certainty factor, when defining our measure, we take into account the fact
that that for a given rule X → Y, the minimal conditional probability of the occurrence
of Y given X may be greater than 0, while its maximal possible value may less than 1.
We end the paper with the comparison of dependence evaluation capabilities of the
dependence factor, certainty factor and lift.
As certainty factor [7] and lift [2] were introduced first as measures for rules, we
will define them in the rule context, however our work is more general and can be
applied whenever a strength of dependence (or independence) between any events is
© Springer International Publishing Switzerland 2015
N.T. Nguyen et al. (Eds.): ACIIDS 2015, Part II, LNAI 9012, pp. 135–145, 2015.
DOI: 10.1007/978-3-319-15705-4_14
136 M. Kryszkiewicz