
Doru Adrian Pănescu

Knowledge Representation
and Reasoning
Course Lectures

EDITURA CONSPRESS
2013

Copyright 2013, Editura Conspress


EDITURA CONSPRESS
is recognized by the
National Council for Scientific Research in Higher Education (Consiliul Național al Cercetării Științifice din Învățământul Superior)

Work produced within the project: "National network of centers for the
development of study programs with flexible routes and of teaching
instruments for bachelor's and master's degree specializations in the field of
Systems Engineering"

CIP description of the National Library of Romania

Pănescu, Doru Adrian
Course support for Knowledge Representation and Reasoning
Doru Adrian Pănescu, București, Editura Conspress, 2013
ISBN

University Book collection (Colecția Carte universitară)

CONSPRESS
B-dul Lacul Tei nr. 124, sector 2
postal code 020396, București
Tel: (021) 242 2719 / 300; Fax: (021) 242 0781

Contents
Bibliography
Course overview
Chapter I Introduction to Knowledge Representation and Reasoning (KRR)
  Basic notions in KRR
  KRR: Representation and Reasoning
  Knowledge Based Systems (KBSs)
  KBSs: Importance of reasoning
  KBSs: The role of Logic
  Homework
Chapter II Important aspects on the Language of First-Order Logic (FOL)
  Introduction
  Language of First-Order Logic: Syntax
  Language of First-Order Logic: Semantics
  Language of First-Order Logic: Pragmatics
  Homework
Chapter III Knowledge Engineering
  Knowledge engineering and knowledge management
  Ontology
  Vocabulary
  Types of facts
  Reification
  Entailment
  Homework
Chapter IV Resolution
  Introduction
  Resolution in Propositional Calculus
  Resolution derivations
  A Resolution based entailment procedure
  Resolution in FOL
  Answer extraction
  Skolemization
  Treatment of equality
  The computational complexity of resolution
  Homework
Chapter V Production Systems
  Introduction: Direction of reasoning
  Basic operation
  Working memory
  Production rules
  Conflict resolution
  Efficiency in Production Systems
  Homework

Bibliography
These course notes follow, to a large extent, the following
book:
Ron Brachman, Hector Levesque, Knowledge
representation and reasoning, Elsevier, Amsterdam,
2004
Other sources that were used are:
Frank van Harmelen, Vladimir Lifschitz, Bruce Porter, Handbook
of Knowledge Representation, Elsevier, Amsterdam, 2008
Natasha Noy, Ontology Development 101, Stanford University,
2008
Irma Becerra-Fernandez, Avelino Gonzalez, Rajiv Sabherwal,
Knowledge Management, Prentice Hall, 2004
Amrit Tiwana, The Knowledge Management Toolkit, Prentice
Hall, 2000
Joseph Giarratano, Gary Riley, Expert Systems: Principles and
Programming, PWS-KENT, Boston, 1989

Course overview
Knowledge Representation (KR)
Logic based KR

Reasoning
Expert Systems (ES)

Chapter 1 Introduction to KRR


Basic notions in Knowledge
Representation and Reasoning (KRR)
Intelligence
Knowledge
Representation
Reasoning
The role of Logic

Chapter 1 Introduction to KRR


Objectives for Chapter 1
Understanding the relation between
intelligence, knowledge, and representation
Establishing the importance of reasoning
Setting up the borders for a knowledge
based system

Basic notions in KRR


Intelligence: a complex and mysterious
phenomenon
Intelligence is dependent on knowledge
Thinking (reasoning): bringing out what is
relevant from what one knows to solve a
problem

Basic notions in KRR


KRR is a chapter of Artificial Intelligence (AI)
AI - the study of intelligent behavior achieved
through computational means
KRR is that part of AI that is concerned with
how an agent (software program) uses what
it knows in deciding what to do

Basic notions in KRR


KRR is considered to be the sub-field of AI
concerned with the representation of
information in computers in ways that allow
computers to draw reasonable conclusions
from it.

Basic notions in KRR

[Figure: an agent (Reasoner + Knowledge) receives sensory information from the World and sends actions/decisions back to it]

Basic notions in KRR


Approaches to AI
Symbolic approach: KRR (expert systems, knowledge-based systems)
Connectionist approach: neural networks
Evolutionary approach: genetic algorithms

Basic notions in KRR: Knowledge


Knowledge definition:
Information and skills acquired through
experience or education; the theoretical or
practical understanding of a subject.
What is known in a particular field; facts and
information

Basic notions in KRR: Knowledge


How do we refer to knowledge?
"Ann knows that ..."
"Ann knows that Tom will go to school."
"John knows that Chemistry is easy to learn."

Basic notions in KRR: Knowledge


Knowledge is a relation between a knower,
like John, and a proposition, that is, the idea
expressed by a simple declarative sentence
Propositions are abstract entities that can be
true or false, right or wrong.
"John knows that p" means "John knows that it is
true that p"


Basic notions in KRR: Knowledge


"John knows something" means that John has
formed a judgment of some sort, and has
come to realize that the world is one way
and not another
Propositions are used to classify the two
cases

Basic notions in KRR: Knowledge


Propositions can be of different kinds
"John hopes that Mary will come to the party."
Verbs such as know, hope, regret, fear, and doubt denote
propositional attitudes (relations between
agents and propositions)


Basic notions in KRR: Knowledge


There are cases involving knowledge that do
not explicitly mention propositions:
John knows who Mary is taking to the party
John knows how to fix the robot
John knows how to make a good first impression
John knows Mary well

Basic notions in KRR: Knowledge


A further concept used in AI: belief
BDI (Belief-Desire-Intention) software
agent architecture
"John knows that p." versus "John believes that p."
The second form is used when one does
not want to claim that the judgment about the
world is necessarily accurate.


Basic notions in KRR: Knowledge


There is a full range of propositional attitudes:
John is absolutely certain that p
John is confident that p
John is of the opinion that p
John suspects that p

Basic notions in KRR: Representation


Representation is a relationship between two
domains, where the first is meant to stand
for or take the place of the second.
Usually, the first domain, the representor, is
more concrete, immediate, simple or
accessible in some way.
The first domain models the second one.


Basic notions in KRR: Representation


In AI we use symbols to construct
representations
A group of symbols (a word) stands for
(represents) a certain entity; e.g. John, love,
truth, robot
A more meaningful representation is obtained from a
set of words; e.g. "John enjoys the lecture"

Basic notions in KRR: Representation


Knowledge representation - the field of AI
concerned with using formal symbols to
represent a collection of propositions
believed by an agent.
A knowledge representation model is not
supposed to stand for all the propositions
believed by the agent.


Basic notions in KRR: Representation


There may very well be an infinite number of
propositions believed, only a finite number of
which are ever represented.
It will be the role of reasoning to bridge the
gap between what is represented and what is
believed.

Basic notions in KRR: Reasoning


Reasoning is the formal manipulation of the symbols
representing a collection of believed
propositions to produce representations of
new ones.
Computer based reasoning benefits from the
fact that symbols are more accessible than
the propositions they represent.


Basic notions in KRR: Reasoning


Reasoning manipulates the groups of symbols
(moves them around, takes them apart, copies
them, strings them together) in such a way as to
construct representations of new propositions.
An example (two hypotheses and a
conclusion):
H1: John is friends with Tom
H2: Tom attends the Robotics lecture
C: Someone who is friends with John attends the
Robotics lecture

Basic notions in KRR: Reasoning


Such a form of reasoning is named logical
inference because the final sentence
represents a logical conclusion of the
propositions represented by the hypotheses.
Reasoning is a form of calculation over
symbols (first noticed by Gottfried Leibniz in
the seventeenth century).


KRR: Representation and Reasoning


Knowledge representation is important for AI
since it provides means to describe the
behavior of complex systems.
Knowledge representation and reasoning can
deal with terms like beliefs, desires, goals,
intentions, hopes

KRR: Representation and Reasoning


An example: an AI program for chess playing
Variant 1: "It moved this way because it
believed its king was vulnerable, but still
wanted to advance the pawn."
Variant 2: "It moved this way because the
evaluation procedure P, using a static
evaluation function F, returned a value of +10
after a search to depth 4."


KRR: Representation and Reasoning


The AI approach: to adopt an intentional
stance toward the chess-playing system.
The intentional stance (used by the BDI
software architecture) is not always
appropriate.
An example: "The thermostat knew the room was
cold and wanted to warm it up."

KRR: Representation and Reasoning


Knowledge representation: a stance towards
representing a complex system.
A knowledge representation must possess two
important properties:
we (from the outside) can understand the symbolic
representation as standing for propositions
the system operates correctly because of the
symbolic representation


KRR: Representation and Reasoning


When these properties apply, one can say that
the Knowledge Representation Hypothesis
holds.
Knowledge Representation Hypothesis:
construction of systems using an intentional
stance based on a symbolic representation.
Such a system is a Knowledge Based System
(KBS)
The symbolic representation involved is its
Knowledge Base (KB)

Knowledge Based Systems


Two forms of a program
V1
; the knowledge is hard-wired into the printed strings
(defrule R
=>
(printout t "snow has color white" crlf
"grass has color green" crlf))


Knowledge Based Systems


V2
(deffacts information
(entity snow color white)
(entity vegetation color green)
(class vegetation tree grass plant))

Knowledge Based Systems


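V2 (continued)
; R1: print the color of every entity that has one
(defrule R1
(entity ?x color ?y) =>
(printout t ?x " has color " ?y crlf))
; R2: every member ?z of class ?x inherits the color of ?x
(defrule R2
(entity ?x color ?y) (class ?x $? ?z $?) =>
(printout t ?z " has color " ?y crlf))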
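For readers who do not use CLIPS, the idea behind V2 can be sketched in Python. This is only an illustrative analogue under stated assumptions: the tuple encoding of facts and the function name derive_colors are hypothetical and do not come from the course. The point is that the facts are data, and the two generic rules never mention snow or grass.

```python
# Facts as plain tuples, mirroring the deffacts block of V2 (encoding assumed here).
facts = [
    ("entity", "snow", "color", "white"),
    ("entity", "vegetation", "color", "green"),
    ("class", "vegetation", "tree", "grass", "plant"),
]


def derive_colors(kb):
    """Apply analogues of rules R1 and R2 and return the lines they would print."""
    lines = []
    entities = [f for f in kb if f[0] == "entity"]
    classes = [f for f in kb if f[0] == "class"]
    # R1: every entity with a known color.
    for _, x, _, y in entities:
        lines.append(f"{x} has color {y}")
    # R2: every member z of a class x gets the color of x.
    for _, x, _, y in entities:
        for cls in classes:
            if cls[1] == x:
                for z in cls[2:]:
                    lines.append(f"{z} has color {y}")
    return lines


for line in derive_colors(facts):
    print(line)
```

Adding a new fact such as ("entity", "sky", "color", "blue") changes the output without touching the rules, which is the behavior the Knowledge Representation Hypothesis asks for.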


Knowledge Based Systems


Both programs print the color of snow and
grass.
Only the second program is designed according
to the Knowledge Representation Hypothesis.
A KBS is characterized by the presence of a KB: a collection of
symbolic structures representing what the
system believes and reasons with during its
operation.
There may be intelligent systems that lack such a symbolic
representation (e.g. neural networks).

Knowledge Based Systems


There are both advantages and
disadvantages to KBSs
A procedural approach can reach a
solution faster than a KBS
There is a paradox of Expert Systems
(Hubert Dreyfus)
ESs are claimed to be superior because they are knowledge-based: they reason
over explicitly represented knowledge


Knowledge Based Systems


The paradox: novices are the ones who think
and reason; experts recognize and react
KBSs are superior in dealing with a set of
tasks that is open-ended
KBSs can receive new tasks and easily make
them depend on previous knowledge

KBSs can debug faulty behavior by locating
the erroneous beliefs of the system

Knowledge Based Systems


KBSs can concisely explain and justify the
behavior of the system

A KBS, by design, has the ability to be told
facts about its world and adjust its behavior
correspondingly
KBSs possess cognitive penetrability


KBSs Importance of reasoning


In KBSs, action should depend on what the system
believes about the world, as opposed to just
what the system has explicitly represented
A KB will involve quite general facts, which will
then need to be applied to particular situations
An example (two facts)
A part transfer operation solves task x. (1)
Any part transfer operation can be carried out by a
robot. (2)

KBSs Importance of reasoning


Problem to solve: is it appropriate to involve a
robot in order to solve task x?
A KBS can derive the answer to the above
problem, although it is not explicitly stated among the
presented facts.
A KBS goes beyond a database system (the
behavior of a database system is conditioned
only on the facts it is able to retrieve).


KBSs Importance of reasoning


The KBS should provide the answer p, if it
believes it.
The system should believe p if, according to
the beliefs it has represented, the world it is
imagining is one where p is true.
Logical entailment: if the world is such that
propositions (1) and (2) are true, then the following is also
true:
A robot can solve task x

KBSs Importance of reasoning


A KBS should believe all and only the
entailments of what it has explicitly
represented.
The job of reasoning, then, according to this
account, is to compute the entailments of a
KB; this is a simplistic view!
It can be too difficult computationally to decide
which sentences are entailed by the kind of KB
we will want to use.
The decision problem for first-order logic is undecidable.


KBSs Importance of reasoning


Any procedure that always gives us answers in
a reasonable amount of time will occasionally
either:
miss some entailments (it is logically incomplete);
return some incorrect answers (it is logically
unsound).

There are also conceptual reasons why a KBS
might resort to unsound or incomplete
reasoning.

KBSs Importance of reasoning


Examples:
a KBS might give an answer that is not entailed by the KB, but that is a
reasonable answer
a KBS might come to believe a collection of facts
from various sources that, taken together, cannot
all be true (the KB is inconsistent)
While it would be a mistake to identify
reasoning in a KBS with logically sound and
complete inference, it is the right place to
begin.


KBSs The role of Logic


Importance of Logic for KBS:
It studies:
Entailment relations
Truth conditions
Rules of inference

AI (the symbolic approach) is based on
first-order predicate calculus (First-Order
Logic)

Chapter 1 Introduction to KRR


Homework
Consider a manufacturing control system; assess the
system's intelligence; propose some enhancements in
order to obtain a KBS.
Explain the role of a knowledge representation
language and a reasoning mechanism in KBSs.
Give examples of two different knowledge
representation languages and reasoning
mechanisms.
Problems 1 and 4, Brachman and Levesque, pp. 13-14.


Chapter 2 Important aspects on the


Language of First-Order Logic
Chapter Overview
Syntax
Semantics
Pragmatics
Explicit and implicit belief

Chapter 2 Important aspects on the


Language of First-Order Logic
Objectives for Chapter 2
To understand how first-order logic
(FOL) can provide a modelling tool for
KBSs
To understand the way FOL can establish
a sound reasoning mechanism
To discover the limitations of FOL


Introduction Basic Aspects


First point for an AI based system: to be
able to express (formulate) the ideas to be
used
There is a need for a language
Language of first-order logic

Introduction Components of a language

Syntax: specifies which groups of symbols,
arranged in what way, are to be considered
properly formed
"This student has sound knowledge of
Robotics" is a well-formed English sentence


Introduction Components of a language

Semantics: specifies what the well-formed
expressions are supposed to mean
Without semantics we cannot consider beliefs
Pragmatics: specifies how the meaningful
expressions in the language are to be used
"There is someone right behind you" could be
used as a warning

Introduction Components of a language

For knowledge representation, pragmatics involves
how we use the meaningful sentences of a
representation language as part of a KB from
which inferences will be drawn.
These three aspects apply mainly to
declarative languages, the sort we use to
represent knowledge.


Language of First-Order Logic Syntax

An alphabet: the set of admissible symbols
In FOL, there are two sorts of symbols:
the logical ones
the nonlogical ones.
Intuitively, the logical symbols are those that have a
fixed meaning or use in the language.

Language of First-Order Logic Syntax


There are three sorts of logical symbols:
punctuation: "(", ")", and "."
connectives: ¬, ∧, ∨, ∃, ∀, and =
variables: an infinite supply of symbols, which we
will denote here using x, y, and z, sometimes with
subscripts and superscripts (V is the set of variables).


Language of First-Order Logic Syntax


The nonlogical symbols are those that have
an application-dependent meaning or use.
In FOL, there are two sorts of nonlogical
symbols:
1. function symbols: an infinite supply of symbols,
which we will write in uncapitalized mixed case, e.g.,
bestFriend, and which we will denote more generally
using a, b, c, f , g, and h, with subscripts and
superscripts.

Language of First-Order Logic Syntax


2. predicate symbols: an infinite supply of symbols,
which we will write in capitalized mixed case, e.g.,
OlderThan, and which we will denote more generally
using P, Q, and R, with subscripts and superscripts.

One distinguishing feature of nonlogical
symbols is that each one is assumed to have
an arity, that is, a nonnegative integer
indicating how many arguments it takes.


Language of First-Order Logic Syntax


By convention, a, b, and c are only used for
function symbols of arity 0, which are called
constants
g and h are only used for function symbols of
nonzero arity.
Predicate symbols of arity 0 are sometimes
called propositional symbols.

Language of First-Order Logic Syntax


Note that = is not treated as a predicate
symbol, but as a special logical symbol.
There are two types of legal syntactic
expressions in FOL: terms and formulas.
Intuitively, a term will be used to refer to
something in the world, and a formula will be
used to express a proposition.


Language of First-Order Logic Syntax


The set of terms of FOL is the set satisfying
these conditions:
every variable is a term;
if t1, ... , tn are terms, and f is a function symbol of
arity n, then f (t1, ... , tn) is a term.

Language of First-Order Logic Syntax


The set of formulas of FOL is the set satisfying
these constraints:
if t1, ... , tn are terms, and P is a predicate symbol of
arity n, then P(t1, ... , tn) is a formula;
if t1 and t2 are terms, then t1 = t2 is a formula;
if α and β are formulas, and x is a variable, then
¬α, (α ∧ β), (α ∨ β), ∀x. α, and ∃x. α are formulas.
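The inductive definitions of terms and formulas map naturally onto recursive data types. The following is a minimal sketch, assuming Python dataclasses; the class names (Var, Func, Pred, and so on) are choices made for this illustration and are not part of the course material. The symbols Robot, bestServant and johnSmith are the ones used as examples later in this chapter.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable such as x, y, z."""
    name: str


@dataclass(frozen=True)
class Func:
    """A function symbol applied to terms; with no arguments it is a constant."""
    symbol: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, Func]


@dataclass(frozen=True)
class Pred:
    """Atomic formula P(t1, ..., tn)."""
    symbol: str
    args: Tuple[Term, ...] = ()


@dataclass(frozen=True)
class Eq:
    """Atomic formula t1 = t2."""
    left: Term
    right: Term


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Forall:
    var: Var
    body: "Formula"


@dataclass(frozen=True)
class Exists:
    var: Var
    body: "Formula"


Formula = Union[Pred, Eq, Not, And, Or, Forall, Exists]


def is_atom(formula: Formula) -> bool:
    """Atoms are the formulas that contain no simpler formulas."""
    return isinstance(formula, (Pred, Eq))


# Robot(bestServant(johnSmith)) and "there exists x such that not Robot(x)"
john = Func("johnSmith")
atom = Pred("Robot", (Func("bestServant", (john,)),))
quantified = Exists(Var("x"), Not(Pred("Robot", (Var("x"),))))
print(is_atom(atom), is_atom(quantified))  # True False
```

Each class corresponds to one clause of the definitions above, so any value built from these constructors is well formed by construction.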


Language of First-Order Logic Syntax


Formulas of the first two types (containing no other
simpler formulas) are called atomic formulas or atoms.
Notational abbreviations and conventions:
we will add or omit matched parentheses and periods
freely
we use square and curly brackets to improve
readability
in the case of predicates or function symbols of arity
0, we usually omit the parentheses since there are no
arguments to enclose
we reduce the number of parentheses by assuming
that ¬ has higher precedence than ∧, and ∧ has higher
precedence than ∨

Language of First-Order Logic Syntax


By the propositional subset of FOL, we mean
the language with no terms, no quantifiers, and
where only propositional symbols are used.
An example of a propositional formula:
(P ∧ ¬(Q ∨ R)), where P, Q, R are propositional
symbols


Language of First-Order Logic Syntax


We also use the following abbreviations
(equivalences):
(α → β) for (¬α ∨ β);
(α ↔ β) for ((α → β) ∧ (β → α))
A variable occurrence in a formula can be free or
bound.
A variable occurrence is bound in a formula if it
lies within the scope of a quantifier, and free
otherwise.

Language of First-Order Logic Syntax


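An example:
∀y. P(x) ∧ ∃x[P(y) ∨ Q(x)]
∀y (P(x) ∧ ∃x(P(y) ∨ Q(x)))
Here the first occurrence of x is free; the other occurrences of x and both occurrences of y are bound.
If x is a variable, α a formula and t a term, then
α(x|t) means the substitution of all the free
occurrences of x in α by t.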
A sentence of FOL is any formula without free
variables.
The sentences of FOL are what we use to
represent knowledge.
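The distinction between free and bound occurrences can be checked mechanically. Below is a small sketch, assuming a nested-tuple encoding of formulas that is invented for this illustration (atoms carry only variable names as arguments). It computes the set of free variables and tests whether a formula is a sentence, using the example formula read as ∀y (P(x) ∧ ∃x(P(y) ∨ Q(x))).

```python
# Encoding assumed for this sketch only:
#   ("pred", name, [variable names])
#   ("not", f), ("and", f, g), ("or", f, g)
#   ("forall", x, f), ("exists", x, f)

def free_vars(formula, bound=frozenset()):
    """Return the set of variables that occur free in the formula."""
    tag = formula[0]
    if tag == "pred":
        return {v for v in formula[2] if v not in bound}
    if tag == "not":
        return free_vars(formula[1], bound)
    if tag in ("and", "or"):
        return free_vars(formula[1], bound) | free_vars(formula[2], bound)
    if tag in ("forall", "exists"):
        # Inside the scope of the quantifier its variable is bound.
        return free_vars(formula[2], bound | {formula[1]})
    raise ValueError(f"unknown formula tag: {tag}")


def is_sentence(formula):
    """A sentence is a formula without free variables."""
    return not free_vars(formula)


example = ("forall", "y",
           ("and", ("pred", "P", ["x"]),
                   ("exists", "x", ("or", ("pred", "P", ["y"]),
                                          ("pred", "Q", ["x"])))))
print(free_vars(example))    # {'x'}
print(is_sentence(example))  # False
```

The example is not a sentence because of the single free occurrence of x in P(x); prefixing it with a quantifier over x would make it one.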


Language of First-Order Logic Semantics


It is important to understand what claim a
sentence of FOL makes about the world
Only in this way can the beliefs of the KBS be
derived
One cannot realistically expect to specify
once and for all what a sentence of FOL means
The nonlogical symbols are used in an
application-dependent way

Language of First-Order Logic Semantics


An example:
One cannot agree what the sentence
Happy(john) claims about the world
This can be decided only if one knows the
interpretation of the nonlogical symbols
involved
Conclusion: we need a clear specification of
the meaning of sentences as a function of the
interpretation of the predicate and function
symbols.


Language of First-Order Logic Semantics


Main points of semantics specification:
There are objects (entities) in the world
For any predicate P of arity 1, some of the
objects will satisfy P and some will not. An
interpretation of P decides for each object
whether it has or does not have the property
specified by P.
Predicates of other arities are handled
similarly.

Language of First-Order Logic Semantics


A function symbol is interpreted as a mapping
from tuples of objects to an object
No other aspects of the world matter
Note that the above statements are all
one needs to say regarding the meaning of the
nonlogical symbols


Language of First-Order Logic Semantics


Examples:
DemocraticCountry: a predicate symbol of
arity 1
The meaning of DemocraticCountry in some
interpretation will be no more and no less than
those objects that are countries that we
consider to be democratic.

Language of First-Order Logic Semantics


Examples:
child_of: a function symbol of arity 2
The meaning of the above function symbol is
a mapping:
People × People → People
child_of(john, mary) → dan


Semantics - Interpretations
Meanings are typically captured by specific
interpretations
An interpretation ℑ in FOL is a pair ⟨D, I⟩, where D
is any nonempty set of objects, called the domain of
the interpretation (the universe of discourse, UoD),
and I is a mapping, called the interpretation mapping,
from the nonlogical symbols to functions and relations
over D.
UoD - the set of entities over which the variables may
range (UoD is part of the world considered by the
current problem)

Semantics - Interpretations
It is important to stress that an interpretation
need not involve only mathematical objects
D can be any set, including people, garages,
numbers, sentences, fairness, unicorns, chunks
of peanut butter, situations, and the universe,
among other things.
The interpretation mapping I is devoted to
assigning the meaning of the predicate and
function symbols


Semantics - Interpretations
To every predicate symbol P of arity n, I assigns
an n-ary relation over D:
I[P] ⊆ D × ... × D (D is taken n times)
Examples:
Robot: a unary predicate symbol
I[Robot] would be some subset of D (presumably
the set of robots in that interpretation)

Semantics - Interpretations
Examples:
SmarterThan: a binary predicate symbol
I[SmarterThan] would be some subset of D × D
(presumably the set of pairs of objects in D where the
first element of the pair is smarter than the second)
To every function symbol f of arity n, I assigns an
n-ary function over D:
I[f]: D × ... × D → D


Semantics - Interpretations
Examples:
bestServant: a unary function symbol
I[bestServant] would be a function D → D
(presumably the function that maps a person to
his/her best servant)
johnSmith: a 0-ary function symbol
I[johnSmith] would be an element of D (presumably
somebody called John Smith)

Semantics - Interpretations
It is useful to think of the interpretation of
predicates in terms of their characteristic
functions.
The interpretation of an n-ary predicate
becomes an n-ary function to {0, 1}:
I[P]: D × ... × D → {0, 1}


Semantics - Interpretations
Two distinct interpretations of a predicate:
as a relation
as a function to {0, 1}
The relationship between the two specifications
is that a tuple of objects is considered to be in
the relation over D if and only if the
characteristic function over those objects has
value 1.

Semantics - Interpretations
There is an advantage of using the second
type of interpretation.
It allows us to see how predicates of arity 0
(i.e., the propositional symbols) are handled:
If P is a predicate of arity 0, then I[P] is either 0 or 1.


Semantics - Interpretations
It is natural to think of 0 as representing the truth
value False, while 1 stands for True.
Thus, for the propositional subset of FOL, one
can ignore D completely.
For propositional calculus, an interpretation is
simply a mapping, I, from the propositional
symbols to either 0 or 1.
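In the propositional case an interpretation is just a table from symbols to 0 or 1, so evaluating a formula is a short recursion. The sketch below assumes a nested-tuple encoding chosen for this illustration (a propositional symbol is a string; compound formulas are tagged tuples) and evaluates the formula P and not (Q or R) under two interpretations.

```python
# Encoding assumed for this sketch: a symbol is a str;
# compound formulas are ("not", f), ("and", f, g), ("or", f, g).

def evaluate(formula, interp):
    """Return 1 if the formula is true under the interpretation, else 0."""
    if isinstance(formula, str):
        return interp[formula]
    op = formula[0]
    if op == "not":
        return 1 - evaluate(formula[1], interp)
    if op == "and":
        return min(evaluate(formula[1], interp), evaluate(formula[2], interp))
    if op == "or":
        return max(evaluate(formula[1], interp), evaluate(formula[2], interp))
    raise ValueError(f"unknown connective: {op}")


formula = ("and", "P", ("not", ("or", "Q", "R")))
print(evaluate(formula, {"P": 1, "Q": 0, "R": 0}))  # 1
print(evaluate(formula, {"P": 1, "Q": 1, "R": 0}))  # 0
```

No domain D appears anywhere in the code, which matches the remark that D can be ignored for the propositional subset.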

Semantics - Denotation
Denotation clarifies the interpretation of terms
that contain variables.
Given an interpretation ℑ = ⟨D, I⟩ we can
specify which elements of D are denoted by any
variable-free term of FOL.
An example:
bestServant(johnSmith) denotes I[bestServant](I[johnSmith]) ∈ D


Semantics - Denotation
To deal with terms including variables, one
needs a variable assignment over D, that is, a
mapping from the variables of FOL to the
elements of D.
μ - variable assignment:
μ: V → D, μ[x] ∈ D
By using both an interpretation ℑ and a variable
assignment μ, the denotation of every term t can
be calculated.

Semantics Denotation
If x is a variable, then ||x||ℑ,μ = μ[x]
If t1, ... , tn are terms, and f is a function symbol
of arity n, then
||f(t1, ..., tn)||ℑ,μ = F(d1, ..., dn),
where F = I[f] and di = ||ti||ℑ,μ
The above rules are recursive
According to these recursive rules, ||t||ℑ,μ is
always an element of D
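The two recursive rules for denotation translate almost line by line into code. The following sketch makes several assumptions for illustration only: terms are encoded as either a variable name (a string) or a pair (symbol, list of argument terms); the interpretation mapping I is a dictionary of Python callables; the domain and the particular servant relation are invented.

```python
# Encoding assumed for this sketch: a variable is a str,
# a function application is (symbol, [argument terms]).

def denote(term, I, mu):
    """Return the element of D denoted by the term under mapping I and assignment mu."""
    if isinstance(term, str):
        # Rule 1: a variable denotes whatever the assignment gives it.
        return mu[term]
    symbol, args = term
    # Rule 2: denote the arguments first, then apply the function I[f].
    return I[symbol](*[denote(a, I, mu) for a in args])


# A hypothetical interpretation over a three-element domain.
D = {"john", "robby", "hal"}
servants = {"john": "robby", "robby": "hal", "hal": "hal"}
I = {
    "johnSmith": lambda: "john",           # 0-ary function: a constant
    "bestServant": lambda d: servants[d],  # unary function D -> D
}
mu = {"x": "robby"}

print(denote(("bestServant", [("johnSmith", [])]), I, mu))  # robby
print(denote(("bestServant", ["x"]), I, mu))                # hal
```

The first term is variable-free, so the assignment mu plays no role in its denotation; the second depends on the value that mu gives to x.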


Semantics Satisfaction and Models


By the means of an interpretation and a
variable assignment one can specify which
sentences of FOL are true and which are false
Example:
Robot(bestServant(johnSmith)) is true in ℑ if
and only if the following holds:
If we use I to get hold of the subset of D denoted
by Robot and the object denoted by
bestServant(johnSmith), then that object is in
the set.

Semantics Satisfaction and Models


To deal with formulas containing free variables,
we should use a variable assignment
Formally, given an interpretation ℑ and a variable
assignment μ, we say that the formula α is satisfied
in ℑ (is true in ℑ), written:
ℑ, μ |= α,
according to the following rules:
Assume that t1, ... , tn are terms, P is a predicate of
arity n, α and β are formulas, and x is a variable.


Semantics Satisfaction and Models


1. ℑ, μ |= P(t1, ... , tn) iff ⟨d1, ... , dn⟩ ∈ I[P],
where di = ||ti||ℑ,μ
2. ℑ, μ |= t1 = t2 iff ||t1||ℑ,μ and ||t2||ℑ,μ are the
same element of D;
3. ℑ, μ |= ¬α iff it is not the case that ℑ, μ |= α;
4. ℑ, μ |= (α ∧ β) iff ℑ, μ |= α and ℑ, μ |= β;

Semantics Satisfaction and Models


5. ℑ, μ |= (α ∨ β) iff ℑ, μ |= α or ℑ, μ |= β (or both);
6. ℑ, μ |= ∃x. α iff ℑ, μ' |= α, for some variable
assignment μ' that differs from μ on at most x;
this means there exists an object a in D that,
substituted for x, makes α true (∃a, a ∈ D,
and ℑ, μ |= α(x|a))
7. ℑ, μ |= ∀x. α iff ℑ, μ' |= α, for every variable
assignment μ' that differs from μ on at most x
(∀a, a ∈ D, ℑ, μ |= α(x|a))
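When the domain D is finite, the seven satisfaction rules can be executed directly. The sketch below is an illustration under stated assumptions, not a general FOL procedure: it only works for finite domains, it uses a tuple encoding invented here, predicates are interpreted as sets of tuples, and functions as Python callables. The quantifier cases build the modified assignment that differs from the current one on at most x.

```python
# Encoding assumed for this sketch:
#   terms:    a variable is a str; an application is (symbol, [terms])
#   formulas: ("pred", P, [terms]), ("eq", t1, t2), ("not", f),
#             ("and", f, g), ("or", f, g), ("exists", x, f), ("forall", x, f)

def denote(term, I, mu):
    """Denotation of a term under interpretation mapping I and assignment mu."""
    if isinstance(term, str):
        return mu[term]
    symbol, args = term
    return I[symbol](*[denote(a, I, mu) for a in args])


def satisfies(formula, D, I, mu):
    """Check satisfaction of a formula in the finite interpretation (D, I) under mu."""
    tag = formula[0]
    if tag == "pred":                       # rule 1
        _, symbol, args = formula
        return tuple(denote(t, I, mu) for t in args) in I[symbol]
    if tag == "eq":                         # rule 2
        return denote(formula[1], I, mu) == denote(formula[2], I, mu)
    if tag == "not":                        # rule 3
        return not satisfies(formula[1], D, I, mu)
    if tag == "and":                        # rule 4
        return satisfies(formula[1], D, I, mu) and satisfies(formula[2], D, I, mu)
    if tag == "or":                         # rule 5
        return satisfies(formula[1], D, I, mu) or satisfies(formula[2], D, I, mu)
    if tag == "exists":                     # rule 6
        _, x, body = formula
        return any(satisfies(body, D, I, {**mu, x: d}) for d in D)
    if tag == "forall":                     # rule 7
        _, x, body = formula
        return all(satisfies(body, D, I, {**mu, x: d}) for d in D)
    raise ValueError(f"unknown formula tag: {tag}")


# A hypothetical two-element interpretation.
D = {"john", "robby"}
I = {
    "Robot": {("robby",)},
    "johnSmith": lambda: "john",
    "bestServant": lambda d: "robby",
}
atom = ("pred", "Robot", [("bestServant", [("johnSmith", [])])])
print(satisfies(atom, D, I, {}))                                          # True
print(satisfies(("forall", "x", ("pred", "Robot", ["x"])), D, I, {}))     # False
```

For sentences the initial assignment can be empty, since every variable is given a value by the quantifier that binds it.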


Semantics Satisfaction and Models


When the formula α is a sentence, it is easy to
see that satisfaction does not depend on the
given variable assignment (recall that sentences
do not have free variables)
In this case, we write ℑ |= α and say that α is
true in the interpretation ℑ, or that α is false
otherwise
In the case of the propositional subset of FOL,
it is sometimes convenient to write I[α] = 1 or
I[α] = 0 according to whether ℑ |= α or not.

Semantics Satisfaction and Models


We will also use the notation ℑ |= S, where S is
a set of sentences, to mean that all of the
sentences in S are true in ℑ.
We say in this case that ℑ is a logical model of
S.


Language of First-Order Logic Pragmatics


The semantic rules of interpretation tell us how
to understand precisely the meaning of any term
or formula of FOL in terms of a domain and an
interpretation for the nonlogical symbols over
that domain.
Important point:
The FOL language can be used to represent
knowledge

Language of First-Order Logic Pragmatics


From the point of view of Pragmatics, one
should understand that:
A KBS is supposed to reason about concepts like
DemocraticCountry or Robot by the means of a
given interpretation
A KBS has to be endowed with an interpretation,
which could involve (perhaps infinite) sets of
objects


Pragmatics Logical consequence


Although the semantic rules of interpretation
depend on the interpretation of the nonlogical
symbols, there are formulas that do not depend
on the meaning of those symbols.
Example:
ℑ is an interpretation in which the formula α is true;
Let γ be the formula ¬(β ∧ ¬α); then γ is also true
under the interpretation ℑ (this does not depend
on the nonlogical symbols in α or β)
γ is a logical consequence of α

Pragmatics Logical consequence


Definition
Let S be a set of sentences (formulas), and α
any sentence. We say that α is a logical consequence of S,
or that S logically entails α, which we write:
S |= α
if and only if, for every interpretation ℑ, if ℑ |= S
then ℑ |= α


Pragmatics Logical consequence


The definition of logical consequence has some
equivalent formulations:
S |= α iff every model of S is a model of α
S |= α iff there is no interpretation ℑ such that ℑ |= S ∪ {¬α}
We say that the set S ∪ {¬α} is
unsatisfiable when there is no ℑ that is its logical
model
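For the propositional subset, the definition of logical consequence can be tested by brute force: enumerate every interpretation of the symbols involved and look for one that satisfies all of S but falsifies the candidate conclusion. The sketch below assumes a tuple encoding invented for this illustration; it is exponential in the number of symbols and is meant only to make the definition concrete, not to serve as a practical procedure.

```python
from itertools import product

# Encoding assumed for this sketch: a propositional symbol is a str;
# compound formulas are ("not", f), ("and", f, g), ("or", f, g).

def evaluate(formula, interp):
    """Truth value (True/False) of a formula under an interpretation."""
    if isinstance(formula, str):
        return bool(interp[formula])
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], interp)
    if op == "and":
        return evaluate(formula[1], interp) and evaluate(formula[2], interp)
    if op == "or":
        return evaluate(formula[1], interp) or evaluate(formula[2], interp)
    raise ValueError(f"unknown connective: {op}")


def symbols(formula):
    """Set of propositional symbols occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(symbols(sub) for sub in formula[1:]))


def entails(S, alpha):
    """True iff no interpretation satisfies every sentence of S together with not-alpha."""
    names = set(symbols(alpha))
    for sentence in S:
        names |= symbols(sentence)
    names = sorted(names)
    for values in product([0, 1], repeat=len(names)):
        interp = dict(zip(names, values))
        if all(evaluate(s, interp) for s in S) and not evaluate(alpha, interp):
            return False  # found a model of S in which alpha is false
    return True


# From P and (not P or Q) the sentence Q follows; from P alone it does not.
print(entails(["P", ("or", ("not", "P"), "Q")], "Q"))  # True
print(entails(["P"], "Q"))                             # False
```

Calling entails with an empty S checks validity, for example entails([], ("or", "P", ("not", "P"))) holds because the formula is true in every interpretation.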

Pragmatics Logical consequence


Definition
A sentence α is logically valid (a tautology), which we
write:
|= α
when it is a logical consequence of the empty set.
Equivalent formulations:
α is valid if and only if, for every interpretation ℑ, it is the
case that ℑ |= α
α is valid if and only if the set {¬α} is unsatisfiable


Pragmatics Logical consequence


Entailment reduces to validity:
If S = {α1, ... , αn}, then S |= α iff the formula
[(α1 ∧ ... ∧ αn) → α] is logically valid.
The connection between KBS and logical
entailment is at the heart of the knowledge
representation enterprise.

Pragmatics Logical consequence


A KBS is a system that can reason
Given something like the fact that irb is a robot, it
should be able to conclude that irb is also a piece of
industrial equipment, an automatic device, and so
on.
The KBS is told or can learn a sentence like
Robot(irb) that is true in some user-intended
interpretation, and it can then come to believe
other sentences true in that interpretation.


Pragmatics Logical consequence


A KBS will not and cannot have access to the
interpretation of the nonlogical symbols itself.
A KBS cannot be given the set of sentences
true in that interpretation as beliefs, because,
among other things, there will be an infinite number
of such sentences.
If a KBS is given a restricted set of sentences
that are true in the user-intended interpretation, it can
use logical entailment to derive new sentences
that are also true in that interpretation.

Pragmatics Logical consequence


An example
Robot(irb) is true in the intended interpretation (α)
A KBS can safely conclude:
¬¬Robot(irb)
Robot(irb) ∨ Happy(john)
Conclusions of this kind are not very useful


Pragmatics Logical consequence


To get a useful KBS we need to include, within
the set of sentences S of the KB, statements
connecting the nonlogical symbols involved.
For the above example, such a sentence is:
∀x. Robot(x) → Industrial_equipment(x) (β)
If S = {α, β}, where α is Robot(irb), then S |= Industrial_equipment(irb)
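The step from Robot(irb) and the rule "every robot is industrial equipment" to Industrial_equipment(irb) can be mimicked by a tiny forward-chaining loop. This is a sketch under strong assumptions: rules are restricted to the form "for all x, P(x) implies Q(x)" and encoded as pairs of predicate names, and the function name forward_chain is hypothetical. Forward chaining is used here as one simple sound procedure; the notes do not prescribe it at this point.

```python
def forward_chain(facts, rules):
    """Close a set of unary facts (predicate, object) under rules (premise, conclusion)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, obj in list(derived):
                if predicate == premise and (conclusion, obj) not in derived:
                    derived.add((conclusion, obj))
                    changed = True
    return derived


# alpha: Robot(irb);  beta: for all x, Robot(x) implies Industrial_equipment(x)
facts = {("Robot", "irb")}
rules = [("Robot", "Industrial_equipment")]

beliefs = forward_chain(facts, rules)
print(("Industrial_equipment", "irb") in beliefs)                   # True
print(("Industrial_equipment", "irb") in forward_chain(facts, []))  # False
```

Without the rule connecting the two predicate symbols, nothing about industrial equipment can be derived, which is the point made about the need for sentences relating the nonlogical symbols.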

Pragmatics Logical consequence


By including formula β as one of the premises in
S, we rule out interpretations like the one where the
set of robots is not a subset of the set of industrial
equipment.
One can conclude with the fundamental tenet of
knowledge representation.
1. Reasoning based on only logical consequence
allows safe, logically guaranteed conclusions to be
drawn.


Pragmatics
2. The KBS starts with a collection of sentences as
given premises (KB)
3. The KB includes not only facts about particulars of
the intended application, but also those expressing
connections among the nonlogical symbols involved
4. Calculating the entailments thus becomes a core
part of a KBS, as it is like the form of reasoning we
would expect of someone who understood the
meaning of the terms involved (KB becomes a richer
set)

Pragmatics
IMPORTANT CONCLUSION
In a sense, this is all there is to knowledge representation
and reasoning; the rest is just details.


Pragmatics Explicit and implicit belief


The KB can be viewed as having two components:
working memory (WM) and rule base (RB)
The WM - beliefs of the system that are explicitly
given (introduced, sensed, or learned)
The RB - connections among the nonlogical
symbols involved
Besides these, there are the entailments of the KB:
beliefs that are only implicitly given

Pragmatics Explicit and implicit belief


It is often nontrivial to move from explicit to implicit
beliefs.
An example from the blocks world
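[Figure: a stack of blocks; the top block A is green, the colour of block B below it is not given, and the bottom block is not green]
Fig. 2.1. Blocks world example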


Pragmatics Explicit and implicit belief


The problem can be formalized
a, b, c - the names of the blocks (constants)
G, O - predicate symbols, standing for "green"
and "on"
The initial KB:
S = {O(a, b), O(b, c), G(a), ¬G(c)}

Pragmatics Explicit and implicit belief


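S |= α (?)
where
α = ∃x ∃y. G(x) ∧ ¬G(y) ∧ O(x, y)   (*)

A sketch of the demonstration:
Let I be any interpretation with I |= S.
Two cases are to be considered:
I |= G(b): then (*) holds with x = b, y = c
I |= ¬G(b): then (*) holds with x = a, y = b
In both cases (*) holds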
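The case analysis can be checked mechanically with a small Python sketch. It rests on assumptions that go beyond the slide: the domain is restricted to exactly three individuals, the constants a, b, c name distinct individuals, and all function names are hypothetical. Under these assumptions it enumerates every interpretation of G and O, keeps those satisfying S = {O(a, b), O(b, c), G(a), not G(c)}, and checks that each of them contains a green block directly on a non-green one.

```python
from itertools import product

DOMAIN = ("a", "b", "c")
PAIRS = [(x, y) for x in DOMAIN for y in DOMAIN]


def interpretations():
    """Yield every (green, on) pair of extensions over the three-block domain."""
    for g_bits in product([False, True], repeat=len(DOMAIN)):
        green = {d for d, bit in zip(DOMAIN, g_bits) if bit}
        for o_bits in product([False, True], repeat=len(PAIRS)):
            on = {p for p, bit in zip(PAIRS, o_bits) if bit}
            yield green, on


def satisfies_s(green, on):
    """True when the interpretation satisfies O(a,b), O(b,c), G(a), not G(c)."""
    return (("a", "b") in on and ("b", "c") in on
            and "a" in green and "c" not in green)


def satisfies_query(green, on):
    """True when some green block is on some non-green block."""
    return any(x in green and y not in green and (x, y) in on for x, y in PAIRS)


def entails():
    """Check the query in every enumerated interpretation that satisfies S."""
    return all(satisfies_query(g, o)
               for g, o in interpretations() if satisfies_s(g, o))


print(entails())  # True
```

The check succeeds although no single interpretation tells us the colour of b, which is exactly why the belief is only implicit in S. Real entailment quantifies over all interpretations, not just the finite ones enumerated here.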


Pragmatics Explicit and implicit belief


Calculating what is implicit in a given collection of
facts will sometimes involve subtle forms of
reasoning
For FOL the problem of determining whether one
sentence is a logical consequence of others is in
general undecidable
The process of calculating the entailments of a
KB is called deductive inference

Pragmatics Explicit and implicit belief


A reasoning process is logically sound if
whenever it produces α, then α is guaranteed to be
a logical consequence.
A reasoning process is logically complete if it is
guaranteed to produce α whenever α is entailed.
The computational difficulty of FOL is one of the
factors that will lead us to consider various other
options of reasoning


Chapter 2 - Important aspects on the Language of First-Order Logic
Homework
Prove that the following two hypotheses do not
entail the following conclusion
Hypotheses: Some students are hard-working.
Some students are intelligent.
Conclusion: Some students are hard-working and
intelligent.

Express the following sentences in FOL:
Some students were enrolled in college and did not
graduate.
All students that were enrolled attended the
laboratories.
Some students attended the laboratories and did
not graduate.

Problems 1 and 4, Brachman and Levesque,
pp. 28-30.
Express the following sentences in FOL:
Metallic objects are electrically conductive.
The wires are metallic objects.
Metallic objects are not electrically conductive and
insulating.
The wires are not insulating.

Consider the following set of sentences*:
H1 Andrew is the father of Bob.
H2 Bob is the father of Chris.
H3 Every grandfather is someone's father.
C Andrew is a grandfather of Chris.
Translate these sentences into FOL. Show semantically
(by reasoning about interpretations) that H1-H3 do
not logically entail C.
* This problem is part of an examination paper at the
University of Nottingham, School of Computer
Science


Chapter 3 Knowledge Engineering


Chapter Overview
Knowledge engineering & Knowledge
management
Ontology
Vocabulary
Types of facts
Reification
Entailment

Chapter 3 Knowledge Engineering


Objectives for Chapter 3
Understanding the role and importance
of knowledge engineering
Understanding the way an ontology can
be a designing tool for KBSs
Studying some issues regarding the
knowledge base development


Knowledge engineering and knowledge management


Knowledge management (KM) and
knowledge engineering (KE) are topical terms
They are invoked when talking about:
Knowledge-based society
Knowledge-based economy
Knowledge-based company
Knowledge-based marketing
Knowledge-based development

Knowledge engineering and knowledge management


The terms "knowledge management" and "knowledge
engineering" seem to be used as interchangeably
as the terms "data" and "information" used to be.
The difference stems from the distinction
between manager and engineer
Manager - exercises executive, administrative and
supervisory direction
Engineer - lays out, constructs, contrives or plans out,
usually with a specific skill


Knowledge engineering and knowledge management


The knowledge manager establishes the
direction a process should take
The knowledge engineer develops the means
to accomplish that direction
Though they have distinct tasks, they should
cooperate

Knowledge engineering and knowledge management


Tasks for knowledge managers:
To establish the knowledge needs of the
enterprise (what knowledge is necessary to
make what decision and enable what actions)
To ascertain the enterprise knowledge
management policies


Knowledge engineering and knowledge management


Tasks for knowledge engineers:
Knowledge representation and encoding
Manage data repositories, work flows,
groupware technologies
Set up the processes by which knowledge
requests are examined, information
assembled, and knowledge returned to the
requestors

Knowledge engineering and knowledge management


Knowledge manager - can be the Chief
Information Officer (CIO) or the person in
charge of the Information Resource
Management (IRM).

Knowledge engineer - most likely a computer
scientist specialized in the development of
artificial intelligence knowledge bases.


Knowledge engineering and knowledge management


Knowledge management (KM) comprises a
range of strategies and practices used in an
organization to identify, create, represent,
distribute, and enable adoption of insights
and experiences.
Such insights and experiences comprise
knowledge, either embodied in individuals or
embedded in organizations as processes or
practices.

Knowledge engineering and knowledge management


Knowledge engineering (KE) - an engineering
discipline that involves integrating knowledge
into computer systems in order to solve
complex problems normally requiring a high
level of human expertise
KM & KE ensure the creation,
communication and application of knowledge
of all types to meet enterprise goals
Companies - physical assets versus
intangible assets


Knowledge engineering and knowledge management


Adaptability of companies is related to KM
Creating new elements of knowledge
Fast & efficient dissemination of knowledge
Incorporation of knowledge into new products
and services

Knowledge engineering and knowledge management


Nine reasons to adopt KM:
1. Companies are "knowledge intensive"
rather than "capital intensive"
2. Unstable markets require a new
organization for companies
3. KM allows companies to control the change
(lets you lead change, so that change does
not lead you)


Knowledge engineering and knowledge management


Nine reasons to adopt KM:
4. Only knowledge survives
5. Avoiding the losses caused by tacit
knowledge
6. Knowledge should be disseminated

Knowledge engineering and knowledge management


Nine reasons to adopt KM:
7. Complexity of production situations can be
mastered by KM
8. KM provides decision support systems
9. Use of the opportunities resulting from
globalization


Knowledge engineering and knowledge management


KM is important for two types of companies:
1. Companies that must hold their position
against competitors (maintaining and
upgrading the core of knowledge)
2. Companies that rely on products that
require innovation

Knowledge engineering and knowledge management


KM influences the main organizational
processes of an enterprise
The impact of KM can be considered on three
important features:
Optimality (best decisions/ production
mechanisms)
Efficiency (minimizing time and costs)
Adaptability (carrying out various processes in a
creative way, complying with requests)


Knowledge engineering and knowledge management


Dimensions of the impact of KM
[Figure: Knowledge Management and its four dimensions of impact: People, Processes, Products, Performance]

Knowledge engineering and knowledge management


KM impacts on people
[Figure: KM system, knowledge, employee learning, employee adaptability, employee job satisfaction]


Knowledge engineering and knowledge management


KM case studies
1. Ford and Firestone
The companies had the necessary information
on the incompatibility of the Ford Explorer and
Firestone tires
The information was not properly integrated and
communicated between the companies
Results: accidents, lawsuits, loss of
customers

Knowledge engineering and knowledge management


KM case studies
2. Xerox
KM is conducted as a scientific community
(Community of Practice)
Provides visibility of advanced solutions
Ability to solve unstructured problems
Knowing the latest technologies


Knowledge engineering and knowledge management


KM case studies
3. Toyota
It uses an advanced KM system
Motivates employees to participate in the
dissemination of knowledge
Eliminates "free riders"
Reduces costs associated with finding and
accessing valuable knowledge elements

Knowledge engineering and knowledge management

Investment in KM is a capital investment
that produces long-term benefits.
"Ideas are capital. The rest is just money."
Deutsche Bank, Wall Street Journal


Ontology Introduction; Basic Aspects


The same way as for any software system,
for a KBS a principled architecture is to be
designed first
Some main points should be decided
before implementation
The reasons why, and the moments when, inference
will be necessary in our system's behavior

Ontology a key notion for KBS


An ontology must be staked out as a starting
point
An ontology defines the set of representational
primitives with which to model a universe of discourse (UoD)
Ontology - a description of the concepts and
relations existing in a UoD, which are
established to support a software system


Ontology a key notion for KBS


The definitions of concepts/relations include
information about their meaning and
constraints on their logically consistent
application
The purpose of an ontology can be viewed as
setting up the rules so that a group of entities
should give the same interpretation for a part
of the world

Ontology a key notion for KBS


An ontology is an explicit description of a
domain:
concepts
properties and attributes of concepts
constraints on properties and attributes
individuals


Ontology a key notion for KBS


An ontology defines
a common vocabulary
a shared understanding (among people and/or
software entities)

An ontology should enable reuse of domain
knowledge
to avoid re-inventing the wheel
to introduce standards

Ontology a key notion for KBS


To make domain assumptions explicit
easier to change domain assumptions
easier to understand and update legacy data

To separate domain knowledge from the
operational knowledge
re-use domain and operational knowledge
separately


Ontology Development
Defining terms in the domain and relations
among them
Defining concepts in the domain (classes)
Arranging the concepts in a hierarchy (subclass-superclass hierarchy)
Defining which attributes/properties (slots) classes
can have and constraints on their values
Defining individuals and filling in slot values
(instances)

Ontology Development
An example of ontology


Ontology Development
Ontology development versus object-oriented modeling

An ontology:
- It reflects the structure of the world
- It is often about the structure of concepts
- Actual physical representation is not an issue

An OO structure:
- It reflects the structure of data and code
- It is usually about behavior (methods)
- It describes the physical representation of data (long int, char, etc.)

Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: determine scope)

What is the domain that the ontology will cover?
What are we going to use the ontology for?
What types of questions should the information in
the ontology answer?
Who will use and maintain the ontology?


Ontology Development
Example: questions for the wine ontology
Which wine characteristics should I consider when
choosing a wine?
Is Bordeaux a red or white wine?
Does Cabernet Sauvignon go well with seafood?
What is the best choice of wine for grilled meat?
Which characteristics of a wine affect its
appropriateness for a dish?
Does a bouquet or body of a specific wine change
with vintage year?

Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: consider reuse)

Why reuse other ontologies?
to save the effort
to interact with the tools that use other ontologies
to use ontologies that have been validated
through use in applications


Ontology Development
Examples of existing ontologies
Ontology libraries
Protégé ontology library (protege.stanford.edu)
Ontolingua ontology library
(www.ksl.stanford.edu/software/ontolingua/)

Upper ontologies
IEEE Standard Upper Ontology (suo.ieee.org)
Cyc (www.cyc.com)

Domain-specific ontologies
UMLS Semantic Net
GO (Gene Ontology) (www.geneontology.org)
OBO (Open Biological Ontologies) (obo.sourceforge.net)

Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: enumerate terms)

What are the terms we need to talk about?
What are the properties of these terms?
What do we want to say about the terms?


Ontology Development
Examples of terms for wine ontology
wine, grape, winery, location
wine color, wine body, wine flavor, sugar
content
white wine, red wine, Bordeaux wine
food, seafood, fish, meat, vegetables, cheese

Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: define classes)

A class is a concept in the domain
a class of wines
a class of wineries
a class of red wines

A class is a collection of elements with similar
properties
Instances of classes
a California wine you'll have for lunch


Ontology Development
Inheritance is a common mechanism in a
class hierarchy
Classes usually constitute a taxonomic
hierarchy (a subclass-superclass hierarchy)
A class hierarchy is usually an IS-A
hierarchy:
an instance of a subclass is an instance of
a superclass
If you think of a class as a set of elements, a
subclass is a subset
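The IS-A reading of a class hierarchy can be sketched in Python as follows. The sketch assumes that the hierarchy is stored as a mapping from each class to its direct superclasses; the class names are taken from the wine example, while the instance name myPort and the helper functions are hypothetical.

```python
# Direct subclass-of links (class -> set of direct superclasses).
SUBCLASS_OF = {
    "RedWine": {"Wine"},
    "DessertWine": {"Wine"},
    "Port": {"RedWine", "DessertWine"},
}

# Direct type of each instance (hypothetical instance name).
DIRECT_TYPE = {"myPort": "Port"}


def superclasses(cls, subclass_of):
    """Return every direct or indirect superclass of cls."""
    result = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in result:
            result.add(parent)
            stack.extend(subclass_of.get(parent, ()))
    return result


def is_instance_of(instance, cls, direct_type, subclass_of):
    """An instance of a subclass is also an instance of every superclass."""
    direct = direct_type[instance]
    return cls == direct or cls in superclasses(direct, subclass_of)


print(is_instance_of("myPort", "Wine", DIRECT_TYPE, SUBCLASS_OF))  # True
```

Computing the superclasses by following the links transitively corresponds to the set view: the instances of Port form a subset of the instances of RedWine, which form a subset of the instances of Wine.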

Ontology Development
Multiple inheritance can create problems
A class can have more than
one superclass
The subclass inherits properties
and restrictions from all the
parents
Different systems resolve
conflicts differently

[Figure: Robot is a subclass of both Industrial Equipment and Programmable System]
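As one concrete illustration of how a system may resolve such conflicts, Python itself linearizes the superclasses of a class (its method resolution order) and takes the first definition found. The attribute power_supply and its values are hypothetical, introduced only to create a conflict between the two parents; an ontology tool may follow a different policy.

```python
class IndustrialEquipment:
    power_supply = "three-phase"   # hypothetical property value


class ProgrammableSystem:
    power_supply = "low-voltage"   # conflicting hypothetical value


class Robot(IndustrialEquipment, ProgrammableSystem):
    """Inherits from both parents; the order of the parents matters."""


# Python resolves the conflict in favour of the first listed parent.
print(Robot.power_supply)                      # three-phase
print([c.__name__ for c in Robot.__mro__])
```

Swapping the order of the parents in the class statement would make Robot inherit the other value, which shows that the outcome is a design decision of the system and not a property of the domain.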


Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: define properties)

Properties in a class definition describe
attributes of instances of the class
each wine will have color, sugar content,
producer, etc.

Ontology Development
A taxonomy of properties
Types of properties
intrinsic properties: flavor and color of wine
extrinsic properties: name and price of wine
parts: ingredients in a dish
relations to other objects: producer of wine (winery)

Simple and complex properties


simple properties (attributes): contain primitive values
(strings, numbers)
complex properties: contain other objects (e.g., a winery
instance)


Ontology Development
Properties
A subclass inherits all the properties from the
superclass
If a wine has a name and flavor, a red wine also
has a name and flavor

If a class has multiple superclasses, it inherits
properties from all of them
Port is both a dessert wine and a red wine. It
inherits "sugar content: high" from the former and
"color: red" from the latter

Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: define constraints)

Property constraints (restrictions) describe or
limit the set of possible values for a property
the name of a wine is a string
the wine producer is an instance of Winery
a winery has exactly one location


Ontology Development
Property constraints
Global property constraints apply to the
property throughout the ontology
range for hasColor is always an instance of
WineColor

Local property constraints (restrictions) are valid
only for instances of the class and its
subclasses
hasColor for RedWine has the value Red

Ontology Development
Property constraints
Cardinality - the number of values a property can or must
have

Minimum cardinality
Minimum cardinality 1 means that the property must have a
value (required)
Minimum cardinality 0 means that the property value is
optional

Maximum cardinality
Maximum cardinality 1 means that the property can have at
most one value (functional property)
Maximum cardinality greater than 1 means that the property
can have more than one value (multiple-valued property)
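A cardinality check of the kind a knowledge-acquisition tool might perform can be sketched as below. The function name and the sample slot values are hypothetical; the constraint "a winery has exactly one location" is the one mentioned earlier in this chapter.

```python
def check_cardinality(values, minimum=0, maximum=None):
    """Return True when the number of values respects the cardinality bounds.

    minimum=1 makes the property required; maximum=1 makes it functional;
    maximum=None means that no upper bound is imposed.
    """
    count = len(values)
    if count < minimum:
        return False
    if maximum is not None and count > maximum:
        return False
    return True


# "A winery has exactly one location": minimum 1 and maximum 1.
print(check_cardinality(["Napa"], minimum=1, maximum=1))             # True
print(check_cardinality([], minimum=1, maximum=1))                   # False
print(check_cardinality(["Napa", "Sonoma"], minimum=1, maximum=1))   # False
```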


Ontology Development
Steps: determine scope, consider reuse, enumerate terms, define classes, define properties, define constraints, create instances (current step: create instances)

Create an instance of a class
The class becomes a direct type of the instance
Any superclass of the direct type is a type of the
instance
Assign property values for instances
Property values should conform to the constraints
Knowledge-acquisition tools often perform this check

Ontology Development
All the siblings in the class hierarchy must
be at the same level of generality
If a class has only one child, there may be
a modeling problem
If a class has more than a dozen children,
additional subcategories may be
necessary


Ontology Development
Danger of multiple inheritance: cycles in
the class hierarchy
[Figure: classes A, B and C connected in a cycle by subclass-of links]
Classes A, B, and C have equivalent sets of
instances
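Such cycles can be detected with a depth-first search over the subclass-of links. The sketch below assumes the hierarchy is given as a mapping from each class to its direct superclasses; the direction of the links among A, B and C is an assumption, since only the existence of the cycle is stated above, and the function name is hypothetical.

```python
def has_cycle(subclass_of):
    """Return True when the subclass-of links contain a cycle."""
    visiting, done = set(), set()

    def visit(cls):
        if cls in done:
            return False
        if cls in visiting:
            return True          # reached a class that is still being explored
        visiting.add(cls)
        for parent in subclass_of.get(cls, ()):
            if visit(parent):
                return True
        visiting.remove(cls)
        done.add(cls)
        return False

    return any(visit(cls) for cls in subclass_of)


cyclic = {"A": {"B"}, "B": {"C"}, "C": {"A"}}          # assumed link direction
acyclic = {"Port": {"RedWine"}, "RedWine": {"Wine"}}
print(has_cycle(cyclic), has_cycle(acyclic))  # True False
```

In the cyclic case every class is a subclass of every other one, so all three must have the same instances, which is why the cycle signals a modeling error.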

Ontology Development
Different modes of the development
top-down - define the most general concepts
first and then specialize them
bottom-up - define the most specific concepts
and then organize them in more general
classes
combination


Ontology Development
Classes represent concepts in the domain,
not their names
The class name can change, but it will still
refer to the same concept
Synonym names for the same concept are
not different classes
Many systems allow listing synonyms as part of
the class definition

Ontology Development
A wine is not a kind-of wines
A wine is an instance of the class Wines
Class names should be either
all singular
all plural


Ontology Development
Subclasses of a class usually have
Additional/new properties
Additional/new restrictions
Participate in different relationships

Ontology Development
One has to decide between a new class and a
property value
Option 1: class Wine with subclasses White wine, Rosé wine, Red wine
OR
Option 2: class Wine with the property Colour: white, rosé, red


Ontology Development
How important is the distinction for the
domain?
A class of an instance should not change
often
Individual instances are the most specific
objects in an ontology
If concepts form a natural hierarchy,
represent them as classes

Ontology Development
Domain of a property - the class (or classes)
that have the property
More precisely: class (or classes) of
instances which can have the property
Range of a property - the class (or classes)
to which property values belong


Ontology Development
When defining a domain or range for a slot,
find the most general class or classes
Consider the produces slot for a Winery:
Range: Red wine, White wine, Rosé wine (too specific)
Range: Wine (preferred)
Consider the flavor slot
Domain: Red wine, White wine, Rosé wine (too specific)
Domain: Wine (preferred)

Ontology Development
Inverse properties (e.g., maker and produces)
contain redundant information, but
Allow acquisition of the information in either direction
Enable additional verification
Allow presentation of information in both directions

The actual implementation differs from system to
system
Are both values stored?
When are the inverse values filled in?
What happens if we change the link to an inverse property?
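One possible answer to these questions is sketched below: both directions are stored, and the inverse is updated at the moment a value is set. This is only one design among several; the class name InverseLinks, the method set_maker and the sample wine and winery names are hypothetical.

```python
class InverseLinks:
    """Keeps the inverse properties maker and produces consistent."""

    def __init__(self):
        self.maker = {}      # wine -> winery
        self.produces = {}   # winery -> set of wines

    def set_maker(self, wine, winery):
        """Set maker(wine) and update produces in the opposite direction."""
        old = self.maker.get(wine)
        if old is not None:
            # Changing the link removes the stale inverse value.
            self.produces[old].discard(wine)
        self.maker[wine] = winery
        self.produces.setdefault(winery, set()).add(wine)


links = InverseLinks()
links.set_maker("wine1", "wineryA")
links.set_maker("wine1", "wineryB")      # the link is changed
print(links.produces["wineryA"], links.produces["wineryB"])
```

A system that stores only one direction would instead compute the inverse on demand, trading storage for query time.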


Ontology Development
Default value - a value the slot gets when
an instance is created and a specific value
is not provided
A default value can be changed
The default value is a common value for
the slot, but is not a required value

Ontology Development
Naming convention
Define a naming convention for classes and
properties and adhere to it
Features of an ontology tool to be
considered:
Can classes and properties have the same
names?
Is the system case-sensitive?
What delimiters are allowed?


Concluding Remarks about Ontologies


There is no single correct class hierarchy, but
there are some guidelines
An ontology should not contain all the
possible information about the domain
No need to specialize or generalize more than the
application requires
No need to include all possible properties of a
class
Only the most salient properties
Only the properties that the applications require

Concluding Remarks about Ontologies


Ontology of wine, food, and their pairings probably
will not include
Bottle size
Label color
My favorite food and wine

An ontology of biological experiments
will contain
Biological organism
Experimenter
Is the class Experimenter a subclass of Biological
organism?


Vocabulary
A main point for an ontology is to agree on the
use of a certain vocabulary
The nonlogical symbols are those that have
an application-dependent meaning or use
Vocabulary establishing means the setting up
of the nonlogical symbols
Considered example: a soap opera world

Vocabulary
Constants:
Named individuals: maryJones, johnSmith,
tomJones
One needs to allow multiple identifiers that could
be used to refer to the same individual (e.g. tom,
johnSmith)
Other entities: corporations, restaurants, places,
objects (e.g. astraInsuranceCompany,
iasiTownCouncil, tomsHouse, studentgroup1)


Vocabulary
After capturing the set of individuals that will
be central to the agent's world, it is next
essential to circumscribe the basic types of
entities that those individuals are.
The unary predicates should be used
Person(x)
Other Examples:
Man, Woman, Place, Company, Jewelry, Pen
Restaurant, Bar, House, University

Vocabulary
The unary predicates can also describe the
set of attributes that the entities can have
A vocabulary of properties that can hold for
individuals is to be established
Examples:
Rich, Beautiful, Unscrupulous, Bankrupt,
ClosedForRepairs

Important remark: The syntax of FOL is
limited in that it does not allow us to distinguish
between the predicates for properties and
those used to describe the types of entities


Vocabulary
The n-ary predicates can express
relationships
Examples of predicates of arity 2:
MarriedTo, DaughterOf, LivesAt

Examples of predicates of higher arity:
ConspiresWith, OccursInTimeInterval

Vocabulary
The n-ary functions are also part of the
vocabulary
Both unary and higher arity functions can be
included
Examples of functions
fatherOf, bestFriendOf (unary functions)
childOf, gradeOf (binary functions; e.g.
gradeOf(John, KRR) = 10)


Vocabulary
Important remark: there may be cases when
either a function or a predicate can be used
Some criteria to decide:
Functions are taken to be total in FOL; if
exceptions are possible, then use predicates
(e.g. bestFriendOf versus BestFriendOf)
Functions are needed as a means to keep using
the FOL (e.g. Happy(bestFriendOf(x)) is an FOL
formula, while Happy(BestFriendOf(x)) is beyond
FOL)

Types of facts - Simple, core facts of the UoD


Simple facts
Atomic formula
Negations of atomic formula
Examples:
Man(john), Company(faultyInsuranceCompany),
Knife(butcherknife1)
These define the basic ontology of UoD
Suggestive names are not a form of knowledge
representation, as they do not support logical inference


Types of facts - Simple, core facts of the UoD


Properties of the entities
These are important, since we most often want to
see what properties (and relationships) are implied
by a set of facts
Examples:
Rich(john), ¬HappilyMarried(jim),
WorksFor(jim, ann), bestFriendOf(jim) = john,
fic = faultyInsuranceCompany

Types of facts - Complex facts


Complex facts are expressed by non-atomic
formula
Whenever we use quantifiers the domain of the
variables must be specified (this is usually the UoD)
Examples:
All the rich men love Jane
∀x [Rich(x) ∧ Man(x) ⊃ Loves(x, jane)]


Types of facts - Complex facts


All women, with the possible exception of Jane,
love John
∀x [Woman(x) ∧ ¬(x = jane) ⊃ Loves(x, john)]

Universals are also useful for expressing very
general facts, not even involving any known
individuals
∀x ∀y [Loves(x, y) ⊃ ¬Blackmails(x, y)]

Universal quantification helps us to express
statements which otherwise would require
enumerating all the individuals of the UoD (a kind
of abbreviation)

Types of facts - Complex facts


To express incomplete knowledge about our
world we also need complex facts
Examples
Friends(jane, john) ∨ Friends(jane, jim)
∃x [Adult(x) ∧ Friends(x, ann)]

The use of the existential quantification is not
necessarily an abbreviation, but determined by
the lack of knowledge


Types of facts - Complex facts


Complex facts are used to express closure
sentences - the limits of the UoD
Examples
∀x [Lawyer(x) ⊃ x = jane ∨ x = john ∨ …]

For a KBS, it may be necessary to make
explicit all the distinct individuals of the UoD
∀x [x = jane ∨ x = fic ∨ x = tomsHouse ∨ …]
jane ≠ john; fic ≠ tomsHouse; …

Types of facts - Terminological facts


We need to provide a set of facts about the
terminology we are using
Terminological facts are supporting the
inferential component of the KBS
Terminological facts capture the relationships
between statements expressed by predicates
and functions


Types of facts - Terminological facts


Examples of terminological facts:
Disjointness - two predicates that are disjoint
∀x [Man(x) ⊃ ¬Woman(x)]

Subtypes - predicates that imply a form of
specialization, wherein one type is subsumed by
another
∀x [Surgeon(x) ⊃ Doctor(x)]

Types of facts - Terminological facts


Examples of terminological facts:
Exhaustiveness - two or more subtypes
completely account for a supertype (reverse of
subtype assertion)
∀x [Adult(x) ⊃ Man(x) ∨ Woman(x)]

Symmetry - used for symmetric relationships
∀x ∀y [MarriedTo(x, y) ⊃ MarriedTo(y, x)]


Types of facts - Terminological facts


Examples of terminological facts:
Inverse - to make evident that some
relationships are the opposite of others
∀x ∀y [ChildOf(x, y) ⊃ ParentOf(y, x)]

Type restrictions - the arguments of some
predicates must be of certain types
∀x ∀y [MarriedTo(x, y) ⊃ Person(x) ∧ Person(y)]

Types of facts - Terminological facts


Examples of terminological facts:
Full definitions - to create compound
predicates that are completely defined by a
logical combination of other predicates (the use
of biconditional/equivalence)
∀x [RichMan(x) ≡ Man(x) ∧ Rich(x)]

Conclusion: terminological facts are typically
captured in FOL as universally quantified
conditionals
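Over a finite interpretation, each kind of terminological fact reduces to a simple set-theoretic test. The Python sketch below assumes that unary predicates are represented by sets of individuals and binary predicates by sets of pairs; the particular extensions and the helper function names are hypothetical.

```python
# A small hypothetical interpretation of the soap opera vocabulary.
man = {"john", "jim"}
woman = {"jane", "ann"}
adult = {"john", "jim", "jane", "ann"}
rich = {"john"}
rich_man = {"john"}
married_to = {("jim", "ann"), ("ann", "jim")}


def disjoint(p, q):
    """For all x: P(x) implies not Q(x)."""
    return not (p & q)


def subtype(p, q):
    """For all x: P(x) implies Q(x)."""
    return p <= q


def exhaustive(supertype, *subtypes):
    """For all x: Super(x) implies Sub1(x) or Sub2(x) or ..."""
    return supertype <= set().union(*subtypes)


def symmetric(relation):
    """For all x, y: R(x, y) implies R(y, x)."""
    return all((y, x) in relation for x, y in relation)


def full_definition(defined, *parts):
    """For all x: Defined(x) if and only if every part holds of x."""
    return defined == set.intersection(*parts)


print(disjoint(man, woman), exhaustive(adult, man, woman),
      symmetric(married_to), full_definition(rich_man, man, rich))
```

Each test only checks that one given interpretation satisfies the fact; the KBS itself uses the facts in the other direction, as premises from which further conclusions are entailed.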


Types of facts Other sorts of facts


In certain domains, there are some other
types of facts that we may want to capture.
Each of these is problematical for a
straightforward application of FOL.
Extensions of FOL or other knowledge
representation languages allow their handling.
The choice of the language to use in a KBS will
ultimately depend on what types of facts/
conclusions are most important for the
application.

Types of facts Other sorts of facts


Statistical and probabilistic facts include
those facts that involve portions of the sets of
individuals satisfying a predicate, in some cases
exact subsets and in other cases less exactly
quantifiable.
Examples:
Half of the companies are located on the East
Side.
Most of the employees are restless.
Almost none of the employees are completely
trustworthy.


Types of facts Other sorts of facts


Default and prototypical facts regard
characteristics that are usually true, or
reasonable to assume true unless told otherwise
Examples:
Company presidents typically have secretaries
intercepting their phone calls.
Cars have four wheels.
Companies generally do not allow employees that
work together to be married.
Birds fly.

Types of facts Other sorts of facts


Intentional facts - express people's mental
attitudes and intentions, that is, they can reflect
the reality of people's beliefs but not necessarily
the real world itself
Examples:
John believes that Henry is trying to blackmail him.
Jane does not want Jim to know that she loves
him.

Ultimately, a KBS should be able to express and
reason with anything that can be expressed by
natural language, anything that we can imagine
as being either true or false.


Reification
The FOL language gives us the basic tools for
representing facts with a great deal of flexibility
There is also considerable flexibility in what we
consider to be the individuals in the domain
Sometimes it is useful to introduce new abstract
individuals that might not have been considered
in a first analysis
Reification - the idea of making up new
individuals as needed for the most appropriate
level of detail

Reification
Example 1: the event of John's buying a bike
Purchases(john, bike)
Purchases(john, bike, feb15)
Purchases(john, bike, feb15, $200)

The problem here is that the arity of the
Purchases predicate depends on how much
detail we will want to express


Reification
We may not be able to predict in advance the
level of detail; we need another approach, more
flexible.
The solution is to introduce an abstract individual,
together with a unary predicate and some functions
p23 - a constant representing the event of
purchasing
Purchase(p23) ∧ agent(p23) = john ∧ object(p23)
= bike ∧ time(p23) = feb15 ∧ cost(p23) = $200
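The reified representation maps naturally onto a store of (function, event, value) triples, to which details can be added later without changing any arity. The sketch below is an illustration only; the helper names assert_event and events_where are hypothetical, and the cost is stored as the plain number 200.

```python
def assert_event(kb, event, event_type, **roles):
    """Record the type of an event and any number of role values."""
    kb.add(("type", event, event_type))
    for role, value in roles.items():
        kb.add((role, event, value))


def events_where(kb, event_type, **roles):
    """Return the events of a given type whose roles match the given values."""
    events = {e for (r, e, v) in kb if r == "type" and v == event_type}
    return {e for e in events
            if all((role, e, value) in kb for role, value in roles.items())}


kb = set()
# Coarse description first ...
assert_event(kb, "p23", "Purchase", agent="john", object="bike")
# ... more detail added later, with no change to existing facts.
assert_event(kb, "p23", "Purchase", time="feb15", cost=200)

print(events_where(kb, "Purchase", agent="john"))   # {'p23'}
print(events_where(kb, "Purchase", cost=200))       # {'p23'}
```

Because the event is an individual, it can itself be returned as the answer to a query, which is not possible when the purchase exists only as a predicate with a fixed number of arguments.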

Reification
We gain flexibility: more or fewer details can be
expressed by adjusting the formula
Advantage - the arity of the predicate and
function symbols involved can be determined in
advance
Example 2: a set of predicates to represent the
marriage relationships: MarriedTo(x, y);
PreviouslyMarriedTo(x, y); ReMarriedTo(x, y)


Reification
We can reify the marriage events as abstract
individuals (entities)
We can determine anyone's current marital
status and complete marital history directly from
these individuals
Marriage(m20) ∧ husband(m20) = john ∧
wife(m20) = ann ∧ date(m20) = …

Reification
In representing commonsense information we
need individuals for numbers, dates, times,
addresses, and so on.
Basically, any object about which we can ask
a wh-question should have an individual
standing for it in the KB so it can be returned as
the result of a query.


Reification
We gain flexibility in representing quantitative
knowledge, by using proper functions
Example:
ageInYears(suzzy) = 12
ageInMonths(suzzy) = 144
years(age(suzzy)) = 12
months(x) = 12 * years(x)

Entailment
Entailment is part of the reasoning process
Entailment allows deriving of implicit
conclusions from statements explicitly
represented in the KB
KB - a collection of simple and complex facts
(considered example, soap opera world)
If the only tasks are to answer simple
questions (e.g. Is John a rich man?), then
entailments will not be needed


Entailment
Entailments are needed when complex
statements are analyzed
Example 1: Is there a company whose CEO
(chief executive officer) loves Jane?
α: ∃x [Company(x) ∧ Loves(ceoOf(x), jane)] ?
KB |= α (for every I, if I |= KB (*), then I |= α)

Entailment
I |= {Rich(john), Man(john)} - from (*) (1)
I |= ∀x [Rich(x) ∧ Man(x) ⊃ Loves(x, jane)] - from (*) (2)
(1), (2) ⇒ I |= Loves(john, jane) (3)
I |= {john = ceoOf(fic),
Company(faultyInsuranceCompany),
fic = faultyInsuranceCompany} - from (*) (4)
(3), (4) ⇒ I |= Company(fic) ∧ Loves(ceoOf(fic), jane)
⇒ I |= ∃x [Company(x) ∧ Loves(ceoOf(x), jane)]
⇒ KB |= α

Entailment
Remarks:
The provided solution determined not only that
there is a company whose CEO loves Jane, but
also what that company is.
We can be interested in finding out not only
whether something is true or not, but also which
individuals satisfy a property of interest.
A KBS has to deal not only with yes-no questions,
but with wh-questions as well (who? what? where?
when? how? why?)
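The difference between a yes-no answer and a wh-answer can be illustrated by replaying the steps of Example 1 in Python. This is not a general entailment procedure: it hard-codes the facts and the single rule used in the derivation, and the helper names (ALIASES, canonical, companies_whose_ceo_loves) are hypothetical.

```python
# fic = faultyInsuranceCompany: two names for the same individual.
ALIASES = {"fic": "faultyInsuranceCompany"}


def canonical(name):
    """Map every alias to one canonical identifier."""
    return ALIASES.get(name, name)


rich = {"john"}
man = {"john"}
company = {canonical("faultyInsuranceCompany")}
ceo_of = {canonical("fic"): "john"}               # john = ceoOf(fic)

# Rule (2): every rich man loves Jane.
loves = {(x, "jane") for x in rich & man}


def companies_whose_ceo_loves(person):
    """Return the witnesses of the existential query, not just yes or no."""
    return sorted(c for c in company if (ceo_of.get(c), person) in loves)


answers = companies_whose_ceo_loves("jane")
print(bool(answers))   # yes-no answer: True
print(answers)         # wh-answer: ['faultyInsuranceCompany']
```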


Entailment
Example 2: If no man is blackmailing John, then
is he being blackmailed by someone he loves?
α: ∀x [Man(x) ⊃ ¬Blackmails(x, john)] ⊃
∃y [Loves(john, y) ∧ Blackmails(y, john)] ?
KB |= α ?
It is easier to demonstrate the requested
entailment by using:
KB |= (α1 ⊃ β1) iff KB ∪ {α1} |= β1

Entailment
I |= KB (*) and
I |= ∀x [Man(x) ⊃ ¬Blackmails(x, john)] (**)
⇒ I |= ∃y [Loves(john, y) ∧ Blackmails(y, john)] ?

I |= ∃x [Adult(x) ∧ Blackmails(x, john)] - from (*) (1)
I |= ∀x [Adult(x) ⊃ Man(x) ∨ Woman(x)] - from (*) (2)
I |= ∃x [Woman(x) ∧ Blackmails(x, john)] - from (**), (1), (2) (3)
I |= Loves(john, jane) - from the previous example (4)
I |= ∀x [Woman(x) ∧ ¬(x = jane) ⊃ Loves(x, john)] - from (*) (5)
I |= ∀x ∀y [Loves(x, y) ⊃ ¬Blackmails(x, y)] - from (*) (6)
I |= ∀x [Woman(x) ∧ ¬(x = jane) ⊃ ¬Blackmails(x, john)] - from (5), (6) (7)
I |= Blackmails(jane, john) - from (7), (3) (8)
I |= Blackmails(jane, john) ∧ Loves(john, jane) - from (8), (4) (9)
I |= ∃y [Loves(john, y) ∧ Blackmails(y, john)] - from (9)


Entailment
We have illustrated, in intuitive form, how a
proof can be thought of as a sequence of FOL
sentences, starting with those known to be true
in the KB or surmised as part of the
assumptions dictated by the query.
The proof proceeds logically using the facts in
the KB and the rules of logic.

Entailment
In some cases the answer to a question may
be a negative one.
Entailment can be involved in such a case, in
a more complicated way.
We have to prove:
KB |≠ α
We must produce a specific interpretation and
argue that it satisfies every sentence in the KB
as well as the negation of α


Chapter 3 Knowledge Engineering


Homework
Consider the following universes of discourse
and sketch an ontology for each of them:
the manufacturing cell from the Robotics
Laboratory
the laboratory of Programmable Logic Controllers

Problems 1, 2 and 3, Brachman and Levesque, pp. 45-46.

Chapter 3 Knowledge Engineering


Homework
Consider the rule-based solution (in CLIPS) for
planning a robot activity in the world of
blocks. Analyze the working memory and
establish which facts are simple, complex and
terminological.


Chapter 4 Resolution
Chapter Overview
Introduction
Resolution in Propositional Calculus
Resolution Derivations
A Resolution based Entailment Procedure
Resolution in FOL
Answer Extraction
Skolemization
Equality
Dealing with the computational complexity of
resolution

Chapter 4 Resolution
Objectives for Chapter 4
Understanding the resolution-based reasoning
mechanism
A procedure to apply the resolution rule for all
the cases of FOL formulas
A procedure to extract an answer from a
knowledge base
Understanding the resolution limitations


Resolution - Introduction
Resolution is the main method applied by
automated theorem proving
It details how to automate a deductive
reasoning procedure
Problems to solve:
Given a knowledge base KB and a sentence α,
we would like a procedure that can determine
whether or not KB |= α
If β[x1, ... , xn] is a formula with free variables
among the xi, we want a procedure that can find
terms ti, if they exist, such that KB |= β[t1, ... , tn]

Resolution - Introduction
There is no procedure that can fully satisfy
this specification
The possible result: a procedure that does
deductive reasoning in as sound and
complete a manner as possible, and in a
language as close as possible to that of full
FOL


Resolution - Introduction
If we take the KB to be a finite set of
sentences {α1, ... , αn}, there are several
equivalent ways of formulating the deductive
reasoning task:
KB |= α
|= [(α1 ∧ ... ∧ αn) ⊃ α]
KB ∪ {¬α} is not satisfiable

Resolution is simpler for propositional calculus

Resolution in Propositional Calculus


It is easy to show that for any formula α there
is an equivalent formula α' so that α' is in the
conjunctive normal form.
Example:
p, q, r, s: predicate symbols of arity zero
(p ∨ ¬q) ∧ (q ∨ r ∨ ¬s ∨ p) ∧ (¬r ∨ q)

Automated theorem proving uses only formulas in the
conjunctive normal form (CNF)


Resolution in Propositional Calculus


The procedure to convert any propositional
formula to CNF is as follows:
1. Eliminate the logical operations ⊃ and ≡ by
using only ¬, ∧ and ∨
2. Use the tautology: |= ¬¬α ≡ α
3. Use De Morgan's laws
4. Distribute ∨ over ∧, using the following
equivalences:
|= (α ∨ (β ∧ γ)) ≡ ((β ∧ γ) ∨ α) ≡ ((α ∨ β) ∧ (α ∨ γ))
5. Collect terms, using the following
equivalences:
|= (α ∨ α) ≡ α
|= (α ∧ α) ≡ α

The CNF is also named clausal normal form:
a formula is a conjunction of clauses, and a
clause is a disjunction of literals (p is a
positive literal, ¬p is a negative literal)
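
The five steps above can be sketched in Python. This is only an illustrative sketch, not part of the course material: formulas are assumed to be nested tuples such as ("implies", "p", "q"), atoms are plain strings, a negative literal is written with a leading "~", and the function names are hypothetical.

```python
def eliminate(f):
    """Step 1: rewrite 'implies' and 'iff' using only not / and / or."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate(a) for a in args]
    if op == "implies":
        return ("or", ("not", args[0]), args[1])
    if op == "iff":
        a, b = args
        return ("and", ("or", ("not", a), b), ("or", ("not", b), a))
    return (op, *args)


def push_not(f):
    """Steps 2-3: double negation and De Morgan, so 'not' only wraps atoms."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op == "not":
        g = args[0]
        if isinstance(g, str):
            return f
        gop, *gargs = g
        if gop == "not":
            return push_not(gargs[0])
        dual = "or" if gop == "and" else "and"
        return (dual, push_not(("not", gargs[0])), push_not(("not", gargs[1])))
    return (op, push_not(args[0]), push_not(args[1]))


def clauses(f):
    """Steps 4-5: distribute 'or' over 'and'; sets collect repeated terms."""
    if isinstance(f, str):
        return {frozenset([f])}
    op, *args = f
    if op == "not":
        return {frozenset(["~" + args[0]])}
    left, right = clauses(args[0]), clauses(args[1])
    if op == "and":
        return left | right
    return {a | b for a in left for b in right}  # op == "or"


def to_cnf(f):
    """Return the clausal form: a set of clauses, each a frozenset of literals."""
    return clauses(push_not(eliminate(f)))
```

Representing clauses as sets makes the "collect terms" step implicit, since a set cannot hold the same literal or clause twice.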


Resolution in Propositional Calculus


It is convenient to use a shorthand
representation for CNF:
a clausal formula is a set of clauses
a clause is a set of literals

An example:
((p ∨ ¬q ∨ r) ∧ q)
{[p, ¬q, r], [q]}

Resolution in Propositional Calculus


A clause with a single literal: unit clause
The empty clause - [ ] - is understood as a
representation of ¬TRUE (something always
FALSE)
The procedure applied by automated theorem
proving to conclude KB |= α is:
1. Put the formulas in KB and ¬α into CNF;
2. If the resulting set of clauses is unsatisfiable,
then KB |= α
In other words, any question about entailment
can be reduced to a question about the
satisfiability of a set of clauses.

Resolution derivations
To discuss reasoning at the symbol level, it is
common to posit what are called rules of
inference: statements of what formulas can be
inferred from other formulas.
Automated theorem proving uses the
resolution rule of inference:
A ∨ B, ¬B ∨ C |= A ∨ C

Expert systems use the modus ponens rule of
inference:
A, A ⊃ B |= B


Resolution derivations
The resolution rule can be expressed for the set
representation of formulas:
Given a clause of the form C1 = c1 ∪ {ρ} containing
some literal ρ, and a clause of the form C2 = c2 ∪
{¬ρ} containing the complement of ρ, infer the
clause c1 ∪ c2 consisting of those literals in the
first clause other than ρ and those in the second
other than ¬ρ.
We say in this case that c1 ∪ c2 is a resolvent of
the two input clauses with respect to ρ (c1 ∪ c2 =
Res(C1, C2))

Resolution derivations

Examples:
[w, p, ¬q], [s, w, ¬p] |= [w, ¬q, s]
[p], [¬p] |= [ ] (this means {[p], [¬p]} is unsatisfiable)


A resolution derivation of a clause c from a set of
clauses S is a sequence of clauses c1, ... , cn,
where the last clause, cn, is c, and where each ci
is either an element of S or a resolvent of two
earlier clauses in the derivation. We write S |- c if
there is a derivation of c from S.
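
A single resolution step on the set representation can be sketched as follows. The encoding is an assumption made for illustration: a clause is a frozenset of strings, a negative literal carries a leading "~", and the helper names are hypothetical.

```python
def complement(lit):
    """Return the complementary literal: p <-> ~p."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    """All clauses obtainable from c1 and c2 by one resolution step."""
    return {(c1 - {lit}) | (c2 - {complement(lit)})
            for lit in c1 if complement(lit) in c2}


# The two examples from the slide above.
first = resolvents(frozenset({"w", "p", "~q"}), frozenset({"s", "w", "~p"}))
second = resolvents(frozenset({"p"}), frozenset({"~p"}))
```

Here `first` holds the single resolvent [w, ¬q, s] and `second` holds the empty clause.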


Resolution derivations
The importance of resolution for reasoning:
If S |- c, then S |= c (the converse does not hold)
Conclusion 1: In general, as a form of reasoning,
Resolution is sound, but not complete.
Conclusion 2: For c = [ ], resolution is both
sound and complete (S |- [ ] if and only if S |= [ ])

A Resolution based entailment procedure


The main steps for a symbol-level procedure
able to determine if KB |= α:
1. Put both KB and {¬α} into CNF
2. Check if S = KB ∪ {¬α} is unsatisfiable by
applying resolution derivation
3. If [ ] is obtained by resolution derivation, then
the answer is positive: KB |= α


A Resolution based entailment procedure


The resolution procedure:
Input: a finite set S of propositional clauses
Output: satisfiable or unsatisfiable
1. check if [ ] ∈ S; if so, return unsatisfiable
2. otherwise, check if there are two clauses in S
such that they resolve to produce another
clause not already in S; if not, return
satisfiable
3. otherwise, add the new resolvent clause to S,
and go back to step 1
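
A minimal Python sketch of this procedure, under stated assumptions: clauses are frozensets of string literals with "~" marking negation, and, unlike step 3 above, all new resolvents found in one pass are added together before going back to step 1. The function names are hypothetical.

```python
from itertools import combinations


def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    return {(c1 - {lit}) | (c2 - {complement(lit)})
            for lit in c1 if complement(lit) in c2}


def resolution(clauses):
    """Return 'unsatisfiable' or 'satisfiable' for a finite set of
    propositional clauses, by saturating the set under resolution."""
    s = set(clauses)
    while True:
        if frozenset() in s:                      # step 1: empty clause found
            return "unsatisfiable"
        new = set()
        for c1, c2 in combinations(s, 2):         # step 2: look for new resolvents
            new |= resolvents(c1, c2) - s
        if not new:
            return "satisfiable"
        s |= new                                  # step 3: add them and repeat


# Clause set of the Toddler / Girl example used in this chapter.
kb = [frozenset(c) for c in (
    {"Toddler"}, {"~Toddler", "Child"}, {"~Child", "~Male", "Boy"},
    {"~Infant", "Child"}, {"~Child", "~Female", "Girl"}, {"Female"})]
with_negated_query = kb + [frozenset({"~Girl"})]
```

The loop terminates in the propositional case because only finitely many clauses can be built from a finite set of literals.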

A Resolution based entailment procedure


Remarks:
The procedure can be made deterministic
quite simply: we need to settle on a strategy
for choosing which pair of clauses to use
when there is more than one pair that would
produce a new resolvent.
One possibility is to use the first pair
encountered; another is to use the pair that
would produce the shortest resolvent. It might
also be a good idea to keep track of which
pairs have already been considered to avoid
redundant checking.


A Resolution based entailment procedure


Remarks:
If we were interested in returning or printing
out a derivation, we would want to store with
each resolvent pointers to its input clauses.
The procedure does not distinguish between
clauses that come from the KB and those that
come from the negation of α, which we will
call the query.
Observe that if we have a number of queries
we want to ask for the same KB, we need only
convert the KB to CNF once and then add
clauses for the negation of each query.

A Resolution based entailment procedure


Remarks:
If we want to add a new fact α to the KB, we
can do so by adding the clauses for α to those
already calculated for KB.
To use this type of entailment procedure, it
makes good sense to keep KB in CNF, adding
and removing clauses as necessary.


A Resolution based entailment procedure


Example
KB:
Toddler; Toddler ⊃ Child; Child ∧ Male ⊃ Boy;
Infant ⊃ Child; Child ∧ Female ⊃ Girl; Female
KB |= Girl ?

A Resolution based entailment procedure


S = {Toddler, ¬Toddler ∨ Child, ¬Child ∨
¬Male ∨ Boy, ¬Infant ∨ Child, ¬Child ∨
¬Female ∨ Girl, Female, ¬Girl}
S = {[Toddler], [¬Toddler, Child], [¬Child,
¬Male, Boy], [¬Infant, Child], [¬Child,
¬Female, Girl], [Female], [¬Girl]}


A Resolution based entailment procedure


Derivation by resolution:
C1 = [Toddler], C2 = [¬Toddler, Child],
C3 = [¬Child, ¬Male, Boy], C4 = [¬Infant, Child],
C5 = [¬Child, ¬Female, Girl], C6 = [Female],
C7 = [¬Girl]
C8 = [Child] - Res(C1, C2)
C9 = [¬Female, Girl] - Res(C5, C8)
C10 = [Girl] - Res(C9, C6)
C11 = [ ] - Res(C10, C7)

Resolution in FOL
For the general case of FOL we have to
handle terms and quantifiers
Again we have to convert the formulas in the
CNF (clausal normal form)
First, we neglect existential quantifiers


Resolution in FOL
Procedure to obtain CNF:
Important remark: a formula in CNF is a sentence
whose variables, if any, are universally quantified
1. Eliminate the logical operations ⊃ and ≡ by
using only ¬, ∧ and ∨
2. Move ¬ inward so that it appears only in front
of an atom, using De Morgan's laws and the
following two:
|= ¬∀x. α ≡ ∃x. ¬α
|= ¬∃x. α ≡ ∀x. ¬α
3. Standardize variables, that is, ensure that
each quantifier is over a distinct variable by
renaming them as necessary. This uses the
following equivalences (provided that x does
not occur free in α):
|= ∀y. α ≡ ∀x. α(y|x)
|= ∃y. α ≡ ∃x. α(y|x)
4. Eliminate all remaining existentials (discussed
later)
5. Move universals outside the scope of ∧ and ∨
using the following equivalences (provided
that x does not occur free in α):
|= (α ∧ ∀x. β) ≡ (∀x. β ∧ α) ≡ ∀x (α ∧ β)
|= (α ∨ ∀x. β) ≡ (∀x. β ∨ α) ≡ ∀x (α ∨ β)
6. Distribute ∨ over ∧, as before
7. Collect terms as before

The result of this procedure is a quantified
version of CNF, a universally quantified
conjunction of disjunctions of literals that is once
again logically equivalent to the original formula
(ignoring existentials).


Resolution in FOL
As for the propositional calculus, the clausal form
allows us to use sets instead of formulas.
In the general case, a literal can be:
P(t1, ..., tn) or its negation ¬P(t1, ..., tn), where P is a
predicate symbol of arity n and each ti is a term
(a variable, or a function symbol applied to terms)

Resolution in FOL
Example:
∀x ∀y [(P(x) ∨ ¬R(a, f(b, x))) ∧ Q(x, y)]
{[P(x), ¬R(a, f(b, x))], [Q(x, y)]}

To apply resolution in FOL we use substitutions:
θ = {x1|t1, ..., xn|tn}; xi: variable; ti: term (xi
does not appear in ti)
ρθ = ρ(x1|t1, ..., xn|tn); ρθ is an instance of ρ


Resolution in FOL
Example:
θ = {x|a, y|g(a, b, z)}
ρ = P(x, c, y)
ρθ = P(a, c, g(a, b, z))

A term, a literal or a formula (clause) is ground
if it contains no variables.
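
Applying a substitution can be sketched in Python. The encoding is an assumption for illustration only: compound terms and literals are tuples whose first element is the function or predicate symbol, constants are plain strings, and variables are strings starting with "?" (the slides do not mark variables this way).

```python
def substitute(term, theta):
    """Apply substitution theta (dict: variable -> term) to a term or literal."""
    if isinstance(term, tuple):          # f(t1, ..., tn) or P(t1, ..., tn)
        return (term[0],) + tuple(substitute(t, theta) for t in term[1:])
    return theta.get(term, term)         # variable (possibly bound) or constant


def is_ground(term):
    """True if the term or literal contains no variables."""
    if isinstance(term, tuple):
        return all(is_ground(t) for t in term[1:])
    return not term.startswith("?")


# The example above: theta = {x|a, y|g(a, b, z)} applied to P(x, c, y).
theta = {"?x": "a", "?y": ("g", "a", "b", "?z")}
rho = ("P", "?x", "c", "?y")
instance = substitute(rho, theta)
```

The instance still contains the variable z, so it is not ground.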

Resolution in FOL
General rule of resolution for FOL
Suppose we are given a clause of the form
c1 ∪ {ρ1} containing some literal ρ1, and a clause
of the form c2 ∪ {¬ρ2} containing the complement
of a literal ρ2. Suppose we rename the variables
in the two clauses so that each clause has
distinct variables, and there is a substitution θ
such that ρ1θ = ρ2θ. Then, we can infer by
resolution the clause (c1 ∪ c2)θ consisting of those
literals in the first clause other than ρ1 and those
in the second other than ¬ρ2, after applying θ.


Resolution in FOL
We say in this case that θ unifies ρ1 and ρ2, and
that θ is a unifier of the two literals.
With this new general rule of Resolution, the
definition of a derivation by resolution stays the
same, and ignoring equality, it is the case that
S |- [ ] if and only if S |= [ ], as for the
propositional calculus.
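
The slides do not give an algorithm for finding a unifier. The following is a sketch of a standard unification procedure with an occurs check, written under assumed conventions: terms and literals are tuples headed by their symbol, constants are plain strings, and variables are strings starting with "?". All function names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, theta):
    """Follow variable bindings in theta until an unbound term is reached."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t


def occurs(v, t, theta):
    """True if variable v appears inside term t (under theta)."""
    t = walk(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t[1:])


def unify(s, t, theta=None):
    """Return a unifier of s and t extending theta, or None if none exists."""
    theta = {} if theta is None else theta
    s, t = walk(s, theta), walk(t, theta)
    if s == t:
        return theta
    if is_var(s):
        return None if occurs(s, t, theta) else {**theta, s: t}
    if is_var(t):
        return unify(t, s, theta)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None


def resolve(t, theta):
    """Apply the bindings of theta to t until no bound variable remains."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, theta) for a in t[1:])
    return t
```

For instance, unifying Student(y) with Student(sue) under this encoding yields the binding y|sue.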

Resolution in FOL
Example 1
KB
∀x (GradStudent(x) ⊃ Student(x))
∀x (Student(x) ⊃ HardWorker(x))
GradStudent(sue)

Question
HardWorker(sue)

Problem
KB |= HardWorker(sue)


Resolution in FOL
Derivation by resolution:
C1 = [¬GradStudent(x), Student(x)],
C2 = [¬Student(y), HardWorker(y)],
C3 = [GradStudent(sue)], C4 = [¬HardWorker(sue)]
C5 = [¬Student(sue)] - Res(C2, C4); θ = {y|sue}
C6 = [¬GradStudent(sue)] - Res(C5, C1); θ = {x|sue}
C7 = [ ] - Res(C3, C6)

Resolution in FOL
Example 2
KB
On(a, b) = C1
On(b, c) = C2
Green(a) = C3
¬Green(c) = C4

Question
Q = ∃x ∃y (On(x, y) ∧ Green(x) ∧ ¬Green(y))


Resolution in FOL
¬Q = ∀x ∀y ¬(On(x, y) ∧ Green(x) ∧ ¬Green(y)) =
∀x ∀y (¬On(x, y) ∨ ¬Green(x) ∨ Green(y)) = C5
C6 = [¬Green(a), Green(b)] - Res(C1, C5); θ = {x|a, y|b}
C7 = [Green(b)] - Res(C6, C3)
C8 = [¬Green(b), Green(c)] - Res(C2, C5); θ = {x|b, y|c}
C9 = [¬Green(b)] - Res(C8, C4)
C10 = [ ] - Res(C7, C9)

Resolution in FOL
Example 3
Using resolution derivation it is possible to get
answers to queries that we might think of as
requiring some computation, too.
To do arithmetic, for example, we can use the
constant zero to stand for 0, and succ to stand
for the successor function.


Resolution in FOL
Every natural number can then be written as a
ground term using these two symbols.
4 = succ(succ(succ(succ(zero))))

The predicate Plus(x, y, z) stands for the
relation: x + y = z
KB
∀x Plus(zero, x, x)
∀x ∀y ∀z (Plus(x, y, z) ⊃ Plus(succ(x), y, succ(z)))

Resolution in FOL
From this KB one can derive that: 2 + 3 = 5
(namely, Plus(2, 3, 5)).
The same KB and the query:
∃u Plus(2, 3, u) = Q

KB
C1 = [Plus(0, s, s)]
C2 = [¬Plus(x, y, z), Plus(succ(x), y, succ(z))]
C3 = ¬Q = [¬Plus(2, 3, u)]


Resolution in FOL
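[¬Plus(2, 3, u)] and [¬Plus(x, y, z), Plus(succ(x), y, succ(z))]
θ = {x|1, y|3, u|succ(v), z|v}
⇒ [¬Plus(1, 3, v)]
[¬Plus(1, 3, v)] and [¬Plus(x, y, z), Plus(succ(x), y, succ(z))]
θ = {x|0, y|3, v|succ(w), z|w}
⇒ [¬Plus(0, 3, w)]
[¬Plus(0, 3, w)] and [Plus(0, s, s)]
θ = {s|3, w|3}
⇒ [ ]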

Resolution in FOL
The answer is obtained by examining the
bindings of variables
u = succ(v)
v = succ(w)
w = 3
hence u = succ(succ(3)) = 5

This form of computation, including locating the
answers in a derivation of an existential, is what
underlies the PROLOG programming language.
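
Reading the answer off the chain of bindings can be sketched in Python. The encoding is an assumption for illustration: numerals are nested tuples ("succ", term) over the constant "zero", the bindings found in the derivation are stored in a dictionary, and the function names are hypothetical.

```python
def numeral(n):
    """Ground term for n: succ applied n times to zero."""
    term = "zero"
    for _ in range(n):
        term = ("succ", term)
    return term


def value(term):
    """Inverse of numeral: count the succ applications."""
    n = 0
    while term != "zero":
        _, term = term
        n += 1
    return n


def resolve_binding(term, bindings):
    """Replace bound variables by their values until the term is ground."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(resolve_binding(t, bindings) for t in term[1:])
    return resolve_binding(bindings[term], bindings) if term in bindings else term


# Bindings collected in the derivation above: u|succ(v), v|succ(w), w|3.
bindings = {"u": ("succ", "v"), "v": ("succ", "w"), "w": numeral(3)}
answer = resolve_binding("u", bindings)
```

Here `value(answer)` gives 5, the value bound to u in the query Plus(2, 3, u).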


Answer extraction
It is not always possible to get answers to
questions by looking at the bindings of variables
in a derivation of an existential.
It can happen that a KB entails some ∃x. P(x)
without entailing P(t) for any specific t. For
example, this happens for the block world
problem with the green and not green blocks.
A general method to deal with answers to
queries: the answer extraction process.

Answer extraction
We replace a query such as ∃x. P(x) (where x is
the variable we are interested in) by
∃x (P(x) ∧ ¬A(x)), where A is a new predicate
symbol occurring nowhere else, called the
answer predicate.
When the answer predicate is added, it will not
be possible to derive the empty clause from the
modified query. Instead, we terminate the
derivation as soon as we produce a clause
containing only the answer predicate.
The answer predicate tells us the result.


Answer extraction
Example 1
KB
Student(john) = C1
Student(jane) = C2
Happy(john) = C3
Q = ∃x (Student(x) ∧ Happy(x) ∧ ¬A(x))
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x)) = C4

Answer extraction
Resolution derivation
C5 = [¬Student(john), A(john)] - Res(C3, C4);
θ = {x|john}
C6 = [A(john)] - Res(C5, C1)
In this example an answer is produced.
There can be many such answers, but each
derivation only deals with one.
An example: add to the previous KB the
hypothesis: Happy(jane)


Answer extraction
The answer extraction process helps especially
in cases involving indefinite answers.
Example
KB
Student(john) = C1
Student(jane) = C2
Happy(john) ∨ Happy(jane) = C3
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x)) = C4

Answer extraction
C5 = [¬Happy(john), A(john)] - Res(C1, C4); θ = {x|john}
C6 = [¬Happy(jane), A(jane)] - Res(C2, C4); θ = {x|jane}
C7 = [A(jane), Happy(john)] - Res(C6, C3)
C8 = [A(jane), A(john)] - Res(C5, C7)

The result is correct: An answer is either
Jane or John.


Answer extraction
It is worth noting that the answer extraction
process can result in clauses containing
variables.
Example
KB
∀x. Student(f(a, x))
∀y ∀z. Happy(f(y, g(z)))
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x))
Result: A(f(a, g(z))). An answer is any instance
of the term f(a, g(z))

Skolemization
We still have to solve the fourth step of the
procedure to obtain CNF (slide 26)
For example, the formula ∃x ∀y ∃z. P(x, y, z) cannot
be transformed into clausal form with the presented
rules
The main idea of Skolemization:
Because some individuals are claimed to exist, we
introduce names for them (called Skolem constants
and Skolem functions) and represent facts using those
names. If we are careful not to use the names
anywhere else, what will be entailed will be precisely
what was entailed by the original existential.


Skolemization
The previous formula becomes: ∀y P(a, y, f(y))
with: a constant; f function symbol of arity 1
(a and f are called Skolem symbols)
Skolemization definition:
Replace each existential variable by a new function
symbol with as many arguments as there are
universal variables dominating the existential.

Skolemization
In other words, if we start with:

∀x1(... ∀x2(... ∀x3(... ∃y[... y ...] ...) ...) ...)

where existentially quantified y appears in the
scope of universally quantified variables x1, x2, x3,
and only these, we end up with:

∀x1(... ∀x2(... ∀x3(... [... f(x1, x2, x3) ...] ...) ...) ...)
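
The choice of Skolem terms can be sketched in Python for the simple case of a formula in prenex form, where the universal variables dominating an existential are exactly those to its left in the quantifier prefix. The prefix encoding, the generated names sk1, sk2, ... and the function name are assumptions made for illustration.

```python
def skolemize_prefix(prefix):
    """prefix: list of ("forall" | "exists", variable), read left to right.

    Returns the remaining universal variables and a dictionary mapping each
    existential variable to its Skolem term: a fresh constant if no universal
    dominates it, otherwise a fresh function applied to those universals."""
    universals, skolem = [], {}
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = f"sk{len(skolem) + 1}"      # hypothetical fresh symbol
            skolem[var] = (name, *universals) if universals else name
    return universals, skolem


# Prefix of the formula with existential x, universal y, existential z.
remaining, terms = skolemize_prefix(
    [("exists", "x"), ("forall", "y"), ("exists", "z")])
```

Here x is replaced by the constant sk1 and z by sk2(y), which matches P(a, y, f(y)) with a and f renamed.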


Skolemization
If α is our original formula and α' is the result of
converting it to CNF including Skolemization, then
it is no longer the case that
|= (α ≡ α')
as it was before.
What can be shown, however, is that α is
satisfiable if and only if α' is satisfiable, and this is
really all we need for Resolution.
Skolemization depends crucially on the universal
variables that dominate the existential.
A formula like ∃x ∀y R(x, y) entails
∀y ∃x R(x, y), but the converse does not hold.


Treatment of equality
So far, we have ignored formulas containing
equality.
If we were to simply treat equality as a normal
predicate, we would miss many unsatisfiable sets
of clauses.
An example: {a = b, b = c, a ≠ c} is unsatisfiable
To handle these, it is necessary to augment the
set of clauses to ensure that all of the special
properties of equality are taken into account.

Treatment of equality
What we require are the clausal versions of the
axioms of equality.
reflexivity: ∀x (x = x)
symmetry: ∀x ∀y (x = y ⊃ y = x)
transitivity: ∀x ∀y ∀z (x = y ∧ y = z ⊃ x = z)
Substitution for functions:
∀x1 ∀y1 ... ∀xn ∀yn (x1 = y1 ∧ ... ∧ xn = yn ⊃
f(x1, ..., xn) = f(y1, ..., yn))


Treatment of equality
Substitution for predicates:
∀x1 ∀y1 ... ∀xn ∀yn (x1 = y1 ∧ ... ∧ xn = yn ⊃
P(x1, ..., xn) ≡ P(y1, ..., yn))
The above substitutions can be expressed differently:
∀x1 ∀y1 ... ∀xn ∀yn (x1 = y1 ∧ ... ∧ xn = yn ∧ P(x1, ..., xn) ⊃
P(y1, ..., yn))

Treatment of equality
It can be shown that with the addition of these
axioms, equality can be handled, and soundness
and completeness of Resolution for the empty
clause will be preserved.
Example
KB
∀x. Married(father(x), mother(x)) (C1)
father(john) = bill (C2)

Query
Married(bill, mother(john))

Treatment of equality
C3 = [¬Married(bill, mother(john))]
C4 = ∀x1 ∀y1 ∀x2 ∀y2 (x1 = y1 ∧ x2 = y2 ∧ Married(x1, x2) ⊃
Married(y1, y2))
C4 = ∀x1 ∀y1 ∀x2 ∀y2 (x1 ≠ y1 ∨ x2 ≠ y2 ∨
¬Married(x1, x2) ∨ Married(y1, y2))
C4 = [x1 ≠ y1, x2 ≠ y2, ¬Married(x1, x2), Married(y1, y2)]
C5 = [x1 ≠ bill, x2 ≠ mother(john), ¬Married(x1, x2)]
- Res(C3, C4); θ = {y1|bill, y2|mother(john)}
C6 = [x2 ≠ mother(john), ¬Married(father(john), x2)]
- Res(C5, C2); θ = {x1|father(john)}
C7 = [mother(john) ≠ mother(john)] - Res(C6, C1);
θ = {x|john, x2|mother(john)}
C8 = [ ] - Res(C7, C0); C0 = [x = x] (reflexivity)


The computational complexity of resolution


Resolution does not provide a general effective
solution to reasoning.
Consistent with the undecidability of the decision
problem for FOL, resolution can be trapped in an infinite loop.
Example
KB
∀x ∀y (LessThan(succ(x), y) ⊃ LessThan(x, y))
Query
LessThan(zero, zero)

The computational complexity of resolution


[¬LessThan(0, 0)] and [LessThan(x, y), ¬LessThan(succ(x), y)]
θ = {x|0, y|0}
⇒ [¬LessThan(1, 0)]
θ = {x|1, y|0}
⇒ [¬LessThan(2, 0)]
θ = {x|2, y|0}
⇒ ...


The computational complexity of resolution


Although we never generate the empty clause,
we might generate an infinite sequence looking for
it.
We cannot simply use a depth-first procedure to
search for the empty clause, because we run the
risk of getting stuck on such an infinite branch.
There is no way to detect when we are on such
a branch.

The computational complexity of resolution


From a knowledge representation point of view,
it means that there can be no procedure that,
given a set of clauses, returns satisfiable when the
clauses are satisfiable and unsatisfiable
otherwise.
Resolution is refutation complete: If the set of
clauses is unsatisfiable, some branch will contain
the empty clause (even if some branches may be
infinite).


The computational complexity of resolution

Positive result: a breadth-first search is
guaranteed to report unsatisfiable when the
clauses are unsatisfiable.
Negative result: when the clauses are
satisfiable, the search may or may not
terminate.

The computational complexity of resolution


Resolution is not the only reasoning
mechanism that can be used.
Some other options:
to give more control over the reasoning process
to the user
to use representation languages less expressive
than full FOL or than full propositional logic, but
more computationally effective
to use certain strategies that can improve the
performance of an automated theorem-proving
system


The computational complexity of resolution


It is worth observing that in some applications
of Resolution it is reasonable to wait for answers,
even for a long time.

An example:
Using Resolution to do mathematical theorem-proving,
in order to determine whether or not
Goldbach's Conjecture or its negation follows
from the axioms of number theory

Chapter 4 Resolution
Homework
Solve by resolution the problems 2 and 4 of
Chapter 2 (the last sentence will be considered
the conclusion to be derived)
Problems 1 and 2, Brachman and Levesque,
pp. 75-76.


Chapter 4 Resolution
Homework
Show by resolution that the following set of clauses is
inconsistent (derive the empty clause from it)*:
[A; B; C]; [A; B; ¬C]; [A; ¬B; C]; [A; ¬B; ¬C];
[¬A; B; C]; [¬A; B; ¬C]; [¬A; ¬B; C]; [¬A; ¬B; ¬C]
*This problem is part of an examination paper at
University of Nottingham, School of Computer Science

Chapter 4 Resolution
Homework
Transform the following formula to clausal
form: xy(P(x, y) zQ(x, y, z))

Prove by resolution that from the sentence
"Students are citizens" it follows that "Students'
votes are citizens' votes"


Chapter 4 Resolution
Homework
Consider the following knowledge base (where run,
nothing, now and bear are constants)*:
∀t ∀x (See(t; x) ∧ Dangerous(x) ⊃ BestAction(t; run))
∀t (¬∃x (See(t; x) ∧ Dangerous(x)) ⊃ BestAction(t; nothing))
Dangerous(bear)
See(now; bear)

Show using resolution that the knowledge base entails
∃x BestAction(now; x) and extract the answer (which
action is it).
* This problem is part of an examination paper at
University of Nottingham, School of Computer Science


Chapter 5 Production Systems


Chapter Overview
Introduction Direction of Reasoning
Basic Operation
Working Memory
Production Rules
Conflict Resolution
Efficiency in Production Systems

Chapter 5 Production Systems


Objectives for Chapter 5
To understand the two types of reasoning
directions for production systems and their
appropriateness
To understand the efficiency problems for
production systems
To be able to design & implement efficient
expert systems


Introduction Direction of reasoning


Production systems provide the theoretical
background for rule-based systems
Rule-based systems constitute one main
approach to expert systems
The concept of an if-then conditional or rule
(if P is true then Q is true) is central to
knowledge representation

Introduction Direction of reasoning


Formula P ⊃ Q represents a rule and it is
equivalent to: ¬P ∨ Q
From a reasoning point of view we can look
at rules in two different ways:
moving from assertions of P to assertions of Q;
moving from goals of Q to goals of P.
(assert P) ⇒ (assert Q)
(goal Q) ⇒ (goal P).


Introduction Direction of reasoning


Although both of these arise from the same
connection between P and Q, they
emphasize the difference between focusing
on asserting facts and seeking the
satisfaction of goals.
We usually call the two types of reasoning
that they suggest
data-directed reasoning - reasoning from P to Q;
goal-directed reasoning - reasoning from Q to P.

Introduction Direction of reasoning


Data-directed reasoning: forward chaining
(forward along ⊃)
Goal-directed reasoning: backward chaining
(move against ⊃)
Data-directed reasoning might be most
appropriate in a database-like setting, when
assertions are made and it is important to
follow the implications of those assertions.
Example: monitoring problem (CLIPS)


Introduction Direction of reasoning


Goal-directed reasoning might be most
appropriate in a problem-solving situation,
where a desired result is clear and the means
to achieve that result (the logical foundations
for a conclusion) are sought.
Example: diagnosis problem (PROLOG)

It is still possible to use a forward chaining
system to do goal-directed reasoning (by an
appropriate interpretation of the formula P ⊃ Q),
and conversely to use backward chaining for
data-directed reasoning

Basic operation
A production system is a forward-chaining
reasoning system that uses rules (also called
production rules or simply, productions) as its
representation of general knowledge.
A production system keeps an ongoing
memory of assertions in what is called its
working memory (WM).


Basic operation
The WM is like a database, but more volatile; it
is constantly changing during the operation of
the system.
The basic operation of a production system is
a cycle of three steps that repeats until no
more rules are applicable to the WM, at which
point the system halts.
The three parts of the cycle are as follows:

Basic operation
Recognize (match): find which rules are
applicable, that is, those rules whose
antecedent conditions are satisfied by the
current working memory.
Resolve conflict: among the rules found in the
first step (called a conflict set - agenda), choose
which of the rules should fire, that is, get a
chance to execute.
Act: change the WM by performing the
consequent actions of all the rules selected in
the second step.
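
The recognize / resolve conflict / act cycle can be sketched in Python. This is an illustrative sketch only, not how CLIPS is implemented: WMEs are assumed to be tuples, each rule is a hypothetical (name, match, act) triple, and conflict resolution simply takes the first applicable instantiation. The rules mirror the brick-sorting example of this chapter.

```python
def run(wm, rules, choose=lambda conflict_set: conflict_set[0]):
    """Repeat recognize / resolve conflict / act until no rule is applicable."""
    wm, trace = set(wm), []
    while True:
        # Recognize: collect every instantiation whose conditions hold in WM.
        conflict_set = [(name, b, act) for name, match, act in rules
                        for b in match(wm)]
        if not conflict_set:
            return wm, trace
        name, b, act = choose(conflict_set)    # Resolve conflict
        wm = act(wm, b)                        # Act
        trace.append(name)


def match_r1(wm):
    """Largest brick on the heap, provided no brick is in the hand."""
    heap = [f for f in wm if f[0] == "brick" and f[3] == "heap"]
    if any(f[0] == "brick" and f[3] == "hand" for f in wm):
        return []
    return [b for b in heap if not any(o[2] > b[2] for o in heap)]


def act_r1(wm, brick):
    return (wm - {brick}) | {("brick", brick[1], brick[2], "hand")}


def match_r2(wm):
    """Brick in the hand together with the current counter."""
    counters = [f for f in wm if f[0] == "counter"]
    return [(b, c) for b in wm if b[0] == "brick" and b[3] == "hand"
            for c in counters]


def act_r2(wm, binding):
    brick, counter = binding
    return (wm - {brick, counter}) | {
        ("brick", brick[1], brick[2], counter[1]),
        ("counter", counter[1] + 1)}


initial = {("counter", 1), ("brick", "A", 10, "heap"),
           ("brick", "B", 30, "heap"), ("brick", "C", 20, "heap")}
final_wm, trace = run(initial, [("R1", match_r1, act_r1),
                                ("R2", match_r2, act_r2)])
```

With this initial working memory the two rules alternate, and the system halts once every brick has been placed.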


Working memory
Working memory is composed of a set of
working memory elements (WMEs).
Each WME is a tuple:
(type|relation attribute1: value1 . . . attributen: valuen)

Type, attributei, and valuei are atoms.

Working memory
Declaratively, we understand each WME as an
existential sentence:
∃x (type(x) ∧ attribute1(x) = value1 ∧ ... ∧
attributen(x) = valuen)
Note that the individual about whom the assertion is
made is not explicitly identified in a WME (problems
regarding duplicate facts can appear).
If we choose to do so, we can identify individuals by
using an attribute that is expected to be unique for
each individual.
In expressing relationships among objects, reification
can be used.


Production rules
The antecedent of a production rule is a set of
conditions.
If there is more than one condition, they are
understood conjunctively.
Each condition can be positive or negative.
The body of each condition is a tuple of the
following form:

Production rules
(type attribute1: specification1 . . . attributek:
specificationk)
Each specification is one of the following:
an atom
a variable
an evaluable expression
a test
a conjunction, disjunction, or negation of specifications


Production rules
Examples:
(person age: [n+4] occupation:x)
(not (person age: {< 23 ∧ > 6}))
A positive condition is satisfied if there is a
matching WME in the WM.

Production rules
A negative condition is satisfied if there is no
matching WME (negation is interpreted as
failure, as in PROLOG-type systems: the
Closed World Assumption).
The consequent sides of production rules have
a strictly procedural interpretation, all of the
actions in the consequent are to be executed
in sequence.


Production rules
Each action is one of the following:
ADD pattern: this means that a new WME specified
by pattern is added directly to the WM.
REMOVE i: i is an integer, and this means to
remove (completely) from WM the WME that
matched the i-th condition in the antecedent of the
rule. This construct is not applicable if that condition
was negative.
MODIFY i (attribute specification): this means to
modify the WME that matched the i-th condition in
the antecedent by replacing its current value for
attribute by specification. MODIFY is also not
applicable to negative conditions.

Production rules
Note that in the actions of rules, any variables
that appear refer to the values obtained when
matching the antecedent of the rule.
Example:
IF (student name: x) THEN ADD (person name: x)
The corresponding formula is:
∀x (Student(x) ⊃ Person(x))


Production rules
Example1:
We have three bricks, each of different size,
sitting in a heap. We have three identifiable
positions in which we want to place the bricks
with a robotic hand; call these positions 1, 2,
and 3. Our goal is to place the bricks in those
positions in order of their size, with the largest in
position 1 and the smallest in position 3.

Production rules
The patterns used for WMEs:
(counter <value>)
(brick <name> <size> <position>)

The initial WM:
(counter 1)
(brick A 10 heap)
(brick B 30 heap)
(brick C 20 heap)


Production rules
We can achieve our goal with two production
rules that work with any number of bricks.
The first one will place the largest currently
available brick in the hand.
The other one will place the brick currently in
the hand into the next position, going through
the positions sequentially.

Production rules
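(defrule R1
   ; a brick on the heap ...
   ?a <- (brick ?name ?s heap)
   ; ... such that no other brick on the heap is larger
   (not (brick ?n&~?name ?size&:(> ?size ?s) heap))
   ; ... and the hand is empty
   (not (brick ? ? hand))
   =>
   (retract ?a)
   (assert (brick ?name ?s hand))
   (printout t "Brick " ?name " is grasped" crlf))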


Production rules
(defrule R2
   ; the brick in the hand and the next free position
   ?a <- (brick ?name ?s hand)
   ?b <- (counter ?i)
   =>
   (retract ?a ?b)
   (assert (brick ?name ?s ?i)
           (counter (+ ?i 1)))
   (printout t "Brick " ?name " is placed in position " ?i crlf))

Production rules
Remarks:
In this example, no conflict resolution is
necessary, because only one rule can fire at a
time (the two rules are disjoint).
The first rule (R1) is recursively activated for
the current, largest size brick.


Production rules
Remarks:
Due to the undecidable character of first order
logic, certain constraints are imposed on
production systems.
For example, in CLIPS no variable can be
used in a WME.

Production rules
Remarks:
It follows that CLIPS uses one of the following
forms of the modus ponens inference rule:

C, C ⊃ D
--------
D

A(a), ∀x (A(x) ⊃ B(x))
----------------------
B(a)

C, D are formulas in propositional calculus;
A, B are formulas in first order logic;
a is a constant; x is a variable.


Production rules
Remarks:
If an inference engine can use any formula in first
order logic, then it is named a first order inference
engine:

∀x A(x), ∀x (A(x) ⊃ B(x))
-------------------------
∀x B(x)

CLIPS uses a 0.5 order inference engine; the
simplest inference engine is a zero order one.

Conflict resolution
Whether we are doing data-directed reasoning
or goal-directed reasoning, it may be the case
that more than one rule is applicable.
There are many conflict resolution strategies
for arriving at the most appropriate rule to fire.
The most used conflict resolution strategies:
specificity and recency.


Conflict resolution
Specificity: select the applicable rule whose
conditions are most specific (generally, the more
complex ones; this is the case in CLIPS).
One set of conditions is said to be more
specific than another if the set of WMs that
satisfy it is a subset of those that satisfy the
other.
Example:
IF (bird) THEN ADD (canFly)
IF (bird weight: {>100}) THEN ADD (cannotFly)

Recency: select an applicable rule based on
how recently it has been used.
There are different versions of this strategy,
ranging from firing the rule that matches on the
most recently created/modified WME (the case
in CLIPS) to firing the rule that has been least
recently used.
Preferring the rules that match the most recent WMEs can be used to make sure a problem solver stays focused on what it was just doing (this is typically related to depth-first search).

Related to conflict resolution is refractoriness: do not select/activate a rule that has just been applied when it is matched by the same WMEs.
This prevents the looping behavior that results
from firing a rule repeatedly because of the
same WMEs.
Nontrivial rule systems often need to use more
than one conflict resolution criterion.

Examples:
The OPS5 production rule system uses the
following four criteria for selecting the rule to fire
among those that are found to be applicable.
1. Discard any rule that has just been used for the
same values of variables (refractoriness);
2. Order the remaining instances in terms of
recency of WME matching the first condition,
and then the second condition, and so on;
3. Order the remaining rules by number of
conditions (complexity);
4. If there is still a conflict, select arbitrarily among
the remaining candidates.
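
The four criteria above can be sketched in Python. This is a minimal illustration, not the OPS5 implementation: the names Instantiation, fact_times, n_conditions and select are hypothetical, and each WME is assumed to carry an integer time tag that grows with recency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instantiation:
    rule: str            # rule name
    fact_times: tuple    # time tag of the WME matching each condition, in order
    n_conditions: int    # number of conditions of the rule (complexity)


def select(conflict_set, already_fired):
    """Pick one instantiation following the four criteria listed above."""
    # 1. Refractoriness: drop instantiations already fired on the same WMEs.
    candidates = [i for i in conflict_set
                  if (i.rule, i.fact_times) not in already_fired]
    if not candidates:
        return None
    # 2. Recency of the WME matching the first condition, then the second, ...
    # 3. Number of conditions.
    # Tuples compare lexicographically, so max() applies the criteria in order.
    # 4. Remaining ties are broken arbitrarily (max() keeps the first maximum).
    return max(candidates, key=lambda i: (i.fact_times, i.n_conditions))


conflict_set = [
    Instantiation("A", (3, 1), 2),
    Instantiation("B", (5,), 1),
    Instantiation("C", (5, 2), 2),
]
print(select(conflict_set, set()).rule)
print(select(conflict_set, {("C", (5, 2))}).rule)
```

In the second call the instantiation of rule C is excluded by refractoriness, so the choice falls on the next candidate in recency order.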

One interesting approach to conflict resolution
is provided by the SOAR system.
This system is a general problem solver that
attempts to find a path from a start state to a
goal state by applying productions.
It treats selecting which rule to fire as deciding
what the system should do next.

Thus, if unable to decide on which rule to fire
at some point, SOAR sets up a new metagoal
to solve, namely, the goal of selecting which
rule to use, and the process iterates.
When this metagoal is solved (which could in
principle involve metametagoals, etc.), the
system has made a decision about which base
goal to pursue, and therefore the conflict is
resolved.


Efficiency in Production Systems


Of the three steps of the production system working cycle, matching is the most important one for efficiency.
As much as 90% of the running time is spent on matching, even when it is implemented using sophisticated techniques (e.g., indexing, hashing).
It is therefore important to know how to design the rules/facts so as to reduce the time/memory consumed by matching.



A great number of rule based systems use the
same matching mechanism: the RETE
algorithm.
The word 'Rete' is Latin for 'net': the RETE algorithm makes use of networks.
Matching has to find all the rules whose conditions are satisfied by the WMEs.
In principle, two distinct approaches can be used.



The first possibility is to start from the rules' conditions (LHS) and check them against the WMEs.
This approach means the matching process is
guided by rules.
This method has certain disadvantages.

Fig. 1 (diagram; blocks: WM (Facts), Rules, Agenda)


The matching process has to be repeated after each rule execution, since this can modify the WM.
The WM possesses a property called temporal redundancy: it changes slowly over time.
This means that many tests are redundant, because the state they examine has not changed since the previous cycle.

Fig. 2 (diagram; blocks: WM (Facts), Rules, Facts modified by the last rule execution, Agenda)


Two basic ideas of the RETE algorithm:
Because the WM changes slowly and the rules are static, it is more efficient to guide the matching from facts towards rules.
It is important to keep the state of the matching process from one cycle to the next and to compute only the changes, as determined by the modifications of the WM.
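
The gain from keeping the matching state between cycles can be sketched in Python. This is a deliberately simplified illustration, assuming rules with a single condition each; the names naive_cycle, incremental_cycle and the sample rules are hypothetical, and a real RETE implementation also stores partial matches across several patterns.

```python
def naive_cycle(rules, wm):
    """Rule-driven matching: re-test every condition against every fact."""
    tests = 0
    agenda = set()
    for name, cond in rules.items():
        for fact in wm:
            tests += 1
            if cond(fact):
                agenda.add((name, fact))
    return agenda, tests


def incremental_cycle(rules, agenda, added, removed):
    """Fact-driven matching: update the saved agenda using only the changes."""
    tests = 0
    # Activations that depend on a removed fact are dropped without any test.
    agenda = {(name, fact) for (name, fact) in agenda if fact not in removed}
    # Only the newly added facts are tested against the rules.
    for fact in added:
        for name, cond in rules.items():
            tests += 1
            if cond(fact):
                agenda.add((name, fact))
    return agenda, tests


rules = {
    "big-brick": lambda f: f[0] == "brick" and f[2] > 5,
    "any-brick": lambda f: f[0] == "brick",
}
wm0 = {("brick", "A", 3), ("brick", "B", 7), ("counter", 1)}
agenda0, tests0 = naive_cycle(rules, wm0)

added, removed = {("brick", "C", 9)}, {("brick", "A", 3)}
wm1 = (wm0 - removed) | added
agenda1, tests1 = incremental_cycle(rules, agenda0, added, removed)
print(tests0, tests1)
```

Both procedures end with the same agenda for the modified WM, but the incremental one tests only the added fact.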

Fig. 3 (diagram; blocks: WM (Facts), Rules, Agenda)


The matching process in the RETE algorithm is solved in two stages, by means of two networks.
The first network is the pattern network.
It contains information on the conditional elements (patterns) that are matched by the facts of the WM.
The pattern network does not take into account the constraints determined by the repeated use of a variable in several patterns.



When several rules have the same pattern, the information on this pattern is stored once.
For efficiency, the RETE algorithm takes advantage of structural similarity in the rules.
Only when a rule does not use the same variable in several patterns is the matching process entirely solved by the pattern network.



The pattern network is organized as a tree.
The root node has an initializing role.
The nodes on the first level represent the
matching for the first field of the patterns; the
nodes on the second level represent the matching
for the second field of patterns, and so on.
Each node contains the condition for the
corresponding field in the pattern.
The value of a variable is checked only in the
case when that variable is repeatedly used in the
same pattern.



Example
(defrule R
(data 2 ~optim ?x)
(data ?x ?y ?x)
=>. . .)

Fig. 4 Pattern network for rule R
Root Node
   Field_1 = data
      Field_2 = 2 -> Field_3 ≠ optim -> Field_4 = no_constraint   (first pattern)
      Field_2 = no_constraint -> Field_3 = no_constraint -> Field_4 = Field_2   (second pattern)


Remarks for efficiency
When a fact is added to or removed from the WM, the pattern network is checked from the root node towards the terminal nodes.
When the condition of a node is not satisfied, the checking process is stopped.
Conclusion: the order of the fields in a rule pattern is important.
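
The root-to-leaf check with early stopping can be sketched in Python for the two patterns of rule R. This is only an illustration under stated assumptions: the names BRANCHES and check are hypothetical, facts are modelled as tuples, and each branch is stored separately, whereas the actual pattern network shares the common node Field_1 = data.

```python
# Each branch is the chain of field tests for one pattern of rule R.
# None stands for a node with no constraint on that field.
BRANCHES = {
    "pattern 1": [            # (data 2 ~optim ?x)
        lambda f: f[0] == "data",
        lambda f: f[1] == 2,
        lambda f: f[2] != "optim",
        None,
    ],
    "pattern 2": [            # (data ?x ?y ?x)
        lambda f: f[0] == "data",
        None,
        None,
        lambda f: f[3] == f[1],   # same variable used twice in the pattern
    ],
}


def check(fact, tests):
    """Walk one branch from the root; stop at the first failed test.

    Returns (matched, number_of_tests_performed).
    """
    if len(fact) != len(tests):
        return False, 0
    performed = 0
    for test in tests:
        if test is None:
            continue
        performed += 1
        if not test(fact):
            return False, performed
    return True, performed


print(check(("data", 2, 3, 5), BRANCHES["pattern 1"]))
print(check(("data", 7, 3, 5), BRANCHES["pattern 1"]))
print(check(("data", 2, 3, 2), BRANCHES["pattern 2"]))
```

A fact rejected by an early field costs fewer tests than one rejected by a late field, which is why the field order matters.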



The use of multifield variables (e.g., $?x in CLIPS) makes the pattern network much more complicated (the direct correspondence between the fields in a pattern and in a fact is altered).
Conclusion: when possible, avoid the use of multifield variables.



The second stage of the RETE algorithm uses a second network: the join network (partial matches network).
This network checks the use of the same variable in several patterns.
For a rule with several patterns, the first two patterns are initially checked for the variables they have in common; if the matching constraints determined by the common variables are satisfied, then the first two patterns form a partial match.



Then the third pattern is checked together with
the first two, and if the constraints determined by
the common variables are satisfied then a new
partial match is obtained
This process continues until the last pattern is
also checked together with the previous ones,
obtaining the last partial match, which is also the
global match, the one that determines the
activation of the rule



The terminal nodes in the pattern network act as inputs for the join network.
The join network is a hierarchical one, too; the
first node contains the conditions on the
common variables determined by the first two
patterns, the second node takes into account the
additional constraints determined by the
variables in the third pattern, and so on.



Example:
(defrule R1
(data 2 ~optim ?x)
(data ?x ?y ?x)
(conditie ?x ~?y)
=>. . .)

Fig. 5 Pattern network and join network for rule R1
Pattern Network
Root Node
   Field_1 = data
      Field_2 = 2 -> Field_3 ≠ optim -> Field_4 = no_constraint   (pattern_1)
      Field_2 = no_constraint -> Field_3 = no_constraint -> Field_4 = Field_2   (pattern_2)
   Field_1 = conditie
      Field_2 = no_constraint -> Field_3 = no_constraint   (pattern_3)
Join Network
   Field_4 of pattern_1 = Field_2 of pattern_2
   Field_2 of pattern_2 = Field_2 of pattern_3 AND Field_3 of pattern_2 ≠ Field_3 of pattern_3
   -> Activated Rule


The diagram of Fig. 5 is a simplified one
The join network for a rule with n patterns does
not have n-1 nodes, but n nodes
There is an additional first node in the join
network that does not make any check, but
receives data from the last node of the first
pattern in the pattern network
Thus, each node in the join network is
connected with a single node in the pattern
network



For efficiency, it is important to remark that
within the join network the checks are done from
the first pattern towards the last one
To reduce the memory consumption, the join
network takes into account the structural
similarity in the rules
Information on the matching process is displayed in accordance with the results provided by the two networks of the RETE algorithm.



Example WM
f-0 (initial-fact)
f-1 (data 2 3 5)
f-2 (data 2 3 2)
f-3 (conditie 2 3)
f-4 (conditie 2 2)
f-5 (conditie 5 2)



(matches R1)
Matches for Pattern 1
f-1
f-2
Matches for Pattern 2
f-2
Matches for Pattern 3
f-3
f-4
f-5
Partial matches for CEs 1 - 2
f-2,f-2
Partial matches for CEs 1 - 3
f-2,f-2,f-4
Activations
f-2,f-2,f-4



Example:
(defrule R2
(data 2 ~optim ?x)
(data ?x ?y ?x)
(conditie ?z ~?y)
=>. . .)


(matches R2)

Matches for Pattern 1


f-1
f-2
Matches for Pattern 2
f-2
Matches for Pattern 3
f-3
f-4
f-5
Partial matches for CEs 1 - 2
f-2,f-2
Partial matches for CEs 1 - 3
f-2,f-2,f-5
f-2,f-2,f-4
Activations
f-2,f-2,f-5
f-2,f-2,f-4
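
The partial matches reported for R1 and R2 can be reproduced with a small Python sketch of the two joins. It is an illustration only: the names WM, p1, p2 and partial_matches are hypothetical, facts are modelled as tuples keyed by their fact index, and the third pattern of both rules is assumed to use the relation name conditie, as the facts of the example WM do.

```python
WM = {
    1: ("data", 2, 3, 5),
    2: ("data", 2, 3, 2),
    3: ("conditie", 2, 3),
    4: ("conditie", 2, 2),
    5: ("conditie", 5, 2),
}


def p1(f):
    """Pattern (data 2 ~optim ?x): returns the binding of x, or None."""
    if len(f) == 4 and f[0] == "data" and f[1] == 2 and f[2] != "optim":
        return {"x": f[3]}
    return None


def p2(f):
    """Pattern (data ?x ?y ?x): returns the bindings of x and y, or None."""
    if len(f) == 4 and f[0] == "data" and f[1] == f[3]:
        return {"x": f[1], "y": f[2]}
    return None


def partial_matches(share_x):
    """Join the patterns left to right.

    share_x=True models R1 (third pattern uses ?x),
    share_x=False models R2 (third pattern uses a new variable ?z).
    """
    ces_1_2 = []
    for i, f in WM.items():
        b1 = p1(f)
        if b1 is None:
            continue
        for j, g in WM.items():
            b2 = p2(g)
            # first join: ?x must have the same value in both patterns
            if b2 is not None and b2["x"] == b1["x"]:
                ces_1_2.append((i, j, b2))
    ces_1_3 = []
    for i, j, b in ces_1_2:
        for k, h in WM.items():
            # second join: third field must differ from ?y
            if len(h) == 3 and h[0] == "conditie" and h[2] != b["y"]:
                if not share_x or h[1] == b["x"]:
                    ces_1_3.append((i, j, k))
    return [(i, j) for i, j, _ in ces_1_2], ces_1_3


print(partial_matches(True))    # rule R1
print(partial_matches(False))   # rule R2
```

The only difference between the two rules is the extra test on ?x in the second join, which is what removes f-5 from the matches of R1.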



Conclusion on the RETE algorithm
The RETE algorithm solves the matching step by means of two networks.
The RETE algorithm achieves fast matching at the cost of memory consumption.
Knowing the details of the RETE algorithm, one can design/implement a rule-based program for efficiency.



1. The importance of fields order
Example: two patterns
(data <point> <pressure> <temperature>)   (1)
(data <point> <temperature> <pressure>)   (2)
Problem: determine all the points with a pressure between 100 and 150



(defrule R-1
   ; pattern (1): the pressure is the third field
   (data ?point
         ?x&:(and (numberp ?x) (> ?x 100) (< ?x 150))
         ?y&:(numberp ?y))
   =>
   (printout t "Point: " ?point " pressure: " ?x " temperature: " ?y crlf))
(defrule R-2
   ; pattern (2): the pressure is the fourth field
   (data ?point
         ?x&:(numberp ?x)
         ?y&:(and (numberp ?y) (> ?y 100) (< ?y 150)))
   =>
   (printout t "Point: " ?point " pressure: " ?y " temperature: " ?x crlf))



Rule R-1 is more efficient than R-2, since it makes the complex test on the pressure value earlier.
Rule R-2 wastes time/memory in the matching process on facts that will not pass the final test on the pressure value.
When the WM contains a great number of facts, the difference in efficiency between the two previous rules can be significant.



Conclusion for efficiency
The most powerful tests, those that filter the most (they are satisfied least frequently), must be placed as close as possible to the beginning of the pattern.
An efficient order of fields in the pattern of the previous example:
(data <pressure> <temperature> <point>)



2. The importance of patterns order
Example: two rules with different orders of patterns
(defrule R-I
   (sequence ?x ?y ?z ?w)
   (element ?x)
   (element ?y)
   (element ?z)
   (element ?w)
   =>
   (printout t "Sequence: " ?x " " ?y " " ?z " " ?w crlf))



(defrule R-II
   (element ?x)
   (element ?y)
   (element ?z)
   (element ?w)
   (sequence ?x ?y ?z ?w)
   =>
   (printout t "Sequence: " ?x " " ?y " " ?z " " ?w crlf))
The two rules are equivalent with respect to their activation, but they differ from the efficiency point of view.



WM
f0 (initial-fact)
f1 (sequence a b c d)
f2 (element a)
f3 (element b)
f4 (element c)
f5 (element d)
f6 (element e)
f7 (element f)
f8 (element g)



Matches for patterns of rule R-I
Pattern 1: f1
Pattern 2: f2, f3, f4, f5, f6, f7, f8
Pattern 3: f2, f3, f4, f5, f6, f7, f8
Pattern 4: f2, f3, f4, f5, f6, f7, f8
Pattern 5: f2, f3, f4, f5, f6, f7, f8
Result: 29 pattern matches



Partial matches for rule R-I
Pattern 1: [f1]
Patterns 1 and 2: [f1, f2]
Patterns 1, 2 and 3: [f1, f2, f3]
Patterns 1, 2, 3 and 4: [f1, f2, f3, f4]
Patterns 1, 2, 3, 4 and 5: [f1, f2, f3, f4, f5]
Result: 5 partial matches



Matches for patterns of rule R-II
Pattern 1: f2, f3, f4, f5, f6, f7, f8
Pattern 2: f2, f3, f4, f5, f6, f7, f8
Pattern 3: f2, f3, f4, f5, f6, f7, f8
Pattern 4: f2, f3, f4, f5, f6, f7, f8
Pattern 5: f1
Result: 29 pattern matches



Partial matches for rule R-II
Pattern 1: [f2]; [f3]; [f4]; [f5]; [f6]; [f7]; [f8]
Patterns 1 and 2:
[f2, f2]; [f2, f3]; [f2, f4]; [f2, f5]; [f2, f6]; [f2, f7]; [f2, f8]
[f3, f2]; [f3, f3]; [f3, f4]; [f3, f5]; [f3, f6]; [f3, f7]; [f3, f8]
....
[f8, f2]; [f8, f3]; [f8, f4]; [f8, f5]; [f8, f6]; [f8, f7]; [f8, f8]
Result: 56 partial matches (7 + 49) for pattern 1 and for patterns 1 and 2 alone



Counting the remaining partial matches gives:
343 partial matches for patterns 1, 2 and 3
2401 partial matches for patterns 1, 2, 3 and 4
a single global match (the rule activation)



Though the two rules are equivalent, rule R-II consumes considerably more memory and time than rule R-I.
The previous example highlights the importance of the order of patterns in rules; this order has to be established so as to minimize the number of calculations of the RETE algorithm.
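
The partial match counts for the two pattern orders can be checked with a short Python sketch. It is a counting illustration, not a RETE implementation; the names SEQUENCE, ELEMENTS, counts_rule_I and counts_rule_II are hypothetical, and the fourth element pattern of each rule is assumed to bind ?w.

```python
from itertools import product

SEQUENCE = ("a", "b", "c", "d")                    # fact f1
ELEMENTS = ["a", "b", "c", "d", "e", "f", "g"]     # facts f2 .. f8


def counts_rule_I():
    """Sequence pattern first: later patterns join on already bound values."""
    counts = [1]   # pattern 1 matches only f1 and binds ?x ?y ?z ?w
    for bound_value in SEQUENCE:
        # only the element facts equal to the bound value extend a partial match
        joining = sum(1 for e in ELEMENTS if e == bound_value)
        counts.append(counts[-1] * joining)
    return counts


def counts_rule_II():
    """Element patterns first: no shared variable constrains the joins."""
    counts = [sum(1 for _ in product(ELEMENTS, repeat=n)) for n in range(1, 5)]
    # the sequence pattern finally keeps only the combination equal to f1
    counts.append(sum(1 for t in product(ELEMENTS, repeat=4) if t == SEQUENCE))
    return counts


print(counts_rule_I(), sum(counts_rule_I()))
print(counts_rule_II(), sum(counts_rule_II()))
```

With the most specific pattern first, every join level holds one partial match; with it last, the join levels grow as powers of the number of element facts before collapsing to a single activation.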



Rule design points:
The most specific patterns must be placed at the beginning of a rule.
The patterns known to be rarely satisfied by the facts in the WM should be placed at the beginning of a rule, too.
The patterns that may be matched by facts frequently added to and removed from the WM must be placed at the end of a rule.



The above points may contradict each other, so the optimum design of rules/facts has to be adjusted to the specific problem at hand.
General efficiency criterion: place the tests as close as possible to the beginning of a rule: in each pattern, as already mentioned, and in the first pattern/patterns (see the following examples; the second solutions are more efficient).



(defrule R1
   ?a1 <- (student ?n1 ? 1601 ?)
   ?a2 <- (student ?n2 ? 1601 ?)
   ?a3 <- (student ?n3 ? 1601 ?)
   ; all the tests are made only after the three patterns have been joined
   (test (and (neq ?a1 ?a2)
              (neq ?a2 ?a3)
              (neq ?a1 ?a3)))
   =>
   (printout t "Triplet found: " ?n1 " " ?n2 " " ?n3 crlf))


(defrule R2
   ?a1 <- (student ?n1 ? 1601 ?)
   ?a2 <- (student ?n2 ? 1601 ?)
   ; the first test is made as soon as its two variables are bound
   (test (neq ?a1 ?a2))
   ?a3 <- (student ?n3 ? 1601 ?)
   (test (and (neq ?a2 ?a3)
              (neq ?a1 ?a3)))
   =>
   (printout t "Triplet found: " ?n1 " " ?n2 " " ?n3 crlf))



(defrule Reg1
   (student ?n ?y ?g ?)
   (test (and (numberp ?y) (numberp ?g) (<> ?y ?g)))
   =>
   (printout t "Student " ?n " year of study " ?y " group " ?g crlf
               "the number of year is different from the number of group" crlf))



(defrule Reg2
   (student ?n ?y ?g&:(and (numberp ?y) (numberp ?g) (<> ?y ?g)) ?)
   =>
   (printout t "Student " ?n " year of study " ?y " group " ?g crlf
               "the number of year is different from the number of group" crlf))



3. Generally, a great number of salience levels should be avoided for efficiency.
The main criterion used for rule selection should be matching.
The use of salience as a means of rule selection costs memory/time (see the following example).



Problem: to plan the movement for a mobile
robot, according to the following guiding points
(these must be applied according to their
order):
If the area in front of robot is free, then the
robot should move forward
If an area on the robot sides is free, then the
robot should move sideways
If the area behind the robot is free, then the
robot should move backward.



Solution 1 (inefficient)
(defrule Robot-1
   (declare (salience 20))
   ?a <- (phase movement-choice)
   (area front free)
   =>
   (retract ?a)
   (assert (planned-movement forward)))



(defrule Robot-2
   (declare (salience 10))
   ?a <- (phase movement-choice)
   (area sideways free)
   =>
   (retract ?a)
   (assert (planned-movement sideways)))



(defrule Robot-3
   ?a <- (phase movement-choice)
   (area back free)
   =>
   (retract ?a)
   (assert (planned-movement backward)))



The three rules solve the problem
This solution is inefficient, which is clear when
all the areas around the robot are free
In this case all the rules are activated (memory
and time consumption), though only the first
rule will be executed



Solution 2 (efficient)
(defrule Robot-1
   ?a <- (phase movement-choice)
   (area front free)
   =>
   (retract ?a)
   (assert (planned-movement forward)))



(defrule Robot-2
   ?a <- (phase movement-choice)
   (not (area front free))
   (area sideways free)
   =>
   (retract ?a)
   (assert (planned-movement sideways)))



(defrule Robot-3
   ?a <- (phase movement-choice)
   (not (area front free))
   (not (area sideways free))
   (area back free)
   =>
   (retract ?a)
   (assert (planned-movement backward)))



The second solution makes explicit the
conditions for each rule activation, eliminating
the interdependence between rules
An inefficient solution can result from unclear heuristics or from a logically incorrect conversion of the solving mechanism into rules.
For the previous example, a clear and logically
correct solution would be expressed by the
following guiding points



If the area in front of robot is free, then the robot
should move forward
If an area on the robot sides is free, and the
area in front of robot is not free, then the robot
should move sideways
If the area behind the robot is free, and the
area in front of robot is not free, and the area on
the robot sides is not free, then the robot should
move backward



4. Efficiency can be also discussed with respect to
the dilemma: general versus specific rules
There is not a definite answer about which
solution is the best:
A greater number of specific rules
A smaller number of general rules

Specific rules tend to enlarge the pattern network while reducing the partial matches network.



General rules often have the advantage of a better sharing of the nodes in the two networks among several rules.
A few general rules are easier to update and modify than a large number of specific rules.
In principle, a general rule makes the tests and actions of several specific rules; that is why an efficient design of a general rule is more difficult and more error-prone.



Problem
The position of a mobile robot has to be updated according to the movement that was carried out; the robot moves in the plane, one unit at a time, along the two axes.
Two solutions:
1. Four specific rules
2. A general rule and some control facts



First solution
(defrule north-movement
   ?a <- (movement north)
   ?b <- (position ?x ?y)
   =>
   (retract ?a ?b)
   (assert (position ?x (+ ?y 1))))
(defrule south-movement
   ?a <- (movement south)
   ?b <- (position ?x ?y)
   =>
   (retract ?a ?b)
   (assert (position ?x (- ?y 1))))



(defrule east-movement
   ?a <- (movement east)
   ?b <- (position ?x ?y)
   =>
   (retract ?a ?b)
   (assert (position (+ ?x 1) ?y)))
(defrule west-movement
   ?a <- (movement west)
   ?b <- (position ?x ?y)
   =>
   (retract ?a ?b)
   (assert (position (- ?x 1) ?y)))



Second solution
(deffacts control-facts
   (direction north 0 1)
   (direction south 0 -1)
   (direction east 1 0)
   (direction west -1 0))
(defrule movement
   ?a <- (movement ?dir)
   (direction ?dir ?dx ?dy)
   ?b <- (position ?x ?y)
   =>
   (retract ?a ?b)
   (assert (position (+ ?x ?dx) (+ ?y ?dy))))



The general rule consumes more time in the
matching process (more variables are to be
bound).
The higher abstraction of the general rule
facilitates the system development.
For example, some new movement directions
(e.g. north-east, north-west, etc.) are easily added,
by the means of some new control facts. In the
specific approach, such development implies new
rules.
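
The idea behind the general rule, one update procedure driven by a table of control facts, can be sketched in Python. This is only an analogy to the CLIPS solution; the names DIRECTIONS, move and EXTENDED are hypothetical, and the offsets chosen for the diagonal directions are an assumption.

```python
# Control facts: direction name -> (dx, dy), as in the deffacts construct.
DIRECTIONS = {
    "north": (0, 1),
    "south": (0, -1),
    "east": (1, 0),
    "west": (-1, 0),
}


def move(position, direction, directions=DIRECTIONS):
    """General rule: look up the offsets and update the position."""
    dx, dy = directions[direction]
    x, y = position
    return (x + dx, y + dy)


# New directions need only new table entries; move() itself is unchanged.
EXTENDED = {**DIRECTIONS, "north-east": (1, 1), "north-west": (-1, 1)}

print(move((0, 0), "north"))
print(move((2, 3), "north-east", EXTENDED))
```

The extra lookup is the counterpart of the additional pattern the general rule has to match, while the extension by data alone is the counterpart of adding control facts instead of rules.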

Chapter 5 Production Systems


Homework
Problems 1 and 2, Brachman and Levesque, p. 133.

Consider a manufacturing process involving certain commands from some operators; transpose the operators' activity into a rule-based program.

Develop a rule-based program to solve operations on
sets (union, intersection).
Develop the rule based program to monitor the
scheme of the following diagram. The devices (Di)
will be switched on and off depending on the
thresholds of the related sensors (Si). Choose the
appropriate conflict resolution strategy for the
developed program and make the needed tuning to
obtain an efficient inference engine operation.

Diagram with sensors S1, S2, S3, S4 and devices D1, D2, D3.
