Knowledge Representation
and Reasoning
Course Lectures
EDITURA
CONSPRESS
2013
CONSPRESS
B-dul Lacul Tei nr. 124, sector 2
cod 020396, București
Tel: (021) 242 2719 / 300; Fax: (021) 242 0781
Contents
Bibliography
Course overview
Chapter I Introduction to Knowledge Representation and Reasoning (KRR)
Basic notions in KRR
KRR: Representation and Reasoning
Knowledge Based Systems (KBSs)
KBSs: Importance of reasoning
KBSs: The role of Logic
Homework
Chapter II Important aspects on the Language of First-Order Logic (FOL)
Introduction
Language of First-Order Logic: Syntax
Language of First-Order Logic: Semantics
Language of First-Order Logic: Pragmatics
Homework
Chapter III Knowledge Engineering
Knowledge engineering and knowledge management
Ontology
Vocabulary
Production rules
Conflict resolution
Bibliography
These course notes follow to a large extent the following
book:
Ron Brachman, Hector Levesque, Knowledge
representation and reasoning, Elsevier, Amsterdam,
2004
Other sources that were used are:
Frank van Harmelen, Vladimir Lifschitz, Bruce Porter, Handbook
of Knowledge Representation, Elsevier, Amsterdam, 2008
Natasha Noy, Ontology Development 101, Stanford University,
2008
Irma Becerra-Fernandez, Avelino Gonzalez, Rajiv Sabherwal,
Knowledge Management, Prentice Hall, 2004
Amrit Tiwana, The Knowledge Management Toolkit, Prentice
Hall, 2000
Joseph Giarratano, Gary Riley, Expert Systems: Principles and
Programming, PWS-KENT, Boston, 1989
Course overview
Knowledge Representation (KR)
Logic based KR
Reasoning
Expert Systems (ES)
[Figure: the World supplies Sensory information to a Reasoner + Knowledge, which produces Actions/Decisions]
Semantics - Interpretations
Meanings are typically captured by specific
interpretations
An interpretation ℑ in FOL is a pair ⟨D, I⟩, where D
is any nonempty set of objects, called the domain of
the interpretation (the universe of discourse - UoD),
and I is a mapping, called the interpretation mapping,
from the nonlogical symbols to functions and relations
over D.
UoD - the set of entities over which the variables may
range (UoD is part of the world considered by the
current problem)
Semantics - Interpretations
It is important to stress that an interpretation
need not involve only mathematical objects
D can be any set, including people, garages,
numbers, sentences, fairness, unicorns, chunks
of peanut butter, situations, and the universe,
among other things.
The interpretation mapping I is devoted to
assigning the meaning of the predicate and
function symbols
Semantics - Interpretations
To every predicate symbol P of arity n, I[P] is
an n-ary relation over D:
I[P] ⊆ D × ... × D
(D is taken n times)
Examples:
Robot - a unary predicate symbol
I[Robot] would be some subset of D (presumably
the set of robots in that interpretation)
Semantics - Interpretations
Examples:
SmarterThan - a binary predicate symbol
I[SmarterThan] would be some subset of D × D
(presumably the set of pairs of objects in D where the
first element of the pair is smarter than the second)
Semantics - Interpretations
Examples:
bestServant - a unary function symbol
I[bestServant] would be a function D → D
(presumably the function that maps a person to
his/her best servant)
johnSmith - a 0-ary function symbol (a constant)
I[johnSmith] would be an element of D (presumably
somebody called John Smith)
Semantics - Interpretations
It is useful to think of the interpretation of
predicates in terms of their characteristic
functions.
The interpretation of an n-ary predicate
becomes an n-ary function to {0, 1}:
I[P]: D × ... × D → {0, 1}
Semantics - Interpretations
Two distinct interpretations of a predicate:
as a relation
as a function to {0, 1}
The relationship between the two specifications
is that a tuple of objects is considered to be in
the relation over D if and only if the
characteristic function over those objects has
value 1.
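The correspondence between the two readings of a predicate can be sketched in Python. This is only an illustration: the three-element domain and the contents of the relation are invented for the example, and the helper names `characteristic` and `as_relation` are hypothetical.

```python
from itertools import product

# Hypothetical domain, chosen only for illustration.
D = {"robbie", "hal", "john"}

# A binary predicate interpreted as a relation: a subset of D x D.
smarter_than = {("hal", "robbie"), ("hal", "john")}

def characteristic(relation):
    """Return the function to {0, 1} that corresponds to a relation."""
    return lambda *objs: 1 if tuple(objs) in relation else 0

def as_relation(char_fn, domain, arity):
    """Recover the relation from a characteristic function over the domain."""
    return {t for t in product(domain, repeat=arity) if char_fn(*t) == 1}

chi = characteristic(smarter_than)

# A predicate of arity 0: its relation is either {()} or the empty set,
# so its characteristic function simply yields 1 or 0.
raining = characteristic({()})
```

Converting back with `as_relation(chi, D, 2)` yields the original set of pairs, which is the "if and only if" stated above.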
Semantics - Interpretations
There is an advantage of using the second
type of interpretation.
It allows us to see how predicates of arity 0
(i.e., the propositional symbols) are handled:
If P is a predicate of arity 0, then I[P] is either 0 or 1.
Semantics - Interpretations
It is natural to think of 0 as representing the truth
value False, while 1 stands for True.
Thus, for the propositional subset of FOL, one
can ignore D completely.
For propositional calculus, an interpretation is
simply a mapping, I, from the propositional
symbols to either 0 or 1.
Semantics - Denotation
Denotation clarifies the interpretation of terms
that contain variables.
Given an interpretation ℑ = ⟨D, I⟩ we can
specify which elements of D are denoted by any
variable-free term of FOL.
An example:
the term bestServant(johnSmith) denotes I[bestServant](I[johnSmith]) ∈ D
Semantics - Denotation
To deal with terms including variables, one
needs a variable assignment over D, that is, a
mapping from the variables of FOL to the
elements of D.
μ - variable assignment:
μ: V → D, μ[x] ∈ D
By using both an interpretation ℑ and a variable
assignment μ, the denotation of every term t can
be calculated.
Semantics - Denotation
If x is a variable, then ||x||ℑ,μ = μ[x]
If t1, ... , tn are terms, and f is a function symbol
of arity n, then
||f(t1, ..., tn)||ℑ,μ = F(d1, ..., dn)
where F = I[f] and di = ||ti||ℑ,μ
The above rules are recursive
According to these recursive rules, ||ti||ℑ,μ is
always an element of D
ℑ, μ |= α (the formula α is satisfied in ℑ under the variable assignment μ),
according to the following rules:
Assume that t1, ... , tn are terms, P is a predicate of
arity n, α and β are formulas, and x is a variable.
Pragmatics
2. The KBS starts with a collection of sentences as
given premises (KB)
3. The KB includes not only facts about particulars of
the intended application, but also those expressing
connections among the nonlogical symbols involved
4. Calculating the entailments thus becomes a core
part of a KBS, as it is like the form of reasoning we
would expect of someone who understood the
meaning of the terms involved (KB becomes a richer
set)
Pragmatics
IMPORTANT CONCLUSION
This is all there is to knowledge representation
and reasoning; the rest is just details.
[Figure: Knowledge Management and its impact on People, Processes, Products, and Performance; for People: Employee Learning, Employee Adaptability, Employee Job Satisfaction]
Ontology Development
Defining terms in the domain and relations
among them
Defining concepts in the domain (classes)
Arranging the concepts in a hierarchy (subclass-superclass hierarchy)
Defining which attributes/properties (slots) classes
can have and constraints on their values
Defining individuals and filling in slot values
(instances)
Ontology Development
An example of ontology
Ontology Development
Ontology development versus Object-oriented modeling
An OO Structure
An ontology
- It
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: determine scope)
Ontology Development
Example: questions for the wine ontology
Which wine characteristics should I consider when
choosing a wine?
Is Bordeaux a red or white wine?
Does Cabernet Sauvignon go well with seafood?
What is the best choice of wine for grilled meat?
Which characteristics of a wine affect its
appropriateness for a dish?
Does a bouquet or body of a specific wine change
with vintage year?
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: consider reuse)
Ontology Development
Examples of existing ontologies
Ontology libraries
Protégé ontology library (protege.stanford.edu)
Ontolingua ontology library
(www.ksl.stanford.edu/software/ontolingua/)
Upper ontologies
IEEE Standard Upper Ontology (suo.ieee.org)
Cyc (www.cyc.com)
Domain-specific ontologies
UMLS Semantic Net
GO (Gene Ontology) (www.geneontology.org)
OBO (Open Biological Ontologies) (obo.sourceforge.net)
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: enumerate terms)
Ontology Development
Examples of terms for wine ontology
wine, grape, winery, location
wine color, wine body, wine flavor, sugar
content
white wine, red wine, Bordeaux wine
food, seafood, fish, meat, vegetables, cheese
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: define classes)
Ontology Development
Inheritance is a common mechanism in a
class hierarchy
Classes usually constitute a taxonomic
hierarchy (a subclass-superclass hierarchy)
A class hierarchy is usually an IS-A
hierarchy:
an instance of a subclass is an instance of
a superclass
If you think of a class as a set of elements, a
subclass is a subset
Ontology Development
Multiple inheritance can create problems
A class can have more than
one superclass
The subclass inherits properties
and restrictions from all the
parents
Different systems resolve
conflicts differently
[Figure: Robot as a subclass of both Industrial Equipment and Programmable System]
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: define properties)
Ontology Development
A taxonomy of properties
Types of properties
intrinsic properties: flavor and color of wine
extrinsic properties: name and price of wine
parts: ingredients in a dish
relations to other objects: producer of wine (winery)
Ontology Development
Properties
A subclass inherits all the properties from the
superclass
If a wine has a name and flavor, a red wine also
has a name and flavor
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: define constraints)
Ontology Development
Property constraints
Global property constraints apply to the
property throughout the ontology
range for hasColor is always an instance of
WineColor
Ontology Development
Property constraints
Cardinality: the number of values a property can or must
have
Minimum cardinality
Minimum cardinality 1 means that the property must have a
value (required)
Minimum cardinality 0 means that the property value is
optional
Maximum cardinality
Maximum cardinality 1 means that the property can have at
most one value (functional property)
Maximum cardinality greater than 1 means that the property
can have more than one value (multiple-valued property)
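A cardinality constraint can be checked mechanically. The following is a minimal sketch, assuming that slot values are kept in a Python list; the function name `satisfies_cardinality` and the sample slot values are hypothetical and not tied to any ontology tool.

```python
def satisfies_cardinality(values, min_card=0, max_card=None):
    """Check a list of slot values against minimum and maximum cardinality.

    max_card=None is used here to mean "no upper bound".
    """
    n = len(values)
    if n < min_card:
        return False
    if max_card is not None and n > max_card:
        return False
    return True

# Hypothetical slot values for one wine instance.
has_color = ["red"]                      # required and functional: min 1, max 1
has_grape = ["cabernet", "merlot"]       # multiple-valued: min 1, no maximum
has_award = []                           # optional: min 0

color_ok = satisfies_cardinality(has_color, min_card=1, max_card=1)
grape_ok = satisfies_cardinality(has_grape, min_card=1)
award_ok = satisfies_cardinality(has_award, min_card=0)
```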
Ontology Development
Steps: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances (current step: create instances)
Ontology Development
All the siblings in the class hierarchy must
be at the same level of generality
If a class has only one child, there may be
a modeling problem
If a class has more than a dozen children,
additional subcategories may be
necessary
Ontology Development
Danger of multiple
inheritance: cycles in
the class hierarchy
[Figure: classes A, B, and C linked in a cycle by subclass-of relations]
Classes A, B, and C
have equivalent sets of
instances
Ontology Development
Different modes of the development
top-down - define the most general concepts
first and then specialize them
bottom-up - define the most specific concepts
and then organize them in more general
classes
combination
Ontology Development
Classes represent concepts in the domain,
not their names
The class name can change, but it will still
refer to the same concept
Synonym names for the same concept are
not different classes
Many systems allow listing synonyms as part of
the class definition
Ontology Development
A wine is not a kind-of wines
A wine is an instance of the class Wines
Class names should be either
all singular
all plural
Ontology Development
Subclasses of a class usually have
Additional/new properties
Additional/new restrictions
Participate in different relationships
Ontology Development
One has to decide between a new class and a
property value
Option 1: class Wine with subclasses White wine, Rosé wine, Red wine
OR
Option 2: class Wine with property Colour: white, rosé, red
Ontology Development
How important is the distinction for the
domain?
A class of an instance should not change
often
Individual instances are the most specific
objects in an ontology
If concepts form a natural hierarchy,
represent them as classes
Ontology Development
Domain of a property: the class (or classes)
that have the property
More precisely: class (or classes) of
instances which can have the property
Range of a property: the class (or classes)
to which property values belong
Ontology Development
When defining a domain or range for a slot,
find the most general class or classes
Consider the produces slot for a Winery:
Range: Red wine, White wine, Rosé wine
Range: Wine
Consider the flavor slot
Domain: Red wine, White wine, Rosé wine
Domain: Wine
Ontology Development
Inverse properties (e.g., maker and produces)
contain redundant information, but
Allow acquisition of the information in either direction
Enable additional verification
Allow presentation of information in both directions
Ontology Development
Default value: a value the slot gets when
an instance is created and a specific value
is not provided
A default value can be changed
The default value is a common value for
the slot, but is not a required value
Ontology Development
Naming convention
Define a naming convention for classes and
properties and adhere to it
Features of an ontology tool to be
considered:
Can classes and properties have the same
names?
Is the system case-sensitive?
What delimiters are allowed?
Vocabulary
A main point for an ontology is to agree on the
use of a certain vocabulary
The nonlogical symbols are those that have
an application-dependent meaning or use
Establishing the vocabulary means setting up
the nonlogical symbols
Considered example: a soap opera world
Vocabulary
Constants:
Named individuals: maryJones, johnSmith,
tomJones
One needs to allow multiple identifiers that could
be used to refer to the same individual (e.g. tom,
johnSmith)
Other entities: corporations, restaurants, places,
objects (e.g. astraInsuranceCompany,
iasiTownCouncil, tomsHouse, studentgroup1)
Vocabulary
After capturing the set of individuals that will
be central to the agent's world, it is next
essential to circumscribe the basic types of
entities that those individuals are.
Unary predicates should be used for this purpose, e.g.
Person(x)
Other Examples:
Man, Woman, Place, Company, Jewelry, Pen
Restaurant, Bar, House, University
Vocabulary
The unary predicates can also describe the
set of attributes that the entities can have
A vocabulary of properties that can hold for
individuals is to be established
Examples:
Rich, Beautiful, Unscrupulous, Bankrupt,
ClosedForRepairs
Vocabulary
The n-ary predicates can express
relationships
Examples of predicates of arity 2:
MarriedTo, DaughterOf, LivesAt
Vocabulary
The n-ary functions are also part of the
vocabulary
Both unary and higher arity functions can be
included
Examples of functions
fatherOf, bestFriendOf (unary functions)
childOf, gradeOf (binary functions; e.g.
gradeOf(John, KRR) = 10)
Vocabulary
Important remark: there may be cases when
either a function or a predicate could be used
Some criteria to decide:
Functions are taken to be total in FOL; if
exceptions are possible, then use predicates
(e.g. bestFriendOf versus BestFriendOf)
Functions are needed in order to stay within
FOL (e.g. Happy(bestFriendOf(x)) is an FOL
formula, while Happy(BestFriendOf(x)) is beyond
FOL)
Reification
The FOL language gives us the basic tools for
representing facts with a great deal of flexibility
There is also considerable flexibility in what we
consider to be the individuals in the domain
Sometimes it is useful to introduce new abstract
individuals that might not have been considered
in a first analysis
Reification: the idea of making up new
individuals as needed for the most appropriate
detail level
Reification
Example 1: the event of John's buying a bike
Purchases(john, bike)
Purchases(john, bike, feb15)
Purchases(john, bike, feb15, $200)
Reification
We may not be able to predict in advance the
level of detail; we need another approach, more
flexible.
The approach: introduce an abstract individual, together
with a unary predicate and some functions
p23 - a constant representing the event of
purchasing
Purchase(p23) ∧ agent(p23) = john ∧ object(p23)
= bike ∧ time(p23) = feb15 ∧ cost(p23) = $200
Reification
We gain flexibility: more or fewer details can be
given by adjusting the formula
Advantage - the arity of the predicate and
function symbols involved can be determined in
advance
Example2: a set of predicates to represent the
marriage relationships: MarriedTo(x, y);
PreviouslyMarriedTo(x, y); ReMarriedTo(x, y)
Reification
We can reify the marriage events as abstract
individuals (entities)
We can determine anyone's current marital
status and complete marital history directly from
these individuals
Marriage(m20) ∧ husband(m20) = john ∧
wife(m20) = ann ∧ date(m20) = ...
Reification
In representing commonsense information we
need individuals for numbers, dates, times,
addresses, and so on.
Basically, any object about which we can ask
a wh-question should have an individual
standing for it in the KB so it can be returned as
the result of a query.
Reification
We gain flexibility in representing quantitative
knowledge, by using proper functions
Example:
ageInYears(suzzy) = 12
ageInMonths(suzzy) = 144
years(age(suzzy)) = 12
months(x) = 12 × years(x)
Entailment
Entailment is part of the reasoning process
Entailment allows deriving of implicit
conclusions from statements explicitly
represented in the KB
KB: a collection of simple and complex facts
(considered example, soap opera world)
If the only tasks are to answer simple
questions (e.g. Is John a rich man?), then
entailments will not be needed
Entailment
Entailments are needed when complex
statements are analyzed
Example1: Is there a company whose CEO
(chief executive officer) loves Jane?
α: ∃x [Company(x) ∧ Loves(ceoOf(x), jane)]?
KB |= α (∀ℑ, if ℑ |= KB (*), then ℑ |= α)
Entailment
ℑ |= {Rich(john), Man(john)} - from (*)   (1)
ℑ |= {john = ceoOf(fic),
Company(faultyInsuranceCompany),
fic = faultyInsuranceCompany}   (3)
Entailment
(3), (4) ⇒ ℑ |= Company(fic) ∧ Loves(ceoOf(fic), jane)
KB |= α
Entailment
Remarks:
The provided solution determined not only that
there is a company whose CEO loves Jane, but
also what that company is.
We can be interested in finding out not only
whether something is true or not, but also which
individuals satisfy a property of interest.
A KBS has to handle not only yes-no questions,
but wh-questions as well (who? what? where?
when? how? why?)
Entailment
Example2: If no man is blackmailing John, then
is he being blackmailed by someone he loves?
α: ∀x [Man(x) ⊃ ¬Blackmails(x, john)] ⊃
∃y [Loves(john, y) ∧ Blackmails(y, john)] ?
KB |= α
It is easier to demonstrate the requested
entailment by using:
KB |= α1 ⊃ β1 iff KB ∪ {α1} |= β1
Entailment
ℑ |= KB (*) and
ℑ |= ∀x [Man(x) ⊃ ¬Blackmails(x, john)]
(**)
Entailment
ℑ |= ∃x [Woman(x) ∧ Blackmails(x, john)]
from (**), (1), (2)
(3)
(5)
from (*)
(6)
Entailment
ℑ |= ∀x [Woman(x) ∧ ¬(x = jane) ⊃ ¬Blackmails(x, john)]
from (5, 6)
(7)
(9)
Entailment
We have illustrated, in intuitive form, how a
proof can be thought of as a sequence of FOL
sentences, starting with those known to be true
in the KB or surmised as part of the
assumptions dictated by the query.
The proof proceeds logically using the facts in
the KB and the rules of logic.
Entailment
In some cases the answer to a question may
be a negative one.
Entailment can be involved in such a case, in
a more complicated way.
We have to prove:
KB |≠ α (KB does not entail α)
We must produce a specific interpretation and
argue that it satisfies every sentence in the KB
as well as the negation of α
Chapter 4 Resolution
Chapter Overview
Introduction
Resolution in Propositional Calculus
Resolution Derivations
A Resolution based Entailment Procedure
Resolution in FOL
Answer Extraction
Skolemization
Equality
Dealing with the computational complexity of
resolution
Chapter 4 Resolution
Objectives for Chapter 4
Understanding the resolution based reasoning
mechanism
A procedure to apply the resolution rule for all
the cases of FOL formulas
A procedure to extract an answer from a
knowledge base
Understanding the resolution limitations
Resolution - Introduction
Resolution is the main method applied by
automated theorem proving
It details how to automate a deductive
reasoning procedure
Problems to solve:
Given a knowledge base KB and a sentence α,
we would like a procedure that can determine
whether or not KB |= α
If β[x1, ... , xn] is a formula with free variables
among the xi, we want a procedure that can find
terms ti, if they exist, such that KB |= β[t1, ... , tn]
Resolution - Introduction
There is no procedure that can fully satisfy
this specification
The possible result: a procedure that does
deductive reasoning in as sound and
complete a manner as possible, and in a
language as close as possible to that of full
FOL
Resolution - Introduction
If we take the KB to be a finite set of
sentences {α1, ... , αn}, there are several
equivalent ways of formulating the deductive
reasoning task:
KB |= α
|= [(α1 ∧ ... ∧ αn) ⊃ α]
KB ∪ {¬α} is not satisfiable
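For the propositional case, the equivalence of the three formulations can be checked by brute force over all interpretations. The sketch below is restricted to propositional symbols and represents each sentence as a Python function from an interpretation (a dict) to a truth value; this encoding and the small KB {p, p ⊃ q} are assumptions made for the example.

```python
from itertools import product

def interpretations(symbols):
    """Enumerate every mapping from the propositional symbols to truth values."""
    for values in product([False, True], repeat=len(symbols)):
        yield dict(zip(symbols, values))

def entails(kb, alpha, symbols):
    """KB |= alpha: alpha holds in every interpretation that satisfies all of KB."""
    return all(alpha(i) for i in interpretations(symbols) if all(s(i) for s in kb))

def valid(formula, symbols):
    """|= formula: the formula holds in every interpretation."""
    return all(formula(i) for i in interpretations(symbols))

def satisfiable(formulas, symbols):
    """Some interpretation satisfies all the formulas."""
    return any(all(f(i) for f in formulas) for i in interpretations(symbols))

symbols = ["p", "q"]
kb = [lambda i: i["p"], lambda i: (not i["p"]) or i["q"]]   # p, and p implies q
alpha = lambda i: i["q"]

form1 = entails(kb, alpha, symbols)
form2 = valid(lambda i: (not all(s(i) for s in kb)) or alpha(i), symbols)
form3 = not satisfiable(kb + [lambda i: not alpha(i)], symbols)
```

All three variables come out equal for any choice of KB and α; enumeration of this kind is not available for full FOL, where the domain may be infinite.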
An example: the clausal form of
((p ∨ ¬q ∨ r) ∧ q)
is:
{[p, ¬q, r], [q]}
Resolution derivations
To discuss reasoning at the symbol level, it is
common to posit what are called rules of
inference: statements of what formulas can be
inferred from other formulas.
Automated theorem proving uses the
resolution rule of inference:
A ∨ B, ¬B ∨ C |= A ∨ C
Resolution derivations
Resolution rule can be expressed for set
representation of formulas:
Given a clause of the form C1 = c1 ∪ {ρ} containing
some literal ρ, and a clause of the form C2 = c2 ∪
{¬ρ} containing the complement of ρ, infer the
clause c1 ∪ c2 consisting of those literals in the
first clause other than ρ and those in the second
other than ¬ρ.
We say in this case that c1 ∪ c2 is a resolvent of
the two input clauses with respect to ρ (c1 ∪ c2 =
Res(C1, C2))
Resolution derivations
Examples:
[w, p, ¬q], [s, w, ¬p] |= [w, ¬q, s]
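The propositional resolution rule on clauses-as-sets can be sketched in a few lines of Python. The encoding is an assumption made for the example: a clause is a frozenset of strings, and a leading "~" marks a negated literal.

```python
def complement(lit):
    """Return the complementary literal ("p" <-> "~p")."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """Return every resolvent of the two clauses, one per complementary pair."""
    out = set()
    for lit in c1:
        if complement(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {complement(lit)})))
    return out

# The example clauses from the slide.
c1 = frozenset({"w", "p", "~q"})
c2 = frozenset({"s", "w", "~p"})
result = resolvents(c1, c2)
```

Using sets means the duplicated literal w appears only once in the resolvent, and resolving [p] with [¬p] yields the empty clause, represented here by the empty frozenset.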
Resolution derivations
The importance of resolution for reasoning:
If S |- c, then S |= c (the converse does not hold)
Conclusion 1: In general, as a form of reasoning,
Resolution is sound, but not complete.
Conclusion 2: For c = [ ], resolution is both
sound and complete (S |- [ ] if and only if S |= [ ])
Resolution in FOL
For the general case of FOL we have to
handle terms and quantifiers
Again we have to convert the formulas in the
CNF (clausal normal form)
First, we neglect existential quantifiers
Resolution in FOL
Procedure to obtain CNF:
Important remark: a formula in CNF is a sentence
whose variables, if any, are
universally quantified
1. Eliminate the logical operations ⊃ and ≡ by
using only ¬, ∧ and ∨
2. Move ¬ inward so that it appears only in front
of an atom, using De Morgan's laws and the
following two:
|= ¬∀x. α ≡ ∃x. ¬α
|= ¬∃x. α ≡ ∀x. ¬α
Resolution in FOL
As for the propositional calculus, the clausal form
allows us to use sets instead of formulas.
In the general case, a literal can be:
P(t1, ..., tn) or ¬P(t1, ..., tn), where P is a predicate symbol of arity n
and each ti is a term (a variable or a function symbol applied to terms)
Resolution in FOL
Example:
∀x ∀y [(P(x) ∨ ¬R(a, f(b, x))) ∧ Q(x, y)]
{[P(x), ¬R(a, f(b, x))], [Q(x, y)]}
Resolution in FOL
Example:
θ = {x|a, y|g(a, b, z)}
ρ = P(x, c, y)
ρθ = P(a, c, g(a, b, z))
Resolution in FOL
General rule of resolution for FOL
Suppose we are given a clause of the form
c1 ∪ {ρ1} containing some literal ρ1, and a clause
of the form c2 ∪ {¬ρ2} containing the complement
of a literal ρ2. Suppose we rename the variables
in the two clauses so that each clause has
distinct variables, and there is a substitution θ
such that ρ1θ = ρ2θ. Then, we can infer by
resolution the clause (c1 ∪ c2)θ consisting of those
literals in the first clause other than ρ1 and those
in the second other than ¬ρ2, after applying θ.
Resolution in FOL
We say in this case that θ unifies ρ1 and ρ2, and
that θ is a unifier of the two literals.
With this new general rule of Resolution, the
definition of a derivation by resolution stays the
same, and ignoring equality, it is the case that
S |- [ ] if and only if S |= [ ], as for the
propositional calculus.
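Finding a unifier of two literals can be sketched as follows. This is one common textbook-style unification routine with an occurs check, not necessarily the procedure intended in the course. The encoding is an assumption: variables are strings starting with "?", constants are other strings, and literals or compound terms are tuples whose first element is the symbol.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, theta):
    """Follow variable bindings in theta until an unbound term is reached."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """True if variable v occurs inside term t (under theta)."""
    t = walk(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t[1:])

def unify(s, t, theta=None):
    """Return a substitution (dict) unifying s and t, or None if there is none."""
    theta = {} if theta is None else theta
    s, t = walk(s, theta), walk(t, theta)
    if s == t:
        return theta
    if is_var(s):
        return None if occurs(s, t, theta) else {**theta, s: t}
    if is_var(t):
        return unify(t, s, theta)
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

def substitute(t, theta):
    """Apply the substitution theta to the term t."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return t

rho1 = ("P", "?x", "c", "?y")
rho2 = ("P", "a", "c", ("g", "a", "b", "?z"))
theta = unify(rho1, rho2)
```

For these two literals the computed substitution binds ?x to a and ?y to g(a, b, ?z), so applying it to either literal gives the same result.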
Resolution in FOL
Example 1
KB
∀x (GradStudent(x) ⊃ Student(x))
∀x (Student(x) ⊃ HardWorker(x))
GradStudent(sue)
Question
HardWorker(sue)
Problem
KB |= HardWorker(sue)
Resolution in FOL
Derivation by resolution:
C1 = [¬GradStudent(x), Student(x)],
C2 = [¬Student(y), HardWorker(y)],
C3 = [GradStudent(sue)], C4 = [¬HardWorker(sue)],
C5 = [¬Student(sue)] - Res(C2, C4); θ = {y|sue}
C6 = [¬GradStudent(sue)] - Res(C5, C1); θ = {x|sue}
C7 = [ ] - Res(C3, C6)
Resolution in FOL
Example 2
KB
On(a, b) = C1
On(b, c) = C2
Green(a) = C3
¬Green(c) = C4
Question
Q = ∃x ∃y (On(x, y) ∧ Green(x) ∧ ¬Green(y))
Resolution in FOL
¬Q = ∀x ∀y ¬(On(x, y) ∧ Green(x) ∧ ¬Green(y)) =
∀x ∀y (¬On(x, y) ∨ ¬Green(x) ∨ Green(y)) = C5
C6 = [¬Green(a), Green(b)] - Res(C1, C5); θ = {x|a, y|b}
C7 = [Green(b)] - Res(C6, C3)
C8 = [¬Green(b), Green(c)] - Res(C2, C5); θ = {x|b, y|c}
C9 = [¬Green(b)] - Res(C8, C4)
C10 = [ ] - Res(C7, C9)
Resolution in FOL
Example 3
Using resolution derivation it is possible to get
answers to queries that we might think of as
requiring some computation, too.
To do arithmetic, for example, we can use the
constant zero to stand for 0, and succ to stand
for the successor function.
Resolution in FOL
Every natural number can then be written as a
ground term using these two symbols.
4 = succ(succ(succ(succ(zero))))
Resolution in FOL
From this KB one can derive that: 2 + 3 = 5
(namely, Plus(2, 3, 5)).
The same KB and the query:
Q = ∃u Plus(2, 3, u)
KB
C1 = [Plus(0, s, s)]
C2 = [¬Plus(x, y, z), Plus(succ(x), y, succ(z))]
C3 = ¬Q = [¬Plus(2, 3, u)]
Resolution in FOL
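[¬Plus(2, 3, u)]
Res with C2; θ = {x|1, y|3, u|succ(v), z|v}
[¬Plus(1, 3, v)]
Res with C2; θ = {x|0, y|3, v|succ(w), z|w}
[¬Plus(0, 3, w)]
Res with C1; θ = {s|3, w|3}
[ ]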
Resolution in FOL
The answer is obtained by examining the
bindings of variables
u | succ(v)
v | succ(w)
w | 3
Hence u = succ(succ(3)) = 5
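The way the bindings build up the answer can be mirrored by a small recursive function. This is a functional reading of the two Plus clauses, not a resolution prover; the tuple encoding of succ terms and the helper names `num`, `plus` and `value` are assumptions made for the illustration.

```python
def num(n):
    """Ground term for the natural number n, e.g. num(2) is succ(succ(zero))."""
    term = "zero"
    for _ in range(n):
        term = ("succ", term)
    return term

def plus(x, y):
    """Find the term u such that Plus(x, y, u) follows from the two clauses."""
    if x == "zero":
        # Clause C1: Plus(zero, s, s)
        return y
    # Clause C2: Plus(x', y, z) implies Plus(succ(x'), y, succ(z))
    return ("succ", plus(x[1], y))

def value(term):
    """Count the succ symbols in a ground numeral."""
    n = 0
    while term != "zero":
        term = term[1]
        n += 1
    return n

u = plus(num(2), num(3))
```

Each recursive call corresponds to one resolution step with C2 (wrapping the result in one more succ), and the base case corresponds to the final step with C1.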
Answer extraction
It is not always possible to get answers to
questions by looking at the bindings of variables
in a derivation of an existential.
It can happen that a KB entails some ∃x. P(x)
without entailing P(t) for any specific t. For
example, this happens for the block world
problem with the green and not green blocks.
A general method to deal with answers to
queries is the answer extraction process.
Answer extraction
We replace a query such as ∃x. P(x) (where x is
the variable we are interested in) by
∃x (P(x) ∧ ¬A(x)), where A is a new predicate
symbol occurring nowhere else, called the
answer predicate.
When the answer predicate is added, it will not
be possible to derive the empty clause from the
modified query. Instead, we terminate the
derivation as soon as we produce a clause
containing only the answer predicate.
The answer predicate tells us the result.
Answer extraction
Example 1
KB
Student(john) = C1
Student(jane) = C2
Happy(john) = C3
Q = ∃x (Student(x) ∧ Happy(x) ∧ ¬A(x))
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x)) = C4
Answer extraction
Resolution derivation
C5 = [¬Student(john), A(john)]; Res(C3, C4);
θ = {x|john}
C6 = [A(john)]; Res(C5, C1)
In this example an answer is produced.
There can be many such answers, but each
derivation only deals with one.
An example: add to the previous KB the
hypothesis: Happy(jane)
Answer extraction
The answer extraction process helps especially
in cases involving indefinite answers.
Example
KB
Student(john) = C1
Student(jane) = C2
Happy(john) ∨ Happy(jane) = C3
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x)) = C4
Answer extraction
C5 = [¬Happy(john), A(john)] Res(C1, C4); θ = {x|john}
C6 = [¬Happy(jane), A(jane)] Res(C2, C4); θ = {x|jane}
C7 = [A(jane), Happy(john)] Res(C6, C3)
C8 = [A(jane), A(john)] Res(C5, C7)
Answer extraction
It is worth noting that the answer extraction
process can result in clauses containing
variables.
Example
KB
∀x. Student(f(a, x))
∀y ∀z. Happy(f(y, g(z)))
¬Q = ∀x (¬Student(x) ∨ ¬Happy(x) ∨ A(x))
Result: A(f(a, g(z))) An answer is any instance
of the term f(a, g(z))
Skolemization
We still have to solve the fourth step of the
procedure to obtain CNF (slide 26)
For example, the formula ∃x∀y∃z. P(x, y, z) cannot
be transformed into clausal form with the presented
rules
The main idea of Skolemization:
Because some individuals are claimed to exist, we
introduce names for them (called Skolem constants
and Skolem functions) and represent facts using those
names. If we are careful not to use the names
anywhere else, what will be entailed will be precisely
what was entailed by the original existential.
Skolemization
The previous formula becomes: ∀y P(a, y, f(y))
with: a - constant; f - function symbol of arity 1
(a and f are called Skolem symbols)
Skolemization definition:
Replace each existential variable by a new function
symbol with as many arguments as there are
universal variables dominating the existential.
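The definition can be sketched for the simple case of a formula already in prenex form, where the universal variables dominating an existential are exactly those to its left in the quantifier prefix. The encoding (a list of quantifier/variable pairs plus a matrix given as a tuple) and the generated names sk0, sk1, ... are assumptions made for this illustration; a real implementation must also guarantee that the new names occur nowhere else.

```python
from itertools import count

def substitute(term, theta):
    """Replace variables in term according to theta (no chained lookups needed here)."""
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(substitute(a, theta) for a in term[1:])

def skolemize(prefix, matrix):
    """Remove existential quantifiers from a prenex formula.

    prefix: list of ("forall" | "exists", variable) pairs, outermost first.
    Returns the remaining universal variables and the rewritten matrix.
    """
    fresh = count()
    universals, theta = [], {}
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = f"sk{next(fresh)}"   # hypothetical fresh Skolem symbol
            # No dominating universals: Skolem constant; otherwise Skolem function.
            theta[var] = (name, *universals) if universals else name
    return universals, substitute(matrix, theta)

prefix = [("exists", "x"), ("forall", "y"), ("exists", "z")]
matrix = ("P", "x", "y", "z")
universals, skolemized = skolemize(prefix, matrix)
```

Here sk0 plays the role of the Skolem constant a and sk1 the role of the Skolem function f, so the result reads as P(sk0, y, sk1(y)) with y universally quantified.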
Skolemization
In other words, if we start with:
∀x1(... ∀x2(... ∀x3(... ∃y[... y ...] ...) ...) ...)
we end up with:
∀x1(... ∀x2(... ∀x3(... [... f(x1, x2, x3) ...] ...) ...) ...)
Skolemization
If α is our original formula and α′ is the result of
converting it to CNF including Skolemization, then
it is no longer the case that
|= (α ≡ α′)
as it was before.
Skolemization
What can be shown, however, is that α is
satisfiable if and only if α′ is satisfiable, and this is
really all we need for Resolution.
Skolemization depends crucially on the universal
variables that dominate the existential.
A formula like ∃x∀y R(x, y) entails
Treatment of equality
So far, we have ignored formulas containing
equality.
If we were to simply treat equality as a normal
predicate, we would miss many unsatisfiable sets
of clauses.
An example: {a = b, b = c, a ≠ c} is unsatisfiable
To handle these, it is necessary to augment the
set of clauses to ensure that all of the special
properties of equality are taken into account.
Treatment of equality
What we require are the clausal versions of the
axioms of equality.
reflexivity: ∀x (x = x)
symmetry: ∀x ∀y (x = y ⊃ y = x)
transitivity: ∀x ∀y ∀z (x = y ∧ y = z ⊃ x = z)
Substitution for functions:
∀x1∀y1 ... ∀xn∀yn (x1 = y1 ∧ ... ∧ xn = yn ⊃
f(x1, ..., xn) = f(y1, ..., yn))
Treatment of equality
Substitution for predicates:
∀x1∀y1 ... ∀xn∀yn (x1 = y1 ∧ ... ∧ xn = yn ∧
P(x1, ..., xn) ⊃ P(y1, ..., yn))
Treatment of equality
It can be shown that with the addition of these
axioms, equality can be handled, and soundness
and completeness of Resolution for the empty
clause will be preserved.
Example
KB
∀x. Married(father(x), mother(x))   (C1)
father(john) = bill   (C2)
Query
Married(bill, mother(john))
Treatment of equality
C3 = [¬Married(bill, mother(john))]
C4 = ∀x1∀y1∀x2∀y2 (x1 = y1 ∧ x2 = y2 ∧ Married(x1, x2) ⊃
Married(y1, y2))
C4 = ∀x1∀y1∀x2∀y2 (x1 ≠ y1 ∨ x2 ≠ y2 ∨
¬Married(x1, x2) ∨ Married(y1, y2))
C4 = [x1 ≠ y1, x2 ≠ y2, ¬Married(x1, x2), Married(y1, y2)]
Treatment of equality
C5 = [x1 ≠ bill, x2 ≠ mother(john), ¬Married(x1, x2)]
Res(C3, C4); θ = {y1|bill, y2|mother(john)}
C6 = [x2 ≠ mother(john), ¬Married(father(john), x2)]
Res(C5, C2); θ = {x1|father(john)}
C7 = [mother(john) ≠ mother(john)] Res(C6, C1);
θ = {x|john, x2|mother(john)}
C8 = [ ] Res(C7, C0), where C0 = reflexivity
[¬LessThan(0, 0)]
θ = {x|0, y|0}
[¬LessThan(1, 0)]
θ = {x|1, y|0}
[¬LessThan(2, 0)]
θ = {x|2, y|0}
...
An example:
Using Resolution to do mathematical theorem proving, in order to determine whether or not
Goldbach's Conjecture or its negation follows
from the axioms of number theory
Chapter 4 Resolution
Homework
Solve by resolution the problems 2 and 4 of
Chapter 2 (the last sentence will be considered
the conclusion to be derived)
Problems 1 and 2, Brachman and Levesque,
pp. 75-76.
Chapter 4 Resolution
Homework
Show by resolution that the following set of clauses is
inconsistent (derive empty clause from it)*:
[A; B; C]; [A; B; ¬C]; [A; ¬B; C]; [A; ¬B; ¬C];
[¬A; B; C]; [¬A; B; ¬C]; [¬A; ¬B; C]; [¬A; ¬B; ¬C]
*This problem is part of an examination paper at
University of Nottingham, School of Computer Science
Chapter 4 Resolution
Homework
Transform the following formula to clausal
form: ∀x∀y (P(x, y) ⊃ ∃z Q(x, y, z))
Consider the following knowledge base (where run,
nothing, now and bear are constants)*:
∀t∀x (See(t; x) ∧ Dangerous(x) ⊃ BestAction(t; run))
∀t (¬∃x (See(t; x) ∧ Dangerous(x)) ⊃ BestAction(t; nothing))
Dangerous(bear)
See(now; bear)
Basic operation
A production system is a forward-chaining
reasoning system that uses rules (also called
production rules or simply, productions) as its
representation of general knowledge.
A production system keeps an ongoing
memory of assertions in what is called its
working memory (WM).
The WM is like a database, but more volatile; it
is constantly changing during the operation of
the system.
The basic operation of a production system is
a cycle of three steps that repeats until no
more rules are applicable to the WM, at which
point the system halts.
The three parts of the cycle are as follows:
Recognize (match): find which rules are
applicable, that is, those rules whose
antecedent conditions are satisfied by the
current working memory.
Resolve conflict: among the rules found in the
first step (called a conflict set - agenda), choose
which of the rules should fire, that is, get a
chance to execute.
Act: change the WM by performing the
consequent actions of all the rules selected in
the second step.
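The three-step cycle can be sketched as a short loop. This is a minimal Python illustration under stated assumptions, not the architecture of any particular system: a rule is a dictionary with hypothetical "name", "match" and "act" entries, the WM is a set of tuples, and the default conflict resolution simply takes the first instantiation found.

```python
def run(rules, wm, choose=lambda conflict_set: conflict_set[0], max_cycles=100):
    """Repeat recognize / resolve conflict / act until no rule is applicable."""
    fired = []
    for _ in range(max_cycles):
        # Recognize: collect every (rule, binding) whose antecedent holds in WM.
        conflict_set = [(rule, b) for rule in rules for b in rule["match"](wm)]
        if not conflict_set:
            break  # no applicable rule: the system halts
        # Resolve conflict: choose one instantiation from the agenda.
        rule, binding = choose(conflict_set)
        # Act: perform the consequent actions, which change the WM.
        rule["act"](wm, binding)
        fired.append((rule["name"], binding))
    return fired

# A single illustrative rule: while a counter is positive, decrement it.
countdown = {
    "name": "countdown",
    "match": lambda wm: [n for (kind, n) in wm if kind == "count" and n > 0],
    "act": lambda wm, n: (wm.remove(("count", n)), wm.add(("count", n - 1))),
}

wm = {("count", 3)}
trace = run([countdown], wm)
print(trace)
print(wm)
```

The loop stops on its own once the counter reaches 0, because the rule no longer matches; max_cycles is only a safety limit for rule sets that never halt.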
Working memory
Working memory is composed of a set of
working memory elements (WMEs).
Each WME is a tuple:
(type|relation attribute1: value1 . . . attributen: valuen)
Declaratively, we understand each WME as an
existential sentence:
∃x (type(x) ∧ attribute1(x) = value1 ∧ … ∧
attributen(x) = valuen)
Note that the individual about whom the assertion is
made is not explicitly identified in a WME (which can
cause problems with duplicate facts).
If we choose to do so, we can identify individuals by
using an attribute that is expected to be unique for
each individual.
In expressing relationships among objects, reification
can be used.
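The points about duplicates, unique attributes and reification can be illustrated with a small data structure. This is a Python sketch with assumed names (the WME class, the wme helper and all attribute values are illustrative only).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WME:
    type: str
    attrs: tuple  # sorted (attribute, value) pairs, so equal WMEs compare equal

def wme(type_, **attrs):
    return WME(type_, tuple(sorted(attrs.items())))

# Two different people with the same attribute values cannot be told apart:
anonymous = {wme("person", age=27, home="toronto"),
             wme("person", age=27, home="toronto")}

# An attribute expected to be unique (here: name) identifies the individuals.
named = {wme("person", name="john", age=27, home="toronto"),
         wme("person", name="mary", age=27, home="toronto")}

# Reification: a relationship between objects becomes an object of its own.
older = wme("basicFact", relation="olderThan", firstArg="john", secondArg="mary")

print(len(anonymous), len(named))
print(older)
```

With the WM modelled as a set, the two anonymous facts collapse into one element, which is the duplicate-fact problem mentioned above.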
Production rules
The antecedent of a production rule is a set of
conditions.
If there is more than one condition, they are
understood conjunctively.
Each condition can be positive or negative.
The body of each condition is a tuple of the
following form:
(type attribute1: specification1 . . . attributek:
specificationk)
Each specification is one of the following:
an atom
a variable
an evaluable expression, within [ ]
a test, within { }
the conjunction, disjunction, or negation of a specification
Examples:
(person age: [n+4] occupation: x)
(not (person age: {< 23 ∧ > 6}))
A positive condition is satisfied if there is a
matching WME in the WM.
A negative condition is satisfied if there is no
matching WME (negation is interpreted as
failure, as in PROLOG-type systems: the
Closed World Assumption).
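Matching positive and negative conditions can be sketched as follows. This is a simplified Python illustration with assumed representations: a WME is a dictionary with a "type" key, a condition is a pair (type, specifications), a specification is an atom, a Var, or a test given as a Python function. Evaluable expressions such as [n+4] are left out.

```python
class Var:
    """A rule variable, such as x in (person occupation: x)."""
    def __init__(self, name):
        self.name = name

def match(condition, wme, bindings):
    """Return the extended bindings if wme matches condition, else None."""
    ctype, specs = condition
    if wme.get("type") != ctype:
        return None
    new = dict(bindings)
    for attr, spec in specs.items():
        if attr not in wme:
            return None
        value = wme[attr]
        if isinstance(spec, Var):
            # A variable binds on first use and must agree afterwards.
            if new.setdefault(spec.name, value) != value:
                return None
        elif callable(spec):
            if not spec(value):      # a test, such as {< 23 and > 6}
                return None
        elif spec != value:          # an atom must be equal to the value
            return None
    return new

def positive(condition, wm, bindings=None):
    return any(match(condition, w, bindings or {}) is not None for w in wm)

def negative(condition, wm, bindings=None):
    # Negation as failure: satisfied when no WME in the WM matches.
    return not positive(condition, wm, bindings)

wm = [{"type": "person", "age": 30, "occupation": "chef"}]
school_age = ("person", {"age": lambda a: 6 < a < 23})

print(match(("person", {"occupation": Var("x")}), wm[0], {}))
print(negative(school_age, wm))
wm.append({"type": "person", "age": 10, "occupation": "pupil"})
print(negative(school_age, wm))
```

The negative condition holds only as long as the WM contains no matching person, so adding the second WME makes it fail.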
The consequent sides of production rules have
a strictly procedural interpretation: all of the
actions in the consequent are executed
in sequence.
Each action is one of the following:
ADD pattern: this means that a new WME specified
by pattern is added directly to the WM.
REMOVE i: i is an integer, and this means to
remove (completely) from WM the WME that
matched the i-th condition in the antecedent of the
rule. This construct is not applicable if that condition
was negative.
MODIFY i (attribute specification): this means to
modify the WME that matched the i-th condition in
the antecedent by replacing its current value for
attribute by specification. MODIFY is also not
applicable to negative conditions.
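The three actions can be sketched as operations on a list of WMEs. This is a minimal Python illustration with assumed names: matched is a hypothetical list holding, for each condition of the rule, the WME it matched, or None for a negative condition.

```python
def matched_wme(matched, i):
    """Return the WME matched by the i-th condition (numbered from 1)."""
    target = matched[i - 1]
    if target is None:
        raise ValueError("REMOVE and MODIFY do not apply to negative conditions")
    return target

def add(wm, pattern):
    # ADD pattern: a new WME is added directly to the WM.
    wm.append(dict(pattern))

def remove(wm, matched, i):
    # REMOVE i: delete the WME that matched the i-th condition.
    target = matched_wme(matched, i)
    wm[:] = [w for w in wm if w is not target]

def modify(wm, matched, i, attribute, value):
    # MODIFY i (attribute specification): replace the value of one attribute.
    matched_wme(matched, i)[attribute] = value

wm = [{"type": "student", "name": "ann", "year": 1}]
matched = [wm[0], None]          # second condition assumed to be negative

add(wm, {"type": "person", "name": matched[0]["name"]})
modify(wm, matched, 1, "year", 2)
year_after_modify = wm[0]["year"]
remove(wm, matched, 1)
print(wm)
```

REMOVE compares by identity rather than by value, so only the WME that actually matched is deleted, even if the WM holds an equal copy.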
Note that in the actions of rules, any variables
that appear refer to the values obtained when
matching the antecedent of the rule.
Example:
IF (student name: x) THEN ADD (person name: x)
The corresponding formula is:
∀x (Student(x) ⊃ Person(x))
Example 1:
We have three bricks, each of different size,
sitting in a heap. We have three identifiable
positions in which we want to place the bricks
with a robotic hand; call these positions 1, 2,
and 3. Our goal is to place the bricks in those
positions in order of their size, with the largest in
position 1 and the smallest in position 3.
The patterns used for WMEs:
(counter <value>)
(brick <name> <size> <position>)
The initial WM:
(counter 1)
(brick … 10 heap)
(brick … 30 heap)
(brick … 20 heap)
We can achieve our goal with two production
rules that work with any number of bricks.
The first one will place the largest currently
available brick in the hand.
The other one will place the brick currently in
the hand into the next position, going through
the positions sequentially.
(defrule R1
   ; a brick that is still in the heap
   ?a <- (brick ?name ?s heap)
   ; no other brick in the heap is larger
   (not (brick ?n&~?name ?size&:(> ?size ?s) heap))
   ; the hand is empty
   (not (brick ? ? hand))
   =>
   (retract ?a)
   (assert (brick ?name ?s hand))
   (printout t "Brick " ?name " is grasped" crlf))
(defrule R2
   ; the brick currently in the hand
   ?a <- (brick ?name ?s hand)
   ; the next free position
   ?b <- (counter ?i)
   =>
   (retract ?a ?b)
   (assert (brick ?name ?s ?i)
           (counter (+ ?i 1)))
   (printout t "Brick " ?name " is placed in position " ?i crlf))
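The behaviour of R1 and R2 can be traced with a small simulation. This is a Python sketch, not a CLIPS interpreter: each rule is hand-coded as a function that tests its conditions and applies its actions, and the brick names A, B and C are assumptions made for the illustration.

```python
def r1(wm):
    """R1: grasp the largest brick in the heap when the hand is empty."""
    if any(f[0] == "brick" and f[3] == "hand" for f in wm):
        return None
    heap = [f for f in wm if f[0] == "brick" and f[3] == "heap"]
    for brick in heap:
        # Negative condition: no other brick in the heap is larger.
        if not any(o[2] > brick[2] for o in heap if o[1] != brick[1]):
            _, name, size, _ = brick
            wm.remove(brick)
            wm.add(("brick", name, size, "hand"))
            return f"Brick {name} is grasped"
    return None

def r2(wm):
    """R2: put the brick in the hand into the next position."""
    in_hand = [f for f in wm if f[0] == "brick" and f[3] == "hand"]
    counters = [f for f in wm if f[0] == "counter"]
    if not in_hand or not counters:
        return None
    (_, name, size, _), (_, i) = in_hand[0], counters[0]
    wm.remove(in_hand[0])
    wm.remove(counters[0])
    wm.add(("brick", name, size, i))
    wm.add(("counter", i + 1))
    return f"Brick {name} is placed in position {i}"

wm = {("counter", 1),
      ("brick", "A", 10, "heap"),   # names A, B, C are assumed
      ("brick", "B", 30, "heap"),
      ("brick", "C", 20, "heap")}

log = []
while True:
    message = r1(wm) or r2(wm)   # at most one of the two rules is applicable
    if message is None:
        break                    # no rule applicable: halt
    log.append(message)

print("\n".join(log))
```

The two rules alternate: R1 needs an empty hand and R2 needs a brick in the hand, so they are never applicable in the same cycle.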
Remarks:
In this example, no conflict resolution is
necessary, because only one rule can fire at a
time (the two rules are disjoint).
The first rule (R1) is activated repeatedly, each
time for the largest brick remaining in the heap.
Due to the undecidable character of first-order
logic, certain constraints are imposed on
production systems.
For example, in CLIPS no variable can be
used in a WME.
It follows that CLIPS uses one of the following
forms of the modus ponens inference rule:
C,  C ⊃ D
---------
D

A(a),  ∀x (A(x) ⊃ B(x))
-----------------------
B(a)
If an inference engine can use any formula in first-order
logic, then it is called a first-order inference
engine:
∀x A(x),  ∀x (A(x) ⊃ B(x))
--------------------------
∀x B(x)
Conflict resolution
Whether we are doing data-directed reasoning
or goal-directed reasoning, it may be the case
that more than one rule is applicable.
There are many conflict resolution strategies
for arriving at the most appropriate rule to fire.
The most commonly used conflict resolution
strategies are specificity and recency.
Specificity: select the applicable rule whose
conditions are most specific (in general, the most
complex rule, as is the case in CLIPS).
One set of conditions is said to be more
specific than another if the set of WMs that
satisfy it is a subset of those that satisfy the
other.
Example:
IF (bird) THEN ADD (canFly)
IF (bird weight: {>100}) THEN ADD (cannotFly)
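The bird example can be sketched in Python. The sketch uses a purely syntactic approximation of specificity, stated here as an assumption: a rule is taken to be more specific than another when its set of conditions strictly contains the other's, which guarantees that the WMs satisfying it form a subset. Conditions are plain strings for the illustration.

```python
rules = [
    {"name": "canFly", "conditions": frozenset({"bird"})},
    {"name": "cannotFly", "conditions": frozenset({"bird", "weight > 100"})},
]

def more_specific(a, b):
    """True if a has every condition of b plus at least one more."""
    return b["conditions"] < a["conditions"]

def most_specific(applicable):
    """Keep the rules that no other applicable rule is more specific than."""
    return [r for r in applicable
            if not any(more_specific(other, r) for other in applicable)]

# Both rules are applicable to a bird that weighs more than 100.
chosen = most_specific(rules)
print([r["name"] for r in chosen])
```

Several rules may survive this filter when their conditions are not comparable, so specificity usually has to be combined with another criterion.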
Recency: select an applicable rule based on
how recently it has been used.
There are different versions of this strategy,
ranging from firing the rule that matches on the
most recently created/modified WME (the case
in CLIPS) to firing the rule that has been least
recently used.
Preferring the rules that match the most recent
WMEs can be used to make sure a problem
solver stays focused on what it was just doing
(typically associated with depth-first search).
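The variant of recency that prefers the most recently created or modified WMEs can be sketched as follows. This is a Python illustration under assumptions: every WME carries an integer time tag that grows with each creation or modification, and an instantiation is a pair (rule name, time tags of the WMEs it matched).

```python
def most_recent(conflict_set):
    """Pick the instantiation whose matched WMEs are the most recent.

    Time tags are compared from the newest to the oldest, so an
    instantiation that matched the newest WME wins.
    """
    return max(conflict_set, key=lambda inst: sorted(inst[1], reverse=True))

conflict_set = [
    ("r1", [3, 7]),
    ("r2", [9, 1]),   # matches the WME with the highest time tag
    ("r3", [8, 8]),
]

winner = most_recent(conflict_set)
print(winner[0])
```

Because the newest WME is usually the one just produced by the previous rule firing, this choice keeps the system working on its latest result.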
Somewhat related to conflict resolution is
refractoriness: do not select (activate) a rule that
has just been applied if it is matched by the
same WMEs.
This prevents the looping behavior that results
from firing a rule repeatedly because of the
same WMEs.
Nontrivial rule systems often need to use more
than one conflict resolution criterion.
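Refractoriness can be sketched by remembering which instantiations have already fired. This is a Python illustration with assumed names: an instantiation is a pair (rule name, time tags of the matched WMEs), and current_matches stands for a recognize step over a WM that the rule's action does not change.

```python
def run_with_refractoriness(current_matches, max_cycles=10):
    """Fire each instantiation at most once for the same WMEs."""
    fired, order = set(), []
    for _ in range(max_cycles):
        candidates = [i for i in current_matches() if i not in fired]
        if not candidates:
            break                      # everything applicable has already fired
        instantiation = candidates[0]
        fired.add(instantiation)
        order.append(instantiation)
    return order

# A rule "greet" that matches two WMEs (time tags 1 and 2) and whose
# action leaves them in place, so it would stay applicable forever.
def current_matches():
    return [("greet", (1,)), ("greet", (2,))]

order = run_with_refractoriness(current_matches)
print(order)
```

Without the fired set, the loop would keep selecting the first instantiation until max_cycles is reached.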
Examples:
The OPS5 production rule system uses the
following four criteria for selecting the rule to fire
among those that are found to be applicable.
1. Discard any rule that has just been used for the
same values of variables (refractoriness);
2. Order the remaining instances in terms of
recency of WME matching the first condition,
and then the second condition, and so on;
3. Order the remaining rules by number of
conditions (complexity);
4. If there is still a conflict, select arbitrarily among
the remaining candidates.
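The four criteria can be read as successive filters over the conflict set. This is a Python sketch of that reading, not the OPS5 implementation: instantiations are dictionaries with assumed keys "rule", "tags" (time tags of the matched WMEs, in condition order) and "n_conditions".

```python
import random

def select(conflict_set, fired, rng=random):
    # 1. Refractoriness: discard instantiations that have already fired.
    candidates = [i for i in conflict_set
                  if (i["rule"], i["tags"]) not in fired]
    if not candidates:
        return None
    # 2. Recency: compare the time tags of the first condition, then the
    #    second, and so on (tuples compare in exactly this order).
    best = max(i["tags"] for i in candidates)
    candidates = [i for i in candidates if i["tags"] == best]
    # 3. Complexity: prefer the rule with the most conditions.
    most = max(i["n_conditions"] for i in candidates)
    candidates = [i for i in candidates if i["n_conditions"] == most]
    # 4. Still tied: select arbitrarily.
    return rng.choice(candidates)

conflict_set = [
    {"rule": "a", "tags": (5, 2), "n_conditions": 2},
    {"rule": "b", "tags": (5, 4), "n_conditions": 2},
    {"rule": "c", "tags": (7, 1), "n_conditions": 2},
]
fired = {("c", (7, 1))}           # c has just been used with the same WMEs

chosen = select(conflict_set, fired)
print(chosen["rule"])
```

Rule c is removed by refractoriness; a and b tie on the first condition, and b wins on the second.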
One interesting approach to conflict resolution
is provided by the SOAR system.
This system is a general problem solver that
attempts to find a path from a start state to a
goal state by applying productions.
It treats selecting which rule to fire as deciding
what the system should do next.
Thus, if unable to decide on which rule to fire
at some point, SOAR sets up a new metagoal
to solve, namely, the goal of selecting which
rule to use, and the process iterates.
When this metagoal is solved (which could in
principle involve metametagoals, etc.), the
system has made a decision about which base
goal to pursue, and therefore the conflict is
resolved.
Fig. 1 (diagram; labels: Rules, Agenda)
Fig. 2 (diagram; labels: Rules, Agenda)
Fig. 3 (diagram; labels: Rules, Agenda)
Fig. 4 (diagram; node labels: Root Node; Field_1 = data; Field_2 = 2; Field_2 = no_constraint; Field_3 optim; Field_3 = no_constraint; Field_4 = no_constraint; Field_4 = Field_2)
Fig. 5 (diagram of the Pattern Network and the Join Network, leading to an Activated Rule; pattern network node labels: Root Node; Field_1 = data; Field_1 = condition; Field_2 = 2; Field_2 = no_constraint (twice); Field_3 optim; Field_3 = no_constraint (twice); Field_4 = no_constraint; Field_4 = Field_2; join network test: Field_4 of pattern_1 = Field_2 of pattern_2)
(matches R2)
(initial-fact)
(sequence a b c d)
(element a)
(element b)
(element c)
(element d)
(element e)
(element f)
(element g)
(Diagram: facts matched by each pattern, shown for two orderings of the patterns. One pattern is matched by f1 only; each of the other four patterns is matched by f2, f3, f4, f5, f6, f7, f8. In the first ordering the pattern matched by f1 comes first; in the second it comes last.)
[f2,f2];[f2,f3];[f2,f4];[f2,f5];[f2,f6];
[f2,f7];[f2,f8]
[f3,f2];[f3,f3];[f3,f4];[f3,f5];[f3,f6];
[f3,f7];[f3,f8]
....
[f8,f2];[f8,f3];[f8,f4];[f8,f5];[f8,f6];
[f8,f7];[f8,f8]
Result: 56 partial matches only for patterns 1 and 2
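The growth of partial matches can be checked by enumeration. This is a Python sketch under two assumptions that the listing above suggests but does not state: patterns 1 and 2 each match any of the seven facts f2 to f8 with no test between them, and the total of 56 counts the matches for pattern 1 alone together with the pairs for patterns 1 and 2.

```python
from itertools import product

element_facts = [f"f{i}" for i in range(2, 9)]       # f2 ... f8

# Partial matches for pattern 1 alone: one per matching fact.
pattern_1 = [(f,) for f in element_facts]

# Partial matches for patterns 1 and 2: every ordered pair of facts,
# including pairs such as [f2,f2], since nothing forbids repetition.
patterns_1_2 = list(product(element_facts, repeat=2))

total = len(pattern_1) + len(patterns_1_2)
print(len(pattern_1), len(patterns_1_2), total)
```

Each further unconstrained pattern multiplies the number of combinations by seven again, which is why the order of the patterns in a rule matters for efficiency.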