
UNIT III

KNOWLEDGE INFERENCE
Knowledge representation -Production based system, Frame based system. Inference Backward
chaining, Forward chaining, Rule value approach, Fuzzy reasoning - Certainty factors, Bayesian
Theory-Bayesian Network-Dempster - Shafer theory.
KNOWLEDGE REPRESENTATION
Knowledge representation (KR) is an important issue in both cognitive science and
artificial intelligence.
In cognitive science, it is concerned with the way people store and process
information and
In artificial intelligence (AI), the main focus is on storing knowledge so that programs
can process it and achieve human intelligence.
There are different ways of representing knowledge e.g.
predicate logic,
semantic networks,
extended semantic net,
frames,
conceptual dependency etc.
In predicate logic, knowledge is represented in the form of rules and facts as is done in
Prolog.
Knowledge is the information about a domain that can be used to solve problems in that
domain. To solve many problems requires much knowledge, and this knowledge must be
represented in the computer. As part of designing a program to solve problems, we must
define how the knowledge will be represented.
A representation scheme is the form of the knowledge that is used in an agent.
A representation of some piece of knowledge is the internal representation of the
knowledge. A representation scheme specifies the form of the knowledge. A knowledge
base is the representation of all of the knowledge that is stored by an agent.
Different types of knowledge require different kinds of representation. The Knowledge
Representation models/mechanisms are often based on:
Logic

Rules

Frames

Semantic Nets

Knowledge and Representation

Problem solving requires large amount of knowledge and some mechanism for
manipulating that knowledge.

Knowledge and Representation are distinct entities that play central but
distinguishable roles in an intelligent system.
Knowledge is a description of the world;
it determines a system's competence by what it knows.

Representation is the way knowledge is encoded;


it defines the system's performance in doing something.

In simple words:

we need to know about the things we want to represent, and

we need some means by which we can manipulate them.

Knowledge Representation Schemes


There are four types of Knowledge representation:
Relational, Inheritable, Inferential, and Declarative/Procedural.
Relational Knowledge:
provides a framework to compare two objects based on equivalent attributes.
any instance in which two different objects are compared is a relational type
of knowledge.
Inheritable Knowledge

is obtained from associated objects.

it prescribes a structure in which new objects are created which may inherit all
or a subset of attributes from existing objects.

Inferential Knowledge

is inferred from objects through relations among objects.

e.g., a word alone is simple syntax, but with the help of the other words in a phrase
the reader may infer more from the word; this inference within linguistics is
called semantics.

PRODUCTION BASED SYSTEM

Rule-based systems are used to store and manipulate knowledge so as to interpret
information in a useful way. They are often used in applications and research.

A classic example of a rule-based system is the domain-specific expert system that uses
rules to make deductions or choices. For example, an expert system might help a doctor
choose the correct diagnosis based on a cluster of symptoms, or select tactical moves to
play a game.

Rule-based systems can be used to perform lexical analysis to compile or interpret


computer programs, or in natural language processing.

Rule-based programming attempts to derive execution instructions from a starting set of


data and rules. This is a more indirect method than that employed by an imperative
programming language which lists execution steps sequentially.

A typical rule-based system has four basic components

A list of rules or rule base, which is a specific type of Knowledge base.


An inference engine or semantic reasoner, which infers information or takes action
based on the interaction of the input and the rule base. The interpreter executes a
production system program by performing the following match-resolve-act cycle:
Match: In this first phase, the left-hand sides of all productions are matched against
the contents of working memory. As a result a conflict set is obtained, which consists
of instantiations of all satisfied productions. An instantiation of a production is an
ordered list of working memory elements that satisfies the left-hand side of the
production.
Conflict-Resolution: In this second phase, one of the production instantiations in the
conflict set is chosen for execution. If no productions are satisfied, the interpreter
halts.
Act: In this third phase, the actions of the production selected in the conflict-resolution
phase are executed. These actions may change the contents of working
memory. At the end of this phase, execution returns to the first phase.

Temporary working memory.


A User interface or other connection to the outside world through which input and output
signals are received and sent.
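The match-resolve-act cycle above can be sketched as a small interpreter. Working memory holds simple string facts, and each rule pairs a condition set with facts its action adds; the medical rules below are illustrative, not taken from a real expert system.

```python
# Working memory: the temporary store of facts the rules match against.
working_memory = {"gram-positive", "coccus", "chain"}

# Rule base: (condition facts, facts the action adds). Contents are illustrative.
rules = [
    ({"gram-positive", "coccus", "chain"}, {"streptococcus"}),
    ({"streptococcus"}, {"prescribe-penicillin"}),
]

fired = True
while fired:  # repeat the match-resolve-act cycle until quiescence
    fired = False
    # Match: build the conflict set of satisfied, not-yet-applied rules.
    conflict_set = [(cond, adds) for cond, adds in rules
                    if cond <= working_memory and not adds <= working_memory]
    if conflict_set:
        # Conflict resolution: here, simply pick the first instantiation.
        cond, adds = conflict_set[0]
        # Act: execute the actions, changing working memory.
        working_memory |= adds
        fired = True

print(sorted(working_memory))
```

Real production systems use more sophisticated conflict-resolution strategies (recency, specificity); picking the first satisfied rule is the simplest possible choice.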

ARCHITECTURE OF EXPERT SYSTEMS:

Major Components
o Knowledge base - a declarative representation of the expertise, often in IF THEN
rules
o Working storage - the data which is specific to a problem being solved
o Inference engine - the code at the core of the system
o Derives recommendations from the knowledge base and problem-specific data in
working storage
o User interface - the code that controls the dialog between the user and the system
FRAME-BASED SYSTEMS:

Frame-based systems are knowledge representation systems that use frames, a notion originally
introduced by Marvin Minsky, as their primary means to represent domain knowledge.
A frame is a structure for representing a CONCEPT or situation such as "living room" or "being
in a living room."

A frame provides a means of organising knowledge in slots to describe various attributes


and characteristics of the object.

Frame Structure:

The concept of a frame is defined by a collection of slots. Each slot describes a particular
attribute or operation of the frame.
Slots are used to store values. A slot may contain a default value or a pointer to another
frame, a set of rules or procedure by which the slot value is obtained.

Slot value. A slot value can be symbolic, numeric or Boolean.


For example, the slot Name has symbolic values, and the slot Age numeric values. Slot
values can be assigned when the frame is created or during a session with the expert
system.
o Default slot value. The default value is taken to be true when no evidence to the
contrary has been found. For example, a car frame might have four wheels and a
chair frame four legs as default values in the corresponding slots.
o Range of the slot value. The range of the slot value determines whether a
particular object complies with the stereotype requirements defined by the frame.
For example, the cost of a computer might be specified between $750 and $1500.
o Procedural information. A slot can have a procedure attached to it, which is
executed if the slot value is changed or needed.

Individual frames have a special slot called INSTANCE-OF whose filler is the name of
a generic frame.
Example: the individual frame toronto (lower case for individual frames) has its
INSTANCE-OF slot filled with the generic frame CanadianCity (upper case for generic frames).
Generic frames may have an IS-A slot that names a more general generic frame.
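As a sketch, frames and slot inheritance along INSTANCE-OF and IS-A links can be modelled with dictionaries; the City parent frame and the slot names are assumptions for illustration.

```python
# Frames as dictionaries; INSTANCE-OF / IS-A links point to parent frames.
frames = {
    "City": {"IS-A": None, "has-population": True},
    "CanadianCity": {"IS-A": "City", "country": "Canada"},
    "toronto": {"INSTANCE-OF": "CanadianCity", "province": "Ontario"},
}

def get_slot(frame_name, slot):
    """Return a slot value, inheriting along INSTANCE-OF / IS-A links."""
    frame = frames.get(frame_name)
    while frame is not None:
        if slot in frame:
            return frame[slot]
        parent = frame.get("INSTANCE-OF") or frame.get("IS-A")
        frame = frames.get(parent)
    return None

print(get_slot("toronto", "country"))   # inherited from CanadianCity
```

The lookup walks up the parent chain until a frame supplies the slot, which is exactly the default-reasoning behaviour described above: local values override inherited ones.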

Frame Based Representation Languages

Frame representations have become so popular that special high-level frame-based
representation languages have been developed.
LISP has functions to create, access, modify, update and display frames.
A function which defines a frame is called with (f define f-name <parents> <slots>)

f define is a frame definition function.


F-name is the name assigned to the new frame.
<parents> is a list of all parent frames to which the new frame is linked.
<slots> is a list of slot names and initial values.
There are many frame languages which aid in building frame-based systems.
They include:
1. the frame representation language (FRL), which served as the base language for a
scheduling system called NUDGE,
2. the knowledge representation language (KRL), and
3. KL-ONE.

Frame architecture

Frames are structured sets of knowledge, such as an object or concept name, the object's
main attributes and their corresponding values, and some attached procedures.
The attribute, values and procedures are stored in specified slots and slot facets of the

frame.
Frames are linked together as a network much like the nodes in an associative network.
Frames have many of the features of associative networks, property inheritance and
default reasoning.
Frame architectures and a number of building tools which create and manipulate frame
structured systems have been developed.

Example:

The Present Illness Program (PIP) system, developed in 1976, was an early diagnostic
tool designed to emulate clinicians in the evaluation of patients with edema.

It merged facts about the patient with knowledge from a database to develop a
hypothesis about what was afflicting the patient.

The system had four major components:

a set of patient data


a long-term memory, the knowledge repository
a short-term memory, the intersection of patient data and the knowledge repository
a supervisor program to filter knowledge and act on patient input.

The medical knowledge in PIP is organized in frame structures, where each frame is
composed of categories of slots with names such as

Typical findings

Logical decision criteria

Complementary relations to other frames

Differential diagnosis

Scoring

INFERENCE
Inference is deriving new sentences from old.
There are standard patterns of inference that can be applied to derive chains of conclusions that
lead to the desired goal. These patterns of inference are called inference rules.
First order Logic
Whereas propositional logic assumes the world contains facts, first-order logic (like natural
language) assumes the world contains

Objects: people, houses, numbers, colors, baseball games, wars, ...

Relations: red, round, prime, brother of, bigger than, part of, comes between, ...
Functions: father of, best friend, one more than, plus, ...
Syntactic elements of First Order Logic
The basic syntactic elements of first-order logic are the symbols that stand for objects,
relations, and functions. The symbols, come in three kinds:
a) constant symbols, which stand for objects;
b) predicate symbols, which stand for relations;
c) and function symbols, which stand for functions.
We adopt the convention that these symbols will begin with uppercase letters. Example:
Constant symbols : Richard and John;
Predicate symbols :Brother, OnHead, Person, King, and Crown;
Function symbol : LeftLeg.
Usage of First Order Logic.
The best way to find usage of First order logic is through examples. The examples can be taken
from some simple domains. In knowledge representation, a domain is just some part of the
world about which we wish to express some knowledge.
Assertions and queries in first-order logic
Sentences are added to a knowledge base using TELL, exactly as in propositional logic. Such
sentences are called assertions.
For example, we can assert that John is a king and that kings are persons:
TELL(KB, King(John))
TELL(KB, ∀x King(x) ⇒ Person(x))
Forward chaining with an example.
Using a deduction to reach a conclusion from a set of antecedents is called forward chaining.
In other words, the system starts from a set of facts and a set of rules, and tries to find a way
of using these rules and facts to deduce a conclusion or come up with a suitable course of
action. This is known as data-driven reasoning.

To chain forward, match data in working memory against the 'conditions' of rules in the rule
base.
When one of them fires, this is liable to produce more data.
So the cycle continues.

The proof tree generated by forward chaining. Example knowledge base


The law says that it is a crime for an American to sell weapons to hostile nations. The
country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it
by Colonel West, who is American.
Prove that Col. West is a criminal.
"... it is a crime for an American to sell weapons to hostile nations":
American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x)
"Nono ... has some missiles": Owns(Nono, M1), Missile(M1)
"... all of its missiles were sold to it by Colonel West":
Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)
Missiles are weapons: Missile(x) ⇒ Weapon(x)
An enemy of America counts as "hostile": Enemy(x, America) ⇒ Hostile(x)
"West, who is American": American(West)
"The country Nono, an enemy of America": Enemy(Nono, America)
Note:
(a) The initial facts appear at the bottom level of the proof tree
(b) Facts inferred on the first iteration are at the middle level
(c) The facts inferred on the second iteration are at the top level

ALGORITHM
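The forward chaining procedure can be sketched as a loop that fires every satisfied rule until no new facts appear. The rules below are a propositionalised version of the Colonel West example above (M1 is an assumed name for Nono's missile, so matching reduces to a subset test; a full first-order version would unify patterns instead).

```python
# Known facts of the crime knowledge base, already propositionalised.
facts = {"American(West)", "Missile(M1)", "Owns(Nono,M1)", "Enemy(Nono,America)"}

# Definite clauses: (set of premises, conclusion).
rules = [
    ({"Missile(M1)"}, "Weapon(M1)"),
    ({"Missile(M1)", "Owns(Nono,M1)"}, "Sells(West,M1,Nono)"),
    ({"Enemy(Nono,America)"}, "Hostile(Nono)"),
    ({"American(West)", "Weapon(M1)", "Sells(West,M1,Nono)", "Hostile(Nono)"},
     "Criminal(West)"),
]

added = True
while added:            # data-driven: keep firing until no new facts appear
    added = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            added = True

print("Criminal(West)" in facts)   # True
```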

Backward chaining with an example.

Forward chaining applies a set of rules and facts to deduce whatever conclusions can
be derived.
In backward chaining, we start from a conclusion, which is the hypothesis we wish
to prove, and we aim to show how that conclusion can be reached from the rules and
facts in the database.

The conclusion we are aiming to prove is called a goal, and reasoning in this way is
known as goal-driven.
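A goal-driven sketch over the same propositionalised crime knowledge base as in the forward chaining example: to prove a goal, either find it among the facts, or find a rule that concludes it and recursively prove that rule's premises.

```python
facts = {"American(West)", "Missile(M1)", "Owns(Nono,M1)", "Enemy(Nono,America)"}

rules = [
    (["Missile(M1)"], "Weapon(M1)"),
    (["Missile(M1)", "Owns(Nono,M1)"], "Sells(West,M1,Nono)"),
    (["Enemy(Nono,America)"], "Hostile(Nono)"),
    (["American(West)", "Weapon(M1)", "Sells(West,M1,Nono)", "Hostile(Nono)"],
     "Criminal(West)"),
]

def prove(goal):
    """Backward chain: a goal holds if it is a fact, or if some rule
    concludes it and all of that rule's premises can be proved."""
    if goal in facts:
        return True
    return any(conclusion == goal and all(prove(p) for p in premises)
               for premises, conclusion in rules)

print(prove("Criminal(West)"))   # True
```

Note the contrast with forward chaining: nothing is added to the fact base; the search works backwards from the goal and only touches rules relevant to it.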

Backward chaining example

Backward chaining is the best choice if:


The goal is given in the problem statement, or can sensibly be guessed at the beginning of the
consultation; or:
The system has been built so that it sometimes asks for pieces of data (e.g. "please now do the
gram test on the patient's blood, and tell me the result"), rather than expecting all the facts to be

presented to it.
This is because (especially in the medical domain) the test may be
expensive,
or unpleasant,
or dangerous for the human participant,
so one would want to avoid doing such a test unless there was a good reason for it.

Forward chaining is the best choice if:


All the facts are provided with the problem statement;
or:
There are many possible goals, and a smaller number of patterns of data; or:
There isn't any sensible way to guess what the goal is at the beginning of the consultation.

FUZZY REASONING:

Fuzzy Set Theory

Fuzzy set theory defines set membership as a possibility distribution.

The general rule for this can be expressed as a function f over n possibilities,
where n is some number of possibilities.
This basically states that we can take n possible events and use f to generate a single
possible outcome.
This extends set membership, since we could have varying definitions of, say, hot curries.
One person might declare that only curries of Vindaloo strength or above are hot, whilst
another might say Madras and above are hot. We could allow for these variations in
definition by allowing both possibilities in fuzzy definitions.
Once set membership has been redefined we can develop new logics based on combining
of sets etc. and reason effectively.

Fuzzy Logic
Fuzzy logic is a totally different approach to representing uncertainty:

It focuses on ambiguities in describing events rather than the uncertainty about the
occurrence of an event.
It changes the definitions of set theory and logic to allow this.
Traditional set theory defines set membership as a boolean predicate.

Operations- Fuzzy set.


The following rules, which are common in classical set theory, also apply to fuzzy set theory.
De Morgan's laws
¬(A ∪ B) = ¬A ∩ ¬B, ¬(A ∩ B) = ¬A ∪ ¬B
Associativity
(A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C)
Commutativity
A ∪ B = B ∪ A, A ∩ B = B ∩ A
Distributivity
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

Operations on fuzzy sets


union,
intersection,
complement
Operations on fuzzy numbers
arithmetic,
equations,
functions
the extension principle

Operations on Fuzzy Sets: Union and Intersection
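The three basic operations can be sketched under the usual Zadeh (max/min) definitions; the curry sets and their membership grades are illustrative, echoing the hot-curry example above.

```python
# Fuzzy sets as dicts mapping elements to membership grades in [0, 1].
hot = {"korma": 0.1, "madras": 0.6, "vindaloo": 0.9}
spicy = {"korma": 0.2, "madras": 0.7, "vindaloo": 1.0}

def f_union(a, b):          # membership of A ∪ B: max of the grades
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}

def f_intersection(a, b):   # membership of A ∩ B: min of the grades
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}

def f_complement(a):        # membership of ¬A: 1 minus the grade
    return {x: 1.0 - grade for x, grade in a.items()}

print(f_union(hot, spicy)["madras"])          # 0.7
print(f_intersection(hot, spicy)["madras"])   # 0.6
```

Note that with these definitions A ∪ ¬A is no longer the whole universe at full membership, one of the ways fuzzy sets depart from classical sets.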

CERTAINTY FACTORS:
The MYCIN model
Certainty factors / confidence coefficients (CF)
A heuristic model of uncertain knowledge.
In MYCIN, two probabilistic functions are used to model the degree of belief and the
degree of disbelief in a hypothesis:
a function to measure the degree of belief - MB
a function to measure the degree of disbelief - MD
MB[h, e] - how much the belief in h increases based on evidence e
MD[h, e] - how much the disbelief in h increases based on evidence e

Belief functions

MB[h, e] = 1, if P(h) = 1
MB[h, e] = (max(P(h|e), P(h)) - P(h)) / (1 - P(h)), otherwise

MD[h, e] = 1, if P(h) = 0
MD[h, e] = (min(P(h|e), P(h)) - P(h)) / (0 - P(h)), otherwise

Certainty factor
CF[h, e] = MB[h, e] - MD[h, e]

Belief functions features


Value range
0 ≤ MB[h, e] ≤ 1

0 ≤ MD[h, e] ≤ 1

-1 ≤ CF[h, e] ≤ 1

If h is sure, i.e. P(h|e) = 1, then

MB[h, e] = (1 - P(h)) / (1 - P(h)) = 1

MD[h, e] = 0

CF[h, e] = 1

If the negation of h is sure, i.e. P(h|e) = 0, then

MB[h, e] = 0

MD[h, e] = (0 - P(h)) / (0 - P(h)) = 1

CF[h, e] = -1
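The definitions above translate directly into code; a small sketch computing MB, MD and CF from a prior P(h) and a posterior P(h|e) (the probabilities used below are made-up numbers):

```python
def mb(p_h_e, p_h):
    """Measure of increased belief in h given evidence e."""
    if p_h == 1.0:
        return 1.0
    return (max(p_h_e, p_h) - p_h) / (1.0 - p_h)

def md(p_h_e, p_h):
    """Measure of increased disbelief in h given evidence e."""
    if p_h == 0.0:
        return 1.0
    return (min(p_h_e, p_h) - p_h) / (0.0 - p_h)

def cf(p_h_e, p_h):
    """Certainty factor: CF = MB - MD, ranging over [-1, 1]."""
    return mb(p_h_e, p_h) - md(p_h_e, p_h)

print(round(cf(0.8, 0.3), 3))   # evidence raised P(h): CF is positive
print(cf(1.0, 0.3))             # h certain: CF = 1.0
print(cf(0.0, 0.3))             # h refuted: CF = -1.0
```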

Example in MYCIN
if
(1) the type of the organism is gram-positive, and

(2) the morphology of the organism is coccus, and

(3) the growth of the organism is chain


then there is a strong evidence (0.7) that the identity of the organism is streptococcus
Example of facts in MYCIN :
(identity organism-1 pseudomonas 0.8)
(identity organism-2 e.coli 0.15)
(morphology organism-2 coccus 1.0)
Limits of CF
The CF of MYCIN assumes that hypotheses are supported by independent evidence.
An example shows what happens if this condition is violated:
A: The sprinkler ran last night
U: The grass is wet in the morning
P: Last night it rained
BAYES THEOREM AND BAYESIAN NETWORKS
Bayes' rule and its use
P(A ∧ B) = P(A|B) * P(B)
P(A ∧ B) = P(B|A) * P(A)
Bayes' rule (theorem)
P(B|A) = P(A|B) * P(B) / P(A)

Bayes Theorem
hi - hypotheses (i = 1, ..., k); e1, ..., en - evidence
P(hi) - the prior probability of hypothesis hi
P(hi | e1, ..., en) - the posterior probability of hi given the evidence
P(e1, ..., en | hi) - the likelihood of the evidence given hi

P(hi | e1, ..., en) = P(e1, ..., en | hi) * P(hi) / Σj=1..k P(e1, ..., en | hj) * P(hj),  i = 1, ..., k

If e1, ..., en are independent pieces of evidence (conditionally, given each hypothesis), then

P(e | hj) = P(e1, ..., en | hj) = P(e1 | hj) * P(e2 | hj) * ... * P(en | hj),  j = 1, ..., k
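Under the independence assumption, the theorem reduces to multiplying each prior by the individual likelihoods and normalising; the numbers below are invented for illustration.

```python
priors = [0.6, 0.4]          # P(h1), P(h2) for two competing hypotheses
likelihoods = [              # P(e_i | h_j): one row per hypothesis
    [0.9, 0.7],              # P(e1|h1), P(e2|h1)
    [0.2, 0.5],              # P(e1|h2), P(e2|h2)
]

def posterior(priors, likelihoods):
    """P(h_j | e_1..e_n) via Bayes with conditionally independent evidence."""
    numerators = []
    for p_h, ls in zip(priors, likelihoods):
        num = p_h
        for l in ls:         # product over i of P(e_i | h_j)
            num *= l
        numerators.append(num)
    z = sum(numerators)      # denominator: sum over all hypotheses h_j
    return [num / z for num in numerators]

print([round(p, 3) for p in posterior(priors, likelihoods)])  # [0.904, 0.096]
```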

PROSPECTOR
BAYESIAN NETWORKS

Represent dependencies among random variables


Give a short specification of conditional probability distribution
Many random variables are conditionally independent
Simplifies computations
Graphical representation
A DAG captures causal relationships among random variables
Allows inferences based on the network structure

Definition of Bayesian networks


A BN is a DAG in which each node is annotated with quantitative probability information,
namely:
Nodes represent random variables (discrete or continuous)
Directed links X→Y: X has a direct influence on Y; X is said to be a parent of Y
Each node Xi has an associated conditional probability table, P(Xi | Parents(Xi)), that
quantifies the effects of the parents on the node
Example: Weather, Cavity, Toothache, Catch. Weather is independent of the other
variables, while Toothache and Catch are conditionally independent given Cavity.

Bayesian network example

Bayesian network semantics


A) Represent a probability distribution
B) Specify conditional independence - build the network
A) Each value of the probability distribution can be computed as:
P(X1 = x1 ∧ ... ∧ Xn = xn) = P(x1, ..., xn) = Πi=1..n P(xi | Parents(Xi))
where Parents(Xi) represent the specific values of the parents of Xi.
Building the network
By the chain rule:
P(X1 = x1 ∧ ... ∧ Xn = xn) = P(x1, ..., xn)
= P(xn | xn-1, ..., x1) * P(xn-1, ..., x1)
= P(xn | xn-1, ..., x1) * P(xn-1 | xn-2, ..., x1) * ... * P(x2 | x1) * P(x1)
= Πi=1..n P(xi | xi-1, ..., x1)
We can see that P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi)) if
Parents(Xi) ⊆ {Xi-1, ..., X1}.
The condition may be satisfied by labeling the nodes in an order consistent with the DAG.
Intuitively, the parents of a node Xi must be all the nodes among
Xi-1, ..., X1 which have a direct influence on Xi.
Pick a set of random variables that describe the problem
Pick an ordering of those variables

while there are still variables repeat


(a) choose a variable Xi and add a node associated to Xi
(b) assign Parents(Xi) a minimal set of nodes that already exists in the network such that the
conditional independence property is satisfied
(c) define the conditional probability table for Xi
Because each node is linked only to previous nodes, the result is a DAG.
P(MaryCalls | JohnCalls, Alarm, Burglary, Earthquake) = P(MaryCalls | Alarm)
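The factorisation lets any entry of the joint distribution be read off the network by multiplying local conditional probabilities. A sketch for the classic burglary/alarm network (the CPT values are the usual textbook figures):

```python
# CPTs of the burglary network: Burglary and Earthquake are roots,
# Alarm depends on both, JohnCalls and MaryCalls depend on Alarm.
P_burglary = 0.001
P_earthquake = 0.002
P_alarm = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}  # P(a | b, e)
P_john = {True: 0.90, False: 0.05}                      # P(j | a)
P_mary = {True: 0.70, False: 0.01}                      # P(m | a)

# P(j, m, a, ~b, ~e) = P(j|a) P(m|a) P(a|~b,~e) P(~b) P(~e)
p = (P_john[True] * P_mary[True] * P_alarm[(False, False)]
     * (1 - P_burglary) * (1 - P_earthquake))
print(round(p, 8))   # about 0.00062811
```

Five numbers replace a full joint table over five variables (31 independent entries), which is the economy the conditional-independence structure buys.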
DEMPSTER-SHAFER THEORY
The theory of belief functions, also referred to as evidence theory or Dempster-Shafer
theory (DST), is a general framework for reasoning with uncertainty, with understood
connections to other frameworks such as probability, possibility and imprecise
probability theories.
Each fact has a degree of support, between 0 and 1:
0 - no support for the fact
1 - full support for the fact
Differs from Bayesian approach in that:
Belief in a fact and its negation need not sum to 1.
Both values can be 0 (meaning no evidence for or against the fact)
Dempster-Shafer Belief Functions
o A general (abstract) formulation sees belief functions as a special case of upper
probabilities:
o Definition: A belief function Bel defined on a space W satisfies the following
three properties:
o B1. Bel(∅) = 0 (normalization)
o B2. Bel(W) = 1 (normalization)
o B3. Bel(U1 ∪ U2) ≥ Bel(U1) + Bel(U2) - Bel(U1 ∩ U2), and in general
Bel(U1 ∪ U2 ∪ U3) ≥ Bel(U1) + Bel(U2) + Bel(U3) - Bel(U1 ∩ U2) - Bel(U1 ∩ U3)
- Bel(U2 ∩ U3) + Bel(U1 ∩ U2 ∩ U3) (inclusion-exclusion rule)
o In Dempster's scenario, belief functions are constructed by means of multi-valued
mappings.
o Bel and its dual, Pl (plausibility), are special kinds of lower/upper probability
functions:
You can see this by defining PBel = { μ : μ(U) ≥ Bel(U) for all U ⊆ W } and
showing that Bel is the lower envelope and Pl the upper envelope of PBel.
Shafer gave a somewhat different interpretation of these ideas (given in the book A
Mathematical Theory of Evidence). In his theory, belief functions are part of a theory of
evidence.
Definition (mass function). A mass function on W is a function m: 2^W → [0, 1] such that
the following two conditions hold:
m(∅) = 0, and ΣU⊆W m(U) = 1.
Definition (belief/plausibility function based on m). Let m be a mass function on W. Then
for every U ⊆ W:
Bel(U) =def ΣU'⊆U m(U')

Pl(U) =def ΣU'∩U≠∅ m(U')

Bel and Pl are dual:

Bel(U) + Pl(W \ U) = ΣU'⊆U m(U') + ΣU'∩(W\U)≠∅ m(U') = 1,
since every focal set is either contained in U or intersects W \ U.
If Bel is a belief function on W, then there is a unique mass function m
over W such that Bel is the belief function based on m. This mass function
is given by the following equation (Möbius inversion):
For all U ⊆ W, m(U) = ΣU'⊆U (-1)^|U\U'| Bel(U')
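The mass, belief and plausibility definitions can be sketched directly; the frame of discernment W and the masses below are illustrative.

```python
W = frozenset({"flu", "cold", "allergy"})

# A mass function: nonzero masses on a few focal sets, summing to 1.
m = {
    frozenset({"flu"}): 0.4,
    frozenset({"flu", "cold"}): 0.3,
    W: 0.3,
}

def bel(U):
    """Bel(U): total mass of focal sets contained in U."""
    return sum(v for A, v in m.items() if A <= U)

def pl(U):
    """Pl(U): total mass of focal sets intersecting U."""
    return sum(v for A, v in m.items() if A & U)

U = frozenset({"flu", "cold"})
print(round(bel(U), 10))                 # 0.7
print(round(pl(U), 10))                  # 1.0
print(round(bel(U) + pl(W - U), 10))     # duality: always 1.0
```

The mass 0.3 assigned to the whole frame W is exactly the "no evidence for or against" case noted above: it supports neither U nor its complement, so Bel(U) and Bel(W \ U) need not sum to 1.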
