Course Outline
1. Introduction to artificial intelligence
2. Knowledge representation
3. Heuristic search
4. Natural language processing
5. Symbolic machine learning
6. Connectionism and evolutionary computation
Introduction to Artificial Intelligence
Artificial intelligence (AI) is an area in computer science that focuses on creating machines that can exhibit intelligent behaviour and apply knowledge.
In 1955, a program was developed that represented each problem as a tree model; the program would attempt to solve a problem by selecting the branch most likely to lead to the correct conclusion. In 1956, the term AI was coined at the Dartmouth Conference, and since then research has continued into developing programs and applications that can solve problems efficiently and learn by themselves. Several applications have been developed, e.g. missile systems, voice and character recognition, and engineering controllers.
Conventional computers operate by following explicitly programmed rules. This allows them to perform simple, monotonous tasks efficiently and reliably, tasks to which human beings are ill suited. For more complex problems, however, computers have trouble understanding specific situations.
4. Expert systems: systems that are able to make decisions and perform the work of professionals (human experts), e.g. diagnostic systems in hospitals.
5. Robotics: automation of tasks performed by a mechanical device through predefined programs.
6. Information predictors: used e.g. in banks, insurance companies and market surveys, whereby intelligence tools are used to detect trends and predict, for example, customer behaviour.
7. Computer vision and pattern recognition: computer processing of images from the real world and recognition of features present in the images.
Challenges of AI:
- Some domains, e.g. the affective and psychomotor domains, are hard to store in a machine.
- There is no well understood model with which to represent reality and thus induce artificial intelligence.
- It is expensive to acquire tools and to develop and research artificial intelligence.
Knowledge Representation
Knowledge is the symbolic representation of some named universe of discourse. The universe of discourse may consist of actual activities, or of fictional ones set in the future or in some belief. In AI systems, we may need to represent objects, events, and performance or behaviour as kinds of knowledge.
Knowledge representation is an area of artificial intelligence concerned with how to use a symbol system to represent a domain of discourse. Its goal is to organize knowledge in a manner that facilitates the drawing of conclusions.
Components of a Representation
A representation has four components:
i. A represented world: the domain that the representations are mapped from.
ii. A representing world: the domain that contains the representations.
iii. Representing rules: the set of rules that map elements of the represented world to those of the representing world.
iv. The representation system: the procedure for extracting information from a knowledge representation; its choice determines the ease or difficulty of finding the information.
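The four components can be illustrated with a minimal Python sketch. The domain (city distances), the names and the figures below are hypothetical, chosen only to show how each component lines up with a piece of code:

```python
# Represented world: a set of facts in the domain of discourse
# (hypothetical city-distance data, not from the notes).
represented_world = [("Nairobi", "Nakuru", 160), ("Nairobi", "Kisumu", 345)]

# Representing rules: map each fact of the represented world to an
# element of the representing world (here, a dictionary entry).
representing_world = {(a, b): d for a, b, d in represented_world}

def distance(a, b):
    """Representation system: the procedure for extracting
    information from the representation."""
    d = representing_world.get((a, b))
    return d if d is not None else representing_world.get((b, a))

print(distance("Nakuru", "Nairobi"))  # 160
```

Note how the choice of representation system (lookup in both orders) determines how easy it is to find the information.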
Uses of a Representation
After representing the knowledge, we use it for:
i. Inference/reasoning: inferring facts from the existing data.
ii. Learning: acquiring knowledge, whereby new data has to be classified prior to storage for easy retrieval and has to interact with existing facts to avoid duplication.
Types of Knowledge
There are two main types of knowledge:
i. Declarative/descriptive/propositional knowledge: the factual information stored in memory, static in nature. It is the part of knowledge that describes how things are. Its domain is defined by things, events or processes, their attributes, and the relations between them.
ii. Procedural/imperative/know-how knowledge: the knowledge of how to perform a task or how to operate. It is mainly applied in problem solving.
Properties of Knowledge
Good representations of knowledge:
i. They make the important objects and relations explicit.
ii. They expose natural constraints, i.e. one can express the way one object or relation influences another.
iii. They bring objects and relations together.
iv. They suppress irrelevant detail.
v. They are transparent, i.e. the meaning can be understood clearly.
vi. They are complete, i.e. they contain all that needs to be contained.
vii. They are concise, i.e. they communicate the information efficiently.
viii. They are fast, i.e. retrieval of information is fast.
ix. They are computable, i.e. they can be created by a known procedure.
Properties of Good Knowledge Representation Systems
These characteristics can be summarised into the following four properties for knowledge representation systems:
i. Representational adequacy: the ability to represent the required knowledge.
ii. Inferential efficiency: the ability to direct the inferential mechanisms into the most productive directions by storing appropriate guides.
iii. Inferential adequacy: the ability to manipulate the knowledge represented to produce new knowledge corresponding to that inferred from the original.
iv. Acquisitional efficiency: the ability to acquire new knowledge using automatic methods wherever possible.
iii. Procedural component: the part that specifies the access procedures that enable one to create descriptions, modify them, and answer questions using them.
iv. Semantic component: the part that establishes a way of associating meaning with the descriptions created from the procedural part.
a. Propositional logic
This is logic at the sentence level, where we consider sentences or statements that are either true or false. If a proposition is true then it has a truth value of true; if it is false then its truth value is false.
Example
Proposition: Saturday is the last day of the week.
Non-proposition: Walk out.
Simple sentences which are true or false are basic propositions. Larger and more complex sentences can be constructed from basic propositions by combining them with connectives. Therefore, the basic elements of propositional logic are propositions and connectives. Examples of connectives are: not (¬), and (∧), or (∨), implies (→), and if and only if (↔).
Truth tables are used to map the relations of propositions when they are combined with connectives. Let p and q be propositions:

i. Not
p | ¬p
T | F
F | T

ii. And
p q | p ∧ q
T T | T
T F | F
F T | F
F F | F

iii. Or
p q | p ∨ q
T T | T
T F | T
F T | T
F F | F

iv. Imply
p q | p → q
T T | T
T F | F
F T | T
F F | T

v. If and only if
p q | p ↔ q
T T | T
T F | F
F T | F
F F | T
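The truth tables above can also be generated programmatically, since each connective is just a Boolean function of the proposition values. A small sketch:

```python
from itertools import product

def truth_table(name, op):
    """Print and return the truth table of a binary connective."""
    print(f"p q | {name}")
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, op(p, q)))
        print(f"{'T' if p else 'F'} {'T' if q else 'F'} | "
              f"{'T' if op(p, q) else 'F'}")
    return rows

implies = lambda p, q: (not p) or q   # p -> q is false only when T -> F
iff     = lambda p, q: p == q         # p <-> q is true when values agree

truth_table("p -> q", implies)
truth_table("p <-> q", iff)
```

Note that implication is defined as ¬p ∨ q, which reproduces the column T F T T of table iv above.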
b. Predicate Logic Propositional logic is not powerful enough to represent all types of assertions.
To cope with the deficiencies of propositional logic, we introduce predicates and quantifiers to form predicate logic. A predicate is a verb phrase that describes properties of objects or the relationships among objects. For example, in man(Maina) the predicate man asserts a property of the object Maina, while in loyalto(Maina, Mugo) the predicate loyalto expresses a relationship between two objects.
Algorithm: Converting to Clause Form
1. Eliminate →, using the fact that a → b ≡ ¬a ∨ b.
2. Reduce the scope of each ¬ to a single term, using de Morgan's laws (i.e. ¬(a ∧ b) ≡ ¬a ∨ ¬b and ¬(a ∨ b) ≡ ¬a ∧ ¬b) and the equivalences between quantifiers (¬∀x: P ≡ ∃x: ¬P and ¬∃x: P ≡ ∀x: ¬P).

Consider the following set of facts:
i. Maina was a man.
ii. Maina was Larilian.
iii. All Larilians were Nyandaruans.
iv. Mugo was a chief.
v. All Nyandaruans were either loyal to Mugo or hated him.
vi. Everyone is loyal to someone.
vii. People only try to stone chiefs they are not loyal to.
viii. Maina tried to stone Mugo.
ix. All men are people.

The above in predicate logic:
i. man(Maina)
ii. Larilian(Maina)
iii. ∀x: Larilian(x) → Nyandaruan(x)
iv. chief(Mugo)
v. ∀x: Nyandaruan(x) → loyalto(x, Mugo) ∨ hate(x, Mugo)
vi. ∀x ∃y: loyalto(x, y)
vii. ∀x ∀y: person(x) ∧ chief(y) ∧ trystone(x, y) → ¬loyalto(x, y)
viii. trystone(Maina, Mugo)
ix. ∀x: man(x) → person(x)

Conversion of these statements to clause form (well-formed formulae, wffs) gives:
i. man(Maina)
ii. Larilian(Maina)
iii. ¬Larilian(x1) ∨ Nyandaruan(x1)
iv. chief(Mugo)
v. ¬Nyandaruan(x2) ∨ loyalto(x2, Mugo) ∨ hate(x2, Mugo)
vi. loyalto(x3, S1(x3)) (S1 is a Skolem function replacing ∃y)
vii. ¬person(x4) ∨ ¬chief(y1) ∨ ¬trystone(x4, y1) ∨ ¬loyalto(x4, y1)
viii. trystone(Maina, Mugo)
ix. ¬man(x5) ∨ person(x5)

Proof: did Maina hate Mugo?
i. Express the question in predicate form: hate(Maina, Mugo)
ii. Negate the statement/predicate: ¬hate(Maina, Mugo)
iii. Look for relevant statements and resolve them together:

¬hate(Maina, Mugo)
with (v), binding x2 = Maina:   ¬Nyandaruan(Maina) ∨ loyalto(Maina, Mugo)
with (iii), binding x1 = Maina: ¬Larilian(Maina) ∨ loyalto(Maina, Mugo)
with (ii):                      loyalto(Maina, Mugo)
with (vii), x4 = Maina, y1 = Mugo: ¬person(Maina) ∨ ¬chief(Mugo) ∨ ¬trystone(Maina, Mugo)
with (iv):                      ¬person(Maina) ∨ ¬trystone(Maina, Mugo)
with (viii):                    ¬person(Maina)
with (ix), binding x5 = Maina:  ¬man(Maina)
with (i):                       NULL

The empty clause (NULL) is a contradiction, so the negated statement is false and we conclude that Maina hated Mugo.
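The conclusion of the Maina/Mugo story can be cross-checked mechanically. The sketch below forward-chains over the ground facts; the disjunctive axiom "Nyandaruans are loyal to Mugo or hate him" is used in the contrapositive Horn form (not loyal implies hate), which is an encoding choice, not the resolution procedure itself:

```python
# Ground facts from the story; tuples are (predicate, arg1[, arg2]).
facts = {("man", "Maina"), ("Larilian", "Maina"),
         ("chief", "Mugo"), ("trystone", "Maina", "Mugo")}

def step(fs):
    """Apply every rule once to the current fact set."""
    new = set(fs)
    for f in fs:
        if f[0] == "Larilian":                  # all Larilians are Nyandaruans
            new.add(("Nyandaruan", f[1]))
        if f[0] == "man":                       # all men are people
            new.add(("person", f[1]))
        if f[0] == "trystone":                  # people only try to stone
            _, x, y = f                         # chiefs they are not loyal to
            if ("person", x) in fs and ("chief", y) in fs:
                new.add(("notloyal", x, y))
        if f[0] == "Nyandaruan" and ("notloyal", f[1], "Mugo") in fs:
            new.add(("hate", f[1], "Mugo"))     # not loyal => must hate
    return new

while (nxt := step(facts)) != facts:            # iterate to a fixpoint
    facts = nxt

print(("hate", "Maina", "Mugo") in facts)  # True
```

The fixpoint loop derives person(Maina), then notloyal(Maina, Mugo), then hate(Maina, Mugo), mirroring the resolution steps.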
ii. Rules
Rules are commonly used to represent knowledge in an inference system. They are usually in the form of production rules (if-then rules). They are used to show relationships among variables and to derive actions from inputs to an inference engine.
Each rule consists of an antecedent (the if part) and a consequent (the then part). Interpreting an if-then rule involves two distinct parts: evaluating the antecedent, then applying the result to the consequent.
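The two interpretation steps can be sketched in a few lines. Combining the parts of a multi-part antecedent with min is one common choice for the AND operator (an assumption here, not the only option):

```python
def evaluate_rule(antecedent_degrees):
    """Degree to which the consequent holds, given the degree of
    truth of each antecedent part; combining with min implements AND.
    Degrees of exactly 0.0/1.0 reduce this to binary logic."""
    return min(antecedent_degrees)

# "if the sky is grey and the wind is blowing then it will rain"
print(evaluate_rule([1.0, 1.0]))   # binary case: antecedent true -> 1.0
print(evaluate_rule([0.7, 0.4]))   # multivalued case: rule fires to 0.4
```

In the multivalued case the consequent holds to the same degree as the (weakest part of the) antecedent, as described below.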
In the case of a binary/2-valued logic, if the premise is true then the conclusion is true. In the case of a multivalued logic, if the antecedent is true to some degree, then the consequent is also true to that same degree. Binary logic uses the values 0 or 1; multivalued logic uses a range from 0 to 1 (e.g. 0.5).
The antecedent of a rule can have multiple parts, e.g. "if the sky is grey and the wind is blowing then it will rain". In such a case, all the parts of the antecedent are evaluated and resolved into a single value using logical operators.
iii. Natural Language
Natural language is the human spoken language. It is the most expressive knowledge representation formalism, since everything that can be expressed symbolically can also be expressed in natural language. Its reasoning potential is very rich, but it is hard to model.
Problems with natural language:
i. It is often ambiguous.
ii. There is little uniformity in the structure of sentences.
iii. Syntax and semantics are not fully understood.
iv. Database Systems
Database systems are logical organizations of data in a form that is meaningful to the user and facilitates easy retrieval. They are well suited to efficiently representing and processing large amounts of data. However, only simple aspects of some universe of discourse can be represented, hence reasoning is very simple and limited.
v. Semantic Networks
Semantic networks are capable of representing individual objects, categories of objects, and relations among objects.
(Figure: a semantic network. Mary and John are linked to Persons by member-of links; Persons is linked to Mammals by a subset link and carries the property "two legs"; Mary is linked to John by a sister-of link.)
Mary is a sister of John. Mary and John are members of Persons. Persons have two legs. Semantic nets make it easy to perform inheritance reasoning, and they are simple and efficient compared to logic.
vi. Frames (**to read)
Frames are AI data structures used to divide knowledge into sub-structures by representing stereotyped situations. Frames are connected together to form a complete idea.
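A minimal sketch of the Mary/John semantic network as a dictionary of nodes, where "isa" links support the inheritance reasoning mentioned above (the encoding is one simple choice among many):

```python
# Each node maps property names to values; "isa" points up the hierarchy.
net = {
    "Mammals": {"isa": None},
    "Persons": {"isa": "Mammals", "legs": 2},
    "Mary":    {"isa": "Persons", "sister_of": "John"},
    "John":    {"isa": "Persons"},
}

def lookup(node, prop):
    """Walk the isa links until the property is found (inheritance)."""
    while node is not None:
        if prop in net[node]:
            return net[node][prop]
        node = net[node]["isa"]
    return None

print(lookup("Mary", "legs"))       # 2, inherited from Persons
print(lookup("Mary", "sister_of"))  # John, stored on Mary directly
```

Inheritance here is just a walk up the isa chain, which is why semantic nets make this kind of reasoning cheap compared to general logical inference.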
Heuristic Search
Heuristic search uses problem-specific knowledge beyond the definition of the problem itself. It is also known as informed search. It can thus arrive at solutions more efficiently than uninformed/blind search strategies.
Breadth First Search (BFS)
In BFS, all nodes at a given depth in the search tree are expanded before any nodes at the next level are expanded. BFS can be implemented using a FIFO queue, ensuring that the nodes that are visited first will be expanded first.
Evaluation of the BFS Algorithm
i. Completeness: BFS is complete. If the goal node is at finite depth d, BFS will eventually find it after expanding all shallower nodes.
ii. Optimality
The shallowest goal node is not necessarily the optimal one; hence the BFS algorithm is optimal only if the path cost is a non-decreasing function of the depth of the node, e.g. when all the actions/moves have the same cost.
iii. Time complexity
Consider a state space where every state has b successors. The root of the search tree generates b nodes at level 1, b² at level 2, and b³ at level 3. Each of these generates b more nodes, and so on. If the solution is at level d, then in the worst case we would expand all but the last node at level d, giving

1 + b + b² + b³ + … + b^d = O(b^d)

generated nodes, i.e. exponential time complexity.
Every node that is generated must remain in memory, hence the space complexity also grows exponentially: BFS places a very high demand on memory.
Depth First Search (DFS)
The search proceeds to the deepest level of the search tree, where the nodes have no successors. As those nodes are expanded, they are dropped off and the search backs up to the next shallowest node that still has unexplored successors. DFS can be implemented using stacks or LIFO queues.
Comparison: BFS is complete while DFS may fail to terminate on infinite trees; neither is optimal in general; DFS requires far less memory than BFS, although its worst-case time is still exponential.
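The FIFO/LIFO distinction is the only difference between the two searches, which a sketch makes plain. The graph below is a hypothetical example; the letters are arbitrary node names:

```python
from collections import deque

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
         "D": [], "E": ["G"], "F": [], "G": []}

def bfs(start, goal):
    frontier, visited = deque([[start]]), set()
    while frontier:
        path = frontier.popleft()          # FIFO: shallowest path first
        node = path[-1]
        if node == goal:
            return path
        if node not in visited:
            visited.add(node)
            for nbr in graph[node]:
                frontier.append(path + [nbr])
    return None

def dfs(start, goal):
    frontier, visited = [[start]], set()
    while frontier:
        path = frontier.pop()              # LIFO: deepest path first
        node = path[-1]
        if node == goal:
            return path
        if node not in visited:
            visited.add(node)
            for nbr in graph[node]:
                frontier.append(path + [nbr])
    return None

print(bfs("A", "G"))  # ['A', 'B', 'E', 'G']
print(dfs("A", "G"))
```

Swapping `popleft()` for `pop()` turns the queue into a stack and BFS into DFS; everything else is identical.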
Heuristic Searches
A key component of a heuristic search is the heuristic function, denoted h(n): the estimated cost of the cheapest path from node n to a goal node. If n is the goal node, then h(n) = 0.
Greedy Best First Search (GBFS)
GBFS tries to expand the node that is closest to the goal, on the grounds that it is likely to reach a solution quickly. It evaluates nodes using the heuristic function alone: f(n) = h(n). It resembles DFS in the way it prefers to follow a single path all the way to the end, backing up when it hits a dead end. Just like DFS, it is neither optimal nor complete.
(Figure: an example search tree with the heuristic estimate h(n) marked at each node: 366, 253, 329, 374, 172, 380, 193, and 0 at the goal.)
A* Search
A* evaluates nodes by combining g(n), the cost to reach the node, and h(n), the estimated cost to get from the node to the goal:
f(n) = g(n) + h(n)
Since g(n) gives the path cost from the start node to node n, f(n) is the estimated cost of the cheapest solution through node n. Therefore, in trying to find the cheapest solution, we try the node with the lowest value of f(n); with an admissible heuristic, A* is optimal. From the figure, the route would be A-B-E-H (the cheapest route).
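A* can be sketched with a priority queue ordered by f(n) = g(n) + h(n). The edge costs and heuristic values below are hypothetical (the figure's original example did not survive), but they are chosen so that the cheapest route is A-B-E-H, matching the text:

```python
import heapq

graph = {"A": {"B": 1, "C": 4}, "B": {"D": 3, "E": 1},
         "C": {"E": 2}, "D": {}, "E": {"H": 2}, "H": {}}
h = {"A": 3, "B": 2, "C": 3, "D": 5, "E": 1, "H": 0}

def a_star(start, goal):
    # frontier entries are (f, g, path); heapq pops the lowest f first
    frontier = [(h[start], 0, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path, g
        for nbr, cost in graph[node].items():
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):   # better path to nbr
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h[nbr], g2, path + [nbr]))
    return None, None

path, cost = a_star("A", "H")
print(path, cost)  # ['A', 'B', 'E', 'H'] 4
```

The `best_g` map discards dominated paths, so each node is pushed only when a cheaper route to it is found.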
Learning
Forms of Learning
The field of machine learning distinguishes three forms of learning:
a. Supervised learning
b. Unsupervised learning
c. Reinforcement learning
The type of feedback available is usually the most important factor in determining the nature of the learning problem the agent faces.
1. Supervised Learning
Supervised learning involves learning a function from examples of its inputs and outputs, e.g. learning multiplication tables.
The correct output values are first provided, after which the learning agent can produce the correct output for what it perceives. For fully observable environments, an agent can observe the effects of its actions and hence can use supervised learning methods to learn to predict them.
(Diagram: inputs → learning function → outputs.)
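Supervised learning in miniature: a 1-nearest-neighbour learner that induces a function from labelled examples. The data points below are made up for illustration:

```python
# Labelled training examples: (input vector, output label).
train = [((1.0, 1.0), "yes"), ((1.2, 0.9), "yes"),
         ((4.0, 4.2), "no"),  ((3.8, 4.0), "no")]

def predict(x):
    """Return the label of the closest training example
    (squared Euclidean distance)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(train, key=lambda ex: dist(ex[0], x))[1]

print(predict((1.1, 1.0)))  # yes
print(predict((4.1, 3.9)))  # no
```

The correct outputs are supplied up front in `train`; the learned function then generalises them to unseen inputs, which is exactly the inputs → learning function → outputs picture.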
2. Unsupervised Learning
Unsupervised learning involves learning patterns in the input when no specific output values are supplied.
(Diagram: inputs → learning function.)
A purely unsupervised learning agent cannot learn what to do, because it has no information as to what constitutes a correct action or a desirable state. An example is conducting research.
3. Reinforcement Learning
Rather than being told what to do, the agent learns from reinforcement, such as a reward or its absence (this teaches behavioural skills, e.g. potty training, or promoting hard-working employees).
The design of a learning element is affected by three major concerns:
i. Which components of the performance element are to be learned?
ii. What feedback is available to learn these components?
iii. What representation is used for the components?
The components of a performance element in learning may include the following:
i. A direct mapping from conditions of the current state to actions.
ii. A means to infer relevant properties of the world being learned.
iii. Information about the way the world being learned responds and the results of possible actions.
iv. Information indicating the desirable states and actions.
viii. Reservation: whether we have made a reservation or not.
ix. Type: the kind of restaurant (e.g. Italian/French).
x. WaitEstimate: the wait estimated by the host (0-10 min, 10-30 min, 30-60 min, >60 min).
Example | Patrons | Type    | WaitEstimate
x1      | Some    | French  | 0-10
x2      | Full    | Thai    | 30-60
x3      | Some    | Burger  | 0-10
x4      | Full    | Thai    | 10-30
x5      | Full    | French  | >60
x6      | Some    | Italian | 0-10
x7      | None    | Burger  | 0-10
x8      | Some    | Thai    | 0-10
x9      | Full    | Burger  | >60
x10     | Full    | Italian | 10-30
x11     | None    | Thai    | 0-10
x12     | Full    | Burger  | 30-60
(Each example also records the attributes Alt, Bar, Fri/Sat, Hungry, Price, Rain and Reservation.)
The restaurant scenario is an example of a Boolean decision tree, which consists of a vector of input attributes X and a single Boolean output Y. A set of examples (x1, y1), …, (x12, y12) is as shown above. Decision trees are fully expressive within the class of propositional languages, since any Boolean function can be written as a decision tree. Positive examples are the ones in which the goal WillWait is true, e.g. x1, x3, x4, while the negative examples are the ones in which it is false. The complete set of examples is called the training set.
The idea behind the decision tree learning algorithm is to test the most important attribute first, i.e. the attribute that makes the most difference to the classification of the training examples. This hopes to get the correct classification with a small number of tests, implying that all paths in the tree will be short and the tree as a whole will be small, e.g. starting with Patrons and then Hungry, as opposed to starting with Type.
Testing the attribute Type first:
- Type = French: {x1, x5}
- Type = Italian: {x6, x10}
- Type = Thai: {x2, x4, x8, x11}
- Type = Burger: {x3, x7, x9, x12}

Testing the attribute Patrons first:
- Patrons = None: {x7, x11} → negative (No)
- Patrons = Some: {x1, x3, x6, x8} → positive (Yes)
- Patrons = Full: {x2, x4, x5, x9, x10, x12} → mixed; test Hungry next:
  - Hungry = Yes: {x2, x4, x10, x12}
  - Hungry = No: {x5, x9}
Type is a poor attribute because it leaves us with four outcomes, each with the same number of positive and negative examples. Patrons is a fairly important attribute: if its value is None or Some, then we are left with example sets which we can classify definitively.
Considerations for the recursive algorithm include:
i. If there are some positive and some negative examples, choose the best attribute to split them on.
ii. If all remaining examples are positive or all are negative, then we can answer yes or no (true or false).
iii. If there are no examples left, it means that no such example has been observed, and we return a default value calculated from the majority classification at the node's parent.
iv. If there are no attributes left but both positive and negative examples remain, then there is a problem: the examples have the same descriptions but different classifications, as a result of incorrect data or of attributes that do not give enough information to describe the situation fully.
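The recursive algorithm can be sketched as a compact ID3-style learner that picks the attribute with the highest information gain at each node. The five training rows below are a made-up miniature of the restaurant data, not the table above:

```python
from collections import Counter
import math

data = [
    {"patrons": "some", "hungry": "yes", "wait": "yes"},
    {"patrons": "none", "hungry": "no",  "wait": "no"},
    {"patrons": "full", "hungry": "yes", "wait": "yes"},
    {"patrons": "full", "hungry": "no",  "wait": "no"},
    {"patrons": "some", "hungry": "no",  "wait": "yes"},
]

def entropy(rows):
    counts = Counter(r["wait"] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, attr):
    """Information gain = entropy before split - weighted entropy after."""
    parts = Counter(r[attr] for r in rows)
    rem = sum(n / len(rows) * entropy([r for r in rows if r[attr] == v])
              for v, n in parts.items())
    return entropy(rows) - rem

def learn(rows, attrs):
    labels = {r["wait"] for r in rows}
    if len(labels) == 1 or not attrs:       # pure node, or no attributes left:
        return Counter(r["wait"] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a))   # most important attribute
    return {best: {v: learn([r for r in rows if r[best] == v],
                            [a for a in attrs if a != best])
                   for v in {r[best] for r in rows}}}

tree = learn(data, ["patrons", "hungry"])
print(tree)
```

On this data the learner picks Patrons at the root (its gain is higher than Hungry's) and only tests Hungry inside the Full branch, reproducing the shape argued for above.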
(Figure: a learning curve — prediction quality on the test set plotted against training set size, from 10 to 80 examples.)
As the training set grows, prediction quality increases. NB: the learning algorithm must not be allowed to see the test data before the learned hypothesis is tested on them.
D.I.Y.: research the terms "noise" and "overfitting" with regard to problems that arise when using decision trees for training, and how to minimise them.
In order to extend decision trees to a wider variety of problems, the following issues must be addressed:
- Missing data: in many domains, not all attribute values will be known for every example. The values might not have been recorded, or might be too expensive to obtain.
- Multivalued attributes: when an attribute has many possible values, the information gain measure gives an inappropriate indication of the attribute's usefulness.
- Continuous and integer-valued input attributes: these have an infinite set of possible values that would generate infinitely many branches. Typically, we find the split point that gives the highest information gain.