Você está na página 1de 9

kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.

org/print/2004/2/25/124713/784

A Prolog Introduction for Hackers


By tkatchev in Technology
Thu Feb 26, 2004 at 05:03:23 PM EST
Tags: Software (all tags)
Strange, but true: Prolog is, without a doubt, currently the simplest and
the most straightforward programming language of all mainstream
programming languages; however, the special interests of academia and
inept teaching have given it a horrible, pariah-like reputation. (After all,
you cannot write a PhD thesis explaining obvious, practical things.) This
article aims to ameliorate the situation; to introduce the practical
simplicity of Prolog to those that might normally sneer at what they
consider a horrible, convoluted playing field of doctorate theory.

The Prolog approach

Prolog is, essentially, a query language for databases, like SQL. However,
unlike SQL, which is a limited query language for relational databases,
(tables of rows and columns, much like a spreadsheet) Prolog is a query
language for matching complicated patterns against a database of simple
facts.

Thus, all Prolog programs consist of three parts: a list of facts, a list of
pattern matching rules (sometimes also called predicates in Prolog jargon)
and a list of queries. (Also sometimes called goals in Prolog jargon.)

Prolog facts

Facts, in Prolog, are pre-defined patterns that get stored in Prolog's


internal database, usually in a manner to make searching them more
efficient.

There are, essentially, three types of values in Prolog:

Numbers.

Ex: 1, -2, 0.5, -0.776.

Symbols, which are, for all intents and purposes, immutable strings
of lower-case letters without special characters or spaces. (We'll
explain why symbols must be lower-case later.)

Ex: hello, world, this_is_a_symbol, atom3.

Linked lists of symbols or numbers. Lists are untyped.

Ex: [hello, cruel, world], [1, 2, 3], [1, hello, 2, world], [].

1 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

Modern, practical Prolog implementations define many more useful


datatypes; however, this is implementation-dependent and doesn't really
matter as far as this article is concerned. Look up your Prolog
implementation's manual if you are interested.

Facts, or pre-defined patterns, are written using the standard notation of


functions or procedures in other programming languages. The symbol
before the opening parenthesis is the pattern's name, with a list of
comma-separated values inside the parentheses.

Ex: f(hello), greeting_message(hello, world), g([hello, world]), fac(3, 6).

Note that the following is illegal in Prolog: f(). Patterns without arguments
are written without parentheses, like this: f.

Also note that pattern arguments can have any of Prolog's datatypes, thus
symbol, number and list arguments are allowed.

Thus, to define a fact in Prolog, all you need to do is to write a Prolog


program that lists pre-defined patterns with a period (full-stop) after each
entry.

Example of a Prolog program that defines several facts:


hello.
world.

f(hello, world).
g([hello, world]).
standard_greeting([hello, world], 2).

This simple program inserts five pre-defined patterns into Prolog's internal
database. (hello and world, two patterns without arguments; f(hello, world),
a pattern f with two arguments; g([hello, world]), a pattern g with one
argument, a list; and standard_greeting([hello, world], 2), a pattern
standard_greeting with 2 arguments, a list and a number)

When several patterns are defined with the same name and the same
number of arguments, Prolog will run through them, one after another in
a top-to-down fashion, when trying to match them. (You can think of this
as a short-circuited "OR" of pattern matching rules.)

Defining pattern-matching rules

Defining pattern-matching rules in Prolog is equally simple:


f(hello, world) :- g([hello, world]).

Whenever Prolog sees the special symbol :-, Prolog creates a new pattern-
matching rule. The basic meaning of :- is very simple: to match whatever
is to left of the :-, the part to the right must be matched. This allows to
"decompose" complex patterns into smaller, more manageable ones.

To make the task practical, Prolog defines many operators that help us in
the task of composing pattern-matching rules. Some of the more
important and useful are:

,: A comma denotes sequential matching of patterns; this is

2 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

equivalent to a "short-circuited AND" in many imperative


programming languages. (C's &&, for example.)

Ex:f :- a, b, c.

This means that to match the pattern f, we need to match, in order,


patterns a, b and c.

;: A semi-colon denotes choice; this is equivalent to a "short-


circuited OR" in many imperative programming languages. (C's ||,
for example.)

Ex:f :- p; q; r.

This means that to match the pattern f, either p must be matched,


or, if Prolog fails to match p, try to match q; if matching q fails,
finally try matching r.
Note that the semi-colon is essentially equivalent to listing patterns
on separate lines; thus,
f :- (p; q).
is equivalent to
f :- p.
f :- q.

->: An arrow denotes a conditional pattern rule, in other words, an


"if-then-else" rule.

Ex:f :- (g -> h; i).

This code means that Prolog first tries to match pattern g; if the
pattern can be matched, try to match the pattern h. If g cannot be
matched, try to match the pattern i.
Note that the construct must be enclosed in parentheses, due to
strangeness in Prolog's syntax rules.

\+:This is equivalent, in a sense, to the negation operator in many


programming languages. (Like the C !.)

Ex:f :- \+ g.

This code means that the pattern f matches whenever the pattern g
cannot be matched.

Variables

This is all very easy to understand, but is, unfortunately, useless in


real-world applications. What we lack are variables. Variables in Prolog are
symbols that begin with a capital letter; for example: Var, A, Q, MATCH_ME.

Whenever Prolog comes upon a variable, it starts searching in its internal


database of facts and pattern-matching rules for a substitution such that
substituting a value for the variable matches some fact.

A simple example will illustrate the concept better. Consider the following
Prolog program:
i_know(hello).
i_know(world).

is_phrase(A, B) :- i_know(A), i_know(B).

3 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

is_greeting(A, B) :- is_phrase(A, B), A = hello.

This program defines two facts (i_know(hello) and i_know(world)) and two
pattern-matching rules.

This rule, which is named is_phrase and


is_phrase(A, B) :- i_know(A), i_know(B).
which accepts two arguments, tries to find substitutions for the two
variables it uses (A and B) such that the pattern on the right side of the :-
matches against Prolog's internal fact database.

In this particular case, Prolog will find the following substitutions:

A=hello, B=hello
A=hello, B=world
A=world, B=world
A=world, B=hello

is_greeting(A, B) :- is_phrase(A, B), A = hello. Again, a pattern of two


arguments. As before, Prolog will try to find substitutions for the two
variables A and B such that the pattern rule matches.

The new concept here is the operator =. This operator is Prolog's


equivalent of variable assignment.
P=Q means that Prolog will find substitutions for variables such that the two
arbitrary patterns to the left and to the right of the = are exactly equal.
Note that in this operator Prolog doesn't care whether or not the two
patterns can be matched against its internal database; what matters is
that the two patterns become equal after = finished its work. The =
operator is commutative; thus, A=B and B=A mean the same thing. If such a
substitution cannot be found, the = operator will fail to match. For
example, hello=world will always fail to match.
Thus, after executing i_know(A)=i_know(foo) A will be substituted with foo even
though i_know(foo) does not match against Prolog's internal database. (By
the way, this procedure is often called unification in Prolog jargon; thus,
A=hello means that A will be unified with hello.)

Finally, we can figure out what the pattern is_greeting(A, B) does. Here,
Prolog searches for substitutions for A and B such that a match against the
known facts is found.
Expanding all the pattern-matching rules, Prolog will find the following
substitutions:

A=hello, B=hello
A=hello, B=world

As you can see, using just a few basic Prolog operators and Prolog's
advanced search engine, we can build pattern-matching rules of arbitrary
complexity. It is here that Prolog's power really shines though: Prolog
allows to build very complicated matching rules very easily. Thus, Prolog is
designed for use in applications where we have just a few very simple
facts, but a lot of very complex search rules.

Contrast this with SQL, which has been designed for the opposite
situation: a great amount of very complex data with very simple, very
basic search rules.

Programming in Prolog

4 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

While the basic Prolog engine is enough to perform arbitrarily complex


searches, it is not enough to use Prolog as a general-purpose
programming language.

At the very least, what we miss is a way to do basic input/output, a way


for handling arithmetic and a way for doing loops.
In true Prolog spirit, the designers of the language decided to not
complicate the language with unnecessary constructs and unnecessary
syntax, but instead to write simple, basic hacks that integrate very well
with the basic Prolog query language. (Contrast this with Oracle's PL/SQL,
for those that know it.)

Input and output in Prolog is done with special pattern rules that always
match, and produce output or return input as a side effect. Since there are
a great number of implementation-specific input and output functions, I
will describe two of the most basic, to give the general idea. If you want to
learn more, consult the manual of your Prolog implementation.

Outputting a value is very simple: the pattern-matching rule write(A) will


output its argument. This pattern-matching rule always matches. For
example, write_greeting(A, B) :- is_greeting(A, B), write(A), nl, write(B), nl.
will output two words on the screen, provided that the two words are a
greeting. (The nl pattern simply outputs a newline character.)

Basic input is equally simple: the pattern-matching rule read(A) will ask the
user to input a value, using Prolog syntax, substitute the typed value for A
and match successfully. For example, this simple pattern-matching rule of
zero arguments will simply output whatever value the user typed: echo :-
read(A), write(A), nl.

For arithmetic, Prolog uses a special operator called is. This operator is
just like the = operator, except that on the right side must be an
arithmetic expression, not a pattern. For example, A is B + 1 will substitute
a value for A that is equal to B + 1. Due to the special syntax of the is
command, it is not commutative. Thus, B + 1 is A will give an error. This
means that there is always a variable to the left of the is.

Note, however, that the is operator is still nothing but a special pattern-
matching rule; so, for example, A is A + 1 is an invalid expression that will
likely give an error, since no number can be substituted such that the
number becomes equal to itself incremented by one. If this makes little
sense, try making some simple substitutions in your head: 5 is 5 + 1
violates the basic rules of arithmetic.

Loops in Prolog are very simple; like other well-known programming


languages, looping is done by writing recursive pattern-matching rules.

An example: infinite_loop(A) :- write(A), nl, infinite_loop(A). This rule will


run an infinite loop that prints its argument infinitely many times. Using
recursive applications of pattern-matching rules, any looping construct can
be expressed. Recursion is equivalent to looping, as any programmer of
functional languages knows.

Prolog queries

5 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

Queries, or goals, are a way for the user to interact with Prolog's internal
database and pattern-matching rules. Almost all Prolog implementations
include an interactive shell where the user can type arbitrary patterns;
Prolog then tries to match the given pattern, outputting whatever variable
substitutions are needed to match the pattern. This provides an easy and
powerful way to interact with Prolog in real-time, typing in search queries
and getting back results immediately.

Note, however, that the majority of Prolog implementations do not allow


defining new patterns interactively; a Prolog program must be written
using a separate editor and loaded into the interactive Prolog shell.

Query syntax, like almost everything in Prolog, is elegantly simple. Here is


an example of an interaction, based on the previously shown program:
Ciao-Prolog 1.9 #44: Mon Dec 30 16:47:15 2002
?- ensure_loaded('i:/ivan/foo.pl').

yes
?- f(hello).

yes
?- f(foo).

no
?- is_phrase(hello, world).

yes
?- is_phrase(hello, _).

yes
?- is_greeting(A, B).

A = hello,
B = hello ? ;

A = hello,
B = world ? ;

no
?- is_greeting(_, B).

B = hello ? ;

B = world ? ;

no
?- hello.

yes
?- foo.
{ERROR: user:foo/0 - undefined predicate}

no
?-

The ?- is the standard Prolog prompt; it means that the user is invited to
type in a Prolog pattern, ending with a period, as all Prolog statements.
Note, also, that Prolog returns either a yes or a no after each query; a yes
means that Prolog was able to match the query against its internal
database; a no means, respectively, that Prolog was unable to find a
match. (Notice that trying to use an undefined pattern foo caused a Prolog
error.) When Prolog encounters variables in the query (like A and B in this
example) Prolog prompts us before returning each found value for the
variable, one by one. Finally, there is a special variable called _, which
means that we do not care about the values of this variable and do not
want them printed.

6 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

An extended example

To solidify your understanding of the abstract underpinnings of Prolog,


here are a few very simple programs in Prolog. (In Prolog, the % character
denotes comments.)

Calculating the factorial:


% fac(N,R) : R is the factorial of N.

fac(0,1).
fac(N,R) :- P is N-1, fac(P,Q), R is N*Q.

Using this pattern is simple: typing the query fac(5,R)., for example, gives
the result: R = 120. When playing around a little, though, many deficiencies
start becoming apparent, including unwanted infinite loops and the fact
that fac(N,120) doesn't give the expected result. The reason for this is the
fact that Prolog is very ill-suited for numerical computation; arithmetic in
Prolog is a hack, and doesn't integrate well into the Prolog search engine
too well.

Addendum: Here is a tail-recursive version of the factorial pattern.


% fac2(N,A,R) : R is the factorial of N; A is the "accumulator" value.

fac2(0,R,R).
fac2(N,Acc,R) :- P is N-1, Q is N*Acc, fac2(P,Q,R).

Usage: fac2(5,1,R).

Reversing a list:
% cat(A, B, C) : C is the concatenation of lists A and B.

cat([], R, R).
cat([H|T1], Z, [H|T2]) :- cat(T1, Z, T2).

% rev(A, B) : B is the list A when reversed.

rev([], []).
rev([H|T], Q) :- rev(T,P), cat(P,[H],Q).

(Here we first encounter the special Prolog syntax for handling lists; the
special system pattern [H|T] matches any list whose first element is H with
T being the rest of the list without the first element.)
Using this pattern is simple: rev([1, 2, 3],R). In other words, find all such R
that rev([1, 2, 3],R) matches the Prolog pattern-match rule database.
Notice, however, that rev(R,[1,2,3]) gives us the same result! This is
obvious if you think a little bit about the nature of Prolog's pattern-
matching engine.

As an example, here is a second, simpler version of rev that doesn't rely


on concatenating lists:
% rev2(A, B, C) : C is the reversal of the list A, C is the "accumulator"
% for growing lists.

rev2([], R, R).
rev2([H|T], Z, R) :- rev2(T, [H|Z], R).

Here, the second parameter of the pattern-matching rule is used as an


"accumulator" for holding intermediate values. Using this rule is simple:
rev2([1, 2, 3], [], R). Look at how smart the Prolog pattern-matching is

7 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

when it processes lists and other symbolic data: not only does rev2(A, [],
[1, 2, 3]) produce the expected results, but rev2([1, 2, 3], B, [3, 2, 1])
produces B = [], and rev2([1, 2, 3], A, [3, 2]) returns us a pattern-match
failure.

Addendum

Wait one second, Prolog is a logic-based language!

No, it is not. Prolog has several system patterns that were (very) loosely
inspired by formal logic; these were designed to ease the use of Prolog for
those people that are already familiar with formal logic. However, if you
persist with your delusion that Prolog is a language for "handling logic
predicates", you will get bitten sooner or later! (And probably sooner than
later.)

Standard Prolog

Prolog is a very established, industrial-strength, popular language. As


such, there is a very clear and formal ISO standard for Prolog interpreters
and compilers. You can view the standard for ISO Prolog here, for
example.

Strange language

Prolog programmers and implementors simply love using non-standard,


confusing and sometimes plain wrong language. Do not be afraid of this
peculiar trait; when encountering strange or confusing terms, be aware
that 90% of the time a very simple and down-to-earth concept is hiding
behind it.

The "cut" operator

The so-called "cut" operator (written as !) is a pre-defined operator in


Prolog that allows the programmer to subtly tweak the search strategy
used by the Prolog matching engine; while this operator allows some neat
programming tricks and optimisation techniques, I do not advise anyone
ever to use this operator. It is a dirty, very confusing and dangerous hack
that will inevitably make your program impossible to read and introduce
subtle bugs. In short, avoid the "cut" like the plague.

Open-Source Prolog

Prolog is a programming language with a great deal of choice as far as


implementations are concerned; there are lots and lots of good,
high-quality implementations of Prolog that are Open-Source.

Try it yourself

You can try interacting with a Prolog shell without installing a Prolog
implementation here. (The link points to tuProlog, an Open-Source Prolog
written on top of the JVM and embeddable into standard Java
applications.)

For fun

Just in case you are wondering how Prolog can be used in the "real world",

8 of 9 2/3/2010 7:59 AM
kuro5hin.org || technology and culture, from the trenches http://www.kuro5hin.org/print/2004/2/25/124713/784

take a look at this simple rouguelike game written entirely in Prolog:


Caves of Golorp.

Though of course, Prolog is very useful for non-toy applications as well.


Browsing through the site of any commercial Prolog vendor, (and there
are lots of them) you will inevitably stumble upon a page listing many
serious, industrial-grade applications in Prolog.

Warning to Americans

Prolog is decidedly a European language; not only was it originally


invented by a Frenchman, the majority of Prolog implementations are
developed and used in Europe; and even in a purely academic setting,
Prolog is much more popular as a teaching language in European
universities when compared to higher education in the U.S.

Despite this, do not be afraid; Prolog is a very useful tool in its own right.
Good luck.

Full discussion: http://www.kuro5hin.org/story/2004/2/25/124713/784

All trademarks and copyrights on this page are owned by their respective companies. The Rest © 2000 - 2006 Kuro5hin.org Inc.

9 of 9 2/3/2010 7:59 AM