based on the books by Sudkamp and by Hopcroft, Motwani and Ullman
Contents

1 Introduction
  1.1 Sets
  1.2 Functions and Relations
  1.3 Countable and Uncountable Sets
  1.4 Proof Techniques

6 Turing Machines
  6.1 The Standard Turing Machine
      6.1.1 Notation for the Turing Machine
  6.2 Turing Machines as Language Acceptors
  6.3 Alternative Acceptance Criteria
  6.4 Multitrack Machines

8 Decidability
  8.1 Decision Problems
  8.2 The Church-Turing Thesis
  8.3 The Halting Problem for Turing Machines
  8.4 A Universal Machine
  8.5 The Post Correspondence Problem

9 Undecidability
  9.1 Problems That Computers Cannot Solve
      9.1.1 Programs that Print Hello, World
      9.1.2 The Hypothetical Hello, World Tester
      9.1.3 Reducing One Problem to Another
  9.2 A Language That Is Not Recursively Enumerable
      9.2.1 Enumerating the Binary Strings
      9.2.2 Codes for Turing Machines
      9.2.3 The Diagonalization Language
      9.2.4 Proof that Ld is not Recursively Enumerable
      9.2.5 Complements of Recursive and RE Languages
      9.2.6 The Universal Language
      9.2.7 Undecidability of the Universal Language
  9.3 Undecidable Problems About Turing Machines
      9.3.1 Reductions
      9.3.2 Turing Machines That Accept the Empty Language
      9.3.3 Rice's Theorem and Properties of RE Languages
  9.4 Post's Correspondence Problem
      9.4.1 The Modified PCP
  9.5 Other Undecidable Problems
      9.5.1 Undecidability of Ambiguity for CFGs
      9.5.2 The Complement of a List Language

10 Intractable Problems
  10.1 The Classes P and NP
      10.1.1 Problems Solvable in Polynomial Time
      10.1.2 An Example: Kruskal's Algorithm
      10.1.3 An NP Example: The Travelling Salesman Problem
      10.1.4 NP-complete Problems
      10.1.5 The Satisfiability Problem
      10.1.6 NP-Completeness of 3SAT

List of Figures

10.1 A graph
Chapter 1
Introduction
1.1 Sets
A set is a collection of elements. To indicate that x is an element of the set S, we write x ∈ S. The
statement that x is not in S is written as x ∉ S. A set is specified by enclosing some description of
its elements in curly braces; for example, the set of all natural numbers 0, 1, 2, … is denoted by

N = {0, 1, 2, 3, …}.

We use ellipses (i.e., …) when the meaning is clear; thus Jn = {1, 2, 3, …, n} represents the set of all
natural numbers from 1 to n.

When the need arises, we use more explicit notation, in which we write

S = {i | i > 0, i is even}

for the last example. We read this as: S is the set of all i such that i is greater than zero and i is
even.

Considering a universal set U, the complement S̄ of S is defined as

S̄ = {x | x ∈ U and x ∉ S}
The usual set operations are union (∪), intersection (∩), and difference (−), defined as

S1 ∪ S2 = {x | x ∈ S1 or x ∈ S2}
S1 ∩ S2 = {x | x ∈ S1 and x ∈ S2}
S1 − S2 = {x | x ∈ S1 and x ∉ S2}

The set with no elements, called the empty set, is denoted by ∅. It is obvious that

S ∪ ∅ = S − ∅ = S
S ∩ ∅ = ∅
∅̄ = U

and the complement of S̄ is S.
Two useful identities, known as De Morgan's laws, relate complementation to union and intersection:

1. the complement of S1 ∪ S2 equals S̄1 ∩ S̄2,
2. the complement of S1 ∩ S2 equals S̄1 ∪ S̄2.

To prove the first law, note that

x ∈ the complement of (S1 ∪ S2)
⇔ x ∈ U and x ∉ S1 ∪ S2
⇔ x ∈ U and not (x ∈ S1 or x ∈ S2)          (def. union)
⇔ x ∈ U and (x ∉ S1 and x ∉ S2)             (negation of disjunction)
⇔ (x ∈ U and x ∉ S1) and (x ∈ U and x ∉ S2)
⇔ x ∈ S̄1 and x ∈ S̄2                         (def. complement)
⇔ x ∈ S̄1 ∩ S̄2                               (def. intersection)

If S1 ∩ S2 = ∅, the sets S1 and S2 are said to be disjoint.
Example 1.1.1

Let S = {1, 2, 3}. Its power set, the set of all subsets of S, is

2^S = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}

Here |S| = 3 and |2^S| = 8. This is an instance of a general result: if S is finite, then

|2^S| = 2^|S|

Induction Hypothesis: Assume the property holds for all sets with k elements.
Induction Step: Show that the property holds for all sets with k + 1 elements. Denote such a set by
S = Sk ∪ {y}, where Sk = {y1, y2, y3, …, yk}. Every subset of S either is a subset of Sk or is a subset of
Sk with y added, so |2^S| = 2 · |2^Sk| = 2 · 2^k = 2^(k+1).
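The identity |2^S| = 2^|S| is easy to check computationally for small sets. A minimal sketch in Python; the helper name powerset is ours, not from the text:

    from itertools import chain, combinations

    def powerset(s):
        """Return all subsets of the finite set s, each as a frozenset."""
        items = list(s)
        return {frozenset(c) for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1))}

    S = {1, 2, 3}
    print(len(powerset(S)) == 2 ** len(S))   # True: |2^S| = 2^|S|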
A set which has as its elements ordered sequences of elements from other sets is called the Cartesian
product of the other sets. For the Cartesian product of two sets, which itself is a set of ordered pairs,
we write

S = S1 × S2 = {(x, y) | x ∈ S1, y ∈ S2}

Example 1.1.2

Let S1 = {1, 2} and S2 = {1, 2, 3}. Then

S1 × S2 = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)}

Note that the order in which the elements of a pair are written matters; the pair (3, 2) is not in S1 × S2.

Example 1.1.3

Let A = {head, tail}. Then

A × A = {(head,head), (head,tail), (tail,head), (tail,tail)}

In general,

S1 × S2 × ⋯ × Sn = {(x1, x2, …, xn) | xi ∈ Si}
1.2 Functions and Relations

We write

f : S1 → S2

to indicate that the domain of the function f is a subset of S1 and that the range of f is a subset of
S2. If the domain of f is all of S1, we say that f is a total function on S1; otherwise f is said to be a
partial function on S1.

5. f : S1 → S1 is called a function on S1.
7. f is a total function if Df = S1.
8. f is a partial function if Df ⊂ S1.
A relation r on a set X is an equivalence relation if it satisfies the reflexivity rule:

(x, x) ∈ r for all x ∈ X,

the symmetry rule: if (x, y) ∈ r then (y, x) ∈ r, and
the transitivity rule: if (x, y) ∈ r and (y, z) ∈ r then (x, z) ∈ r.

An equivalence relation on X induces a partition of X into disjoint subsets called equivalence classes
Xj, with ∪j Xj = X, such that elements from the same class belong to the relation, and any two elements
taken from different classes are not in the relation.
Example 1.2.1

Congruence modulo m is an equivalence relation on the integers; its equivalence classes are

{…, −2m, −m, 0, m, 2m, …}
{…, −2m + 1, −m + 1, 1, m + 1, 2m + 1, …}
{…, −2m + 2, −m + 2, 2, m + 2, 2m + 2, …}
⋮
{…, −m − 1, −1, m − 1, 2m − 1, 3m − 1, …}
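Grouping a finite window of integers by their residue reproduces these classes. A small illustrative sketch; the choice m = 3 and the window are ours:

    m = 3
    window = range(-2 * m, 2 * m + 1)

    classes = {}
    for n in window:
        # n and k are equivalent iff n mod m == k mod m (congruence modulo m)
        classes.setdefault(n % m, []).append(n)

    for residue, members in sorted(classes.items()):
        print(residue, members)
    # residue 0: [-6, -3, 0, 3, 6]; residue 1: [-5, -2, 1, 4]; residue 2: [-4, -1, 2, 5]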
1.3 Countable and Uncountable Sets
Example 1.3.1
The set J = N {0} is countably infinite; the function s(n) = n + 1 defines a one-to-one mapping
from N onto J . The set J , obtained by removing an element from N , has the same cardinality as
N . Clearly, there is no one to one mapping of a finite set onto a proper subset of itself. It is this
property that differentiates finite and infinite sets.
Example 1.3.2

The set of odd natural numbers is denumerable. The function f(n) = 2n + 1 establishes the bijection
between N and the set of the odd natural numbers.

A one-to-one correspondence between the natural numbers and the set of all integers exhibits
the countability of the set of integers. A correspondence is defined by the function

f(n) = ⌊n/2⌋ + 1  if n is odd
f(n) = −⌊n/2⌋     if n is even
Example 1.3.3

#Q⁺ = #J = #N

Q⁺ is the set of the rational numbers p/q > 0, where p and q are integers, q ≠ 0.
hypothesis to Pn+1 is the induction step. Inductive arguments become clearer if we explicitly show
these three parts.
(b) For the induction hypothesis, we assume that the property holds for n = k:

∑_{i=0}^{k} i² = k(k+1)(2k+1)/6

(c) In the induction step, we show that the property then holds for n = k + 1; i.e., that

∑_{i=0}^{k+1} i² = (k+1)(k+2)(2k+3)/6

Since

∑_{i=0}^{k+1} i² = ∑_{i=0}^{k} i² + (k+1)²,

the induction hypothesis gives

∑_{i=0}^{k+1} i² = k(k+1)(2k+1)/6 + (k+1)² = (k+1)(k+2)(2k+3)/6,

which establishes the property for n = k + 1.
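The closed form can also be checked numerically for small n; a throwaway sketch:

    def sum_of_squares(n):
        return sum(i * i for i in range(n + 1))

    def closed_form(n):
        return n * (n + 1) * (2 * n + 1) // 6

    print(all(sum_of_squares(n) == closed_form(n) for n in range(100)))   # True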
In a proof by contradiction, we assume the opposite or contrary of the property to be proved; then
we prove that the assumption is invalid.

Example 1.4.2

Show that √2 is not a rational number.

As in all proofs by contradiction, we assume the contrary of what we want to show. Here we assume
that √2 is a rational number so that it can be written as

√2 = n/m,

where n and m are integers without a common factor. Rearranging (squaring both sides of √2 = n/m), we have

2m² = n²

Therefore n² must be even. This implies that n is even, so that we can write n = 2k, or

2m² = 4k²

and

m² = 2k²
Therefore m is even.
But this contradicts our assumption that n and m have no common factor.
Thus, m and n in √2 = n/m cannot exist, and √2 is not a rational number.
This example exhibits the essence of a proof by contradiction. By making a certain assumption we
are led to a contradiction of the assumption or some known fact. If all steps in our argument are
logically sound, we must conclude that our initial assumption was false.
To illustrate Cantor's diagonalization method, we prove that the set A = {f | f is a total function,
f : N → N} is uncountable. This is essentially a proof by contradiction; so we assume that A
is countable, i.e., we can give an enumeration f0, f1, f2, … of A. To come to a contradiction, we
construct a new function f̄ as

f̄(x) = fx(x) + 1, for all x ∈ N.

The function f̄ is constructed from the diagonal of the function values of the fi ∈ A, as represented in
the figure below. For each x, f̄ differs from fx on input x. Hence f̄ does not appear in the given
enumeration. However, f̄ is total and f̄ : N → N. Hence the set A is uncountable, since such an f̄ can
be given for any chosen enumeration.

Therefore A cannot be enumerated; hence A is uncountable.
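The construction of f̄ can be mimicked on any finite prefix of a proposed enumeration. A purely conceptual sketch; the three sample functions are ours:

    # A (finite prefix of a) purported enumeration f0, f1, f2, ... of total functions N -> N.
    enumeration = [
        lambda x: x,          # f0
        lambda x: 2 * x,      # f1
        lambda x: x * x + 1,  # f2
    ]

    def f_bar(x):
        # Differs from f_x on input x, so it cannot equal any f_x in the list.
        return enumeration[x](x) + 1

    for x in range(len(enumeration)):
        print(f_bar(x), enumeration[x](x))   # f_bar(x) != f_x(x) for every listed x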
Remarks:
The set of all infinite sequences of 0s and 1s is uncountable. With each infinite sequence of 0s and
1s we can associate a real number in the range [0, 1). As a consequence, the set of real numbers in
the range [0, 1) is uncountable. Note that the set of all real numbers is also uncountable.
Chapter 2

Languages and Grammars

2.1 Languages

We start with a finite, nonempty set Σ of symbols, called the alphabet. From the individual symbols
we construct strings (over or on Σ), which are finite sequences of symbols from the alphabet.
The empty string ε is a string with no symbols at all. Any set of strings over/on Σ is a language
over/on Σ.
Example 2.1.1

Σ = {c}
L1 = {cc}
L2 = {c, cc, ccc}
L3 = {w | w = c^k, k = 0, 1, 2, …}

Example 2.1.2

Σ = {a, b}
L1 = {ab, ba, aa, bb}
L2 = {w | w = (ab)^k, k = 0, 1, 2, 3, …}
   = {ε, ab, abab, ababab, …}
The concatenation of two strings w and v is the string obtained by appending the symbols of v to the
right end of w, that is, if
w = a 1 a2 . . . an
and
v = b 1 b2 . . . bm ,
then the concatenation of w and v, denoted by wv, is
wv = a1 a2 . . . an b1 b2 . . . bm
If w is a string, then w^n is the string obtained by concatenating w with itself n times. As a special case,
we define

w^0 = ε,
for all w. Note that εw = wε = w for all w. The reverse of a string is obtained by writing the symbols
in reverse order; if w is a string as shown above, then its reverse w^R is

w^R = an … a2 a1

If

w = uv,

then u is said to be a prefix and v a suffix of w.

The length of a string w, denoted by |w|, is the number of symbols in the string.
Note that

|ε| = 0

If u and v are strings, then the length of their concatenation is the sum of the individual lengths,

|uv| = |u| + |v|
Let us show that |uv| = |u| + |v|. To prove this by induction on the length of strings, let us define the
length of a string recursively, by

|a| = 1
|wa| = |w| + 1

for all a ∈ Σ and any string w on Σ. This definition is a formal statement of our intuitive under-
standing of the length of a string: the length of a single symbol is one, and the length of any string
is incremented by one if we add another symbol to it.

Basis: |uv| = |u| + |v| holds for all u of any length and all v of length 1 (by definition).

Induction Hypothesis: we assume that |uv| = |u| + |v| holds for all u of any length and all v
of length 1, 2, …, n.

Induction Step: let v be a string of length n + 1, so v = wa for some string w of length n and some
a ∈ Σ. Then |v| = |w| + 1, and |uv| = |uwa| = |uw| + 1 = |u| + |w| + 1 = |u| + |v|.
If Σ is an alphabet, then we use Σ* to denote the set of strings obtained by concatenating zero or
more symbols from Σ, and Σ+ to denote the set Σ* − {ε} of nonempty strings. The sets Σ* and Σ+
are always infinite, since there is no limit on the length of the strings in these sets.

A language can thus be defined as a subset of Σ*. A string w in a language L is also called a word or
a sentence of L.
Example 2.1.3

Let Σ = {a, b}. Then Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …} is a language on Σ, and

L = {a^n b^n | n ≥ 0}

is also a language on Σ. The strings aabb and aaaabbbb are words in the language L, but the string
abb is not in L. This language is infinite.

Since languages are sets, the union, intersection, and difference of two languages are immediately
defined. The complement of a language is defined with respect to Σ*; that is, the complement of L is

L̄ = Σ* − L
The concatenation of two languages L1 and L2 is the set of all strings obtained by concatenating any
element of L1 with any element of L2; specifically,

L1 L2 = {xy | x ∈ L1 and y ∈ L2}

We define L^n as L concatenated with itself n times, with the special case

L^0 = {ε}

for every language L.
Example 2.1.4

For L1 = {a, aaa} and L2 = {b, bbb}, the concatenation is

L1 L2 = {ab, abbb, aaab, aaabbb}
Example 2.1.5

For

L = {a^n b^n | n ≥ 0},

then

L² = {a^n b^n a^m b^m | n ≥ 0, m ≥ 0}
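Concatenation and powers of finite languages can be computed directly. A small sketch; the helper names concat and power are ours:

    def concat(L1, L2):
        """Concatenation of two finite languages."""
        return {x + y for x in L1 for y in L2}

    def power(L, n):
        """L^n, with L^0 = {ε} represented by the empty string."""
        result = {""}
        for _ in range(n):
            result = concat(result, L)
        return result

    L1, L2 = {"a", "aaa"}, {"b", "bbb"}
    print(sorted(concat(L1, L2)))          # ['aaab', 'aaabbb', 'ab', 'abbb']
    print(power({"ab"}, 0), power({"ab"}, 2))   # {''} {'abab'}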
1. ∅, ε (representing {ε}), and a ∈ Σ (representing {a}) are regular expressions. They are called
primitive regular expressions.

3. A string is a regular expression if it can be derived from the primitive regular expressions by
applying a finite number of the operations +, * and concatenation.

A regular expression denotes a set of strings, which is therefore referred to as a regular set or language.
Regarding the notation of regular expressions, texts will usually print them boldface; however, we
assume that it will be understood that, in the context of regular expressions, ε is used to represent
{ε} and a is used to represent {a}.
Example 2.2.1
Example 2.2.2
Beyond the usual properties of + and concatenation, important equivalences involving regular expres-
sions concern properties of the closure (Kleene star) operation. Some are given below, where α, β, γ
stand for arbitrary regular expressions:

1. (α*)* = α*.
2. α*α* = α*.
3. ε + αα* = α*.
4. (α + β)* = α*(βα*)*.
5. α(βα)* = (αβ)*α.
6. (α + β)* = (α* + β*)*.
7. (α + β)* = (α*β*)*.
8. (α + β)* = (α*β)*α*.
In general, the distributive law does not hold for the closure operation. For example, the statement
(α + β)* = α* + β* is false, because the right-hand side denotes no string in which both α and β
appear.
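The failure of the distributive law can be observed by testing short strings against the two expressions. A small empirical sketch using Python's re module, instantiating α and β as the single symbols a and b:

    import re
    from itertools import product

    lhs = re.compile(r"(a|b)*$")     # (α + β)*
    rhs = re.compile(r"(a*|b*)$")    # α* + β*

    # All strings over {a, b} up to length 3, compared against both expressions.
    strings = [""] + ["".join(p) for n in range(1, 4) for p in product("ab", repeat=n)]
    witnesses = [s for s in strings if bool(lhs.match(s)) != bool(rhs.match(s))]
    print(witnesses)   # strings such as 'ab' and 'ba' are in (a+b)* but not in a* + b*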
2.3 Grammars
Definition 2.3.1 A grammar G is defined as a quadruple

G = (V, Σ, S, P)

where

V is a finite set of symbols called variables or nonterminals,
Σ is a finite set of symbols called terminal symbols or terminals,
S ∈ V is a special symbol called the start symbol,
P is a finite set of productions or rules or production rules.

The productions are of the form

x → y

where

x ∈ (V ∪ Σ)+ and
y ∈ (V ∪ Σ)*.
Given a string w of the form

w = uxv,

we say that the production x → y is applicable to this string, and we may use it to replace x with y,
thereby obtaining a new string z = uyv. This is written

w ⇒ z.
Successive strings are derived by applying the productions of the grammar in arbitrary order. A
production can be used whenever it is applicable, and it can be applied as often as desired. If

w1 ⇒ w2 ⇒ w3 ⇒ ⋯ ⇒ wn,

we say that w1 derives wn, and write w1 ⇒* wn.

The * indicates that an unspecified number of steps (including zero) can be taken to derive wn from
w1. Thus

w ⇒* w

is always the case. If we want to indicate that at least one production must be applied, we can write

w ⇒+ v.

The language generated by G is the set of terminal strings derivable from the start symbol:

L(G) = {w ∈ Σ* | S ⇒* w}

If w ∈ L(G), then the sequence

S ⇒ w1 ⇒ w2 ⇒ ⋯ ⇒ w

is a derivation of the sentence (or word) w. The strings S, w1, w2, …, are called sentential forms of
the derivation.
Example 2.3.1

Consider the grammar

G = ({S}, {a, b}, S, P)

with P given by

S → aSb
S → ε

Then

S ⇒ aSb ⇒ aaSbb ⇒ aabb,

so we can write

S ⇒* aabb.

The string aabb is a sentence in the language generated by G.
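Such derivations can be replayed mechanically by repeatedly rewriting the leftmost S. A small sketch; the function name generate_strings is ours:

    # Productions of G: S -> aSb and S -> ε (ε shown as the empty string).
    productions = {"S": ["aSb", ""]}

    def generate_strings(max_depth):
        """All terminal strings derivable from S in at most max_depth rule applications."""
        results = set()
        def expand(form, depth):
            if "S" not in form:
                results.add(form)
                return
            if depth == 0:
                return
            i = form.index("S")
            for rhs in productions["S"]:
                expand(form[:i] + rhs + form[i + 1:], depth - 1)
        expand("S", max_depth)
        return results

    print(sorted(generate_strings(4), key=len))   # ['', 'ab', 'aabb', 'aaabbb']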
Example 2.3.2
P:
Example 2.3.3

<expression> → <variable> | <expression> <operation> <expression>
<variable> → A | B | C | ⋯ | Z
<operation> → + | − | * | /

Leftmost Derivation

<expression> ⇒ <expression> <operation> <expression>
⇒ <variable> <operation> <expression>
⇒ A <operation> <expression>
⇒ A + <expression>
⇒ A + <expression> <operation> <expression>
⇒ A + <variable> <operation> <expression>
⇒ A + B <operation> <expression>
⇒ A + B * <expression>
⇒ A + B * <variable>
⇒ A + B * C
<expr> → <multi expr> | <multi expr> <add op> <expr>
<multi expr> → <variable> | <variable> <multi op> <variable>
<multi op> → * | /
<add op> → + | −
<variable> → A | B | C | ⋯ | Z
Note that, for an inherently ambiguous language L, every grammar that generates L is ambiguous.
Example 2.3.4

G: S → ε | aSb | bSa | SS

Claim: L(G) = L, where

L = {w ∈ {a, b}* | na(w) = nb(w)}.

2. L ⊆ L(G)

Let w ∈ L. By definition of L, na(w) = nb(w). We show that w ∈ L(G) by induction (on the
length of w).
Here count denotes the number of a's minus the number of b's in a prefix of the string being scanned.
If w1 begins and ends with different symbols, the rules S → aSb or S → bSa apply together with the
induction hypothesis, so the remaining case is w1 = awa (the case w1 = bwb is similar). We will now
show that the count goes through 0 at least once strictly inside w1 = awa:

w1 = a (count = +1) … (count goes through 0) … (count = −1) a (by the end, count = 0).

So we can split w1 = w′ (count = 0) w″, where

w′ ∈ L,
w″ ∈ L,
|w′| ≤ 2i and
|w″| ≤ 2i.

w′, w″ ∈ L(G) (I. H.)

Then w1 = w′w″ can be derived in G from w′ and w″, using the rule S → SS.
Example 2.3.5

L(G) = {a^(2^n) | n ≥ 0}

G = (V, T, S, P) where

V = {S, [, ], A, D}
T = {a}

P: S → [A]
   [ → [D | ε
   D] → ]
   DA → AAD
   ] → ε
   A → a
Example 2.3.6

Solution:
S ⇒ ABCS
⇒ ABCABCS
⇒ ABCABC
⇒ ABCABC
⇒ ACBACB
⇒ CABCAB
⇒ CACBBA
⇒ CCABBA
⇒ CCBABA
⇒ cababa
Example 2.3.7

S → ε | aSb

L(G) = {ε, ab, aabb, aaabbb, …}
L = {a^i b^i | i ≥ 0}

2. L ⊆ L(G):
Let w ∈ L, w = a^k b^k.
We apply S → aSb (k times), thus

S ⇒* a^k S b^k,

then S → ε gives

S ⇒* a^k b^k.

1. L(G) ⊆ L:
We need to show that, if w can be derived in G, then w ∈ L. ε is in the language, by definition.
We first show that all sentential forms are of the form a^i S b^i, by induction on the length of the
sentential form.

S ⇒* a^i S b^i ⇒ a^i b^i

represents all possible derivations; hence G derives only strings of the form a^i b^i (i ≥ 0).
Type 1: If all the grammar rules x → y satisfy |x| ≤ |y|, then the grammar is context-sensitive or Type
1. Grammar G will generate a language L(G) which is called a context-sensitive language. Note
that x has to be of length at least 1 and thereby y too. Hence, it is not possible to derive the
empty string in such a grammar.

Type 2: If all production rules are of the form x → y where |x| = 1, then the grammar is said to be
context-free or Type 2 (i.e., the left-hand side of each rule is of length 1).

Similarly, for a left-linear grammar, the production rules are of the form

A → Bx
A → x
2.5 Normal Forms of Context-Free Grammars
A language which can be generated by a regular grammar will (later) be shown to be regular. Note
that a language can be generated by a regular grammar iff it can be generated by a right-linear
grammar iff it can be generated by a left-linear grammar.
A context-free grammar is in Chomsky normal form (CNF) if every production has one of the forms

i) A → BC
ii) A → a
iii) S → ε

where B, C ∈ V − {S}.
Example 2.5.1

G: S → aABC | a
   A → aA | a
   B → bcB | bc
   C → cC | c

Solution:
A CNF equivalent G′ can be given as:

G′: S → A′T1 | a
    A′ → a
    T1 → AT2
    T2 → BC
    A → A′A | a
    B → B′T3 | B′C′
    B′ → b
    T3 → C′B
    C → C′C | c
    C′ → c
Example 3.1.1
baab
baaab
babaabaaba
aaa a
All of the above strings are characterized by the presence of at least one aa substring.
According to the definition of a DFA, the following are identified:
Q = {q0, q1, q2}
Σ = {a, b}
δ : Q × Σ → Q : (qi, a) ↦ qj

where i can be equal to j and the mapping is given by the transition table below.

Transition Table:
a b
q0 q1 q0
q1 q2 q0
q2 q2 q2
Definition 3.1.2 Let M = (Q, Σ, δ, q0, F) be a DFA. The language of M, denoted L(M), is the set
of strings in Σ* accepted by M.
A DFA can be considered as a language acceptor; the language recognized by the machine is the set
of strings that are accepted by its computations. Two machines that accept the same language are
said to be equivalent.
Definition 3.1.3 The extended transition function δ̂ of a DFA with transition function δ is a
function from Q × Σ* to Q defined by recursion on the length of the input string.

i) Basis: length(w) = 0. Then w = ε and δ̂(qi, ε) = qi.
   length(w) = 1. Then w = a for some a ∈ Σ, and δ̂(qi, a) = δ(qi, a).

ii) Recursive step: Let w be a string of length n > 1. Then w = ua and δ̂(qi, ua) = δ(δ̂(qi, u), a).
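Both δ and δ̂ for the DFA of Example 3.1.1 can be written down directly. A minimal sketch; the dictionary encoding and function names are ours:

    # Transition table of Example 3.1.1: strings over {a, b} containing the substring aa.
    delta = {
        ("q0", "a"): "q1", ("q0", "b"): "q0",
        ("q1", "a"): "q2", ("q1", "b"): "q0",
        ("q2", "a"): "q2", ("q2", "b"): "q2",
    }
    start, accepting = "q0", {"q2"}

    def delta_hat(q, w):
        """Extended transition function: the state reached from q after reading w."""
        for symbol in w:
            q = delta[(q, symbol)]
        return q

    for w in ["baab", "baaab", "abab", ""]:
        print(w, delta_hat(start, w) in accepting)   # True, True, False, False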
Note that a deterministic finite automaton is considered a special case of a nondeterministic one. The
transition function of a DFA specifies exactly one state that may be entered from a given state and
on a given input symbol, while an NFA allows zero, one or more states to be entered. Hence, a string
input to an NFA may generate several distinct computations.
For the language over = {a, b} where each string has at least one occurrence of a double a, an
NFA can be given with the following transition table:
      a           b
q0    {q0, q1}    {q0}
q1    {q2}        ∅
q2    {q2}        {q2}
We will further show that a language accepted by an NFA is also accepted by a DFA. As an
example, the language accepted by the above NFA is also accepted by the DFA of Example 3.1.1.

Definition 3.2.2 The language of an NFA M, denoted L(M), is the set of strings accepted by M.
That is, L(M) = {w | there is a computation [q0, w] ⊢* [qi, ε] with qi ∈ F}.
Example 3.3.1
First we will show that every regular set is accepted by some NFA-ε. This follows from the recursive
definition of regular sets. The regular sets are built from the basis elements ∅, {ε} and the singletons
containing a symbol from the alphabet. Machines that accept these sets are given in Figure 3.6. The
regular sets are constructed from the primitive regular sets using the union, concatenation, and Kleene
star operations.

The ε-closure of a state qi is defined recursively by:

i) Basis: qi ∈ ε-closure(qi).
ii) Recursive step: if qj ∈ ε-closure(qi) and there is a transition from qj to qk labeled ε, then
qk ∈ ε-closure(qi).
iii) Closure: qj is in ε-closure(qi) only if it can be obtained from qi by a finite number of applications
of the operations in ii).
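The ε-closure is just a reachability computation over ε-arcs. A small sketch; the NFA below is a made-up example, not the machine of Figure 3.7:

    def epsilon_closure(state, eps_moves):
        """Set of states reachable from state using only ε-transitions."""
        closure, stack = {state}, [state]
        while stack:
            q = stack.pop()
            for r in eps_moves.get(q, ()):     # follow every ε-arc out of q
                if r not in closure:
                    closure.add(r)
                    stack.append(r)
        return closure

    # ε-arcs of a hypothetical NFA-ε: q0 --ε--> q1, q1 --ε--> q2
    eps_moves = {"q0": ["q1"], "q1": ["q2"]}
    print(epsilon_closure("q0", eps_moves))   # {'q0', 'q1', 'q2'}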
Example 3.4.1 For the NFA-ε of Figure 3.7, we derive the DFA of Figure 3.8.
(Note: the diagram of the figure is missing a transition from FG to BCE on 1, and transitions on 0
and 1 at the state ∅.)
Example 3.4.2

The expression graph given in (fig 3.9) accepts the regular expressions u* and u*vw*.

The reduced graph has at most two nodes, the start node and an accepting node. If these are the
same node, the reduced graph has the form (fig 3.11(a)), accepting w*. A graph with distinct start
and accepting nodes reduces to (fig 3.11(b)) and accepts the expression w1*w2(w3 ∪ w4w1*w2)*. This
expression may be simplified if any of the arcs in the graph are labeled ∅.
1. Make m copies of G, each of which has one accepting state. Call these graphs G1, G2, …, Gm.
Each accepting node of G is the accepting node of Gt, for some t = 1, 2, …, m.
2. for each Gt, do
   2.1. repeat
      2.1.1. choose a node i in Gt that is neither the start nor the accepting node of Gt.
      2.1.2. delete the node i from Gt according to the procedure:
         for every j, k not equal to i (this includes j = k) do
            i) if wj,i ≠ ∅, wi,k ≠ ∅, and wi,i = ∅, then add an arc from node j to node k labeled wj,i wi,k
            ii) if wj,i ≠ ∅, wi,k ≠ ∅, and wi,i ≠ ∅, then add an arc from node j to node k labeled
                wj,i (wi,i)* wi,k
            iii) if nodes j and k have arcs labeled w1, w2, …, ws connecting them, then replace them
                 by a single arc labeled w1 ∪ w2 ∪ ⋯ ∪ ws
            iv) remove the node i and all arcs incident to it in Gt
         end for
      until the only nodes in Gt are the start node and the single accepting node.
   2.2. determine the expression accepted by Gt.
   end for
3. The regular expression accepted by G is obtained by joining the expressions for each Gt with ∪.

The deletion of the node i is accomplished by finding all paths j, i, k of length two that have
i as the intermediate node. An arc from j to k is added bypassing the node i. If there is no arc from i
to itself, the new arc is labeled by the concatenation of the expressions on each of the component arcs.
If wi,i ≠ ∅, then the arc wi,i can be traversed any number of times before following the arc from i to
k. The label for the new arc is wj,i (wi,i)* wi,k. These graph transformations are illustrated in (fig 3.10).
Example 3.4.3
1. Example 1: Fig 3.12(a) shows the original DFA which is reduced to an expression graph shown
in fig 3.12(b).
Step 3: After eliminating all but initial and final state in Gi , fig 3.14(c)
Step 4: Final regular expression,fig 3.14(d)
L = r1 r2 (r3 + r3 r4 r1 r2 r3 )
= r1 r2 (r3 + r4 r1 r2 )
or
L = r1 fig 3.14(d)
Chapter 4

Regular Languages and Sets

Theorem 4.1.1 Let G = (V, Σ, P, S) be a regular grammar. Define the NFA M = (Q, Σ, δ, S, F) as
follows:

i) Q = V ∪ {Z}, where Z ∉ V, if P contains a rule A → a; Q = V otherwise.

ii) δ(A, a) contains B whenever A → aB ∈ P, and contains Z whenever A → a ∈ P.

iii) F = {A | A → ε ∈ P} ∪ {Z} if Z ∈ Q; F = {A | A → ε ∈ P} otherwise.
Example 4.1.1

G: S → aS | bB | a
   B → bB | ε

In G:

S ⇒ aS ⇒ aaS ⇒ aabB ⇒ aabbB ⇒ aabb
In M:

Similarly, a regular grammar G′ that generates L(M) is constructed from the automaton M:

G′: S → aS | bB | aZ
    B → bB | ε
    Z → ε

The transitions provide the S rules and the first B rule. The ε rules are added since B
and Z are accepting states.
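The construction of Theorem 4.1.1 applied to this grammar can be carried out mechanically. A sketch with our own encoding and helper name:

    # NFA from Theorem 4.1.1 for G: S -> aS | bB | a ;  B -> bB | ε.
    delta = {
        ("S", "a"): {"S", "Z"},   # S -> aS gives S; S -> a gives the extra state Z
        ("S", "b"): {"B"},        # S -> bB
        ("B", "b"): {"B"},        # B -> bB
    }
    start, accepting = "S", {"B", "Z"}   # B because B -> ε, Z because of S -> a

    def nfa_accepts(w):
        current = {start}
        for symbol in w:
            current = {r for q in current for r in delta.get((q, symbol), ())}
        return bool(current & accepting)

    print(nfa_accepts("aabb"), nfa_accepts("aba"))   # True False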
Note:
Example 4.1.2
S → bB | aA
A → aS | bC
B → aC | bS | ε
C → aB | bA
Theorem 4.2.1 Let L1 and L2 be two regular languages. The languages L1 ∪ L2, L1L2, and L1* are
regular languages.

L̄ = Σ* − L

Theorem 4.2.3 Let L1 and L2 be regular languages over Σ. The language L1 ∩ L2 is regular.

L1 ∩ L2 = the complement of (L̄1 ∪ L̄2)

The right-hand side of the equality is regular since it is built from L1 and L2 using union and
complementation.

Theorem 4.2.4 Let L1 be a regular language and L2 be a context-free language. The language L1 ∩ L2
is not necessarily regular.

Proof: Let L1 = a*b* and L2 = {a^i b^i | i ≥ 0}. L2 is context-free since it is generated by the grammar
S → aSb | ε. The intersection of L1 and L2 is L2, which is not regular.
Theorem 4.3.1 Let L be a regular language that is accepted by a DFA M with n states. Let w be any
string in L with length(w) ≥ n. Then w can be written as xyz with length(xy) ≤ n, length(y) > 0,
and xy^k z ∈ L for all k ≥ 0.
Example 4.3.1

Prove that the language L = {a^i b^i | i ≥ 0} is not regular using the pumping lemma for regular languages.

Proof: By contradiction. Assume L is regular; then the pumping lemma holds. Let w = a^n b^n.
By splitting a^n b^n into xyz, we get

x = a^i, y = a^j, and z = a^(n−i−j) b^n

where

i + j ≤ n and j > 0.

Pumping y to y² gives

a^i a^j a^j a^(n−i−j) b^n = a^(n+j) b^n ∉ L (contradiction with the pumping lemma).
Example 4.3.2

The language L = {a^i | i is prime} is not regular.

Assume L is regular, and that a DFA with n states accepts L. Let m be a prime greater than n.
The pumping lemma implies that a^m can be decomposed as xyz, y ≠ ε, such that xy^k z is in L for all
k ≥ 0. The length of s = xy^(m+1) z must be prime if s is in L. But

|xy^(m+1) z| = |xyz| + m·|y| = m + m·|y| = m(1 + |y|),

which is not prime, since both factors are at least 2. Since its length is not prime, xy^(m+1) z is not in
L (contradiction with the pumping lemma). Hence, L is not regular.
Chapter 5

Pushdown Automata and Context-Free Languages

Example 5.1.1

The language L = {a^i | i ≥ 0} ∪ {a^i b^i | i ≥ 0} contains strings consisting solely of a's or an equal number
of a's and b's. The stack of the PDA M that accepts L maintains a record of the number of a's
processed until a b is encountered or the input string is completely processed.

When scanning an a in state q0, there are two transitions that are applicable. A string of the form
a^i b^i, i > 0, is accepted by a computation that remains in states q0 and q1. If a transition to state
q2 follows the processing of the final a in a string a^i, the stack is emptied and the input is accepted.
Reaching q2 in any other manner results in an unsuccessful computation, since no input is processed
after q2 is entered.
The ε-transition from q0 allows the machine to enter q2 after the entire input string has been read,
since an input symbol is not required to process an ε-transition. The transition, which is applicable
whenever the machine is in state q0, introduces nondeterministic computations of M.

Example 5.1.2

The even-length palindromes over {a, b} are accepted by the PDA shown in the corresponding figure;
that is, L(M) = {ww^R | w ∈ {a, b}*}. A successful computation remains in state q0 while processing
the string w and enters state q1 upon reading the first symbol in w^R.
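Acceptance by a nondeterministic PDA can be checked by searching over configurations (state, input position, stack). A minimal sketch for the even-palindrome machine; the transition encoding and acceptance condition (final state q1 with empty stack) are our own rendering, not the text's exact notation:

    def accepts_wwr(w):
        """Search over configurations of the ww^R PDA."""
        def run(state, i, stack):
            if state == "q1" and i == len(w) and not stack:
                return True                                       # input read, stack empty
            moves = []
            if state == "q0":
                if i < len(w):
                    moves.append(("q0", i + 1, stack + (w[i],)))  # push the symbol just read
                moves.append(("q1", i, stack))                    # ε-move: guess the midpoint
            if state == "q1" and i < len(w) and stack and stack[-1] == w[i]:
                moves.append(("q1", i + 1, stack[:-1]))           # pop on a matching symbol
            return any(run(*m) for m in moves)
        return run("q0", 0, ())

    print(accepts_wwr("abba"), accepts_wwr("abab"), accepts_wwr(""))   # True False True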
A PDA is said to be atomic if each of its transitions has one of the forms

[qj, ε] ∈ δ(qi, a, ε)
[qj, ε] ∈ δ(qi, ε, A)
[qj, A] ∈ δ(qi, ε, ε)

Theorem 5.2.1 shows that the languages accepted by atomic PDAs are the same as those accepted
by PDAs. Moreover, it outlines a method to construct an equivalent atomic PDA from an arbitrary
PDA.

Theorem 5.2.1 Let M be a PDA. Then there is an atomic PDA M′ with L(M′) = L(M).

Proof: To construct M′, the nonatomic transitions of M are replaced by a sequence of atomic tran-
sitions. Let [qj, B] ∈ δ(qi, a, A) be a transition of M. The atomic equivalent requires two new states,
p1 and p2, and the transitions

[p1, ε] ∈ δ(qi, a, ε)
δ(p1, ε, A) = {[p2, ε]}
δ(p2, ε, ε) = {[qj, B]}

In a similar manner, a transition that consists of changing the state and performing two additional
actions can be replaced with a sequence of two atomic transitions. Removing all nonatomic transitions
produces an equivalent atomic PDA.
An extended transition is an operation on a PDA that replaces the stack top with a string of
symbols, rather than just a single symbol. The transition [qj, BCD] ∈ δ(qi, u, A) replaces the stack
top A with the string BCD, with B becoming the new stack top. The apparent generalization does
not increase the set of languages accepted by pushdown automata. A PDA containing extended
transitions is called an extended PDA. Each extended PDA can be converted into an equivalent
PDA in the sense of Definition 5.1.1.

To construct a PDA from an extended PDA, extended transitions are converted to a sequence of
transitions each of which pushes a single stack element. To achieve the result of an extended transition
that pushes k elements requires k − 1 additional states to push the elements in the correct order. The
sequence of transitions

[p1, D] ∈ δ(qi, u, A)
[p2, C] ∈ δ(p1, ε, ε)
[qj, B] ∈ δ(p2, ε, ε)

replaces the stack top A with the string BCD and leaves the machine in state qj. This produces the
same result as the single extended transition [qj, BCD] ∈ δ(qi, u, A).
Proof: Let G = (V, Σ, P, S) be a grammar in Greibach normal form that generates L. An extended
PDA M with start state q0 is defined by

QM = {q0, q1}
ΣM = Σ
ΓM = V − {S}
FM = {q1}

with transitions

δ(q0, a, ε) = {[q1, w] | S → aw ∈ P}
δ(q1, a, A) = {[q1, w] | A → aw ∈ P and A ∈ V − {S}}
δ(q0, ε, ε) = {[q1, ε]} if S → ε ∈ P.

We first show that L ⊆ L(M). Let S ⇒* uw be a derivation with u ∈ Σ+ and w ∈ V*.
We will prove that there is a computation

[q0, u, ε] ⊢* [q1, ε, w]

in M. The proof is by induction on the length of the derivation and utilizes the correspondence
between derivations in G and computations of M. The basis consists of derivations S ⇒ aw of length
one. The transition generated by the rule S → aw yields the desired computation. Assume that for
all strings uw generated by derivations S ⇒ⁿ uw of length n there is a computation

[q0, u, ε] ⊢* [q1, ε, w]

in M.

Now let S ⇒ⁿ⁺¹ uw be a derivation with u = va ∈ Σ+ and w ∈ V*. This derivation can be written as

S ⇒ⁿ vAw2 ⇒ uw,

where w = w1w2 and A → aw1 is a rule in P. The inductive hypothesis and the transition [q1, w1] ∈
δ(q1, a, A) combine to produce the computation

[q0, va, ε] ⊢* [q1, a, Aw2]
            ⊢ [q1, ε, w1w2]

For every string u in L of positive length, the acceptance of u is exhibited by the computation in M
corresponding to the derivation S ⇒* u. If ε ∈ L, then S → ε is a rule of G and the computation
[q0, ε, ε] ⊢ [q1, ε, ε] accepts the null string. The opposite inclusion, L(M) ⊆ L, is established by show-
ing that for every computation [q0, u, ε] ⊢* [q1, ε, w] there is a corresponding derivation S ⇒* uw in G.
Theorem 5.3.2 Let P = (Q, , , , q0 , F ) be a PDA. Then there is a context-free grammar G such
that L(G) = L(P ).
The derivation tree can be divided into subtrees where the nodes labeled by the variable A indicated
in the diagram are the final two occurrences of A in the path p.
The derivation of z consists of the subderivations

1. S ⇒* r1 A r2
2. r1 ⇒* u
3. A ⇒+ vAx
4. A ⇒* w
5. r2 ⇒* y.
Subderivation 3 may be omitted or be repeated any number of times before applying subderivation 4.
The resulting derivations generate the strings uv^i w x^i y ∈ L(G) = L.

We now show that conditions (ii) and (iii) in the pumping lemma are satisfied by this decomposition.
The subderivation A ⇒+ vAx must begin with a rule of the form A → BC. The second occurrence of
the variable A is derived from either B or C. If it is derived from B, the derivation can be written

A ⇒ BC
  ⇒* vAyC
  ⇒* vAyz
  = vAx

The string z is nonnull since it is obtained by a derivation from a variable in a Chomsky normal form
grammar that is not the start symbol of the grammar. It follows that x is also nonnull. If the second
occurrence of A is derived from the variable C, a similar argument shows that v must be nonnull.

The subpath of p from the first occurrence of the variable A in the diagram to a leaf must be of
length at most n + 2. Since this is the longest path in the subtree with root A, the derivation tree
generated by the derivation A ⇒* vwx has depth at most n + 1. Also, the string vwx obtained from
this derivation has length k = 2^n or less.
Example 5.4.1

The language L = {a^i b^i c^i | i ≥ 0} is not context-free.

Proof: Assume L is context-free. By Theorem 5.4.1, the string z = a^k b^k c^k, where k is the number
specified by the pumping lemma, can be decomposed into substrings uvwxy that satisfy the repetition
properties. Consider the possibilities for the substrings v and x. If either of these contains more than
one type of terminal symbol, then uv²wx²y contains a b preceding an a or a c preceding a b. In either
case, the resulting string is not in L.

By the previous observation, v and x must each be a substring of one of a^k, b^k, or c^k. Since at most one of
the strings v and x is null, uv²wx²y increases the number of occurrences of at least one, and at most two,
but not all three types of terminal symbols. This implies that uv²wx²y ∉ L. Thus there is no decomposition of
a^k b^k c^k satisfying the conditions of the pumping lemma; consequently, L is not context-free.
Theorem 5.5.2 The set of context-free languages is not closed under intersection or complementa-
tion.
Proof:
i) Intersection: Let L1 = {a^i b^i c^j | i, j ≥ 0} and L2 = {a^j b^i c^i | i, j ≥ 0}. L1 and L2 are both
context-free, since they are generated by G1 and G2, respectively.

G1: S → BC          G2: S → AB
    B → aBb | ε         A → aA | ε
    C → cC | ε          B → bBc | ε

The intersection of L1 and L2 is the set {a^i b^i c^i | i ≥ 0}, which, by Example 5.4.1, is not context-
free.

ii) Complementation: Let L1 and L2 be any two context-free languages. If the context-free lan-
guages were closed under complementation, then, by Theorem 5.5.1, the language

L = the complement of (L̄1 ∪ L̄2) = L1 ∩ L2

would be context-free. By part i), this is not the case, so the context-free languages are not closed
under complementation.
Definition 5.6.1 A two-stack PDA is a structure (Q, Σ, Γ, δ, q0, F), where Q is a finite set of states,
Σ a finite set called the input alphabet, Γ a finite set called the stack alphabet, q0 the start state,
F ⊆ Q a set of final states, and δ a transition function from Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) × (Γ ∪ {ε}) to
subsets of Q × (Γ ∪ {ε}) × (Γ ∪ {ε}).

Example 5.6.1

The two-stack PDA defined below accepts the language L = {a^i b^i c^i | i ≥ 0}. The first stack is used to
match the a's and b's, and the second the b's and c's.
Chapter 6

Turing Machines
1. Change state. The next state optionally may be the same as the current state.
2. Write a tape symbol in the cell scanned. This tape symbol replaces whatever symbol was in
that cell. Optionally, the symbol written may be the same as the symbol currently there.
3. Move the tape head left or right. In our formalism we require a move, and do not allow the head
to remain stationary. This restriction does not constrain what a Turing Machine can compute,
since any sequence of moves with a stationary head could be condensed, along with the next
tape head move, into a single state change, a new tape symbol, and a move left or right.
Turing machines are designed to perform computations on strings from the input alphabet. A com-
putation begins with the tape head scanning the leftmost tape square and the input string beginning
at position one. All tape squares to the right of the input string are assumed to be blank. The Turing
machine defined with initial conditions as described above, is referred to as the standard Turing
machine. A language accepted by a Turing machine is called a recursively enumerable language.
A language accepted by a Turing machine that halts for all input strings is said to be recursive.
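The behaviour of a standard machine is easy to simulate directly. A minimal sketch; the tiny machine below, which rewrites a's as X's, is a made-up example rather than one from the text:

    def run_tm(delta, w, start="q0", blank="B", max_steps=10_000):
        """Simulate a standard TM; return (final state, tape contents) if it halts, else None."""
        tape = dict(enumerate(blank + w))        # square 0 is blank, input begins at square 1
        state, head = start, 0
        for _ in range(max_steps):
            key = (state, tape.get(head, blank))
            if key not in delta:                 # no applicable transition: the machine halts
                return state, "".join(tape.get(i, blank) for i in range(max(tape) + 1))
            state, symbol, direction = delta[key]
            tape[head] = symbol
            head += 1 if direction == "R" else -1
            if head < 0:                         # fell off the left boundary: abnormal termination
                return None
        return None

    # A machine that replaces every a by X and halts on the first blank after the input.
    delta = {("q0", "B"): ("q1", "B", "R"),
             ("q1", "a"): ("q1", "X", "R")}
    print(run_tm(delta, "aaa"))   # ('q1', 'BXXX')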
Example 6.1.1
The Turing machine COPY of fig 6.2, with input alphabet {a, b}, produces a copy of the input string. That
is, a computation that begins with the tape having the form BuB terminates with the tape BuBuB.

The machine examines only the first half of the input before accepting the string aabb. The language
(a ∪ b)*aa(a ∪ b)* is recursive; the computations of M halt for every input string. A successful computation
terminates when a substring aa is encountered. All other computations halt upon reading the first blank
following the input.
Example 6.2.2
The language {a^i b^i c^i | i ≥ 0} is accepted by the Turing machine in fig 6.4. A computation successfully
terminates when all the symbols in the input string have been transformed into the appropriate tape
symbol.
Conversely, let M = (Q, Σ, Γ, δ, q0, F) be a Turing machine that accepts the language L by final state.
Define the machine M′ = (Q ∪ {qf}, Σ, Γ, δ′, q0) that accepts by halting as follows:
"
#
$
!
Languages accepted by two-track machines are precisely the recursively enumerable languages.
Theorem 6.4.1 A language L is accepted by a two-track Turing machine if, and only if, it is accepted
by a standard Turing machine.
Proof: Clearly, if L is accepted by a standard Turing machine it is accepted by a two-track machine.
The equivalent two-track machine simply ignores the presence of the second track.
The representation of a two-track tape position as an ordered pair indicates how this can be accomplished.
The tape alphabet of the equivalent one-track machine M′ consists of ordered pairs of tape elements of
M. The input to the two-track machine consists of ordered pairs whose second component is blank.
The input symbol a of M is identified with the ordered pair [a, B] of M′. The one-track machine

M′ = (Q, Σ × {B}, Γ × Γ, δ′, q0, F)

accepts L(M).
The repositioning consists of moving the tape head one square to the left or one square to the right, or
leaving it at its current position. The input to a multitape machine is placed in the standard position
on tape 1. All the other tapes are assumed to be blank. The tape heads originally scan the leftmost
position of each tape. Any tape head attempting to move to the left of the boundary of its tape
terminates the computation abnormally. Any language accepted by a k-tape machine is accepted by
a (2k + 1)-track machine.
Theorem 6.6.1 The time taken by the one-tape TM N to simulate n moves of a k-tape TM M is
O(n²).
Example 6.8.1
The machine E enumerates the language L = {a^i b^i c^i | i ≥ 0}.
Chapter 8
Decidability
A solution to a decision problem P is an algorithm that determines the appropriate answer to every
question p ∈ P.
An algorithm that solves a decision problem should be
1. Complete
2. Mechanistic
3. Deterministic.
Proof: The proof is by contradiction. Assume that there is a Turing machine H that solves the
halting problem. A string is accepted by H if

i) the input consists of the representation R(M) of a Turing machine M followed by a string w, and
ii) the computation of M with input w halts.

If either of these conditions is not satisfied, H rejects the input. The operation of the machine
H is depicted in fig 8.1. The machine H is modified to construct a Turing machine H′.
The machine H′ is identical to H except that H′ enters an infinite loop whenever H accepts, and
halts whenever H rejects. From H′ a machine D is built that, on input R(M), first makes a copy of its
input and then runs H′ on R(M)R(M). Considering a computation of D on its own representation, D
halts with input R(D) if, and only if, D does not halt with input R(D). This machine D is
constructed directly from a machine H that solves the halting problem. The assumption that the
halting problem is decidable produces the preceding contradiction. Therefore, we conclude that the
halting problem is undecidable.
Corollary 8.3.1 The language LH = {R(M)w | R(M) is the representation of a Turing machine M
and M halts with input w}, a language over {0, 1}, is not recursive.
1. If the string does not have the form R(M )w for a Turing machine M and string w, U moves
indefinitely to the right.
2. The string w is written on tape 3 beginning at position one. The tape head is then repositioned
at the leftmost square of the tape. The configuration of tape 3 is the initial configuration of a
computation of M with input w.
3. A single 1, the encoding of state q0 , is written on tape 2.
4. A transition of M is simulated on tape 3. The transition of M is determined by the symbol
scanned on tape 3 and the state encoded on tape 2. Let x be the symbol from tape 3 and qi the
state encoded on tape 2.

a) Tape 1 is scanned for a transition whose first two components match en(qi) and en(x). If
there is no such transition, U halts, accepting the input.

b) Assume tape 1 contains the encoded transition en(qi)0en(x)0en(qj)0en(y)0en(d).
Then
i) en(qi) is replaced by en(qj) on tape 2.
ii) The symbol y is written on tape 3.
iii) The tape head of tape 3 is moved in the direction specified by d.

5. The next transition of M is simulated by repeating steps 4 and 5.

The simulations of the universal machine U accept the strings in LH. The computations of U loop
indefinitely for strings in {0, 1}* − LH. Since LH = L(U), LH is recursively enumerable.
Corollary 8.4.1 The recursive languages are a proper subset of the recursively enumerable languages.
Proof: The acceptance of LH by the universal machine demonstrates that LH is recursively enu-
merable while Corollary 8.3.1 established that LH is not recursive.
Note: A language L is recursive if both L and L are recursively enumerable.
The game begins when one of the dominoes is placed on a table. Another domino is then placed to
the immediate right of the domino on the table. This process is repeated, constructing a sequence
of adjacent dominoes. A Post correspondence system can be thought of as defining a finite set of
domino types. We assume that there is an unlimited number of dominoes of each type; playing a
domino does not limit the number of future moves. A string is obtained by concatenating the strings
in the top halves of a sequence of dominoes. We refer to this as the top string. Similarly, a sequence
of dominoes defines a bottom string. The game is successfully completed by constructing a finite
sequence of dominoes in which the top and bottom strings are identical.
Consider the Post correspondence system defined by the dominoes in fig 8.5. The sequence in fig 8.6 is a
solution.
Example 8.5.1
The Post correspondence system with alphabet {a, b} and ordered pairs [aaa, aa], [baa, abaaa] has a
solution.
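Whether a given sequence of dominoes is a solution is easy to check. A sketch; the winning sequence 1, 2, 1 (indices 0, 1, 0 below) is one we verify here rather than one stated in the text:

    def is_pcp_solution(pairs, sequence):
        """True if playing the dominoes in `sequence` makes the top and bottom strings equal."""
        top = "".join(pairs[i][0] for i in sequence)
        bottom = "".join(pairs[i][1] for i in sequence)
        return top == bottom

    pairs = [("aaa", "aa"), ("baa", "abaaa")]      # the system of Example 8.5.1
    print(is_pcp_solution(pairs, [0, 1, 0]))       # True: aaa·baa·aaa == aa·abaaa·aa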
Chapter 9

Undecidability
There are specific problems we cannot solve using a computer. These problems are called undecid-
able. While a Turing Machine looks nothing like a PC, it has been recognized as an accurate model
for what any physical computing device is capable of doing. We use the Turing Machine to develop
a theory of undecidable problems. We show that a number of problems that are easy to express are
in fact undecidable.
main()
{
    printf("hello, world");
}
such as total = 12, x = 3, y = 4, and z = 5, for which x^n + y^n = z^n. Thus, for input 2, the program
does print hello, world.

However, for any integer n > 2, the program will never find a triple of positive integers to satisfy
x^n + y^n = z^n, and thus will fail to print hello, world. Interestingly, until a few years ago, it was not
known whether or not this program would print hello, world for some large integer n. The claim that
it would not, i.e., that there are no integer solutions to the equation x^n + y^n = z^n if n > 2, was made
by Fermat 300 years ago, but no proof was found until quite recently. This statement is often referred
to as Fermat's last theorem.
Let us define the hello world problem to be: determine whether a given C program, with a given
input, prints hello, world as the first 12 characters that it prints. It would be remarkable indeed if we
could write a program that could examine any program P and input I for P , and tell whether P , run
with I as its input, would print hello, world. We shall prove that no such program exists.
has answer yes or no, then the problem is said to be decidable. Our goal is to prove that H
does not exist, i.e. the hello-world problem is undecidable.
In order to prove that statement by contradiction, we are going to make several changes to H,
eventually constructing a related program called H2 that we show does not exist. Since the changes
to H are simple transformations that can be done to any C program, the only questionable statement
is the existence of H, so it is that assumption we have contradicted.
To simplify our discussion, we shall make a few assumptions about C programs.
1. All output is character-based, e.g., we are not using a graphics package or any other facility to
make output that is not in the form of characters.
2. All character-based output is performed using printf, rather than put-char() or another character-
based output function.
We now assume that the program H exists. Our first modification is to change the output no, which
is the response that H makes when its input program P does not print hello, world as its first output
in response to input I. As soon as H prints "n", we know it will eventually follow with "o". Thus, we
can modify any printf statement in H that prints "n" to instead print hello, world. Another printf
statement, one that prints the "o" but not the "n", is omitted. As a result, the new program, which we call
H1, behaves like H, except that it prints hello, world exactly when H would print no. H1 is suggested by
Fig 9.4.
Since we are interested in programs that take other programs as input and tell something about
Figure 9.5: H2 behaves like H1 , but uses its input P as both P and I
other. But if H2 , given itself as input, prints hello, world first, then the output of the box in fig. 9.6
must be yes. Whichever output we suppose H2 makes, we can argue that it makes the other output.
This situation is paradoxical, and we conclude that H2 cannot exist. As a result, we have contradicted
the assumption that H exists. That is, we have proved that no program H can tell whether or not a
given program P with input I prints hello, world as its first output.
In order to make a proof that problem P2 is undecidable, we have to invent a construction, represented
by the square box in fig. 9.7, that converts instances of P1 to instances of P2 that have the same
answer. Once we have this construction, we can solve P1 as follows:
1. Given an instance of P1 , that is, given a string w that may or may not be in the language P1 ,
apply the construction algorithm to produce a string x.
We shall now give a formal proof of the existence of a problem about Turing Machines that no Turing
Machine can solve. We divide problems that can be solved by a Turing Machine into two classes:
those that have an algorithm (i.e., a Turing Machine that halts whether or not it accepts its input),
and those that are only solved by Turing Machines that may run forever on inputs they do not accept.
We prove undecidable the following problem:
Does this Turing Machine accept this input?
Then, we exploit this undecidability result to exhibit a number of other undecidable problems.
3. M accepts input w.
We must give a coding for Turing Machines that uses only 0's and 1's, regardless of how many states
the TM has. We can then treat any binary string as if it were a Turing Machine. If the string is not a
well-formed representation of some TM, we may think of it as representing a TM with no moves.

1. We shall assume the states are q1, q2, …, qr for some r. The start state will always be q1, and q2
will be the only accepting state.

2. We shall assume the tape symbols are X1, X2, …, Xs for some s. X1 always will be the symbol
0, X2 will be 1, and X3 will be B, the blank. Other tape symbols can be assigned to the remaining
integers arbitrarily.

3. We shall refer to direction L as D1 and direction R as D2.

Once we have established an integer to represent each state, symbol, and direction, we can encode
the transition function δ. Suppose one transition rule is δ(qi, Xj) = (qk, Xl, Dm), for some integers
i, j, k, l and m. We shall code this rule by the string 0^i 1 0^j 1 0^k 1 0^l 1 0^m. Notice that, since all of
i, j, k, l, and m are at least one, there are no occurrences of two or more consecutive 1's within the
code for a single transition.

A code for the entire TM M consists of all the codes for the transitions, in some order, separated by
pairs of 1's:

C1 11 C2 11 ⋯ 11 Cn,

where each Ci is the code for one transition of M.
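The coding is mechanical to implement. A short sketch; the example transition and helper names are ours:

    def encode_transition(i, j, k, l, m):
        """Code δ(q_i, X_j) = (q_k, X_l, D_m) as 0^i 1 0^j 1 0^k 1 0^l 1 0^m."""
        return "1".join("0" * n for n in (i, j, k, l, m))

    def encode_tm(transitions):
        """Join the transition codes with pairs of 1's: C1 11 C2 11 ... 11 Cn."""
        return "11".join(encode_transition(*t) for t in transitions)

    # Example: δ(q1, X2) = (q3, X1, D2), i.e. in state q1 reading 1, go to q3, write 0, move right.
    print(encode_tm([(1, 2, 3, 1, 2)]))   # 0100100010100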
The language Ld, the diagonalization language, is the set of strings wi such that wi is not in
L(Mi).

That is, Ld consists of all strings w such that the TM M whose code is w does not accept when given
w as input.

The reason Ld is called a diagonalization language can be seen if we consider Fig. 9.8. This table
tells, for all i and j, whether TM Mi accepts input string wj; 1 means yes and 0 means no. We
may think of the ith row as the characteristic vector for the language L(Mi); that is, the 1's in this
row indicate the strings that are members of this language.

The diagonal values tell whether Mi accepts wi. To construct Ld, we complement the diagonal. For
instance, if fig. 9.8 were the correct table, then the complemented diagonal would begin 1, 0, 0, 0, ….
Thus, Ld would contain w1 = ε, not contain w2 through w4, which are 0, 1, and 00, and so on.

The trick of complementing the diagonal to construct the characteristic vector of a language that
cannot be the language that appears in any row is called diagonalization.
Figure 9.8: The table that represents acceptance of strings by Turing machines
Figure 9.9: Relationship between the recursive, RE, and non-RE languages
Since M is guaranteed to halt, we know that M̄ is also guaranteed to halt. Moreover, M̄ accepts exactly
those strings that M does not accept. Thus, M̄ accepts L̄.

Theorem 9.2.3 If both a language L and its complement L̄ are RE, then L is recursive. Note that then,
by Theorem 9.2.2, L̄ is recursive as well.
Proof: The proof is suggested by fig 9.11. Let L = L(M1) and L̄ = L(M2). Both M1 and M2 are
simulated in parallel by a TM M. We can make M a two-tape TM, and then convert it to a one-tape
TM, to make the simulation easy and obvious. One tape of M simulates the tape of M1, while the
other tape of M simulates the tape of M2. The states of M1 and M2 are each components of the state
of M.

Figure 9.11: Simulation of two TMs accepting a language and its complement

If input w to M is in L, then M1 will eventually accept. If so, M accepts and halts. If w is not in L,
then it is in L̄, so M2 will eventually accept. When M2 accepts, M halts without accepting. Thus, on
all inputs, M halts, and L(M) is exactly L. Since M always halts, and L(M) = L, we conclude that
L is recursive.
We can now exhibit a problem that is RE but not recursive; it is the language Lu. Knowing that Lu is
undecidable (i.e., not a recursive language) is in many ways more valuable than our previous discovery
that Ld is not RE. The reason is that the reduction of Lu to another problem P can be used to show
there is no algorithm to solve P , regardless of whether or not P is RE. However, reduction of L d to P
is only possible if P is not RE, so Ld cannot be used to show undecidability for these problems that
are RE but not recursive. On the other hand, if we want to show a problem not to be RE, then only
Ld can be used; Lu is useless since it is RE.
Proof: We just proved in Section 9.2.6 that Lu is RE. Suppose Lu were recursive. Then by
Theorem 9.2.2, L̄u, the complement of Lu, would also be recursive. However, if we have a TM M to
accept L̄u, then we can construct a TM to accept Ld. Since we already know that Ld is not RE, we
have a contradiction of our assumption that Lu is recursive.
" $#
!
Figure 9.14: Reductions turn positive instances into positive and negative to negative
We have thus contradicted the assumption that P1 is undecidable. Our conclusion is that if P1 is
undecidable, then P2 is undecidable.
Now, consider part (b). Assume that P1 is non-RE, but P2 is RE. Now, we have an algorithm to
reduce P1 to P2, but we have only a procedure to recognize P2; that is, there is a TM that says yes
if its input is in P2, but may not halt if its input is not in P2. As for part (a), starting with an instance
w of P1, convert it by the reduction algorithm to an instance x of P2. Then apply the TM for P2 to
x. If x is accepted, then accept w.

This procedure describes a TM whose language is P1. If w is in P1, then x is in P2, so this TM
will accept w. If w is not in P1, then x is not in P2. Then, the TM may or may not halt, but will surely
not accept w. Since we assumed no TM for P1 exists, we have shown by contradiction that no TM for
P2 exists either; i.e., if P1 is non-RE, then P2 is non-RE.
Le = {M | L(M) = ∅}
Lne = {M | L(M) ≠ ∅}
We have contradicted the assumption that Lne is recursive, and conclude that Lne is not recursive.
Now consider the status of Le. If Le were RE, then by Theorem 9.2.3, both it and Lne would be
recursive. Since Lne is not recursive by Theorem 9.3.3, we conclude that Le is not RE.
Note that the empty property, ∅, is different from the property of being an empty language, {∅}.

If P is a property of the RE languages, the language LP is the set of codes for the Turing Machines
Mi such that L(Mi) is a language in P. When we talk about the decidability of a property P, we
mean the decidability of the language LP.

Let P be a nontrivial property of the RE languages. Assume to begin with that ∅, the empty language, is
not in P. Since P is nontrivial, there must be some nonempty language L that is in P. Let ML be
a TM accepting L.

LP = {M | L(M) ∈ P}

Examples:

1. LP is {M | L(M) ≠ ∅} = Lne
2. LP is {M | L(M) = ∅} = Le
4. LP is {M | L(M) is context-free}
Proof: Reduce Lu to LP.

1. Assume ∅ ∉ P.
Since P is nontrivial, there is a language L ∈ P which is not empty.
Since L is RE, there is a TM ML that accepts L.

Given a pair (M, w), construct a TM M′ that behaves as follows on its own input x:

(a) Simulate M on input w. Note that w is not the input to M′; rather, M′ writes M and w onto
one of its tapes and simulates the universal TM U on that pair.

(b) If M does not accept w, then M′ does nothing else. M′ never accepts its own
input x, so L(M′) = ∅. Since we assume ∅ is not in property P, that means the code for M′
is not in LP.

(c) If M accepts w, then M′ begins simulating ML on its own input x. Thus, M′ will accept
exactly the language L. Since L is in P, the code for M′ is in LP.

M accepts w ⟹ L(M′) ∈ P
M does not accept w ⟹ L(M′) ∉ P

Since the above algorithm turns (M, w) into an M′ that is in LP if and only if (M, w) is in Lu, this
algorithm is a reduction of Lu to LP, and proves that the property P is undecidable.
2. Now assume ∅ ∈ P.
Consider the complement property P̄, the set of RE languages that do not have property P. Then
∅ ∉ P̄, so from (1) we know that the language of P̄ is undecidable. However, since every TM accepts an
RE language, the set of codes for Turing machines that do not accept a language in P is the same as
the set of codes for TMs that accept a language in P̄. Suppose LP were decidable. Then so would its
complement be, because the complement of a recursive language is recursive (Theorem 9.2.2), which
contradicts (1). Hence LP is undecidable in this case as well.
Theorem 9.3.6 Rices theorem on recursive index sets states that if P is non-trivial, L P is not
recursive.
Theorem 9.3.7 If LP is RE; then the list of binary codes for the finite sets in P is enumerable.
Proof: Let (i, j) be a pair produced by the pair generator. We treat i as the binary code of a finite set, assuming
0 is the code for comma, 10 the code for zero, and 11 the code for one. We may in a straightforward
manner construct a TM M(i) (essentially a finite automaton) that accepts exactly the words in the
finite language represented by i. We then simulate the enumerator for LP for j steps. If it has printed
M(i), we print the code for the finite set represented by i, that is, the binary representation of i
itself, followed by a delimiter symbol #. In any event, after the simulation we return control to the
pair generator, which generates the pair following (i, j).
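The pair generator mentioned in the proof can be realized by the usual diagonal enumeration; a minimal sketch:

```python
# Emits every pair (i, j) of non-negative integers exactly once, so each
# candidate code i is eventually paired with every step bound j.

def pair_generator():
    n = 0
    while True:                    # sweep the diagonals i + j = n
        for i in range(n + 1):
            yield (i, n - i)
        n += 1
```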
Corollary 9.3.1
The following properties of RE sets are not RE.
a) L = ∅
b) L = Σ*
c) L is recursive
d) L is not recursive
e) L is a singleton
f) L is a regular set
g) L − Lu ≠ ∅
Example 9.3.1
Show that the following properties of RE languages are not RE.
1. L = ∅
P = { L | L = ∅ }.
LP = { M | M a TM description, L(M) = ∅ }.
Let L1 ∈ P, i.e. L1 = ∅. Let L2 = Σ*. L1 is a subset of L2, but L2 ∉ P. Rule 1 of theorem 9.3.8
is not satisfied and so LP is not RE.
2. S1 = { M | M a TM description, L(M) = 0*1*0 }
P = { 0*1*0 }. Let L1 = 0*1*0, the one language in P, and let L2 = Σ*. We can see that L1 is a subset of L2, and we
know that Σ* is RE. But we also know that Σ* does not belong to P, i.e. L2 ∉ P. The first rule
of theorem 9.3.8 is not satisfied. Therefore, S1 is not RE.
3. S2 = { M | L(M) ≠ 0*1*0 }
P = { L | L ≠ 0*1*0 }. Let L1 be the language of strings of the form 0ⁿ (n ≥ 1), i.e. L1 = 00*, and let L2 =
0*1*0. L1 ⊆ L2, L2 is RE, and L2 ∉ P. This once again does not satisfy rule 1 of theorem 9.3.8.
Therefore, S2 is not RE.
4. L is recursive.
P = { L | L is recursive }.
LP = { M | M a TM description, L(M) is recursive }.
Let L1 = ∅; the empty language is recursive, so L1 ∈ P. Let L2 = L1 ∪ Lu = Lu. L1 is a subset of L2, but L2 ∉ P, since Lu
is RE but not recursive (Section 9.2.7). Rule 1 of theorem 9.3.8 is not satisfied and so LP is not
RE.
5. L is not recursive.
P = { L | L is not recursive }.
LP = { M | M a TM description, L(M) is not recursive }.
Let L1 ∈ P, i.e. L1 is not recursive. Let L2 = Σ*. L1 is a subset of L2, but L2 ∉ P, since Σ* is recursive. Rule 1 of
theorem 9.3.8 is not satisfied and so LP is not RE.
6. L is a singleton.
P = { L | L is a singleton }.
LP = { M | M a TM description, L(M) is a singleton }.
Let L1 ∈ P, i.e. L1 is a singleton. Let L2 = Σ*. L1 is a subset of L2, but L2 ∉ P. Rule 1 of
theorem 9.3.8 is not satisfied and therefore LP is not RE.
7. L is a regular set.
P = { L | L is a regular set }.
LP = { M | M a TM description, L(M) is a regular set }.
Let L1 = ∅, which is regular, so L1 ∈ P. Let L2 = L1 ∪ { 0ⁿ1ⁿ | n ≥ 0 } = { 0ⁿ1ⁿ | n ≥ 0 }. L1 is a subset of L2, but
L2 ∉ P. Rule 1 of theorem 9.3.8 is not satisfied and therefore LP is not RE.
Corollary 9.3.2
The following properties of RE sets are RE.
a) L ≠ ∅
b) L contains at least 10 members
c) w is in L for some fixed word w
d) L ∩ Lu ≠ ∅
Example 9.3.2
Show that the following properties of RE sets are RE.
1. L ≠ ∅
P = { L | L ≠ ∅ }.
LP = { M | M a TM description, L(M) ≠ ∅ }, which is the language Lne.
2. L contains at least 10 members.
P = { L | |L| ≥ 10 }.
LP = { M | M a TM description, |L(M)| ≥ 10 }.
There is a TM T10 (fig. 9.18) that non-deterministically guesses 10 distinct strings and simulates
M on each of them, accepting if all 10 are accepted by M. Therefore LP is RE.
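The nondeterministic guessing of T10 can also be replaced by deterministic dovetailing; the following sketch assumes a hypothetical helper accepts_within(M, w, steps) that simulates M on w for a bounded number of steps:

```python
from itertools import count, product

def binary_strings(max_len):
    yield ""
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def at_least_ten(M):
    for bound in count(1):                      # ever-larger length/step bounds
        accepted = {w for w in binary_strings(bound)
                    if accepts_within(M, w, bound)}
        if len(accepted) >= 10:
            return True                         # halt and accept
        # otherwise raise the bound; runs forever when |L(M)| < 10
```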
The language of this grammar will be referred to as LB . The same observations that we made for
GA apply also to GB . In particular a terminal string in LB has a unique derivation, which can be
determined by the index symbols in the tail of the string.
Finally, we combine the languages and grammars of the two lists to form a grammar GAB for the
entire PCP instance. GAB consists of:
1. The variables A, B, and a new start symbol S.
2. Productions S → A | B.
3. All the productions of GA and GB.
We claim that GAB is ambiguous if and only if the instance (A, B) of PCP has a solution; that
argument is the core of the next theorem.
Theorem 9.5.1 It is undecidable whether a given CFG is ambiguous.
Proof: We have already given the reduction of PCP to the question of whether a CFG is ambiguous;
that reduction proves the problem of CFG ambiguity to be undecidable, since PCP is undecidable.
We must still show that the above construction is correct; that is:
(If) Suppose i1, i2, ..., im is a solution; then wi1 wi2 ... wim = xi1 xi2 ... xim. Consider the two leftmost
derivations S ⇒ A ⇒* wi1 wi2 ... wim aim ... ai2 ai1 and S ⇒ B ⇒* xi1 xi2 ... xim aim ... ai2 ai1.
Because the w-string and the x-string are equal, these are derivations of the same terminal string. Since
they are clearly two distinct leftmost derivations of the same terminal string, we conclude that GAB is ambiguous.
(Only if) We already observed that a given terminal string cannot have more than one derivation in
GA, and not more than one in GB. So the only way that a terminal string could have two leftmost
derivations in GAB is if one of them begins S → A and continues with a derivation in GA, while the
other begins S → B and continues with a derivation of the same string in GB.
The string with two derivations has a tail of indexes aim ... ai2 ai1, for some m ≥ 1. This tail must be
a solution to the PCP instance, because what precedes the tail in the string with two derivations is
both wi1 wi2 ... wim and xi1 xi2 ... xim.
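For a concrete feel of the (If) direction, the following sketch builds the doubly derivable string from a solution sequence; the PCP lists A and B and the textual index symbols a1, a2, ... are illustrative, not taken from the text:

```python
def list_string(lst, solution):
    """w_{i1}...w_{im} followed by the index tail a_{im}...a_{i1}."""
    return ("".join(lst[i - 1] for i in solution)
            + "".join("a%d" % i for i in reversed(solution)))

A = ["1", "10111", "10"]        # illustrative PCP instance
B = ["111", "10", "0"]
solution = [2, 1, 1, 3]         # w2 w1 w1 w3 = x2 x1 x1 x3

print(list_string(A, solution))                       # 101111110a3a1a1a2
assert list_string(A, solution) == list_string(B, solution)
```

The two equal strings correspond to one leftmost derivation starting S → A and another starting S → B, so GAB for this instance is ambiguous.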
Theorem 9.5.2 If LA is the language for the list A, then its complement L̄A is a context-free language.
Proof: Let Σ be the alphabet of the strings on list A = w1, w2, ..., wk, and let I = {a1, a2, ..., ak} be
the set of index symbols. The DPDA P we design to accept L̄A works as follows.
1. As long as P sees symbols in Σ, it stores them on its stack. Since every string in Σ* is in L̄A, P
accepts as it goes.
2. As soon as P sees an index symbol in I, say ai, it pops its stack to see whether the top symbols form
wiR, that is, the reverse of the corresponding string.
(a) If not, then the input seen so far, and any continuation of this input, is in L̄A. Thus, P goes
to an accepting state in which it consumes all future input without changing the stack.
(b) If wiR was popped from the stack, but the bottom-of-stack marker is not yet exposed on
the stack, then P accepts, but remembers in its state that it is looking for symbols in I
only, and may yet see a string in LA (which P will not accept). P repeats step (2) as long
as the question of whether the input is in LA is unresolved.
(c) If wiR was popped from the stack, and the bottom-of-stack marker is exposed, then P has
seen an input in LA. P does not accept this input. However, since any continuation of
this input cannot be in LA, P goes to a state where it accepts all future inputs, leaving its stack
unchanged.
3. If, after seeing one or more symbols of I, P sees another symbol in Σ, then the input is not of
the correct form to be in LA. Thus, P goes to a state in which it accepts this and all future
inputs without changing its stack.
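The stack discipline of P amounts to the following membership test for LA; this is only a sketch of the check, not the DPDA itself, and the example list A is illustrative:

```python
def in_LA(A, word_part, index_tail):
    """word_part: the prefix over the string alphabet; index_tail: the indices
    in the order they appear in the input, i.e. i_m, ..., i_2, i_1."""
    if not index_tail:
        return False                      # at least one index symbol is required
    stack = list(word_part)               # P pushes the string prefix
    for i in index_tail:
        for ch in reversed(A[i - 1]):     # pop and compare with the reverse of w_i
            if not stack or stack.pop() != ch:
                return False
    return not stack                      # bottom of stack exposed exactly now

A = ["1", "10111", "10"]                  # illustrative list
print(in_LA(A, "101111110", [3, 1, 1, 2]))   # True: w2 w1 w1 w3 a3 a1 a1 a2
```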
Theorem 9.5.3
Let G1 and G2 be context-free grammars, and let R be a regular expression. Then the following are
undecidable.
(a) Is L(G1) ∩ L(G2) = ∅?
(b) Is L(G1) = L(G2)?
(c) Is L(G1) = L(R)?
(d) Is L(G1) = T* for some alphabet T?
(e) Is L(G1) ⊆ L(G2)?
(f) Is L(R) ⊆ L(G1)?
Proof: Each of the proofs is a reduction from PCP. We show how to take an instance (A, B) of PCP
and convert it to a question about CFGs and/or regular expressions that has the answer "yes" if and
only if the instance of PCP has a solution. In some cases, we reduce PCP to the question as stated in
the theorem; in other cases we reduce it to the complement. It doesn't matter, since if we show the
complement of a problem to be undecidable, it is not possible that the problem itself is decidable, since the
recursive languages are closed under complementation (Theorem 9.2.2).
Let the alphabet of the strings for this instance be Σ and the alphabet of index symbols be I. Our reductions
depend on the fact that LA, LB, L̄A and L̄B all have CFGs. We construct these CFGs either
directly, as in Section 9.5.1, or by the construction of a PDA for the complement languages given in
Theorem 9.5.2, coupled with the conversion from a PDA to a CFG.
(a) Let L(G1) = LA and L(G2) = LB. Then L(G1) ∩ L(G2) is the set of solution strings for this instance
of PCP. The intersection is empty if and only if there is no solution. Note that, technically, we
have reduced PCP to the language of pairs of CFGs whose intersection is nonempty; i.e., we have
shown the problem "is the intersection of two CFGs nonempty?" to be undecidable. However, as
mentioned in the introduction to the proof, showing the complement of a problem to be undecidable
is tantamount to showing the problem itself undecidable.
(b) Since CFLs are closed under union, we can construct a CFG G1 for L̄A ∪ L̄B. Since
(Σ ∪ I)* is a regular set, we surely may construct for it a CFG G2. Now L̄A ∪ L̄B is the complement of LA ∩ LB.
Thus, L(G1) is missing only those strings that represent solutions to the instance of PCP, while L(G2) is
missing no string in (Σ ∪ I)*. Thus, their languages are equal if and only if the PCP instance
has no solution.
(c) The argument is the same as for (b), but we let R be the regular expression (Σ ∪ I)*.
(d) The argument of (c) suffices, since Σ ∪ I is the only alphabet of which L̄A ∪ L̄B could possibly
be the closure.
(e) Let G1 be a CFG for (Σ ∪ I)* and let G2 be a CFG for L̄A ∪ L̄B. Then L(G1) ⊆ L(G2) if and
only if L̄A ∪ L̄B = (Σ ∪ I)*, i.e., if and only if the PCP instance has no solution.
(f) The argument is the same as for (e), but we let R be the regular expression (Σ ∪ I)* and let L(G1) be
L̄A ∪ L̄B.
Chapter 10
Intractable Problems
In this chapter we introduce the theory of intractability, that is, techniques for showing that problems
cannot be solved in polynomial time.
1. Maintain for each node the connected component in which the node appears, using whatever
edges of the tree have been selected so far. Initially, no edges are selected, so every node is then
in a connected component by itself.
2. Consider the lowest-weight edge that has not yet been considered; break ties any way you like.
If this edge connects two nodes that are currently in different connected components, then:
(a) Select this edge for the spanning tree.
(b) Merge the two components involved, by changing the component number of every node in
one of the two components to be the same as the component number of the other.
If, on the other hand, the selected edge connects two nodes of the same component, then this
edge does not belong in the spanning tree; it would create a cycle.
3. Continue considering edges until either all edges have been considered, or the number of edges
selected for the spanning tree is one less than the number of nodes. Note that in the latter case,
all nodes must be in one connected component, and we can stop considering edges.
Example 10.1.1
In the graph of Fig. 10.1, we first consider the edge (1,3), because it has the lowest weight, 10. Since
1 and 3 are initially in different components, we accept this edge, and make 1 and 3 have the same
component number, say component 1. The next edge in order of weights is (2,3), with weight 12.
Since 2 and 3 are in different components, we accept this edge and merge node 2 into component
1. The third edge is (1,2), with weight 15. However, 1 and 2 are now in the same component, so we
reject this edge and proceed to the fourth edge, (3,4). Since 4 is not in component 1, we accept this
edge. Now we have three edges for the spanning tree of a 4-node graph, and so we may stop.
It is possible to implement this algorithm (using a computer, not a TM) on a graph with m nodes
and e edges in time O(m + e log e). A simpler implementation proceeds in e rounds. A table
gives the current component of each node. We pick the lowest-weight remaining edge in O(e) time,
and find the components of the two nodes connected by that edge in O(m) time. If they are in different
components, we merge all nodes with those component numbers in O(m) time, by scanning the table of nodes. The
total time taken by this algorithm is O(e(e + m)). This running time is polynomial in the size of
the input, which we might informally take to be the sum of e and m.
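A minimal sketch of the table-based implementation just described (an ordinary program, not a TM); the node numbering and edge list used in the test call are those of Fig. 10.1, as decoded from the code string in Example 10.1.2 below:

```python
def kruskal(m, edges):
    """edges: list of (u, v, weight) with nodes numbered 1..m."""
    component = {v: v for v in range(1, m + 1)}         # every node starts alone
    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):   # lowest weight first
        if len(tree) == m - 1:                          # spanning tree is complete
            break
        cu, cv = component[u], component[v]
        if cu != cv:                                    # different components: keep edge
            tree.append((u, v, w))
            for node in component:                      # merge cv into cu, O(m) scan
                if component[node] == cv:
                    component[node] = cu
        # same component: the edge would close a cycle, so it is rejected
    return tree

print(kruskal(4, [(1, 2, 15), (1, 3, 10), (2, 3, 12), (2, 4, 20), (3, 4, 18)]))
# [(1, 3, 10), (2, 3, 12), (3, 4, 18)], as in Example 10.1.1
```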
When we translate the above ideas to Turing Machines, we face several issues:
When we study algorithms we encounter problems that ask for outputs in a variety of forms, such
as a list of edges in an MWST. When we deal with Turing machines, we may only think of problems
as languages, and the only output is "yes" or "no", i.e. accept or reject. For instance, the MWST
problem could be couched as: "Given this graph G and limit W, does G have a spanning
tree of weight W or less?" That problem may seem easier to answer than the MWST problem
with which we are familiar, since we don't even learn what the spanning tree is. However, in
the theory of intractability we generally want to argue that a problem is hard, not easy, and
the fact that a yes-no version of a problem is hard implies that a more standard version, where
a full answer must be computed, is also hard.
While we might think informally of the size of a graph as the number of nodes or edges, the
input to a TM is a string over a finite alphabet. Thus, problem elements such as nodes and edges
must be encoded suitably. The effect of this requirement is that inputs to Turing machines are
generally slightly longer than the intuitive size of the input. However, there are two reasons why
the difference is not significant:
1. The difference between the size as a TM input string and as an informal problem input is
never more than a small factor, usually the logarithm of the input size. Thus, what can
be done in polynomial time using one measure can be done in polynomial time using the
other measure.
2. The length of a string representing the input is actually a more accurate measure of the
number of bytes a real computer has to read to get its input. For instance, if a node is
represented by an integer, then the number of bytes needed to represent that integer is
proportional to the logarithm of the integer's size; it is not 1 byte per node, as
we might imagine in an informal accounting of input size.
Example 10.1.2
Let us consider a possible code for the graphs and weight limits that could be input to the MWST
problem. The code has five symbols: 0, 1, the left and right parentheses, and the comma.
1. Assign the integers 1 through m to the nodes.
2. Begin the code with the value of m in binary and the weight limit W in binary, separated by a
comma.
3. If there is an edge between nodes i and j with weight w, place (i, j, w) in the code. The integers
i, j, and w are coded in binary. The order of i and j within an edge, and the order of the edges
within the code, are immaterial.
Thus, one of the possible codes for the graph of Fig. 10.1 with limit W = 40 is
100,101000(1,10,1111)(1,11,1010)(10,11,1100)(10,100,10100)(11,100,10010)
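A small sketch of this encoding scheme; it reproduces the code string above from the decoded edge list of Fig. 10.1:

```python
def encode_instance(m, W, edges):
    b = lambda n: format(n, "b")                    # integer written in binary
    code = b(m) + "," + b(W)
    for i, j, w in edges:
        code += "(" + b(i) + "," + b(j) + "," + b(w) + ")"
    return code

print(encode_instance(4, 40,
      [(1, 2, 15), (1, 3, 10), (2, 3, 12), (2, 4, 20), (3, 4, 18)]))
# 100,101000(1,10,1111)(1,11,1010)(10,11,1100)(10,100,10100)(11,100,10010)
```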
If we represent inputs to the MWST problem as in Example 10.1.2, then an input of length n can
represent at most O(n/log n) edges. It is possible that m, the number of nodes, could be exponential
in n, if there are very few edges. However, unless the number of edges, e, is at least m − 1, the graph
cannot be connected and therefore will have no MWST, regardless of its edges. Consequently, if the
number of nodes is not at most some small multiple of n/log n, there is no need to run Kruskal's algorithm
at all; we simply say "no, there is no spanning tree of that weight."
Thus, if we have an upper bound on the running time of Kruskal's algorithm as a function of m and
e, such as the upper bound O(e(m + e)) developed above, we can conservatively replace both m and
e by n and say that the running time, as a function of the input length n, is O(n(n + n)), or O(n²).
We claim that in O(n²) steps we can implement the version of Kruskal's algorithm described above
on a multitape TM. The extra tapes are used for several jobs:
1. The input tape holds the code for the graph, beginning with the number of nodes, the limit W,
and the edges, as described in Example 10.1.2.
2. The second tape stores the list of nodes and their current components. This tape is O(n) in
length.
3. A third tape is used to store the current least weight edge. Scanning for the lowest-weight
unmarked edge takes O(n) time.
4. When an edge is selected in a round, place its two nodes on a fourth tape. Search the table of
nodes and components to find the components of these two nodes. This requires O(n) time.
5. A tape can be used to hold the two components, i and j, being merged. We scan the table of
nodes and components, and each node found to be in component i has its component number
changed to j. This scan takes O(n) time.
We should thus be able to say that one round can be executed in O(n) time on a multitape TM.
Since the number of rounds, e, is at most n, we conclude that O(n²) time suffices on a multitape
TM. Theorem 6.6.1 says that whatever a multitape TM can do in s steps, a single-tape TM can do in
O(s²) steps. Thus, if the multitape TM takes O(n²) steps, then we can construct a single-tape TM
to do the same thing in O(n⁴) steps. Our conclusion is that the yes-no version of the MWST problem,
"does graph G have an MWST of total weight W or less?", is in P.
Definition 10.1.1 A Hamilton circuit is a set of edges that connect the nodes into a single cycle,
with each node appearing exactly once. Note that the number of edges on a Hamilton circuit must
equal the number of nodes in the graph.
1. Variables whose values are boolean; i.e., they either have the value 1 (true) or 0 (false).
2. Binary operators ∧ and ∨, standing for the logical AND and OR of two expressions.
3. Unary operator ¬, standing for logical negation.
4. Parentheses to group operators and operands, if necessary to alter the default precedence of
operators: ¬ highest, then ∧, and finally ∨.
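As a tiny illustration of these conventions, the (hypothetical) expression x ∧ ¬(y ∨ z) evaluates as follows under two truth assignments:

```python
def example_expression(x, y, z):
    # x AND NOT (y OR z); NOT binds tightest, then AND, then OR
    return x and not (y or z)

print(int(example_expression(1, 0, 0)))   # 1 (true)
print(int(example_expression(1, 1, 0)))   # 0 (false)
```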