
Analysis of Algorithms

Input → Algorithm → Output

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

Running Time (§1.1)

- Most algorithms transform input objects into output objects.
- The running time of an algorithm typically grows with the input size.
- Average case time is often difficult to determine.
- We focus on the worst case running time.
  - Easier to analyze
  - Crucial to applications such as games, finance and robotics

[Chart: best-case, average-case, and worst-case running time (ms) vs. input size]

Experimental Studies (§1.6)

- Write a program implementing the algorithm
- Run the program with inputs of varying size and composition
- Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
- Plot the results

[Chart: measured running time (ms) vs. input size]

Limitations of Experiments

- It is necessary to implement the algorithm, which may be difficult
- Results may not be indicative of the running time on other inputs not included in the experiment
- In order to compare two algorithms, the same hardware and software environments must be used
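As a concrete illustration of the experimental approach, here is a minimal Java sketch (the class name, the placeholder workload, and the range of input sizes are illustrative assumptions, not part of the original slides):

public class TimingExperiment {
    // Placeholder linear-time workload standing in for the algorithm under test.
    static long algorithmUnderTest(int[] a) {
        long sum = 0;
        for (int x : a) sum += x;
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n += 1000) {
            int[] input = new int[n];
            for (int i = 0; i < n; i++)
                input[i] = (int) (Math.random() * n);   // vary input composition
            long start = System.currentTimeMillis();
            algorithmUnderTest(input);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(n + "\t" + elapsed + " ms");   // plot these pairs
        }
    }
}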

Theoretical Analysis

- Uses a high-level description of the algorithm instead of an implementation
- Characterizes running time as a function of the input size, n
- Takes into account all possible inputs
- Allows us to evaluate the speed of an algorithm independent of the hardware/software environment

Pseudocode (§1.1)

- High-level description of an algorithm
- More structured than English prose
- Less detailed than a program
- Preferred notation for describing algorithms
- Hides program design issues

Example: find the max element of an array

Algorithm arrayMax(A, n)
    Input array A of n integers
    Output maximum element of A
    currentMax ← A[0]
    for i ← 1 to n − 1 do
        if A[i] > currentMax then
            currentMax ← A[i]
    return currentMax
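For comparison with the pseudocode, a direct Java transcription might look like this (a sketch; the slides give arrayMax only in pseudocode):

// Java version of the arrayMax pseudocode above.
public static int arrayMax(int[] A, int n) {
    int currentMax = A[0];
    for (int i = 1; i <= n - 1; i++)
        if (A[i] > currentMax)
            currentMax = A[i];
    return currentMax;
}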

Pseudocode Details

- Control flow
  - if … then … [else …]
  - while … do …
  - repeat … until …
  - for … do …
  - Indentation replaces braces
- Method declaration
  - Algorithm method(arg [, arg…])
      Input …
      Output …
- Method call
  - var.method(arg [, arg…])
- Return value
  - return expression
- Expressions
  - ← Assignment (like = in Java)
  - = Equality testing (like == in Java)
  - n^2: superscripts and other mathematical formatting allowed

The Random Access Machine (RAM) Model

- A CPU
- A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
- Memory cells are numbered and accessing any cell in memory takes unit time.

Primitive Operations

- Basic computations performed by an algorithm
- Identifiable in pseudocode
- Largely independent from the programming language
- Exact definition not important (we will see why later)
- Assumed to take a constant amount of time in the RAM model
- Examples:
  - Evaluating an expression
  - Assigning a value to a variable
  - Indexing into an array
  - Calling a method
  - Returning from a method

Counting Primitive Operations (§1.1)

- By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

Algorithm arrayMax(A, n)                   # operations
    currentMax ← A[0]                      2
    for i ← 1 to n − 1 do                  2 + n
        if A[i] > currentMax then          2(n − 1)
            currentMax ← A[i]              2(n − 1)
        { increment counter i }            2(n − 1)
    return currentMax                      1
                                 Total     7n − 1

Estimating Running Time

- Algorithm arrayMax executes 7n − 1 primitive operations in the worst case. Define:
  - a = Time taken by the fastest primitive operation
  - b = Time taken by the slowest primitive operation
- Let T(n) be the worst-case time of arrayMax. Then
  a(7n − 1) ≤ T(n) ≤ b(7n − 1)
- Hence, the running time T(n) is bounded by two linear functions

Growth Rate of Running Time

- Changing the hardware/software environment
  - Affects T(n) by a constant factor, but
  - Does not alter the growth rate of T(n)
- The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax

Growth Rates

- Growth rates of functions:
  - Linear ≈ n
  - Quadratic ≈ n^2
  - Cubic ≈ n^3
- In a log-log chart, the slope of the line corresponds to the growth rate of the function

[Log-log chart: T(n) vs. n for the linear, quadratic, and cubic growth rates]

Constant Factors

- The growth rate is not affected by
  - constant factors or
  - lower-order terms
- Examples
  - 10^2·n + 10^5 is a linear function
  - 10^5·n^2 + 10^8·n is a quadratic function

[Log-log chart: each pair of linear and quadratic functions plots as parallel lines]

Big-Oh Notation (§1.2)

- Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0
- Example: 2n + 10 is O(n)
  - 2n + 10 ≤ cn
  - (c − 2)n ≥ 10
  - n ≥ 10/(c − 2)
  - Pick c = 3 and n0 = 10

[Log-log chart: n, 2n + 10, and 3n]

Big-Oh Example

- Example: the function n^2 is not O(n)
  - n^2 ≤ cn
  - n ≤ c
  - The above inequality cannot be satisfied since c must be a constant

[Log-log chart: n^2 against n, 10n, and 100n]

More Big-Oh Examples

- 7n − 2 is O(n)
  - need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ cn for n ≥ n0
  - this is true for c = 7 and n0 = 1
- 3n^3 + 20n^2 + 5 is O(n^3)
  - need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ cn^3 for n ≥ n0
  - this is true for c = 4 and n0 = 21
- 3 log n + log log n is O(log n)
  - need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c·log n for n ≥ n0
  - this is true for c = 4 and n0 = 2

Big-Oh and Growth Rate

- The big-Oh notation gives an upper bound on the growth rate of a function
- The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n)
- We can use the big-Oh notation to rank functions according to their growth rate

                        f(n) is O(g(n))    g(n) is O(f(n))
    g(n) grows more     Yes                No
    f(n) grows more     No                 Yes
    Same growth         Yes                Yes

Big-Oh Rules

- If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
  - Drop lower-order terms
  - Drop constant factors
- Use the smallest possible class of functions
  - Say "2n is O(n)" instead of "2n is O(n^2)"
- Use the simplest expression of the class
  - Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"

Asymptotic Algorithm Analysis

- The asymptotic analysis of an algorithm determines the running time in big-Oh notation
- To perform the asymptotic analysis
  - We find the worst-case number of primitive operations executed as a function of the input size
  - We express this function with big-Oh notation
- Example:
  - We determine that algorithm arrayMax executes at most 7n − 1 primitive operations
  - We say that algorithm arrayMax runs in O(n) time
- Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

Computing Prefix Averages

- We further illustrate asymptotic analysis with two algorithms for prefix averages
- The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
  A[i] = (X[0] + X[1] + … + X[i])/(i + 1)
- Computing the array A of prefix averages of another array X has applications to financial analysis

[Chart: values of X and of its prefix averages A vs. index]

Prefix Averages (Quadratic)

- The following algorithm computes prefix averages in quadratic time by applying the definition

Algorithm prefixAverages1(X, n)                      # operations
    Input array X of n integers
    Output array A of prefix averages of X
    A ← new array of n integers                      n
    for i ← 0 to n − 1 do                            n
        s ← X[0]                                     n
        for j ← 1 to i do                            1 + 2 + … + (n − 1)
            s ← s + X[j]                             1 + 2 + … + (n − 1)
        A[i] ← s / (i + 1)                           n
    return A                                         1

Arithmetic Progression

- The running time of prefixAverages1 is O(1 + 2 + … + n)
- The sum of the first n integers is n(n + 1)/2
  - There is a simple visual proof of this fact
- Thus, algorithm prefixAverages1 runs in O(n^2) time

Prefix Averages (Linear)

- The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverages2(X, n)                      # operations
    Input array X of n integers
    Output array A of prefix averages of X
    A ← new array of n integers                      n
    s ← 0                                            1
    for i ← 0 to n − 1 do                            n
        s ← s + X[i]                                 n
        A[i] ← s / (i + 1)                           n
    return A                                         1

- Algorithm prefixAverages2 runs in O(n) time
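A Java rendering of the two prefix-average algorithms makes the contrast concrete (a sketch following the pseudocode above; returning double[] is an assumption, since the slides leave the element type of A unspecified):

public class PrefixAverages {
    // Quadratic: recomputes each prefix sum from scratch.
    public static double[] prefixAverages1(int[] X, int n) {
        double[] A = new double[n];
        for (int i = 0; i < n; i++) {
            double s = X[0];
            for (int j = 1; j <= i; j++)
                s = s + X[j];
            A[i] = s / (i + 1);
        }
        return A;
    }

    // Linear: maintains a running sum across iterations.
    public static double[] prefixAverages2(int[] X, int n) {
        double[] A = new double[n];
        double s = 0;
        for (int i = 0; i < n; i++) {
            s = s + X[i];
            A[i] = s / (i + 1);
        }
        return A;
    }
}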

Math you need to Review

- Summations (Sec. 1.3.1)
- Logarithms and Exponents (Sec. 1.3.2)
  - properties of logarithms:
      log_b(xy) = log_b x + log_b y
      log_b(x/y) = log_b x − log_b y
      log_b x^a = a·log_b x
      log_b a = log_x a / log_x b
  - properties of exponentials:
      a^(b+c) = a^b · a^c
      a^(bc) = (a^b)^c
      a^b / a^c = a^(b−c)
      b = a^(log_a b)
      b^c = a^(c·log_a b)
- Proof techniques (Sec. 1.3.3)
- Basic probability (Sec. 1.3.4)

Relatives of Big-Oh

- big-Omega
  - f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
- big-Theta
  - f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n0 ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for n ≥ n0
- little-oh
  - f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n0
- little-omega
  - f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0

Intuition for Asymptotic Notation

- Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
- big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
- big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
- little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
- little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)

Example Uses of the Relatives of Big-Oh

- 5n^2 is Ω(n^2)
  - f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  - let c = 5 and n0 = 1
- 5n^2 is Ω(n)
  - f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  - let c = 1 and n0 = 1
- 5n^2 is ω(n)
  - f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  - need 5n0^2 ≥ c·n0; given c, the n0 that satisfies this is n0 ≥ c/5 ≥ 0

Elementary Data Structures

Stacks, Queues, & Lists
Amortized analysis
Trees

The Stack ADT (§2.1.1)

- The Stack ADT stores arbitrary objects
- Insertions and deletions follow the last-in first-out scheme
- Think of a spring-loaded plate dispenser
- Main stack operations:
  - push(object): inserts an element
  - object pop(): removes and returns the last inserted element
- Auxiliary stack operations:
  - object top(): returns the last inserted element without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored

Applications of Stacks

- Direct applications
  - Page-visited history in a Web browser
  - Undo sequence in a text editor
  - Chain of method calls in the Java Virtual Machine or C++ runtime environment
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Array-based Stack (§2.1.1)

- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable t keeps track of the index of the top element (size is t + 1)

Algorithm push(o)
    if t = S.length − 1 then
        throw FullStackException
    else
        t ← t + 1
        S[t] ← o

Algorithm pop()
    if isEmpty() then
        throw EmptyStackException
    else
        t ← t − 1
        return S[t + 1]

Growable Array-based Stack (§1.5)

- In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- How large should the new array be?
  - incremental strategy: increase the size by a constant c
  - doubling strategy: double the size

Algorithm push(o)
    if t = S.length − 1 then
        A ← new array of size …
        for i ← 0 to t do
            A[i] ← S[i]
        S ← A
    t ← t + 1
    S[t] ← o

Comparison of the Strategies

- We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size 1
- We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n
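A minimal Java sketch of the doubling strategy (assuming fields S and t as in the slides' array-based stack; the initial capacity of 1 matches the analysis that follows):

private Object[] S = new Object[1];
private int t = -1;   // index of the top element

public void push(Object o) {
    if (t == S.length - 1) {                  // array full: double its size
        Object[] A = new Object[2 * S.length];
        for (int i = 0; i <= t; i++)
            A[i] = S[i];
        S = A;
    }
    t = t + 1;
    S[t] = o;
}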

Analysis of the Incremental Strategy

- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
  n + c + 2c + 3c + 4c + … + kc = n + c(1 + 2 + 3 + … + k) = n + ck(k + 1)/2
- Since c is a constant, T(n) is O(n + k^2), i.e., O(n^2)
- The amortized time of a push operation is O(n)

Direct Analysis of the Doubling Strategy

- We replace the array k = log2 n times
- The total time T(n) of a series of n push operations is proportional to
  n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1   (geometric series)
- T(n) is O(n)
- The amortized time of a push operation is O(1)

Accounting Method Analysis of the Doubling Strategy

- The accounting method determines the amortized running time with a system of credits and debits
- We view a computer as a coin-operated device requiring 1 cyber-dollar for a constant amount of computing.
- We set up a scheme for charging operations. This is known as an amortization scheme.
- The scheme must always give us enough money to pay for the actual cost of the operation.
- The total cost of the series of operations is no more than the total amount charged.
- (amortized time) ≤ (total $ charged) / (# operations)

Amortization Scheme for the Doubling Strategy

- Consider again the k phases, where each phase consists of twice as many pushes as the one before.
- At the end of a phase we must have saved enough to pay for the array-growing push of the next phase.
- At the end of phase i we want to have saved i cyber-dollars, to pay for the array growth at the beginning of the next phase.

[Figure: arrays of size 8 and 16; each cell pushed in the second half of a phase carries two saved cyber-dollars ($)]

- We charge $3 for a push. The $2 saved for a regular push are stored in the second half of the array. Thus, we will have 2(i/2) = i cyber-dollars saved at the end of phase i.
- Therefore, each push runs in O(1) amortized time; n pushes run in O(n) time.

The Queue ADT (§2.1.2)

- The Queue ADT stores arbitrary objects
- Insertions and deletions follow the first-in first-out scheme
- Insertions are at the rear of the queue and removals are at the front of the queue
- Main queue operations:
  - enqueue(object): inserts an element at the end of the queue
  - object dequeue(): removes and returns the element at the front of the queue
- Auxiliary queue operations:
  - object front(): returns the element at the front without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
- Exceptions
  - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

Applications of Queues

- Direct applications
  - Waiting lines
  - Access to shared resources (e.g., printer)
  - Multiprogramming
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Singly Linked List

- A singly linked list is a concrete data structure consisting of a sequence of nodes
- Each node stores
  - element
  - link to the next node

[Figure: node with elem and next fields; a chain of nodes pointing to the elements]

Queue with a Singly Linked List

- We can implement a queue with a singly linked list
  - The front element is stored at the first node
  - The rear element is stored at the last node
- The space used is O(n) and each operation of the Queue ADT takes O(1) time
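A Java sketch of the linked-list queue (the Node class and field names are illustrative, not from the slides):

public class LinkedQueue {
    private static class Node {
        Object elem;
        Node next;
    }
    private Node front, rear;   // first and last nodes
    private int size;

    public void enqueue(Object o) {        // O(1): append at the rear
        Node node = new Node();
        node.elem = o;
        if (rear == null) front = node;
        else rear.next = node;
        rear = node;
        size++;
    }

    public Object dequeue() {              // O(1): remove at the front
        if (front == null)
            throw new RuntimeException("EmptyQueueException");
        Object o = front.elem;
        front = front.next;
        if (front == null) rear = null;
        size--;
        return o;
    }

    public int size() { return size; }
}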

List ADT (§2.2.2)

- The List ADT models a sequence of positions storing arbitrary objects
- It allows for insertion and removal in the middle
- Query methods:
  - isFirst(p), isLast(p)
- Accessor methods:
  - first(), last()
  - before(p), after(p)
- Update methods:
  - replaceElement(p, o), swapElements(p, q)
  - insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o)
  - remove(p)

Doubly Linked List

- A doubly linked list provides a natural implementation of the List ADT
- Nodes implement Position and store:
  - element
  - link to the previous node
  - link to the next node
- Special trailer and header nodes

[Figure: node with prev, elem, and next fields; header and trailer nodes bracketing the element nodes/positions]

Trees (§2.3)

- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments

[Figure: organization chart for ComputersRUs — children Sales, Manufacturing, R&D; Sales has US and International (Europe, Asia, Canada); Manufacturing has Laptops and Desktops]

Tree ADT (§2.3.1)

- We use positions to abstract nodes
- Generic methods:
  - integer size()
  - boolean isEmpty()
  - objectIterator elements()
  - positionIterator positions()
- Accessor methods:
  - position root()
  - position parent(p)
  - positionIterator children(p)
- Query methods:
  - boolean isInternal(p)
  - boolean isExternal(p)
  - boolean isRoot(p)
- Update methods:
  - swapElements(p, q)
  - object replaceElement(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal (§2.3.2)

- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

Algorithm preOrder(v)
    visit(v)
    for each child w of v
        preOrder(w)

[Figure: preorder traversal of the document "Make Money Fast!" — 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery), References]

Postorder Traversal (§2.3.2)

- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
    for each child w of v
        postOrder(w)
    visit(v)

[Figure: postorder traversal of directory cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K), and todo.txt 1K]

Amortized Analysis of Tree Traversal

- Time taken in preorder or postorder traversal of an n-node tree is proportional to the sum, taken over each node v in the tree, of the time needed for the recursive call for v.
  - The call for v costs $(c_v + 1), where c_v is the number of children of v
- For the call for v, charge one cyber-dollar to v and charge one cyber-dollar to each child of v.
- Each node (except the root) gets charged twice: once for its own call and once for its parent's call.
- Therefore, traversal time is O(n).

Binary Trees (§2.3.3)

- A binary tree is a tree with the following properties:
  - Each internal node has two children
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching

Arithmetic Expression Tree

- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

Decision Tree

- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision

[Figure: decision tree — "Want a fast meal?": Yes → "How about coffee?" (Yes → Starbucks, No → In N Out); No → "On expense account?" (Yes → Antoine's, No → Denny's)]
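A recursive Java sketch of the two traversals (the TreeNode class with a children list is an illustrative stand-in for the Tree ADT's positions):

import java.util.ArrayList;
import java.util.List;

public class Traversals {
    public static class TreeNode {
        Object element;
        List<TreeNode> children = new ArrayList<>();
    }

    // Preorder: visit a node before its descendants.
    public static void preOrder(TreeNode v) {
        System.out.println(v.element);        // "visit" = print here
        for (TreeNode w : v.children)
            preOrder(w);
    }

    // Postorder: visit a node after its descendants.
    public static void postOrder(TreeNode v) {
        for (TreeNode w : v.children)
            postOrder(w);
        System.out.println(v.element);
    }
}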

Properties of Binary Trees

- Notation
  - n: number of nodes
  - e: number of external nodes
  - i: number of internal nodes
  - h: height
- Properties:
  - e = i + 1
  - n = 2e − 1
  - h ≤ i
  - h ≤ (n − 1)/2
  - e ≤ 2^h
  - h ≥ log2 e
  - h ≥ log2 (n + 1) − 1

Inorder Traversal

- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

Algorithm inOrder(v)
    if isInternal(v)
        inOrder(leftChild(v))
    visit(v)
    if isInternal(v)
        inOrder(rightChild(v))

Printing Arithmetic Expressions

- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree
- Example output: ((2 × (a − 1)) + (3 × b))

Algorithm printExpression(v)
    if isInternal(v)
        print("(")
        inOrder(leftChild(v))
    print(v.element())
    if isInternal(v)
        inOrder(rightChild(v))
        print(")")

Euler Tour Traversal

- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)

Linked Data Structure for Representing Trees (§2.3.4)

- A node is represented by an object storing
  - Element
  - Parent node
  - Sequence of children nodes
- Node objects implement the Position ADT

Linked Data Structure for Binary Trees

- A node is represented by an object storing
  - Element
  - Parent node
  - Left child node
  - Right child node
- Node objects implement the Position ADT

Array-Based Representation of Binary Trees

- Nodes are stored in an array
- Let rank(node) be defined as follows:
  - rank(root) = 1
  - if node is the left child of parent(node), rank(node) = 2·rank(parent(node))
  - if node is the right child of parent(node), rank(node) = 2·rank(parent(node)) + 1

[Figure: binary tree with node A at rank 1 and descendants B, E, C, G, H stored at ranks 3, 4, 5, 7, 10, 11]
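The rank rule translates directly into index arithmetic; a tiny Java sketch (helper names are illustrative):

// Navigation in the array-based representation (ranks are 1-based).
public static int leftChild(int rank)  { return 2 * rank; }
public static int rightChild(int rank) { return 2 * rank + 1; }
public static int parent(int rank)     { return rank / 2; }   // integer division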

Stacks

Outline and Reading

- The Stack ADT (§2.1.1)
- Applications of Stacks (§2.1.1)
- Array-based implementation (§2.1.1)
- Growable array-based stack (§1.5)

Abstract Data Types (ADTs)

- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies:
  - Data stored
  - Operations on the data
  - Error conditions associated with operations
- Example: ADT modeling a simple stock trading system
  - The data stored are buy/sell orders
  - The operations supported are
    - order buy(stock, shares, price)
    - order sell(stock, shares, price)
    - void cancel(order)
  - Error conditions:
    - Buy/sell a nonexistent stock
    - Cancel a nonexistent order

The Stack ADT

- The Stack ADT stores arbitrary objects
- Insertions and deletions follow the last-in first-out scheme
- Think of a spring-loaded plate dispenser
- Main stack operations:
  - push(object): inserts an element
  - object pop(): removes and returns the last inserted element
- Auxiliary stack operations:
  - object top(): returns the last inserted element without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored

Exceptions

- Attempting the execution of an operation of an ADT may sometimes cause an error condition, called an exception
- Exceptions are said to be "thrown" by an operation that cannot be executed
- In the Stack ADT, operations pop and top cannot be performed if the stack is empty
- Attempting the execution of pop or top on an empty stack throws an EmptyStackException

Applications of Stacks

- Direct applications
  - Page-visited history in a Web browser
  - Undo sequence in a text editor
  - Chain of method calls in the Java Virtual Machine
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Method Stack in the JVM

- The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack
- When a method is called, the JVM pushes on the stack a frame containing
  - Local variables and return value
  - Program counter, keeping track of the statement being executed
- When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack

main() {
    int i = 5;
    foo(i);
}

foo(int j) {
    int k;
    k = j + 1;
    bar(k);
}

bar(int m) {
    …
}

[Figure: method stack with frames bar (PC = 1, m = 6), foo (PC = 3, j = 5, k = 6), main (PC = 2, i = 5)]

Performance and Limitations

- Performance
  - Let n be the number of elements in the stack
  - The space used is O(n)
  - Each operation runs in time O(1)
- Limitations
  - The maximum size of the stack must be defined a priori and cannot be changed
  - Trying to push a new element into a full stack causes an implementation-specific exception

Array-based Stack (cont.)

- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable keeps track of the index of the top element
- The array storing the stack elements may become full
- A push operation will then throw a FullStackException
  - Limitation of the array-based implementation
  - Not intrinsic to the Stack ADT

Algorithm push(o)
    if t = S.length − 1 then
        throw FullStackException
    else
        t ← t + 1
        S[t] ← o

Algorithm pop()
    if isEmpty() then
        throw EmptyStackException
    else
        t ← t − 1
        return S[t + 1]

Algorithm size()
    return t + 1

Computing Spans

- We show how to use a stack as an auxiliary data structure in an algorithm
- Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] and such that X[j] ≤ X[i]
- Spans have applications to financial analysis
  - E.g., stock at 52-week high

[Figure: bar chart of X with the span S[i] of each element]

Quadratic Algorithm

    X   6   3   4   5   2
    S   1   1   2   3   1

Algorithm spans1(X, n)                            # operations
    Input array X of n integers
    Output array S of spans of X
    S ← new array of n integers                   n
    for i ← 0 to n − 1 do                         n
        s ← 1                                     n
        while s ≤ i ∧ X[i − s] ≤ X[i] do          1 + 2 + … + (n − 1)
            s ← s + 1                             1 + 2 + … + (n − 1)
        S[i] ← s                                  n
    return S                                      1

- Algorithm spans1 runs in O(n^2) time

Computing Spans with a Stack

- We keep in a stack the indices of the elements visible when looking back
- We scan the array from left to right
  - Let i be the current index
  - We pop indices from the stack until we find index j such that X[i] < X[j]
  - We set S[i] ← i − j
  - We push i onto the stack

[Figure: bar chart of the array; the stack holds the indices of the elements visible when looking back]

Linear Algorithm

- Each index of the array
  - Is pushed into the stack exactly once
  - Is popped from the stack at most once
- The statements in the while-loop are executed at most n times
- Algorithm spans2 runs in O(n) time

Algorithm spans2(X, n)                                 # operations
    S ← new array of n integers                        n
    A ← new empty stack                                1
    for i ← 0 to n − 1 do                              n
        while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do    n
            j ← A.pop()                                n
        if A.isEmpty() then                            n
            S[i] ← i + 1                               n
        else
            S[i] ← i − j                               n
        A.push(i)                                      n
    return S                                           1

Growable Array-based Stack

- In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- How large should the new array be?
  - incremental strategy: increase the size by a constant c
  - doubling strategy: double the size

Algorithm push(o)
    if t = S.length − 1 then
        A ← new array of size …
        for i ← 0 to t do
            A[i] ← S[i]
        S ← A
    t ← t + 1
    S[t] ← o

Comparison of the Strategies

- We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size 1
- We call amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n

Incremental Strategy Analysis

- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
  n + c + 2c + 3c + 4c + … + kc = n + c(1 + 2 + 3 + … + k) = n + ck(k + 1)/2
- Since c is a constant, T(n) is O(n + k^2), i.e., O(n^2)
- The amortized time of a push operation is O(n)

Doubling Strategy Analysis

- We replace the array k = log2 n times
- The total time T(n) of a series of n push operations is proportional to
  n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1   (geometric series)
- T(n) is O(n)
- The amortized time of a push operation is O(1)
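A Java sketch of the linear spans algorithm, using java.util.ArrayDeque as the auxiliary stack in place of the slides' own Stack ADT:

import java.util.ArrayDeque;
import java.util.Deque;

public class Spans {
    public static int[] spans2(int[] X, int n) {
        int[] S = new int[n];
        Deque<Integer> A = new ArrayDeque<>();    // stack of indices
        for (int i = 0; i < n; i++) {
            // Pop indices whose elements are not larger than X[i].
            while (!A.isEmpty() && X[A.peek()] <= X[i])
                A.pop();
            S[i] = A.isEmpty() ? i + 1 : i - A.peek();
            A.push(i);                             // push the current index
        }
        return S;
    }
}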

Stack Interface in Java

- Java interface corresponding to our Stack ADT
- Requires the definition of class EmptyStackException
- Different from the built-in Java class java.util.Stack

public interface Stack {
    public int size();
    public boolean isEmpty();
    public Object top()
        throws EmptyStackException;
    public void push(Object o);
    public Object pop()
        throws EmptyStackException;
}

Array-based Stack in Java

public class ArrayStack implements Stack {

    // holds the stack elements
    private Object S[];

    // index to top element
    private int top = -1;

    // constructor
    public ArrayStack(int capacity) {
        S = new Object[capacity];
    }

    public Object pop()
            throws EmptyStackException {
        if (isEmpty())
            throw new EmptyStackException
                ("Empty stack: cannot pop");
        Object temp = S[top];
        // facilitates garbage collection
        S[top] = null;
        top = top - 1;
        return temp;
    }
    …
}
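The slides elide the remaining ArrayStack methods; a push consistent with the pop shown above might look like this (a sketch — the companion FullStackException class is not shown in the slides, so a plain RuntimeException stands in):

public void push(Object o) {
    if (top == S.length - 1)                  // fixed-capacity array is full
        throw new RuntimeException("Full stack: cannot push");
    top = top + 1;
    S[top] = o;
}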

Vectors

Outline and Reading

- The Vector ADT (§2.2.1)
- Array-based implementation (§2.2.1)

The Vector ADT

- The Vector ADT extends the notion of array by storing a sequence of arbitrary objects
- An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it)
- An exception is thrown if an incorrect rank is specified (e.g., a negative rank)
- Main vector operations:
  - object elemAtRank(integer r): returns the element at rank r without removing it
  - object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element
  - insertAtRank(integer r, object o): inserts a new element o to have rank r
  - object removeAtRank(integer r): removes and returns the element at rank r
- Additional operations: size() and isEmpty()

Applications of Vectors

- Direct applications
  - Sorted collection of objects (elementary database)
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Array-based Vector

- Use an array V of size N
- A variable n keeps track of the size of the vector (number of elements stored)
- Operation elemAtRank(r) is implemented in O(1) time by returning V[r]

Insertion

- In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n − r elements V[r], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time

[Figure: array V before the shift, during the forward shift, and after o is stored at rank r]

Deletion

- In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n − r − 1 elements V[r + 1], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time

[Figure: array V before the shift, during the backward shift, and after the hole at rank r is filled]

Performance

- In the array-based implementation of a Vector
  - The space used by the data structure is O(n)
  - size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
  - insertAtRank and removeAtRank run in O(n) time
- If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time
- In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
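A Java sketch of the shifting logic (assuming fields V and n as in the slides; bounds checking omitted for brevity):

// Shift forward to open a hole at rank r, then store o there: O(n − r) time.
public void insertAtRank(int r, Object o) {
    for (int i = n - 1; i >= r; i--)
        V[i + 1] = V[i];
    V[r] = o;
    n = n + 1;
}

// Shift backward to fill the hole left at rank r: O(n − r − 1) time.
public Object removeAtRank(int r) {
    Object o = V[r];
    for (int i = r; i < n - 1; i++)
        V[i] = V[i + 1];
    n = n - 1;
    return o;
}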

Queues

Outline and Reading

- The Queue ADT (§2.1.2)
- Implementation with a circular array (§2.1.2)
- Growable array-based queue
- Queue interface in Java

The Queue ADT

- The Queue ADT stores arbitrary objects
- Insertions and deletions follow the first-in first-out scheme
- Insertions are at the rear of the queue and removals are at the front of the queue
- Main queue operations:
  - enqueue(object): inserts an element at the end of the queue
  - object dequeue(): removes and returns the element at the front of the queue
- Auxiliary queue operations:
  - object front(): returns the element at the front without removing it
  - integer size(): returns the number of elements stored
  - boolean isEmpty(): indicates whether no elements are stored
- Exceptions
  - Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

Applications of Queues

- Direct applications
  - Waiting lists, bureaucracy
  - Access to shared resources (e.g., printer)
  - Multiprogramming
- Indirect applications
  - Auxiliary data structure for algorithms
  - Component of other data structures

Array-based Queue

- Use an array of size N in a circular fashion
- Two variables keep track of the front and rear
  - f: index of the front element
  - r: index immediately past the rear element
- Array location r is kept empty

[Figure: circular array Q in the normal configuration (f ≤ r) and in the wrapped-around configuration (r < f)]

Queue Operations

- We use the modulo operator (remainder of division)

Algorithm size()
    return (N − f + r) mod N

Algorithm isEmpty()
    return (f = r)

Queue Operations (cont.)

- Operation enqueue throws an exception if the array is full
- This exception is implementation-dependent

Algorithm enqueue(o)
    if size() = N − 1 then
        throw FullQueueException
    else
        Q[r] ← o
        r ← (r + 1) mod N

- Operation dequeue throws an exception if the queue is empty
- This exception is specified in the queue ADT

Algorithm dequeue()
    if isEmpty() then
        throw EmptyQueueException
    else
        o ← Q[f]
        f ← (f + 1) mod N
        return o

Growable Array-based Queue

- In an enqueue operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
- Similar to what we did for an array-based stack
- The enqueue operation has amortized running time
  - O(n) with the incremental strategy
  - O(1) with the doubling strategy

Queue Interface in Java

- Java interface corresponding to our Queue ADT
- Requires the definition of class EmptyQueueException
- No corresponding built-in Java class

public interface Queue {
    public int size();
    public boolean isEmpty();
    public Object front()
        throws EmptyQueueException;
    public void enqueue(Object o);
    public Object dequeue()
        throws EmptyQueueException;
}
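A compact Java sketch of the circular-array queue (field names follow the slides; a RuntimeException stands in for the exception classes above):

public class ArrayQueue {
    private Object[] Q;
    private int f = 0, r = 0;   // front index, index past the rear

    public ArrayQueue(int N) { Q = new Object[N]; }

    public int size()        { return (Q.length - f + r) % Q.length; }
    public boolean isEmpty() { return f == r; }

    public void enqueue(Object o) {
        if (size() == Q.length - 1)            // location r stays empty
            throw new RuntimeException("FullQueueException");
        Q[r] = o;
        r = (r + 1) % Q.length;
    }

    public Object dequeue() {
        if (isEmpty())
            throw new RuntimeException("EmptyQueueException");
        Object o = Q[f];
        f = (f + 1) % Q.length;
        return o;
    }
}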

Lists and Sequences

Outline and Reading

- Singly linked list
- Position ADT and List ADT (§2.2.2)
- Doubly linked list (§2.2.2)
- Sequence ADT (§2.2.3)
- Implementations of the sequence ADT (§2.2.3)
- Iterators (§2.2.3)

Singly Linked List

- A singly linked list is a concrete data structure consisting of a sequence of nodes
- Each node stores
  - element
  - link to the next node

[Figure: node with elem and next fields; a chain of nodes pointing to the elements]

Stack with a Singly Linked List

- We can implement a stack with a singly linked list
- The top element is stored at the first node of the list
- The space used is O(n) and each operation of the Stack ADT takes O(1) time

Queue with a Singly Linked List

- We can implement a queue with a singly linked list
  - The front element is stored at the first node
  - The rear element is stored at the last node
- The space used is O(n) and each operation of the Queue ADT takes O(1) time

Position ADT

- The Position ADT models the notion of place within a data structure where a single object is stored
- It gives a unified view of diverse ways of storing data, such as
  - a cell of an array
  - a node of a linked list
- Just one method:
  - object element(): returns the element stored at the position

List ADT

- The List ADT models a sequence of positions storing arbitrary objects
- It establishes a before/after relation between positions
- Generic methods:
  - size(), isEmpty()
- Query methods:
  - isFirst(p), isLast(p)
- Accessor methods:
  - first(), last()
  - before(p), after(p)
- Update methods:
  - replaceElement(p, o), swapElements(p, q)
  - insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o)
  - remove(p)

Doubly Linked List

- A doubly linked list provides a natural implementation of the List ADT
- Nodes implement Position and store:
  - element
  - link to the previous node
  - link to the next node
- Special trailer and header nodes

[Figure: node with prev, elem, and next fields; header and trailer nodes bracketing the element nodes/positions]

Insertion

- We visualize operation insertAfter(p, X), which returns position q

[Figure: list A, B, C; element X is inserted after position p, yielding new position q]

Deletion

- We visualize remove(p), where p = last()

[Figure: list A, B, C, D; the node at position p (storing D) is unlinked and removed]

Performance

- In the implementation of the List ADT by means of a doubly linked list
  - The space used by a list with n elements is O(n)
  - The space used by each position of the list is O(1)
  - All the operations of the List ADT run in O(1) time
  - Operation element() of the Position ADT runs in O(1) time

Sequence ADT

- The Sequence ADT is the union of the Vector and List ADTs
- Elements accessed by
  - Rank, or
  - Position
- Generic methods:
  - size(), isEmpty()
- Vector-based methods:
  - elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)
- List-based methods:
  - first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
- Bridge methods:
  - atRank(r), rankOf(p)

Applications of Sequences

- The Sequence ADT is a basic, general-purpose data structure for storing an ordered collection of elements
- Direct applications:
  - Generic replacement for stack, queue, vector, or list
  - Small database (e.g., address book)
- Indirect applications:
  - Building block of more complex data structures

Array-based Implementation

- We use a circular array storing positions
- A position object stores:
  - Element
  - Rank
- Indices f and l keep track of first and last positions

[Figure: circular array S of position objects, each pointing to its element, with indices f and l]

Sequence Implementations

Operation                         Array   List
size, isEmpty                     1       1
atRank, rankOf, elemAtRank        1       n
first, last, before, after        1       1
replaceElement, swapElements      1       1
replaceAtRank                     1       n
insertAtRank, removeAtRank        n       n
insertFirst, insertLast           1       1
insertAfter, insertBefore         n       1
remove                            n       1

Iterators

- An iterator abstracts the process of scanning through a collection of elements
- Methods of the ObjectIterator ADT:
  - object object()
  - boolean hasNext()
  - object nextObject()
  - reset()
- Extends the concept of Position by adding a traversal capability
- Implementation with an array or singly linked list
- An iterator is typically associated with another data structure
- We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:
  - ObjectIterator elements()
- Two notions of iterator:
  - snapshot: freezes the contents of the data structure at a given time
  - dynamic: follows changes to the data structure

Trees

Outline and Reading

- Tree ADT (§2.3.1)
- Preorder and postorder traversals (§2.3.2)
- BinaryTree ADT (§2.3.3)
- Inorder traversal (§2.3.3)
- Euler Tour traversal (§2.3.3)
- Template method pattern
- Data structures for trees (§2.3.4)
- Java implementation (http://jdsl.org)

What is a Tree

- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments

[Figure: organization chart for ComputersRUs — children Sales, Manufacturing, R&D; Sales has US and International (Europe, Asia, Canada); Manufacturing has Laptops and Desktops]

Tree Terminology

- Root: node without parent (A)
- Internal node: node with at least one child (A, B, C, F)
- External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
- Ancestors of a node: parent, grandparent, grand-grandparent, etc.
- Depth of a node: number of ancestors
- Height of a tree: maximum depth of any node (3)
- Descendant of a node: child, grandchild, grand-grandchild, etc.
- Subtree: tree consisting of a node and its descendants

[Figure: sample tree rooted at A, with the subtree of one internal node highlighted]

Tree ADT

- We use positions to abstract nodes
- Generic methods:
  - integer size()
  - boolean isEmpty()
  - objectIterator elements()
  - positionIterator positions()
- Accessor methods:
  - position root()
  - position parent(p)
  - positionIterator children(p)
- Query methods:
  - boolean isInternal(p)
  - boolean isExternal(p)
  - boolean isRoot(p)
- Update methods:
  - swapElements(p, q)
  - object replaceElement(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal

- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

Algorithm preOrder(v)
    visit(v)
    for each child w of v
        preOrder(w)

[Figure: preorder traversal of the document "Make Money Fast!" — 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery), References]

Postorder Traversal

- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
    for each child w of v
        postOrder(w)
    visit(v)

[Figure: postorder traversal of directory cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K), and todo.txt 1K]

Binary Tree

- A binary tree is a tree with the following properties:
  - Each internal node has two children
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching

Arithmetic Expression Tree

- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

Decision Tree

- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision

[Figure: decision tree — "Want a fast meal?": Yes → "How about coffee?" (Yes → Starbucks, No → Spike's); No → "On expense account?" (Yes → Al Forno, No → Café Paragon)]

Properties of Binary Trees

- Notation
  - n: number of nodes
  - e: number of external nodes
  - i: number of internal nodes
  - h: height
- Properties:
  - e = i + 1
  - n = 2e − 1
  - h ≤ i
  - h ≤ (n − 1)/2
  - e ≤ 2^h
  - h ≥ log2 e
  - h ≥ log2 (n + 1) − 1

BinaryTree ADT

- The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT
- Additional methods:
  - position leftChild(p)
  - position rightChild(p)
  - position sibling(p)
- Update methods may be defined by data structures implementing the BinaryTree ADT

Inorder Traversal

- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

Algorithm inOrder(v)
    if isInternal(v)
        inOrder(leftChild(v))
    visit(v)
    if isInternal(v)
        inOrder(rightChild(v))

Print Arithmetic Expressions

- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree
- Example output: ((2 × (a − 1)) + (3 × b))

Algorithm printExpression(v)
    if isInternal(v)
        print("(")
        inOrder(leftChild(v))
    print(v.element())
    if isInternal(v)
        inOrder(rightChild(v))
        print(")")

Evaluate Arithmetic Expressions

- Specialization of a postorder traversal
  - recursive method returning the value of a subtree
  - when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
    if isExternal(v)
        return v.element()
    else
        x ← evalExpr(leftChild(v))
        y ← evalExpr(rightChild(v))
        ◊ ← operator stored at v
        return x ◊ y

Euler Tour Traversal

- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)

Template Method Pattern

- Generic algorithm that can be specialized by redefining certain steps
- Implemented by means of an abstract Java class
- Visit methods that can be redefined by subclasses
- Template method eulerTour
  - Recursively called on the left and right children
  - A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour

Specializations of EulerTour

public abstract class EulerTour {
    protected BinaryTree tree;
    protected void visitExternal(Position p, Result r) { }
    protected void visitLeft(Position p, Result r) { }
    protected void visitBelow(Position p, Result r) { }
    protected void visitRight(Position p, Result r) { }
    protected Object eulerTour(Position p) {
        Result r = new Result();
        if (tree.isExternal(p)) { visitExternal(p, r); }
        else {
            visitLeft(p, r);
            r.leftResult = eulerTour(tree.leftChild(p));
            visitBelow(p, r);
            r.rightResult = eulerTour(tree.rightChild(p));
            visitRight(p, r);
        }
        return r.finalResult;
    }
}

- We show how to specialize class EulerTour to evaluate an arithmetic expression
- Assumptions
  - External nodes store Integer objects
  - Internal nodes store Operator objects supporting method operation(Integer, Integer)

public class EvaluateExpression extends EulerTour {
    protected void visitExternal(Position p, Result r) {
        r.finalResult = (Integer) p.element();
    }
    protected void visitRight(Position p, Result r) {
        Operator op = (Operator) p.element();
        r.finalResult = op.operation(
            (Integer) r.leftResult,
            (Integer) r.rightResult
        );
    }
}

Trees

6/8/2002 2:15 PM

Data Structure for Trees


A node is represented by
an object storing

Element
Parent node
Sequence of children
nodes

A node is represented
by an object storing

6/8/2002 2:15 PM

Tree interface
BinaryTree interface
extending Tree
Classes implementing Tree
and BinaryTree and
providing

Trees

19

expandExternal(v)
removeAboveExternal(w)

6/8/2002 2:15 PM

Trees

Trees

20

JDSL was developed at


Browns Center for Geometric
Computing
See the JDSL documentation
and tutorials at http://jdsl.org

InspectableBinaryTree
InspectableTree
BinaryTree
Tree

InspectableTree

Inspectable versions of the


interfaces do not have
update methods
InspectableBinaryTree
Tree classes in JDSL

removeAboveExternal(w)

JDSL is the Library of Data


Structures in Java
Tree interfaces in JDSL

6/8/2002 2:15 PM

Constructors
Update methods
Print methods

Trees in JDSL

expandExternal(v)

Examples of updates for


binary trees

Java Implementation

Node objects implement


the Position ADT

D
C

Element
Parent node
Left child node
Right child node

Node objects implement


the Position ADT

Data Structure for Binary Trees

21

NodeBinaryTree
NodeTree

6/8/2002 2:15 PM

Tree

BinaryTree
Trees

22

Heaps and Priority Queues

Priority Queue ADT (§2.4.1)

- A priority queue stores a collection of items
- An item is a pair (key, element)
- Main methods of the Priority Queue ADT
  - insertItem(k, o): inserts an item with key k and element o
  - removeMin(): removes the item with smallest key and returns its element
- Additional methods
  - minKey(): returns, but does not remove, the smallest key of an item
  - minElement(): returns, but does not remove, the element of an item with smallest key
  - size(), isEmpty()
- Applications:
  - Standby flyers
  - Auctions
  - Stock market

Total Order Relation

- Keys in a priority queue can be arbitrary objects on which an order is defined
- Two distinct items in a priority queue can have the same key
- Mathematical concept of total order relation ≤
  - Reflexive property: x ≤ x
  - Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
  - Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

Comparator ADT (§2.4.1)

- A comparator encapsulates the action of comparing two objects according to a given total order relation
- A generic priority queue uses an auxiliary comparator
- The comparator is external to the keys being compared
- When the priority queue needs to compare two keys, it uses its comparator
- Methods of the Comparator ADT, all with Boolean return type
  - isLessThan(x, y)
  - isLessThanOrEqualTo(x, y)
  - isEqualTo(x, y)
  - isGreaterThan(x, y)
  - isGreaterThanOrEqualTo(x, y)
  - isComparable(x)

Sorting with a Priority Queue (§2.4.2)

- We can use a priority queue to sort a set of comparable elements
  1. Insert the elements one by one with a series of insertItem(e, e) operations
  2. Remove the elements in sorted order with a series of removeMin() operations
- The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C)
    Input sequence S, comparator C for the elements of S
    Output sequence S sorted in increasing order according to C
    P ← priority queue with comparator C
    while ¬S.isEmpty()
        e ← S.remove(S.first())
        P.insertItem(e, e)
    while ¬P.isEmpty()
        e ← P.removeMin()
        S.insertLast(e)

Sequence-based Priority Queue

- Implementation with an unsorted list
  - Performance:
    - insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
    - removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key
- Implementation with a sorted list
  - Performance:
    - insertItem takes O(n) time since we have to find the place where to insert the item
    - removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence

Selection-Sort

- Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence
- Running time of Selection-sort:
  - Inserting the elements into the priority queue with n insertItem operations takes O(n) time
  - Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n
- Selection-sort runs in O(n^2) time

Insertion-Sort

- Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence
- Running time of Insertion-sort:
  - Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
  - Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time
- Insertion-sort runs in O(n^2) time

What is a heap (§2.4.3)

- A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
  - Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
  - Complete Binary Tree: let h be the height of the heap
    - for i = 0, …, h − 1, there are 2^i nodes of depth i
    - at depth h − 1, the internal nodes are to the left of the external nodes
- The last node of a heap is the rightmost internal node of depth h − 1

[Figure: heap with root 2, children 5 and 6, and further keys 9 and 7; the last node is marked]

Height of a Heap (§2.4.3)

- Theorem: A heap storing n keys has height O(log n)
  Proof: (we apply the complete binary tree property)
  - Let h be the height of a heap storing n keys
  - Since there are 2^i keys at depth i = 0, …, h − 2 and at least one key at depth h − 1, we have n ≥ 1 + 2 + 4 + … + 2^(h−2) + 1
  - Thus, n ≥ 2^(h−1), i.e., h ≤ log n + 1

[Figure: complete binary tree with depths 0 to h − 1 and the count of keys at each depth]

Heaps and Priority Queues

- We can use a heap to implement a priority queue
- We store a (key, element) item at each internal node
- We keep track of the position of the last node
- For simplicity, we show only the keys in the pictures

[Figure: heap storing the items (2, Sue), (5, Pat), (6, Mark), (9, Jeff), (7, Anna)]

Insertion into a Heap (§2.4.3)

- Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap
- The insertion algorithm consists of three steps
  - Find the insertion node z (the new last node)
  - Store k at z and expand z into an internal node
  - Restore the heap-order property (discussed next)

[Figure: heap before and after the new key is stored at the insertion node z and z is expanded]

Heaps

4/5/2002 14:4

Upheap

Removal from a Heap (2.4.3)

After the insertion of a new key k, the heap-order property may be

Method removeMin of

violated
Algorithm upheap restores the heap-order property by swapping k
along an upward path from the insertion node
Upheap terminates when the key k reaches the root or a node
whose parent has a key smaller than or equal to k
Since a heap has height O(log n), upheap runs in O(log n) time

the priority queue ADT


corresponds to the
removal of the root key
from the heap
The removal algorithm
consists of three steps

Heaps and Priority Queues

13

Downheap

w
last node
7

Heaps and Priority Queues

14

Updating the Last Node

After replacing the root key with the key k of the last node, the

heap-order property may be violated


Algorithm downheap restores the heap-order property by
swapping key k along a downward path from the root
Upheap terminates when key k reaches a leaf or a node whose
children have keys greater than or equal to k
Since a heap has height O(log n), downheap runs in O(log n) time
7
5

Replace the root key with


the key of the last node w
Compress w and its
children into a leaf
Restore the heap-order
property (discussed next)

2
5

The insertion node can be found by traversing a path of O(log n)


nodes

Go up until a left child or the root is reached


If a left child is reached, go to the right child
Go down left until a leaf is reached

Similar algorithm for updating the last node after a removal

5
6

Heaps and Priority Queues

15

Heaps and Priority Queues

16

Heap-Sort (§2.4.4)

- Consider a priority queue with n items implemented by means of a heap
  - the space used is O(n)
  - methods insertItem and removeMin take O(log n) time
  - methods size, isEmpty, minKey, and minElement take O(1) time
- Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time
- The resulting algorithm is called heap-sort
- Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort

Vector-based Heap Implementation (§2.4.3)

- We can represent a heap with n keys by means of a vector of length n + 1
- For the node at rank i
  - the left child is at rank 2i
  - the right child is at rank 2i + 1
- Links between nodes are not explicitly stored
- The leaves are not represented
- The cell at rank 0 is not used
- Operation insertItem corresponds to inserting at rank n + 1
- Operation removeMin corresponds to removing at rank n
- Yields in-place heap-sort

Merging Two Heaps

- We are given two heaps and a key k
- We create a new heap with the root node storing k and with the two heaps as subtrees
- We perform downheap to restore the heap-order property

[Figure: heaps with roots 3 and 2 joined under a new root storing 7; downheap then brings 2 to the root]

Bottom-up Heap Construction (§2.4.3)

- We can construct a heap storing n given keys using a bottom-up construction with log n phases
- In phase i, pairs of heaps with 2^i − 1 keys are merged into heaps with 2^(i+1) − 1 keys

[Figure: two heaps with 2^i − 1 keys each, joined under a new root into a heap with 2^(i+1) − 1 keys]

Example

[Figures: bottom-up heap construction example — in each phase, pairs of heaps are merged under a new key and downheap restores heap order, until a single heap remains]

Heaps

4/5/2002 14:4

Analysis
We visualize the worst-case time of a downheap with a proxy path
that goes first right and then repeatedly goes left until the bottom
of the heap (this path may differ from the actual downheap path)
Since each node is traversed by at most two proxy paths, the total
number of nodes of the proxy paths is O(n)
Thus, bottom-up heap construction runs in O(n) time
Bottom-up heap construction is faster than n successive insertions
and speeds up the first phase of heap-sort

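A minimal sketch of the bottom-up construction in the vector-based representation, assuming the downheap helper sketched earlier; every rank above n/2 is already a one-key heap, so a single pass of downheap calls realizes the O(n) bound.

    // Bottom-up heap construction: downheap from rank n/2 down to
    // the root merges the two subheaps below each rank.
    static void buildHeap(int[] heap, int n) {
        for (int i = n / 2; i >= 1; i--)
            downheap(heap, i, n);
    }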

Priority Queues

Outline and Reading
- PriorityQueue ADT (2.4.1)
- Total order relation (2.4.1)
- Comparator ADT (2.4.1)
- Sorting with a priority queue (2.4.2)
- Selection-sort (2.4.2)
- Insertion-sort (2.4.2)

[Figure: a stock-market order book as a motivating example: Sell 100 IBM at $122; Sell 300 IBM at $120; Buy 500 IBM at $119; Buy 400 IBM at $118.]

Priority Queue ADT

A priority queue stores a collection of items. An item is a pair (key, element).

Main methods of the Priority Queue ADT:
- insertItem(k, o): inserts an item with key k and element o
- removeMin(): removes the item with smallest key and returns its element

Additional methods:
- minKey(): returns, but does not remove, the smallest key of an item
- minElement(): returns, but does not remove, the element of an item with smallest key
- size(), isEmpty()

Applications:
- standby flyers
- auctions
- stock market

Total Order Relation

Keys in a priority queue can be arbitrary objects on which an order is defined. Two distinct items in a priority queue can have the same key. The mathematical concept of a total order relation ≤:
- Reflexive property: x ≤ x
- Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
- Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

Comparator ADT

A comparator encapsulates the action of comparing two objects according to a given total order relation. A generic priority queue uses an auxiliary comparator: the comparator is external to the keys being compared, and when the priority queue needs to compare two keys, it uses its comparator.

Methods of the Comparator ADT, all with Boolean return type:
- isLessThan(x, y)
- isLessThanOrEqualTo(x, y)
- isEqualTo(x, y)
- isGreaterThan(x, y)
- isGreaterThanOrEqualTo(x, y)
- isComparable(x)
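Java's standard java.util.Comparator interface plays the same role as the Comparator ADT above, though it exposes a single compare(x, y) method (negative, zero, or positive result) instead of the Boolean methods listed. A lexicographic point comparator might look like this sketch; the Point2D class here is illustrative, not from the slides.

    import java.util.Comparator;

    // Illustrative comparator defining a total order on 2D points:
    // compare x-coordinates first, break ties by y.
    class Point2D { double x, y; }

    class LexicographicComparator implements Comparator<Point2D> {
        public int compare(Point2D a, Point2D b) {
            if (a.x != b.x) return Double.compare(a.x, b.x);
            return Double.compare(a.y, b.y);
        }
    }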

Sorting with a Priority Queue

We can use a priority queue to sort a set of comparable elements:
1. Insert the elements one by one with a series of insertItem(e, e) operations
2. Remove the elements in sorted order with a series of removeMin() operations

The running time of this sorting method depends on the priority queue implementation.

Algorithm PQ-Sort(S, C)
   Input sequence S, comparator C for the elements of S
   Output sequence S sorted in increasing order according to C
   P ← priority queue with comparator C
   while ¬S.isEmpty()
      e ← S.remove(S.first())
      P.insertItem(e, e)
   while ¬P.isEmpty()
      e ← P.removeMin()
      S.insertLast(e)
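A direct Java transcription of PQ-Sort, again with java.util.PriorityQueue standing in for the priority queue P and a LinkedList for the sequence S; the comparator argument mirrors C.

    import java.util.Comparator;
    import java.util.LinkedList;
    import java.util.PriorityQueue;

    // PQ-Sort sketch: empty S into P, then refill S in sorted order.
    static <E> void pqSort(LinkedList<E> s, Comparator<E> c) {
        PriorityQueue<E> p = new PriorityQueue<>(c);
        while (!s.isEmpty())
            p.offer(s.removeFirst());   // phase 1: n insertItem operations
        while (!p.isEmpty())
            s.addLast(p.poll());        // phase 2: n removeMin operations
    }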

Sequence-based Priority Queue

Implementation with an unsorted sequence: store the items of the priority queue in a list-based sequence, in arbitrary order. Performance:
- insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
- removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key

Implementation with a sorted sequence: store the items of the priority queue in a sequence, sorted by key. Performance:
- insertItem takes O(n) time since we have to find the place where to insert the item
- removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence

Selection-Sort

Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence. Running time of Selection-sort:
1. Inserting the elements into the priority queue with n insertItem operations takes O(n) time
2. Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n

Selection-sort runs in O(n^2) time.

Insertion-Sort

Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence. Running time of Insertion-sort:
1. Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n
2. Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time

Insertion-sort runs in O(n^2) time.

In-place Insertion-Sort

Instead of using an external data structure, we can implement selection-sort and insertion-sort in-place: a portion of the input sequence itself serves as the priority queue. For in-place insertion-sort:
- we keep the initial portion of the sequence sorted
- we can use swapElements instead of modifying the sequence
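A minimal in-place insertion-sort sketch on an array, using swaps in the spirit of swapElements; the sorted initial portion grows by one element per outer iteration.

    // In-place insertion-sort: a[0..i-1] is the sorted "priority queue";
    // each new element is swapped backward to its place.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && a[j - 1] > a[j]; j--) {
                int tmp = a[j - 1]; a[j - 1] = a[j]; a[j] = tmp;
            }
    }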

Dictionaries and Hash Tables

Dictionary ADT (2.5.1)

The dictionary ADT models a searchable collection of key-element items. The main operations of a dictionary are searching, inserting, and deleting items. Multiple items with the same key are allowed.

Applications:
- address book
- credit card authorization
- mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)

Dictionary ADT methods:
- findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
- insertItem(k, o): inserts item (k, o) into the dictionary
- removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
- size(), isEmpty()
- keys(), elements()

Log File (2.5.1)

A log file is a dictionary implemented by means of an unsorted sequence. We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order.

Performance:
- insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
- findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key

The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation).

Hash Functions and Hash Tables (2.5.2)

A hash function h maps keys of a given type to integers in a fixed interval [0, N − 1]. Example: h(x) = x mod N is a hash function for integer keys. The integer h(x) is called the hash value of key x.

A hash table for a given key type consists of:
- a hash function h
- an array (called table) of size N

When implementing a dictionary with a hash table, the goal is to store item (k, o) at index i = h(k).

Example

We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer. Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x. [Figure: table cells 0..9999 holding, e.g., 025-612-0001 at cell 1, 981-101-0002 at cell 2, 451-229-0004 at cell 4, 200-751-9998 at cell 9998.]

Hash Functions (2.5.3)

A hash function is usually specified as the composition of two functions:
- Hash code map: h1: keys → integers
- Compression map: h2: integers → [0, N − 1]

The hash code map is applied first, and the compression map is applied next on the result, i.e., h(x) = h2(h1(x)). The goal of the hash function is to disperse the keys in an apparently random way.

Hash Code Maps (2.5.3)

- Memory address: we reinterpret the memory address of the key object as an integer (the default hash code of all Java objects). Good in general, except for numeric and string keys.
- Integer cast: we reinterpret the bits of the key as an integer. Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java).
- Component sum: we partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and sum the components, ignoring overflows. Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java).

Hash Code Maps (cont.)

- Polynomial accumulation: we partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits) a_0 a_1 … a_{n−1}, and we evaluate the polynomial p(z) = a_0 + a_1 z + a_2 z^2 + … + a_{n−1} z^(n−1) at a fixed value z, ignoring overflows. Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words).

Polynomial p(z) can be evaluated in O(n) time using Horner's rule: the following polynomials are successively computed, each from the previous one in O(1) time:
   p_0(z) = a_{n−1}
   p_i(z) = a_{n−i−1} + z p_{i−1}(z)   (i = 1, 2, …, n − 1)
We have p(z) = p_{n−1}(z).
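A sketch of polynomial accumulation over a string via Horner's rule, ignoring overflow exactly as described; the choice z = 33 follows the example above.

    // Polynomial hash code of a string by Horner's rule, z = 33.
    // Integer overflow is ignored, as the slides prescribe.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++)
            h = 33 * h + s.charAt(i);   // p_i = a_{n-i-1} + z * p_{i-1}
        return h;
    }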

Compression Maps (2.5.4)

- Division: h2(y) = y mod N. The size N of the hash table is usually chosen to be a prime; the reason has to do with number theory and is beyond the scope of this course.
- Multiply, Add and Divide (MAD): h2(y) = (ay + b) mod N, where a and b are nonnegative integers such that a mod N ≠ 0. Otherwise, every integer would map to the same value b.

Collision Handling (2.5.5)

Collisions occur when different elements are mapped to the same cell.

Chaining: let each cell in the table point to a linked list of elements that map there. Chaining is simple, but requires additional memory outside the table. [Figure: keys 025-612-0001, 451-229-0004 and 981-101-0004 chained from the same cell.]

Linear Probing (2.5.5)

Open addressing: the colliding item is placed in a different cell of the table. Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell. Each table cell inspected is referred to as a probe. Colliding items lump together, causing future collisions to produce longer sequences of probes.

Example: h(x) = x mod 13. Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order. [Figure: cells 0..12 end up with 41 at 2, 18 at 5, 44 at 6, 59 at 7, 32 at 8, 22 at 9, 31 at 10, 73 at 11.]

Search with Linear Probing

Consider a hash table A that uses linear probing. findElement(k): we start at cell h(k) and probe consecutive locations until one of the following occurs:
- an item with key k is found, or
- an empty cell is found, or
- N cells have been unsuccessfully probed

Algorithm findElement(k)
   i ← h(k)
   p ← 0
   repeat
      c ← A[i]
      if c = ∅
         return NO_SUCH_KEY
      else if c.key() = k
         return c.element()
      else
         i ← (i + 1) mod N
         p ← p + 1
   until p = N
   return NO_SUCH_KEY
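The pseudocode translates almost line for line into Java. In this sketch keys are nonnegative ints, EMPTY is an illustrative sentinel marking an unused cell, and -1 plays the role of NO_SUCH_KEY.

    // Linear-probing search sketch over a table of nonnegative int keys.
    static final int EMPTY = Integer.MIN_VALUE;

    static int findElement(int[] A, int k) {
        int N = A.length;
        int i = k % N;                  // i <- h(k)
        for (int p = 0; p < N; p++) {   // at most N probes
            if (A[i] == EMPTY) return -1;   // empty cell: not present
            if (A[i] == k) return i;        // key found at cell i
            i = (i + 1) % N;                // next cell, circularly
        }
        return -1;                      // N cells unsuccessfully probed
    }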

Updates with Linear Probing

To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements.

removeElement(k): we search for an item with key k. If such an item (k, o) is found, we replace it with the special item AVAILABLE and we return element o. Else, we return NO_SUCH_KEY.

insertItem(k, o): we throw an exception if the table is full. We start at cell h(k) and probe consecutive cells until one of the following occurs: a cell i is found that is either empty or stores AVAILABLE, or N cells have been unsuccessfully probed. We then store item (k, o) in cell i.

Double Hashing

Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series

   (i + j·d(k)) mod N,   for j = 0, 1, …, N − 1

The secondary hash function d(k) cannot have zero values. The table size N must be a prime to allow probing of all the cells.

Common choice of compression map for the secondary hash function: d2(k) = q − (k mod q), where q < N and q is a prime. The possible values for d2(k) are 1, 2, …, q.

Example of Double Hashing

Consider a hash table storing integer keys that handles collisions with double hashing: N = 13, h(k) = k mod 13, d(k) = 7 − (k mod 7). Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order:

   k    h(k)  d(k)  Probes
   18    5     3    5
   41    2     1    2
   22    9     6    9
   44    5     5    5, 10
   59    7     4    7
   32    6     3    6
   31    5     4    5, 9, 0
   73    8     4    8

[Figure: cells 0..12 end up with 31 at 0, 41 at 2, 18 at 5, 32 at 6, 59 at 7, 73 at 8, 22 at 9, 44 at 10.]
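A small sketch that reproduces the candidate cells in the table above for the example's parameters; the method name is illustrative.

    // Prints the first N candidate cells (i + j*d(k)) mod N for a key k,
    // using the example's N = 13, h(k) = k mod 13, d(k) = 7 - k mod 7.
    static void printProbes(int k) {
        final int N = 13;
        int h = k % 13;
        int d = 7 - k % 7;
        for (int j = 0; j < N; j++)
            System.out.print((h + j * d) % N + " ");
        System.out.println();
    }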

Performance of Hashing

In the worst case, searches, insertions and removals on a hash table take O(n) time. The worst case occurs when all the keys inserted into the dictionary collide. The load factor α = n/N affects the performance of a hash table. Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1/(1 − α).

The expected running time of all the dictionary ADT operations in a hash table is O(1). In practice, hashing is very fast provided the load factor is not close to 100%. Applications of hash tables:
- small databases
- compilers
- browser caches

Universal Hashing (2.5.6)

A family of hash functions is universal if, for any 0 ≤ j, k ≤ M − 1 with j ≠ k, Pr(h(j) = h(k)) ≤ 1/N. To construct such a family: choose p as a prime between M and 2M; randomly select 0 < a < p and 0 < b < p, and define h(k) = ((ak + b) mod p) mod N.

Theorem: The set of all functions h, as defined here, is universal.

Proof of Universality (Part 1)

Let f(k) = (ak + b) mod p and let g(k) = k mod N, so h(k) = g(f(k)).

First, f causes no collisions. Let f(k) = f(j) and suppose k < j. Then

   ak + b − p·⌊(ak + b)/p⌋ = aj + b − p·⌊(aj + b)/p⌋

so

   a(j − k) = p·(⌊(aj + b)/p⌋ − ⌊(ak + b)/p⌋)

Hence a(j − k) is a multiple of p. But both a and j − k are positive and less than p, so a(j − k) = 0, i.e., j = k, a contradiction. Thus, f causes no collisions.

Proof of Universality (Part 2)

If f causes no collisions, only g can make h cause collisions. Fix a number x. Of the p − 1 integers y = f(k) different from x, the number with g(y) = g(x) is at most ⌈p/N⌉ − 1. Since there are p choices for x, the number of functions h that cause a collision between j and k is at most

   p(⌈p/N⌉ − 1) ≤ p(p − 1)/N

There are p(p − 1) functions h, so the probability of a collision is at most

   [p(p − 1)/N] / [p(p − 1)] = 1/N

Therefore, the set of possible h functions is universal.

Dictionaries

Outline and Reading
- Dictionary ADT (2.5.1)
- Log file (2.5.1)
- Binary search (3.1.1)
- Lookup table (3.1.1)
- Binary search tree (3.1.2): Search (3.1.3), Insertion (3.1.4), Deletion (3.1.5), Performance (3.1.6)

(The dictionary ADT and the log file are as described earlier.)

Binary Search (3.1.1)

Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key:
- similar to the high-low game
- at each step, the number of candidate items is halved
- terminates after O(log n) steps

Example: findElement(7). [Figure: search in the sorted array 1 3 4 5 7 8 9 11 14 16 18 19, with pointers l, m, h narrowing the range until l = m = h.]

Lookup Table (3.1.1)

A lookup table is a dictionary implemented by means of a sorted sequence. We store the items of the dictionary in an array-based sequence, sorted by key, and we use an external comparator for the keys.

Performance:
- findElement takes O(log n) time, using binary search
- insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
- removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal

The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations).
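A standard Java rendering of the halving search on a sorted array; java.util.Arrays.binarySearch provides the same functionality in the standard library.

    // Binary search sketch: returns an index of k in sorted array a,
    // or -1 if k is absent (the NO_SUCH_KEY outcome).
    static int binarySearch(int[] a, int k) {
        int l = 0, h = a.length - 1;
        while (l <= h) {
            int m = (l + h) / 2;           // midpoint of the candidate range
            if (a[m] == k) return m;
            else if (a[m] < k) l = m + 1;  // discard the low half
            else h = m - 1;                // discard the high half
        }
        return -1;
    }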

Binary Search Tree (3.1.2)

A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property: let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v; then key(u) ≤ key(v) ≤ key(w). External nodes do not store items. An inorder traversal of a binary search tree visits the keys in increasing order.

Search (3.1.3)

To search for a key k, we trace a downward path starting at the root. The next node visited depends on the outcome of the comparison of k with the key of the current node. If we reach a leaf, the key is not found and we return NO_SUCH_KEY. Example: findElement(4).

Algorithm findElement(k, v)
   if T.isExternal(v)
      return NO_SUCH_KEY
   if k < key(v)
      return findElement(k, T.leftChild(v))
   else if k = key(v)
      return element(v)
   else { k > key(v) }
      return findElement(k, T.rightChild(v))

Insertion (3.1.4)

To perform operation insertItem(k, o), we search for key k. Assume k is not already in the tree, and let w be the leaf reached by the search. We insert k at node w and expand w into an internal node. Example: insert 5.

Deletion (3.1.5)

To perform operation removeElement(k), we search for key k. Assume key k is in the tree, and let v be the node storing k. If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w). Example: remove 4.

Deletion (cont.)

We consider the case where the key k to be removed is stored at a node v whose children are both internal:
- we find the internal node w that follows v in an inorder traversal
- we copy key(w) into node v
- we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)

Example: remove 3.

Performance (3.1.6)

Consider a dictionary with n items implemented by means of a binary search tree of height h:
- the space used is O(n)
- methods findElement, insertItem and removeElement take O(h) time

The height h is O(n) in the worst case and O(log n) in the best case.
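A minimal linked BST sketch in Java with the same search logic; null plays the role of the slides' external nodes, a common simplification.

    // Minimal BST sketch; null stands in for an external node.
    class BSTNode {
        int key; String element;
        BSTNode left, right;
        BSTNode(int k, String e) { key = k; element = e; }
    }

    class BST {
        BSTNode root;

        String findElement(int k) {
            BSTNode v = root;
            while (v != null) {
                if (k < v.key) v = v.left;
                else if (k > v.key) v = v.right;
                else return v.element;        // key found
            }
            return null;                      // NO_SUCH_KEY
        }

        void insertItem(int k, String e) {    // assumes k not already present
            BSTNode z = new BSTNode(k, e);
            if (root == null) { root = z; return; }
            BSTNode v = root;
            while (true) {
                if (k < v.key) {
                    if (v.left == null) { v.left = z; return; }
                    v = v.left;
                } else {
                    if (v.right == null) { v.right = z; return; }
                    v = v.right;
                }
            }
        }
    }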

Ordered Dictionaries

Keys are assumed to come from a total order. New operations:
- closestKeyBefore(k)
- closestElemBefore(k)
- closestKeyAfter(k)
- closestElemAfter(k)


AVL Trees

AVL Tree Definition

AVL trees are balanced. An AVL tree is a binary search tree such that for every internal node v of T, the heights of the children of v can differ by at most 1. [Figure: an example of an AVL tree where the heights are shown next to the nodes.]

Height of an AVL Tree

Fact: The height of an AVL tree storing n keys is O(log n).
Proof: Let us bound n(h), the minimum number of internal nodes of an AVL tree of height h. We easily see that n(1) = 1 and n(2) = 2. For h > 2, an AVL tree of height h contains the root node, one AVL subtree of height h − 1 and another of height h − 2. That is,

   n(h) = 1 + n(h − 1) + n(h − 2)

Knowing n(h − 1) > n(h − 2), we get n(h) > 2·n(h − 2). So n(h) > 2·n(h − 2), n(h) > 4·n(h − 4), n(h) > 8·n(h − 6), …, and (by induction) n(h) > 2^i·n(h − 2i). Solving the base case we get n(h) > 2^(h/2 − 1). Taking logarithms: h < 2 log n(h) + 2. Thus the height of an AVL tree is O(log n).

Insertion in an AVL Tree

Insertion is as in a binary search tree: it is always done by expanding an external node. [Figures: the tree before and after inserting 54; the insertion makes an ancestor of the new node unbalanced.]

Trinode Restructuring

Let z be the unbalanced node, y its taller child, and x the taller child of y, and let (a, b, c) be an inorder listing of x, y, z. We perform the rotations needed to make b the topmost node of the three:
- case 1: single rotation (e.g., a left rotation about a) when b = y
- case 2: double rotation (a right rotation about c, then a left rotation about a) when b = x
(The other two cases are symmetrical.)

Insertion Example, continued

[Figures: the unbalanced tree after inserting 54; the nodes x, y, z involved; the tree rebalanced by a double rotation.]

Restructuring (as Single Rotations)

[Figures: the two single-rotation configurations, in which b = y.]

Restructuring (as Double Rotations)

[Figures: the two double-rotation configurations, in which b = x.]

Removal in an AVL Tree

Removal begins as in a binary search tree, which means the node removed will become an empty external node. Its parent, w, may cause an imbalance. [Figures: trees before and after the deletion of 32.]

Rebalancing after a Removal

Let z be the first unbalanced node encountered while travelling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height. We perform restructure(x) to restore balance at z. As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached.

Running Times for AVL Trees

Using a linked-structure binary tree:
- a single restructure is O(1)
- find is O(log n): the height of the tree is O(log n), and no restructures are needed
- insert is O(log n): the initial find is O(log n); restructuring up the tree, maintaining heights, is O(log n)
- remove is O(log n): the initial find is O(log n); restructuring up the tree, maintaining heights, is O(log n)
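The rotations underlying trinode restructuring are constant-time pointer updates. A sketch on the BSTNode type from the earlier BST example (an assumption carried over; height bookkeeping is omitted for brevity):

    // Single rotations used by restructure(x); each is O(1).
    // Each returns the new subtree root so the parent link can be updated.
    static BSTNode rotateRight(BSTNode y) {
        BSTNode x = y.left;
        y.left = x.right;   // x's right subtree becomes y's left subtree
        x.right = y;        // y becomes x's right child
        return x;
    }

    static BSTNode rotateLeft(BSTNode x) {
        BSTNode y = x.right;
        x.right = y.left;
        y.left = x;
        return y;
    }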

(2,4) Trees

Outline and Reading
- Multi-way search tree (3.3.1): Definition, Search
- (2,4) tree (3.3.2): Definition, Search, Insertion, Deletion
- Comparison of dictionary implementations

Multi-Way Search Tree

A multi-way search tree is an ordered tree such that:
- each internal node has at least two children and stores d − 1 key-element items (k_i, o_i), where d is the number of children
- for a node with children v_1 v_2 … v_d storing keys k_1 k_2 … k_{d−1}:
  - keys in the subtree of v_1 are less than k_1
  - keys in the subtree of v_i are between k_{i−1} and k_i (i = 2, …, d − 1)
  - keys in the subtree of v_d are greater than k_{d−1}
- the leaves store no items and serve as placeholders

Multi-Way Inorder Traversal

We can extend the notion of inorder traversal from binary trees to multi-way search trees: we visit item (k_i, o_i) of node v between the recursive traversals of the subtrees of v rooted at children v_i and v_{i+1}. An inorder traversal of a multi-way search tree visits the keys in increasing order.

Multi-Way Searching

Searching is similar to search in a binary search tree. At each internal node with children v_1 v_2 … v_d and keys k_1 k_2 … k_{d−1}:
- k = k_i (i = 1, …, d − 1): the search terminates successfully
- k < k_1: we continue the search in child v_1
- k_{i−1} < k < k_i (i = 2, …, d − 1): we continue the search in child v_i
- k > k_{d−1}: we continue the search in child v_d

Reaching an external node terminates the search unsuccessfully. Example: search for 30.

(2,4) Tree

A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties:
- Node-Size Property: every internal node has at most four children
- Depth Property: all the external nodes have the same depth

Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node.
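One search step at a multi-way node (as described in Multi-Way Searching above) can be sketched as a scan of the at most d − 1 keys; with d ≤ 4 in a (2,4) tree, the scan is O(1), so the whole search is proportional to the height. The node layout here is illustrative.

    // One step of multi-way search at a node with keys[0..d-2] and
    // children indexed 0..d-1; returns the child index to descend into,
    // or -1 if the key is found at this node.
    static int searchStep(int[] keys, int d, int k) {
        for (int i = 0; i < d - 1; i++) {
            if (k == keys[i]) return -1;   // terminates successfully
            if (k < keys[i]) return i;     // continue in child v_{i+1}
        }
        return d - 1;                      // k greater than all keys
    }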

Height of a (2,4) Tree

Theorem: A (2,4) tree storing n items has height O(log n).
Proof: Let h be the height of a (2,4) tree with n items. Since there are at least 2^i items at depth i = 0, …, h − 1 and no items at depth h, we have

   n ≥ 1 + 2 + 4 + … + 2^(h−1) = 2^h − 1

Thus h ≤ log(n + 1). As a consequence, searching in a (2,4) tree with n items takes O(log n) time.

Insertion

We insert a new item (k, o) at the parent v of the leaf reached by searching for k. We preserve the depth property, but we may cause an overflow (i.e., node v may become a 5-node). Example: inserting key 30 causes an overflow. [Figure omitted.]

Overflow and Split

We handle an overflow at a 5-node v with a split operation. Let v_1 … v_5 be the children of v and k_1 … k_4 be the keys of v:
- node v is replaced by nodes v′ and v″
- v′ is a 3-node with keys k_1 k_2 and children v_1 v_2 v_3
- v″ is a 2-node with key k_4 and children v_4 v_5
- key k_3 is inserted into the parent u of v (a new root may be created)

The overflow may propagate to the parent node u.

Analysis of Insertion

Algorithm insertItem(k, o)
1. We search for key k to locate the insertion node v
2. We add the new item (k, o) at node v
3. while overflow(v)
      if isRoot(v)
         create a new empty root above v
      v ← split(v)

Let T be a (2,4) tree with n items; T has O(log n) height. Step 1 takes O(log n) time because we visit O(log n) nodes. Step 2 takes O(1) time. Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits. Thus, an insertion in a (2,4) tree takes O(log n) time.

Deletion

We reduce deletion of an item to the case where the item is at a node with leaf children. Otherwise, we replace the item with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter item. Example: to delete key 24, we replace it with 27 (its inorder successor). [Figure omitted.]

Underflow and Fusion

Deleting an item from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys. To handle an underflow at node v with parent u, we consider two cases.

Case 1: the adjacent siblings of v are 2-nodes.
- Fusion operation: we merge v with an adjacent sibling w and move an item from u to the merged node v′
- After a fusion, the underflow may propagate to the parent u

Underflow and Transfer

Case 2: an adjacent sibling w of v is a 3-node or a 4-node.
- Transfer operation: (1) we move a child of w to v; (2) we move an item from u to v; (3) we move an item from w to u
- After a transfer, no underflow occurs

Analysis of Deletion

Let T be a (2,4) tree with n items; T has O(log n) height. In a deletion operation:
- we visit O(log n) nodes to locate the node from which to delete the item
- we handle an underflow with a series of O(log n) fusions, followed by at most one transfer
- each fusion and transfer takes O(1) time

Thus, deleting an item from a (2,4) tree takes O(log n) time.

Implementing a Dictionary

Comparison of efficient dictionary implementations:

               Search              Insert              Delete              Notes
   Hash Table  1 (expected)        1 (expected)        1 (expected)        no ordered dictionary methods; simple to implement
   Skip List   log n (high prob.)  log n (high prob.)  log n (high prob.)  randomized insertion; simple to implement
   (2,4) Tree  log n (worst-case)  log n (worst-case)  log n (worst-case)  complex to implement

Red-Black Trees

Outline and Reading
- From (2,4) trees to red-black trees (3.3.3)
- Red-black tree (3.3.3): Definition; Height; Insertion (restructuring, recoloring); Deletion (restructuring, recoloring, adjustment)

From (2,4) to Red-Black Trees

A red-black tree is a representation of a (2,4) tree by means of a binary tree whose nodes are colored red or black. In comparison with its associated (2,4) tree, a red-black tree has the same logarithmic time performance and a simpler implementation with a single node type.

Red-Black Tree

A red-black tree can also be defined as a binary search tree that satisfies the following properties:
- Root Property: the root is black
- External Property: every leaf is black
- Internal Property: the children of a red node are black
- Depth Property: all the leaves have the same black depth

Height of a Red-Black Tree

Theorem: A red-black tree storing n items has height O(log n).
Proof: The height of a red-black tree is at most twice the height of its associated (2,4) tree, which is O(log n).

The search algorithm for a red-black tree is the same as that for a binary search tree. By the above theorem, searching in a red-black tree takes O(log n) time.

Insertion

To perform operation insertItem(k, o), we execute the insertion algorithm for binary search trees and color red the newly inserted node z, unless it is the root. We preserve the root, external, and depth properties. If the parent v of z is black, we also preserve the internal property and we are done. Else (v is red) we have a double red (i.e., a violation of the internal property), which requires a reorganization of the tree. Example where the insertion of 4 causes a double red: [figure omitted].

Remedying a Double Red

Consider a double red with child z and parent v, and let w be the sibling of v.
- Case 1: w is black. The double red is an incorrect replacement of a 4-node. Restructuring: we change the 4-node replacement.
- Case 2: w is red. The double red corresponds to an overflow. Recoloring: we perform the equivalent of a split.

Restructuring

A restructuring remedies a child-parent double red when the parent red node has a black sibling. It is equivalent to restoring the correct replacement of a 4-node. The internal property is restored and the other properties are preserved. There are four restructuring configurations, depending on whether the double red nodes are left or right children. [Figures omitted.]

Recoloring

A recoloring remedies a child-parent double red when the parent red node has a red sibling. The parent v and its sibling w become black and the grandparent u becomes red, unless it is the root. It is equivalent to performing a split on a 5-node. The double red violation may propagate to the grandparent u.

Analysis of Insertion

Algorithm insertItem(k, o)
1. We search for key k to locate the insertion node z
2. We add the new item (k, o) at node z and color z red
3. while doubleRed(z)
      if isBlack(sibling(parent(z)))
         z ← restructure(z)
         return
      else { sibling(parent(z)) is red }
         z ← recolor(z)

Recall that a red-black tree has O(log n) height. Step 1 takes O(log n) time because we visit O(log n) nodes. Step 2 takes O(1) time. Step 3 takes O(log n) time because we perform O(log n) recolorings, each taking O(1) time, and at most one restructuring, taking O(1) time. Thus, an insertion in a red-black tree takes O(log n) time.

Deletion

To perform operation remove(k), we first execute the deletion algorithm for binary search trees. Let v be the internal node removed, w the external node removed, and r the sibling of w. If either v or r was red, we color r black and we are done. Else (v and r were both black) we color r double black, which is a violation of the internal property requiring a reorganization of the tree. Example where the deletion of 8 causes a double black: [figure omitted].

Remedying a Double Black

The algorithm for remedying a double black node w with sibling y considers three cases.
- Case 1: y is black and has a red child. We perform a restructuring, equivalent to a transfer, and we are done.
- Case 2: y is black and its children are both black. We perform a recoloring, equivalent to a fusion, which may propagate up the double black violation.
- Case 3: y is red. We perform an adjustment, equivalent to choosing a different representation of a 3-node, after which either Case 1 or Case 2 applies.

Deletion in a red-black tree takes O(log n) time.

Red-Black Tree Reorganization

Insertion (remedy double red):

   Red-black tree action   (2,4) tree action                 Result
   restructuring           change of 4-node representation   double red removed
   recoloring              split                             double red removed or propagated up

Deletion (remedy double black):

   Red-black tree action   (2,4) tree action                 Result
   restructuring           transfer                          double black removed
   recoloring              fusion                            double black removed or propagated up
   adjustment              change of 3-node representation   restructuring or recoloring follows
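In practice, java.util.TreeMap is a red-black tree implementation of an ordered map, so the guarantees above are directly available from the standard library:

    import java.util.TreeMap;

    public class RBDemo {
        public static void main(String[] args) {
            // TreeMap is red-black-tree-based: get/put/remove are O(log n).
            TreeMap<Integer, String> dict = new TreeMap<>();
            dict.put(4, "four");
            dict.put(7, "seven");
            System.out.println(dict.get(7));      // search: prints "seven"
            dict.remove(4);                       // deletion
            System.out.println(dict.firstKey());  // smallest key: 7
        }
    }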

Skip Lists

Outline and Reading
- What is a skip list (3.5)
- Operations: Search (3.5.1), Insertion (3.5.2), Deletion (3.5.2)
- Implementation
- Analysis (3.5.3): Space usage, Search and update times

What is a Skip List

A skip list for a set S of distinct (key, element) items is a series of lists S_0, S_1, …, S_h such that:
- each list S_i contains the special keys +∞ and −∞
- list S_0 contains the keys of S in nondecreasing order
- each list is a subsequence of the previous one, i.e., S_0 ⊇ S_1 ⊇ … ⊇ S_h
- list S_h contains only the two special keys

We show how to use a skip list to implement the dictionary ADT. [Figure: a skip list with lists S_0..S_3 over the keys 12, 23, 26, 31, 34, 44, 56, 64, 78.]

Search

We search for a key x in a skip list as follows:
- we start at the first position of the top list
- at the current position p, we compare x with y ← key(after(p)):
  - x = y: we return element(after(p))
  - x > y: we scan forward
  - x < y: we drop down
- if we try to drop down past the bottom list, we return NO_SUCH_KEY

Example: search for 78.

Randomized Algorithms

A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution. It contains statements of the type:

   b ← random()
   if b = 0
      do A
   else { b = 1 }
      do B

Its running time depends on the outcomes of the coin tosses. We analyze the expected running time of a randomized algorithm under the following assumptions: the coins are unbiased, and the coin tosses are independent. The worst-case running time of a randomized algorithm is often large but has very low probability (e.g., it occurs when all the coin tosses give heads). We use a randomized algorithm to insert items into a skip list.

Insertion

To insert an item (x, o) into a skip list, we use a randomized algorithm:
- we repeatedly toss a coin until we get tails, and we denote with i the number of times the coin came up heads
- if i ≥ h, we add to the skip list new lists S_{h+1}, …, S_{i+1}, each containing only the two special keys
- we search for x in the skip list and find the positions p_0, p_1, …, p_i of the items with largest key less than x in each list S_0, S_1, …, S_i
- for j ← 0, …, i, we insert item (x, o) into list S_j after position p_j

Example: insert key 15, with i = 2. [Figure omitted.]
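The coin-tossing step is the only randomized part of the insertion; a sketch with java.util.Random standing in for the coin (for a production skip-list dictionary, the standard library offers java.util.concurrent.ConcurrentSkipListMap):

    import java.util.Random;

    // Tosses a fair coin until tails; the number of heads i decides
    // how many lists S_0..S_i the new item joins.
    static int randomLevel(Random rng) {
        int i = 0;
        while (rng.nextBoolean())   // heads with probability 1/2
            i++;
        return i;
    }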

Deletion

To remove an item with key x from a skip list, we proceed as follows:
- we search for x in the skip list and find the positions p_0, p_1, …, p_i of the items with key x, where position p_j is in list S_j
- we remove positions p_0, p_1, …, p_i from the lists S_0, S_1, …, S_i
- we remove all but one list containing only the two special keys

Example: remove key 34. [Figure omitted.]

Implementation

We can implement a skip list with quad-nodes. A quad-node stores:
- the item
- a link to the node before
- a link to the node after
- a link to the node below
- a link to the node above

Also, we define special keys PLUS_INF and MINUS_INF, and we modify the key comparator to handle them.

Space Usage

The space used by a skip list depends on the random bits used by each invocation of the insertion algorithm. We use the following two basic probabilistic facts:
- Fact 1: The probability of getting i consecutive heads when flipping a coin is 1/2^i
- Fact 2: If each of n items is present in a set with probability p, the expected size of the set is np

Consider a skip list with n items. By Fact 1, we insert an item in list S_i with probability 1/2^i; by Fact 2, the expected size of list S_i is n/2^i. The expected number of nodes used by the skip list is therefore

   Σ_{i=0}^{h} n/2^i = n Σ_{i=0}^{h} 1/2^i < 2n

Thus, the expected space usage of a skip list with n items is O(n).

Height

The running time of the search and insertion algorithms is affected by the height h of the skip list. We show that, with high probability, a skip list with n items has height O(log n). We use the following additional probabilistic fact:
- Fact 3: If each of n events has probability p, the probability that at least one event occurs is at most np

Consider a skip list with n items. By Fact 1, we insert an item in list S_i with probability 1/2^i; by Fact 3, the probability that list S_i has at least one item is at most n/2^i. By picking i = 3 log n, the probability that S_{3 log n} has at least one item is at most

   n/2^(3 log n) = n/n^3 = 1/n^2

Thus a skip list with n items has height at most 3 log n with probability at least 1 − 1/n^2.

Search and Update Times

The search time in a skip list is proportional to the number of drop-down steps plus the number of scan-forward steps. The drop-down steps are bounded by the height of the skip list and thus are O(log n) with high probability. To analyze the scan-forward steps, we use yet another probabilistic fact:
- Fact 4: The expected number of coin tosses required in order to get tails is 2

When we scan forward in a list, the destination key does not belong to a higher list, so a scan-forward step is associated with a former coin toss that gave tails. By Fact 4, in each list the expected number of scan-forward steps is 2; thus, the expected number of scan-forward steps overall is O(log n). We conclude that a search in a skip list takes O(log n) expected time. The analysis of insertion and deletion gives similar results.

Summary

A skip list is a data structure for dictionaries that uses a randomized insertion algorithm. In a skip list with n items:
- the expected space used is O(n)
- the expected search, insertion and deletion time is O(log n)

Using a more complex probabilistic analysis, one can show that these performance bounds also hold with high probability. Skip lists are fast and simple to implement in practice.

Splay Trees

Splay Trees are Binary Search Trees

BST Rules:
- items stored only at internal nodes
- keys stored at nodes in the left subtree of v are less than or equal to the key stored at v
- keys stored at nodes in the right subtree of v are greater than or equal to the key stored at v

An inorder traversal will return the keys in order. Note that two keys of equal value may be well-separated. [Figure: a BST of (key, element) items rooted at (20,Z); all keys in the left subtree are ≤ 20, all keys in the right subtree are ≥ 20.]

Searching in a Splay Tree: Starts the Same as in a BST

Search proceeds down the tree to the found item or to an external node. Example: a search for the item with key 11 ends at an external node; a search for key 8 ends at an internal node.

Splay Trees do Rotations after Every Operation (Even Search)

New operation: splay. Splaying moves a node to the root using rotations:
- a right rotation about y makes the left child x of node y into y's parent; y becomes the right child of x
- a left rotation about x makes the right child y of node x into x's parent; x becomes the left child of y
(The structure of the tree above the rotated pair is not modified.)

Splaying

Splaying starts with a node x. Here p is x's parent, g is p's parent, and "left-left grandchild" means x is a left child of its parent, which is itself a left child of its parent:
- if x is the root: stop
- if x is a child of the root: zig; if x is the left child of the root, right-rotate about the root, otherwise left-rotate about the root
- if x is a left-left grandchild: zig-zig; right-rotate about g, then right-rotate about p
- if x is a right-right grandchild: zig-zig; left-rotate about g, then left-rotate about p
- if x is a right-left grandchild: zig-zag; left-rotate about p, then right-rotate about g
- if x is a left-right grandchild: zig-zag; right-rotate about p, then left-rotate about g

Repeat until x is the root.

Visualizing the Splaying Cases

[Figures of the zig, zig-zig, and zig-zag configurations omitted.]

Splaying Example

Let x = (8,N). x is the right child of its parent, which is the left child of the grandparent, so we apply a zig-zag: left-rotate around p, then right-rotate around g. x is not yet the root, so we splay again. [Figures: before rotating, after first rotation, after second rotation.]

Splaying Example, Continued

Now x is the left child of the root, so we apply a zig: right-rotate around the root. x is now the root, so we stop. [Figures omitted.]

Example Result of Splaying

The tree might not be more balanced. E.g., splaying (40,X): before, the depth of the shallowest leaf is 3 and the deepest is 7; after, the depth of the shallowest leaf is 1 and the deepest is 8. [Figures: before, after first splay, after second splay.]

Splay Trees & Ordered Dictionaries

A splay tree is a binary search tree where a node is splayed after it is accessed (for a search or update). Which nodes are splayed after each operation?
- findElement: if the key is found, use that node; if the key is not found, use the parent of the ending external node
- insertElement: use the new node containing the item inserted
- removeElement: use the parent of the internal node that was actually removed from the tree (the parent of the node that the removed item was swapped with)

In each case the deepest internal node accessed is splayed. Splaying costs O(h), where h is the height of the tree: O(h) rotations, each of which is O(1). This is still O(n) worst-case.

Amortized Analysis of Splay Trees

Running time of each operation is proportional to the time for splaying. Define rank(v) as the logarithm (base 2) of the number of nodes in the subtree rooted at v. Costs: zig = $1, zig-zig = $2, zig-zag = $2. Thus, the cost of splaying a node at depth d is $d. Imagine that we store rank(v) cyber-dollars at each node v of the splay tree (just for the sake of analysis).

Cost per zig

Writing rank′ for ranks after the step, doing a zig at x costs at most rank′(x) − rank(x):

   cost = rank′(x) + rank′(y) − rank(y) − rank(x) ≤ rank′(x) − rank(x)

Cost per zig-zig and zig-zag

Doing a zig-zig or zig-zag at x costs at most 3(rank′(x) − rank(x)) − 2.
Proof: See Theorem 3.9, Page 192.

Cost of Splaying

Cost of splaying a node x at depth d of a tree rooted at r: at most 3(rank(r) − rank(x)) − d + 2.
Proof: Splaying x takes d/2 splaying substeps:

   cost ≤ Σ_{i=1}^{d/2} cost_i
        ≤ Σ_{i=1}^{d/2} (3(rank_i(x) − rank_{i−1}(x)) − 2) + 2
        = 3(rank(r) − rank_0(x)) − 2(d/2) + 2
        ≤ 3(rank(r) − rank(x)) − d + 2

Performance of Splay Trees

Recall: the rank of a node is the logarithm of its size. Thus, the amortized cost of any splay operation is O(log n). In fact, the analysis goes through for any reasonable definition of rank(x). This implies that splay trees can actually adapt to perform searches on frequently-requested items much faster than O(log n) in some cases. (See Theorems 3.10 and 3.11.)
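A sketch of the splaying control loop built from single rotations. It assumes a Node type with a parent link and a helper rotateUp(x) that rotates x above its parent and fixes all links; both names are illustrative, since the slides keep pointers implicit.

    // Splay sketch: repeat zig / zig-zig / zig-zag until x is the root.
    static void splay(Node x) {
        while (x.parent != null) {
            Node p = x.parent, g = p.parent;
            if (g == null) {
                rotateUp(x);                          // zig
            } else if ((g.left == p) == (p.left == x)) {
                rotateUp(p); rotateUp(x);             // zig-zig: rotate p first
            } else {
                rotateUp(x); rotateUp(x);             // zig-zag: rotate x twice
            }
        }
    }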

Locators

Outline and Reading
- Locators (2.4.4)
- Locator-based methods (2.4.4)
- Implementation
- Positions vs. Locators

Locators

A locator identifies and tracks a (key, element) item within a data structure. A locator sticks with a specific item, even if that element changes its position in the data structure. Intuitive notions: a claim check; a reservation number.

Methods of the locator ADT:
- key(): returns the key of the item associated with the locator
- element(): returns the element of the item associated with the locator

Application example: orders to purchase and sell a given stock are stored in two priority queues (sell orders and buy orders); the key of an order is the price, and the element is the number of shares. When an order is placed, a locator to it is returned. Given a locator, an order can be canceled or modified.

Locator-based Methods

Locator-based priority queue methods:
- insert(k, o): inserts the item (k, o) and returns a locator for it
- min(): returns the locator of an item with smallest key
- remove(l): removes the item with locator l
- replaceKey(l, k): replaces the key of the item with locator l
- replaceElement(l, o): replaces with o the element of the item with locator l
- locators(): returns an iterator over the locators of the items in the priority queue

Locator-based dictionary methods:
- insert(k, o): inserts the item (k, o) and returns its locator
- find(k): if the dictionary contains an item with key k, returns its locator, else returns the special locator NO_SUCH_KEY
- remove(l): removes the item with locator l and returns its element
- locators(), replaceKey(l, k), replaceElement(l, o)

Implementation

The locator is an object storing the key, the element, and the position (or rank) of the item in the underlying structure. In turn, the position (or array cell) stores the locator. Example: binary search tree with locators.

Positions vs. Locators

Position:
- represents a place in a data structure
- related to other positions in the data structure (e.g., previous/next or parent/child)
- implemented as a node or an array cell

Locator:
- identifies and tracks a (key, element) item
- unrelated to other locators in the data structure
- implemented as an object storing the item and its position in the underlying structure

Position-based ADTs (e.g., sequence and tree) are fundamental data storage schemes. Key-based ADTs (e.g., priority queue and dictionary) can be augmented with locator-based methods.
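A locator can be as simple as an object pairing the item with its current position; a minimal sketch with illustrative names, assuming a vector-based underlying structure that updates rank whenever the item moves.

    // Minimal locator sketch: the entry knows where it currently lives,
    // and the data structure updates `rank` whenever the item moves.
    class Locator<K, E> {
        K key;
        E element;
        int rank;   // position in the underlying vector-based structure

        Locator(K key, E element, int rank) {
            this.key = key;
            this.element = element;
            this.rank = rank;
        }
    }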

Merge Sort

Outline and Reading
- Divide-and-conquer paradigm (4.1.1)
- Merge-sort (4.1.1): Algorithm, Merging two sorted sequences, Merge-sort tree, Execution example, Analysis
- Generic merging and set operations (4.2.1)
- Summary of sorting algorithms (4.2.1)

Divide-and-Conquer

Divide-and-conquer is a general algorithm design paradigm:
- Divide: divide the input data S in two disjoint subsets S_1 and S_2
- Recur: solve the subproblems associated with S_1 and S_2
- Conquer: combine the solutions for S_1 and S_2 into a solution for S

The base cases for the recursion are subproblems of size 0 or 1.

Merge-Sort

Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm. Like heap-sort, it uses a comparator and has O(n log n) running time. Unlike heap-sort, it does not use an auxiliary priority queue, and it accesses data in a sequential manner (suitable to sort data on a disk).

Merge-sort on an input sequence S with n elements consists of three steps:
- Divide: partition S into two sequences S_1 and S_2 of about n/2 elements each
- Recur: recursively sort S_1 and S_2
- Conquer: merge S_1 and S_2 into a unique sorted sequence

Algorithm mergeSort(S, C)
   Input sequence S with n elements, comparator C
   Output sequence S sorted according to C
   if S.size() > 1
      (S_1, S_2) ← partition(S, n/2)
      mergeSort(S_1, C)
      mergeSort(S_2, C)
      S ← merge(S_1, S_2)

Merging Two Sorted Sequences

Merge-Sort Tree

The conquer step of

An execution of merge-sort is depicted by a binary tree

merge-sort consists
of merging two
sorted sequences A
and B into a sorted
sequence S
containing the union
of the elements of A
and B
Merging two sorted
sequences, each
with n/2 elements
and implemented by
means of a doubly
linked list, takes O(n)
time
4/9/2002 10:09

Algorithm merge(A, B)
Input sequences A and B with
n/2 elements each
Output sorted sequence of A B

unsorted sequence before the execution and its partition


sorted sequence at the end of the execution

S empty sequence
while A.isEmpty() B.isEmpty()
if A.first().element() < B.first().element()
S.insertLast(A.remove(A.first()))
else
S.insertLast(B.remove(B.first()))
while A.isEmpty()
S.insertLast(A.remove(A.first()))
while B.isEmpty()
S.insertLast(B.remove(B.first()))
return S
Merge Sort

each node represents a recursive call of merge-sort and stores

the root is the initial call


the leaves are calls on subsequences of size 0 or 1

7 2
7

9 4 2 4 7 9

2 2 7

77
4/9/2002 10:09

22

4 4 9

99

Merge Sort

44
6

Merge Sort

4/9/2002 10:0
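
A minimal Java sketch of the algorithm above, using linked lists so that
removals from the front and insertions at the back are O(1), as the slides
assume (natural ordering stands in for the comparator C):

    import java.util.LinkedList;

    public class MergeSort {
        public static <T extends Comparable<T>> LinkedList<T> mergeSort(LinkedList<T> s) {
            if (s.size() <= 1) return s;                 // base case: size 0 or 1
            // Divide: split S into two halves of about n/2 elements each.
            LinkedList<T> s1 = new LinkedList<>(), s2 = new LinkedList<>();
            int half = s.size() / 2;
            while (s1.size() < half) s1.addLast(s.removeFirst());
            while (!s.isEmpty()) s2.addLast(s.removeFirst());
            // Recur on both halves, then Conquer by merging.
            return merge(mergeSort(s1), mergeSort(s2));
        }

        // Merge two sorted sequences into one sorted sequence in O(n) time.
        private static <T extends Comparable<T>> LinkedList<T> merge(
                LinkedList<T> a, LinkedList<T> b) {
            LinkedList<T> s = new LinkedList<>();
            while (!a.isEmpty() && !b.isEmpty())
                s.addLast(a.getFirst().compareTo(b.getFirst()) < 0
                          ? a.removeFirst() : b.removeFirst());
            while (!a.isEmpty()) s.addLast(a.removeFirst());
            while (!b.isEmpty()) s.addLast(b.removeFirst());
            return s;
        }
    }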

Execution Example
- The slides step through merge-sort on the sequence 7 2 9 4 | 3 8 6 1:
  partition, recursive calls down to the base cases (size-1 sequences),
  and merges back up, ending with the sorted sequence 1 2 3 4 6 7 8 9.
  (The intermediate merge-sort trees drawn on these slides are omitted here.)

Analysis of Merge-Sort
- The height h of the merge-sort tree is O(log n):
    at each recursive call we divide the sequence in half
- The overall amount of work done at the nodes of depth i is O(n):
    we partition and merge 2^i sequences of size n/2^i
    we make 2^(i+1) recursive calls
- Thus, the total running time of merge-sort is O(n log n).

    depth   #seqs   size
    0       1       n
    1       2       n/2
    i       2^i     n/2^i

Summary of Sorting Algorithms

    Algorithm       Time        Notes
    selection-sort  O(n^2)      slow, in-place; for small data sets (< 1K)
    insertion-sort  O(n^2)      slow, in-place; for small data sets (< 1K)
    heap-sort       O(n log n)  fast, in-place; for large data sets (1K - 1M)
    merge-sort      O(n log n)  fast, sequential data access; for huge data sets (> 1M)

Quick-Sort

Outline and Reading
- Quick-sort (4.3)
    Algorithm
    Partition step
    Quick-sort tree
    Execution example
- Analysis of quick-sort (4.3.1)
- In-place quick-sort (4.8)
- Summary of sorting algorithms

Quick-Sort
- Quick-sort is a randomized sorting algorithm based on the
  divide-and-conquer paradigm:
    Divide: pick a random element x (called the pivot) and partition S into
        L   elements less than x
        E   elements equal to x
        G   elements greater than x
    Recur: sort L and G
    Conquer: join L, E and G

Partition
- We partition an input sequence as follows:
    we remove, in turn, each element y from S, and
    we insert y into L, E or G, depending on the result of the comparison
    with the pivot x
- Each insertion and removal is at the beginning or at the end of a
  sequence, and hence takes O(1) time.
- Thus, the partition step of quick-sort takes O(n) time.

Algorithm partition(S, p)
    Input sequence S, position p of pivot
    Output subsequences L, E, G of the elements of S less than, equal to,
        or greater than the pivot, resp.
    L, E, G ← empty sequences
    x ← S.remove(p)
    while ¬S.isEmpty()
        y ← S.remove(S.first())
        if y < x
            L.insertLast(y)
        else if y = x
            E.insertLast(y)
        else { y > x }
            G.insertLast(y)
    return L, E, G

Quick-Sort Tree
- An execution of quick-sort is depicted by a binary tree:
    each node represents a recursive call of quick-sort and stores the
    unsorted sequence before the execution and its pivot, and the sorted
    sequence at the end of the execution
    the root is the initial call
    the leaves are calls on subsequences of size 0 or 1

Execution Example
- The slides step through quick-sort on the sequence 7 2 9 4 3 7 6 1:
  pivot selection, partition into L, E, G, recursive calls down to the
  base cases, and joins back up, ending with the sorted sequence
  1 2 3 4 6 7 7 9. (The intermediate quick-sort trees are omitted here.)

Worst-case Running Time
- The worst case for quick-sort occurs when the pivot is the unique
  minimum or maximum element.
- One of L and G has size n - 1 and the other has size 0.
- The running time is proportional to the sum n + (n - 1) + ... + 2 + 1.
- Thus, the worst-case running time of quick-sort is O(n^2).

Expected Running Time
- Consider a recursive call of quick-sort on a sequence of size s:
    Good call: the sizes of L and G are each less than 3s/4
    Bad call: one of L and G has size greater than 3s/4
- A call is good with probability 1/2: half of the possible pivots cause
  good calls (for s = 16, pivots of rank 5 through 12 are good; ranks
  1-4 and 13-16 are bad).

Expected Running Time, Part 2
- Probabilistic Fact: the expected number of coin tosses required in
  order to get k heads is 2k.
- For a node of depth i, we expect:
    i/2 ancestors are good calls
    the size of the input sequence for the current call is at most (3/4)^(i/2) n
- Therefore, we have:
    for a node of depth 2 log_{4/3} n, the expected input size is one
    the expected height of the quick-sort tree is O(log n)
- The amount of work done at the nodes of the same depth is O(n):
  O(n) time per level, over an expected height of O(log n) levels.
- Thus, the expected running time of quick-sort is O(n log n).

In-Place Quick-Sort
- Quick-sort can be implemented to run in-place.
- In the partition step, we use replace operations to rearrange the
  elements of the input sequence such that:
    the elements less than the pivot have rank less than h
    the elements equal to the pivot have rank between h and k
    the elements greater than the pivot have rank greater than k
- The recursive calls consider:
    elements with rank less than h
    elements with rank greater than k

Algorithm inPlaceQuickSort(S, l, r)
    Input sequence S, ranks l and r
    Output sequence S with the elements of rank between l and r
        rearranged in increasing order
    if l ≥ r
        return
    i ← a random integer between l and r
    x ← S.elemAtRank(i)
    (h, k) ← inPlacePartition(x)
    inPlaceQuickSort(S, l, h - 1)
    inPlaceQuickSort(S, k + 1, r)

Summary of Sorting Algorithms

    Algorithm       Time                 Notes
    selection-sort  O(n^2)               in-place; slow (good for small inputs)
    insertion-sort  O(n^2)               in-place; slow (good for small inputs)
    quick-sort      O(n log n) expected  in-place, randomized; fastest (good for large inputs)
    heap-sort       O(n log n)           in-place; fast (good for large inputs)
    merge-sort      O(n log n)           sequential data access; fast (good for huge inputs)
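
A runnable Java sketch of in-place quick-sort as described above, using a
three-way partition that computes the bounds h and k in a single pass
(this is one standard way to realize inPlacePartition; the slides leave
its details open):

    import java.util.Random;

    public class InPlaceQuickSort {
        private static final Random RAND = new Random();

        public static void quickSort(int[] s, int l, int r) {
            if (l >= r) return;                       // subsequence of size 0 or 1
            int x = s[l + RAND.nextInt(r - l + 1)];   // random pivot
            // Three-way partition: afterwards ranks l..h-1 hold elements < x,
            // ranks h..k hold elements = x, and ranks k+1..r hold elements > x.
            int h = l, k = r, i = l;
            while (i <= k) {
                if (s[i] < x)      swap(s, i++, h++);
                else if (s[i] > x) swap(s, i, k--);
                else               i++;
            }
            quickSort(s, l, h - 1);                   // recur on L
            quickSort(s, k + 1, r);                   // recur on G
        }

        private static void swap(int[] s, int i, int j) {
            int t = s[i]; s[i] = s[j]; s[j] = t;
        }

        public static void main(String[] args) {
            int[] a = {7, 2, 9, 4, 3, 7, 6, 1};
            quickSort(a, 0, a.length - 1);
            System.out.println(java.util.Arrays.toString(a)); // [1, 2, 3, 4, 6, 7, 7, 9]
        }
    }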

Sorting Lower Bound

Comparison-Based Sorting (§ 4.4)
- Many sorting algorithms are comparison based:
    they sort by making comparisons between pairs of objects
    examples: bubble-sort, selection-sort, insertion-sort, heap-sort,
    merge-sort, quick-sort, ...
- Let us therefore derive a lower bound on the running time of any
  algorithm that uses comparisons to sort n elements x1, x2, ..., xn.

Counting Comparisons
- Let us just count comparisons.
- Each possible run of the algorithm corresponds to a root-to-leaf path
  in a decision tree: each internal node is a comparison "is xi < xj?"
  with a yes branch and a no branch.

Decision Tree Height
- The height of this decision tree is a lower bound on the running time.
- Every possible input permutation must lead to a separate leaf output:
  if not, some input ...4...5... would have the same output ordering as
  ...5...4..., which would be wrong.
- Since there are n! = 1 * 2 * ... * n leaves, the minimum height (time)
  is at least log(n!).

The Lower Bound
- Any comparison-based sorting algorithm takes at least log(n!) time.
- Therefore, any such algorithm takes time at least

    log(n!) ≥ log((n/2)^(n/2)) = (n/2) log(n/2).

- That is, any comparison-based sorting algorithm must run in Ω(n log n)
  time.
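
The inequality above holds because the largest n/2 factors of n! are each
at least n/2; in LaTeX form, the one-line derivation is:

    \log(n!) \;=\; \sum_{i=1}^{n} \log i
             \;\ge\; \sum_{i=n/2}^{n} \log i
             \;\ge\; \frac{n}{2}\log\frac{n}{2}
             \;=\; \Omega(n \log n).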

Sets

Set Operations
- We represent a set by the sorted sequence of its elements.
- By specializing the auxiliary methods, the generic merge algorithm can
  be used to perform basic set operations:
    union
    intersection
    subtraction
- The running time of an operation on sets A and B should be at most
  O(nA + nB).

Storing a Set in a List
- We can implement a set with a list:
    elements are stored sorted according to some canonical ordering
    the nodes of the list store the set elements in order

Generic Merging
- Generalized merge of two sorted lists A and B:
    template method genericMerge
    auxiliary methods aIsLess, bIsLess, bothAreEqual
- Runs in O(nA + nB) time provided the auxiliary methods run in O(1) time.

Algorithm genericMerge(A, B)
    S ← empty sequence
    while ¬A.isEmpty() ∧ ¬B.isEmpty()
        a ← A.first().element(); b ← B.first().element()
        if a < b
            aIsLess(a, S); A.remove(A.first())
        else if b < a
            bIsLess(b, S); B.remove(B.first())
        else { b = a }
            bothAreEqual(a, b, S)
            A.remove(A.first()); B.remove(B.first())
    while ¬A.isEmpty()
        aIsLess(A.first().element(), S); A.remove(A.first())
    while ¬B.isEmpty()
        bIsLess(B.first().element(), S); B.remove(B.first())
    return S

Using Generic Merge for Set Operations
- Any of the set operations can be implemented using a generic merge.
- For example:
    for intersection: only copy elements that are duplicated in both lists
    for union: copy every element from both lists except for the duplicates
- All methods run in linear time; the space used is O(n).

- Specialization for intersection:
    aIsLess(a, S):          { do nothing }
    bIsLess(b, S):          { do nothing }
    bothAreEqual(a, b, S):  S.insertLast(a)

- Specialization for union:
    aIsLess(a, S):          S.insertLast(a)
    bIsLess(b, S):          S.insertLast(b)
    bothAreEqual(a, b, S):  S.insertLast(a)
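
A Java sketch of the template-method pattern described above; the three
auxiliary methods are overridden to turn one merge loop into intersection
or union (class names are illustrative):

    import java.util.LinkedList;

    public abstract class GenericMerge<T extends Comparable<T>> {
        // Hooks specialized by subclasses; the defaults do nothing.
        protected void aIsLess(T a, LinkedList<T> s) { }
        protected void bIsLess(T b, LinkedList<T> s) { }
        protected void bothAreEqual(T a, T b, LinkedList<T> s) { }

        /** Merges sorted lists a and b in O(|a| + |b|) time (a and b are consumed). */
        public LinkedList<T> merge(LinkedList<T> a, LinkedList<T> b) {
            LinkedList<T> s = new LinkedList<>();
            while (!a.isEmpty() && !b.isEmpty()) {
                int cmp = a.getFirst().compareTo(b.getFirst());
                if (cmp < 0)      aIsLess(a.removeFirst(), s);
                else if (cmp > 0) bIsLess(b.removeFirst(), s);
                else              bothAreEqual(a.removeFirst(), b.removeFirst(), s);
            }
            while (!a.isEmpty()) aIsLess(a.removeFirst(), s);
            while (!b.isEmpty()) bIsLess(b.removeFirst(), s);
            return s;
        }
    }

    // Intersection: keep only elements present in both lists.
    class Intersect<T extends Comparable<T>> extends GenericMerge<T> {
        @Override protected void bothAreEqual(T a, T b, LinkedList<T> s) { s.addLast(a); }
    }

    // Union: keep everything, but copy duplicates only once.
    class Union<T extends Comparable<T>> extends GenericMerge<T> {
        @Override protected void aIsLess(T a, LinkedList<T> s) { s.addLast(a); }
        @Override protected void bIsLess(T b, LinkedList<T> s) { s.addLast(b); }
        @Override protected void bothAreEqual(T a, T b, LinkedList<T> s) { s.addLast(a); }
    }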

Bucket-Sort and Radix-Sort

Bucket-Sort (§ 4.5.1)
- Let S be a sequence of n (key, element) items with keys in the range
  [0, N - 1].
- Bucket-sort uses the keys as indices into an auxiliary array B of
  sequences (buckets):
    Phase 1: empty sequence S by moving each item (k, o) into its bucket B[k]
    Phase 2: for i = 0, ..., N - 1, move the items of bucket B[i] to the
    end of sequence S
- Analysis:
    Phase 1 takes O(n) time
    Phase 2 takes O(n + N) time
    Bucket-sort takes O(n + N) time

Algorithm bucketSort(S, N)
    Input sequence S of (key, element) items with keys in the range [0, N - 1]
    Output sequence S sorted by increasing keys
    B ← array of N empty sequences
    while ¬S.isEmpty()
        f ← S.first()
        (k, o) ← S.remove(f)
        B[k].insertLast((k, o))
    for i ← 0 to N - 1
        while ¬B[i].isEmpty()
            f ← B[i].first()
            (k, o) ← B[i].remove(f)
            S.insertLast((k, o))

Example
- Key range [0, 9], input (7,d) (1,c) (3,a) (7,g) (3,b) (7,e):
    Phase 1 fills buckets B[1] = (1,c); B[3] = (3,a) (3,b); B[7] = (7,d) (7,g) (7,e)
    Phase 2 outputs (1,c) (3,a) (3,b) (7,d) (7,g) (7,e)

Properties and Extensions
- Key-type property:
    the keys are used as indices into an array and cannot be arbitrary
    objects; there is no external comparator
- Stable sort property:
    the relative order of any two items with the same key is preserved
    after the execution of the algorithm
- Extensions:
    integer keys in the range [a, b]: put item (k, o) into bucket B[k - a]
    string keys from a set D of possible strings, where D has constant
    size (e.g., names of the 50 U.S. states): sort D and compute the rank
    r(k) of each string k of D in the sorted sequence, then put item
    (k, o) into bucket B[r(k)]
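
A runnable Java sketch of bucket-sort on (key, element) items with keys
in [0, N - 1], kept stable by appending to and draining each bucket in
FIFO order:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.LinkedList;

    public class BucketSort {
        public record Item(int key, String element) { }

        /** Sorts s by increasing key in O(n + N) time; stable. */
        public static void bucketSort(LinkedList<Item> s, int N) {
            @SuppressWarnings("unchecked")
            Deque<Item>[] b = new Deque[N];
            for (int i = 0; i < N; i++) b[i] = new ArrayDeque<>();
            while (!s.isEmpty()) {                // Phase 1: scatter into buckets
                Item it = s.removeFirst();
                b[it.key()].addLast(it);
            }
            for (int i = 0; i < N; i++)           // Phase 2: gather in key order
                while (!b[i].isEmpty()) s.addLast(b[i].removeFirst());
        }
    }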

Lexicographic Order
- A d-tuple is a sequence of d keys (k1, k2, ..., kd), where key ki is
  said to be the i-th dimension of the tuple.
- Example: the Cartesian coordinates of a point in space are a 3-tuple.
- The lexicographic order of two d-tuples is recursively defined as follows:

    (x1, x2, ..., xd) < (y1, y2, ..., yd)
        ⇔  x1 < y1  ∨  (x1 = y1 ∧ (x2, ..., xd) < (y2, ..., yd))

  i.e., the tuples are compared by the first dimension, then by the
  second dimension, etc.

Lexicographic-Sort
- Let Ci be the comparator that compares two tuples by their i-th dimension.
- Let stableSort(S, C) be a stable sorting algorithm that uses comparator C.
- Lexicographic-sort sorts a sequence of d-tuples in lexicographic order
  by executing algorithm stableSort d times, once per dimension, from the
  last dimension to the first.
- Lexicographic-sort runs in O(d T(n)) time, where T(n) is the running
  time of stableSort.

Algorithm lexicographicSort(S)
    Input sequence S of d-tuples
    Output sequence S sorted in lexicographic order
    for i ← d downto 1
        stableSort(S, Ci)

- Example:
    (7,4,6) (5,1,5) (2,4,6) (2,1,4) (3,2,4)
    (2,1,4) (3,2,4) (5,1,5) (7,4,6) (2,4,6)   [after sorting by the 3rd dimension]
    (2,1,4) (5,1,5) (3,2,4) (7,4,6) (2,4,6)   [after sorting by the 2nd dimension]
    (2,1,4) (2,4,6) (3,2,4) (5,1,5) (7,4,6)   [after sorting by the 1st dimension]

Radix-Sort (§ 4.5.2)
- Radix-sort is a specialization of lexicographic-sort that uses
  bucket-sort as the stable sorting algorithm in each dimension.
- Radix-sort is applicable to tuples where the keys in each dimension i
  are integers in the range [0, N - 1].
- Radix-sort runs in time O(d(n + N)).

Algorithm radixSort(S, N)
    Input sequence S of d-tuples such that (0, ..., 0) ≤ (x1, ..., xd) and
        (x1, ..., xd) ≤ (N - 1, ..., N - 1) for each tuple (x1, ..., xd) in S
    Output sequence S sorted in lexicographic order
    for i ← d downto 1
        bucketSort(S, N)    { stable sort on the i-th dimension }

Radix-Sort for Binary Numbers
- Consider a sequence of n b-bit integers x = x_{b-1} ... x_1 x_0.
- We represent each element as a b-tuple of integers in the range [0, 1]
  and apply radix-sort with N = 2.
- This application of the radix-sort algorithm runs in O(bn) time.
- For example, we can sort a sequence of 32-bit integers in linear time.

Algorithm binaryRadixSort(S)
    Input sequence S of b-bit integers
    Output sequence S sorted
    replace each element x of S with the item (0, x)
    for i ← 0 to b - 1
        replace the key k of each item (k, x) of S with bit x_i of x
        bucketSort(S, 2)

Example
- Sorting a sequence of 4-bit integers 1001, 0010, 1101, 0001, 1110:
  each of the four bucket-sort passes (least significant bit first)
  reorders the sequence stably, ending with 0001, 0010, 1001, 1101, 1110.
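
A Java sketch of binary radix-sort on 32-bit non-negative integers: b = 32
stable bucket-sort passes with N = 2, implemented with two bit-buckets per
pass:

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class BinaryRadixSort {
        /** Sorts a (non-negative ints) in O(bn) time with b = 32 stable passes. */
        public static void radixSort(int[] a) {
            Deque<Integer> zeros = new ArrayDeque<>(), ones = new ArrayDeque<>();
            for (int i = 0; i < 32; i++) {             // least significant bit first
                for (int x : a)                         // stable scatter on bit i
                    ((x >>> i & 1) == 0 ? zeros : ones).addLast(x);
                int j = 0;                              // gather: all 0-bit items first
                while (!zeros.isEmpty()) a[j++] = zeros.removeFirst();
                while (!ones.isEmpty())  a[j++] = ones.removeFirst();
            }
        }

        public static void main(String[] args) {
            int[] a = {0b1001, 0b0010, 0b1101, 0b0001, 0b1110};
            radixSort(a);
            for (int x : a) System.out.println(Integer.toBinaryString(x));
        }
    }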

Selection

The Selection Problem
- Given an integer k and n elements x1, x2, ..., xn, taken from a total
  order, find the k-th smallest element in this set.
- Of course, we can sort the set in O(n log n) time and then index the
  k-th element.
- Example: for k = 3 and input 7 4 9 6 2, sorting gives 2 4 6 7 9, so the
  answer is 6.
- Can we solve the selection problem faster?

Quick-Select (§ 4.7)
- Quick-select is a randomized selection algorithm based on the
  prune-and-search paradigm:
    Prune: pick a random element x (called the pivot) and partition S into
        L   elements less than x
        E   elements equal to x
        G   elements greater than x
    Search: depending on k, either the answer is in E, or we need to
    recurse on either L or G:
        if k ≤ |L|, recurse on L
        if |L| < k ≤ |L| + |E|, the answer is x (done)
        if k > |L| + |E|, recurse on G with k ← k - |L| - |E|

Partition
- We partition an input sequence as in the quick-sort algorithm:
    we remove, in turn, each element y from S, and
    we insert y into L, E or G, depending on the result of the comparison
    with the pivot x
- Each insertion and removal is at the beginning or at the end of a
  sequence, and hence takes O(1) time.
- Thus, the partition step of quick-select takes O(n) time.

Algorithm partition(S, p)
    Input sequence S, position p of pivot
    Output subsequences L, E, G of the elements of S less than, equal to,
        or greater than the pivot, resp.
    L, E, G ← empty sequences
    x ← S.remove(p)
    while ¬S.isEmpty()
        y ← S.remove(S.first())
        if y < x
            L.insertLast(y)
        else if y = x
            E.insertLast(y)
        else { y > x }
            G.insertLast(y)
    return L, E, G

Quick-Select Visualization
- An execution of quick-select can be visualized by a recursion path:
  each node represents a recursive call of quick-select, and stores k and
  the remaining sequence.
- Example:
    k=5, S=(7 4 9 3 2 6 5 1 8)
    k=2, S=(7 4 9 6 5 8)
    k=2, S=(7 4 6 5)
    k=1, S=(7 6 5)
    answer: 5

Expected Running Time
- Consider a recursive call of quick-select on a sequence of size s:
    Good call: the sizes of L and G are each less than 3s/4
    Bad call: one of L and G has size greater than 3s/4
- A call is good with probability 1/2: half of the possible pivots cause
  good calls.

Expected Running Time, Part 2
- Probabilistic Fact #1: the expected number of coin tosses required in
  order to get one head is two.
- Probabilistic Fact #2: expectation is a linear function:
    E(X + Y) = E(X) + E(Y)
    E(cX) = cE(X)
- Let T(n) denote the expected running time of quick-select.
- By Fact #2,
    T(n) ≤ T(3n/4) + bn * (expected # of calls before a good call)
- By Fact #1,
    T(n) ≤ T(3n/4) + 2bn
- That is, T(n) is bounded by a geometric series:
    T(n) ≤ 2bn + 2b(3/4)n + 2b(3/4)^2 n + 2b(3/4)^3 n + ...
- So T(n) is O(n): we can solve the selection problem in O(n) expected time.

Deterministic Selection
- We can do selection in O(n) worst-case time.
- Main idea: recursively use the selection algorithm itself to find a
  good pivot for quick-select:
    divide S into n/5 sets of 5 elements each
    find a median in each set
    recursively find the median of the "baby" medians
- This choice of pivot guarantees a minimum size for both L and G, so the
  recursion shrinks geometrically; see Exercise C-4.24 for the details of
  the analysis.
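
A Java sketch of quick-select on an int array, following the L/E/G
partition and the three-way case analysis above:

    import java.util.Random;

    public class QuickSelect {
        private static final Random RAND = new Random();

        /** Returns the k-th smallest element of s (1 <= k <= s.length). */
        public static int quickSelect(int[] s, int k) {
            if (s.length == 1) return s[0];
            int x = s[RAND.nextInt(s.length)];           // random pivot
            int nL = 0, nE = 0, nG = 0;                   // sizes of L, E, G
            for (int y : s) { if (y < x) nL++; else if (y == x) nE++; else nG++; }
            if (k <= nL) {                                // answer lies in L
                int[] l = new int[nL]; int i = 0;
                for (int y : s) if (y < x) l[i++] = y;
                return quickSelect(l, k);
            } else if (k <= nL + nE) {                    // answer is the pivot
                return x;
            } else {                                      // answer lies in G
                int[] g = new int[nG]; int i = 0;
                for (int y : s) if (y > x) g[i++] = y;
                return quickSelect(g, k - nL - nE);
            }
        }

        public static void main(String[] args) {
            int[] s = {7, 4, 9, 3, 2, 6, 5, 1, 8};
            System.out.println(quickSelect(s, 5)); // 5
        }
    }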

The Greedy Method

Outline and Reading
- The Greedy Method Technique (5.1)
- Fractional Knapsack Problem (5.1.1)
- Task Scheduling (5.1.2)
- Minimum Spanning Trees (7.3) [future lecture]

The Greedy Method Technique
- The greedy method is a general algorithm design paradigm, built on the
  following elements:
    configurations: different choices, collections, or values to find
    objective function: a score assigned to configurations, which we want
    to either maximize or minimize
- It works best when applied to problems with the greedy-choice property:
    a globally-optimal solution can always be found by a series of local
    improvements from a starting configuration.

Making Change
- Problem: a dollar amount to reach and a collection of coin amounts to
  use to get there.
- Configuration: a dollar amount yet to return to a customer plus the
  coins already returned.
- Objective function: minimize the number of coins returned.
- Greedy solution: always return the largest coin you can.
- Example 1: coins are valued $.32, $.08, $.01.
    Has the greedy-choice property, since no amount over $.32 can be made
    with a minimum number of coins by omitting a $.32 coin (similarly for
    amounts over $.08, but under $.32).
- Example 2: coins are valued $.30, $.20, $.05, $.01.
    Does not have the greedy-choice property, since $.40 is best made
    with two $.20's, but the greedy solution will pick three coins
    ($.30, $.05, $.05).

The Fractional Knapsack Problem
- Given: a set S of n items, with each item i having
    bi - a positive benefit
    wi - a positive weight
- Goal: choose items with maximum total benefit but with weight at most W.
- If we are allowed to take fractional amounts, then this is the
  fractional knapsack problem.
    In this case, we let xi denote the amount we take of item i.
    Objective: maximize Σ_{i∈S} bi (xi / wi)
    Constraint: Σ_{i∈S} xi ≤ W
- Example: a knapsack of 10 ml and items 1-5 with
    weight:  4 ml, 8 ml, 2 ml, 6 ml, 1 ml
    benefit: $12,  $32,  $40,  $30,  $50
    value:   3,    4,    20,   5,    50    ($ per ml)
  Solution: 1 ml of item 5, 2 ml of item 3, 6 ml of item 4, 1 ml of item 2.

The Fractional Knapsack Algorithm
- Greedy choice: keep taking the item with the highest value
  (benefit-to-weight ratio), since
    Σ_{i∈S} bi (xi / wi) = Σ_{i∈S} (bi / wi) xi
- Correctness: suppose there is a better solution. Then there is an item
  i with higher value than a chosen item j (vi > vj), but with xi < wi
  and xj > 0. If we substitute some of j with i (an amount
  min{wi - xi, xj}), we get a better solution, a contradiction. Thus,
  there is no better solution than the greedy one.
- Run time: O(n log n), since selecting the highest-value item repeatedly
  amounts to sorting the items by value (or using a heap).

Algorithm fractionalKnapsack(S, W)
    Input: set S of items with benefit bi and weight wi; max. weight W
    Output: amount xi of each item i to maximize benefit with weight at most W
    for each item i in S
        xi ← 0
        vi ← bi / wi                      { value }
    w ← 0                                 { total weight }
    while w < W
        remove item i with highest vi
        xi ← min{wi, W - w}
        w ← w + min{wi, W - w}
Task Scheduling
- Given: a set T of n tasks, each having:
    a start time, si
    a finish time, fi (where si < fi)
- Goal: perform all the tasks using a minimum number of machines.
- Greedy choice: consider tasks by their start time and use as few
  machines as possible with this order.
- Run time: O(n log n), dominated by sorting the tasks by start time.
- Correctness: suppose there is a better schedule that uses k - 1
  machines while the algorithm uses k. Let i be the first task scheduled
  on machine k. Task i must conflict with k - 1 other tasks (the ones
  running on machines 1, ..., k - 1 at time si). But that means there is
  no non-conflicting schedule using k - 1 machines.

Algorithm taskSchedule(T)
    Input: set T of tasks with start time si and finish time fi
    Output: non-conflicting schedule with minimum number of machines
    m ← 0                                  { number of machines }
    while T is not empty
        remove task i with smallest si
        if there's a machine j with no task conflicting with i then
            schedule i on machine j
        else
            m ← m + 1
            schedule i on machine m

Example
- Tasks [1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8] (ordered by
  start time) can be performed on three machines, e.g.:
    Machine 1: [1,4], [4,7], [7,8]
    Machine 2: [1,3], [3,7]
    Machine 3: [2,5], [6,9]
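
A Java sketch of the greedy scheduler: sort tasks by start time and keep
a min-heap of machine finish times; reuse the machine that frees up
earliest, or open a new one:

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.PriorityQueue;

    public class TaskScheduling {
        public record Task(int start, int finish) { }

        /** Returns the minimum number of machines needed to run all tasks. */
        public static int minMachines(Task[] tasks) {
            Arrays.sort(tasks, Comparator.comparingInt(Task::start));
            PriorityQueue<Integer> finishTimes = new PriorityQueue<>(); // one entry per machine
            for (Task t : tasks) {
                // If the earliest-finishing machine is free by t.start, reuse it.
                if (!finishTimes.isEmpty() && finishTimes.peek() <= t.start())
                    finishTimes.poll();
                finishTimes.add(t.finish());  // schedule t on a reused or new machine
            }
            return finishTimes.size();
        }

        public static void main(String[] args) {
            Task[] tasks = { new Task(1,4), new Task(1,3), new Task(2,5), new Task(3,7),
                             new Task(4,7), new Task(6,9), new Task(7,8) };
            System.out.println(minMachines(tasks)); // 3
        }
    }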

Divide-and-Conquer

Outline and Reading
- Divide-and-conquer paradigm (5.2)
- Review of merge-sort (4.1.1)
- Recurrence equations (5.2.1)
    Iterative substitution
    Recursion trees
    Guess-and-test
    The master method
- Integer multiplication (5.2.2)

Divide-and-Conquer
- Divide-and-conquer is a general algorithm design paradigm:
    Divide: divide the input data S in two or more disjoint subsets S1, S2, ...
    Recur: solve the subproblems recursively
    Conquer: combine the solutions for S1, S2, ..., into a solution for S
- The base cases for the recursion are subproblems of constant size.
- Analysis can be done using recurrence equations.

Merge-Sort Review
- Merge-sort on an input sequence S with n elements consists of three steps:
    Divide: partition S into two sequences S1 and S2 of about n/2 elements each
    Recur: recursively sort S1 and S2
    Conquer: merge S1 and S2 into a unique sorted sequence
  (See Algorithm mergeSort(S, C) earlier in these notes.)

Recurrence Equation Analysis
- The conquer step of merge-sort consists of merging two sorted
  sequences, each with n/2 elements and implemented by means of a doubly
  linked list; it takes at most bn steps, for some constant b.
- Likewise, the base case (n < 2) will take at most b steps.
- Therefore, if we let T(n) denote the running time of merge-sort:

    T(n) = b                  if n < 2
    T(n) = 2T(n/2) + bn       if n ≥ 2

- We can therefore analyze the running time of merge-sort by finding a
  closed form solution to the above equation, that is, a solution that
  has T(n) only on the left-hand side.

Iterative Substitution
- In the iterative substitution, or "plug-and-chug," technique, we
  iteratively apply the recurrence equation to itself and see if we can
  find a pattern:

    T(n) = 2T(n/2) + bn
         = 2(2T(n/2^2) + b(n/2)) + bn
         = 2^2 T(n/2^2) + 2bn
         = 2^3 T(n/2^3) + 3bn
         = 2^4 T(n/2^4) + 4bn
         = ...
         = 2^i T(n/2^i) + ibn

- Note that the base case, T(n) = b, occurs when 2^i = n, that is, i = log n.
- So T(n) = bn + bn log n; thus, T(n) is O(n log n).

The Recursion Tree
- Alternatively, draw the recursion tree for the recurrence relation and
  look for a pattern:

    depth   # of T's   size     time
    0       1          n        bn
    1       2          n/2      bn
    i       2^i        n/2^i    bn

- Total time = bn + bn log n (last level plus all previous levels), so
  again T(n) is O(n log n).

Guess-and-Test Method
- In the guess-and-test method, we guess a closed form solution and then
  try to prove it is true by induction. Consider:

    T(n) = b                       if n < 2
    T(n) = 2T(n/2) + bn log n      if n ≥ 2

- Guess #1: T(n) < cn log n.

    T(n) = 2T(n/2) + bn log n
         ≤ 2(c(n/2) log(n/2)) + bn log n
         = cn (log n - log 2) + bn log n
         = cn log n - cn + bn log n

  Wrong: we cannot make this last line be less than cn log n.

Guess-and-Test Method, Part 2
- Recall the recurrence equation:

    T(n) = b                       if n < 2
    T(n) = 2T(n/2) + bn log n      if n ≥ 2

- Guess #2: T(n) < cn log^2 n.

    T(n) = 2T(n/2) + bn log n
         ≤ 2(c(n/2) log^2(n/2)) + bn log n
         = cn (log n - log 2)^2 + bn log n
         = cn log^2 n - 2cn log n + cn + bn log n
         ≤ cn log^2 n        if c > b

- So, T(n) is O(n log^2 n).
- In general, to use this method, you need to have a good guess and you
  need to be good at induction proofs.

Master Method
- Many divide-and-conquer recurrence equations have the form:

    T(n) = c                    if n < d
    T(n) = aT(n/b) + f(n)       if n ≥ d

- The Master Theorem:
    1. if f(n) is O(n^(log_b a - ε)) for some ε > 0, then T(n) is Θ(n^(log_b a))
    2. if f(n) is Θ(n^(log_b a) log^k n), then T(n) is Θ(n^(log_b a) log^(k+1) n)
    3. if f(n) is Ω(n^(log_b a + ε)) for some ε > 0, then T(n) is Θ(f(n)),
       provided af(n/b) ≤ δ f(n) for some δ < 1

Master Method Examples
- Example 1: T(n) = 4T(n/2) + n.
    log_b a = 2, so case 1 says T(n) is Θ(n^2).
- Example 2: T(n) = 2T(n/2) + n log n.
    log_b a = 1, so case 2 says T(n) is Θ(n log^2 n).
- Example 3: T(n) = T(n/3) + n log n.
    log_b a = 0, so case 3 says T(n) is Θ(n log n).
- Example 4: T(n) = 8T(n/2) + n^2.
    log_b a = 3, so case 1 says T(n) is Θ(n^3).
- Example 5: T(n) = 9T(n/3) + n^3.
    log_b a = 2, so case 3 says T(n) is Θ(n^3).
- Example 6: T(n) = T(n/2) + 1 (binary search).
    log_b a = 0, so case 2 (with k = 0) says T(n) is Θ(log n).
- Example 7: T(n) = 2T(n/2) + log n (heap construction).
    log_b a = 1, so case 1 says T(n) is Θ(n).

Iterative "Proof" of the Master Theorem
- Using iterative substitution, let us see if we can find a pattern:

    T(n) = aT(n/b) + f(n)
         = a(aT(n/b^2) + f(n/b)) + f(n)
         = a^2 T(n/b^2) + af(n/b) + f(n)
         = a^3 T(n/b^3) + a^2 f(n/b^2) + af(n/b) + f(n)
         = ...
         = a^(log_b n) T(1) + Σ_{i=0}^{(log_b n)-1} a^i f(n/b^i)
         = n^(log_b a) T(1) + Σ_{i=0}^{(log_b n)-1} a^i f(n/b^i)

- We then distinguish the three cases:
    case 1: the first term is dominant
    case 2: each part of the summation is equally dominant
    case 3: the summation is a geometric series dominated by f(n)

Integer Multiplication
- Algorithm: multiply two n-bit integers I and J.
- Divide step: split I and J into high-order and low-order bits:

    I = I_h 2^(n/2) + I_l
    J = J_h 2^(n/2) + J_l

- We can then define I*J by multiplying the parts and adding:

    I * J = (I_h 2^(n/2) + I_l) * (J_h 2^(n/2) + J_l)
          = I_h J_h 2^n + (I_h J_l + I_l J_h) 2^(n/2) + I_l J_l

- So, T(n) = 4T(n/2) + n, which implies T(n) is O(n^2).
- But that is no better than the algorithm we learned in grade school.

An Improved Integer Multiplication Algorithm
- Observe that there is a different way to multiply parts:

    I * J = I_h J_h 2^n + [(I_h - I_l)(J_l - J_h) + I_h J_h + I_l J_l] 2^(n/2) + I_l J_l
          = I_h J_h 2^n + [(I_h J_l - I_l J_l - I_h J_h + I_l J_h) + I_h J_h + I_l J_l] 2^(n/2) + I_l J_l
          = I_h J_h 2^n + (I_h J_l + I_l J_h) 2^(n/2) + I_l J_l

- This uses only three recursive multiplications, so T(n) = 3T(n/2) + n,
  which implies T(n) is O(n^(log_2 3)) by the Master Theorem.
- Thus, T(n) is O(n^1.585).
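
A Java sketch of the three-multiplication scheme above (Karatsuba) on
BigInteger, splitting at half the larger bit length; the built-in
multiply handles the base case:

    import java.math.BigInteger;

    public class Karatsuba {
        public static BigInteger multiply(BigInteger i, BigInteger j) {
            int n = Math.max(i.bitLength(), j.bitLength());
            if (n <= 64) return i.multiply(j);             // machine-sized base case
            int half = n / 2;
            BigInteger ih = i.shiftRight(half);             // high-order bits of I
            BigInteger il = i.subtract(ih.shiftLeft(half)); // low-order bits of I
            BigInteger jh = j.shiftRight(half);
            BigInteger jl = j.subtract(jh.shiftLeft(half));
            BigInteger hh = multiply(ih, jh);               // I_h * J_h
            BigInteger ll = multiply(il, jl);               // I_l * J_l
            BigInteger mid = multiply(ih.subtract(il), jl.subtract(jh))
                             .add(hh).add(ll);              // = I_h J_l + I_l J_h
            return hh.shiftLeft(2 * half).add(mid.shiftLeft(half)).add(ll);
        }

        public static void main(String[] args) {
            BigInteger a = BigInteger.TWO.pow(100).add(BigInteger.ONE);
            System.out.println(multiply(a, a).equals(a.multiply(a))); // true
        }
    }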

Dynamic Programming

Outline and Reading
- Matrix Chain-Product (5.3.1)
- The General Technique (5.3.2)
- 0-1 Knapsack Problem (5.3.3)

Matrix Chain-Products
- Dynamic programming is a general algorithm design paradigm. Rather than
  give the general structure, let us first give a motivating example:
  matrix chain-products.
- Review: matrix multiplication.
    C = A*B, where A is d × e and B is e × f:
        C[i, j] = Σ_{k=0}^{e-1} A[i, k] * B[k, j]
    computing C takes O(def) time
- Matrix Chain-Product:
    compute A = A0*A1*...*An-1, where Ai is di × di+1
    problem: how to parenthesize?
- Example:
    B is 3 × 100, C is 100 × 5, D is 5 × 5
    (B*C)*D takes 1500 + 75 = 1575 ops
    B*(C*D) takes 2500 + 1500 = 4000 ops

An Enumeration Approach
- Matrix chain-product algorithm #1:
    try all possible ways to parenthesize A = A0*A1*...*An-1
    calculate the number of ops for each one
    pick the one that is best
- Running time: the number of parenthesizations is equal to the number of
  binary trees with n nodes. This is exponential! It is called the
  Catalan number, and it is almost 4^n. This is a terrible algorithm!

A Greedy Approach
- Idea #1: repeatedly select the product that uses (up) the most operations.
- Counter-example:
    A is 10 × 5, B is 5 × 10, C is 10 × 5, D is 5 × 10
    greedy idea #1 gives (A*B)*(C*D), which takes 500 + 1000 + 500 = 2000 ops
    A*((B*C)*D) takes 500 + 250 + 250 = 1000 ops

Another Greedy Approach
- Idea #2: repeatedly select the product that uses the fewest operations.
- Counter-example:
    A is 101 × 11, B is 11 × 9, C is 9 × 100, D is 100 × 99
    greedy idea #2 gives A*((B*C)*D), which takes 109989 + 9900 + 108900 = 228789 ops
    (A*B)*(C*D) takes 9999 + 89991 + 89100 = 189090 ops
- The greedy approach is not giving us the optimal value.

A Recursive Approach
- Define subproblems:
    find the best parenthesization of Ai*Ai+1*...*Aj
    let Ni,j denote the number of operations done by this subproblem
    the optimal solution for the whole problem is N0,n-1
- Subproblem optimality: the optimal solution can be defined in terms of
  optimal subproblems.
    There has to be a final multiplication (root of the expression tree)
    for the optimal solution.
    Say the final multiply is at index k: (A0*...*Ak)*(Ak+1*...*An-1).
    Then the optimal solution N0,n-1 is the sum of two optimal
    subproblems, N0,k and Nk+1,n-1, plus the time for the last multiply.
    If the global optimum did not have these optimal subproblems, we
    could define an even better "optimal" solution.

A Characterizing Equation
- The global optimum has to be defined in terms of optimal subproblems,
  depending on where the final multiply is. Let us consider all possible
  places for that final multiply.
- Recall that Ai is a di × di+1 dimensional matrix. So, a characterizing
  equation for Ni,j is the following:

    Ni,j = min_{i ≤ k < j} { Ni,k + Nk+1,j + di d(k+1) d(j+1) }

- Note that the subproblems are not independent: the subproblems overlap.

A Dynamic Programming Algorithm
- Since subproblems overlap, we don't use recursion.
- Instead, we construct optimal subproblems "bottom-up":
    the Ni,i's are easy, so start with them
    then do subproblems of length 2, 3, ..., and so on
- Running time: O(n^3)

Algorithm matrixChain(S)
    Input: sequence S of n matrices to be multiplied
    Output: number of operations in an optimal parenthesization of S
    for i ← 0 to n-1 do
        Ni,i ← 0
    for b ← 1 to n-1 do
        for i ← 0 to n-b-1 do
            j ← i+b
            Ni,j ← +infinity
            for k ← i to j-1 do
                Ni,j ← min{Ni,j, Ni,k + Nk+1,j + di d(k+1) d(j+1)}

A Dynamic Programming Algorithm Visualization
- The bottom-up construction fills in the N array by diagonals.
- Ni,j gets values from previous entries in the i-th row and j-th column.
- Filling in each entry in the N table takes O(n) time, so the total run
  time is O(n^3).
- Getting the actual parenthesization can be done by remembering k for
  each N entry.
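
A Java sketch of the bottom-up table construction; d has length n + 1,
with matrix Ai of dimensions d[i] × d[i+1]:

    public class MatrixChain {
        /** Returns the minimum number of scalar multiplications for A0*...*An-1. */
        public static long matrixChain(int[] d) {
            int n = d.length - 1;                    // number of matrices
            long[][] N = new long[n][n];             // N[i][i] = 0 by default
            for (int b = 1; b < n; b++) {            // b = j - i, processed by diagonals
                for (int i = 0; i + b < n; i++) {
                    int j = i + b;
                    N[i][j] = Long.MAX_VALUE;
                    for (int k = i; k < j; k++)      // try every final multiply
                        N[i][j] = Math.min(N[i][j],
                            N[i][k] + N[k + 1][j] + (long) d[i] * d[k + 1] * d[j + 1]);
                }
            }
            return N[0][n - 1];
        }

        public static void main(String[] args) {
            // B (3x100), C (100x5), D (5x5): optimal is (B*C)*D = 1575 ops.
            System.out.println(matrixChain(new int[] {3, 100, 5, 5}));
        }
    }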

The General Dynamic


Programming Technique
answer

Applies to a problem that at first seems to


require a lot of time (possibly exponential),
provided we have:

11

Simple subproblems: the subproblems can be


defined in terms of a few variables, such as j, k, l,
m, and so on.
Subproblem optimality: the global optimum value
can be defined in terms of optimal subproblems
Subproblem overlap: the subproblems are not
independent, but instead they overlap (hence,
should be constructed bottom-up).

Dynamic Programming

12

Merge Sort

4/29/2002 11:40 AM

The 0/1 Knapsack Problem

Example

Given: A set S of n items, with each item i having

Given: A set S of n items, with each item i having

bi - a positive benefit
wi - a positive weight

Goal: Choose items with maximum total benefit but with


weight at most W.
If we are not allowed to take fractional amounts, then
this is the 0/1 knapsack problem.

In this case, we let T denote the set of items we take

Objective: maximize

Constraint:

w
iT

Weight:
Benefit:

Dynamic Programming

13

A 0/1 Knapsack Algorithm,


First Attempt

Solution:
1

4 in

2 in

2 in

6 in

2 in

$20

$3

$6

$25

$80

5 (2 in)
3 (2 in)
4 (4 in)

9 in

Dynamic Programming

14

A 0/1 Knapsack Algorithm,


Second Attempt

Sk: Set of items numbered 1 to k.


Define B[k] = best selection from Sk.
Problem: does not have subproblem optimality:

Goal: Choose items with maximum total benefit but with


weight at most W.
knapsack
Items:

b
iT

bi - a positive benefit
wi - a positive weight

Consider S={(3,2),(5,4),(8,5),(4,3),10,9)} weight-benefit pairs

Best for S4:

Sk: Set of items numbered 1 to k.


Define B[k,w] = best selection from Sk with
weight exactly equal to w
Good news: this does have subproblem
optimality:
B[k 1, w]
if wk > w

B[k , w] =
B
k
w
B
k
w
w
b
max{
[
1
,
],
[
1
,
]
}
else

k
k

Best for S5:

I.e., best subset of Sk with weight exactly w is


either the best subset of Sk-1 w/ weight w or the
best subset of Sk-1 w/ weight w-wk plus item k.
Dynamic Programming

15

Dynamic Programming

16

The 0/1 Knapsack


Algorithm
Recall definition of B[k,w]:

B[k 1, w]
if wk > w

B[k , w] =
else
max{B[k 1, w], B[k 1, w wk ] + bk }
Algorithm 01Knapsack(S, W):
Since B[k,w] is defined in
Input: set S of items w/ benefit bi
terms of B[k-1,*], we can
and weight wi; max. weight W
Output: benefit of best subset with
reuse the same array
weight at most W

Running time: O(nW).


Not a polynomial-time
algorithm if W is large
This is a pseudo-polynomial
time algorithm

for w 0 to W do
B[w] 0
for k 1 to n do
for w W downto wk do
if B[w-wk]+bk > B[w] then
B[w] B[w-wk]+bk

Dynamic Programming

17
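
A Java sketch of the one-dimensional table version; iterating w downward
is exactly what lets row k overwrite row k-1 in place:

    public class ZeroOneKnapsack {
        /** Returns the max total benefit of a subset of items with weight <= W. */
        public static int knapsack(int[] b, int[] w, int W) {
            int[] B = new int[W + 1];                // B[w] = best benefit at weight w
            for (int k = 0; k < b.length; k++)       // consider items one at a time
                for (int ww = W; ww >= w[k]; ww--)   // downward: each item used at most once
                    B[ww] = Math.max(B[ww], B[ww - w[k]] + b[k]);
            return B[W];
        }

        public static void main(String[] args) {
            int[] benefit = {20, 3, 6, 25, 80};
            int[] weight  = {4, 2, 2, 6, 2};
            System.out.println(knapsack(benefit, weight, 9)); // 106 (items 1, 3, 5)
        }
    }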

Graphs

Outline and Reading
- Graphs (6.1)
    Definition
    Applications
    Terminology
    Properties
    ADT
- Data structures for graphs (6.2)
    Edge list structure
    Adjacency list structure
    Adjacency matrix structure

Graph
- A graph is a pair (V, E), where
    V is a set of nodes, called vertices
    E is a collection of pairs of vertices, called edges
    vertices and edges are positions and store elements
- Example: a flight network.
    A vertex represents an airport and stores the three-letter airport
    code (e.g., SFO, LAX, ORD, DFW, HNL, PVD, MIA).
    An edge represents a flight route between two airports and stores the
    mileage of the route (e.g., ORD-PVD is 849 miles).

Edge Types
- Directed edge:
    ordered pair of vertices (u, v)
    first vertex u is the origin, second vertex v is the destination
    e.g., a flight (flight AA 1206 from ORD to PVD)
- Undirected edge:
    unordered pair of vertices (u, v)
    e.g., a flight route (ORD-PVD, 849 miles)
- Directed graph: all the edges are directed (e.g., route network).
- Undirected graph: all the edges are undirected (e.g., flight network).

Applications
- Electronic circuits: printed circuit board, integrated circuit
- Transportation networks: highway network, flight network
- Computer networks: local area network, Internet, Web
- Databases: entity-relationship diagram

Terminology
- End vertices (or endpoints) of an edge: U and V are the endpoints of a.
- Edges incident on a vertex: a, d, and b are incident on V.
- Adjacent vertices: U and V are adjacent.
- Degree of a vertex: X has degree 5.
- Parallel edges: h and i are parallel edges.
- Self-loop: j is a self-loop.
  (The named vertices and edges refer to the figure on the slides.)

Terminology (cont.)
- Path:
    sequence of alternating vertices and edges
    begins with a vertex and ends with a vertex
    each edge is preceded and followed by its endpoints
- Simple path: a path such that all its vertices and edges are distinct.
- Examples (from the slides' figure):
    P1 = (V, b, X, h, Z) is a simple path
    P2 = (U, c, W, e, X, g, Y, f, W, d, V) is a path that is not simple
- Cycle:
    circular sequence of alternating vertices and edges
    each edge is preceded and followed by its endpoints
- Simple cycle: a cycle such that all its vertices and edges are distinct.
- Examples (from the slides' figure):
    C1 = (V, b, X, g, Y, f, W, c, U, a, V) is a simple cycle
    C2 = (U, c, W, e, X, g, Y, f, W, d, V, a, U) is a cycle that is not simple

Properties
- Notation: n = number of vertices, m = number of edges,
  deg(v) = degree of vertex v.
- Property 1: Σ_v deg(v) = 2m.
    Proof: each endpoint is counted twice.
- Property 2: in an undirected graph with no self-loops and no multiple
  edges, m ≤ n(n - 1)/2.
    Proof: each vertex has degree at most n - 1.
- Example: n = 4, m = 6, deg(v) = 3 for every vertex v.

Main Methods of the Graph ADT
- Vertices and edges are positions and store elements.
- Accessor methods:
    aVertex()
    incidentEdges(v)
    endVertices(e)
    isDirected(e)
    origin(e)
    destination(e)
    opposite(v, e)
    areAdjacent(v, w)
- Update methods:
    insertVertex(o)
    insertEdge(v, w, o)
    insertDirectedEdge(v, w, o)
    removeVertex(v)
    removeEdge(e)
- Generic methods:
    numVertices()
    numEdges()
    vertices()
    edges()

Edge List Structure
- Vertex object: element; reference to position in the vertex sequence.
- Edge object: element; origin vertex object; destination vertex object;
  reference to position in the edge sequence.
- Vertex sequence: sequence of vertex objects.
- Edge sequence: sequence of edge objects.

Adjacency List Structure
- Edge list structure, plus:
- Incidence sequence for each vertex: sequence of references to the edge
  objects of the incident edges.
- Augmented edge objects: references to the associated positions in the
  incidence sequences of the end vertices.

Adjacency Matrix Structure
- Edge list structure, plus:
- Augmented vertex objects: integer key (index) associated with each vertex.
- 2D adjacency array: reference to the edge object for adjacent vertices,
  null for nonadjacent vertices.

Performance
- With n vertices, m edges, no parallel edges and no self-loops:

                         Edge List   Adjacency List         Adjacency Matrix
    Space                n + m       n + m                  n^2
    incidentEdges(v)     m           deg(v)                 n
    areAdjacent(v, w)    m           min(deg(v), deg(w))    1
    insertVertex(o)      1           1                      n^2
    insertEdge(v, w, o)  1           1                      1
    removeVertex(v)      m           deg(v)                 n^2
    removeEdge(e)        1           1                      1
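
A small Java sketch of an adjacency-list graph matching the performance
table: each vertex keeps its own incidence list, so incidentEdges(v) is
O(deg(v)). Class names are illustrative, and the removal methods are
omitted for brevity (they would need the position references described
above to meet the table's bounds):

    import java.util.ArrayList;
    import java.util.List;

    public class AdjListGraph<V, E> {
        public final class Vertex {
            public final V element;
            final List<Edge> incidence = new ArrayList<>(); // incidence sequence
            Vertex(V element) { this.element = element; }
        }
        public final class Edge {
            public final E element;
            public final Vertex origin, destination;
            Edge(Vertex u, Vertex v, E element) {
                origin = u; destination = v; this.element = element;
            }
        }

        private final List<Vertex> vertices = new ArrayList<>(); // vertex sequence
        private final List<Edge> edges = new ArrayList<>();      // edge sequence

        public Vertex insertVertex(V o) {                    // O(1)
            Vertex v = new Vertex(o); vertices.add(v); return v;
        }
        public Edge insertEdge(Vertex u, Vertex v, E o) {    // O(1)
            Edge e = new Edge(u, v, o);
            edges.add(e); u.incidence.add(e); v.incidence.add(e);
            return e;
        }
        public List<Edge> incidentEdges(Vertex v) { return v.incidence; } // O(deg(v))
        public int numVertices() { return vertices.size(); }
        public int numEdges() { return edges.size(); }
    }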

Campus Tour

Outline and Reading
- Overview of the assignment
- Review:
    adjacency matrix structure (6.2.3)
    Kruskal's MST algorithm (7.3.1)
    partition ADT and implementation (4.2.2)
- The decorator pattern (6.5.1)
- The traveling salesperson problem:
    definition
    approximation algorithm (13.4.3)

Graph Assignment
- Goals:
    learn and implement the adjacency matrix structure and Kruskal's
    minimum spanning tree algorithm
    understand and use the decorator pattern and various JDSL classes
    and interfaces
- Your task:
    implement the adjacency matrix structure for representing a graph
    implement Kruskal's MST algorithm
- Frontend:
    computation and visualization of an approximate traveling salesperson
    tour

Adjacency Matrix Structure (review)
- Edge list structure, plus augmented vertex objects (an integer
  key/index associated with each vertex) and a 2D adjacency array holding
  a reference to the edge object for adjacent vertices and null for
  nonadjacent vertices.

Kruskal's Algorithm
- The vertices are partitioned into "clouds":
    we start with one cloud per vertex
    clouds are merged during the execution of the algorithm
- Partition ADT:
    makeSet(o): create the set {o} and return a locator for object o
    find(l): return the set of the object with locator l
    union(A, B): merge sets A and B

Algorithm KruskalMSF(G)
    Input weighted graph G
    Output labeling of the edges of a minimum spanning forest of G
    Q ← new heap-based priority queue
    for all v ∈ G.vertices() do
        l ← makeSet(v)             { elementary cloud }
        setLocator(v, l)
    for all e ∈ G.edges() do
        Q.insert(weight(e), e)
    while ¬Q.isEmpty()
        e ← Q.removeMin()
        [u, v] ← G.endVertices(e)
        A ← find(getLocator(u))
        B ← find(getLocator(v))
        if A ≠ B
            setMSFedge(e)
            union(A, B)            { merge clouds }

Example
- The slides run Kruskal's algorithm on a small weighted graph, accepting
  edges in increasing weight order and skipping any edge whose endpoints
  already lie in the same cloud.

Partition Implementation
- A set is represented by the sequence of its elements.
- A position stores a reference back to the sequence itself (for
  operation find).
- The position of an element in the sequence serves as the locator for
  the element in the set.
- In operation union, we move the elements of the smaller sequence into
  the larger sequence.
- Worst-case running times:
    makeSet, find: O(1)
    union: O(min(nA, nB))
- Amortized analysis: consider a series of k Partition ADT operations
  that includes n makeSet operations.
    Each time we move an element into a new sequence, the size of its set
    at least doubles.
    An element is moved at most log2 n times.
    Moving an element takes O(1) time.
    The total time for the series of operations is O(k + n log n).

Analysis of Kruskal's Algorithm
- Graph operations:
    methods vertices and edges are called once
    method endVertices is called m times
- Label operations:
    we set vertex labels n times and get them 2m times
- Priority queue operations:
    we perform m insert operations and m removeMin operations
- Partition operations:
    we perform n makeSet operations, 2m find operations and no more than
    n - 1 union operations
- Kruskal's algorithm runs in O((n + m) log n) time provided the graph
  has no parallel edges and is represented by the adjacency list structure.

Decorator Pattern
- Labels are commonly used in graph algorithms:
    DFS: unexplored/visited label for vertices and unexplored/forward/back
    labels for edges
    Dijkstra and Prim-Jarnik: distance, locator, and parent labels for
    vertices
    Kruskal: locator label for vertices and MSF label for edges
- The decorator pattern extends the methods of the Position ADT to
  support the handling of attributes (labels):
    has(a): tests whether the position has attribute a
    get(a): returns the value of attribute a
    set(a, x): sets to x the value of attribute a
    destroy(a): removes attribute a and its associated value (for
    cleanup purposes)
- The decorator pattern can be implemented by storing a dictionary of
  (attribute, value) items at each position.

Traveling Salesperson Problem
- A tour of a graph is a spanning cycle (i.e., a cycle that goes through
  all the vertices).
- A traveling salesperson tour of a weighted graph is a tour that is
  simple (i.e., no repeated vertices or edges) and has minimum weight.
- No polynomial-time algorithms are known for computing traveling
  salesperson tours.
- The traveling salesperson problem (TSP) is a major open problem in
  computer science: find a polynomial-time algorithm computing a
  traveling salesperson tour or prove that none exists.
- The slides show an example of a traveling salesperson tour with
  weight 17.

TSP Approximation
- We can approximate a TSP tour with a tour of at most twice the weight
  for the case of Euclidean graphs:
    vertices are points in the plane
    every pair of vertices is connected by an edge
    the weight of an edge is the length of the segment joining the points
- Approximation algorithm:
    compute a minimum spanning tree
    form an Eulerian circuit around the MST
    transform the circuit into a tour
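
A Java sketch of Kruskal's algorithm, with the sequence-based partition
replaced by the more common array union-find using union by size (the
same "smaller into larger" rule as above):

    import java.util.Arrays;
    import java.util.Comparator;

    public class Kruskal {
        public record Edge(int u, int v, double weight) { }

        private static int find(int[] parent, int x) {
            while (parent[x] != x) x = parent[x] = parent[parent[x]]; // path halving
            return x;
        }

        /** Returns the edges of a minimum spanning forest of an n-vertex graph. */
        public static java.util.List<Edge> msf(int n, Edge[] edges) {
            int[] parent = new int[n], size = new int[n];
            for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
            Arrays.sort(edges, Comparator.comparingDouble(Edge::weight)); // removeMin order
            java.util.List<Edge> out = new java.util.ArrayList<>();
            for (Edge e : edges) {
                int a = find(parent, e.u()), b = find(parent, e.v());
                if (a != b) {                           // endpoints in different clouds
                    if (size[a] < size[b]) { int t = a; a = b; b = t; }
                    parent[b] = a; size[a] += size[b];  // merge smaller into larger
                    out.add(e);                         // e is an MSF edge
                }
            }
            return out;
        }
    }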

Depth-First Search

Outline and Reading
- Definitions (6.1)
    Subgraph
    Connectivity
    Spanning trees and forests
- Depth-first search (6.3.1)
    Algorithm
    Example
    Properties
    Analysis
- Applications of DFS (6.5)
    Path finding
    Cycle finding

Subgraphs
- A subgraph S of a graph G is a graph such that
    the vertices of S are a subset of the vertices of G
    the edges of S are a subset of the edges of G
- A spanning subgraph of G is a subgraph that contains all the vertices of G.

Connectivity
- A graph is connected if there is a path between every pair of vertices.
- A connected component of a graph G is a maximal connected subgraph of G.
- Example: a non-connected graph may consist of two (or more) connected
  components.

Trees and Forests
- A (free) tree is an undirected graph T such that
    T is connected
    T has no cycles
  This definition of tree is different from the one of a rooted tree.
- A forest is an undirected graph without cycles.
- The connected components of a forest are trees.

Spanning Trees and Forests
- A spanning tree of a connected graph is a spanning subgraph that is a tree.
- A spanning tree is not unique unless the graph is a tree.
- Spanning trees have applications to the design of communication networks.
- A spanning forest of a graph is a spanning subgraph that is a forest.

Depth-First Search
- Depth-first search (DFS) is a general technique for traversing a graph.
- A DFS traversal of a graph G
    visits all the vertices and edges of G
    determines whether G is connected
    computes the connected components of G
    computes a spanning forest of G
- DFS on a graph with n vertices and m edges takes O(n + m) time.
- DFS can be further extended to solve other graph problems:
    find and report a path between two given vertices
    find a cycle in the graph
- Depth-first search is to graphs what Euler tour is to binary trees.

DFS Algorithm
- The algorithm uses a mechanism for setting and getting labels of
  vertices and edges.

Algorithm DFS(G)
    Input graph G
    Output labeling of the edges of G as discovery edges and back edges
    for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
    for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)
    for all v ∈ G.vertices()
        if getLabel(v) = UNEXPLORED
            DFS(G, v)

Algorithm DFS(G, v)
    Input graph G and a start vertex v of G
    Output labeling of the edges of G in the connected component of v
        as discovery edges and back edges
    setLabel(v, VISITED)
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                DFS(G, w)
            else
                setLabel(e, BACK)

Example
- The slides trace DFS from vertex A, distinguishing unexplored vertices,
  visited vertices, unexplored edges, discovery edges, and back edges.

DFS and Maze Traversal
- The DFS algorithm is similar to a classic strategy for exploring a maze:
    we mark each intersection, corner and dead end (vertex) visited
    we mark each corridor (edge) traversed
    we keep track of the path back to the entrance (start vertex) by
    means of a rope (recursion stack)

Properties of DFS
- Property 1: DFS(G, v) visits all the vertices and edges in the
  connected component of v.
- Property 2: the discovery edges labeled by DFS(G, v) form a spanning
  tree of the connected component of v.

Analysis of DFS
- Setting/getting a vertex/edge label takes O(1) time.
- Each vertex is labeled twice:
    once as UNEXPLORED
    once as VISITED
- Each edge is labeled twice:
    once as UNEXPLORED
    once as DISCOVERY or BACK
- Method incidentEdges is called once for each vertex.
- DFS runs in O(n + m) time provided the graph is represented by the
  adjacency list structure. Recall that Σ_v deg(v) = 2m.

Path Finding
- We can specialize the DFS algorithm to find a path between two given
  vertices u and z using the template method pattern.
- We call DFS(G, u) with u as the start vertex.
- We use a stack S to keep track of the path between the start vertex and
  the current vertex.
- As soon as destination vertex z is encountered, we return the path as
  the contents of the stack.

Algorithm pathDFS(G, v, z)
    setLabel(v, VISITED)
    S.push(v)
    if v = z
        return S.elements()
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                S.push(e)
                pathDFS(G, w, z)
                S.pop()            { e gets popped }
            else
                setLabel(e, BACK)
    S.pop()                        { v gets popped }

Cycle Finding
- We can specialize the DFS algorithm to find a simple cycle using the
  template method pattern.
- We use a stack S to keep track of the path between the start vertex and
  the current vertex.
- As soon as a back edge (v, w) is encountered, we return the cycle as
  the portion of the stack from the top to vertex w.

Algorithm cycleDFS(G, v)
    setLabel(v, VISITED)
    S.push(v)
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            S.push(e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                cycleDFS(G, w)
                S.pop()            { e gets popped }
            else
                T ← new empty stack
                repeat
                    o ← S.pop()
                    T.push(o)
                until o = w
                return T.elements()
    S.pop()                        { v gets popped }
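
A Java sketch of recursive DFS over a plain adjacency-list representation
(vertex indices stand in for positions; only vertex labels are kept, and
the edge-label bookkeeping of the pseudocode is noted in comments):

    import java.util.List;

    public class DFS {
        // adj[v] lists the neighbors of vertex v in an undirected graph.
        private final List<Integer>[] adj;
        private final boolean[] visited;

        public DFS(List<Integer>[] adj) {
            this.adj = adj;
            this.visited = new boolean[adj.length];
        }

        /** Visits every vertex and edge of the component of v in O(n + m) time. */
        public void dfs(int v) {
            visited[v] = true;
            for (int w : adj[v])
                if (!visited[w])   // (v, w) is a discovery edge
                    dfs(w);
                                   // else: (v, w) is a back edge (or already seen)
        }

        /** Runs DFS on every component, as in Algorithm DFS(G). */
        public void dfsAll() {
            for (int v = 0; v < adj.length; v++)
                if (!visited[v]) dfs(v);
        }
    }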

Breadth-First Search

5/7/2002 11:06 AM

Outline and Reading


Breadth-first search (6.3.3)

Breadth-First Search

Algorithm
Example
Properties
Analysis
Applications

L0
L1

L2

DFS vs. BFS (6.3.3)

Comparison of applications
Comparison of edge labels

5/7/2002 11:06 AM

Breadth-First Search

5/7/2002 11:06 AM

Breadth-First Search
Breadth-first search
(BFS) is a general
technique for traversing
a graph
A BFS traversal of a
graph G

Visits all the vertices and


edges of G
Determines whether G is
connected
Computes the connected
components of G
Computes a spanning
forest of G

5/7/2002 11:06 AM

L0
L1

L1

C
E

5/7/2002 11:06 AM

L0

L0
L1

L1

Breadth-First Search

L0

C
E

L0

A
C
E

Breadth-First Search

5/7/2002 11:06 AM

Algorithm BFS(G, s)
  L0 ← new empty sequence
  L0.insertLast(s)
  setLabel(s, VISITED)
  i ← 0
  while ¬Li.isEmpty()
    Li+1 ← new empty sequence
    for all v ∈ Li.elements()
      for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
          w ← opposite(v, e)
          if getLabel(w) = UNEXPLORED
            setLabel(e, DISCOVERY)
            setLabel(w, VISITED)
            Li+1.insertLast(w)
          else
            setLabel(e, CROSS)
    i ← i + 1

Algorithm BFS(G)
  Input graph G
  Output labeling of the edges and partition of the vertices of G
  for all u ∈ G.vertices()
    setLabel(u, UNEXPLORED)
  for all e ∈ G.edges()
    setLabel(e, UNEXPLORED)
  for all v ∈ G.vertices()
    if getLabel(v) = UNEXPLORED
      BFS(G, v)
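As a concrete counterpart, here is a minimal Python sketch of the level-by-level traversal; the adjacency-dict representation and names are illustrative, not from the text.

# Sketch of BFS mirroring the sequences L0, L1, ... in the pseudocode above.
def bfs_levels(graph, s):
    visited = {s}
    levels = [[s]]
    while levels[-1]:
        nxt = []
        for v in levels[-1]:
            for w in graph[v]:
                if w not in visited:      # discovery edge (v, w)
                    visited.add(w)
                    nxt.append(w)
                # else: (v, w) is a cross edge (or already processed)
        levels.append(nxt)
    return levels[:-1]                    # drop the final empty level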

Example
[Figure legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, cross edge; the snapshots show levels L0, L1, L2 growing from the start vertex]

Properties
- Notation: Gs is the connected component of s
- Property 1: BFS(G, s) visits all the vertices and edges of Gs
- Property 2: The discovery edges labeled by BFS(G, s) form a spanning tree Ts of Gs
- Property 3: For each vertex v in Li
  - The path of Ts from s to v has i edges
  - Every path from s to v in Gs has at least i edges

Analysis
- Setting/getting a vertex/edge label takes O(1) time
- Each vertex is labeled twice: once as UNEXPLORED, once as VISITED
- Each edge is labeled twice: once as UNEXPLORED, once as DISCOVERY or CROSS
- Each vertex is inserted once into a sequence Li
- Method incidentEdges is called once for each vertex
- BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
- Recall that Σv deg(v) = 2m

Applications
- Using the template method pattern, we can specialize the BFS traversal of a graph G to solve the following problems in O(n + m) time:
  - Compute the connected components of G
  - Compute a spanning forest of G
  - Find a simple cycle in G, or report that G is a forest
  - Given two vertices of G, find a path in G between them with the minimum number of edges, or report that no such path exists
- BFS can be further extended to solve other graph problems, e.g., find and report a path with the minimum number of edges between two given vertices, or find a simple cycle, if there is one

DFS vs. BFS
  Applications                                          DFS  BFS
  Spanning forest, connected components, paths, cycles   ✓    ✓
  Shortest paths                                              ✓
  Biconnected components                                  ✓

DFS vs. BFS (cont.)
- Back edge (v,w): w is an ancestor of v in the tree of discovery edges
- Cross edge (v,w): w is in the same level as v or in the next level in the tree of discovery edges

Biconnectivity

Outline and Reading
- Definitions (6.3.2)
  - Separation vertices and edges
  - Biconnected graph
  - Biconnected components
  - Equivalence classes
  - Linked edges and link components
- Algorithms (6.3.2)
  - Auxiliary graph
  - Proxy graph

Separation Edges and Vertices
- Definitions: let G be a connected graph
  - A separation edge of G is an edge whose removal disconnects G
  - A separation vertex of G is a vertex whose removal disconnects G
- Applications: separation edges and vertices represent single points of failure in a network and are critical to the operation of the network
- Example: in the flight network shown on the slide, DFW, LGA and LAX are separation vertices, and (DFW,LAX) is a separation edge

Biconnected Graph
- Equivalent definitions of a biconnected graph G:
  - Graph G has no separation edges and no separation vertices
  - For any two vertices u and v of G, there are two disjoint simple paths between u and v (i.e., two simple paths between u and v that share no other vertices or edges)
  - For any two vertices u and v of G, there is a simple cycle containing u and v

Biconnected Components
- Biconnected component of a graph G:
  - A maximal biconnected subgraph of G, or
  - A subgraph consisting of a separation edge of G and its end vertices
- Interaction of biconnected components:
  - An edge belongs to exactly one biconnected component
  - A nonseparation vertex belongs to exactly one biconnected component
  - A separation vertex belongs to two or more biconnected components
[Figure omitted: a flight network with four biconnected components]

Equivalence Classes
- Given a set S, a relation R on S is a set of ordered pairs of elements of S, i.e., R is a subset of S×S
- An equivalence relation R on S satisfies the following properties:
  - Reflexive: (x,x) ∈ R
  - Symmetric: (x,y) ∈ R ⇒ (y,x) ∈ R
  - Transitive: (x,y) ∈ R ∧ (y,z) ∈ R ⇒ (x,z) ∈ R
- An equivalence relation R on S induces a partition of the elements of S into equivalence classes
- Example (connectivity relation among the vertices of a graph):
  - Let V be the set of vertices of a graph G
  - Define the relation C = {(v,w) ∈ V×V such that G has a path from v to w}
  - Relation C is an equivalence relation
  - The equivalence classes of relation C are the vertices in each connected component of graph G
Link Relation
- Edges e and f of a connected graph G are linked if
  - e = f, or
  - G has a simple cycle containing e and f
- Theorem: The link relation on the edges of a graph is an equivalence relation
- Proof sketch:
  - The reflexive and symmetric properties follow from the definition
  - For the transitive property, consider two simple cycles sharing an edge

Link Components
- The link components of a connected graph G are the equivalence classes of edges with respect to the link relation
- A biconnected component of G is the subgraph of G induced by an equivalence class of linked edges
- A separation edge is a single-element equivalence class of linked edges
- A separation vertex has incident edges in at least two distinct equivalence classes of linked edges
- Example: equivalence classes of linked edges {a}, {b, c, d, e, f}, {g, i, j}

Auxiliary Graph
- Auxiliary graph B for a connected graph G:
  - Associated with a DFS traversal of G
  - The vertices of B are the edges of G
  - For each back edge e of G, B has edges (e,f1), (e,f2), …, (e,fk), where f1, f2, …, fk are the discovery edges of G that form a simple cycle with e
  - Its connected components correspond to the link components of G
- In the worst case, the number of edges of the auxiliary graph is proportional to nm

Proxy Graph
- Proxy graph F for a connected graph G:
  - A spanning forest of the auxiliary graph B
  - Has m vertices and O(m) edges
  - Can be constructed in O(n + m) time
  - Its connected components (trees) correspond to the link components of G

Algorithm proxyGraph(G)
  Input connected graph G
  Output proxy graph F for G
  F ← empty graph
  DFS(G, s)  { s is any vertex of G }
  for all discovery edges e of G
    F.insertVertex(e)
    setLabel(e, UNLINKED)
  for all vertices v of G in DFS visit order
    for all back edges e = (u,v)
      F.insertVertex(e)
      while u ≠ s
        f ← discovery edge with dest. u
        F.insertEdge(e, f, ∅)
        if getLabel(f) = UNLINKED
          setLabel(f, LINKED)
          u ← origin of edge f
        else
          u ← s  { ends the while loop }
  return F

Conclusion
- Given a graph G with n vertices and m edges, we can compute the following in O(n + m) time:
  - The biconnected components of G
  - The separation vertices of G
  - The separation edges of G
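The proxy graph is the slides' route to these quantities. For comparison only, here is a common alternative: a Python sketch that finds the separation edges (bridges) directly with one DFS over discovery times and low values, in the style of Hopcroft and Tarjan. It is not the proxy-graph algorithm above, and the adjacency-dict representation is our illustrative assumption.

# Sketch: separation edges (bridges) of an undirected graph in one DFS.
def bridges(graph):
    disc, low, out = {}, {}, []
    timer = [0]

    def dfs(v, parent):
        disc[v] = low[v] = timer[0]; timer[0] += 1
        for w in graph[v]:
            if w not in disc:
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] > disc[v]:      # no back edge jumps over (v, w)
                    out.append((v, w))    # (v, w) is a separation edge
            elif w != parent:
                low[v] = min(low[v], disc[w])

    for v in graph:
        if v not in disc:
            dfs(v, None)
    return out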

Directed Graphs

Outline and Reading
- Reachability (6.4.1)
  - Directed DFS
  - Strong connectivity
- Transitive closure (6.4.2)
  - The Floyd-Warshall Algorithm
- Directed Acyclic Graphs (DAGs) (6.4.4)
  - Topological Sorting

Digraphs
- A digraph is a graph whose edges are all directed
- Applications: one-way streets, flights, task scheduling

Digraph Properties
- A graph G = (V,E) such that each edge goes in one direction: edge (a,b) goes from a to b, but not b to a
- If G is simple, m < n·(n − 1)
- If we keep in-edges and out-edges in separate adjacency lists, we can perform listing of in-edges and out-edges in time proportional to their size

Digraph Application
- Scheduling: edge (a,b) means task a must be completed before b can be started
[Figure omitted: a course-prerequisite digraph over ics courses, ending in "the good life"]

Directed DFS
- We can specialize the traversal algorithms (DFS and BFS) to digraphs by traversing edges only along their direction
- In the directed DFS algorithm, we have four types of edges: discovery edges, back edges, forward edges, cross edges
- A directed DFS starting at a vertex s determines the vertices reachable from s

Reachability
- DFS tree rooted at v: vertices reachable from v via directed paths

Strong Connectivity
- Each vertex can reach all other vertices

Strong Connectivity Algorithm
- Pick a vertex v in G
- Perform a DFS from v in G: if there's a w not visited, print "no"
- Let G' be G with edges reversed
- Perform a DFS from v in G': if there's a w not visited, print "no"; else, print "yes"
- Running time: O(n + m)

Strongly Connected Components
- Maximal subgraphs such that each vertex can reach all other vertices in the subgraph
- Can also be done in O(n + m) time using DFS, but is more complicated (similar to biconnectivity)
- Example: components {a,c,g} and {f,d,e,b}

Transitive Closure
- Given a digraph G, the transitive closure of G is the digraph G* such that:
  - G* has the same vertices as G
  - if G has a directed path from u to v (u ≠ v), G* has a directed edge from u to v
- The transitive closure provides reachability information about a digraph

Computing the Transitive Closure
- If there's a way to get from A to B and from B to C, then there's a way to get from A to C
- We can perform DFS starting at each vertex: O(n(n + m))
- Alternatively... use dynamic programming: the Floyd-Warshall Algorithm
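A minimal Python sketch of the two-DFS test just described, assuming an adjacency-dict digraph; the names are illustrative.

# Sketch: strong connectivity via one search in G and one in G reversed.
def is_strongly_connected(graph):
    def reachable(adj, s):
        seen, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    vertices = list(graph)
    if not vertices:
        return True
    v = vertices[0]
    if len(reachable(graph, v)) != len(vertices):
        return False                      # some w not visited: "no"
    reverse = {u: [] for u in graph}      # G' = G with edges reversed
    for u in graph:
        for w in graph[u]:
            reverse[w].append(u)
    return len(reachable(reverse, v)) == len(vertices)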

Floyd-Warshall Transitive Closure
- Idea #1: Number the vertices 1, 2, …, n
- Idea #2: Consider paths that use only vertices numbered 1, 2, …, k, as intermediate vertices: such a path either uses only vertices numbered 1, …, k − 1, or goes through vk, splitting into two subpaths that each use only vertices numbered 1, …, k − 1 (add the edge (vi, vj) if it's not already in)

Floyd-Warshall's Algorithm
- Floyd-Warshall's algorithm numbers the vertices of G as v1, …, vn and computes a series of digraphs G0, …, Gn, where G0 = G and Gk has a directed edge (vi, vj) if G has a directed path from vi to vj with intermediate vertices in the set {v1, …, vk}
- We have that Gn = G*
- In phase k, digraph Gk is computed from Gk−1
- Running time: O(n³), assuming areAdjacent is O(1) (e.g., adjacency matrix)

Algorithm FloydWarshall(G)
  Input digraph G
  Output transitive closure G* of G
  i ← 1
  for all v ∈ G.vertices()
    denote v as vi
    i ← i + 1
  G0 ← G
  for k ← 1 to n do
    Gk ← Gk−1
    for i ← 1 to n (i ≠ k) do
      for j ← 1 to n (j ≠ i, k) do
        if Gk−1.areAdjacent(vi, vk) ∧ Gk−1.areAdjacent(vk, vj)
          if ¬Gk.areAdjacent(vi, vj)
            Gk.insertDirectedEdge(vi, vj, k)
  return Gn
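A Python sketch of the same dynamic program on a boolean adjacency matrix; vertices are 0..n−1 here instead of v1..vn, and all names are illustrative.

# Sketch of Floyd-Warshall transitive closure; reach[i][j] is True if there
# is an edge (and, at the end, a path) from i to j.
def transitive_closure(adj):
    n = len(adj)
    reach = [row[:] for row in adj]       # G0 = G (copy)
    for k in range(n):
        for i in range(n):
            if reach[i][k]:               # small shortcut over the pseudocode
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach                          # Gn = G*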

Floyd-Warshall Example
[Figures omitted: iterations 1 through 6 and the conclusion, showing the digraphs G0, …, G7 on a flight network with vertices v1, …, v7 (BOS, ORD, JFK, SFO, LAX, DFW, MIA)]

DAGs and Topological Ordering
- A directed acyclic graph (DAG) is a digraph that has no directed cycles
- A topological ordering of a digraph is a numbering v1, …, vn of the vertices such that for every edge (vi, vj), we have i < j
- Example: in a task scheduling digraph, a topological ordering is a task sequence that satisfies the precedence constraints
- Theorem: A digraph admits a topological ordering if and only if it is a DAG

Topological Sorting
- Number vertices, so that (u,v) in E implies u < v
- Example: a typical student day as a DAG, with topological ordering 1 wake up, 2 study computer sci., 3 eat, 4 nap, 5 more c.s., 6 work out, 7 play, 8 write c.s. program, 9 make cookies for professors, 10 sleep, 11 dream about graphs

Algorithm for Topological Sorting

Method TopologicalSort(G)
  H ← G  // Temporary copy of G
  n ← G.numVertices()
  while H is not empty do
    let v be a vertex with no outgoing edges
    label v ← n
    n ← n − 1
    remove v from H

- Running time: O(n + m). How?

Topological Sorting Algorithm using DFS
- Simulate the algorithm by using depth-first search
- Note: this algorithm is different from the one in Goodrich-Tamassia
- O(n + m) time

Algorithm topologicalDFS(G)
  Input dag G
  Output topological ordering of G
  n ← G.numVertices()
  for all u ∈ G.vertices()
    setLabel(u, UNEXPLORED)
  for all e ∈ G.edges()
    setLabel(e, UNEXPLORED)
  for all v ∈ G.vertices()
    if getLabel(v) = UNEXPLORED
      topologicalDFS(G, v)

Algorithm topologicalDFS(G, v)
  Input graph G and a start vertex v of G
  Output labeling of the vertices of G in the connected component of v
  setLabel(v, VISITED)
  for all e ∈ G.incidentEdges(v)
    if getLabel(e) = UNEXPLORED
      w ← opposite(v, e)
      if getLabel(w) = UNEXPLORED
        setLabel(e, DISCOVERY)
        topologicalDFS(G, w)
      else
        { e is a forward or cross edge }
  label v with topological number n
  n ← n − 1
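A Python sketch of the DFS-based version, numbering vertices n, n−1, … as the recursion unwinds; it assumes the input is a DAG given as an adjacency dict, and the names are illustrative.

# Sketch of DFS-based topological sorting.
def topological_sort(graph):
    number = {}
    n = [len(graph)]                      # next topological number to assign

    def dfs(v):
        number[v] = None                  # mark VISITED (number pending)
        for w in graph[v]:
            if w not in number:
                dfs(w)
        number[v] = n[0]                  # label v on the way back up
        n[0] -= 1

    for v in graph:
        if v not in number:
            dfs(v)
    return number                         # vertex -> topological number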

Topological Sorting Example
[Figures omitted: a run of the DFS-based algorithm on a sample DAG, assigning topological numbers 9, 8, 7, … to the vertices as the recursion unwinds]

Shortest Paths

Outline and Reading
- Weighted graphs (7.1)
  - Shortest path problem
  - Shortest path properties
- Dijkstra's algorithm (7.1.1)
  - Algorithm
  - Edge relaxation
- The Bellman-Ford algorithm (7.1.2)
- Shortest paths in dags (7.1.3)
- All-pairs shortest paths (7.2.1)

Weighted Graphs
- In a weighted graph, each edge has an associated numerical value, called the weight of the edge
- Edge weights may represent distances, costs, etc.
- Example: in a flight route graph, the weight of an edge represents the distance in miles between the endpoint airports

Shortest Path Problem
- Given a weighted graph and two vertices u and v, we want to find a path of minimum total weight between u and v
- The length of a path is the sum of the weights of its edges
- Example: shortest path between Providence and Honolulu
- Applications: Internet packet routing, flight reservations, driving directions

Shortest Path Properties
- Property 1: A subpath of a shortest path is itself a shortest path
- Property 2: There is a tree of shortest paths from a start vertex to all the other vertices
- Example: tree of shortest paths from Providence

Dijkstra's Algorithm
- The distance of a vertex v from a vertex s is the length of a shortest path between s and v
- Dijkstra's algorithm computes the distances of all the vertices from a given start vertex s
- Assumptions: the graph is connected, the edges are undirected, the edge weights are nonnegative
- We grow a "cloud" of vertices, beginning with s and eventually covering all the vertices
- We store with each vertex v a label d(v) representing the distance of v from s in the subgraph consisting of the cloud and its adjacent vertices
- At each step:
  - We add to the cloud the vertex u outside the cloud with the smallest distance label, d(u)
  - We update the labels of the vertices adjacent to u
Edge Relaxation
- Consider an edge e = (u,z) such that
  - u is the vertex most recently added to the cloud
  - z is not in the cloud
- The relaxation of edge e updates distance d(z) as follows:
  d(z) ← min{d(z), d(u) + weight(e)}
- Example: with d(u) = 50, weight(e) = 10 and d(z) = 75, relaxation improves d(z) to 60

Example
[Figures omitted: snapshots of Dijkstra's algorithm growing the cloud on a sample weighted graph]

Dijkstra's Algorithm
- A priority queue stores the vertices outside the cloud (key: distance; element: vertex)
- Locator-based methods: insert(k,e) returns a locator; replaceKey(l,k) changes the key of an item
- We store two labels with each vertex: distance (the d(v) label) and locator in the priority queue

Algorithm DijkstraDistances(G, s)
  Q ← new heap-based priority queue
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
    l ← Q.insert(getDistance(v), v)
    setLocator(v, l)
  while ¬Q.isEmpty()
    u ← Q.removeMin()
    for all e ∈ G.incidentEdges(u)
      { relax edge e }
      z ← G.opposite(u, e)
      r ← getDistance(u) + weight(e)
      if r < getDistance(z)
        setDistance(z, r)
        Q.replaceKey(getLocator(z), r)
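A Python sketch of the same algorithm with the standard-library heapq: instead of locators and replaceKey, stale queue entries are simply skipped ("lazy deletion"). The graph maps each vertex to (neighbor, weight) pairs; names are illustrative.

import heapq

# Sketch of DijkstraDistances with lazy deletion in place of locators.
def dijkstra_distances(graph, s):
    d = {v: float("inf") for v in graph}
    d[s] = 0
    pq = [(0, s)]
    done = set()
    while pq:
        du, u = heapq.heappop(pq)
        if u in done:
            continue                      # stale entry; skip
        done.add(u)                       # u joins the cloud
        for z, w in graph[u]:
            r = du + w                    # relax edge (u, z)
            if r < d[z]:
                d[z] = r
                heapq.heappush(pq, (r, z))
    return d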

Analysis
- Graph operations: method incidentEdges is called once for each vertex
- Label operations: we set/get the distance and locator labels of vertex z O(deg(z)) times; setting/getting a label takes O(1) time
- Priority queue operations: each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time; the key of a vertex in the priority queue is modified at most deg(w) times, where each key change takes O(log n) time
- Dijkstra's algorithm runs in O((n + m) log n) time provided the graph is represented by the adjacency list structure
- Recall that Σv deg(v) = 2m
- The running time can also be expressed as O(m log n) since the graph is connected

Extension
- Using the template method pattern, we can extend Dijkstra's algorithm to return a tree of shortest paths from the start vertex to all other vertices
- We store with each vertex a third label: parent edge in the shortest path tree
- In the edge relaxation step, we update the parent label

Algorithm DijkstraShortestPathsTree(G, s)
  { as DijkstraDistances, with these additions }
  for all v ∈ G.vertices()
    …
    setParent(v, ∅)
    …
  for all e ∈ G.incidentEdges(u)
    { relax edge e }
    z ← G.opposite(u, e)
    r ← getDistance(u) + weight(e)
    if r < getDistance(z)
      setDistance(z, r)
      setParent(z, e)
      Q.replaceKey(getLocator(z), r)

Why Dijkstra's Algorithm Works
- Dijkstra's algorithm is based on the greedy method: it adds vertices by increasing distance.
- Suppose it didn't find all shortest distances. Let F be the first wrong vertex the algorithm processed.
- When the previous node, D, on the true shortest path was considered, its distance was correct.
- But the edge (D,F) was relaxed at that time!
- Thus, so long as d(F) ≥ d(D), F's distance cannot be wrong. That is, there is no wrong vertex.

Why It Doesn't Work for Negative-Weight Edges
- Dijkstra's algorithm adds vertices by increasing distance, so if a node with a negative incident edge were to be added late to the cloud, it could mess up distances for vertices already in the cloud.
- Example: C's true distance is 1, but it is already in the cloud with d(C) = 5!

Bellman-Ford Algorithm
- Works even with negative-weight edges
- Must assume directed edges (for otherwise we would have negative-weight cycles)
- Iteration i finds all shortest paths that use i edges
- Running time: O(nm)
- Can be extended to detect a negative-weight cycle if it exists. How?

Algorithm BellmanFord(G, s)
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
  for i ← 1 to n − 1 do
    for each e ∈ G.edges()
      { relax edge e }
      u ← G.origin(e)
      z ← G.opposite(u, e)
      r ← getDistance(u) + weight(e)
      if r < getDistance(z)
        setDistance(z, r)

Bellman-Ford Example
[Figures omitted: nodes are labeled with their d(v) values after each iteration]
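A Python sketch of Bellman-Ford on an edge list, with an extra pass answering the "How?" above: any edge that can still be relaxed after n−1 rounds witnesses a negative-weight cycle. Representation and names are illustrative.

# Sketch of Bellman-Ford; edges is a list of (u, z, w) directed edges.
def bellman_ford(vertices, edges, s):
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, z, w in edges:             # relax every edge
            if d[u] + w < d[z]:
                d[z] = d[u] + w
    for u, z, w in edges:                 # one more pass: any improvement
        if d[u] + w < d[z]:               # means a negative-weight cycle
            raise ValueError("negative-weight cycle")
    return d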

DAG-based Algorithm
- Works even with negative-weight edges
- Uses topological order
- Doesn't use any fancy data structures
- Is much faster than Dijkstra's algorithm
- Running time: O(n + m)

Algorithm DagDistances(G, s)
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
  perform a topological sort of the vertices
  for u ← 1 to n do  { in topological order }
    for each e ∈ G.outEdges(u)
      { relax edge e }
      z ← G.opposite(u, e)
      r ← getDistance(u) + weight(e)
      if r < getDistance(z)
        setDistance(z, r)

DAG Example
[Figures omitted: nodes are labeled with their d(v) values as the relaxations proceed in topological order]
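A Python sketch of DagDistances, reusing a topological order such as the one computed by the sketch earlier in this section; the graph maps each vertex to (neighbor, weight) pairs, and names are illustrative.

# Sketch of shortest paths in a DAG: relax out-edges in topological order.
def dag_distances(graph, order, s):
    d = {v: float("inf") for v in graph}
    d[s] = 0
    for u in order:                       # vertices in topological order
        for z, w in graph[u]:
            if d[u] + w < d[z]:           # relax edge (u, z)
                d[z] = d[u] + w
    return d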

All-Pairs Shortest Paths
- Find the distance between every pair of vertices in a weighted directed graph G
- We can make n calls to Dijkstra's algorithm (if no negative edges), which takes O(nm log n) time
- Likewise, n calls to Bellman-Ford would take O(n²m) time
- We can achieve O(n³) time using dynamic programming (similar to the Floyd-Warshall algorithm): Dk[i,j] uses only vertices numbered 1, …, k as intermediate vertices, combining paths through k that each use only vertices numbered 1, …, k − 1

Algorithm AllPair(G)  { assumes vertices 1, …, n }
  for all vertex pairs (i, j)
    if i = j
      D0[i,i] ← 0
    else if (i, j) is an edge in G
      D0[i,j] ← weight of edge (i, j)
    else
      D0[i,j] ← +∞
  for k ← 1 to n do
    for i ← 1 to n do
      for j ← 1 to n do
        Dk[i,j] ← min{Dk−1[i,j], Dk−1[i,k] + Dk−1[k,j]}
  return Dn

Minimum Spanning Trees

Outline and Reading
- Minimum Spanning Trees (7.3)
  - Definitions
  - A crucial fact
- The Prim-Jarnik Algorithm (7.3.2)
- Kruskal's Algorithm (7.3.1)
- Baruvka's Algorithm (7.3.3)

Minimum Spanning Tree
- Spanning subgraph: subgraph of a graph G containing all the vertices of G
- Spanning tree: spanning subgraph that is itself a (free) tree
- Minimum spanning tree (MST): spanning tree of a weighted graph with minimum total edge weight
- Applications: communications networks, transportation networks

Cycle Property
- Let T be a minimum spanning tree of a weighted graph G
- Let e be an edge of G that is not in T and let C be the cycle formed by e with T
- For every edge f of C, weight(f) ≤ weight(e)
- Proof: By contradiction. If weight(f) > weight(e), we can get a spanning tree of smaller weight by replacing f with e
[Figure omitted: replacing f with e yields a better spanning tree]

Partition Property
- Consider a partition of the vertices of G into subsets U and V
- Let e be an edge of minimum weight across the partition
- There is a minimum spanning tree of G containing edge e
- Proof: Let T be an MST of G. If T does not contain e, consider the cycle C formed by e with T and let f be an edge of C across the partition. By the cycle property, weight(f) ≤ weight(e). Thus, weight(f) = weight(e), and we obtain another MST by replacing f with e
[Figure omitted: replacing f with e yields another MST]

Prim-Jarnik's Algorithm
- Similar to Dijkstra's algorithm (for a connected graph)
- We pick an arbitrary vertex s and we grow the MST as a cloud of vertices, starting from s
- We store with each vertex v a label d(v) = the smallest weight of an edge connecting v to a vertex in the cloud
- At each step:
  - We add to the cloud the vertex u outside the cloud with the smallest distance label
  - We update the labels of the vertices adjacent to u

Prim-Jarnik's Algorithm (cont.)
- A priority queue stores the vertices outside the cloud (key: distance; element: vertex)
- Locator-based methods: insert(k,e) returns a locator; replaceKey(l,k) changes the key of an item
- We store three labels with each vertex: distance, parent edge in MST, locator in priority queue

Algorithm PrimJarnikMST(G)
  Q ← new heap-based priority queue
  s ← a vertex of G
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
    setParent(v, ∅)
    l ← Q.insert(getDistance(v), v)
    setLocator(v, l)
  while ¬Q.isEmpty()
    u ← Q.removeMin()
    for all e ∈ G.incidentEdges(u)
      z ← G.opposite(u, e)
      r ← weight(e)
      if r < getDistance(z)
        setDistance(z, r)
        setParent(z, e)
        Q.replaceKey(getLocator(z), r)

Example
[Figures omitted: snapshots of the cloud growing on a sample weighted graph]

Analysis
- Graph operations: method incidentEdges is called once for each vertex
- Label operations: we set/get the distance, parent and locator labels of vertex z O(deg(z)) times; setting/getting a label takes O(1) time
- Priority queue operations: each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time; the key of a vertex w in the priority queue is modified at most deg(w) times, where each key change takes O(log n) time
- Prim-Jarnik's algorithm runs in O((n + m) log n) time provided the graph is represented by the adjacency list structure
- Recall that Σv deg(v) = 2m
- The running time is O(m log n) since the graph is connected
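A Python sketch of Prim-Jarnik with heapq and lazy deletion in place of locators; the (neighbor, weight) adjacency dict and names are illustrative.

import heapq

# Sketch of Prim-Jarnik: grow the cloud from an arbitrary start vertex.
def prim_jarnik_mst(graph):
    s = next(iter(graph))                 # an arbitrary start vertex
    in_cloud = set()
    pq = [(0, s, None)]                   # (distance label, vertex, parent edge)
    mst = []
    while pq:
        d, u, parent = heapq.heappop(pq)
        if u in in_cloud:
            continue                      # stale entry; skip
        in_cloud.add(u)
        if parent is not None:
            mst.append(parent)            # parent edge joins the MST
        for z, w in graph[u]:
            if z not in in_cloud:
                heapq.heappush(pq, (w, z, (u, z, w)))
    return mst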

Kruskal's Algorithm

Data Structure for Kruskal's Algorithm
- The algorithm maintains a forest of trees
- An edge is accepted if it connects distinct trees
- We need a data structure that maintains a partition, i.e., a collection of disjoint sets, with the operations:
  - find(u): return the set storing u
  - union(u,v): replace the sets storing u and v with their union

Algorithm KruskalMST(G)
  for each vertex v in G do
    define a Cloud(v) of {v}
  let Q be a priority queue
  insert all edges into Q using their weights as the key
  T ← ∅
  while T has fewer than n − 1 edges do
    edge e ← Q.removeMin()
    let u, v be the endpoints of e
    if Cloud(v) ≠ Cloud(u) then
      add edge e to T
      merge Cloud(v) and Cloud(u)
  return T

Representation of a Partition
- Each set is stored in a sequence
- Each element has a reference back to the set
  - operation find(u) takes O(1) time, and returns the set of which u is a member
  - in operation union(u,v), we move the elements of the smaller set to the sequence of the larger set and update their references
  - the time for operation union(u,v) is min(nu, nv), where nu and nv are the sizes of the sets storing u and v
- Whenever an element is processed, it goes into a set of size at least double, hence each element is processed at most log n times

Partition-Based Implementation
- A partition-based version of Kruskal's Algorithm performs cloud merges as unions and tests as finds

Algorithm Kruskal(G):
  Input: A weighted graph G
  Output: An MST T for G
  Let P be a partition of the vertices of G, where each vertex forms a separate set
  Let Q be a priority queue storing the edges of G, sorted by their weights
  Let T be an initially-empty tree
  while Q is not empty do
    (u,v) ← Q.removeMinElement()
    if P.find(u) ≠ P.find(v) then
      Add (u,v) to T
      P.union(u,v)
  return T
  { Running time: O((n + m) log n) }
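A Python sketch of the partition-based version, with a classic union-find using union by size, matching the "move the smaller set" analysis above; sorting plays the role of the priority queue. Names are illustrative.

# Sketch of Kruskal's algorithm; edges is a list of (w, u, v) tuples.
def kruskal_mst(vertices, edges):
    parent = {v: v for v in vertices}
    size = {v: 1 for v in vertices}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    def union(u, v):
        ru, rv = find(u), find(v)
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru                   # smaller tree under the larger
        size[ru] += size[rv]

    mst = []
    for w, u, v in sorted(edges):         # by increasing weight
        if find(u) != find(v):            # distinct clouds: accept the edge
            mst.append((u, v, w))
            union(u, v)
    return mst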

Kruskal Example
[Figures omitted: successive snapshots of Kruskal's algorithm on a flight network with airports BOS, PVD, JFK, BWI, ORD, SFO, LAX, DFW, MIA and mileage edge weights, accepting edges in increasing weight order]

Baruvka's Algorithm
- Like Kruskal's Algorithm, Baruvka's algorithm grows many clouds at once

Algorithm BaruvkaMST(G)
  T ← V  { just the vertices of G }
  while T has fewer than n − 1 edges do
    for each connected component C in T do
      let edge e be the smallest-weight edge from C to another component in T
      if e is not already in T then
        add edge e to T
  return T

- Each iteration of the while-loop halves the number of connected components in T
- The running time is O(m log n)

Baruvka Example
[Figures omitted: snapshots of Baruvka's algorithm on the same flight network]

Maximum Flow

Outline and Reading
- Flow networks
  - Flow (8.1.1)
  - Cut (8.1.2)
- Maximum flow
  - Augmenting path (8.2.1)
  - Maximum flow and minimum cut (8.2.1)
  - Ford-Fulkerson's algorithm (8.2.2-8.2.3)

Flow Network
- A flow network (or just network) N consists of:
  - A weighted digraph G with nonnegative integer edge weights, where the weight of an edge e is called the capacity c(e) of e
  - Two distinguished vertices, s and t of G, called the source and sink, respectively, such that s has no incoming edges and t has no outgoing edges
[Figure omitted: a sample network with edge capacities]

Flow
- A flow f for a network N is an assignment of an integer value f(e) to each edge e that satisfies the following properties:
  - Capacity Rule: for each edge e, 0 ≤ f(e) ≤ c(e)
  - Conservation Rule: for each vertex v ≠ s,t
    Σ_{e ∈ E−(v)} f(e) = Σ_{e ∈ E+(v)} f(e)
    where E−(v) and E+(v) are the incoming and outgoing edges of v, respectively
- The value of a flow f, denoted |f|, is the total flow from the source, which is the same as the total flow into the sink
- Example: on the same network (edges labeled f(e)/c(e)), a flow of value 8 = 2 + 3 + 3 = 1 + 3 + 4, and a maximum flow of value 10 = 4 + 3 + 3 = 3 + 3 + 4

Maximum Flow
- A flow for a network N is said to be maximum if its value is the largest of all flows for N
- The maximum flow problem consists of finding a maximum flow for a given network N
- Applications: hydraulic systems, electrical circuits, traffic movements, freight transportation

Cut
- A cut of a network N with source s and sink t is a partition χ = (Vs,Vt) of the vertices of N such that s ∈ Vs and t ∈ Vt
  - Forward edge of cut χ: origin in Vs and destination in Vt
  - Backward edge of cut χ: origin in Vt and destination in Vs
- Flow f(χ) across a cut χ: total flow of forward edges minus total flow of backward edges
- Capacity c(χ) of a cut χ: total capacity of forward edges
- Example: a cut with c(χ) = 24 and f(χ) = 8

Flow and Cut
- Lemma: The flow f(χ) across any cut χ is equal to the flow value |f|
- Lemma: The flow f(χ) across a cut χ is less than or equal to the capacity c(χ) of the cut
- Theorem: The value of any flow is less than or equal to the capacity of any cut, i.e., for any flow f and any cut χ, we have |f| ≤ c(χ)

Augmenting Path
- Consider a flow f for a network N and let e be an edge from u to v:
  - Residual capacity of e from u to v: Δf(u, v) = c(e) − f(e)
  - Residual capacity of e from v to u: Δf(v, u) = f(e)
- Let π be a path from s to t; the residual capacity Δf(π) of π is the smallest of the residual capacities of the edges of π in the direction from s to t
- A path π from s to t is an augmenting path if Δf(π) > 0

Flow Augmentation
- Lemma: Let π be an augmenting path for flow f in network N. There exists a flow f′ for N of value |f′| = |f| + Δf(π)
- Proof: We compute flow f′ by modifying the flow on the edges of π:
  - Forward edge: f′(e) = f(e) + Δf(π)
  - Backward edge: f′(e) = f(e) − Δf(π)
- Example: augmenting along a path with Δf(π) = 1 increases |f| from 7 to 8

Ford-Fulkerson's Algorithm
- Initially, f(e) = 0 for each edge e
- Repeatedly:
  - Search for an augmenting path π
  - Augment by Δf(π) the flow along the edges of π
- A specialization of DFS (or BFS) searches for an augmenting path: an edge e is traversed from u to v provided Δf(u, v) > 0

Algorithm FordFulkersonMaxFlow(N)
  for all e ∈ G.edges()
    setFlow(e, 0)
  while G has an augmenting path π
    { compute residual capacity Δ of π }
    Δ ← +∞
    for all edges e ∈ π
      { compute residual capacity δ of e }
      if e is a forward edge of π
        δ ← getCapacity(e) − getFlow(e)
      else  { e is a backward edge }
        δ ← getFlow(e)
      if δ < Δ
        Δ ← δ
    { augment flow along π }
    for all edges e ∈ π
      if e is a forward edge of π
        setFlow(e, getFlow(e) + Δ)
      else  { e is a backward edge }
        setFlow(e, getFlow(e) − Δ)

Max-Flow and Min-Cut
- Termination of Ford-Fulkerson's algorithm: there is no augmenting path from s to t with respect to the current flow f
- Define Vs: the set of vertices reachable from s by augmenting paths; Vt: the set of remaining vertices
- Cut χ = (Vs,Vt) has capacity c(χ) = |f|: for each forward edge, f(e) = c(e); for each backward edge, f(e) = 0
- Thus, flow f has maximum value and cut χ has minimum capacity
- Theorem: The value of a maximum flow is equal to the capacity of a minimum cut
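A Python sketch of Ford-Fulkerson with a BFS search for augmenting paths (one of the two searches mentioned above); flows are kept implicitly as residual capacities. The nested-dict capacity representation and names are illustrative.

from collections import deque

# Sketch of Ford-Fulkerson; cap[u][v] is the capacity of edge (u, v).
def max_flow(cap, s, t):
    # residual[u][v]: remaining capacity from u to v (forward + backward)
    residual = {u: dict(vs) for u, vs in cap.items()}
    for u in cap:
        for v in cap[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        pred = {s: None}                  # BFS for an augmenting path
        q = deque([s])
        while q and t not in pred:
            u = q.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in pred:
                    pred[v] = u
                    q.append(v)
        if t not in pred:
            return total                  # no augmenting path: flow is maximum
        path, v = [], t                   # recover the path s -> t
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        delta = min(residual[u][v] for u, v in path)
        for u, v in path:                 # augment along the path
            residual[u][v] -= delta
            residual[v][u] += delta
        total += delta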

Example (1)
[Figures omitted: a run of Ford-Fulkerson on a sample network, edges labeled f(e)/c(e)]
- One augmenting path π has f(s,u) = 3, f(u,w) = 1, f(w,v) = 1, f(v,t) = 2, with Δf(π) = 1
- Two cuts shown have c(χ1) = 12 = 6 + 3 + 1 + 2 and c(χ2) = 21 = 3 + 7 + 9 + 2, while |f| = 8
- At termination, the cut χ of vertices reachable by augmenting paths has c(χ) = |f| = 10

Example (2)
[Figures omitted: two augmentation steps on a sample network, edges labeled f(e)/c(e)]

Analysis
- In the worst case, Ford-Fulkerson's algorithm performs |f*| flow augmentations, where f* is a maximum flow
- Example: on a network with four edges of capacity 50 joined by a cross edge of capacity 1, the augmenting paths found alternate between π1 and π2, and the algorithm performs 100 augmentations
- Finding an augmenting path and augmenting the flow takes O(n + m) time
- The running time of Ford-Fulkerson's algorithm is O(|f*|(n + m))

Pattern Matching

Outline and Reading
- Strings (9.1.1)
- Pattern matching algorithms
  - Brute-force algorithm (9.1.2)
  - Boyer-Moore algorithm (9.1.3)
  - Knuth-Morris-Pratt algorithm (9.1.4)

Strings
- A string is a sequence of characters
- Examples of strings: Java program, HTML document, DNA sequence, digitized image
- An alphabet Σ is the set of possible characters for a family of strings
- Examples of alphabets: ASCII, Unicode, {0, 1}, {A, C, G, T}
- Let P be a string of size m
  - A substring P[i .. j] of P is the subsequence of P consisting of the characters with ranks between i and j
  - A prefix of P is a substring of the type P[0 .. i]
  - A suffix of P is a substring of the type P[i .. m − 1]
- Given strings T (text) and P (pattern), the pattern matching problem consists of finding a substring of T equal to P
- Applications: text editors, search engines, biological research

Example
[Figure omitted: the brute-force algorithm aligning the pattern "rithm" against the text "a pattern matching algorithm", showing the shifts tried before the match]

Brute-Force Algorithm
- The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either a match is found, or all placements of the pattern have been tried
- Brute-force pattern matching runs in time O(nm)
- Example of worst case: T = aaa…ah, P = aaah; may occur in images and DNA sequences, unlikely in English text

Algorithm BruteForceMatch(T, P)
  Input text T of size n and pattern P of size m
  Output starting index of a substring of T equal to P, or −1 if no such substring exists
  for i ← 0 to n − m
    { test shift i of the pattern }
    j ← 0
    while j < m ∧ T[i + j] = P[j]
      j ← j + 1
    if j = m
      return i  { match at i }
    else
      break while loop  { mismatch }
  return −1  { no match anywhere }

Boyer-Moore Heuristics
- The Boyer-Moore pattern matching algorithm is based on two heuristics:
  - Looking-glass heuristic: compare P with a subsequence of T moving backwards
  - Character-jump heuristic: when a mismatch occurs at T[i] = c: if P contains c, shift P to align the last occurrence of c in P with T[i]; else, shift P to align P[0] with T[i + 1]

Last-Occurrence Function
- Boyer-Moore's algorithm preprocesses the pattern P and the alphabet Σ to build the last-occurrence function L mapping Σ to integers, where L(c) is defined as the largest index i such that P[i] = c, or −1 if no such index exists
- Example: Σ = {a, b, c, d}, P = abacab

    c     a  b  c  d
    L(c)  4  5  3  −1

- The last-occurrence function can be represented by an array indexed by the numeric codes of the characters
- The last-occurrence function can be computed in time O(m + s), where m is the size of P and s is the size of Σ

The Boyer-Moore Algorithm

Algorithm BoyerMooreMatch(T, P, Σ)
  L ← lastOccurenceFunction(P, Σ)
  i ← m − 1
  j ← m − 1
  repeat
    if T[i] = P[j]
      if j = 0
        return i  { match at i }
      else
        i ← i − 1
        j ← j − 1
    else
      { character-jump }
      l ← L[T[i]]
      i ← i + m − min(j, 1 + l)
      j ← m − 1
  until i > n − 1
  return −1  { no match }
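A Python sketch transcribing this pseudocode, with a dict standing in for the last-occurrence array (later occurrences simply overwrite earlier ones); names are illustrative, and the pattern is assumed nonempty.

# Sketch of Boyer-Moore with the character-jump heuristic only.
def last_occurrence(P):
    return {c: i for i, c in enumerate(P)}   # L(c) = largest index of c in P

def boyer_moore(T, P):
    L = last_occurrence(P)
    n, m = len(T), len(P)
    i = j = m - 1
    while i <= n - 1:
        if T[i] == P[j]:
            if j == 0:
                return i                  # match at i
            i -= 1; j -= 1
        else:                             # character jump
            l = L.get(T[i], -1)
            i += m - min(j, 1 + l)
            j = m - 1
    return -1                             # no match

print(boyer_moore("a pattern matching algorithm", "rithm"))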

Example
[Figure omitted: a run of Boyer-Moore with backward comparisons and character jumps]
- Case 1: j ≤ 1 + l, so the pattern is shifted by m − j
- Case 2: 1 + l ≤ j, so the pattern is shifted by m − (1 + l)

Analysis
- Boyer-Moore's algorithm runs in time O(nm + s)
- Example of worst case: T = aaa…a, P = baaa
- The worst case may occur in images and DNA sequences but is unlikely in English text
- Boyer-Moore's algorithm is significantly faster than the brute-force algorithm on English text

The KMP Algorithm - Motivation
- Knuth-Morris-Pratt's algorithm compares the pattern to the text left-to-right, but shifts the pattern more intelligently than the brute-force algorithm
- When a mismatch occurs, what is the most we can shift the pattern so as to avoid redundant comparisons?
- Answer: the largest prefix of P[0..j] that is a suffix of P[1..j]
[Figure omitted: after a mismatch of P = abaaba against text …abaabx…, there is no need to repeat the earlier comparisons; we resume comparing at the mismatch point]

KMP Failure Function
- Knuth-Morris-Pratt's algorithm preprocesses the pattern to find matches of prefixes of the pattern with the pattern itself
- The failure function F(j) is defined as the size of the largest prefix of P[0..j] that is also a suffix of P[1..j]
- Knuth-Morris-Pratt's algorithm modifies the brute-force algorithm so that if a mismatch occurs at P[j] ≠ T[i] we set j ← F(j − 1)
- Example: for P = abaaba, F(j) = 0, 0, 1, 1, 2, 3 for j = 0, …, 5

The KMP Algorithm
- The failure function can be represented by an array and can be computed in O(m) time
- At each iteration of the while-loop, either i increases by one, or the shift amount i − j increases by at least one (observe that F(j − 1) < j)
- Hence, there are no more than 2n iterations of the while-loop
- Thus, KMP's algorithm runs in optimal time O(m + n)

Algorithm KMPMatch(T, P)
  F ← failureFunction(P)
  i ← 0
  j ← 0
  while i < n
    if T[i] = P[j]
      if j = m − 1
        return i − j  { match }
      else
        i ← i + 1
        j ← j + 1
    else
      if j > 0
        j ← F[j − 1]
      else
        i ← i + 1
  return −1  { no match }
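A Python sketch of the matcher, transcribing the pseudocode directly (the failure-function construction appears in the pseudocode that follows); names are illustrative and the pattern is assumed nonempty.

# Sketch of KMP: failure function plus the matcher.
def failure_function(P):
    m = len(P)
    F = [0] * m
    i, j = 1, 0
    while i < m:
        if P[i] == P[j]:                  # we have matched j + 1 chars
            F[i] = j + 1
            i += 1; j += 1
        elif j > 0:
            j = F[j - 1]                  # use failure function to shift P
        else:
            F[i] = 0                      # no match
            i += 1
    return F

def kmp_match(T, P):
    n, m = len(T), len(P)
    F = failure_function(P)
    i = j = 0
    while i < n:
        if T[i] == P[j]:
            if j == m - 1:
                return i - j              # match
            i += 1; j += 1
        elif j > 0:
            j = F[j - 1]
        else:
            i += 1
    return -1                             # no match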

Computing the Failure Function
- The failure function can be represented by an array and can be computed in O(m) time
- The construction is similar to the KMP algorithm itself
- At each iteration of the while-loop, either i increases by one, or the shift amount i − j increases by at least one (observe that F(j − 1) < j)
- Hence, there are no more than 2m iterations of the while-loop

Algorithm failureFunction(P)
  F[0] ← 0
  i ← 1
  j ← 0
  while i < m
    if P[i] = P[j]
      { we have matched j + 1 chars }
      F[i] ← j + 1
      i ← i + 1
      j ← j + 1
    else if j > 0 then
      { use failure function to shift P }
      j ← F[j − 1]
    else
      F[i] ← 0  { no match }
      i ← i + 1

Example
[Figure omitted: a run of KMP matching P = abacab against T = abacaabaccabacabaabb, showing the comparisons performed]

Tries

Outline and Reading
- Standard tries (9.2.1)
- Compressed tries (9.2.2)
- Suffix tries (9.2.3)
- Huffman encoding tries (9.3.1)

Preprocessing Strings
- Preprocessing the pattern speeds up pattern matching queries: after preprocessing the pattern, KMP's algorithm performs pattern matching in time proportional to the text size
- If the text is large, immutable and searched for often (e.g., works by Shakespeare), we may want to preprocess the text instead of the pattern
- A trie is a compact data structure for representing a set of strings, such as all the words in a text
- A trie supports pattern matching queries in time proportional to the pattern size

Standard Trie (1)
- The standard trie for a set of strings S is an ordered tree such that:
  - Each node but the root is labeled with a character
  - The children of a node are alphabetically ordered
  - The paths from the external nodes to the root yield the strings of S
- Example: standard trie for the set of strings S = { bear, bell, bid, bull, buy, sell, stock, stop }
[Figure omitted: the trie with branches b-e-a-r, b-e-l-l, b-i-d, b-u-l-l, b-u-y, s-e-l-l, s-t-o-c-k, s-t-o-p]

Standard Trie (2)
- A standard trie uses O(n) space and supports searches, insertions and deletions in time O(dm), where:
  - n: total size of the strings in S
  - m: size of the string parameter of the operation
  - d: size of the alphabet
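A minimal Python sketch of a standard trie over nested dicts; the "$" end-of-word marker is our own convention, not from the text.

# Sketch of a standard trie: one dict node per character.
END = "$"

def trie_insert(root, word):
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True

def trie_contains(root, word):
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return END in node

root = {}
for w in ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]:
    trie_insert(root, w)
print(trie_contains(root, "bull"), trie_contains(root, "bean"))  # True False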

Word Matching with a Trie
- We insert the words of the text into a trie
- Each leaf stores the occurrences of the associated word in the text
- Example text: "see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!"
[Figure omitted: the word-occurrence trie; e.g., the leaf for "see" stores occurrences 0 and 24, "stock" stores 17, 40, 51, 62, and "bid" stores 47, 58]

Compressed Trie
- A compressed trie has internal nodes of degree at least two
- It is obtained from a standard trie by compressing chains of redundant nodes
[Figure omitted: the compressed trie for the same set S, with multi-character edge labels such as "ar", "id", "ell", "to", "ck"]

Compact Representation
- Compact representation of a compressed trie for an array of strings:
  - Stores at the nodes ranges of indices instead of substrings
  - Uses O(s) space, where s is the number of strings in the array
  - Serves as an auxiliary index structure
- Example array: S[0] = see, S[1] = bear, S[2] = sell, S[3] = stock, S[4] = bull, S[5] = buy, S[6] = bid, S[7] = hear, S[8] = bell, S[9] = stop
[Figure omitted: nodes labeled with triples (i, j, k) denoting a substring of S[i] by its index range]

Suffix Trie (1)
- The suffix trie of a string X is the compressed trie of all the suffixes of X
- Example: X = minimize, with character ranks 0 through 7

Suffix Trie (2)
- Compact representation of the suffix trie for a string X of size n from an alphabet of size d:
  - Uses O(n) space
  - Supports arbitrary pattern matching queries in X in O(dm) time, where m is the size of the pattern
[Figure omitted: nodes labeled with index ranges into X, e.g., (7, 7), (4, 7), (2, 7), (0, 1), (6, 7)]

Encoding Trie (1)
- A code is a mapping of each character of an alphabet to a binary code-word
- A prefix code is a binary code such that no code-word is the prefix of another code-word
- An encoding trie represents a prefix code:
  - Each leaf stores a character
  - The code word of a character is given by the path from the root to the leaf storing the character (0 for a left child and 1 for a right child)
[Figure omitted: an encoding trie with code words 00, 010, 011, 10, 11]

Encoding Trie (2)
- Given a text string X, we want to find a prefix code for the characters of X that yields a small encoding for X
  - Frequent characters should have short code-words
  - Rare characters should have long code-words
- Example: X = abracadabra; trie T1 encodes X into 29 bits, trie T2 into 24 bits

Huffman's Algorithm
- Given a string X, Huffman's algorithm constructs a prefix code that minimizes the size of the encoding of X
- It runs in time O(n + d log d), where n is the size of X and d is the number of distinct characters of X
- A heap-based priority queue is used as an auxiliary structure

Algorithm HuffmanEncoding(X)
  Input string X of size n
  Output optimal encoding trie for X
  C ← distinctCharacters(X)
  computeFrequencies(C, X)
  Q ← new empty heap
  for all c ∈ C
    T ← new single-node tree storing c
    Q.insert(getFrequency(c), T)
  while Q.size() > 1
    f1 ← Q.minKey()
    T1 ← Q.removeMin()
    f2 ← Q.minKey()
    T2 ← Q.removeMin()
    T ← join(T1, T2)
    Q.insert(f1 + f2, T)
  return Q.removeMin()

Example
- X = abracadabra, with frequencies a: 5, b: 2, c: 1, d: 1, r: 2
[Figure omitted: the successive merges of the Huffman construction]
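A Python sketch of the same construction with heapq; trees are tuples, a 1-tuple for a leaf and a 2-tuple for a join, and a counter breaks frequency ties so tuples are never compared. Names are illustrative.

import heapq

# Sketch of Huffman's algorithm plus code-word extraction.
def huffman(X):
    freq = {}
    for c in X:
        freq[c] = freq.get(c, 0) + 1
    Q = [(f, i, (c,)) for i, (c, f) in enumerate(freq.items())]
    heapq.heapify(Q)
    i = len(Q)
    while len(Q) > 1:
        f1, _, T1 = heapq.heappop(Q)
        f2, _, T2 = heapq.heappop(Q)
        heapq.heappush(Q, (f1 + f2, i, (T1, T2)))   # join(T1, T2)
        i += 1
    return Q[0][2]                        # the optimal encoding trie

def code_words(tree, prefix=""):
    if len(tree) == 1:                    # leaf stores a character
        return {tree[0]: prefix or "0"}
    codes = {}
    codes.update(code_words(tree[0], prefix + "0"))
    codes.update(code_words(tree[1], prefix + "1"))
    return codes

print(code_words(huffman("abracadabra")))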

Numerical Algorithms

Outline
- Divisibility and primes
- Modular arithmetic
- Euclid's GCD algorithm
- Multiplicative inverses
- Powers
- Fermat's little theorem
- Euler's theorem

Facts About Numbers
- Prime number p: p is an integer, p ≥ 2, and the only divisors of p are 1 and p
- Examples: 2, 7, 19 are primes; −3, 1, 6 are not primes
- Prime decomposition of a positive integer n: n = p1^e1 · … · pk^ek
- Example: 200 = 2³ · 5²
- Fundamental Theorem of Arithmetic: the prime decomposition of a positive integer is unique

Greatest Common Divisor
- The greatest common divisor (GCD) of two positive integers a and b, denoted gcd(a, b), is the largest positive integer that divides both a and b
- The above definition is extended to arbitrary integers
- Examples: gcd(18, 30) = 6; gcd(21, 49) = 7; gcd(0, 20) = 20
- Two integers a and b are said to be relatively prime if gcd(a, b) = 1
- Example: integers 15 and 28 are relatively prime

Modular Arithmetic
- Modulo operator for a positive integer n: r = a mod n is equivalent to a = r + kn, and r = a − ⌊a/n⌋·n
- Examples:
  - 29 mod 13 = 3, since 29 = 3 + 2·13
  - 13 mod 13 = 0, since 13 = 0 + 1·13
  - −1 mod 13 = 12, since −1 = 12 + (−1)·13
- Modulo and GCD: gcd(a, b) = gcd(b, a mod b)
- Example: gcd(21, 12) = gcd(12, 21 mod 12) = gcd(12, 9) = 3

Euclid's GCD Algorithm
- Euclid's algorithm for computing the GCD repeatedly applies the formula gcd(a, b) = gcd(b, a mod b)

Algorithm EuclidGCD(a, b)
  Input integers a and b
  Output gcd(a, b)
  if b = 0
    return a
  else
    return EuclidGCD(b, a mod b)

- Example: gcd(412, 260) = 4

    a  412  260  152  108   44   20    4
    b  260  152  108   44   20    4    0
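A Python sketch of EuclidGCD, plus the extended variant that yields the multiplicative inverses discussed next (the "variation of Euclid's GCD algorithm" mentioned below); names are illustrative.

# Sketch of Euclid's algorithm and the extended version.
def euclid_gcd(a, b):
    return a if b == 0 else euclid_gcd(b, a % b)

def extended_gcd(a, b):                   # returns (g, x, y) with a*x + b*y = g
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(x, n):
    g, u, _ = extended_gcd(x, n)
    if g != 1:
        return None                       # inverse exists iff gcd(x, n) = 1
    return u % n

print(euclid_gcd(412, 260))               # 4
print(mod_inverse(9, 11))                 # 5, since 9 * 5 mod 11 = 1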

Analysis
- Let ai and bi be the arguments of the i-th recursive call of algorithm EuclidGCD
- We have ai+2 = bi+1 = ai mod ai+1 < ai+1
- Sequence a1, a2, …, an decreases exponentially, namely ai+2 ≤ ai/2 for i > 1:
  - Case 1: ai+1 ≤ ai/2, so ai+2 < ai+1 ≤ ai/2
  - Case 2: ai+1 > ai/2, so ai+2 = ai mod ai+1 = ai − ai+1 ≤ ai/2
- Thus, the maximum number of recursive calls of algorithm EuclidGCD(a, b) is 1 + 2 log max(a, b)
- Algorithm EuclidGCD(a, b) executes O(log max(a, b)) arithmetic operations

Multiplicative Inverses (1)
- The residues modulo a positive integer n are the set Zn = {0, 1, 2, …, (n − 1)}
- Let x and y be two elements of Zn such that x·y mod n = 1; we say that y is the multiplicative inverse of x in Zn, and we write y = x⁻¹
- Example: the elements of Z10 with a multiplicative inverse are 1, 3, 7, 9

Multiplicative Inverses (2)
- Theorem: An element x of Zn has a multiplicative inverse if and only if x and n are relatively prime
- Example: multiplicative inverses of the residues modulo 11

    x    1  2  3  4  5  6  7  8  9  10
    x⁻¹  1  6  4  3  9  2  8  7  5  10

- Corollary: If p is prime, every nonzero residue in Zp has a multiplicative inverse
- Theorem: A variation of Euclid's GCD algorithm computes the multiplicative inverse of an element x of Zn or determines that it does not exist

Fermat's Little Theorem
- Theorem: Let p be a prime. For each nonzero residue x of Zp, we have x^(p−1) mod p = 1
- Example (p = 5):
  - 1⁴ mod 5 = 1
  - 2⁴ mod 5 = 16 mod 5 = 1
  - 3⁴ mod 5 = 81 mod 5 = 1
  - 4⁴ mod 5 = 256 mod 5 = 1
- Corollary: Let p be a prime. For each nonzero residue x of Zp, the multiplicative inverse of x is x^(p−2) mod p
- Proof: x·(x^(p−2) mod p) mod p = x·x^(p−2) mod p = x^(p−1) mod p = 1

Powers
- Let p be a prime. The sequences of successive powers of the elements of Zp exhibit repeating subsequences
- The sizes of the repeating subsequences and the number of their repetitions are the divisors of p − 1
[Table omitted: the powers x, x², …, x⁶ of the elements of Z7]

Euler's Theorem
- The multiplicative group for Zn, denoted with Z*n, is the subset of elements of Zn relatively prime with n
- The totient function of n, denoted with φ(n), is the size of Z*n
- Example: Z*10 = { 1, 3, 7, 9 }, φ(10) = 4
- If p is prime, we have Z*p = {1, 2, …, (p − 1)} and φ(p) = p − 1
- Theorem: For each element x of Z*n, we have x^φ(n) mod n = 1
- Example (n = 10):
  - 3^φ(10) mod 10 = 3⁴ mod 10 = 81 mod 10 = 1
  - 7^φ(10) mod 10 = 7⁴ mod 10 = 2401 mod 10 = 1
  - 9^φ(10) mod 10 = 9⁴ mod 10 = 6561 mod 10 = 1

The Fast Fourier Transform

Outline and Reading
- Polynomial Multiplication Problem
- Primitive Roots of Unity (10.4.1)
- The Discrete Fourier Transform (10.4.2)
- The FFT Algorithm (10.4.3)
- Integer Multiplication (10.4.4)
- Java FFT Integer Multiplication (10.5)

Polynomials
- Polynomial: p(x) = 5 + 2x + 8x² + 3x³ + 4x⁴
- In general, p(x) = Σ_{i=0}^{n−1} a_i x^i, or p(x) = a0 + a1·x + a2·x² + … + a_{n−1}·x^{n−1}

Polynomial Evaluation
- Horner's Rule: given coefficients (a0, a1, a2, …, an−1), defining the polynomial p(x) = Σ_{i=0}^{n−1} a_i x^i, and given x, we can evaluate p(x) in O(n) time using the equation
  p(x) = a0 + x·(a1 + x·(a2 + … + x·(an−2 + x·an−1)…))
- Eval(A, x):  [where A = (a0, a1, a2, …, an−1)]
  - If n = 1, then return a0
  - Else, let A′ = (a1, a2, …, an−1)  [assume this can be done in constant time]
  - return a0 + x·Eval(A′, x)
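An iterative Python sketch of Horner's rule (equivalent to the recursive Eval above); names are illustrative.

# Sketch of Horner's rule: evaluate p(x) from the coefficient list A in O(n).
def horner_eval(A, x):
    result = 0
    for a in reversed(A):                 # start from a_{n-1}
        result = a + x * result
    return result

print(horner_eval([5, 2, 8, 3, 4], 2))    # p(2) = 5 + 4 + 32 + 24 + 64 = 129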

Polynomial Multiplication Problem
- Given coefficients (a0, a1, a2, …, an−1) and (b0, b1, b2, …, bn−1) defining two polynomials, p() and q(), and number x, compute p(x)·q(x)
- Horner's rule doesn't help, since p(x)·q(x) = Σ_{i=0}^{2n−2} c_i x^i, where c_i = Σ_{j=0}^{i} a_j b_{i−j}
- A straightforward evaluation would take O(n²) time. The "magical" FFT will do it in O(n log n) time.

Polynomial Interpolation & Polynomial Multiplication
- Given a set of n points in the plane with distinct x-coordinates, there is exactly one (n−1)-degree polynomial going through all these points
- Alternate approach to computing p(x)·q(x):
  - Calculate p() on 2n x-values, x0, x1, …, x2n−1
  - Calculate q() on the same 2n x-values
  - Find the (2n−1)-degree polynomial that goes through the points {(x0, p(x0)q(x0)), (x1, p(x1)q(x1)), …, (x2n−1, p(x2n−1)q(x2n−1))}
- Unfortunately, a straightforward evaluation would still take O(n²) time, as we would need to apply an O(n)-time Horner's Rule evaluation to 2n different points
- The magical FFT will do it in O(n log n) time, by picking 2n points that are easy to evaluate

11/27/2002 1:42 AM
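For concreteness, here is the coefficient view of the product as plain Java: a direct O(n^2) computation of ci = Σ aj·b(i-j), the baseline that the FFT improves on.

    static long[] convolve(long[] a, long[] b) {
        long[] c = new long[a.length + b.length - 1];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b.length; j++)
                c[i + j] += a[i] * b[j];   // a_i x^i * b_j x^j contributes to c_{i+j}
        return c;
    }
    // convolve(new long[]{1, 1}, new long[]{1, 1}) -> [1, 2, 1],
    // i.e. (1 + x)^2 = 1 + 2x + x^2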

Primitive Roots of Unity

A number ω is a primitive n-th root of unity, for n > 1, if
  ω^n = 1, and
  the numbers 1, ω, ω^2, ..., ω^(n-1) are all distinct.
Example 1: powers of the elements of Z*11:

  x   x^2  x^3  x^4  x^5  x^6  x^7  x^8  x^9  x^10
  1    1    1    1    1    1    1    1    1    1
  2    4    8    5   10    9    7    3    6    1
  3    9    5    4    1    3    9    5    4    1
  4    5    9    3    1    4    5    9    3    1
  5    3    4    9    1    5    3    4    9    1
  6    3    7    9   10    5    8    4    2    1
  7    5    2    3   10    4    6    9    8    1
  8    9    6    4   10    3    2    5    7    1
  9    4    3    5    1    9    4    3    5    1
 10    1   10    1   10    1   10    1   10    1

  2, 6, 7, 8 are primitive 10-th roots of unity in Z*11.
  2^2 = 4, 6^2 = 3, 7^2 = 5, 8^2 = 9 are primitive 5-th roots of unity in Z*11.
  Inverses: 2^-1 = 6, 3^-1 = 4, 4^-1 = 3, 5^-1 = 9, 6^-1 = 2, 7^-1 = 8, 8^-1 = 7, 9^-1 = 5.
Example 2: The complex number e^(2πi/n) is a primitive n-th root of unity, where i = √-1.

Properties of Primitive Roots of Unity

Inverse Property: If ω is a primitive root of unity, then ω^-1 = ω^(n-1).
  Proof: ω^(n-1)·ω = ω^n = 1.
Cancellation Property: For non-zero -n < k < n,
  Σ_{j=0}^{n-1} ω^(kj) = 0.
  Proof: Σ_{j=0}^{n-1} (ω^k)^j = ((ω^k)^n - 1) / (ω^k - 1) = ((ω^n)^k - 1) / (ω^k - 1) = (1^k - 1) / (ω^k - 1) = 0.
Reduction Property: If ω is a primitive (2n)-th root of unity, then ω^2 is a primitive n-th root of unity.
  Proof: If 1, ω, ω^2, ..., ω^(2n-1) are all distinct, so are 1, ω^2, (ω^2)^2, ..., (ω^2)^(n-1).
Reflective Property: If n is even, then ω^(n/2) = -1.
  Proof: By the cancellation property, for k = n/2:
    0 = Σ_{j=0}^{n-1} ω^((n/2)j) = ω^0 + ω^(n/2) + ω^0 + ω^(n/2) + ... + ω^0 + ω^(n/2) = (n/2)(1 + ω^(n/2)).
  Corollary: ω^(k+n/2) = -ω^k.

The Discrete Fourier Transform

Given coefficients (a0, a1, a2, ..., an-1) for an (n-1)-degree polynomial p(x), and a primitive n-th root of unity ω, the Discrete Fourier Transform evaluates p at the values 1, ω, ω^2, ..., ω^(n-1):
  We produce (y0, y1, y2, ..., yn-1), where yj = p(ω^j).
  That is, yj = Σ_{i=0}^{n-1} ai ω^(ij).
  Matrix form: y = Fa, where F[i,j] = ω^(ij).
The Inverse Discrete Fourier Transform recovers the coefficients of an (n-1)-degree polynomial given its values at 1, ω, ω^2, ..., ω^(n-1):
  Matrix form: a = F^(-1) y, where F^(-1)[i,j] = ω^(-ij)/n.

Correctness of the Inverse DFT

The DFT and inverse DFT really are inverse operations.
Proof: Let A = F^(-1) F. We want to show that A = I, where
  A[i,j] = (1/n) Σ_{k=0}^{n-1} ω^(-ki) ω^(kj).
If i = j, then
  A[i,i] = (1/n) Σ_{k=0}^{n-1} ω^(-ki) ω^(ki) = (1/n) Σ_{k=0}^{n-1} ω^0 = (1/n)·n = 1.
If i and j are different, then
  A[i,j] = (1/n) Σ_{k=0}^{n-1} ω^((j-i)k) = 0   (by the Cancellation Property).

Convolution

The DFT and the inverse DFT can be used to multiply two polynomials, so we can get the coefficients of the product polynomial quickly if we can compute the DFT (and its inverse) quickly:

  [a0,a1,a2,...,an-1]                 [b0,b1,b2,...,bn-1]
     | pad with n 0's                    | pad with n 0's
  [a0,a1,a2,...,an-1,0,0,...,0]      [b0,b1,b2,...,bn-1,0,0,...,0]
     | DFT                               | DFT
  [y0,y1,y2,...,y2n-1]               [z0,z1,z2,...,z2n-1]
             \  component multiply  /
          [y0·z0, y1·z1, ..., y2n-1·z2n-1]
                   | inverse DFT
          [c0,c1,c2,...,c2n-1]   (the convolution)

The Fast Fourier Transform

The FFT is an efficient algorithm for computing the DFT.
The FFT is based on the divide-and-conquer paradigm: if n is even, we can divide a polynomial
  p(x) = a0 + a1 x + a2 x^2 + ... + an-1 x^(n-1)
into two polynomials
  p_even(x) = a0 + a2 x + a4 x^2 + ... + an-2 x^(n/2 - 1)
  p_odd(x) = a1 + a3 x + a5 x^2 + ... + an-1 x^(n/2 - 1)
and we can write
  p(x) = p_even(x^2) + x · p_odd(x^2).
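A minimal recursive sketch in Java, instantiated with the complex roots of unity from Example 2 (the book's version works in Z*p instead; the class name RecursiveFFT is illustrative). Arrays re/im hold real and imaginary parts, and n is assumed to be a power of 2. For the inverse DFT, negate the angle and divide every output by n at the top level.

    public class RecursiveFFT {
        public static void fft(double[] re, double[] im) {
            int n = re.length;
            if (n == 1) return;
            // split into the even- and odd-index subpolynomials
            double[] evenRe = new double[n / 2], evenIm = new double[n / 2];
            double[] oddRe = new double[n / 2], oddIm = new double[n / 2];
            for (int k = 0; k < n / 2; k++) {
                evenRe[k] = re[2 * k];     evenIm[k] = im[2 * k];
                oddRe[k] = re[2 * k + 1];  oddIm[k] = im[2 * k + 1];
            }
            fft(evenRe, evenIm);  // by the reduction property, omega^2 is a
            fft(oddRe, oddIm);    // primitive (n/2)-th root for the subproblems
            for (int k = 0; k < n / 2; k++) {
                double ang = 2 * Math.PI * k / n;    // omega^k = e^{2*pi*i*k/n}
                double wr = Math.cos(ang), wi = Math.sin(ang);
                double tr = wr * oddRe[k] - wi * oddIm[k];  // omega^k * odd[k]
                double ti = wr * oddIm[k] + wi * oddRe[k];
                re[k] = evenRe[k] + tr;          im[k] = evenIm[k] + ti;
                re[k + n / 2] = evenRe[k] - tr;  im[k + n / 2] = evenIm[k] - ti;
            }
        }
    }

The two output lines of the combine step use the reflective property: the evaluation points k and k + n/2 differ only by the sign of omega^k.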

The FFT Algorithm

The recursive FFT algorithm splits the coefficient vector into its even-index and odd-index halves, recursively transforms each using ω^2 as the root of unity, and combines the results.
The running time is O(n log n). [The inverse FFT is similar.]

Multiplying Big Integers

Given N-bit integers I and J, compute I·J.
Assume: we can multiply words of O(log N) bits in constant time.
Setup: Find a prime p = cn + 1 that can be represented in one word, and set m = (log p)/2, so that we can view I and J as n-length vectors of m-bit words.
Finding a primitive root of unity:
  Find a generator x of Z*p.
  Then ω = x^c is a primitive n-th root of unity in Z*p (arithmetic is mod p).
Apply the convolution and FFT algorithm to compute the convolution C of the vector representations of I and J.
Then compute
  K = Σ_{i=0}^{n-1} ci 2^(mi).
K represents I·J and takes O(n log n) time to compute.

Java Example: Multiplying Big Integers

Setup: Define a BigInt class, and include essential parameters, including the prime, P, and the primitive root of unity, OMEGA.
Use convolution to multiply two big integers, this and val.
[Java code listings for the integer multiply method, the FFT in Z*p, and its support methods, omitted; see 10.5]
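Since the slide's listings are omitted, here is a hedged sketch of the same pipeline with the O(n^2) convolution from earlier standing in for the FFT-based one. The names BigIntMultiply and WORD_BITS are illustrative, not the book's BigInt class; the weighted sum K = Σ ci·2^(m·i) resolves the carries.

    import java.math.BigInteger;

    public class BigIntMultiply {
        static final int WORD_BITS = 16; // the m-bit word size

        static long[] words(BigInteger v, int n) {  // n-length vector of m-bit words
            long[] w = new long[n];
            BigInteger mask = BigInteger.ONE.shiftLeft(WORD_BITS).subtract(BigInteger.ONE);
            for (int i = 0; i < n; i++)
                w[i] = v.shiftRight(i * WORD_BITS).and(mask).longValue();
            return w;
        }

        static long[] convolve(long[] a, long[] b) { // naive stand-in for FFT convolution
            long[] c = new long[a.length + b.length - 1];
            for (int i = 0; i < a.length; i++)
                for (int j = 0; j < b.length; j++)
                    c[i + j] += a[i] * b[j];
            return c;
        }

        static BigInteger multiply(BigInteger a, BigInteger b) {
            int n = Math.max(a.bitLength(), b.bitLength()) / WORD_BITS + 1;
            long[] c = convolve(words(a, n), words(b, n)); // c_i = sum_j a_j b_{i-j}
            BigInteger k = BigInteger.ZERO;
            for (int i = c.length - 1; i >= 0; i--)        // K = sum c_i 2^{mi}
                k = k.shiftLeft(WORD_BITS).add(BigInteger.valueOf(c[i]));
            return k;
        }
    }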

Non-recursive FFT

There is also a non-recursive version of the FFT that
  performs the FFT in place,
  precomputes all roots of unity, and
  performs a cumulative collection of shuffles on A and on B prior to the FFT, which amounts to assigning the value at index i to the index bit-reverse(i).
The code is a bit more complex, but the running time is faster by a constant factor, due to reduced overhead.

Experimental Results

[Figure: running times of traditional and FFT-based multiplication on a log-log scale, omitted]
The log-log scale shows that traditional multiplication runs in O(n^2) time, while the FFT versions are almost linear.
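A sketch of the pre-FFT shuffle mentioned above: move the entry at index i to index bit-reverse(i). The variable j tracks the bit-reversal of i incrementally; n is assumed to be a power of 2.

    static void bitReversePermute(double[] a) {
        int n = a.length;
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1)
                j ^= bit;              // ripple-carry in reversed bit order
            j ^= bit;
            if (i < j) { double t = a[i]; a[i] = a[j]; a[j] = t; }
        }
    }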

Cryptography

Outline
  Traditional cryptography
  Statistical attacks
  Secret-key encryption
  Public-key encryption

Encryption

Scenario:
  Alice wants to send a message (plaintext p) to Bob.
  The communication channel is insecure and can be eavesdropped.
  If Alice and Bob have previously agreed on an encryption scheme (cipher), the message can be sent encrypted (ciphertext c):
    plaintext -> encrypt -> ciphertext -> decrypt -> plaintext
Issues:
  What is a good encryption scheme?
  What is the complexity of encrypting/decrypting?
  What is the size of the ciphertext, relative to the plaintext?
  If Alice and Bob have never interacted before, how can they agree on an encryption scheme?

Traditional Cryptography

Ciphers were already studied in ancient times. Caesar's cipher:
  replace a with d
  replace b with e
  ...
  replace z with c
Caesar's cipher is an example of a monoalphabetic substitution cipher, which permutes the characters.

Statistical Attacks

Armed with simple statistical knowledge about the plaintext language, one can easily break a monoalphabetic substitution cipher:
  Most frequent letters in English: e, t, o, a, n, i, ...
  Most frequent digrams: th, in, er, re, an, ...
  Most frequent trigrams: the, ing, and, ion, ...
The first description of the frequency analysis attack appears in a book written in the 9th century by the Arab philosopher al-Kindi.

Frequency Analysis (1)

We identify the most common characters, digrams and trigrams in the ciphertext.
Example (S. Singh, The Code Book, 1999):
PCQ VMJYPD LBYK LYSO KBXBJXWXV BXV ZCJPO EYPD KBXBJYUXJ LBJOO
KCPK. CP LBO LBCMKXPV XPV IYJKL PYDBL, QBOP KBO BXV OPVOV LBO
LXRO CI SX'XJMI, KBO JCKO XPV EYKKOV LBO DJCMPV ZOICJO BYS,
KXUYPD: DJOXL EYPD, ICJ X LBCMKXPV XPV CPO PYDBLK Y BXNO ZOOP
JOACMPLYPD LC UCM LBO IXZROK CI FXKL XDOK XPV LBO RODOPVK CI
XPAYOPL EYPDK. SXU Y SXEO KC ZCRV XK LC AJXNO X IXNCMJ CI UCMJ
SXGOKLU?
OFYRCDMO, LXROK IJCS LBO LBCMKXPV XPV CPO PYDBLK
First guess: LBO is THE
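The counting step of frequency analysis is a few lines of Java (letterCounts is an illustrative name): tally each ciphertext letter, then compare the most frequent ones against e, t, o, a, n, i, ... Running this on the ciphertext above motivates the guess LBO = THE.

    static int[] letterCounts(String ciphertext) {
        int[] count = new int[26];
        for (char c : ciphertext.toCharArray())
            if (c >= 'A' && c <= 'Z')
                count[c - 'A']++;      // ignore spaces and punctuation
        return count;
    }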

Frequency Analysis (2)

Assuming LBO represents THE, we replace L with T, B with H, and O with E and get:
PCQ VMJYPD THYK TYSE KHXHJXWXV HXV ZCJPE EYPD
KHXHJYUXJ THJEE KCPK. CP THE THCMKXPV XPV IYJKT
PYDHT, QHEP KHO HXV EPVEV THE LXRE CI SX'XJMI, KHE JCKE
XPV EYKKEV THE DJCMPV ZEICJE HYS, KXUYPD: DJEXT EYPD,
ICJ X THCMKXPV XPV CPE PYDHTK Y HXNE ZEEP JEACMPTYPD
TC UCM THE IXZREK CI FXKT XDEK XPV THE REDEPVK CI
XPAYEPT EYPDK. SXU Y SXEE KC ZCRV XK TC AJXNE X IXNCMJ
CI UCMJ SXGEKTU?
EFYRCDME, TXREK IJCS THE THCMKXPV XPV CPE PYDBTK

Decryption

Continuing the frequency analysis recovers the full substitution code:
  plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  ciphertext: X Z A V O I D B Y G E R S P C F H J K L M N Q T U W
Plaintext:
Now during this time Shahrazad had borne King Shahriyar three sons.
On the thousand and first night, when she had ended the tale of
Ma'aruf, she rose and kissed the ground before him, saying: Great King,
for a thousand and one nights I have been recounting to you the fables
of past ages and the legends of ancient kings. May I make so bold as to
crave a favour of your majesty?
Epilogue, Tales from the Thousand and One Nights

Secret-Key Encryption

A secret-key cipher uses a unique key K to encrypt and decrypt.
Caesar's generalized cipher uses the modular addition of each character (viewed as an integer) with the key:
  Encryption: C[i] = (P[i] + K) mod m
  Decryption: P[i] = (C[i] - K) mod m
More secure secret-key encryption schemes have been devised in this century. Examples:
  DES
  3DES
  IDEA
  BLOWFISH
With secret-key encryption, a distinct secret key must be established for every pair of parties.

Public-Key Encryption

Bob uses a pair of keys (KE, KD):
  he makes key KE public, and
  keeps key KD private.
Anyone can use the public key KE to encrypt a plaintext into a ciphertext sent to Bob:
  plaintext -> encrypt with public key KE -> ciphertext -> decrypt with private key KD -> plaintext
Only Bob can decrypt the ciphertext using the private key KD.
The most popular public-key encryption scheme is RSA, named after its inventors Rivest, Shamir, and Adleman (1978). The RSA patent expired in 2000.
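A minimal Java sketch of Caesar's generalized cipher over an m = 26 letter alphabet (the method name shift is illustrative); decryption is just a shift with -K.

    static String shift(String text, int k) {
        final int m = 26;
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c >= 'A' && c <= 'Z')
                out.append((char) ('A' + ((c - 'A' + k) % m + m) % m)); // C[i] = (P[i] + K) mod m
            else
                out.append(c);     // leave non-letters untouched
        }
        return out.toString();
    }
    // shift("ATTACK", 3) -> "DWWDFN"; shift("DWWDFN", -3) -> "ATTACK"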

RSA Cryptosystem

Outline and Reading
  Euler's theorem (10.1.3)
  RSA cryptosystem (10.2.3)
    Definition
    Example
    Security
    Correctness
  Algorithms for RSA
    Modular power (10.1.4)
    Modular inverse (10.1.5)
    Randomized primality testing (10.1.6)

Recall Euler's theorem: for each element x of Z*n, we have x^φ(n) mod n = 1. It is the basis of the RSA correctness proof below.

Definition

Setup:
  n = pq, with p and q primes
  e relatively prime to φ(n) = (p - 1)(q - 1)
  d inverse of e in Z_φ(n)
Keys:
  Public key: KE = (n, e)
  Private key: KD = d
Encryption:
  Plaintext M in Zn
  C = M^e mod n
Decryption:
  M = C^d mod n

Example

Setup:
  p = 7, q = 17
  n = 7·17 = 119
  φ(n) = 6·16 = 96
  e = 5
  d = 77
Keys:
  public key: (119, 5)
  private key: 77
Encryption:
  M = 19
  C = 19^5 mod 119 = 66
Decryption:
  C^d mod n = 66^77 mod 119 = 19

Complete RSA Example

Setup:
  p = 5, q = 11
  n = 5·11 = 55
  φ(n) = 4·10 = 40
  e = 3
  d = 27   (3·27 = 81 = 2·40 + 1)
Encryption: C = M^3 mod 55
Decryption: M = C^27 mod 55

   M   C     M   C     M   C
   1   1    19  39    37  53
   2   8    20  25    38  37
   3  27    21  21    39  29
   4   9    22  33    40  35
   5  15    23  12    41   6
   6  51    24  19    42   3
   7  13    25   5    43  32
   8  17    26  31    44  44
   9  14    27  48    45  45
  10  10    28   7    46  41
  11  11    29  24    47  38
  12  23    30  50    48  42
  13  52    31  36    49   4
  14  49    32  43    50  40
  15  20    33  22    51  46
  16  26    34  34    52  28
  17  18    35  30    53  47
  18   2    36  16    54  54

Security

The security of the RSA cryptosystem is based on the widely believed difficulty of factoring large numbers.
The best known factoring algorithm (the general number field sieve) takes time exponential in the number of bits of the number to be factored.
The RSA challenge, sponsored by RSA Security, offers cash prizes for the factorization of given large numbers. In April 2002, prizes ranged from $10,000 (576 bits) to $200,000 (2048 bits).
In 1999, a 512-bit number was factored in 4 months using the following computers:
  160 175-400 MHz SGI and Sun
  8 250 MHz SGI Origin
  120 300-450 MHz Pentium II
  4 500 MHz Digital/Compaq
Estimated resources needed to factor a number within one year:

  Bits    PCs          Memory
  430     —            128MB
  760     215,000      4GB
  1,020   342·10^6     170GB
  1,620   1.6·10^15    120TB
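The complete example above (n = 55, e = 3, d = 27), replayed with java.math.BigInteger. Toy-sized only: real moduli are hundreds of digits long, and the security rests on n being hard to factor.

    import java.math.BigInteger;

    public class ToyRSA {
        public static void main(String[] args) {
            BigInteger n = BigInteger.valueOf(55);
            BigInteger e = BigInteger.valueOf(3);
            BigInteger d = BigInteger.valueOf(27);
            BigInteger M = BigInteger.valueOf(51);
            BigInteger C = M.modPow(e, n);     // C = M^e mod n = 46 (see table)
            BigInteger back = C.modPow(d, n);  // M = C^d mod n = 51
            System.out.println(C + " " + back);
        }
    }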

Correctness

We show the correctness of the RSA cryptosystem for the case when the plaintext M is relatively prime to n. (See the book for the proof of correctness in the remaining case.)
Namely, we show that (M^e)^d mod n = M.
Since ed mod φ(n) = 1, there is an integer k such that ed = kφ(n) + 1.
Since M is relatively prime to n, by Euler's theorem we have M^φ(n) mod n = 1.
Thus, we obtain
  (M^e)^d mod n
  = M^(ed) mod n
  = M^(kφ(n) + 1) mod n
  = M · M^(kφ(n)) mod n
  = M · (M^φ(n))^k mod n
  = M · (M^φ(n) mod n)^k mod n
  = M · (1)^k mod n
  = M mod n
  = M

Algorithmic Issues

The implementation of the RSA cryptosystem requires various algorithms:
  Representation of integers of arbitrarily large size and arithmetic operations on them
  Generation of random numbers with a given number of bits (to generate candidates p and q)
  Primality testing (to check that candidates p and q are prime)
  Computation of the GCD (to verify that e and φ(n) are relatively prime)
  Computation of the multiplicative inverse (to compute d from e)
Overall:
  Setup: random number generation, primality testing, GCD, multiplicative inverse
  Encryption: modular power
  Decryption: modular power

Modular Power

The repeated squaring algorithm speeds up the computation of a modular power a^p mod n.
Write the exponent p in binary:
  p = p_(b-1) p_(b-2) ... p_1 p_0
Start with
  Q_1 = a^(p_(b-1)) mod n
Repeatedly compute
  Q_i = ((Q_(i-1))^2 mod n) · a^(p_(b-i)) mod n
We obtain
  Q_b = a^p mod n
The repeated squaring algorithm performs O(log p) arithmetic operations.
Example: 3^18 mod 19   (18 = 10010 in binary)
  Q_1 = 3^1 mod 19 = 3
  Q_2 = (3^2 mod 19)·3^0 mod 19 = 9
  Q_3 = (9^2 mod 19)·3^0 mod 19 = 81 mod 19 = 5
  Q_4 = (5^2 mod 19)·3^1 mod 19 = (25 mod 19)·3 mod 19 = 18 mod 19 = 18
  Q_5 = (18^2 mod 19)·3^0 mod 19 = 324 mod 19 = (17·19 + 1) mod 19 = 1
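Repeated squaring in Java, scanning the exponent's bits from the most significant end: Q <- Q^2 · a^bit (mod n) per bit, O(log p) multiplications overall. A sketch assuming n < 2^31, so the long products cannot overflow.

    static long modPow(long a, long p, long n) {
        long q = 1;
        for (int bit = 62; bit >= 0; bit--) {
            q = (q * q) % n;              // Q_i = Q_{i-1}^2 mod n
            if (((p >> bit) & 1) == 1)
                q = (q * a) % n;          // multiply in a when the bit is 1
        }
        return q;
    }
    // modPow(3, 18, 19) == 1, matching the example above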

Modular Inverse

Theorem
Given positive integers a and b, let d be the smallest positive integer such that d = ia + jb for some integers i and j. We have d = gcd(a,b).
Example
  a = 21, b = 15
  d = 3, i = 3, j = -4
  3 = 3·21 + (-4)·15 = 63 - 60 = 3
Given positive integers a and b, the extended Euclid's algorithm computes a triplet (d,i,j) such that
  d = gcd(a,b)
  d = ia + jb
To test the existence of, and compute, the inverse of x in Zn, we execute the extended Euclid's algorithm on the input pair (x,n).
Let (d,i,j) be the triplet returned; then d = ix + jn:
  Case 1: d = 1. Then i is the inverse of x in Zn.
  Case 2: d > 1. Then x has no inverse in Zn.
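A recursive Java sketch of the extended Euclid's algorithm: it returns {d, i, j} with d = gcd(a, b) = i·a + j·b. For the inverse of x in Zn, run it on (x, n); if d = 1, then i (taken mod n) is x^-1.

    static long[] extendedGcd(long a, long b) {
        if (b == 0)
            return new long[] { a, 1, 0 };      // gcd(a, 0) = a = 1*a + 0*0
        long[] t = extendedGcd(b, a % b);       // t = {d, i', j'}
        // d = i'*b + j'*(a - (a/b)*b) = j'*a + (i' - (a/b)*j')*b
        return new long[] { t[0], t[2], t[1] - (a / b) * t[2] };
    }
    // extendedGcd(21, 15) returns {3, -2, 3}: 3 = (-2)*21 + 3*15.
    // Coefficients are not unique; the pair (3, -4) in the example above
    // differs from this one by a multiple of (b/d, -a/d) = (5, -7).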

Pseudoprimality Testing

The number of primes less than or equal to n is about n / ln n. Thus, we expect to find a prime among O(b) randomly generated numbers with b bits each.
Testing whether a number is prime (primality testing) is believed to be a hard problem.
An integer n >= 2 is said to be a base-x pseudoprime if
  x^(n-1) mod n = 1   (Fermat's little theorem)
Base-x pseudoprimality testing for an integer n:
  Check whether x^(n-1) mod n = 1.
  This can be performed efficiently with the repeated squaring algorithm.
Composite base-x pseudoprimes are rare:
  A random 100-bit integer is a composite base-2 pseudoprime with probability less than 10^-13.
  The smallest composite base-2 pseudoprime is 341.

Randomized Primality Testing

Compositeness witness function witness(x, n), with error probability q for a random variable x:
  Case 1: n is prime. Then witness(x, n) = false.
  Case 2: n is composite. Then witness(x, n) = false with probability q < 1.
Algorithm RandPrimeTest tests whether n is prime by repeatedly evaluating witness(x, n).
A variation of base-x pseudoprimality provides a suitable compositeness witness function for randomized primality testing (the Rabin-Miller algorithm).

Algorithm RandPrimeTest(n, k)
  Input: integer n, confidence parameter k, and compositeness witness function witness(x, n) with error probability q
  Output: an indication of whether n is composite or prime, with error probability at most 2^-k
  t ← k/log2(1/q)
  for i ← 1 to t
    x ← random()
    if witness(x, n) = true
      return "n is composite"
  return "n is prime"
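The base-x check is one line with BigInteger, whose modPow performs the repeated squaring internally (passesFermat is an illustrative name):

    import java.math.BigInteger;

    // n passes iff x^(n-1) mod n = 1
    static boolean passesFermat(BigInteger n, BigInteger x) {
        return x.modPow(n.subtract(BigInteger.ONE), n).equals(BigInteger.ONE);
    }
    // With n = 341 and x = 2 this returns true even though 341 = 11 * 31:
    // the smallest composite base-2 pseudoprime noted above. Rabin-Miller
    // witnesses close this loophole.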

Information Security

Outline and Reading
  Digital signatures
    Definition (10.2.2)
    RSA signature and verification (10.2.3)
  One-way hash functions
    Definition (10.3.1)
    Applications (10.3.2)
  Key distribution
    Certificates (10.3.5)
    Revocation (10.3.5)

Digital Signature

A digital signature is a string S associated with a message M and the author A of M that has the following properties:
  Integrity: S establishes that M has not been altered.
  Nonrepudiation: S unequivocally identifies the author A of M and proves that A did indeed sign M.
A digital signature scheme provides algorithms for
  signing a message by the author, and
  verifying the signature of a message by the reader.

One-Way Hash Function

A one-way hash function is a function H with the following properties:
  H maps a string M of arbitrary length into an integer f = H(M) with a fixed number of bits, called the fingerprint or digest of M:
    message M -> one-way hash -> fingerprint f = H(M)
  H can be computed efficiently.
  Given an integer f, it is computationally infeasible to find a string M such that H(M) = f.
  Given a string M, it is computationally infeasible to find another string M' such that H(M') = H(M) (collision resistance).
  It is computationally infeasible to find two strings M and M' such that H(M) = H(M') (strong collision resistance).
Two widely used one-way hash functions are
  MD5 (Message Digest 5, 1992), which uses a 128-bit (16 byte) fingerprint, and
  SHA-1 (Secure Hash Algorithm 1, 1995), which uses a 160-bit (20 byte) fingerprint.

RSA Digital Signature

A public-key cryptosystem yields a digital signature scheme provided encrypt(KE, decrypt(KD, M)) = M:
  Signature: Alice (author) computes S = decrypt(KD, M) using her private key KD and sends the pair (M, S) to Bob.
  Verification: Bob (reader) computes M' = encrypt(KE, S) using Alice's public key KE and checks that M' = M.
A recently passed law in the US gives digital signatures the same validity as handwritten signatures.
In the RSA digital signature scheme:
Setup:
  n = pq, with p and q primes
  e relatively prime to φ(n) = (p - 1)(q - 1)
  d inverse of e in Z_φ(n)
Keys:
  Public key: KE = (n, e)
  Private key: KD = d
Signature:
  Message M in Zn
  S = M^d mod n
Verification:
  Check that M = S^e mod n
Example
Setup: p = 5, q = 11, n = 5·11 = 55, φ(n) = 4·10 = 40, e = 3, d = 27 (3·27 = 81 = 2·40 + 1)
Keys:
  Public key: KE = (55, 3)
  Private key: KD = 27
Signature:
  M = 51
  S = 51^27 mod 55 = 6
Verification:
  S^e mod n = 6^3 mod 55 = 216 mod 55 = 51 = M

Coin Flipping Over the Net

Alice and Bob want to flip a random coin by communicating over the Internet.
The following protocol, based on a one-way hash function H, ensures the fairness of the outcome:
  Alice picks a random integer x, computes the fingerprint f = H(x), and sends f to Bob.
  Bob sends Alice his guess of whether x is odd or even.
  Alice announces the result of the coin flip: heads if Bob has guessed correctly, and tails otherwise.
  Alice sends Bob the integer x as a proof of the outcome of the flip.
  Bob verifies that f = H(x).
Because of the strong collision resistance property, it is computationally infeasible for Alice to cheat.
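A sketch of the protocol in Java, with SHA-256 standing in for H (MD5 and SHA-1, listed above, are no longer considered collision resistant); the class name CoinFlip is illustrative.

    import java.math.BigInteger;
    import java.security.MessageDigest;
    import java.security.SecureRandom;
    import java.util.Arrays;

    public class CoinFlip {
        public static void main(String[] args) throws Exception {
            SecureRandom rnd = new SecureRandom();
            BigInteger x = new BigInteger(128, rnd);          // Alice's secret
            byte[] f = sha256(x.toByteArray());               // f = H(x), sent to Bob
            boolean bobGuessesOdd = rnd.nextBoolean();        // Bob's guess
            boolean correct = (bobGuessesOdd == x.testBit(0));
            System.out.println(correct ? "heads" : "tails");  // Alice announces
            // Alice reveals x; Bob re-hashes it to check the commitment:
            System.out.println(Arrays.equals(f, sha256(x.toByteArray())));
        }

        static byte[] sha256(byte[] data) throws Exception {
            return MessageDigest.getInstance("SHA-256").digest(data);
        }
    }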

Digitally Signed Fingerprints

In the RSA digital signature scheme with modulus n, the message to be signed must be an integer in Zn, i.e., the message should have at most b = log n bits.
To overcome this restriction on the message length, we can use the fingerprint f = H(M) of the message instead of the message itself, where H is a one-way hash function:
  Alice first computes f = H(M) and then the signature S of f:
    message M -> one-way hash -> fingerprint f = H(M) -> sign -> signature S = f^d mod n
  Bob first computes f = H(M) and then verifies S.
Since the one-way hash function H has the collision resistance property, it is computationally infeasible to modify the message M while preserving the signature of the fingerprint f = H(M).
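A hedged sketch of hash-then-sign with the toy keys above (n = 55, d = 27); signFingerprint is an illustrative name, and real schemes pad the digest (e.g., PKCS#1) and use much larger moduli.

    import java.math.BigInteger;
    import java.security.MessageDigest;

    static BigInteger signFingerprint(byte[] message, BigInteger d, BigInteger n)
            throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(message);
        BigInteger f = new BigInteger(1, digest).mod(n);  // fingerprint reduced into Zn
        return f.modPow(d, n);                            // S = f^d mod n
    }
    // Bob verifies by checking S^e mod n == H(M) mod n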

Certificates

Public-key cryptography is based on the knowledge by each participant of the public keys of the other participants.
It is complicated to securely distribute the public keys of all the participants.
A certificate is a message of the type (name, public key) signed by a third party.
Public-key infrastructure (PKI):
  An entity trusted by all the participants, called a certification authority (CA), issues to each participant a certificate (Name, KE) that authoritatively binds the participants to their public keys.
  Only the CA's public key needs to be distributed securely.
  Before sending an encrypted message to Bob or verifying a message digitally signed by Bob, Alice determines Bob's public key KE by using Bob's certificate (Bob, KE).

Web Server Certificates

A Web server certificate is used to authenticate the public key of a Web server.
Fields of a Web server certificate:
  Serial number
  Hash and signature schemes (e.g., MD5 and RSA)
  Issuer (certification authority)
  Period of validity (from, to)
  Subject (URL and organization)
  Public key
The SSL (secure socket layer) protocol uses Web server certificates to provide encryption and authentication in a secure Web connection (https).

Certificate Revocation

In certain circumstances, a certificate may have to be revoked before its expiration date:
  The private key of the subject has been compromised.
  The certificate was incorrectly issued by the CA.
Certificate Revocation List (CRL):
  A time-stamped list of all the unexpired certificates that have been revoked by the CA.
  Periodically published and signed by the CA.
When presented with a certificate, one should
  verify the CA's signature on the certificate, and
  check that the certificate has not been revoked by searching the latest available CRL.
By default, Web browsers do not check the revocation status of a Web server certificate, which poses a security risk.

Convex Hull

Outline and Reading
  Convex hull (12.5.2)
  Orientation (12.5.1-2)
  Sorting by angle (12.5.5)
  Graham scan (12.5.5)
  Analysis (12.5.5)

Applications

Motion planning: find an optimal route that avoids obstacles for a robot.
[Figure: a route from start to end around an obstacle, omitted]
Geometric algorithms: the convex hull is like a two-dimensional sorting.

Convex Polygon

A convex polygon is a nonintersecting polygon whose internal angles are all convex (i.e., less than π).
In a convex polygon, a segment joining two vertices of the polygon lies entirely inside the polygon.

Convex Hull

The convex hull of a set of points is the smallest convex polygon containing the points.
Think of a rubber band snapping around the points.

Special Cases

If there is one point, or all the points are coincident, the convex hull is a point.
If there are two points, or all the points are collinear, the convex hull is a segment.

Computing the Convex Hull

The following method computes the convex hull of a set of points:
  Phase 1: Find the lowest point (anchor point).
  Phase 2: Form a nonintersecting polygon by sorting the points counterclockwise around the anchor point.
  Phase 3: While the polygon has a nonconvex vertex, remove it.

Orientation

The orientation orientation(a, b, c) of three points in the plane is clockwise, counterclockwise, or collinear:
  clockwise (CW, right turn)
  counterclockwise (CCW, left turn)
  collinear (COLL, no turn)
The orientation of three points is characterized by the sign of the determinant

  Δ(a, b, c) = | x_a  y_a  1 |
               | x_b  y_b  1 |
               | x_c  y_c  1 |

whose absolute value is twice the area of the triangle with vertices a, b and c: Δ positive means CCW, Δ negative means CW, and Δ = 0 means COLL.

Sorting by Angle

Computing angles from coordinates is complex and leads to numerical inaccuracy.
We can instead sort a set of points by angle with respect to the anchor point a using a comparator based on the orientation function:
  b < c   when orientation(a, b, c) = CCW
  b = c   when orientation(a, b, c) = COLL
  b > c   when orientation(a, b, c) = CW

Removing Nonconvex Vertices

Testing whether a vertex is convex can be done using the orientation function.
Let p, q and r be three consecutive vertices of a polygon, in counterclockwise order:
  q convex      when orientation(p, q, r) = CCW
  q nonconvex   when orientation(p, q, r) = CW or COLL

Graham Scan

The Graham scan is a systematic procedure for removing nonconvex vertices from a polygon.
The polygon is traversed counterclockwise and a sequence H of vertices is maintained:
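The determinant test compiles down to a few arithmetic operations in Java (expanding Δ along its last column). One caveat worth noting: with doubles, nearly collinear triples can be misclassified; robust implementations use exact arithmetic.

    static int orientation(double ax, double ay, double bx, double by,
                           double cx, double cy) {
        double delta = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
        return delta > 0 ? 1 : delta < 0 ? -1 : 0;  // 1 = CCW, -1 = CW, 0 = COLL
    }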

for each vertex r of the polygon
  Let q and p be the last and second last vertex of H
  while orientation(p, q, r) = CW or COLL
    remove q from H
    q ← p
    p ← vertex preceding p in H
  Add r to the end of H

Analysis

Computing the convex hull of a set of points takes O(n log n) time:
  Finding the anchor point takes O(n) time.
  Sorting the points counterclockwise around the anchor point takes O(n log n) time:
    use the orientation comparator and any sorting algorithm that runs in O(n log n) time (e.g., heap-sort or merge-sort).
  The Graham scan takes O(n) time:
    each point is inserted once in sequence H, and
    each vertex is removed at most once from sequence H.
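A compact Java sketch of the three phases, reusing the orientation function above. It assumes at least three points and, for brevity, no two points at the same angle from the anchor; ties and collinear runs need extra care in a production version.

    import java.util.*;

    static List<double[]> convexHull(List<double[]> pts) {
        // Phase 1: anchor a = lowest point (leftmost on ties)
        double[] a = Collections.min(pts, (p, q) ->
            p[1] != q[1] ? Double.compare(p[1], q[1]) : Double.compare(p[0], q[0]));
        // Phase 2: sort the remaining points CCW around a
        List<double[]> rest = new ArrayList<>(pts);
        rest.remove(a);
        rest.sort((p, q) -> -orientation(a[0], a[1], p[0], p[1], q[0], q[1]));
        // Phase 3: scan, removing nonconvex vertices (CW or COLL turns)
        Deque<double[]> h = new ArrayDeque<>();
        h.push(a);
        for (double[] r : rest) {
            while (h.size() >= 2) {
                double[] q = h.pop();
                double[] p = h.peek();
                if (orientation(p[0], p[1], q[0], q[1], r[0], r[1]) > 0) {
                    h.push(q);          // q is convex; keep it
                    break;
                }                        // otherwise q stays removed
            }
            h.push(r);
        }
        return new ArrayList<>(h);
    }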

Incremental Convex Hull

Outline and Reading
  Point location
    Problem
    Data structure
  Incremental convex hull
    Problem
    Data structure
    Insertion algorithm
    Analysis

The incremental convex hull problem consists of performing a series of the following operations on a set S of points:
  locate(q): determines if query point q is inside, outside or on the convex hull of S
  insert(q): inserts a new point q into S
  hull(): returns the convex hull of S

Point Location

Given a convex polygon P, a point location query locate(q) determines whether a query point q is inside (IN), outside (OUT), or on the boundary (ON) of P.
An efficient data structure for point location stores the top and bottom chains of P in two binary search trees, TH and TL, of logarithmic height:
  An internal node stores a pair (x(v), v) where v is a vertex and x(v) is its x-coordinate.
  An external node represents an edge or an empty halfplane.

Point Location (cont.)

To perform locate(q), we search for x(q) in TL and TH to find
  the edge eL or vertex vL on the lower chain of P whose horizontal span includes x(q), and
  the edge eH or vertex vH on the upper chain of P whose horizontal span includes x(q).
We consider four cases:
  If no such edges/vertices exist, we return OUT.
  Else, if q is on eL (vL) or on eH (vH), we return ON.
  Else, if q is above eL (vL) and below eH (vH), we return IN.
  Else, we return OUT.

Insertion of a Point

Incremental convex hull data structure:
  We store the points of the convex hull and discard the other points.
  We store the hull points in two red-black trees:
    TL for the lower hull
    TH for the upper hull
In operation insert(q), we consider four cases that depend on the location of point q:
  A) IN or ON: no change
  B) OUT and above: add q to the upper hull
  C) OUT and below: add q to the lower hull
  D) OUT and left or right: add q to the lower and upper hull

Insertion of a Point (cont.)

Algorithm to add a vertex q to the upper hull chain in Case B (boundary conditions omitted for simplicity):
  We find the edge e (vertex v) whose horizontal span includes q
  w ← left endpoint (neighbor) of e (v)
  z ← left neighbor of w
  While orientation(q, w, z) = CW or COLL
    We remove vertex w
    w ← z
    z ← left neighbor of w
  u ← right endpoint (neighbor) of e (v)
  t ← right neighbor of u
  While orientation(t, u, q) = CW or COLL
    We remove vertex u
    u ← t
    t ← right neighbor of u
  We add vertex q

Analysis

Let n be the current size of the convex hull:
  Operation locate takes O(log n) time.
  Operation insert takes O((1 + k) log n) time, where k is the number of vertices removed.
  Operation hull takes O(n) time.
  The amortized running time of operation insert is O(log n).

NP-Completeness

Outline and Reading
  P and NP (13.1)
    Definition of P
    Definition of NP
    Alternate definition of NP
  NP-completeness (13.2)
    Definition of NP-hard and NP-complete
    The Cook-Levin Theorem

Running Time Revisited

Input size, n:
  To be exact, let n denote the number of bits in a nonunary encoding of the input.
All the polynomial-time algorithms studied so far in this course run in polynomial time using this definition of input size.
  Exception: any pseudo-polynomial time algorithm.

Dealing with Hard Problems

What to do when we find a problem that looks hard:
  "I couldn't find a polynomial-time algorithm; I guess I'm too dumb."
Sometimes we can prove a strong lower bound (but not usually):
  "I couldn't find a polynomial-time algorithm, because no such algorithm exists!"
NP-completeness lets us show collectively that a problem is hard:
  "I couldn't find a polynomial-time algorithm, but neither could all these other smart people."

Polynomial-Time Decision Problems

To simplify the notion of hardness, we will focus on the following:
  Polynomial time as the cut-off for efficiency.
  Decision problems: output is 1 or 0 (yes or no).
Examples:
  Does a given graph G have an Euler tour?
  Does a text T contain a pattern P?
  Does an instance of 0/1 Knapsack have a solution with benefit at least K?
  Does a graph G have an MST with weight at most K?

Problems and Languages

A language L is a set of strings defined over some alphabet.
Every decision algorithm A defines a language L:
  L is the set consisting of every string x such that A outputs "yes" on input x.
  We say A accepts x in this case.
Example: If A determines whether or not a given graph G has an Euler tour, then the language L for A is all graphs with Euler tours.

The Complexity Class P

A complexity class is a collection of languages.
P is the complexity class consisting of all languages that are accepted by polynomial-time algorithms:
  For each language L in P there is a polynomial-time decision algorithm A for L.
  If n = |x|, for x in L, then A runs in p(n) time on input x, where p(n) is some polynomial.

The Complexity Class NP

We say that an algorithm is non-deterministic if it uses the following operation:
  Choose(b): chooses a bit b.
  This can be used to choose an entire string y (with |y| choices).
We say that a non-deterministic algorithm A accepts a string x if there exists some sequence of choose operations that causes A to output "yes" on input x.
NP is the complexity class consisting of all languages accepted by polynomial-time non-deterministic algorithms.

NP Example

Problem: Decide if a graph has an MST of weight at most K.
Algorithm:
  1. Non-deterministically choose a set T of n-1 edges.
  2. Test that T forms a spanning tree.
  3. Test that T has weight at most K.
Analysis: Testing takes O(n+m) time, so this algorithm runs in polynomial time.

The Complexity Class NP: Alternate Definition

We say that an algorithm B verifies the acceptance of a language L if and only if, for any x in L, there exists a certificate y such that B outputs "yes" on input (x,y).
NP is the complexity class consisting of all languages verified by polynomial-time algorithms.
We know: P is a subset of NP.
Major open question: P = NP?
  Most researchers believe that P and NP are different.

NP Example (2)

Problem: Decide if a graph has an MST of weight at most K.
Verification Algorithm:
  1. Use as a certificate, y, a set T of n-1 edges.
  2. Test that T forms a spanning tree.
  3. Test that T has weight at most K.
Analysis: Verification takes O(n+m) time, so this algorithm runs in polynomial time.

Equivalence of the Two Definitions

Suppose A is a non-deterministic algorithm:
  Let y be a certificate consisting of all the outcomes of the choose steps that A uses.
  We can create a verification algorithm that uses y instead of A's choose steps.
  If A accepts on x, then there is a certificate y that allows us to verify this (namely, the choose steps A made).
  If A runs in polynomial time, so does this verification algorithm.
Suppose B is a verification algorithm:
  Non-deterministically choose a certificate y.
  Run B on y.
  If B runs in polynomial time, so does this non-deterministic algorithm.

An Interesting Problem

A Boolean circuit is a circuit of AND, OR, and NOT gates; the CIRCUIT-SAT problem is to determine if there is an assignment of 0's and 1's to a circuit's inputs so that the circuit outputs 1.
[Figure: a Boolean circuit with 0/1 inputs, AND/OR/NOT logic gates, and a single output, omitted]

CIRCUIT-SAT is in NP

Non-deterministically choose a set of inputs and the outcome of every gate, then test each gate's I/O.

NP-Hard and NP-Complete Problems

A problem (language) L is NP-hard if every problem in NP can be reduced to L in polynomial time.
That is, for each language M in NP, we can take an input x for M and transform it in polynomial time to an input x' for L such that x is in M if and only if x' is in L.
L is NP-complete if it is in NP and is NP-hard.

Cook-Levin Theorem

CIRCUIT-SAT is NP-complete.
  We already showed it is in NP.
  To prove it is NP-hard, we have to show that every language in NP can be reduced to it.
Proof setup:
  Let M be in NP, and let x be an input for M.
  Let y be a certificate that allows us to verify membership in M in polynomial time, p(n), by some algorithm D.
  Let S be a circuit of size at most O(p(n)^2) that simulates a computer (details omitted).

Cook-Levin Proof

We can build a circuit that simulates the verification of x's membership in M using y:
  Let W be the working storage for D (including registers, such as the program counter), and let D be given in RAM machine code.
  Simulate p(n) steps of D by replicating the circuit S for each step of D; each copy maps the p(n) cells of W, together with x and y, to their contents after one more step. The only unconstrained input is y.
  The circuit is satisfiable if and only if x is accepted by D with some certificate y.
  The total size is still polynomial: O(p(n)^3).

Some Thoughts about P and NP

Belief: P is a proper subset of NP.
  Implication: the NP-complete problems are the hardest in NP.
  Why: Because if we could solve an NP-complete problem in polynomial time, we could solve every problem in NP in polynomial time.
  That is, if an NP-complete problem is solvable in polynomial time, then P = NP.
Since so many people have attempted without success to find polynomial-time solutions to NP-complete problems, showing your problem is NP-complete is equivalent to showing that a lot of smart people have worked on your problem and found no polynomial-time algorithm.

NP-Completeness (2)

Outline and Reading
  Definitions (13.1-2)
    Problem reduction
  Some NP-complete problems (13.3)
    CNF-SAT and 3SAT
    Vertex Cover
    Clique
    Hamiltonian Cycle

Problem Reduction

A language M is polynomial-time reducible to a language L if an instance x for M can be transformed in polynomial time to an instance x' for L such that x is in M if and only if x' is in L.
  Denote this by M → L.
A problem (language) L is NP-hard if every problem in NP is polynomial-time reducible to L.
A problem (language) is NP-complete if it is in NP and it is NP-hard.
CIRCUIT-SAT is NP-complete:
  CIRCUIT-SAT is in NP.
  For every M in NP, M → CIRCUIT-SAT.

Transitivity of Reducibility

If A → B and B → C, then A → C:
  An input x for A can be converted to x' for B, such that x is in A if and only if x' is in B. Likewise, convert x' into x'' for C such that x' is in B iff x'' is in C.
  Hence, if x is in A, then x' is in B, and x'' is in C. Likewise, if x'' is in C, then x' is in B, and x is in A.
  Thus, A → C, since polynomial time is closed under composition.
Types of reductions:
  Local replacement: Show A → B by dividing an input to A into components and showing how each component can be converted to a component for B.
  Component design: Show A → B by building special components for an input of B that enforce properties needed for A, such as "choice" or "evaluate".

SAT

A Boolean formula is a formula where the variables and operations are Boolean (0/1):
  (a+b+d+e)(a+c)(b+c+d+e)(a+c+e)
  OR: +, AND: · (times), NOT: overbar
SAT: Given a Boolean formula S, is S satisfiable, that is, can we assign 0's and 1's to the variables so that S is 1 (true)?
Easy to see that SAT is in NP:
  Non-deterministically choose an assignment of 0's and 1's to the variables and then evaluate each clause. If they are all 1 (true), then the formula is satisfiable.

SAT is NP-complete

Reduce CIRCUIT-SAT to SAT:
  Given a Boolean circuit, make a variable for every input and gate.
  Create a sub-formula for each gate, characterizing its effect; for instance, an OR gate computing e from a and b contributes the factor ((a+b) ↔ e), and an AND gate computing i from e and f contributes ((e·f) ↔ i).
  Form the formula as the output variable AND-ed with all these sub-formulas.
The formula is satisfiable if and only if the Boolean circuit is satisfiable.

3SAT

The SAT problem is still NP-complete even if the formula is a conjunction of disjuncts, that is, it is in conjunctive normal form (CNF).
The SAT problem is still NP-complete even if it is in CNF and every clause has just 3 literals (a variable or its negation):
  (a+b+d)(a+c+e)(b+d+e)(a+c+e)
Reduction from SAT (see 13.3.1).

Vertex Cover

A vertex cover of a graph G=(V,E) is a subset W of V such that, for every edge (a,b) in E, a is in W or b is in W.
VERTEX-COVER: Given a graph G and an integer K, does G have a vertex cover of size at most K?

VERTEX-COVER is NP-complete

VERTEX-COVER is in NP: Non-deterministically choose a subset W of size K and check that every edge is covered by W.
Reduce 3SAT to VERTEX-COVER (component design):
  Let S be a Boolean formula in CNF with each clause having 3 literals.
  For each variable x, create a node for x and a node for its negation, and connect these two.
  For each clause (a+b+c), create a triangle and connect these three nodes.
  Completing the construction: connect each literal in a clause triangle to its copy in a variable pair (e.g., for a clause (x+y+z)).
  Let n = # of variables and m = # of clauses; set K = n + 2m.
  The graph has a vertex cover of size K if and only if the formula is satisfiable.
  Example: for a formula with the 4 variables a, b, c, d and 3 clauses, the graph has a vertex cover of size K = 4 + 6 = 10 iff the formula is satisfiable.

Clique

A clique of a graph G=(V,E) is a subgraph C that is fully connected (every pair in C has an edge).
CLIQUE: Given a graph G and an integer K, is there a clique in G of size at least K?
[Figure: a graph with a clique of size 5, omitted]
CLIQUE is in NP: Non-deterministically choose a subset C of size K and check that every pair in C has an edge in G.
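The verification halves of these membership claims are ordinary polynomial-time code. For example, checking a VERTEX-COVER certificate W against every edge takes O(n + m) time in Java (isVertexCover is an illustrative name; edges are int pairs):

    import java.util.Set;

    static boolean isVertexCover(int[][] edges, Set<Integer> w) {
        for (int[] e : edges)
            if (!w.contains(e[0]) && !w.contains(e[1]))
                return false;          // edge e is uncovered
        return true;
    }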

CLIQUE is NP-complete

Reduction from VERTEX-COVER: A graph G has a vertex cover of size K if and only if its complement has a clique of size n-K.

Some Other NP-Complete Problems

SET-COVER: Given a collection of m sets, are there K of these sets whose union is the same as the whole collection of m sets?
  NP-complete by reduction from VERTEX-COVER.
SUBSET-SUM: Given a set of integers and a distinguished integer K, is there a subset of the integers that sums to K?
  NP-complete by reduction from VERTEX-COVER.
0/1 Knapsack: Given a collection of items with weights and benefits, is there a subset of weight at most W and benefit at least K?
  NP-complete by reduction from SUBSET-SUM.
Hamiltonian-Cycle: Given a graph G, is there a cycle in G that visits each vertex exactly once?
  NP-complete by reduction from VERTEX-COVER.
Traveling Salesperson Tour: Given a complete weighted graph G, is there a cycle that visits each vertex and has total cost at most K?
  NP-complete by reduction from Hamiltonian-Cycle.

Approximation Algorithms

Outline and Reading
  Approximation Algorithms for NP-Complete Problems (13.4)
    Approximation ratios
    Polynomial-Time Approximation Schemes (13.4.1)
    2-Approximation for Vertex Cover (13.4.2)
    2-Approximation for TSP special case (13.4.3)
    Log n-Approximation for Set Cover (13.4.4)

Approximation Ratios

Optimization problems:
  We have some problem instance x that has many feasible solutions.
  We are trying to minimize (or maximize) some cost function c(S) for a solution S to x. For example:
    Finding a minimum spanning tree of a graph
    Finding a smallest vertex cover of a graph
    Finding a smallest traveling salesperson tour in a graph
An approximation algorithm produces a solution T:
  T is a k-approximation to the optimal solution OPT if c(T)/c(OPT) ≤ k (assuming a minimization problem; a maximization approximation would be the reverse).

Polynomial-Time Approximation Schemes

A problem L has a polynomial-time approximation scheme (PTAS) if it has a polynomial-time (1+ε)-approximation algorithm, for any fixed ε > 0 (this value can appear in the running time).
0/1 Knapsack has a PTAS, with a running time that is O(n^3/ε). Please see 13.4.1 in Goodrich-Tamassia for details.

A 2-Approximation for Vertex Cover

A vertex cover of a graph G=(V,E) is a subset W of V, such that, for every (a,b) in E, a is in W or b is in W.
OPT-VERTEX-COVER: Given a graph G, find a vertex cover of G with smallest size.
OPT-VERTEX-COVER is NP-hard.

Algorithm VertexCoverApprox(G)
  Input: graph G
  Output: a vertex cover C for G
  C ← empty set
  H ← G
  while H has edges
    e ← H.removeEdge(H.anEdge())
    v ← H.origin(e)
    w ← H.destination(e)
    C.add(v)
    C.add(w)
    for each f incident to v or w
      H.removeEdge(f)
  return C

Analysis:
  Every chosen edge e has both ends in C.
  But e must be covered by an optimal cover; hence, one end of e must be in OPT.
  Thus, there are at most twice as many vertices in C as in OPT; that is, C is a 2-approximation of OPT.
  Running time: O(m).

A 2-Approximation for TSP Special Case

OPT-TSP: Given a complete, weighted graph, find a cycle of minimum cost that visits each vertex.
OPT-TSP is NP-hard.
Special case: the edge weights satisfy the triangle inequality (which is common in many applications):
  w(a,b) + w(b,c) ≥ w(a,c)

Algorithm TSPApprox(G)
  Input: weighted complete graph G, satisfying the triangle inequality
  Output: a TSP tour T for G
  M ← a minimum spanning tree for G
  P ← an Euler tour traversal of M, starting at some vertex s
  T ← empty list
  for each vertex v in P (in traversal order)
    if this is v's first appearance in P then
      T.insertLast(v)
  T.insertLast(s)
  return T

Proof that the output tour T is a 2-approximation:
  The optimal tour OPT is a spanning tour; hence |M| ≤ |OPT| (the MST M costs at most as much as any spanning tour).
  The Euler tour P visits each edge of M twice; hence |P| = 2|M|.
  Each time we shortcut a vertex in the Euler tour we do not increase the total length, by the triangle inequality (w(a,b) + w(b,c) ≥ w(a,c)); hence |T| ≤ |P|.
  Therefore, |T| ≤ |P| = 2|M| ≤ 2|OPT|.

A Log n-Approximation for Set Cover

OPT-SET-COVER: Given a collection of m sets, find the smallest number of them whose union is the same as the whole collection of m sets.
OPT-SET-COVER is NP-hard.
The greedy approach produces an O(log n)-approximation algorithm. See 13.4.4 for details.

Algorithm SetCoverApprox(S)
  Input: a collection of sets S1, ..., Sm
  Output: a subcollection C with the same union
  F ← {S1, S2, ..., Sm}
  C ← empty set
  U ← union of S1, ..., Sm
  while U is not empty
    Si ← set in F with the most elements in U
    F.remove(Si)
    C.add(Si)
    Remove all elements in Si from U
  return C
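A direct Java rendering of SetCoverApprox (setCoverApprox is an illustrative name): repeatedly pick the set covering the most still-uncovered elements. Every uncovered element always lies in some remaining set, so the loop terminates with a cover.

    import java.util.*;

    static <T> List<Set<T>> setCoverApprox(List<Set<T>> sets) {
        Set<T> u = new HashSet<>();                // U = union of S1..Sm
        for (Set<T> s : sets) u.addAll(s);
        List<Set<T>> f = new ArrayList<>(sets);    // F = remaining sets
        List<Set<T>> c = new ArrayList<>();        // C = chosen subcollection
        while (!u.isEmpty()) {
            Set<T> best = null;
            int bestGain = -1;
            for (Set<T> s : f) {                   // set with most elements in U
                int gain = 0;
                for (T e : s) if (u.contains(e)) gain++;
                if (gain > bestGain) { bestGain = gain; best = s; }
            }
            f.remove(best);
            c.add(best);
            u.removeAll(best);                     // remove the covered elements
        }
        return c;
    }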
