Discover millions of ebooks, audiobooks, and so much more with a free trial

Only $11.99/month after trial. Cancel anytime.

Combinatorial Algorithms: Enlarged Second Edition
Combinatorial Algorithms: Enlarged Second Edition
Combinatorial Algorithms: Enlarged Second Edition
Ebook558 pages9 hours

Combinatorial Algorithms: Enlarged Second Edition

Rating: 3.5 out of 5 stars

3.5/5

()

Read preview

About this ebook

Newly enlarged, updated second edition of a valuable text presents algorithms for shortest paths, maximum flows, dynamic programming and backtracking. Also discusses binary trees, heuristic and near optimums, matrix multiplication, and NP-complete problems. 153 black-and-white illus. 23 tables.
Newly enlarged, updated second edition of a valuable, widely used text presents algorithms for shortest paths, maximum flows, dynamic programming and backtracking. Also discussed are binary trees, heuristic and near optimums, matrix multiplication, and NP-complete problems. New to this edition: Chapter 9 shows how to mix known algorithms and create new ones, while Chapter 10 presents the "Chop-Sticks" algorithm, used to obtain all minimum cuts in an undirected network without applying traditional maximum flow techniques. This algorithm has led to the new mathematical specialty of network algebra. The text assumes no background in linear programming or advanced data structure, and most of the material is suitable for undergraduates. 153 black-and-white illus. 23 tables. Exercises, with answers at the ends of chapters.
LanguageEnglish
Release dateApr 26, 2012
ISBN9780486152943
Combinatorial Algorithms: Enlarged Second Edition

Related to Combinatorial Algorithms

Related ebooks

Mathematics For You

View More

Related articles

Reviews for Combinatorial Algorithms

Rating: 3.5 out of 5 stars
3.5/5

2 ratings0 reviews

What did you think?

Tap to rate

Review must be at least 10 words

    Book preview

    Combinatorial Algorithms - T. C. Hu

    INDEX

    CHAPTER 1.

    SHORTEST PATHS

    There is no shortest path to success.

    § 1.1 GRAPH TERMINOLOGY

    When we try to solve a problem, we often draw a graph. A graph is often the simplest and easiest way to describe a system, a structure, or a situation. The Chinese proverb A picture is worth one thousand words is certainly true in mathematical modeling. This is why graph theory has a wide variety of applications in physical, biological, and social sciences. Due to the wide variety of applications, we also have diverse terminology. Papers on graph theory are full of definitions, and every author has his own definitions. Here, we introduce a minimum number of definitions which are intuitively obvious. The notation and terminology adopted here is similar to that of Knuth [18].

    A graph consists of a finite set of vertices and a set of edges joining the vertices. We shall draw small circles to represent vertices and lines to represent edges. A system or a structure can often be represented by a graph where the lines indicate the relations among the vertices (the elements of the system). For example, we can use vertices to represent cities and edges to represent the highways connecting the cities. We can also use vertices to represent persons and draw an edge joining two vertices if the two persons know each other.

    The reader should keep in mind that graph theory is a theory of relations, not a theory of definitions; however, a minimum number of definitions is needed here. Vertices are also called nodes, and edges are also called arcs, branches, or links. We usually assume that there are n vertices in the graph G and at most one edge joining any two vertices and there is no edge joining a node to itself. The vertices are denoted by Vi (i = 1,2,...,n) and the edge joining Vi and Vj is denoted by eij. Two vertices are adjacent if they are joined by an edge (the two vertices are also called neighbors); two edges are adjacent if they are both incident to the same vertex. A vertex is of degree k if there are k edges incident to it.

    A sequence of vertices and edges

    (Vi, e12, V2, e23, V3, ..., Vn)

    is said to form a path from V1 to Vn. We can represent a path by only its vertices as

    (V1, V2, ..., Vn)

    or by only the edges in the path as

    (e12, e23, ..., en−1,n).

    A graph is connected if there is a path between any two nodes of the graph. A path is of length k if there are k edges in the path. A path is a simple path if all the vertices V1, V2, ..., Vn−1, Vn are distinct. If V1 = Vn, then it is called a cycle. In other words, a cycle is a path of length three or more from a vertex to itself. If all vertices in a cycle are distinct, then the cycle is a simple cycle. Unless otherwise stated, we shall use the word path to mean a simple path, cycle to mean a simple cycle, and graph to mean a connected graph.

    If an edge has a direction (just like a street may be a one-way street), then it is called a directed edge. If a directed edge is from Vi to Vj, then we cannot follow this edge from Vj to Vi. Thus in the definition of a path, we want an edge to be undirected or to be a directed edge from Vi to Vi+1. In all other definitions, the directions of edges are ignored. A graph is called a directed graph if all edges are directed and a mixed graph if some edges are directed and some are not. A cycle formed by directed edges is called a directed cycle (or circuit). A directed graph is called acyclic if there are no directed cycles. The words graph and edge are used for an undirected graph and an undirected edge throughout this section.

    A tree is a connected graph with no cycles. If a graph has n vertices, then any two of the following conditions characterize a tree and automatically imply the third condition.

    The graph G is connected.

    The graph has n-1 edges.

    The graph contains no cycles.

    We shall denote a graph by G = (V; E) where V is the set of nodes or vertices, and E is the set of edges in the graph. A graph G′ = (V′; E′) is a subgraph of G = (V; E) if V′ ⊆ V and E′ ⊆ E.

    A subgraph which is a tree and which contains all the vertices of a graph is called a spanning tree of the graph. We shall illustrate these intuitive definitions of graph theory in Figure 1.1.

    Figure 1.1

    There are three paths between V1 and V5, namely (V1, V2, V4, V5), (V1, V4, V5), (V1, V3, V4, V5). The edges e14, e24, e34, and e45 form a spanning tree and so do the edges e12, e24, e34, and e45. Also, we may pick e12, e13, e34, and e45 to be the spanning tree. Here the node V1 is of degree 3 in the graph G but is of degree 2 in the last spanning tree. If the edge e45 was directed from V4 to V5, then there are still three paths from V1 to V5 but none from V5 to V1.

    In most applications, we associate numbers with edges or vertices. Then the graph is called a network. All the definitions of graph theory apply to networks as well. In network theory, we usually use nodes and arcs instead of vertices and edges.

    §1.2 SHORTEST PATH

    One of the fundamental problems in network theory is to find shortest paths in a network. Each arc of the network has a number which is the length of the arc.

    In most cases, the arcs have positive lengths, but the arcs may have negative lengths in some applications. For example, the nodes may represent the various states of a physical system, where the length associated with the arc eij denotes the energy absorbed in transforming the state Vi to the state Vj. An arc with negative length then indicates that energy is released in transforming the state Vi into the state Vj. If the total length of a circuit or cycle is negative, we say that the network contains a negative circuit.

    The length of a path is the sum of lengths of all the arcs in the path. There are usually many paths between a pair of nodes, say Vs and Vt, but a path with the minimum length is called a shortest path from Vs to Vt.

    The problem of finding a shortest path is a fundamental problem and often occurs as a subproblem of other optimization problems. In some applications, the numbers associated with arcs may represent characteristics other than lengths and we may want optimum paths where optimum is defined by a different criterion. But the shortest path problem is the most common problem in the whole class of optimum path problems. The shortest path algorithm can usually be modified slightly to find other optimum paths. Thus we shall concentrate on the shortest paths.

    If we denote a path from V1 to Vk by (V1, V2, ..., Vk), then ei.i+1 must be either a directed arc from Vi to Vi+1 or an undirected arc joining Vi and Vi+1 (i = 1,...,k-1). In most applications, we can think of an undirected arc between Vi and Vj as two directed arcs, one from Vi to Vj and the other from Vj to Vi. We usually are interested in three kinds of shortest-path problems:

    The shortest path from one node to another.

    The shortest paths from one node to all the other nodes.

    The shortest paths between all pairs of nodes.

    Since all algorithms solving Problem (1) and Problem (2) are essentially the same, we shall discuss the problem of finding shortest paths from one node to all other nodes in the network.

    The problem of finding shortest paths is well-defined if the network does not contain a negative cycle (or a negative circuit). Note that a network can have some directed arcs with negative length and yet does not contain a negative cycle. We shall first study the case that all arcs have positive length.

    Let us denote the length of the arc from Vi to Vj by dij, and assume that

    For convenience, we assume that dij = ∞ if there is no arc leading from Vi to Vj and dii = 0 for all i.

    Condition 3 makes the shortest-path problem nontrivial. Otherwise, the shortest path from Vi to Vj consists of the single arc eij.

    Assume that there are n nodes in the network and we want the shortest paths from V0 to Vi (i = 1, 2, ..., n−1). If there are two or more shortest paths from V0 to a node, then any one path is equally acceptable.

    Usually, we would like to know the length of a shortest path as well as the intermediate nodes in the path.

    Let us make some observations first. Let Pk be a path from V0 to Vk, where Vi is an intermediate node on the path. Then the subpath from V0 to Vi contains fewer arcs than the path Pk. Since all arcs have positive lengths, the subpath must be shorter than Pk. We state this as observation 1.

    Observation 1. The length of a path is always longer than the length of any of its subpaths. (Note this is true only when all arcs are positive.)

    Let Vi be an intermediate node on the path Pk (from V0 to Vk). If the path Pk is a shortest path, then the subpath from V0 to Vi must itself be a shortest path. Otherwise, a shorter path to Vi followed by the original route from Vi to Vk constitutes a path shorter than Pk. This would contradict that Pk is a shortest path. We state this as observation 2.

    Observation 2. Any subpath (of a shortest path) must itself be a shortest path.

    (Note that this does not depend on arcs having positive lengths.)

    Observation 3. Any shortest path contains at most n−1 arcs. (This depends on arcs not forming negative cycles and that there are n nodes in the network.)

    Based on these three observations, we can develop an algorithm to find the shortest paths from V0 to all the other nodes in the network.

    Imagine that all shortest paths from V0 to all other nodes have been ordered according to their path lengths. For convenience of discussion, we can rename the nodes such that the shortest path to V1 is the shortest among all shortest paths. We shall write

    Pn-1

    to denote that the lengths of these paths are monotonically increasing.

    The algorithm will find P1 first, P2 second,..., until the longest of the shortest paths is found.

    Let us motivate the ideas behind the algorithm. How many arcs are in the path P1? If P1 contains more than one arc, then it contains a subpath which is shorter than P1 (observation 1). Thus P1 must contain only one arc.

    If Pk contains more than k arcs, then it contains at least k intermediate nodes on the path. Each of the subpaths to an intermediate node must be shorter than Pk, and we would have k paths shorter than Pk, a contradiction. Thus the shortest path Pk contains at most k arcs. We shall state this as observation 4. Observation 4. The shortest path Pk contains at most k arcs.

    To find P1, we need only examine one-arc paths; the minimum among these must be P1.

    To find P2, we need only one-arc or two-arc paths. The minimum among these must be P2. If P2 is a two-arc path where the last arc is ej2 (j≠1), then the single arc e0j is a subpath of P2 and hence shorter than P2. Thus, the path P2 must be either a one-arc path or a two-arc path where the arc e12 is the last arc on the path P2.

    In what follows, we shall write numbers on the nodes and call these numbers labels. There are two kinds of labels, temporary and permanent labels. The permanent label on a node is the true shortest distance from the origin V0 to that node. A temporary label is the length of a path from the origin to that node. Since the path may or may not be a shortest path, a temporary label is an upper bound on the true shortest distance.

    When we search for P1, we write on every node Vi the length of the arc d0i. These are called the temporary labels of Vi (since these labels may be changed later). Among all the temporary labels, we select the minimum and change the label to be permanent. Thus we have V1 permanently labelled. (A node permanently labelled is called a permanent node.)

    To find P2, we do not have to find all two-arc paths; only those where the first arc is e01. All the lengths of one-arc paths have already been written on the nodes as temporary labels. So we can compare d0i (the length of one arc) with d01 + d1i (the length of the two-arc path), and the minimum of the two is written on Vi as the temporary label of Vj. Then among all temporary labels, the minimum is P2.

    The permanent label on a node Vi indicates the true shortest distance from V0 to Vi. The temporary label on a node Vj indicates either the distance of the arc e0j or the distance of a path from V0 to a permanent node Vi followed by the arc eij.

    Imagine that all arcs of the network were colored green. Whenever an arc is used in a shortest path, we recolor it brown. So we use one brown arc to reach V1, one or two brown arcs to reach V2, ..., at most k brown arcs to reach Vk. (The choice of color is such that the brown arcs form a tree, see EX. 2.) We see that the path Pk+1 cannot contain nodes with temporary labels as intermediate nodes. Thus we can limit our search to those paths consisting of a sequence of brown arcs followed by one green arc reaching the node Vk+1. Two or more green arcs indicate a subpath of shorter distance than Pk+1.

    To find the path Pk+1 containing one green arc and possibly some brown arcs, we limit our search to the neighbors of V0, V1, ..., Vk. The search is made easy if we adopt the following rule:

    +dij is less than the current temporary label of Vj. If it is less, we shall replace the current temporary label by the smaller value. If not, we leave the temporary label unchanged.

    To find Pk+1, we just find the minimum of temporary labels of all neighbors of V0, V1,..., Vk and change the minimum to a permanent label.

    Now we can formalize the algorithm and use it in a numerical example. We will use lto denote the true shortest distances.

    Dijkstra’s Algorithm

    Step 0.

    All nodes Vi receive temporary labels li with value equal to d0i (i = 1,2,...,n−1). For convenience, we can take d0i = ∞ if there is no arc joining V0 and Vi. Set li = d0i (i = 1,...,n−1)

    Step 1.

    Among all temporary labels li

    Change l.

    Stop if there is no temporary label left.

    Step 2.

    Let Vk be the node that just received a permanent label in Step 1.

    Replace all temporary labels of the neighbors of Vk by the following rule:

    Return to Step 1.

    Consider the network shown in Figure 1.2 where the numbers are arc lengths.

    Figure 1.2

    We shall put temporary labels inside each node and when the label becomes permanent, we shall add a star on the number. Whenever arcs are used in a shortest path, we shall use heavy lines to represent them.

    Step 0.

    All nodes receive temporary labels equal to d0i, and the node V0 gets permanent label 0. This is shown in Figure 1.3.

    Step 1.

    Among all temporary labels, V3 has the minimum value 2, so V3 receives a permanent label.

    Step 2.

    The node V3 has neighbors V2 and V5

    The result is shown in Figure 1.4.

    Figure 1.3

    Figure 1.4

    Step 1.

    Among all temporary labels, the node V2 has the minimum label 3. So V2 receives a permanent label.

    Step 2.

    The neighbors of V2 are V1, V7. (Note V3 is also a neighbor, but since V3 has become permanent it is excluded.)

    Step 1.

    The node V1 receives the permanent label 4.

    Step 2.

    This is shown in Figure 1.5.

    Step 1.

    V4 gets a permanent label.

    Step 2.

    Figure 1.5

    Step 1.

    V6 gets a permanent label.

    Step 2.

    Step 1.

    V5 gets a permanent label.

    Step 2.

    Step 1.

    V7 gets a permanent label.

    The final result is shown in Figure 1.6.

    Figure 1.6

    We have computed the shortest distances from V0 to all other nodes in the network, but we have not given the shortest paths which achieve these shortest distances. This situation is like knowing that one can drive from New York to Los Angeles in 72 hours but not knowing which route one should use. One way to trace the intermediate nodes is as follows. If the permanent labels are written on the nodes, we can look for all neighbors of a node Vj to see which neighbor has a label that differs from Vj by exactly the length of the connecting arc. In this way, we can trace back from every node the shortest path from the origin to that node. In Figure 1.7, we have written two numbers on every node. The first number is the permanent label indicating the true shortest distance from the origin to that node; the second number indicates the last intermediate node on the shortest path.

    Figure 1.7

    Since the algorithm consists of comparisons and additions, we shall count the number of comparisons and additions.

    There are n−2 comparisons the first time and n−3 comparisons the second time, so that there are (n−2) + (n−3)+...+ 1 = (n−1)(n−2)/2 comparisons in Step 1. Similarly, there are (n−1) (n−2)/2 additions and the same number of comparisons in Step 2. Thus, the algorithm is 0(n²). Since there are 0(n²) arcs in a network of n nodes and every arc must be examined at least once, it is safe to say that there exists no algorithm which needs 0(n log n) steps in general.

    We shall discuss the problem of finding shortest paths from one origin to each of the other nodes when the network has negative arcs in section 1.6. Next, we turn to the problem of finding shortest paths between all pairs of nodes.

    §1.3 MULTITERMINAL SHORTEST PATHS

    In this section, we look for shortest paths between all pairs of nodes in a network. We allow the arcs to have negative length as long as there is no negative circuit. For computational purposes, we can regard an undirected arc as two directed arcs both with the same length. Thus, a negative undirected arc is equivalent to a negative cycle formed by two directed arcs. If there is only one directed arc with length dij leading from the node Vi to the node Vj, then we can imagine that there is another arc from Vj to Vi with dji = ∞.

    As we have mentioned in section 1.2, any subpath of a shortest path must itself be the shortest. Let

    eij, ejk, ekl, ..., epq

    be a shortest path from Vi to Vq. Then the shortest path from Vi to Vj must be the single arc eij, the shortest path from Vj to Vk must be the single arc ejk, and so forth. We define an arc eij to be a basic arc if it is the shortest path from Vi to Vj. With this definition, we see that a shortest path must consist of basic arcs only. Also the brown arcs defined in section 1.2 are basic arcs but not all basic arcs are brown arcs.

    We will present an algorithm which replaces all nonbasic arcs by basic arcs. In other words, we create an arc between each pair of nodes not connected by a basic arc. The length of the created arc is equal to the shortest distance between the two nodes. Let us consider a simple operation defined for a given node Vj:

    (1)

    We perform this operation for a fixed j and all possible i, k ≠ j. For three nodes Vi, Vj, and Vk and the three arcs with lengths dik, dij, and djk, the operation compares the length of an arc eik with the length of a path of two arcs with the intermediate node Vj. This situation is shown in Figure 1.8. The operation (1) is called a triple operation.

    Figure 1.8

    We try all possible pairs of nodes Vi and Vk which are neighbors of Vj.

    dij + djk , we do not change anything.

    If

    dik > dij + djk,

    we create a new arc from Vi to Vk with dik = dij + djk. We first fix j=1 and perform the triple operation (1) for all i,k = 2,3,...,n. Then we fix j=2 and perform the triple operation (1) for all i,k = 1,3,...,n. Note that when j =2, all the new arcs created when j=1 are used. We claim that once all triple operations are computed for j =n, the network consists of just basic arcs. That is to say, the number associated with every directed arc leading from Vp to Vq represents the shortest distance from Vp to Vq. Now we shall justify this assertion.

    Take any shortest path from Vp to Vq in the original network. The shortest path must consist of basic arcs in the original network. If the triple operation would create an arc with its length equal to the sum of lengths of all basic arcs in the shortest path, then this would prove that the algorithm is correct.

    Let us consider any shortest path such as the one shown in Figure 1.9.

    Figure 1.9

    When j=1, the triple operation creates a new arc with d63 = d61 + d13.

    When j=2, the triple operation does not affect this particular shortest path.

    When j=3, the arc with d63 = d61 + d13 has already been created; thus, the triple operation will create an arc with d69 = d63 + d39 = (d61 + d13) + d39.

    When j=4, an arc with d26 = d24 + d46 is created.

    When j=6, an arc with d29 = d26 + d69

    =(d24 + d46) + (d63 + d39)

    =d24 + d46 +d61 +d13 + d39

    is created. In Figure 1.9, we show these basic arcs created successively by dashed lines. The numbers beside these dashed lines indicate the order in which they are created. Thus, the basic arc e63 is created first, and the basic arc e69 is created second. Some basic arcs, such as e41, will also be created by triple operations, but they are not shown in the figure since they are irrelevant to the shortest path that we are considering. Note that any of the arcs created by the triple operation cannot be replaced by another arc or a path of shorter distance. If so, it would contradict that the original path is a shortest path. If there are no negative cycles, then any shortest path must be a simple path and consist of at most n-1 arcs and at most n-2 distinct intermediate nodes. We have shown that the algorithm works for a particular shortest path, but we can easily generalize the idea to any arbitrary shortest path and supply a formal proof.

    This algorithm is easy to program on a computer. The arc lengths of an n-node network are given by an n×n matrix. For example, a network is shown in Figure 1.10 and its distance matrix in Table 1.1.

    When j=1, we compare every entry dik (i≠1, k≠1) with (di1 + d1k). If the entry dik is greater than the sum of di1 and dik, then the entry dik is replaced by the sum (di1 + d1k). Otherwise the entry remains the same. Likewise, when j=2, we compare every entry dik (i ≠ 2, k ≠ 2) with (di2 + d2k). The minimum of dik and (di2 + d2k) becomes the new value of dik. Note that in the above computation for j=2, we have already used the results for j=1. For a fixed value of j, we have to check entries in a (n−1)×(n−1) matrix (the diagonal entries are always zero). Each entry is compared with the sum of two other entries, one in the same row and one in the same column. The algorithm is completed when we finish the computation for j =n.

    Figure 1.10

    Table 1.1

    Let us do the triple operation for j = 1, 2, 3, 4, 5 and only record the computation where there is a change in the data. Note all the lengths of an undirected network are symmetric, so we shall only calculate one of the lengths.

    For j=1,

    For j=2,

    For j=3,

    Nothing is changed.

    For j=4,

    For j=5,

    Nothing is changed.

    Note that d25 was first ∞, then 5, and finally 2. But when d25 = 5, it is never used as an arc in any shortest path. Any arc with length equal to the sum of lengths of basic arcs in a shortest path cannot be replaced by another arc of shorter length. Otherwise, it will contradict the assumption that the path is shortest. The matrix of shortest distances between all pairs of nodes is shown in Table 1.2.

    Having found the shortest distances between every pair of nodes, we still have to find the intermediate nodes in a shortest path. To keep track of the intermediate nodes, we use a matrix [pik], where the entry in the ith row and kth column indicates the first intermediate node on the path from Vi to Vk. If Pik = j, then the shortest path is Vi, Vj, ..., Vk. Then if Pjk = s, the shortest path is Vi, Vj, Vs, ..., Vk. Initially, we set Pik = k for all i,k. For example, corresponding to Table 1.1, we have Table 1.3.

    This means each arc is assumed to be a basic arc (until proven otherwise), and in order to go from Vi to Vk the first intermediate node is Vk itself.

    Table 1.2

    Table 1.3

    When we perform the triple operation on Table 1.1, we also update the information in Table 1.3. The entries in Table 1.3 are changed according to the following rule.

    (2)

    For example, when we set d25 to d21 + d15 = 5, we also set

    p25 = p21 = 1.

    When we set d14 to d12 + d24 = 2, we also set

    P14 = p12 = 2.

    When we set d15 to d14 + d45, we also set

    p15 = p14 = p12 = 2.

    When we set d25 to d24 + d45, we also set

    p25 = p24 = 4.

    So at the end of the computation, we have

    p15 = 2, p25 = 4.

    Since the arc e45 is a basic arc, p45 = 5 throughout the computation.

    p15 = 2 means V2 is the first intermediate node in going from V1 to V5,

    p25 = 4 means V4 is the first intermediate node in going from V2 to V5,

    p45 = 5 means V5 is the first intermediate node in going from V4 to V5.

    Thus we can trace out all the intermediate nodes from V1 to V5 as V1, V2, V4 and V5.

    In general, to find the shortest path from Vs to Vt, we look at the entry pst and find the first intermediate node.

    If pst = a, we look for pat = ?

    If pat = b, we look for pbt = ?

    This

    Enjoying the preview?
    Page 1 of 1