
Fundamentals of Algorithms

Unit 8

Additional Features

Structure
8.1 Introduction
    Objectives
8.2 Lower Boundary Theory
8.3 NP Hard and NP Complete Problems
8.4 Approximate Algorithms for NP Hard Problems
    Absolute Approximation
    ε-Approximation
    Polynomial Time Approximation Schemes
8.5 Summary
8.6 Terminal Questions
8.7 Answers

8.1 Introduction
In the previous units we studied different algorithm design techniques. The purpose of this unit is to expose you to some techniques that have been used to establish that a given algorithm is the most efficient possible. The way this is done is by discovering a function g(n) that is a lower bound on the time that any algorithm must take to solve the given problem. If we have an algorithm whose computing time is of the same order as g(n), then we know that asymptotically we can do no better.

Objectives
After studying this unit, you should be able to:
- apply the Lower Boundary theory
- use comparison trees for searching and sorting
- apply approximate algorithms for NP hard problems
- use approximation in bin packing
- analyse polynomial time approximation schemes.

8.2 Lower Boundary Theory


If f(n) is the time for some algorithm, then we write f(n) = Ω(g(n)) to mean that g(n) is a lower bound for f(n). Formally, f(n) = Ω(g(n)) if there exist positive constants c and n0 such that |f(n)| ≥ c|g(n)| for all n > n0.
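
As a quick numerical check of this definition, the sketch below verifies the inequality for a hypothetical f(n) = 3n^2 + 2n and g(n) = n^2; the constants c = 3 and n0 = 1 are our own choice, made only for this illustration.

# Spot-check f(n) = Ω(g(n)) for f(n) = 3n^2 + 2n and g(n) = n^2.
# The constants c = 3 and n0 = 1 are assumptions for this illustration only.
def f(n): return 3 * n * n + 2 * n
def g(n): return n * n

c, n0 = 3, 1
print(all(f(n) >= c * g(n) for n in range(n0 + 1, 1000)))   # prints True
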
In addition to developing lower bounds to within a constant factor, we are also concerned with determining more exact bounds whenever this is possible. For many problems it is easy to observe that a lower bound of Ω(n) exists, where n is the number of inputs to the problem. Consider all algorithms that find the maximum of an unordered set of n integers. Clearly, every integer must be examined at least once, so Ω(n) is a lower bound for any algorithm that solves this problem.

Comparison Trees for Searching and Sorting
Below, we study some of the comparison trees used for searching and sorting.

Ordered Searching: In obtaining the lower bound for the ordered searching problem, we consider only those comparison-based algorithms in which every comparison between two elements of S is of the type "compare x and A[i]". The progress of any searching algorithm that satisfies this restriction can be described by a path in a binary tree. Each internal node in this tree represents a comparison between x and A[i]. There are three possible outcomes of this comparison: x < A[i], x = A[i], or x > A[i]. We can assume that if x = A[i], the algorithm terminates. The left branch is taken if x < A[i], and the right branch is taken if x > A[i]. If the algorithm terminates following a left or right branch, then no i has been found such that x = A[i], and the algorithm must declare the search unsuccessful.

Theorem: Let A[1:n], n ≥ 1, contain n distinct elements ordered so that A[1] < A[2] < ... < A[n]. Let FIND(n) be the minimum number of comparisons needed, in the worst case, by any comparison-based algorithm to recognize whether x ∈ A[1:n]. Then FIND(n) ≥ ⌈log(n+1)⌉.

Proof: Consider all possible comparison trees that model algorithms to solve the searching problem. The value of FIND(n) is bounded below by the distance of the longest path from the root to a leaf in such a tree. There must be n internal nodes in all these trees, corresponding to the n possible successful occurrences of x in A. If all internal nodes of a binary tree are at levels less than or equal to k, then there are at most 2^k − 1 internal nodes. Thus n ≤ 2^k − 1 and FIND(n) = k ≥ ⌈log(n+1)⌉.
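
Binary search meets this bound. The sketch below (our own illustration, not taken from the text) counts the three-way comparisons "compare x and A[i]" charged by the comparison-tree model; for a sorted array of n = 15 elements the worst case is exactly ⌈log(n+1)⌉ = 4.

# Binary search over a sorted array, counting three-way comparisons with A[mid].
import math

def find(A, x):
    low, high, comparisons = 0, len(A) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1            # one comparison of x with A[mid]
        if x == A[mid]:
            return True, comparisons
        elif x < A[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return False, comparisons

A = list(range(1, 16))              # n = 15, so ceil(log2(n + 1)) = 4
worst = max(find(A, x)[1] for x in range(0, 17))
print(worst, math.ceil(math.log2(len(A) + 1)))   # 4 4
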
Sorting: Now, let us consider the sorting problem. We can describe any sorting algorithm that satisfies the restrictions of the comparison tree model by a binary tree. Consider the case in which the n numbers A[1:n] to be sorted are distinct. Any comparison between A[i] and A[j] must result in one of two possibilities: either A[i] < A[j] or A[i] > A[j]. So, the comparison tree is a binary tree in which each internal node is labeled by the pair i : j, which represents the comparison of A[i] with A[j]. If A[i] is less than A[j], then the algorithm proceeds down the left branch of the tree; otherwise, it proceeds down the right branch. The external nodes represent termination of the algorithm. Associated with every path from the root to an external node is a unique permutation. To see that this permutation is unique, note that the algorithms we allow are only permitted to move data and make comparisons. The data movement on any path from the root to an external node is the same no matter what the initial input values are. As there are n! different possible permutations of n items, and any one of these might legitimately be the only correct answer for the sorting problem on a given instance, the comparison tree must have at least n! external nodes.

Fig. 8.1 shows a comparison tree for sorting three items. The first comparison is A[1] : A[2]. If A[1] is less than A[2], then the next comparison is A[2] with A[3]. If A[2] is less than A[3], then the left branch leads to an external node containing 1, 2, 3. This implies that the original set was already sorted, since A[1] < A[2] < A[3]. The other five external nodes correspond to the other possible orderings that could yield a sorted set.

[Figure: a binary comparison tree. The root compares 1:2, its children compare 2:3, the deeper internal nodes compare 1:3, and the six external nodes hold the permutations 1,2,3; 1,3,2; 3,1,2; 2,1,3; 2,3,1; and 3,2,1.]

Fig. 8.1: A Comparison Tree for sorting three items

We consider the worst case for all comparison-based sorting algorithms. Let T(n) be the minimum number of comparisons that are sufficient to sort n items in the worst case. Using our knowledge of binary trees once again, if all internal nodes are at levels less than k, then there are at most 2^k external nodes. Therefore, letting k = T(n), we get n! ≤ 2^T(n). Since T(n) is an integer, we obtain the lower bound T(n) ≥ ⌈log n!⌉.
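
This bound can be checked numerically. The sketch below (our own illustration, not from the text) tabulates ⌈log2 n!⌉ next to the worst-case comparison count of merge sort, obtained from the recurrence W(n) = W(⌈n/2⌉) + W(⌊n/2⌋) + n − 1, showing that a real comparison sort stays close to, but never below, the lower bound.

# Compare the information-theoretic bound ceil(log2(n!)) with the worst-case
# comparison count of top-down merge sort.
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def W(n):                      # worst-case comparisons of top-down merge sort
    if n <= 1:
        return 0
    return W(n // 2) + W(n - n // 2) + n - 1

for n in (4, 8, 10, 16):
    print(n, math.ceil(math.log2(math.factorial(n))), W(n))
# n = 10: lower bound 22, merge sort worst case 25
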

Self Assessment Question
1. In obtaining the lower bound for the ordered searching problem, we consider only those comparison-based algorithms in which every comparison between two elements of S is of the type __________.

8.3 NP Hard and NP Complete Problems


Basic Concepts
In this unit, we are concerned with the distinction between problems that can be solved by a polynomial time algorithm and problems for which no polynomial time algorithm is known. It is an unexplained phenomenon that for many of the problems we know and study, the best algorithms for their solutions have computing times that cluster into two groups. The first group consists of problems whose solution times are bounded by polynomials of small degree.
The second group is made up of problems whose best-known algorithms are non-polynomial. Examples we have seen include the traveling salesperson and the knapsack problems, for which the best algorithms given in this text have complexities O(n^2 2^n) and O(2^(n/2)) respectively. In the quest to develop efficient algorithms, no one has been able to develop a polynomial time algorithm for any problem in the second group. This is very important because algorithms whose computing times are greater than polynomial very quickly require such vast amounts of time to execute that even moderate-size problems cannot be solved. The theory of NP-completeness, which we present here, does not provide a method of obtaining polynomial time algorithms for problems in the second group. Nor does it say that algorithms of this complexity do not exist. Instead, what we do is show that many of the problems for which there are no known polynomial time algorithms are computationally related. In fact, we establish two classes of problems. These are given names, NP-hard and NP-Complete. A problem that is NP-Complete has the property that it can be solved in polynomial time if and only if all other NP-Complete problems can also be solved in polynomial time. If an NP-hard problem can be solved in polynomial time, then all NP-Complete problems can be solved in polynomial time. All NP-Complete problems are NP-hard, but some NP-hard problems are not known to be NP-Complete.
[Figure: Venn diagram of the classes NP, NP-hard, and NP-Complete; the NP-Complete problems are those that are both in NP and NP-hard.]

Fig. 8.2: Venn diagram

Although one can define many distinct problem classes having the properties stated above for the NP-hard and NP-Complete classes, the classes we study are related to non-deterministic computations. The relationship of these classes to non-deterministic computations, together with the apparent power of non-determinism, leads to the intuitive conclusion that no NP-Complete or NP-hard problem is polynomially solvable.

Cook's Theorem
Cook's theorem states that satisfiability is in P if and only if P = NP. We now prove this important theorem. We know that satisfiability is in NP. Hence, if P = NP, then satisfiability is in P. It remains to be shown that if satisfiability is in P, then P = NP. To do this, we show how to obtain from any polynomial time non-deterministic decision algorithm A and input I a formula Q(A, I) such that Q is satisfiable if and only if A has a successful termination with input I. If the length of I is n and the time complexity of A is p(n) for some polynomial p(), then the length of Q is O(p^3(n) log n) = O(p^4(n)). The time needed to construct Q is also O(p^3(n) log n). A deterministic algorithm Z to determine the outcome of A on any input I can be easily obtained. Algorithm Z simply computes Q and then uses a deterministic algorithm for the satisfiability problem to determine whether Q is satisfiable. If O(q(m)) is the time needed to determine whether a formula of length m is satisfiable, then the complexity of Z is O(p^3(n) log n + q(p^3(n) log n)). If satisfiability is in P, then q(m) is a polynomial function of m and the complexity of Z becomes O(r(n)) for some polynomial r(). Hence, if satisfiability is in P, then for every non-deterministic algorithm A in NP we can obtain a deterministic algorithm Z in P. So, the above construction shows that if satisfiability is in P, then P = NP.
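
Algorithm Z above needs some deterministic decision procedure for satisfiability; whether a polynomial time one exists is exactly the open question. The sketch below is only a brute-force illustration of such a procedure (exponential in the number of variables), using a hypothetical clause-list representation of a CNF formula that we assume for this example.

# Brute-force satisfiability check for a CNF formula given as a list of clauses;
# each clause is a list of literals, where literal k means variable k and -k its negation.
from itertools import product

def satisfiable(clauses, num_vars):
    for bits in product([False, True], repeat=num_vars):
        # bits[k-1] is the truth value assigned to variable k
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return True
    return False

# (x1 + x2)(~x1 + x3)(~x2 + ~x3) is satisfiable, e.g. x1 = 1, x2 = 0, x3 = 1
print(satisfiable([[1, 2], [-1, 3], [-2, -3]], 3))   # True
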

Some Theorems
Definition: A K-clique in a graph G is a complete subgraph of G with K vertices. Recall that an expression in CNF (conjunctive normal form) is a product of sums of literals.

Theorem 1: CNF satisfiability is polynomially transformable to the clique problem. Therefore, the clique problem is NP-Complete.
Proof: Let F = F1 F2 ... Fq be an expression in CNF, where the Fi's are the factors. Each Fi is of the form (xi1 + xi2 + ... + xik), where each xij is a literal.

Construct an undirected graph G = (V, E) whose vertices are pairs of integers [i, j] for 1 ≤ i ≤ q and 1 ≤ j ≤ k. The first component identifies a factor and the second component identifies a literal within that factor. Two vertices [i, j] and [l, m] are joined by an edge if and only if i ≠ l and the literals xij and xlm are not complements of each other. Then F is satisfiable if and only if G has a clique of size q. Since the graph can be constructed in time polynomial in the length of F, the transformation is polynomial, and the clique problem is NP-Complete.

Definition: Let G = (V, E) be an undirected graph. A vertex cover of G is a subset S ⊆ V such that each edge of G is incident upon some vertex in S.

Theorem 2: The clique problem is polynomially transformable to the vertex cover problem. Therefore, the vertex cover problem is NP-Complete.
Proof: Given an undirected graph G = (V, E), consider the complement graph G′ = (V, E′), where E′ = {(v, w) | v, w ∈ V, v ≠ w, and (v, w) ∉ E}. A set S ⊆ V is a clique in G if and only if V − S is a vertex cover of G′. If S is a clique in G, no edge in G′ connects two vertices in S. Thus, every edge in G′ is incident upon at least one vertex in V − S, implying that V − S is a vertex cover of G′. Similarly, if V − S is a vertex cover of G′, then every edge of G′ is incident upon at least one vertex of V − S. Thus, no edge of G′ connects two vertices in S. Therefore, every pair of vertices of S is connected in G, and S is a clique in G. To find whether G has a clique of size k, we construct G′ and determine whether it has a vertex cover of size |V| − k. Given a standard representation of G = (V, E) and k, we can find a representation of G′ and |V| − k in time polynomial in the length of the representation of G and k.

Self Assessment Questions
2. The first group consists of problems whose solution times are bounded by polynomials of __________.
3. A K-clique in a graph G is a complete subgraph of G with __________.
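
The relationship used in the proof of Theorem 2 is easy to check computationally. The sketch below (our own illustration, on a small graph chosen only for the example) builds the complement graph and verifies that S is a clique in G exactly when V − S is a vertex cover of G′.

# Check "S is a clique in G  iff  V - S is a vertex cover of the complement G'".
from itertools import combinations

V = {1, 2, 3, 4}
E = {(1, 2), (1, 3), (2, 3), (3, 4)}                   # the graph G
E_comp = {(v, w) for v, w in combinations(sorted(V), 2)
          if (v, w) not in E and (w, v) not in E}      # the complement graph G'

def is_clique(S):
    return all((v, w) in E or (w, v) in E for v, w in combinations(sorted(S), 2))

def is_vertex_cover(C):
    return all(v in C or w in C for v, w in E_comp)

for r in range(len(V) + 1):
    for S in combinations(sorted(V), r):
        assert is_clique(S) == is_vertex_cover(V - set(S))
print("clique / vertex-cover equivalence verified")
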

8.4 Approximate Algorithms for NP Hard Problems


Many NP-hard optimization problems have great practical importance, and it is desirable to solve large instances of these problems in a reasonable amount of time.
The best-known algorithms for NP-hard problems have a worst-case complexity that is exponential in the number of inputs. A feasible solution with value close to the value of an optimal solution is called an approximate solution. An approximate algorithm for P is an algorithm that generates approximate solutions for P.

Let A be an algorithm that generates a feasible solution to every instance I of a problem P. Let F*(I) be the value of an optimal solution to I, and let F̂(I) be the value of the feasible solution generated by A.

Definition: A is an absolute approximation algorithm for problem P if and only if for every instance I of P, |F*(I) − F̂(I)| ≤ k for some constant k.
Definition: An ε-approximate algorithm is an approximate algorithm for which the relative error f(n) = |F*(I) − F̂(I)| / F*(I) satisfies f(n) ≤ ε for some constant ε.
Definition: A(ε) is an approximation scheme if and only if for every given ε > 0 and problem instance I, A(ε) generates a feasible solution such that |F*(I) − F̂(I)| / F*(I) ≤ ε. Again, we assume F*(I) > 0.
Definition: An approximation scheme is a polynomial time approximation scheme if and only if for every fixed ε > 0 it has a computing time that is polynomial in the problem size.

8.4.1 Absolute Approximation
Below, we study some absolute approximation algorithms.

Planar Graph Coloring
There are very few NP-hard optimization problems for which polynomial time absolute approximation algorithms are known. One problem is that of determining the minimum number of colors needed to color a planar graph G = (V, E). It is known that every planar graph is four colorable. It is zero colorable iff V = ∅, it is one colorable iff E = ∅, and it is two colorable iff it is bipartite. Determining whether a planar graph is three colorable is NP-hard. However, all planar graphs are four colorable. An absolute approximation algorithm with |F*(I) − F̂(I)| ≤ 1 is easy to obtain. Algorithm 8.1 is such an algorithm. It finds an exact answer when the graph can be colored using at most two colors. Since we can determine whether a graph is bipartite in time O(|V| + |E|), the complexity of the algorithm is O(|V| + |E|). The approximate coloring algorithm is given below.

Algorithm 8.1
Algorithm AColor(V, E)
// Determine an approximation to the minimum number of colors.
{
    if V = ∅ then return 0;
    else if E = ∅ then return 1;
    else if G is bipartite then return 2;
    else return 4;
}
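
A minimal sketch of Algorithm 8.1 in Python, assuming the graph is given as an adjacency list; the bipartiteness test is a standard breadth-first 2-coloring, which runs in O(|V| + |E|) time as claimed above.

# Approximate planar graph coloring: exact for 0, 1 or 2 colors, otherwise 4.
# The graph is assumed to be planar and given as {vertex: set_of_neighbours}.
from collections import deque

def is_bipartite(adj):
    color = {}
    for start in adj:                       # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def a_color(adj):
    if not adj:
        return 0                            # V is empty
    if all(len(nbrs) == 0 for nbrs in adj.values()):
        return 1                            # E is empty
    if is_bipartite(adj):
        return 2
    return 4                                # planar, so four colors always suffice

print(a_color({1: {2, 3}, 2: {1, 3}, 3: {1, 2}}))   # a triangle: prints 4
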

Maximum Programs Stored Problem
Assume that we have n programs and two storage devices (say disks or tapes). We assume the devices are disks; our discussion applies to any kind of storage device. Let li be the amount of storage needed to store the ith program, and let L be the storage capacity of each disk. Determining the maximum number of these n programs that can be stored on the two disks (without splitting a program over the disks) is NP-hard.

Theorem: The partition problem reduces to the maximum programs stored problem.
Proof: Let {a1, a2, ..., an} define an instance of the partition problem. We can assume Σ ai = 2T. Define an instance of the maximum programs stored problem as follows: L = T and li = ai, 1 ≤ i ≤ n. Clearly, {a1, ..., an} has a partition if and only if all n programs can be stored on the two disks.

By considering programs in order of non-decreasing storage requirement li, we can obtain a polynomial time absolute approximation algorithm. Function Pstore (Algorithm 8.2) assumes l1 ≤ l2 ≤ ... ≤ ln and assigns programs to disk 1 so long as enough space remains on this disk. Then it begins assigning programs to disk 2. In addition to the time needed to initially sort the programs into non-decreasing order of li, O(n) time is needed to obtain the storage assignment.

Below, we present the approximation algorithm to store programs.

Algorithm 8.2
Algorithm Pstore(l, n, L)
// Assume that l[i] ≤ l[i+1], 1 ≤ i < n.
{
    i := 1;
    for j := 1 to 2 do
    {
        sum := 0; // Amount of disk j already assigned
        while (sum + l[i]) ≤ L do
        {
            write ("Store program ", i, " on disk ", j);
            sum := sum + l[i]; i := i + 1;
            if i > n then return;
        }
    }
}

Example: Let L = 10, n = 4, and (l1, l2, l3, l4) = (2, 4, 5, 6). Function Pstore will store programs 1 and 2 on disk 1 and only program 3 on disk 2. An optimal storage scheme stores all four programs. One way to do this is to store programs 1 and 4 on disk 1 and the other two on disk 2.
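
A runnable sketch of Pstore (our own translation of Algorithm 8.2 into Python, using 0-based indexing) reproduces the behaviour in the example above.

# Pstore: greedy absolute approximation for the maximum programs stored problem.
# l must be sorted in non-decreasing order; returns the program indices per disk.
def pstore(l, L):
    disks = [[], []]
    i = 0
    for j in range(2):                 # the two disks
        used = 0                       # amount of disk j already assigned
        while i < len(l) and used + l[i] <= L:
            disks[j].append(i + 1)     # store program i+1 on disk j+1
            used += l[i]
            i += 1
    return disks

print(pstore([2, 4, 5, 6], 10))        # [[1, 2], [3]] -- program 4 is left out
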

8.4.2 ε-Approximation
Below, we study some ε-approximation algorithms.

Scheduling Independent Tasks
Obtaining minimum finish time schedules on m, m ≥ 2, identical processors is NP-hard. There exists a very simple scheduling rule that generates schedules with a finish time very close to that of an optimal schedule. An instance I of the scheduling problem is defined by a set of n task times ti, 1 ≤ i ≤ n, and m, the number of processors. The scheduling rule we are about to describe is known as the LPT (longest processing time) rule. An LPT schedule is a schedule that results from this rule.

Definition: An LPT schedule is one that is the result of an algorithm that, whenever a processor becomes free, assigns to that processor a task whose time is the largest of those tasks not yet assigned. Ties are broken in an arbitrary manner.

Example: Let m = 3, n = 6, and (t1, t2, t3, t4, t5, t6) = (8, 7, 6, 5, 4, 3). In an LPT schedule, tasks 1, 2 and 3 are assigned to processors 1, 2 and 3 respectively. Tasks 4, 5 and 6 are respectively assigned to processors 3, 2 and 1. The finish time is 11. Since Σ ti / 3 = 11, the schedule is also optimal.

Fig. 8.3: LPT schedule for the example
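
The LPT rule is easy to express with a priority queue of processor finish times. The sketch below (our own illustration) reproduces the example above: it returns a finish time of 11 for the task set (8, 7, 6, 5, 4, 3) on 3 processors.

# LPT scheduling: repeatedly give the longest unassigned task to the processor
# that becomes free first (smallest current finish time).
import heapq

def lpt_finish_time(times, m):
    processors = [0] * m                   # current finish time of each processor
    heapq.heapify(processors)
    for t in sorted(times, reverse=True):  # longest processing time first
        earliest = heapq.heappop(processors)
        heapq.heappush(processors, earliest + t)
    return max(processors)

print(lpt_finish_time([8, 7, 6, 5, 4, 3], 3))   # 11
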

Bin Packing
In this problem, we are given n objects that have to be placed in bins of equal capacity L. Object i requires li units of bin capacity. The objective is to determine the minimum number of bins needed to accommodate all n objects. No object may be placed partly in one bin and partly in another.

Example: Let L = 10, n = 6, and (l1, l2, l3, l4, l5, l6) = (5, 6, 3, 7, 5, 4). Fig. 8.4 shows a packing of the six objects using three bins. Numbers in bins are object indices. Obviously, at least three bins are needed.

The bin packing problem can be regarded as a variation of the scheduling problem. The bins represent processors and L is the time by which all tasks must be completed. The variable li is the processing requirement of task i. The problem is to determine the minimum number of processors needed to accomplish this. An alternative interpretation is to regard the bins as tapes. The variable L is the length of a tape, and li the tape length needed to store program i. The problem is to determine the minimum number of tapes needed to store all n programs. Clearly, many interpretations exist for this problem.

Theorem: The bin packing problem is NP-hard.
Proof: To see this, consider the partition problem. Let {a1, a2, ..., an} be an instance of the partition problem. Define an instance of the bin packing
problem as li = ai, 1 ≤ i ≤ n, and L = (Σ ai)/2. Clearly, the minimum number of bins needed is two if and only if there is a partition for {a1, a2, ..., an}.

Fig. 8.4: Optimal Packing for the example
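
The text does not give an approximation algorithm for bin packing; purely as an illustration, the sketch below applies the common first fit decreasing heuristic (our own choice, not a method described above) to the example instance, where it happens to find an optimal three-bin packing.

# First fit decreasing (FFD): consider objects in non-increasing order of size and
# place each into the first bin with enough room, opening a new bin if none fits.
def first_fit_decreasing(sizes, L):
    bins = []                                   # remaining capacity of each bin
    contents = []                               # object indices stored in each bin
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    for i in order:
        for b in range(len(bins)):
            if sizes[i] <= bins[b]:
                bins[b] -= sizes[i]
                contents[b].append(i + 1)
                break
        else:                                   # no existing bin fits: open a new one
            bins.append(L - sizes[i])
            contents.append([i + 1])
    return contents

print(first_fit_decreasing([5, 6, 3, 7, 5, 4], 10))   # [[4, 3], [2, 6], [1, 5]]
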

8.4.3 Polynomial Time Approximation Schemes
The LPT rule leads to a (1/3 − 1/(3m))-approximate algorithm for the problem of obtaining an m-processor schedule for n tasks. A polynomial time approximation scheme is also known for this problem. This scheme relies on the following scheduling rule: let k be some specified and fixed integer. Obtain an optimal schedule for the k longest tasks, then schedule the remaining n − k tasks using the LPT rule.

Example: Let m = 2, n = 6, (t1, t2, t3, t4, t5, t6) = (8, 6, 5, 4, 4, 1), and k = 4. The four longest tasks have task times 8, 6, 5 and 4 respectively. An optimal schedule for these has finish time 12. When the remaining two tasks are scheduled using the LPT rule, the resulting schedule has finish time 15, whereas an optimal schedule for all six tasks has finish time 14.

Theorem: Let I be an m-processor instance of the scheduling problem. Let F*(I) be the finish time of an optimal schedule for I and let F̂(I) be the length of the schedule generated by the above scheduling rule. Then,

|F*(I) − F̂(I)| / F*(I) ≤ (1 − 1/m) / (1 + ⌊k/m⌋)

Proof: Let r be the finish time of an optimal schedule for the k longest tasks. If F̂(I) = r, then F*(I) = F̂(I) and the theorem is proved. So, assume F̂(I) > r. Let ti, 1 ≤ i ≤ n, be the task times of the n tasks of I.

Without loss of generality, we can assume ti ≥ ti+1, 1 ≤ i < n, and n > k. Also, we can assume n > m. Let j, j > k, be such that task j has finish time F̂(I). Then, no
processor is idle in the interval [0, F̂(I) − tj]. Since tk+1 ≥ tj, it follows that no processor is idle in the interval [0, F̂(I) − tk+1]. Hence

Σ(1 ≤ i ≤ n) ti ≥ m(F̂(I) − tk+1) + tk+1

and so

F*(I) ≥ (1/m) Σ(1 ≤ i ≤ n) ti ≥ F̂(I) − ((m − 1)/m) tk+1

or

|F*(I) − F̂(I)| ≤ ((m − 1)/m) tk+1.

Since ti ≥ tk+1 for 1 ≤ i ≤ k + 1, and at least one processor must execute at least 1 + ⌊k/m⌋ of these k + 1 tasks, it follows that

F*(I) ≥ (1 + ⌊k/m⌋) tk+1.

Combining these two inequalities, we obtain

|F*(I) − F̂(I)| / F*(I) ≤ ((m − 1)/m) / (1 + ⌊k/m⌋) = (1 − 1/m) / (1 + ⌊k/m⌋).

Using the result of this theorem, we can construct a polynomial time approximation scheme for the scheduling problem. This scheme has ε as an input variable. For any input ε, it computes an integer k such that ε ≥ (1 − 1/m)/(1 + ⌊k/m⌋). This defines the k to be used in the scheduling rule described above. Solving for k, we obtain that any integer k, k > (m − 1)/ε − m, guarantees ε-approximate schedules. The time required to obtain such schedules, however, depends mainly on the time needed to obtain an optimal schedule for k tasks on m machines. Using a branch and bound algorithm, this time is O(m^k). The time needed to arrange the tasks so that ti ≥ ti+1, and also to obtain the LPT schedule for the remaining n − k tasks, is O(n log n). Hence the total time needed by the ε-approximate scheme is O(n log n + m^k) = O(n log n + m^((m − 1)/ε)). Since this time is not polynomial in 1/ε (it is exponential in 1/ε), this approximation scheme is not a fully polynomial time approximation scheme. It is a polynomial time approximation scheme (for any fixed m), as the computing time is polynomial in the number of tasks n.
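
A minimal sketch of this scheme (our own illustration): it finds an optimal schedule for the k longest tasks by brute-force enumeration of the m^k assignments (standing in for the branch and bound step) and then extends it with the LPT rule. On the example above (m = 2, tasks 8, 6, 5, 4, 4, 1, k = 4) it returns 15, against the optimal 14.

# PTAS sketch: optimal schedule for the k longest tasks, then LPT for the rest.
import heapq
from itertools import product

def approx_schedule(times, m, k):
    times = sorted(times, reverse=True)
    longest, rest = times[:k], times[k:]
    best = None
    for assignment in product(range(m), repeat=len(longest)):   # O(m^k) choices
        loads = [0] * m
        for task, proc in zip(longest, assignment):
            loads[proc] += task
        if best is None or max(loads) < max(best):
            best = loads
    processors = list(best)
    heapq.heapify(processors)
    for t in rest:                      # rest is already in non-increasing order: LPT
        earliest = heapq.heappop(processors)
        heapq.heappush(processors, earliest + t)
    return max(processors)

print(approx_schedule([8, 6, 5, 4, 4, 1], m=2, k=4))   # 15
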
Self Assessment Questions
4. A feasible solution with value close to the value of an optimal solution is called an __________.
5. An ε-approximate algorithm is an approximate algorithm for which __________.
6. __________ are broken in an arbitrary manner.
7. The bin packing problem is __________.

8.5 Summary
In this unit, we studied the following:
- f(n) = Ω(g(n)) means that g(n) is a lower bound on the time f(n) taken by any algorithm for the problem.
- Cook's theorem states that satisfiability is in P if and only if P = NP.
- A k-clique in a graph G is a complete subgraph of G with k vertices.
- A vertex cover of G, where G = (V, E) is an undirected graph, is a subset S ⊆ V such that each edge of G is incident upon some vertex in S.
- The bin packing problem is NP-hard.

8.6 Terminal Questions


1. What do you mean by ordered searching?
2. Explain Cook's theorem.

8.7 Answers
Self Assessment Questions
1. compare x and A[i]
2. small degree
3. K vertices
4. approximate solution
5. f(n) ≤ ε for some constant ε
6. Ties
7. NP-hard

Terminal Questions
1. In the ordered searching problem, we consider only those comparison-based algorithms in which every comparison between two elements of S is of the type "compare x and A[i]". The progress of any searching algorithm that satisfies this restriction can be described by a path in a binary tree. Each internal node in this tree represents a comparison between x and A[i]. (Refer Section 8.2)
2. Cook's theorem states that satisfiability is in P if and only if P = NP. We know that satisfiability is in NP. Hence, if P = NP, then satisfiability is in P. It remains to be shown that if satisfiability is in P, then P = NP. To do this, we show how to obtain from any polynomial time non-deterministic decision algorithm A and input I a formula Q(A, I) such that Q is satisfiable if and only if A has a successful termination with input I. (Refer Section 8.3)
