
6.1 Error-Free Probabilistic Programs


Randomization is an important programming tool. Intuitively, its power stems from choice. The ability to make "random choices" can be viewed as a derivation of the ability to make "nondeterministic choices." In the nondeterministic case, each execution of an instruction must choose between a number of options. Some of the options might be "good", and others "bad." The choice must be for a good option, whenever it exists. The problem is that it does not generally seem possible to make nondeterministic choices in an efficient manner.

The options in the case of random choices are similar to those for the nondeterministic case; however, no restriction is made on the nature of the option to be chosen. Instead, each of the good and bad options is assumed to have an equal probability of being chosen. Consequently, the lack of bias among the different options enables the efficient execution of choices. The burden of increasing the probability of obtaining good choices is placed on the programmer.

Here random choices are introduced to programs through random assignment instructions of the form x := random(S), where S can be any finite set. An execution of a random assignment instruction x := random(S) assigns to x an element from S, where each of the elements in S is assumed to have an equal probability of being chosen. Programs with random assignment instructions, and no nondeterministic instructions, are called probabilistic programs. Each execution sequence of a probabilistic program is assumed to be a computation. On a given input a probabilistic program might have both accepting and nonaccepting computations.

The execution of a random assignment instruction x := random(S) is assumed to take one unit of time under the uniform cost criterion, and |v| + log |S| time under the logarithmic cost criterion. Here |v| is the length of the representation of the value v chosen from S, and |S| denotes the cardinality of S.

A probabilistic program P is said to have an expected time complexity $\bar{t}(x)$ on input x if $\bar{t}(x)$ is equal to $p_0(x) \cdot 0 + p_1(x) \cdot 1 + p_2(x) \cdot 2 + \cdots$. The function $p_i(x)$ is assumed to be the probability for the program P to have on input x a computation that takes exactly i units of time. The program P is said to have an expected time complexity $\bar{t}(n)$ if $\bar{t}(|x|) \ge \bar{t}(x)$ for each x.
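As a small illustration of the definition of expected time on a single input, the sketch below evaluates the weighted sum of running times from a finite table of probabilities. The function name `expected_time`, the dictionary representation, and the sample table are assumptions made for this example only.

```python
from fractions import Fraction


def expected_time(prob_by_time):
    """Return the sum of p_i * i over a finite table {i: p_i}.

    prob_by_time maps a running time i (in time units) to the probability
    p_i that a computation takes exactly i units. The table format is an
    assumption of this sketch; the probabilities must sum to 1.
    """
    if sum(prob_by_time.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * i for i, p in prob_by_time.items())


# Hypothetical program: 2 units with probability 1/4,
# 5 units with probability 1/2, 9 units with probability 1/4.
table = {2: Fraction(1, 4), 5: Fraction(1, 2), 9: Fraction(1, 4)}
print(expected_time(table))
```

Exact fractions are used so that the check on the total probability is not disturbed by floating-point rounding.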

The following example shows how probabilism can be used to guarantee an improved behavior (on average) for each input.

Example 6.1.1 Consider the deterministic program in Figure 6.1.1 (given in a free format using recursion).

call SELECT(k, S)
procedure SELECT(k, S)
   x := first element in S
   S1 := { y | y is in S, and y < x }
   n1 := cardinality of the set stored in S1
   S2 := { y | y is in S, and y > x }
   n2 := cardinality of the set stored in S2
   n3 := (cardinality of the set stored in S) - n2
   case
      k ≤ n1: SELECT(k, S1)
      n3 < k: SELECT(k - n3, S2)
      n1 < k ≤ n3: x holds the desired element
   end
end

Figure 6.1.1 A program that selects the kth smallest element in S.

The program selects the kth smallest element in any given set S of finite cardinality. Let T(n) denote the time (under the uniform cost criterion) that the program takes to select an element from a set of cardinality n. T(n) satisfies, for some constant c and some integer m < n, the following inequalities.

$$T(1) \le c, \qquad T(n) \le T(m) + cn \quad \text{for } n > 1.$$

From the inequalities above, $T(n) \le T(n-1) + cn \le T(n-2) + c\sum_{i=n-1}^{n} i \le T(n-3) + c\sum_{i=n-2}^{n} i \le \cdots \le T(1) + c\sum_{i=2}^{n} i \le cn^2$. That is, the program is of time complexity $O(n^2)$. The time requirement of the program is sensitive to the ordering of the elements in the sets in question. For instance, when searching for the smallest element, $O(n)$ time is sufficient if the elements of the set are given in nondecreasing order. Alternatively, the program uses $O(n^2)$ time when the elements are given in nonincreasing order.
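The sensitivity to input order can be observed with a direct transcription of the selection program. The sketch below is one possible Python rendering, not the author's code: it represents the set as a list of distinct values, replaces the recursion by a loop, and returns the number of elements scanned as a rough stand-in for running time. The name `select_first_pivot` is hypothetical.

```python
def select_first_pivot(k, s):
    """Return (k-th smallest element of s, number of elements scanned).

    k is 1-based and s is a list of distinct values. As in the
    deterministic program, the pivot x is always the first element.
    """
    scanned = 0
    while True:
        x = s[0]                          # x := first element in S
        scanned += len(s)                 # one pass over S builds S1 and S2
        s1 = [y for y in s if y < x]
        s2 = [y for y in s if y > x]
        n1 = len(s1)
        n3 = len(s) - len(s2)
        if k <= n1:
            s = s1                        # SELECT(k, S1)
        elif n3 < k:
            k, s = k - n3, s2             # SELECT(k - n3, S2)
        else:
            return x, scanned             # n1 < k <= n3


n = 200
print(select_first_pivot(1, list(range(1, n + 1))))   # nondecreasing input
print(select_first_pivot(1, list(range(n, 0, -1))))   # nonincreasing input
```

On the nondecreasing input the search for the smallest element stops after a single pass, while on the nonincreasing input every pass removes only the pivot, so the number of scanned elements grows quadratically.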

This sensitivity to the order of the elements can be eliminated by assigning a random element from S to x, instead of the first element of S. In such a case, the expected time complexity $\bar{t}(n)$ of the program satisfies the following inequalities, for some constant c.

From the inequalities above, $\bar{t}(n) \le \cdots \le \bar{t}(1) + (n-1)c + cn \le 2cn$. That is, the modified program is probabilistic and its expected time complexity is $O(n)$. For every given input (k, S) with S of cardinality |S|, the probabilistic program is guaranteed to find the kth smallest element in S within $O(|S|^2)$ time. However, on average it requires $O(|S|)$ time for a given input.
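The modification described above, choosing x by a random assignment instead of taking the first element, can be sketched as follows. This is an illustrative Python version under the same assumptions as before (a list of distinct values, a loop in place of recursion); `random_select` and its `rng` parameter are names introduced for the example.

```python
import random


def random_select(k, s, rng=random):
    """Return (k-th smallest element of s, number of elements scanned).

    k is 1-based and s is a sequence of distinct values. The pivot is
    drawn uniformly from the current set, mirroring x := random(S).
    rng may be the random module or a random.Random instance.
    """
    s = list(s)
    scanned = 0
    while True:
        x = rng.choice(s)                 # x := random(S)
        scanned += len(s)
        s1 = [y for y in s if y < x]
        s2 = [y for y in s if y > x]
        n1 = len(s1)
        n3 = len(s) - len(s2)
        if k <= n1:
            s = s1
        elif n3 < k:
            k, s = k - n3, s2
        else:
            return x, scanned


rng = random.Random(2024)                 # fixed seed, for repeatability only
data = list(range(1000, 0, -1))           # nonincreasing input
costs = [random_select(1, data, rng)[1] for _ in range(200)]
print(sum(costs) / len(costs), "elements scanned on average for n = 1000")
```

Every pass removes at least the pivot, so the scan count can never exceed 1 + 2 + ... + |S|, which matches the quadratic worst-case guarantee; the printed average gives an empirical impression of the expected behavior on an input that is bad for the deterministic version.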

6.2 Probabilistic Programs That Might Err


Error Probability of Repeatedly Executed Probabilistic Programs
Outputs of Probabilistic Programs

For many practical reasons programs might be allowed a small probability of erring on some inputs.

Example 6.2.1 A brute-force algorithm for solving the nonprimality problem takes exponential time (see Example 5.1.3). The program in Figure 6.2.1 is an example of a probabilistic program that determines the nonprimality of numbers in polynomial expected time.

read x
y := random({2, . . . , ⌊√x⌋})
if x is divisible by y then answer := yes /* not a prime number */
else answer := no

Figure 6.2.1 An undesirable probabilistic program for the nonprimality problem.
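The program in Figure 6.2.1 can be tried out with the following sketch, which assumes that the candidate divisor is drawn from the integers 2 up to the integer square root of x (so x must be at least 4). The function names are hypothetical. The helper `error_probability` computes the exact chance that a single run answers "no" on a nonprime number by counting the divisors in the sampled range.

```python
import math
import random
from fractions import Fraction


def guess_nonprime(x, rng=random):
    """One run of the program: True means 'yes, x is not prime'. Needs x >= 4."""
    y = rng.randint(2, math.isqrt(x))     # y := random({2, ..., floor(sqrt(x))})
    return x % y == 0


def repeated_guess(x, k, rng=random):
    """Answer 'yes' if any of k independent runs answers 'yes'."""
    return any(guess_nonprime(x, rng) for _ in range(k))


def error_probability(m):
    """Exact probability that one run answers 'no' on a nonprime m >= 4."""
    r = math.isqrt(m)
    s = sum(1 for d in range(2, r + 1) if m % d == 0)
    return 1 - Fraction(s, r - 1)


print(error_probability(36))              # many small divisors
print(error_probability(49))              # square of a prime: a single divisor
print(repeated_guess(97, 50))             # a prime never produces 'yes'
```

For the square of a large prime only one value in the sampled range is a divisor, which is why a single run is almost always wrong on such inputs.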

The program has zero probability for an error on inputs that are prime numbers. However, for infinitely many nonprime numbers the program has a high probability of giving a wrong answer. Specifically, the probability for an error on a nonprime number m is $1 - s/(\lfloor\sqrt{m}\rfloor - 1)$, where s is assumed to be the number of distinct divisors of m in $\{2, \ldots, \lfloor\sqrt{m}\rfloor\}$. In particular, the probability for an error reaches the value of $1 - 1/(\lfloor\sqrt{m}\rfloor - 1)$ for those numbers m that are a square of a prime number.

The probability of getting a wrong answer for a given number m can be reduced by executing the program k times. In such a case, the number m is declared to be nonprime with full confidence, if in any of k executions the answer yes is obtained. Otherwise, m is determined to be prime with probability of at most $(1 - 1/(\lfloor\sqrt{m}\rfloor - 1))^k$ for an error. With $k = c(\lfloor\sqrt{m}\rfloor - 1)$ this probability approaches the value of $(1/e)^c < 0.37^c$ as m increases, where e is the constant 2.71828 . . . However, such a value for k is forbiddingly high, because it is exponential in the length of the representation of m, that is, in log m.

An improved probabilistic program can be obtained by using the following known result.

Result Let $W_m(b)$ be a predicate that is true if and only if either of the following two conditions holds.

a. $(b^{m-1} - 1) \bmod m \ne 0$.
b. $1 < \gcd(b^t - 1, m) < m$ for some t and i such that $m - 1 = t \cdot 2^i$.

Then for each integer $m \ge 2$ the conditions below hold.

a. m is a prime number if and only if $W_m(b)$ is false for all b such that $2 \le b < m$.
b. If m is not prime, then the set $\{\, b \mid 2 \le b < m, \text{ and } W_m(b) \text{ holds} \,\}$ is of cardinality (3/4)(m - 1) at least.
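The predicate of the Result can be evaluated directly. The sketch below is one way to do so in Python, assuming m ≥ 3 and that the exponent i ranges over all nonnegative integers for which 2^i divides m - 1; the names `witness` and `count_witnesses` are introduced for the example. Powers are reduced modulo m before taking the gcd, which leaves the gcd with m unchanged.

```python
import math


def witness(m, b):
    """Return True if the predicate W_m(b) holds (b shows m is not prime)."""
    # Condition a: (b^(m-1) - 1) mod m != 0.
    if pow(b, m - 1, m) != 1:
        return True
    # Condition b: 1 < gcd(b^t - 1, m) < m for some t with m - 1 = t * 2^i.
    t = m - 1
    while True:
        g = math.gcd(pow(b, t, m) - 1, m)
        if 1 < g < m:
            return True
        if t % 2 == 1:                    # no further factor of 2 to remove
            return False
        t //= 2


def count_witnesses(m):
    """Number of b with 2 <= b < m for which W_m(b) holds."""
    return sum(1 for b in range(2, m) if witness(m, b))


for m in (97, 9, 561):
    print(m, count_witnesses(m), "of", m - 2)
```

For the prime 97 no value of b satisfies the predicate, whereas for the nonprime numbers 9 and 561 the count can be compared against the (3/4)(m - 1) bound stated in the Result.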

The result implies the probabilistic program in Figure 6.2.2.

read x
y := random({2, . . . , x - 1})
if Wx(y) then answer := yes
else answer := no

Figure 6.2.2 A good probabilistic program for the nonprimality problem.

For prime numbers m the program always provides the right answers. On the other hand, for nonprime numbers m, the program provides the right answer with probability of at most $1 - (3/4)(m - 1)/(m - 2) \le 1/4$ for an error. The probability for a wrong answer can be reduced to any desired constant $\epsilon$ by executing the program $k \ge \log_{1/4} \epsilon$ times. That is, the number of times k that the program has to be executed is independent of the input m.

Checking for the condition $(b^{m-1} - 1) \bmod m \ne 0$ can be done in polynomial time by using the relation (a + b) mod m = ((a mod m) + (b mod m)) mod m and the relation (ab) mod m = ((a mod m)(b mod m)) mod m. Checking for the condition on $\gcd(b^t - 1, m)$ can be done in polynomial time by using Euclid's algorithm (see Exercise 5.1.1). Consequently, the program in Figure 6.2.2 is of polynomial time complexity.

Example 6.2.2 Consider the problem of deciding for any given matrices A, B, and C whether $AB \ne C$. A brute-force algorithm to decide the problem can compute D = AB, and E = D - C, and check whether E is different from the zero matrix. A brute-force multiplication of A and B requires $O(N^3)$ time (under the uniform cost criterion), if A and B are of dimension $N \times N$. Therefore, the brute-force algorithm for deciding whether $AB \ne C$ also takes $O(N^3)$ time.

The inequality $AB \ne C$ holds if and only if the inequality $(AB - C)\hat{x} \ne \hat{0}$ holds for some vector $\hat{x} = (x_1, \ldots, x_N)^T$. Consequently, the inequality $AB \ne C$ can be determined by a probabilistic program that determines

a. A column vector $\hat{x} = (x_1, \ldots, x_N)^T$ of random numbers from {-1, 1}.
b. The value of the vector $\hat{y} = B\hat{x}$.
c. The value of the vector $\hat{z} = A\hat{y}$.
d. The value of the vector $\hat{u} = C\hat{x}$.
e. The value of the vector $\hat{v} = \hat{z} - \hat{u} = A\hat{y} - C\hat{x} = AB\hat{x} - C\hat{x} = (AB - C)\hat{x}$.
f.
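Steps a through e can be sketched in Python as follows. The variable names x, y, z, u, v are labels chosen for this sketch, as are the function names `mat_vec` and `product_differs_once`. The final decision rule (answer yes exactly when the last vector has a nonzero entry) is an assumption, since the text breaks off before the last step. Matrices are plain lists of rows.

```python
import random


def mat_vec(m, v):
    """Multiply an N x N matrix (list of rows) by a column vector: O(N^2) time."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def product_differs_once(a, b, c, rng=random):
    """One probabilistic run: True means the run found evidence that AB != C."""
    n = len(a)
    x = [rng.choice((-1, 1)) for _ in range(n)]   # step a: random vector over {-1, 1}
    y = mat_vec(b, x)                             # step b: y = B x
    z = mat_vec(a, y)                             # step c: z = A y
    u = mat_vec(c, x)                             # step d: u = C x
    v = [zi - ui for zi, ui in zip(z, u)]         # step e: v = (AB - C) x
    return any(vi != 0 for vi in v)               # assumed decision rule


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C_equal = [[2, 1], [4, 3]]                        # equals AB
C_other = [[2, 1], [4, 4]]                        # differs from AB in one entry
print(product_differs_once(A, B, C_equal))
print(product_differs_once(A, B, C_other))
```

Each run uses only three matrix-vector products and one vector subtraction, so it takes $O(N^2)$ time rather than the $O(N^3)$ of the brute-force multiplication. When AB = C the run can never answer yes; when AB ≠ C an unlucky choice of the random vector may still yield a zero result, which is the source of the possible error.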
