
MODULE 1

ALGORITHM
An algorithm is a finite set of instructions, each of which may require one or more
operations, that, if followed, accomplishes a particular task. All algorithms must
satisfy the following criteria:

1. Input: zero or more quantities are externally supplied.
2. Output: at least one quantity is produced.
3. Definiteness: each instruction is clear and unambiguous.
4. Finiteness: if we trace out the instructions of an algorithm, then for all cases the
algorithm terminates after a finite number of steps.
5. Effectiveness: every instruction must be very basic so that it can be carried out,
in principle, by a person using only pencil and paper. It also must be feasible.

Considering criterion 3, a direction such as "add 6 or 7 to x" is not permitted,
because it is not clear which of the two possibilities should be done. Considering
criterion 5, a direction such as "compute 5/0" is not permitted, because the result
is undefined and the instruction is not feasible.

CATEGORIES OF ALGORITHM

Fundamentally algorithms can be divided into two categories.


o Iterative (repetitive) algorithm
o Recursive algorithms.

Iterative algorithms typically use loops and conditional statements.

Recursive algorithms use the divide-and-conquer strategy to break a large problem
down into small chunks and apply the algorithm separately to each chunk.

PERFORMANCE ANALYSIS
1. SPACE COMPLEXITY
The space complexity of an algorithm is the amount of memory it needs to run to
completion. The space needed is the sum of the following components:

A fixed part that is independent of the characteristics of the inputs and the
outputs. It includes the space for constants, space for the code, etc.

A variable part that consists of the space needed by component variables whose size
depends on the particular problem instance (e.g., input size) being solved, together
with the space needed by referenced variables.

The space requirement S(P) of any algorithm P can be written as
S(P) = c + S_P(instance characteristics), where c is a constant.
For any given problem, we need first to determine which instance characteristics to
use to measure the space requirements.
For example, the following function has space requirements independent of the
instance characteristics, so S_abc(instance characteristics) = 0:

float abc(float a, float b, float c)
{
    return (a + b + b*c + (a + b - c)/(a + b) + 4.0);
}
2. TIME COMPLEXITY

The time T(P) taken by a program P is the sum of the compile time and the run time.
The compile time does not depend on the instance characteristics, and a compiled
program can be run several times without recompilation.
The run time, denoted t_P, depends on the instance characteristics.
In a multiuser system, the execution time depends on such factors as system load and
the number of other programs running on the computer at the time program P is run.
Because many of the factors t_P depends on are not known at the time a program is
conceived, it is reasonable to attempt only to estimate the run time.
If we knew the characteristics of the compiler to be used, we could proceed to
determine the number of additions, subtractions, multiplications, divisions,
compares, loads, and so on, and obtain an expression for t_P(n) of the form

t_P(n) = c_a ADD(n) + c_s SUB(n) + c_m MUL(n) + c_d DIV(n) + ...

where n denotes the instance characteristics; c_a, c_s, c_m, c_d, etc., respectively,
denote the time needed for an addition, subtraction, multiplication, division, etc.;
and ADD, SUB, MUL, DIV, etc., are functions whose values are the numbers of
additions, subtractions, multiplications, divisions, etc., performed on an instance
with characteristic n.
A program step is a meaningful segment of a program that has an execution time that is

independent of the instance characteristics.


For example, comments and declarative statements such as int, #include, etc., count
as zero steps; an assignment statement which does not involve any calls to other
programs is counted as one step; and in an iterative statement such as for or while,
we consider the step counts only for the control part of the statement.


The control part of a for statement has the following form:
for (i = <expr>; i <= <expr1>; i++)
The first execution of the control part of the for has a step count equal to the sum
of the counts for <expr> and <expr1>. Remaining executions of the for have a step
count of one.
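A common way to make these rules concrete is to instrument the code with a step
counter. The sketch below (function name and the counting points are my own, following
the conventions just stated) tallies steps for summing n numbers, giving a step count
of 2n + 3 under this convention:

```python
def sum_with_count(a):
    count = 0
    s = 0
    count += 1              # the assignment s = 0 is one step
    for x in a:
        count += 1          # one step per execution of the for's control part
        s += x
        count += 1          # the assignment s += x is one step
    count += 1              # final test of the for's control part
    count += 1              # the return is one step
    return s, count

total, steps = sum_with_count([1, 2, 3, 4])
assert total == 10
assert steps == 2 * 4 + 3   # step count 2n + 3 for n = 4
```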

3. AMORTIZED COMPLEXITY

Suppose that a sequence I1,I2,D1,I3,I4,I5,I6,D2,I7 of insert and delete operations is


performed on a set.
Assume the actual cost of each of the seven inserts is one (i.e., one unit of time).

Suppose that the delete operations D1 and D2 have an actual cost of 8 and 10
respectively.
So the actual cost of the sequence of operations is 7·1 + 8 + 10 = 25.

In an amortized scheme, we charge some of the actual cost of an operation to other
operations.
This reduces the charged cost of some operations and increases that of others.

If we charge one unit of the cost of a delete operation to each of the inserts since
the last delete operation, then two units of the cost of D1 get transferred to I1 and
I2, and four units of the cost of D2 get transferred to I3 through I6.

The amortized cost of each of I1 to I6 becomes two, that of I7 equals its actual
cost, and that of each of D1 and D2 becomes 6.
The sum of the amortized costs is 25, which is the same as the sum of the actual
costs.

The cost-transferring scheme is required to be such that the sum of the amortized
costs of the operations is greater than or equal to the sum of their actual costs:

sum(1 <= i <= n) amortized(i) >= sum(1 <= i <= n) actual(i)

where amortized(i) and actual(i), respectively, denote the amortized and actual
complexities of the ith operation in a sequence of n operations.


Three popular methods to arrive at amortized costs of operations are the aggregate
method, the accounting method, and the potential method.
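The charging scheme described above can be checked mechanically. The following Python
sketch (function name mine) transfers one unit of each delete's cost to every insert
since the previous delete and verifies that the amortized and actual sums agree:

```python
def amortize(costs, kinds):
    # kinds[i] is 'I' (insert) or 'D' (delete); charge one unit of each
    # delete's actual cost to every insert since the previous delete.
    amortized = list(costs)
    last_delete = -1
    for i, kind in enumerate(kinds):
        if kind == 'D':
            for j in range(last_delete + 1, i):
                amortized[j] += 1
                amortized[i] -= 1
            last_delete = i
    return amortized

kinds = ['I', 'I', 'D', 'I', 'I', 'I', 'I', 'D', 'I']   # I1,I2,D1,I3..I6,D2,I7
costs = [1, 1, 8, 1, 1, 1, 1, 10, 1]
am = amortize(costs, kinds)
assert am == [2, 2, 6, 2, 2, 2, 2, 6, 1]   # as derived in the text
assert sum(am) == sum(costs) == 25         # the two sums agree
```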

ASYMPTOTIC NOTATION (O, Ω, Θ)

Big oh
The function f(n) = O(g(n)) iff there exist positive constants c and n0 such that
f(n) <= c·g(n) for all n >= n0.
This states only that g(n) is an upper bound on the value of f(n) for all n >= n0.
Examples: 3n + 2 = O(n) as 3n + 2 <= 4n for all n >= 2;
10n^2 + 4n + 2 = O(n^2) as 10n^2 + 4n + 2 <= 11n^2 for all n >= 5.

Omega
The function f(n) = Ω(g(n)) iff there exist positive constants c and n0 such that
f(n) >= c·g(n) for all n >= n0.
Function g(n) is only a lower bound on f(n).
Examples: 3n + 2 = Ω(n) as 3n + 2 >= 3n for n >= 1;
10n^2 + 4n + 2 = Ω(n^2) as 10n^2 + 4n + 2 >= n^2 for all n >= 1.

Theta
The function f(n) = Θ(g(n)) iff there exist positive constants c1, c2, and n0 such
that c1·g(n) <= f(n) <= c2·g(n) for all n >= n0.
g(n) is both an upper and a lower bound on f(n).
Example: 3n + 2 = Θ(n) as 3n <= 3n + 2 <= 4n for all n >= 2.
Little oh
The function f(n) = o(g(n)) iff lim(n→∞) f(n)/g(n) = 0.
Example: 3n + 2 = o(n^2) since lim(n→∞) (3n + 2)/n^2 = 0.

Little omega
The function f(n) = ω(g(n)) iff lim(n→∞) g(n)/f(n) = 0.
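The witness constants used in the examples can be spot-checked numerically; a quick
Python sketch:

```python
# Spot-check the witness constants from the examples above.
assert all(3 * n + 2 <= 4 * n for n in range(2, 1000))            # 3n+2 = O(n)
assert all(3 * n + 2 >= 3 * n for n in range(1, 1000))            # 3n+2 = Omega(n)
assert all(3 * n <= 3 * n + 2 <= 4 * n for n in range(2, 1000))   # 3n+2 = Theta(n)
assert (3 * 10**6 + 2) / (10**6) ** 2 < 1e-5                      # (3n+2)/n^2 -> 0
```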

BEST CASE, AVERAGE CASE, WORST CASE

Worst case: W(n), the maximum over inputs of size n. This represents the input set
that makes the algorithm perform most slowly. It is an important analysis because it
gives us an idea of the most time the algorithm will ever take.

Best case: B(n), the minimum over inputs of size n. This represents the input set
that makes the algorithm perform most quickly, taking the shortest time to execute.

Average case: A(n), the average over inputs of size n. This represents the input set
that gives an average performance. The analysis is a three-step process:
1. Determine the number of different groups into which all input sets can be divided.
2. Determine the probability that the input will come from each of these groups.
3. Determine how long the algorithm will run for each of these groups.

PSEUDOCODE AND ALGORITHM

First produce a general algorithm (one can use pseudocode)


Refine the algorithm successively to get a step-by-step detailed algorithm that is
very close to a computer language.

Pseudocode is an artificial and informal language that helps programmers develop

algorithms.
Pseudocode is very similar to everyday English.
Example 1:
Write an algorithm to determine a student's final grade and indicate whether it is
passing or failing. The final grade is calculated as the average of four marks.
Pseudocode:
Input a set of 4 marks
Calculate their average by summing and dividing by 4
if average is below 50
Print FAIL
Else
Print PASS
Detailed Algorithm:
Step 1:
Input M1,M2,M3,M4
Step 2:
GRADE ← (M1+M2+M3+M4)/4
Step 3:
if (GRADE < 50) then
Print FAIL
Else
Print PASS
endif
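The detailed algorithm translates directly into code; here is a minimal Python
rendering (the function name is my own):

```python
def final_grade(m1, m2, m3, m4):
    # Average the four marks, then report PASS or FAIL against the 50 cutoff.
    grade = (m1 + m2 + m3 + m4) / 4
    return "FAIL" if grade < 50 else "PASS"

assert final_grade(40, 45, 50, 40) == "FAIL"   # average 43.75
assert final_grade(60, 70, 55, 65) == "PASS"   # average 62.5
```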

FINDING THE MAXIMUM AND MINIMUM

Problem can be solved by the divide-and-conquer technique

The problem is to find the maximum and minimum items in a set of n elements.
The time is determined mainly by the total cost of the element comparisons.
The function StraightMaxMin requires 2(n-1) element comparisons in the best, average,
and worst cases.


An immediate improvement is possible by realizing that the comparison a[i] < min is
necessary only when a[i] > max is false.
Hence we can replace the contents of the for loop by

if (a[i] > max)
    max = a[i];
else if (a[i] < min)
    min = a[i];
Now the best case occurs when the elements are in increasing order.
The number of element comparisons is n-1.
The worst case occurs when the elements are in decreasing order.
In this case the number of element comparisons is 2(n-1).
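The improved loop can be instrumented to confirm these best- and worst-case
comparison counts. A small Python sketch (names are mine):

```python
def straight_max_min(a):
    # The improved loop above: a[i] < min is tested only when a[i] > max
    # fails, so comparisons range from n - 1 (increasing input) up to
    # 2(n - 1) (decreasing input).
    mx = mn = a[0]
    comparisons = 0
    for x in a[1:]:
        comparisons += 1
        if x > mx:
            mx = x
        else:
            comparisons += 1
            if x < mn:
                mn = x
    return mx, mn, comparisons

assert straight_max_min([1, 2, 3, 4, 5]) == (5, 1, 4)   # best case: n - 1
assert straight_max_min([5, 4, 3, 2, 1]) == (5, 1, 8)   # worst case: 2(n - 1)
```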

A divide-and-conquer algorithm for this problem would proceed as follows:


Let P = (n, a[i], ..., a[j]) denote an instance of the problem.
Here n is the number of elements in the list a[i], ..., a[j].
In this case, the maximum and minimum are a[i] if n = 1.
If n = 2, the problem can be solved by making one comparison.
If the list has more than 2 elements, P has to be divided into smaller instances.
For example, we might divide P into the two instances
P1 = (n/2, a[1], ..., a[n/2]) and P2 = (n - n/2, a[n/2 + 1], ..., a[n]).
After having divided P into two smaller subproblems, we can solve them by recursively
invoking the same divide-and-conquer algorithm.


How can we combine the solutions for P1 and P2 to obtain a solution for P?
If MAX(P) and MIN(P) are the maximum and minimum of the elements in P, then MAX(P) is
the larger of MAX(P1) and MAX(P2). Also, MIN(P) is the smaller of MIN(P1) and
MIN(P2).

MaxMin is a recursive function that finds the maximum and minimum of the set of
elements {a(i), a(i+1), ..., a(j)}.
The situation of set sizes one(i=j) and two (i=j-1) are handled separately.
For sets containing more than 2 elements, the midpoint is determined (just as in
binary search) and two new subproblems are generated.


When the maxima and minima of these subproblems are determined, the two maxima are
compared and the two minima are compared to achieve the solution for the entire set.
Suppose we simulate the function MaxMin on the following 9 elements:

A good way of keeping track of recursive calls is to build a tree by adding a node each

time a new call is made.


Each node has four items of information: i, j, max, min.
Examining the figure ,we see that the root node contains 1 and 9 as the values i and j

corresponding to the initial call to MaxMin.


This execution produces two new calls to MaxMin, where i and j have the values 1, 5
and 6, 9 respectively, and thus splits the set into two subsets of approximately the
same size.
What is the number of element comparisons needed for MaxMin?
If T(n) represents this number, then the resulting recurrence relation is

T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 2   for n > 2
T(n) = 1                         for n = 2
T(n) = 0                         for n = 1

When n is a power of 2, this solves to T(n) = 3n/2 - 2.

Figure. Trees of recursive calls of MaxMin
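A Python sketch of the recursive MaxMin (the nine-element array is illustrative
sample data) that also counts element comparisons and checks T(n) = 3n/2 - 2 for a
power-of-two size:

```python
def max_min(a, i, j, counter):
    # Recursive MaxMin: sizes one and two are handled directly; larger
    # lists are split at the midpoint and the two results are combined
    # with two more comparisons.
    if i == j:
        return a[i], a[i]
    if i == j - 1:
        counter[0] += 1
        return (a[j], a[i]) if a[i] < a[j] else (a[i], a[j])
    mid = (i + j) // 2
    max1, min1 = max_min(a, i, mid, counter)
    max2, min2 = max_min(a, mid + 1, j, counter)
    counter[0] += 2
    return max(max1, max2), min(min1, min2)

a = [22, 13, -5, -8, 15, 60, 17, 31, 47]    # sample 9-element input
cnt = [0]
assert max_min(a, 0, len(a) - 1, cnt) == (60, -8)

cnt8 = [0]                                  # n = 8, a power of 2
max_min(list(range(8)), 0, 7, cnt8)
assert cnt8[0] == 3 * 8 // 2 - 2            # T(8) = 10 comparisons
```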

RANDOMIZED ALGORITHMS: INFORMAL DESCRIPTIONS

A randomized algorithm is one that makes use of a randomizer (such as a random number
generator).
Some of the decisions made in the algorithm depend on the output of the randomizer.
Since the output of any randomizer might differ from run to run, the output of a
randomized algorithm could also differ from run to run for the same input.
The execution time of a randomized algorithm could also vary from run to run for the
same input.
Randomized algorithms can be categorized into two classes. The first consists of
algorithms that always produce the same output for the same input. These are called
Las Vegas algorithms.
The execution time of a Las Vegas algorithm depends on the output of the randomizer.
If we are lucky, the algorithm might terminate fast, and if not, it might run for a
longer period of time.
The second consists of algorithms whose outputs might differ from run to run. These
are called Monte Carlo algorithms.


Consider any problem for which there are only 2 possible answers,say Yes and No.
If a Monte Carlo algorithm is employed to solve such a problem, then the algorithm
might give incorrect answers depending on the output of the randomizer.


A randomized algorithm can be viewed as a family of algorithms.
For a given input, some of the algorithms in this family may give incorrect answers.
The objective in the design of a randomized algorithm is to ensure that the number of
such bad algorithms in the family is only a small fraction of the total number of
algorithms.
If, for any input, we can show that at least a (1 - ε) fraction of the algorithms in
the family will run quickly on that input, where ε is the error probability, then a
randomly chosen algorithm from the family runs quickly with probability at least
1 - ε.
The notation Õ(n) is used for characterizing the run times of Las Vegas algorithms.

Two advantages of using randomized algorithms are simplicity and efficiency.


If we design a fast algorithm for a problem with an error probability less than that of the
hardware failure probability, then it is more likely for the machine to break down than for

the algorithm to fail.


The use of a randomizer within an algorithm doesn't always result in better
performance.

IDENTIFYING THE REPEATED ELEMENT (e.g., Las Vegas type)


Consider an array a[ ] of n numbers that has n/2 distinct elements and n/2 copies of
another element.

The problem is to identify the repeated element.


Any deterministic algorithm for solving this problem will need at least n/2+2 time steps
in the worst case.

In the worst case, the first n/2 + 1 elements examined by the algorithm are all
distinct. Even after having looked at n/2 + 1 elements, the algorithm will not be in
a position to infer the repeated element.

It will have to examine at least n/2+2 elements and hence take at least n/2+2 time steps.

In contrast, there is a simple and elegant randomized Las Vegas algorithm that takes
only Õ(log n) time.

It randomly picks two array elements and checks whether they come from two different
cells and have the same value.

If they do, the repeated element has been found.


If not, this basic step of sampling is repeated as many times as it takes to identify the
repeated element.

In this algorithm, the sampling performed is with repetitions; i.e., the first and
second elements are each randomly picked from out of all n elements.

There is a probability that the same array element is picked each time. If we just
check for the equality of the two elements picked, our answer might be incorrect.

Therefore it is essential to make sure that the two array indices picked are different and
the 2 array cells contain the same value.
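A minimal Python sketch of this Las Vegas sampling (the function name is mine); note
that it checks that the two sampled indices differ before comparing values, as the
text requires:

```python
import random

def repeated_element(a):
    # Las Vegas sketch: keep sampling two random indices until they are
    # different cells holding the same value; that value is the repeated
    # element. The algorithm never returns a wrong answer; only its
    # running time is random.
    n = len(a)
    while True:
        i = random.randrange(n)
        j = random.randrange(n)
        if i != j and a[i] == a[j]:
            return a[i]

a = [7, 1, 7, 2, 7, 3, 7, 4]   # n/2 copies of 7 plus n/2 distinct values
assert repeated_element(a) == 7
```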

PRIMALITY CHECKING (e.g., Monte Carlo type)

Any integer greater than one is said to be a prime if its only divisors are 1 and the integer

itself.
By convention, we take 1 to be a nonprime.

Then 2, 3, 5, 7, 11, and 13 are the first six primes.


Given an integer n,the problem of deciding whether n is a prime is known as primality
testing.

If a number n is composite, it must have a divisor at most √n.

Consider each number l in the interval [2, √n] and check whether l divides n.
If none of these numbers divides n, then n is prime; otherwise it is composite.


We can devise a Monte Carlo randomized algorithm for primality testing that runs in
time O((log n)^2).
The output of this algorithm is correct with high probability.


If the input is prime the algorithm never gives an incorrect answer.
If the input number is composite ,then there is a small probability that the answer may be

incorrect.
Algorithms of this kind are said to have one-sided error.
Fermat's theorem suggests the following algorithm for primality testing: randomly
choose an a < n and check whether a^(n-1) ≡ 1 (mod n).

If Fermat's equation is not satisfied, n is composite.

If the equation is satisfied, we try some more random a's.
If for each a tried Fermat's equation is satisfied, we output "n is prime"; otherwise
we output "n is composite".
Unfortunately, in our algorithm, even if n is composite, Fermat's equation may be
satisfied depending on the a chosen.

Moreover, there are composite numbers for which every a that is less than and
relatively prime to n will satisfy Fermat's equation. The numbers 561 and 1105 are
examples.
The modified primality testing algorithm is also known as the Miller-Rabin algorithm.
The Miller-Rabin algorithm will never give an incorrect answer if the input is prime,
since Fermat's equation will always be satisfied.

If n is composite, the algorithm detects the compositeness of n if either the
randomly chosen a leads to the discovery of a nontrivial square root of 1 or a
violates Fermat's equation.
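The Fermat-based test can be sketched as a Monte Carlo routine in Python (the names
and trial count are my choices, and this is the plain Fermat test, not the full
Miller-Rabin refinement). The error is one-sided: a prime is never declared
composite, but a Carmichael number such as 561 or 1105 would usually fool this test:

```python
import random

def fermat_test(n, trials=20):
    # Declare n composite as soon as some random a violates
    # a^(n-1) = 1 (mod n); otherwise report "probably prime".
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n)
        if pow(a, n - 1, n) != 1:
            return False   # a is a witness: n is definitely composite
    return True            # probably prime

assert fermat_test(13) and fermat_test(97)   # primes always pass
assert not fermat_test(100)                  # every a in [2, 99] is a witness
```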

STRASSEN'S MATRIX MULTIPLICATION

Suppose we need to compute the matrix product of two matrices A and B of order n.

Let the resultant matrix be C; i.e., C = A.B (where A, B, C are n×n matrices).

The normal matrix multiplication procedure is

C[i][j] = sum(k = 1 to n) A[i][k] · B[k][j]   (where i, j vary from 1 to n)

The running time of this procedure is Θ(n^3).
Strassen showed how to reduce this complexity using the divide-and-conquer technique.
We wish to compute the product C = A.B, where C, A, B are n×n matrices.
Assuming that n is an exact power of 2, we divide each n×n matrix into four
n/2 × n/2 matrices as follows:

[ r  s ]   [ a  b ] [ e  g ]
[ t  u ] = [ c  d ] [ f  h ]

From the above we get:

r = ae + bf
s = ag + bh
t = ce + df
u = cg + dh

Each of these four equations specifies two multiplications of n/2 × n/2 matrices and
the addition of their n/2 × n/2 products.


Using these equations to define a straightforward divide-and-conquer strategy, we
derive the following recurrence for the time T(n) to multiply two n×n matrices:

T(n) = 8T(n/2) + Θ(n^2)

This recurrence has the solution T(n) = Θ(n^3), so this method is no faster than the
ordinary one.
Strassen discovered a different recursive approach that requires only 7 recursive
multiplications of n/2 × n/2 matrices and Θ(n^2) scalar additions and subtractions,
yielding the recurrence

T(n) = 7T(n/2) + Θ(n^2) = Θ(n^(log 7)) = O(n^2.81)

Strassen's method has four steps:

1. Divide the input matrices A and B into n/2 × n/2 submatrices.
2. Using Θ(n^2) scalar additions and subtractions, compute 14 n/2 × n/2 matrices
A1, B1, A2, B2, ..., A7, B7.
3. Recursively compute the seven matrix products Pi = Ai.Bi for i = 1, 2, ..., 7.
4. Compute the desired submatrices r, s, t, u of the resultant matrix C by adding
and/or subtracting various combinations of the Pi matrices, using only Θ(n^2) scalar
additions and subtractions.

We guess that each matrix product Pi can be written in the form Pi = Ai.Bi, where
each product is computed by adding or subtracting some of the submatrices of A,
adding or subtracting some of the submatrices of B, and then multiplying the two
results together.
r = P1 + P4 - P5 + P7
s = P3 + P5
t = P2 + P4
u = P1 + P3 - P2 + P6

where

P1 = (a + d)(e + h)    P2 = (c + d)e
P3 = a(g - h)          P4 = d(f - e)
P5 = (a + b)h          P6 = (c - a)(e + g)
P7 = (b - d)(f + h)

These seven submatrix products can thus be used to compute the product C = A.B, which
completes the description of Strassen's method.
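One level of Strassen's recursion can be checked on plain numbers. The sketch below
uses the submatrix labelling of the text (B = [e g; f h]) and compares the
seven-multiplication result against the ordinary eight-multiplication product:

```python
def multiply_2x2(A, B):
    # Ordinary product with the text's labelling: B = [[e, g], [f, h]].
    (a, b), (c, d) = A
    (e, g), (f, h) = B
    return [[a*e + b*f, a*g + b*h], [c*e + d*f, c*g + d*h]]

def strassen_2x2(A, B):
    # One level of Strassen's recursion: seven multiplications, not eight.
    (a, b), (c, d) = A
    (e, g), (f, h) = B
    p1 = (a + d) * (e + h)
    p2 = (c + d) * e
    p3 = a * (g - h)
    p4 = d * (f - e)
    p5 = (a + b) * h
    p6 = (c - a) * (e + g)
    p7 = (b - d) * (f + h)
    return [[p1 + p4 - p5 + p7, p3 + p5],       # [ r  s ]
            [p2 + p4, p1 + p3 - p2 + p6]]       # [ t  u ]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
assert strassen_2x2(A, B) == multiply_2x2(A, B) == [[19, 22], [43, 50]]
```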

BINARY SEARCH
Let a_i, 1 <= i <= n, be a list of elements that are sorted in nondecreasing order.
Consider the problem of determining whether a given element x is present in the list.
If x is present, we are to determine a value j such that a_j = x.
If x is not in the list, then j is to be set to zero.

Let P = (n, a_i, ..., a_l, x) denote an arbitrary instance of this search problem
(n is the number of elements in the list, a_i, ..., a_l is the list of elements, and
x is the element searched for).
Divide-and-conquer can be used to solve this problem.
Let Small(P) be true if n = 1.
In this case, S(P) will take the value i if x = a_i; otherwise it will take the
value 0. Then g(1) = Θ(1).

If P has more than one element, it can be divided (or reduced) into a new subproblem
as follows.
Pick an index q (in the range [i, l]) and compare x with a_q.
There are three possibilities:

(1) x = a_q: In this case the problem P is immediately solved.
(2) x < a_q: In this case x has to be searched for only in the sublist
a_i, a_(i+1), ..., a_(q-1). Therefore, P reduces to (q - i, a_i, ..., a_(q-1), x).
(3) x > a_q: In this case the sublist to be searched is a_(q+1), ..., a_l.
P reduces to (l - q, a_(q+1), ..., a_l, x).

Any given problem P gets divided (reduced) into one new subproblem.

This division takes only Θ(1) time.
This division takes only (1) time.

After a comparison with a_q, the instance remaining to be solved (if any) can be
solved by using this divide-and-conquer scheme again.

If q is always chosen such that a_q is the middle element (that is, q = ⌊(i + l)/2⌋),
then the resulting search algorithm is known as binary search.


Note that the answer to the new subproblem is also the answer to the original problem
P; there is no need for any combining.
there is no need for any combining.


Does BinSearch terminate?
We observe that low and high are integer variables such that each time through the
loop either x is found, or low is increased by at least one, or high is decreased by
at least one.
Thus we have two sequences of integers approaching each other, and eventually low
becomes greater than high, causing termination in a finite number of steps if x is
not present.
The computing time of binary search is Θ(log n) in the worst case.
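A Python sketch of BinSearch using the 1-based, return-0-on-failure convention of the
text (the sample array is illustrative):

```python
def bin_search(a, x):
    # Iterative binary search: returns a 1-based index j with a[j-1] == x,
    # or 0 when x is absent.
    low, high = 1, len(a)
    while low <= high:              # each pass either finds x, raises low,
        mid = (low + high) // 2     # or lowers high, so the loop terminates
        if x < a[mid - 1]:
            high = mid - 1
        elif x > a[mid - 1]:
            low = mid + 1
        else:
            return mid
    return 0

a = [-15, -6, 0, 7, 9, 23, 54, 82, 101]
assert bin_search(a, 9) == 5      # successful search
assert bin_search(a, -15) == 1    # first element
assert bin_search(a, 10) == 0     # unsuccessful search
```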

SELECTION OF A WORST-CASE OPTIMAL ALGORITHM
