Algorithms and Complexity: Bioinformatics Spring 2008 Hiram College

Algorithms and Complexity
Bioinformatics
Spring 2008
Hiram College
Algorithm definition slides taken from CPSC 171, courtesy of Obertia Slotterbeck
What is an algorithm?
An algorithm is a
well-ordered collection of
unambiguous and
effectively computable operations that,
when executed,
produces a result and
halts in a finite amount of time.
AN EXAMPLE OF A VERY SIMPLE ALGORITHM
1. Wet your hair.
2. Lather your hair.
3. Rinse your hair.
4. Stop.
Observe:
Operations need not be executed by a computer, only by an entity
capable of carrying out the operations listed.
We assume that
The algorithm begins executing at the top of the list
of operations.
The "Stop" can be omitted if we assume the last
line is an implied "Stop" operation.
A well-ordered collection of operations
The question that must be answered is:
At any point in the execution of the algorithm, do you
know what operation is to be performed next?
Well-ordered operations:
1. Wet your hair.
2. Lather your hair.
3. Rinse your hair.
Not well-ordered operations:
1. Either wet your hair or lather your hair.
2. Rinse your hair.
Choices are allowed:

1. If your hair is dirty, then
    a. Wet your hair.
    b. Lather your hair.
    c. Rinse your hair.
2. Else
    a. Go to bed.
Note: We will often omit the numbers and the letters and assume a "top-down" reading of the operations:
If your hair is dirty, then
    Wet your hair.
    Lather your hair.
    Rinse your hair.
Else
    Go to bed.
Unambiguous operations
Does the computing entity understand what the
operation is to do?
This implies that the knowledge of the computing
entity must be considered.
For example, is the following ambiguous?
Make the pie crusts.
To an experienced cook,
Make the pie crusts.
is not ambiguous.
But a less experienced cook may need:
Take 1 1/3 cups of flour.
Sift the flour.
Mix the sifted flour with 1/2 cup of butter
and 1/4 cup of water to make dough.
Roll the dough into two 9-inch pie crusts.
or even more detail!
Definition: An operation that is unambiguous is called a primitive
operation (or just a primitive).
One question we will be exploring in the course is
what are the primitives of a computer.
Note that a given collection of operations may be an
algorithm with respect to one computing agent, but
not with respect to another computing agent!!
Primitives for Computer Algorithms (e.g. PERL)
Mathematical operations: add, subtract, multiply, divide, log, sqrt, ...
String operations: append, substring, reverse, ...
File operations: read, write, append
Other I/O: print, scan
Primitives for Biological Algorithms
Bind (a molecule binds to a site)
Separate (strands of DNA)
Polymerize (add base to strand)
Repair gaps
These primitives make up an algorithm for DNA replication (A, pp. 14-16)
Effectively computable operations
Is the computing entity capable of doing the operation?
This assumes that the operation is first of all unambiguous, i.e., the computing agent understands what is to be done.
Not effectively computable operations:
Write all the fractions between 0 and 1.
Create matter from nothing.
that, when executed, produces a result
Can the user of the algorithm observe a result produced
by the algorithm?
The result need not be a number or piece of text
viewed as "an answer".
It could be an alarm, signaling something is wrong.
It could be an approximation to an answer.
It could be an error message.
halts in a finite amount of time
Will the computing entity complete the operations in
a finite number of steps and stop?
Do not confuse "not finite" with "very, very large". A
failure to halt usually implies there is an infinite loop in
the collection of operations:
1. Write the number 1 on a piece of paper.
2. Add 1 to the number you just wrote and write it
on a piece of paper.
3. Repeat 2.
4. Stop.
Definition of an algorithm:
An algorithm is a well-ordered collection
of unambiguous and effectively
computable operations that, when
executed, produces a result and halts in
a finite amount of time.
Note: Although I have tried to give clean-cut examples to
illustrate what these new words mean, in some cases, a
collection of operations can fail for more than one reason.
A Language for Algorithms

Natural Language (English)?
Too ambiguous
Programming Language (Perl)?

Too much new syntax to learn
Pseudocode
A compromise (just right)
What is Pseudocode?
Structured like a programming
language, but ignores many syntactical
details (like $ and ;)
Complex operations can be written in
natural language
Still, we need to agree on some
standard operations
Our Pseudocode (pp. 8-11)
Assignment
Arithmetic
Conditional execution
Repeated execution
Array access
Functions
Assignment and Variables

A variable has a name (which can be
anything in pseudocode) and a value.
Assignment changes the value
Examples:
myName <- "Ellen Walker"
number <- 17
copy <- number
number <- 98765
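The same assignments can be sketched in Python, assuming Python's `=` plays the role of the pseudocode `<-`. The sketch shows that `copy` keeps the value it was given even after `number` is assigned a new one.

```python
# Assignment gives a variable a value; the variable name is on the left.
my_name = "Ellen Walker"
number = 17
copy = number      # copy now holds 17
number = 98765     # changing number does not change copy
print(my_name, number, copy)
```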
Arithmetic
In pseudocode, mathematical symbols
are allowed
$dist \leftarrow \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
But, programming language style math is easier to type
dist <- sqrt((x[2]-x[1])^2 + (y[2]-y[1])^2)
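As a rough Python counterpart of the distance formula, here is a small sketch. The function name `dist` and its four separate parameters are choices made for this illustration (the pseudocode uses arrays `x` and `y` instead).

```python
import math

def dist(x1, y1, x2, y2):
    """Straight-line distance between the points (x1, y1) and (x2, y2)."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

print(dist(0, 0, 3, 4))  # 5.0
```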
Conditional (If Statement)

Allows a choice to be made, given:
a condition,
something to do if the condition is true
something to do if the condition is false.
Example:
if today is a weekday
go to work
else
stay home
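The weekday example might look like this in Python. The helper `plan_for` and the three-letter day names are assumptions made only for this sketch.

```python
def plan_for(day_name):
    """Return what to do on the given day (hypothetical helper)."""
    weekdays = {"Mon", "Tue", "Wed", "Thu", "Fri"}
    if day_name in weekdays:      # the condition
        return "go to work"       # done if the condition is true
    else:
        return "stay home"        # done if the condition is false

print(plan_for("Tue"))
print(plan_for("Sat"))
```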
Repeated Execution (Loops)

Need to know:
Which instructions to repeat
When to stop repeating
Two kinds of loops

While loop: repeat as long as a condition is true (stop when it becomes false)
For loop: repeat a specific number of times
For Loop
Loop controlled by a variable
Executes once for each value of the
variable, from a given starting value to a
given ending value
Example:
sum <- 0
for num <- 1 to 10
    sum <- sum + num
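A possible Python version of this loop is sketched below. The variable is called `total` here only because `sum` is already the name of a built-in Python function.

```python
total = 0
for num in range(1, 11):   # range stops before 11, so num runs 1..10
    total = total + num
print(total)  # 55
```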
While Loop
Loop is controlled by a condition.
When the condition is false, the loop no
longer executes.
Example
while (you are cold)
    raise the thermostat temperature 1 degree
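To make the thermostat loop concrete, the sketch below assumes that "you are cold" means the current setting is below some comfortable temperature. The function `warm_up` and both of its parameters are hypothetical.

```python
def warm_up(current_temp, comfortable_temp):
    """Raise the setting one degree at a time until it is comfortable."""
    steps = 0
    while current_temp < comfortable_temp:   # "you are cold"
        current_temp += 1                    # raise the thermostat 1 degree
        steps += 1
    return current_temp, steps

print(warm_up(65, 70))  # (70, 5)
```

If the condition is already false at the start, the body never runs: `warm_up(72, 70)` returns `(72, 0)`.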
Array Access
An array is a sequence of values of the
same type.
We access each item, or element by its
numeric index.
Computers start counting at 0!
Array Example
Index:  0    1    2    3    4    5    6    7    8    9    10   11
Value:  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
months <- {Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec}
print(months[5])        prints Jun
for m <- 0 to 11
    print(months[m])
    go to the next line
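The month example can be tried in Python, where a list plays the role of the array and indexing also starts at 0. This is only a sketch of the same idea.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

print(months[5])          # Jun, because counting starts at 0
for m in range(0, 12):    # m runs 0..11
    print(months[m])      # print ends each item with a new line
```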
Assignment to an Array
An array element is really a variable, so
you can assign to it.
Example:
for n <- 0 to 99
    squares[n] <- n*n
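In Python the same idea could be sketched as follows. One difference from the pseudocode is assumed here: the list is created with 100 elements first, because Python cannot assign to an element that does not exist yet.

```python
squares = [0] * 100          # create the 100 elements first
for n in range(0, 100):      # n runs 0..99
    squares[n] = n * n       # each element is assigned like a variable
print(squares[:5])  # [0, 1, 4, 9, 16]
```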
Function
A function has a name, parameters (inputs),
code to execute, and a return value.
Example:
fibonacci(n)
    f[0] <- 1
    f[1] <- 1
    for i <- 2 to n-1
        f[i] <- f[i-1]+f[i-2]
    return f[n-1]
Calling a Function
Write the function name, and the actual
values for the parameters
When the function is complete, the
return value replaces the name of the
function in any expression.
Example:
print(fibonacci(8))
prints 21
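The function and the call can be checked with a short Python sketch. It follows the pseudocode closely; using a growing list for `f` is an implementation choice made for this example.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, counting 1, 1, 2, 3, ... from n = 1."""
    f = [1, 1]                        # f[0] and f[1]
    for i in range(2, n):             # i runs 2..n-1
        f.append(f[i - 1] + f[i - 2])
    return f[n - 1]

print(fibonacci(8))  # 21
```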
Exercise
Find a set of instructions, written in English, on
the Internet, preferably for a non-mathematical
task.
There should be at least 5 steps, and a
condition or a loop, preferably both.
Rewrite the instructions in pseudocode.
Possibilities:
Instruction manuals
Government sites
Game descriptions
Comparing Algorithms
There can be many different algorithms
to solve the same problem
Better algorithms
Get the correct answer (if possible)
Get better answers (otherwise)
Use fewer resources (time and space)
Algorithm Complexity
Time
How long does the algorithm take?
Abstract: we don't want the answer to depend on
which machine!
Space
How much space (arrays, variables) does
the algorithm need?
Time Complexity of an Algorithm

What we want to do is relate
1. the amount of work performed by an algorithm
2. and the algorithm's input size
by a fairly simple formula.
STEPS FOR DETERMINING THE TIME COMPLEXITY OF AN ALGORITHM
1. Determine how you will measure input size. Ex:

N items in a list
N x M table (with N rows and M columns)
Two numbers of length N
2. Choose an operation (or perhaps two operations) to count as
a gauge of the amount of work performed. Ex:
Comparisons
Swaps
Copies
Additions
Normally we don't count operations in input/output.

3. Decide whether you wish to count operations in the
Best case? - the fewest possible operations
Worst case? - the most possible operations
Average case? - this is harder, as it is not always clear what is meant
by an "average case". Normally calculating this case
requires some higher mathematics such as probability
theory.
4. For the algorithm and the chosen case (best, worst,
average), express the count as a function of the input size of
the problem.
For example, we determine by counting, statements such as ...
EXAMPLES:
For n items in a list, counting the operation
swap, we find the algorithm performs 10n +
5 swaps in the worst case.
For an n x m table, counting additions, we
find the algorithm performs nm additions in
the best case.
For two numbers of length n, there are 3n +
20 multiplications in the best case.

5. Given the formula that you have determined, decide the
complexity class of the algorithm.
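The steps above can be tried on a small case. The sketch below uses linear search, which is not one of the examples in these slides and is chosen here only for illustration: the input size is the number of items n, the operation counted is the comparison, and the best and worst cases are measured separately.

```python
def linear_search_count(items, target):
    """Search left to right; return (found, number of comparisons made)."""
    comparisons = 0
    for item in items:
        comparisons += 1          # the operation we chose to count
        if item == target:
            return True, comparisons
    return False, comparisons

data = list(range(1, 11))               # input size n = 10
print(linear_search_count(data, 1))     # best case: (True, 1)
print(linear_search_count(data, 99))    # worst case: (False, 10)
```

Under these assumptions the best case takes 1 comparison and the worst case takes n comparisons.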
What is the complexity class of an algorithm?
Question: Is there really much difference between
3n
5n + 20
and
6n -3
especially when n is large?
But, there is a huge difference, for n large, between

n
n^2
and
n^3
So we try to classify algorithms into classes, based on their
counts and simple formulas such as n, n^2, n^3, and others.
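A quick way to see the difference is to evaluate the formulas for a few input sizes. The Python sketch below does only that; the helper `counts` is hypothetical.

```python
def counts(n):
    """Operation counts given by several formulas at input size n."""
    return {"3n": 3 * n, "5n + 20": 5 * n + 20, "6n - 3": 6 * n - 3,
            "n^2": n ** 2, "n^3": n ** 3}

for n in (10, 1000, 100000):
    print(n, counts(n))
```

For large n the three linear formulas stay within a small constant factor of each other, while n^2 and n^3 grow far beyond them.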
Why does this matter?

It is the complexity of an algorithm that most
affects its running time--not the machine or its speed
ORDER WINS OUT
The TRS-80
Main language support: BASIC, typically a slow-running language
For more details on TRS-80 see:
http://en.wikipedia.org/wiki/TRS-80
The CRAY-YMP
Language used in example: FORTRAN, a fast-running language
For more details on CRAY-YMP see:
http://en.wikipedia.org/wiki/Cray_Y-MP
              CRAY Y-MP with FORTRAN    TRS-80 with BASIC
              complexity is 3n^3        complexity is 19,500,000n
n is:
10            3 microsec                200 millisec
100           3 millisec                2 sec
1000          3 sec                     20 sec
2500          50 sec                    50 sec
10000         49 min                    3.2 min
1000000       95 years                  5.4 hours
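The point where the slow machine overtakes the fast one can be computed from the two formulas alone. The Python sketch below makes no assumption about time units; the function names are hypothetical.

```python
import math

def cray_work(n):
    return 3 * n ** 3            # count from the table: 3n^3

def trs80_work(n):
    return 19_500_000 * n        # count from the table: 19,500,000n

# The two counts are equal when 3n^3 = 19,500,000n, i.e. n^2 = 6,500,000.
crossover = math.sqrt(19_500_000 / 3)
print(round(crossover))          # 2550
```

Below that input size the 3n^3 algorithm does less work; above it the 19,500,000n algorithm does, which agrees with the two machines being roughly tied at n = 2500.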
Trying to maintain an exact count for an operation isn't too useful.

Thus, we group algorithms that have counts such as
n
3n + 20
1000n - 12
0.00001n +2
together. We say algorithms with these types of counts are in the
class Θ(n), read as the class of theta-of-n, or
all algorithms of magnitude n, or
all order-n algorithms.
Similarly, algorithms with counts such as

n^2 + 3n
(1/2)n^2 + 4n - 5
1000n^2 + 2.54n + 11
are in the class Θ(n^2).
Other typical classes are those with easy formulas in n such
as
1
n^3
2^n
lg n, where k = lg n if and only if 2^k = n
lg 4 = ?
lg 8 = ?
lg 16 = ?
lg 10 = ?
Note that all of these are base 2 logarithms. You won't need
a logarithm table, as we don't need exact values
(except on integer powers of 2).
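To check answers to the lg questions, Python's standard `math.log2` function computes base 2 logarithms. A minimal sketch:

```python
import math

for n in (4, 8, 16, 10):
    # lg n is the power k for which 2^k = n
    print(n, math.log2(n))
```

The powers of 2 give whole numbers, while lg 10 falls between 3 and 4 because 10 lies between 2^3 = 8 and 2^4 = 16.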
Look at the curves showing the growth for
algorithms in
Θ(1), Θ(n), Θ(n^2), Θ(n^3), Θ(lg n), Θ(n lg n), Θ(2^n)
These are the major ones we'll use.
Figure 3.4: Work = cn for Various Values of c
Figure 3.10: Work = cn^2 for Various Values of c
Figure 3.11: A Comparison of n and n^2
Figure 3.21: A Comparison of n and lg n
Figure 3.25: Comparisons of lg n, n, n^2, and 2^n
Making Change
US Change Problem
Input: amount of money, M, in cents
Output: smallest number of coins (quarters,
dimes, nickels, and pennies) that add up to M
US Change Algorithm
While M>0
    c <- value of largest coin with value <= M
    give c coin to customer
    M <- M-c
(See mathematical version, p. 19)
US Change Examples:
77c
Quarter (77-25 = 52)
Quarter (52-25 = 27)
Quarter (27-25 = 2)
Penny (2-1 = 1)
Penny (1-1 = 0)
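The US change algorithm and the 77-cent example can be reproduced with a short Python sketch. The function name `us_change` and the choice to return the coins as a list are assumptions made for this illustration.

```python
def us_change(m):
    """Greedy change for m cents; returns the list of coins handed over."""
    coins = [25, 10, 5, 1]                   # quarters, dimes, nickels, pennies
    given = []
    while m > 0:
        c = max(v for v in coins if v <= m)  # largest coin with value <= m
        given.append(c)                      # give c coin to customer
        m = m - c
    return given

print(us_change(77))  # [25, 25, 25, 1, 1]
```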
Same Algorithm using Mathematics
Calculate and give the max # of quarters (25c):
    Q <- floor(M / 25)
    Give customer Q quarters
    M <- M - 25*Q
Calculate and give the max # of dimes (10c).

Calculate and give the max # of nickels (5c).
Give the rest in pennies.
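The mathematical version can be sketched in Python with integer division, which gives the floor of the quotient for non-negative amounts. The function `us_change_counts` is hypothetical and returns how many of each coin to give.

```python
def us_change_counts(m):
    """Return how many of each US coin the floor-division method gives."""
    counts = {}
    for value in (25, 10, 5, 1):
        counts[value] = m // value      # floor(m / value)
        m = m - value * counts[value]   # amount still owed
    return counts

print(us_change_counts(77))  # {25: 3, 10: 0, 5: 0, 1: 2}
```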
Generalizing the Algorithm

Suppose a non-US money system has
d (different) coins of denominations:
c[0], c[1], ..., c[d-1]
And, c[0] > c[1] > ... > c[d-1]
Then, we can generalize the algorithm:
Generalizing, continued
for i <- 0 to d-1
    Calculate and give the max number of this coin (c[i])
Generalized Algorithm is Not Correct!
c = {25, 20, 10, 5, 1}
M = 40
Result is 3 coins (25 + 10 + 5), not the 2 coins (20 + 20) that are possible
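The counterexample can be run directly. The Python sketch below is one way to write the generalized algorithm; the name `greedy_change` is hypothetical, and the denominations are assumed to be listed from largest to smallest.

```python
def greedy_change(m, denominations):
    """Greedy change for denominations listed from largest to smallest."""
    given = []
    for c in denominations:
        count = m // c                  # max number of this coin
        given.extend([c] * count)
        m -= c * count
    return given

print(greedy_change(40, [25, 20, 10, 5, 1]))  # [25, 10, 5]
```

Taking the 25 first leaves 15, which can no longer be paid with a 20, so the method never finds the two-coin answer.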
Brute Force: Correct but Slow

N <- 1
While (not done)
    construct a combination of N coins
    if it adds up to M
        return it (done=true)
    else if no more combinations of N coins
        N <- N+1
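One possible Python rendering of the brute-force search uses `itertools.combinations_with_replacement` to construct every combination of N coins. It is a sketch that assumes M is positive and can actually be made from the given denominations (otherwise the loop would never halt).

```python
from itertools import combinations_with_replacement

def brute_force_change(m, denominations):
    """Try every combination of 1 coin, then 2 coins, ... until one sums to m."""
    n = 1
    while True:
        for combo in combinations_with_replacement(denominations, n):
            if sum(combo) == m:
                return list(combo)      # fewest coins, since smaller n came first
        n += 1                          # no combination of n coins worked

print(brute_force_change(40, [25, 20, 10, 5, 1]))  # [20, 20]
```

Because every combination of N coins is tried before N + 1, the first answer found uses the fewest coins, but the number of combinations grows quickly as N increases.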
Improving the Change Algorithm
Try combinations in a useful order
We've already done this, looking at fewer coins first.
Generate as few combinations as possible
Use knowledge to bound the combinations tried.
