Sorting Algorithms

• In computer science and mathematics, a sorting algorithm is an algorithm that puts
elements of a list in a certain order.
• The most-used orders are numerical order and lexicographical order.
In-Place Sorting Algorithms
• In computer science, an in-place algorithm is an algorithm which transforms a data
structure using a small, constant amount of extra storage space.
• The input is usually overwritten by the output as the algorithm executes.
• An algorithm which is not in-place is sometimes called not-in-place or out-of-place.
• A sorting algorithm is said to be in-place if it requires very little additional space besides
the initial array holding the elements that are to be sorted.
• Normally “very little” is taken to mean that for sorting n elements, at most O(log n)
extra space is required.
• There are a number of sorting algorithms that can rearrange arrays into sorted order
in-place, including:
– Bubble sort
– Selection sort
– Insertion sort
– Heapsort
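• Mergesort is not an in-place sorting algorithm: the standard implementation needs Θ(n)
auxiliary space for merging.
• Quicksort partitions the array in place, but its recursion needs stack space: O(log n)
when it recurses on the smaller partition first, and up to O(n) in a naive implementation.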
Stable Sorting Algorithms
• Stable sorting algorithms maintain the relative order of records with equal keys (i.e.,
sort key values).
• That is, a sorting algorithm is stable if whenever there are two records R and S with the
same key and with R appearing before S in the original list, R will appear before S in the
sorted list.
• Assume that the following pairs of numbers are to be sorted by their first component:
(4, 1) (3, 7) (3, 1) (5, 6)
• In this case, two different results are possible, one which maintains the relative order of
records with equal keys, and one which does not:
(3, 7) (3, 1) (4, 1) (5, 6) (order maintained)
(3, 1) (3, 7) (4, 1) (5, 6) (order changed)
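The example above can be reproduced with a short C sketch. The `Pair` type and the `stable_sort_by_first` function are hypothetical names introduced here for illustration, not part of the original notes. The sketch uses insertion sort and shifts an element only when its key is strictly greater, which is what keeps equal keys in their original order.

```c
#include <stddef.h>

/* Hypothetical record type: `first` is the sort key, `second` is payload. */
typedef struct {
    int first;
    int second;
} Pair;

/* Insertion sort on the first component only. An element is shifted only
 * when its key is strictly greater than the key being inserted, so records
 * with equal keys never pass each other (the sort is stable). */
void stable_sort_by_first(Pair *a, size_t n) {
    for (size_t j = 1; j < n; j++) {
        Pair key = a[j];
        size_t i = j;
        while (i > 0 && a[i - 1].first > key.first) {
            a[i] = a[i - 1];
            i--;
        }
        a[i] = key;
    }
}
```

Applied to (4, 1) (3, 7) (3, 1) (5, 6), this yields (3, 7) (3, 1) (4, 1) (5, 6), the "order maintained" result. Changing `>` to `>=` in the inner loop would still sort by key but would no longer be stable.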

Sorting Algorithm    Stable
Bubble Sort          Yes
Selection Sort       No
Insertion Sort       Yes
Heapsort             No
Quicksort            No
Mergesort            Yes

• Internal Sort
– The data to be sorted is all stored in the computer’s main memory.
• External Sort
– Some of the data to be sorted might be stored in some external, slower, device.

• In-memory (internal) sorting: used when the file to be sorted fits in available memory.
• External sorting: used when the file to be sorted is larger than available memory.

Loop Invariants
• Loop invariants provide us with a way to reason about loops, and can be used to verify
that the loops we design are correct.
• Simply stated, a loop invariant is a relationship among variables in a program that is
true when control enters a loop, remains true each time the body of the loop is executed,
and is still true when the loop is exited.
The General Form of a Loop
• It is important to understand a bit about the anatomy of a loop, and to discuss some loop
terminology. There are three essential components to any loop.
1. Initialization for the loop.
2. The reason for looping, or the termination condition for the
loop.
3. The loop body, made up of the statements to be executed each
time the program goes through the loop. The loop body must guarantee
that the loop terminates in some finite number of steps.
Example
• Consider the problem of computing the value of n! where n is an integer greater than 1.
We know from math that
(1) n! = 1 * 2 * 3 * ... * (n-1) * n
• The strategy that we want to use to solve this problem is to come up with a loop that
creates each new integer value in turn, and then multiplies it by the product of the previously
computed integers.
• To do this, define the program variables j and product. The variable j will hold each new
integer value as we compute it, and product will hold the product of j and the previously
computed integer values.
• Clearly, when we are finished,
j = n and
product = j!
• It should be easy to see that the loop body will compute
product = product * j;
and we will want the loop to run as long as
j < n.
• The loop can therefore be constructed as follows:
• Every time that we loop back and start to execute the loop body again, we know that j
will be less than n.
• This is the reason for looping. Inside the body of the loop we need to generate one new
integer value with the statement
j = j + 1;
and use that new value to calculate a new product.
• The loop body therefore has two statements:
j = j + 1;
product = product * j;
• Since we know that every loop needs some initialization, we need to provide that here.
Our equation (1) for factorial starts with the integer 1. Furthermore, we know that 1! = 1. So,
let's write
j = 1;
product = 1;
• The finished loop now looks like
j = 1;
product = 1;
// At this point we know that product = j! and j < n
while ( j < n )
{
    j = j + 1;
    product = product * j;
    // At this point we also know that product = j! and j <= n
}
// At this point, we know that product = j! and j = n
• We see from the above that there is one condition that is true before we start the loop, it is
true each time we complete the body of the loop, and it is true when we exit the loop. This
condition, product = j!, is called the loop invariant.
Using Loop Invariants to Construct Loops
• The steps that we just went through can be used in any similar situation, to construct a
loop.
• In summary, these steps are:
1. Come up with a loop strategy that solves the problem.
2. Determine the set of variables required in the loop.
3. Express the result desired when the loop exits, in terms of the loop variables.
4. Write down the reason for leaving the loop and the loop invariant. These can
usually be written in terms of the desired result when the loop exits.
5. Construct the initialization and the loop body.
Loop Invariants
• A loop invariant is a condition that is true when you enter a loop and is still true
when you leave each iteration of the loop.
• Used to determine if you have written your loop correctly.
int fact(int n) {
    int prod = 1;
    int k = 0;
    while (k < n) {
        k = k + 1;
        prod = prod * k;
    }
    return prod;
}
Loop invariant : prod = k!
When loop terminates, k = n, hence prod = n!
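One way to see the invariant at work is to check it at run time. The sketch below repeats the loop from `fact` and asserts prod = k! before the loop, after each iteration, and at exit. The names `fact_checked` and `factorial_reference` are hypothetical helpers added for illustration, and the sketch assumes 0 ≤ n ≤ 12 so that the result fits in a 32-bit `long`.

```c
#include <assert.h>

/* Reference factorial, used only to check the invariant (hypothetical helper). */
static long factorial_reference(int k) {
    return (k <= 1) ? 1 : k * factorial_reference(k - 1);
}

/* Same loop as fact(), with the invariant prod = k! asserted.
 * Assumes 0 <= n <= 12 so that n! fits in a 32-bit long. */
long fact_checked(int n) {
    long prod = 1;
    int k = 0;
    assert(prod == factorial_reference(k));      /* initialization: 0! = 1 */
    while (k < n) {
        k = k + 1;
        prod = prod * k;
        assert(prod == factorial_reference(k));  /* maintenance */
    }
    assert(k == n);                              /* termination: prod = n! */
    return prod;
}
```

The assertions are for study only; a production version would drop them, since recomputing k! on every iteration makes the loop quadratic.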
Insertion Sort
Design approach: Incremental
Sorts in place: Yes
Best case: Θ(n)
Worst case: Θ(n^2)
• Idea: like sorting a hand of playing cards
– Start with an empty left hand and the cards facing down on the table.
– Remove one card at a time from the table, and insert it into the correct position in the left
hand
• compare it with each of the cards already in the hand, from right to left
– The cards held in the left hand are sorted
• these cards were originally the top cards of the pile on the table.

[Figure: a sorted hand holding 6 10 24 36 and a new card 12. To insert 12, we need to make
room for it by moving first 36 and then 24, giving 6 10 12 24 36.]
• Input array: 5 2 4 6 1 3
• At each iteration, the array is divided in two sub-arrays: the left sub-array is sorted
and the right sub-array is unsorted. The key is the element currently being inserted.
[Figure: array a1 . . a8 at positions 1 . . 8, with the sorted left part, the unsorted right
part, and the key marked.]
Loop Invariants and the Correctness of Insertion Sort
• Proving loop invariants works like induction
• Initialization (base case):
– It is true prior to the first iteration of the loop
• Maintenance (inductive step):
– If it is true before an iteration of the loop, it remains true before the next iteration
([i-1] true ⇒ [i] true)
• Termination:
– When the loop terminates, the invariant gives us a useful property that helps show
that the algorithm is correct
– Stop the induction when the loop terminates
• When the first two properties hold, the loop invariant is true prior to every
iteration of the loop.
• Loop invariant: at the start of each iteration of the for loop, the subarray A[1 . . j-1]
consists of the elements originally in A[1 . . j-1], but in sorted order.
• The invariant helps us understand why the algorithm is correct.
• Initialization: j = 2, so A[1 . . j-1] = A[1] consists of the single element A[1], which is
the original element in A[1] and is trivially sorted. The loop invariant holds prior to the
first iteration of the loop.
• Maintenance: the inner while loop moves A[j-1], A[j-2], A[j-3], and so on, one position to
the right until the proper position for key (which has the value that started out in A[j]) is
found. At that point, the value of key is placed into this position, so A[1 . . j] holds the
original elements of A[1 . . j] in sorted order. The second property holds for the outer loop.
• Termination:
– The outer for loop ends when j = n + 1 ⇒ j-1 = n
– Substitute n for j-1 in the loop invariant: the subarray A[1 . . n] consists of the
elements originally in A[1 . . n], but in sorted order.
• The entire array is sorted!
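The three proof steps can be mirrored in code. The following is a minimal C sketch of insertion sort with the sortedness part of the invariant asserted at the start of every outer iteration and again at termination. It uses 0-based indexing, so the prefix a[0 . . j-1] here corresponds to A[1 . . j-1] in the notes. The helper `is_sorted_prefix` is a hypothetical name, and the check covers only the "in sorted order" half of the invariant, not the "same elements" half.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[0..len-1] is in non-decreasing order (hypothetical helper). */
static int is_sorted_prefix(const int *a, size_t len) {
    for (size_t i = 1; i < len; i++) {
        if (a[i - 1] > a[i]) return 0;
    }
    return 1;
}

/* Insertion sort with the loop invariant checked by assertions. */
void insertion_sort(int *a, size_t n) {
    for (size_t j = 1; j < n; j++) {
        /* Initialization and maintenance: a[0..j-1] is sorted here. */
        assert(is_sorted_prefix(a, j));
        int key = a[j];
        size_t i = j;
        /* Shift larger elements one position right to make room for key. */
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];
            i--;
        }
        a[i] = key;
    }
    /* Termination: the prefix is now the whole array. */
    assert(is_sorted_prefix(a, n));
}
```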


Analysis of Insertion Sort

Alg.: INSERTION-SORT(A)                                       Cost       Times
for j ← 2 to n                                                c1         n
    do key ← A[j]                                             c2         n − 1
       Insert A[j] into the sorted sequence A[1 . . j−1]      c3 (= 0, comment)   n − 1
       i ← j − 1                                              c4         n − 1
       while i > 0 and A[i] > key                             c5         $\sum_{j=2}^{n} t_j$
           do A[i + 1] ← A[i]                                 c6         $\sum_{j=2}^{n} (t_j - 1)$
              i ← i − 1                                       c7         $\sum_{j=2}^{n} (t_j - 1)$
       A[i + 1] ← key                                         c8         n − 1

t_j: number of times the while statement is executed at iteration j
\[ T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1) \]


Best Case Analysis


• The array is already sorted
– A[i] ≤ key upon the first time the while loop test is run (when i = j -1)
– tj = 1
• T(n) = c1 n + c2 (n − 1) + c4 (n − 1) + c5 (n − 1) + c8 (n − 1)
= (c1 + c2 + c4 + c5 + c8) n − (c2 + c4 + c5 + c8)
= an + b = Θ(n)
Worst Case Analysis
• The array is in reverse sorted order
– Always A[i] > key in while loop test
– Have to compare key with all elements to the left of the j-th position ⇒ compare
with j−1 elements ⇒ t_j = j. Using
\[ \sum_{j=1}^{n} j = \frac{n(n+1)}{2} \;\Rightarrow\; \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1 \;\Rightarrow\; \sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2} \]
we have
\[ T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right) + c_6 \frac{n(n-1)}{2} + c_7 \frac{n(n-1)}{2} + c_8 (n-1) = an^2 + bn + c \]
a quadratic function of n

• T(n) = Θ(n^2): order of growth in n^2
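The best-case and worst-case values of t_j can be checked empirically. The sketch below is an instrumented insertion sort (0-based indexing) that returns the total number of times the while-loop test is evaluated, that is, the sum of t_j over all outer iterations. The function name `count_while_tests` is a hypothetical choice for this illustration.

```c
#include <stddef.h>

/* Insertion sort that returns the total number of evaluations of the
 * while-loop test, i.e. the sum of t_j over all outer iterations. */
long count_while_tests(int *a, size_t n) {
    long tests = 0;
    for (size_t j = 1; j < n; j++) {
        int key = a[j];
        size_t i = j;
        for (;;) {
            tests++;  /* one evaluation of "i > 0 and A[i] > key" */
            if (!(i > 0 && a[i - 1] > key)) break;
            a[i] = a[i - 1];
            i--;
        }
        a[i] = key;
    }
    return tests;
}
```

For n = 6, an already sorted input gives n − 1 = 5 tests (t_j = 1 each), while a reverse-sorted input gives n(n + 1)/2 − 1 = 20 tests (t_j = j), matching the two analyses.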


Average-case Analysis
• Want to determine the average number of comparisons taken over all possible inputs.
• Determine the average no. of comparisons for a key A[j].
• A[j] can belong to any of the j locations, 1..j, with equal probability.
• The number of key comparisons for A[j] is j−k+1 if A[j] belongs to location k, 1 < k ≤ j,
and is j−1 if it belongs to location 1.
Average no. of comparisons for inserting key A[j] is:
\[ \sum_{k=1}^{j-1} \frac{1}{j}\, k + \frac{1}{j}(j-1) = \frac{1}{j} \sum_{k=1}^{j-1} k + 1 - \frac{1}{j} = \frac{j-1}{2} + 1 - \frac{1}{j} = \frac{j+1}{2} - \frac{1}{j} \]

Summing over the no. of comparisons for all keys,
\[ C_{avg}(n) = \sum_{i=2}^{n} \left( \frac{i+1}{2} - \frac{1}{i} \right) = \frac{n^2}{4} + \frac{3n}{4} - 1 - \sum_{i=2}^{n} \frac{1}{i} = \Theta(n^2) - O(\ln n) = \Theta(n^2) \]

Therefore, Tavg(n) = Θ(n^2)
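The closed form for C_avg(n) can be checked by brute force for small n. The sketch below sums the number of key comparisons (evaluations of A[i] > key) that insertion sort makes over every permutation of n distinct values. The function names and the `MAX_N` bound are assumptions made for this illustration.

```c
#include <stddef.h>

#define MAX_N 8  /* assumed upper bound on n; 8! = 40320 permutations */

/* Insertion sort returning the number of key comparisons (A[i] > key tests). */
static long insertion_sort_comparisons(int *a, size_t n) {
    long comparisons = 0;
    for (size_t j = 1; j < n; j++) {
        int key = a[j];
        size_t i = j;
        while (i > 0) {
            comparisons++;
            if (a[i - 1] <= key) break;
            a[i] = a[i - 1];
            i--;
        }
        a[i] = key;
    }
    return comparisons;
}

/* Sums the comparison counts over all permutations of perm[0..n-1].
 * Permutations are generated by swapping each candidate into slot k.
 * Call with k = 0 and n <= MAX_N. */
long total_comparisons(int *perm, size_t k, size_t n) {
    if (k == n) {
        int copy[MAX_N];
        for (size_t i = 0; i < n; i++) copy[i] = perm[i];
        return insertion_sort_comparisons(copy, n);
    }
    long total = 0;
    for (size_t i = k; i < n; i++) {
        int tmp = perm[k]; perm[k] = perm[i]; perm[i] = tmp;
        total += total_comparisons(perm, k + 1, n);
        tmp = perm[k]; perm[k] = perm[i]; perm[i] = tmp;  /* undo the swap */
    }
    return total;
}
```

For n = 4 the total is 118 comparisons over 4! = 24 permutations, an average of 59/12, which agrees with n^2/4 + 3n/4 − 1 − (1/2 + 1/3 + 1/4) = 6 − 13/12.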


Mergesort
Design approach: divide and conquer
Sorts in place: No
Running time: Θ(n log n)

• Recursive in structure
– Divide the problem into sub-problems that are similar to the original but smaller
in size
– Conquer the sub-problems by solving them recursively. If they are small enough,
just solve them in a straightforward manner.
– Combine the solutions to create a solution to the original problem
Sorting Problem: Sort a sequence of n elements, A[p . . r], into non-decreasing order.

– Divide: Divide the n-element sequence to be sorted into two subsequences of n/2
elements each

– Conquer: Sort the two subsequences recursively using merge sort.

– Combine: Merge the two sorted subsequences to produce the sorted answer.
Merge Sort – Example
Divide (each line splits every subsequence of the previous line in half):
[18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2]
[18 26 32 6 43 15 9 1] [22 26 19 55 37 43 99 2]
[18 26 32 6] [43 15 9 1] [22 26 19 55] [37 43 99 2]
[18 26] [32 6] [43 15] [9 1] [22 26] [19 55] [37 43] [99 2]
[18] [26] [32] [6] [43] [15] [9] [1] [22] [26] [19] [55] [37] [43] [99] [2]

Merge Sort – Example
Original sequence 18 26 32 6 43 15 9 1, merged bottom-up into the sorted sequence:
[18] [26] [32] [6] [43] [15] [9] [1]
[18 26] [6 32] [15 43] [1 9]
[6 18 26 32] [1 9 15 43]
[1 6 9 15 18 26 32 43]
Merge Sort
Alg.: MERGE-SORT(A, p, r)
if p < r                              Check for base case
    then q ← ⌊(p + r)/2⌋              Divide
         MERGE-SORT(A, p, q)          Conquer
         MERGE-SORT(A, q + 1, r)      Conquer
         MERGE(A, p, q, r)            Combine
[Figure: array A with indices 1 . . 8 holding 5 2 4 7 1 3 2 6, with p, q and r marked.]

Initial call: MERGE-SORT(A, 1, n)

Example – n Power of 2
Divide (indices 1 . . 8, q = 4 at the top level):
[5 2 4 7 1 3 2 6]
[5 2 4 7] [1 3 2 6]
[5 2] [4 7] [1 3] [2 6]
[5] [2] [4] [7] [1] [3] [2] [6]
Conquer and Merge:
[5] [2] [4] [7] [1] [3] [2] [6]
[2 5] [4 7] [1 3] [2 6]
[2 4 5 7] [1 2 3 6]
[1 2 2 3 4 5 6 7]

Example – n Not a Power of 2
Divide (indices 1 . . 11, q = 6 at the top level, then q = 3 and q = 9):
[4 7 2 6 1 4 7 3 5 2 6]
[4 7 2 6 1 4] [7 3 5 2 6]
[4 7 2] [6 1 4] [7 3 5] [2 6]
[4 7] [2] [6 1] [4] [7 3] [5] [2] [6]
[4] [7] [2] [6] [1] [4] [7] [3] [5] [2] [6]
Conquer and Merge:
[4] [7] [2] [6] [1] [4] [7] [3] [5] [2] [6]
[4 7] [2] [1 6] [4] [3 7] [5] [2] [6]
[2 4 7] [1 4 6] [3 5 7] [2 6]
[1 2 4 4 6 7] [2 3 5 6 7]
[1 2 2 3 4 4 5 6 6 7 7]

Merging

[Figure: A[1 . . 8] = 2 4 5 7 1 2 3 6 with p = 1, q = 4, r = 8, so A[p . . q] = 2 4 5 7 and
A[q + 1 . . r] = 1 2 3 6.]
• Input: Array A and indices p, q, r such that p ≤ q < r
– Subarrays A[p . . q] and A[q + 1 . . r] are sorted
• Output: One single sorted subarray A[p . . r]

• Idea for merging:
– Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down onto the output pile
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
Compute n1 and n2:
    n1 ← q – p + 1
    n2 ← r – q
Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]:
    for i ← 1 to n1
        do L[i] ← A[p + i – 1]
    for j ← 1 to n2
        do R[j] ← A[q + j]
    L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
i ← 1; j ← 1
for k ← p to r
    do if L[i] ≤ R[j]
        then A[k] ← L[i]
             i ← i + 1
        else A[k] ← R[j]
             j ← j + 1
Sentinels (∞) avoid having to check at each step whether either subarray is fully copied.
[Figure: A[1 . . 8] = 2 4 5 7 1 2 3 6 is copied into L = 2 4 5 7 ∞ (from A[p . . q]) and
R = 1 2 3 6 ∞ (from A[q + 1 . . r]).]
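The MERGE and MERGE-SORT procedures can be sketched in C as follows. This version uses 0-based inclusive indices, so the initial call is `merge_sort(a, 0, n - 1)` rather than MERGE-SORT(A, 1, n). It uses `INT_MAX` as the ∞ sentinel and therefore assumes that no array element equals `INT_MAX`. The `int` status return for allocation failure is a design choice of this sketch, not something the pseudocode specifies.

```c
#include <limits.h>
#include <stdlib.h>

/* Merges sorted a[p..q] and a[q+1..r] (0-based, inclusive bounds).
 * INT_MAX plays the role of the sentinel, so elements must be < INT_MAX.
 * Returns 0 on success, -1 if memory allocation fails. */
int merge(int *a, size_t p, size_t q, size_t r) {
    size_t n1 = q - p + 1;
    size_t n2 = r - q;
    int *left = malloc((n1 + 1) * sizeof *left);
    int *right = malloc((n2 + 1) * sizeof *right);
    if (left == NULL || right == NULL) {
        free(left);
        free(right);
        return -1;
    }
    for (size_t i = 0; i < n1; i++) left[i] = a[p + i];
    for (size_t j = 0; j < n2; j++) right[j] = a[q + 1 + j];
    left[n1] = INT_MAX;   /* sentinels: never chosen before a real element */
    right[n2] = INT_MAX;

    size_t i = 0, j = 0;
    for (size_t k = p; k <= r; k++) {
        /* "<=" takes from the left run on ties, which keeps the sort stable. */
        if (left[i] <= right[j]) a[k] = left[i++];
        else                     a[k] = right[j++];
    }
    free(left);
    free(right);
    return 0;
}

/* Sorts a[p..r] (inclusive). Returns 0 on success, -1 on allocation failure. */
int merge_sort(int *a, size_t p, size_t r) {
    if (p < r) {
        size_t q = p + (r - p) / 2;                    /* divide */
        if (merge_sort(a, p, q) != 0) return -1;       /* conquer */
        if (merge_sort(a, q + 1, r) != 0) return -1;   /* conquer */
        return merge(a, p, q, r);                      /* combine */
    }
    return 0;
}
```

Allocating the temporary arrays inside every call keeps the sketch close to the pseudocode; a tuned implementation would more likely allocate one scratch buffer up front.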
Example: MERGE(A, 9, 12, 16)
[Figure: step-by-step trace of MERGE(A, 9, 12, 16) with p = 9, q = 12, r = 16, continuing
until the merged subarray A[9 . . 16] is complete.]


Correctness of Merge Sort

Loop invariant (at the start of each iteration of the for loop):
– A[p . . k-1] contains the k − p smallest elements of L[1 . . n1 + 1] and R[1 . . n2 + 1],
in sorted order
– L[i] and R[j] are the smallest elements of their arrays not yet copied back to A
Proof of the Loop Invariant

Initialization
Prior to first iteration: k = p
⇒ subarray A[p..k-1] is empty
A[p..k-1] contains the k – p = 0 smallest elements
of L and R
L and R are sorted arrays (i = j = 1)
⇒ L[1] and R[1] are the smallest elements in L and
R

Maintenance
Assume L[i] ≤ R[j] ⇒ L[i] is the smallest element not yet copied to A
After copying L[i] into A[k], A[p..k] contains the k – p + 1 smallest elements of L and R
Incrementing k (for loop) and i reestablishes the loop invariant
The case L[i] > R[j] is symmetric: R[j] is copied into A[k] and j is incremented
Termination
At termination k = r + 1
By the loop invariant: A[p..k-1] = A[p..r] contains the k – p = r – p + 1 smallest
elements of L and R in sorted order
This is exactly the number of elements to be sorted
⇒ MERGE(A, p, q, r) is correct

Running Time of Merge

Initialization (copying into temporary arrays): Θ(n1 + n2) = Θ(n)
Adding the elements to the final array (the last for loop):
n iterations, each taking constant time ⇒ Θ(n)
Total time for Merge: Θ(n)
Analyzing Divide-and-Conquer Algorithms
The recurrence is based on the three steps of the paradigm:
T(n) – running time on a problem of size n
Divide the problem into a subproblems, each of size n/b: takes D(n)
Conquer (solve) the subproblems: takes aT(n/b)
Combine the solutions: takes C(n)
\[ T(n) = \begin{cases} \Theta(1) & \text{if } n \le c \\ aT(n/b) + D(n) + C(n) & \text{otherwise} \end{cases} \]
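As an illustration, merge sort fits this recurrence with a = 2 and b = 2. The sketch below evaluates a concrete instance, T(n) = 2T(n/2) + n with T(1) = 1, for n a power of two, and compares it with the closed form n log2(n) + n. Modeling D(n) + C(n) as exactly n and the base cost as 1 are simplifying assumptions, and both function names are hypothetical.

```c
/* T(n) = 2 T(n/2) + n with T(1) = 1, for n a power of two.
 * Here a = 2, b = 2, and D(n) + C(n) is modeled as exactly n (an assumption). */
unsigned long recurrence_cost(unsigned long n) {
    if (n <= 1) return 1;
    return 2 * recurrence_cost(n / 2) + n;
}

/* Closed form n * log2(n) + n, valid for n a power of two. */
unsigned long closed_form_cost(unsigned long n) {
    unsigned long log2n = 0;
    for (unsigned long m = n; m > 1; m /= 2) log2n++;
    return n * log2n + n;
}
```

The two functions agree for every power of two, which is consistent with the Θ(n log n) running time stated for merge sort.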
