
Sorting

• Computers often have to do searching
• Searching is far more efficient when data is sorted
• Therefore sorting is also important

Problem: given n data items in random sequence,
put them in order
… as quickly as possible!

• Limitation: can only compare two items at a time
• Constant: the time it takes to make a comparison, and
  the time it takes to move a data item
• Usually: just consider the number of comparisons

Objectives
How do these incremental sorts work?
• selection sort
• bubble sort
How do these divide-and-conquer sorts work?
• merge sort
• quicksort
How do we measure the performance of a sort algorithm?
• limitations and constants
• rate of growth function
  - linear
  - quadratic
  - exponential
• performance classes
  - n²
  - n log₂ n

#1: Selection sort
1. Find the smallest item in the data set
2. Swap it with the first item
3. Shorten the data set by ignoring the first item
4. Keep going until there is only one data item left

How many comparisons must be made to find the smallest item?

The beer bottle algorithm

1. Call the first item the (current) smallest
2. Compare the (current) smallest with the next item
3. If a smaller one is found, call it the (current) smallest

n – 1 comparisons to find the smallest, n – 2 to find the next smallest, …

For n = 5: 4 + 3 + 2 + 1 = 5 × 4/2 = 10
In general: n × (n – 1)/2 comparisons in all
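The steps and the comparison count above can be checked with a minimal Python sketch. The function name selection_sort and the choice to return the count alongside the sorted copy are assumptions made for illustration; they are not part of the original slides.

```python
def selection_sort(items):
    """Return (sorted copy of items, number of comparisons made)."""
    data = list(items)
    comparisons = 0
    for start in range(len(data) - 1):
        # Call the first unsorted item the (current) smallest.
        smallest = start
        for i in range(start + 1, len(data)):
            comparisons += 1
            if data[i] < data[smallest]:
                smallest = i
        # Swap the smallest into place, then ignore it from now on.
        data[start], data[smallest] = data[smallest], data[start]
    return data, comparisons


print(selection_sort([5, 1, 6, 3, 4]))  # ([1, 3, 4, 5, 6], 10)
```

With five items the count is 10, which matches 5 × 4/2. The count depends only on n, not on the initial order of the data.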



#2: Bubble sort


1. For each data item
a. compare it with the next one
b. if they are in the wrong order, swap them
2. Shorten the data set by ignoring the last item
3. Keep going until there is only one data item left

Each pass moves the largest remaining item to the end

Number of comparisons:
1. n – 1 the first time through
2. n – 2 the next time through
3. etc.

For n = 10: 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 10 × 9/2 = 45

In general: n × (n – 1)/2 comparisons, the same as selection sort
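As a rough illustration of the bubble sort steps, here is a short Python sketch that also counts comparisons. The name bubble_sort and the returned pair are illustrative assumptions, and this is the plain version from the slide with no early-exit optimisation.

```python
def bubble_sort(items):
    """Return (sorted copy of items, number of comparisons made)."""
    data = list(items)
    comparisons = 0
    # 'end' is the last index still in play; it shrinks by one per pass.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                # Wrong order: swap the neighbours.
                data[i], data[i + 1] = data[i + 1], data[i]
        # The largest remaining item has now reached position 'end'.
    return data, comparisons


print(bubble_sort([10, 9, 8, 7, 6, 5, 4, 3, 2, 1]))
```

For ten items the count comes out as 45, the same 10 × 9/2 that selection sort would need.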

Rates of growth
[Figure: growth of n², n × (n – 1)/2, n log₂ n and n, plotted for n from 0 to 45 on a vertical scale from 0 to 250]

[Figure: quadratic, n log n and linear growth, plotted for n from 0 to 900 on a vertical scale from 0 to 2000]
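To get a feel for how far apart these rates of growth are, the following Python sketch tabulates them for a few sizes. The helper name growth_row and the sample sizes are arbitrary choices for illustration.

```python
import math


def growth_row(n):
    """Return (n, n log2 n rounded, n(n-1)/2, n squared) for one size n."""
    return n, round(n * math.log2(n)), n * (n - 1) // 2, n * n


print("n", "n log2 n", "n(n-1)/2", "n^2")
for n in (16, 64, 256, 1024):
    print(*growth_row(n))
```

Doubling n roughly doubles the n log₂ n column but roughly quadruples the two quadratic columns, which is why the curves separate so quickly.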
Divide and conquer
• For a long time, people thought that all sorting
  methods would take time proportional to n²
• But in 1959, a man called Donald Shell showed that
  they were wrong
• Since then, even better methods have been found

Divide and conquer


1. Break the set into two approximately equal parts
2. Solve the problem for each part
3. Merge the results
(problem is trivial if the set contains only one item!)


#3: Merge sort


1. Recursively divide data in half
(bottom out when there is only one item)
2. Merge the subsets in the correct order

To merge:
1. Compare the first item from each subset
2. Print the smaller of the two
3. Iterate

n log₂ n comparisons

Merging two sorted subsets:

A (sorted)   B (sorted)   output
4            3            3
8            5            4
9            6            5
             7            6
                          7
                          8
                          9

n     log₂ n   n log₂ n
2     1        2
4     2        8
8     3        24
16    4        64
32    5        160
64    6        384
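The merge step and the recursive split can be sketched in Python as follows. The function names merge and merge_sort, and the decision to return a comparison count, are assumptions for illustration. The sketch collects the output in a list rather than printing it.

```python
def merge(a, b):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    out = []
    comparisons = 0
    i = j = 0
    while i < len(a) and j < len(b):
        # Compare the first remaining item from each subset.
        comparisons += 1
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One subset is exhausted; the rest of the other is already in order.
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comparisons


def merge_sort(items):
    """Return (sorted copy of items, total comparisons made)."""
    data = list(items)
    if len(data) <= 1:
        return data, 0  # bottom out: one item is already sorted
    mid = len(data) // 2
    left, c_left = merge_sort(data[:mid])
    right, c_right = merge_sort(data[mid:])
    merged, c_merge = merge(left, right)
    return merged, c_left + c_right + c_merge


print(merge([4, 8, 9], [3, 5, 6, 7]))
print(merge_sort([5, 1, 6, 3, 4, 2, 8, 7]))
```

In this sketch the count never exceeds n log₂ n when n is a power of two, and it is often lower, because a merge stops comparing as soon as one subset runs out.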
#4: Quicksort
1. Choose any value from the set (the “pivot”)
2. Divide remaining items into two subsets:
those smaller and those bigger than the pivot
3. Recursively quicksort the two subsets
4. Rejoin the sorted subsets and pivot value into one set

About n log₂ n comparisons on average
(like merge sort; consistently unlucky pivot choices make it quadratic)
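The four quicksort steps map quite directly onto a short Python sketch. Taking the first item as the pivot and building new lists for the two subsets are simplifying assumptions made here; practical implementations usually partition in place and pick the pivot more carefully.

```python
def quicksort(items):
    """Return a sorted copy of items."""
    data = list(items)
    if len(data) <= 1:
        return data
    # 1. Choose any value as the pivot (here, simply the first item).
    pivot, rest = data[0], data[1:]
    # 2. Divide the remaining items into two subsets.
    smaller = [x for x in rest if x < pivot]
    bigger = [x for x in rest if x >= pivot]  # items equal to the pivot go here
    # 3 and 4. Recursively sort each subset, then rejoin around the pivot.
    return quicksort(smaller) + [pivot] + quicksort(bigger)


print(quicksort([5, 1, 6, 3, 4, 2]))
```

With this pivot rule, data that is already sorted is the unlucky case: one subset is always empty, so the recursion goes n levels deep instead of about log₂ n.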

• Is there a linear sorting method?
• Sublinear?

Sorting demo

http://www.cs.waikato.ac.nz/~ihw/103/sorting/

Parallel algorithms
Find the smallest number in an (unsorted) list

Can’t be sublinear …
unless parallel

[Figure: a comparator; input 1 and input 2 go in, the smaller and the larger value come out]
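One way to picture the parallel case is a knockout tournament: in each round, comparators working side by side each keep the smaller of a pair. The Python sketch below simulates those rounds one after another; the name parallel_min and the round counter are illustrative assumptions, and the list must not be empty.

```python
def parallel_min(items):
    """Return (smallest item, number of rounds) for a non-empty list.

    All comparisons within one round are independent of each other,
    so parallel hardware could perform them at the same time.
    """
    survivors = list(items)
    rounds = 0
    while len(survivors) > 1:
        # Pair up neighbours; each comparator passes on the smaller value.
        next_round = [min(survivors[i], survivors[i + 1])
                      for i in range(0, len(survivors) - 1, 2)]
        if len(survivors) % 2 == 1:
            next_round.append(survivors[-1])  # odd one out gets a bye
        survivors = next_round
        rounds += 1
    return survivors[0], rounds


print(parallel_min([5, 1, 6, 3, 4, 2, 8, 7]))
```

The total number of comparisons is still n – 1, but with enough comparators the elapsed time is the number of rounds, about log₂ n.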

Parallel sorting

[Figure: a six-input sorting network built from comparators (input 1, input 2 in; smaller, larger out), shown sorting the numbers 1 to 6 into ascending order]
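The comparator layout of the slide's network is not recoverable from the text, so the Python sketch below swaps in a different, well-known construction, an odd-even transposition network, purely to show how a sorting network behaves. The names transposition_network and run_network are assumptions, and this network is not claimed to be the smallest one for six inputs.

```python
from itertools import permutations


def transposition_network(n):
    """Return n layers of comparators, each a list of (wire, wire + 1) pairs.

    Even-numbered layers compare wires (0,1), (2,3), ...;
    odd-numbered layers compare wires (1,2), (3,4), ...
    """
    return [[(i, i + 1) for i in range(layer % 2, n - 1, 2)]
            for layer in range(n)]


def run_network(layers, values):
    """Push the values through the network and return the outputs."""
    data = list(values)
    for layer in layers:
        # Comparators in one layer touch disjoint wires,
        # so in hardware they could all fire at the same time.
        for lo, hi in layer:
            if data[lo] > data[hi]:
                data[lo], data[hi] = data[hi], data[lo]
    return data


layers = transposition_network(6)
print(run_network(layers, [5, 1, 6, 3, 4, 2]))

# Exhaustive check: every ordering of six distinct values comes out sorted.
print(all(run_network(layers, p) == sorted(p)
          for p in permutations(range(6))))
```

The comparisons are fixed in advance and never depend on the data, which is what separates a sorting network from the sorts earlier in these slides. This sketch uses 15 comparators in 6 layers.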

Reflections on parallel sorting

• Is this the smallest 6-input sorting network?
• Does this one work in reverse?
• Do there exist sorting networks that work in reverse?
• What problems can / cannot be parallelized?

One person digs a ditch in 5 days,
can 5 people dig it in 1 day?

One woman bears a child in 9 months,
can 9 women bear it in one month?

Tony Hoare
Born: 1934 in Colombo, Sri Lanka, to British parents
On their return to England in 1946, he pursued a
classical education in Latin and Greek, and graduated
from Oxford in 1956 with a BA in Literae Humaniores. He
studied the works of Plato and Aristotle in the original
language, and pursued his interests in ancient and
modern philosophy, concentrating on modern logic and
its application to the foundations of mathematics. He
then learned Russian in the Royal Navy, spent a year at
Oxford studying Statistics and a year at Moscow State
University studying the translation of languages by
computer.
On return to England in 1960, he joined Elliott Brothers Ltd, a small British computer
manufacturer, and led a team which designed and delivered an automatic translator
for the new international programming language ALGOL 60. He later worked on the
design of operating systems and new hardware products.
In 1968 he was appointed as Professor of Computing Science at Queen’s University,
Belfast. In 1977, he returned to Oxford to build up the subject there too. He has been
honoured by all the international awards that the world of computing can bestow. He
is a member of the Academia Europaea, and three of the national scientific
academies of Europe. He retains an enduring interest in industrial application of the
science of computing, and has held consultancies with major international companies
in computing software and communications. But he has allowed his knowledge of
Latin and Greek to get very rusty.

Questions
1. What are the three standard assumptions that are used when
comparing how long different sorting methods take?
2. Briefly state the selection sort algorithm.
3. Briefly state the bubble sort algorithm.
4. Briefly state the “divide and conquer” algorithm.
5. Briefly state the merge sort algorithm.
6. Briefly state the quicksort algorithm.
7. Put these four methods into the order of the time they take for
large amounts of data, from fastest to slowest: merge sort, bubble
sort, quicksort, selection sort. Discuss any ties.
8. You need to search a long, unsorted list of names with 1024 items
for n selected names. You could use linear search n times on the
unsorted list. Or sort the list using quicksort and then use binary
search. If n=3, which is faster? What if n=30? Show your work!
9. Your friend is doing this task, but decides to use selection sort. How
big, roughly speaking, does n have to be for sorting to pay off?
10. What is a “linear” algorithm? A “quadratic” one? A “sublinear” one?
11. Draw a diagram of a 4-input sorting network.

Questions (cont)
12. You are examining a list of 128 items by comparing the items in
pairs. If it takes 2 seconds to compare two items, how long will it
take you to find the ten largest items using these strategies:

(a) Use a sequential search to find the largest item, another
sequential search to find the next largest, and so on until the ten
largest items have been found

(b) Sort all of the items first using SelectionSort, then take the last
ten items in the ordering

(c) Sort the items first using QuickSort, then take the last ten items
in the ordering.
