
Binary Searching

Earl Paine, Stu Schwartz

Copyright 2005, used by permission

Who we are
Combining curriculums
The art of collaboration

The problem
You have 10 alphabetized names, each with the last 4 digits of a Social Security number (SS#). You input a name and the program searches for it.
If the name is found, its 4 SS# digits are displayed. If it is not found, the user is asked for the person's 4 SS# digits, and the name and digits are added to the list.

A sequential search algorithm


Enter the name you wish to find: name
Compare name to each name in the list, starting at the top, until name matches the current name or the current name comes after name alphabetically.
1. Adams 4232
2. Arras 5116
3. Barnes 7651
4. Cox 9901
5. Flynn 9254
6. James 3339
7. Norman 5412
8. Paine 6512
9. Schwartz 7742
10. Woods 1091

Ex: Barnes

If name is there, report the SS#.
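A minimal sketch of the sequential search described above, in Python. The slides do not name a language, so the function name `sequential_search`, the parallel lists, and the returned triple are assumptions made for illustration. Positions are 1-based to match the slides.

```python
# Data copied from the slide's table (names are alphabetized).
NAMES = ["Adams", "Arras", "Barnes", "Cox", "Flynn",
         "James", "Norman", "Paine", "Schwartz", "Woods"]
SS4 = ["4232", "5116", "7651", "9901", "9254",
       "3339", "5412", "6512", "7742", "1091"]


def sequential_search(names, name):
    """Return (found, position, comparisons) using 1-based positions.

    When found is False, position is the insertion point for name.
    """
    comparisons = 0
    for position, current in enumerate(names, start=1):
        comparisons += 1
        if name == current:
            return True, position, comparisons
        if name < current:
            # The current name comes after `name`, so `name` is not in the list.
            return False, position, comparisons
    # `name` comes after every name in the list.
    return False, len(names) + 1, comparisons


found, position, _ = sequential_search(NAMES, "Barnes")
if found:
    print("SS#:", SS4[position - 1])
```

For "Barnes" the search stops at position 3 and reports that entry's digits; for a missing name such as "Barlow" it stops at the same position and returns it as the insertion point.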

Sequential search algorithm (cont'd)


If name is not there, we now have the insertion point (insert).
Ex: Barlow
1. Adams 4232
2. Arras 5116
3. Barnes 7651
4. Cox 9901
5. Flynn 9254
6. James 3339
7. Norman 5412
8. Paine 6512
9. Schwartz 7742
10. Woods 1091

Ask for 4-digit SS# of name

Sequential search algorithm (cont'd)


Push all list names beyond insert one position down.
Before the push:
1. Adams 1425
2. Arras 7623
3. Barnes 9001
4. Cox 4234
...
500. Zorro 5419

After the push:
1. Adams 1425
2. Arras 7623
3. Barnes 9001
4. Barnes 9001
5. Cox 4234
...
501. Zorro 5419

Position insert is now available for name and SS#


Increment the counter containing the number of data entries.
Note: the push process must start at the bottom of the list so that data is not lost.
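The push-and-insert step can be sketched as follows. This assumes fixed-size Python lists with at least one spare slot at the end, standing in for the arrays the slides imply; the function name `insert_at` and the sample digits "0000" are hypothetical.

```python
def insert_at(names, ss4, count, insert, name, digits):
    """Insert name/digits at 1-based position `insert`; return the new count.

    `names` and `ss4` are preallocated lists with spare room, and `count`
    is the number of entries currently in use.
    """
    # Start at the bottom so no entry is overwritten before it is copied.
    for pos in range(count, insert - 1, -1):
        names[pos] = names[pos - 1]   # position pos moves to position pos + 1
        ss4[pos] = ss4[pos - 1]
    # Position `insert` is now free for the new entry.
    names[insert - 1] = name
    ss4[insert - 1] = digits
    return count + 1


names = ["Adams", "Arras", "Barnes", "Cox", None]
ss4 = ["1425", "7623", "9001", "4234", None]
count = insert_at(names, ss4, 4, 3, "Barlow", "0000")
print(names, count)
```

Copying from the top instead would overwrite position `insert + 1` before its contents had been moved, which is the data loss the note warns about.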

Problem with Sequential Search


Inefficiency
1. On average, it takes about 5 searches (5.5 name comparisons, to be exact) to find a name that is in the list (assuming no bias in the choice of name).

2. If name is not in the list, it can take up to 10 searches to ascertain that. A list of 500 names can require up to 500 searches.
Note: at today's processor speeds, 500 searches take a negligible fraction of a second, so it is hard to justify a more efficient method to save so little time. With 10 million names and the process repeated over and over, however, the need for efficiency is much more pronounced. Hence a more efficient method is needed.
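The average in item 1 can be checked with a short calculation, assuming each of the 10 names is equally likely to be requested and that finding the name in position k costs k comparisons. This is where the figure of about 5 comes from:

```latex
\[
\frac{1 + 2 + \cdots + 10}{10} = \frac{55}{10} = 5.5,
\qquad\text{and in general}\qquad
\frac{1 + 2 + \cdots + n}{n} = \frac{n + 1}{2}.
\]
```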

The binary search


We start with the same premises: 10 names and name. We also introduce 3 new integer variables: low, high, and target.
low is initially set equal to 1.
high is initially set equal to 11 (one more than the number of names).
target is the average of low and high.

Binary search (cont'd)


If target is a decimal, the decimal part is chopped off (for example, 8.5 becomes 8). We always compare name to the name in the target position. If they match, we print out the SS# and we are done.

Binary search (cont'd)


If they don't match, we check whether name is greater than the target name. If so, we know that name is further down in the list and we change: low = target. Otherwise, we know that name is alphabetically before the target name and we change: high = target.

Binary search (cont'd)


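Continue the process: target = (low + high)/2, integer value.
We are done (name not found) when the interval can no longer shrink: either high = low, or high = low + 1 and name comes after the name in position low.
If the name is not found, the insertion point is position high. We push all names from position high down as before, insert the new name into position high, and increment the counter of names.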
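The steps above can be sketched in Python with the slides' own variables low, high, and target. Two points are assumptions rather than something the slides spell out: the search is declared unsuccessful as soon as the interval [low, high] can no longer shrink, and the insertion point reported in that case is high. The function name `binary_search` is hypothetical.

```python
NAMES = ["Adams", "Arras", "Barnes", "Cox", "Flynn",
         "James", "Norman", "Paine", "Schwartz", "Woods"]


def binary_search(names, name):
    """Return (found, position, comparisons) using 1-based positions.

    When found is False, position is the insertion point for name.
    """
    low, high = 1, len(names) + 1
    comparisons = 0
    while low < high:
        target = (low + high) // 2        # integer part of the average
        comparisons += 1
        current = names[target - 1]
        if name == current:
            return True, target, comparisons
        if name > current:
            if target == low:             # low would not change: stop
                break
            low = target                  # name is further down the list
        else:
            high = target                 # name is before the target name
    return False, high, comparisons


print(binary_search(NAMES, "Norman"))
print(binary_search(NAMES, "Donald"))
```

Many textbook versions set low = target + 1 instead of low = target, which avoids the extra stopping test; the sketch keeps low = target so that the passes match the slides' examples.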

Binary search example 1


name = Norman
Initial: low=1, high=11, target=6
Pass 1: low=6, high=11, target=8
Pass 2: low=6, high=8, target=7
1. Adams 4232
2. Arras 5116
3. Barnes 7651
4. Cox 9901
5. Flynn 9254
6. James 3339
7. Norman 5412
8. Paine 6512
9. Schwartz 7742
10. Woods 1091

It took 3 comparisons to find Norman. Sequentially, it would have taken 7.

Binary search example 2


name = Donald
Initial: low=1, high=11, target=6
Pass 1: low=1, high=6, target=3
Pass 2: low=3, high=6, target=4
Pass 3: low=4, high=6, target=5
Pass 4: low=4, high=5, target=4
1. Adams 4232
2. Arras 5116
3. Barnes 7651
4. Cox 9901
5. Flynn 9254
6. James 3339
7. Norman 5412
8. Paine 6512
9. Schwartz 7742
10. Woods 1091

After Pass 4, Donald comes after Cox (position 4), so low stays at 4 and the interval cannot shrink any further: STOP - name not found.
Every name from position target + 1 = 5 and below must be pushed down so name can be inserted into position 5, between Cox and Flynn.
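The passes in the two examples can be reproduced with a small tracing helper. This is a sketch: `trace_binary_search` is a hypothetical name, and it assumes the search stops once low can no longer move.

```python
NAMES = ["Adams", "Arras", "Barnes", "Cox", "Flynn",
         "James", "Norman", "Paine", "Schwartz", "Woods"]


def trace_binary_search(names, name):
    """Return the (low, high, target) triple examined on each pass, in order."""
    low, high = 1, len(names) + 1
    passes = []
    while low < high:
        target = (low + high) // 2
        passes.append((low, high, target))
        current = names[target - 1]
        if name == current:
            break                 # found at position target
        if name > current:
            if target == low:     # interval cannot shrink: not found
                break
            low = target
        else:
            high = target
    return passes


for low, high, target in trace_binary_search(NAMES, "Donald"):
    print(f"low={low}, high={high}, target={target}")
```

Each printed line corresponds to one comparison, so the length of the returned list is the number of comparisons made.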

Efficiency of binary search


The maximum number of searches x necessary to find a name is the smallest integer that satisfies the inequality $2^x > 10$, so $x = 4$. If n represents the number of names, the maximum number of searches x necessary to find a name is the smallest integer that satisfies the inequality $2^x > n$.
$2^x > n$
$\log(2^x) > \log n$
$x \log 2 > \log n$
$x > \log n / \log 2$

The maximum number of searches is the smallest integer greater than log n/log 2
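As a quick check of this rule, the smallest integer x with 2^x > n can be computed directly. The sketch below uses integer arithmetic rather than logarithms to avoid floating-point rounding; the function name `max_binary_searches` is hypothetical.

```python
def max_binary_searches(n):
    """Return the smallest integer x such that 2**x > n."""
    x = 0
    while 2 ** x <= n:
        x += 1
    return x


for n in (10, 100, 1_000, 1_000_000):
    print(n, max_binary_searches(n))
```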

Efficiency of binary search


# of names        Maximum sequential      Maximum binary
                  searches necessary      searches necessary
10                10                      4
100               100                     7
1,000             1,000                   10
5,000             5,000                   13
10,000            10,000                  14
50,000            50,000                  16
100,000           100,000                 17
1,000,000         1,000,000               20
10,000,000        10,000,000              24
1,000,000,000     1,000,000,000           30

With the incredible speed of today's computers, a binary search becomes necessary only when the number of names is large.
