Hashing

Hashing, or hash addressing, is a searching technique whose search time is essentially independent of the number n of elements in the collection. Hashing is used to index and retrieve items in a database because it is faster to find an item using the shorter hashed key than to find it using the original value. A bucket in a hash file is a unit of storage (typically a disk block) that can hold one or more records.

Hash function
Hash functions are mostly used in hash tables to quickly locate a data record given its search key.
If element e has key k and h is the hash function, then e is stored in position h(k) of the table. To search for e, compute h(k) to locate the position. If there is no element at that position, the dictionary does not contain e.
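The store-and-search rule above can be sketched in a few lines of Python. This is only an illustration: the names TABLE_SIZE, h, put and get, the table size 11, and the choice of h (remainder after division) are assumptions, and collisions are deliberately ignored.

```python
# Illustrative sketch: names and the table size are assumptions, not from the notes.
TABLE_SIZE = 11

def h(k: int) -> int:
    """Map key k to a table position (here: remainder after division)."""
    return k % TABLE_SIZE

# One slot per position, initially empty.
table = [None] * TABLE_SIZE

def put(k: int, e) -> None:
    """Store element e with key k at position h(k); collisions are ignored."""
    table[h(k)] = (k, e)

def get(k: int):
    """Return the element stored under key k, or None if it is not present."""
    slot = table[h(k)]
    if slot is not None and slot[0] == k:
        return slot[1]
    return None
```

Storing the key alongside the element lets get tell a real match apart from a different key that happens to map to the same position.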

Specifically, the hash function maps the search key to the index of the slot in the table where the corresponding record is expected to be stored.

What are the characteristics of a good hash function?
A good hash function avoids collisions.
A good hash function tends to spread keys evenly in the array.
A good hash function is easy to compute.

Popular hash functions


Division Method: Choose a number m larger than the number n of keys in K. The number m is usually chosen to be a prime number or a number with a small number of divisors, since this frequently minimizes the number of collisions. The hash function H is defined by H(k) = k (mod m), or by H(k) = k (mod m) + 1 when the addresses should range from 1 to m rather than from 0 to m-1.
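A minimal sketch of the division method, assuming integer keys. The function names and the sample values (m = 97 and the three keys) are chosen only for illustration.

```python
def h_division(k: int, m: int) -> int:
    """Division method: H(k) = k mod m, giving addresses 0 .. m-1."""
    return k % m

def h_division_from_one(k: int, m: int) -> int:
    """Variant H(k) = (k mod m) + 1, giving addresses 1 .. m."""
    return k % m + 1

# Illustrative values: m = 97 is prime; the keys are arbitrary examples.
for key in (3205, 7148, 2345):
    print(key, h_division(key, 97), h_division_from_one(key, 97))
```

With m = 97 the three sample keys land at addresses 4, 67 and 17 (or 5, 68 and 18 in the variant that counts from 1).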

Midsquare Method: The key k is squared. The hash function H is then defined by H(k) = I, where I is obtained by deleting digits from both ends of k^2. The same digit positions of k^2 must be used for all keys.
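One possible way to code the midsquare method is shown below. The parameters skip (how many low-order digits of the square to delete) and width (how many middle digits to keep) are assumptions introduced for this sketch; the notes do not fix which digits are kept.

```python
def h_midsquare(k: int, skip: int = 3, width: int = 2) -> int:
    """Midsquare method: square the key, then keep `width` middle digits.

    `skip` low-order digits of k*k are dropped, and everything above the
    next `width` digits is dropped as well (hypothetical parameters).
    """
    square = k * k
    return (square // 10 ** skip) % 10 ** width

# Illustrative keys; with skip=3 and width=2 the 4th and 5th digits
# of the square, counted from the right, form the address.
for key in (3205, 7148, 2345):
    print(key, key * key, h_midsquare(key))
```

For example, 3205 squared is 10272025, and deleting the last three digits and the leading digits leaves the address 72.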

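Folding Method: The key k is partitioned into a number of parts k1, k2, ..., kr, where each part, except possibly the last, has the same number of digits as the required address. The parts are then added together, ignoring the last carry, that is, H(k) = k1 + k2 + ... + kr, where the leading-digit carries, if any, are ignored. Sometimes, for extra "milling", the even-numbered parts k2, k4, ... are each reversed before the addition. For example, with two-digit addresses the key 4502 splits into 45 and 02, so H(4502) = 45 + 02 = 47 without reversal, and H(4502) = 45 + 20 = 65 when the even-numbered part is reversed.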
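The folding rule can be sketched as follows, assuming decimal integer keys that are split from the left. The function name and the parameters width and reverse_even are hypothetical, introduced only for this example.

```python
def h_folding(k: int, width: int, reverse_even: bool = False) -> int:
    """Folding method: split the decimal digits of k into parts of `width`
    digits (the last part may be shorter), add the parts, and ignore the
    leading carry by keeping only the low `width` digits of the sum.
    """
    digits = str(k)
    parts = [digits[i:i + width] for i in range(0, len(digits), width)]
    if reverse_even:
        # Parts are numbered from 1, so k2, k4, ... sit at list indices 1, 3, ...
        parts = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    return total % 10 ** width

# Illustrative keys with two-digit addresses.
print(h_folding(3205, 2), h_folding(3205, 2, reverse_even=True))
print(h_folding(7148, 2), h_folding(7148, 2, reverse_even=True))
```

For the key 7148 the plain sum 71 + 48 = 119 loses its leading carry and becomes 19, while reversing the second part gives 71 + 84 = 155, that is, 55.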

There are two broad kinds of hashing: open hashing and closed hashing.

1. Open Hashing

Each bucket in the hash table is the head of a linked list. All elements that hash to a particular bucket are placed on that bucket's linked list. Records within a bucket can be ordered in several ways: by order of insertion, by key value, or by frequency of access.

Example (figure): a bucket table with slots 0, 1, 2, ..., D-1, where each bucket heads a linked list of the elements that hash to it.

2. Closed Hashing

A closed hash table keeps the members of the set in the bucket table itself rather than using that table to store list headers. As a consequence, only one element is in any bucket.

Collision
Multiple keys can hash to the same slot
(Figure: the hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, to slots 0 through m-1 of the table; k2 and k5 collide because h(k2) = h(k5).)

Collision Resolution Techniques


There are two broad ways of collision resolution:
1. Separate chaining: an array of linked lists.
2. Open addressing: an array-based implementation.
   (i) Linear probing (linear search)
   (ii) Quadratic probing (nonlinear search)
   (iii) Double hashing (uses two hash functions)

Collision Resolution by Chaining


The hash table is implemented as an array of linked lists.
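A minimal sketch of chaining, assuming integer keys and the division hash k mod m. The class and method names (Node, ChainedHashTable, insert, search, chain_keys) and the default table size are illustrative choices, not taken from the notes.

```python
class Node:
    """One cell of a singly linked list holding a key and its value."""
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node

class ChainedHashTable:
    def __init__(self, m: int = 7):
        self.m = m
        self.heads = [None] * m  # heads[i] is the first node of slot i's list

    def _h(self, key: int) -> int:
        return key % self.m  # assumed hash function

    def insert(self, key: int, value) -> None:
        i = self._h(key)
        node = self.heads[i]
        while node is not None:  # update in place if the key already exists
            if node.key == key:
                node.value = value
                return
            node = node.next
        self.heads[i] = Node(key, value, self.heads[i])  # else add at the head

    def search(self, key: int):
        node = self.heads[self._h(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None  # key not in the table

    def chain_keys(self, i: int) -> list:
        """Keys stored in slot i, in list order (helper for inspection)."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys
```

Inserting at the head keeps insertion cheap once the key is known to be absent; keeping each list in key order or in frequency-of-access order would be an equally valid design.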

(Figure: the keys k1, ..., k8 in K, drawn from the universe of keys U, are hashed to slots 0 through m-1; each slot points to a linked list of the keys that hash to it.)

Open addressing
A method in which a hash collision is resolved by probing, that is, by searching through alternate locations in the array (the probe sequence) until either the target record is found or an unused slot is reached, which indicates that the key is not in the table.

Linear probing resolves hash collisions by sequentially searching the hash table for a free location. Each new location is computed from the previous one with a fixed step size:
newLocation = (startingValue + stepSize) % arraySize
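The formula above can be turned into a small table with linear probing. This is a sketch under stated assumptions: integer keys, the hash k mod arraySize, a default step size of 1, and no deletion (which would need extra bookkeeping). The class and method names are illustrative.

```python
class LinearProbingTable:
    def __init__(self, array_size: int = 7, step_size: int = 1):
        self.array_size = array_size
        self.step_size = step_size
        self.slots = [None] * array_size  # each slot holds (key, value) or None

    def _h(self, key: int) -> int:
        return key % self.array_size  # assumed hash function

    def insert(self, key: int, value) -> int:
        """Store the pair and return the slot index that was used."""
        location = self._h(key)
        for _ in range(self.array_size):
            slot = self.slots[location]
            if slot is None or slot[0] == key:
                self.slots[location] = (key, value)
                return location
            # newLocation = (startingValue + stepSize) % arraySize
            location = (location + self.step_size) % self.array_size
        raise RuntimeError("hash table is full")

    def search(self, key: int):
        location = self._h(key)
        for _ in range(self.array_size):
            slot = self.slots[location]
            if slot is None:
                return None  # an empty slot ends the probe: key is absent
            if slot[0] == key:
                return slot[1]
            location = (location + self.step_size) % self.array_size
        return None
```

With a step size of 1 every slot is eventually visited. A larger step size only covers the whole table when it shares no common factor with the array size.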

Quadratic probing operates by taking the original hash value and adding successive values of an arbitrary quadratic polynomial to the starting value.

Double hashing uses a second hash function to determine the step size of the probe sequence.
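To compare the two schemes, the sketch below lists the slots each one would probe for a given key. The polynomial c1*i + c2*i^2 (with defaults c1 = 0, c2 = 1), the first hash k mod m, and the second hash 1 + (k mod (m-1)) are common textbook choices assumed here, not prescribed by the notes.

```python
def quadratic_probe_sequence(key: int, m: int, c1: int = 0, c2: int = 1) -> list:
    """Slots probed by quadratic probing: (h(k) + c1*i + c2*i*i) mod m."""
    start = key % m
    return [(start + c1 * i + c2 * i * i) % m for i in range(m)]

def double_hash_probe_sequence(key: int, m: int) -> list:
    """Slots probed by double hashing: (h1(k) + i*h2(k)) mod m."""
    start = key % m            # first hash function (assumed)
    step = 1 + key % (m - 1)   # second hash function (assumed); never zero
    return [(start + i * step) % m for i in range(m)]

print(quadratic_probe_sequence(10, 7))    # [3, 4, 0, 5, 5, 0, 4]
print(double_hash_probe_sequence(10, 7))  # [3, 1, 6, 4, 2, 0, 5]
print(double_hash_probe_sequence(17, 7))  # [3, 2, 1, 0, 6, 5, 4]
```

In this example, quadratic probing with i^2 revisits slots and reaches only four of the seven positions, while double hashing with a prime table size reaches all of them. Keys 10 and 17 both start at slot 3 but follow different probe sequences under double hashing because their step sizes differ.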
