Escolar Documentos
Profissional Documentos
Cultura Documentos
6.046J/18.401J
LECTURE 8
Hashing II
• Universal hashing
• Universality theorem
• Constructing a set of
universal hash functions
• Perfect hashing
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.1
A weakness of hashing
randomly from H.
m
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.3
Universality is good
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.4
Proof of theorem
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.5
Proof (continued)
⎡ ⎤
E
[
C x ]
=
E
⎢ ∑
c xy ⎥ • Take expectation
⎢⎣
y∈T −{ x} ⎥⎦
of both sides.
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.6
Proof (continued)
⎡ ⎤
E
[
C x ]
=
E ⎢ ∑
c xy ⎥ • Take expectation
⎢⎣
y∈T −{ x
} ⎥⎦
of both sides.
= ∑ E[cxy ] • Linearity of
y∈T −{ x} expectation.
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.7
Proof (continued)
⎡ ⎤
E
[
C x ]
=
E
⎢ ∑
c xy ⎥ • Take expectation
⎢⎣
y∈T −{ x
} ⎥⎦
of both sides.
= ∑ E[cxy ] • Linearity of
y∈T −{ x
} expectation.
= ∑ 1/ m • E[cxy] = 1/m.
y∈T −{ x}
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.8
Proof (continued)
⎡ ⎤
x ]
=
E ⎢ ∑
c xy ⎥
E
[C • Take expectation
⎢⎣
y∈T −{
x} ⎥⎦
of both sides.
= ∑ E[cxy ] • Linearity of
y∈T −{ x} expectation.
= ∑ 1/ m • E[cxy] = 1/m.
y∈T −{ x}
=
n
−
1 . • Algebra.
m
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.9
Constructing a set of
universal hash functions
Let m be prime. Decompose key k into r + 1
digits, each with value in the set {0, 1, …, m–1}.
That is, let k = 〈k0, k1, …, kr〉, where 0 ≤ ki < m.
Randomized strategy:
i=0
modulo m
How big is H = {ha}? |H| = mr + 1. REMEMBER
THIS!
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.10
Universality of dot-product
hash functions
Theorem. The set H = {ha} is universal.
Equivalently, we have
r
∑ ai (xi − yi ) ≡ 0 (mod m)
i=0
or
r
a0 (x0 − y0 ) + ∑ ai (xi − yi ) ≡ 0 (mod m) ,
i=1
which implies that
r
a0 (x0 − y0 ) ≡ −∑ ai (xi − yi ) (mod m) .
i=1
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.12
Fact from number theory
Example: m = 7.
z 1 2 3 4 5 6
z–1 1 4 5 2 3 6
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.13
Back to the proof
We have
r
a0 (x0 − y0 ) ≡ −∑ ai (xi − yi ) (mod m) ,
i=1
and since x0 ≠ y0 , an inverse (x0 – y0 )–1 must exist,
which implies that
⎛ r ⎞
a0 ≡ ⎜⎜ − ∑ ai (xi − yi ) ⎟⎟ ⋅ (x0 − y0 ) −1 (mod m) .
⎝ i=1 ⎠
Thus, for any choices of a1, a2, …, ar, exactly
one choice of a0 causes x and y to collide.
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.14
Proof (completed)
⎝ ⎝ i =
1
⎠ ⎠
r a
r
to collide is m · 1 = m = |H|/m.
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.15
Perfect hashing
T S1
IDEA: Two-
0
level scheme 1
44 31 14
31 1427
27
with universal 2
h31(14) = h31(27) = 1
hashing at 3
S4
both levels. 4
11 00
00 26
26 S6
5
No collisions
6
99 86
86 40
40 37
37 22
22
at level 2!
m a 0 1 2 3 4 5 6 7 8
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.16
Collisions at level 2
⎜ ⎟ ⋅
2 =
⋅
2 <
1 .
⎝
2 ⎠
n 2 n 2
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.17
No collisions at level 2
October 5, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.18
Analysis of storage