Election Day
Elections for class president
Each student whispers in Mr. Drew's ear
Mr. Drew writes down the votes
[Figure: Mr. Drew's notebook listing the votes Carol, Alice, Alice, Bob]
Problem: Mr. Drew's notebook leaks sensitive information
First student voted for Carol
Second student voted for Alice
May compromise the privacy of the elections
Election Day
What about more involved applications?
Write-in candidates
Votes which are subsets or rankings
[Figure: Mr. Drew's notebook listing the votes Carol, Alice, Alice, Bob]
History independence
The memory representation should not reveal information that cannot be obtained using the legitimate interface
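As a rough illustration of this definition, the sketch below revisits the class-election example. The two function names are hypothetical and not part of the talk: one notebook records votes in arrival order, the other keeps only a sorted tally, so its layout depends on the content alone.

```python
from collections import Counter

def leaky_notebook(votes):
    # Records votes in arrival order: the layout reveals who voted when.
    return list(votes)

def canonical_notebook(votes):
    # Keeps only a sorted tally: equal content always gives an equal layout.
    return sorted(Counter(votes).items())

# Two different voting orders that lead to the same election result.
order_a = ["Carol", "Alice", "Alice", "Bob"]
order_b = ["Alice", "Bob", "Carol", "Alice"]

print(leaky_notebook(order_a) == leaky_notebook(order_b))
print(canonical_notebook(order_a) == canonical_notebook(order_b))
```

The tally supports the legitimate interface (counting votes) while hiding the order in which the votes arrived.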
Typical Applications
[BGG94, Mic97]
Our Contribution
A HI dictionary that simultaneously achieves the following:
Efficiency:
Lookup time O(1) worst case
Update time O(1) expected amortized
Memory utilization 50% (25% with deletions)
Strongest notion of history independence
Simple and fast
Independence
Motivated by incremental cryptography
Only considered the shape of the trees and not their memory representation
Independence
Weak history independence (WHI): memory revealed at the end of an activity period
Any two sequences of operations S1 and S2 that lead to the same content induce the same distribution on the memory representation
Strong history independence (SHI): memory revealed several times during an activity period
Any two sets of breakpoints along S1 and S2 with the same content at each breakpoint induce the same distributions on the memory representation at all these points
Completely randomizing memory after each
Independence
Canonical representation (up to initial randomness) implies SHI
Other direction shown to hold for reversible data structures [HHMPR05]
WHI for reversible data structures is possible without a canonical representation
Provable efficiency gaps [BP06] (in restricted models)
SHI Dictionaries
Scheme                   Update time      Lookup time
Naor & Teague 01         O(1) expected    O(1) worst case
Blelloch & Golovin 07    O(1) expected    O(1) expected
Blelloch & Golovin 07    O(1) expected    O(1) worst case
This work                O(1) expected    O(1) worst case

Further columns (entries not matched to rows): Deletions; Practical? (mem. util. < 50%), (mem. util. < 50%), 99%, < 9%
Our Approach
Cuckoo hashing [PR01]: A simple & practical scheme with worst case constant lookup time
Force a canonical representation on cuckoo hashing
What happens when hash functions fail?
Rehashing is problematic in SHI data structures
All hash functions need to be sampled in advance (theoretical problem)
When an item is deleted, may need to roll back to previous functions
Cuckoo Hashing
Tables T1 and T2 with hash functions h1 and h2
Store x in one of T1[h1(x)] and T2[h2(x)]
Insert(x): Greedily insert in T1 or T2
[Figure: tables T1 and T2 storing the keys V, W, X, Y, Z]
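The greedy insertion can be sketched as follows. This is a minimal illustration of standard cuckoo hashing, not the history-independent variant. The class name, the salted use of Python's built-in `hash` as a stand-in for h1 and h2, and the `max_kicks` bound are all assumptions made for the example. Slots are indexed from 0, and keys must not be `None`.

```python
import random

class CuckooTable:
    def __init__(self, r, seed=0):
        rng = random.Random(seed)
        self.r = r
        # Two random salts stand in for the hash functions h1 and h2 (assumption).
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))
        self.tables = ([None] * r, [None] * r)  # T1 and T2

    def _h(self, i, x):
        return hash((self.salts[i], x)) % self.r

    def lookup(self, x):
        # Worst-case constant time: x can only be in T1[h1(x)] or T2[h2(x)].
        return any(self.tables[i][self._h(i, x)] == x for i in (0, 1))

    def insert(self, x, max_kicks=32):
        """Return None on success, or the key left without a slot."""
        if self.lookup(x):
            return None
        i = 0
        for _ in range(max_kicks):
            slot = self._h(i, x)
            # Put x in its slot and pick up the previous occupant, if any.
            x, self.tables[i][slot] = self.tables[i][slot], x
            if x is None:
                return None
            i = 1 - i  # the evicted key tries its slot in the other table
        return x  # the caller would have to rehash at this point

table = CuckooTable(r=32)
homeless = {k for k in (table.insert(k) for k in range(10)) if k is not None}
print(all(table.lookup(k) for k in range(10) if k not in homeless))
```

Returning the displaced key instead of rehashing keeps the sketch short; how failures are handled is exactly the issue the later slides discuss.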
Cuckoo Hashing
Set S ⊆ U containing n keys
h1, h2 : U → {1,...,r}
Cuckoo graph: bipartite graph with sets of size r, edge (h1(x), h2(x)) for every x ∈ S
S is successfully stored ⇔ every connected component has at most one cycle
Main theorem: If r ≥ (1 + ε)n and h1, h2 are log(n)-wise independent, then failure probability is Θ(1/n)
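The condition on connected components can be checked with a union-find pass over the edges: a component has at most one cycle exactly when it has no more edges than vertices. The sketch below is only an illustration. The function name `storable` is hypothetical, and it assumes h1 and h2 return slots in 0..r-1 rather than 1..r.

```python
def storable(keys, h1, h2, r):
    # Vertices 0..r-1 are the slots of T1, vertices r..2r-1 the slots of T2.
    parent = list(range(2 * r))
    edges = [0] * (2 * r)   # edge count per component root
    verts = [1] * (2 * r)   # vertex count per component root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x in keys:
        a, b = find(h1(x)), find(r + h2(x))
        if a != b:
            # Merge component b into component a.
            parent[b] = a
            edges[a] += edges[b]
            verts[a] += verts[b]
        edges[a] += 1
        if edges[a] > verts[a]:
            return False  # more keys than slots: a second cycle appeared
    return True

# Toy hash functions for the example (assumptions, not from the talk).
h1 = lambda x: x % 4
h2 = lambda x: (x // 4) % 4
print(storable([0, 1, 4, 5], h1, h2, 4))      # one cycle through four slots
print(storable([0, 1, 4, 5, 16], h1, h2, 4))  # five keys compete for four slots
```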
Representation
Assume that S can be stored using h1 and h2
We force a canonical representation on the cuckoo graph
Assume that S forms a tree in the cuckoo graph (typical case)
One location must be empty
The choice of the empty location uniquely determines the location of all elements
Rule: h1(minimal element) is empty
[Figure: tree component of the cuckoo graph with elements a, b, c, d, e]
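The tree rule can be sketched as a breadth-first search from the location that stays empty: every key ends up in whichever of its two locations is farther from that empty slot. The function name and the toy hash functions are assumptions for illustration, and the code assumes the given keys form a single tree component.

```python
from collections import defaultdict, deque

def canonical_tree_layout(keys, h1, h2):
    # A location is a pair (table number, slot); every key is an edge.
    adj = defaultdict(list)
    for x in keys:
        u, v = (1, h1(x)), (2, h2(x))
        adj[u].append((x, v))
        adj[v].append((x, u))

    root = (1, h1(min(keys)))  # rule: T1[h1(minimal element)] stays empty
    layout = {}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for x, v in adj[u]:
            if v not in seen:
                seen.add(v)
                layout[v] = x  # x is stored at its endpoint away from the root
                queue.append(v)
    return layout

# Toy hash functions for the example (assumptions, not from the talk).
h1 = lambda x: x % 4
h2 = lambda x: (x // 4) % 4
print(canonical_tree_layout([0, 4, 5], h1, h2))
print(canonical_tree_layout([5, 4, 0], h1, h2))
```

Both calls should print the same layout, since the result depends only on the set of keys and not on the order in which they are supplied.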
Representation
Assume that S has one cycle
Two ways to assign elements in the cycle
Each choice uniquely determines the location of all elements
Rule: minimal element in cycle lies in T1
[Figure: cuckoo graph component with one cycle and elements a, b, c, d, e]
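One way to realize the cycle rule is sketched below. Keys hanging off the cycle are placed first by repeatedly peeling locations touched by a single remaining key. The minimal key of the cycle is then put in T1 and the rest of the cycle follows. All names and the toy hash functions are hypothetical, and the code assumes the keys form a single component with exactly one cycle.

```python
from collections import defaultdict

def canonical_unicyclic_layout(keys, h1, h2):
    # A location is a pair (table number, slot); every key is an edge.
    ends = {x: ((1, h1(x)), (2, h2(x))) for x in keys}
    incident = defaultdict(set)
    for x, (u, v) in ends.items():
        incident[u].add(x)
        incident[v].add(x)

    def other(x, v):
        a, b = ends[x]
        return b if a == v else a

    def remove(x):
        for w in ends[x]:
            incident[w].discard(x)

    layout = {}
    # Peel: a location touched by exactly one remaining key must hold that key.
    leaves = [v for v, xs in incident.items() if len(xs) == 1]
    while leaves:
        v = leaves.pop()
        if len(incident[v]) != 1:
            continue
        (x,) = incident[v]
        layout[v] = x
        remove(x)
        w = other(x, v)
        if len(incident[w]) == 1:
            leaves.append(w)

    # The keys that remain form the cycle.
    cycle = {x for xs in incident.values() for x in xs}
    x = min(cycle)
    v = ends[x][0]  # rule: the minimal element of the cycle lies in T1
    while True:
        layout[v] = x
        remove(x)
        v = other(x, v)
        if not incident[v]:
            break  # back at the start of the cycle
        (x,) = incident[v]  # this key is forced into location v
    return layout

# Toy hash functions for the example (assumptions, not from the talk).
h1 = lambda x: x % 4
h2 = lambda x: (x // 4) % 4
print(canonical_unicyclic_layout([0, 1, 4, 5, 6], h1, h2))
```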
Representation
Supporting deletions requires connecting all elements in the component with a sorted cyclic list
Memory utilization drops to 25%
All cases straightforward
Rehashing
Happens with probability Θ(1/n)
Rare, but very bad worst case performance
Canonical memory implies we need to sample all hash functions in advance (theoretical problem)
Whenever an item is deleted, need to check whether we must roll back to previous hash functions
A bad item which is repeatedly inserted and deleted would cause a rehash every operation!
Using a Stash
Bad item: smallest item that belongs to a cycle
Secondary data structure must be SHI in itself
In theory the stash could be any SHI data structure with constant lookup time
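As a hedged illustration of a secondary structure that is SHI in itself, the sketch below keeps the stash as a sorted array. Its layout is canonical, and canonical representations imply SHI. The class name is hypothetical and the talk does not say which structure is actually used. Lookup time is constant only under the assumption that the stash holds a constant number of items.

```python
import bisect

class SortedStash:
    """Tiny set kept as a sorted list: the layout depends on content only."""

    def __init__(self):
        self.items = []

    def _find(self, x):
        i = bisect.bisect_left(self.items, x)
        return i, i < len(self.items) and self.items[i] == x

    def insert(self, x):
        i, present = self._find(x)
        if not present:
            self.items.insert(i, x)

    def delete(self, x):
        i, present = self._find(x)
        if present:
            self.items.pop(i)

    def lookup(self, x):
        return self._find(x)[1]

# Two different histories that end with the same content.
a, b = SortedStash(), SortedStash()
for x in (7, 3, 9):
    a.insert(x)
a.delete(9)
for x in (3, 7):
    b.insert(x)
print(a.items == b.items)
```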
We don't know how to do this for cuckoo hashing with more than 2 hash functions and/or more than 1 element per bucket
Better memory utilization, better performance, but...
Expected size of connected component is not constant