
Investigating the impact of Suboptimal Hashing Functions
Article · August 2018
2018 IEEE Games, Entertainment, Media Conference (GEM).

Léonie Buckley, Jonathan Byrne, David Moloney
Movidius, Intel, Leixlip, Ireland
leonie.buckley@intel.com, jonathan.byrne@intel.com, david.moloney@intel.com
Abstract: For the purpose of volumetric data, hashing acts to map multi-dimensional space into the one-dimensional space. Hashing is a popular method to store sparse data for the purposes of both gaming and computer graphics. Traditional methods used to hash 3D volumetric data utilise large prime numbers in an attempt to achieve well-distributed hash addresses to minimise addressing collisions. These methods generate hashing addresses through randomisation. However, it has been shown that when considering dynamic data, a low addressing collision rate cannot be guaranteed through this randomising technique. In this paper, a spatial hashing implementation is investigated, and whether varying performance parameters can be improved upon through the use of DECO Hashing. DECO leverages the inherent structure present in 3D data, which exists in the sense that each coordinate in 3D space is already unique. An open source version of Chisel is investigated - OpenChisel - and it is determined whether the algorithm can be improved upon through replacing the existing hashing function with DECO Hashing.

I. INTRODUCTION

Sparse geometric data is ubiquitous in computer graphics, GIS (Geographical Information Systems) and gaming applications, and its use on power and memory-confined embedded systems is commonplace. Applications for gaming include human pose estimation [1], [2], the creation of ultra-realistic 3D models [3] and 3D scene reconstruction for AR gaming [4]. A challenge now exists to provide a sparse storage solution which provides both performance and efficient storage [5]. Hashing is an attractive method to store this sparse data as it does not require pointers, and allows for trivial lookup, insertion and deletion of data.

Hashing is used to map data of arbitrary size to an addressing space. More specifically, for the purpose of 3D volumetric data, hashing acts to map multi-dimensional data into the one-dimensional domain [6]. Hashing is a popular method to store, retrieve and delete 3D volumetric data [7], [8], [9], [10], [11].

However, the choice of hash function impacts how efficiently the data can be stored - a high probability of unique hashing addresses reduces the need for additional overheads to deal with addressing collisions. It also allows for high load factors (a measure of how full the hash table is), which reduces wasted memory in the form of empty hash table addresses.

In much of the early literature discussing hashing [12], [13], the common goal is to achieve distinct mapping addresses. However, Knuth states that “it is theoretically impossible to define a hash function that creates truly random data from nonrandom data in files” [13]. The birthday paradox [14] highlights the difficulty in achieving distinct addresses: if a random function is selected to map 23 keys to a table of size 365, the probability that no two keys map to the same location is only 0.4927.

In hashing, the address calculation is generally achieved by a randomised scrambling of key values, with many methods using large primes to achieve this scrambling [7], [8], [9], [10], [11]. A hash is “a random jumble achieved by hashing” [12]. However, it has been shown that this randomising method, using XOR Hashing, provides no predictability as to how it will perform on a diverse range of data [15]. This is due to hardcoded prime values. If the data being considered is static, then the hashing algorithm can be run multiple times to determine the optimum prime values to use, but this is not possible when dynamic data is being received from sensors on robots and drones.

An optimal solution is to provide a perfect hashing function, which allows the retrieval of data in a hash table with a single query [16], [17], [18]. This would provide a hash with no addressing collisions. Intuitively, this could be achieved by providing a suitably large hash table. However, this is not always practical as volumetric data is often used in memory constrained situations, such as for SLAM (Simultaneous Localisation And Mapping) and GIS applications on mobile and robotic platforms [8], [11].

The Chisel algorithm is a system for real-time 3D reconstruction onboard a Google Tango [8], which stores the 3D data through dynamic spatial-hashing. In the investigation of the Chisel algorithm [8], it was noted that no justification was given for the choice of hash function - XOR Hashing (see Section II-A) - other than that it was used in previous works [7], [19], [9].

However, the properties of a hashing function can have a devastating impact on the performance of an algorithm. It is acknowledged that when different data is mapped to the same hash address, performance decreases [7], thus unique hash addresses are desirable.

As much computation regarding volumetric data takes place on embedded platforms - Chisel is benchmarked on two Tango devices, a phone and tablet [8] - such memory constrained devices require similarly low memory-demanding algorithms. To an extent this was taken into account in Chisel through only representing occupied data to avoid needless computation and wasted memory [8]. However, a suboptimal hashing function produces needless computation, as addressing collisions must be resolved, diverting resources from other applications. Minimising these collisions will lead to improvements in other aspects of an algorithm.
978-1-5386-6304-2/18/$31.00 ©2018 IEEE

Collisions cannot be completely avoided, but this paper investigates whether traditional XOR Hashing can be replaced with DECO, an adaptive hashing method, to reduce addressing collisions and improve performance. The investigation was implemented using OpenChisel [20], an open-source version of the Chisel chunked TSDF library.
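
The birthday-paradox figure quoted earlier in this section can be checked with a few lines of arithmetic. The sketch below assumes that each key is mapped independently and uniformly at random to one of the table addresses; the function name `prob_all_distinct` is illustrative and does not come from the paper or from OpenChisel.

```python
from math import prod


def prob_all_distinct(keys: int, slots: int) -> float:
    """Probability that `keys` independent, uniformly random addresses
    in a table of `slots` entries are all different."""
    if keys > slots:
        return 0.0  # pigeonhole: at least two keys must share an address
    # The i-th key (counting from 0) must avoid the i addresses already taken.
    return prod((slots - i) / slots for i in range(keys))


if __name__ == "__main__":
    print(f"{prob_all_distinct(23, 365):.4f}")  # 0.4927
```

Even with the table only about 6% full (23 of 365 addresses), a random mapping is more likely than not to produce at least one collision, which is the difficulty the section describes.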
II. RELATED WORK

“Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin” - John von Neumann

In his exhaustive description of hashing functions, Knuth states that a good hash function should satisfy two requirements [13]:
1) Its computation should be very fast
2) It should minimize collisions

Property (1) above, Knuth says, is machine-dependent, but property (2) is data-dependent. For this reason it is vital to exploit any prior knowledge inherent to the data to be hashed.

As mentioned in Section I, no justification was provided in Chisel [8] or [7] as to why XOR Hashing was used in either of the aforementioned implementations. XOR Hashing is described in detail below along with some other hashing functions used to hash volumetric data.

A. XOR Hashing Function

As with the technique suggested by Floyd in [12], it is desirable for an XOR hashing function to provide well distributed addresses. References [7], [8], [9], [10], [11], [21] and [22] all utilise an XOR hashing function to index 3D volumetric data into a hash table.

This hash address is retrieved using Equation (1):

$$\mathrm{hash}(x, y, z) = (P_1 \cdot x \oplus P_2 \cdot y \oplus P_3 \cdot z) \bmod N \qquad (1)$$

where $\oplus$ is the XOR operation, $(P_1, P_2, P_3)$ are hard coded prime values, x, y, z are the 3D coordinates of the data in the volumetric structure that is to be stored and N is the size of the hash table. None of [7], [8], [9], [10], [11] clarify why large primes are utilised. It is not stipulated in [22] that $(P_1, P_2, P_3)$, or the “hash coefficients”, be primes.

In the case of hashing for 3D volumetric data, XOR hashing is the most commonly used, and is thus examined in further detail.

B. Space Filling Curves

Similar to hashing, a Space Filling Curve (SFC) is a method to map multi-dimensional space into the one-dimensional space [6]. The SFC is regarded as a locality preserving mapping where points that are in close proximity in the multidimensional space are mapped to points that are in close proximity in the 1D hash array [6]. This is in contrast with traditional hashing, which should provide addresses that are uniformly distributed [7]. For 3D volumetric data, this mapping acts as a pre-processing step for applications that utilise the data, such as SLAM, graphics [5] and robotics applications [8].

An example of an SFC is the Morton SFC [23]. SFCs are also used in conjunction with Linear Octrees to address occupied voxels [24], [25].

Calculating the Morton SFC is shown in Equation 2, where M is the index of the most significant bit (MSB) of the x, y and z coordinates. Other implementations may reverse the order of the interleaving [6]. In practice the ordering is not of great importance, as long as it is kept constant.

$$SFC_{adr} = z_M\, y_M\, x_M\, z_{M-1}\, y_{M-1}\, x_{M-1} \ldots z_0\, y_0\, x_0 \qquad (2)$$

For the purpose of hashing, the SFC address is used as the hashing address as demonstrated below in Equation 3, where N is the size of the hash table.

$$\mathrm{hash}(x, y, z) = SFC_{adr} \bmod N \qquad (3)$$

Although the Morton SFC is a popular hashing addressing scheme, it is not adaptive. DECO Hashing, described below, is an adaptive method which can be optimised if there is prior knowledge of the data to be hashed. For that reason DECO Hashing is investigated in this paper over the Morton SFC.

III. DECO HASHING

The DECO (DEnsity COordinate based) method of hashing is a novel hashing method which adapts to the data being hashed to minimise addressing collisions. These qualities are discussed in Section III-A. Similar to an SFC, DECO is a locality preserving mapping [26], meaning that voxels close to each other in the multidimensional space will be close to each other in the one-dimensional space.

Once the coordinates of the chunk that an occupied voxel belongs to have been determined (referred to as ID.x, ID.y and ID.z in Equation 4), its index relative to the other chunks is determined. An example of a $4^3$ chunk is shown in Figure 1. The number of chunks that we wish to consider on a side is referred to as the Side Length (SL). Calculating this index is shown in Equation 4.

Fig. 1: A voxel chunk

$$\mathrm{index} = ID.z \cdot SL^2 + ID.y \cdot SL + ID.x \qquad (4)$$
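
The three addressing schemes introduced so far (Equations 1, 3 and 4) can be put side by side in a short sketch. This is an illustration under stated assumptions, not the OpenChisel code: the constants `P1`, `P2`, `P3` are values commonly quoted for XOR spatial hashing and are not given in this paper, the 10-bit coordinate width in `morton_address` is arbitrary, coordinates are assumed to be non-negative integers, and all function names are hypothetical.

```python
from itertools import product

# Assumed constants: commonly quoted for XOR spatial hashing, not listed in this paper.
P1, P2, P3 = 73856093, 19349663, 83492791


def xor_hash(x: int, y: int, z: int, n: int) -> int:
    """Equation 1: XOR of the scaled coordinates, reduced modulo the table size."""
    return ((P1 * x) ^ (P2 * y) ^ (P3 * z)) % n


def morton_address(x: int, y: int, z: int, bits: int = 10) -> int:
    """Equation 2: interleave coordinate bits as z_M y_M x_M ... z_0 y_0 x_0."""
    addr = 0
    for b in range(bits):
        addr |= ((x >> b) & 1) << (3 * b)      # x bit goes to the lowest slot
        addr |= ((y >> b) & 1) << (3 * b + 1)  # y bit goes to the middle slot
        addr |= ((z >> b) & 1) << (3 * b + 2)  # z bit goes to the highest slot
    return addr


def morton_hash(x: int, y: int, z: int, n: int, bits: int = 10) -> int:
    """Equation 3: Morton address reduced modulo the table size."""
    return morton_address(x, y, z, bits) % n


def deco_index(id_x: int, id_y: int, id_z: int, sl: int) -> int:
    """Equation 4: chunk index for a grid with `sl` chunks on a side."""
    return id_z * sl * sl + id_y * sl + id_x


if __name__ == "__main__":
    sl = 4
    indices = {deco_index(x, y, z, sl) for x, y, z in product(range(sl), repeat=3)}
    # Every chunk coordinate below SL gets its own index in [0, SL^3).
    print(len(indices) == sl ** 3, min(indices), max(indices))
```

The final lines check that, when every chunk coordinate is below SL, Equation 4 assigns each chunk a distinct index, so any collisions can only be introduced by the later reduction to the table size.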
The address in the hash table is then found by calculating the modulus of the index, where N is the number of table addresses, as shown in Equation 5 below.

$$\mathrm{hash\_addr} = \mathrm{index} \bmod N \qquad (5)$$

Similar addressing schemes to DECO Hashing exist [27], [28], but both approaches are more suited to static data due to the large dense hash table size and linear addressing respectively. Linear addressing removes the ability to randomly access voxels in the hash table as the table must be traversed to find the desired voxel.

A. Adaptive Qualities of DECO Hashing

DECO Hashing is a parameterisable method of hashing and can be adapted to a particular dataset. The following properties are adaptive and can be altered to minimise addressing collisions.

1) Side Length: It is possible to choose a value of SL which will guarantee that the index value, calculated using Equation 4, of every voxel will be unique. This value is the cube root of the number of voxel blocks present, or, the number of voxel blocks that exist on a side. When this optimum value is chosen, the maximum value of any identifier coordinate of a voxel block is SL − 1. This ensures that no x, y, z combinations will produce the same index value.

2) Coordinate Ordering: Given the distribution of data being considered, the ordering of x, y, z can be altered to reduce the rate of addressing collisions. For example, for large-scale LiDAR datasets such as the Dublin City Dataset [29], [30], it can be assumed that the data will be highly sparse in the z (vertical) direction (due to tall buildings). Therefore, the ordering that minimises addressing collisions under these conditions can be chosen.

The predictability that these parameters allow is in stark contrast with the stochastic nature of XOR Hashing.

IV. COMPARISON OF HASHING METHODS

Below is a description of some key concepts in hashing, as well as a description of the tests administered.

A. Parameters

The following parameters are important to understand when interpreting the tests administered. The definitions are provided from http://www.cplusplus.com/ [31]:
• Chunks - A Chunk, as described in [8], consists of a fixed grid of $N^3$ voxels.
• Bucket - A bucket is a slot in the hash table’s internal hash table where elements are assigned based on the hash value of their key.
• Hash Table Size - The number of buckets in the hash table.
• Load Factor - The load factor is the ratio between the number of entries in the hash table and the number of buckets, i.e. how full the table is.
• Maximum Load Factor - The maximum permissible load factor. When the current load factor exceeds the maximum load factor, the number of buckets is increased, which causes a rehash.
• Bucket Size - The number of entries in a single bucket. The number of elements in a bucket influences the time it takes to access a particular element in the bucket. Therefore it is preferable to minimise the number of elements within occupied buckets through reducing addressing collisions.
• Rehash - Rehashes are automatically performed by the hash table whenever its load factor is going to surpass its maximum load factor in an operation. A rehash is a reconstruction of the hash table: all the elements in the hash table are rearranged according to their hash value into the new set of buckets. This may alter the order of iteration of elements within the container.

B. Experiments Conducted

To adequately compare XOR Hashing against DECO Hashing in OpenChisel, various sequences from the Freiburg [32] dataset were examined. These are the same datasets that were used in the original Chisel [8] implementation. The exact sequences examined are detailed in Appendix A. The data hashed was voxel chunks, each of size $N_v^3$ voxels. The coordinates to be used in XOR and DECO Hashing are the coordinates of the voxel chunks, which are found through rounding a world coordinate to a chunk coordinate [8]. The OpenChisel algorithm was run for the various sequences, and at the end of each run different parameters were recorded. The following were recorded:

• Traversal Time - The number of cycles taken to traverse the hashed structure. The time taken to traverse the data structure is of importance as it provides information about the mapped area - for example, how densely occupied the area is, what the maximum occupied height is, etc. Intuitively, the faster this traversal can take place, the faster the system can proceed with other actions. A speedy traversal is of importance especially for applications such as drones and robotics which are dealing with dynamic situations and may be required to make decisions based on information gained from the traversal.
Traversal was measured at the same points for both hashing functions, and was conducted on an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz. This process was repeated numerous times, the lowest and highest outliers were excluded and an average number of cycles was recorded. The cycles were recorded using a cycle counter from FFTW [33], a cross-platform cycle counter.
• Bucket Occupancy - The percentage of hash buckets occupied.
• Collisions - The percentage of hash buckets that contained more than one voxel chunk (i.e. a collision occurred) with respect to the total number of hash buckets.
• Relative Collisions - The percentage of hash buckets that contained more than one voxel block with respect to the number of occupied hash buckets.
• Cost of Computation - The cost of computing both hashing functions. This cost was measured as the number of cycles required to calculate each function.
• Locality - The locality with which voxels in the 3D space are placed relative to each other in the hash table.

These parameters were measured after the entire dataset had been evaluated by the OpenChisel algorithm - i.e. the mesh had been constructed. The datasets used were from the Freiburg Dataset [32], which are described in more detail in Appendix A. Each of the datasets was tested individually, using the ROS bag files provided with the datasets.

The sole alteration made to the open source version of the Chisel algorithm was to swap the original XOR Hashing function with DECO Hashing. These algorithms are described in Sections II and III respectively.

Collisions

OpenChisel deals with collisions through using a C++ associative container, the unordered map. When voxel blocks are mapped to the same address in an unordered map, it is not technically considered that a collision has occurred, as the new voxel block is simply added to the bucket. While this pointer based collision chaining solution is suitable for the PC based OpenChisel implementation, pointer implementations are not suitable for memory confined embedded systems, especially when considering dynamic data. When new data is to be inserted/deleted, the pointers at times must be reassigned, which introduces additional overheads.

Instead of focusing energy and resources on collision resolution, it is preferable to focus on collision reduction, so as to avoid the need for collision resolution at all. This will release resources previously required for collision resolution for other requirements. This is of particular relevance on memory and resource confined embedded systems, such as those used in the original Chisel [8] implementation.

V. RESULTS

The results for the tests as detailed in Section IV are outlined below.

A. Traversal Time

As discussed in Section IV, the traversal time is taken as the number of cycles required to traverse the hash table. Traversal consists of querying every bucket in the hash table to determine whether it is occupied or not. It also consists of noting all voxel blocks in the bucket - if no collisions have occurred then a single query will suffice for that bucket. However, if collisions have occurred, each of the additional voxel blocks must then be queried in turn, meaning that the number of elements in a bucket influences the time it takes to access a particular element in the bucket [31].

The improvement in traversal time for the sequences described in Appendix A when XOR Hashing was replaced with DECO Hashing is shown in Figure 2. There was an improvement (i.e. a decrease) in traversal times for each of the sequences examined when XOR Hashing was replaced with DECO Hashing. The improved traversal time is due to the decrease in addressing collisions that occur when XOR Hashing is replaced with DECO Hashing, which is described in Section III. The average improvement in the traversal time when exchanging XOR Hashing for DECO Hashing was 16.72%.

Fig. 2: The improvement in traversal time when XOR Hashing was replaced with DECO Hashing

B. Bucket Occupancy

As described in Section IV, bucket occupancy is given as the percentage of hash buckets that are occupied. As many volumetric hashing solutions, such as Chisel [8], take place on memory constrained embedded systems, it is not practical to provide an infinitely large hash table to accommodate all voxels. This is why only occupied voxel chunks are represented through hashing.

Another characteristic of an optimum hash function is a fully occupied hash table [16], [17]. That is, no empty buckets exist but still no addressing collisions have occurred. This of course is not practical when dealing with dynamic data, as new data insertions would cause addressing collisions if the table is fully occupied. Therefore, the more practical optimal solution is to minimise the number of addressing collisions and maximise the bucket occupancy.

From the results in Figure 3, it is clear that DECO Hashing had a higher bucket occupancy than XOR Hashing for each of the sequences considered.

C. Collision Rate

As described in Section IV, the collision rate is given as the percentage of hash buckets that contained more than one voxel chunk (i.e. a collision occurred) with respect to the total number of hash buckets. As previously mentioned, an optimal hashing solution is to provide a perfect hashing function which allows the retrieval of data from the hash table with a single query [16], [17]. That is, no addressing collisions occur. While an optimal hashing solution is not always possible to implement, reducing the addressing collision rate as much as
possible allows one to approach the functionality of an optimal hashing function. The collision rate was compared for both XOR Hashing and DECO Hashing, which is described in Section III.

Fig. 3: The bucket occupancy rate for XOR Hashing and DECO Hashing

As described in Section IV-A, the number of elements in a bucket influences the time it takes to access a particular element in the bucket. This will increase the time taken to query, insert and delete voxel chunks. This scenario is not an attractive one when dealing with dynamic data (for example when considering robots and drones which deal with dynamic and changing environments), where voxel chunks will be queried/updated on a regular basis. The collision rates for both XOR Hashing and DECO Hashing are shown below in Figure 4 for the sequences described in Appendix A. It is clear that the collision rate for DECO Hashing is less than that of XOR Hashing for each of the sequences examined.

Fig. 4: The collision rate for XOR Hashing and DECO Hashing

D. Relative Collision Rate

As described in Section IV-B, the relative collision rate is given as the percentage of hash buckets that contained more than one voxel block with respect to the number of occupied hash buckets. This is an important parameter to measure, as the collision rate by itself could mask some undesirable underlying behaviour. For example, a collision rate that indicated that only a small number of buckets contained a collision may seem like a good result. But this result could be masking the fact that these buckets could have a high number of collisions within them.

Therefore, while it is desirable to have a low collision rate relative to all hash buckets, it is also important to have a low collision rate relative to occupied hash buckets.

The relative collision rates for both XOR Hashing and DECO Hashing are shown below in Figure 5 for the sequences described in Appendix A. It is clear that the relative collision rate for DECO Hashing is less than that of XOR Hashing for each of the sequences examined.

Fig. 5: The improvement in the relative collision rate when XOR Hashing was replaced with DECO Hashing

E. Cost of Computation

The cost of computing both the XOR and DECO Hash was compared. If additional computation is required to calculate a hash address, this impacts the performance of the whole algorithm. To properly investigate the cost of both hashing functions, they were tested in isolation, not as part of the OpenChisel algorithm. The operations required to calculate each of the hashes are shown in Table I.

        Multiplication   XOR   Addition   Modulus
XOR     3                2     0          1
DECO    0                0     2          1

TABLE I: Operations required for XOR and DECO Hashing

The multiplications required in Equation 4 can be omitted through the use of a look-up table (LUT). A LUT is not considered for XOR Hashing as LUTs were not utilised in the original OpenChisel implementation. The equations were each calculated 10,000 times, and the average number of cycles required was then calculated. The cycles were recorded using a cycle counter from FFTW [33], a cross-platform cycle counter. These results are shown in Figure 6 and demonstrate that, on average, it requires 2.3 times as many cycles to calculate the XOR Hashing function as it does to calculate the DECO Hashing function. This demonstrates a further advantage of DECO Hashing over the
traditional XOR method - it requires fewer cycles to compute, and thus will lead to an overall decrease in execution time for the algorithm that it is being used in.

Fig. 6: The cycles required to compute both the XOR and DECO Hashing functions

F. Locality of mappings

As discussed in Section II, both SFCs and DECO Hashing are locality preserving mappings - points that are in close proximity in the multidimensional space are mapped to points that are in close proximity in the 1D hash array [6], [26]. This is in contrast with traditional hashing, which should provide addresses that are uniformly distributed [6].

This locality preservation is of particular importance for graphics applications, as points that are in close proximity in the multidimensional space are frequently accessed in quick succession in memory [5], [26]. This greater locality of memory accesses allows for improved caching performance and thus improved execution speed.

To demonstrate the further advantages of DECO Hashing over XOR Hashing, the locality of data within the hash table was contrasted for the Stanford Bunny [34]. The parameters of the bunny and the hash table used are shown in Table II. The bunny is shown in Figure 7.

                                 Bunny
Resolution                       $64^3$
No. of occupied voxels           11070
% Occupancy                      4.2%
Max x                            63
Max y                            62
Max z                            49
Hash Table Size (No. Entries)    2040
Hash Table Size (Bytes)          32kB

TABLE II: Parameters for the Stanford Bunny

Fig. 7: Voxel Distribution of the Stanford Bunny. Colours are provided for comparative purposes.

The locality of voxels within the hash tables for XOR and DECO Hashing is shown in Figures 8a and 8b respectively. It is clear from these hash tables that DECO Hashing stores the voxels with much more locality than XOR Hashing. That is, voxels that are close to each other in 3D space appear close to each other in the 1D space in the hash table (the table was converted to 2D for comparative purposes only). As mentioned previously, locality preservation allows for improved caching performance and improved execution speed, which provides an explanation for the superiority of DECO Hashing over XOR Hashing in all of the experiments conducted in Section V.

Fig. 8: Distribution of voxels in the Hash Table for (a) XOR and (b) DECO Hashing for the Stanford Bunny. The colours of the voxels relate to the voxels in Figure 7.

VI. DISCUSSION

It is apparent from Section V that DECO Hashing outperformed XOR Hashing in all of the experiments detailed in Section IV. This is due to DECO exploiting the fact that
every voxel in 3D space possesses unique coordinates. This XOR Hashing for Hashing volumetric data. The improvements
ensures that the index calculated in Equation 4 will be unique. that were found in each of the experiments conducted is due
Collisions will only occur because the modulus of the hash to the fact that DECO Hashing exploits the inherent structure
table size must be taken so that the hash addresses will fit in that 3D data possesses through every voxel possessing unique
the table - this is calculated using Equation 5. coordinates. While DECO Hashing does dramatically decrease
While technically XOR Hashing is deterministic (the same the rate of addressing collisions, it does not eliminate them
input will always produce the same output), the address calcu- completely.
lation is achieved through the randomised scrambling of key
values, utilising large primes. This random nature can provide VIII. F UTURE W ORK
no guarantee of unique hashing addresses. While DECO does
not guarantee unique addresses either, it has been shown to add An obvious expansion to DECO Hashing is to include
a certain level of predictability, through every index calculation addressing collision resolution. To date this has not been im-
being unique. This in turn increases the probability of unique plemented due to the low percentage of addressing collisions
hash addresses. These are the properties that allowed DECO that have occurred on the datasets tested.
Hashing to outperform XOR Hashing in all of the experiments A further consideration is to add condensation as is used
that were conducted. with linear quad/octrees [35], [24], [25], [36]. Condensation
concerns representing a large number of occupied voxels
VII. C ONCLUSION
with a single identifier. The use of condensation in linear
The following have been determined through the replace- quad/octrees has been shown to, in best case scenarios, require
ment of XOR Hashing with DECO Hashing for Open- only 2% of the memory required by regular quad/octrees.
Chisel [20] an open-source implementation of Chisel [8]. The While it has been shown that DECO is superior to XOR
experiments administered are detailed in Section IV. Hashing when hashing 3D volumetric data, one must always
Discussing the use of XOR Hashing, [7] states that “Al- take caution when using hash functions for other purposes.
though the hash function does not always provide a unique Knuth offers a word of caution “we can never be completely
mapping of grid cells, it can be generated very efficiently sure that a hash function will perform properly when it is
and does not require complex a data structure”. This paper applied to a new set of data.”
has shown that replacing XOR Hashing with DECO leads to
substantial performance gains - DECO Hashing outperformed
XOR Hashing in every test administered: A PPENDIX
• Traversal Time - An improvement of 16.82% was found A. Datasets
when XOR Hashing was replaced with DECO Hashing. This is due to the decrease in addressing collisions when comparing DECO Hashing with XOR Hashing.
• Bucket Occupancy - Again, DECO Hashing outperformed XOR Hashing in this experiment. For the sequences examined, XOR Hashing had an average bucket occupancy of 21.08%, while DECO Hashing had an average bucket occupancy of 27.76%.
• Collision Rate - Again, DECO Hashing outperformed XOR Hashing in this experiment. For the sequences examined, XOR Hashing had an average collision rate of 5.7%, while DECO Hashing had an average collision rate of 2.57%.
• Relative Collision Rate - Again, DECO Hashing outperformed XOR Hashing in this experiment. For the sequences examined, XOR Hashing had an average relative collision rate of 21.71%, while DECO Hashing had an average relative collision rate of 6.29%.
• Locality - DECO Hashing was shown to store voxels in the hash table with more locality than XOR Hashing. That is, voxels close to each other in the multi-dimensional space are close to each other in the one-dimensional space.
Given the results above it is the view of the authors of this paper that DECO Hashing is a preferable method to Traditional

The datasets examined in this paper are from the Freiburg Dataset [32]. Varying sequences were chosen to ensure a wide range of data distribution within the datasets. Listed below are the sequences used along with a brief description [32].
• freiburg2_xyz - This sequence contains very clean data for debugging translations. The Kinect was moved along the principal axes in x-, y- and z-direction very slowly. The slow camera motion basically ensures that there is (almost) no motion blur and rolling shutter effects in the data.
• freiburg3_walking_static - Two persons walk through an office scene. The Asus Xtion sensor has been kept in place manually. This sequence is intended to evaluate the robustness of visual SLAM and odometry algorithms to quickly moving dynamic objects in large parts of the visible scene.
• freiburg1_360 - This sequence contains a 360 degree turn in a typical office environment.
• freiburg3_walking_xyz - Two persons walk through an office scene. The Asus Xtion sensor has manually been moved along three directions (xyz) while keeping the same orientation. This sequence is intended to evaluate the robustness of visual SLAM and odometry algorithms to quickly moving dynamic objects in large parts of the visible scene.
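As a rough illustration of how a collision figure such as those reported for XOR Hashing might be measured, the sketch below hashes integer voxel coordinates with the multiplier-and-XOR scheme of [7] and counts how many voxels land on an address that is already occupied. It assumes that XOR Hashing refers to that scheme with the constants published in [7]. The function names and the definition of collision rate used here (voxels beyond the first at each address, divided by the total number of voxels) are assumptions made for illustration and may differ from the definitions used in the experiments. DECO Hashing is not sketched.

```python
from collections import Counter

# Multipliers quoted from the spatial hash of Teschner et al. [7].
P1, P2, P3 = 73856093, 19349663, 83492791


def xor_hash(x, y, z, table_size):
    """Map an integer voxel coordinate to an address in [0, table_size)."""
    return ((x * P1) ^ (y * P2) ^ (z * P3)) % table_size


def collision_rate(voxels, table_size):
    """Fraction of voxels that hash to an address already used by another voxel.

    Assumed definition: for each address, every voxel after the first
    counts as one collision; the total is divided by the voxel count.
    """
    voxels = list(voxels)
    if not voxels:
        return 0.0
    counts = Counter(xor_hash(x, y, z, table_size) for x, y, z in voxels)
    collided = sum(c - 1 for c in counts.values())
    return collided / len(voxels)


if __name__ == "__main__":
    # Hypothetical input: a dense 16 x 16 x 16 block of voxel coordinates.
    block = [(x, y, z) for x in range(16) for y in range(16) for z in range(16)]
    print(f"collision rate: {collision_rate(block, table_size=8192):.4f}")
```

Running the same measurement with a second hash function over the same voxel set and table size gives a like-for-like comparison of the kind summarised in the results above.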
REFERENCES

[1] S.-R. Ke, L. Zhu, J.-N. Hwang, H.-I. Pai, K.-M. Lan, and C.-P. Liao, “Real-time 3d human pose estimation from monocular view with applications to event detection and video gaming,” in Advanced Video and Signal Based Surveillance (AVSS), 2010 Seventh IEEE International Conference on, pp. 489–496, IEEE, 2010.
[2] D. Mehta, S. Sridhar, O. Sotnychenko, H. Rhodin, M. Shafiei, H.-P. Seidel, W. Xu, D. Casas, and C. Theobalt, “Vnect: Real-time 3d human pose estimation with a single rgb camera,” ACM Transactions on Graphics (TOG), vol. 36, no. 4, p. 44, 2017.
[3] E. Bondarev, F. Heredia, R. Favier, L. Ma, and P. H. de With, “On photo-realistic 3d reconstruction of large-scale and arbitrary-shaped environments,” in Consumer Communications and Networking Conference (CCNC), 2013 IEEE, pp. 621–624, IEEE, 2013.
[4] M. Dzitsiuk, J. Sturm, R. Maier, L. Ma, and D. Cremers, “De-noising, stabilizing and completing 3d reconstructions on-the-go using plane priors,” in Robotics and Automation (ICRA), 2017 IEEE International Conference on, pp. 3976–3983, IEEE, 2017.
[5] I. García, S. Lefebvre, S. Hornus, and A. Lasram, “Coherent parallel hashing,” in ACM Transactions on Graphics (TOG), vol. 30, p. 161, ACM, 2011.
[6] M. F. Mokbel, W. G. Aref, and I. Kamel, “Performance of multi-dimensional space-filling curves,” in Proceedings of the 10th ACM international symposium on Advances in geographic information systems, pp. 149–154, ACM, 2002.
[7] M. Teschner, B. Heidelberger, M. Müller, D. Pomeranets, and M. Gross, “Optimized spatial hashing for collision detection of deformable objects,” vol. 3, 12 2003.
[8] M. Klingensmith, I. Dryanovski, S. Srinivasa, and J. Xiao, “Chisel: Real time large scale 3d reconstruction onboard a mobile device using spatially hashed signed distance fields,” Robotics: Science and Systems XI, 2015.
[9] M. Nießner, M. Zollhöfer, S. Izadi, and M. Stamminger, “Real-time 3d reconstruction at scale using voxel hashing,” ACM Transactions on Graphics, vol. 32, pp. 1–11, Jan 2013.
[10] M. Eitz and G. Lixu, “Hierarchical spatial hashing for real-time collision detection,” in Shape Modeling and Applications, 2007. SMI’07. IEEE International Conference on, pp. 61–70, IEEE, 2007.
[11] O. Kähler, V. A. Prisacariu, C. Y. Ren, X. Sun, P. Torr, and D. Murray, “Very high frame rate volumetric integration of depth images on mobile devices,” IEEE transactions on visualization and computer graphics, vol. 21, no. 11, pp. 1241–1250, 2015.
[12] G. D. Knott, “Hashing functions,” The Computer Journal, vol. 18, no. 3, pp. 265–278, 1975.
[13] D. E. Knuth, The art of computer programming: sorting and searching, vol. 3. Pearson Education, 1998.
[14] I. J. Good, “Probability and the weighing of evidence,” 1950.
[15] L. Buckley, J. Byrne, and D. Moloney, “Investigating the use of primes in hashing for volumetric data,” in Proceedings of the 4th International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM, pp. 304–312, INSTICC, SciTePress, 2018.
[16] G. Jaeschke, “Reciprocal hashing: A method for generating minimal perfect hashing functions,” Communications of the ACM, vol. 24, no. 12, pp. 829–833, 1981.
[17] R. Sprugnoli, “Perfect hashing functions: a single probe retrieving method for static sets,” Communications of the ACM, vol. 20, no. 11, pp. 841–850, 1977.
[18] S. Lefebvre and H. Hoppe, “Perfect spatial hashing,” in ACM Transactions on Graphics (TOG), vol. 25, pp. 579–588, ACM, 2006.
[19] M. Teschner, B. Heidelberger, M. Müller, and M. Gross, “A versatile and robust model for geometrically complex deformable solids,” in Computer Graphics International, 2004. Proceedings, pp. 312–319, IEEE, 2004.
[20] M. Klingensmith, “personalrobotics/openchisel,” Jul 2017.
[21] W. Dong, J. Shi, W. Tang, X. Wang, and H. Zha, “An efficient volumetric mesh representation for real-time scene reconstruction using spatial hashing,” arXiv preprint arXiv:1803.03949, 2018.
[22] O. Kähler, V. Prisacariu, J. Valentin, and D. Murray, “Hierarchical voxel block hashing for efficient integration of depth images,” IEEE Robotics and Automation Letters, vol. 1, no. 1, pp. 192–197, 2016.
[23] G. M. Morton, A computer oriented geodetic data base and a new technique in file sequencing. International Business Machines Company New York, 1966.
[24] I. Gargantini, “Linear octtrees for fast processing of three-dimensional objects,” Computer graphics and Image processing, vol. 20, no. 4, pp. 365–374, 1982.
[25] A. Knoll, “A survey of octree volume rendering methods,” Scientific Computing and Imaging Institute, University of Utah, 2006.
[26] B. Moon, H. V. Jagadish, C. Faloutsos, and J. H. Saltz, “Analysis of the clustering properties of the hilbert space-filling curve,” IEEE Transactions on knowledge and data engineering, vol. 13, no. 1, pp. 124–141, 2001.
[27] E. J. Hastings, J. Mesit, and R. K. Guha, “Optimization of large-scale, real-time simulations by spatial hashing,” in Proc. 2005 Summer Computer Simulation Conference, vol. 37, pp. 9–17, 2005.
[28] C. T. Pozzer, C. A. de Lara Pahins, and I. Heldal, “A hash table construction algorithm for spatial hashing based on linear memory,” in Proceedings of the 11th Conference on Advances in Computer Entertainment Technology, p. 35, ACM, 2014.
[29] D. F. Laefer, S. Abuwarda, A.-V. Vo, L. Truong-Hong, and H. Gharibi, “2015 aerial laser and photogrammetry survey of dublin city collection record,” 2015.
[30] D. F. Laefer, S. Abuwarda, A.-V. Vo, L. Truong-Hong, and H. Gharibi, “Dublin als2015 lidar license (cc-by 4.0).” https://geo.nyu.edu/catalog/nyu_2451_38684, 2017.
[31] “Unordered map.”
[32] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, “A benchmark for the evaluation of rgb-d slam systems,” in Proc. of the International Conference on Intelligent Robot Systems (IROS), Oct. 2012.
[33] M. Frigo and S. G. Johnson, “Fftw: An adaptive software architecture for the fft,” in Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, vol. 3, pp. 1381–1384, IEEE, 1998.
[34] G. Turk and M. Levoy, “The stanford bunny,” 2005.
[35] I. Gargantini, “An effective way to represent quadtrees,” Communications of the ACM, vol. 25, no. 12, pp. 905–910, 1982.
[36] S. Chandran, A. K. Gupta, and A. Patgawkar, “A fast algorithm to display octrees,” in Indian Conference in Computer Vision, Graphics and Image Processing (ICVGIP), 2000.