
Ubiquitous Computing and Communication Journal

Continuous Reverse Nearest Neighbor Search


Lien-Fa Lin*, Chao-Chun Chen
Department of Computer Science and Information Engineering, National Cheng-Kung University, Tainan, Taiwan, R.O.C.
Department of Information Communication, Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C.
lienfa@cc.kyu.edu.tw, chencc@mail.stut.edu.tw

ABSTRACT
Query services based on the location of an object are called Location-Based Services (LBSs), and Reverse Nearest Neighbor (RNN) queries are one of them. RNN queries have diverse applications, such as decision support, market decisions, document retrieval, and biological information. Past studies of RNN, however, focused on stationary inquirers and did not consider the continuous demand for RNN queries under movement. In a wireless network environment, users are often in motion, and issuing a query while moving is natural behavior; providing such a service therefore becomes very important. We refer to this type of query as a Continuous Reverse Nearest Neighbor (CRNN) query. Because an inquirer's location changes over time, an RNN query returns different results at different locations. For a CRNN query, executing an RNN search at every point of time during the continuous query period would be prohibitively expensive. In this work, an efficient algorithm is designed to produce the exact result of a CRNN query in a single execution. In addition, extensive experiments were conducted to verify the proposed method, and the results show a significant improvement in efficiency.
Keywords: Location Based Services, Location-Dependent Query, Continuous Query, Reverse Nearest Neighbor Query, Continuous Reverse Nearest Neighbor Query
1. INTRODUCTION

As wireless network communications and mobile device technology develop vigorously and positioning technology gradually matures, LBS is becoming a key development area in both industry and academia [2, 5, 13, 21, 26, 27]. According to the report IT Roadmap to a Geospatial Future [6], LBSs will embrace pervasive computing and transform mass advertising media, marketing, and many other facets of society in the upcoming decade. Although LBSs already exist in the traditional computing environment (such as Yahoo! Local), their greatest development potential lies in mobile computing, which offers freedom of mobility and access to information anywhere.
LBSs are expected to become an indispensable application of mobile networks as the required technology matures and 3G wireless communication infrastructure is deployed everywhere. The queries that answer LBS requests are referred to as Location-Dependent Queries (LDQs), whose applications include the Range query, the Nearest Neighbor (NN) query, the K-Nearest Neighbor (KNN) query, and the Reverse Nearest Neighbor (RNN) query.
There are many studies of NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [3, 12, 17, 20], and CKNN [17, 20] queries, and issues pertaining to the Reverse Nearest Neighbor (RNN) query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. Given a collection of objects S and a query object q, an RNN query finds the objects in S whose nearest neighbor is q. Practical examples of RNN queries are provided in [10]. If a bank plans to open a new branch, and its clients prefer the branch closest to them, then the new branch should be established at a location whose distance to the majority of clients is shorter than that of the other banks. Taxi cabs selecting passengers is another good example. If a taxi uses wireless devices to find the locations of customers, then RNN queries are far more advantageous than NN queries from the aspect of competition. Figure 1 illustrates that Customer c is the nearest neighbor of Taxi a, but that does not necessarily mean Taxi a can capture Customer c, because Taxi b is even closer to Customer c. On the contrary, the best option for Taxi a is Customer d, because Taxi a is the nearest neighbor of Customer d. That is, d is the RNN of a, and a may reach d faster than any other taxi. This becomes a CRNN query when the query object, the taxi, changes location over time. Mobile users are by nature mobile in a wireless environment, which is why continuous queries are an important issue in such an environment.
To the best of our knowledge, no prior work has addressed this issue. Because an inquirer's location changes constantly over time, the changes of location cause RNN queries to return different results. For a CRNN query, executing an RNN search at every point of time during the continuous query period would be prohibitively expensive: the larger the number of objects and the shorter the time step, the longer the computation takes.
In addition, because time is continuous, defining an appropriate time step for the RNN searches is itself a concern. If the interval between RNN searches is too short, more RNN searches must be executed to complete the CRNN query; conversely, if an RNN result is reused over a longer period to reduce the number of executions, the result for the whole period loses accuracy due to insufficient sampling.
In this paper, a more efficient algorithm is designed to replace executing an RNN search at every point of time: a single execution of the CRNN query suffices to properly divide the query time the user is interested in into segments that share the same answer and to find the RNN of each segment.
In addition, an index is used to filter out unnecessary objects, reducing the search space and improving CRNN search efficiency. The experimental results show that using the index is about 20 times more efficient than not using it when the number of objects is 1000.
This study makes three major contributions:
- It pioneers continuous query processing for the RNN problem, in contrast to static RNN queries.
- A CRNN search algorithm is proposed in which a single execution returns all CRNN results.
- The proposed method extends an index that was previously applicable only to finding the RNN of a single query point so that it supports CRNN queries, improving CRNN search efficiency.
The remainder of this paper is organized as follows. Related work on RNN search is introduced in Section 2. The problem is defined and the assumptions are stated in Section 3. The proposed CRNN search algorithm is presented in Section 4. The experimental environment and the performance evaluation are described in Section 5. Finally, conclusions and future research directions are given in Section 6.

Figure 1: Example of RNN query.

2. RELATED WORK

An RNN search finds the objects for which the query q is the NN. Related studies of RNN search are introduced and summarized in this section along two lines:
- Index methods that support RNN search. The number of objects can be huge; if the distance from query q to every object had to be computed to identify the RNN of q, the efficiency would be unacceptably low due to the overwhelming computation cost. To accelerate processing, most studies adopt index methods; the major ones are introduced in this section.
- RNN searches of different types. RNN searches in different scenarios are described and categorized according to whether query q and the objects are static or moving.
2.1 Index Methods for RNN Query

An RNN search finds the objects whose NN is the query q, which requires knowing the distance between query q and each object, i.e., the distance from the coordinates of q to the coordinates of the object. For a given q, not every object is its RNN, and objects that cannot possibly be RNNs can be left out of consideration, reducing the number of objects that must be examined and accelerating RNN search. Many studies have therefore been dedicated to designing effective index structures over object coordinates. The best-known ones are the R-tree proposed by [8] and the Rdnn-tree proposed by [22]. These two index methods are described below.

2.1.1 R-Tree
The R-tree is an index structure developed early on for spatial databases and was used by [10] to accelerate RNN search processing. All objects are grouped and placed on leaf nodes according to the closeness of their coordinates; that is, objects with similar coordinates are put in one group. Each group of objects is then enclosed in the smallest possible rectangle, called a Minimum Bounding Rectangle (MBR). Next, MBRs are grouped into clusters, which are enclosed in larger MBRs, until all objects are contained in a single MBR. An internal node of an R-tree stores the MBR that encloses all nodes underneath it, and the root of the R-tree contains all objects. The size and range of an MBR are defined by its lower-left coordinate (Ml, Md) and upper-right coordinate (Mr, Mu). Figure 2 is an example of an R-tree. There are 12 objects in total, a to l; (a, b, c, d) belong to MBR b1, and (e, f, g) belong to MBR b2. MBRs b1 and b2 belong to MBR B1, and MBR R contains all objects.
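To make the MBR notion concrete, the following is a minimal sketch, not taken from the paper, of an MBR record defined by its lower-left corner (Ml, Md) and upper-right corner (Mr, Mu), together with the standard minimum point-to-MBR distance that R-tree search relies on. The names MBR and min_dist and the sample coordinates are our own.

```python
import math
from dataclasses import dataclass

@dataclass
class MBR:
    """Minimum Bounding Rectangle given by its lower-left (ml, md) and upper-right (mr, mu) corners."""
    ml: float   # left  (minimum x)
    md: float   # down  (minimum y)
    mr: float   # right (maximum x)
    mu: float   # up    (maximum y)

    def contains(self, x: float, y: float) -> bool:
        return self.ml <= x <= self.mr and self.md <= y <= self.mu

def min_dist(x: float, y: float, r: MBR) -> float:
    """Minimum Euclidean distance from the point (x, y) to the rectangle r (0 if inside)."""
    dx = max(r.ml - x, 0.0, x - r.mr)
    dy = max(r.md - y, 0.0, y - r.mu)
    return math.hypot(dx, dy)

# Example with made-up coordinates standing in for MBR b1 of Figure 2:
b1 = MBR(0.1, 0.1, 0.3, 0.4)
print(b1.contains(0.2, 0.2), min_dist(0.5, 0.2, b1))   # True, ~0.2
```

min_dist returns 0 when the point lies inside the rectangle, which is exactly the property that index-based filtering exploits.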

Figure 2: Example of R-tree indexing

2.1.2 Rdnn-Tree
The Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves on the method of [10]. The authors propose a single index structure (the Rdnn-tree) that answers NN queries and RNN queries at the same time. The Rdnn-tree differs from the standard R-tree by storing extra information about the nearest neighbor of the points in each node. Each leaf node of the Rdnn-tree stores entries of the form (ptid, dnn), as shown in Figure 3, where ptid identifies an object (a point in the data space) and dnn is the distance from that object to its NN. Each non-leaf node stores entries of the form (ptr, Rect, MaxDnn), where ptr points to the address of a child node, Rect is the MBR containing all child nodes subordinate to this node, and MaxDnn is the maximum dnn value over all objects in the subtrees below this node. In other words, the distance from any object contained in these subtrees to its NN will not exceed MaxDnn.

Figure 3: Data structure of the Rdnn-tree

2.2 Categories of RNN Queries
Depending on whether the query q and the query objects are static or moving, related studies can be summarized into four categories:
1. If query q and the query objects are both static, the category is called static query vs. static objects.
2. If query q is moving and the query objects are static, the category is called moving query vs. static objects.
3. If query q is static and the query objects are moving, the category is called static query vs. moving objects.
4. If both query q and the query objects are moving, the category is called moving query vs. moving objects.

2.2.1 Static query vs. static objects

The scenario in which both query q and the query objects are static is discussed first, because the query and the query objects are immobile and therefore easier to process than the other scenarios. The method proposed in [10] is introduced here. For a static database, the author adopts a special R-tree, called the RNN-tree, for answering RNN queries. For a database that requires frequent updates, the author proposes a combined use of an NN-tree and an RNN-tree: the NN of every object is stored in the RNN-tree, while the NN-tree stores the objects themselves and their respective NNs. The author draws a circle around every object, using the object as the center and the distance from the object to its NN as the radius, and then examines every circle that contains query q to find the answers of the RNN query. Such a method, however, is very inefficient for dynamic databases, because the structures of the NN-tree and RNN-tree must be changed whenever the database is updated. In [22], the method proposed by [10] is therefore improved. The author proposes a single index structure, the Rdnn-tree, for answering NN queries and RNN queries at the same time. It differs from the normal R-tree in that it additionally stores the NN information of every object (i.e., the distance to its nearest neighbor), and the NN of every object must be calculated in advance.
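As a rough illustration of the extra information the Rdnn-tree keeps, the sketch below precomputes each object's distance to its nearest neighbor (dnn) by brute force and shows the two entry types described above, (ptid, dnn) for leaf nodes and (ptr, Rect, MaxDnn) for internal nodes. This is our own simplification for illustration, not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

def dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def compute_dnn(objects: List[Point]) -> List[float]:
    """Brute-force distance from each object to its nearest neighbor (O(n^2) precomputation)."""
    return [min(dist(p, q) for j, q in enumerate(objects) if j != i)
            for i, p in enumerate(objects)]

@dataclass
class LeafEntry:            # (ptid, dnn) as stored in an Rdnn-tree leaf node
    ptid: int
    point: Point
    dnn: float

@dataclass
class InternalEntry:        # (ptr, Rect, MaxDnn) as stored in a non-leaf node
    ptr: object                               # reference to the child node
    rect: Tuple[float, float, float, float]   # MBR (ml, md, mr, mu) of the subtree
    max_dnn: float                            # maximum dnn over all objects in the subtree

objs: List[Point] = [(0.1, 0.1), (0.15, 0.12), (0.8, 0.7)]
leaves = [LeafEntry(i, p, d) for i, (p, d) in enumerate(zip(objs, compute_dnn(objs)))]
```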
2.2.2 Static query vs. moving objects
The studies mentioned above primarily assume a monochromatic situation, in which all objects, including query q and the query objects, are of the same type. In [18], the researchers address this type of problem in a bichromatic situation, where the objects are divided into two types: one is the inquirer, and the other is the query object. NN and range query techniques are used in that paper to handle RNN queries.

2.2.3 Moving query
This subsection discusses the situation in which the inquirer is no longer static but changes location over time, while the query objects can be either static or moving. That is, two categories of query are involved: moving query with static objects and moving query with moving objects. Because the inquirer is moving, these two categories of query will return different results for the same RNN search at different points of time. This type of problem is obviously more complicated than the problems previously discussed. To the best of our knowledge, no related study has discussed these two categories. In this study, solutions for a moving query with static objects are pursued.

3. Problem Formulation
A CRNN query concerns a period of continuous time, and adjacent points of time in this period may have the identical RNN result. That is, a segment of time may share the same RNN until the query q moves beyond this segment. Consider Figure 4. A user executes a CRNN query q, the time interval of the continuous query is [Ps, Pe], and the query objects are {a, b, c, d}. If a time point P1 can be identified such that every point of time in the segment from Ps to P1, i.e., [Ps, P1], has the same RNN result, then one execution of RNN search suffices for the segment [Ps, P1]. If the time points P2, P3, and P4 are also identified, such that every point of time in [P1, P2] has the identical RNN result, every point of time in [P2, P3] has the identical result, and every point of time in [P3, P4] has the identical result, then the entire CRNN query needs only one execution of RNN search per time segment.
For processing a CRNN query, it is therefore not necessary to execute an RNN search at every point of time and return the RNN of every point of time to the query q; instead, i time segments, segment1, segment2, ..., segmenti, each having the same result, are first identified within the entire CRNN query time period. The RNN result of each segment, RNN1, RNN2, ..., is then calculated separately, and the result is returned to the inquirer in the format (q, [segment1]) = {RNN1 result}, ..., (q, [segmenti]) = {RNNi result}.

Figure 4: Example of a CRNN query

Based on the description above, the CRNN query problem studied in this paper can be stated as follows.
Given:
A collection of static objects S = {O1, O2, ..., On};
A query point q, its current position (q.x, q.y), and its moving velocity (v.x, v.y);
A continuous query interval [Ps, Pe], where Ps denotes the position of the query point at the time the query begins and Pe its position at the time the query ends.
Find:
The points of time {P1, P2, ..., Pi} within [Ps, Pe] such that the RNN of q between any two adjacent points of time remains constant.
Such that:
RNN(q, [Ps, P1]) = {RNN1}, RNN(q, [P1, P2]) = {RNN2}, ..., RNN(q, [Pi, Pe]) = {RNNi},
where [Pi, Pe] ⊆ [Ps, Pe] and each {RNNi} ⊆ {O1, O2, ..., On}.
Under the assumptions:
1. The moving direction of query q is fixed.
2. All query objects are static.
As described above, two adjacent points of time may share the same RNN; that is, a time segment has only one RNN result until query q moves into another segment with a different RNN. The CRNN search algorithm proposed in this study uses exactly this observation. First, the points of time at which the RNN result changes within the query time are identified. These points of time divide the query time into several segments with different RNN results, and the RNN result of each segment is then identified. The detailed CRNN search algorithm is explained in the next section.
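To pin down the notation, the following sketch, ours rather than the paper's, models the inputs of the problem: a set of static objects, a query with current position (q.x, q.y) and velocity (v.x, v.y), and the interval [Ps, Pe]. Under the fixed-direction assumption the query position is a linear function of time, and the expected answer is a list of (segment, RNN set) pairs.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Point = Tuple[float, float]

@dataclass
class CRNNQuery:
    pos: Point      # position of q at the start of the query, i.e., at Ps  -> (q.x, q.y)
    vel: Point      # moving velocity (v.x, v.y); the direction is fixed (assumption 1)
    t_start: float  # time corresponding to Ps
    t_end: float    # time corresponding to Pe

    def position_at(self, t: float) -> Point:
        """Position of the query point at time t (straight-line motion)."""
        dt = t - self.t_start
        return (self.pos[0] + self.vel[0] * dt, self.pos[1] + self.vel[1] * dt)

# Static objects S = {O1, ..., On}, given as coordinates.
Objects = List[Point]

# Expected answer: one entry per time segment [t_i, t_{i+1}] with the ids of its RNN objects.
Answer = List[Tuple[Tuple[float, float], Set[int]]]
```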

4. CRNN Search Algorithm

The detailed procedure of the CRNN search algorithm is introduced in this section. The CRNN search algorithm is divided into two steps.


Step 1: Finding the segment points of the CRNN query. The points of time that produce different RNN results are identified. Based on these points of time, the CRNN query is divided into several time segments, each requiring one execution of RNN search. The RNN result at any point of time within one segment remains constant, and different segments have different RNN results.
Step 2: Calculating the RNN result of each segment. The RNN result of each of the segments produced in the previous step is calculated separately.
The entire procedure for processing a CRNN search is illustrated in Figure 5. Given the query objects and the continuous query (the query path), the procedure is divided into the two steps above, finding the segment points of the CRNN query and calculating the RNN result of each segment; each step is described below.

Figure 5: Flow chart of CRNN query processing
4.1 Finding segment points of CRNN query
What a CRNN query concerns is a period of continuous time. The distance moved by the query between some adjacent points of time is very short, so those points may yield the same RNN result. That is, the entire continuous query period can be divided into several segments such that the RNN results within each segment are the same. If a set of time points shares the same RNN result, it is not necessary to execute an RNN search for each of them; one calculation is enough. Therefore, a CRNN query does not require executing an RNN search at every point of time. Instead, points of time that share the same RNN result are grouped into time segments, and one RNN search is executed per segment. The RNN of query q is the collection of objects whose NN is q. If the NN distance of each object is known in advance, then an object is an RNN of query q whenever the distance from q to the object is shorter than the distance from the object to its NN.
As illustrated in Figure 6, if the NN of object a is b, and a circle is drawn with a as the center and the distance ab as the radius, then as long as query q falls within this circle, the distance from q to a is shorter than the distance from a to its NN b. Therefore, during the period of time when query q remains within this circle, the RNN result must include object a; once query q moves out of the circle, it no longer does. Because the moving direction of query q is assumed to be fixed, the CRNN query forms a query line (qline) from its beginning to its end. The point at which the CRNN query begins to leave the circle is the intersection S of the circle and the query line formed by the CRNN query. Before intersection S, the result of the RNN query includes object a; beyond intersection S, the result of the RNN query does not include object a; the RNN results differ. This intersection is referred to as a segment point.
This explains why an intersection of the query line with a circle whose radius is the NN distance is a point of time at which the RNN query produces a different result. Drawing a circle around every object, using the object itself as the center and the distance to its NN as the radius, makes all the intersections of these circles with the query line of the CRNN query cut the CRNN query into several time segments with different RNN results.

Figure 6: Finding the segment points of a CRNN search


Figure 7 illustrates the time segmentation process described above. For objects a, b, and c, their respective NNs are identified first: NN(a) = b, NN(b) = a, and NN(c) = b. Next, each object is used as the center of a circle, with the distance to its respective NN as the radius, to draw the circles of a, b, and c. Then, the intersections of the circles and the qline, Ps, P1, P2, P3, P4, and Pe, are sorted according to time, and every two consecutive points define a time segment. The entire CRNN query is cut into five time segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Every segment has its own RNN query result.
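The segment-point computation of Step 1 can be sketched as follows, assuming a straight qline from Ps to Pe, static objects, and precomputed NN distances (dnn). Each object's NN-circle is intersected with the qline by solving a quadratic, and the intersection parameters, together with Ps and Pe, are sorted to give the segment boundaries. Function and variable names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def segment_points(ps: Point, pe: Point,
                   objects: List[Point], dnn: List[float]) -> List[float]:
    """Parameters s in [0, 1] along the qline q(s) = ps + s*(pe - ps) at which the RNN
    result may change, i.e., where the qline crosses an object's NN-circle."""
    dx, dy = pe[0] - ps[0], pe[1] - ps[1]
    a = dx * dx + dy * dy                    # |pe - ps|^2
    cuts = {0.0, 1.0}                        # Ps and Pe bound the first and last segment
    for (cx, cy), r in zip(objects, dnn):
        fx, fy = ps[0] - cx, ps[1] - cy      # vector from the circle center to Ps
        b = 2.0 * (fx * dx + fy * dy)
        c = fx * fx + fy * fy - r * r
        disc = b * b - 4.0 * a * c
        if a == 0.0 or disc <= 0.0:
            continue                         # qline misses (or merely touches) this circle
        sq = math.sqrt(disc)
        for s in ((-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)):
            if 0.0 < s < 1.0:
                cuts.add(s)                  # a segment point strictly inside [Ps, Pe]
    return sorted(cuts)
```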

Figure 7: Segmenting of the CRNN query
4.2 Calculating RNN result of each segment
In the previous subsection, the intersections of the qline with the circles whose radii are the distances from the objects to their respective NNs were defined; these intersections cut the CRNN query into several time segments. The next step is to find the RNNs of each time segment. Because each circle is built with the object's NN distance as the radius and is labeled with the object's identifier, if a segment falls within a certain circle, then the RNN result of that time segment includes the object represented by the circle. This is illustrated in Figure 8. First, the intersections of the qline representing the CRNN query with the circles of the objects are sorted by time; every two consecutive points define a time segment, giving five segments: [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Segment [Ps, P1] is contained only by circle a, therefore RNN(q, [Ps, P1]) = {a}. Next, segment [P1, P2] is examined; this segment is contained by circle a and circle b, therefore RNN(q, [P1, P2]) = {a, b}. Repeating this process gives RNN(q, [P2, P3]) = {a, b, c}, RNN(q, [P3, P4]) = {b, c}, and RNN(q, [P4, Pe]) = {c}.
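Step 2 can be sketched along the same lines: for each segment between consecutive segment points, the RNN result is the set of objects whose NN-circle contains the segment; since the result cannot change strictly inside a segment, testing the segment's midpoint suffices. Again, this is our own rendering rather than the authors' code.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def rnn_per_segment(ps: Point, pe: Point, objects: List[Point], dnn: List[float],
                    cuts: List[float]) -> Dict[Tuple[float, float], List[int]]:
    """For each qline segment [s_i, s_{i+1}] return the ids of the objects whose
    NN-circle contains the segment (tested at the midpoint, where the result is stable)."""
    dx, dy = pe[0] - ps[0], pe[1] - ps[1]
    result: Dict[Tuple[float, float], List[int]] = {}
    for s0, s1 in zip(cuts, cuts[1:]):
        m = (s0 + s1) / 2.0
        qx, qy = ps[0] + m * dx, ps[1] + m * dy          # query position inside the segment
        rnn = [i for i, ((cx, cy), r) in enumerate(zip(objects, dnn))
               if math.hypot(qx - cx, qy - cy) < r]      # q is closer to object i than i's NN
        result[(s0, s1)] = rnn
    return result
```

Feeding the output of the previous sketch into rnn_per_segment yields the complete answer (q, [segment_i]) = {RNN_i} in a single pass over the qline.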

Figure 8: Calculating the RNN result of each segment

4.3 CRNN Algorithm with Index
Not every object can be an answer during the processing of a CRNN query. To improve RNN query efficiency, it is preferable to filter out the objects that cannot be answers in advance, which greatly reduces the search space of the CRNN query, the amount of data it must examine, and consequently the computation cost. This filtering step, which further improves CRNN query efficiency dramatically, is referred to as the pruning process. Figure 9 illustrates the flowchart of CRNN query processing with the pruning process added.

Figure 9: Flow chart of CRNN query with index

Steps 2 and 3 are identical to Steps 1 and 2 of the CRNN search algorithm described in the previous sections and are not reiterated here. For Step 1, the pruning process, an Rdnn-tree index structure is used to execute the pruning effectively. The three steps of the CRNN query with index are illustrated in Figure 9. The pruning process works as follows. For every internal node of the Rdnn-tree, the distance from query q to the node's rectangle is computed and denoted D(q, Rect). If D(q, Rect) of a node is larger than the node's MaxDnn, then none of the objects beneath it needs to be considered, because D(q, Rect) is a lower bound on the distance from query q to every object underneath the Rect node. When the distance from query q to the Rect node is longer than MaxDnn, query q cannot be closer to any object underneath the Rect node than that object's own NN, so no object underneath can be an RNN result for query q. On the contrary, if D(q, Rect) does not exceed the MaxDnn of the node, then some objects underneath the Rect node may be closer to query q than to their respective NNs; that is, some objects may be RNN results for query q. The examination then continues along the branch all the way down to the leaf nodes, and all entries underneath such leaf nodes are recorded as candidate objects for the RNN query result. The collection of these candidate objects is referred to as RNNCanSet; the possible results of the RNN query must exist within this collection, and objects outside RNNCanSet cannot possibly be RNN query results. Only the objects inside RNNCanSet need to be considered when finding the segment points in the CRNN search algorithm, which greatly reduces the number of objects to be handled and enhances the efficiency of the CRNN search algorithm.
Figure 3 illustrates the pruning process. It begins at the root node R. Because D(q, R) does not exceed the MaxDnn of R, the child nodes B1 and B2 must be examined. Because the MaxDnn of MBR B1 is smaller than D(q, B1), all child nodes underneath B1 can be pruned. Next, D(q, B2) does not exceed the MaxDnn of B2, so the child nodes b3 and b4 of B2 must be examined. D(q, b3) is equal to or smaller than the MaxDnn of b3, which is a leaf node; therefore, h and i are placed inside RNNCanSet. Next, b4 is examined. The MaxDnn of b4 is equal to or smaller than D(q, b4); therefore, b4 can be pruned. The entire pruning process then ends.
However, the CRNN query to be processed is not an RNN query at a single query point; therefore, the pruning process of [22] cannot be used directly. To ensure that no possible RNN result is discarded, the pruning criterion is changed from the condition that D(q, Rect), the distance from the query point to Rect, must be longer than MaxDnn, to the condition MinD(qline, Rect) > MaxDnn, where MinD(qline, Rect) denotes the minimum distance from the qline to the Rect node. The minimum distance is chosen because, if the minimum distance from the entire qline to the Rect node is larger than MaxDnn, then the distance from the query position at any point of time on the qline to the Rect node is longer than MaxDnn. Therefore, none of the objects underneath the Rect node can be an RNN for the qline, and the node can be pruned without further consideration. Details of the pruning algorithm are exhibited in Algorithm 1.

Algorithm 1: Pruning Algorithm.
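The pseudocode of Algorithm 1 is not reproduced in this version of the text, so the following is a hedged sketch of how the pruning could be implemented under the description above: MinD(qline, Rect) is computed by exploiting the fact that the distance from the moving query point to a rectangle is convex in the line parameter, and a subtree is skipped whenever MinD(qline, Rect) > MaxDnn. The Node layout and all names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]        # (ml, md, mr, mu)

@dataclass
class Node:
    is_leaf: bool
    # leaf entries: (ptid, (x, y), dnn); internal entries: (child_node, rect, max_dnn)
    entries: List[tuple] = field(default_factory=list)

def min_dist_qline_rect(ps, pe, rect: Rect, iters: int = 60) -> float:
    """MinD(qline, Rect): the distance from the moving query point to the rectangle is a
    convex function of the line parameter t, so ternary search over [0, 1] finds its minimum."""
    ml, md, mr, mu = rect
    def d(t: float) -> float:
        x, y = ps[0] + t * (pe[0] - ps[0]), ps[1] + t * (pe[1] - ps[1])
        return math.hypot(max(ml - x, 0.0, x - mr), max(md - y, 0.0, y - mu))
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if d(m1) < d(m2):
            hi = m2
        else:
            lo = m1
    return d((lo + hi) / 2.0)

def rnn_can_set(node: Node, ps, pe) -> List[tuple]:
    """Traverse the Rdnn-tree, skipping subtrees with MinD(qline, Rect) > MaxDnn,
    and collect the surviving leaf entries as RNNCanSet."""
    if node.is_leaf:
        return list(node.entries)
    out: List[tuple] = []
    for child, rect, max_dnn in node.entries:
        if min_dist_qline_rect(ps, pe, rect) > max_dnn:
            continue        # every object below is closer to its NN than to any point of the qline
        out.extend(rnn_can_set(child, ps, pe))
    return out

# Tiny usage example with a one-leaf tree (all values are made up):
leaf = Node(is_leaf=True, entries=[(0, (0.2, 0.2), 0.05)])
root = Node(is_leaf=False, entries=[(leaf, (0.1, 0.1, 0.3, 0.3), 0.05)])
print(rnn_can_set(root, (0.4, 0.5), (0.6, 0.5)))   # -> [] (the only subtree is pruned)
```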


5. Performance Study
5.1 Experiment Settings
To evaluate the improvement in CRNN query efficiency achieved by the method proposed in this study, a set of experiments was designed. This section describes the experiment environment, the experimental parameters and settings, and a comparison of the experimental results.
The coordinates of the objects are distributed over a [0, 1] x [0, 1] plane. Because the distribution density of the objects may influence efficiency, it is taken into consideration in the experiment: three different types of distribution are used to generate the object coordinates, namely the Uniform distribution, the Gaussian distribution, and the Zipf distribution. In the Uniform distribution, the objects are evenly distributed on the plane, as shown in Figure 10(a). In the Gaussian distribution, most of the objects concentrate around the center of the plane, as shown in Figure 10(b). In the Zipf distribution, most of the objects lie at the extreme left and bottom of the plane; the skew factor is set to 0.8 in the experiment, as shown in Figure 10(c). In addition, 30 queries are generated randomly in the [0.4, 0.6] x [0.4, 0.6] region by referring to [15], and the velocity vector of each query falls within [-0.01, 0.01]. Because this experiment is concerned with the influence of the different object distributions on efficiency, the queries are generated as close to the center of the plane as possible. After executing the 30 queries, the average cost of executing one CRNN search is used to determine which method is more favorable. For the implementation of the Rdnn-tree in the CRNN search algorithm, the R*-tree code of GiST [7] was adapted to build the Rdnn-tree and to match the requirements of this experiment.
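For concreteness, here is a rough sketch, our own, of how the three object distributions described above could be generated on the [0, 1] x [0, 1] plane. Only the Zipf skew factor 0.8 comes from the paper; the Gaussian spread and the exact power-law mapping used for the skewed data are assumptions.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]

def uniform_points(n: int, rng: random.Random) -> List[Point]:
    """Objects spread evenly over the unit square (Figure 10(a))."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def gaussian_points(n: int, rng: random.Random, sigma: float = 0.15) -> List[Point]:
    """Objects concentrated around the center of the plane (Figure 10(b)); sigma is assumed."""
    def clip(v: float) -> float:
        return min(1.0, max(0.0, v))
    return [(clip(rng.gauss(0.5, sigma)), clip(rng.gauss(0.5, sigma))) for _ in range(n)]

def skewed_points(n: int, rng: random.Random, skew: float = 0.8) -> List[Point]:
    """Zipf-like skew toward the left/bottom of the plane (Figure 10(c)); the mapping is assumed."""
    power = 1.0 / (1.0 - skew)      # skew 0.8 -> exponent 5, pushing mass toward 0
    return [(rng.random() ** power, rng.random() ** power) for _ in range(n)]

rng = random.Random(42)
datasets = {"uniform": uniform_points(1000, rng),
            "gaussian": gaussian_points(1000, rng),
            "zipf": skewed_points(1000, rng)}
```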

Figure 10: Data sets of the experiment evaluation ((a) Uniform, (b) Gaussian, (c) Zipf; axes are the x and y object coordinates)


In addition to the object distributions described above, the influence of the length of the query time (the qline) and of the number of objects on efficiency is also considered. Three data sets, Uniform, Gaussian, and Zipf, are used for the object distribution. The query time (qline) varies from query length 1 to query length 10, and the number of objects varies from 1K to 100K. The parameters and settings used in the experiment are listed in Table 1.

Table 1: Parameter settings of the experiment
Parameter      Description                  Settings
distribution   Data distribution            Uniform, Gaussian, Zipf
interval       Time interval of the query   1, 2, 5, 8, 10
object-no      Number of data objects       1, 10, 30, 50, 100 (K)

5.2 Compared Algorithms and Performance Metrics
The most intuitive method for finding RNNs is to look for the NN of every object. If the number of objects is n, the time complexity of this step is O(n^2). Next, it is determined which objects have the query point as their NN; those objects are the RNNs of the query point. The time complexity required by this RNN algorithm is O(n^3). However, the most intuitive method for processing a CRNN query is to execute this RNN algorithm at every point of time, and because time is continuous, the required number of executions cannot even be determined. Therefore, the CRNN query time must first be segmented before the total execution time of the CRNN query can be calculated. The more finely the time is segmented, the more RNN executions are required: if the period is divided into m segments, the time complexity becomes O(m*n^3), whereas if the time is not segmented finely enough, the RNN results may be erroneous. These drawbacks make it an inefficient CRNN search algorithm, and it is not included in the comparison. Instead, the efficiency of two methods is compared in this experiment: one uses the Rdnn-tree as an index, and the other uses no index. To evaluate the two methods, the time required for one CRNN search execution is compared; this measure is referred to as the total cost in this study.
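To make the baseline concrete, here is a minimal sketch, ours, of the intuitive single-time-point RNN computation described above: compute each object's NN by brute force and keep the objects that are closer to the query point than to their nearest fellow object.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def brute_force_rnn(q: Point, objects: List[Point]) -> List[int]:
    """Ids of the objects whose nearest neighbor is the query point q."""
    def d(a: Point, b: Point) -> float:
        return math.hypot(a[0] - b[0], a[1] - b[1])
    rnn = []
    for i, o in enumerate(objects):
        nn_dist = min(d(o, p) for j, p in enumerate(objects) if j != i)  # O(n) per object
        if d(o, q) < nn_dist:        # q is closer to o than o's nearest fellow object
            rnn.append(i)
    return rnn

print(brute_force_rnn((0.5, 0.5), [(0.1, 0.1), (0.12, 0.1), (0.55, 0.5)]))  # -> [2]
```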

5.4 Performance Results and Discussion
Based on variations of the parameters (distribution, interval, and object-no), different experiments were conducted. The results are summarized below by object-no and by query interval.
5.4.1 The effect of the object-no parameter
First, the query interval is fixed at 5, and the influence of the object-no parameter, i.e., the number of objects, on efficiency under the different types of object distribution is discussed. The experimental results are shown in Figure 11, where the x-axis represents the number of objects and the y-axis represents the total cost, i.e., the time required for executing one CRNN search. The total cost increases as the number of objects increases. In addition, when the number of objects is 1K, the CRNN search without the Rdnn-tree index takes about 300 seconds, whereas the CRNN search with the Rdnn-tree index takes about 15 seconds, roughly 20 times faster. It is obvious that pruning unnecessary objects by adopting the Rdnn-tree as the index, thereby reducing the CRNN search space, provides much higher efficiency than not adopting the Rdnn-tree. Comparing the influence of the object distributions on efficiency clearly shows that the Zipf distribution offers the best efficiency, followed by the Uniform distribution, while the Gaussian distribution is the worst. This result can be explained as follows: because the Zipf-distributed data are located at the far left and bottom of the plane, most of the data are pruned and very few objects are included in RNNCanSet, giving the lowest total cost. In contrast, the Gaussian-distributed data concentrate at the center of the plane, so very little data can be pruned; many objects remain, producing a large RNNCanSet and thus the highest total cost.

Figure 11: Influence of changing object-no under different data distributions (total time in seconds, log scale; CRNN with index vs. CRNN without index)
5.4.2 The effect of the query interval parameter
This section focuses on the influence of the length of the query interval on each method under the different types of object distribution. The results of the experiment are shown in Figure 12. Generally speaking, when the query interval is lengthened, the distance MinD(qline, Rect) decreases, causing the pruning efficiency to drop. In addition, when the interval increases, the number of time segments produced by the CRNN query increases; consequently, the number of RNN searches, one per segment, increases, and the total cost of the CRNN query increases as well.

Figure 12: Effect of the query interval parameter under different data distributions (total time in seconds, log scale; CRNN with index vs. CRNN without index)


6. Conclusions and Future Work
An efficient CRNN search algorithm is proposed in this study. The algorithm requires only one execution to obtain the RNN results of an entire continuous RNN search. The diverse experiments in this study also demonstrate the efficiency of the proposed method. As wireless communication and mobile device technology mature, more and more users access information from wireless information systems through mobile devices. To process requests from a growing number of mobile users, data dissemination through broadcast is an effective solution for scalability. A future goal of this study is to extend the CRNN search problem to the wireless broadcasting environment.
7. References
[1] Singh, A., Ferhatosmanoglu, H., and Tosun, A.S. (2003) High dimensional reverse nearest neighbor queries. Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM'03), New Orleans, LA, USA, pp. 91-98.
[2] Barbara, D. (1999) Mobile computing and databases - a survey. IEEE Transactions on Knowledge and Data Engineering, 11, 108-117.
[3] Benetis, R., Jensen, C.S., Karciauskas, G., and Saltenis, S. (2002) Nearest neighbor and reverse nearest neighbor queries for moving objects. International Database Engineering and Applications Symposium, Canada, July 17-19, pp. 44-53.
[4] Chaudhuri, S. and Gravano, L. (1999) Evaluating top-k selection queries. Proceedings of the 25th IEEE International Conference on Very Large Data Bases, pp. 397-410.
[5] Civilis, A., Jensen, C.S., and Pakalnis, S. (2005) Techniques for efficient road-network-based tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698-712.
[6] Computer Science and Telecommunications Board (2003) IT Roadmap to a Geospatial Future. The National Academies Press.
[7] GiST, http://gist.cs.berkeley.edu/.
[8] Guttman, A. (1984) R-trees: A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, pp. 47-57.
[9] Hjaltason, G.R. and Samet, H. (1999) Distance browsing in spatial databases. ACM Transactions on Database Systems (TODS), 24, 265-318.
[10] Korn, F. and Muthukrishnan, S. (2000) Influence sets based on reverse nearest neighbor queries. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May 16-18, pp. 201-212.
[11] Korn, F., Muthukrishnan, S., and Srivastava, D. (2002) Reverse nearest neighbor aggregates over data streams. Proceedings of the International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, August, pp. 91-98.
[12] Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., and Protopapas, Z. (1996) Fast nearest neighbor search in medical image databases. Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB'96), pp. 215-226.
[13] Lee, D.L., Lee, W.-C., Xu, J., and Zheng, B. (2002) Data management in location-dependent services. IEEE Pervasive Computing, 1, 65-72.
[14] Roussopoulos, N., Kelley, S., and Vincent, F. (1995) Nearest neighbor queries. Proceedings of the ACM SIGMOD International Conference on Management of Data, Illinois, USA, June, pp. 71-79.
[15] Ross, S. (2000) Introduction to Probability and Statistics for Engineers and Scientists.
[16] Stanoi, I., Agrawal, D., and Abbadi, A.E. (2000) Reverse nearest neighbor queries for dynamic databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 44-53.
[17] Song, Z. and Roussopoulos, N. (2001) K-nearest neighbor search for moving query point. Proceedings of the 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS 2121, Redondo Beach, CA, USA, July 12-15, pp. 79-96.
[18] Stanoi, I., Riedewald, M., Agrawal, D., and Abbadi, A.E. (2001) Discovery of influence sets in frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 99-108.
[19] Tao, Y., Papadias, D., and Lian, X. (2004) Reverse kNN search in arbitrary dimensionality. Proceedings of the 30th International Conference on Very Large Data Bases, Toronto, Canada, August 29 - September 3, pp. 279-290.
[20] Tao, Y., Papadias, D., and Shen, Q. (2002) Continuous nearest neighbor search. Proceedings of the International Conference on Very Large Data Bases, Hong Kong, China, August 20-23, pp. 279-290.
[21] Xu, J., Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Energy efficient index for querying location-dependent data in mobile environments. Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, March, pp. 239-250.
[22] Yang, C. and Lin, K.-I. (2001) An index structure for efficient reverse nearest neighbor queries. Proceedings of the 17th International Conference on Data Engineering, pp. 485-492.
[23] Yiu, M.L., Papadias, D., Mamoulis, N., and Tao, Y. (2005) Reverse nearest neighbors in large graphs. Proceedings of the 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April 5-8, pp. 186-187.
[24] Yu, C., Ooi, B.C., Tan, K.-L., and Jagadish, H.V. (2001) Indexing the distance: An efficient method to kNN processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421-430.
[25] Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Search k nearest neighbors on air. Proceedings of the 4th International Conference on Mobile Data Management, Melbourne, Australia, January, pp. 181-195.
[26] Zheng, B., Xu, J., Lee, W.-C., and Lee, D.L. (2004) Energy conserving air indexes for nearest neighbor search. Proceedings of the 9th International Conference on Extending Database Technology (EDBT'04), Heraklion, Crete, Greece, March, pp. 48-66.
[27] Zhang, J., Zhu, M., Papadias, D., Tao, Y., and Lee, D.L. (2003) Location-based spatial queries. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9-12, pp. 443-454.
