IEEE INTERNET OF THINGS JOURNAL, VOL. 3, NO. 2, APRIL 2016
services via the cloud. However, the LBS provider is not willing to disclose the valuable LBS data to the cloud. Therefore, the LBS provider encrypts the LBS data and outsources the encrypted data to the cloud.
2) The cloud has rich storage and computing resources. It stores the encrypted LBS data from the LBS provider and provides query services for LBS users. So, the cloud has to search the encrypted POI records in local storage to find the ones matching the queries from LBS users.
3) LBS users have the information of their own locations, and query the encrypted records of nearby POIs in the cloud. Cryptographic or privacy-enhancing techniques are usually utilized to hide the location information in the queries sent to the cloud. To decrypt the encrypted records received from the cloud, LBS users need to obtain the decryption key from the LBS provider in advance.

B. Attack Models
Similar to most previous works on outsourced data query, the cloud is assumed to be honest but curious and is considered the potential attacker in this work. That is, the cloud would honestly store and search data as requested; however, the cloud would also have financial incentives to learn the stored LBS data and the user location data in queries. Because both LBS data and user location data are valuable, they should be protected and hidden from the cloud. In general, in the outsourced LBS setting, the cloud can observe both queries from LBS users and encrypted LBS data from the LBS provider, which could be an advantage in learning user locations. Therefore, assuming different abilities of the attacker, there are mainly four attack models in the outsourced LBS setting.
1) Ciphertext-only attack. In this model, the attacker is able to observe the ciphertexts of POIs' locations and queries but does not know the plaintexts. Obviously, every cloud has this ability. This is a weak attack model.
2) Known-sample attack. In this model, the attacker knows the plaintexts of some POIs' locations and/or queries. The attacker also knows that their corresponding ciphertexts must exist among all the ciphertexts observed by the attacker. However, the attacker does not know which ciphertext corresponds to a known plaintext. Utilizing such information, the attacker may be able to reveal the plaintext corresponding to any given ciphertext. Such information is not hard to obtain if the attacker has the background knowledge that the LBS database must contain the POIs of a certain type in a certain area.
3) Known-plaintext attack. In this model, the attacker knows the plaintexts of some POIs' locations and/or queries as well as their corresponding ciphertexts. Utilizing this information, the attacker may be able to reveal the plaintexts corresponding to other ciphertexts.
4) Access-pattern attack. In this model, the attacker has some background knowledge about the pattern of POI accessing. For example, the attacker knows that a known POI would be the most popular POI. If an encrypted POI appears most frequently in query results, it must be the encrypted version of the known POI. Then, the attacker knows that the corresponding query points must be close to the known POI.
In addition to the above attacks, other attacks such as insider attacks may be possible. In this paper, we consider ciphertext-only and known-sample attacks, which do not require attackers with very strong abilities. We leave the attacks requiring very strong abilities for future study.

C. Design Goal
Under the outsourced LBS system model, our design goal is to develop an efficient, accurate, and secure solution for privacy-preserving spatial range query. Specifically, the following three objectives should be achieved.
1) Efficiency. As discussed in Section I, spatial range query has stringent performance requirements. A good solution should not consume many resources of mobile LBS users, and the POI search latency should be acceptable for online query.
2) Accuracy. It is desirable that a query result contains exactly the records matching the query. False negatives would hurt user experience, while false positives would increase communication cost. Additional computational cost is also required at the user side to filter out false positives.
3) Security. The proposed solution should be resilient to ciphertext-only attacks and known-sample attacks. An accurate and efficient solution for spatial range query [1] already exists, which is resilient to ciphertext-only attacks but not to known-sample attacks and more powerful attacks. The proposed solution should be more secure than the solution in [1].
Though it is subject to more powerful attacks such as known-plaintext attacks, the solution proposed in this paper can still be used in many situations where the attackers do not have the required abilities or knowledge. Our solution also has advantages over the solutions resilient to such attacks. As we will see in the related works in Section VIII, such solutions are either very computationally costly or not applicable to outsourced LBS.

III. PRELIMINARIES: BILINEAR PAIRING AND COMPLEXITY ASSUMPTIONS
In this section, we outline the cryptographic technique of bilinear pairing and related complexity assumptions, which will serve as the basis of our IPRE scheme.
Let $G_1$ and $G_2$ be two cyclic groups of the same big prime order $p$, and $g$ be a generator of $G_1$. Let $e : G_1 \times G_1 \to G_2$ be a pairing, i.e., a map that satisfies the following properties:
1) bilinearity: $e(P^a, Q^b) = e(P, Q)^{ab}$ for any $a, b \in \mathbb{Z}_p^*$ and any $P, Q \in G_1$;
2) nondegeneracy: $e(g, g) \neq 1$;
3) computability: $e$ can be computed efficiently.
The definitions of pairing parameter generator and pairing related complexity assumptions are given below. For more comprehensive descriptions, refer to [9].
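The pairing properties above can be illustrated with a small sketch. The code below is a toy model, not a real pairing and not secure: a $G_1$ element $g^a$ is stored simply as its exponent $a$, and $G_2$ is an order-$R$ subgroup of the integers modulo a small prime. The constants `R`, `MOD`, and `H` and all function names are assumptions made for this illustration. It checks bilinearity, nondegeneracy, and the identity $\prod_k e(g^{u_k}, g^{v_k}) = e(g, g)^{\sum_k u_k v_k}$, which is what lets a pairing evaluate an inner product in the exponent.

```python
# Toy model of a symmetric bilinear map, for illustration only (NOT secure).
# A G1 element g^a is stored as the integer a mod R (so discrete logs are trivial).
# G2 is the order-R subgroup of Z_MOD^* generated by H.
R = 1019            # toy prime group order (assumed; a real p is at least 160 bits)
MOD = 2 * R + 1     # 2039 is prime, so Z_MOD^* has a subgroup of order R
H = 4               # 4 = 2^2 has order R modulo MOD; plays the role of e(g, g)


def g1_exp(a):
    """Return the G1 element g^a (stored as its exponent)."""
    return a % R


def g1_pow(P, a):
    """Return P^a for a G1 element P."""
    return (P * a) % R


def pair(P, Q):
    """Toy pairing: e(g^x, g^y) = e(g, g)^(x*y)."""
    return pow(H, (P * Q) % R, MOD)


def product_of_pairings(us, vs):
    """Compute prod_k e(g^u_k, g^v_k) in G2."""
    acc = 1
    for u, v in zip(us, vs):
        acc = (acc * pair(g1_exp(u), g1_exp(v))) % MOD
    return acc


def inner_product_in_exponent(us, vs):
    """Compute e(g, g)^<us, vs> directly."""
    return pow(H, sum(u * v for u, v in zip(us, vs)) % R, MOD)
```

In a deployment, a pairing library over elliptic curves would replace `pair`; the toy version only makes the algebra visible.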
LI et al.: EPLQ: EFFICIENT PRIVACY-PRESERVING LOCATION-BASED QUERY OVER OUTSOURCED ENCRYPTED DATA 209
A. Overview
The proposed IPRE scheme allows computing inner products and comparing their values with a predefined range in a privacy-preserving way. As far as we know, our scheme is the first predicate/predicate-only encryption scheme for inner product range. In IPRE, both attributes and predicates are vectors. So, we use attribute vectors and predicate vectors to refer to the attributes and predicates in IPRE. Let $\Lambda \subseteq \mathbb{Z}_p^t$ be the attribute set, and let the class of predicates also be a subset of $\mathbb{Z}_p^t$, where $p$ is a big prime. IPRE allows testing whether the inner product of a vector from $\Lambda$ and a vector from the predicate class is in a predefined range without disclosing the vectors.
IPRE is a symmetric predicate-only encryption scheme, and it consists of four algorithms: 1) Setup algorithm for generating a public parameter PP, an attribute encryption key AK, and a predicate encryption key PK; 2) Enc algorithm for encrypting attribute vectors to ciphertexts; 3) GenToken algorithm for encrypting predicate vectors to tokens; and 4) Check algorithm, which takes a ciphertext and a token as input and outputs 1 or 0.

$\overrightarrow{EU_i}$ and $\overrightarrow{EV_j}$. $\alpha$ and $\beta$ are two predefined secret integers. Next, we show how to find such encoding functions. Following the well-known multinomial theorem, we have the following equation:
\[
\begin{aligned}
\left(\alpha \times \langle \vec{U_i}, \vec{V_j} \rangle + \beta\right)^d
&= \left(\beta + \alpha \times \sum_{k=1}^{t} u_{i,k} \times v_{j,k}\right)^d \\
&= \sum_{\substack{l_1 + l_2 + \cdots + l_{t+1} = d \\ l_1, l_2, \ldots, l_{t+1} \in [0, d]}} \binom{d}{l_1, l_2, \ldots, l_{t+1}} \times \beta^{l_{t+1}} \prod_{k=1}^{t} (\alpha \times v_{j,k})^{l_k} u_{i,k}^{l_k}
\end{aligned}
\]
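As a sanity check on the expansion above, the following sketch splits every multinomial term into a factor that depends only on the predicate vector and a factor that depends only on the attribute vector and the secrets $\alpha$, $\beta$, and confirms that the sum of their products equals $(\alpha \langle U, V \rangle + \beta)^d$. Which factors are assigned to which side is an assumption made for this illustration, and the function names are hypothetical.

```python
from itertools import product
from math import factorial, prod


def exponent_tuples(t, d):
    """All tuples (l_1, ..., l_{t+1}) of nonnegative integers summing to d."""
    return [ls for ls in product(range(d + 1), repeat=t + 1) if sum(ls) == d]


def multinomial(d, ls):
    """Multinomial coefficient d! / (l_1! * ... * l_{t+1}!)."""
    c = factorial(d)
    for l in ls:
        c //= factorial(l)
    return c


def encode_terms(u, v, alpha, beta, d):
    """Return lists a (from u only) and b (from v, alpha, beta) with
    sum(a[m] * b[m]) == (alpha * <u, v> + beta) ** d.
    ASSUMPTION: this particular split of factors is illustrative."""
    t = len(u)
    a, b = [], []
    for ls in exponent_tuples(t, d):
        a.append(prod(u[k] ** ls[k] for k in range(t)))
        b.append(multinomial(d, ls) * beta ** ls[t]
                 * prod((alpha * v[k]) ** ls[k] for k in range(t)))
    return a, b


def expanded_value(u, v, alpha, beta, d):
    """Evaluate the right-hand side of the expansion as an inner product."""
    a, b = encode_terms(u, v, alpha, beta, d)
    return sum(x * y for x, y in zip(a, b))
```

The number of terms is $\binom{d+t}{t}$, so the encoded vectors grow quickly with the degree $d$ and the vector length $t$.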
\[
\begin{aligned}
{} &= h_i + s_j + a_{i,1} \times b_{j,1} + \sum_{m=2}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m} \\
&= h_i \times 1 + 1 \times (\beta^d + s_j) + \sum_{m=2}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m} \\
&= \left\langle (h_i, 1, a_{i,2}, a_{i,3}, \ldots, a_{i,n-1}),\ (1, \beta^d + s_j, b_{j,2}, b_{j,3}, \ldots, b_{j,n-1}) \right\rangle .
\end{aligned}
\]
Thus, we can define EncodeU() and EncodeV() as
\[
\mathrm{EncodeU}(\vec{U_i}, h_i) = (h_i, 1, a_{i,2}, a_{i,3}, \ldots, a_{i,n-1}) \tag{1}
\]
\[
\mathrm{EncodeV}(\vec{V_j}, s_j) = (1, \beta^d + s_j, b_{j,2}, b_{j,3}, \ldots, b_{j,n-1}) \tag{2}
\]
where $n = \binom{d+t}{t} + 1$.

C. Setup Algorithm
The setup algorithm is a probabilistic algorithm, which takes a security parameter $\lambda$, the attribute/predicate vector length $t$, and an inner product range $[\tau_1, \tau_2]$ as input. The algorithm outputs an attribute encryption key $AK = (\alpha, \beta, d, M)$, a predicate encryption key $PK = (d, M)$, and a public parameter $PP = ((G_1, G_2, g, p, e), (\Omega_k)_{k=\tau_1}^{\tau_2})$.
$\alpha, \beta \in \mathbb{F}_p$ are two random numbers for encoding functions. $d$ is a positive integer, and its value depends on the security parameter. The scheme is more secure if $d$ is bigger. $d$ could be 2, or $d$ is an integer satisfying $\mathrm{GCD}(d, p - 1) = 1$. If $d = 2$, $\alpha, \beta$ must be chosen to make sure that the intersection of the set $\{z : z = -z_1 - 2\beta/\alpha \bmod p,\ \tau_1 \le z_1 \le \tau_2\}$ and the set

F. Check Algorithm
The check algorithm takes a ciphertext $C_j = (c_{j,1}, c_{j,2}, \ldots, c_{j,n+1})$ of an attribute vector $\vec{V_j}$ and a token $K_i = (q_{i,1}, q_{i,2}, \ldots, q_{i,n+1})$ associated with a predicate vector $\vec{U_i}$ as input. The algorithm computes
\[
\Psi = \frac{\prod_{k=1}^{n} e(c_{j,k}, q_{i,k})}{c_{j,n+1} \times q_{i,n+1}} .
\]
If $\mathrm{Hash}(\Psi)$ is in the set $\{\Omega_k : \tau_1 \le k \le \tau_2\}$, the algorithm outputs 1. Otherwise, it outputs 0.
Remark 1: If two inner products are equal, their $\Psi$ values are the same. This is not desirable for some applications. This problem can be circumvented by adding randomness in the generation of predicate/attribute vectors. For a fixed pair of predicate vector and attribute vector, the value $\Psi$ is still fixed. However, given a pair of predicate and attribute, their vectors and their vectors' inner product all have multiple possible values. We will demonstrate this in our EPLQ solution in Section V.
Correctness proof. First, we prove $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ if $\tau_1 \le \langle \vec{U_i}, \vec{V_j} \rangle \le \tau_2$. Recall that $\Omega_k = \mathrm{Hash}\left(e(g, g)^{(\alpha \times \mathrm{perm}(k) + \beta)^d}\right)$. From the following equation, it is easy to find out that $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ if $\tau_2 \ge \langle \vec{U_i}, \vec{V_j} \rangle \ge \tau_1$:
\[
\begin{aligned}
\Psi &= \frac{\prod_{k=1}^{n} e(c_{j,k}, q_{i,k})}{c_{j,n+1} \times q_{i,n+1}} \\
&= \frac{\prod_{k=1}^{n} e(g^{u_{i,k}}, g^{v_{j,k}})}{e(g, g)^{h_i} \times e(g, g)^{s_j}} \\
&= \frac{e(g, g)^{\sum_{k=1}^{n} u_{i,k} \times v_{j,k}}}{e(g, g)^{h_i + s_j}} \\
&= \frac{e(g, g)^{\mathrm{EncodeU}(\vec{U_i}, h_i)\, M M^{-1}\, (\mathrm{EncodeV}(\vec{V_j}, s_j))^{T}}}{\cdots}
\end{aligned}
\]
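The range test performed by the Check algorithm can be sketched as a hash-set lookup. The code below is a simplified illustration under several assumptions: the pairing computation is not modeled, and $\Psi$ is instead taken to equal $e(g, g)^{(\alpha k + \beta)^d}$ for an inner product $k$; $\mathrm{perm}(k)$ is treated as the identity; the group is a toy subgroup of the integers modulo a small prime; and `hash64` (SHA-256 truncated to 64 bits) stands in for the scheme's hash function. All names and constants are hypothetical.

```python
import hashlib

R = 1019          # toy prime order of the target group (assumed)
MOD = 2 * R + 1   # 2039 is prime, so Z_MOD^* has a subgroup of order R
GT = 4            # stands in for e(g, g); 4 has order R modulo MOD


def gt_pow(exponent):
    """Return e(g, g)^exponent in the toy target group."""
    return pow(GT, exponent % R, MOD)


def hash64(element):
    """Stand-in hash: SHA-256 truncated to 8 bytes (an assumption)."""
    return hashlib.sha256(str(element).encode()).digest()[:8]


def psi_for(inner_product, alpha, beta, d):
    """ASSUMPTION: models the value Check would obtain from the pairings,
    with perm(k) taken as the identity."""
    return gt_pow((alpha * inner_product + beta) ** d)


def setup_omegas(alpha, beta, d, tau1, tau2):
    """Precompute the public set {Omega_k : tau1 <= k <= tau2}."""
    return {hash64(psi_for(k, alpha, beta, d)) for k in range(tau1, tau2 + 1)}


def check(psi, omegas):
    """Output 1 if Hash(psi) is one of the Omega_k, otherwise 0."""
    return 1 if hash64(psi) in omegas else 0
```

With `d = 3` the map $x \mapsto x^3$ is a bijection modulo `R` because $\mathrm{GCD}(3, R - 1) = 1$, so distinct inner products below `R` give distinct $\Psi$ values in this toy setting.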
With the ss-tree, searching POI records matching a spatial range query is very efficient. Note that all descendant nodes of a nonleaf node are in the nonleaf node's associated circular area. Searching POI records can be done by scanning the ss-tree from root to leaves. If a node's circular area intersects with the query area, all children of the node will be scanned. Otherwise, its descendant nodes will be skipped. Then, $O(\log N + R)$ tree nodes will be scanned to find matched records, where $N$ is the number of POI records in the database and $R$ is the number of matched records.

B. Proposed $\widehat{ss}$-tree
$\widehat{ss}$-tree is the core of our EPLQ solution. It is a variant of ss-tree. They share the same tree structure, which is shown in Fig. 3. The difference between ss-tree and $\widehat{ss}$-tree is the tree nodes' data, which are shown in Fig. 4. $\widehat{ss}$-tree hides each tree node's location information using our predicate-only encryption scheme, and removes unnecessary information. Because of the encryption, detecting circular area intersection and matched records is also different when searching matched records with the tree. More specifically, we use two kinds of inner products for detecting circular area intersection and matched records, and our IPRE scheme assures the detection via inner product range in a privacy-preserving way.
1) Building $\widehat{ss}$-tree: $\widehat{ss}$-tree can be built from ss-tree. After building an ss-tree for the spatial database, an $\widehat{ss}$-tree can be built by the following steps.
Step 1) Configure parameters $\tau_1$, $\tau_2$, and $\phi$.
As mentioned in the IPRE scheme, $\tau_1$ and $\tau_2$ are the lower limit and upper limit of the given inner product range, respectively. The lower limit $\tau_1$ is fixed to 0 in $\widehat{ss}$-tree. $\tau_2$ is set to a value not smaller than $r_{max1}^2$, where $r_{max1}$ is the maximum query range allowed in the system. $\phi$ is a positive integer used to scale down the inner products for detecting area intersection, and $\tau_2 \ge (r_{max1}/\phi + r_{max2}/\phi + 2)^2$, where $r_{max2}$ is the biggest radius of ss-tree's circular areas. (Such inner products are scaled down to make them fit in a smaller range, which reduces the size of IPRE's public parameter. As explained later at the end of this section, the scaling down will not decrease accuracy.)
Step 2) Generate an attribute vector for each tree node from the node's centroid $(\grave{x}_j, \grave{y}_j)$ and radius $\grave{r}_j$.
Case 1: The node is a leaf node. Generate the attribute vector $\vec{V_j}$ as follows:
\[
\vec{V_j} = \left((-1)^{\grave{\xi}_j},\ (-1)^{\grave{\xi}_j}(\grave{r}_j^2 - \grave{x}_j^2 - \grave{y}_j^2),\ (-1)^{\grave{\xi}_j} \times 2\grave{x}_j,\ (-1)^{\grave{\xi}_j} \times 2\grave{y}_j,\ (-1)^{\grave{\xi}_j} \times 2\grave{r}_j,\ \grave{\xi}_j \tau_2,\ (-1)^{\grave{\xi}_j}\right).
\]
Here, $\grave{\xi}_j$ is a random number, and its value is 1 or 0.
Case 2: The node is a nonleaf node. Do the following two steps.
1) Modify the node's original circular area to a minimal normalized area covering the original one. In this paper, we say a circular area is a normalized area if its centroid's coordinates and its radius are all integral multiples of $\phi$. The centroid's coordinates will be modified to the closest coordinates that are integral multiples of $\phi$, and the radius will be modified to the smallest integral multiple of $\phi$ that makes the normalized area cover the original one.
2) After modifying $(\grave{x}_j, \grave{y}_j)$ and $\grave{r}_j$, generate the attribute vector $\vec{V_j}$ as follows:
\[
\vec{V_j} = \left((-1)^{\grave{\xi}_j}\grave{\mu}_j,\ (-1)^{\grave{\xi}_j}\left(\grave{\mu}_j(\grave{r}_j^2 - \grave{x}_j^2 - \grave{y}_j^2)/\phi^2 + \grave{\theta}_j\right),\ (-1)^{\grave{\xi}_j}\grave{\mu}_j\grave{x}_j/\phi,\ (-1)^{\grave{\xi}_j}\grave{\mu}_j\grave{y}_j/\phi,\ (-1)^{\grave{\xi}_j}\grave{\mu}_j\grave{r}_j/\phi,\ \grave{\xi}_j \tau_2,\ (-1)^{\grave{\xi}_j}\right).
\]
Here, $\grave{\mu}_j = \tau_2/(r_{max1}/\phi + 1 + \grave{r}_j/\phi)^2$, and $\grave{\theta}_j$ is a random nonnegative integer not more than $\grave{\mu}_j$ and $\tau_2 - \grave{\mu}_j(r_{max1}/\phi + 1 + \grave{r}_j/\phi)^2$. Again, $\grave{\xi}_j$ is a random number, and its value is 1 or 0.
Step 3) Encrypt each attribute vector with IPRE scheme.
Step 4) Remove all unnecessary fields of each tree node.
At last, a node's data include its encrypted attribute vector. If the node is a nonleaf node, the data also include pointers to its children. If the node is a leaf node, its leaf_data field also stores the pointer to the corresponding record.
2) Searching $\widehat{ss}$-tree: Searching $\widehat{ss}$-tree is the same as searching ss-tree except that detecting circular area intersection and matched records is based on our IPRE scheme.
Suppose a spatial range query wants to find all POIs within a circular area centered at coordinates $(x_i, y_i)$ with radius $r_i$. To search $\widehat{ss}$-tree, the tokens of two predicate vectors associated with the query should be provided. The two vectors $\vec{U_i}$ and $\vec{U'_i}$ are shown below, and tokens are generated with IPRE's GenToken algorithm
\[
\vec{U_i} = \left((-1)^{\xi_i}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i),\ (-1)^{\xi_i}\mu_i,\ (-1)^{\xi_i}\mu_i x_i,\ (-1)^{\xi_i}\mu_i y_i,\ (-1)^{\xi_i}\mu_i r_i,\ 1,\ \xi_i \tau_2\right)
\]
\[
\vec{U'_i} = \left((-1)^{\xi'_i}(r_i'^2 - x_i'^2 - y_i'^2)/\phi^2,\ (-1)^{\xi'_i},\ (-1)^{\xi'_i} 2x'_i/\phi,\ (-1)^{\xi'_i} 2y'_i/\phi,\ (-1)^{\xi'_i} 2r'_i/\phi,\ 1,\ \xi'_i \tau_2\right).
\]
Here, $\xi_i$ and $\xi'_i$ are random integers in $\{0, 1\}$. $\mu_i = \tau_2/r_i^2$, and $\theta_i$ is a random nonnegative integer not more than $\mu_i$ and $\tau_2 - \mu_i r_i^2$. $(x'_i, y'_i)$ and $r'_i$ are the centroid coordinates and radius of the minimal normalized area covering the query area, respectively. The way to find this normalized area is the same as that in the generation of attribute vectors.
The tokens of $\vec{U_i}$ and $\vec{U'_i}$ are used for detecting matched records and circular area intersection, respectively. We call the former vector the POI-matching predicate vector and the latter the area-intersecting predicate vector.
Given the above tokens associated with the query, POI records matching the query can be found by searching $\widehat{ss}$-tree. The pseudocode of the search algorithm is shown in Algorithm 1. The search starts from the root node. If a nonleaf node's area intersects with the query area, all children of the
node will be scanned. Otherwise, all descendant nodes of this nonleaf node are skipped. Detecting circular area intersection and matched records is based on our IPRE scheme for inner product range. We give the correctness proofs of the detection as follows.

Algorithm 1. Search_$\widehat{ss}$-tree(node nd, query_tokens Ks, node_list ndl)
    // nd: the node to be searched
    // Ks: the array of two tokens associated with the query's predicate vectors.
    //     Ks[0] is the token for POI matching detection, while Ks[1] is the one
    //     for detecting intersection of circular areas.
    // ndl: the list to store matched leaf nodes
    C ← nd.encrypted_attribute_vector
    if nd is a leaf node then
        if Check(Ks[0], C) == 1 then
            // nd's record matches the query area
            Add nd to node_list ndl.
        end if
    else
        if Check(Ks[1], C) == 1 then
            // nd's area intersects with the query area
            for each child node cld_i of nd do
                Search_$\widehat{ss}$-tree(cld_i, Ks, ndl)
            end for
        end if
    end if

3) Correctness of Detecting Matched Records via Inner Product Range: For any POI-matching predicate vector $\vec{U_i}$ and any leaf node's attribute vector $\vec{V_j}$, we have the following equation:
\[
\begin{aligned}
\langle \vec{U_i}, \vec{V_j} \rangle
&= \Big\langle \big((-1)^{\xi_i}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i),\ (-1)^{\xi_i}\mu_i,\ (-1)^{\xi_i}\mu_i x_i,\ (-1)^{\xi_i}\mu_i y_i,\ (-1)^{\xi_i}\mu_i r_i,\ 1,\ \xi_i\tau_2\big), \\
&\qquad \big((-1)^{\grave{\xi}_j},\ (-1)^{\grave{\xi}_j}(\grave{r}_j^2 - \grave{x}_j^2 - \grave{y}_j^2),\ (-1)^{\grave{\xi}_j} \times 2\grave{x}_j,\ (-1)^{\grave{\xi}_j} \times 2\grave{y}_j,\ (-1)^{\grave{\xi}_j} \times 2\grave{r}_j,\ \grave{\xi}_j\tau_2,\ (-1)^{\grave{\xi}_j}\big) \Big\rangle \\
&= (-1)^{\xi_i + \grave{\xi}_j}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i) + (-1)^{\xi_i + \grave{\xi}_j}\mu_i(\grave{r}_j^2 - \grave{x}_j^2 - \grave{y}_j^2) \\
&\qquad + (-1)^{\xi_i + \grave{\xi}_j} 2\mu_i x_i \grave{x}_j + (-1)^{\xi_i + \grave{\xi}_j} 2\mu_i y_i \grave{y}_j + (-1)^{\xi_i + \grave{\xi}_j} 2\mu_i r_i \grave{r}_j + \grave{\xi}_j\tau_2 + (-1)^{\grave{\xi}_j}\xi_i\tau_2 \\
&= (-1)^{\xi_i + \grave{\xi}_j}\big(\mu_i((r_i + \grave{r}_j)^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) + \theta_i\big) + \grave{\xi}_j\tau_2 + (-1)^{\grave{\xi}_j}\xi_i\tau_2 \\
&= \begin{cases}
\mu_i((r_i + \grave{r}_j)^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) + \theta_i, & \text{if } \xi_i = \grave{\xi}_j \\
\tau_2 - \mu_i((r_i + \grave{r}_j)^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) - \theta_i, & \text{otherwise.}
\end{cases}
\end{aligned}
\]
As the tree node is a leaf node, $\grave{r}_j = 0$. Then, we have
\[
\langle \vec{U_i}, \vec{V_j} \rangle =
\begin{cases}
\mu_i(r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) + \theta_i, & \text{if } \xi_i = \grave{\xi}_j \\
\tau_2 - \mu_i(r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) - \theta_i, & \text{otherwise.}
\end{cases}
\]
The distance between the leaf node's POI and the query point is $\sqrt{(x_i - \grave{x}_j)^2 + (y_i - \grave{y}_j)^2}$. Then, the record matches the query if and only if $r_i^2 \ge r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2 \ge 0$. As shown below, record matching can be detected by examining whether $\langle \vec{U_i}, \vec{V_j} \rangle$ is in the range $[0, \tau_2]$ as well
\[
\begin{aligned}
& r_i^2 \ge r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2 \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \mu_i(r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) + \theta_i \ge 0 \\
& \text{and } \tau_2 \ge \tau_2 - \mu_i(r_i^2 - (x_i - \grave{x}_j)^2 - (y_i - \grave{y}_j)^2) - \theta_i \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U_i}, \vec{V_j} \rangle \ge 0 .
\end{aligned}
\]
For security reasons, the field order $p$ in IPRE scheme is at least 160 bits, and $p \gg \langle \vec{U_i}, \vec{V_j} \rangle \gg -p$ holds. Then, we have
\[
\tau_2 \ge \langle \vec{U_i}, \vec{V_j} \rangle \ge 0 \ \Leftrightarrow\ \tau_2 \ge \langle \vec{U_i}, \vec{V_j} \rangle \bmod p \ge 0 .
\]
Therefore, record matching can be detected by examining whether $\langle \vec{U_i}, \vec{V_j} \rangle \bmod p$ is in the range $[0, \tau_2]$.
4) Correctness of Detecting the Intersection of Circular Areas via Inner Product Range: Similarly, for any area-intersecting predicate vector $\vec{U'_i}$ and any nonleaf node's attribute vector $\vec{V_j}$, we have the following equation:
\[
\langle \vec{U'_i}, \vec{V_j} \rangle =
\begin{cases}
\grave{\mu}_j((r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2)/\phi^2 + \grave{\theta}_j, & \text{if } \xi'_i = \grave{\xi}_j \\
\tau_2 - \grave{\mu}_j((r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2)/\phi^2 - \grave{\theta}_j, & \text{otherwise.}
\end{cases}
\]
The nonleaf node's circular area (after normalization) intersects with the query's normalized area if and only if $(r'_i + \grave{r}_j)^2 \ge (r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2 \ge 0$. As shown below, the intersection of normalized areas can be detected by examining whether $\langle \vec{U'_i}, \vec{V_j} \rangle \bmod p$ is in the range $[0, \tau_2]$ as well. Note that the purpose of detecting intersection is to rule out subtrees not containing matched POIs. Expanding original areas to normalized areas only results in scanning more tree nodes. All matched POI records can still be found, and unmatched records will not be included in the result
\[
\begin{aligned}
& (r'_i + \grave{r}_j)^2 \ge (r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2 \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \grave{\mu}_j((r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2)/\phi^2 + \grave{\theta}_j \ge 0 \\
& \text{and } \tau_2 \ge \tau_2 - \grave{\mu}_j((r'_i + \grave{r}_j)^2 - (x'_i - \grave{x}_j)^2 - (y'_i - \grave{y}_j)^2)/\phi^2 - \grave{\theta}_j \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U'_i}, \vec{V_j} \rangle \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U'_i}, \vec{V_j} \rangle \bmod p \ge 0 .
\end{aligned}
\]
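The matching rule and the tree search can be tried out on plaintext vectors. The sketch below builds leaf attribute vectors and a POI-matching predicate vector as described in this section, tests the inner product against $[0, \tau_2]$, and runs a recursive search in the spirit of Algorithm 1. It makes several assumptions that are not from the paper: no encryption is performed (the inner product is computed in the clear in place of Check), $\mu_i$ is computed with integer floor division, $\theta_i$ is drawn strictly below $\mu_i$ so that a point just outside the circle gives a negative value, coordinates are integers, and nonleaf nodes use a plain geometric intersection test as a stand-in for the area-intersecting token. `TAU2`, `Node`, and the helper names are hypothetical.

```python
import math
import random
from dataclasses import dataclass, field

TAU2 = 10**6  # assumed upper limit of the inner product range (needs TAU2 >= r**2)


def leaf_attribute_vector(x, y, rng):
    """Attribute vector of a leaf node (a single POI, so its radius is 0)."""
    xi = rng.randint(0, 1)
    s, r = (-1) ** xi, 0
    return [s, s * (r * r - x * x - y * y), s * 2 * x, s * 2 * y, s * 2 * r, xi * TAU2, s]


def poi_matching_predicate_vector(x, y, r, rng):
    """POI-matching predicate vector for a query circle (x, y, r)."""
    xi = rng.randint(0, 1)
    s = (-1) ** xi
    mu = TAU2 // (r * r)                                     # ASSUMPTION: floor division
    theta = rng.randint(0, min(mu - 1, TAU2 - mu * r * r))   # ASSUMPTION: theta < mu
    return [s * (mu * (r * r - x * x - y * y) + theta),
            s * mu, s * mu * x, s * mu * y, s * mu * r, 1, xi * TAU2]


def in_range(u, v):
    """Plaintext stand-in for Check: is <u, v> in [0, TAU2]?"""
    return 0 <= sum(a * b for a, b in zip(u, v)) <= TAU2


@dataclass
class Node:
    data: object                      # leaf: attribute vector; nonleaf: (cx, cy, radius)
    children: list = field(default_factory=list)
    record: object = None


def search(nd, check_match, check_intersect, ndl):
    """Recursive search following the structure of Algorithm 1."""
    if not nd.children:               # leaf node
        if check_match(nd.data):
            ndl.append(nd)
    elif check_intersect(nd.data):    # nonleaf node whose area meets the query
        for child in nd.children:
            search(child, check_match, check_intersect, ndl)
    return ndl


def build_demo_tree(pois, rng, group_size=4):
    """Two-level demo tree: groups of POIs under covering circles, one root."""
    groups = []
    for i in range(0, len(pois), group_size):
        g = pois[i:i + group_size]
        cx = sum(p[0] for p in g) // len(g)
        cy = sum(p[1] for p in g) // len(g)
        rad = max(math.isqrt((p[0] - cx) ** 2 + (p[1] - cy) ** 2) + 1 for p in g)
        leaves = [Node(leaf_attribute_vector(px, py, rng), record=(px, py)) for px, py in g]
        groups.append(Node((cx, cy, rad), leaves))
    return Node((0, 0, 10**9), groups)


def range_query(root, x, y, r, rng):
    """Return the records of all POIs within distance r of (x, y)."""
    u = poi_matching_predicate_vector(x, y, r, rng)
    match = lambda v: in_range(u, v)
    intersect = lambda c: (r + c[2]) ** 2 >= (x - c[0]) ** 2 + (y - c[1]) ** 2
    return [nd.record for nd in search(root, match, intersect, [])]
```

For example, `range_query(build_demo_tree(pois, rng), 500, 500, 300, rng)` returns the POIs within 300 units of `(500, 500)` for a list `pois` of integer coordinate pairs and a `random.Random` instance `rng`.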
214 IEEE INTERNET OF THINGS JOURNAL, VOL. 3, NO. 2, APRIL 2016
C. EPLQ Design
Our EPLQ solution consists of two algorithms: 1) system setup and 2) spatial range search.
1) System Setup: The LBS provider initializes the system by the following steps.
Step 1) The LBS provider initializes the parameters and keys for the solution.
The LBS provider initializes the public parameter and keys of the proposed IPRE scheme as well as the key of a standard encryption scheme (e.g., AES). Let $AK = (\alpha, \beta, d, M)$, $PK = (d, M)$, and $PP = ((G_1, G_2, g, p, e), (\Omega_k)_{k=\tau_1}^{\tau_2})$ be the attribute encryption key, predicate encryption key, and public parameter of IPRE scheme. PP is shared with the cloud. PK, $(G_1, G_2, g, p, e)$, and the key of the standard encryption scheme are shared with LBS users.
Remark. The standard scheme will be used to encrypt POI records. IPRE scheme is for searching encrypted records. $(\Omega_k)_{k=\tau_1}^{\tau_2}$ in the public parameter is used for IPRE's Check algorithm only, and LBS users do not need it to generate tokens.
Step 2) The LBS provider builds an $\widehat{ss}$-tree for the LBS database.
Step 3) The LBS provider encrypts each POI record with the standard encryption scheme.
Step 4) The LBS provider outsources all encrypted POI records and the $\widehat{ss}$-tree to the cloud.
2) Spatial Range Search: Suppose an LBS user wants to find all POIs within a circular area centered at coordinates $(x_i, y_i)$ with radius $r_i$. The privacy-preserving query is performed by the following steps.
Step 1) The LBS user generates two tokens for searching POI records with the proposed IPRE scheme.
As elaborated earlier in Section V-B, to search $\widehat{ss}$-tree, two tokens associated with the query area should be generated. The LBS user generates them following the way in Section V-B. Let Ks[0] and Ks[1] be the two generated tokens.
Step 2) The user sends (Ks[0], Ks[1]) as a query to the cloud.
Step 3) The cloud searches $\widehat{ss}$-tree to find all leaf nodes matching the query from the user.
The search algorithm has been given in Section V-B, and its pseudocode is shown in Algorithm 1.
Step 4) The cloud returns the corresponding POI records of matched leaf nodes to the user.
Step 5) The LBS user decrypts received POI records with the shared key of the standard encryption scheme.

VI. SECURITY ANALYSIS
In this section, we analyze the security properties of the proposed EPLQ solution. Specifically, following the security requirements discussed earlier, our analysis will focus on how the proposed EPLQ solution can achieve the LBS data confidentiality and the user's location privacy.
The confidentiality of LBS data includes not only the confidentiality of POI records but also the confidentiality of location information in $\widehat{ss}$-tree. On the other hand, user location privacy involves protecting sensitive location information in user queries and $\widehat{ss}$-tree. The security of EPLQ solution depends on the underlying standard encryption scheme and IPRE scheme. The standard encryption scheme is responsible for preventing the cloud from learning POI records, while our IPRE scheme is responsible for protecting user location and POI location from the cloud. The current AES standard can be used as the standard scheme, and it is secure under ciphertext-only, known-sample, and known-plaintext attacks. Thus, we focus on the analysis of user/POI location protection with IPRE scheme.

A. Security of Query and POI Index Encryption
In EPLQ, user queries and the sensitive location information in $\widehat{ss}$-tree are encrypted with IPRE scheme. A query consists of two tokens associated with two predicate vectors, which contain the LBS user's location information. For a predicate vector $\vec{U_i} = (u_{i,1}, u_{i,2}, \ldots, u_{i,t})$, the corresponding token is $((g^{u_{i,k}})_{k=1}^{n}, e(g, g)^{h_i})$ where $(u_{i,1}, u_{i,2}, \ldots, u_{i,n}) = \mathrm{EncodeU}(\vec{U_i}, h_i) M \bmod p$. Because of the hardness of the CDH problem, the attacker cannot reveal any exponent in the token even if knowing the predicate vector. Without the secret keys of IPRE (i.e., PK and AK), no one can reveal the predicate vector, the secret matrix $M$, or the random number $h_i$. Because of the randomness and secrecy of $h_i$, encrypting a predicate vector to a token is semantically secure. Therefore, it is secure under ciphertext-only and known-sample attacks. The sensitive location information in $\widehat{ss}$-tree is concealed with attribute vector encryption. The encryption is very similar to predicate vector encryption, and their security properties are the same. Therefore, we omit its security analysis.

B. Security Under the Attack on Inner Products
As discussed above, it is hard to reveal user queries and POI locations directly from the ciphertexts of attribute vectors and predicate vectors. Alternatively, the attacker may attempt to recover the inner products of predicate vectors and attribute vectors first, and then reveal the vectors containing information about user locations and POI locations. Next, we show how this attack works and its countermeasure.
The attacker may attempt to recover inner products through exhaustive attacks on $(\Upsilon_k = e(g, g)^{(\alpha \times \mathrm{perm}(k) + \beta)^d})_{k=\tau_1}^{\tau_2}$. The exponents $((\alpha \times \mathrm{perm}(k) + \beta)^d)_{k=\tau_1}^{\tau_2}$ could be viewed as $d$-degree polynomials of $d + 1$ terms where the variables are $\alpha$ and $\beta$. The coefficients of the polynomials are in the range $[\tau_1^d, \tau_2^d]$. These polynomials are in the same vector space of dimension $d + 1$. Any $d + 1$ of these polynomials are linearly independent, and any $d + 2$ of them are not. Thus, for any $\Upsilon_{k_1}, \Upsilon_{k_2}, \ldots, \Upsilon_{k_{d+2}}$, there exist nonzero $\lambda_1, \lambda_2, \ldots, \lambda_{d+2}$ satisfying the following equation:
\[
\prod_{l=1}^{d+2} \Upsilon_{k_l}^{\lambda_l} = \prod_{l=1}^{d+2} e(g, g)^{\lambda_l \times (\alpha \times \mathrm{perm}(k_l) + \beta)^d} = e(g, g)^0 .
\]
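The linear dependence among any $d + 2$ of the exponents $(\alpha \times \mathrm{perm}(k) + \beta)^d$ can be checked numerically in one special case. The sketch below assumes, purely for illustration, that the $d + 2$ values $\mathrm{perm}(k_l)$ are consecutive integers $z_0, z_0 + 1, \ldots, z_0 + d + 1$. In that case the alternating binomial coefficients of a $(d+1)$-th finite difference serve as the nonzero $\lambda_l$, because the $(d+1)$-th difference of a degree-$d$ polynomial in $z$ vanishes. For general values one would solve a small linear system instead. The function names are hypothetical.

```python
from math import comb


def dependency_coefficients(d):
    """Nonzero lambdas for d + 2 consecutive values: (-1)^l * C(d + 1, l)."""
    return [(-1) ** l * comb(d + 1, l) for l in range(d + 2)]


def combined_exponent(alpha, beta, d, z0, p):
    """Return sum_l lambda_l * (alpha * (z0 + l) + beta)^d mod p.
    ASSUMPTION: perm(k_l) = z0 + l, i.e. consecutive integers."""
    lambdas = dependency_coefficients(d)
    total = sum(lam * pow(alpha * (z0 + l) + beta, d, p)
                for l, lam in enumerate(lambdas))
    return total % p
```

A result of 0 means the product of the corresponding $\Upsilon$ values raised to these $\lambda_l$ equals $e(g, g)^0$, whatever the secret $\alpha$ and $\beta$ are.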
TABLE III. SYSTEM SETUP LATENCY

Fig. 7. POI query latency at cloud side. Note that the latency should be much lower once deployed at a real cloud.

D. Cloud's Computational Cost
Recall that searching $\widehat{ss}$-tree to find matched records requires scanning $O(\log N + R)$ tree nodes for a database with $N$ records and a query having $R$ matched records. Determining whether a record or tree node matches a query requires computing $\mathrm{Check}(K_i, C_j)$. Computing the function requires $n = 37$ pairings and multiplications.
To see whether the computational cost of searching the database is acceptable, we conducted experiments on three datasets. For each dataset, 1000 random query points are chosen. If a query point's location is not near any POI, the query is not realistic and query latency is lower than that in normal situations. To avoid that, in the experiments, each query point's location is the same as one random POI's. For each query point, we generate three queries with radii of 500 m, 1 km, and 2 km, respectively. Therefore, 3000 queries are generated for each dataset. We measured the average search latency of these queries for each dataset, and the results are shown in Fig. 7. As expected, the latency increases very slowly when increasing POI count and query radius. In the experiments, a workstation plays the role of the cloud, and only four CPU cores can be utilized to do the computing. A real cloud has much more computing resources, and the query latency at a real cloud should be much lower.

E. Communication Cost and Storage Cost
To make a query, an LBS user sends two tokens to the cloud. The communication cost is $O(n \times \log p)$. Under the settings in Table II(b), the traffic is 4.75 KB, which is acceptable. The user has to store the predicate encryption key PK and the pairing parameter locally. The storage usage is dominated by $M$, which is about 27 KB. This is negligible even for a mobile LBS user.
Let $N$ be the number of POI records in the database. In addition to LBS data, the cloud also needs to store the public parameter of IPRE and an $\widehat{ss}$-tree of less than $4N/3$ nodes (for the case of $m_{min} = 4$). The size of the public parameter is dominated by $(\Omega_k)_{k=\tau_1}^{\tau_2}$, which is $(\tau_2 - \tau_1 + 1) \times \log |\mathrm{Hash}()|$ bits. It is about 763 MB for settings in Table II(b). A tree node's size is around 1.3 KB. Assume that there are 1 million records in the database. The cloud needs at most 2.42 GB storage space in total. The public parameter and $\widehat{ss}$-tree can fit in the memory of even one single server. Therefore, the storage cost is acceptable. The LBS provider sends the cloud the public parameter and the tree only once. The communication cost is also acceptable.

F. Accuracy
As discussed in Section IV, using a hash function in IPRE scheme reduces the size of the public parameter but introduces some false positives. This will not hurt the accuracy of EPLQ solution. The false positive rate is $(\tau_2 - \tau_1 + 1)/|\mathrm{Hash}()|$, which is about $5.42 \times 10^{-12}$. $O(\log N + R)$ tree nodes are scanned during a query. This number is at most a few hundred. Then, the probability that a query result contains false positive(s) is at most a few hundred times $5.42 \times 10^{-12}$, which is negligible.

VIII. RELATED WORKS
Our work is related to not only privacy-preserving LBS but also privacy-preserving query over outsourced encrypted data. In this section, we introduce some related works that can be used to realize privacy-preserving POI query, though some of them are not designed for POI query or LBS. In the literature, there are four kinds of privacy-preserving queries over POIs: spatial range query [1], nearest neighbor (NN) query [15], K nearest neighbors (KNN) query [2], [16]–[20], and multidimensional range query [20]–[26]. Spatial range query cannot be replaced by NN and KNN queries, which all return the nearest neighbor(s) to a given location. They have different usages. Sometimes spatial range query may be replaced by multidimensional range query, which returns POIs in a rectangular area instead of a circular area. However, the inaccurate result is not desirable. Next, we review the works applicable to privacy-preserving spatial range query.

A. Solutions Applicable to Outsourced LBS
1) Privacy-Preserving Spatial Range Query Based on Coordinate Transformation: In the solution based on coordinate transformation [1], the coordinates of queries and POIs in the original coordinate system are transformed to new coordinates in a new coordinate system. After the transformation, the distance information of any two points is still preserved. Coordinate transformation is very efficient, and the returned results are accurate. However, solutions designed based on coordinate transformation would be vulnerable to known-sample attacks [2].
2) Privacy-Preserving POI Query Based on PIR: As far as we know, only PIR-based solutions [3], [4] can protect the privacy in both public LBS and outsourced LBS. Private information retrieval (PIR) [5] is a privacy primitive hiding the retrieved data item's ID from the database server(s). Because the data items being retrieved are hidden from the database
server(s), whether two queries return the same results is undetectable. Therefore, PIR-based solutions are resilient to access-pattern attacks. PIR can be used to realize all four kinds of POI queries. However, PIR is costly in both communication and computation [6] for the following reasons. PIR requires linearly scanning all POI records, including their location data (coordinates and radii) and nonlocation data. Moreover, to use PIR in LBS, an LBS user must additionally access the LBS database’s index data in a privacy-preserving manner. PIR can retrieve records only if given their IDs. To support spatial range query, an LBS user must therefore first obtain nearby POIs’ record IDs from the index data in a privacy-preserving manner. PIR or other techniques may be used to obtain such IDs.

B. Solutions for Public LBS Only

1) Privacy-Preserving LBS Based on Anonymous Communication: In this kind of solution [27], [28], one or more third parties relay messages between users and the LBS provider. This approach hides the linkage between user identities and messages from the LBS provider. The query area is exposed to the LBS provider, but the user sending the query is hidden among a set of users.

2) Privacy-Preserving LBS Based on Location Obfuscation: In this kind of solution [29], [30], to prevent the LBS provider from knowing users’ precise locations, users submit low-precision locations or fake locations along with real locations. These solutions offer a weak level of privacy.

3) Privacy-Preserving LBS Based on Spatial Cloaking: This kind of solution [31], [32] combines anonymous communication and location obfuscation techniques. To the LBS provider, a user cannot be distinguished from the other users in a cloaking area, and the cloaking area, instead of users’ precise locations, is sent to the LBS provider.

All the above solutions can be applied to a wide range of LBS, including POI query. However, their techniques do not allow the cloud to search encrypted data. Therefore, they cannot be used for outsourced LBS, where the LBS data in the cloud are encrypted.

IX. CONCLUSION

In this paper, we have proposed EPLQ, an efficient privacy-preserving spatial range query solution for smartphones, which preserves the privacy of user location and achieves confidentiality of LBS data. To realize EPLQ, we have designed an IPRE and a novel privacy-preserving index tree named $\widehat{ss}$-tree. EPLQ’s efficacy has been evaluated with theoretical analysis and experiments, and detailed analysis shows its security against known-sample attacks and ciphertext-only attacks. Our techniques have potential usages in other kinds of privacy-preserving queries. If a query can be performed by comparing inner products to a given range, the proposed IPRE and $\widehat{ss}$-tree may be applied to realize a privacy-preserving version of that query. Two potential usages are privacy-preserving similarity query and long spatial range query. In the future, we will design solutions for these scenarios and identify more usages.

REFERENCES

[1] A. Gutscher, “Coordinate transformation—A solution for the privacy problem of location based services?” in Proc. 20th Int. Parallel Distrib. Process. Symp. (IPDPS’06), Rhodes Island, Greece, Apr. 25–29, 2006, p. 424.
[2] W. K. Wong, D. W.-l. Cheung, B. Kao, and N. Mamoulis, “Secure kNN computation on encrypted databases,” in Proc. SIGMOD, 2009, pp. 139–152.
[3] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan, “Private queries in location based services: Anonymizers are not necessary,” in Proc. SIGMOD, 2008, pp. 121–132.
[4] X. Yi, R. Paulet, E. Bertino, and V. Varadharajan, “Practical k nearest neighbor queries with location privacy,” in Proc. 30th Int. Conf. Data Eng. (ICDE), 2014, pp. 640–651.
[5] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan, “Private information retrieval,” J. ACM, vol. 45, no. 6, pp. 965–981, 1998.
[6] F. Olumofin and I. Goldberg, “Revisiting the computational practicality of private information retrieval,” in Financial Cryptography and Data Security. New York, NY: Springer, 2012, pp. 158–172.
[7] J. Katz, A. Sahai, and B. Waters, “Predicate encryption supporting disjunctions, polynomial equations, and inner products,” in Proc. 27th Ann. Int. Conf. Theory Appl. Cryptograph. Tech. Adv. Cryptol. (EUROCRYPT’08), Istanbul, Turkey, Apr. 13–17, 2008, pp. 146–162.
[8] D. Boneh and B. Waters, “Conjunctive, subset, and range queries on encrypted data,” in Proc. 4th Theory Cryptograph. Conf. (TCC’07), Amsterdam, The Netherlands, Feb. 21–24, 2007, pp. 535–554.
[9] D. Boneh and M. K. Franklin, “Identity-based encryption from the Weil pairing,” SIAM J. Comput., vol. 32, no. 3, pp. 586–615, 2003.
[10] D. A. White and R. Jain, “Similarity indexing with the ss-tree,” in Proc. 12th Int. Conf. Data Eng. (ICDE), 1996, pp. 516–523.
[11] A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proc. Annu. Meeting (SIGMOD’84), Boston, MA, USA, Jun. 18–21, 1984, pp. 47–57.
[12] T. K. Dang, J. Küng, and R. Wagner, “The sh-tree: A super hybrid index structure for multidimensional data,” in Proc. 12th Int. Conf. Database Expert Syst. Appl. (DEXA’01), Munich, Germany, Sep. 3–5, 2001, pp. 340–349.
[13] B.-Y. Yang and J.-M. Chen, “All in the XL family: Theory and practice,” in Proc. Int. Conf. Inf. Secur. Cryptol., 2004, pp. 67–86.
[14] G. Ars, J.-C. Faugère, H. Imai, M. Kawazoe, and M. Sugita, “Comparison between XL and Gröbner basis algorithms,” in Proc. ASIACRYPT, 2004, pp. 338–353.
[15] B. Yao, F. Li, and X. Xiao, “Secure nearest neighbor revisited,” in Proc. IEEE 29th Int. Conf. Data Eng. (ICDE’13), 2013, pp. 733–744.
[16] Y. Elmehdwi, B. K. Samanthula, and W. Jiang, “Secure k-nearest neighbor query over encrypted data in outsourced environments,” in Proc. IEEE 30th Int. Conf. Data Eng. (ICDE), 2014, pp. 664–675.
[17] A. Khoshgozaran and C. Shahabi, “Blind evaluation of nearest neighbor queries using space transformation to preserve location privacy,” in Advances in Spatial and Temporal Databases. New York, NY, USA: Springer, 2007, pp. 239–257.
[18] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu, “Secure multidimensional range queries over outsourced data,” VLDB J., vol. 21, no. 3, pp. 333–358, 2012.
[19] I.-T. Lien, Y.-H. Lin, J.-R. Shieh, and J.-L. Wu, “A novel privacy preserving location-based service protocol with secret circular shift for k-NN search,” IEEE Trans. Inf. Forensics Secur., vol. 8, no. 6, pp. 863–873, Jun. 2013.
[20] M. L. Yiu, G. Ghinita, C. S. Jensen, and P. Kalnis, “Enabling search services on outsourced private spatial data,” VLDB J., vol. 19, no. 3, pp. 363–384, 2010.
[21] E. Shi, J. Bethencourt, T.-H. Chan, D. Song, and A. Perrig, “Multi-dimensional range query over encrypted data,” in Proc. IEEE Symp. Secur. & Privacy, 2007, pp. 350–364.
[22] B. Wang, Y. Hou, M. Li, H. Wang, and H. Li, “Maple: Scalable multi-dimensional range search over encrypted cloud data with tree-based index,” in Proc. 9th ACM Symp. Inf. Comput. Commun. Secur., 2014, pp. 111–122.
[23] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order preserving encryption for numeric data,” in Proc. SIGMOD, 2004, pp. 563–574.
[24] J. Shao, R. Lu, and X. Lin, “Fine: A fine-grained privacy-preserving location-based service framework for mobile devices,” in Proc. IEEE INFOCOM, 2014, pp. 244–252.
[25] A. Boldyreva, N. Chenette, Y. Lee, and A. O’Neill, “Order-preserving symmetric encryption,” in Proc. EUROCRYPT, 2009, pp. 224–241.
[26] P. Wang and C. Ravishankar, “Secure and efficient range queries on outsourced databases using Rp-trees,” in Proc. Int. Conf. Data Eng. (ICDE), 2013, pp. 314–325.
[27] A. R. Beresford and F. Stajano, “Location privacy in pervasive computing,” Pervasive Comput., vol. 2, no. 1, pp. 46–55, Jan./Mar. 2003.
[28] Y. Zhu, D. Ma, D. Huang, and C. Hu, “Enabling secure location-based services in mobile cloud computing,” in Proc. 2nd ACM SIGCOMM Workshop Mobile Cloud Comput., 2013, pp. 27–32.
[29] H. Kido, Y. Yanagisawa, and T. Satoh, “An anonymous communication technique using dummies for location-based services,” in Proc. Int. Conf. Perv. Serv. (ICPS), 2005, pp. 88–97.
[30] C. A. Ardagna, M. Cremonini, E. Damiani, S. D. C. Di Vimercati, and P. Samarati, “Location privacy protection through obfuscation-based techniques,” in Proc. Data Appl. Secur. XXI, 2007, pp. 47–60.
[31] M. Gruteser and D. Grunwald, “Anonymous usage of location-based services through spatial and temporal cloaking,” in Proc. 1st Int. Conf. Mobile Syst. Appl. Serv., 2003, pp. 31–42.
[32] M. F. Mokbel, C.-Y. Chow, and W. G. Aref, “The new Casper: Query processing for location services without compromising privacy,” in Proc. 32nd Int. Conf. Very Large Data Bases (VLDB’06), 2006, pp. 763–774.

Lichun Li received the Bachelor’s degree in information engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2002, the Master’s degree in communication and information systems from the China Academy of Telecommunication Technology, Beijing, China, in 2006, and the Ph.D. degree in computer science from the Beijing University of Posts and Telecommunications, Beijing, China, in 2009.
He is currently a Postdoctoral Research Fellow with the INFINITUS Laboratory, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include privacy and security in cloud and big data.

Rongxing Lu (S’09–M’11–SM’15) received the Ph.D. degree in computer science from Shanghai Jiao Tong University, Shanghai, China, in 2006, and the Ph.D. degree in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2012.
From May 2012 to April 2013, he was a Postdoctoral Fellow with the University of Waterloo. Since May 2013, he has been an Assistant Professor with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include computer network security, mobile and wireless communication security, and applied cryptography.
Dr. Lu was the recipient of the Canada Governor General Gold Medal.

Cheng Huang received the B.Eng. degree in information security from Xidian University, Xi’an, China, in 2013.
He is currently a Project Officer with the INFINITUS Laboratory, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include applied cryptography, cyber security, and privacy.