Você está na página 1de 6

1

A Privacy-Preserving Remote Data Integrity Checking


Protocol with Data Dynamics and Public Veriability
Zhuo Hao, Sheng Zhong, Member, IEEE, and Nenghai Yu, Member, IEEE
AbstractRemote data integrity checking is a crucial tech-
nology in cloud computing. Recently many works focus on
providing data dynamics and/or public veriability to this type
of protocols. Existing protocols can support both features with
the help of a third party auditor. In a previous work, Sebe et
al. [1] propose a remote data integrity checking protocol that
supports data dynamics. In this paper, we adapt Sebe et al.s
protocol to support public veriability. The proposed protocol
supports public veriability without help of a third party
auditor. In addition, the proposed protocol does not leak any
private information to third party veriers. Through a formal
analysis, we show the correctness and security of the protocol.
After that, through theoretical analysis and experimental
results, we demonstrate that the proposed protocol has a good
performance.
Index TermsData integrity, Data dynamics, Public veri-
ability, Privacy.
I. INTRODUCTION
Storing data in the cloud has become a trend [2]. An
increasing number of clients store their important data in
remote servers in the cloud, without leaving a copy in
their local computers. Sometimes the data stored in the
cloud is so important that the clients must ensure it is
not lost or corrupted. While it is easy to check data in-
tegrity after completely downloading the data to be checked,
downloading large amounts of data just for checking data
integrity is a waste of communication bandwidth. Hence, a
lot of works [1], [3], [4], [5], [6], [7], [8], [9] have been
done on designing remote data integrity checking protocols,
which allow data integrity to be checked without completely
downloading the data.
Remote data integrity checking is rst introduced in [10],
[11], which independently propose RSA-based methods for
solving this problem. After that Shah et al. [12] propose
a remote storage auditing method based on pre-computed
challenge-response pairs. Recently many works [1], [3],
[4], [5], [6], [7], [8], [9], [13], [14], [15] focus on pro-
viding three advanced features for remote data integrity
checking protocols: data dynamics [5], [6], [8], [14], public
veriability [3], [8], [9], [14] and privacy against veriers
[9], [14]. The protocols in [5], [6], [7], [8], [14] support
data dynamics at the block level, including block insertion,
This work was supported in part by NSF CNS-0845149 and CCF-
0915374. A full version of this concise paper is SUNY Buffalo CSE tech-
nical report 2010-11 (http://www.cse.buffalo.edu/tech-reports/2010-11.pdf)
Zhuo Hao and Nenghai Yu are with the Department of EEIS, University
of Science and Technology of China, Hefei, Anhui 230027, China. (email:
hzhuo@mail.ustc.edu.cn; ynh@ustc.edu.cn).
Sheng Zhong (corresponding author) is with the Department of Com-
puter Science and Engineering, State University of New York at Buffalo,
Amherst NY 14260, USA. (email:szhong@buffalo.edu).
block modication and block deletion. The protocol of [3]
supports data append operation. In addition, [1] can be
easily adapted to support data dynamics. [9], [13] can be
adapted to support data dynamics by using the techniques
of [8]. On the other hand, protocols in [3], [8], [9], [13],
[14], [15] support public veriability, by which anyone (not
just the client) can perform the integrity checking operation.
The protocols in [9], [13], [14], [15] support privacy against
third party veriers. We compare the proposed protocol with
selected previous protocols. (see Table I.)
In this paper, we have the following main contributions:
We propose a remote data integrity checking protocol
for cloud storage, which can be viewed as an adapta-
tion of Sebe et al.s protocol [1]. The proposed protocol
inherits the support of data dynamics from [1], and
supports public veriability and privacy against third-
party veriers, while at the same time it doesnt need
to use a third-party auditor.
We give a security analysis of the proposed protocol,
which shows that it is secure against the untrusted
server and private against third party veriers.
We have theoretically analyzed and experimentally
tested the efciency of the protocol. Both theoretical
and experimental results demonstrate that our protocol
is efcient.
The rest of this paper is organized as follows. In Section
II, technical preliminaries are presented. In Section III,
the proposed remote data integrity checking protocol is
presented. In Section IV, a formal analysis of the proposed
protocol is presented. In Section V, we describe the support
of data dynamics of the proposed protocol. In Section VI,
the protocols complexity is analyzed in the aspects of com-
munication, computation and storage costs; furthermore,
experimental results are presented for the efciency of the
protocol. And nally, conclusions and possible future work
are presented in Section VII.
II. TECHNICAL PRELIMINARIES
We consider a cloud storage system in which there are
a client and an untrusted server. The client stores her data
in the server without keeping a local copy. Hence, it is of
critical importance that the client should be able to verify
the integrity of the data stored in the remote untrusted
server. If the server modies any part of the clients data,
the client should be able to detect it; furthermore, any third
party verier should also be able to detect it. In case a third
party verier veries the integrity of the clients data, the
data should be kept private against the third party verier.
Below we present a formal statement of the problem.
Digital Object Indentifier 10.1109/TKDE.2011.62 1041-4347/11/$26.00 2011 IEEE
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
2
TABLE I: Comparisons between the Proposed Protocol and Previous Protocols.
S-PDP[3] [1] [8] DPDP[6] [9], [13] IPDP[14]
The Proposed
Protocol
Type of guarantee
probabilistic/
deterministic
deterministic probabilistic
1
deterministic
2
Public veriability Yes No Yes No Yes Yes Yes
With help of TPA
3
No No Yes No Yes Yes No
Data dynamics append only Yes Yes Yes No Yes Yes
Privacy preserving No No Yes Yes Yes
Support for sampling Yes No Yes Yes Yes Yes No
Size of
verication tags
O(n)
*
Communication O(1) O(clogn) O(logn) O(c) O(s) O(1)
Server block access O(c) O(n) O(c) O(n)
Server computation O(c) O(n) O(clogn) O(c) O(c+s) O(n)
Verier computation O(c) O(n) O(clogn) O(c) O(c+s) O(n)
Client storage O(1) O(n) O(1) O(n)
1
The probabilistic guarantee of data integrity is achieved by using the probabilistic checking method proposed in [3]. Because the blocks are
randomly selected, the detection probability will be high if the server deletes a fraction of all the blocks.
2
The protocol in [1] and the proposed protocol achieve deterministic guarantee of data integrity, because they check the integrity of all the data
blocks. However, both the protocol in [1] and the proposed protocol can be easily transformed into a more efcient one by using the probabilistic
checking method [3].
3
A third party auditor [8], [9], [13], [14] has certain special expertise and technical capabilities, which the clients do not have.
*
n is the block number, c is the sampling block number, and s is the number of sectors in a block.
Problem Formulation Denote by m the le that will be
stored in the untrusted server, which is divided into n blocks
of equal lengths: m = m
1
m
2
...m
n
, where n = ,[m[/l|.
Here l is the length of each le block. Denote by f
K
() a
pseudo-random function which is dened as:
f : 0, 1
k
0, 1
log
2
(n)
0, 1
d
,
in which k and d are two security parameters. Furthermore,
denote the length of N in bits by [N[.
We need to design a remote data integrity checking
protocol that includes the following ve functions: SetUp,
TagGen, Challenge, GenProof and CheckProof.
SetUp(1
k
) (pk, sk): Given the security parameter k, this
function generates the public key pk and the secret key sk.
pk is public to everyone, while sk is kept secret by the
client.
TagGen(pk, sk, m) T
m
: Given pk, sk and m, this
function computes a verication tag T
m
and makes it
publicly known to everyone. This tag will be used for public
verication of data integrity.
Challenge(pk, T
m
) chal: Using this function, the verier
generates a challenge chal to request for the integrity proof
of le m. The verier sends chal to the server.
GenProof(pk, T
m
, m, chal) R: Using this function, the
server computes a response R to the challenge chal. The
server sends R back to the verier.
CheckProof(pk, T
m
, chal, R) success, failure:
The verier checks the validity of the response R. If it is
valid, the function outputs success, otherwise the function
outputs failure. The secret key sk is not needed in the
CheckProof function.
Security Requirements There are two security require-
ments for the remote data integrity checking protocol: secu-
rity against the server with public veriability, and privacy
against third party veriers. We rst give the denition of
security against the server with public veriability. In this
denition, we have two entities: a challenger that stands for
either the client or any third party verier, and an adversary
that stands for the untrusted server.
Denition 1. (Security against the Server with Public Ver-
ibility [3]) We consider a game between a challenger and
an adversary that has four phases: Setup, Query, Challenge
and Forge.
Setup: The challenger runs the SetUp function, and
gets the (pk, sk). The challenger sends pk to the
adversary and keeps sk secret.
Query: The adversary adaptively selects some le
blocks m
i
, i = 1, 2, ..., n and queries the verication
tags from the challenger. The challenger computes a
verication tag D
i
for each of these blocks and sends
D
i
, i = 1, 2, ..., n to the adversary. According to the
protocol formulation, T
m
= D
1
, D
2
, ..., D
n
.
Challenge: The challenger generates the chal for
the le blocks m
1
, m
2
, ..., m
n
and sends it to the
adversary.
Forge: The adversary computes a response R to prove
the integrity of the requested le blocks.
If CheckProof(pk, T
m
, chal, R) = success, then the
adversary has won the game. The remote data integrity
checking protocol is said to be secure against the server
if for any TTT (probabilistic polynomial time) adversary
/, the probability that / wins the game on a collection
of le blocks is negligibly close to the probability that the
challenger can extract these le blocks by a knowledge
extractor c.
When the verier is not the client herself, the protocol
must ensure that no private information about the clients
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
3
data is leaked to the third party verier. We formalize this
requirement using the simulation paradigm [16].
Before we proceed to the denition of this requirement,
we introduce some related notations. Let f = (f
1
, f
2
) be
a TTT functionality and let be a two-party protocol
for computing f. During the execution of , denote the
view of the rst (resp., second) party by view

1
(x, y)
(resp., view

2
(x, y)). view

1
(x, y) (resp., view

2
(x, y)) in-
cludes (x, r
1
, m
1
1
, ..., m
1
t
) (resp., (x, r
2
, m
2
1
, ..., m
2
t
)) where
r
1
(resp., r
2
) represents the outcome of the rst (resp.,
second) partys internal coin tosses, and m
1
i
(resp., m
2
i
)
represents the ith message it has received. Denote the output
of the rst (resp., second) party during the execution of
on (x, y) by output

1
(x, y) (resp., output

2
(x, y)), which is
implicit in the partys own view of the execution. We denote
the verier and the server by 1 and T respectively.
Denition 2. (Privacy against Semi-Honest Behavior [16])
For a deterministic functionality f, is said to privately
compute f if there exist probabilistic polynomial time
algorithms, denoted S
1
and S
2
, such that
S
1
(x, f
1
(x, y))
x,y{0,1}

c
view

1
(x, y)
x,y{0,1}
,
S
2
(y, f
2
(x, y))
x,y{0,1}

c
view

2
(x, y)
x,y{0,1}
.
Note that
c
denotes computational indistinguishability.
From Denition 2 we dene the privacy against third
party veriers, which is given in Denition 3.
Denition 3. (Privacy against Third Party Veriers) For the
remote data integrity checking protocol , if there exists a
TTT simulator o
V
such that
o
V
(x, f
V
(x, y))
x,y{0,1}

c
view

V
(x, y)
x,y{0,1}
,
then is a protocol that ensures privacy against third-party
veriers.
Data Dynamics at Block Level Data dynamics means
after clients store their data at the remote server, they
can dynamically update their data at later times. At the
block level, the main operations are block insertion, block
modication and block deletion. Moreover, when data is
updated, the verication metadata also needs to be updated.
The updating overhead should be made as small as possible.
Homomorphic Veriable Tags In our construction,
we use a RSA-based homomorphic veriable tags (HVT)
[3], which is dened as follows. Let N = pq be one
publicly known RSA modulus. We know that e : e
Z
N
and gcd(e, N) = 1 forms a multiplicative group.
Denote this group by Z

N
. Denote an element in Z

N
with
a large order by g. The RSA-based HVT for message m
i
is dened as Tag(m
i
) = g
m
i
mod N. Its homomorphic
property can be deduced from its denition. When Tag(m
i
)
and Tag(m
j
) are tags of m
i
and m
j
respectively, the tag for
m
i
+m
j
can be generated by computing Tag(m
i
+m
j
) =
Tag(m
i
) Tag(m
j
) = g
m
i
+m
j
mod N.
III. THE PROPOSED REMOTE DATA INTEGRITY
CHECKING PROTOCOL
In this section we describe the proposed remote data
integrity checking protocol. Just as formulated in Section
II, the proposed protocol has functions SetUp, TagGen,
Challenge, GenProof and CheckProof, as well as functions
for data dynamics. In the following we present the former
ve functions of the proposed protocol. We leave the
functions for data dynamics to Section V.
SetUp(1
k
) (pk, sk): Let N = pq be one publicly
known RSA modulus, in which p = 2p

+ 1, q = 2q

+ 1
are two large primes. p

and q

are also primes. In addition,


all the quadratic residues modulo N form a multiplicative
cyclic group, which we denote by QR
N
. Denote the gen-
erator of QR
N
by g.
1
Since the order of QR
N
is p

, the
order of g is also p

. Let pk = (N, g) and sk = (p, q).


pk is then released to be publicly known to everyone, and
sk is kept secret by the client.
TagGen(pk, sk, m) T
m
: For each le block m
i
, i
[1, n], the client computes the block tag as
D
i
= (g
m
i
) mod N.
Without loss of generality, we assume that each block is
unique. If in some particular applications, there exist blocks
with the same value, then we differentiate them by adding
a random number v Z
N
in each of them. Let T
m
=
D
1
, D
2
, ..., D
n
. After nishing computing all the block
tags, the client sends the le m to the remote server, and
releases T
m
to be publicly known to everyone.
Challenge(pk, T
m
) chal: In order to verify the in-
tegrity of the le m, the verier generates a random key
r [1, 2
k
1] and a random group element s Z
N
0.
The verier then computes g
s
= g
s
mod N and sends
chal = r, g
s
to the server.
GenProof(pk, T
m
, m, chal) R: When the server re-
ceives chal = r, g
s
, it generates a sequence of block in-
dexes a
1
, a
2
, ..., a
n
by calling f
r
(i) for i [1, n] iteratively.
Then the server computes
R = (g
s
)

n
i=1
a
i
m
i
mod N,
and sends R to the verier.
CheckProof(pk, T
m
, chal, R) success, failure:
When the verier receives R from the server, she computes
a
i

i=1,...,n
as the server does in the GenProof step. Then
the verier computes P and R

as follows:
P =
n

i=1
(D
a
i
i
mod N) mod N
R

= P
s
mod N
After that the verier checks whether R

= R. If R

= R,
output success. Otherwise the verication fails and the
verier outputs failure.
Note that in the TagGen function, we make all the blocks
distinct by adding random numbers in blocks with the same
value. If the server still tries to save its storage space, then
the only way is by breaking the prime factorization of N,
or equally, getting a multiple of (N). The hardness of
breaking large number factorization makes the proposed
protocol secure against the untrusted server. We put the
formal analysis of the proposed protocol in Section IV.
1
A simple way to compute g is to let g = b
2
, in which b
R
Z

N
and
gcd(b 1, N) = 1.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
4
IV. CORRECTNESS AND SECURITY ANALYSIS
In this section, we rst show that the proposed protocol is
correct in the sense that the server can pass the verication
of data integrity as long as both the client and the server
are honest. Then we show that the protocol is secure against
the untrusted server. These two theorems together guarantee
that, assuming the client is honest, if and only if the server
has access to the complete and uncorrupted data, it can pass
the verication process successfully. Finally we show that
the proposed protocol is private against third party veriers.
Theorem 1. If both the client and the server are honest,
then the server can pass the verication successfully.
Proof: We prove this theorem by showing that R and
R

should be equal if all the data blocks are kept completely


at the server. From the TagGen(m) function, we get that
D
i
= (g
m
i
) mod N, i [1, n]. Then we get
P =
n

i=1
(D
a
i
i
mod N) mod N = g

n
i=1
a
i
m
i
mod N
Then
R

= P
s
mod N = g
s

n
i=1
a
i
m
i
mod N
= g
s

n
i=1
a
i
m
i
mod N = R
This completes the proof.
Before we proceed to Theorem 2, we rst review the
KEA1-r assumption, which has been investigated in [17],
[18], and adapted to the RSA setting in [3].
Denition 4. KEA1-r(Knowledge of Exponent Assumption
[3]). For any adversary A taking input (N, g, g
s
) and
returning (C, Y ) with Y = C
s
, there exists extractor

A, which given the same input as A returns c such that


C = g
c
.
Our proof of Theorem 2 needs a lemma from [19]
on solving prime factorization when a multiple of

(n)
is known. Denote the prime factorization of N by
p
1
v
1
p
t
v
t
. Then

(n) = lcm(p
1
1, , p
t
1), in
which lcm denotes least common multiple.
Lemma 1. [19] Let g be any function that satises the
following two conditions: 1)

(n)[g(n), 2)[g(n)[ = O([n[


k
)
for some constant k. Then prime factorization is polyno-
mial time reducible to g. Furthermore, the cost of solving
the prime factorization of n is O([n[
k+4
M([n[)), in which
M([n[) denotes the cost of multiplying two integers of
binary length [n[.
Theorem 2. Under the KEA1-r and the large integer
factorization assumptions, the proposed protocol is secure
against the untrusted server.
Proof: Just as in the security formulation, we denote
the adversary by / and the challenger by B. What we want
to prove is that for any TTT adversary who wins the data
possession game on some le blocks, the challenger can
construct a knowledge extractor c that extracts these le
blocks. Equivalently, if c cannot extract these le blocks,
the challenger can break the integer factorization problem.
For the large integer factorization problem, B is given a
large integer N, which is product of two large primes p and
q. Here p = 2p

+ 1 and q = 2q

+ 1. B tries to solve the


prime factorization of N.
B simulates the protocol environment for / with the
following steps:
Setup: B generates a random generator of QR
N
.
Denote the generator by g. B sends pk = (N, g) to
/.
Query: / adaptively selects some le blocks m
i
, i =
1, 2, ..., n and queries the verication tags from B. B
computes a verication tag D
i
= g
m
i
mod N for each
of these blocks and sends D
i
, i = 1, 2, ..., n to /.
Let T
m
= D
1
, D
2
, ..., D
n
. T
m
is made publicly
known to everyone.
Challenge: B generates a chal for the le blocks
m
1
, m
2
, ..., m
n
and sends it to /. The generation
method is the same with that in the Challenge function
described in section III. Let chal = r, g
s
.
Forge: / computes a response R to prove the integrity
of the requested le blocks.
If CheckProof(pk, T
m
, chal, R) = success, then the
adversary has won the game. Note that / is given (N, g, g
s
)
as input, and outputs R = g
s

n
i=1
a
i
m
i
mod N, in which
a
i
= f
r
(i) for i [1, n]. Because / can naturally computes
P = g

n
i=1
a
i
m
i
mod N from T
m
, P is also treated as
/s output. So / is given (N, g, g
s
) as input, and outputs
(R, P) that satises R = P
s
. From the KEA1-r assumption,
B can construct an extractor

A, which given the same input
as /, outputs c which satises P = g
c
mod N. As P =
g

n
i=1
a
i
m
i
mod N, B extracts c =

n
i=1
a
i
m
i
mod p

.
Now B generates n challenges r
1
, g
s
1
, r
2
, g
s
2
, ...,
r
n
, g
s
n
using the method described in section III. B
computes a
j
i
= f
r
j
(i) for i [1, n] and j [1, n]. Because
r
1
, r
2
, ..., r
n
are chosen by B, now B chooses them so
that a
j
1
, a
j
2
, ..., a
j
n
, j = 1, 2, ..., n satisfy the following
equation:
det

a
1
1
a
1
2
... a
1
n
a
2
1
a
2
2
... a
2
n
.
.
.
.
.
.
.
.
.
.
.
.
a
n
1
a
n
2
... a
n
n

,= 0. (1)
Here det

denotes the determinant of a matrix. B


challenges / for n times. On the jth time, B challenges
/ with r
j
, g
s
j
. From the response of /, B extracts
c
j
= a
j
1
m
1
+ a
j
2
m
2
+ + a
j
n
m
n
mod p

.
When equation (1) holds, the following system of linear
equations has a unique solution.

a
1
1
m
1
+ a
1
2
m
2
+ ... + a
1
n
m
n
= c
1
mod p

,
a
2
1
m
1
+ a
2
2
m
2
+ ... + a
2
n
m
n
= c
2
mod p

,
.
.
.
a
n
1
m
1
+ a
n
2
m
2
+ ... + a
n
n
m
n
= c
n
mod p

.
(2)
By solving the above equation set, for each le block
m
i
, i = 1, 2, ..., n, B gets m

i
which satises m

i
=
m
i
mod p

. If for each m

i
, i = 1, 2, ..., n, m

i
=
m
i
, then B has successfully extracted all the le blocks
m
i
, i = 1, 2, ..., n. However, if for some i, m

i
,= m
i
,
then we show that B can successfully compute the prime
factorization of N. Without loss of generality, we assume
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
5
m
i
is larger than m

i
. Then B can get a multiple of (N)
from m

i
= m
i
mod p

, which we denote by k
1
(N).
From Lemma 1, B can solve the prime factorization of N
with the cost of O(([k
1
[ + [(N)[)[N[
4
M([N[)). Because
/ is a TTT adversary, the length of k
1
is bounded by
O([N[
k
2
) for some constant k
2
. From the above we can
see that if any le block cannot be extracted, then B can
construct a knowledge extractor c to extract the prime
factorization of N in probabilistic polynomial time.
In conclusion, under the KEA1-r and large integer factor-
ization assumptions, the proposed protocol guarantees the
data integrity against an untrusted server.
Theorem 3. (Privacy against Third Party Veriers) Under
the semi-honest model [16], a third party verier cannot get
any information about the clients data m from the protocol
execution. Hence, the protocol is private against third party
veriers.
Proof: (Sketch) In this proof we construct a simulator
for the view of the verier, and show that the output of
the simulator is computationally indistinguishable with the
view of the verier. Due to space limitation, we put the
detailed proof in the full version of this paper [20].
V. DATA DYNAMICS
The proposed protocol supports data dynamics at the
block level in the same way as [1]. In the following we
show how our protocol supports block modication. Due to
space limitation, we describe the support of block insertion
and block deletion in the full version [20].
Block Modication. Assume that the client wants
to modify the ith block m
i
of her le. Denote the
modied data block by m

i
. Then the server updates
m
i
to m

i
. Next the client computes a new block tag
for the updated block, i.e., D

i
= g
m

i
mod N.
From the above we can see that the correspondence
relationship between the block and the digest does not
change after the data updating, i.e., D
i
= g
m
i
mod N, i =
1, 2, ..., ,[m[/l|. So the data integrity is still protected.
If the client wants to make sure that the le has really
been updated, she can launch a proof request immediately
by sending a challenge to the server. Any block that is
updated is given a novel random number, so that each block
remains unique. Therefore, the server cannot delete any
block without being detected.
VI. COMPLEXITY ANALYSIS AND EXPERIMENTAL
RESULTS
In this section, we rst present a complexity analysis of
the communication, computation and storage costs of the
proposed protocol. After that, we present the experimental
results.
A. Communication, Computation and Storage Costs
The communication, computation and storage costs of the
client, the server and the verier are analyzed and shown
in Table II. The detailed analyses are omitted due to space
limitation. Interested readers can refer to [20] for them.
For the data dynamics, the computation cost for block
insertion or block modication is just one modular expo-
nentiation, which is T
exp
([N[, N).
Storage of the Block Tags. Because the n block tags
D
1
, D
2
, ..., D
n
are made publicly known to everyone, they
can be stored at the server, the client or the verier. If
the tags are stored at the server, then traditional integrity
protection methods such as digital signatures can be used
to protect them from being tampered with by the server.
The storage cost of the block tags is upper bounded
by ,[m[/l|[N[ bits. In this case, when the data integrity
checking is performed, the tags are transmitted back to the
verier from the server. As the tags have been signed by
the clients private key, the server cannot tamper with them.
This will incur communication costs that are linear to the
number of blocks. However, because the tags are relatively
small compared with the original data, the incurred com-
munication costs are acceptable with respect to all the good
features the proposed protocol has. If the tags are stored at
the verier or the client, then these communication costs are
mitigated. However, this will cause a storage cost of O(n)
at the verier or the client, which is the same as Sebe et
al.s protocol [1].
B. Experimental Results
In the experiment, we measure the computation costs at
the verier and the server when the le length is xed and
the block size is changed. The results are shown in Table
III. After that, we measure the computation costs when the
le length changes and the block size is xed. The results
are shown in Table IV. We also measure the clients pre-
processing costs, which are shown in Table V. We choose
k = d = 128. The proposed protocol is implemented on
a laptop with Intel Core2 Duo 2.00GHz CPU and 1.99GB
memory. All the programs are written in the C++ language
with the assistance of MIRACL library [21].
From Table III we can see that when the le length
is 2
25
bits (4MB) and the block size is 2
18
bits (32KB),
the computation cost at the verier is 173.39 ms, and the
computation cost at the server is 2304.39 ms.
From Table IV, we can see that the computation cost
at the server doesnt increase much when the le length
increases. But the computation cost at the verier increases
nearly proportionally with the increasing le length. We
note that this is consistent with the theoretical analysis in
Section VI-A. When the le length increases, the times of
exponential operations at the verier increase proportion-
ally. However, the server performs only one exponential
operation no matter how large the le is. Therefore, the
servers burden is not increased much with larger les. As
to the veriers load, our scheme can be efcient when
the le length is not huge. However, when the le length
is very large, our protocol can be easily extended into
a probabilistic one by using the probabilistic framework
proposed in [3]. In that case, the extended protocol provides
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
6
TABLE II: Upper Bounds of the Communication, Computation and Storage Costs at the Client, the Server and the Verier.
Client Server Verier
Communication k + 2[N[
Computation ,[m[/l|T
exp
([N[, N)
1
,[m[/l| T
prng
+
T
exp
([,[m[/l|[ + d + l, N)+
d,[m[/l|T
add
([,[m[/l|[ + d + l)
(,[m[/l| + 2) T
prng
+
2T
exp
([N[, N) +,[m[/l|T
exp
(d, N)
+[N[,[m[/l|T
add
([N[)
Storage
2
(,[m[/l| + 2)[N[ +[p[ +[q[ [m[ (,[m[/l| + 2)[N[
1
This is the pre-processing cost at the client during the execution of the SetUp function.
2
The storage costs are analyzed when the tags are stored at both the client and the verier.
3
n = |m|/l is the block number. T
exp
(len, num) is the time cost of computing a modular exponentiation with a len-bit exponent modular num.
T
prng
is the time cost of generating a pseudo-random number and T
add
(len) is the time cost of adding two len-bit numbers.
probabilistic data possession guarantee while its other good
features are still kept.
TABLE III: Computation Costs at the Verier and the Server with
|N| = 1024 and |m| = 4MB.
l(bits) Verier (ms) Server (ms)
65,536(2
16
) 653.37 591.1
131,072(2
17
) 328.81 1161.1
262,144(2
18
) 173.39 2304.39
524,288(2
19
) 95.46 4558.67
TABLE IV: Computation Costs at the Verier and the Server with
Different File Lengths and Fixed Block Size.
File Length l(bits) Verier (ms) Server (ms)
1MB 65,536(2
16
) 176.24 568.39
2MB 65,536(2
16
) 332.55 574.2
4MB 65,536(2
16
) 653.37 591.1
8MB 65,536(2
16
) 1281.79 618.04
TABLE V: The Clients Pre-processing Time with Different Block
Lengths when |N| = 1024 and |m| = 4MB.
l (bits) Block number Processing time (ms)
65,536(2
16
) 512 4,765
131,072(2
17
) 256 2,477
262,144(2
18
) 128 1,328
524,288(2
19
) 64 755
VII. CONCLUSIONS AND FUTURE WORK
In this paper we propose a new remote data integrity
checking protocol for cloud storage. The proposed protocol
is suitable for providing integrity protection of customers
important data. The proposed protocol supports data in-
sertion, modication and deletion at the block level, and
also supports public veriability. The proposed protocol
is proved to be secure against an untrusted server. It is
also private against third party veriers. Both theoretical
analysis and experimental results demonstrate that the pro-
posed protocol has very good efciency in the aspects of
communication, computation and storage costs.
Currently we are still working on extending the protocol
to support data level dynamics. The difculty is that there
is no clear mapping relationship between the data and the
tags. In the current construction, data level dynamics can be
supported by using block level dynamics. Whenever a piece
of data is modied, the corresponding blocks and tags are
updated. However, this can bring unnecessary computation
and communication costs. We aim to achieve data level
dynamics at minimal costs in our future work.
REFERENCES
[1] F. Sebe, J. Domingo-Ferrer, A. Martinez-Balleste, Y. Deswarte, and
J.-J. Quisquater, Efcient remote data possession checking in critical
information infrastructures, IEEE Trans. on Knowledge and Data
Engineering, vol. 20, pp. 1034 1038, aug. 2008.
[2] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic,
Cloud computing and emerging IT platforms: Vision, hype, and
reality for delivering computing as the 5th utility, Future Generation
Computer Systems, vol. 25, no. 6, pp. 599 616, 2009.
[3] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Pe-
terson, and D. Song, Provable data possession at untrusted stores,
in CCS07, (New York, NY, USA), pp. 598609, ACM, 2007.
[4] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, MR-PDP:
Multiple-Replica Provable Data Possession, in ICDCS08, IEEE.
[5] G. Ateniese, R. Di Pietro, L. V. Mancini, and G. Tsudik, Scalable
and efcient provable data possession, in SecureComm08, ACM.
[6] C. Erway, A. K upc u, C. Papamanthou, and R. Tamassia, Dynamic
provable data possession, in CCS09, pp. 213222, ACM, 2009.
[7] C. Wang, Q. Wang, K. Ren, and W. Lou, Ensuring data storage
security in cloud computing, in IWQoS09, pp. 1 9, july 2009.
[8] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, Enabling public ver-
iability and data dynamics for storage security in cloud computing,
in 14th ESORICS, Springer, September 2009.
[9] C. Wang, Q. Wang, K. Ren, and W. Lou, Privacy-preserving
public auditing for data storage security in cloud computing, in
InfoCom2010, IEEE, March 2010.
[10] Y. Deswarte and J.-J. Quisquater, Remote Integrity Checking, in
IICIS04, pp. 111, Kluwer Academic Publishers, 1 2004.
[11] D. L. G. Filho and P. S. L. M. Barreto, Demonstrating data
possession and uncheatable data transfer. Cryptology ePrint Archive,
Report 2006/150, 2006. http://eprint.iacr.org/.
[12] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, Auditing
to keep online storage services honest, in HotOS XI., Usenix, 2007.
[13] C. Wang, S. S.-M. Chow, Q. Wang, K. Ren, and W. Lou, Privacy-
preserving public auditing for secure cloud storage. Cryptology
ePrint Archive, Report 2009/579, 2009. http://eprint.iacr.org/.
[14] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau,
Cooperative provable data possession. Cryptology ePrint Archive,
Report 2010/234, 2010. http://eprint.iacr.org/.
[15] Z. Hao and N. Yu, A multiple-replica remote data possession
checking protocol with public veriability, in ISDPE2010, IEEE.
[16] O. Goldreich, Foundations of Cryptography. Cambridge University
Press, 2004.
[17] I. Damg ard, Towards practical public key systems secure against
chosen ciphertext attacks, in CRYPTO91, Springer-Verlag, 1992.
[18] M. Bellare and A. Palacio, The knowledge-of-exponent assumptions
and 3-round zero-knowledge protocols, in CRYPTO04, pp. 273
289, Springer, 2004.
[19] G. L. Miller, Riemanns hypothesis and tests for primality, in
STOC75, (New York, NY, USA), pp. 234239, ACM, 1975.
[20] Z. Hao, S. Zhong, and N. Yu, A privacy-preserving remote data
integrity checking protocol with data dynamics and public veriabil-
ity, SUNY Buffalo CSE department technical report 2010-11, 2010.
http://www.cse.buffalo.edu/tech-reports/2010-11.pdf.
[21] Multiprecision Integer and Rational Arithmetic C/C++ Library.
http://www.shamus.ie/.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

Você também pode gostar