Escolar Documentos
Profissional Documentos
Cultura Documentos
1
(x, y)
(resp., view
2
(x, y)). view
1
(x, y) (resp., view
2
(x, y)) in-
cludes (x, r
1
, m
1
1
, ..., m
1
t
) (resp., (x, r
2
, m
2
1
, ..., m
2
t
)) where
r
1
(resp., r
2
) represents the outcome of the rst (resp.,
second) partys internal coin tosses, and m
1
i
(resp., m
2
i
)
represents the ith message it has received. Denote the output
of the rst (resp., second) party during the execution of
on (x, y) by output
1
(x, y) (resp., output
2
(x, y)), which is
implicit in the partys own view of the execution. We denote
the verier and the server by 1 and T respectively.
Denition 2. (Privacy against Semi-Honest Behavior [16])
For a deterministic functionality f, is said to privately
compute f if there exist probabilistic polynomial time
algorithms, denoted S
1
and S
2
, such that
S
1
(x, f
1
(x, y))
x,y{0,1}
c
view
1
(x, y)
x,y{0,1}
,
S
2
(y, f
2
(x, y))
x,y{0,1}
c
view
2
(x, y)
x,y{0,1}
.
Note that
c
denotes computational indistinguishability.
From Denition 2 we dene the privacy against third
party veriers, which is given in Denition 3.
Denition 3. (Privacy against Third Party Veriers) For the
remote data integrity checking protocol , if there exists a
TTT simulator o
V
such that
o
V
(x, f
V
(x, y))
x,y{0,1}
c
view
V
(x, y)
x,y{0,1}
,
then is a protocol that ensures privacy against third-party
veriers.
Data Dynamics at Block Level Data dynamics means
after clients store their data at the remote server, they
can dynamically update their data at later times. At the
block level, the main operations are block insertion, block
modication and block deletion. Moreover, when data is
updated, the verication metadata also needs to be updated.
The updating overhead should be made as small as possible.
Homomorphic Veriable Tags In our construction,
we use a RSA-based homomorphic veriable tags (HVT)
[3], which is dened as follows. Let N = pq be one
publicly known RSA modulus. We know that e : e
Z
N
and gcd(e, N) = 1 forms a multiplicative group.
Denote this group by Z
N
. Denote an element in Z
N
with
a large order by g. The RSA-based HVT for message m
i
is dened as Tag(m
i
) = g
m
i
mod N. Its homomorphic
property can be deduced from its denition. When Tag(m
i
)
and Tag(m
j
) are tags of m
i
and m
j
respectively, the tag for
m
i
+m
j
can be generated by computing Tag(m
i
+m
j
) =
Tag(m
i
) Tag(m
j
) = g
m
i
+m
j
mod N.
III. THE PROPOSED REMOTE DATA INTEGRITY
CHECKING PROTOCOL
In this section we describe the proposed remote data
integrity checking protocol. Just as formulated in Section
II, the proposed protocol has functions SetUp, TagGen,
Challenge, GenProof and CheckProof, as well as functions
for data dynamics. In the following we present the former
ve functions of the proposed protocol. We leave the
functions for data dynamics to Section V.
SetUp(1
k
) (pk, sk): Let N = pq be one publicly
known RSA modulus, in which p = 2p
+ 1, q = 2q
+ 1
are two large primes. p
and q
, the
order of g is also p
n
i=1
a
i
m
i
mod N,
and sends R to the verier.
CheckProof(pk, T
m
, chal, R) success, failure:
When the verier receives R from the server, she computes
a
i
i=1,...,n
as the server does in the GenProof step. Then
the verier computes P and R
as follows:
P =
n
i=1
(D
a
i
i
mod N) mod N
R
= P
s
mod N
After that the verier checks whether R
= R. If R
= R,
output success. Otherwise the verication fails and the
verier outputs failure.
Note that in the TagGen function, we make all the blocks
distinct by adding random numbers in blocks with the same
value. If the server still tries to save its storage space, then
the only way is by breaking the prime factorization of N,
or equally, getting a multiple of (N). The hardness of
breaking large number factorization makes the proposed
protocol secure against the untrusted server. We put the
formal analysis of the proposed protocol in Section IV.
1
A simple way to compute g is to let g = b
2
, in which b
R
Z
N
and
gcd(b 1, N) = 1.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
4
IV. CORRECTNESS AND SECURITY ANALYSIS
In this section, we rst show that the proposed protocol is
correct in the sense that the server can pass the verication
of data integrity as long as both the client and the server
are honest. Then we show that the protocol is secure against
the untrusted server. These two theorems together guarantee
that, assuming the client is honest, if and only if the server
has access to the complete and uncorrupted data, it can pass
the verication process successfully. Finally we show that
the proposed protocol is private against third party veriers.
Theorem 1. If both the client and the server are honest,
then the server can pass the verication successfully.
Proof: We prove this theorem by showing that R and
R
i=1
(D
a
i
i
mod N) mod N = g
n
i=1
a
i
m
i
mod N
Then
R
= P
s
mod N = g
s
n
i=1
a
i
m
i
mod N
= g
s
n
i=1
a
i
m
i
mod N = R
This completes the proof.
Before we proceed to Theorem 2, we rst review the
KEA1-r assumption, which has been investigated in [17],
[18], and adapted to the RSA setting in [3].
Denition 4. KEA1-r(Knowledge of Exponent Assumption
[3]). For any adversary A taking input (N, g, g
s
) and
returning (C, Y ) with Y = C
s
, there exists extractor
(n)
is known. Denote the prime factorization of N by
p
1
v
1
p
t
v
t
. Then
(n) = lcm(p
1
1, , p
t
1), in
which lcm denotes least common multiple.
Lemma 1. [19] Let g be any function that satises the
following two conditions: 1)
+ 1 and q = 2q
n
i=1
a
i
m
i
mod N, in which
a
i
= f
r
(i) for i [1, n]. Because / can naturally computes
P = g
n
i=1
a
i
m
i
mod N from T
m
, P is also treated as
/s output. So / is given (N, g, g
s
) as input, and outputs
(R, P) that satises R = P
s
. From the KEA1-r assumption,
B can construct an extractor
A, which given the same input
as /, outputs c which satises P = g
c
mod N. As P =
g
n
i=1
a
i
m
i
mod N, B extracts c =
n
i=1
a
i
m
i
mod p
.
Now B generates n challenges r
1
, g
s
1
, r
2
, g
s
2
, ...,
r
n
, g
s
n
using the method described in section III. B
computes a
j
i
= f
r
j
(i) for i [1, n] and j [1, n]. Because
r
1
, r
2
, ..., r
n
are chosen by B, now B chooses them so
that a
j
1
, a
j
2
, ..., a
j
n
, j = 1, 2, ..., n satisfy the following
equation:
det
a
1
1
a
1
2
... a
1
n
a
2
1
a
2
2
... a
2
n
.
.
.
.
.
.
.
.
.
.
.
.
a
n
1
a
n
2
... a
n
n
,= 0. (1)
Here det
.
When equation (1) holds, the following system of linear
equations has a unique solution.
a
1
1
m
1
+ a
1
2
m
2
+ ... + a
1
n
m
n
= c
1
mod p
,
a
2
1
m
1
+ a
2
2
m
2
+ ... + a
2
n
m
n
= c
2
mod p
,
.
.
.
a
n
1
m
1
+ a
n
2
m
2
+ ... + a
n
n
m
n
= c
n
mod p
.
(2)
By solving the above equation set, for each le block
m
i
, i = 1, 2, ..., n, B gets m
i
which satises m
i
=
m
i
mod p
. If for each m
i
, i = 1, 2, ..., n, m
i
=
m
i
, then B has successfully extracted all the le blocks
m
i
, i = 1, 2, ..., n. However, if for some i, m
i
,= m
i
,
then we show that B can successfully compute the prime
factorization of N. Without loss of generality, we assume
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
5
m
i
is larger than m
i
. Then B can get a multiple of (N)
from m
i
= m
i
mod p
, which we denote by k
1
(N).
From Lemma 1, B can solve the prime factorization of N
with the cost of O(([k
1
[ + [(N)[)[N[
4
M([N[)). Because
/ is a TTT adversary, the length of k
1
is bounded by
O([N[
k
2
) for some constant k
2
. From the above we can
see that if any le block cannot be extracted, then B can
construct a knowledge extractor c to extract the prime
factorization of N in probabilistic polynomial time.
In conclusion, under the KEA1-r and large integer factor-
ization assumptions, the proposed protocol guarantees the
data integrity against an untrusted server.
Theorem 3. (Privacy against Third Party Veriers) Under
the semi-honest model [16], a third party verier cannot get
any information about the clients data m from the protocol
execution. Hence, the protocol is private against third party
veriers.
Proof: (Sketch) In this proof we construct a simulator
for the view of the verier, and show that the output of
the simulator is computationally indistinguishable with the
view of the verier. Due to space limitation, we put the
detailed proof in the full version of this paper [20].
V. DATA DYNAMICS
The proposed protocol supports data dynamics at the
block level in the same way as [1]. In the following we
show how our protocol supports block modication. Due to
space limitation, we describe the support of block insertion
and block deletion in the full version [20].
Block Modication. Assume that the client wants
to modify the ith block m
i
of her le. Denote the
modied data block by m
i
. Then the server updates
m
i
to m
i
. Next the client computes a new block tag
for the updated block, i.e., D
i
= g
m
i
mod N.
From the above we can see that the correspondence
relationship between the block and the digest does not
change after the data updating, i.e., D
i
= g
m
i
mod N, i =
1, 2, ..., ,[m[/l|. So the data integrity is still protected.
If the client wants to make sure that the le has really
been updated, she can launch a proof request immediately
by sending a challenge to the server. Any block that is
updated is given a novel random number, so that each block
remains unique. Therefore, the server cannot delete any
block without being detected.
VI. COMPLEXITY ANALYSIS AND EXPERIMENTAL
RESULTS
In this section, we rst present a complexity analysis of
the communication, computation and storage costs of the
proposed protocol. After that, we present the experimental
results.
A. Communication, Computation and Storage Costs
The communication, computation and storage costs of the
client, the server and the verier are analyzed and shown
in Table II. The detailed analyses are omitted due to space
limitation. Interested readers can refer to [20] for them.
For the data dynamics, the computation cost for block
insertion or block modication is just one modular expo-
nentiation, which is T
exp
([N[, N).
Storage of the Block Tags. Because the n block tags
D
1
, D
2
, ..., D
n
are made publicly known to everyone, they
can be stored at the server, the client or the verier. If
the tags are stored at the server, then traditional integrity
protection methods such as digital signatures can be used
to protect them from being tampered with by the server.
The storage cost of the block tags is upper bounded
by ,[m[/l|[N[ bits. In this case, when the data integrity
checking is performed, the tags are transmitted back to the
verier from the server. As the tags have been signed by
the clients private key, the server cannot tamper with them.
This will incur communication costs that are linear to the
number of blocks. However, because the tags are relatively
small compared with the original data, the incurred com-
munication costs are acceptable with respect to all the good
features the proposed protocol has. If the tags are stored at
the verier or the client, then these communication costs are
mitigated. However, this will cause a storage cost of O(n)
at the verier or the client, which is the same as Sebe et
al.s protocol [1].
B. Experimental Results
In the experiment, we measure the computation costs at
the verier and the server when the le length is xed and
the block size is changed. The results are shown in Table
III. After that, we measure the computation costs when the
le length changes and the block size is xed. The results
are shown in Table IV. We also measure the clients pre-
processing costs, which are shown in Table V. We choose
k = d = 128. The proposed protocol is implemented on
a laptop with Intel Core2 Duo 2.00GHz CPU and 1.99GB
memory. All the programs are written in the C++ language
with the assistance of MIRACL library [21].
From Table III we can see that when the le length
is 2
25
bits (4MB) and the block size is 2
18
bits (32KB),
the computation cost at the verier is 173.39 ms, and the
computation cost at the server is 2304.39 ms.
From Table IV, we can see that the computation cost
at the server doesnt increase much when the le length
increases. But the computation cost at the verier increases
nearly proportionally with the increasing le length. We
note that this is consistent with the theoretical analysis in
Section VI-A. When the le length increases, the times of
exponential operations at the verier increase proportion-
ally. However, the server performs only one exponential
operation no matter how large the le is. Therefore, the
servers burden is not increased much with larger les. As
to the veriers load, our scheme can be efcient when
the le length is not huge. However, when the le length
is very large, our protocol can be easily extended into
a probabilistic one by using the probabilistic framework
proposed in [3]. In that case, the extended protocol provides
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
6
TABLE II: Upper Bounds of the Communication, Computation and Storage Costs at the Client, the Server and the Verier.
Client Server Verier
Communication k + 2[N[
Computation ,[m[/l|T
exp
([N[, N)
1
,[m[/l| T
prng
+
T
exp
([,[m[/l|[ + d + l, N)+
d,[m[/l|T
add
([,[m[/l|[ + d + l)
(,[m[/l| + 2) T
prng
+
2T
exp
([N[, N) +,[m[/l|T
exp
(d, N)
+[N[,[m[/l|T
add
([N[)
Storage
2
(,[m[/l| + 2)[N[ +[p[ +[q[ [m[ (,[m[/l| + 2)[N[
1
This is the pre-processing cost at the client during the execution of the SetUp function.
2
The storage costs are analyzed when the tags are stored at both the client and the verier.
3
n = |m|/l is the block number. T
exp
(len, num) is the time cost of computing a modular exponentiation with a len-bit exponent modular num.
T
prng
is the time cost of generating a pseudo-random number and T
add
(len) is the time cost of adding two len-bit numbers.
probabilistic data possession guarantee while its other good
features are still kept.
TABLE III: Computation Costs at the Verier and the Server with
|N| = 1024 and |m| = 4MB.
l(bits) Verier (ms) Server (ms)
65,536(2
16
) 653.37 591.1
131,072(2
17
) 328.81 1161.1
262,144(2
18
) 173.39 2304.39
524,288(2
19
) 95.46 4558.67
TABLE IV: Computation Costs at the Verier and the Server with
Different File Lengths and Fixed Block Size.
File Length l(bits) Verier (ms) Server (ms)
1MB 65,536(2
16
) 176.24 568.39
2MB 65,536(2
16
) 332.55 574.2
4MB 65,536(2
16
) 653.37 591.1
8MB 65,536(2
16
) 1281.79 618.04
TABLE V: The Clients Pre-processing Time with Different Block
Lengths when |N| = 1024 and |m| = 4MB.
l (bits) Block number Processing time (ms)
65,536(2
16
) 512 4,765
131,072(2
17
) 256 2,477
262,144(2
18
) 128 1,328
524,288(2
19
) 64 755
VII. CONCLUSIONS AND FUTURE WORK
In this paper we propose a new remote data integrity
checking protocol for cloud storage. The proposed protocol
is suitable for providing integrity protection of customers
important data. The proposed protocol supports data in-
sertion, modication and deletion at the block level, and
also supports public veriability. The proposed protocol
is proved to be secure against an untrusted server. It is
also private against third party veriers. Both theoretical
analysis and experimental results demonstrate that the pro-
posed protocol has very good efciency in the aspects of
communication, computation and storage costs.
Currently we are still working on extending the protocol
to support data level dynamics. The difculty is that there
is no clear mapping relationship between the data and the
tags. In the current construction, data level dynamics can be
supported by using block level dynamics. Whenever a piece
of data is modied, the corresponding blocks and tags are
updated. However, this can bring unnecessary computation
and communication costs. We aim to achieve data level
dynamics at minimal costs in our future work.
REFERENCES
[1] F. Sebe, J. Domingo-Ferrer, A. Martinez-Balleste, Y. Deswarte, and
J.-J. Quisquater, Efcient remote data possession checking in critical
information infrastructures, IEEE Trans. on Knowledge and Data
Engineering, vol. 20, pp. 1034 1038, aug. 2008.
[2] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic,
Cloud computing and emerging IT platforms: Vision, hype, and
reality for delivering computing as the 5th utility, Future Generation
Computer Systems, vol. 25, no. 6, pp. 599 616, 2009.
[3] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Pe-
terson, and D. Song, Provable data possession at untrusted stores,
in CCS07, (New York, NY, USA), pp. 598609, ACM, 2007.
[4] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, MR-PDP:
Multiple-Replica Provable Data Possession, in ICDCS08, IEEE.
[5] G. Ateniese, R. Di Pietro, L. V. Mancini, and G. Tsudik, Scalable
and efcient provable data possession, in SecureComm08, ACM.
[6] C. Erway, A. K upc u, C. Papamanthou, and R. Tamassia, Dynamic
provable data possession, in CCS09, pp. 213222, ACM, 2009.
[7] C. Wang, Q. Wang, K. Ren, and W. Lou, Ensuring data storage
security in cloud computing, in IWQoS09, pp. 1 9, july 2009.
[8] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, Enabling public ver-
iability and data dynamics for storage security in cloud computing,
in 14th ESORICS, Springer, September 2009.
[9] C. Wang, Q. Wang, K. Ren, and W. Lou, Privacy-preserving
public auditing for data storage security in cloud computing, in
InfoCom2010, IEEE, March 2010.
[10] Y. Deswarte and J.-J. Quisquater, Remote Integrity Checking, in
IICIS04, pp. 111, Kluwer Academic Publishers, 1 2004.
[11] D. L. G. Filho and P. S. L. M. Barreto, Demonstrating data
possession and uncheatable data transfer. Cryptology ePrint Archive,
Report 2006/150, 2006. http://eprint.iacr.org/.
[12] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, Auditing
to keep online storage services honest, in HotOS XI., Usenix, 2007.
[13] C. Wang, S. S.-M. Chow, Q. Wang, K. Ren, and W. Lou, Privacy-
preserving public auditing for secure cloud storage. Cryptology
ePrint Archive, Report 2009/579, 2009. http://eprint.iacr.org/.
[14] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau,
Cooperative provable data possession. Cryptology ePrint Archive,
Report 2010/234, 2010. http://eprint.iacr.org/.
[15] Z. Hao and N. Yu, A multiple-replica remote data possession
checking protocol with public veriability, in ISDPE2010, IEEE.
[16] O. Goldreich, Foundations of Cryptography. Cambridge University
Press, 2004.
[17] I. Damg ard, Towards practical public key systems secure against
chosen ciphertext attacks, in CRYPTO91, Springer-Verlag, 1992.
[18] M. Bellare and A. Palacio, The knowledge-of-exponent assumptions
and 3-round zero-knowledge protocols, in CRYPTO04, pp. 273
289, Springer, 2004.
[19] G. L. Miller, Riemanns hypothesis and tests for primality, in
STOC75, (New York, NY, USA), pp. 234239, ACM, 1975.
[20] Z. Hao, S. Zhong, and N. Yu, A privacy-preserving remote data
integrity checking protocol with data dynamics and public veriabil-
ity, SUNY Buffalo CSE department technical report 2010-11, 2010.
http://www.cse.buffalo.edu/tech-reports/2010-11.pdf.
[21] Multiprecision Integer and Rational Arithmetic C/C++ Library.
http://www.shamus.ie/.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.