International Journal of Research ISSN NO:2236-6124

A Framework for Multi-Keyword Search on Outsourced Encrypted Cloud Data

Dr. D. William Albert (1), Dr. D. Veerabhadra Babu (2), B. Lingaiah (3)

(1) Professor in CSE Department, Bharath College of Engg & Tech for Women, Kadapa,
dr.albertdwgtl@gmail.com
(2) Associate Professor-IT, KL University, Vijayawada, iNurture Education Solutions Private
Limited, Bangalore, veerabhadra.b@inurture.co.in
(3) IT-Faculty, KL University, Vijayawada, iNurture Education Solutions Private Limited,
Bangalore, lingaiah.b@inurture.co.in

Abstract

Mobile cloud storage (MCS) denotes a family of increasingly popular online services and
even acts as the primary file storage for mobile devices. MCS enables mobile device users
to store and retrieve files or data on the cloud through wireless communication, which
improves data availability and facilitates file sharing without draining local mobile device
resources. MCS poses new challenges for traditional encrypted search schemes because of
the limited computing and battery capacities of mobile devices and because data is shared
and accessed over wireless communication. Therefore, a suitable and efficient encrypted
search scheme is necessary for MCS. In this research we propose a new architectural
framework that secures cloud storage and supports multi-keyword search on encrypted
cloud data. To achieve this, we propose the Multi-Keyword Ranked Search Algorithm. We
built a prototype application as a proof of concept and compared our empirical results with
state-of-the-art algorithms. In these experiments the proposed framework matched or
exceeded the throughput of the existing schemes.

Index Terms – Mobile cloud storage, searchable data encryption, energy efficiency, traffic
efficiency

Volume 7, Issue XII, December/2018 Page No:1779



1. INTRODUCTION

A cloud storage system is a service model in which data are maintained, managed, and
backed up remotely on the cloud side while remaining available to users over a network.
Mobile Cloud Storage (MCS) [1] [2] denotes a family of increasingly popular online
services and even acts as the primary file storage for mobile devices [3]. MCS enables
mobile device users to store and retrieve files or data on the cloud through wireless
communication, which improves data availability and facilitates file sharing without
draining local mobile device resources [4].

Data privacy is paramount in a cloud storage system, so the owner encrypts sensitive data
before outsourcing it to the cloud, and data users retrieve the data of interest through an
encrypted search scheme. In MCS, modern mobile devices face many of the same security
threats as PCs, and various traditional data encryption methods have been adopted in MCS
[5], [6]. However, mobile cloud storage poses new challenges for traditional encrypted
search schemes because of the limited computing and battery capacities of mobile devices
and because data is shared and accessed over wireless communication. Therefore, a suitable
and efficient encrypted search scheme is necessary for MCS.

Generally speaking, an encrypted search scheme for mobile cloud storage needs to be
efficient in both bandwidth and energy, because battery life is limited and network traffic
is charged for. Therefore, we focus on the design of a mobile cloud scheme that is efficient
in terms of both energy consumption and network traffic while still meeting the data
security requirements over wireless communication channels. Our contribution in this paper
is a framework that enables user-friendly multi-keyword ranked search over encrypted data.
An empirical study was carried out with a prototype, and the results are compared with the
state of the art. The remainder of this paper is structured as follows.
Section 2 reviews literature on keyword-based search over outsourced encrypted data. Section
3 presents the proposed multi-keyword ranked search mechanism. Section 4 provides
implementation details. Section 5 presents experimental results. Section 6 gives conclusions
and the scope of future work.


2. RELATED WORK

The Traffic and Energy saving Encrypted Search (TEES) architecture targets mobile cloud
storage applications. TEES achieves its efficiencies by employing and modifying ranked
keyword search, which has been widely employed in cloud storage systems, as the basis of
its encrypted search platform. Traditionally, two categories of encrypted search methods
exist that enable the cloud server to search over encrypted data: ranked keyword search
and Boolean keyword search. Ranked keyword search uses relevance scores [7] to represent
the relevance of a file to the searched keyword and sends the top-k relevant files to the
client. It is more suitable for cloud storage than the Boolean keyword search approaches
(e.g., [8], [9], [10], and [11]), since Boolean keyword search approaches need to send all
the matching files to the clients and therefore incur a larger amount of network traffic and
a heavier post-processing overhead for the mobile devices.
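
The traffic argument above can be illustrated with a minimal Java sketch. It assumes plaintext documents and one common TF-IDF variant, tf * log(N / df); the class name TopKRankingSketch and its methods are hypothetical and are not taken from TEES or from the prototype described in this paper. A Boolean search returns every matching file, whereas a ranked search returns only the k best-scoring ones.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopKRankingSketch {

    // Relevance of each matching document to one keyword: tf * log(N / df).
    // Documents that do not contain the keyword are left out of the result.
    static Map<String, Double> scores(String keyword, Map<String, String> docs) {
        Map<String, Double> tf = new LinkedHashMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            String[] words = doc.getValue().toLowerCase().split("\\s+");
            int hits = 0;
            for (String w : words) {
                if (w.equals(keyword)) hits++;
            }
            if (hits > 0) tf.put(doc.getKey(), (double) hits / words.length);
        }
        if (tf.isEmpty()) return tf;
        double idf = Math.log((double) docs.size() / tf.size());
        tf.replaceAll((name, value) -> value * idf);
        return tf;
    }

    // Boolean keyword search: every matching file is sent to the client.
    static List<String> booleanSearch(String keyword, Map<String, String> docs) {
        return new ArrayList<>(scores(keyword, docs).keySet());
    }

    // Ranked keyword search: only the k highest-scoring files are sent.
    static List<String> topK(String keyword, Map<String, String> docs, int k) {
        Map<String, Double> s = scores(keyword, docs);
        List<String> ranked = new ArrayList<>(s.keySet());
        ranked.sort(Comparator.comparingDouble((String name) -> s.get(name)).reversed());
        return ranked.subList(0, Math.min(k, ranked.size()));
    }
}
```

With k fixed, the size of the ranked response does not grow with the number of matching files, which is the property that makes ranked search attractive for mobile clients.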

By carefully redesigning the ranked keyword search procedure, TEES offloads the security
calculation to the cloud side to save energy on mobile devices, and it also simplifies the
encrypted search procedure to reduce the traffic needed to retrieve data from encrypted
cloud storage. Besides the energy and traffic efficiencies, TEES is implemented with
security enhancements that account for the modified encrypted search procedure. To
mitigate statistics information leaks and keywords-files association leaks [12], [13] in
MCS, it adds noise to the Term Frequency (TF) distribution function and keeps the Order
Preserving Encryption (OPE) attributes.

In [14] an information-theoretic approach is proposed to deal with the search process by
considering TF-IDF computations. A centroid-based classification method is explored in [15]
for a better search process. Secure conjunctive keyword search is proposed in [16] for
performing encrypted search over data stored in the cloud. Another technique, known as
distributional clustering, is studied in [17] for classification of text documents. In [18]
fully homomorphic encryption is investigated for performing search over outsourced data
that is stored in encrypted form. Similar work with Gentry's fully homomorphic encryption
is carried out in [19] for encrypted search over cloud data. An order-preserving symmetric
encryption method is explored in [20] for enabling efficient search.


3. PROPOSED METHODOLOGY

We propose a methodology in the form of a framework that supports multi-keyword ranked
search on encrypted cloud data. The framework is shown in Figure 1. The data owner
sends encrypted files and indexes to cloud storage. A data user then searches with multiple
keywords in order to obtain the relevant results.

[Figure 1 diagram labels: Data Owner, Outsourcing (Encrypted Index, Encrypted Files), Cloud
Storage Server Nodes, Encrypted Keywords, Relevant Files]

Figure 1: Proposed framework for multi-keyword ranked search on encrypted cloud data

The framework provides for secure cloud storage and retrieval. The proposed algorithm
improves retrieval efficiency through multi-keyword ranked search. Figure 2 presents the
flow of the security activities during a multi-keyword search.


[Figure 2 diagram labels: Data User (Stem, Encrypt, Hash, Wrap, Decrypt); Cloud Server
(Unwrap & Search, Relevance Score Calculation, Send Files Back (top-k))]

Figure 2: Flow of encryption and decryption activities

As shown in Figure 2, a multi-keyword ranked search starts from the keywords given by the
end user. The client stems, encrypts, hashes, and wraps the keywords and sends them to the
server; that is, the multi-keyword search text reaches the cloud server in encrypted form.
The cloud server unwraps the query, performs the search, computes the relevance scores, and
returns the top-k files that satisfy the user query. The end user can then decrypt the
files.
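
The client-side query preparation can be sketched in Java as follows. This is only an illustration under stated assumptions: the paper does not specify the stemmer, the hash, or the wire format, so the suffix-stripping stemmer, the use of HMAC-SHA256 as a single keyed step standing in for "encrypt" and "hash", the comma-separated wrapping, and the class name QueryPipelineSketch are all hypothetical choices.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class QueryPipelineSketch {

    // Toy stemmer (assumption): lower-case the word and strip one common suffix.
    static String stem(String word) {
        String w = word.toLowerCase();
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (w.length() > suffix.length() + 2 && w.endsWith(suffix)) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Keyed hash of one stemmed keyword, hex-encoded (HMAC-SHA256 as a stand-in).
    static String keyedHash(String stemmed, byte[] key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] digest = mac.doFinal(stemmed.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    // Client side: stem and hash each keyword, then wrap the tokens in one message.
    static String wrap(List<String> keywords, byte[] key) {
        List<String> tokens = new ArrayList<>();
        for (String k : keywords) tokens.add(keyedHash(stem(k), key));
        return String.join(",", tokens);
    }

    // Server side: unwrap the message into opaque tokens to match against the index.
    static List<String> unwrap(String message) {
        return Arrays.asList(message.split(","));
    }
}
```

Because the same key and stemmer are applied when the index is built, the server can match the opaque tokens against the encrypted index without learning the plaintext keywords.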

Multi-Keyword Ranked Search Algorithm

Algorithm: Multi-Keyword Ranked Search Algorithm
Input: Keywords K = {k1, k2, ..., kn}
Output: Ranked search results

Initialize files vector F
Initialize relevance score map RSM
Initialize final relevance score map FRSM
Load files of data owner into F

Compute Relevance Score
For each keyword k in K
    Initialize relevance score vector RS
    For each file f in F
        Compute relevance score rs of f for k
        IF f is not relevant to k THEN
            Mark f for removal from F
        ELSE
            Add (f, rs) to RS
        END IF
    End For
    Remove marked files from F
    Add (k, RS) to RSM
End For

Rank Search Results
For each file f in F
    Compute average relevance score ars of f over all k in RSM
    Add (f, ars) to FRSM
End For

Display Results
While FRSM is not empty
    Select the file f with the maximum ars in FRSM
    Display f
    Discard key f from FRSM
End While

Algorithm 1: Multi-Keyword Ranked Search Algorithm

The algorithm takes a multi-keyword search string as input and performs the search on the
cloud server to return the requested files in encrypted format. It has two phases, computing
relevance scores and ranking the search results, before the final results are presented to
end users.
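
The two phases can be sketched in Java as follows. This is a minimal illustration rather than the prototype's code: it works on plaintext file contents, and the relevance function (keyword occurrences divided by word count) is an assumed stand-in, since the paper does not give the scoring formula. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultiKeywordRankedSearchSketch {

    // Stand-in relevance score (assumption): keyword occurrences / word count.
    static double relevance(String keyword, String text) {
        String[] words = text.toLowerCase().split("\\s+");
        int hits = 0;
        for (String w : words) {
            if (w.equals(keyword.toLowerCase())) hits++;
        }
        return (double) hits / words.length;
    }

    // Returns file names ordered by descending average relevance score.
    static List<String> search(List<String> keywords, Map<String, String> files) {
        if (keywords.isEmpty()) return new ArrayList<>();

        // Phase 1: score each remaining file per keyword; drop files that miss one.
        Map<String, String> remaining = new LinkedHashMap<>(files);
        Map<String, Map<String, Double>> rsm = new LinkedHashMap<>();
        for (String k : keywords) {
            Map<String, Double> rs = new LinkedHashMap<>();
            for (Map.Entry<String, String> f : remaining.entrySet()) {
                double score = relevance(k, f.getValue());
                if (score > 0) rs.put(f.getKey(), score);
            }
            remaining.keySet().retainAll(rs.keySet());
            rsm.put(k, rs);
        }

        // Phase 2: average the per-keyword scores of every surviving file.
        Map<String, Double> frsm = new LinkedHashMap<>();
        for (String name : remaining.keySet()) {
            double sum = 0.0;
            for (Map<String, Double> rs : rsm.values()) sum += rs.get(name);
            frsm.put(name, sum / rsm.size());
        }

        // Display order: highest average relevance first.
        List<String> ranked = new ArrayList<>(frsm.keySet());
        ranked.sort((a, b) -> Double.compare(frsm.get(b), frsm.get(a)));
        return ranked;
    }
}
```

For example, a query for "cloud" and "search" drops any file that lacks either keyword and orders the rest by the mean of their two scores.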


4. IMPLEMENTATION DETAILS

We built a prototype application in the Java programming language to implement the proposed
framework. It demonstrates different roles: end user, data owner, cloud server, attacker,
and auditor.

Figure 3: Data flow among different roles

As shown in Figure 3, each role in the system allows users to perform specific activities.
For instance, the data owner can encrypt data and send it to the public cloud. The data
owner can also verify data integrity and perform search operations on the encrypted
outsourced cloud data. The server-side data flow is presented in Figure 4.

Figure 4: Server-side flow

Many operations are carried out at the server. Once a user logs in and issues a search, the
server decrypts the search string and executes the proposed algorithm to provide
multi-keyword ranked search results to end users in encrypted format. The end users then
decrypt the files.

5. EXPERIMENTAL RESULTS

Experiments were carried out with the prototype application, which is built in the Java
programming language and has an intuitive user interface.

File Size (KB)    Time (ms)
                  PTS     TRS     ORS     Proposed
100               400     950     500     600
200               500     1100    600     700
300               600     1200    700     800
400               700     1300    850     950
500               800     1400    950     1050
600               900     1500    1050    1150
700               1050    1600    1150    1250
800               1150    1700    1250    1350
900               1250    1800    1350    1450
1000              1350    1900    1450    1550

Table 1: File Search and Retrieval Time

Table 1 presents the search and retrieval time of the proposed and existing schemes for
different file sizes.

[Figure 5 chart: Time (ms) versus File Size (KB) for PTS, TRS, ORS, and Proposed]

Figure 5: File Search and Retrieval Time

As shown in Figure 5, the execution time increases with the file size. The proposed system
took less time than the TRS scheme at every file size, but more time than PTS and ORS; this
overhead is due to its multi-keyword search operation.

File Size (KB)    Throughput (KB/s)
                  PTS     TRS     ORS     Proposed
100               260     100     200     300
200               400     200     320     420
300               500     250     400     500
400               560     300     460     560
500               610     350     510     610
600               650     400     580     680
700               680     420     600     700
800               700     450     630     730
900               720     500     650     750
1000              750     530     700     800

Table 2: Throughput of PTS, TRS, ORS, and the proposed scheme

Table 2 presents the throughput of the different schemes for the file sizes used in the
experiments.

[Figure 6 chart: Throughput (KB/s) versus File Size (KB), 100 to 1000, for PTS, TRS, ORS,
and Proposed]

Figure 6: Overall throughput of PTS, TRS, ORS, and the proposed scheme

As presented in Figure 6, the throughput of the proposed system is at least as high as that
of the other schemes at every file size, and strictly higher than that of TRS and ORS. File
size also affects throughput: as the file size increases, the throughput increases as well.
We attribute the good performance of the proposed system to its multi-keyword search
design.


6. CONCLUSIONS AND FUTURE WORK

In this paper we proposed a framework to support multi-keyword ranked search on encrypted
cloud data. The framework is designed to reduce the number of round trips to the cloud
server. Moreover, an algorithm named the Multi-Keyword Ranked Search Algorithm is proposed
and implemented to achieve more accurate search results. The algorithm takes search
keywords as input and generates ranked search results. It first computes the relevance
scores of documents and then ranks the search results before displaying them. We built a
prototype application to show the utility of the proposed system. Our empirical results
show that the proposed system is effective in searching encrypted outsourced cloud data.
In future we intend to continue our research on the proposed algorithm to improve its
efficiency further.

References

[1] J. Zhang, B. Deng, and X. Li, “Additive order preserving encryption based encrypted
documents ranking in secure cloud storage,” in Proc. 3rd Int. Conf. Adv. Swarm Intell., 2012,
pp. 58–65.

[2] S. Kamara and K. Lauter, “Cryptographic cloud storage,” in Proc. 14th Int. Conf.
Financial Cryptography Data Security, 2010, pp. 136–149.

[3] C. Orencik and E. Savaş, “Efficient and secure ranked multi-keyword search on
encrypted cloud data,” in Proc. Joint EDBT/ICDT Workshops, 2012, pp. 186–195.

[4] J. Ramos, “Using tf-idf to determine word relevance in document queries,” in Proc. First
Instructional Conf. Machine Learning, Tech. Rep. 14th, 2003. [Online]. Available:
https://www.cs.rutgers.edu/mlittman/courses/ml03/iCML03/papers/ramos.pdf

[5] W. Sun, B. Wang, N. Cao, M. Li, W. Lou, Y. T. Hou, and H. Li, “Privacy-preserving
multi-keyword text search in the cloud supporting similarity-based ranking,” in Proc. 8th
ACM SIGSAC Symp. Inf., Comput. Commun. Security, 2013, pp. 71–82.

[6] K. Bowers, A. Juels, and A. Oprea, “Hail: A high-availability and integrity layer for cloud
storage,” in Proc. 16th ACM Conf. Comput. Commun. Security, 2009, pp. 187–198.


[7] D. Hiemstra, “A probabilistic justification for using tfidf term weighting in information
retrieval,” Int. J. Digital Libraries, vol. 3, no. 2, pp. 131–139, 2000.

[8] Q. Chai and G. Gong, “Verifiable symmetric searchable encryption for semi-honest-but-
curious cloud servers,” in Proc. IEEE Int. Conf. Commun., 2012, pp. 917–922.

[9] M. Li, S. Yu, K. Ren, W. Lou, and Y. T. Hou, “Toward privacy-assured and searchable
cloud data storage services,” IEEE Netw., vol. 27, no. 4, pp. 56–62, Jul./Aug. 2013.

[10] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-preserving multi-keyword
ranked search over encrypted cloud data,” in Proc. IEEE Conf. Comput. Commun., 2011, pp.
829–837.

[11] K. Jones, “Index term weighting,” Inf. Storage Retrieval, vol. 9, no. 11, pp. 619–633,
1973.

[12] G. Salton and M. J. McGill, Introduction to Modern Information Retrieval. New York,
NY, USA: McGraw-Hill, 1986.

[13] S. Hou, T. Uehara, S. Yiu, L. C. Hui, and K. Chow, “Privacy preserving multiple
keyword search for confidential investigation of remote forensics,” in Proc. 3rd Int. Conf.
Multimedia Inf. Netw. Security, 2011, pp. 595–599.

[14] A. Aizawa, “An information-theoretic perspective of tf-idf measures,” Inf. Process.
Manage., vol. 39, pp. 45–65, 2003.

[15] E. Han and G. Karypis, “Centroid-based document classification: Analysis and
experimental results,” in Proc. 4th Eur. Conf. Principles Data Mining Knowl. Discov., 2000,
pp. 116–123.

[16] P. Golle, J. Staddon, and B. Waters, “Secure conjunctive keyword search over encrypted
data,” in Proc. Appl. Cryptography Netw. Security, 2004, pp. 31–45.

[17] L. Baker and A. McCallum, “Distributional clustering of words for text classification,”
in Proc. 21st Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 1998, pp. 96–103.

[18] M. Van Dijk, C. Gentry, S. Halevi, and V. Vaikuntanathan, “Fully homomorphic
encryption over the integers,” in Proc. 28th Int. Conf. Theory Appl. Cryptographic Techn.,
2010, pp. 24–43.


[19] C. Gentry and S. Halevi, “Implementing Gentry's fully-homomorphic encryption
scheme,” in Proc. 30th Annu. Int. Conf. Adv. Cryptol.: Theory Appl. Cryptographic Techn.,
2011, pp. 129–148.

[20] A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill, “Order-preserving symmetric
encryption,” in Proc. 28th Annu. Int. Conf. Adv. Cryptol.: Theory Appl. Cryptographic
Techn., 2009, pp. 224–241.
