
CLOUD STORAGE IN CLOUD COMPUTING

Mr. Pancham Singh Asst. Professor, AKGEC.


pancham.akgec@gmail.com

Vasundhra Tyagi B.Tech (3rd year)


vasundhraat15@gmail.com

Sudhanshu Gupta B.Tech (3rd year)


sudhanshu.20890@gmail.com

Abstract

As an emerging technology and business paradigm, cloud computing has taken commercial computing by storm. Cloud computing platforms provide easy access to a company's high-performance computing and storage infrastructure through web services. They offer massive scalability, 99.999% reliability, high performance, and specifiable configurability, at relatively low cost compared to dedicated infrastructure. This article covers the key technologies in cloud computing and cloud storage, following an introduction to the cloud storage reference model. It presents the architecture of cloud storage, based on its underlying concepts and the proposed hierarchy, and discusses key technologies including data organization, storage virtualization, data deduplication, and security. With the development of cloud computing and global data growth, cloud storage will attract more attention and continue to mature. The paper also analyzes and discusses storage management. Optimized storage management control is an effective way to reduce the working time of large-scale data storage management. Combining storage devices with control management software provides system-wide data sharing and high applicability. Business enterprises could benefit from applying a cloud storage management control mechanism.

Keywords: cloud computing, storage, architecture

I. INTRODUCTION

Cloud computing portends a major change in how we store information and run applications. Instead of running programs and keeping data on an individual desktop computer, everything is hosted in the cloud, a nebulous assemblage of computers and servers accessed via the Internet.(2) Cloud computing lets you access all your applications and documents from anywhere in the world, freeing you from the confines of the desktop and making it easier for group members in different locations to collaborate.(1)

Providers such as Amazon, Google, Salesforce, IBM, Microsoft, and Sun Microsystems have begun to establish new data centres for hosting cloud computing applications in various locations around the world, to provide redundancy and ensure reliability in case of site failures.(3) Since user requirements for cloud services vary, service providers have to be flexible in their service delivery while keeping users isolated from the underlying infrastructure. Recent advances in microprocessor technology and software have made commodity hardware increasingly able to run applications efficiently within Virtual Machines (VMs). VMs allow both the isolation of applications from the underlying hardware and from other VMs, and the customization of the platform to suit the needs of the end user. Providers can expose applications running within VMs, or provide access to the VMs themselves as a service (e.g. Amazon Elastic Compute Cloud), thereby allowing consumers to install their own applications.

One of the primary uses of cloud computing is data storage.(1) With cloud storage, data is stored on multiple third-party servers rather than on the dedicated servers used in traditional networked data storage. When storing data, the user sees a virtual server; that is, it appears as if the data is stored in a particular place with a specific name.(2) The user's data could be stored on any one or more of the computers that make up the cloud. The actual storage location may even differ from day to day or minute to minute, as the cloud dynamically manages available storage space. But even though the location is virtual, the user sees a static location for the data and can manage the storage space as if it were connected to his or her own PC.

Cloud storage has both financial and security-related advantages. As for security, data stored in the cloud is protected from accidental erasure and hardware crashes because it is duplicated across multiple physical machines. Since multiple copies of the data are maintained at all times, the cloud continues to function normally even if one or more machines go offline.(3)
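The replication argument above can be illustrated with a minimal sketch. The class `ReplicatedStore`, its method names, and the choice of three in-memory "machines" are hypothetical and do not come from any cited system; a real cloud would replicate across networked servers and would also re-synchronize machines that come back online.

```python
class ReplicatedStore:
    """Toy store that writes every object to several in-memory 'machines'."""

    def __init__(self, machine_count=3):
        # Each dict stands in for the disk of one physical machine (assumption).
        self.machines = [dict() for _ in range(machine_count)]
        self.online = [True] * machine_count

    def put(self, key, value):
        # Write a copy of the object to every machine that is currently online.
        for is_up, machine in zip(self.online, self.machines):
            if is_up:
                machine[key] = value

    def get(self, key):
        # Read from the first online machine that holds a copy.
        for is_up, machine in zip(self.online, self.machines):
            if is_up and key in machine:
                return machine[key]
        raise KeyError(key)

    def fail(self, index):
        # Simulate a machine going offline.
        self.online[index] = False


store = ReplicatedStore(machine_count=3)
store.put("report.txt", b"quarterly figures")
store.fail(0)
store.fail(1)
recovered = store.get("report.txt")  # still served by the remaining machine
```

With three copies, the object stays readable after two simulated failures; only the loss of every replica makes it unavailable.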

II. KEY TECHNOLOGIES OF CLOUD COMPUTING

Depending on the type of capability provided, there are four scenarios in which clouds are used, as shown in Fig. 1:

1) Infrastructure as a Service: Infrastructure providers (IPs) manage a large set of computing resources, such as storage and processing capacity. Through virtualization, they are able to split, assign, and dynamically resize these resources to build ad hoc systems as demanded by their customers, the service providers (SPs). The SPs deploy the software stacks that run their services. This is the Infrastructure as a Service (IaaS) scenario.

2) Platform as a Service: Cloud systems can offer an additional level of abstraction. Instead of supplying a virtualized infrastructure, they can provide the software platform on which systems run. The hardware resources demanded by the execution of the services are sized in a transparent manner. This is denoted Platform as a Service (PaaS). A well-known example is Google App Engine.

3) Storage as a Service: Commonly known as Storage as a Service (StaaS), this scenario enables cloud applications to scale beyond their limited servers. StaaS allows users to store their data on remote disks and access it at any time from any place. Cloud storage systems are expected to meet several rigorous requirements for maintaining users' data and information, including high availability, reliability, performance, replication, and data consistency. Because these requirements conflict with one another, no single system implements all of them together.

4) Software as a Service: Finally, there are services of potential interest to a wide variety of users hosted in cloud systems, as an alternative to locally run applications. Examples are the online alternatives to typical office applications such as word processors. This scenario is called Software as a Service (SaaS).

Figure 1: Cloud computing service types with examples

III. CLOUD STORAGE ARCHITECTURE

Cloud storage is a model of networked computer data storage in which data is stored on multiple virtual servers, generally hosted by third parties, rather than on dedicated servers. Hosting companies operate large data centers, and people who require their data to be hosted buy or lease storage capacity from them and use it for their storage needs. The data center operators, in the background, virtualize the resources according to the requirements of the customer and expose them as virtual servers, which the customers can manage themselves. Physically, a resource may span multiple servers.

Figure 2: A typical Cloud Storage system architecture

IV. CLOUD STORAGE REFERENCE MODEL

The appeal of cloud storage is due to some of the same attributes that define other cloud services: pay as you go, the illusion of infinite capacity (elasticity), and simplicity of use and management. It is therefore important that any interface for cloud storage support these attributes while allowing for a multitude of business cases and offerings, long into the future.

The model created and published by the Storage Networking Industry Association (SNIA) shows multiple types of cloud data storage interfaces able to support both legacy and new applications. All of the interfaces allow storage to be provided on demand, drawn from a pool of storage capacity provided by storage services. Data services are applied to individual data elements as determined by the data system metadata. Metadata specifies the data requirements for individual data elements or for groups of data elements (containers).

As shown in Fig. 3, the SNIA Cloud Data Management Interface (CDMI) is the functional interface that applications use to create, retrieve, update, and delete data elements in the cloud. Through this interface the client can discover the capabilities of the cloud storage offering and manage containers and the data placed in them. In addition, metadata can be set on containers and on their contained data elements through this interface.
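The operations listed above (capability discovery, container management, create/retrieve/update/delete, and metadata on containers and data elements) can be sketched as a small in-memory model. This is not the CDMI wire protocol or any real client library: the class `ToyCloudStore` and all of its method names are hypothetical, and the rule that a data element's metadata overrides its container's metadata is an assumption made for illustration.

```python
class ToyCloudStore:
    """In-memory stand-in for a container-based cloud storage interface."""

    def __init__(self):
        # container name -> {"metadata": dict, "objects": {key: {...}}}
        self._containers = {}

    def capabilities(self):
        # Lets a client discover what this (hypothetical) offering supports.
        return {
            "containers": True,
            "metadata": True,
            "operations": ["create", "retrieve", "update", "delete"],
        }

    def create_container(self, name, metadata=None):
        self._containers[name] = {"metadata": dict(metadata or {}), "objects": {}}

    def create(self, container, key, data, metadata=None):
        objects = self._containers[container]["objects"]
        if key in objects:
            raise ValueError(f"{key!r} already exists in {container!r}")
        objects[key] = {"data": data, "metadata": dict(metadata or {})}

    def retrieve(self, container, key):
        return self._containers[container]["objects"][key]["data"]

    def update(self, container, key, data):
        self._containers[container]["objects"][key]["data"] = data

    def delete(self, container, key):
        del self._containers[container]["objects"][key]

    def effective_metadata(self, container, key):
        # Assumption: element-level metadata overrides container-level metadata.
        merged = dict(self._containers[container]["metadata"])
        merged.update(self._containers[container]["objects"][key]["metadata"])
        return merged


store = ToyCloudStore()
store.create_container("backups", metadata={"copies": 2, "region": "any"})
store.create("backups", "a.txt", b"v1", metadata={"copies": 3})
store.update("backups", "a.txt", b"v2")
data = store.retrieve("backups", "a.txt")
meta = store.effective_metadata("backups", "a.txt")
```

Here the container asks for two copies of everything it holds, while the single element `a.txt` requests three, which mirrors the idea that data requirements can be stated per container or per data element.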

Figure 3: Cloud Storage reference model

It is expected that the majority of existing cloud storage offerings will be able to implement the interface, either through an adapter to their existing proprietary interface or by implementing it directly. In addition, existing client libraries such as XAM can be adapted to this interface, as shown in Figure 3. The interface is also used by administrative and management applications to manage containers, accounts, security access, and monitoring/billing information, even for storage that is accessible through other protocols. The capabilities of the underlying storage and data services are exposed so that clients can understand the offering.

V. KEY TECHNOLOGIES OF CLOUD STORAGE

A. Data Organization of Cloud Storage

Classified by the unit of data storage, cloud storage currently falls into two categories: block storage and file storage.

(1) Block storage: Block storage writes data across different hard disks in order to obtain greater bandwidth for a single read or write. Its advantage is fast reads and writes of individual data items; its disadvantages are high cost and an inability to handle truly massive file storage.

(2) File storage: File storage works at the file level. Each file is placed on a single hard disk, and even when a large file is split, its pieces are kept on the same disk. The disadvantage is that read and write performance for a single file is limited by a single hard drive. The advantage is that, in a multi-file, multi-user system, total bandwidth grows as storage nodes are added; the structure can be expanded without limit, and the cost is low. File storage suits the following situations:
a. Large files with high total read bandwidth, such as websites and IPTV;
b. Many files written simultaneously, such as monitoring;
c. Long-term storage of files, such as file backup, archiving, or search.

B. Storage Virtualization

Cloud storage involves a large number of storage devices distributed across many different locations. Performing logical volume management, storage virtualization management, and multi-link redundancy management across devices from different manufacturers, of different models, and even of different types (such as FC storage and IP storage) is a major challenge. Virtualization is a way of deploying computing resources: it isolates the layers of a system (hardware, software, data, networking, storage, and so on) from one another, breaking down the barriers that physical devices impose between data centers, servers, storage, networks, data, and applications. The result is a dynamic architecture with centralized management and dynamic use of physical and virtual resources, which improves flexibility and service and helps to manage risk. Storage virtualization makes multiple storage devices appear as a single storage device, so that they can be managed, deployed, and monitored in a unified way.

C. Thin Provisioning

The goal of thin provisioning is to allocate storage resources on demand. The system presents virtual storage space to the application; only when physical space is actually needed to write data does the system allocate it and map the virtual space to the physical space, all of which is transparent to the application. Although little physical space is actually allocated under thin provisioning, the application sees a virtual storage space larger than the physical space allocated. As applications write more and more data, the physical space must be expanded automatically and promptly, to avoid application downtime caused by a shortage of physical space.

D. Storage Security

Compared with traditional data services, cloud storage is distributed and depends more heavily on remote server clusters. A cloud platform's server cluster runs in a network environment and may hold the data of many users, scattered across various virtual data centers that are not necessarily in the same physical location. If access to this data is controlled carelessly, serious security and user data privacy problems arise, and they are compounded when the servers are located in different physical places. Cloud storage service providers must therefore enforce strict access control on the servers that hold data.

E. Data Deduplication

Data deduplication is the process of detecting duplicate data and removing redundant files or data blocks, so that only unique data is stored in the system.
Deduplication improves storage space efficiency by reducing the redundant data held in the storage system. In practice it works as follows: the data set (in a backup environment, usually the backup data stream) is divided into data blocks, and the blocks are written to the target region on disk. To identify the blocks in the data stream, the deduplication engine creates a digital signature (like a fingerprint) for each data segment and builds an index of the signatures in a given repository. The index, which can be rebuilt from the stored data segments, provides a reference list for determining whether a data block already exists in the repository.
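The fingerprint-and-index mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular product: the class `DedupStore` is hypothetical, blocks are fixed-size (4 bytes, chosen only to keep the example readable), and SHA-256 from Python's standard `hashlib` module serves as the digital signature.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


class DedupStore:
    """Stores each unique block once, indexed by its fingerprint."""

    def __init__(self):
        # The index: fingerprint -> block contents.
        self.blocks = {}

    def write(self, data):
        """Split data into blocks and return a list of fingerprints (pointers)."""
        pointers = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Store the block only if the index has not seen it before.
            if fingerprint not in self.blocks:
                self.blocks[fingerprint] = block
            pointers.append(fingerprint)
        return pointers

    def read(self, pointers):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.blocks[fp] for fp in pointers)


store = DedupStore()
first = store.write(b"AAAABBBBAAAA")   # blocks: AAAA, BBBB, AAAA
second = store.write(b"BBBBCCCC")      # blocks: BBBB, CCCC
unique_blocks = len(store.blocks)      # only AAAA, BBBB, CCCC are stored
```

Five blocks are written in total, but only three distinct blocks occupy space; each data set is kept as a list of fingerprints from which the original bytes can be reassembled.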

The index determines which data segments need to be stored and, in a copy operation, which need to be copied. When the deduplication software finds a block that it has processed before, it inserts a pointer to the original block in the data set's metadata instead of storing the block again. If the same block appears more than once, more than one pointer to it is generated. Variable-length deduplication technology stores multiple sets of discrete metadata images, each of which represents a different data set but all of which reference data blocks contained in a shared storage pool.

F. Load Balance and Data Migration

Load balancing keeps storage space available for later use across the different storage devices in a cloud storage system. Data migration means moving data from one storage system to another in a different place; it supports cooperation between systems and keeps the load balanced across the cloud storage system. Data migration is an effective mechanism for load balancing. When used storage capacity exceeds a threshold proportion, data should be migrated to other cloud storage units, either leaving pointers at the old storage locations or updating the metadata at the same time. However, migration adds overhead to network bandwidth and I/O processing, and it does not relieve the access bottleneck caused by concurrent clients.

G. Hierarchical Storage

Most cloud storage systems are "loose clusters," which means that the performance of a single node can become a bottleneck, because data is not distributed across the nodes as it is in a tightly coupled cluster. As a result, if a file is accessed frequently, it can only be read from one node at a time. The solution is to copy the file to multiple cluster nodes and then change the application so that whoever else needs the file can be served from those copies.
In addition, when the file's access frequency drops, the copies must be found and the redundant ones deleted. This final step is often skipped, which wastes a great deal of space and costs storage administrators additional management time. A simpler and more effective solution is to add hierarchical storage management. Automatic tiering moves frequently accessed files (or file fragments) to a cache area on RAM-based solid-state disk. While a file is being accessed frequently, the system serves it from this high-speed store. The method requires no changes (or only limited changes) to the environment: frequently accessed files are identified and migrated to high-speed storage automatically. Then, as the access frequency decreases, the file is automatically migrated back out of the cache. The storage thus becomes self-managing and self-regulating.

VI. CONCLUSIONS AND FUTURE WORK

Cloud storage holds a great deal of promise. Cloud storage systems are not designed to be high-performance file systems but rather extremely scalable, easy-to-manage storage systems. They use a different approach to data resiliency: a redundant array of inexpensive nodes, coupled with object-based or object-like file systems and data replication (multiple copies of the data). This paper proposes an architecture for cloud storage and discusses the related key technologies. Cloud storage is a new concept, and its products and research are still at an early stage. With the rapid growth of data, storing data in networked cloud storage will become increasingly important, and market demand will grow stronger. Emphasis should be placed on performance, reliability, fault tolerance, ease of use, scalability, and self-management capabilities, as well as on cloud storage and cloud computing for the development of the next generation of operating systems.

VII. REFERENCES

1. Qinlu He, Zhanhuai Li, Xiao Zhang, "Analysis of the Key Technology on Cloud Storage," 2010 International Conference on Future Information Technology and Management Engineering.
2. Jiyi Wu, Lingdi Ping, Xiaoping Ge, Ya Wang, Jianqing Fu, "Cloud Storage as the Infrastructure of Cloud Computing," 2010 International Conference on Intelligent Computing and Cognitive Informatics.
3. Zhan Ying, "Cloud Storage Management Technology," 2009 Second International Conference on Information and Computing Science.
