
2013 IEEE Seventh International Symposium on Service-Oriented System Engineering

Scalable Resource Aggregation Service of an Erlang/OTP PaaS Platform


Hanglong Zhan, Lianghuan Kang, Lantao Liu, Donggang Cao* (Corresponding Author)
School of Electronics Engineering and Computer Science, Peking University, Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing, 100871, China {zhanhl10, kanglh09, liult12, caodg*}@sei.pku.edu.cn

Abstract - The availability of powerful processors and cheap storage devices as commodity components has improved large-scale distributed computing in PaaS cloud platforms. In a multi-user PaaS platform, resources, including computing units and data storage, must be shared among many users. Allocating and aggregating these resources for different users is a great challenge. In this paper, we present the design of a scalable resource aggregation service for a multi-user PaaS platform. The design contains three parts: 1) a set of distributed, autonomous virtual computing nodes provided for every user, isolated from those of other users; 2) a task manager acting as the user's proxy in the system, responsible for controlling tasks and monitoring runtime information; 3) a consistent resource view, which can be operated on by all distributed nodes that share the same owner. We are implementing this service in UniAS, our prototype of a networked concurrent computing environment. To make the system easy to operate, we also implement a client shell for remote users. Future work is discussed at the end.

Keywords - resource aggregation; scalability; multi-user; distributed computing

I. INTRODUCTION

The availability of powerful processors and cheap storage devices as commodity components has improved large-scale distributed computing in PaaS cloud platforms. Platform as a Service (PaaS) [1] is a category of cloud computing services. In PaaS, users create applications using tools and/or libraries from the cloud platform. When writing a program on a PaaS platform, users do not need to consider the traditionally complex issues of distributed resource management, such as how to fork a new remote process, how many nodes should participate in the computation, what strategy should be adopted to distribute processes, or where to read and write files during I/O operations. To address these issues, the PaaS platform should provide a scalable resource aggregation service. A resource aggregation [2] service publishes, discovers and organizes distributed resources, and provides a relatively stable view of resources for users. It is one of the central parts of a PaaS platform. From the user's perspective, these resources can be roughly classified into two important elements: computing units and storage capability. One of the important requirements of a resource aggregation service is scalability [3]. Scalability means that a system can increase total throughput under an increased load when resources are added. When more resources become available, the resource aggregation service should notice them and aggregate them for users, thus providing a better dynamic service.

There has been solid work on the development of resource aggregation services, and each approach provides certain benefits. One way is virtualization [4]. A virtualized system includes a new layer between the physical machine and the operating system. This layer arbitrates access to the underlying physical host platform's resources, such as processors, memory, storage, and I/O devices, so that multiple operating systems running on the same machine can share these resources. Typical examples of virtualization include VMware [5], Xen [6], and KVM [7]. These systems use virtual machines to manage their resources. A virtual machine can be regarded as a single unit for a user. Virtual machines implicitly provide useful features, such as resource isolation between units, comprehensive OS process management, and local storage space. However, virtualization is not appropriate for some distributed applications that require many computing resources. Firstly, a virtual machine is heavy-weight: in actual use it acts as an operating system, and an operating system does far more than is needed to construct a basic unit in a PaaS platform. Secondly, a virtual machine focuses on sharing one powerful physical machine; it cannot aggregate less powerful resources into one powerful machine. To run distributed programs across many virtual machines, there must be an additional layer managing resources between the virtual machines and the applications, which increases the burden on programmers.

Another way to provide a resource aggregation service is to construct working units for specific applications. A working unit is an entity that executes programs. It is typically implemented at the process or thread level, and it provides only the facilities (e.g., communication interface, dynamic monitor, process scheduler) that an application needs.
Typical examples include Google App Engine [8], Hadoop [9], and Dryad [10]. Google App Engine (GAE) is a PaaS platform for developing and hosting web applications in Google-managed data centers; Hadoop is a framework that supports data-intensive distributed applications; Dryad is a distributed execution engine for coarse-grain data-parallel applications. GAE uses Java threads as working units, while Hadoop and Dryad use computing nodes to construct clusters. Such a cluster employs resource aggregation mechanisms such as scalable organization of distributed nodes, communication between processes, dynamic monitoring of node state, and fault-tolerance procedures. However, the working unit approach is usually implemented within a specific system, so it is not a general approach. Although GAE is a popular platform for deploying user applications on the network, it supports only web applications, so its resource operations are very simple, such as reading and writing databases, transmitting web scripts, and handling file downloads and uploads. Frameworks like Hadoop and Dryad provide resource aggregation services covering computing nodes, and both scale well. However, they lack complete resource isolation. The above approaches fulfill the requirements of resource aggregation as far as they can.

978-0-7695-4944-6/12 $26.00 2012 IEEE DOI 10.1109/SOSE.2013.76

Besides, a PaaS platform must take another problem into account: the management of multiple users. To increase its utilization rate, a PaaS platform is usually shared among many users. This gives rise to a series of problems, most of which involve resource competition and conflict. When a user runs an application, there is a danger that his processes may be killed by another user; when a user puts a file in the storage system, he worries that other users may modify his data if access rights are ambiguous; moreover, when a user wants to read or write data, he tends to use a simple index to access them.

Considering the above work and problems, we look for a proper way to achieve resource aggregation. In this paper, we present a scalable multi-user resource aggregation service for an Erlang/OTP PaaS platform. Firstly, we propose a logical structure in which a set of distributed, autonomous virtual computing nodes is provided for every user. This set of nodes is isolated from other users so as to avoid interference between users. The scale of a user's nodes can be configured and changed at runtime. Secondly, we design a manager acting as the user's proxy in the system; this manager is responsible for controlling tasks and monitoring system runtime information. Thirdly, we provide users with a consistent view of system resources, including virtual computing nodes, data and files, and system state.
With this view, user programs running on different nodes can handle data and files as if they were in local space. We are using Erlang/OTP to implement a prototype of this resource aggregation service in our distributed concurrent computing platform, UniAS. We have also conducted some experiments. The results show that, with the help of the resource aggregation service, UniAS can run multiple applications and scales well.

The rest of the paper is organized as follows. Section 2 gives an overview of background information, including the Erlang language and UniAS. Section 3 presents the design of the multi-user resource aggregation service. Section 4 describes the prototype implementation of this mechanism in UniAS. Section 5 discusses the design validation exemplars, and Section 6 concludes with future work.

II. BACKGROUND

A. Erlang/OTP

Erlang [11] is a functional programming language focusing on distributed, concurrent and robust run-time systems. It provides a set of features that make it well suited to building a virtual computing environment. In particular, an Erlang application runs on the Erlang Virtual Machine [12] (Erlang VM), which is very much like the Java VM. The Erlang VM is different from the virtual machines mentioned in Section 1: a virtual machine in VMware or similar systems is an instance of an operating system, while the Erlang VM is just an ordinary process running on an operating system, with no notion of operating system simulation. Unlike OS processes or threads, Erlang processes are created, scheduled, and handled inside the Erlang VM; they are light-weight and independent of the underlying operating system. Each Erlang process executes in its own memory space and owns its own heap and stack, without sharing memory. Processes communicate with each other only by passing messages. Thus processes cannot interfere with each other inadvertently, and some unexpected behaviors, such as conflicts over shared state and lock-related deadlocks, can be avoided.

In addition, Erlang has distribution built into the language's syntax and semantics, allowing systems to be built with location transparency. The default distribution mode is based on TCP/IP, allowing a node (an Erlang runtime system) on a heterogeneous network to connect to and communicate with any other node running on any operating system. With distribution built into the language, operations such as clustering and load balancing become much easier.

As a further feature, Erlang provides an extensive standard library, called the Open Telecom Platform (OTP), which offers pre-defined modules for a number of frequently used programming patterns. Using OTP, we can easily implement useful programming models, such as the client-server model, the event-handler model, and the finite-state-machine model; we can easily construct a local supervision tree of processes, so that each supervisor can detect when a process crashes and either restart it or propagate the error; and we can easily build an application with a simple entry point, where the application bundles a set of source files, a configuration file and other resources.

B. Host and Node

In a PaaS platform, a host is a physical machine connected to the network; it provides the operating system layer and the underlying devices. A number of hosts together form the infrastructure of the PaaS platform. A node is a logical unit provided for user applications. It is a collection of computing resources and possibly storage capacity. Several nodes can run on the same host. A user may have a set of nodes, and these nodes can be distributed over different hosts.

C. Erlang-based virtual computing environment, UniAS

Taking advantage of Erlang, we build our PaaS platform, UniAS. UniAS is a multi-user system physically distributed across many machines. It is intended to aggregate the underlying distributed computing and storage resources and provide them to parallel and concurrent applications. Figure 1 shows its hierarchical overview. Some of UniAS's design philosophies come from Plan 9 [13]. UniAS behaves like a distributed operating system, plus an application engine that interprets the resource aggregation policy, concurrency policy and distribution policy. Users interact with UniAS by remote access; their operations include uploading and downloading files, running applications, and getting results.


Figure 1. UniAS Hierarchical Overview

UniAS is currently implemented on Linux. It runs a daemon service on each Linux host. These daemons form a cluster and manage the physical hosts automatically, so the underlying distribution details are transparent to user-level applications, and users can easily make their programs run in parallel or distributed. UniAS supports several types of concurrent applications, including CPU-intensive, data-intensive, and communication-intensive ones. UniAS supports multiple tasks, and multiple users can share the system at the same time.

III. RESOURCE AGGREGATION SERVICE DESIGN OVERVIEW

From the users' perspective, a PaaS platform mainly provides two things: task processors (or computing entities) and data storage. Our design therefore focuses on these two aspects.

A. Design principles

Our resource aggregation service aims at the three features below.

First, scalability. The resources arranged for every user can be customized, and can be added or released at runtime. Most PaaS platforms consist of distributed nodes, and it is the resource aggregation service that allocates a certain part of these nodes to users. The scale of nodes should be determined at runtime as users require. Moreover, when more nodes are added to the platform and a user needs more resources, the platform can provide new nodes and make them available as soon as possible. When some of a user's nodes fail, the resource aggregation service handles the situation and ensures that the running applications still run correctly.

Second, resource isolation. One user's resources should not be affected or accessed by other users, so as to ensure the correct execution of his applications. As a result, it seems to each user that he is the only user of the system, with no knowledge of other users' existence. This feature calls for a logical division of the underlying resources according to different users' requirements.

Third, resource transparency. When handling resources, a user should access them in a simple manner, without considering whether these resources are distributed across a cluster of machines or where they are physically located. In short, the details of the underlying resources should be transparent to users.

B. Autonomous Nodes Group

As mentioned in Section 1, resource isolation should be taken into account. The process-level working unit is not general, but it suggests a way of simplifying resource aggregation. In a PaaS platform, parallel and concurrent applications are usually deployed on multiple cores or processors, usually distributed ones. To give multiple users this power, we would like every user to have access to those distributed resources, i.e., the task processors and data storage mentioned before.

A preliminary idea is to divide all of the resources into a number of parts, one for each user. A simple and typical practice is to divide a cluster of machines into several sub-clusters, so that user A gets machines a1, a2, ..., an, user B gets machines b1, b2, ..., bm, and so on. But this idea has two drawbacks. Firstly, a user does not use his allocated resources all the time, and idle machines are wasted. Secondly, the resources to be provided grow linearly with the number of users, which becomes costly when there are many users.

A further approach is to share these resources between users. Every user has a set of working units distributed over the cluster of machines. A working unit acts as its owner's processor on a machine and is responsible for managing the resources on that machine. We call this unit a node, and all of a user's distributed nodes form a nodes group. A nodes group is independent and cannot be affected by other nodes groups, so we call it an Autonomous Nodes Group. An autonomous nodes group has the following features:

Elasticity. A user's autonomous nodes group need not cover every machine in the system. It can be customized according to the system's condition and the user's requirements, and it can be adjusted at runtime.

Transparency. An autonomous nodes group works as a whole, so it seems to a user that he is the only one using the system. When he runs programs or reads and writes data, he need not care about any influence from other users.

Proximity. When the system spans several LANs, the members of an autonomous nodes group should stay as close to each other as possible, which helps minimize network latency. Subject to the user's request, allocating all nodes within the user's current LAN is a good choice.

Figure 2. Examples of Autonomous Nodes Group


Figure 2 shows some examples of autonomous nodes groups. The virtual computing system consists of two LANs, one with 5 machines and the other with 2 machines. There are 3 users in this system, and their autonomous nodes groups contain nodes distributed over different machines. Different users can share the same machine, since their nodes do not affect each other. In this figure, User A's nodes are located on all of the machines in the system, User B's nodes cover machines 1 to 4, and User C's nodes cover machines 3 to 5; the autonomous nodes groups can intersect. In addition, a node is usually a process or a process group on a machine, so the autonomous nodes group concentrates mainly on computing entities. When different users' nodes share the same computer, they compete for CPU time, memory space, and so on. When we say that they cannot affect each other, we mean that their computing logic and internal state are independent and safe. Although sharing CPU time and memory reduces working efficiency, this problem lies within the scope of the operating system. We believe that the parallel or concurrent execution of a distributed application is the main positive factor in the system's throughput.

C. Manager and Container

We design the manager as a proxy of the user in the virtual computing system. A manager is marked with a username, and it is the center of the resource aggregation mechanism.
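Since each manager is identified by a username, one way to picture the Figure 2 example is as a per-user map from username to the machines that host that user's nodes. The sketch below is a hypothetical Python model, not UniAS code: the machine numbering (1 to 5 on the first LAN, 6 and 7 on the second) and the names `groups`, `shared_machines`, and `nodes_on` are assumptions made for illustration.

```python
# Hypothetical model of the Figure 2 layout; all names are illustrative only.
# Assumption: machines 1-5 sit on the first LAN, machines 6-7 on the second.
MACHINES = set(range(1, 8))

# Each user's autonomous nodes group, expressed as the machines hosting a node.
groups = {
    "UserA": set(MACHINES),   # nodes on every machine in the system
    "UserB": {1, 2, 3, 4},    # machines 1 to 4
    "UserC": {3, 4, 5},       # machines 3 to 5
}


def shared_machines(groups, user_x, user_y):
    """Machines on which both users have a node (the groups may intersect)."""
    return groups[user_x] & groups[user_y]


def nodes_on(groups, machine):
    """Distinct (user, machine) nodes hosted on one machine, sorted by user."""
    return sorted((user, machine) for user, ms in groups.items() if machine in ms)
```

In this model `shared_machines(groups, "UserB", "UserC")` is `{3, 4}`, yet `nodes_on(groups, 3)` still lists three separate nodes, one per user: sharing a machine does not merge the nodes that run on it.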

Figure 3. Structure of Manager Concept

Figure 3 shows the structure of the manager concept. To achieve regular management in the system, several components are employed: task accounting, task log, and system monitor. The accounting module collects statistics on two things: the runtime information of every node, for the system administrator, and the accounting of every running task, for users. The log module collects the logs generated by running programs. The monitor module supervises all processes running on the platform. Since the PaaS platform consists of a number of working nodes and user applications are distributed, these modules are deployed on every working node as daemon processes and provide APIs for applications to call. In addition, these daemon processes send their information to the manager periodically. The manager receives these messages in real time to supervise the platform state.

Besides, another concept, the container, is introduced to represent each task of a user. It is the container that controls the autonomous nodes group directly. When a user requests to start a new task, the manager is in charge of creating a new container. The container then deploys the task into the autonomous nodes group to utilize the computing resources. When some parts of the platform are overloaded, the manager notices it and informs the container to make adjustments. In short, the manager acts as the resource scheduler in our resource aggregation service, and the container is the resource operator.

D. Consistent Resource View

With an autonomous nodes group, a user can do distributed computation easily, but what if the program wants to read or write data? Here we meet some problems in a virtual computing system. How can a process on one node read data written earlier by another node? Message passing may be a solution, but a big file is difficult to send as a message. When a distributed program writes data from every executing node, how are these data merged, given that they belong to one program? How is storage information collected if a user wants to see all of his data in the system?

We use a Consistent Resource View to handle these problems. A consistent resource view means that every node, no matter which one, sees the same view of the resources belonging to its user. What nodes handle directly is not a local physical directory, but a virtual directory spanning all of the data storage in the system. As a result, programs with I/O operations need not be concerned with the distribution details of data and files. This decouples the reading and writing of distributed data from a program's computation logic, which makes distributed programs easier to write.

Through the consistent resource view, we should be able to see not only user data and files, but also the information and dynamic condition of the system. This idea comes from the philosophy of Unix/Linux [14]. In Unix/Linux, all regular data, devices, ports, etc. are treated as files, which makes them easier to add, remove, and configure. In our system, a user with specific authority can read information about system hosts, autonomous nodes groups, and so on. Every user can read his data storage, administrate his running tasks, start or stop a job, read his task logs, etc. In this way, the consistent resource view helps improve our multi-user resource aggregation service.

IV. IMPLEMENTATION IN UNIAS

We have implemented a prototype of this resource aggregation service in UniAS, and we build the fundamental part of UniAS in Erlang.

A. Erlang VM Nodes Group

As mentioned in Section 2, the Erlang VM is the platform that handles and manages Erlang processes. When starting an Erlang VM, we can set its name and cookie value, making it an independent, autonomous node. The cookie value is useful for node isolation: two nodes can use the message-passing primitives to communicate with each other only if they share the same cookie value. In this implementation we use Erlang VM nodes to form the nodes group. Before a user starts using UniAS, he configures a quota, such as how many machines he wants to use and whether the number may vary within a range. After that, UniAS creates the nodes group for him.
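The quota and cookie mechanism described here can be sketched in a few lines. This is a simplified Python model under stated assumptions, not the UniAS implementation: `Node`, `create_nodes_group`, and `can_communicate` are hypothetical names, the quota is modeled as a minimum and maximum machine count, and hosts are simply taken in list order. In an actual Erlang deployment the node name and cookie are typically supplied when the VM starts, for example through the `-sname` and `-setcookie` flags of `erl`.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str    # "user@host", mirroring Erlang's name@host node naming
    cookie: str  # shared secret; only nodes with equal cookies may talk


def create_nodes_group(user, hosts, min_nodes, max_nodes):
    """Create one node per chosen host, all sharing a fresh per-group cookie.

    Assumption: the quota is a [min_nodes, max_nodes] range and hosts are
    picked in list order; a real allocator would also consider load and LAN.
    """
    if not 1 <= min_nodes <= max_nodes:
        raise ValueError("quota must satisfy 1 <= min_nodes <= max_nodes")
    if len(hosts) < min_nodes:
        raise ValueError("not enough hosts to satisfy the quota")
    chosen = hosts[:max_nodes]
    cookie = secrets.token_hex(16)  # random value, distinct per nodes group
    return [Node(f"{user}@{host}", cookie) for host in chosen]


def can_communicate(a, b):
    """Model of the isolation rule: message passing requires equal cookies."""
    return a.cookie == b.cookie
```

Two groups created for different users receive different cookies, so `can_communicate` holds within a group but not across groups, even when both groups have nodes on the same host.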

Different users' nodes are set with different cookie values, so that they cannot be affected by other, unrelated nodes. User applications are deployed on this nodes group and executed by Erlang processes.

We also implement the container to coordinate with the nodes group. The container holds a list of the available nodes in the current nodes group as a node pool. When a program needs a node to execute a new process, the container chooses a suitable node through some strategy, such as a least-load policy. When more machines are added to the system, the container can add nodes to its node list as needed. We employ a supervisor mechanism in the container to supervise processes on nodes. When nodes go down or offline, the container detects it and restarts all the processes that were running on these nodes. Thus we make the Erlang VM nodes group scalable and elastic.

B. RESTful Resource URL

To provide a consistent resource view in UniAS, we first design a set of RESTful resource URLs. Representational State Transfer (REST) [15] is an architectural style for web services in distributed systems. An important concept in REST is the existence of resources (sources of specific information), each of which is referenced with a global identifier (e.g., a URI in HTTP). To manipulate these resources, components of the network (user agents and origin servers) communicate via a standardized interface (e.g., HTTP) and exchange representations of these resources. UniAS uses RESTful resource URLs for users to index their resources, and employs the standard HTTP interface to handle them. Examples are as follows:

GET http://IP_addr/home/UserA - get the content list of UserA's home directory
GET http://IP_addr/home/UserA/info - get UserA's configuration and other information
PUT http://IP_addr/home/UserA/Job - upload a job into UserA's home directory
POST http://IP_addr/home/UserA/Job (Parameter) - run the application Job, which is located in UserA's sub-directory ./Job
GET http://IP_addr/home/UserA/Job/result - get the result of the application Job
GET http://IP_addr/home/UserA/ps - get information about the currently running tasks belonging to UserA
POST http://IP_addr/home/UserA/proc/TaskID - kill or stop the task with TaskID

With these resource URLs, a user can index his resources easily, and one user's resources are clearly distinguished from another's. This helps in managing directory access permissions.

C. Gateway and Client Terminal

To make UniAS easy to operate, we also implement a UniAS gateway and a client terminal. The client terminal interacts with UniAS through the gateway. The gateway is mainly used for forwarding messages and responses. The client terminal is much like a Unix/Linux shell, so that users can access UniAS in a familiar way. The client terminal provides the following commands:

ls: list all files in the current directory
upload <file>: upload a file, e.g. an application archive
download <file>: download a file
rm <file>: remove a file
run <archive>: start running an application
ps: list all running tasks of the current user
kill <taskID>: kill a running task by its taskID
info: show information about the current user
host: show information about physical hosts (administrator only)

V. DESIGN VALIDATION EXEMPLAR

To validate the design of the multi-user resource aggregation service, we conducted several experiments in UniAS. We first tested a program that calculates all prime numbers in the range from 2 to a very large number, validating the aggregation of computing nodes at different scales. We use a simple algorithm to find prime numbers: if a number N is a multiple of any integer between 2 and the square root of N, N is not prime; otherwise N is a prime number. In the serial program, we test every number from 2 to the maximum one by one. With the help of distributed computing resources, we divide the range into a number of sub-ranges and process each sub-range in parallel.

Figure 4. Result of experiment (Y axis: in ms)

Figure 5. Speed-ups of different processors and different scope

The above figures show that the application behaves as expected, and that the resource aggregation service helps the program run distributed and in parallel.
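The prime-number experiment in Section V follows a partition-and-merge pattern that can be sketched as below. This is an illustrative Python version, not the Erlang program used in the experiments: `split_range`, `primes_in`, and `parallel_primes` are hypothetical names, and a thread pool stands in for the user's nodes group only to show the structure (CPython threads do not speed up CPU-bound work, so no timing conclusions should be drawn from it).

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def is_prime(n):
    """Trial division: n is composite if any d in [2, sqrt(n)] divides it."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def split_range(lo, hi, parts):
    """Split the inclusive range [lo, hi] into up to `parts` contiguous,
    near-equal sub-ranges; empty sub-ranges are dropped."""
    size, extra = divmod(hi - lo + 1, parts)
    ranges, start = [], lo
    for i in range(parts):
        end = start + size + (1 if i < extra else 0) - 1
        if end >= start:
            ranges.append((start, end))
        start = end + 1
    return ranges


def primes_in(bounds):
    """Work done by one worker: all primes in its inclusive sub-range."""
    lo, hi = bounds
    return [n for n in range(lo, hi + 1) if is_prime(n)]


def parallel_primes(limit, workers=4):
    """Partition [2, limit], process each sub-range in a worker, then merge.

    The thread pool is a stand-in (an assumption of this sketch) for the
    nodes of a user's nodes group.
    """
    chunks = split_range(2, limit, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(primes_in, chunks)  # map preserves chunk order
    return [p for chunk in results for p in chunk]
```

Equal-sized sub-ranges are the simplest split, but trial division costs more for larger numbers, so the upper sub-ranges carry more work; a scheduler that hands out smaller chunks on demand would likely balance load better.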

We then had several users run different applications on our Erlang/OTP PaaS platform, UniAS. One application calculates prime numbers in a large range (calculate_prime); the other multiplies two large matrices (multiply_matrix). Both applications can be executed in parallel. We configured different users with various machine quotas. At the same time, one group of users ran the calculate_prime application while the other group ran the multiply_matrix application. Every application ran normally and produced its result in time, which supports the isolation and transparency of UniAS. Furthermore, we changed the condition of the system by adding machines to it and shutting down some nodes. We use this experiment to verify the dynamic scalability of our system.

VI. CONCLUSION AND FUTURE WORK

We presented a scalable multi-user resource aggregation service for an Erlang/OTP PaaS platform, and introduced the prototype implemented in our virtual computing environment, UniAS. UniAS uses this mechanism to manage all of its resources. We propose a logical structure called the autonomous nodes group. This set of nodes is isolated from other users so as to avoid interference between users. We then design a manager acting as the user's proxy in the system, responsible for controlling tasks and monitoring runtime information. Finally, we provide users with a consistent view of system resources. With this view, user programs running on different nodes can handle data and files as if they were on one node.

There is still room for future improvement. Firstly, we want to separate the storage system from the computing entities. We plan to use an existing distributed file system to maintain data and files, and then extend the virtual directory layer mentioned in Section 3 to support access to this storage. Secondly, we plan to study a process migration [16] mechanism in UniAS. Process migration can improve the resource aggregation service, since it helps a lot in achieving dynamic load balancing.

ACKNOWLEDGMENT

This work is supported by the National Basic Research Program of China (973) under Grant No. 2011CB302604; the Science Fund for Creative Research Groups of China under Grant No. 61272154, No. 61121063.

REFERENCES

[1] P. Mell and T. Grance, "NIST definition of cloud computing", National Institute of Standards and Technology, October 2009.
[2] Xicheng Lu, Huaimin Wang, Ji Wang, Jie Xu, Dongsheng Li, "Internet-based Virtual Computing Environment: Beyond the data center as a computer", Future Generation Computer Systems 29 (2013), pp. 309-322.
[3] André B. Bondi, "Characteristics of scalability and their impact on performance", Proceedings of the 2nd International Workshop on Software and Performance, Ottawa, Ontario, Canada, 2000, ISBN 1-58113-195-X, pp. 195-203.
[4] G. Neiger, D. Rodgers, A. L. Santoni, F. C. M. Martins, A. V. Anderson, S. M. Bennett, A. Kagi, F. H. Leung, L. Smith, "Intel virtualization technology", Computer, vol. 38, no. 5, pp. 48-56, May 2005.
[5] VMware, "VMware Infrastructure 3 architecture", 2006.
[6] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, "Xen and the art of virtualization", Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pp. 164-177, October 2003, doi:10.1145/945445.945462.
[7] Irfan Habib, "Virtualization with KVM", Linux Journal, vol. 2008, no. 166, p. 8, February 2008.
[8] A. Zahariev, "Google App Engine", Helsinki University of Technology.
[9] T. White, "Hadoop: The Definitive Guide", O'Reilly Media, Yahoo! Press, June 5, 2009.
[10] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly, "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks", EuroSys 2007.
[11] J. Armstrong, "Making reliable distributed systems in the presence of software errors", Ph.D. thesis, Royal Institute of Technology, Sweden, December 2003.
[12] J. Armstrong, "Programming Erlang: Software for a Concurrent World", Pragmatic Bookshelf, 2007.
[13] Rob Pike, Dave Presotto, Sean Dorward, Bob Flandrena, Ken Thompson, Howard Trickey, Phil Winterbottom, "Plan 9 from Bell Labs", Computing Systems, vol. 8, no. 3, Summer 1995, pp. 221-254.
[14] D. M. Ritchie, K. Thompson, "The UNIX time-sharing system", Communications of the ACM, 1974.
[15] R. Fielding, "Architectural styles and the design of network-based software architectures", Ph.D. thesis, University of California, Irvine, 2000.
[16] Dejan S. Milojičić, Fred Douglis, Yves Paindaveine, Richard Wheeler, Songnian Zhou, "Process migration", ACM Computing Surveys, vol. 32, no. 3, 2000, pp. 241-299.
