DEPARTMENT OF CSE
PEO-I:
To excel in Computer Science and Engineering program to pursue their higher studies or
PEO-II:
PEO-III:
long learning and continuous self-improvement in order to respond to the rapid pace of
PEO-IV:
PEO-V:
engineering problems.
PO2. Problem Analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO3. Design/Development of Solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.
PO5. Modern Tool Usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
PO6. The Engineer and Society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO10. Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11. Project Management and Finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO12. Life-Long Learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.
PANIMALAR INSTITUTE OF TECHNOLOGY DEPT OF CSE
PSO2: Familiarity with various programming languages and paradigms, with practical
related areas such as algorithm design, compiler design, artificial intelligence and
information security.
OBJECTIVES:
The student should be made to:
1. Be exposed to tool kits for grid and cloud environments.
2. Be familiar with developing web services/applications in a grid framework.
3. Learn to run virtual machines of different configurations.
4. Learn to use Hadoop.
LIST OF EXPERIMENTS:
GRID COMPUTING LAB
Use Globus Toolkit or equivalent and do the following:
1. Develop a new Web Service for Calculator.
2. Develop a new OGSA-compliant Web Service.
3. Using Apache Axis, develop a Grid Service.
4. Develop applications using Java or C/C++ Grid APIs.
5. Develop secured applications using basic security mechanisms available in the Globus Toolkit.
6. Develop a Grid portal where the user can submit a job and get the result. Implement it with and without the GRAM concept.
CLOUD COMPUTING LAB
Use Eucalyptus or Open Nebula or equivalent to set up the cloud and demonstrate.
1. Find procedure to run virtual machines of different configurations. Check how many virtual machines can be utilized at a particular time.
2. Find procedure to attach a virtual block to the virtual machine and check whether it holds the data even after the release of the virtual machine.
3. Install a C compiler in the virtual machine and execute a sample program.
4. Show virtual machine migration based on a certain condition from one node to the other.
5. Find procedure to install a storage controller and interact with it.
6. Find procedure to set up a one-node Hadoop cluster.
7. Mount the one-node Hadoop cluster using FUSE.
8. Write a program to use the APIs of Hadoop to interact with it.
9. Write a word count program to demonstrate the use of Map and Reduce tasks.
TOTAL: 45 PERIODS
REFERENCE: spoken-tutorial.org.
At the end of the course, the student should be able to:
Use the grid and cloud tool kits.
Design and implement applications on the Grid.
Design and implement applications on the Cloud.
LIST OF EQUIPMENT FOR A BATCH OF 30 STUDENTS:
SOFTWARE:
Globus Toolkit or equivalent
Eucalyptus or Open Nebula or equivalent
HARDWARE
Standalone desktops 30 Nos
COURSE OUTCOMES:
C408.1 An ability to learn and use modern software and tools such as OpenNebula and the Globus Toolkit.
C408.2 An ability to develop web service programs for Grid computing using the Globus Toolkit.
C408.3 An ability to implement applications using Grid APIs and the security mechanisms available in the Globus Toolkit, and to develop a Grid portal.
C408.4 An ability to set up virtual machine environments of different configurations, install compilers in virtual machines, and perform virtual machine migration.
C408.5 An ability to learn Hadoop clusters and to develop programs using Hadoop APIs.
Grid computing can mean different things to different individuals. The grand vision is often presented as an
analogy to power grids where users (or electrical appliances) get access to electricity through wall sockets
with no care or consideration for where or how the electricity is actually generated. In this view of grid
computing, computing becomes pervasive and individual users (or client applications) gain access to
computing resources (processors, storage, data,
applications, and so on) as needed with little or no knowledge of where those resources are located or what
the underlying technologies, hardware, operating system, and so on are.
Though this vision of grid computing can capture one's imagination and may indeed someday become a
reality, there are many technical, business, political, and social issues that need to be addressed. If we
consider this vision as an ultimate goal, there are many smaller steps that need to be taken to achieve it.
These smaller steps each have benefits of their own.
Therefore, grid computing can be seen as a journey along a path of integrating various technologies and
solutions that move us closer to the final goal. Its key values are in the underlying distributed computing
infrastructure technologies that are evolving in support of cross-organizational application and resource
sharing; in a word, virtualization: virtualization across technologies, platforms, and organizations. This
kind of virtualization is only achievable through the use of open standards. Open standards help ensure that
applications can transparently
take advantage of whatever appropriate resources can be made available to them. An environment that
provides the ability to share and transparently access resources across a distributed and heterogeneous
environment not only requires the technology to virtualize certain resources, but also technologies and
standards in the areas of scheduling, security, accounting, systems management, and so on.
Grid computing could be defined as any of a variety of levels of virtualization along a continuum. Exactly
where along that continuum one might say that a particular solution is an implementation of grid computing
versus a relatively simple implementation using virtual resources is a matter of opinion. But even at the
simplest levels of virtualization, one could say that grid-enabling technologies are being utilized.
Grid computing involves an evolving set of open standards for Web services and interfaces that make
services, or computing resources, available over the Internet. Very often grid technologies are used on
homogeneous clusters, and they can add value on those clusters by assisting, for example, with scheduling
or provisioning of the resources in the cluster. The term grid, and its related technologies, applies across this
entire spectrum. If we focus our attention on distributed computing solutions, then we could consider one
definition of grid computing to be distributed computing across virtualized resources. The goal is to create
the illusion of a simple yet large and powerful virtual computer out of a collection of connected (and
possibly heterogeneous) systems sharing various combinations of resources.
The Globus Alliance is made up of organizations and individuals that develop and make available
various technologies applicable to grid computing. The Globus Toolkit, the primary delivery vehicle for
technologies developed by the Globus Alliance, is an open source software toolkit used for building grid
systems and applications. Many companies and organizations are using the Globus Toolkit as the basis
for grid implementations of various types.
Though many components have Web service based implementations, some do not, and for compatibility
and migration reasons, some have both implementations.
Globus Toolkit 4 provides components in the following five categories:
Common runtime components
Security
Data management
Information services
Execution management
11.4 Installation
In order to install Globus Toolkit 4, we first need to install and configure a few prerequisite tools; once those tools are in place, Globus Toolkit 4 is installed using them.
EXP.NO.1 Develop a new Web Service for Calculator.
AIM:
To develop a new web service for a calculator and consume it in a client application.
PROCEDURE:
Output:
Go to the test page by clicking "calculate", enter the operands, and click the "invoke" button.
It will show the result as in the figure below (after clicking the "invoke" button).
Similarly, you can perform different operations such as addition, subtraction, multiplication, division
and modulo division. Now we consume this web service in a Windows Forms application.
Create a Windows application and add a service reference. After adding the reference, go to the
design page and arrange the UI controls as in the figure below.
Result:
Thus the program for performing arithmetic operation using web service was successfully executed.
AIM:
To develop a new OGSA-Compliant Web service in Grid Service using .NET language.
PROCEDURE:
The first issue is related to the implementation of the Grid Service Specification in an MS .NET
language. In the framework of the GRASP project, we have selected the implementation of the Grid
Service Specification provided by the Grid Computing Group of the University of Virginia, named
OGSI.NET.
To manage the dynamic nature of information describing the resources, GT3 leverages Service Data
Providers. In the MS environment, we rely on Performance Counters and the Windows Management
Instrumentation (WMI) architecture to implement the Service Data Providers. For each component of an
MS system we have a performance object (e.g. the Processor object) gathering all the performance data of
the related entity. Each performance object provides a set of Performance Counters that retrieve specific
performance data regarding the resource associated with the performance object. For example, the
%Processor Time counter of the Processor object represents the percentage of
time during which the processor is executing a thread. The performance counters are based on services at
the operating system level, and they are integrated into the .NET platform. In fact, the .NET Framework
provides a set of APIs that allows the management of the performance counters.
To perform the collection and provisioning of the performance data to an index service, we leverage
the Windows Management Instrumentation (WMI) architecture. WMI is a unifying architecture that
allows access to data from a variety of underlying technologies. WMI is based on the Common
Information Model (CIM) schema, which is an industry standard specification.
When a request for management information comes from a consumer (see Figure 3) to the CIM Object
Manager (CIMOM), the latter evaluates the request, identifies which provider has the information, and
returns the data to the consumer. The consumer only requests the desired information, and never knows
the information source or any details about the way the information data are extracted from the
underlying API. The CIMOM and the CIM repository are implemented as a system service, called
WinMgmt, and are accessed through a set of Component Object Model (COM) interfaces.
WMI provides an abstraction layer that offers access to much system information about hardware
and software and many functions to make calculations on the collected values. The combination of
performance counters and WMI realizes a Service Data Provider that each resource in a VO should
provide.
To implement an OGSA-compliant Index Service, we exploit some Active Directory (AD) features.
AD is a directory service designed for distributed networking environments, providing secure, structured,
hierarchical storage of information about interesting objects, such as users, computers, and services, inside an
enterprise network. AD provides rich support for locating and working with these objects, allowing
organizations to efficiently share and manage information about network resources and users. It acts as
the central authority for network security, letting the operating system readily verify a user's identity and
control his/her access to network resources.
Our goal is to implement a Grid Service that, taking the role of a consumer (see Figure 3), queries at
regular intervals the Service Data Providers of a VO (see Figure 5) to obtain resource information,
collects and aggregates this information, and allows searches among the resources of an
organization matching specified criteria (e.g. searching for a machine with a specified number of
CPUs). In our environment this Grid Service is called the Global Information Grid Service (GIGS) (see
Figure 5).
Fig 4
The hosts that run the GIGS have to be Domain Controllers (DCs). A DC is a server computer,
running a Microsoft Windows NT, Windows 2000, or Windows Server 2003 family operating system,
that manages security for a domain. The use of a DC permits us to create a global catalog of all the
objects that reside in an organization, which is the primary goal of AD services. This scenario is depicted in
Figure 4 (a), where black-coloured machines run GIGS and the other ones are Service Data Providers.
Obviously, black-coloured hosts could also be Service Data Providers.
To prevent the catalog from growing too big, slow, and clumsy, AD is partitioned into
units, the triangles in Figure 4 (a). For each unit there is at least one domain controller. The AD partitioning
scheme emulates the Windows 2000 domain hierarchy (see Figure 4 (b)). Consequently, the unit of
partition for AD services is the domain. GIGS has to implement an interface in order to obtain, using a
publish/subscribe method, a set of data from Service Data Providers describing an active directory object.
Such data are then recorded in the AD by using Active Directory Service Interface (ADSI), a COM based
interface to perform common tasks, such as adding new objects.
After having stored those data in AD, the GIGS should be able to query AD to retrieve them.
This is obtained by exploiting the Directory Services.
Result:
Thus the program for developing an OGSA-compliant web service was successfully executed.
EXP.NO.3 Using Apache Axis develop a Grid Service.
AIM:
To develop a Grid service using Apache Axis.
Procedure:
1. Creating the New Level in the Package.
2. Edit the Configuration Files.
3. Modify the Service Code.
4. Modify the Client.
5. Compile and Deploy.
6. Starting the Container.
7. Compile the Client.
8. Run the Client.
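The service class that these steps deploy can be a plain Java class whose public methods Axis exposes as web service operations through a deployment descriptor (WSDD). The sketch below is illustrative (the class and method names are assumptions, not the exact lab code); its self-check mirrors the output shown next, ending with a current value of 20.

```java
// Minimal calculator service sketch. Apache Axis can expose a plain
// Java class like this as a web service; the deployment descriptor
// maps its public methods to service operations.
public class CalculatorService {
    private double value = 0.0;          // running value kept by the service

    public double add(double x)      { value += x; return value; }
    public double subtract(double x) { value -= x; return value; }
    public double multiply(double x) { value *= x; return value; }
    public double divide(double x) {
        if (x == 0.0) throw new IllegalArgumentException("division by zero");
        value /= x;
        return value;
    }
    public double getValue()         { return value; }

    // Small self-check mirroring the expected output (final value 20).
    public static void main(String[] args) {
        CalculatorService c = new CalculatorService();
        c.add(50);        // value is now 50
        c.subtract(20);   // 30
        c.multiply(2);    // 60
        c.divide(3);      // 20
        System.out.println("Current value: " + c.getValue());
    }
}
```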
OUTPUT:
Addition was successful
Subtraction was successful
Multiplication was successful
Division was successful
Current value: 20.
RESULT:
Thus the program for Grid Service using Apache Axis was successfully executed.
EXP.NO.4 Develop applications using Java or C/C++ Grid APIs
AIM:
To develop an application in Java using Grid APIs.
Procedure:
1. Import all the necessary java packages and name the file as GridLayoutDemo.java
2. Set up components to preferred size.
3. Add buttons to experiment with Grid Layout.
4. Add controls to set up horizontal and vertical gaps.
5. Process the Apply gaps button press.
6. Create the GUI.
7. Create and set up the window, Set up the content pane and Display the window.
8. Schedule a job for the event dispatch thread.
9. Show the application's GUI.
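The steps above can be sketched as follows. The button labels, gap values, and window details are illustrative assumptions based on the standard Swing GridLayout demo, not the exact lab code.

```java
import java.awt.GridLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

// Compact sketch of GridLayoutDemo: a two-column grid of buttons with
// adjustable horizontal and vertical gaps (steps 2-5 above).
public class GridLayoutDemo {
    static JPanel buildGrid(int hGap, int vGap) {
        GridLayout layout = new GridLayout(0, 2);   // any rows, 2 columns
        layout.setHgap(hGap);                       // horizontal gap (step 4)
        layout.setVgap(vGap);                       // vertical gap (step 4)
        JPanel panel = new JPanel(layout);
        for (String label : new String[] {"Button 1", "Button 2",
                                          "Button 3", "Long-Named Button 4",
                                          "5"}) {
            panel.add(new JButton(label));          // step 3
        }
        return panel;
    }

    public static void main(String[] args) {
        // Steps 7-9: create and show the window on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("GridLayoutDemo");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setContentPane(buildGrid(10, 5));
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```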
OUTPUT:
Result:
Thus the program to develop an application in java using Grid APIs was successfully executed.
EXP.NO.5 Develop secured applications using basic security mechanisms available in Globus Toolkit.
AIM:
To develop secured applications using the basic security mechanisms available in the Globus Toolkit.
Procedure:
1. Public Key Cryptography
The most important thing to know about public key cryptography is that, unlike earlier cryptographic
systems, it relies not on a single key (a password or a secret "code"), but on two keys. These keys are
numbers that are mathematically related in such a way that if either key is used to encrypt a message, the
other key must be used to decrypt it. Also important is the fact that it is next to impossible (with our
current knowledge of mathematics and available computing power) to obtain the second key from the
first one and/or any messages encoded with the first key.
By making one of the keys available publicly (a public key) and keeping the other key private (a private
key), a person can prove that he or she holds the private key simply by encrypting a message. If the
message can be decrypted using the public key, the person must have used the private key to encrypt the
message.
Important: It is critical that private keys be kept private! Anyone who knows the private key can easily
impersonate the owner.
2. Digital Signatures
Using public key cryptography, it is possible to digitally "sign" a piece of information. Signing
information essentially means assuring a recipient of the information that the information hasn't been
tampered with since it left your hands.
To sign a piece of information, first compute a mathematical hash of the information. (A hash is a
condensed version of the information. The algorithm used to compute this hash must be known to the
recipient of the information, but it isn't a secret.) Using your private key, encrypt the hash, and attach it
to the message. Make sure that the recipient has your public key.
To verify that your signed message is authentic, the recipient of the message will compute the hash of
the message using the same hashing algorithm you used, and will then decrypt the encrypted hash that
you attached to the message. If the newly-computed hash and the decrypted hash match, then it proves
that you signed the message and that the message has not been changed since you signed it.
3. Certificates
A central concept in GSI authentication is the certificate. Every user and service on the Grid is identified
via a certificate, which contains information vital to identifying and authenticating the user or service.
A GSI certificate includes four primary pieces of information:
A subject name, which identifies the person or object that the certificate represents.
The public key belonging to the subject.
The identity of a Certificate Authority (CA) that has signed the certificate to certify that the public
key and the identity both belong to the subject.
The digital signature of the named CA.
Note that a third party (a CA) is used to certify the link between the public key and the subject in the
certificate. In order to trust the certificate and its contents, the CA's certificate must be trusted. The link
between the CA and its certificate must be established via some non-cryptographic means, or else the
system is not trustworthy.
GSI certificates are encoded in the X.509 certificate format, a standard data format for certificates
established by the Internet Engineering Task Force (IETF). These certificates can be shared with other
public key-based software, including commercial web browsers from Microsoft and Netscape.
4. Mutual Authentication
If two parties have certificates, and if both parties trust the CAs that signed each other's certificates, then
the two parties can prove to each other that they are who they say they are. This is known as mutual
authentication. GSI uses the Secure Sockets Layer (SSL) for its mutual authentication protocol, which is
described below. (SSL is also known by a new, IETF standard name: Transport Layer Security, or TLS.)
Before mutual authentication can occur, the parties involved must first trust the CAs that signed each
other's certificates. In practice, this means that they must have copies of the CAs' certificates--which
contain the CAs' public keys--and that they must trust that these certificates really belong to the CAs.
To mutually authenticate, the first person (A) establishes a connection to the second person (B). To start the authentication process, A gives B his certificate.
The certificate tells B who A is claiming to be (the identity), what A's public key is, and what CA is being
used to certify the certificate.
B will first make sure that the certificate is valid by checking the CA's digital signature to make sure that
the CA actually signed the certificate and that the certificate hasn't been tampered with. (This is where B
must trust the CA that signed A's certificate.)
Once B has checked out A's certificate, B must make sure that A really is the person identified in the
certificate. B generates a random message and sends it to A, asking A to encrypt it.
A encrypts the message using his private key, and sends it back to B. B decrypts the message using A's public key.
If this results in the original random message, then B knows that A is who he says he is.
Now that B trusts A's identity, the same operation must happen in reverse.
B sends A her certificate, A validates the certificate and sends a challenge message to be encrypted.
B encrypts the message and sends it back to A, and A decrypts it and compares it with the original.
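The challenge-response exchange just described can be simulated as below. A key pair stands in for each party's certificate plus private key, and the algorithm choice is illustrative; real GSI performs this inside the SSL/TLS handshake, and the public key would come from a CA-validated certificate rather than directly from the other party.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import java.security.Signature;

// Toy simulation of mutual authentication: authenticate() plays the
// challenger's role (send a random message, verify the signed response
// with the other party's public key).
public class MutualAuthDemo {
    static class Party {
        final KeyPair keys;
        Party() {
            try {
                KeyPairGenerator g = KeyPairGenerator.getInstance("RSA");
                g.initialize(2048);
                keys = g.generateKeyPair();
            } catch (Exception e) { throw new RuntimeException(e); }
        }
        // Respond to a challenge by signing it with our private key.
        byte[] respond(byte[] challenge) {
            try {
                Signature s = Signature.getInstance("SHA256withRSA");
                s.initSign(keys.getPrivate());
                s.update(challenge);
                return s.sign();
            } catch (Exception e) { throw new RuntimeException(e); }
        }
        // Challenge the other party and verify the signed response.
        boolean authenticate(Party other) {
            try {
                byte[] challenge = new byte[32];
                new SecureRandom().nextBytes(challenge);
                byte[] response = other.respond(challenge);
                Signature v = Signature.getInstance("SHA256withRSA");
                v.initVerify(other.keys.getPublic());
                v.update(challenge);
                return v.verify(response);
            } catch (Exception e) { throw new RuntimeException(e); }
        }
    }

    public static void main(String[] args) {
        Party a = new Party(), b = new Party();
        // Each side challenges the other, as in the text above.
        System.out.println("B trusts A: " + b.authenticate(a));
        System.out.println("A trusts B: " + a.authenticate(b));
    }
}
```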
At this point, A and B have established a connection to each other and are certain that they know each
other's identities.
5. Confidential Communication
By default, GSI does not establish confidential (encrypted) communication between parties. Once mutual
authentication is performed, GSI gets out of the way so that communication can occur without the
overhead of constant encryption and decryption.
GSI can easily be used to establish a shared key for encryption if confidential communication is desired.
Recently relaxed United States export laws now allow us to include encrypted communication as a
standard optional feature of GSI.
A related security feature is communication integrity. Integrity means that an eavesdropper may be able
to read communication between two parties but is not able to modify the communication in any way.
GSI provides communication integrity by default. (It can be turned off if desired). Communication
integrity introduces some overhead in communication, but not as large an overhead as encryption.
The core GSI software provided by the Globus Toolkit expects the user's private key to be stored in a file
in the local computer's storage. To prevent other users of the computer from stealing the private key, the
file that contains the key is encrypted via a password (also known as a passphrase). To use GSI, the user
must enter the passphrase required to decrypt the file containing their private key.
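The passphrase protection just described can be sketched as follows: derive a symmetric key from the passphrase and encrypt the private-key bytes under it, so that the key file is useless without the passphrase. The algorithms (PBKDF2, AES-GCM) and iteration count are illustrative assumptions, not the exact scheme GSI uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class PassphraseProtectedKey {
    // Derive a 256-bit AES key from a passphrase with PBKDF2.
    static byte[] deriveKey(char[] passphrase, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(passphrase, salt, 100_000, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                   .generateSecret(spec).getEncoded();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) throws Exception {
        byte[] privateKeyBytes =
            "pretend-private-key-bytes".getBytes(StandardCharsets.UTF_8);
        char[] passphrase = "correct horse battery staple".toCharArray();

        SecureRandom rng = new SecureRandom();
        byte[] salt = new byte[16], iv = new byte[12];
        rng.nextBytes(salt);
        rng.nextBytes(iv);

        // Encrypt the key material with AES-GCM under the derived key.
        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE,
                 new SecretKeySpec(deriveKey(passphrase, salt), "AES"),
                 new GCMParameterSpec(128, iv));
        byte[] encrypted = enc.doFinal(privateKeyBytes);

        // To use the key, the owner re-enters the passphrase to decrypt it.
        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE,
                 new SecretKeySpec(deriveKey(passphrase, salt), "AES"),
                 new GCMParameterSpec(128, iv));
        byte[] decrypted = dec.doFinal(encrypted);
        System.out.println("round trip ok: "
                           + Arrays.equals(privateKeyBytes, decrypted));
    }
}
```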
We have also prototyped the use of cryptographic smartcards in conjunction with GSI. This allows users
to store their private key on a smartcard rather than in a file system, making it still more difficult for
others to gain access to the key.
GSI provides a delegation capability: an extension of the standard SSL protocol which reduces the
number of times the user must enter his passphrase. If a Grid computation requires that several Grid
resources be used (each requiring mutual authentication), or if there is a need to have agents (local or
remote) requesting services on behalf of a user, the need to re-enter the user's passphrase can be avoided
by creating a proxy.
A proxy consists of a new certificate and a private key. The key pair that is used for the proxy, i.e. the
public key embedded in the certificate and the private key, may either be regenerated for each proxy or
obtained by other means. The new certificate contains the owner's identity, modified slightly to indicate
that it is a proxy. The new certificate is signed by the owner, rather than a CA. (See diagram below.) The
certificate also includes a time notation after which the proxy should no longer be accepted by others.
Proxies have limited lifetimes.
Figure 1.1. The new certificate is signed by the owner, rather than a CA.
The proxy's private key must be kept secure, but because the proxy isn't valid for very long, it doesn't
have to be kept quite as secure as the owner's private key. It is thus possible to store the proxy's private key
in a local storage system without encryption, as long as the permissions on the file prevent anyone
else from looking at it easily. Once a proxy is created and stored, the user can use the proxy
certificate and private key for mutual authentication without entering a password.
When proxies are used, the mutual authentication process differs slightly. The remote party receives not
only the proxy's certificate (signed by the owner), but also the owner's certificate. During mutual
authentication, the owner's public key (obtained from her certificate) is used to validate the signature on
the proxy certificate. The CA's public key is then used to validate the signature on the owner's
certificate. This establishes a chain of trust from the CA to the proxy through the owner.
Result:
Thus the program to develop a secured application using the security mechanisms available in the Globus
Toolkit was successfully executed.
EXP.NO 6 Develop a Grid portal, where user can submit a job and get the
result. Implement it with and without GRAM concept.
AIM:
To develop a Grid portal and implement it with and without GRAM concept.
Procedure:
The basic steps to acquire an account, register your resource, and then submit a job for execution
using the TIET Grid Portal are as follows:
Load the TIET Grid Portal.
To register as a user with the Grid and obtain an account, click on UserRegistration.
To access the Grid resources and enter the Grid, the user has to enter his Grid Identification
Name (GIN) or user id and Grid Identification Password (GIP) or password.
If the user is already registered to the Grid, he can log in directly with his unique Grid
identification id and password; otherwise the user has to register to the Grid and will be assigned
a unique Grid identification id and password.
In the next step you have to fill in your resource details in the provided form.
If the user wants to execute a job, he will open a portal that takes the job requirements
from the user.
From the Grid we can get the information about the current registered resources so that we can
match our job resource requirement with the available resources.
After submitting the job, we can watch its current status through the job status
portal by giving the unique job id assigned to each job at submission.
At the end we can easily get the result of the job.
Output:
RESULT:
Thus a Grid portal was developed and jobs were submitted and executed with and without the GRAM concept.
Cloud computing is a buzzword that means different things to different people. For some, it's just
another way of describing IT (information technology) "outsourcing"; others use it to mean any
computing service provided over the Internet or a similar network; and some define it as any bought-in
computer service you use that sits outside your firewall.
Infrastructure as a Service (IaaS) means you're buying access to raw computing hardware over the
Net, such as servers or storage. Since you buy what you need and pay-as-you-go, this is often
referred to as utility computing. Ordinary web hosting is a simple example of IaaS: you pay a
monthly subscription or a per-megabyte/gigabyte fee to have a hosting company serve up files for
your website from their servers.
Software as a Service (SaaS) means you use a complete application running on someone else's
system. Web-based email and Google Documents are perhaps the best-known examples. Zoho is
another well-known SaaS provider offering a variety of office applications online.
Platform as a Service (PaaS) means you develop applications using Web-based tools so they run on
systems software and hardware provided by another company. So, for example, you might develop
your own ecommerce website but have the whole thing, including the shopping cart, checkout, and
payment mechanism running on a merchant's server. App Cloud (from salesforce.com) and the
Google App Engine are examples of PaaS.
The pros of cloud computing are obvious and compelling. If your business is selling books or repairing
shoes, why get involved in the nitty gritty of buying and maintaining a complex computer system? If
you run an insurance office, do you really want your sales agents wasting time running anti-virus
software, upgrading word-processors, or worrying about hard-drive crashes? Do you really want them
cluttering your expensive computers with their personal emails, illegally shared MP3 files, and naughty
YouTube videos, when you could leave that responsibility to someone else? Cloud computing allows
you to buy in only the services you want, when you want them, cutting the upfront capital costs of
computers and peripherals. You avoid equipment going out of date and other familiar IT problems like
ensuring system security and reliability. You can add extra services (or take them away) at a moment's
notice as your business needs change. It's really quick and easy to add new applications or services to
your business without waiting weeks or months for the new computer (and its software) to arrive.
Drawbacks
Instant convenience comes at a price. Instead of purchasing computers and software, cloud computing
means you buy services, so one-off, upfront capital costs become ongoing operating costs instead. That
might work out much more expensive in the long-term.
If you're using software as a service (for example, writing a report using an online word processor or
sending emails through webmail), you need a reliable, high-speed, broadband Internet connection
functioning the whole time you're working. That's something we take for granted in countries such as the
United States, but it's much more of an issue in developing countries or rural areas where broadband is
unavailable.
The aim of a Private Cloud is not to expose to the world a cloud interface to sell capacity over the
Internet, but to provide local cloud users and administrators with a flexible and agile private
infrastructure to run virtualized service workloads within the administrative domain. OpenNebula
virtual infrastructure interfaces expose user and administrator functionality for virtualization,
networking, image and physical resource configuration, management, monitoring and accounting.
EXP.NO.1 Find procedure to run the virtual machine of different configuration.
Check how many virtual machines can be utilized at particular time.
AIM:
To write a procedure to run virtual machines of different configurations and to check how
many virtual machines can be utilized at a particular time.
Procedure:
1. Setting up a Private Cloud in Open Nebula.
2. Front-end installation.
3. Add the cluster nodes to the system.
4. Configure ssh access.
5. Install the nodes.
6. Setting authorization.
7. Prepare a virtual network for our VM.
8. Create a text file vmnet.template containing the following to create a virtual network with just
one IP.
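The contents of the referenced template did not survive in this copy. A minimal example of such a file, following the OpenNebula fixed-network template format (the bridge name and IP address are placeholders to adapt to your setup), would be:

```
NAME = "Small network"
TYPE = FIXED
BRIDGE = br0
LEASES = [ IP = "192.168.0.5" ]
```

The network is then created with a command such as onevnet create vmnet.template, and the VM template can reference it by name.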
Output:
ID NAME STAT CPU MEM HOSTNAME TIME
0 one-0 runn 0 65536 aquila01 00 0:00:02
Result:
Thus the procedure to run virtual machines of different configurations was successfully
executed, and the number of virtual machines that can be utilized at a particular time was checked.
EXP.NO.2 Find procedure to attach virtual block to the virtual machine and
check whether it holds the data even after the release of the virtual machine.
AIM:
To write a procedure to attach virtual block to the virtual machine and to check whether it
holds the data even after the release of the virtual machine.
Procedure:
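The procedure can be sketched with the OpenNebula command-line tools (the VM and image IDs are placeholders; data survives release only for an image registered with PERSISTENT=YES):

```
$ onevm disk-attach <VM_ID> --image <IMAGE_ID>   # attach the virtual block device
# (log in to the VM and create a file on the attached disk)
$ onevm terminate <VM_ID>                        # release the virtual machine
$ oneimage show <IMAGE_ID>                       # a persistent image retains the data written above
```

Re-attaching the same image to a new VM should show the file created earlier, confirming that the block device holds its data across the release.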
OUTPUT
RESULT:
Thus the program to attach virtual block to virtual machine was successfully executed & checked
whether it holds the data after the release of the virtual machine.
EXP.NO.3 Find procedure to install a C compiler in the virtual machine and execute
a sample program.
AIM:
To install a C compiler in the virtual machine and to execute a sample program.
Procedure:
To install this tool:
If you are using Fedora, Red Hat, CentOS, or Scientific Linux, use the following yum command to install
GNU c/c++ compiler:
# yum groupinstall 'Development Tools'
If you are using Debian or Ubuntu Linux, type the following apt-get command to install GNU c/c++
compiler:
$ sudo apt-get update
$ sudo apt-get install build-essential manpages-dev
Type the following command to display the version number and location of the compiler on Linux:
$ whereis gcc
$ which gcc
$ gcc --version
Sample outputs:
Create a file called demo.c using a text editor such as vi, emacs or joe:
#include<stdio.h>
/* demo.c: My first C program on a Linux */
int main(void)
{
printf("Hello! This is a test program.\n");
return 0;
}
How do I compile the program on Linux?
Use any one of the following syntax to compile the program called demo.c:
cc program-source-code.c -o executable-file-name
OR
gcc program-source-code.c -o executable-file-name
OR
## assuming that executable-file-name.c exists ##
make executable-file-name
If there is no error in your code or C program, the compiler will successfully create an executable file
called demo in the current directory; otherwise you need to fix the code. To verify this, type:
$ ls -l demo*
#include <iostream>
// demo2.C - Sample C++ program
int main(void)
{
std::cout << "Hello! This is a C++ program.\n";
return 0;
}
$ g++ demo2.C -o demo2
$ ./demo2
## or use the following syntax ##
make demo2
./demo2
RESULT:
Thus the program to install a C compiler is done and the sample program was executed
successfully.
EXP.NO.4 Show the virtual machine migration based on certain conditions from
one node to the other.
AIM:
To implement migration of a virtual machine based on certain conditions from one node to the
other.
Procedure:
1. Migrate — the VM migrates from one resource to another. This can be a live migration or a
cold migration (the VM is saved and its files are transferred to the new resource).
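In OpenNebula these two modes correspond to one command (the VM and host IDs are placeholders):

```
$ onevm migrate <VM_ID> <HOST_ID>          # cold migration: save, transfer, resume
$ onevm migrate --live <VM_ID> <HOST_ID>   # live migration: the VM keeps running during the move
```

Live migration requires a shared filesystem between the source and destination hosts, as noted in the storage configuration later in this manual.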
Email continues to be the key communication tool among organizations. Accordingly, IT departments regard
email systems as mission-critical applications. Microsoft Exchange Server is a widely used email platform in
business worldwide. Therefore, Microsoft Exchange Server 2010 was chosen as the email server to use to
study the impact of vMotion.
Test Methodology
Load-Generation Software
The Microsoft Exchange Load Generator 2010 tool (LoadGen), the official Exchange Server performance
assessment tool from Microsoft, was used to simulate the email users. LoadGen simulates a number of MAPI
(Messaging Application Programming Interface) clients accessing their email on Exchange Servers. Included with
LoadGen are profiles for light, medium and heavy workloads. In all of the tests, Outlook 2007 online clients
using a very heavy user profile workload (150 messages sent/received per day per user) were used for load
generation. Each mailbox was initialized with 100MB of user data.
Tests were configured on the commonly used Exchange Server deployment scenarios.
Exchange Server Configuration
The Exchange Server test environment consisted of two mailbox server role virtual machines and two client
access and hub transport combined-role virtual machines to support 8,000 very heavy users. These two types
of virtual machines were configured as follows:
The mailbox server role virtual machine was configured with four vCPUs and 28GB of memory to support
4,000 users. The mailbox server role had higher resource (CPU, memory and storage I/O) requirements.
Therefore, a mailbox server role virtual machine was used as a candidate for vMotion testing.
The client access and hub transport combined-role virtual machine was configured with four vCPUs and
8GB of memory.
The following test scenarios for vMotion tests were used:
Test scenario 1 (one virtual machine): Perform vMotion on a single mailbox server role virtual
machine (running a load of 4,000 very heavy users).
Test scenario 2 (two virtual machines): Perform vMotion on two mailbox server role virtual machines
simultaneously (running a combined load of 8,000 very heavy users).
In this study, the focus was on the duration of vMotion, and the impact on application performance when an
Exchange Server virtual machine was subjected to vMotion. To measure the application performance, the
following metrics were used:
Task Queue Length
The LoadGen task queue length is used as a popular metric to study the user experience and SLA trending in
Exchange Server benchmarking environments. The number of the tasks in the queue will increase if Exchange
Server fails to process the dispatched tasks expeditiously. So the rise in the task queue length directly
reflects a decline in the client experience.
Number of Task Exceptions
The LoadGen performance counter presents the number of task executions that resulted in a fatal exception,
typically due to lack of response from Exchange Servers.
Test Result
In the single mailbox server virtual machine test scenario, machine memory consumed and in use by the guest
was 28GB of memory when the migration of the mailbox server virtual machine was initiated. The vMotion
duration dropped from 71 seconds on vSphere 4.1 to 47 seconds on vSphere 5, a 33% reduction. In the two
mailbox server virtual machines scenario, the total machine memory consumed and in use by both mailbox
server virtual machines was 56GB, when vMotion was initiated. Once again, the vSphere 5 results were quite
impressive. The total duration dropped by about 49 seconds when using vSphere 5, a 34% reduction.
The following table compares the impact on the guest during vMotion on both vSphere 4.1 and vSphere 5
during the one-virtual-machine test scenario.
The table shows that the maximum size of the task queue length observed during vMotion on vSphere 5 was
219, much smaller than the 294 observed on vSphere 4.1. This confirms that Exchange Server users got a
better response time during the migration period in the vSphere 5 environment. There were no reported task
exceptions during migrations. This means that no Exchange Server task was dropped in either the vSphere 5
or the vSphere 4.1 environment.
Results from these tests clearly indicate that the impact of vMotion is minimal on even the largest memory-
intensive email applications.
RESULT:
Thus the program to implement migration of virtual machine was executed
successfully.
EXP.NO.5 Find procedure to install storage controller and interact with it.
AIM:
To write a procedure to install storage controller and interact with it.
Procedure:
Storage Configuration
OpenNebula uses Datastores to manage VM disk Images. There are two configuration steps needed
to perform a basic set up:
First, you need to configure the system datastore to hold images for the running VMs; check
the System Datastore Guide for more details.
Then you have to set up one or more datastores for the disk images of the VMs; you can find
more information on setting up Filesystem Datastores here.
The suggested configuration is to use a shared FS, which enables most of OpenNebula VM
controlling features. OpenNebula can work without a Shared FS, but this will force the
deployment to always clone the images and you will only be able to do cold migrations.
The simplest way to achieve a shared FS backend for OpenNebula datastores is to export via NFS
from the OpenNebula front-end both the system (/var/lib/one/datastores/0) and the images
(/var/lib/one/datastores/1) datastores. They need to be mounted by all the virtualization nodes to be
added into the OpenNebula cloud.
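For example, the export on the front-end could be declared like this (the client network range is an assumption; adjust it to the nodes' subnet):

```
# /etc/exports on the OpenNebula front-end
/var/lib/one/datastores 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
```

Each virtualization node then mounts the export at the same path before being added to the cloud.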
1.Enter the RAID controller BIOS by pressing Ctrl+R at the relevant prompt during boot.
5.Select the new virtual drive, press F2, and select Initialization > Start Init.
7.Repeat steps 2 through 6 for the remaining internal volume, selecting drives three and four.
8.Press Escape, and choose Save and Exit to return to the boot sequence.
1.Using the command-line console, via serial cable, reset the first Dell EqualLogic PS5000XV by
using the reset command.
2.Supply a group name, group IP address, and IP address for eth0 on the first of three arrays.
3.Reset the remaining two arrays in the same manner, supply the group name to join and IP address
created in Step 2, and supply an IP address in the same subnet for eth0 on each remaining tray.
4.After group creation, using a computer connected to the same subnet as the storage, use the Dell
EqualLogic Web interface to do the following:
a.Assign IP addresses on the remaining NICs (eth1 and eth2) on each array. Enable the NICs.
b.Verify matching firmware levels on each array and MTU size of 9,000 on each NIC on each array.
c.To create two storage pools, right-click Storage Pools and choose Create storage pool. Designate
one storage pool for VM OS. Designate the other storage pool for VM SQL Server data, SQL Server
transaction logs, and the utility virtual disk.
d.Click each member (array), and choose Yes when prompted to configure the member. Choose
RAID 10 for each array.
e.Assign the two arrays containing 15K drives to the SQL Server data storage pool. Assign the one
array containing 10K drives to the VM OS storage pool.
f.Create eight 750GB volumes in the database storage pool: four for VMware vSphere 5 usage and
four for Microsoft Hyper-V R2 SP1 usage.
g.Create eight 460GB volumes in the OS storage pool: four for VMware vSphere 5 usage and four
for Microsoft Hyper-V R2 SP1 usage.
h.Enable shared access to the iSCSI target from multiple initiators on the volume.
i.Create an access control record for the volume without specifying any limitations.
j.During testing, offline the volumes not in use by the current hypervisor.
RESULT:
Thus the program to install storage controller was executed successfully.
EXP.NO.6 Find procedure to set up the one node Hadoop cluster.
AIM:
To write a procedure to set up a one-node Hadoop cluster.
Procedure:
1. Main Installation
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
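Before starting the daemons, a single-node setup also needs conf/core-site.xml to point HDFS at localhost (the port below is the value commonly used in this kind of setup; treat it as an assumption):

```
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
```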
hduser@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh
Result:
Thus the procedure to set up one hadoop cluster was executed successfully.
EXP.NO.7 Find procedure to mount the one node Hadoop cluster using FUSE.
AIM:
To write a procedure to mount the one-node Hadoop cluster using FUSE.
PROCEDURE:
$ wget http://archive.cloudera.com/one-click-install/maverick/cdh3-repository_1.0_all.deb
$ sudo dpkg -i cdh3-repository_1.0_all.deb
$ sudo apt-get update
$ sudo apt-get install hadoop-0.20-fuse
Once fuse-dfs is installed, go ahead and mount HDFS using FUSE as follows.
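With the CDH3 fuse-dfs package, the mount command takes this general form (host, port, and mount point are placeholders; -d keeps it in the foreground for debugging):

```
$ sudo mkdir -p <mount_point>
$ hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point> -d
```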
Once HDFS has been mounted at <mount_point>, you can use most of the traditional filesystem
operations (e.g., cp, rm, cat, mv, mkdir, rmdir, more, scp). However, random write operations such
as rsync, and permission-related operations such as chmod and chown, are not supported in
FUSE-mounted HDFS.
Result:
Thus the procedure to mount the one node hadoop cluster using FUSE was executed
successfully.
EXP.NO.8 Write a program to use the APIs of Hadoop to interact with it.
AIM:
To write a program using APIs of Hadoop to interact with it.
Procedure:
1. Once you have downloaded a test dataset, we can write an application to read a file from the local
file system and write the contents to Hadoop Distributed File System.
2. Export the Jar file and run the code from terminal to write a sample file to HDFS.
3. Verify whether the file is written into HDFS and check the contents of the file.
4. Next, we write an application to read the file we just created in Hadoop Distributed File System
and write its contents back to the local file system.
5. Export the Jar file and run the code from terminal to write a sample file to HDFS.
6. Verify whether the file is written back into local file system.
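The same round trip can be checked from the shell with the hadoop fs commands (the paths are examples; this assumes the single-node cluster from the earlier experiment is running):

```
$ hadoop fs -copyFromLocal sample.txt /user/hduser/sample.txt    # local -> HDFS
$ hadoop fs -cat /user/hduser/sample.txt                         # verify contents in HDFS
$ hadoop fs -copyToLocal /user/hduser/sample.txt /tmp/sample.txt # HDFS -> local
```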
Result:
Thus the program using hadoop APIs was executed successfully.
EXP.NO.9 Word Count Program to demonstrate the use of Map and Reduce
Tasks
AIM:
To write a word count program to demonstrate the use of map and reduce tasks.
PROCEDURE:
Let's run a simple Map/Reduce job written in R and C++ (just for fun we assume that all the nodes
run the same operating system and they use the same CPU architecture).
$ su
# yum install readline-devel
# cd
# wget http://cran.rstudio.com/src/base/R-3.1.2.tar.gz
# tar -zxf R-3.1.2.tar.gz
# cd R-3.1.2
# ./configure --with-x=no --with-recommended-packages=no
# make
# make install
#R
R> install.packages('stringi')
R> q()
Disable YARN's virtual memory check in yarn-site.xml so that memory-hungry R processes are not killed:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
#!/usr/bin/env Rscript
library('stringi')
stdin <- file('stdin', open='r')
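Independently of the R mapper sketched above, the map and reduce stages of word count can be dry-run locally as a plain shell pipeline (tr plays the mapper by emitting one word per line, sort plays the shuffle, and uniq -c plays the reducer; the sample sentence is arbitrary):

```shell
#!/bin/sh
# word count without Hadoop: map | shuffle | reduce
echo "the quick brown fox jumps over the lazy fox" \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{ print $2 "\t" $1 }'
```

The same mapper/reducer split is what Hadoop Streaming drives across the cluster, with the sort handled by the framework between the two stages.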
RESULT:
Thus the Word Count Program to demonstrate the use of Map and Reduce Tasks was
executed successfully.
AIM:
To perform anonymous file transfer using the Globus GridFTP server.
Procedure:
http://www.gridftp.org/tutorials/
cd gt-gridftp-binary-installer
globus-gridftp-server options
globus-gridftp-server --help
Experiment with the -dbg, -vb, and -fast options:
globus-url-copy -dbg file:///etc/group ftp://localhost:5000/tmp/group
globus-url-copy -vb file:///dev/zero ftp://localhost:5000/dev/null
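A server started for these copies in anonymous mode might look like this (a sketch; -aa enables anonymous access per globus-gridftp-server --help, and the port matches the URLs above):

```
$ globus-gridftp-server -aa -p 5000 &
$ globus-url-copy -vb ftp://localhost:5000/etc/group file:///tmp/group
```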
Security Options
Clear text (RFC 959)
Username/password
Anonymous mode (anonymous/<email addr>)
Password file
SSHFTP
Use ssh/sshd to form the control connection
GSIFTP
Authenticate control and data channels with GSI
User Permissions
xinetd
service gsiftp
{
socket_type = stream
protocol = tcp
wait = no
user = root
env += GLOBUS_LOCATION=<GLOBUS_LOCATION>
env += LD_LIBRARY_PATH=<GLOBUS_LOCATION>/lib
server = <GLOBUS_LOCATION>/sbin/globus-gridftp-server
server_args = -i
disable = no
}
inetd
Result:
Thus the Anonymous Transfer using Globus Grid was successfully executed.
AIM:
To create a password file that protects GridFTP file transfers.
Procedure:
Create a password file
gridftp-password.pl > pwfile
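The server is then started against the password file (the option name is taken from the GridFTP server documentation; the path and credentials are examples):

```
$ globus-gridftp-server -password-file /path/to/pwfile -p 5000 &
$ globus-url-copy ftp://user:password@localhost:5000/etc/group file:///tmp/group
```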
Result:
Thus the password file for GridFTP transfers was created successfully.
AIM:
To study vMotion performance in a Web environment.
Procedure:
Test Methodology
Load-Generation Software
The SPECweb2005 architecture represents a typical Web architecture that consists of clients, Web
server software (that includes PHP or JSP support) and a back-end application and database server. The
SPECweb2005 benchmark comprises three component workloads including banking, e-commerce and
support. The support workload used in our tests is the most I/O intensive of the three workloads. It emulates a
vendor support site that provides downloads, such as driver updates and documentation, over HTTP. The
performance score of the workload is measured in terms of the number of simultaneous user/browser sessions
a Web server can handle while meeting the QoS requirements specified by the benchmark.
We used the following test scenario for our vMotion tests. Both the source and destination vSphere
hosts were configured with two 10GbE ports, one used for Web client traffic and the other for vMotion
traffic.
Test Scenario
The test scenario for this case study includes the following:
Rock Web/JSP server deployed in a single virtual machine configured with four vCPUs and 12GB memory
SUSE Linux Enterprise Server 11 x64 as the guest OS
SPECweb2005 benchmark with a load of 12,000 support users, which generated nearly 6Gbps Web traffic
The objectives of the tests were to measure the total migration time and to quantify the application
slowdown when a virtual machine is subjected to vMotion during the steady-state phase of the SPECweb2005
benchmark. The SPECweb2005 benchmark was configured to enable fine-grained performance tracking.
Specifically, the BEAT_INTERVAL test parameter was configured with a value of 2 seconds, which resulted
in the clients reporting the performance data every 2 seconds (default: 10 seconds). Two seconds was the
lowest granularity level that was supported by the benchmark driver. This fine-grained performance tracking
helped us quantify the application slowdown (the number of user sessions failing to meet QoS requirements)
during the different phases of the vMotion.
As described in the test scenario, the test used a load of 12,000 support users, which generated a
substantial load on the virtual machine in terms of CPU and network usage. During the steady-state period of
the benchmark, the client network traffic was close to 6Gbps and the CPU utilization (esxtop %USED
counter) of the virtual machine was about 325%.
Test Results
Figure 2 compares the total elapsed time for vMotion in both vSphere 4.1 and vSphere 5 for the following
configurations:
[Figure 2: Total vMotion duration in seconds, vSphere 4.1 vs. vSphere 5]
Both test scenarios used a dedicated 10GbE network adaptor for vMotion traffic. The total vMotion time
dropped from 30 seconds to 19 seconds when running vSphere 5, a 37% reduction, clearly showing vMotion
performance improvements made in vSphere 5 towards reducing vMotion transfer times. Our analysis
indicated that most of the gains were due to the optimizations in vSphere 5 that enabled vMotion to
effectively saturate the 10GbE bandwidth during the migration.
Figure 3 plots the performance of the Web server virtual machine before, during and after vMotion when
running vSphere 4.1.
The figure plots the number of SPECweb2005 user sessions that meet the QoS requirements (Time Good) at
a given time. In this graph, the first dip observed at 17:09:30 corresponds to the beginning of the steady-state
interval of the SPECweb2005 benchmark when the statistics are cleared. The figure shows that even though
the actual benchmark load was 12,000 users, due to think-time used in the benchmark, the actual number of
users submitting the requests at a given time is about 2,750. During the steady-state interval, 100% of the
users were meeting the QoS requirements. The figure shows that the vMotion process started at about 1
minute into the steady-state interval. The figure shows two dips in performance. The first noticeable dip in
performance was during the guest trace phase during which trace is installed on all the memory pages. The
second dip is observed during the switchover phase when the virtual machine is momentarily quiesced on the
source host and is resumed on the destination host. In spite of these two dips, no network connections were
dropped or timed out and the SPECweb2005 benchmark run continued.
Figure 4 plots the performance of the Web server virtual machine before, during and after vMotion when
running vSphere 5 with a single 10GbE network adaptor configured for vMotion.
Figure 4 shows the beginning of the steady state at about 12:20:30 PDT, marked by a small dip. During the
steady-state interval, 100% of the users were meeting the QoS requirements. The figure shows that the
vMotion process started after 2 minutes into the steady-state interval. In contrast to the two dips observed on
vSphere 4.1, only a single noticeable dip in performance was observed during vMotion on vSphere 5. The dip
during the guest trace stage was insignificant, due to improvements made in vSphere 5 to minimize the impact
of memory tracing. The only noticeable dip in performance was during the switchover phase from the source
to the destination host. Even at such high load level, the amount of time the guest was quiesced during the
switchover phase was about 1 second. It took less than 5 seconds to resume to the normal level of
performance.
Result:
Thus the program to implement vMotion Performance in a Web Environment was successfully
executed.
Procedure:
From start to finish, there are four fundamental transformations. Data is:
1. Transformed from the input files and fed into the mappers
2. Transformed by the mappers
3. Sorted, merged, and presented to the reducers
4. Transformed by the reducers and written to the output files
Result:
Thus the MapReduce application demonstrating how data is transformed was executed
successfully.