
GRID COMPUTING

What is Grid Computing?


Grid Computing combines computers from multiple administrative domains to reach a common goal: solving a single task.


It enables virtual organizations to share geographically distributed resources.
A resource is an entity that is to be shared, for example:
- Computational resources
- Storage resources

Definition
Foster and Kesselman, 1998

"A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities."

3-point checklist (Foster 2002)


1. Coordinates resources that are not subject to centralized control.
2. Uses standard, open, general-purpose protocols and interfaces.
3. Delivers nontrivial qualities of service, e.g., response time, throughput, availability, security.

Grid Architecture

Autonomous, globally distributed computers/clusters

Why do we need Grid?


- Many large-scale problems cannot be solved by a single computer.
- Data and resources are globally distributed.
- Existing infrastructure is underutilized, as the following table shows:

IT Resource        Average Daytime Utilization
Windows Servers    <5%
UNIX Servers       15-20%
Desktops           <5%

Values
Grid Computing has developed strong roots in the global academic and research communities over the last decade:
- Integrating large-scale computing facilities and resources.
- Re-using unutilized resources.
- Leveraging multidisciplinary collaboration.
- Changing the culture of academic research.

Values
Grid Computing for business enterprises:
- Accelerating product development.
- Reducing infrastructure and operational costs.
- Leveraging existing technology investments.
- Increasing corporate productivity.
It is a low-cost, high-throughput solution that allows companies to optimize and leverage existing IT infrastructure and investments.

Grid Computing Value Element #1: Leveraging Existing Hardware Investments and Resources
- There is a tremendous amount of unused capacity in the IT infrastructure of a typical enterprise.
- Grids can be deployed on an existing infrastructure.
- Cost savings are not limited to hardware and software expenditure.

Grid Computing Value Element #2: Reducing Operational Expenses


- Grid Computing brings a level of automation and ease previously unseen in IT environments.
- The ability of Grids to cross departmental and geographical boundaries uniformly increases the level of computational capacity across the whole academic institution or enterprise.

Grid Computing Value Element #3: Creating a Scalable and Flexible Enterprise IT Infrastructure
- Traditionally, IT managers have been forced into making large step-function increases in spending to accommodate slight increases in infrastructure requirements.
- Grid Computing allows companies to add resources linearly, based on real-time business requirements.
- These resources can be derived from within the enterprise or from utility computing services.

Grid Computing Value Element #3: Creating a Scalable and Flexible Enterprise IT Infrastructure
- While departments make their resources accessible to the whole enterprise, Grid Computing still allows them to maintain local control.

Grid Computing Value Element #4: Accelerating Product Development, Improving Time to Market, and Raising Customer Satisfaction
Grid Computing has a direct impact on accelerating product development at enterprises and helps bring products to market more quickly. For example, shorter simulation times allow products to be completed sooner. It also provides the capability to perform much more detailed product design.

Grid Computing Value Element #5: Increasing Productivity


Enterprises that have deployed Grid Computing are seeing tremendous productivity gains.

E.g.

Risk Analysis
In this section we evaluate the key risk factors that usually plague technology deployments and analyze the vulnerabilities of Grid Computing deployments.

Risk Analysis: Lock-in


- Like most software (and hardware) vendors, Grid Computing vendors would probably prefer it if their software locked in a customer for a recurring or future revenue stream.
- Customers should pay keen attention to which vendors are supporting the Grid Computing standards activities at the Global Grid Forum.

Risk Analysis: Switching Costs


- Once a grid has been deployed, the primary switching cost will be driven by the applications that use the vendor's software development toolkits.
- One way to mitigate switching costs is to introduce new grid software in the enterprise to support new grid-enabled applications, while leaving the existing software deployment and its integration with legacy grid software unchanged.

Risk Analysis: Project Implementation failure


- The final risk factor is project failure, due either to bad project management or to incorrect needs assessment.
- One way to mitigate the risk of project failure is to take advantage of the hosted pilots and professional services offered by grid software vendors.
- Hosted pilots are conducted solely in the vendor's data centers and have no impact on the operations of the company.

History of Grid Computing


- Early to mid 1990s: Research projects in the academic and research community focused on distributed computing. One key area was developing tools that would allow distributed high-performance computing systems to act like one large computer.
- 1995: At the IEEE/ACM Supercomputing conference in San Diego, 11 high-speed networks were used to connect 17 sites with high-end computing resources, creating one super metacomputer for a demonstration. This demonstration was called I-WAY and was led by Ian Foster.

History of Grid Computing


- 1996: The Globus research project developed foundation tools for Grid computing. It was led by Ian Foster of ANL and Carl Kesselman of the University of Southern California.
- 1997: At the Supercomputing conference, 80 sites worldwide running software based on the Globus Toolkit were connected together.

History of Grid Computing


- 1997: Entropia was launched to harness idle computers worldwide to solve problems of scientific interest.
- 2000: Articles on Grid Computing moved from the trade press to the popular press.
- Today, large corporations such as IBM, Sun Microsystems, and Intel are using Grid Computing for business applications.

Background: Related Technologies


- High Performance Computing
- Cluster computing
- Peer-to-peer computing
- Internet computing

High Performance Computing


Traditionally called supercomputing.

High Performance Computing


HPC Deployment by Industry

Industry area          Sample Companies
Telecommunication      Sprint, Deutsche Telekom
Finance                Charles Schwab
Automotive             BMW, GM
Database               State Farm, Starbucks
Electronics            Cisco, Motorola
Mechanics              Hitachi
Chemistry              Bayer
Information Services   EDS
Manufacturing          Alcoa
Worldwide web          Newsky, Amazon

Cluster Computing
Cluster computing has been around since 1994.

Cluster Architecture

Peer to Peer Computing


- Computers connect directly to other computers.
- Files can be accessed from any computer on the network.
- Data can be shared without going through a central server.
Models:
- centralized model, such as the one used by Napster.
- decentralized model, like the one used by Gnutella.

Internet computing
- It utilizes the vast processing cycles available on users' desktops.
- In this type of computing, tasks are broken down into smaller subtasks and distributed over the Internet for processing.
- Desktop clients periodically communicate with the central server to receive tasks.
- The central server aggregates the information received from all the different desktops and compiles the results.
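The split, distribute, and aggregate pattern described for Internet computing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`split_into_subtasks`, `desktop_client`, `central_server`) are hypothetical, threads stand in for remote desktops, and the server pushes work to the workers rather than having clients poll for tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(numbers, chunk_size):
    """Break one large task into smaller, independent subtasks."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def desktop_client(subtask):
    """Work done by one desktop: here, a sum of squares over its chunk."""
    return sum(n * n for n in subtask)


def central_server(numbers, chunk_size=1000, desktops=4):
    """Hand out subtasks, then aggregate the partial results."""
    subtasks = split_into_subtasks(numbers, chunk_size)
    # Threads stand in for remote desktops in this sketch (an assumption).
    with ThreadPoolExecutor(max_workers=desktops) as pool:
        partial_results = list(pool.map(desktop_client, subtasks))
    # Aggregation step: compile the partial results into one answer.
    return sum(partial_results)


if __name__ == "__main__":
    print(central_server(list(range(10_000))))
```

The approach works here because the subtasks are independent of each other, which is what makes this style of computing suitable for loosely connected desktops.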

Grid Computing
Grid computing tries to bring under one definitional umbrella all the work being done in the high performance, cluster, peer-to-peer, and Internet computing arenas.

Some of the definitions of Grid Computing are as follows:
- The flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources.
- The ability to form virtual, collaborative organizations that share applications and data in an open heterogeneous server environment in order to work on common problems.
- The Web provides us information; the grid allows us to process it.

A Grid Computing Model


Some of the key requirements:
- Identity and Authentication.
- Authorization and Policy.
- Resource Discovery.
- Resource Characterization.
- Resource Allocation.
- Resource Management.
- Accounting/Billing/Service Level Agreement (SLA).
- Security.

To overcome the systems problem, a set of protocols and mechanisms needs to be defined that addresses the security and policy concerns of the resource owners and users.
A set of grid application programming interfaces (APIs) and software development toolkits (SDKs) also needs to be defined. They provide interfaces to the grid protocols and services, and they facilitate application development by supplying higher-level abstractions.

A Grid Computing Model


The Fabric Layer: includes the protocols and interfaces that provide access to the resources being shared, such as compute resources and data resources.

The Connectivity Layer: defines the core protocols required for grid-specific network transactions. It utilizes existing Internet protocols such as IP, Domain Name Service, and various routing protocols.

A Grid Computing Model


The Resource Layer: defines the protocols required to initiate and control sharing of local resources. Protocols defined at this layer include:
- Grid Resource Allocation Management (GRAM): remote allocation, reservation, monitoring, and control of resources.
- GridFTP (FTP extensions): high-performance data access and transport.
- Grid Resource Information Service (GRIS): access to structure and state information.

A Grid Computing Model


The Collective Layer: defines protocols that provide system-oriented (versus local) capabilities for wide-scale deployment. It includes index or meta-directory services, so that a custom view can be created of the resources available on the grid.

A Grid Computing Model


The Application Layer: defines protocols and services that are targeted toward a specific application or a class of applications.
The following figure shows the relationship between APIs, services, and protocols.
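As a compact summary of the five layers described in this model, the sketch below records each layer with its role and the examples named on these slides. The `Layer` class and the `layer_of` helper are hypothetical names introduced only for illustration. The bottom-to-top ordering simply follows the order in which the layers are presented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    role: str
    examples: tuple = ()


# Ordered from the layer closest to the shared resources up to the applications.
GRID_LAYERS = (
    Layer("Fabric", "access to the shared resources",
          ("compute resources", "data resources")),
    Layer("Connectivity", "core protocols for grid-specific network transactions",
          ("IP", "Domain Name Service")),
    Layer("Resource", "initiate and control sharing of local resources",
          ("GRAM", "GridFTP", "GRIS")),
    Layer("Collective", "system-oriented capabilities for wide-scale deployment",
          ("index or meta-directory services",)),
    Layer("Application", "protocols and services for a class of applications"),
)


def layer_of(example):
    """Return the name of the layer that lists the given example, or None."""
    for layer in GRID_LAYERS:
        if example in layer.examples:
            return layer.name
    return None


if __name__ == "__main__":
    for layer in GRID_LAYERS:
        print(f"{layer.name}: {layer.role}")
```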

Grid Computing Protocols


- Grid Security Infrastructure
- Grid Resource Allocation Management
- Grid File Transfer Protocol
- Grid Information Services

Security: Grid Security Infrastructure


Security is defined in the resource layer of the grid architecture.

The security problem in grid computing is complex:
- resources are located in different administrative domains.
- each resource potentially has its own policies and procedures.
- users, resource owners, and developers have different requirements.

Users expect that a secure grid system will be easy to use, provide single sign-on capability, allow for delegation, and support all key applications.
Resource owners require that security should specify local access control, have robust and detailed auditing and accounting, and be able to integrate with the local security infrastructure.
From a developer's standpoint, the grid security protocol should have a robust API/SDK.


GSI has been defined by creating extensions to Secure Sockets Layer/Transport Layer Security (SSL/TLS) and X.509.
The following diagram shows the Grid Security Infrastructure in action. The request submitted is: create processes at A and B that communicate with each other and access files at C.
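The request above (processes at A and B acting for a user and accessing files at C) depends on single sign-on and delegation. The toy model below illustrates only the idea of a delegation chain that a resource traces back to the original user before applying its local access control. It is not GSI: there are no keys, signatures, or X.509 structures, and the `Credential` and `Resource` classes are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a certificate: no keys, no signatures."""
    subject: str
    issuer: Optional["Credential"] = None  # None marks the user's own identity

    def delegate(self, holder):
        """Issue a proxy credential to `holder` on behalf of this credential."""
        return Credential(subject=f"{self.subject}/proxy@{holder}", issuer=self)

    def root_identity(self):
        """Walk the delegation chain back to the original user."""
        cred = self
        while cred.issuer is not None:
            cred = cred.issuer
        return cred.subject


class Resource:
    """A site that enforces its own local access control list."""

    def __init__(self, name, allowed_users):
        self.name = name
        self.allowed_users = set(allowed_users)

    def authorize(self, cred):
        return cred.root_identity() in self.allowed_users


if __name__ == "__main__":
    user = Credential("alice")
    session = user.delegate("login")   # the user signs on once
    proc_a = session.delegate("A")     # process at A acts for the user
    proc_b = session.delegate("B")     # process at B acts for the user
    site_c = Resource("C", allowed_users={"alice"})
    print(site_c.authorize(proc_a), site_c.authorize(proc_b))
```

A real deployment would verify each link of the chain cryptographically. The sketch only shows why the resource owner at C can keep local control while the user authenticates once.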

Resource Management: Grid Resource Allocation Management Protocol


GRAM allows programs to be started on remote resources.
Resource Specification Language (RSL): a common notation for the exchange of information between applications, resource brokers, and local resource managers.
RSL provides two types of information:
- Resource requirements: machine type, number of nodes, memory, etc.
- Job configuration: directory, executable, arguments, environment.
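To make the two kinds of RSL information concrete, the sketch below renders a Python dictionary into an RSL-style string of the form `&(name=value)(name=value)`. The helper `to_rsl` is hypothetical, and the attribute names (`count`, `maxMemory`, `directory`, `executable`, `arguments`) are written in the style of Globus RSL as an assumption; they should be checked against the GRAM documentation before use. Quoting and escaping of values are not handled.

```python
def to_rsl(attributes):
    """Render a dict as a conjunctive RSL-style string: &(name=value)..."""
    parts = []
    for name, value in attributes.items():
        # Lists (e.g. program arguments) become space-separated values.
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        parts.append(f"({name}={value})")
    return "&" + "".join(parts)


job = {
    # Resource requirements
    "count": 4,
    "maxMemory": 512,
    # Job configuration
    "directory": "/tmp",
    "executable": "/bin/echo",
    "arguments": ["hello", "grid"],
}

if __name__ == "__main__":
    print(to_rsl(job))
```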

GRAM
The GRAM protocol is a simple, HTTP-based remote procedure call (RPC) protocol. It sends messages such as job request, job cancel, status, and signal. Event notifications are sent for state changes; the states include pending, active, done, failed, and suspended.
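The job states and event notifications listed above can be pictured as a small state machine with a callback. In the sketch below, the state names come from the slide, but the `ALLOWED` transition table, the `Job` class, and the callback signature are assumptions made for illustration; the slide does not specify which transitions are legal.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DONE = "done"
    FAILED = "failed"


# Assumed transitions for illustration; the source only lists the states.
ALLOWED = {
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.SUSPENDED, JobState.DONE, JobState.FAILED},
    JobState.SUSPENDED: {JobState.ACTIVE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}


class Job:
    def __init__(self, job_id, on_state_change):
        self.job_id = job_id
        self.state = JobState.PENDING
        self._notify = on_state_change
        self._notify(job_id, self.state)  # initial "pending" notification

    def transition(self, new_state):
        """Move to a new state and send an event notification."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self._notify(self.job_id, new_state)


if __name__ == "__main__":
    job = Job("job-1", lambda jid, st: print(f"{jid}: {st.value}"))
    job.transition(JobState.ACTIVE)
    job.transition(JobState.DONE)
```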

Data Transfer: Grid File Transfer Protocol


- Users who need access to the data are distributed across the globe.
- A key requirement for data-intensive grids is high-speed and reliable access to remote data.
- GridFTP was developed by extending the standard FTP protocol while preserving interoperability with existing servers.

Information Services: Grid Information Services


- A set of protocols and APIs defined in the resource layer provides key information about the grid infrastructure.
- The Grid Information Service (GIS) provides access to static and dynamic information regarding a grid's various components, including the type and state of available resources.

There are two types of Grid Information Services:
- Grid Resource Information Service (GRIS): the GRIS supplies information about a specific resource.
- Grid Index Information Service (GIIS): the GIIS provides a collection of information that has been gathered from multiple GRIS servers.
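The relationship between the two services can be sketched as one object per resource (GRIS) and an index that gathers from several of them (GIIS). The classes below are toy, in-memory stand-ins with hypothetical method names (`query`, `register`, `search`); they do not reflect the wire protocol or API of any real grid information service.

```python
class GRIS:
    """Toy Grid Resource Information Service: describes one resource."""

    def __init__(self, resource_name, **info):
        self.resource_name = resource_name
        self.info = dict(info)

    def query(self):
        """Return static and dynamic information about this resource."""
        return {"name": self.resource_name, **self.info}


class GIIS:
    """Toy Grid Index Information Service: aggregates several GRIS servers."""

    def __init__(self):
        self._sources = []

    def register(self, gris):
        self._sources.append(gris)

    def search(self, predicate):
        """Collect records from every registered GRIS and keep the matches."""
        records = (gris.query() for gris in self._sources)
        return [rec for rec in records if predicate(rec)]


if __name__ == "__main__":
    index = GIIS()
    index.register(GRIS("cluster-a", type="compute", state="idle", nodes=64))
    index.register(GRIS("store-b", type="storage", state="busy"))
    print(index.search(lambda rec: rec["state"] == "idle"))
```

Because `search` queries each GRIS at call time, dynamic information such as the resource state is read fresh on every lookup. A real index would more likely cache results.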

Types of Grids
- Departmental Grids: deployed to solve problems for a particular group of people within an enterprise.
- Enterprise Grids: consist of resources spread across an enterprise and provide service to all users within that enterprise.
- Extraprise Grids: established between companies, their partners, and their customers.
- Global Grids: grids established over the public Internet.
- Compute Grids: created solely for the purpose of providing access to computational resources.
- Data Grids: grid deployments that require access to, and processing of, data. They are optimized for data-oriented operations.
- Utility Grids: commercial compute resources that are maintained and managed by a service provider.

Conclusion
- Grid Computing enables virtual organizations to share geographically distributed resources.
- Resources can be supercomputers, clusters, desktops, storage systems, sensors, scientific instruments, etc.
- Grid Computing is not a new concept. It leverages knowledge acquired by the high performance computing, cluster computing, peer-to-peer computing, and Internet computing communities.
- Grid Computing protocols are based on protocols developed and refined by the Internet community. Existing protocols have been extended to provide grid-specific functionality.
