IBM Blue Gene

The world's most advanced network supercomputer from International Business Machines will tackle Grand Challenge problems

An Article by your Guide Bradley Mitchell

The U.S. Government's ASCI program, a driving force behind supercomputing advances for more than five years, has set the goal of achieving petaflop-level supercomputer performance by the year 2010. ASCI was established primarily to develop systems capable of performing numerical simulation of nuclear weapons detonations, giving the U.S. an alternative to performing actual underground nuclear tests. These simulations work by dividing space into discrete two- or three-dimensional grids and time into small finite intervals or "steps." Compared to today's teraflop-scale supercomputers, a petaflop supercomputer would allow researchers to use denser grids and smaller time steps, dramatically increasing the accuracy and detail of simulations; a small sketch of this grid-and-time-step approach appears below. Besides nuclear weapons research, the world's fastest supercomputers are also being built with the goal of solving "grand challenge" problems in science. Defeating the world chess champion is one challenge that received much attention in 1997, but this didn't require petaflop performance, nor was it an especially critical problem to solve. At least one upcoming challenge, though, is of prime importance to us all.

The Grand Challenge of Protein Folding

A major focus of the Blue Gene system is the simulation of human proteins and their "folding." Protein folding controls the human body's physical processes, and abnormal folding causes disease. Through simulation of these processes, Blue Gene can help to determine the cause of some human diseases and also contribute to the development of drugs and other means to cure disease. The Blue Gene/L system, targeted for completion sometime in 2005, will likely be the world's first computer capable of solving the grand challenge of protein folding.
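The grid-and-time-step approach described above can be made concrete with a small example. The following is a minimal sketch, not taken from any ASCI or Blue Gene code: it advances a simple heat-diffusion model on a two-dimensional grid in small explicit time steps, and every size and constant in it is an illustrative assumption. Refining the grid or shrinking the time step improves accuracy at the cost of proportionally more arithmetic, which is why petaflop machines matter for this kind of simulation.

def step(grid, alpha, dt, dx):
    """Advance the temperature field one explicit time step (interior points only)."""
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Discrete Laplacian over the four grid neighbors
            lap = (grid[i + 1][j] + grid[i - 1][j] +
                   grid[i][j + 1] + grid[i][j - 1] - 4 * grid[i][j]) / dx**2
            new[i][j] = grid[i][j] + alpha * dt * lap
    return new

if __name__ == "__main__":
    n, dx, dt, alpha = 32, 1.0, 0.1, 0.25   # hypothetical grid spacing, time step, diffusivity
    grid = [[0.0] * n for _ in range(n)]
    grid[n // 2][n // 2] = 100.0            # a single hot spot in the middle
    for _ in range(500):                    # many small time steps
        grid = step(grid, alpha, dt, dx)
    print(round(grid[n // 2][n // 2], 3))   # heat has diffused outward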

Why Supercomputers are Fast

Several elements of a supercomputer contribute to its high level of performance:

Numerous high-performance processors (CPUs) for parallel processing
Specially-designed high-speed interconnects (internal networks)
Specially-designed or tuned operating systems

Supercomputer Processors

Supercomputers utilize either custom or mainstream commercial microprocessors. Small supercomputers may contain only a few dozen processors, but today's fastest supercomputers incorporate thousands of processors. The table below summarizes the processor configurations of today's top supercomputers; a small sketch of dividing work among many processors follows it.

System                 Processor Configuration
ASCI Red               9,472 Intel Pentium II Xeon
ASCI Blue Pacific      5,856 IBM PowerPC 604E
ASCI White             8,192 IBM Power3-II
NEC Earth Simulator    5,104 NEC vector processors
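The point of having thousands of processors is that a large computation, such as the grid above, can be carved into slices and worked on in parallel. The following minimal sketch illustrates only the idea, using Python's multiprocessing module on a single machine as a stand-in for a real message-passing supercomputer; the grid size, worker count, and the dummy per-cell arithmetic are all illustrative assumptions.

from multiprocessing import Pool

def work(task):
    """Pretend "compute node": process one slice of grid rows."""
    start, end, width = task
    total = 0.0
    for i in range(start, end):
        for j in range(width):
            total += (i * width + j) ** 0.5   # stand-in for real per-cell physics
    return total

if __name__ == "__main__":
    width, height, workers = 2000, 2000, 8
    chunk = height // workers
    slices = [(k * chunk, (k + 1) * chunk, width) for k in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(work, slices)     # each slice is computed in parallel
    print(sum(partials))                      # combine the partial results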

Some supercomputer designs feature network co-processors. When sending and receiving data at the rate necessary for high-performance networking, it's common for a single processor to become heavily loaded with communications interrupts that take away too many cycles from primary computing tasks. To solve this problem, the IBM Blue Gene system will utilize cells. Each cell contains a primary processor, a network co-processor, and shared on-chip memory. In total, the IBM Blue Gene system will contain one million custom IBM processors. So that the system will fit within a reasonably-sized room and not consume too much power, the processors are engineered so small that 32 of them will fit on a single microchip.

Supercomputer Interconnects

In order for a large number of processors to work together, supercomputers utilize specialized network interfaces. These interconnects support high-bandwidth, very low-latency communication. Interconnects join the nodes inside the supercomputer together. A node is a communication endpoint running one instance of the operating system. Nodes utilize one or several processors, and different types of nodes can exist within the system. Compute nodes, for example, execute the processes and threads required for raw computation. I/O nodes handle the reading and writing of data to disks within the system. Service nodes and network nodes provide the user interface into the system and also network interfaces to the outside world. Special-purpose nodes improve overall performance by segregating the system workload, with hardware and system software configured to best handle that workload.

Supercomputer nodes fit together into a network topology. Modern supercomputers have utilized several different specialized network topologies including hypercube, two-dimensional and three-dimensional mesh, and torus. Supercomputer network topologies can be either static (fixed) or dynamic (through the use of switches); a small sketch of neighbor addressing in a torus appears below.

More on Supercomputer Interconnects

One of the most critical elements of supercomputer networking is routing. Supercomputers that utilize message passing require routing to ensure the individual pieces of a message travel from source to destination through the topology without creating hotspots (bottlenecks). Advanced routing techniques like wormhole and virtual cut-through routing are employed by today's ASCI supercomputers. Supercomputers utilize various network protocols. Application data communications generally take place at the physical and data link layers. I/O and communications with external networks utilize technologies like HIPPI, FDDI, and ATM as well as Ethernet.
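To make the torus topology mentioned above concrete, the short sketch below computes the six neighbors of a node in a hypothetical three-dimensional torus. The 8x8x8 dimensions are an illustrative assumption; the wraparound (modulo) arithmetic is what distinguishes a torus from a plain mesh, where edge nodes would have fewer neighbors.

def torus_neighbors(x, y, z, dim):
    """Return the six neighbor coordinates of node (x, y, z) in a dim x dim x dim torus."""
    return [
        ((x + 1) % dim, y, z), ((x - 1) % dim, y, z),   # +/- x direction
        (x, (y + 1) % dim, z), (x, (y - 1) % dim, z),   # +/- y direction
        (x, y, (z + 1) % dim), (x, y, (z - 1) % dim),   # +/- z direction
    ]

if __name__ == "__main__":
    # Even a "corner" node has six neighbors, because the torus wraps around.
    print(torus_neighbors(0, 0, 0, 8))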

Supercomputer interconnects involve large quantities of network cabling. These cables can be very difficult to install, as they often must fit within small spaces. Supercomputers do not utilize wireless networking internally, as the bandwidth and latency properties of wireless are not suitable for high-performance communications.

Supercomputer Operating Systems

Many supercomputers run multiple copies of a UNIX-based operating system. The ASCI White and Blue Pacific systems, for example, run IBM AIX. In the 1990s, research into high-performance network operating systems led to the development of so-called "lightweight" operating systems (O/Ses) that consist of a small, simple kernel with many of the capabilities of a general-purpose O/S removed. The ASCI Red system runs the PUMA O/S on its compute nodes.

Beyond Performance - Reliability, Availability, and Serviceability (RAS)

Supercomputers are notoriously unreliable. They contain so many more processors, memory chips, disks, and cables than an ordinary computer that the "law of averages" alone dictates failures will occur much more frequently. In addition, some supercomputing software and hardware components are products of cutting-edge research, new and not yet "debugged." Finally, supercomputers are usually placed under heavy computing and communications workloads 24 hours a day. Supercomputer failures and "downtime" are extremely costly, due to the importance of their "computational mission" and the scarcity of available systems. To reduce downtime, modern supercomputers are built with a focus on three key aspects of operation, listed below; a small sketch of the underlying arithmetic follows the list:

Reliability - likelihood of a failure occurring in the running system
Availability - "uptime" of the system and/or system resources in the presence of a failure
Serviceability - ability to quickly detect, isolate, and recover from failures that occur
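The following minimal sketch shows the kind of back-of-the-envelope RAS arithmetic involved, assuming failures are independent and random: a system's mean time between failures (MTBF) shrinks roughly in proportion to its component count, and availability is then MTBF / (MTBF + MTTR). The component count, per-component MTBF, and repair time below are illustrative assumptions, not figures from any ASCI system.

def system_mtbf(component_mtbf_hours, component_count):
    """Approximate mean time between failures for the whole system."""
    return component_mtbf_hours / component_count

def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, given a mean time to repair (MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

if __name__ == "__main__":
    mtbf = system_mtbf(1_000_000, 8192)            # e.g. 8,192 processors at 1M hours each
    print(round(mtbf, 1), "hours between system failures")
    print(round(availability(mtbf, 4.0), 4), "availability with 4-hour repairs")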

Some supercomputer designs incorporate additional hardware and software to support system RAS. Redundant power supplies and cooling systems are common, for example. A single failure in a redundant component typically does not cause downtime, that is, it does not impact the availability of the system. Hot-swappable components like disks and power supplies are also commonly employed to improve system serviceability. The short sketch below illustrates how a redundant pair improves availability.
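As a rough illustration of the value of redundancy, and assuming the two components in a redundant pair fail independently, the pair is unavailable only when both components are down at once. The 0.99 single-component availability used below is an illustrative assumption.

def redundant_pair_availability(a_single):
    """Availability of a component duplicated as an independent redundant pair."""
    return 1 - (1 - a_single) ** 2

if __name__ == "__main__":
    print(redundant_pair_availability(0.99))   # 0.9999 - two nines become four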

The ASCI Red system features an Ethernet network, separate from the primary interconnect, that is used to detect and recover from failures, improving serviceability and manageability.

SMASH RAS

The IBM Blue Gene system adopts an architectural approach called SMASH (Simple, Many and Self-Healing). SMASH attempts to build into the system features that will minimize failures and downtime.

Conclusion

Supercomputing is currently the most advanced of all high-performance computing (HPC) approaches. HPC alternatives like clustering and grid computing hold great promise for achieving high-performance computing and communications at low cost, but for now, supercomputers remain the world's fastest computing systems and those best suited for solving many critical scientific problems. Supercomputers remain expensive and highly specialized. They run a limited set of applications and require nonstop "care and feeding" to keep running smoothly. The IBM Blue Gene system represents the next generation of advanced supercomputing technology. When complete, it will operate at speeds up to 100 times greater than today's systems. In addition to solving computational problems, Blue Gene

