
Lappeenranta University of Technology Information Technology CT30A7001 Concurrent and Parallel Computing

EARTH SIMULATOR Seminar Work


November 13, 2008

Group 10
Ibrahim Mohamed [b0319444, mohamed.ibrahim@lut.fi] Awid Faisal [b0319224, faisal.awid@lut.fi]

Supervisor: Professor, D.Sc. (Tech.) Jari Porras

Earth Simulator
Abstract:
This document presents a seminar work for the Concurrent and Parallel Computing course at Lappeenranta University of Technology; the seminar topic is supercomputer technologies, specifically the Earth Simulator.

The Earth Simulator (ES) is a high-speed supercomputer whose development started as a research project in 1997 to better understand and predict climate change and the global environment at large.

The Earth Simulator is a large-scale supercomputer comprising 640 processor nodes bonded by a single-stage full crossbar network. Every node contains 8 vector processors that use a single shared memory of 16 GB capacity. This massive supercomputer achieves a peak performance of 40 Tflops and a total memory capacity of 10 TB.

Keywords: Earth Simulator, supercomputers, high performance, parallel programming, MDPS, vector processor, shared memory, MPI, HPF.

Abbreviations

ES      Earth Simulator
STA     Science and Technology Agency (Japan)
NASDA   National Space Development Agency of Japan
JEARI   Japan Atomic Energy Research Institute
IN      Interconnection Network
PN      Processor Node
AP      Arithmetic Processor
VU      Vector Unit
I/O     Input and Output
SIMD    Single Instruction Multiple Data
MIMD    Multiple Instruction Multiple Data
COTS    Commercial Off-The-Shelf
MMU     Main Memory Unit
DRAM    Dynamic Random Access Memory
RCU     Remote access Control Unit
XCT     Inter-node Crossbar Control Unit
XSW     Inter-node Crossbar Switch
MDPS    Mass Data Processing System
CTL     Cartridge Tape Library
MPI     Message Passing Interface
HPF     High Performance Fortran

Table of Contents

1. Introduction
   1.1. Overview
2. Hardware Overview
   2.1. Arithmetic Process
        2.1.1. Vector Unit
        2.1.2. Scalar Unit
        2.1.3. Memory System
3. Interconnection Network
4. Mass Data Processing System (MDPS)
   4.1. Network Queuing System II
5. Software
   5.1. Programming Environment
6. Earth Simulator and its Application Domain
7. Comparison of Super-Computers
8. Summary
9. References

1. Introduction
Rapid changes in global environmental phenomena, such as global warming, have created an urgent need for simulation systems that can predict events with a large-scale impact on human beings. Understanding such unprecedented phenomena, however, requires sophisticated, large-scale computer simulations.

As a result, the Science and Technology Agency (STA) of Japan initiated the Earth Simulator project, and a research and development team was formed among the National Space Development Agency of Japan (NASDA), the Japan Atomic Energy Research Institute (JEARI), and the Japan Marine Science and Technology Center [1]. The Earth Simulator became a legend in meteorology and weather-forecasting simulation, and for several years it was the fastest supercomputer in existence. It is aimed at providing up-to-date information about the effects of global warming and other relevant geophysical problems. The structure of this document is organized as follows: first the architecture of the Earth Simulator is introduced and its interconnection network paradigm is presented; then the software and programming environment of the Earth Simulator are described; application domains and examples of implemented projects are concisely illustrated, followed by a short comparison with other supercomputers and the machine's position in today's ranking of supercomputers. Finally, the document concludes with a summary of the seminar work.

1.1. Overview
The Earth Simulator is a highly parallel vector supercomputer system consisting of 640 processor nodes (PN) and interconnection networks (IN). The nodes and their network devices and cables are housed in cabinets: 320 PN cabinets (two PNs per cabinet) and 65 IN cabinets. These cabinets are installed in a building 65 m long and 50 m wide; the machine itself occupies an area of approximately 1,600 m2 (about 41 m by 40 m).

2. Hardware Overview
The Earth Simulator is a massively parallel vector system in which each node has a shared memory architecture. It consists of 640 processor nodes bonded by a (640 x 640) single-stage crossbar switch. Each node holds 8 vector processors with a peak performance of 8 Gflops each, a shared memory of 16 GB capacity, a remote access control unit which links the node to the network and controls data paths and data transmission, and an input/output (I/O) processor. Every processor has a vector operation unit, a four-way super-scalar operation unit, and a main memory access control unit [2].
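These per-node figures account directly for the headline numbers quoted in the abstract; a quick check of the arithmetic:

\[
640\ \text{nodes} \times 8\ \text{APs} \times 8\ \text{Gflops} = 40.96\ \text{Tflops} \approx 40\ \text{Tflops},
\qquad
640\ \text{nodes} \times 16\ \text{GB} = 10.24\ \text{TB} \approx 10\ \text{TB}.
\]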

2.1. Arithmetic Process
As mentioned earlier, the arithmetic process unit includes a 4-way super-scalar unit designed for sequential operations, a vector unit, and a main memory access control unit. The arithmetic process unit operates at a clock speed ranging from 500 MHz to 1 GHz. Additionally, every super-scalar unit has a 64 KB instruction cache, a 64 KB data cache, and 128 general-purpose scalar registers [2].

Figure 1. Structure of the arithmetic processor (AP) unit [2]

In the Earth Simulator, every processor element employs branch prediction, data pre-fetching, and out-of-order execution. Branch prediction speculates whether a conditional branch will be taken before the condition is resolved. Data pre-fetching obtains instructions and data from main memory before they are required, in order to hide memory latency and speed up execution. Out-of-order execution, a restricted form of dataflow computation, keeps a queue of pre-fetched instructions; an instruction is issued only when all of its required inputs are on hand.

2.1.1 Vector Unit

Vector-based computers perform an operation on massive amounts of data in parallel on a single node, and they overcome the communication lag that super-scalar processors incur when data must be moved. Applications that can take advantage of this technology show large performance increases.

As can be seen from the figure above, the vector unit is an integral part of the AP and consists of 8 sets of vector pipelines, vector registers, and some mask registers. There are six types of operation pipeline: add/shift, multiply, divide, logical, mask, and load/store. Vector pipelines of the same type operate together under a single vector instruction, while pipelines of different types operate concurrently. Additionally, the AP has 72 vector registers of 256 vector elements each [2][3].

The Earth Simulator uses this vector unit to exploit the SIMD (Single Instruction, Multiple Data) approach and thus improve its performance. Vector instructions are beneficial in two ways: the Earth Simulator fetches and decodes far fewer instructions, so control-unit overhead is greatly reduced, and the memory bandwidth needed to perform a given sequence of operations is reduced. The vector processor used in the Earth Simulator is the NEC SX-6 processor [2][3].

Figure 2. Vector architecture [2].
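To make the vectorization idea concrete, the following is a minimal sketch (illustrative Fortran of our own, not ES source code) of the kind of loop a vectorizing compiler such as the SX-6's maps onto the vector pipelines; a single vector instruction then processes a whole strip of elements at once:

```fortran
! Illustrative sketch: a vectorizable AXPY-style loop. On a vector
! processor the compiler replaces the scalar loop with vector load,
! vector multiply/add, and vector store instructions, issued in strips
! of up to 256 elements (one vector-register length on the SX-6).
subroutine axpy(n, alpha, x, y)
  implicit none
  integer, intent(in)    :: n
  real(8), intent(in)    :: alpha, x(n)
  real(8), intent(inout) :: y(n)
  integer :: i

  do i = 1, n          ! independent iterations: no loop-carried dependency
     y(i) = y(i) + alpha * x(i)
  end do
end subroutine axpy
```

Because one vector instruction replaces up to 256 scalar iterations, far fewer instructions are fetched and decoded, which is exactly the control-overhead reduction described above.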

2.1.2 Scalar Unit

Clusters of super-scalar processors form Multiple Instruction, Multiple Data (MIMD) computers. MIMD is a technique for achieving parallelism in which different processors may execute different instructions on different data. The super-scalar processors are usually commercial off-the-shelf (COTS) components networked together to offer synchronization and data sharing. Each 4-way super-scalar unit in the ES has a peak performance of 1.0 Gflops and supports branch prediction, data pre-fetching, and out-of-order execution. As mentioned earlier, the super-scalar unit has a 64 KB instruction cache, a 64 KB data cache, and 128 general-purpose scalar registers [2][3].

2.1.3. Memory System

Each node of the ES has a shared memory that is shared equally by its 8 processors and is configured with 32 main memory package units (MMU) comprising 2048 banks. High-speed 128-Mbit DRAM with a 24 ns bank cycle time is used for the memory chips. The capacity of a node's memory is 16 GB; each AP has 32 GB/s of memory bandwidth, giving a total bandwidth of 256 GB/s per node [4].

Figure 3. Memory system of the nodes [4].
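The per-node total follows directly from the per-processor figure:

\[
8\ \text{APs} \times 32\ \text{GB/s} = 256\ \text{GB/s per node}.
\]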

3. Interconnection Network
In multiprocessing systems, different levels of communication and cooperation occur while solving a given problem, for example by sending messages or by sharing memory. A parallel system must be tightly coupled if numerous processors interact through shared memory. Parallel processing therefore requires a capable system interconnect for speedy communication among the processors, shared memory, I/O, and peripheral devices. The machine's speed is determined by three key factors: the memory bandwidth, the interconnection network topology, and the interconnection network's dynamic behavior [4].

A switched network such as a crossbar provides dynamic interconnection among processors and main memory. A crossbar network offers the highest bandwidth and optimal interconnection: a cross-point switch sets up a dedicated connection path between two nodes, and each switch can be set on or off dynamically on demand [4].

The Earth Simulator interconnection network is an enormous network of 640 x 640 non-blocking crossbar switches that connects the 640 nodes. The interconnection bandwidth between every pair of nodes is 12.3 GB/s full duplex. The entire switching capacity of the interconnection network is approximately 7.87 TB/s [2].

A remote access control unit (RCU) in each node is connected to the crossbar switches and manages inter-node data transfers at 12.3 GB/s in each direction, for both sending and receiving data; hence the total bandwidth of the network among the nodes is about 8 TB/s [4].
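The aggregate figure is just the per-node link rate summed over all nodes:

\[
640\ \text{nodes} \times 12.3\ \text{GB/s} = 7.872\ \text{TB/s} \approx 8\ \text{TB/s}.
\]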

The Earth Simulator's crossbar network consists of two units, as can be seen in the figure below: the inter-node crossbar control unit (XCT), which coordinates the switching operations, and the inter-node crossbar switch (XSW), which is the actual data path [4].

Figure 4. Configuration of the interconnection network [13].

4. Mass Data Processing System (MDPS)


As numerous large-scale distributed parallel programs are run on the Earth Simulator (ES), I/O processing for data such as user files and work files became the bottleneck of numerical simulations. Handling the huge volume of data generated by the ES is extremely difficult and creates significant problems. Therefore, in 2003 work started on Network Queuing System II and the Mass Data Processing System (MDPS) to overcome these problems and to improve system utilization and maintainability [5][6].

As the figure below depicts, the Mass Data Processing System (MDPS) was set up as a data storage system that renovates the existing archive system. MDPS consists of four file service processors, 240 TB of magnetic disks, and a cartridge tape library (CTL) of 1.5 PB. MDPS improved the overall transfer speed between the working disks of the Earth Simulator and the storage, and it enhanced data I/O capability. Additionally, MDPS allows users to access the results computed by the Earth Simulator from a remote location, as it can move the data to an assigned file server within the MDPS. To exploit the MDPS further, the Network Queuing System II tool is used for scheduling and managing jobs; a short description of it is presented in the next sub-section. The figure illustrates the connection between the ES and the MDPS [5][6].

Figure 5. MDPS storage system [5].


4.1. Network Queuing System II
Since the ES uses a batch job system, its nodes are grouped into clusters of two types: S-clusters and L-clusters. One of the 40 clusters of the ES (cluster 0, nodes 0-15) is dedicated to S-cluster operation, which provides interactive features for end users such as debugging and compiling; the S-cluster also includes special nodes for job execution. The remaining 39 clusters (624 nodes) form the L-clusters, which provide a dedicated batch environment for executing massively parallel computing tasks that employ large numbers of processor nodes. To run these massive tasks, the system uses a mechanism called job scheduling [7].

A processor node in the Earth Simulator has restricted access to the user disks, so user files are moved from the user disks in the S-cluster to the work disks in the L-cluster before a job is carried out.

Moving user files from the user disks to the work disks for job execution in the L-cluster is called "stage-in", while moving the results computed by the Earth Simulator back to the user disks is called "stage-out". Generally, job scheduling goes through the following steps [6]:
- Processor node allocation
- Stage-in (copying files from the user disks to the work disks)
- Job escalation (rescheduling for an earlier start time if possible)
- Job execution
- Stage-out (copying files from the work disks to the user disks)

The following figure illustrates the job scheduling activities that take place in the Earth Simulator.


Figure 6. Job scheduling in an L-cluster [8].

5. Software
The operating system running on the PNs of the Earth Simulator is UNIX, and the execution environment behaves like a conventional UNIX system. The version the ES uses is an enhanced version of SUPER-UX, specifically designed for supercomputers. The characteristics of this OS include:
- Vector processing
- Parallel processing for distributed memory
- Parallel processing for shared memory
- Batch system
- High-performance I/O
- Cluster management
All these features enable the Earth Simulator to achieve high performance when massive computational tasks are run. Moreover, the ES supports a high-speed parallel file system for large-scale scientific data, which treats a number of files created on various kinds of disks as one logically aggregated large file [1].


5.1. Programming Environment
Parallel programming plays an essential role in large-scale scientific computation. A parallelization technique often employed is task-level program partitioning. This technique entails data communication between the partitioned tasks, so reducing the communication time between tasks has a significant impact on the program's execution time [9].

The Earth Simulator supports many rich libraries, including the Message Passing Interface (MPI) and High Performance Fortran (HPF). The MPI library provides fast message-passing capabilities, significantly reducing the time spent on data communication between interacting tasks. The Earth Simulator supports both MPI-1 and MPI-2, implemented to exploit the Earth Simulator's hardware capabilities [7]. The MPI implementation designed for the Earth Simulator reduces both latency and overhead by accessing the IN directly and by employing specific hardware features. In addition, it enhances the system's throughput by selecting the most appropriate method to move data and by internally separating inter-node and intra-node communication. All these optimization techniques give the MPI implementation a throughput of 12 GB/s in point-to-point communication.
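As an illustration of the message-passing model, here is a minimal point-to-point exchange in standard MPI Fortran (a sketch of our own, not code from the ES documentation; the message size, ranks, and tag are arbitrary):

```fortran
program pingpong
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 1048576        ! 8 MB of double-precision data
  integer :: rank, ierr, status(MPI_STATUS_SIZE)
  real(8) :: buf(n)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  if (rank == 0) then
     buf = 1.0d0
     ! On the ES this transfer would travel through the RCU and the
     ! crossbar network described in Section 3.
     call MPI_SEND(buf, n, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, ierr)
  else if (rank == 1) then
     call MPI_RECV(buf, n, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, &
                   status, ierr)
  end if

  call MPI_FINALIZE(ierr)
end program pingpong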

High Performance Fortran (HPF), on the other hand, is a compiler system that provides an easy yet efficient way to achieve highly scalable parallel programming. The user annotates the program with directives; guided by these directives the compiler splits the work and assigns it to the parallel processors, generating Fortran code with MPI function calls to take advantage of the MPI library. The major advantage of the HPF compiler is its parallelization capability: it provides the three levels of parallelism essential to the Earth Simulator, namely vectorization, intra-node shared-memory parallelization, and inter-node distributed-memory parallelization [7].
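For comparison with the explicit MPI style above, here is a minimal sketch of HPF's directive style (standard HPF directives; the array size and the 16-processor arrangement are invented for illustration):

```fortran
! A sketch of directive-driven HPF. The arrays are distributed across
! an abstract arrangement of 16 processors, and the INDEPENDENT
! directive tells the compiler the loop iterations may run in parallel.
program hpf_sketch
  implicit none
  integer, parameter :: n = 4096
  real(8) :: a(n), b(n)
  integer :: i
!HPF$ PROCESSORS nodes(16)
!HPF$ DISTRIBUTE a(BLOCK) ONTO nodes
!HPF$ ALIGN b(i) WITH a(i)

  b = 1.0d0
!HPF$ INDEPENDENT
  do i = 1, n
     a(i) = 2.0d0 * b(i)
  end do
end program hpf_sketch
```

The programmer never writes a send or receive; the compiler derives the data movement from the DISTRIBUTE and ALIGN directives and emits the MPI calls itself.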


6. Earth Simulator and its Application Domain


As mentioned earlier, the Earth Simulator was established to monitor the earth's environmental changes and to help mitigate natural disasters such as earthquakes, hurricanes, heat waves, or extensive and lengthy heavy rains. The Earth Simulator has been employed in many research areas, focusing specifically on earth science. In this section we describe a few of these applications and give a brief overview of the 2008 Earth Simulator projects.

The Atmosphere and Ocean Simulation project was one of the earliest pieces of research carried out on the system. It aimed to develop better ways of enhancing weather forecasts and to attempt earlier prediction of upcoming disasters. In this research the Earth Simulator simulated portions of the earth's land and sea surface, and their impact on the weather was studied [10].

Additionally, the Solid Earth Simulation Research Group conducted a study concerning the earth's structure and the complex dynamics of its interior. The group is attempting to create new dynamic algorithmic solutions for geophysical simulations and new grid systems in spherical geometry [10].

The research carried out by the Multi-scale Simulation Research Group aimed to develop a non-hydrostatic coupled ocean-atmosphere simulation code with high computational performance. The code makes it possible to simulate physical processes on the earth and provides a means to understand both the weather and the climate system. This holistic simulation uncovers the complex interdependencies between micro- and macro-scale processes [11].


The other projects conducted in 2008 on the Earth Simulator are listed below [11]:

Earth Science
- Atmospheric Composition Change and its Climate Effect Studied by Chemical Transport Models
  Project representative: Hajime Akimoto (FRCGC, JAMSTEC)
- Understanding and Forecasting High-Impact Phenomena in the Atmosphere and Ocean
  Project representative: Wataru Ofuchi (ESC, JAMSTEC)
Table 1: Earth Science Research

Computer Science
- Development of Macro-Micro Interlocked Simulation Algorithm
  Project representative: Kanya Kusano (ESC, JAMSTEC)
- Development of General Purpose Numerical Software Infrastructure for Large Scale Scientific Computing
  Project representative: Akira Nishida (Computing and Communications Center, Kyushu University)
Table 2: Computer Science Research

Epoch-making Simulation
- Numerical Simulation of Rocket Engine Internal Flows
  Project representative: Nobuyuki Tsuboi (Japan Aerospace Exploration Agency)
- Large-Scale Simulation on the Properties of Carbon-Nanotube
  Project representative: Syogo Tejima (Research Organization for Information Science & Technology)
Table 3: Epoch-making Simulation Research


7. Comparison of Super-Computers
As mentioned earlier, the Earth Simulator sat at the top of the supercomputer ranking list for many years. These statistics are collected by Top500, an online project that has released a list of the most powerful, highest-performance supercomputers twice a year since 1993. Its latest release (June 2008) reveals that IBM's Roadrunner, built for the U.S. Department of Energy's Los Alamos National Laboratory, is the most powerful supercomputer in existence today, with a performance of 1.026 petaflop/s. The top ten most powerful computers are listed below [12]:

1. Roadrunner - BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband (IBM) - DOE/NNSA/LANL, United States
2. BlueGene/L - eServer Blue Gene Solution (IBM) - DOE/NNSA/LLNL, United States
3. Blue Gene/P Solution (IBM) - Argonne National Laboratory, United States
4. Ranger - SunBlade x6420, Opteron Quad 2 GHz, Infiniband (Sun Microsystems) - Texas Advanced Computing Center / Univ. of Texas, United States
5. Jaguar - Cray XT4 QuadCore 2.1 GHz (Cray Inc.) - DOE/Oak Ridge National Laboratory, United States
6. JUGENE - Blue Gene/P Solution (IBM) - Forschungszentrum Juelich (FZJ), Germany
7. Encanto - SGI Altix ICE 8200, Xeon quad core 3.0 GHz (SGI) - New Mexico Computing Applications Center (NMCAC), United States
8. EKA - Cluster Platform 3000 BL460c, Xeon 53xx 3 GHz, Infiniband (Hewlett-Packard) - Computational Research Laboratories, TATA SONS, India
9. Blue Gene/P Solution (IBM) - IDRIS, France
10. SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz (SGI) - Total Exploration Production, France

Table 4: Top ten supercomputers (Top500, June 2008)


As can be seen from the list, the Earth Simulator no longer appears in the top ten; according to this latest Top500 release it has dropped to rank 49, which means the days of the Earth Simulator being the most powerful, highest-performance computer are over.

8. Summary
The Earth Simulator ranked at the top of the Top500 list of supercomputers for many years, until other supercomputers such as IBM's Roadrunner and Blue Gene and NASA's Columbia came along to take over its place in the ranking.

This seminar work presents an architectural overview of the ES, the network topology used in the ES, and the interconnection between processor nodes. It also introduces the ES data storage, the work disks and user disks, and the mechanisms used for job management. The seminar work further covers the software aspects of the ES and its programming environment, including the various libraries the ES uses. Finally, the application and research domains of the ES are presented.


9. References:
1. "Present Status of Development of the Earth Simulator", 2001.
   http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=955201
2. Gülfem Işıklar, "The Architecture of Earth Simulator".
   http://www.cmpe.boun.edu.tr/courses/cmpe511/fall2004/Gulfem%20Isiklar%20-%20Earth%20Simulator.doc
3. Weston Lahr and Matt Myers, "Discussion of Vector-based Computers and Applicability of Different Types of Programs".
   www.cs.iastate.edu/~cs425/reports/lahr-myers-report.pdf
4. Tetsuya Sato, Shigemune Kitawaki, and Mitsuo Yokokawa, "Earth Simulator Running".
   www.ultrasim.info/sato.pdf
5. Ken'ichi Itakura, "MDPS: The New Mass Data Processing Storage System for the Earth Simulator".
   http://www.jamstec.go.jp/esc/publication/journal/jes_vol.5/pdf/JES5_23-Itakura.pdf
6. "Mass Data Processing and NQS II".
   http://www.jamstec.go.jp/esc/publication/leaflet/pdf/01.pdf
7. Takashi Yanagawa and Kenji Suehiro, "Software System of the Earth Simulator".
   http://www.sciencedirect.com/
8. "Job Scheduling".
   http://www.jamstec.go.jp/es/en/system/scheduling.html
9. Hitoshi Uehara, Masanori Tamura, and Mitsuo Yokokawa, "An MPI Benchmark Program Library and its Application to the Earth Simulator". Springer Berlin / Heidelberg.
   http://www.springerlink.com/content/a1gp2w2attvk039y/
10. "Solid Earth Simulation Research Group".
    http://www.jamstec.go.jp/index.en.html#seg
11. "Earth Simulator Collaboration Projects".
    http://www.jamstec.go.jp/es/en/project/list_kyoudou2008.html
12. "Top 500 Supercomputers".
    http://www.top500.org/
13. "Notes on the Earth Simulator".
    http://www-mips.unice.fr/~baude/Systemes-Distribues/Earth.pdf

