
Smarter Cluster Supercomputing from the Supercomputer Experts


CS400-LC Liquid Cooling Highlights
• Cools with warm water instead of chillers
• Secondary loop cools critical server components
• Leak detection and remote monitoring
• Lowers energy costs; datacenter PUE of 1.1 or lower
• Capable of up to 80 percent heat capture
Maximize Your Productivity with Flexible, High-Performance Cray® CS400 Liquid-Cooled Cluster Supercomputers
In science and business, as soon as one question is answered, another is waiting. And with so much depending on fast, accurate answers to complex problems, you need reliable high performance computing (HPC) tools matched to your specific tasks.
Understanding that time is critical and all HPC problems are not created equal, we developed the Cray® CS400 cluster supercomputer series. These systems are industry standards-based, highly customizable, easy to manage, and purposefully designed to handle the broadest range of medium- to large-scale simulations and data-intensive workloads.
All CS400 components have been carefully selected, optimized and integrated to create a powerful, reliable high-performance compute environment capable of scaling to over 11,000 compute nodes and 11 peak petaflops.
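Those two headline numbers are mutually consistent. Assuming two 12-core Intel Xeon E5-2600 v3 processors per node at roughly 2.6 GHz, with AVX2 fused multiply-add delivering 16 double-precision flops per core per cycle (the clock rate here is an illustrative assumption, not a quoted specification):

\[
16\,\tfrac{\text{flops}}{\text{core·cycle}} \times 12\,\text{cores} \times 2.6\,\text{GHz} \times 2\,\text{sockets} \approx 1\,\text{teraflops per node},
\qquad 11{,}000 \times 1\,\text{teraflops} \approx 11\,\text{petaflops}.
\]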
Flexible node configurations featuring the latest processor and interconnect technologies mean you can get to the solution faster by tailoring a system to your specific HPC application needs. Innovations in packaging, power, cooling and density translate to superior energy efficiency and compelling price/performance. Expertly engineered system management software instantly boosts your productivity by simplifying system administration and maintenance, even for very large systems.
Cray has long been a leader in delivering tightly integrated supercomputer systems for large-scale deployments. With the CS400 system, you get that same Cray expertise and productivity in a flexible, standards-based and easy-to-manage cluster supercomputer.
CS400-LC Cluster Supercomputer: Liquid-Cooled and Designed for Your Workload
The CS400-LC system is our direct-to-chip, warm water-cooled cluster supercomputer. Designed for significant energy savings, it features liquid-cooling technology that uses heat exchangers instead of chillers to cool system components. Compared to traditional air-cooled clusters, the CS400-LC system can be up to three times more energy efficient, with typical payback cycles ranging from immediate to one year.
Along with lowering operational costs, the CS400-LC system offers the latest x86 processor technologies from Intel in a highly scalable package. Industry-standard server nodes and components have been optimized for HPC and paired with a comprehensive HPC software stack, creating a unified system that excels at capacity- and data-intensive workloads.
Innovative Liquid Cooling Keeps Your System Cool and Energy Costs Low
Designed to minimize power consumption without compromising performance, the CS400-LC cluster supercomputer uses an innovative heat exchange system to cool system processors and memory.
The heat exchange cooling process starts with a coolant distribution unit (CDU), connected to each rack, and two separate cooling loops. One loop delivers warm or cool facility water to the CDU, where the heat is exchanged and the now-hot facility water exits the other end of the loop. A second loop repeats the process at the server level. A double-sealed, low-pressure secondary loop with dripless quick connects cools the critical server components. It delivers cooled liquid to the servers, where pump/cold plate units atop the processors capture the heat, and the now-hot liquid circulates back to the CDU for heat exchange. Facility water and server loop liquid never mix; liquid-to-liquid heat exchangers within the CDU transfer heat between the loops.
This isolated dual-loop design safeguards the nodes. First, the server loop is low pressure and low flow; server loop components are not subject to the high pressure of the facility loop. Second, the server loop is prefilled with nonconductive, deionized water containing additives to prevent corrosion.
Since it requires less powerful fans on the servers and fewer air conditioning units in the facility, the CS400-LC system reduces typical energy consumption by 50 percent, with a predicted power usage effectiveness (PUE) of 1.1 or lower. The system can also capture up to 80 percent of heat from the server components for possible reuse. Additionally, leak detection and prevention features are tightly integrated with the system's remote monitoring and reporting capabilities.
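PUE is simply the ratio of total facility power to IT equipment power, so the quoted figure can be read directly against the per-cabinet power in the specifications below (a rough illustration, not a measured value):

\[
\mathrm{PUE} = \frac{P_{\text{facility}}}{P_{\text{IT}}}
\;\Rightarrow\;
\text{at } \mathrm{PUE} = 1.1,\ \text{a } 38\,\text{kW cabinet adds only} \approx 3.8\,\text{kW of cooling and power-delivery overhead}.
\]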
Choice of Flexible, Scalable Configurations
Flexibility is at the heart of the Cray CS400-LC system design. At the system level, the CS400-LC cluster is built on the Cray® GreenBlade platform. Comprising server blades and chassis, the platform is designed to provide mix-and-match building blocks for easy, flexible configurations at both the node and whole-system level. Among its advantages, the GreenBlade platform offers high density (up to 60 compute nodes per 42U rack), excellent memory capacity (up to 1,024 GB per node), many power and cooling efficiencies and a built-in management module for industry-leading reliability.
The CS400-LC system features the latest Intel® Xeon® processors. It offers multiple interconnect and network topology options, maximum bandwidth, local storage, many network-attached file system options and the ability to integrate with Lustre®-based global parallel storage systems including Cray® Cluster Connect, Cray Tiered Adaptive Storage (TAS) and Cray® Sonexion® scale-out storage.
Within this framework, the Cray CS400-LC system can be tailored to multiple purposes: from an all-purpose cluster, to one suited for shared memory parallel tasks, to a system optimized for hybrid compute- and data-intensive workloads.
Nodes are divided by function into compute and service nodes. Compute nodes run parallel MPI and/or OpenMP tasks with maximum efficiency, while service nodes provide I/O and login functions.
Compute nodes feature two Intel Xeon processors per node and up to 1,024 gigabytes of memory. Each node can host one local hard drive.
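As an illustration of the hybrid workloads these compute nodes target, here is a minimal MPI + OpenMP sketch in C; the build and launch commands shown in the comments (mpicc, mpirun) are typical examples and depend on the MPI stack installed on your system.

```c
/* hybrid.c - minimal hybrid MPI + OpenMP example.
   Build (typical):  mpicc -fopenmp hybrid.c -o hybrid
   Run   (typical):  mpirun -np 2 ./hybrid              */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */

    /* Each MPI rank spawns an OpenMP thread team on its node. */
    #pragma omp parallel
    {
        printf("rank %d/%d, thread %d/%d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

On a two-socket node like this one, a common mapping is one MPI rank per socket with OMP_NUM_THREADS set to the number of cores per socket.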
With industry-standard components throughout, each system configuration can be replicated over and over to create a reliable and powerful large-scale system.
CS400-LC Hardware Configuration Options
• Two-socket x86 Intel Xeon processors
• Large memory capacity per node
• Multiple interconnect options: 3D torus/fat tree, single/dual rail FDR InfiniBand
• Local hard drives in each server
• Choice of network-attached file systems and Lustre-based parallel file storage systems
[Diagram: CS400-LC cooling path — an outdoor dry cooler supplies facility water to the coolant distribution unit (CDU), which feeds the low-pressure server loop and the server coolers]
Easy, Comprehensive Manageability
A flexible system is only as good as your ability to use it. The Cray CS400-LC cluster supercomputer offers two key productivity-boosting tools: a customizable HPC cluster software stack and the Cray Advanced Cluster Engine (ACE) system management software.

Cray HPC Cluster Software Stack

HPC Programming Tools
• Development & Performance Tools: Cray PE on CS, Intel® Cluster Studio, PGI Cluster Development Kit, GNU Toolchain, NVIDIA® CUDA®
• Application Libraries: Cray LibSci, LibSci_ACC, Intel® MPI, IBM Platform MPI, MVAPICH2, OpenMPI
• Debuggers: Rogue Wave TotalView, Allinea DDT and MAP, Intel® IDB, PGI PGDBG, GNU GDB

Schedulers, File Systems and Management
• Resource Management / Job Scheduling: SLURM, Adaptive Computing Moab, Maui, Torque, Altair PBS Professional, IBM Platform LSF, Grid Engine
• File Systems: Lustre®, NFS, GPFS, Panasas PanFS, local (ext3, ext4, XFS)
• Cluster Management: Cray Advanced Cluster Engine (ACE) management software

Operating Systems and Drivers
• Drivers & Network Mgmt.: Accelerator Software Stack & Drivers, OFED
• Operating Systems: Linux® (Red Hat, CentOS)
The HPC cluster software stack consists of a range of software tools compatible with most open source and commercial compilers, debuggers, schedulers and libraries. Also available as part of the software stack is the Cray Programming Environment, which includes the Cray Compiling Environment, Cray Scientific and Math Libraries, and Performance Measurement and Analysis Tools.
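As a small sketch of how an application consumes these libraries, the following calls DGEMM through the standard CBLAS interface, which BLAS implementations such as Cray LibSci provide; the header and link details are an assumption here, since Cray compiler wrappers typically link the math libraries automatically.

```c
/* dgemm_demo.c - C = alpha*A*B + beta*C via the standard CBLAS interface.
   Assumes a CBLAS-providing BLAS library (e.g., Cray LibSci) is linked. */
#include <cblas.h>
#include <stdio.h>

int main(void) {
    /* Small 2x2 matrices stored row-major. */
    double A[4] = {1, 2, 3, 4};
    double B[4] = {5, 6, 7, 8};
    double C[4] = {0};

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2,     /* M, N, K       */
                1.0, A, 2,   /* alpha, A, lda */
                B, 2,        /* B, ldb        */
                0.0, C, 2);  /* beta, C, ldc  */

    printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
    return 0;
}
```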
Advanced Cluster Engine (ACE) Cluster Management Software
Turns Cray clusters into functional, usable, reliable and available computing systems:
• Cluster Management: hierarchical management infrastructure; hierarchical cached root file system; division of the cluster into multiple logical partitions, each with a unique OS personality; revision system with rollback; redundancy and failover; remote management and remote power control
• GUI and CLI: view/change/control; health monitoring; plugin interface for applications
• Compute: automatic discovery; scalable, fast, diskless booting that inherits the partition personality
• Network: automatic discovery; redundant paths; load balancing; failover
• Storage: high bandwidth to storage
The Advanced Cluster Engine (ACE) management software simplifies cluster management for large scale-out environments with extremely scalable network, server, cluster and storage management capabilities. Command line (CLI) and graphical user interface (GUI) options provide flexibility for the cluster administrator. An easy-to-use ACE GUI connects directly to the ACE daemon on the management server and can be executed on a remote system. With ACE, a large system is almost as easy to understand and manage as a workstation.
ACE at a Glance
• Simplifies compute, network and storage management
• Supports multiple network topologies and diskless configurations with optional local storage
• Provides network failover with high scalability
• Integrates easily with standards-based HPC software stack components
• Manages heterogeneous nodes with different software stacks
• Monitors node and network health, power and component temperatures
Built-in Energy Efficiencies and Reliability Features Lower Your TCO
Energy efficiency features, combined with our long-standing expertise in meeting the reliability demands of very large, high-usage deployments, mean you get more work done for less.
In addition to liquid cooling, the CS400-LC options for additional power and cost savings include high-efficiency load-balancing power supplies and a 480V power distribution unit with a choice of 208V or 277V three-phase power supplies. This means you can use industry-standard 208V and 230V power as well as 277V (single phase of a 480V three-phase input) and reduce power loss caused by step-down transformers and resistive losses as the power is delivered from the wall directly to the rack.
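The physics behind that saving is straightforward: for a fixed power draw over the same conductors, current scales inversely with distribution voltage, and resistive loss scales with the square of the current, so (as a rough illustration):

\[
P_{\text{loss}} = I^2 R, \quad I \propto \frac{P}{V}
\;\Rightarrow\;
\frac{P_{\text{loss}}(480\,\mathrm{V})}{P_{\text{loss}}(208\,\mathrm{V})} = \left(\frac{208}{480}\right)^2 \approx 0.19,
\]

roughly a fivefold reduction in conductor losses before any transformer savings are counted.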
Reliability is built into the system design, starting with our careful selection of boards and components. Multiple levels of redundancy and fault tolerance then ensure the system meets your uptime needs. The CS400-LC cluster has redundant power, cooling and management servers, and redundant networks, all with failover capabilities.
Intel® Xeon® Processor E5-2600 Product Family
The Intel Xeon processor is at the heart of the agile, efficient datacenter. Built on Intel's industry-leading microarchitecture based on 22nm 3D Tri-Gate transistor technology, the Xeon processor supports high-speed DDR4 memory technology with increased bandwidth, larger density and lower voltage than previous generations. Intel's support for PCI Express (PCIe) 3.0 ports improves I/O bandwidth, offering extra capacity and flexibility for storage and networking connections. The processor delivers energy efficiency and performance that adapts to the most complex and demanding workloads.
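As a rough illustration of that memory bandwidth (assuming four DDR4-2133 channels per socket, typical for this processor family, at 8 bytes per transfer):

\[
2133 \times 10^{6}\,\tfrac{\text{transfers}}{\text{s}} \times 8\,\text{B} \times 4\,\text{channels} \approx 68\,\text{GB/s per socket}.
\]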
© 2014 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray is a registered trademark of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20140929EMS
Cray Inc. • 901 Fifth Avenue, Suite 1000 • Seattle, WA 98164 • Tel: 206.701.2000 • Fax: 206.701.2500 • www.cray.com
Cray® CS400-LC Specifications
Architecture: Liquid-cooled cluster architecture, up to 60 nodes per 42U rack
Processor, Coprocessor and Accelerators: Support for 12-core, 64-bit Intel® Xeon® processor E5-2600 v3 product family
Memory: Up to 1,024 GB registered ECC DDR4 RAM per compute node using 16 x 64 GB DDR4 DIMMs
Interconnect and Networks:
• External I/O interface
• 10 GbE Ethernet
• FDR InfiniBand with Connect-IB or QDR True Scale Host Channel Adapters
• Options for single- or dual-rail fat tree or 3D torus
System Administration:
• Advanced Cluster Engine (ACE)
• Complete remote management capability
• Graphical and command line system administration
• System software version rollback capability
• Redundant management servers with automatic failover
• Automatic discovery and status reporting of interconnect, server and storage hardware
• Ability to detect hardware and interconnect topology configuration errors
• Cluster partitioning into multiple logical clusters, each capable of hosting a unique software stack
• Remote server control (power on/off, cycle) and remote server initialization (reset, reboot, shut down)
• Scalable fast diskless booting for large node systems and root file systems for diskless nodes
Reliable, Available, Serviceable (RAS):
• Redundant power, cooling and management servers with failover capabilities
• Redundant networks (InfiniBand, GbE and 10 GbE) with failover
• All critical components easily accessible and hot swappable
Resource Management and Job Scheduling: Options for SLURM, Altair PBS Professional, IBM Platform LSF, Adaptive Computing Torque, Maui and Moab, and Grid Engine
File System: Cray® Cluster Connect, Cray® Sonexion®, NFS, Local FS (ext3, ext4, XFS); Lustre® and Panasas PanFS available as global file systems
Disk Storage: Full line of FC-attached disk arrays with support for FC, SATA disk drives and SSDs
Operating System: Red Hat, SUSE or CentOS
Performance Monitoring Tools: Open source packages such as HPCC, Perfctr, IOR, PAPI/IPM, netperf
Compilers, Libraries and Tools:
• Options for OpenMPI, MVAPICH2 or Intel MPI libraries
• Cray Compiler Environment (CCE), Cray LibSci, PGI, Intel Cluster Toolkit compilers, NVIDIA® CUDA®, CUDA C/C++, Fortran
• OpenCL, DirectCompute toolkits, GNU, DDT, TotalView, OFED programming tools and many others
Power:
• Up to 38 kW per cabinet depending on configuration
• 208V/230V/277V power
• Optional 480V power distribution with 277V power supplies
Cooling Features:
• Liquid cooled
• Low-pressure secondary loop completely isolated from primary datacenter liquid loop
• Field-serviceable cooling kits with integrated pressure and leak detection with remote monitoring
Cabinet Dimensions (HxWxD): 82.40" (2,093 mm) H x 23.62" (600 mm) W x 59.06" (1,500 mm) D standard 42U/19" rack cabinet
Cabinet Weight: 1,739 lbs.
