Teslapersonalsupercomputer 160201192005

Tesla Personal Supercomputer
Priya Manikpuri
M.Sc. (CS)-I, Sem-II
Shri Shivaji Science College, Nagpur
Contents
1. Introduction
2. Features
3. GPU Computing
4. CUDA Parallel Architecture and Programming Model
5. Tesla C1060 Specifications and Architecture
6. Advantages and Disadvantages
7. Future Scope
8. Conclusion
Introduction
GPU-based desktop computer
Backed by NVIDIA
Built by Dell, Lenovo and other companies
Based on NVIDIA's CUDA parallel computing architecture
933 Gigaflops peak performance per Tesla C1060 GPU
Up to 250 times faster than standard PCs
Supported platforms on a Tesla-certified system: Windows XP (32-bit) and Linux (32-bit and 64-bit)
Features
Multi-GPU Computing
Massively Multi-threaded Computing Architecture
4 GB High-Speed Memory per GPU
High-Speed PCI-Express Gen 2.0 Data Transfer
64-bit ALUs for Double-Precision Math

GPU Computing
GPU computing is the use of a GPU (graphics processing unit) to do general-purpose scientific and engineering computing.
The model for GPU computing is to use a CPU and GPU together in a heterogeneous computing model.
CUDA Parallel Architecture and Programming Model
CUDA stands for Compute Unified Device Architecture
Developed by NVIDIA to help programmers write code for GPUs (specifically NVIDIA's GPUs)
An extension of C and C++
CUDA offers a data-parallel programming model
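The data-parallel idea can be sketched without a GPU. The block below is a plain-Python emulation, not real CUDA code: every emulated thread runs the same function on a different array element, using the block-index and thread-index arithmetic that CUDA C kernels conventionally use. The names `vector_add_kernel` and `launch` are hypothetical helpers invented for this illustration, and the loop runs the threads one after another rather than in parallel.

```python
# Plain-Python emulation of a CUDA-style data-parallel launch (not real CUDA).
# Every (block, thread) pair runs the same function on a different element.

def vector_add_kernel(block_idx, thread_idx, block_dim, a, b, out):
    """Work done by one emulated thread: add a single pair of elements."""
    i = block_idx * block_dim + thread_idx  # global element index
    if i < len(out):                        # guard threads past the end
        out[i] = a[i] + b[i]

def launch(kernel, grid_dim, block_dim, *args):
    """Hypothetical launcher: calls the kernel once per emulated thread."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)

n = 10
a = list(range(n))
b = [10 * x for x in a]
out = [0] * n

block_dim = 4                                # threads per block
grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n

launch(vector_add_kernel, grid_dim, block_dim, a, b, out)
print(out)
```

On a real GPU the two loops in `launch` disappear: the hardware schedules all threads of the grid, and only the kernel body is written by the programmer.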
Tesla C1060 Computing Processor
GPU
-Number of processor cores: 240
-Processor core clock: 1.296 GHz
-Max Power Consumption: 187.8 W
Memory
-Total Dedicated Memory: 4 GB
-Memory Speed: 800 MHz
-Memory Interface: 512-bit GDDR3
-Memory Bandwidth: 102 GB/sec
External Connectors: None
Internal Connectors and Headers:
-One 6-pin PCI Express power connector
-One 8-pin PCI Express power connector
-4-pin fan connector
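The headline numbers can be cross-checked from the specifications above. The sketch below is a back-of-the-envelope calculation, not a vendor formula. It rests on two assumptions that are not stated in the slides: each core issues up to 3 single-precision floating-point operations per clock (a multiply-add plus a multiply), and GDDR3 transfers data twice per memory clock.

```python
# Rough cross-check of the Tesla C1060 figures listed above.

CORES = 240
CORE_CLOCK_GHZ = 1.296
FLOPS_PER_CORE_PER_CYCLE = 3   # assumption: multiply-add (2) + multiply (1)

peak_gflops = CORES * CORE_CLOCK_GHZ * FLOPS_PER_CORE_PER_CYCLE

BUS_WIDTH_BITS = 512
MEMORY_CLOCK_MHZ = 800
TRANSFERS_PER_CLOCK = 2        # assumption: GDDR3 is double data rate

bytes_per_transfer = BUS_WIDTH_BITS / 8
bandwidth_gb_s = (bytes_per_transfer * MEMORY_CLOCK_MHZ * 1e6
                  * TRANSFERS_PER_CLOCK) / 1e9

print(f"Peak single precision: {peak_gflops:.2f} GFLOPS")
print(f"Memory bandwidth:      {bandwidth_gb_s:.1f} GB/s")
```

Under these assumptions the results come out at about 933 GFLOPS and about 102 GB/s, in line with the quoted peak performance and memory bandwidth.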
NVIDIA Tesla - Architecture
At the heart of the new Tesla personal supercomputer are three or four NVIDIA Tesla C1060 computing processors.
The application starts on the host side (the CPU), which communicates with the device side (the GPU) over a PCI-Express x16 bus.
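A rough sizing of the whole system follows from these two points. The sketch below assumes that the 933 Gigaflops and 4 GB figures apply to each GPU, and that a PCI-Express Gen 2.0 lane carries about 0.5 GB/s in each direction (a theoretical figure that is not given in the slides). Real transfers are slower than this upper bound. The function names are hypothetical.

```python
# Rough system-level estimates for a three- or four-GPU configuration.

GPU_PEAK_GFLOPS = 933     # assumption: quoted peak is per GPU
GPU_MEMORY_GB = 4

def system_totals(num_gpus):
    """Return (aggregate peak in TFLOPS, total GPU memory in GB)."""
    return num_gpus * GPU_PEAK_GFLOPS / 1000, num_gpus * GPU_MEMORY_GB

PCIE2_LANE_GB_S = 0.5     # assumption: theoretical, per lane, per direction
LANES = 16

def transfer_seconds(gigabytes):
    """Best-case time to move data between host and device over x16."""
    return gigabytes / (PCIE2_LANE_GB_S * LANES)

for gpus in (3, 4):
    tflops, mem = system_totals(gpus)
    print(f"{gpus} GPUs: {tflops:.3f} TFLOPS peak, {mem} GB GPU memory")

print(f"Filling one 4 GB GPU takes at least {transfer_seconds(4):.2f} s")
```

The transfer estimate shows why applications try to keep data on the device: the bus is far slower than the GPU's own memory.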
NVIDIA Tesla - Components
The Tesla C1060 comprises 30 streaming multiprocessors (SMs).
The SM is the processing unit, and it is a unified graphics and computing multiprocessor.
Each SM comprises eight scalar processors (SPs), 16 KB of on-chip shared memory, and 16,384 32-bit registers.
Each SM has two single-precision special function (SF) units to carry out transcendental functions.
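The per-SM figures add up to the chip-level totals. The short calculation below uses only numbers from this slide, and the variable names are illustrative.

```python
# Chip-level totals derived from the per-SM figures above.

NUM_SMS = 30
SPS_PER_SM = 8
SHARED_KB_PER_SM = 16
SF_UNITS_PER_SM = 2

total_cores = NUM_SMS * SPS_PER_SM            # scalar processors on the chip
total_shared_kb = NUM_SMS * SHARED_KB_PER_SM  # shared memory across all SMs
total_sf_units = NUM_SMS * SF_UNITS_PER_SM    # special function units

print(total_cores, total_shared_kb, total_sf_units)
```

The core count matches the 240 processor cores listed in the specifications.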
NVIDIA Tesla - Components
Texture unit: processes one group of threads per cycle, optimized for texture computations
Raster operations processor (ROP)
- Paired with a specific memory partition and texture/processor cluster
- Supports an interconnect with both DDR2 and GDDR3 memory for up to 16 GB/s bandwidth
- Used to aid in anti-aliasing
Data Flow and Memory
Warp capability: each streaming multiprocessor handles up to 32 warps of 32 threads each, or 1,024 threads.
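The thread count per SM is the warp count multiplied by the warp size, which is 32 threads in CUDA. The sketch below takes the warp limit as a parameter, because it differs between GPU generations. The function name and the 30-SM default are illustrative choices for this deck.

```python
# Resident threads from the warp limit: threads = warps * 32.

THREADS_PER_WARP = 32   # CUDA warp size
NUM_SMS = 30            # SM count of the Tesla C1060

def resident_threads(warps_per_sm, num_sms=NUM_SMS):
    """Return (threads per SM, threads across the whole chip)."""
    per_sm = warps_per_sm * THREADS_PER_WARP
    return per_sm, per_sm * num_sms

for warps in (24, 32):
    per_sm, chip = resident_threads(warps)
    print(f"{warps} warps/SM: {per_sm} threads per SM, {chip} on the chip")
```

Keeping this many threads resident is how the GPU hides memory latency: while one warp waits for data, another is ready to run.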
Memory Access:
Data Flow and Memory
Memory and Interconnect:
512-bit bus with 8 independent partitions (meaning many possible connections)
Uses GDDR3 RAM, which has much higher bandwidth than DDR DRAM, though it requires more power
Memory traffic within the chip goes through a specific hardware component that ties the various components together (the ROP)
Advantages and Disadvantages
Advantages:
Your own supercomputer
Designed for office use
Solve large-scale problems using multiple GPUs
Can be used in medical applications for processing brain and body scans, resulting in faster diagnosis
Disadvantages:
Overheating: If a GPU hits its maximum temperature, the driver throttles down performance and may shut down the system.
CUDA does not support the full C standard, as it runs host code through
a C++ compiler, which makes some valid C (but invalid C++) code fail to
compile.
Future Scope
Although, priced between 4,000 and 8,000, it is beyond the reach of most consumers, the high-performance processor could become invaluable to universities and medical institutions.
NVIDIA's Tesla computer could prove invaluable to medical researchers and accelerate the discovery of cures for diseases.
With the massively parallel architecture of the GPU, scientists and engineers can get a quantum jump in performance and continue to advance the pace of their work, guiding us to faster discovery in drug research, weather modeling, oil and gas exploration, computational finance, and more.
Conclusion
The technology represents a great leap forward in the history of computing.
The new computers make innovative use of graphics processing units.
The Tesla Personal Supercomputer doesn't make supercomputing clusters obsolete, but it is a major breakthrough for millions of researchers who can take advantage of the huge heterogeneous computing power of this system.
These supercomputers can cut the time it takes to process information by up to 1,000 times.
THANK YOU
