
Multi-Core Processor and Parallel Programming
Agenda
• Introduction
• History
• Hardware Model
• Software Model
• Programming Principle
• Programming Platform
• Simulation
INTRODUCTION
What is Multi-Core Processor
• Essentials for a Multi-Core Processor
– A device with more than one CPU core on a single die
– Its cores coherently share a common virtual memory space
– A MIMD (Multiple Instruction, Multiple Data) device

• Some devices have multiple cores but are not multi-core processors:
– GPGPU
– MPPA
Why Multi-Core
• Single CPUs were no longer getting faster the way they used to.
• Most alternative designs had already been tried, and caches grew to cover most of the die, yet single-core performance still fell far behind Moore’s Law.
• If CPUs and DSPs had kept up the Moore’s Law rate, today we would have 15 GHz CPUs or DSPs.
• Fortunately, the continued capacity growth described by Moore’s Law has made single-chip parallel computing possible.
• Multi-core has become the most common form of parallel computing.
• Single-core performance gains have dropped sharply, to about 20% per year.
• Parallel computing can keep following Moore’s Law.
• Parallel computers are programmed in software, not in hardware.
HISTORY
Production.
Today’s Multi-Cores
Multi-core processors are available today for both general-purpose and embedded applications.
• x86 multi-cores, such as the Intel Core 2 Duo, are aimed at both general-purpose and embedded use.
• ARM Cortex-A9 MPCore processors are specialized for embedded use.
Hardware & Software Model
How to Build Parallel Computers
• Parallel or serial instruction streams
• Parallel or serial data streams
• Shared or distributed memory
• Homogeneous or heterogeneous CPUs
Building Memory for Parallel Computers
• Distributed memory
– Each CPU has its own local memory
– CPUs communicate by sending messages
• Shared memory
– All CPUs share a common memory
– CPUs communicate through shared variables in memory
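The two communication styles above can be contrasted in a few lines of C#. This is only a sketch: the class and method names are made up for illustration, and the "message passing" half uses a BlockingCollection<T> as a stand-in channel inside one process, so it models the programming style rather than real distributed memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class MemoryModelsDemo
{
    // Shared-memory style: both workers update one variable that lives in common memory.
    public static long SharedCounter(int perWorker)
    {
        long counter = 0;
        Action work = () =>
        {
            for (int i = 0; i < perWorker; i++)
                Interlocked.Increment(ref counter); // atomic update of the shared variable
        };
        Parallel.Invoke(work, work);
        return counter;
    }

    // Message-passing style: the producer keeps its data private and sends values;
    // the consumer only sees what arrives through the channel.
    public static long MessagePassingSum(int count)
    {
        using (var channel = new BlockingCollection<int>())
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 1; i <= count; i++) channel.Add(i);
                channel.CompleteAdding(); // tell the consumer no more messages will come
            });

            long sum = 0;
            foreach (int message in channel.GetConsumingEnumerable())
                sum += message;

            producer.Wait();
            return sum;
        }
    }

    static void Main()
    {
        Console.WriteLine(SharedCounter(1000));     // prints 2000
        Console.WriteLine(MessagePassingSum(100));  // prints 5050
    }
}
```

In the shared-memory half the cost is hidden in the atomic increment; in the message-passing half it is explicit in every Add call.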
Building CPUs for Parallel Computers
• Homogeneous CPUs
– All CPUs are the same
– Any code can run on any CPU
• Heterogeneous CPUs
– A mix of different processor types, each optimized for a different kind of application
– General-purpose CPU: general code
– DSP: signal processing
– FPU: arithmetic throughput
– GPU: graphics
Multi-core caches & coherency
• Coherency: every CPU must stay up to date when another CPU writes to memory. Multi-core computing cannot work without it.
• In multi-core processors, each cache snoops on the others' activity. When it sees a write to data it holds, it either:
– updates its copy, or
– invalidates it.
Multi-Core shared memory communication
Data is sent between processors through a shared variable, which passes through several levels of cache and the coherency protocol:
• The source CPU writes the data to the shared location in its L1 cache.
• The L1 cache writes through to the L2 cache.
• The snooper invalidates the stale copy in the target CPU's L1 cache.
• The target CPU tries to read the data at the shared address into its L1 cache.
• The L1 cache misses and asks the L2 cache.
• The data is copied from L2 to L1.
• Communication is complete.
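The sequence above can be traced with a toy model. The sketch below assumes one private L1 per core, a single shared L2, write-through, and invalidate-on-write; ToyCacheSystem and its members are hypothetical names, and real hardware protocols (MESI and its variants) are considerably more involved.

```csharp
using System;
using System.Collections.Generic;

// Toy model: one private L1 per core, one shared L2, write-through with invalidation.
sealed class ToyCacheSystem
{
    private readonly Dictionary<int, int>[] l1;                             // per-core L1: address -> value
    private readonly Dictionary<int, int> l2 = new Dictionary<int, int>();  // shared L2

    public int L1Misses { get; private set; }

    public ToyCacheSystem(int cores)
    {
        l1 = new Dictionary<int, int>[cores];
        for (int i = 0; i < cores; i++) l1[i] = new Dictionary<int, int>();
    }

    public void Write(int core, int address, int value)
    {
        l1[core][address] = value;                 // source CPU writes into its own L1
        l2[address] = value;                       // L1 writes through to L2
        for (int other = 0; other < l1.Length; other++)
            if (other != core)
                l1[other].Remove(address);         // snoop: invalidate stale copies elsewhere
    }

    public int Read(int core, int address)
    {
        int value;
        if (l1[core].TryGetValue(address, out value))
            return value;                          // L1 hit
        L1Misses++;
        value = l2[address];                       // L1 miss: ask L2 (throws if never written)
        l1[core][address] = value;                 // copy from L2 into L1
        return value;
    }
}

static class CoherencyDemo
{
    static void Main()
    {
        var system = new ToyCacheSystem(2);
        system.Write(0, 0x10, 1);
        Console.WriteLine(system.Read(1, 0x10));   // prints 1 (miss, filled from L2)
        system.Write(0, 0x10, 2);                  // invalidates core 1's copy
        Console.WriteLine(system.Read(1, 0x10));   // prints 2 (miss again, so no stale 1)
        Console.WriteLine(system.L1Misses);        // prints 2
    }
}
```

Without the invalidation loop in Write, the second read would hit core 1's L1 and return the stale value 1.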
Software Model
Symmetric multiprocessing
This is the most common multi-core architecture.
What happens in SMP:
• Each processor sees the same memory, large or small.
• One OS runs on the entire platform.
• Inter-processor communication is done through memory variables.
• SMP provides an easy, incremental path from serial to parallel programming.
• Each processor sees the same memory as before.
• Existing applications run without any change.
• Utilization is good, since any process can run on any processor.
Asymmetric Multiprocessing

• Some processors can be specialized for a particular task.
• Some processors may run their own OS.
• Some processors may be dedicated to tasks that need better real-time response.
Programming Principle
Decomposition

For parallel operation, the work must be decomposed:
• Functional parallelism
– Break the task into a stream of successive operations
• Data parallelism
– Divide the data among multiple operators
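A small sketch of both decompositions, using the .NET 4.0 Task Parallel Library. The names DecompositionDemo, SquareAll and PipelineSumOfSquares are invented for this example, and the two-stage pipeline is just one possible way to express functional parallelism.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class DecompositionDemo
{
    // Data parallelism: the same operation applied to different elements of the data.
    public static int[] SquareAll(int[] input)
    {
        var output = new int[input.Length];
        Parallel.For(0, input.Length, i => output[i] = input[i] * input[i]);
        return output;
    }

    // Functional parallelism: two different stages run concurrently as a pipeline.
    public static int PipelineSumOfSquares(int[] input)
    {
        using (var squared = new BlockingCollection<int>(16)) // bounded buffer between stages
        {
            Task stage1 = Task.Factory.StartNew(() =>
            {
                foreach (int x in input) squared.Add(x * x);  // stage 1: square
                squared.CompleteAdding();
            });

            int sum = 0;
            Task stage2 = Task.Factory.StartNew(() =>
            {
                foreach (int sq in squared.GetConsumingEnumerable())
                    sum += sq;                                // stage 2: accumulate
            });

            Task.WaitAll(stage1, stage2);
            return sum;
        }
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(string.Join(",", SquareAll(data)));  // prints 1,4,9,16
        Console.WriteLine(PipelineSumOfSquares(data));         // prints 30
    }
}
```

Data parallelism scales with the size of the data; a pipeline can only be as parallel as its number of stages.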
Multi-Threading: Data Parallelism
• Fork/join parallelism
• Single Program, Multiple Data (SPMD)

Communication and Synchronization
• The problem of communication and synchronization.
• How can we verify that data is read only after it has been written?
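One way to read the second bullet is that a reader must not see the shared value before the writer has stored it. The sketch below shows two standard .NET tools for this, assuming that reading: a ManualResetEventSlim to order a read after a write, and a lock to protect a read-modify-write sequence. SyncDemo and its method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class SyncDemo
{
    // Ordering: the reader blocks until the writer signals that the value is in place.
    public static int ReadAfterWrite()
    {
        int shared = 0;
        using (var written = new ManualResetEventSlim(false))
        {
            Task<int> reader = Task.Factory.StartNew(() =>
            {
                written.Wait();      // do not read before the writer has signalled
                return shared;
            });
            Task writer = Task.Factory.StartNew(() =>
            {
                shared = 42;         // write first...
                written.Set();       // ...then signal
            });
            Task.WaitAll(reader, writer);
            return reader.Result;
        }
    }

    // Mutual exclusion: counter++ is a read-modify-write, so it needs a lock.
    public static int LockedCounter(int perWorker)
    {
        int counter = 0;
        object gate = new object();
        Action work = () =>
        {
            for (int i = 0; i < perWorker; i++)
                lock (gate) { counter++; }
        };
        Parallel.Invoke(work, work, work, work);
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(ReadAfterWrite());     // prints 42
        Console.WriteLine(LockedCounter(10000)); // prints 40000
    }
}
```

Removing the lock in LockedCounter typically makes the result come out below 40000 on a multi-core machine, because concurrent increments overwrite each other.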
Multi-Threaded programming
Programming Platform
Choosing a Programming Platform
• OpenMP: C/C++/Fortran
– Thread synchronization and scheduling are handled automatically
– Often the best fit for array processing
– Code is widely portable
• Visual Studio: .NET 4.0 framework
– C#, VB.NET, etc.
– Strong programming support through IntelliSense
Visual Studio platform
Basic Changes
• CLR thread pool 4.0
– Supports up to 256 processors, unlike 2.0, which supported up to 4
– Reduces synchronization overhead by using the ConcurrentQueue<T> class
– Supports a local queue per worker thread and implements a work-stealing algorithm
– Supports cancellation-aware applications
– Can utilize the CPU up to 100%
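As an illustration of the cancellation support mentioned above, here is a minimal sketch using CancellationTokenSource and CancellationToken from .NET 4.0. The worker loop is a placeholder for real work, and CancellationDemo and RunAndCancel are made-up names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationDemo
{
    // Starts a worker that polls its token, cancels it, and reports how the task ended.
    public static bool RunAndCancel()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;

            Task worker = Task.Factory.StartNew(() =>
            {
                while (!token.IsCancellationRequested)
                    Thread.SpinWait(1000);            // placeholder for a unit of real work
                token.ThrowIfCancellationRequested(); // cooperative exit: marks the task as canceled
            }, token);

            cts.Cancel();                             // request cancellation

            try
            {
                worker.Wait();
            }
            catch (AggregateException ex)
            {
                // A canceled task surfaces as an OperationCanceledException inside the aggregate.
                ex.Handle(e => e is OperationCanceledException);
            }

            return worker.IsCanceled;
        }
    }

    static void Main()
    {
        Console.WriteLine(RunAndCancel()); // prints True
    }
}
```

Passing the same token to StartNew lets the scheduler drop the task without running it if cancellation is requested before it starts.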
Simulation
Sequential Program

// Post-order walk: visit both subtrees, then process the node itself.
static void WalkTree(Tree treeToWalk)
{
    if (treeToWalk == null) return;
    WalkTree(treeToWalk.Left);
    WalkTree(treeToWalk.Right);
    ProcessItem(treeToWalk.Data);
}
Parallel Program

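// Requires: using System.Threading.Tasks;
static void WalkTreeUsingTasks(Tree treeToWalk)
{
    if (treeToWalk == null) return;

    // Fork: walk the two subtrees as separate tasks.
    Task left = new Task(() => WalkTreeUsingTasks(treeToWalk.Left));
    left.Start();
    Task right = new Task(() => WalkTreeUsingTasks(treeToWalk.Right));
    right.Start();

    // Join: wait for both subtrees before processing this node.
    left.Wait();
    right.Wait();
    ProcessItem(treeToWalk.Data);
}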
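To try the two walkers side by side, something like the following harness could be used. The slides do not show the Tree type or ProcessItem, so the versions here are assumptions: a minimal binary tree with Left, Right and Data, and a ProcessItem that spins briefly and counts nodes. Timings depend on the machine, so none are quoted.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal tree type; the slides do not show their own definition.
sealed class Tree
{
    public Tree Left, Right;
    public int Data;
}

static class TreeWalkDemo
{
    static int processed;

    // Stand-in for real per-node work: spin briefly, then count the node.
    static void ProcessItem(int data)
    {
        Thread.SpinWait(200000);
        Interlocked.Increment(ref processed);
    }

    // Builds a complete binary tree with 2^depth - 1 nodes.
    public static Tree Build(int depth)
    {
        if (depth == 0) return null;
        return new Tree { Data = depth, Left = Build(depth - 1), Right = Build(depth - 1) };
    }

    public static void WalkTree(Tree treeToWalk)
    {
        if (treeToWalk == null) return;
        WalkTree(treeToWalk.Left);
        WalkTree(treeToWalk.Right);
        ProcessItem(treeToWalk.Data);
    }

    public static void WalkTreeUsingTasks(Tree treeToWalk)
    {
        if (treeToWalk == null) return;
        Task left = Task.Factory.StartNew(() => WalkTreeUsingTasks(treeToWalk.Left));
        Task right = Task.Factory.StartNew(() => WalkTreeUsingTasks(treeToWalk.Right));
        Task.WaitAll(left, right);
        ProcessItem(treeToWalk.Data);
    }

    // Runs one walker and returns how many nodes it processed.
    public static int CountProcessed(Action<Tree> walker, Tree root)
    {
        processed = 0;
        walker(root);
        return processed;
    }

    static void Main()
    {
        Tree root = Build(10); // 1023 nodes

        Stopwatch sw = Stopwatch.StartNew();
        int sequential = CountProcessed(WalkTree, root);
        Console.WriteLine("Sequential: {0} nodes in {1} ms", sequential, sw.ElapsedMilliseconds);

        sw.Restart();
        int parallel = CountProcessed(WalkTreeUsingTasks, root);
        Console.WriteLine("Parallel:   {0} nodes in {1} ms", parallel, sw.ElapsedMilliseconds);
    }
}
```

Both runs should report 1023 nodes. How much faster the task version is depends on the core count and on how expensive ProcessItem is compared with the cost of creating a task per node.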
Result of processing on an Intel Core i3 (with two cores)
THANK YOU
