
Reconfigurable Computing

Sherif Abou Zied Mohammad


Sherif.abouzied@nileu.edu.eg

OUTLINE
• INTRODUCTION
• RECONFIGURABLE COMPUTING ARCHITECTURES
• RECONFIGURATION MANAGEMENT
• PROGRAMMING RECONFIGURABLE SYSTEMS
• COMPILING C FOR SPATIAL COMPUTING
• HW/SW Partitioning
• BEE2: A High-End Reconfigurable Computing System
• REFERENCES

INTRODUCTION
Conventional Computing
• Software-programmed microprocessors
– Processors execute a set of instructions.
– Performance can suffer, if not in clock speed then in work rate: each operation carries the overhead of fetching and decoding an instruction.
– Lower performance than ASICs.

INTRODUCTION
Conventional Computing
• Hardwired (ASICs)
– Special purpose.
– Very fast and efficient.
– Circuit cannot be altered after fabrication; any change requires a redesign.

INTRODUCTION
Reconfigurable Computing
• Fill the gap between hardware and software.
– Much higher performance than software.
– Higher level of flexibility than hardware.

INTRODUCTION
Reconfigurable Computing
• Uses FPGAs or other programmable hardware for
compute-intensive calculations.
• Usually coupled with a general-purpose
microprocessor that is responsible for
– Controlling the reconfigurable logic.
– Executing program code that cannot be efficiently
accelerated.

INTRODUCTION
Reconfigurable devices
• Contain an array of computational elements.
• Functionality is determined through configuration
bits.

INTRODUCTION
Reconfigurable devices
• Most current FPGAs and reconfigurable devices are SRAM-programmable. The SRAM configuration bits
– Control routing.
– Control multiplexers, LUTs, …
– Provide control signals for computational units.

[Figure: 3-input LUT and D flip-flop with optional bypass]
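
To make the role of the configuration bits concrete, here is a minimal software model of the logic cell named in the figure caption. It is only a sketch: the `logic_cell` struct, its field names, and the choice of 8 truth-table bits plus 1 bypass bit are assumptions for illustration, not the layout of any particular device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one logic cell: 8 LUT bits plus 1 bypass bit. */
typedef struct {
    uint8_t lut_bits;  /* truth table: bit i is the output for input value i */
    int     use_ff;    /* 1 = registered output, 0 = bypass the flip-flop    */
    int     ff_state;  /* current contents of the D flip-flop                */
} logic_cell;

/* Combinational lookup: the three inputs form a 3-bit index into the table. */
static int lut3_eval(const logic_cell *c, int a, int b, int cin)
{
    assert(((a | b | cin) >> 1) == 0);  /* each input must be 0 or 1 */
    int index = (a << 2) | (b << 1) | cin;
    return (c->lut_bits >> index) & 1;
}

/* One clock edge: returns the output visible during this cycle,
   then latches the new LUT result into the flip-flop. */
static int cell_clock(logic_cell *c, int a, int b, int cin)
{
    int comb = lut3_eval(c, a, b, cin);
    int out  = c->use_ff ? c->ff_state : comb;
    c->ff_state = comb;
    return out;
}
```

Loading `lut_bits` with 0x96 turns the cell into a 3-input XOR; changing only that byte turns the same hardware into any other 3-input function, which is the sense in which functionality is determined by configuration bits.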

INTRODUCTION
Reconfigurable devices
• Reconfigurable Processing Fabric (RPF)
– Fine-grained
– Coarse-grained

INTRODUCTION
Reconfigurable devices
• Fine-grained RPF
– Bit manipulation tasks
– For complex calculations, numerous fine-grained PEs are required, which leads to slower clock rates.

INTRODUCTION
Reconfigurable devices
• Coarse-grained RPF
– Uses bus interconnect and PEs that perform more than just bitwise operations, such as ALUs and multipliers.

RECONFIGURABLE COMPUTING
ARCHITECTURES
RPF integration
[Figure not reproduced. Source: S. Goldstein, H. Schmit, M. Moe, M. Budiu, S. Cadambi, R. R. Taylor, R. Laufer, "PipeRench: A coprocessor for streaming multimedia acceleration."]
RECONFIGURABLE COMPUTING
ARCHITECTURES
• RPF integration
– Separate processor (coprocessor)
• Data communication takes place through main memory
• Limited bandwidth between CPU and RPF

RECONFIGURABLE COMPUTING
ARCHITECTURES
• RPF integration
– Loosely coupled RPF and processor architecture
• RPF with the host processor on the same chip
• Direct interaction between RPF and processor
• RPF with direct memory access
[Figure: Chameleon's architecture]

RECONFIGURABLE COMPUTING
ARCHITECTURES
• RPF integration
– Tightly coupled RPF and processor
• RPF integrated as a reconfigurable functional unit (RFU), alongside units such as ALUs and multipliers.
• The RFU accesses input data through the register file.
[Figure: the datapath of the processor + RFU architecture]

RECONFIGURABLE COMPUTING
ARCHITECTURES
• RPF integration
– Tightly coupled RPF and processor
• Virtual Instruction Configurations (VICs) in the RFU typically run during the execute stage (and possibly the memory stage) of the pipeline.
[Figure: an example of a pipeline of a processor with an RFU]

RECONFIGURATION MANAGEMENT
Problem Definition
• Reconfigurability allows hardware to perform
different tasks at different times.
• An application's configurations can be swapped in and out as needed.
• Reconfiguring the hardware at runtime is called
Runtime Reconfiguration (RTR).

RECONFIGURATION MANAGEMENT
Problem Definition
• RTR
– Run-time reconfiguration is based upon the
concept of virtual hardware, which is similar to
virtual memory.
• The physical hardware is much smaller than the sum of the resources required.
• Configurations are swapped in and out of the actual hardware.

RECONFIGURATION MANAGEMENT
Problem Definition
• RTR
– Increases hardware utilization
– Introduces significant reconfiguration overhead
– Time consuming: reconfiguration can require hundreds of milliseconds

RECONFIGURATION MANAGEMENT
Problem Definition
• Computation and reconfiguration are mutually
exclusive
– Time spent reconfiguring is time lost in terms of application acceleration.
• Reconfiguration can occupy approximately 25 to 98 percent of total execution time
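
Because computation and reconfiguration do not overlap in this model, the share of run time lost to reconfiguration follows from simple arithmetic. The helper below is a sketch under that assumption; the function name and its millisecond parameters are illustrative and not taken from any tool.

```c
#include <assert.h>

/* Fraction of total run time spent reconfiguring, assuming computation
   and reconfiguration are mutually exclusive (never overlapped). */
static double reconfig_fraction(double t_compute_ms,
                                double t_reconfig_ms,
                                unsigned n_reconfigs)
{
    assert(t_compute_ms >= 0.0 && t_reconfig_ms >= 0.0);
    double overhead = t_reconfig_ms * (double)n_reconfigs;
    double total    = t_compute_ms + overhead;
    return total > 0.0 ? overhead / total : 0.0;
}
```

For example, a task that computes for 300 ms and needs one 100 ms reconfiguration spends a quarter of its run time reconfiguring; with three such reconfigurations the share rises to one half.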

RECONFIGURATION MANAGEMENT
Configuration Architectures
• What are configuration architectures?
• Architectures
– Single-context
– Multi-context
– Partially Reconfigurable
– Others

RECONFIGURATION MANAGEMENT
Configuration Architectures
Single-context
• Configurations are grouped into contexts, and each full context is swapped in and out of the FPGA as needed.

RECONFIGURATION MANAGEMENT
Configuration Architectures
Single-context
• Configuration information is loaded into the
programmable array through a serial shift chain

RECONFIGURATION MANAGEMENT
Configuration Architectures
Single-context
• Requires few pins for configuration, potentially simplifying board-level design.
• The entire chip must be reprogrammed for any change to the configuration data, because the data cannot be selectively reused on the chip.

RECONFIGURATION MANAGEMENT
Configuration Architectures
Single-context
• Configuration cycles can be reduced by
widening the configuration path
– Virtex-5 allows a configuration data bus up to 32 bits wide
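
The effect of a wider configuration path can be estimated with a ceiling division. This is a rough sketch that assumes one bus word is loaded per configuration clock cycle and ignores any protocol overhead; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Configuration clock cycles needed to load a bitstream, assuming one
   bus word per cycle and no protocol overhead (an idealization). */
static uint64_t config_cycles(uint64_t bitstream_bits, unsigned bus_width_bits)
{
    assert(bus_width_bits > 0);
    return (bitstream_bits + bus_width_bits - 1) / bus_width_bits;  /* ceiling */
}
```

Under these assumptions a made-up 1,000,000-bit configuration takes 1,000,000 cycles over a serial chain and 31,250 cycles over a 32-bit bus.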

RECONFIGURATION MANAGEMENT
Configuration Architectures
Multi-context
• Provides storage for multiple configurations
– Facilitates configuration prefetching and fast reconfiguration
– Contains multiple planes (contexts) of configuration data

RECONFIGURATION MANAGEMENT
Configuration Architectures
Multi-context
• A multiplexer chooses between the context planes

RECONFIGURATION MANAGEMENT
Configuration Architectures
Multi-context advantage
• Background loading of configuration data
• Fast switching between stored configurations
– some in a single clock cycle
• Overlapping computations with configuration
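
The behaviour listed above can be mimicked in software as an array of configuration planes plus a select value that plays the role of the multiplexer. This is a toy model: `NUM_CONTEXTS`, `NUM_CELLS`, and all function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_CONTEXTS 4  /* hypothetical number of context planes      */
#define NUM_CELLS    8  /* hypothetical number of configurable cells  */

typedef struct {
    uint8_t  plane[NUM_CONTEXTS][NUM_CELLS];  /* stored configuration data */
    unsigned active;                          /* multiplexer select        */
} multi_context_fabric;

/* Background load: fill an inactive plane while the active one keeps running.
   Returns -1 if the caller tries to overwrite the running configuration. */
static int load_context(multi_context_fabric *f, unsigned ctx, const uint8_t *bits)
{
    assert(ctx < NUM_CONTEXTS);
    if (ctx == f->active)
        return -1;
    memcpy(f->plane[ctx], bits, NUM_CELLS);
    return 0;
}

/* Fast switch: only the select value changes, no configuration data moves. */
static void switch_context(multi_context_fabric *f, unsigned ctx)
{
    assert(ctx < NUM_CONTEXTS);
    f->active = ctx;
}

/* What a cell currently sees: the bits of the selected plane. */
static uint8_t cell_config(const multi_context_fabric *f, unsigned cell)
{
    assert(cell < NUM_CELLS);
    return f->plane[f->active][cell];
}
```

Loading plane 1 while plane 0 is active leaves `cell_config` unchanged until `switch_context` is called, which is how computation and configuration loading can overlap.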

RECONFIGURATION MANAGEMENT
Configuration Architectures
Multi-context drawbacks
• Area overhead
– Additional configuration data
– Multiplexing
• Single cycle configuration
– Dynamic power?

RECONFIGURATION MANAGEMENT
Configuration Architectures
Partially Reconfigurable
• Not all configurations require the entire chip
area
• Reconfigure utilized resources only
• Use addressable configuration
memory
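
Addressable configuration memory can be pictured as an array of frames that may be written individually. The sketch below counts frames written as a stand-in for configuration time; `NUM_FRAMES`, the struct, and the function names are hypothetical and do not describe a real device interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_FRAMES 64  /* hypothetical device size, in configuration frames */

typedef struct {
    uint32_t      frame[NUM_FRAMES];
    unsigned long frames_written;  /* running count, stands in for load time */
} config_memory;

/* Single-context style: every frame is rewritten. */
static void full_reconfigure(config_memory *m, const uint32_t *image)
{
    memcpy(m->frame, image, sizeof m->frame);
    m->frames_written += NUM_FRAMES;
}

/* Partial reconfiguration: only the addressed frames are rewritten. */
static void partial_reconfigure(config_memory *m, unsigned first,
                                unsigned count, const uint32_t *data)
{
    assert(first <= NUM_FRAMES && count <= NUM_FRAMES - first);
    memcpy(&m->frame[first], data, count * sizeof data[0]);
    m->frames_written += count;
}
```

Updating 4 of the 64 frames costs 4 frame writes instead of 64, and the untouched frames keep their contents.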

RECONFIGURATION MANAGEMENT
Configuration Architectures
Partially Reconfigurable
– Decreases reconfiguration time
– Decreases the amount of configuration data
– A configuration occupying a large area still takes a long time to load
– Open issue: independent configurations whose hardware regions overlap cannot be resident at the same time

PROGRAMMING RECONFIGURABLE
SYSTEMS
• Application programmers will ignore reconfigurable systems unless they can easily incorporate them into their own systems.
• A software design environment that aids in the creation of configurations for the reconfigurable hardware is required.

PROGRAMMING RECONFIGURABLE
SYSTEMS
• Software design environment
– Manual
• Powerful method for the creation of high-quality circuit
designs.
• Requires a great deal of background knowledge of the
particular reconfigurable system employed.
• Significant amount of design time.

PROGRAMMING RECONFIGURABLE
SYSTEMS
• Software design environment
– Fully automatic
• Quick and easy.
• Makes the use of reconfigurable hardware more
accessible to general application programmers.
• Quality may suffer.

Compiling C for spatial computing
Why C?
• There are many more C programmers than hardware
designers.
• Writing an algorithm in C is typically faster than in an
HDL.
• Large existing code base.
• Allows both hardware (HW) and software (SW) versions to be created
– The operating system can choose at runtime which is better

Compiling C for spatial computing
Why C?
• Easy for the designer or compiler to quickly explore the tradeoffs between different hardware/software partitionings.
• The code can be easily tested on a conventional
microprocessor.

Compiling C for spatial computing
How C runs on spatial hardware (overview)
• In a C program, the statements execute in order.
• With spatial computation, each operation is implemented as its own functional unit.

Compiling C for spatial computing
How C runs on spatial hardware (overview)
Memory loads and stores
• Memory access operations must be scheduled to
– allow sharing among memory operations.
– preserve sequential C semantics.

Compiling C for spatial computing
How C runs on spatial hardware (overview)
If-then-else Using
Multiplexers
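
The idea behind this slide can be sketched in C itself: instead of branching, both arms of the conditional are computed by their own functional units and a 2-to-1 multiplexer picks one result. The example function (an absolute difference) and all names are illustrative choices, not taken from the slides; unsigned arithmetic is used so that both arms are always well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Sequential form: only one arm of the if-then-else executes. */
static uint32_t abs_diff_branch(uint32_t a, uint32_t b)
{
    if (a > b)
        return a - b;
    else
        return b - a;
}

/* Software stand-in for a 2-to-1 multiplexer. */
static uint32_t mux2(int sel, uint32_t if_true, uint32_t if_false)
{
    assert(sel == 0 || sel == 1);
    return sel ? if_true : if_false;
}

/* Spatial form: both subtractors and the comparator exist side by side,
   and the comparator output drives the multiplexer select. */
static uint32_t abs_diff_spatial(uint32_t a, uint32_t b)
{
    uint32_t then_val = a - b;  /* "then" subtractor */
    uint32_t else_val = b - a;  /* "else" subtractor */
    int      sel      = a > b;  /* comparator        */
    return mux2(sel, then_val, else_val);
}
```

Both versions return the same value for every input; the spatial one trades extra hardware (two subtractors instead of one) for the absence of control flow.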

Compiling C for spatial computing
How C runs on spatial hardware (overview)
More than just simple
if-then-else control flow
– Use sub-circuits

Compiling C for spatial computing
How C runs on spatial hardware (overview)
Optimizing the
Common Path

Compiling C for spatial computing
How C runs on spatial hardware (overview)
• What about
– Parallelism?
– Pipelining?
– Memory dependencies?
– Operator size?

Compiling C for spatial computing
Automatic Compilation
Overall compiler flow:
C source code -> Control Flow Graph -> Hyperblocks -> Data Flow Graph -> Circuit Generation

Compiling C for spatial computing
Automatic Compilation
Control Flow Graph (CFG)
• The code is broken into basic blocks of simple instructions.
• Blocks are connected by control edges, each indicating a possible branch.
• All instructions inside a given block execute once the block is entered.

Compiling C for spatial computing
Automatic Compilation
Hyperblocks
• CFG basic blocks are quite
small and limit our
opportunities for
parallelism.
• Compiler combines blocks
along commonly taken
paths.
• Hyperblocks have a single
entry point at the top and
one or more exits.

Compiling C for spatial computing
Automatic Compilation
Data Flow Graph (DFG)
• The DFG is composed of
nodes and edges.
• Nodes
– Inputs, constants, operations, memory accesses, and exit nodes
• Edges
– Data transfer edges, ordering edges, and exit edges

Compiling C for spatial computing
Automatic Compilation
DFG optimizations
• Strength reduction
– replacing one operator with another operator(s)
having less overall latency/area.
• replace x*2 with x+x or x<<1
• x*7 can be expressed as (x<<2)+(x<<1)+x, but even
better as (x<<3)-x.
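
The equivalence of the three forms of x*7 can be checked directly. The sketch below uses unsigned 32-bit values so that wraparound is well defined; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Original operator: one multiplier. */
static uint32_t times7_multiply(uint32_t x)  { return x * 7u; }

/* First rewrite: two shifts and two adders (4x + 2x + x). */
static uint32_t times7_shift_add(uint32_t x) { return (x << 2) + (x << 1) + x; }

/* Better rewrite: one shift and one subtractor (8x - x). */
static uint32_t times7_shift_sub(uint32_t x) { return (x << 3) - x; }

/* Checks that both rewrites match the multiplier for every x below limit. */
static void check_times7(uint32_t limit)
{
    for (uint32_t x = 0; x < limit; x++) {
        assert(times7_shift_add(x) == times7_multiply(x));
        assert(times7_shift_sub(x) == times7_multiply(x));
    }
}
```

In hardware a shift by a constant is only wiring, so the last form costs a single subtractor, which is why it is preferred over the two-adder version.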

Compiling C for spatial computing
Automatic Compilation
DFG optimizations
• Boolean value identification
– ISO C90 does not contain a Boolean data type (C99 added _Bool).
– Although the result of a comparison is defined to be either 0 or 1, its type is int, a signed integer that is typically 32 bits wide.
– The compiler can identify such values and use only one bit for them.

Compiling C for spatial computing
Automatic Compilation
DFG optimizations
• Type-based operator size reduction
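– ISO C semantics dictate that arithmetic and logical operations on char and/or short operands are performed at the precision of type int.
– When the result is stored back into a narrower variable, only its low-order bits are kept. Thus, for 16-bit operands and a 16-bit result, a 16-bit adder gives the same result as a 32-bit adder.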
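
The claim about the 16-bit adder can be tested by comparing C's promoted addition, truncated on assignment, with a bit-level model of a 16-bit ripple-carry adder. This is a sketch for illustration; the function names are hypothetical, and the ripple-carry structure is simply one easy way to model a narrow adder.

```c
#include <assert.h>
#include <stdint.h>

/* What C does: operands are promoted to int, added at int precision,
   and the result is truncated when stored into a 16-bit variable. */
static uint16_t add16_via_int(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* What the hardware can do: a 16-bit ripple-carry adder whose
   carry out of bit 15 is simply discarded. */
static uint16_t add16_ripple(uint16_t a, uint16_t b)
{
    uint16_t sum   = 0;
    unsigned carry = 0;
    for (int i = 0; i < 16; i++) {
        unsigned ai = (a >> i) & 1u;
        unsigned bi = (b >> i) & 1u;
        sum  |= (uint16_t)((ai ^ bi ^ carry) << i);
        carry = (ai & bi) | (carry & (ai ^ bi));
    }
    return sum;
}

/* Compares the two adders on a grid of operand values. */
static void check_add16(unsigned step)
{
    assert(step > 0);
    for (unsigned a = 0; a < 65536u; a += step)
        for (unsigned b = 0; b < 65536u; b += step)
            assert(add16_ripple((uint16_t)a, (uint16_t)b) ==
                   add16_via_int((uint16_t)a, (uint16_t)b));
}
```

The low 16 bits of a sum depend only on the low 16 bits of the operands, so the upper half of a 32-bit adder would be wasted area here.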

Compiling C for spatial computing
Automatic Compilation
DFG optimizations
• Type-based operator size reduction
– Analyze the number of bits actually required by variables and operators.
• Example
– The integer i within the loop
for (i = 0; i < 100; i++)
never exceeds 100, so 7 bits are enough instead of 32.
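
The width needed for a value with a known upper bound can be computed as follows. This is a minimal sketch of the kind of calculation involved, assuming non-negative values; the function names are hypothetical, and a real compiler would derive the bound from a range analysis that is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent every value in 0..max_value. */
static unsigned bits_for_max(uint32_t max_value)
{
    unsigned bits = 1;
    while (max_value >>= 1)
        bits++;
    return bits;
}

/* Returns 1 if value can be held in an unsigned field of the given width. */
static int fits_in_bits(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return bits == 32 || (value >> bits) == 0;
}
```

For the loop `for (i = 0; i < 100; i++)`, the largest value `i` ever holds is 100 (on exit), and `bits_for_max(100)` is 7, so the counter and its incrementer can be 7 bits wide.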

Compiling C for spatial computing
Automatic Compilation
DFG to Reconfigurable Fabric
• Map DFG nodes to hardware modules.
• Schedule each module to a specific timestep.
• Finally, connect the modules of the different hyperblock sub-circuits to complete the overall circuit.
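
The slides do not say which scheduling algorithm is used. As one simple possibility, the sketch below applies as-soon-as-possible (ASAP) scheduling: each module starts at the earliest timestep at which all of its inputs are ready. The `dfg_node` struct, the limit of two predecessors, and the latencies are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical DFG node: up to two data predecessors and a latency
   measured in timesteps. */
typedef struct {
    int num_preds;
    int preds[2];
    int latency;
} dfg_node;

/* ASAP scheduling. Nodes must be listed in topological order
   (every predecessor index is smaller than the node's own index).
   Fills start[] and returns the timestep at which the last node finishes. */
static int asap_schedule(const dfg_node *nodes, int n, int *start)
{
    assert(n >= 0);
    int finish = 0;
    for (int i = 0; i < n; i++) {
        start[i] = 0;
        for (int p = 0; p < nodes[i].num_preds; p++) {
            int pred = nodes[i].preds[p];
            assert(pred >= 0 && pred < i);
            int ready = start[pred] + nodes[pred].latency;
            if (ready > start[i])
                start[i] = ready;
        }
        if (start[i] + nodes[i].latency > finish)
            finish = start[i] + nodes[i].latency;
    }
    return finish;
}
```

For the expression (a*b) + (c*d) with 2-step multipliers and a 1-step adder, both multipliers start at timestep 0 in parallel, the adder starts at timestep 2, and the circuit finishes at timestep 3.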

HW/SW Partitioning

• Applies to systems that include both reconfigurable hardware and a traditional microprocessor.
• The program must first be partitioned into
– Sections to be executed on the reconfigurable hardware
• e.g., fixed datapath operations
– Sections to be executed in software on the microprocessor
• e.g., complex control sequences such as variable-length loops

HW/SW Partitioning
• Partitioning
– Manually
• The resulting program ends up tuned to a specific machine
• An alternative is to use compiler directives

The NAPA C language [Gokhale and Stone 1998] provides pragma statements that let a programmer specify whether a section of code is to be executed in software on the Fixed Instruction Processor (FIP) or in hardware on the Adaptive Logic Processor (ALP).

HW/SW Partitioning
• Partitioning
– Automatically
• The compiler and runtime system take full responsibility for determining the right code and granularity to move to the reconfigurable fabric.
• The reconfigurable hardware is transparent to the designer.
• Cost functions based on the acceleration gained determine whether the cost of configuration is outweighed by the benefits of hardware execution.
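
A very simple cost function of this kind compares total software time against configuration time plus total hardware time. The sketch below is only one possible formulation; the function name, the millisecond parameters, and the assumption that a single configuration serves all calls are illustrative.

```c
#include <assert.h>

/* Returns 1 if moving a code section to hardware pays off, assuming the
   configuration is loaded once and then reused for every call. */
static int worth_moving_to_hw(double t_sw_ms, double t_hw_ms,
                              double t_config_ms, unsigned calls)
{
    assert(t_sw_ms >= 0.0 && t_hw_ms >= 0.0 && t_config_ms >= 0.0);
    double sw_total = t_sw_ms * (double)calls;
    double hw_total = t_config_ms + t_hw_ms * (double)calls;
    return hw_total < sw_total;
}
```

With a 10 ms software version, a 1 ms hardware version, and a 100 ms configuration cost, 5 calls do not justify the move (105 ms against 50 ms), while 20 calls do (120 ms against 200 ms).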

BEE2: A High-End Reconfigurable
Computing System
• BEE: Berkeley Emulation Engine
– BEE2 can provide over 10 times more computing
throughput than a DSP-based system with similar
power consumption and cost.
– Over 100 times the throughput of a microprocessor-based system.

BEE2: A High-End Reconfigurable
Computing System
• BEE: Berkeley Emulation Engine
– Applications
• Emulation and design of novel wireless
communications systems.
• High-performance real-time digital signal processing.
• Real-time scientific computation and simulation.
• The acceleration of CAD tools.

BEE2: A High-End Reconfigurable
Computing System
• BEE: Berkeley Emulation Engine
– The BEE2 system uses Xilinx Virtex-II Pro FPGAs.
– Virtex-II Pro embeds PowerPC 405 processor cores into the reconfigurable fabric.
– BEE2 has no hardware-managed caches, hence all
data transfers within the system have tightly
bounded latency.
• BEE2 is therefore well suited for real-time applications

BEE2: A High-End Reconfigurable
Computing System
• BEE: Berkeley Emulation Engine
– Programming environment
• High-level block diagram design environment based on
Mathworks Simulink and the Xilinx System Generator
library.
• Uses automatic compilation tools

BEE2: A High-End Reconfigurable
Computing System
• Compute modules:
– Each compute module consists of five Xilinx Virtex-II Pro 70 FPGA chips, each directly connected to four Double Data Rate 2 (DDR2) 240-pin DRAM DIMMs, with a maximum capacity of 4 Gbytes per FPGA.
– The local mesh connects the four compute FPGAs on a 2D grid.

BEE2: A High-End Reconfigurable
Computing System
• Compute modules:
– Each link between adjacent FPGAs on the grid provides over 40 Gbps of data throughput.
– Each of the four down links from the control FPGA to the compute FPGAs provides up to 20 Gbps.

REFERENCES
• Scott Hauck and André DeHon, "Reconfigurable Computing: The Theory and Practice of FPGA-Based Computing."
• Katherine Compton, "Reconfigurable Computing: A Survey of Systems and Software," Northwestern University.
• Chen Chang, John Wawrzynek, and Robert W. Brodersen, "Berkeley BEE2: A High-End Reconfigurable Computing System," University of California.

Thank You
