L16 Fpgas

LECTURE 16 FINE-GRAINED RECONFIGURABLE COMPUTING: FPGAS JOEL EMER AND DANIEL SANCHEZ
6.888 PARALLEL AND HETEROGENEOUS COMPUTER ARCHITECTURE SPRING 2013
Field Programmable Gate Arrays (FPGA)

And 00 0 0 0 1 00 01 10 11 Or 0 0 1 1
LUT
01
Latch
10 11
RAM
.....
Evolution of FPGA applications

3
Logic Replacement
Low
design cost and effort Low volume applications Often replaced with ASIC as volume increases
Algorithmic computation
Offloads
a general purpose processor Used for multiple algorithms ASIC replacement not expected
6.888 Spring 2013 - Sanchez and Emer L16
Benefits of FPGA computation

4
Custom operations/data types custom operations/data types Flexible flow control - control flow based on arbitrary state machines Local state access - local state elements allows parallel state access Fine grain parallelism - replicated logic permits easy parallelism Custom communication - explicit direct inter-module communication

Reduced memory references more direct reuse of data
Better power efficiency more activity directly applied to computation Better area efficiency more area directly applied to computation
QPI-attached FPGA platform

Four Socket QuickAssist Platform Topology
QuickAssist FPGA
Intel QuickAssist QPI-based FPGA Accelerator Accelerator Hardware Module (AHM) Platform (QAP)
P
QPI Links
QPI Links
M Xilinx Virtex 6 Module
CS
I/O I/O I/O
CS
I/O
FPGA
PCIe attached FPGA

Intel Xeon processor 7000 series
Altera Stratix IV Module
Reed Solomon Results

WiMAX requirement is to support a throughput of 134Mbps
Xilinx IP
Equivalent Gate Count Frequency (MHz) Steady State (Cycles/ Block) Data rate (Mbps) 297,409 145.3 660
CatapultC
596,730 91.2 2073
Bluespec
267,741 108.5 276
392.8
89.7
701.3
Lower is better
Higher is better
Source: MIT, Abhinav Agarwal, Alfred Ng CSG
BORPH
7
Berkeley Operating system for ReProgrammable Hardware OS for reconfigurable computers

Treats
reconfigurable hardware as computational resources
UNIX interface to HW designs

Familiar
to both software and hardware engineers Design language independent
Goal:
Make FPGA-based reconfigurable computers easy to use

Conventional View of FPGA Systems

8
User Process (SW) User Process (SW) User Process (SW)
Software
pipe User Library file OS Kernel Device Driver
socket IPC
Master-Slave Relationship
Hardware
Hardware Platform (Network, UART, HD)

FPGA
coprocessor
6.888 Spring 2013 - Sanchez and Emer 6.888 Spring 2013 - Sanchez and Emer L16 L16
BORPH Layers
User Process (SW) User Process (SW) User Process (SW)
Peer-to-Peer Relationship
User Library Software file pipe BORPH Kernel Device Driver IPC ioreg virtual file socket
Hardware
Hardware User Library

User Process (HW) User Process (HW)
6.888 Spring 2013 - Sanchez and Emer 6.888 Spring 2013 - Sanchez and Emer L16 L16
Overview of BORPH Concepts

Hardware process Hardware syscall interface Interacting with an FPGA

SW
SW
SW
Software
file
pipe
User Library
IPC ioreg
socket
Hardware
ioreg virtual file interface Hardware file I/O
BORPH Kernel Device Driver

HW HW
Hardware Process (1)
An executing instance of a hardware design
SW: An executing instance of a program

Software
SW
SW
SW
Normal UNIX process
file
pipe
User Library
IPC ioreg
socket
Has pid, check status with ps, kill, etc
Hardware
Unit of management Created when a BORPH Object File (BOF) file is exec-ed

HW HW
Kernel selects and configure hardware region automatically
11
Benefits of UNIX Process Model

12
Very easy for user to reason about Enable FPGA designs to become active component of the system
e.g.
an FIR filter: Conventional: a passive entity where software sends/receives data BORPH: an active entity that pulls/pushes data as needed
Enable multiple instances of the same FPGA design running in the system
No
more fixed accelerator concept Works well in true reconfigurable computing systems
HW Processes I/O
I/O managed by kernel
Similar to SW
SW SW SW
Hide details from users
Software
e.g. HW-SW, HW-HW UNIX file pipe
file
pipe
User Library
IPC ioreg
socket
File I/O, pipe, signal ioreg virtual file system
Hardware
Standard UNIX I/O mechanism HW specific service

HW HW
Dont ask How do I in HW. Think: What if it were SW?

13
ioreg Virtual File System

14
Maps user defined hardware constructs as virtual files under the processs / proc/<pid>/hw/ioreg/ directory

Single word register Memory: On-chip + Off-chip FIFO /proc/123/hw/ioreg/COUNTERVAL
Example:
ioreg information embedded in the executing BOF file read and write system calls translated to message packet by the kernel
Any UNIX program can communicate with hardware processes

n n n n n
Shell: echo 1 > /proc/123/hw/ioreg/enable C: MEM_FILE = fopen(/proc/123/hw/ioreg/MyMemory, r); fread(swbuf, 1, MEM_SIZE, MEM_FILE); Python, Java, etc
Hardware File I/O

15

Access to the general file system from hardware processes Debug by printing
printf
Read test vectors, record output

Analog Frontend
A/D
Baseband Process
Upper Layer
SW/HW processes chained by file pipe

bash$ receiver.bof < file.in > file.out
Decode video.in
Resize
Edge Detect
Encode video.out
bash$ decode video.in | resize | edgdet.bof | encode > video.out
Latency-Insensitive Design: A Higher Semantic

FPGA
Control Partition Control Timing Partition Fetch Decode Exe
Fetch
Decode
Exe
Functional Partition
Inter-module communication by latency insensitive channels
Changing the timing behavior of a module does not affect functional correctness of the program Improved modularity Simplified design-space exploration 16
Many HW designs use this methodology

Implemented with guarded FIFOs in current RTLs

FPGA0
FPGA
FPGA1
Fetch
Decode
Exe
Behavior of LI channels does not affect functional correctness.
17

FPGA
Fetch
Decode
Exe
There are many FIFOs in the design
It may not be safe to modify some of them Reasoning about cycle accuracy is difficult But the programmer knows about the LI property
18
Compilers see only wires and registers
A Syntax for LI Design
Programmer needs to differentiate LI channels from normal FIFOs Latency-Insensitive Send/Recv endpoints

module mkTimeP; Send#(Inst) send <- mkSend(Decode); endmodule
Implementation chosen by compiler FIFO order Guaranteed delivery Unspecified buffering & unspecified latency Programmer responsible for correct annotation
mkTimeP RTL
Explicit programmer contract

mkFuncP RTL
module mkFuncP; Recv#(Inst) recv <- mkRecv(Decode); endmodule
Easy to use often a textual substitution!
19
Connected User Application

20
Pla8orm_0 Pla8orm_0 Comms mkApplicaAon Pla8orm_1 Pla8orm_1 Comms mkApplicaAon_Stub mkA mkA_Stub mkB_Stub mkC mkB
mkC_Stub
Key:
User Spring 2013 Library Generated 6.888 - Sanchez and Emer L16

L16 Fpgas

Enviado por

Dados do documento

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

L16 Fpgas

Enviado por

Direitos autorais:

Formatos disponíveis

LECTURE 16 FINE-GRAINED RECONFIGURABLE COMPUTING: FPGAS JOEL EMER AND DANIEL SANCHEZ

6.888 PARALLEL AND HETEROGENEOUS COMPUTER ARCHITECTURE SPRING 2013

Field Programmable Gate Arrays (FPGA)

Evolution of FPGA applications

6.888 Spring 2013 - Sanchez and Emer L16

Benefits of FPGA computation

Reduced memory references more direct reuse of data

6.888 Spring 2013 - Sanchez and Emer L16

QPI-attached FPGA platform

M Xilinx Virtex 6 Module

PCIe attached FPGA

Altera Stratix IV Module

Reed Solomon Results

Source: MIT, Abhinav Agarwal, Alfred Ng CSG

Berkeley Operating system for ReProgrammable Hardware OS for reconfigurable computers

reconfigurable hardware as computational resources

UNIX interface to HW designs

to both software and hardware engineers Design language independent

Make FPGA-based reconfigurable computers easy to use

Conventional View of FPGA Systems

pipe User Library file OS Kernel Device Driver

Hardware Platform (Network, UART, HD)

Hardware Platform (Network, UART, HD)

Hardware User Library

Overview of BORPH Concepts

Hardware process Hardware syscall interface Interacting with an FPGA

ioreg virtual file interface Hardware file I/O

BORPH Kernel Device Driver

Hardware Platform (Network, UART, HD)

Hardware User Library

6.888 Spring 2013 - Sanchez and Emer L16

Hardware Process (1)

An executing instance of a hardware design

SW: An executing instance of a program

Normal UNIX process

Has pid, check status with ps, kill, etc

BORPH Kernel Device Driver

Hardware Platform (Network, UART, HD)

Hardware User Library

Kernel selects and configure hardware region automatically

6.888 Spring 2013 - Sanchez and Emer L16

Benefits of UNIX Process Model

6.888 Spring 2013 - Sanchez and Emer L16

I/O managed by kernel

Hide details from users

e.g. HW-SW, HW-HW UNIX file pipe

File I/O, pipe, signal ioreg virtual file system

Standard UNIX I/O mechanism HW specific service

BORPH Kernel Device Driver

Hardware Platform (Network, UART, HD)

Hardware User Library

Dont ask How do I in HW. Think: What if it were SW?

ioreg Virtual File System

Single word register Memory: On-chip + Off-chip FIFO /proc/123/hw/ioreg/COUNTERVAL

Any UNIX program can communicate with hardware processes

6.888 Spring 2013 - Sanchez and Emer L16

Hardware File I/O

Read test vectors, record output

SW/HW processes chained by file pipe

bash$ decode video.in | resize | edgdet.bof | encode > video.out

6.888 Spring 2013 - Sanchez and Emer L16

Latency-Insensitive Design: A Higher Semantic

Inter-module communication by latency insensitive channels