
XD2000i Development System XD2000i FPGA In-Socket Accelerator Module User Handbook

Version 1.03, June 2009

Copyright 2009 XtremeData, Inc. All rights reserved. XtremeData, Inc., Computing Redefined, the stylized XtremeData logo, specific device or software designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of XtremeData, Inc. in the U.S. and other countries. All other product or service names are the property of their respective holders. XtremeData products are protected under U.S. patents and pending applications, maskwork rights, and copyrights. XtremeData warrants performance of its products to current specifications in accordance with XtremeData's standard warranty, but reserves the right to make changes to any products and services at any time without notice. XtremeData assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by XtremeData, Inc. XtremeData customers are advised to obtain the latest version of device and software specifications before relying on any published information and before placing orders for products or services.

XD2000i Development System XD2000i Coprocessor Module User Handbook Version. 1.03, June 2009

Table of Contents
1 Introduction
  1.1 AFU Implementation
  1.2 Glossary
2 Development System Parts
  2.1 XtremeData Supplied Items
    2.1.1 Hardware Development System
    2.1.2 Reference Design Software
    2.1.3 Reference Design HDL and Configuration Files
  2.2 User Supplied Items
3 Reference Design Project
  3.1 Reference Design Software Overview
  3.2 Module Applications
  3.3 Bridge FPGA Design
  3.4 AFU FIFO Protocol
    3.4.1 AFU-Type FIFO Interface
    3.4.2 DRV-Type FIFO Interface
  3.5 Application FPGA Reference Design
    3.5.1 Application FPGA Reference Design Source Files
    3.5.2 Application FPGA Quartus Project
  3.6 Application FPGA Simulation
    3.6.1 QDRII Library
    3.6.2 AFU Simulation
    3.6.3 QDRII+ Interface Simulation
  3.7 Software Framework
    3.7.1 Software Features
    3.7.2 Software Directory Structure
4 Development System Setup
  4.1 Development System Setup - Hardware
    4.1.1 Load CPLD and Flash Configuration Files
  4.2 Development System Setup - Software
    4.2.1 Disable the MCH Snoop Filter (if required)
    4.2.2 Log in to the Development PC
    4.2.3 Install External Libraries
    4.2.4 Unpack the Distribution
5 System Verification
  5.1 Establish Communication with the Bridge
    5.1.1 Log in to the Development PC
    5.1.2 Enable the AHM FSB Segment (if required)
    5.1.3 Load the XD2000i Device Driver
  5.2 Load the Reference Design and Exercise Hardware
    5.2.1 Verify the Software Build Process
  5.3 Monitor Data on the Application FPGAs
6 Software Programs
  6.1 ahmAppDownload - Configure Application FPGAs
  6.2 ahmFlashDownload - Write Bridge FPGA Boot Image to Flash
  6.3 doAhmTest.sh - Reference Design Demonstration Script
  6.4 ahmTest - Reference Design Demonstration Program
    6.4.1 Reference Design Software Functions
7 AFU Selector Software Programming Model
  7.1 Bridge FPGA AFU Selector
  7.2 Application FPGA AFU Selector
  7.3 Application FPGA Control Register on Bridge
  7.4 Application FPGA Status Register
  7.5 Bridge Null Write Register
  7.6 Send / Receive Alignment Pad Sequence
8 Software Notes
9 Revision History


1 Introduction
This document describes the XtremeData XD2000i Development System. Specifically, it explains:

- the different components of the XD2000i Development System
- the structure of the VHDL reference design project and what to modify to create and test your own custom design
- the procedure for configuring and running the XD2000i Development System
- the system software, including test scripts and pre-built executables

1.1 AFU Implementation

The XD2000i Accelerator Hardware Module (AHM) provides one Bridge FPGA and two Application FPGAs. The user defined portion of the XD2000i system is called the Accelerator Functional Unit, or AFU. An AFU resides on either or both of the two Application FPGAs.

This system implements the AFU V0.7 programming model, which presents a slave-only interface to the AFU. In this model, all data is sequentially streamed between system memory and the user defined AFU using a series of DMA-type transfers. The DMA engines reside in the Bridge FPGA and are accessed through a simple software interface. The AFU itself does not generate system memory addresses.

This document and reference design support the AFU V0.7 implementation. Future AFU implementations will be supported on the current hardware by updating Flash Memory.

1.2 Glossary

AFU: Accelerator Functional Unit. The AFU is the user specified FPGA functionality as implemented in an Application FPGA. The current reference design implements the V0.7 AFU.

AHM: Accelerator Hardware Module. The XDI in-socket module considered as a whole. The module comprises the Bridge FPGA, two Application FPGAs and QDR memory.

Application FPGA: One of the two user defined application FPGAs.

AppA: The first user defined application FPGA.

AppB: The second user defined application FPGA.

Bridge FPGA: The AHM FPGA component that implements the FSB protocol. The Bridge FPGA has no user defined functionality. The Bridge also connects to the two Application FPGAs.

CPLD: Non-volatile, programmable device used to control AHM startup and FPGA configuration.

Development PC: A Linux-based system containing an XD2000i module.

FSB: Front Side Bus. The Intel standard bus connecting an MCH to the CPU or AHM.

Flash Memory: Non-volatile memory containing the Bridge FPGA image that is automatically loaded at power-up by the CPLD.

MCH: Memory Controller Hub (Northbridge). The Intel system controller which interconnects the CPU, AHM and DDR memory. Although the AHM is logically connected to the system FSB, like the CPU, it is actually connected to the MCH. Due to MCH filtering, certain CPU transactions may not be presented to the AHM.

Module: The XDI in-socket accelerator module considered as a whole (alternately, AHM).

QDR: Quad-Data Rate Memory. QDRII+ SRAM memory connected to each Application FPGA.

Remote FPGA: One of the two user defined application FPGAs.

Sequencer: A small AFU present in each of the three module FPGAs. The sequencer supports verification of the inter-FPGA interfaces and bandwidth measurements. The sequencer can Idle, Write, Read or Loopback FSB data.

User PC: The system hosting the Altera Quartus software tools.


2 Development System Parts


The XD2000i Development System is shown in Illustration 1 and consists of hardware and software provided by XtremeData Inc. and the user.

[Illustration 1: XD2000i Development System. The figure shows a Windows or Linux User PC (not included with the development system) running the Altera Quartus II design software (not included) and connected by USB-JTAG to the XD2000i module. The module sits in the XD2000i Development PC (included), a dual Xeon motherboard with an Intel Northbridge (MCH), a hard drive with CentOS 5, a keyboard and a mouse. The XtremeData XD2000i Reference Project is included.]


2.1 XtremeData Supplied Items

2.1.1 Hardware Development System

1. XD2000i Development PC
   - Dual Xeon motherboard with one Intel Xeon processor
   - XD2000i 1067M FSB module containing a Bridge FPGA connecting to the Northbridge and two remote Application FPGAs (AppA and AppB)
   - Linux CentOS operating system
   - Used to load FPGA/CPLD configuration files and monitor Application FPGA status utilizing Altera's SignalTap II logic analyzer

2. Altera USB-Blaster and cables

2.1.2 Reference Design Software

The reference design software is distributed as a gzip-compressed tar archive. It contains source code and executables for a variety of example programs. The software may be installed into any suitable directory. Use tar -xf to extract the files.

1. XD2000i_Reference_Design_Software_v<ver>.tar.gz

2.1.3 Reference Design HDL and Configuration Files

The reference design hardware files are distributed as a gzip-compressed tar archive. It contains the HDL source code for the Application FPGA reference design and configuration files for the various devices on the XD2000i module. Use tar -xf to extract the files.

1. XD2000i_Reference_Design_Hardware_v<ver>.tar.gz

XD2000i Bridge FPGA Configuration Files

xdi_flash_PR_v<ver>.pof / xdi_flash_ES_v<ver>.pof
The appropriate .pof is stored in the on-module flash memory and contains the configuration file for the Bridge FPGA. At system power-up, the Bridge FPGA design is automatically loaded into the Bridge from flash memory. Depending on the specific module, the Bridge FPGA can be either an engineering sample or production device. ES denotes an engineering sample Bridge FPGA, while PR denotes a production FPGA. While the devices are functionally equivalent, their configuration files are not compatible with each other. The Bridge design contains an FSB core and high-speed interfaces to remote Application A and B FPGAs. The Bridge also contains an interface to the CPLD that is used to configure the remote Application FPGAs and write to / read from the on-module flash memory. These Bridge .pof files are provided by XDI and are not alterable by the user.

XD2000i Application FPGA design files

xdi_app_top_ref_design_v<ver>.qar
A Quartus project containing all hardware design code required to design, simulate, and run the reference design in hardware on both Application FPGAs. Includes timing constraints for all module interfaces. This design will be customized by the user.

XD2000i Application FPGA configuration files

xdi_appA_EP3SE260_v<ver>.rbf / xdi_appA_EP3SL150_v<ver>.rbf / xdi_appB_EP3SE260_v<ver>.rbf / xdi_appB_EP3SL150_v<ver>.rbf
Configuration files for the Application FPGAs containing the reference design. The name identifies the Application FPGA (A or B) and the FPGA size (EP3SE260 or EP3SL150) for which the file is used. The reference design has interfaces to connect to the Bridge FPGA, the other Application FPGA, and the QDRII+ memory. The reference design on the Application FPGAs is described in more detail in Section 3.5.

XD2000i CPLD configuration file

xdi_cpld_v<ver>.pof
The configuration file for the CPLD. Its functionality includes communication with on-module flash memory, power-up configuration of the Bridge FPGA, and a Bridge interface to enable Application FPGA configuration over FSB.

2.2 User Supplied Items

1. Altera Design Software Suite
Use Quartus II version 8.1 (Windows or Linux version), downloaded from Altera's web site, on the User PC.

2. User PC
Windows or Linux system that connects to the XD2000i via JTAG cable. This computer should have the Quartus II software installed.


3 Reference Design Project


The reference design consists of the XD2000i FPGA design, the software test applications, and the device drivers. The overall system data flow is shown in Illustration 2. This does not show CPLD connection or functionality.
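[Illustration 2: System Data Flow. The figure shows the application (CPU) and system memory connected through the Intel Northbridge (MCH) and the 1067M FSB to the Bridge FPGA on the XD2000i module. Inside the Bridge FPGA, the FSB Core feeds the Multi AFU Interface, which connects to the Sequencer and to two App Ports. Each App Port connects to the Bridge Port of one Application FPGA (A or B). Each Application FPGA contains an AFU, a Bridge Port, a port to the other Application FPGA (AppA Port / AppB Port) and a QDRII Port attached to QDRII+ SRAM.]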

3.1 Reference Design Software Overview

The reference design allows verification of the XD2000i software environment and the AHM hardware functionality. The application verifies the hardware associated with the Bridge and the two Application FPGAs using various AFUs. The exercised components include the following:

- FSB interface linking the MCH and Bridge FPGA
- FPGA interfaces linking the Bridge and each Application FPGA
- FPGA interface linking the two Application FPGAs
- FPGA interface linking each Application FPGA to its associated QDRII+ memory
- Application FPGA configuration via FSB
- Flash programming via FSB
- Sequencer AFUs located on the Bridge and Application FPGAs
- Data loopback AFUs located on the Application FPGAs

The test suite is provided in source and executable forms. It includes the following executables, scripts and link libraries.

ahmAppDownload     // Configure Application FPGAs via FSB
ahmFlashDownload   // Access Flash memory via FSB
ahmTest            // AHM module test program
doAhmTest.sh       // A test script that configures and calls ahmTest
libXdiAhm.a        // A linkable library providing the FSB download capability
libXdiAhm.so.0.0   // A shared library providing the FSB download capability

Each FPGA implements a set of AFUs that are connected to a simple multiplexer, or Multiple AFU Interface (MAFU). Data sent to an AFU includes a header that specifies the target AFU and programs the multiplexer with the amount of incoming data. An AFU target on the Bridge is identified by a single header. Bridge targets include the Test Sequencer, the AppA FPGA port and the AppB FPGA port. A Bridge header is required when sending data to the AHM. An Application MAFU target is identified by two headers. The first header sets the Bridge AFU multiplexer to either the AppA or AppB port. The second header selects the specific Application AFU. Each multiplexer strips one header before forwarding the remaining payload. It is also possible to design an Application FPGA that implements a single AFU target. In this case, the second header is not present.
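The header-stripping behaviour described above can be illustrated with a small software model. This is only a sketch: the header representation (a Python dict with "target" and "length" fields), the convention that "length" counts every item following the header, and the AFU name "Loopback" are all assumptions made for illustration. The actual header layout is defined by the AFU Selector programming model, not by this example.

```python
# Toy model of the nested AFU multiplexers. The header layout used here
# (a dict with "target" and "length") is hypothetical, not the real format.

def mux_forward(stream):
    """Strip the leading header and return (target, remaining payload)."""
    header, payload = stream[0], stream[1:]
    # Assumption: "length" counts every item that follows this header.
    if header["length"] != len(payload):
        raise ValueError("header length does not match payload")
    return header["target"], payload


def route(stream):
    """Follow a stream through the Bridge mux and, if present, an Application mux."""
    path = []
    target, payload = mux_forward(stream)            # Bridge multiplexer
    path.append(target)
    # A second header is only present when the Application FPGA has its own mux.
    if target in ("AppA", "AppB") and payload and isinstance(payload[0], dict):
        target, payload = mux_forward(payload)       # Application multiplexer
        path.append(target)
    return path, payload


data = ["w0", "w1"]
one_header = [{"target": "Sequencer", "length": 2}] + data
two_headers = [{"target": "AppA", "length": 3},
               {"target": "Loopback", "length": 2}] + data
single_afu_app = [{"target": "AppB", "length": 2}] + data

print(route(one_header))
print(route(two_headers))
print(route(single_afu_app))
```

In each case the payload that reaches the final target no longer carries any header, which mirrors the statement that each multiplexer strips one header before forwarding the rest.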

3.2 Module Applications

The XD2000i AFU 0.7 design models a streaming data interface. The CPU prepares a data buffer of known size in system memory and sends it to the module using a Send/Receive command. A data buffer of known size is received back from the module and is placed in system memory. The user application is then notified that the transaction is complete and that the received data is available. The Send and Receive operations run concurrently. As described below, the Source and Receive memories may be from the same or different Workspaces. Send and receive data sizes may be different.

A Workspace is a portion of system memory that can be the source or destination of a module transfer. Workspace data is contiguous in physical memory and the module uses physical addresses. The XDI provided FSB core manages the physical addresses; the user application only sees a stream of data. Workspace allocation, management and mapping to and from user space is handled by the device driver.

The module appears as a character device. The user first opens the module to get a file descriptor and then allocates a Workspace of a given size (somewhat analogous to a malloc() call). Using the returned Workspace address, the user fills in the Source data buffer and submits a Workspace Command. When no longer needed, Workspaces may be returned to the system (similar to a free() call). When processing is complete, the module may be closed.

A Workspace Command specifies a source address and source size and a response address and response size. Workspace Commands are submitted through the device driver. In the reference design, each Workspace Command is processed synchronously: the caller blocks until response data is received. The user may alter this behavior by modifying the included source code.

AHM applications have the following constraints:

- Workspace Command data sizes must be positive multiples of 64 Bytes (0 length buffers are not allowed).
- Both source and response buffers must be specified and be at least 64 Bytes.
- Input and Output buffers may be of the same or different lengths.
- Maximum workspace size is 4 MB.
- Maximum number of allocated workspaces is 4096.
- Only one Send operation may be active at a time. The Receive portion of the Workspace Command must complete before the next Send is submitted.
- If desired, the Send request may return immediately so that additional CPU work may be performed before submitting the Receive request.
- Reference Design Workspace Commands are processed synchronously (Send is immediately followed by a Receive).

These constraints will be removed in the future.
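An application may find it convenient to check the size constraints before submitting a Workspace Command. The sketch below is not part of the reference design software; the function names are hypothetical, 4 MB is assumed to mean 4 x 1024 x 1024 bytes, and a buffer is assumed to live inside a single Workspace so that it cannot exceed the maximum Workspace size.

```python
GRANULE = 64                        # transfer granularity in bytes
MAX_WORKSPACE = 4 * 1024 * 1024     # assumption: 4 MB means 4 MiB


def check_workspace_command(source_size, response_size):
    """Return a list of constraint violations (empty if the sizes are acceptable)."""
    problems = []
    for name, size in (("source", source_size), ("response", response_size)):
        if size <= 0:
            problems.append(f"{name} size must be positive")
        elif size % GRANULE:
            problems.append(f"{name} size must be a multiple of {GRANULE} bytes")
        if size > MAX_WORKSPACE:
            # Assumption: a buffer must fit inside one Workspace.
            problems.append(f"{name} size exceeds the maximum Workspace size")
    return problems


def round_up(size):
    """Pad a byte count up to the next legal transfer size (at least 64 bytes)."""
    return max(GRANULE, -(-size // GRANULE) * GRANULE)


print(check_workspace_command(64, 128))    # no violations
print(check_workspace_command(100, 0))     # two violations
print(round_up(100))
```

Padding with round_up() leaves the choice of fill bytes to the application; how the AFU distinguishes payload from padding is outside the scope of this sketch.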

3.3 Bridge FPGA Design

The Bridge FPGA interfaces the Intel Front Side Bus to the module. The Bridge design is provided by XDI and is not configurable by the user. XDI provides .pof configuration files that contain the Bridge configuration data stream. The Bridge configuration file is stored in on-module flash memory and is auto-loaded into the Bridge FPGA at system power-up.

The Bridge FPGA contains the FSB core, the multiple-AFU interface, the Application FPGA interface ports, the CPLD interface port and a Test Sequencer. All but the CPLD and Flash devices are shown in Illustration 2. The FSB core connects to the MCH (Northbridge) and streams data to the multiple-AFU interface. Based on the header embedded in the data stream, the multiple-AFU interface directs data to one of the Bridge AFUs: the AppA FPGA, the AppB FPGA, the Test Sequencer, or the CPLD, which configures the remote Application FPGAs or accesses Flash memory.

3.4 AFU FIFO Protocol

The Application FPGAs' user function, or Accelerator Functional Unit (AFU), connects to the Application/Bridge interface using a Data/Valid/Ready (DVR) protocol. This protocol is a simple point-to-point, FIFO-like handshake that pipelines all control and data signals.

In the reference design, fifo_afudrv.vhd is instantiated in many files to implement FIFOs with various write-side and read-side interfaces. This entity employs a FIFO_TYPE generic that specifies one of two FIFO interfaces, referred to as AFU-type and DRV-type. Using this block, the user can create four different types of FIFOs: one that has an AFU interface on the write and a DRV interface on the read, or vice versa (e.g. eFIFO_IS_AFU2DRV); or the FIFO can have an AFU or DRV interface on both write and read ports (e.g. eFIFO_IS_AFU2AFU). Also, as the write and read ports are asynchronous, they can have different data widths and separate, asynchronous clocks. The difference between these two interfaces is described below.

Note that there are other signals in the code, like <initiator>_data_is_csr, that are not used by the current slave-mode only AFU 0.7, but will be utilized in the later release of AFU 1.0.

3.4.1 AFU-Type FIFO Interface

The AFU-type interface is the FIFO interface used by the back end of the FSB Core and is likewise employed for the user AFU interfaces. This interface requires the target to accept data whenever <initiator>_data_valid is asserted. Flow control is achieved by requiring the initiator to transmit data only when it senses <target>_data_ready is asserted.

AFU-Type FIFO Protocol Rules

1. Initiator may assert <initiator>_data_valid whenever its data is valid and it senses <target>_data_ready is asserted.
2. Initiator must de-assert <initiator>_data_valid whenever it senses <target>_data_ready is de-asserted. In-process transfers may progress subject to their being received at the target within 16 cycles of when the target de-asserted its <target>_data_ready. These 16 cycles encompass the target-to-initiator pipeline length + initiator sample and turn-off delay + initiator-to-target pipeline length.
3. Target accepts data whenever it senses <initiator>_data_valid is asserted.
4. Target must accept a minimum of 16 transfers after de-asserting <target>_data_ready to allow in-flight transfers to complete.

AFU-Type FIFO interfaces implement the following signals.

Name                     Width  Direction             Description
<initiator>_data         256    initiator -> target   Data bus.
<initiator>_data_valid   1      initiator -> target   Initiator data is valid and must be accepted by the target.
<target>_data_ready      1      target -> initiator   Target is ready to accept data. The target must accept data during the 16 cycles following the de-assertion of <target>_data_ready.
reset                           system                Reset signal which is internally synchronized to the write and read clocks.
clock                           system                All signals are sampled at the rising edge of the clock.
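The 16-transfer allowance in the AFU-type rules can be explored with a cycle-level software model. The sketch below is an illustration, not the reference design logic: it assumes a target that de-asserts ready once 16 or fewer free slots remain, an initiator that always has data to send, and it folds the initiator's sample and turn-off delay into the two pipeline delays. All names are hypothetical.

```python
from collections import deque

SKID = 16  # transfers the target still accepts after de-asserting data_ready


def simulate(depth, t2i_delay, i2t_delay, cycles, drain_every=0):
    """Toy cycle model of an AFU-type link; returns the peak target occupancy."""
    ready_pipe = deque([True] * t2i_delay)   # target -> initiator pipeline
    data_pipe = deque([False] * i2t_delay)   # initiator -> target pipeline
    occupancy = peak = 0
    for cycle in range(cycles):
        # Target keeps ready asserted only while more than SKID slots are free.
        ready = (depth - occupancy) > SKID
        ready_pipe.append(ready)
        seen_ready = ready_pipe.popleft()    # what the initiator senses now
        valid = seen_ready                   # initiator sends whenever allowed
        data_pipe.append(valid)
        if data_pipe.popleft():              # target must accept arriving data
            occupancy += 1
            if occupancy > depth:
                raise OverflowError(f"target overflow at cycle {cycle}")
        if drain_every and cycle % drain_every == 0 and occupancy:
            occupancy -= 1                   # downstream logic consumes one item
        peak = max(peak, occupancy)
    return peak


# Round trip of 16 cycles: the reserved slots absorb every in-flight transfer.
print(simulate(depth=32, t2i_delay=8, i2t_delay=8, cycles=200))
```

With a combined pipeline delay above 16 cycles (for example 10 + 10) and no draining, the same model raises OverflowError, which is the situation the 16-cycle limit is meant to exclude.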


3.4.2 DRV-Type FIFO Interface

The DRV-Type FIFO interface performs data transfers when <initiator>_data_valid and <target>_data_ready are concurrently asserted. It may be convenient for an AFU input port to convert its AFU-Type interface to a DRV-Type interface since it allows the AFU to throttle incoming data without needing to provide additional buffer storage. A DRV-Type FIFO interface is used within afu_control_registers.vhd to adapt the incoming AFU-Type protocol so that the register block only accepts new commands when it has finished processing the last command. The control and data paths are not pipelined in a DRV-type interface.

DRV-Type FIFO Protocol Rules

1. Initiator asserts <initiator>_data_valid whenever its data is valid. <initiator>_data_valid may be asserted while <target>_data_ready is de-asserted.
2. Target asserts <target>_data_ready whenever it is ready to accept data. <target>_data_ready may be asserted while <initiator>_data_valid is de-asserted.
3. Data transfers occur when <initiator>_data_valid and <target>_data_ready are concurrently asserted.

DRV-Type FIFO interfaces implement the following signals.

Name                     Width  Direction             Description
<initiator>_data         256    initiator -> target   Data bus.
<initiator>_data_valid   1      initiator -> target   Initiator data is valid. Data is accepted by the target only if <target>_data_ready is concurrently asserted.
<target>_data_ready      1      target -> initiator   Target is ready to accept data. The target accepts data when data_ready and data_valid are concurrently asserted.
reset                           system                Reset signal which is internally synchronized to the write and read clocks.
clock                           system                All signals are sampled at the rising edge of the clock.
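The DRV-type rules reduce to a single condition per clock cycle: a transfer happens exactly when valid and ready are both asserted. The following sketch models that condition in software; the function names and the per-cycle trace representation are assumptions made for illustration only.

```python
def drv_transfer_cycles(valid, ready):
    """Return the cycle numbers on which a DRV-type transfer occurs."""
    return [c for c, (v, r) in enumerate(zip(valid, ready)) if v and r]


def drv_stream(items, ready):
    """Model an initiator that holds valid high until every item is taken.

    `ready` is the target's per-cycle ready trace. Returns (cycle, item) pairs.
    """
    pending = list(items)
    log = []
    for cycle, r in enumerate(ready):
        valid = bool(pending)          # valid does not depend on ready (rule 1)
        if valid and r:                # transfer only when both are asserted
            log.append((cycle, pending.pop(0)))
    return log


print(drv_transfer_cycles([1, 1, 0, 1], [0, 1, 1, 1]))
print(drv_stream(["a", "b", "c"], [0, 1, 1, 0, 1]))
```

Because the target can stall the initiator on any cycle simply by holding ready low, no skid storage is needed, which is the throttling benefit the text attributes to the DRV-type interface.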


3.5 Application FPGA Reference Design

The Application FPGAs implement the user's application specific AFUs. The reference design establishes a framework and may be used as a starting point for this work. A high level view of the Application FPGA reference design is shown in Illustration 3. While the user may completely specify the AFU, they must not modify the Bridge, App or QDRII+ port interfaces shown in red in the diagram. Very specific placement and timing constraints have been applied to these interfaces to allow them to run at the required speeds.

[Illustration 3: Application FPGA Reference Design. The figure shows the Bridge Port (to the Bridge FPGA) feeding the Multiple AFU Interface, which serves the reference design AFUs: AFU Sequencer, AFU Loopback, AFU Chain Loopback, QDRII+ Test (through the QDRII Port to the QDRII+ SRAM) and the AFU Control/Status Registers (Application FPGA Status Registers and Control Registers). The App Port (to the other Application FPGA) is connected through FIFOs and multiplexers, selected by control_loopback_BridgeAppPort, and an AFU Port Tester is also shown.]


The Application FPGA reference design contains a multiple-AFU interface that allows the system access to its different AFUs. By setting appropriate control registers, the user can also send and receive data across the App Port to the other Application FPGA. Illustration 4 shows the hierarchy of the entities instantiated in the reference design. This hierarchy does not contain all the hardware files in the project, but rather shows the main functional blocks. As with Illustration 3, the entities marked in red represent blocks that are not alterable by the user.

[Illustration 4: Application FPGA Reference Design Hierarchy. The figure shows app_top at the top level, instantiating i_FPGA_Interface_BridgePort, i_FPGA_Interface_AppPort, various SignalTap monitors and i_app_core. The remaining entities named in the figure are i_BridgeAppPorts_clocks, i_app_clocks_unused, i_mult_afu_if, i_afu_loopback, i_afu_chain_loopback, i_afu_sequencer, i_app_clocks, i_reset_filter, various SignalTap monitors, i_afu_port_tester, i_afu_control_status_registers, i_qdr_test_top_inst, FPGA_Interface_QDRII_Port and i_qdr_test_control.]

3.5.1 Application FPGA Reference Design Source Files

All HDL source files for the reference design can be found in the Altera Quartus project archive: xdi_app_top_ref_design_v<ver>.qar. The following sections describe the different functional modules. Each heading is given in the form instantiation : source file.

3.5.1.1 app_top : app_top.vhd

This is the top level of the Application FPGA design. It defines pins and instantiates the Bridge and App Port interfaces, the app_core which contains the AFU, and some SignalTap monitor points. The SignalTap monitors instantiate a general entity that keeps data/control signals from being optimized out during compilation so that they can be monitored in SignalTap, the Quartus logic analyzer feature that is accessible over JTAG. These monitor points can be added or removed at the user's discretion and are the only section in app_top that should be modified by the user.

3.5.1.2 i_FPGA_Interface_BridgePort : fpga_interface_port.vhd

Connects the Application FPGA with the Bridge FPGA, linking the software to the AFU. This block contains the physical interface and FIFOs, which are described in Section 3.4. The physical interface has two 64-bit unidirectional buses running at 200 MHz DDR. This is 400 MT/s, or 3.2 GB/s in each direction. This block is not modifiable by the user.
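The quoted figures follow from simple arithmetic: a double data rate bus moves two words per clock, so 200 MHz gives 400 million transfers per second, and each 64-bit transfer carries 8 bytes. The helper below is only a worked check of that arithmetic; the function name is hypothetical and decimal units (1 GB = 10^9 bytes) are assumed.

```python
def link_rate(bus_bits, clock_mhz, ddr=True):
    """Return (mega-transfers per second, gigabytes per second) for one direction."""
    mtps = clock_mhz * (2 if ddr else 1)     # DDR moves data on both clock edges
    gbps = mtps * (bus_bits / 8) / 1000      # bytes per transfer, MB/s -> GB/s
    return mtps, gbps


print(link_rate(64, 200))                    # 64-bit bus at 200 MHz DDR
print(link_rate(256, 100, ddr=False))        # 256-bit bus at 100 MHz, single data rate
```

The second call uses the 256-bit FIFO data width and a 100 MHz single data rate clock purely as a comparison point; under those assumed values the arithmetic gives the same byte rate as the port.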

3.5.1.3 i_FPGA_Interface_AppPort : fpga_interface_port.vhd

Connects one Application FPGA with the other Application FPGA. This uses the same underlying VHDL code as the Bridge Port and therefore runs at the same rate and uses the same AFU-Type FIFO protocol. This block is also not modifiable by the user.

3.5.1.4 i_app_core : app_core.vhd

The main entity in the design, app_core contains clock/reset generation, AFU instantiations, and SignalTap monitors. This is the level at which AFU simulations will be run. Data flow to/from this block is controlled by FIFOs in the Bridge / App Port interfaces. Most of this file is modifiable by the user; look for the note -- Below this point will be customized by the user. The areas that cannot be modified by the user are reset generation and clock generation for the Bridge and App Ports. The reference Application design implements a multiple-AFU interface that allows access to multiple AFUs, each with its own FIFO interface. The user may keep this multiple-AFU interface and replace the AFUs with their own, or may replace the multiple-AFU interface and all AFUs with a single AFU that connects directly to the Bridge Port.

3.5.1.4.1 Clock / Reset Generation

Each Application FPGA has four 100 MHz input reference clocks that drive directly into PLLs. One of these input clocks (clk_input_reserved_FPGA_interface_ports) is used by the Bridge / App Port and is not accessible by the user. The other three clocks (clk_input_User_100_MHz) are for the user. In the reference design, one user clock is used to generate a 100 MHz AFU clock, one is used to generate clocks for the QDRII+ interface, and one is unused. If the QDRII+ SRAM is incorporated in the user's design, it must use the same input clock that is used in the reference design, namely clk_input_User_100_MHz(2). This input is located on the same side of the FPGA as the QDRII+ interface. If the user is not utilizing the QDRII+ SRAM, this input clock may be used for another purpose.

Resets are also generated in app_core.vhd based on input signals from the Bridge and PLL lock signals. Since the Bridge has control over the re-programming of the Application FPGAs, it has the ability to reset each App. The reset generation code found in app_core.vhd should not be modified by the user. Since the reset comes from the Bridge FPGA, it is asynchronous to the Application FPGA clocks. For use in the AFUs, this reset (afu_rst_async_in) is synchronized to afu_data_clk_in by utilizing the reset_filter block. See Section 3.5.1.8. The clock and synchronous reset used by the AFUs are afu_data_clk_in and afu_rst_sync respectively.

3.5.1.5 i_BridgeAppPorts_clocks : bridge_clocks.vhd

Contains the PLL that generates clocks used by the Bridge and App Ports. This is not modifiable by the user.

3.5.1.6 i_app_clocks : app_clocks.vhd

Contains a PLL that generates the clock used by all AFUs. To avoid a simulation delta-cycle issue, the output clock, afu_data_clk_out, goes back up to app_top.vhd and is routed back in as afu_data_clk_in, which is then used in each AFU. Note that the lock signal is used by the signal afu_if_pll_module_rst to keep logic in reset until all PLLs are locked. If this PLL is removed from the design, this signal needs to be modified appropriately or the Application FPGA may never come out of reset.

3.5.1.7 i_app_clocks_unused : app_clocks.vhd

This reference clock is unused in the reference design, but a PLL is instantiated with it as input for completeness. The output clock from this PLL is not used. However, its lock signal is used by the signal afu_if_pll_module_rst to keep logic in reset until all PLLs are locked. If this PLL is removed from the design, this signal needs to be modified appropriately or the Application FPGA may never come out of reset.

3.5.1.8 i_reset_filter : reset_filter.vhd

This block takes in a clock and an asynchronous reset and returns a reset that is synchronous to the clock. An asserted reset input propagates directly to the output regardless of whether the clock is toggling. Reset removal (de-assertion), however, is synchronous to the clock.
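The behavior described for the reset filter (asynchronous assertion, synchronous de-assertion) can be illustrated with a small behavioral model. The Python sketch below is not taken from reset_filter.vhd; the class name, the method names, and the choice of a two-stage register chain are assumptions made purely for illustration.

```python
class ResetFilter:
    """Behavioral model: reset asserts at once, de-asserts on clock edges."""

    def __init__(self, stages: int = 2):  # stage count is an assumption
        self.rst_async = 1
        self.regs = [1] * stages  # start in reset

    def set_async_reset(self, value: int) -> None:
        """Drive the asynchronous reset input; assertion needs no clock."""
        self.rst_async = value
        if value:
            self.regs = [1] * len(self.regs)

    def clock_edge(self) -> None:
        """Rising clock edge: shift a 0 in only while reset is released."""
        if self.rst_async:
            self.regs = [1] * len(self.regs)
        else:
            self.regs = [0] + self.regs[:-1]

    @property
    def rst_sync(self) -> int:
        return self.regs[-1]


f = ResetFilter()
f.set_async_reset(1)
asserted_without_clock = f.rst_sync      # 1: asserted with no clock edge
f.set_async_reset(0)
held_after_release = f.rst_sync          # 1: still asserted, no edge yet
f.clock_edge()
after_one_edge = f.rst_sync              # 1: first stage cleared only
f.clock_edge()
after_two_edges = f.rst_sync             # 0: released synchronously
print(asserted_without_clock, held_after_release, after_one_edge, after_two_edges)
```

In this model the output stays asserted until the released reset has been clocked through every stage, which is the property that makes the reset safe to use inside the AFU clock domain.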

3.5.1.9 i_mult_afu_if : mult_afu_if.vhd

This block routes data and control signals between the Bridge Port and the various AFUs. It allows a single data stream from the Bridge FPGA to route to different AFUs on the Application FPGA. If a single AFU is used, or the user implements their own multiple-AFU interface, this block should be removed. If the user wants to utilize this multiple-AFU interface, this entity does not need to be modified. To change the number and names of the AFUs, mult_afu_config_pkg.vhd can be modified. The indexes found there need to be reconciled with software so that sent data goes to the correct AFU.


3.5.1.10 i_afu_loopback : afu_loopback.vhd

An AFU that simply loops back data through the Bridge Port. Three AFU-style FIFOs are used.

3.5.1.11 i_afu_chain_loopback : afu_chain_loopback.vhd

An AFU that loops back data through the Bridge Port, but links together many FIFOs of both AFU-style and DRV-style. See Section 3.4 for FIFO types.

3.5.1.12 i_afu_sequencer : afu_test_sequencer.vhd

An AFU that can be used to test write, read, and loop back from system memory. This block is useful for testing downlink and uplink separately.

3.5.1.13 i_afu_control_status_registers : afu_control_registers.vhd

An AFU that contains writable / readable control registers and readable status registers. There are eight 16-bit registers, half control and half status. The information on the function of each register in the reference design can be found in afu_control_registers_pkg.vhd. The control registers can be written to and read by the system through the multiple-AFU interface. These control registers can then be used by other blocks. As shown in Illustration 3, the AFU Port Tester muxes are controlled by a signal coming from this block. Likewise, signals from this block control the reset and error injection of the QDRII+ Test. The status registers are read-only from the system; their values come from other logic. In the reference design, the app_ID, which differentiates AppA from AppB using a hard-wired input, is stored here (AppA=0, AppB=1). The reference design version can also be read from the status registers. Finally, the result of the QDRII+ test can be monitored in the status registers.
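The access rules for this block (four control registers that the system can write and read, four status registers that the system can only read) can be sketched as a simple model. In the Python sketch below the register indexes, the slot used for app_ID, and all names are hypothetical; the actual assignments are defined in afu_control_registers_pkg.vhd.

```python
class AfuControlStatusRegisters:
    """Eight 16-bit registers: four read/write control, four read-only status."""

    NUM_CONTROL = 4
    NUM_STATUS = 4

    def __init__(self, app_id: int):
        self.control = [0] * self.NUM_CONTROL
        self.status = [0] * self.NUM_STATUS
        # Hypothetical slot for app_ID (AppA=0, AppB=1)
        self.status[0] = app_id & 0xFFFF

    def system_write(self, index: int, value: int) -> None:
        """System-side write: only control registers accept it."""
        if not 0 <= index < self.NUM_CONTROL:
            raise PermissionError("status registers are read-only from the system")
        self.control[index] = value & 0xFFFF  # registers are 16 bits wide

    def system_read(self, index: int) -> int:
        """System-side read: control registers first, then status registers."""
        return (self.control + self.status)[index]

    def logic_update_status(self, index: int, value: int) -> None:
        """Status values come from other logic, not from the system."""
        self.status[index] = value & 0xFFFF


regs = AfuControlStatusRegisters(app_id=1)   # model of the AppB device
regs.system_write(0, 0x1ABCD)                # upper bits are truncated
control_value = regs.system_read(0)
app_id_value = regs.system_read(4)
print(hex(control_value), app_id_value)
```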

3.5.1.14 i_afu_port_tester : afu_port_tester.vhd

This AFU gives a connection to the other Application FPGA through the App Port interface. As seen in Illustration 3, the front end of the AFU Port Tester connects to the multiple-AFU interface and can be accessed by the system, while the back end connects to the App Port. The data flow is controlled by a control register which can be written in the AFU Control/Status Registers block. If this register is set to '1' (loop-back condition), then data from the multiple-AFU interface loops back to the Bridge, while the data coming from the App Port loops back to the other App FPGA. Alternatively, if the control register is set to '0' (feed-through condition), then the data from the multiple-AFU interface is driven to the other App FPGA, while the data from the other App FPGA is driven to the multiple-AFU interface. Flow control is maintained by FIFOs. This functionality allows for various routing possibilities. Putting both Application FPGAs in loop-back leaves the App Port unused on both devices. Putting the AppA device into feed-through and the AppB device into loop-back means that system data from the Bridge to AppA will go through the App Ports on AppA and AppB and loop back at the AFU Port Tester on AppB, returning to the Bridge through AppA's Bridge Port.
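The routing possibilities described above can be traced with a small model of the two AFU Port Testers. The Python sketch below is illustrative only; the function names and the string labels for the two sides of the Port Tester are assumptions, and only the register values ('1' for loop-back, '0' for feed-through) come from the text.

```python
LOOPBACK = 1       # control register value for the loop-back condition
FEED_THROUGH = 0   # control register value for the feed-through condition


def port_tester(mode: int, source: str) -> str:
    """Return the side a word leaves on, given the side it arrived on."""
    if mode == LOOPBACK:
        return source  # each side loops back to where it came from
    return "app_port" if source == "mult_afu_if" else "mult_afu_if"


def trace(modes: dict[str, int], start: str = "AppA") -> list[str]:
    """Follow data sent from the Bridge to the Port Tester on `start`."""
    other = {"AppA": "AppB", "AppB": "AppA"}
    path, fpga, source = ["Bridge"], start, "mult_afu_if"
    for _ in range(4):  # hop limit; a path needs at most three hops
        path.append(fpga)
        if port_tester(modes[fpga], source) == "mult_afu_if":
            path.append("Bridge")  # leaves through this FPGA's Bridge Port
            return path
        fpga, source = other[fpga], "app_port"
    raise RuntimeError("routing did not terminate")


both_loopback = trace({"AppA": LOOPBACK, "AppB": LOOPBACK})
through_then_loop = trace({"AppA": FEED_THROUGH, "AppB": LOOPBACK})
print(both_loopback)       # ['Bridge', 'AppA', 'Bridge']
print(through_then_loop)   # ['Bridge', 'AppA', 'AppB', 'AppA', 'Bridge']
```

The second trace corresponds to the last case in the text: data crosses the App Ports, loops back on AppB, and returns to the Bridge through AppA's Bridge Port.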


(Figure not reproduced. Blocks shown: i_qdr_test_top_inst, FPGA_Interface_QDRII_Port, i_qdr_test_control, i_qdr_subtest_counter, i_qdr_subtest_ssn, i_qdr_subtest_byte_en, i_qdr_subtest_write_read_same_cycle, i_qdr_subtest_pass_word, i_qdr_subtest_prbs, i_qdr_test_prbs_fsm, generate_write_data_prbs, compare_prbs.)

Illustration 5: QDRII+ Interface Test Hierarchy

3.5.1.15 i_qdr_test_top_inst : qdr_test_top.vhd

The QDRII+ interface test does various writes and reads to the on-module QDRII+ memory through a 36-bit data bus at 300 MHz DDR (600 MTps or 2.4 GBps). However, this is not an AFU in that it does not have a connection with the multiple-AFU interface and therefore cannot receive streaming data from the system. This is a stand-alone test that can be reset by the system via a control register and reports its results to the system via status registers, as can be seen in Illustration 3. This block in the reference design is not meant to fully test the QDRII+ memory, but to provide the optimized AltMemPhy interface and an example for the user on how to write and read from the QDRII+ SRAM. If the QDRII+ SRAM is incorporated in the user's design, it must use the same input clock that is used in the reference design, namely clk_input_User_100_MHz(2). This input is located on the same side of the FPGA as the QDRII+ interface. If the user is not utilizing the QDRII+ SRAM, this input clock may be used for another purpose. The design file hierarchy for the QDRII+ interface test is shown in Illustration 5. Again note that the QDRII Port in red (containing AltMemPhy) should not be modified by the user.

3.5.1.15.1 FPGA_Interface_QDRII_Port : mf_qdr_phy.vhd

This contains the physical interface connecting the QDRII+ test to the pins. It is generated using the Altera AltMemPhy MegaFunction. Just as the Bridge Port and App Port physical interfaces have been optimized, so has the QDRII Port phy. Some of the files that are created by the MegaFunction have been manually modified to allow the use of trace length information in the timing analysis. Also, appropriate input and output delays have been set on a per-pin basis in order to optimize the DQ/DQS capture windows. This results in separate QDRII+ interface timing constraint files for each Application FPGA: mf_qdr_phy_ddr_timing_AppA.sdc and mf_qdr_phy_ddr_timing_AppB.sdc. The appropriate file is included by the Quartus .qsf settings file (see Section 3.5.2.2.1). The user should incorporate the current QDRII Port interface, mf_qdr_phy.vhd and all its subfiles, as-is into their design. If the AltMemPhy is re-generated, the result will be a physical interface that does not meet 300 MHz timing. Altera also provides documentation on the use of the AltMemPhy MegaFunction to create external memory interfaces, which the user can reference to further understand how XDI implements the whole QDRII+ interface.

3.5.1.15.2 i_qdr_test_control : qdr_test_control.vhd

This is the brains of the QDRII+ interface test. It runs on a 150 MHz clock. A state machine is employed to cycle through the various sub-tests, including counter, PRBS-23, high noise pattern, and byte enable test. As long as the test is not in reset, it keeps repeating all sub-tests. If a sub-test fails, the fail_sticky signal is asserted and written to the status register, but the sub-tests continue to run. If all sub-tests pass completely, a special 16-bit pass_word (0xFDBC) is written to the status registers. This helps eliminate the danger of a false-pass result that would come from observing the fail_sticky signal alone. The resync_successful signal is also routed to a status register to report that the AltMemPhy block has completed its initialization successfully. In the reference design, there are two registers in the afu_control_registers block that can affect the QDRII+ test. A reset register can be set that will clear fail_sticky, reset the pass_word, and re-start the QDRII+ interface sub-tests. Also, a register can be set to inject an error into the PRBS sub-test. This results in one word being written incorrectly to the QDRII+ SRAM so that a failure results on read and compare.
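Software that monitors the QDRII+ test has to combine the three reported values rather than rely on fail_sticky alone. The Python sketch below shows one way to express that decision. It assumes the three values have already been read out of the status registers; how they are packed into the registers is not covered here, and the function name and result strings are illustrative.

```python
PASS_WORD = 0xFDBC  # value written when all sub-tests pass completely


def qdr_test_verdict(resync_successful: bool, fail_sticky: bool, pass_word: int) -> str:
    """Interpret the QDRII+ test status values read by the system."""
    if not resync_successful:
        return "not initialized"   # AltMemPhy has not finished initialization
    if fail_sticky:
        return "failed"            # at least one sub-test mis-compared
    if pass_word == PASS_WORD:
        return "passed"            # a full pass of all sub-tests completed
    return "running"               # no failure yet, but no complete pass either


print(qdr_test_verdict(True, False, 0x0000))   # running
print(qdr_test_verdict(True, False, 0xFDBC))   # passed
print(qdr_test_verdict(True, True, 0xFDBC))    # failed
```

The "running" case is the one the pass_word guards against: fail_sticky is still clear, yet nothing has been proven.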

3.5.1.15.3 i_qdr_subtest_counter : qdr_subtest_counter.vhd

Cycles through all of the QDRII+ memory addresses, using the address counter as data.

3.5.1.15.4 i_qdr_subtest_ssn : qdr_subtest_ssn.vhd

Cycles through all of the QDRII+ memory addresses, using a high switching pattern on data lines: 000..000, 111..111, etc.

3.5.1.15.5 i_qdr_subtest_byte_en : qdr_subtest_byte_en.vhd

Tests functionality of all 16 byte enables. While there is a 36-bit pin interface to the QDRII+ memory, there is a 144-bit interface from the QDRII+ interface test to the AltMemPhy block since memory reads and writes are burst-of-4.
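The interface widths quoted for the QDRII+ test can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers given in this handbook (36-bit pins, burst-of-4, 16 byte enables, 600 MTps at the pins, a 150 MHz test clock); the variable names are illustrative, and the 9-bit figure is derived here rather than quoted from the design.

```python
PIN_WIDTH_BITS = 36    # QDRII+ pin interface width
BURST_LENGTH = 4       # memory reads and writes are burst-of-4
BYTE_ENABLES = 16      # byte enables on the AltMemPhy-side interface
PIN_RATE_MTPS = 600    # 300 MHz DDR at the pins
TEST_CLOCK_MHZ = 150   # clock of the QDRII+ interface test logic

internal_width = PIN_WIDTH_BITS * BURST_LENGTH        # bits per test-clock cycle
bits_per_enable = internal_width // BYTE_ENABLES      # bits covered by one enable
pin_side_mbps = PIN_WIDTH_BITS * PIN_RATE_MTPS        # Mbit/s at the pins
core_side_mbps = internal_width * TEST_CLOCK_MHZ      # Mbit/s at the test logic

print(internal_width, bits_per_enable)    # 144 9
print(pin_side_mbps, core_side_mbps)      # 21600 21600

# The 2.4 GBps figure quoted earlier equals 600 MT/s x 4 bytes, while
# counting all 36 bits per transfer would give 2.7 GB/s.
gbps_32_bits = PIN_RATE_MTPS * 32 / 8 / 1000
gbps_36_bits = PIN_RATE_MTPS * 36 / 8 / 1000
print(gbps_32_bits, gbps_36_bits)
```

One plausible reading, not stated in this handbook, is that the quoted 2.4 GBps counts 32 of the 36 bits in each transfer as payload.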

3.5.1.15.6 i_qdr_subtest_write_read_same_cycle : qdr_subtest_write_read_same_cycle.vhd

The interface to the AltMemPhy block contains a write address and a read address. In all other sub-tests only a write or a read is done in each cycle, so the write and read addresses are tied together. In this test, both a write request and a read request (at different addresses) are given to the AltMemPhy block on the same cycle. This sub-test goes through a small number of addresses and is just used to demonstrate that functionality.

3.5.1.15.7 i_qdr_subtest_pass_word : qdr_subtest_pass_word.vhd

As the last test in the series, it writes and reads a specific 16-bit value to the QDRII+ memory (0xFDBC).

3.5.1.15.8 i_qdr_subtest_prbs : qdr_subtest_prbs.vhd

Cycles through all of the QDRII+ memory addresses, using PRBS-23 as the data. The complete 144-bit data value is created with four 36-bit PRBS-23 patterns and each 36-bit word starts at a different initial value to keep the values unique. The following blocks are the entities used for the PRBS subtest, but similar blocks are found in each of the subtests.
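The construction of a 144-bit value from four differently seeded 36-bit PRBS-23 lanes can be sketched as follows. The handbook does not state the polynomial, bit ordering, or initial values used in prbs23_36bit_msb.vhd, so this Python sketch assumes the standard ITU-T O.150 PRBS-23 polynomial x^23 + x^18 + 1, fills each word most significant bit first, and uses arbitrary seeds. All class and function names are illustrative.

```python
MASK23 = (1 << 23) - 1


class Prbs23:
    """Fibonacci LFSR for x^23 + x^18 + 1 (assumed polynomial)."""

    def __init__(self, seed: int):
        if seed & MASK23 == 0:
            raise ValueError("an all-zero LFSR state never changes")
        self.state = seed & MASK23

    def next_bit(self) -> int:
        bit = ((self.state >> 22) ^ (self.state >> 17)) & 1  # taps 23 and 18
        self.state = ((self.state << 1) | bit) & MASK23
        return bit

    def next_word(self, width: int = 36) -> int:
        word = 0
        for _ in range(width):
            word = (word << 1) | self.next_bit()  # MSB first (assumption)
        return word


def next_144_bit_value(lanes: list[Prbs23]) -> int:
    """Concatenate one 36-bit word from each of four lanes."""
    value = 0
    for lane in lanes:
        value = (value << 36) | lane.next_word()
    return value


SEEDS = (1, 2, 3, 4)  # arbitrary, distinct initial values
lanes = [Prbs23(seed) for seed in SEEDS]
first_value = next_144_bit_value(lanes)
print(f"{first_value:036x}")
```

Because each lane starts from a different state, the four 36-bit words within one 144-bit value differ from one another, which is the property the text describes.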

3.5.1.15.9 i_qdr_test_prbs_fsm : qdr_test_counter_fsm.vhd

This contains a state machine that generates the control signals that get sent to the QDRII Port interface. Look at this to see how to set up writes and reads to the QDRII+ memory through the AltMemPhy block.

3.5.1.15.10 generate_write_data_prbs : qdr_subtest_prbs_generate_prbs_data.vhd

Instantiates four different 36-bit generate blocks to produce 144 bits of PRBS data. Accepts an input, in_error, that causes it to produce a word of data with a single bit error.

3.5.1.15.11 compare_prbs : qdr_test_data_comparator.vhd

Compares data read from the QDRII+ memory to expected data. If a failure is detected, the fail_sticky signal will be asserted.
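The interaction between error injection and the sticky failure flag can be illustrated with a short model. The Python sketch below is not derived from qdr_test_data_comparator.vhd; the class name, the helper inject_single_bit_error, and the choice of which bit to flip are assumptions for illustration.

```python
class StickyComparator:
    """Compares read data to expected data and latches any mismatch."""

    def __init__(self):
        self.fail_sticky = False

    def compare(self, expected: int, actual: int) -> bool:
        if expected != actual:
            self.fail_sticky = True   # stays set until reset
        return expected == actual

    def reset(self) -> None:
        self.fail_sticky = False


def inject_single_bit_error(word: int, bit: int = 0) -> int:
    """Flip one bit of a data word (bit position chosen arbitrarily)."""
    return word ^ (1 << bit)


cmp = StickyComparator()
cmp.compare(0x123, 0x123)                            # good word
cmp.compare(0x456, inject_single_bit_error(0x456))   # corrupted word
cmp.compare(0x789, 0x789)                            # good word again
sticky_after_error = cmp.fail_sticky                 # True: flag is sticky
cmp.reset()
sticky_after_reset = cmp.fail_sticky                 # False: cleared by reset
print(sticky_after_error, sticky_after_reset)
```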


3.5.1.16 Various SignalTap Monitors

The Altera Quartus software employs a logic analyzer called Signal Tap II that can embed scannable registers in the design that can be read out over JTAG. The SignalTap monitors (e.g. i_SignalTap_BridgePort2AFU) instantiate a general entity, found in signal_tap_drv_monitor.vhd, that keeps data/control signals from being optimized out during compilation so that they can be included in a Signal Tap instance, compiled into the design, and monitored at run-time. These blocks have no effect on the functionality of the design, and the user can choose to utilize these blocks or remove them entirely. It should be noted that every instance of Signal Tap utilizes a global clock and that having many different instances may cause routing and timing difficulties.

3.5.2 Application FPGA Quartus Project

The complete reference design project for the Application FPGAs can be found in xdi_app_top_ref_design_v<ver>.qar which is a Quartus archive file. This includes VHDL and Verilog (QDRII interface AltMemPhy auto-generated files) design files, Quartus project file (app_top.qpf), timing constraint files (.sdc), SignalTap monitor file (.stp), Quartus settings files (.qsf), timing report files (.tcl), and simulation files (.vho). Settings in the reference design project are such that after compilation (specifically in the Assembler), Quartus will generate the .rbf configuration file used by the system.

3.5.2.1 Quartus Version

Altera has updated the Stratix III timing model with each version of Quartus that they release. In order to have correct timing results, the user must use Quartus 8.1 when compiling the Application FPGA project.

3.5.2.2 Project Organization AppA and AppB

While nearly all design files are common to the AppA and AppB FPGAs, there are differences in assignments and timing constraints. The single Quartus project (app_top.qpf) has two revisions, called appA_top and appB_top, which reference their own Quartus settings files (appA_top.qsf, appB_top.qsf). Because of the differences between AppA and AppB (including pin locations and trace lengths), even if the same AFU is wanted in both Application FPGAs, each device must be compiled separately to produce a unique .rbf file for AppA and AppB.
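Since each revision must be compiled separately, a build script typically issues one compile per revision. The Python sketch below only assembles and prints the command lines. It assumes the Quartus command-line shell (quartus_sh) and its --flow compile option with -c to select a revision; the handbook itself does not prescribe a command-line flow, and the function name is illustrative.

```python
PROJECT = "app_top"                    # from app_top.qpf
REVISIONS = ("appA_top", "appB_top")   # one revision per Application FPGA


def compile_commands() -> list[list[str]]:
    """Build one quartus_sh compile command per revision."""
    return [["quartus_sh", "--flow", "compile", PROJECT, "-c", revision]
            for revision in REVISIONS]


commands = compile_commands()
for command in commands:
    # Each command could be passed to subprocess.run(command, check=True)
    # from the project directory on a machine with Quartus installed.
    print(" ".join(command))
```

Each compile produces its own .rbf file, so the two outputs must not be interchanged between AppA and AppB.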

3.5.2.2.1 appA_top.qsf and appB_top.qsf

These are the Quartus settings files, which include all IO assignments, the files associated with the project, and SignalTap information. The IO assignments include pin location, IO standard, input/output delay, DQ/DQS groupings, and current strength. These assignments have been optimized for each Application FPGA and should not be altered by the user.


3.5.2.2.2 app.sdc

This is the timing constraint file used for both Application FPGAs. It defines reference clocks, then constrains the Bridge Port and App Port interfaces. It also includes constraints for the JTAG signals, which are used when SignalTap is included in the design. The user may have to change the input clock definitions if the clock usage is different from the reference design. The user should not change any constraints related to the Bridge Port or App Port. However, if the App Port is not used in the user's design, then its constraints could be removed. The Bridge / App / QDR Ports have been fully constrained such that any changes by the user to the AFU design should continue to result in a design that meets timing requirements in these areas.

3.5.2.2.3 app.stp

This is the Quartus SignalTap file that can be included in design compilation to add embedded registers that can be monitored and scanned using JTAG. There are three instances included in this file, but the user can change them to cover any signals that the user wants to monitor. Care should be taken to choose the appropriate clock for the monitored signals. Note that each SignalTap instance uses a global clock and that adding many instances could result in routing / timing issues.

3.5.2.2.4 app_ReportTiming.tcl

After full compilation, this script file can be sourced in TimeQuest, Altera's timing analyzer, to create specific reports on all timing related to the reference design, from input and output timing to core timing. This script can be modified by the user, but timing must be examined after each compile to ensure that timing requirements have been met.

3.5.2.3 Different Devices

XD2000i modules have either EP3SE260F1152C3 Application FPGAs or EP3SL150F1152C3 Application FPGAs. The device can be chosen in the Settings > Device dialog box. When you change the device, Quartus tells you that Altera recommends removing all location and IO standard assignments. Do NOT do this, as all assignments and timing constraints are used for both devices. These are the only two valid devices, as the Application FPGAs are in the F1152 package and are speed grade -3.

3.5.2.4 A Note on Compilation Warnings

3.5.2.4.1 Analysis and Synthesis Warnings

Many warnings reported by Quartus during Analysis and Synthesis resemble the following:

Warning (10036): Verilog HDL or VHDL warning at app_bridge_phy.vhd(107): object "SIGTAP_app2bridge_data_valid_q" assigned a value but never read

These are the registers that are monitored in SignalTap and scanned out. They do not drive any other logic and hence result in this benign warning. Another group of warnings is like the following:

Warning (10541): VHDL Signal Declaration warning at app_core.vhd(109): used implicit default value for signal "afu2AppPort_data_is_csr" because signal was never assigned a value or an explicit default value. Use of implicit default value may introduce unintended design optimizations.

Signals like data_is_csr are not used in the current 0.7 version of the FSB protocol and these warnings can be ignored. These signals will be used in the future V1.0 protocol release.

3.5.2.4.2 Fitter Warnings

Most of the warnings reported by Quartus during the Fitter are like the following:

Warning: Can't pack node fpga_interface_port:i_FPGA_Interface_AppPort|app_bridge_phy:i_BridgeAppPort_phy|app_bridge_phy_interface:app_bridge_phy_interface_inst|app_bridge_phy_DDR_IO:\genPhyBlocks:5:app_bridge_phy_DDR_IO_inst|\dq_generate:7:dqi_io_ibuf to I/O pin
Warning: Can't pack I/O cell fpga_interface_port:i_FPGA_Interface_AppPort|app_bridge_phy:i_BridgeAppPort_phy|app_bridge_phy_interface:app_bridge_phy_interface_inst|app_bridge_phy_DDR_IO:\genPhyBlocks:5:app_bridge_phy_DDR_IO_inst|\dq_generate:7:dqi_io_ibuf -- no fan-out from combinational output port

Altera technical support was contacted about these warnings and concluded that this is a benign Quartus bug, that the warning should not be reported, and that it would be fixed in a future release of Quartus.


3.6 Application FPGA Simulation

The Application FPGA reference design xdi_app_top_ref_design_v<ver>.qar includes test bench and simulation files that are ignored by Quartus during synthesis. Altera Quartus 8.1 VHDL and Verilog libraries must be used for simulation. The reference design includes two test benches:

tb_testQDR.vhd - QDRII+ interface
tb_app.vhd - AFU interface

3.6.1 QDRII Library

The QDRII+ interface HDL must be compiled into a simulation library prior to running simulations on the Application FPGA reference design. The simulation library must be named qdrii. All of the source files for the qdrii library can be found in \fpga_modules\components_intel\xd2000i_components\src\qdrii. The following list specifies the compilation order for the qdrii library source files:

simulation\CY7C1565V18.v
qdr_config_pkg.vhd
altmemphy\mf_qdr_phy_alt_mem_phy_pll.vhd
altmemphy\mf_qdr_phy_alt_mem_phy_qdrii_sequencer.vhd
altmemphy\mf_qdr_phy_alt_mem_phy_qdrii_sequencer_wrapper.vho
altmemphy\mf_qdr_phy.vho
prbs23_36bit_msb.vhd
qdr_subtest_prbs_generate_prbs_data.vhd
mf_address_counter.vhd
qdr_test_counter_fsm.vhd
qdr_test_data_comparator.vhd
qdr_subtest_byte_en.vhd
qdr_subtest_write_read_same_cycle.vhd
qdr_subtest_counter.vhd
qdr_subtest_pass_word.vhd
qdr_subtest_prbs.vhd
qdr_subtest_ssn.vhd
qdr_test_control.vhd
qdr_test_top.vhd
simulation\tb_testQDR.vhd

3.6.2 AFU Simulation

For Application FPGA AFU simulation, app_core.vhd is the top level of the design that should be used. This exposes the FIFO interfaces of the Application FPGA and bypasses the physical interfaces of the Bridge Port and App Port found in app_top.vhd. The test bench files provided, tb_app.vhd and afu_if_tester.vhd, provide an example of how to interface with the reference design's AFU. As the same interface is used when communicating with all AFUs, only one of the AFUs is tested: AFU Loopback. The simulation sends data to the reference design's AFU Loopback, which transfers the input data onto its output port and returns it to the test bench.

3.6.2.1 Source Files

The following source HDL files, listed in order of compilation, are required to run the AFU simulation:

fpga_modules/components_intel/xd2000i_components/src/ahm_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/ahm_config_pkg.vhd
src/app_config_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_control_register_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/ahm_components_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_components_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/test_sequencer_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/mult_afu_if_pkg.vhd
src/mult_afu_config_pkg.vhd
fpga_modules/components_intel/xd2000i_components/src/signal_tap_slv8_monitor.vhd
fpga_modules/components_intel/xd2000i_components/src/signal_tap_drv_monitor.vhd
src/mf_clock_bridge_if_on_app.vhd
src/mf_clock_afu_if.vhd
src/bridge_clocks.vhd
src/app_clocks.vhd
fpga_modules/components_intel/xd2000i_components/src/ahm_fifo.vhd
fpga_modules/components_intel/xd2000i_components/src/slv_pipeline.vhd
fpga_modules/components_intel/xd2000i_components/src/ahm_fifo_rdwt_monitor.vhd
fpga_modules/components_intel/xd2000i_components/src/test_sequencer_drv.vhd
fpga_modules/components_intel/xd2000i_components/src/fifo_afudrv.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_chain_loopback.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_loopback.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_test_sequencer.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_port_tester.vhd
fpga_modules/components_intel/xd2000i_components/src/mult_afu_if_mux_controller.vhd
fpga_modules/components_intel/xd2000i_components/src/mult_afu_if.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_control_registers.vhd
src/app_core.vhd
fpga_modules/components_intel/xd2000i_components/src/afu_if_tester.vhd
src/testbench/tb_app.vhd

In addition to these files, the qdrii library (Section 3.6.1) is required for this simulation.

3.6.2.2 AFU Loopback Simulation Results

The testbench for the AFU loopback simulation is tb_app.vhd. This testbench employs afu_if_tester.vhd to write an increasing sequence of 10,000 words starting at 0x0000 and ending at 0x270F. Stimuli generated by the testbench are applied to the FIFO interface between the Bridge Port and Multiple AFU Interface on the Reference Design AFU (see Illustration 3). As a result, a two-word header precedes the sequence of 10,000 words. This header directs the Multiple AFU Interface to select the AFU Loopback unit. With the exception of the header, all words applied by the testbench are compared to the data words read back. The loopback test runs for approximately 101 us. The following message is printed if there are no mis-compares between the read and write data:

KERNEL: >> XDI TESTBENCH STATUS : SUCCESS

After the testbench detects that the test has completed, the clocks are gated to stop the simulation. It should be noted that the gating of the clocks results in simulation messages indicating that the PLL has lost lock. This is expected behavior for this test. Below is some sample output showing the completion of the loopback test:
KERNEL: afu_egress (read from afu) : 000000000000000000000000000000000000000000000000000000000000270E @ 100485 ns
KERNEL: >> XDI INFO : Checking : (expected, actual) = (9998, 9998) @ 100485 ns : total compares = 9999 : max gap = 1 : running status = ok
KERNEL: afu_egress (read from afu) : 000000000000000000000000000000000000000000000000000000000000270F @ 100495 ns
KERNEL: >> XDI INFO : Checking : (expected, actual) = (9999, 9999) @ 100495 ns : total compares = 10000 : max gap = 1 : running status = ok
KERNEL: >> XDI INFO : Checking : (expected, actual) = (9999, 9999) @ 100495 ns : total compares = 10000 : max gap = 1 : running status = ok
KERNEL: >> XDI TESTBENCH STATUS : SUCCESS
EXECUTION:: NOTE : Stratix III PLL lost lock due to loss of input clock or the input clock is not detected within the allowed time frame.
EXECUTION:: Time: 100515 ns, Iteration: 0, Instance: /i_app_core/i_qdr_test_top_inst/FPGA_Interface_QDRII_Port/mf_qdr_phy_mf_qdr_phy_alt_mem_phy_mf_qdr_phy_alt_mem_phy_inst_mf_qdr_phy_alt_mem_phy_clk_reset_clk_mf_qdr_phy_alt_mem_phy_pll_half_rate_pll_13925/altpll_component/STRATIXIII_ALTPLL/M4, Process: line__10834.
EXECUTION:: NOTE : Please run timing simulation to check whether the input clock is operating within the supported VCO range or not.
EXECUTION:: Time: 100515 ns, Iteration: 0, Instance: /i_app_core/i_qdr_test_top_inst/FPGA_Interface_QDRII_Port/mf_qdr_phy_mf_qdr_phy_alt_mem_phy_mf_qdr_phy_alt_mem_phy_inst_mf_qdr_phy_alt_mem_phy_clk_reset_clk_mf_qdr_phy_alt_mem_phy_pll_half_rate_pll_13925/altpll_component/STRATIXIII_ALTPLL/M4, Process: line__10834.
EXECUTION:: NOTE : Stratix III PLL lost lock due to loss of input clock or the input clock is not detected within the allowed time frame.
EXECUTION:: Time: 100515 ns, Iteration: 3, Instance: /i_app_core/i_BridgeAppPorts_clocks/gen_pll_clocks/i_bridge_if_clks/altpll_component/STRATIXIII_ALTPLL/M4, Process: line__10834.
...
EXECUTION:: NOTE : Please run timing simulation to check whether the input clock is operating within the supported VCO range or not.
EXECUTION:: Time: 100515 ns, Iteration: 3, Instance: /i_app_core/i_app_clocks_unused/gen_pll_clocks/i_app_if_clks/altpll_component/STRATIXIII_ALTPLL/M4, Process: line__10834.
EXECUTION:: WARNING: Illegal value detected on input clock. DLL will lose lock.
EXECUTION:: Time: 100515 ns, Iteration: 6, Instance: /i_app_core/i_qdr_test_top_inst/FPGA_Interface_QDRII_Port/mf_qdr_phy_mf_qdr_phy_alt_mem_phy_mf_qdr_phy_alt_mem_phy_inst_mf_qdr_phy_alt_mem_phy_clk_reset_clk_stratixiii_dll_dll_11214, Process: line__8019.
KERNEL: Simulation has finished. There are no more test vectors to simulate.
VSIM: Simulation has finished. There are no more test vectors to simulate.
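The structure of the loopback check (a two-word header, 10,000 incrementing payload words, and a word-by-word compare of the payload only) can be sketched in a few lines. The Python model below is not a translation of tb_app.vhd or afu_if_tester.vhd. The header values are placeholders, since the header encoding is not given in this section, and the model assumes that the header is consumed before the data is looped back.

```python
HEADER = (0x0000, 0x0000)   # placeholder values; real encoding not shown here
NUM_WORDS = 10_000


def build_stimulus() -> list[int]:
    """Two header words followed by the incrementing payload."""
    return list(HEADER) + list(range(NUM_WORDS))


def afu_loopback_model(stimulus: list[int]) -> list[int]:
    """Assumed behavior: header is consumed, payload is echoed unchanged."""
    return stimulus[len(HEADER):]


def check(stimulus: list[int], readback: list[int]) -> tuple[int, str]:
    """Compare every payload word; the header is excluded from the check."""
    expected = stimulus[len(HEADER):]
    compares = 0
    for exp, act in zip(expected, readback):
        compares += 1
        if exp != act:
            return compares, "FAILURE"
    complete = compares == len(expected) == len(readback)
    return compares, "SUCCESS" if complete else "FAILURE"


stimulus = build_stimulus()
compares, status = check(stimulus, afu_loopback_model(stimulus))
print(f"total compares = {compares} : last word = 0x{stimulus[-1]:04X} : {status}")
```

The last payload word is 9999, which is 0x270F, and the compare count of 10,000 matches the final "total compares" value in the sample output.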

3.6.3 QDRII+ Interface Simulation

The AltMemPhy MegaWizard used to produce the QDRII+ interface generates files for synthesis and for simulation. The .qar project includes these simulation-only files as well as the simulation model for the specific Cypress QDRII+ SRAM found on the XD2000i module, CY7C1565V18.v. Altera Quartus 8.1 VHDL and Verilog libraries must be used for simulation. In order to make simulation time reasonable, the range of addresses written to / read from is limited in simulation.

3.6.3.1 Files for QDRII+ Interface Simulation

The files required for the QDRII+ Interface simulation are in the qdrii simulation library described in Section 3.6.1.

3.6.3.2 QDRII+ Interface Simulation Results

The test bench file, tb_testQDR.vhd, controls inputs (reset_qdr_test and inputErrorToPRBS) and monitors outputs (qdr_test_out_resync_successful, qdr_test_out_fail_sticky, and qdr_test_out_pass_word) to test for proper QDRII+ interface operation. Run the simulation for at least 45 us in order to use all of the stimulus. A successful pass will result in the following printed to the screen:

# KERNEL: ----> Altera QDRII/+ Sequencer : Starting Calibration <----
# KERNEL: ----> Altera QDRII/+ Sequencer : Non-Deterministic Latency Mode <----
# KERNEL: ----> Altera QDRII/+ Sequencer : DLL Initialisation Complete <----
# KERNEL: ----> Altera QDRII/+ sequencer : Calibration For Chip Number 0 PASSED
# KERNEL: ----> Altera QDRII/+ sequencer : Calibration For Chip Number 1 PASSED
# KERNEL: ----> Altera QDRII/+ Sequencer : Calibration Completed - STATUS SUCCESSFUL <----
# KERNEL: ----> Altera QDRII/+ Sequencer : The Detected Read Latency Number Is 13 Phy Clock Cycles <----
# KERNEL: Reset the QDRII+ interface test
# KERNEL: Cycle through QDRII+ interface sub-tests and wait for pass_word
# KERNEL: Test passes! 'qdr_test_out_fail_sticky'=0
# KERNEL: Continue to run test, but inject error into PRBS test
# KERNEL: Error detected on 'qdr_test_out_fail_sticky' as expected
# KERNEL: Stop injecting error and pulse reset to clear error
# KERNEL: Error cleared and pass_word back to '0000'
# KERNEL: Cycle through QDRII+ interface sub-tests and wait for pass_word
# KERNEL: Test passes! 'qdr_test_out_fail_sticky'=0


3.7

Software Framework

3.7.1

Software Features

The software is distributed as a gzipped tar archive named XD2000i_Reference_Design_Software_v<ver>.tar.gz. The archive contains source code for test programs that verify and exercise the reference design. Features of this AFU V0.7 FSB protocol release include the following.
- The Bridge FPGA is automatically configured from Flash memory at power-up. A soft or warm reset does not reload the Bridge FPGA.
- The AppA and AppB FPGAs are available. Data sent to an Application FPGA contains a destination header that identifies the target FPGA. The Bridge strips the header and forwards the payload to the targeted Application FPGA.
- The two Application FPGAs may be configured under software control. The FPGA image file format is compressed .rbf.
- The user AFU interface implements the AFU-Type FIFO protocol described in Section 11.
- The QDR interfaces are implemented and exercised by a stand-alone data generator and checker.
- The inter-FPGA interfaces are implemented (AppA <-> AppB link).
- New Bridge images may be loaded into flash memory through the FSB for subsequent automatic configuration of the Bridge at power-on. Bridge images use a proprietary format and are supplied by XDI.

Features targeted for the future AFU V1.0 FSB protocol release include the following.
- Support for AAL 1.0 system software. AAL 1.0 enables the following.
  - A fully asynchronous programming model with callbacks.
  - Multiple, independent AFUs. Note: The Bridge, AppA, and AppB address header mechanism will be removed in the AAL 1.0 release. The Application FPGAs will be addressed through an opaque software handle.
- Increased FSB bandwidth. The 4 MB transfer size target is 3.5 GB/s to the FPGA, 2.5 GB/s from the FPGA and 1.5 GB/s for simultaneous read and write (loop back testing).
- Software updating of the Flash with bad image error recovery to eliminate the need for JTAG.
- Support for cache-snoop enabled MCHs.
- System usability features including:
  - Recovery from application errors by selective AFU reset.
  - A master mode interface whereby the AFU can independently generate system memory addresses.
  - A memory mapped read write register interface to the AFU.

3.7.2

Software Directory Structure


The software tar file contains the following software directory structure.

software_<ver>      // Release directory root
    bin             // Various binary files
    build           // Makefile, release notes and test script
    src             // Source files for the reference design software
    test            // Test scripts and pre-built executables

The release directory contains the following structure.

bin                                 // Various binary files
    fapdrv_afu_07.ko                // Higher level device driver
    fappip_afu_07.ko                // Lower level device driver
    fsb_bus.tcl                     // Utility to enable FSB segment
    libXdiAhm.a                     // Link archive for App configuration
    libXdiAhm.so                    // Shared library for App configuration
    xdi_appA_EP3SE260_v<ver>.rbf    // Reference design appA rbf
    xdi_appB_EP3SE260_v<ver>.rbf    // Reference design appB rbf

build                               // Test program build directory
    doAhmTest.sh                    // Bash test program built around ahmTest
    Makefile                        // Makefile to build ahmTest, ahmAppDownload and
                                    // ahmFlashDownload
    README                          // Latest information not mentioned elsewhere
    release_notes_<ver>.txt         // Software release notes

src                                 // Software source files
    ahm.cpp
    ahm.h
    ahmAppDownload.cpp
    ahmException.h
    ahmFlashDownload.cpp
    ahmInstance.h
    ahmLogger.h
    ahmLoopbackTest.cpp
    ahmMonitor.h
    ahmQdrTest.cpp
    ahmRegReport.h
    ahmSequencerTest.cpp
    ahmSetControlReg.h
    ahmStdint.h
    ahmTest.cpp
    ahmTimer.h
    ahmUtils.h
    ahmWorkspace.h
    appControlRegistersPkg.h
    fappip-drv.h
    MersenneTwister.h
    registerDefs.h
    registerWorkspace.h
    sequencerCommand.h

test                                // Test directory containing pre-built programs
    ahmAppDownload                  // Application FPGA FSB configuration program
    ahmFlashDownload                // Application to access Flash Memory
    ahmTest                         // Module test program
    doAhmTest.sh                    // Shell test program built around ahmTest
    Makefile                        // Makefile to build ahmTest and ahmAppDownload
    seqCmds.in                      // Sequencer input file generated by doAhmTest.sh
    *.o                             // compilation object files
    *.d                             // compilation dependency files

4 Development System Setup


This section describes how to set up the Development and user PCs, both hardware and software. These steps are performed only at initial system setup or when a new XDI-provided file is obtained (e.g. a new CPLD configuration file or a new software distribution). After performing the steps described in this section, the system may be verified by following the steps described in Section 5.

4.1

Development System Setup - Hardware

4.1.1

Load CPLD and Flash Configuration Files

The CPLD and flash configuration files are loaded into non-volatile memory on the module and only need to be loaded one time. However, if a new .pof is provided by XDI for either device, the user needs to replace the current .pof by following these steps. The flash .pof contains the Bridge FPGA configuration file that will be automatically loaded into the Bridge at system power-up.

4.1.1.1

Connect the User PC to the Development PC via USB-Blaster

The Development PC has a JTAG connector located on the back of the system to allow access to the XD2000i module without opening the chassis. The USB-Blaster JTAG connector is keyed to enable proper connection to the blue 10-pin JTAG connector. The USB side of the USB-Blaster connects to any USB port on the user PC. The USB-Blaster driver may need to be installed on the user PC. Power up both machines. There is no need to log into the Development PC at this time.

4.1.1.2

Detect Module Through Quartus II Programmer on the User PC

On the user PC, launch the Quartus Programmer. This can either be done through Quartus, or as a stand-alone tool. Ensure that the USB-Blaster is visible, or else choose it through Hardware Setup.... Hit Auto-Detect to access the JTAG chain on the XD2000i module. You should see a screen similar to Illustration 6. This shows five devices in the JTAG chain:

EPM2210    - MAX II CPLD used for module control features
CFI_256MB  - Flash memory to store Bridge configuration file
EP3SL150   - Stratix III Bridge FPGA
EP3SE260   - Stratix III Application A FPGA
EP3SE260   - Stratix III Application B FPGA

The Bridge FPGA (EP3SL150) shown here is a production device. An engineering sample device would have ES at the end of its name (e.g. EP3SL150ES). The order of devices shown (which reflects their order in the JTAG chain) will always be the same. Even if the Application B FPGA were to be replaced with an EP3SL150, it would still appear last in the chain.

Illustration 6: Quartus II Programmer Screen shot

4.1.1.3

Configure the CPLD

To configure the CPLD, add the xdi_cpld_v<ver>.pof file to the EPM2210 device and enable the associated Program/Configure check boxes, as shown in Illustration 7. Hit Start to configure the CPLD. When the CPLD has been configured, its associated blue LED will light up. The module LED descriptions can be found in the XD2000i_Module_LED_Definitions_<ver>.pdf document. When complete, power-cycle the Development PC.

Illustration 7: Quartus II Programmer Screen shot Configure CPLD

4.1.1.4

Configure the Flash Memory

The on-module flash memory will hold the Bridge configuration file that will be automatically loaded into the Bridge FPGA at system power-up. To configure the flash, launch the Quartus Programmer and hit Auto Detect to display the devices as before. Add the xdi_flash_PR_v<ver>.pof or xdi_flash_ES_v<ver>.pof file to the CFI_256MB device. The PR file is used with production Bridge FPGAs, while the ES file is used with engineering sample Bridge FPGAs. The device type is indicated by the Bridge device name in the JTAG programming window (EP3SL150 = PR, EP3SL150ES = ES). Enable the associated Program/Configure check boxes, as shown in Illustration 8. Hit Start to configure the flash. This will take a few minutes. When configuration is complete, power-cycle the Development PC. This time, on power-up, the Bridge FPGA configuration file will be automatically loaded into the Bridge from flash, and the blue LED associated with the Bridge will light up to indicate completed configuration.

Illustration 8: Quartus II Programmer Screenshot Configure Flash

4.2

Development System Setup - Software

Software setup consists of disabling the snoop filter (if required), installing third party libraries, and unpacking the current software .tar file. The following steps require that hardware installation and required flash updates have been performed as described in Section 4.1.

4.2.1

Disable the MCH Snoop Filter (if required)

Certain MCHs utilize a snoop filter to reduce FSB coherency traffic. The XD2000i using AFU V0.7 FSB protocol is incompatible with systems using an enabled snoop filter. Any snoop filter must therefore be disabled before booting a system which loads the XD2000i device drivers. The 5000X and 5400 series MCHs are examples of snoop filter capable MCHs. The SuperMicro X7DGT and X7DWT are examples of systems containing snoop filter capable MCHs. The snoop filter is generally controlled through the BIOS, so this step is typically performed once per system. Snoop filter functionality may be disabled from the BIOS setup screens. One possible flow is:

1. Press the Delete key during system boot to get to the BIOS setup screen.
2. Select Advanced => Advanced Chipset Control => Snoop Filter from the BIOS menu.
3. Select Disabled.
4. Press F10 to save, exit and reboot the system.

4.2.2

Log in to the Development PC

Use the following username and password to log onto the Development PC.

Username: xd2000
Password: xd2000

4.2.3

Install External Libraries

The reference design software depends on the following external library.

Boost    // for parsing command line options using program_options and
         // measuring time using posix_time

The Boost libraries may be installed from the Internet repository with the following command.

$ sudo yum install boost-devel

4.2.4

Unpack the Distribution


Create and change into the target directory that will contain the software project.

$ cd <target_dir>

Copy the software archive to the target directory.

$ cp <src_dir>/XD2000i_Reference_Design_Software_v<ver>.tar.gz .

Extract and create directory tree. This creates the bin, build, src and test sub-directories.

$ tar -xf XD2000i_Reference_Design_Software_v<ver>.tar.gz

5 System Verification
At this point, the steps in Section 4 should have been completed so that the Development PC is set up, the software is available to be run, and the Bridge is ready to communicate with the system. This section describes how to load the drivers and verify the system.

5.1

Establish Communication with the Bridge

Each time the module starts up, the following sequence (Sections 5.1.1 to 5.1.3) needs to be executed to open direct communication to the Bridge.

5.1.1

Log in to the Development PC

Use the following user name and password to log onto the Development PC.

Username: xd2000
Password: xd2000

5.1.2

Enable the AHM FSB Segment (if required)

The XD2000i module occupies the secondary CPU socket. Certain systems disable the secondary FSB segment if a CPU is not detected in socket 1 during system boot. On systems with SuperMicro Rack Mount DP X7DGT-SG007 motherboards only, the following Tcl script must be executed to enable the AHM socket 1 front side bus interface. These systems can be identified by an XDI part number that includes the field IN02. This procedure must be performed after every system restart.

$ cd <target_dir>/software_<ver>/bin
$ sudo chmod 777 fsb_bus.tcl
$ ./fsb_bus.tcl 1 on

5.1.3

Load the XD2000i Device Driver

The XD2000i AFU V0.7 protocol device driver has two components: fappip_afu_07.ko implements the low level hardware interface and fapdrv_afu_07.ko implements the application interface. After driver installation, the AHM is accessed through the /dev/fap device file. The driver must be installed with root permissions, so the use of sudo is recommended. If prompted for a password, sudo is requesting the password of the current user (e.g. xd2000). This procedure to load the device drivers must be performed after every hard or soft system restart.

Load the XD2000i device drivers and enable /dev/fap for all users:

$ cd <target_dir>/software_<ver>/bin
$ sudo /sbin/insmod fappip_afu_07.ko
$ sudo /sbin/insmod fapdrv_afu_07.ko
$ sudo chmod 666 /dev/fap

Verify the driver has been loaded by reading the system log:

$ dmesg

The following lines should appear near the end of the dmesg output (separated by other lines of text).

fappip: ACP module @ socket=1
fappip: FSB Accelerator Hardware Module - PIP Module,
fapdrv: FSB Accelerator Hardware Module - Device Driver
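Because the driver messages are interleaved with other kernel output, it can be convenient to search for them rather than read the log by eye. The following Python sketch is an illustration only and is not shipped with the reference design. The helper name missing_driver_messages is hypothetical, and it assumes the dmesg output has already been captured as a string.

```python
# Driver messages listed above (matched as substrings of the dmesg output).
EXPECTED_DRIVER_MESSAGES = (
    "fappip: ACP module @ socket=1",
    "fappip: FSB Accelerator Hardware Module - PIP Module",
    "fapdrv: FSB Accelerator Hardware Module - Device Driver",
)


def missing_driver_messages(dmesg_text: str) -> list:
    """Return the expected driver messages that do not appear in dmesg_text."""
    return [msg for msg in EXPECTED_DRIVER_MESSAGES if msg not in dmesg_text]
```

One way to obtain the input is subprocess.run(["dmesg"], capture_output=True, text=True).stdout. An empty list returned by the helper suggests that both driver components were loaded.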

5.2

Load the Reference Design and Exercise Hardware

The software_<ver>/test directory contains pre-built programs suitable for reference design and system verification. The verification programs may also be built by invoking make clean && make in the test or build sub-directories. The verification programs configure both Application FPGAs with the reference design and then test all interface ports and AFUs that are described in Section 14. Change to the test directory that contains the precompiled test program and verify the system by running the doAhmTest.sh script.

$ cd <target_dir>/software_<ver>/test
$ ./doAhmTest.sh -n1 | grep LogTestStatus

Verify the test output using the sample output below.

Large amount of output; each LogTestStatus message, except the first, should report true.

Starting doAhmTest.sh (LogTestStatus : Test complete = false)
...
2009-Mar-17 17:59:34.250292 <> ahmFsbDownload.cpp - Line 242 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:35.205620 <> ahmFsbDownload.cpp - Line 242 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:37.066324 <> ahmSequencerTest.cpp - Line 176 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:38.925110 <> ahmSequencerTest.cpp - Line 176 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:40.782955 <> ahmSequencerTest.cpp - Line 176 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:42.642518 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:44.499615 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:46.356801 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:48.214138 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:48.255084 <> ahmQdrTest.cpp - Line 158 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:48.292021 <> ahmQdrTest.cpp - Line 158 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:48.299812 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:50.156727 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:50.166389 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:52.023792 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:52.034780 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:52.043644 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:53.901206 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:53.911157 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:53.918962 <> ahmSetControlReg.h - Line 93 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
2009-Mar-17 17:59:55.777041 <> ahmLoopbackTest.cpp - Line 146 <> LogTestStatus <> thisTestStatusIsOk = true <> = 0000000001h
doAhmTest.sh : LogTestStatus : thisTestStatus = Success
doAhmTest.sh : LogTestStatus : allTestStatus = Success
doAhmTest.sh : LogTestStatus : allTestStatus = Success
Ending doAhmTest.sh (LogTestStatus : Test complete = true)
doAhmTest.sh : LogTestStatus : doAhmTest.sh exit code = 0
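The pass rule for this output (every thisTestStatusIsOk entry reports true and the script exits with code 0) can be applied automatically to captured output. The Python sketch below is illustrative and not part of the XDI software. The function name log_test_status_ok is hypothetical, and the patterns are taken only from the sample output shown above.

```python
import re


def log_test_status_ok(output: str) -> bool:
    """Apply the LogTestStatus pass rule to captured doAhmTest.sh output."""
    lines = [ln for ln in output.splitlines() if "LogTestStatus" in ln]
    # Collect every per-test flag; the initial "Test complete = false"
    # line carries no thisTestStatusIsOk field and is therefore ignored.
    flags = re.findall(r"thisTestStatusIsOk = (\w+)", "\n".join(lines))
    # The final line of a successful run reports exit code 0.
    exit_ok = any(re.search(r"exit code = 0\s*$", ln) for ln in lines)
    return bool(flags) and all(flag == "true" for flag in flags) and exit_ok
```

The script's own exit status remains the authoritative result. Note that in a shell pipeline such as the one above, the reported status is normally that of grep rather than doAhmTest.sh.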

5.2.1

Verify the Software Build Process

The software build process may be verified by building the test software in the build directory and running the doAhmTest.sh script. The test output should be the same as shown in Section 5.2.

$ cd <target_dir>/software_<ver>/build
$ make clean && make
$ ./doAhmTest.sh -n1 | grep LogTestStatus

5.3

Monitor Data on the Application FPGAs

The data and control signals on the Application FPGA can be monitored while the reference design is running by using SignalTap. From Quartus II on the user PC, launch SignalTap (Tools->SignalTap II Logic Analyzer) and load the supplied SignalTap file: app.stp. Set up the hardware to choose the correct USB Blaster and choose either Application FPGA. Application A, device @3, is chosen in Illustration 9.

Illustration 9: SignalTap Hardware Setup

As an example, two screen captures of data transfers to the Loopback AFU are shown in Illustration 10 (wide view) and Illustration 11 (zoom view). Illustration 10 shows valid data at BridgePort2AFU and then later in time at AFU2BridgePort after it loops back through AFU i_afu_loopback. Illustration 11 zooms in to show one of the 256-bit words of random data that is being transferred. As the Bridge FPGA is proprietary, there are no SignalTap files available for that device.

Illustration 10: Loop back SignalTap Capture on Application A FPGA - wide

Illustration 11: Loop back SignalTap Capture on Application A FPGA - zoom


6 Software Programs
The following programs are found in the software_<ver>/test directory.

6.1

ahmAppDownload - Configure Application FPGAs

ahmAppDownload configures each Application FPGA with the specified .rbf configuration files. The program has the following parameters and returns 0 on success and non-0 on failure.
ahmAppDownload usage:
  -h [ --help ]                           Print this help message
  --version                               Print program version
  -a [ --rbfPathA ] arg                   AppA rbf path to download
  -b [ --rbfPathB ] arg                   AppB rbf path to download
  -A [ --loadAppA ] arg                   Load AppA fpga
  -B [ --loadAppB ] arg                   Load AppB fpga
  -t [ --debugPrintThreshold ] arg (=0)   Debug print level (lower value prints more messages)

The following command configures the AppA FPGA with appA_test.rbf.

$ ./ahmAppDownload -A -a appA_test.rbf

Results:
downloadAppAStatusIsOk = true
All runningStatusIsOk == true

6.2

ahmFlashDownload - Write Bridge FPGA Boot Image to Flash

The ahmFlashDownload program writes a new Bridge boot image into Flash memory. After the next system power-cycle, this boot image automatically loads into the Bridge. The write and verification process takes several minutes. Running ahmFlashDownload with no arguments will print various version registers.
ahmFlashDownload usage:
  -h [ --help ]                    Print this help message
  --version                        Print program version
  -i [ --imagePath ] arg           Image path to save or load
  -l [ --load ]                    Load flash image from file
  -p [ --imagePageNo ] arg (=-1)   Flash image page to load

$ ./ahmFlashDownload

Results:
AHM Versions
cpldMajorVersion   = 2 = 2h
cpldMinorVersion   = 1 = 1h
pcbVersion         = 5 = 5h
interposerVersion  = 3 = 3h
bridgeMajorVersion = 1 = 1h
bridgeMinorVersion = 0 = 0h
Flash Device Information
manufacturerId     = 1 = 1h
deviceId1          = 8830 = 227eh
deviceId2          = 8738 = 2222h
deviceId3          = 8705 = 2201h
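Each register in this report is printed twice, once in decimal and once in hexadecimal with an h suffix (for example 8830 = 227eh). The following Python sketch shows one way a script could read such a report and cross-check the two representations. It is an illustration only. The function name parse_version_report is hypothetical, and the line format is assumed from the sample results above.

```python
import re

# One report line: <name> = <decimal> = <hex>h
REPORT_LINE = re.compile(
    r"^\s*(\w+)\s*=\s*(\d+)\s*=\s*([0-9a-fA-F]+)h\s*$", re.MULTILINE
)


def parse_version_report(text: str) -> dict:
    """Return {register name: value}, checking decimal against hex."""
    registers = {}
    for name, dec_text, hex_text in REPORT_LINE.findall(text):
        value = int(dec_text)
        if value != int(hex_text, 16):
            raise ValueError(f"{name}: {dec_text} does not equal {hex_text}h")
        registers[name] = value
    return registers
```

Section titles such as "AHM Versions" do not match the pattern and are skipped.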

6.3

doAhmTest.sh - Reference Design Demonstration Script

The doAhmTest.sh script is a wrapper around ahmTest. It sets up arguments that can be adjusted to demonstrate all reference design functionality or only specific AFUs. The script calls ahmTest to perform the actual work. doAhmTest.sh accepts a single optional argument, -n, specifying the number of test iterations. The script returns 0 on success and non-0 on failure. This is the program that is run in Section 5.2 to perform system verification. For example, to call ahmTest 5 times, run the following:

$ ./doAhmTest.sh -n5 | grep LogTestStatus

6.4

ahmTest - Reference Design Demonstration Program

The ahmTest program is the test program supervisor. It processes the command line arguments and calls the underlying test programs. The underlying programs return boolean true on success and boolean false on failure. The following functions are called by ahmTest:
- Application FPGA Download
- Sequencer Test
- Loopback Test
- Chain Loopback Test
- QDR Test
- Port Loopback Test
- App Port Loopback Test

Parameters to ahmTest may be displayed with the -h option as shown.

./ahmTest -h
ahmTest usage:
  -h [ --help ]                               Print this help message
  --version                                   Print program version
  -a [ --rbfPathA ] arg                       AppA rbf path to download
  -b [ --rbfPathB ] arg                       AppB rbf path to download
  -d [ --doAppDownload ] arg (=0)             Download application fpga's
  -e [ --doSequencerTest ] arg (=0)           Run sequencer test
  -f [ --doLoopbackTest ] arg (=0)            Run loopback test
  -g [ --doChainLoopbackTest ] arg (=0)       Run chain loopback test
  -i [ --doQdrTest ] arg (=0)                 Run QDR test
  -j [ --doPortLoopbackTest ] arg (=0)        Run Port loopback test
  -k [ --doAppPortLoopbackTest ] arg (=0)     Run App Port loopback test
  -n [ --loopbackTestNumPasses ] arg (=0)     Number of loopback test passes
  -s [ --loopbackTestSizeBits256 ] arg (=0)   Size of loopback data (s x 256 bits)
  -p [ --loopbackTestDataPattern ] arg (=0)   Loopback test data pattern
                                              0 => sequential
                                              1 => random
  -v [ --doVerifyLoopbackData ] arg (=1)      Verify loopback test data
  -c [ --seqCmdPath ] arg                     Sequencer command file test
  -x [ --testBridge ] arg (=0)                Run tests on Bridge fpga
  -y [ --testAppA ] arg (=0)                  Run tests on AppA fpga
  -z [ --testAppB ] arg (=0)                  Run tests on AppB fpga
  -q [ --bridgePresent ] arg (=0)             System contains a Bridge fpga
  -t [ --debugPrintThreshold ] arg (=0)       Debug print level (lower value prints more messages)

For example, the following configures the AppA FPGA and then runs the Loopback test using sequential data.

$ ./ahmTest -a xdi_appA_EP3SE260_v1.0.0.rbf -d 1 -f 1 -n 5 -s 10 -y 1 -q 1

6.4.1

Reference Design Software Functions

The functions called by ahmTest are described here. The hardware being tested is described in Section 14 - Application FPGA Reference Design.

6.4.1.1

ahmFsbAppDownload - Configure a Single Application FPGA

The ahmFsbAppDownload() function configures a single Application FPGA. Since the AppA and AppB configuration images are necessarily different, it is an error if both loadAppA and loadAppB are true. The ahmTest program calls ahmFsbAppDownload() with the following parameters.

bool ahmFsbAppDownload(
    const char*                  sofPath,
    bool                         loadAppA,
    bool                         loadAppB,
    Bridge_Afu_Mux::selector_id  opaque,
    App_Afu_Mux::selector_id     opaque,
    AfuTaskIf*                   opaque
)

6.4.1.2

ahmSequencerTest - Sequencer Test

The ahmSequencerTest() function parses the specified seqCmdPath file and performs the associated actions. The sequencer performs Idle, Read, Write and Loopback operations, where each operation is specified with respect to the sequencer, i.e. Write means the FPGA writes to the CPU. Transmitted data is a simple counting pattern. Idle causes the sequencer to wait the specified number of FPGA clock cycles. Sequencer file data values are interpreted as hexadecimal values. The ahmTest program calls ahmSequencerTest() with the following parameters.

bool ahmSequencerTest(
    const char*                  seqCmdPath,
    size_t                       maxSrcDstCountBits256,
    size_t                       numPasses,
    bool                         checkData,
    Bridge_Afu_Mux::selector_id  bridgeAfuId,
    App_Afu_Mux::selector_id     appAfuId,
    AfuTaskIf*                   afuIf
)

seqCmdPath             // Name of command file to parse and execute
maxSrcDstCountBits256  // Size of AHM workspace to allocate (maximum of 4 MB)
numPasses              // Number of times to execute the file commands
checkData              // Flag to enable or disable received data checking.
                       // Disabling checking allows measurement of
                       // read, write and read/write bandwidths
                       // true => check received data
                       // false => do not check received data
bridgeAfuId            // Bridge AFU selector ID (AppA or AppB).
appAfuId               // Application AFU selector ID (AfuSequencer)
afuIf                  // Unused

The doAhmTest.sh script prepares and executes a sample seqCmds.in file as shown.

Sample seqCmds.in:

// Write and read are relative to the sequencer
// i.e. Read => CPU writes and Sequencer reads
// Command format is ...
// Command Count // Comment
// Command is one of {Idle, Write, Read, Loopback}
// For Command == {Write, Read, Loopback}, Count is the number of 256bit (32Byte) reads or writes to perform
// For Command == Idle, Count is the number of FPGA clock cycles to wait
// Counts are in hex
// 999000 Idles is about 0.1 seconds
// Max Read/Write/Loopback/Idle/Data count per file is 20,000h (4MB workspace buffer)
// Max Idle count per call is FFF,FFFh
// Test software has an arbitrary timeout that may be triggered by excessive Idles
Idle     999000  // Suspend processing for 999000h FPGA cycles (~0.1 sec)
Write    40      // FPGA writes 40h sequential values to the CPU
Read     50      // CPU writes 50h sequential values to the FPGA
Loopback 1FFF0   // The CPU writes 1FFF0h sequential values to the FPGA, which returns them to the CPU
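To make the command file format concrete, the following Python sketch parses a file in the format described by the sample comments: one command per line, a hexadecimal count, and // comments. It is not the parser used by ahmSequencerTest(). The function name and the way the limits are applied are assumptions. In particular, the 20,000h limit is applied here to each individual data command, which is only one possible reading of the sample comment.

```python
WORD_BYTES = 32                 # one 256-bit data word
MAX_DATA_COUNT = 0x20000        # 20,000h words = 4 MB (limit applied per command: an assumption)
MAX_IDLE_COUNT = 0xFFFFFF       # FFF,FFFh FPGA clock cycles per Idle
COMMANDS = ("Idle", "Write", "Read", "Loopback")


def parse_seq_cmds(text: str) -> list:
    """Return a list of (command, count) tuples from sequencer file text."""
    commands = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("//", 1)[0].strip()   # drop trailing comment
        if not line:
            continue                           # blank or comment-only line
        fields = line.split()
        if len(fields) != 2 or fields[0] not in COMMANDS:
            raise ValueError(f"line {number}: expected '<Command> <Count>'")
        name, count = fields[0], int(fields[1], 16)   # counts are in hex
        limit = MAX_IDLE_COUNT if name == "Idle" else MAX_DATA_COUNT
        if count > limit:
            raise ValueError(f"line {number}: count {count:X}h exceeds {limit:X}h")
        commands.append((name, count))
    return commands


def data_bytes(commands: list) -> int:
    """Total payload bytes moved by the data commands (Idle moves none)."""
    return sum(count * WORD_BYTES for name, count in commands if name != "Idle")
```

For the sample file above, the Loopback command alone moves 1FFF0h x 32 = 4,193,792 bytes, just under the 4 MB workspace size.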

6.4.1.2.1

Bandwidth Measurement

The Test Sequencer provides a means to perform bandwidth tests. A one-line Sequencer Command file can be used to report read, write or loopback bandwidth. Tests may be run on either the Bridge or App FPGAs. Bandwidth tests should disable data checking, since the checking occurs within the timing loop. A single-line command file should be used, since all commands are executed within the timing loop. Each of the following lines is an example of such a file.

Read     1FFF0   // CPU-to-FPGA bandwidth
Write    1FFF0   // FPGA-to-CPU bandwidth
Loopback 1FFF0   // CPU-to-FPGA-to-CPU bandwidth
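As a worked example of the arithmetic behind such a measurement, a count of 1FFF0h 256-bit words corresponds to 131,056 x 32 = 4,193,792 bytes per pass. The Python sketch below converts a word count, a pass count and an elapsed time into a bandwidth figure. It is a sketch under stated assumptions: the function name is hypothetical, 1 GB is taken as 10^9 bytes, and the way the test software itself times and reports bandwidth is not specified here.

```python
WORD_BYTES = 32  # one 256-bit data word


def bandwidth_gb_per_s(count_words: int, num_passes: int, elapsed_s: float) -> float:
    """Bandwidth in GB/s (1 GB = 1e9 bytes, an assumption) for a timed transfer."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    total_bytes = count_words * WORD_BYTES * num_passes
    return total_bytes / elapsed_s / 1e9


# 1000 passes of a 1FFF0h-word transfer completed in 2.0 seconds
example = bandwidth_gb_per_s(0x1FFF0, 1000, 2.0)
```

With these illustrative numbers the result is about 2.1 GB/s. Increasing the number of passes reduces the influence of any fixed start-up overhead on the result.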

6.4.1.3

ahmLoopbackTest - Loopback Test

The ahmLoopbackTest() function performs a data loopback from the Bridge through the targeted Application FPGA and back to the Bridge. The loopback path is selected by the appAfuId parameter: different values select different loopback AFUs. If appAfuId == App_Afu_Mux::mux_id_loopback, data passes through the minimum data path on the Application FPGA. If appAfuId == App_Afu_Mux::mux_id_chain_loopback, data passes through a series of FIFOs. The ahmTest program calls ahmLoopbackTest() with the following parameters.

bool ahmLoopbackTest(
    size_t                       numPasses,
    size_t                       sndRcvCountBits256,
    int                          testPattern,
    bool                         checkData,
    Bridge_Afu_Mux::selector_id  bridgeAfuId,
    App_Afu_Mux::selector_id     appAfuId,
    AfuTaskIf*                   afuIf
)

numPasses           // Number of test iterations
sndRcvCountBits256  // Number of 256-bit (32 byte) data words to transfer
testPattern         // A specifier for the type of test data to send
                    // 0 => sequential data
                    // 1 => random data
checkData           // Flag to enable or disable received data checking.
                    // Disabling checking allows measurement of
                    // read, write and read/write bandwidths
                    // true => check received data
                    // false => do not check received data
bridgeAfuId         // Bridge AFU selector ID (AppA or AppB).
appAfuId            // Application AFU selector ID
                    // App_Afu_Mux::mux_id_loopback => minimum path
                    // loopback
                    // App_Afu_Mux::mux_id_chain_loopback => loopback
                    // through a series of FIFOs
afuIf               // Unused

6.4.1.4

ahmQdrTest - QDRII+ Interface Test

The QDR test controls the stand-alone QDR data generator and checker through a register interface. The test restarts the QDR tester and monitors status for initial completion and a persistent error flag. The QDR test is described in Section 19. The ahmTest program calls ahmQdrTest() with the following parameters.

bool ahmQdrTest(
    Bridge_Afu_Mux::selector_id  bridgeAfuId,
    App_Afu_Mux::selector_id     appAfuId,
    AfuTaskIf*                   afuIf
)

bridgeAfuId  // Bridge AFU selector ID (AppA or AppB).
appAfuId     // App AFU selector ID == App_Afu_Mux::mux_id_registers
afuIf        // Unused

6.4.1.5

Port Loopback Test

The Port Loopback Test runs a loopback test from the Bridge through the targeted Application FPGA. Data is looped back within the Application FPGA's AFU Port Tester block (refer to Illustration 3: Application FPGA Reference Design). The Port Loopback test puts the Application FPGA App Port into loopback mode by setting a control register to set control_loopback_BridgeAppPort=1. It then calls ahmLoopbackTest() with appAfuId == App_Afu_Mux::mux_id_test_port.

6.4.1.6

App Port Loopback Test

This test verifies the data link between the Application FPGAs as well as the App Port physical interfaces found on each FPGA. As described in Section 18, the AFU Port Tester block can be configured to pass data to the other Application FPGA. The App Port Loopback Test tests the path from the Bridge through both Application FPGAs. For example, if AppA is to be tested (targeted), data goes from the Bridge, across into AppA (through the Bridge Port), is routed to the AFU Port Tester, feeds through to the App Port (refer to Illustration 3: Application FPGA Reference Design), across into AppB, is looped back at AppB's AFU Port Tester, goes back through AppA and back to the Bridge. There is an equivalent test for AppB, where the data feeds through AppB's AFU Port Tester and loops back at AppA's AFU Port Tester.

Before sending loopback data, the test sets control registers on each Application FPGA to set up the respective Port Tester to either feed through (the targeted FPGA) or loop back (the non-targeted FPGA) data. ahmLoopbackTest() is called with appAfuId == App_Afu_Mux::mux_id_test_port to run the test. Because data is transferred between both Application FPGAs, the App Port Loopback Test requires that both AppA and AppB FPGAs are configured, although both FPGAs do not need to be tested (targeted).

7 AFU Selector Software Programming Model


Having multiple Bridge and Application AFUs is supported via the AFU Selector Programming Model. Its characteristics are:

- The AFU Selector command specifies the target AFU and the number of data words being sent to that AFU.
- The Bridge FPGA implements multiple AFUs to distinguish between the AppA and AppB FPGAs. A Bridge AFU Selector command is required when sending any payload data to the AHM.
- Multiple AFU support is optional on the Application FPGA. The XDI reference design implements multiple Application AFUs to provide various test and demonstration capabilities.
- The AFU Selector is 64 bytes. The lower 8 bytes of the Selector specify the AFU Selector opcode, the target AFU and the count of 32-byte payload data words being sent to the AFU. The upper 56 bytes of the Selector word are Don't Care values and are ignored.
- After <count> data payload words have been transmitted to the AFU, the Selector state machine monitors the input data stream looking for the next valid AFU Selector command.
- The Multiple AFU controller discards the entire AFU Selector before sending the data to the targeted AFU (the AFU does NOT see the Selector).
- The Selector count field indicates the number of payload data words forwarded to the AFU (count does NOT include the Selector word).
- The Selector count value is the number of 32-byte words.
- A count value of 0 is valid.
- An AFU Selector must account for all data sent to the AHM. System behavior is undefined if any unexpected data is transmitted.
- The amount of receive data is not specified in the Selector. The user must ensure all receive data is written to system memory before switching AFUs.
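The framing rules above can be illustrated with a small host-side model. The following is only a sketch, not the Bridge logic: the names Word32, Routed and walkStream are hypothetical, and it assumes the 64-byte Selector travels as two 32-byte words with the opcode, AFU identifier and count in bits [63:0] of the first word, as the simulation trace in this chapter suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 32-byte stream word as four 64-bit lanes; lane[0] holds bits [63:0].
struct Word32 { std::uint64_t lane[4]; };

// One routing decision: which AFU receives how many payload words.
struct Routed { std::uint8_t afuId; std::size_t payloadWords; };

constexpr std::uint8_t kSelectOpcode = 0xB1;  // "Select target AFU"

// Walks a send stream laid out as: Selector (2 words), <count> payload
// words, next Selector, and so on. Returns false when the stream holds
// data that no Selector accounts for.
bool walkStream(const std::vector<Word32>& stream, std::vector<Routed>& out) {
    std::size_t i = 0;
    while (i < stream.size()) {
        if (stream.size() - i < 2) return false;            // truncated Selector
        const std::uint64_t low = stream[i].lane[0];
        const auto opcode = static_cast<std::uint8_t>((low >> 40) & 0xFF);
        const auto afuId  = static_cast<std::uint8_t>((low >> 32) & 0xFF);
        const auto count  = static_cast<std::uint32_t>(low & 0xFFFFFFFFu);
        if (opcode != kSelectOpcode) return false;          // not a Selector
        i += 2;                                             // Selector is discarded
        if (stream.size() - i < count) return false;        // payload too short
        out.push_back(Routed{afuId, count});
        i += count;                                         // skip payload words
    }
    return true;
}
```

A count of 0 is accepted by this model, so two Selectors may follow each other directly.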


7.1 Bridge FPGA AFU Selector

Bridge FPGA AFU Selector

Bits       Field    Function                            Value Range
[511:64]   Pad      Pad                                 Don't care
[63:48]    Pad0     Pad                                 Don't care
[47:40]    OpCode   OpCode                              0xB1 : Select target AFU
[39:32]    AfuId    Target AFU Identifier               0xC0 : Download AFU
                                                        0xC1 : Sequencer AFU
                                                        0xC2 : AppA AFU
                                                        0xC3 : AppB AFU
[31:0]     Count    Number of 32 byte words following   0 to 2^32-1
                    the AFU Selector
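As an illustration of the field layout, the sketch below packs a Bridge AFU Selector. The names BridgeAfu, selectorLow64 and makeSelector are hypothetical and are not part of the reference design software; the Don't Care pad bits are simply written as 0, and the array index is assumed to match the 64-bit lane number.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Bridge AFU identifiers for bits [39:32].
enum class BridgeAfu : std::uint8_t {
    Download = 0xC0, Sequencer = 0xC1, AppA = 0xC2, AppB = 0xC3
};

constexpr std::uint64_t kOpSelect = 0xB1;  // OpCode, bits [47:40]

// Packs bits [63:0] of the Selector; Pad0 [63:48] is left as 0.
constexpr std::uint64_t selectorLow64(BridgeAfu afu, std::uint32_t count) {
    return (kOpSelect << 40)
         | (static_cast<std::uint64_t>(afu) << 32)
         | static_cast<std::uint64_t>(count);
}

// Full 64-byte Selector as eight 64-bit lanes; lanes 1..7 are padding.
inline std::array<std::uint64_t, 8> makeSelector(BridgeAfu afu, std::uint32_t count) {
    std::array<std::uint64_t, 8> sel{};   // zero-filled Don't Care bits
    sel[0] = selectorLow64(afu, count);
    return sel;
}

// Usage example: select the Download AFU and announce 2 payload words.
inline void exampleSelector() {
    assert(selectorLow64(BridgeAfu::Download, 2) == 0x0000B1C000000002ULL);
}
```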

7.2 Application FPGA AFU Selector

Application FPGA AFU Selector (using XDI reference design)

Bits       Field    Function                            Value Range
[511:64]   Pad      Pad                                 Don't care
[63:48]    Pad0     Pad                                 Don't care
[47:40]    OpCode   OpCode                              0xB1 : Select target AFU
[39:32]    AfuId    Target AFU Identifier               0xC0 : Test port AFU
                                                        0xC1 : Register file AFU
                                                        0xC2 : Sequencer AFU
                                                        0xC3 : Loopback AFU
                                                        0xC4 : Chain loopback AFU
[31:0]     Count    Number of 32 byte words following   0 to 2^32-1
                    the AFU Selector

The following sequence sends 2 words to the FSB Download AFU and then 2 words to the AppA AFU.
# KERNEL: afu_ingress (write to afu) : 000000000000EEE3000000000000EEE2000000000000EEE1AAA1B1C000000002 @ 245 ns // Bridge Selector Word 1 - Select afu_selector_opcode = B1h, fsb download AFU C0h, send 2h words
# KERNEL: afu_ingress (write to afu) : 000000000000EEE7000000000000EEE6000000000000EEE5000000000000EEE4 @ 275 ns // Bridge Selector Word 2 - Ignored
# KERNEL: afu_ingress (write to afu) : AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA00008B01 @ 285 ns // Data word 1 to fsb download AFU
# KERNEL: afu_ingress (write to afu) : AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA00008902 @ 295 ns // Data word 2 to fsb download AFU
# KERNEL: afu_ingress (write to afu) : 000000000000EEE3000000000000EEE2000000000000EEE1AAA1B1C200000002 @ 445 ns // AppA Selector Word 1 - Select afu_selector_opcode = B1h, AppA AFU C2h, send 2h words
# KERNEL: afu_ingress (write to afu) : 000000000000EEE7000000000000EEE6000000000000EEE5000000000000EEE4 @ 475 ns // AppA Selector Word 2 - Ignored
# KERNEL: afu_ingress (write to afu) : AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA00008B03 @ 485 ns // Data word 1 to AppA
# KERNEL: afu_ingress (write to afu) : AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA00008904 @ 495 ns // Data word 2 to AppA
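When reading a trace like this one, it can help to decode the Selector fields mechanically. The sketch below is a hypothetical helper (decodeSelectorWord and SelectorFields are not names from the reference design); it assumes the hex string is printed most significant digit first, so the last 16 digits are bits [63:0] of the first Selector word.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct SelectorFields {
    std::uint8_t  opcode;  // bits [47:40]
    std::uint8_t  afuId;   // bits [39:32]
    std::uint32_t count;   // bits [31:0]
};

// Decodes Selector Word 1 from the hex text shown in the simulation log.
inline SelectorFields decodeSelectorWord(const std::string& hex) {
    assert(hex.size() >= 16);  // need at least the low 64 bits
    const std::uint64_t low =
        std::stoull(hex.substr(hex.size() - 16), nullptr, 16);
    return SelectorFields{
        static_cast<std::uint8_t>((low >> 40) & 0xFF),
        static_cast<std::uint8_t>((low >> 32) & 0xFF),
        static_cast<std::uint32_t>(low & 0xFFFFFFFFu)
    };
}
```

Applied to the word logged at 245 ns, this yields opcode B1h, AFU identifier C0h and a count of 2, matching the trace comment.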

7.3 Application FPGA Control Register on Bridge

The Bridge contains a 256-bit write-only register capable of resetting each Application FPGA. This register may only be written when all previous AHM transfers have completed and the Application FPGA is not generating or receiving any data. An appropriate Bridge AFU Selector must precede sending the register write command.

Application FPGA Control Register (write only)
Bridge AFU Selector AfuId = 0xC0 : Download AFU

Bits       Field       Function            Write Value Range
[255:16]   Reserved2   Reserved            0
[15]       Opcode      Read or write_n     0
[14]       Reserved1   Reserved            0
[13:8]     Addr        Register address    0x21
[7:2]      Reserved0   Reserved            0x0
[1]        ResetB      Reset AppB FPGA     1 : Reset AppB
                                           0 : Do not reset AppB
[0]        ResetA      Reset AppA FPGA     1 : Reset AppA
                                           0 : Do not reset AppA
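A minimal sketch of composing the register write word from the field table follows. The names appResetCommand and makeControlWord are hypothetical, and the 256-bit word is modelled as four 64-bit lanes with lane 0 holding bits [63:0]; the preceding Bridge AFU Selector is not shown.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned kControlRegAddr = 0x21;  // Addr field, bits [13:8]

// Bits [15:0] of the write command: Opcode [15] = 0 (write),
// Reserved1 [14] = 0, Addr [13:8] = 0x21, ResetB [1], ResetA [0].
constexpr std::uint16_t appResetCommand(bool resetA, bool resetB) {
    return static_cast<std::uint16_t>(
        (kControlRegAddr << 8) | (resetB ? 0x2u : 0x0u) | (resetA ? 0x1u : 0x0u));
}

// Full 256-bit write word; Reserved2 [255:16] is written as 0.
inline std::array<std::uint64_t, 4> makeControlWord(bool resetA, bool resetB) {
    std::array<std::uint64_t, 4> word{};
    word[0] = appResetCommand(resetA, resetB);
    return word;
}

// Usage example: request a reset of both Application FPGAs.
inline void exampleResetBoth() {
    assert(appResetCommand(true, true) == 0x2103);
}
```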


7.4 Application FPGA Status Register

The Bridge contains a 256-bit read-only register that returns the reset status of the Application FPGAs. This register may also be used to ensure that all data transactions send and receive a positive multiple of 64 bytes. Accessing this register adds 32 bytes to the send and receive data streams. An appropriate Bridge AFU Selector must precede sending the register read command.

App FPGA Status Register (read-only)
Bridge AFU Selector AfuId = 0xC0 : Download AFU

Bits       Field       Function            Read Value Range
[255:16]   Reserved2   Reserved            Undefined
[15]       Opcode      Read or write_n     Undefined
[14]       Reserved1   Reserved            Undefined
[13:8]     Addr        Register address    Undefined
[7:2]      Reserved0   Reserved            Undefined
[1]        ResetB      Reset AppB FPGA     1 : AppB is reset
                                           0 : AppB is not reset
[0]        ResetA      Reset AppA FPGA     1 : AppA is reset
                                           0 : AppA is not reset
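Because every bit other than ResetB and ResetA reads back as Undefined, host code should mask before testing. The following sketch shows one way to do that; AppResetStatus and decodeStatus are hypothetical names, and the argument is assumed to be bits [63:0] of the 32-byte word returned by the read.

```cpp
#include <cassert>
#include <cstdint>

struct AppResetStatus {
    bool appAInReset;  // bit [0], ResetA
    bool appBInReset;  // bit [1], ResetB
};

// Extracts the two defined status bits and ignores everything else.
constexpr AppResetStatus decodeStatus(std::uint64_t low64) {
    return AppResetStatus{(low64 & 0x1u) != 0, (low64 & 0x2u) != 0};
}

// Usage example: undefined upper bits must not influence the result.
inline void exampleStatus() {
    assert(decodeStatus(0xFFFFFFFFFFFFFFFDULL).appAInReset);
    assert(!decodeStatus(0xFFFFFFFFFFFFFFFDULL).appBInReset);
}
```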

7.5 Bridge Null Write Register

The Bridge contains a 256-bit register that may be written without otherwise affecting the system. This register may be used to ensure that all data transactions send and receive a positive multiple of 64 bytes. Writing this register adds 32 bytes to the send data stream. An appropriate Bridge AFU Selector must precede sending the register write command.

Bridge Null Write Register
Bridge AFU Selector AfuId = 0xC0 : Download AFU

Bits       Field       Function            Write Value Range
[255:16]   Reserved2   Reserved            0
[15]       Opcode      Read or write_n     0
[14]       Reserved1   Reserved            0
[13:8]     Addr        Register address    0x10
[7:0]      Reserved0   Reserved            0x0


7.6 Send / Receive Alignment Pad Sequence

The following sequence ensures that send and receive transfer sizes are positive multiples of 64 bytes. Both counts are in 32-byte words, so a count is acceptable when it is even and at least 2.

bool isOK(count32Bytes) { return ((count32Bytes >= 2) && !(count32Bytes & 1)); }

Add Bridge AFU Selector                                   // adds 2 send words
if (!isOK(dstCount)) add read AppFpgaStatusRegister       // adds 1 send + 1 receive word
if (!isOK(dstCount)) add read AppFpgaStatusRegister       // call again in case original receive count was 0; adds 1 send + 1 receive word
if (!isOK(srcCount)) add write BridgeNullWriteRegister    // adds 1 send word
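The pad sequence can be checked with a small counting model. This is a sketch under stated assumptions: PadPlan and planPad are hypothetical names, counts are in 32-byte words, and a pad access is assumed to be added only while the corresponding count is not yet an even value of at least 2.

```cpp
#include <cassert>
#include <cstdint>

struct PadPlan {
    unsigned      statusReads;  // reads of the App FPGA Status Register
    unsigned      nullWrites;   // writes of the Bridge Null Write Register
    std::uint64_t srcCount;     // final send count, in 32-byte words
    std::uint64_t dstCount;     // final receive count, in 32-byte words
};

inline bool isOK(std::uint64_t count32Bytes) {
    return count32Bytes >= 2 && (count32Bytes & 1) == 0;
}

// Computes the pad accesses needed for the given send/receive counts.
inline PadPlan planPad(std::uint64_t srcCount, std::uint64_t dstCount) {
    PadPlan p{0, 0, srcCount + 2, dstCount};  // Bridge AFU Selector: 2 send words
    for (int pass = 0; pass < 2; ++pass) {    // second pass covers dstCount == 0
        if (!isOK(p.dstCount)) {
            ++p.statusReads;                  // 1 send word + 1 receive word
            ++p.srcCount;
            ++p.dstCount;
        }
    }
    if (!isOK(p.srcCount)) {
        ++p.nullWrites;                       // 1 send word
        ++p.srcCount;
    }
    assert(isOK(p.srcCount) && isOK(p.dstCount));
    return p;
}
```

For example, 3 send words and no receive words result in two status reads and one null write, giving 8 send words and 2 receive words.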


8 Software Notes
Other important points related to system operation:

- If data is sent to an Application FPGA before the FPGA has been configured, the AHM will hang and a soft reset is required for system recovery.
- Application configuration files, .rbf bit file images, are not checked for correctness before attempting to load them into an FPGA.
- The device driver provides asynchronous calls for sending data and for waiting for completion status. The reference design combines these two calls into a single, synchronous function. If asynchronous operation is desired, the start_sync() function found in file ahmInstance.h may be modified. The application must still wait for the receive completion before sending a new Workspace Command.


9 Revision History
Revision   Date         Summary of Changes
1.0.2      June 2009    Updated Section 3.5.1.4.1 and added Section 3.5.1.8. Described change in Reference Design done to create a synchronous reset.
1.0.0      April 2009   Initial Release
