NDA required
Agenda
Multi-core chips high level overview
Multi-core programming
  Memory consideration
  Inter-core communication
  Multi-core arbitration
  Peripherals consideration
Image creation
THREE C64X+ DSP CORES @ 1+ GHZ
16/32-bit ISA, doubled MPY vs C64x core
RSA instruction set extension for CR processing (downlink & uplink)
65 nm process
MEMORY
32 kB L1 program memory
32 kB L1 data memory
3 MB of total L2 memory (2 configurations): 1 MB / 1 MB / 1 MB or 1.5 MB / 1 MB / 0.5 MB
Boot ROM
DDR2-667 MHz, 32-bit
COMMUNICATIONS SUBSYSTEM
2x sRIO (1x links)
SGMII Gigabit Ethernet
Antenna interface supporting OBSAI / CPRI, 6 links
ACCELERATION
VCP2, TCP2
Receive accelerator (RAC)
561 BALLS, 23x23 MM FC-BGA
FC 5 rows + 11x11 center array
OTHERS
IP security, lead-free and green
[Block diagram: C64x+ cores with RSA, L1 data and L1 program memory; three L2 memory blocks; McBSP; antenna interface; DDR2 interface; sRIO; 10/100/1G Ethernet]
Programming Considerations
Programming model
Shared image: The programmer needs to determine whether aliased addressing is appropriate. If so, the code still needs to assign pointers to memory using the global address for any data transfers (aside from internal DMA performed within a single core's memory).
Non-shared images: Only global addresses should be used. There is no advantage to aliased addressing.
Inter-core communication
Discrete events: INTGEN peripheral.
Message passing: Direct writes to memory, or DMA transfer. Can implement a polling or interrupt-driven protocol (DSP/BIOS MSGQ available).
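As an illustration of the polling variant of message passing by direct memory writes, the following is a minimal sketch of a one-slot mailbox. The names mailbox_t, mailbox_send, and mailbox_poll are hypothetical and are not part of DSP/BIOS MSGQ or any TI library. The sketch assumes the mailbox lives in memory that both cores address through a global address; on real hardware, cache write-back/invalidate and memory ordering would also need to be handled, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

#define MSG_WORDS 4u  /* hypothetical fixed message size, in 32-bit words */

/* Hypothetical one-slot mailbox placed in memory visible to both cores. */
typedef struct {
    volatile uint32_t full;               /* 0 = empty, 1 = message waiting */
    volatile uint32_t payload[MSG_WORDS];
} mailbox_t;

/* Sender side: returns 1 if the message was posted,
 * 0 if the previous message has not been consumed yet. */
int mailbox_send(mailbox_t *mb, const uint32_t *msg)
{
    uint32_t i;
    assert(mb != 0 && msg != 0);
    if (mb->full) {
        return 0;
    }
    for (i = 0; i < MSG_WORDS; i++) {
        mb->payload[i] = msg[i];
    }
    mb->full = 1;  /* publish only after the payload has been written */
    return 1;
}

/* Receiver side: polls once; returns 1 and copies the message out
 * if one is waiting, 0 otherwise. */
int mailbox_poll(mailbox_t *mb, uint32_t *out)
{
    uint32_t i;
    assert(mb != 0 && out != 0);
    if (!mb->full) {
        return 0;
    }
    for (i = 0; i < MSG_WORDS; i++) {
        out[i] = mb->payload[i];
    }
    mb->full = 0;  /* hand the slot back to the sender */
    return 1;
}
```

The receiving core would call mailbox_poll from its main loop. An interrupt-driven protocol would keep the same payload layout but replace the polling of the flag with a discrete event.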
Aliased addressing allows common code to be run unmodified on multiple cores. It is not beneficial for un-shared code.
Core number
Software can verify the core on which it is running through a register (DNUM) that holds the DSP core number (0, 1, or 2).
The core number can be used during run-time to conditionally execute code, update pointers, create a global address, etc.
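Creating a global address from the core number can be sketched as follows. The address constants are assumptions for illustration only (an aliased L2 window at 0x00800000 and per-core global L2 windows spaced 0x01000000 apart starting at 0x10800000) and must be checked against the device data manual. On the target the core number would be read from DNUM; here it is passed as a parameter so that the sketch also compiles on a host.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED memory map values, for illustration only. */
#define NUM_CORES        3u
#define LOCAL_L2_BASE    0x00800000u  /* aliased (local) L2 base, assumed */
#define LOCAL_L2_SIZE    0x00180000u  /* 1.5 MB, the largest L2 configuration */
#define GLOBAL_L2_BASE   0x10800000u  /* global L2 base of core 0, assumed */
#define GLOBAL_L2_STRIDE 0x01000000u  /* spacing between cores, assumed */

/* Convert an address to its global form for the given core.
 * Addresses outside the aliased L2 window are returned unchanged. */
uint32_t to_global_address(uint32_t addr, uint32_t core_num)
{
    assert(core_num < NUM_CORES);
    if (addr >= LOCAL_L2_BASE && addr < LOCAL_L2_BASE + LOCAL_L2_SIZE) {
        return GLOBAL_L2_BASE + core_num * GLOBAL_L2_STRIDE
               + (addr - LOCAL_L2_BASE);
    }
    return addr;
}
```

Shared code would call this before handing a buffer pointer to a DMA engine or to another core, so that the same image produces a different, unambiguous address on each core.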
EDMA
Main inter-core data transaction engine
Shared memory
Inter-DSP Interrupts
Multi-Channel Peripherals
These peripherals allow resources to be allocated to the cores and orthogonally controlled without software handshaking prior to accesses. Examples of these multi-channel peripherals are:
EDMA
64 channels and 256 parameter RAM entries can be separated by software into regions, with each region assigned to a core.
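One way to picture the split of the 64 channels into per-core regions is as a set of disjoint ownership bitmasks. The sketch below is only an illustration of that bookkeeping: the even, contiguous split and the function name region_channel_mask are assumptions, and no actual EDMA register is programmed here.

```c
#include <assert.h>
#include <stdint.h>

#define EDMA_CHANNELS 64u  /* channel count stated above */

/* Return a 64-bit mask with one bit set for each channel owned by
 * `region`, when the channels are split into `num_regions` contiguous,
 * near-equal ranges (a hypothetical allocation policy). */
uint64_t region_channel_mask(unsigned region, unsigned num_regions)
{
    unsigned first, end, ch;
    uint64_t mask = 0;

    assert(num_regions > 0u && region < num_regions);
    first = region * EDMA_CHANNELS / num_regions;
    end = (region + 1u) * EDMA_CHANNELS / num_regions;
    for (ch = first; ch < end; ch++) {
        mask |= (uint64_t)1 << ch;
    }
    return mask;
}
```

With three regions, the masks are pairwise disjoint and together cover all 64 channels, so each core can touch its own channels without coordinating with the others.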
EMAC
Eight receive and eight transmit DMA channels assigned by software. Received packets are transferred to a core based on MAC address routing assigned to a channel. Transmit packets are transferred from a core based on a core-defined list.
SRIO
Eight receive and eight transmit DMA channels assigned by software. Received packets are transferred to a core based on address routing assigned to a channel. Transmit packets are transferred from a core based on a core-defined list.
AIF
Six inbound and six outbound links, with multiple EDMA channels assigned by software.
INTGEN
The interrupt generation logic, used for discrete signaling between cores, is designed to allow orthogonal event assertion and clearing by each core. Control registers are established per receiver, and multiple senders can assert events concurrently.
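The per-receiver arrangement can be modeled in software as one pending-event word per receiving core, with one bit per sender. This is a host-side model for illustration only: the type event_reg_t and the functions below are hypothetical, they do not correspond to the real INTGEN register layout, and the read-modify-write here is not atomic the way a hardware set/clear register would be.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES 3u

/* Hypothetical model: one pending-event word per receiving core,
 * bit n set means "core n has asserted an event". */
typedef struct {
    volatile uint32_t pending;
} event_reg_t;

/* Sender side: assert an event toward `receiver` on behalf of `sender`. */
void event_assert(event_reg_t *receiver, unsigned sender)
{
    assert(receiver != 0 && sender < NUM_CORES);
    receiver->pending |= (uint32_t)1 << sender;
}

/* Receiver side: read which senders currently have events pending. */
uint32_t event_pending(const event_reg_t *receiver)
{
    assert(receiver != 0);
    return receiver->pending;
}

/* Receiver side: clear only the events named in `mask`,
 * leaving events from other senders untouched. */
void event_clear(event_reg_t *receiver, uint32_t mask)
{
    assert(receiver != 0);
    receiver->pending &= ~mask;
}
```

Because each sender owns its own bit and each receiver owns its own word, two senders signaling the same core do not overwrite each other, and clearing one event does not lose another.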
GPIO
Multiple GPIO pins can be separated by software.
Single-Channel Peripherals
I2C
Typically used during boot, system setup, or board monitoring, the I2C should be serviced by a single core. If shared tables/resources are accessed through I2C it would be much faster to first copy the data to DSP memory and share from there. The I2C can be serviced by direct CPU accesses or EDMA.
Timer64
There are multiple timers on the chip. Typically these are individually allocated to single cores, allowing the owning core to control it without arbitrating.
The default configuration of the chip is for a single image.
BIOS code and read-only data should be placed into shared memory.
.hwi_vec will default to LL2 memory (it can be modified during runtime).
The sections .gblinit, .switch, .cinit, .pinit, and .const will default to shared memory.
All other data sections will default to L2 memory.
The user can load and run the app on all cores synchronously with Parallel Debug Manager (Simulator).
The user can also load and run the app on each individual core (Simulator).
Sections located in aliased memory will automatically be replicated across the cores' memory.
When the app has finished loading, all cores can be released from reset.
[Diagram: three cores, each with L1 data and L1 program memory. If using CCS: App.out shown for each core. If using Bootloader: a single App.out, DDR2 memory.]
[Diagram: three cores, each with L1 data, L1 program, and L2 memory, loaded with App0.out, App1.out, and App2.out; DDR2 memory]
Each core will be loaded with its own app. Each app needs to manage its usage of memory and make sure it doesn't collide with any other app.
If using CCS
Open and load each core with its app (Simulator).
Use Parallel Debug Manager to run all cores synchronously, or open up each core to run them asynchronously (Simulator).
If using Bootloader
Load each core with its app.
Take each core out of reset.
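Since each app must make sure its memory usage does not collide with any other app, a simple overlap check over the regions each app claims in shared or external memory can catch mistakes before images are loaded. The following is a minimal sketch; the mem_region_t type and the function names are hypothetical, and the region list would in practice come from each app's linker map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory range claimed by an app. */
typedef struct {
    const char *owner;  /* e.g. "App0.out" */
    uint32_t base;      /* global start address */
    uint32_t size;      /* length in bytes */
} mem_region_t;

/* Return 1 if the two half-open ranges [base, base + size) intersect. */
int regions_overlap(const mem_region_t *a, const mem_region_t *b)
{
    uint64_t a_end, b_end;
    assert(a != 0 && b != 0);
    a_end = (uint64_t)a->base + a->size;  /* 64-bit to avoid wrap-around */
    b_end = (uint64_t)b->base + b->size;
    return a->base < b_end && b->base < a_end;
}

/* Return the index of the first region that collides with an earlier
 * region in the list, or -1 if all regions are disjoint. */
int find_collision(const mem_region_t *regions, size_t count)
{
    size_t i, j;
    assert(regions != 0 || count == 0);
    for (i = 1; i < count; i++) {
        for (j = 0; j < i; j++) {
            if (regions_overlap(&regions[i], &regions[j])) {
                return (int)i;
            }
        }
    }
    return -1;
}
```

Running such a check as part of the build, over the ranges taken from App0.out, App1.out, and App2.out, would flag a collision before any core is taken out of reset.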
[Diagram: three cores, each with L1 data and L2 memory; labels: If using CCS, App1.out, App2.out]
If using Bootloader
Load the partial link image (if not loaded with app).
Load each core with its app.
Release each core from reset.
Note: The partial link image could be loaded once if it is not included in the load of the apps; otherwise it would be loaded multiple times (once for each app loaded on each core).
Device Boot
Regardless of the number of .out files created, a single boot table should be generated for the final image to be loaded in the end system. The boot sequence is controlled by Core 0.
After device reset, Core 0 is responsible for releasing all cores from reset after the boot image is loaded into the device.
Details on the boot loader are available in TI user guide SPRUEA7, TMS320TCI648x DSP Bootloader.
[Diagram: Core0.out and Core0.rmd are processed by Hex6x into Core0.btbl; Core0.btbl, Core1.btbl, and Core2.btbl are combined by MERGEBTBL into DspCode.btbl]
Q&A