
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

1. Memory Organisation
Random access model
- A byte, a word, a double word, or a quad word may be accessed at any addressable location. The same process is used to access every location, and the access time for a read or a write is the same regardless of the memory address.
- This model differs from the other model, called the serial access model.

Addresses
- Memory (both RAM and ROM) is divided into a set of storage locations, each of which can hold 1 byte (8 bits) of data.
- The storage locations are numbered, and the number of a storage location (called its address) tells the memory system which location the processor wants to reference.
- An important characteristic of a computer system is the width of the addresses it uses, which limits the amount of memory that the processor can address. Most current computers use either 32-bit or 64-bit addresses, allowing them to access up to 2^32 or 2^64 bytes of memory.

RANDOM ACCESS MODEL OF MEMORY
- Simple model for RAM and ROM.
- Both follow the random-access model of memory.
- All memory operations take the same amount of time, independent of the address of the byte or word in memory.

Example
- Assume that the memory system supports two operations: load (a read from memory into the processor) and store (a write from the processor into memory).
- A load from one set of addresses (2 or 4) takes the same time as a store to another set of addresses (2 or 4).

ROM
- The contents of read-only memory cannot be modified by the computer but may be read.
- A system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS) program(s), and the vector addresses for the interrupts.

- ROM is used to hold the bootstrap program that is executed automatically every time the system is turned on or reset. The bootstrap program instructs the system to load its operating system from the ROM image.
- The ROM image holds the programs, operating system, and data required by the system.

Random-access memory (RAM)
- Can be both read and written.
- Holds the programs, operating system, and data required by the system.
- Generally volatile, meaning that it does not retain the data stored in it when the system's power is turned off.
- Data that needs to be kept while the system is off must be written to a permanent storage device, such as a flash memory or hard disk.
- Example: a mobile phone has 128 kB or 256 kB of RAM to hold the stack and temporary variables of the programs, operating system, and data.

ALIGNMENT OF MULTIBYTE STORE AND LOAD IN A MEMORY ORGANISATION
- Some memory organisations require loads and stores to be "aligned". For example, a 4-byte word is aligned when it is placed at an address such as 0x000C or 0x1000, which is a multiple of 4.
- This simplifies the organisation of the memory system.

LITTLE ENDIAN AND BIG ENDIAN IN A MEMORY ORGANISATION
- Some processor and memory organisations require little-endian, and others big-endian, ordering of multiple bytes when storing into memory or loading into the processor from memory.
- The ARM processor lets the programmer select either little-endian or big-endian ordering at start-up.

Princeton architecture
- 80x86 processors and ARM7 have Princeton architecture for main memory. (8051-family microcontrollers have Harvard architecture.)
- Vectors and pointers, variables, program segments, and memory blocks for data and stacks have different addresses in the program in Princeton memory architecture.

Harvard architecture
- The address spaces for data and for the program are distinct.
- Suited to handling streams of data that must be accessed by single-instruction multiple-data (SIMD) type instructions and DSP instructions.
- Separate data buses allow simultaneous accesses for instructions and data.

[Figure: Harvard and Princeton memory organisations]
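The address-width limit, the alignment rule, and the two byte orders described in this lesson can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the standard `struct` module stands in for the memory system.

```python
import struct

def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

def is_aligned(address: int, size: int) -> bool:
    """True when the address is a multiple of the access size in bytes."""
    return address % size == 0

def store_word(value: int, byte_order: str) -> bytes:
    """Bytes of a 32-bit word as laid out in memory, lowest address first."""
    fmt = "<I" if byte_order == "little" else ">I"
    return struct.pack(fmt, value)

print(addressable_bytes(32))                    # size of a 32-bit address space
print(is_aligned(0x000C, 4), is_aligned(0x1002, 4))
print(store_word(0x12345678, "little").hex())   # least significant byte first
print(store_word(0x12345678, "big").hex())      # most significant byte first
```

With little-endian ordering the least significant byte sits at the lowest address; with big-endian ordering the most significant byte does.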

2. Types of Memory
- Most systems have two types of memory: read-only memory (ROM) and random-access memory (RAM).
- A computer system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS) program(s), and the vector addresses for the interrupts.
- An embedded system has ROM unit(s) for storing the ROM image, and flash to save non-volatile data and results.

ROM Uses
- Language-specific bits for the fonts corresponding to each character, sent to a printer or display unit.
- Image bits for a display.
- Pictogram bytes for the full bit-image corresponding to the pixels of a pictogram. Sequential changes at the inputs of the display unit repeatedly generate the full pictogram.
- In a CISC, as the control ROM of a micro-programmed unit for implementing instructions.

ROM Types
1) Masked ROM: used for large-scale manufacturing; a mask is prepared for the foundry from a finalised ROM image of the system program and data, pictograms, image pixels, pixels for the fonts of a language, or combinational circuits implementing a truth table.
2) EPROM: used in place of masked ROM during the development phase; UV erasable and electrically programmable by a device programmer.
3) EEPROM (E2PROM): used during the program run to save non-volatile data and results (for example, date and time of a transaction, present port status, port driving history, system malfunction history); electrically erasable by writing a byte or a set of bytes with all 1s, and electrically programmable during a program run, one byte per write.
4) Flash: a flash memory functions as the ROM. A sector of 16 kB to 256 kB is electrically erased at one instance, and the memory is electrically programmable one byte at a time during a program run.
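The erase-by-sector, program-by-byte behaviour of flash can be illustrated with a small simulation. This is a sketch, not a device driver: the class `FlashSim` and its method names are hypothetical, and the rule that programming can only clear bits (1 to 0) is an assumption based on typical flash devices rather than something stated in these notes.

```python
class FlashSim:
    """Toy model of a flash memory: erase a whole sector, program one byte."""
    ERASED = 0xFF  # an erased cell reads as all 1s

    def __init__(self, sector_size: int = 16 * 1024, sectors: int = 4):
        self.sector_size = sector_size
        self.cells = bytearray([self.ERASED]) * (sector_size * sectors)

    def erase_sector(self, index: int) -> None:
        """Set every byte of one sector back to all 1s."""
        start = index * self.sector_size
        self.cells[start:start + self.sector_size] = (
            bytes([self.ERASED]) * self.sector_size
        )

    def program_byte(self, address: int, value: int) -> None:
        """Program one byte; assumed to clear bits only (1 -> 0)."""
        self.cells[address] &= value


flash = FlashSim(sector_size=16, sectors=2)   # tiny sectors for the demo
flash.program_byte(3, 0x5A)
flash.program_byte(3, 0xF0)                   # second write cannot set bits back
print(hex(flash.cells[3]))
flash.erase_sector(0)                         # only an erase restores the 1s
print(hex(flash.cells[3]))
```

Under this assumption, updating a value that is already programmed requires erasing the whole sector first, which is why flash suits data that changes rarely.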

RAM
- The RAM can be both read and written, and is used to hold the programs, operating system, and data required by a computer system. In embedded systems, it holds the stack and temporary variables of the programs, operating system, and data.

RAM Characteristics
- RAM is generally volatile: it does not retain the data stored in it when the system's power is turned off.
- Any data that needs to be kept while the system is off must be written to a permanent storage device, such as a flash memory or hard disk.
- Example: a mobile phone has 128 kB or 256 kB of RAM to hold the stack and temporary variables of the programs, operating system, and data.

RAM Types
1) SRAM (static RAM) and DRAM (dynamic RAM): used for saving the variables, stacks, process control blocks, input buffer, output buffer, and the decompressed format of the program and data from the ROM image.
2) EDO (Extended Data Out) RAM: used up to 100 MHz clock rate; zero wait state between two fetches; single-cycle read or write.
3) SDRAM (Synchronous DRAM): synchronised read operation; keeps the next word ready while the previous one is being fetched; used up to 1 GHz clock rate.
4) RDRAM (Rambus DRAM; Rambus is the name of the developer company): burst accesses of four successive words in a single fetch; used for 1 GHz+ system performance.
5) Parameterised distributed RAM: used when slow bus accesses exist; RAM is distributed for specific tasks of the system and devices, for example fast I/O buffers and fast stacks.
6) Parameterised block RAM: a specific block dedicated to a specific use, for example DCT operations.

Summary
We learnt:
- Random access memory model, ROM, RAM
- Addresses
- Data alignment
- Little and big endian
- Flash
- Princeton and Harvard architectures

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-16: Processor organisation and Performance Metrics

1. Processor, Memory and buses
Processor Organisation

Processor
- ALU.
- The processor circuit performs sequential operations, and a clock guides these.
- Program counter and stack pointer, which point to the instruction to be fetched and to the top of the data pushed onto the stack, respectively.
- Certain processors have an on-chip memory management unit (MMU).

Registers
- General-purpose registers.
- Registers are organised onto a common internal bus of the processor. A register is 32, 16, or 8 bits wide, depending on whether the ALU performs a 32-, 16-, or 8-bit operation at an instance.

CISC
- A processor may have a CISC (Complex Instruction Set Computer) or a RISC (Reduced Instruction Set Computer) architecture, and the choice may affect the system design.
- CISC can process complex instructions and complex data sets with fewer registers, as it provides a large number of addressing modes.

RISC
- Simpler instructions, each executed in a single cycle.
- New RISC processors, such as ARM7 and ARM9, also provide a few of the most useful CISC instructions.
- CISC converges to a RISC implementation because most instructions are hardwired and execute in a single clock cycle.

Interrupts
- The processor provides inputs for external interrupts so that external circuits can send interrupt signals.
- It may possess an internal interrupt controller (handler) to program the service routine priorities and to allocate vector addresses.

DMA (Direct Memory Access) Controller
- External devices can directly write to and read from blocks of RAM using the DMA controller when the buses are not in use by the processor.
- Multiple DMA channels on chip.
- When there are a number of I/O devices and an I/O device needs to access a multibyte data set fast, an on-chip DMA controller for the system memory helps greatly.

INSTRUCTION LEVEL PARALLELISM
- Several instructions execute in parallel.
- Two or more instructions execute in parallel as well as in a pipeline.
- For example, with two parallel pipelines in a processor, two instructions I(n) and I(n+1) execute in parallel at separate execution units.

3. Processor Performance Metrics
Metrics
1) MIPS: Million Instructions Per Second
2) MFLOPS: Million Floating Point Operations Per Second
3) Dhrystone/s: number of times a benchmark program called the Dhrystone program can run per second [1 MIPS = 1757 Dhrystone/s]

Embedded Microprocessor Benchmark Consortium (EEMBC) five benchmark program suites
- Telecommunications
- Consumer Electronics
- Automotive and Industrial Electronics
- Networking
- Office Automation
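The relation 1 MIPS = 1757 Dhrystone/s can be turned into a small conversion helper. The sketch below is illustrative only; the function names and the sample figures are assumptions, not measurements from any real processor.

```python
DHRYSTONES_PER_SECOND_PER_MIPS = 1757  # conversion factor quoted in the notes

def mips(instruction_count: int, seconds: float) -> float:
    """Million instructions per second for a measured run."""
    return instruction_count / (seconds * 1e6)

def dhrystone_mips(dhrystones_per_second: float) -> float:
    """Convert a Dhrystone/s score to the equivalent MIPS rating."""
    return dhrystones_per_second / DHRYSTONES_PER_SECOND_PER_MIPS

# Hypothetical figures, chosen only to exercise the formulas.
print(mips(50_000_000, 2.0))
print(dhrystone_mips(175_700))
```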

Summary
We learnt:
- Processor, address, data and control buses, and memory
- CISC and RISC
- Instruction level parallelism
- Performance metrics

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-18: Memory Allocations and Memory Map

1. Memory Allocation To Program Segments and Blocks
Functions, Processes, Data and Stacks at the Various Segments of Memory
- Segment-wise memory allocation in four segments: Code, Data, Stack, and Extra (for example, image, string).
- Segments and paging at the memory.

Different Data Structures at the Various Memory Blocks
1) Stacks: return addresses on nested calls; sets of LIFO (Last In First Out) retrievable data; saved contexts of the tasks as stacks.
2) Arrays: one-dimensional or multidimensional.
3) Queues: sets of FIFO (First In First Out) retrievable data; circular queue (example: a printer buffer); block queue (example: a network stack).
4) Table
5) Look-up table: the first column of a look-up-table row points to another memory block holding a data structure.
6) List: in a list element, the data structure of an item also points to the next item.
7) Process Control Block [Refer Chapter 7 Lesson 1]
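A circular queue, such as the printer buffer mentioned in the list above, stores FIFO data in a fixed block of memory by wrapping its indices around. The following is a minimal sketch; the class name, method names, and the choice to reject writes when full are assumptions made for illustration.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue whose indices wrap around the buffer."""

    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.head = 0    # index of the oldest item
        self.count = 0   # number of items currently stored

    def put(self, item) -> bool:
        """Append an item; returns False when the buffer is full."""
        if self.count == len(self.buf):
            return False
        tail = (self.head + self.count) % len(self.buf)
        self.buf[tail] = item
        self.count += 1
        return True

    def get(self):
        """Remove and return the oldest item, or None when empty."""
        if self.count == 0:
            return None
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.count -= 1
        return item


queue = CircularQueue(3)
for page in ("page1", "page2", "page3"):
    queue.put(page)
print(queue.get())        # oldest item leaves first
queue.put("page4")        # reuses the slot freed by the get
```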

Memory Map
A map that shows the allocation of program and data addresses to ROM, RAM, EEPROM, or flash in the system.
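A memory map can be represented as a table of address ranges that is searched to find which device holds a given address. The ranges below are hypothetical, chosen only to add up to a 2 kB system; they are not taken from the smart-card figures in these notes.

```python
# Hypothetical 2 kB memory map: (device, first address, last address).
MEMORY_MAP = [
    ("ROM",    0x0000, 0x03FF),   # 1 kB program and constant data
    ("RAM",    0x0400, 0x05FF),   # 512 B stack and temporary variables
    ("EEPROM", 0x0600, 0x07FF),   # 512 B non-volatile data
]

def region_of(address: int):
    """Return the device that maps the address, or None if unmapped."""
    for name, first, last in MEMORY_MAP:
        if first <= address <= last:
            return name
    return None

print(region_of(0x0010), region_of(0x0450), region_of(0x0800))
```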

Memory map for an exemplary embedded system, smart card needing 2 kB memory

Memory map for an exemplary Java embedded card with software for encrypting and deciphering the transactions

Memory map sections in a smart card

Memory map sections in another smart card

Summary
We learnt:
- Allocations to various segments and data structures
- The memory map of exemplary cases

Chapter 6 Interfacing
- Basic communications terminology
- Communications protocols
- Microprocessor interfacing: I/O addressing
  - Port and bus-based I/O
  - Memory mapped I/O and Standard I/O
- Microprocessor interfacing: Interrupts
- Microprocessor interfacing: Direct memory access
- Arbitration
  - Priority arbiter
  - Daisy-chain arbitration
  - Network oriented arbitration
- Advanced communication principles
  - Serial / Parallel / Wireless communication
  - Error detection and correction
  - Serial protocols
  - Parallel protocols
  - Wireless protocols

Arbitration: Priority arbiter
- Consider the situation where multiple peripherals simultaneously request service from a single resource (e.g., microprocessor, DMA controller). Which gets serviced first?
- Priority arbiter
  - Single-purpose processor
  - Peripherals make requests to the arbiter; the arbiter makes requests to the resource
  - Arbiter is connected to the system bus for configuration only
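The decision a priority arbiter makes can be sketched as a function from the set of pending requests to the peripheral that is granted service. Both functions below are illustrative assumptions: the names are hypothetical, and the rotating variant uses a simple round-robin rule, which is one possible rotating scheme rather than the behaviour of any particular arbiter chip.

```python
def fixed_priority_grant(requests):
    """Grant the lowest-numbered requesting peripheral (index 0 = highest)."""
    for index, requesting in enumerate(requests):
        if requesting:
            return index
    return None

def rotating_priority_grant(requests, last_granted):
    """Grant the next requester after the one served last (round robin)."""
    count = len(requests)
    for offset in range(1, count + 1):
        index = (last_granted + offset) % count
        if requests[index]:
            return index
    return None

pending = [True, True, False]
print(fixed_priority_grant(pending))         # peripheral 0 always wins
print(rotating_priority_grant(pending, 0))   # peripheral 1 gets its turn
```

With fixed priority a busy high-priority peripheral can starve the others; the rotating rule trades that for equal access.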

Arbitration: Daisy-chain arbitration
- Arbitration is done by the peripherals
  - Built into the peripheral, or external logic added
  - req input and ack output added to each peripheral
- Peripherals are connected to each other in daisy-chain manner
  - One peripheral is connected to the resource; all others are connected upstream
  - A peripheral's req flows downstream to the resource; the resource's ack flows upstream to the requesting peripheral
  - Closest peripheral has highest priority
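The way the ack travels up a daisy chain can be modelled by walking the peripherals in order of distance from the resource. This is a simplified sketch under assumptions: the function name is hypothetical, and a broken peripheral is modelled as one that passes neither req nor ack, so it cuts off everything upstream of it.

```python
def daisy_chain_grant(requests, broken=frozenset()):
    """Return the index of the peripheral that receives the ack.

    requests[0] is the peripheral closest to the resource. A peripheral
    listed in `broken` is assumed to block the chain at its position.
    """
    for index, requesting in enumerate(requests):
        if index in broken:
            return None        # chain is cut: nothing upstream is served
        if requesting:
            return index       # this peripheral absorbs the ack
    return None

print(daisy_chain_grant([False, True, True]))               # closest requester
print(daisy_chain_grant([False, False, True], broken={1}))  # cut off by a fault
```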

Daisy-chain pros/cons
- Easy to add or remove a peripheral; no system redesign needed
- Does not support rotating priority
- One broken peripheral can cause loss of access to other peripherals

Network-oriented arbitration
- Used when multiple microprocessors share a bus (sometimes called a network)
  - Arbitration is typically built into the bus protocol
  - Separate processors may try to write simultaneously, causing collisions
    - Data must be resent
    - The processors should not start sending again at the same time; statistical methods can be used to reduce the chances
- Typically used for connecting multiple distant chips
  - Trend: use to connect multiple on-chip processors

Example: Vectored interrupt using an interrupt table
- Fixed priority: i.e., Peripheral1 has highest priority
- The keyword _at_ followed by a memory address forces the compiler to place variables in specific memory locations
  - e.g., memory-mapped registers in the arbiter and peripherals
- A peripheral's index into the interrupt table is sent to a memory-mapped register in the arbiter
- Peripherals receive external data and raise an interrupt

Multilevel bus architectures
- Don't want one bus for all communication
  - Peripherals would need a high-speed, processor-specific bus interface: excess gates, power consumption, and cost; less portable
  - Too many peripherals slow down the bus
- Processor-local bus
  - High speed, wide, most frequent communication
  - Connects microprocessor, cache, memory controllers, etc.
- Peripheral bus
  - Lower speed, narrower, less frequent communication
  - Typically an industry-standard bus (ISA, PCI) for portability
- Bridge
  - Single-purpose processor that converts communication between busses

Advanced communication principles
- Layering
  - Breaks the complexity of a communication protocol into pieces that are easier to design and understand
  - Lower levels provide services to higher levels
  - A lower level might work with bits while a higher level might work with packets of data
- Physical layer
  - Lowest level in the hierarchy
  - Medium to carry data from one actor (device or node) to another
- Parallel communication: physical layer capable of transporting multiple bits of data
- Serial communication: physical layer transports one bit of data at a time
- Wireless communication: no physical connection needed for transport at the physical layer

Parallel communication
- Multiple data, control, and possibly power wires
  - One bit per wire
- High data throughput with short distances
- Typically used when connecting devices on the same IC or the same circuit board
  - Bus must be kept short: long parallel wires result in high capacitance values, which require more time to charge/discharge
  - Data misalignment between wires increases as length increases
- Higher cost, bulky

Serial communication
- Single data wire, possibly also control and power wires
- Words transmitted one bit at a time
- Higher data throughput with long distances
  - Less average capacitance, so more bits per unit of time
- Cheaper, less bulky
- More complex interfacing logic and communication protocol
  - Sender needs to decompose a word into bits
  - Receiver needs to recompose bits into a word
  - Control signals often sent on the same wire as data, increasing protocol complexity

Wireless communication
- Infrared (IR)
  - Electromagnetic wave frequencies just below the visible light spectrum
  - Diode emits infrared light to generate the signal
  - Infrared transistor detects the signal; it conducts when exposed to infrared light
  - Cheap to build
  - Needs line of sight, limited range
- Radio frequency (RF)
  - Electromagnetic wave frequencies in the radio spectrum
  - Analog circuitry and antenna needed on both sides of the transmission
  - Line of sight not needed; transmitter power determines range

Error detection and correction
- Often part of the bus protocol
- Error detection: ability of the receiver to detect errors during transmission
- Error correction: ability of the receiver and transmitter to cooperate to correct the problem
  - Typically done by an acknowledgement/retransmission protocol
- Bit error: a single bit is inverted
- Burst of bit errors: consecutive bits received incorrectly
- Parity: extra bit sent with a word, used for error detection
  - Odd parity: data word plus parity bit contains an odd number of 1s
  - Even parity: data word plus parity bit contains an even number of 1s
  - Always detects single-bit errors, but not all burst bit errors
- Checksum: extra word sent with a data packet of multiple words
  - e.g., the extra word contains the XOR sum of all data words in the packet
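The parity and XOR-checksum schemes described above are short enough to sketch directly. The function names are hypothetical and the data values are arbitrary examples; the last call shows a two-bit error that even parity fails to detect.

```python
def parity_bit(word: int, odd: bool = False) -> int:
    """Parity bit that makes the total number of 1s even (or odd)."""
    even_bit = bin(word).count("1") % 2
    return even_bit ^ 1 if odd else even_bit

def even_parity_ok(word: int, bit: int) -> bool:
    """Receiver check: word plus parity bit must hold an even number of 1s."""
    return (bin(word).count("1") + bit) % 2 == 0

def xor_checksum(words) -> int:
    """Extra word formed as the XOR of all data words in the packet."""
    checksum = 0
    for word in words:
        checksum ^= word
    return checksum

sent = 0b1011
bit = parity_bit(sent)
print(even_parity_ok(sent, bit))      # intact word passes the check
print(even_parity_ok(0b1010, bit))    # one flipped bit is detected
print(even_parity_ok(0b1000, bit))    # two flipped bits slip through
print(hex(xor_checksum([0x12, 0x34, 0x56])))
```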

Serial protocols: I2C
- I2C (Inter-IC)
  - Two-wire serial bus protocol developed by Philips Semiconductors in the early 1980s
  - Enables peripheral ICs to communicate using simple communication hardware
  - Data transfer rates up to 100 kbits/s and 7-bit addressing in normal mode
  - Up to 3.4 Mbits/s in high-speed mode; 10-bit addressing is also available
  - Common devices capable of interfacing to the I2C bus: EPROMs, flash, and some RAM memory, real-time clocks, watchdog timers, and microcontrollers
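As an illustration of 7-bit addressing, the first byte of an I2C transfer carries the 7-bit device address followed by a read/write bit. These notes do not describe the frame layout, so the sketch below relies on the I2C specification for that detail; the function name and the device address 0x50 are hypothetical.

```python
def i2c_address_byte(address: int, read: bool) -> int:
    """First byte on the bus: 7-bit address, then the R/W bit (1 = read)."""
    if not 0 <= address <= 0x7F:
        raise ValueError("7-bit address must be in the range 0x00 to 0x7F")
    return (address << 1) | (1 if read else 0)

print(hex(i2c_address_byte(0x50, read=False)))   # write to device 0x50
print(hex(i2c_address_byte(0x50, read=True)))    # read from device 0x50
```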

I2C bus structure

Serial protocols: CAN
- CAN (Controller area network)
  - Protocol for real-time applications
  - Developed by Robert Bosch GmbH
  - Originally for communication among components of cars
  - Applications now using CAN include: elevator controllers, copiers, telescopes, production-line control systems, and medical instruments
  - Data transfer rates up to 1 Mbit/s and 11-bit addressing
  - Common devices interfacing with CAN: 8051-compatible 8592 processor and standalone CAN controllers
- Actual physical design of the CAN bus is not specified in the protocol
  - Requires devices to transmit/detect dominant and recessive signals to/from the bus
  - e.g., 1 = dominant, 0 = recessive if a single data wire is used
  - Bus guarantees that the dominant signal prevails over the recessive signal if both are asserted simultaneously

Serial protocols: FireWire
- FireWire (a.k.a. I-Link, Lynx, IEEE 1394)
  - High-performance serial bus developed by Apple Computer Inc.
  - Designed for interfacing independent electronic components, e.g., desktop, scanner
  - Data transfer rates from 12.5 to 400 Mbits/s, 64-bit addressing
  - Plug-and-play capabilities
  - Packet-based layered design structure
  - Applications using FireWire include disk drives, printers, scanners, cameras
- Capable of supporting a LAN similar to Ethernet
  - 64-bit address:
    - 10 bits for network ids: 1023 subnetworks
    - 6 bits for node ids: each subnetwork can have 63 nodes
    - 48 bits for memory address: each node can have 281 terabytes of distinct locations
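The 10/6/48-bit split of the 64-bit FireWire address can be shown by packing and unpacking the three fields. The function and parameter names are hypothetical; only the field widths come from the notes.

```python
NETWORK_BITS, NODE_BITS, OFFSET_BITS = 10, 6, 48

def pack_address(network_id: int, node_id: int, offset: int) -> int:
    """Combine the three fields into one 64-bit address."""
    if not 0 <= network_id < (1 << NETWORK_BITS):
        raise ValueError("network id needs 10 bits")
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node id needs 6 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("memory address needs 48 bits")
    return (network_id << (NODE_BITS + OFFSET_BITS)) | (node_id << OFFSET_BITS) | offset

def unpack_address(address: int):
    """Split a 64-bit address back into (network_id, node_id, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    node_id = (address >> OFFSET_BITS) & ((1 << NODE_BITS) - 1)
    network_id = address >> (NODE_BITS + OFFSET_BITS)
    return network_id, node_id, offset

address = pack_address(5, 12, 0x1234)
print(hex(address), unpack_address(address))
print(2 ** OFFSET_BITS)   # distinct byte locations per node, about 281 x 10^12
```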

Serial protocols: USB
- USB (Universal Serial Bus)
  - Easier connection between a PC and monitors, printers, digital speakers, modems, scanners, digital cameras, joysticks, multimedia game equipment
- 2 data rates:
  - 12 Mbps for increased-bandwidth devices
  - 1.5 Mbps for lower-speed devices (joysticks, game pads)
- Tiered star topology can be used
  - One USB device (hub) connected to the PC; the hub can be embedded in devices like a monitor, printer, or keyboard, or can be standalone
  - Multiple USB devices can be connected to the hub
  - Up to 127 devices can be connected like this
- USB host controller
  - Manages and controls bandwidth and driver software required by each peripheral
  - Dynamically allocates power downstream according to devices connected/disconnected

Parallel protocols: PCI Bus
- PCI Bus (Peripheral Component Interconnect)
  - High-performance bus originated at Intel in the early 1990s
  - Standard adopted by industry and administered by PCISIG (PCI Special Interest Group)
  - Interconnects chips, expansion boards, processor memory subsystems
  - Data transfer rates of 127.2 to 508.6 Mbytes/s and 32-bit addressing
  - Later extended to 64-bit while maintaining compatibility with 32-bit schemes
  - Synchronous bus architecture
  - Multiplexed data/address lines

Parallel protocols: ARM Bus
- ARM Bus
  - Designed and used internally by ARM Corporation
  - Interfaces with the ARM line of processors
  - Many IC design companies have their own bus protocol
  - Data transfer rate is a function of clock speed: if the clock speed of the bus is X, transfer rate = 16 x X bits/s
  - 32-bit addressing

Wireless protocols: IrDA
- IrDA
  - Protocol suite that supports short-range point-to-point infrared data transmission
  - Created and promoted by the Infrared Data Association (IrDA)
  - Data transfer rates between 9.6 kbps and 4 Mbps
  - IrDA hardware deployed in notebook computers, printers, PDAs, digital cameras, public phones, cell phones
  - Lack of suitable drivers has slowed use by applications
  - Windows 2000/98 now include support
  - Becoming available on popular embedded OSs

Wireless protocols: Bluetooth
- Bluetooth
  - New, global standard for wireless connectivity
  - Based on a low-cost, short-range radio link
  - Connection established when devices are within 10 meters of each other
  - No line of sight required, e.g., connect to a printer in another room

Wireless protocols: IEEE 802.11
- IEEE 802.11
  - Proposed standard for wireless LANs
  - Specifies parameters for the PHY and MAC layers of the network
- PHY layer
  - Physical layer
  - Handles transmission of data between nodes
  - Provisions for data transfer rates of 1 or 2 Mbps
  - Operates in the 2.4 to 2.4835 GHz frequency band (RF) or 300 to 428,000 GHz (IR)
- MAC layer
  - Medium access control layer
  - Protocol responsible for maintaining order in the shared medium
  - Collision avoidance/detection

Chapter Summary
- Basic protocol concepts
  - Actors, direction, time multiplexing, control methods
- General-purpose processors
  - Port-based or bus-based I/O
  - I/O addressing: memory-mapped I/O or standard I/O
  - Interrupt handling: fixed or vectored
  - Direct memory access
- Arbitration
  - Priority arbiter (fixed/rotating) or daisy chain
- Bus hierarchy
- Advanced communication
  - Parallel vs. serial, wires vs. wireless, error detection/correction, layering
  - Serial protocols: I2C, CAN, FireWire, and USB; parallel: PCI and ARM
  - Serial wireless protocols: IrDA, Bluetooth, and IEEE 802.11

Intel 8259 programmable priority controller

Signal | Description
D[7..0] | These wires are connected to the system bus and are used by the microprocessor to write or read the internal registers of the 8259.
A[0..0] | This pin acts in conjunction with the WR/RD signals. It is used by the 8259 to decipher various command words the microprocessor writes and status the microprocessor wishes to read.
WR | When this write signal is asserted, the 8259 accepts the command on the data lines, i.e., the microprocessor writes to the 8259 by placing a command on the data lines and asserting this signal.
RD | When this read signal is asserted, the 8259 provides its status on the data lines, i.e., the microprocessor reads the status of the 8259 by asserting this signal and reading the data lines.
INT | This signal is asserted whenever a valid interrupt request is received by the 8259, i.e., it is used to interrupt the microprocessor.
INTA | This signal is used to enable 8259 interrupt-vector data onto the data bus by a sequence of interrupt acknowledge pulses issued by the microprocessor.
IR 0,1,2,3,4,5,6,7 | An interrupt request is executed by a peripheral device when one of these signals is asserted.
CAS[2..0] | These are cascade signals to enable multiple 8259 chips to be chained together.
SP/EN | This function is used in conjunction with the CAS signals for cascading purposes.

Intel 8237 DMA controller

Signal | Description
D[7..0] | These wires are connected to the system bus (ISA) and are used by the microprocessor to write to the internal registers of the 8237.
A[19..0] | These wires are connected to the system bus (ISA) and are used by the DMA to issue the memory location where the transferred data is to be written to. The 8237 is also addressed by the microprocessor through the lower bits of these address lines.
ALE* | This is the address latch enable signal. The 8237 uses this signal when driving the system bus (ISA).
MEMR* | This is the memory read signal issued by the 8237 when driving the system bus (ISA).
MEMW* | This is the memory write signal issued by the 8237 when driving the system bus (ISA).
IOR* | This is the I/O device read signal issued by the 8237 when driving the system bus (ISA) in order to read a byte from an I/O device.
IOW* | This is the I/O device write signal issued by the 8237 when driving the system bus (ISA) in order to write a byte to an I/O device.
HLDA | This signal (hold acknowledge) is asserted by the microprocessor to signal that it has relinquished the system bus (ISA).
HRQ | This signal (hold request) is asserted by the 8237 to signal to the microprocessor a request to relinquish the system bus (ISA).
REQ 0,1,2,3 | A device attached to one of these channels asserts this signal to request a DMA transfer.
ACK 0,1,2,3 | The 8237 asserts this signal to grant a DMA transfer to a device attached to one of these channels.

*See the ISA bus description in this chapter for complete details.
