
Advanced Microprocessors

SREE NARAYANA GURUKULAM COLLEGE OF ENGINEERING, KADAYIRUPPU, KOLENCHERY

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING

LA 803 ADVANCED MICROPROCESSORS

Prepared by

Mr.Karthik.S

Approved by

Prof: Dr.Ramkumar S.

Karthik.S Shyju.Y

S.N.G.C.E


SREE NARAYANA GURUKULAM COLLEGE OF ENGINEERING

Branch: Electronics & Communication Engineering Subject Code & Name: L803 Advanced Microprocessors

Course Material

Course Coordinators:
Mr. Shyju Y., Lecturer/ECE
Mr. Karthik S., Lecturer/ECE


SEMESTER 8

YEAR         NAME OF THE SUBJECT INSTRUCTOR
2005-2006    Mr. Shyju Y.
2006-2007    Mr. Shyju Y.
2007-2008    Mr. Shyju Y.
2008-2009    Mr. Shyju Y.
2009-2010    Mr. Karthik S.

Time Table:
DAY:     Monday, Tuesday, Wednesday, Thursday, Friday (AMP on each day)
PERIOD:  1 2 3 4 5 6 7

Department of Electronics & Communication Engineering
Sree Narayana Gurukulam College of Engineering
Lesson Plan
LA 803 ADVANCED MICROPROCESSORS

MODULE: 1
SLNO  DATE        PERIOD  TOPICS                                                HOURS NEEDED
1     25/01/2010  1       Introduction to 8086                                  1
2     27/01/2010  3       Internal Architecture & Block diagram                 1
3     28/01/2010  7       Pin Description                                       1
4     29/01/2010  5       Register Organization                                 1
5     30/01/2010  4       Minimum mode operation                                1
6     02/02/2010  7       Maximum mode operation                                1
7     03/02/2010  4       Interrupt applications                                1
8     04/02/2010  7       DMA data transfer                                     1
9     05/02/2010  1       8087 math coprocessor                                 1
10    06/02/2010  2       8086 memory organization, Segmentation & advantages   1
11    09/02/2010  3       Tutorial                                              1
                          Total Hours                                           11

MODULE: 2
SLNO  DATE        PERIOD  TOPICS                                                HOURS NEEDED
1     10/02/2010  4       Addressing modes of 8086                              1
2     11/02/2010  5       Data addressing modes                                 1
3     19/02/2010  1       Program memory addressing modes                       1
4     20/02/2010  2       Stack memory addressing modes                         1
                          Total Hours                                           4

MODULE: 3
SLNO  DATE        PERIOD  TOPICS                                                HOURS NEEDED
1     23/02/2010  4       Introduction to 80286                                 1
2     24/02/2010  6       Internal Architecture & Block diagram                 1
3     25/02/2010  5       Pin Description of 80286                              1
4     02/03/2010  4       Real mode operation of 80286                          1
5     04/03/2010  1       Protected mode operation of 80286                     1
                          Total Hours                                           5

MODULE: 4
SLNO  DATE        PERIOD  TOPICS                                                HOURS NEEDED
1     05/03/2010  2       Introduction to 80386                                 1
2     09/03/2010  4       Internal Architecture & Block diagram                 1
3     10/03/2010  1       Pin Description of 80386                              1
4     16/03/2010  5       Real addressing mode of 80386                         1
5     17/03/2010  2       Protected mode operation                              1
6     19/03/2010  5       Segmentation & Virtual memory, Paging concept         1
7     20/03/2010  4       Tutorial                                              1
8     23/03/2010  4       Segmentation privilege level & protection             1
9     24/03/2010  1       Interrupt & exception handling                        1
10    25/03/2010  2       Task switching, paging mode                           1
11    26/03/2010  6       Virtual 8086 operation & system connection            1
                          Total Hours                                           11

MODULE: 5
SLNO  DATE        PERIOD  TOPICS                                                         HOURS NEEDED
1     30/03/2010  2       Introduction to 80486, Internal Architecture & Block diagram   2
2     31/03/2010  4       RISC processor, 5 stage pipelining                             1
3     03/04/2010  1       Integrated coprocessor, on board cache, burst bus mode         1
4     05/04/2010  2       Pentium superscalar architecture, UV pipeline                  1
5     06/04/2010  4       Branch prediction logic, cache structure, BIST and MMX         1
                          Total Hours                                                    6

TEACHING PLAN
LA 803 - ADVANCED MICROPROCESSORS
EIGHTH SEMESTER

Total number of hours from January 2010 to May 2010

January                               = 7 Hours
February                              = 16 Hours
March                                 = 24 Hours
Total Hours                           = 47 Hours

Hours lost due to holidays            = 5 Hours
Hours lost due to exams               = 4 Hours
Total hours lost                      = 09 Hours
Expected Hours                        = 38 Hours

Hours needed to complete Module 1:      11 Hours
Hours needed to complete Module 2:      04 Hours
Hours needed to complete Module 3:      05 Hours
Hours needed to complete Module 4:      11 Hours
Hours needed to complete Module 5:      06 Hours

Extra hours (38-37)                   = 01 Hour

LA 803 ADVANCED MICROPROCESSORS
ASSIGNMENT QUESTIONS & TUTORIALS

I Phase:
1. Silicon trends and limits for advanced microprocessors.
2. The growth of microprocessors with the ups and downs in the electronics market.
3. Moore's law and its impact on advanced microprocessors.
4. Advanced microprocessors as microcomputers.
5. Microprocessor evolution and bit-slice processors.
6. Compare and contrast ENIAC, MARK1, EDVAC and UNIVAC.
7. The role of advanced microprocessors and mainframes.
8. The making of the micro and the role of ICs.
9. Compare and contrast Von Neumann with Harvard architecture and CISC machines with RISC machines.
10. Terahertz transistor: processing at astronomical speed!
11. Explicitly Parallel Instruction Computing (EPIC).
12. The RAMAC 350 and its role in processing.
13. Konrad Zuse's idea and the machines Z1 and Z3.

II Phase:
1. The 8086 microprocessor is divided into ______ ____ unit and __________ unit. The size of the instruction queue is ______ bytes.
2. The 8086 can read/write a byte or word in a single memory access (True/False).
3. What is the physical address corresponding to an offset of 7002h when the content of the data segment register is 10CFh?
4. The SI & DI cannot be accessed as two separate bytes (True/False).
5. What tells a µP what to do, where to get data, how to process the data and where to put the results when done?
6. What is the function of the IP register?
7. What dedicated operations are assigned to the CX register?
8. What is the word length of the 8086 physical address?
9. What is wrong with MOV BL, CX?
10. What is wrong with MOV DS, SS?
11. Copy AL into AH & copy DS into AH.
12. Draw the architecture of the 8087 math coprocessor.

III Phase:
1. Show how the physical address is generated for different addressing modes.
2. Draw the block diagram of 80286.
3. How are interrupts handled in the Intel 286 processor?
4. Explain the descriptors and their access right bits in the 80386 processor.
5. List the advantages of a RISC processor.
6. What is MMX technology and how is it implemented in the Pentium processor?
SREE NARAYANA GURUKULAM COLLEGE OF ENGINEERING, KADAYIRUPPU
FIRST INTERNAL EXAMINATION, FEBRUARY 2010
Eighth Semester
Advanced Microprocessors
Time: Two Hours                                   Maximum: 60 marks

PART A (6 X 4 = 24 marks)

1. The 8086 microprocessor is divided into ______ ____ unit and __________ unit. The size of the instruction queue is ______ bytes. (1)
   The 8086 can read/write a byte or word in a single memory access (True/False). (1)
   What is the physical address corresponding to an offset of 94D0h when the content of the data segment register is 103Fh? (1)
   The four 8086 memory segments can be located anywhere within the 1 MB address space of the 8086 (True/False). (1)
2. List the internal registers in the 8086 microprocessor. (2)
   What are the advantages of pipelining? (2)
3. Explain all the flags in 8086 and draw the PSW. (2)
   Assume the 16-bit flag word = 0BC3h. Determine the values of the six status flags and three control flags. (2)
4. How do you configure the 8086 into minimum and maximum modes? (1)
   List the signals in minimum and maximum modes. (3)
5. Explain the roles of the pins TEST, LOCK.
   Draw the timing diagram of HOLD & HLDA in minimum mode. (3)
6. List the advantages and disadvantages of segmented memory. (4)

PART B (3 X 12 = 36 marks)
Answer any three questions

1. Sketch the block diagram of the 8086 CPU and explain its architecture in detail. (12)
2. Explain the following pins point by point: ALE, DT/R', BHE', QS0, QS1, RQ/GT, HOLD/HLDA. (12)
3. A. What determines the size of the processor, how many address lines does the 8086 include and what is the size of the memory that the 8086 can address? (2)
   B. Configure the 8086 as a single processor in minimum mode with a neat diagram and explain its operation with the essential control signals. (10)
4. Give the important details of the interrupt system in 8086 and the types of interrupts. (12)
LA803
SREE NARAYANA GURUKULAM COLLEGE OF ENGINEERING, KADAYIRUPPU
SECOND INTERNAL EXAMINATION, APRIL 2010
Eighth Semester
ADVANCED MICROPROCESSORS
Time: Two Hours                                   Maximum: 60 marks

PART A (6 X 4 = 24 marks)
Answer all questions

1. Discuss any four data addressing modes with examples.
2. Define stack. What are the stack addressing modes?
3. List the prominent features of 80286.
4. What is meant by protected mode operation?
5. Discuss memory segmentation and virtual memory in 80386.
6. Discuss swapping, descriptors and the descriptor table with access right bits.

PART B (12 X 3 = 36)
Answer any three questions

1. a. Discuss the program memory addressing modes with examples. (6)
   b. Explain the physical address formation in different addressing modes. (6)
2. With a neat block diagram, discuss the internal architecture of the 80286 processor. (12)
3. a. Give details about the concept of virtual memory. (4)
   b. Explain the physical address formation in real mode and protected virtual address mode operation in 80286. (8)
4. Discuss the paging mechanism of 80386 in detail. What are the different exceptions generated by 80386? (6 + 6)
5. Explain segment and system descriptors, privilege levels and call gates in 80386. (12)

*************** ALL THE BEST ***************
Course Outline
ADVANCED MICROPROCESSORS
LA803                                             3+1+0

Module 1
Intel 8086 Microprocessor - Internal architecture - Block diagram - Minimum and maximum mode operation - Interrupt and Interrupt applications - DMA data transfer - 8087 math coprocessor. 8086 memory organization - even and odd memory banks - segment registers - logical and physical address - advantages and disadvantages of physical memory.

Module 2
Addressing modes used in 80x86 family - Data addressing mode - register addressing, immediate addressing, direct addressing, register indirect addressing, base plus index addressing, register relative addressing, base relative plus index addressing, scaled addressing. Program memory addressing modes - direct program memory addressing, relative program memory addressing. Stack memory addressing mode.

Module 3
Intel 80286 Microprocessor - 80286 Architecture, system connection - Real address mode operation - Protected mode operation.

Module 4
Intel 80386 Microprocessor - 80386 Architecture and system connection - Real operating mode - 386 protected mode operation - segmentation and virtual memory - segment privilege levels and protection - call gates - I/O privilege levels - Interrupts and exception handling - task switching - paging mode - 80386 virtual 86 mode operation.

Module 5
Advanced Intel Microprocessors - 80486 - Processor model - Reduced Instruction cycle - five stage instruction pipeline - Integrated coprocessor - On board cache - Burst Bus mode. Pentium - super scalar architecture - u-v pipeline - branch prediction logic - cache structure - BIST (built in self test) - Introduction to MMX technology.

References
1. The Microprocessors, 6th Edition - Barry B. Brey - Pearson Edu.
2. Microprocessor and Interfacing, 2nd Edition - Douglas V. Hall - TMH
3. The 80x86 family - John Uffenbeck


Course Objectives

Module 1:
To identify the main components of the 8086 processor model, including the bus interface unit and the execution unit.
To explain the 8086 pipelined architecture, and to list and identify the functions of the 8086 CPU data registers and flags.
To illustrate how to convert an 8086 offset or logical address into a physical address.
To list the 8086 segment registers and to show how to calculate a segment's base address.

Module 2:
To define the role of the data, program and stack memory addressing modes of the 8086 microprocessor.
To list the different ways the 8086 processor accesses memory operands.

Module 3:
To be familiar with the main components of the 80286 processor model, including the bus interface unit and the execution unit.
To contrast the 286 segmentation techniques of memory management in real and protected modes.

Module 4:
To list and identify the functions of the 80386 architecture and its general purpose and special purpose registers.
To compare the segmentation methods of memory management in real, protected and virtual modes.
To list the rules of privilege for the 386 processor.
To explain how the 386 paging unit translates a linear address into a physical address.

Module 5:
To list the factors that make the 486 faster than the 386 for a given clock speed.
To gain knowledge of the architecture of the 486 pipelined processor and of the UV pipelined Pentium processor.
To study the on board cache operation and MMX technology.


Module: 1

1. Introduction:

In 1978, Intel produced its first 16-bit processor, the 8086. It was source compatible with the 8080 and 8085 (an 8080 derivative). This chip has probably had more effect on the present-day computer market than any other, although whether this is justified is debatable; the chip was compatible with the four-year-old 8080, and this meant it had to use a most unusual overlapping segment register scheme to access a full 1 megabyte of memory.

It is a 16-bit µP.
The 8086 has a 20-bit address bus and can access up to 2^20 memory locations (1 MB).
It can support up to 64K I/O ports.
It provides fourteen 16-bit registers.
It has a multiplexed address and data bus, AD0 - AD15 and A16 - A19.
It requires a single phase clock with 33% duty cycle to provide internal timing.
The 8086 is designed to operate in two modes, Minimum and Maximum.
It can prefetch up to 6 instruction bytes from memory and queue them in order to speed up instruction execution.
It requires a +5V power supply.
It comes in a 40 pin dual in line package.

Minimum and Maximum Modes:
The minimum mode is selected by applying logic 1 to the MN/MX# input pin. This is a single microprocessor configuration.
The maximum mode is selected by applying logic 0 to the MN/MX# input pin. This is a multi-microprocessor configuration.


2. Block diagram of 8086:

Internal Architecture of 8086

The 8086 has two blocks, the BIU and the EU. The BIU performs all bus operations, such as instruction fetching, reading and writing operands in memory and calculating the addresses of the memory operands. The instruction bytes are transferred to the instruction queue. The EU executes instructions from the instruction stream byte queue. Both units operate asynchronously to give the 8086 an overlapping instruction fetch and execution mechanism, which is called pipelining. This results in efficient use of the system bus and improved system performance.
The BIU contains the instruction queue, segment registers, instruction pointer and address adder.
The EU contains the control circuitry, instruction decoder, ALU, pointer and index registers and flag register.

BUS INTERFACE UNIT:
It provides a full 16-bit bidirectional data bus and a 20-bit address bus. The bus interface unit is responsible for performing all external bus operations. Specifically, it has the following functions: instruction fetch, instruction queuing, operand fetch and storage, address relocation and bus control.

The BIU uses a mechanism known as an instruction stream queue to implement a pipelined architecture. This queue permits prefetch of up to six bytes of instruction code. Whenever the queue is not full (that is, it has room for at least two more bytes) and at the same time the EU is not requesting it to read or write operands from memory, the BIU is free to look ahead in the program by prefetching the next sequential instruction. These prefetched instructions are held in its FIFO queue. With its 16-bit data bus, the BIU fetches two instruction bytes in a single memory cycle. After a byte is loaded at the input end of the queue, it automatically shifts up through the FIFO to the empty location nearest the output.

The EU accesses the queue from the output end. It reads one instruction byte after the other from the output of the queue. If the queue is full and the EU is not requesting access to operands in memory, the BIU does not need to perform any bus cycle. These intervals of no bus activity, which may occur between bus cycles, are known as idle states. If the BIU is already in the process of fetching an instruction when the EU requests it to read or write operands from memory or I/O, the BIU first completes the instruction fetch bus cycle before initiating the operand read/write cycle.

The BIU also contains a dedicated adder which is used to generate the 20-bit physical address that is output on the address bus. This address is formed by shifting the 16-bit segment address left by four bits (appending 0H) and adding a 16-bit offset address to it.
For example: the physical address of the next instruction to be fetched is formed by combining the current contents of the code segment (CS) register and the current contents of the instruction pointer (IP) register.
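The address calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic done by the BIU adder; the function name physical_address and the CS and IP values are assumptions chosen for the example, not taken from the course text.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 bits that fit on the address bus.
    return ((segment << 4) + offset) & 0xFFFFF


# Illustrative values only: CS = 2000h, IP = 0100h
cs, ip = 0x2000, 0x0100
print(f"{physical_address(cs, ip):05X}")  # prints 20100
```

Because segments start every 16 bytes and overlap, different segment:offset pairs can name the same byte; for instance 1000h:0010h and 1001h:0000h both give 10010h.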
The BIU is also responsible for generating bus control signals such as those for memory read or write and I/O read or write.

EXECUTION UNIT:
The execution unit is responsible for decoding and executing all instructions. The EU extracts instructions from the top of the queue in the BIU, decodes them, generates operand addresses if necessary, passes them to the BIU and requests it to perform the read or write bus cycles to memory or I/O, and performs the operation specified by the instruction on the operands. During the execution of the instruction, the EU tests the status and control flags and updates them based on the results of executing the instruction. If the queue is empty, the EU waits for the next instruction byte to be fetched and shifted to the top of the queue.

When the EU executes a branch or jump instruction, it transfers control to a location corresponding to another set of sequential instructions. Whenever this happens, the BIU automatically resets the queue and then begins to fetch instructions from this new location to refill the queue.
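The interplay between the BIU and the EU around the queue can be modelled as a small FIFO. The sketch below is a simplified illustration under stated assumptions: the class InstructionQueue and its method names are hypothetical, every prefetch is assumed to bring in a full word from an even address, and bus timing is ignored.

```python
from collections import deque

QUEUE_SIZE = 6  # the 8086 instruction queue holds up to six bytes


class InstructionQueue:
    """Simplified model of the BIU instruction queue (hypothetical names)."""

    def __init__(self):
        self.bytes = deque()

    def can_prefetch(self) -> bool:
        # The BIU starts a fetch only when at least two bytes are free.
        return QUEUE_SIZE - len(self.bytes) >= 2

    def prefetch(self, word: bytes) -> bool:
        """BIU side: append one fetched word (two bytes) if there is room."""
        if len(word) != 2:
            raise ValueError("this model fetches exactly two bytes per cycle")
        if not self.can_prefetch():
            return False  # queue full: the bus stays idle for this cycle
        self.bytes.extend(word)
        return True

    def next_byte(self):
        """EU side: read one byte from the output end, or None if empty."""
        return self.bytes.popleft() if self.bytes else None

    def flush(self):
        """Discard prefetched bytes, as happens after a jump or branch."""
        self.bytes.clear()


q = InstructionQueue()
for word in (b"\xB8\x34", b"\x12\x90", b"\x90\x90"):
    q.prefetch(word)
print(q.prefetch(b"\x90\x90"))  # False: all six bytes are occupied
print(hex(q.next_byte()))       # 0xb8: the EU reads from the output end
q.flush()                       # a jump empties the queue
print(q.next_byte())            # None: the EU must wait for a new fetch
```

A double-ended queue is used here simply because bytes enter at one end and leave at the other, which matches the FIFO behaviour described in the text.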

3. Pin Diagram of 8086:

The 8086 microprocessor is a 16-bit CPU available in different clock rates and packaged in a 40 pin CERDIP or plastic package. The 8086 operates in single processor or multiprocessor configuration to achieve high performance. The pins serve one function in minimum mode (single processor mode) and another function in maximum mode (multiprocessor mode).

The 8086 signals can be categorized in three groups. The first are the signals having common functions in minimum as well as maximum mode. The second are the signals which have special functions in minimum mode, and the third are the signals having special functions in maximum mode.

The following signal descriptions are common to both modes.

AD15-AD0: These are the time multiplexed memory I/O address and data lines. The address remains on the lines during the T1 state, while the data is available on the data bus during T2, T3, Tw and T4. These lines are active high and float to a tristate during interrupt acknowledge and local bus hold acknowledge cycles.

A19/S6, A18/S5, A17/S4, A16/S3: These are the time multiplexed address and status lines. During T1 these are the most significant address lines for memory operations. During I/O operations, these lines are low. During memory or I/O operations, status information is available on these lines for T2, T3, Tw and T4. The status of the interrupt enable flag bit (S5) is updated at the beginning of each clock cycle. S4 and S3 together indicate which segment register is presently being used for memory accesses, as follows:

S4  S3  Indication
0   0   Alternate data (ES)
0   1   Stack (SS)
1   0   Code or none (CS)
1   1   Data (DS)

These lines float to tristate off during the local bus hold acknowledge. The status line S6 is always low. The address bits are separated from the status bits using latches controlled by the ALE signal.

BHE/S7: The bus high enable is used to indicate the transfer of data over the higher order (D15-D8) data bus. It goes low for a data transfer over D15-D8 and is used to derive the chip selects of the odd address memory bank or peripherals. Together with A0 it selects the banks as follows:

BHE  A0  Indication
0    0   Whole word
0    1   Upper byte from or to odd address
1    0   Lower byte from or to even address
1    1   None

BHE is low during T1 for read, write and interrupt acknowledge cycles, whenever a byte is to be transferred on the higher byte of the data bus. The status information is available during T2, T3 and T4. The signal is active low and is tristated during hold. It is low during T1 for the first pulse of the interrupt acknowledge cycle.

RD - Read: This signal, when low, indicates to the peripheral that the processor is performing a memory or I/O read operation. RD is active low and shows the state for T2, T3 and Tw of any read cycle. The signal remains tristated during the hold acknowledge.

READY: This is the acknowledgement from a slow device or memory that it has completed the data transfer. The signal made available by the devices is synchronized by the 8284A clock generator to provide the ready input to the 8086. The signal is active high.

INTR - Interrupt Request: This is a level triggered input. It is sampled during the last clock cycle of each instruction to determine the availability of the request. If any interrupt request is pending, the processor enters the interrupt acknowledge cycle. This can be internally masked by resetting the interrupt enable flag. This signal is active high and internally synchronized.

TEST: This input is examined by a WAIT instruction. If the TEST pin goes low, execution will continue; otherwise the processor remains in an idle state. The input is synchronized internally during each clock cycle on the leading edge of the clock.

CLK - Clock Input: The clock input provides the basic timing for processor operation and bus control activity. It is an asymmetric square wave with 33% duty cycle.

MN/MX: The logic level at this pin decides whether the processor is to operate in minimum or maximum mode.
The following pin functions are for the minimum mode operation of 8086.

M/IO - Memory/IO: This is a status line logically equivalent to S2 in maximum mode. When it is low, it indicates that the CPU is performing an I/O operation, and when it is high, it indicates that the CPU is performing a memory operation. This line becomes active in the previous T4 and remains active till the final T4 of the current cycle. It is tristated during local bus hold acknowledge.

INTA - Interrupt Acknowledge: This signal is used as a read strobe for interrupt acknowledge cycles, i.e. when it goes low, the processor has accepted the interrupt.

ALE - Address Latch Enable: This output signal indicates the availability of a valid address on the address/data lines, and is connected to the latch enable input of the latches. This signal is active high and is never tristated.

DT/R - Data Transmit/Receive: This output is used to decide the direction of data flow through the transceivers (bidirectional buffers). When the processor sends out data, this signal is high, and when the processor is receiving data, this signal is low.

DEN - Data Enable: This signal indicates the availability of valid data over the address/data lines. It is used to enable the transceivers (bidirectional buffers) to separate the data from the multiplexed address/data signal. It is active from the middle of T2 until the middle of T4. It is tristated during the hold acknowledge cycle.

HOLD, HLDA - Hold/Hold Acknowledge: When the HOLD line goes high, it indicates to the processor that another master is requesting bus access. The processor, after receiving the HOLD request, issues the hold acknowledge signal on the HLDA pin in the middle of the next clock cycle after completing the current bus cycle. At the same time, the processor floats the local bus and control lines. When the processor detects the HOLD line low, it lowers the HLDA signal. HOLD is an asynchronous input and should be externally synchronized.

If the DMA request is made while the CPU is performing a memory or I/O cycle, it will release the local bus during T4 provided:
1. The request occurs on or before the T2 state of the current cycle.
2. The current cycle is not operating over the lower byte of a word.
3. The current cycle is not the first acknowledge of an interrupt acknowledge sequence.
4. A locked instruction is not being executed.

The following pin functions are applicable for maximum mode operation of 8086.

S2, S1, S0 - Status Lines: These are the status lines which reflect the type of operation being carried out by the processor. They become active during T4 of the previous cycle and remain active during T1 and T2 of the current bus cycle.
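The encoding of the three status lines is not tabulated above; the sketch below uses the standard 8086 maximum mode encoding, which an external bus controller such as the 8288 decodes into the actual command signals. The names BUS_CYCLE and decode_status are assumptions made for this illustration.

```python
# Standard 8086 maximum-mode status encoding, indexed by (S2, S1, S0)
BUS_CYCLE = {
    0b000: "Interrupt acknowledge",
    0b001: "Read I/O port",
    0b010: "Write I/O port",
    0b011: "Halt",
    0b100: "Code access",
    0b101: "Read memory",
    0b110: "Write memory",
    0b111: "Passive",
}


def decode_status(s2: int, s1: int, s0: int) -> str:
    """Return the bus cycle type indicated by the S2, S1, S0 levels."""
    for level in (s2, s1, s0):
        if level not in (0, 1):
            raise ValueError("each status line must be 0 or 1")
    return BUS_CYCLE[(s2 << 2) | (s1 << 1) | s0]


print(decode_status(1, 0, 1))  # Read memory
print(decode_status(0, 0, 1))  # Read I/O port
```

Note that every memory cycle has S2 = 1 and every I/O cycle has S2 = 0, which is why M/IO in minimum mode is described as logically equivalent to S2.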

LOCK: This output pin indicates that other system bus masters will be prevented from gaining the system bus while the LOCK signal is low. The LOCK signal is activated by the LOCK prefix instruction and remains active until the completion of the next instruction. When the CPU is executing a critical instruction which requires the system bus, the LOCK prefix instruction ensures that other processors connected in the system will not gain control of the bus. The 8086, while executing the prefixed instruction, asserts the bus lock signal output, which may be connected to an external bus controller.

By prefetching instructions, there is a considerable speeding up of instruction execution in the 8086. This is known as instruction pipelining.

At the start, CS:IP is loaded with the required address from which execution is to begin. Initially, the queue will be empty and the microprocessor starts a fetch operation to bring in one byte (the first byte) of instruction code if the CS:IP address is odd, or two bytes at a time if the CS:IP address is even.

The first byte is a complete opcode in the case of some instructions (one byte opcode instructions) and is a part of the opcode in the case of other instructions (two byte opcode instructions), where the remaining part of the opcode lies in the second byte. The second byte is then decoded in continuation with the first byte to decide the instruction length and the number of subsequent bytes to be treated as instruction data.

The queue is updated after every byte is read from it, but a fetch cycle is initiated by the BIU only if at least two bytes of the queue are empty; meanwhile the EU may be concurrently executing the fetched instructions. The next byte after an instruction is completed is again the first opcode byte of the next instruction. A similar procedure is repeated till the complete execution of the program.

The fetch operation of the next instruction is overlapped with the execution of the current instruction. As in the architecture, there are two separate units, namely the execution unit and the bus interface unit. While the execution unit is busy executing an instruction, after it is completely decoded, the bus interface unit may be fetching the bytes of the next instruction from memory, depending upon the queue status.

RQ/GT0, RQ/GT1 - Request/Grant: These pins are used by other local bus masters in maximum mode to force the processor to release the local bus at the end of the processor's current bus cycle. Each of the pins is bidirectional, with RQ/GT0 having higher priority than RQ/GT1. The RQ/GT pins have internal pull-up resistors and may be left unconnected.

The Request/Grant sequence is as follows:
1. A pulse one clock wide from another bus master requests bus access from the 8086.
2. During the T4 (current) or T1 (next) clock cycle, a pulse one clock wide from the 8086 to the requesting master indicates that the 8086 has allowed the local bus to float and that it will enter the hold acknowledge state at the next cycle. The CPU bus interface unit is then logically disconnected from the local bus of the system.
3. A one clock wide pulse from the other master indicates to the 8086 that the hold request is about to end and that the 8086 may regain control of the local bus at the next clock cycle.

Thus each master to master exchange of the local bus is a sequence of 3 pulses. There must be at least one dead clock cycle after each bus exchange. The request and grant pulses are active low. For bus requests that are received while the 8086 is performing a memory or I/O cycle, the granting of the bus is governed by the same rules as for HOLD and HLDA in minimum mode.
4. Internal Registers of 8086:


The 8086 has four groups of user accessible internal registers. They are the instruction pointer, four data registers, four pointer and index registers, and four segment registers. The 8086 has a total of fourteen 16-bit registers, including a 16-bit register called the status register, with 9 of its bits implemented as status and control flags.

Most of the registers contain data/instruction offsets within a 64 KB memory segment. There are four different 64 KB segments for instructions, stack, data and extra data. To specify where in the 1 MB of processor memory these 4 segments are located, the processor uses four segment registers:

Code segment (CS) is a 16-bit register containing the address of the 64 KB segment with processor instructions. The processor uses the CS segment for all accesses to instructions referenced by the instruction pointer (IP) register. The CS register cannot be changed directly. It is automatically updated during far jump, far call and far return instructions.

Stack segment (SS) is a 16-bit register containing the address of the 64 KB segment with the program stack. By default, the processor assumes that all data referenced by the stack pointer (SP) and base pointer (BP) registers is located in the stack segment. The SS register can be changed directly using the POP instruction.

Data segment (DS) is a 16-bit register containing the address of the 64 KB segment with program data. By default, the processor assumes that all data referenced by the general registers (AX, BX, CX, DX) and index registers (SI, DI) is located in the data segment. The DS register can be changed directly using the POP and LDS instructions.

Extra segment (ES) is a 16-bit register containing the address of a 64 KB segment, usually with program data. By default, the processor assumes that the DI register references the ES segment in string manipulation instructions. The ES register can be changed directly using the POP and LES instructions.

The data registers are as follows:

Accumulator register consists of two 8-bit registers, AL and AH, which can be combined together and used as a 16-bit register AX. AL in this case contains the low-order byte of the word, and AH contains the high-order byte. The accumulator can be used for I/O operations and string manipulation.

Base register consists of two 8-bit registers, BL and BH, which can be combined together and used as a 16-bit register BX. BL in this case contains the low-order byte of the word, and BH contains the high-order byte. The BX register usually contains a data pointer used for based, based indexed or register indirect addressing.

Count register consists of two 8-bit registers, CL and CH, which can be combined together and used as a 16-bit register CX. When combined, the CL register contains the low-order byte of the word, and CH contains the high-order byte. The count register can be used in loop and shift/rotate instructions and as a counter in string manipulation.

Data register consists of two 8-bit registers, DL and DH, which can be combined together and used as a 16-bit register DX. When combined, the DL register contains the low-order byte of the word, and DH contains the high-order byte. The data register can be used as a port number in I/O operations. In integer 32-bit multiply and divide instructions the DX register contains the high-order word of the initial or resulting number.

The following registers are both general and index registers:

Stack Pointer (SP) is a 16-bit register pointing to the program stack.
Base Pointer (BP) is a 16-bit register pointing to data in the stack segment. The BP register is usually used for based, based indexed or register indirect addressing.
Source Index (SI) is a 16-bit register. SI is used for indexed, based indexed and register indirect addressing, as well as a source data address in string manipulation instructions.
Destination Index (DI) is a 16-bit register. DI is used for indexed, based indexed and register indirect addressing, as well as a destination data address in string manipulation instructions.
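How the 8-bit halves form a 16-bit data register, and how DX and AX together hold a 32-bit result, can be sketched as below. This is a minimal illustration of the bit arithmetic only; the helper names combine, split and mul16 are hypothetical, and mul16 models just the unsigned 16-bit multiply case.

```python
def combine(high: int, low: int) -> int:
    """Form a 16-bit register (e.g. AX) from its halves (e.g. AH, AL)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split(word: int) -> tuple:
    """Return (high byte, low byte) of a 16-bit register value."""
    return (word >> 8) & 0xFF, word & 0xFF


def mul16(ax: int, operand: int) -> tuple:
    """Unsigned 16 x 16 multiply: return (DX, AX) holding the 32-bit product."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    # DX receives the high-order word, AX the low-order word.
    return (product >> 16) & 0xFFFF, product & 0xFFFF


ax = combine(0x12, 0x34)        # AH = 12h, AL = 34h
print(f"{ax:04X}")              # 1234
dx, ax = mul16(ax, 0x0100)      # 1234h * 0100h = 00123400h
print(f"{dx:04X}:{ax:04X}")     # 0012:3400
```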

Other registers:

Instruction Pointer (IP) is a 16-bit register.

The flag register is a 16-bit register containing nine one-bit flags:
Overflow Flag (OF) - set if the result is too large a positive number, or too small a negative number, to fit into the destination operand.
Direction Flag (DF) - if set, string manipulation instructions auto-decrement the index registers; if cleared, the index registers are auto-incremented.
Interrupt-enable Flag (IF) - setting this bit enables maskable interrupts.
Single-step Flag (TF) - if set, a single-step interrupt occurs after the next instruction.
Sign Flag (SF) - set if the most significant bit of the result is set.
Zero Flag (ZF) - set if the result is zero.
Auxiliary carry Flag (AF) - set if there was a carry from or borrow to bits 0-3 in the AL register.
Parity Flag (PF) - set if the parity (the number of "1" bits) in the low-order byte of the result is even.
Carry Flag (CF) - set if there was a carry from or borrow to the most significant bit during the last result calculation.
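The flag definitions can be checked with a small simulation. The following Python sketch is not part of the original notes: the function name add8_flags is hypothetical, and it models only the six status flags for an 8-bit ADD, under the definitions given in this section.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) in the 8086 style."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                      # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1 bits
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),  # carry out of bit 3
        "ZF": int(result == 0),                       # result is zero
        "SF": result >> 7,                            # copy of bit 7
        # Signed overflow: both operands have a sign that the result lacks.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags
```

For example, adding 7FH and 01H gives 80H with OF, SF and AF set, while adding FFH and 01H gives 00H with CF, ZF, PF and AF set and OF clear.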

5. Memory Organization:


The processor provides a 20-bit address to memory which locates the byte being referenced. The memory is organized as a linear array of up to 1 million bytes, addressed as 00000H to FFFFFH. The memory is logically divided into code, data, extra data and stack segments of up to 64K bytes each, with each segment falling on a 16-byte boundary. Program, data and stack memories occupy the same memory space. As most of the processor instructions use 16-bit pointers, the processor can effectively address only 64 KB of memory at a time. To access memory outside of 64 KB, the CPU uses the segment registers to specify where the code, stack and data 64 KB segments are positioned within the 1 MB of memory (see the register organization section above).

16-bit pointers and data are stored as:
Address: low-order byte
Address+1: high-order byte

Program memory - the program can be located anywhere in memory. Jump and call instructions can be used for short jumps within the currently selected 64 KB code segment, as well as for far jumps anywhere within the 1 MB of memory. All conditional jump instructions can be used to jump within approximately -128 to +127 bytes of the current instruction.

Data memory - the processor can access data in any one of the 4 available segments, which limits the size of accessible memory to 256 KB (if all four segments point to different 64 KB blocks). Accessing data from the Data, Code, Stack or Extra segments can usually be done by prefixing instructions with DS:, CS:, SS: or ES: (some registers and instructions may by default use the ES or SS segment instead of the DS segment). Word data can be located at odd or even byte boundaries. The processor uses two memory accesses to read a 16-bit word located at an odd byte boundary. Reading word data from an even byte boundary requires only one memory access.

Stack memory - the stack can be placed anywhere in memory. The stack can be located at odd memory addresses, but this is not recommended for performance reasons (see "Data memory" above).
Physically, the memory is organized as a high bank (D15-D8) and a low bank (D7-D0) of 512K 8-bit bytes each, addressed in parallel by the processor's address lines A19-A1. Byte data with even addresses is transferred on the D7-D0 bus lines, while odd-addressed byte data (A0 HIGH) is transferred on the D15-D8 bus lines. The processor provides two enable signals, BHE and A0, to selectively allow reading from or writing into an odd byte location, an even byte location, or both. The instruction stream is fetched from memory as words and is addressed internally by the processor to the byte level as necessary.
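The address arithmetic and bank selection described in this section can be sketched in Python. This is an illustration only: the function names are hypothetical, and the physical address rule (segment register shifted left by four bits plus the 16-bit offset) is the standard 8086 rule implied by the 16-byte segment boundaries mentioned above.

```python
def physical_address(segment, offset):
    """20-bit physical address: segment shifted left 4 bits plus the offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # wraps at 1 MB on the 8086


def bus_cycles_for_word(address):
    """Aligned (even) words take one bus cycle, odd-addressed words take two."""
    return 1 if address % 2 == 0 else 2


def bank_enables(address, is_word):
    """Return (BHE, A0) for the first bus cycle; both signals are active low."""
    a0 = address & 1
    if is_word and a0 == 0:
        return (0, 0)        # both banks: whole word in one cycle
    if a0 == 0:
        return (1, 0)        # low bank only: even byte on D7-D0
    return (0, 1)            # high bank only: odd byte on D15-D8
```

With CS = FFFFH and IP = 0000H the first function gives FFFF0H, the address at which execution starts after RESET.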


Reserved memory locations:
00000h - 003FFh are reserved for interrupt vectors. Each interrupt vector is a 32-bit pointer in segment:offset format.
FFFF0h - FFFFFh - after RESET the processor always starts program execution at address FFFF0h.

6. Minimum Mode 8086 System:


In a minimum mode 8086 system, the microprocessor 8086 is operated in minimum mode by strapping its MN/MX pin to logic 1. In this mode, all the control signals are given out by the microprocessor chip itself. There is a single microprocessor in the minimum mode system.

The remaining components in the system are latches, transceivers, clock generator, memory and I/O devices. Some type of chip selection logic may be required for selecting memory or I/O devices, depending upon the address map of the system.

Latches are generally buffered-output D-type flip-flops like the 74LS373 or 8282. They are used for separating the valid address from the multiplexed address/data signals and are controlled by the ALE signal generated by the 8086.

Transceivers are bidirectional buffers, sometimes called data amplifiers. They are required to separate the valid data from the time-multiplexed address/data signals. They are controlled by two signals, DEN and DT/R. The DEN signal indicates that valid data is available on the data bus, while DT/R indicates the direction of data, i.e. from or to the processor.

The system contains memory for the monitor and for the user's program storage. Usually, EPROMs are used for monitor storage, while RAM is used for user program storage. A system may contain I/O devices.

The working of the minimum mode configuration can be better described in terms of timing diagrams than by a qualitative description of the operations. The opcode fetch and read cycles are similar. Hence the timing diagram can be categorized in two parts: the first is the timing diagram for the read cycle and the second is the timing diagram for the write cycle.

The read cycle begins in T1 with the assertion of the address latch enable (ALE) signal and also the M/IO signal. During the negative-going edge of ALE, the valid address is latched on the local bus. The BHE and A0 signals address the low byte, the high byte or both bytes. From T1 to T4, the M/IO signal indicates a memory or I/O operation. At T2, the address is removed from the local bus and is sent to the output. The bus is then tristated. The read (RD) control signal is also activated in T2. The RD signal causes the addressed device to enable its data bus drivers. After RD goes low, the valid data is available on the data bus. The addressed device will drive the READY line high. When the processor returns the read signal to the high level, the addressed device will again tristate its bus drivers.

A write cycle also begins with the assertion of ALE and the emission of the address. The M/IO signal is again asserted to indicate a memory or I/O operation. In T2, after sending the address in T1, the processor sends the data to be written to the addressed location. The data remains on the bus until the middle of the T4 state. The WR signal becomes active at the beginning of T2 (unlike RD, which is somewhat delayed in T2 to provide time for the bus to float). The BHE and A0 signals are used to select the proper byte or bytes of the memory or I/O word to be read or written. The M/IO, RD and WR signals indicate the type of data transfer as specified in the table below.
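The way M/IO, RD and WR together identify the transfer type can be sketched as a lookup. This Python fragment is an illustration added here, not the table from the original notes; it assumes the usual 8086 conventions that M/IO is high for memory and low for I/O, and that RD and WR are active low. The names TRANSFER_TYPES and transfer_type are hypothetical.

```python
# Minimum-mode transfer types, keyed by (M/IO, RD, WR).
# RD and WR are active low, so 0 means "asserted".
TRANSFER_TYPES = {
    (0, 0, 1): "I/O read",
    (0, 1, 0): "I/O write",
    (1, 0, 1): "memory read",
    (1, 1, 0): "memory write",
}


def transfer_type(m_io, rd, wr):
    """Name the transfer selected by the three control lines."""
    return TRANSFER_TYPES.get((m_io, rd, wr), "no transfer / invalid")
```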

Write cycle timing diagram


Hold Response sequence: The HOLD pin is checked at the leading edge of each clock pulse. If it is received active by the processor before T4 of the previous cycle or during the T1 state of the current cycle, the CPU activates HLDA in the next clock cycle, and for the succeeding bus cycles the bus is given to the requesting master. The processor does not regain control of the bus until the requesting master drops the HOLD pin low. When the request is dropped by the requesting master, HLDA is dropped by the processor at the trailing edge of the next clock.

Bus Request and Bus Grant Timings in Minimum Mode System

7. Maximum Mode 8086 System:


In the maximum mode, the 8086 is operated by strapping the MN/MX pin to ground. In this mode, the processor outputs the status signals S2, S1 and S0. Another chip, called the bus controller, derives the control signals from this status information. In the maximum mode, there may be more than one microprocessor in the system configuration. The other components in the system are the same as in the minimum mode system.

The basic function of the bus controller chip, the 8288, is to derive control signals like RD and WR (for memory and I/O devices), DEN, DT/R, ALE etc. using the information provided by the processor on the status lines. The bus controller chip has input lines S2, S1, S0 and CLK. These inputs to the 8288 are driven by the CPU. It derives the outputs ALE, DEN, DT/R, MRDC, MWTC, AMWC, IORC, IOWC and AIOWC. The AEN, IOB and CEN pins are especially useful for multiprocessor systems. AEN and IOB are generally grounded. The CEN pin is usually tied to +5V. The significance of the MCE/PDEN output depends upon the status of the IOB pin. If IOB is grounded, it acts as master cascade enable to control cascaded 8259As; otherwise it acts as peripheral data enable, used in multiple bus configurations.

The INTA pin is used to issue two interrupt acknowledge pulses to the interrupt controller or to an interrupting device. IORC and IOWC are the I/O read command and I/O write command signals respectively. These signals enable an I/O interface to read or write the data from or to the addressed port. MRDC and MWTC are the memory read command and memory write command signals respectively and may be used as memory read or write signals. All these command signals instruct the memory to accept or send data from or to the bus. For both of the write command signals, advanced signals, namely AIOWC and AMWC, are available.

The only differences between the timing diagrams of minimum mode and maximum mode are the status signals used and the available control and advanced command signals. S0, S1 and S2 are set at the beginning of the bus cycle. The 8288 bus controller outputs a pulse on ALE and applies the required signal to its DT/R pin during T1. In T2, the 8288 sets DEN = 1, thus enabling the transceivers, and for an input it activates MRDC or IORC. These signals remain active until T4. For an output, AMWC or AIOWC is activated from T2 to T4 and MWTC or IOWC is activated from T3 to T4. The status bits S0 to S2 remain active until T3 and become passive during T3 and T4. If the READY input is not activated before T3, wait states are inserted between T3 and T4.

Timings for RQ/GT Signals: The request/grant response sequence contains a series of three pulses. The request/grant pins are checked at each rising edge of the clock input. When a request is detected and the conditions for a HOLD request are satisfied, the processor issues a grant pulse over the RQ/GT pin immediately during the T4 (current) or T1 (next) state. When the requesting master receives this pulse, it takes control of the bus. When it has finished with the bus, it sends a release pulse to the processor using the RQ/GT pin.
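How the 8288 turns the three status lines into command signals can be sketched as a lookup table. The encoding below follows the standard 8086/8288 status table, which is not reproduced in these notes; the Python names BUS_CYCLES and bus_cycle are hypothetical and used only for illustration.

```python
# 8086 maximum-mode status lines (S2, S1, S0), the bus cycle they indicate,
# and the 8288 command output(s) activated for that cycle.
BUS_CYCLES = {
    (0, 0, 0): ("interrupt acknowledge", "INTA"),
    (0, 0, 1): ("read I/O port", "IORC"),
    (0, 1, 0): ("write I/O port", "IOWC, AIOWC"),
    (0, 1, 1): ("halt", None),
    (1, 0, 0): ("code access", "MRDC"),
    (1, 0, 1): ("read memory", "MRDC"),
    (1, 1, 0): ("write memory", "MWTC, AMWC"),
    (1, 1, 1): ("passive", None),
}


def bus_cycle(s2, s1, s0):
    """Return (cycle name, 8288 command outputs) for a status combination."""
    return BUS_CYCLES[(s2, s1, s0)]
```

The passive state (all three lines high) corresponds to the period in T3 and T4 when the status bits are no longer active.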

Memory Read Timing in Maximum Mode

Memory Write Timing in Maximum mode.

RQ/GT Timings in Maximum Mode.

8. Interrupts:


The processor has the following interrupts:

INTR is a maskable hardware interrupt. The interrupt can be enabled or disabled using the STI/CLI instructions, or by the more complicated method of updating the FLAGS register with the help of the POPF instruction. When an interrupt occurs, the processor stores the FLAGS register on the stack, disables further interrupts, fetches from the bus one byte representing the interrupt type, and jumps to the interrupt processing routine, the address of which is stored in location 4 * <interrupt type>. The interrupt processing routine should return with the IRET instruction.

NMI is a non-maskable interrupt. It is processed in the same way as the INTR interrupt. The interrupt type of the NMI is 2, i.e. the address of the NMI processing routine is stored in location 0008h. This interrupt has higher priority than the maskable interrupt.

Software interrupts can be caused by:
INT 3 instruction - breakpoint interrupt. This is a type 3 interrupt.
INT <interrupt number> instruction - any one of the 256 available interrupts.
INTO instruction - interrupt on overflow.
Single-step interrupt - generated if the TF flag is set. This is a type 1 interrupt. When the CPU processes this interrupt it clears the TF flag before calling the interrupt processing routine.
Processor exceptions: Divide Error (type 0), Unused Opcode (type 6) and Escape Opcode (type 7).

Software interrupt processing is the same as for the hardware interrupts.
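The vector lookup described above (location 4 * interrupt type, holding a segment:offset pointer) can be sketched in Python. The function names and the memory bytearray are hypothetical; the sketch assumes the standard 8086 layout in which the offset (IP) occupies the two lower bytes of the vector and the segment (CS) the two upper bytes, each stored low byte first.

```python
def vector_address(int_type):
    """Physical address of the 4-byte vector for an interrupt type (0-255)."""
    if not 0 <= int_type <= 255:
        raise ValueError("interrupt type must be in 0..255")
    return 4 * int_type


def read_vector(memory, int_type):
    """Return (segment, offset) read from the vector table in memory."""
    base = vector_address(int_type)
    offset = memory[base] | (memory[base + 1] << 8)        # new IP
    segment = memory[base + 2] | (memory[base + 3] << 8)   # new CS
    return segment, offset
```

For the NMI (type 2) this gives address 0008h, and the last vector (type 255) ends at 003FFh, which matches the reserved range 00000h - 003FFh.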

NON-MASKABLE INTERRUPT (NMI): The processor provides a single non-maskable interrupt pin (NMI) which has higher priority than the maskable interrupt request pin (INTR). A typical use would be to activate a power failure routine. The NMI is edge-triggered on a LOW-to-HIGH transition. The activation of this pin causes a type 2 interrupt. (See Instruction Set description.) NMI is required to have a duration in the HIGH state of greater than two CLK cycles, but is not required to be synchronized to the clock. Any high-going transition of NMI is latched on-chip and will be serviced at the end of the current instruction or between whole moves of a block-type instruction. The worst case response to NMI would be for multiply, divide and variable shift instructions. There is no specification on the occurrence of the low-going edge; it may occur before, during, or after the servicing of NMI. Another high-going edge triggers another response if it occurs after the start of the NMI procedure. The signal must be free of logical spikes in general and be free of bounces on the low-going edge to avoid triggering extraneous responses.

MASKABLE INTERRUPT (INTR): The 8086 provides a single interrupt request input (INTR) which can be masked internally by software with the resetting of the interrupt enable FLAG status bit. The interrupt request signal is level triggered. It is internally synchronized during each clock cycle on the high-going edge of CLK. To be responded to, INTR must be present (HIGH) during the clock period preceding the end of the current instruction or the end of a whole move for a block-type instruction. During the interrupt response sequence further interrupts are disabled. The enable bit is reset as part of the response to any interrupt (INTR, NMI, software interrupt or single-step), although the FLAGS register which is automatically pushed onto the stack reflects the state of the processor prior to the interrupt. Until the old FLAGS register is restored the enable bit will be zero unless specifically set by an instruction.

During the response sequence (Figure 6) the processor executes two successive (back-to-back) interrupt acknowledge cycles. The 8086 emits the LOCK signal from T2 of the first bus cycle until T2 of the second. A local bus hold request will not be honored until the end of the second bus cycle. In the second bus cycle a byte is fetched from the external interrupt system (e.g., 8259A PIC) which identifies the source (type) of the interrupt. This byte is multiplied by four and used as a pointer into the interrupt vector lookup table. An INTR signal left HIGH will be continually responded to within the limitations of the enable bit and sample period. The INTERRUPT RETURN instruction includes a FLAGS pop which returns the status of the original interrupt enable bit when it restores the FLAGS.

HALT: When a software HALT instruction is executed the processor indicates that it is entering the HALT state in one of two ways, depending upon which mode is strapped. In minimum mode, the processor issues one ALE with no qualifying bus control signals. In maximum mode, the processor issues the appropriate HALT status on S2, S1 and S0, and the 8288 bus controller issues one ALE. The 8086 will not leave the HALT state when a local bus hold is entered while in HALT. In this case, the processor reissues the HALT indicator. An interrupt request or RESET will force the 8086 out of the HALT state.

INTERRUPT VECTOR TABLE:

9. Advantages & Disadvantages of Segmented memory:


Segmented memory has separate code and data segments, so that one program can work on several different sets of data. This is done by reloading the DS register to point to the new data. Programs that reference logical addresses can be loaded and run anywhere in memory, because the logical address always ranges from 0000h to FFFFh, independent of the code segment base.

The disadvantage of segmented memory is that it introduces extra complexity in both hardware and software. The complexity in software is especially frustrating as the 8086 limits segments to 64K bytes.

Module: 2

1. Introduction:


The 80x86 processors let you access memory in many different ways. The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access variables, arrays, records, pointers, and other complex data types. An addressing mode tells where and how to locate the data to be accessed. In essence, addressing modes are different ways of specifying the operands. The addressing modes of the 8086 are divided into 3 major categories:
DATA MEMORY ADDRESSING MODES
PROGRAM MEMORY ADDRESSING MODES
STACK MEMORY ADDRESSING MODES

2. Data memory addressing modes:


There are 7 addressing modes in this category:
Register addressing mode
Immediate addressing mode
Direct addressing mode
Register indirect addressing mode
Indexed addressing mode
Based-indexed addressing mode
Based-indexed plus displacement addressing mode

These addressing modes are used to address the data present in the data segment of the memory. To access the data segment we need a physical address, which is obtained from the segment base address and the logical address (offset). The segment base address for the data segment is held in the data segment register by default. The offset is obtained using these addressing modes.

Register addressing mode:
Most 8086 instructions can operate on the 8086's general purpose register set. By specifying the name of the register as an operand to the instruction, you may access the contents of that register. Consider the 8086 mov (move) instruction:

    mov destination, source

This instruction copies the data from the source operand to the destination operand. The 8-bit and 16-bit registers are valid operands for this instruction. The only restriction is that both operands must be the same size. Now let's look at some actual 8086 mov instructions:

    mov ax, bx    ; Copies the value from BX into AX
    mov dl, al    ; Copies the value from AL into DL
    mov si, dx    ; Copies the value from DX into SI
    mov sp, bp    ; Copies the value from BP into SP
    mov dh, cl    ; Copies the value from CL into DH
    mov ax, ax    ; Valid
    mov ah, cx    ; Invalid: the operands differ in size

Example: MOV CH, AH : Copies the contents of the upper byte of the accumulator (AH) into the upper byte of the count register (CH).

NOTE: You should never use the segment registers as data registers to hold arbitrary values. They should only contain segment addresses.

Immediate addressing mode:


If an operand is part of an instruction instead of the contents of a register or memory location, it is called an immediate operand. This operand is accessed using the immediate addressing mode. The operand can be either 8-bit or 16-bit data. Immediate operands normally represent constant data. This addressing mode can only be used to specify a source operand.

Consider the instruction: MOV AL, 15H
In this instruction, the source operand is 15H, while the destination operand is register AL. Thus this instruction uses both the immediate and register addressing modes.

Example: Write an instruction that will move the immediate value 1234H into the CX register.
Solution: The instruction must use immediate addressing mode for the source operand and register addressing mode for the destination operand, i.e. MOV CX, 1234H

Direct addressing mode:


This mode is also known as displacement-only addressing mode. In this addressing mode, the logical address or offset of the operand is specified directly in the instruction. The offset address is used directly as the 16-bit offset of the memory location and is added to the segment base address specified by the segment base register (data segment). The displacement-only addressing mode is perfect for accessing simple variables.

Example: Consider the instruction: MOV AX, [1234H]
This instruction moves into the AX register the data in the data segment of memory whose physical address is obtained by adding the segment base address in the data segment register and the offset given directly in the instruction.

Register indirect addressing mode:


In this addressing mode, the offset is specified either in a base register (BX or BP) or in an index register (SI or DI). This effective address is combined with a segment base address in a segment register (the default is the DS register) to form a physical address.

Example: Consider the instruction: MOV AX, [SI]. This instruction moves the contents of the memory location at the physical address given by the combination of the current data segment and the offset in register SI into the AX register.

There are four forms of this addressing mode on the 8086, best demonstrated by the following instructions:

    mov al, [bx]
    mov al, [bp]
    mov al, [si]
    mov al, [di]

The [bx], [si], and [di] modes use the ds segment by default. The [bp] addressing mode uses the stack segment (ss) by default.

Indexed addressing mode:
This is also referred to as based addressing mode. In this addressing mode, the effective address is obtained by adding a direct or indirect displacement (given in the instruction) to the contents of either the BX or the BP register. The value in the base register defines the beginning of a data structure (e.g. an array) in memory, and the displacement selects an element of data within the structure.

Example: Consider the instruction: MOV [BX]+1234H, AL. This instruction uses base register BX and direct displacement 1234H to derive the destination operand.

The offsets generated by these addressing modes are the sum of the constant and the specified register. The addressing modes involving bx, si, and di all use the data segment; the disp[bp] addressing mode uses the stack segment by default.

Based indexed addressing mode:
The based indexed addressing modes are simply combinations of the register indirect addressing modes. These addressing modes form the offset by adding together a base register (bx or bp) and an index register (si or di). The allowable forms for these addressing modes are:

    mov al, [bx][si]
    mov al, [bx][di]
    mov al, [bp][si]
    mov al, [bp][di]

Suppose that bx contains 1000h and si contains 880h. Then the instruction mov al, [bx][si] would load al from location DS:1880h. Likewise, if bp contains 1598h and di contains 1004h, mov ax, [bp+di] will load the 16 bits in ax from locations SS:259Ch and SS:259Dh. The addressing modes that do not involve bp use the data segment by default. Those that have bp as an operand use the stack segment by default.

Based indexed plus displacement mode:
These addressing modes are a slight modification of the based indexed addressing modes, with the addition of an eight bit or sixteen bit constant. The following are some examples of these addressing modes:

    mov al, disp[bx][si]
    mov al, disp[bx+di]
    mov al, [bp+si+disp]
    mov al, [bp][di][disp]

Suppose bp contains 1000h, bx contains 2000h, si contains 120h, and di contains 5. Then mov al,10h[bx+si] loads al from address DS:2130; mov ch,125h[bp+di] loads ch from location SS:112A; and mov bx,cs:2[bx][di] loads bx from location CS:2007.
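The offset calculation and default segment choice for all of the data memory addressing modes can be summarised in a short Python sketch. It is an illustration only: the function effective_address and its parameters are hypothetical names, and the register values are passed in as a dictionary.

```python
def effective_address(base=None, index=None, disp=0, regs=None, override=None):
    """Return (segment_name, offset) for an 8086 memory operand.

    base is "bx", "bp" or None; index is "si", "di" or None.
    """
    if base not in (None, "bx", "bp") or index not in (None, "si", "di"):
        raise ValueError("not a legal 8086 addressing mode")
    regs = regs or {}
    offset = disp
    if base:
        offset += regs[base]
    if index:
        offset += regs[index]
    offset &= 0xFFFF                      # offsets wrap within the 64 KB segment
    # BP-based modes default to the stack segment, everything else to DS,
    # unless a segment override prefix such as cs: is given.
    segment = override or ("ss" if base == "bp" else "ds")
    return segment, offset
```

With bp = 1000h, bx = 2000h, si = 120h and di = 5, the sketch reproduces the three results quoted above: DS:2130h, SS:112Ah and CS:2007h.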

3. Program addressing mode:


Program memory addressing modes, used with the jump and call instructions, consist of 3 distinct forms:
1. direct
2. relative
3. indirect

DIRECT PROGRAM MEMORY ADDRESSING
This form of addressing is also used in high level languages. The microprocessor uses it as well: the instruction for direct program memory addressing stores the address with the opcode.

RELATIVE PROGRAM MEMORY ADDRESSING
It is not available in all earlier microprocessors, but it is available in this family of microprocessors. The term relative means "relative to the instruction pointer" (IP). If a jump instruction skips the next two bytes of memory, the address in relation to the instruction pointer is a 2, which is added to the IP to develop the address of the next program instruction. A 1-byte displacement is used in short jumps and a 2-byte displacement is used in near jumps and calls. Both types are considered to be intrasegment jumps.

INDIRECT PROGRAM MEMORY ADDRESSING
The microprocessor allows several forms of program indirect memory addressing for jump and call instructions. If a 16-bit register holds the address of a jump instruction, the jump is near. For example, if the BX register contains 1000H and a JMP BX instruction executes, the microprocessor jumps to offset address 1000H in the current code segment. If a relative register holds the address, the jump is also considered to be an indirect jump. For example, JMP [BX] refers to the memory location within the data segment at the offset address contained in BX. At this offset address is a 16-bit number that is used as the offset address in the intrasegment jump. This type of jump is sometimes called an indirect-indirect or double-indirect jump.

Example:
JMP AX : jumps to the current code segment location addressed by the contents of AX.
JMP CX : jumps to the current code segment location addressed by the contents of CX.
JMP NEAR PTR [BX] : jumps to the current code segment location addressed by the contents of the data segment memory location addressed by BX.
JMP NEAR PTR [DI+2] : jumps to the current code segment location addressed by the contents of the data segment memory location addressed by DI+2.
JMP TABLE [BX] : jumps to the current code segment location addressed by the contents of the data segment memory location addressed by TABLE+BX.
JMP ECX : jumps to the current code segment location addressed by the contents of ECX (80386 and above).
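The relative form can be illustrated with a short Python sketch. The function name relative_jump_target is hypothetical; the sketch assumes, as is standard for this processor family, that the displacement is a signed value added to the IP of the instruction following the jump.

```python
def relative_jump_target(next_ip, displacement, size=1):
    """Target offset of a short (size=1) or near (size=2) relative jump.

    next_ip is the offset of the instruction following the jump;
    displacement is the raw unsigned value encoded in the instruction.
    """
    bits = 8 * size
    if displacement >= 1 << (bits - 1):       # sign-extend the displacement
        displacement -= 1 << bits
    return (next_ip + displacement) & 0xFFFF  # IP wraps within the code segment
```

A displacement of 2 skips the next two bytes, as in the example in the text, while a displacement byte of FEH (minus 2) jumps backwards by two bytes.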

4. STACK ADDRESSING MODE:


The stack plays an important role in all microprocessors. It holds data temporarily and stores return addresses for procedures. The stack memory is a LIFO (last in, first out) memory, which describes the way that data are stored on and removed from the stack. Data are placed onto the stack with a PUSH instruction and removed with a POP instruction. The CALL instruction also uses the stack to hold the return address for procedures, and a RET (return) instruction removes the return address from the stack. The stack memory is maintained by two registers: the stack pointer (SP or ESP) and the stack segment register (SS). Whenever a word of data is pushed onto the stack, the higher order 8 bits are placed in the location addressed by SP-1 and the lower order 8 bits are placed in the location addressed by SP-2. The SP is then decremented by 2 so that the next word of data is stored in the next available stack memory location. Whenever data are popped from the stack, the lower order 8 bits are removed from the location addressed by SP and the higher order 8 bits are removed from the location addressed by SP+1. The SP is then incremented by 2.
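The push and pop byte ordering described above can be modelled in a few lines of Python. This is a minimal sketch, not taken from the original notes: the class name Stack8086 is hypothetical and it models only a single 64 KB stack segment, ignoring the SS register.

```python
class Stack8086:
    """Minimal model of the 8086 stack inside one 64 KB stack segment."""

    def __init__(self, sp=0x0100):
        self.memory = bytearray(0x10000)   # the stack segment only
        self.sp = sp

    def push(self, word):
        self.memory[(self.sp - 1) & 0xFFFF] = (word >> 8) & 0xFF   # high byte at SP-1
        self.memory[(self.sp - 2) & 0xFFFF] = word & 0xFF          # low byte at SP-2
        self.sp = (self.sp - 2) & 0xFFFF

    def pop(self):
        low = self.memory[self.sp]                       # low byte at SP
        high = self.memory[(self.sp + 1) & 0xFFFF]       # high byte at SP+1
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low
```

Pushing 1234H with SP = 0100H stores 12H at 00FFH and 34H at 00FEH and leaves SP = 00FEH; popping returns the words in last in, first out order.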

Example:
POPF: Removes a word from the stack and places it into the flags.
PUSHA: Copies the word contents of AX, CX, DX, BX, SP, BP, SI and DI onto the stack.
POPA: Removes the data from the stack and places it into DI, SI, BP, BX, DX, CX and AX (the stored SP value is discarded).
PUSHAD: Copies the double word contents of EAX, ECX, EDX, EBX, ESP, EBP, ESI and EDI onto the stack.
PUSHA and POPA are available from the 80186 onwards, and PUSHAD on the 80386 and above.

An Easy Way to Remember the 8086 Memory Addressing Modes:
There are a total of 17 different legal memory addressing modes on the 8086: disp, [bx], [bp], [si], [di], disp[bx], disp[bp], disp[si], disp[di], [bx][si], [bx][di], [bp][si], [bp][di], disp[bx][si], disp[bx][di], disp[bp][si], and disp[bp][di]. You could memorize all these forms so that you know which are valid (and, by omission, which forms are invalid). However, there is an easier way than memorizing these 17 forms. Consider the chart:

If you choose zero or one items from each of the columns and wind up with at least one item, you've got a valid 8086 memory addressing mode. Some examples:

Choose disp from column one, nothing from column two, [di] from column three: you get disp[di].
Choose disp, [bx], and [di]: you get disp[bx][di].
Skip columns one and two, choose [si]: you get [si].
Skip column one, choose [bx], and then choose [di]: you get [bx][di].

Likewise, if you have an addressing mode that you cannot construct from this table, then it is not legal. For example, disp[dx][si] is illegal because you cannot obtain [dx] from any of the columns above.
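The "zero or one item from each column" rule can be checked mechanically. In the Python sketch below the three columns are reconstructed from the examples in the text (displacement, base register, index register), since the chart itself is not reproduced here; the names COLUMNS, legal_modes and is_legal are hypothetical.

```python
from itertools import product

COLUMNS = [
    ("", "disp"),            # column one: optional displacement
    ("", "[bx]", "[bp]"),    # column two: optional base register
    ("", "[si]", "[di]"),    # column three: optional index register
]


def legal_modes():
    """Every non-empty pick of zero or one item per column."""
    modes = ["".join(choice) for choice in product(*COLUMNS)]
    return [m for m in modes if m]


def is_legal(mode):
    """True if the mode can be built from the three columns."""
    return mode.replace(" ", "") in legal_modes()
```

The 2 x 3 x 3 = 18 possible picks, minus the empty one, give exactly the 17 legal modes listed above, and disp[dx][si] is rejected because [dx] appears in no column.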

5. Physical Address generation in different addressing modes:


The physical address is generated by adding the offset to the segment base address. The way the offset is selected varies from one addressing mode to another. The generation of the physical address for the different modes is shown in figure 5.1.

Figure 5.1

Module: 3

1. Introduction:


The 80286 is an advanced 16-bit high-performance microprocessor designed for multi-user and multi-tasking applications that require low power and high performance. The M80C286 is fully compatible with its predecessors and object-code compatible with the 8086 and 80386 family of products. The 80286 has built-in memory protection that maintains a four-level protection mechanism for task isolation, a hardware task switching facility, and memory management capabilities that map 2^30 bytes (one gigabyte) of virtual address space per task (per user) into 2^24 bytes (16 megabytes) of physical memory. The 80286 operates in real mode and protected mode. Using 8086 real address mode, the 80286 is object code compatible with existing 8086 and 8088 software. In protected virtual address mode, the 80286 is source code compatible with 8086 and 8088 software, which may require upgrading to use the virtual addresses supported by the 80286's integrated memory management and protection mechanism. Both modes operate at full 80286 performance and execute a superset of the 8086 instructions.
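The two address space sizes quoted above follow from simple bit counts. The Python sketch below is an added illustration; it relies on a detail of the 80286 protected mode that is not stated in these notes, namely that a segment selector provides a 13-bit descriptor index plus a 1-bit table indicator. The constant names are hypothetical.

```python
SELECTOR_INDEX_BITS = 13   # descriptor index inside a descriptor table
TABLE_INDICATOR_BITS = 1   # selects the global or the local descriptor table
OFFSET_BITS = 16           # each segment is at most 64 KB
ADDRESS_LINES = 24         # physical address lines A23-A0

# Number of segments a task can name, and the resulting address space sizes.
segments_per_task = 1 << (SELECTOR_INDEX_BITS + TABLE_INDICATOR_BITS)
virtual_bytes = segments_per_task << OFFSET_BITS
physical_bytes = 1 << ADDRESS_LINES
```

This gives 16384 segments of up to 64 KB each, i.e. 2^30 bytes of virtual address space per task, against 2^24 bytes of physical memory.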

2. Architecture of 80286:
The 80286 is an advanced, high-performance 16-bit microprocessor with specially optimized capabilities for multiple user and multi-tasking systems. Depending on the application, a 10 MHz 80286's performance is up to eight times faster than the standard 5 MHz 8086's. It has 68 pins, including 24 address lines. The 80286 operates in two modes: 8086 real address mode and protected virtual address mode. In 8086 real address mode, programs use real addresses with up to one megabyte of address space. In protected virtual address mode, also called protected mode, programs use virtual addresses. In protected mode, the 80286 CPU automatically maps 1 gigabyte of virtual addresses per task into a 16 megabyte real address space. This mode also provides memory protection to isolate the operating system and ensure the privacy of each task's programs and data. Both modes provide the same base instruction set, registers, and addressing modes. The architecture of the 80286 is shown in figure 2.1.
Figure 2.1

The architecture of the 80286 is divided into four sections:
1. BIU (Bus Interface Unit)
2. IU (Instruction Unit)
3. EU (Execution Unit)
4. AU (Address Unit)

BUS INTERFACE UNIT: This unit controls access to the bus. It interfaces the system bus with the internal bus and communicates with the physical memory by transferring data between the CPU and memory. It consists of the following units:
1. Address latches and drivers
2. Prefetcher & processor extension interface
3. Bus control
4. Data transceivers
5. Prefetch queue

Address latches & drivers: This unit latches the physical address generated by the address unit and places the 24-bit address (A23 to A0) onto the address bus. It makes use of the BHE and M/IO signals when latching the address.

Prefetcher & processor extension interface: This unit prefetches instructions from the physical memory according to the address. It fetches the instructions to be executed in advance, in order to pipeline the operation, and places them in a 6-byte queue from which they are decoded one by one. In addition, this unit provides the interface to a processor extension such as the 80287 math coprocessor. It handles the PEACK and PEREQ signals used to interface the processor extension with the 80286.

Bus control: This unit controls the bus operations. It handles bus control signals such as LOCK and the interrupt signals, and it controls the transfer of bus ownership to other bus masters and external devices according to the bus control signals.

Data transceivers: The data transceivers act as a bridge between the CPU and memory for transferring data. They enable 16-bit data transfers to and from the physical memory by placing the 16-bit data onto the data bus.

Prefetch queue: This is a 6-byte queue which holds the prefetched instruction bytes. The instructions in the queue are decoded and executed one by one. Whenever the queue is full the prefetcher stops fetching instructions from memory, and it resumes fetching only when two bytes of the queue are free.

INSTRUCTION UNIT: The instruction unit decodes the instructions in the prefetch queue. After decoding, the instructions are stored in the instruction queue, which can hold 3 decoded instructions. The execution unit executes the instructions from the instruction queue.

EXECUTION UNIT: The execution unit is responsible for executing the instructions. It has a control unit, an ALU and a set of registers. The control unit directs execution and transfers control in response to exceptions and processor extension signals. The ALU performs integer arithmetic and logical operations. The 80286 has the same set of registers as the 8086, which can be used to store intermediate results, base addresses and offsets. The register set of the 80286 is shown in figure 2.2.
Figure 2.2

The 80286 has fourteen 16-bit registers, including the flag register. The flag register is shown below. Compared with the 8086, the 80286 flag register has two additional fields, IOPL and NT. IOPL is the 2-bit I/O privilege level, which gives the privilege level required to perform I/O operations. NT is the nested task flag; if it is set, the current task is nested within another task. In addition to the flag register, the 80286 has an MSW (machine status word) register. The MSW has four defined bits: TS (task switch), PE (protection enable), MP (monitor processor extension) and EM (emulate processor extension). TS is set when the processor switches from one task to another. PE is set when the processor is operating in protected mode. MP and EM are set according to the processor extension in use.
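To make the MSW and the additional flag fields concrete, the following Python sketch extracts them from 16-bit register values. The bit positions (PE, MP, EM, TS in MSW bits 0 to 3; IOPL in flag bits 13-12; NT in flag bit 14) are taken from Intel's 80286 documentation rather than from the text above, and the function names are hypothetical.

```python
# Bit positions assumed from Intel's 80286 documentation (not stated in the notes).
MSW_BITS = {"PE": 0, "MP": 1, "EM": 2, "TS": 3}


def decode_msw(msw: int) -> dict:
    """Return the four machine status word bits as booleans."""
    return {name: bool((msw >> bit) & 1) for name, bit in MSW_BITS.items()}


def decode_286_flags(flags: int) -> dict:
    """Extract the two fields the 80286 adds to the 8086 flag word."""
    return {
        "IOPL": (flags >> 12) & 0b11,   # 2-bit I/O privilege level (bits 13-12)
        "NT": bool((flags >> 14) & 1),  # nested task flag (bit 14)
    }


print(decode_msw(0x0001))        # only PE set: protected mode enabled
print(decode_286_flags(0x7000))  # IOPL = 3 and NT set
```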


ADDRESS UNIT: The address unit computes the physical address by adding the segment base address to the offset (logical address). In real mode, the physical address is computed as in the 8086: the 16-bit segment value is shifted left by four bits and the offset is added to it. In real mode the processor can address only 1 MB of memory. In protected mode, the physical address is computed using the descriptor tables, and the 24 address lines allow 16 MB of physical memory to be addressed. In addition, the processor can address 1 GB of virtual memory. The built-in memory management unit handles address generation in protected mode. The computed address is placed on the address bus through the bus interface unit.

These are the major components of the 80286 architecture. It is an enhanced version of the 8086 architecture with a built-in MMU, which allows the 80286 to work in protected virtual address mode.
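The real-mode calculation described above can be sketched in a few lines of Python. The function name `real_mode_address` and the sample values are chosen only for illustration.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the real-mode physical address for a segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shifting left by four bits multiplies the segment value by 16.
    return (segment << 4) + offset


# Example: segment 1234H, offset 0010H
print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
```

Shifting by four bits is the same as appending one hexadecimal zero to the segment value, which is why 1234H becomes 12340H before the offset is added.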

3. Functional description:


The 80286 has 68 pins, most of which are similar to those of the 8086. It has a 16-bit data bus and a 24-bit non-multiplexed address bus. The 24-bit address bus allows the processor to access 16 Mbytes of physical memory when operating in protected mode. Memory hardware for the 80286 is set up as an odd bank and an even bank, just as it is for the 8086. The even bank is enabled when A0 is low and the odd bank is enabled when BHE# is low. To access an aligned word, both A0 and BHE# are low. External buffers are used on both the address and the data bus.

From a control standpoint, the 80286 functions similarly to an 8086 operating in maximum mode. The status signals S0#, S1# and M/IO# are decoded by an external 82288 bus controller to produce the control bus read, write and interrupt-acknowledge signals. The HOLD, HLDA, INTR, INTA#, NMI, READY#, LOCK# and RESET pins function basically the same as they do on an 8086. An external 82284 clock generator is used to produce the clock signal for the 80286 and to synchronize the RESET and READY# signals.

The final four signal pins are used to interface with processor extensions such as the 80287 math coprocessor. The processor extension request (PEREQ) input is asserted by a coprocessor to tell the 80286 to perform a data transfer to or from memory for it. When the 80286 performs the transfer, it asserts the processor extension acknowledge (PEACK#) signal to let the coprocessor know that the data transfer has started. The BUSY# input on the 80286 functions the same way as the TEST# input does on the 8086: when the 80286 executes a WAIT instruction, it remains in a wait loop until it finds the BUSY# signal from the coprocessor high. If a coprocessor detects an error during processing, it asserts the ERROR# input of the 80286.
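The bank selection described above can be summarised as a small decision function. This is a sketch only: `bank_select` is a hypothetical name, and both inputs are the logic levels on the pins (BHE# is active low).

```python
def bank_select(a0: int, bhe_n: int) -> str:
    """Describe the transfer selected by A0 and the active-low BHE# signal."""
    even_enabled = (a0 == 0)     # even bank sits on data lines D7-D0
    odd_enabled = (bhe_n == 0)   # odd bank sits on data lines D15-D8
    if even_enabled and odd_enabled:
        return "word transfer (both banks)"
    if even_enabled:
        return "byte transfer, even bank (D7-D0)"
    if odd_enabled:
        return "byte transfer, odd bank (D15-D8)"
    return "no bank selected"


for a0, bhe_n in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"A0={a0} BHE#={bhe_n}: {bank_select(a0, bhe_n)}")
```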

4. Real Mode operation:


The M80C286 executes a fully upward-compatible superset of the 8086 instruction set in real address mode. In real address mode the 80286 is object-code compatible with 8086 and 8088 software. The real address mode architecture (registers and addressing modes) is the same as the base architecture described in the sections above. In real mode the 80286 simply acts as a fast 8086, and its instruction set is upward compatible with that of the 8086.

In real mode the 80286 addresses only 1 Mbyte of physical memory, using A0-A19. Address lines A20-A23 are not used in real mode operation. The physical address is computed in the same way as in the 8086: the 16-bit segment value is padded with four zero bits on the right (shifted left by four bits) and added to the offset. All segments in real address mode are 64K bytes in size and may be read, written, or executed. If the information contained in a segment does not use the full 64K bytes, the unused end of the segment may be overlaid by another segment to reduce physical memory requirements.

The 80286 reserves two fixed areas of physical memory:
- System initialization (FFFF0H to FFFFFH)
- Interrupt vector table (00000H to 003FFH)
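The size of the interrupt vector area follows from the vector layout. The short Python sketch below assumes the standard real-mode layout of 256 vectors of four bytes each (a 16-bit IP followed by a 16-bit CS); the helper name `vector_address` is hypothetical.

```python
VECTOR_SIZE = 4      # bytes per vector: 16-bit IP followed by 16-bit CS
VECTOR_COUNT = 256   # interrupt types 0..255


def vector_address(type_number: int) -> int:
    """Return the physical address of the vector for an interrupt type."""
    if not 0 <= type_number < VECTOR_COUNT:
        raise ValueError("interrupt type must be 0..255")
    return type_number * VECTOR_SIZE


last_byte = vector_address(255) + VECTOR_SIZE - 1
print(f"{vector_address(2):05X}H")  # type 2 (NMI) vector starts at 00008H
print(f"{last_byte:05X}H")          # table ends at 003FFH
```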

5. Protected Virtual Address mode:


The 80286 is the first member of the 8086 family to support virtual memory and memory management. The processor can address 1 Gbyte of virtual memory. The M80C286 enters protected virtual address mode from real address mode by setting the PE (Protection Enable) bit of the machine status word. All registers, instructions, and addressing modes remain the same. Programs for the 8086, 8088, 80186, and real address mode 80286 can be run in protected mode.

The protected mode 80286 provides a 1 gigabyte virtual address space per task, mapped into a 16 megabyte physical address space defined by the address pins A23-A0 and BHE. The virtual address space may be larger than the physical address space, since any use of an address that does not map to a physical memory location causes a restartable exception.

The virtual memory is mapped into the physical memory by swapping and un-swapping. The program to be executed is split into several portions and stored in secondary memory. The portion of the program that needs to be executed at a given instant is transferred from secondary memory to physical memory; this is referred to as swapping. If intermediate results produced during execution are needed later, they can be moved from physical memory to secondary memory; this is called un-swapping.

As in real address mode, protected mode uses 32-bit pointers consisting of 16-bit selector and 16-bit offset components. The selector, however, specifies an index into a memory-resident table rather than the upper 16 bits of a real memory address. The 24-bit base address of the desired segment is obtained from the tables in memory. The table entry that holds the 24-bit base address of the segment is called a descriptor. The address formation is shown in figure 5.1.
The descriptor is a block of contiguous memory locations containing information about a segment, such as:
- Segment base address & segment limit
- Segment type & privilege level
- Segment availability in physical memory
- Descriptor type
- Segment use by another task
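The selector-to-physical-address path described above can be sketched as follows. This is a simplified model under stated assumptions: the selector layout (13-bit index, one table-indicator bit, 2-bit requested privilege level) comes from Intel's documentation rather than from the text, only expand-up segments are handled, access-rights checks are omitted, and the names `Descriptor`, `split_selector` and `translate_286` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Descriptor(NamedTuple):
    base: int    # 24-bit segment base address
    limit: int   # 16-bit segment limit (highest valid offset)


def split_selector(selector: int) -> Tuple[int, int, int]:
    """Split a 16-bit selector into (index, table indicator, RPL)."""
    index = (selector >> 3) & 0x1FFF        # bits 15-3
    table_indicator = (selector >> 2) & 1   # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 0b11                   # bits 1-0
    return index, table_indicator, rpl


def translate_286(selector: int, offset: int,
                  gdt: List[Descriptor], ldt: List[Descriptor]) -> int:
    """Return the 24-bit physical address for selector:offset."""
    index, table_indicator, _rpl = split_selector(selector)
    table = ldt if table_indicator else gdt
    descriptor = table[index]
    if offset > descriptor.limit:
        raise ValueError("offset exceeds segment limit")
    return (descriptor.base + offset) & 0xFFFFFF


# Hypothetical table: entry 1 describes a 4 KB segment at 120000H.
gdt = [Descriptor(0, 0), Descriptor(0x120000, 0x0FFF)]
print(hex(translate_286(0x0008, 0x0010, gdt, [])))  # 0x120010

# 8192 descriptors per table x 2 tables x 64 KB per segment = 1 GB per task.
assert 2**13 * 2 * 2**16 == 2**30
```

The final assertion shows where the 1 gigabyte figure comes from: 14 selector bits choose a segment and 16 offset bits address within it, giving 30 bits of virtual address.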

Figure 5.1

6. Descriptor & their types:


Descriptors define the use of memory. Special types of descriptors also define new functions for transfer of control and task switching. A descriptor carries all the relevant information about a segment and its access rights. The 80286 has segment descriptors for code, stack and data segments, and system control descriptors for special system data segments and control transfer operations.

6.1 Code & Data segment descriptors:
Besides segment base addresses, code and data descriptors contain other segment attributes including segment size (1 to 64K bytes), access rights (read only, read/write, execute only, and execute/read), and presence in memory (for virtual memory systems).

The code/data segment descriptor consists of:
- 16-bit segment limit
- 24-bit segment base address
- 8-bit access rights byte
- Remaining 16 bits, reserved by Intel

The code/data segment descriptor is shown in figure 6.1.
Figure 6.1

If the S bit in the descriptor is 1, the descriptor refers to a code or data segment; if the S bit is 0, it is a system segment descriptor or a gate descriptor. Code segments are distinguished from data segments by the E bit: if E = 1 it is a code segment descriptor, otherwise it is a data segment descriptor. The present bit (P) indicates whether the segment is available in physical memory; if P = 1, the segment is mapped into physical memory. The accessed bit (A) indicates whether the segment has been accessed previously.

Data segment:
A data segment (S = 1 and E = 0) may be either read-only or read-write depending on the W bit of the corresponding descriptor: if W = 0 the segment is read-only; if W = 1 it is read-write. The ED (expansion direction) bit determines the direction in which the data area extends. For normal data segments the segment expands upward (ED = 0); for stack segments it expands downward (ED = 1).

Code segment:
A code segment (S = 1 and E = 1) may be execute-only or execute-read, as determined by the readable (R) bit. If R = 0 the code segment is execute-only; otherwise it is readable and executable.

DPL (Descriptor Privilege Level) defines the privilege level of the segment.

Accessed (A):
A = 0: the segment has not been accessed.
A = 1: the segment selector has been loaded into a segment register.

6.2 System Descriptors: (Type 1 to 3)
In addition to code and data segment descriptors, the protected mode M80C286 defines System Segment Descriptors. These descriptors define special system data segments which contain a table of descriptors (Local Descriptor Table Descriptor) or segments which contain the execution state of a task. System segment descriptors are of 7 types, as shown in figure 6.2. Types 1-3 are called system descriptors and types 4-7 are called gate descriptors.
Figure 6.2

The system descriptor contains:
- A 16-bit segment limit and a 24-bit segment base address
- An access rights byte containing the P bit, the 2-bit DPL, the S bit (0) and a 4-bit type field
- A last word that is reserved by Intel

Type 1: Available Task State Segment (TSS)
Type 2: Local descriptor table
Type 3: Busy Task State Segment
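The access rights byte discussed in sections 6.1 and 6.2 can be decoded with a few bit masks. The sketch below assumes the usual Intel bit layout (P in bit 7, DPL in bits 6-5, S in bit 4, then either E, ED, W/R, A or the 4-bit type field); these positions and the gate names for types 4 to 7 come from Intel's documentation and should be checked against figures 6.1 and 6.2. The function name `decode_access_byte` is hypothetical.

```python
# Type names for S = 0 descriptors; types 4-7 are the gate descriptors.
SYSTEM_TYPES = {
    1: "available TSS", 2: "LDT", 3: "busy TSS",
    4: "call gate", 5: "task gate", 6: "interrupt gate", 7: "trap gate",
}


def decode_access_byte(access: int) -> dict:
    """Decode an 80286 access rights byte into named fields."""
    info = {
        "present": bool(access & 0x80),   # P bit
        "dpl": (access >> 5) & 0b11,      # descriptor privilege level
    }
    if access & 0x10:                     # S = 1: code or data segment
        executable = bool(access & 0x08)  # E bit
        info["kind"] = "code" if executable else "data"
        if executable:
            info["readable"] = bool(access & 0x02)     # R bit
        else:
            info["expand_down"] = bool(access & 0x04)  # ED bit
            info["writable"] = bool(access & 0x02)     # W bit
        info["accessed"] = bool(access & 0x01)         # A bit
    else:                                 # S = 0: system or gate descriptor
        info["kind"] = SYSTEM_TYPES.get(access & 0x0F, "invalid or reserved")
    return info


print(decode_access_byte(0x92))  # present, DPL 0, writable data segment
print(decode_access_byte(0x9A))  # present, DPL 0, readable code segment
print(decode_access_byte(0x82))  # present, DPL 0, LDT descriptor
```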

6.3 Gate descriptors: (Type 4 to 7)
The gate descriptors contain information regarding:
- The destination of the control transfer
- Required stack manipulations
- Whether it is present in the physical memory or not
- Privilege level
- Type

Gate descriptors provide a mechanism to keep track of the source and destination of a control transfer. Hence the CPU can perform protection checks and control the entry points of the destination code. Call gates are used to alter the privilege level. Task gates are used to switch from one task to another. Interrupt and trap gates are used to specify the corresponding service routines. The word count field is used only by a call gate descriptor, to indicate the number of words to be copied from the stack of the calling routine to the stack of the called routine. The gate descriptor is shown in figure 6.3.
Figure 6.3


7. Protection:
The 80286 supports the following three basic mechanisms to provide protection:
1. Restricted use of segments. This is accomplished with the help of read/write privileges. Segment usage is restricted by classifying the corresponding descriptors under the LDT and GDT.
2. Restricted access to segments. This is accomplished using descriptor usage limitations and the rules of privilege checking, i.e. DPL and CPL.
3. Privileged instructions or operations. These may be executed or carried out only at certain privilege levels, determined by the CPL and the I/O privilege level (IOPL) defined in the flag register.
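The privilege rules in items 2 and 3 can be expressed as two comparisons. The sketch below states the data-access rule as documented by Intel, including the selector's requested privilege level (RPL), which the list above does not mention; the function names are hypothetical, and numerically lower levels are more privileged.

```python
def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """A data segment may be loaded if the weaker of CPL and RPL is at
    least as privileged as the segment's DPL (lower number = more privilege)."""
    return max(cpl, rpl) <= dpl


def io_allowed(cpl: int, iopl: int) -> bool:
    """I/O instructions are permitted when CPL is at least as privileged as IOPL."""
    return cpl <= iopl


print(data_access_allowed(cpl=3, rpl=3, dpl=0))  # False: application touching OS data
print(data_access_allowed(cpl=0, rpl=0, dpl=3))  # True: OS touching application data
print(io_allowed(cpl=3, iopl=0))                 # False: I/O reserved for level 0
```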

8. Interrupts:
An interrupt transfers execution to a new program location. The old program address (CS:IP) and machine state (Flags) are saved on the stack to allow resumption of the interrupted program. Interrupts fall into three classes: hardware-initiated interrupts, INT instructions, and instruction exceptions. Hardware-initiated interrupts occur in response to an external input and are classified as non-maskable or maskable. Programs may cause an interrupt with an INT instruction. Instruction exceptions occur when an unusual condition, which prevents further instruction processing, is detected while attempting to execute an instruction. The return address from an exception will always point at the instruction causing the exception and include any leading instruction prefixes. A table containing up to 256 pointers defines the proper interrupt service routine for each interrupt. Interrupts 0 to 31 (some of which are used for instruction exceptions) are reserved. The interrupt vector table and type numbers are shown in Table 8.1.

8.1 Interrupt priority:
When simultaneous interrupt requests occur, they are processed in a fixed order as shown in Table 8.2. Interrupt processing involves saving the flags and return address, and setting CS:IP to point at the first instruction of the interrupt handler. If other interrupts remain enabled, they are processed before the first instruction of the current interrupt handler is executed. The last interrupt processed is therefore the first one serviced.

Table 8.1

Table 8.2


9. 80286 System Connection:


The 80286 system connection is shown in figure 9.1.


Module 4:
1. Introduction:


The 80386 is a 32-bit processor that supports 8-bit, 16-bit and 32-bit data operands. The 80386 instruction set is upward compatible with all its predecessors. The 80386 can run 8086 applications under protected mode in its virtual 8086 mode of operation. With its 32-bit address bus, the 80386 can address up to 4 Gbytes of physical memory. The memory is organized in terms of segments of at most 4 Gbytes each. The 80386 CPU supports 16K segments, and thus the total virtual space is 4 Gbytes * 16K = 64 Terabytes. The memory management section supports virtual memory, paging and 4 levels of protection. The 80386 operates at clock frequencies of 20-33 MHz.
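The 64 terabyte figure can be checked with a line of arithmetic. The following Python lines are only a sanity check of the numbers quoted above; the variable names are illustrative.

```python
SEGMENT_COUNT = 16 * 1024         # 16K segments (from the text)
MAX_SEGMENT_BYTES = 4 * 1024**3   # 4 Gbytes per segment (from the text)

virtual_bytes = SEGMENT_COUNT * MAX_SEGMENT_BYTES
print(virtual_bytes == 2**46)     # True: 2^14 segments x 2^32 bytes
print(virtual_bytes // 1024**4)   # 64 (terabytes)
```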

2. 80386 Architecture:
The internal architecture of the 80386 is divided into 3 sections, as shown in figure 2.1:
- Central processing unit
- Memory management unit
- Bus interface unit

The Central processing unit is further divided into the Execution unit and the Instruction unit. The Execution unit has 8 general purpose and 8 special purpose registers, which are used either for handling data or for calculating offset addresses. The Instruction unit decodes the opcode bytes received from the 16-byte instruction code queue and arranges them in a 3-instruction decoded instruction queue. After decoding, this unit passes them to the control section, which derives the necessary control signals. The barrel shifter increases the speed of all shift and rotate operations. The multiply / divide logic implements the bit-shift-rotate algorithms to complete the operations in minimum time. Even 32-bit multiplications can be executed within one microsecond by the multiply / divide logic.

The Memory management unit consists of a Segmentation unit and a Paging unit. The Segmentation unit allows the use of two address components, viz. segment and offset, for relocatability and sharing of code and data. The Segmentation unit allows segments of at most 4 Gbytes in size.

Figure 2.1

The Paging unit organizes the physical memory in terms of pages of 4 Kbytes each. The paging unit works under the control of the segmentation unit, i.e. each segment is further divided into pages. The virtual memory is also organized in terms of segments and pages by the memory management unit. The Segmentation unit provides a 4-level protection mechanism for protecting and isolating the system code and data from those of the application program. The paging unit converts linear addresses into physical addresses. The control and attribute PLA checks the privileges at the page level. Each of the pages maintains the paging information of the task. The limit and attribute PLA checks segment limits and attributes at the segment level to avoid invalid accesses to code and data in the memory segments.

The Bus control unit has a prioritizer to resolve the priority of the various bus requests. This controls access to the bus. The address driver drives the byte enable and address signals (BE0#-BE3# and A2-A31). The pipeline and dynamic bus sizing unit handles the related control signals. The data buffers interface the internal data bus with the system bus.

3. Signal descriptions:
CLK2: This input pin provides the basic system clock timing for the operation of the 80386.
D0-D31: These 32 lines act as the bidirectional data bus during different access cycles.
A31-A2: These are the upper 30 bits of the 32-bit address bus.
BE0# to BE3#: With its 32-bit data bus, the memory system of the 80386 can be viewed as a 4-byte wide memory access mechanism. The 4 byte enable lines, BE0# to BE3#, are used for enabling these 4 banks. Using these 4 enable lines, the CPU may transfer 1, 2, 3 or 4 bytes of data simultaneously.
W/R#: The write / read output distinguishes write cycles from read cycles.
D/C#: This data / control output pin distinguishes a data transfer cycle from a machine control cycle such as interrupt acknowledge.
M/IO#: This output pin differentiates between memory and I/O cycles.
LOCK#: The LOCK# output pin enables the CPU to prevent other bus masters from gaining control of the system bus.
NA#: The next address input pin, if activated, allows address pipelining during 80386 bus cycles.
ADS#: The address status output pin indicates that the address bus and the bus cycle definition pins (W/R#, D/C#, M/IO#, BE0# to BE3#) are carrying valid signals. The 80386 does not have an ALE signal, so this signal may be used for latching the address into external latches.
READY#: The ready signal indicates to the CPU that the previous bus cycle has been terminated and the bus is ready for the next cycle. The signal is used to insert WAIT states in a bus cycle and is useful for interfacing slow devices with the CPU.
VCC: These are the system power supply lines.
VSS: These are the return lines for the power supply.
BS16#: The bus size 16 input pin allows 16-bit devices to be interfaced with the 32-bit wide 80386 data bus. Successive 16-bit bus cycles may be executed to read 32-bit data from a peripheral.

HOLD: The bus hold input pin enables other bus masters to gain control of the system bus when it is asserted.
HLDA: The bus hold acknowledge output indicates that a valid bus hold request has been received and the bus has been relinquished by the CPU.
BUSY#: The busy input signal indicates to the CPU that the coprocessor is busy with the allocated task.
ERROR#: The error input pin indicates to the CPU that the coprocessor has encountered an error while executing its instruction.
PEREQ: The processor extension request input signal tells the CPU to fetch a data word for the coprocessor.
INTR: This interrupt pin is a maskable interrupt input that can be masked using the IF bit of the flag register.
NMI: A valid request signal at the non-maskable interrupt request input pin internally generates a non-maskable interrupt of type 2.
RESET: A high at this input pin suspends the current operation and restarts execution from the starting location.
N / C: No-connection pins are expected to be left open when connecting the 80386 in a circuit.
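As an illustration of how the byte enable lines BE0# to BE3# described above replace address bits A1 and A0, the following Python sketch computes the active-low levels for a transfer that fits within one aligned doubleword. The function name `byte_enables` is hypothetical, and misaligned transfers that need two bus cycles are not modelled.

```python
from typing import List


def byte_enables(address: int, size: int) -> List[int]:
    """Return pin levels [BE0#, BE1#, BE2#, BE3#] (0 = asserted) for one bus cycle."""
    first_lane = address & 0b11   # byte lane selected by the two low address bits
    if size not in (1, 2, 3, 4) or first_lane + size > 4:
        raise ValueError("transfer must fit within one aligned doubleword")
    return [0 if first_lane <= lane < first_lane + size else 1
            for lane in range(4)]


print(byte_enables(0x1000, 4))  # [0, 0, 0, 0]: all four bytes
print(byte_enables(0x1002, 2))  # [1, 1, 0, 0]: upper word only
print(byte_enables(0x1001, 1))  # [1, 0, 1, 1]: single byte on lane 1
print(hex(0x1002 >> 2))         # value driven on A31-A2
```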

4. Register organization:
The 80386 has eight 32-bit general purpose registers, which may also be used as 8-bit or 16-bit registers. A 32-bit register, known as an extended register, is represented by the register name with the prefix E. Example: the 32-bit register corresponding to AX is EAX, similarly BX is EBX, etc. The 16-bit registers BP, SP, SI and DI of the 8086 are now available in an extended size of 32 bits and are named EBP, ESP, ESI and EDI. AX represents the lower 16 bits of the 32-bit register EAX. BP, SP, SI and DI represent the lower 16 bits of their 32-bit counterparts and can be used as independent 16-bit registers. The six segment registers available in the 80386 are CS, SS, DS, ES, FS and GS. CS and SS are the code and stack segment registers respectively, while DS, ES, FS and GS are 4 data segment registers. A 16-bit instruction pointer IP is available along with its 32-bit counterpart EIP.
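The relationship between EAX, AX, AH and AL can be shown with simple masking. The helper names below are hypothetical; the 8-bit halves apply to EAX, EBX, ECX and EDX, whose low words are AX, BX, CX and DX.

```python
def low16(reg32: int) -> int:
    """AX: the lower 16 bits of EAX."""
    return reg32 & 0xFFFF


def low8(reg32: int) -> int:
    """AL: bits 7-0 of EAX."""
    return reg32 & 0xFF


def high8(reg32: int) -> int:
    """AH: bits 15-8 of EAX."""
    return (reg32 >> 8) & 0xFF


def write_low16(reg32: int, value16: int) -> int:
    """Writing AX replaces the lower 16 bits and leaves the upper 16 unchanged."""
    return (reg32 & 0xFFFF0000) | (value16 & 0xFFFF)


eax = 0x12345678
print(hex(low16(eax)), hex(high8(eax)), hex(low8(eax)))  # 0x5678 0x56 0x78
print(hex(write_low16(eax, 0xABCD)))                     # 0x1234abcd
```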
Flag Register of 80386
The flag register of the 80386 is a 32-bit register. Out of the 32 bits, Intel has reserved bits D18 to D31, D5 and D3, while D1 is always set to 1. Two new flags, VM and RF, are added to the 80286 flags to obtain the flag register of the 80386. The flag register is shown in figure 4.1.
Figure 4.1

VM - Virtual Mode Flag: If this flag is set, the 80386 enters virtual 8086 mode within protected mode. It is to be set only when the 80386 is in protected mode.

RF - Resume Flag: This flag is used with the debug register breakpoints. It is checked at the start of every instruction cycle and, if it is set, any debug fault is ignored during that instruction cycle.

Segment Descriptor Registers: These registers are not available to programmers; they are used internally to store descriptor information such as the attributes, limits and base addresses of the segments. The six segment registers have six corresponding 73-bit descriptor registers. Each of them contains a 32-bit base address, a 32-bit segment limit and 9 bits of attributes. These are loaded automatically when the corresponding segment registers are loaded with selectors.

Control Registers: The 80386 has three 32-bit control registers, CR0, CR2 and CR3, to hold the global machine status independent of the executed task. Load and store instructions are available to access these registers.

System Address Registers: Four special registers are defined to refer to the descriptor tables supported by the 80386. The 80386 supports four types of descriptor table, viz. the global descriptor table (GDT), interrupt descriptor table (IDT), local descriptor table (LDT) and task state segment descriptor (TSS).


5. Real Address Mode of 80386


After reset, the 80386 starts from memory location FFFFFFF0H in real address mode. In real mode, the 80386 works as a fast 8086 with 32-bit registers and data types. The default operand size in real mode is 16 bits, but 32-bit operands and addressing modes may be used with the help of override prefixes. The segment size in real mode is 64K; hence a 32-bit effective address must not exceed 0000FFFFH. Real mode is used to initialize the 80386 and prepare it for protected mode.
Figure 5.1

5.1 Memory Addressing in Real Mode:
In real mode, the 80386 can address at most 1 Mbyte of physical memory using address lines A0-A19. The paging unit is disabled in real address mode, and hence the real addresses are the same as the physical addresses. To form a physical memory address, the contents of the appropriate segment register (16 bits) are shifted left by four bit positions and then added to the 16-bit offset address formed using one of the addressing modes, in the same way as in the 8086. The real address mode address generation is shown in figure 5.1. A segment in 80386 real mode can be read, written or executed, i.e. no protection is available. Any fetch or access past the end of the segment limit generates exception 13 in real address mode.
The segments in 80386 real mode may be overlapped or non-overlapped. The interrupt vector table of 80386 has been allocated 1Kbyte space starting from 00000H to 003FFH.

6. Protected Mode of 80386


All the capabilities of the 80386 are available in its protected mode of operation. In protected mode the 80386 supports all the software written for the 80286 and 8086, executing it under the control of the memory management and protection facilities of the 80386. Protected mode also allows the use of the additional instructions, addressing modes and capabilities of the 80386.

6.1 Memory addressing in protected mode:
In this mode, the contents of the segment registers are used as selectors to address descriptors, which contain the segment limit, base address and access rights byte of the segment. The memory addressing is shown in figure 6.1.
Figure 6.1

The effective address (offset) is added with segment base address to calculate linear address. This linear address is further used as physical address, if the paging unit is disabled, otherwise the paging unit converts the linear address into physical address.
The paging unit is a memory management unit enabled only in protected mode. The paging mechanism allows handling of large segments of memory in terms of pages of 4Kbyte size. The paging unit operates under the control of segmentation unit. The paging unit if enabled converts linear addresses into physical address, in protected mode.

7. Segmentation:
7.1 DESCRIPTOR TABLES:
The descriptor tables and registers are manipulated by the operating system to ensure the correct operation of the processor, and hence the correct execution of the program. The three types of 80386 descriptor tables are:
- GLOBAL DESCRIPTOR TABLE (GDT)
- LOCAL DESCRIPTOR TABLE (LDT)
- INTERRUPT DESCRIPTOR TABLE (IDT)

DESCRIPTORS: The 80386 descriptors have a 20-bit segment limit and a 32-bit segment base address. The descriptors of the 80386 are 8-byte quantities containing access right or attribute bits along with the base and limit of the segment. The structure of a descriptor is shown in figure 7.1.

Descriptor Attribute Bits: The A (accessed) attribute bit indicates whether the segment has been accessed by the CPU or not. The TYPE field decides the descriptor type and hence the segment type. The S bit decides whether it is a system descriptor (S=0) or a code/data segment descriptor (S=1).
Figure 7.1

BASE: Base address of the segment
LIMIT: The length of the segment
P: Present bit (1 = present, 0 = not present)
S: Segment descriptor (0 = system descriptor, 1 = code or data segment descriptor)
TYPE: Type of segment
G: Granularity bit (1 = segment length is page granular, 0 = segment length is byte granular)
D: Default operation size
0: Bit must be zero
AVL: Available field for user or OS

The DPL field specifies the descriptor privilege level. The D bit specifies the code segment operation size: if D=1, the segment is a 32-bit operand segment; otherwise it is a 16-bit operand segment. The P (present) bit signifies whether the segment is present in the physical memory; if P=1, the segment is present in the physical memory. The G (granularity) bit indicates whether the segment limit is expressed in pages or in bytes. The zero bit must remain zero for compatibility with future processors.

The 80386 has five types of descriptors:
1. Code or Data Segment Descriptors.
2. System Descriptors.
3. Local descriptors.
4. TSS (Task State Segment) Descriptors.
5. GATE Descriptors.

The 80386 provides a four-level protection mechanism in exactly the same way as the 80286 does.

GDT: The global descriptor table is common to many tasks. The base address of the GDT is stored in the GDT register.
LDT: The local descriptor table is for a specific task. Each task has its own LDT. The base address of the LDT is stored in the LDT register.
IDT: The interrupt descriptor table is used for interrupt vectors and interrupt-related operations. The base address of the IDT is stored in the IDT register.
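To show how the BASE, LIMIT and G fields listed above fit together, the following Python sketch unpacks an 8-byte descriptor. The byte positions used here follow Intel's documented 80386 descriptor layout (they are normally read off figure 7.1 and are not spelled out in the text); the function name `decode_descriptor` and the sample bytes are illustrative.

```python
def decode_descriptor(raw: bytes) -> dict:
    """Unpack an 8-byte 80386 code/data segment descriptor (little-endian)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)         # 20-bit limit
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)  # 32-bit base
    granular = bool(raw[6] & 0x80)
    # With G = 1 the limit counts 4 Kbyte pages, so the last valid offset
    # is the limit shifted left 12 bits with the low 12 bits set.
    last_offset = ((limit << 12) | 0xFFF) if granular else limit
    return {
        "base": base,
        "limit": limit,
        "G": granular,
        "D": bool(raw[6] & 0x40),
        "AVL": bool(raw[6] & 0x10),
        "P": bool(raw[5] & 0x80),
        "DPL": (raw[5] >> 5) & 0b11,
        "S": bool(raw[5] & 0x10),
        "last_offset": last_offset,
    }


# Example: a 32-bit code segment with base 0 covering the full 4 Gbytes.
flat_code = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00])
fields = decode_descriptor(flat_code)
print(hex(fields["limit"]), fields["G"], hex(fields["last_offset"]))
```

With G = 1 a 20-bit limit of FFFFFH reaches offset FFFFFFFFH, which is how a 20-bit field describes a 4 Gbyte segment.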

8. Paging:


8.1 PAGING OPERATION:
Paging is one of the memory management techniques used in virtual memory multitasking operating systems. Segmentation divides the memory into variable-size segments, whereas paging divides the memory into fixed-size pages. Segments are the logical segments of the program, but pages do not have any logical relation to the program; they are just fixed-size portions of the program module or data. The advantage of the paging scheme is that the complete segment of a task need not be in the physical memory at any time. Only the few pages of the segments that are currently required for execution need to be available in the physical memory. Thus the memory requirement of the task is substantially reduced, freeing the available memory for other tasks. Whenever other pages of the task are required for execution, they may be fetched from secondary storage. Pages which have already been executed need not remain in memory, and hence the space occupied by them may be relinquished for other tasks. Thus the paging mechanism provides an effective technique for managing the physical memory in multitasking systems.

8.2 Paging Unit:
The paging unit of the 80386 uses a two-level table mechanism to convert a linear address provided by the segmentation unit into a physical address. The paging unit converts the complete map of a task into pages, each of size 4K. The task is then handled in terms of its pages rather than its segments. The paging unit handles every task in terms of three components, namely the page directory, the page tables and the page itself.

8.3 Paging Descriptor Base Register:
The control register CR2 is used to store the 32-bit linear address at which the previous page fault was detected. CR3 is used as the page directory physical base address register, to store the physical starting address of the page directory.

The lower 12 bits of CR3 are always zero to ensure that the page directory is page aligned. A move operation to CR3 automatically flushes the page table entry cache, as does a task switch operation that changes CR3.

8.4 Page Directory:
This is at most 4 Kbytes in size. Each directory entry is 4 bytes, thus a total of 1024 entries are allowed in a directory. The upper 10 bits of the linear address are used as an index to the corresponding page directory entry. The page directory entries point to page tables.

8.5 Page Tables:
Each page table is 4 Kbytes in size and may contain a maximum of 1024 entries. The page table entries contain the starting address of the page and statistical information about the page.
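The sizes quoted for the page directory and page tables can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are hypothetical, and the only inputs are the 4 Kbyte page size and 4-byte entry size stated in the text.

```python
PAGE_SIZE = 4096    # bytes per page, page table and page directory
ENTRY_SIZE = 4      # bytes per directory or table entry

# A 4 Kbyte table of 4-byte entries holds 1024 entries.
entries_per_table = PAGE_SIZE // ENTRY_SIZE

# One page table maps 1024 pages; one directory points to 1024 page tables.
bytes_mapped_per_table = entries_per_table * PAGE_SIZE
bytes_mapped_per_directory = entries_per_table * bytes_mapped_per_table


def page_directory_base(cr3: int) -> int:
    """Return the page directory base held in CR3 (the low 12 bits are zero)."""
    return cr3 & 0xFFFFF000


print(entries_per_table)                   # 1024
print(bytes_mapped_per_table // 2**20)     # 4 (Mbytes per page table)
print(bytes_mapped_per_directory // 2**30) # 4 (Gbytes per page directory)
```

The result shows why a single page directory is enough to cover the whole 4 Gbyte linear address space.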


The upper 20-bit page frame address is combined with the lower 12 bits of the linear address to form the physical address. The address bits A12-A21 are used to select one of the 1024 page table entries. A page table can be shared between tasks. The P bit of the above entries indicates whether the entry can be used in address translation. If P=1, the entry can be used in address translation; otherwise it cannot. The P bit of the currently executing page is always high. The accessed bit A is set by the 80386 before any access to the page. If A=1, the page has been accessed; otherwise it has not.
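The two level translation described in this section can be sketched as follows. Physical memory is modelled as a plain dictionary from entry address to entry value, which is purely an assumption of the example; the function names and sample addresses are likewise hypothetical.

```python
PRESENT = 0x1   # P bit of a directory or table entry


def split_linear(linear: int):
    """Split a 32-bit linear address into directory index, table index and offset."""
    directory_index = (linear >> 22) & 0x3FF   # upper 10 bits (A22-A31)
    table_index = (linear >> 12) & 0x3FF       # middle 10 bits (A12-A21)
    offset = linear & 0xFFF                    # lower 12 bits
    return directory_index, table_index, offset


def translate(linear: int, cr3: int, memory: dict) -> int:
    """Walk the page directory and page table to obtain a physical address."""
    directory_index, table_index, offset = split_linear(linear)
    pde = memory.get((cr3 & 0xFFFFF000) + 4 * directory_index, 0)
    if not pde & PRESENT:
        raise LookupError("page fault: page table not present")
    pte = memory.get((pde & 0xFFFFF000) + 4 * table_index, 0)
    if not pte & PRESENT:
        raise LookupError("page fault: page not present")
    # Upper 20 bits come from the page table entry, lower 12 from the linear address.
    return (pte & 0xFFFFF000) | offset


# Hypothetical setup: directory at 1000H, one page table at 2000H, one page frame at 5000H.
memory = {
    0x1000 + 4 * 1: 0x2000 | PRESENT,   # directory entry 1 -> page table at 2000H
    0x2000 + 4 * 3: 0x5000 | PRESENT,   # table entry 3 -> page frame at 5000H
}
print(hex(translate(0x00403025, cr3=0x1000, memory=memory)))
```

A real 80386 would also record the faulting linear address in CR2 when P=0; the sketch only raises an error at that point.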


The D bit (dirty bit) is set before a write operation to the page is carried out. The D bit is undefined for page directory entries. The OS reserved bits are defined by the operating system software. The User/Supervisor (U/S) bit and the Read/Write (R/W) bit are used to provide protection. These bits are decoded to provide protection under the 4 level protection model: privilege levels 0, 1 and 2 are treated as supervisor and level 3 as user. Level 0 has the highest privilege, while level 3 has the least. The protection provided by the paging unit is transparent to the segmentation unit.
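The page table entry bits used for paging (P, R/W, U/S, A and D) can be illustrated with a small access check. The bit positions below follow the standard 80386 page table entry layout, which the notes do not give, so they are an assumption of the example, as is the function name. The sketch also assumes the 80386 behaviour in which supervisor level code is not restricted by the U/S and R/W bits.

```python
# Assumed bit positions in a page table entry.
P = 1 << 0    # present
RW = 1 << 1   # 1 = writable, 0 = read only
US = 1 << 2   # 1 = user accessible, 0 = supervisor only
A = 1 << 5    # accessed
D = 1 << 6    # dirty


def access_allowed(pte: int, cpl: int, write: bool) -> bool:
    """Return True if code at privilege level cpl may access the page."""
    if not pte & P:
        return False           # entry cannot be used for translation
    if cpl < 3:
        return True            # levels 0-2 act as supervisor
    if not pte & US:
        return False           # user code touching a supervisor page
    return bool(pte & RW) or not write


print(access_allowed(P | US, cpl=3, write=False))  # user read of a read-only user page
print(access_allowed(P | US, cpl=3, write=True))   # user write to the same page
print(access_allowed(P, cpl=0, write=True))        # supervisor write
```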


9. Virtual 8086 Mode:


In its protected mode of operation, the 80386DX provides a virtual 8086 operating environment to execute 8086 programs. The real mode can also be used to execute 8086 programs, along with a few additional instructions of the 80386, but it offers none of the protection capabilities of protected mode. On the 80286, once the processor enters protected mode from real mode it cannot return to real mode without a reset; the 80386 can return by clearing the PE bit of CR0, but only by giving up protection and paging. Thus, the virtual 8086 mode of operation of the 80386 offers the advantage of executing 8086 programs while in protected mode.

The address forming mechanism in virtual 8086 mode is exactly identical to that of 8086 real mode. In virtual mode, an 8086 program can address 1 Mbyte of physical memory that may be anywhere in the 4 Gbyte address space of the protected mode of the 80386. As in 80386 real mode, the addresses in virtual 8086 mode lie within 1 Mbyte of memory. In virtual mode, the paging mechanism and protection capabilities are at the service of the programmer. The 80386 supports multiprogramming, hence more than one program may use the CPU at a time. The paging unit need not be enabled in virtual mode, but it is needed when the 8086 programs being run together require more than 1 Mbyte of memory. In virtual mode, the paging unit allows only 256 pages, each of 4 Kbytes. Each of the pages may be located anywhere in the maximum 4 Gbytes of physical memory. The virtual mode thus allows the multiprogramming of 8086 applications.

The virtual 8086 mode executes all programs at privilege level 3. Any of the other programs may deny access to the virtual mode programs or data. However, real mode programs are executed at the highest privilege level, i.e. level 0.

The virtual mode may be entered using an IRET instruction at CPL=0 or a task switch at any CPL, executing any task whose TSS has a flag image with the VM flag set to 1.

The IRET instruction may be used to set the VM flag and consequently enter the virtual mode. The PUSHF and POPF instructions are unable to read or set the VM bit, as they do not access it. Even in virtual mode, all interrupts and exceptions are handled by the protected mode interrupt handler. To return to protected mode from virtual mode, any interrupt or exception may be used. As a part of the interrupt service routine, the VM bit may be reset to zero to bring the 80386 back into protected mode.
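The address formation and the 256-page figure for virtual 8086 mode can be checked with a short sketch. The function names are hypothetical, and the position of the VM flag (bit 17 of EFLAGS) is taken from the standard 80386 flag layout rather than from these notes.

```python
PAGE_SIZE = 4096
VM_FLAG = 1 << 17   # assumed position of the VM bit in EFLAGS


def v86_linear(segment: int, offset: int) -> int:
    """Form an address as the 8086 does: segment * 16 + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


def pushf16_image(eflags: int) -> int:
    """Flag image seen by a 16-bit PUSHF: only bits 0-15, so VM is not visible."""
    return eflags & 0xFFFF


# A 1 Mbyte address space divided into 4 Kbyte pages gives 256 pages.
pages_in_1_mbyte = (1 << 20) // PAGE_SIZE

linear = v86_linear(0x1234, 0x0010)
print(hex(linear), "lies in page", linear // PAGE_SIZE, "of", pages_in_1_mbyte)
print(pushf16_image(VM_FLAG | 0x0202) & VM_FLAG)   # 0: the VM bit is outside the image
```

Because each of the 256 pages is mapped independently by the paging unit, the 1 Mbyte seen by the 8086 program need not be contiguous in physical memory.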

Karthik.S Shyju.Y

S.N.G.C.E

Module: 5

1. Introduction to 80486:


The 32-bit 80486 is the next evolutionary step up from the 80386. One of the most obvious features included in the 80486 is a built-in math coprocessor. This coprocessor is essentially the same as the 80387 used with an 80386, but being integrated on the chip allows it to execute math instructions about three times as fast as an 80386/387 combination. Another feature of the 80486 is an on-chip 8 Kbyte code and data cache. To make room for the additional signals, the 80486 is packaged in a 168-pin pin grid array (PGA) package instead of the 132-pin PGA used for the 80386.

2. Architecture of 80486:
The 32-bit pipelined architecture of Intel's 80486 is shown in Figure 2.1. The internal architecture of the 80486 can be broadly divided into three sections, namely the bus interface unit, the execution and control unit and the floating point unit.

Bus Interface Unit: The bus unit provides the physical interface between the 80486 and external devices. Refer to Figure 2.2. The bus unit consists of the following functional entities:
Address drivers/receivers. When the 80486 is executing a bus cycle, the address drivers are used to drive the address out onto the processor's local address bus (A31:A2) and the byte enable lines. During cache invalidation cycles, address bits A31:A4 are input from the processor's local address bus through the address receivers.
Write buffers. These four buffers allow the bus unit to buffer up to four write bus cycles from the processor, permitting these write operations to complete execution instantly.
Data bus transceivers. Used to gate the data onto the processor's local data bus during write bus cycles.
Bus size control logic. Senses when the microprocessor is communicating with 8- or 16-bit devices, causing the microprocessor to automatically execute multiple bus cycles when necessary.
Bus control request sequencer. Determines the order of addressing during burst transfers.
Burst bus control logic. Used to control the buses during the execution of a burst transfer.
Cache control logic. Connects the processor's local buses to the external cache controller.
Parity generation/checking logic. Automatically generates even parity on data being written by the microprocessor and checks for valid even parity during read bus cycles.
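Even parity generation and checking, as performed by the bus unit, can be sketched in a few lines. The example assumes one parity bit per byte of the 32-bit data bus, a detail the notes do not state; the function names are hypothetical.

```python
def even_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total count of 1s (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def parity_bits(dword: int) -> list:
    """Generate one parity bit per byte lane of a 32-bit word, lane 0 first."""
    return [even_parity_bit((dword >> (8 * lane)) & 0xFF) for lane in range(4)]


def parity_ok(byte: int, parity_bit: int) -> bool:
    """Check a byte read from the bus against its accompanying parity bit."""
    return (bin(byte & 0xFF).count("1") + parity_bit) % 2 == 0


print(parity_bits(0x01000301))   # [1, 0, 0, 1]
print(parity_ok(0x03, 0))        # True: two 1s plus a 0 parity bit is even
print(parity_ok(0x03, 1))        # False: a single bit error would be caught
```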

Figure 2.2

Cache Unit: The 80486 microprocessor incorporates a cache controller and 8 Kbytes of fast access static RAM cache memory. The directory structure used by the cache controller is four-way set associative.

The Instruction Pipeline/Decode Unit: The instruction pipeline/decode unit consists of three basic parts:
1. Prefetcher
2. 32-byte code queue
3. Instruction decoder

The 80486 microprocessor incorporates a five-deep pipeline that significantly speeds up instruction decode and execution:
1. Instruction prefetch
2. Stage 1 decode
3. Stage 2 decode
4. Execution
5. Register write-back

At a given moment in time, a series of instructions is in the pipeline at various stages. The ability of the 80486 microprocessor to process a number of instructions in parallel in this fashion allows it to complete the execution of one instruction during each cycle of the processor clock (PCLK). However, this capability depends on the particular instructions in the instruction stream. Refer to Figure 2.3.
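The overlap of instructions in a five-deep pipeline can be illustrated with an idealized model. This sketch assumes every instruction spends exactly one clock in each stage and never stalls, which the text notes is not true of every instruction stream; the stage abbreviations and function names are hypothetical.

```python
STAGES = ["PF", "D1", "D2", "EX", "WB"]   # prefetch, two decode stages, execute, write-back


def pipeline_clocks(instruction_count: int) -> int:
    """Clocks needed to finish the instructions in an ideal, stall-free pipeline."""
    if instruction_count == 0:
        return 0
    return len(STAGES) + instruction_count - 1


def occupancy(instruction_count: int) -> list:
    """For each clock, map every busy stage to the index of the instruction in it."""
    table = []
    for clock in range(pipeline_clocks(instruction_count)):
        row = {}
        for depth, stage in enumerate(STAGES):
            instruction = clock - depth
            if 0 <= instruction < instruction_count:
                row[stage] = instruction
        table.append(row)
    return table


for clock, row in enumerate(occupancy(4)):
    print(clock, row)
print(pipeline_clocks(4))   # 8 clocks, against 20 if the instructions did not overlap
```

Once the pipeline is full, one instruction leaves the write-back stage on every clock, which is the one-instruction-per-clock rate the text refers to.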

Five stage pipeline:
Instruction Prefetch: The prefetcher reads instructions in 16-byte blocks (lines). The line of code is read into both the internal cache and the 32-byte prefetch queue.
Two-Stage Instruction Decode: During the stage 1 decode, the opcode byte is decoded, the optional MOD R/M byte is interpreted to indicate the form of addressing to be used, and the optional Scale-Index-Base (SIB) byte is used to more fully specify the form of addressing. During the stage 2 decode, the displacement is added to the address and any immediate operands are taken into account.
Execution: The instruction is executed.
Register Write-Back: Instruction execution is completed and the result is written back to a target register (if necessary).

The Control Unit: Also referred to as the microcode unit, the control unit consists of the following sub-units:
The microcode sequencer
The microcode control ROM
This unit interprets the instruction word and microcode entry points fed to it by the instruction decode unit. It handles exceptions, breakpoints and interrupts. In addition, it controls integer and floating-point sequences.

The Floating-Point Unit: The floating-point unit executes the same instruction set as the 80387 numeric coprocessor. It shares microcode ROM, instruction decode and address pipelining logic with the data path, or integer execution, unit. The floating-point unit consists of two tightly-coupled subunits:
The floating-point unit
The floating-point register file. Contains the registers original to the floating-point unit.
The Numeric Exception (NE) control bit in CR0 allows the programmer to select the error handling scenario to be used by the microprocessor when a floating-point error is detected. Setting this bit to 1 causes the microprocessor to generate an internal exception 16 interrupt when the floating-point unit incurs an error.

The Datapath Unit: The data path unit contains the following sub-units:
General-purpose registers.
Arithmetic logic unit (ALU). This unit handles all integer and bit-oriented math functions.
Barrel shifter. Used by the ALU to perform math functions.
Flags. The flag register basically consists of two bit fields:
The flag status bits reflect the results of the previously executed instruction.
The flag control bits allow the programmer to alter certain operational characteristics of the microprocessor.
The data path unit also contains special stack pointer logic to accommodate single-clock pushes and pops. In addition, single load, store, add, subtract, logic and shift instructions can be executed in one clock.

The Memory Management Unit (MMU): The MMU consists of two sub-units:
The segmentation unit. Calculates the linear address from the segment and offset; when the paging unit is off, this linear address is used directly as the physical address. It has been redesigned to generate one address per clock. The segmentation unit contains the segment descriptor cache. It also performs limit and access rights checks.
The paging unit. If enabled by setting the PG bit in CR0, the paging unit translates the linear address to a physical address. It performs the same functions as the 80386 microprocessor's paging mechanism, but has been optimized to improve system performance. It can perform one translation look-aside buffer (TLB) lookup per clock. Individual pages may now be write-protected against supervisor access.
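The stage 1 decode of the MOD R/M and SIB bytes, mentioned in the pipeline description of this section, amounts to splitting each byte into three bit fields. The field positions below follow the standard Intel instruction format, which the notes do not reproduce, so they are an assumption of this sketch; the function names and sample bytes are hypothetical.

```python
def decode_modrm(byte: int) -> dict:
    """Split a MOD R/M byte into its mod (2 bits), reg (3 bits) and r/m (3 bits) fields."""
    return {"mod": (byte >> 6) & 0x3, "reg": (byte >> 3) & 0x7, "rm": byte & 0x7}


def decode_sib(byte: int) -> dict:
    """Split a SIB byte into scale factor, index register and base register fields."""
    return {"scale": 1 << ((byte >> 6) & 0x3), "index": (byte >> 3) & 0x7, "base": byte & 0x7}


def sib_follows(modrm_byte: int) -> bool:
    """With 32-bit addressing, r/m = 100b in a memory form means a SIB byte follows."""
    fields = decode_modrm(modrm_byte)
    return fields["mod"] != 3 and fields["rm"] == 4


# Sample bytes 44H and B3H: a memory operand of the form base + index*4 + 8-bit displacement.
print(decode_modrm(0x44), sib_follows(0x44))
print(decode_sib(0xB3))
```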

3. The 486 Internal Cache:


The internal cache introduced by Intel in the 486 processor provides the additional benefit of limiting the number of memory accesses that the processor must submit to external memory. The 486's internal cache keeps a copy of the most recently used instructions and data (typically referred to as a unified cache). The processor only has to access slow external memory when it experiences an internal cache read miss or a memory write.

The 486 employs a burst transfer mechanism to speed up transfers from external memory. Each internal cache miss forces the processor to access slow external memory. Because the internal cache's line size is 16 bytes and the 486 has only a 32-bit data path, four complete bus cycles would be required to transfer the whole cache line. The burst transfer capability permits the processor to complete the four transfers faster than it could with zero wait state bus cycles. If the DRAM subsystem utilizes an interleaved memory architecture, the transfers can complete faster than would be possible otherwise.
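The saving from a burst line fill can be estimated with simple arithmetic. The clock counts used here (2 clocks for an ordinary zero wait state bus cycle, and 2 clocks for the first burst transfer followed by 1 clock for each further transfer) are assumptions of this sketch and are not stated in the notes; the names are hypothetical.

```python
LINE_SIZE = 16   # bytes per internal cache line
BUS_WIDTH = 4    # bytes moved per transfer on the 32-bit data path

transfers_per_line = LINE_SIZE // BUS_WIDTH


def non_burst_clocks(transfers: int, clocks_per_cycle: int = 2) -> int:
    """Clocks for a line fill using separate zero wait state bus cycles."""
    return transfers * clocks_per_cycle


def burst_clocks(transfers: int, first: int = 2, following: int = 1) -> int:
    """Clocks for a line fill using one burst: a full first cycle, then short ones."""
    return first + (transfers - 1) * following


print(transfers_per_line)                     # 4
print(non_burst_clocks(transfers_per_line))   # 8
print(burst_clocks(transfers_per_line))       # 5
```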

The cache memory system stores data used by a program and also the instructions of the program. The cache is organized as a 4 way set associative cache with each location containing 16 bytes or 4 double words of data. Control register CR0 is used to control the cache with two new control bits not present in the 80386 microprocessor.
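The cache organization described above fixes how an address is divided when the cache is searched. The following sketch derives the number of sets from the 8 Kbyte size, 16-byte line and 4-way associativity given in the text; the function and constant names are hypothetical, and the tag/set/offset split is the usual set associative arrangement rather than something the notes spell out.

```python
CACHE_SIZE = 8 * 1024   # bytes
LINE_SIZE = 16          # bytes per line (4 double words)
WAYS = 4                # 4-way set associative

SETS = CACHE_SIZE // (LINE_SIZE * WAYS)       # number of sets
OFFSET_BITS = (LINE_SIZE - 1).bit_length()    # bits selecting a byte within a line
INDEX_BITS = (SETS - 1).bit_length()          # bits selecting a set


def split_address(address: int):
    """Split a 32-bit address into (tag, set index, byte offset within the line)."""
    offset = address & (LINE_SIZE - 1)
    set_index = (address >> OFFSET_BITS) & (SETS - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, set_index, offset


print(SETS, OFFSET_BITS, INDEX_BITS)   # 128 4 7
tag, set_index, offset = split_address(0x12345678)
print(hex(tag), hex(set_index), hex(offset))
```

On a lookup, the set index selects one of the 128 sets and the tag is compared against the four lines held in that set.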

The CD (cache disable) and NW (not write-through) bits are new to the 80486 and are used to control the 8 Kbyte cache. If the CD bit is a logic 1, all cache operations are inhibited. This setting is only used for debugging software, and the bit normally remains cleared. The NW bit is used to inhibit cache write-through operation. As with CD, cache write-through is inhibited only for testing. For normal operation, CD = 0 and NW = 0. The cache is new to the 80486 microprocessor and is filled using burst cycles, which are not present on the 386.
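The effect of the two cache control bits can be summarized in a short sketch. The bit positions (CD in bit 30 and NW in bit 29 of CR0) come from the standard 80486 register layout rather than from these notes, and the function name is hypothetical.

```python
CR0_NW = 1 << 29   # assumed position of the NW bit
CR0_CD = 1 << 30   # assumed position of the CD bit


def cache_controls(cr0: int) -> dict:
    """Report what the CD and NW bits of a CR0 value ask of the internal cache."""
    return {
        "cache_enabled": not cr0 & CR0_CD,     # CD = 1 inhibits cache operation
        "write_through": not cr0 & CR0_NW,     # NW = 1 inhibits write-through
    }


print(cache_controls(0))                 # normal operation: CD = 0, NW = 0
print(cache_controls(CR0_CD | CR0_NW))   # test/debug setting
```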

4. Architecture of Pentium processor:


The architecture of the Pentium processor is shown in Figure 4.1. The most important enhancements over the 486 are the separate instruction and data caches, the dual integer pipelines (the U-pipeline and the V-pipeline, as Intel calls them), branch prediction using the branch target buffer (BTB), the pipelined floating-point unit, and the 64-bit external data bus. Even-parity checking is implemented for the data bus and the internal RAM arrays (caches and TLBs).


As for new functions, there are only a few; nearly all the enhancements in the Pentium are included to improve performance, and there are only a handful of new instructions. The Pentium is the first high-performance microprocessor to include a system management mode (SMM) like those found on low-power processors for notebooks and other battery-based applications; Intel is holding to its promise to include SMM on all new CPUs.

The Pentium uses about 3 million transistors on a huge 294 mm² (456k mils²) die. The caches plus TLBs use only about 30% of the die. At about 17 mm on a side, the Pentium is one of the largest microprocessors ever fabricated and probably pushes Intel's production equipment to its limits. The integer data path is in the middle, while the floating-point data path is on the side opposite the data cache.

The Pentium processor has an optimized superscalar micro-architecture capable of executing two instructions in a single clock. A 64-bit external bus, separate 8-Kbyte data and instruction caches for the Pentium processor (75/90/100/120/133/150/166/200), separate 16-Kbyte data and instruction caches for the Pentium processor with MMX technology, write buffers, branch prediction (with an enhanced branch prediction algorithm for the Pentium processor with MMX technology), and a pipelined floating-point unit combine to sustain the high execution rate. These architectural features and their operation are discussed in this chapter.

5. Pipeline and Instruction Flow (U-V):

The integer instructions traverse a five stage pipeline in the Pentium processor (75/90/100/120/133/150/166/200), while the Pentium processor with MMX technology has an additional pipeline stage. The pipeline stages are as follows:
PF: Prefetch
F: Fetch (Pentium processor with MMX technology only)
D1: Instruction Decode
D2: Address Generate
EX: Execute (ALU and cache access)
WB: Write back

The Pentium processor is a superscalar machine, built around two general purpose integer pipelines and a pipelined floating-point unit, capable of executing two instructions in parallel. Both pipelines operate in parallel, allowing integer instructions to execute in a single clock in each pipeline. Figure 5-1 depicts instruction flow in the Pentium processor.

The pipelines in the Pentium processor are called the u and v pipes, and the process of issuing two instructions in parallel is termed pairing. The u-pipe can execute any instruction in the Intel architecture, while the v-pipe can execute simple instructions as defined in the Instruction Pairing Rules section of this chapter. When instructions are paired, the instruction issued to the v-pipe is always the next sequential instruction after the one issued to the u-pipe.
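The effect of pairing on issue rate can be illustrated with a deliberately simplified model. It assumes that two adjacent instructions pair only when both are simple, and it ignores every other condition of the pairing rules (such as register dependencies between the two instructions), so it is an upper bound on pairing rather than a description of the real processor; the function name is hypothetical.

```python
def issue_clocks(simple_flags: list) -> int:
    """Count issue clocks for a sequence of instructions.

    simple_flags[i] is True when instruction i is a simple instruction.
    Assumption: an instruction pair issues together only if both are simple.
    """
    clocks = 0
    i = 0
    while i < len(simple_flags):
        next_is_pairable = i + 1 < len(simple_flags) and simple_flags[i + 1]
        if simple_flags[i] and next_is_pairable:
            i += 2   # u-pipe takes instruction i, v-pipe takes the next sequential one
        else:
            i += 1   # instruction i issues alone in the u-pipe
        clocks += 1
    return clocks


print(issue_clocks([True, True, True, True]))     # 2: two pairs
print(issue_clocks([True, False, True, True]))    # 3: the complex instruction issues alone
print(issue_clocks([False, False, False]))        # 3: no pairing possible
```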

The Pentium processor pipeline has been optimized to achieve higher throughput compared to previous generations of Intel Architecture processors. The first stage of the pipeline is the Prefetch (PF) stage in which instructions are prefetched from the on-chip instruction cache or memory. Because the Pentium processor has separate caches for instructions and data, prefetches do not conflict with data references for access to the cache. If the requested line is not in the code cache, a memory reference is made. In the PF stage of the Pentium processor (75/90/100/120/133/150/166/200), two independent pairs of line-size (32-byte) prefetch buffers operate in conjunction with the branch target buffer. This allows one prefetch buffer to prefetch instructions sequentially, while the other prefetches according to the branch target buffer predictions. The prefetch buffers alternate their prefetch paths. In the Pentium processor with MMX technology, four 16-byte prefetch buffers operate in conjunction with the BTB to prefetch up to four independent instruction streams. In the Pentium processor with MMX technology only, the next pipeline stage is Fetch (F), and it is used for instruction length decode. It replaces the D1 instruction-length decoder and eliminates the need for end-bits to determine instruction length. Also, any prefixes are decoded in the F stage.

6. MMX:


MMX is a single instruction, multiple data (SIMD) instruction set designed by Intel, announced in 1996 and introduced in 1997 with their Pentium line of microprocessors, designated as "Pentium with MMX Technology". It developed out of a similar unit first introduced on the Intel i860. MMX is a supplementary processor capability that is supported on recent IA-32 processors by Intel and other vendors.

MMX defines eight registers, known as MM0 through MM7 (henceforth referred to as MMn). To avoid compatibility problems with the context switch mechanisms in existing operating systems, these registers are aliases for the existing x87 FPU stack registers (so no new registers need to be saved or restored). Hence, anything that is done to the floating point stack also affects the MMX registers, and vice versa. However, unlike the FP stack, the MMX registers are directly addressable (random access).

Each of the MMX registers holds 64 bits (the mantissa part of a full 80-bit FPU register). The main use of the MMX instruction set is built on the concept of packed data types: instead of using the whole register for a single 64-bit integer, two 32-bit integers, four 16-bit integers, or eight 8-bit integers may be processed concurrently.

The mapping of the MMX registers onto the existing FPU registers makes it somewhat difficult to work with floating point and SIMD data in the same application. To maximize performance, programmers often use the processor exclusively in one mode or the other, deferring the relatively slow switch between them as long as possible.

Because the FPU stack registers are 80 bits wide, the upper 16 bits of the stack registers go unused in MMX, and these bits are set to all ones, which makes the contents NaNs or infinities in the floating point representation. This can be used to decide whether the contents of a particular register were intended as floating point or SIMD data. MMX provides only integer operations.
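The packed data idea can be illustrated by modelling a 64-bit MMX register as a Python integer. The sketch below adds four 16-bit integers lane by lane with wrap-around, which is the behaviour of a packed word add; the function names are hypothetical, and the second function simply models the all-ones upper 16 bits described above.

```python
def packed_add_words(a: int, b: int) -> int:
    """Add two 64-bit values as four independent 16-bit lanes, wrapping on overflow."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = (((a >> shift) & 0xFFFF) + ((b >> shift) & 0xFFFF)) & 0xFFFF
        result |= lane_sum << shift
    return result


def fpu_register_image(mmx_value: int) -> int:
    """80-bit FPU register image of an MMX value: upper 16 bits set to all ones."""
    return (0xFFFF << 64) | (mmx_value & 0xFFFFFFFFFFFFFFFF)


a = 0x000100020003FFFF
b = 0x0001000100010001
print(hex(packed_add_words(a, b)))   # 0x2000300040000: the lowest lane wraps to 0
print(hex(fpu_register_image(a)))
```

Note that the carry out of the lowest lane is discarded instead of spilling into the next lane, which is what distinguishes a packed add from an ordinary 64-bit add.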
