Microprocessor
A microprocessor (abbreviated µP or uP) is an electronic computer central processing unit
(CPU) made from miniaturized transistors and other circuit elements on a single semiconductor
integrated circuit (IC).
Before the advent of microprocessors, electronic CPUs were made from discrete (separate) TTL
integrated circuits; before that, individual transistors; and before that, from vacuum tubes. There
have even been designs for simple computing machines based on mechanical parts such as
gears, shafts, levers, Tinkertoys, etc. Leonardo da Vinci made one such design, although none
were possible to construct using the manufacturing techniques of the time.
History
The first chips
As with many advances in technology, the microprocessor was an idea whose time had come.
Three projects arguably delivered a complete microprocessor at about the same time: Intel's 4004,
Texas Instruments' TMS 1000, and Garrett AiResearch's Central Air Data Computer.
In 1968 Garrett was invited to produce a digital computer to compete with electromechanical
systems then under development for the main flight control computer in the US Navy's new F-14
Tomcat fighter. The design was complete by 1970, and used a MOS-based chipset as the core
CPU. The design was smaller and much more reliable than the mechanical systems it competed
against, and was used in all of the early Tomcat models. However the system was considered so
advanced that the Navy refused to allow publication of the design, and continued to refuse until
1997. For this reason the CADC, and the MP944 chipset it used, are fairly unknown even today.
TI developed the 4-bit TMS 1000 and stressed pre-programmed embedded applications,
introducing a version called the TMS1802NC on September 17, 1971, which implemented a
calculator on a chip. The Intel chip was the 4-bit 4004, released on November 15, 1971, developed
by Federico Faggin.
TI filed for the patent on the microprocessor. Gary Boone was awarded the patent for the
single-chip microprocessor architecture on September 4, 1973. It may never be known which company
actually had the first working microprocessor running on the lab bench. In both 1971 and 1976,
Intel and TI entered into broad patent cross-licensing agreements, with Intel paying royalties to TI
for the microprocessor patent. A nice history of these events is contained in court documentation
from a legal dispute between Cyrix and Intel, with TI as intervenor and owner of the
microprocessor patent.
Interestingly, a third party claims to have been awarded a patent which might cover the
"microprocessor".
Both the Z80 and 6502 concentrated on low overall cost, through a combination of small
packaging, simple computer bus requirements, and the inclusion of circuitry that would normally
have to be provided in a separate chip (for instance, the Z80 included a memory controller). It was
these features that allowed the home computer "revolution" to take off in the early 1980s,
eventually delivering semi-usable machines that sold for US$99.
Motorola trumped the entire 8-bit world by introducing the MC6809, arguably one of the most
powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded and also one of the
most complex hardwired logic designs that ever made it into production for any microprocessor.
Microcoding replaced hardwired logic at about this point in time for all designs more powerful
than the MC6809 specifically because the design requirements were getting too complex for
hardwired logic.
Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry of interest
due to its innovative and powerful instruction set architecture.
A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka CDP1802, RCA
COSMAC), which was used in NASA's Voyager and Viking space probes of the 1970s, and on board
the Galileo probe to Jupiter (launched 1989, arrived 1995). The CDP1802 was used because it
could be run at very low power, and because its production process (silicon on sapphire)
ensured much better protection against cosmic radiation and electrostatic discharges than that of
any other processor of the era; thus, the 1802 is said to be the first radiation-hardened
microprocessor.
16-bit
The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16,
introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the IMP-8. In 1975,
National introduced the first 16-bit single-chip microprocessor, the PACE, which was later followed
by an NMOS version, the INS8900.
Other early multi-chip 16-bit microprocessors include one used by Digital Equipment Corporation
in the LSI-11 OEM board set and the packaged PDP-11/03 minicomputer.
Another early single-chip 16-bit microprocessor was TI's TMS 9900, which was also compatible
with their TI 990 line of minicomputers. The 9900 was used in the TI 990/4 minicomputer, the
TI-99/4A home computer, and the TM990 line of OEM microcomputer boards. The chip was packaged
in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such as the
Intel 8080 used the more common, smaller, and less expensive 40-pin DIP. A follow-on chip, the
TMS 9980, was designed to compete with the Intel 8080, had the full TI 990 16-bit instruction set,
used a plastic 40-pin package, moved data 8 bits at a time, but could only address 16 KB. A third
chip, the TMS 9995, was a new design. The family later expanded to include the 99105 and 99110.
Intel followed a different path, having no minicomputers to emulate, and instead "upsized" their
8080 design into the 16-bit Intel 8086, the first member of the x86 family, which powers most
modern PC-type computers. Intel introduced the 8086 as a cost-effective way of porting software
from the 8080 lines, and succeeded in winning much business on that premise. Following up their
8086 and 8088, Intel released the 80186, 80286 and, in 1985, the 32-bit 80386, cementing their PC
market dominance with the processor family's backwards compatibility.
The integrated microprocessor memory management unit (MMU) was developed by Childs et al.
of Intel, and awarded US patent number 4,442,484.
32-bit designs
16-bit designs were in the market only briefly when full 32-bit implementations started to appear.
The world's first single-chip 32-bit microprocessor was the AT&T Bell Labs BELLMAC-32A, with
first samples in 1980 and general production in 1982. After the divestiture of AT&T in 1984, it
was renamed the WE 32000 (WE for Western Electric), and had two follow-on generations, the WE
32100 and WE 32200. These microprocessors were used in the AT&T 3B5 and 3B15
minicomputers; in the 3B2, the world's first desktop supermicrocomputer; in the "Companion",
the world's first 32-bit laptop computer; and in "Alexander", the world's first book-sized
supermicrocomputer, featuring ROM-pack memory cartridges similar to today's gaming consoles.
All these systems ran the original Bell Labs UNIX operating system.
The most famous of the 32-bit designs is the MC68000, introduced in 1979. The 68K, as it was
widely known, had 32-bit registers but used 16-bit internal data paths, and a 16-bit external data
bus to reduce pin count. Motorola generally described it as a 16-bit processor, though it clearly
has 32-bit architecture. The combination of high speed, large (16 megabyte) memory space and
fairly low costs made it the most popular CPU design of its class. The Apple Lisa and Macintosh
designs made use of the 68000, as did a host of other designs in the mid-1980s, including the
Atari ST and Commodore Amiga.
Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but was not a
commercial success. It had an advanced capability-based object-oriented architecture, but poor
performance compared to other competing architectures such as the Motorola 68000.
Motorola's success with the 68000 led to the MC68010, which added virtual memory support. The
MC68020, introduced in 1985, added full 32-bit data and address buses. The 68020 became
hugely popular in the Unix supermicrocomputer market, and many small companies (e.g., Altos,
Charles River Data Systems) produced desktop-size systems. When Motorola followed with the
MC68030, which brought the MMU onto the chip, the 68K family became the processor for everything
that wasn't running DOS. The continued success led to the MC68040, which included an FPU for
better math performance. A 68050 failed to achieve its performance goals and was not released, and
the follow-up MC68060 was released into a market saturated by much faster RISC designs. The
68K family faded from the desktop in the early 1990s.
Other large companies designed the 68020 and follow-ons into embedded equipment. At one
point, there were more 68020s in embedded equipment than there were Intel Pentiums in PCs (See
this webpage for this embedded usage information). The ColdFire processor cores are
derivatives of the venerable 68020.
During this time (early to mid-1980s), National Semiconductor introduced a very similar 16-bit
pinout, 32-bit internal microprocessor called the NS 16032 (later renamed 32016), a full 32-bit
version named the NS 32032, and a line of 32-bit industrial OEM microcomputers. By the
mid-1980s, Sequent had introduced the first symmetric multiprocessing (SMP) server-class computer
using the NS 32032. This was one of the design's few wins, and it disappeared in the late 1980s.
In the late 1980s, "microprocessor wars" started killing off some of the microprocessors.
Apparently, with only one major design win, Sequent, the NS 32032 just faded out of existence,
and Sequent switched to Intel microprocessors.
64-bit designs
Though RISC-based designs (see below) featured the first crop of 64-bit processors long before
the current mainstream PC microchips from AMD and Intel, they were limited to proprietary
operating systems. However, with AMD's introduction of the Athlon 64, the first 64-bit x86 chip,
followed by Intel's own 64-bit chips, the 64-bit race has truly begun. Both processors are also
backward compatible, meaning they can run 32-bit legacy applications as well as new 64-bit
software. With 64-bit Windows XP and Linux builds that run on 64 bits, the software too is
geared to utilise the full power of such processors.
RISC
In the mid-1980s to early-1990s, a crop of new high-performance RISC (reduced instruction set
computer) microprocessors appeared, which were initially used in special purpose machines and
Unix workstations, but have since become almost universal in all roles except the Intel-standard
desktop.
The first commercial RISC design was released by MIPS Computer Systems: the 32-bit R2000 (the
R1000 was not released). The R3000 made the design truly practical, and the R4000 introduced the
world's first commercially available 64-bit design. Competing projects resulted in the IBM POWER
and Sun SPARC systems. Soon every major vendor was releasing a RISC design, including the AT&T
CRISP, AMD 29000, Intel i860 and Intel i960, Motorola 88000, DEC Alpha and the HP PA-RISC.
Market forces have "weeded out" many of these designs, leaving the POWER and the derived
PowerPC as the main desktop RISC processor, with the SPARC being used in Sun designs only.
MIPS continues to supply some SGI systems, but is primarily used as an embedded design,
notably in Cisco routers. The rest of the original crop of designs have either disappeared, or are
about to. Other companies have attacked niches in the market, notably ARM, originally intended
for home computer use but since focussed at the embedded processor market. Today RISC CPUs
(and microcontrollers) represent the vast majority of all CPUs in use.
Of course, in the IBM-compatible PC world, Intel, AMD, and now VIA of Taiwan all make
x86-compatible microprocessors. In 64-bit computing, the DEC Alpha, the AMD64, and the
HP-Intel Itanium are the most popular designs as of late 2004.
x86 or 80x86 is the generic name of a microprocessor architecture first developed and
manufactured by Intel.
The architecture is called x86 because the earliest processors in this family were identified only
by numbers ending in the sequence "86": the 8086, the 80186, the 80286, the 386, and the 486.
Because one cannot trademark numbers, Intel and most of its competitors began to use
trademarkable names such as Pentium for subsequent generations of processors, but the earlier
naming scheme has stuck as a term for the entire family. Intel now refers to x86 as IA-32, an
abbreviation for Intel Architecture, 32-bit.
Intel 8085
The Intel 8085 is an 8-bit microprocessor made by Intel in the mid-1970s. It was binary compatible
with the more-famous Intel 8080 but required less supporting hardware, thus allowing simpler and
less expensive microcomputer systems to be built.
The "5" in the model number came from the fact that the 8085 required only a 5-volt power supply
rather than the 5 V and 12 V supplies the 8080 needed. Both processors were sometimes used in
computers running the CP/M operating system, and the 8085 later saw use as a microcontroller
(largely by virtue of its low component count). Both designs were later eclipsed by the
compatible but more capable Zilog Z80, which
took over most of the CP/M computer market as well as taking a large share of the booming home
computer market in the early-to-mid-1980s.
The 8085 can access 65,536 individual memory locations through its 16-bit address bus, but, being
an 8-bit microprocessor, it transfers only eight bits of data at a time.
Unlike some other microprocessors of its era, it has a separate address space for up to 256 I/O
ports. It also has a built-in register array, usually labelled A, B, C, D, E, H, and L. The
microprocessor also has three maskable hardware interrupt inputs (RST 7.5, RST 6.5, and RST 5.5),
found on pins 7, 8, and 9.
8-bit
8-bit CPUs normally use an 8-bit data bus and a 16-bit address bus which means that their
address space is limited to 64 kilobytes; this is not a "natural law", however, and thus there are
exceptions.
The first widely adopted 8-bit microprocessor was the Intel 8080, being used in many hobbyist
computers of the late 1970s and early 1980s, often running the CP/M operating system. The Zilog
Z80 (compatible with the 8080) and the Motorola 6800 were also used in similar computers. The
Z80 and the MOS
Technology 6502 8-bit CPUs were widely used in home computers and game consoles of the 70s
and 80s. Many 8-bit CPUs or microcontrollers are the basis of today's ubiquitous embedded
systems.
There are 2^8 (256) possible values for 8 bits.
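As a quick arithmetic check, the relationship between bus width and address-space size can be sketched in a few lines of Python:

```python
# Illustrative sketch: distinct values per bus width.
for bits in (8, 16, 32):
    print(f"{bits}-bit: {2**bits} distinct values")

# A 16-bit address bus therefore spans 2**16 = 65,536 locations (64 KB),
# matching the 64-kilobyte limit typical of 8-bit CPUs.
assert 2**8 == 256
assert 2**16 == 65536  # 64 * 1024
```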
Address space
In computing, an address space defines a context in which an address makes sense.
Two addresses may be numerically the same, but refer to different things, if they belong to
different address spaces.
In general, things in one address space are physically in a different location than things in
another address space. For example, "house number 101 South" on one particular southward
street is completely different from any house number (not just the 101st house) on a different
southward street.
However, sometimes different address spaces overlap (some physical location exists in both
address spaces). When overlapping address spaces are not aligned, translation is necessary.
For example, virtual-to-physical address translation is necessary to translate addresses in the
virtual memory address space to addresses in the physical address space -- one physical address,
and one or more numerically different virtual addresses, may all refer to the same physical byte
of RAM.
Many programmers prefer to use a flat memory model, in which there is no distinction
between code space, data space, and virtual memory -- in other words, numerically identical
pointers refer to exactly the same byte of RAM in all three address spaces.
Unfortunately, many early computers did not support a flat memory model -- in particular,
Harvard architecture machines force program storage to be completely separate from data
storage.
Many modern DSPs (such as the Motorola 56000) have three separate storage areas -- program
storage, coefficient storage, and data storage. Some commonly used instructions fetch from
all three areas simultaneously; fewer storage areas (even with the same or more total bytes of
storage) would make those instructions run slower. (Whether three storage areas are merely a
variant of the Harvard architecture, or whether "Harvard" implies exactly two storage areas, is
a matter of terminology.)
Primary storage
Primary storage is a category of computer storage, often called main memory. Confusingly, the
term primary storage has recently been used in a few contexts to refer to online storage (hard
disk), which is usually classified as secondary storage.
A particular location in storage is selected by its physical memory address. That address remains
the same, no matter how the particular value stored there changes.
Over the history of computing, a variety of technologies have been used for primary storage.
Today, we are most familiar with random access memory (RAM) made out of many small
integrated circuits. Some early computers used mercury delay lines, in which a series of acoustic
pulses were sent along a tube filled with mercury. When the pulse reached the end of the tube, the
circuitry detected whether the pulse represented a binary 1 or 0 and caused the oscillator at the
beginning of the line to repeat the pulse. Other early computers used high-speed magnetic drums
as primary storage.
Before the use of integrated circuits for memory became widespread, primary storage was
implemented in many different forms:
Williams tube
Delay line memory
Drum memory
Core memory
Twistor memory
Bubble memory
Virtual memory
Virtual memory is a computer design feature that permits software to use more main memory (the
memory which the CPU can read and write to directly) than the computer actually physically
possesses.
Most computers possess four kinds of memory: registers in the CPU; caches both inside and
adjacent to the CPU; physical memory, generally in the form of RAM, which the CPU can read and
write to directly and reasonably quickly; and disk storage, which is much slower but also much
larger. Many applications require access to more information (code as well as data) than can be
stored in physical memory. This is especially true when the operating system is one that wishes
to allow multiple processes/applications to run seemingly in parallel. The obvious response to the
problem of the maximum size of the physical memory being less than that required for all running
programs is for the application to keep some of its information on the disk, and move it back and
forth to physical memory as needed, but there are a number of ways to do this.
One option is for the application software itself to be responsible both for deciding which
information is to be kept where, and also for moving it back and forth. The programmer would do
this by determining which sections of the program (and also its data) were mutually exclusive and
then arranging for loading and unloading the appropriate sections from physical memory, as
needed. The disadvantage of this approach is that each application's programmer must spend
time and effort on designing, implementing, and debugging this mechanism, instead of focusing
on their application; this hampered programmers' efficiency. Also, if any programmer could truly
choose which of their items of data to store in the physical memory at any one time, they could
easily conflict with the decisions made by another programmer, who also wanted to use all the
available physical memory at that point.
The alternative is to use virtual memory, in which a combination of special hardware and
operating system software makes use of both kinds of memory to make it look as if the computer
has a much larger main memory than it actually does. It does this in a way that is invisible to the
rest of the software running on the computer. It usually provides the ability to simulate a main
memory of almost any size, as limited by the size of the addresses being used by the operating
system and CPU: the total size of the virtual memory can be 2^32 bytes (approximately 4
gigabytes) for a 32-bit system, while newer 64-bit chips and operating systems use 64- or 48-bit
addresses and can index much more virtual memory.
This makes the job of the application programmer much simpler. No matter how much memory
the application needs, it can act as if it has access to a main memory of that size. The
programmer can also completely ignore the need to manage the moving of data back and forth
between the different kinds of memory.
In technical terms, virtual memory allows software to run in a memory address space whose size
and addressing are not necessarily tied to the computer's physical memory. While conceivably
virtual memory could be implemented solely by operating system software, in practice its
implementation almost universally uses a combination of hardware and operating system
software.
Basic operation
When virtual memory is in use, each time a main memory location is read or written by the CPU,
hardware within the computer translates the address of the memory location generated by the
software (the virtual memory address) into either:
the address of a real memory location (the physical memory address) which is assigned
within the computer's physical memory to hold that memory item, or
an indication that the desired memory item is not currently resident in main memory (a so-called
virtual memory exception)
In the former case, the memory reference operation is completed, just as if the virtual memory
were not involved. In the latter case, the operating system is invoked to handle the situation,
since the actions needed before the program can continue are usually quite complex.
The effect of this is to swap sections of information between the physical memory and the disk;
the area of the disk which holds the information which is not currently in physical memory is
called the swap file, page file, or swap partition (on some operating systems it is a dedicated
partition of a disk).
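The translate-or-fault step just described can be sketched as follows; the 4 KB page size, the `PageFault` name, and the dictionary standing in for a page table are illustrative assumptions, not any real OS or MMU interface.

```python
# Minimal sketch of the translate-or-fault behaviour of an MMU.
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when the referenced page is not resident in main memory."""

def translate(virtual_addr, page_table):
    """Return the physical address, or raise PageFault if the page is absent."""
    page_number, offset = divmod(virtual_addr, PAGE_SIZE)
    frame = page_table.get(page_number)   # None models a non-resident page
    if frame is None:
        raise PageFault(page_number)      # the OS must bring the page in
    return frame * PAGE_SIZE + offset

page_table = {0: 7, 1: 3}                 # virtual page -> physical frame
print(hex(translate(0x0010, page_table))) # page 0 maps to frame 7
```

In a real system the hardware performs this lookup on every memory reference; the exception path hands control to the operating system, as the following sections describe.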
Details
The translation from virtual to physical addresses is implemented by an MMU. This may be either
a module of the CPU, or an auxiliary, closely coupled chip.
The operating system is responsible for deciding which parts of the program's simulated main
memory are kept in physical memory. The operating system also maintains the translation tables
which provide the mappings between virtual and physical addresses, for use by the MMU. Finally,
when a virtual memory exception occurs, the operating system is responsible for allocating an
area of physical memory to hold the missing information, and for bringing the relevant
information in from the disk.
In most computers, these translation tables are stored in physical memory. Therefore, a virtual
memory reference might actually involve two or more physical memory references: one or more
to retrieve the needed address translation from the page tables, and a final one to actually do the
memory reference.
To minimize the performance penalty of address translation, most modern CPUs include an on-chip
MMU, and maintain a table of recently used virtual-to-physical translations, called a
Translation Lookaside Buffer, or TLB. Addresses with entries in the TLB require no additional
memory references (and therefore time) to translate. However, the TLB can only maintain a fixed
number of mappings between virtual and physical addresses; when the needed translation is not
resident in the TLB, action will have to be taken to load it in.
On some processors, this is performed entirely in hardware; the MMU has to do additional
memory references to load the required translations from the translation tables, but no other
action is needed. In other processors, assistance from the operating system is needed; an
exception is raised, and on this exception, the operating system replaces one of the entries in the
TLB with an entry from the translation table, and the instruction which made the original memory
reference is restarted.
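A rough sketch of how a TLB short-circuits the page-table walk follows; the class name, the capacity of four entries, and the least-recently-used eviction policy are all illustrative assumptions (real TLBs are fixed-size associative hardware structures with various replacement policies).

```python
# Toy TLB: a small cache of virtual-page -> physical-frame translations.
from collections import OrderedDict

class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()      # virtual page -> physical frame

    def lookup(self, page):
        if page in self.entries:          # TLB hit: no page-table walk needed
            self.entries.move_to_end(page)
            return self.entries[page]
        return None                       # TLB miss: caller walks the page table

    def insert(self, page, frame):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used entry
        self.entries[page] = frame

tlb = TLB()
tlb.insert(2, 9)
assert tlb.lookup(2) == 9     # hit: translation found without memory access
assert tlb.lookup(5) is None  # miss: walk the page table, then insert(5, ...)
```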
The hardware that supports virtual memory almost always supports memory protection
mechanisms as well. The MMU may have the ability to vary its operation according to the type of
memory reference (for read, write or execution), as well as the privilege mode of the CPU at the
time the memory reference was made. This allows the operating system to protect its own code
and data (such as the translation tables used for virtual memory) from corruption by an erroneous
application program and to protect application programs from each other and (to some extent)
from themselves (e.g. by preventing writes to areas of memory which contain code).
Paging and virtual memory
Virtual memory is usually (but not necessarily) implemented using paging. In paging, the low
order bits of the binary representation of the virtual address are preserved, and used directly as
the low order bits of the actual physical address; the high order bits are treated as a key to one or
more address translation tables, which provide the high order bits of the actual physical address.
For this reason a range of consecutive addresses in the virtual address space whose size is a
power of two will be translated into a corresponding range of consecutive physical addresses. The
memory referenced by such a range is called a page. The page size is typically in the range of 512
to 8192 bytes (with 4K currently being very common), though page sizes of 4 megabytes or larger
may be used for special purposes. (Using the same or a related mechanism, contiguous regions
of virtual memory larger than a page are often mappable to contiguous physical memory for
purposes other than virtualization, such as setting access and caching control bits.)
The operating system stores the address translation tables, the mappings from virtual to physical
page numbers, in a data structure known as a page table.
If a page is marked as unavailable (perhaps because it is not present in physical memory, but
instead is in the swap area), then when the CPU tries to reference a memory location in that
page, the MMU responds by raising an exception (commonly called a page fault) with the CPU,
which then jumps to a routine in the operating system. If the page is in the swap area, this
routine invokes an operation called a page swap, to bring in the required page.
The page swap operation involves a series of steps. First it selects a page in memory, for
example, a page that has not been recently accessed and (preferably) has not been modified
since it was last read from disk or the swap area. (See page replacement algorithms for details.) If
the page has been modified, the process writes the modified page to the swap area. The next step
in the process is to read in the information in the needed page (the page corresponding to the
virtual address the original program was trying to reference when the exception occurred) from
the swap file. When the page has been read in, the tables for translating virtual addresses to
physical addresses are updated to reflect the revised contents of the physical memory. Once the
page swap completes, the handling routine exits, and the program is restarted and continues on
as if nothing had happened, returning to the point in the program that caused the exception.
It is also possible that a virtual page was marked as unavailable because the page was never
previously allocated. In such cases, a page of physical memory is allocated and filled with zeros,
the page table is modified to describe it, and the program is restarted as above.
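The page-swap steps above can be sketched as follows, under heavily simplified assumptions: a dictionary stands in for the page table, a plain list serves as the replacement queue, and pages absent from swap are zero-filled. All names are invented for the sketch.

```python
# Simplified page-swap walkthrough: evict a victim, write it back if dirty,
# read the needed page in, and update the translation tables.
from dataclasses import dataclass

@dataclass
class Entry:
    frame: int          # physical frame holding this virtual page
    dirty: bool = False # True if modified since last read from swap

def handle_page_fault(page, page_table, memory, swap, lru):
    victim = lru.pop(0)                      # 1. pick a victim (crude FIFO here)
    entry = page_table.pop(victim)
    if entry.dirty:                          # 2. write back a modified victim
        swap[victim] = memory[entry.frame]
    memory[entry.frame] = swap.get(page, 0)  # 3. read the needed page (0 = zero-fill)
    page_table[page] = Entry(entry.frame)    # 4. update the translation tables
    lru.append(page)

# Tiny usage example: one frame; page 1 faults while dirty page 0 is resident.
memory = {0: "page0-data"}
page_table = {0: Entry(frame=0, dirty=True)}
swap = {1: "page1-data"}
handle_page_fault(1, page_table, memory, swap, lru=[0])
assert memory[0] == "page1-data" and swap[0] == "page0-data"
```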
Additional details
One additional advantage of virtual memory is that it allows a computer to multiplex its CPU and
main memory among multiple programs.
Note that virtual memory is not a requirement for precompilation of software, even if the software
is to be executed on a multiprogramming system. Precompiled software loaded by the operating
system has the opportunity to carry out address relocation at load time. This suffers by
comparison with virtual memory in that a copy of a program relocated at load time cannot run at a
distinct address once it has started execution.
It is possible to avoid the overhead of address relocation using a process called rebasing, which
uses metadata in the executable image header to guarantee to the run-time loader that the image
will only run within a certain virtual address space. This technique is used on the system libraries
on Win32 platforms, for example.
Systems with a large amount of RAM can create a virtual hard disk within the RAM itself. This
does block some of the RAM from being available for other system tasks but it does considerably
speed up access to the swap file itself.
Processor register
In computer architecture, a processor register is a small amount of very fast computer memory
used to speed the execution of computer programs by providing quick access to commonly used
values, typically values in the midst of a calculation at a given point in time.
These registers are the top of the memory hierarchy, and are the fastest way for the system to
manipulate data. Registers are normally measured by the number of bits they can hold, for
example, an "8-bit register" or a "32-bit register". Registers are now usually implemented as a
register file, but they have also been implemented using individual flip-flops, high speed core
memory, thin film memory, and other ways in various machines.
The term is often used to refer only to the group of registers that can be directly indexed for input
or output of an instruction, as defined by the instruction set. More properly, these are called the
"architected registers". For instance, the x86 instruction set defines a set of eight 32-bit registers,
but a CPU that implements the x86 instruction set will contain many more hardware registers than
just these eight.
There are several other classes of registers:
Data registers are used to store integer numbers (see also Floating Point Registers, below).
In some simple/older CPUs, a special data register is the accumulator, used for arithmetic
calculations.
Address registers hold memory addresses and are used to access memory. In some
simple/older CPUs, a special address register is the index register (one or more of these may be
present).
General Purpose registers (GPRs) can store both data and addresses, i.e., they are
combined Data/Address registers.
Floating Point registers (FPRs) are used to store floating point numbers.
Constant registers hold read-only values (e.g., zero, one, pi, ...).
Vector registers hold data for vector processing done by SIMD instructions (Single
Instruction, Multiple Data).
Special Purpose registers store internal CPU data, like the program counter (aka
instruction pointer), stack pointer, and status register (aka processor status word).
In some architectures, model-specific registers (also called machine-specific registers)
store data and settings related to the processor itself. Because their meanings are attached to the
design of a specific processor, they cannot be expected to remain standard between processor
generations.
Memory segment
On the Intel x86 architecture, a memory segment is the portion of memory which may be
addressed by a single index register without changing a 16-bit segment selector. In real mode or
protected mode on the 80286 processor (or V86 mode on the 80386 and later processors), a
segment is 64 kilobytes in size (using 16-bit index registers). In 32-bit protected mode, available in
80386 and subsequent processors, a segment is 4 gigabytes (due to 32-bit index registers).
In 16-bit mode, enabling applications to make use of multiple memory segments (in order to
access more memory than available in any one 64K-segment) was quite complex, but was viewed
as a necessary evil for all but the smallest tools (which could do with less memory). The root of
the problem was that no appropriate address-arithmetic instructions suitable for flat addressing
of the entire memory range were available. Flat addressing is possible by applying multiple
instructions, which however leads to slower programs.
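The address arithmetic behind 16-bit segmentation can be illustrated with the original 8086 real-mode scheme, where a 16-bit segment selector is shifted left four bits and added to a 16-bit offset to form a 20-bit physical address; the specific values below are invented for the example.

```python
# 8086 real-mode segment:offset arithmetic.  Many different segment:offset
# pairs alias the same physical byte, which is part of what made segmented
# address arithmetic awkward for flat addressing.
def real_mode_address(segment, offset):
    return ((segment << 4) + offset) & 0xFFFFF  # 20-bit address bus wraps

assert real_mode_address(0x1234, 0x0010) == 0x12350
# Aliasing: 0x1000:0x2345 and 0x1234:0x0005 name the same byte.
assert real_mode_address(0x1000, 0x2345) == real_mode_address(0x1234, 0x0005)
```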
The introduction of 32-bit operating systems and the more comfortable 32-bit flat memory model
led to the near elimination of segmented addressing by the end of the 1990s. However, with the
flat memory model the 4 gigabyte limit is not far from everyday use. Segmentation allows
operating systems to make the limit a per-process virtual address space issue, utilizing up to a
maximum of 64 gigabytes of system memory, but the reluctance to return to segmentation is
often cited as motivation for the move towards 64-bit processors.
Computer bus
In computer architecture, a bus is a subsystem that transfers data or power between computer
components inside a computer or between computers. Unlike a point-to-point connection, a bus
can logically connect several peripherals over the same set of wires.
Early computer buses were literally parallel electrical buses with multiple connections, but the
term is now used for any physical arrangement that provides the same logical functionality as a
parallel electrical bus. Modern computer buses can use both parallel and bit-serial connections,
and can be wired in either a multidrop (electrical parallel) or daisy chain topology, or connected
by switched hubs, as in the case of USB.
Intel 8086
www.bookspar.com | Website for students | VTU NOTES
The 8086 is a 16-bit microprocessor chip designed by Intel in 1978, which gave rise to the x86
architecture. Shortly afterwards the Intel 8088 was introduced, with an external 8-bit bus allowing
the use of cheap chipsets. It was based on the design of the 8080 and 8085 (it was assembly-language
source-compatible with the 8080) with a similar register set, but was expanded to 16 bits. The Bus
Interface Unit fed the instruction stream to the Execution Unit through a 6-byte prefetch queue, so
fetch and execution were concurrent, a primitive form of pipelining (8086 instructions varied from
1 to 4 bytes).
It featured four 16-bit general registers, which could also be accessed as eight 8-bit registers, and
four 16-bit index registers (including the stack pointer). The data registers were often used
implicitly by instructions, complicating register allocation for temporary values. It featured 64K
8-bit (or 32K 16-bit) I/O ports and fixed vectored interrupts. Most instructions could only access
one memory location, so one operand had to be a register. The result was stored in one of the
operands.
There were also four segment registers that could be set from index registers. The segment
registers allowed the CPU to access one megabyte of memory in an odd way. Rather than just
supplying missing bytes, as in most segmented processors, the 8086 shifted the segment register
left 4 bits and added it to the address. As a result segments overlapped, which most people
consider to have been poor design. Although this was largely acceptable (and even useful) for
assembly language, where control of the segments was complete, it caused confusion in
languages which make heavy use of pointers (such as C). It made efficient representation of
pointers difficult, and made it possible to have two pointers with different values pointing to the
same location. Worse, this scheme made expanding the address space to more than one
megabyte difficult. Effectively, it was expanded by changing the addressing scheme in the 80286.
The processor runs at clock speeds between 4.77 MHz (in the original IBM PC) and 10 MHz.
The 8086 did not contain any floating point instructions, but could be connected to a
mathematical coprocessor to add this capability. The Intel 8087 was the standard version, but
manufacturers like Weitek soon offered higher performance alternatives.
The IBM Displaywriter word processing machine also used the 8086. The most influential
microcomputer of all, the IBM PC, used the 8-bit variant, the Intel 8088.
History
The x86 architecture first appeared inside the Intel 8086 CPU in 1978; the 8086 was a development
of the 8008 processor (which itself followed the 4004). It was adopted (in the simpler 8088 version)
three years later as the standard CPU of the IBM PC. The ubiquity of the PC platform has resulted
in the x86 becoming one of the most successful CPU architectures ever.
Other companies also manufacture or have manufactured CPUs conforming to the x86
architecture: examples include Cyrix (now owned by VIA Technologies), NEC Corporation, IBM,
IDT, and Transmeta. The most successful of the clone manufacturers has been AMD, whose
Athlon series is a close second to the Pentium series for popularity.
The 8086 was a 16-bit processor; the architecture remained 16-bit until 1985, when the 32-bit
80386 was developed. Subsequent processors represented refinements of the 32-bit architecture,
introducing various extensions, until in 2003 AMD developed a 64-bit extension to the architecture
in the form of the AMD64 standard, introduced with the Opteron processor family, which was also
adopted a few years later (under a different name) in a new generation of Intel Pentiums.
Note that Intel also introduced a separate 64-bit architecture used in its Itanium processors, which
it calls IA-64 or more recently IPF (Itanium Processor Family). IA-64 is a completely new
architecture, unrelated to x86.
Design
The x86 architecture is essentially CISC with variable instruction length. Word sized memory
access is allowed to unaligned memory addresses. Words are stored in the little-endian order.
Backwards compatibility has always been a driving force behind the development of the x86
architecture (the design decisions this has required are often criticised, particularly by
proponents of competing processors, who are frustrated by the continued success of an
architecture widely perceived as quantifiably inferior). Modern x86 processors translate the x86
instruction set to more RISC-like micro-instructions upon which modern micro-architectural
techniques can be applied.
Note that the names for instructions and registers (mnemonics) that appear in this brief review
are the ones specified in Intel documentation and used by Intel (and compatible, e.g. Microsoft's
MASM, Borland's TASM, CAD-UL's as386, etc.) assemblers. An instruction that is specified in the
Intel syntax by mov al, 30h is equivalent to AT&T-syntax movb $0x30, %al, and both translate to
the two bytes of machine code B0 30 (hexadecimal). You can see that there is no trace left in this
code of either "mov" or "al", which are the original Intel mnemonics. If we wanted, we could write
an assembler that would produce the same machine code from the command "move immediate
byte hexadecimally encoded 30 into low half of the first register". However, the convention is to
stick to Intel's original mnemonics.
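The encoding described above can be made concrete with a tiny sketch; the helper name here is ours for illustration, while the opcode byte B0 for "mov al, imm8" comes from the x86 encoding itself:

```python
def mov_al_imm8(value):
    # 0xB0 is the x86 opcode "mov al, imm8": one opcode byte, one immediate byte.
    return bytes([0xB0, value & 0xFF])

# Intel "mov al, 30h" and AT&T "movb $0x30, %al" both assemble to B0 30.
print(mov_al_imm8(0x30).hex())  # b030
```

As the text notes, nothing of the mnemonic survives in these two bytes; only the numeric encoding does.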
The x86 assembly language is discussed in more detail in the x86 assembly language article.
Real mode
Intel 8086 and 8088 had 14 16-bit registers. Four of them (AX, BX, CX, DX) were general purpose
(although each also had an additional purpose; for example only CX can be used as a counter
with the loop instruction). Each could be accessed as two separate bytes (thus BX's high byte can
be accessed as BH and low byte as BL). In addition to them, there are 4 segment registers (CS,
DS, SS and ES). They are used to form a memory address. There are 2 pointer registers (SP which
points to the bottom of the stack, and BP which can be used to point at some other place in the
stack or the memory). There are two index registers (SI and DI) which can be used to point inside
a data segment.
In real mode, memory access is segmented. This is done by shifting the segment address left by 4
bits and adding an offset in order to obtain a final 20-bit address. Thus the total address space in
real mode is 2^20 bytes, or 1 MB, quite an impressive figure for 1978. There are two addressing
modes: near and far. In far mode, both the segment and the offset are specified. In near mode,
only the offset is specified, and the segment is taken from the appropriate register. For data the
register is DS, for code it is CS, and for stack it is SS. For example, if DS is A000h and SI is 5677h,
DS:SI will point at the absolute address DS × 16 + SI = A5677h.
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if
DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. In addition to this
duplication, the scheme also makes it impossible to have more than 4 segments at once. Moreover, CS, DS
and SS are vital for the correct functioning of the program, so that only ES can be used to point
somewhere else. This scheme, which was intended as a compatibility measure with the Intel 8085,
has caused no end of grief to programmers.
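The segment arithmetic above is easy to model; this Python sketch (the function name is ours) reproduces both worked examples, including the aliasing:

```python
def phys(segment, offset):
    # Real-mode 8086: shift the segment left 4 bits, add the offset, keep 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF

# Two different segment:offset pairs alias the same physical address.
print(hex(phys(0xA000, 0x5677)))  # 0xa5677
print(hex(phys(0xA111, 0x4567)))  # 0xa5677
```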
In addition to the above-said, the 8086 also had 64K of 8-bit (or alternatively 32K of 16-bit) I/O
space, and a 64K (one segment) stack in memory supported by hardware. Only words (2 bytes)
can be pushed to the stack. The stack grows downwards, its bottom being pointed to by SS:SP.
There are 256 interrupts, which can be created by both hardware and software. The interrupts can
cascade, using the stack to store the return address.
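The push mechanics described above can be sketched as a toy model (a dict stands in for memory; the function name is ours, not an 8086 instruction):

```python
def push16(memory, ss, sp, word):
    # SP moves down by 2, then the word is stored little-endian at SS*16 + SP.
    sp = (sp - 2) & 0xFFFF
    addr = ((ss << 4) + sp) & 0xFFFFF
    memory[addr] = word & 0xFF            # low byte
    memory[addr + 1] = (word >> 8) & 0xFF  # high byte
    return sp

mem = {}
sp = push16(mem, 0x0100, 0x0010, 0x1234)
print(hex(sp), hex(mem[0x100E]), hex(mem[0x100F]))  # 0xe 0x34 0x12
```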
In the meantime, operating systems like OS/2 tried to ping-pong the processor between protected
and real modes. This was both slow and unsafe, as in real mode a program could easily crash the
computer. OS/2 also defined restrictive programming rules which allowed a Family API or bound
program to run either in real mode or in protected mode. This was however about running
programs originally designed for protected mode, not vice-versa. By design, protected mode
programs did not suppose that there is a relation between selector values and physical
addresses. It is sometimes mistakenly believed that problems with running real mode code in
16-bit protected mode resulted from IBM having chosen to use Intel-reserved interrupts for BIOS
calls. The problems are actually related to such programs using arbitrary selector values and
performing the "segment arithmetic" described above on them.
This problem also appeared with Windows 3.0. Optimally, this release wanted to run programs in
16-bit protected mode, while previously they were running in real mode. Theoretically, if a
Windows 1.x or 2.x program was written "properly" and avoided segment arithmetic it would run
indifferently in both real and protected modes. Windows programs generally avoided segment
arithmetic because Windows implemented a software virtual memory scheme and moved
program code and data in memory when programs were not running, so manipulating absolute
addresses was dangerous; programs were supposed to only keep handles to memory blocks
when not running, and such handles were quite similar to protected-mode selectors already.
Starting an old program while Windows 3.0 was running in protected mode triggered a warning
dialog, suggesting to either run Windows in real mode (it could presumably still use expanded
memory, possibly emulated with EMM386 on 80386 machines, so it was not limited to 640KB) or to
obtain an updated version from the vendor. Well-behaved programs could be "blessed" using a
special tool to avoid this dialog. It was not possible to have some GUI programs running in 16-bit
protected mode and other GUI programs running in real mode, probably because this would
require having two separate environments and (on 80286) would be subject to the previously
mentioned ping-ponging of the processor between modes. In version 3.1 real mode disappeared.
32-bit protected mode
The Intel 80386 introduced, perhaps, the greatest leap so far in the x86 architecture. With the
notable exception of the Intel 80386SX, which was 32-bit yet only had 24-bit addressing (and a
16-bit data bus), it was all 32-bit: all the registers, instructions, I/O space and memory.
No new general-purpose registers were added. All 16-bit registers except the segment ones were
expanded to 32 bits. Intel represented this by adding "E" to the register mnemonics (thus the
expanded AX became EAX, SI became ESI and so on). Since there was a greater number of
registers, instructions and operands, the machine code format was expanded as well. In order to
provide backwards compatibility, the segments which contain executable code can be marked as
containing either 16- or 32-bit instructions. In addition, special prefixes can be used to include
32-bit instructions in a 16-bit segment and vice versa.
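One such prefix is the operand-size prefix byte 66h. As a minimal sketch (helper names ours), here is how the same opcode byte B8 yields a 32-bit or 16-bit move in a 32-bit code segment, depending only on the prefix:

```python
def mov_eax_imm32(imm):
    # In a 32-bit code segment, B8 means "mov eax, imm32".
    return bytes([0xB8]) + imm.to_bytes(4, "little")

def mov_ax_imm16(imm):
    # The 0x66 operand-size prefix switches the same opcode to "mov ax, imm16".
    return bytes([0x66, 0xB8]) + imm.to_bytes(2, "little")

print(mov_eax_imm32(0x12345678).hex())  # b878563412
print(mov_ax_imm16(0x1234).hex())       # 66b83412
```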
Paging and segmented memory access were both required in order to support a modern
multitasking operating system. Linux, 386BSD, Windows NT and Windows 95 were all initially
developed for the 386, because it was the first CPU that made it possible to reliably support the
separation of programs' memory space (each into its own address space) and the preemption of
them in the case of necessity (using rings). The basic architecture of the 386 became the basis of
all further development in the x86 series.
The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486.
The new FPU could be used to make floating point calculations, important for scientific
calculation and graphic design.
MMX and beyond
1996 saw the appearance of the MMX (Matrix Math Extensions, though sometimes incorrectly
referred to as Multi-Media Extensions) technology by Intel. While the new technology was
advertised widely and vaguely, its essence is very simple: MMX added 8 64-bit SIMD registers,
overlaid onto the FPU stack, to the Intel Pentium CPU design. Unfortunately, these instructions
were not easily mappable to the code generated by ordinary C compilers, and Microsoft, the
dominant compiler vendor, was slow to support them even as intrinsics. MMX also is limited to
integer operations. These technical shortcomings caused MMX to have little impact in its early
existence. Nowadays, MMX is typically used for some 2D video applications.
3DNow!
In 1998 AMD introduced 3DNow!, first appearing in the K6-2, which added single-precision
floating point SIMD instructions using the same registers as MMX.
SSE
In 1999 Intel introduced the SSE instruction set, which added 8 new 128-bit registers (not
overlayed with other registers). These instructions were analogous to AMD's 3DNow! in that they
primarily added floating point SIMD.
SSE2
In 2001 Intel introduced the SSE2 instruction set, which added 1) a complete complement of
integer instructions (analogous to MMX) to the original SSE registers and 2) 64-bit SIMD floating
point instructions to the original SSE registers. The first addition made MMX almost obsolete, and
the second allowed the instructions to be realistically targeted by conventional compilers.
SSE3
Introduced in 2004 along with the Prescott revision of the Pentium 4 processor, SSE3 added
specific memory and thread-handling instructions to boost the performance of Intel's
HyperThreading technology. AMD later licensed the SSE3 instruction set for its latest (E) revision
Athlon 64 processors. The SSE3 instruction set included on the new Athlons is only lacking a
couple of the instructions that Intel designed for HyperThreading, since the Athlon 64 doesn't
support HyperThreading; however SSE3 is still recognized in software as being supported on the
platform.
64-bit
As of 2002, the x86 architecture began to reach some design limits due to its 32-bit word
length. This makes it more difficult to handle massive information stores larger than 4 GB, such
as those found in databases or video editing.
Intel had originally decided to completely drop x86 compatibility with the 64-bit generation, but
AMD took the initiative of extending the 32-bit x86 (aka IA-32) to 64 bits. It came up with an
architecture, called AMD64 (it was called x86-64 until being rebranded), and the first products
based on this technology were the Opteron and Athlon 64 family of processors. Due to the
success of the AMD64 line of processors, Intel adopted the AMD64 instruction set and added
some new extensions of their own, rebranding it the EM64T architecture (apparently not wishing
to acknowledge that the instruction set came from its main rival).
This was the first time that a major upgrade of the x86 architecture was initiated and originated by
a manufacturer other than Intel. Perhaps more importantly, it was the first time that Intel actually
accepted technology of this nature from an outside source.
Virtualization
x86 virtualization is difficult because the architecture does not meet the Popek and Goldberg
virtualization requirements. Nevertheless, there are several commercial x86
virtualization products, such as VMware and Microsoft Virtual PC. Intel and AMD have both
announced that future x86 processors will have new enhancements to facilitate more efficient
virtualization. Intel's code names for their virtualization features are "Vanderpool" and
"Silvervale"; AMD uses the code name "Pacifica".
80C86/80C88: CMOS versions draw 10 mA, with a temperature spec of -40 to 225 °F.
They yield a 350 mV noise immunity for logic 0 (output max can be as high as 450 mV while input
max can be no higher than 800 mV).
This limits the loading on the outputs.
8086/88 Pinout
Pin functions:
AD15-AD0
Multiplexed address(ALE=1)/data bus(ALE=0).
A19/S6-A16/S3 (multiplexed)
High order 4 bits of the 20-bit address OR status bits S6-S3.
M/IO
Indicates if address is a Memory or IO address.
INTR
Maskable interrupt request input; sampled during the last clock of each instruction.
INTA
Interrupt acknowledge output, issued in response to INTR.
CLK
Clock input; must have a duty cycle of 33% (high for 1/3 and low for 2/3 of the period).
VCC/GND
Power supply (5V) and GND (0V).
MN/ MX
Select minimum (5V) or maximum mode (0V) of operation.
BHE
Bus High Enable. Enables the most significant data bus bits (D15-D8) during a read or write
operation.
READY
Used to insert wait states (controlled by memory and IO for reads/writes) into the microprocessor.
RESET
Microprocessor resets if this pin is held high for 4 clock periods.
Instruction execution begins at FFFF0H and IF flag is cleared.
TEST
Tested by the WAIT instruction; the processor idles until the TEST pin is driven low.
HOLD
DMA hold request input; the 8086 completes the current bus cycle, floats its buses, and
acknowledges with HLDA.
LOCK
Lock output is used to lock peripherals off the system. Activated by using the LOCK: prefix on
any instruction.
QS1 and QS0
The queue status bits show status of internal instruction queue. Provided for access by the
numeric coprocessor (8087).
Correct reset timing requires that the RESET input to the microprocessor becomes a logic 1 NO
LATER than 4 clocks after power up and stays high for at least 50 us.
A0-A15 + BHE and A16-A19 are buffered separately.
BUS Timing
(Timing diagrams: write cycle, read cycle.)
During T1:
The address is placed on the Address/Data bus.
Control signals M/IO, ALE and DT/R specify memory or I/O, latch the address onto the
address bus and set the direction of data transfer on the data bus.
During T2:
The 8086 issues the RD or WR signal, DEN, and, for a write, the data.
DEN enables the memory or I/O device to receive the data for writes and the 8086 to receive the
data for reads.
During T3:
This cycle is provided to allow memory time to access data.
READY is sampled at the end of T2; if low, a wait state (TW) is inserted.
During T4:
All bus signals are deactivated, in preparation for the next bus cycle.
Timing:
Each BUS CYCLE on the 8086 equals four system clocking periods (T states).
The clock rate is 5 MHz, therefore one bus cycle is 800 ns.
The transfer rate is thus 1.25 MHz.
Memory specs (memory access time) must match the constraints of system timing.
For example, bus timing for a read operation shows almost 600 ns are needed to read data.
However, memory must access faster due to setup times, e.g. address setup and data setup,
which subtract off about 150 ns.
Therefore, memory must access in at most 450 ns, minus another 30-40 ns guard band for buffers
and decoders.
Hence 420 ns DRAM is required for the 8086.
READY:
An input to the 8086 that causes wait states for slower memory and I/O components.
A wait state (TW) is an extra clock period inserted between T2 and T3 to lengthen the bus cycle.
For example, this extends a 460 ns bus cycle (at a 5 MHz clock) to 660 ns.
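The timing figures quoted above reduce to simple arithmetic; this sketch just replays them (variable names ours):

```python
CLOCK_HZ = 5_000_000                  # 5 MHz 8086
t_state_ns = 1e9 / CLOCK_HZ           # 200 ns per clocking period (T state)
bus_cycle_ns = 4 * t_state_ns         # four T states = 800 ns per bus cycle
transfer_rate_hz = CLOCK_HZ / 4       # 1.25 MHz

# Read budget: ~600 ns window, minus ~150 ns setup, minus ~30 ns guard band.
dram_access_ns = 600 - 150 - 30       # 420 ns DRAM needed

# One wait state adds a full T state to the cycle: 460 ns grows to 660 ns.
extended_ns = 460 + t_state_ns
print(bus_cycle_ns, transfer_rate_hz, dram_access_ns, extended_ns)
```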
Text discusses role of 8284A and timing requirements for the 8086.
Some of the control signals must be generated externally, due to redefinition of certain control
pins on the 8086.
The following pins are lost when the 8086 operates in Maximum mode .
ALE
WR
IO/ M
DT/ R
DEN
INTA
Separate signals are used for I/O ( IORC and IOWC ) and memory ( MRDC and
MWTC ).
Also provided are advanced memory ( AMWC ) and I/O ( AIOWC ) write strobes,
plus INTA .
Intel 8088
The Intel 8088 is an Intel microprocessor based on the 8086, with 16-bit registers and an 8-bit
external data bus. The processor was used in the original IBM PC.
The 8088 was targeted at economical systems by allowing the use of 8-bit designs. Large bus
width circuit boards were still fairly expensive when it was released. The prefetch queue of the
8088 is 4 bytes, as opposed to the 8086's 6 bytes. The descendants of the 8088 include the 80188,
80288 (obsolete), and 80388 microcontrollers which are still in use today.
The most influential microcomputer to use the 8088 was, by far, the IBM PC, whose processor ran
at 4.77 MHz.
Apparently IBM's own engineers wanted to use the Motorola 68000, and it was used later in the
forgotten IBM Instruments 9000 Laboratory Computer, but IBM already had rights to manufacture
the 8086 family, in exchange for giving Intel the rights to its bubble memory designs. A factor for
using the 8-bit Intel 8088 version was that it could use existing Intel 8085-type components, and
allowed the computer to be based on a modified 8085 design. 68000 components were not widely
available at the time, though it could use Motorola 6800 components to an extent. Intel bubble
memory was on the market for a while, but Intel left the market due to fierce competition from
Japanese corporations who could undercut by cost, and left the memory market to focus on
processors.
A compatible replacement chip, the V20, was produced by NEC for an approximate 20 percent
improvement in computing power.
Assembly language
Assembly language or simply assembly is a human-readable notation for the machine language
that a specific computer architecture uses. Machine language, a pattern of bits encoding machine
operations, is made readable by replacing the raw values with symbols called mnemonics.
For example, a computer with the appropriate processor will understand this x86/IA-32 machine
instruction:
10110000 01100001
For programmers, however, it is easier to remember the equivalent assembly language
representation:
mov al, 0x61
which means to move the hexadecimal value 61 (97 decimal) into the processor register with the
name "al". The mnemonic "mov" is short for "move", and a comma-separated list of arguments or
parameters follows it; this is a typical assembly language statement.
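Decoding those bits by hand confirms the correspondence between the machine code shown earlier and the mnemonic form (plain Python arithmetic):

```python
machine_code = "10110000 01100001"  # the two x86 bytes shown above
opcode, operand = (int(bits, 2) for bits in machine_code.split())
print(hex(opcode), hex(operand), operand)  # 0xb0 0x61 97
```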
Unlike in high-level languages, there is usually a 1-to-1 correspondence between simple assembly
statements and machine language instructions. Transforming assembly into machine language is
accomplished by an assembler, and the reverse by a disassembler.
Every computer architecture has its own machine language, and therefore its own assembly
language. Computers differ by the number and type of operations that they support. They may
also have different sizes and numbers of registers, and different representations of data types in
storage. While all general-purpose computers are able to carry out essentially the same
functionality, the way they do it differs, and the corresponding assembly language must reflect
these differences.
In addition, multiple sets of mnemonics or assembly-language syntax may exist for a single
instruction set. In these cases, the most popular one is usually that used by the manufacturer in
their documentation.
Machine instructions
Instructions in assembly language are generally very simple, unlike in a high-level language. Any
instruction that references memory (for data or as a jump target) will also have an addressing
mode to determine how to calculate the required memory address. More complex operations must
be built up out of these simple operations. Some operations available in most instruction sets
include:
moving
set a register (a temporary "scratchpad" location in the CPU itself) to a fixed constant
value
move data from a memory location to a register, or vice versa. This is done to obtain
the data to perform a computation on it later, or to store the result of a computation.
read and write data from hardware devices
computing
add, subtract, multiply, or divide the values of two registers, placing the result in a
register
perform bitwise operations, taking the conjunction/disjunction (and/or) of
corresponding bits in a pair of registers, or the negation (not) of each bit in a register
compare two values in registers (for example, to see if one is less, or if they are equal)
affecting program flow
jump to another location in the program and execute instructions there
jump to another location if a certain condition holds
Specific instruction sets will often have single, or a few instructions for common operations
which would otherwise take many instructions. Examples:
saving many registers on the stack at once
moving large blocks of memory
complex and/or floating-point arithmetic (sine, cosine, square root, etc.)
applying a simple operation (for example, addition) to a vector of values
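For instance, a "move large block of memory" instruction is just a loop of the simple moves listed earlier; a minimal sketch (the function name is ours):

```python
def block_move(mem, src, dst, count):
    # Decompose a block move into 'count' simple load/store steps.
    for i in range(count):
        mem[dst + i] = mem[src + i]

mem = list(b"hello") + [0] * 5
block_move(mem, 0, 5, 5)
print(bytes(mem[5:10]))  # b'hello'
```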
Addressing mode
In computer programming, addressing modes are primarily of interest to compiler writers and to
those (few nowadays) who use assembly language. Some computer science students may also
need to learn about addressing modes as part of their studies. Those involved with CPU design or
computer architecture should already know this and a lot more.
Addressing modes form part of the instruction set architecture for some particular type of CPU.
Some machine languages will need to refer to (addresses of) operands in memory. An addressing
mode specifies how to calculate the effective memory address of an operand by using
information held in registers and/or
constants contained within a machine instruction.
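A common shape for such a calculation, sketched in Python (parameter names ours, modeled on x86-style base + index × scale + displacement addressing):

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    # Effective address from register contents (base, index) and
    # constants encoded in the instruction (scale, disp).
    return base + index * scale + disp

print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))  # 0x1014
```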
Opcode
Machine instructions are the operations a processor can carry out, each of which is assigned a
numeric code called an opcode; when an opcode's value is active at the decoder's logic inputs,
the desired operation is performed. To assist in the use of these numeric codes, mnemonics are
used as textual abbreviations. It's much easier to remember ADD than 05, for example.
Opcodes operate on registers, values in memory, values stored on the stack, I/O ports, the bus,
etc. They are used to perform arithmetic operations and move and change values. Operands are
the things that opcodes operate on.
Mnemonic
A mnemonic is a memory aid. Mnemonics are
often verbal, are sometimes in verse form, and are often used to remember lists. Mnemonics rely
not only on repetition to remember facts, but also on associations between easy-to-remember
constructs and lists of data, based on the principle that the human mind much more easily
remembers data attached to spatial, personal or otherwise meaningful information than that
occurring in meaningless sequences. The word mnemonic shares etymology with Mnemosyne,
the name of the Titaness who personified Memory in Greek mythology.
Techniques
A mnemonic technique is one of many memory aids that is used to create associations among
items to be remembered.
Instruction set
An instruction set, or instruction set architecture (ISA), describes the aspects of a computer
architecture visible to a programmer, including the native datatypes, instructions, registers,
addressing modes, memory architecture, interrupt and exception handling, and external I/O (if
any).
An ISA is a specification of the set of all binary codes (opcodes) that are the native form of
commands implemented by a particular CPU design. The set of opcodes for a particular ISA is
also known as the machine language for the ISA.
"Instruction set architecture" is sometimes used to distinguish this set of characteristics from the
microarchitecture, which is the set of processor design techniques used to implement the
instruction set (including microcode, pipelining, cache systems, and so forth). Computers with
different microarchitectures can share a common instruction set. For example, the Intel Pentium
and the AMD Athlon implement nearly identical versions of the x86 instruction set, but have
radically different internal designs. This concept can be extended to unique ISAs like TIMI present
in the IBM System/38 and IBM AS/400. TIMI is an ISA that is implemented as low-level software
and functionally resembles what is now referred to as a virtual machine. It was designed to
increase the longevity of the platform and applications written for it, allowing the entire platform
to be moved to very different hardware without having to modify any software except that which
comprises TIMI itself. This allowed IBM to move the AS/400 platform from an older CISC
architecture to the newer POWER architecture without having to rewrite any parts of the OS or
software associated with it.
When designing microarchitectures, engineers use Register Transfer Language (RTL) to define
the operation of each instruction.
An ISA can also be emulated in software by an interpreter. Due to the additional translation needed
for the emulation, this is usually slower than directly running programs on the hardware
implementing that ISA. Today, it is common practice for vendors of new ISAs or
microarchitectures to make software emulators available to software developers before the
hardware implementation is ready.
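Such an interpreter can be tiny; here is a toy three-instruction ISA emulated in Python (the ISA itself is invented purely for illustration):

```python
def run(program):
    # Interpret a toy ISA: ("mov", reg, const), ("add", dst, src), ("jnz", reg, target).
    regs, pc = {}, 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        elif op == "jnz" and regs[args[0]] != 0:
            pc = args[1]     # taken branch: jump instead of falling through
            continue
        pc += 1
    return regs

# Sum 3 + 2 + 1 by looping: "a" accumulates "b" while "b" counts down.
prog = [("mov", "a", 0), ("mov", "b", 3), ("mov", "step", -1),
        ("add", "a", "b"), ("add", "b", "step"), ("jnz", "b", 3)]
print(run(prog)["a"])  # 6
```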
Assemblers usually have a simple symbolic capability for defining values as symbolic expressions
which are evaluated at assembly time, making it possible to write code that is easier to read and
understand.
As in most computer languages, comments can be added to the source code; they are ignored by
the assembler.
They also usually have an embedded macro language to make it easier to generate complex
pieces of code or data.
In practice, the absence of comments and the replacement of symbols with actual numbers
makes the human interpretation of disassembled code considerably more difficult than the
original source would be.
However, some discrete calculations can still be rendered into faster running code with assembly,
and some low-level programming is simply easier to do with assembly. Some system-dependent
tasks performed by the operating system simply cannot be expressed in high-level languages. In
particular, assembly is often used in writing the low level interaction between the operating
system and the hardware, for instance in device drivers. Many compilers also render high-level
languages into assembly first before fully compiling, allowing the assembly code to be viewed for
debugging and optimization purposes.
It's also common, especially in relatively low-level languages such as C, to be able to embed
assembly language into the source code with special syntax. Programs using such facilities, such
as the Linux kernel, often construct abstractions where different assembly is used on each
platform the program supports, but it is called by portable code through a uniform interface.
Many embedded systems are also programmed in assembly to obtain the absolute maximum
functionality out of what is often very limited computational resources, though this is gradually
changing in some areas as more powerful chips become available for the same minimal cost.
Another common area of assembly language use is in the system BIOS of a computer. This
low-level code is used to initialize and test the system hardware prior to booting the OS and is stored
in ROM. Once a certain level of hardware initialization has taken place, code written in higher level
languages can be used, but almost always the code running immediately after power is applied is
written in assembly language. This is usually due to the fact system RAM may not yet be
initialized at power-up and assembly language can execute without explicit use of memory,
especially in the form of a stack.
Assembly language is also valuable in reverse engineering, since many programs are distributed
only in machine code form, and machine code is usually easy to translate into assembly language
and carefully examine in this form, but very difficult to translate into a higher-level language.
Tools such as the Interactive Disassembler make extensive use of disassembly for such a
purpose.
Interrupt
www.bookspar.com | Website for students | VTU NOTES
Digital computers usually provide a way to start software routines in response to asynchronous
electronic events. These events are signaled to the processor via interrupt requests (IRQ). The
processor and interrupt code make a context switch into a specifically written piece of software to
handle the interrupt. This software is called the interrupt service routine, or interrupt handler. The
addresses of these handlers are termed interrupt vectors and are generally stored in a table in
RAM, allowing them to be modified if required.
Interrupts originated as a way to avoid wasting the computer's valuable time in software loops (called
polling loops) waiting for electronic events. Instead, the computer was able to do other useful
work while the event was pending. The interrupt would signal the computer when the event
occurred, allowing efficient accommodation of slow mechanical devices.
Interrupts allow modern computers to respond promptly to electronic events, while other work is
being performed. Computer architectures also provide instructions to permit processes to initiate
software interrupts or traps. These can be used, for instance, to implement co-operative
multitasking.
A well-designed interrupt mechanism coordinates the computer bus, the software, and the
interrupting device so that if any single part of the interrupt sequence fails, the interrupt restarts
and runs to completion. Usually there is an electronic request, an electronic response, and a
software operation to turn off the device's interrupt, to prevent another request.
Interrupt Types
Typical interrupt types include:
timer interrupts
disk interrupts
power-off interrupts
traps
Other interrupts exist to transfer data bytes using UARTs, or Ethernet, sense key-presses, control
motors, or anything else the equipment must do.
A classic timer interrupt simply interrupts periodically from a counter or the power line. The software
(usually part of an operating system) counts the interrupts to keep time. The timer interrupt may
also be used to reschedule the priorities of running processes. Counters are popular, but some
older computers used the power line because power companies control the power-line frequency
with an atomic clock.
A disk interrupt signals the completion of a data transfer from or to the disk peripheral. A process
waiting to read or write a file starts up again.
A power-off interrupt predicts or requests a loss of power. It allows the computer equipment to
perform an orderly shutdown.
Interrupts are also used in typeahead features for buffering events like keystrokes.
Interrupt routines generally have a short execution time. Most interrupt routines do not allow
themselves to be interrupted, because they store saved context on a stack, and if interrupted
many times, the stack could overflow. An interrupt routine frequently needs to be able to respond
to a further interrupt from the same source. If the interrupt routine has significant work to do in
response to an interrupt, and it is not critical that the work be performed immediately, then often
the routine will do nothing but schedule the work for some later time and return as soon as
possible. Some processors support a hierarchy of interrupt priorities, allowing higher-priority
interrupts to occur while a lower-priority interrupt is being processed.
Processors also often have a mechanism referred to as interrupt disable which allows software to
prevent interrupts from interfering with communication between interrupt-code and non-interrupt
code. See mutual exclusion.
Typically, the user can configure the machine using hardware registers so that different types of
interrupts are enabled or disabled, depending on what the user wants. The interrupt signals are
ANDed with a mask, thus allowing only desired interrupts to occur. Some interrupts cannot be
disabled; these are referred to as non-maskable interrupts.
Interrupt vector
The destination to which the CPU jumps for a given interrupt is termed the interrupt vector.
Generally, most computer system designs will incorporate a list of such vectors; this is termed
the interrupt vector table or dispatch table.
Interrupt handler
An Interrupt Handler is the modern progression of an interrupt service routine, a routine whose
execution is triggered by an interrupt.
In modern systems interrupt handlers are split into two parts: the First-Level Interrupt Handler
(FLIH) and the Second-Level Interrupt Handler (SLIH).
The FLIH operates in the same way as the old interrupt routines did. In response to an interrupt
there is a context switch and the code for the interrupt is loaded and executed. The job of the
FLIH, however, is not to process the interrupt, but to schedule the execution of the SLIH, while
recording any critical information which is only available at the time of the interrupt.
The SLIH sits on the run queue of the operating system until it can be executed to perform the
processing for the interrupt when processor time is available.
It is worth noting that in many systems the FLIH and SLIH are referred to as upper halves and
lower halves, or a derivation of those names.
Non-Maskable interrupt
A non-maskable interrupt (or NMI) is a special type of interrupt used in most types of
microcomputer, for example the IBM PC and Apple II.
An NMI causes the CPU to stop what it was doing, change the program counter to point to a
particular address and continue executing code from that location. Programmers are unable to
mask (disable) it, hence the name. NMIs have two main uses.
One is for debugging faulty code, where it can be instantly suspended at any point and control
transferred to a special monitor program, from which the developer can inspect the machine's
memory and examine the internal state of the program as it rests in "suspended animation". The
Apple Macintosh's "programmers' button" worked in this way, as do certain key combinations on
SUN workstations.
A second is for leisure users and gamers. Devices which added a button to generate an NMI, such
as Romantic Robot's Multiface, were a popular accessory for 1980s 8-bit and 16-bit home
computers. These peripherals had a small amount of ROM and an NMI button. Pressing the button
transferred control to the software in the peripheral's ROM, allowing the suspended program to be
saved to disk (very useful for tape-based games with no disk support, but also for saving games
in progress), screenshots to be saved or printed, or values in memory to be manipulated -- a
cheating technique to acquire extra lives, for example.
Some floppy disk interfaces, such as Miles Gordon Technology's DISCiPLE and PlusD for the
ZX Spectrum, also included an NMI button.
Intel 8087
The 8087 was the first math coprocessor designed by Intel and it was built to be paired with the
Intel 8088 and 8086 microprocessors. The purpose of the 8087, the first of the x87 family, was to
speed up computations in demanding applications involving floating point mathematics. The
performance improvement ranged from 20% to 500%, depending on the specific application.
This coprocessor introduced about 60 new instructions available to the programmer, all
beginning with "F" to differentiate them from the standard 8086/88 integer math instructions. For
example, in contrast to ADD/MUL, the 8087 provided FADD/FMUL.
The 8087 (and, in fact, the entire x87 family) does not provide a freely addressable, linear register set
such as the AX/BX/CX/DX registers of the 8086/88 and 80286 processors; instead, the x87 registers
are structured as an eight-level-deep stack.
When Intel designed the 8087 it aimed to make a standard floating point format for future designs.
Indeed, one of the most historically significant contributions of this coprocessor was the
introduction of the floating point formats that became the basis of the IEEE 754 standard on x86
PCs. The 8087 provided two basic 32/64-bit floating point data types and an additional extended
80-bit internal format to improve accuracy over large and complex calculations. Apart from this,
the 8087 offered an 80-bit/17-digit packed BCD (binary coded decimal) format and 16-, 32- and
64-bit integer data types.
The 8087, announced in 1980, was superseded by the 80287, 80387DX/SX and the 487SX. The Intel
80486DX, Pentium and later processors include the floating-point unit directly on the CPU core.
Peripheral
A peripheral is a type of computer hardware that is added to a host computer in order to expand
its abilities. More specifically, the term is used to describe those devices that are optional in
nature, as opposed to hardware that is essential or always required in principle.
The term also tends to be applied to devices that are hooked up externally, typically through some
form of computer bus like USB. Typical examples include joysticks, printers and scanners.
8255 PPI
The 8255 Programmable Peripheral Interface (PPI) is used to interface to the keyboard and a
parallel printer port in PCs (usually as part of an integrated chipset). It requires the insertion of
wait states if used with a microprocessor using higher than an 8 MHz clock. The PPI has 24 pins
for I/O that are programmable in groups of 12 pins, and has three distinct modes of operation.
In the seven-segment display example, both ports A and B are programmed as (mode 0) simple
latched output ports. Port A provides the segment data inputs to the display and port B provides
a means of selecting one display position at a time. Different values are displayed in each digit
via fast time multiplexing. The values for the resistors and the type of transistors used are
determined by the current requirements (see text for details).
Port C is used for control or handshaking signals (it cannot be used for data). In the keyboard
example, a keyboard encoder debounces the key-switches and provides a strobe whenever a key
is depressed. DAV is activated on a key press, strobing the ASCII-coded key code into Port A.
Mode 2 (bi-directional bused data) is only allowed with port A. It is used for interfacing two
computers, a GPIB interface, etc. Its timing diagram is a combination of the Mode 1 Strobed Input
and Mode 1 Strobed Output timing diagrams.
PIC 8259
Vector    Type                  Common Uses
00 - 01   Exception Handlers
02        Non-Maskable IRQ
03 - 07   Exception Handlers
08        Hardware IRQ0         System Timer
09        Hardware IRQ1         Keyboard
0A        Hardware IRQ2         Redirected
0B        Hardware IRQ3         Serial Port
0C        Hardware IRQ4         Serial Port
0D        Hardware IRQ5         Reserved/Sound Card
0E        Hardware IRQ6         Floppy Disk
0F        Hardware IRQ7         Parallel Comms.
10 - 6F   Software Interrupts
70        Hardware IRQ8         Real Time Clock
71        Hardware IRQ9         Redirected IRQ2
72        Hardware IRQ10        Reserved
73        Hardware IRQ11        Reserved
74        Hardware IRQ12        PS/2 Mouse
75        Hardware IRQ13        Maths Co-Processor
76        Hardware IRQ14        Hard Disk
77        Hardware IRQ15        Reserved
78 - FF   Software Interrupts
The average PC only has 15 hardware IRQs plus one non-maskable IRQ. The rest of the interrupt
vectors are used for software interrupts and exception handlers. Exception handlers are routines,
like ISRs, which get called when an error occurs. One example is the first interrupt vector, which
holds the address of the divide-by-zero exception handler. When a divide by zero occurs, the
microprocessor fetches the address at 0000:0000 and starts executing the code at this address.
Hardware Interrupts
The Programmable Interrupt Controller (PIC) handles hardware interrupts. Most PCs have two
of them located at different addresses. One handles IRQs 0 to 7 and the other IRQs 8 to 15,
giving a total of 15 individual IRQ lines, as the second PIC is cascaded into the first using IRQ2.
Most of the PIC's initialization is done by the BIOS, thus we only have to worry about two
instructions. The PIC has a facility whereby we can mask individual IRQs so that these
requests will not reach the processor. Thus the first instruction is Operation Control Word 1
(OCW1), which sets which IRQs to mask and which not to.
As there are two PICs located at different addresses, we must first determine which PIC we need
to use. The first PIC, located at base address 0x20, controls IRQ 0 to IRQ 7. The bit format of
PIC1's Operation Control Word 1 is shown below in table 2.
Bit   Disable IRQ   Function
7     IRQ7          Parallel Port
6     IRQ6          Floppy Disk
5     IRQ5          Reserved/Sound Card
4     IRQ4          Serial Port
3     IRQ3          Serial Port
2     IRQ2          PIC2
1     IRQ1          Keyboard
0     IRQ0          System Timer
Table 2 : PIC1 Operation Control Word 1 (0x21)
Note that IRQ 2 is connected to PIC2; thus if you mask this IRQ, you will be disabling IRQs 8
to 15.
The second PIC, located at base address 0xA0, controls IRQs 8 to 15. Below are the individual
bits required to make up its Operation Control Word.
Bit   Disable IRQ   Function
7     IRQ15         Reserved
6     IRQ14         Hard Disk
5     IRQ13         Maths Co-Processor
4     IRQ12         PS/2 Mouse
3     IRQ11         Reserved
2     IRQ10         Reserved
1     IRQ9          Redirected IRQ2
0     IRQ8          Real Time Clock
Table 3 : PIC2 Operation Control Word 1 (0xA1)
As the above table shows the bits required to disable an IRQ, we must invert them should we
want to enable an IRQ. For example, if we want to enable IRQ 3 then we would send the byte 0xF7
as OCW1 to PIC1. But what happens if one of these IRQs is already enabled and we then come
along and disable it?
Therefore we must first read the existing mask, change only our bit, and write the byte back to
the register, so as to cause the least upset to the other IRQs. Going back to our IRQ3 example,
we could use outportb(0x21, (inportb(0x21) & 0xF7)); to enable IRQ3. Take note that OCW1 goes
to the register at Base + 1.
The same procedure must be used to mask (disable) an IRQ once we are finished with it. However,
this time we must OR the byte 0x08 into the contents of OCW1. An example of such code is
outportb(0x21, (inportb(0x21) | 0x08));
outportb(0x20,0x20);  /* send a non-specific EOI to PIC1 */
enable();             /* set the interrupt flag again */
}
void interrupt yourisr() defines this function as an Interrupt Service Routine. disable(); clears the
interrupt flag, so that no other hardware interrupt, except an NMI (Non-Maskable Interrupt), can
occur. Otherwise, an interrupt with a higher priority than this one could interrupt the execution of
this ISR. However, this is not really a problem in many cases, thus it is optional.
The body of your ISR will include code which you want to execute upon this interrupt request
being activated. Most ports/UARTs may interrupt the processor for a range of reasons, e.g. byte
received, time-outs, FIFO buffer empty, overruns, etc., thus the nature of the interrupt has to be
determined. This is normally achieved by reading the status registers of the port you are using.
Once it has been established, you can service its requests.
If you read any data from a port, it is common practice to place it in a buffer rather than
immediately writing it to the screen, which would delay further interrupts from being processed.
Most ports these days have FIFO buffers which can contain more than one byte, thus you should
repeat your read routine until the FIFO is empty, then exit your ISR.
void main(void)
{
oldhandler = getvect(INTNO);   /* save the old interrupt vector */
setvect(INTNO, yourisr);       /* install our ISR in its place */
/* ... unmask the IRQ and run the main program ... */
setvect(INTNO, oldhandler);    /* restore the old vector before exiting */
}
The basic block diagram of the PIC is shown above. The 8 individual interrupt request lines are
first passed through the Interrupt Mask Register (IMR) to see if they have been masked or not. If
they are masked, then the request isn't processed any further. However if they are not masked,
they will register their request with the Interrupt Request Register (IRR).
The Interrupt Request Register holds all the requested IRQs until they have been dealt with
appropriately. If required, this register can be read by setting certain bits of Operation Control
Word 3. The Priority Resolver simply selects the IRQ of highest priority; the higher-priority
interrupts are the lower-numbered ones. For example, IRQ 0 has the highest priority, followed by
IRQ 1, etc.
Now that the PIC has determined which IRQ to process, it is time to tell the processor, so that
it can call your ISR for you. This is done by sending an INT to the processor, i.e. the INT
line on the processor is asserted. The processor will then finish the current instruction it is
processing and acknowledge your INT request with an INTA (Interrupt Acknowledge) pulse.
Upon receiving the processor's INTA, the IRQ which the PIC is processing at the time is stored in
the In Service Register (ISR), which, as the name suggests, shows which IRQ is currently in
service. The IRQ's bit is also reset in the Interrupt Request Register, as it is no longer requesting
service but actually getting service.
Another INTA pulse will be sent by the processor, to tell the PIC to place an 8-bit pointer on the
data bus, corresponding to the IRQ number. If an IRQ serviced by PIC2 is requesting the service,
then PIC2 will send the pointer to the processor. The Master (PIC1), at this stage, will select PIC2
via the cascade lines to release the pointer.
IRQ2/IRQ9 Redirection
The redirection of IRQ2 causes quite some confusion, and thus is discussed here. In the original
XTs there was only one PIC, and thus only eight IRQs. However, users soon outgrew these
resources, so an additional seven usable IRQs were added to the PC. This involved attaching
another PIC to the one already in the XT. Compatibility always causes problems, as the new
configuration still had to be compatible with old hardware and software. The "new" configuration
is shown below.
The CPU only has one interrupt line, thus the second controller had to be connected to the first
controller in a master/slave configuration. IRQ2 was selected for this. By using IRQ2 for the
second controller, no other devices could use IRQ2, so what happened to all the devices using
IRQ2? Nothing: the IRQ2 interrupt request line found on the bus was simply diverted into the IRQ9
input. As no devices yet used the second PIC or IRQ9, this could be done.
The next problem was that a hardware device using IRQ2 would install its ISR at INT 0x0A.
Therefore an ISR routine was installed at INT 71h which sends an EOI to PIC2 and then calls the
ISR at INT 0x0A. If you disassemble the ISR for IRQ9, it will look a little like:
MOV AL,20   ; load the EOI command
OUT A0,AL   ; send the EOI to PIC2
INT 0A      ; call the original IRQ2 handler
IRET        ; return from the interrupt
The routine only has to send an EOI to PIC2, as it is expected that an ISR routine written for IRQ2
will send an EOI to PIC1. This example destroys the contents of register AL, thus AL must be
pushed onto the stack first (not shown in the example). As PIC1 is initialized with a slave on IRQ2,
any request using PIC2 will not call the ISR routine for IRQ2; the 8-bit pointer will come from PIC2.
Address   Read/Write   Function
20h       Write        Initialization Command Word 1 (ICW1)
20h       Write        Operation Control Word 2 (OCW2)
20h       Write        Operation Control Word 3 (OCW3)
20h       Read         Interrupt Request Register (IRR)
20h       Read         In-Service Register (ISR)
21h       Write        Initialization Command Word 2 (ICW2)
21h       Write        Initialization Command Word 3 (ICW3)
21h       Write        Initialization Command Word 4 (ICW4)
21h       Read/Write   Interrupt Mask Register (OCW1)
Table 4 : PIC1 Addresses
PIC2 Addresses . . .
Address   Read/Write   Function
A0h       Write        Initialization Command Word 1 (ICW1)
A0h       Write        Operation Control Word 2 (OCW2)
A0h       Write        Operation Control Word 3 (OCW3)
A0h       Read         Interrupt Request Register (IRR)
A0h       Read         In-Service Register (ISR)
A1h       Write        Initialization Command Word 2 (ICW2)
A1h       Write        Initialization Command Word 3 (ICW3)
A1h       Write        Initialization Command Word 4 (ICW4)
A1h       Read/Write   Interrupt Mask Register (OCW1)
Table 5 : PIC2 Addresses
Bit   Function
7:5   Interrupt Vector Addresses for MCS-80/85 Mode
4     Must be set to 1 for ICW1
3     1 = Level Triggered, 0 = Edge Triggered
2     Call Address Interval (ignored in 8086/8088 mode)
1     1 = Single PIC, 0 = Cascaded PICs
0     1 = ICW4 Needed, 0 = No ICW4 Needed
Table 6 : Initialization Command Word 1 (ICW1)
The 8259 Programmable Interrupt Controller offers many other features which are not used in the
PC. It also offers support for MCS-80/85 microprocessors. All we have to be aware of, being PC
users, is whether the system is running in single mode (one PIC) or in cascaded mode (more than
one PIC), and whether Initialization Command Word 4 is needed. If no ICW4 is used, then all of its
bits will be set to 0. As we are using the PIC in 8086 mode, we must send an ICW4.
Bit   8086/8088 Mode   MCS-80/85 Mode
7     I7               A15
6     I6               A14
5     I5               A13
4     I4               A12
3     I3               A11
2     -                A10
1     -                A9
0     -                A8
Table 7 : Initialization Command Word 2 (ICW2)
Bit   Function
7     Reserved. Set to 0
6     Reserved. Set to 0
5     Reserved. Set to 0
4     Reserved. Set to 0
3     Reserved. Set to 0
2:0   Slave ID: 000 = Slave 0, 001 = Slave 1, 010 = Slave 2, 011 = Slave 3,
      100 = Slave 4, 101 = Slave 5, 110 = Slave 6, 111 = Slave 7
Table 9 : Initialization Command Word 3 for Slaves (ICW3)
Bit   Function
7     Reserved. Set to 0
6     Reserved. Set to 0
5     Reserved. Set to 0
4     1 = Special Fully Nested Mode
3:2   0x = Non-Buffered Mode, 10 = Buffered Mode (Slave), 11 = Buffered Mode (Master)
1     1 = Auto EOI, 0 = Normal EOI
0     1 = 8086/8088 Mode, 0 = MCS-80/85 Mode
Table 10 : Initialization Command Word 4 (ICW4)
Once again, many of these are special functions not used with the 8259 PIC in a PC. We don't use
Special Fully Nested Mode, thus this bit is set to 0. Likewise, we use non-buffered mode and
normal EOIs, thus all these corresponding bits are set to 0. The only thing we must set is
8086/8088 mode, which is done using bit 0.
Bit   PIC2          PIC1
7     Mask IRQ15    Mask IRQ7
6     Mask IRQ14    Mask IRQ6
5     Mask IRQ13    Mask IRQ5
4     Mask IRQ12    Mask IRQ4
3     Mask IRQ11    Mask IRQ3
2     Mask IRQ10    Mask IRQ2
1     Mask IRQ9     Mask IRQ1
0     Mask IRQ8     Mask IRQ0
Table 11 : Operation Control Word 1 (OCW1)
Operation Control Word 1, shown above, is used to mask the inputs of the PIC. This has already
been discussed earlier in this article.
Bit   Function
7:5   000 = Rotate in Auto EOI Mode (Clear)
      001 = Non-Specific EOI
      010 = Reserved
      011 = Specific EOI
      100 = Rotate in Auto EOI Mode (Set)
      101 = Rotate on Non-Specific EOI
      110 = Set Priority Command
      111 = Rotate on Specific EOI
4     Must be set to 0
3     Must be set to 0
2:0   Interrupt level to act upon:
      000 = Act on IRQ 0 or 8      001 = Act on IRQ 1 or 9
      010 = Act on IRQ 2 or 10     011 = Act on IRQ 3 or 11
      100 = Act on IRQ 4 or 12     101 = Act on IRQ 5 or 13
      110 = Act on IRQ 6 or 14     111 = Act on IRQ 7 or 15
Table 12 : Operation Control Word 2 (OCW2)
Bit   Function
7     Must be set to 0
6:5   00 = Reserved
      01 = Reserved
      10 = Reset Special Mask
      11 = Set Special Mask
4     Must be set to 0
3     Must be set to 1
2     1 = Poll Command, 0 = No Poll Command
1:0   00 = Reserved
      01 = Reserved
      10 = Read Interrupt Request Register (IRR)
      11 = Read In-Service Register (ISR)
Table 13 : Operation Control Word 3 (OCW3)
Bits 0 and 1 of Operation Control Word 3 are the most significant to us. These two bits enable
us to read the status of the Interrupt Request Register (IRR) and the In-Service Register (ISR).
This is done by setting the appropriate bits as above and reading the register at the base
address.
For example, if we wanted to read the In-Service Register (ISR), then we would set both bits 1 and
0 to 1. The next read of the base register (0x20 for PIC1 or 0xA0 for PIC2) will return the status of
the In-Service Register.
http://www.absoluteastronomy.com/reference/list_of_intel_microprocessors
This list of Intel microprocessors attempts to present all of Intel's processors (µPs) from the
pioneering 4-bit 4004 (1971) to the present high-end offerings, the 64-bit Itanium 2 (2002) and
Pentium 4F with EM64T (2004). Concise technical data are given for each product.
The 4-bit and 8-bit processors
8086
Introduced June 8, 1978
Clock speeds:
5 MHz with 0.33 MIPS
8 MHz with 0.66 MIPS
10 MHz with 0.75 MIPS
iAPX 432
Introduced January 1, 1981 as Intel's first 32-bit microprocessor
Object/capability architecture
Microcoded operating system primitives
One terabyte virtual address space
Hardware support for fault tolerance
Two-chip General Data Processor (GDP), consists of 43201 and 43202
43203 Interface Processor (IP) interfaces to I/O subsystem
43204 Bus Interface Unit (BIU) simplifies building multiprocessor systems
43205 Memory Control Unit (MCU)
Architecture and execution unit internal data paths 32 bit
Clock speeds:
5 MHz
7 MHz
8 MHz
80386DX
Introduced October 17, 1985
Clock speeds:
16MHz with 5 to 6 MIPS
2/16/1987 20MHz with 6 to 7 MIPS
4/4/1988 25MHz with 8.5 MIPS
80486DX
Introduced April 10, 1989
Clock speeds:
25MHz with 20 MIPS (16.8 SPECint92, 7.40 SPECfp92)
5/7/1990 33MHz with 27 MIPS (22.4 SPECint92 on Micronics M4P 128k L2)
6/24/1991 50MHz with 41 MIPS (33.4 SPECint92, 14.5 SPECfp92 on Compaq/50L 256K
L2)
Bus Width 32 bits
Number of Transistors 1.2 million at 1 µm; the 50 MHz version was at 0.8 µm
Addressable memory 4 gigabytes
Virtual memory 64 terabytes
Level 1 cache on chip
50X performance of the 8088
Used in Desktop computing and servers
Pentium ("Classic")
Introduced March 22, 1993
P5 0.8 µm process technology
Bus width 64 bits
System bus speed 50 or 60 or 66 MHz
Address bus 32 bits
Number of transistors 3.1 million
Addressable Memory 4 gigabytes
Virtual Memory 64 terabytes
Socket 4 273 pin PGA processor package
Package dimensions 2.16" x 2.16"
Superscalar architecture brought 5X the performance of the 33MHz 486DX processor
Runs on 5 volts
Used in desktops
16KB of L1 cache
Variants
60 MHz with 100 MIPS (70.4 SPECint92, 55.1 SPECfp92 on Xpress 256K L2)
66 MHz with 112 MIPS (77.9 SPECint92, 63.6 SPECfp92 on Xpress 256K L2)
P54C 0.6 µm process technology
Socket 7 296/321 pin PGA package
Number of transistors 3.2 million
Pentium Pro
Introduced November 1, 1995
0.6 µm process technology
Precursor to Pentium II and III
Socket 8 processor package (387 pins) (Dual SPGA)
Number of transistors 22 million
16KB L1 cache
256KB integrated L2 cache
60 MHz system bus speed
Variants
150 MHz Introduced November 1, 1995
0.35 µm process technology, or 0.35 µm CPU with 0.6 µm L2 cache
Introduced November 1, 1995
Number of transistors 36.5 million or 22 million
512KB or 256KB integrated L2 cache
60 or 66 MHz system bus speed
Variants
166 MHz (66 MHz bus speed, 512KB 0.35 µm cache) Introduced November 1, 1995
180 MHz (60 MHz bus speed, 256KB 0.6 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 256KB 0.6 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 512KB 0.35 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 1MB 0.35 µm cache) Introduced August 18, 1997
Pentium II
Introduced May 7, 1997
Klamath 0.35 µm process technology (233, 266, 300 MHz)
Pentium 4 (not 4EE, 4E, 4F), Itanium, P4-based Xeon, Itanium 2 (chronological entries)
Introduced April 2000 July 2002
See main entries
Celeron (Pentium III Tualatin-based)
Tualatin Celeron - 0.13 µm process technology
32KB L1 cache
256KB Advanced Transfer L2 cache
100 MHz system bus speed
Pentium 4
0.18 µm process technology (1.40 and 1.50 GHz)
Introduced November 20, 2000
L2 cache was 256KB Advanced Transfer Cache (integrated)
Processor Package Style was PGA423, PGA478
System Bus Speed 400 MHz
SSE2 SIMD Extensions
Number of Transistors 42 million
Used in desktops and entry-level workstations
0.18 µm process technology (1.7 GHz)
Introduced April 23, 2001
Itanium
Released May 29, 2001
733 MHz and 800 MHz
Itanium 2
Released July 2002
900 MHz and 1 GHz