Microprocessor
A microprocessor (abbreviated µP or uP) is an electronic computer central processing unit
(CPU) made from miniaturized transistors and other circuit elements on a single semiconductor
integrated circuit (IC).
Before the advent of microprocessors, electronic CPUs were made from discrete (separate) TTL
integrated circuits; before that, individual transistors; and before that, from vacuum tubes. There
have even been designs for simple computing machines based on mechanical parts such as
gears, shafts, levers, Tinkertoys, etc. Leonardo da Vinci made one such design, although none
were possible to construct using the manufacturing techniques of the time.
History
The first chips
As with many advances in technology, the microprocessor was an idea whose time had come.
Three projects arguably delivered a complete microprocessor at about the same time: Intel's 4004,
Texas Instruments' TMS 1000, and Garrett AiResearch's Central Air Data Computer.
In 1968 Garrett was invited to produce a digital computer to compete with electromechanical
systems then under development for the main flight control computer in the US Navy's new F-14
Tomcat fighter. The design was complete by 1970, and used a MOS-based chipset as the core
CPU. The design was smaller and much more reliable than the mechanical systems it competed
against, and was used in all of the early Tomcat models. However the system was considered so
advanced that the Navy refused to allow publication of the design, and continued to refuse until
1997. For this reason the CADC, and the MP944 chipset it used, are fairly unknown even today.
TI developed the 4-bit TMS 1000 and stressed pre-programmed embedded applications,
introducing a version called the TMS1802NC on September 17, 1971, which implemented a
calculator on a chip. The Intel chip was the 4-bit 4004, released on November 15, 1971, developed
by Federico Faggin.
TI filed for the patent on the microprocessor. Gary Boone was awarded the patent for the
single-chip microprocessor architecture on September 4, 1973. It may never be known which company
actually had the first working microprocessor running on the lab bench. In both 1971 and 1976,
Intel and TI entered into broad patent cross-licensing agreements, with Intel paying royalties to TI
for the microprocessor patent. A nice history of these events is contained in court documentation
from a legal dispute between Cyrix and Intel, with TI as intervenor and owner of the
microprocessor patent.
Interestingly, a third party claims to have been awarded a patent which might cover the
"microprocessor".
Both the Z80 and 6502 concentrated on low overall cost, through a combination of small
packaging, simple computer bus requirements, and the inclusion of circuitry that would normally
have to be provided in a separate chip (for instance, the Z80 included a memory controller). It was
these features that allowed the home computer "revolution" to take off in the early 1980s,
eventually delivering semi-usable machines that sold for US$99.
Motorola trumped the entire 8-bit world by introducing the MC6809, arguably one of the most
powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded and also one of the
most complex hardwired logic designs that ever made it into production for any microprocessor.
Microcoding replaced hardwired logic at about this point in time for all designs more powerful
than the MC6809 specifically because the design requirements were getting too complex for
hardwired logic.
Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry of interest
due to its innovative and powerful instruction set architecture.
A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka CDP1802, RCA
COSMAC), which was used in NASA's Voyager and Viking space probes of the 1970s, and on board
the Galileo probe to Jupiter (launched 1989, arrived 1995). The CDP1802 was used because it
could be run at very low power, and because its production process (silicon on sapphire)
ensured much better protection against cosmic radiation and electrostatic discharges than that of
any other processor of the era; thus, the 1802 is said to be the first radiation-hardened
microprocessor.
16-bit
The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16,
introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the IMP-8. In 1975,
National introduced the first 16-bit single-chip microprocessor, the PACE, which was later followed
by an NMOS version, the INS8900.
Other early multi-chip 16-bit microprocessors include one used by Digital Equipment Corporation
in the LSI-11 OEM board set and the packaged PDP-11/03 minicomputer.
Another early single-chip 16-bit microprocessor was TI's TMS 9900, which was also compatible
with their TI 990 line of minicomputers. The 9900 was used in the TI 990/4 minicomputer, the
TI-99/4A home computer, and the TM990 line of OEM microcomputer boards. The chip was packaged
in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such as the
Intel 8080 used the more common, smaller, and less expensive 40-pin DIP. A follow-on chip, the
TMS 9980, was designed to compete with the Intel 8080, had the full TI 990 16-bit instruction set,
used a plastic 40-pin package, moved data 8 bits at a time, but could only address 16 KB. A third
chip, the TMS 9995, was a new design. The family later expanded to include the 99105 and 99110.
Intel followed a different path, having no minicomputers to emulate, and instead "upsized" their
8080 design into the 16-bit Intel 8086, the first member of the x86 family, which powers most
modern PC-type computers. Intel introduced the 8086 as a cost-effective way of porting software
from the 8080 lines, and succeeded in winning much business on that premise. Following up their
8086 and 8088, Intel released the 80186, 80286 and, in 1985, the 32-bit 80386, cementing their PC
market dominance with the processor family's backwards compatibility.
The integrated microprocessor memory management unit (MMU) was developed by Childs et al.
of Intel, and awarded US patent number 4,442,484.
32-bit designs
16-bit designs were in the market only briefly when full 32-bit implementations started to appear.
The world's first single-chip 32-bit microprocessor was the AT&T Bell Labs BELLMAC-32A, with
first samples in 1980 and general production in 1982. After the divestiture of AT&T in 1984, it
was renamed the WE 32000 (WE for Western Electric), and had two follow-on generations, the WE
32100 and WE 32200. These microprocessors were used in the AT&T 3B5 and 3B15
minicomputers; in the 3B2, the world's first desktop supermicrocomputer; in the "Companion",
the world's first 32-bit laptop computer; and in "Alexander", the world's first book-sized
supermicrocomputer, featuring ROM-pack memory cartridges similar to today's gaming consoles.
All these systems ran the original Bell Labs UNIX operating system.
The most famous of the 32-bit designs is the MC68000, introduced in 1979. The 68K, as it was
widely known, had 32-bit registers but used 16-bit internal data paths, and a 16-bit external data
bus to reduce pin count. Motorola generally described it as a 16-bit processor, though it clearly
has 32-bit architecture. The combination of high speed, large (16 megabyte) memory space and
fairly low costs made it the most popular CPU design of its class. The Apple Lisa and Macintosh
designs made use of the 68000, as did a host of other designs in the mid-1980s, including the
Atari ST and Commodore Amiga.
Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but was not a
commercial success. It had an advanced capability-based object-oriented architecture, but poor
performance compared to other competing architectures such as the Motorola 68000.
Motorola's success with the 68000 led to the MC68010, which added virtual memory support. The
MC68020, introduced in 1985, added full 32-bit data and address buses. The 68020 became
hugely popular in the Unix supermicrocomputer market, and many small companies (e.g., Altos,
Charles River Data Systems) produced desktop-size systems. When Motorola followed with the
MC68030, which brought the MMU onto the chip, the 68K family became the processor for everything
that wasn't running DOS. The continued success led to the MC68040, which included an FPU for
better math performance. A 68050 failed to achieve its performance goals and was not released, and
the follow-up MC68060 was released into a market saturated by much faster RISC designs. The
68K family faded from the desktop in the early 1990s.
Other large companies designed the 68020 and follow-ons into embedded equipment. At one
point, there were more 68020s in embedded equipment than there were Intel Pentiums in PCs (See
this webpage for this embedded usage information). The ColdFire processor cores are
derivatives of the venerable 68020.
During this time (early to mid-1980s), National Semiconductor introduced a very similar 16-bit
pinout, 32-bit internal microprocessor called the NS 16032 (later renamed 32016), a full 32-bit
version named the NS 32032, and a line of 32-bit industrial OEM microcomputers. By the
mid-1980s, Sequent had introduced the first symmetric multiprocessing (SMP) server-class computer
using the NS 32032. This was one of the design's few wins, and it disappeared in the late 1980s.
In the late 1980s, "microprocessor wars" started killing off some of the microprocessors.
Apparently, with only one major design win, Sequent, the NS 32032 just faded out of existence,
and Sequent switched to Intel microprocessors.
64-bit designs
Though RISC-based designs (see below) featured the first crop of 64-bit processors long before
the current mainstream PC microchips from AMD and Intel, they were limited to proprietary
operating systems. However, with AMD's introduction of the Athlon 64, the first 64-bit x86 chip,
followed by Intel's own 64-bit chips, the 64-bit race has truly begun. Both processors are also
backward compatible, meaning they can run 32-bit legacy applications as well as new 64-bit
software. With 64-bit Windows XP and Linux builds that run on 64 bits, the software too is
geared to utilise the full power of such processors.
RISC
In the mid-1980s to early-1990s, a crop of new high-performance RISC (reduced instruction set
computer) microprocessors appeared, which were initially used in special purpose machines and
Unix workstations, but have since become almost universal in all roles except the Intel-standard
desktop.
The first commercial RISC design was released by MIPS Computer Systems: the 32-bit R2000 (the
R1000 was not released). The R3000 made the design truly practical, and the R4000 introduced the
world's first commercially available 64-bit design. Competing projects resulted in the IBM POWER
and Sun SPARC systems. Soon every major vendor was releasing a RISC design, including the AT&T
CRISP, AMD 29000, Intel i860 and Intel i960, Motorola 88000, DEC Alpha and the HP PA-RISC.
Market forces have "weeded out" many of these designs, leaving the POWER and the derived
PowerPC as the main desktop RISC processor, with the SPARC being used in Sun designs only.
MIPS continues to supply some SGI systems, but is primarily used as an embedded design,
notably in Cisco routers. The rest of the original crop of designs have either disappeared, or are
about to. Other companies have attacked niches in the market, notably ARM, originally intended
for home computer use but since focussed at the embedded processor market. Today RISC CPUs
(and microcontrollers) represent the vast majority of all CPUs in use.
Of course, in the IBM-compatible PC world, Intel, AMD, and now VIA of Taiwan all make
x86-compatible microprocessors. In 64-bit computing, the DEC Alpha, the AMD64, and the
HP-Intel Itanium are the most popular designs as of late 2004.
x86 or 80x86 is the generic name of a microprocessor architecture first developed and
manufactured by Intel.
The architecture is called x86 because the earliest processors in this family were identified only
by numbers ending in the sequence "86": the 8086, the 80186, the 80286, the 386, and the 486.
Because one cannot trademark numbers, Intel and most of its competitors began to use
trademarkable names such as Pentium for subsequent generations of processors, but the earlier
naming scheme has stuck as a term for the entire family. Intel now refers to x86 as IA-32, an
abbreviation for Intel Architecture, 32-bit.
Intel 8085
The Intel 8085 is an 8-bit microprocessor made by Intel in the mid-1970s. It was binary compatible
with the more-famous Intel 8080 but required less supporting hardware, thus allowing simpler and
less expensive microcomputer systems to be built.
The "5" in the model number came from the fact that the 8085 required only a 5-volt power supply
rather than the 5 V and 12 V supplies the 8080 needed. Both processors were sometimes used in
computers running the CP/M operating system, and the 8085 later saw use as a microcontroller
(largely by virtue of its low component count). Both designs were later eclipsed by the
compatible but more capable Zilog Z80, which
took over most of the CP/M computer market as well as taking a large share of the booming home
computer market in the early-to-mid-1980s.
The 8085 can access 65,536 individual memory locations through its 16-bit address bus, but, being
an 8-bit microprocessor, it transfers only eight bits of data at a time.
Unlike some other microprocessors of its era, it has a separate address space for up to 256 I/O
ports. It also has a built-in register array, usually labelled A, B, C, D, E, H, and L. The
microprocessor also has three maskable hardware interrupt inputs (RST 7.5, RST 6.5, and RST 5.5),
found on pins 7, 8, and 9.
8-bit
8-bit CPUs normally use an 8-bit data bus and a 16-bit address bus which means that their
address space is limited to 64 kilobytes; this is not a "natural law", however, and thus there are
exceptions.
The first widely adopted 8-bit microprocessor was the Intel 8080, being used in many hobbyist
computers of the late 1970s and early 1980s, often running the CP/M operating system. The Zilog
Z80 (compatible with the 8080) and the Motorola 6800 were also used in similar computers. The
Z80 and the MOS
Technology 6502 8-bit CPUs were widely used in home computers and game consoles of the 70s
and 80s. Many 8-bit CPUs or microcontrollers are the basis of today's ubiquitous embedded
systems.
There are 2^8 (256) possible values for 8 bits.
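As a quick arithmetic check, the relationship between bus width and address-space size can be sketched in a few lines of Python:

```python
# Illustrative sketch: distinct values per bus width.
for bits in (8, 16, 32):
    print(f"{bits}-bit: {2**bits} distinct values")

# A 16-bit address bus therefore spans 2**16 = 65,536 locations (64 KB),
# matching the 64-kilobyte limit typical of 8-bit CPUs.
assert 2**8 == 256
assert 2**16 == 65536  # 64 * 1024
```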
Address space
In computing, an address space defines a context in which an address makes sense.
Two addresses may be numerically the same, but refer to different things, if they belong to
different address spaces.
In general, things in one address space are physically in a different location than things in
another address space. For example, "house number 101 South" on one particular southward
street is completely different from any house number (not just the 101st house) on a different
southward street.
However, sometimes different address spaces overlap (some physical location exists in both
address spaces). When overlapping address spaces are not aligned, translation is necessary.
For example, virtual-to-physical address translation is necessary to translate addresses in the
virtual memory address space to addresses in the physical address space -- one physical address,
and one or more numerically different virtual addresses, may all refer to the same physical byte
of RAM.
Many programmers prefer to use a flat memory model, in which there is no distinction
between code space, data space, and virtual memory -- in other words, numerically identical
pointers refer to exactly the same byte of RAM in all three address spaces.
Unfortunately, many early computers did not support a flat memory model -- in particular,
Harvard architecture machines force program storage to be completely separate from data
storage.
Many modern DSPs (such as the Motorola 56000) have three separate storage areas -- program
storage, coefficient storage, and data storage. Some commonly used instructions fetch from
all three areas simultaneously; fewer storage areas (even with the same or more total bytes of
storage) would make those instructions run slower. (Whether three storage areas are merely a
variant of the Harvard architecture, or whether "Harvard" implies exactly two storage areas, is
a matter of terminology.)
Primary storage
Primary storage is a category of computer storage, often called main memory. Confusingly, the
term primary storage has recently been used in a few contexts to refer to online storage (hard
disk), which is usually classified as secondary storage.
A particular location in storage is selected by its physical memory address. That address remains
the same, no matter how the particular value stored there changes.
Over the history of computing, a variety of technologies have been used for primary storage.
Today, we are most familiar with random access memory (RAM) made out of many small
integrated circuits. Some early computers used mercury delay lines, in which a series of acoustic
pulses were sent along a tube filled with mercury. When the pulse reached the end of the tube, the
circuitry detected whether the pulse represented a binary 1 or 0 and caused the oscillator at the
beginning of the line to repeat the pulse. Other early computers used high-speed magnetic drums
as primary storage.
Before the use of integrated circuits for memory became widespread, primary storage was
implemented in many different forms:
Williams tube
Delay line memory
Drum memory
Core memory
Twistor memory
Bubble memory
Virtual memory
Virtual memory is a computer design feature that permits software to use more main memory (the
memory which the CPU can read and write to directly) than the computer actually physically
possesses.
Most computers possess four kinds of memory: registers in the CPU; caches both inside and
adjacent to the CPU; physical memory, generally in the form of RAM, which the CPU can read and
write to directly and reasonably quickly; and disk storage, which is much slower but also much
larger. Many applications require access to more information (code as well as data) than can be
stored in physical memory. This is especially true when the operating system is one that wishes
to allow multiple processes/applications to run seemingly in parallel. The obvious response to the
problem of the maximum size of the physical memory being less than that required for all running
programs is for the application to keep some of its information on the disk, and move it back and
forth to physical memory as needed, but there are a number of ways to do this.
One option is for the application software itself to be responsible both for deciding which
information is to be kept where, and also for moving it back and forth. The programmer would do
this by determining which sections of the program (and also its data) were mutually exclusive and
then arranging for loading and unloading the appropriate sections from physical memory, as
needed. The disadvantage of this approach is that each application's programmer must spend
time and effort on designing, implementing, and debugging this mechanism, instead of focusing
on their application; this hampered programmers' efficiency. Also, if any programmer could truly
choose which of their items of data to store in the physical memory at any one time, they could
easily conflict with the decisions made by another programmer, who also wanted to use all the
available physical memory at that point.
The alternative is to use virtual memory, in which a combination of special hardware and
operating system software makes use of both kinds of memory to make it look as if the computer
has a much larger main memory than it actually does. It does this in a way that is invisible to the
rest of the software running on the computer. It usually provides the ability to simulate a main
memory of almost any size, as limited by the size of the addresses being used by the operating
system and CPU: the total size of the virtual memory can be 2^32 bytes (approximately 4
gigabytes) for a 32-bit system, while newer 64-bit chips and operating systems use 64- or 48-bit
addresses and can index much more virtual memory.
This makes the job of the application programmer much simpler. No matter how much memory
the application needs, it can act as if it has access to a main memory of that size. The
programmer can also completely ignore the need to manage the moving of data back and forth
between the different kinds of memory.
In technical terms, virtual memory allows software to run in a memory address space whose size
and addressing are not necessarily tied to the computer's physical memory. While conceivably
virtual memory could be implemented solely by operating system software, in practice its
implementation almost universally uses a combination of hardware and operating system
software.
Basic operation
When virtual memory is in use, each time a main memory location is read or written by the CPU,
hardware within the computer translates the address of the memory location generated by the
software (the virtual memory address) into either:
the address of a real memory location (the physical memory address) which is assigned
within the computer's physical memory to hold that memory item, or
an indication that the desired memory item is not currently resident in main memory (a so-called
virtual memory exception)
In the former case, the memory reference operation is completed, just as if the virtual memory
were not involved. In the latter case, the operating system is invoked to handle the situation,
since the actions needed before the program can continue are usually quite complex.
The effect of this is to swap sections of information between the physical memory and the disk;
the area of the disk which holds the information which is not currently in physical memory is
called the swap file, page file, or swap partition (on some operating systems it is a dedicated
partition of a disk).
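The translate-or-fault step just described can be sketched as follows; the 4 KB page size, the `PageFault` name, and the dictionary standing in for a page table are illustrative assumptions, not any real OS or MMU interface.

```python
# Minimal sketch of the translate-or-fault behaviour of an MMU.
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when the referenced page is not resident in main memory."""

def translate(virtual_addr, page_table):
    """Return the physical address, or raise PageFault if the page is absent."""
    page_number, offset = divmod(virtual_addr, PAGE_SIZE)
    frame = page_table.get(page_number)   # None models a non-resident page
    if frame is None:
        raise PageFault(page_number)      # the OS must bring the page in
    return frame * PAGE_SIZE + offset

page_table = {0: 7, 1: 3}                 # virtual page -> physical frame
print(hex(translate(0x0010, page_table))) # page 0 maps to frame 7
```

In a real system the hardware performs this lookup on every memory reference; the exception path hands control to the operating system, as the following sections describe.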
Details
The translation from virtual to physical addresses is implemented by an MMU. This may be either
a module of the CPU, or an auxiliary, closely coupled chip.
The operating system is responsible for deciding which parts of the program's simulated main
memory are kept in physical memory. The operating system also maintains the translation tables
which provide the mappings between virtual and physical addresses, for use by the MMU. Finally,
when a virtual memory exception occurs, the operating system is responsible for allocating an
area of physical memory to hold the missing information, and for bringing the relevant
information in from the disk.
In most computers, these translation tables are stored in physical memory. Therefore, a virtual
memory reference might actually involve two or more physical memory references: one or more
to retrieve the needed address translation from the page tables, and a final one to actually do the
memory reference.
To minimize the performance penalty of address translation, most modern CPUs include an on-chip
MMU, and maintain a table of recently used virtual-to-physical translations, called a
Translation Lookaside Buffer, or TLB. Addresses with entries in the TLB require no additional
memory references (and therefore time) to translate. However, the TLB can only maintain a fixed
number of mappings between virtual and physical addresses; when the needed translation is not
resident in the TLB, action will have to be taken to load it in.
On some processors, this is performed entirely in hardware; the MMU has to do additional
memory references to load the required translations from the translation tables, but no other
action is needed. In other processors, assistance from the operating system is needed; an
exception is raised, and on this exception, the operating system replaces one of the entries in the
TLB with an entry from the translation table, and the instruction which made the original memory
reference is restarted.
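A rough sketch of how a TLB short-circuits the page-table walk follows; the class name, the capacity of four entries, and the least-recently-used eviction policy are all illustrative assumptions (real TLBs are fixed-size associative hardware structures with various replacement policies).

```python
# Toy TLB: a small cache of virtual-page -> physical-frame translations.
from collections import OrderedDict

class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()      # virtual page -> physical frame

    def lookup(self, page):
        if page in self.entries:          # TLB hit: no page-table walk needed
            self.entries.move_to_end(page)
            return self.entries[page]
        return None                       # TLB miss: caller walks the page table

    def insert(self, page, frame):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used entry
        self.entries[page] = frame

tlb = TLB()
tlb.insert(2, 9)
assert tlb.lookup(2) == 9     # hit: translation found without memory access
assert tlb.lookup(5) is None  # miss: walk the page table, then insert(5, ...)
```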
The hardware that supports virtual memory almost always supports memory protection
mechanisms as well. The MMU may have the ability to vary its operation according to the type of
memory reference (for read, write or execution), as well as the privilege mode of the CPU at the
time the memory reference was made. This allows the operating system to protect its own code
and data (such as the translation tables used for virtual memory) from corruption by an erroneous
application program and to protect application programs from each other and (to some extent)
from themselves (e.g. by preventing writes to areas of memory which contain code).
Paging and virtual memory
Virtual memory is usually (but not necessarily) implemented using paging. In paging, the low
order bits of the binary representation of the virtual address are preserved, and used directly as
the low order bits of the actual physical address; the high order bits are treated as a key to one or
more address translation tables, which provide the high order bits of the actual physical address.
For this reason a range of consecutive addresses in the virtual address space whose size is a
power of two will be translated into a corresponding range of consecutive physical addresses. The
memory referenced by such a range is called a page. The page size is typically in the range of 512
to 8192 bytes (with 4K currently being very common), though page sizes of 4 megabytes or larger
may be used for special purposes. (Using the same or a related mechanism, contiguous regions
of virtual memory larger than a page are often mappable to contiguous physical memory for
purposes other than virtualization, such as setting access and caching control bits.)
The operating system stores the address translation tables, the mappings from virtual to physical
page numbers, in a data structure known as a page table.
If a page is marked as unavailable (perhaps because it is not present in physical memory, but
instead is in the swap area), then when the CPU tries to reference a memory location in that
page, the MMU responds by raising an exception (commonly called a page fault) with the CPU,
which then jumps to a routine in the operating system. If the page is in the swap area, this
routine invokes an operation called a page swap, to bring in the required page.
The page swap operation involves a series of steps. First it selects a page in memory, for
example, a page that has not been recently accessed and (preferably) has not been modified
since it was last read from disk or the swap area. (See page replacement algorithms for details.) If
the page has been modified, the process writes the modified page to the swap area. The next step
in the process is to read in the information in the needed page (the page corresponding to the
virtual address the original program was trying to reference when the exception occurred) from
the swap file. When the page has been read in, the tables for translating virtual addresses to
physical addresses are updated to reflect the revised contents of the physical memory. Once the
page swap completes, the handling routine exits, and the program is restarted and continues on
as if nothing had happened, returning to the point in the program that caused the exception.
It is also possible that a virtual page was marked as unavailable because the page was never
previously allocated. In such cases, a page of physical memory is allocated and filled with zeros,
the page table is modified to describe it, and the program is restarted as above.
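The page-swap steps above can be sketched as follows, under heavily simplified assumptions: a dictionary stands in for the page table, a plain list serves as the replacement queue, and pages absent from swap are zero-filled. All names are invented for the sketch.

```python
# Simplified page-swap walkthrough: evict a victim, write it back if dirty,
# read the needed page in, and update the translation tables.
from dataclasses import dataclass

@dataclass
class Entry:
    frame: int          # physical frame holding this virtual page
    dirty: bool = False # True if modified since last read from swap

def handle_page_fault(page, page_table, memory, swap, lru):
    victim = lru.pop(0)                      # 1. pick a victim (crude FIFO here)
    entry = page_table.pop(victim)
    if entry.dirty:                          # 2. write back a modified victim
        swap[victim] = memory[entry.frame]
    memory[entry.frame] = swap.get(page, 0)  # 3. read the needed page (0 = zero-fill)
    page_table[page] = Entry(entry.frame)    # 4. update the translation tables
    lru.append(page)

# Tiny usage example: one frame; page 1 faults while dirty page 0 is resident.
memory = {0: "page0-data"}
page_table = {0: Entry(frame=0, dirty=True)}
swap = {1: "page1-data"}
handle_page_fault(1, page_table, memory, swap, lru=[0])
assert memory[0] == "page1-data" and swap[0] == "page0-data"
```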
Additional details
One additional advantage of virtual memory is that it allows a computer to multiplex its CPU and
main memory among multiple programs.
Note that virtual memory is not a requirement for precompilation of software, even if the software
is to be executed on a multiprogramming system. Precompiled software loaded by the operating
system has the opportunity to carry out address relocation at load time. This suffers by
comparison with virtual memory in that a copy of a program relocated at load time cannot run at a
distinct address once it has started execution.
It is possible to avoid the overhead of address relocation using a process called rebasing, which
uses metadata in the executable image header to guarantee to the run-time loader that the image
will only run within a certain virtual address space. This technique is used on the system libraries
on Win32 platforms, for example.
Systems with a large amount of RAM can create a virtual hard disk within the RAM itself. This
does block some of the RAM from being available for other system tasks but it does considerably
speed up access to the swap file itself.
Processor register
In computer architecture, a processor register is a small amount of very fast computer memory
used to speed the execution of computer programs by providing quick access to commonly used
values, typically values in the midst of a calculation at a given point in time.
These registers are the top of the memory hierarchy, and are the fastest way for the system to
manipulate data. Registers are normally measured by the number of bits they can hold, for
example, an "8-bit register" or a "32-bit register". Registers are now usually implemented as a
register file, but they have also been implemented using individual flip-flops, high speed core
memory, thin film memory, and other ways in various machines.
The term is often used to refer only to the group of registers that can be directly indexed for input
or output of an instruction, as defined by the instruction set. More properly, these are called the
"architected registers". For instance, the x86 instruction set defines a set of eight 32-bit registers,
but a CPU that implements the x86 instruction set will contain many more hardware registers than
just these eight.
There are several other classes of registers:
Data registers are used to store integer numbers (see also Floating Point Registers, below).
In some simple/older CPUs, a special data register is the accumulator, used for arithmetic
calculations.
Address registers hold memory addresses and are used to access memory. In some
simple/older CPUs, a special address register is the index register (one or more of these may be
present).
General Purpose registers (GPRs) can store both data and addresses, i.e., they are
combined Data/Address registers.
Floating Point registers (FPRs) are used to store floating point numbers.
Constant registers hold read-only values (e.g., zero, one, pi, ...).
Vector registers hold data for vector processing done by SIMD instructions (Single
Instruction, Multiple Data).
Special Purpose registers store internal CPU data, like the program counter (aka
instruction pointer), stack pointer, and status register (aka processor status word).
In some architectures, model-specific registers (also called machine-specific registers)
store data and settings related to the processor itself. Because their meanings are attached to the
design of a specific processor, they cannot be expected to remain standard between processor
generations.
Memory segment
On the Intel x86 architecture, a memory segment is the portion of memory which may be
addressed by a single index register without changing a 16-bit segment selector. In real mode or
protected mode on the 80286 processor (or V86 mode on the 80386 and later processors), a
segment is 64 kilobytes in size (using 16-bit index registers). In 32-bit protected mode, available in
80386 and subsequent processors, a segment is 4 gigabytes (due to 32-bit index registers).
In 16-bit mode, enabling applications to make use of multiple memory segments (in order to
access more memory than available in any one 64K-segment) was quite complex, but was viewed
as a necessary evil for all but the smallest tools (which could do with less memory). The root of
the problem was that no appropriate address-arithmetic instructions suitable for flat addressing
of the entire memory range were available. Flat addressing is possible by applying multiple
instructions, which however leads to slower programs.
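The address arithmetic behind 16-bit segmentation can be illustrated with the original 8086 real-mode scheme, where a 16-bit segment selector is shifted left four bits and added to a 16-bit offset to form a 20-bit physical address; the specific values below are invented for the example.

```python
# 8086 real-mode segment:offset arithmetic.  Many different segment:offset
# pairs alias the same physical byte, which is part of what made segmented
# address arithmetic awkward for flat addressing.
def real_mode_address(segment, offset):
    return ((segment << 4) + offset) & 0xFFFFF  # 20-bit address bus wraps

assert real_mode_address(0x1234, 0x0010) == 0x12350
# Aliasing: 0x1000:0x2345 and 0x1234:0x0005 name the same byte.
assert real_mode_address(0x1000, 0x2345) == real_mode_address(0x1234, 0x0005)
```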
The introduction of 32-bit operating systems and the more comfortable 32-bit flat memory model
led to the near elimination of segmented addressing by the end of the 1990s. However, with the
flat memory model the 4 gigabyte limit is not far from everyday use. Segmentation allows
operating systems to make the limit a per-process virtual address space issue, utilizing up to a
maximum of 64 gigabytes of system memory, but the reluctance to return to segmentation is
often cited as motivation for the move towards 64-bit processors.
Computer bus
In computer architecture, a bus is a subsystem that transfers data or power between computer
components inside a computer or between computers. Unlike a point-to-point connection, a bus
can logically connect several peripherals over the same set of wires.
Early computer buses were literally parallel electrical buses with multiple connections, but the
term is now used for any physical arrangement that provides the same logical functionality as a
parallel electrical bus. Modern computer buses can use both parallel and bit-serial connections,
and can be wired in either a multidrop (electrical parallel) or daisy chain topology, or connected
by switched hubs, as in the case of USB.
Intel 8086
www.bookspar.com | Website for students | VTU NOTES
The 8086 is a 16-bit microprocessor chip designed by Intel in 1978, which gave rise to the x86
architecture. Shortly afterwards the Intel 8088 was introduced, with an external 8-bit bus allowing
the use of cheap chipsets. It was based on the design of the 8080 and 8085 (it was assembly-language
source-compatible with the 8080) with a similar register set, but was expanded to 16 bits. The Bus
Interface Unit fed the instruction stream to the Execution Unit through a 6-byte prefetch queue, so
fetch and execution were concurrent, a primitive form of pipelining (8086 instructions varied from
1 to 4 bytes).
It featured four 16-bit general registers, which could also be accessed as eight 8-bit registers, and
four 16-bit index registers (including the stack pointer). The data registers were often used
implicitly by instructions, complicating register allocation for temporary values. It featured 64K
8-bit (or 32K 16-bit) I/O ports and fixed vectored interrupts. Most instructions could only access
one memory location, so one operand had to be a register. The result was stored in one of the
operands.
There were also four segment registers that could be set from index registers. The segment
registers allowed the CPU to access one megabyte of memory in an odd way. Rather than just
supplying missing bytes, as in most segmented processors, the 8086 shifted the segment register
left 4 bits and added it to the address. As a result segments overlapped, which most people
consider to have been poor design. Although this was largely acceptable (and even useful) for
assembly language, where control of the segments was complete, it caused confusion in
languages which make heavy use of pointers (such as C). It made efficient representation of
pointers difficult, and made it possible to have two pointers with different values pointing to the
same location. Worse, this scheme made expanding the address space to more than one
megabyte difficult. Effectively, it was expanded by changing the addressing scheme in the 80286.
The processor runs at clock speeds between 4.77 MHz (in the original IBM PC) and 10 MHz.
The 8086 did not contain any floating point instructions, but could be connected to a
mathematical coprocessor to add this capability. The Intel 8087 was the standard version, but
manufacturers like Weitek soon offered higher performance alternatives.
The IBM Displaywriter word processing machine also used the 8086. The most influential
microcomputer of all, the IBM PC, used the 8-bit variant, the Intel 8088.
History
The x86 architecture first appeared inside the Intel 8086 CPU in 1978; the 8086 was a development
of the 8008 processor (which itself followed the 4004). It was adopted (in the simpler 8088 version)
three years later as the standard CPU of the IBM PC. The ubiquity of the PC platform has resulted
in the x86 becoming one of the most successful CPU architectures ever.
Other companies also manufacture or have manufactured CPUs conforming to the x86
architecture: examples include Cyrix (now owned by VIA Technologies), NEC Corporation, IBM,
IDT, and Transmeta. The most successful of the clone manufacturers has been AMD, whose
Athlon series is a close second to the Pentium series for popularity.
The 8086 was a 16-bit processor; the architecture remained 16-bit until 1985, when the 32-bit
80386 was developed. Subsequent processors represented refinements of the 32-bit architecture,
introducing various extensions, until in 2003 AMD developed a 64-bit extension to the architecture
in the form of the AMD64 standard, introduced with the Opteron processor family, which was also
adopted a few years later (under a different name) in a new generation of Intel Pentiums.
Note that Intel also introduced a separate 64-bit architecture used in its Itanium processors, which
it calls IA-64 or more recently IPF (Itanium Processor Family). IA-64 is a completely new
architecture, unrelated to x86.
Design
The x86 architecture is essentially CISC with variable instruction length. Word sized memory
access is allowed to unaligned memory addresses. Words are stored in the little-endian order.
Backwards compatibility has always been a driving force behind the development of the x86
architecture (the design decisions this has required are often criticised, particularly by
proponents of competing processors, who are frustrated by the continued success of an
architecture widely perceived as quantifiably inferior). Modern x86 processors translate the x86
instruction set to more RISC-like micro-instructions upon which modern micro-architectural
techniques can be applied.
Note that the names for instructions and registers (mnemonics) that appear in this brief review
are the ones specified in Intel documentation and used by Intel (and compatible, e.g. Microsoft's
MASM, Borland's TASM, CAD-UL's as386, etc.) assemblers. An instruction that is specified in the
Intel syntax by mov al, 30h is equivalent to AT&T-syntax movb $0x30, %al, and both translate to
the two bytes of machine code B0 30 (hexadecimal). You can see that there is no trace left in this
code of either "mov" or "al", which are the original Intel mnemonics. If we wanted, we could write
an assembler that would produce the same machine code from the command "move immediate
byte hexadecimally encoded 30 into low half of the first register". However, the convention is to
stick to Intel's original mnemonics.
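The encoding described above can be made concrete with a tiny sketch; the helper name here is ours for illustration, while the opcode byte B0 for "mov al, imm8" comes from the x86 encoding itself:

```python
def mov_al_imm8(value):
    # 0xB0 is the x86 opcode "mov al, imm8": one opcode byte, one immediate byte.
    return bytes([0xB0, value & 0xFF])

# Intel "mov al, 30h" and AT&T "movb $0x30, %al" both assemble to B0 30.
print(mov_al_imm8(0x30).hex())  # b030
```

As the text notes, nothing of the mnemonic survives in these two bytes; only the numeric encoding does.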
The x86 assembly language is discussed in more detail in the x86 assembly language article.
Real mode
Intel 8086 and 8088 had 14 16-bit registers. Four of them (AX, BX, CX, DX) were general purpose
(although each also had an additional purpose; for example only CX can be used as a counter
with the loop instruction). Each could be accessed as two separate bytes (thus BX's high byte can
be accessed as BH and low byte as BL). In addition to them, there are 4 segment registers (CS,
DS, SS and ES). They are used to form a memory address. There are 2 pointer registers (SP which
points to the bottom of the stack, and BP which can be used to point at some other place in the
stack or the memory). There are two index registers (SI and DI) which can be used to point inside
a data segment.
In real mode, memory access is segmented. This is done by shifting the segment address left by 4
bits and adding an offset in order to obtain a final 20-bit address. Thus the total address space in
real mode is 2^20 bytes, or 1 MB, quite an impressive figure for 1978. There are two addressing
modes: near and far. In far mode, both the segment and the offset are specified. In near mode,
only the offset is specified, and the segment is taken from the appropriate register. For data the
register is DS, for code it is CS, and for stack it is SS. For example, if DS is A000h and SI is 5677h,
DS:SI will point at the absolute address DS × 16 + SI = A5677h.
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if
DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. In addition to this
duplication, the scheme also makes it impossible to have more than 4 segments at once. Moreover, CS, DS
and SS are vital for the correct functioning of the program, so that only ES can be used to point
somewhere else. This scheme, which was intended as a compatibility measure with the Intel 8085,
has caused no end of grief to programmers.
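The segment arithmetic above is easy to model; this Python sketch (the function name is ours) reproduces both worked examples, including the aliasing:

```python
def phys(segment, offset):
    # Real-mode 8086: shift the segment left 4 bits, add the offset, keep 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF

# Two different segment:offset pairs alias the same physical address.
print(hex(phys(0xA000, 0x5677)))  # 0xa5677
print(hex(phys(0xA111, 0x4567)))  # 0xa5677
```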
In addition to the above-said, the 8086 also had 64K of 8-bit (or alternatively 32K of 16-bit) I/O
space, and a 64K (one segment) stack in memory supported by hardware. Only words (2 bytes)
can be pushed to the stack. The stack grows downwards, its bottom being pointed to by SS:SP.
There are 256 interrupts, which can be created by both hardware and software. The interrupts can
cascade, using the stack to store the return address.
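The push mechanics described above can be sketched as a toy model (a dict stands in for memory; the function name is ours, not an 8086 instruction):

```python
def push16(memory, ss, sp, word):
    # SP moves down by 2, then the word is stored little-endian at SS*16 + SP.
    sp = (sp - 2) & 0xFFFF
    addr = ((ss << 4) + sp) & 0xFFFFF
    memory[addr] = word & 0xFF            # low byte
    memory[addr + 1] = (word >> 8) & 0xFF  # high byte
    return sp

mem = {}
sp = push16(mem, 0x0100, 0x0010, 0x1234)
print(hex(sp), hex(mem[0x100E]), hex(mem[0x100F]))  # 0xe 0x34 0x12
```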
In the meantime, operating systems like OS/2 tried to ping-pong the processor between protected
and real modes. This was both slow and unsafe, as in real mode a program could easily crash the
computer. OS/2 also defined restrictive programming rules which allowed a Family API or bound
program to run either in real mode or in protected mode. This was however about running
programs originally designed for protected mode, not vice-versa. By design, protected mode
programs did not suppose that there is a relation between selector values and physical
addresses. It is sometimes mistakenly believed that problems with running real mode code in
16-bit protected mode resulted from IBM having chosen to use Intel-reserved interrupts for BIOS
calls. The problems are actually related to such programs using arbitrary selector values and
performing the "segment arithmetic" described above on them.
This problem also appeared with Windows 3.0. Optimally, this release wanted to run programs in
16-bit protected mode, while previously they were running in real mode. Theoretically, if a
Windows 1.x or 2.x program was written "properly" and avoided segment arithmetic it would run
indifferently in both real and protected modes. Windows programs generally avoided segment
arithmetic because Windows implemented a software virtual memory scheme and moved
program code and data in memory when programs were not running, so manipulating absolute
addresses was dangerous; programs were supposed to only keep handles to memory blocks
when not running, and such handles were quite similar to protected-mode selectors already.
Starting an old program while Windows 3.0 was running in protected mode triggered a warning
dialog, suggesting to either run Windows in real mode (it could presumably still use expanded
memory, possibly emulated with EMM386 on 80386 machines, so it was not limited to 640KB) or to
obtain an updated version from the vendor. Well-behaved programs could be "blessed" using a
special tool to avoid this dialog. It was not possible to have some GUI programs running in 16-bit
protected mode and other GUI programs running in real mode, probably because this would
require having two separate environments and (on 80286) would be subject to the previously
mentioned ping-ponging of the processor between modes. In version 3.1 real mode disappeared.
32-bit protected mode
The Intel 80386 introduced, perhaps, the greatest leap so far in the x86 architecture. With the
notable exception of the Intel 80386SX, which was 32-bit yet only had 24-bit addressing (and a
16-bit data bus), it was all 32-bit: all the registers, instructions, I/O space and memory.
No new general-purpose registers were added. All 16-bit registers except the segment ones were
expanded to 32 bits. Intel represented this by adding "E" to the register mnemonics (thus the
expanded AX became EAX, SI became ESI and so on). Since there was a greater number of
registers, instructions and operands, the machine code format was expanded as well. In order to
provide backwards compatibility, the segments which contain executable code can be marked as
containing either 16- or 32-bit instructions. In addition, special prefixes can be used to include
32-bit instructions in a 16-bit segment and vice versa.
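One such prefix is the operand-size prefix byte 66h. As a minimal sketch (helper names ours), here is how the same opcode byte B8 yields a 32-bit or 16-bit move in a 32-bit code segment, depending only on the prefix:

```python
def mov_eax_imm32(imm):
    # In a 32-bit code segment, B8 means "mov eax, imm32".
    return bytes([0xB8]) + imm.to_bytes(4, "little")

def mov_ax_imm16(imm):
    # The 0x66 operand-size prefix switches the same opcode to "mov ax, imm16".
    return bytes([0x66, 0xB8]) + imm.to_bytes(2, "little")

print(mov_eax_imm32(0x12345678).hex())  # b878563412
print(mov_ax_imm16(0x1234).hex())       # 66b83412
```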
Paging and segmented memory access were both required in order to support a modern
multitasking operating system. Linux, 386BSD, Windows NT and Windows 95 were all initially
developed for the 386, because it was the first CPU that made it possible to reliably support the
separation of programs' memory space (each into its own address space) and the preemption of
them in the case of necessity (using rings). The basic architecture of the 386 became the basis of
all further development in the x86 series.
The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486.
The new FPU could be used to make floating point calculations, important for scientific
calculation and graphic design.
MMX and beyond
1996 saw the appearance of the MMX (Matrix Math Extensions, though sometimes incorrectly
referred to as Multi-Media Extensions) technology by Intel. While the new technology was
advertised widely and vaguely, its essence is very simple: MMX added 8 64-bit SIMD registers,
overlaid onto the FPU stack, to the Intel Pentium CPU design. Unfortunately, these instructions
were not easily mappable to the code generated by ordinary C compilers, and Microsoft, the
dominant compiler vendor, was slow to support them even as intrinsics. MMX also is limited to
integer operations. These technical shortcomings caused MMX to have little impact in its early
existence. Nowadays, MMX is typically used for some 2D video applications.
3DNow!
In 1998 AMD introduced 3DNow!, first appearing in the K6-2, which added single-precision
floating point SIMD instructions using the same registers as MMX.
SSE
In 1999 Intel introduced the SSE instruction set, which added 8 new 128-bit registers (not
overlayed with other registers). These instructions were analogous to AMD's 3DNow! in that they
primarily added floating point SIMD.
SSE2
In 2001 Intel introduced the SSE2 instruction set, which added 1) a complete complement of
integer instructions (analogous to MMX) to the original SSE registers and 2) 64-bit SIMD floating
point instructions to the original SSE registers. The first addition made MMX almost obsolete, and
the second allowed the instructions to be realistically targeted by conventional compilers.
SSE3
Introduced in 2004 along with the Prescott revision of the Pentium 4 processor, SSE3 added
specific memory and thread-handling instructions to boost the performance of Intel's
HyperThreading technology. AMD later licensed the SSE3 instruction set for its latest (E) revision
Athlon 64 processors. The SSE3 instruction set included on the new Athlons is only lacking a
couple of the instructions that Intel designed for HyperThreading, since the Athlon 64 doesn't
support HyperThreading; however SSE3 is still recognized in software as being supported on the
platform.
64-bit
As of 2002, the x86 architecture began to reach some design limits due to its 32-bit word
length. This makes it more difficult to handle massive information stores larger than 4 GB, such
as those found in databases or video editing.
Intel had originally decided to completely drop x86 compatibility with the 64-bit generation, but
AMD took the initiative of extending the 32-bit x86 (aka IA-32) to 64 bits. It came up with an
architecture, called AMD64 (it was called x86-64 until being rebranded), and the first products
based on this technology were the Opteron and Athlon 64 family of processors. Due to the
success of the AMD64 line of processors, Intel adopted the AMD64 instruction set and added
some new extensions of their own, rebranding it the EM64T architecture (apparently not wishing
to acknowledge that the instruction set came from its main rival).
This was the first time that a major upgrade of the x86 architecture was initiated and originated by
a manufacturer other than Intel. Perhaps more importantly, it was the first time that Intel actually
accepted technology of this nature from an outside source.
Virtualization
x86 virtualization is difficult because the architecture does not meet the Popek and Goldberg
virtualization requirements. Nevertheless, there are several commercial x86
virtualization products, such as VMware and Microsoft Virtual PC. Intel and AMD have both
announced that future x86 processors will have new enhancements to facilitate more efficient
virtualization. Intel's code names for their virtualization features are "Vanderpool" and
"Silvervale"; AMD uses the code name "Pacifica".
80C86/80C88: CMOS versions draw 10 mA, with a temperature spec of -40 to 225 °F.
They yield a 350 mV noise immunity for logic 0 (output max can be as high as 450 mV while input
max can be no higher than 800 mV).
This limits the loading on the outputs.
8086/88 Pinout
Pin functions:
AD15-AD0
Multiplexed address(ALE=1)/data bus(ALE=0).
A19/S6-A16/S3 (multiplexed)
High order 4 bits of the 20-bit address OR status bits S6-S3.
M/IO
Indicates if address is a Memory or IO address.
INTR
Maskable interrupt request input; sampled during the last clock of each instruction.
INTA
Interrupt acknowledge output, issued in response to INTR.
CLK
Clock input; must have a duty cycle of 33% (high for 1/3 and low for 2/3 of the period).
VCC/GND
Power supply (5V) and GND (0V).
MN/ MX
Select minimum (5V) or maximum mode (0V) of operation.
BHE
Bus High Enable. Enables the most significant data bus bits (D15-D8) during a read or write
operation.
READY
Used to insert wait states (controlled by memory and IO for reads/writes) into the microprocessor.
RESET
Microprocessor resets if this pin is held high for 4 clock periods.
Instruction execution begins at FFFF0H and IF flag is cleared.
TEST
Tested by the WAIT instruction; the processor idles until the TEST pin is driven low.
HOLD
DMA hold request input; the 8086 completes the current bus cycle, floats its buses, and
acknowledges with HLDA.
LOCK
Lock output is used to lock peripherals off the system. Activated by using the LOCK: prefix on
any instruction.
QS1 and QS0
The queue status bits show status of internal instruction queue. Provided for access by the
numeric coprocessor (8087).
Correct reset timing requires that the RESET input to the microprocessor becomes a logic 1 NO
LATER than 4 clocks after power up and stays high for at least 50 us.
A0-A15 + BHE and A16-A19 are buffered separately.
BUS Timing
(Timing diagrams: write cycle, read cycle.)
During T1:
The address is placed on the Address/Data bus.
Control signals M/IO, ALE and DT/R specify memory or I/O, latch the address onto the
address bus and set the direction of data transfer on the data bus.
During T2:
The 8086 issues the RD or WR signal, DEN, and, for a write, the data.
DEN enables the memory or I/O device to receive the data for writes and the 8086 to receive the
data for reads.
During T3:
This cycle is provided to allow memory time to access data.
READY is sampled at the end of T2; if low, a wait state (TW) is inserted.
During T4:
All bus signals are deactivated, in preparation for the next bus cycle.
Timing:
Each BUS CYCLE on the 8086 equals four system clocking periods (T states).
The clock rate is 5 MHz, therefore one bus cycle is 800 ns.
The transfer rate is thus 1.25 MHz.
Memory specs (memory access time) must match the constraints of system timing.
For example, bus timing for a read operation shows almost 600 ns are needed to read data.
However, memory must access faster due to setup times, e.g. address setup and data setup,
which subtract off about 150 ns.
Therefore, memory must access in at most 450 ns, minus another 30-40 ns guard band for buffers
and decoders.
Hence 420 ns DRAM is required for the 8086.
READY:
An input to the 8086 that causes wait states for slower memory and I/O components.
A wait state (TW) is an extra clock period inserted between T2 and T3 to lengthen the bus cycle.
For example, this extends a 460 ns bus cycle (at a 5 MHz clock) to 660 ns.
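The timing figures quoted above reduce to simple arithmetic; this sketch just replays them (variable names ours):

```python
CLOCK_HZ = 5_000_000                  # 5 MHz 8086
t_state_ns = 1e9 / CLOCK_HZ           # 200 ns per clocking period (T state)
bus_cycle_ns = 4 * t_state_ns         # four T states = 800 ns per bus cycle
transfer_rate_hz = CLOCK_HZ / 4       # 1.25 MHz

# Read budget: ~600 ns window, minus ~150 ns setup, minus ~30 ns guard band.
dram_access_ns = 600 - 150 - 30       # 420 ns DRAM needed

# One wait state adds a full T state to the cycle: 460 ns grows to 660 ns.
extended_ns = 460 + t_state_ns
print(bus_cycle_ns, transfer_rate_hz, dram_access_ns, extended_ns)
```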
Text discusses role of 8284A and timing requirements for the 8086.
Some of the control signals must be generated externally, due to redefinition of certain control
pins on the 8086.
The following pins are lost when the 8086 operates in Maximum mode .
ALE
WR
IO/ M
DT/ R
DEN
INTA
Separate signals are used for I/O ( IORC and IOWC ) and memory ( MRDC and
MWTC ).
Also provided are advanced memory ( AMWC ) and I/O ( AIOWC ) write strobes,
plus INTA .
Intel 8088
The Intel 8088 is an Intel microprocessor based on the 8086, with 16-bit registers and an 8-bit
external data bus. The processor was used in the original IBM PC.
The 8088 was targeted at economical systems by allowing the use of 8-bit designs. Large bus
width circuit boards were still fairly expensive when it was released. The prefetch queue of the
8088 is 4 bytes, as opposed to the 8086's 6 bytes. The descendants of the 8088 include the 80188,
80288 (obsolete), and 80388 microcontrollers which are still in use today.
The most influential microcomputer to use the 8088 was, by far, the IBM PC, whose processor ran
at 4.77 MHz.
Apparently IBM's own engineers wanted to use the Motorola 68000, and it was used later in the
forgotten IBM Instruments 9000 Laboratory Computer, but IBM already had rights to manufacture
the 8086 family, in exchange for giving Intel the rights to its bubble memory designs. A factor for
using the 8-bit Intel 8088 version was that it could use existing Intel 8085-type components, and
allowed the computer to be based on a modified 8085 design. 68000 components were not widely
available at the time, though it could use Motorola 6800 components to an extent. Intel bubble
memory was on the market for a while, but Intel left the market due to fierce competition from
Japanese corporations who could undercut by cost, and left the memory market to focus on
processors.
A compatible replacement chip, the V20, was produced by NEC for an approximate 20 percent
improvement in computing power.
Assembly language
Assembly language or simply assembly is a human-readable notation for the machine language
that a specific computer architecture uses. Machine language, a pattern of bits encoding machine
operations, is made readable by replacing the raw values with symbols called mnemonics.
For example, a computer with the appropriate processor will understand this x86/IA-32 machine
instruction:
10110000 01100001
For programmers, however, it is easier to remember the equivalent assembly language
representation:
mov al, 0x61
which means to move the hexadecimal value 61 (97 decimal) into the processor register with the
name "al". The mnemonic "mov" is short for "move", and a comma-separated list of arguments or
parameters follows it; this is a typical assembly language statement.
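Decoding those bits by hand confirms the correspondence between the machine code shown earlier and the mnemonic form (plain Python arithmetic):

```python
machine_code = "10110000 01100001"  # the two x86 bytes shown above
opcode, operand = (int(bits, 2) for bits in machine_code.split())
print(hex(opcode), hex(operand), operand)  # 0xb0 0x61 97
```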
Unlike in high-level languages, there is usually a 1-to-1 correspondence between simple assembly
statements and machine language instructions. Transforming assembly into machine language is
accomplished by an assembler, and the reverse by a disassembler.
Every computer architecture has its own machine language, and therefore its own assembly
language. Computers differ by the number and type of operations that they support. They may
also have different sizes and numbers of registers, and different representations of data types in
storage. While all general-purpose computers are able to carry out essentially the same
functionality, the way they do it differs, and the corresponding assembly language must reflect
these differences.
In addition, multiple sets of mnemonics or assembly-language syntax may exist for a single
instruction set. In these cases, the most popular one is usually that used by the manufacturer in
their documentation.
Machine instructions
Instructions in assembly language are generally very simple, unlike in a high-level language. Any
instruction that references memory (for data or as a jump target) will also have an addressing
mode to determine how to calculate the required memory address. More complex operations must
be built up out of these simple operations. Some operations available in most instruction sets
include:
moving
set a register (a temporary "scratchpad" location in the CPU itself) to a fixed constant
value
move data from a memory location to a register, or vice versa. This is done to obtain
the data to perform a computation on it later, or to store the result of a computation.
read and write data from hardware devices
computing
add, subtract, multiply, or divide the values of two registers, placing the result in a
register
perform bitwise operations, taking the conjunction/disjunction (and/or) of
corresponding bits in a pair of registers, or the negation (not) of each bit in a register
compare two values in registers (for example, to see if one is less, or if they are equal)
affecting program flow
jump to another location in the program and execute instructions there
jump to another location if a certain condition holds
Specific instruction sets will often have single, or a few instructions for common operations
which would otherwise take many instructions. Examples:
saving many registers on the stack at once
moving large blocks of memory
complex and/or floating-point arithmetic (sine, cosine, square root, etc.)
applying a simple operation (for example, addition) to a vector of values
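For instance, a "move large block of memory" instruction is just a loop of the simple moves listed earlier; a minimal sketch (the function name is ours):

```python
def block_move(mem, src, dst, count):
    # Decompose a block move into 'count' simple load/store steps.
    for i in range(count):
        mem[dst + i] = mem[src + i]

mem = list(b"hello") + [0] * 5
block_move(mem, 0, 5, 5)
print(bytes(mem[5:10]))  # b'hello'
```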
Addressing mode
In computer programming, addressing modes are primarily of interest to compiler writers and to
those (few nowadays) who use assembly language. Some computer science students may also
need to learn about addressing modes as part of their studies. Those involved with CPU design or
computer architecture should already know this and a lot more.
Addressing modes form part of the instruction set architecture for some particular type of CPU.
Some machine languages will need to refer to (addresses of) operands in memory. An addressing
mode specifies how to calculate the effective memory address of an operand by using
information held in registers and/or
constants contained within a machine instruction.
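A common shape for such a calculation, sketched in Python (parameter names ours, modeled on x86-style base + index × scale + displacement addressing):

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    # Effective address from register contents (base, index) and
    # constants encoded in the instruction (scale, disp).
    return base + index * scale + disp

print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))  # 0x1014
```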
Opcode
Machine instructions are the operations a processor can carry out, each of which is assigned a
numeric code called an opcode; when an opcode's value is active at the decoder's logic inputs,
the desired operation is performed. To assist in the use of these numeric codes, mnemonics are
used as textual abbreviations. It's much easier to remember ADD than 05, for example.
Opcodes operate on registers, values in memory, values stored on the stack, I/O ports, the bus,
etc. They are used to perform arithmetic operations and move and change values. Operands are
the things that opcodes operate on.
Mnemonic
A mnemonic is a memory aid. Mnemonics are
often verbal, are sometimes in verse form, and are often used to remember lists. Mnemonics rely
not only on repetition to remember facts, but also on associations between easy-to-remember
constructs and lists of data, based on the principle that the human mind much more easily
remembers data attached to spatial, personal or otherwise meaningful information than that
occurring in meaningless sequences. The word mnemonic shares etymology with Mnemosyne,
the name of the Titaness who personified Memory in Greek mythology.
Techniques
A mnemonic technique is one of many memory aids that is used to create associations among
items to be remembered.
Instruction set
An instruction set, or instruction set architecture (ISA), describes the aspects of a computer
architecture visible to a programmer, including the native datatypes, instructions, registers,
addressing modes, memory architecture, interrupt and exception handling, and external I/O (if
any).
An ISA is a specification of the set of all binary codes (opcodes) that are the native form of
commands implemented by a particular CPU design. The set of opcodes for a particular ISA is
also known as the machine language for the ISA.
"Instruction set architecture" is sometimes used to distinguish this set of characteristics from the
microarchitecture, which is the set of processor design techniques used to implement the
instruction set (including microcode, pipelining, cache systems, and so forth). Computers with
different microarchitectures can share a common instruction set. For example, the Intel Pentium
and the AMD Athlon implement nearly identical versions of the x86 instruction set, but have
radically different internal designs. This concept can be extended to unique ISAs like TIMI present
in the IBM System/38 and IBM AS/400. TIMI is an ISA that is implemented as low-level software
and functionally resembles what is now referred to as a virtual machine. It was designed to
increase the longevity of the platform and applications written for it, allowing the entire platform
to be moved to very different hardware without having to modify any software except that which
comprises TIMI itself. This allowed IBM to move the AS/400 platform from an older CISC
architecture to the newer POWER architecture without having to rewrite any parts of the OS or
software associated with it.
When designing microarchitectures, engineers use Register Transfer Language (RTL) to define
the operation of each instruction.
An ISA can also be emulated in software by an interpreter. Due to the additional translation needed
for the emulation, this is usually slower than directly running programs on the hardware
implementing that ISA. Today, it is common practice for vendors of new ISAs or
microarchitectures to make software emulators available to software developers before the
hardware implementation is ready.
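Such an interpreter can be tiny; here is a toy three-instruction ISA emulated in Python (the ISA itself is invented purely for illustration):

```python
def run(program):
    # Interpret a toy ISA: ("mov", reg, const), ("add", dst, src), ("jnz", reg, target).
    regs, pc = {}, 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        elif op == "jnz" and regs[args[0]] != 0:
            pc = args[1]     # taken branch: jump instead of falling through
            continue
        pc += 1
    return regs

# Sum 3 + 2 + 1 by looping: "a" accumulates "b" while "b" counts down.
prog = [("mov", "a", 0), ("mov", "b", 3), ("mov", "step", -1),
        ("add", "a", "b"), ("add", "b", "step"), ("jnz", "b", 3)]
print(run(prog)["a"])  # 6
```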
Assemblers usually have a simple symbolic capability for defining values as symbolic expressions
which are evaluated at assembly time, making it possible to write code that is easier to read and
understand.
As in most computer languages, comments can be added to the source code; they are ignored by
the assembler.
They also usually have an embedded macro language to make it easier to generate complex
pieces of code or data.
In practice, the absence of comments and the replacement of symbols with actual numbers
makes the human interpretation of disassembled code considerably more difficult than the
original source would be.
However, some discrete calculations can still be rendered into faster running code with assembly,
and some low-level programming is simply easier to do with assembly. Some system-dependent
tasks performed by the operating system simply cannot be expressed in high-level languages. In
particular, assembly is often used in writing the low level interaction between the operating
system and the hardware, for instance in device drivers. Many compilers also render high-level
languages into assembly first before fully compiling, allowing the assembly code to be viewed for
debugging and optimization purposes.
It's also common, especially in relatively low-level languages such as C, to be able to embed
assembly language into the source code with special syntax. Programs using such facilities, such
as the Linux kernel, often construct abstractions where different assembly is used on each
platform the program supports, but it is called by portable code through a uniform interface.
Many embedded systems are also programmed in assembly to obtain the absolute maximum
functionality out of what is often very limited computational resources, though this is gradually
changing in some areas as more powerful chips become available for the same minimal cost.
Another common area of assembly language use is in the system BIOS of a computer. This
low-level code is used to initialize and test the system hardware prior to booting the OS and is stored
in ROM. Once a certain level of hardware initialization has taken place, code written in higher level
languages can be used, but almost always the code running immediately after power is applied is
written in assembly language. This is usually due to the fact system RAM may not yet be
initialized at power-up and assembly language can execute without explicit use of memory,
especially in the form of a stack.
Assembly language is also valuable in reverse engineering, since many programs are distributed
only in machine code form, and machine code is usually easy to translate into assembly language
and carefully examine in this form, but very difficult to translate into a higher-level language.
Tools such as the Interactive Disassembler make extensive use of disassembly for such a
purpose.
Interrupt
www.bookspar.com | Website for students | VTU NOTES
Digital computers usually provide a way to start software routines in response to asynchronous
electronic events. These events are signaled to the processor via interrupt requests (IRQ). The
processor and interrupt code make a context switch into a specifically written piece of software to
handle the interrupt. This software is called the interrupt service routine, or interrupt handler. The
addresses of these handlers are termed interrupt vectors and are generally stored in a table in
RAM, allowing them to be modified if required.
Interrupts originated as a way to avoid wasting the computer's valuable time in software loops (called
polling loops) waiting for electronic events. Instead, the computer was able to do other useful
work while the event was pending. The interrupt would signal the computer when the event
occurred, allowing efficient accommodation of slow mechanical devices.
Interrupts allow modern computers to respond promptly to electronic events, while other work is
being performed. Computer architectures also provide instructions to permit processes to initiate
software interrupts or traps. These can be used, for instance, to implement co-operative
multitasking.
A well-designed interrupt mechanism coordinates the computer bus, the software, and the
interrupting device so that if any single part of the interrupt sequence fails, the interrupt restarts
and runs to completion. Usually there is an electronic request, an electronic response, and a
software operation to turn off the device's interrupt, to prevent another request.
Interrupt Types
Typical interrupt types include:
timer interrupts
disk interrupts
power-off interrupts
traps
Other interrupts exist to transfer data bytes using UARTs, or Ethernet, sense key-presses, control
motors, or anything else the equipment must do.
A classic timer interrupt simply interrupts periodically from a counter or the power line. The software
(usually part of an operating system) counts the interrupts to keep time. The timer interrupt may
also be used to reschedule the priorities of running processes. Counters are popular, but some
older computers used the power line because power companies control the power-line frequency
with an atomic clock.
A disk interrupt signals the completion of a data transfer from or to the disk peripheral. A process
waiting to read or write a file starts up again.
A power-off interrupt predicts or requests a loss of power. It allows the computer equipment to
perform an orderly shutdown.
Interrupts are also used in typeahead features for buffering events like keystrokes.
Interrupt routines generally have a short execution time. Most interrupt routines do not allow
themselves to be interrupted, because they store saved context on a stack, and if interrupted
many times, the stack could overflow. An interrupt routine frequently needs to be able to respond
to a further interrupt from the same source. If the interrupt routine has significant work to do in
response to an interrupt, and it is not critical that the work be performed immediately, then often
the routine will do nothing but schedule the work for some later time and return as soon as
possible. Some processors support a hierarchy of interrupt priorities, allowing higher-priority
interrupts to occur while a lower-priority interrupt is being processed.
Processors also often have a mechanism referred to as interrupt disable which allows software to
prevent interrupts from interfering with communication between interrupt-code and non-interrupt
code. See mutual exclusion.
Typically, the user can configure the machine using hardware registers so that different types of
interrupts are enabled or disabled, depending on what the user wants. The interrupt signals are
ANDed with a mask, thus allowing only desired interrupts to occur. Some interrupts cannot be
disabled; these are referred to as non-maskable interrupts.
Interrupt vector
The destination to which the CPU jumps for a given interrupt is termed the interrupt vector.
Generally, most computer system designs will incorporate a list of such vectors; this is termed
the interrupt vector table or dispatch table.
Interrupt handler
An Interrupt Handler is the modern progression of an interrupt service routine, a routine whose
execution is triggered by an interrupt.
In modern systems interrupt handlers are split into two parts: the First-Level Interrupt Handler
(FLIH) and the Second-Level Interrupt Handler (SLIH).
The FLIH operates in the same way as the old interrupt routines did. In response to an interrupt
there is a context switch and the code for the interrupt is loaded and executed. The job of the
FLIH, however, is not to process the interrupt, but to schedule the execution of the SLIH, while
recording any critical information which is only available at the time of the interrupt.
The SLIH sits on the run queue of the operating system until it can be executed to perform the
processing for the interrupt when processor time is available.
It is worth noting that in many systems the FLIH and SLIH are referred to as upper halves and
lower halves, or a derivation of those names.
Non-Maskable interrupt
A non-maskable interrupt (or NMI) is a special type of interrupt used in most types of
microcomputer, for example the IBM PC and Apple II.
An NMI causes the CPU to stop what it was doing, change the program counter to point to a
particular address and continue executing code from that location. Programmers are unable to
mask (disable) it, hence the name. NMIs have two main uses.
One is for debugging faulty code, where it can be instantly suspended at any point and control
transferred to a special monitor program, from which the developer can inspect the machine's
memory and examine the internal state of the program as it rests in "suspended animation". The
Apple Macintosh's "programmers' button" worked in this way, as do certain key combinations on
SUN workstations.
A second is for leisure users and gamers. Devices which added a button to generate an NMI, such
as Romantic Robot's Multiface, were a popular accessory for 1980s 8-bit and 16-bit home
computers. These peripherals had a small amount of ROM and an NMI button. Pressing the button
transferred control to the software in the peripheral's ROM, allowing the suspended program to be
saved to disk (very useful for tape-based games with no disk support, but also for saving games
in progress), screenshots to be saved or printed, or values in memory to be manipulated -- a
cheating technique to acquire extra lives, for example.
Some floppy disk interfaces, such as Miles Gordon Technology's DISCiPLE and PlusD for the
ZX Spectrum, also included an NMI button.
Intel 8087
The 8087 was the first math coprocessor designed by Intel and it was built to be paired with the
Intel 8088 and 8086 microprocessors. The purpose of the 8087, the first of the x87 family, was to
speed up computations in demanding applications involving floating point mathematics. The
performance improvement ranged from 20% to 500%, depending on the specific application.
This coprocessor introduced about 60 new instructions available to the programmer, all
beginning with "F" to differentiate them from the standard 8086/88 integer math instructions. For
example, in contrast to ADD/MUL, the 8087 provided FADD/FMUL.
The 8087 (and, in fact, the entire x87 family) does not provide a freely addressable, linear register set
such as the AX/BX/CX/DX registers of the 8086/88 and 80286 processors; instead, the x87 registers
are structured as an eight-level-deep stack.
When Intel designed the 8087 it aimed to make a standard floating point format for future designs.
Indeed, one of the most historically significant contributions of this coprocessor was the
introduction of the floating point formats that became the basis of the IEEE 754 standard on x86
PCs. The 8087 provided two basic 32/64-bit floating point data types and an additional extended
80-bit internal format to improve accuracy over large and complex calculations. Apart from this,
the 8087 offered an 80-bit/17-digit packed BCD (binary coded decimal) format and 16-, 32- and
64-bit integer data types.
The 8087, announced in 1980, was superseded by the 80287, 80387DX/SX and the 487SX. The Intel
80486DX, Pentium and later processors include the floating-point unit directly on the CPU core.
Peripheral
A peripheral is a type of computer hardware that is added to a host computer in order to expand
its abilities. More specifically, the term is used to describe those devices that are optional in
nature, as opposed to hardware that is essential or always required in principle.
The term also tends to be applied to devices that are hooked up externally, typically through some
form of computer bus like USB. Typical examples include joysticks, printers and scanners.
8255 PPI
The 8255 Programmable Peripheral Interface (PPI) is used to interface to the keyboard and a
parallel printer port in PCs (usually as part of an integrated chipset). It requires the insertion of
wait states if used with a microprocessor using higher than an 8 MHz clock. The PPI has 24 pins
for I/O that are programmable in groups of 12 pins, and has three distinct modes of operation.
In the seven-segment display example, both ports A and B are programmed as (mode 0) simple
latched output ports. Port A provides the segment data inputs to the display and port B provides
a means of selecting one display position at a time. Different values are displayed in each digit
via fast time multiplexing. The values for the resistors and the type of transistors used are
determined by the current requirements (see text for details).
Port C is used for control or handshaking signals (it cannot be used for data). In the keyboard
example, a keyboard encoder debounces the key-switches and provides a strobe whenever a key
is depressed. DAV is activated on a key press, strobing the ASCII-coded key code into Port A.
Mode 2 (bi-directional bused data) is only allowed with port A. It is used for interfacing two
computers, a GPIB interface, etc. Its timing diagram is a combination of the Mode 1 Strobed Input
and Mode 1 Strobed Output timing diagrams.
PIC 8259
Vector    Type                  Common Uses
00 - 01   Exception Handlers
02        Non-Maskable IRQ
03 - 07   Exception Handlers
08        Hardware IRQ0         System Timer
09        Hardware IRQ1         Keyboard
0A        Hardware IRQ2         Redirected
0B        Hardware IRQ3         Serial Port
0C        Hardware IRQ4         Serial Port
0D        Hardware IRQ5         Reserved/Sound Card
0E        Hardware IRQ6         Floppy Disk
0F        Hardware IRQ7         Parallel Comms.
10 - 6F   Software Interrupts
70        Hardware IRQ8         Real Time Clock
71        Hardware IRQ9         Redirected IRQ2
72        Hardware IRQ10        Reserved
73        Hardware IRQ11        Reserved
74        Hardware IRQ12        PS/2 Mouse
75        Hardware IRQ13        Maths Co-Processor
76        Hardware IRQ14        Hard Disk
77        Hardware IRQ15        Reserved
78 - FF   Software Interrupts
The average PC only has 15 hardware IRQs plus one non-maskable IRQ. The rest of the interrupt
vectors are used for software interrupts and exception handlers. Exception handlers are routines,
like ISRs, which get called when an error occurs. One example is the first interrupt vector, which
holds the address of the divide-by-zero exception handler. When a divide by zero occurs, the
microprocessor fetches the address at 0000:0000 and starts executing the code at this address.
Hardware Interrupts
The Programmable Interrupt Controller (PIC) handles hardware interrupts. Most PCs have two
of them located at different addresses. One handles IRQs 0 to 7 and the other IRQs 8 to 15,
giving a total of 15 individual IRQ lines, as the second PIC is cascaded into the first using IRQ2.
Most of the PIC's initialization is done by the BIOS, thus we only have to worry about two
instructions. The PIC has a facility whereby we can mask individual IRQs so that these
requests will not reach the processor. Thus the first instruction is Operation Control Word 1
(OCW1), which sets which IRQs to mask and which not to.
As there are two PICs located at different addresses, we must first determine which PIC we need
to use. The first PIC, located at base address 0x20, controls IRQ 0 to IRQ 7. The bit format of
PIC1's Operation Control Word 1 is shown below in table 2.
Bit   Disable IRQ   Function
7     IRQ7          Parallel Port
6     IRQ6          Floppy Disk
5     IRQ5          Reserved/Sound Card
4     IRQ4          Serial Port
3     IRQ3          Serial Port
2     IRQ2          PIC2
1     IRQ1          Keyboard
0     IRQ0          System Timer
Table 2 : PIC1 Operation Control Word 1 (0x21)
Note that IRQ 2 is connected to PIC2; thus if you mask this IRQ, you will be disabling IRQs 8
to 15.
The second PIC, located at base address 0xA0, controls IRQs 8 to 15. Below are the individual
bits required to make up its Operation Control Word.
Bit   Disable IRQ   Function
7     IRQ15         Reserved
6     IRQ14         Hard Disk
5     IRQ13         Maths Co-Processor
4     IRQ12         PS/2 Mouse
3     IRQ11         Reserved
2     IRQ10         Reserved
1     IRQ9          Redirected IRQ2
0     IRQ8          Real Time Clock
Table 3 : PIC2 Operation Control Word 1 (0xA1)
As the above table shows the bits required to disable an IRQ, we must invert them should we
want to enable an IRQ. For example, if we want to enable IRQ 3 then we would send the byte 0xF7
as OCW1 to PIC1. But what happens if one of these IRQs is already enabled and we then come
along and disable it?
Therefore we must first read the existing mask, change only our bit, and write the byte back to
the register, so as to cause the least upset to the other IRQs. Going back to our IRQ3 example,
we could use outportb(0x21, (inportb(0x21) & 0xF7)); to enable IRQ3. Take note that OCW1 goes
to the register at Base + 1.
The same procedure must be used to mask (disable) an IRQ once we are finished with it. However,
this time we must OR the byte 0x08 into the contents of OCW1. An example of such code is
outportb(0x21, (inportb(0x21) | 0x08));
outportb(0x20,0x20);  /* send a non-specific EOI to PIC1 */
enable();             /* set the interrupt flag again */
}
void interrupt yourisr() defines this function as an Interrupt Service Routine. disable(); clears the
interrupt flag, so that no other hardware interrupt, except an NMI (Non-Maskable Interrupt), can
occur. Otherwise, an interrupt with a higher priority than this one could interrupt the execution of
this ISR. However, this is not really a problem in many cases, thus it is optional.
The body of your ISR will include code which you want to execute upon this interrupt request
being activated. Most ports/UARTs may interrupt the processor for a range of reasons, e.g. byte
received, time-outs, FIFO buffer empty, overruns, etc., thus the nature of the interrupt has to be
determined. This is normally achieved by reading the status registers of the port you are using.
Once it has been established, you can service its requests.
If you read any data from a port, it is common practice to place it in a buffer rather than
immediately writing it to the screen, which would delay further interrupts from being processed.
Most ports these days have FIFO buffers which can contain more than one byte, thus you should
repeat your read routine until the FIFO is empty, then exit your ISR.
void main(void)
{
oldhandler = getvect(INTNO);   /* save the old interrupt vector */
setvect(INTNO, yourisr);       /* install our ISR in its place */
/* ... unmask the IRQ and run the main program ... */
setvect(INTNO, oldhandler);    /* restore the old vector before exiting */
}
The basic block diagram of the PIC is shown above. The 8 individual interrupt request lines are
first passed through the Interrupt Mask Register (IMR) to see if they have been masked or not. If
they are masked, then the request isn't processed any further. However if they are not masked,
they will register their request with the Interrupt Request Register (IRR).
The Interrupt Request Register holds all the requested IRQs until they have been dealt with
appropriately. If required, this register can be read by setting certain bits of Operation Control
Word 3. The Priority Resolver simply selects the IRQ of highest priority; the higher-priority
interrupts are the lower-numbered ones. For example, IRQ 0 has the highest priority, followed by
IRQ 1, etc.
Now that the PIC has determined which IRQ to process, it is time to tell the processor, so that
it can call your ISR for you. This is done by sending an INT to the processor, i.e. the INT
line on the processor is asserted. The processor will then finish the current instruction it is
processing and acknowledge your INT request with an INTA (Interrupt Acknowledge) pulse.
Upon receiving the processor's INTA, the IRQ which the PIC is processing at the time is stored in
the In Service Register (ISR), which, as the name suggests, shows which IRQ is currently in
service. The IRQ's bit is also reset in the Interrupt Request Register, as it is no longer requesting
service but actually getting service.
Another INTA pulse will be sent by the processor, to tell the PIC to place an 8-bit pointer on the
data bus, corresponding to the IRQ number. If an IRQ serviced by PIC2 is requesting the service,
then PIC2 will send the pointer to the processor. The Master (PIC1), at this stage, will select PIC2
via the cascade lines to release the pointer.
IRQ2/IRQ9 Redirection
The redirection of IRQ2 causes quite some confusion, and thus is discussed here. In the original
XTs there was only one PIC, and thus only eight IRQs. However, users soon outgrew these
resources, so an additional seven usable IRQs were added to the PC. This involved attaching
another PIC to the one already in the XT. Compatibility always causes problems, as the new
configuration still had to be compatible with old hardware and software. The "new" configuration
is shown below.
The CPU only has one interrupt line, thus the second controller had to be connected to the first
controller in a master/slave configuration. IRQ2 was selected for this. By using IRQ2 for the
second controller, no other devices could use IRQ2, so what happened to all the devices using
IRQ2? Nothing: the IRQ2 interrupt request line found on the bus was simply diverted into the IRQ9
input. As no devices yet used the second PIC or IRQ9, this could be done.
The next problem was that a hardware device using IRQ2 would install its ISR at INT 0x0A.
Therefore an ISR routine was installed at INT 71h which sends an EOI to PIC2 and then calls the
ISR at INT 0x0A. If you disassemble the ISR for IRQ9, it will look a little like:
MOV AL,20   ; load the EOI command
OUT A0,AL   ; send the EOI to PIC2
INT 0A      ; call the original IRQ2 handler
IRET        ; return from the interrupt
The routine only has to send an EOI to PIC2, as it is expected that an ISR routine written for IRQ2
will send an EOI to PIC1. This example destroys the contents of register AL, thus AL must be
pushed onto the stack first (not shown in the example). As PIC1 is initialized with a slave on IRQ2,
any request using PIC2 will not call the ISR routine for IRQ2; the 8-bit pointer will come from PIC2.
Address   Read/Write   Function
20h       Write        Initialization Command Word 1 (ICW1)
20h       Write        Operation Control Word 2 (OCW2)
20h       Write        Operation Control Word 3 (OCW3)
20h       Read         Interrupt Request Register (IRR)
20h       Read         In-Service Register (ISR)
21h       Write        Initialization Command Word 2 (ICW2)
21h       Write        Initialization Command Word 3 (ICW3)
21h       Write        Initialization Command Word 4 (ICW4)
21h       Read/Write   Interrupt Mask Register (OCW1)
Table 4 : PIC1 Addresses
PIC2 Addresses . . .
Address   Read/Write   Function
A0h       Write        Initialization Command Word 1 (ICW1)
A0h       Write        Operation Control Word 2 (OCW2)
A0h       Write        Operation Control Word 3 (OCW3)
A0h       Read         Interrupt Request Register (IRR)
A0h       Read         In-Service Register (ISR)
A1h       Write        Initialization Command Word 2 (ICW2)
A1h       Write        Initialization Command Word 3 (ICW3)
A1h       Write        Initialization Command Word 4 (ICW4)
A1h       Read/Write   Interrupt Mask Register (OCW1)
Table 5 : PIC2 Addresses
Bit   Function
7:5   Interrupt Vector Addresses for MCS-80/85 Mode
4     Must be set to 1 for ICW1
3     1 = Level Triggered, 0 = Edge Triggered
2     Call Address Interval (ignored in 8086/8088 mode)
1     1 = Single PIC, 0 = Cascaded PICs
0     1 = ICW4 Needed, 0 = No ICW4 Needed
Table 6 : Initialization Command Word 1 (ICW1)
The 8259 Programmable Interrupt Controller offers many other features which are not used in the
PC. It also offers support for MCS-80/85 microprocessors. All we have to be aware of, being PC
users, is whether the system is running in single mode (one PIC) or in cascaded mode (more than
one PIC), and whether Initialization Command Word 4 is needed. If no ICW4 is used, then all of its
bits will be set to 0. As we are using the PIC in 8086 mode, we must send an ICW4.
Bit   8086/8088 Mode   MCS-80/85 Mode
7     I7               A15
6     I6               A14
5     I5               A13
4     I4               A12
3     I3               A11
2     -                A10
1     -                A9
0     -                A8
Table 7 : Initialization Command Word 2 (ICW2)
Bit   Function
7     Reserved. Set to 0
6     Reserved. Set to 0
5     Reserved. Set to 0
4     Reserved. Set to 0
3     Reserved. Set to 0
2:0   Slave ID: 000 = Slave 0, 001 = Slave 1, 010 = Slave 2, 011 = Slave 3,
      100 = Slave 4, 101 = Slave 5, 110 = Slave 6, 111 = Slave 7
Table 9 : Initialization Command Word 3 for Slaves (ICW3)
Bit   Function
7     Reserved. Set to 0
6     Reserved. Set to 0
5     Reserved. Set to 0
4     1 = Special Fully Nested Mode
3:2   0x = Non-Buffered Mode, 10 = Buffered Mode (Slave), 11 = Buffered Mode (Master)
1     1 = Auto EOI, 0 = Normal EOI
0     1 = 8086/8088 Mode, 0 = MCS-80/85 Mode
Table 10 : Initialization Command Word 4 (ICW4)
Once again, many of these are special functions not used with the 8259 PIC in a PC. We don't use
Special Fully Nested Mode, thus this bit is set to 0. Likewise, we use non-buffered mode and
normal EOIs, thus all these corresponding bits are set to 0. The only thing we must set is
8086/8088 mode, which is done using bit 0.
Bit   PIC2          PIC1
7     Mask IRQ15    Mask IRQ7
6     Mask IRQ14    Mask IRQ6
5     Mask IRQ13    Mask IRQ5
4     Mask IRQ12    Mask IRQ4
3     Mask IRQ11    Mask IRQ3
2     Mask IRQ10    Mask IRQ2
1     Mask IRQ9     Mask IRQ1
0     Mask IRQ8     Mask IRQ0
Table 11 : Operation Control Word 1 (OCW1)
Operation Control Word 1, shown above, is used to mask the inputs of the PIC. This has already
been discussed earlier in this article.
Bit   Function
7:5   000 = Rotate in Auto EOI Mode (Clear)
      001 = Non-Specific EOI
      010 = Reserved
      011 = Specific EOI
      100 = Rotate in Auto EOI Mode (Set)
      101 = Rotate on Non-Specific EOI
      110 = Set Priority Command
      111 = Rotate on Specific EOI
4     Must be set to 0
3     Must be set to 0
2:0   Interrupt level to act upon:
      000 = Act on IRQ 0 or 8      001 = Act on IRQ 1 or 9
      010 = Act on IRQ 2 or 10     011 = Act on IRQ 3 or 11
      100 = Act on IRQ 4 or 12     101 = Act on IRQ 5 or 13
      110 = Act on IRQ 6 or 14     111 = Act on IRQ 7 or 15
Table 12 : Operation Control Word 2 (OCW2)
Bit   Function
7     Must be set to 0
6:5   00 = Reserved
      01 = Reserved
      10 = Reset Special Mask
      11 = Set Special Mask
4     Must be set to 0
3     Must be set to 1
2     1 = Poll Command, 0 = No Poll Command
1:0   00 = Reserved
      01 = Reserved
      10 = Read Interrupt Request Register (IRR)
      11 = Read In-Service Register (ISR)
Table 13 : Operation Control Word 3 (OCW3)
Bits 0 and 1 of Operation Control Word 3 are the most significant to us. These two bits enable
us to read the status of the Interrupt Request Register (IRR) and the In-Service Register (ISR).
This is done by setting the appropriate bits as above and reading the register at the base
address.
For example, if we wanted to read the In-Service Register (ISR), then we would set both bits 1 and
0 to 1. The next read of the base register (0x20 for PIC1 or 0xA0 for PIC2) will return the status of
the In-Service Register.
http://www.absoluteastronomy.com/reference/list_of_intel_microprocessors
This list of Intel microprocessors attempts to present all of Intel's processors (µPs) from the
pioneering 4-bit 4004 (1971) to the present high-end offerings, the 64-bit Itanium 2 (2002) and
Pentium 4F with EM64T (2004). Concise technical data are given for each product.
The 4-bit and 8-bit processors
8086
Introduced June 8, 1978
Clock speeds:
5 MHz with 0.33 MIPS
8 MHz with 0.66 MIPS
10 MHz with 0.75 MIPS
iAPX 432
Introduced January 1, 1981 as Intel's first 32-bit microprocessor
Object/capability architecture
Microcoded operating system primitives
One terabyte virtual address space
Hardware support for fault tolerance
Two-chip General Data Processor (GDP), consists of 43201 and 43202
43203 Interface Processor (IP) interfaces to I/O subsystem
43204 Bus Interface Unit (BIU) simplifies building multiprocessor systems
43205 Memory Control Unit (MCU)
Architecture and execution unit internal data paths 32 bit
Clock speeds:
5 MHz
7 MHz
8 MHz
80386DX
Introduced October 17, 1985
Clock speeds:
16MHz with 5 to 6 MIPS
2/16/1987 20MHz with 6 to 7 MIPS
4/4/1988 25MHz with 8.5 MIPS
80486DX
Introduced April 10, 1989
Clock speeds:
25MHz with 20 MIPS (16.8 SPECint92, 7.40 SPECfp92)
5/7/1990 33MHz with 27 MIPS (22.4 SPECint92 on Micronics M4P 128k L2)
6/24/1991 50MHz with 41 MIPS (33.4 SPECint92, 14.5 SPECfp92 on Compaq/50L 256K
L2)
Bus Width 32 bits
Number of Transistors 1.2 million at 1 µm; the 50 MHz version was at 0.8 µm
Addressable memory 4 gigabytes
Virtual memory 64 terabytes
Level 1 cache on chip
50X performance of the 8088
Used in Desktop computing and servers
Pentium ("Classic")
Introduced March 22, 1993
P5 0.8 µm process technology
Bus width 64 bits
System bus speed 50 or 60 or 66 MHz
Address bus 32 bits
Number of transistors 3.1 million
Addressable Memory 4 gigabytes
Virtual Memory 64 terabytes
Socket 4 273 pin PGA processor package
Package dimensions 2.16" x 2.16"
Superscalar architecture brought 5X the performance of the 33MHz 486DX processor
Runs on 5 volts
Used in desktops
16KB of L1 cache
Variants
60 MHz with 100 MIPS (70.4 SPECint92, 55.1 SPECfp92 on Xpress 256K L2)
66 MHz with 112 MIPS (77.9 SPECint92, 63.6 SPECfp92 on Xpress 256K L2)
P54C 0.6 µm process technology
Socket 7 296/321 pin PGA package
Number of transistors 3.2 million
Pentium Pro
Introduced November 1, 1995
0.6 µm process technology
Precursor to Pentium II and III
Socket 8 processor package (387 pins) (Dual SPGA)
Number of transistors 22 million
16KB L1 cache
256KB integrated L2 cache
60 MHz system bus speed
Variants
150 MHz Introduced November 1, 1995
0.35 µm process technology, or 0.35 µm CPU with 0.6 µm L2 cache
Introduced November 1, 1995
Number of transistors 36.5 million or 22 million
512KB or 256KB integrated L2 cache
60 or 66 MHz system bus speed
Variants
166 MHz (66 MHz bus speed, 512KB 0.35 µm cache) Introduced November 1, 1995
180 MHz (60 MHz bus speed, 256KB 0.6 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 256KB 0.6 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 512KB 0.35 µm cache) Introduced November 1, 1995
200 MHz (66 MHz bus speed, 1MB 0.35 µm cache) Introduced August 18, 1997
Pentium II
Introduced May 7, 1997
Klamath 0.35 µm process technology (233, 266, 300 MHz)
Pentium 4 (not 4EE, 4E, 4F), Itanium, P4-based Xeon, Itanium 2 (chronological entries)
Introduced April 2000 July 2002
See main entries
Celeron (Pentium III Tualatin-based)
Tualatin Celeron - 0.13 µm process technology
32KB L1 cache
256KB Advanced Transfer L2 cache
100 MHz system bus speed
Pentium 4
0.18 µm process technology (1.40 and 1.50 GHz)
Introduced November 20, 2000
L2 cache was 256KB Advanced Transfer Cache (integrated)
Processor Package Style was PGA423, PGA478
System Bus Speed 400 MHz
SSE2 SIMD Extensions
Number of Transistors 42 million
Used in desktops and entry-level workstations
0.18 µm process technology (1.7 GHz)
Introduced April 23, 2001
Itanium
Released May 29, 2001
733 MHz and 800 MHz
Itanium 2
Released July 2002
900 MHz and 1 GHz