Você está na página 1de 71

ARM

Advanced RISC Machines

INTRODUCTION AND
ARCHITECTURE

ARM HISTORY

Key component of many 32 bit embedded systems


Portable Consumer devices
ARM1 first prototype in 1985
One of the ARMs most successful cores is the
ARM7TDMI, provides high code density and low
power consumption

ARM HISTORY
ARM

Acorn RISC Machine(1983)

Acorn Computers Limited, Cambridge, England


Known before becoming ARM as computer
manufacturer
Acorn which developed a 32-bit RISC processor for
its own use (used in Acorn Archimedes)

ARM

Advanced RISC Machine 1990

ARM Limited, Founded 1990, owned by Acorn, Apple


and VLSI
Licenses ARM core designs to semiconductor
partners who fabricate and sell to their customers.
ARM does not fabricate silicon itself

COMPANIES LICENSING WITH ARM


o
o
o
o
o
o
o
o
o
o
o

3com
Agilent Technologies
Altera
Epson
Freescale
Fijitsu
NEC
Nokia
Intel
IBM
Microsoft

o
o
o
o
o
o
o
o
o
o
o

Motorola
Panasonic
Qualcomm
Sharp
Sanyo
Sun Microsystems
Sony
Symbian
Texas Instruments
Toshiba
Wipro

.And many more

ARM PARTNERSHIP MODEL

ARM POWERED PRODUCTS

ADVANCED RISC MACHINES

ARM Core uses a RISC architecture


ARM is Physical hardware design company.
ARM licenses its cores out and other companies
make processors based on its cores

THE RISC DESIGN PHILOSOPHY

RISC is characterized by limited number of


instructions
A complex instruction is obtained as a sequence
of simple instructions. In RISC processor
software is complex but the processor
architecture is simple.
Large number of registers are required.
Pipelined instruction execution

Ex : ARM, ATMEL AVR, MIPS(Microprocessor without


Interlocked Pipeline Stages), Power PC etc

THE CISC DESIGN PHILOSOPHY

CISC is characterized by large instruction set.


The aim of designing CISC processors is to
reduce software complexity by increasing the
complexity of processor architecture.
Very small number of registers are available.
Ex : Intel X86 family, Motorola 68000 series.

CISC VS. RISC


RISC

CISC

Compiler

Compiler
Code Generation

Processor

Greater
Complexity

Greater
Complexity

Code Generation

Processor

CPU performance is given by the fundamental


Law:

CPU TIME = seconds / program


= instruction / program
x
cycles / instruction
x
seconds / cycle

RISC DESIGN RULES

Instructions
Pipelines
Registers
Load Store Architecture

RISC 4 MAJOR DESIGN RULES

Instructions

Reduced Number of Instructions


Execute in a single cycle
The compiler synthesizes complicated operations
Each instruction is a fixed length

RISC 4 MAJOR DESIGN RULES

Pipelines

The processing of instructions is broken down into smaller


units that can be executed in parallel by pipelines
Pipeline advances by one step on each cycle for maximum
throughput

RISC 4 MAJOR DESIGN RULES

Registers

Have a large general purpose register set


Any register can contain either data or address

LOAD STORE ARCHITECTURE

Memory can be accessed only through two dedicated


instructions

LDR
STR

; move word from memory to register


; move word from register to memory

All other instructions have to work on registers only.

Ex: For Arithmetic operation


ADD A, #32H Possible in CISC/8051
ADD R0, #32H Not possible in RISC/ARM
ADD Rd, Rr
LDI R1,0x32 - Possible in RISC/ARM
ADD R0,R1
Ex: For Machine cycle
DJNZ R0, label in CISC require 2 machine cycle = 24 clk pulse(no flag check)
DEC R0
BRNE label - in RISC require 1+1 machine cycle condition false condition 2 clk pulse,
1+2 machine cycle 3 clk pulse true condition ( happened because of falg check)
Djnz- Decrement and jump if not equal , BRNE brach if not equal

ARM DESIGN POLICY

Reduce power consumption


High code density
Low Price
Reduce the area of the die taken up by
the
embedded processor
ARM Incorporated hardware debug technology

DATA SIZES AND INSTRUCTION


SETS

The ARM is a 32-bit architecture.

When used in relation to the ARM:

Most ARMs implement two instruction sets

Byte means 8 bits


Halfword means 16 bits (two bytes)
Word means 32 bits (four bytes)

32-bit ARM Instruction Set


16-bit Thumb Instruction Set

Jazelle cores can also execute Java bytecode

INSTRUCTION SET FOR EMBEDDED SYSTEMS

Variable cycle execution for certain instructions


Inline barrel shifter leading to more complex
instructions
Thumb 16 bit instructions
Conditional execution
Enhanced Instructions

EMBEDDED SYSTEM
SOFTWARE
Application

Operating System

Device initialization (Boot) code and


Device Drivers

Hardware

ARM PROCESSOR
FUNDAMENTALS

ARM SEVEN OPERATING MODES

User : unprivileged mode under which most tasks run

FIQ : entered when a high priority (fast) interrupt is raised

IRQ : entered when a low priority (normal) interrupt is raised

Supervisor : entered on reset and when a Software Interrupt


instruction is executed

Abort : used to handle memory access violations

Undef : used to handle undefined instructions

System : privileged mode using the same registers as user mode

THE ARM REGISTER SET


Current
CurrentVisible
VisibleRegisters
Registers
Abort
Mode
SVC
Undef
Mode
Mode
IRQ
FIQ
User
Mode
Mode
Mode

r0r0
r1r1
r2r2
r3r3
r4r4
r5r5
r6r6
r7r7
r8r8
r9r9
r10
r10
r11
r11
r12
r12
r13
r13(sp)
(sp)
r14
r14(lr)
(lr)
r15
r15(pc)
(pc)
cpsr
cpsr
spsr
spsr

Banked out Registers


User
User

FIQ
FIQ

IRQ
IRQ

SVC
SVC

Undef

Abort

r8
r9
r10
r11
r12
r13
r13(sp)
(sp)
r14
r14(lr)
(lr)

r8
r8
r9
r9
r10
r10
r11
r11
r12
r12
r13
r13 (sp)
(sp)
r14
r14 (lr)
(lr)

r13
r13 (sp)
(sp)
r14
r14 (lr)
(lr)

r13
r13 (sp)
(sp)
r14
r14 (lr)
(lr)

r13 (sp)
r14 (lr)

r13 (sp)
r14 (lr)

spsr
spsr

spsr
spsr

spsr
spsr

spsr

spsr

REGISTER ORGANIZATION
SUMMARY
User
r0
r1
r2
r3
r4
r5
r6
r7
r8
r9
r10
r11
r12
r13 (sp)
r14 (lr)
r15 (pc)

FIQ

User
mode
r0-r7,
r15,
and
cpsr

IRQ

User
mode
r0-r12,
r15,
and
cpsr

SVC

User
mode
r0-r12,
r15,
and
cpsr

Undef

User
mode
r0-r12,
r15,
and
cpsr

Abort

User
mode
r0-r12,
r15,
and
cpsr

r8
r9
r10
r11
r12
r13 (sp)
r14 (lr)

r13 (sp)
r14 (lr)

r13 (sp)
r14 (lr)

r13 (sp)
r14 (lr)

r13 (sp)
r14 (lr)

spsr

spsr

spsr

spsr

spsr

Thumb state
Low registers

Thumb state
High registers

cpsr

Note: System mode uses the User mode register set

THE REGISTERS

ARM has 37 registers all of which are 32-bits long.

The current processor mode governs which of several banks is


accessible. Each mode can access

1 dedicated program counter


1 dedicated current program status register
5 dedicated saved program status registers
30 general purpose registers

a particular set of r0-r12 registers


a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
the program counter, r15 (pc)
the current program status register, cpsr

Privileged modes (except System) can also access

a particular spsr (saved program status register)

REGISTERS USER MODE

Registers R0 R12 are General Purpose Registers


R13 is used as Stack Pointer (sp)
R14 is used as Link register (lr)
R15 is used as Program Counter (pc)
CPSR is Current Program Status Register
SPSR is Saved Program Status Register

Day 1

R0
R1
R2
R3
R4
R5
R6
R7
R8
R9
R10
R11
R12
R13
R14
R15
CPSR
SPSR

CURRENT PROGRAM STATUS


REGISTER

To monitor and control internal operations


Some ARM Processor core have extra bits
allocated
Fields
Bit 31

Flags
30

29

28

NZCV
Function

Condition
Flags

Status

Control

Extension
7

I F T Mode
Interrupt
Masks

Processor
Mode

Thumb State

PROCESSOR MODES
Determines which registers are active and
the access rights to the cpsr register itself
Privileged & Nonprivileged

Abort
Fast

Interrupt Request
Interrupt Request
Supervisor
System
Undefined
User

Privilege
d
Nonprivileg
ed

BANKED REGISTERS
r0
r1
r2
r3
r4
r5
r6
r7
r8
r9
r10
r11
r12
r13 sp
r14 lr
r15 pc
cpsr
-

Banked Registers
Fast
Interrup
t
Request
r8_fiq
r9_fiq
r10_fiq
r11_fiq
r12_fiq
r13_fiq
r14_fiq

spsr_fiq

Interrup
SupervisoUndefined
t
r
Request

Abort

r13_irq
r14_irq

r13_svc
r14_svc

r13_undef
r14_undef

r13_abt
r14_abt

spsr_irq

spsr_svc

spsr_undef

spsr_abt

BANKED REGISTERS

Banked registers are available only when the


processor is in a particular mode
Every processor mode except user mode can change
mode by writing directly to the mode bits of the
cpsr
Banked registers are a subset of the main 16
registers
If we change processor mode, a banked register
from the new mode will replace an existing register
Exceptions and Interrupts cause a mode change

CHANGING MODE ON AN
EXCEPTION
User
Mode
r0
r1
r2
r3
r4
r5
r6
r7
r8
r9
r10
r11
r12
r13 sp
r14 lr
r15 pc
cpsr
-

Interrup
t
Request
Mode
r13_irq
r14_irq

spsr_irq

This change causes user


register r13 and r14 to
be banked
The user registers are
replaced with registers
r13_irq and r14_irq
spsr stores the previous
mode cpsr

PROCESSOR MODE
Mode

Abbr:

Privileged

Mode[4:0]

Abort

abt

yes

10111

Fast Interrupt Request

fiq

yes

10001

Interrupt Request

irq

yes

10010

Supervisor

svc

yes

10011

System

sys

yes

11111

Undefined

und

yes

11011

User

usr

no

10000

cpsr is not copied into the spsr when a


mode change is forced due to a program
writing directly to the cpsr.

PROCESSOR MODES
User
FIQ

Unprivileged mode for most applications to


run
Fast Interrupt Routine

IRQ

Interrupt Routines

Supervisor Entered on reset and when there is a


exception
Abort
Entered when data or instruction prefetch
aborted
Undefined When an undefined instructions is executed
System

Privileged user mode for operating system

STATE AND INSTRUCTION SETS

There are three instruction sets


ARM
Thumb
Jazelle

The Jazelle instruction set is a closed instruction set


and is not openly available.

To take advantage of Jazelle extra software


has to be licensed from both ARM Limited
and Sun Microsystems.

STATE AND INSTRUCTION SETS


ARM
(cpsr T = 0)

Thumb
(cpsr T = 1)

Instruction Size

32 bit

16 bit

Core Instruction

58

30

Conditional Execution

Most

Only branch instructions

Data Processing Instructions

Access to barrel shifter


and ALU

Separate barrel and ALU


instructions

Program Status Register

R/W in privileged mode

No direct access

Register Usage

15 GPR + PC

8 GPR + 7 high registers


+ PC

STATE AND INSTRUCTION SETS


Jazelle
(cpsr T = 0, J = 1)
Instruction Size

8 bit

Core Instruction

Over 60% of the java bytecodes are


implemented in hardware; the rest of the codes
are implemented in software

INTERRUPT MASKS

Are used to stop specific interrupt requests from


interrupting the processor
IRQ
FIQ

The I bit masks IRQ when set to binary 1, and F


bit masks FIQ when set to binary 1

CONDITION FLAGS
Flag

Flag Name

Set when

Saturation

The result causes an overflow and / or saturation

oVerflow

The result causes a signed overflow

Carry

The result causes an unsigned carry

Zero

The result is zero, frequently used to indicate the


equality

Negative

Bit 31 of the result is a binary 1

CONDITION FLAGS

Condition flags are updated by comparisons and the


result of ALU operations that specify the S instruction
suffix

If SUBS results in a register value of zero, then the Z flag


in the CPSR is set

Fields
Bit 31

Flags
30

29

28

Status
27

24

00100 0
Function

nzCvq

Control

Extension
7

0 1 0 10011
i

cpsr = nzCvqjiFt_SVC

svc

CONDITION CODES

The possible condition codes are listed below:

Note AL is the default and does not need to be specified

Suffix

EQ
NE
CS/HS
CC/LO
MI
PL
VS
VC
HI
LS
GE
LT
GT
LE
AL

Description
Equal
Not equal
Unsigned higher or same
Unsigned lower
Minus
Positive or Zero
Overflow
No overflow
Unsigned higher
Unsigned lower or same
Greater or equal
Less than
Greater than
Less than or equal
Always

Flags tested
Z=1
Z=0
C=1
C=0
N=1
N=0
V=1
V=0
C=1 & Z=0
C=0 or Z=1
N=V
N!=V
Z=0 & N=V
Z=1 or N=!V

CPSR
31

28

N Z C V

24

23

16 15

I F T

mode

hold information about the most recently performed ALU


operation
Set the processor
operating mode
Condition
code flags
N = Negative result from ALU Interrupt Disable bits.
I = 1: Disables the IRQ.
Z = Zero result from ALU
F = 1: Disables the FIQ.
C = ALU operation Carried out
V = ALU operation oVerflowed T Bit
Architecture xT only
J bit
T = 0: Processor in ARM
Architecture 5TEJ only
state
J = 1: Processor in Jazelle
T = 1: Processor in Thumb
state
state
Mode bits
Specify the processor mode

ARM CORE DATAFLOW MODEL


Data
Instruction
Decoder
Sign Extend
Read
Register File
r0 r15

r15
pc
Rn

Rd
Resul
t

Rm

Barrel Shifter
MAC

N
ALU

Address Register
Incrementer
Address

Acc

PIPELINE

Is a mechanism a RISC processor uses to execute


instructions
Using a pipeline speeds up execution by fetching
the next instruction while other instructions are
being decoded and executed

ARM7 THREE STAGE PIPELINE


Fetch

Decode

Execute

Fetch loads an instruction from memory


Decode identifies the instruction to be executed
Execute processes the instruction and writes the result
back to a register.

1. Fetch the instruction from memory (fetch).


2. Decode it to see what sort of instruction it
is (dec).
3. Access any operands that may be required
from the register bank (reg).
4. Combine the operands to form the result or
a memory address (ALU).
5. Access memory for a data operand, if
necessary (mem).
6. Write the result back to the register bank
(res).

PIPELINED INSTRUCTION
SEQUENCE
Fetch
Cycle
Time
1
Cycle
2
Cycle
3

ADD

Decod
e

SUB

ADD

CMP

SUB

Execute

ADD

Filling the pipeline


Allows the core to execute an instruction every cycle

ARM INSTRUCTION SEQUENCE


Fetch
Time Cycle
1
Cycle
2
Cycle
3
Cycle
4

cpsr

Decod
e

Execut
e

MSR IFt_SVC
cpsr
IFt_SVC

ADD

MSR

AND

ADD

MSR iFt_SVC

SUB

AND

ADD

cpsr

PIPELINE CHARACTERISTICS

An instruction in the execute stage will complete


even though an interrupt has been raised
The execution of a branch instruction or
branching by the direct modification of the PC
causes the ARM core to flush its pipeline

PIPELINE CHANGES FOR ARM9TDMI


ARM7TDMI
Instruction
Fetch

ThumbARM
decompress

FETCH

ARM decode
Reg Select

Reg
Shift
Read

DECODE

Reg
Write

ALU

EXECUTE

ARM9TDMI
Instruction
Fetch

FETCH

ARM or Thumb
Inst Decode
Reg
Reg
Decode
Read

DECODE

Shift + ALU

EXECUTE

Memory
Access

MEMORY

Reg
Write

WRITE

ARM10 VS. ARM11 PIPELINES


ARM10
Branch
Prediction
Instruction
Fetch

FETCH

ARM or
Thumb
Instruction
Decode

Reg Read

Memory
Access

EXECUTE

MEMORY

ARM11
Fetch
1

Fetch
2

Decode

Issue

Reg
Write

Multiply
Add

Multiply

DECODE

ISSUE

Shift + ALU

Shift

ALU

Saturate

MAC
1

MAC
2

MAC
3

Address

Data
Cache
1

Data
Cache
2

Write
back

WRITE

ARM EXCEPTIONS

ARM supports range of Interrupts, Traps,


Supervisor Calls, all grouped under general
heading of

Exceptions

EXCEPTIONS, INTERRUPTS, AND


THE VECTOR TABLE
When an exception or interrupt occurs, the
processor set the PC to a specific memory
address
The address is within a special address range
called the vector table
The entries in the vector table are instructions
that branch to specific routines designed to
handle a particular exception or interrupt
When an exception or interrupt occurs, the
processor suspends normal execution and
starts loading instructions from the exception
vector table

VECTOR ADDRESSES
Exception / Interrupt

Shorthand

Address

High Address

Reset

RESET

0x00000000

0xffff0000

Undefined Instruction

UNDEF

0x00000004

0xffff0004

Software Interrupt

SWI

0x00000008

0xffff0008

Prefetch Abort

PABT

0x0000000C

0xffff000C

Data Abort

DABT

0x000000010

0xffff0010

Reserved

0x000000014

0xffff0014

Interrupt Request

IRQ

0x000000018

0xffff0018

Fast Interrupt Request

FIQ

0x00000001C

0xffff001C

EXCEPTION HANDLING

When an exception occurs, the ARM:


Copies CPSR into SPSR_<mode>
Sets appropriate CPSR bits

Change to ARM state


Change to exception mode
Disable interrupts (if appropriate)

Stores return address in LR_<mode>


Sets PC to vector address

To return, exception handler needs to:


Restore CPSR from SPSR_<mode>
Restore PC from LR_<mode>

This can only be done in ARM state.

0x1C FIQ
0x18 IRQ
0x14 (Reserved)
0x10 Data Abort
0x0C Prefetch Abort
0x08 Software Interrupt
0x04 Undefined Instruction
0x00 Reset

Vector Table
Vector table can be at
0xFFFF0000 on ARM720T
and on ARM9/10 family devices

EXCEPTIONS, INTERRUPTS, AND


THE VECTOR TABLE

RESET when power is applied, branches to


initialization code
UNDEF when the processor cannot decode an
instruction
SWI when a SWI instruction is called
PABT attempts to fetch an instruction from an
address without the correct access permissions
DABT attempts to access data memory without
the correct access permissions
IRQ by external hardware
FIQ by external hardware requiring faster
response time

EXCEPTION PRIORITIES
1.
2.
3.
4.
5.
6.

Reset (Highest Priority)


Data Abort
FIQ
IRQ
Prefetch Abort
SWI, Undefined

CORE EXTENSIONS

Standard components placed next to the ARM


core
Improve performance, manage resources, provide
extra functionality
Three hardware extensions
Caches
Memory Management
Coprocessors(Vector Floating Point)

CACHES

Cache is a block of fast memory placed between


main memory and the core
Cache provides an overall increase in performance
ARM has two forms of cache
Single unified cache for data and instruction
Separate caches for data and instruction

MEMORY MANAGEMENT

MMU is a class of processor hardware components


for handling memory accesses requested by the
CPU.
The functions of MMUs are
Translation of virtual address to physical address.
Memory protection
Cache control etc

COPROCESSORS
Coprocessors can be attached to the ARM processor
A separate chip, that performs lot of calculations for
the microprocessor, relieving the CPU some of its
work and thus enhancing overall speed of system.
A secondary processor used to speed up operation by
taking over a specific part of main processors work.
The ARM processor uses coprocessor 15 registers to
control cache, TCMs, and memory management

DESCRIPTION OF CPSR
Parts

Bits

Architecture Description

Mode
T
I&F
J
Q
V
C
Z
N

4:0
5
7:6
24
27
28
29
30
31

all
ARMv4T
all
ARMv5TEJ
ARMv5T
E
all
all
all
all

processor mode
Thumb state
interrupt masks
Jazelle state
condition flag
condition flag
condition flag
condition flag
condition flag

ARM FAMILY ATTRIBUTE


COMPARISON
ARM7
Pipeline
depth
Typical
mW/MHz
MHz
MIPS/MHz
Architecture
Multiplie
r

ARM9

ARM10

ARM11

three-stage five-stage
six-stage
eight-stage
80
150
260
335
0.06 mW/MHz
0.19 mW/MHz0. 5 mW/MHz0.4
mW/MHz
(+ cache)
(+
(+
0.97
1.1
1.3
1.2
cache)
cache)
Von Neumann
Harvard
Harvard
Harvard
8 x 32
8 x 32
16 x 32
16 x 32

ARM PROCESSOR FAMILIES

ARCHITECTURE REVISIONS
RevisionExample core ISA
Implementatioenhancement
ARMv1
First ARM Processor
nARM1
ARMv2

ARM2

ARMv2a

ARM3

ARMv3

ARM6 &
ARM7DI

ARMv3M
ARMv4

ARM7M
StrongARM

26 bit addressing
32 bit multiplier
32 bit coprocessor
On chip cache
support
Atomic swap instruction
32 bit addressing
Separate cpsr & spsr
New modes UNDEF,
MMU support virtual memory
ABORT
Signed & unsigned long
multiply
Load store instruction
New Mode - System

ARCHITECTURE REVISIONS
RevisionExample core ISA
Implementatioenhancement
ARMv4T n
ARM7TDMI &
Thumb
ARMv5T
E

ARMv5T
EJ
ARMv6

ARM9T
ARM9E & ARM10E Superset of the ARMv4T
Extra inst. added for changing
state between ARM & Thumb
Enhanced multiply instructions
Extra DSP type instructions
Faster multiply accumulate
ARM7EJ &
Java acceleration
ARM926EJ
ARM11
New multimedia instructions

ARM PROCESSORS

ARM7 Family
ARM7EJ-S
ARM7TDMI
ARM7TDMI-S
ARM720T
ARM9/9E Families
ARM920T
ARM922T
ARM926EJ-S
ARM940T
ARM946E-S
ARM966E-S
ARM968E-S
Vector Floating Point Families
VFP10
ARM10 Family
ARM1020E
ARM1022E
ARM1026EJ-S

ARM11 Family
ARM1136J-S
ARM1136JF-S
ARM1156T2(F)-S
ARM1176JZ(F)-S
ARM11 MPCore

Cortex Family
Cortex-A8
Cortex-M1
Cortex-M3
Cortex-R4

Other Processors/Microarchitectures
StrongARM (DEC-Intel)
Xscale (Intel- Marvell Tech)
Other

CORTEX FAMILY

ARM Cortex-A Series - Application processors

for complex OS and user applications

ARM Cortex-A8, ARM Cortex-A9

ARM Cortex-R Series - Embedded processors

for real-time systems

ARM Cortex-R4(F)

ARM Cortex-M Series Embedded processors

optimized for cost sensitive applications, as


Mobile devices

ARM Cortex-M0, ARM Cortex-M1, ARM Cortex-M3

SWITCHING STATES

ARM to Thumb

Execute the BX instruction with state bit=1

Thumb to ARM

Execute the BX instruction with state bit=0


An interrupt or exception occurs

IN TODAYS SYSTEMS THE KEY


IS NOT RAW PROCESSOR SPEED
BUT TOTAL EFFECTIVE SYSTEM
PERFORMANCE AND POWER
CONSUMPTION
START EXPLORING MORE

Você também pode gostar