STM32 Seminar

Introduction to Cortex-M Core

COMPEL/STM Seminar
November 2010
Seminar Agenda
ƒ Overview of ST Microcontroller Portfolio
ƒ Introduction to Cortex-M Core
ƒ STM32 General Purpose Lines
ƒ Product-Line Overview (F100/F101/F103)
ƒ Walk through the main peripherals
ƒ ST Standard Peripheral Library
ƒ Live demonstration of the STM32 Value Discovery Kit
ƒ STM32 Low-Power Line
ƒ Product-Line Overview (L15x)
ƒ Low-Power modes and consumption
ƒ Specific Peripherals
ƒ STM32 Connectivity Line
ƒ Product-Line Overview (F105/7 & next)
ƒ Ethernet & USB Host Peripherals
ƒ Third Party Stacks
ƒ Audio Support
ƒ STM32 Wireless
ƒ Product-Line Overview (W108)
ƒ RF Performances
ƒ Wireless Stacks (Zigbee, RF4CE, proprietary)
ƒ STM32 Tools
ƒ Third Party Compiler & IDE
ƒ Boards and Debugger
ƒ ST Libraries

Cortex-M processors
ƒ Forget traditional 8/16/32-bit classifications
ƒ Seamless architecture across all applications
ƒ Every product optimised for ultra low power and ease of use

Cortex-M0: "8/16-bit" applications
Cortex-M3: "16/32-bit" applications
Cortex-M4: "32-bit/DSC" applications

Cortex-M processors are binary and tool compatible
Cortex-M3 Core Presentation
Cortex-M3 Processor

ƒ Hierarchical processor integrating core and advanced system peripherals

ƒ Cortex-M3 core
ƒ Harvard architecture
ƒ 3-stage pipeline with branch speculation
ƒ Thumb®-2 and traditional Thumb
ƒ ALU with hardware divide and single cycle multiply

ƒ Cortex-M3 Processor
ƒ Cortex-M3 core
ƒ Configurable interrupt controller
ƒ Bus matrix
ƒ Advanced debug components
ƒ Optional MPU & ETM (Not available in STM32F10x)
Cortex-M3 Processor Overview (1/2)
ARM v7M Architecture
Thumb-2 Instruction Set Architecture
Mix of 16 and 32 bit instructions for very high code density
Harvard architecture
Separate I & D buses allow parallel instruction fetching & data storage
Integrated Nested Vectored Interrupt Controller (NVIC) for low latency interrupt processing
Vector Table contains addresses, not instructions
Designed to be fully programmed in C
Even reset, interrupts and exceptions
Integrated Bus Matrix
Bus Arbiter
Bit Banding – Atomic Bit Manipulation
Write Buffer
Memory Interface (I&D) Plus System Interface & Private Peripheral Bus
Integrated System Timer (SysTick) for Real Time OS or other scheduled tasks
Cortex-M3 Processor Overview (2/2)
ƒ 3-Stage Pipeline
ƒ Fetch, Decode & Execute
ƒ Single Cycle Multiply

Source       Destination   Cycles
16b x 16b    32b           1
32b x 16b    32b           1
32b x 32b    32b           1
32b x 32b    64b           3-7*

*UMULL, SMULL, UMLAL, and SMLAL are interruptible and can also complete early depending on source values
Hardware Division
UDIV & SDIV (Unsigned or Signed divide)
Instruction takes between 2 & 12 cycles depending on dividend and divisor
The closer the dividend and divisor are in magnitude, the faster the instruction completes
Instruction is interruptible (abandoned/restarted)
Cortex-M3 & ARM7: Comparison
                    ARM7TDMI-S                        Cortex-M3
Architecture        v4T                               v7M
ISA Support         ARM (32-bit) & Thumb (16-bit)     Thumb-2 (merged 32/16-bit)
DMIPS/MHz           0.74 Thumb / 0.93 ARM             1.25 Thumb-2
Pipeline            3-Stage                           3-Stage + Branch Speculation
Interrupts          FIQ / IRQ                         NMI, SysTick and up to 240 interrupts
                                                      Integrated NVIC Interrupt Controller
                                                      1-255 Priorities
Interrupt Latency   24-42 Cycles (depending on LSM)   12 Cycles (6 when Tail Chaining)
Memory Map          Undefined                         Architecture Defined
System Status       PSR. 6 modes. 20 banked regs      xPSR. 2 modes. Stacked regs (1 bank)
Sleep Modes         No                                Three

ƒ Additional Features of the Cortex-M3


ƒ Reduced pin debug & trace interfaces reduce pin overhead from 9-pins to 2- or 3-pins
ƒ Hardware Interrupt Handling removes need for assembler code in interrupts
ƒ Integrated atomic bit manipulation for improved data storage
ƒ Extended Data Watchpoints & Flash Patch technology
ƒ Embedded sleep control and power-down modes
ƒ Optional very small Memory Protection Unit (MPU) & Embedded Trace Macrocell (ETM)
High Performance CPU and Buses
ARM v7M Architecture: Harvard benefits with Von Neumann single memory space
Von Neumann "bottleneck": a single 32-bit bus for
♦ code execution,
♦ data transfer (core/DMA),
♦ peripheral control

Cortex-M3: three 32-bit buses used in parallel for
♦ code execution,
♦ data transfer (core/DMA),
♦ peripheral control

[Figure: bus diagrams for a single-bus core and for the Cortex-M3 (labels: CORE, CM3, CODE, DATA, DMA, RAM, FLASH, PERIPH)]

Outstanding efficiency of 1.25 DMIPS/MHz and 1.2 CPI
[Figure: DMIPS versus fCPU for Cortex-M3, ARM966 (ARM), ARM7TDMI (ARM) and ARM7TDMI (Thumb)]
Thumb-2 instruction set provides 32-bit performance with 16-bit code density

Thumb 16-bit instruction set: full Thumb compatibility
ARM 32-bit instruction subset: complete ARM instruction set for better performance
New 16/32-bit instructions: 1 cycle MAC, hardware divide and bit handling

Thumb-2
♦ Single powerful instruction set
→ No more mode switching
♦ Two 16-bit instructions fetched per Flash access
Cortex-M3 Memory Map
ƒ Vendor Specific (0.5GB)
ƒ Set aside to enable vendors to implement peripheral compatibility with previous systems
ƒ Private Peripheral Bus (1MB)
ƒ Address space for system components (CoreSight, NVIC etc.)
ƒ External Device (1GB)
ƒ Intended for external devices and/or shared memory that needs ordered/non-buffered accesses
ƒ External RAM (1GB)
ƒ Intended for off-chip memory
ƒ Peripheral (0.5GB)
ƒ Intended for normal peripherals. The bottom 1MB of the peripheral address space (0x40000000 – 0x400FFFFF) is reserved for bit-band accesses. Accesses to the 32MB peripheral bit-band alias region (0x42000000 – 0x43FFFFFF) are remapped to this 1MB.
ƒ SRAM (0.5GB)
ƒ Intended for on-chip SRAM. The bottom 1MB of the SRAM address space (0x20000000 – 0x200FFFFF) is reserved for bit-band accesses. Accesses to the 32MB SRAM bit-band alias region (0x22000000 – 0x23FFFFFF) are remapped to this 1MB address space.
ƒ Code (0.5GB)
ƒ Reserved for code memory (flash, SRAM). This region is accessed via the Cortex-M3 ICode and DCode buses.
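
The remapping between a bit-band region and its alias region can be sketched in C. The region addresses are the ones listed above; the formula (each bit of the 1MB region gets its own 32-bit alias word, so byte offset times 32 plus bit number times 4) is the standard Cortex-M3 bit-band mapping, and the function and macro names are illustrative assumptions, not part of any ST library.

```c
#include <stdint.h>

/* Region bases taken from the memory map above. */
#define SRAM_BB_BASE     0x20000000u
#define SRAM_BB_ALIAS    0x22000000u
#define PERIPH_BB_BASE   0x40000000u
#define PERIPH_BB_ALIAS  0x42000000u
#define BB_REGION_SIZE   0x00100000u  /* 1MB bit-band region */

/* Hypothetical helper: returns the alias word address for one bit of a
 * byte in a bit-band region, or 0 if the address or bit is out of range. */
uint32_t bitband_alias(uint32_t byte_addr, unsigned bit)
{
    if (bit > 7u) {
        return 0u;
    }
    /* Unsigned subtraction wraps for addresses below the base,
     * so a single comparison checks both bounds. */
    if (byte_addr - SRAM_BB_BASE < BB_REGION_SIZE) {
        return SRAM_BB_ALIAS + (byte_addr - SRAM_BB_BASE) * 32u + bit * 4u;
    }
    if (byte_addr - PERIPH_BB_BASE < BB_REGION_SIZE) {
        return PERIPH_BB_ALIAS + (byte_addr - PERIPH_BB_BASE) * 32u + bit * 4u;
    }
    return 0u;
}
```

On a target, writing 1 or 0 to the returned word address sets or clears that single bit atomically; on a host PC the function only computes the address.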
Optimized use of the RAM

Bit banding allows optimized code and gives the highest-density use of SRAM
Unaligned data access is supported to improve constant-data and RAM utilization

[Figure: structure layout example with char (8), int (16) and long (32) fields. On a 32-bit machine which does not support unaligned data, aligned fields leave unused (wasted) space. With unaligned data the fields are packed, leaving free space for the rest of the application.]

Reduces SRAM memory requirements by over 25%

Less memory, lower-cost devices
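
The effect of packing can be sketched with two C structures holding the same fields. This is an illustration only: the field names are invented, and `__attribute__((packed))` is a GCC/Clang extension assumed here (other compilers use different syntax).

```c
#include <stddef.h>
#include <stdint.h>

/* Naturally aligned layout: the compiler inserts padding so that the
 * 32-bit and 16-bit fields start on aligned addresses. */
struct aligned_record {
    uint8_t  flag;
    uint32_t count;
    uint8_t  mode;
    uint16_t level;
};

/* Packed layout: fields follow each other with no padding, which relies
 * on the core supporting unaligned accesses. */
struct __attribute__((packed)) packed_record {
    uint8_t  flag;
    uint32_t count;
    uint8_t  mode;
    uint16_t level;
};

size_t aligned_size(void) { return sizeof(struct aligned_record); }
size_t packed_size(void)  { return sizeof(struct packed_record); }
```

With natural alignment the first structure typically occupies 12 bytes, while the packed one occupies 8; the exact aligned size depends on the compiler and ABI.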

Debug Capabilities
Serial Wire Debugging for optimized device pin-out
♦ JTAG → SWD: more pins available for the application

Embedded break/watch capabilities for easy flashed application debugging
♦ 2 hardware breakpoints → 8 hardware breakpoints
♦ 2 hardware watchpoints

Serial Wire Viewer for targeted low bandwidth data trace
♦ Using serial wire interface or dedicated bus CKout+D[3..0] for better bandwidth
♦ Triggered by embedded break and watch points

ETM capability for better real time debugging
♦ Instruction trace only
♦ External signal triggering capability
♦ Can be used in parallel with data watchpoint

Debugging features remain available while the core is in low power mode

Privilege, Modes and Stacks

ƒ Privileged/Non-privileged operation
ƒ Same as ARM7 Supervisor/User

ƒ Thread mode and Handler mode


ƒ Handler mode is used while an exception or interrupt is being handled
ƒ Thread mode is just normal application code running

ƒ Main stack – Process stack


ƒ Exceptions use main stack in privileged mode
ƒ Applications (thread mode) can use process stack

Execution Modes
ƒ Cortex-M3 has 2 execution modes and 2 privilege levels:

                Privileged   User
Handler mode    yes          no
Thread mode     yes          yes

ƒ Handler mode
ƒ An exception is being processed
ƒ Always privileged execution

ƒ Thread mode
ƒ No exception is being processed
ƒ Normal code is executing
ƒ Could be privileged or user
ƒ When Thread mode has been changed to user, it cannot change itself back to privileged. Only a Handler can change the privilege of Thread mode.

ƒ This model is a simplification of the modes from other ARM processors
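
The one-way nature of the privilege drop can be sketched as a small host-side model in C. The type and function names are illustrative assumptions; the model does not touch any real processor register and only mirrors the rules listed above.

```c
#include <stdbool.h>

typedef enum { MODE_THREAD, MODE_HANDLER } exec_mode_t;

/* Simplified model of the core state described above. */
typedef struct {
    exec_mode_t mode;
    bool thread_is_user;   /* true once Thread mode has been changed to user */
} cpu_model_t;

/* Handler mode is always privileged; Thread mode may be either. */
bool is_privileged(const cpu_model_t *cpu)
{
    return cpu->mode == MODE_HANDLER || !cpu->thread_is_user;
}

/* Attempts to change the privilege of Thread mode. The request only
 * takes effect when the code making it is currently privileged. */
bool set_thread_user(cpu_model_t *cpu, bool user)
{
    if (!is_privileged(cpu)) {
        return false;   /* user code cannot restore its own privilege */
    }
    cpu->thread_is_user = user;
    return true;
}
```

In this model a privileged thread can drop to user, a user thread cannot undo that, and a handler can restore privilege before returning to Thread mode.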

Stacks
ƒ Cortex-M3 supports two stacks
ƒ Main Stack (initialised after reset by hardware)
ƒ Process Stack
ƒ Exceptions use main stack
ƒ Thread mode uses either the main or process stack
ƒ Firmware selectable
ƒ The intended usage model is
ƒ OS and Exceptions use main stack
ƒ Threads (user processes) use the process stack
ƒ Intended to prevent user process from modifying the main stack
ƒ Can be configured to use just one stack (reset default)
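
The stack selection rules above can be summarised in a short C sketch. The names are illustrative assumptions; the function is a host-side model of the rules, not code that switches stacks on a device.

```c
#include <stdbool.h>

typedef enum { MODE_THREAD, MODE_HANDLER } exec_mode_t;
typedef enum { STACK_MAIN, STACK_PROCESS } stack_sel_t;

/* Returns which stack is in use.
 * thread_uses_process_stack models the firmware-selectable setting;
 * false corresponds to the reset default (a single stack). */
stack_sel_t active_stack(exec_mode_t mode, bool thread_uses_process_stack)
{
    if (mode == MODE_HANDLER) {
        return STACK_MAIN;   /* exceptions always use the main stack */
    }
    return thread_uses_process_stack ? STACK_PROCESS : STACK_MAIN;
}
```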

Exception/Interrupt Handling
ƒ Very low latency interrupt processing
ƒ Exceptions processed in Privileged operation
ƒ Interruptible LDM/STM for low interrupt latency
ƒ Automatic processor state save and restore
ƒ Provides low latency ISR entry and exit
ƒ Allows handler to be written entirely in ‘C’

ƒ The Cortex-M3 processor integrates an advanced Nested Vectored Interrupt Controller (NVIC)
ƒ 43 maskable interrupt channels (not including the 16 interrupt lines of the Cortex-M3)
ƒ 16 programmable priority levels
ƒ Allows early processing of interrupts
ƒ Supports advanced features for next generation real-time applications
ƒ Tail-chaining of pending interrupts
ƒ Late-arrival interrupt handling and priority boosting / inversion

Exceptional Control Capabilities Through Integrated Interrupt Handling

Interrupt Response – Tail Chaining
[Figure: timing of two pending interrupts, IRQ1 (highest priority) and IRQ2]

ARM7 (interrupt handling in assembler code)
PUSH (26) | ISR 1 | POP (16) | PUSH (26) | ISR 2 | POP (16)
42 cycles from ISR 1 exit to ISR 2 entry

Cortex-M3 (interrupt handling in hardware)
PUSH (12) | ISR 1 | Tail-chaining (6) | ISR 2 | POP (12)
6 cycles from ISR 1 exit to ISR 2 entry

ARM7
• 26 cycles from IRQ1 to ISR1 entered
• Up to 42 cycles if LSM
• 42 cycles from ISR1 exit to ISR2 entry
• 16 cycles to return from ISR2

Cortex-M3
• 12 cycles from IRQ1 to ISR1 entered
• 12 cycles if LSM
• 6 cycles from ISR1 exit to ISR2 entry
• 12 cycles to return from ISR2
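
The cycle counts above can be added up to compare the total entry/exit overhead for back-to-back interrupts. The following C sketch uses only the figures from this slide; the structure and function names are illustrative assumptions, and the ISR bodies themselves are excluded from the count.

```c
/* Cycle counts taken from the slide above. */
typedef struct {
    unsigned entry_cycles;    /* IRQ to first ISR entered */
    unsigned exit_cycles;     /* return from the last ISR */
    unsigned between_cycles;  /* ISR exit to next ISR entry */
} irq_timing_t;

static const irq_timing_t ARM7_TIMING = { 26u, 16u, 42u };
static const irq_timing_t CM3_TIMING  = { 12u, 12u, 6u };

/* Overhead in cycles for isr_count interrupts serviced back to back. */
unsigned irq_overhead_cycles(const irq_timing_t *t, unsigned isr_count)
{
    if (isr_count == 0u) {
        return 0u;
    }
    return t->entry_cycles
         + (isr_count - 1u) * t->between_cycles
         + t->exit_cycles;
}
```

For the two-interrupt case in the figure this gives 84 cycles of overhead on ARM7 against 30 on Cortex-M3.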

Interrupt Response – Preemption

[Figure: timing for IRQ1 (highest priority) and IRQ2, with IRQ2 arriving during the exit of ISR 1]

ARM7
ISR 1 | POP (16) | PUSH (26) | ISR 2 | POP (16)
42 cycles from ISR 1 exit to ISR 2 entry

Cortex-M3
ISR 1 | POP (1-12) | 6 | ISR 2 | POP (12)
7-18 cycles from ISR 1 exit to ISR 2 entry

ARM7
• Load Multiple is uninterruptible, and hence the core must complete the POP and the full stack PUSH

Cortex-M3
• POP may be abandoned early if another interrupt arrives
• If POP is interrupted it only takes 6 cycles to enter ISR2 (equivalent to tail-chaining)

Interrupt Response – Late Arriving

[Figure: timing for IRQ2 followed by a late-arriving IRQ1 (highest priority)]

ARM7
PUSH (26) | PUSH (26) | ISR 1 | POP (16) | ISR 2 | POP (16)

Cortex-M3
PUSH | ISR 1 | Tail-chaining (6) | ISR 2 | POP (12)

ARM7
• 26 cycles to ISR2 entered
• Immediately pre-empted by IRQ1 and takes a further 26 cycles to enter ISR 1
• ISR 1 completes and then takes 16 cycles to return to ISR 2

Cortex-M3
• Stack push to ISR 2 is interrupted
• Stacking continues but new vector address is fetched in parallel
• 6 cycles from late-arrival to ISR1 entry
• Tail-chain into ISR 2

Interrupt Prioritization
Each interrupt source has a 4-bit interrupt priority value
The 4 bits are divided into pre-empting priority levels and non-pre-empting "sub-priority" levels
The software programmable PRIGROUP register field of the NVIC chooses how many of the 4 bits are used for "group-priority" and how many are used for "sub-priority"
Sub-priority levels only have an effect if the pre-empting priority levels are the same
Group priority is the pre-empting priority
Lower numbers are higher priority
Hardware interrupt number is lowest level of prioritization
IRQ3 is higher priority than IRQ4 if the priority registers are programmed the same

PRIGROUP    Binary Point      Preempting Priority    Sub-Priority
(3 Bits)    (group.sub)       (Group Priority)
                              Bits    Levels         Bits    Levels
011         4.0  gggg         4       16             0       0
100         3.1  gggs         3       8              1       2
101         2.2  ggss         2       4              2       4
110         1.3  gsss         1       2              3       8
111         0.4  ssss         0       0              4       16

In STM32F10x 16 levels (4-bit) of priority are implemented
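
The table can be turned into a small C sketch that splits a 4-bit priority value into its group and sub-priority fields and decides whether one interrupt may pre-empt another. It works on the plain 4-bit value (0 to 15) rather than on any register layout, the names are illustrative assumptions, and PRIGROUP values below 011 are treated like 011 here as a simplifying assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRIO_BITS 4u   /* 4-bit priority, as implemented in STM32F10x */

typedef struct {
    uint8_t group;   /* pre-empting (group) priority */
    uint8_t sub;     /* sub-priority */
} prio_fields_t;

/* Splits a 4-bit priority value according to PRIGROUP (table above):
 * 011 -> 4.0, 100 -> 3.1, 101 -> 2.2, 110 -> 1.3, 111 -> 0.4. */
prio_fields_t split_priority(uint8_t priority, uint8_t prigroup)
{
    unsigned sub_bits = (prigroup > 3u) ? (unsigned)prigroup - 3u : 0u;
    unsigned value = priority & 0x0Fu;
    prio_fields_t f;

    if (sub_bits > PRIO_BITS) {
        sub_bits = PRIO_BITS;
    }
    f.group = (uint8_t)(value >> sub_bits);
    f.sub   = (uint8_t)(value & ((1u << sub_bits) - 1u));
    return f;
}

/* A pending interrupt pre-empts the active one only if its group
 * priority is numerically lower; sub-priority never causes pre-emption. */
bool can_preempt(uint8_t pending, uint8_t active, uint8_t prigroup)
{
    return split_priority(pending, prigroup).group
         < split_priority(active, prigroup).group;
}
```

For example, with PRIGROUP = 101 (ggss) the value 1011 binary splits into group 2 and sub-priority 3.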


Interrupt Priority Settings Examples
PRIGROUP = 011 ("gggg"): 16 groups (0 to 15), all with pre-emption over lower groups
PRIGROUP = 101 ("ggss"): 4 groups (0 to 3), each with 4 sub-groups (0 to 3). Pre-emption only across groups
PRIGROUP = 111 ("ssss"): 16 sub-groups (0 to 15), without pre-emption over lower sub-groups

Cortex-M3 Exception Types
No.     Exception Type     Priority        Type of Priority   Description
1       Reset              -3 (Highest)    fixed              Reset
2       NMI                -2              fixed              Non-Maskable Interrupt
3       Hard Fault         -1              fixed              Default fault if other handler not implemented
4       MemManage Fault    0               settable           MPU violation or access to illegal locations
5       Bus Fault          1               settable           Fault if AHB interface receives error
6       Usage Fault        2               settable           Exceptions due to program errors
7-10    Reserved           N.A.            N.A.
11      SVCall             3               settable           System Service call
12      Debug Monitor      4               settable           Break points, watch points, external debug
13      Reserved           N.A.            N.A.
14      PendSV             5               settable           Pendable request for System Device
15      SYSTICK            6               settable           System Tick Timer
16      Interrupt #0       7               settable           External Interrupt #0
…       …                  …               settable           …
256     Interrupt #240     247             settable           External Interrupt #240

In STM32F10x 43 interrupts are implemented (59 vectors in total, including the 16 Cortex-M3 entries)


Vector Table
ƒ Vector Table starts at location 0
ƒ In the code section of the memory map
ƒ Vector Table contains addresses (vectors) of exception handlers and ISRs
ƒ Not instructions like other ARM processors
ƒ Table size (in words) is = number of IRQ inputs + 16
ƒ Minimum size (case of 1 IRQ): 17 words
ƒ Maximum size (case of 240 IRQs): 256 words
ƒ Main stack pointer initial value in location 0
ƒ Set up by hardware during Reset
ƒ Vector Table can be relocated (to SRAM)
ƒ Software configurable through dedicated register in SCB

Address      Vector
0x00         Initial Main SP
0x04         Reset
0x08         NMI
0x0C         Hard Fault
0x10         Memory Manage
0x14         Bus Fault
0x18         Usage Fault
0x1C-0x28    Reserved
0x2C         SVCall
0x30         Debug Monitor
0x34         Reserved
0x38         PendSV
0x3C         Systick
0x40         IRQ0
…            More IRQs

In STM32F10x the Vector Table size is 236 bytes (59 * 4 bytes)
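
The size rule and the address column can be checked with a few lines of C. The function names are illustrative assumptions; the arithmetic uses only the figures given above (16 system entries, 4 bytes per vector).

```c
#include <stdint.h>

#define SYSTEM_ENTRIES 16u   /* initial SP plus the 15 system exception slots */
#define WORD_BYTES      4u   /* each vector is one 32-bit address */

/* Table size in words: number of IRQ inputs + 16. */
uint32_t vector_table_words(uint32_t irq_count)
{
    return irq_count + SYSTEM_ENTRIES;
}

/* Table size in bytes. */
uint32_t vector_table_bytes(uint32_t irq_count)
{
    return vector_table_words(irq_count) * WORD_BYTES;
}

/* Byte offset of the vector for a given IRQ number (IRQ0 is at 0x40). */
uint32_t irq_vector_offset(uint32_t irq_number)
{
    return (SYSTEM_ENTRIES + irq_number) * WORD_BYTES;
}
```

With the 43 interrupts of the STM32F10x this gives 59 words, that is 236 bytes, matching the figure above.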

Power Management
"8-bit microcontroller like" power mode management
SLEEP NOW
♦ "Wait for Interrupt" instruction to enter low power mode
→ No more dedicated control register settings sequence
♦ "Wait for Event" instruction to enter low power mode
→ No need of an interrupt to wake up from sleep
→ Rapid resume from sleep
SLEEP on EXIT
♦ Sleep request done in interrupt routine
♦ Low power mode entered on interrupt return
→ Very fast wakeup time without context saving (6 cycles)
DEEP SLEEP
♦ Long duration sleep
→ From product side: the PLL can be stopped, or the power to digital parts of the system can be shut down
→ Enables low power consumption

Optimized RUN mode core power consumption: 3 times less than ARM7TDMI
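
The SLEEP on EXIT and DEEP SLEEP options are selected by two bits of the core's System Control Register. The sketch below only computes the register value on a host; the bit positions (SLEEPONEXIT = bit 1, SLEEPDEEP = bit 2) come from the ARMv7-M architecture rather than from this seminar, and the function and macro names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* System Control Register bits (ARMv7-M architecture, not from the slides). */
#define SCR_SLEEPONEXIT (1u << 1)  /* enter sleep on return from an ISR */
#define SCR_SLEEPDEEP   (1u << 2)  /* select deep sleep instead of sleep */

/* Returns the SCR value with the two sleep options updated and all
 * other bits left unchanged. */
uint32_t scr_with_sleep_options(uint32_t scr, bool sleep_on_exit, bool deep_sleep)
{
    scr &= ~(SCR_SLEEPONEXIT | SCR_SLEEPDEEP);
    if (sleep_on_exit) {
        scr |= SCR_SLEEPONEXIT;
    }
    if (deep_sleep) {
        scr |= SCR_SLEEPDEEP;
    }
    return scr;
}
```

On a device using the CMSIS headers, the result would typically be written to `SCB->SCR` before executing `__WFI()` or `__WFE()`; what DEEP SLEEP actually stops or powers down is product specific.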
System Timer (SysTick)
ƒ Flexible system timer
ƒ 24-bit self-reloading down counter with end of count interrupt generation
ƒ 2 configurable Clock sources
ƒ Suitable for Real Time OS or other scheduled tasks

In STM32F10x the SysTick clock can be: CPU clock or CPU clock/8
(provided externally by the Reset and Clock Control, RCC)
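
Because the counter is 24 bits wide, the reload value for a given tick rate has to be checked against that limit. A minimal C sketch of the calculation follows; the function name is an illustrative assumption, and any clock frequency passed in is an example rather than a value from this seminar.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu   /* 24-bit down counter */

/* Computes the reload value for tick_hz interrupts per second from a
 * SysTick clock of clock_hz. The counter counts from reload down to 0,
 * so one period lasts reload + 1 clock cycles.
 * Returns false if the period cannot be represented in 24 bits. */
bool systick_reload(uint32_t clock_hz, uint32_t tick_hz, uint32_t *reload)
{
    uint32_t cycles;

    if (tick_hz == 0u || clock_hz < tick_hz) {
        return false;
    }
    cycles = clock_hz / tick_hz;
    if (cycles - 1u > SYSTICK_MAX_RELOAD) {
        return false;
    }
    *reload = cycles - 1u;
    return true;
}
```

Assuming, for illustration, a 72 MHz CPU clock, a 1 ms tick needs a reload of 71999; with the CPU clock/8 source (9 MHz) the same tick needs 8999. A 1 s tick at 72 MHz does not fit in 24 bits.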

Thank You !
