System Timing
Synchronous System Timing
Clock signals
Sample Clock Specifications
Xilinx FPGA
Micron SDRAM
Clock Jitter
• Jitter is the variation in the phase of a clock signal
caused by noise, patterns, or other sources, with a
frequency of variation greater than a few tens of hertz.
• Slower changes in phase due to temperature, voltage, and
other physical changes are usually referred to as "wander."
• Period Jitter: The short-term variations in the clock period.
Figure: a clock waveform with ideal period Tideal; period jitter moves the edge by up to +Tjitter or −Tjitter.
BYU ECEn 320
Measurement of Clock Jitter
• The length of every sampled clock period is plotted as a
histogram of the number of periods with a given length. The
histogram has a normal distribution.
• The outer curve is the accumulated count (in this case, 865,000
samples). Peak-to-peak period jitter is 56.2 ps.
• The inner curve is the latest sampled count of 1000 periods. Peak-
to-peak period jitter is 34.2 ps.
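As a rough illustration of the measurement above, peak-to-peak period jitter can be taken as the spread between the longest and shortest sampled period. The sketch below uses made-up sample values, and the function name is hypothetical.

```python
def peak_to_peak_jitter(periods_ps):
    """Return the spread between the longest and shortest sampled period."""
    return max(periods_ps) - min(periods_ps)

# Hypothetical sampled clock periods in picoseconds (illustrative values only).
samples_ps = [10000, 10012, 9990, 10005, 9998]
print(peak_to_peak_jitter(samples_ps))  # 22
```

More samples can only widen this spread, which is consistent with the accumulated count showing a larger peak-to-peak value than the latest 1000 periods.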
D flip-flop timing parameters
Xilinx Flip-Flop Variations
Setup and Hold Times
There is a timing "window" around the clocking event during which the input must remain stable and unchanged in order to be recognized.
Figure: Input and Clock waveforms showing the setup time Tsu before the clock edge; an input change inside the window is marked "Invalid!".
Setup and Hold Time Violations
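Figure: Input and Clock waveforms showing the hold time Th after the clock edge; an input change inside the window is marked "Invalid!".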
If you leave after 7:40, you will miss the train. If you
leave before 7:40, you should have enough time to
get to the station before it leaves.
Helping your friend onto the train . . .
Once the train has started, your friend needs help
staying on the train. If he does not have continuous
help for five minutes, he will fall off the train.
The valid region or "window" associated with the clock event does not have to be centered around the clock edge.
The region can be to the right or left of the clock edge when the setup or hold times are negative.
When the hold time is negative, the valid region is to the left of the clock edge.
Figure: Input and Clock waveforms showing Tsu and Th (Negative Hold Time).
Negative Setup Time
The valid region or "window" associated with the clock event does not have to be centered around the clock edge.
The region can be to the right or left of the clock edge when the setup or hold times are negative.
When the setup time is negative, the valid region is to the right of the clock edge.
This allows the input to change slightly after the clock edge without disturbing the operation of the flip-flop.
Note: you cannot have both a negative setup time and a negative hold time!
Figure: Input and Clock waveforms showing Tsu (Negative Setup Time) and Th.
Figure: Clk and Q waveforms showing the clock-to-Q delays Tplh and Tphl (Tcq).
Timing Specifications
74LS74 Positive Edge Triggered D Flipflop
• Setup time: Tsu = 20 ns
• Hold time: Th = 5 ns
• Minimum clock width: Tw = 25 ns
• Propagation delays (low to high, high to low, max and typical):
– Tplh: 25 ns (max), 13 ns (typical)
– Tphl: 40 ns (max), 25 ns (typical)
Figure: D, Clk, and Q waveforms with Tsu, Th, Tw, Tplh, and Tphl marked.
Cascaded Flip-Flops
Figure: two cascaded D flip-flops (IN → Q0 → Q1) sharing CLK, with waveforms for Clock, IN, Q0, and Q1.
Cascaded Flip-Flops
Are the Setup and Hold Times met?
Q0: Input is IN
Setup and hold times are met if the input, IN, does not change within the valid region or window.
Figure: two cascaded D flip-flops (IN → Q0 → Q1) sharing CLK, with waveforms for Clock, IN, Q0, and Q1.
Cascaded Flip-Flops
Are the Setup and Hold Times met?
Q1: Input is Q0
Wait! Q0 is changing right at the clock edge. Won’t this violate the hold time of Q1?
Does it violate the setup time of Q1?
Figure: two cascaded D flip-flops (IN → Q0 → Q1) sharing CLK, with waveforms for Clock, IN, Q0, and Q1.
Cascaded Flip-Flops
Are the Setup and Hold times of Q1 met?
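Figure: IN, Q0, and Q1 waveforms; Q0 changes Tcq after each clock edge, and the hold window Th of Q1 follows each edge.
The hold time of Q1 is met as long as Tplh > Th and Tphl > Th.
Do we use Tplh min or max?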
Cascading Flip-Flops
• Flip-flop families are designed to guarantee that
Tcq(min) > Th
• You can safely cascade flip-flops from the same family
• If you mix flip-flop families, you need to make sure
that Tcq(min)_fam1 > Th_fam2 and Tcq(min)_fam2 > Th_fam1
• You cannot solve this kind of problem by slowing
down the clock.
Hold-Time Margin
• Tcq_min + Tcomb_min ≥ Thold
Figure: flip-flop Q0 (input IN) drives combinational logic (Comb), whose output Q0’ feeds the D input of flip-flop Q1.
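The hold-time condition above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name is hypothetical, and the numbers are the minimum flip-flop and gate delays listed in the state machine example later in these notes (Tplh(min) = 10 ns, Th = 5 ns, gate Tphl(min) = 8 ns).

```python
def hold_margin(tcq_min, tcomb_min, thold):
    """Hold-time margin in ns; zero or positive means the hold time is met."""
    return tcq_min + tcomb_min - thold

# Directly cascaded flip-flops: no combinational logic in between.
print(hold_margin(tcq_min=10, tcomb_min=0, thold=5))  # 5
# One gate with an 8 ns minimum delay between the flip-flops.
print(hold_margin(tcq_min=10, tcomb_min=8, thold=5))  # 13
```

The clock period does not appear anywhere in the expression, which is why slowing down the clock cannot repair a hold-time problem.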
Cascaded Flip-Flops
How fast can you clock this circuit?
Figure: two cascaded D flip-flops (IN → Q0 → Q1) sharing CLK; the clock period Tclk is to be determined.
Figure: Clock, Q0, and Q1 waveforms; Q0 changes Tcq after a clock edge and must be stable Tsetup before the next edge.
Synchronous Circuit
How fast can you clock this circuit?
Figure: flip-flop Q0 drives an inverter whose output Q0’ feeds flip-flop Q1, both clocked by CLK; the Clock, IN, Q0, Q0’, and Q1 waveforms show Tcq, the inverter delay Tpinv, and Tsetup.
Setup-Time Margin
• Tclk ≥ Tcq + Tcomb + Tsetup
• Clock Jitter
– Must account for worst-case clock jitter
– Tclkmin = Tclk - Tmaxjitter
– Tclk ≥ Tcq + Tcomb + Tsetup + Tmaxjitter
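A small worked example may help here. The sketch below evaluates the setup-time constraint with and without jitter; the function name is hypothetical, the delays are the worst-case values from the state machine example in these notes, and the 0.5 ns jitter figure is an assumption chosen only for illustration.

```python
def min_clock_period(tcq_max, tcomb_max, tsetup, tjitter_max=0):
    """Smallest clock period (ns) that still meets the setup time."""
    return tcq_max + tcomb_max + tsetup + tjitter_max

no_jitter = min_clock_period(tcq_max=40, tcomb_max=44, tsetup=20)
with_jitter = min_clock_period(40, 44, 20, tjitter_max=0.5)  # assumed jitter
print(no_jitter, with_jitter)  # 104 104.5
```

Jitter is added rather than subtracted because the shortest clock period the circuit ever sees is Tclk − Tmaxjitter, and that shortest period must still cover the full data path.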
Figure: synchronous inputs and flip-flop outputs feed the combinational logic, which drives the flip-flop inputs.
Synchronous Circuit
Figure: synchronous circuit with input X, combinational logic, two D flip-flops with outputs A and B, output Z, and a common CLK.
State Machine Example
Flip-Flop Timing
Th: 5 ns
Tsu: 20 ns
Tplh: 25 ns (max), 10 ns (min)
Tphl: 40 ns (max), 20 ns (min)
Input Timing
Tinput: 35 ns (max), 25 ns (min)
Gate Timing
Tplh: 22 ns (max), 10 ns (min)
Tphl: 15 ns (max), 8 ns (min)
Figure: state machine circuit with input X, flip-flop outputs A and B, output Z, and CLK.
Feedback path:
Tp_fl = 40 ns
Tp_fl + Tp_ifl = 40 + 22 + 22 ns
Tp_fl + Tp_ifl + Tsu = 40 + 22 + 22 + 20 = 104 ns
Input path:
Tinput = 35 ns
Tinput + Tp_ifl = 35 + 22 + 22 ns
Tinput + Tp_ifl + Tsu = 35 + 22 + 22 + 20 = 99 ns
Feedback Path: 104 ns
Input Path: 99 ns
Critical Path: 104 ns (9.6 MHz)
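The critical-path result can be reproduced with the worst-case numbers from the timing tables. This is only a sketch: the constant names are made up for illustration, and the two gate levels are inferred from the "22 + 22" terms in the path sums.

```python
# Worst-case (max) delays in ns, taken from the timing tables.
FF_TPHL_MAX = 40    # slowest flip-flop clock-to-Q delay
GATE_TPLH_MAX = 22  # slowest gate delay
TSU = 20            # flip-flop setup time
TINPUT_MAX = 35     # latest arrival of the external input
GATE_LEVELS = 2     # assumed: two gate delays in the input-forming logic

feedback_path = FF_TPHL_MAX + GATE_LEVELS * GATE_TPLH_MAX + TSU
input_path = TINPUT_MAX + GATE_LEVELS * GATE_TPLH_MAX + TSU
critical_path = max(feedback_path, input_path)
fmax_mhz = 1000 / critical_path  # a period in ns gives a frequency in MHz

print(feedback_path, input_path, critical_path, round(fmax_mhz, 1))
# 104 99 104 9.6
```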
Output Timing
Tp_fl = 40 ns
Tp_fl + Tp_ofl = 40 + 22 = 62 ns
Static Timing Analysis
• Static timing analysis tools are used to identify
worst-case signal delays
– Identifies every combinational path in the circuit
– Calculates timing on each path
– Identifies the worst-case design path
• Delay file: delay of each independent net (not
combinational paths)
– .dly file in the project implementation directory
• Trace file: delay of each combinational path
– Must run trce command on placed and routed design
trce -a -v vendingmachine.ncd
– .twr output file from trce command
File: vendingmachine.dly
Trace File Example
--------------------------------------------------------------------------------
Delay: 18.621ns (data path)
Source: u1_u1.A
Destination: cathodes<6>
Data Path Delay: 18.621ns (Levels of Logic = 3)
Source Clock: CLK_BUFGP rising
Clock Skew
• Proper operation of synchronous systems
requires that all registered elements are clocked
at the same time
• Sometimes this is not possible: the clock seen
at one flip-flop may be slightly delayed with
respect to the clock at another flip-flop
• The relative delay of the clock is called clock skew
Causes of Clock Skew
• Natural delays in clock wiring
• Capacitance on clock line
• Artificial delay due to improper design
• Wiring delays between chips
Figure: two cascaded flip-flops (IN → Q0 → Q1); the skew delay δ sits between CLK1 and CLK0.
• Skew is in opposite direction of dataflow
Figure: two cascaded flip-flops (IN → Q0 → Q1) clocked from a clock network.
• Type of skew is not known.
Clock Skew (Type 1)
Figure: two cascaded flip-flops (IN → Q0 → Q1); CLK1 is CLK0 delayed by the skew δ.
Assuming the setup and hold times for Q0 are met, are the setup and hold times for Q1 met?
Figure: CLK0, IN, Q0, and CLK1 waveforms; CLK1 lags CLK0 by Tskew.
Figure: CLK0, IN, Q0, and CLK1 waveforms; Q0 changes Tcq after each CLK0 edge, and the setup (Tsu) and hold (Th) windows of Q1 are drawn around each CLK1 edge, which lags CLK0 by Tskew.
Hold Time Margin = Tcq(min) − Thold − Tskew > 0
How should combinational propagation delays be handled?
• First two terms are minimum time after clock edge that
a D input changes
• Hold time is earliest time that the input may change
• Clock skew subtracts from the available
hold-time margin
• Compensating for clock skew:
– Longer flip-flop propagation delay
– Explicit combinational delays
– Shorter (even negative) flip-flop hold times
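The compensation options listed above can be compared numerically. The sketch below is an illustration, not a design rule: the function name is hypothetical, the combinational term is an assumption taken from the hold-time margin expression in the summary of skew effects, and the sample values are the 74LS74-style minimum delays used elsewhere in these notes.

```python
def max_tolerable_skew(tcq_min, thold, tcomb_min=0):
    """Skew (ns) at which the hold margin reaches zero.

    The actual skew must stay strictly below this value so that
    Tcq(min) + Tcomb(min) - Thold - Tskew > 0.
    """
    return tcq_min + tcomb_min - thold

print(max_tolerable_skew(tcq_min=10, thold=5))               # 5
print(max_tolerable_skew(tcq_min=10, thold=5, tcomb_min=8))  # 13 (explicit delay added)
print(max_tolerable_skew(tcq_min=10, thold=-1))              # 11 (negative hold time)
```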
Very Long Clock Skew
Figure: CLK0, IN, Q0, and CLK1 waveforms with Tcq, Ts, Th, and a very long Tskew marked.
The clock skew can be so long that setup and hold times
are met. What is wrong with this condition?
Clock Skew and Clock Rate
Figure: CLK0, IN, Q0, and CLK1 waveforms with Tclk, Tcq, Ts, Th, and Tskew marked.
Assuming hold-times are met, how does clock skew
affect minimum clock rate?
Setup Time Margin = Tclk − (Tcq + Tsetup − Tskew) > 0
Tclk > Tcq + Tsetup − Tskew
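As a quick check of the inequality above, the sketch below computes the clock-period bound for the case drawn here, where the receiving clock CLK1 lags CLK0. The function name and the 3 ns skew value are assumptions for illustration; the other values are the 74LS74 worst-case numbers from these notes.

```python
def period_bound_with_skew(tcq_max, tsetup, tskew, tcomb_max=0):
    """Clock-period bound (ns) when the receiving clock lags by tskew.

    Tclk must be strictly greater than the returned value.
    """
    return tcq_max + tcomb_max + tsetup - tskew

print(period_bound_with_skew(tcq_max=40, tsetup=20, tskew=0))  # 60
print(period_bound_with_skew(tcq_max=40, tsetup=20, tskew=3))  # 57
```

Skew in this direction relaxes the setup constraint, but the same skew reduces the hold-time margin, so it is not a free speedup.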
Summary of Effects of Skew
For each type of skew: hold time margin, setup time margin, and max frequency
Skew is in same direction as data
– Hold time margin: Tcq(min) + Tcomb(min) − Thold − Tskew(max) (decrease)
– Setup time margin: Tclk − (Tcq(max) + Tcomb(max) + Tsetup − Tskew(max)) (increase)
– Max frequency, set by the minimum Tclk: Tcq(max) + Tcomb(max) + Tsetup − Tskew(max) (increase)
Skew is in opposite direction of data
– Hold time margin: Tcq(min) + Tcomb(min) − Thold + Tskew(max) (increase)
– Setup time margin: Tclk − (Tcq(max) + Tcomb(max) + Tsetup + Tskew(max)) (decrease)
– Max frequency, set by the minimum Tclk: Tcq(max) + Tcomb(max) + Tsetup + Tskew(max) (decrease)
Skew type not specified or unknown, assume worst case
– Hold time margin: Tcq(min) + Tcomb(min) − Thold − Tskew(max) (decrease)
– Setup time margin: Tclk − (Tcq(max) + Tcomb(max) + Tsetup + Tskew(max)) (decrease)
– Max frequency, set by the minimum Tclk: Tcq(max) + Tcomb(max) + Tsetup + Tskew(max) (decrease)
Clock distribution in ASICs
“Clock-tree” solution
Clock Skew in FPGAs
• Wiring delays in FPGAs are relatively long
– Wiring delay much longer than gate delay
– Custom ICs: gate delays dominate wiring delay
• FPGAs use a segmented routing structure
– Each wire consumes multiple routing segments
– Routing segments connected by routing boxes: switches & gates
– Each wire segment and routing box consume delay
• It is very difficult to guarantee that all clock wires will
arrive at the flip-flops at the same time
Figure: wire segments connected through routing boxes.
Clock Skew in FPGAs
• Solution: Provide a dedicated low-skew clock wire
– Single wire routed throughout entire device
– Delay of wire carefully controlled (same at each FF)
– High-speed (short rise and fall time)
• Spartan 2 (3) FPGA
– Provides 4 (8) global clock wires
– Must use the BUFG primitive to use a clock wire
DLL-Based Clock Design
Figure: clock path from CLK_PIN to CLK through the DLL.
Xilinx Delay-Locked Loop ( DLL )
• Used to synchronize two clocks
– Minimize internal clock skew
– Remove internal clock buffer delays
• Operates using feedback
– Compares the phase of the input clock and feedback
clock
– Adds delay to ensure CLKFB and CLKIN are in phase
The delay of an IBUFG has been characterized and is compensated for in the DLL.
Using feedback from the BUFG output, CLKDLL adds the appropriate
delay to ensure that the input of IBUFG (clk_pin) and the output of BUFG (clk)
are in phase (i.e., the delays of both BUFG and IBUFG are hidden).
Using a Single DLL
This synchronizes CLK_IN with CLK…
That gives control over setup times for external inputs …
Figure: CLKIN_IN → IBUFG → CLKIN_IBUFG → DLL (CLKIN, CLK0) → BUFG → CLK, to rest of FPGA…
This ensures main system is reset for about 4 cycles after DCM locks
DLL Notes
• The DLL may take several thousand clock cycles to
synchronize clocks.
– Output clocks may glitch, spike, or otherwise move spuriously
until the DLL has locked
– The LOCKED output pin indicates that the DLL has achieved
lock
– You must use the LOCKED signal to ensure that flip-flops driven
by the DLL start properly
• Input clock frequency must be stable and fall within
specified frequency range
• Resetting DLL:
– DLL must be reset when input frequency changes
– DLL must be reset when device is reconfigured
• The phase and duty cycle can be controlled with
attributes
– CLK0, CLK90, CLK180, and CLK270 output pins
Dealing with Late Starting Clocks
When FPGA configuration is done, the I/O buffers turn on. However, the input clock pin starts toggling a little late.
Hold the DLL in reset until after the clock starts so the DLL can acquire the lock properly.
Figure: a chain of D flip-flops whose first D input is tied to ‘1’ generates the DLL RST input; these FFs power up to the low state. CLKIN_IN passes through IBUFG to the DLL CLKIN; CLK0 drives a BUFG whose output CLK goes to on-chip circuits and back to CLKFB; LOCKED goes to reset circuitry. Further BUFGs carry CLK2x and CLK/16 (CLKDV_DIVIDE=16).
Coarse Phase Adjustment Using DLLs
• Each DLL has many outputs including:
– CLK0: the “normal” clock output
– CLK90: 90 degree phase delay
– CLK180: 180 degree phase delay
– CLK270: 270 degree phase delay
• Useful in special contexts
• Not useful for “globally synchronous” designs
Figure: the clock is driven to the SDRAM through an OBUF.
The Results (not good)
Setup and Hold times are not met at the SDRAM. The problem is there is
skew between the on-chip clock and the clock that goes to the SDRAM.
Dual DLLs
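Figure: dual-DLL clock circuit. CLK_IN enters through an IBUFG (CLKBUF). EXTDLL produces the off-chip clock SCLK through an OBUF (EXTCLK0), with feedback SCLK_FB returning from off-chip circuits through an IBUF (FBBUF) to its CLKFB. INTDLL produces the on-chip clock CLK through a BUFG, which also feeds its CLKFB. Signals EXTLOCK, INTRST, and LOCKED_OUT (to reset circuitry) are also shown.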
Simulation of Dual DLL
library UNISIM;
use UNISIM.Vcomponents.ALL;
-- Provides component definitions. Also provides simulation
-- capability for your design that is ignored in synthesis…

-- The components below are defined for you in UNISIM.Vcomponents.
-- (std_logic comes from IEEE.std_logic_1164 in the enclosing design unit.)
component IBUFG
  port ( I : in std_logic;
         O : out std_logic);
end component;
component BUFG
  port ( I : in std_logic;
         O : out std_logic);
end component;
component CLKDLL
  port (
    clkin  : in std_logic;
    clkfb  : in std_logic;
    rst    : in std_logic;
    clk0   : out std_logic;
    clkdv  : out std_logic;
    locked : out std_logic
  );
end component;
Metastability
Figure: a bistable element in its two stable states (HIGH/LOW and LOW/HIGH).
Analog Analysis
• Assume pure CMOS thresholds, 5V rail
• Theoretical threshold center is 2.5 V
Dynamic Analysis
Figure: transfer curves of the top inverter and the bottom inverter, drawn with a reflection line on axes Vin1 = Vout2 and Vout1 = Vin2.
Metastable State
• Metastability is inherent in any bistable circuit
• Two stable points, one metastable point
Figure: the two inverter transfer curves on axes Vin1 = Vout2 and Vout1 = Vin2, with the 2.5 V point marked on both axes.
Why Study Metastability?
• All real systems are subject to it
– Problems are caused by “asynchronous inputs” that do
not meet flip-flop setup and hold times.
• Especially severe in high-speed systems
– since clock periods are so short, “metastability
resolution time” can be longer than one clock period.
• Many digital designers, products, and
companies have been burned by this
phenomenon.
Metastability resolution time
tr = resolution time: the extra time needed to resolve the logic state
Figure: flip-flop output waveform showing tcq followed by tr.
Metastability Window
Assume that data arrives uniformly over the clock cycle, Tclk.
The probability that data will arrive during T0 in a clock period Tclk is:
$P = \frac{T_0}{T_{clk}} = T_0 f$
This is the probability of a metastable event happening.
The probability of a metastable event lasting longer than some time, tr, is:
$P = e^{-t_r/\tau}$
τ: “Resolving Time Constant”
Figure: clock and data waveforms showing the window T0 (marked HOLD) within the clock period Tclk, and the delay tcq.
How Resolution Time is Measured
Figure: oscilloscope display of flip-flop output transitions between VOLmax and VOHmin after tcqMax, showing the exponential decay of failures.
A digital oscilloscope is used to count failure events by zones
(masks). The width of each mask represents a time unit for
comparing events at different times. The tallies of these
masks reveal the population decay rate. The number of masks
should be chosen so that enough of the decay is observed.
Figure: failure counts plotted on a semi-log scale; the slope of the line gives $-\frac{1}{\tau} = \frac{b}{a}$.
Mean Time Between Failure
Error Rate
= (frequency of asynchronous events, a)
× (the probability of a metastable event happening, P = T0 f)
× (the probability of a metastable event lasting longer than some time tr, $P = e^{-t_r/\tau}$)
$\mathrm{MTBF}(t_r) = \frac{e^{t_r/\tau}}{T_0 \cdot f \cdot a}$
τ and T0 are flip-flop dependent constants
Typical flip-flop metastability parameters
XAPP 094
• T0 = 0.1 · 10^-9
– XC4000-3 IOB: τ = .062 ns
– XC4000-3 CLB: τ = .051 ns
MTBF Metastability Example
10 MHz microprocessor clock
Asynchronous input changes 100,000 times/second
T0 = 0.4 (for 74LS74)
τ = 1.5ns (for 74LS74)
Output must be stable 80 ns after Tcq
$\mathrm{MTBF}(t_r) = \frac{e^{t_r/\tau}}{T_0 \cdot f \cdot a}$
$\mathrm{MTBF}(80\ \text{ns}) = \frac{e^{80/1.5}}{0.4 \cdot 10^{7} \cdot 10^{5}} = 3.6 \cdot 10^{11}$ sec
(~100 centuries)
Note: if you ship 10,000 copies of your system, one will fail
every year
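The arithmetic in this example can be reproduced directly. The sketch below is a plain evaluation of the MTBF formula with the 74LS74 numbers given above; the function and variable names are made up for illustration, and T0 is assumed to be in seconds.

```python
import math

def mtbf_seconds(t_r, tau, t0, f_clk, a):
    """MTBF(tr) = e^(tr/tau) / (T0 * f * a), with all times in seconds."""
    return math.exp(t_r / tau) / (t0 * f_clk * a)

mtbf = mtbf_seconds(t_r=80e-9, tau=1.5e-9, t0=0.4, f_clk=10e6, a=100e3)
years = mtbf / (365.25 * 24 * 3600)
print(f"MTBF = {mtbf:.1e} s")  # about 3.6e+11 s
print(f"MTBF for one unit: {years:,.0f} years")
print(f"Across 10,000 units: {years / 10_000:.2f} years between failures")
```

Because tr sits in the exponent, small changes in the available resolution time change the MTBF by orders of magnitude.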
Asynchronous Inputs and Synchronizers
Asynchronous inputs
• Not all inputs are synchronized with the clock
– Keystrokes
– Switches & buttons
– Sensor inputs
– Interrupt signals
– Asynchronous communication protocols (UART, etc.)
• Asynchronous inputs must be synchronized with
system clock before being used within the
system
– Asynchronous inputs can cause data integrity
problems if the signals are not synchronized properly.
Synchronization Problems
• A flip-flop output may not operate
correctly when setup and hold times are
not met.
– Output may enter a metastable state
– Time in this state is theoretically unbounded
(although probability decreases exponentially
with time)
– Some gates may interpret metastable state
as a “1” while others will interpret it as a “0”
A One-Stage Synchronizer
Metastability Resolution Time
• Max tr - Maximum metastability resolution time
• Maximum time that the output can remain
metastable without causing synchronizer failure
– The flip-flop may be metastable for a short time and
return to normal before being sampled by the next FF
tr < setup-time margin
tr < Tclk − tcq(max) − tcomb(max) − tsetup
A Two-Stage Synchronizer
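The bound on tr can be evaluated for a given synchronizer. The sketch below uses purely hypothetical timing values (a 20 ns clock, 4 ns clock-to-Q, 2 ns setup) to show how any logic placed between the two synchronizer flip-flops eats into the resolution time; the function name is also an assumption.

```python
def max_resolution_time(tclk, tcq_max, tsetup, tcomb_max=0):
    """Upper bound on tr (ns): Tclk - tcq(max) - tcomb(max) - tsetup."""
    return tclk - tcq_max - tcomb_max - tsetup

# Hypothetical values, chosen only for illustration.
print(max_resolution_time(tclk=20, tcq_max=4, tsetup=2))               # 14
print(max_resolution_time(tclk=20, tcq_max=4, tsetup=2, tcomb_max=5))  # 9
```

This is one reason synchronizer flip-flops are normally connected back to back with no combinational logic between them.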
MTBF Metastability Example
- Asynchronous interrupt to a microprocessor system
- 10 MHz system clock
- Use the following synchronizer circuit
- Use 74LS74 parts
Note: if you ship 10,000 copies of your system, one will fail
every year
MTBF Metastability Example #2
Multiple-cycle synchronizer
Multiple-cycle synchronizer
Clock-skew problem
Cascaded Synchronizer
Figure: ASYNCIN passes through a chain of flip-flops with outputs MET1, MET2, ..., METn, all clocked by CLOCK.
Synchronizers Between Different Clock Domains
Figure: two synchronous domains, clocked by CLK1 and CLK2, connected by a synchronizer in each direction.
Synchronization Objectives:
• Don’t miss any events, or drop any data
• Don’t duplicate any events, or data
• Maintain event/data order
Level Synchronization
Figure: clk1, in, clk2, and out waveforms for a level synchronizer clocked by CLK2.
Pulse Synchronization
Figure: clk1, in, clk2, and out waveforms for a pulse synchronizer built from flip-flops with RST inputs, clocked by CLK1 and CLK2.
Divergence of an Asynchronous Signal
SYNC1 ≠ SYNC2
Example of the Problem
One-hot State Machine
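Figure: state diagram with states A and B, where the transition from A to B is taken on in; the asynchronous input IN (ASYNC) feeds both state flip-flops, which are clocked by CLOCK.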
1. in is asserted asynchronously.
2. in changes near clock edge, both flip-flops go metastable.
3. State A is reset, but state B is not set.
4. State machine is stuck in an invalid state, game over!
The Way to Do It
Figure: signals from combinational logic in the CLKA domain each pass through a synchronizer clocked by CLKB; for example, Y becomes Ysync.
The logic beyond the synchronizers still matters, because the propagation
delay through the synchronizers is not predictable, which results in the loss of
correlation among signals crossing clock domains.
The relative timing (order of changes) that exists between X and Y does not
exist between Xsync and Ysync.
Simultaneity of Xsync and Ysync cannot be assured during synchronization.
Sequence of Xsync and Ysync cannot be assured during synchronization if X
and Y change concurrently.
Figure: combinational logic in the CLKA domain consolidates the signals before a single synchronizer clocked by CLKB.
Consolidate signals before they are sent to reduce the number of registers
and eliminate the issue of relative timing between synchronized signals.
When planning the design, it's important to eliminate the need for these
arrival order (timing) dependencies between synchronized signals.
Transferring Data Across Different Clock Domains
Multi-Clock Systems
• With increasing system integration, systems
operating with multiple I/O standards using
multiple asynchronous clock frequencies are
becoming more common.
• Because of the asynchronous nature of these
designs, passing data or control signals between
logic operating on different clock frequencies
presents a special set of problems, often
dependent on the clock frequencies.
• Errors are nearly impossible to detect in
simulation and easily missed in product validation.
Transferring Data from ACLK to BCLK
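Basic structure of data transfer circuit from ACLK domain to BCLK domain.
If you put a synchronizer in the control path, the data path will work properly.
Figure: data path and control path. Data path: ADATA[7:0] is loaded into AREG (clock enable ALOAD, clocked by ACLK), producing SDATA[7:0], which is loaded into BREG (clock enable BLOAD, clocked by BCLK), producing BDATA[7:0]. Control path: ALOAD enables a pulse catcher (output NEWBYTE, clocked by ACLK), followed by a synchronizer made of FF1 (output META) and FF2 (output BLOAD), clocked by BCLK.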
BLOAD Circuit Timing
tr = Tbclk − tcq − tsetup
Figure: pulse catcher (NEWBYTE) and synchronizer (FF1 output META, FF2 output BLOAD) with a timing diagram of ACLK, ALOAD, SDATA[7:0], NEWBYTE, BCLK, META, BLOAD, and BDATA[7:0]; each BCLK period is divided into tcq, tr, and tsetup.
Figure: the same signals for a failing transfer (marked "woops").
Timing Problems
• ALOAD comes too early
• NEWBYTE does not get set, second ALOAD is lost
• BDATA gets the wrong data
• First data is missed
Transfer More Data at a Time
Buffer two bytes of Adata before transferring both data
bytes between ACLK and BCLK domains
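Figure: ADATA[7:0] is loaded into AREG1 and AREG2 (clock enables ALOAD1 and ALOAD2, clocked by ACLK); a double buffer holds both bytes (SDATA) until BLOAD loads them into BREG1 and BREG2 (clocked by BCLK); an ACK signal is also shown.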
Handshake Protocol
The handshake protocol must ensure that
• the data holds stable long enough for circuitry to sample
it once in the destination-clock domain
• a new data-valid signal can not be asserted until the
destination has acknowledged that the first data valid
signal was received
• data transfer works independent of both TX and RX clock
frequencies
Figure: TX Data in Domain A and RX Data in Domain B, connected by Data, Valid, and Ack signals.
Four-Phase Handshake Protocol
Figure: Req (data valid) rises (1), Ack (data ack) rises (2), Req falls (3), Ack falls (4).
valid data
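The four numbered edges of the handshake can be traced in software. This is only a behavioral sketch under simplifying assumptions: synchronizer delays and clocks are ignored, and the function name and step notes are made up for illustration.

```python
def four_phase_handshake(items):
    """Return the (req, ack, note) steps taken to transfer each item."""
    steps = []
    req, ack = 0, 0
    for item in items:
        req = 1  # 1: sender drives valid data and raises Req
        steps.append((req, ack, f"Req high, data {item!r} valid"))
        ack = 1  # 2: receiver captures the data and raises Ack
        steps.append((req, ack, f"Ack high, receiver captured {item!r}"))
        req = 0  # 3: sender sees Ack and lowers Req
        steps.append((req, ack, "Req low"))
        ack = 0  # 4: receiver sees Req low and lowers Ack
        steps.append((req, ack, "Ack low, ready for next transfer"))
    return steps

for step in four_phase_handshake(["A", "B"]):
    print(step)
```

Each transfer costs four signal transitions, which is the latency overhead that the two-phase "toggle" protocol is meant to reduce.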
Four-Phase Level-Handshake Protocol
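Figure: the sender FSM drives REQ through a synchronizer (REQsync) to the receiver FSM; the receiver FSM drives ACK back through a synchronizer (ACKsync). The data bus between the two enabled (EN) registers is not synchronized.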
The Two-Phase Handshake “Toggle” Protocol
Figure: timing diagram of SDATA, ALOAD, BLOAD, ACK, and BDATA, with a circuit connecting BLOAD, ALOAD, and ACK.
Using FIFOs to Transfer Data
• Most high-speed data transfers occur in bursts
rather than evenly spaced data streams
– Large blocks of data arriving at random times
– Often modeled with a Poisson distribution: $p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
– Example: Disk & Video I/O
• It is difficult to transfer such data using the
simple circuit described earlier
– Can’t keep up during burst data mode
– Is sitting idle when no data transfer is occurring
• First-in first-out (FIFO) memories are often
used to transfer such data
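To get a feel for how bursty Poisson arrivals are, the distribution mentioned above can be evaluated directly. In this sketch the mean rate of 2 arrivals per interval is an assumed value, and the function name is hypothetical.

```python
import math

def poisson_pmf(x, lam):
    """p(x) = e^(-lam) * lam^x / x!  for x arrivals in one interval."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.0  # assumed mean number of arrivals per interval
for x in range(7):
    print(x, round(poisson_pmf(x, lam), 4))

# Chance of more than 4 arrivals in an interval, i.e. a burst.
print(1.0 - sum(poisson_pmf(x, lam) for x in range(5)))
```

Even with a modest average rate, intervals with several arrivals occur regularly, which is the load a FIFO is there to absorb.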
FIFO-Handshake Protocols
• Due to bursty data and the difference in clock speeds, a
latency-absorbing FIFO often acts as a data buffer
between different clock domains.
• In this situation, the FIFO empty and full conditions
perform the handshaking.
• Asynchronous FIFOs must pass the read and write
pointers between clock domains (write clock/read clock).
• Because the pointers contain multiple bits, they can
introduce a race condition through the synchronizer. To
avoid this problem, you must implement the input and
output pointers as Gray Code counters to ensure that
only one bit changes at a time.
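The single-bit-change property of Gray code can be demonstrated in a few lines. This is a software sketch only: the helper names are hypothetical, and the 3-bit width is an arbitrary choice for the printout.

```python
def binary_to_gray(n):
    """Convert a binary count to its Gray code equivalent."""
    return n ^ (n >> 1)

def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

WIDTH = 3
codes = [binary_to_gray(i) for i in range(2 ** WIDTH)]
print([format(c, f"0{WIDTH}b") for c in codes])
# ['000', '001', '011', '010', '110', '111', '101', '100']

# Every step, including the wrap-around, changes exactly one bit.
print(all(bits_changed(codes[i], codes[(i + 1) % len(codes)]) == 1
          for i in range(len(codes))))  # True

# A plain binary counter going from 3 (011) to 4 (100) changes three bits.
print(bits_changed(3, 4))  # 3
```

With only one bit in flight, a synchronizer that samples during the change sees either the old pointer value or the new one, never an unrelated value.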
Synchronizing FIFO
• Valid data on in (out)
when ivalid (ovalid) is true
• FIFO (output client) able
to accept data on in (out)
when iready (oready) is
true
• Data exchanged on input
(output) when ivalid &
iready (ovalid & oready)
• iready = not full
• oready = not empty
FIFO Full and Empty
• Head points to next location from which to read
• Tail points to next location to which to write
• FIFO empty when head = tail and no data in FIFO
• FIFO full when head = tail and all locations full
• Hard to tell the two apart
• Simple solution – leave one location empty
– FIFO full when (tail+1) = head (gray coded increment)
• Need to compare head (clkout domain) with tail
(clkin domain) to compute full and empty
– Requires synchronizers to get head in clkin domain and
tail in clkout domain.
– Gray code these two counters
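The "leave one location empty" rule can be modeled in software. The sketch below is a single-clock model with a hypothetical class name; it ignores the clock-domain crossing, synchronizers, and Gray coding described above and shows only how the full and empty tests differ.

```python
class RingFifo:
    """Circular buffer that keeps one slot unused to tell full from empty."""

    def __init__(self, depth):
        self.mem = [None] * depth
        self.depth = depth
        self.head = 0  # next location from which to read
        self.tail = 0  # next location to which to write

    def empty(self):
        return self.head == self.tail

    def full(self):
        return (self.tail + 1) % self.depth == self.head

    def push(self, value):
        if self.full():
            return False  # writer must wait (iready would be false)
        self.mem[self.tail] = value
        self.tail = (self.tail + 1) % self.depth
        return True

    def pop(self):
        if self.empty():
            return None  # nothing to read (oready would be false)
        value = self.mem[self.head]
        self.head = (self.head + 1) % self.depth
        return value

fifo = RingFifo(depth=4)
print([fifo.push(x) for x in "abcd"])  # [True, True, True, False]
print(fifo.pop(), fifo.empty())        # a False
```

A depth of 4 therefore stores at most 3 items, which is the price paid for the simpler full and empty comparison.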
Figure: BlockRAM FIFO with DATA IN (DataIn, WE) on the Write Clock side and DATA OUT (DataOut) on the Read Clock side.
Common Pitfalls
• In level-handshake protocols, you must initiate a second round trip
to restore the handshake signals to their original logic levels in
preparation for the next handshake, which doubles the protocol
latency.
• Clocking the data, as well as the strobe signal, through the
synchronizers causes a race condition. Only put synchronizers on the
control signals, not the data lines.
• A designer must neither assume nor require any fixed relationship
between the source- and destination clock frequencies.
• Simulation cannot hope to duplicate the infinite number of clock and
signal-edge relationships that are possible in a clock-domain-
crossing design.
• When inspecting a clock-domain-crossing design, you must fully
analyze two cases.
– First, assume one extreme clock-frequency relationship (10 to one, for
example) and manually analyze the behavior of the implementation on a
timing diagram for at least two consecutive transactions.
– Then, repeat the analysis, assuming the opposite clock-frequency
relationship.
Homework Unit 4
1. What is the maximum frequency at which a synchronous system can operate, given:
Tcq (clock-to-q prop time) = 8 ns (min) and 10 ns (max),
Tsetup = 3 ns, Thold = 5 ns,
Tprop (total combinational logic prop time) = 12 ns (min) and 21 ns (max),
Tskew = 1 ns max.
3. For the circuit below, if asyncin changes 10^6 times per second, the clock frequency is 333 MHz, setup time is 0.5 ns, clock-to-q prop time is 1 ns, T0 = 10^-10 sec and τ = .051 ns, what is the mean time to failure of this circuit? Is it acceptable if you are planning to sell 100,000 of them?
5. Consider the pulse synchronizer circuit below.
Figure: IN enables (EN) a flip-flop with output TRAP, clocked by CLK1; it is followed by flip-flops with outputs META and OUT, clocked by CLK2; RST inputs are shown.
a. Show how you could alter the circuit to produce a two clk2-cycle pulse in response to a single cycle pulse at the input.
b. Suppose the MTBF of the synchronizer was not enough. Show how you could alter the circuit to increase the MTBF.
c. Considering the worst case scenario, what is the maximum rate (or minimum period) of input pulses that can be handled by this circuit, in terms of Tclk1, Tclk2, Tsetup, Thold, and clock-to-q time Tcq?
Hint: Worst case scenario is when Tclk2 > Tclk1. Trapped pulse TRAP comes barely too late to meet the setup and hold times of the middle flip-flop, which goes meta-stable. META eventually resolves low, and it takes another clk2 cycle to get meta to go high.