
Content Addressable Memories
Cell Design and Peripheral Circuits
CAM: Introduction
• CAM vs. RAM

RAM: an address goes in and the word stored at that address comes out.
Address In = 4 → Data Out = 1 0 0 0 1 1 0 1

Address  Data
5        0 0 1 1 0 1 1 1
4        1 0 0 0 1 1 0 1
3        1 0 1 1 1 1 0 1
2        1 1 0 0 1 0 1 1
1        0 0 0 0 1 1 0 1
0        0 1 0 1 0 1 0 1

CAM: a data word goes in and the address of the matching entry comes out.
Data In = 1 0 0 0 1 1 0 1 → Address Out = 3

Address  Data
5        1 1 0 0 0 1 1 1
4        0 0 0 1 1 1 0 1
3        1 0 0 0 1 1 0 1
2        1 1 0 0 1 0 1 1
1        0 0 0 0 1 1 0 1
0        0 1 0 1 0 1 0 1
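The difference between the two lookups can be sketched in a few lines of Python. This is only a behavioural model: the names `ram_read` and `cam_search` are made up for illustration, and a real CAM compares all rows in parallel in hardware rather than iterating over them.

```python
# Behavioural sketch only; ram_read and cam_search are illustrative names.
# Table contents are the example words from the slide (address -> stored word).
RAM_TABLE = {5: "00110111", 4: "10001101", 3: "10111101",
             2: "11001011", 1: "00001101", 0: "01010101"}
CAM_TABLE = {5: "11000111", 4: "00011101", 3: "10001101",
             2: "11001011", 1: "00001101", 0: "01010101"}


def ram_read(table, address):
    """RAM: address in, data out."""
    return table[address]


def cam_search(table, word):
    """CAM: data in, address(es) of matching entries out."""
    return sorted(addr for addr, stored in table.items() if stored == word)


print(ram_read(RAM_TABLE, 4))
print(cam_search(CAM_TABLE, "10001101"))
```

`cam_search` returns a list because more than one stored word may match; a practical CAM would typically follow the match lines with a priority encoder.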
CAM: Introduction
• Binary CAM Cell
ML is pre-charged to V_DD.
Match: ML remains at V_DD.
Mismatch: ML discharges.
[Figure: binary CAM cell schematic. Signals: BL1, BL1c, WL, SL1, SL1c, ML; storage nodes BL1_cell, BL1c_cell; transistors P1, P2, N1 to N8.]
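The pre-charge and evaluate behaviour of the match line can be mimicked with a small logical model. This is a sketch under the assumption that every cell of a word shares one ML and that any single mismatching cell is enough to discharge it; `match_line_stays_high` is a hypothetical helper, not part of the original material.

```python
def match_line_stays_high(stored_word, search_word):
    """Logical model of one binary CAM word (illustrative only).

    The ML starts pre-charged (True = V_DD). Each cell compares its stored
    bit with the bit on the search lines; a mismatching cell pulls ML low.
    """
    if len(stored_word) != len(search_word):
        raise ValueError("stored and search words must have equal length")
    ml_high = True  # pre-charge phase
    for stored_bit, search_bit in zip(stored_word, search_word):
        if stored_bit != search_bit:
            ml_high = False  # evaluate phase: pull-down path conducts
    return ml_high


print(match_line_stays_high("10001101", "10001101"))
print(match_line_stays_high("10001101", "10001100"))
```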
CAM: Introduction
• Ternary CAM (TCAM)

Example 1: Input Keyword = X X X 0 1 1 0 1

Address  Stored word        Result
5        0 0 X 0 0 1 1 1
4        0 1 0 0 1 1 0 1    Match
3        0 0 0 1 1 1 0 1
2        1 1 0 0 1 0 X 1
1        1 0 1 0 1 1 0 1    Match
0        0 1 0 X 0 1 0 1

Example 2: Input Keyword = 1 0 0 0 1 1 0 1

Address  Stored word        Result
5        X X X X X 1 1 1
4        X X X X 1 1 0 1    Match
3        X X X 1 1 1 0 1
2        X X 0 0 1 0 1 1
1        X 0 0 0 1 1 0 1    Match
0        0 1 0 1 0 1 0 1
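Both examples follow one rule: a bit position matches if the stored bit is X, the key bit is X, or the two bits are equal. A minimal Python sketch of that rule, using the two tables from the slide (`ternary_match` and `tcam_search` are illustrative names):

```python
def ternary_match(stored, key):
    """True if every position matches; 'X' on either side is a don't-care."""
    return all(s == "X" or k == "X" or s == k for s, k in zip(stored, key))


def tcam_search(table, key):
    """Return the addresses of all stored words that match the key."""
    return sorted(addr for addr, word in table.items() if ternary_match(word, key))


# Example 1: don't-care bits in the search key (and a few in the table).
TABLE_1 = {5: "00X00111", 4: "01001101", 3: "00011101",
           2: "110010X1", 1: "10101101", 0: "010X0101"}
# Example 2: don't-care bits in the stored words.
TABLE_2 = {5: "XXXXX111", 4: "XXXX1101", 3: "XXX11101",
           2: "XX001011", 1: "X0001101", 0: "01010101"}

print(tcam_search(TABLE_1, "XXX01101"))
print(tcam_search(TABLE_2, "10001101"))
```

Both searches report addresses 1 and 4, as marked on the slide.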
CAM: Introduction
• TCAM Cell
Global masking ⇒ SLs
Local masking ⇒ BLs

BL1  BL2  Logic
0    1    0
1    0    1
1    1    X
0    0    N.A.

[Figure: TCAM cell built from two RAM cells and comparison logic. Signals: BL1, BL1c, BL2, BL2c, WL, SL1, SL2, ML.]
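The two-bit encoding can be turned into a small behavioural model of the comparison logic. The stored-symbol encoding below is the one in the table; the search-line encoding (0 as SL1 SL2 = 0 1, 1 as 1 0, masked as 0 0) and the pull-down expression are assumptions made for this sketch, chosen so that a stored X and a masked search bit never discharge the ML.

```python
# (BL1, BL2) encoding of the stored ternary symbol, as in the slide's table.
STORED_ENCODING = {"0": (0, 1), "1": (1, 0), "X": (1, 1)}
# (SL1, SL2) encoding of the search symbol: an ASSUMPTION for this sketch.
SEARCH_ENCODING = {"0": (0, 1), "1": (1, 0), "X": (0, 0)}


def cell_mismatch(stored_symbol, search_symbol):
    """True if this cell would open a pull-down path on the ML (assumed logic)."""
    bl1, bl2 = STORED_ENCODING[stored_symbol]
    sl1, sl2 = SEARCH_ENCODING[search_symbol]
    # One path conducts when SL1 = 1 and BL1 = 0, the other when SL2 = 1 and BL2 = 0.
    return bool((sl1 and not bl1) or (sl2 and not bl2))


def word_matches(stored_word, search_word):
    """A word matches when no cell discharges the shared match line."""
    return not any(cell_mismatch(s, k) for s, k in zip(stored_word, search_word))


print(word_matches("0X1", "011"))  # local masking via stored X
print(word_matches("001", "0X1"))  # global masking via SL1 = SL2 = 0
print(word_matches("001", "011"))  # plain mismatch
```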
CAM: Introduction
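• DRAM based TCAM Cell
Advantages:
- Higher bit density
Disadvantages:
- Slower table update
- Expensive process
- Refreshing circuitry
- Scaling issues (leakage)
[Figure: DRAM-based TCAM cell. Signals: BL1, BL2, WL, SL1, SL2, ML; storage nodes BL1_cell, BL2_cell; transistors N3 to N8.]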
CAM: Introduction
• SRAM based TCAM Cell
Advantages:
- Standard CMOS process
- Fast table update
Disadvantages:
- Large area (16T)
[Figure: SRAM-based TCAM cell. Signals: BL1, BL1c, BL2, BL2c, WL, SL1, SL2, ML; storage nodes BL1c_cell, BL2c_cell.]
CAM: Introduction
• Block diagram of a 256 x 144 TCAM
[Figure: array of 256 words, each with 144 CAM cells (cell 0 to cell 143) sharing one match line (ML0 to ML255). SL drivers drive the search lines SL1(0)/SL2(0) to SL1(143)/SL2(143); each cell column has bit lines BL1c/BL2c; each match line ends in an ML sense amplifier (MLSA) with output MLSO(0) to MLSO(255).]
CAM: Introduction
• Why low-power TCAMs?
Parallel search ⇒ very high power
(2Mb Sibercore TCAM: 66 MHz, 66 Msps, 3.4 W)

IPv6, OC-768 ⇒ larger word size, larger number of entries ⇒ high power

Embedded applications (SoC)
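To put the quoted figures in perspective, power divided by search rate gives the energy spent per search. The short calculation below uses only the 3.4 W and 66 Msps numbers from the slide; the function name is illustrative.

```python
def energy_per_search_joules(power_watts, searches_per_second):
    """Average energy per search = power / search rate."""
    return power_watts / searches_per_second


energy = energy_per_search_joules(3.4, 66e6)
print(f"{energy * 1e9:.1f} nJ per search")
```

At roughly 50 nJ per search, every lookup is paid for across the whole array, which is why the later slides focus on match-line and search-line energy.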
CAM: Introduction
• Why high-performance TCAMs?
OC-768 ⇒ 135M packets/s (7.4 ns/packet)

Application complexity ⇒ multiple searches

IPv6 ⇒ larger word size ⇒ larger search time
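The per-packet time budget follows directly from the packet rate. The sketch below reproduces the 7.4 ns figure and shows how the budget shrinks when several searches are needed per packet; the `searches_per_packet` parameter is a hypothetical illustration, not a number from the slides.

```python
def time_per_search_ns(packets_per_second, searches_per_packet=1):
    """Time available for each search, in nanoseconds."""
    return 1e9 / (packets_per_second * searches_per_packet)


print(round(time_per_search_ns(135e6), 1))     # one search per packet
print(round(time_per_search_ns(135e6, 4), 2))  # hypothetical: four searches per packet
```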
CAM: Design Techniques
• Cell Design: 12T Static TCAM cell*
'0' is retained by leakage (V_WL ~ 200 mV)
Advantages:
- High density
Disadvantages:
- Leakage ↑ (3 orders of magnitude)
- Noise margin
- Soft errors (node S)
- Unsuitable for READ
CAM: Design Techniques
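• Cell Design: NAND vs. NOR Type CAM
NAND-type CAM:
- Advantage: low power
- Disadvantages: charge-sharing, slow
[Figure: NAND-type match line (ML_NAND) with CAM cells 0 to N connected in series and sensed by an SA, labelled M (match); NOR-type match line (ML_NOR) with CAM cells 0 to N connected in parallel and sensed by an SA, labelled MM (mismatch). Cell schematics for the NAND-type and NOR-type CAM show BL1, BL1c, WL, SL1, SL1c and V_DD.]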
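The two match-line styles can be contrasted with a purely logical model. This is a sketch under the usual assumptions: in the NOR style, cells sit in parallel and any mismatching cell discharges the pre-charged ML; in the NAND style, cells sit in series and the chain conducts only when every cell matches. The function names are illustrative.

```python
def nor_ml_discharges(cell_matches):
    """NOR-type: parallel pull-downs, so one mismatch discharges the ML."""
    return any(not m for m in cell_matches)


def nand_ml_conducts(cell_matches):
    """NAND-type: series chain, so the ML conducts only if all cells match."""
    return all(cell_matches)


def count_switching_mls(words, nand):
    """Number of match lines that change state in one search."""
    if nand:
        return sum(nand_ml_conducts(w) for w in words)
    return sum(nor_ml_discharges(w) for w in words)


# Hypothetical search: one matching word and three mismatching words.
words = [[True, True, True], [True, False, True],
         [False, False, True], [False, True, True]]
print(count_switching_mls(words, nand=False))  # NOR: mismatching MLs discharge
print(count_switching_mls(words, nand=True))   # NAND: only the matching ML conducts
```

Since most words mismatch in a typical search, far fewer NAND-type match lines switch, which is consistent with the low-power advantage listed on the slide. The long series chain is also the reason for its slower evaluation.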
CAM: Design Techniques
• MLSA Design: Conventional
Pre-charge ML to V_DD
Match ⇒ V_ML = V_DD
Mismatch ⇒ V_ML = 0
[Figure: conventional MLSA. Signals: PRE, ML, MLSO, V_DD; mismatch (MM) pull-down paths on the ML.]
CAM: Design Techniques
• MLSA Design: Current Race Sensing*
[Figure: current-race MLSA. A dummy ML followed by a delay generates MLOFF. Signals: RST, RSTc, MLOFF, ML, MLSO, V_DD; labels MATCH and MM (mismatch).]
CAM: Design Techniques
• MLSA Design: Current Race Sensing
Advantages:
- No need to reset SLs in every clock cycle
- Lower ML voltage swing: (V_th + ΔV) < V_DD
Trade-off:
- Speed ↑ ⇒ Current ↑ ⇒ Voltage Margin ↓
[Figure: waveforms of ML [0], MLSO [0] and ML [1], with the voltage margin marked.]
CAM: Design Techniques
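• MLSA Design: Charge Redistribution*
Fast pre-charge of ML through M_REF
Mismatch ⇒ SP = 0 ⇒ MLSO = 1
I_ML > I_REF > leakage
Advantages:
- V_ML ≈ (V_REF − V_th)
Disadvantages:
- FAST_PRE ⇒ high power
[Figure: charge-redistribution MLSA. Signals: FAST_PRE, RST, V_REF, V_DD, SP, MLSO, I_REF, ML; capacitances C_ML, C_SP; transistor M_REF.]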
CAM: Design Techniques
• MLSA Design: Charge Injection*
Reset ML and pre-charge C_INJ
Charge share C_INJ and C_ML
Match ⇒ V_ML = C_INJ x V_DD / (C_INJ + C_ML)
Mismatch ⇒ V_ML = 0
Advantages:
- Small V_ML
Disadvantages:
- Poor noise margin
- Area penalty (C_INJ)
[Figure: charge-injection MLSA. Signals: PRE, CHARGE_IN, RST, ML, MLSO, V_DD; capacitances C_INJ, C_ML; offset SA.]
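The match-case voltage is a plain capacitive divider, so it is easy to evaluate. In the sketch below the capacitance values are hypothetical and chosen only to show the order of magnitude; only the formula V_ML = C_INJ x V_DD / (C_INJ + C_ML) comes from the slide.

```python
def charge_injection_vml(c_inj, c_ml, v_dd, match):
    """ML voltage after charge sharing (match) or after discharge (mismatch)."""
    if not match:
        return 0.0
    return c_inj * v_dd / (c_inj + c_ml)


# Hypothetical values: C_INJ = 20 fF, C_ML = 180 fF, V_DD = 1.8 V.
v_match = charge_injection_vml(20e-15, 180e-15, 1.8, match=True)
v_mismatch = charge_injection_vml(20e-15, 180e-15, 1.8, match=False)
print(f"match: {v_match:.3f} V, mismatch: {v_mismatch:.3f} V")
```

With these assumed values the sense amplifier has to resolve about 0.18 V against 0 V, which illustrates both the small swing and the poor noise margin listed above. A larger C_INJ widens the margin at the cost of area.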
CAM: Design Techniques
• Low Power: Selective Pre-charge*
MLs: two segments
If MATCH in pre-search ⇒ main-search
Number of bits in pre-search depends on data statistics
[Figure: ML split into ML1 (sensed by MLSA1, output MLSO1) and ML2 (sensed by MLSA2, output MLSO2), shown for the PRE-SEARCH and MAIN-SEARCH phases.]
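The two-phase search can be sketched behaviourally: the first few bits are compared on ML1, and ML2 is evaluated only for words that survive the pre-search. The model below is a simplification for binary words; `pre_bits` and the function names are illustrative assumptions.

```python
def selective_precharge_search(stored, key, pre_bits):
    """Return (match, main_search_activated) for one word."""
    pre_match = stored[:pre_bits] == key[:pre_bits]   # pre-search on ML1
    if not pre_match:
        return False, False                           # ML2 is never evaluated
    main_match = stored[pre_bits:] == key[pre_bits:]  # main-search on ML2
    return main_match, True


def main_searches_activated(table, key, pre_bits):
    """Count how many words reach the main-search phase."""
    return sum(selective_precharge_search(w, key, pre_bits)[1] for w in table)


table = ["10001101", "10110001", "01001101", "00001101"]
print(selective_precharge_search("10001101", "10001101", pre_bits=3))
print(main_searches_activated(table, "10001101", pre_bits=3))
```

Only words whose leading bits agree with the key spend energy on the second segment. How many bits to put in the pre-search therefore depends on how well those bits discriminate in the stored data.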
CAM: Design Techniques
• Low Power: Dual-ML TCAM*
MLSA1 is enabled first
MLSA2 is enabled if MLSO1 = 1
[Figure: TCAM word with two match lines. CAM cells 0 to N, bit lines BL1c/BL2c, search lines SL1(0)/SL2(0) to SL1(N)/SL2(N); ML1 is sensed by MLSA1 (output MLSO1) and ML2 by MLSA2 (output MLSO2).]
CAM: Design Techniques
• Low Power: Dual-ML TCAM
Cap(ML1) = Cap(ML2) = ½ C(ML)
Same speed, 50% less energy (ideally!)

Parasitic interconnects degrade both speed and energy
Additional ML increases coupling capacitance
CAM: Design Techniques
• Low Power: Dual-ML TCAM
Simulation results (144 bits)*
  • Interconnect cap. = 27 fF
  • W/L = 0.6 µm / 0.18 µm

            Old    New    Difference
T_S (ns)    8.14   8.46   4% ↑
E_1 (fJ)    769    426    45% ↓
E_2 (fJ)    769    973    26% ↑
CAM: Design Techniques
• Low Power: Dual-ML TCAM*
E_AVG = P_ML1 x E_1 + (1 − P_ML1) x E_2
SA1 cannot detect Type I
For M mismatches, P_ML1 = 1 − (0.5)^M

Mismatch  SL1  SL2  BL1  BL2
Type I    0    1    1    0
Type II   1    0    0    1

[Figure: ML1 pull-down path controlled by SL1 and BL1c.]
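The average-energy expression can be evaluated directly. The sketch below assumes that E_1 and E_2 are the dual-ML values from the simulation results slide (426 fJ and 973 fJ) and that the traditional word costs 769 fJ per search; reusing those numbers here is an assumption of this example.

```python
E1_FJ = 426.0           # dual-ML energy when SA1 alone detects the mismatch (assumed)
E2_FJ = 973.0           # dual-ML energy when SA2 must also be enabled (assumed)
E_TRADITIONAL_FJ = 769.0


def p_ml1(mismatches):
    """Probability that at least one of M mismatches is visible to SA1."""
    return 1.0 - 0.5 ** mismatches


def average_energy_fj(mismatches):
    """E_AVG = P_ML1 x E_1 + (1 - P_ML1) x E_2."""
    p = p_ml1(mismatches)
    return p * E1_FJ + (1.0 - p) * E2_FJ


for m in range(1, 7):
    e_avg = average_energy_fj(m)
    saving = 1.0 - e_avg / E_TRADITIONAL_FJ
    print(f"M={m}: E_AVG={e_avg:.1f} fJ, saving={saving:.0%}")
```

Under these assumptions the saving grows with the number of mismatches and approaches the 45% bound set by E_1 as P_ML1 tends to 1.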
CAM: Design Techniques
• Low Power: Dual-ML TCAM*
[Figure: average energy (fJ/bit/search, axis 0 to 6) versus number of mismatches M (1 to 6) for the traditional and dual-ML schemes; annotated saving: 43%.]
CAM: Design Techniques
• Low Power: Hierarchical SLs*
144 bits (5 segments: 8, 34, 34, 34, 34)
SLs ⇒ multiple blocks (64 words each)
Advantages:
- V_GSL ≈ 0.45 V (V_DD = 1.8 V)
Disadvantages:
- Logic complexity
- Search time/latency
- 64-bit OR gates
CAM: Design Techniques
• Static Power Reduction
16T TCAM: Leakage Paths*
[Figure: 16T TCAM cell schematic with leakage paths. Signals: WL, BL1, BL1c, BL2, BL2c, SL1, SL2, ML; storage nodes BL1c_cell, BL2c_cell; transistors N1 to N12 and P1 to P4; node values annotated with 0 and 1.]
CAM: Design Techniques
• Static Power Reduction
Technology scaling [1]
  • Dimensions 30% ↓
  • Dynamic power 50% ↓
  • Leakage current 5x ↑
Architectural level techniques [2, 3]
  • A small portion is enabled
CAM: Design Techniques
• Static Power Reduction
Leakage current*
V_DD ↓ ⇒ I_SUB ↓
I_SUB = I_0 exp((k_S1 + k_S2 V_DD) / (n V_T))
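The exponential dependence on V_DD can be made concrete with a small numerical sketch. It assumes the form I_SUB = I_0 exp((k_S1 + k_S2 V_DD) / (n V_T)) with k_S2 > 0, so that a lower V_DD gives a lower I_SUB as the slide states. All parameter values below are hypothetical placeholders, not data from the slides.

```python
import math


def subthreshold_current(v_dd, i_0, k_s1, k_s2, n, v_t):
    """Assumed model: I_SUB = I_0 * exp((k_S1 + k_S2 * V_DD) / (n * V_T))."""
    return i_0 * math.exp((k_s1 + k_s2 * v_dd) / (n * v_t))


def leakage_ratio(v_dd_low, v_dd_high, k_s2, n, v_t):
    """I_SUB(V_DD_low) / I_SUB(V_DD_high); I_0 and k_S1 cancel out."""
    return math.exp(k_s2 * (v_dd_low - v_dd_high) / (n * v_t))


# Hypothetical parameters, for illustration only.
params = dict(i_0=1e-9, k_s1=-0.3, k_s2=0.05, n=1.5, v_t=0.026)
ratio = leakage_ratio(1.2, 1.8, params["k_s2"], params["n"], params["v_t"])
print(f"I_SUB at 1.2 V is {ratio:.2f}x its value at 1.8 V")
```

The ratio depends only on k_S2, n and V_T, so the relative benefit of lowering V_DD does not require knowing I_0 or k_S1.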
CAM: Design Techniques
• Static Power Reduction
Side effects of V_DD reduction in TCAM cells
Advantages:
- Speed: no change
- Dynamic power: no change
Disadvantages:
- Robustness ↓
V_DD ↓ ⇒ Voltage margin ↓ (current-race sensing)
[Figure: waveforms of ML [0], MLSO [0] and ML [1], with the voltage margin marked.]
CAM: Design Techniques
• Static Power Reduction
Voltage margin of a 144-bit TCAM word in 0.18 µm CMOS*
CAM: Design Techniques
• Static Power Reduction
Effects of technology scaling*
  • Berkeley predictive technology model (BPTM)
