Vishwani D. Agrawal
Rutgers University, Dept. of ECE New Jersey
http://cm.bell-labs.com/cm/cs/who/va
January 2003
Course Outline
Basic concepts and definitions
Fault modeling
Fault simulation
ATPG
DFT and scan design
BIST
Boundary scan
IDDQ test
Write specifications
Design synthesis and Verification
Test development
Fabrication Manufacturing test
Chips to customer
2003 Agrawal: Digital Test and DFT 3
Definitions
Design synthesis: Given an I/O function, develop a procedure to manufacture a device using known materials and processes.
Verification: Predictive analysis to ensure that the synthesized design, when manufactured, will perform the given I/O function.
Test: A manufacturing step that ensures that the physical device, manufactured from the synthesized design, has no manufacturing defect.
Realities of Tests
Based on analyzable fault models, which may not map onto real defects.
Incomplete coverage of modeled faults due to high complexity.
Some good chips are rejected. The fraction (or percentage) of such chips is called the yield loss.
Some bad chips pass tests. The fraction (or percentage) of bad chips among all passing chips is called the defect level.
Costs of Testing
Software processes of test
  Test generation and fault simulation
  Test programming and debugging
Manufacturing test
  Automatic test equipment (ATE) capital cost
  Test center operational cost
0.5-1.0 GHz, analog instruments, 1,024 digital pins:
ATE purchase price = $1.2M + 1,024 x $3,000 = $4.272M
Running cost (five-year linear depreciation) = Depreciation + Maintenance + Operation = $0.854M + $0.085M + $0.5M = $1.439M/year
Test cost (24-hour ATE operation) = $1.439M/(365 x 24 x 3,600) = 4.5 cents/second
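The ATE cost arithmetic can be reproduced in a few lines. This is a sketch only: the function and parameter names are hypothetical, and the 2% maintenance rate is an assumption inferred from the ratio $0.085M / $4.272M rather than a figure stated on the slide.

```python
def ate_cost_per_second(base_price=1.2e6, pins=1024, price_per_pin=3000.0,
                        depreciation_years=5, maintenance_rate=0.02,
                        operation_per_year=0.5e6):
    """Return (purchase price, yearly running cost, cost per second) in dollars."""
    purchase = base_price + pins * price_per_pin
    depreciation = purchase / depreciation_years      # linear depreciation
    maintenance = maintenance_rate * purchase         # assumed 2% of purchase price
    running = depreciation + maintenance + operation_per_year
    seconds_per_year = 365 * 24 * 3600                # 24-hour ATE operation
    return purchase, running, running / seconds_per_year

purchase, running, per_second = ate_cost_per_second()
print(f"purchase = ${purchase / 1e6:.3f}M, running = ${running / 1e6:.3f}M/year, "
      f"test cost = {100 * per_second:.2f} cents/second")
```

The result lands between 4.5 and 4.6 cents per second, consistent with the slide's figure.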
Method of Testing
A manufacturing defect is a finite chip area with electrically malfunctioning circuitry caused by errors in the fabrication process.
A chip with no manufacturing defect is called a good chip.
Fraction (or percentage) of good chips produced in a manufacturing process is called the yield. Yield is denoted by symbol Y.
Cost of a chip = (Cost of fabricating and testing a wafer) / (Yield x Number of chip sites on the wafer)
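A short sketch of the chip-cost formula shows how cost scales inversely with yield. The wafer cost, yield values, and site count below are hypothetical numbers chosen for illustration; none come from the slides.

```python
def chip_cost(wafer_cost, yield_fraction, sites_per_wafer):
    """Cost per good chip = wafer cost / (yield x number of chip sites)."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield must be in (0, 1]")
    return wafer_cost / (yield_fraction * sites_per_wafer)

# Hypothetical numbers, for illustration only.
low_yield = chip_cost(wafer_cost=5000.0, yield_fraction=0.25, sites_per_wafer=200)
high_yield = chip_cost(wafer_cost=5000.0, yield_fraction=0.50, sites_per_wafer=200)
print(low_yield, high_yield)
```

Doubling the yield halves the cost of each good chip in this model.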
Defect level (DL) is the ratio of faulty chips among the chips that pass tests.
DL is measured as parts per million (ppm).
DL is a measure of the effectiveness of tests.
DL is a quantitative measure of the manufactured product quality.
For commercial VLSI chips a DL greater than 500 ppm is considered unacceptable.
Bus interface controller ASIC fabricated and tested at IBM, Burlington, Vermont
116,000 equivalent (2-input NAND) gates
304-pin package, 249 I/O
Clock: 40 MHz, some parts 50 MHz
0.45 µm CMOS, 3.3 V, 9.4 mm x 8.8 mm area
Full scan, 99.79% fault coverage
Advantest 3381 ATE, 18,466 chips tested at 2.5 MHz test clock
Data obtained courtesy of Phil Nigh (IBM)
Computed DL
237,700 ppm (Y = 76.23%)
Summary: Introduction
VLSI yield drops as chip area increases; low yield means high cost
Fault coverage measures the test quality
Defect level (DL) or reject ratio is a measure of chip quality
DL can be determined by an analysis of test data
For high quality: DL < 500 ppm, fault coverage ~ 99%
Fault Modeling
I/O function tests inadequate for manufacturing (functionality versus component and interconnect testing)
Real defects (often mechanical) too numerous and often not analyzable
A fault model identifies targets for testing
A fault model makes analysis possible
Effectiveness measurable by experiments
Processing defects
  Missing contact windows
  Parasitic transistors
  Oxide breakdown
  . . .
Material defects
  Bulk defects (cracks, crystal imperfections)
  Surface impurities (ion migration)
  . . .
Time-dependent failures
  Dielectric breakdown
  Electromigration
  . . .
Packaging failures
  Contact degradation
  Seal leaks
  . . .
Ref.: M. J. Howes and D. V. Morgan, Reliability and Degradation - Semiconductor Devices and Circuits, Wiley, 1981.
Single stuck-at faults
Transistor open and short faults
Memory faults
PLA faults (stuck-at, cross-point, bridging)
Functional faults (processors)
Delay faults (transition, path)
Analog faults
For more examples, see Section 4.4 (pp. 60-70) of the book.
Single stuck-at fault:
  Only one line is faulty
  The faulty line is permanently set to 0 or 1
  The fault can be at an input or output of a gate
Example: XOR circuit has 12 fault sites and 24 single stuck-at faults
[Figure: XOR circuit with lines labeled a through k, an s-a-0 fault on one line, and signal values annotated as 0(1) and 1(0).]
Fault Equivalence
Number of fault sites in a Boolean gate circuit = #PI + #gates + #(fanout branches).
Fault equivalence: Two faults f1 and f2 are equivalent if all tests that detect f1 also detect f2, and vice versa.
If faults f1 and f2 are equivalent then the corresponding faulty functions are identical.
Fault collapsing: All single faults of a logic circuit can be divided into disjoint equivalence subsets, where all faults in a subset are mutually equivalent. A collapsed fault set contains one fault from each equivalence subset.
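The fault-site count can be checked on a small netlist. The sketch below assumes an XOR built from four 2-input NAND gates; this netlist and its signal names are hypothetical and may differ from the circuit drawn on the slide, but it yields the same 12 fault sites and 24 single stuck-at faults.

```python
from collections import Counter

# Hypothetical netlist: XOR built from four 2-input NAND gates.
# Each entry maps a gate output to the signals feeding that gate.
PRIMARY_INPUTS = ["a", "b"]
GATES = {
    "n1": ("a", "b"),
    "n2": ("a", "n1"),
    "n3": ("b", "n1"),
    "z": ("n2", "n3"),
}

def count_fault_sites(primary_inputs, gates):
    """#PI + #gates + #(fanout branches); a stem feeding k > 1 gate inputs adds k branches."""
    loads = Counter(src for inputs in gates.values() for src in inputs)
    branches = sum(k for k in loads.values() if k > 1)
    return len(primary_inputs) + len(gates) + branches

sites = count_fault_sites(PRIMARY_INPUTS, GATES)
print(sites, "fault sites,", 2 * sites, "single stuck-at faults")
```

Here a, b, and n1 each fan out to two gate inputs, giving 6 branches on top of 2 primary inputs and 4 gate outputs.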
Equivalence Example
[Figure: equivalence example. Every fault site is labeled with its sa0 and sa1 faults; faults in red are removed by equivalence collapsing.]
Fault models are analyzable approximations of defects and are essential for a test methodology.
For digital logic, the single stuck-at fault model offers the best advantage of tools and experience.
Many other faults (bridging, stuck-open and multiple stuck-at) are largely covered by stuck-at fault tests.
Stuck-short and delay faults and technology-dependent faults require special tests.
Memory and analog circuits need other specialized fault models and tests.
Fault Simulation
Determine
Motivation
[Figure: flow chart. The test generator adds vectors while the result is low and stops when it is adequate.]
Circuit model:
  Mostly logic with some switch-level for high-impedance (Z) and bidirectional signals
  High-level models (memory, etc.) with pin faults
Signal states:
  Two (0, 1) or three (0, 1, X) states for purely Boolean logic circuits
  Four states (0, 1, X, Z) for sequential MOS circuits
Timing:
  Zero-delay for combinational and synchronous circuits
  Mostly unit-delay for circuits with feedback
Faults:
  Mostly single stuck-at faults
  Sometimes stuck-open, transition, and path-delay faults; analog circuit fault simulators are not yet in common use
  Equivalence fault collapsing of single stuck-at faults
  Fault-dropping -- a fault once detected is dropped from consideration as more vectors are simulated; fault-dropping may be suppressed for diagnosis
  Fault sampling -- a random sample of faults is simulated when the circuit is large
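Fault-dropping can be illustrated with a minimal serial fault simulator. This is a sketch under stated assumptions: the circuit z = (a AND b) OR c, the line names, and the test vectors are hypothetical, and the simulator handles only single stuck-at faults on a fanout-free circuit.

```python
LINES = ["a", "b", "c", "d", "z"]   # d = a AND b, z = d OR c (hypothetical circuit)

def simulate(vector, fault=None):
    """Evaluate the circuit for one input vector; fault is (line, stuck_value) or None."""
    values = {}

    def put(line, value):
        # A stuck-at fault overrides whatever value the line would carry.
        values[line] = fault[1] if fault and fault[0] == line else value

    a, b, c = vector
    put("a", a)
    put("b", b)
    put("c", c)
    put("d", values["a"] & values["b"])
    put("z", values["d"] | values["c"])
    return values["z"]

def fault_simulate(vectors):
    """Serial fault simulation with fault dropping; returns (coverage, detected map)."""
    remaining = [(line, v) for line in LINES for v in (0, 1)]
    total = len(remaining)
    detected = {}
    for vec in vectors:
        good = simulate(vec)
        still_undetected = []
        for f in remaining:
            if simulate(vec, f) != good:
                detected[f] = vec              # dropped: never simulated again
            else:
                still_undetected.append(f)
        remaining = still_undetected
    return len(detected) / total, detected

coverage, detected = fault_simulate([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)])
print(f"fault coverage = {coverage:.0%}")
```

Each vector is simulated only against faults that earlier vectors failed to detect, so the fault list shrinks as simulation proceeds. Suppressing the drop, as done for diagnosis, would mean simulating all faults for every vector.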
fn detected?
Disadvantage: Much repeated computation; CPU time prohibitive for VLSI circuits
Alternative: Simulate many faults together
Fault Sampling
A randomly selected subset (sample) of faults is simulated.
Measured coverage in the sample is used to estimate fault coverage in the entire circuit.
Advantage: Saving in computing resources (CPU time and memory).
Disadvantage: Limited data on undetected faults.
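[Figure: random sampling model. Faults, some of them undetected, are picked at random from the fault list; the resulting sample coverage is a random variable.]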
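Probability density of sample coverage x:

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - C)^2}{2\sigma^2}\right)$$

Mean = C, Variance $\sigma^2 = \frac{C(1 - C)}{N_s}$

[Figure: p(x) plotted against sample coverage x, centered on the mean C, with the sampling error spanning C - 3σ to C + 3σ; the coverage axis ends at 1.0.]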
Sampling error bound (3-sigma):

$$|x - C| = 3\left[\frac{C(1 - C)}{N_s}\right]^{1/2}$$

Solving the quadratic equation for C, we get the 3-sigma (99.7% confidence) estimate:

$$C_{3\sigma} = x \pm \frac{4.5}{N_s}\left[1 + 0.44\,N_s\,x(1 - x)\right]^{1/2}$$

where $N_s$ is the sample size and x is the measured fault coverage in the sample.
Example: A circuit with 39,096 faults has an actual fault coverage of 87.1%. The measured coverage in a random sample of 1,000 faults is 88.7%. The above formula gives an estimate of 88.7% ± 3%. CPU time for sample simulation was about 10% of that for all faults.
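The 3-sigma estimate is easy to evaluate numerically. The sketch below plugs in the example's sample size and measured coverage; the function name is hypothetical.

```python
import math

def three_sigma_coverage(x, n_s):
    """Return (low, high) 3-sigma bounds on fault coverage C from sample coverage x."""
    half_width = (4.5 / n_s) * math.sqrt(1.0 + 0.44 * n_s * x * (1.0 - x))
    return x - half_width, x + half_width

low, high = three_sigma_coverage(x=0.887, n_s=1000)
print(f"C lies in [{low:.3f}, {high:.3f}] with about 99.7% confidence")
```

The half-width comes out at about 3 percentage points, and the actual coverage of 87.1% falls inside the interval.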
Fault simulator is an essential tool for test development.
Concurrent fault simulation algorithm offers the best choice.
For restricted class of circuits (combinational and synchronous sequential with only Boolean primitives), differential algorithm can provide better speed and memory efficiency (Section 5.5.6).
For large circuits, the accuracy of random fault sampling only depends on the sample size (1,000 to 2,000 faults) and not on the circuit size. The method has significant advantages in reducing CPU time and memory needs of the simulator.
Functional ATPG: generate complete set of tests for circuit input-output combinations
  129 inputs, 65 outputs: 2^129 = 680,564,733,841,876,926,926,749,214,863,536,422,912 patterns
  Using 1 GHz ATE, would take 2.15 x 10^22 years
Structural test:
  No redundant adder hardware, 64 bit slices
  Each with 27 faults (using fault equivalence)
  At most 64 x 27 = 1728 faults (tests)
  Takes 0.000001728 s on 1 GHz ATE
Designer gives small set of functional tests; augment with structural tests to boost coverage to 98+%
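The gap between exhaustive functional testing and structural testing can be checked directly. The sketch assumes the ATE applies one pattern per clock cycle and uses a 365-day year; both are assumptions, and the constant names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
ATE_RATE_HZ = 1e9                       # 1 GHz ATE, assumed one pattern per clock

functional_patterns = 2 ** 129          # 129 inputs, exhaustive
functional_years = functional_patterns / ATE_RATE_HZ / SECONDS_PER_YEAR

structural_tests = 64 * 27              # 64 bit slices, 27 collapsed faults each
structural_seconds = structural_tests / ATE_RATE_HZ

print(f"{functional_patterns:,} patterns -> {functional_years:.3g} years")
print(f"{structural_tests} tests -> {structural_seconds:.9f} s")
```

The exhaustive test needs on the order of 10^22 years, while the structural test finishes in under two microseconds.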
Random-Pattern Generation
Flow chart for method
Use to get tests for 60 to 80% of faults, then switch to D-algorithm or other ATPG for rest
Sequential Circuits
A sequential circuit has memory in addition to combinational logic.
Test for a fault in a sequential circuit is a sequence of vectors, which
  Initializes the circuit to a known state
  Activates the fault, and
  Propagates the fault effect to a primary output
Sequential ATPG methods:
  Time-frame expansion methods
  Simulation-based methods
Concept of Time-Frames
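Replicate combinational logic block n times
Place fault in each block
Generate a test for the multiple stuck-at fault using combinational ATPG with 9-valued logic
[Figure: time-frame expansion. The combinational block is replicated as Timeframe -n+1, ..., Timeframe -1, Timeframe 0, each with its own input vector (Vector -n+1, ..., Vector -1, Vector 0) and primary output (PO -n+1, ..., PO -1, PO 0). State variables link consecutive time-frames; the first starts from an unknown or given initial state, the last produces the next state, and the fault appears in every block.]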
[Figure: example circuit with flip-flops FF1 and FF2 and an s-a-1 fault, expanded into Time-frame -1 and Time-frame 0; signal values are not recoverable from the extraction.]
Circuit                      (name lost)   s5378        s35932
Total faults                 (lost)        4,603        39,094
Detected faults              (lost)        3,639        35,100
Fault coverage               93.3%         79.1%        89.8%
Test vectors                 3,943         11,571       257
CPU time (HP J200, 256MB)    1.3 hrs.      37.8 hrs.    10.2 hrs.
Ref.: M. S. Hsiao, E. M. Rudnick and J. H. Patel, Dynamic State Traversal for Sequential Circuit Test Generation, ACM Trans. on Design Automation of Electronic Systems (TODAES), vol. 5, no. 3, July 2000.
Summary: ATPG
Combinational ATPG is significantly more efficient than sequential ATPG.
Combinational ATPG tools are commercially available.
Design for testability is essential if the circuit is large (million or more gates) and high fault coverage (~95%) is required.
Definition
Design for testability (DFT) refers to those design techniques that make test generation and test application cost-effective.
DFT methods for digital circuits:
  Ad-hoc methods
  Structured methods:
    Scan
    Partial Scan
    Built-in self-test (BIST)
    Boundary scan
    Analog test bus
Ad-hoc DFT guidelines:
  Avoid asynchronous (unclocked) feedback.
  Make flip-flops initializable.
  Avoid redundant gates.
  Avoid large fanin gates.
  Provide test control for difficult-to-control signals.
  Avoid gated clocks.
  ...
  Consider ATE requirements (tristates, etc.)
Design reviews conducted by experts or design auditing tools.
Disadvantages of ad-hoc DFT methods:
  Experts and tools not always available.
  Test generation is often manual with no guarantee of high fault coverage.
  Design iterations may be necessary.
Scan Design
Circuit is designed using pre-specified design rules.
Add a test control (TC) primary input.
Replace flip-flops by scan flip-flops (SFF) and connect to form one or more shift registers in the test mode.
Make input/output of each scan shift register controllable/observable from PI/PO.
Use combinational ATPG to obtain tests for all testable faults in the combinational logic.
Add shift register tests and convert ATPG tests into scan sequences for use in manufacturing test.
Use only clocked D-type flip-flops for all state variables.
At least one PI pin must be available for test; more pins, if available, can be used.
All clocks must be controlled from PIs.
Clocks must not feed data inputs of flip-flops.
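[Figure: scan flip-flop (SFF). A MUX controlled by TC selects between D and SD ahead of a D flip-flop made of a master latch and a slave latch clocked by CK, with output Q; the timing diagram marks the scan mode, in which SD is selected.]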
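[Figure: level-sensitive scan latch with clocks MCK, SCK, and TCK and scan data input SD; the added scan circuitry is marked as logic overhead.]
[Figure: scan structure. SFFs are chained from SCANIN to SCANOUT and controlled by TC or TCK.]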
[Figure: combinational logic with inputs I1, I2, outputs O1, O2 (PO), state inputs S1, S2, and next-state outputs N1, N2.]
Scan register must be tested prior to application of scan test sequences.
A shift sequence 00110011 . . . of length n_sff + 4 in scan mode (TC = 0) produces 00, 01, 11 and 10 transitions in all flip-flops and observes the result at SCANOUT output.
Total scan test length: (n_comb + 2) n_sff + n_comb + 4 clock periods.
Example: 2,000 scan flip-flops, 500 comb. vectors, total scan test length ~ 10^6 clocks.
Multiple scan registers reduce test length.
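The scan test length formula and the shift-register test sequence can be sketched as follows. Function names are hypothetical; the formula is the one stated above.

```python
def scan_test_length(n_comb, n_sff):
    """Total clock periods: (n_comb + 2) * n_sff + n_comb + 4."""
    return (n_comb + 2) * n_sff + n_comb + 4

def shift_register_test(n_sff):
    """Shift sequence 00110011... of length n_sff + 4 used to test the scan register."""
    return [(i // 2) % 2 for i in range(n_sff + 4)]

length = scan_test_length(n_comb=500, n_sff=2000)
sequence = shift_register_test(4)
print(length)      # 1004504, roughly 10^6 clocks
print(sequence)    # [0, 0, 1, 1, 0, 0, 1, 1]
```

Splitting the flip-flops across several scan registers shortens the n_sff term for each register, which is why multiple scan registers reduce test length.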
Scan Overheads
IO pins: One pin necessary.
Area overhead:
  Gate overhead = [4 n_sff/(n_g + 10 n_ff)] x 100%, where n_g = comb. gates; n_ff = flip-flops
  Example: n_g = 100k gates, n_ff = 2k flip-flops, overhead = 6.7%.
  More accurate estimate must consider scan wiring and layout area.
Performance overhead:
  Multiplexer delay added in combinational path; approx. two gate-delays.
  Flip-flop output loading due to one additional fanout; approx. 5-6%.
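The gate-overhead formula can be evaluated for the example values. The sketch assumes full scan, so every flip-flop is a scan flip-flop (n_sff = n_ff); the function name is hypothetical.

```python
def scan_gate_overhead(n_sff, n_g, n_ff):
    """Gate overhead in percent: 4 * n_sff / (n_g + 10 * n_ff) * 100."""
    return 4 * n_sff / (n_g + 10 * n_ff) * 100.0

# Full scan is assumed here, so n_sff equals n_ff.
overhead = scan_gate_overhead(n_sff=2000, n_g=100_000, n_ff=2000)
print(f"{overhead:.1f}%")   # 6.7%
```

The formula counts 4 extra gates per scan flip-flop against 10 gates per ordinary flip-flop plus the combinational gates.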
Advantages:
Disadvantages:
BIST Process
test simultaneously on all PCBs
Each board controller activates parallel chip BIST
Diagnosis effective only if very high fault coverage
Definitions
circuit response
  no information loss
  fully invertible (can get back original response)
Signature analysis: Compact good machine response into good machine signature. Actual signature is generated during testing and compared with good machine signature.
Problem with ordinary LFSR response compacter: too much hardware if one of these is put on each primary output (PO)
Solution: MISR compacts all outputs into one LFSR
  Works because LFSR is linear and obeys superposition principle
  Final remainder is XOR sum of remainders of polynomial divisions of each PO by the characteristic polynomial
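The superposition property can be demonstrated with polynomial division over GF(2). This is a sketch of the linearity argument only, not a cycle-accurate MISR model: the characteristic polynomial x^5 + x^3 + x + 1 and the two output streams are arbitrary choices made for illustration.

```python
def gf2_remainder(dividend, divisor):
    """Remainder of polynomial division over GF(2); bit i is the coefficient of x**i."""
    degree = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= degree:
        shift = dividend.bit_length() - 1 - degree
        dividend ^= divisor << shift       # subtract (XOR) the aligned divisor
    return dividend

CHAR_POLY = 0b101011      # x^5 + x^3 + x + 1, an assumed characteristic polynomial
po1 = 0b10011010          # hypothetical response stream on one primary output
po2 = 0b01101100          # hypothetical response stream on another primary output

combined = gf2_remainder(po1 ^ po2, CHAR_POLY)
separate = gf2_remainder(po1, CHAR_POLY) ^ gf2_remainder(po2, CHAR_POLY)
print(combined == separate)   # True: remainders obey superposition
```

Because division by the characteristic polynomial is linear, compacting the XOR of the streams gives the same remainder as XORing the individually compacted remainders.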
Combined functionality of D flip-flop, pattern generator, response compacter, & scan chain
Reset all FFs to 0 by scanning in zeros
Circuit Initialization
1. Modeling all BIST hardware
2. Setting all FFs to Xs
3. Running logic simulation of CUT with BIST hardware
Summary: BIST
LFSR pattern generator and MISR response compacter: preferred BIST methods
BIST has overheads: test controller, extra circuit delay, input MUX, pattern generator, response compacter, DFT to initialize circuit & test the test hardware
BIST benefits:
  At-speed testing for delay & stuck-at faults
  Drastic ATE cost reduction
  Field test capability
  Faster diagnosis during system test
  Less effort to design testing process
  Shorter test application times
IEEE 1149.1
Test Access Port (TAP) includes these signals:
  Test Clock Input (TCK) -- Clock for test logic
    Can run at different rate from system clock
  Test Mode Select (TMS) -- Switches system from functional to test mode
  Test Data Input (TDI) -- Accepts serial test data and instructions -- used to shift in vectors or one of many test instructions
  Test Data Output (TDO) -- Serially shifts out test results captured in boundary scan chain (or device ID or other internal registers)
  Test Reset (TRST) -- Optional asynchronous TAP controller reset
Functional test: verify system hardware, software, function and performance; pass/fail test with limited diagnosis; high (~100%) software coverage metrics; low (~70%) structural fault coverage.
Diagnostic test: High structural coverage; high diagnostic resolution; procedures use fault dictionary or diagnostic tree.
SOC design for testability:
  Partition SOC into blocks of logic, memory and analog circuitry, often on architectural boundaries.
  Provide external or built-in tests for blocks.
  Provide test access via boundary scan and/or analog test bus.
  Develop interconnect tests and system functional tests.
  Develop diagnostic procedures.
IDDQ Test
poly to bulk
Cmp: overlapped metal wire to poly
Floating gate voltage depends on capacitances and node voltages
If nFET and pFET get enough gate voltage to turn them on, then IDDQ test detects this defect
K is the transistor gain
Sematech Results
Test process: Wafer Test -> Package Test -> Burn-In & Retest -> Characterize & Failure Analysis
Data for devices failing some, but not all, tests.
[Table: number of devices by pass/fail outcome of four tests: scan-based stuck-at, scan-based delay, IDDQ (5 µA limit), and functional. The cell layout is not recoverable from the extraction; values as extracted: 6, 14, 0, 6, 1, 52, 36, 1463, 34, 13, 1251, 7, 1, 8.]
causing:
  Delay, bridging, weak faults
  Chips damaged by electro-static discharge
No natural breakpoint for current threshold
  Get continuous distribution; bimodal would be better
Conclusion: now need stuck-fault, IDDQ, and delay fault testing combined
Still uncertain whether IDDQ tests will remain useful as chip feature sizes shrink further
References
M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits, Boston: Kluwer Academic Publishers, 2000, ISBN 0-7923-7991-8.
For the material on a course taught by the authors at Rutgers University, and a complete bibliography from the above book, see website:
http://cm.bell-labs.com/cm/cs/who/va