APRIL 2007
MEMORY TEST & REPAIR PRIMER
TABLE OF CONTENTS
INTRODUCTION
REFERENCES
INTRODUCTION
One of the most notable consequences of the semiconductor industry moving to deeper nanoscale
technology nodes is the significant growth in both the number and densities of embedded memories.
Designs have migrated from containing a handful of memories to containing hundreds and in some cases
over a thousand memories of all types. This explosion in embedded memories is driving the need for
rethinking the manufacturing test strategy for these designs [1]. In particular, embedded memories now
represent in most cases a die’s largest contributor to yield loss due to the very large area and density of
these regular circuits. A successful memory strategy must now incorporate some form of repair
methodology in order to achieve profitable yield levels.
Formulating a repair methodology often requires combining IP from memory providers, automation from
DFT providers, and data from foundries. This often represents a significant challenge as not only are
there several combinations and choices to consider, but more importantly, there is generally very little
information on how best to make these choices. This document attempts to address this challenge by
explaining the memory repair process, along with all of its components and choices, and by
providing repair-related information on popular memory IP vendors and foundries.
To address this growing problem, some commercial memory BIST solutions now provide programmable
BIST engines (figure 1b). With these engines it is possible to download (on the tester or in-system)
program code that implements an arbitrary memory test algorithm, allowing new or enhanced algorithms
to be applied as needed to specific memories as new defect mechanisms need to be addressed. To
maintain a simplified manufacturing test flow, these programmable BIST engines will typically support
predetermined default algorithms as well. This removes the need to program the BIST engine if the
default algorithm is sufficient. Only when new defect mechanisms are discovered does it become
necessary to program each BIST engine before having it execute the memory test. The programmable
BIST engines are larger than the more traditional hard-coded ones and therefore should only be used
when the need is justified. This tends to be when a new memory design and/or a new foundry process
are to be used.
Figure 1: Memory BIST Architectures. (a) BIST controller; (b) programmable BIST controller with
scannable microcode memory.
REPAIR ANALYSIS
This component of the repair process consists of determining which of a memory’s defective sections
(typically rows or columns) must be replaced with available spares. Repair analysis can be performed on
or off chip. In the off-chip approach, all memory failures are logged on the tester and the resulting fail data
is post-processed offline. A significant drawback of the off-chip approach is that logging all of the fail data
off-chip results in a large increase in test time. Because of this, the majority of today’s repair approaches
use an on-chip repair analysis capability, often referred to as BIRA for Built-In Repair Analysis. With
BIRA, absolutely no fail data needs to be logged externally as the BIRA circuitry or engine analyzes the
fail data coming out of an associated BIST controller on the fly. By the end of the memory test, the BIRA
engine has determined the spare element allocation necessary to repair the chip.
A key requirement for a BIRA engine is to maximize its success at finding spare allocation solutions. If
only spare rows or spare columns are used then the repair analysis is straightforward as any defective
row or column is simply replaced. The analysis becomes much more complex, however, when both spare
rows and columns are available. Take for example the memory represented in figure 2a, which contains two
spare rows and one spare column and has the six defects shown. If a simple linear algorithmic
approach is taken to allocate spares, then the allocation shown in figure 2b would be the outcome and
the repair would not be successful. A successful allocation is possible in this case, however, as shown in
figure 2c. In general, determining the optimum allocation when both spare rows and columns are used is,
in mathematical terms, an NP-complete problem, or more simply put, a problem that grows exponentially
in difficulty with a growing number of spare elements. Fortunately, when the number of spare rows and
columns is relatively small (which is generally the case), an optimal solution can typically be
computed. Commercial BIRA solutions exist that can always find an optimal solution when one spare
column and any number of spare rows are used.
[Figure 2: (a) memory with two spare rows, one spare column, and six defects; (b) unsuccessful spare
allocation; (c) successful spare allocation]
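The difference between a simple linear allocation and an exhaustive search can be illustrated with a short
Python sketch. This is only an illustration under stated assumptions: the defect coordinates below are
hypothetical (they are not the defects of figure 2), the function names are invented for this example, and
the search is a plain software backtracking search, not a description of how any BIRA hardware works.

```python
# Hypothetical defect map: six (row, column) defects in a memory that has
# 2 spare rows and 1 spare column. These coordinates are made up for the example.
DEFECTS = {(0, 1), (0, 3), (0, 5), (2, 2), (4, 2), (5, 4)}


def linear_allocate(defects, spare_rows, spare_cols):
    """Simple linear approach: use a spare row for each new defect while spare
    rows remain, then fall back to spare columns. Returns None on failure."""
    rows, cols = set(), set()
    for r, c in sorted(defects):
        if r in rows or c in cols:
            continue  # this defect is already covered by an allocated spare
        if len(rows) < spare_rows:
            rows.add(r)
        elif len(cols) < spare_cols:
            cols.add(c)
        else:
            return None  # out of spares
    return rows, cols


def exhaustive_allocate(defects, spare_rows, spare_cols,
                        rows=frozenset(), cols=frozenset()):
    """Backtracking search: for the first uncovered defect, try repairing its
    row, then its column. The recursion depth is bounded by the number of
    spares, so the search stays small when only a few spares exist."""
    uncovered = [(r, c) for r, c in sorted(defects)
                 if r not in rows and c not in cols]
    if not uncovered:
        return set(rows), set(cols)
    r, c = uncovered[0]
    if len(rows) < spare_rows:
        result = exhaustive_allocate(defects, spare_rows, spare_cols,
                                     rows | {r}, cols)
        if result is not None:
            return result
    if len(cols) < spare_cols:
        result = exhaustive_allocate(defects, spare_rows, spare_cols,
                                     rows, cols | {c})
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    print("linear:    ", linear_allocate(DEFECTS, 2, 1))
    print("exhaustive:", exhaustive_allocate(DEFECTS, 2, 1))
```

For this hypothetical defect map the linear approach runs out of spares, while the exhaustive search finds
an allocation that uses both spare rows and the spare column.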
REPAIR DELIVERY
There are two general forms of repair delivery: hard repair and soft repair.
Hard Repair
In this approach, repair instructions are stored permanently within the die through the programming of
fuses. The two common fuse types are laser and electrical. Laser fuses are programmed by cutting a
metal link, while electrical fuses (eFuses) are typically one-time programmable or flash memory elements
and are programmed using an elevated voltage level. eFuse usage is growing rapidly because eFuses are
generally smaller than laser fuses, typically by a factor of 2 to 3 (e.g. 0.02 mm² vs. 0.05 mm²), and
do not require special equipment or a different test insertion to be programmed. For this last reason,
eFuses are also associated with Self-Repair approaches, which are described later in this section.
Soft Repair
In this approach, repair instructions are stored in volatile memory, typically in scan registers, at each
power up of the device. Soft repair has the advantage of being able to address defects that may arise
over time as new repair instructions can be created and stored throughout the life of the device. This
provides higher long term availability and reliability. Because the repair instructions are not permanently
stored within the device, they have to be either stored somewhere external to the device (somewhere in
the system) or they have to be generated on-the-fly at power-up. Storing the repair instructions in the
system can be daunting from a logistics point of view as the repair instructions for typically many different
memories within many different devices have to be properly managed. For this reason, soft repair is
almost exclusively associated with a BIRA mechanism to calculate repair instructions on-chip at power
up.
Self-Repair
A self-repair solution, typically referred to as BISR (Built-In Self-Repair), is one where both the repair
analysis and repair delivery are performed on-chip. In its simplest form, a BISR solution consists of the
combined BIRA and soft repair capabilities described above. One important disadvantage of this
approach however is that since the repair instructions are calculated once at power-up, they may not take
into account defects that only manifest themselves under specific operating conditions such as high
temperature. For this reason, more advanced BISR solutions now incorporate a combination of both soft
and hard repair capabilities. Hard repair is used to store repair instructions determined during
manufacturing test and soft repair is then used at each power up to address any new defects. These
advanced solutions provide several advantages including: a simplified manufacturing test and repair
process, support for long term reliability using soft-repair as explained above, and significant silicon area
savings through pooling of fuse data as explained below. A potential drawback of this incremental soft
redundancy are equipped with a BIRA engine to analyze failures and generate any necessary
repair instructions in the form of fuse data.
[Figure: block diagram with TAP, BIST controller, BIRA engine, memory with redundancy, and BIST interface]
[...] the next step.
Step 2
Action: Run the memory BIST/BIRA controllers (typically at different test corners).
Result: Each memory is fully tested and any necessary repair info is automatically [...]
[...] processing.
[...] mode.
[Figure: block diagrams with TAP, BIST controller, BIRA engine, memory with redundancy, BIST interface,
BIRA register, BISR register, and other BISR registers]
Generally, only steps 6 and 7 are repeated during final test, as it is assumed that any additional memory
yield fallout at final test is too small to warrant the test time cost of the extra repair steps. Nevertheless, if
additional repair is mandated at final test, the above seven steps can be re-executed with the following
modifications:
Step 1: This step is now equivalent to step 6, as the fuse controller will load the stored wafer sort repair
info into the BISR chain.
Step 2: Before executing this step, the BIRA register is loaded with the BISR register contents in order to
create a baseline or starting point for any additional repair info calculation.
Step 5: In this step the fuse controller is now run in incremental programming mode. Only new additional
repair info is compressed and stored in the eFuse array. This is necessary as the eFuse array cannot
be reprogrammed and therefore the incremental repair info must be stored separately.
In the field, all memories are repaired automatically at power up by the fuse controller as described in
step 6. For long term reliability, it is possible to perform additional incremental soft repair at power up to
address any defects that may have developed over time. To accomplish this, once the fuse controller has
loaded the BISR registers to repair the memories, the BIRA registers are then loaded with the BISR
register contents to create a baseline as in the modified step 2 above. The BIST controllers (with BIRA)
are then executed and the BIRA registers are then updated to contain the baseline repair info combined
with any new repair info. This combined repair info is then transferred back into the BISR registers to
repair the memories. This is a soft repair as the new repair info cannot be programmed into the eFuse
array and is therefore only available while the device is powered up.
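The power-up sequence described above can be summarized as a small Python model. It is a minimal sketch
only: registers are modeled as sets of repair entries, and the names `power_up_repair` and
`run_bist_with_bira`, as well as the (memory, spare type, address) entry format, are hypothetical
illustrations and not part of any tool or hardware interface. Spare capacity limits are not modeled.

```python
def power_up_repair(fuse_repairs, run_bist_with_bira):
    """Model of incremental soft repair at power-up.

    fuse_repairs       -- repair entries stored in the eFuse array (hard repair)
    run_bist_with_bira -- hypothetical callable that runs BIST on the repaired
                          memories and returns any newly required repair entries
    """
    # The fuse controller loads the stored repair info into the BISR registers.
    bisr = set(fuse_repairs)

    # The BIRA registers are loaded with the BISR contents as a baseline.
    bira = set(bisr)

    # The BIST controllers (with BIRA) run; BIRA now holds the baseline
    # combined with any new repair info.
    bira |= run_bist_with_bira(bisr)

    # The combined repair info is transferred back into the BISR registers.
    # Nothing is written to the eFuse array, so this is a soft repair.
    bisr = set(bira)
    return bisr


if __name__ == "__main__":
    stored = {("mem0", "row", 12), ("mem3", "col", 5)}

    def aged_device_bist(current_repairs):
        # Hypothetical: one new defective row has developed in the field.
        return {("mem7", "row", 40)}

    print(sorted(power_up_repair(stored, aged_device_bist)))
```

Because the eFuse contents are never modified in this flow, the extra repair entry has to be recomputed at
every power-up.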
Redundancy is added to improve memory yield and thus die yield. A method to calculate yield is therefore
required in order to analyze redundancy requirements. There is a long history of work on determining
accurate yield models [2]. A common model for memory yield used by several companies is one based
on the Negative Binomial model:
$$Y_{MEM} = (1 + D_{MEM} A_{MEM})^{-C} \qquad (1)$$
where:
DMEM = defect density (defects/mm²)
AMEM = memory core size (mm²)
C = complexity factor. This parameter relates to the complexity of the underlying process and is derived
from the number of critical steps in the manufacturing flow. Values [...]
[Figure 5: memory yield YMEM versus memory core size (1 to 10 mm²) for several values of C, with
DMEM = 0.002 defects/mm²]
Figure 5 plots yields for memories ranging in size from 1 mm² to 10 mm², for complexity factor values
ranging from 8 to 14. A relatively high defect density of 0.002 defects/mm² (1.3 defects/in²) is assumed. It
is clear that redundancy will be needed if even a few of the larger memories are placed together on a die
as the die’s yield will be the product of the already low memory yields.
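As a quick numerical check of equation (1), the following Python sketch evaluates the memory yield for a few
sizes and complexity factors, and the product of yields for several memories on one die. The function names
and the choice of four 10 mm² memories are assumptions made for this example only.

```python
from math import prod


def memory_yield(area_mm2, defect_density=0.002, complexity=12):
    """Equation (1): Y_MEM = (1 + D_MEM * A_MEM) ** -C."""
    return (1.0 + defect_density * area_mm2) ** (-complexity)


def die_memory_yield(areas_mm2, defect_density=0.002, complexity=12):
    """Combined yield of several memories on a die: the product of the
    individual memory yields."""
    return prod(memory_yield(a, defect_density, complexity) for a in areas_mm2)


if __name__ == "__main__":
    for c in (8, 10, 12, 14):
        values = ", ".join(f"{a} mm2: {memory_yield(a, complexity=c):.3f}"
                           for a in (1, 5, 10))
        print(f"C={c}: {values}")
    # Hypothetical die with four 10 mm2 memories and no redundancy.
    print(f"four 10 mm2 memories: {die_memory_yield([10] * 4):.3f}")
```

With these inputs a single 10 mm² memory at C = 12 yields a little under 0.8, and the product over four such
memories falls well below 0.5, which is the effect the paragraph above describes.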
To calculate the effect of redundancy on a memory’s yield, consider first the case when one spare
element (row or column) is added to a memory. In this case the memory can be viewed as being divided
into N equal parts of the size of the spare element. For example, if a spare row is added to a memory
then N is equal to the number of rows. With the spare element, the memory has N+1 parts each with the
same yield value, YMEM/N, which can be calculated using equation (1) with an area value N times smaller
than the full memory. The yield of the memory can be closely approximated by the probability of no more
than one of the N+1 parts being bad. The memory yield with one spare element can therefore be
calculated by:
$$Y_{1SP} = (Y_{MEM/N})^{N+1} + (N+1)\,(Y_{MEM/N})^{N}\,(1 - Y_{MEM/N}) \qquad (2)$$
As additional independent spare elements are added, the yield calculation becomes increasingly complex
as all allowable combinations of bad parts must be taken into account. With 2 spare
elements, the yield calculation grows to:
$$Y_{2SP} = (Y_{MEM/N})^{N+2} + (N+2)\,(Y_{MEM/N})^{N+1}\,(1 - Y_{MEM/N}) + \tfrac{1}{2}(N+2)(N+1)\,(Y_{MEM/N})^{N}\,(1 - Y_{MEM/N})^{2} \qquad (3)$$
Figure 6 shows the improved yield values for both the single and double spare element cases for the
same memory sizes and defect density used in figure 5. It is interesting to note that even at the relatively
high defect density, a single spare element seems sufficient for all but the largest memories. It also
appears from this data that it will be rare to need more than 2 spare elements within an individual
memory.
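The one-spare case described above and equation (3) are both instances of the same binomial sum: the
probability that no more than k of the N + k equal parts are bad. The Python sketch below implements that
general form. The function names, the choice of N = 512 rows, and the 5 mm² memory size are assumptions for
illustration; only the formulas come from the text.

```python
from math import comb


def memory_yield(area_mm2, defect_density=0.002, complexity=12):
    """Equation (1): Y_MEM = (1 + D_MEM * A_MEM) ** -C."""
    return (1.0 + defect_density * area_mm2) ** (-complexity)


def yield_with_spares(area_mm2, n_parts, n_spares,
                      defect_density=0.002, complexity=12):
    """Probability that at most n_spares of the (n_parts + n_spares) parts are
    bad, where each part has yield Y_MEM/N, i.e. equation (1) evaluated with
    an area n_parts times smaller than the full memory."""
    y_part = memory_yield(area_mm2 / n_parts, defect_density, complexity)
    total = n_parts + n_spares
    return sum(comb(total, bad) * y_part ** (total - bad) * (1.0 - y_part) ** bad
               for bad in range(n_spares + 1))


if __name__ == "__main__":
    # Hypothetical 5 mm2 memory with 512 rows and spare rows as spare elements.
    for spares in (0, 1, 2):
        print(f"{spares} spare(s): {yield_with_spares(5.0, 512, spares):.4f}")
```

Setting `n_spares` to 1 or 2 reproduces the one-spare and two-spare expressions term by term; larger values
extend the same sum without any new derivation.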
Adding redundancy to a given memory will therefore increase the DPWGOOD value if the resulting
percentage increase in the die yield, YDIE (which is equal to the percentage increase in the given memory
yield as yields are multiplied together) is greater than the resulting percentage increase in the die area
ADIE. The ratio YDIE / ADIE must increase to justify the added redundancy. For example, if one spare
element is added to a memory, then the ratio Y1SP / A1SP must be greater than the ratio Y0SP / A0SP, or
$$\frac{Y_{1SP} / A_{1SP}}{Y_{0SP} / A_{0SP}} > 1$$
where:
Y1SP = die yield with one spare element added to the memory
Y0SP = die yield with no spare element added to the memory
A1SP = die area with one spare element added to the memory
A0SP = die area with no spare element added to the memory
The graph in figure 7a displays the above ratio for the memory sizes and yield data used in figure 6. The
ratio is greater than one for all memory sizes and therefore indicates that one spare element should be
added in all cases. The graph also displays the ratio
$$\frac{Y_{2SP} / A_{2SP}}{Y_{1SP} / A_{1SP}}$$
which measures whether a second spare element should be added. In this case, the values indicate that
a second spare element should only be added for memory sizes of 4 mm² or greater.
Figure 7b shows the same two ratios for the same memory sizes but with the defect density decreased
from 0.002 defects/mm² (1.3 defects/in²) to 0.0002 defects/mm² (0.13 defects/in²). At this reduced defect
density, two spare elements are never justified, and one spare element helps only for memory sizes of 3
mm² or greater.
Note that if the design starts off pad-limited then there is some unused silicon area in the core that can be
used for redundancy without any cost. The above ratios are still useful in this case as they serve to rank
the relative benefits of adding redundancy to the various memories.
The above analysis also assumes that the die area can grow to an arbitrary size. This is often not the
case as specific die sizes may only be available due to packaging and other issues. In this case adding
redundancy may force a change to the next die size resulting in a more significant area increase.
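The decision rule above reduces to comparing a yield gain against an area gain. The following Python sketch
shows that comparison. The function name and all numeric inputs are hypothetical and are not taken from
figure 7; in practice the yields would come from the yield model and the added area from the memory
compiler.

```python
def spare_benefit_ratio(yield_before, yield_after, die_area_before, added_area):
    """Ratio (Y_after / A_after) / (Y_before / A_before).

    A value greater than 1 means the yield gain outweighs the area growth,
    so the added spare element is justified under this criterion."""
    yield_gain = yield_after / yield_before
    area_gain = (die_area_before + added_area) / die_area_before
    return yield_gain / area_gain


if __name__ == "__main__":
    # Hypothetical case 1: large yield gain, 1% area growth.
    print(spare_benefit_ratio(0.90, 0.99, 50.0, 0.5))
    # Hypothetical case 2: negligible yield gain, same area growth.
    print(spare_benefit_ratio(0.990, 0.991, 50.0, 0.5))
```

If the die is pad-limited, or if the added area forces a step to the next available die size, the same
function still applies: `added_area` is then zero or the full step in die area, respectively.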
[Figure 7: ratios (Y1SP / A1SP) / (Y0SP / A0SP) and (Y2SP / A2SP) / (Y1SP / A1SP) versus memory core size
(mm²), for (a) DMEM = 0.002 defects/mm² and (b) DMEM = 0.0002 defects/mm²]
The number of fuses to incorporate in a chip’s centralized fuse pool is calculated using the following
equation:
Where
BISRRegSize = Maximum number of repair data bits required to encode a spare element
allocation. This number is typically between 6 and 12 bits and depends on the
type of spare element (row or column) as well as the memory size and
configuration.
Example:
A design has 40 reparable memories and each memory contains two spare rows, resulting in 80 spare
elements on the chip. Each spare row requires 8 bits to encode. This results in the following parameter
values:
BISRRegSize = 8
BISRChainLength = 8 * 80 = 640
ZeroCountBits = ceil(log2(640)) = 10
The resulting relationship between number of repairs to be supported per die and the number of fuses
required is shown in the following table:
            # Repairs      % Spare
            Supported      Utilization      Fuse Count
Typical     5              6%               100
            10             13%              190
            20             25%              370
            40             50%              730
            80             100%             1450
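The fuse counts in the table are consistent with one simple relationship: each supported repair costs
BISRRegSize + ZeroCountBits fuses, plus one additional ZeroCountBits field. The Python sketch below uses
that relationship. Note that this formula is inferred here from the table values and the example parameters;
it is an assumption, not a formula quoted from the text, and the function name is hypothetical.

```python
from math import ceil, log2


def fuse_count(num_repairs, bisr_reg_size, bisr_chain_length):
    """Fuse count under the assumed relationship
    num_repairs * (BISRRegSize + ZeroCountBits) + ZeroCountBits,
    with ZeroCountBits = ceil(log2(BISRChainLength))."""
    zero_count_bits = ceil(log2(bisr_chain_length))
    return num_repairs * (bisr_reg_size + zero_count_bits) + zero_count_bits


if __name__ == "__main__":
    spare_elements = 80            # 40 memories x 2 spare rows
    reg_size = 8                   # bits per spare row
    chain_length = reg_size * spare_elements
    for repairs in (5, 10, 20, 40, 80):
        utilization = 100.0 * repairs / spare_elements
        print(f"{repairs:3d} repairs, {utilization:5.1f}% utilization, "
              f"{fuse_count(repairs, reg_size, chain_length)} fuses")
```

Under this assumed relationship the fuse count grows linearly with the number of supported repairs rather
than with the number of spare elements on the chip.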
This reparability versus area overhead trade-off must in general be analyzed to determine the optimal
number of fuses to incorporate into the design. An exact analysis of this trade-off is difficult, however, as it
must take into account many factors such as the defect density of the process being used, the physical
design of the memories, and the relative percentage of memory and logic on the chip. It is important to
keep in mind, however, that the logic on the chip cannot be repaired. Even though the defect density
within the memories tends to be greater than within the logic, as the number of defects grows, the logic
quickly becomes the limiting factor. As a result, a good rule of thumb is to plan for a relatively low number
of repairs, ten for example, regardless of the number of memories with redundancy. In the above
example, a fuse count of 190 would therefore be acceptable.
One or more spare rows are added per memory. In the case of several spare rows, some redundancy
schemes force all rows to be allocated as a contiguous block while others allow each row to be allocated
separately. It is rare to have more than two spare rows within a memory.
Advantages: This is the cheapest repair method from a BIST and BIRA overhead point of view. The BIST
overhead is cheapest as a serial test interface between the BIST controller and memory can be used. A
serial interface only requires one comparator per word rather than one per bit (I/O). The amount of BIRA
logic is also low and varies only slightly with the memory size as only the most significant bits (MSBs) of
the row address bits are logged.
Advantages: This has the least effect on memory performance as there is no impact on address
decoding.
Disadvantages: It precludes the use of a serial interface between the BIST controller and the memory as
a comparator per bit (I/O) is needed. The area cost is a function of the number of I/Os so that even a
small memory can require a large amount of repair circuitry. The BIRA circuitry required to encode the
failing I/O number is relatively big and slow. This may reduce the maximum frequency at which the BIST
and BIRA can operate.
This is a combination of the row and column redundancy schemes. One or more spare rows as well as one
or more spare columns are added per memory. The number of spare rows or spare columns rarely
exceeds two.
Advantages: Provides the highest repair success rate for a given number of spares. Having spares in
both dimensions not only improves the ability to cover a random distribution of defects, but also improves
the ability to cover defect mechanisms that affect an entire word (e.g. word line fault) or entire column
(e.g. bit line fault).
Disadvantages: Very expensive from both a memory overhead and a BIRA overhead point of
view. Can only be justified for very large memories and generally for less mature processes.
Memory redundancy options
As the amount and type of redundancy may only be determined later in the design process, it is important
to ensure that the chosen family of memories or memory compiler offers a wide enough range of
redundancy options.

Test algorithms
The repair process is contingent on proper screening of all defects. It is important therefore to ensure
that test algorithms suitable for the chosen memories and foundry process are used. In most cases BIST is
used for memory testing and therefore the test algorithms supported by a given BIST solution should be
examined. If new memory designs or new processes are to be used, it may be valuable to adopt a BIST
solution that supports soft algorithm programming.

BIST and BIRA IP performance
Finding delay-related defects requires that the BIST and BIRA (repair) IP operate at the operating
frequencies of the memories under test. This should be investigated taking into account the chosen
foundry process. In particular, BIRA logic for column redundancy analysis tends to be slow and is often
the limiting factor.

BIST and BIRA area overhead
Area overhead numbers should be examined when choosing BIST and BIRA solutions. BIST solutions should
be examined for how well they are able to share test resources across multiple memories. BIRA solutions
should be examined for area efficiency for the chosen amount and type of redundancy.

Design automation
If a design contains a large number of memories then the level of design automation that is provided with
the chosen BIST and self-repair solutions becomes very important. Comprehensive automation can result in
significant design schedule savings.

Memory Vendor Support
Choosing BIST and self-repair solutions that support a large variety of memory vendor IP can be very
useful as the choice of memory vendor and/or IP may change from one design to the next. It is also not
uncommon to have memories from different vendors within the same design.
Reparable Memories supported by ETMemory-Repair?
Lists the memory types that are fully supported by LogicVision’s ETMemory-Repair BISR solution. This
support has been fully verified by both LogicVision and the memory IP provider.

Automated LV Memory Library File generation
Indicates whether the vendor’s compilers automatically generate LogicVision memory library file
descriptions of the memories. If this is not available for some memories, the memory library file can be
generated manually or provided as a service by LogicVision.
Please Note: Memory vendors are listed in alphabetical order. LogicVision does not endorse nor
recommend any particular memory vendor. Information is compiled from publicly available
data. The reader is encouraged to directly contact any of the memory vendors for the most
up to date information and product updates.
Supported Redundancy Types                          Multi-Row           Multi-Row           None
                                                    Single Column I/O   Dual Column I/O
                                                    Row and Column      Row and Column
Reparable Memories supported by ETMemory-Repair?    YES                 YES                 YES
Automated LV Memory Library File generation         Select Memories     All Memories        NO
Multi-Row Multi-Row
Supported Redundancy
Bank Single Column I/O Single Column I/O
Types
Row and Column
FOUNDRY INFO
This section provides a cross reference between memory IP vendors and the memory types they provide
for each of the most popular foundries. For easy access, the information is provided in table form in the
following two tables. The fields contained in these two tables are described below.
Memory Vendors
For each technology node, this field lists the 3rd party memory IP vendors that provide memories for each
listed foundry. Details on the memories provided by each memory IP vendor and the specific process
nodes supported (low power, high speed, etc.) can typically be found on the vendor’s website. Most listed
vendors provide both single port and double port SRAMs.

Reparable Memory Vendors
For each technology node, this field lists the 3rd party memory IP vendors that provide reparable
memories for each listed foundry. Details on the memories provided by each memory IP vendor and the
specific process nodes supported (low power, high speed, etc.) can typically be found on the vendor’s
website. Most listed vendors provide both single port and double port reparable SRAMs, and most provide
row only, column only, as well as row and column redundancy options.
Please Note: Foundries and memory vendors are listed in alphabetical order. LogicVision does not
endorse nor recommend any particular foundry or memory vendor. Information is compiled
from publicly available data. The reader is encouraged to directly contact any of the
foundries and memory vendors for the most up to date information and product updates.
ARM / Artisan
ARM / Artisan Dolphin ARM / Artisan
Memory
Virage Logic Virage Logic
vendors
180nm
Reparable
Dolphin
Memory
vendors
130nm
90nm
Memory Dolphin
vendors TSMC
65nm
Reparable
Dolphin
Memory
TSMC
vendors
REFERENCES
[1] S. Pateras, “Best Practices for Cost Effective Test and Yield Optimization of Embedded Memories”,
FSA Forum, vol. 13, no. 4, December 2006.
[2] J.A. Cunningham, “The Use and Evaluation of Yield Models in Integrated Circuit Manufacturing”,
IEEE Transactions on Semiconductor Manufacturing, vol. 3, no. 2, May 1990, pp. 60-71.
[3] S. Shoukourian, V. Vardanian and Y. Zorian, “SoC Yield Optimization via an Embedded-Memory Test
and Repair Infrastructure”, IEEE Design & Test of Computers, May-June 2004, pp. 200-207.