single, bc_wc, and on_chip_variation analysis modes in PrimeTime. It will also explain how these three analysis modes are affected by the chosen slew propagation mode (worst_slew or worst_arrival).
Two slew propagation modes
Timing paths and their proper analysis
Three timing analysis modes
Potential for optimism in the
Figure 1: Buffer To Buffer Example Circuit
The rising cell delay across cell U1 and the rising net delay across net n1 can be shown graphically by the waveforms in Figure 2:
Figure 2: How Cell And Net Delay Arcs Are Measured
The dashed lines in Figure 2 represent the points at which the waveforms cross the delay threshold voltage (also called the delay trip point), which is typically 50% of the rail voltage. The cell and net delays represent the amount of time between these voltage threshold crossing points. We can also see the slew degradation, which is the slowdown of the slew rate due to resistance as the signal travels along the wire.
detailed RC delay calculation to calculate the slews and delays. Delay calculation is performed in stages, where a stage is defined as a driving cell and the driven net. The input transition is applied to the driving cell's input. The response is computed at the driver cell's output pin and the downstream loads, and the cell/net delays and pin slews are derived from the responses:
Figure 3: Measuring Stage Response To Determine Delays And Slews
Stage delays and slews are a function of the input transition rate and the characteristics of the parasitic network being driven. Since the parasitic networks are fixed, the slews are the determining factor in the delay/slew characteristics of the logic. Slews are propagated from stage to stage in a forward direction to determine the timing of all stages. Since the output slews of a stage are influenced by the input slew, varying the slew at any point will affect the slews/delays for several downstream stages.
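The downstream effect of a slew change can be illustrated with a toy model. This is a minimal sketch, assuming a purely linear stage model; the function name propagate_slews and every coefficient are hypothetical and do not come from any library or from PrimeTime's delay calculator.

```python
# Toy model only: the linear coefficients below are hypothetical, not library data.
def propagate_slews(input_slew, stages):
    """Return a list of (stage_delay, output_slew) for a chain of stages.

    Each stage is (d0, kd, s0, ks), modeled as
        delay    = d0 + kd * in_slew
        out_slew = s0 + ks * in_slew
    with all values in ns.
    """
    results = []
    slew = input_slew
    for d0, kd, s0, ks in stages:
        delay = d0 + kd * slew
        slew = s0 + ks * slew  # this output slew feeds the next stage
        results.append((delay, slew))
    return results


# Four identical, hypothetical buffer stages.
CHAIN = [(0.10, 0.50, 0.05, 0.40)] * 4

fast_input = propagate_slews(0.10, CHAIN)
slow_input = propagate_slews(0.50, CHAIN)

for stage, ((d_fast, _), (d_slow, _)) in enumerate(zip(fast_input, slow_input), start=1):
    print(f"stage {stage}: extra delay from slower input slew = {d_slow - d_fast:.4f} ns")
```

In this sketch the slower input slew adds delay to every stage, and the added delay shrinks stage by stage because each stage only passes on part of the input slew difference.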
Figure 4: Propagating Slews Through The Design
In the string of buffers above, the propagation of slews is straightforward. We take the output slew from each gate's output, send it down the wire, and feed it into the next gate's input. What happens, however, when two slews arrive at the same point? This can happen at a combinational gate's output or at a load pin of a multidriven net. PrimeTime (and for that matter, all static timing analysis tools) must choose one of these slews to propagate forward. These points where a slew must be chosen are called slew merge points, as shown in Figure 5:
Figure 5: Examples Of Slew Merge Points
Let's take a closer look at the most common type of slew merge point, where multiple arcs arrive at a gate output:
Figure 6: Slew Merge Point At Cell Output Pin
In this example, arcs (a) and (b) each result in a unique slew arriving at output pin U1/Z. Slew (a) at U1/Z arrives first and has a slow rise rate. Slew (b) at U1/Z arrives last and has a fast rise rate. The decision of which slew to propagate is a crucial one, as the output slew directly controls the cell delays and slews of the downstream logic cone. Which slew do we propagate forward from U1/Z into U2/A for a max-delay calculation? There are two slew propagation modes in PrimeTime:
1. worst_slew propagation - In this mode, the worst slew is chosen and propagated forward. This is the slowest (numerically largest) slew for max delays, and the fastest (numerically smallest) slew for min delays. For our example circuit, we would propagate the slower slew (a) forward into U2/A for a max-delay calculation. This is the default mode for PrimeTime and PrimeTime SI.
2. worst_arrival propagation - In this mode, the slew with the worst arrival time is chosen and propagated forward. This is the latest-arriving slew for max delays, and the earliest-arriving slew for min delays. In this case, the faster slew (b) arrives last at U1/Z, and would be propagated forward into U2/A for a max-delay calculation.
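The selection rule at a slew merge point can be sketched in a few lines of Python. This is an illustrative model only: the Arrival type, the select_slew function, and the numeric arrival/slew values are hypothetical, chosen so that slew (a) arrives first with a slow transition and slew (b) arrives last with a fast transition.

```python
from typing import NamedTuple


class Arrival(NamedTuple):
    name: str
    arrival: float  # arrival time at the merge point, ns (hypothetical value)
    slew: float     # transition time, ns; numerically larger means slower


def select_slew(candidates, mode, delay_type):
    """Pick the slew to propagate forward from a slew merge point.

    mode:       "worst_slew" or "worst_arrival"
    delay_type: "max" or "min"
    """
    if mode == "worst_slew":
        key = lambda c: c.slew      # compare transition times
    elif mode == "worst_arrival":
        key = lambda c: c.arrival   # compare arrival times
    else:
        raise ValueError(f"unknown slew propagation mode: {mode}")
    if delay_type not in ("max", "min"):
        raise ValueError(f"unknown delay type: {delay_type}")
    pick = max if delay_type == "max" else min
    return pick(candidates, key=key)


a = Arrival("a", arrival=1.0, slew=0.40)  # arrives first, slow rise
b = Arrival("b", arrival=1.5, slew=0.10)  # arrives last, fast rise

for mode in ("worst_slew", "worst_arrival"):
    for delay_type in ("max", "min"):
        chosen = select_slew([a, b], mode, delay_type)
        print(f"{mode:13s} {delay_type}: propagate slew ({chosen.name})")
```

For a max-delay calculation this model propagates slew (a) in worst_slew mode and slew (b) in worst_arrival mode, matching the example above.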
Let's take a look at how this slew propagation setting would affect the max-delay propagation of the timing paths through U1/A and U1/B:
In worst_slew mode, the slowest slew is used for edges (1) and (2). This is accurate for the timing path through edge (1), and conservative for the timing path through edge (2).
In worst_arrival mode, the faster and later-arriving slew is used for edges (3) and (4). This is accurate for the timing path through edge (4), but is optimistic for the timing path through edge (3).
worst_arrival enables us to track slews on a per-clock domain basis, since arrival times are measured against a reference launching event (our clock edge). As a result, the memory and runtime requirements of worst_arrival are greater than those of worst_slew.
Note:
worst_slew is the default slew propagation mode in PrimeTime and PrimeTime SI for the following reasons:
o The resulting timing correctly bounds (that is, it will never be optimistic) the analysis for subcritical paths, although the critical path is more accurate in worst_arrival mode
o The memory and runtime requirements are less than those of worst_arrival mode
o The transition times selected for signal integrity effects correctly bound the analysis
worst_arrival is typically used by expert users for specific analysis runs. The slew propagation mode is controlled by the variable timing_slew_propagation_mode. For more information on slew propagation, see the man page for this variable.
Setup paths are paths where the checked signal edge must be stable for some time (the setup time) before the capturing edge. In simple terms, this makes sure the launched edge gets to the capture point soon enough. Setup paths include normal data-to-clock setup paths, the assertion of data-to-data and clock gating checks, and asynchronous recovery checks. For proper analysis, setup paths must check the latest launching edge against the earliest capturing edge.
Hold paths are paths where the checked signal edge must be stable for some time (the hold time) after the capturing edge. In simple terms, this makes sure the launched edge does not
arrive at the capture point too soon. Hold paths include normal data-to-clock paths, the deassertion of data-to-data and clock gating checks, and asynchronous removal checks. For proper analysis, hold paths must check the earliest launching edge against the latest capturing edge. An example setup/hold path is shown in Figure 8:
Figure 8: Example Setup/Hold Path
The launch portion of this path consists of all cells/nets between the clock port and FF2/D (U1, U2, FF1 and U4). The capture portion consists of all cells/nets between the clock port and FF2/CLK (U1, U3 and FF2). The CLK->D setup/hold arcs in the capturing sequential device are part of the capturing portion of the path. In this case, buffer cell U1 is common to both the launch and capture paths.
As mentioned previously, setup paths must check the latest launching edge against the earliest capturing edge for a proper analysis. This means that we must combine the slowest possible delays along our launch path with the fastest possible delays along our capture path. In our example path above, we must find the slowest possible launch path to the data pin of FF2, and the fastest possible capture path to the clock pin of FF2. We would then check to see if this latest-arriving launch edge arrives in time to be captured by the earliest possible capture edge.
Hold paths must check the earliest launching edge against the latest capturing edge. This means that we must combine the fastest possible delays along our launch path with the slowest possible delays along our capture path. In our example path above, we must find the fastest possible launch path to the data pin of FF2, and the slowest possible capture path to the clock pin of FF2. We would then check to see if this earliest-arriving launch edge arrives late enough to avoid being captured by the previous cycle's latest possible capture edge.
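The pairing of launch and capture edges described above can be written as two slack formulas. The following is a minimal sketch, assuming a simple single-cycle setup check and a same-edge hold check; the function names, the clock period, the setup/hold times, and all delay values are hypothetical and are not taken from the figure.

```python
def setup_slack(launch_latest, capture_earliest, period, setup_time=0.0):
    """Setup check: latest launch arrival vs. earliest capture edge.

    Assumes the capture edge is one clock period after the launch edge.
    """
    required = capture_earliest + period - setup_time
    return required - launch_latest


def hold_slack(launch_earliest, capture_latest, hold_time=0.0):
    """Hold check: earliest launch arrival vs. latest capture edge (same edge)."""
    required = capture_latest + hold_time
    return launch_earliest - required


# Hypothetical min/max arrival times in ns at FF2/D (launch) and FF2/CLK (capture).
launch_min, launch_max = 2.0, 3.5
capture_min, capture_max = 1.0, 1.4

print("setup slack:", setup_slack(launch_max, capture_min, period=5.0, setup_time=0.2))
print("hold slack: ", hold_slack(launch_min, capture_max, hold_time=0.1))
```

Note that the setup check consumes the max launch value and the min capture value, while the hold check consumes the min launch value and the max capture value. A negative result from either function indicates a violation.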
The selected analysis mode controls two very fundamental aspects of the timing analysis:
o Whether min slews or max slews (or both) are selected at the slew merge points for propagation
o How the resulting min-delay or max-delay arcs are combined to form setup and hold timing paths for analysis
Let's examine how each analysis mode affects timing analysis. The folders represent a database of cell delay arcs which are calculated and stored inside PrimeTime.
The single analysis mode analyzes a single operating corner. This goes back to the first releases of Design Compiler over a decade ago, when it was the only available analysis mode. In the single mode every timing arc is evaluated once using the "max" stimuli:
o Max lumped capacitive loads are used if they are annotated
o Max pin loads or receiver model characteristics are always used
o Max slew propagation is performed at slew merge points
Figure 9: Delay Arcs Used In PrimeTime's single Analysis Mode
As a result, all delay and transition values in the single analysis mode represent the worst-case (slowest) timings at that single corner.
Both setup and hold paths use the computed max-delay arcs. Setup paths use the longest path through these arcs for launch, and the shortest path for capture. Hold paths use the shortest path through the arcs for launch, and the longest path for capture.
The bc_wc analysis mode analyzes two operating corners simultaneously. In the bc_wc mode every timing arc is evaluated twice, once using the "max" stimuli and once using the "min" stimuli:
o Min lumped capacitive loads are used for the min arcs, and max lumped capacitive loads are used for the max arcs (if annotated)
o Min pin caps or receiver models are used for the min arcs, and max pin caps or receiver models are used for the max arcs
o Min slew propagation is performed at the slew merge points for min delays, and max slew propagation is performed at slew merge points for max delays
In the bc_wc mode, the two corners can represent two PVT (process/voltage/temperature) corners which cannot physically coexist at the same time. For example, the min corner could be configured at 0 C and 1.3 V, while the max corner could be configured at 100 C and 1.1 V. The two corners in
Setup paths use the longest path through the max-delay arcs for launch, and the shortest path through the max-delay arcs for capture. Hold paths use the shortest path through the min-delay arcs for launch, and the longest path through the min-delay arcs for capture. In other words, the bc_wc analysis mode only checks setup at the max corner, and hold at the min corner. It is important to remember that setup paths are not checked at the min corner, and hold paths are not checked at the max corner. This could miss timing violations due to differences in how the launch and capture paths track the PVT difference between the corners.
The on_chip_variation analysis mode analyzes a single operating corner, considering the variation in arc timing which can exist within that corner. Just as in bc_wc mode, in on_chip_variation mode every timing arc is evaluated twice, once using the "max" stimuli and once using the "min" stimuli:
o Min lumped capacitive loads are used for the min arcs, and max lumped capacitive loads are used for the max arcs
o Min slew propagation is performed at the slew merge points for min delays, and max slew propagation is performed at slew merge points for max delays
In the on_chip_variation mode, the min and max corners represent two conditions which can physically coexist on the die at the same time. For example, the min corner could be configured at 98 C and 1.22 V, while the max corner could be configured at 102 C and 1.18 V. Unlike the bc_wc mode, these two conditions establish the ranges for possible delays and slews. The actual delays and slews on the chip could be anywhere between these min/max bounds.
Setup paths use the longest path through the max-delay arcs for launch, and the shortest path through the min-delay arcs for capture. Hold paths use the shortest path through the min-delay arcs for launch, and the longest path through the max-delay arcs for capture.
When performing single-corner analysis, Physical Compiler (and all Design Compiler versions since the 1998.02 synthesis release) track min and max slews separately. This is consistent with the behavior of PrimeTime's analysis using the on_chip_variation analysis mode when a single operating condition is applied. If Design Compiler or Physical Compiler is configured for min/max corner analysis using the -min/-max commands, it is similar to checking hold paths in a fast (min) corner on_chip_variation run, and checking setup paths in a slow (max) corner on_chip_variation run in PrimeTime. Only a single library is used within each corner.
Figure 12: Delay Arcs Used In DesignTime's min/max Analysis Mode (used by Physical Compiler, Design Compiler, and IC Compiler)
It is important to note that DesignTime's min/max mode refers to min and max corners. Hold paths are only checked at the min corner, but on-chip variation within the min corner is included in the analysis. Likewise, setup paths are only checked at the max corner, but on-chip variation within the max corner is included in the analysis.
The three analysis modes can be summarized in the following two charts:
Table 1
analysis mode | setup launch path | setup capture path
single | slowest path through max-delay arcs, single operating condition, no derating | fastest path through max-delay arcs, single operating condition, no derating
bc_wc | slowest path through max-delay arcs, worst-case operating condition, late derating | fastest path through max-delay arcs, worst-case operating condition, early derating
on_chip_variation | slowest path through max-delay arcs, worst-case operating condition, late derating | fastest path through min-delay arcs, best-case operating condition, early derating
Table 2
analysis mode | hold launch path | hold capture path
single | fastest path through max-delay arcs, single operating condition, no derating | slowest path through max-delay arcs, single operating condition, no derating
bc_wc | fastest path through min-delay arcs, best-case operating condition, early derating | slowest path through min-delay arcs, best-case operating condition, late derating
on_chip_variation | fastest path through min-delay arcs, best-case operating condition, early derating | slowest path through max-delay arcs, worst-case operating condition, late derating
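The arc selection in the two charts can also be captured as a small lookup table. The sketch below is an illustrative encoding only; the dictionary and function names are hypothetical. It flags a path leg as potentially optimistic when a "fastest" leg is taken through max-delay arcs or a "slowest" leg is taken through min-delay arcs.

```python
# (launch arcs, capture arcs) used by each analysis mode for each check type.
ARC_SELECTION = {
    "single":            {"setup": ("max", "max"), "hold": ("max", "max")},
    "bc_wc":             {"setup": ("max", "max"), "hold": ("min", "min")},
    "on_chip_variation": {"setup": ("max", "min"), "hold": ("min", "max")},
}

# Setup wants the slowest launch (max arcs) and the fastest capture (min arcs);
# hold wants the fastest launch (min arcs) and the slowest capture (max arcs).
IDEAL = {"setup": ("max", "min"), "hold": ("min", "max")}


def optimistic_legs(mode, check):
    """Return the path legs ("launch"/"capture") that may be optimistic."""
    used = ARC_SELECTION[mode][check]
    ideal = IDEAL[check]
    return [leg for leg, u, i in zip(("launch", "capture"), used, ideal) if u != i]


for mode in ARC_SELECTION:
    for check in ("setup", "hold"):
        legs = optimistic_legs(mode, check) or ["none"]
        print(f"{mode:18s} {check:5s}: potentially optimistic legs = {', '.join(legs)}")
```

In this encoding only on_chip_variation reports no potentially optimistic legs for both setup and hold.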
Figure 13 shows a setup timing path being analyzed in the single and bc_wc analysis modes. We know that in these analysis modes, max delays are used for all timing information (slews and delays) in setup paths. In a setup timing path, the launch path should be as slow as possible and the capture path should be as fast as possible. Our clock mux has a fast slew and a slow slew at its inputs. This will result in a fast slew and a slow slew at its output as well. Since we always propagate max slews in the single mode and for setup paths in the bc_wc mode, the slow slew will be propagated into downstream cells U4 and U5. This will not yield the fastest possible timing for these gates. When we time a path captured by CLK1, these gates will not be as fast as they would behave in actual operation, and this optimism could result in a missed setup violation.
Figure 14 shows a hold timing path being analyzed in the single analysis mode. We know that in the single analysis mode, max delays are used for all timing information (slews and delays) in all paths. In a hold timing path, the launch path should be as fast as possible and the capture path should be as slow as possible. Note, however, that our AND gate has a fast and slow slew at its inputs. This will result in a fast slew and a slow slew at its output as well. Since we always propagate max slews in the single analysis mode, the slow slew will be propagated into downstream cells U7 and U8, which will not yield the fastest possible timing for these gates. When we time a path launched by FF1, these gates will not be as fast as they would behave in actual operation, and this optimism could result in a missed hold violation.
Figure 15 shows a hold timing path being analyzed in the bc_wc analysis mode. We know that in the bc_wc analysis mode, min delays are used for all timing information (slews and delays) in hold paths. In a hold timing path, the launch path should be as fast as possible and the capture path should be as slow as possible. Our clock mux has a fast slew and a slow slew at its inputs. This will result in a fast and slow slew at its output as well. Since we always propagate min slews for hold paths in the bc_wc analysis mode, the fast slew will be propagated into downstream cells U4 and U5, which will not yield the slowest possible timing for these gates. When we time a path captured by CLK2, these gates will not be as slow as they would behave in actual operation, and this optimism could result in a missed hold violation.
Example 4: Launch/Capture Races Across PVT Variation
In the past, it was often true that designers would only check hold times at the fast corner. When using the bc_wc analysis mode, setup is only checked at the slow corner and hold is only checked at the fast corner. The cross-checks (hold at slow, setup at fast) are not performed. To see why this is not safe, let's take a look at an example.
launched by CLK, and is captured by a divide-by-2 version of the clock. If we abstract this path into delays (and assume zero delays/check values in the sequential devices), we can represent the path as follows:
Figure 17: Hold Path Timing In The Fast Corner
We can see that the launching clock edge goes through 4 ns of clock delay and 2 ns of data delay, for a launch arrival of 6 ns at FF2. Our capture edge goes through 6 ns of total clock delay, which results in a required time for our data at FF2 of 6 ns. Luckily, our timing is just sufficient to meet this requirement, and we pass the hold time check with zero slack. Now let's analyze this same hold path at our slow corner:
Figure 18: Hold Path Timing In The Slow Corner
We know that the logic will slow down as we move the timing path from fast PVT conditions to slow PVT conditions. This slowdown will not, however, be a completely linear effect across all library cells. Different library cells will react to PVT changes differently. In our example above, the timing of one of the capture path segments has slowed down more than the other segments. As a result, our arrival time at FF2 is 9 ns but our required time is 10 ns, resulting in a hold time failure of -1 ns. This violation would be missed by the bc_wc analysis mode. This issue can affect setup paths too, but it is much less likely due to the difference in total propagation times between the launch and capture legs of a setup path.
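The arithmetic of this example can be reproduced with a short script. The arrival and required times below are the totals quoted in the text (sequential delays and check values assumed to be zero, as stated above); the variable names and the list of corners at which bc_wc checks hold are an illustrative encoding, not tool output.

```python
def hold_slack(arrival, required):
    """Hold slack in ns: data must arrive no earlier than the required time."""
    return arrival - required


# Totals from the example, in ns.
corners = {
    "fast": {"arrival": 4.0 + 2.0, "required": 6.0},  # 4 ns clock + 2 ns data vs. 6 ns capture clock
    "slow": {"arrival": 9.0, "required": 10.0},
}

slacks = {name: hold_slack(**times) for name, times in corners.items()}

# bc_wc checks hold only at the fast (min) corner.
BC_WC_HOLD_CORNERS = ["fast"]
missed = [name for name, slack in slacks.items()
          if slack < 0 and name not in BC_WC_HOLD_CORNERS]

for name, slack in slacks.items():
    print(f"{name} corner hold slack: {slack:+.1f} ns")
print("hold violations not checked by bc_wc:", missed)
```

The fast corner passes with zero slack, while the slow corner fails by 1 ns and is not examined when hold is checked only at the fast corner.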
When specifying a device, the output conditions are typically given as a range of board-level capacitive loading values. The device must be able to tolerate any load within the range at any output port. Consider a clock-and-data output path at the top level:
Figure 19: Board Level Loading Ranges At The Top Level
To check setup timing in the path above properly, we need the slowest launch (data) timing and the fastest capture (clock) timing. This means that 10 pF loading should be used for DATOUT and 2 pF loading should be used for CLKOUT. To check hold timing properly in the path above, we need the fastest launch (data) timing and the slowest capture (clock) timing. This means that 2 pF loading should be used for DATOUT and 10 pF loading should be used for CLKOUT.
In the single analysis mode, we are always computing max delays. As a result, the 10 pF loading is used for all output ports. This would result in an optimistically slow capture path for the setup analysis, and an optimistically slow launch path for the hold analysis.
For setup paths in the bc_wc analysis mode, max delays are computed for all delays. As a result, the 10 pF loading is used for both the launch and capture sides of the setup path. This would lead to an optimistically slow capture path. For hold paths, min delays are computed for all delays. As a result, the 2 pF loading is used for both the launch and capture sides of the hold path. This would lead to an optimistically fast capture path.
This analysis is performed properly by the on_chip_variation analysis mode. For setup paths, 10 pF loading is used for DATOUT and 2 pF loading is used for CLKOUT. For hold paths, 2 pF loading is used for DATOUT and 10 pF loading is used for CLKOUT.
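The load selection described for this example can be summarized in a small function. This is a sketch of the reasoning only, using the 2 pF to 10 pF range and the DATOUT/CLKOUT port names from the example; the function name port_loads and its structure are hypothetical.

```python
LOAD_MIN_PF, LOAD_MAX_PF = 2.0, 10.0  # board-level loading range from the example


def port_loads(mode, check):
    """Return the load (pF) each output port sees for a given mode and check."""
    if check not in ("setup", "hold"):
        raise ValueError(f"unknown check: {check}")
    if mode == "single":
        # Max delays only: the max load is used everywhere.
        return {"DATOUT": LOAD_MAX_PF, "CLKOUT": LOAD_MAX_PF}
    if mode == "bc_wc":
        # Max arcs for setup, min arcs for hold, on both launch and capture.
        load = LOAD_MAX_PF if check == "setup" else LOAD_MIN_PF
        return {"DATOUT": load, "CLKOUT": load}
    if mode == "on_chip_variation":
        # Launch (data) and capture (clock) legs use opposite extremes.
        if check == "setup":
            return {"DATOUT": LOAD_MAX_PF, "CLKOUT": LOAD_MIN_PF}
        return {"DATOUT": LOAD_MIN_PF, "CLKOUT": LOAD_MAX_PF}
    raise ValueError(f"unknown analysis mode: {mode}")


for mode in ("single", "bc_wc", "on_chip_variation"):
    for check in ("setup", "hold"):
        print(f"{mode:18s} {check:5s}: {port_loads(mode, check)}")
```

Only the on_chip_variation rows match the loads that the text identifies as required for a proper setup and hold check.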
It is a common misconception that the on_chip_variation analysis mode is only needed when using the set_timing_derate command. In fact, there are multiple potential sources for on-chip timing variation, including (but not limited to):
o Using from set_driving_cell or set_input_transition
o Min/max annotated lumped loads (set_load -min/-max)
o Min/max annotated slews (set_annotated_transition -min/-max)
o Min/max annotated delays (set_annotated_delay -min/-max)
o Reading and using both the min and max triplet arc delays from an SDF file
o Delay calculation for all multiple-input gates, which result in more than one slew at the output(s)
o Crosstalk aggressions which can induce speedups/slowdowns in slews as they travel along wires
The on_chip_variation analysis mode is a must for accurate analysis of modern designs. Even if you are not specifying different min/max operating conditions or early/late timing derates, on-chip variation due to min/max slew propagation must still be taken into account and properly analyzed. At a minimum, it is important to understand that every multiple-input gate will have multiple slews at its output. The on_chip_variation analysis mode ensures that these delays and slews are captured and computed correctly. In Tables 1 and 2 above, any time the "slowest path through min-delay arcs" or the "fastest path through max-delay arcs" is used, the potential for optimism exists (as we have seen in the examples above). For a thorough analysis, both setup and hold should be checked at each corner using the on_chip_variation analysis mode.
As we have seen, slew propagation plays a crucial role in determining the accuracy of the timing analysis. For more information on the path-based slew propagation technology in PrimeTime, refer to the following article: 012134: Accurate Sign-Off Analysis with PrimeTime's Path-Based Analysis
modifying the single operating condition to be similar to on_chip_variation. In versions 2000.11 and later, setting this variable to true explicitly switched the analysis to the on_chip_variation analysis mode and issued an informational message:
pt_shell> set timing_propagate_single_condition_min_slew true
Information: Issuing set_operating_conditions equivalent to timing_propagate_single_condition_min_slew setting. (PTE-037)
set_operating_conditions -analysis_type on_chip_variation -library [get_libs {slow.db:slow}] -min slow -max slow
true
pt_shell>
In PrimeTime V-2003.12 and earlier releases, the default value of this variable was false, which left the single analysis mode unchanged. In the PrimeTime V-2004.06 release, the default value of this variable was changed to true so that the more accurate on_chip_variation would become the default analysis mode. However, this change in the default value caused compatibility issues with customer scripts.
To address the compatibility issue and still provide an accurate default analysis, the behavior in PrimeTime W-2004.12 is as follows:
o The variable is now hidden. This means you can set its value, but you cannot query it. Although it is hidden, its effective default behavior is now true. If you do not specify an analysis mode, the on_chip_variation analysis mode will be used.
o The set_operating_conditions command now takes precedence over this variable. If you set an analysis mode with this command, the specified analysis mode is set, independent of the value of the timing_propagate_single_condition_min_slew variable.
In the X-2005.06 release this variable will no longer be present, as you should be explicitly setting the analysis mode to the desired type. Note that it is strongly encouraged that only the on_chip_variation analysis mode be used for signoff-quality static timing analysis. If you do not specify an analysis mode, on_chip_variation will be used by default.
Question: Can different delays be calculated for the same common buffers in the launch and capture clock paths?
Answer:
Yes, different delays can be calculated for the same common buffers in the launch and capture clock paths due to crosstalk effects. When crosstalk analysis is enabled, for a setup (non-zero cycle) check the aggressor switching can be different and can affect the launch and capture signals in different ways. However, for a hold (zero-cycle) check, the launch and capture delays always match because the aggressor switching cannot affect the same clock edge in different ways.