
Dynamic Power Analysis of Custom Macros
Stephen Bijansky
Bassam Mohd
Baker Mohammad

Outline

Motivation

HSIM Power Analysis

ESP-CV Power Analysis

ESP-CV Flow

Results

Conclusions

Motivation

Power characterization is an important part of low-power design

Active power is hard to model for custom macros designed at the transistor level
Spice-level simulation is slow
Characterizing all custom cells is a big task
The ASIC design flow needs a detailed gate-level model
Changes at the top level affect macro power
Need ongoing modeling of power with new stimulus

Overview

Power estimation for custom macros
Transistor-level schematics
Post-layout capacitance extraction

Reduce analysis time

Improve accuracy for long test cases

This work is used extensively in Qualcomm's 45nm low-power DSPs

Traditional Approach (.lib)

Fast SPICE simulator HSIM
Assume certain activities on data
Append power into lib files
Conditional statements based on control signals
Limitations of conditional statements
Must be mutually exclusive
Power depends on internal state nodes
Has the macro just come out of reset, or has it been running for a while?
Potentially 2^(M+N) entries in the lib file

Cont. Traditional Approach (HSIM)


Fast SPICE simulator
Accuracy within 2% to 3% of HSPICE
Use HSIM to run the entire power benchmark
Power benchmark might be thousands of cycles
Potential for long run time
Large macros could take days or weeks
Reduce benchmark to only 100 cycles
Which 100 cycle window should be used?
Power estimate could be too high or too low
Setting initial conditions can be time consuming and error prone

1st Order Power Equation

Power = Activity Factor * Cap * Voltage^2 * Freq

Capacitance: from LPE

Voltage: fixed

Frequency: fixed

Activity Factor: unknown
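
As a worked instance of the first-order equation, here is a minimal Python sketch that evaluates it for a single node. The function name and the numeric values (15% activity, 100 fF, 1.0 V, 500 MHz) are illustrative assumptions, not figures from this presentation.

```python
def first_order_power(activity_factor, cap_farads, voltage, freq_hz):
    """First-order dynamic power in watts: AF * C * V^2 * f."""
    return activity_factor * cap_farads * voltage ** 2 * freq_hz


# Hypothetical node: 15% activity, 100 fF, 1.0 V supply, 500 MHz clock.
power_w = first_order_power(0.15, 100e-15, 1.0, 500e6)
print(f"{power_w * 1e6:.2f} uW")
```

With capacitance, voltage, and frequency known, the only input that requires simulation is the activity factor.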

ESP-CV Simulation

Symbolic equivalence checking of schematics vs RTL

Input to ESP-CV is a standard Verilog testbench

Use ESP-CV as a Verilog simulator for schematics

Verilog simulation is orders of magnitude faster than Spice
Functional simulation
Only need to determine activity factor

RC Verilog switch-level simulator

[Figure: transistor symbol with gate (G), drain (D), and source (S) terminals]

HSPICE: gold standard for accuracy
HSIM: high performance for accuracy
ESPCV: functional accuracy, automated modeling
VCS: extremely fast, no timing

ESP-CV Simulation

ESP-CV converts schematic to switch-level Verilog
Special directives for transistor strengths
Internal node names in a custom macro are not in RTL
ESP-CV uses the internal nodes in the schematic

Run entire benchmarks using thousands of cycles
Same benchmarks used in PT-PX for power estimation of synthesized logic
Includes reset and initialization

Fast run time allows running many more benchmarks

Flow Steps

Inputs to the flow
Spice netlist for the macro, from custom macro design (e.g. xmp d g s b qc_pch l=40e-9 w=120e-9)
fsdb from top-level VCS sims (full-chip simulation)
Cap file

Flow
Vtran converts the fsdb into a Verilog test bench for the macro interface (VCD on the macro boundaries, e.g. T=0 010101001, T=1 101010101)
*GV file from ESPCV
ESPCV simulates the Verilog test bench
VCD dump of all nodes
Vcd2saif produces the node AF (SAIF)
Cap file from Nanotime
Power_calc_script
Power value (avg, peak, static)

Output: power value in W

Integrate the flow with the PTPX chip-level run
set_annotated_power -internal_power 2.452e-02

Flow Steps

RTL Simulation -> Create Verilog Test Bench -> ESP-CV Switch Level Verilog Simulation -> Calculate Activity Factor -> Node Caps -> Calculate Power

RTL Simulation and Testbench Creation

Entire benchmark is simulated for the top-level design
Verilog VCS simulation
Starts from reset, performs initialization, then runs the benchmark
Single fsdb dump file for each benchmark

Vtran converts the fsdb dump of the benchmark to a Verilog testbench
Macro testbench has all of the same inputs as the top-level simulation

Calculate Activity Factor

Process the ESP-CV VCD dump file and calculate an activity factor for each node

Vcd2saif produces a Switching Activity Interchange Format (SAIF) file
Time spent at 0/1/Z and number of transitions
Computed for only the window of interest

Process the SAIF file to get the activity factor for each node
Activity factor = transitions / number of cycles
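
The transitions-per-cycle calculation can be sketched in a few lines of Python. This is a minimal illustration that starts from per-node transition counts already extracted from the SAIF file; the function name, the node names, and the counts are hypothetical. The optional `transitions_per_unit=2` setting is an assumption: it reproduces the convention on the IR drop slide, where 3 transitions per 10 clock cycles is called 15% and a clock is 100%.

```python
def activity_factors(toggle_counts, num_cycles, transitions_per_unit=1):
    """Return {node: activity factor} from per-node transition counts.

    transitions_per_unit=1 gives transitions / cycles, as stated on the slide.
    transitions_per_unit=2 is an assumed alternative in which a node that
    toggles twice per cycle (a clock) has an activity factor of 1.0.
    """
    if num_cycles <= 0:
        raise ValueError("num_cycles must be positive")
    denom = transitions_per_unit * num_cycles
    return {node: count / denom for node, count in toggle_counts.items()}


# Hypothetical transition counts over a 1000-cycle window of interest.
counts = {"clk": 2000, "xblock/lclk": 600, "xblock/data0": 300}
print(activity_factors(counts, 1000))
print(activity_factors(counts, 1000, transitions_per_unit=2))
```

Whichever normalization is used, it has to match the one assumed by the power equation that consumes the activity factor.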

Node Capacitances

Calibre layout parasitic extraction (LPE)

Nanotime calculates the total cap of every node
Reads Calibre SPEF file
Adds gate, diffusion, and wire caps

Qcs_process_cap_rpt.pl
Converts the Nanotime report to an easy-to-use, column-based text file format

For nodes that do not have a full rail swing, such as bitlines, the caps can be scaled
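
The presentation does not say how reduced-swing caps are scaled. One common first-order choice, shown here purely as an assumption, is to multiply the cap by the ratio of the node's voltage swing to the supply voltage, since the charge drawn from the supply per transition is C * V_swing. The function name, node names, and values below are hypothetical.

```python
def scale_caps_for_swing(node_caps, swing_ratio_by_node):
    """Return a copy of node_caps with reduced-swing nodes scaled down.

    swing_ratio_by_node maps a node name to V_swing / V_supply (0 to 1).
    Nodes not listed are assumed to swing rail to rail and are left unchanged.
    """
    scaled = {}
    for node, cap in node_caps.items():
        ratio = swing_ratio_by_node.get(node, 1.0)
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"swing ratio for {node} must be between 0 and 1")
        scaled[node] = cap * ratio
    return scaled


# Hypothetical caps (arbitrary units) and a bitline with a 20% swing.
caps = {"clk": 0.094, "array/bl0": 0.250}
print(scale_caps_for_swing(caps, {"array/bl0": 0.2}))
```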

Calculate Power

Qcs_calc_power.pl
Combines switching activities with the capacitances to compute the power
Voltage and frequency are fixed

Output is a text file with the power, activity factor, capacitance, and name for each node
Easy to sort to determine which nodes use the most power
Retains hierarchy, so it is easy to filter
Can be partitioned to determine power on multiple supply nets
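
The combination step can be sketched as follows. This is not Qcs_calc_power.pl itself, only a minimal Python illustration of the same idea under stated assumptions: activity factors and caps arrive as dictionaries keyed by hierarchical node name, and filtering by hierarchy is done with a name prefix. All names and values are hypothetical.

```python
def node_power_report(activity, caps_farads, voltage, freq_hz, prefix=""):
    """Return [(power_w, af, cap, node)] sorted by descending power.

    Only nodes present in both inputs whose hierarchical name starts
    with prefix are reported.
    """
    rows = []
    for node, af in activity.items():
        if node in caps_farads and node.startswith(prefix):
            cap = caps_farads[node]
            rows.append((af * cap * voltage ** 2 * freq_hz, af, cap, node))
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Hypothetical activity factors and node caps; voltage and frequency fixed.
activity = {"clk": 1.0, "xblock/lclk": 0.5, "xblock/data0": 0.15}
caps = {"clk": 94e-15, "xblock/lclk": 127e-15, "xblock/data0": 20e-15}
for power, af, cap, node in node_power_report(activity, caps, 1.0, 500e6):
    print(f"{power:.3e} {af:.2f} {cap:.3e} {node}")
```

Passing `prefix="xblock/"` restricts the report to one block of the hierarchy; the same idea could partition nodes by supply net if a node-to-supply mapping were available.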

100 Cycle Validation

[Bar chart: power for Test1 through Test6, HSIM vs. ESP-CV, both over the same 100 cycle window; y-axis 0 to 2.5]

100 Cycle Validation

Run ESP-CV with the same 100 cycle window that is used for HSIM

For tests that use more than 1 mW of power, ESP-CV is within 3% of HSIM

For tests that use less than 1 mW, ESP-CV is within 0.08 mW of HSIM

ESP-CV has good correlation to HSIM
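
The two bounds quoted above can be expressed as a single check. The sketch below is a hypothetical helper, not part of the flow; it assumes the HSIM value decides which bound applies, and the power pairs in the example are made up for illustration.

```python
def within_validation_bounds(hsim_mw, espcv_mw, rel_tol=0.03,
                             abs_tol_mw=0.08, threshold_mw=1.0):
    """Relative bound above the threshold, absolute bound at or below it.

    Assumption: the HSIM result is the reference that selects the bound.
    """
    diff = abs(espcv_mw - hsim_mw)
    if hsim_mw > threshold_mw:
        return diff <= rel_tol * hsim_mw
    return diff <= abs_tol_mw


# Hypothetical (HSIM, ESP-CV) power pairs in mW.
for hsim, espcv in [(2.00, 2.05), (2.00, 2.10), (0.50, 0.56), (0.50, 0.60)]:
    print(hsim, espcv, within_validation_bounds(hsim, espcv))
```

Switching to an absolute bound for small values avoids reporting large percentage errors on tests that draw very little power.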

ESP-CV for Entire Test versus HSIM for 100 Cycles

[Bar chart: power for Test1 through Test6, HSIM (100 cycles) vs. ESP-CV (entire test); y-axis 0 to 3]

Results

100 cycles do not accurately model an entire test

Test3 reported 4.7X more power using 100 cycles compared to the entire test

Test4 reported 55% less power using 100 cycles compared to the entire test

Difficult to choose a good 100 cycle window
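
To illustrate why the choice of window matters, the following Python sketch slides a 100 cycle window over a synthetic activity trace and compares each window's average with the full-trace average. The trace (200 busy cycles followed by 800 quiet ones) is invented for illustration and is not data from the tests above.

```python
def window_averages(toggles_per_cycle, window):
    """Average toggles per cycle for every contiguous window of that length."""
    n = len(toggles_per_cycle)
    if window <= 0 or window > n:
        raise ValueError("window must be between 1 and the trace length")
    total = sum(toggles_per_cycle[:window])
    averages = [total / window]
    for start in range(1, n - window + 1):
        # Slide by one cycle: add the entering cycle, drop the leaving one.
        total += toggles_per_cycle[start + window - 1] - toggles_per_cycle[start - 1]
        averages.append(total / window)
    return averages


# Synthetic trace: 200 busy cycles (40 toggles each), then 800 quiet (5 each).
trace = [40] * 200 + [5] * 800
full_average = sum(trace) / len(trace)
averages = window_averages(trace, 100)
print(full_average, min(averages), max(averages))
```

On this synthetic trace the best and worst windows land well above and well below the full-trace average, which is the same kind of over- and underestimation reported for Test3 and Test4.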

Run Time Comparison

[Bar chart: run time in seconds (0 to 10,000) for Test1 through Test6, HSIM vs. ESP-CV; annotations: "100 cycles" and "240,000 cycles"]

Run Time Comparison

ESP-CV full test simulations
Test3 with 49,101 cycles took 406 seconds
Test4 with 240,510 cycles took 3,267 seconds
Event-based simulation scales with the number of cycles

ESP-CV 100 cycle simulations needed 21 seconds
Not many events in 100 cycles

HSIM needed between 1,950 seconds (Test5) and 9,468 seconds (Test2) to run 100 cycles
Large differences in run time with a fixed number of cycles
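
The run times above can be turned into throughput figures with simple arithmetic. The Python sketch below also extrapolates the HSIM 100 cycle times to the length of Test4; that extrapolation assumes run time scales linearly with cycle count and applies the Test5 and Test2 times to a different test, so it is only a rough bracket, not a measured result.

```python
SECONDS_PER_DAY = 86_400


def cycles_per_second(cycles, seconds):
    """Simulation throughput in cycles per second."""
    return cycles / seconds


def extrapolate_seconds(target_cycles, ref_cycles, ref_seconds):
    """Projected run time, assuming it scales linearly with cycle count."""
    return ref_seconds * target_cycles / ref_cycles


# Run times quoted on this slide.
espcv_test3 = cycles_per_second(49_101, 406)
espcv_test4 = cycles_per_second(240_510, 3_267)
# Assumed bracket: fastest and slowest HSIM 100 cycle runs scaled to Test4.
hsim_low_days = extrapolate_seconds(240_510, 100, 1_950) / SECONDS_PER_DAY
hsim_high_days = extrapolate_seconds(240_510, 100, 9_468) / SECONDS_PER_DAY

print(f"ESP-CV: {espcv_test3:.1f} and {espcv_test4:.1f} cycles/s")
print(f"HSIM projection for 240,510 cycles: {hsim_low_days:.0f} to {hsim_high_days:.0f} days")
```

Under that linear assumption the projection runs to months, consistent with the earlier point that full benchmarks are impractical in HSIM.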

IR Drop Analysis

Compute fixed activity factor power for use in Redhawk IR drop analysis
Every clock node is assigned an activity factor of 100%
Every non-clock node is assigned an activity factor of 15%, which is 3 transitions per every 10 clock cycles
This is a worst-case analysis used to stress the power grid and find its weak points
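
A minimal Python sketch of the fixed activity factor calculation follows. The 100% and 15% values come from this slide; the function name, the way clock nodes are identified (an explicit set), and the node caps are assumptions made for illustration.

```python
CLOCK_AF = 1.0       # 100% for clock nodes, per the slide
NON_CLOCK_AF = 0.15  # 15% for all other nodes, per the slide


def fixed_activity_power(caps_farads, clock_nodes, voltage, freq_hz):
    """Return {node: watts} using fixed activity factors instead of simulation."""
    power = {}
    for node, cap in caps_farads.items():
        af = CLOCK_AF if node in clock_nodes else NON_CLOCK_AF
        power[node] = af * cap * voltage ** 2 * freq_hz
    return power


# Hypothetical caps and clock list.
caps = {"clk": 94e-15, "xblock/lclk": 127e-15, "xblock/data0": 20e-15}
clocks = {"clk", "xblock/lclk"}
for node, watts in fixed_activity_power(caps, clocks, 1.0, 500e6).items():
    print(f"{node}: {watts:.3e} W")
```

Because no stimulus is involved, this number depends only on the extracted caps and on which nodes are classified as clocks.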

Conclusion

Simulate an entire benchmark instead of trying to guess at a subset of the benchmark
The wrong subset led to a 4.7X overestimation of power
Includes reset and initialization

Fast simulation enables running more benchmarks

ESP-CV is being used to generate power estimations of longer benchmarks

Future Work

Short circuit power modeling
The current flow does not address it

Leakage power modeling
Active leakage power is not accurately modeled

Enable other methods to calculate node capacitances

More calibration on different circuit families

Thank You!

Questions

Backup Slides

Nanotime Capacitance Report

# max rise CAP
NODE : clk
C_diff : 0.000
C_overlap : 0.004
C_gate : 0.003
C_wire : 0.081
C_pin : 0.006
C_total : 0.094

# max rise CAP
NODE : xblock/lclk
C_diff : 0.000
C_overlap : 0.013
C_gate : 0.012
C_wire : 0.093
C_pin : 0.009
C_total : 0.127

Process Capacitance Report

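use strict;
use warnings;
use List::Util qw(max);

# Keep the largest C_total seen for each node in the Nanotime report.
# Assumes the CAPFILE filehandle is already open for reading.
my %nodeCap;
while (my $line = <CAPFILE>) {
    next unless $line =~ /^NODE : (\S+)/;
    my $node = $1;
    # C_total is the sixth line after the NODE line.
    $line = <CAPFILE> for 1 .. 6;
    if (defined $line && $line =~ /^C_total\s*:\s*(\S+)/) {
        $nodeCap{$node} = max($1, $nodeCap{$node} // 0);
    }
}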
