Low-Power Design Techniques in Digital Systems

Low-Power Design Techniques in Digital Systems. Prof. Vojin G. Oklobdzija, University of California. Outline of the talk: power trends in VLSI; scaling theory and predictions; research efforts in power reduction; efficiency measures and design guidelines; latches and flip-flops for low power

Presentation Transcript

Slide 1

Low-Power Design Techniques in Digital Systems Prof. Vojin G. Oklobdzija University of California

Slide 2

Outline of the Talk: Power trends in VLSI; Scaling theory and predictions; Research efforts in power reduction; Efficiency measures and design guidelines; Latches and Flip-Flops for Low-Power; Dual-Edge FFs; SOI; Conclusion: Low-Power perspective

Slide 3

Power trends in VLSI

Slide 4

"CMOS circuits dissipate little power by nature." So believed circuit designers (Kuroda-Sakurai, 95).
[Figure: power (W), log scale from 0.01 to 100, versus year, 1980 to 1995; trend line of x4 every 3 years]
"By the year 2000 power dissipation of high-end ICs will exceed the practical limits of ceramic packages, even if the supply voltage can be feasibly reduced." (* Taken from Sakurai's ISSCC 2001 presentation)

Slide 5

Gloom and Doom predictions. Source: Shekhar Borkar, Intel

Slide 6

Source: Shekhar Borkar, Intel

Slide 7

Power versus Year (taken from ISSCC, uP Report, Hot-Chips): High-end growing at 25%/year; RISC @ 12%/yr; X86 @ 15%/yr; Consumer (low-end) at 13%/year

Slide 8

VDD, Power and Current Trend
[Figure: supply voltage (V, axis to 2.5), power per chip (W, axis to 200), and VDD current (A, axis to 500) versus year, 1998 to 2014]
Source: International Technology Roadmap for Semiconductors, 1999 update, sponsored by the Semiconductor Industry Association in cooperation with European Electronic Component Association (EECA), Electronic Industries Association of Japan (EIAJ), Korea Semiconductor Industry Association (KSIA), and Taiwan Semiconductor Industry Association (TSIA) (* Taken from Sakurai's ISSCC 2001 presentation)

Slide 9

Power Delivery Problem (not just California): Your car starter! Source: Shekhar Borkar, Intel

Slide 10

Trend in L di/dt: di/dt is roughly proportional to I * f, where I is the chip's current and f is the clock frequency, or equivalently I * Vdd * f / Vdd = P * f / Vdd, where P is the chip's power. The trend is: P and f increase while Vdd decreases, and on-chip L and package L decrease only slightly. Therefore, L di/dt fluctuation increases significantly. (* Taken from Norman Chang, HP)
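The proportionality on this slide can be sketched numerically. The two chip generations below are made-up illustrative values, not data from the talk; only the relation di/dt ~ P * f / Vdd comes from the slide.

```python
def relative_didt(power_w, freq_hz, vdd_v):
    """Return a quantity proportional to di/dt, using di/dt ~ P * f / Vdd."""
    return power_w * freq_hz / vdd_v

# Hypothetical generations (assumed numbers, for illustration only).
old = relative_didt(power_w=30.0, freq_hz=500e6, vdd_v=2.5)
new = relative_didt(power_w=60.0, freq_hz=1000e6, vdd_v=1.25)

# Power x2, frequency x2, and Vdd halved each contribute a factor of 2.
growth = new / old
print(f"di/dt grows by a factor of {growth:.1f}")
```

Unless the inductance L shrinks by a comparable factor, the L di/dt noise grows with it.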

Slide 11

Saving Grace! Energy-Delay product is improving more than 2x/generation

Slide 12

X86 efficiency improving dramatically: 4X/generation. Average improving 3X/generation. High-end processors' efficiency not improving.

Slide 13

Scaling theory and predictions

Slide 14

The power dissipation has increased 1000 times over the 15 years and is exceeding 70 Watts.
Scaling principles: 1. A "constant field scaling" theory [Dennard] assumes that device voltages as well as device dimensions are scaled by a scaling factor x (>1), resulting in a constant electric field in a device: power density remains constant, and circuit performance can be improved in terms of: density x^2; speed x; power 1/x^2; power-delay product 1/x^3.
Limitless progress in CMOS is promised with this scaling scenario.

Slide 15

In practice neither the supply voltage nor the threshold voltage had been scaled until 1990, leading to the theory of "constant voltage scaling", which assumes a constant voltage. This assumption yields: speed improvement by x^2; power density increases rapidly, by x^3.
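The two scaling scenarios can be compared with a first-order sketch. It assumes power per circuit is proportional to C * V^2 * f with capacitance scaling as 1/x, and takes the speed gains (x and x^2) directly from the slides; the function and key names are illustrative.

```python
def scaling_factors(x, mode):
    """First-order scaling of a circuit when dimensions shrink by 1/x (x > 1)."""
    if mode == "constant_field":
        voltage, speed = 1.0 / x, x        # voltages scale with dimensions
    elif mode == "constant_voltage":
        voltage, speed = 1.0, x ** 2       # voltages are held fixed
    else:
        raise ValueError(f"unknown mode: {mode}")
    capacitance = 1.0 / x                  # assumed: C shrinks with dimensions
    density = x ** 2                       # circuits per unit area
    power = capacitance * voltage ** 2 * speed   # P ~ C * V^2 * f per circuit
    return {
        "density": density,
        "speed": speed,
        "power": power,
        "power_density": power * density,
        "power_delay": power / speed,
    }

for mode in ("constant_field", "constant_voltage"):
    print(mode, scaling_factors(2.0, mode))
```

Under these assumptions constant-field scaling keeps power density flat, while constant-voltage scaling multiplies it by x^3, which matches the figures quoted on the slides.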

Slide 16

Constant field is not practical; x^0.5 is acceptable, but even with that the power dissipation would exceed ECL by 2001: a new theory is needed! (* Taken from Sakurai and Kuroda, IEICE 95 paper)

Slide 17

High-Performance View Point on Power (*taken from Ron Preston, DEC Alpha)
P = k C V^2 f
Shrinking to the new technology (30% reduction in λ): C decreases by 30%; f increases by 1/0.7 = 43%; P_new = 0.7 x (1/0.7) x P_old = P_old (No change in power!)
New design, double the no. of devices: P_new = 2 x 0.7 x (1/0.7) x P_old = 2 x P_old (Power doubles!)
Scale V_dd by 30% in the new design: P_new = 2 x 0.7 x (1/0.7) x (0.7)^2 x P_old = 0.98 P_old ≈ P_old (Power remains constant!)
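The arithmetic on this slide can be checked with a short sketch based on P = k C V^2 f. The function name and the normalization to P_old = 1 are illustrative choices.

```python
def relative_power(cap, volt, freq, devices=1.0):
    """Power relative to the old design, from P = k * C * V^2 * f."""
    return devices * cap * volt ** 2 * freq

# Plain shrink: C drops to 0.7, f rises by 1/0.7, voltage unchanged.
shrink = relative_power(cap=0.7, volt=1.0, freq=1 / 0.7)

# New design with twice as many devices on the shrunk process.
doubled = relative_power(cap=0.7, volt=1.0, freq=1 / 0.7, devices=2)

# Same new design with Vdd also scaled to 0.7 of its old value.
scaled_vdd = relative_power(cap=0.7, volt=0.7, freq=1 / 0.7, devices=2)

print(f"shrink: {shrink:.2f}, doubled: {doubled:.2f}, scaled Vdd: {scaled_vdd:.2f}")
```

The last case comes out at about 0.98 of the old power, which the slide rounds to "constant".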

Slide 18

High-Performance View Point on Power (*taken from Ron Preston, DEC Alpha)
Reality: paradigm changes: more aggressive circuits, toggle rate increasing, out-of-order and speculative execution.
What to expect: power will be limited by the package and cooling techniques. Frequency will be determined by the power, as high as the package can take!

Slide 19

Research Efforts in Low-Power Design
P_sw = k C_L V_cc^2 f_CLK
Reduce the active load: minimize the circuits; use more efficient design; charge recycling; more efficient layout.
Technology scaling: the highest win; thresholds should scale; leakage starts to bite; dynamic voltage scaling.
Reduce switching activity: conditional clock; conditional precharge; switching off inactive blocks; conditional execution.
Run it slower: use parallelism; fewer pipeline stages; use double-edge flip-flops.

Slide 20

Reducing the Power Dissipation
The power dissipation can be minimized by reducing: supply voltage; load capacitance; switching activity.
Reducing the supply voltage brings a quadratic improvement. Reducing the load capacitance contributes to the improvement of both power dissipation and circuit speed.
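A small sketch of the switching-power relation P_sw = k C_L V_cc^2 f_CLK from the earlier slide shows why the supply voltage has the largest leverage. The component values are assumed for illustration only.

```python
def switching_power(k, c_load, v_cc, f_clk):
    """Switching power P_sw = k * C_L * V_cc^2 * f_CLK (k is the activity factor)."""
    return k * c_load * v_cc ** 2 * f_clk

# Assumed baseline: activity 0.2, 1 nF total switched load, 2.0 V, 200 MHz.
base = switching_power(k=0.2, c_load=1e-9, v_cc=2.0, f_clk=200e6)
half_voltage = switching_power(k=0.2, c_load=1e-9, v_cc=1.0, f_clk=200e6)
half_load = switching_power(k=0.2, c_load=0.5e-9, v_cc=2.0, f_clk=200e6)

print(f"halving V_cc saves {base / half_voltage:.0f}x")   # quadratic effect
print(f"halving C_L saves {base / half_load:.0f}x")       # linear effect
```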

Slide 21

Voltage Scaling
There are three means to maintain the throughput:
Reduce V_th to improve circuit speed.
Introduce parallel and pipelined architecture while using slower device speeds (this assumes an unlimited no. of transistors; in reality the transistor density is only increasing by 60% per year).
Prepare multiple supply voltages and for each cluster of circuits choose the lowest supply voltage that satisfies the speed. (A good level converter is necessary, one which exhibits small delay and consumes little power and little area.)
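The third option, picking the lowest supply per cluster, can be sketched as a simple search. The delay model here is an assumption (an alpha-power law with alpha = 1.3 and V_th = 0.4 V, in arbitrary delay units), and the supply list and function names are hypothetical; level-converter overhead is ignored.

```python
ALPHA = 1.3  # assumed velocity-saturation exponent

def gate_delay(vdd, vth=0.4, k=1.0):
    """Assumed delay model: k * Vdd / (Vdd - Vth)^alpha, arbitrary units."""
    return k * vdd / (vdd - vth) ** ALPHA

def lowest_supply(max_delay, supplies, vth=0.4):
    """Return the lowest available supply that meets max_delay, or None."""
    for vdd in sorted(supplies):
        if vdd > vth and gate_delay(vdd, vth) <= max_delay:
            return vdd
    return None

available = [1.2, 1.8, 2.5, 3.3]   # hypothetical supply rails (V)
for budget in (1.7, 1.2, 1.0, 0.5):
    print(f"delay budget {budget}: supply {lowest_supply(budget, available)}")
```

Clusters with slack in their delay budget end up on the low rail, and only the critical clusters pay the quadratic power cost of the high rail.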

Slide 23

Is there an optimal design point?

Slide 24

Power Dissipation and Circuit Delay
Delay = k Q / I = k C_L V_DD / (V_DD - V_th)^α (α = 1.3)
Power: P = p_t f_CLK C_L V_DD^2 + I_0 10^(-V_th/S) V_DD
[Figure: two surface plots, delay (s, scale x10^-10) and power (W, scale x10^-4), versus V_DD (V) and V_th (V), with points A and B marked]
(* Taken from T. Sakurai)
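The trade-off plotted on this slide can be explored with a sketch that assumes the expressions are the alpha-power delay law, Delay = k C_L V_DD / (V_DD - V_th)^α, and a power sum of a switching term and a subthreshold leakage term, P = p_t f_CLK C_L V_DD^2 + I_0 10^(-V_th/S) V_DD. All parameter values below are illustrative guesses, not the values behind the original plots.

```python
def delay(vdd, vth, k=1.0, c_load=1e-13, alpha=1.3):
    """Assumed delay model: k * C_L * Vdd / (Vdd - Vth)^alpha."""
    return k * c_load * vdd / (vdd - vth) ** alpha

def power(vdd, vth, p_t=0.5, f_clk=100e6, c_load=1e-13, i0=1e-5, s=0.1):
    """Assumed power model: switching term plus subthreshold leakage term."""
    dynamic = p_t * f_clk * c_load * vdd ** 2
    leakage = i0 * 10 ** (-vth / s) * vdd   # s: subthreshold swing, V/decade
    return dynamic + leakage

# Lowering Vth at fixed Vdd speeds the gate up but raises leakage power.
for vth in (0.2, 0.4):
    print(f"Vth={vth} V: delay={delay(1.0, vth):.3e}, power={power(1.0, vth):.3e}")
```

Sweeping both V_DD and V_th with functions like these reproduces the qualitative shape of the two surfaces: delay rises as V_DD approaches V_th, and power rises with V_DD and with falling V_th.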

Slide 25

Sensitivity to V_th fluctuation
[Figure: normalized delay (axis 0.6 to 1.8) versus V_TH (V) for V_DD = 1.0 V, 1.5 V, 3.0 V, and 5.0 V, with ΔV_TH = ±0.05 V and ±0.15 V]
(* Taken from T. Sakurai)

Slide 26

Power-Delay Product, Energy-Delay Product
Lowest Voltage – Highest Threshold – no optimum (*from Sakurai, Kuroda, IEICE 95 paper)
Power-Delay Product is a misleading measure; it will always favor a processor that operates at lower frequency. Energy-Delay is more adequate, but Energy-Delay^2 should be used.
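One way to see the difference between the three measures is to scale the supply voltage of a single design. The sketch below assumes a first-order model in which energy per operation goes as V^2 and delay as 1/V (threshold effects ignored); the names and units are illustrative.

```python
def metrics(vdd, c=1.0):
    """Efficiency measures of one design at supply vdd (first-order model)."""
    energy = c * vdd ** 2      # energy per operation ~ C * V^2
    delay = 1.0 / vdd          # assumed: delay ~ 1 / V
    power = energy / delay
    return {
        "power_delay": power * delay,          # equals energy per operation
        "energy_delay": energy * delay,
        "energy_delay2": energy * delay ** 2,
    }

fast = metrics(vdd=2.0)
slow = metrics(vdd=1.0)
for name in fast:
    print(f"{name}: fast/slow = {fast[name] / slow[name]:.1f}")
```

Under this model the same design looks 4x worse on Power-Delay and 2x worse on Energy-Delay simply because it runs at a higher voltage, while Energy-Delay**2 is unchanged.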

Slide 27

Power-Delay Product, Energy-Delay Product: Horowitz, Indermaur, Gonzalez argue against Power-Delay, SLPE'94

Slide 28

Energy-Delay**2 (*courtesy of Prof. T. Sakurai)

Slide 29

Energy-Delay Product versus Energy-Delay**2: Nowka, Hofstee, Carpenter of IBM argue against Energy-Delay as a design efficiency measure (private communication)

Slide 30

Energy-Delay Product versus Energy-Delay**2: the same design should have relatively the same efficiency. Optimal point: (due to V_th being fixed?) Nowka, Hofstee, Carpenter of IBM argue against Energy-Delay as a design efficiency measure (private communication)

Slide 31

Example: PowerPC
Feature | 601+ | 604 | 620 | Diff.
Frequency (MHz) | 100 | 100 | 133 (100) | same
CMOS Process | .5u 5-metal | .5u 4-metal | .5u 4-metal | ~same
Cache Total | 32KB Cache | 16K+16K Cache | 64K | ~same
Load/Store Unit | No | Yes | Yes |
Dual Integer Unit | No | Yes | Yes |
Register Renaming | No | Yes | Yes |
Peak Issue | 2 + Br | 4 Insts | 4 Insts | ~double
Transistors | 2.8 Million | 3.6 Million | 6.9 Million | +30%/+146%
SPECint92 | 105 | 160 | 225 (169) | +50%/+61%
SPECfp92 | 125 | 165 | 300 (225) | +30%/+80%
Power | 4W | 13W | 30W (22.5W) | +225%/+463%
Spec/Watt | 26.5/31.2 | 12.3/12.7 | 7.5/10 | -115%/-252%
PF=Watt/Freq**3 | 4.0E-6 | 13.0E-6 | 12.8E-6 |
(PF/Trans)*E12 | 1.43 | 3.61 | 1.86 |
IPC | 1.05 | 1.6 | 1.69 |
PE*IPC**3 (*E6) | 4.01 | 12.98 | 12.69 |
PE=Watt/Spec**3 | 3.46E-6 | 3.17E-6 | 2.63E-6 |
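The derived rows of the PowerPC table can be recomputed from the frequency, SPECint92, and power figures. This is a sketch: the 100 MHz frequency for the 604 is inferred from its Watt/Freq**3 and IPC entries, and the "IPC" row is assumed to mean SPECint92 divided by MHz, since that reproduces the listed values.

```python
# Values taken from the table; the 604 frequency is an inference (see above).
chips = {
    "601+": {"mhz": 100, "specint92": 105, "watts": 4.0},
    "604": {"mhz": 100, "specint92": 160, "watts": 13.0},
    "620": {"mhz": 133, "specint92": 225, "watts": 30.0},
}

def efficiency(chip):
    """Recompute the table's derived efficiency measures for one chip."""
    spec, watts, mhz = chip["specint92"], chip["watts"], chip["mhz"]
    return {
        "spec_per_watt": spec / watts,
        "watt_per_freq3": watts / mhz ** 3,    # PF in the table
        "watt_per_spec3": watts / spec ** 3,   # PE in the table
        "ipc": spec / mhz,                     # assumed meaning of "IPC"
    }

for name, chip in chips.items():
    row = efficiency(chip)
    print(name, {key: f"{value:.3g}" for key, value in row.items()})
```

Spec/Watt drops by more than 3x from the 601+ to the 620, while Watt/Spec**3 stays within roughly 30%, which is the contrast the table is built to show.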

Slide 32

Feature | Digital 21164 | MIPS 10000 | PowerPC 620 | HP 8000 | Sun Ultra-Sparc
Freq | 500 MHz | 200 MHz | 200 MHz | 180 MHz | 250 MHz
Pipeline Stages | 7 | 5-7 | 5 | 7-9 | 6-9
Issue Rate | 4 | 4 | 4 | 4 | 4
Out-of-Order Exec. | 6 lds | 32 | 16 | 56 | none
Register Renam. (int/FP) | none/8 | 32/32 | 8/8 | 56 | none
Transistors/Logic transistors | 9.3M/1.8M | 5.9M/2.3M | 6.9M/2.2M | 3.9M*/3.9M | 3.8M/2.0M
SPEC95 (Intg/FlPt) | 12.6/18.3 | 8.9/17.2 | 9/9 | 10.8/18.3 | 8.5/15
Power | 25W | 30W | 30W | 40W | 20W
SpecInt/Watt | 0.5 | 0.3 | 0.3 | 0.27 | 0.43
1/Energy*Delay | 6.4 | 2.6 | 2.7 | 2.9 | 3.6
Watt/Freq**3 | 0.2E-6 | 3.75E-6 | 3.75E-6 | 6.86E-6 | 1.28E-6
(PF/Trans)*E12 | 0.022 | 0.64 | 0.54 | 1.76 | 0.34
(PF/LTrans)*E12 | 0.11 | 1.63 | 1.7 | 1.76 | 0.64
Watt/Spec**3 | 12.5E-3 | 42.5E-3 | 41.5E-3 | 31.7E-3 | 32.5E-3

Slide 33

Use of Different Circuit Families

Slide 34

Capacitance Reduction
The load capacitance is the sum of: gate capacitance; diffusion capacitance; routing capacitance.
Using a small number of transistors, or small transistor sizes, contributes to the reduction in the gate capacitance and the diffusion capacitance. Pass-transistor logic may have an advantage since it comprises fewer transistors and exhibits smaller stray capacitance than conventional static CMOS logic.

Slide 35

Pass-Transistor Logic

Slide 36

Pass-Transistor Logic: CVSL, CPL, SRPL, DSL, DPL, DCVSPG

Slide 37

SAPL: Sense-Amplifying Pass-transistor Logic. All nodes are first discharged and then evaluated by the inputs. Outputs are 100mV above GND.

Slide 38

Where does the power go ?

Slide 39

Power use is different from chip to chip: (*from Sakurai, Kuroda, IEICE 95 paper) MPU1 is a low-end microprocessor; MPU2 is a high-end CPU with large cache; ASSP1 is an MPEG-2 decoder; ASSP2 is an ATM switch

Slide 40

Design Example: StrongARM 110
Two power modes: idle and sleep
Power: 0.5W using 1.1V internal PS: 184 Dhrystone MIPS @ 162MHz; 1.1W using 2V internal PS: 245 Dhrystone MIPS @ 215MHz
Power Breakdown: I-C