5-GHz 32-bit integer execution core in 130-nm dual-V_T CMOS

Bibliographic Details
Published in: IEEE Journal of Solid-State Circuits, Vol. 37, no. 11, pp. 1421-1432
Main Authors: Vangal, S., Anders, M.A., Borkar, N., Seligman, E., Govindarajulu, V., Erraguntla, V., Wilson, H., Pangal, A., Veeramachaneni, V., Tschanz, J.W., Ye, Y., Somasekhar, D., Bloechel, B.A., Dermer, G.E., Krishnamurthy, R.K., Soumyanath, K., Mathew, S., Narendra, S.G., Stan, M.R., Thompson, S., De, V., Borkar, S.
Format: Journal Article
Language: English
Published: IEEE 01-11-2002
Description
Summary: A 32-bit integer execution core containing a Han-Carlson arithmetic-logic unit (ALU), an 8-entry × 2 ALU instruction scheduler loop and a 32-entry × 32-bit register file is described. In a 130 nm six-metal, dual-V_T CMOS technology, the 2.3 mm² prototype contains 160 K transistors. Measurements demonstrate capability for 5-GHz single-cycle integer execution at 25 °C. The single-ended, leakage-tolerant dynamic scheme used in the ALU and scheduler enables up to 9-wide ORs with 23% critical path speed improvement and 40% active leakage power reduction when compared to a conventional Kogge-Stone implementation. On-chip body-bias circuits provide additional performance improvement or leakage tolerance. Stack node preconditioning improves ALU performance by 10%. At 5 GHz, ALU power is 95 mW at 0.95 V and the register file consumes 172 mW at 1.37 V. The ALU performance is scalable to 6.5 GHz at 1.1 V and to 10 GHz at 1.7 V, 25 °C.
ISSN: 0018-9200, 1558-173X
DOI: 10.1109/JSSC.2002.803944