VLSI implementation of a 16*16 discrete cosine transform

The implementation of a 16*16 discrete cosine transform (DCT) chip using a concurrent architecture is presented. The chip contains 32 processing elements working in parallel and a random-access memory (RAM) which performs a 16*16 matrix transposition. The structure is highly regular and modular, and...

Bibliographic Details
Published in:	IEEE transactions on circuits and systems Vol. 36; no. 4; pp. 610 - 617
Main Authors:	Sun, M.-T., Chen, T.-C., Gottlieb, A.M.
Format:	Journal Article
Language:	English
Published:	IEEE 01-04-1989
Subjects:	Circuit synthesis; CMOS technology; Costs; Discrete cosine transforms; Discrete transforms; Image coding; Sun; Transform coding; Two dimensional displays; Very large scale integration

Description
Summary:	The implementation of a 16*16 discrete cosine transform (DCT) chip using a concurrent architecture is presented. The chip contains 32 processing elements working in parallel and a random-access memory (RAM) which performs a 16*16 matrix transposition. The structure is highly regular and modular, and thus very efficient for VLSI implementation. The chip was designed for real-time processing of 14.3-MHz sample video data. It performs an equivalent of a half billion multiplications and accumulations per second. Fabricated in 2-μm double-metal CMOS technology, the chip contains approximately 73000 transistors which occupy a 7.2*7.0-mm² area. The 68-pad die size is 8.3*8.1 mm². It is fully functional and is the first working 16*16 DCT chip. The architecture and accuracy studies for finite-wordlength processing are presented. The circuit design and layout using the symbolic design tool MULGA are described in detail. Possible variations are also discussed for multipurpose (variable transform sizes, forward-inverse transform) applications.
ISSN:	0098-4094 1558-1276
DOI:	10.1109/31.92893
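
Illustrative note:	The summary mentions a transposition RAM between banks of processing elements, which suggests a row-column (separable) evaluation of the 2-D DCT. The Python sketch below assumes that reading. It is not taken from the paper: the orthonormal DCT-II scaling, the floating-point arithmetic, and all function names are assumptions made for illustration. It also checks the throughput figure: two 1-D passes of 16 multiply-accumulates each give 32 per sample, and at 14.3 MHz that is about 0.46 billion per second, in line with the "half billion" quoted in the summary.

```python
import math

N = 16                 # transform size stated in the summary
SAMPLE_RATE_HZ = 14.3e6  # video sample rate stated in the summary


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C (row k, column i). Scaling is an assumption."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product; each output element is one chain of multiply-accumulates."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Stands in for the role the summary assigns to the transposition RAM."""
    return [list(col) for col in zip(*m)]


def dct2d(block):
    """2-D DCT as two 1-D passes with a transposition in between: Y = C X C^T."""
    c = dct_matrix(len(block))
    first_pass = matmul(c, block)        # C X
    turned = transpose(first_pass)       # (C X)^T = X^T C^T
    second_pass = matmul(c, turned)      # C X^T C^T
    return transpose(second_pass)        # C X C^T


def macs_per_second(n=N, rate_hz=SAMPLE_RATE_HZ):
    """Each sample takes part in n multiply-accumulates per pass, over two passes."""
    return 2 * n * rate_hz


if __name__ == "__main__":
    flat = [[1.0] * N for _ in range(N)]   # constant test block
    coeffs = dct2d(flat)
    print("DC coefficient:", round(coeffs[0][0], 6))
    print("MAC/s estimate:", macs_per_second())
```

For a constant block, all of the energy should land in the DC coefficient, which makes a convenient sanity check. The count of 32 multiply-accumulates per sample happens to match the 32 processing elements in the summary, but that correspondence is an inference here, not a statement from the record.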