VLSI implementation of a 16*16 discrete cosine transform
Published in: | IEEE transactions on circuits and systems Vol. 36; no. 4; pp. 610 - 617 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | IEEE 01-04-1989 |
Subjects: | |
Summary: | The implementation of a 16*16 discrete cosine transform (DCT) chip using a concurrent architecture is presented. The chip contains 32 processing elements working in parallel and a random-access memory (RAM) which performs a 16*16 matrix transposition. The structure is highly regular and modular, and thus very efficient for VLSI implementation. The chip was designed for real-time processing of 14.3-MHz sampled video data. It performs the equivalent of half a billion multiplications and accumulations per second. Fabricated in 2-μm double-metal CMOS technology, the chip contains approximately 73000 transistors, which occupy a 7.2*7.0-mm² area. The 68-pad die size is 8.3*8.1 mm². It is fully functional and is the first working 16*16 DCT chip. The architecture and accuracy studies for finite-wordlength processing are presented. The circuit design and layout using the symbolic design tool MULGA are described in detail. Possible variations are also discussed for multipurpose (variable transform sizes, forward-inverse transform) applications. |
---|---|
ISSN: | 0098-4094 1558-1276 |
DOI: | 10.1109/31.92893 |
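
The summary mentions a transposition RAM and a rate of about half a billion multiply-accumulates per second. The sketch below illustrates one reading of those figures; it is not the chip's actual datapath. It assumes the standard row-column decomposition of a 2-D DCT (a 1-D pass over rows, a transpose, then a second 1-D pass) and an orthonormal DCT-II basis. The record confirms neither assumption, and all function names are hypothetical. It runs in floating point, so the finite-wordlength effects studied in the paper are ignored.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis (assumed here): row k, column i."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def transpose(m):
    """Software stand-in for the matrix transposition the RAM performs."""
    return [list(col) for col in zip(*m)]

def dct_rows(block, c):
    """1-D DCT of every row; each output value costs n multiply-accumulates."""
    n = len(c)
    return [[sum(c[k][i] * row[i] for i in range(n)) for k in range(n)]
            for row in block]

def dct2d(block):
    """Row-column 2-D DCT: row pass, transpose, row pass, transpose back."""
    c = dct_matrix(len(block))
    stage1 = dct_rows(block, c)
    stage2 = dct_rows(transpose(stage1), c)
    return transpose(stage2)

def macs_per_second(sample_rate_hz, n):
    """Each sample costs n MACs in each of the two 1-D passes."""
    return sample_rate_hz * 2 * n

# A constant 16*16 block puts all of its energy in the DC coefficient.
flat = [[1.0] * 16 for _ in range(16)]
coeffs = dct2d(flat)
rate = macs_per_second(14.3e6, 16)
```

Under these assumptions each input sample needs 2 * 16 = 32 multiply-accumulates, so a 14.3-MHz sample rate gives 14.3e6 * 32 = 457.6 million per second, consistent with the "half a billion" in the summary. That the count of 32 equals the number of processing elements is suggestive, but how the elements are split between the two passes is not stated in this record.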