Search Results - "Zalani, Vidhi"
1
A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling
Published in IEEE Journal of Solid-State Circuits (01-01-2022)
“…Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm…”
Journal Article
2
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC
Published in IEEE Journal of Solid-State Circuits (18-10-2024)
“…Discrete AI inference cards, operating under form-factor and system-defined peak power constraints, must serve diverse inference requests with widely varying…”
Journal Article
3
RaPiD: AI Accelerator for Ultra-low Precision Training and Inference
Published in 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) (01-06-2021)
“…The growing prevalence and computational demands of Artificial Intelligence (AI) workloads has led to widespread use of hardware accelerators in their…”
Conference Proceeding
4
14.1 A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC
Published in 2024 IEEE International Solid-State Circuits Conference (ISSCC) (18-02-2024)
“…The rapid emergence of AI models, specifically large language models (LLMs) requiring large amounts of compute, drives the need for dedicated AI inference…”
Conference Proceeding
5
Rapid and Holistic Technology Evaluation for Exploratory DTCO in Beyond 7nm Technologies
Published in 2018 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD) (01-09-2018)
“…New device architectures such as horizontal Nanosheets have been seriously considered as a replacement for FinFET. A comprehensive, and realistic assessment of…”
Conference Proceeding
6
9.1 A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling
Published in 2021 IEEE International Solid-State Circuits Conference (ISSCC) (13-02-2021)
“…Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm²) in AI hardware accelerators across cloud and…”
Conference Proceeding