Search Results - "Fleischer, Bruce"
-
1
DLFloat: A 16-b Floating Point Format Designed for Deep Learning Training and Inference
Published in 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH) (01-06-2019)“…The resilience of Deep Learning (DL) training and inference workloads to low-precision computations, coupled with the demand for power- and area-efficient…”
Conference Proceeding -
2
Efficient AI System Design With Cross-Layer Approximate Computing
Published in Proceedings of the IEEE (01-12-2020)“…Advances in deep neural networks (DNNs) and the availability of massive real-world data have enabled superhuman levels of accuracy on many AI tasks and ushered…”
Journal Article -
3
A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling
Published in IEEE journal of solid-state circuits (01-01-2022)“…Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm…”
Journal Article -
4
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC
Published in IEEE journal of solid-state circuits (18-10-2024)“…Discrete AI inference cards, operating under form-factor and system-defined peak power constraints, must serve diverse inference requests with widely varying…”
Journal Article -
5
RaPiD: AI Accelerator for Ultra-low Precision Training and Inference
Published in 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) (01-06-2021)“…The growing prevalence and computational demands of Artificial Intelligence (AI) workloads has led to widespread use of hardware accelerators in their…”
Conference Proceeding -
6
14.1 A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC
Published in 2024 IEEE International Solid-State Circuits Conference (ISSCC) (18-02-2024)“…The rapid emergence of AI models, specifically large language models (LLMs) requiring large amounts of compute, drives the need for dedicated AI inference…”
Conference Proceeding -
7
Low-Cost Concurrent Error Detection for Floating-Point Unit (FPU) Controllers
Published in IEEE transactions on computers (01-07-2013)“…We present a nonintrusive concurrent error detection (CED) method for protecting the control logic of a contemporary floating-point unit (FPU). The proposed…”
Journal Article -
8
A Scalable Multi-TeraOPS Core for AI Training and Inference
Published in IEEE solid-state circuits letters (01-12-2018)“…This letter presents a multi-TOPS AI accelerator core for deep learning training and inference. With a programmable architecture and custom ISA, this engine…”
Journal Article -
9
9.1 A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling
Published in 2021 IEEE International Solid-State Circuits Conference (ISSCC) (13-02-2021)“…Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm²) in AI hardware accelerators across cloud and…”
Conference Proceeding -
10
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference
Published in 2018 IEEE Symposium on VLSI Circuits (01-06-2018)“…A multi-TOPS AI core is presented for acceleration of deep learning training and inference in systems from edge devices to data centers. With a programmable…”
Conference Proceeding -
11
Exponent monitoring for low-cost concurrent error detection in FPU control logic
Published in 29th VLSI Test Symposium (01-05-2011)“…We present a non-intrusive concurrent error detection (CED) method for protecting the control logic of a contemporary floating point unit (FPU). The proposed…”
Conference Proceeding -
12
A 4R2W register file for a 2.3GHz wire-speed POWER™ processor with double-pumped write operation
Published in 2011 IEEE International Solid-State Circuits Conference (01-02-2011)“…In multi-ported register files, memory cell size grows quadratically with the total number of ports due to wordline and bitline wiring. Reducing the number of…”
Conference Proceeding -
13
Static timing analysis for self resetting circuits
Published in International Conference on Computer Aided Design: Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design; 10-14 Nov. 1996 (01-01-1997)“…Static timing analysis techniques are widely used to verify the timing behavior of large digital designs implemented predominantly in conventional static CMOS…”
Conference Proceeding -
14
64-bit prefix adders: Power-efficient topologies and design solutions
Published in 2009 IEEE Custom Integrated Circuits Conference (01-09-2009)“…64-bit adders of various prefix algorithms are designed using a novel dataflow synthesis methodology. Our synthesis methodology offers robust adder solutions…”
Conference Proceeding -
15
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference
Published in 2020 IEEE Symposium on VLSI Circuits (01-06-2020)“…A processor core is presented for AI training and inference products. Leading-edge compute efficiency is achieved for robust fp16 training via efficient…”
Conference Proceeding -
16
A statistical critical path monitor in 14nm CMOS
Published in 2016 IEEE 34th International Conference on Computer Design (ICCD) (01-10-2016)“…Local variation of delay paths has a significant impact on modern microprocessor performance and yield. A critical path monitor is reported which extracts…”
Conference Proceeding -
17
A 5GHz+ 128-bit Binary Floating-Point Adder for the POWER6 Processor
Published in 2006 Proceedings of the 32nd European Solid-State Circuits Conference (01-09-2006)“…A fast 128-bit end-around carry adder is designed and fabricated as part of the POWER6 floating-point unit in a 65nm SOI process technology. Efficient use of…”
Conference Proceeding -
18
Synthesis design strategies for energy-efficient microprocessors
Published in 2016 IEEE 34th International Conference on Computer Design (ICCD) (01-10-2016)“…A detailed synthesis study has been performed on a functional unit from a recent IBM microprocessor to explore the voltage-frequency space for energy-efficient…”
Conference Proceeding -
19
Jitter in relaxation oscillators
Published 01-01-1989“…Relaxation oscillators can be used as modulators in information-transmission systems. Even with a fixed controlling input, however, a relaxation oscillator's…”
Dissertation -
20
Bruce Fleisher's Swing Keys : Stay down through the shot
Published in Golf digest (01-10-1999)
Magazine Article