Search Results - "Chatarasi, Prasanth"
1
Polyhedral Optimizations of Explicitly Parallel Programs
Published in 2015 International Conference on Parallel Architecture and Compilation (PACT) (01-10-2015): “…The polyhedral model is a powerful algebraic framework that has enabled significant advances in analysis and transformation of sequential affine (sub)programs,…”
Conference Proceeding
2
MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings
Published in IEEE Micro (01-05-2020): “…The efficiency of an accelerator depends on three factors (mapping, deep neural network (DNN) layers, and hardware), constructing extremely complicated design…”
Journal Article
3
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
Published in IEEE Transactions on Parallel and Distributed Systems (01-04-2022): “…There is a growing interest in custom spatial accelerators for machine learning applications. These accelerators employ a spatial array of processing elements…”
Journal Article
4
Extending Polyhedral Model for Analysis and Transformation of OpenMP Programs
Published in 2015 International Conference on Parallel Architecture and Compilation (PACT) (01-10-2015): “…The polyhedral model is a powerful algebraic framework that has enabled significant advances in analysis and transformation of sequential affine (sub)programs,…”
Conference Proceeding
5
FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29-06-2024): “…The inference of ML models composed of diverse structures, types, and sizes boils down to the execution of different dataflows (i.e. different tiling,…”
Conference Proceeding
6
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
Published in IEEE Transactions on Parallel and Distributed Systems (11-08-2021): “…There is a growing interest in custom spatial accelerators for machine learning applications. These accelerators employ a spatial array of processing elements…”
Journal Article
7
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC
Published in IEEE Journal of Solid-State Circuits (18-10-2024): “…Discrete AI inference cards, operating under form-factor and system-defined peak power constraints, must serve diverse inference requests with widely varying…”
Journal Article
8
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Published in 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01-09-2021): “…To meet the extreme compute demands for deep learning across commercial and scientific applications, dataflow accelerators are becoming increasingly popular…”
Conference Proceeding
9
Extending the Polyhedral Compilation Model for Debugging and Optimization of SPMD-Style Explicitly-Parallel Programs
Published 01-01-2017: “…The SPMD (Single Program Multiple Data) parallelism continues to be one of the most popular parallel execution models in use today, as exemplified by OpenMP…”
Dissertation
10
14.1 A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC
Published in 2024 IEEE International Solid-State Circuits Conference (ISSCC) (18-02-2024): “…The rapid emergence of AI models, specifically large language models (LLMs) requiring large amounts of compute, drives the need for dedicated AI inference…”
Conference Proceeding
11
FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
Published 21-05-2024: “…The inference of ML models composed of diverse structures, types, and sizes boils down to the execution of different dataflows (i.e. different tiling,…”
Journal Article
12
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine
Published in 2020 IEEE High Performance Extreme Computing Conference (HPEC) (22-09-2020): “…Xilinx's AI Engine is a recent industry example of energy-efficient vector processing that includes novel support for 2D SIMD datapaths and shuffle…”
Conference Proceeding
13
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine
Published 01-06-2020: “…Xilinx's AI Engine is a recent industry example of energy-efficient vector processing that includes novel support for 2D SIMD datapaths and shuffle…”
Journal Article
14
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
Published 19-06-2021: “…There is a growing interest in custom spatial accelerators for machine learning applications. These accelerators employ a spatial array of processing elements…”
Journal Article
15
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Published 15-09-2021: “…To meet the extreme compute demands for deep learning across commercial and scientific applications, dataflow accelerators are becoming increasingly popular…”
Journal Article
16
Experimental Insights from the Rogues Gallery
Published in 2019 IEEE International Conference on Rebooting Computing (ICRC) (01-11-2019): “…The Rogues Gallery is a new deployment for understanding next-generation hardware with a focus on unorthodox and uncommon technologies. This testbed project…”
Conference Proceeding
17
Marvel: A Data-centric Compiler for DNN Operators on Spatial Accelerators
Published 18-02-2020: “…The efficiency of a spatial DNN accelerator depends heavily on the compiler and its cost model's ability to generate optimized mappings for various operators of…”
Journal Article
18
Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach Using MAESTRO
Published 04-05-2018: “…The data partitioning and scheduling strategies used by DNN accelerators to leverage reuse and perform staging are known as dataflow, and they directly impact…”
Journal Article