Search Results - "Moreira, Jose E"
-
1
Energy Efficiency Boost in the AI-Infused POWER10 Processor
Published in 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) (01-06-2021)“…We present the novel micro-architectural features, supported by an innovative and novel pre-silicon methodology in the design of POWER10. The resulting…”
Conference Proceeding -
2
IBM's POWER10 Processor
Published in IEEE MICRO (01-03-2021)“…The IBM POWER10 processor represents the 10th generation of the POWER family of enterprise computing engines. It is built on a balance of computation and…”
Journal Article -
3
Compiling for the IBM Matrix Engine for Enterprise Workloads
Published in IEEE MICRO (01-09-2022)“…The matrix-multiply assist (MMA) facility is the latest addition to IBM’s power instruction set architecture and first shipped in the recently introduced…”
Journal Article -
4
Fast matrix multiplication via compiler‐only layered data reorganization and intrinsic lowering
Published in Software, practice & experience (01-09-2023)“…The resurgence of machine learning has increased the demand for high‐performance basic linear algebra subroutines (BLAS), which have long depended on libraries…”
Journal Article -
5
The Blue Gene/L Supercomputer: A Hardware and Software Story
Published in International journal of parallel programming (01-06-2007)“…The Blue Gene/L system at the Department of Energy Lawrence Livermore National Laboratory in Livermore, California is the world's most powerful supercomputer…”
Journal Article -
6
The Case for Full-Throttle Computing: An Alternative Datacenter Design Strategy
Published in IEEE MICRO (01-07-2010)“…The authors argue that the minimum cost of computing can be provided by consolidating real-time workloads onto relatively large servers, which can operate at…”
Journal Article -
7
Modeling Matrix Engines for Portability and Performance
Published in 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01-05-2022)“…Matrix engines, also known as matrix-multiplication accelerators, capable of computing on 2D matrices of various data types are traditionally found only on…”
Conference Proceeding -
8
C++ and Interoperability Between Libraries: The GraphBLAS C++ Specification
Published in 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-05-2023)“…Interoperability between libraries is often hindered by incompatible data formats, which can necessitate creating new copies of data when transferring data…”
Conference Proceeding -
9
Introduction to GraphBLAS 2.0
Published in 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-06-2021)“…The GraphBLAS is a set of basic building blocks for constructing graph algorithms in terms of linear algebra. They are first and foremost defined…”
Conference Proceeding -
10
GraphBLAS: C++ Iterators for Sparse Matrices
Published in 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-05-2022)“…Iteration over opaque, generic data structures is an important feature of many C++ libraries. Aggressive compiler optimization and inlining enables generic C++…”
Conference Proceeding -
11
A Roadmap for the GraphBLAS C++ API
Published in 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-05-2020)“…The GraphBLAS are building blocks for expressing graph algorithms in terms of linear algebra. Currently, the GraphBLAS are defined as a C API. Implementations…”
Conference Proceeding -
12
Considerations for a Distributed GraphBLAS API
Published in 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-05-2020)“…The GraphBLAS emerged from an international effort to standardize linear-algebraic building blocks for computing on graphs and graph-structured data. The…”
Conference Proceeding -
13
Delivering Teraflops: An Account of how Blue Gene was Brought to Life
Published in IEEE John Vincent Atanasoff 2006 International Symposium on Modern Computing (JVA'06) (01-10-2006)“…The Blue Gene/L system at the Department of Energy Lawrence Livermore National Laboratory in Livermore, California is the world's most powerful supercomputer…”
Conference Proceeding -
14
Multitoroidal Interconnects For Tightly Coupled Supercomputers
Published in IEEE transactions on parallel and distributed systems (01-01-2008)“…The processing elements of many modern tightly coupled multicomputers are connected via mesh or toroidal networks. Such interconnects are simple and highly…”
Journal Article -
15
The GraphBLAS 3.0 Project
Published in 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (27-05-2024)“…The GraphBLAS C API is mature with an updated specification (version 2.1) and a compliant implementation (SuiteSparse GraphBLAS). We are now focused on…”
Conference Proceeding -
16
Performance Evaluation of a Commercial Application, Trade, in Scale-out Environments
Published in 2007 15th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (01-10-2007)“…Scale-out approach, in contrast to scale-up approach (exploring increasing performance by utilizing more powerful shared-memory servers), refers to deployment…”
Conference Proceeding -
17
Supporting multidimensional arrays in Java
Published in Concurrency and computation (01-03-2003)“…The lack of direct support for multidimensional arrays in Java™ has been recognized as a major deficiency in the language's applicability to numerical…”
Journal Article -
18
Unlocking the Performance of the BlueGene/L Supercomputer
Published in Conference on High Performance Networking and Computing: Proceedings of the 2004 ACM/IEEE conference on Supercomputing; 06-12 Nov. 2004 (06-11-2004)“…The BlueGene/L supercomputer is expected to deliver new levels of application performance by providing a combination of good single-node computational…”
Conference Proceeding -
19
GraphBLAS C API: Ideas for future versions of the specification
Published in 2017 IEEE High Performance Extreme Computing Conference (HPEC) (01-09-2017)“…The GraphBLAS C specification provisional release 1.0 is complete. To manage the scope of the project, we had to defer important functionality to a future…”
Conference Proceeding -
20
Implementing the GraphBLAS C API
Published in 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (01-05-2018)“…This paper describes our implementation of the GraphBLAS C API. The implementation fully hides the internals of GraphBLAS objects from application programs,…”
Conference Proceeding