Search Results - "Besta, Maciej"
-
1
Transformations of High-Level Synthesis Codes for High-Performance Computing
Published in IEEE transactions on parallel and distributed systems (01-05-2021) “…Spatial computing architectures promise a major stride in performance and energy efficiency over the traditional load/store devices currently employed in large…”
Journal Article -
2
Evaluating the Cost of Atomic Operations on Modern Architectures
Published in 2015 International Conference on Parallel Architecture and Compilation (PACT) (01-10-2015) “…Atomic operations (atomics) such as Compare-and-Swap (CAS) or Fetch-and-Add (FAA) are ubiquitous in parallel programming. Yet, performance tradeoffs between…”
Conference Proceeding -
3
Slim fly: a cost effective low-diameter network topology
Published in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (01-11-2014) “…We introduce a high-performance cost-effective network topology called Slim Fly that approaches the theoretically optimal network diameter. Slim Fly is based…”
Conference Proceeding -
4
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis
Published in IEEE transactions on pattern analysis and machine intelligence (01-05-2024) “…Graph neural networks (GNNs) are among the most powerful tools in deep learning. They routinely solve complex problems on unstructured networks, such as node…”
Journal Article -
5
Scaling betweenness centrality using communication-efficient sparse matrix multiplication
Published in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (12-11-2017) “…Betweenness centrality (BC) is a crucial graph problem that measures the significance of a vertex by the number of shortest paths leading through it. We…”
Conference Proceeding -
6
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
Published in 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01-05-2019) “…We introduce Deep500: the first customizable benchmarking infrastructure that enables fair comparison of the plethora of deep learning frameworks, algorithms,…”
Conference Proceeding -
7
Communication-Efficient Jaccard similarity for High-Performance Distributed Genome Comparisons
Published in 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01-05-2020) “…The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information…”
Conference Proceeding -
8
FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
Published in SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (01-11-2020) “…We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve…”
Conference Proceeding -
9
Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems
Published in IEEE transactions on parallel and distributed systems (01-06-2023) “…Graph processing has become an important part of various areas of computing, including machine learning, medical applications, social network analysis,…”
Journal Article -
10
High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks
Published in IEEE transactions on parallel and distributed systems (01-04-2021) “…The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that…”
Journal Article -
11
HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement
Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09-07-2023) “…2.5D integration is an important technique to tackle the growing cost of manufacturing chips in advanced technology nodes. This poses the challenge of…”
Conference Proceeding -
12
Sparse Hamming Graph: A Customizable Network-on-Chip Topology
Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09-07-2023) “…Chips with hundreds to thousands of cores require scalable networks-on-chip (NoCs). Customization of the NoC topology is necessary to reach the diverse design…”
Conference Proceeding -
13
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis
Published 19-05-2022 “…IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023 Graph neural networks (GNNs) are among the most powerful tools in deep learning…”
Journal Article -
14
SlimSell: A Vectorizable Graph Representation for Breadth-First Search
Published in 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01-05-2017) “…Vectorization and GPUs will profoundly change graph processing. Traditional graph algorithms tuned for 32- or 64-bit based memory accesses will be inefficient…”
Conference Proceeding -
15
Accelerating Irregular Computations with Hardware Transactional Memory and Active Messages
Published 18-10-2020 “…Proceedings of the 24th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'15), 2015 We propose Atomic Active Messages…”
Journal Article -
16
Fault Tolerance for Remote Memory Access Programming Models
Published 18-10-2020 “…Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), 2014 Remote Memory Access (RMA) is an…”
Journal Article -
17
Slim Fly: A Cost Effective Low-Diameter Network Topology
Published 18-12-2019 “…Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage and Analysis, November 2014 We introduce a…”
Journal Article -
18
Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations
Published 28-10-2019 “…Proceedings of the 29th ACM International Conference on Supercomputing (ACM ICS'15), 2015 Remote memory access (RMA) is an emerging high-performance…”
Journal Article -
19
I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication
Published in 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01-05-2022) “…Data movements between different levels of the memory hierarchy (I/O-transitions, or simply I/Os) are a critical performance bottleneck in modern computing…”
Conference Proceeding -
20
High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality
Published in SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (01-11-2020) “…We develop the first parallel graph coloring heuristics with strong theoretical guarantees on work and depth and coloring quality. The key idea is to design a…”
Conference Proceeding