Search Results - "Besta, Maciej"

  1.

    Transformations of High-Level Synthesis Codes for High-Performance Computing by de Fine Licht, Johannes, Besta, Maciej, Meierhans, Simon, Hoefler, Torsten

    “…Spatial computing architectures promise a major stride in performance and energy efficiency over the traditional load/store devices currently employed in large…”
    Journal Article
  2.

    Evaluating the Cost of Atomic Operations on Modern Architectures by Schweizer, Hermann, Besta, Maciej, Hoefler, Torsten

    “…Atomic operations (atomics) such as Compare-and-Swap (CAS) or Fetch-and-Add (FAA) are ubiquitous in parallel programming. Yet, performance tradeoffs between…”
    Conference Proceeding
  3.

    Slim fly: a cost effective low-diameter network topology by Besta, Maciej, Hoefler, Torsten

    “…We introduce a high-performance cost-effective network topology called Slim Fly that approaches the theoretically optimal network diameter. Slim Fly is based…”
    Conference Proceeding
  4.

    Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis by Besta, Maciej, Hoefler, Torsten

    “…Graph neural networks (GNNs) are among the most powerful tools in deep learning. They routinely solve complex problems on unstructured networks, such as node…”
    Journal Article
  5.

    Scaling betweenness centrality using communication-efficient sparse matrix multiplication by Solomonik, Edgar, Besta, Maciej, Vella, Flavio, Hoefler, Torsten

    “…Betweenness centrality (BC) is a crucial graph problem that measures the significance of a vertex by the number of shortest paths leading through it. We…”
    Conference Proceeding
  6.

    A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning by Ben-Nun, Tal, Besta, Maciej, Huber, Simon, Ziogas, Alexandros Nikolaos, Peter, Daniel, Hoefler, Torsten

    “…We introduce Deep500: the first customizable benchmarking infrastructure that enables fair comparison of the plethora of deep learning frameworks, algorithms,…”
    Conference Proceeding
  7.

    Communication-Efficient Jaccard similarity for High-Performance Distributed Genome Comparisons by Besta, Maciej, Kanakagiri, Raghavendra, Mustafa, Harun, Karasikov, Mikhail, Ratsch, Gunnar, Hoefler, Torsten, Solomonik, Edgar

    “…The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information…”
    Conference Proceeding
  8.

    FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short by Besta, Maciej, Schneider, Marcel, Konieczny, Marek, Cynk, Karolina, Henriksson, Erik, Girolamo, Salvatore Di, Singla, Ankit, Hoefler, Torsten

    “…We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve…”
    Conference Proceeding
  9.

    Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems by Besta, Maciej, Fischer, Marc, Kalavri, Vasiliki, Kapralov, Michael, Hoefler, Torsten

    “…Graph processing has become an important part of various areas of computing, including machine learning, medical applications, social network analysis,…”
    Journal Article
  10.

    High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks by Besta, Maciej, Domke, Jens, Schneider, Marcel, Konieczny, Marek, Girolamo, Salvatore Di, Schneider, Timo, Singla, Ankit, Hoefler, Torsten

    “…The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that…”
    Journal Article
  11.

    HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement by Iff, Patrick, Besta, Maciej, Cavalcante, Matheus, Fischer, Tim, Benini, Luca, Hoefler, Torsten

    “…2.5D integration is an important technique to tackle the growing cost of manufacturing chips in advanced technology nodes. This poses the challenge of…”
    Conference Proceeding
  12.

    Sparse Hamming Graph: A Customizable Network-on-Chip Topology by Iff, Patrick, Besta, Maciej, Cavalcante, Matheus, Fischer, Tim, Benini, Luca, Hoefler, Torsten

    “…Chips with hundreds to thousands of cores require scalable networks-on-chip (NoCs). Customization of the NoC topology is necessary to reach the diverse design…”
    Conference Proceeding
  13.

    Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis by Besta, Maciej, Hoefler, Torsten

    Published 19-05-2022
    “…IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023. Graph neural networks (GNNs) are among the most powerful tools in deep learning…”
    Journal Article
  14.

    SlimSell: A Vectorizable Graph Representation for Breadth-First Search by Besta, Maciej, Marending, Florian, Solomonik, Edgar, Hoefler, Torsten

    “…Vectorization and GPUs will profoundly change graph processing. Traditional graph algorithms tuned for 32- or 64-bit based memory accesses will be inefficient…”
    Conference Proceeding
  15.

    Accelerating Irregular Computations with Hardware Transactional Memory and Active Messages by Besta, Maciej, Hoefler, Torsten

    Published 18-10-2020
    “…Proceedings of the 24th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'15), 2015. We propose Atomic Active Messages…”
    Journal Article
  16.

    Fault Tolerance for Remote Memory Access Programming Models by Besta, Maciej, Hoefler, Torsten

    Published 18-10-2020
    “…Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), 2014. Remote Memory Access (RMA) is an…”
    Journal Article
  17.

    Slim Fly: A Cost Effective Low-Diameter Network Topology by Besta, Maciej, Hoefler, Torsten

    Published 18-12-2019
    “…Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage and Analysis, November 2014. We introduce a…”
    Journal Article
  18.

    Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations by Besta, Maciej, Hoefler, Torsten

    Published 28-10-2019
    “…Proceedings of the 29th ACM International Conference on Supercomputing (ACM ICS'15), 2015. Remote memory access (RMA) is an emerging high-performance…”
    Journal Article
  19.

    I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication by Gleinig, Niels, Besta, Maciej, Hoefler, Torsten

    “…Data movements between different levels of the memory hierarchy (I/O-transitions, or simply I/Os) are a critical performance bottleneck in modern computing…”
    Conference Proceeding
  20.

    High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality by Besta, Maciej, Carigiet, Armon, Janda, Kacper, Vonarburg-Shmaria, Zur, Gianinazzi, Lukas, Hoefler, Torsten

    “…We develop the first parallel graph coloring heuristics with strong theoretical guarantees on work and depth and coloring quality. The key idea is to design a…”
    Conference Proceeding