Search Results - "Kwasniewski, Grzegorz"

  1.

    Using Compiler Techniques to Improve Automatic Performance Modeling by Bhattacharyya, Arnamoy, Kwasniewski, Grzegorz, Hoefler, Torsten

    “…Performance modeling can be utilized in a number of scenarios, starting from finding performance bugs to the scalability study of applications. Existing…”
    Conference Proceeding
  2.

    Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 by Fuhrer, Oliver, Chadha, Tarun, Hoefler, Torsten, Kwasniewski, Grzegorz, Lapillonne, Xavier, Leutwyler, David, Lüthi, Daniel, Osuna, Carlos, Schär, Christoph, Schulthess, Thomas C, Vogt, Hannes

    Published in Geoscientific Model Development (02-05-2018)
    “…The best hope for reducing long-standing global climate model biases is by increasing resolution to the kilometer scale. Here we present results from an…”
    Journal Article
  3.

    High Performance Unstructured SpMM Computation Using Tensor Cores by Okanovic, Patrik, Kwasniewski, Grzegorz, Labini, Paolo Sylos, Besta, Maciej, Vella, Flavio, Hoefler, Torsten

    Published 21-08-2024
    “…High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense…”
    Journal Article
  4.

    Deinsum: Practically I/O Optimal Multi-Linear Algebra by Ziogas, Alexandros Nikolaos, Kwasniewski, Grzegorz, Ben-Nun, Tal, Schneider, Timo, Hoefler, Torsten

    “…Multilinear algebra kernel performance on modern massively-parallel systems is determined mainly by data movement. However, deriving data movement-optimal…”
    Conference Proceeding
  5.

    SeBS: A Serverless Benchmark Suite for Function-as-a-Service Computing by Copik, Marcin, Kwasniewski, Grzegorz, Besta, Maciej, Podstawski, Michal, Hoefler, Torsten

    Published 28-12-2020
    “…Function-as-a-Service (FaaS) is one of the most promising directions for the future of cloud services, and serverless functions have immediately become a new…”
    Journal Article
  6.

    Deinsum: Practically I/O Optimal Multilinear Algebra by Ziogas, Alexandros Nikolaos, Kwasniewski, Grzegorz, Ben-Nun, Tal, Schneider, Timo, Hoefler, Torsten

    Published 16-06-2022
    “…Multilinear algebra kernel performance on modern massively-parallel systems is determined mainly by data movement. However, deriving data movement-optimal…”
    Journal Article
  7.

    On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations by Kwasniewski, Grzegorz, Kabić, Marko, Ben-Nun, Tal, Ziogas, Alexandros Nikolaos, Saethre, Jens Eirik, Gaillard, André, Schneider, Timo, Besta, Maciej, Kozhevnikov, Anton, VandeVondele, Joost, Hoefler, Torsten

    Published 25-04-2023
    “…Published at Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November, 2021 (SC'21) Matrix…”
    Journal Article
  8.
  9.

    Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis by Licht, Johannes de Fine, Kwasniewski, Grzegorz, Hoefler, Torsten

    Published 25-01-2021
    “…In Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'20), February 23-25, 2020, Seaside, CA, USA Data movement…”
    Journal Article
  10.
  11.

    On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal LU Factorization by Kwasniewski, Grzegorz, Ben-Nun, Tal, Ziogas, Alexandros Nikolaos, Schneider, Timo, Besta, Maciej, Hoefler, Torsten

    Published 12-10-2020
    “…Dense linear algebra kernels, such as linear solvers or tensor contractions, are fundamental components of many scientific computing applications. In this…”
    Journal Article
  12.

    Lifting C Semantics for Dataflow Optimization by Calotoiu, Alexandru, Ben-Nun, Tal, Kwasniewski, Grzegorz, Licht, Johannes de Fine, Schneider, Timo, Schaad, Philipp, Hoefler, Torsten

    Published 24-05-2022
    “…C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as…”
    Journal Article
  13.

    Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs by Kwasniewski, Grzegorz, Ben-Nun, Tal, Gianinazzi, Lukas, Calotoiu, Alexandru, Schneider, Timo, Ziogas, Alexandros Nikolaos, Besta, Maciej, Hoefler, Torsten

    Published 15-05-2021
    “…Determining I/O lower bounds is a crucial step in obtaining communication-efficient parallel algorithms, both across the memory hierarchy and between…”
    Journal Article
  14.
  15.

    ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations by Besta, Maciej, Miglioli, Cesare, Labini, Paolo Sylos, Tětek, Jakub, Iff, Patrick, Kanakagiri, Raghavendra, Ashkboos, Saleh, Janda, Kacper, Podstawski, Michal, Kwasniewski, Grzegorz, Gleinig, Niels, Vella, Flavio, Mutlu, Onur, Hoefler, Torsten

    Published 24-08-2022
    “…Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage and Analysis, November 2022 Important graph mining…”
    Journal Article
  16.

    A scalable weakly-synchronous algorithm for solving partial differential equations by Aditya, Konduri, Gysi, Tobias, Kwasniewski, Grzegorz, Hoefler, Torsten, Donzis, Diego A, Chen, Jacqueline H

    Published 13-11-2019
    “…Synchronization overheads pose a major challenge as applications advance towards extreme scales. In current large-scale algorithms, synchronization as well as…”
    Journal Article
  17.

    Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication by Kwasniewski, Grzegorz, Kabić, Marko, Besta, Maciej, VandeVondele, Joost, Solcà, Raffaele, Hoefler, Torsten

    Published 26-08-2019
    “…We propose COSMA: a parallel matrix-matrix multiplication algorithm that is near communication-optimal for all combinations of matrix dimensions, processor…”
    Journal Article
  18.

    Motif Prediction with Graph Neural Networks by Besta, Maciej, Grob, Raphael, Miglioli, Cesare, Bernold, Nicola, Kwasniewski, Grzegorz, Gjini, Gabriel, Kanakagiri, Raghavendra, Ashkboos, Saleh, Gianinazzi, Lukas, Dryden, Nikoli, Hoefler, Torsten

    Published 26-05-2021
    “…Proceedings of the 28th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'22), 2022 Link prediction is one of the central problems in graph mining…”
    Journal Article
  19.

    A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers by Martinasso, Maxime, Kwasniewski, Grzegorz, Alam, Sadaf R., Schulthess, Thomas C., Hoefler, Torsten

    “…MeteoSwiss, the Swiss national weather forecast institute, has selected densely populated accelerator servers as their primary system to compute weather…”
    Conference Proceeding
  20.