Search Results - "Hager, Georg"

  1.

    CRAFT: A Library for Easier Application-Level Checkpoint/Restart and Automatic Fault Tolerance by Shahzad, Faisal, Thies, Jonas, Kreutzer, Moritz, Zeiser, Thomas, Hager, Georg, Wellein, Gerhard

    “…In order to efficiently use the future generations of supercomputers, fault tolerance and power consumption are two of the prime challenges anticipated by the…”
    Journal Article
  2.

    LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments by Treibig, J, Hager, G, Wellein, G

    “…Exploiting the performance of today's processors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in…”
    Conference Proceeding
  3.

    High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations by Pieper, Andreas, Kreutzer, Moritz, Alvermann, Andreas, Galgon, Martin, Fehske, Holger, Hager, Georg, Lang, Bruno, Wellein, Gerhard

    Published in Journal of computational physics (15-11-2016)
    “…We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique…”
    Journal Article
  4.

    Comparison of different propagation steps for lattice Boltzmann methods by Wittmann, Markus, Zeiser, Thomas, Hager, Georg, Wellein, Gerhard

    “…Several possibilities exist to implement the propagation step of lattice Boltzmann methods. This paper describes common implementations and compares the number…”
    Journal Article
  5.

    A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU–CPU clusters by Feichtinger, Christian, Habich, Johannes, Köstler, Harald, Hager, Georg, Rüde, Ulrich, Wellein, Gerhard

    Published in Parallel computing (01-09-2011)
    “…► We investigate performance and scaling behavior of a LBM solver on GPU–CPU clusters. ► Based on hardware models performance estimations for GPUs and CPUs are…”
    Journal Article
  6.

    Pushing the limits for medical image reconstruction on recent standard multicore processors by Treibig, Jan, Hager, Georg, Hofmann, Hannes G., Hornegger, Joachim, Wellein, Gerhard

    “…Volume reconstruction by backprojection is the computational bottleneck in many interventional clinical computed tomography (CT) applications. Today vendors in…”
    Journal Article
  7.
  8.

    The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs by Afzal, Ayesha, Hager, Georg, Wellein, Gerhard

    “…The performance of highly parallel applications on distributed-memory systems is influenced by many factors. Analytic performance modeling techniques aim to…”
    Journal Article
  9.

    Analytic performance model for parallel overlapping memory‐bound kernels by Afzal, Ayesha, Hager, Georg, Wellein, Gerhard

    Published in Concurrency and computation (01-05-2022)
    “…Complex applications running on multicore processors show a rich performance phenomenology. The growing number of cores per ccNUMA domain complicates…”
    Journal Article
  10.

    Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication by Alappat, Christie, Hager, Georg, Schenk, Olaf, Wellein, Gerhard

    “…The multiplication of a sparse matrix with a dense vector (SpMV) is a key component in many numerical schemes and its performance is known to be severely…”
    Journal Article
  11.

    Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX by Alappat, Christie, Laukemann, Jan, Gruber, Thomas, Hager, Georg, Wellein, Gerhard, Meyer, Nils, Wettig, Tilo

    “…The A64FX CPU powers the current #1 supercomputer on the Top500 list. Although it is a traditional cache-based multicore processor, its peak performance and…”
    Conference Proceeding
  12.

    Making applications faster by asynchronous execution: Slowing down processes or relaxing MPI collectives by Afzal, Ayesha, Hager, Georg, Markidis, Stefano, Wellein, Gerhard

    Published in Future generation computer systems (01-11-2023)
    “…Comprehending the performance bottlenecks at the core of the intricate hardware–software interactions exhibited by highly parallel programs on HPC clusters is…”
    Journal Article
  13.

    Analytical performance estimation during code generation on modern GPUs by Ernst, Dominik, Holzer, Markus, Hager, Georg, Knorr, Matthias, Wellein, Gerhard

    “…Automatic code generation is frequently used to create implementations of algorithms specifically tuned to particular hardware and application parameters. The…”
    Journal Article
  14.

    Exploring performance and power properties of modern multi-core chips via simple machine models by Hager, Georg, Treibig, Jan, Habich, Johannes, Wellein, Gerhard

    Published in Concurrency and computation (01-02-2016)
    “…Modern multi‐core chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become…”
    Journal Article
  15.

    Algebraic temporal blocking for sparse iterative solvers on multi-core CPUs by Alappat, Christie, Thies, Jonas, Hager, Georg, Fehske, Holger, Wellein, Gerhard

    “…Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation…”
    Journal Article
  16.

    Performance engineering for real and complex tall & skinny matrix multiplication kernels on GPUs by Ernst, Dominik, Hager, Georg, Thies, Jonas, Wellein, Gerhard

    “…General matrix-matrix multiplications with double-precision real and complex entries (DGEMM and ZGEMM) in vendor-supplied BLAS libraries are best optimized for…”
    Journal Article
  17.

    Execution‐Cache‐Memory modeling and performance tuning of sparse matrix‐vector multiplication and Lattice quantum chromodynamics on A64FX by Alappat, Christie, Meyer, Nils, Laukemann, Jan, Gruber, Thomas, Hager, Georg, Wellein, Gerhard, Wettig, Tilo

    Published in Concurrency and computation (10-09-2022)
    “…The A64FX CPU is arguably the most powerful Arm‐based processor design to date. Although it is a traditional cache‐based multicore processor, its peak…”
    Journal Article
  18.

    Electron confinement in graphene with gate-defined quantum dots by Fehske, Holger, Hager, Georg, Pieper, Andreas

    “…We theoretically analyse the possibility to electrostatically confine electrons in circular quantum dot arrays, impressed on contacted graphene nanoribbons by…”
    Journal Article
  19.

    A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials by Pieper, Andreas, Hager, Georg, Fehske, Holger

    “…We introduce PVSC-DTM (Parallel Vectorized Stencil Code for Dirac and Topological Materials), a library and code generator based on a domain-specific language…”
    Journal Article
  20.

    Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations by Wittmann, Markus, Hager, Georg, Zeiser, Thomas, Treibig, Jan, Wellein, Gerhard

    Published in Concurrency and computation (01-05-2016)
    “…Memory‐bound algorithms show complex performance and energy consumption behavior on multicore processors. We choose the lattice Boltzmann method on an…”
    Journal Article