Search Results - "Hager, Georg"

1
CRAFT: A Library for Easier Application-Level Checkpoint/Restart and Automatic Fault Tolerance by Shahzad, Faisal, Thies, Jonas, Kreutzer, Moritz, Zeiser, Thomas, Hager, Georg, Wellein, Gerhard

Published in IEEE transactions on parallel and distributed systems (01-03-2019)
“…In order to efficiently use the future generations of supercomputers, fault tolerance and power consumption are two of the prime challenges anticipated by the…”

Journal Article

2
LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments by Treibig, J, Hager, G, Wellein, G

Published in 2010 39th International Conference on Parallel Processing Workshops (01-09-2010)
“…Exploiting the performance of today's processors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in…”

Conference Proceeding

3
High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations by Pieper, Andreas, Kreutzer, Moritz, Alvermann, Andreas, Galgon, Martin, Fehske, Holger, Hager, Georg, Lang, Bruno, Wellein, Gerhard

Published in Journal of computational physics (15-11-2016)
“…We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique…”

Journal Article

4
Comparison of different propagation steps for lattice Boltzmann methods by Wittmann, Markus, Zeiser, Thomas, Hager, Georg, Wellein, Gerhard

Published in Computers & mathematics with applications (1987) (01-03-2013)
“…Several possibilities exist to implement the propagation step of lattice Boltzmann methods. This paper describes common implementations and compares the number…”

Journal Article

5
A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU–CPU clusters by Feichtinger, Christian, Habich, Johannes, Köstler, Harald, Hager, Georg, Rüde, Ulrich, Wellein, Gerhard

Published in Parallel computing (01-09-2011)
“…► We investigate performance and scaling behavior of a LBM solver on GPU–CPU clusters. ► Based on hardware models performance estimations for GPUs and CPUs are…”

Journal Article

6
Pushing the limits for medical image reconstruction on recent standard multicore processors by Treibig, Jan, Hager, Georg, Hofmann, Hannes G., Hornegger, Joachim, Wellein, Gerhard

Published in The international journal of high performance computing applications (01-05-2013)
“…Volume reconstruction by backprojection is the computational bottleneck in many interventional clinical computed tomography (CT) applications. Today vendors in…”

Journal Article

7
Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects by Alvermann, Andreas, Basermann, Achim, Bungartz, Hans-Joachim, Carbogno, Christian, Ernst, Dominik, Fehske, Holger, Futamura, Yasunori, Galgon, Martin, Hager, Georg, Huber, Sarah, Huckle, Thomas, Ida, Akihiro, Imakura, Akira, Kawai, Masatoshi, Köcher, Simone, Kreutzer, Moritz, Kus, Pavel, Lang, Bruno, Lederer, Hermann, Manin, Valeriy, Marek, Andreas, Nakajima, Kengo, Nemec, Lydia, Reuter, Karsten, Rippl, Michael, Röhrig-Zöllner, Melven, Sakurai, Tetsuya, Scheffler, Matthias, Scheurer, Christoph, Shahzad, Faisal, Simoes Brambila, Danilo, Thies, Jonas, Wellein, Gerhard

Published in Japan journal of industrial and applied mathematics (01-07-2019)
“…We first briefly report on the status and recent achievements of the ELPA-AEO (Eigenvalue Solvers for Petaflop Applications—Algorithmic Extensions and…”

Journal Article

8
The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs by Afzal, Ayesha, Hager, Georg, Wellein, Gerhard

Published in IEEE transactions on parallel and distributed systems (01-02-2023)
“…The performance of highly parallel applications on distributed-memory systems is influenced by many factors. Analytic performance modeling techniques aim to…”

Journal Article

9
Analytic performance model for parallel overlapping memory‐bound kernels by Afzal, Ayesha, Hager, Georg, Wellein, Gerhard

Published in Concurrency and computation (01-05-2022)
“…Complex applications running on multicore processors show a rich performance phenomenology. The growing number of cores per ccNUMA domain complicates…”

Journal Article

10
Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication by Alappat, Christie, Hager, Georg, Schenk, Olaf, Wellein, Gerhard

Published in IEEE transactions on parallel and distributed systems (01-02-2023)
“…The multiplication of a sparse matrix with a dense vector (SpMV) is a key component in many numerical schemes and its performance is known to be severely…”

Journal Article

11
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX by Alappat, Christie, Laukemann, Jan, Gruber, Thomas, Hager, Georg, Wellein, Gerhard, Meyer, Nils, Wettig, Tilo

Published in 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) (01-11-2020)
“…The A64FX CPU powers the current #1 supercomputer on the Top500 list. Although it is a traditional cache-based multicore processor, its peak performance and…”

Conference Proceeding

12
Making applications faster by asynchronous execution: Slowing down processes or relaxing MPI collectives by Afzal, Ayesha, Hager, Georg, Markidis, Stefano, Wellein, Gerhard

Published in Future generation computer systems (01-11-2023)
“…Comprehending the performance bottlenecks at the core of the intricate hardware–software interactions exhibited by highly parallel programs on HPC clusters is…”

Journal Article

13
Analytical performance estimation during code generation on modern GPUs by Ernst, Dominik, Holzer, Markus, Hager, Georg, Knorr, Matthias, Wellein, Gerhard

Published in Journal of parallel and distributed computing (01-03-2023)
“…Automatic code generation is frequently used to create implementations of algorithms specifically tuned to particular hardware and application parameters. The…”

Journal Article

14
Exploring performance and power properties of modern multi-core chips via simple machine models by Hager, Georg, Treibig, Jan, Habich, Johannes, Wellein, Gerhard

Published in Concurrency and computation (01-02-2016)
“…Modern multi‐core chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become…”

Journal Article

15
Algebraic temporal blocking for sparse iterative solvers on multi-core CPUs by Alappat, Christie, Thies, Jonas, Hager, Georg, Fehske, Holger, Wellein, Gerhard

Published in The international journal of high performance computing applications (25-09-2024)
“…Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation…”

Journal Article

16
Performance engineering for real and complex tall & skinny matrix multiplication kernels on GPUs by Ernst, Dominik, Hager, Georg, Thies, Jonas, Wellein, Gerhard

Published in The international journal of high performance computing applications (01-01-2021)
“…General matrix-matrix multiplications with double-precision real and complex entries (DGEMM and ZGEMM) in vendor-supplied BLAS libraries are best optimized for…”

Journal Article

17
Execution‐Cache‐Memory modeling and performance tuning of sparse matrix‐vector multiplication and Lattice quantum chromodynamics on A64FX by Alappat, Christie, Meyer, Nils, Laukemann, Jan, Gruber, Thomas, Hager, Georg, Wellein, Gerhard, Wettig, Tilo

Published in Concurrency and computation (10-09-2022)
“…The A64FX CPU is arguably the most powerful Arm‐based processor design to date. Although it is a traditional cache‐based multicore processor, its peak…”

Journal Article

18
Electron confinement in graphene with gate-defined quantum dots by Fehske, Holger, Hager, Georg, Pieper, Andreas

Published in Physica Status Solidi. B: Basic Solid State Physics (01-08-2015)
“…We theoretically analyse the possibility to electrostatically confine electrons in circular quantum dot arrays, impressed on contacted graphene nanoribbons by…”

Journal Article

19
A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials by Pieper, Andreas, Hager, Georg, Fehske, Holger

Published in The international journal of high performance computing applications (01-01-2021)
“…We introduce PVSC-DTM (Parallel Vectorized Stencil Code for Dirac and Topological Materials), a library and code generator based on a domain-specific language…”

Journal Article

20
Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations by Wittmann, Markus, Hager, Georg, Zeiser, Thomas, Treibig, Jan, Wellein, Gerhard

Published in Concurrency and computation (01-05-2016)
“…Memory‐bound algorithms show complex performance and energy consumption behavior on multicore processors. We choose the lattice Boltzmann method on an…”

Journal Article
