Search Results - "Alappat, Christie"
1
Multiway p-spectral graph cuts on Grassmann manifolds
Published in Machine learning (01-02-2022) “…Nonlinear reformulations of the spectral clustering method have gained a lot of recent attention due to their increased numerical benefits and their solid…”
Journal Article
2
YaskSite: Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures
Published in 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (27-02-2021) “…The landscape of multi-core architectures is growing more complex and diverse. Optimal application performance tuning parameters can vary widely across CPUs,…”
Conference Proceeding
3
Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication
Published in IEEE transactions on parallel and distributed systems (01-02-2023) “…The multiplication of a sparse matrix with a dense vector (SpMV) is a key component in many numerical schemes and its performance is known to be severely…”
Journal Article
4
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
Published in 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) (01-11-2020) “…The A64FX CPU powers the current #1 supercomputer on the Top500 list. Although it is a traditional cache-based multicore processor, its peak performance and…”
Conference Proceeding
5
Algebraic temporal blocking for sparse iterative solvers on multi-core CPUs
Published in The international journal of high performance computing applications (25-09-2024) “…Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation…”
Journal Article
6
Execution‐Cache‐Memory modeling and performance tuning of sparse matrix‐vector multiplication and Lattice quantum chromodynamics on A64FX
Published in Concurrency and computation (10-09-2022) “…The A64FX CPU is arguably the most powerful Arm‐based processor design to date. Although it is a traditional cache‐based multicore processor, its peak…”
Journal Article
7
Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors
Published in Supercomputing frontiers and innovations (01-06-2020)
Journal Article
8
Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs
Published 05-09-2023 “…Sparse linear iterative solvers are essential for many large-scale simulations. Much of the runtime of these solvers is often spent in the implicit evaluation…”
Journal Article
9
Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels
Published 21-05-2024 “…Sparse matrix-vector products (SpMVs) are a bottleneck in many scientific codes. Due to the heavy strain on the main memory interface from loading the sparse…”
Journal Article
10
Code Generation and Performance Engineering for Matrix-Free Finite Element Methods on Hybrid Tetrahedral Grids
Published 12-04-2024 “…This paper introduces a code generator designed for node-level optimized, extreme-scalable, matrix-free finite element operators on hybrid tetrahedral grids…”
Journal Article
11
Level-based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication
Published 03-05-2022 “…The multiplication of a sparse matrix with a dense vector (SpMV) is a key component in many numerical schemes and its performance is known to be severely…”
Journal Article
12
Multiway $p$-spectral graph cuts on Grassmann manifolds
Published 30-08-2020 “…Mach Learn (2021) Nonlinear reformulations of the spectral clustering method have gained a lot of recent attention due to their increased numerical benefits…”
Journal Article
13
ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX
Published 30-07-2021 “…The A64FX CPU is arguably the most powerful Arm-based processor design to date. Although it is a traditional cache-based multicore processor, its peak…”
Journal Article
14
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
Published 29-09-2020 “…The A64FX CPU powers the current number one supercomputer on the Top500 list. Although it is a traditional cache-based multicore processor, its peak…”
Journal Article
15
Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors
Published 12-02-2020 “…Hardware platforms in high performance computing are constantly getting more complex to handle even when considering multicore CPUs alone. Numerous features…”
Journal Article
16
Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors
Published 01-07-2019 “…We describe a universal modeling approach for predicting single- and multicore runtime of steady-state loops on server processors. To this end we strictly…”
Journal Article
17
A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication
Published 15-07-2019 “…The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building block for many numerical linear algebra kernel operations or graph…”
Journal Article