Performance portable Vlasov code with C++ parallel algorithm

Bibliographic Details
Published in: 2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68-80
Main Authors: Asahi, Yuuichi; Padioleau, Thomas; Latu, Guillaume; Bigot, Julien; Grandgirard, Virginie; Obrejan, Kevin
Format: Conference Proceeding
Language: English
Published: IEEE 01-11-2022
Description
Summary: This paper presents a performance-portable implementation of a kinetic plasma simulation code that uses C++ parallel algorithms to run across multiple CPUs and GPUs. Relying on the language-standard parallelism (stdpar) and the proposed language-standard multi-dimensional array support (mdspan), we demonstrate that a performance-portable implementation is possible without harming readability or productivity. For a mini-application, we obtain good overall performance, within 20% of the Kokkos version, on Intel Icelake CPUs and NVIDIA V100 and A100 GPUs. We conclude that stdpar is a good candidate for developing performance-portable and productive code targeting exascale-era platforms, assuming this approach becomes available on AMD and/or Intel GPUs in the future.
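
As a rough illustration of the programming style the summary refers to, the following minimal sketch computes a density moment of a 2D distribution function f(x, v) with a standard algorithm over an index range. It is not taken from the paper: the function name `density`, the row-major layout, and the parameters `nx`, `nv`, and `dv` are assumptions made for this example. The flat index `ix * nv + iv` stands in for what an mdspan view would express as `f(ix, iv)`, and the execution policy is left out so that the sketch builds without a parallel backend.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the paper):
// rho(ix) = sum over iv of f(ix, iv) * dv.
// f is stored flat in row-major order, so f(ix, iv) is f[ix * nv + iv].
std::vector<double> density(const std::vector<double>& f,
                            std::size_t nx, std::size_t nv, double dv) {
    std::vector<double> rho(nx, 0.0);

    // Explicit index range 0 .. nx-1 for the algorithm to iterate over.
    std::vector<std::size_t> ix_range(nx);
    std::iota(ix_range.begin(), ix_range.end(), std::size_t{0});

    // Raw pointers are captured by value, a common idiom when the
    // lambda may be offloaded to an accelerator.
    const double* f_ptr = f.data();
    double* rho_ptr = rho.data();

    // Serial call. With a parallel-capable toolchain, an execution
    // policy such as std::execution::par_unseq (from <execution>)
    // could be passed as the first argument.
    std::for_each(ix_range.begin(), ix_range.end(),
                  [=](std::size_t ix) {
                      double sum = 0.0;
                      for (std::size_t iv = 0; iv < nv; ++iv) {
                          sum += f_ptr[ix * nv + iv];
                      }
                      // Each iteration writes a distinct element,
                      // so iterations are independent.
                      rho_ptr[ix] = sum * dv;
                  });
    return rho;
}
```

Calling `density(f, nx, nv, dv)` on a flat array of `nx * nv` values returns one density value per `ix`. Because the loop body depends only on its own index, the same lambda can be reused unchanged when an execution policy is supplied, which is the property that makes this style attractive for portability.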
ISSN: 2831-3909
DOI: 10.1109/P3HPC56579.2022.00012