Shared-memory block-based fast marching method for hierarchical meshes

Bibliographic Details
Published in: Journal of Computational and Applied Mathematics, Vol. 392, p. 113488
Main Authors: Quell, Michael; Diamantopoulos, Georgios; Hössinger, Andreas; Weinbub, Josef
Format: Journal Article
Language: English
Published: Elsevier B.V., 15-08-2021
Summary: The fast marching method is commonly used in expanding front simulations in various fields, such as fluid dynamics, computer graphics, and microelectronics, to restore the signed-distance field property of the level-set function, a step also known as re-distancing. To improve the performance of the re-distancing step, parallel algorithms for the fast marching method as well as support for hierarchical meshes have been developed; the latter locally supports higher resolutions of the simulation domain whilst limiting the impact on the overall computational demand. In this work, the previously developed multi-mesh fast marching method is extended by a so-called block-based decomposition step to improve serial and parallel performance on hierarchical meshes. OpenMP tasks are used for the underlying coarse-grained parallelization on a per-mesh basis. The developed approach offers improved load balancing because the algorithm employs a high mesh partitioning degree, making it possible to balance mesh partitions of varying sizes. Various benchmarks and parameter studies are performed on representative geometries of varying complexity. The serial performance is increased by up to 21%, and parallel speedups ranging from 7.4 to 19.1 are achieved for various test cases on a 24-core Intel Skylake computing platform, effectively doubling the parallel efficiency of the previous approach.
ISSN: 0377-0427, 1879-1778
DOI: 10.1016/j.cam.2021.113488
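
The abstract outlines the computational pattern at a high level: re-distancing restores the signed-distance property by solving the Eikonal equation |∇φ| = 1 outward from the front, and the parallelization spawns one OpenMP task per mesh block so the runtime can balance blocks of differing size. The following C++ sketch illustrates that task structure only; Block, march_block(), and redistance() are hypothetical stand-ins for illustration, not the authors' implementation.

#include <cstddef>
#include <vector>

struct Block {
    std::vector<double> phi;  // level-set values on this block's grid points
};

// Placeholder for serial fast marching restricted to one block: repeatedly
// accept the smallest tentative value from a heap and update its neighbors
// through the local Eikonal stencil. Omitted here to keep the sketch short.
void march_block(Block& block) {
    (void)block;
}

void redistance(std::vector<Block>& blocks) {
    #pragma omp parallel
    #pragma omp single
    {
        // One task per block: many more tasks than threads, so the OpenMP
        // runtime can balance blocks of varying size across the cores.
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            #pragma omp task shared(blocks) firstprivate(i)
            march_block(blocks[i]);
        }
        #pragma omp taskwait  // all blocks marched before continuing
        // In a real solver, values at shared block boundaries must still be
        // exchanged and re-marched until the distance field is consistent.
    }
}

Spawning many more tasks than cores corresponds to the high mesh partitioning degree the abstract credits for the improved load balancing; how boundary consistency between blocks is maintained is part of the paper's block-based decomposition step and is not reproduced in this sketch.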