An Autonomic Approach for the Selection of Robust Dynamic Loop Scheduling Techniques

Bibliographic Details
Published in:	2017 16th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 9-17
Main Authors:	Boulmier, Anthony; Banicescu, Ioana; Ciorba, Florina M.; Abdennadher, Nabil
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-07-2017
Subjects:	Algorithm design and analysis; Autonomic Computing; Dynamic Loop Scheduling; Dynamic scheduling; Heuristic algorithms; High Performance Computing; Learning (artificial intelligence); Measurement; Performance Optimization; Processor scheduling; Reinforcement Learning; Robustness

Description
Summary:	Parallel applications are highly irregular, and high performance computing (HPC) infrastructures are very complex. The HPC applications of interest herein are timestepping scientific applications (TSSA). Often, TSSA involve the repeated execution of multiple parallel loops with thousands of iterations and irregular behavior. Dynamic loop scheduling (DLS) techniques have been developed over time and have proven effective at scheduling parallel loops to achieve load balancing of TSSA. Using a single DLS technique throughout the execution of a time-step, or even throughout the entire application, does not guarantee optimal performance, because of unpredictable variations in problem and algorithmic characteristics as well as in infrastructure capabilities. For that reason, the autonomic selection of DLS techniques as a function of the parallel loop execution time has been shown to improve application performance. Recently, a robustness metric for DLS techniques, named "flexibility", has been proposed to estimate the capability of a DLS technique to resist variations in the execution times of loop iterations. To improve the performance of TSSA, we propose in this work an approach that autonomically selects DLS techniques as a function of their flexibility. The first major novelty of our approach lies in the use of state-of-the-art reinforcement learning (RL) algorithms as smart agents. The second novelty lies in the design of a modified flexibility metric. The third major novelty resides in using the modified flexibility metric as a reward for the smart agents. The fourth novelty is the evaluation of the proposed approach within a simulated environment, in particular using the SimGrid-SMPI interface to execute DLS algorithms. We discuss the advantages and limitations of the proposed flexibility metric as a reward.
DOI:	10.1109/ISPDC.2017.9
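
The selection loop described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a simple epsilon-greedy value update for the state-of-the-art RL algorithms the paper refers to. The class `DLSSelector`, the function `toy_reward`, the technique list, and all numeric values are hypothetical names and placeholders chosen for illustration. In particular, `toy_reward` is only a stand-in for the modified flexibility metric, whose definition is not given in this record.

```python
import random


class DLSSelector:
    """Epsilon-greedy agent that picks one DLS technique per time-step.

    Illustrative only: a simplified stand-in for the RL agents in the paper.
    """

    def __init__(self, techniques, epsilon=0.1, alpha=0.5, seed=0):
        self.techniques = list(techniques)
        self.epsilon = epsilon      # probability of exploring a random technique
        self.alpha = alpha          # learning rate for the value update
        self.rng = random.Random(seed)
        self.value = {t: 0.0 for t in self.techniques}

    def select(self):
        """Explore with probability epsilon, otherwise exploit the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.techniques)
        return max(self.techniques, key=lambda t: self.value[t])

    def update(self, technique, reward):
        """Move the estimate for `technique` a fraction alpha toward `reward`."""
        self.value[technique] += self.alpha * (reward - self.value[technique])


def toy_reward(technique, rng):
    """Hypothetical reward standing in for a measured robustness score.

    The base values are invented placeholders, not results from the paper.
    """
    base = {"STATIC": 0.2, "SS": 0.5, "GSS": 0.6, "FAC": 0.8}[technique]
    return base + rng.uniform(-0.05, 0.05)


def run(time_steps=200, seed=0):
    """Simulate an application that reselects its DLS technique every time-step."""
    rng = random.Random(seed)
    agent = DLSSelector(["STATIC", "SS", "GSS", "FAC"], seed=seed)
    for _ in range(time_steps):
        technique = agent.select()            # choose a technique for this time-step
        reward = toy_reward(technique, rng)   # in practice: measured after the loop runs
        agent.update(technique, reward)
    return agent


if __name__ == "__main__":
    print(run().value)
```

In an actual setting, the reward would be computed from measurements of the executed loop (for example, inside a simulation) rather than drawn from a fixed table as it is here.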