Spear: Optimized Dependency-Aware Task Scheduling with Deep Reinforcement Learning

Bibliographic Details
Published in: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pp. 2037-2046
Main Authors: Hu, Zhiming; Tu, James; Li, Baochun
Format: Conference Proceeding
Language: English
Published: IEEE 01-07-2019
Description
Summary: Modern data parallel frameworks, such as Apache Spark, are designed to execute complex data processing jobs that contain a large number of tasks, with dependencies between these tasks represented by a directed acyclic graph (DAG). When scheduling these tasks, the ultimate objective is to minimize the makespan of the schedule, which is equivalent to minimizing the job completion time. With task dependencies, however, minimizing the makespan of the schedule is non-trivial, especially when tasks in the DAG have different resource demands with respect to multiple resource types. In this paper, we present Spear, a new scheduling framework designed to minimize the makespan of complex jobs, while considering both task dependencies and heterogeneous resource demands at the same time. Inspired by recent advances in artificial intelligence, Spear applies Monte Carlo Tree Search (MCTS) in the specific context of task scheduling, and trains a deep reinforcement learning model to guide the expansion and rollout steps in MCTS. With deep reinforcement learning, search efficiency can be significantly improved by focusing on more promising branches. With both simulations and experiments using traces from production workloads, we compare the scheduling performance of Spear with state-of-the-art job schedulers in the literature, and Spear can outperform those approaches by up to 20%. Our results have validated our claims that MCTS and deep reinforcement learning can readily be applied to optimize the scheduling of complex jobs with task dependencies.
ISSN: 2575-8411
DOI: 10.1109/ICDCS.2019.00201
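
Illustrative note: the summary states that minimizing makespan is non-trivial once tasks have dependencies and demands on multiple resource types. The sketch below is not Spear's algorithm (it uses neither MCTS nor a learned model). It is a minimal greedy list scheduler, written under assumed names (`list_schedule_makespan`, a two-resource cluster, a three-task DAG), that only shows how the order in which ready tasks are considered can change the makespan. A search method such as the one described in the summary would be exploring this kind of ordering space.

```python
import heapq


def list_schedule_makespan(tasks, deps, capacity, priority):
    """Greedy list scheduling on one cluster with several resource types.

    tasks:    {name: (duration, demand_tuple)}   (hypothetical format)
    deps:     {name: [parent names]}             (DAG edges, parent -> child)
    capacity: tuple of total resources, same length as each demand_tuple
    priority: list of all task names, most preferred first
    Returns the makespan (finish time of the last task).
    """
    parents = {t: set(deps.get(t, ())) for t in tasks}
    free = list(capacity)
    running = []          # min-heap of (finish_time, task_name)
    started, done = set(), set()
    now = 0

    while len(done) < len(tasks):
        # Start every ready task that fits, in priority order.
        for t in priority:
            if t in started or parents[t] - done:
                continue  # already started, or a parent has not finished
            duration, demand = tasks[t]
            if all(d <= f for d, f in zip(demand, free)):
                free = [f - d for f, d in zip(free, demand)]
                heapq.heappush(running, (now + duration, t))
                started.add(t)

        if not running:
            raise ValueError("no runnable task: cycle or demand exceeds capacity")

        # Advance time to the next completion and release its resources.
        now = running[0][0]
        while running and running[0][0] == now:
            _, finished = heapq.heappop(running)
            done.add(finished)
            free = [f + d for f, d in zip(free, tasks[finished][1])]

    return now


# Assumed toy job: resources are (CPU, memory); C depends on A.
tasks = {
    "A": (1, (4, 1)),
    "B": (4, (2, 2)),
    "C": (4, (2, 2)),
}
deps = {"C": ["A"]}
capacity = (4, 4)

good = list_schedule_makespan(tasks, deps, capacity, ["A", "B", "C"])
bad = list_schedule_makespan(tasks, deps, capacity, ["B", "A", "C"])
print(good, bad)  # prints "5 9"
```

In this toy job, running A first unblocks C early so that B and C overlap, while starting B first delays A (which needs all four CPUs) and serializes the job. The same scheduler with a different task order nearly doubles the makespan.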