Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud

Bibliographic Details
Published in: IEEE Access, Vol. 7, pp. 177063-177081
Main Authors: Wangsom, Peerasak, Lavangnananda, Kittichai, Bouvry, Pascal
Format: Journal Article
Language: English
Published: Piscataway IEEE 2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Description
Summary: Because scientific workflows serve several purposes simultaneously, scheduling them in dynamic environments such as cloud computing has become a multi-objective problem. Among these purposes, Cost and Makespan are probably the two most fundamental objectives. Another critical factor in a large-scale scientific workflow is the tremendous amount of data handled during execution. This work therefore includes Data Movement as an additional objective, as it has a major impact on network utilization and on the energy consumption of network equipment in a cloud data center. To address these three objectives, this work proposes a scheduling framework that combines a new node clustering technique for the Directed Acyclic Graph (DAG) model, known as Multilevel Dependent Node Clustering (MDNC), with a multi-objective optimizer, the Extreme Nondominated Sorting Genetic Algorithm-III (E-NSGA-III). E-NSGA-III is a recent extension of the Nondominated Sorting Genetic Algorithm-III (NSGA-III). Five well-known scientific workflows, CyberShake, Epigenomics, LIGO, Montage, and SIPHT, are selected as testbeds, and the widely used Hypervolume is chosen as the performance metric. MDNC is also evaluated with both NSGA-III and E-NSGA-III. Three approaches are compared: E-NSGA-III alone, E-NSGA-III with Peer-to-Peer clustering, and E-NSGA-III with MDNC. The superiority of the proposed framework among them, and its limitations, are discussed.
ISSN:2169-3536
DOI:10.1109/ACCESS.2019.2957998
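
Illustrative note: the summary names three minimized objectives (Cost, Makespan, Data Movement) and Hypervolume as the performance metric. The sketch below is not taken from the paper; it only illustrates, under stated assumptions, how a nondominated set and its hypervolume could be computed for such three-objective schedules. The function names, the sample schedules, and the reference point are all hypothetical, and the hypervolume routine is a simple exact grid decomposition chosen for clarity, not the method used by the authors.

```python
from typing import List, Tuple

# One schedule is scored as (cost, makespan, data_movement); all are minimized.
# The units and values used here are hypothetical.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def hypervolume(points: List[Point], ref: Point) -> float:
    """Exact volume dominated by `points` and bounded by the reference point `ref`.

    Assumes minimization. The objective space is cut into grid cells along
    every point coordinate; a cell counts if some point is <= its lower corner.
    """
    # Points that do not strictly improve on the reference point contribute nothing.
    pts = [p for p in points if all(x < r for x, r in zip(p, ref))]
    if not pts:
        return 0.0
    axes = [sorted({p[i] for p in pts} | {ref[i]}) for i in range(3)]
    total = 0.0
    for x0, x1 in zip(axes[0], axes[0][1:]):
        for y0, y1 in zip(axes[1], axes[1][1:]):
            for z0, z1 in zip(axes[2], axes[2][1:]):
                if any(p[0] <= x0 and p[1] <= y0 and p[2] <= z0 for p in pts):
                    total += (x1 - x0) * (y1 - y0) * (z1 - z0)
    return total


# Hypothetical schedules; the second is dominated by the first.
schedules = [(1, 2, 3), (2, 3, 4), (2, 1, 3)]
front = nondominated(schedules)
volume = hypervolume(front, ref=(3, 3, 4))
```

A larger hypervolume means the front covers more of the objective space below the reference point, which is why the metric can rank competing schedulers with a single number. The grid decomposition is only practical for small fronts; dedicated hypervolume algorithms scale better.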