BDEv 3.0: Energy efficiency and microarchitectural characterization of Big Data processing frameworks
Published in: | Future generation computer systems Vol. 86; pp. 565 - 581 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier B.V, 01-09-2018 |
Summary: | As the size of Big Data workloads keeps increasing, evaluating distributed frameworks becomes crucial for identifying performance bottlenecks that may delay the processing of large datasets. While most existing works focus only on execution time and resource utilization, analyzing other important metrics is key to fully understanding the behavior of these frameworks. For example, microarchitecture-level events can provide meaningful insights into the interaction between frameworks and hardware. Moreover, energy consumption is gaining increasing attention as systems scale to thousands of cores. This work discusses the current state of the art in evaluating distributed processing frameworks, and extends our Big Data Evaluator tool (BDEv) to extract energy efficiency and microarchitecture-level metrics from the execution of representative Big Data workloads. An experimental evaluation using BDEv demonstrates its usefulness in extracting meaningful information about popular frameworks such as Hadoop, Spark and Flink.
Highlights: • A comprehensive state-of-the-art survey about the benchmarking of Big Data systems. • Proposal of BDEv 3.0, a holistic evaluation tool for Big Data processing frameworks. • BDEv includes resource usage, energy efficiency and microarchitectural metrics. • A practical use case of BDEv comparing current versions of Hadoop, Spark and Flink. |
---|---|
ISSN: | 0167-739X 1872-7115 |
DOI: | 10.1016/j.future.2018.04.030 |