Serverless-like platform for container-based YARN clusters

Bibliographic Details
Published in: Future Generation Computer Systems, Vol. 155, pp. 256-271
Main Authors: Castellanos-Rodríguez, Óscar; Expósito, Roberto R.; Enes, Jonatan; Taboada, Guillermo L.; Touriño, Juan
Format: Journal Article
Language: English
Published: Elsevier B.V., 01-06-2024
Description
Summary: Serverless computing is an emerging paradigm that has gained considerable relevance in recent years, as it allows users to consume computing resources without worrying about the underlying infrastructure and to pay only for what they actually use. Most current services that implement this paradigm rely on the Function-as-a-Service (FaaS) model, which works well for simple applications based on stateless functions triggered by specific events. However, these services are not designed to run more complex applications with intricate interactions: they are usually difficult to configure, offer little ability to customise the execution environment, or both. They also tend to be designed for short and simple workloads, with some services even limiting maximum runtime to just a few minutes. In this paper, we present a platform based on Hadoop YARN for executing Big Data workloads in a containerised and serverless way, so that the resources allocated to the containers are automatically and dynamically scaled according to their actual usage. An experimental evaluation compares our serverless-like platform with a standard YARN deployment when executing Big Data workloads concurrently. The results show that the platform improves both performance and overall resource efficiency, with runtime reductions of up to 41% and resource usage improvements of up to 50%.

Highlights:
• Platform to deploy serverless YARN clusters for the execution of Big Data workloads.
• Fine-grained scaling and reallocation of cluster container resources in real time.
• Automated deployment system through IaC tools and web interface to ease management.
• Runtime and resource efficiency improvements of up to 41% and 50%, respectively.
ISSN: 0167-739X, 1872-7115
DOI: 10.1016/j.future.2024.02.013