Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers

Bibliographic Details
Published in:	2016 First International Workshop on Communication Optimizations in HPC (COMHPC), pp. 73-81
Main Authors:	Tessier, Francois; Malakar, Preeti; Vishwanath, Venkatram; Jeannot, Emmanuel; Isaila, Florin
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-11-2016
Subjects:	Bandwidth; Computational modeling; Data aggregation; Network topology; Supercomputers; Topology; Writing

Description
Summary:	Reading and writing data efficiently from storage systems is critical for high-performance, data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which some processes are elected to aggregate data from a set of neighbors and write the aggregated data to storage. Bandwidth use can thus be optimized while contention is reduced. In this work, we take the network topology into account when mapping aggregators, and we propose an optimized buffering system to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation. We show I/O operations up to 15× faster than with a standard implementation of MPI I/O.
DOI:	10.1109/COMHPC.2016.013