GPU-Accelerated Error-Bounded Lossy Compression for Scientific Data Toward Exascale Computing

Bibliographic Details
Main Author: Tian, Jiannan
Format: Dissertation
Language:English
Published: ProQuest Dissertations & Theses 01-01-2024
Description
Summary:Today’s scientific exploration is driven by large-scale simulations and advanced instruments that generate vast amounts of data. As we advance into the era of exascale computing, data volumes are projected to grow exponentially, imposing an unprecedented burden on supercomputing systems and becoming a bottleneck in scientific applications. Moreover, supercomputers and HPC applications are evolving to be more heterogeneous, incorporating accelerator-based architectures, especially GPUs. This shift presents significant challenges for error-bounded lossy compression, a state-of-the-art data reduction technique for HPC applications valued for its ability to significantly reduce storage overhead while retaining high fidelity for post-analysis. This research aims to design novel GPU-accelerated lossy compressors for scientific data based on SZ, a renowned error-bounded lossy compression framework, to achieve high throughput and high compression ratios while maintaining high data fidelity. The dissertation comprises three research aspects: (1) We design cuSZ, the first GPU-accelerated error-bounded lossy compression framework, with innovations in parallel algorithm design and GPU architectural optimizations, achieving high throughput and balanced compression ratios. (2) We exploit GPU traits akin to the parallel random-access machine (PRAM) model to develop high-throughput Huffman encoding, a critical component of the cuSZ framework. (3) We develop alternative compression pipelines that enhance usability in real-world data-movement scenarios, unifying the goals of high compression ratio, high throughput, and high fidelity. Our aim is to reach the Pareto frontier of scientific data compression and meet the data-processing demands of the exascale era.
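For context, the "error-bounded" guarantee that SZ-family compressors provide can be illustrated with a minimal predict-then-quantize sketch (a 1D Lorenzo-style predictor in NumPy; the function name and structure are illustrative assumptions, not cuSZ's actual API or pipeline):

```python
import numpy as np

def quantize_1d(data, eb):
    """Predict each value from the previous *reconstructed* value,
    then quantize the residual to an integer code so that the decoded
    value satisfies |original - decoded| <= eb (the error bound)."""
    codes = np.empty(len(data), dtype=np.int64)
    decoded = np.empty_like(data)
    prev = 0.0
    for i, x in enumerate(data):
        code = int(round((x - prev) / (2 * eb)))  # quantized residual
        codes[i] = code
        prev = prev + code * 2 * eb               # reconstructed value
        decoded[i] = prev
    return codes, decoded

# Smooth test signal; the integer codes cluster near zero, which is
# what makes the subsequent Huffman encoding stage effective.
data = np.cumsum(np.random.default_rng(0).normal(size=1000)) * 1e-3
codes, decoded = quantize_1d(data, eb=1e-2)
assert np.max(np.abs(data - decoded)) <= 1e-2
```

Because the residual codes are small integers concentrated around zero, they compress well under an entropy coder, which is why high-throughput Huffman encoding is a critical stage in the framework described above.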
ISBN:9798384027522