ROOT’s RNTuple I/O Subsystem: The Path to Production

Bibliographic Details
Published in: EPJ Web of Conferences, Vol. 295, p. 6020
Main Authors: Blomer, Jakob; Canal, Philippe; de Geus, Florine; Hahnfeld, Jonas; Naumann, Axel; Lopez-Gomez, Javier; Lazzari Miotto, Giovanna; Padulano, Vincenzo Eduardo
Format: Journal Article; Conference Proceeding
Language: English
Published: Les Ulis: EDP Sciences, 2024
Description
Summary: The RNTuple I/O subsystem is ROOT’s future event data file format and access API. It is driven by the expected data volume increase at upcoming HEP experiments, e.g. at the HL-LHC, and by recent opportunities in the storage hardware and software landscape, such as NVMe drives and distributed object stores. RNTuple is a redesign of the TTree binary format and API and has been shown to deliver substantially faster data throughput and better data compression compared both to TTree and to industry-standard formats. For HENP computing workflows to benefit from RNTuple’s superior performance, however, the I/O stack needs to connect efficiently to the rest of the ecosystem, from grid storage to (distributed) analysis frameworks to (multithreaded) experiment frameworks for reconstruction and ntuple derivation. With the RNTuple binary format soon arriving at its first production release, we present RNTuple’s feature set, its integration efforts, and its performance impact on the time-to-solution. We show the latest performance figures for RDataFrame analysis code of realistic complexity, comparing RNTuple and TTree as data sources. We discuss RNTuple’s approach to functionality critical to HENP I/O (such as multithreaded writes, fast data merging, and schema evolution), and we provide an outlook on the road to its use in production.
ISSN: 2100-014X
2101-6275
DOI: 10.1051/epjconf/202429506020