Hello everyone,

I would like to share a small open-source project that may be of interest
to the Arrow Flight community.

RNArrow is a C++ library and Arrow Flight gRPC server that reads ROOT
RNTuple files and exposes the data as Apache Arrow, both in-process via
pybind11 and over the network via Flight.

Quick context: RNTuple is the new columnar successor to ROOT's TTree format
(production ready in ROOT 6.36, May 2025), used at CERN's LHC experiments
for petabytes of detector data per year. Before this project there was no
standalone C++ library to convert RNTuple to Arrow, and no Arrow Flight
server for this data type.

A bit about me: I am a software engineer and an active contributor to HSF
and HEP, and I recently finished working on an LHCb-affiliated project.

v0.1 covers primitive and single-level list columns. RecordBatches are
built per cluster, the Python boundary uses the Arrow C Data Interface, and
the gRPC server is a FlightServerBase subclass. The license is Apache 2.0.

Reference:
  Code: https://github.com/KaranSinghDev/RNTuple-Arrow-Gateway
  DOI:  https://doi.org/10.5281/zenodo.20034922

Feedback on the Flight server design or benchmark methodology would be much
appreciated. I am open to suggestions and collaborations.

Karan Singh
