Hello,

My colleagues at Deephaven Data Labs and I have been addressing problems at the intersection of data-driven applications, data science, and updating (ticking) data for some years.

Deephaven has a query engine that supports updating tabular data via a protocol that communicates precise changes about datasets, such as 1) which rows were removed, 2) which rows were added, and 3) which rows were modified (and for which columns). We are inspired by Arrow and would like to adopt a version of this protocol that adheres to goals similar to those of Arrow and Arrow Flight.

Out of the box, Arrow Flight is insufficient to represent such a stream of changes. For example, because you cannot identify a particular row within an Arrow Flight, you cannot indicate which rows were removed or modified.

The project integrates with Arrow Flight at the header-metadata level. We have preliminarily named the project Barrage, as in a "barrage of arrows," which plays in the same "namespace" as a "flight of arrows."

We built this as part of an initiative to modernize and open up our table IPC mechanisms. This is part of a larger open source effort which will become more visible in the next month or so, once we've finished the work necessary to share our core software components, including a unified static and real-time query engine complete with data visualization tools, a REPL experience, Jupyter integration, and more.

I would like to find out:
- if we have understood the primary goals of Arrow, and are honoring them as closely as possible
- if there are other projects that might benefit from sharing this extension of Arrow Flight
- if there are any gaps that are best addressed early on to maximize future compatibility

A great place to digest the concepts that differ from Arrow Flight is here:
https://deephaven.github.io/barrage/Concepts.html

The proposed protocol can be perused here:
https://github.com/deephaven/barrage

Internally, we already have a Java server and Java client implemented as a working proof of concept for our use case.

I really look forward to your feedback; thank you!

Nate Bauernfeind

Deephaven Data Labs - https://deephaven.io/