I don't intend table transfer protocols to be "Arrow protocols"; I agree that
using the Arrow brand would confuse people in this case.

"It wouldn't transport Arrow data" -- no, it *would* transport Arrow data
sometimes, potentially even all of the time in specific deployments and
workloads:

- When there is any row-level filtering or aggregation done on the server
side -- this is exactly Arrow Flight's scenario.

- When there is no row-level filtering or aggregation on the server side, but
the client side is not familiar with the server side's column-level encodings
or compressions and asks the server side to convert columns to the baseline
format which the client side understands, which is Arrow (a rough sketch of
this negotiation follows below). I mentioned this in my reply to Micah a few
messages above. The part of the original writing where I discuss this is here:
<https://engineeringideas.substack.com/i/148153268/column-sections-and-transfer-encodings>.

- When there is no row-level filtering or aggregation on the server side, but
the data is freshly ingested into the DB and is not even compressed by the DB
yet -- a la Druid's or Pinot's real-time ingestion segments, or any HTAP or
time-series database's "staging" data segment.
On Fri, 13 Sept 2024 at 19:11, Antoine Pitrou <anto...@python.org> wrote:
>
> Ok, so to make things clear, you're simply having a general discussion
> about something that would not be an Arrow protocol (since it wouldn't
> transport Arrow data, if I understand your complaints about Flight)?
>
> If so, then it would be less confusing if you stopped talking about
> "improved Arrow Flight".
>
> Regards
>
> Antoine.
>
>
> Le 13/09/2024 à 11:54, Roman Leventov a écrit :
> > Hello Antoine,
> >
> > I understand what Arrow is. Perhaps you are confused by the phrases
> > "Improved Arrow Flight" or "Arrow Flight++" that I used? These are
> > metaphors intended to convey the essence of the table transfer protocols
> > as succinctly as possible (though imperfectly), not actual proposals to
> > make table transfer protocols the next versions of Arrow Flight.
> >
> > I started this discussion on the Arrow dev list because there is
> > probably a reasonably high concentration of interested audience here.
> >
> > FWIW, I thought that DataFusion would be a more fitting home for the
> > development of these protocols (at least initially) among the existing
> > Apache projects. I chose this dev list rather than DataFusion's dev list
> > because there is already Arrow Flight as a transfer protocol reference
> > for discussion (rather than "DataFusion Flight"), and because all
> > potentially interested people in the DataFusion dev list likely read
> > this dev list anyway.
> >
> > Re: sending Parquet files over HTTP, it would not help to efficiently
> > query/process data stored in ClickHouse's (Databend, Doris, Druid,
> > DeepLake, Firebolt, Lance, Oxla, Pinot, QuestDB, SingleStore, etc.)
> > native at-rest partition file formats.
> >
> > On Fri, 13 Sept 2024 at 16:43, Antoine Pitrou <anto...@python.org> wrote:
> >
> >> Hello,
> >>
> >> I'm perplexed by this discussion. If you want to send highly-compressed
> >> files over the network, that is already possible: just send Parquet
> >> files via HTTP(S) (or another protocol of choice).
> >>
> >> Arrow Flight is simply a *streaming* protocol that allows
> >> sending/requesting the Arrow format over gRPC. The Arrow format was
> >> never meant to be high-compression, as it's primarily an in-memory
> >> format, and only secondarily a transfer/storage format.
> >>
> >> I don't think it's reasonable to enlarge the scope of Arrow to
> >> something else.
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >> On Tue, 10 Sep 2024 01:29:05 +0800
> >> Roman Leventov <leven...@apache.org> wrote:
> >>> Hello Arrow community,
> >>>
> >>> I've recently proposed a family of "table transfer protocols": Table
> >>> Read, Table Write, and Table Replication protocols (provisional names)
> >>> that I think can become a full-standing alternative to Iceberg for
> >>> interop between diverse databases/data warehouses and processing and
> >>> ML engines:
> >>> <https://engineeringideas.substack.com/p/table-transfer-protocols-improved>.
> >>>
> >>> It appears to me that the "convergence on Iceberg" doesn't really
> >>> serve the users because competition in the storage layer is inhibited
> >>> and everyone is using the "lowest common denominator" format that has
> >>> limitations. See more details about this argument at the link above.
> >>>
> >>> But why then is everyone so excited about Iceberg, and pushing
> >>> databases to add Iceberg catalogs (essentially turning themselves into
> >>> processing engines) in addition to their native storage formats?
> >>> 1) Cheap storage -- not due to Iceberg per se, but due to object
> >>> storage. Snowflake, Redshift, Firebolt, Databend, and other DBs have
> >>> their own object storage-first storage formats, too, and many other
> >>> DBs can be configured with aggressive tiering to object storage.
> >>> 2) Interop with different processing engines -- Arrow Flight provides
> >>> this, too.
> >>> 3) Ability to lift and move data to another DB or cloud -- this is
> >>> fair, but the mere ability of the DB to write Iceberg permits the data
> >>> team to convert and move *if* they decide to do it; before that, why
> >>> wouldn't they use the DB's native format, more optimised for this DB,
> >>> the ingestion pattern, etc.?
> >>> 4) *Parallel writing* -- a processing engine's nodes can write Parquet
> >>> files completely in parallel within an Iceberg transaction; Arrow
> >>> Flight doesn't permit this.
> >>> 5) *Reading compressed data* from object storage into external
> >>> processing engines -- Arrow Flight requires converting everything into
> >>> Arrow before sending. If the processing job needs to pull a lot of
> >>> Parquet files/columns (even after filtering partitions), converting
> >>> them all to Arrow before sending will at minimum increase the job
> >>> latency quite a bit, and at maximum will cost much more money for
> >>> egress.
> >>>
> >>> I cannot think of any other substantial reason (when compared with DBs
> >>> rather than with Hive tables). If this is the case, Arrow Flight is
> >>> not yet a full-standing alternative to Iceberg catalogs for these
> >>> three reasons:
> >>> 1) *No parallel writes*,
> >>> 2) *Inability to send the table column data in the on-disk compressed
> >>> formats* (rather than Arrow), and
> >>> 3) A few other provisions in the protocol design that don't make it
> >>> easy to control/balance the load of DB nodes, fail over to replica
> >>> nodes, re-request a partition's endpoint from the server "in the same
> >>> query" (that is, at the same consistency snapshot), etc.
> >>>
> >>> These are the things that I tried to address in the Table Read
> >>> <https://engineeringideas.substack.com/i/148153268/table-read-protocol-walkthrough>
> >>> and Table Write
> >>> <https://engineeringideas.substack.com/i/148485754/use-cases-for-table-write-protocol>
> >>> protocols.
> >>>
> >>> So, I think the vendors of DBs like ClickHouse, Databend, Druid,
> >>> Doris, InfluxDB, StarRocks, SingleStore, and others might be
> >>> interested in these new protocols to gain a foothold, because they
> >>> would enable these DBs to compete on storage formats and
> >>> indexing/ingestion approaches, rather than be reduced to "processing
> >>> engines for Iceberg", where there are already far too many other
> >>> options.
> >>>
> >>> Specialised and "challenger" processing engine vendors might be
> >>> interested, too, as the table transfer protocols that I proposed are a
> >>> kind of "Arrow Flight++"; maybe there is room for something like
> >>> "DataFusion Ballista++" where table transfer protocols are used as the
> >>> source and sink protocols instead of Arrow Flight.
> >>>
> >>> I'm interested in developing these protocols, but it requires some
> >>> preliminary interest from vendors to validate the idea. Other feedback
> >>> is welcome, too.
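P.S. For readers less familiar with Flight, here is roughly what the read
path being compared against looks like with plain Arrow Flight today
(Python/pyarrow; the URI and table name are placeholders, and
endpoint.locations is ignored for brevity). Everything arrives as Arrow
record batches already transcoded by the server, which is exactly the step
the Table Read protocol would make optional:

    import pyarrow as pa
    import pyarrow.flight as flight

    # Placeholder location and dataset name; any Flight server will do.
    client = flight.connect("grpc://localhost:8815")
    descriptor = flight.FlightDescriptor.for_path("example_table")

    # The server may return several endpoints that can be read in parallel
    # (there is no equivalent for parallel *writes* within one transaction).
    info = client.get_flight_info(descriptor)

    tables = []
    for endpoint in info.endpoints:
        # do_get always streams Arrow record batches, regardless of how the
        # data is encoded/compressed at rest on the server.
        reader = client.do_get(endpoint.ticket)
        tables.append(reader.read_all())

    result = pa.concat_tables(tables)
    print(result.num_rows)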