I don't intend table transfer protocols to be "Arrow protocols"; I agree
that using the Arrow brand would confuse people in this case.

"It wouldn't transport Arrow data" -- no, it *would* transport Arrow data
sometimes, potentially even all of the time in specific deployments and
workloads:
- When there is any row-level filtering or aggregation done on the server
side -- this is exactly Arrow Flight's scenario
- When there is no row-level filtering or aggregation on the server side,
but the client side is not familiar with the server side's column-level
encodings or compressions and asks the server side to convert columns to the
baseline format which the client side understands, which is Arrow. I
mentioned this in my reply to Micah a few messages above. The part of the
original writing where I discuss this is here
<https://engineeringideas.substack.com/i/148153268/column-sections-and-transfer-encodings>.
- When there is no row-level filtering or aggregation on the server side,
but the data is freshly ingested into the DB and is not even compressed by
the DB yet, à la Druid's or Pinot's real-time ingestion segments, or any
HTAP or time-series database's "staging" data segments.
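To make the second scenario above concrete, here is a minimal sketch of the
encoding negotiation I have in mind: the client advertises the column
encodings it understands, and the server either passes the stored column
bytes through as-is or falls back to transcoding into the Arrow baseline.
All names, message shapes, and the "native-lz4" encoding are hypothetical
illustrations, not part of Arrow Flight or any existing protocol:

```python
# Hypothetical sketch of the encoding-negotiation fallback; everything
# here is an illustrative assumption, not an existing Arrow Flight API.
from dataclasses import dataclass

ARROW = "arrow"  # the baseline format every client must understand


@dataclass
class ColumnChunk:
    name: str
    encoding: str  # e.g. a DB-native encoding such as "native-lz4"
    payload: bytes  # column bytes in that encoding


class Server:
    def __init__(self, chunks):
        self._chunks = {c.name: c for c in chunks}

    def read_column(self, name, client_encodings):
        """Pass the column through as stored if the client understands
        the native encoding; otherwise transcode to the Arrow baseline."""
        chunk = self._chunks[name]
        if chunk.encoding in client_encodings:
            return chunk  # pass-through of the on-disk bytes, no recoding
        return ColumnChunk(chunk.name, ARROW, self._to_arrow(chunk))

    def _to_arrow(self, chunk):
        # A real server would decode the native encoding and re-encode
        # as Arrow; stubbed as a tagged copy for illustration only.
        return b"arrow:" + chunk.payload


server = Server([ColumnChunk("price", "native-lz4", b"\x01\x02")])

# A client that knows the native encoding gets the bytes as stored:
as_is = server.read_column("price", {"native-lz4", ARROW})
# A client that only speaks Arrow triggers the server-side conversion:
converted = server.read_column("price", {ARROW})
print(as_is.encoding, converted.encoding)  # native-lz4 arrow
```

The point of the sketch is that the conversion to Arrow happens only when
the client cannot consume the native encoding, rather than unconditionally
as in Arrow Flight today.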

On Fri, 13 Sept 2024 at 19:11, Antoine Pitrou <anto...@python.org> wrote:

>
> Ok, so to make things clear, you're simply having a general discussion
> about something that would not be a Arrow protocol (since it wouldn't
> transport Arrow data, if I understand your complaints about Flight)?
>
> If so, then it would be less confusing if you stopped talking about
> "improved Arrow Flight".
>
> Regards
>
> Antoine.
>
>
>
> Le 13/09/2024 à 11:54, Roman Leventov a écrit :
> > Hello Antoine,
> >
> > I understand what Arrow is. Perhaps, you are confused by the phrases
> > "Improved Arrow Flight" or "Arrow Flight++" that I used? These are
> > metaphors intended to convey the essence of the table transfer protocols
> as
> > succinctly as possible (though imperfectly), not the actual proposals to
> > make table transfer protocols the next versions of Arrow Flight.
> >
> > I started this discussion in the Arrow dev list because there is
> probably a
> > reasonably high concentration of interested audience here.
> >
> > FWIW, I thought that DataFusion would be a more fitting home for the
> > development of these protocols (at least, initially) among any of the
> > existing Apache projects. I chose this dev list rather than DataFusion's
> > dev list because there is already Arrow Flight as a transfer protocol
> > reference for discussion (rather than "DataFusion Flight"), and because
> all
> > potentially interested people in the DataFusion dev list likely read this
> > dev list anyway.
> >
> > Re: sending Parquet files over HTTP, it would not help to efficiently
> > query/process data stored in ClickHouse's (Databend, Doris, Druid,
> > DeepLake, Firebolt, Lance, Oxla, Pinot, QuestDB, SingleStore, etc.)
> native
> > at-rest partition file formats.
> >
> > On Fri, 13 Sept 2024 at 16:43, Antoine Pitrou <anto...@python.org>
> wrote:
> >
> >>
> >> Hello,
> >>
> >> I'm perplexed by this discussion. If you want to send highly-compressed
> >> files over the network that is already possible: just send Parquet
> >> files via HTTP(S) (or another protocol of choice).
> >>
> >> Arrow Flight is simply a *streaming* protocol that allows
> >> sending/requesting the Arrow format over gRPC. The Arrow format was
> >> never meant to be high-compression as it's primarily an in-memory
> >> format, and only secondarily a transfer/storage format.
> >>
> >> I don't think it's reasonable to enlarge the scope of Arrow to
> >> something else.
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >>
> >> On Tue, 10 Sep 2024 01:29:05 +0800
> >> Roman Leventov <leven...@apache.org> wrote:
> >>> Hello Arrow community,
> >>>
> >>> I've recently proposed a family of "table transfer protocols": Table
> >> Read,
> >>> Table Write, and Table Replication protocols (provisional names) that I
> >>> think can become a full-standing alternative to Iceberg for interop
> >> between
> >>> diverse databases/data warehouses and processing and ML engines:
> >>>
> >>
> https://engineeringideas.substack.com/p/table-transfer-protocols-improved.
> >>>
> >>> It appears to me that the "convergence on Iceberg" doesn't really serve
> >> the
> >>> users because the competition in the storage layer is inhibited and
> >>> everyone is using the "lowest common denominator" format that has
> >>> limitations. See more details about this argument at the link above.
> >>>
> >>> But why then is everyone so excited about Iceberg and pushing
> >>> databases to add Iceberg catalogs (essentially turning themselves into
> >>> processing engines), in addition to their native storage formats?
> >>>   1) Cheap storage -- not due to Iceberg per se, but due to object
> >> storage.
> >>> Snowflake, Redshift, Firebolt, Databend, and other DBs have their own
> >>> object storage-first storage formats, too, and many other DBs can be
> >>> configured with aggressive tiering to object storage.
> >>>   2) Interop with different processing engines -- Arrow Flight provides
> >>> this, too
> >>>   3) Ability to lift and move data to another DB or cloud -- this is
> fair,
> >>> but the mere ability of the DB to write Iceberg permits the data team
> to
> >>> convert and move *if* they decide to do it, but before that, why
> wouldn't
> >>> they use the DB's native format, more optimised for this DB, the ingestion
> >>> pattern, etc.?
> >>> 4) *Parallel writing* -- processing engine's nodes can write parquet
> >> files
> >>> completely in parallel within an Iceberg transaction, Arrow Flight
> >> doesn't
> >>> permit this
> >>> 5) *Reading compressed data* from object storage to external processing
> >>> engines -- Arrow Flight requires converting everything into Arrow before
> >>> sending. If the processing job needs to pull a lot of Parquet
> >>> files/columns (even after filtering partitions), converting them all to
> >>> Arrow before sending can at minimum increase the job latency quite a
> >>> bit, and at maximum will cost much more money for egress.
> >>>
> >>> I cannot think of any other substantial reason (when compared with DBs
> >>> rather than with Hive tables). If this is the case, Arrow Flight is not
> >> yet
> >>> a full-standing alternative to Iceberg catalogs for these three
> reasons:
> >>>   1) *No parallel writes*
> >>>   2) *Inability to send the table column data in the on-disk compressed
> >>> formats* (rather than Arrow), and
> >>>   3) A few other provisions in the protocol design that don't make it
> easy
> >>> to control/balance the load of DB nodes, failover to replica nodes, the
> >>> ability to re-request partition's endpoint from the server "in the same
> >>> query" (that is, at the same consistency snapshot), etc.
> >>>
> >>> These are the things that I tried to address in Table Read
> >>> <
> >>
> https://engineeringideas.substack.com/i/148153268/table-read-protocol-walkthrough
> >>>
> >>> and Table Write
> >>> <
> >>
> https://engineeringideas.substack.com/i/148485754/use-cases-for-table-write-protocol
> >>>
> >>> protocols.
> >>>
> >>> So, I think the vendors of DBs like ClickHouse, Databend, Druid, Doris,
> >>> InfluxDB, StarRocks, SingleStore, and others might be interested in
> these
> >>> new protocols to gain a foothold because it would enable these DBs to
> >> compete
> >>> on storage formats, and indexing/ingestion approaches, rather than be
> >>> reduced to "processing engines for Iceberg", where there are already
> far
> >>> too many other options.
> >>>
> >>> Specialised and "challenger" processing engine vendors might be
> >> interested,
> >>> too, as table transfer protocols that I proposed are kind of "Arrow
> >>> Flight++", maybe there is a room for something like "DataFusion
> >> Ballista++"
> >>> where table transfer protocols are used as source and sink protocols
> >>> instead of Arrow Flight.
> >>>
> >>> I'm interested in developing these protocols, but it requires some
> >>> preliminary interest from vendors to validate the idea. Other feedback
> is
> >>> welcome, too.
> >>>
> >>
> >>
> >>
> >>
> >
>
