Hello Micah, thanks for the valuable input, replies inline.

On Fri, 13 Sept 2024 at 01:06, Micah Kornfield <emkornfi...@gmail.com> wrote:
> IIUC you are advocating for something which to some extent is a
> generalization of Iceberg's REST catalog planning API [1]. But instead
> of just Iceberg, it would encompass other formats, and also be more
> flexible on negotiating which parts of the computation client vs server
> are responsible for?

Yes, that is a fair rough description of table transfer protocols.
However, I de-emphasize "formats" and instead focus on column-level
encodings.

> The main downside to this approach is you end up requiring client side
> libraries for any potential scan operation (e.g. Iceberg, Parquet,
> Nimble, etc) or you need to negotiate with the server on supported
> tasks. This is solvable but adds complexity to deployments and
> operational debuggability.

Yes. The server tells the client which transfer encodings it supports for
each column, and Arrow encodings are always among them (at least if the
column type itself is known to Arrow) as a fallback. The client may decide
whether it can perform the scan operations on its side for the given
at-rest encoding/compression. If the client doesn't support the scan
operator for the at-rest encoding, it can either request the column in
Arrow format (that is, require the server to decode the column on its
side) or, to trade off network traffic against its own memory usage and
CPU, still request the server to transfer the column in the at-rest
format, decode it without applying the scan operation, and then apply the
scan operation to each decoded stripe/block in a generic way. (A rough
sketch of this client-side decision logic is at the end of this message.)

I agree this is complicated and re-introduces sub-optimality if the
processing engine (client) doesn't support the at-rest file format. But in
the absence of table transfer protocols, the situation is not any better.

However, it seems that a combination of Velox *or* DataFusion (libraries
specialised in scan operators on top of different file formats) *with*
Apache xTable that you referenced below (metadata abstraction) can go a
pretty long way towards my vision. Still, not *all* the way:
1) it doesn't enable closed-source file formats to join the party:
Firebolt, Oxla, SingleStore, Snowflake, etc.,
2) it doesn't leverage SSD/memory caches of column data and/or indexes, and
3) it doesn't cover alternative, non-object-storage architectures à la
Vastdata, BigQuery Managed storage, Meta's Tectonic, etc.

> > 5) *Reading compressed data* from object storage to external
> > processing engines -- Arrow Flight requires converting everything into
> > Arrow before sending. If the processing job needs to pull a lot of
> > Parquet files/columns (even after filtering partitions), converting
> > them all to Arrow before sending will, at a minimum, increase the job
> > latency quite a bit and, at a maximum, cost much more money for egress.
>
> The Dissociated IPC protocol [2] was a start in this direction, and I
> believe the original protocol mentioned referencing parquet files (which
> was removed).

Interesting, I didn't know that. I suggested basing table transfer
protocols on an (unnamed) streaming protocol for columnar data:
https://github.com/apache/arrow/issues/43762, whose main differences from
Dissociated IPC are 1) no hard requirement to use Arrow column encodings
only, and 2) reactive streams-style async flow control (a minimal sketch
follows below).
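To make "reactive streams-style async flow control" a bit more concrete,
here is a minimal, purely illustrative asyncio sketch of the idea. The
names (ChunkSubscription, request, acquire) are mine and are not part of
the proposal in apache/arrow#43762 nor of Dissociated IPC; the only point
is that the receiver grants demand with request(n) and the sender never
has more column chunks in flight than the outstanding demand:

import asyncio


class ChunkSubscription:
    """Receiver-driven demand ("credits"), as in Reactive Streams request(n)."""

    def __init__(self) -> None:
        self._credits = 0
        self._more = asyncio.Event()

    def request(self, n: int) -> None:
        # Receiver signals readiness for n more chunks.
        self._credits += n
        self._more.set()

    async def acquire(self) -> None:
        # Sender waits until at least one credit is outstanding.
        while self._credits == 0:
            self._more.clear()
            await self._more.wait()
        self._credits -= 1


async def sender(sub: ChunkSubscription, chunks, out: asyncio.Queue) -> None:
    for chunk in chunks:
        await sub.acquire()   # never exceed the receiver's demand
        await out.put(chunk)
    await out.put(None)       # end-of-stream marker


async def receiver(sub: ChunkSubscription, out: asyncio.Queue) -> None:
    sub.request(2)            # initial demand: two chunks in flight
    while (chunk := await out.get()) is not None:
        # ... decode the chunk and apply the scan operator here ...
        sub.request(1)        # replenish demand one chunk at a time


async def main() -> None:
    sub, queue = ChunkSubscription(), asyncio.Queue()
    chunks = [b"column-chunk-%d" % i for i in range(5)]
    await asyncio.gather(sender(sub, chunks, queue), receiver(sub, queue))


asyncio.run(main())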
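And, as promised above, a rough sketch of the per-column encoding
negotiation and the client's decision logic. This is hypothetical Python,
not a proposed wire format: the names (ColumnOffer, ColumnRequest,
server_side_scan, etc.) are made up for illustration. The only assumptions
taken from the discussion are that the server advertises per-column
transfer encodings with Arrow always offered as a fallback, and that the
client chooses between (a) scanning the at-rest encoding itself,
(b) pulling the at-rest bytes and scanning generically after decoding, or
(c) asking the server to decode (and scan) into Arrow:

from dataclasses import dataclass
from enum import Enum, auto


class Encoding(Enum):
    ARROW = auto()           # server decodes the column to Arrow before sending
    PARQUET_NATIVE = auto()  # column sent in its at-rest Parquet encoding
    NIMBLE_NATIVE = auto()   # column sent in its at-rest Nimble encoding


@dataclass
class ColumnOffer:
    name: str
    # Encodings the server is willing to send this column in; ARROW is
    # always present as a fallback when the type is representable in Arrow.
    offered: list[Encoding]


@dataclass
class ColumnRequest:
    name: str
    encoding: Encoding
    # True: the server applies the scan operator (filter/projection) before
    # sending. False: the client applies it after decoding stripes/blocks.
    server_side_scan: bool


def choose(offer: ColumnOffer,
           scannable_by_client: set[Encoding],
           prefer_low_egress: bool) -> ColumnRequest:
    """Pick the transfer encoding and scan placement for one column."""
    for enc in offer.offered:
        if enc is not Encoding.ARROW and enc in scannable_by_client:
            # Best case: the client can scan the at-rest encoding directly,
            # so the encoded/compressed bytes cross the wire untouched.
            return ColumnRequest(offer.name, enc, server_side_scan=False)
    at_rest = [e for e in offer.offered if e is not Encoding.ARROW]
    if prefer_low_egress and at_rest:
        # The client cannot scan the at-rest encoding, but still pulls it
        # compressed to save egress, decodes locally, and applies the scan
        # generically to each decoded stripe/block (more client CPU/memory).
        return ColumnRequest(offer.name, at_rest[0], server_side_scan=False)
    # Fallback: ask the server to decode to Arrow (and scan) on its side.
    return ColumnRequest(offer.name, Encoding.ARROW, server_side_scan=True)


# Example: a client that can only scan Arrow, but prefers cheap egress.
offer = ColumnOffer("user_id", [Encoding.PARQUET_NATIVE, Encoding.ARROW])
print(choose(offer, scannable_by_client={Encoding.ARROW}, prefer_low_egress=True))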