Re: [INFO_REQUEST][FLIGHT] - Dynamic schema changes in ArrowFlight streams

Wes McKinney Fri, 09 Jul 2021 08:25:44 -0700

It sounds like we may want to discuss some potential evolutions of the
Arrow binary protocol (for example: new Message types). Certainly a
can of worms but rather than trying to bolt some new functionality
onto the existing structures, it might be better to support the new
use cases through some new structures which will be more clear cut
from a forward compatibility standpoint.


On Wed, Jul 7, 2021 at 8:31 PM David Li <lidav...@apache.org> wrote:
>
> To summarize so far, it sounds like schema evolution is neither sufficient 
> nor necessary for either Gosh or Nate's use-cases here? It could be useful 
> for FlightSQL but even there I don't think it's a requirement.
>
> For Nate - it almost sounds like what you need is some way to slice up a 
> record batch and send columns individually, which isn't really a concept in 
> IPC (and hence Flight). Or rather, record batch is almost the wrong 
> abstraction for your use case (when you're sending per-column deltas), even 
> though you could model it as a record batch with 'empty' columns or as a 
> stream with constantly shifting schema (neither of which are perfect 
> encodings).
>
> -David
>
> On Wed, Jul 7, 2021, at 13:24, Nate Bauernfeind wrote:
> > > Flatbuffers does not support modifying structs
> > > in any forwards or backwards compatible way
> > > (only tables support evolution).
> >
> > Bah. I did not realize that.
> >
> > To reiterate the feature that would be ideal:
> > I realize the specific feature I am missing is the ability to encode that a
> > field (i.e. its FieldNode and accompanying Buffers in the RecordBatch) is
> > empty/has-no-data in O(0) cost (yes; for free).
> >
> > Since RecordBatch is a table, might there be interest in adding a field
> > that is a bitset (formatted as a byte vector) to indicate the subset of
> > root FieldNodes that are included in the RecordBatch's field-node-list
> > (noting that the buffer list most be appropriately filtered, too)? If the
> > bitset is omitted/empty we can be forwards-compatible by assuming every
> > field is included (seems fair; because why send a record batch without a
> > payload?). This does bring down the cost from 16 bytes (field node) + 32
> > bytes (2x buffer node minimum; validity buffer node + payload buffer node)
> > to 1 bit per field node -- that's a compression ratio of >= 384:1 --
> > although it's not free, it's a lot closer.
> >
> > Are there any other alternatives the Arrow community might consider?
> >
> >
> > > This sounds a lot like what DenseUnion provides though?
> >
> > The use case is as follows: We send multiple record batches that aggregate
> > / accumulate into a single logical update. The first set of record batches
> > contain payload for rows that were added since the last logical update
> > (this is a set of updates to accommodate that not every implementation
> > supports 32-bit lengths). A field node for every column and every added row
> > will be sent. For this half of the logical update the RecordBatch length
> > matches the root FieldNode lengths. The second set of record batches
> > contain payload for only rows, and fields, that were actually modified. If
> > only one field changed, we send only that payload for that row. There is
> > additional metadata that allows the client to understand which existing row
> > is to be replaced by any given row for a given Field/Column. In this
> > context, each root field node has a different mapping to minimize total
> > payload size.
> >
> > We might be able to use DenseUnion but it certainly feels like folding the
> > entire data-source into a dense union makes the design less useful and less
> > accessible. I will spend some time sleeping on your suggestion, but I'm not
> > immediately excited about it. At this moment, I suspect I will continue to
> > lie and state that the RecordBatch's length is the max length across all
> > root field node lengths (and be content that it's not ideal from a
> > copy/allocation perspective).
> > -
> >
> > On Wed, Jul 7, 2021 at 10:57 AM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> >
> > > >
> > > > Might there be interest in adding a "field_id" to the FieldNode (which 
> > > > is
> > > > encoded on the RecordBatch flatbuffer)? I see a simple 
> > > > forward-compatible
> > > > upgrade (by either keying off of 0, or explicitly set the field default
> > > to
> > > > -1) which would allow the sender to "skip" fields that have 1) FieldNode
> > > > length of zero, and 2) all Buffer's associated at that level (and 
> > > > further
> > > > nested) are also equally empty (i.e. Buffer length is zero).
> > >
> > >
> > > FieldNode is a struct in Message.fbs.   Flatbuffers does not support
> > > modifying structs in any forwards or backwards compatible way (only tables
> > > support evolution).  I think there was originally more metadata in
> > > FieldNode and it was stripped down due to size concerns.
> > >
> > > I understand this concept slightly interferes with RecordBatch's `length`
> > > > field, and that many implementations use that length to resize the
> > > > root-level FieldNodes. The use-case I have in mind has different logical
> > > > lengths per field node; current implementations require sending a
> > > > RecordBatch length of the max length across all root level field nodes. 
> > > > I
> > > > believe this requires a copy of data whenever a field node is too short;
> > > I
> > > > don't know if there is a decent solution to this slight inefficiency. I
> > > am
> > > > bringing it up because if "skipping a field node when it is empty" is a
> > > > feature, then we may not want to allocate space for those nodes given
> > > that
> > > > the record batch length will likely be greater than zero.
> > >
> > >
> > > Having conflicting RecordBatch and top-level field nodes is something that
> > > I believe we have pushed back on in the past.   This sounds a lot like
> > > what  DenseUnion provides though?
> > >
> > > On Wed, Jul 7, 2021 at 8:32 AM Nate Bauernfeind <
> > > natebauernfe...@deephaven.io> wrote:
> > >
> > > > Deephaven and I are very supportive of "upgrading" the value half of the
> > > kv
> > > > pair to a byte vector. What is the best way to find out if there is
> > > > sufficient interest?
> > > >
> > > >
> > > > I've been stewing on the ideas here around schema evolution, and I
> > > realize
> > > > the specific feature I am missing is the ability to encode that a field
> > > > (i.e. its FieldNode and accompanying Buffers in the RecordBatch) is
> > > > empty/has-no-data in O(0) cost (yes; for free).
> > > >
> > > > Might there be interest in adding a "field_id" to the FieldNode (which 
> > > > is
> > > > encoded on the RecordBatch flatbuffer)? I see a simple 
> > > > forward-compatible
> > > > upgrade (by either keying off of 0, or explicitly set the field default
> > > to
> > > > -1) which would allow the sender to "skip" fields that have 1) FieldNode
> > > > length of zero, and 2) all Buffer's associated at that level (and 
> > > > further
> > > > nested) are also equally empty (i.e. Buffer length is zero).
> > > >
> > > > I understand this concept slightly interferes with RecordBatch's 
> > > > `length`
> > > > field, and that many implementations use that length to resize the
> > > > root-level FieldNodes. The use-case I have in mind has different logical
> > > > lengths per field node; current implementations require sending a
> > > > RecordBatch length of the max length across all root level field nodes. 
> > > > I
> > > > believe this requires a copy of data whenever a field node is too short;
> > > I
> > > > don't know if there is a decent solution to this slight inefficiency. I
> > > am
> > > > bringing it up because if "skipping a field node when it is empty" is a
> > > > feature, then we may not want to allocate space for those nodes given
> > > that
> > > > the record batch length will likely be greater than zero.
> > > >
> > > > On Wed, Jul 7, 2021 at 8:12 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > > On Wed, Jul 7, 2021 at 2:53 PM David Li <apa...@lidavidm.me> wrote:
> > > > > >
> > > > > > From the Flatbuffers internals doc[1] it appears they are the same:
> > > > > "Strings are simply a vector of bytes, and are always 
> > > > > null-terminated."
> > > > >
> > > > > I see. I took a look at flatbuffers.h, and it appears that changing
> > > > > this field from string to [byte] would be backward-compatible and
> > > > > forward-compatible except with code that expects a null terminator.
> > > > > This is something we could discuss separately if there were enough
> > > > > interest.
> > > > >
> > > > > > [1]: https://google.github.io/flatbuffers/flatbuffers_internals.html
> > > > > >
> > > > > > -David
> > > > > >
> > > > > > On Wed, Jul 7, 2021, at 05:08, Wes McKinney wrote:
> > > > > > > On Tue, Jul 6, 2021 at 6:33 PM Micah Kornfield <
> > > > emkornfi...@gmail.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Right, I had wanted to focus the discussion on Flight as I
> > > think
> > > > > schema
> > > > > > > > > evolution or multiplexing streams (more so the latter) is a
> > > > > property of the
> > > > > > > > > transport and not the stream format itself. If we are leaning
> > > > > towards just
> > > > > > > > > schema evolution then maybe it makes sense to discuss it for
> > > the
> > > > > IPC stream
> > > > > > > > > format and leverage that in Flight. I'd be interested in what
> > > > > others think.
> > > > > > > >
> > > > > > > > I tend to agree, I think stream multiplexing is likely a
> > > transport
> > > > > level
> > > > > > > > issue.  IMO I think schema evolution should be consistent with
> > > the
> > > > > IPC
> > > > > > > > stream format  and flight.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Nate: it may be worth starting a separate discussion about 
> > > > > > > > > more
> > > > > general
> > > > > > > > > metadata in the IPC message. I'm not aware of why key-value
> > > > > metadata was
> > > > > > > > > chosen/if opaque bytes were considered in the past.
> > > > > > > >
> > > > > > > >
> > > > > > > > I think  this was an unfortunate design of the key value 
> > > > > > > > metadata
> > > > in
> > > > > > > > Schema.fbs, but I don't think I was around when this decision 
> > > > > > > > was
> > > > > made.
> > > > > > >
> > > > > > > I agree that it's unfortunate that we did not use [ byte ] instead
> > > of
> > > > > > > string for the value in the KeyValue metadata — I think this was
> > > more
> > > > > > > of an oversight than a deliberate choice (e.g. it was not our
> > > intent
> > > > > > > to require binary data to be base64-encoded — this is something
> > > that
> > > > > > > we have to do when encoding binary data in Thrift KeyValue 
> > > > > > > metadata
> > > > > > > for Parquet, for example). Is the binary representation of [byte]
> > > > > > > different from string?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Side Question: Why isn't the IPC stream format a series of the
> > > > flight
> > > > > > > > > protobufs?
> > > > > > > >
> > > > > > > > In addition to what David said, protobufs can't be read directly
> > > > > from a
> > > > > > > > memory-mapped file (they need decoding).  This was one of the
> > > > design
> > > > > > > > considerations of using flatbuffers and IPC Stream/File format.
> > > > > > > >
> > > > > > > > I was thinking Micah's comment is more that whatever we do, it
> > > > > should be
> > > > > > > > > clearly specified and edge cases should be considered,
> > > especially
> > > > > if we
> > > > > > > > > might want to 'backport' this into the stream format later.
> > > > > > > >
> > > > > > > >
> > > > > > > > Yes, for dictionaries we just need to be careful to define
> > > > semantics
> > > > > and
> > > > > > > > ensure implementations are validating them with regards to
> > > > > dictionaries.
> > > > > > > > There likely isn't any need to change current implementations
> > > > though.
> > > > > > > >
> > > > > > > > On Mon, Jun 28, 2021 at 1:25 PM David Li <lidav...@apache.org>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Right, I had wanted to focus the discussion on Flight as I
> > > think
> > > > > schema
> > > > > > > > > evolution or multiplexing streams (more so the latter) is a
> > > > > property of the
> > > > > > > > > transport and not the stream format itself. If we are leaning
> > > > > towards just
> > > > > > > > > schema evolution then maybe it makes sense to discuss it for
> > > the
> > > > > IPC stream
> > > > > > > > > format and leverage that in Flight. I'd be interested in what
> > > > > others think.
> > > > > > > > >
> > > > > > > > > Especially if we are looking at multiplexing streams - I would
> > > > > wonder if
> > > > > > > > > that's actually better served by making it easier to implement
> > > > > using the
> > > > > > > > > Flight implementation as it stands (by managing concurrent RPC
> > > > > calls and/or
> > > > > > > > > performing the union-of-structs encoding trick for you),
> > > instead
> > > > > of having
> > > > > > > > > to change the protocol.
> > > > > > > > >
> > > > > > > > > Nate: it may be worth starting a separate discussion about 
> > > > > > > > > more
> > > > > general
> > > > > > > > > metadata in the IPC message. I'm not aware of why key-value
> > > > > metadata was
> > > > > > > > > chosen/if opaque bytes were considered in the past. Off the 
> > > > > > > > > top
> > > > of
> > > > > my head
> > > > > > > > > if it's for on-disk storage and fully application-defined it
> > > may
> > > > > make sense
> > > > > > > > > to store as a separate file alongside the Arrow file (indexed
> > > by
> > > > > record
> > > > > > > > > batch index) where you can take advantage of whatever format 
> > > > > > > > > is
> > > > > most
> > > > > > > > > suitable.
> > > > > > > > >
> > > > > > > > > -David
> > > > > > > > >
> > > > > > > > > On Sun, Jun 27, 2021, at 07:50, Gosh Arzumanyan wrote:
> > > > > > > > > > Hi guys,
> > > > > > > > > >
> > > > > > > > > > 1. Regarding IPC vs Flight: in fact my initial suggestion 
> > > > > > > > > > was
> > > > to
> > > > > add this
> > > > > > > > > > feature starting from the IPC(I moved initial write up steps
> > > to
> > > > > the
> > > > > > > > > bottom
> > > > > > > > > > of the doc). Afterwards David suggested focusing on Flight
> > > and
> > > > > that's how
> > > > > > > > > > we ended up with the protobufs change in the proposal. This
> > > > > being said I
> > > > > > > > > do
> > > > > > > > > > think that the place where this should be impemented is a
> > > good
> > > > > question
> > > > > > > > > on
> > > > > > > > > > its own. Maybe it makes sense to have this kind of a feature
> > > in
> > > > > IPC and
> > > > > > > > > > somehow use it in Flight, maybe not.
> > > > > > > > > > 2. The point about dictionaries deserves a dedicated section
> > > in
> > > > > the
> > > > > > > > > > proposal. Nate and David brought it up and shared some
> > > > insights.
> > > > > I'll try
> > > > > > > > > > to aggregate them and we can continue the discussion form
> > > > there.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Gosh
> > > > > > > > > >
> > > > > > > > > > On Sat., 26 Jun. 2021, 17:26 Nate Bauernfeind, <
> > > > > > > > > natebauernfe...@deephaven.io>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > makes it more difficult to bring schema evolution
> > > back
> > > > > into the
> > > > > > > > > > > > > > IPC Stream format (i.e. it would live only in 
> > > > > > > > > > > > > > flight)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Gosh's proposal extends the flatbuffer structures not
> > > the
> > > > > > > > > protobufs.
> > > > > > > > > > > Can
> > > > > > > > > > > > > you help me understand how difficult it would be to
> > > bring
> > > > > the
> > > > > > > > > > > `schema_id`
> > > > > > > > > > > > > approach to the IPC stream format?
> > > > > > > > > > > >
> > > > > > > > > > > > I thought we were talking solely about the Flight
> > > Protobuf
> > > > > > > > > definitions -
> > > > > > > > > > > > not the Flatbuffers (and the Google doc at least only
> > > talks
> > > > > about the
> > > > > > > > > > > > Protobufs).
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I somehow missed that schema_id is being added to protobuf
> > > in
> > > > > the
> > > > > > > > > document.
> > > > > > > > > > > It feels to me that the schema_id is a property that would
> > > > > ideally only
> > > > > > > > > > > apply to the RecordBatch. I better understand Micah's
> > > > > dictionary
> > > > > > > > > concerns,
> > > > > > > > > > > now, too.
> > > > > > > > > > >
> > > > > > > > > > > > Side Question: Why isn't the IPC stream format a series
> > > of
> > > > > the flight
> > > > > > > > > > > > > protobufs? It's a real shame that there is no standard
> > > > way
> > > > > to
> > > > > > > > > > > > > capture/replay a stream with app_metadata. (Obviously
> > > > > ignoring the
> > > > > > > > > > > > > annoyances around protobuf wrapping flatbuffers.)
> > > > > > > > > > > >
> > > > > > > > > > > > The IPC format was defined long before Flight, and
> > > Flight's
> > > > > > > > > app_metadata
> > > > > > > > > > > > was added after Flight's initial definition. Note an IPC
> > > > > message does
> > > > > > > > > > > have
> > > > > > > > > > > > a provision for key-value metadata, though I think APIs
> > > for
> > > > > that are
> > > > > > > > > not
> > > > > > > > > > > > fully exposed. (See ARROW-6940:
> > > > > > > > > > > > https://issues.apache.org/jira/browse/ARROW-6940 and
> > > > > despite my
> > > > > > > > > comments
> > > > > > > > > > > > there perhaps we need to unify or at least consider how
> > > > > Flight's
> > > > > > > > > > > > app_metadata relates to the IPC message custom_metadata.
> > > > Also
> > > > > > > > > perhaps see
> > > > > > > > > > > > ARROW-1059.)
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > KeyValue unfortunately is string to string. In flatbuffer
> > > > > strings are
> > > > > > > > > only
> > > > > > > > > > > UTF-8 or 7-bit ASCII. The app_metadata on the other hand 
> > > > > > > > > > > is
> > > > > opaque
> > > > > > > > > bytes.
> > > > > > > > > > > The latter is a bit more useful.
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > >
> >
> >
> > --
> >

Re: [INFO_REQUEST][FLIGHT] - Dynamic schema changes in ArrowFlight streams

Reply via email to