I'm trying to work out the exact steps in my mind for a migration. It seems
like one approach is:

1.  Add a code change which throws a clear exception when it encounters -1
for the size.  In Java the reasonable place seems to be at [1] (there might
be more?).  The exception should state that the current stream reader isn't
compatible with version 1.0.0 streams (we should have similar ones in each
language).  We can add a note about the environment variable from step 2 if
we decide to add it.  Release this change as 0.15.0 or 0.14.2 and ensure at
least Spark upgrades to this version.  (A rough sketch of the check follows
after this list.)

2.  Change the reader implementation to support reading 1.0.0 streams while
remaining backwards compatible with pre-1.0.0 streams.  Change the writer
implementation to default to writing 1.0.0 streams but add an environment
variable that makes it write backwards-compatible streams (writer
compatibility seems like it should be optional).  Release this as 1.0.0.
(A sketch of a reader that handles both framings is also below.)

3. If we do add the environment variable switch, remove it in a later
release.
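
To make step 1 concrete, here is a rough sketch of what the check could
look like in Java. This is hypothetical code, not the actual
MessageChannelReader logic at [1]; the exact error message and the places
that need the check would have to be confirmed against the spec and code:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.ReadableByteChannel;

final class LegacyLengthCheck {
  // Reads the 4-byte little-endian length prefix of the next IPC message
  // and fails fast when it sees 0xFFFFFFFF (-1), which a 1.0.0 writer
  // would emit as a continuation marker before the real length.
  static int readMessageLength(ReadableByteChannel in) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
    while (buf.hasRemaining()) {
      if (in.read(buf) < 0) {
        return 0;  // treat EOF as end of stream for this sketch
      }
    }
    int length = buf.getInt(0);
    if (length == -1) {
      throw new IOException(
          "This Arrow stream reader is not compatible with Arrow 1.0.0 "
              + "streams; please upgrade the reader (or have the writer "
              + "emit the legacy format)");
    }
    return length;
  }
}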
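
For step 2, a minimal sketch of a reader that accepts both framings, plus
an environment-variable switch on the writer side. The class, method and
variable names here (including ARROW_PRE_1_0_0_IPC_FORMAT) are made up for
illustration only; they are not real Arrow APIs or an agreed-on variable
name:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.ReadableByteChannel;

final class DualFormatFraming {
  // Returns the flatbuffer metadata length of the next message, accepting
  // both the pre-1.0.0 framing (<int32 size>) and the 1.0.0 framing
  // (<0xFFFFFFFF continuation><int32 size>); 0 means end of stream.
  static int readMetadataLength(ReadableByteChannel in) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
    if (!readFully(in, buf)) {
      return 0;  // end of stream (partial reads treated the same here)
    }
    int word = buf.getInt(0);
    if (word != -1) {
      return word;  // pre-1.0.0 stream: the first word is already the size
    }
    // 1.0.0 stream: continuation marker seen, the real size follows.
    buf.clear();
    if (!readFully(in, buf)) {
      throw new IOException("Stream ended after 0xFFFFFFFF continuation marker");
    }
    return buf.getInt(0);
  }

  private static boolean readFully(ReadableByteChannel in, ByteBuffer buf)
      throws IOException {
    while (buf.hasRemaining()) {
      if (in.read(buf) < 0) {
        return false;
      }
    }
    return true;
  }

  // Writer side: default to the 1.0.0 framing, but honor an opt-out
  // environment variable (the name is illustrative only).
  static boolean writeLegacyIpcFormat() {
    return "1".equals(System.getenv("ARROW_PRE_1_0_0_IPC_FORMAT"));
  }
}

The writer would then emit <0xFFFFFFFF><int32 size><metadata> by default
and fall back to the plain <int32 size><metadata> framing when the switch
is set.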

Thanks,
Micah

[1]
https://github.com/apache/arrow/blob/9fe728c86caaf9ceb1827159eb172ff81fb98550/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageChannelReader.java#L67

On Thu, Jul 18, 2019 at 8:58 PM Wes McKinney <wesmck...@gmail.com> wrote:

> To be clear, we could make a patch 0.14.x release that includes the
> necessary compatibility changes. I presume Spark will be able to upgrade to
> a new patch release (I'd be surprised if not, otherwise how can you get
> security fixes)?
>
> On Thu, Jul 18, 2019, 10:52 PM Bryan Cutler <cutl...@gmail.com> wrote:
>
> > Hey Wes,
> > I understand we don't want to burden 1.0 by maintaining compatibility and
> > that is fine with me. I'm just trying to figure out how best to handle this
> > situation so Spark users won't get a cryptic error message. It sounds like
> > it will need to be handled on the Spark side to not allow mixing 1.0 and
> > pre-1.0 versions. I'm not too sure how much a 0.15.0 release with
> > compatibility would help, it might depend on when things get released but
> > we can discuss that in another thread.
> >
> > On Thu, Jul 18, 2019 at 12:03 PM Wes McKinney <wesmck...@gmail.com>
> wrote:
> >
> > > hi Bryan -- well, the reason for the current 0.x version is precisely
> > > to avoid a situation where we are making decisions on the basis of
> > > maintaining forward / backward compatibility.
> > >
> > > One possible way forward on this is to make a 0.15.0 (or 0.14.2, so there
> > > is less trouble for Spark to upgrade) release that supports reading
> > > _both_ old and new variants of the protocol.
> > >
> > > On Thu, Jul 18, 2019 at 1:20 PM Bryan Cutler <cutl...@gmail.com>
> wrote:
> > > >
> > > > Are we going to say that Arrow 1.0 is not compatible with any version
> > > > before?  My concern is that Spark 2.4.x might get stuck on Arrow Java
> > > > 0.14.1 and a lot of users will install PyArrow 1.0.0, which will not
> > > > work. In Spark 3.0.0, though, it will be no problem to update both Java
> > > > and Python to 1.0. Having a compatibility mode so that new
> > > > readers/writers can work with old readers using a 4-byte prefix would
> > > > solve the problem, but if we don't want to do this, will pyarrow be able
> > > > to raise an error that clearly states the new version does not support
> > > > the old protocol?  For example, would a pyarrow reader see the
> > > > 0xFFFFFFFF and raise something like "PyArrow detected an old protocol
> > > > and cannot continue, please use a version < 1.0.0"?
> > > >
> > > > On Thu, Jul 11, 2019 at 12:39 PM Wes McKinney <wesmck...@gmail.com>
> > > wrote:
> > > >
> > > > > Hi Francois -- copying the metadata into memory isn't the end of the
> > > > > world but it's a pretty ugly wart. This affects every IPC protocol
> > > > > message everywhere.
> > > > >
> > > > > We have an opportunity to address the wart now but such a fix
> > > > > post-1.0.0 will be much more difficult.
> > > > >
> > > > > On Thu, Jul 11, 2019, 2:05 PM Francois Saint-Jacques <fsaintjacq...@gmail.com> wrote:
> > > > >
> > > > > > If the data buffers are still aligned, then I don't think we should
> > > > > > add a breaking change just for avoiding the copy on the metadata? I'd
> > > > > > expect said metadata to be small enough that zero-copy doesn't really
> > > > > > affect performance.
> > > > > >
> > > > > > François
> > > > > >
> > > > > > On Sun, Jun 30, 2019 at 4:01 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > > > > >
> > > > > > > While working on trying to fix undefined behavior for unaligned
> > > > > > > memory accesses [1], I ran into an issue with the IPC specification
> > > > > > > [2] which prevents us from ever achieving zero-copy memory mapping
> > > > > > > and having aligned accesses (i.e. clean UBSan runs).
> > > > > > >
> > > > > > > Flatbuffer metadata needs 8-byte alignment to guarantee aligned
> > > > > > > accesses.
> > > > > > >
> > > > > > > In the IPC format we align each message to 8-byte boundaries.  We
> > > > > > > then write an int32_t integer to denote the size of the flatbuffer
> > > > > > > metadata, followed immediately by the flatbuffer metadata.  This
> > > > > > > means the flatbuffer metadata will never be 8-byte aligned.
> > > > > > >
> > > > > > > Do people care?  A simple fix would be to use int64_t instead of
> > > > > > > int32_t for the length.  However, any fix essentially breaks all
> > > > > > > previous client library versions or incurs a memory copy.
> > > > > > >
> > > > > > > [1] https://github.com/apache/arrow/pull/4757
> > > > > > > [2] https://arrow.apache.org/docs/ipc.html
> > > > > >
> > > > >
> > >
> >
>
