+1 (non-binding)

On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou <solip...@pitrou.net> wrote:
>
> Sorry, had forgotten to send my vote on this.
>
> +1 from me.
>
> Regards
>
> Antoine.
>
>
> On Wed, 14 Aug 2019 17:42:33 -0500
> Wes McKinney <wesmck...@gmail.com> wrote:
> > hi all,
> >
> > As we've been discussing [1], there is a need to introduce 4 bytes of
> > padding into the preamble of the "encapsulated IPC message" format to
> > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > aligned memory offset. The alternative to this would be for Arrow
> > implementations where alignment is important (e.g. C or C++) to copy
> > the metadata (which is not always small) into memory when it is
> > unaligned.
> >
> > Micah has proposed to address this by adding a
> > 4-byte "continuation" value at the beginning of the payload
> > having the value 0xFFFFFFFF. The reason to do it this way is that
> > old clients will see an invalid length (what is currently the
> > first 4 bytes of the message -- a 32-bit little endian signed
> > integer indicating the metadata length) rather than potentially
> > crashing on a valid length. We also propose to expand the "end of
> > stream" marker used in the stream and file format from 4 to 8
> > bytes. This has the additional effect of aligning the file footer
> > defined in File.fbs.
> >
> > This would be a backwards incompatible protocol change, so older Arrow
> > libraries would not be able to read these new messages. Maintaining
> > forward compatibility (reading data produced by older libraries) would
> > be possible as we can reason that a value other than the continuation
> > value was produced by an older library (and then validate the
> > Flatbuffer message of course). Arrow implementations could offer a
> > backward compatibility mode for the sake of old readers if they desire
> > (this may also assist with testing).
> >
> > Additionally with this vote, we want to formally approve the change to
> > the Arrow "file" format to always write the (new 8-byte) end-of-stream
> > marker, which enables code that processes Arrow streams to safely read
> > the file's internal messages as though they were a normal stream.
> >
> > The PR making these changes to the IPC documentation is here
> >
> > https://github.com/apache/arrow/pull/4951
> >
> > Please vote to accept these changes. This vote will be open for at
> > least 72 hours
> >
> > [ ] +1 Adopt these Arrow protocol changes
> > [ ] +0
> > [ ] -1 I disagree because...
> >
> > Here is my vote: +1
> >
> > Thanks,
> > Wes
> >
> > [1]: https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
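
For anyone updating a reader, here is a rough Python sketch of the message prefix handling described in the quoted proposal. It is only an illustration: read_metadata_length and write_prefix are hypothetical helper names, not functions from any Arrow library, and treating a zero length after the continuation value as the 8-byte end-of-stream marker is my reading of the documentation PR rather than something stated in this thread. Flatbuffer validation of the metadata is left out.

```python
import io
import struct

# Continuation value proposed in the quoted message.
CONTINUATION = 0xFFFFFFFF


def read_metadata_length(stream):
    """Return (metadata_length, is_new_format) for the next message prefix.

    Hypothetical helper for illustration only. A length of 0 is treated
    as end of stream (an assumption, see the note above the code).
    """
    prefix = stream.read(4)
    if len(prefix) < 4:
        raise EOFError("truncated message prefix")
    (first,) = struct.unpack("<I", prefix)

    if first == CONTINUATION:
        # New framing: the real length follows the 4-byte continuation value.
        raw = stream.read(4)
        if len(raw) < 4:
            raise EOFError("truncated metadata length")
        is_new = True
    else:
        # Anything other than the continuation value is assumed to come
        # from an older writer, so these 4 bytes are already the length.
        raw = prefix
        is_new = False

    # The length is a 32-bit little-endian signed integer.
    (length,) = struct.unpack("<i", raw)
    if length < 0:
        raise ValueError("invalid metadata length: %d" % length)
    return length, is_new


def write_prefix(metadata_length, legacy=False):
    """Build the message prefix; legacy=True mimics the old 4-byte framing."""
    length = struct.pack("<i", metadata_length)
    if legacy:
        return length
    return struct.pack("<I", CONTINUATION) + length


new_prefix = write_prefix(16)
old_prefix = write_prefix(16, legacy=True)
print(read_metadata_length(io.BytesIO(new_prefix)))
print(read_metadata_length(io.BytesIO(old_prefix)))
# What an old reader sees in the first 4 bytes of a new message:
print(struct.unpack("<i", new_prefix[:4])[0])
```

The last line shows why 0xFFFFFFFF was chosen: read as a signed 32-bit integer it is negative, so an old reader rejects it as an invalid length instead of trying to parse garbage. The new prefix is 8 bytes long, which is what keeps the metadata that follows on an 8-byte boundary when the message itself starts on one.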