Hmm... not sure about that.  IMHO, if the old format is detected, then a
4-byte EOS marker should be used.  If the new format is detected, then a
8-byte EOS marker should be used.

Regards

Antoine.


Le 07/08/2019 à 16:16, Wes McKinney a écrit :
> You make a good point. For backward compatibility reasons, bytes 5
> through 8 would need to be unspecified padding bytes, I think. Does
> that sound right?
> 
> On Wed, Aug 7, 2019 at 9:02 AM Antoine Pitrou <anto...@python.org> wrote:
>>
>>
>> This may be coming a bit late, but I realize we could take the
>> opportunity to *also* make the end-of-stream marker a 8-bytes marker
>> (rather than 4-bytes).  What do you think?
>>
>> Regards
>>
>> Antoine.
>>
>>
>> Le 06/08/2019 à 22:15, Wes McKinney a écrit :
>>> hi all,
>>>
>>> As we've been discussing for the last 5 weeks or so [1], there is a
>>> need to introduce 4 bytes of padding into the preamble of the
>>> "encapsulated IPC message" format to ensure that the Flatbuffers
>>> metadata payload begins on an 8-byte aligned memory offset. The
>>> alternative to this would be for Arrow implementations where alignment
>>> is important (e.g. C or C++) to copy the metadata (which is not always
>>> small) into memory when it is unaligned.
>>>
>>> Micah has proposed to address this by adding a 4-byte "continuation"
>>> value at the beginning of the payload having the value 0xFFFFFFFF. The
>>> reason to do it this way is that old clients will see an invalid
>>> length (what is currently the first 4 bytes of the message -- a 32-bit
>>> little endian signed integer indicating the metadata length) rather
>>> than potentially crashing on a valid length.
>>>
>>> This would be a backwards incompatible protocol change, so older Arrow
>>> libraries would not be able to read these new messages. Maintaining
>>> forward compatibility (reading data produced by older libraries) would
>>> be possible as we can reason that a value other than the continuation
>>> value was produced by an older library (and then validate the
>>> Flatbuffer message of course). Arrow implementations could offer a
>>> backward compatibility mode for the sake of old readers if they desire
>>> (this may also assist with testing).
>>>
>>> The PR making these changes to the IPC documentation is here
>>>
>>> https://github.com/apache/arrow/pull/4951
>>>
>>> Please vote to accept this change. This vote will be open for at least 72 
>>> hours
>>>
>>> [ ] +1 Adopt the Arrow protocol change
>>> [ ] +0
>>> [ ] -1 I disagree because...
>>>
>>> Here is my vote: +1
>>>
>>> Thanks,
>>> Wes
>>>
>>> [1]: 
>>> https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
>>>

Reply via email to