I don't see a problem with adding endianness as a flag in the IPC metadata and raising an exception whenever big-endian data is encountered, at least for the time being. Since big-endian hardware is so exotic nowadays, I don't think it's unreasonable to expect IBM or other vendors that need big-endian support to contribute the byte-swapping logic when the time comes. I suppose this just means we'll have to be careful in code reviews should any algorithms get written that assume a particular endianness. I'll defer to others' judgment on this, though.
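
To make that concrete, here's a rough sketch (in Python, with made-up names; the real flag and its values would be defined by whatever enum ends up in the IPC metadata schema) of the kind of check an implementation could do on read:

import sys

# Made-up constants for illustration; the actual values would come
# from the endianness flag proposed for the IPC metadata.
LITTLE_ENDIAN = 0
BIG_ENDIAN = 1

def check_endianness(metadata_endianness):
    """Reject non-native byte order until byte-swapping is implemented."""
    native = LITTLE_ENDIAN if sys.byteorder == "little" else BIG_ENDIAN
    if metadata_endianness != native:
        raise NotImplementedError(
            "Arrow data with non-native byte order is not yet supported")

The nice property is that the check is cheap and local to the IPC layer, so algorithms written against the arrays never have to think about it.
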
On Fri, Apr 22, 2016 at 11:59 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
> This was discussed on a previous thread
> (https://mail-archives.apache.org/mod_mbox/arrow-dev/201604.mbox/%3CCAKa9qDkppFrJQCHsSN7CmkJCzOTAhGPERMd_u2CMZANNQGtNyw%40mail.gmail.com%3E;
> the relevant snippet is pasted below), but I'd like to reopen this
> because it appears Spark supports big-endian systems (high-end IBM
> hardware). Right now the spec says:
>
> "The Arrow format is little endian."
>
> I'd like to change this to something like:
>
> "Algorithms written against Arrow Arrays should assume native
> byte-ordering. Endianness is communicated via IPC/RPC metadata and
> conversion to native byte-ordering is handled via IPC/RPC
> implementations."
>
> What do other people think?
>
> My assumption is that most deployments for the systems we are
> targeting are going to be homogeneous in terms of byte ordering. I
> think this can allow initial implementations to ignore support for
> non-native byte ordering (i.e. raise an exception if detected).
> Has this been others' experience?
>
> Thanks,
> Micah
>
> Snippet from the original thread:
>
>>> 1. For completeness it might be useful to add a statement that the
>>> byte order (endianness) is platform native.
>
>> Actually, Arrow is little-endian. It is an oversight that we haven't
>> documented it as such. One of the key capabilities is to push it
>> across the wire between separate systems without serialization (not
>> just IPC). As such, we have to pick an endianness. If there is a huge
>> need for a second big-endian encoding, we'll need to extend the spec
>> to support that as a property.
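
For what it's worth, the "conversion to native byte-ordering is handled via IPC/RPC implementations" part of Micah's proposed wording could be a thin shim at ingest time. A hand-wavy Python sketch for fixed-width buffers only (ignoring validity bitmaps, variable-width types, and nested layouts, which would all need type-aware handling):

import array
import sys

def to_native_order(buf, itemsize, source_is_little):
    """Byte-swap a fixed-width buffer into native order if needed.

    Toy example only; real Arrow buffers need per-type handling.
    """
    if (sys.byteorder == "little") == source_is_little:
        return buf  # already native byte order; zero-copy passthrough
    typecode = {2: "h", 4: "i", 8: "q"}[itemsize]  # 16/32/64-bit widths
    swapped = array.array(typecode, buf)
    swapped.byteswap()
    return swapped.tobytes()

On a homogeneous little-endian deployment the swap branch never fires, which is why letting initial implementations skip it (and just raise, as above) seems safe.
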