This was discussed on a previous thread
(https://mail-archives.apache.org/mod_mbox/arrow-dev/201604.mbox/%3CCAKa9qDkppFrJQCHsSN7CmkJCzOTAhGPERMd_u2CMZANNQGtNyw%40mail.gmail.com%3E;
the relevant snippet is pasted below), but I'd like to reopen the
question because it appears Spark supports big-endian systems (high-end
IBM hardware). Right now the spec says:

"The Arrow format is little endian."

I'd like to change this to something like:

"Algorithms written against Arrow Arrays should assume native
byte-ordering. Endianness is communicated via IPC/RPC metadata and
conversion to native byte-ordering is handled via IPC/RPC
implementations".

What do other people think?

My assumption is that most deployments of the systems we are targeting
will be homogeneous in terms of byte ordering. I think this allows
initial implementations to skip support for non-native byte ordering
(i.e., raise an exception if it is detected). Has this been others'
experience?
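
To make the "raise an exception" idea concrete, here is a rough sketch
in Python. The `schema_endianness` parameter is a hypothetical stand-in
for whatever field the IPC/RPC metadata would eventually carry; the
point is only that an initial implementation compares the producer's
declared byte order against the host's and refuses to byte-swap:

    import sys

    def check_endianness(schema_endianness: str) -> None:
        """Reject payloads whose declared byte order differs from the host's.

        `schema_endianness` is a hypothetical metadata field ("little" or
        "big") that an IPC/RPC reader would extract from the message header.
        """
        if schema_endianness != sys.byteorder:
            raise NotImplementedError(
                f"payload is {schema_endianness}-endian but this host is "
                f"{sys.byteorder}-endian; byte swapping is not implemented"
            )
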

Thanks,
Micah

Snippet from the original thread:
>>
>> 1.  For completeness it might be useful to add a statement that the
>> byte order (endianness) is platform native.


> Actually, Arrow is little-endian. It is an oversight that we haven't
> documented it as such. One of the key capabilities is to push it across
> the wire between separate systems without serialization (not just IPC).
> As such, we have to pick an endianness. If there is a huge need for a
> second big-endian encoding, we'll need to extend the spec to support
> that as a property.
