In case anyone wants to comment further, I've opened
https://github.com/apache/arrow/pull/8374
<https://github.com/apache/arrow/pull/8374#pullrequestreview-504324361> to
canonicalize the details.

On Mon, Sep 28, 2020 at 9:08 PM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> OK, I will try to update documentation reflecting this in the next few
> days (in particular it would be good to document which implementations are
> willing to support byte flipping).
>
> On Tue, Sep 22, 2020 at 3:30 AM Antoine Pitrou <anto...@python.org> wrote:
>
>>
>>
>> On 22/09/2020 at 06:36, Micah Kornfield wrote:
>> > I wanted to give this thread a bump. Does the proposal I made below
>> > sound reasonable?
>>
>> It does!
>>
>> Regards
>>
>> Antoine.
>>
>>
>> >
>> > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield <emkornfi...@gmail.com>
>> > wrote:
>> >
>> >> If I read the responses so far, it seems like the following might be a
>> >> good compromise/summary:
>> >>
>> >> 1. It does not seem too invasive to support native endianness in
>> >> implementation libraries, as long as there is appropriate performance
>> >> testing and CI infrastructure to demonstrate the changes work.
>> >> 2. It is up to implementation maintainers if they wish to accept PRs that
>> >> handle byte swapping between different architectures.  (Right now it
>> >> sounds like C++ is potentially OK with it, and for Java at least Jacques
>> >> is opposed to it?)
>> >>
>> >> Testing changes that break big-endian can be a potential drag on
>> >> developer productivity, but there are methods to run locally (at least
>> >> on more recent OSes).
>> >>
>> >> Thoughts?
>> >>
>> >> Thanks,
>> >> Micah
>> >>
>> >> On Mon, Aug 31, 2020 at 7:08 PM Fan Liya <liya.fa...@gmail.com> wrote:
>> >>
>> >>> Thanks to Kazuaki for the survey and to Micah for starting the
>> >>> discussion.
>> >>>
>> >>> I do not oppose supporting BE. In fact, I am in general optimistic
>> >>> about the performance impact (for Java).
>> >>> IMO, this is going to be a painful process (many byte-order-related
>> >>> problems are tricky to debug), so I hope we can keep it short.
>> >>>
>> >>> It is good that someone is willing to take this on, and I would like
>> >>> to provide help if needed.
>> >>>
>> >>> Best,
>> >>> Liya Fan
>> >>>
>> >>>
>> >>>
>> >>> On Tue, Sep 1, 2020 at 7:25 AM Bryan Cutler <cutl...@gmail.com> wrote:
>> >>>
>> >>>> I also think this would be a worthwhile addition and help the project
>> >>>> expand in more areas. Beyond the Apache Spark optimization use case,
>> >>>> having Arrow interoperability with the Python data science stack on BE
>> >>>> would be very useful. I have looked at the remaining PRs for Java and
>> >>>> they seem pretty minimal and straightforward. Implementing the
>> >>>> equivalent record batch swapping as done in C++ at [1] would be a little
>> >>>> more involved, but still reasonable. Would it make sense to create a
>> >>>> branch to apply all remaining changes with CI to get a better picture
>> >>>> before deciding on bringing it into the master branch?  I could help out
>> >>>> with shepherding this effort and assist in maintenance, if we decide to
>> >>>> accept.
>> >>>>
>> >>>> Bryan
>> >>>>
>> >>>> [1] https://github.com/apache/arrow/pull/7507
>> >>>>
>> >>>> On Mon, Aug 31, 2020 at 1:42 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> >>>>
>> >>>>> I think it's well within the right of an implementation to reject BE
>> >>>>> data (or non-native-endian), but if an implementation chooses to
>> >>>>> implement and maintain the endianness conversions, then it does not
>> >>>>> seem so bad to me.
>> >>>>>
>> >>>>> On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
>> >>>>>>
>> >>>>>> And yes, for those of you looking closely, I commented on ARROW-245
>> >>>>>> when it was committed. I just forgot about it.
>> >>>>>>
>> >>>>>> It looks like I had mostly the same concerns then that I do now :)
>> >>>>>> Now I'm just more worried about format sprawl...
>> >>>>>>
>> >>>>>> On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau <jacq...@apache.org> wrote:
>> >>>>>>
>> >>>>>>>> What do you mean?  The Endianness field (a Big|Little enum) was
>> >>>>>>>> added 4 years ago:
>> >>>>>>>> https://issues.apache.org/jira/browse/ARROW-245
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> I didn't realize that was done, my bad. Good example of format rot
>> >>>>>>> from my pov.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
