Re: [Discuss] Extremely dubious Python equality semantics

2020-07-02 Thread Wes McKinney
On Wed, Jul 1, 2020 at 9:52 AM Joris Van den Bossche wrote: > > I am personally fine with removing the compute dunder methods again (i.e. > Array.__richcmp__), if that resolves the ambiguity. Although they *are* > convenient IMO, even for developers (question might also come up if we want > to add

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Joris Van den Bossche
I am personally fine with removing the compute dunder methods again (i.e. Array.__richcmp__), if that resolves the ambiguity. Although they *are* convenient IMO, even for developers (question might also come up if we want to add __add__, __sub__ etc, though). So it could also be an option to say th

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Wes McKinney
I think we need to have a hard separation between "data structure equality" (do these objects contain equivalent data) and "analytical/semantic equality". The latter is more the domain of pyarrow.compute and I am not sure we should be overloading dunder methods with compute functions. I might recom
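To make the proposed separation concrete, here is a minimal pure-Python sketch. It does not use PyArrow: the names IntArray, equals and equal are hypothetical, and the null handling (None for a null slot, null in means null out) is an assumption chosen to mirror the behaviour discussed in this thread.

```python
from typing import List, Optional


class IntArray:
    """Hypothetical nullable integer array; None marks a null slot."""

    def __init__(self, values: List[Optional[int]]):
        self.values = list(values)

    def equals(self, other: "IntArray") -> bool:
        # "Data structure equality": same length, same values, nulls in
        # the same slots. The answer is always a plain bool.
        return isinstance(other, IntArray) and self.values == other.values


def equal(left: IntArray, right: IntArray) -> List[Optional[bool]]:
    # "Analytical/semantic equality": element-wise comparison in which a
    # null on either side produces a null (None) in the result.
    if len(left.values) != len(right.values):
        raise ValueError("arrays must have the same length")
    return [
        None if a is None or b is None else a == b
        for a, b in zip(left.values, right.values)
    ]


a = IntArray([1, None, 3])
b = IntArray([1, None, 3])
print(a.equals(b))  # True
print(equal(a, b))  # [True, None, True]
```

With this split, the first question always has a yes/no answer, while null-propagating semantics live in a separately named function rather than in a dunder method.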

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Maarten Breddels
I think that if __eq__ does not return True/False exclusively, __bool__ should raise an exception to avoid these unexpected truthy results. Python users are used to that due to NumPy. On Wed, 1 Jul 2020 at 15:40, Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > On Wed, 1 Jul 2020 at 09:4
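A rough sketch of that suggestion, in plain Python rather than PyArrow: Column and ElementwiseResult are hypothetical names, and the error message is made up. The idea is only that an __eq__ returning something other than True/False should hand back an object that refuses to be used as a truth value, in the way NumPy arrays with more than one element do.

```python
class ElementwiseResult:
    """Hypothetical result of an element-wise comparison."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Refuse implicit truthiness so `if a == b:` fails loudly
        # instead of silently taking a branch.
        raise ValueError(
            "the truth value of an element-wise comparison is ambiguous"
        )


class Column:
    """Hypothetical nullable column; None marks a null slot."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        if not isinstance(other, Column):
            return NotImplemented
        # Null in, null out: a comparison involving None yields None.
        return ElementwiseResult(
            None if a is None or b is None else a == b
            for a, b in zip(self.values, other.values)
        )


result = Column([1, None]) == Column([1, None])
print(result.values)  # [True, None]

try:
    if result:  # calls ElementwiseResult.__bool__
        print("equal")
except ValueError as exc:
    print("refused:", exc)
```

The comparison still returns per-element results, but any attempt to treat the result as a single boolean raises instead of evaluating as truthy.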

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Joris Van den Bossche
On Wed, 1 Jul 2020 at 09:46, Antoine Pitrou wrote: > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > null values should return null. Small note that it is not a *very* recent change ( https://github.com/apache/arrow/pull/5330, ARROW-6488

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Krisztián Szűcs
On Wed, Jul 1, 2020 at 9:46 AM Antoine Pitrou wrote: > > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > null values should return null. This is actually how the previous versions work: https://github.com/apache/arrow/blob/master/python/pyarrow/scalar.pxi#L51

RE: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Mehul Batra
- From: Antoine Pitrou Sent: Wednesday, July 1, 2020 1:19 PM To: dev@arrow.apache.org Subject: Re: [Discuss] Extremely dubious Python equality semantics

Re: [Discuss] Extremely dubious Python equality semantics

2020-07-01 Thread Antoine Pitrou
Note: if the snippet below doesn't display correctly in your e-mail reader, you can read it here: https://gist.github.com/pitrou/6a0ce89ce866bc0c70e33155503d1c47 On 01/07/2020 at 09:46, Antoine Pitrou wrote: > > Hello, > > Recent changes to PyArrow seem to have taken the stance that comparing > n