Note: if the snippet below doesn't display correctly in your e-mail reader, you can read it here: https://gist.github.com/pitrou/6a0ce89ce866bc0c70e33155503d1c47
On 01/07/2020 at 09:46, Antoine Pitrou wrote:
>
> Hello,
>
> Recent changes to PyArrow seem to have taken the stance that comparing
> null values should return null.  The problem is that it breaks the
> expectation that comparisons should return booleans, and percolates into
> crazy behaviour in other places.  Here is an example of such
> misbehaviour in the scalar refactor PR:
>
> >>> import pyarrow as pa
> >>> na = pa.scalar(None)
> >>> na == na
> <pyarrow.NullScalar: None>
> >>> na == 5
> <pyarrow.NullScalar: None>
> >>> bool(na == 5)
> True
> >>> if na == 5: print("yo!")
> yo!
> >>> na in [5]
> True
>
> But you can see it also with arrays containing null values:
>
> >>> pa.array([1, None]) in [pa.scalar(42)]
> True
>
> I think that Python equality operators should behave in a
> Python-sensible way (return True or False).  Have people call another
> method if they like the fancy (or noxious, depending on the POV)
> semantics of returning null when comparing null with anything.
>
> (note that NumPy doesn't have null scalars, so it can be less
> conservative in its customization of equality methods)
>
> Regards
>
> Antoine.
>
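
For anyone without a PyArrow build at hand, the mechanism behind `bool(na == 5)` and `na in [5]` can be reproduced in plain Python. This is only a minimal sketch: `NullLike` is a hypothetical stand-in class, not PyArrow's actual implementation. The one assumption is that `__eq__` returns a non-boolean object that defines neither `__bool__` nor `__len__`, which Python then treats as truthy.

```python
class NullLike:
    """Hypothetical stand-in for a null scalar (not PyArrow's real class)."""

    def __eq__(self, other):
        # Return a "null" object instead of True or False.
        return NullLike()

    # Defining __eq__ sets __hash__ to None; restore identity hashing.
    __hash__ = object.__hash__

    def __repr__(self):
        return "<NullLike>"


na = NullLike()

# The comparison yields another NullLike, not a bool.
result = na == 5

# Neither __bool__ nor __len__ is defined, so the instance is truthy.
as_bool = bool(result)

# Membership testing checks identity, then the truth value of ==.
contained = na in [5]

print(result, as_bool, contained)
```

Any `if`, `in`, `list.index()` or similar construct that relies on the truth value of `==` is affected in the same way, which is why a non-boolean return value spreads into unrelated code.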