New submission from Nick Coghlan <ncogh...@gmail.com>:

The question of the way Python handles NaN came up again on python-dev 
recently. The current semantics have been assessed as a reasonable compromise, 
but a poorly explained and inconsistently implemented one.

Based on a suggestion from Terry Reedy [1] I propose that a new glossary entry 
be added for "Reflexive Equality":

"Part of the standard mathematical definition of equality is that it is 
reflexive, that is ``x is y`` necessarily implies that ``x == y``. This is an 
essential property that is relied upon when designing and implementing 
container classes such as ``list`` and ``dict``.

However, the IEEE754 committee defined the float Not_a_Number (NaN) values as 
being unequal to all other floats, including themselves. While this design 
choice violates the basic mathematical definition of equality, it is still 
considered desirable to be able to correctly implement IEEE754 floating point 
semantics, and those of similar types such as ``decimal.Decimal``, directly in 
Python.

Accordingly, Python makes the following compromise in order to cope with types 
that use non-reflexive definitions of equality without breaking the invariants 
of container classes that rely on reflexive definitions of equality:

1. Direct equality comparisons involving ``NaN``, such as ``nan=float('NaN'); 
nan == nan``, follow the IEEE754 rule and return False (or True in the case of 
``!=``). This rule applies to ``float`` and ``decimal.Decimal`` within the 
builtins and standard library.

2. Indirect comparisons conducted internally by container classes, such as ``x 
in someset`` or ``seq.count(x)`` or ``somedict[x]``, enforce reflexivity by 
using the expressions ``x is y or x == y`` and ``x is not y and x != y`` 
respectively rather than assuming that ``x == y`` and ``x != y`` will always 
respect the reflexivity requirement. This rule applies to all container types 
within the builtins and standard library that may contain values of arbitrary 
types.

Also see [1] for a more comprehensive theoretical discussion of this topic.

[1] http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/"
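
The two rules in the proposed entry can be observed with a short script. This 
is only an illustrative sketch: the helper name ``reflexive_eq`` and the 
variable names are made up for this example and are not part of any proposed 
API.

```python
from decimal import Decimal


def reflexive_eq(x, y):
    """Equality as rule 2 describes it: identity first, then ``==``.

    Hypothetical helper, named here for illustration only.
    """
    return x is y or x == y


nan = float("nan")
decimal_nan = Decimal("NaN")

# Rule 1: direct comparisons follow the IEEE754 rule.
direct_eq = nan == nan                      # False
direct_ne = nan != nan                      # True
decimal_eq = decimal_nan == decimal_nan     # False

# Rule 2: builtin containers check identity before equality, so the
# very same NaN object is still found.
in_list = nan in [nan]                      # True
in_set = nan in {nan}                       # True
list_count = [nan].count(nan)               # 1
dict_value = {nan: "found"}[nan]            # 'found'

# A *different* NaN object is neither identical nor equal to the stored one.
other_nan = float("nan")
other_in_list = other_nan in [nan]          # False

print(direct_eq, direct_ne, decimal_eq)
print(in_list, in_set, list_count, dict_value, other_in_list)
print(reflexive_eq(nan, nan), reflexive_eq(nan, other_nan))
```

Note that the identity check only helps when the same object is looked up 
again; two separately created NaN values are still treated as distinct.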

Specific container methods that have currently been identified as relying on 
the reflexivity assumption are:
- __contains__() (for x in c: assert x in c)
- __eq__() (assert [x] == [x])
- __ne__() (assert not [x] != [x])
- index() (for x in c: assert 0 <= c.index(x) < len(c))
- count() (for x in c: assert c.count(x) > 0)
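
The invariants listed above can be bundled into a small checker. This is a 
sketch under the assumption that the container is a sequence providing 
``index()`` and ``count()``; the function name ``failed_invariants`` is 
hypothetical and chosen only for this example.

```python
def failed_invariants(c):
    """Return the names of the reflexivity invariants that ``c`` violates.

    ``c`` is assumed to be a sequence supporting ``in``, ``index()``,
    ``count()`` and ``len()``. An empty result means every invariant held.
    """
    failures = set()
    for x in c:
        if not (x in c):
            failures.add("__contains__")
        if not ([x] == [x]):
            failures.add("__eq__")
        if [x] != [x]:
            failures.add("__ne__")
        try:
            if not (0 <= c.index(x) < len(c)):
                failures.add("index")
        except ValueError:
            # index() could not find an element that iteration just produced.
            failures.add("index")
        if not (c.count(x) > 0):
            failures.add("count")
    return sorted(failures)


nan = float("nan")
print(failed_invariants([1.0, nan, "spam"]))   # list enforces reflexivity
print(failed_invariants((nan, nan)))           # so does tuple
```

Because ``list`` and ``tuple`` check identity before equality, both calls 
report no failures even though ``nan != nan``.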

collections.Sequence and array.array (with the 'f' or 'd' type indicators) have 
already been identified as container classes in the standard library that fail 
to follow the second guideline and hence fail to correctly implement the above 
invariants in the presence of non-reflexive definitions of equality. They will 
be fixed as part of implementing this patch. Other container types that fail to 
correctly enforce reflexivity can be fixed as they are identified.
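
As a rough illustration of what following the second guideline means for a 
sequence class, the sketch below spells out ``x is y or x == y`` in each 
affected method. ``ReflexiveSequence`` is a hypothetical class written for this 
example; it is not the patch itself and makes no claim about how the standard 
library classes will actually be changed. It uses ``collections.abc.Sequence``, 
the current location of the ABC.

```python
from collections.abc import Sequence


class ReflexiveSequence(Sequence):
    """Hypothetical read-only sequence enforcing reflexive comparisons."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self):
        return len(self._items)

    def __contains__(self, value):
        # Identity is checked first, so a stored NaN is always found.
        return any(v is value or v == value for v in self)

    def index(self, value):
        for i, v in enumerate(self):
            if v is value or v == value:
                return i
        raise ValueError(f"{value!r} is not in sequence")

    def count(self, value):
        return sum(1 for v in self if v is value or v == value)


nan = float("nan")
seq = ReflexiveSequence([1.0, nan, 2.0])
print(nan in seq, seq.index(nan), seq.count(nan))
```

A lookup with a different NaN object (for example a fresh ``float("nan")``) 
still fails here, which matches the behaviour of ``list``.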

[1] http://mail.python.org/pipermail/python-dev/2011-April/110962.html

----------
assignee: docs@python
components: Documentation, Library (Lib)
messages: 134639
nosy: docs@python, ncoghlan
priority: normal
severity: normal
status: open
title: Adopt and document consistent semantics for handling NaN values in 
containers
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11945>
_______________________________________