Eric V. Smith <e...@trueblade.com> added the comment: I apologize for the length of this, but I want to be as precise as possible. I've no doubt made some mistakes, so corrections and discussion are welcomed.
I'm adding the commented text at the end of this message to dataclasses.py. I'm repeating it here for discussion. These tables are slightly different from previous versions on this issue, and I've added line numbers to the __hash__ table to make it easier to discuss. So any further comments and discussion should reference the tables in this message.

I think the __init__, __repr__, __set/delattr__, __eq__, and ordering tables are not controversial.

** hash

For __hash__, the 99.9% use case is that the default value of hash=None is sufficient. This is lines 1-4 of that table. It's unfortunate that I have to spend so much of the following text describing cases that I think will rarely be used, but I think it's important that in these special cases we don't do anything surprising.

**** hash=None

First, let's discuss hash=None, which is the default. This is lines 1-4 of the __hash__ table.

Here's an example of line 1:

    @dataclass(hash=None, eq=False, frozen=False)
    class A:
        i: int

The user doesn't want an __eq__, and the class is not frozen. Whether or not the user supplied a __hash__, no hash will be added by @dataclass. In the absence of __hash__, the base class (in this case, object) __hash__ is used. object.__hash__ and object.__eq__ are based on object identity.

Here's an example of line 2:

    @dataclass(hash=None, eq=False, frozen=True)
    class A:
        i: int

This is a frozen class, where the user doesn't want an __eq__. The same logic is used as for line 1: no __hash__ is added.

Here's an example of line 3, "no" column (no __hash__). Note that this line shows the default values for hash=, eq=, frozen=:

    @dataclass(hash=None, eq=True, frozen=False)
    class A:
        i: int

In this case, if the user doesn't provide __hash__ (the "no" column), we'll set __hash__=None. That's because a non-frozen class with an __eq__ should not be hashable.
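As a rough check of the "no" column for lines 1-3, here is a sketch that runs against a released dataclasses module (Python 3.7 or later is assumed). It deliberately does not pass the hash= argument discussed in this message, since a released decorator may not accept it; it relies only on eq= and frozen=. The class names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(eq=False)                 # like line 1: no __eq__, not frozen
class NoEq:
    i: int


@dataclass(eq=False, frozen=True)    # like line 2: no __eq__, frozen
class FrozenNoEq:
    i: int


@dataclass                           # like line 3: eq=True, frozen=False
class Default:
    i: int


# Lines 1 and 2: nothing is added, so object's identity-based
# __hash__ is inherited.
assert "__hash__" not in NoEq.__dict__
assert NoEq.__hash__ is object.__hash__
assert FrozenNoEq.__hash__ is object.__hash__

# Line 3, "no" column: __hash__ is set to None, so instances
# cannot be hashed.
assert Default.__dict__["__hash__"] is None
try:
    hash(Default(1))
except TypeError:
    pass
else:
    raise AssertionError("expected Default instances to be unhashable")
```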
Here's an example of line 3, "yes" column (__hash__ defined):

    @dataclass(hash=None, eq=True, frozen=False)
    class A:
        i: int
        def __hash__(self): pass

Since a __hash__ exists, it will not be overwritten. Note that this is also true if __hash__ were set to None. We're just checking that __hash__ exists, not its value.

Here's an example of line 4, "no" column:

    @dataclass(hash=None, eq=True, frozen=True)
    class A:
        i: int

In this case the action is "add", and a __hash__ method will be created.

And finally, consider line 4, "yes" column. This case is hash=None, eq=True, frozen=True. Here we also apply the auto-hash test (defined under hash=True below), just like we do in the hash=True case (line 12, "yes" column). We want to make the frozen instances hashable, so we add our __hash__ method, unless the user has directly implemented __hash__.

**** hash=False

Next, consider hash=False (lines 5-8). In this case, @dataclass will never add a __hash__ method, and if one exists it is not modified.

**** hash=True

Lastly, consider hash=True. This is lines 9-12 of the __hash__ table. The user is saying they always want a __hash__ method on the class. If __hash__ does not already exist (the "no" column), then @dataclass will add a __hash__ method. If __hash__ does already exist in the class's __dict__, @dataclass will overwrite it if and only if the current value of __hash__ is None and the class has an __eq__ in the class definition. Let's call this condition the "auto-hash test". The assumption is that the only reason a class has a __hash__ of None and an __eq__ method is that the __hash__ was automatically added by Python's class creation machinery. And if the hash was auto-added, then by specifying hash=True the user is saying they want the __hash__ = None overridden by a generated __hash__ method.

I'll give examples from line 9 of the __hash__ table, but this behavior is the same for lines 9-12.
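Before the examples, the auto-hash test itself can be sketched in a few lines. It leans on a Python behavior: a class body that defines __eq__ without defining __hash__ ends up with __hash__ = None in its __dict__. Note that passes_auto_hash_test is a hypothetical helper written for this discussion, not a function in dataclasses.py, and it has to be applied to the class as defined, before @dataclass adds any __eq__ of its own.

```python
def passes_auto_hash_test(cls):
    """Hypothetical helper: True if __hash__ looks like it was auto-set to None.

    Only the class's own __dict__ is consulted; base classes are ignored.
    """
    d = cls.__dict__
    return "__hash__" in d and d["__hash__"] is None and "__eq__" in d


class WithEq:
    def __eq__(self, other):
        return NotImplemented


class Plain:
    pass


# Python's class creation machinery added __hash__ = None to WithEq.
assert WithEq.__dict__["__hash__"] is None
assert passes_auto_hash_test(WithEq)

# No __eq__ in the body, so nothing was added and the test fails.
assert "__hash__" not in Plain.__dict__
assert not passes_auto_hash_test(Plain)
```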
Consider:

    @dataclass(hash=True, eq=False, frozen=False)
    class A:
        i: int
        __hash__ = None

This is line 9, "yes" column, action="add*", which says to add a __hash__ method if the class passes the auto-hash test. In this case the class fails the test because it doesn't have an __eq__ in the class definition. __hash__ will not be overwritten.

Now consider:

    @dataclass(hash=True, eq=False, frozen=False)
    class A:
        i: int
        def __eq__(self, other): ...

This again is line 9, "yes" column, action="add*". In this case, the user is saying to add a __hash__, but it already exists because Python automatically added __hash__=None when it created the class. So, the class passes the auto-hash test. So even though there is a __hash__, we'll overwrite it with a generated method.

Now consider:

    @dataclass(hash=True, eq=False, frozen=False)
    class A:
        i: int
        def __eq__(self, other): ...
        def __hash__(self): ...

Again, this is line 9, "yes" column, action="add*". The existing __hash__ is not None, so we don't overwrite the user's __hash__ method.

Note that a class can pass the auto-hash test but not have an auto-generated __hash__=None. There's no way for @dataclass to actually know that __hash__=None was auto-generated; it just assumes that that's the case. For example, this class passes the auto-hash test and __hash__ will be overwritten:

    @dataclass(hash=True, eq=False, frozen=False)
    class A:
        i: int
        def __eq__(self, other): ...
        __hash__=None

A special case to consider is lines 11 and 12 from the table. Here's an example of line 11, but line 12 is the same for the purposes of this discussion:

    @dataclass(hash=True, eq=True, frozen=False)
    class A:
        i: int
        __hash__=None

The class will have a generated __eq__, because eq=True. However, the class still fails the auto-hash test, because the class's __dict__ did not have an __eq__ that was added by the class definition. Instead, it was added by @dataclass. So this class fails the auto-hash test and the __hash__ value will not be overwritten.
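For discussion purposes, the __hash__ table at the end of this message can also be read as a plain lookup. The sketch below is only a transcription of that table into a dict; HASH_TABLE and hash_action are hypothetical names, and this is not how dataclasses.py is claimed to implement it. An empty string stands for a blank box, and "add*" still has to be resolved by the auto-hash test.

```python
# Each key is (hash=, eq=, frozen=). Each value is the pair
# (action when the class has no __hash__, action when it does).
BLANK = ("", "")
ADD = ("add", "add*")

HASH_TABLE = {
    (None, False, False): BLANK,         # line 1
    (None, False, True): BLANK,          # line 2
    (None, True, False): ("None", ""),   # line 3, the default
    (None, True, True): ADD,             # line 4
    (False, False, False): BLANK,        # line 5
    (False, False, True): BLANK,         # line 6
    (False, True, False): BLANK,         # line 7
    (False, True, True): BLANK,          # line 8
    (True, False, False): ADD,           # line 9
    (True, False, True): ADD,            # line 10
    (True, True, False): ADD,            # line 11
    (True, True, True): ADD,             # line 12
}


def hash_action(hash_, eq, frozen, has_hash):
    """Return the table entry for the given decorator arguments."""
    no_action, yes_action = HASH_TABLE[(hash_, eq, frozen)]
    return yes_action if has_hash else no_action


assert hash_action(None, True, False, has_hash=False) == "None"
assert hash_action(None, True, True, has_hash=False) == "add"
assert hash_action(False, True, True, has_hash=True) == ""
assert hash_action(True, False, False, has_hash=True) == "add*"
```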
Tables follow:

# Conditions for adding methods.  The boxes indicate what action the
# dataclass decorator takes.  For all of these tables, when I talk
# about init=, repr=, eq=, order=, hash=, or frozen=, I'm referring
# to the arguments to the @dataclass decorator.  When checking if a
# dunder method already exists, I mean check for an entry in the
# class's __dict__.  I never check to see if an attribute is defined
# in a base class.

# Key:
# +=========+=========================================+
# | Value   | Meaning                                 |
# +=========+=========================================+
# | <blank> | No action: no method is added.          |
# +---------+-----------------------------------------+
# | add     | Generated method is added.              |
# +---------+-----------------------------------------+
# | add*    | Generated method is added only if the   |
# |         | existing attribute is None and if the   |
# |         | user supplied a __eq__ method in the    |
# |         | class definition.                       |
# +---------+-----------------------------------------+
# | raise   | TypeError is raised.                    |
# +---------+-----------------------------------------+
# | None    | Attribute is set to None.               |
# +=========+=========================================+

# __init__
#
#   +--- init= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has __init__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+

# __repr__
#
#   +--- repr= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has __repr__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+

# __setattr__
# __delattr__
#
#   +--- frozen= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has __setattr__ or __delattr__ in __dict__?
# +=======+=======+=======+
# | False |       |       |  <- the default
# +-------+-------+-------+
# | True  | add   | raise |
# +=======+=======+=======+
# Raise because not adding these methods would break the "frozen-ness"
# of the class.

# __eq__
#
#   +--- eq= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has __eq__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+

# __lt__
# __le__
# __gt__
# __ge__
#
#   +--- order= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has any comparison method in __dict__?
# +=======+=======+=======+
# | False |       |       |  <- the default
# +-------+-------+-------+
# | True  | add   | raise |
# +=======+=======+=======+
# Raise because to allow this case would interfere with using
# functools.total_ordering.

# __hash__

#    +------------------- hash= parameter
#    |       +----------- eq= parameter
#    |       |       +--- frozen= parameter
#    |       |       |
#    v       v       v      |        |        |
#                           |   no   |  yes   |  <--- class has __hash__ in __dict__?
# +=========+=======+=======+========+========+
# | 1 None  | False | False |        |        | No __eq__, use the base class __hash__
# +---------+-------+-------+--------+--------+
# | 2 None  | False | True  |        |        | No __eq__, use the base class __hash__
# +---------+-------+-------+--------+--------+
# | 3 None  | True  | False | None   |        | <-- the default, not hashable
# +---------+-------+-------+--------+--------+
# | 4 None  | True  | True  | add    | add*   | Frozen, so hashable
# +---------+-------+-------+--------+--------+
# | 5 False | False | False |        |        |
# +---------+-------+-------+--------+--------+
# | 6 False | False | True  |        |        |
# +---------+-------+-------+--------+--------+
# | 7 False | True  | False |        |        |
# +---------+-------+-------+--------+--------+
# | 8 False | True  | True  |        |        |
# +---------+-------+-------+--------+--------+
# | 9 True  | False | False | add    | add*   | Has no __eq__, but hashable
# +---------+-------+-------+--------+--------+
# |10 True  | False | True  | add    | add*   | Has no __eq__, but hashable
# +---------+-------+-------+--------+--------+
# |11 True  | True  | False | add    | add*   | Not frozen, but hashable
# +---------+-------+-------+--------+--------+
# |12 True  | True  | True  | add    | add*   | Frozen, so hashable
# +=========+=======+=======+========+========+
# For boxes that are blank, __hash__ is untouched and therefore
# inherited from the base class.  If the base is object, then
# id-based hashing is used.
# Note that a class may already have __hash__=None if it specified an
# __eq__ method in the class body (not one that was created by
# @dataclass).

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32513>
_______________________________________