[Python-Dev] Dataclasses, frozen and __post_init__
Hello,

I have been using the dataclasses package in a pet project of mine. I'm
sorry if this issue has already been raised.

I came across a situation where I wanted to use the __post_init__ function
to initialise some inherited fields from a dataclass with frozen=True. The
problem is that because it is frozen, assigning to the field doesn't work.

There are two workarounds that avoid changing the base class, which could
be in a library, to frozen=False:

1. Use object.__setattr__. This is ugly and not very user or beginner
   friendly.
2. Extract __post_init__ out into a factory function. This loses all the
   advantages of the __post_init__ and InitVar mechanism.

Both frozen and unfrozen dataclasses should be able to use the same
initialisation mechanism for consistency. Being consistent would ease
converting an unfrozen dataclass to a frozen one when the only code that
actually modifies the instance is in the __post_init__ function.

I think frozen classes should be able to be mutated during the
__post_init__ call. To implement this, a frozen dataclass could have a flag
that says it's not yet fully initialised, and the flag would be checked in
the frozen setattr/delattr methods. This flag could be stored as a special
attribute on the instance or in a weak reference dict.

Thanks
Ben Lewis
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Dataclasses, frozen and __post_init__
On Sat, Feb 17, 2018 at 6:40 PM, Guido van Rossum wrote:
>
> That's a pretty tricky proposal, and one that's been debated on and off
> for a long time in other contexts. And that flag would somehow have to be
> part of every instance's state.
>
> In general the right way to initialize an immutable/frozen object is not
> through __init__ but through __new__ -- have you tried that?
>
Constructing it through __new__ doesn't actually work, as it has no way to
alter the arguments that are passed into __init__. I think creating a
metaclass that overrides __call__ is required to achieve the desired
result, although a factory classmethod would achieve a similar API.
>
> Also, a small example that demonstrates your need would do wonders to help
> us understand your use case better.
>
from dataclasses import dataclass, field

# unrelated object
class NamedObject:
    @property
    def name(self) -> str:
        return "some name"

# has many subclasses
@dataclass
class Item:
    name: str

@dataclass
class NamedObjectItem(Item):
    name: str = field(init=False)
    obj: NamedObject

    def __post_init__(self):
        self.name = self.obj.name

This works fine, until I decided that Item, and therefore all its
subclasses, should be frozen, as no instances are mutated and if they ever
are in the future then it's a bug. But to do this the following factory
method needs to be added:

    @classmethod
    def create(cls, obj: NamedObject, *args, **kwargs):
        return cls(obj.name, obj, *args, **kwargs)

This doesn't look that bad, but all fields (up to the last field that would
have been used in __post_init__) need to be declared in the signature.

Thanks
Ben Lewis
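
A minimal sketch of the factory approach with both classes frozen, reusing
the names from the example above. The `price` field is hypothetical, added
only to show extra arguments passing through `create`.

```python
from dataclasses import dataclass, FrozenInstanceError


class NamedObject:
    @property
    def name(self) -> str:
        return "some name"


@dataclass(frozen=True)
class Item:
    name: str


@dataclass(frozen=True)
class NamedObjectItem(Item):
    obj: NamedObject
    price: int = 0  # hypothetical extra field, for illustration only

    @classmethod
    def create(cls, obj: NamedObject, *args, **kwargs):
        # Compute the inherited field before the instance exists, so
        # nothing has to be assigned once the object is frozen.
        return cls(obj.name, obj, *args, **kwargs)


item = NamedObjectItem.create(NamedObject(), price=3)
print(item.name, item.price)

# Any later assignment is rejected by the frozen __setattr__.
rejected = False
try:
    item.name = "other"
except FrozenInstanceError:
    rejected = True
print("assignment rejected:", rejected)
```

The cost is that `create` has to know the position of `name` in the
generated __init__ signature.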
Re: [Python-Dev] Dataclasses, frozen and __post_init__
>
> Why can't you make `name` on `NamedObjectItem` a property that returns
> `self.obj.name`? Why store a duplicate copy of the name?
>
Agreed, it's probably a better design not to store a duplicate reference to
name. But when I tried that, the property clashed with the inherited field.
This caused the creation of the dataclass to fail as it thought that the
property was the default value for the field 'name'. Even if I set a
default for the obj field, it crashed as it tried to set the default value
for name to the read-only property.
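
A sketch that reproduces both failures, assuming the subclass re-declares
the `name` annotation alongside the property (the class names
`WithoutDefault` and `WithDefault` are hypothetical):

```python
from dataclasses import dataclass


class NamedObject:
    @property
    def name(self) -> str:
        return "some name"


@dataclass
class Item:
    name: str


# Case 1: the property object is the class attribute `name`, so @dataclass
# takes it as the field's default. `obj` then follows a defaulted field,
# and creating the class fails.
creation_error = None
try:
    @dataclass
    class WithoutDefault(Item):
        name: str
        obj: NamedObject

        @property
        def name(self) -> str:
            return self.obj.name
except TypeError as exc:
    creation_error = exc


# Case 2: giving `obj` a default lets the class be created, but the
# generated __init__ still runs `self.name = name`, which hits the
# read-only property.
@dataclass
class WithDefault(Item):
    name: str
    obj: NamedObject = None

    @property
    def name(self) -> str:
        return self.obj.name


init_error = None
try:
    WithDefault(obj=NamedObject())
except AttributeError as exc:
    init_error = exc

print(type(creation_error).__name__, type(init_error).__name__)
```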
I can also think of situations where properties wouldn't be sufficient,
because you only want to calculate the value once per instance, on
creation. My thought is that most dataclasses would still be sensible and
useful even if all mutation ability was removed from them. Taking an
example directly from the PEP:
@dataclass
class C:
    i: int
    j: int = None
    database: InitVar[DatabaseType] = None

    def __post_init__(self, database):
        if self.j is None and database is not None:
            self.j = database.lookup('j')
Maybe I'm thinking of dataclasses wrong, but this still makes complete
sense and is useful even if it's declared as frozen.
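
For comparison, a sketch of how the PEP example can be declared frozen
today using the object.__setattr__ workaround. `Database` is a hypothetical
stand-in for the PEP's `DatabaseType`, and the looked-up value is made up.

```python
from dataclasses import dataclass, InitVar, FrozenInstanceError
from typing import Optional


class Database:
    """Hypothetical stand-in for the PEP's DatabaseType."""

    def lookup(self, key: str) -> int:
        return {"j": 42}[key]


@dataclass(frozen=True)
class C:
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[Database]] = None

    def __post_init__(self, database):
        if self.j is None and database is not None:
            # A plain `self.j = ...` would raise FrozenInstanceError here,
            # so the assignment bypasses the frozen __setattr__.
            object.__setattr__(self, "j", database.lookup("j"))


c = C(10, database=Database())
print(c.j)

# After construction the instance behaves as a normal frozen dataclass.
still_frozen = False
try:
    c.j = 0
except FrozenInstanceError:
    still_frozen = True
print("still frozen:", still_frozen)
```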
My thought is that initialisation logic and immutability are orthogonal to
each other. Possibly initialisation logic like this should occur before the
instance is created, so it would work for immutable types as well.

A possible idea could be that, instead of __post_init__, there is a
__pre_init__ which allows altering the fields before the instance is
created. It would take a dict as its first argument, containing the field
values passed into the 'constructor', with default values already filled
in.
@dataclass
class C:
    i: int
    j: int = None
    # Needs a default, as it follows a field that has one.
    database: InitVar[DatabaseType] = None

    @classmethod
    def __pre_init__(cls, fields: Dict[str, Any], database: DatabaseType):
        if fields['j'] is None and database is not None:
            fields['j'] = database.lookup('j')
I personally see two problems with this idea:
1. This isn't as ergonomic as __post_init__, as it's modifying a dictionary
instead of the instance.
2. To implement this, it would require a metaclass.
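
A rough sketch of how such a hook might be emulated outside the dataclasses
module with a metaclass that overrides __call__. `PreInitMeta` and
`Database` are hypothetical names, and `__pre_init__` is not something
dataclasses recognises; this only illustrates the idea.

```python
import dataclasses
import inspect
from dataclasses import dataclass, InitVar
from typing import Any, Dict, Optional


class PreInitMeta(type):
    """Hypothetical metaclass: runs cls.__pre_init__ before __init__."""

    def __call__(cls, *args, **kwargs):
        # Bind the call arguments to the generated __init__ signature so
        # positional arguments and defaults are resolved by name.
        sig = inspect.signature(cls.__init__)
        bound = sig.bind(None, *args, **kwargs)  # None stands in for self
        bound.apply_defaults()
        values = dict(bound.arguments)
        values.pop("self")

        # Real fields go into the dict; the remaining names are InitVars.
        names = {f.name for f in dataclasses.fields(cls)}
        field_values = {k: v for k, v in values.items() if k in names}
        init_vars = {k: v for k, v in values.items() if k not in names}

        hook = getattr(cls, "__pre_init__", None)
        if hook is not None:
            hook(field_values, **init_vars)

        # Create the instance with the (possibly altered) field values.
        return super().__call__(**field_values, **init_vars)


class Database:
    """Hypothetical stand-in for DatabaseType."""

    def lookup(self, key: str) -> int:
        return {"j": 42}[key]


@dataclass(frozen=True)
class C(metaclass=PreInitMeta):
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[Database]] = None

    @classmethod
    def __pre_init__(cls, fields: Dict[str, Any], database):
        if fields["j"] is None and database is not None:
            fields["j"] = database.lookup("j")


c = C(10, database=Database())
print(c)
```

Because the fields are altered before __init__ runs, the frozen instance is
never mutated after creation.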
