Re: [PHP-DEV] [RFC] Lazy Objects

Nicolas Grekas Mon, 15 Jul 2024 01:25:41 -0700

Le jeu. 11 juil. 2024 à 20:31, Tim Düsterhus <t...@bastelstu.be> a écrit :


> Hi
>
> On 7/11/24 10:32, Nicolas Grekas wrote:
> >> Many things are already possible in userland. That does not always mean
> >> that the cost-benefit ratio is appropriate for inclusion in core. I get
> >> behind the two examples in the “About Lazy-Loading Strategies” section,
> >> but I'm afraid I still can't wrap my head why I would want an object
> >> that makes itself lazy in its own constructor: I have not yet seen a
> >> real-world example.
> >>
> >
> > Keeping this capability for userland is not an option for me as it would
> > mostly defeat my goal, which is to get rid of any userland code on this
> > topic (and is achieved by the RFC).
> >
> > Here is a real-world example:
> >
> https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php
> >
> > This class currently uses a poor-man's implementation of lazy objects and
> > would greatly benefit from resetAsLazyGhost().
> >
>
> Sorry, I was probably a little unclear with my question. I was not
> specifically asking if anyone did that, because I am fairly sure that
> everything possible has been done before.
>
> I was interested in learning why I would want to promote a
> "LazyServiceEntityRepository" instead of the user of my library just
> making the "ServiceEntityRepository" lazy themselves.
>
> I understand that historically making the "ServiceEntityRepository" lazy
> yourself would have been very complicated, but the new RFC makes this
> super easy.
>
> So based on my understanding the "LazyServiceEntityRepository"
> (c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
> necessary tools to do it yourself, no? That would also match your goal
> of getting rid of userland code on this topic.
>

That class is not a candidate for deprecation even after this RFC. But
thanks to this RFC, PHP 8.4 would allow removing all the code related to
the magic methods in the current implementation.



> To me this is what the language evolution should do: Enable users to do
> things that previously needed to be provided by userland libraries,
> because they were complicated and fragile, not enabling userland
> libraries to simplify things that they should not need to provide in the
> first place because the language already provides it.
>

That's exactly it: instead of using a third party lib (or in this case
implementing a poor man's subset of a correct lazy object implementation),
the engine would enable using a native feature to achieve a fully correct
behavior in a very simple way. In this case, LazyServiceEntityRepository
would directly use ReflectionClass::resetAsLazyGhost to make itself lazy,
so that its consumers get a packaged behavior and don't need to care about
the topic while consuming the class.



> >> I have one question regarding the updated initialization sequence. The
> >> RFC writes:
> >>
> >>> Properties that are declared on the real instance are uninitialized on
> >> the proxy instance (including overlapping properties used with
> >> ReflectionProperty::skipLazyInitialization() or
> >> setRawValueWithoutLazyInitialization()) to synchronize the state shared
> by
> >> both instances.
> >>
> >> I do not understand this. Specifically I do not understand the "to
> >> synchronize the state" bit.
> >
> >
> > We reworded this sentence a bit. Clearer?
>
> Yes, I think it is clearer. Let me try to rephrase this differently to
> see if my understanding is correct:
>
> ---
>
> For every property on that exists on the real instance, the property on
> the proxy instance effectively [1] is replaced by a property hook like
> the following:
>
>      public PropertyType $propertyName {
>          get {
>              return $this->realInstance->propertyName;
>          }
>          set(PropertyType $value) {
>              $this->realInstance->propertyName = $value;
>          }
>      }
>
> And value that is stored in the property will be freed (including
> calling the destructor if it was the last reference), as if `unset()`
> was called on the property.
>
> [1] No actual property hook will be created and the `realInstance`
> property does not actually exist, but the semantics behave as if such a
> hook would be applied.
>

Conceptually, you've got it right yes!



> >> My understanding is that the proxy will
> >> always forward the property access, so there effectively is no state on
> >> the proxy?!
> >
> >
> > It follows that more properties can exist on the proxy itself (declared
> by
> > child classes of the real object that the proxy implements).
> >
>
> Right, that's mentioned in (2), so all clear.
>
> >
> >>> That is very true. I had a look at the userland implementation and
> >> indeed,
> >>> we keep the wrapper while cloning the backing instance (it's not that
> we
> >>> have the choice, the engine doesn't give us any other options).
> >>> RFC updated.
> >>>
> >>> We also updated the behavior when an uninitialized proxy is cloned: we
> >> now
> >>> postpone calling $real->__clone to the moment where the proxy clone is
> >>> initialized.
> >>
> >> Do I understand it correctly that the initializer of the cloned proxy is
> >> effectively replaced by the following:
> >>
> >>       function (object $clonedProxy) use ($originalProxy) {
> >>           return clone $originalProxy->getRealObject();
> >>       }
> >>
> >
> > Nope, that's not what we describe in the RFC so I hope you can read it
> > again and get where you were confused and tell us if we're not clear
> enough
> > (to me we are :) )
>
> The "cloning of the real instance" bit is what lead me to this
> understanding.
>
> > The $originalProxy is *not* shared with $clonedProxy. Instead, it's
> > *initializers* that are shared between clones.
> > And then, when we call that shared initializer in the $clonedProxy, we
> > clone the returned instance, so that even if the initializer returns a
> > shared instance, we don't share anything with the $originalProxy.
> >
>
> Ah, so you mean if the initializer would look like this instead of
> creating a fresh object within the initializer?
>
>       $predefinedObject = new SomeObj();
>       $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
>           return $predefinedObject;
>       });
>       $clonedProxy = clone $myProxy;
>       $r->initialize($myProxy);
>       $r->initialize($clonedProxy);
>
> It didn't even occur to me that one would be able to return a
> pre-existing object: I assume that simply reusing the initializer would
> create a separate object and that would be sufficient to ensure that the
> cloned instance would be independent.
>

E.g. Doctrine's Entity Manager returns always the same instance.



> >> ? Then I believe this is unsound. Consider the following:
> >>
> >>       $myProxy = $r->newLazyProxy(...);
> >>       $clonedProxy = clone $myProxy;
> >>       $r->initialize($myProxy);
> >>       $myProxy->someProp++;
> >>       var_dump($clonedProxy->someProp);
> >>
> >> The clone was created before `someProp` was modified, but it outputs the
> >> value after modification!
> >>
> >> Also: What happens if the cloned proxy is initialized *before* the
> >> original proxy? There is no real object to clone.
> >>
> >> I believe the correct behavior would be: Just clone the proxy and keep
> >> the same initializer. Then both proxies are actually fully independent
> >> after cloning, as I would expect from the clone operation.
> >>
> >
> > That's basically what we do and what we describe in the RFC, just with
> the
> > added lazy-clone operation on the instance returned by the initializer.
> >
>
> This means that if I would return a completely new object within the
> initializer then for a cloned proxy the new object would immediately be
> cloned and the original object be destructed, yes?
>

That's correct. This preserves the lifecycle of cloned objects: __clone()
has to be called when creating clones if the method exists.



> Frankly, thinking about this cloning behavior gives me a headache,
> because it quickly leads to very weird semantics. Consider the following
> example:
>
>       $predefinedObject = new SomeObj();
>       $initializer = function () use ($predefinedObject) {
>           return $predefinedObject;
>       };
>       $myProxy = $r->newLazyProxy($initializer);
>       $otherProxy = $r->newLazyProxy($initializer);
>       $clonedProxy = clone $myProxy;
>       $r->initialize($myProxy);
>       $r->initialize($otherProxy);
>       $r->initialize($clonedProxy);
>
> To my understanding both $myProxy and $otherProxy would share the
> $predefinedObject as the real instance and $clonedProxy would have a
> clone of the $predefinedObject at the time of the initialization as its
> real instance?
>

Correct. The sharing is not specifically related to cloning. But when
cloning happens, the expected behavior is well defined: we should have
separate states.



> To me this sounds like cloning an uninitialized proxy would need to
> trigger an initialization to result in semantics that do not violate the
> principle of least astonishment.
>

Forcing an initialization when cloning would be unexpected. E.g. in
Doctrine, when you clone an uninitialized entity, you don't trigger a
database roundtrip. Instead, you create a new object that still references
the original state internally, but under a different object identity. This
cloning behavior is the one we've had for more than 10 years and I think
it's also the least astonishing one - at least if we consider this example
as a real world trial of this principle.



> I would assume that cloning a proxy is something that rarely happens,
> because my understanding is that proxies are most useful for service
> objects, whereas ghost objects would be used for entities / value
> objects, so this should not be too much of a problem.
>

I share this opinion. Then, since the difference between ghost vs proxy
should be internal from a consumer PoV, defining an observable behavior for
ghost objects also defines the behavior for proxy ones.



> >> 2.
> >>
> >>   > Properties are not initialized to their default value yet (they are
> >> initialized before calling the initializer).
> >>
> >> I see that you removed the bit about this being not observable. What is
> >> the reason that you removed that? One possible reason that comes to my
> >> mind is a default value that refers to a non-existing constant. It would
> >> be observable because the initialization emits an error. Are there any
> >> other reasons?
> >>
> >
> > That's because this is observable using e.g. (array) or var_dump.
> >
>
> I see. Perhaps add a short sentence with the reasoning. Something like:
>
> Properties are not initialized to their default value yet (they are
> initialized before calling the initializer). As an example, this has an
> impact on the behavior of an (array) cast on uninitialized objects and
> also when the default value is based on a constant that is not yet
> defined when creating the lazy object, but will be defined at the point
> of initialization.
>

Thanks for the proposal. I've added it to the RFC.

Cheers,
Nicolas

Re: [PHP-DEV] [RFC] Lazy Objects

Reply via email to