Hey Nicolas, Arnaud,

On Tue, 4 Jun 2024 at 14:29, Nicolas Grekas <nicolas.grekas+...@gmail.com>
wrote:

> Please find all the details here:
> https://wiki.php.net/rfc/lazy-objects
>

First of all, let me say that this is a fantastic RFC: having maintained
both mine and Doctrine's version of lazy proxies for many years, this is
indeed a push in the right direction, making laziness an engine detail,
rather than a complex userland topic with hacks.

Moving this forward will allow us (in a far future) to get rid of some BC
boundaries around the weird `unset()` semantics of properties, which were
indeed problematic with typed properties and `readonly` properties.

Stellar work: well done!

## TL;DR: of my feedback:

 * RFC is good / useful / needed: will vote for it
 * ghost proxies are well designed / fantastic feature
 * lazy proxies should probably explore replacing object identity further
 * don't like expanding `ReflectionClass` further: `LazyGhost` and
`LazyProxy` classes (or such) instead
 * `initialize()` shouldn't have a boolean behavioral switch parameter:
make 2 methods
 * flags should be a `list<SomeEnumAroundProxies>` instead. A bitmask for a
new API feels unsafe and anachronistic, given the tiny performance hit.
 * don't touch `readonly` because of lazy objects: this feature is too
niche to cripple a major-major feature like `readonly`. I would suggest
deferring until after the first bits of this RFC landed.

That said, I took some notes that I'd like you to both consider / answer.

## Raw feedback

I did skim through the thread, but did not read it all, so please excuse me
if some feedback is potentially duplicate.
Notes are succinct/to the point: I abuse bullet points heavily, sorry :-)

> From an abstraction point of view, lazy objects from this RFC are
indistinguishable from non-lazy ones

* do the following aspects always apply? I understand they don't for lazy
proxies, just for ghosts?
   * `spl_object_id($object) === spl_object_id($proxy)`?
   * `get_class($object) === get_class($proxy)`?

> Execution of methods or property hooks does not trigger initialization
until one of them accesses a backed property.

* excellent! The entire design revolving around object state makes it so
much easier to reason about the entire behavior too!

> Proxies: The initializer returns a new instance, and interactions with
the proxy object are forwarded to this instance

* I am not sure why this is needed, given ghost objects.
   * I understand that you want this for when instantiation is delegated to
a third party (in Symfony's DIC, a factory), but I feel like the ghost
design is so much more fool-proof than the proxy approach.
   * Perhaps worth taking another stab at sharing object identity, before
implementing these?
* another note on naming: I used "value holder" inside ProxyManager,
because using just "proxy" as a name led to a lot of confusion. This also
applies to me trying to distinguish proxy types inside the RFC and this
discussion.

> Internal objects are not supported because their state is usually not
managed via regular properties.

* sad, but understandable limitation
* what happens if a class `extends` an internal engine class, like the
gruesome `ArrayObject`?

The API uses various flags given as options

```php
    public int const SKIP_INITIALIZATION_ON_SERIALIZE = 1;
    public int const SKIP_DESTRUCTOR = 2;
    public int const SKIP_INITIALIZED_READONLY = 4;
    public int const RESET_INITIALIZED_READONLY = 8;
```

* IMO, these should be `enum` types, given in as a `list<TOption>` to the
call site
    * bitmasks are really only relevant in serialization / storage
contexts, IMO, for compressing space as much as possible
    * the bitmasks in the reflection API are already very confusing and
hard to use, and I say that as someone that wrapped the entire reflection
API various times, out of despair

```php
public function newLazyGhost(callable $initializer, int $options = 0):
object {}
public function newLazyProxy(callable $factory, int $options = 0): object {}
```

* Given the recent improvements around closures and the `...` syntax (
https://wiki.php.net/rfc/first_class_callable_syntax), is it worth having
`Closure` only as argument type?
* should we declare generic types in the RFC/docs, even if just in the
stubs?
    * they would serve as massive documentation improvement for Psalm and
PHPStan
    * it would be helpful to document `$initializer` and `$factory` as
`:void` or `:object` functions
        * can the engine check that, perhaps? I have no idea if `Closure`
can provide such information on return types, inside the engine.

```php
public function initialize(object $object, bool $skipInitializer = false):
object {}
```

* worth dividing this into
    * `initialize()`
    * `markAsAlreadyInitialized()`
    * don't use a boolean flag for two functions that do different things

* all methods were added to `ReflectionClass`
    * IMO worth having this as separate class/object where this is attached
        * if one needs to decorate/stub such API in userland, it is
therefore a completely separate decoration
            * `ReflectionClass` is already gigantic
            * a smaller API surface that only does proxies (perhaps
different based on proxy strategy) would be very beneficial
   * suggestion: something like `new GhostObject($className)` and `new
Proxy($className)`
      * I understand that the interactions with
`ReflectionClass#getProperty()` felt natural, but the use-case is narrow,
and `ReflectionClass` is already really humongous

> The `resetAsLazy*()` methods accept an already created instance.
> This allows writing classes that manage their own laziness

* overall understandable
    * a bit weird to support this, for now
    * useful for resettable interfaces: Symfony DIC could benefit from this
* what happens if `ReflectionClass#reset*()` methods are used on a
different class instance?
    * considered?

```php
$reflector->getProperty('id')->skipLazyInitialization($post);
```

* perfect for partial objects / understanding why this was implemented
    * would it make sense to have an API to set "bulk" values in an object
this way, instead of having to do this for each property manually?
        * avoids instantiating reflection properties / marking individual
properties manually
        * perhaps future scope?
           * thinking `(new
GhostObject($class))->initializePropertiesTo(['foo' => 'bar', 'baz' =>
'tab'])`

> Initialization Triggers

* really happy to see all these edge cases being considered here!
    * how much of this new API has been tried against the test suite of
(for example) `ocramius/proxy-manager`?
        * mostly asking because there's tons of edge cases noted in there

> Cloning, unless `__clone()` is implemented and accesses a property.

* how is the initializer of a cloned proxy used?
    * Is the initializer cloned too?
    * what about the object that a lazy proxy forwards state access to? Is
it cloned too?

> The following special cases do not trigger initialization of a lazy
object:

* Will accessing a property via a debugger (such as XDebug) trigger
initialization here?
    * asking because debugging proxy initialization often led to problems,
in the past
       * sometimes even IDEs crashing, or segfaults

* this wording is a bit confusing:

> Proxy Objects
> The actual instance is set to the return value.

* considering the following paragraph:

> The proxy object is _not_ replaced or substituted for the actual instance.


> After initialization, property accesses on the proxy are forwarded to the
actual instance.
> Observing properties of the proxy has the same result as observing
properties of the actual instance.

* This is some sort of "quantum locking" of both objects?
* How hard is it to break this linkage?
    * Can properties be `unset()`, for example?
    * what happens to dynamic properties?
        * I don't use them myself, and I discourage their usage, but it
would be OK to just document the expected behavior

> The proxy and actual instance have distinct identities.

* Given that we went great lengths to "quantum lock" two objects'
properties, wasn't it perhaps feasible to replace the
  proxy instance?
    * I have no idea if that would be possible/wished
    * would require merging `spl_object_id()` within the scope of the
initializer stack frame, with any outer ones
    * would solve any identity problems that still riddle the lazy proxy
design (which I think is incomplete, right now)

> The scope and `$this` of the initializer function is not changed

* good / thoughtful design
    * using `__construct()` or reflection properties suffices for most users

> If the initializer throws, the object properties are reverted to their
pre-initialization state and the object is
> marked as lazy again.

* this is some sort of "transactional" behavior
    * welcome API, but is it worth having this complexity?
        * is there a performance tradeoff?
        * is a copy of the original state kept during initializer calls?
        * OK with it myself, just probing design considerations
* the example uses `setRawValueWithoutLazyInitialization()`, and
initialization then accesses `public` properties
    * shouldn't a property that now **has** a value not trigger
initialization anymore?
        * or does that require
`ReflectionProperty#skipLazyInitialization()` calls, for that to work?

> `ReflectionClass::SKIP_INITIALIZATION_ON_SERIALIZE`: By default,
serializing a lazy object triggers its initialization
> This flag disables that behavior, allowing lazy objects to be serialized
as empty objects.

* how would one deserialize an empty object into a proxy again?
    * would this understanding be deferred to the (de-)serializer of choice?
    * exercise for userland?

> `ReflectionClass::newLazyProxy()`
> The factory should return a new object: the actual instance.

* what happens if the user mis-implements the factory as `function (object
$proxy): object { return $proxy; }`?
    * this is obviously a mistake on their end, but is it somehow
preventable?

> The `resetAsLazyGhost()` method resets an existing object and marks it as
lazy.
> The indented use-case is for an object to manage its own lazyness by
calling the method in its constructor.

* this certainly makes it easier to design "out of the box" lazy objects
    * perhaps more useful for tools like ORMs, (de-)serializers and DICs
though
    * using the proxy API internally in classes like DB connections feels a
bit overkill, to me

> `ReflectionClass::SKIP_INITIALIZED_READONLY`
> If this flag is set, these properties are skipped and no exception is
thrown.
> The behavior around readonly properties is explained in more details
later.
> `ReflectionClass::RESET_INITIALIZED_READONLY`

* while I can see this as useful, it effectively completely breaks the
`readonly` design
    * this is something I'd probably vote against: not worth breaking
`readonly` for just the `reset*()` API here
       * `reset*()` is already a niche API inside the (relatively) niche
use-case of laziness: I wouldn't bypass `readonly` for it
    * `readonly` provided developer value is bigger than lazy object value,
in my mind

> `ReflectionClass::resetAsLazyProxy()`
> The proxy and the actual instance are distinct objects, with distinct
identities.

* When creating a lazy proxy, all property accesses are forwarded to a new
instance
    * are all property accesses re-bound to the new instance?
    * are there any leftovers pointing to the old instance anywhere?
        * thinking dynamic properties and similar

> If `$skipInitializer` is true, the behavior is the one described for
Ghost Objects in the Initialization Sequence
> section, except that the initializer is not called.

* please make a separate method for this
    * it's not worth cramming a completely different behavior in the same
method
    * can document the methods completely independently, this way
    * can deprecate the new method separately, if a design flaw is found in
future

> ReflectionProperty::setRawValueWithoutLazyInitialization()
> The method does not call hooks, if any, when setting the property value.

* So far, it has been possible to `unset($object->property)` to force
`__get` and `__set` to be called
    * will `setRawValueWithoutLazyInitialization` skip also this "unset
properties" behavior that is possible in userland?
        * this is fine, just a documentation detail to note
        * if it is like that, is it worth renaming the method
`setValueWithoutCallingHooks` or such?
            * not important, just noting this opportunity

> Readonly properties

* while I appreciate the effort into digging in `readonly` properties, this
section feels extremely tricky
    * I would suggest not implementing (for now) either of
        * `SKIP_INITIALIZED_READONLY`
        * `RESET_INITIALIZED_READONLY`
    * I would suggest leaving some time for these, and re-evaluating after
the RFC lands / starts being used

> Destructors
> The destructor of proxy objects is never called. We rely on the
destructor of the proxied instance instead.

* raising an edge case here: `spl_object_*` and object identity checks may
be used inside a destructor
    * for example, a DB connection de-registering itself from a connection
pool somewhere, such as `$pool->deRegister($this)`
        * the connection pool may have the `spl_object_id()` of the proxy,
not the real instance
    * this is not a blocker, just an edge case that may require
documentation
        * it reconnects with "can we replace the object in-place?" question
above: replacing objects worth exploring

> It employs the ghost strategy by default unless the dependency is to be
instantiated
> and initialized by a user-provided factory

* one question arises here: can this RFC create proxies of **interfaces**
at all?
    * if not, does it throw appropriate exceptions?
    * the reason this question comes up is that, especially in the context
of DICs, factories are for interfaces
        * concrete classes are implementation details of factories, usually
           * very difficult to use lazy proxy and ghost object design with
services that decorate each other
    * this is semi-discussed in "About Proxies" below, around
"inheritance-proxy"
        * still worth mentioning "no interfaces" as a clear limitation,
perhaps?



Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

Reply via email to