Hey Nicolas, Arnaud, On Tue, 4 Jun 2024 at 14:29, Nicolas Grekas <nicolas.grekas+...@gmail.com> wrote:
> Please find all the details here: > https://wiki.php.net/rfc/lazy-objects > First of all, let me say that this is a fantastic RFC: having maintained both mine and Doctrine's version of lazy proxies for many years, this is indeed a push in the right direction, making laziness an engine detail, rather than a complex userland topic with hacks. Moving this forward will allow us (in a far future) to get rid of some BC boundaries around the weird `unset()` semantics of properties, which were indeed problematic with typed properties and `readonly` properties. Stellar work: well done! ## TL;DR: of my feedback: * RFC is good / useful / needed: will vote for it * ghost proxies are well designed / fantastic feature * lazy proxies should probably explore replacing object identity further * don't like expanding `ReflectionClass` further: `LazyGhost` and `LazyProxy` classes (or such) instead * `initialize()` shouldn't have a boolean behavioral switch parameter: make 2 methods * flags should be a `list<SomeEnumAroundProxies>` instead. A bitmask for a new API feels unsafe and anachronistic, given the tiny performance hit. * don't touch `readonly` because of lazy objects: this feature is too niche to cripple a major-major feature like `readonly`. I would suggest deferring until after the first bits of this RFC landed. That said, I took some notes that I'd like you to both consider / answer. ## Raw feedback I did skim through the thread, but did not read it all, so please excuse me if some feedback is potentially duplicate. Notes are succinct/to the point: I abuse bullet points heavily, sorry :-) > From an abstraction point of view, lazy objects from this RFC are indistinguishable from non-lazy ones * do the following aspects always apply? I understand they don't for lazy proxies, just for ghosts? * `spl_object_id($object) === spl_object_id($proxy)`? * `get_class($object) === get_class($proxy)`? > Execution of methods or property hooks does not trigger initialization until one of them accesses a backed property. * excellent! The entire design revolving around object state makes it so much easier to reason about the entire behavior too! > Proxies: The initializer returns a new instance, and interactions with the proxy object are forwarded to this instance * I am not sure why this is needed, given ghost objects. * I understand that you want this for when instantiation is delegated to a third party (in Symfony's DIC, a factory), but I feel like the ghost design is so much more fool-proof than the proxy approach. * Perhaps worth taking another stab at sharing object identity, before implementing these? * another note on naming: I used "value holder" inside ProxyManager, because using just "proxy" as a name led to a lot of confusion. This also applies to me trying to distinguish proxy types inside the RFC and this discussion. > Internal objects are not supported because their state is usually not managed via regular properties. * sad, but understandable limitation * what happens if a class `extends` an internal engine class, like the gruesome `ArrayObject`? The API uses various flags given as options ```php public int const SKIP_INITIALIZATION_ON_SERIALIZE = 1; public int const SKIP_DESTRUCTOR = 2; public int const SKIP_INITIALIZED_READONLY = 4; public int const RESET_INITIALIZED_READONLY = 8; ``` * IMO, these should be `enum` types, given in as a `list<TOption>` to the call site * bitmasks are really only relevant in serialization / storage contexts, IMO, for compressing space as much as possible * the bitmasks in the reflection API are already very confusing and hard to use, and I say that as someone that wrapped the entire reflection API various times, out of despair ```php public function newLazyGhost(callable $initializer, int $options = 0): object {} public function newLazyProxy(callable $factory, int $options = 0): object {} ``` * Given the recent improvements around closures and the `...` syntax ( https://wiki.php.net/rfc/first_class_callable_syntax), is it worth having `Closure` only as argument type? * should we declare generic types in the RFC/docs, even if just in the stubs? * they would serve as massive documentation improvement for Psalm and PHPStan * it would be helpful to document `$initializer` and `$factory` as `:void` or `:object` functions * can the engine check that, perhaps? I have no idea if `Closure` can provide such information on return types, inside the engine. ```php public function initialize(object $object, bool $skipInitializer = false): object {} ``` * worth dividing this into * `initialize()` * `markAsAlreadyInitialized()` * don't use a boolean flag for two functions that do different things * all methods were added to `ReflectionClass` * IMO worth having this as separate class/object where this is attached * if one needs to decorate/stub such API in userland, it is therefore a completely separate decoration * `ReflectionClass` is already gigantic * a smaller API surface that only does proxies (perhaps different based on proxy strategy) would be very beneficial * suggestion: something like `new GhostObject($className)` and `new Proxy($className)` * I understand that the interactions with `ReflectionClass#getProperty()` felt natural, but the use-case is narrow, and `ReflectionClass` is already really humongous > The `resetAsLazy*()` methods accept an already created instance. > This allows writing classes that manage their own laziness * overall understandable * a bit weird to support this, for now * useful for resettable interfaces: Symfony DIC could benefit from this * what happens if `ReflectionClass#reset*()` methods are used on a different class instance? * considered? ```php $reflector->getProperty('id')->skipLazyInitialization($post); ``` * perfect for partial objects / understanding why this was implemented * would it make sense to have an API to set "bulk" values in an object this way, instead of having to do this for each property manually? * avoids instantiating reflection properties / marking individual properties manually * perhaps future scope? * thinking `(new GhostObject($class))->initializePropertiesTo(['foo' => 'bar', 'baz' => 'tab'])` > Initialization Triggers * really happy to see all these edge cases being considered here! * how much of this new API has been tried against the test suite of (for example) `ocramius/proxy-manager`? * mostly asking because there's tons of edge cases noted in there > Cloning, unless `__clone()` is implemented and accesses a property. * how is the initializer of a cloned proxy used? * Is the initializer cloned too? * what about the object that a lazy proxy forwards state access to? Is it cloned too? > The following special cases do not trigger initialization of a lazy object: * Will accessing a property via a debugger (such as XDebug) trigger initialization here? * asking because debugging proxy initialization often led to problems, in the past * sometimes even IDEs crashing, or segfaults * this wording is a bit confusing: > Proxy Objects > The actual instance is set to the return value. * considering the following paragraph: > The proxy object is _not_ replaced or substituted for the actual instance. > After initialization, property accesses on the proxy are forwarded to the actual instance. > Observing properties of the proxy has the same result as observing properties of the actual instance. * This is some sort of "quantum locking" of both objects? * How hard is it to break this linkage? * Can properties be `unset()`, for example? * what happens to dynamic properties? * I don't use them myself, and I discourage their usage, but it would be OK to just document the expected behavior > The proxy and actual instance have distinct identities. * Given that we went great lengths to "quantum lock" two objects' properties, wasn't it perhaps feasible to replace the proxy instance? * I have no idea if that would be possible/wished * would require merging `spl_object_id()` within the scope of the initializer stack frame, with any outer ones * would solve any identity problems that still riddle the lazy proxy design (which I think is incomplete, right now) > The scope and `$this` of the initializer function is not changed * good / thoughtful design * using `__construct()` or reflection properties suffices for most users > If the initializer throws, the object properties are reverted to their pre-initialization state and the object is > marked as lazy again. * this is some sort of "transactional" behavior * welcome API, but is it worth having this complexity? * is there a performance tradeoff? * is a copy of the original state kept during initializer calls? * OK with it myself, just probing design considerations * the example uses `setRawValueWithoutLazyInitialization()`, and initialization then accesses `public` properties * shouldn't a property that now **has** a value not trigger initialization anymore? * or does that require `ReflectionProperty#skipLazyInitialization()` calls, for that to work? > `ReflectionClass::SKIP_INITIALIZATION_ON_SERIALIZE`: By default, serializing a lazy object triggers its initialization > This flag disables that behavior, allowing lazy objects to be serialized as empty objects. * how would one deserialize an empty object into a proxy again? * would this understanding be deferred to the (de-)serializer of choice? * exercise for userland? > `ReflectionClass::newLazyProxy()` > The factory should return a new object: the actual instance. * what happens if the user mis-implements the factory as `function (object $proxy): object { return $proxy; }`? * this is obviously a mistake on their end, but is it somehow preventable? > The `resetAsLazyGhost()` method resets an existing object and marks it as lazy. > The indented use-case is for an object to manage its own lazyness by calling the method in its constructor. * this certainly makes it easier to design "out of the box" lazy objects * perhaps more useful for tools like ORMs, (de-)serializers and DICs though * using the proxy API internally in classes like DB connections feels a bit overkill, to me > `ReflectionClass::SKIP_INITIALIZED_READONLY` > If this flag is set, these properties are skipped and no exception is thrown. > The behavior around readonly properties is explained in more details later. > `ReflectionClass::RESET_INITIALIZED_READONLY` * while I can see this as useful, it effectively completely breaks the `readonly` design * this is something I'd probably vote against: not worth breaking `readonly` for just the `reset*()` API here * `reset*()` is already a niche API inside the (relatively) niche use-case of laziness: I wouldn't bypass `readonly` for it * `readonly` provided developer value is bigger than lazy object value, in my mind > `ReflectionClass::resetAsLazyProxy()` > The proxy and the actual instance are distinct objects, with distinct identities. * When creating a lazy proxy, all property accesses are forwarded to a new instance * are all property accesses re-bound to the new instance? * are there any leftovers pointing to the old instance anywhere? * thinking dynamic properties and similar > If `$skipInitializer` is true, the behavior is the one described for Ghost Objects in the Initialization Sequence > section, except that the initializer is not called. * please make a separate method for this * it's not worth cramming a completely different behavior in the same method * can document the methods completely independently, this way * can deprecate the new method separately, if a design flaw is found in future > ReflectionProperty::setRawValueWithoutLazyInitialization() > The method does not call hooks, if any, when setting the property value. * So far, it has been possible to `unset($object->property)` to force `__get` and `__set` to be called * will `setRawValueWithoutLazyInitialization` skip also this "unset properties" behavior that is possible in userland? * this is fine, just a documentation detail to note * if it is like that, is it worth renaming the method `setValueWithoutCallingHooks` or such? * not important, just noting this opportunity > Readonly properties * while I appreciate the effort into digging in `readonly` properties, this section feels extremely tricky * I would suggest not implementing (for now) either of * `SKIP_INITIALIZED_READONLY` * `RESET_INITIALIZED_READONLY` * I would suggest leaving some time for these, and re-evaluating after the RFC lands / starts being used > Destructors > The destructor of proxy objects is never called. We rely on the destructor of the proxied instance instead. * raising an edge case here: `spl_object_*` and object identity checks may be used inside a destructor * for example, a DB connection de-registering itself from a connection pool somewhere, such as `$pool->deRegister($this)` * the connection pool may have the `spl_object_id()` of the proxy, not the real instance * this is not a blocker, just an edge case that may require documentation * it reconnects with "can we replace the object in-place?" question above: replacing objects worth exploring > It employs the ghost strategy by default unless the dependency is to be instantiated > and initialized by a user-provided factory * one question arises here: can this RFC create proxies of **interfaces** at all? * if not, does it throw appropriate exceptions? * the reason this question comes up is that, especially in the context of DICs, factories are for interfaces * concrete classes are implementation details of factories, usually * very difficult to use lazy proxy and ghost object design with services that decorate each other * this is semi-discussed in "About Proxies" below, around "inheritance-proxy" * still worth mentioning "no interfaces" as a clear limitation, perhaps? Marco Pivetta https://mastodon.social/@ocramius https://ocramius.github.io/