Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

Ilija Tovilo Thu, 11 Apr 2024 16:02:12 -0700

Hi Juliette

On 9-4-2024 16:03, Juliette Reinders Folmer wrote:
> On 8-4-2024 23:39, Ilija Tovilo wrote:
>>
>> https://wiki.php.net/rfc/property-hooks
>>
>
> I realize it is late in the discussion period to speak up, but for months 
> I've been trying to find the words to express my concerns in a polite and 
> constructive way and have failed.


First of all, thank you for taking the time to analyze the RFC and
form your thoughts. I know how much time and effort it takes.
Nonetheless, I hope you understand that receiving feedback of this
sort literally minutes before a vote is planned is close to the worst
way to go about it, both from a practical standpoint, but also RFC
author and reviewer morale.

Many of the points you raise have been there for many months to a
year, and many have been raised and discussed many times over. I will
try to address each, and give people some time to respond.

> And not just one syntax, but five and the differences in the semantics of the 
> function syntaxes are significant.

I think this is somewhat of a misrepresentation. For reference, you
can find the grammar here. It's only about <50 lines.

https://github.com/php/php-src/blob/bf390b47c7522fd4d130be26b8ee97520f985275/Zend/zend_language_parser.y#L1088-L1131

Notably, `get` and `set` hooks don't actually have a different
grammar. Instead, hooks have names, optional parameter lists, and an
optional short (`=>`) or long (`{}`) body. Whether all possible
combinations are considered different syntax is open to
interpretation. It's true that additional rules apply to each hook,
e.g. whether they are allowed to declare a parameter list, and how the
return value is used. I could see an argument being made for allowing
an empty parameter list (`()`) for `get`, for consistency. Let us know
if you have any other ideas to make them more congruent.

In any case, I don't believe PHP_CodeSniffer would need to handle each
combination separately.

What would your optimal solution look like? I feel this discussion
cannot progress unless we have more concrete discussion points.

> * The implicit "set" parameter, which does not have a explicit parameter, 
> does not have parentheses and has the magically created $value variable.

To be a bit more exact, the parameter list is simply inferred to
`(<property type> $value)` so that the user does not need to spell it
out. This is not akin to other magic variables we had previously, like
`$http_response_header`.

> TL;DR: this RFC tries to do too much in one go and introduces a huge amount 
> of cognitive complexity with all the exceptions and the differences in 
> behaviour between virtual and backed properties. This cognitive complexity is 
> so high that I expect that the feature will catch most developers out a lot 
> of the time.

Many people still say that this RFC doesn't do enough because it
doesn't support x/y/z. This is including you:

> * The arrow function variant for `set` with all the above differences + the 
> differences inherent to arrow functions with the above mentioned exceptions + 
> the implicit assignment, which breaks the expected behaviour of arrow 
> functions by assigning the result of the expression instead of returning it 
> (totally understandable, but still a difference).
> * Properties _may_ or _may not_ have a default value anymore, depending on 
> whether the hooks cause it to be a backed or a virtual property.
> * Properties with hooks can _be_ readonly, but cannot be declared as readonly.
> * Only available for object properties, static properties are excluded.
> * Readonly properties which are not denoted as readonly, but still are due to 
> their virtual nature (get without access to $this->prop), which can only be 
> figured out by, again, studying the contents of the hook function.

There are more below that I will be commenting on in more detail.

I understand that, in a perfect world, each language feature would
automatically work with every other. In reality, this is either not
possible, or complex and time consuming. We opted for the middle
ground, allowing things that were workable and omitting things that
weren't, or required unrelated changes. If this is not acceptable,
then I genuinely don't know how to approach non-trivial RFCs.

> * Properties can now be declared as abstract, but only with explicit hook 
> requirements, otherwise the abstract keyword is not allowed.
> * Properties can now be declared on interfaces, but only with explicit hook 
> requirements, otherwise they are not allowed.

The semantics of plain interface properties are not clear.
Intuitively, `public $prop;` looks equivalent to `public $prop { get;
set; }`, but due to reference semantics it is not. It's not even
equivalent to `{ &get; set; }`, because by-reference hooked properties
still have some inherent limitations
(https://wiki.php.net/rfc/property-hooks#assignment_by_reference).
Again, this could be addressed, but only at the cost of complexity.

> * Abstract/Interface properties don't comply with the same 
> covariance/contravariance rules as other properties

Do you mean that they are not invariant as other properties? I think
it's generally accepted that LSP rules should be as lax as possible,
while being sound. Note that this rule isn't unique to
read-/write-only abstract properties either, but also applies to
read-/write-only virtual properties.

```php
class P {
    public string|int $readonly { get => 'foo'; }
}

class C extends P {
    public int $readonly { get => 42; }
}
```

> And last but not least, there is the ability to declare property hooks in 
> constructor property promotion ... where we can now basically have a function 
> declaration within the signature of a method declaration.... with all five 
> possible syntax types.

The grammar used for hooks in constructor property promotion is the
same as the one used for properties. Am I wrong to assume this
shouldn't cause any additional work for PHP_CodeSniffer?

> * `var` for property declarations is not supported according to the RFC.

This is a misunderstanding between Larry and me. I personally never
intended to deprecate `var` in this RFC. We've briefly discussed this
and decided to treat `var` as normal, meaning as an alias of `public`.

> * Disallows references to properties, but only when there is a set hook.
> * Disallows indirect modification of a property, but only when there is a set 
> hook.

The RFC spends much time explaining these limitations. If the RFC is
ever accepted, in any form, then we must accept either of these
approaches:

1. The engine places limitations on by-reference usage of properties
when there is a `set` hook.
2. The user can trivially circumvent the `set` hook.

This limitation is not unique to PHP either.

> * Write-only (virtual) properties, which cannot be read and you need to study 
> the hook declaration in detail to figure out if a property is or is not 
> write-only.

Indeed. But this has also been discussed in much detail. One proposed
solution was explicitly marking properties as virtual. This comes with
the downside of changing the property signature, making conversion of
a backed parent property to virtual (and vice versa) a BC break. If
you have a different solution in mind, please share.

> Oh, and hang on, they *can* be read from a method in the same class as long 
> as that method is called from within the set hook, so now the method will 
> need to either do a backtrace or use Reflection to figure out whether it has 
> access to the property value.

Right. We use the same mechanism as `__get`/`__set` uses to protect
against recursion. In particular, when entering a magic method for a
particular property, PHP remembers this using a "property guard". It's
really just a marker in the object that sets the magic method type,
along with the property name.

When the same property is accessed again, the magic method or hook is
skipped, and PHP behaves as if the magic method or hook wasn't there
at all. For `__get`, `__set` and hooks of virtual properties, this
means accessing a dynamic property. For backed properties, this means
accessing its backing value.

After discussing it, we've decided to make virtual properties slightly
less surprising by throwing, instead of trying to create a dynamic
property. This would be consistent with how virtual read-, and
write-only properties are handled outside of hooks, when they are
written to or read, respectively.

> now why does that remind me of the magic __...() methods ?

That's no secret. We have mentioned many times that we use the same
mechanism for accessing backing value as `__get`/`__set` uses for
accessing dynamic properties.

> * Whether a type can be specified on the parameter on `set` depends on 
> whether the property is typed. You cannot declare `set(mixed $value)` for an 
> untyped property, even though it would effectively be compatible. This is 
> inconsistent with the behaviour for, for instance method overloads, where 
> this is acceptable: https://3v4l.org/hbCor/rfc#vrfc.property-hooks , though 
> it is consistent with the behaviour of property overloads, where this is not 
> acceptable: https://3v4l.org/seDWM (anyone up for an RFC to fix this 
> inconsistency ?)

For child properties, it is not legal to add `mixed` to a previously
untyped property. https://3v4l.org/mScQS#v8.3.5 I was honestly not
aware of the case you outlined. Anyway, either approach we chose will
be inconsistent with the other.

> * Changes the meaning of `$this->property` read/write access for all methods 
> called from within a hook while executing the hook.

What is the right solution to this in your opinion? Recursion surely
isn't it. Throwing when accessing the backing value outside of its
corresponding hook? I will investigate whether this is possible in an
efficient way, and whether it's worth considering.

> * Creates two different flavour of properties: "backed" properties and 
> "virtual" properties with significantly different behaviours, raising 
> cognitive complexity.
>     This includes, but is not limited to the fact that set hooks can be 
> bypassed on virtual properties via a get reference.

Sure, but is there actually a solution to this? Making all properties
backed means that serialization includes backing values that aren't
used, and that the property supports useless reads/writes for
properties that are only read or only written to. Only supporting
virtual properties means that there's an inevitable BC break when
converting a backed property to a virtual one, because the explicit
backing property cannot share the same name as the virtual one, which
will once again affect serialization.

> * The range of different behaviours for various forms of serialization.

Indeed, we carefully picked the behavior that should make sense for
each case. Isn't that better than the user having to remember that
they need to handle serialization/JSON themselves, depending on the
case, which they also need to identify themselves?

> [1] The RFC also states in "Interaction with isset() and unset()" that the 
> behaviour of property hooks is consistent with how `isset()` interacts with 
> `__get()` today. This is not true, as in: the behaviour as described for 
> isset with magic methods is different than what the RFC states it is: 
> https://3v4l.org/2Arkh/rfc#vrfc.property-hooks

Right, this paragraph is a bit misleading. In essence, `isset`
triggers both `__isset` and `__get` to 1. verify that the property
even exists and 2. if it does, that it is non-null. For hooked
properties, we don't need to verify the former, because the property
is explicitly declared. So we only call `get` and make sure its value
is non-null.

We'll make sure to clarify this better.

> Additionally, it proposes for isset() to throw an Error for virtual 
> properties without a get hook. This is surprising behaviour as isset() by its 
> nature is expected to *not* throw errors, but return false for whatever is 
> not set or inaccessible from the current context: https://3v4l.org/3OlgM

Personally, I would rather know when checking `isset` on something
that can never return `true`. But I admit that this case is not
clear-cut. The throwing approach can be relaxed to return false
without a BC break though, but not the opposite.

> [2] PHP supports "multi-property declarations", but I see no mention of these 
> in the RFC, which makes it unclear if and if so, how property hooks would 
> work with multi-property declarations.
> class Foo {
>     public string $bar, $baz, $bab;
> }

The implementation does not allow mixing multi-property declarations
with hooks. I find them confusing (e.g. are `public int $foo, $bar;`
two typed properties, or one typed and one untyped?). The same applies
for hooks. We'll make sure to spell this out explicitly.

> [3] If a `set` hook gets a named parameter, is the magic `$value` variable 
> still available or not ? I'd expect not, but it is unclear from the RFC.

No. As mentioned above, on omission of the parameter list, it is
simply inferred. There's no "truly" magic `$value` variable.

> [4] Why should ReflectionProperty::getRawValue() throw an error for static 
> properties ? Instead of returning the value, identically to 
> ReflectionProperty::getValue() ? This is surprising to me and I see no reason 
> for it.

Because it's equivalent to `getValue`, and indicates the user is doing
something they don't fully understand.

> As for the impact on static analysis tools....

I won't comment too much on this, because it's your area of expertise.
Of course, I sympathize. If there's a solution that works well for us
_and_ you, let me know. But it doesn't sound like some simple tweaks
can significantly improve this problem for you.

Here are the things we're changing:

* `var` continues to be supported.
* We will change accessing a virtual property with the same name from
within a method called from a hook to throw, rather than create a
dynamic property.
* We will change accessing a backed property from within a method
called from a hook to throw as well, assuming it can be done
performantly. (We need to verify this.)

To summarize: I understand your concerns and why you raise them.
However, none of the complexity in this RFC is arbitrary. Most of it
is inherent in the problem space, and the RFC’s approach is the result
of extensive experimentation to determine possible solutions. While
some superficial changes are possible, the actual complex parts
(reference handling, inheritance, interfaces, virtual properties,
etc.) are all necessary complexity, not accidental complexity. Absent
a novel alternative, which after over a year of work and extensive
discussions on this list and elsewhere we do not believe exists, we
believe this RFC represents "the best hooks implementation that PHP
can support." And that implementation deserves a final vote to
determine if the internals community wants hooks at all or not.

Ilija

Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

Reply via email to