[Including my full previous reply, since the list and gmail currently
aren't being friends. Apologies that this leads to rather a lot of
reading in one go...]
On 21/02/2024 18:55, Larry Garfield wrote:
> Hello again, fine Internalians.
> After much on-again/off-again work, Ilija and I are back with a more polished
> property access hooks/interface properties RFC.
Hello, and a huge thanks to both you and Ilija for the continued work
on this. I'd really like to see this feature make it into PHP, and
agree with a lot of the RFC.
My main concern is the proliferation of things that look the same but
act differently, and things that look different but act the same:
    var $a;
    public $b;
    public mixed $c;
    public mixed $d = null;
    public mixed $e => null;
    public mixed $f { get => null; }
    public mixed $g { get { return null; } }
    public mixed $h { get => $field; }
    public mixed $i { get => $this->i; }
    public mixed $j { get => $this->_j; }
    public mixed $k { &get => $this->k; }
    public mixed $l { &get => $this->_l; }
As currently proposed:
- a and b are both what we might call "traditional" properties, and
equivalent to each other; a uses legacy syntax which we haven't
removed for some reason
- c allows the same values as b, but is a "typed property", which
changes its behaviour in various ways
- e looks like d, but is actually equivalent to f and g
- h and i are both "properties with hooks", but j is a "virtual
property", which brings additional changes in behaviour
- l allows callers to assign by reference, but k is not allowed
To make this all less confusing, I suggest the following changes:
- Remove the short-hand syntax in example e, so this sentence from the
RFC is always true: "For a property to use a hook, it must replace its
trailing ; with a code block denoted by { }."
- Allow any property to define an "&get" hook in place of a "get" hook
(i.e. allow example k). It is up to the user to decide whether this
will cause problems.
- Limit as much as possible the difference in behaviour between
"virtual" and "hooked" properties.
And, probably most controversially:
- Hooks should always be *on top of* the normal property defined,
unless explicitly indicated to be "virtual". Example j would thus be:
    public virtual mixed $j { get => $this->_j; }
This is slightly more verbose, but removes all the complexity for both
the implementation and users in determining which properties are
"virtual". I believe the time saved in reading the more explicit code
would outweigh the time spent typing the extra keyword.
Regarding the implicit $value on set hooks, I am unconvinced by the
comparison to $this, which acts more like a keyword - it is reserved
outside of methods, read-only inside them, and cannot be renamed. I
think a closer analogy would be "foreach ( $foo as $key => $value )"
or "catch ( SomeException $e )": naming $value is always required;
$key and $e can be omitted, but doing so makes the values unavailable,
it does not give them default names.
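To make the analogy concrete, here is a minimal sketch in current PHP (it does not use the proposed hook syntax). The array contents and variable names are made up for illustration; omitting the variable in a catch clause needs PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical data, for illustration only.
$foo = ['a' => 1, 'b' => 2];

// $value must always be named; the key is simply not requested here,
// so it is not available under any default name.
$sum = 0;
foreach ($foo as $value) {
    $sum += $value;
}

// Naming the key is what makes it available.
$keys = [];
foreach ($foo as $key => $value) {
    $keys[] = $key;
}

// Since PHP 8.0 the variable in a catch clause may be omitted;
// the exception object is then not bound to any variable at all.
$caught = false;
try {
    throw new RuntimeException('boom');
} catch (RuntimeException) {
    $caught = true;
}
```

In both constructs, leaving a name out means the value is not exposed, rather than exposed under an implicit name.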
Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"
As noted above, I think the user should be able to opt into this
facility for both virtual and non-virtual hooked properties, at their
own risk, for example:
    class Example {
        // non-virtual property, using a get hook for additional
        // behaviour, not to reroute the value
        public array $foo {
            &get { $this->foo = $this->lazyLoad('foo'); return $this->foo; }
        }
        // ...
    }
    $a = new Example;
    // will call $a->lazyLoad('foo') to populate the initial value,
    // then append an item to it
    $a->foo[] = 42;
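As a rough point of comparison, similar behaviour can be observed in current PHP with a by-reference __get magic method. This is only a sketch of the idea, not the RFC's mechanism: the class LazyExample, its $loaded and $loads members, and the body of lazyLoad() are all hypothetical names invented for the illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical class: emulates a lazily loaded array property using a
// by-reference __get. It does not rely on the proposed hook syntax.
final class LazyExample
{
    /** @var array<string, array> storage for lazily loaded values */
    private array $loaded = [];

    /** Counts how often lazyLoad() ran, to show it is only called once. */
    public int $loads = 0;

    // Returning by reference lets callers modify the stored array in place.
    public function &__get(string $name): mixed
    {
        if (!array_key_exists($name, $this->loaded)) {
            $this->loaded[$name] = $this->lazyLoad($name);
        }
        return $this->loaded[$name];
    }

    // Stand-in for a real loader (assumption for this sketch).
    private function lazyLoad(string $name): array
    {
        $this->loads++;
        return ['initial'];
    }
}

$a = new LazyExample;

// Direct array write through the by-reference getter.
$a->foo[] = 42;

// The hand-written form of the equivalence suggested above.
$_temp =& $a->foo;
$_temp['bar'] = 7;
unset($_temp);
```

Both writes end up in the same stored array, and the loader runs only on first access. Whether hooks should behave the same way is exactly the design question; this only shows that the by-reference model already has a precedent in the language.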
The more I think about it, the more convinced I am this RFC is trying to
cram too many features into too small a space.
For instance, the ability to specify a type on the set hook. Taking the
example from the RFC:
    public UnicodeString $name {
        set(string|UnicodeString $value) {
            $this->name = $value instanceof UnicodeString ? $value : new UnicodeString($value);
        }
    }
What is the type of $name? The answer is "it depends if you're writing
to or reading from it". The same use case can be covered by this:
    public UnicodeString $name;
    public string $name_string {
        get => (string)$this->name;
        set => $this->name = new UnicodeString($value);
    }
Now we have two properties with clear types, without the complexity of
the conditional (which would be even worse if we wanted more than two
types). We can even swap the "real" and "virtual" properties transparently:
    public UnicodeString $name {
        get => new UnicodeString($this->name_string);
        set => $this->name_string = (string)$value;
    }
    public string $name_string;
This exotic "asymmetric typing" is then being used to justify other
decisions - if you can specify the setter's type, it's confusing if you
specify a name without a type; so we need to make the name optional as
well... Compare to C#, where "value" is not a default, it's an
unchangeable keyword; or Kotlin, where naming it is mandatory but
you don't have to mention the type.
I think my concerns about distinguishing "virtual properties" may stem
from a similar cause.
In C#, all "properties" are virtual - as soon as you have any
non-default "get", "set" or "init" definition, it's up to you to declare
a separate "field" to store the value in. Swift's "computed properties"
are similar: if you have a custom getter or setter, there is no backing
store; to add behaviour to a "stored property", you use the separate
"property observer" hooks.
Kotlin's approach is philosophically the opposite: there are no fields,
only properties, but properties can access a hidden "backing field" via
the special keyword "field". Importantly, omitting the setter doesn't
make the property read-only; it implies set(value) { field = value }.
The current RFC attempts to combine all of these ideas into one syntax,
on top of everything the language already has. The result has some
odd-shaped corners. For instance, this won't work:
    public string $name { set => throw new Exception('Read-only property ' .
        __PROPERTY__); }
But this will:
    public string $name { set => throw new Exception('Read-only property ' .
        __PROPERTY__ . '; current value is: ' . $this->name); }
The first declares a virtual property, with no default getter, like in
C# or Swift. The second instead acts like Kotlin, and has a default
getter referencing the implicit backing field.
It would be clearer to choose one style or the other: explicitly enable
the defaults...
    // default getter and backing field requested
    public string $name { get; set => throw new Exception('Read-only property ' .
        __PROPERTY__); }
    // setter disabled because it's not mentioned, even though backing field is used
    public string $name { get => $this->name ??= $this->generateName(); }
...or explicitly disable them:
    // implied default getter and backing field
    public string $name { set => throw new Exception('Read-only property ' .
        __PROPERTY__); }
    // setter disabled because property is declared virtual
    public virtual string $name { get => $this->firstName . ' ' .
        $this->lastName; }
I think there's some really great functionality in the RFC, and would
love for it to succeed in some form, but I think it would benefit from
removing some of the "magic".
Regards,
--
Rowan Tommins
[IMSoP]