Hi Rob,
I'm torn on this one. On the one hand, it does look like a nice solution
for adding custom value objects to the language; on the other hand, it's
a lot of things that are "just slightly different" for users to get used to.
On 17/11/2024 21:30, Rob Landers wrote:
> One of the main reasons for the alternative creation syntax is because
> I felt that "new" was misleading at best, and just plain wrong at
> worst. It is also why I chose "&", to make it clear you are not
> getting "a new one" but one that "just happens to exist with the
> values you asked for." I'd be open to a different keyword or something
> else entirely. It's just that "new" is the wrong one for records.
I'm not convinced by this, because I'm not convinced the "mental model"
in the RFC is the one that most developers need to care about. I think a
much simpler mental model (for someone who understands the rest of PHP) is:
- records are copy-on-write (like arrays, not like objects)
- the === operator returns true for two records with the same value
(again like arrays, not like objects)
- the implementation optimises two records with the same values to share
memory
The third point is useful to know if you're creating a lot of them, but
probably irrelevant most of the time.
We also might not want to make it a hard guarantee, because there may be
cases where a different trade-off is more efficient. For instance, maybe
we will optimise $foo->with(a: 1)->with(b: 2)->with(c: 3) to overwrite
values in-place, at the cost of an extra condition in the `===`
implementation.
That would be similar to some of the changes to zvals in PHP 7; for
instance, "$foo=42; $bar=$foo;" will copy the value 42 to a new piece of
memory, not increase a zval reference count as PHP 5 would have done.
That leaves the mental model as mostly "records are a bit like arrays".
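For reference, the array and object behaviour that this model leans on can be seen in current PHP. This is only a sketch of existing semantics, not of the RFC; the Point class is a made-up example and needs PHP 8.0 or later for constructor promotion.

```php
<?php
// Arrays: value semantics (copy-on-write) and value-based ===
$a = [1, 2, 3];
$b = $a;      // no copy yet; both variables share the same storage
$b[] = 4;     // the write separates $b; $a is unaffected
$arraysIdentical = ($a === [1, 2, 3]);   // true: compared by value

// Objects: handle semantics and identity-based ===
final class Point   // hypothetical class, for illustration only
{
    public function __construct(public int $x, public int $y) {}
}

$p = new Point(1, 5);
$q = $p;      // copies the handle, not the object
$q->x = 9;    // the change is visible through $p as well
$sharedX = $p->x;                                        // 9
$objectsIdentical = (new Point(1, 5) === new Point(1, 5)); // false: different instances
$objectsEqual     = (new Point(1, 5) ==  new Point(1, 5)); // true: same class and values
```

Under the simpler model, a record would behave like $a and $b here rather than like $p and $q.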
Now, it's true that we don't write "new array(1,2,3)"; but I have heard
people calling the "array()" syntax "the array constructor", and the
manual describes it as "creating an array". Similarly, you use
"constructor" and "construction" throughout the RFC.
All of that makes it feel perfectly natural for me to have "new Point(1,
5)" mean "create a Point record; feel free to save memory by reusing one
with the same values".
> A *record* may contain a traditional constructor with zero arguments
> to perform further initialization.
> A *record* body may also declare properties whose values are only
> mutable during a constructor call. At any other time, the property is
> immutable.
Talking of constructors, I find the proposed syntax rather confusing,
because it's doing the same job as constructor property promotion, but
in almost the opposite way: taking things out of the constructor
signature, vs putting them in:
    readonly class Foo {
        public string $bytes;
        public function __construct(public int $len) {
            $this->bytes = random_bytes($this->len);
        }
    }

    record Foo (int $len) {
        public string $bytes;
        public function __construct() {
            $this->bytes = random_bytes($this->len);
        }
    }
While writing this example, I realised that the behaviour is also
confusing: when exactly will the constructor be called? Consider:
    $a = &Foo(42); // new Record; constructor called
    $b = &Foo(42); // re-use cached Record; is the constructor skipped? or
                   // called, but the result discarded? what if the
                   // constructor modifies $this->len?
    unset($a, $b);
    $c = &Foo(42); // does this re-use the Record? or has it been garbage
                   // collected, so this will call the constructor again?
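To show that these questions have observable answers, here is a rough userland emulation in current PHP (8.1 or later). It is purely an assumption about one way such a cache could behave, not what the RFC specifies: the CachedFoo class, its of() factory, and the $constructed counter are all invented for this sketch.

```php
<?php
// Hypothetical emulation: instances are interned per $len via weak references.
final class CachedFoo
{
    public static int $constructed = 0;   // counts constructor runs
    /** @var array<int, WeakReference<CachedFoo>> */
    private static array $cache = [];

    public readonly string $bytes;

    private function __construct(public readonly int $len)
    {
        self::$constructed++;
        $this->bytes = random_bytes($len);
    }

    public static function of(int $len): self
    {
        // Reuse a live instance if one exists; the constructor is skipped.
        $existing = (self::$cache[$len] ?? null)?->get();
        if ($existing !== null) {
            return $existing;
        }
        $instance = new self($len);
        self::$cache[$len] = WeakReference::create($instance);
        return $instance;
    }
}

$a = CachedFoo::of(42);               // cache miss: constructor runs
$b = CachedFoo::of(42);               // cache hit: constructor skipped
$sameInstance = ($a === $b);          // true
$runsWhileAlive = CachedFoo::$constructed;   // 1

unset($a, $b);                        // last strong references are gone
$c = CachedFoo::of(42);               // weak reference is dead: constructor runs again
$runsAfterUnset = CachedFoo::$constructed;   // 2
```

In this emulation $c gets different random bytes from $a, so whether the instance was cached is visible to user code; a real implementation would have to pick an answer to each of the questions above.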
I think we should decide between two paths:
- structs/records as a special kind of object, keeping as much behaviour
and syntax from classes as we can; that means no "inline constructor",
and probably no "pull a cached instance from memory"
- structs/records as a brand new thing, with new syntax that only allows
the parts that fit the model; that means no non-constructor properties,
and no constructor bodies
Regards,
--
Rowan Tommins
[IMSoP]