> Now fast forward a few years, and we figure out a way to performantly enforce
> that check at runtime and turn that function call into a call-site TypeError.
> I would consider that a good improvement to the language. However, it would
> also mean that the previous mismatched line would now generate an error where
> it didn't before. And the author is going to get up in arms about how "PHP
> is breaking my code and destroying the language why can't they respect BC"
> and so on and so on, because we've seen that movie several times now.
This depends on how reified generics are implemented. If it's done the
way you suggest, i.e. all erased generics become reified, then this is
a problem. However, if we make reified generics an opt-in feature, this
is not a concern: the code will keep working as is, and would only
break if the author of `foo` enables reified generics for it.
Personally, if I had the choice between fully reified generics from
day one and bound erasure + opt-in reification, the latter makes more
sense, because I don't want to pay the performance penalty that comes
with reified generics when I know that all my code is type-safe.
Quoting Matt Brown: "I have now been writing Hack (Facebook fork of
PHP) full-time for almost five years — alongside hundreds of other
backend engineers at Slack — and it’s obvious to everyone around me
that erased generics are a good idea."
> In general, several of those issues are entirely self-created, and PHP can
> easily resolve them.
>
> 1. Nothing says we can't provide a first-party SA tool that's written in Rust
> instead of C.
>
> 2. Nothing says a first-party SA tool must follow the exact same cadence as
> the engine. Making improvements to it at the same time as a .z release of
> the engine is completely reasonable; it would be purely an administrative
> decision to do that or not.
>
> First-party doesn't have to mean "in the php-src C code directories." Though
> leveraging some parts of those as externs could certainly make sense.
1. If it's written in Rust, C++, Zig, whatever, it might make the work
easier, but it won't make it instant.
2. True, but again, this needs a maintenance team, a spec team, its
own documentation, and more.
> (Joking: Should we just ship Mago with PHP? :-) )
Mago could help if such an SA were written in Rust, because it can
provide the parser, name resolver, reflections, semantics analyzer,
and a few other crates. However, the analyzer can't be used as is,
because it does much more than just "check if generics are being used
correctly".
However, this would open a can of worms: Should this analyzer support
watch mode? Should it perform incremental analysis? What about the
language server? Should it offer IDE/editor integration? And a million
other things.
(Also note: PHP *can't* actually use the Mago parser and get away
with it, because in Mago we decided a long time ago not to support
non-UTF-8 code. PHP must stay true to the engine, so it would have to
write its own parser.)
> Kotlin and C# have several orders of magnitude more users, so the
> "familiarity" argument goes firmly in that direction.
We will see. This small change can be made later, before the vote, or
it may require a secondary vote.
> At what point did I say slide shenanigans are a problem? :-)
Nvm, read it wrong :)
> No no, I meant the type argument cap.
>
> function foo<A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P>(/* */) {}
>
> That's 16 type arguments, an order of magnitude less than the cap, and I'll
> already say any such code should be rejected on sight as bonkers. So
> reserving more than one bit for future flags seems completely fine to me, and
> perhaps advisable.
Ah, I agree. Already changed the cap to 127.
> I'd probably omit it, but if it's kept, the RFC should at least include that
> explanation.
I will omit it.
> And explicitly calling out something like "despite the name, this is
> mostly-enforced generics." Or something like that, which seems more accurate.
>
> Would this be enforceable?
>
> class Collection<T> {
> public function add(T $val): void {}
> }
>
> $c = new Collection<int>();
>
> $c->add('string'); // Error, or would this be allowed at runtime?
No, hence the name "bound erased." In this example, `T` has no bound,
so it is treated as if it were bound to `mixed`, and this call won't
error.
However, the following applies:
```
class Collection<T> {
    public function add(T $val): void {}
}

class UserCollection extends Collection<User> {}

$c = new UserCollection();
$c->add(new User()); // ok
$c->add("hello"); // TypeError
```
because `T` gets substituted with `User` at compile time.
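
To make the difference concrete, here is a rough sketch in today's PHP
(no generics syntax) of how the two cases behave once the type
parameter is gone. It's a hand-written approximation, not what the
engine would emit: `ErasedCollection`, `ErasedUserCollection` and
`count()` are names I made up, and the explicit `instanceof` check
stands in for the parameter check that the substituted `User` type
would give you.

```php
<?php
declare(strict_types=1);

final class User {}

// Stand-in for Collection<T> with an unbounded T: T erases to mixed.
class ErasedCollection
{
    /** @var list<mixed> */
    protected array $items = [];

    public function add(mixed $val): void
    {
        $this->items[] = $val;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

// Stand-in for "UserCollection extends Collection<User>": T is replaced
// by User. Today's PHP cannot narrow an inherited parameter type, so the
// check is spelled out by hand (an assumption of this sketch).
class ErasedUserCollection extends ErasedCollection
{
    public function add(mixed $val): void
    {
        if (!$val instanceof User) {
            throw new TypeError(
                'Argument #1 ($val) must be of type User, ' . get_debug_type($val) . ' given'
            );
        }
        parent::add($val);
    }
}

$c = new ErasedCollection();
$c->add('string'); // accepted: there is nothing left to check against

$u = new ErasedUserCollection();
$u->add(new User()); // ok

try {
    $u->add('hello');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true; // the substituted type is still enforced
}
```

The first call goes through because nothing about `int` survives on the
instance; the last one fails because `User` is baked into the class
itself.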
> - With reflection, would it be possible to tell what a given object was
> instantiated with, or only the class? new
> ReflectionObject($listOfInts)->getTypeParam() => ReflectionType(int)? Or is
> that also the erased part?
No, that's erased. This is entering reified generics territory, with
the associated performance issues. Either we add an extra pointer
field to `zend_object` (+8 bytes on every PHP object, regardless of
whether it uses generics or not), or we keep a side table keyed by
object handle. The latter moves the cost from storage to lookup (every
read incurs a hash table hit, and the entry must be torn down when the
object is destroyed). Neither is free.
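
A userland analogy of the side-table option, using `WeakMap` (PHP
8.0+). It's only meant to show the shape of the trade-off; an
engine-level version would live in C and look different. `Box`,
`TypeBindings`, `bind`, `argsOf` and `size` are names I made up for
the sketch.

```php
<?php
declare(strict_types=1);

final class Box
{
    public function __construct(public mixed $value) {}
}

// Hypothetical side table: type arguments live outside the object, so
// objects that never use generics pay nothing in per-object storage.
final class TypeBindings
{
    private static ?WeakMap $table = null;

    private static function table(): WeakMap
    {
        return self::$table ??= new WeakMap();
    }

    /** @param list<string> $typeArgs */
    public static function bind(object $obj, array $typeArgs): void
    {
        self::table()[$obj] = $typeArgs;
    }

    /** @return list<string>|null null when nothing was recorded */
    public static function argsOf(object $obj): ?array
    {
        // Every read is a hash lookup rather than a plain field access.
        $table = self::table();
        return isset($table[$obj]) ? $table[$obj] : null;
    }

    public static function size(): int
    {
        return count(self::table());
    }
}

$ints = new Box(1);
TypeBindings::bind($ints, ['int']);
$plain = new Box('x');

$boundArgs = TypeBindings::argsOf($ints);  // ['int']
$plainArgs = TypeBindings::argsOf($plain); // null

$before = TypeBindings::size();
unset($ints); // last reference gone, so the entry goes away with the object
$after = TypeBindings::size();
```

`WeakMap` does the teardown for us here; in the engine that bookkeeping
would have to be done explicitly when the object is freed.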
Beyond storage, the bindings themselves aren't cheap to carry either:
each `zend_type` is 16 bytes plus any list / named-with-args payload
it points to. Therefore, something like `Map<string, Box<User>>`
involves multiple type structures per instance, all of which must be
kept alive, reference-counted, and traversed at instanceof /
reflection time.
And once the runtime can read the binding back, the pressure to
actually use it follows immediately: `$obj instanceof Box<string>`
becoming truthful, parameter checks tightening from the bound to the
substituted type, etc. That's the full reified runtime, which is a
separate RFC (and one we explicitly punt on in "Why bound erasure").
> - I assume that dynamically specifying the type parameter is also right-out,
> yes?
Yep, not supported, and I don't see it being useful at all, really.
The way to type that function properly would be something like this:
```
function make<T: object>(int $count) {
    $c = new Foo::<T>();
    // $count is an int, so count up to it rather than iterating over it
    for ($i = 0; $i < $count; $i++) {
        $c->add(new T());
    }
    return $c;
}
```
Note: This requires reified generics, specifically, the `new T` part.
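
For comparison, the closest thing that works without reification is to
pass the class name as a value instead of as a type argument. A minimal
sketch in today's PHP, assuming a plain non-generic `Foo` with `add()`
and a helper `all()` that I made up for the example; the `@template`
annotations are only there for static analysers:

```php
<?php
declare(strict_types=1);

final class User {}

final class Foo
{
    /** @var list<object> */
    private array $items = [];

    public function add(object $val): void
    {
        $this->items[] = $val;
    }

    /** @return list<object> */
    public function all(): array
    {
        return $this->items;
    }
}

/**
 * @template T of object
 * @param class-string<T> $class the class to instantiate, passed as a value
 */
function make(string $class, int $count): Foo
{
    $c = new Foo();
    for ($i = 0; $i < $count; $i++) {
        // No `new T()` needed: the class name is available at runtime.
        $c->add(new $class());
    }
    return $c;
}

$users = make(User::class, 3);
```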