On Sat, Jun 22, 2024 at 8:59 PM Arnaud Le Blanc <arnaud...@gmail.com> wrote:
>
> On Fri, Jun 21, 2024 at 7:20 PM Robert Landers <landers.rob...@gmail.com>
> wrote:
> > > > I'm always surprised why arrays can't keep track of their internal
> > > > types. Every time an item is added to the map, just chuck in the type
> > > > and a count, then if it is removed, decrement the counter, and if
> > > > zero, remove the type. Thus checking if an array is `array<int>`
> > > > should be a near O(1) operation. Memory usage might be an issue (a
> > > > couple bytes per type in the array), but not terrible.... but then
> > > > again, I've been digging into the type system quite a bit over the
> > > > last few months.
> > >
> > > And every time a modification happens, directly or indirectly, you'll
> > > have to modify the counts too. Given how much arrays / hash tables are
> > > used within the PHP codebase, this will eventually add up to a lot of
> > > overhead. A lot of internal functions that work with arrays will need
> > > to be audited and updated too. Lots of potential for introducing bugs.
> > > It's (unfortunately) not a matter of "just" adding some counts.
> >
> > Well, of course, nothing in software is "just" anything.
>
> Counters are not cheap as we need one slot for each type in the array so we
> need a dynamic buffer and an indirection, plus absolutely every mutation
> needs to update a counter, including writes to references. It is possible to
> remove the counters and to maintain an optimistic upper bound of the type
> (computing the type more precisely when type checking fails), but I feel this
> would not work well with pattern matching.

To me, that sounds kinda silly. PHP does reference counting, and while
there is an overhead, it doesn't prevent us from using it... Anyway,
while you make a good point for the pathological case, I suspect that in
the popular case (mostly homogeneous arrays) the performance impact will
be negligible compared to the productivity gain of being able to
type-check arrays.

>
> Also, a few things complicate this:
> - Nested writes like $a[0][0][0]=1 need to backtrack to update the type of
> all parent arrays after the element is added/updated

Nesting could be forbidden, at least at first. I think saying that
"array<array<array<int>>>" is forbidden is totally fine.

> - Supporting references to properties whose type is a typed array, or
> dimensions of these properties, is very hard

That also depends on the defined behavior. If you have a typed property
that is array<string> and you try to add an int to it ... it could fail.
That would not be hard at all.

>
> Fixed-type arrays may be easier to support but there are important drawbacks
> in usability IMHO. This does not play well with CoW semantics.
>
> Best Regards,
> Arnaud

Robert Landers
Software Engineer
Utrecht NL
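
P.S. To make the counter idea concrete, here is a rough userland sketch.
It is only an illustration under my own assumptions: the class name
TypeCountedArray and its methods are made up, this is not engine code,
and it deliberately ignores references and nested writes, which are the
hard cases raised above.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only; a real implementation would live in the
// engine's hash table, not in a userland wrapper like this one.
final class TypeCountedArray
{
    /** @var array<int|string, mixed> the actual elements */
    private array $items = [];

    /** @var array<string, int> type name => number of elements of that type */
    private array $typeCounts = [];

    public function set(int|string $key, mixed $value): void
    {
        if (array_key_exists($key, $this->items)) {
            // Overwriting an element: release the old value's type first.
            $this->forget($this->items[$key]);
        }
        $this->items[$key] = $value;
        $type = get_debug_type($value);
        $this->typeCounts[$type] = ($this->typeCounts[$type] ?? 0) + 1;
    }

    public function remove(int|string $key): void
    {
        if (!array_key_exists($key, $this->items)) {
            return;
        }
        $this->forget($this->items[$key]);
        unset($this->items[$key]);
    }

    // Looks only at the counter map, never at the elements themselves.
    // An empty array is treated as matching any element type.
    public function isArrayOf(string $type): bool
    {
        return $this->typeCounts === []
            || (count($this->typeCounts) === 1 && isset($this->typeCounts[$type]));
    }

    // Decrement the counter for the value's type; drop the slot at zero.
    private function forget(mixed $value): void
    {
        $type = get_debug_type($value);
        if (--$this->typeCounts[$type] === 0) {
            unset($this->typeCounts[$type]);
        }
    }
}

$a = new TypeCountedArray();
$a->set(0, 1);
$a->set(1, 2);
var_dump($a->isArrayOf('int')); // bool(true)
$a->set(1, 'two');
var_dump($a->isArrayOf('int')); // bool(false)
$a->remove(1);
var_dump($a->isArrayOf('int')); // bool(true)
```

The check costs the same no matter how many elements the array holds,
but every write pays for the extra bookkeeping, which is exactly the
overhead being debated in this thread.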