Hi internals

There's an open bug report that array_unique doesn't work for enums:
https://github.com/php/php-src/issues/9775

This comes down to the fact that array_unique internally sorts the
array before iterating over it to remove duplicates, and that enums
are intentionally incomparable.

    Foo::Bar < Foo::Baz // false
    Foo::Baz < Foo::Bar // false

Unfortunately, this means that array_unique may happen to work when
the array is already sorted, or ends up correctly sorted by chance,
and break otherwise.
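
For illustration, here is a small script showing how enum cases behave
under each comparison operator. The enum Foo and its cases are only
the placeholder names from the snippet above. Ordering comparisons
give a sort nothing to work with, while identity comparison behaves as
expected.

```php
<?php
declare(strict_types=1);

// Placeholder enum matching the names used in the snippet above.
enum Foo
{
    case Bar;
    case Baz;
}

// Ordering comparisons between enum cases are always false,
// so a comparison-based sort cannot establish an order.
var_dump(Foo::Bar < Foo::Baz);   // bool(false)
var_dump(Foo::Bar > Foo::Baz);   // bool(false)

// Equality and identity still distinguish cases correctly.
var_dump(Foo::Bar == Foo::Baz);  // bool(false)
var_dump(Foo::Bar === Foo::Bar); // bool(true)
```

Since neither case sorts before the other, identical cases are not
guaranteed to end up next to each other after sorting, which is what
the duplicate-removal pass relies on.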

To solve this, I propose adding an ARRAY_UNIQUE_IDENTICAL option that
can be passed to array_unique's $flags and uses identical operator
(===) semantics. Internally it uses a new hashmap that allows using
arbitrary PHP values as keys to efficiently remove duplicates. This is
slightly over-engineered for this use case. However, this data
structure will be required for implementing ADTs to deduplicate
instances with the same values. This hashmap is a heavily minimized
version of the teds extension's StrictHashMap [1].
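
To make the intended semantics concrete, here is a rough userland
sketch of what deduplicating with === means. The function name
unique_identical and the enum Suit are made up for this example. It
is not how the patch is implemented, since it does a linear scan per
element and is therefore quadratic, but the result should match what
array_unique($array, ARRAY_UNIQUE_IDENTICAL) is meant to return.

```php
<?php
declare(strict_types=1);

// Hypothetical enum, used only for this example.
enum Suit
{
    case Hearts;
    case Spades;
}

/**
 * Hypothetical userland helper: keeps the first occurrence of each
 * value, comparing with === and preserving the original keys.
 * O(n^2), unlike the hashmap-based approach described above.
 */
function unique_identical(array $values): array
{
    $result = [];
    foreach ($values as $key => $value) {
        // Strict in_array() compares with ===.
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

$input = [Suit::Spades, Suit::Hearts, Suit::Spades, 1, '1', 1];
$unique = unique_identical($input);
// Keeps keys 0, 1, 3 and 4: both enum cases, int 1 and string '1'.
var_dump(array_keys($unique));
```

Note that 1 and '1' are both kept because they are not identical,
which differs from the loose comparison used by the existing flags.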

Time complexity of this function is O(n). With the exception of
SORT_STRING (which uses PHP's existing hashmap in a very similar
fashion and is also O(n)), it should scale better than the other sort
options, which are O(n log n).
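
As a rough illustration of why a hash-based pass is O(n), here is a
userland sketch restricted to arrays of objects (which includes enum
cases). It uses spl_object_id() as the hash key; the function name
unique_objects and the enum Status are made up for this example. The
actual patch uses a C hashmap that accepts arbitrary PHP values, which
this sketch does not attempt to reproduce.

```php
<?php
declare(strict_types=1);

// Hypothetical enum, used only for this example.
enum Status
{
    case Active;
    case Inactive;
}

/**
 * Hypothetical helper: single-pass deduplication of an array that
 * contains only objects. Each element costs one hash lookup, so the
 * whole pass is O(n) on average and needs no sorting.
 */
function unique_objects(array $objects): array
{
    $seen = [];
    $result = [];
    foreach ($objects as $key => $object) {
        // Object ids are unique among objects that are currently alive.
        $id = spl_object_id($object);
        if (!isset($seen[$id])) {
            $seen[$id] = true;
            $result[$key] = $object;
        }
    }
    return $result;
}

$statuses = [Status::Active, Status::Inactive, Status::Active];
$deduped = unique_objects($statuses);
// Keeps keys 0 and 1; the repeated Status::Active at key 2 is dropped.
var_dump(array_keys($deduped));
```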

Here's a link to the implementation:
https://github.com/php/php-src/pull/9882/files

If there are no concerns or complaints I'd like to merge this into PHP
8.3. Otherwise I will create an RFC. Looking forward to your feedback.

Ilija

[1] https://github.com/TysonAndre/pecl-teds/blob/main/teds_stricthashmap.c

