Hello everybody!

I'd like to open a discussion regarding the behavior of `array_unique()`
with the `SORT_REGULAR` flag when used on arrays containing mixed types.

Currently, `SORT_REGULAR` uses non-strict (`==`) comparisons, so values
such as `100` and `"100"` are treated as duplicates and one of them is
silently dropped. Developers who need type-sensitive deduplication have to
implement user-land workarounds.

Here is a common scenario where this behavior is problematic:

```php
$events = [
    ['id' => 100, 'type' => 'user.login'],        // User event (int)
    ['id' => "100", 'type' => 'system.migration'],  // System event (string)
    ['id' => 100, 'type' => 'user.login'],        // Duplicate user event
];

$event_ids = array_column($events, 'id'); // [100, "100", 100]

// Current behavior with SORT_REGULAR
$unique_ids = array_unique($event_ids, SORT_REGULAR); // Result: [100]
// The string "100" is lost due to type coercion.
```

To address this, I propose adding a new flag, `SORT_STRICT`, which would
use strict (`===`) comparisons to differentiate between values of different
types.

With the new flag, the result would be:

```php
// Proposed behavior with SORT_STRICT
$unique_ids = array_unique($event_ids, SORT_STRICT); // Result: [100, "100"]
// Both integer and string values are preserved.
```
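
For comparison, the kind of user-land workaround mentioned earlier might look like the sketch below. `array_unique_strict()` is a hypothetical helper name chosen for illustration; it is not an existing PHP function and not necessarily how the PR implements the flag. It assumes that keeping the first occurrence and its original key, as `array_unique()` does, is the desired behavior.

```php
<?php

/**
 * Hypothetical user-land helper (not part of PHP core).
 * Removes duplicates using strict (===) comparison, keeping the first
 * occurrence of each value together with its original key.
 */
function array_unique_strict(array $values): array
{
    $result = [];
    foreach ($values as $key => $value) {
        // The third argument enables strict comparison in in_array().
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

$event_ids = [100, "100", 100];
var_dump(array_unique_strict($event_ids)); // keeps int 100 and string "100"
```

This sketch does a linear scan for every element, so it is quadratic in the size of the input, which is one reason built-in support would be preferable for large arrays.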

I've already submitted a PR to address the behavior described above:
PR: https://github.com/php/php-src/pull/20273

The potential for a `SORT_NATURAL` flag also came to mind as another useful
addition, but I believe `SORT_STRICT` is the more critical feature to
discuss first.

I look forward to your feedback.

Thanks,
- Jason
