On Tue, 3 Mar 2020 at 11:04, Rowan Tommins <rowan.coll...@gmail.com> wrote:

> On Tue, 3 Mar 2020 at 08:46, Andreas Heigl <andr...@heigl.org> wrote:
>
> > While it is mainly aimed at being a mere convenience-function that could
> > also be easily implemented in userland it misses one main thing IMO when
> > handling unicode-strings: Normalization.
>
> While I would love to see more functionality for handling Unicode which
> didn't treat it as just another character set, I don't think sprinkling it
> into the main string functions of the language would be the right approach.
> Even if we changed all the existing functions to be "Unicode-aware", as was
> planned for PHP 6, the resulting API would not handle all cases correctly.
>
> In this case, a Unicode-based string API ought to provide at least two
> variants of "contains", as options or separate functions:
>
> - a version which matches on code point, for answering queries like "does
>   this string contain right-to-left override characters?"
> - at least one form of normalization, but probably several
>
> If there was serious work on a new string API in progress, a freeze on
> additions to the current API would make sense; but right now, the
> byte-based string API is what we have, and I think this function is a
> sensible addition to it.

FYI, I wrote a String handling lib, shipped as Symfony String:
- doc: https://symfony.com/doc/current/components/string.html
- src: https://github.com/symfony/string

TL;DR, it provides 3 classes of value objects, dealing with bytes, code
points and grapheme clusters (~= normalized Unicode).

It makes no sense to have `str_contains()` or any global function able to
deal with Unicode normalization *unless* the PHP string values embed their
unit system (one of: bytes, code points or graphemes).

With this rationale, I agree with Rowan: PHP's native string functions deal
with bytes. So should str_contains().
Other unit systems can be implemented in userland (until PHP implements
something similar to Symfony String in core - but that's another topic.)

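
To make the unit-system point concrete, here is a minimal sketch. It is not
taken from Symfony String, and the helper name bytes_contain() is
hypothetical; it stands in for a byte-based str_contains() using strpos().
It assumes UTF-8 strings and PHP 7.0+ for the "\u{...}" escapes.
Normalization is shown with the intl extension's Normalizer class, guarded
in case ext-intl is not loaded.

```php
<?php
// The same visible text "café" in two Unicode normalization forms.
$nfc = "caf\u{00E9}";   // precomposed "é" (U+00E9): 5 bytes in UTF-8
$nfd = "cafe\u{0301}";  // "e" + combining acute (U+0301): 6 bytes in UTF-8

// Hypothetical stand-in for a byte-based str_contains().
function bytes_contain(string $haystack, string $needle): bool
{
    return $needle === '' || strpos($haystack, $needle) !== false;
}

$needle = "\u{00E9}";

var_dump(strlen($nfc));                  // int(5)
var_dump(strlen($nfd));                  // int(6)
var_dump(bytes_contain($nfc, $needle));  // bool(true)
var_dump(bytes_contain($nfd, $needle));  // bool(false): same text, other bytes

// Userland can pick a unit system explicitly, e.g. by normalizing first.
if (class_exists('Normalizer')) {
    $normalized = Normalizer::normalize($nfd, Normalizer::FORM_C);
    var_dump(bytes_contain($normalized, $needle)); // bool(true)
}
```

The byte-based check gives a well-defined answer in both cases; whether the
two strings "should" match depends on a unit system that a plain PHP string
does not carry, which is why that choice is left to the caller here.
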
Nicolas