On Tue, 3 Mar 2020 at 08:46, Andreas Heigl <andr...@heigl.org> wrote:
>
> While it is mainly aimed at being a mere convenience-function that could
> also be easily implemented in userland it misses one main thing IMO when
> handling unicode-strings: Normalization.
>

While I would love to see more functionality for handling Unicode which
didn't treat it as just another character set, I don't think sprinkling it
into the main string functions of the language would be the right approach.
Even if we changed all the existing functions to be "Unicode-aware", as was
planned for PHP 6, the resulting API would not handle all cases correctly.

In this case, a Unicode-based string API ought to provide at least two
variants of "contains", as options or separate functions:

- a version which matches on code point, for answering queries like "does
  this string contain right-to-left override characters?"
- at least one version which matches after normalization, but probably
  several, one per normalization form

If there were serious work on a new string API in progress, a freeze on
additions to the current API would make sense; but right now, the byte-based
string API is what we have, and I think this function is a sensible addition
to it.

Regards,
--
Rowan Tommins
[IMSoP]
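
To make the two variants of "contains" described above concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and the function names are hypothetical; nothing here is a proposed API. It only shows that matching on code points and matching after normalization can give different answers for the same input, and that the choice of normalization form matters too.

```python
import unicodedata


def contains_code_points(haystack: str, needle: str) -> bool:
    """Hypothetical helper: match on the exact code point sequence.

    For valid UTF-8 input, a byte-based search gives the same answer,
    because one UTF-8 sequence cannot begin in the middle of another.
    """
    return needle in haystack


def contains_normalized(haystack: str, needle: str, form: str = "NFC") -> bool:
    """Hypothetical helper: normalize both strings to the same form, then match."""
    return unicodedata.normalize(form, needle) in unicodedata.normalize(form, haystack)


# "café" with a precomposed é (U+00E9), searched for a decomposed
# "e" + U+0301 (combining acute accent).
precomposed = "caf\u00e9"
decomposed_e = "e\u0301"

# Code point matching suits queries about specific characters,
# such as U+202E RIGHT-TO-LEFT OVERRIDE.
has_rlo = contains_code_points("abc\u202edef", "\u202e")

# The same text in two different encodings only matches after normalization.
raw_match = contains_code_points(precomposed, decomposed_e)
nfc_match = contains_normalized(precomposed, decomposed_e, "NFC")

# The normalization form changes the answer: under NFD the haystack
# decomposes to "cafe" + U+0301, so a plain "e" is found; under NFC it is not.
plain_e_nfc = contains_normalized(precomposed, "e", "NFC")
plain_e_nfd = contains_normalized(precomposed, "e", "NFD")
```

The last two results differ, which is one reason a Unicode-based API would probably need more than one normalization-based variant rather than a single "Unicode-aware" function.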