On 12/08/2024 00:36, Nick Lockheart wrote:
So what I would propose is:

(1) All string functions should state in the official man page if they
are safe for UTF-8 or not.

Reasonable, but see below.

(2) Functions intended for working with text should be made UTF-8 safe.

Define "UTF-8 safe" precisely. Also, what about BC breaks here?

(3) Functions intended for processing binary should be added if
necessary, and should be named something like "binary" or "byte".

That would require renaming and deprecating most of the standard string library; I doubt anyone would agree to that.

But in general they are already named differently: the str* functions are binary, while mb_* and grapheme_* are text-oriented.
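
To illustrate the distinction, here is a minimal sketch, assuming the mbstring and intl extensions are loaded and using a sample string I picked for the purpose. The same string has three different "lengths" depending on which family you ask:

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT (renders as one "é"):
// 3 bytes in UTF-8, 2 code points, 1 grapheme cluster.
$s = "e\u{0301}";

var_dump(strlen($s));             // int(3): str* counts bytes
var_dump(mb_strlen($s, 'UTF-8')); // int(2): mb_* counts code points (ext/mbstring)
var_dump(grapheme_strlen($s));    // int(1): grapheme_* counts grapheme clusters (ext/intl)
```

None of the three results is wrong; each one answers a different question, which is why "UTF-8 safe" needs a precise definition first.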
