Hi

On 8/11/24 17:50, Nick Lockheart wrote:
> It seems like there's still a lot of string functions that assume that
> a character is a single byte, and these may actually work as expected
> when dealing with Latin characters, but may fail unexpectedly if a
> sequence is more than one byte.

PHP's strings are byte-strings containing arbitrary sequences of bytes. Unless you specifically select functions that interpret the byte-strings as something else, you get a byte-string interpretation. There is nothing unexpected about that.
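
As a minimal sketch of that distinction, assuming the mbstring extension is loaded (the variable names are only illustrative), the byte-oriented functions and their mb_* counterparts give different answers for the same string:

```php
<?php
// "\u{FC}" is 'ü'; PHP emits it as the two UTF-8 bytes C3 BC.
$s = "D\u{FC}sterhus";

// Byte-string interpretation (the default).
$byteLength    = strlen($s);        // 10 bytes
$firstTwoBytes = substr($s, 0, 2);  // "D" plus the first byte of 'ü'

// Explicitly selected UTF-8 interpretation.
$charLength    = mb_strlen($s, 'UTF-8');        // 9 code points
$firstTwoChars = mb_substr($s, 0, 2, 'UTF-8');  // "Dü"

// The byte-wise cut is a valid byte-string, but no longer valid UTF-8.
$stillUtf8 = mb_check_encoding($firstTwoBytes, 'UTF-8'); // false
```

Neither result is wrong; each function answers the question for the interpretation it was asked to use.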

> Are there any use cases for PHP where **single-byte** characters are
> the norm?

Dealing with binary formats.
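
For illustration only, here is a sketch with a made-up 6-byte record layout (a 16-bit type followed by a 32-bit value, both big-endian; the field names are hypothetical). Every offset and length here is counted in bytes, which is exactly what the plain string functions provide:

```php
<?php
// Build the hypothetical record: 'n' = 16-bit big-endian, 'N' = 32-bit big-endian.
$record = pack('nN', 0x0102, 0x01020304);

$length = strlen($record);   // 6, the size of the record in bytes
$hex    = bin2hex($record);  // "010201020304"

// Parse it back into named fields.
$fields = unpack('ntype/Nvalue', $record);

// Byte-wise slicing: the last four bytes are the value field.
$valueBytes = substr($record, 2, 4);
```

A "multi-byte safe" strlen() or substr() would be unusable for this kind of work, since the bytes are not text in any encoding.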

> It seems that if everything on the Internet is multi-byte encoded now,
> then all of the PHP string functions should be multi-byte safe.

The premise is false. Everything on the Internet is byte-strings (also called "octet-strings").

--------

You might be interested in https://externals.io/message/119149#119149.

Best regards
Tim Düsterhus
