Hey

On 16.12.22 16:21, Tim Düsterhus wrote:
Hi

On 12/16/22 14:28, Derick Rethans wrote:

Question 2 is that class. I know folks have been clamoring for a `String` class for some time and this actually fills that niche quite well. A part of me wonders if we can overload it a little to provide a pseudo locale of "binary" so that users can, optionally, treat it like a more generalized String class in specific cases, storing a normal `char*` zend_string under the hood in that case. Possibly as a specialization tree.

An alternative could be to just have this as an implementation detail, in case the associated locale/collation is C/root. Then nobody needs to worry about it, *but* it would mean implementing everything twice. Which I am not too keen on, especially because we have such a wide array of operations on strings already.

I'd rather not see this either, because if a 'Text' object may contain binary data, the type safety is lost and users cannot rely on "'Text' implies valid UTF-8" (see sibling thread).
Does Text contain valid UTF-8? Or valid Unicode? IIRC the idea was to use UTF-16 as the internal encoding.

In the end the internal encoding should be irrelevant to the user, as long as we can assert that __toString() returns a Unicode string in a valid encoding. And I'm with you that UTF-8 might be the best choice for that.
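To make that concrete, here is a rough sketch of what I mean. It is not the class from the proposal: the `Text` class below, its UTF-16LE storage and the use of ext/mbstring are all assumptions made purely for illustration. The only point is that binary input is rejected at the boundary and `__toString()` always hands back valid UTF-8, whatever is stored inside.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only; NOT the Text class proposed in the RFC.
// Assumes ext/mbstring is available.
final class Text
{
    // Internal representation: UTF-16LE bytes (an arbitrary choice here).
    private string $utf16;

    public function __construct(string $utf8)
    {
        // Reject anything that is not well-formed UTF-8, e.g. binary data.
        if (!mb_check_encoding($utf8, 'UTF-8')) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
        $this->utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');
    }

    public function __toString(): string
    {
        // Callers always receive UTF-8, independent of the internal encoding.
        return mb_convert_encoding($this->utf16, 'UTF-8', 'UTF-16LE');
    }
}
```

With this, `(string) new Text("Grüße")` gives back the same UTF-8 bytes that went in, while `new Text("\xC3\x28")` throws, so a consumer never has to know or care that UTF-16 is used under the hood.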
Cheers

Andreas

-- 
Andreas Heigl
mailto:andr...@heigl.org
N 50°22'59.5" E 08°23'58"
https://andreas.heigl.org
https://hei.gl/appointmentwithandreas
GPG-Key: https://hei.gl/keyandreasheiglorg