On 16 August 2024 19:44:22 BST, Mike Schinkel <m...@newclarity.net> wrote:
>Let me see if I understand your argument correctly?  You are asserting that 
>Unicode is "too complex" to be handled in the standard library so that 
>complexity should instead be shouldered individually by each and every PHP 
>developer who needs to work with Unicode text in PHP, which is many PHP 
>developers if not eventually most. Is that your argument?


Not really, no. I'm definitely in favour of including more Unicode-based string 
handling functionality, by improving and extending ext/intl, or coming up with 
new convenience wrappers for common tasks. 

What I'm always sceptical of is the idea that you could ever consider such 
functionality "complete", or that "Unicode support" can ever be a single 
deliverable, rather than an ongoing aspiration. (And consequently, I'm 
sceptical of any language which says it has achieved that.)

I also think "Unicode support" is probably the wrong angle to approach from; it 
leads to features like IntlChar, which technically provides access to a large 
amount of character data from the Unicode standard, but is of no practical use 
to the vast majority of PHP developers. Instead we should be talking about 
"internationalisation support", of which handling different writing systems is 
one (fairly big) part.

For instance, I would welcome proposals like "here's some functions for 
handling locale-specific case folding and normalisation-based matching", 
"here's some functions for limiting the storage size of a string without 
producing garbage characters", etc. As well as related things which aren't just 
about text encoding, like "here's some functions for working with 
locale-specific date formatting" (or even just "here's some documentation for 
how you're supposed to use ext/intl's date classes").
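
As a rough illustration of the "limiting the storage size" example, here is a 
minimal sketch. It is written in Python rather than PHP purely to show the 
shape of the problem, and truncate_utf8 is a hypothetical name, not an existing 
or proposed function. It assumes the storage limit is counted in UTF-8 bytes.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of `text` whose UTF-8 form fits in `max_bytes`.

    Hypothetical helper for illustration only. It never returns a partial
    code point, but it does not know about grapheme clusters.
    """
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")

    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text

    # Cut at the byte limit. The input was valid UTF-8, so the only invalid
    # part of the slice can be an incomplete multi-byte sequence at the end,
    # which errors="ignore" silently drops.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

Even this small helper only guarantees whole code points. It can still cut 
between a base letter and a following combining accent, so the result may not 
be what a user would call a whole character; handling that needs grapheme 
cluster rules on top, which is the sort of layering that makes me wary of 
calling any of this "complete".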


>> We also have the "mbstring" extension, ...
>
>Interesting historical factoid, but how is that really relevant to including 
>Unicode into the standard library?

I was just summarising the current situation, to work out where we could go 
next. Any attempt to extend string handling functionality is likely to build on 
either ext/intl or ext/mbstring, so it's useful to understand how they differ.


Regards,
Rowan Tommins
[IMSoP]
