Hi

On 12/15/22 17:05, Andreas Heigl wrote:

I see a few challenges in the approach. My first question was: Why do we
need a new implementation of the ICU library? Creating a userland

[…]

I'm ambivalent about this. On the one hand it could make some things for
sure easier. On the other hand it adds burden onto the core-developers
that could be avoided by providing the intl (and mb-string) extension by
default instead of having to add them separately. And then find a group
of people willing to build a userland implementation.

Because a programming language needs a standard library, otherwise one could just use JavaScript and pull in a dependency for 'is-odd' or left-padding.

The biggest advantage this proposal has compared to ext/intl is that it *adds a new data type*. If you receive a 'Text' object then you are guaranteed to have valid Unicode/UTF-8 inside of it.
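To illustrate what that guarantee buys the caller, here is a minimal sketch. It assumes a hypothetical userland stand-in named 'Text' with made-up methods 'fromUtf8()' and 'toString()'; none of this is the RFC's actual API, it only shows the idea of validating once at construction time.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the proposed type; NOT the RFC's actual API.
final class Text
{
    private function __construct(private readonly string $bytes) {}

    public static function fromUtf8(string $bytes): self
    {
        // preg_match() with the /u modifier fails on malformed UTF-8,
        // so anything other than 1 means the input is rejected.
        if (preg_match('//u', $bytes) !== 1) {
            throw new InvalidArgumentException('Input is not valid UTF-8.');
        }

        return new self($bytes);
    }

    public function toString(): string
    {
        return $this->bytes;
    }
}

// The parameter type alone tells the callee the contents are valid
// UTF-8, so no re-validation is needed here.
function greet(Text $name): string
{
    return 'Hello, ' . $name->toString();
}
```

A function accepting a plain 'string' would instead have to validate the input itself or trust the caller.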

It also provides an OO API around text/string processing functionality, which is something users have desired for quite some time already ("scalar objects").

The addition of a new data type is also a reason why this cannot usefully be implemented in userland alone: It would require every developer to standardize on a single userland implementation, as otherwise you need bridges to convert between the different representations of various userland libraries (or need to round-trip through the standard 'string' type), which I consider to be a non-starter for something as fundamental as text processing. Both because it adds complexity and because it will kill performance.
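The bridging problem can be sketched as follows. 'AcmeText', 'ZetaText' and 'acmeToZeta()' are invented names standing in for two independent userland libraries; they are assumptions for illustration only and do not refer to any real package.

```php
<?php
declare(strict_types=1);

// Two hypothetical userland libraries, each shipping its own text type.
final class AcmeText
{
    public function __construct(public readonly string $utf8)
    {
        if (preg_match('//u', $utf8) !== 1) {
            throw new InvalidArgumentException('Invalid UTF-8.');
        }
    }
}

final class ZetaText
{
    public function __construct(public readonly string $utf8)
    {
        if (preg_match('//u', $utf8) !== 1) {
            throw new InvalidArgumentException('Invalid UTF-8.');
        }
    }
}

// Bridge: the only common ground is the plain 'string' type, so the
// value is unwrapped and the target library validates it all over again.
function acmeToZeta(AcmeText $text): ZetaText
{
    return new ZetaText($text->utf8);
}
```

Every pair of libraries needs such a bridge, and each crossing repeats the validation work that a single shared type would do once.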

As the RFC notes, an explicit design goal is to keep the API simple and focused, so I don't expect much ongoing maintenance burden here, especially if all the heavy lifting is offloaded to ICU. Any convenience functionality can then be provided in userland based on the building blocks provided by PHP itself, with the benefit that userland libraries are going to be fully interoperable because they all use the standard 'Text' type that is guaranteed to be available [1].

Best regards
Tim Düsterhus

[1] The 'Text' class should likely be made final, because folks might otherwise rely on a specific userland extension, preventing actual interoperability.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
