Hi
On 12/15/22 17:05, Andreas Heigl wrote:
> I see a few challenges in the approach. My first question was: Why do we
> need a new implementation of the ICU library? Creating a userland
[…]
> I'm ambivalent about this. On the one hand it could make some things for
> sure easier. On the other hand it adds burden onto the core-developers
> that could be avoided by providing the intl (and mb-string) extension by
> default instead of having to add them separately. And then find a group
> of people willing to build a userland implementation.
Because a programming language needs a standard library, otherwise one
could just use JavaScript and pull in a dependency for 'is-odd' or
left-padding.
The biggest advantage this proposal has compared to ext/intl is that it
*adds a new data type*. If you receive a 'Text' object then you are
guaranteed to have valid Unicode/UTF-8 inside of it.
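A minimal userland sketch of that guarantee, for illustration only. The
class name 'Text' comes from the RFC, but the 'fromUtf8()' and
'toString()' methods and the 'byteLength()' helper are hypothetical and
are not the RFC's API. Validation uses the PCRE 'u' modifier, which makes
preg_match() fail on malformed UTF-8:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the proposed type; NOT the RFC's actual API.
final class Text
{
    // Private constructor: the only way in is through fromUtf8().
    private function __construct(private readonly string $bytes) {}

    public static function fromUtf8(string $bytes): self
    {
        // With the 'u' modifier, preg_match() returns false (not 1)
        // when the subject is not valid UTF-8.
        if (preg_match('//u', $bytes) !== 1) {
            throw new InvalidArgumentException('Input is not valid UTF-8.');
        }

        return new self($bytes);
    }

    public function toString(): string
    {
        return $this->bytes;
    }
}

// A consumer can rely on the type alone; no re-validation is needed here.
function byteLength(Text $text): int
{
    return strlen($text->toString());
}
```

Because the constructor is private and the class is final, every 'Text'
instance in this sketch has passed validation exactly once, at
construction time.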
It also provides an OO API around text/string processing functionality,
which is something users have desired for quite some time already
("scalar objects").
The addition of a new data type is also a reason why this cannot
usefully be implemented in userland alone: It would require every
developer to standardize on a single userland implementation, as
otherwise you need bridges to convert between the different
representations of various userland libraries (or need to round-trip
through the standard 'string' type), which I consider to be a
non-starter for something as fundamental as text processing, both
because it adds complexity and because it will kill performance.
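The bridging problem can be sketched as follows. Both classes and the
bridge function are hypothetical and stand in for two unrelated userland
libraries; they are not taken from any real package:

```php
<?php
declare(strict_types=1);

// Hypothetical text type shipped by one userland library.
final class AcmeText
{
    public function __construct(public readonly string $bytes)
    {
        if (preg_match('//u', $bytes) !== 1) {
            throw new InvalidArgumentException('Invalid UTF-8.');
        }
    }
}

// Hypothetical text type shipped by a second, unrelated library.
final class OtherText
{
    public function __construct(public readonly string $bytes)
    {
        if (preg_match('//u', $bytes) !== 1) {
            throw new InvalidArgumentException('Invalid UTF-8.');
        }
    }
}

// Bridge between the two: it round-trips through the plain 'string'
// type, so a new object is allocated and the already-validated input
// is validated a second time.
function acmeToOther(AcmeText $text): OtherText
{
    return new OtherText($text->bytes);
}
```

Every pair of libraries needs such a bridge, and every crossing pays for
validation again. A single built-in type removes both costs.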
As the RFC notes, an explicit design goal is to keep the API simple and
focused, so I don't expect much ongoing maintenance burden here,
especially if all the heavy lifting is off-loaded to ICU. Any
convenience functionality can then be provided in userland based on
the building blocks provided by PHP itself, with the benefit that
userland libraries are going to be fully interoperable because they all
use the standard 'Text' type that is guaranteed to be available [1].
Best regards
Tim Düsterhus
[1] The 'Text' class should likely be made final, because folks might
otherwise rely on a specific userland extension, preventing actual
interoperability.
--
PHP Internals - PHP Runtime Development Mailing List