On Thu, 15 Dec 2022, Jakub Zelenka wrote:

> On Thu, Dec 15, 2022 at 4:56 PM Christoph M. Becker <cmbecke...@gmx.de>
> wrote:
>
> > On 15.12.2022 at 16:34, Derick Rethans wrote:
> >
> > > I have just published an initial draft of the "Unicode Text
> > > Processing" RFC, a proposal to have performant unicode text
> > > processing always available to PHP users, by introducing a new
> > > "Text" class.
> > >
> > > You can find it at:
> > > https://wiki.php.net/rfc/unicode_text_processing
> > >
> > > I'm looking forwards to hearing your opinions, additions, and
> > > suggestions — the RFC specifically asks for these in places.
> >
> > | As the implementation requires ICU, this would also mean that PHP
> > | depend on the ICU library.
> >
> > Our current stance is that a minimal PHP should be buildable without
> > requiring any "non-standard" libraries; this is the reason why we
> > bundle PCRE. If we wanted to stick with that policy, we would need
> > to bundle ICU, what might not be the best idea – it's generally not
> > great to have bundled libraries which are still maintained outside
> > of php-src, and especially for such huge libraries.
> >
> >
> I agree with this. Bundling ICU doesn't seem like a good idea.
> Wouldn't be better to base on something smaller that can be bundled
> and does the job? For example NJS and QuickJS use their own
> implementations which seem to be fine. Especially
> https://github.com/bellard/quickjs/blob/master/libunicode.c seems like
> something that we could fork and maintain potentially.

I have no intention of bundling ICU. That would be a crazy thing to do.

Instead, the current proposal is to make PHP depend on libicu. I realise
that this is against our current stance, but considering that:

1. most (if not all) Linux distributions ignore our bundled libraries
   anyway, as per their policies;
2. libicu is available pretty much everywhere; and
3. I am not proposing to require the latest and greatest version,

I believe we can safely rely on it being available.

I'm not opposed to using something other than ICU. However, most of the
other Unicode-related libraries that I had a quick look at provide only
a small subset of what is needed, either just character properties or
graphemes; none of them also take care of collation/locales and
transliteration. I am also wary about the development and
future-proofing of some of these libraries. ICU won't have these
problems.

cheers,
Derick

--
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
Host of PHP Internals News: https://phpinternals.news

mastodon: @derickr@phpc.social @xdebug@phpc.social
twitter: @derickr and @xdebug
--
PHP Internals - PHP Runtime Development Mailing List