On Wed, 13 May 2026 at 19:27, Derick Rethans <[email protected]> wrote:
>
> On Tue, 12 May 2026, youkidearitai wrote:
>
> > On Fri, 16 Dec 2022 at 0:34, Derick Rethans <[email protected]> wrote:
> >
> > > I have just published an initial draft of the "Unicode Text
> > > Processing" RFC, a proposal to have performant Unicode text
> > > processing always available to PHP users, by introducing a new
> > > "Text" class.
> > >
> > > You can find it at:
> > > https://wiki.php.net/rfc/unicode_text_processing
> > >
> > > I'm looking forward to hearing your opinions, additions, and
> > > suggestions; the RFC specifically asks for these in places.
> >
> > Is this topic still open?
> > I am interested in this Text class.
> > I would be glad to have grapheme-cluster-based handling, as in Swift's String type.
>
> I still have interest in working this out into supporting even more
> things. Since I wrote that draft RFC, I have added a few more features:
>
> https://github.com/derickr/php-text/commits/main/
>
> > I have some ideas.
> >
> > 1. Move it to the Intl extension, for example as \Intl\Text
> >    * I think this keeps the implementation simple.
>
> I don't agree with this: although it builds on top of ICU like the
> classes in the Intl extension, it doesn't follow ICU's API style at
> all.
>
> It is meant to be a much more opinionated API that does the simple 80%
> case well.
>
> > 2. Accept Text in the grapheme_* functions only, with a type such as string|Text.
> >    * This is somewhat complex to implement, but simple for userland.
>
> I am not too sure about this. The grapheme_* functions closely match
> ICU's internal, and powerful, API. If you want them to accept a Text
> object too, that means these grapheme_* functions' signatures need to be
> overloaded.
>
> For example:
>
> grapheme_strstr(string $haystack, string $needle, bool $beforeNeedle = false,
> string $locale = ""): string|false
>
> would need to change into:
>
> grapheme_strstr(string|Text $haystack, string|Text $needle, bool
> $beforeNeedle = false, string $locale = ""): string|false
>
> And then '$locale' makes no sense, as this is already part of each of
> the Text objects themselves.
>
> Instead, the 'contains' method on the Text object already does something
> very similar:
>
> https://github.com/derickr/php-text/blob/main/tests/text-contains.phpt
>
> I think the grapheme functions should stay as they are, and additional
> methods can be added to the Text class where functionality that the
> grapheme_* functions already support is currently missing.
>
> The RFC document also already lists more functions than I have
> implemented so far.
>
> > 3. If UTF-8 validation fails, throw an exception.
>
> It already does that, see this test case:
> https://github.com/derickr/php-text/blob/main/tests/text-in-out-basic.phpt#L13
> (although the exception message itself could be improved).
>
> > Having the __toString method return a string seems good.
> > Please consider this.
>
> This is already implemented too:
> https://github.com/derickr/php-text/blob/main/text.c#L323
>
> cheers,
> Derick
>
> --
> https://derickrethans.nl | https://xdebug.org | https://dram.io
>
> Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
>
> mastodon: @[email protected] @[email protected]
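
As a rough illustration of the design discussed in the quoted message (the locale carried by the object, UTF-8 validated on construction, a `contains` method instead of an overloaded `grapheme_strstr`, and `__toString`), here is a minimal userland sketch. `TextSketch`, its constructor signature, and the exception type are assumptions made up for this example; the actual php-text extension is written in C and its API may well differ. The sketch assumes PHP 8.0 or later with the mbstring and intl extensions loaded.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for illustration only.
// This is NOT the Text class from the php-text extension.
final class TextSketch
{
    private string $utf8;
    private string $locale;

    public function __construct(string $utf8, string $locale = 'en')
    {
        // Reject invalid UTF-8 up front (point 3 in the thread).
        // The exception type is an assumption of this sketch.
        if (!mb_check_encoding($utf8, 'UTF-8')) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
        $this->utf8 = $utf8;
        // The locale lives on the object, so methods need no per-call
        // $locale argument. This sketch stores it but does not use it.
        $this->locale = $locale;
    }

    // Loosely analogous to grapheme_strstr() !== false, without
    // changing the signature of any grapheme_* function.
    public function contains(string|TextSketch $needle): bool
    {
        return grapheme_strpos($this->utf8, (string) $needle) !== false;
    }

    public function __toString(): string
    {
        return $this->utf8;
    }
}

$text = new TextSketch('Unicode Text Processing', 'en');
var_dump($text->contains('Text'));
echo $text, "\n";
```

Keeping the locale inside the object is what makes a separate `$locale` argument redundant in this style of API, which is the conflict raised for `string|Text` overloads of the grapheme_* functions.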
Thanks, Derick.

I have confirmed that almost all of this is already implemented.

Indeed, the grapheme_* functions already take a `$locale` parameter, but that would conflict with `Text::$locale`.

Regards,
Yuya

--
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
