Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-21 Thread Larry Garfield
On Thu, Dec 15, 2022, at 9:34 AM, Derick Rethans wrote:
> Hi,
>
> I have just published an initial draft of the "Unicode Text Processing"
> RFC, a proposal to have performant unicode text processing always
> available to PHP users, by introducing a new "Text" class.
>
> You can find it at:
> http

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-21 Thread Rowan Tommins
On Wed, 21 Dec 2022 at 11:48, Derick Rethans wrote:
> I know what a polyfill is, and I still don't want to see this.
>
I can 100% guarantee that you will *see* it - as soon as this RFC is even close to being accepted, either the Symfony project or someone else will work on such a polyfill. But

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-21 Thread Derick Rethans
On Fri, 16 Dec 2022, Tim Düsterhus wrote:
> Hi
>
> On 12/16/22 14:55, Derick Rethans wrote:
> > > --
> > >
> > > getPositionOfFirstOccurrence():
> > >
> > > I agree this is too long. How about:
> > >
> > > - findOffset()
> > > - findOffsetLast()
> > >
> > > And for returnFromFirstOccu

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-21 Thread Derick Rethans
On Fri, 16 Dec 2022, Rowan Tommins wrote:
> On 16 December 2022 13:55:02 GMT, Derick Rethans wrote:
> >I do not want a polyfill. These already exist for intl and friends.
>
> I think you misunderstood what I meant by "polyfill"; I meant in the
> sense that once the real implementation gets inclu

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Rowan Tommins
On 16 December 2022 13:55:02 GMT, Derick Rethans wrote:
>I do not want a polyfill. These already exist for intl and friends.
I think you misunderstood what I meant by "polyfill"; I meant in the sense that once the real implementation gets included in, say PHP 8.3, users needing to support, say,

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Tim Düsterhus
Hi On 12/16/22 16:27, Andreas Heigl wrote: I rather not see this either, because if a 'Text' object may contain binary data, the type safety is lost and users cannot rely on "'Text' implies valid UTF-8" (see sibling thread). Does Text contain valid UTF-8? Or valid Unicode? As IIRC the idea was

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Tim Düsterhus
Hi
On 12/16/22 14:55, Derick Rethans wrote:
--
getPositionOfFirstOccurrence():
I agree this is too long. How about:
- findOffset()
- findOffsetLast()
And for returnFromFirstOccurence():
- startingWith()
- startingWithLast()
I have included these as suggested names. I suspect we'll

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Andreas Heigl
Hey On 16.12.22 16:21, Tim Düsterhus wrote: Hi On 12/16/22 14:28, Derick Rethans wrote: Question 2 is that class. I know folks have been clamoring for a `String` class for some time and this actually fills that niche quite well. A part of me wonders if we can overload it a little to provide

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Tim Düsterhus
Hi On 12/16/22 14:28, Derick Rethans wrote: Question 2 is that class. I know folks have been clamoring for a `String` class for some time and this actually fills that niche quite well. A part of me wonders if we can overload it a little to provide a pseudo locale of "binary" so that users can

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Derick Rethans
On Thu, 15 Dec 2022, Tim Düsterhus wrote:
> On 12/15/22 16:34, Derick Rethans wrote:
> > You can find it at:
> > https://wiki.php.net/rfc/unicode_text_processing
> >
> > I'm looking forwards to hearing your opinions, additions, and
> > suggestions — the RFC specifically asks for these in places.

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Derick Rethans
On Thu, 15 Dec 2022, Rowan Tommins wrote:
> On 15/12/2022 15:34, Derick Rethans wrote:
> > I have just published an initial draft of the "Unicode Text Processing"
> > RFC, a proposal to have performant unicode text processing always
> > available to PHP users, by introducing a new "Text" class.
>

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Derick Rethans
On Fri, 16 Dec 2022, Tim Starling wrote:
> On 16/12/22 02:34, Derick Rethans wrote:
> >
> > I have just published an initial draft of the "Unicode Text
> > Processing" RFC, a proposal to have performant unicode text
> > processing always available to PHP users, by introducing a new
> > "Text"

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Derick Rethans
On Thu, 15 Dec 2022, Tim Düsterhus wrote:
> [1] The 'Text' class should likely be made final, because folks might
> otherwise rely on a specific userland extension, preventing actual
> interoperability.
Yes, I intended to do this, but forgot to include it. I've updated the RFC.
cheers,
Derick

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Derick Rethans
On Thu, 15 Dec 2022, Sara Golemon wrote:
> On Thu, Dec 15, 2022 at 9:34 AM Derick Rethans wrote:
>
> > I have just published an initial draft of the "Unicode Text
> > Processing" RFC, a proposal to have performant unicode text
> > processing always available to PHP users, by introducing a new

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-16 Thread Rowan Tommins
On Fri, 16 Dec 2022 at 03:21, Tim Starling wrote:
>
> I'm concerned about the time order of using grapheme offsets. For
> example, is subString() O(N) in $offset? If the idea is to be easy to
> use and performant, you don't want to have subtle algorithmic
> complexity traps.
>
This is a good po

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Tim Starling
On 16/12/22 02:34, Derick Rethans wrote: Hi, I have just published an initial draft of the "Unicode Text Processing" RFC, a proposal to have performant unicode text processing always available to PHP users, by introducing a new "Text" class. Using "collator" and "locale" interchangeably seems

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Rowan Tommins
On 15/12/2022 15:34, Derick Rethans wrote: I have just published an initial draft of the "Unicode Text Processing" RFC, a proposal to have performant unicode text processing always available to PHP users, by introducing a new "Text" class. You can find it at: https://wiki.php.net/rfc/unicode_tex

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Tim Düsterhus
Hi On 12/15/22 19:51, Deleu wrote: [1] The 'Text' class should likely be made final, because folks might otherwise rely on a specific userland extension, preventing actual interoperability. I'm fond of final classes but in here I think it *adds* burden to core developers. As you said it yours

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Deleu
On Thu, Dec 15, 2022, 3:16 PM Tim Düsterhus wrote:
>
> [1] The 'Text' class should likely be made final, because folks might
> otherwise rely on a specific userland extension, preventing actual
> interoperability.
>
I'm fond of final classes but in here I think it *adds* burden to core developer

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Sara Golemon
On Thu, Dec 15, 2022 at 9:34 AM Derick Rethans wrote:
> I have just published an initial draft of the "Unicode Text Processing"
> RFC, a proposal to have performant unicode text processing always
> available to PHP users, by introducing a new "Text" class.
>
> You can find it at:
> https://wiki.p

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Tim Düsterhus
Hi On 12/15/22 17:05, Andreas Heigl wrote: I see a few challenges in the approach. My first question was: Why do we need a new implementation of the ICU library? Creating a userland […] I'm ambivalent about this. On the one hand it could make some things for sure easier. On the other hand it

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Tim Düsterhus
Hi On 12/15/22 16:34, Derick Rethans wrote: You can find it at: https://wiki.php.net/rfc/unicode_text_processing I'm looking forwards to hearing your opinions, additions, and suggestions — the RFC specifically asks for these in places. Some first remarks: -- replaceText(): In the r

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Deleu
On Thu, Dec 15, 2022 at 12:34 PM Derick Rethans wrote:
> Hi,
>
> I have just published an initial draft of the "Unicode Text Processing"
> RFC, a proposal to have performant unicode text processing always
> available to PHP users, by introducing a new "Text" class.
>
> You can find it at:
> https

Re: [PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Andreas Heigl
Hey Derick, Hey all. On 15.12.22 16:34, Derick Rethans wrote: Hi, I have just published an initial draft of the "Unicode Text Processing" RFC, a proposal to have performant unicode text processing always available to PHP users, by introducing a new "Text" class. You can find it at: https://wik

[PHP-DEV] [RFC] Unicode Text Processing

2022-12-15 Thread Derick Rethans
Hi, I have just published an initial draft of the "Unicode Text Processing" RFC, a proposal to have performant unicode text processing always available to PHP users, by introducing a new "Text" class. You can find it at: https://wiki.php.net/rfc/unicode_text_processing I'm looking forwards to