RE: [PHP-DEV] [RFC] Desire to move RFC add_str_begin_and_end_functions to a vote

CHU Zhaowei Sat, 29 Jun 2019 08:42:27 -0700

Agreed. I'm wondering why the author choose to use begin(s) /end(s) while 
almost all other popular language has a more clear naming. e.g. starts_with or 
has_prefix.


In addition, like someone else pointed out two years ago, userland may already 
have functions with the same name, and this should be considered as a potential 
BC break, which is not reflected in the RFC yet. 

Regards,
CHU Zhaowei

> -----Original Message-----
> From: Nikita Popov <nikita....@gmail.com>
> Sent: Saturday, June 29, 2019 6:07 AM
> To: w...@wkhudgins.info
> Cc: Ben Ramsey <b...@benramsey.com>; PHP internals <internals@lists.php.net>
> Subject: Re: [PHP-DEV] [RFC] Desire to move RFC
> add_str_begin_and_end_functions to a vote
> 
> On Fri, Jun 28, 2019 at 10:54 PM <w...@wkhudgins.info> wrote:
> 
> > These are good points. Originally my RFC called for less functions but
> > based on feedback I added the others. My proposal: take the RFC as-is
> > to a vote. If it fails, I will raise another RFC for a vote that will
> > just contain the two basic functions: str_begins and str_ends.
> >
> 
> To put my comments into more actionable form, here is what I would
> recommend for this RFC:
> 
>  * Rename str_begins -> str_starts_with, str_ends -> str_ends_with, 
> str_ibegins ->
> str_starts_with_ci, str_iends -> str_ends_with_ci. As mentioned before, this 
> is
> standard terminology used by many, many programming languages and it would
> be great if PHP did not deviate from convention without strong reason.
>  * Have a separate vote (in the same RFC) for the addition of the 
> corresponding
> mb_* variants.
> 
> I believe doing those two changes will ensure that the core part of the RFC
> passes. I personally would be voting yes on the first part and no on the 
> second,
> but others may decide as they see fit.
> 
> Nikita
> 
> 
> > On 2019-06-22 15:56, Ben Ramsey wrote:
> > >> On Jun 22, 2019, at 10:32, Nikita Popov <nikita....@gmail.com> wrote:
> > >>
> > >>> On Thu, Jun 20, 2019 at 12:32 AM <w...@wkhudgins.info> wrote:
> > >>>
> > >>> I sent this earlier this week without [RFC] in the subject
> > >>> line...since some people might have filters to check the subject
> > >>> line I wanted to send this again with the proper substring in the
> > >>> subject line–to make it clear I intend to take this to a vote in
> > >>> two weeks. Apologies for the duplicate email.
> > >>>
> > >>> -Will
> > >>>
> > >>>> On 2019-06-18 14:45, w...@wkhudgins.info wrote:
> > >>>> Hello all,
> > >>>>
> > >>>> I submitted this RFC several years ago. I collected a lot of
> > >>>> feedback and I have updated the RFC and corresponding github
> > >>>> patch. Please see the RFC at
> > >>>> https://wiki.php.net/rfc/add_str_begin_and_end_functions
> > >>>> and the github patch at https://github.com/php/php-src/pull/2049.
> > >>>> I have addressed many concerns (order of arguments, name of
> > >>>> functions, multibye support, etc). I plan to move this RFC to a
> > >>>> vote in the coming weeks.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Will
> > >>>
> > >>
> > >> Unfortunately, this looks like a case where the RFC feedback has
> > >> made the proposal worse, rather than better :(
> > >>
> > >> I think it's easier to start with what I think this proposal should
> > >> be:
> > >> There should be just two functions, str_starts_with() and
> > >> str_ends_with()
> > >> -- and that's it.
> > >>
> > >> The important realization to have here is that these functions are
> > >> a bit of sugar for an operation that is quite common, but can also
> > >> be easily implemented with existing functions (using strcmp, strpos
> > >> or substr, depending on what you like). There is no need for us to
> > >> cover every conceivable combination, just make the common case more
> > >> convenient and easier to read.
> > >>
> > >> With that in mind:
> > >> * I believe the "starts with" and "ends with" naming is a lot more
> > >> canonical, used by Python, Ruby, Java, JavaScript and probably lots
> > >> more.
> > >> * In my experience case-insensitive "i" variants of strings
> > >> functions are used much less, by an order of magnitude. With this
> > >> being sugar in the first place, I don't think there's a need to
> > >> cover case-insensitive variations (and from a quick look, these
> > >> don't seem to be first class methods in other languages either). If
> > >> we do want to have them, I'd suggest making the names
> > >> str_starts_with_ci() and str_ends_with_ci(), which is more obvious
> > >> and harder to miss than str_istarts_with() etc.
> > >> * Having mb_* variants of these functions doesn't really make
> > >> sense. I realize that there's this knee-jerk reaction about how if
> > >> it doesn't have "mb" in the name it's not Unicode compatible, but
> > >> in this case it's even more wrong than usual. The normal
> > >> str_starts_with() function is perfectly safe to use on UTF-8
> > >> strings, the only difference between it and
> > >> mb_str_starts_with() is that it's going to be implemented a lot
> > >> more efficiently. The only case that *might* make some sense is the
> > >> case-insensitive variant here, because that has some genuine
> > >> reliance on the character encoding. But then again, this can be
> > >> handled by case-folding the strings first, something that mbstring
> > >> is going to do internally anyway.
> > >>
> > >> I would happily accept a proposal for str_starts_with() +
> > >> str_ends_with(), but I'm a lot more apprehensive about adding these
> > >> 8 new functions.
> > >>
> > >> Regards,
> > >> Nikita
> > >
> > >
> > > I like the idea of simplifying this to the two functions
> > > str_starts_with() and str_ends_with().
> > >
> > > When I was looking through this the other day, I had trouble coming
> > > up with an example of a string with the mb_* versions would ever
> > > generate a different result from the non-multibyte versions, since
> > > the implementation only needs to count and analyze bytes for uniqueness.
> > > Perhaps it would only be an issue with the case-insensitive
> > > versions, as Nikita points out? If so, can someone provide some
> > > example strings where an mb_starts_with_ci() would return true,
> > > while
> > > str_starts_with_ci() would return false?
> > >
> > > I think the case sensitivity versions would be common enough in use
> > > cases (i.e. looking to see if a path ends with .CSV vs. .csv, etc.),
> > > but maybe the signatures could be revised to pass a third parameter?
> > >
> > > str_starts_with($haystack, $needle, $case_sensitive = true): bool
> > >
> > > -Ben
> >




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

RE: [PHP-DEV] [RFC] Desire to move RFC add_str_begin_and_end_functions to a vote

Reply via email to