> On Jun 22, 2019, at 10:32, Nikita Popov <nikita....@gmail.com> wrote:
> 
>> On Thu, Jun 20, 2019 at 12:32 AM <w...@wkhudgins.info> wrote:
>> 
>> I sent this earlier this week without [RFC] in the subject line...since
>> some people might have filters to check the subject line I wanted to
>> send this again with the proper substring in the subject line–to make it
>> clear I intend to take this to a vote in two weeks. Apologies for the
>> duplicate email.
>> 
>> -Will
>> 
>>> On 2019-06-18 14:45, w...@wkhudgins.info wrote:
>>> Hello all,
>>> 
>>> I submitted this RFC several years ago. I collected a lot of feedback
>>> and I have updated the RFC and corresponding github patch. Please see
>>> the RFC at https://wiki.php.net/rfc/add_str_begin_and_end_functions
>>> and the github patch at https://github.com/php/php-src/pull/2049. I
>>> have addressed many concerns
>>> (order of arguments, name of functions, multibye support, etc). I plan
>>> to move this RFC to a vote in the coming weeks.
>>> 
>>> Thanks,
>>> 
>>> Will
>> 
> 
> Unfortunately, this looks like a case where the RFC feedback has made the
> proposal worse, rather than better :(
> 
> I think it's easier to start with what I think this proposal should be:
> There should be just two functions, str_starts_with() and str_ends_with()
> -- and that's it.
> 
> The important realization to have here is that these functions are a bit of
> sugar for an operation that is quite common, but can also be easily
> implemented with existing functions (using strcmp, strpos or substr,
> depending on what you like). There is no need for us to cover every
> conceivable combination, just make the common case more convenient and
> easier to read.
> 
> With that in mind:
> * I believe the "starts with" and "ends with" naming is a lot more
> canonical, used by Python, Ruby, Java, JavaScript and probably lots more.
> * In my experience case-insensitive "i" variants of strings functions are
> used much less, by an order of magnitude. With this being sugar in the
> first place, I don't think there's a need to cover case-insensitive
> variations (and from a quick look, these don't seem to be first class
> methods in other languages either). If we do want to have them, I'd suggest
> making the names str_starts_with_ci() and str_ends_with_ci(), which is more
> obvious and harder to miss than str_istarts_with() etc.
> * Having mb_* variants of these functions doesn't really make sense. I
> realize that there's this knee-jerk reaction about how if it doesn't have
> "mb" in the name it's not Unicode compatible, but in this case it's even
> more wrong than usual. The normal str_starts_with() function is perfectly
> safe to use on UTF-8 strings, the only difference between it and
> mb_str_starts_with() is that it's going to be implemented a lot more
> efficiently. The only case that *might* make some sense is the
> case-insensitive variant here, because that has some genuine reliance on
> the character encoding. But then again, this can be handled by case-folding
> the strings first, something that mbstring is going to do internally anyway.
> 
> I would happily accept a proposal for str_starts_with() + str_ends_with(),
> but I'm a lot more apprehensive about adding these 8 new functions.
> 
> Regards,
> Nikita


I like the idea of simplifying this to the two functions str_starts_with() and 
str_ends_with().

When I was looking through this the other day, I had trouble coming up with an 
example of a string with the mb_* versions would ever generate a different 
result from the non-multibyte versions, since the implementation only needs to 
count and analyze bytes for uniqueness. Perhaps it would only be an issue with 
the case-insensitive versions, as Nikita points out? If so, can someone provide 
some example strings where an mb_starts_with_ci() would return true, while 
str_starts_with_ci() would return false?

I think the case sensitivity versions would be common enough in use cases (i.e. 
looking to see if a path ends with .CSV vs. .csv, etc.), but maybe the 
signatures could be revised to pass a third parameter?

str_starts_with($haystack, $needle, $case_sensitive = true): bool

-Ben

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to