> On Jun 22, 2019, at 10:32, Nikita Popov <nikita....@gmail.com> wrote:
>
>> On Thu, Jun 20, 2019 at 12:32 AM <w...@wkhudgins.info> wrote:
>>
>> I sent this earlier this week without [RFC] in the subject line...since
>> some people might have filters to check the subject line I wanted to
>> send this again with the proper substring in the subject line to make it
>> clear I intend to take this to a vote in two weeks. Apologies for the
>> duplicate email.
>>
>> -Will
>>
>>> On 2019-06-18 14:45, w...@wkhudgins.info wrote:
>>> Hello all,
>>>
>>> I submitted this RFC several years ago. I collected a lot of feedback
>>> and I have updated the RFC and corresponding github patch. Please see
>>> the RFC at https://wiki.php.net/rfc/add_str_begin_and_end_functions
>>> and the github patch at https://github.com/php/php-src/pull/2049. I
>>> have addressed many concerns
>>> (order of arguments, name of functions, multibyte support, etc). I plan
>>> to move this RFC to a vote in the coming weeks.
>>>
>>> Thanks,
>>>
>>> Will
>>
>
> Unfortunately, this looks like a case where the RFC feedback has made the
> proposal worse, rather than better :(
>
> I think it's easier to start with what I think this proposal should be:
> There should be just two functions, str_starts_with() and str_ends_with()
> -- and that's it.
>
> The important realization to have here is that these functions are a bit of
> sugar for an operation that is quite common, but can also be easily
> implemented with existing functions (using strcmp, strpos or substr,
> depending on what you like). There is no need for us to cover every
> conceivable combination, just make the common case more convenient and
> easier to read.
>
> With that in mind:
> * I believe the "starts with" and "ends with" naming is a lot more
> canonical, used by Python, Ruby, Java, JavaScript and probably lots more.
> * In my experience case-insensitive "i" variants of string functions are
> used much less, by an order of magnitude. With this being sugar in the
> first place, I don't think there's a need to cover case-insensitive
> variations (and from a quick look, these don't seem to be first class
> methods in other languages either). If we do want to have them, I'd suggest
> making the names str_starts_with_ci() and str_ends_with_ci(), which is more
> obvious and harder to miss than str_istarts_with() etc.
> * Having mb_* variants of these functions doesn't really make sense. I
> realize that there's this knee-jerk reaction about how if it doesn't have
> "mb" in the name it's not Unicode compatible, but in this case it's even
> more wrong than usual. The normal str_starts_with() function is perfectly
> safe to use on UTF-8 strings, the only difference between it and
> mb_str_starts_with() is that it's going to be implemented a lot more
> efficiently. The only case that *might* make some sense is the
> case-insensitive variant here, because that has some genuine reliance on
> the character encoding. But then again, this can be handled by case-folding
> the strings first, something that mbstring is going to do internally anyway.
>
> I would happily accept a proposal for str_starts_with() + str_ends_with(),
> but I'm a lot more apprehensive about adding these 8 new functions.
>
> Regards,
> Nikita
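For reference, here is a minimal sketch of the "sugar" point above: both checks written with functions PHP already has. The names sketch_starts_with() and sketch_ends_with() are made up for illustration so they cannot clash with anything the RFC adds, and treating an empty needle as a match is an assumption, not something the RFC text quoted here specifies. The comparison is purely byte-wise, which should be fine for UTF-8 because a needle made of whole characters can only match on character boundaries.

```php
<?php
// Hypothetical userland helpers; the names are illustrative, not the RFC's API.

function sketch_starts_with(string $haystack, string $needle): bool
{
    // Compare only the first strlen($needle) bytes of both strings.
    return strncmp($haystack, $needle, strlen($needle)) === 0;
}

function sketch_ends_with(string $haystack, string $needle): bool
{
    $len = strlen($needle);
    if ($len === 0) {
        return true;  // assumption: an empty needle always matches
    }
    if ($len > strlen($haystack)) {
        return false; // a needle longer than the haystack cannot match
    }
    // A negative offset counts from the end of the haystack.
    return substr_compare($haystack, $needle, -$len) === 0;
}

// Byte-wise comparison on UTF-8 input ("\u{e9}" is "é", "\u{ef}" is "ï").
var_dump(sketch_starts_with("h\u{e9}llo w\u{f6}rld", "h\u{e9}"));   // bool(true)
var_dump(sketch_ends_with("na\u{ef}ve caf\u{e9}", "caf\u{e9}"));    // bool(true)
var_dump(sketch_ends_with("report.csv", ".CSV"));                   // bool(false)
```

Neither helper needs to know the encoding, which is why a separate mb_* version of the case-sensitive check would have nothing extra to do.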
I like the idea of simplifying this to the two functions str_starts_with()
and str_ends_with(). When I was looking through this the other day, I had
trouble coming up with an example of a string where the mb_* versions would
ever generate a different result from the non-multibyte versions, since the
implementation only needs to compare bytes for equality.

Perhaps it would only be an issue with the case-insensitive versions, as
Nikita points out? If so, can someone provide some example strings where an
mb_starts_with_ci() would return true, while str_starts_with_ci() would
return false?

I think the case-insensitive versions would be common enough in practice
(e.g. checking whether a path ends with .CSV vs. .csv), but maybe the
signatures could be revised to take a third parameter instead?

    str_starts_with($haystack, $needle, $case_sensitive = true): bool

-Ben

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
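On the question of inputs where a byte-wise case-insensitive check and an encoding-aware one would disagree, here is a rough sketch using the three-parameter shape floated above. The helper name sketch_starts_with_cs() is hypothetical, and it assumes UTF-8 input and the default C locale. strncasecmp() only folds ASCII letters, so an accented capital does not match its lowercase form unless the strings are case-folded first (here via mb_strtolower(), guarded because mbstring may not be loaded).

```php
<?php
// Hypothetical helper following the signature suggested above; not the RFC's API.
function sketch_starts_with_cs(string $haystack, string $needle, bool $case_sensitive = true): bool
{
    $len = strlen($needle);
    return $case_sensitive
        ? strncmp($haystack, $needle, $len) === 0
        : strncasecmp($haystack, $needle, $len) === 0; // folds ASCII letters only
}

// ASCII letters fold as expected.
var_dump(sketch_starts_with_cs('REPORT.CSV', 'report', false)); // bool(true)

$upper = "\u{c9}COLE"; // "ÉCOLE" in UTF-8
$lower = "\u{e9}cole"; // "école" in UTF-8

// Byte-wise folding leaves the two-byte sequences for É and é untouched.
var_dump(sketch_starts_with_cs($upper, $lower, false)); // bool(false)

// Folding case with mbstring first makes the plain byte comparison succeed.
if (function_exists('mb_strtolower')) {
    var_dump(sketch_starts_with_cs(
        mb_strtolower($upper, 'UTF-8'),
        mb_strtolower($lower, 'UTF-8')
    )); // bool(true)
}
```

So a pair like "ÉCOLE" / "école" is one case where an mbstring-based variant would report a match and a byte-wise case-insensitive variant would not.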