On 09.07.2022 at 01:55, mickmackusa wrote:

> I've discovered that several native string functions offer a character mask
> as a parameter.
>
> I've laid out my observations at
> https://stackoverflow.com/q/72865138/2943403
>
> In a nutshell, not all character masks offer ranges via "double dot"
> syntax. Or should I refer to ".." as the "string spread operator" to avoid
> naming conflict with "..." -- the better known "spread operator" (array
> spread operator)?
>
> Rowan/@IMSoP informed me that the current division between the haves and
> the have-nots appears to be based on the source language from which PHP
> pulled. Essentially, if from C, the double dot does not represent a range.
> https://chat.stackoverflow.com/transcript/11?m=54864842#54864842
>
> Character ranges are not yet supported for:
> - strcspn()
> - strpbrk()
> - strspn()
>
> Before I fire off an RFC, I would like to know:
>
> 1. Are there any reasonable objections to consistently implementing
> character range expressions for all character masks?

In my opinion, this notation is somewhat confusing; trim($str, "a..z") and
trim($str, "a.z") look pretty similar, but have completely different
meanings. I'd rather have some general way to construct such ranges; the
slightly contrived implode(range()) is already available, though. Besides,
adding support for such character ranges to other functions now would
constitute a (probably minor) BC break.

> 2. Are there any native functions that I did not mention my Stack Overflow
> answer?

It is impossible to list all "native" functions, at least if you mean
internal functions, because these may be defined by extensions. And these
extensions would need to explicitly implement support for such character
ranges.

> 3. Is it true that only single-byte characters can be used in all
> scenarios? If so, must it remain that way?

I think it needs to remain that way, since the functions already accepting
character ranges actually work on byte strings.

> 4. Is there already an official or widely-used term that I should be using
> for the two-dot operator?

I'd call them character ranges; the implementation is called php_charmask()
(<https://github.com/php/php-src/blob/php-8.1.8/ext/standard/string.c#L689>).

> I should also mention that I initially considered requesting that all
> character mask parameters be named $mask (instead of $separators, $token,
> or $characters), but I later resigned to the fact that changing to a name
> that describes the texture of the string would remove the more
> vital/intuitive purpose of the string. I suppose the best that can be done
> to inform developers is to explicitly mention in the documentation when
> character range expressions are implemented and demonstrate their usage in
> an example (not just as a user comment at the bottom; this isn't In-N-Out
> Burger -- put your offerings on the frickin' menu!).

I agree that the documentation needs to be improved.
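
For illustration, here is a minimal sketch of how I understand the current
behaviour. The sample strings and variable names are made up for this
example and appear nowhere in php-src or the manual. It compares trim() with
"a..z" and "a.z", shows the implode(range()) alternative, and shows that
strspn() treats the dots literally today, which is why adding range support
there would change results.

```php
<?php
// Hypothetical sample input, chosen so the two masks give different results.
$input = "a.z-Hello-z.a";

// trim() expands ".." into a byte range: the mask is every byte "a" to "z".
$ranged = trim($input, "a..z");

// Without the second dot the mask is just the three bytes "a", "." and "z".
$literal = trim($input, "a.z");

// The explicit alternative: build the same range with range() and implode().
$lowercase = implode('', range('a', 'z'));
$explicit  = trim($input, $lowercase);

// strspn() takes its mask literally, so "a..z" means "a", "." and "z" here.
$spanLiteral  = strspn("abc123", "a..z");
$spanExplicit = strspn("abc123", $lowercase);

// The dots in the mask match dots in the subject today; range support
// would presumably change this result, hence the BC concern.
$spanDots = strspn("..a", "a..z");

var_dump($ranged, $literal, $explicit, $spanLiteral, $spanExplicit, $spanDots);
```

With current semantics I would expect $ranged and $explicit to both be
".z-Hello-z.", $literal to be "-Hello-", $spanLiteral to be 1, $spanExplicit
to be 3, and $spanDots to be 3.
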
While trim() mentions the character range support in one sentence,
addcslashes() dedicates several paragraphs of detailed explanation.

--
Christoph M. Becker

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php