> On 27 Jun 2024, at 12:31, Mike Schinkel <m...@newclarity.net> wrote:
>
>> On Jun 26, 2024, at 8:14 AM, Gina P. Banyard <intern...@gpb.moe> wrote:
>>
>> On Wednesday, 26 June 2024 at 06:18, Mike Schinkel <m...@newclarity.net> wrote:
>>> https://3v4l.org/RDYFs#v8.3.8
>>>
>>> Note those seven use-cases are found in around the first 25 results when
>>> searching GitHub for "strtok(". I could probably find more if I kept
>>> looking:
>>>
>>> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>>>
>>> Regarding explode($delimiter, $str)[0] — unless it is to be special-cased
>>> during compilation —it is a really inefficient way to find the substring up
>>> to the first character, especially for large strings and/or when in a tight
>>> loop where the explode is contained in a called function
>>
>> Then use a regex: https://3v4l.org/SGWL5
>
> Using `preg_match()` instead of `strtok()` to process the ~4k file of commas
> is, on average, same as using explode()[0], or 10x as long as using
> `strtok()` (at times it got as low as 4.4x, but that was rare):
>
> https://onlinephp.io/c/e1fad
>
> Size of file: 3972
> Number of commas: 359
> Time taken for strtok: 0.003 seconds
> Time taken for regex: 0.0307 seconds
> Times strtok() faster: 10.25
>
>> Or a combination of strpos and substr.
>
> Using `strpos()`+ `substr()` instead of `strtok()` to process the ~4k file of
> commas is, took on average ~3x as long as using `strtok()`. I implemented a
> class for this and tried to optimize it by using only string positions and
> not copying the string repeatedly. It also took about 1/2 hour to get the
> code working vs. about 15 seconds to get the code working with strtok();
> which will most programmers prefer?
>
> https://onlinephp.io/c/2a09f
>
> Size of file: 3972
> Number of commas: 359
> Time for strtok: 0.0027 seconds
> Time for strpos/substr: 0.0089 seconds
> Times strtok() faster: 3.31
>
>> There are *plenty* of solutions to the specific problem you pose here, and
>> thus many different solutions more or less appropriate.
>
> Yes, and in all cases the existing solutions are significantly slower, except
> one.
>
> And that one solution that is not significantly slower is to not deprecate
> `strtok()`. Not to mention not deprecating would keep from causing lots of
> BC breakage.
>
> -Mike

Hi All,

I do appreciate that strtok has a rather bizarre signature and usage pattern, and that the way subsequent calls work invites confusion. To me, though, that suggests the better outcome for code that needs the repeated-call functionality would be to introduce a builtin `StringTokenizer` class that wraps the underlying strtok_r C call and uses internal state to keep track of the string being tokenized.

As a "works the same" solution for grabbing the first segment of a string up to any of the delimiter chars, could the `strpbrk` function be expanded with a `$before_needle` arg like `strstr` has? (strstr matches on an exact substring, not on any of a list of characters.)

Cheers

Stephen
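
For illustration only, the two ideas above could look roughly like this in userland PHP 8. Both `StringTokenizer` and `strpbrk_before()` are hypothetical names that do not exist in PHP today. The sketch uses `strspn()`/`strcspn()` rather than the C `strtok_r`, and it assumes the same "skip empty tokens" behaviour that `strtok()` has.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland stand-in for a builtin tokenizer class.
 * The subject string and the read offset live on the instance,
 * not in the hidden per-request state that strtok() relies on.
 */
final class StringTokenizer
{
    private int $offset = 0;

    public function __construct(
        private string $subject,
        private string $delimiters,
    ) {}

    /** Returns the next non-empty token, or null once the input is exhausted. */
    public function next(): ?string
    {
        $end = strlen($this->subject);

        // Skip any run of leading delimiters, as strtok() does.
        $this->offset += strspn($this->subject, $this->delimiters, $this->offset);
        if ($this->offset >= $end) {
            return null;
        }

        // Length of the run up to the next delimiter (or end of string).
        $length = strcspn($this->subject, $this->delimiters, $this->offset);
        $token = substr($this->subject, $this->offset, $length);

        // Step past the token and the delimiter that ended it, clamped to the end.
        $this->offset = min($this->offset + $length + 1, $end);

        return $token;
    }
}

/**
 * Hypothetical equivalent of strpbrk() with a strstr()-style $before_needle flag:
 * returns everything before the first occurrence of any listed character.
 * Assumption: when no character matches, the whole string is returned
 * (strstr() would return false in that case).
 */
function strpbrk_before(string $string, string $characters): string
{
    return substr($string, 0, strcspn($string, $characters));
}

$tokens = new StringTokenizer('a,,b;c', ',;');
while (($token = $tokens->next()) !== null) {
    echo $token, PHP_EOL;
}

echo strpbrk_before('key=value&x', '=&'), PHP_EOL;
```

Because each instance carries its own state, two tokenizers can be interleaved or nested without clobbering each other, which is the confusing part of repeated `strtok()` calls. Unlike `strtok()`, `strpbrk_before()` does not skip leading delimiters, so it returns an empty string when the input starts with one.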