On Fri, Jul 29, 2022 at 7:15 AM mickmackusa <mickmack...@gmail.com> wrote:
>
> On Monday, July 25, 2022, Guilliam Xavier <guilliam.xav...@gmail.com>
> wrote:
>
>> On Sat, Jul 9, 2022 at 1:56 AM mickmackusa <mickmack...@gmail.com> wrote:
>>
>>> I've discovered that several native string functions offer a character
>>> mask as a parameter.
>>>
>>> I've laid out my observations at
>>> https://stackoverflow.com/q/72865138/2943403
>>>
>>
>> Out of curiosity, why do you say that strtr() is "not a good candidate
>> because character order matters" (although you give a reasonable example)?
>> Maybe you have some counter-example?
>>
>> Regards,
>>
>> --
>> Guilliam Xavier
>>
>
> I prefer to keep my scope very tight when posting on Stack Overflow.
>
> My focus was purely on enabling character range syntax for native
> functions with character mask parameters. My understanding of character
> masks in PHP requires single-byte characters and no meaning to character
> order.
>
> When strtr() is fed two strings, they cannot be considered "character
> masks" because the character orders matter.
>
> If extending character range syntax to parameters which are not character
> masks, I might support the feature for strtr(), but ensuring that the two
> strings are balanced will be made more difficult with ranged syntax.
> strtr() will silently condone imbalanced strings. https://3v4l.org/PY15F
>

Thanks for the clarifications. You're right that the internal
`php_charmask` converts a character list (possibly containing one or more
ranges) into a 256-char *mask*, thus "losing" any original order. So
strtr() couldn't actually use the same implementation (even without
ranges); a counter-example is `strtr('adobe', 'abcde', 'ebcda')`
(`strtr('adobe', 'a..e', 'e..a')` would trigger a Warning "Invalid
'..'-range, '..'-range needs to be incrementing").

I had seen a parallel with the Unix `tr` command, which *does* support
[incrementing] ranges: both `echo adobe | tr abcde ABCDE` and
`echo adobe | tr a-e A-E` give "ADoBE", while `echo adobe | tr abcde edcba`
gives "eboda" but `echo adobe | tr a-e e-a` errors with "range-endpoints of
'e-a' are in reverse collating sequence order". Its implementation indeed
doesn't use character masks
(https://github.com/coreutils/coreutils/blob/master/src/tr.c), and
`echo abracadabra | tr a-f x` gives "xxrxxxxxxrx", not "xbrxcxdxbrx"; it
also supports more things, like POSIX character classes...

PS: I find the `strtr(string $string, array $replace_pairs)` form generally
superior to the `strtr(string $string, string $from, string $to)` one
anyway ;)

Regards,

--
Guilliam Xavier
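
For reference, the `tr` behaviour described above can be reproduced with a short script. This is only a sketch and assumes GNU coreutils `tr`; other implementations may differ, notably when the second set is shorter than the first. The variable names are illustrative and not part of the original message.

```shell
#!/bin/sh
# Assumption: GNU coreutils tr. Variable names are illustrative only.

# An explicit list and an incrementing range describe the same mapping.
list_result=$(echo adobe | tr abcde ABCDE)
range_result=$(echo adobe | tr a-e A-E)
echo "$list_result $range_result"    # ADoBE ADoBE

# Order matters: position i of the first set maps to position i of the
# second set, which is why an unordered character mask cannot express this.
reversed_result=$(echo adobe | tr abcde edcba)
echo "$reversed_result"              # eboda

# With GNU tr, a shorter second set is padded with its last character,
# so every character in a-f maps to x.
padded_result=$(echo abracadabra | tr a-f x)
echo "$padded_result"                # xxrxxxxxxrx
```

The reversed case is the one an order-free mask cannot represent: the same five characters appear in both sets, and only their positions define the translation.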