Xin LI wrote:
> This is expected behavior (in en_US.UTF-8 the ordering is AaBb, not
> ABab). You might want to set LC_COLLATE to C if C behavior is desirable.
>
> On Mon, Apr 17, 2023 at 2:06 PM Poul-Henning Kamp <p...@phk.freebsd.dk>
> wrote:
>
>     This surprised me:
>
>             # mkdir /tmp/P
>             # cd /tmp/P
>             # touch FOO
>             # touch bar
>             # env LANG=C.UTF-8 find . -name '[A-Z]*' -print
>             ./FOO
>             # env LANG=en_US.UTF-8 find . -name '[A-Z]*' -print
>             ./FOO
>             ./bar
>
>     Really ?!

A bit more detail: find uses fnmatch(3) here, where the RE Bracket
Expression rules apply (except that ! is used instead of ^ for negation,
but that's unrelated):

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

...which has the following note:

    7. In the POSIX locale, a range expression represents the set of
       collating elements that fall between two elements in the collation
       sequence, inclusive. In other locales, a range expression has
       unspecified behavior: strictly conforming applications shall not
       rely on whether the range expression is valid, or on the set of
       collating elements matched.

Indeed, it's unfortunate that collation orders in non-POSIX locales are
not that... linear, so range expressions can break, but I don't see an
easy way of "fixing" this.
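
To make the note concrete, here is a minimal Python sketch of how the same range selects different names under two collation orders. The two orderings are toy assumptions (plain ASCII order for C, and the interleaved AaBb order mentioned above for en_US.UTF-8), not real locale data, and the helpers in_range and matching_names are hypothetical names made up for this illustration. It models the "between two elements in the collation sequence" rule only; it is not how fnmatch(3) is implemented.

```python
import string

# Toy collation sequences: assumptions for illustration, not real locale data.
C_ORDER = string.ascii_uppercase + string.ascii_lowercase  # "AB...Zab...z"
INTERLEAVED_ORDER = "".join(
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)  # "AaBb...Zz"


def in_range(ch, lo, hi, order):
    """Return True if ch lies between lo and hi, inclusive, in `order`."""
    # str.index raises ValueError for characters outside the toy alphabet.
    return order.index(lo) <= order.index(ch) <= order.index(hi)


def matching_names(names, lo, hi, order):
    """Names whose first character is in lo-hi, roughly like '[lo-hi]*'."""
    return [name for name in names if in_range(name[0], lo, hi, order)]


names = ["FOO", "bar"]
print(matching_names(names, "A", "Z", C_ORDER))            # ['FOO']
print(matching_names(names, "A", "Z", INTERLEAVED_ORDER))  # ['FOO', 'bar']
```

With the interleaved order, every lowercase letter except z sorts between A and Z, which is why ./bar shows up in the second find run.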