On 21 Apr 2023, at 12:01, Ronald Klop <ronald-li...@klop.ws> wrote:
> From: Poul-Henning Kamp <p...@phk.freebsd.dk>
> Date: Monday, 17 April 2023 23:06
> To: curr...@freebsd.org
> Subject: find(1): I18N gone wild ?
>
> This surprised me:
>
> # mkdir /tmp/P
> # cd /tmp/P
> # touch FOO
> # touch bar
> # env LANG=C.UTF-8 find . -name '[A-Z]*' -print
> ./FOO
> # env LANG=en_US.UTF-8 find . -name '[A-Z]*' -print
> ./FOO
> ./bar
>
> Really ?! ...
>
> My Mac and a Linux server only give ./FOO in both cases. Just a 2 cents
> remark.

Same here. However, I have read that with Unicode, you should *never* use
[A-Z] or [0-9], but character classes instead. That seems to give both
files on macOS and Linux with [[:alpha:]]:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
./BAR
./foo

and only the lowercase file with [[:lower:]]:

$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
./foo

But on FreeBSD, these don't work at all:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
<nothing>
$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
<nothing>

This is an interesting rabbit hole... :)

-Dimitry
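
P.S. For anyone who only needs the byte-order meaning of [A-Z] regardless of the caller's LANG, here is a minimal sketch that forces the C locale for the one find invocation. The function name find_upper is hypothetical, and this assumes the file names of interest start with ASCII letters. POSIX leaves range expressions unspecified outside the POSIX locale, which is presumably why the results differ between systems.

```shell
# find_upper DIR: list entries under DIR whose names start with an ASCII
# uppercase letter. Hypothetical helper name, for illustration only.
find_upper() {
    # LC_ALL=C overrides LANG and every LC_* category for this one command,
    # so the [A-Z] range is evaluated in the C locale's collating order.
    ( cd "$1" && LC_ALL=C find . -name '[A-Z]*' -print )
}
```

With the FOO and bar files from the original report, find_upper /tmp/P should print only ./FOO, whatever LANG is set to in the calling shell.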