On 2024-08-08 05:53, mark.yagnatinsky--- via Bug reports for GNU grep wrote:
I ran into an odd issue... the workaround is easy enough but the issue is weird.
In case this is relevant, my grep comes from git bash (which I think is mostly
Cygwin, or maybe msys2?).
Anyway, grep -P doesn't work if no LC vars are set, and complains that it only
works in unibyte locales or UTF-8.
Normally, the git bash mintty launcher sets LC_CTYPE to en_us.UTF-8 but not if
I bypass the launcher and run grep directly.
Here's the weird part: if I ask /usr/bin/locale what LC_CTYPE "should" be, it
says C.UTF-8.
If I run grep with C.UTF-8 then it also works. So it must be deriving a
default locale a different way.
My guess is that your default environment says that it supports UTF-8,
but it doesn't support it well enough to pass grep's test; see
grep/lib/localeinfo.c's is_using_utf8. If my guess is right, you may be
encountering subtle bugs in programs other than grep.
When you say "I run grep with C.UTF-8" how exactly do you do that?
Is there any difference in output between these two shell commands?
localeinfo
LC_ALL=C.UTF-8 localeinfo
If you have a debugger, you might look into why is_using_utf8 returns
false in your default locale.
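
For what it's worth, the comparison asked for above could be scripted roughly
as follows. This is only a sketch under some assumptions: it uses the standard
`locale` utility that the report mentions (/usr/bin/locale), the helper name
run_without_locale is made up for illustration, and the list of variables it
clears is my guess at what the mintty launcher might otherwise set.

```shell
#!/bin/sh
# Hypothetical helper: run a command in a subshell with the usual locale
# variables cleared, imitating a start that bypasses the launcher.
run_without_locale() (
  unset LANG LANGUAGE LC_ALL LC_CTYPE LC_COLLATE LC_MESSAGES \
        LC_MONETARY LC_NUMERIC LC_TIME
  "$@"
)

# Capture what `locale` reports in each environment (stderr included,
# since warnings about a missing locale are part of the evidence).
default_out=$(run_without_locale locale 2>&1)
forced_out=$(LC_ALL=C.UTF-8 locale 2>&1)

if [ "$default_out" = "$forced_out" ]; then
  echo "no difference between default and C.UTF-8"
else
  echo "--- no locale variables set ---"
  printf '%s\n' "$default_out"
  echo "--- LC_ALL=C.UTF-8 ---"
  printf '%s\n' "$forced_out"
fi
```

The same helper can wrap the failing command itself, e.g.
`run_without_locale grep -P 'a' somefile`, to reproduce the complaint without
leaving the mintty session.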