bug#72524: how does grep determine locale if no LC environment variables are set

Paul Eggert Fri, 09 Aug 2024 14:57:54 -0700

On 2024-08-08 12:24, mark.yagnatin...@barclays.com wrote:

Re: how am I doing that ... via bash, just like the way you suggested I run 
"locale" the second time:
LC_CTYPE=C.UTF-8 grep -P needle haystack.txt  # just CTYPE seems to be enough, no need for ALL

As an aside, I wouldn't mess with LC_CTYPE independently. One can get into trouble if the LC_CTYPE locale disagrees with the others. However, I don't think that's your problem.

Re: is_using_utf8 ... It relies on mbrtowc, which in turn relies on the current 
locale.
It seems that this function should NEVER return false in a UTF-8 locale.


Correct.
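
To illustrate the kind of check being discussed, here is a minimal sketch of an mbrtowc-based UTF-8 test. It is not copied from grep's source; the function name locale_looks_utf8 is hypothetical, and the comparison against 0x100 assumes a platform where wchar_t holds Unicode code points.

```c
#include <locale.h>   /* setlocale, for callers that set the locale first */
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper: return true if the current LC_CTYPE locale
   decodes the two bytes C4 80 as the single character U+0100, which
   is what a UTF-8 locale does.  Assumes wchar_t values are Unicode
   code points.  */
bool
locale_looks_utf8 (void)
{
  wchar_t wc;
  mbstate_t state = { 0 };   /* start in the initial conversion state */

  /* mbrtowc returns the number of bytes consumed; a unibyte locale
     consumes at most one byte here or reports an encoding error.  */
  return mbrtowc (&wc, "\xc4\x80", 2, &state) == 2 && wc == 0x100;
}
```

Because mbrtowc consults the current locale, the result depends entirely on what an earlier setlocale call selected. Without such a call the program stays in the C locale and the function should return false.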

But how does grep decide what the locale even is?
Presumably it must call setlocale at some point, or else it would be using the 
C locale, which is surely a unibyte locale.


Correct, it calls 'setlocale (LC_ALL, "")' first thing.

is there any good way to find out what locale it actually got "resolved" to?


You could modify the source code to add a call like this:

   fprintf (stderr, "grep: locale is %s\n", setlocale (LC_ALL, 0));

after the earlier call to setlocale. Or you could run 'setlocale(LC_ALL, 0)' in a debugger.
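
If modifying grep itself is inconvenient, the same query can be tried in a small standalone program. The following is only a sketch: the helper names resolved_locale_name and report_locale are hypothetical, and it assumes that the program is run with the same environment as the grep invocation in question.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper, not part of grep: adopt the environment's
   locale the way 'setlocale (LC_ALL, "")' does, then query it.  */
const char *
resolved_locale_name (void)
{
  /* The empty string selects the locale named by the environment.
     A null return means the request could not be honored, and the
     locale in effect before the call is left unchanged.  */
  if (!setlocale (LC_ALL, ""))
    fprintf (stderr, "warning: environment names an unsupported locale\n");

  /* A null second argument only queries; it changes nothing.  */
  return setlocale (LC_ALL, NULL);
}

/* Hypothetical helper: print the name on stderr.  */
void
report_locale (void)
{
  fprintf (stderr, "locale is %s\n", resolved_locale_name ());
}
```

Calling report_locale from main and running the program under different LC_ALL, LC_CTYPE and LANG settings shows how each combination is resolved. On some implementations the string is a composite that lists each category separately when the categories differ.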
