On 2024-08-08 12:24, mark.yagnatin...@barclays.com wrote:
Re: how am I doing that ... via bash, just like the way you suggested I run 
"locale" the second time:
LC_CTYPE=C.UTF-8 grep -P needle haystack.txt  # just CTYPE seems to be enough, no need for ALL

As an aside, I wouldn't mess with LC_CTYPE independently. One can get into trouble if the LC_CTYPE locale disagrees with the others. However, I don't think that's your problem.


Re: is_using_utf8 ... It relies on mbrtowc, which in turn relies on the current 
locale.
It seems that this function should NEVER return false in a UTF-8 locale.

Correct.
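
For illustration, a check of this kind can be sketched roughly as follows. This is only a sketch and not necessarily the code grep uses: the name looks_like_utf8 is made up for the example, and it assumes wchar_t values are Unicode code points (true on glibc). It asks mbrtowc to decode the two bytes C4 80, which are the UTF-8 encoding of U+0100.

```c
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return true if the current LC_CTYPE locale
   decodes the bytes C4 80 as the single character U+0100.
   Assumes wchar_t values are Unicode code points.  */
static bool
looks_like_utf8 (void)
{
  wchar_t wc;
  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);   /* start in the initial conversion state */
  return mbrtowc (&wc, "\xc4\x80", 2, &mbs) == 2 && wc == 0x100;
}
```

Because mbrtowc consults the current locale, the result reflects whatever the last setlocale call established. A program that never calls setlocale stays in the C locale, where this check should come out false.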

But how does grep decide what the locale even is?
Presumably it must call setlocale at some point, or else it would be using the 
C locale, which is surely a unibyte locale.

Correct, it calls 'setlocale (LC_ALL, "")' first thing.


is there any good way to find out what locale it actually got "resolved" to?

You could modify the source code to add a call like this:

   fprintf (stderr, "grep: locale is %s\n", setlocale (LC_ALL, 0));

after the earlier call to setlocale. Or you could run 'setlocale (LC_ALL, 0)' in a debugger.
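
If patching grep is inconvenient, the same query can be tried in a small standalone program. The following is a minimal sketch under the assumption that it is run with the same environment variables as the grep command in question; the helper names current_locale_name and report_locale are invented for the example.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: query the current locale without changing it.
   A null second argument makes setlocale return the current setting.  */
static const char *
current_locale_name (void)
{
  const char *name = setlocale (LC_ALL, NULL);
  return name ? name : "(unknown)";
}

/* Hypothetical helper: adopt the locale named by the environment,
   then report what it resolved to.  */
static void
report_locale (FILE *out)
{
  if (!setlocale (LC_ALL, ""))
    /* On failure setlocale returns a null pointer and leaves the locale unchanged.  */
    fprintf (out, "setlocale failed; locale stays %s\n", current_locale_name ());
  else
    fprintf (out, "locale is %s\n", current_locale_name ());
}
```

Calling report_locale (stderr) from main prints the resolved name. The exact format of the string is implementation-defined; on glibc, I believe it lists each category separately when the categories are not all set to the same locale, which would also reveal a mismatch between LC_CTYPE and the others.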


