On 1/11/23 14:12, Carlo Arenas wrote:
> pcre2_config does a static check (defined at compile time) and
> therefore is unlikely to fail and might be even under the right
> circumstances optimized out.

Not sure what is meant by "static check" here. The call won't be
optimized out unless you compile with -flto or equivalent, and have the
source code to pcre2 as well as the source code to grep. And in that
case the two forms should generate equivalent code (no insns needed).

> you are correct that setting the original value was meant to protect
> from that function failing and will ensure the original path was still
> being taken (which I thought was safer), while your suggested change
> will take the opposite one (not setting UTF in a multibyte locale,
> which will fail in different ways).

Oh, I think I see your point, but doesn't this mean that even my code was
too trusting? It should be something like this:

    if (localeinfo.multibyte)
      {
        uint32_t unicode;
        if (! (localeinfo.using_utf8
               && 0 <= pcre2_config (PCRE2_CONFIG_UNICODE, &unicode)
               && unicode))
          die (EXIT_TROUBLE, 0,
               _("-P supports only unibyte and UTF-8 locales"));
        ...

That is, we're better off diagnosing the problem and not attempting to
use pcre2 if the result will be wrong (or even result in undefined
behavior). The problem is unlikely to occur so it's good to be
conservative here.
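
To spell out which cases the check above accepts and rejects, here is a small standalone sketch of just the decision logic. It is not grep code: the enum and the function choose_pcre_mode are hypothetical names made up for illustration, and it takes the return value of pcre2_config (PCRE2_CONFIG_UNICODE, &unicode) and the stored value as plain arguments so that it compiles without libpcre2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; nothing here is in grep.  */
enum pcre_mode
{
  PCRE_MODE_UNIBYTE,      /* unibyte locale: no UTF flag needed */
  PCRE_MODE_UTF8,         /* UTF-8 locale and pcre2 has Unicode support */
  PCRE_MODE_UNSUPPORTED   /* anything else: diagnose and exit */
};

/* MULTIBYTE and USING_UTF8 stand in for the localeinfo fields.
   CONFIG_RC stands in for the value returned by
   pcre2_config (PCRE2_CONFIG_UNICODE, &unicode), and UNICODE for the
   value it stored.  The caller is assumed to pass UNICODE as 0 when
   the call failed, since nothing was stored in that case.  */
static enum pcre_mode
choose_pcre_mode (bool multibyte, bool using_utf8,
                  int config_rc, uint32_t unicode)
{
  if (!multibyte)
    return PCRE_MODE_UNIBYTE;

  /* Same three conditions as in the snippet above: the locale must be
     UTF-8, the config query must not fail, and it must report that
     Unicode support is compiled in.  */
  if (using_utf8 && 0 <= config_rc && unicode)
    return PCRE_MODE_UTF8;

  return PCRE_MODE_UNSUPPORTED;
}
```

In real code the PCRE_MODE_UNSUPPORTED case would correspond to the die call, so a failing or Unicode-less pcre2_config is treated the same way as a non-UTF-8 multibyte locale.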