On 2023-04-04 12:31, Junio C Hamano wrote:
> My personal inclination is to let Perl folks decide
> and follow them (even though I am skeptical about the wisdom of
> letting '\d' match anything other than [0-9])
I looked into what pcre2grep does. It has always done only 8-bit
processing unless you u
On 2023-04-05 11:32, Paul Eggert wrote:
> in a February 8 commit[1], Philip Hazel changed pcre2grep to use
> PCRE2_UCP, so this will mean 10.43 pcre2grep -u will behave like 3.9 GNU
> grep -P did (though 3.10 has changed this).
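
To make the \d question concrete, here is a minimal sketch that uses Python's
re module as a stand-in for PCRE2. That substitution is an assumption made
purely for illustration: Python's Unicode-aware default only roughly resembles
PCRE2_UCP, and re.ASCII only roughly resembles restricting \d to [0-9]. The
helper name digit_runs and the sample string are hypothetical.

```python
import re

# Assumption: Python's re is used only as an analogy for the PCRE2 behavior
# discussed in this thread; it is not what grep -P or pcre2grep actually run.
ARABIC_INDIC = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGITS ONE, TWO, THREE


def digit_runs(text, ascii_only):
    """Return every run of characters matched by \\d in text.

    ascii_only=True  -> \\d matches only [0-9] (re.ASCII)
    ascii_only=False -> \\d matches any Unicode decimal digit (str default)
    """
    flags = re.ASCII if ascii_only else 0
    return re.findall(r"\d+", text, flags)


sample = "abc 42 " + ARABIC_INDIC
unicode_runs = digit_runs(sample, ascii_only=False)  # finds both digit runs
ascii_runs = digit_runs(sample, ascii_only=True)     # finds only "42"
print(unicode_runs)
print(ascii_runs)
```

With the Unicode-aware setting, the Arabic-Indic digits count as a match for
\d; with the ASCII-only setting, they do not. That difference is the
incompatibility under discussion.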
Sorry, due to fumblefingers I gave the wrong URL for [1]. Here's a
cor
Paul Eggert writes:
> Here are two ways forward to fix this incompatibility (there are other
> possibilities of course):
>
> (A) GNU grep adds a --no-ucp option that acts like 10.43 pcre2grep
> --no-ucp, and git grep -P follows suit. That is, both GNU and git grep
> act like 10.43 pcre2grep -u, i
On Wed, Apr 5, 2023 at 11:33 AM Paul Eggert wrote:
> On 2023-04-04 12:31, Junio C Hamano wrote:
> > My personal inclination is to let Perl folks decide
> > and follow them (even though I am skeptical about the wisdom of
> > letting '\d' match anything other than [0-9])
>
> I looked into what pcre2
On 2023-04-05 12:40, Jim Meyering wrote:
> (C) preserve grep -P's tradition of \d matching only 0..9, and once
> grep uses 10.43 or newer, \b and \w will also work as desired.
If I understand you correctly, (C) would mean that GNU grep -P, git grep
-P, and pcre2grep -u would all use PCRE2_UTF | P
On Wed, Apr 5, 2023 at 12:40 PM Jim Meyering wrote:
>
> Changing grep -P's \d to match multibyte digits by default would break
> an important contract.
While I tend to agree[1] (and indeed that is why PCRE2_EXTRA_ASCII_BSD
was invented), it would be also important to note that it goes against
the