bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Paul Eggert
On 2023-04-04 12:31, Junio C Hamano wrote:
> My personal inclination is to let Perl folks decide and follow them (even though I am skeptical about the wisdom of letting '\d' match anything other than [0-9])
I looked into what pcre2grep does. It has always done only 8-bit processing unless you u

bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Paul Eggert
On 2023-04-05 11:32, Paul Eggert wrote:
> in a February 8 commit[1], Philip Hazel changed pcre2grep to use PCRE2_UCP, so this will mean 10.43 pcre2grep -u will behave like 3.9 GNU grep -P did (though 3.10 has changed this).
Sorry, due to fumblefingers I gave the wrong URL for [1]. Here's a cor

bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Junio C Hamano
Paul Eggert writes:
> Here are two ways forward to fix this incompatibility (there are other
> possibilities of course):
>
> (A) GNU grep adds a --no-ucp option that acts like 10.43 pcre2grep
> --no-ucp, and git grep -P follows suit. That is, both GNU and git grep
> act like 10.43 pcre2grep -u, i

bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Jim Meyering
On Wed, Apr 5, 2023 at 11:33 AM Paul Eggert wrote:
> On 2023-04-04 12:31, Junio C Hamano wrote:
> > My personal inclination is to let Perl folks decide
> > and follow them (even though I am skeptical about the wisdom of
> > letting '\d' match anything other than [0-9])
>
> I looked into what pcre2

bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Paul Eggert
On 2023-04-05 12:40, Jim Meyering wrote:
> (C) preserve grep -P's tradition of \d matching only 0..9, and once grep uses 10.43 or newer, \b and \w will also work as desired.
If I understand you correctly, (C) would mean that GNU grep -P, git grep -P, and pcre2grep -u would all use PCRE2_UTF | P

bug#60690: -P '\d' in GNU and git grep

2023-04-05 Thread Carlo Arenas
On Wed, Apr 5, 2023 at 12:40 PM Jim Meyering wrote:
> Changing grep -P's \d to match multibyte digits by default would break
> an important contract.
While I tend to agree[1] (and indeed that is why PCRE2_EXTRA_ASCII_BSD was invented), it would be also important to note that it goes against the
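
For readers who want to see the behavior the thread is debating, here is a minimal sketch in Python. It is only an analogy: Python's `re` module is not PCRE2 and none of this code comes from the thread. It shows the same split between a Unicode-aware `\d` (roughly what PCRE2_UCP enables) and an ASCII-only `\d` (the traditional grep -P expectation). The sample string and variable names are assumptions chosen for illustration.

```python
import re

# Hypothetical sample: ASCII digits, a space, then three Arabic-Indic
# digits (U+0664, U+0665, U+0666), which are multibyte in UTF-8.
text = "123 \u0664\u0665\u0666"

# For str patterns, Python's \d is Unicode-aware by default and matches
# any character in Unicode category Nd (decimal digit).
unicode_digits = re.findall(r"\d", text)

# With re.ASCII, \d is restricted to [0-9] only.
ascii_digits = re.findall(r"\d", text, flags=re.ASCII)

print(len(unicode_digits))  # count of matches with Unicode-aware \d
print(len(ascii_digits))    # count of matches with ASCII-only \d
```

A script that uses `\d` to validate input that is later parsed as a plain decimal number would accept the Arabic-Indic digits in the first mode, which is the kind of contract change the excerpts above are concerned with.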