bug#60506: parallel grep

2023-01-06 Thread Eike Dierks
I was thinking about this again. It looked easy at first, but it is not. My prime use would be to grep in /usr/include. That would search a lot of files, but only return a few results. In that case, searching a lot of files in parallel could be beneficial. But it gets a lot more troublesome, if yo
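The use case described above (many files, few matches) can be sketched as follows. This is only an illustration in Python, not a proposal for grep's implementation: the names `grep_file` and `parallel_grep` and the `workers` parameter are hypothetical. Results are collected in file order, which sidesteps one ordering concern of parallel output. Python threads mostly overlap file I/O here, so the sketch shows the structure rather than a guaranteed speedup.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def grep_file(path, pattern):
    """Return (path, line number, line) for every matching line in one file."""
    hits = []
    try:
        with open(path, "r", errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if pattern.search(line):
                    hits.append((str(path), lineno, line.rstrip("\n")))
    except OSError:
        # Unreadable files are skipped in this sketch.
        pass
    return hits


def parallel_grep(root, regex, workers=8):
    """Search every regular file under root, one file per task."""
    pattern = re.compile(regex)
    files = sorted(p for p in Path(root).rglob("*") if p.is_file())
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so output stays deterministic.
        for hits in pool.map(lambda p: grep_file(p, pattern), files):
            results.extend(hits)
    return results
```

For example, `parallel_grep("/usr/include", r"size_t")` would return a list of `(path, line number, line)` tuples.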

bug#60618: unicode characters are not identified as such for \w and \b with -P

2023-01-06 Thread Carlo Arenas
Reported to PCRE[1] with mention of GNU grep being also affected. [1] https://github.com/PCRE2Project/pcre2/issues/185 From c2d4a43b5b15df7c8853d591bf6ae872c602ed14 Mon Sep 17 00:00:00 2001 From: Carlo Marcelo Arenas Belón Date: Fri, 6 Jan 2023 19:34:56 -0800 Subject: [PATC

bug#60618: unicode characters are not identified as such for \w and \b with -P

2023-01-06 Thread Jim Meyering
On Fri, Jan 6, 2023 at 7:49 PM Carlo Arenas wrote: > Reported to PCRE[1] with mention of GNU grep being also affected. > > [1] https://github.com/PCRE2Project/pcre2/issues/185 Yikes. This is a big deal. Thank you for the patch and added test. I made a tiny comment tweak and this test logic change

bug#60618: unicode characters are not identified as such for \w and \b with -P

2023-01-06 Thread Jim Meyering
On Fri, Jan 6, 2023 at 11:28 PM Jim Meyering wrote: > On Fri, Jan 6, 2023 at 7:49 PM Carlo Arenas wrote: > > Reported to PCRE[1] with mention of GNU grep being also affected. > > > > [1] https://github.com/PCRE2Project/pcre2/issues/185 > > Yikes. This is a big deal. > Thank you for the patch and

bug#60621: grep -P does not set PCRE2_UCP

2023-01-06 Thread Karl Pettersson
Hi Using grep -P for boundary matches yields incorrect results with non-ASCII letters: $ echo 'Öst' | grep -P '\bs' Öst The output should be nothing in this case, and the culprit seems to be this line in pcresearch.c: flags |= PCRE2_UTF; If the PCRE2_UCP flag is added according to this,
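The difference between ASCII-only and Unicode-aware word boundaries reported above can be reproduced by analogy with Python's `re` module. This is not PCRE2 and not grep's code; `re.ASCII` is used here only as a stand-in for matching without Unicode character properties, and the variable names are illustrative.

```python
import re

text = "Öst"

# ASCII-only classes: 'Ö' is not a word character, so a word boundary is
# seen between 'Ö' and 's', and '\bs' matches in the middle of the word.
ascii_match = re.search(r"\bs", text, flags=re.ASCII)

# Unicode-aware classes (Python's default for str patterns): 'Ö' is a word
# character, so there is no boundary before 's' and nothing matches.
unicode_match = re.search(r"\bs", text)

print("ASCII-only:", ascii_match)
print("Unicode-aware:", unicode_match)
```

The first search finds `s` at index 1, which mirrors the unexpected `Öst` output in the report; the second finds nothing, which is the behavior the reporter expects.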