On Sun, Nov 14, 2021 at 12:45 PM Paul Eggert <egg...@cs.ucla.edu> wrote:
>
> On 11/9/21 02:58, Carlo Marcelo Arenas Belón wrote:
> > Sadly, hadn't been able to generate a release,
>
> Does this mean you're having trouble running 'make dist'? If so, what's
> the trouble?

I seem to be unlucky: I get certificate errors in Debian sid and FTBFS errors when building the info in macOS, but the latest master was able to run `make dist` successfully in Debian 10, so it is most likely just a PEBKAC problem.

> Also, I followed up with several related patches (also attached as
> 0002-0012). Please take a look at them and let us know of any problems.
> In the attached patch "grep: prefer signed integers" I followed the
> usual grep approach of preferring signed to unsigned integers (e.g.,
> idx_t to size_t) when either will do; this lets us debug better with
> -fsanitize=undefined to catch integer overflow.

The one in patch 6 where a uint32_t option is doubled triggers warnings, AFAIK because an unsigned variable is compared with 0, but there are several of those in upstream gnulib, so presumably that is not a concern?

Using idx_t instead of size_t should be fine (it only halves the maximum size of the objects managed), but I am concerned that assuming PCRE2_SIZE_MAX is always equivalent to SIZE_MAX (as done in patch 4) might be risky, at least without a comment. Considering that it is part of the API anyway, it might be better kept as PCRE2_SIZE_MAX, IMHO.

> One issue I discovered: PCRE2_EXTRA_MATCH_WORD (which is used by
> pcre2grep -w) is incompatible with 'grep -w'. For example, 'echo a%%a |
> grep -Pw %%' outputs nothing, whereas 'echo a%%a | pcre2grep -w %%'
> outputs 'a%%a'. I think the GNU grep behavior (which is the same as with
> 'grep -w', either on Linux or OpenBSD) is more intuitive here: do you
> happen to know why PCRE behaves the way it does? Is that worth a PCRE2
> bug report? Anyway, the attached patches avoid using
> PCRE2_EXTRA_MATCH_WORD for that reason.

As I mentioned before (in an early draft that also had this change reversed), PCRE matches the Perl definition.

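To make the difference concrete, here is a minimal sketch in Python. It rests on two assumptions: that PCRE2_EXTRA_MATCH_WORD behaves like wrapping the pattern in `\b(?:...)\b`, and that grep's -w behaves like wrapping it in the lookarounds `(?<!\w)(?:...)(?!\w)`. Python's `re` module only stands in for PCRE2 here, and the helper names are hypothetical.

```python
import re


def boundary_style(pattern):
    # Assumed model of PCRE2_EXTRA_MATCH_WORD: require a word boundary
    # (\b) on both sides of the match.
    return re.compile(r"\b(?:" + pattern + r")\b")


def lookaround_style(pattern):
    # Assumed model of grep -w: the match must not be adjacent to a
    # word character on either side.
    return re.compile(r"(?<!\w)(?:" + pattern + r")(?!\w)")


for line in ("a%%a", "%%"):
    b = bool(boundary_style("%%").search(line))
    l = bool(lookaround_style("%%").search(line))
    print(f"{line!r}: boundary={b} lookaround={l}")
# 'a%%a': boundary=True lookaround=False
# '%%': boundary=False lookaround=True
```

With a pattern made only of non-word characters, the two conventions give opposite answers: `\b` needs a word character next to the `%`, while the lookarounds forbid one. For patterns that start and end with word characters the two agree.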
I would suggest instead that -P should also follow the Perl convention when used together with -w, but maybe that is something that a -P feature flag could enable or disable as needed?

Note that the definition of "word" also has a different meaning in a post-Unicode world, so I expect that will have to change eventually as well.

Carlo