The branch main has been updated by kevans:

URL: https://cgit.FreeBSD.org/src/commit/?id=d0ff5773cefaf3fa41b1be3e44ca35bd9d5f68ee

commit d0ff5773cefaf3fa41b1be3e44ca35bd9d5f68ee
Author:     Kyle Evans <kev...@freebsd.org>
AuthorDate: 2025-08-08 18:21:03 +0000
Commit:     Kyle Evans <kev...@freebsd.org>
CommitDate: 2025-08-08 18:27:26 +0000

    libregex: fix our mapping for \w

    A small oversight in our implementation of \w is that it's actually not
    strictly [[:alnum:]]. According to the GNU documentation, it's actually
    [[:alnum:]] + underscore.

    The fix is rather trivial: just add it to our set explicitly, and amend
    our test set to be sure that _ is actually included.

    PR: 287396
---
 lib/libc/regex/regcomp.c     | 1 +
 lib/libregex/tests/gnuext.in | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/libc/regex/regcomp.c b/lib/libc/regex/regcomp.c
index f34dc322d0bb..aebea2b02435 100644
--- a/lib/libc/regex/regcomp.c
+++ b/lib/libc/regex/regcomp.c
@@ -1183,6 +1183,7 @@ p_b_pseudoclass(struct parse *p, char c) {
 		/* PASSTHROUGH */
 	case 'w':
 		p_b_cclass_named(p, cs, "alnum");
+		CHadd(p, cs, '_');
 		break;
 	case 'S':
 		cs->invert = 1;
diff --git a/lib/libregex/tests/gnuext.in b/lib/libregex/tests/gnuext.in
index 8f49854235a9..3ce0f4af1b34 100644
--- a/lib/libregex/tests/gnuext.in
+++ b/lib/libregex/tests/gnuext.in
@@ -10,9 +10,9 @@ a\|b\|c b abc a
 (ab)\1 - abab abab
 \1(ab) C ESUBREG
 (a)(b)(c)(d)(e)(f)(g)(h)(i)\9 - abcdefghii abcdefghii
-# \w, \W, \s, \S (alnum, ^alnum, space, ^space)
-\w+ - -%@a0X- a0X
-\w\+ b -%@a0X- a0X
+# \w, \W, \s, \S (_alnum, ^_alnum, space, ^space)
+\w+ - -%@a_0X- a_0X
+\w\+ b -%@a_0X- a_0X
 \s+ - aSNTb SNT
 \s\+ b aSNTb SNT
 # Word boundaries (\b, \B, \<, \>, \`, \')
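
For readers who want to see the intended character set in isolation, the following is a minimal sketch rather than anything from the commit itself. It assumes that \w should now behave like the portable POSIX bracket expression [[:alnum:]_], and it uses that expression directly with regcomp(3) so that no GNU extension support is required. The helper name first_word_run is hypothetical and is not part of libc or libregex.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libc or libregex): copy the first run
 * of "word" characters found in `subject` into `out`.  The pattern is the
 * plain POSIX ERE that \w is assumed to be equivalent to after this change.
 * Returns 0 on a match and nonzero otherwise.
 */
static int
first_word_run(const char *subject, char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[1];
	size_t len;
	int rc;

	assert(subject != NULL && out != NULL && outlen > 0);

	/* [[:alnum:]_]+ : alphanumerics plus underscore, one or more. */
	if (regcomp(&re, "[[:alnum:]_]+", REG_EXTENDED) != 0)
		return (-1);

	rc = regexec(&re, subject, 1, m, 0);
	if (rc == 0) {
		/* Copy the matched span, truncating if `out` is too small. */
		len = (size_t)(m[0].rm_eo - m[0].rm_so);
		if (len >= outlen)
			len = outlen - 1;
		memcpy(out, subject + m[0].rm_so, len);
		out[len] = '\0';
	}
	regfree(&re);
	return (rc);
}
```

Called with the subject string from the amended test, -%@a_0X-, the helper should copy a_0X into the buffer. With [[:alnum:]]+ alone the first match would stop at the underscore, which is the gap the updated test lines are meant to catch.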