The branch main has been updated by kevans:

URL: https://cgit.FreeBSD.org/src/commit/?id=d0ff5773cefaf3fa41b1be3e44ca35bd9d5f68ee

commit d0ff5773cefaf3fa41b1be3e44ca35bd9d5f68ee
Author:     Kyle Evans <kev...@freebsd.org>
AuthorDate: 2025-08-08 18:21:03 +0000
Commit:     Kyle Evans <kev...@freebsd.org>
CommitDate: 2025-08-08 18:27:26 +0000

    libregex: fix our mapping for \w
    
    A small oversight in our implementation of \w is that it maps to
    [[:alnum:]] alone.  According to the GNU documentation, \w should
    match [[:alnum:]] plus underscore.  The fix is trivial: add '_' to
    our set explicitly, and amend our test set to confirm that _ is
    included.
    
    PR:             287396
---
 lib/libc/regex/regcomp.c     | 1 +
 lib/libregex/tests/gnuext.in | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/libc/regex/regcomp.c b/lib/libc/regex/regcomp.c
index f34dc322d0bb..aebea2b02435 100644
--- a/lib/libc/regex/regcomp.c
+++ b/lib/libc/regex/regcomp.c
@@ -1183,6 +1183,7 @@ p_b_pseudoclass(struct parse *p, char c) {
                /* PASSTHROUGH */
        case 'w':
                p_b_cclass_named(p, cs, "alnum");
+               CHadd(p, cs, '_');
                break;
        case 'S':
                cs->invert = 1;
diff --git a/lib/libregex/tests/gnuext.in b/lib/libregex/tests/gnuext.in
index 8f49854235a9..3ce0f4af1b34 100644
--- a/lib/libregex/tests/gnuext.in
+++ b/lib/libregex/tests/gnuext.in
@@ -10,9 +10,9 @@ a\|b\|c       b       abc     a
 (ab)\1 -       abab    abab
 \1(ab) C       ESUBREG
 (a)(b)(c)(d)(e)(f)(g)(h)(i)\9  -       abcdefghii      abcdefghii
-# \w, \W, \s, \S (alnum, ^alnum, space, ^space)
-\w+    -       -%@a0X- a0X
-\w\+   b       -%@a0X- a0X
+# \w, \W, \s, \S (_alnum, ^_alnum, space, ^space)
+\w+    -       -%@a_0X-        a_0X
+\w\+   b       -%@a_0X-        a_0X
 \s+    -       aSNTb   SNT
 \s\+   b       aSNTb   SNT
 # Word boundaries (\b, \B, \<, \>, \`, \')
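
For readers who want to see the intended semantics outside of libregex, the following is a minimal sketch using only the POSIX regex API. It assumes, per the commit message, that \w should behave like the bracket expression [[:alnum:]_]; the helper name first_word_run is hypothetical and is not part of this commit or of libregex.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: find the first run of
 * "word" characters in subject, spelled portably as [[:alnum:]_]+,
 * and copy it into out.  Returns 0 on a match that fits in out,
 * -1 otherwise.
 */
static int
first_word_run(const char *subject, char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[1];
	int rc = -1;

	if (outlen == 0)
		return (-1);
	/* POSIX ERE equivalent of the GNU-style \w+ described above. */
	if (regcomp(&re, "[[:alnum:]_]+", REG_EXTENDED) != 0)
		return (-1);
	if (regexec(&re, subject, 1, m, 0) == 0) {
		size_t len = (size_t)(m[0].rm_eo - m[0].rm_so);

		/* Copy only if the match and its terminator fit. */
		if (len < outlen) {
			memcpy(out, subject + m[0].rm_so, len);
			out[len] = '\0';
			rc = 0;
		}
	}
	regfree(&re);
	return (rc);
}
```

Called with the subject from the amended test, -%@a_0X-, the helper should copy a_0X into out, matching the expected result in gnuext.in. With a plain [[:alnum:]]+ pattern the match would stop at the underscore, which is the behavior this commit corrects for \w.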
