I can reproduce this in gawk. :-( It's a bug somewhere in the dfa matcher. When I export GAWK_NO_DFA=1 to bypass the dfa matcher, only xxxxy matches.
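For anyone who wants to repeat the check, here is a minimal sketch. It assumes that `awk` on your system is gawk (other awks simply ignore GAWK_NO_DFA), and the helper name `check_awk` is only mine for illustration. It runs the four inputs from the report against the same pattern with the dfa matcher bypassed.

```shell
#!/bin/sh
# Pattern from the report: four "x+" groups, so at least four x's are needed.
pattern='^x+x+x+x+y$'

# check_awk INPUT (hypothetical helper): print "match" or "no match" for
# INPUT against $pattern. GAWK_NO_DFA=1 makes gawk skip the dfa matcher;
# awks other than gawk ignore the variable.
check_awk() {
    if printf '%s\n' "$1" |
        GAWK_NO_DFA=1 awk -v re="$pattern" '$0 ~ re { found = 1 } END { exit !found }'
    then
        echo "match"
    else
        echo "no match"
    fi
}

for input in xxxxy xxxy xxy xy; do
    printf '%-6s %s\n' "$input" "$(check_awk "$input")"
done
```

With the dfa matcher out of the way, only xxxxy should report a match. Dropping GAWK_NO_DFA=1 from the helper on an affected gawk should show the difference.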
Hope this helps,

Arnold

Gonzalo Padrino <grimalg.on+...@gmail.com> wrote:

> Hello,
>
> While using GNU grep v3.4 in an Ubuntu 20.04 userspace running on top of
> Win10 WSL (yeah, i know... but also checked in other envs) i discovered
> what seems like an obvious bug (if i'm not mistaken).
> The bug:
> -----
> me@host:~$ echo 'xxxxy' |grep -E '^x+x+x+x+y$'
> xxxxy
> me@host:~$ echo 'xxxy' |grep -E '^x+x+x+x+y$'
> xxxy
> me@host:~$ echo 'xxy' |grep -E '^x+x+x+x+y$'
> xxy
> me@host:~$ echo 'xy' |grep -E '^x+x+x+x+y$'
>
> ----
> ...the terminal supports ansi color escapes, and what's really weird is
> that only the result from the first command is colored in red. First and
> fourth commands yield correct results; the second and third do not, as they
> should not match their input.
>
> I've tested releases from v3.1 to latest v3.5 and found the anomalous
> behaviour in version v3.2 through v3.5. A (quick and clunky) git bisect led
> me to believe it was introduced about two years ago, possibly in commit
> 123620af88f55c3e0cc9f0aed7311c72f625bc82 (
> https://git.savannah.gnu.org/cgit/grep.git/commit/?id=123620af88f55c3e0cc9f0aed7311c72f625bc82).
> If this is true, it would mean either the bug is in gnulib, or maybe grep
> needed to do some kind of extra handling on its side.
>
> Kind regards.
> Gonzalo Padrino.
>
> P.S.: I had to patch some things in order to successfully compile the code
> after checking out some problematic commits (pragmas to avoid warnings
> about "pure" and "noreturn" function attributes, a missing configmake
> dependency in bootstrap.conf, etc ).