On Fri, Jan 24, 2025 at 07:26:00PM +0000, Peter White wrote:
> On Fri, Jan 24, 2025 at 01:27:13PM +0000, Andreas BROCKMANN via Bug reports
> for GNU grep wrote:
> > Hi,
> >
> > The 1st command below correctly reports trailing spaces, for Unix and
> > Windows format files.
> > The 2nd one incorrectly reports all lines.
> >
> > grep -sHn -i " [[:cntrl:]]*$" *.vhd
> > grep -sHn -i "\s[[:cntrl:]]*$" *.vhd
> As someone who just today made a similar mistake I would like to point
> out that the pattern does as intended because '*' matches *zero* or more
> occurrences of the preceding atom. So the second pattern matches
> any line that contains a *literal* 's' followed by zero or more control
> chars, which is any line because of the newline at the end which is a
> control char. Since you did not ask for perl regex (-P) grep uses basic
> POSIX regex instead; at least I *think* you want perl syntax given that
> '\s' is only valid in PCRE, IIRC.

Turns out that last part is not true, sorry. I was going by the grep(1)
man page instead of `info grep`, which does say that '\s' is shorthand
for '[[:space:]]'. Still, the 2nd pattern is incorrect. IIUC this is
what it should look like:

    # '-i' is bogus since there is no upper/lower case whitespace
    grep --color=never -sHn '[[:blank:]][[:cntrl:]]*$'

[:blank:] is the more correct char class because it matches only space
and tab, whereas '\s' (i.e. '[[:space:]]') also matches <LF>, <VT>, <FF>
and, as it so happens, <CR>. DOS files have the <CR> in front of <LF>
(a.k.a. '$'), so in a DOS file '\s' matches the <CR> at the end of every
line and every line is reported. That is also why the original (1st)
pattern did match *correctly*: the literal space cannot match <CR>, and
'[[:cntrl:]]*' absorbs the <CR> before '$'. Contrary to the claim in the
OP I could only reproduce the "false" behavior with DOS and not UNIX
files. And now I understand why '[[:cntrl:]]' is in the pattern (sorry
for my initial misunderstanding). DOS, the gift that keeps on giving. :P

Also note the '--color=never'. I don't know how relevant this is on
Windows, but on my terminal emulator (with --color=auto) the <CR> at the
end of a line in DOS files was printed as part of the match and the
terminal obeyed it, with all the ensuing consequences: lines that showed
up empty, without the match text. Another "gift", I guess.

PW
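
For anyone who wants to reproduce the difference between the three patterns discussed above, here is a minimal sketch. It assumes GNU grep (for the '\s' extension) and a POSIX shell with mktemp; the file names and contents are made up for illustration, and count_matches is a hypothetical helper, not anything from grep itself.

```shell
#!/bin/sh
# Hypothetical demo: file names and contents are invented for illustration.
# Assumes GNU grep, which accepts '\s' as shorthand for '[[:space:]]'.
LC_ALL=C
export LC_ALL

demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT

# Each file has one clean line and one line with a trailing space.
printf 'clean\ntrailing \n'     > "$demo_dir/unix.vhd"   # LF line endings
printf 'clean\r\ntrailing \r\n' > "$demo_dir/dos.vhd"    # CRLF line endings

# Print the number of lines in file $2 that match pattern $1.
count_matches() {
    grep -c -e "$1" "$2"
}

# Compare the OP's two patterns and the [[:blank:]] variant on both files.
for pattern in ' [[:cntrl:]]*$' '\s[[:cntrl:]]*$' '[[:blank:]][[:cntrl:]]*$'; do
    for file in unix.vhd dos.vhd; do
        printf '%-28s %-9s %s\n' "$pattern" "$file" \
            "$(count_matches "$pattern" "$demo_dir/$file")"
    done
done
```

If the explanation above is right, the '\s' pattern should count both lines of the CRLF file (the <CR> itself satisfies '\s'), while the literal-space and '[[:blank:]]' patterns should count only the line with the trailing space in either file. Counting with -c sidesteps the --color issue, since no matched text is printed.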