tag 26322 notabug
thanks

On 03/31/2017 08:37 AM, Julien Denis wrote:
> Hello,
>
> Assuming that "textfile" is a regular non empty text file, is it
> normal that grep '*' textfile returns nothing but grep -E '*' textfile
> returns all the lines ?
> I got this using Debian 7.1 stable and so grep is version 2.20.
> Would a newer grep version resolve this or is it not a bug (but a
> valid behavior of the star character in ERE) ?

According to POSIX, the regular expression '*' has a different
interpretation under BRE than under ERE:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html

In BRE (plain 'grep' style), 9.3.3 states that

"The <asterisk> shall be special except when used:
    In a bracket expression
    As the first character of an entire BRE (after an initial '^', if
    any)"

so you are searching for the literal character '*'. Since you got no
output, your textfile contains no literal '*' on any line.

In ERE ('grep -E' style), 9.4.3 states that

"*+?{
    The <asterisk>, <plus-sign>, <question-mark>, and <left-brace>
    shall be special except when used in a bracket expression (see RE
    Bracket Expression). Any of the following uses produce undefined
    results:
        If these characters appear first in an ERE, or immediately
        following an unescaped <vertical-line>, <circumflex>,
        <dollar-sign>, or <left-parenthesis>"

So your regular expression is undefined, and we can make it mean
whatever we want (whether we error out, or treat it as equivalent to
some other regular expression, it doesn't matter - you are outside the
bounds of POSIX, so you can't rely on our behavior being consistent).

My guess is that your combination of libc and grep version (yes, it
might be different across versions or on different platforms) has an
interpretation where '*' is treated the same as searching for
zero-or-more instances of the empty regular expression ''. Since the
empty regular expression matches everywhere, zero-or-more instances of
it also match everywhere, and you thus get every line of textfile as
output. But that doesn't mean you should expect that behavior to stay
the same.

Maybe you are mixing up regular expressions with globs. If you want to
match zero-or-more characters with a glob, you use '*'; but that
translates to '.*' in both BRE and ERE syntax.

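If it helps, here is a small sketch of the portable spellings. The
temporary file and its three lines are made up for illustration (they
are not your textfile), and I have deliberately left out
grep -E '*', since its result is undefined and I would not want to
suggest any particular output for it.

```shell
#!/bin/sh
# Hypothetical sample data: three lines, exactly one of which contains
# a literal asterisk.
tmpfile=$(mktemp)
printf '%s\n' 'plain line' 'a * b' 'another line' > "$tmpfile"

# BRE: a leading '*' is an ordinary character (POSIX 9.3.3), so this
# counts lines containing a literal asterisk.
bre_literal=$(grep -c '*' "$tmpfile")

# Well-defined ways to ask for a literal asterisk outside BRE:
# escape it in ERE, or use fixed-string matching.
ere_literal=$(grep -c -E '\*' "$tmpfile")
fixed_literal=$(grep -c -F '*' "$tmpfile")

# The regex counterpart of the glob '*' is '.*' in both syntaxes;
# it matches every line, including empty ones.
bre_any=$(grep -c '.*' "$tmpfile")
ere_any=$(grep -c -E '.*' "$tmpfile")

echo "literal asterisk: BRE=$bre_literal ERE=$ere_literal fixed=$fixed_literal"
echo "any line:         BRE=$bre_any ERE=$ere_any"

rm -f "$tmpfile"
```

On that sample file, the three literal searches each count one line
and the two '.*' searches each count all three.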
At any rate, I don't see this as a bug, so I'm closing the instance in
the bug-tracker, but feel free to reply with further comments or
questions.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org