Regarding EREs having leading repetition operators, e.g. '*xyz':

Section 9.5.3 of http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html supplies the grammar for POSIX-conforming EREs. From the notes at the very bottom:

-----------------------------------------------------------------------
The ERE grammar does not permit several constructs that previous
sections specify as having undefined results:

[ ... ]

  * One or more ERE_dupl_symbols appearing first in an ERE, or

[ ... ]

Implementations are permitted to extend the language to allow these.
Conforming applications cannot use such constructs.
-----------------------------------------------------------------------

To my eyes, the last sentence seems to say that a conforming implementation must not accept EREs like '*xyz'. But egrep [grep 2.14] does accept them, even with POSIXLY_CORRECT defined, e.g.

    $ export POSIXLY_CORRECT=1
    $ echo 'abcdefghi' | egrep --color=auto '*def'

matches 'def'. In contrast, POSIX regex(3) rejects such EREs with "invalid preceding regular expression".

I'm not sure whether this is a POSIX conformance issue or not; it depends on the intended semantics of POSIXLY_CORRECT. The man page seems a bit ambiguous to me, since it first says that grep "behaves as POSIX.2 requires", but then goes on to list only some specific behaviors related to option processing. It wasn't clear to me whether listing the option-related behavior was intended to limit the scope of the POSIXLY_CORRECT-ness to only those aspects, or whether those behaviors were listed just because they are (for example) often confusing to users, hence worthwhile to call out explicitly.

In summary, there are a few questions/branches to this:

1. If POSIXLY_CORRECT is intended to be conforming only in the specific respects listed, I'd suggest that the name of the associated environment variable be changed to reflect that (e.g., something like POSIXLY_CORRECT_OPTS), and that the man page text be changed to read something like:

    POSIXLY_CORRECT_OPTS
        If set, grep conforms with POSIX.2 with regard to the following option processing behaviors: [ description of option behaviors ]

2. If POSIXLY_CORRECT is intended to mean 'fully conforming in all respects', then it seems like the present behavior is in technical violation.

3. If (2) is the case, and the decision is made to change the behavior of grep accordingly, it might be worthwhile to also change the doc for POSIXLY_CORRECT to something like this:

    POSIXLY_CORRECT
        If set, grep conforms with POSIX.2 in all respects. In particular, [ description of option-related behaviors and/or other behaviors that are deemed worthwhile to call out explicitly ]

4. If (2) is the case, but the decision is made not to change the behavior of grep (i.e. accept the non-conformance), it might be worthwhile to change the doc for POSIXLY_CORRECT to something like this:

    POSIXLY_CORRECT
        If set, grep conforms with POSIX.2 in almost all respects. In particular, [ description of option-related behaviors and/or other behaviors that are deemed worthwhile to call out explicitly ]. But it does not conform precisely regarding EREs like '*xyz' [ and whatever other ways are known to be non-conforming. ]

To pre-answer an expected question, asked of a submitter (Roman Donchenko) in a similar POSIX violation bug report (#37737):

    Are you encountering this problem in a real-world usecase, or are you simply reporting a violation of the standard?

My response is essentially the same as Roman gave: I am reporting it only as a violation. OTOH, the POSIX-mandated behavior makes a lot more sense to me than the current behavior, since expressions like '*xyz' are almost always user error; the intent is usually '.*xyz'. So if such expressions were rejected by egrep, it would IMO be a behavioral improvement for users (like, ummm... me) who chronically mis-remember how '*' is interpreted by bash vs. grep.
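
For illustration only, the kind of rejection described above can be approximated outside grep with a small pre-check in a wrapper script. The sketch below is a rough approximation under stated assumptions: the names has_leading_dupl_symbol and check_ere are hypothetical and are not part of grep, regex(3), or POSIX, and the check looks only at the very start of the pattern (it does not inspect what follows '(' or '|').

```python
import re

# Hypothetical helper, not part of grep or POSIX: matches an ERE whose first
# token is a duplication symbol (*, +, ?, or an interval such as {2,3}).
_LEADING_DUPL = re.compile(r"(?:[*+?]|\{\d+(?:,\d*)?\})")


def has_leading_dupl_symbol(ere: str) -> bool:
    """Return True if the ERE starts with a repetition operator."""
    # re.match anchors at the start of the string, so only a leading
    # operator is detected; 'a*' is left alone.
    return _LEADING_DUPL.match(ere) is not None


def check_ere(ere: str) -> str:
    """Return the pattern unchanged, or raise ValueError with a hint."""
    if has_leading_dupl_symbol(ere):
        # Assumption: the most likely intent is "any character, repeated",
        # so the hint simply prefixes a '.' to the pattern.
        raise ValueError(
            f"ERE {ere!r} starts with a repetition operator; "
            f"did you mean {'.' + ere!r}?"
        )
    return ere
```

With this sketch, check_ere('.*def') returns the pattern unchanged, while check_ere('*def') raises ValueError with a hint suggesting '.*def'.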