Date: Sun, 11 Oct 2020 11:45:12 +0200
From: tlaro...@polynum.com
Message-ID: <20201011094512.ga...@polynum.com>
| The problem? the leading '$' is not escaped (I was trying to get the var
| name from a Makefile)...
|
| Is this a bug or is this behavior undefined or even required by
| POSIX?

Not a bug, and (kind of) required: kind of, in that a '\' somewhere it
is not required produces undefined results (XBD 9.3.2):

    The interpretation of an ordinary character preceded by an
    unescaped <backslash> ('\\') is undefined, except for:

The exceptions have nothing to do with '$'.

"Ordinary character" is defined in the previous sentence, same section:

    An ordinary character is a BRE that matches itself: any character
    in the supported character set, except for the BRE special
    characters listed in Section 9.3.3.

9.3.3 does include '$', but:

    $   The <dollar-sign> shall be special when used as an anchor.

So, '$' is an ordinary character, except when it is an anchor.

Anchors are defined in XBD 9.3.8:

    A BRE can be limited to matching expressions that begin or end a
    string; this is called ``anchoring''. The <circumflex> and
    <dollar-sign> special characters shall be considered BRE anchors
    in the following contexts:

(skip '^' for this message)

    2. A <dollar-sign> ('$') shall be an anchor when used as the last
       character of an entire BRE.

Your (first) '$' was not the last character of the BRE, so is not an
anchor, and hence is not special for this reason.

The section continues:

    The implementation may treat a <dollar-sign> as an anchor when
    used as the last character of a subexpression.

That one is optional for the implementation, so you could not rely upon
it working, but here your first '$' is not at the end of a subexpression,
so it wouldn't qualify anyway.

[This option is actually very ugly, as when you want to use a '$' at
the end of a subexpression to match itself, rather than be an anchor,
you must escape it with '\' if the implementation would treat it as an
anchor, but not escape it if it wouldn't.]

The rest of the paragraph just explains how matching by a '$' that is
an anchor works.
Yours isn't, so it is just an ordinary character, and so matches itself,
and would produce undefined results if escaped (the undefined result
could be for it to simply match itself, making \$ always mean to match
a literal '$', but a sed script, or anything else using BREs, should
not rely upon that).

This behaviour ('^' is special only when it is the very first character
of the RE, and '$' is special only when it is the absolute last) is
traditional RE behaviour going back to the very earliest Unix REs (as
in "ed").

For EREs the rules are slightly different, but for anchors, I think
only in that they always work in subexpressions; it isn't an
implementation option. So, even in an ERE your first '$' should not be
escaped (and certainly does not need to be).

kre
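
To illustrate the rule discussed above, here is a minimal sketch, assuming
a sed whose BREs follow the traditional behaviour described in this
message. The helper name varname and the sample Makefile-style line are
made up for illustration; they are not taken from the original script.

```shell
# Hypothetical helper: print the variable name from a Makefile-style
# reference such as "$(CC)".
#
# In the BRE below, the first '$' is not the last character of the
# expression, so it is an ordinary character and is left unescaped.
# The unescaped '(' and ')' are also ordinary in a BRE; only \( and \)
# delimit the subexpression. The final '$' is the last character of
# the entire BRE, so it is an anchor.
varname() {
    printf '%s\n' "$1" | sed -n 's/^.*$(\([A-Za-z_]*\)).*$/\1/p'
}

varname 'OBJ = $(CC) -c'    # prints: CC
```

The sed script is kept in single quotes so that the shell does not try
to treat $( as the start of a command substitution.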