rsmith accepted this revision. rsmith added inline comments.
================ Comment at: clang/lib/Lex/Lexer.cpp:2012-2015 + // Skip escaped characters. Escaped newlines will already be processed by + // getAndAdvanceChar. + if (C == '\\') + C = getAndAdvanceChar(CurPtr, Result); ---------------- vsapsai wrote: > rsmith wrote: > > You can just delete these four lines entirely. > It will make Clang reject previously accepted `#include <test\>escape.h>` > That's what we want, right? I agree with having no support for such file > names, just want to confirm. For the reference, proposed change would match > gcc 5.4.0 behaviour. gcc 6.1 and higher rejects such include too but in a > different way. Hmm. I thought so, but on further inspection this area is more subtle than I'd anticipated. It's *permissible* to reject that. In C++, [lex.header]p2 says: "The appearance of [...] the character[...] `\` [...] in a //q-char-sequence// or an //h-char-sequence// is conditionally-supported with implementation-defined semantics". C11 6.4.7/3 has similar wording but makes the behavior undefined in that case. (Both GCC's behavior and Clang's behavior are permitted by these rules.) Clang's handling of the `#include "q-char-sequence"` form behaves the same as its handling of the `#include <h-char-sequence>` form today: the header name is *not* terminated by a `"` or `>` (respectively) that is preceded by an odd number of `\`s. Retaining this consistency of behavior seems like a good idea. However, the actual behavior when such a character is "escaped" does not seem reasonable: ``` #include <\>> // includes file named \>, not file named > #include "\"" // includes file named \", not file named " ``` ... and there is no general way to perform a double-quoted include of a file whose name contains a `"` (not preceded by `\`), or an angled include of a file whose name is `>` (not preceded by `>`). So it seems that permitting escaped ending characters without actually handling them as escape sequences is not particularly worthwhile. I think the best approach here is to follow GCC's lead: terminate a //header-name// (of either form) when the first `"` or `>` character is reached (depending on the form of header name), regardless of preceding `\`s. But... that's a more invasive change (you'll need to change the handling of double-quoted string literals too, for consistency). Let's keep that change and this one separate. https://reviews.llvm.org/D41423 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits