rsmith accepted this revision.
rsmith added inline comments.

================
Comment at: clang/lib/Lex/Lexer.cpp:2012-2015
+    // Skip escaped characters.  Escaped newlines will already be processed by
+    // getAndAdvanceChar.
+    if (C == '\\')
+      C = getAndAdvanceChar(CurPtr, Result);
----------------
vsapsai wrote:
> rsmith wrote:
> > You can just delete these four lines entirely.
> It will make Clang reject previously accepted `#include <test\>escape.h>` 
> That's what we want, right? I agree with having no support for such file 
> names, just want to confirm. For the reference, proposed change would match 
> gcc 5.4.0 behaviour. gcc 6.1 and higher rejects such include too but in a 
> different way.
Hmm. I thought so, but on further inspection this area is more subtle than I'd 
anticipated.

It's *permissible* to reject that. In C++, [lex.header]p2 says: "The appearance 
of [...] the character[...] `\` [...] in a //q-char-sequence// or an 
//h-char-sequence// is conditionally-supported with implementation-defined 
semantics". C11 6.4.7/3 has similar wording but makes the behavior undefined in 
that case. (Both GCC's behavior and Clang's behavior are permitted by these 
rules.)

Clang's handling of the `#include "q-char-sequence"` form behaves the same as 
its handling of the `#include <h-char-sequence>` form today: the header name is 
*not* terminated by a `"` or `>` (respectively) that is preceded by an odd 
number of `\`s. Retaining this consistency of behavior seems like a good idea.

However, the actual behavior when such a character is "escaped" does not seem 
reasonable:

```
#include <\>> // includes file named \>, not file named >
#include "\"" // includes file named \", not file named "
```

... and there is no general way to perform a double-quoted include of a file 
whose name contains a `"` (not preceded by `\`), or an angled include of a file 
whose name is `>` (not preceded by `>`). So it seems that permitting escaped 
ending characters without actually handling them as escape sequences is not 
particularly worthwhile.

I think the best approach here is to follow GCC's lead: terminate a 
//header-name// (of either form) when the first `"` or `>` character is reached 
(depending on the form of header name), regardless of preceding `\`s.

But... that's a more invasive change (you'll need to change the handling of 
double-quoted string literals too, for consistency). Let's keep that change and 
this one separate.


https://reviews.llvm.org/D41423



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to