rsmith added inline comments.

================
Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:117
+def err_character_not_allowed : Error<
+  "character <U+%0> not allowed in identifiers">;
 def ext_unicode_whitespace : ExtWarn<
----------------
The old diagnostic text here was bad in the case where the character was 
supposed to be part of an identifier. The new diagnostic text is bad in the 
case where the character is not supposed to be part of an identifier (eg, if 
you copy-paste a smart-quote into the source file). Is there some way we can 
phrase this diagnostic so it's reasonable regardless of the programmer's intent?

Perhaps the best we can do is to say that if an identifier is immediately 
followed by one or more "bad" Unicode characters that were probably intended to 
be identifier characters (things we don't recognize as whitespace or homoglyphs 
or anything else), then produce the nice "not allowed in identifiers" 
diagnostic (and maybe even treat the characters as part of the identifier for 
error recovery purposes), and otherwise diagnose as simply "unexpected 
character <U+%0>"? I also wonder if perhaps we should treat all unexpected 
characters as identifier characters for error recovery purposes.


================
Comment at: clang/lib/Lex/Lexer.cpp:3182
+    }
+    Diag(BufferPtr, diag::err_character_not_allowed)
+        << CharBuf
----------------
It's a bit strange that we produce this warning only on characters that aren't 
allowed in identifiers, and don't warn on identifier continuation characters 
that aren't valid identifier start characters (such as a standalone combining 
character). Is there a good reason for that (specifically for the 
`!isAllowedIDChar` check above)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104975/new/

https://reviews.llvm.org/D104975

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to