cor3ntin added inline comments.

================
Comment at: clang/lib/Lex/Lexer.cpp:2399
+                                      getSourceLocation(CurPtr));
+  bool UnicodeDecodeFailed = false;
+
----------------
aaron.ballman wrote:
> It looks like this can move into the `while` loop and we can remove some `= 
> false` from within the loop?
The idea here is to not diagnose a contiguous sequence of invalid code units 
more than once
The sequence  `0x80x0x80x80` for example will only warn one for example. The 
goal is to avoid noise when, for example there is a long comment in 
Windows-1251 encoded cyrillic. it would warn one per word (because the spaces 
would decode fine) rather than one per contiguous non-sense character.
It's also somewhat necessary because we can't know where the end of the invalid 
"character" is, better to be greedy. 
(The test file check that behavior)


================
Comment at: clang/lib/Lex/Lexer.cpp:2407
+    // diagnostic only once per sequence that cannot be decoded.
+    while ((!WarnOnInvalidUtf8 || isASCII(C)) && C != 0 && // Potentially EOF.
+           C != '\n' && C != '\r') { // Newline or DOS-style newline.
----------------
aaron.ballman wrote:
> Do you have some idea of performance cost for enabling the warning with a 
> large TU? Are we talking .5%? 1%? 5%? noise?
This adds 2 comparisons when the warning is enbled (is ASCII(c) is simply `C < 
0x80`) - so it should be noise, but I have not run benchmark.
It might be more noticeable within multi line comments as there the 
optimization that vectorizes the comment skipping on SSE2 platforms is simply 
skipped when this optimization is enabled.


================
Comment at: clang/lib/Lex/Lexer.cpp:2417
+      if (Length == 0) {
+        if (!UnicodeDecodeFailed) {
+          Diag(CurPtr, diag::warn_invalid_utf8_in_comment);
----------------
aaron.ballman wrote:
> aaron.ballman wrote:
> > 
> So if unicode decoding didn't fail... diagnose? :-) The naming here seems 
> unfortunate.
`if(!UnicodeDecodeFailedInThePreviousLoopIteration)` ? I'm open to suggestion 
here


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128059/new/

https://reviews.llvm.org/D128059

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to