Re: [PATCH] D18296: [Sema] Handle UTF-8 invalid format string specifiers

Richard Smith via cfe-commits Mon, 28 Mar 2016 11:23:16 -0700

rsmith added a comment.

This patch builds a length-1 `ConversionSpecifier` but includes the complete 
code point in the length of the overall format specifier, which is 
inconsistent. Please either treat the trailing bytes as part of the 
`ConversionSpecifier` or revert the changes to `ParsePrintfSpecifier` and 
handle this entirely within `HandleInvalidConversionSpecifier`.
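
To make the inconsistency concrete, here is a toy model of the two spans. `ToyConversion`, `ToySpecifier`, and `spansAgree` are hypothetical names invented for this sketch, not Clang's classes. It assumes the conversion specifier is the last component of a format specifier, so both spans should end at the same byte.

```cpp
#include <cstddef>

// Toy model only: these types are hypothetical stand-ins, not Clang's
// ConversionSpecifier / format specifier classes.
struct ToyConversion {
  std::size_t Start;  // byte offset of the conversion character
  std::size_t Length; // number of bytes attributed to it
};

struct ToySpecifier {
  std::size_t Start;  // byte offset of the '%'
  std::size_t Length; // number of bytes in the whole specifier
  ToyConversion Conv;
};

// The conversion is the last piece of the specifier, so both spans are
// expected to end at the same offset.
inline bool spansAgree(const ToySpecifier &FS) {
  return FS.Conv.Start + FS.Conv.Length == FS.Start + FS.Length;
}
```

For a format string such as `"%\xC3\xA9"`, a 3-byte specifier paired with a length-1 conversion, `ToySpecifier{0, 3, {1, 1}}`, fails this check, while `ToySpecifier{0, 3, {1, 2}}` passes it.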


Does the same problem exist when parsing `scanf` specifiers?


================
Comment at: lib/Analysis/PrintfFormatString.cpp:322
@@ +321,3 @@
+
+    // If the specifier in non-printable, it could be the first byte of a
+    // UTF-8 sequence. If that's the case, adjust the length accordingly.
----------------
in -> is

================
Comment at: lib/Analysis/PrintfFormatString.cpp:324
@@ +323,3 @@
+    // UTF-8 sequence. If that's the case, adjust the length accordingly.
+    if (Start + 1 < I && !llvm::sys::locale::isPrint(FirstByte) &&
+        isLegalUTF8String(&SB, SE))
----------------
The interpretation of a format string by `printf` should not depend on the 
locale, so our parsing of a format string should not either.
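
As a rough sketch of a locale-independent test, the check can look only at byte values. `utf8SequenceLength` below is a hypothetical helper written for this comment, not an existing LLVM function; a real fix would presumably reuse the UTF-8 helpers the patch already calls, such as `isLegalUTF8String`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration; not part of the patch or of LLVM.
// Returns the byte length (1-4) of the well-formed UTF-8 sequence starting
// at Begin, or 0 if [Begin, End) does not start with one. Only byte values
// are inspected, so the result cannot vary with the process locale.
inline std::size_t utf8SequenceLength(const unsigned char *Begin,
                                      const unsigned char *End) {
  assert(Begin <= End && "invalid range");
  if (Begin == End)
    return 0;

  const unsigned char Lead = *Begin;
  if (Lead < 0x80)
    return 1; // ASCII

  std::size_t Len = 0;
  // Allowed range for the second byte; narrowed for lead bytes that would
  // otherwise admit overlong forms, surrogates, or values above U+10FFFF.
  unsigned char Lo = 0x80, Hi = 0xBF;
  if (Lead >= 0xC2 && Lead <= 0xDF) {
    Len = 2;
  } else if (Lead >= 0xE0 && Lead <= 0xEF) {
    Len = 3;
    if (Lead == 0xE0) Lo = 0xA0; // reject overlong 3-byte forms
    if (Lead == 0xED) Hi = 0x9F; // reject UTF-16 surrogates
  } else if (Lead >= 0xF0 && Lead <= 0xF4) {
    Len = 4;
    if (Lead == 0xF0) Lo = 0x90; // reject overlong 4-byte forms
    if (Lead == 0xF4) Hi = 0x8F; // reject values above U+10FFFF
  } else {
    return 0; // stray continuation byte, or 0xC0, 0xC1, 0xF5..0xFF
  }

  if (static_cast<std::size_t>(End - Begin) < Len)
    return 0; // truncated sequence
  if (Begin[1] < Lo || Begin[1] > Hi)
    return 0;
  for (std::size_t I = 2; I < Len; ++I)
    if ((Begin[I] & 0xC0) != 0x80)
      return 0; // not a continuation byte
  return Len;
}
```

With something along these lines, a result greater than 1 would identify a multi-byte conversion character without consulting `isPrint` at all.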


http://reviews.llvm.org/D18296



