rsmith added a comment.

This patch builds a length-1 `ConversionSpecifier` but includes the complete code point in the length of the overall format specifier, which is inconsistent. Please either treat the trailing bytes as part of the `ConversionSpecifier` or revert the changes to `ParsePrintfSpecifier` and handle this entirely within `HandleInvalidConversionSpecifier`.
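
For illustration only, here is a minimal sketch of how the byte length of a UTF-8 sequence could be derived from its lead byte alone, so that the same length can be used for both the `ConversionSpecifier` and the overall specifier without consulting the locale. `utf8SequenceLength` is a hypothetical helper written for this comment, not an existing Clang or LLVM function, and it checks only the lead-byte and continuation-byte bit patterns, not overlong encodings or surrogates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Clang/LLVM.
// Returns the number of bytes in the UTF-8 sequence that starts at Begin,
// or 0 if [Begin, End) does not start with a structurally valid sequence.
// Assumption: only bit patterns are checked; overlong forms and surrogate
// code points are not rejected here.
std::size_t utf8SequenceLength(const char *Begin, const char *End) {
  assert(Begin <= End && "invalid range");
  if (Begin == End)
    return 0;

  const unsigned char Lead = static_cast<unsigned char>(*Begin);
  std::size_t Len = 0;
  if (Lead < 0x80)
    Len = 1; // ASCII
  else if ((Lead & 0xE0) == 0xC0)
    Len = 2; // 110xxxxx
  else if ((Lead & 0xF0) == 0xE0)
    Len = 3; // 1110xxxx
  else if ((Lead & 0xF8) == 0xF0)
    Len = 4; // 11110xxx
  else
    return 0; // stray continuation byte or invalid lead byte

  // The sequence must fit in the buffer.
  if (static_cast<std::size_t>(End - Begin) < Len)
    return 0;

  // Every trailing byte must look like 10xxxxxx.
  for (std::size_t I = 1; I < Len; ++I)
    if ((static_cast<unsigned char>(Begin[I]) & 0xC0) != 0x80)
      return 0;

  return Len;
}
```

With something along these lines, the parser could advance past the whole sequence and record that same length on the specifier, rather than recording 1 in one place and the full length in another.
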
Does the same problem exist when parsing `scanf` specifiers?

================
Comment at: lib/Analysis/PrintfFormatString.cpp:322
@@ +321,3 @@
+
+    // If the specifier in non-printable, it could be the first byte of a
+    // UTF-8 sequence. If that's the case, adjust the length accordingly.
----------------
in -> is

================
Comment at: lib/Analysis/PrintfFormatString.cpp:324
@@ +323,3 @@
+    // UTF-8 sequence. If that's the case, adjust the length accordingly.
+    if (Start + 1 < I && !llvm::sys::locale::isPrint(FirstByte) &&
+        isLegalUTF8String(&SB, SE))
----------------
The interpretation of a format string by `printf` should not depend on the locale, so our parsing of a format string should not either.


http://reviews.llvm.org/D18296

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits