efriedma added a comment.
Still worried about the effect on tools which parse clang diagnostics... please
send a message to cfe-dev. Hopefully we'll get responses there.
https://reviews.llvm.org/D33765
___
cfe-commits mailing list
cfe-commits@list
erikjv updated this revision to Diff 124903.
erikjv added a comment.
I moved all the code into TextDiagnostic, so all other interfaces still get
byte offsets.
https://reviews.llvm.org/D33765
Files:
lib/Frontend/TextDiagnostic.cpp
test/Misc/diag-utf8.cpp
Index: test/Misc/diag-utf8.cpp
efriedma added a comment.
I didn't really search for it before, but it looks like LLVM already has a
routine for computing column widths? See llvm::sys::unicode::columnWidthUTF8.
There are some tools which parse clang diagnostic output; we might need a flag
to control this. Not sure who would
erikjv updated this revision to Diff 117660.
erikjv edited the summary of this revision.
https://reviews.llvm.org/D33765
Files:
include/clang/Basic/SourceManager.h
lib/Basic/SourceManager.cpp
test/Misc/diag-utf8.cpp
Index: test/Misc/diag-utf8.cpp
===
Is it right to only change the behavior of this caller? Presumably other
callers (like getSpellingColumnNumber, getExpansionColumnNumber, etc)
probably want the same handling? Do any callers /not/ want this behavior?
On Thu, Jun 1, 2017 at 3:14 AM Erik Verbruggen via Phabricator via
cfe-commits w
efriedma added a comment.
Correctly counting columns is a bit more complicated than that... for example,
consider what happens if you replace `ideëen` with `idez̈en`. See
https://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display-width-in-columns-of-unicode-characters
.
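To make the point above concrete, here is a minimal sketch (not the code from this revision) that compares byte count, code point count, and an approximate display width. The helper names decodeOne, countCodePoints and approxDisplayColumns are hypothetical. The width rule is a deliberate simplification that only treats U+0300..U+036F (Combining Diacritical Marks) as zero-width; real width computation also has to handle wide and other zero-width characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode one UTF-8 sequence starting at s[i] and advance i.
// Assumes well-formed input; no validation is attempted.
char32_t decodeOne(const std::string &s, std::size_t &i) {
  assert(i < s.size());
  unsigned char b = static_cast<unsigned char>(s[i++]);
  // Number of continuation bytes implied by the lead byte.
  int extra = b < 0x80 ? 0 : (b >> 5) == 0x6 ? 1 : (b >> 4) == 0xE ? 2 : 3;
  char32_t cp = extra == 0 ? b : static_cast<char32_t>(b & (0x3F >> extra));
  for (int k = 0; k < extra && i < s.size(); ++k)
    cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
  return cp;
}

// Counts code points, i.e. what "skip the continuation bytes" yields.
std::size_t countCodePoints(const std::string &s) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < s.size(); ++n)
    decodeOne(s, i);
  return n;
}

// Simplified assumption: only U+0300..U+036F occupy zero columns,
// every other code point occupies exactly one.
std::size_t approxDisplayColumns(const std::string &s) {
  std::size_t cols = 0;
  for (std::size_t i = 0; i < s.size();) {
    char32_t cp = decodeOne(s, i);
    if (cp < 0x0300 || cp > 0x036F)
      ++cols;
  }
  return cols;
}
```

For `idez̈en` written as 'z' followed by U+0308 (bytes "idez\xCC\x88" "en"), this gives 8 bytes, 7 code points, and 6 columns, so counting code points alone still overshoots by one.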
ht
erikjv created this revision.
Previously, the column number in a diagnostic was the byte position
in the line. This results in incorrect column numbers whenever the input
contains a multi-byte UTF-8 character. By ignoring all continuation bytes
(bytes whose top two bits are 0b10), the correct column number is cr
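The approach described in the summary can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: columnFromByteOffset and isUTF8Continuation are hypothetical names, the input is assumed to be well-formed UTF-8, and the result counts code points rather than display columns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
inline bool isUTF8Continuation(unsigned char c) { return (c & 0xC0) == 0x80; }

// Hypothetical helper: 1-based column of the byte at `byteOffset` in `line`,
// counting each UTF-8 sequence once by skipping its continuation bytes.
std::size_t columnFromByteOffset(const std::string &line,
                                 std::size_t byteOffset) {
  assert(byteOffset <= line.size());
  std::size_t column = 1;
  for (std::size_t i = 0; i < byteOffset; ++i) {
    if (!isUTF8Continuation(static_cast<unsigned char>(line[i])))
      ++column;
  }
  return column;
}
```

With `ideëen` encoded using a precomposed ë (two bytes, 0xC3 0xAB), the second `e` sits at byte offset 5. The byte-based column would be 6, while columnFromByteOffset returns 5.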