[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-11-30 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment. Still worried about the effect on tools which parse clang diagnostics... please send a message to cfe-dev. Hopefully we'll get responses there. https://reviews.llvm.org/D33765

[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-11-30 Thread Erik Verbruggen via Phabricator via cfe-commits
erikjv updated this revision to Diff 124903. erikjv added a comment. I moved all code to the TextDiagnostics, so all other interfaces still get byte offsets. https://reviews.llvm.org/D33765 Files: lib/Frontend/TextDiagnostic.cpp test/Misc/diag-utf8.cpp Index: test/Misc/diag-utf8.cpp

[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-10-26 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment. I didn't really search for it before, but it looks like LLVM already has a routine for computing column widths? See llvm::sys::unicode::columnWidthUTF8. There are some tools which parse clang diagnostic output; we might need a flag to control this. Not sure who would

[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-10-04 Thread Erik Verbruggen via Phabricator via cfe-commits
erikjv updated this revision to Diff 117660. erikjv edited the summary of this revision. https://reviews.llvm.org/D33765 Files: include/clang/Basic/SourceManager.h lib/Basic/SourceManager.cpp test/Misc/diag-utf8.cpp Index: test/Misc/diag-utf8.cpp ===

Re: [PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-06-05 Thread David Blaikie via cfe-commits
Is it right to only change the behavior of this caller? Presumably other callers (like getSpellingColumnNumber, getExpansionColumnNumber, etc) probably want the same handling? Do any callers /not/ want this behavior? On Thu, Jun 1, 2017 at 3:14 AM Erik Verbruggen via Phabricator via cfe-commits w

[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-06-01 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment. Correctly counting columns is a bit more complicated than that... for example, consider what happens if you replace `ideëen` with `idez̈en`. See https://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display-width-in-columns-of-unicode-characters . ht
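To illustrate the point about `idez̈en`: the `z̈` there is two code points (`z` followed by the combining diaeresis U+0308), so counting code points gives one more than the number of columns a terminal would typically show. The sketch below is not from the patch. The helper names `decodeOne`, `countCodePoints` and `approxDisplayWidth` are hypothetical, the input is assumed to be well-formed UTF-8, and treating only U+0300..U+036F as zero-width is a deliberate simplification; real width computation needs full Unicode tables.

```cpp
#include <cstddef>
#include <string_view>

// Decode one code point starting at text[i] and advance i past it.
// Assumes well-formed UTF-8; no validation is attempted.
static char32_t decodeOne(std::string_view text, std::size_t &i) {
    unsigned char lead = static_cast<unsigned char>(text[i++]);
    int extra = 0;
    char32_t cp = lead;
    if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
    else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
    while (extra-- > 0 && i < text.size())
        cp = (cp << 6) | (static_cast<unsigned char>(text[i++]) & 0x3F);
    return cp;
}

// Number of code points in text.
std::size_t countCodePoints(std::string_view text) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size();) {
        decodeOne(text, i);
        ++n;
    }
    return n;
}

// Rough display width: every code point counts as one column, except
// the Combining Diacritical Marks block (U+0300..U+036F), which counts
// as zero. This is a simplification for illustration only.
std::size_t approxDisplayWidth(std::string_view text) {
    std::size_t width = 0;
    for (std::size_t i = 0; i < text.size();) {
        char32_t cp = decodeOne(text, i);
        bool combining = cp >= 0x0300 && cp <= 0x036F;
        if (!combining)
            ++width;
    }
    return width;
}
```

With `"idez\xCC\x88" "en"` (the string literal is split so the hex escape does not swallow the following `e`), `countCodePoints` returns 7 while `approxDisplayWidth` returns 6; for the precomposed `"ide\xC3\xAB" "en"` both return 6.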

[PATCH] D33765: Show correct column nr. when multi-byte utf8 chars are used.

2017-06-01 Thread Erik Verbruggen via Phabricator via cfe-commits
erikjv created this revision. Previously, the column number in a diagnostic would be the byte position in the line. This results in incorrect column numbers when a multi-byte UTF-8 character is present in the input. By ignoring all bytes starting with 0b10 the correct column number is cr
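The idea described in this summary, skipping bytes whose top two bits are `10` (UTF-8 continuation bytes), can be sketched as follows. This is not the code from the revision: `columnFromByteOffset` is a hypothetical helper, it takes a 0-based byte offset and returns a 1-based column, and it assumes the line is valid UTF-8.

```cpp
#include <cstddef>
#include <string_view>

// Convert a 0-based byte offset within a line into a 1-based column
// number by counting only bytes that start a UTF-8 sequence. Bytes of
// the form 10xxxxxx are continuation bytes and are skipped.
std::size_t columnFromByteOffset(std::string_view line, std::size_t byteOffset) {
    std::size_t column = 1;
    for (std::size_t i = 0; i < byteOffset && i < line.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if ((c & 0xC0) != 0x80)
            ++column;
    }
    return column;
}
```

For the line `"ide\xC3\xAB" "en"` (that is, `ideëen`), the final `n` sits at byte offset 6. A byte-based count would report column 7, whereas `columnFromByteOffset` reports column 6. Note that this counts code points, not display columns, so it does not account for combining or double-width characters.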