Timm Bäder <[email protected]>
In-Reply-To: <llvm/llvm-project/pull/66514/[email protected]>
================
@@ -0,0 +1,77 @@
+
+#include "clang/Frontend/CodeSnippetHighlighter.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/PreprocessorOptions.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace clang;
+
+static SourceManager createTempSourceManager() {
+ FileSystemOptions FileOpts;
+ FileManager FileMgr(FileOpts);
+ llvm::IntrusiveRefCntPtr<DiagnosticIDs> DiagIDs(new DiagnosticIDs());
+ llvm::IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts(new DiagnosticOptions());
+ DiagnosticsEngine diags(DiagIDs, DiagOpts);
+ return SourceManager(diags, FileMgr);
+}
+
+static Lexer createTempLexer(llvm::MemoryBufferRef B, SourceManager &FakeSM,
+ const LangOptions &LangOpts) {
+ return Lexer(FakeSM.createFileID(B), B, FakeSM, LangOpts);
+}
+
+std::vector<StyleRange> CodeSnippetHighlighter::highlightLine(
+ StringRef SourceLine, const Preprocessor *PP, const LangOptions &LangOpts) {
+ if (!PP)
+ return {};
+ constexpr raw_ostream::Colors CommentColor = raw_ostream::BLACK;
+ constexpr raw_ostream::Colors LiteralColor = raw_ostream::GREEN;
+ constexpr raw_ostream::Colors KeywordColor = raw_ostream::YELLOW;
+
+ SourceManager FakeSM = createTempSourceManager();
+ const auto MemBuf = llvm::MemoryBuffer::getMemBuffer(SourceLine);
+ Lexer L = createTempLexer(MemBuf->getMemBufferRef(), FakeSM, LangOpts);
+ L.SetKeepWhitespaceMode(true);
----------------
zygoloid wrote:
Yes, I think those are the three cases we can (currently) encounter.

For multi-line comments: all our `-Wdoxygen` warnings will fire in the middle
of multi-line comments. I don't think we want to turn off the highlighting in
those cases. We also don't know that on a line containing `foo /* bar */ baz`,
`foo` is not part of the block comment. It'd be valid for there to be a `/*` on
a previous line. We do detect and warn on `/*` within a `/*...*/` comment, and
we could perhaps keep track of the places where that happens. I'm not sure we
warn on `//` within a `/*...*/` comment, which has similar issues.
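
To make the ambiguity concrete, here is a minimal sketch of a per-line block-comment scanner. `markBlockComment` is a hypothetical helper, not anything in Clang or in this patch, and it deliberately ignores string literals and `//` comments. It only illustrates that the same line text yields different results depending on one bit of state that the line alone does not carry.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not a Clang API): for each character of Line, report
// whether it lies inside a /*...*/ comment. StartsInComment says whether an
// unterminated "/*" from an earlier line is still open when Line begins.
// String literals and "//" comments are ignored to keep the sketch short.
std::vector<bool> markBlockComment(std::string_view Line,
                                   bool StartsInComment) {
  std::vector<bool> InComment(Line.size(), false);
  bool In = StartsInComment;
  for (std::size_t I = 0; I < Line.size(); ++I) {
    if (!In && Line.substr(I, 2) == "/*") {
      // Comment opener: both characters belong to the comment.
      InComment[I] = InComment[I + 1] = true;
      In = true;
      ++I;
      continue;
    }
    if (In && Line.substr(I, 2) == "*/") {
      // Comment closer: both characters belong to the comment.
      InComment[I] = InComment[I + 1] = true;
      In = false;
      ++I;
      continue;
    }
    // Any other character inherits the current state. Note that a "/*"
    // seen while already inside a comment is just comment text.
    InComment[I] = In;
  }
  return InComment;
}
```

For `foo /* bar */ baz`, passing `StartsInComment = false` leaves `foo` unmarked, while passing `true` marks `foo` and the inner `/*` as comment text. Both are valid readings of the line taken in isolation.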

It might be reasonable to require that any time a diagnostic is produced with a
caret location within a comment or a string literal, the diagnostic must be
informed of that fact. Possibly we could require that the caret location is
either a raw token location (that is, it points to a location that we know we
can lex forward from), or a raw token location plus an offset from the start of
the token (for diagnostics within comments and strings)? That would at least
allow us to highlight reliably from the caret location forwards, but scanning
backwards to find the start of a comment or string would still not really be
possible in general. We could approximate it with heuristics, but that's
imperfect.
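
As a rough sketch of the "raw token location plus offset" idea, assuming plain byte offsets into a buffer rather than Clang's `SourceLocation`: `CaretLocation` and `endOfBlockComment` below are hypothetical names invented for illustration. Knowing where the enclosing token starts is what makes the forward scan reliable; nothing here helps with scanning backwards.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical representation (not a Clang type): a caret is described by
// the offset of a raw token start, plus an offset into that token. The
// token start is a position from which lexing forward is known to be valid.
struct CaretLocation {
  std::size_t TokenStart;    // offset of the raw token in the buffer
  std::size_t OffsetInToken; // 0 when the caret is at the token start

  std::size_t caret() const { return TokenStart + OffsetInToken; }
  bool isInsideToken() const { return OffsetInToken != 0; }
};

// Hypothetical helper: assuming the token at Loc.TokenStart is a block
// comment, return the offset just past its closing "*/". Returns
// std::string_view::npos if the token is not a block comment or the
// comment is unterminated.
std::size_t endOfBlockComment(std::string_view Buffer,
                              const CaretLocation &Loc) {
  if (Buffer.substr(Loc.TokenStart, 2) != "/*")
    return std::string_view::npos;
  std::size_t Close = Buffer.find("*/", Loc.TokenStart + 2);
  if (Close == std::string_view::npos)
    return std::string_view::npos;
  return Close + 2;
}
```

Given a caret in the middle of a comment, a highlighter could treat everything from `TokenStart` up to the returned offset as comment text and resume ordinary lexing after it.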

So I suppose part of what we need to decide here is how much imperfection we're
OK with. I think this highlighting will become important feedback to developers
to help them see how Clang is interpreting their code, so I think it's
important that the highlighting is reliable. If the highlighting is weird /
wrong, the developer will assume that Clang is interpreting the code in that
weird / wrong way.

https://github.com/llvm/llvm-project/pull/66514