Timm Bäder <tbae...@redhat.com>
In-Reply-To: <llvm/llvm-project/pull/66514/cl...@github.com>


================
@@ -0,0 +1,77 @@
+
+#include "clang/Frontend/CodeSnippetHighlighter.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/PreprocessorOptions.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace clang;
+
+static SourceManager createTempSourceManager() {
+  FileSystemOptions FileOpts;
+  FileManager FileMgr(FileOpts);
+  llvm::IntrusiveRefCntPtr<DiagnosticIDs> DiagIDs(new DiagnosticIDs());
+  llvm::IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts(new DiagnosticOptions());
+  DiagnosticsEngine diags(DiagIDs, DiagOpts);
+  return SourceManager(diags, FileMgr);
+}
+
+static Lexer createTempLexer(llvm::MemoryBufferRef B, SourceManager &FakeSM,
+                             const LangOptions &LangOpts) {
+  return Lexer(FakeSM.createFileID(B), B, FakeSM, LangOpts);
+}
+
+std::vector<StyleRange> CodeSnippetHighlighter::highlightLine(
+    StringRef SourceLine, const Preprocessor *PP, const LangOptions &LangOpts) {
+  if (!PP)
+    return {};
+  constexpr raw_ostream::Colors CommentColor = raw_ostream::BLACK;
+  constexpr raw_ostream::Colors LiteralColor = raw_ostream::GREEN;
+  constexpr raw_ostream::Colors KeywordColor = raw_ostream::YELLOW;
+
+  SourceManager FakeSM = createTempSourceManager();
+  const auto MemBuf = llvm::MemoryBuffer::getMemBuffer(SourceLine);
+  Lexer L = createTempLexer(MemBuf->getMemBufferRef(), FakeSM, LangOpts);
+  L.SetKeepWhitespaceMode(true);
----------------
zygoloid wrote:

Yes, I think those are the three cases we can (currently) encounter.

For multi-line comments: all our `-Wdoxygen` warnings will fire in the middle 
of multi-line comments. I don't think we want to turn off the highlighting in 
those cases. We also don't know that on a line containing `foo /* bar */ baz`, 
`foo` is not part of the block comment. It'd be valid for there to be a `/*` on 
a previous line. We do detect and warn on `/*` within a `/*...*/` comment, and 
we could perhaps keep track of the places where that happens. I'm not sure we 
warn on `//` within a `/*...*/` comment, which has similar issues.
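
To make the ambiguity concrete, here is a minimal standalone sketch (plain C++, not Clang code). The helper `commentMask` and its `StartsInBlockComment` parameter are hypothetical names invented for this illustration, and the scanner is deliberately simplified: it only knows `/*...*/` and `//`, and ignores string literals. It shows that the same line is classified differently depending on whether a `/*` was left open on an earlier line, which is exactly the state a single-line lexer does not have.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Clang: marks each character of `Line`
// with 'c' if it lies inside a comment and '.' otherwise. Only `/*...*/` and
// `//` are recognized; string literals are ignored for brevity.
std::string commentMask(const std::string &Line, bool StartsInBlockComment) {
  std::string Mask(Line.size(), '.');
  bool InBlock = StartsInBlockComment;
  for (std::size_t I = 0; I < Line.size();) {
    if (InBlock) {
      if (Line.compare(I, 2, "*/") == 0) {
        // The closing delimiter still belongs to the comment.
        Mask[I] = Mask[I + 1] = 'c';
        I += 2;
        InBlock = false;
      } else {
        // Anything else, including a nested `/*` or `//`, is comment text.
        Mask[I++] = 'c';
      }
    } else if (Line.compare(I, 2, "/*") == 0) {
      Mask[I] = Mask[I + 1] = 'c';
      I += 2;
      InBlock = true;
    } else if (Line.compare(I, 2, "//") == 0) {
      // Line comment: everything to the end of the line is comment text.
      for (; I < Line.size(); ++I)
        Mask[I] = 'c';
    } else {
      ++I;
    }
  }
  return Mask;
}
```

Calling `commentMask("foo /* bar */ baz", false)` marks only `/* bar */` as a comment, while passing `true` marks everything up to and including the `*/` as a comment and leaves only ` baz` as code. Without knowing the state at the start of the line, either answer could be the right one.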

It might be reasonable to require that any time a diagnostic is produced with a 
caret location within a comment or a string literal, the diagnostic must be 
informed of that fact. Possibly we could require that the caret location is 
either a raw token location (that is, it points to a location that we know we 
can lex forward from), or a raw token location plus an offset from the start of 
the token (for diagnostics within comments and strings)? That would at least 
allow us to highlight reliably from the caret location forwards, but scanning 
backwards to find the start of a comment or string would still not really be 
possible in general. We could approximate it with heuristics, but that's 
imperfect.
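
As a rough sketch of that idea (again plain C++ rather than Clang's actual `SourceLocation` machinery), a caret location could be modelled as a token start plus an offset. `CaretLoc`, `tokenEnd`, and `caretColumn` are hypothetical names used only for this illustration, and the lexing rules are heavily simplified. The point is that lexing forward from a known token start gives a reliable extent for the comment or string containing the caret, whereas scanning from the caret itself can misread the contents of a string as a comment.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical representation of a caret location: the start of a raw token
// plus an offset into that token.
struct CaretLoc {
  std::size_t TokenStart;    // Offset we know we can lex forward from.
  std::size_t OffsetInToken; // 0 for ordinary tokens; > 0 inside comments/strings.
};

// Returns one past the last character of the token beginning at `Start`.
// Precondition: Start < Line.size().
// Simplified rules, assumed for illustration: comments, double-quoted strings
// with backslash escapes, identifiers/numbers, and single characters.
std::size_t tokenEnd(const std::string &Line, std::size_t Start) {
  if (Line.compare(Start, 2, "//") == 0)
    return Line.size();
  if (Line.compare(Start, 2, "/*") == 0) {
    std::size_t Close = Line.find("*/", Start + 2);
    // An unterminated block comment continues past the end of this line.
    return Close == std::string::npos ? Line.size() : Close + 2;
  }
  if (Line[Start] == '"') {
    std::size_t I = Start + 1;
    while (I < Line.size() && Line[I] != '"')
      I += (Line[I] == '\\') ? 2 : 1; // Skip the escaped character as well.
    return I < Line.size() ? I + 1 : Line.size();
  }
  std::size_t I = Start;
  while (I < Line.size() &&
         (std::isalnum(static_cast<unsigned char>(Line[I])) || Line[I] == '_'))
    ++I;
  return I == Start ? Start + 1 : I; // Punctuation is a one-character token.
}

// Column of the caret within the line.
std::size_t caretColumn(const CaretLoc &Loc) {
  return Loc.TokenStart + Loc.OffsetInToken;
}
```

For the line `x = "a // b"; // note`, a caret described as `CaretLoc{4, 3}` points at the `//` inside the string literal. Starting from `TokenStart`, `tokenEnd` finds that the string ends at column 12, so the `//` at the caret is not mistaken for the start of a comment. Nothing here helps with finding the start of a token from an arbitrary column, which is the backwards-scanning problem described above.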

So I suppose part of what we need to decide here is how much imperfection we're 
OK with. I think this highlighting will become important feedback to developers 
to help them see how Clang is interpreting their code, so I think it's 
important that the highlighting is reliable. If the highlighting is weird / 
wrong, the developer will assume that Clang is interpreting the code in that 
weird / wrong way.

https://github.com/llvm/llvm-project/pull/66514
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
