sammccall added inline comments.

================
Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:801
+  if (IsSystem)
+    return "<" + ShorterInclude + ">";
   // Standard case: just insert the file itself.
----------------
This is a great heuristic, now I'm thinking of all the ways it can go wrong...

---

It could be slow: it's doing a string comparison for every include path entry 
(not just the -isystems) and that can be a long list for big codebases. We're 
calling this for most symbols we encounter, so in principle this looks 
quadratic in codebase size when indexing a preamble.

It's tempting to throw a cache on it but actually... the callsite looks like:
```
foreach header:
  foreach symbol:
    adjust(getIncludeHeader(symbol, header));
```

getIncludeHeader() is entirely dependent on the header almost ignores the 
symbol, so hoisting it out of the loop would be as good as caching it.

The wrinkle is the std::move hack, but we can move *that* part alone inside the 
loop instead.

---

It could fail to translate properly between paths, being inserted where the 
-isystem wasn't available. Often this means the *library* isn't available in 
this part of the codebase, so really the problem is offering the symbol at all 
(though using verbatim headers may make this harder to fix). Dynamic indexes 
spanning multiple projects are another case, but I think a marginal one. I 
don't think we need to address this.

---

It may give inconsistent results between TUs for the same symbol.
When a definition is indexed, it should win, so the problematic cases are:
 - header-only symbols (or no definition indexed) with inconsistent -isystem. 
This is not terribly important, as above I think.
 - definition is missing -isystem, because *within* the library headers are not 
considered "system". This seems likely to be common, and I'm not sure how to 
address it (other than giving verbatim headers priority always, which seems 
dubious). It's not a regression though (just the bug doesn't get fixed for 
these symbols.)
 - definition has -isystem (library impl considers its own headers as system) 
but other files don't. Also not sure how to solve this but it seems less likely 
to be common to me.

One design around these problems would be to store *both* a preferred spelling 
and a filename, and to use the spelling if it can be resolved. But this is 
invasive, means doing stat()s, doesn't work well with genfiles... I'm not 
really sure of a clean solution here.

Maybe we just move ahead as-is and see if we hit problems...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98242/new/

https://reviews.llvm.org/D98242

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to