ArcsinX added inline comments.

================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:292
+    for (const auto &Sym : *Slab) {
+      if (Sym.Definition)
+        Files.insert(Sym.Definition.FileURI);
----------------
kadircet wrote:
> ArcsinX wrote:
> > kadircet wrote:
> > > it feels weird to choose one or the other here, why not just insert both 
> > > (after checking if definition exists, ofc).
> > > 
> > > We are likely to have a merged symbol information anyway, and the logic 
> > > here will likely result in no index owning the header files, unless there 
> > > are some symbols *defined* (not just declared) in it.
> > > 
> > > This will likely result in some overshooting as knowing about a symbols 
> > > declaration location doesn't mean indexing that particular file. But so 
> > > does the current version, as knowing about definition location might be 
> > > because of a merged symbol, rather than indexing of particular file.
> > I will try to explain on example:
> > 
> > - **test.h**
> > ```
> > void f();
> > ```
> > - **test.c**
> > ```
> > #include "test.h"
> > void f() { }
> > ```
> > - compile_commands.json contains a command for **test.c** compilation.
> > 
> > Scenario:
> > - open **test.c**
> > - try to find all references for `f()`
> > 
> > For that scenario result for `find all references` will be incorrect if 
> > both (decl and def) files are in the file list because:
> > - decl location is inside **test.h**
> > - def location is inside **test.c**
> > - the file list for the main file index contains **test.h** and **test.c**
> > - the main file index does not contain references from **test.h**
> > - the background (static) index contains references from **test.c**, but 
> > results from the background index will be skipped, because **test.h** is in 
> > the main file (dynamic) index file list.
> Ah i see. This all seems really fragile :/ We might as well have something 
> like:
> 
> a.h:
> ```
> struct Bar;
> ```
> 
> a.cc:
> ```
> #include "a.h"
> struct Bar;
> ```
> 
> and now as soon as you open a.cc, all the results from a.h will be gone, 
> because canonical declaration location for `Bar` will be in `a.h` and there 
> is no definition. (and i believe this kind of forward-decl madness is quite 
> common in at least LLVM). 
> Even worse, the same will also happen even if you have definition in the 
> header, but a forward decl in the main file, so even accepting the file for 
> definition wouldn't be enough.
> 
> It is currently (i.e. without this patch) working as expected, as main file 
> index only owns the information for the "main file" indexing was initiated 
> for. This feels like a big regression to me (that I didn't notice initially, 
> sorry for that), but I am ready to be convinced otherwise :D
> 
> As Sam mentioned what you do here and partitioning logic in FileShardedIndex 
> is quite similar (yours undershoot, sharding logic overshoots) but in the 
> sharding process we split indexing result of a full TU/preamble, and later on 
> those shards will always be used in a merged fashion (e.g. when a.cc and 
> b.cc, both including a.h gets indexed, the shard produced for a.h from b.cc 
> won't contain any definition locations, but the shard produced for a.cc will 
> know about those symbols definition locations and in a merged view those 
> symbols will have all the necessary information.), whereas in here the 
> resulting information is used in isolation (main file isn't merged with 
> preamble symbols, but mixes Files view). Hence causing such regressions.
> 
> I think the proper thing to do here is to propagate relevant files with slabs 
> on `FileSymbols::update`. What you do here and sharding isn't very different. 
> The question is should we have a:
> - string File
> - optional<string> File
> - vector<string> Files
> 
> Currently we always have exactly one File associated with all of those slabs 
> as we either:
> - always do sharding (preamble and background idx)
> - even though symbol informations tell otherwise, we've only processed a 
> single file (main file idx)
> 
> We would need 3rd option if we were to use filesymbols with a monolithic 
> index, but we don't. And even if we need such a thing in future it shouldn't 
> be a huge change hopefully.
> 
> 
> Sorry for the wall of text, I hope I do make sense, but please tell me if I 
> misunderstood/missing something.
> It is currently (i.e. without this patch) working as expected, as main file 
> index only owns the information for the "main file" indexing was initiated 
> for. This feels like a big regression to me (that I didn't notice initially, 
> sorry for that), but I am ready to be convinced otherwise :D

Currently it works mostly as expected. The only thing which worries me is the 
preamble index update: the first parameter passed to `PreambleSymbols.update()` 
is an URI, but we expect a file path there. Thus, the file list of the preamble 
index contains URIs, which seems incorrect and `PreambleIndex::indexedFiles()` 
always returns `IndexContents::None`.

In other words, we need to make the file list to contains always file paths or 
always URIs (current situation: the preamble index contains URIs, other indexes 
contain file paths). I thought that the best solution is to convert URI to file 
path before `PreambleSymbols.update()` call, but after D94952#inline-892421 
discussion I was no sure about my solution, and seems the solution introduces 
in this patch also inacceptable.

Currently I am not sure what is the best way to solve this problem =(
 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97535/new/

https://reviews.llvm.org/D97535

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to