ArcsinX added inline comments.
================ Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:292 + for (const auto &Sym : *Slab) { + if (Sym.Definition) + Files.insert(Sym.Definition.FileURI); ---------------- kadircet wrote: > ArcsinX wrote: > > kadircet wrote: > > > it feels weird to choose one or the other here, why not just insert both > > > (after checking if definition exists, ofc). > > > > > > We are likely to have a merged symbol information anyway, and the logic > > > here will likely result in no index owning the header files, unless there > > > are some symbols *defined* (not just declared) in it. > > > > > > This will likely result in some overshooting as knowing about a symbols > > > declaration location doesn't mean indexing that particular file. But so > > > does the current version, as knowing about definition location might be > > > because of a merged symbol, rather than indexing of particular file. > > I will try to explain on example: > > > > - **test.h** > > ``` > > void f(); > > ``` > > - **test.c** > > ``` > > #include "test.h" > > void f() { } > > ``` > > - compile_commands.json contains a command for **test.c** compilation. > > > > Scenario: > > - open **test.c** > > - try to find all references for `f()` > > > > For that scenario result for `find all references` will be incorrect if > > both (decl and def) files are in the file list because: > > - decl location is inside **test.h** > > - def location is inside **test.c** > > - the file list for the main file index contains **test.h** and **test.c** > > - the main file index does not contain references from **test.h** > > - the background (static) index contains references from **test.c**, but > > results from the background index will be skipped, because **test.h** is in > > the main file (dynamic) index file list. > Ah i see. This all seems really fragile :/ We might as well have something > like: > > a.h: > ``` > struct Bar; > ``` > > a.cc: > ``` > #include "a.h" > struct Bar; > ``` > > and now as soon as you open a.cc, all the results from a.h will be gone, > because canonical declaration location for `Bar` will be in `a.h` and there > is no definition. (and i believe this kind of forward-decl madness is quite > common in at least LLVM). > Even worse, the same will also happen even if you have definition in the > header, but a forward decl in the main file, so even accepting the file for > definition wouldn't be enough. > > It is currently (i.e. without this patch) working as expected, as main file > index only owns the information for the "main file" indexing was initiated > for. This feels like a big regression to me (that I didn't notice initially, > sorry for that), but I am ready to be convinced otherwise :D > > As Sam mentioned what you do here and partitioning logic in FileShardedIndex > is quite similar (yours undershoot, sharding logic overshoots) but in the > sharding process we split indexing result of a full TU/preamble, and later on > those shards will always be used in a merged fashion (e.g. when a.cc and > b.cc, both including a.h gets indexed, the shard produced for a.h from b.cc > won't contain any definition locations, but the shard produced for a.cc will > know about those symbols definition locations and in a merged view those > symbols will have all the necessary information.), whereas in here the > resulting information is used in isolation (main file isn't merged with > preamble symbols, but mixes Files view). Hence causing such regressions. > > I think the proper thing to do here is to propagate relevant files with slabs > on `FileSymbols::update`. What you do here and sharding isn't very different. > The question is should we have a: > - string File > - optional<string> File > - vector<string> Files > > Currently we always have exactly one File associated with all of those slabs > as we either: > - always do sharding (preamble and background idx) > - even though symbol informations tell otherwise, we've only processed a > single file (main file idx) > > We would need 3rd option if we were to use filesymbols with a monolithic > index, but we don't. And even if we need such a thing in future it shouldn't > be a huge change hopefully. > > > Sorry for the wall of text, I hope I do make sense, but please tell me if I > misunderstood/missing something. > It is currently (i.e. without this patch) working as expected, as main file > index only owns the information for the "main file" indexing was initiated > for. This feels like a big regression to me (that I didn't notice initially, > sorry for that), but I am ready to be convinced otherwise :D Currently it works mostly as expected. The only thing which worries me is the preamble index update: the first parameter passed to `PreambleSymbols.update()` is an URI, but we expect a file path there. Thus, the file list of the preamble index contains URIs, which seems incorrect and `PreambleIndex::indexedFiles()` always returns `IndexContents::None`. In other words, we need to make the file list to contains always file paths or always URIs (current situation: the preamble index contains URIs, other indexes contain file paths). I thought that the best solution is to convert URI to file path before `PreambleSymbols.update()` call, but after D94952#inline-892421 discussion I was no sure about my solution, and seems the solution introduces in this patch also inacceptable. Currently I am not sure what is the best way to solve this problem =( Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97535/new/ https://reviews.llvm.org/D97535 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits