[PATCH] D108194: [clangd] IncludeCleaner: Mark used headers

Sam McCall via Phabricator via cfe-commits Wed, 18 Aug 2021 03:23:44 -0700

sammccall added a comment.

I realize many of the things I'll object to came from my own prototype, sorry 
about that :-\
I think/hope I gave you some forewarning about this!




================
Comment at: clang-tools-extra/clangd/Headers.h:137
+  // Maps including files (from) to included files (to).
+  using AbstractIncludeGraph = llvm::DenseMap<unsigned, SmallVector<unsigned>>;
+  // Decides usage for each file included by EntryPoint based on the set of
----------------
Hey, I recognize this code :-)

I think the key ideas here are that:
 - we're using opaque identifiers for the files, nothing is interesting about a 
file other than its (include) edges and (directly-referenced) color
 - the identity of headers is an impl detail of Headers.h rather than being 
something like a FileID, this allows us to hide messy details of files not 
having stable identity across preamble->main file

I think the second idea is important, but the first one might be a bit naive. I 
worry it's going to lead to certain rules being hard to implement, or being 
bundled into Headers.cpp instead of IncludeCleaner.

For example:
 - if a file is not self-contained, how does this affect the algorithm? 
(There's a FIXME for this, but it's in the wrong file!)
 - if a file is a standard library entrypoint?
 - if a file is a standard library impl detail?

I think the facts (is a file self-contained) should be part of 
IncludeStructure, but that we should expose them for IncludeCleaner to deal 
with, rather than trying to hide them in `markUsed`.
This means giving IncludeStructure a wider interface, which is we need to be 
careful about. It makes it harder to substitute algorithms by swapping out the 
UsedFunc, but I think this is only a cute trick and not actually important.

---

Concretely, I think I'd suggest just extending the public API to expose the 
"file index" concept:
 - expose the type `using IncludeStructure::File = unsigned` or so
 - add the file id to `Inclusion`
 - add File getFile(const FileEntry*)
 - add `ArrayRef<File> getIncludedFiles(File)`
 - in future, we can add e.g. `const char* isStandardLibraryEntrypoint(File)` 
or whatever

And implement all of markUsed in IncludeCleaner.
(Not 100% sure if we actually still need the `Used` ivar in Inclusion, come to 
think of it, maybe we just run this code in the diagnostic cycle)


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:144
+  std::vector<SourceLocation> Sorted{Locs.begin(), Locs.end()};
+  // Group by FileID.
+  llvm::sort(Sorted);
----------------
nit: move this to be a line comment on the sort() call?

I think it's sufficiently nonobvious that sorting groups by file ID that it 
becomes nonobvious exactly what code the comment refers to!


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.h:50
+/// Retrieves IDs of all files containing SourceLocations from \p Locs. Those
+/// locations could be within macro expansions and are not resolved to their
+/// spelling locations.
----------------
so when *do* we perform this expansion?

Seems like you've wired this up end-to-end in this patch and we're just going 
to hit the elog case.

I think it's reasonable to put this expansion into findReferencedFiles fwiw, 
it's a fairly simple second pass and in practice combining them won't interfere 
with reasonable tests


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.h:51
+/// locations could be within macro expansions and are not resolved to their
+/// spelling locations.
+llvm::DenseSet<FileID> findReferencedFiles(const ReferencedLocations &Locs,
----------------
comment just says spelling, not spelling/expansion, not sure if this is 
significant.

Expansion is actually the more obvious, but I do think we need both.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108194/new/

https://reviews.llvm.org/D108194

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D108194: [clangd] IncludeCleaner: Mark used headers

Reply via email to