ioeric added a comment. Thanks a lot for further cleaning up the patch! It is now much easier to review. I really appreciate it!
Some more comments on the public APIs and the layering of classes. There are a lot of helper classes in the implementation, so I think it's important to get a clear layering so that they could be easily understood by future contributors :) Also, with the `IndexUnitDataConsumer` abstraction, it seems to be feasible now to add some unit tests for `createUnitIndexingAction`. With all the recent major changes, I think it's important that we have some degree of testing to make sure components actually work together. ================ Comment at: include/clang/Frontend/CompilerInstance.h:187 + typedef std::function<std::unique_ptr<FrontendAction>( + const FrontendOptions &opts, std::unique_ptr<FrontendAction> action)> + ActionWrapperTy; ---------------- ioeric wrote: > nit: LLVM variable names start with upper-case letters. `opts` and `action` are still lower-case. ================ Comment at: include/clang/Index/DeclOccurrence.h:38 + } + }; + ---------------- Nit: indentation. Tip: `git-clang-format` against the diff base can format all changed lines in your patch. ================ Comment at: include/clang/Index/IndexUnitDataConsumer.h:1 +//===--- IndexUnitDataConsumer.h - Abstract index unit data consumer ---------------===// +// ---------------- IIUC, this is the index data for a translation unit, as opposed to an AST. If so, consider calling this `UnitIndexDataConsumer` to match `(AST)IndexDataConsumer` which is parallel to this. We might want to rename them to be either `index::UnitDataConsumer` vs `index::ASTDataConsumer` or `index::UnitIndexDataConsumer` vs `index::ASTIndexDataConsumer` . I am inclined to the first pair as `index` is already implied in the namespace. ================ Comment at: include/clang/Index/IndexUnitDataConsumer.h:67 +private: + virtual void _anchor(); +}; ---------------- Comment? Why do we actually need this? ================ Comment at: include/clang/Index/IndexingAction.h:49 +struct UnitIndexingOptions : IndexingOptions { + enum class FileIncludeFilterKind { ---------------- We are now mixing functionalities for Unit indexing and AST indexing actions in the same file. We might want to separate these into two headers e..g `UnitIndexingAction.h` and `ASTIndexingAction.h`. This would make it easier for users to find the right functions :) ================ Comment at: include/clang/Index/IndexingAction.h:66 +/// Information on the translation unit +struct UnitDetails { + Module *UnitModule; ---------------- Please add documentation for each field. It's not trivial what each field is for, especially some fields seem to be optional and some seem to be mutually exclusive. ================ Comment at: include/clang/Index/IndexingAction.h:67 +struct UnitDetails { + Module *UnitModule; + std::string ModuleName; ---------------- These pointers suggest the life time of this struct is tied to some other struct, which makes the struct look a bit dangerous to use. Should we also carry a reference or a smart pointer to the underlying object that keeps these pointers valid? Would it be a `CompilerInstance ` (guessing from `IndexUnitDataConsumerFactory `)? ================ Comment at: include/clang/Index/IndexingAction.h:114 +std::unique_ptr<FrontendAction> +createUnitIndexingAction(const UnitIndexingOptions &IndexOpts, + IndexUnitDataConsumerFactory ConsumerFactory, ---------------- What is the intended user of this function? It's unclear how users could obtain a `ConsumerFactory ` (i.e. `UnitDetails`) without the functionalities in `UnitDataConsumerActionImpl `. (Also see comment in the implementation of `createIndexDataRecordingAction`.) ================ Comment at: include/clang/Index/IndexingAction.h:128 +RecordingOptions +getRecordingOptionsFromFrontendOptions(const FrontendOptions &FEOpts); + ---------------- This is likely only useful for compiler invocation. I would put it in the compiler invocation code. ================ Comment at: lib/Driver/Job.cpp:211 +/// The leftover modules from the crash are stored in +/// <name>.cache/vfs/modules ---------------- nit: Comment should start with an overview of what the function does. ``` Returns a directory path that is ... ``` Also, consider calling this `getDirAdjacentToModCache`. `buildDir` can be ambiguous. ================ Comment at: lib/Driver/Job.cpp:220 + llvm::SmallString<128> RelModCacheDir = llvm::sys::path::parent_path( + llvm::sys::path::parent_path(CrashInfo->VFSPath)); + llvm::sys::path::append(RelModCacheDir, DirName); ---------------- Please clang-format the code. Without indentation, this looks like an no-op statement. ================ Comment at: lib/Index/IndexingAction.cpp:88 +/// \c WrappingIndexAction frontend actions. +struct IndexActionImpl { + virtual ~IndexActionImpl() = default; ---------------- Use `class` for interfaces. ================ Comment at: lib/Index/IndexingAction.cpp:99 + /// Callback at the end of processing a single input. + virtual void finish(CompilerInstance &CI) = 0; }; ---------------- Does `CI` here have to be the same instance as the one in `createIndexASTConsumer `? Might worth documenting. ================ Comment at: lib/Index/IndexingAction.cpp:137 + + CreatedASTConsumer = true; + std::vector<std::unique_ptr<ASTConsumer>> Consumers; ---------------- nit: Move this after `Impl->createIndexASTConsumer(CI)`. Do we need to reset this flag? Calling `CreateASTConsumer ` multiple times on the same instance seems to be allowed? ================ Comment at: lib/Index/IndexingAction.cpp:243 +/// Collects and groups consumed index data by \c FileID. +class IndexDataCollector : public IndexDataConsumer { + Preprocessor *PP = nullptr; ---------------- This seems to be related to files. Maybe `FileIndexDataCollector`? ================ Comment at: lib/Index/IndexingAction.cpp:250 +public: + void setPreprocessor(Preprocessor &PreProc) { + PP = &PreProc; ---------------- `override` ================ Comment at: lib/Index/IndexingAction.cpp:254 + + IndexDataByFileTy::const_iterator by_file_begin() const { + return IndexDataByFile.begin(); ---------------- Simply `begin`, if the class is called `FileIndexDataCollector `. Similar below to match iterator naming convention. ================ Comment at: lib/Index/IndexingAction.cpp:265 +private: + bool handleDeclOccurence(const Decl *D, SymbolRoleSet Roles, + ArrayRef<SymbolRelation> Relations, FileID FID, ---------------- I think this should be `public` as this is still implementing `IndexDataConsumer`. ================ Comment at: lib/Index/IndexingAction.cpp:323 + llvm_unreachable("should have already checked in the beginning"); + case UnitIndexingOptions::FileIncludeFilterKind::UserOnly: + if (SystemCache.isSystem(LocInfo.first, SourceMgr)) ---------------- I'd simply do: ``` if FileIncludeFilter == UnitIndexingOptions::FileIncludeFilterKind::UserOnly) if (isSystem...) return; ``` ================ Comment at: lib/Index/IndexingAction.cpp:337 + + virtual void InclusionDirective(SourceLocation HashLoc, + const Token &IncludeTok, StringRef FileName, ---------------- Same here. This should be `public` ================ Comment at: lib/Index/IndexingAction.cpp:354 + + virtual void visitFileDependencies( + const CompilerInstance &CI, ---------------- The naming convention for the callback interfaces is `forEach*` e.g. `forEachFileDependency`. s/visitor/Callback/ (same below). ================ Comment at: lib/Index/IndexingAction.cpp:358 + virtual void + visitIncludes(llvm::function_ref<void(const FileEntry *Source, unsigned Line, + const FileEntry *Target)> ---------------- `forEachInclude` ================ Comment at: lib/Index/IndexingAction.cpp:361 + visitor) const = 0; + virtual void visitModuleImports( + const CompilerInstance &CI, ---------------- `forEachModuleImport` ================ Comment at: lib/Index/IndexingAction.cpp:369 +/// file to file inclusions, for the source files in a translation unit +class SourceFilesIndexDependencyCollector : public DependencyCollector, + public IndexDependencyProvider { ---------------- This is two classes in one, which is difficult to understand. Could you split it into `FileIndexDependencyCollector ` and `FileIndexDependencyProvider` and have `FileIndexDependencyCollector` returns a provider on finish (e.g. `Provider consume();`; you might want to copy/move the collected data into the provider). It would be easier to justify the behavior (e.g. what happens when you access the provider while collector is still working?) ================ Comment at: lib/Index/IndexingAction.cpp:373 + UnitIndexingOptions IndexOpts; + llvm::SetVector<const FileEntry *> Entries; + llvm::BitVector IsSystemByUID; ---------------- What does `Entries` contain? What files are added? ================ Comment at: lib/Index/IndexingAction.cpp:501 + /// IndexUnitDataConsumer constructed from the \c UnitConsumerFactory if the + /// \c ParentUnitConsumer indicates \c Mod should be indexed. + /// ---------------- Instead of passing `ParentUnitConsumer`, consider checking the `Mod` before calling the function. ================ Comment at: lib/Index/IndexingAction.cpp:504 + /// \returns true if \c Mod was indexed + static bool indexModule( + const CompilerInstance &CI, serialization::ModuleFile &Mod, ---------------- Non-factory static method is often a code smell. Any reason not to make these static methods private members? With that, you wouldn't need to pass along so many parameters. You could make them `const` if you don't want members to be modified. ================ Comment at: lib/Index/IndexingAction.cpp:511 + /// Get unit details for the given module file + static UnitDetails getUnitDetails(serialization::ModuleFile &Mod, + const CompilerInstance &CI, ---------------- Why is this overload public while others are private? Aren't they all used only in this class? ================ Comment at: lib/Index/IndexingAction.cpp:531 +}; +} // anonymous namespace + ---------------- Any reason to close the anonymous namespace here? Shouldn't outlined definitions of `UnitDataConsumerActionImpl`'s methods also in the anonymous namespace? ================ Comment at: lib/Index/IndexingAction.cpp:758 + + class IndexUnitDataRecorder : public IndexUnitDataConsumer { + public: ---------------- I think the inheritance of `IndexUnitDataConsumer` and the creation of factory should be in user code (e.g. implementation for on-disk persist-index-data should come from the compiler invocation code `ExecuteCompilerInvocation.cpp` or at least a separate file in the library that compiler invocation can use), and the user should only use `createUnitIndexingAction` by providing a factory. Currently, `createUnitIndexingAction` and `createIndexDataRecordingAction` are mostly identical except for the code that implements `IndexUnitDataConsumer ` and creates the factory. The current `createIndexDataRecordingAction` would probably only used by the compiler invocation, and we can keep the generalized `createUnitIndexingAction` in the public APIs. ================ Comment at: lib/Index/IndexingAction.cpp:765 + auto ConsumerFactory = + [&](const CompilerInstance &CI, UnitDetails UnitInfo) -> + std::unique_ptr<IndexUnitDataConsumer> { ---------------- The `UnitInfo` is ignored? What do we actually need it for? ================ Comment at: lib/Index/IndexingAction.cpp:769 + }; + auto Base = llvm::make_unique<UnitDataConsumerActionImpl>( + std::move(RecordOpts), ConsumerFactory); ---------------- `Base` doesn't seem to be a very meaningful name here. https://reviews.llvm.org/D39050 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits