ChuanqiXu added a comment. In D153114#4603579 <https://reviews.llvm.org/D153114#4603579>, @sammccall wrote:
> In D153114#4602414 <https://reviews.llvm.org/D153114#4602414>, @ChuanqiXu > wrote: > >>> Don't attempt any cross-file or cross-version coordination: i.e. don't try >>> to reuse BMIs between different files, don't try to reuse BMIs between >>> (preamble) reparses of the same file, don't try to persist the module >>> graph. Instead, when building a preamble, synchronously scan for the module >>> graph, build the required PCMs on the single preamble thread with filenames >>> private to that preamble, and then proceed to build the preamble. >> >> Do I understand right? If I understand correctly, I fully agree with the >> direction. We can go slowly, as long as we keep moving forward. >> >> Then I'd like to leave the patch as-is for referencing and create new >> patches following the suggestion. > > Yes, that's the suggestion, and that plan makes sense to me, thanks! > > I did some more thinking about this (having a concrete implementation helps a > lot!) and had a couple more thoughts. > At some point we should write down a design somewhere, need to strike a > balance between doing it early enough to be useful but late enough that we've > understood! Yeah, then let's make the page into some design ideas discussing page. > Dep scanning - roles > -------------------- > > IIUC we do this for two reasons: > > - to identify what module names we must have PCMs for in order to build a > given TU (either an open file, or a module we're building as PCM) > - to build a database mapping module name => filename, which we compose with > the CDB to know how to build a PCM for a given module name > > I think it would be good to clearly separate these. The latter is simpler, > more performance-critical, async, and is probably not used at all if the > build system can tell us this mapping. > The latter is more complex, and will always be needed synchronously for the > mainfile regardless of the build system. Yes, agreed. > Dep scanning - implementation > ----------------------------- > > The dep scanner seems to work by getting the compile command and running the > preprocessor. This is fairly heavyweight, and I can't see anywhere it's going > into single-file mode - is it really reading all `#included` headers? This is > certainly not workable for reparses of the mainfile (when no headers have > changed). > > It seems unneccesary: the standard seems to go to some lengths to ensure that > we (almost) only need to lex the top-level file: > > - module and import decls can't be produced by macros (seems to be the effect > of the `pp-module` directive etc) > - module and import decls can't be `#include`d (definition of `module-file` > and `[cpp.import]` rules) > > The wrinkle I see is that some PP usage is possible: the module name can be > produced by a macro, and imports can be `#ifdef`d. I think the former is very > unlikely (like `#include MACRO_NAME`) and we can not support it, and the > latter will just result in us overestimating the deps, which seems OK. > You have more context here though, and maybe I'm misreading the restrictions > placed by the standard. Today clang doesn't seem to enforce most of these > sort of restrictions, which is probably worth fixing if they're real. > > (This doesn't apply to header modules: it's perfectly possible to include a > textual header which includes a modular header, and it's impossible to know > without actually preprocessing. This divergence is nasty, but I don't think > we should pessimize standard modules for it). There are some problems due to the complexities of the standard... The take away may be: - module and import decls can't be produced by macros (seems to be the effect of the `pp-module` directive etc) - yes - module and import decls can't be `#include`d (definition of `module-file` and `[cpp.import]` rules) - the module declarations can't be `#include`d. - but the import decls can be `#include`d partially. See the discussion of https://github.com/llvm/llvm-project/issues/59688 for detail. The explanation is: - the wording(http://eel.is/c++draft/cpp.import#3) is "If a pp-import is produced by source file inclusion (including by the rewrite produced when a #include directive names an importable header) while processing the group of a module-file, the program is ill-formed." - and the definition of `module-file` (http://eel.is/c++draft/cpp.pre#nt:module-file) is `pp-global-module-fragment pp-module group pp-private-module-fragment`. - so the phrase `the group of a module-file` only refers to the `group` in the definition of `module-file` literally. We can't expand the grammar. - the module name can be produced by a macro - yes - imports can be #ifdefd. - yes. And this is a pretty important use-case for using modules in practice. So possibly we have to look into the `#include`s when scanning. > Interaction with preamble > ------------------------- > > At a high level, `import` decls should be processed with the preamble: they > should change infrequently, rebuilding modules is expensive, coarse-grained > work, we want to make the same policy decisions on whether to use stale PCMs > or block on fresh ones etc. > However they don't appear in a prefix of the file, and this is pretty > important to how the preamble action works, so exactly in what sense are they > part of the preamble? > > I believe the best answer is: > > - "preamble" is really a set of required artifacts + conditions to check for > validity > - `import foo` in a file means `foo.pcm` is a required artifact, not that > `preamble.pcm` contains an `ImportDecl` > > So given this code: > > module; > #include <x> > module foo; > import dep1; > module :private; > import dep2; > > The "preamble region" should end after `#include <x>` and preamble.pcm should > contain the AST & PP state up to that point. > Meanwhile `dep1.pcm` and `dep2.pcm` are separate PCM files that will be > loaded on each parse. > For a preamble to be usable at all, we need to have built `preamble.pcm`, > `dep1.pcm`, `dep2.pcm`. > For a preamble to be up-to-date, the preamble region + set of imported > modules must be unchanged, the preamble.pcm must be up to date with respect > to its sources, and the module PCMs must be up-to-date with respect to their > sources. (unclear exactly how to implement the latter, may need to add extra > tracking) > > (When building this module as a PCM we may not want to treat dep2 as a > dependency or parse the private fragment... but this is not relevant to > preambles as we won't be using a preamble for this anyway) Agreed basically. To make it clear, I think what you proposed may be to make `clang::clangd::PreambleData` contains the paths (or wrapped data structures) to `dep1.pcm` and `dep2.pcm`. And other parts of clangd should interact with the `dep1.pcm` or `dep2.pcm` with `clang::clangd::PreambleData`? > For a preamble to be up-to-date, the preamble region + set of imported > modules must be unchanged, the preamble.pcm must be up to date with respect > to its sources, and the module PCMs must be up-to-date with respect to their > sources. (unclear exactly how to implement the latter, may need to add extra > tracking) For this issue, (and the related ABA issue you gave inline comments), my thought **was** that the code intelligence doesn't have to have super strict real time requirement with the compilation systems. I mean, in my real experience, it is not so bad to see out-of-date results from code intelligence than the results from compilation systems. (I know this may not be accepted.) > inside a module unit, imports must appear directly underneath the module > declaration, above non-trivial declarations Then this is not so correct. With the explanation in the same link above (https://github.com/llvm/llvm-project/issues/59688), the following example may be valid: cpp // a.cpp export module a; hpp // b.hpp import a; cpp // c.cpp module; #include <b.hpp> export module c; CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153114/new/ https://reviews.llvm.org/D153114 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits