ChuanqiXu added a comment.

In D153114#4603579 <https://reviews.llvm.org/D153114#4603579>, @sammccall wrote:

> In D153114#4602414 <https://reviews.llvm.org/D153114#4602414>, @ChuanqiXu 
> wrote:
>
>>> Don't attempt any cross-file or cross-version coordination: i.e. don't try 
>>> to reuse BMIs between different files, don't try to reuse BMIs between 
>>> (preamble) reparses of the same file, don't try to persist the module 
>>> graph. Instead, when building a preamble, synchronously scan for the module 
>>> graph, build the required PCMs on the single preamble thread with filenames 
>>> private to that preamble, and then proceed to build the preamble.
>>
>> Do I understand right? If I understand correctly, I fully agree with the 
>> direction. We can go slowly, as long as we keep moving forward.
>>
>> Then I'd like to leave the patch as-is for referencing and create new 
>> patches following the suggestion.
>
> Yes, that's the suggestion, and that plan makes sense to me, thanks!
>
> I did some more thinking about this (having a concrete implementation helps a 
> lot!) and had a couple more thoughts.
> At some point we should write down a design somewhere, need to strike a 
> balance between doing it early enough to be useful but late enough that we've 
> understood!

Yeah, then let's make the page into some design ideas discussing page.

> Dep scanning - roles
> --------------------
>
> IIUC we do this for two reasons:
>
> - to identify what module names we must have PCMs for in order to build a 
> given TU (either an open file, or a module we're building as PCM)
> - to build a database mapping module name => filename, which we compose with 
> the CDB to know how to build a PCM for a given module name
>
> I think it would be good to clearly separate these. The latter is simpler, 
> more performance-critical, async, and is probably not used at all if the 
> build system can tell us this mapping.
> The latter is more complex, and will always be needed synchronously for the 
> mainfile regardless of the build system.

Yes, agreed.

> Dep scanning - implementation
> -----------------------------
>
> The dep scanner seems to work by getting the compile command and running the 
> preprocessor. This is fairly heavyweight, and I can't see anywhere it's going 
> into single-file mode - is it really reading all `#included` headers? This is 
> certainly not workable for reparses of the mainfile (when no headers have 
> changed).
>
> It seems unneccesary: the standard seems to go to some lengths to ensure that 
> we (almost) only need to lex the top-level file:
>
> - module and import decls can't be produced by macros (seems to be the effect 
> of the `pp-module` directive etc)
> - module and import decls can't be `#include`d (definition of `module-file` 
> and `[cpp.import]` rules)
>
> The wrinkle I see is that some PP usage is possible: the module name can be 
> produced by a macro, and imports can be `#ifdef`d. I think the former is very 
> unlikely (like `#include MACRO_NAME`) and we can not support it, and the 
> latter will just result in us overestimating the deps, which seems OK.
> You have more context here though, and maybe I'm misreading the restrictions 
> placed by the standard. Today clang doesn't seem to enforce most of these 
> sort of restrictions, which is probably worth fixing if they're real.
>
> (This doesn't apply to header modules: it's perfectly possible to include a 
> textual header which includes a modular header, and it's impossible to know 
> without actually preprocessing. This divergence is nasty, but I don't think 
> we should pessimize standard modules for it).

There are some problems due to the complexities of the standard...

The take away may be:

- module and import decls can't be produced by macros (seems to be the effect 
of the `pp-module` directive etc)
  - yes
- module and import decls can't be `#include`d (definition of `module-file` and 
`[cpp.import]` rules)
  - the module declarations can't be `#include`d.
  - but the import decls can be `#include`d partially. See the discussion of 
https://github.com/llvm/llvm-project/issues/59688 for detail. The explanation 
is:
    - the wording(http://eel.is/c++draft/cpp.import#3) is "If a pp-import is 
produced by source file inclusion (including by the rewrite produced when a 
#include directive names an importable header) while processing the group of a 
module-file, the program is ill-formed."
    - and the definition of `module-file` 
(http://eel.is/c++draft/cpp.pre#nt:module-file) is `pp-global-module-fragment 
pp-module group pp-private-module-fragment`.
    - so the phrase `the group of a module-file` only refers to the `group` in 
the definition of `module-file` literally. We can't expand the grammar.
- the module name can be produced by a macro
  - yes
- imports can be #ifdefd.
  - yes. And this is a pretty important use-case for using modules in practice.

So possibly we have to look into the `#include`s when scanning.

> Interaction with preamble
> -------------------------
>
> At a high level, `import` decls should be processed with the preamble: they 
> should change infrequently, rebuilding modules is expensive, coarse-grained 
> work, we want to make the same policy decisions on whether to use stale PCMs 
> or block on fresh ones etc.
> However they don't appear in a prefix of the file, and this is pretty 
> important to how the preamble action works, so exactly in what sense are they 
> part of the preamble?
>
> I believe the best answer is:
>
> - "preamble" is really a set of required artifacts + conditions to check for 
> validity
> - `import foo` in a file means `foo.pcm` is a required artifact, not that 
> `preamble.pcm` contains an `ImportDecl`
>
> So given this code:
>
>   module;
>   #include <x>
>   module foo;
>   import dep1;
>   module :private;
>   import dep2;
>
> The "preamble region" should end after `#include <x>` and preamble.pcm should 
> contain the AST & PP state up to that point.
> Meanwhile `dep1.pcm` and `dep2.pcm` are separate PCM files that will be 
> loaded on each parse.
> For a preamble to be usable at all, we need to have built `preamble.pcm`, 
> `dep1.pcm`, `dep2.pcm`.
> For a preamble to be up-to-date, the preamble region + set of imported 
> modules must be unchanged, the preamble.pcm must be up to date with respect 
> to its sources, and the module PCMs must be up-to-date with respect to their 
> sources. (unclear exactly how to implement the latter, may need to add extra 
> tracking)
>
> (When building this module as a PCM we may not want to treat dep2 as a 
> dependency or parse the private fragment... but this is not relevant to 
> preambles as we won't be using a preamble for this anyway)

Agreed basically. To make it clear, I think what you proposed may be to make 
`clang::clangd::PreambleData` contains the paths (or wrapped data structures) 
to `dep1.pcm` and `dep2.pcm`. And other parts of clangd should interact with 
the `dep1.pcm` or `dep2.pcm` with `clang::clangd::PreambleData`?

> For a preamble to be up-to-date, the preamble region + set of imported 
> modules must be unchanged, the preamble.pcm must be up to date with respect 
> to its sources, and the module PCMs must be up-to-date with respect to their 
> sources. (unclear exactly how to implement the latter, may need to add extra 
> tracking)

For this issue, (and the related ABA issue you gave inline comments), my 
thought **was** that the code intelligence doesn't have to have super strict 
real time requirement with the compilation systems. I mean, in my real 
experience, it is not so bad to see out-of-date results from code intelligence 
than the results from compilation systems. (I know this may not be accepted.)

> inside a module unit, imports must appear directly underneath the module 
> declaration, above non-trivial declarations

Then this is not so correct. With the explanation in the same link above 
(https://github.com/llvm/llvm-project/issues/59688),  the following example may 
be valid:

  cpp
  // a.cpp
  export module a;



  hpp
  // b.hpp
  import a;



  cpp
  // c.cpp
  module;
  #include <b.hpp>
  export module c;


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153114/new/

https://reviews.llvm.org/D153114

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to