Re: RFC: A redesign of `-Mmodules` output

vspefs via Gcc Tue, 04 Mar 2025 05:10:00 -0800

On Tuesday, March 4th, 2025 at 18:04, Ben Boeckel via Gcc <gcc@gcc.gnu.org> wrote:

> On Tue, Mar 04, 2025 at 07:53:51 +0000, vspefs wrote:
> 
> > By the way, what's stopping us from having compiler options like
> > `g++ -Rgcm.cache -Rsomewhere/else/gcm.cache` to specify CMI repo paths,
> > like `-I` for include paths? It could be useful for projects with a
> > complex folder structure, as build tools like Make sometimes change the
> > current working directory, so we need to locate CMIs in different folders.
> 
> 
> Numerous :) . Consider these (non-exhaustive) problematic scenarios:
> 
> - incremental build, so some modules already exist, you find
> `imported.cmi`, but how do you know it is up-to-date?
> - flag compatibility matters; if you find `imported.cmi`, but its flag
> set is incompatible, do you keep searching or give up?
> - caching and distributed build tools now need to somehow encapsulate
> repository state into their hashes and send contents as necessary

In my beautiful, blind fantasy, this flag should only be used alongside other
tools that keep the CMIs up-to-date, e.g. a build system. If a build system
ensures all needed CMIs are updated before the source gets built, there should
be no problem.

Flag compatibility - it should reject incompatible CMIs.

My whole intention with this flag is to provide a handy way to configure the
built-in module mapper, especially for build systems that use subdirectories to
structure projects. The <path> in `-R<path>` should be dynamic, decided by the
build system at *configure* time, and would usually name the CMI cache folders
of the various subdirectories under one specific build profile. It cannot be
used together with `-fmodule-mapper`.

In short, it is a way to 1) isolate the CMIs of different build profiles, and
2) make handling modules in different subdirectories easier, with only minor
effort needed on the build system side.

Caching build tools - God, I guess you're right.

> > The mapping between module interface unit, module name, and expected CMI
> > filename is still performed by the module mapper. But now when looking up a
> > CMI, it goes to each repo in the list, in order, until it finds a CMI that
> > matches and returns its full path. When producing a CMI, the CMI file is
> > dumped to the first repo.
> 
> 
> The mapper returns the path to where it wants the CMI; why second-guess
> it by putting it somewhere it didn't specify? Same with lookup; the
> mapper knows the full path it made the CMI at in the first place and
> returns that. There should not need to be any searching involved at all.

As mentioned above, the point is not searching but specifying. A capable build
system can still provide a straightforward mapper file that contains
module-name/CMI-path pairs, like CMake does, or any other facility that works.
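
For illustration, here is a minimal sketch of reading such a mapper file. It
assumes one whitespace-separated module-name/CMI-path pair per line and nothing
else; `parse_module_map` is a hypothetical helper, not something GCC or CMake
provides, and a real mapper file may carry extra directives that this ignores.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parse mapper-file text where each line holds one
// "module-name CMI-path" pair. Blank lines and lines with fewer than two
// fields are skipped. A later entry for the same module overrides an
// earlier one.
std::map<std::string, std::string> parse_module_map(const std::string& text) {
    std::map<std::string, std::string> result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string module_name, cmi_path;
        if (fields >> module_name >> cmi_path) {
            // A successful extraction never yields an empty token.
            assert(!module_name.empty() && !cmi_path.empty());
            result[module_name] = cmi_path;
        }
    }
    return result;
}
```

With a table like this there is nothing to search for: every module name
resolves to exactly one path chosen by the build system.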

I had this idea when I was toying with module mappers. I thought it could be
useful for module mappers to optionally provide some logic. But with the current
`-fmodule-mapper`, you either use the simple built-in mapper, or write one
yourself, build a standalone executable, and remember to handle its lifetime
during the build. This flag should give users some power to configure the
built-in module mapper's behaviour in a way that makes life easier.
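
To make the proposed semantics concrete, here is a rough sketch of the lookup
from the quoted text: repos are tried in the order given, and new CMIs always go
to the first one. `-R` is only a proposal, and `find_cmi` and `cmi_output_path`
are hypothetical names; the built-in mapper would do this internally. Checking
that a found CMI is up-to-date or flag-compatible is deliberately left out.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical lookup: scan the repos in command-line order and return the
// first path at which a regular file named `cmi_name` exists.
std::optional<fs::path> find_cmi(const std::vector<fs::path>& repos,
                                 const std::string& cmi_name) {
    for (const fs::path& repo : repos) {
        fs::path candidate = repo / cmi_name;
        std::error_code ec;  // non-throwing overload; errors count as "absent"
        if (fs::is_regular_file(candidate, ec)) {
            return candidate;
        }
    }
    return std::nullopt;
}

// Hypothetical output rule: a freshly produced CMI is always written to the
// first repo, so the output path does not depend on what already exists.
fs::path cmi_output_path(const std::vector<fs::path>& repos,
                         const std::string& cmi_name) {
    assert(!repos.empty() && "at least one repo is required");
    return repos.front() / cmi_name;
}
```

For `g++ -Rgcm.cache -Rsomewhere/else/gcm.cache`, the repo list would be
`{"gcm.cache", "somewhere/else/gcm.cache"}`: lookups try `gcm.cache` first,
and new CMIs land there too.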

> > Ideally, all invocations concerning modules should have `-Rgcm.cache` as the
> > first CMI repo. This way, all CMI producing calls remain deterministic, and
> > behave the same as before.
> 
> 
> If you have a single build with both release and debug configurations,
> they at least need to have separate repositories. Each executable should
> as well because while `export module foo;` has to be unique in a
> program, each executable could have their own `foo` module that doesn't
> interact with others of the same name.

My bad, I didn't mention the intended use. The whole point of having a `-R` flag
is to have separate repositories. We should have `-Rgcm.cache-debug` under the
debug configuration, and `-Rgcm.cache-release` under release.

Each executable should as well - yes, I didn't think of that. I always believed
that "avoiding duplicated module names" was commonly accepted, but now that I
come to think of it, it's not something I can decide. It can be handled though,
with something like `-Rgcm.cache-<executable-target-name>-<build-profile>`.

I might have oversimplified the situation. But the "mangling" of CMI paths can't
be avoided, and I believe this flag can help.
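
As a small sketch of that "mangling", a build system could derive the repo name
from the target and the build profile at configure time. The helper names
`cmi_repo_dir` and `repo_flag` are made up for illustration, and `-R` itself is
only the flag proposed in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build "gcm.cache-<target>-<profile>", following the
// naming pattern suggested above.
std::string cmi_repo_dir(const std::string& target, const std::string& profile) {
    assert(!target.empty() && !profile.empty());
    return "gcm.cache-" + target + "-" + profile;
}

// Hypothetical helper: render the proposed (not existing) -R flag for one repo.
std::string repo_flag(const std::string& target, const std::string& profile) {
    return "-R" + cmi_repo_dir(target, profile);
}
```

Two executables that each define their own `foo` module would then get
distinct repos, as would the debug and release builds of the same executable.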

> > This could make Make-based build systems really work. The Makefile rules
> > proposed in this RFC make sure CMIs are built before used, and this `-R`
> > flag offers big-project-ready module lookup mechanics, if we just ignore the
> > multiple CMIs problem for now :(
> 
> 
> IMO, simple `%.o: %.cxx` build systems are dead with C++ modules. One
> just needs to have a comprehensive understanding of the source files
> involved and the relationship between groups of them to get it right.

I kind of agree. I haven't worked on or seen any large projects with modules,
but my immature opinion is that we need advanced grouping ability. A simple
`%.o: %.cc` rule is not only hard to write, but also hard to maintain.

But could there be some opportunity for a new tool to be born? I honestly don't
have a clue.

> > This option alone, I believe, could also offer some convenience to the
> > general usage of modules.
> 
> 
> I believe the only place for such things is in "make this one file I
> have" use cases. Projects should use proper infrastructure that handles
> modules correctly.

I think I was completely brain-dead while making this claim. Sorry. T_T

Just to mention, I think something like

  myfavoritebuildsystem --quick --use=boost@1.80 --as-shared-library=mylib --sources mylib.cc

that you mentioned is the right way to handle these minor tasks.
