Hi, I'm Ben. I implemented modules support in CMake, authored P1689 itself and its support in GCC, and helped wrangle its support into Clang and MSVC.

I encourage you to read this paper: https://mathstuf.fedorapeople.org/fortran-modules/fortran-modules.html which describes the strategy CMake uses for compiling Fortran modules (which are isomorphic to C++ modules at the build tool level). I believe that this strategy (sometimes called "explicit module builds" elsewhere) is the only long-term viable strategy for projects (one-off `g++` commands may want the "implicit module build" strategy, which is more or less "dump things into a directory and find modules the way we find headers"). However, the number of corner cases that exist at this level is only bad news for using it on real-world projects (i.e., incrementally; clean CI builds probably don't care that much).
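As a rough sketch of what "explicit" means here: every TU is scanned for what it provides and imports before anything is compiled, and a collator turns the results (in the P1689 JSON format) into build graph edges. Something like the following, with made-up file and module names, using the GCC 14-era `-fdeps-*` flags:

```
$ g++ -std=c++20 -fmodules-ts -E -x c++ a.cppm \
      -MT a.o -MD -MF a.d \
      -fdeps-format=p1689r5 -fdeps-file=a.ddi -fdeps-target=a.o \
      -o a.ii
$ cat a.ddi
{ "version": 1, "revision": 0,
  "rules": [ { "primary-output": "a.o",
               "provides": [ { "logical-name": "A", "is-interface": true } ],
               "requires": [ { "logical-name": "B" } ] } ] }
```

The compile of `a.cppm` is then ordered after whatever provides `B` has produced a CMI.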
There is also this repository, which contains "interesting" corner cases for scanners: https://github.com/mathstuf/cxx-modules-sandbox

You may also be interested in this thread on the autoconf list: https://lists.gnu.org/archive/html/autoconf/2025-02/msg00000.html

On Fri, Feb 28, 2025 at 05:38:11 +0000, vspefs via Gcc wrote:
> Current `-Mmodules` output is based on [P1602R0](wg21.link/p1602r0), which
> speaks about a set of Makefile rules that can handle modules, with the help of
> module mappers and a modified GNU Make.
>
> The proposal came out in 2019, and the output of those rules was implemented
> in GCC in 2020. However, so far we still don't have a new release of GNU Make
> which implements P1602R0.
>
> What's more, the rules described in P1602R0 are not ideal. They set up phony
> prerequisites for real-file targets, causing guaranteed rebuilds. They are also
> unable to handle dependencies among module interfaces - that is to say, if
> module A imports and exports B, then the interface of module A depends on that
> of module B, so its CMI should be rebuilt if the interface of B changes.

Note that this is the case even if A only imports B. Only GCC makes "standalone" CMI files; MSVC and Clang both want B's CMI location to be known when A's CMI is imported.
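For instance, with Clang (a sketch; file names are hypothetical), a TU that imports only A still needs B's CMI to be locatable:

```
$ clang++ -std=c++20 --precompile b.cppm -o b.pcm
$ clang++ -std=c++20 -fmodule-file=B=b.pcm --precompile a.cppm -o a.pcm
$ # main.cpp says only `import A;`, but B's CMI must still be given:
$ clang++ -std=c++20 -fmodule-file=A=a.pcm -fmodule-file=B=b.pcm -c main.cpp
```

So a change to B's interface can require rebuilding A's importers even when they never name B.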
> I tried a few approaches to fix the current implementation, but in vain.
>
> It is possible, however, to have another set of Makefile rules generated,
> which solves all the problems, and doesn't need a new GNU Make. I've posted
> it on reddit.

There are other problems that a build tool (make, ninja) cannot resolve on its own. These include (but may not be limited to):

- permission to import: just because B exists does not mean that A is allowed to import it:
  - it could be private to B's library and only available to other TUs in the same library
  - it could be from a library that links *to* A's library (causing circular linker artifact dependencies, though not necessarily build graph dependencies)
- incremental builds: loading previous state might allow state to "percolate" up the build graph:
  - A's import of B fails on the first run because the dependencies are not correct, but B is now made (due to `-k` or concurrent execution passing while A's failure shuts down the graph): how do we preserve that "B is not visible to A" state?
  - B's CMI is still on disk, but its source has been deleted: how do we ensure that nothing imports it even though the import can be satisfied with on-disk state?
- circular dependencies: actually pretty easy for build tools to handle, as it is a very much expected error case; not easy for dynamic mappers

> See [here](https://www.reddit.com/r/cpp/comments/1izg2cc/make_me_a_module_now/).
>
> > To briefly summarize the idea:
> >
> > If an object target is built from a module interface unit, the rules
> > generated are:
> >
> > ```Makefile
> > target.o: source.cc regular_prereqs header_unit_prereqs | header_unit_prereqs \
> >   module_prereqs
> > source_cmi.gcm: source.cc regular_prereqs header_unit_prereqs \
> >   module_prereqs | target.o
> > ```
> >
> > If an object target is not, the rule generated is:
> >
> > ```Makefile
> > target.o: source_files regular_prereqs \
> >   header_unit_prereqs | header_unit_prereqs module_prereqs
> > ```
> >
> > The `header_unit_prereqs` and `module_prereqs` are paths to the
> > corresponding CMI files.

I haven't sat down and drawn out the build graph this makes, but it passes the smell test at least (though I'm ignoring the header unit parts at this point).

As for your questions on Reddit:

> The module mapper maps between module interface units, module names,
> and CMIs. It's good. But who should be responsible for using it? The
> build system, or the compiler?

I believe it is the build *system*'s job. I suppose I should clarify my definitions here:

- build tool: a build graph executor (e.g., ninja, make)
- build system: provides a model of libraries, executables, and other rules which may be rendered as a build graph to be executed by a build tool (e.g., cmake, meson, automake)

There are projects that are both at once (e.g., build2, boost.build, tup). The key difference is that the build system has a "higher level" graph which associates groups of compilations into artifacts (usually visible in the build graph by looking only at the linker bits). This implies "walls" between "just compiles" that might be topologically possible in the build graph, but logically inconsistent with the target graph. Say we have:

- library A
- library B
- executable E which links to A and B

A and B have no relation, so while a module import from an A compile into a B compile is *possible*, the build *system* does not consider it possible, as B does not depend on A at all. Build tools (and certainly compilers) lack this context, so I don't think a generic mapper at that level is, in general, viable.

> If it's the build system, then should we take our time, implement it
> in a new version of GNU Make, release it, and cast some magic spells
> to let people switch to it overnight?

Make is really only missing `restat = 1` for performance (correctness is fine; Make merely runs things unnecessarily without it). Everything else is *possible* even in POSIX make. There *might* need to be another feature for one global graph, but even without that, a static 2-level recursive Makefile setup is sufficient.

> Furthermore, should we implement one for every build system?

Every build system needs a "collator" that transports enough information about its target-level semantics to the build graph to stitch scanning outputs together into rules for the build tool.

I also see this assertion in your post:

> TL;DR - CMIs and object files are managed separately, and it
> ultimately achieves everything we (at least I) want from modules.
> Sometimes a CMI might be redundantly built. Once.

Note that one may need *multiple* CMIs for a given source in a single build graph. This is because CMI compatibility is *very* narrow. If A is compiled with C++26 and B with C++23, and both use modules from C, each needs a *unique* CMI for those modules in order to import them, because the standard level changes the parser enough that one CMI cannot be loaded by both.
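A sketch of that situation with Clang (names hypothetical): the same interface is precompiled once per consumer configuration, and each consumer is pointed at its own copy:

```
$ clang++ -std=c++23 --precompile c.cppm -o c-std23.pcm
$ clang++ -std=c++26 --precompile c.cppm -o c-std26.pcm
$ clang++ -std=c++23 -fmodule-file=C=c-std23.pcm -c b.cpp
$ clang++ -std=c++26 -fmodule-file=C=c-std26.pcm -c a.cpp
```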
There are many flags that can affect CMI compatibility (in fact, it is probably easier to list those known to *not* affect it: `-v`, `-ftime-report*`, `-M*`, `-fdeps-*`, `-pipe`, `-save-*`, `-time`, and flags like them).

> Header units

Header units have all of these problems too, but they need to be figured out right away, rather than having the useful "checkpoint" states of the implementation that named modules have. That is why they'll be the last thing CMake implements for modules (despite being "transitional", they are all the hardest parts of named modules, all at once, for build systems).

Thanks,

--Ben