naveen-seth wrote: Hi, sorry! I left a comment earlier but noticed that a large part of the explanation was incorrect and deleted it. Here is the fixed comment:
At first I thought that the issue can be fixed by just checking for duplicates while parsing the command-line arguments, but that doesn't seem to be the case. I will try to describe the crash point and the crash reason with this example (without duplicate arguments): ```cpp //--- a.cppm export module A; export int a() { return 41; } ``` ```cpp //--- b.cppm export module B; import A; export int b() { return a() + 1; } ``` ```cpp //--- c.cppm export module C; export int c() { return 42; } ``` ```cpp //--- main.cpp import B; int main() { return b(); } ``` ``` # Correctly precompile the modules clang++ -std=c++20 --precompile a.cppm -o a.pcm clang++ -std=c++20 --precompile b.cppm -fmodule-file=A=a.pcm -o b.pcm clang++ -std=c++20 --precompile c.cppm -o c.pcm # Bad use of -fmodule-file=... clang++ -std=c++20 main.cpp -std=c++20 main.cpp -fmodule-file=B=b.pcm -fmodule-file=A=c.pcm ``` After `import B;` is parsed from `main.cpp`, the following sequence of function calls occurs (from top to bottom): … `Sema::ActOnModuleImport` `CompilerInstance::loadModule` `CompilerInstance::findOrCompileModuleAndReadAST` `ASTReader::ReadAST` `ASTReader::ReadASTCore` `ASTReader::ReadControlBlock` When reading a modules control block in `ASTReader::ReadControlBlock`, Clang also attempts to get the `ModuleFile` for each of its imports ([see here](https://github.com/llvm/llvm-project/blob/943a70717c629f43b309ab56e8141ffb131871a6/clang/lib/Serialization/ASTReader.cpp#L3024)). First, the matching BMI file for an imported module is looked up, then Clang calls back to `ASTReader::ReadASTCore` to ensure the BMI files `ModuleFile`. Currently (prior to this PR), there is no validation to ensure that the BMI file actually contains the module clang is trying to load after having read the module name in `ReadControlBlock`. Because of the faulty `-fmodule-file=` mapping, the `ModuleFile` of the import will refer to a BMI built for a different module than the one clang is trying to load. Here, the import of `A` in `b.cppm` will result in loading the `ModuleFile` corresponding to `c.pcm` while deserializing `b.pcm`. The crash occurs because Clang, assuming it has the correct `ModuleFile`s, continues reading AST blocks and attempts to resolve `ModuleFile` remappings (any of the mappings mentioned [here](https://github.com/llvm/llvm-project/blob/74ace4fd3c71ff59f2d89680c0f1382d0f9933f4/clang/lib/Serialization/ASTReader.cpp#L8765)). This results in invalid access into members of the invalid `ModuleFile`s and finally the crash. (Depending on the specific example, the exact location of the crash will be different, but it will always happen while trying to resolve the remappings.) While writing this comment, I noticed that my commit and PR message describe the cause of the crash incorrectly and have edited both. https://github.com/llvm/llvm-project/pull/133462 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits