dexonsmith added inline comments.
================ Comment at: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp:56-58 + // Ensure the reported context hash is strict. + CI.getHeaderSearchOpts().ModulesStrictContextHash = true; + ---------------- jansvoboda11 wrote: > dexonsmith wrote: > > jansvoboda11 wrote: > > > jansvoboda11 wrote: > > > > dexonsmith wrote: > > > > > IIUC, explicit modules don't really have/need a context hash. Can > > > > > related options be stripped out when serializing to `-cc1` when > > > > > `ImplicitModules` is false? > > > > > > > > > > Basically, I'm asking if `ModulesStrictContextHash` is a no-op when > > > > > `ImplicitModules` is false. If not, can we make it a no-op? > > > > > (If we can, then maybe rename the field to > > > > > `ImplicitModulesStrictContextHash` and audit that no one reads it > > > > > when `ImplicitModules` is off...) > > > > Let me clarify this a bit. You're right that when building explicit > > > > modules, we don't care about context hash. > > > > > > > > We do care about using **strict** context hash during the scan though - > > > > it's an implementation detail through which we prevent mixing > > > > incompatible modules/TUs. (This strict context hash is enabled > > > > elsewhere in the dependency scanner.) > > > > > > > > At the end of the scan, we take discovered modules and modify/prune > > > > their `CompilerInvocation` (in this function). This can essentially > > > > "merge" multiple versions of the same module into one, which is very > > > > desirable. But we still want to do it according to the **strict** > > > > context hash. We don't want to merge versions with different search > > > > paths for example (non-strict context hash). That's what this change > > > > ensures. > > > > > > > > Note that we don't **need** to report context hashes to scanner > > > > clients. Any other identifier derived from a strict context hash would > > > > work. > > > I think the rename you're suggesting is valid. > > > > > > We //could// strip the `ModulesStrictContextHash` in the scanner: after > > > we generate the strict context hash and before we generate the > > > command-line. I think that can be done in a NFC follow-up. > > I think I understand the problem now. I hadn't really put together that the > > implicit modules context hash machinery was being used to decide the > > artifact location for the explicit module. > > > > I'm concerned this is too subtle and fragile. I'm wondering if the > > following more naive solution would work: > > - Prune/canonicalize the CompilerInvocation (as now). > > - Write/modify any fields to use placeholders for fields the client has > > control over. (Besides `OutputFile`, what else is there?) > > - Generate the `-cc1` and hash it. That's now the context hash. Return it > > to the client. > > > > But maybe that's the whole point of the strict context hash, and if there > > are bugs where this would behave differently, we should fix the strict > > context hash? > > > > ... > > > > Stepping back, please go ahead and commit this incremental improvement, > > after expanding the comment to: > > 1. More fully explain the context: that the context hash will be generated > > from the CompilerInvocation and sent to the client, which then uses it to > > decide where to store the artifact. We need to make sure it's strict. > > 2. Explain why `assert(ModulesStrictContextHash)` would fail, even though > > the scan used a strict context hash. (Maybe it'd makes sense to make that > > change in a follow-up...?) > > I'm concerned this is too subtle and fragile. I'm wondering if the > > following more naive solution would work: > > - Prune/canonicalize the CompilerInvocation (as now). > > - Write/modify any fields to use placeholders for fields the client has > > control over. (Besides `OutputFile`, what else is there?) > > In general, we don't know much about the intended filesystem paths. Off the > top of my head, we don't know the PCM output file (and values for > `-fmodule-file=`), input file (and values for `-fmodule-map-file=`: module > map used for building might have different location during the scan), > serialized diagnostics file (we can't use the file for a TU that discovered > the module first). > > > - Generate the `-cc1` and hash it. That's now the context hash. Return it > > to the client. > > > > But maybe that's the whole point of the strict context hash, and if there > > are bugs where this would behave differently, we should fix the strict > > context hash? > > I think that's a valid approach. Just wondering why you see it being less > subtle and fragile than strict context hash? Clients like build systems don't want two different `-cc1`s for the same object. This would structurally guarantee that, whereas the strict context hash seems best-effort. Maybe the strict context hash should be generated this way? (Maybe it already is?) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D111720/new/ https://reviews.llvm.org/D111720 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits