> On 3 Sep 2024, at 16:08, Jason Merrill via Gcc <gcc@gcc.gnu.org> wrote:
>
> On 9/3/24 7:30 AM, Jonathan Wakely wrote:
>> On Tue, 3 Sept 2024, 10:15 Iain Sandoe, <i...@sandoe.co.uk> wrote:
>> Hi Folks,
>> When we build a C++ binary module (CMI/BMI), we obviously have
>> access to its source to produce diagnostics, all fine.
>> However, when we consume that module we might also need access to
>> the sources used to build it - since diagnostics triggered in the
>> consumer can refer back to the sources used.
>> I'm fairly convinced by your argument that building the module usually
>> happens as part of the same build as consuming the module, and so the
>> sources will be available anyway.
>> For large scale build environments where pre-built BMIs might be deployed by
>> one team and consumed by other teams, without (re)building those BMIs, it
>> doesn't seem too difficult for the module interface sources to also be
>> deployed. That's not so different from deploying headers and libraries (.so,
>> .dylib, .dll etc) today.
>> So I don't actually see a need to embed sources. It seems like it's solving
>> something that can easily be solved using existing processes. Just include
>> sources with BMIs that you deploy. If the full sources are sensitive IP,
>> separate your code into the public parts that are used to compile the BMI
>> and the non-public parts. Or proprietary vendors who don't want to do that
>> separation can choose to not provide code, and diagnostics suffer for their
>> users. That's not a technical problem, and doesn't need to be solved by the
>> compiler.
>
> Agreed; it seems natural to provide interface unit sources everywhere you
> would provide headers currently. Or not in cases where you wouldn't, such as
> distcc compiling preprocessed code.
>
>> Currently clang has been experimenting with embedding the sources
>> into the BMI - this can make things seem more efficient when, for
>> example, distributing BMIs to remote nodes in a large-scale
>> distributed build.
>
>> There was a patch proposed to make this the default for clang, which
>> has resulted in the discussion here:
>>
>> https://discourse.llvm.org/t/rfc-modules-should-we-embed-sources-to-the-bmi/81029
>
> From the first post:
>
>> (1) Fix the underlying issue. Readers may already recognize that the two
>> topics (whether or not embedding source files) (security concerns) are not
>> technically mutually exclusive. The fundamental technical problem may be
>> that clang require to open the actual file during the compilation. It looks
>> like both GCC and MSVC doesn’t have the problem.
>
> Sounds like the primary motivation for this clang change doesn't apply to GCC.

I think that might be a misunderstanding on the part of the author; AFAIU both
GCC and MSVC _do_ require access to the sources at BMI consume-time to give
decent diagnostics. I think that there might be confusion because the
compilation would succeed on those toolchains without the sources, but with
poorer diagnostic quality?

Hopefully, other folks from the “modules implementer’s group”, including MSVC,
will add comments to the discourse thread. We just discussed this there (my
impression being that most folks think it’s the build system’s territory).

Iain