On Mon, 2 Sep 2024, Tobias Burnus wrote:

> Hi Richard,
> 
> On 02.09.24 at 13:58, Richard Biener wrote:
> > Hmm, I can't really follow how and where it's currently decided whether to
> > output offload tables for the LTRANS units
> 
> Before the patch, output_offload_tables is called unconditionally, but guarded
> by the check whether there is anything to output at all. Call trees:
> 
> When outputting the .o files, the call is done via ipa_passes →
> ipa_write_summaries → ipa_write_summaries_1.
> 
> Here, ipa_passes calls ipa_write_summaries twice: once for the
> offload/for-device LTO section and once for the host LTO section; both
> calls are needed.
> 
> For the LTO (lto1, ltrans) step, the call tree starts with:
> do_whole_program_analysis → lto_wpa_write_files → stream_out_partitions
> → stream_out_partitions_1 → stream_out → ipa_write_optimization_summaries.
> 
> Here, stream_out_partitions potentially forks the 'stream_out_partitions_1'
> calls. Each stream_out_partitions_1 then calls 'stream_out' in a loop, once
> for each partition in its share.
> 
> With either code path, the ipa_write... function then calls: write_lto →
> lto_output → output_offload_tables.
> 
> > but instead of an odd global
> > variable would it be possible to pass that down as a flag or,
> > alternatively encode that flag in the representation for the LTRANS
> > partition?  I suppose that's the out_decl_state?
> 
> Actually, I tried to follow your initial suggestion from the PR, but have now
> moved to the somewhat clearer out_decl_state.
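
To illustrate the design choice discussed above (carrying the decision in the per-partition output state instead of an odd global variable), here is a minimal sketch. It is not the actual patch: the struct, the flag name output_offload_tables_p, all *_sketch functions, and the rule that only the first partition carries the tables are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-partition output state.  The real
   out_decl_state in GCC has many more fields; this only models the flag.  */
struct out_state_sketch {
    int partition_index;
    bool output_offload_tables_p;   /* assumed flag name */
};

/* Writes the tables only if the state asks for it.
   Returns 1 if tables were written, 0 otherwise.  */
static int output_offload_tables_sketch(const struct out_state_sketch *state)
{
    assert(state != NULL);
    if (!state->output_offload_tables_p)
        return 0;
    printf("partition %d: writing offload tables\n", state->partition_index);
    return 1;
}

/* Models one 'stream_out' call: the state travels down the call chain,
   so no global variable has to be consulted.  */
static int stream_out_sketch(const struct out_state_sketch *state)
{
    return output_offload_tables_sketch(state);
}

/* Models the loop over partitions.  Assumption: only the first partition
   is marked to carry the offload tables.  Returns how many partitions
   received them.  */
int stream_out_partitions_sketch(int n_partitions)
{
    int written = 0;
    for (int i = 0; i < n_partitions; i++) {
        struct out_state_sketch state = { i, i == 0 };
        written += stream_out_sketch(&state);
    }
    return written;
}
```

Because the flag lives in the state object handed to each call, forked writer processes see the same decision for a given partition without relying on shared global state.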

Yeah - much nicer.

OK if it passes testing.

Thanks,
Richard.
