> Am 13.08.2024 um 08:37 schrieb Andrew Pinski <pins...@gmail.com>:
> 
> On Mon, Aug 12, 2024 at 10:36 PM Prathamesh Kulkarni
> <prathame...@nvidia.com> wrote:
>> 
>> Hi,
>> As mentioned in:
>> https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html
>> 
>> AArch64 cl_optimization_stream_out streams out target-specific optimization 
>> options like flag_aarch64_early_ldp_fusion, aarch64_early_ra etc, which 
>> breaks AArch64/nvptx offloading,
>> since nvptx cl_optimization_stream_in doesn't have corresponding stream-in 
>> for these options and ends up setting invalid values for ptr->explicit_mask 
>> (and subsequent data structures).
>> 
>> This makes even a trivial test like the following to cause ICE in 
>> lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:
>> 
>> int main()
>> {
>>  int x;
>>  #pragma omp target map(x)
>>    x;
>> }
>> 
>> The attached patch modifies optc-save-gen.awk to generate if 
>> (!lto_stream_offload_p) check before streaming out target-specific opt in 
>> cl_optimization_stream_out, which
>> fixes the issue. cl_optimization_stream_out after patch (last few entries):
>> 
>>  bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
>>  bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_early_ldp_fusion);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_flag_aarch64_late_ldp_fusion);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_flag_mrecip_low_precision_sqrt);
>>  if (!lto_stream_offload_p)
>>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);
>>  for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
>>    bp_pack_value (bp, ptr->explicit_mask[i], 64);
>> 
>> For target-specific options, streaming out is gated on !lto_stream_offload_p 
>> check.
>> 
>> The patch also fixes failures due to same issue with x86_64->nvptx 
>> offloading for target-print-1.f90 (and couple more).
>> Does the patch look OK ?
> 
> I think it seems to be on the right track. One thing that is also
> going to be an issue is streaming in, there could be a target option
> on the offload side that is marked as Optimization that would might
> also cause issues. We should check to make sure that also gets fixed
> here too. Or error out for offloading targets can't have target
> options with Optimization on them during the build.

It may have been misguided to mark target specific flags as Optimization.  It 
might be required to merge those (from all targets) into the common optimize 
enum, like we do for tree codes.  Language specific options marked as 
Optimization possibly have the same issue when mixing with other languages and 
LTO.  Can you assess the situation a bit more?

I think the proposed fix looks reasonable but the problem might be more 
widespread and warrant a more global solution or at least revisiting 
documentation?

Thanks,
Richard 

> Thanks,
> Andrew Pinski
> 
>> 
>> Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>
>> 
>> Thanks,
>> Prathamesh

Reply via email to