> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Monday, August 19, 2024 6:51 PM
> To: Prathamesh Kulkarni <prathame...@nvidia.com>
> Cc: Andrew Pinski <pins...@gmail.com>; gcc-patches@gcc.gnu.org; Thomas
> Schwinge <tschwi...@baylibre.com>
> Subject: RE: [optc-save-gen.awk] Fix streaming of command line options
> for offloading
> 
> External email: Use caution opening links or attachments
> 
> 
> On Mon, 19 Aug 2024, Prathamesh Kulkarni wrote:
> 
> >
> >
> > > -----Original Message-----
> > > From: Richard Biener <rguent...@suse.de>
> > > Sent: Tuesday, August 13, 2024 12:52 PM
> > > To: Andrew Pinski <pins...@gmail.com>
> > > Cc: Prathamesh Kulkarni <prathame...@nvidia.com>; gcc-
> > > patc...@gcc.gnu.org; Thomas Schwinge <tschwi...@baylibre.com>
> > > Subject: Re: [optc-save-gen.awk] Fix streaming of command line
> > > options for offloading
> > >
> > > External email: Use caution opening links or attachments
> > >
> > >
> > > > Am 13.08.2024 um 08:37 schrieb Andrew Pinski
> <pins...@gmail.com>:
> > > >
> > > > On Mon, Aug 12, 2024 at 10:36 PM Prathamesh Kulkarni
> > > > <prathame...@nvidia.com> wrote:
> > > >>
> > > >> Hi,
> > > >> As mentioned in:
> > > >> https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html
> > > >>
> > > >> AArch64 cl_optimization_stream_out streams out target-specific
> > > >> optimization options like flag_aarch64_early_ldp_fusion,
> > > aarch64_early_ra etc, which breaks AArch64/nvptx offloading, since
> > > nvptx cl_optimization_stream_in doesn't have corresponding stream-
> in
> > > for these options and ends up setting invalid values for ptr-
> > > >explicit_mask (and subsequent data structures).
> > > >>
> > > >> This makes even a trivial test like the following to cause ICE
> in
> > > lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:
> > > >>
> > > >> int main()
> > > >> {
> > > >>  int x;
> > > >>  #pragma omp target map(x)
> > > >>    x;
> > > >> }
> > > >>
> > > >> The attached patch modifies optc-save-gen.awk to generate if
> > > >> (!lto_stream_offload_p) check before streaming out
> > > >> target-specific
> > > opt in cl_optimization_stream_out, which fixes the issue.
> > > cl_optimization_stream_out after patch (last few entries):
> > > >>
> > > >>  bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
> > > >> bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);  if
> > > >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> > > >> ptr->x_flag_aarch64_early_ldp_fusion);
> > > >>  if (!lto_stream_offload_p)
> > > >>  bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);  if
> > > >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> > > >> ptr->x_flag_aarch64_late_ldp_fusion);
> > > >>  if (!lto_stream_offload_p)
> > > >>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);  if
> > > >> (!lto_stream_offload_p)  bp_pack_var_len_int (bp,
> > > >> ptr->x_flag_mrecip_low_precision_sqrt);
> > > >>  if (!lto_stream_offload_p)
> > > >>  bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);
> for
> > > >> (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
> > > >>    bp_pack_value (bp, ptr->explicit_mask[i], 64);
> > > >>
> > > >> For target-specific options, streaming out is gated on
> > > !lto_stream_offload_p check.
> > > >>
> > > >> The patch also fixes failures due to same issue with
> > > >> x86_64->nvptx
> > > offloading for target-print-1.f90 (and couple more).
> > > >> Does the patch look OK ?
> > > >
> > > > I think it seems to be on the right track. One thing that is
> also
> > > > going to be an issue is streaming in, there could be a target
> > > > option on the offload side that is marked as Optimization that
> > > > would might also cause issues. We should check to make sure that
> > > > also gets fixed here too. Or error out for offloading targets
> > > > can't have target options with Optimization on them during the
> build.
> > Thanks for the suggestions. The attached patch modifies
> > optc-save-gen.awk to emit an error if accel backend marks target
> specific option with Optimization.
> > AFAIU, currently neither nvptx nor gcn have target-specific options
> > marked with Optimization, so this is mostly a safeguard against
> future additions.
> >
> > cl_optimization_stream_in after patch for target-specifc
> optimization options:
> >
> > #ifdef ACCEL_COMPILER
> > #error accel compiler cannot define Optimization attribute for
> > target-specific option x_flag_aarch64_early_ldp_fusion #else
> >   ptr->x_flag_aarch64_early_ldp_fusion = (signed char )
> > bp_unpack_var_len_int (bp); #endif
> >
> > To test if this works, I added -mfoo to nvptx.opt and marked it with
> > both Target and Optimization, which resulted in the following build
> error for nvptx:
> >
> > options-save.cc:13548:2: error: #error accel compiler cannot define
> > Optimization attribute for target-specifc option x_flag_nvptx_foo
> > 13548 | #error accel compiler cannot define Optimization attribute
> for target-specific option x_flag_nvptx_foo
> >       |  ^~~~~
> > >
> > > It may have been misguided to mark target specific flags as
> > > Optimization.  It might be required to merge those (from all
> > > targets) into the common optimize enum, like we do for tree codes.
> > > Language specific options marked as Optimization possibly have the
> > > same issue when mixing with other languages and LTO.  Can you
> assess
> > > the situation a bit more?
> > AFAIK, only c-family/c.opt marks few options with Optimization flag.
> I
> > tried marking fortran's -ffrontend-optimize with Optimization and
> > verified that Optimization options are combined for c-family
> languages and fortran in cl_optimization_stream_{out,in}.
> >
> > cl_optimization_stream_out shows:
> >   ...
> >   bp_pack_var_len_int (bp, ptr->x_flag_frontend_optimize);
> >   ...
> >   bp_pack_var_len_int (bp, ptr->x_flag_nothrow_opt);
> >
> > and likewise has corresponding entries for
> cl_optimization_stream_in.
> >
> > So I guess this shouldn't be an issue with lang specific
> Optimization opts ?
> 
> Yes.  So as said if we now support multiple targets in one image by
> means of offloading we possibly should see to merge options like we do
> for frontends.
> 
> That said, I think the patch you posted is OK, the above is just
> something that we should remember when more similar cases pop up.
Thanks, I committed the patch in:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=db2e9a2a46f64b037494e8300c46f2d90a9fa55c
after LTO bootstrap+test on aarch64-linux-gnu and verifying it survives libgomp 
testing
for AArch64/nvptx offloading.

Thanks,
Prathamesh
> 
> Richard.
> 
> > Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>
> >
> > Thanks,
> > Prathamesh
> > >
> > > I think the proposed fix looks reasonable but the problem might be
> > > more widespread and warrant a more global solution or at least
> > > revisiting documentation?
> > >
> > > Thanks,
> > > Richard
> > >
> > > > Thanks,
> > > > Andrew Pinski
> > > >
> > > >>
> > > >> Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>
> > > >>
> > > >> Thanks,
> > > >> Prathamesh
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

Reply via email to