> -----Original Message----- > From: Richard Biener <rguent...@suse.de> > Sent: Monday, August 19, 2024 6:51 PM > To: Prathamesh Kulkarni <prathame...@nvidia.com> > Cc: Andrew Pinski <pins...@gmail.com>; gcc-patches@gcc.gnu.org; Thomas > Schwinge <tschwi...@baylibre.com> > Subject: RE: [optc-save-gen.awk] Fix streaming of command line options > for offloading > > External email: Use caution opening links or attachments > > > On Mon, 19 Aug 2024, Prathamesh Kulkarni wrote: > > > > > > > > -----Original Message----- > > > From: Richard Biener <rguent...@suse.de> > > > Sent: Tuesday, August 13, 2024 12:52 PM > > > To: Andrew Pinski <pins...@gmail.com> > > > Cc: Prathamesh Kulkarni <prathame...@nvidia.com>; gcc- > > > patc...@gcc.gnu.org; Thomas Schwinge <tschwi...@baylibre.com> > > > Subject: Re: [optc-save-gen.awk] Fix streaming of command line > > > options for offloading > > > > > > External email: Use caution opening links or attachments > > > > > > > > > > Am 13.08.2024 um 08:37 schrieb Andrew Pinski > <pins...@gmail.com>: > > > > > > > > On Mon, Aug 12, 2024 at 10:36 PM Prathamesh Kulkarni > > > > <prathame...@nvidia.com> wrote: > > > >> > > > >> Hi, > > > >> As mentioned in: > > > >> https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html > > > >> > > > >> AArch64 cl_optimization_stream_out streams out target-specific > > > >> optimization options like flag_aarch64_early_ldp_fusion, > > > aarch64_early_ra etc, which breaks AArch64/nvptx offloading, since > > > nvptx cl_optimization_stream_in doesn't have corresponding stream- > in > > > for these options and ends up setting invalid values for ptr- > > > >explicit_mask (and subsequent data structures). > > > >> > > > >> This makes even a trivial test like the following to cause ICE > in > > > lto_read_decls with -O3 -fopenmp -foffload=nvptx-none: > > > >> > > > >> int main() > > > >> { > > > >> int x; > > > >> #pragma omp target map(x) > > > >> x; > > > >> } > > > >> > > > >> The attached patch modifies optc-save-gen.awk to generate if > > > >> (!lto_stream_offload_p) check before streaming out > > > >> target-specific > > > opt in cl_optimization_stream_out, which fixes the issue. > > > cl_optimization_stream_out after patch (last few entries): > > > >> > > > >> bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer); > > > >> bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p); if > > > >> (!lto_stream_offload_p) bp_pack_var_len_int (bp, > > > >> ptr->x_flag_aarch64_early_ldp_fusion); > > > >> if (!lto_stream_offload_p) > > > >> bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra); if > > > >> (!lto_stream_offload_p) bp_pack_var_len_int (bp, > > > >> ptr->x_flag_aarch64_late_ldp_fusion); > > > >> if (!lto_stream_offload_p) > > > >> bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div); if > > > >> (!lto_stream_offload_p) bp_pack_var_len_int (bp, > > > >> ptr->x_flag_mrecip_low_precision_sqrt); > > > >> if (!lto_stream_offload_p) > > > >> bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt); > for > > > >> (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++) > > > >> bp_pack_value (bp, ptr->explicit_mask[i], 64); > > > >> > > > >> For target-specific options, streaming out is gated on > > > !lto_stream_offload_p check. > > > >> > > > >> The patch also fixes failures due to same issue with > > > >> x86_64->nvptx > > > offloading for target-print-1.f90 (and couple more). > > > >> Does the patch look OK ? > > > > > > > > I think it seems to be on the right track. One thing that is > also > > > > going to be an issue is streaming in, there could be a target > > > > option on the offload side that is marked as Optimization that > > > > would might also cause issues. We should check to make sure that > > > > also gets fixed here too. Or error out for offloading targets > > > > can't have target options with Optimization on them during the > build. > > Thanks for the suggestions. The attached patch modifies > > optc-save-gen.awk to emit an error if accel backend marks target > specific option with Optimization. > > AFAIU, currently neither nvptx nor gcn have target-specific options > > marked with Optimization, so this is mostly a safeguard against > future additions. > > > > cl_optimization_stream_in after patch for target-specifc > optimization options: > > > > #ifdef ACCEL_COMPILER > > #error accel compiler cannot define Optimization attribute for > > target-specific option x_flag_aarch64_early_ldp_fusion #else > > ptr->x_flag_aarch64_early_ldp_fusion = (signed char ) > > bp_unpack_var_len_int (bp); #endif > > > > To test if this works, I added -mfoo to nvptx.opt and marked it with > > both Target and Optimization, which resulted in the following build > error for nvptx: > > > > options-save.cc:13548:2: error: #error accel compiler cannot define > > Optimization attribute for target-specifc option x_flag_nvptx_foo > > 13548 | #error accel compiler cannot define Optimization attribute > for target-specific option x_flag_nvptx_foo > > | ^~~~~ > > > > > > It may have been misguided to mark target specific flags as > > > Optimization. It might be required to merge those (from all > > > targets) into the common optimize enum, like we do for tree codes. > > > Language specific options marked as Optimization possibly have the > > > same issue when mixing with other languages and LTO. Can you > assess > > > the situation a bit more? > > AFAIK, only c-family/c.opt marks few options with Optimization flag. > I > > tried marking fortran's -ffrontend-optimize with Optimization and > > verified that Optimization options are combined for c-family > languages and fortran in cl_optimization_stream_{out,in}. > > > > cl_optimization_stream_out shows: > > ... > > bp_pack_var_len_int (bp, ptr->x_flag_frontend_optimize); > > ... > > bp_pack_var_len_int (bp, ptr->x_flag_nothrow_opt); > > > > and likewise has corresponding entries for > cl_optimization_stream_in. > > > > So I guess this shouldn't be an issue with lang specific > Optimization opts ? > > Yes. So as said if we now support multiple targets in one image by > means of offloading we possibly should see to merge options like we do > for frontends. > > That said, I think the patch you posted is OK, the above is just > something that we should remember when more similar cases pop up. Thanks, I committed the patch in: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=db2e9a2a46f64b037494e8300c46f2d90a9fa55c after LTO bootstrap+test on aarch64-linux-gnu and verifying it survives libgomp testing for AArch64/nvptx offloading.
Thanks, Prathamesh > > Richard. > > > Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com> > > > > Thanks, > > Prathamesh > > > > > > I think the proposed fix looks reasonable but the problem might be > > > more widespread and warrant a more global solution or at least > > > revisiting documentation? > > > > > > Thanks, > > > Richard > > > > > > > Thanks, > > > > Andrew Pinski > > > > > > > >> > > > >> Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com> > > > >> > > > >> Thanks, > > > >> Prathamesh > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg)