Hello, even though it is not my work, I would like to ping this patch. Having it upstream would really help us a lot.
Thank you very much in advance, Martin On Wed, Nov 13 2024, Michael Matz wrote: > Hello, > > this is essentially > > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html > > from Kewen in functionality. When discussing this with Segher at the > Cauldron he expressed reservations about changing the default > implementation of -fpatchable-function-entry. So, to move forward, let's > move it under a new target option -msplit-patch-nops (expressing the > important deviation from the default behaviour, namely that all the > patching nops form a consecutive sequence normally). > > Regstrapping on power9 ppc64le in progress. Okay if that passes? > > > Ciao, > Michael. > > --- > > as the bug report details some uses of -fpatchable-function-entry > aren't happy with the "before" NOPs being inserted between global and > local entry point on powerpc. We want the before NOPs be in front > of the global entry point. That means that the patching NOPs aren't > consecutive for dual entry point functions, but for these usecases > that's not the problem. But let us support both under the control > of a new target option: -msplit-patch-nops. > > gcc/ > > PR target/112980 > * config/rs6000/rs6000.opt (msplit-patch-nops): New option. > * doc/invoke.texi (RS/6000 and PowerPC Options): Document it. > * config/rs6000/rs6000.h (machine_function.stop_patch_area_print): > New member. > * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): > Emit split nops under control of that one. > * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue): > Add handling of split patch nops. > --- > gcc/config/rs6000/rs6000-logue.cc | 15 +++++++++------ > gcc/config/rs6000/rs6000.cc | 27 +++++++++++++++++++++++---- > gcc/config/rs6000/rs6000.h | 6 ++++++ > gcc/config/rs6000/rs6000.opt | 4 ++++ > gcc/doc/invoke.texi | 17 +++++++++++++++-- > 5 files changed, 57 insertions(+), 12 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000-logue.cc > b/gcc/config/rs6000/rs6000-logue.cc > index c87058b435e..aa1e0442f2b 100644 > --- a/gcc/config/rs6000/rs6000-logue.cc > +++ b/gcc/config/rs6000/rs6000-logue.cc > @@ -4005,8 +4005,8 @@ rs6000_output_function_prologue (FILE *file) > > unsigned short patch_area_size = crtl->patch_area_size; > unsigned short patch_area_entry = crtl->patch_area_entry; > - /* Need to emit the patching area. */ > - if (patch_area_size > 0) > + /* Emit non-split patching area now. */ > + if (!TARGET_SPLIT_PATCH_NOPS && patch_area_size > 0) > { > cfun->machine->global_entry_emitted = true; > /* As ELFv2 ABI shows, the allowable bytes between the global > @@ -4027,7 +4027,6 @@ rs6000_output_function_prologue (FILE *file) > patch_area_entry); > rs6000_print_patchable_function_entry (file, patch_area_entry, > true); > - patch_area_size -= patch_area_entry; > } > } > > @@ -4037,9 +4036,13 @@ rs6000_output_function_prologue (FILE *file) > assemble_name (file, name); > fputs ("\n", file); > /* Emit the nops after local entry. */ > - if (patch_area_size > 0) > - rs6000_print_patchable_function_entry (file, patch_area_size, > - patch_area_entry == 0); > + if (patch_area_size > patch_area_entry) > + { > + patch_area_size -= patch_area_entry; > + cfun->machine->stop_patch_area_print = false; > + rs6000_print_patchable_function_entry (file, patch_area_size, > + patch_area_entry == 0); > + } > } > > else if (rs6000_pcrel_p ()) > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 950fd947fda..6427e6913ba 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -15226,11 +15226,25 @@ rs6000_print_patchable_function_entry (FILE *file, > { > bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p > (); > /* For a function which needs global entry point, we will emit the > - patchable area before and after local entry point under the control of > - cfun->machine->global_entry_emitted, see the handling in function > - rs6000_output_function_prologue. */ > - if (!global_entry_needed_p || cfun->machine->global_entry_emitted) > + patchable area when it isn't split before and after local entry point > + under the control of cfun->machine->global_entry_emitted, see the > + handling in function rs6000_output_function_prologue. */ > + if (!TARGET_SPLIT_PATCH_NOPS > + && (!global_entry_needed_p || cfun->machine->global_entry_emitted)) > default_print_patchable_function_entry (file, patch_area_size, record_p); > + > + /* For split patch nops we emit the before nops (from generic code) > + in front of the global entry point and after the local entry point, > + under the control of cfun->machine->stop_patch_area_print, see > + rs6000_output_function_prologue and rs6000_elf_declare_function_name. > */ > + if (TARGET_SPLIT_PATCH_NOPS) > + { > + if (!cfun->machine->stop_patch_area_print) > + default_print_patchable_function_entry (file, patch_area_size, > + record_p); > + else > + gcc_assert (global_entry_needed_p); > + } > } > > enum rtx_code > @@ -21423,6 +21437,11 @@ rs6000_elf_declare_function_name (FILE *file, const > char *name, tree decl) > fprintf (file, "\t.previous\n"); > } > ASM_OUTPUT_FUNCTION_LABEL (file, name, decl); > + /* At this time, the "before" NOPs have been already emitted. > + For split nops stop generic code from printing the "after" NOPs and > + emit them just after local entry ourself later. */ > + if (rs6000_global_entry_point_prologue_needed_p ()) > + cfun->machine->stop_patch_area_print = true; > } > > static void rs6000_elf_file_end (void) ATTRIBUTE_UNUSED; > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index d460eb06544..2bd56f8d377 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -2424,6 +2424,12 @@ typedef struct GTY(()) machine_function > global entry. It helps to control the patchable area before and after > local entry. */ > bool global_entry_emitted; > + /* With ELFv2 ABI dual entry points being adopted, generic framework > + targetm.asm_out.print_patchable_function_entry would generate "after" > + NOPs before local entry, which is wrong. This flag is to stop it from > + printing patch area before local entry, it is only useful when the > + function requires dual entry points. */ > + bool stop_patch_area_print; > } machine_function; > #endif > > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > index 94323bd1db2..27a165ef0ee 100644 > --- a/gcc/config/rs6000/rs6000.opt > +++ b/gcc/config/rs6000/rs6000.opt > @@ -300,6 +300,10 @@ mfull-toc > Target > Put everything in the regular TOC. > > +msplit-patch-nops > +Target Var(TARGET_SPLIT_PATCH_NOPS) Init(0) > +Emit NOPs before global and after local entry point for > -fpatchable-function-entry. > + > mvrsave > Target Var(TARGET_ALTIVEC_VRSAVE) Save > Generate VRSAVE instructions when generating AltiVec code. > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 93e9af3791c..f3d37a48fd0 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -1318,6 +1318,7 @@ See RS/6000 and PowerPC Options. > -mtraceback=@var{traceback_type} > -maix-struct-return -msvr4-struct-return > -mabi=@var{abi-type} -msecure-plt -mbss-plt > +-msplit-patch-nops > -mlongcall -mno-longcall -mpltseq -mno-pltseq > -mblock-move-inline-limit=@var{num} > -mblock-compare-inline-limit=@var{num} > @@ -18569,11 +18570,12 @@ If @code{N=0}, no pad location is recorded. > The NOP instructions are inserted at---and maybe before, depending on > @var{M}---the function entry address, even before the prologue. On > PowerPC with the ELFv2 ABI, for a function with dual entry points, > -the local entry point is this function entry address. > +the local entry point is this function entry address by default. See > +the @option{-msplit-patch-nops} option to change this. > > The maximum value of @var{N} and @var{M} is 65535. On PowerPC with the > ELFv2 ABI, for a function with dual entry points, the supported values > -for @var{M} are 0, 2, 6 and 14. > +for @var{M} are 0, 2, 6 and 14 when not using @option{-msplit-patch-nops}. > @end table > > > @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got} > sections that are both writable and executable. > This is a PowerPC 32-bit SYSV ABI option. > > +@opindex msplit-patch-nops > +@item -msplit-patch-nops > +When adding NOPs for a patchable area via the > +@option{-fpatchable-function-entry} option emit the "before" NOPs in front > +of the global entry point and the "after" NOPs after the local entry point. > +This makes the sequence of NOPs not consecutive when a global entry point > +is generated. Without this option the NOPs are emitted directly before and > +after the local entry point, making them consecutive but moving global and > +local entry point further apart. If only a single entry point is generated > +this option has no effect. > + > @opindex misel > @opindex mno-isel > @item -misel > -- > 2.35.3