I would also like to ping this patch. On Thu, Jan 09, 2025 at 04:15:24PM +0100, Michael Matz wrote: > Hello, > > On Wed, 13 Nov 2024, Michael Matz wrote: > > > Hello, > > > > this is essentially > > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html > > > > from Kewen in functionality. When discussing this with Segher at the > > Cauldron he expressed reservations about changing the default > > implementation of -fpatchable-function-entry. So, to move forward, let's > > move it under a new target option -msplit-patch-nops (expressing the > > important deviation from the default behaviour, namely that all the > > patching nops form a consecutive sequence normally). > > > > Regstrapping on power9 ppc64le in progress. Okay if that passes? > > Is there anything I can do to move forward with this one? Please? > > > Ciao, > Michael. > > > > > > > > Ciao, > > Michael. > > > > --- > > > > as the bug report details some uses of -fpatchable-function-entry > > aren't happy with the "before" NOPs being inserted between global and > > local entry point on powerpc. We want the before NOPs be in front > > of the global entry point. That means that the patching NOPs aren't > > consecutive for dual entry point functions, but for these usecases > > that's not the problem. But let us support both under the control > > of a new target option: -msplit-patch-nops. > > > > gcc/ > > > > PR target/112980 > > * config/rs6000/rs6000.opt (msplit-patch-nops): New option. > > * doc/invoke.texi (RS/6000 and PowerPC Options): Document it. > > * config/rs6000/rs6000.h (machine_function.stop_patch_area_print): > > New member. > > * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): > > Emit split nops under control of that one. > > * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue): > > Add handling of split patch nops. > > --- > > gcc/config/rs6000/rs6000-logue.cc | 15 +++++++++------ > > gcc/config/rs6000/rs6000.cc | 27 +++++++++++++++++++++++---- > > gcc/config/rs6000/rs6000.h | 6 ++++++ > > gcc/config/rs6000/rs6000.opt | 4 ++++ > > gcc/doc/invoke.texi | 17 +++++++++++++++-- > > 5 files changed, 57 insertions(+), 12 deletions(-) > > > > diff --git a/gcc/config/rs6000/rs6000-logue.cc > > b/gcc/config/rs6000/rs6000-logue.cc > > index c87058b435e..aa1e0442f2b 100644 > > --- a/gcc/config/rs6000/rs6000-logue.cc > > +++ b/gcc/config/rs6000/rs6000-logue.cc > > @@ -4005,8 +4005,8 @@ rs6000_output_function_prologue (FILE *file) > > > > unsigned short patch_area_size = crtl->patch_area_size; > > unsigned short patch_area_entry = crtl->patch_area_entry; > > - /* Need to emit the patching area. */ > > - if (patch_area_size > 0) > > + /* Emit non-split patching area now. */ > > + if (!TARGET_SPLIT_PATCH_NOPS && patch_area_size > 0) > > { > > cfun->machine->global_entry_emitted = true; > > /* As ELFv2 ABI shows, the allowable bytes between the global > > @@ -4027,7 +4027,6 @@ rs6000_output_function_prologue (FILE *file) > > patch_area_entry); > > rs6000_print_patchable_function_entry (file, patch_area_entry, > > true); > > - patch_area_size -= patch_area_entry; > > } > > } > > > > @@ -4037,9 +4036,13 @@ rs6000_output_function_prologue (FILE *file) > > assemble_name (file, name); > > fputs ("\n", file); > > /* Emit the nops after local entry. */ > > - if (patch_area_size > 0) > > - rs6000_print_patchable_function_entry (file, patch_area_size, > > - patch_area_entry == 0); > > + if (patch_area_size > patch_area_entry) > > + { > > + patch_area_size -= patch_area_entry; > > + cfun->machine->stop_patch_area_print = false; > > + rs6000_print_patchable_function_entry (file, patch_area_size, > > + patch_area_entry == 0); > > + } > > } > > > > else if (rs6000_pcrel_p ()) > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > > index 950fd947fda..6427e6913ba 100644 > > --- a/gcc/config/rs6000/rs6000.cc > > +++ b/gcc/config/rs6000/rs6000.cc > > @@ -15226,11 +15226,25 @@ rs6000_print_patchable_function_entry (FILE *file, > > { > > bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p > > (); > > /* For a function which needs global entry point, we will emit the > > - patchable area before and after local entry point under the control of > > - cfun->machine->global_entry_emitted, see the handling in function > > - rs6000_output_function_prologue. */ > > - if (!global_entry_needed_p || cfun->machine->global_entry_emitted) > > + patchable area when it isn't split before and after local entry point > > + under the control of cfun->machine->global_entry_emitted, see the > > + handling in function rs6000_output_function_prologue. */ > > + if (!TARGET_SPLIT_PATCH_NOPS > > + && (!global_entry_needed_p || cfun->machine->global_entry_emitted)) > > default_print_patchable_function_entry (file, patch_area_size, > > record_p); > > + > > + /* For split patch nops we emit the before nops (from generic code) > > + in front of the global entry point and after the local entry point, > > + under the control of cfun->machine->stop_patch_area_print, see > > + rs6000_output_function_prologue and rs6000_elf_declare_function_name. > > */ > > + if (TARGET_SPLIT_PATCH_NOPS) > > + { > > + if (!cfun->machine->stop_patch_area_print) > > + default_print_patchable_function_entry (file, patch_area_size, > > + record_p); > > + else > > + gcc_assert (global_entry_needed_p); > > + } > > } > > > > enum rtx_code > > @@ -21423,6 +21437,11 @@ rs6000_elf_declare_function_name (FILE *file, > > const char *name, tree decl) > > fprintf (file, "\t.previous\n"); > > } > > ASM_OUTPUT_FUNCTION_LABEL (file, name, decl); > > + /* At this time, the "before" NOPs have been already emitted. > > + For split nops stop generic code from printing the "after" NOPs and > > + emit them just after local entry ourself later. */ > > + if (rs6000_global_entry_point_prologue_needed_p ()) > > + cfun->machine->stop_patch_area_print = true; > > } > > > > static void rs6000_elf_file_end (void) ATTRIBUTE_UNUSED; > > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > > index d460eb06544..2bd56f8d377 100644 > > --- a/gcc/config/rs6000/rs6000.h > > +++ b/gcc/config/rs6000/rs6000.h > > @@ -2424,6 +2424,12 @@ typedef struct GTY(()) machine_function > > global entry. It helps to control the patchable area before and after > > local entry. */ > > bool global_entry_emitted; > > + /* With ELFv2 ABI dual entry points being adopted, generic framework > > + targetm.asm_out.print_patchable_function_entry would generate "after" > > + NOPs before local entry, which is wrong. This flag is to stop it from > > + printing patch area before local entry, it is only useful when the > > + function requires dual entry points. */ > > + bool stop_patch_area_print; > > } machine_function; > > #endif > > > > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > > index 94323bd1db2..27a165ef0ee 100644 > > --- a/gcc/config/rs6000/rs6000.opt > > +++ b/gcc/config/rs6000/rs6000.opt > > @@ -300,6 +300,10 @@ mfull-toc > > Target > > Put everything in the regular TOC. > > > > +msplit-patch-nops > > +Target Var(TARGET_SPLIT_PATCH_NOPS) Init(0) > > +Emit NOPs before global and after local entry point for > > -fpatchable-function-entry. > > + > > mvrsave > > Target Var(TARGET_ALTIVEC_VRSAVE) Save > > Generate VRSAVE instructions when generating AltiVec code. > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index 93e9af3791c..f3d37a48fd0 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -1318,6 +1318,7 @@ See RS/6000 and PowerPC Options. > > -mtraceback=@var{traceback_type} > > -maix-struct-return -msvr4-struct-return > > -mabi=@var{abi-type} -msecure-plt -mbss-plt > > +-msplit-patch-nops > > -mlongcall -mno-longcall -mpltseq -mno-pltseq > > -mblock-move-inline-limit=@var{num} > > -mblock-compare-inline-limit=@var{num} > > @@ -18569,11 +18570,12 @@ If @code{N=0}, no pad location is recorded. > > The NOP instructions are inserted at---and maybe before, depending on > > @var{M}---the function entry address, even before the prologue. On > > PowerPC with the ELFv2 ABI, for a function with dual entry points, > > -the local entry point is this function entry address. > > +the local entry point is this function entry address by default. See > > +the @option{-msplit-patch-nops} option to change this. > > > > The maximum value of @var{N} and @var{M} is 65535. On PowerPC with the > > ELFv2 ABI, for a function with dual entry points, the supported values > > -for @var{M} are 0, 2, 6 and 14. > > +for @var{M} are 0, 2, 6 and 14 when not using @option{-msplit-patch-nops}. > > @end table > > > > > > @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got} > > sections that are both writable and executable. > > This is a PowerPC 32-bit SYSV ABI option. > > > > +@opindex msplit-patch-nops > > +@item -msplit-patch-nops > > +When adding NOPs for a patchable area via the > > +@option{-fpatchable-function-entry} option emit the "before" NOPs in front > > +of the global entry point and the "after" NOPs after the local entry point. > > +This makes the sequence of NOPs not consecutive when a global entry point > > +is generated. Without this option the NOPs are emitted directly before and > > +after the local entry point, making them consecutive but moving global and > > +local entry point further apart. If only a single entry point is generated > > +this option has no effect. > > + > > @opindex misel > > @opindex mno-isel > > @item -misel > > >
Marek