I would also like to ping this patch.

On Thu, Jan 09, 2025 at 04:15:24PM +0100, Michael Matz wrote:
> Hello,
> 
> On Wed, 13 Nov 2024, Michael Matz wrote:
> 
> > Hello,
> > 
> > this is essentially 
> > 
> >   https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html
> > 
> > from Kewen in functionality.  When discussing this with Segher at the 
> > Cauldron he expressed reservations about changing the default 
> > implementation of -fpatchable-function-entry.  So, to move forward, let's 
> > move it under a new target option -msplit-patch-nops (expressing the 
> > important deviation from the default behaviour, namely that all the 
> > patching nops form a consecutive sequence normally).
> > 
> > Regstrapping on power9 ppc64le in progress.  Okay if that passes?
> 
> Is there anything I can do to move forward with this one?  Please?
> 
> 
> Ciao,
> Michael.
> 
> 
> > 
> > 
> > Ciao,
> > Michael.
> > 
> > ---
> > 
> > as the bug report details some uses of -fpatchable-function-entry
> > aren't happy with the "before" NOPs being inserted between global and
> > local entry point on powerpc.  We want the before NOPs be in front
> > of the global entry point.  That means that the patching NOPs aren't
> > consecutive for dual entry point functions, but for these usecases
> > that's not the problem.  But let us support both under the control
> > of a new target option: -msplit-patch-nops.
> > 
> >     gcc/
> > 
> >         PR target/112980
> >         * config/rs6000/rs6000.opt (msplit-patch-nops): New option.
> >         * doc/invoke.texi (RS/6000 and PowerPC Options): Document it.
> >         * config/rs6000/rs6000.h (machine_function.stop_patch_area_print):
> >         New member.
> >         * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry):
> >         Emit split nops under control of that one.
> >         * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
> >         Add handling of split patch nops.
> > ---
> >  gcc/config/rs6000/rs6000-logue.cc | 15 +++++++++------
> >  gcc/config/rs6000/rs6000.cc       | 27 +++++++++++++++++++++++----
> >  gcc/config/rs6000/rs6000.h        |  6 ++++++
> >  gcc/config/rs6000/rs6000.opt      |  4 ++++
> >  gcc/doc/invoke.texi               | 17 +++++++++++++++--
> >  5 files changed, 57 insertions(+), 12 deletions(-)
> > 
> > diff --git a/gcc/config/rs6000/rs6000-logue.cc 
> > b/gcc/config/rs6000/rs6000-logue.cc
> > index c87058b435e..aa1e0442f2b 100644
> > --- a/gcc/config/rs6000/rs6000-logue.cc
> > +++ b/gcc/config/rs6000/rs6000-logue.cc
> > @@ -4005,8 +4005,8 @@ rs6000_output_function_prologue (FILE *file)
> >  
> >        unsigned short patch_area_size = crtl->patch_area_size;
> >        unsigned short patch_area_entry = crtl->patch_area_entry;
> > -      /* Need to emit the patching area.  */
> > -      if (patch_area_size > 0)
> > +      /* Emit non-split patching area now.  */
> > +      if (!TARGET_SPLIT_PATCH_NOPS && patch_area_size > 0)
> >     {
> >       cfun->machine->global_entry_emitted = true;
> >       /* As ELFv2 ABI shows, the allowable bytes between the global
> > @@ -4027,7 +4027,6 @@ rs6000_output_function_prologue (FILE *file)
> >                    patch_area_entry);
> >           rs6000_print_patchable_function_entry (file, patch_area_entry,
> >                                                  true);
> > -         patch_area_size -= patch_area_entry;
> >         }
> >     }
> >  
> > @@ -4037,9 +4036,13 @@ rs6000_output_function_prologue (FILE *file)
> >        assemble_name (file, name);
> >        fputs ("\n", file);
> >        /* Emit the nops after local entry.  */
> > -      if (patch_area_size > 0)
> > -   rs6000_print_patchable_function_entry (file, patch_area_size,
> > -                                          patch_area_entry == 0);
> > +      if (patch_area_size > patch_area_entry)
> > +   {
> > +     patch_area_size -= patch_area_entry;
> > +     cfun->machine->stop_patch_area_print = false;
> > +     rs6000_print_patchable_function_entry (file, patch_area_size,
> > +                                            patch_area_entry == 0);
> > +   }
> >      }
> >  
> >    else if (rs6000_pcrel_p ())
> > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> > index 950fd947fda..6427e6913ba 100644
> > --- a/gcc/config/rs6000/rs6000.cc
> > +++ b/gcc/config/rs6000/rs6000.cc
> > @@ -15226,11 +15226,25 @@ rs6000_print_patchable_function_entry (FILE *file,
> >  {
> >    bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p 
> > ();
> >    /* For a function which needs global entry point, we will emit the
> > -     patchable area before and after local entry point under the control of
> > -     cfun->machine->global_entry_emitted, see the handling in function
> > -     rs6000_output_function_prologue.  */
> > -  if (!global_entry_needed_p || cfun->machine->global_entry_emitted)
> > +     patchable area when it isn't split before and after local entry point
> > +     under the control of cfun->machine->global_entry_emitted, see the
> > +     handling in function rs6000_output_function_prologue.  */
> > +  if (!TARGET_SPLIT_PATCH_NOPS
> > +      && (!global_entry_needed_p || cfun->machine->global_entry_emitted))
> >      default_print_patchable_function_entry (file, patch_area_size, 
> > record_p);
> > +
> > +  /* For split patch nops we emit the before nops (from generic code)
> > +     in front of the global entry point and after the local entry point,
> > +     under the control of cfun->machine->stop_patch_area_print, see
> > +     rs6000_output_function_prologue and rs6000_elf_declare_function_name. 
> >  */
> > +  if (TARGET_SPLIT_PATCH_NOPS)
> > +    {
> > +      if (!cfun->machine->stop_patch_area_print)
> > +   default_print_patchable_function_entry (file, patch_area_size,
> > +                                           record_p);
> > +      else
> > +   gcc_assert (global_entry_needed_p);
> > +    }
> >  }
> >  
> >  enum rtx_code
> > @@ -21423,6 +21437,11 @@ rs6000_elf_declare_function_name (FILE *file, 
> > const char *name, tree decl)
> >        fprintf (file, "\t.previous\n");
> >      }
> >    ASM_OUTPUT_FUNCTION_LABEL (file, name, decl);
> > +  /* At this time, the "before" NOPs have been already emitted.
> > +     For split nops stop generic code from printing the "after" NOPs and
> > +     emit them just after local entry ourself later.  */
> > +  if (rs6000_global_entry_point_prologue_needed_p ())
> > +    cfun->machine->stop_patch_area_print = true;
> >  }
> >  
> >  static void rs6000_elf_file_end (void) ATTRIBUTE_UNUSED;
> > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> > index d460eb06544..2bd56f8d377 100644
> > --- a/gcc/config/rs6000/rs6000.h
> > +++ b/gcc/config/rs6000/rs6000.h
> > @@ -2424,6 +2424,12 @@ typedef struct GTY(()) machine_function
> >       global entry.  It helps to control the patchable area before and after
> >       local entry.  */
> >    bool global_entry_emitted;
> > +  /* With ELFv2 ABI dual entry points being adopted, generic framework
> > +     targetm.asm_out.print_patchable_function_entry would generate "after"
> > +     NOPs before local entry, which is wrong.  This flag is to stop it from
> > +     printing patch area before local entry, it is only useful when the
> > +     function requires dual entry points.  */
> > +  bool stop_patch_area_print;
> >  } machine_function;
> >  #endif
> >  
> > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> > index 94323bd1db2..27a165ef0ee 100644
> > --- a/gcc/config/rs6000/rs6000.opt
> > +++ b/gcc/config/rs6000/rs6000.opt
> > @@ -300,6 +300,10 @@ mfull-toc
> >  Target
> >  Put everything in the regular TOC.
> >  
> > +msplit-patch-nops
> > +Target Var(TARGET_SPLIT_PATCH_NOPS) Init(0)
> > +Emit NOPs before global and after local entry point for 
> > -fpatchable-function-entry.
> > +
> >  mvrsave
> >  Target Var(TARGET_ALTIVEC_VRSAVE) Save
> >  Generate VRSAVE instructions when generating AltiVec code.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 93e9af3791c..f3d37a48fd0 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -1318,6 +1318,7 @@ See RS/6000 and PowerPC Options.
> >  -mtraceback=@var{traceback_type}
> >  -maix-struct-return  -msvr4-struct-return
> >  -mabi=@var{abi-type}  -msecure-plt  -mbss-plt
> > +-msplit-patch-nops
> >  -mlongcall  -mno-longcall  -mpltseq  -mno-pltseq
> >  -mblock-move-inline-limit=@var{num}
> >  -mblock-compare-inline-limit=@var{num}
> > @@ -18569,11 +18570,12 @@ If @code{N=0}, no pad location is recorded.
> >  The NOP instructions are inserted at---and maybe before, depending on
> >  @var{M}---the function entry address, even before the prologue.  On
> >  PowerPC with the ELFv2 ABI, for a function with dual entry points,
> > -the local entry point is this function entry address.
> > +the local entry point is this function entry address by default.  See
> > +the @option{-msplit-patch-nops} option to change this.
> >  
> >  The maximum value of @var{N} and @var{M} is 65535.  On PowerPC with the
> >  ELFv2 ABI, for a function with dual entry points, the supported values
> > -for @var{M} are 0, 2, 6 and 14.
> > +for @var{M} are 0, 2, 6 and 14 when not using @option{-msplit-patch-nops}.
> >  @end table
> >  
> >  
> > @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got}
> >  sections that are both writable and executable.
> >  This is a PowerPC 32-bit SYSV ABI option.
> >  
> > +@opindex msplit-patch-nops
> > +@item -msplit-patch-nops
> > +When adding NOPs for a patchable area via the
> > +@option{-fpatchable-function-entry} option emit the "before" NOPs in front
> > +of the global entry point and the "after" NOPs after the local entry point.
> > +This makes the sequence of NOPs not consecutive when a global entry point
> > +is generated.  Without this option the NOPs are emitted directly before and
> > +after the local entry point, making them consecutive but moving global and
> > +local entry point further apart.  If only a single entry point is generated
> > +this option has no effect.
> > +
> >  @opindex misel
> >  @opindex mno-isel
> >  @item -misel
> > 
> 

Marek

Reply via email to