Hello,

On Wed, Dec 11 2024, Martin Jambor wrote:
> Hello,
>
> even though it is not my work, I would like to ping this patch.  Having
> it upstream would really help us a lot.
>

Please, pretty please, consider reviewing this in time for GCC 15,
having it upstream would really help us a lot and from what I can tell,
it should no longer be controversial.

Thank you very much in advance,

Martin

>
> On Wed, Nov 13 2024, Michael Matz wrote:
>> Hello,
>>
>> this is essentially 
>>
>>   https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html
>>
>> from Kewen in functionality.  When discussing this with Segher at the 
>> Cauldron he expressed reservations about changing the default 
>> implementation of -fpatchable-function-entry.  So, to move forward, let's 
>> move it under a new target option -msplit-patch-nops (expressing the 
>> important deviation from the default behaviour, namely that all the 
>> patching nops form a consecutive sequence normally).
>>
>> Regstrapping on power9 ppc64le in progress.  Okay if that passes?
>>
>>
>> Ciao,
>> Michael.
>>
>> ---
>>
>> as the bug report details some uses of -fpatchable-function-entry
>> aren't happy with the "before" NOPs being inserted between global and
>> local entry point on powerpc.  We want the before NOPs be in front
>> of the global entry point.  That means that the patching NOPs aren't
>> consecutive for dual entry point functions, but for these usecases
>> that's not the problem.  But let us support both under the control
>> of a new target option: -msplit-patch-nops.
>>
>>      gcc/
>>
>>         PR target/112980
>>         * config/rs6000/rs6000.opt (msplit-patch-nops): New option.
>>         * doc/invoke.texi (RS/6000 and PowerPC Options): Document it.
>>         * config/rs6000/rs6000.h (machine_function.stop_patch_area_print):
>>         New member.
>>         * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry):
>>         Emit split nops under control of that one.
>>         * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
>>         Add handling of split patch nops.
>> ---
>>  gcc/config/rs6000/rs6000-logue.cc | 15 +++++++++------
>>  gcc/config/rs6000/rs6000.cc       | 27 +++++++++++++++++++++++----
>>  gcc/config/rs6000/rs6000.h        |  6 ++++++
>>  gcc/config/rs6000/rs6000.opt      |  4 ++++
>>  gcc/doc/invoke.texi               | 17 +++++++++++++++--
>>  5 files changed, 57 insertions(+), 12 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-logue.cc 
>> b/gcc/config/rs6000/rs6000-logue.cc
>> index c87058b435e..aa1e0442f2b 100644
>> --- a/gcc/config/rs6000/rs6000-logue.cc
>> +++ b/gcc/config/rs6000/rs6000-logue.cc
>> @@ -4005,8 +4005,8 @@ rs6000_output_function_prologue (FILE *file)
>>  
>>        unsigned short patch_area_size = crtl->patch_area_size;
>>        unsigned short patch_area_entry = crtl->patch_area_entry;
>> -      /* Need to emit the patching area.  */
>> -      if (patch_area_size > 0)
>> +      /* Emit non-split patching area now.  */
>> +      if (!TARGET_SPLIT_PATCH_NOPS && patch_area_size > 0)
>>      {
>>        cfun->machine->global_entry_emitted = true;
>>        /* As ELFv2 ABI shows, the allowable bytes between the global
>> @@ -4027,7 +4027,6 @@ rs6000_output_function_prologue (FILE *file)
>>                     patch_area_entry);
>>            rs6000_print_patchable_function_entry (file, patch_area_entry,
>>                                                   true);
>> -          patch_area_size -= patch_area_entry;
>>          }
>>      }
>>  
>> @@ -4037,9 +4036,13 @@ rs6000_output_function_prologue (FILE *file)
>>        assemble_name (file, name);
>>        fputs ("\n", file);
>>        /* Emit the nops after local entry.  */
>> -      if (patch_area_size > 0)
>> -    rs6000_print_patchable_function_entry (file, patch_area_size,
>> -                                           patch_area_entry == 0);
>> +      if (patch_area_size > patch_area_entry)
>> +    {
>> +      patch_area_size -= patch_area_entry;
>> +      cfun->machine->stop_patch_area_print = false;
>> +      rs6000_print_patchable_function_entry (file, patch_area_size,
>> +                                             patch_area_entry == 0);
>> +    }
>>      }
>>  
>>    else if (rs6000_pcrel_p ())
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 950fd947fda..6427e6913ba 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -15226,11 +15226,25 @@ rs6000_print_patchable_function_entry (FILE *file,
>>  {
>>    bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p 
>> ();
>>    /* For a function which needs global entry point, we will emit the
>> -     patchable area before and after local entry point under the control of
>> -     cfun->machine->global_entry_emitted, see the handling in function
>> -     rs6000_output_function_prologue.  */
>> -  if (!global_entry_needed_p || cfun->machine->global_entry_emitted)
>> +     patchable area when it isn't split before and after local entry point
>> +     under the control of cfun->machine->global_entry_emitted, see the
>> +     handling in function rs6000_output_function_prologue.  */
>> +  if (!TARGET_SPLIT_PATCH_NOPS
>> +      && (!global_entry_needed_p || cfun->machine->global_entry_emitted))
>>      default_print_patchable_function_entry (file, patch_area_size, 
>> record_p);
>> +
>> +  /* For split patch nops we emit the before nops (from generic code)
>> +     in front of the global entry point and after the local entry point,
>> +     under the control of cfun->machine->stop_patch_area_print, see
>> +     rs6000_output_function_prologue and rs6000_elf_declare_function_name.  
>> */
>> +  if (TARGET_SPLIT_PATCH_NOPS)
>> +    {
>> +      if (!cfun->machine->stop_patch_area_print)
>> +    default_print_patchable_function_entry (file, patch_area_size,
>> +                                            record_p);
>> +      else
>> +    gcc_assert (global_entry_needed_p);
>> +    }
>>  }
>>  
>>  enum rtx_code
>> @@ -21423,6 +21437,11 @@ rs6000_elf_declare_function_name (FILE *file, const 
>> char *name, tree decl)
>>        fprintf (file, "\t.previous\n");
>>      }
>>    ASM_OUTPUT_FUNCTION_LABEL (file, name, decl);
>> +  /* At this time, the "before" NOPs have been already emitted.
>> +     For split nops stop generic code from printing the "after" NOPs and
>> +     emit them just after local entry ourself later.  */
>> +  if (rs6000_global_entry_point_prologue_needed_p ())
>> +    cfun->machine->stop_patch_area_print = true;
>>  }
>>  
>>  static void rs6000_elf_file_end (void) ATTRIBUTE_UNUSED;
>> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
>> index d460eb06544..2bd56f8d377 100644
>> --- a/gcc/config/rs6000/rs6000.h
>> +++ b/gcc/config/rs6000/rs6000.h
>> @@ -2424,6 +2424,12 @@ typedef struct GTY(()) machine_function
>>       global entry.  It helps to control the patchable area before and after
>>       local entry.  */
>>    bool global_entry_emitted;
>> +  /* With ELFv2 ABI dual entry points being adopted, generic framework
>> +     targetm.asm_out.print_patchable_function_entry would generate "after"
>> +     NOPs before local entry, which is wrong.  This flag is to stop it from
>> +     printing patch area before local entry, it is only useful when the
>> +     function requires dual entry points.  */
>> +  bool stop_patch_area_print;
>>  } machine_function;
>>  #endif
>>  
>> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
>> index 94323bd1db2..27a165ef0ee 100644
>> --- a/gcc/config/rs6000/rs6000.opt
>> +++ b/gcc/config/rs6000/rs6000.opt
>> @@ -300,6 +300,10 @@ mfull-toc
>>  Target
>>  Put everything in the regular TOC.
>>  
>> +msplit-patch-nops
>> +Target Var(TARGET_SPLIT_PATCH_NOPS) Init(0)
>> +Emit NOPs before global and after local entry point for 
>> -fpatchable-function-entry.
>> +
>>  mvrsave
>>  Target Var(TARGET_ALTIVEC_VRSAVE) Save
>>  Generate VRSAVE instructions when generating AltiVec code.
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 93e9af3791c..f3d37a48fd0 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -1318,6 +1318,7 @@ See RS/6000 and PowerPC Options.
>>  -mtraceback=@var{traceback_type}
>>  -maix-struct-return  -msvr4-struct-return
>>  -mabi=@var{abi-type}  -msecure-plt  -mbss-plt
>> +-msplit-patch-nops
>>  -mlongcall  -mno-longcall  -mpltseq  -mno-pltseq
>>  -mblock-move-inline-limit=@var{num}
>>  -mblock-compare-inline-limit=@var{num}
>> @@ -18569,11 +18570,12 @@ If @code{N=0}, no pad location is recorded.
>>  The NOP instructions are inserted at---and maybe before, depending on
>>  @var{M}---the function entry address, even before the prologue.  On
>>  PowerPC with the ELFv2 ABI, for a function with dual entry points,
>> -the local entry point is this function entry address.
>> +the local entry point is this function entry address by default.  See
>> +the @option{-msplit-patch-nops} option to change this.
>>  
>>  The maximum value of @var{N} and @var{M} is 65535.  On PowerPC with the
>>  ELFv2 ABI, for a function with dual entry points, the supported values
>> -for @var{M} are 0, 2, 6 and 14.
>> +for @var{M} are 0, 2, 6 and 14 when not using @option{-msplit-patch-nops}.
>>  @end table
>>  
>>  
>> @@ -31658,6 +31660,17 @@ requires @code{.plt} and @code{.got}
>>  sections that are both writable and executable.
>>  This is a PowerPC 32-bit SYSV ABI option.
>>  
>> +@opindex msplit-patch-nops
>> +@item -msplit-patch-nops
>> +When adding NOPs for a patchable area via the
>> +@option{-fpatchable-function-entry} option emit the "before" NOPs in front
>> +of the global entry point and the "after" NOPs after the local entry point.
>> +This makes the sequence of NOPs not consecutive when a global entry point
>> +is generated.  Without this option the NOPs are emitted directly before and
>> +after the local entry point, making them consecutive but moving global and
>> +local entry point further apart.  If only a single entry point is generated
>> +this option has no effect.
>> +
>>  @opindex misel
>>  @opindex mno-isel
>>  @item -misel
>> -- 
>> 2.35.3

Reply via email to