Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html

BR,
Kewen

>> on 2024/7/12 00:15, Martin Jambor wrote:
>>> Hi,
>>>
>>> can I add myself to the bunch of people who are pinging this? 
>>> Having
>>> this in will make our life easier.
>>>
>>> Thanks a lot,
>>>
>>> Martin
>>>
>>>
>>> On Wed, May 08 2024, Kewen.Lin wrote:
>>>> Hi,
>>>>
>>>> As the discussion in PR112980, although the current
>>>> implementation for -fpatchable-function-entry* conforms
>>>> with the documentation (making N NOPs be consecutive),
>>>> it's inefficient for both kernel and userspace livepatching
>>>> (see comments in PR for the details).
>>>>
>>>> So this patch is to change the current implementation by
>>>> emitting the "before" NOPs before global entry point and
>>>> the "after" NOPs after local entry point.  The new behavior
>>>> would not keep NOPs to be consecutive, so the documentation
>>>> is updated to emphasize this.
>>>>
>>>> Bootstrapped and regress-tested on powerpc64-linux-gnu
>>>> P8/P9 and powerpc64le-linux-gnu P9 and P10.
>>>>
>>>> Is it ok for trunk?  And backporting to active branches
>>>> after burn-in time?  I guess we should also mention this
>>>> change in changes.html?
>>>>
>>>> BR,
>>>> Kewen
>>>> -----
>>>>    PR target/112980
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>    * config/rs6000/rs6000-logue.cc
>>>> (rs6000_output_function_prologue):
>>>>    Adjust the handling on patch area emitting with dual
>>>> entry, remove
>>>>    the restriction on "before" NOPs count, not emit
>>>> "before" NOPs any
>>>>    more but only emit "after" NOPs.
>>>>    * config/rs6000/rs6000.cc
>>>> (rs6000_print_patchable_function_entry):
>>>>    Adjust by respecting cfun->machine-
>>>>> stop_patch_area_print.
>>>>    (rs6000_elf_declare_function_name): For ELFv2 with dual
>>>> entry, set
>>>>    cfun->machine->stop_patch_area_print as true.
>>>>    * config/rs6000/rs6000.h (struct machine_function):
>>>> Remove member
>>>>    global_entry_emitted, add new member
>>>> stop_patch_area_print.
>>>>    * doc/invoke.texi (option -fpatchable-function-entry):
>>>> Adjust the
>>>>    documentation for PowerPC ELFv2 dual entry.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>    * c-c++-common/patchable_function_entry-default.c:
>>>> Adjust.
>>>>    * gcc.target/powerpc/pr99888-4.c: Likewise.
>>>>    * gcc.target/powerpc/pr99888-5.c: Likewise.
>>>>    * gcc.target/powerpc/pr99888-6.c: Likewise.
>>>> ---
>>>>  gcc/config/rs6000/rs6000-logue.cc             | 40 +++++--------
>>>> ------
>>>>  gcc/config/rs6000/rs6000.cc                   | 15 +++++--
>>>>  gcc/config/rs6000/rs6000.h                    | 10 +++--
>>>>  gcc/doc/invoke.texi                           |  8 ++--
>>>>  .../patchable_function_entry-default.c        |  3 --
>>>>  gcc/testsuite/gcc.target/powerpc/pr99888-4.c  |  4 +-
>>>>  gcc/testsuite/gcc.target/powerpc/pr99888-5.c  |  4 +-
>>>>  gcc/testsuite/gcc.target/powerpc/pr99888-6.c  |  4 +-
>>>>  8 files changed, 33 insertions(+), 55 deletions(-)
>>>>
>>>> diff --git a/gcc/config/rs6000/rs6000-logue.cc
>>>> b/gcc/config/rs6000/rs6000-logue.cc
>>>> index 60ba15a8bc3..0eb019b44b3 100644
>>>> --- a/gcc/config/rs6000/rs6000-logue.cc
>>>> +++ b/gcc/config/rs6000/rs6000-logue.cc
>>>> @@ -4006,43 +4006,21 @@ rs6000_output_function_prologue (FILE
>>>> *file)
>>>>      fprintf (file, "\tadd 2,2,12\n");
>>>>    }
>>>>
>>>> -      unsigned short patch_area_size = crtl->patch_area_size;
>>>> -      unsigned short patch_area_entry = crtl->patch_area_entry;
>>>> -      /* Need to emit the patching area.  */
>>>> -      if (patch_area_size > 0)
>>>> -  {
>>>> -    cfun->machine->global_entry_emitted = true;
>>>> -    /* As ELFv2 ABI shows, the allowable bytes between the
>>>> global
>>>> -       and local entry points are 0, 4, 8, 16, 32 and 64
>>>> when
>>>> -       there is a local entry point.  Considering there
>>>> are two
>>>> -       non-prefixed instructions for global entry point
>>>> prologue
>>>> -       (8 bytes), the count for patchable nops before
>>>> local entry
>>>> -       point would be 2, 6 and 14.  It's possible to
>>>> support those
>>>> -       other counts of nops by not making a local entry
>>>> point, but
>>>> -       we don't have clear use cases for them, so leave
>>>> them
>>>> -       unsupported for now.  */
>>>> -    if (patch_area_entry > 0)
>>>> -      {
>>>> -        if (patch_area_entry != 2
>>>> -            && patch_area_entry != 6
>>>> -            && patch_area_entry != 14)
>>>> -          error ("unsupported number of nops before
>>>> function entry (%u)",
>>>> -                 patch_area_entry);
>>>> -        rs6000_print_patchable_function_entry (file,
>>>> patch_area_entry,
>>>> -                                               true);
>>>> -        patch_area_size -= patch_area_entry;
>>>> -      }
>>>> -  }
>>>> -
>>>>        fputs ("\t.localentry\t", file);
>>>>        assemble_name (file, name);
>>>>        fputs (",.-", file);
>>>>        assemble_name (file, name);
>>>>        fputs ("\n", file);
>>>>        /* Emit the nops after local entry.  */
>>>> -      if (patch_area_size > 0)
>>>> -  rs6000_print_patchable_function_entry (file,
>>>> patch_area_size,
>>>> -                                         patch_area_entry
>>>> == 0);
>>>> +      unsigned short patch_area_size = crtl->patch_area_size;
>>>> +      unsigned short patch_area_entry = crtl->patch_area_entry;
>>>> +      if (patch_area_size > patch_area_entry)
>>>> +  {
>>>> +    cfun->machine->stop_patch_area_print = false;
>>>> +    patch_area_size -= patch_area_entry;
>>>> +    rs6000_print_patchable_function_entry (file,
>>>> patch_area_size,
>>>> +                                          
>>>> patch_area_entry == 0);
>>>> +  }
>>>>      }
>>>>
>>>>    else if (rs6000_pcrel_p ())
>>>> diff --git a/gcc/config/rs6000/rs6000.cc
>>>> b/gcc/config/rs6000/rs6000.cc
>>>> index 6ba9df4f02e..7e70741d59f 100644
>>>> --- a/gcc/config/rs6000/rs6000.cc
>>>> +++ b/gcc/config/rs6000/rs6000.cc
>>>> @@ -15190,12 +15190,14 @@ rs6000_print_patchable_function_entry
>>>> (FILE *file,
>>>>                                   bool record_p)
>>>>  {
>>>>    bool global_entry_needed_p =
>>>> rs6000_global_entry_point_prologue_needed_p ();
>>>> -  /* For a function which needs global entry point, we will emit
>>>> the
>>>> -     patchable area before and after local entry point under the
>>>> control of
>>>> -     cfun->machine->global_entry_emitted, see the handling in
>>>> function
>>>> +  /* For a function which needs global entry point, we will only
>>>> emit the
>>>> +     patchable area after local entry point under the control of
>>>> +     !cfun->machine->stop_patch_area_print, see the handling in
>>>> functions
>>>>       rs6000_output_function_prologue.  */
>>>> -  if (!global_entry_needed_p || cfun->machine-
>>>>> global_entry_emitted)
>>>> +  if (!cfun->machine->stop_patch_area_print)
>>>>      default_print_patchable_function_entry (file,
>>>> patch_area_size, record_p);
>>>> +  else
>>>> +    gcc_assert (global_entry_needed_p);
>>>>  }
>>>>
>>>>
>>>>  enum rtx_code
>>>> @@ -21381,6 +21383,11 @@ rs6000_elf_declare_function_name (FILE
>>>> *file, const char *name, tree decl)
>>>>        fprintf (file, "\t.previous\n");
>>>>      }
>>>>    ASM_OUTPUT_FUNCTION_LABEL (file, name, decl);
>>>> +  /* At this time, the "before" NOPs have been already emitted,
>>>> +     let's stop generic code from printing the "after" NOPs and
>>>> +     emit just after local entry later.  */
>>>> +  if (rs6000_global_entry_point_prologue_needed_p ())
>>>> +    cfun->machine->stop_patch_area_print = true;
>>>>  }
>>>>
>>>>  static void rs6000_elf_file_end (void) ATTRIBUTE_UNUSED;
>>>> diff --git a/gcc/config/rs6000/rs6000.h
>>>> b/gcc/config/rs6000/rs6000.h
>>>> index 68bc45d65ba..73767fdae06 100644
>>>> --- a/gcc/config/rs6000/rs6000.h
>>>> +++ b/gcc/config/rs6000/rs6000.h
>>>> @@ -2434,10 +2434,12 @@ typedef struct GTY(()) machine_function
>>>>    bool lr_is_wrapped_separately;
>>>>    bool toc_is_wrapped_separately;
>>>>    bool mma_return_type_error;
>>>> -  /* Indicate global entry is emitted, only useful when the
>>>> function requires
>>>> -     global entry.  It helps to control the patchable area
>>>> before and after
>>>> -     local entry.  */
>>>> -  bool global_entry_emitted;
>>>> +  /* With ELFv2 ABI dual entry points being adopted, generic
>>>> framework
>>>> +     targetm.asm_out.print_patchable_function_entry would
>>>> generate "after"
>>>> +     NOPs before local entry, it is wrong.  This flag is to stop
>>>> it from
>>>> +     printing patch area before local entry, it is only useful
>>>> when the
>>>> +     function requires dual entry points.  */
>>>> +  bool stop_patch_area_print;
>>>>  } machine_function;
>>>>  #endif
>>>>
>>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>>> index c584664e168..58e48f7dc55 100644
>>>> --- a/gcc/doc/invoke.texi
>>>> +++ b/gcc/doc/invoke.texi
>>>> @@ -18363,11 +18363,11 @@ If @code{N=0}, no pad location is
>>>> recorded.
>>>>  The NOP instructions are inserted at---and maybe before,
>>>> depending on
>>>>  @var{M}---the function entry address, even before the prologue. 
>>>> On
>>>>  PowerPC with the ELFv2 ABI, for a function with dual entry
>>>> points,
>>>> -the local entry point is this function entry address.
>>>> +@var{M} NOP instructions are inserted before the global entry
>>>> point and
>>>> +@var{N} - @var{M} NOP instructions are inserted after the local
>>>> entry
>>>> +point, which means the NOP instructions may not be consecutive.
>>>>
>>>> -The maximum value of @var{N} and @var{M} is 65535.  On PowerPC
>>>> with the
>>>> -ELFv2 ABI, for a function with dual entry points, the supported
>>>> values
>>>> -for @var{M} are 0, 2, 6 and 14.
>>>> +The maximum value of @var{N} and @var{M} is 65535.
>>>>  @end table
>>>>
>>>>
>>>> diff --git a/gcc/testsuite/c-c++-common/patchable_function_entry-
>>>> default.c b/gcc/testsuite/c-c++-common/patchable_function_entry-
>>>> default.c
>>>> index 3ccbafc87db..899938b4aa3 100644
>>>> --- a/gcc/testsuite/c-c++-common/patchable_function_entry-
>>>> default.c
>>>> +++ b/gcc/testsuite/c-c++-common/patchable_function_entry-
>>>> default.c
>>>> @@ -1,9 +1,6 @@
>>>>  /* { dg-do compile { target { ! { nvptx*-*-* visium-*-* } } } }
>>>> */
>>>>  /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
>>>>  /* { dg-additional-options "-fno-pie" { target sparc*-*-* } } */
>>>> -/* See PR99888, one single preceding nop isn't allowed on
>>>> powerpc_elfv2,
>>>> -   so overriding with two preceding nops to make it pass there. 
>>>> */
>>>> -/* { dg-additional-options "-fpatchable-function-entry=3,2" {
>>>> target powerpc_elfv2 } } */
>>>>  /* { dg-final { scan-assembler-times "nop|NOP|SWYM" 3 { target {
>>>> ! { alpha*-*-* riscv*-*-* } } } } } */
>>>>  /* { dg-final { scan-assembler-times "bis" 3 { target alpha*-*-*
>>>> } } } */
>>>>  /* { dg-final { scan-assembler-times "nop\n" 3 { target riscv*-
>>>> *-* } } } */
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>>>> b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>>>> index 00a8d4d316e..6f23f2bb939 100644
>>>> --- a/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>>>> @@ -2,12 +2,10 @@
>>>>  /* There is no global entry point prologue with pcrel.  */
>>>>  /* { dg-options "-mno-pcrel -fpatchable-function-entry=1,1" } */
>>>>
>>>> -/* Verify one error emitted for unexpected 1 nop before local
>>>> -   entry.  */
>>>> +/* Verify there is no error with 1 nop before local entry.  */
>>>>
>>>>  extern int a;
>>>>
>>>>  int test (int b) {
>>>>    return a + b;
>>>>  }
>>>> -/* { dg-error "unsupported number of nops before function entry
>>>> \\(1\\)" "" { target *-*-* } .-1 } */
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>>>> b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>>>> index 39d3b4465f1..13f192ebd20 100644
>>>> --- a/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>>>> @@ -2,12 +2,10 @@
>>>>  /* There is no global entry point prologue with pcrel.  */
>>>>  /* { dg-options "-mno-pcrel -fpatchable-function-entry=7,3" } */
>>>>
>>>> -/* Verify one error emitted for unexpected 3 nops before local
>>>> -   entry.  */
>>>> +/* Verify no error emitted for 3 nops before local entry.  */
>>>>
>>>>  extern int a;
>>>>
>>>>  int test (int b) {
>>>>    return a + b;
>>>>  }
>>>> -/* { dg-error "unsupported number of nops before function entry
>>>> \\(3\\)" "" { target *-*-* } .-1 } */
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
>>>> b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
>>>> index c6c18dcc7ac..431c69cae9a 100644
>>>> --- a/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
>>>> @@ -2,8 +2,7 @@
>>>>  /* There is no global entry point prologue with pcrel.  */
>>>>  /* { dg-options "-mno-pcrel" } */
>>>>
>>>> -/* Verify one error emitted for unexpected 4 nops before local
>>>> -   entry.  */
>>>> +/* Verify no error emitted for 4 nops before local entry.  */
>>>>
>>>>  extern int a;
>>>>
>>>> @@ -11,4 +10,3 @@ __attribute__ ((patchable_function_entry (20,
>>>> 4)))
>>>>  int test (int b) {
>>>>    return a + b;
>>>>  }
>>>> -/* { dg-error "unsupported number of nops before function entry
>>>> \\(4\\)" "" { target *-*-* } .-1 } */
>>>> --
>>>> 2.39.1
>>
> 

Reply via email to