Ping.  Or maybe I've lost some replies here because my mail server
crashed several days ago :).

On Wed, 2023-03-29 at 02:01 +0800, Xi Ruoyao wrote:
> The LoongArch backend used to save all GARs for a function with variable
> arguments.  But sometimes a function only accepts variable arguments for
> a purpose like C++ function overloading.  For example, POSIX defines
> open() as:
> 
>     int open(const char *path, int oflag, ...);
> 
> But only two forms are actually used:
> 
>     int open(const char *pathname, int flags);
>     int open(const char *pathname, int flags, mode_t mode);
> 
> So it's obviously a waste to save all 8 GARs in open().  We can use the
> cfun->va_list_gpr_size field set by the stdarg pass to save only the
> GARs that can actually be accessed.
> 
> If the va_list escapes (for example, in fprintf() we pass it to
> vfprintf()), stdarg would set cfun->va_list_gpr_size to 255 so we
> don't need a special case.
> 
> With this patch, only one GAR ($a2/$r6) is saved in open().  Ideally
> even this stack store should be omitted, but doing so is not trivial
> and AFAIK there are no compilers (for any target) performing the "ideal"
> optimization here, see https://godbolt.org/z/n1YqWq9c9.
> 
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk
> (GCC 14 or now)?
> 
> gcc/ChangeLog:
> 
>         * config/loongarch/loongarch.cc
>         (loongarch_setup_incoming_varargs): Don't save more GARs than
>         cfun->va_list_gpr_size / UNITS_PER_WORD.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.target/loongarch/va_arg.c: New test.
> ---
>  gcc/config/loongarch/loongarch.cc           |  4 +++-
>  gcc/testsuite/gcc.target/loongarch/va_arg.c | 24 +++++++++++++++++++++
>  2 files changed, 27 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/va_arg.c
> 
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index 6927bdc7fe5..0ecb91ca997 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -764,7 +764,9 @@ loongarch_setup_incoming_varargs (cumulative_args_t cum,
>      loongarch_function_arg_advance (pack_cumulative_args (&local_cum), arg);
>  
>    /* Found out how many registers we need to save.  */
> -  gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs;
> +  gp_saved = cfun->va_list_gpr_size / UNITS_PER_WORD;
> +  if (gp_saved > (int) (MAX_ARGS_IN_REGISTERS - local_cum.num_gprs))
> +    gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs;
>  
>    if (!no_rtl && gp_saved > 0)
>      {
> diff --git a/gcc/testsuite/gcc.target/loongarch/va_arg.c b/gcc/testsuite/gcc.target/loongarch/va_arg.c
> new file mode 100644
> index 00000000000..980c96d0e3d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/va_arg.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Technically we shouldn't save any register for this function: it should be
> +   compiled as if it accepts 3 named arguments.  But AFAIK no compilers can
> +   achieve this "perfect" optimization now, so just ensure we are using the
> +   knowledge provided by stdarg pass and we won't save GARs impossible to be
> +   accessed with __builtin_va_arg () when the va_list does not escape.  */
> +
> +/* { dg-final { scan-assembler-not "st.*r7" } } */
> +
> +int
> +test (int a0, ...)
> +{
> +  void *arg;
> +  int a1, a2;
> +
> +  __builtin_va_start (arg, a0);
> +  a1 = __builtin_va_arg (arg, int);
> +  a2 = __builtin_va_arg (arg, int);
> +  __builtin_va_end (arg);
> +
> +  return a0 + a1 + a2;
> +}
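
For anyone skimming the thread, the clamp in the loongarch.cc hunk above
can be modelled outside GCC roughly as in the sketch below.  This is only
an illustration, not code from the patch: gars_to_save() and
check_examples() are hypothetical names, and the two constants are
assumed values for LP64 LoongArch (8 GARs, 8-byte words).  The first
argument stands in for cfun->va_list_gpr_size, which I'm assuming is a
byte count (255 when the va_list escapes, as described above).

```c
#include <assert.h>

/* Assumed values for LP64 LoongArch; inside GCC these come from the
   backend headers, not from this file.  */
enum { MAX_ARGS_IN_REGISTERS = 8, UNITS_PER_WORD = 8 };

/* Hypothetical stand-alone model of the patched computation:
   save no more GARs than stdarg says can be read, and never more
   than the GARs left over after the named arguments.  */
int
gars_to_save (int va_list_gpr_size, int num_named_gprs)
{
  int remaining = MAX_ARGS_IN_REGISTERS - num_named_gprs;
  int needed = va_list_gpr_size / UNITS_PER_WORD;
  return needed < remaining ? needed : remaining;
}

/* Hypothetical checks mirroring the cases discussed in the mail.  */
void
check_examples (void)
{
  /* open(): 2 named args, one word read via va_arg (assumed 8 bytes).  */
  assert (gars_to_save (8, 2) == 1);
  /* test() in va_arg.c: 1 named arg, two int reads (assumed 16 bytes).  */
  assert (gars_to_save (16, 1) == 2);
  /* Escaped va_list: 255 / 8 = 31, clamped to the 6 remaining GARs.  */
  assert (gars_to_save (255, 2) == 6);
}
```

The second case is what the new test relies on: with one named argument
and two saved GARs, nothing beyond $r6 needs a store, hence the
scan-assembler-not for r7.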

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
