Ping. Or maybe I've lost some replies here because my mail server crashed several days ago :).
On Wed, 2023-03-29 at 02:01 +0800, Xi Ruoyao wrote: > LoongArch backend used to save all GARs for a function with variable > arguments. But sometimes a function only accepts variable arguments > for > a purpose like C++ function overloading. For example, POSIX defines > open() as: > > int open(const char *path, int oflag, ...); > > But only two forms are actually used: > > int open(const char *pathname, int flags); > int open(const char *pathname, int flags, mode_t mode); > > So it's obviously a waste to save all 8 GARs in open(). We can use > the > cfun->va_list_gpr_size field set by the stdarg pass to only save the > GARs necessary to be saved. > > If the va_list escapes (for example, in fprintf() we pass it to > vfprintf()), stdarg would set cfun->va_list_gpr_size to 255 so we > don't need a special case. > > With this patch, only one GAR ($a2/$r6) is saved in open(). Ideally > even this stack store should be omitted too, but doing so is not > trivial > and AFAIK there are no compilers (for any target) performing the > "ideal" > optimization here, see https://godbolt.org/z/n1YqWq9c9. > > Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk > (GCC 14 or now)? > > gcc/ChangeLog: > > * config/loongarch/loongarch.cc > (loongarch_setup_incoming_varargs): Don't save more GARs than > cfun->va_list_gpr_size / UNITS_PER_WORD. > > gcc/testsuite/ChangeLog: > > * gcc.target/loongarch/va_arg.c: New test. > --- > gcc/config/loongarch/loongarch.cc | 4 +++- > gcc/testsuite/gcc.target/loongarch/va_arg.c | 24 > +++++++++++++++++++++ > 2 files changed, 27 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/loongarch/va_arg.c > > diff --git a/gcc/config/loongarch/loongarch.cc > b/gcc/config/loongarch/loongarch.cc > index 6927bdc7fe5..0ecb91ca997 100644 > --- a/gcc/config/loongarch/loongarch.cc > +++ b/gcc/config/loongarch/loongarch.cc > @@ -764,7 +764,9 @@ loongarch_setup_incoming_varargs > (cumulative_args_t cum, > loongarch_function_arg_advance (pack_cumulative_args > (&local_cum), arg); > > /* Found out how many registers we need to save. */ > - gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs; > + gp_saved = cfun->va_list_gpr_size / UNITS_PER_WORD; > + if (gp_saved > (int) (MAX_ARGS_IN_REGISTERS - local_cum.num_gprs)) > + gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs; > > if (!no_rtl && gp_saved > 0) > { > diff --git a/gcc/testsuite/gcc.target/loongarch/va_arg.c > b/gcc/testsuite/gcc.target/loongarch/va_arg.c > new file mode 100644 > index 00000000000..980c96d0e3d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/loongarch/va_arg.c > @@ -0,0 +1,24 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +/* Technically we shouldn't save any register for this function: it > should be > + compiled as if it accepts 3 named arguments. But AFAIK no > compilers can > + achieve this "perfect" optimization now, so just ensure we are > using the > + knowledge provided by stdarg pass and we won't save GARs > impossible to be > + accessed with __builtin_va_arg () when the va_list does not > escape. */ > + > +/* { dg-final { scan-assembler-not "st.*r7" } } */ > + > +int > +test (int a0, ...) > +{ > + void *arg; > + int a1, a2; > + > + __builtin_va_start (arg, a0); > + a1 = __builtin_va_arg (arg, int); > + a2 = __builtin_va_arg (arg, int); > + __builtin_va_end (arg); > + > + return a0 + a1 + a2; > +} -- Xi Ruoyao <xry...@xry111.site> School of Aerospace Science and Technology, Xidian University