Hi Jakub,

On Tue, Apr 22, 2025 at 10:41:29AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> protobuf (and therefore firefox too) currently doesn't build on s390*-linux.
> The problem is that it uses [[clang::musttail]] attribute heavily, and in
> llvm (IMHO llvm bug) [[clang::musttail]] calls with 5+ arguments on
> s390*-linux are silently accepted and result in a normal non-tail call.
> In GCC we just reject those because the target hook refuses to tail call it
> (IMHO the right behavior).
> Now, the reason why that happens is as s390_function_ok_for_sibcall attempts
> to explain, the 5th argument (assuming normal <= wordsize integer or pointer
> arguments, nothing that needs 2+ registers) is passed in %r6 which is not
> call clobbered, so we can't do tail call when we'd have to change content
> of that register and then caller would assume %r6 content didn't change and
> use it again.
> In the protobuf case though, the 5th argument is always passed through
> from the caller to the musttail callee unmodified, so one can actually
> emit just jg tail_called_function or perhaps tweak some registers but
> keep %r6 untouched, and in that case I think it is just fine to tail call
> it (at least unless the stack slots used for 6+ argument can't be modified
> by the callee in the ABI and nothing checks for that).

I very much like the idea.

> 
> So, the following patch checks for this special case, where the argument
> which uses %r6 is passed in a single register and it is passed default
> definition of SSA_NAME of a PARM_DECL with the same DECL_INCOMING_RTL.

Do we really need a check for nregs==1 here?  Only, for -m31 we pass
parameters in register pairs.  With check nregs==1 we fail on -m31 for
the following example and without we pass:

extern int bar (int p1, int p2, int p3, long long p4, int p5);
int foo (int p1, int p2, int p3, long long p4, int p5)
{
  [[gnu::musttail]] return bar (p1, p2, p3, p4, p5);
}

Parameter p4 should be passed in r5,r6 and p5 via stack.

> 
> It won't really work at -O0 but should work for -O1 and above, at least when
> one doesn't really try to modify the parameter conditionally and hope it will
> be optimized away in the end.

It also fails for

extern int bar (int p1, int p2, int p3, int p4, int p5);
int foo (int p1, int p2, int p3, int p4, int p5)
{
  [[gnu::musttail]] return bar (p1, p2, p3, p4, p5);
}

since rtx_equal_p (parm, parm_rtx) does not hold for p5

(gdb) call debug_rtx(parm_rtx)
(reg:SI 6 %r6)
(gdb) call debug_rtx(parm)
(reg:DI 6 %r6 [ p5+-4 ])

due to a missmatch between extended and non-extended values.  Maybe a
check like

REGNO (parm) == REGNO (parm_rtx)
&& REG_NREGS (parm) == REG_NREGS (parm_rtx)

is sufficient?

Cheers,
Stefan

> 
> Bootstrapped/regtested on s390x-linux, ok for trunk?
> 
> I wonder if we shouldn't do this for 15.1 as well with additional
> && CALL_EXPR_MUST_TAIL_CALL (call_expr) check ideally after nregs == 1
> so that we only do that for the musttail cases where we'd otherwise
> error and not for anything else, to fix up protobuf/firefox out of the box.
> 
> 2025-04-21  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR target/119873
>       * config/s390/s390.cc (s390_call_saved_register_used): Don't return
>       true if default definition of PARM_DECL SSA_NAME of the same register
>       is passed in call saved register.
>       (s390_function_ok_for_sibcall): Adjust comment.
> 
>       * gcc.target/s390/pr119873-1.c: New test.
>       * gcc.target/s390/pr119873-2.c: New test.
> 
> --- gcc/config/s390/s390.cc.jj        2025-04-14 07:26:46.441883927 +0200
> +++ gcc/config/s390/s390.cc   2025-04-21 21:57:37.457535989 +0200
> @@ -14496,7 +14496,21 @@ s390_call_saved_register_used (tree call
>  
>         for (reg = 0; reg < nregs; reg++)
>           if (!call_used_or_fixed_reg_p (reg + REGNO (parm_rtx)))
> -           return true;
> +           {
> +             rtx parm;
> +             /* Allow passing through unmodified value from caller,
> +                see PR119873.  */
> +             if (nregs == 1
> +                 && TREE_CODE (parameter) == SSA_NAME
> +                 && SSA_NAME_IS_DEFAULT_DEF (parameter)
> +                 && SSA_NAME_VAR (parameter)
> +                 && TREE_CODE (SSA_NAME_VAR (parameter)) == PARM_DECL
> +                 && (parm = DECL_INCOMING_RTL (SSA_NAME_VAR (parameter)))
> +                 && REG_P (parm)
> +                 && rtx_equal_p (parm, parm_rtx))
> +               break;
> +             return true;
> +           }
>       }
>        else if (GET_CODE (parm_rtx) == PARALLEL)
>       {
> @@ -14543,8 +14557,9 @@ s390_function_ok_for_sibcall (tree decl,
>      return false;
>  
>    /* Register 6 on s390 is available as an argument register but 
> unfortunately
> -     "caller saved". This makes functions needing this register for arguments
> -     not suitable for sibcalls.  */
> +     "caller saved".  This makes functions needing this register for 
> arguments
> +     not suitable for sibcalls, unless the same value is passed from the
> +     caller.  */
>    return !s390_call_saved_register_used (exp);
>  }
>  
> --- gcc/testsuite/gcc.target/s390/pr119873-1.c.jj     2025-04-21 
> 22:03:59.341568852 +0200
> +++ gcc/testsuite/gcc.target/s390/pr119873-1.c        2025-04-21 
> 22:03:54.277634719 +0200
> @@ -0,0 +1,11 @@
> +/* PR target/119873 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +const char *foo (void *, void *, void *, void *, unsigned long, unsigned 
> long);
> +
> +const char *
> +bar (void *a, void *b, void *c, void *d, unsigned long e, unsigned long f)
> +{
> +  [[gnu::musttail]] return foo (a, b, c, d, e, f);
> +}
> --- gcc/testsuite/gcc.target/s390/pr119873-2.c.jj     2025-04-21 
> 22:04:06.150480291 +0200
> +++ gcc/testsuite/gcc.target/s390/pr119873-2.c        2025-04-21 
> 22:05:36.014311435 +0200
> @@ -0,0 +1,17 @@
> +/* PR target/119873 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +const char *foo (void *, void *, void *, void *, unsigned long, unsigned 
> long);
> +
> +const char *
> +bar (void *a, void *b, void *c, void *d, unsigned long e, unsigned long f)
> +{
> +  [[gnu::musttail]] return foo (a, b, c, d, e + 1, f);       /* { dg-error 
> "cannot tail-call: target is not able to optimize the call into a sibling 
> call" } */
> +}
> +
> +const char *
> +baz (void *a, void *b, void *c, void *d, unsigned long e, unsigned long f)
> +{
> +  [[gnu::musttail]] return foo (a, b, c, d, f, e);   /* { dg-error "cannot 
> tail-call: target is not able to optimize the call into a sibling call" } */
> +}
> 
>       Jakub
> 

Reply via email to