On 3/2/26 6:23 PM, Vineet Gupta wrote:
Function arguments narrower than 32-bits are currently zero/sign extended in
the callee while clang has been doing this in caller for a long time.
Per discussion with kernel community, this needs to be fixed in gcc [1].

Also ensure that function return value promotion happens in callee as well [2],
although gcc currently seems to be doing that by chance (add tests to that
effect).

[1]https://gcc.gnu.org/pipermail/bpf/2026-February/000016.html
[2]https://gcc.gnu.org/pipermail/bpf/2026-February/000036.html

Implementation details:

  #1: target hook TARGET_PROMOTE_FUNCTION_MODE set to helper
      default_promote_function_mode_always_promote () which calls into
      target macro PROMOTE_MODE.

  #2. PROMOTE_MODE updated to not extend anything smaller than DI to DI,
      rather promote anything smaller than SI to SI (meaning retain
      SI where possible, likely to generate better code). This is needed
      for ABI change but has merit of it's own, hence updated and
      targethook made to call this.

Do note that this is effectively implementing the equivalent of newly
introduced helper in commit fecf542e1db9b
"(Add default_promote_function_mode_sign_extend)".
But the semantics of that default are not clear for 64-bit targets and
besides with the PROMOTE_MODE change as needed/done, the two
implementations of targethook would have been effectively the same.

-----------------------------------------------------------------------
Now the rub:
This is an RFC as it introduces one new issue (atleast) which I can't fix.
The issue is exhibited by newly introduced abi-ret-caller.c:foo_bool_ne1 ()
where caller is missing a clamping the return value to 0xff, which surprisingly
is emitted for the collorary test foo_bool_ne0 () which checks for 0 vs.1.
This is also the reason for additional kernel selftest failues as I spot
a missing 0xff in some of the verifier failure logs.

We start off with following nobrainer tree-optimized output:

    int foo_bool_ne1 ()
    {
      _Bool _1;
      int _5;

      <bb 2> [local count: 1073741824]:
      _1 = bar_bool ();
      _5 = (int) _1;
      return _5;
    }

At RTL expand time, we get different control flow for same test w/ w/o
the patch.

expand_gimple_stmt
  expand_call_stmt
   expand_assignment
    expand_assignment
     store_expr(exp, target)

        (gdb) call debug(exp)
        exp <call_expr 0x7fffe98104e0
        (gdb) call debug_rtx(target)
        (subreg/s/v:QI (reg:SI 19 [ _1 ]) 0)

      else if (GET_CODE (target) == SUBREG && SUBREG_PROMOTED_VAR_P (target))
       temp = expand_expr (exp, inner_target, VOIDmode,
                          call_param_p ? EXPAND_STACK_PARM : EXPAND_NORMAL);

        (gdb) call debug_rtx(target)
        (subreg/s/v:QI (reg:SI 22) 0)

expand_expr() above returns different value due to the new
targethook/PROMOTE_MODE implementation.

        expand_expr_real_1
           expand_call
             valreg = hard_function_value (rettype, fndecl, fntype, (pass == 
0));
                 targetm.calls.function_value
                   bpf_function_value
                      promote_function_mode
                         PROMOTE_MODE

                gcc (targethook):NOK              | gcc (baseline:OK)
                (gdb) call debug(valreg)          |
                (reg:SI 0 %r0)                    |(reg:QI 0 %r0)

This is the key as it changes what next store_expr() -> convert_move () does.
For the subreg it calls
       from = gen_lowpart (to_int_mode, SUBREG_REG (from));

while for the baseline it calls
       convert_mode_scalar(to, from)

which generates the &= 0xff.

The part I don't understand is, is the subreg being there a problem or
is it missing handling in some case in the maze of expr.cc
(subregs were never clear to me).

I have some more info here.

The issue is not subreg but that it is marked as promoted eagerly.
The subreg itself is created as follows expand_call
target = copy_to_reg (avoid_likely_spilled_reg (valreg)); pmode = promote_function_mode (type, ret_mode, &unsignedp, funtype, 1);
          target = gen_lowpart_SUBREG (ret_mode, target);
-          SUBREG_PROMOTED_VAR_P (target) = 1;
+          SUBREG_PROMOTED_VAR_P (target) = 0;
          SUBREG_PROMOTED_SET (target, unsignedp);

As a hack if I clear the PROMOTED note, we get the desired promotion.

And as a data point, if I run gcc testsuite for *riscv* with this change - riscv being sensitive to sign extensions and it's funny as I did an opposite change for riscv, a few years ago "(8eb9cdd14218 expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466])"

There's a grand total of 1 additional failure for entire suite (at -O0)

|
| gcc.target/riscv/zero-extend-3.c -O0 scan-assembler-times \\msext\\.w\\M 0
|

Essentially w/ patch we get following which doesn't get optimized away at -O0

(insn 8 7 9 2 (set (reg:DI 134 [ _1 ])

        (sign_extend:DI (subreg/u:SI (reg:DI 137) 0)))


vs.

(insn 8 7 9 2 (set (reg:DI 134 [ _1 ])

        (reg:DI 137))


I tried a further little hack to mark promoted conditionally, i.e. only if modes were disparate but that seems to be the common case anyways and won't help with stragglers.

RISCV: PROMOTE_MODE < DI to DI
BPF proposed: PROMOTE_MODE < SI to SI

-             SUBREG_PROMOTED_VAR_P (target) = 1;
+             int is_prom = !!known_gt (GET_MODE_SIZE (pmode), GET_MODE_SIZE (ret_mode)));
+             SUBREG_PROMOTED_VAR_P (target) = is_prom;

So the question is obviously how to mark this promoted conditionally?

Thx,
-Vineet

As for the working abi-ret-caller.c:foo_bool_ne0 () the gimple is
obviously different, and it seems the intervening XOR triggers the
clamping to 0xff.

                ret_bool_ne1           |       ret_bool_ne0
                                       |
   int foo_bool_ne1 ()                 |  int foo_bool_ne0 ()
   {                                   |  {
     _Bool _1;                         |    _Bool _1;
     int _5;                           |    _Bool _5;
                                       |    int _6;
     <bb 2> [local count: 1073741824]: |
     _1 = bar_bool ();                 |    <bb 2> [local count: 1073741824]:
     _5 = (int) _1;                    |    _1 = bar_bool ();
     return _5;                        |    _5 = ~_1;          <-----
   }                                   |    _6 = (int) _5;
                                       |    return _6;
                                       |  }

Any gueestions on how to solve this will be much appreciated.

gcc/ChangeLog:

        * config/bpf/bpf.cc (TARGET_PROMOTE_FUNCTION_MODE): Implement.
        (bpf_function_value): Initialized unsignedp.
        * config/bpf/bpf.h (PROMOTE_MODE): Modes smaller than SImode
        promoted to SImode.

gcc/testsuite/ChangeLog:

        * gcc.dg/sign-extend.c: Works for bpf as well.
        * gcc.dg/zero-extend.c: Ditto.
        * gcc.target/bpf/smov-1.c: Don't expect promotion in callee.
        * gcc.target/bpf/smov-pseudoc-1.c: Ditto.
        * gcc.target/bpf/abi-args-callee.c: New test.
        * gcc.target/bpf/abi-args-caller.c: New test.
        * gcc.target/bpf/abi-ret-caller.c: New test.

Signed-off-by: Vineet Gupta<[email protected]>
Signed-off-by: Vineet Gupta<[email protected]>
---
  gcc/config/bpf/bpf.cc                         | 13 +++--
  gcc/config/bpf/bpf.h                          | 16 +++----
  gcc/testsuite/gcc.dg/sign-extend.c            |  2 +-
  gcc/testsuite/gcc.dg/zero-extend.c            |  2 +-
  .../gcc.target/bpf/abi-args-callee.c          | 15 ++++++
  .../gcc.target/bpf/abi-args-caller.c          | 18 +++++++
  gcc/testsuite/gcc.target/bpf/abi-ret-caller.c | 47 +++++++++++++++++++
  gcc/testsuite/gcc.target/bpf/smov-1.c         |  4 +-
  gcc/testsuite/gcc.target/bpf/smov-pseudoc-1.c |  4 +-
  9 files changed, 102 insertions(+), 19 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/bpf/abi-args-callee.c
  create mode 100644 gcc/testsuite/gcc.target/bpf/abi-args-caller.c
  create mode 100644 gcc/testsuite/gcc.target/bpf/abi-ret-caller.c

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index d91013ad140b..6aaccf45fe8f 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -292,6 +292,13 @@ bpf_file_end (void)
  #undef TARGET_ASM_FILE_END
  #define TARGET_ASM_FILE_END bpf_file_end
+/* Caller-side promotion of 8/16/32-bit args and result.
+   Semantics as implemented in PROMOTE_MODE macro.  */
+
+#undef TARGET_PROMOTE_FUNCTION_MODE
+#define TARGET_PROMOTE_FUNCTION_MODE \
+  default_promote_function_mode_always_promote
+
  /* Return an RTX representing the place where a function returns or
     receives a value of data type RET_TYPE, a tree node representing a
     data type.  */
@@ -301,10 +308,10 @@ bpf_function_value (const_tree ret_type,
                    const_tree fntype_or_decl,
                    bool outgoing ATTRIBUTE_UNUSED)
  {
-  enum machine_mode mode;
-  int unsignedp;
+  enum machine_mode mode = TYPE_MODE (ret_type);
+  int unsignedp = TYPE_UNSIGNED (ret_type);
- mode = TYPE_MODE (ret_type);
+  /* TBD: change for_return as  outgoing ? 2 : 1.  */
    if (INTEGRAL_TYPE_P (ret_type))
      mode = promote_function_mode (ret_type, mode, &unsignedp,
                                  fntype_or_decl, 1);
diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index c8dad55fd4c4..3e8d7dcc5772 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -43,14 +43,14 @@
  #define BITS_PER_WORD 64
  #define UNITS_PER_WORD 8
-/* When storing an integer whose size is less than 64-bit in a
-   register, promote it to a DImode.  */
-#define PROMOTE_MODE(M, UNSIGNEDP, TYPE)       \
-  do                                           \
-    {                                          \
-      if (GET_MODE_CLASS (M) == MODE_INT       \
-         && GET_MODE_SIZE (M) < 8)          \
-       M = DImode;                             \
+/* Promote integer modes smaller than a word to SImode.  */
+#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)            \
+  do {                                                 \
+    if (GET_MODE_CLASS (MODE) == MODE_INT              \
+       && GET_MODE_SIZE (MODE) < UNITS_PER_WORD)    \
+      {                                                        \
+       (MODE) = SImode;                                \
+      }                                                        \
      } while (0)
/* Align argument parameters on the stack to 64-bit, at a minimum. */
diff --git a/gcc/testsuite/gcc.dg/sign-extend.c 
b/gcc/testsuite/gcc.dg/sign-extend.c
index 3ded5050d17e..b045eac3a639 100644
--- a/gcc/testsuite/gcc.dg/sign-extend.c
+++ b/gcc/testsuite/gcc.dg/sign-extend.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target mcore-*-* xtensa-*-* } } */
+/* { dg-do compile { target mcore-*-* xtensa-*-* bpf-*-* } } */
  /* { dg-options "-O1 -fdump-rtl-expand" } */
void s8(signed char c);
diff --git a/gcc/testsuite/gcc.dg/zero-extend.c 
b/gcc/testsuite/gcc.dg/zero-extend.c
index 22671e568971..b1b27b93f428 100644
--- a/gcc/testsuite/gcc.dg/zero-extend.c
+++ b/gcc/testsuite/gcc.dg/zero-extend.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target mcore-*-* xtensa-*-* } } */
+/* { dg-do compile { target mcore-*-* xtensa-*-* bpf-*-* } } */
/* { dg-options "-O1 -fdump-rtl-expand" } */ void u8(unsigned char c); diff --git a/gcc/testsuite/gcc.target/bpf/abi-args-callee.c b/gcc/testsuite/gcc.target/bpf/abi-args-callee.c new file mode 100644 index 000000000000..c9308bea23f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/bpf/abi-args-callee.c @@ -0,0 +1,15 @@ +/* Verify ABI: narrow args are not promoted in callee. */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=v4" } */ + +int args_consume (signed char a, short b, int c) +{ + int x = a; + int y = b; + int z = c; + + return x + y + z; +} + +/* { dg-final { scan-assembler-not {w. = \(s8\) w.\n} } } */ +/* { dg-final { scan-assembler-not {w. = \(s16\) w.\n} } } */ diff --git a/gcc/testsuite/gcc.target/bpf/abi-args-caller.c b/gcc/testsuite/gcc.target/bpf/abi-args-caller.c new file mode 100644 index 000000000000..18f1a3f9e840 --- /dev/null +++ b/gcc/testsuite/gcc.target/bpf/abi-args-caller.c @@ -0,0 +1,18 @@ +/* Verify ABI: narrow args are promoted in caller. */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=v4" } */
+
+typedef struct {
+    int a;
+    short b;
+    signed char c;
+} my_t;
+
+void foo(signed char, short, int);
+char args_setup(my_t *s)
+{
+    foo(s->c, s->b, s->a);
+}
+
+/* { dg-final { scan-assembler-times {w. = \(s8\) w.\n} 1 } } */
+/* { dg-final { scan-assembler-times {w. = \(s16\) w.\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/abi-ret-caller.c 
b/gcc/testsuite/gcc.target/bpf/abi-ret-caller.c
new file mode 100644
index 000000000000..0769f33ce74d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/abi-ret-caller.c
@@ -0,0 +1,47 @@
+/* Verify ABI: narrow return values are promoted in caller.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=v4" } */
+
+_Bool bar_bool(void);
+signed char bar_char(void);
+unsigned char bar_char_u(void);
+short bar_short(void);
+int bar_int(void);
+
+int foo_bool_ne1(void) {
+      if (bar_bool() != 1) return 0; else return 1;
+}
+int foo_bool_ne0(void) {
+      if (bar_bool() != 0) return 0; else return 1;
+}
+int foo_bool2(void) {
+      if (bar_bool() != 1) return 7; else return 9;
+}
+int foo_char(void) {
+      if (bar_char() != 1) return 0; else return 1;
+}
+int foo_char2(void) {
+      if (bar_char() != 20) return 0; else return 1;
+}
+int foo_char_u(void) {
+      if (bar_char_u() != 1) return 0; else return 1;
+}
+int foo_char_u2(void) {
+      if (bar_char_u() != 20) return 0; else return 1;
+}
+int foo_short(void) {
+      if (bar_short() != 1) return 0; else return 1;
+}
+int foo_short2(void) {
+      if (bar_short() != 30) return 0; else return 1;
+}
+int foo_int(void) {
+      if (bar_int() != 1) return 0; else return 1;
+}
+int foo_int2(void) {
+      if (bar_int() != 10) return 0; else return 1;
+}
+/* Bool and char tests will clamp the return value to 0xff .  */
+/* { dg-final { scan-assembler-times {r. &= 0xff\n} 7 } } */
+/* { dg-final { scan-assembler-times {r. &= 0xffff\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/smov-1.c 
b/gcc/testsuite/gcc.target/bpf/smov-1.c
index 49109a80e3e8..dfeea9ea2e2c 100644
--- a/gcc/testsuite/gcc.target/bpf/smov-1.c
+++ b/gcc/testsuite/gcc.target/bpf/smov-1.c
@@ -13,6 +13,4 @@ foo (char a, short b, int c, unsigned long d)
    return x + y + z + w;
  }
-/* { dg-final { scan-assembler {movs\t%r.,%r.,8\n} } } */
-/* { dg-final { scan-assembler {movs\t%r.,%r.,16\n} } } */
-/* { dg-final { scan-assembler {movs\t%r.,%r.,32\n} } } */
+/* { dg-final { scan-assembler-times {movs\t%r.,%r.,32\n} 3 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/smov-pseudoc-1.c 
b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-1.c
index b15334ee4e36..fb241b3eba70 100644
--- a/gcc/testsuite/gcc.target/bpf/smov-pseudoc-1.c
+++ b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-1.c
@@ -13,6 +13,4 @@ foo (char a, short b, int c, unsigned long d)
    return x + y + z + w;
  }
-/* { dg-final { scan-assembler {r. = \(s8\) r.\n} } } */
-/* { dg-final { scan-assembler {r. = \(s16\) r.\n} } } */
-/* { dg-final { scan-assembler {r. = \(s32\) r.\n} } } */
+/* { dg-final { scan-assembler-times {r. = \(s32\) r.\n} 3 } } */

Reply via email to