https://gcc.gnu.org/g:2ec684467cbc610348ebd7b71e01425629b82f73

commit r17-675-g2ec684467cbc610348ebd7b71e01425629b82f73
Author: Jeff Law <[email protected]>
Date:   Fri May 22 09:21:31 2026 -0600

    Re: [Linaro-TCWG-CI] gcc-17-577-gbc19036af435: 2 regressions on aarch64
    
    >>> Produces 2 regressions:
    >>>    |
    >>>    | regressions.sum:
    >>>    | Running gcc:gcc.target/aarch64/aarch64.exp ...
    >>>    | FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g1
    >>>    | FAIL: gcc.target/aarch64/tbz_1.c check-function-bodies g2
    >> I've reproduced this locally.  I haven't gotten into the debug cycle
    >> yet, but wanted folks to know it's mine AFAICT to avoid duplicating
    >> debugging efforts.
    >
    > I think the fix is just a testcase fix.
    > Currently the testcase has:
    > **      tbnz    w[0-9]+, #?0, .L([0-9]+)
    >
    > But that should just can be:
    > **      tbnz    [wx][0-9]+, #?0, .L([0-9]+)
    >
    > to match both x0 and w0 there. But are valid in this case with bit 0.
    > For both g1 and g2.
    Yea, so I wanted to verify this was behaving per expectation.  It is.  We 
have
    this in g1:
    
    > (insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
    >         (zero_extend:SI (reg:QI 0 x0 [ x ]))) "j.c":16:1 149 
{*zero_extendqisi2_aarch64}
    >      (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
    >         (nil)))
    > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
    > (insn 6 3 7 2 (set (reg:SI 102)
    >         (and:SI (reg/v:SI 101 [ x ])
    >             (const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
    >      (expr_list:REG_DEAD (reg/v:SI 101 [ x ])
    >         (nil)))
    
    We can trivially see that the zero extension is pointless because the value 
is
    masked with 0x1 immediately thereafter.  So ext-dce replaces the zero_extend
    with a subreg which then simplifies the sequence into:
    
    > (insn 2 4 3 2 (set (reg/v:SI 101 [ x ])
    >         (reg:SI 0 x0 [ x ])) "j.c":16:1 104 {*movsi_aarch64}
    >      (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
    >         (nil)))
    > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
    > (insn 6 3 7 2 (set (reg:SI 102)
    >         (and:SI (reg/v:SI 101 [ x ])
    >             (const_int 1 [0x1]))) "j.c":17:6 discrim 1 551 {andsi3}
    >      (expr_list:REG_DEAD (reg/v:SI 101 [ x ])
    >         (nil)))
    
    Combine then turns it into:
    
    > (insn 28 4 2 2 (set (reg:SI 103 [ x ])
    >         (reg:SI 0 x0 [ x ])) "j.c":16:1 -1
    >      (expr_list:REG_DEAD (reg:SI 0 x0 [ x ])
    >         (nil)))
    > (note 2 28 3 2 NOTE_INSN_DELETED)
    > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
    > (note 6 3 7 2 NOTE_INSN_DELETED)
    > (note 7 6 8 2 NOTE_INSN_DELETED)
    > (jump_insn 8 7 9 2 (parallel [
    >             (set (pc)
    >                 (if_then_else (eq (zero_extract:DI (subreg:DI (reg:SI 103 
[ x ]) 0)
    >                             (const_int 1 [0x1])
    >                             (const_int 0 [0]))
    >                         (const_int 0 [0]))
    >                     (label_ref:DI 14)
    >                     (pc)))
    >             (clobber (reg:CC 66 cc))
    >         ]) "j.c":17:6 discrim 1 76 {aarch64_tbzeqdidi}
    >      (expr_list:REG_UNUSED (reg:CC 66 cc)
    >         (expr_list:REG_DEAD (reg:SI 103 [ x ])
    >             (int_list:REG_BR_PROB 966367641 (nil))))
    
    Without the recent ext-dce improvement we get this out of combine:
    
    > (jump_insn 8 7 9 2 (parallel [
    >             (set (pc)
    >                 (if_then_else (eq (zero_extract:SI (reg:SI 0 x0 [ x ])
    >                             (const_int 1 [0x1])
    >                             (const_int 0 [0]))
    >                         (const_int 0 [0]))
    >                     (label_ref:DI 14)
    >                     (pc)))
    >             (clobber (reg:CC 66 cc))
    >         ]) "k.c":17:6 discrim 1 47 {aarch64_tbeqsisi}
    >      (expr_list:REG_UNUSED (reg:CC 66 cc)
    >         (expr_list:REG_DEAD (reg:QI 0 x0 [ x ])
    >             (int_list:REG_BR_PROB 966367644 (nil))))
    
    Note the difference in zero_extract's mode within the bit-test insn.  DI vs 
SI.
    Hence the w vs x difference in the resulting assembly code.
    
    gcc/testsuite
            * gcc.target/aarch64/tbz_1.c: Update expected output.

Diff:
---
 gcc/testsuite/gcc.target/aarch64/tbz_1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c 
b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
index 39deb58e278e..065e7f62cf8a 100644
--- a/gcc/testsuite/gcc.target/aarch64/tbz_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
@@ -8,7 +8,7 @@ void h(void);
 
 /*
 ** g1:
-**     tbnz    w[0-9]+, #?0, .L([0-9]+)
+**     tbnz    [wx][0-9]+, #?0, .L([0-9]+)
 **     ret
 **     ...
 */
@@ -20,7 +20,7 @@ void g1(bool x)
 
 /*
 ** g2:
-**     tbz     w[0-9]+, #?0, .L([0-9]+)
+**     tbz     [wx][0-9]+, #?0, .L([0-9]+)
 **     b       h
 **     ...
 */

Reply via email to