Hello Richard,

I don't think this topic was ever concluded.

I'd now like to re-target this to releases/gcc-15 and trunk instead of
releases/gcc-14 and trunk.


To make it more obvious what happens, I've generated
reg_equal_test.c.274r.expand with r16-8253-geb50d28a9353e9 for the different
multilib permutations that I test (I've taken the liberty of removing empty
lines).


thumb/arch=armv7ve+nofp/tune=cortex-a7/float-abi=soft/fpu=auto
;; Function x (x, funcdef_no=0, decl_uid=4706, cgraph_uid=1, symbol_order=0)
;; Generating RTL for gimple basic block 2
try_optimize_cfg iteration 1
Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.
try_optimize_cfg iteration 2
;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (reg/v:SI 94 [ d ])
        (const_int 19294 [0x4b5e])) -1
     (nil))
(insn 6 5 0 2 (set (zero_extract:SI (reg/v:SI 94 [ d ])
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (const_int 51154 [0xc7d2])) -1
     (expr_list:REG_EQUAL (const_int -942519458 [0xffffffffc7d24b5e])
        (nil)))


thumb/arch=armv7ve+neon/tune=cortex-a7/float-abi=hard/fpu=auto
;; Function x (x, funcdef_no=0, decl_uid=7770, cgraph_uid=1, symbol_order=0)
;; Generating RTL for gimple basic block 2
try_optimize_cfg iteration 1
Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.
try_optimize_cfg iteration 2
;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (reg/v:SI 94 [ d ])
        (const_int 19294 [0x4b5e])) -1
     (nil))
(insn 6 5 0 2 (set (zero_extract:SI (reg/v:SI 94 [ d ])
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (const_int 51154 [0xc7d2])) -1
     (expr_list:REG_EQUAL (const_int -942519458 [0xffffffffc7d24b5e])
        (nil)))


thumb/arch=armv7e-m+fp/tune=cortex-m4/float-abi=hard/fpu=auto
thumb/arch=armv7e-m+fp.dp/tune=cortex-m7/float-abi=hard/fpu=auto
thumb/arch=armv8-m.main+dsp+fp/tune=cortex-m33/float-abi=hard/fpu=auto
;; Function x (x, funcdef_no=0, decl_uid=7770, cgraph_uid=1, symbol_order=0)
;; Generating RTL for gimple basic block 2
try_optimize_cfg iteration 1
Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.
try_optimize_cfg iteration 2
;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 0 2 (set (reg/v:SI 94 [ d ])
        (const_int -942519458 [0xffffffffc7d24b5e])) -1
     (nil))


thumb/arch=armv6s-m/tune=cortex-m0/float-abi=soft/fpu=auto (unsupported by { ! 
{ arm_thumb2_ok || arm_thumb1_movt_ok } }, but included anyway for completeness)
thumb/arch=armv7-m/tune=cortex-m3/float-abi=soft/fpu=auto
thumb/arch=armv7e-m+nofp/tune=cortex-m4/float-abi=soft/fpu=auto
thumb/arch=armv7e-m+nofp/tune=cortex-m7/float-abi=soft/fpu=auto
thumb/arch=armv8-m.main+dsp+nofp/tune=cortex-m33/float-abi=soft/fpu=auto
thumb/arch=armv8.1-m.main+mve+nofp/tune=cortex-m55/float-abi=soft/fpu=auto
thumb/arch=armv8.1-m.main+mve+pacbti+nofp/tune=cortex-m85/float-abi=soft/fpu=auto
;; Function x (x, funcdef_no=0, decl_uid=4706, cgraph_uid=1, symbol_order=0)
;; Generating RTL for gimple basic block 2
try_optimize_cfg iteration 1
Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.
try_optimize_cfg iteration 2
;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 0 2 (set (reg/v:SI 94 [ d ])
        (const_int -942519458 [0xffffffffc7d24b5e])) -1
     (nil))


thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/fpu=auto
thumb/arch=armv8.1-m.main+mve.fp+pacbti+fp.dp/tune=cortex-m85/float-abi=hard/fpu=auto
;; Function x (x, funcdef_no=0, decl_uid=7944, cgraph_uid=1, symbol_order=0)
;; Generating RTL for gimple basic block 2
try_optimize_cfg iteration 1
Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.
try_optimize_cfg iteration 2
;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 0 2 (set (reg/v:SI 94 [ d ])
        (const_int -942519458 [0xffffffffc7d24b5e])) -1
     (nil))



So, to summarize: Cortex-A7 generates

(insn 6 5 0 2 (set (zero_extract:SI (reg/v:SI 94 [ d ])
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (const_int 51154 [0xc7d2])) -1
     (expr_list:REG_EQUAL (const_int -942519458 [0xffffffffc7d24b5e])
        (nil)))

while Cortex-M generates

(insn 5 2 0 2 (set (reg/v:SI 94 [ d ])
        (const_int -942519458 [0xffffffffc7d24b5e])) -1
     (nil))
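
As a sanity check that both forms describe the same value, here is a minimal
Python sketch. It is not GCC code, and the helper names insert_bits and
as_signed32 are made up for illustration. It assumes the zero_extract set in
the Cortex-A7 dump is a 16-bit field insert at bit 16 of the register that was
first loaded with 0x4b5e:

```python
def insert_bits(value, field, pos, width):
    """Model (set (zero_extract value width pos) field) on a 32-bit register."""
    mask = ((1 << width) - 1) << pos
    # Clear the destination field, then place the new bits into it.
    return (value & ~mask & 0xFFFFFFFF) | ((field << pos) & mask)


def as_signed32(value):
    """Interpret a 32-bit pattern as a signed integer, as the RTL dump prints it."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# insn 5 sets the register to 0x4b5e, insn 6 inserts 0xc7d2 into bits 16..31.
combined = insert_bits(0x4B5E, 0xC7D2, 16, 16)
print(hex(combined), as_signed32(combined))
```

So the two-insn sequence and the single set end up with the same 32-bit
constant; the dumps only differ in how the constant is materialized.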


Is the chunk generated for Cortex-M wrong, given that the test fails for all
Cortex-M targets but passes for Cortex-A (at least Cortex-A7)?
If the test is supposed to work for both Cortex-M and Cortex-A, then what
should be checked in the Cortex-M case?

Kind regards,
Torbjörn

On 2025-01-24 18:46, Torbjorn SVENSSON wrote:
Gentle ping 🙂

Kind regards,
Torbjörn

On 2024-12-18 11:46, Torbjorn SVENSSON wrote:


On 2024-12-12 15:50, Richard Earnshaw (lists) wrote:
On 12/12/2024 13:36, Torbjorn SVENSSON wrote:


On 2024-12-12 12:26, Richard Earnshaw (lists) wrote:
On 10/11/2024 13:38, Torbjörn SVENSSON wrote:
Hi Richard,

I'm not sure if I'm doing something wrong here, or if it was an oversight
when doing the update in r12-8108-g62082d278d1.
Anyway, the commit message suggests that it's only the constant that is of
interest, so I updated the test to only check the constant. Do you think
this is enough, or should the test case also verify that it's used in
a "set" expression?

Ok for trunk and releases/gcc-14?

--

The test case was rewritten in r12-8108-g62082d278d1, but the expected
RTL was not updated.

The diff for the generated reg_equal_test.c.*r.expand files produced by
r12-8108-g62082d278d1 and r15-5047-g7e1d9f58858 is:

--- reg_equal_test.c.253r.expand-r12-8108-g62082d278d1  2024-11-10 14:24:54.957438394 +0100
+++ reg_equal_test.c.268r.expand-r15-5047-g7e1d9f58858  2024-11-10 14:30:13.633437178 +0100
@@ -1,5 +1,5 @@

-;; Function x (x, funcdef_no=0, decl_uid=4195, cgraph_uid=1, symbol_order=0)
+;; Function x (x, funcdef_no=0, decl_uid=4590, cgraph_uid=1, symbol_order=0)

  ;; Generating RTL for gimple basic block 2
@@ -25,6 +25,6 @@
  (note 1 0 3 NOTE_INSN_DELETED)
  (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
-(insn 5 2 0 2 (set (reg/v:SI 113 [ d ])
+(insn 5 2 0 2 (set (reg/v:SI 114 [ d ])
          (const_int -942519458 [0xffffffffc7d24b5e])) -1
       (nil))


That's not what I see if I compile with "-march=armv8-a -mthumb".  I get the 
reg_equal note that I expect and the insn is something like:

(insn 6 5 0 2 (set (zero_extract:SI (reg/v:SI 114 [ d ])
             (const_int 16 [0x10])
             (const_int 16 [0x10]))
         (const_int 51154 [0xc7d2])) -1
      (expr_list:REG_EQUAL (const_int -942519458 [0xffffffffc7d24b5e])
         (nil)))

Can you tell me the exact options you were using to get your output?

Hmm.. This is interesting. With Cortex-A, I do see the same output that you 
get. With Cortex-M, it's instead my output.

You can get my output with any of the Cortex-M targets (M3 or above):

This is the line that I've used:
arm-none-eabi-gcc gcc.target/arm/reg_equal_test.c -mthumb -march=armv8.1-m.main -mfloat-abi=soft -fgimple -O1 -fdump-rtl-expand -S -o /dev/null

I suppose the change I propose will match both cases, but is there any downside
to not checking the REG_EQUAL part?
Should the test case be Cortex-A only?
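
To illustrate what is gained and lost, here is a rough Python sketch that
applies the old and the proposed pattern to the two kinds of dump seen in this
thread. It assumes Python's re module behaves like the Tcl regexps used by
scan-rtl-dump for these simple patterns, and the names STRICT, RELAXED and
matches are made up for the example:

```python
import re

# Excerpts of the two expand dumps discussed in this thread.
CORTEX_A_DUMP = """(insn 6 5 0 2 (set (zero_extract:SI (reg/v:SI 94 [ d ])
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (const_int 51154 [0xc7d2])) -1
     (expr_list:REG_EQUAL (const_int -942519458 [0xffffffffc7d24b5e])
        (nil)))"""

CORTEX_M_DUMP = """(insn 5 2 0 2 (set (reg/v:SI 94 [ d ])
        (const_int -942519458 [0xffffffffc7d24b5e])) -1
     (nil))"""

# Patterns as they look once the Tcl-level "\\(" escaping has been resolved.
STRICT = r"expr_list:REG_EQUAL \(const_int -942519458"
RELAXED = r"\(const_int -942519458"


def matches(pattern, dump):
    """Return True if the pattern occurs anywhere in the dump text."""
    return re.search(pattern, dump) is not None


for name, dump in (("Cortex-A", CORTEX_A_DUMP), ("Cortex-M", CORTEX_M_DUMP)):
    print(name, "strict:", matches(STRICT, dump), "relaxed:", matches(RELAXED, dump))
```

The relaxed pattern accepts both dumps, but it can no longer tell whether the
REG_EQUAL note was attached, which is exactly the part I'm unsure about.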


I don't think so.  We'd expect the code to be using MOVW/MOVT here and that's 
what the require rules seem to be saying.  That constant can't really be 
handled by a single mov, so it looks like for your case the compiler is 
expecting this value to be spilled to a constant pool later on.  It might be 
legitimate with some costing models, but it seems a bit unlikely, especially 
when not -Os.

So, to conclude: there are two different outcomes from this.

1. A MOVW and MOVT is generated (at least for armv8-a, maybe other Cortex-A 
targets too?)
2. A LDR with a literal pool (at least for Cortex-M)

How can these 2 cases be combined into one test case that will actually check 
that the right thing is generated?

For the size check, I'd opt to just remove it.

Kind regards,
Torbjörn


R.

Kind regards,
Torbjörn



R.

In both versions, the constant is simply assigned, thus I updated the
expected RTL accordingly.

gcc/testsuite/ChangeLog:

    * gcc.target/arm/reg_equal_test.c: Update expected RTL.

Signed-off-by: Torbjörn SVENSSON <[email protected]>
---
  gcc/testsuite/gcc.target/arm/reg_equal_test.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/reg_equal_test.c b/gcc/testsuite/gcc.target/arm/reg_equal_test.c
index d87c75cc27c..4337e3f0af5 100644
--- a/gcc/testsuite/gcc.target/arm/reg_equal_test.c
+++ b/gcc/testsuite/gcc.target/arm/reg_equal_test.c
@@ -12,4 +12,4 @@ x ()
    return;
  }
-/* { dg-final { scan-rtl-dump "expr_list:REG_EQUAL \\(const_int -942519458" "expand" } } */
+/* { dg-final { scan-rtl-dump "\\(const_int -942519458" "expand" } } */





