[Bug tree-optimization/93080] insert of an extraction on the same location is not optimized

rguenth at gcc dot gnu.org via Gcc-bugs Tue, 31 Mar 2026 00:31:15 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93080


--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Torbjorn SVENSSON from comment #10)
> With the change introduced in comment 8, I now see the following failures
> for arm-none-eabi:
> 
> FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized "BIT_FIELD_REF" 0
> FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized "BIT_INSERT_EXPR" 0
> FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized "BIT_FIELD_REF" 0
> FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized "BIT_INSERT_EXPR" 1
> 
> The tests only fail for Cortex-M55/M85 with -mfloat-abi=hard, not when
> using -mfloat-abi=soft.
> 
> 
> 
> The content of forwprop-40.c.273t.optimized for Cortex-M55
> (thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/
> fpu=auto) with r16-8253-geb50d28a9353e9 is:
> ;; Function g (g, funcdef_no=0, decl_uid=7945, cgraph_uid=1, symbol_order=0)
> 
> vector(4) int g (vector(4) int a)
> {
>   int b;
> 
>   <bb 2> [local count: 1073741824]:
>   b_2 = BIT_FIELD_REF <a_1(D), 32, 0>;
>   a_3 = BIT_INSERT_EXPR <a_1(D), b_2, 0 (32 bits)>;
>   return a_3;
> }
> 
> 
> The content of forwprop-41.c.273t.optimized for Cortex-M55
> (thumb/arch=armv8.1-m.main+mve.fp+fp.dp/tune=cortex-m55/float-abi=hard/
> fpu=auto) with r16-8253-geb50d28a9353e9 is:
> ;; Function g (g, funcdef_no=0, decl_uid=7946, cgraph_uid=1, symbol_order=0)
> 
> vector(4) int g (vector(4) int a, int c)
> {
>   int b;
> 
>   <bb 2> [local count: 1073741824]:
>   b_2 = BIT_FIELD_REF <a_1(D), 32, 64>;
>   a_3 = BIT_INSERT_EXPR <a_1(D), b_2, 64 (32 bits)>;
>   a_5 = BIT_INSERT_EXPR <a_3, c_4(D), 32 (32 bits)>;
>   return a_5;
> }
> 
> 
> Should the tests be XFAILed, just as they are for s390, or is this a bug in
> GCC?
> 
> The same tests pass on x86_64-pc-linux-gnu.

This means that your target cannot perform the required constant vector
permute. Whether that is a true incapability or just a missing pattern I
cannot say.

I suggest XFAILing the tests on the relevant targets and trying to fix the
backend in stage1 if possible (on x86, even two- or three-instruction
sequences are generated for constant permutes; the middle-end never tries to
decompose those into target-supported pieces).
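
For readers without the testsuite at hand, the source pattern behind the
quoted dumps looks roughly like the sketch below. It is reconstructed from
the GIMPLE above and is not the actual content of forwprop-40.c or
forwprop-41.c; the names v4si, g40, g41 and check_semantics are assumptions
made for illustration only.

```c
#include <assert.h>

/* GCC vector extension: four 32-bit ints in one 128-bit vector.
   The typedef name is hypothetical, not taken from the testsuite. */
typedef int v4si __attribute__((vector_size(16)));

/* Shape of the forwprop-40 dump: extract lane 0 (BIT_FIELD_REF <a, 32, 0>)
   and insert it back at the same position (BIT_INSERT_EXPR at bit 0).
   Semantically this is the identity on a. */
v4si g40(v4si a)
{
  int b = a[0];
  a[0] = b;
  return a;
}

/* Shape of the forwprop-41 dump: extract lane 2 (bit offset 64), re-insert
   it at the same position, then insert c into lane 1 (bit offset 32).
   Only the insert of c changes the vector. */
v4si g41(v4si a, int c)
{
  int b = a[2];
  a[2] = b;
  a[1] = c;
  return a;
}

/* Small self-check of the semantics; it says nothing about which GIMPLE
   the optimizer emits on a given target. */
void check_semantics(void)
{
  v4si a = {1, 2, 3, 4};
  v4si r = g40(a);
  assert(r[0] == 1 && r[1] == 2 && r[2] == 3 && r[3] == 4);
  v4si s = g41(a, 9);
  assert(s[0] == 1 && s[1] == 9 && s[2] == 3 && s[3] == 4);
}
```

Compiling something like this with -O2 -fdump-tree-optimized and looking
for BIT_FIELD_REF / BIT_INSERT_EXPR in the dump should show whether the
same-location insert is folded away on a given target.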
