[Bug target/115921] Missed optimization: and->ashift might be cheaper than ashift->and on typical RISC targets

2024-08-22 Thread Jovan.Vukic--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921

Jovan Vukic  changed:

   What|Removed |Added

 CC||jovan.vu...@rt-rk.com

--- Comment #2 from Jovan Vukic  ---
I have reviewed the issue, specifically for the RISC-V platform, as I find the
difference between the results for 32-bit and 64-bit variants intriguing.

Here are my conclusions:
1. Both RISC-V 32-bit and 64-bit versions in the releases 14.1.0 and 14.2.0 do
not perform the optimization in question.
2. The GCC trunk code contains the optimization for this problem for both
RISC-V 32-bit and 64-bit, applicable to AND, OR, and XOR operations.
3. This optimization happens because of the pattern
_shift_reverse on line 2929 in riscv.md.


The optimization works correctly, as evidenced by the example C code below, for
which GCC generates optimized assembly code for both 32-bit and 64-bit
platforms:

typedef unsigned long target_wide_uint_t;

target_wide_uint_t test_ashift_and(target_wide_uint_t x) {
    return (x & 0x3e) << 12;
}


test_ashift_and:
    andi    a0,a0,62
    slli    a0,a0,12
    ret

However, the problem arises with the test example we are considering (repeated
below). It is optimized for 32-bit but not for 64-bit. Here are the results for
the 64-bit architecture (https://godbolt.org/z/6a7x9zTjs):

typedef unsigned long target_wide_uint_t;

target_wide_uint_t test_ashift_and(target_wide_uint_t x) {
    return (x & 0x3f) << 12;
}


test_ashift_and:
    li  a5,258048
    slli    a0,a0,12
    and a0,a0,a5
    ret

In this example, we have 0x3f == 2^6 - 1. If we examine the pattern on line
2929 in riscv.md, there is a special condition:
(!TARGET_64BIT || (exact_log2((INTVAL(operands[3]) >> INTVAL(operands[2])) + 1)
== -1))

This condition indicates that optimization does not occur for 64-bit
architecture when the value 0x3f (or another number in its place) is of the
form 2^x - 1. This condition, introduced in this commit
[https://github.com/gcc-mirror/gcc/commit/236116068151bbc72aaaf53d0f223fe06f7e3bac],
seems to be the root of the issue for RISC-V 64-bit.


It would be helpful if someone could clarify the rationale behind the
exact_log2 condition, as it prevents optimization only for numbers of the form
2^x - 1, specifically for the 64-bit target, and its purpose is not immediately
clear to me.

[Bug target/117011] New: RISC-V: Logic overlap in IF_THEN_ELSE case for instruction cost calculation

2024-10-08 Thread Jovan.Vukic--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117011

Bug ID: 117011
   Summary: RISC-V: Logic overlap in IF_THEN_ELSE case for
instruction cost calculation
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jovan.vu...@rt-rk.com
  Target Milestone: ---
Target: riscv

In the file riscv.cc, in the function riscv_rtx_costs, we have the  
following code for the IF_THEN_ELSE case:

if ((TARGET_SFB_ALU || TARGET_XTHEADCONDMOV) 
&& reg_or_0_operand(XEXP(x, 1), mode) 
&& sfb_alu_operand(XEXP(x, 2), mode) 
&& comparison_operator(XEXP(x, 0), VOIDmode)) 
{
/* For predicated conditional-move operations, we assume the cost 
   of a single instruction, even though there are actually two. */ 
*total = COSTS_N_INSNS(1); 
return true; 
} 
else if (TARGET_ZICOND_LIKE 
 && outer_code == SET 
 && ...) 
{
*total = COSTS_N_INSNS(1); 
return true; 
}


However, what will happen if both TARGET_SFB_ALU and 
TARGET_ZICOND_LIKE are set to 1 at the same time 
(-mtune=sifive-7-series -march=rv64gc_zicond)? In this case, we won't be 
able to distinguish between the movcc and czero instructions, and for 
czero, we will enter the "if" branch instead of (as we expect) the 
"else if" branch.

Since both branches currently set *total to 1 instruction, there is 
essentially no difference. However, if someone changes the way the 
calculation is performed in the future, it could lead to issues.

[Bug target/113035] RISC-V: regression testsuite errors -mtune=sifive-7-series

2024-10-10 Thread Jovan.Vukic--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113035

Jovan Vukic  changed:

   What|Removed |Added

 CC||jovan.vu...@rt-rk.com

--- Comment #5 from Jovan Vukic  ---
Since we have many zicond tests failing here for the sifive-7-series
(currently 275 FAILs), I analyzed the issue this week. 

In short, in the function riscv_expand_conditional_move (from riscv.cc), 
the SFB is always preferred when both TARGET_SFB_ALU and 
TARGET_ZICOND_LIKE are set to 1.

During the meeting this week, it was clarified to me that we generally steer 
towards zicond. A simple change in the riscv_expand_conditional_move function
could help us achieve this here too and eliminate 120 FAILs.

-  if (TARGET_SFB_ALU || TARGET_XTHEADCONDMOV)
+  if ((TARGET_SFB_ALU || TARGET_XTHEADCONDMOV)
+  && (!TARGET_ZICOND_LIKE || (code != EQ && code != NE)))

However, it was also noted that zicond + SFB cases are not currently 
that relevant. So, even though I initially thought about turning this into
a patch, I assume it's better to leave this here as an observation.

[Bug middle-end/116860] [15 Regression] New test case gcc.dg/tree-ssa/fold-xor-and-or.c from r15-3866-ga88d6c6d777ad7 fails

2024-10-10 Thread Jovan.Vukic--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860

Jovan Vukic  changed:

   What|Removed |Added

 CC||jovan.vu...@rt-rk.com

--- Comment #5 from Jovan Vukic  ---
Just to confirm, the tests are now passing successfully, I assume since this
commit:

https://gcc.gnu.org/g:34ae3a992a0cc3240d07d69ff12a664cbb5c8be0

[Bug middle-end/116860] [15 Regression] New test case gcc.dg/tree-ssa/fold-xor-and-or.c from r15-3866-ga88d6c6d777ad7 fails

2024-10-10 Thread Jovan.Vukic--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860

--- Comment #6 from Jovan Vukic  ---
Now I see that I wasn't specific enough, sorry. These tests are passing now
for RISC-V, since there has been a regression there too, due to
LOGICAL_OP_NON_SHORT_CIRCUIT being defined as 0.