On 08/04/13 13:47, Kyrylo Tkachov wrote:
Hi all,
When compiling:
unsigned long long
muld (unsigned long long X, unsigned long long Y)
{
  unsigned long long mask = 0xffffffffull;
  return (X & mask) * (Y & mask);
}
we get a suboptimal sequence:
	stmfd	sp!, {r4, r5}
	mvn	r4, #0
	mov	r5, #0
	and	r0, r0, r4
	and	r3, r3, r5
	and	r1, r1, r5
	and	r2, r2, r4
	mul	r3, r0, r3
	mla	r3, r2, r1, r3
	umull	r0, r1, r0, r2
	ldmfd	sp!, {r4, r5}
	add	r1, r3, r1
	bx	lr
This patch improves the situation by changing the anddi3 insn into an
insn_and_split and simplifying the resulting SImode ANDs. The NEON
version is also merged with the non-NEON one.
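As a rough illustration of why splitting helps (plain C, not the RTL from the patch; and64_by_halves is a made-up name), a 64-bit AND can be viewed as two independent 32-bit ANDs. With the mask 0xffffffff, the low-word AND is against all ones and the high-word AND is against zero, so both are trivially simplifiable once they are separate operations:

```c
#include <stdint.h>

/* Hypothetical helper: compute a 64-bit AND as two 32-bit ANDs,
   one per word, the way a 32-bit target has to do it.  */
uint64_t
and64_by_halves (uint64_t x, uint64_t mask)
{
  uint32_t lo = (uint32_t) x & (uint32_t) mask;
  uint32_t hi = (uint32_t) (x >> 32) & (uint32_t) (mask >> 32);

  /* For mask == 0xffffffff: lo is just the low word of x (AND with
     all ones) and hi is 0 (AND with zero).  */
  return ((uint64_t) hi << 32) | lo;
}
```

The result is the same as a plain x & mask; the point is only that each half can be optimised on its own.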
This allows us to generate just:
	umull	r0, r1, r2, r0
	bx	lr
for the above code.
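To see why a single umull is enough, here is a small sketch (the function names are made up for illustration and are not part of the patch or the new tests). Once both operands are masked to their low 32 bits, the 64-bit multiply is just a 32x32->64 widening multiply of the low words. Assuming the usual little-endian AAPCS layout, those low words arrive in r0 and r2:

```c
#include <stdint.h>

/* Same computation as muld above, written with fixed-width types.  */
uint64_t
muld_masked (uint64_t x, uint64_t y)
{
  uint64_t mask = 0xffffffffull;
  return (x & mask) * (y & mask);
}

/* Equivalent form: truncate each operand to 32 bits, then widen one
   of them so the multiplication is carried out in 64 bits.  */
uint64_t
muld_widening (uint64_t x, uint64_t y)
{
  return (uint64_t) (uint32_t) x * (uint32_t) y;
}
```

Both functions return the same value for every input, since the high words of X and Y never contribute to the product.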
Regtested arm-none-eabi on qemu.
Ok for trunk?
Thanks,
Kyrill
gcc/ChangeLog

2013-04-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

	* config/arm/arm.c (const_ok_for_dimode_op): Handle AND case.
	* config/arm/arm.md (*anddi3_insn): Change to insn_and_split.
	* config/arm/constraints.md (De): New constraint.
	* config/arm/neon.md (anddi3_neon): Delete.
	(neon_vand<mode>): Expand to standard anddi3 pattern.
	* config/arm/predicates.md (imm_for_neon_inv_logic_operand):
	Move earlier in the file.
	(neon_inv_logic_op2): Likewise.
	(arm_anddi_operand_neon): New predicate.

gcc/testsuite/ChangeLog

2013-04-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

	* gcc.target/arm/anddi3-opt.c: New test.
	* gcc.target/arm/anddi3-opt2.c: Likewise.

OK.
R.