Hi all, I'm looking at a case on aarch64 that's not if-converted to use conditional moves:

    typedef unsigned char uint8_t;
    typedef unsigned int uint16_t;

    uint8_t
    foo (const uint8_t byte, const uint16_t generator)
    {
      if (byte & 0x80)
        {
          return (byte << 1) ^ (generator & 0xff);
        }
      else
        {
          return byte << 1;
        }
    }

For aarch64 we fail to if-convert and generate:

    foo:
            uxtb    w2, w0
            lsl     w3, w2, 1
            uxtb    w0, w3
            tbnz    x2, 7, .L5
            ret
            .p2align 3
    .L5:
            eor     w0, w3, w1
            uxtb    w0, w0
            ret

whereas on x86 we if-convert successfully and use a conditional move/select:

            leal    (%rdi,%rdi), %eax
            xorl    %eax, %esi
            testb   %dil, %dil
            cmovs   %esi, %eax
            ret

After fixing some of the branch costs in aarch64 and a bogus cost calculation in cheap_bb_rtx_cost_p I'm stuck on noce_process_if_block (in ifcvt.c) and what I think is a restriction that the THEN-block contents have to be only a single set insn. This fails on aarch64 because we get an extra zero_extend.

In particular, the following check in noce_process_if_block triggers:

    insn_a = first_active_insn (then_bb);
    if (! insn_a
        || insn_a != last_active_insn (then_bb, FALSE)
        || (set_a = single_set (insn_a)) == NULL_RTX)
      return FALSE;

Is there any particular reason why the code shouldn't be able to handle arbitrarily large contents in then_bb (within a sane limit)?

Thanks,
Kyrill
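
P.S. For reference, here is a rough C-level sketch of the "compute both arms, then select" shape that the x86 output above corresponds to, and that I would hope to get on aarch64 as well. The names foo_select and forms_agree are made up for illustration and are not anything in GCC. I use the <stdint.h> typedefs here rather than the local ones from the test case; that should not matter because only the low 8 bits of generator are used.

```c
#include <stdint.h>

/* Original, branchy form from the test case. */
uint8_t foo(const uint8_t byte, const uint16_t generator)
{
  if (byte & 0x80)
    return (byte << 1) ^ (generator & 0xff);
  else
    return byte << 1;
}

/* Hypothetical hand-if-converted form: evaluate both arms
   unconditionally, then pick one with a single select.  */
uint8_t foo_select(const uint8_t byte, const uint16_t generator)
{
  const uint8_t shifted = (uint8_t)(byte << 1);
  const uint8_t xored = (uint8_t)(shifted ^ (generator & 0xff));
  return (byte & 0x80) ? xored : shifted;
}

/* Returns 1 if both forms agree for every byte value with the
   given generator, 0 otherwise.  */
int forms_agree(const uint16_t generator)
{
  for (unsigned int b = 0; b < 256; b++)
    if (foo((uint8_t)b, generator) != foo_select((uint8_t)b, generator))
      return 0;
  return 1;
}
```

The THEN arm here needs two operations (the xor and the narrowing back to 8 bits), which is the same reason the aarch64 then_bb ends up with more than a single set insn.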