On Tue, Jul 25, 2017 at 11:25:25AM +0200, Robin Dapp wrote: > Some questions/remarks, comments welcome: > - ifcvt performs creates things that it expects other passes to clean > up afterwards. In the case of the superfluous compares this might be > possible but the code is actually wrong and we cannot rely on a pass > fixing wrong code.
Do you have an example where wrong code is generated through the noce_convert_multiple_sets_p path (with or without bodged costs)? Both AArch64 and x86-64 reject your testcase along this codepath because of the constant set of 1. If we work around that by setting bla = n rather than bla = 1 , I see this code generation for AArch64: (insn 16 15 56 4 (set (reg:CC 66 cc) (compare:CC (reg:SI 73 [ _4 ]) (reg/v:SI 77 [ <retval> ]))) "foo.c":7 393 {cmpsi} (nil)) (insn 56 16 57 4 (set (reg:CC 66 cc) (compare:CC (reg:SI 73 [ _4 ]) (reg/v:SI 77 [ <retval> ]))) "foo.c":10 393 {cmpsi} (nil)) (insn 57 56 58 4 (set (reg:SI 81) (if_then_else:SI (ge (reg:CC 66 cc) (const_int 0 [0])) (reg/v:SI 74 [ blaD.3132 ]) (reg/v:SI 79 [ nD.3128 ]))) "foo.c":10 445 {*cmovsi_insn} (nil)) (insn 58 57 59 4 (set (reg:CC 66 cc) (compare:CC (reg:SI 73 [ _4 ]) (reg/v:SI 77 [ <retval> ]))) "foo.c":10 393 {cmpsi} (nil)) (insn 59 58 60 4 (set (reg:SI 82) (if_then_else:SI (ge (reg:CC 66 cc) (const_int 0 [0])) (reg/v:SI 77 [ <retval> ]) (reg:SI 73 [ _4 ]))) "foo.c":10 445 {*cmovsi_insn} (nil)) i.e. we have fresh registers. As you note, we do expect a later pass to clean up the redundant compare of (reg:SI 73) (reg:SI 77), which can throw the cost calculation off. If I hack up ix86_noce_conversion_profitable_p to alwyas return true, then for your testcase I also see: .L4: movl (%rdi), %ecx cmpl %eax, %ecx cmovl %esi, %edx cmovl %ecx, %eax addq $4, %rdi cmpq %r8, %rdi jne .L4 Again, no overlap. The path through cond_move_process_if_block is less interesting to me, as you're only getting incorrect code after disabling a correctness check. What am I missing in the noce_convert_multiple_sets_p case? > - The extraction of the compare from the conditional move is pretty > ad-hoc and pattern-dependent currently. Is there a way to do it in a > more backend-independent fashion? > - Branch mispredict costs don't seem to play a major role in ifcvt. > Shouldn't they be accounted for in all cases, maybe weighted by bb > probabilities? They are accounted for through if_info->max_seq_cost, which is set as so: if_info.max_seq_cost = targetm.max_noce_ifcvt_seq_cost (then_edge); Where max_noce_ifcvt_seq_cost is normally the default implementation, which looks like: unsigned int default_max_noce_ifcvt_seq_cost (edge e) { bool predictable_p = predictable_edge_p (e); enum compiler_param param = (predictable_p ? PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST : PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST); /* If we have a parameter set, use that, otherwise take a guess using BRANCH_COST. */ if (global_options_set.x_param_values[param]) return PARAM_VALUE (param); else return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (3); } The i386 version looks a lot like this but with a lower COSTS_N_INSNS magic number. Thanks, James