On 2023/01/11 17:02, Robin Dapp wrote:
> Hi,

Hi!

>> On optimizing for speed, default_noce_conversion_profitable_p() allows
>> plenty of headroom, so this patch has little impact.
>>
>> Also, if the target-specific cost estimate is accurate or allows for
>> margins, the impact should be similarly small.
> I believe this part of ifcvt does/did not use the costing on purpose.
> It will generally convert more sequences than other paths that compare
> before and after costs since we just count the number of converted
> insns comparing them against the "branch costs".  Similar to rtx costs
> they are kind of relative to a single insn but AFAIK it's not used
> consistently everywhere.  All the major platforms have low branch costs
> nowadays (0 or 1?) thus we won't emit too many conditional moves here.
>
> In general I agree that we should compare costs everywhere and not just
> count (the costing should include the branch costs as well) but this would
> be a major overhaul.  For your case (assuming xtensa), could you not
> tune xtensa_branch_cost?  It is currently 3 allowing up to 4 conditional
> moves to be generated.  optimize_function_for_speed_p is already being
> passed to the hook so you could make use of that and decrease branch
> costs when optimizing for size only.
>
> Regards
>  Robin

Thank you for your detailed explanation.

In my case (for Xtensa), the cost of branching isn't really the issue.
The actual problem, I think, is the cost of the sequence itself before
and after conversion.

ifcvt's internal estimate is based on PATTERN(insn), so the instruction
lengths (the "length" attribute) associated with the insns are not well
reflected.  This is especially noticeable when optimizing for size,
where the original cost is overestimated.

Currently, in addition to the patch, I have implemented the following
code, and so far it seems to work reasonably well (fine adjustments are
still required).

/* Return true if the instruction sequence seq is a good candidate as a
   replacement for the if-convertible sequence described in if_info.  */

static bool
xtensa_noce_conversion_profitable_p (rtx_insn *seq,
                                     struct noce_if_info *if_info)
{
  unsigned int cost, original_cost;
  bool speed_p;
  rtx_insn *insn;

  speed_p = if_info->speed_p;  /* of TEST_BB */

  /* Estimate the cost for the replacing sequence.  */
  cost = 0;
  for (insn = seq; insn; insn = NEXT_INSN (insn))
    if (active_insn_p (insn))
      cost += xtensa_insn_cost (insn, speed_p);

  /* Short circuit and margins if optimizing for speed.  */
  if (speed_p)
    return cost <= if_info->max_seq_cost;

  /* Estimate the cost for the original sequence if optimizing for
     size.  */
  original_cost = xtensa_insn_cost (if_info->jump, speed_p);
  speed_p = optimize_bb_for_speed_p (if_info->then_bb);
  FOR_BB_INSNS (if_info->then_bb, insn)
    if (active_insn_p (insn))
      original_cost += xtensa_insn_cost (insn, speed_p);
  if (if_info->else_bb)
    {
      speed_p = optimize_bb_for_speed_p (if_info->else_bb);
      FOR_BB_INSNS (if_info->else_bb, insn)
        if (active_insn_p (insn))
          original_cost += xtensa_insn_cost (insn, speed_p);
    }

  return cost <= original_cost;
}
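
To make the size-side comparison easier to follow outside of GCC, here
is a rough standalone sketch of the same idea.  It is only a toy model:
struct toy_insn, toy_seq_size and toy_conversion_profitable_for_size are
hypothetical names invented for this illustration, and each insn is
assumed to carry nothing more than an "active" flag and a byte length.
It is not the hook itself and does not use any GCC internals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn (hypothetical, not a GCC type): only the
   two properties the size comparison looks at.  */
struct toy_insn
{
  bool active;      /* false for notes, labels and the like */
  unsigned length;  /* encoded length in bytes */
};

/* Sum the byte lengths of the active insns in a sequence.  */
static unsigned
toy_seq_size (const struct toy_insn *insns, size_t n)
{
  unsigned size = 0;

  assert (n == 0 || insns != NULL);
  for (size_t i = 0; i < n; i++)
    if (insns[i].active)
      size += insns[i].length;
  return size;
}

/* Size-only test: accept the converted sequence when it is no larger
   than the jump plus the THEN block plus the optional ELSE block.
   Pass NULL for else_bb when there is no ELSE block.  */
static bool
toy_conversion_profitable_for_size (const struct toy_insn *seq, size_t n_seq,
                                    const struct toy_insn *jump,
                                    const struct toy_insn *then_bb,
                                    size_t n_then,
                                    const struct toy_insn *else_bb,
                                    size_t n_else)
{
  unsigned cost = toy_seq_size (seq, n_seq);
  unsigned original_cost = toy_seq_size (jump, 1)
                           + toy_seq_size (then_bb, n_then);

  if (else_bb)
    original_cost += toy_seq_size (else_bb, n_else);

  return cost <= original_cost;
}
```

With sample lengths, a 3-byte jump plus a 2-byte THEN insn gives an
original size of 5 bytes, so a 5-byte replacement is accepted and a
6-byte one is rejected.  The point of the sketch is only that both
sides are measured in the same unit, which is what counting converted
insns against the branch cost does not give.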