Hi,

While studying the ivopt pass, I found that the cost of preloop calculations is computed inaccurately in many scenarios.
There are two kinds of preloop calculations: the base of an iv candidate, and the invariant part of an iv use representation. For the base of an iv candidate, the cost is calculated as below:

  cost_base = force_var_cost (data, base, NULL);
  /* It will be exceptional that the iv register happens to be initialized with
     the proper value at no cost.  In general, there will at least be a regcopy
     or a const set.  */
  if (cost_base.cost == 0)
    cost_base.cost = COSTS_N_INSNS (1);
  cost_step = add_cost (data->speed, TYPE_MODE (TREE_TYPE (base)));

  cost = cost_step + adjust_setup_cost (data, cost_base.cost);

Amortizing cost_base over the per-iteration cost results in a bad choice of candidates. Consider the code below, generated for ARM:

    mov     r2, #0
    sub     ip, r1, #4
    mov     lr, r2
.L48:
    add     r2, r2, #1
    str     lr, [ip, #4]!
    cmp     r2, #23
    bne     .L48

The sub instruction in the pre-header could be saved if ivopt chose the post-increment addressing mode. That did not happen because the pre-increment and post-increment candidates have the same cost/cost_base after amortization.

I experimented with keeping the cost_base information and comparing it when choosing the iv set, but did not see any obvious change. Doing so also conflicts with the assumption that there is always one regcopy or constant load. The truth is that it is hard to know at this stage whether there will be such an instruction.

The same issue occurs for the invariant part of the iv use representation in get_computation_cost_at; there, ivopt simply ignores the possible regcopy or constant load instruction.

I understand that it is difficult to calculate an accurate cost at the gimple IR level, but I have observed many such choices of iv set, especially after enabling auto-increment and multiplied_address mode on ARM. So I am sending this message to ask for help.

Thanks in advance.

Best Regards.
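
P.S. To make the amortization problem concrete, here is a minimal standalone sketch. It is not GCC code: model_adjust_setup_cost and model_cand_cost are hypothetical names, and the sketch assumes that adjust_setup_cost simply divides the setup cost by an assumed average iteration count (integer division) when optimizing for speed, and that COSTS_N_INSNS (1) is 4 cost units.

```c
#include <assert.h>

/* Assumed scaling: one instruction counts as 4 cost units.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical stand-in for adjust_setup_cost when optimizing for speed:
   the setup cost is spread over an assumed average iteration count
   using integer division.  */
unsigned
model_adjust_setup_cost (unsigned cost, unsigned avg_niter)
{
  assert (avg_niter > 0);
  return cost / avg_niter;
}

/* Hypothetical model of the candidate cost computation quoted in the
   message.  RAW_BASE_COST is the cost of computing the base in the
   pre-header, STEP_COST is the cost of the per-iteration add.  */
unsigned
model_cand_cost (unsigned raw_base_cost, unsigned step_cost,
                 unsigned avg_niter)
{
  unsigned cost_base = raw_base_cost;

  /* A free base is still charged one instruction.  */
  if (cost_base == 0)
    cost_base = COSTS_N_INSNS (1);

  return step_cost + model_adjust_setup_cost (cost_base, avg_niter);
}
```

Under these assumptions, a candidate whose base is free (raw_base_cost of 0, as for a post-increment candidate starting at r1) and one whose base needs one extra instruction (raw_base_cost of COSTS_N_INSNS (1), as for the pre-increment candidate starting at r1 - 4) both come out with the same cost for any iteration count above 4: model_cand_cost (0, 4, 23) and model_cand_cost (4, 4, 23) are both 4. The sub in the pre-header is therefore invisible to the comparison.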