https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533
--- Comment #11 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Roger Sayle from comment #10)
> I've found an interesting table of SH cycle counts (for different CPUs) at
> http://www.shared-ptr.com/sh_insns.html

Yeah, I know.  I did that ;)

> In my proposed patch, the address cost (1) when optimizing for size attempts
> to return the additional size of an instruction based on the addressing
> mode.  For register, and reg+reg addressing modes there is no size increase
> (overhead), and for addressing modes with displacements, and displacements to
> address pointers, there is a cost.

AFAIR, I've added the 'sh_address_cost' function.  The intention was/is to
encourage/discourage usage of certain address modes based on the side effects
and impact on the surrounding code.  All insns/addr modes have the same length
and basically the same execution time.  However, e.g. @(reg+reg) has a
constraint on 'r0' usage, so I weighted that heavier.  If there's anything that
could use @(reg+disp) as an alternative, that'd be better in some cases.  (Not
sure if such optimizations are actually done...)

> (2) when optimizing for speed, address
> cost remains between 0 and 3, and is used to prioritize between (equivalent
> numbers of) instructions.  Normally, rtx_costs are defined in terms of
> COSTS_N_INSNS, which multiplies by 4.  Hence on many platforms a single
> instruction that references memory may be encoded as COSTS_N_INSNS(1)+1 (or
> a more complex addressing mode as COSTS_N_INSNS(1)+2) to show that this is
> disfavored to a single instruction that doesn't reference memory,
> COSTS_N_INSNS(1)+0.

That's actually what sh_rtx_costs was supposed to do as well.  I think in the
usual cases it does that; apparently I've screwed up the {SIGN|ZERO}_EXTEND
handling for the mem load case, and it shows up only now, many years later.

It's still not entirely clear to me why we would want to squash the costs of
addresses to 0 when optimizing for size?
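
For illustration, the cost encoding described in the quoted text can be sketched in a few lines of C. This is a toy model, not code from sh.cc: only COSTS_N_INSNS mirrors GCC's actual definition, while `addr_kind`, `toy_address_cost`, `toy_insn_cost` and the individual address cost values are hypothetical names and numbers chosen for the example.

```c
/* Mirrors GCC's definition in rtl.h: one instruction is 4 cost units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical operand classes, for illustration only. */
enum addr_kind { ADDR_NONE, ADDR_REG, ADDR_REG_DISP, ADDR_REG_REG };

/* Hypothetical address cost in the range 0..3.  The values are made up
   and are NOT the ones used by sh_address_cost.  When squash_to_zero is
   set, every addressing mode costs 0.  */
static int toy_address_cost(enum addr_kind kind, int squash_to_zero)
{
    if (squash_to_zero)
        return 0;
    switch (kind) {
    case ADDR_REG:      return 1;
    case ADDR_REG_DISP: return 1;
    case ADDR_REG_REG:  return 3;  /* weighted heavier, e.g. due to r0 */
    default:            return 0;  /* no memory operand */
    }
}

/* Cost of a single instruction: one insn plus a small address penalty
   that only acts as a tie-breaker between equal instruction counts.  */
static int toy_insn_cost(enum addr_kind kind, int squash_to_zero)
{
    return COSTS_N_INSNS(1) + toy_address_cost(kind, squash_to_zero);
}
```

In this model a memory operand always costs slightly more than a register operand, yet never as much as a second instruction. Once the address cost is squashed to 0, the two forms tie at COSTS_N_INSNS(1).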
What effect does it have on the generated code?  I can't imagine how it could
possibly make the code any smaller.

With your patch, in case of the SIGN_EXTEND with mem operand, it would make the
address cost 0 with -Os, which would return COSTS_N_INSNS(1) for the reg
operand as well as the mem operand.  So both insns are equally weighted and
could be considered interchangeable.  And we might bump into this type of
regression again, if some (future) optimization decides that it can
interchange/substitute insns of the same cost...

> For example, SH currently reports multiplications as a single cycle operation,

That doesn't seem to be the case.  It's supposed to be using the function
'multcosts' in sh.cc, which returns at least a cost of '2'.

Note that on SH1 and SH2 there is no dynamic (barrel) shift.  So actually some
multiplications could be faster than stitched shifts.

> sh_rtx_costs doesn't distinguish the machine mode, so the costs of SImode
> multiplications are the same as DImode multiplications.

I guess this is because SH doesn't have real DImode multiplication (64 x 64 ->
64/128 bit).  It can only do 32 x 32 -> 64 bit widening multiplication.  Any
real DImode multiplication will result in either an expanded sequence that
calculates the sum of partial products or a libcall, AFAIR.
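
As a rough sketch of what "sum of partial products" means here, the following C shows a 64 x 64 -> 64 bit multiplication built only from 32 x 32 -> 64 bit widening multiplies. It illustrates the arithmetic and is not the sequence GCC actually emits for SH; `widening_mul32` and `mul64_partial_products` are hypothetical helper names.

```c
#include <stdint.h>

/* Stand-in for a 32 x 32 -> 64 bit widening multiply. */
static uint64_t widening_mul32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* Low 64 bits of x * y, using only widening 32-bit multiplies. */
static uint64_t mul64_partial_products(uint64_t x, uint64_t y)
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t yl = (uint32_t)y, yh = (uint32_t)(y >> 32);

    /* xl * yl contributes to all 64 result bits. */
    uint64_t low = widening_mul32(xl, yl);

    /* The cross terms are shifted left by 32, so only their low 32 bits
       survive.  xh * yh is shifted out of the result entirely.  */
    uint32_t cross = (uint32_t)widening_mul32(xl, yh)
                   + (uint32_t)widening_mul32(xh, yl);

    return low + ((uint64_t)cross << 32);
}
```

Even in this simplified form, one DImode multiplication needs three 32-bit multiplies plus additions, which is why costing it the same as an SImode multiplication underestimates it.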