On Thu, Dec 13, 2012 at 11:20 AM, Uros Bizjak <ubiz...@gmail.com> wrote: > On Thu, Dec 13, 2012 at 10:51 AM, Richard Biener > <richard.guent...@gmail.com> wrote: > >>>>> I assume that this is not right way for fixing such simple performance >>>>> anomaly since we need to do redundant work - combine load to >>>>> conditional and then split it back in peephole2? Does it look >>>>> reasonable? Why we should produce non-efficient instrucction that must >>>>> be splitted later? >>>> >>>> Well, either don't allow this instruction variant from the start, or allow >>>> the extra freedom for register allocation this creates. It doesn't make >>>> sense to just reject it being generated by combine - that doesn't address >>>> when it materializes in another way. >>> >>> Please check the attached patch, it implements this limitation in a correct >>> way: >>> - keeps memory operands for -Os or cold parts of the executable >>> - doesn't increase register pressure >>> - handles all situations where memory operand can propagate into RTX >>> >>> Yuri, can you please check if this patch fixes the performance problem for >>> you? >>> >>> BTW: I would really like to add some TARGET_USE_CMOVE_WITH_MEMOP >>> target macro and conditionalize new peephole2 patterns on it. >> >> Looks good to me. I believe optimize_insn_for_speed_p () >> only works reliable during RTL expansion as it relies on >> crtl->maybe_hot_insn_p to be better than function granular. I'm quite sure >> split does not adjust this (it probably should, as those predicates are >> definitely the correct ones to use), via rtl_profile_for_bb (). I >> think passes that >> do not adjust this get what is left over by previous passes (instead of the >> default). >> >> Honza, I think the pass manager should call default_rtl_profile () before >> each >> RTL pass to avoid this, no? > > Please note that we have plenty of existing peephole2s that use > optimize_insn_for_speed_p predicate. It is assumed to work ...
Indeed - it calls rtl_profile_for_bb already. (it's in recog.c ... just looked for sth like "split.c" in my grep). Richard. > Uros.