I have an ivopts optimization question/proposal. When compiling the attached program, the ivopts pass prefers the original ivs over new ivs, which causes us to generate less efficient code on MIPS. It may affect other platforms too.
The source code is a C strcmp:

int strcmp (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);
  return c1 - c2;
}

Currently the code prefers the original ivs and so it generates code that increments s1 and s2 before doing the loads (and uses a -1 offset):

  <bb 3>:
  # s1_1 = PHI <p1_4(D)(2), s1_6(6)>
  # s2_2 = PHI <p2_5(D)(2), s2_9(6)>
  s1_6 = s1_1 + 1;
  c1_8 = MEM[base: s1_6, offset: 4294967295B];
  s2_9 = s2_2 + 1;
  c2_10 = MEM[base: s2_9, offset: 4294967295B];
  if (c1_8 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;

If I remove the cost increment for non-original ivs then GCC does the loads before the increments:

  <bb 3>:
  # ivtmp.6_17 = PHI <ivtmp.6_24(2), ivtmp.6_14(6)>
  # ivtmp.7_21 = PHI <ivtmp.7_22(2), ivtmp.7_23(6)>
  _25 = (void *) ivtmp.6_17;
  c1_8 = MEM[base: _25, offset: 0B];
  _26 = (void *) ivtmp.7_21;
  c2_10 = MEM[base: _26, offset: 0B];
  if (c1_8 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;
  .
  .
  <bb 5>:
  ivtmp.6_14 = ivtmp.6_17 + 1;
  ivtmp.7_23 = ivtmp.7_21 + 1;
  if (c1_8 == c2_10)
    goto <bb 6>;
  else
    goto <bb 7>;

This second case (without the preference for the original IV) generates better code on MIPS because the final assembly has the increment instructions between the loads and the tests of the values being loaded and so there is no delay (or less delay) between the load and use. It seems like this could easily be the case for other platforms too so I was wondering what people thought of this patch:

2015-12-08  Steve Ellcey  <sell...@imgtec.com>

	* tree-ssa-loop-ivopts.c (determine_iv_cost): Remove preference for
	original ivs.

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 98dc451..26daabc 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5818,14 +5818,6 @@ determine_iv_cost (struct ivopts_data *data, struct iv_cand *cand)
 
   cost = cost_step + adjust_setup_cost (data, cost_base.cost);
 
-  /* Prefer the original ivs unless we may gain something by replacing it.
-     The reason is to make debugging simpler; so this is not relevant for
-     artificial ivs created by other optimization passes.  */
-  if (cand->pos != IP_ORIGINAL
-      || !SSA_NAME_VAR (cand->var_before)
-      || DECL_ARTIFICIAL (SSA_NAME_VAR (cand->var_before)))
-    cost++;
-
   /* Prefer not to insert statements into latch unless there are some
      already (so that we do not create unnecessary jumps).  */
   if (cand->pos == IP_END
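
For anyone who finds the GIMPLE dumps hard to compare, here is a rough source-level sketch of the two loop shapes. It is hand-written C, not GCC output, and the function names strcmp_inc_first and strcmp_load_first are made up for this illustration. The first mirrors the dump where the pointers are bumped before the loads (hence the -1 offset); the second mirrors the dump where the loads use offset 0 and the increments come afterwards. A compiler is free to turn both into the same code, so this only shows the intended ordering and should not be used to reproduce the problem.

```c
#include <stddef.h>

/* Shape of the first dump: increment the pointer, then load at offset -1.  */
int strcmp_inc_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  do
    {
      s1 = s1 + 1;
      c1 = s1[-1];
      s2 = s2 + 1;
      c2 = s2[-1];
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);
  return c1 - c2;
}

/* Shape of the second dump: load at offset 0, increment afterwards, so the
   increments sit between the loads and the comparison of the loaded bytes.  */
int strcmp_load_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  for (;;)
    {
      c1 = s1[0];
      c2 = s2[0];
      if (c1 == '\0')
        break;
      s1 = s1 + 1;
      s2 = s2 + 1;
      if (c1 != c2)
        break;
    }
  return c1 - c2;
}
```

Both functions return the same value for any pair of NUL-terminated strings; only the position of the increments relative to the loads differs.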