On Tue, Dec 8, 2015 at 7:56 PM, Steve Ellcey <sell...@imgtec.com> wrote:
> I have an ivopts optimization question/proposal.  When compiling the
> attached program the ivopts pass prefers the original ivs over new ivs
> and that causes us to generate less efficient code on MIPS.  It may
> affect other platforms too.
>
> The source code is a C strcmp:
>
> int strcmp (const char *p1, const char *p2)
> {
>   const unsigned char *s1 = (const unsigned char *) p1;
>   const unsigned char *s2 = (const unsigned char *) p2;
>   unsigned char c1, c2;
>   do {
>       c1 = (unsigned char) *s1++;
>       c2 = (unsigned char) *s2++;
>       if (c1 == '\0') return c1 - c2;
>   } while (c1 == c2);
>   return c1 - c2;
> }
>
> Currently the code prefers the original ivs and so it generates
> code that increments s1 and s2 before doing the loads (and uses
> a -1 offset):
>
>   <bb 3>:
>   # s1_1 = PHI <p1_4(D)(2), s1_6(6)>
>   # s2_2 = PHI <p2_5(D)(2), s2_9(6)>
>   s1_6 = s1_1 + 1;
>   c1_8 = MEM[base: s1_6, offset: 4294967295B];
>   s2_9 = s2_2 + 1;
>   c2_10 = MEM[base: s2_9, offset: 4294967295B];
>   if (c1_8 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> If I remove the cost increment for non-original ivs then GCC
> does the loads before the increments:
>
>   <bb 3>:
>   # ivtmp.6_17 = PHI <ivtmp.6_24(2), ivtmp.6_14(6)>
>   # ivtmp.7_21 = PHI <ivtmp.7_22(2), ivtmp.7_23(6)>
>   _25 = (void *) ivtmp.6_17;
>   c1_8 = MEM[base: _25, offset: 0B];
>   _26 = (void *) ivtmp.7_21;
>   c2_10 = MEM[base: _26, offset: 0B];
>   if (c1_8 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
> .
> .
>   <bb 5>:
>   ivtmp.6_14 = ivtmp.6_17 + 1;
>   ivtmp.7_23 = ivtmp.7_21 + 1;
>   if (c1_8 == c2_10)
>     goto <bb 6>;
>   else
>     goto <bb 7>;
>
>
> This second case (without the preference for the original IV)
> generates better code on MIPS because the final assembly
> has the increment instructions between the loads and the tests
> of the values being loaded and so there is no delay (or less delay)
> between the load and use.  It seems like this could easily be
> the case for other platforms too so I was wondering what people
> thought of this patch:

You don't comment on the comment you remove ... debugging
programs is also important!

So, if anything, the cost of the two cases should be distinguished
somewhere else, for example by granting a bonus for an increment
placed before the exit test.
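
To make the two loop shapes concrete, here is a rough C-level sketch that
mirrors the two GIMPLE dumps quoted above.  The function names
strcmp_inc_first and strcmp_load_first are made up for illustration and do
not appear in the patch, and the compiler is free to reschedule either
version, so this only reflects the shape ivopts chooses, not the final
assembly.

```c
/* Hypothetical names, for illustration only.  */

/* Shape 1: original ivs kept.  Each pointer is incremented first and
   the load then uses a -1 offset from the new value.  */
int
strcmp_inc_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  do
    {
      s1 = s1 + 1;
      c1 = s1[-1];
      s2 = s2 + 1;
      c2 = s2[-1];
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);
  return c1 - c2;
}

/* Shape 2: new ivs.  Both loads use offset 0, and the increments sit
   between the loads and the comparison of the loaded values.  */
int
strcmp_load_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  for (;;)
    {
      c1 = *s1;
      c2 = *s2;
      if (c1 == '\0')
        break;
      s1++;
      s2++;
      if (c1 != c2)
        break;
    }
  return c1 - c2;
}
```

Both functions return the same result for any pair of strings; the only
difference is where the increments fall relative to the loads and the
exit tests.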

Richard.

> 2015-12-08  Steve Ellcey  <sell...@imgtec.com>
>
>         * tree-ssa-loop-ivopts.c (determine_iv_cost): Remove preference for
>         original ivs.
>
>
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 98dc451..26daabc 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -5818,14 +5818,6 @@ determine_iv_cost (struct ivopts_data *data, struct iv_cand *cand)
>
>    cost = cost_step + adjust_setup_cost (data, cost_base.cost);
>
> -  /* Prefer the original ivs unless we may gain something by replacing it.
> -     The reason is to make debugging simpler; so this is not relevant for
> -     artificial ivs created by other optimization passes.  */
> -  if (cand->pos != IP_ORIGINAL
> -      || !SSA_NAME_VAR (cand->var_before)
> -      || DECL_ARTIFICIAL (SSA_NAME_VAR (cand->var_before)))
> -    cost++;
> -
>    /* Prefer not to insert statements into latch unless there are some
>       already (so that we do not create unnecessary jumps).  */
>    if (cand->pos == IP_END
