A somewhat old "issue report" pointed me to the code generated for a 4-fold 
manually unrolled version of the following loop:

>                       while (++len != len_limit) /* this is loop */
>                               if (pb[len] != cur[len])
>                                       break;

As unrolled, the loop appears as:

>                 while (++len != len_limit) /* this is loop */ {
>                   if (pb[len] != cur[len])
>                     break;
>                   if (++len == len_limit)  /* unrolled 2nd iteration */
>                     break;
>                   if (pb[len] != cur[len])
>                     break;
>                   if (++len == len_limit)  /* unrolled 3rd iteration */
>                     break;
>                   if (pb[len] != cur[len])
>                     break;
>                   if (++len == len_limit)  /* unrolled 4th iteration */
>                     break;
>                   if (pb[len] != cur[len])
>                     break;
>                 }

In examining the behavior of tree-ssa-loop-ivopts.c, I've found that the only 
induction variable candidates being considered are forms of the len variable.  
No candidates are considered for the address expressions &pb[len] and 
&cur[len].

I rewrote the source code for this loop to make the addressing expressions more 
explicit, as in the following:

>       cur++;
>       while (++pb != last_pb) /* this is loop */ {
>         if (*pb != *cur)
>           break;
>         ++cur;
>         if (++pb == last_pb)  /* unrolled 2nd iteration */
>           break;
>         if (*pb != *cur)
>           break;
>         ++cur;
>         if (++pb == last_pb)  /* unrolled 3rd iteration */
>           break;
>         if (*pb != *cur)
>           break;
>         ++cur;
>         if (++pb == last_pb)  /* unrolled 4th iteration */
>           break;
>         if (*pb != *cur)
>           break;
>         ++cur;
>       }

Now, gcc does a better job of identifying the "address expression induction 
variables".  This version of the loop runs about 10% faster than the original 
on my target architecture.
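
For anyone who wants to experiment, here is a minimal, non-unrolled sketch of 
the two forms side by side.  The function names, the unsigned char element 
type, and the derivation last_pb = pb + len_limit are my assumptions for 
illustration; the snippets above do not show how the pointers are set up.  
Both functions assume len < len_limit on entry, as the original loop does.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: the only induction variable visible in the source is len. */
static size_t match_len_indexed(const unsigned char *pb,
                                const unsigned char *cur,
                                size_t len, size_t len_limit)
{
    assert(len < len_limit);   /* precondition assumed by the loop */
    while (++len != len_limit)
        if (pb[len] != cur[len])
            break;
    return len;
}

/* Pointer form: &pb[len] and &cur[len] are carried as explicit pointers.
   The setup below (p, c, last_pb) is a hypothetical reconstruction. */
static size_t match_len_pointer(const unsigned char *pb,
                                const unsigned char *cur,
                                size_t len, size_t len_limit)
{
    assert(len < len_limit);   /* same precondition as the indexed form */
    const unsigned char *p = pb + len;
    const unsigned char *c = cur + len;
    const unsigned char *last_pb = pb + len_limit;

    c++;                       /* c now tracks the element p reaches next */
    while (++p != last_pb) {
        if (*p != *c)
            break;
        ++c;
    }
    return (size_t)(p - pb);   /* recover the index that len would hold */
}
```

Both functions should return the same value for the same inputs, so the 
pointer form can be dropped in wherever the final len is what matters.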

This would seem to be a textbook pattern for the induction variable analysis.  
Does anyone have any thoughts on the best way to add these candidates to the 
set of induction variables that are considered by tree-ssa-loop-ivopts.c?

Thanks in advance for any suggestions.
