On Fri, Jul 13, 2018 at 6:04 AM, Kelvin Nilsen <kdnil...@linux.ibm.com> wrote:
> A somewhat old "issue report" pointed me to the code generated for a 4-fold 
> manually unrolled version of the following loop:
>
>>                       while (++len != len_limit) /* this is loop */
>>                               if (pb[len] != cur[len])
>>                                       break;
>
> As unrolled, the loop appears as:
>
>>                 while (++len != len_limit) /* this is loop */ {
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 2nd iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 3rd iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 4th iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                 }
>
> In examining the behavior of tree-ssa-loop-ivopts.c, I've discovered the only 
> induction variable candidates that are being considered are all forms of the 
> len variable.  We are not considering any induction variables to represent 
> the address expressions &pb[len] and &cur[len].
>
> I rewrote the source code for this loop to make the addressing expressions 
> more explicit, as in the following:
>
>>       cur++;
>>       while (++pb != last_pb) /* this is loop */ {
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 2nd iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 3rd iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 4th iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>       }
>
> Now, gcc does a better job of identifying the "address expression induction 
> variables".  This version of the loop runs about 10% faster than the original 
> on my target architecture.
>
> This would seem to be a textbook pattern for the induction variable analysis.
> Does anyone have any thoughts on the best way to add these candidates to the
> set of induction variables that are considered by tree-ssa-loop-ivopts.c?
>
> Thanks in advance for any suggestions.
>
Hi,
Could you please file a bug with your original slow test code
attached?  I tried to construct a meaningful test case from your code
snippet but was not successful.  There is a difference in the generated
assembly, but it's not that fundamental.  So a bug with a preprocessed
test case would be highly appreciated.
I think there are two potential issues in the cost computation for such
a case: invariant expressions, and iv uses outside of the loop being
handled as inside uses.
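
For reference, a standalone comparison of the two loop forms might look
roughly like the sketch below.  It is built only from the snippets quoted
above and is not the original test code.  The function names, the
uint8_t/uint32_t types, and the way last_pb is derived from len_limit are
all assumptions made for illustration.  Both functions assume
len < len_limit on entry and return the index at which the loop stopped.

```c
#include <stddef.h>
#include <stdint.h>

/* Indexed form: len is the only explicit induction variable; the
   addresses &pb[len] and &cur[len] are derived from it.
   Hypothetical name and signature, for illustration only.  */
uint32_t
match_len_indexed (const uint8_t *pb, const uint8_t *cur,
                   uint32_t len, uint32_t len_limit)
{
  while (++len != len_limit)
    {
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)   /* unrolled 2nd iteration */
        break;
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)   /* unrolled 3rd iteration */
        break;
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)   /* unrolled 4th iteration */
        break;
      if (pb[len] != cur[len])
        break;
    }
  return len;
}

/* Pointer form: the address expressions are explicit induction
   variables.  last_pb = base + len_limit is an assumption; the
   quoted snippet does not show how it is computed.  */
uint32_t
match_len_pointer (const uint8_t *pb, const uint8_t *cur,
                   uint32_t len, uint32_t len_limit)
{
  const uint8_t *base = pb;
  const uint8_t *last_pb = pb + len_limit;

  pb += len;
  cur += len;
  cur++;
  while (++pb != last_pb)
    {
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)      /* unrolled 2nd iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)      /* unrolled 3rd iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)      /* unrolled 4th iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
    }
  /* Convert the final pointer back to an index for comparison.  */
  return (uint32_t) (pb - base);
}
```

Compiling both functions at the same optimization level and comparing the
assembly (or the dump from -fdump-tree-ivopts-details) should show which
candidates ivopts chooses for each form.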

Thanks,
bin
