[Bug tree-optimization/77902] New: Auto-vectorizes epilogue loops or manually vectorized functions

linux at carewolf dot com Sat, 08 Oct 2016 04:52:11 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77902


            Bug ID: 77902
           Summary: Auto-vectorizes epilogue loops or manually vectorized
                    functions
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linux at carewolf dot com
  Target Milestone: ---

Created attachment 39774
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39774&action=edit
Example that triggers the pointless auto-vectorization

A common pattern when manually vectorizing an inner function is to have a small
epilogue that handles the remainder of the input vector that cannot be handled
by the vectorized stepping.

For instance:
    int i = 0;
    for (; i < (count - 3); i += 4)
       // do 4 at a time
    for (; i < count; ++i)
       // do 1 at a time


When compiled with -O3 or -ftree-loop-vectorize, that last epilogue loop may be
auto-vectorized by GCC even though it can run at most 3 times, so the
auto-vectorized code path will never be executed.
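
To make the pattern concrete, here is a minimal self-contained sketch. It is not
the attached test case: the function name add_arrays and the element-wise add
are assumptions chosen for illustration, and the main loop uses plain scalar
statements where real code would typically use SIMD intrinsics. Whether GCC
actually vectorizes this particular epilogue depends on the version and flags.

```c
/* Hypothetical stand-in for the attached test case: adds two float arrays. */
void add_arrays(float *dst, const float *a, const float *b, int count)
{
    int i = 0;

    /* Main loop: four elements per step. Real code would usually put
       SIMD intrinsics here; scalar statements keep the sketch portable. */
    for (; i < count - 3; i += 4) {
        dst[i]     = a[i]     + b[i];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Epilogue: at most three elements remain at this point, but the
       loop condition alone does not state that bound. */
    for (; i < count; ++i)
        dst[i] = a[i] + b[i];
}
```

With count = 7, for example, the main loop handles elements 0 to 3 and the
epilogue handles elements 4 to 6.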

Rewriting it as 
    int i = 0;
    for (; i < (count - 3); i += 4)
       // do 4 at a time
    for (int _i = 0; _i < 3 && i < count; ++_i, ++i)
       // do 1 at a time

fixes the issue.
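
A complete version of the workaround might look like the sketch below. The
function name add_arrays_bounded and the counter name tail are assumptions for
illustration, and the main loop again uses scalar statements in place of
intrinsics. The only difference from the plain pattern is the extra counter,
which puts the bound of 3 directly into the epilogue's loop condition.

```c
/* Hypothetical example: same element-wise add, with a bounded epilogue. */
void add_arrays_bounded(float *dst, const float *a, const float *b, int count)
{
    int i = 0;

    /* Main loop: four elements per step (stand-in for intrinsics). */
    for (; i < count - 3; i += 4) {
        dst[i]     = a[i]     + b[i];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Epilogue: the extra counter makes the trip count of at most 3
       visible in the loop condition itself. */
    for (int tail = 0; tail < 3 && i < count; ++tail, ++i)
        dst[i] = a[i] + b[i];
}
```

The results are the same as with the unbounded epilogue, because no more than
three elements can be left over after the main loop.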

I am guessing GCC would do well to learn a range from the main loop so that it
can figure out on its own that the epilogue cannot run more than 3 times.
