https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2023-12-25 00:00:00         |2023-12-27
            Summary|Middle end early break      |gcc does not version loops
                   |vectorization: Fail to      |with side-effect early
                   |vectorize a simple early    |breaks
                   |break code                  |
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
So GCC's approach is quite different from clang's.

I think this should be handled by ivcanon, as that makes the vectorizer code
much simpler.  At the moment the vectorizer assumes that every exit it sees is
actually needed.  So even if I relax my patch to allow this we would still
produce a pointless compare.

Looking at ivcanon, for a constant-sized array it does:

Loop 1 iterates 1001 times.
Loop 1 iterates at most 999 times.
Loop 1 likely iterates at most 999 times.
Analyzing # of iterations of loop 1
  exit condition [0, + , 1](no_overflow) <= 1000
  bounds on difference of bases: 1000 ... 1000
  result:
    # of iterations 1001, bounded by 1001
Removed pointless exit: if (i_13 > 1000)

but for the example attached:

Loop 1 iterates 1001 times.
Loop 1 iterates at most 1001 times.
Loop 1 likely iterates at most 1001 times.
Analyzing # of iterations of loop 1
  exit condition [1, + , 1](no_overflow) < N_13(D)
  bounds on difference of bases: 0 ... 2147483646

It has correctly determined that the loop iterates at most 1001 times, but
since N can be < 1001 it doesn't consider the additional exit useless.

However, like clang, we can just version the loop.  Unlike clang, however, we
can probably do better.

If N >= 1000 then we can enter the vector code without the additional exit, but
if N < 1000 we can use my new pass.
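
To make the versioning idea concrete, here is a minimal hand-written sketch in
C.  It is not the testcase attached to this bug, and it is not what GCC or
clang actually emit: the function names, the LEN constant and the loop shape
are all assumptions chosen for illustration, so the exact threshold differs
from the 1000 quoted above.  It only shows how a single runtime check on N
can leave each version of the loop with one exit.

```c
/* Hypothetical array length, chosen to match the 1001 bound in the dump.  */
#define LEN 1001

/* Assumed original shape: two exits, the latch test on n and an early
   break that guards the array bound.  */
int
fill_original (int *a, int n)
{
  int i;
  for (i = 0; i < n; i++)
    {
      if (i >= LEN)
        break;
      a[i] = 2 * i;
    }
  return i;
}

/* Hand-versioned equivalent: each copy of the loop has a single exit.  */
int
fill_versioned (int *a, int n)
{
  int i;
  if (n >= LEN)
    {
      /* The test on n can never end the loop before the bound does, so
         the trip count is the constant LEN.  */
      for (i = 0; i < LEN; i++)
        a[i] = 2 * i;
    }
  else
    {
      /* i < n < LEN always holds here, so the early break is dead.  */
      for (i = 0; i < n; i++)
        a[i] = 2 * i;
    }
  return i;
}
```

Both functions write the same elements and return the same final index for
any n; the versioned form just moves the decision between the two exits out
of the loop.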

It's not hard to allow this through the pass, but I doubt this will be accepted
in stage3.

For the best result, the loop should be versioned the way clang does it.

Richi?
