https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120089

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
At least on x86_64 before r15-7533-g589d79e6268b05 we failed to vectorize this:

t.c:17:12: note:   examining phi: _33 = PHI <0(20), _42(17)>
t.c:9:1: missed:   not vectorized: relevant phi not supported: _33 = PHI <0(20), _42(17)>
t.c:17:12: missed:  bad operation or unsupported loop bound
t.c:17:12: note:  ***** Analysis failed with vector mode V2DI

(this is the PHI that misses the SLP discovery)

t.c:17:12: missed:    can't vectorize early exit because the target doesn't support flag setting vector comparisons.
t.c:17:12: note:   unsupported SLP instance starting from: if (patt_37 != 0)
t.c:17:12: missed:  unsupported SLP instances
t.c:17:12: note:  ***** Analysis failed with vector mode V8QI

(unsupported ptest)

So the issue was previously latent.  I have not yet spotted the actual
problem; the vectorization looks correct to my eyes.

Note the main exit is the exit to the __builtin_trap().


With SSE4 we exit the main vector loop after 3 vector iterations via the early
exit (to the non-trap path).

(gdb) p data
$20 = {d = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
    19, 20, 21, 22, 23, 0 <repeats 76 times>}}

        movq    %xmm3, %rdx
        movd    %xmm1, %eax

Both IVs are 12, which I think is correct, but the destination pointer then
seems mishandled: it somehow gets taken from the original scalar IV and thus
does not have VF == 4 imposed.  This looks like a missed early-exit
forced-live IV, except that it is an address IV.
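
To illustrate the suspicion, here is a minimal scalar model of a VF == 4 loop
with an early exit.  It is not the testcase from this PR: the function name,
the struct, the search key and the scale_address_iv switch are all
hypothetical, and the "unscaled" branch is only one reading of what "taken
from the original scalar IV" could mean.  It just shows that when the early
exit is taken after 3 vector iterations, the address IV has to have advanced
by 3 * VF elements, the same as the counting IV.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* vectorization factor, as in the comment (VF == 4) */

/* Hypothetical snapshot of the live values when the vector loop is left. */
struct exit_state {
    size_t iv;          /* counting IV: element index where the epilogue resumes */
    size_t dst_offset;  /* address IV: elements the destination pointer advanced */
};

/* Model the main vector loop: process VF elements per iteration and leave
   through the early exit as soon as any lane equals `key`.
   scale_address_iv != 0: address IV steps by VF (what the epilogue needs).
   scale_address_iv == 0: address IV steps by the scalar step of 1
   (assumed model of the suspected bug, not confirmed by the PR).  */
struct exit_state run_vector_loop(const int *src, size_t n, int key,
                                  int scale_address_iv)
{
    assert(src != NULL);

    size_t i = 0;
    size_t dst_off = 0;

    while (i + VF <= n) {
        int hit = 0;
        for (size_t lane = 0; lane < VF; lane++)
            hit |= (src[i + lane] == key);
        if (hit)
            break;  /* early exit: the scalar epilogue redoes this chunk */
        i += VF;
        dst_off += scale_address_iv ? VF : 1;
    }

    return (struct exit_state){ i, dst_off };
}
```

With src holding 0..23 and a key of 13 (again just an assumption for the
sketch), the loop leaves after 3 vector iterations with iv == 12 in both
variants.  Only the scaled variant has dst_offset == 12; the unscaled one ends
up at 3, so a scalar epilogue resuming from it would store to the wrong
addresses.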
