[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

rdapp at gcc dot gnu.org via Gcc-bugs Fri, 22 Nov 2024 01:01:15 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722

--- Comment #13 from Robin Dapp <rdapp at gcc dot gnu.org> ---
I don't fully understand yet :)

So the full-register moves are undesirable, I agree.  When accumulating with a
widening op they seem unavoidable, though.  The only alternative would be to
split out the extension and use a regular add which would get us close to your
second example.  I don't see why we would prefer one over the other when the
only difference is one vsetvl outside the loop.  vsext and vmv1r should be
comparable in latency as well.
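
For reference, the two accumulation strategies discussed above can be sketched
in scalar C.  This is only an illustration: the names sad_4x4_widen and
sad_4x4_split are hypothetical, the kernel is paraphrased rather than copied
from x264, and how GCC actually maps either form to RVV instructions is
exactly what is under discussion in this bug.

```c
#include <stdint.h>
#include <stdlib.h>

/* Form 1: the narrow absolute difference is added directly into a wide
   accumulator.  A vectorizer may treat this as a widening accumulate.  */
int sad_4x4_widen(const uint8_t *pix1, intptr_t stride1,
                  const uint8_t *pix2, intptr_t stride2)
{
    int sum = 0;
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride2;
    }
    return sum;
}

/* Form 2: the extension is split out as its own step, followed by a
   regular add at the accumulator width.  */
int sad_4x4_split(const uint8_t *pix1, intptr_t stride1,
                  const uint8_t *pix2, intptr_t stride2)
{
    uint32_t sum = 0;
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            uint8_t diff = (uint8_t)abs(pix1[x] - pix2[x]); /* fits in 8 bits */
            uint32_t wide = diff;                           /* explicit extension */
            sum += wide;                                    /* non-widening add */
        }
        pix1 += stride1;
        pix2 += stride2;
    }
    return (int)sum;
}
```

Both functions compute the same result; they differ only in where the
extension happens, which mirrors the widening-op versus extend-then-add
alternatives.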

Regarding vectorizing with misaligned loads, how does that change with a usad
expander?
