https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114200

--- Comment #1 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Took me a while to analyze this... needed more time than I'd like to admit to
make sense of the somewhat weird code created by fully unrolling and peeling.

I believe the problem is that we reload the output register of a vfmacc/fma via
vmv.v.v, which is subject to length masking (it only copies the first vl
elements), when we should be using vmv1r.v, which copies the whole register.
The result is used by a reduction which always operates on the full length.  As
annoying as it was to find, it's definitely a good catch.
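
To illustrate the difference, here is a minimal Python sketch of the two moves.
It is a simplified model and not the code GCC generates: VLMAX, the register
contents and the helper names are made up for illustration, and the tail of
vmv.v.v is modelled as undisturbed (with a tail-agnostic policy the tail
elements could hold other values instead).

```python
VLMAX = 4  # hypothetical number of elements in one vector register


def vmv_v_v(dest, src, vl):
    """Model of vmv.v.v: copy the first vl elements, leave the tail of dest alone."""
    return src[:vl] + dest[vl:]


def vmv1r_v(dest, src):
    """Model of vmv1r.v: copy the whole register, independent of vl."""
    return list(src)


def full_length_sum(reg):
    """Model of a reduction that always reads all VLMAX elements."""
    return sum(reg[:VLMAX])


fma_result = [1, 2, 3, 4]  # hypothetical output register of the fma
stale = [0, 0, 0, 0]       # whatever the reload destination held before
vl = 2                     # current vector length, smaller than VLMAX

partial = vmv_v_v(stale, fma_result, vl)
whole = vmv1r_v(stale, fma_result)

print(full_length_sum(partial))  # 3: elements past vl were never copied
print(full_length_sum(whole))    # 10: the full register was copied
```

In this model the full-length reduction only sees the complete fma result
when the reload copies the whole register.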

I'm testing a patch.  PR114202 is indeed a duplicate.  Going to add its test
case to the patch.
