https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048

--- Comment #15 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
So it seems that if at least one of the vector builtins involved in the
expression is 512 bits wide, GCC needs to locally increase
prefer-vector-width to 512? Or, more generally:

prefer-vector-width = max(prefer-vector-width,
                          8 * sizeof(operands)...,
                          8 * sizeof(return-value))
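
As a rough illustration of that rule (a sketch only, not GCC's implementation),
the computation can be written as a constexpr helper. The name
effective_vector_width and the vector typedefs are made up for this example;
only the vector_size attribute is an existing GCC/Clang extension.

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative GCC/Clang vector-extension types (names are hypothetical).
typedef int   i32x8  __attribute__((vector_size(32)));  // 256 bits
typedef float f32x8  __attribute__((vector_size(32)));  // 256 bits
typedef int   i32x16 __attribute__((vector_size(64)));  // 512 bits
typedef float f32x16 __attribute__((vector_size(64)));  // 512 bits

// Hypothetical helper mirroring the proposed rule: the width used locally is
// the maximum of the configured preference and the bit width of every operand
// type and of the return type.
template <typename Ret, typename... Operands>
constexpr std::size_t effective_vector_width(std::size_t preferred_bits) {
    return std::max({preferred_bits,
                     8 * sizeof(Ret),
                     (8 * sizeof(Operands))...});
}

// A 512-bit conversion raises the local width from the default 256 to 512.
static_assert(effective_vector_width<f32x16, i32x16>(256) == 512,
              "512-bit operands should lift the preference to 512");
// A purely 256-bit expression leaves the preference untouched.
static_assert(effective_vector_width<f32x8, i32x8>(256) == 256,
              "256-bit operands should keep the preference at 256");
```

With only 256-bit types involved the preference stays at 256, so zmm
registers would still be avoided in code that does not already use them.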

The reason to default to 256 bits is to avoid zmm register usage altogether
(and with it the clock-down). But if the surrounding code already uses zmm
registers, that motivation is moot.

Also, I think this shouldn't be considered auto-vectorization but rather
pattern recognition (recognizing a __builtin_convertvector).
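
To make the distinction concrete, here is a minimal sketch of the two forms,
assuming a compiler that provides __builtin_convertvector (GCC 9 or later, or
Clang). The function and type names are invented for this example, and
whether the element-wise form is actually turned into the same code as the
builtin depends on the compiler version and flags.

```cpp
typedef int   i32x16 __attribute__((vector_size(64)));  // 512 bits
typedef float f32x16 __attribute__((vector_size(64)));  // 512 bits

// Element-wise form: the operands are already 512-bit vectors, so the
// compiler only has to recognize the pattern as one vector conversion.
inline void convert_elementwise(const i32x16 &in, f32x16 &out) {
    for (int i = 0; i < 16; ++i)
        out[i] = static_cast<float>(in[i]);
}

// Explicit form: the same conversion spelled with the builtin.
inline void convert_builtin(const i32x16 &in, f32x16 &out) {
    out = __builtin_convertvector(in, f32x16);
}
```

Both functions compute the same result. The vectors are passed by reference
here only to sidestep ABI warnings when AVX-512 is not enabled.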
