> Hi,
> this patch enables the logic which avoids FMA in the matrix
> multiplication loop for 256-bit vectors. The underlying issue is the
> same as with znver1. While the combined latency of separate multiply
> and add operations is higher than that of FMA, the dependency chain in
> matrix multiplication depends only on the additions, which are faster.
> 
> Bootstrapped/regtested x86_64-linux, committed.
> 
>       * config/i386/i386-options.c (ix86_option_override_internal): Default
>       PARAM_AVOID_FMA_MAX_BITS to 256 for znver2.
>       * config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS): Set for
>       ZNVER2.
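
To illustrate the dependency-chain argument in the quoted description, here
is a minimal sketch of the kind of loop the tuning targets. The function name
matmul_mul_add and the row-major layout are assumptions made for illustration
only; nothing here is taken from the patch itself. The multiply in the inner
loop does not depend on the accumulator, so only the add sits on the
loop-carried chain. If the two operations are fused into one FMA, the whole
FMA latency lands on that chain instead.

```c
#include <stddef.h>

/* Hypothetical example kernel: c = a * b for n x n row-major matrices.  */
void
matmul_mul_add (size_t n, const double *a, const double *b, double *c)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        double acc = 0.0;
        for (size_t k = 0; k < n; k++)
          {
            /* The product reads only a and b, so it is independent of
               acc and can be computed ahead of the accumulation.  */
            double prod = a[i * n + k] * b[k * n + j];

            /* Only this addition is loop-carried: each iteration waits
               for the previous value of acc.  A fused multiply-add
               would put the full FMA latency on this chain.  */
            acc += prod;
          }
        c[i * n + j] = acc;
      }
}
```

Whether the compiler actually keeps the multiply and add separate depends on
the tuning in effect and on the floating-point contraction settings, so the
source above only shows the shape of the dependency, not the generated code.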

Hi,
this patch is now also backported to the gcc9 branch (r273901).

Honza
