https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
On Zen 3, an FMA needs all three inputs ready before it can start executing, whereas with a separate multiply + add the multiplication can start earlier, before the addend is available. Not sure if that explains the difference.
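
To make the reasoning concrete, here is a rough latency model of a reduction of the form acc = acc + a[i]*b[i], where the accumulator is the late input. It is only a sketch: the function names are hypothetical, the latency constants are illustrative assumptions rather than measured Zen 3 figures, and it ignores issue width, port contention and everything else a real core does.

```c
#include <assert.h>

/* Assumed latencies in cycles. Illustrative values only, not measurements. */
enum { LAT_FMA = 4, LAT_MUL = 3, LAT_ADD = 3 };

static long max_l(long a, long b) { return a > b ? a : b; }

/* acc = fma(a[i], b[i], acc)
 * The FMA cannot start until acc from the previous iteration is ready,
 * so the whole FMA latency sits on the loop-carried dependency chain. */
long chain_cycles_fma(int n, long operands_ready)
{
    assert(n >= 0);
    long acc_ready = 0;
    for (int i = 0; i < n; i++)
        acc_ready = max_l(acc_ready, operands_ready) + LAT_FMA;
    return acc_ready;
}

/* t = a[i] * b[i]; acc = acc + t
 * The multiply does not depend on acc, so it can run ahead; only the
 * add latency sits on the loop-carried dependency chain. */
long chain_cycles_mul_add(int n, long operands_ready)
{
    assert(n >= 0);
    long acc_ready = 0;
    for (int i = 0; i < n; i++) {
        long prod_ready = operands_ready + LAT_MUL;
        acc_ready = max_l(acc_ready, prod_ready) + LAT_ADD;
    }
    return acc_ready;
}
```

Under these assumptions, a chain of 100 iterations with a[i] and b[i] available at cycle 0 takes 400 cycles in the FMA form and 303 in the multiply + add form, while a single isolated operation favours the FMA (4 cycles against 6). Whether the code in this PR is actually bound by such a chain is not something the model can tell.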