https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
--- Comment #11 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #10)
> (In reply to rguent...@suse.de from comment #9)
> > On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
> > >
> > > --- Comment #8 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #7)
> > > > OTOH I'll note that no other simplify_* treats canonicalization as
> > > > simplification and the existing swap_commutative_operands_p transform
> > > > for FMA is highly uncommon.
> > > >
> > > > So why do we recognize (fma (neg (mem...)) ...) and not only (neg
> > > > (register_operand))?
> > >
> > > I think we can relax register_operand to nonimmediate_operand and rely on
> > > RA to reload it into a reg just like we did in
> > > <sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>. So a
> > > backend fix should be better?
> >
> > I think currently the backend isn't consistent with itself and sure,
> > a backend fix would be better (if it doesn't mean bloating the .md
> > with many more patterns).
>
> No, just adjusting the existing pattern should be ok.

Relaxing the predicate doesn't help, since the mask pattern also checks
(match_dup 1) and the operands would need to be swapped. We once tried to
replace it with (match_operand:VFH_AVX512VL 5 "nonimmediate_operand" "0,0"),
but that triggered an ICE in reload (reload can handle at most one operand
with a "0" constraint).

(define_insn "<avx512>_fnmsub_<mode>_mask<round_name>"
  [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
        (vec_merge:VFH_AVX512VL
          (fma:VFH_AVX512VL
            (neg:VFH_AVX512VL
              (match_operand:VFH_AVX512VL 1 "nonimmediate_operand" "0,0"))
            (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
            (neg:VFH_AVX512VL
              (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
          (match_dup 1)
          (match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
  "TARGET_AVX512F && <round_mode_condition>"

So a backend fix would need to add at least 8 patterns to handle that; in
that case, maybe the middle-end canonicalization would be better.
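
To illustrate why the (match_dup 1) matters, here is a rough scalar model in C
of what the merge-masked pattern computes. This is only a sketch and not GCC
code: the function name masked_fnmsub_model and its parameters are made up for
the example, and it uses a plain multiply and subtract, whereas the real
instruction is fused and rounds once.

```c
#include <stddef.h>

/* Hypothetical scalar model of the merge-masked fnmsub pattern:
   lanes whose mask bit is set get -(a*b) - c, the other lanes keep a[i],
   which plays the role of (match_dup 1) in the vec_merge.
   Not fused: the hardware instruction rounds only once.  */
static void
masked_fnmsub_model (double *dst, const double *a, const double *b,
                     const double *c, unsigned mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = ((mask >> i) & 1u) ? -(a[i] * b[i]) - c[i] : a[i];
}
```

In this model, swapping a and b leaves the active lanes unchanged but changes
which value the masked-off lanes keep, which is why the merge operand has to
follow the swap.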