https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072

--- Comment #11 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #10)
> (In reply to rguent...@suse.de from comment #9)
> > On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
> > > 
> > > --- Comment #8 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #7)
> > > > OTOH I'll note that no other simplify_* treats canonicalization as
> > > > simplification and the existing swap_commutative_operands_p transform
> > > > for FMA is highly uncommon.
> > > > 
> > > > So why do we recognize (fma (neg (mem...)) ...) and not only (neg
> > > > (register_operand))?
> > > 
> > > I think we can relax register_operand to nonimmediate_operand and rely
> > > on RA to reload it into a reg, just like we did in
> > > <sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>. So a
> > > backend fix should be better?
> > 
> > I think currently the backend isn't consistent with itself and sure,
> > a backend fix would be better (if it doesn't mean bloating the .md
> > with many more patterns).
> 
> No, just adjusting the existing pattern should be OK.
Relaxing the predicate doesn't help, since the mask pattern also checks
(match_dup 1) and the operands would need to be swapped. We once tried to
replace it with
(match_operand:VFH_AVX512VL 5 "nonimmediate_operand" "0,0"), but that
triggered an ICE in reload (reload can handle at most one operand with a "0"
constraint).


(define_insn "<avx512>_fnmsub_<mode>_mask<round_name>"
  [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v")
        (vec_merge:VFH_AVX512VL
          (fma:VFH_AVX512VL
            (neg:VFH_AVX512VL
              (match_operand:VFH_AVX512VL 1 "nonimmediate_operand" "0,0"))
            (match_operand:VFH_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>,v")
            (neg:VFH_AVX512VL
              (match_operand:VFH_AVX512VL 3 "<round_nimm_predicate>" "v,<round_constraint>")))
          (match_dup 1)
          (match_operand:<avx512fmaskmode> 4 "register_operand" "Yk,Yk")))]
  "TARGET_AVX512F && <round_mode_condition>"
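
For readers less familiar with the RTL, here is a rough scalar model in C of
what the masked pattern above computes per lane. It is only a sketch: the
function name fnmsub_mask_model and its signature are made up for
illustration, it assumes at most 16 float lanes, and it ignores the
single-rounding (fused) behaviour of the real instruction. The point is that
unselected lanes keep operand 1, which is what (match_dup 1) expresses and
why operand 1 is tied to the destination.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the masked fnmsub pattern (not GCC code).
   For each lane i (n <= 16):
     mask bit i set   -> dst[i] = -(a[i] * b[i]) - c[i]
     mask bit i clear -> dst[i] = a[i]   (the merge source is operand 1)
   The real instruction rounds once (fused); this model does not model that. */
void fnmsub_mask_model(float *dst, const float *a, const float *b,
                       const float *c, uint16_t mask, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float computed = -(a[i] * b[i]) - c[i];
        dst[i] = ((mask >> i) & 1u) ? computed : a[i];
    }
}
```

Swapping the two multiplicands leaves the computed lanes unchanged, but the
merge source still has to be the original operand 1, which is why simply
relaxing the predicate is not enough.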


So a backend fix would need to add at least 8 patterns to handle that; in
that case, maybe the middle-end canonicalization would be better.
