Hi!

On Fri, Apr 07, 2023 at 02:32:04AM -0400, Michael Meissner wrote:
> On Thu, Apr 06, 2023 at 03:37:59PM -0500, Segher Boessenkool wrote:
> > > This patch eliminates the generation of the Altivec fmaddfp and fnmsubfp
> > > instructions as alternatives in the VSX instruction insn support, and in 
> > > the
> > > Altivec insns it adds a test to prevent the insn from being used if VSX is
> > > available.  I also added a test to the regression test suite.
> > 
> > Please leave the latter out, it does not belong in this patch.  If you
> > want a patch to do that deal with *all* VMX FP insns?  There also are
> > add, sub, mul, etc.  Well I think those (as well as madd and nmsub) are
> > the only ones that use the NJ bit or the RN bits, but please check.
> 
> After I posted the patch I refreshed my memory of the VECTOR_UNIT_ALTIVEC_P
> macro and it is not true if VSX code generation is enabled.  So I dropped the
> changes to altivec.md.

Right you are.  We still run into all the same problems for -maltivec
-mno-vsx compilations, but no one (knock on wood) does that except it is
the default on old systems.  We can live with that / we'll just have to
live with that, take your pick.

> In addition, as far as I know, the only AltiVec (VMX) floating point
> instructions generated when VSX is used are the vmaddfp and vnmsubfp
> instructions.

That is more likely given that VECTOR_UNIT_ALTIVEC_P means things match
if *only* VMX registers are allowed, right.  But it still sits very
uneasy with me, the way it is written is not very defensive.

> In the case of add and subtract, xvaddsp and xvsubsp is more
> general than vaddfp or vsubfp since it can access all VSX registers.  VMX does
> not have a stand-alone multiply (it generates FMA with a zero register) and it
> does not have a division operation.  And VMX does not have xvmsub{a,m}sp nor
> xvnadd{a,m}sp variations of the FMA instructions.

Yes, it has only the more frequent two variants.

> > >  (define_insn "*altivec_fmav4sf4"
> > >    [(set (match_operand:V4SF 0 "register_operand" "=v")
> > >   (fma:V4SF (match_operand:V4SF 1 "register_operand" "v")
> > >             (match_operand:V4SF 2 "register_operand" "v")
> > >             (match_operand:V4SF 3 "register_operand" "v")))]
> > > -  "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
> > > +  "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && !TARGET_VSX"
> > 
> > This is very error-prone.  Maybe add a test to the VECTOR_UNIT_ALTIVEC
> > macro instead?
> 
> As I said that part of the code is not in the next patch.

Excellent.

> > > -;; Fused vector multiply/add instructions. Support the classical Altivec
> > > -;; versions of fma, which allows the target to be a separate register 
> > > from the
> > > -;; 3 inputs.  Under VSX, the target must be either the addend or the 
> > > first
> > > -;; multiply.
> > > +;; Fused vector multiply/add instructions. Do not use the classical 
> > > Altivec
> 
> > (Two spaces after dot, and AltiVec is spelled with a capital V.  I don't
> > like it either, VMX is a much nicer and more regular name).
> 
> When the name might be more regular, but in terms of the instruction set, it
> does have holes that I mentioned above (no multiply that is not a FMA, two of
> the four FMA variants are not provided).

Yes.  And several new insns (like all QP insns) are VMX only, in terms
of what registers are allowed.  The mnemonics for most of those insns
(with noteworthy exception the QP insns) starts with a "v", too.

> > So this part looks okay, and it alone is safe for GCC 13 as well.
> 
> Well as we were discussing on a private channel, it is desirable to generate
> vmaddfp and vnmsubfp if -Ofast is used, so the next patch incorporates that
> change.

If only considering the RN effects.  But not when considering NJ screws
us over big time -- we should never generate VMX FP insns unless
explicitly asked for, its semantics are just too foreign.

We could add a separate flag for just this, but is there any demand?


Segher

Reply via email to