> > > Why?  I don't think the vectorizer relies on a particular else
> > > value?  I'd say it would be appropriate for if-conversion to
> > > use "ANY" and for the vectorizer to then pick a supported
> > > version and/or enforce the else value it needs via a blend?
> > 
> > In PR115336 we have something like
> > 
> >   _Bool iftmp.0_113;
> >   _Bool iftmp.0_114;
> >   iftmp.0_113 = .MASK_LOAD (_170, 8B, _169, _171(D));
> >   iftmp.0_114 = _47 | iftmp.0_113;
> > 
> > which assumes zeroing.
>
> I see - is that some trick ifcvt performs?  I can't immediately
> see the connection to the PR and it only contains RISC-V assembly
> analysis.

It happens in predicate_scalar_phi where we build the cond_expr.

After converting

  iftmp.0_114 = PHI <iftmp.0_113(45), 1(38)>

we have

  _BoolD.2746 _47;
  iftmp.0_114 = _47 ? 1 : iftmp.0_113;
which is folded into
  iftmp.0_114 = _47 | iftmp.0_113;

I should really have documented that and more in the PR already...
So it's not an ifcvt trick but rather match.

Another related case is PR116059.

> > In order to circumvent that we could use COND_IOR
> > but I suppose that wouldn't be optimized away even on targets that zero
> > masked elements?  "ANY" would seem to be wrong here.
>
> What I was trying to say is that of course any transform we perform
> that requires zero-masking should either make .MAKS_LOAD perform
> that or add a COND_EXPR.  But it shouldn't be required to make
> all .MASK_LOADs be zero-masked, no?
>
> I'm of course fine if you think that's the best way for RISC-V given
> other targets are likely unaffected as they can perform zero-masking.

No, the less zeroing the better of course :)

Richard S's point was to make the COND_EXPR explicit, so that e.g.
a MASK_LOAD (mask, ..., 1) does not appear cheap as cheap as
MASK_LOAD (mask, ..., 0) on zeroing targets.

>From this I kind of jumped to the conclusion (see below) that we might want
it everywhere.

With the patches as is, ifcvt would enforce the zero here while all other
masked-load occurrences in the vectorizer would just query the target's
preferred else value and simply use that without blend/cond_expr.

> > What I didn't do (in the posted version, just locally) is an explicit
> > VEC_COND_EXPR after each masked (gather/load lanes) call the vectorizer
> > does.  Do we need that?  AFAICT loop masking (be it len style or
> > fully-masked style) should be safe.
>
> Well, why should we need that?  There seem to be the assumption that
> .MASK_LOAD is zero-masked in very few places (PR115336, but not
> identified there), if we'd assume this everywhere there would be
> way more issues with RISC-V?

Ok, I was already pretty sure we don't need - and glad to hear it confirmed.
I was just thinking for consistency reasons we might want a masked
load to always look like
  foo123 = .MASK_..._LOAD (mask, ..., else)
  foo124 = COND_EXPR (mask, foo123, 0);
where foo124 would be optimized away (or not even emitted) for zeroing
targets).  That way subsequent code could always rely on zero.
But as this requirement seems very rare it doesn't look like a useful
invariant to enforce.

All in all, it seems we don't need major changes to the approach.
I'm going to work on the comments for the other patches.

-- 
Regards
 Robin

Reply via email to