On Wed, Jul 17, 2019 at 12:37:59PM +0200, Richard Biener wrote:
> On Tue, Jul 16, 2019 at 10:45 AM Kewen.Lin <li...@linux.ibm.com> wrote:
> > Based on the previous comments (thank you!), I tried to update the
> > handling in expander and vectorizer.  Middle-end optimizes lrotate
> > with const rotation count to rrotate all the time, it makes vectorizer
> > fail to vectorize if rrotate isn't supported on the target.  We can at
> > least teach it on const rotation count, the cost should be the same?
> > At the same time, the expander already tries to use the opposite
> > rotation optable for scalar, we can teach it to deal with vector as well.
> >
> > Is it on the right track and reasonable?
> 
> So you're basically fixing this up in the expander.  I think on
> the GIMPLE level you then miss to update tree-vect-generic.c?
> 
> I'm not sure if it makes sense to have both LROTATE_EXPR and
> RROTATE_EXPR on the GIMPLE level then (that CPUs only
> support one direction is natural though).  So maybe simply get
> rid of one?  Its semantics are also nowhere documented

A lot of targets support both, and I think not all targets do the
truncation, so at least with non-constant rotate count emitting one over the
other is important and trying to match it up only during combine might be
too late and not work well in many cases.
Then there are some targets that only support left rotates and not right
rotates (rs6000, s390, tilegx, ...), and other targets that only support
right rotates (mips, iq2000, ...).
So only having one GIMPLE code doesn't seem to be good enough.

I think handling it during expansion in generic code is fine, especially
when we clearly have several targets that do support only one of the
rotates.  As you wrote, it needs corresponding code in tree-vect-generic.c,
and shouldn't hardcode the rs6000 direction of mapping rotr to rotl, but
support also the other direction - rotl to rotr.  For the sake of
!SHIFT_COUNT_TRUNCATED targets for constant shift counts it needs to do
negation + masking and for variable shift counts probably punt and let the
backend code handle it if it can do the truncation in there?

        Jakub

Reply via email to