On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:

> 
> 
> On 05/02/2024 09:56, Richard Biener wrote:
> > On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
> > 
> >>
> >>
> >> On 01/02/2024 07:19, Richard Biener wrote:
> >>> On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> >>>
> >>>
> >>> The patch didn't come with a testcase so it's really hard to tell
> >>> what goes wrong now and how it is fixed ...
> >>
> >> My bad! I had a testcase locally but never added it...
> >>
> >> However... now I look at it and ran it past Richard S, the codegen isn't
> >> 'wrong', but it does have the potential to lead to some pretty slow codegen,
> >> especially for inbranch simdclones where it transforms the SVE predicate into
> >> an Advanced SIMD vector by inserting the elements one at a time...
> >>
> >> An example of which can be seen if you do:
> >>
> >> gcc -O3 -march=armv8-a+sve -msve-vector-bits=128  -fopenmp-simd t.c -S
> >>
> >> with the following t.c:
> >> #pragma omp declare simd simdlen(4) inbranch
> >> int __attribute__ ((const)) fn5(int);
> >>
> >> void fn4 (int *a, int *b, int n)
> >> {
> >>      for (int i = 0; i < n; ++i)
> >>          b[i] = fn5(a[i]);
> >> }
> >>
> >> Now I do have to say, for our main usecase of libmvec we won't have any
> >> 'inbranch' Advanced SIMD clones, so we avoid that issue... But of course that
> >> doesn't mean user-code will.
> > 
> > It seems to use SVE masks with vector(4) <signed-boolean:4> and the
> > ABI says the mask is vector(4) int.  You say that's because we choose
> > an Adv SIMD clone for the SVE VLS vector code (it calls _ZGVnM4v_fn5).
> > 
> > The vectorizer creates
> > 
> >    _44 = VEC_COND_EXPR <loop_mask_41, { 1, 1, 1, 1 }, { 0, 0, 0, 0 }>;
> > 
> > and then vector lowering decomposes this.  That means the vectorizer
> > lacks a check that the target handles this VEC_COND_EXPR.
> > 
> > Of course I would expect that SVE with VLS vectors is able to
> > code generate this operation, so it's missing patterns in the end.
> > 
> > Richard.
> > 
> 
> What should we do for GCC-14? Going forward I think the right thing to do is
> to add these patterns. But I am not even going to try to do that right now and
> even though we can codegen for this, the result doesn't feel like it would
> ever be profitable which means I'd rather not vectorize, or well pick a
> different vector mode if possible.
> 
> This would be achieved with the change to the targethook. If I change the hook
> to take modes, using STMT_VINFO_VECTYPE (stmt_vinfo), is that OK for now?

Passing in a mode is OK.  I still don't fully understand why the
clone doesn't fully specify 'mode' and, if it doesn't, why the
vectorizer itself cannot disregard it.

From the past discussion I understood the existing situation isn't
as bad as initially thought and no bad things happen right now?
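
For reference, a minimal scalar sketch of what the VEC_COND_EXPR quoted
above computes per lane.  The names expand_mask, expand_mask_selfcheck
and LANES are made up for illustration and are not GCC internals; this
only models the semantics (boolean loop mask in, vector(4) int mask
out), not the code the vectorizer or vector lowering actually emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 4  /* matches simdlen(4) in the testcase */

/* Hypothetical scalar model of
     _44 = VEC_COND_EXPR <loop_mask_41, { 1, 1, 1, 1 }, { 0, 0, 0, 0 }>;
   pred[] stands in for the vector(4) <signed-boolean:4> loop mask,
   out[] for the vector(4) int mask passed to the clone.  */
static void
expand_mask (const bool pred[LANES], int32_t out[LANES])
{
  for (int i = 0; i < LANES; ++i)
    out[i] = pred[i] ? 1 : 0;  /* select 1 for active lanes, 0 otherwise */
}

/* Small usage example: lanes 0 and 2 active.  */
static void
expand_mask_selfcheck (void)
{
  const bool pred[LANES] = { true, false, true, false };
  int32_t mask[LANES];

  expand_mask (pred, mask);
  assert (mask[0] == 1 && mask[1] == 0 && mask[2] == 1 && mask[3] == 0);
}
```

Written as a loop like this it is one independent select per lane, which
is why I would expect it to map to a single vector operation once the
patterns exist, rather than being decomposed element by element.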

Thanks,
Richard.
