On Mon, 6 Feb 2023 at 20:14, Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
> is there a reason that the solution for PR96463 uses a VEC_PERM_EXPR and not
> just a VEC_DUPLICATE_EXPR?  The folding of svld1rq (svptrue_..., ...) doesn't
> seem to require either the blending or the permutation functionality of a
> VEC_PERM_EXPR.  Instead, it seems to be misusing (the modified)
> VEC_PERM_EXPR as a form of VIEW_CONVERT_EXPR that allows us to
> convert/mismatch the type of the operands to the type of the result.
Hi,
I am not sure we could use VEC_DUPLICATE_EXPR for the PR96463 case as-is.
Perhaps we could extend VEC_DUPLICATE_EXPR to take N operands,
so that the resulting vector has npatterns = N, nelts_per_pattern = 1?
AFAIU, extending VEC_PERM_EXPR to handle vectors with different lengths
would allow for more optimization opportunities besides PR96463.
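
To make the N-operand idea concrete, here is a rough scalar model in plain C.
It is only a sketch of the intended semantics, not GCC code: the name
vec_duplicate_n and its array-based interface are made up for illustration,
and it assumes the result length is a multiple of N.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a hypothetical N-operand VEC_DUPLICATE_EXPR.
   The N operands act as N patterns of one element each, so the
   result is the operand sequence repeated until the vector is full.  */
void
vec_duplicate_n (const int *ops, size_t n_ops, int *result, size_t n_result)
{
  /* Assumption of this sketch: the result holds a whole number of
     copies of the operand sequence.  */
  assert (n_ops > 0 && n_result % n_ops == 0);

  for (size_t i = 0; i < n_result; i++)
    result[i] = ops[i % n_ops];
}
```

With N = 4 and 32-bit elements, this repeats one 128-bit block across the
whole result, which is the behaviour the svld1rq fold needs.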
>
> Conceptually, (as in Richard's original motivation for the PR),
> svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }
> can be optimized to (something like)
> svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); }
> // or dup z0.q, z0.q[0] equivalent
I guess that should be equivalent to svdupq_s32 (x[0], x[1], x[2], x[3])?

Thanks,
Prathamesh



> hence it makes sense for fold to transform the gimple form of the first
> into the gimple form of the second(?)
>
> Just curious.
> Roger
> --
>
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandif...@arm.com>
> > Sent: 06 February 2023 12:22
> > To: Richard Biener <richard.guent...@gmail.com>
> > Cc: Roger Sayle <ro...@nextmovesoftware.com>; GCC Patches <gcc-patc...@gcc.gnu.org>
> > Subject: Re: [DOC PATCH] Document the VEC_PERM_EXPR tree code (and minor
> > clean-ups).
> >
> > Richard Biener <richard.guent...@gmail.com> writes:
> > > On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> > >>
> > >>
> > >> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> > >> generic.texi.  For ease of review, it is provided below as a pair of
> > >> diffs.  The first contains just the new text added to describe
> > >> VEC_PERM_EXPR, the second tidies up this part of the documentation by
> > >> sorting the tree codes into alphabetical order, and providing
> > >> consistent section naming/capitalization, so changing this section
> > >> from "Vectors" to "Vector Expressions" (matching the nearby "Unary
> > >> and Binary Expressions").
> > >>
> > >> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> > >> The reviewer(s) can decide whether to approve just the new content,
> > >> or the content+clean-up.  Ok for mainline?
> > >
> > > +@item VEC_PERM_EXPR
> > > +This node represents a vector permute/blend operation.  The three
> > > +operands must be vectors of the same number of elements.  The first
> > > +and second operands must be vectors of the same type as the entire
> > > +expression,
> > >
> > > this was recently relaxed for the case of constant permutes in which
> > > case the first and second operands only have to have the same element
> > > type as the result.  See tree-cfg.cc:verify_gimple_assign_ternary.
> > >
> > > The following description will become a bit more awkward here and for
> > > rhs1/rhs2 with different number of elements the modulo interpretation
> > > doesn't hold - I believe we require in-bounds elements for constant
> > > permutes.  Richard can probably clarify things here.
> >
> > I thought that the modulo behaviour still applies when the node has a
> > constant selector, it's just that the in-range form is the canonical one.
> >
> > With variable-length vectors, I think it's in principle possible to have a
> > stepped constant selector whose start elements are in-range but whose final
> > elements aren't (and instead wrap around when applied).
> > E.g. the selector could zip the last quarter of the inputs followed by the
> > first quarter.
> >
> > Thanks,
> > Richard
>
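
For reference, the modulo interpretation discussed above can be modelled in
plain C as follows.  This is only a sketch for the classic fixed-length case,
where both inputs and the result have N elements; it does not model the
relaxed different-length case or variable-length vectors, and the name
vec_perm_model is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of VEC_PERM_EXPR <v0, v1, sel> for inputs of N elements
   each: element I of the result is element SEL[I] mod 2N of the
   concatenation of V0 and V1.  */
void
vec_perm_model (const int *v0, const int *v1, const unsigned *sel,
                int *result, size_t n)
{
  assert (n > 0);

  for (size_t i = 0; i < n; i++)
    {
      /* Out-of-range selector values wrap around modulo 2N.  */
      size_t idx = sel[i] % (2 * n);
      result[i] = idx < n ? v0[idx] : v1[idx - n];
    }
}
```

Under this model the selectors { 0, 5, 10, 15 } and { 0, 5, 2, 7 } pick the
same elements for N = 4; the second is the in-range (canonical) form.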
