> I assume the "full permutation" condition is to avoid performing some
> extra operations that would raise exception flags. If so, are there
> conditions (-fno-trapping-math?) where the transformation would be safe
> with arbitrary shuffles?
Yes, that could be an alternative choice with -fno-trap
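To illustrate the concern raised in the question above: the original sequence applies op only to lanes that the mask selects, while the transformed sequence applies op to every lane of a and b. The sketch below lists the input lanes that only the transformed form would operate on. It is a scalar model, not GCC code; the name extra_lanes and the reduction of mask indices modulo the lane count are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the input lanes that are never selected by
// `mask`. The original sequence (permute first, then op) never operates on
// these lanes, but the transformed sequence (op first, then permute) does,
// so a floating-point op could raise an exception flag there that the
// original code would not raise.
std::vector<std::size_t> extra_lanes(const std::vector<std::size_t>& mask) {
  const std::size_t nelts = mask.size();
  std::vector<bool> selected(nelts, false);
  // Assumption: indices wrap modulo the lane count of the single input.
  for (std::size_t idx : mask) selected[idx % nelts] = true;

  std::vector<std::size_t> extra;
  for (std::size_t lane = 0; lane < nelts; ++lane)
    if (!selected[lane]) extra.push_back(lane);
  return extra;
}
```

With the mask {0, 0, 2, 2}, lanes 1 and 3 are computed only after the transformation. With a full permutation the returned list is empty, which is why that case is safe regardless of trapping.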
On Fri, 4 Nov 2022, Hongyu Wang via Gcc-patches wrote:
This is a follow-up patch for PR98167
The sequence
c1 = VEC_PERM_EXPR (a, a, mask)
c2 = VEC_PERM_EXPR (b, b, mask)
c3 = c1 op c2
can be optimized to
c = a op b
c3 = VEC_PERM_EXPR (c, c, mask)
for all integer vector opera
On Wed, Nov 16, 2022 at 03:40:02PM +, Tamar Christina wrote:
> > Even:
> >
> > --- gcc/match.pd.jj 2022-11-15 07:56:05.240348804 +0100
> > +++ gcc/match.pd	2022-11-16 16:35:34.854080956 +0100
> > @@ -8259,7 +8259,7 @@ and,
> >   (simplify
> >    (op (vec_perm @0 @0 @2) (vec_perm @1 @1 @2))
> Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation
> index and operation [PR98167]
>
> On Wed, Nov 16, 2022 at 04:30:06PM +0100, Richard Biener via Gcc-patches
> wrote:
> > On Wed, Nov 16, 2022 at 4:29 PM Richard Biener
> > wrote:
> > >
On Wed, Nov 16, 2022 at 04:30:06PM +0100, Richard Biener via Gcc-patches wrote:
> On Wed, Nov 16, 2022 at 4:29 PM Richard Biener
> wrote:
> >
> > On Wed, Nov 16, 2022 at 4:25 PM Tamar Christina
> > wrote:
> > >
> > > Hi,
> > >
> > > This patch is causing several ICEs because it changes the permu
> > >
> > > > -----Original Message-----
> > > > From: Gcc-patches On Behalf Of Richard
> > > > Biener via Gcc-patches
> > > > Sent: Monday, November 14, 2022 2:53
> > > -----Original Message-----
> > > From: Gcc-patches On Behalf Of Richard
> > > Biener via Gcc-patches
> > > Sent: Monday, November 14, 2022 2:53 PM
> > > To: Hongyu Wang
> > > Cc: Prathamesh Kulkarni
> > Cc: Prathamesh Kulkarni ; Richard
> > Sandiford ; Hongyu Wang
> > ; hongtao@intel.com; gcc-
> > patc...@gcc.gnu.org
> > Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation
> > index and operation [PR98167]
> >
> > On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wa
> To: Hongyu Wang
> Cc: Prathamesh Kulkarni ; Richard
> Sandiford ; Hongyu Wang
> ; hongtao@intel.com; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation
> index and operation [PR98167]
>
> On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wa
On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wang wrote:
>
> > Well, with AVX512 v64qi that's 64*64 == 4096 cases to check. I think
> > a lambda function is fine to use. The alternative (used by the vectorizer
> > in some places) is to use something like
> >
> > auto_sbitmap seen (nelts);
> > for (i = 0;
> Well, with AVX512 v64qi that's 64*64 == 4096 cases to check. I think
> a lambda function is fine to use. The alternative (used by the vectorizer
> in some places) is to use something like
>
>   auto_sbitmap seen (nelts);
>   for (i = 0; i < nelts; i++)
>     {
>       if (!bitmap_set_bit (seen, i))
>
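The quoted loop is cut off here, so the following is only a sketch of the general "seen bitmap" idea and not the code from the thread. It marks each selected lane once and reports a repeat as soon as one appears, which takes one pass over the mask instead of comparing every pair of indices. std::vector<bool> stands in for GCC's auto_sbitmap, and the name is_full_permutation and the modulo reduction of indices are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true when `mask` selects every lane of a single
// nelts-wide input exactly once, i.e. it is a full permutation.
bool is_full_permutation(const std::vector<std::size_t>& mask) {
  const std::size_t nelts = mask.size();
  std::vector<bool> seen(nelts, false);  // stand-in for auto_sbitmap
  for (std::size_t i = 0; i < nelts; ++i) {
    // Assumption: indices wrap modulo the lane count of the single input.
    const std::size_t lane = mask[i] % nelts;
    if (seen[lane]) return false;  // lane selected twice, so another is missing
    seen[lane] = true;
  }
  // nelts distinct lanes out of nelts possible means all lanes are covered.
  return true;
}
```

For a 64-lane mask this does 64 bitmap updates, compared with the 64*64 pairwise checks mentioned above.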
On Thu, Nov 10, 2022 at 3:27 AM Hongyu Wang wrote:
>
> Hi Prathamesh and Richard,
>
> Thanks for the review and nice suggestions!
>
> > > I guess the transform should work as long as mask is same for both
> > > vectors even if it's
> > > not constant ?
> >
> > Yes, please change accordingly (and m
Hi Prathamesh and Richard,
Thanks for the review and nice suggestions!
> > I guess the transform should work as long as mask is same for both
> > vectors even if it's
> > not constant ?
>
> Yes, please change accordingly (and maybe push separately).
>
Removed VECTOR_CST for integer ops.
> > If
On Fri, Nov 4, 2022 at 7:44 AM Prathamesh Kulkarni via Gcc-patches
wrote:
>
> On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches
> wrote:
> >
> > Hi,
> >
> > This is a follow-up patch for PR98167
> >
> > The sequence
> > c1 = VEC_PERM_EXPR (a, a, mask)
> > c2 = VEC_PERM_EXPR (b, b
On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This is a follow-up patch for PR98167
>
> The sequence
> c1 = VEC_PERM_EXPR (a, a, mask)
> c2 = VEC_PERM_EXPR (b, b, mask)
> c3 = c1 op c2
> can be optimized to
> c = a op b
> c3 = VEC_PERM_EXPR (c
Hi,
This is a follow-up patch for PR98167
The sequence
c1 = VEC_PERM_EXPR (a, a, mask)
c2 = VEC_PERM_EXPR (b, b, mask)
c3 = c1 op c2
can be optimized to
c = a op b
c3 = VEC_PERM_EXPR (c, c, mask)
for all integer vector operations, and for float operations with a
full permutation.
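As a rough check of the identity described above, here is a scalar model in which both orderings give the same result for an integer addition. It is a minimal sketch, not GCC code: the 4-lane width, the use of unsigned addition as op, and the names vec_perm, original_form and transformed_form are all assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 4;  // assumed lane count for the example
using Vec = std::array<std::uint32_t, N>;
using Mask = std::array<std::size_t, N>;

// Models VEC_PERM_EXPR (v, v, mask): result[i] = v[mask[i]].
Vec vec_perm(const Vec& v, const Mask& mask) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) {
    assert(mask[i] < N);  // this model only accepts in-range indices
    r[i] = v[mask[i]];
  }
  return r;
}

// Lane-wise op; unsigned addition wraps, so it is defined for all inputs.
Vec add(const Vec& a, const Vec& b) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
  return r;
}

// c1 = perm (a); c2 = perm (b); c3 = c1 op c2
Vec original_form(const Vec& a, const Vec& b, const Mask& mask) {
  return add(vec_perm(a, mask), vec_perm(b, mask));
}

// c = a op b; c3 = perm (c)
Vec transformed_form(const Vec& a, const Vec& b, const Mask& mask) {
  return vec_perm(add(a, b), mask);
}
```

Both forms compute a[mask[i]] + b[mask[i]] in lane i, so they agree even when the mask repeats an index such as {3, 3, 0, 1}. The transformed form needs one permute instead of two.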