Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Hongyu Wang via Gcc-patches
> I assume the "full permutation" condition is to avoid performing some
> extra operations that would raise exception flags. If so, are there
> conditions (-fno-trapping-math?) where the transformation would be safe
> with arbitrary shuffles?
Yes, that could be an alternative choice with -fno-trap
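The exception-flag concern can be illustrated outside the compiler. The following is only a sketch, assuming a 4-lane float vector modelled with std::array; the names permute, divide and the two *_raises_divbyzero functions are made up for illustration and are not GCC internals. With a mask that never selects lane 3, the original order never divides by the zero stored there, while the transformed order does.

```cpp
#include <array>
#include <cfenv>
#include <cstddef>

using V4 = std::array<float, 4>;
using Mask = std::array<std::size_t, 4>;

// Model of VEC_PERM_EXPR (v, v, mask): lane i takes v[mask[i]].
V4 permute(const V4 &v, const Mask &mask) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = v[mask[i]];
  return r;
}

// Lane-wise division. The volatile temporaries keep the compiler from
// constant-folding or deleting the divisions, so the FP flags are really set.
V4 divide(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i) {
    volatile float x = a[i], y = b[i];
    volatile float q = x / y;
    r[i] = q;
  }
  return r;
}

// Illustrative inputs: lane 3 of b is zero, and the mask never selects lane 3.
const V4 kA = {1.0f, 2.0f, 3.0f, 4.0f};
const V4 kB = {1.0f, 2.0f, 4.0f, 0.0f};
const Mask kMask = {0, 0, 1, 1};

// Original order: permute both inputs, then divide.
bool original_raises_divbyzero() {
  std::feclearexcept(FE_ALL_EXCEPT);
  divide(permute(kA, kMask), permute(kB, kMask));
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}

// Transformed order: divide first, then permute the result.
bool transformed_raises_divbyzero() {
  std::feclearexcept(FE_ALL_EXCEPT);
  permute(divide(kA, kB), kMask);
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

On typical IEEE 754 hardware only the transformed order sets FE_DIVBYZERO here, because it evaluates 4.0f / 0.0f in a lane the mask later discards. That is the kind of extra operation the quoted question refers to.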

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Marc Glisse via Gcc-patches
On Fri, 4 Nov 2022, Hongyu Wang via Gcc-patches wrote:
This is a follow-up patch for PR98167
The sequence
  c1 = VEC_PERM_EXPR (a, a, mask)
  c2 = VEC_PERM_EXPR (b, b, mask)
  c3 = c1 op c2
can be optimized to
  c = a op b
  c3 = VEC_PERM_EXPR (c, c, mask)
for all integer vector opera

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 16, 2022 at 03:40:02PM +, Tamar Christina wrote: > > Even: > > > > --- gcc/match.pd.jj 2022-11-15 07:56:05.240348804 +0100 > > +++ gcc/match.pd2022-11-16 16:35:34.854080956 +0100 > > @@ -8259,7 +8259,7 @@ and, > > (simplify > >(op (vec_perm @0 @0 @2) (vec_perm @1 @1 @2))

RE: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Tamar Christina via Gcc-patches
u.org > Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation > index and operation [PR98167] > > On Wed, Nov 16, 2022 at 04:30:06PM +0100, Richard Biener via Gcc-patches > wrote: > > On Wed, Nov 16, 2022 at 4:29 PM Richard Biener > > wrote: > > >

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 16, 2022 at 04:30:06PM +0100, Richard Biener via Gcc-patches wrote: > On Wed, Nov 16, 2022 at 4:29 PM Richard Biener > wrote: > > > > On Wed, Nov 16, 2022 at 4:25 PM Tamar Christina > > wrote: > > > > > > Hi, > > > > > > This patch is causing several ICEs because it changes the permu

RE: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Tamar Christina via Gcc-patches
r > > > > > > > -Original Message- > > > > From: Gcc-patches > > > bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Richard > > > > Biener via Gcc-patches > > > > Sent: Monday, November 14, 2022 2:53

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Richard Biener via Gcc-patches
riginal Message- > > > From: Gcc-patches > > bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Richard > > > Biener via Gcc-patches > > > Sent: Monday, November 14, 2022 2:53 PM > > > To: Hongyu Wang > > > Cc: Prathamesh Kulkarni

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Richard Biener via Gcc-patches
lkarni ; Richard > > Sandiford ; Hongyu Wang > > ; hongtao@intel.com; gcc- > > patc...@gcc.gnu.org > > Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation > > index and operation [PR98167] > > > > On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wa

RE: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Tamar Christina via Gcc-patches
To: Hongyu Wang > Cc: Prathamesh Kulkarni ; Richard > Sandiford ; Hongyu Wang > ; hongtao@intel.com; gcc- > patc...@gcc.gnu.org > Subject: Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation > index and operation [PR98167] > > On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wa

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-14 Thread Richard Biener via Gcc-patches
On Thu, Nov 10, 2022 at 3:27 PM Hongyu Wang wrote: > > > Well, with AVX512 v64qi that's 64*64 == 4096 cases to check. I think > > a lambda function is fine to use. The alternative (used by the vectorizer > > in some places) is to use sth like > > > > auto_sbitmap seen (nelts); > > for (i = 0;

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-10 Thread Hongyu Wang via Gcc-patches
> Well, with AVX512 v64qi that's 64*64 == 4096 cases to check. I think > a lambda function is fine to use. The alternative (used by the vectorizer > in some places) is to use sth like > > auto_sbitmap seen (nelts); > for (i = 0; i < nelts; i++) >{ > if (!bitmap_set_bit (seen, i)) >
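The "seen" bitmap idea quoted above can be sketched in standalone form. This is a rough illustration only: it uses std::vector<bool> in place of GCC's auto_sbitmap, assumes the selector indices have already been reduced to the range [0, nelts), and the helper name is_full_permutation is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when every index in [0, nelts) occurs
// exactly once in sel, i.e. the selector is a full permutation.
bool is_full_permutation(const std::vector<std::size_t> &sel) {
  const std::size_t nelts = sel.size();
  std::vector<bool> seen(nelts, false);
  for (std::size_t idx : sel) {
    // An out-of-range or repeated index means some lane is dropped.
    if (idx >= nelts || seen[idx])
      return false;
    seen[idx] = true;
  }
  return true;
}
```

A single pass with a bitmap is linear in the number of lanes, so a 64-lane selector needs 64 checks rather than the 64*64 comparisons mentioned in the quoted text.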

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-10 Thread Richard Biener via Gcc-patches
On Thu, Nov 10, 2022 at 3:27 AM Hongyu Wang wrote: > > Hi Prathamesh and Richard, > > Thanks for the review and nice suggestions! > > > > I guess the transform should work as long as mask is same for both > > > vectors even if it's > > > not constant ? > > > > Yes, please change accordingly (and m

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-09 Thread Hongyu Wang via Gcc-patches
Hi Prathamesh and Richard, Thanks for the review and nice suggestions! > > I guess the transform should work as long as mask is same for both > > vectors even if it's > > not constant ? > > Yes, please change accordingly (and maybe push separately). > Removed VECTOR_CST for integer ops. > > If

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-08 Thread Richard Biener via Gcc-patches
On Fri, Nov 4, 2022 at 7:44 AM Prathamesh Kulkarni via Gcc-patches wrote: > > On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches > wrote: > > > > Hi, > > > > This is a follow-up patch for PR98167 > > > > The sequence > > c1 = VEC_PERM_EXPR (a, a, mask) > > c2 = VEC_PERM_EXPR (b, b

Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-03 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches wrote: > > Hi, > > This is a follow-up patch for PR98167 > > The sequence > c1 = VEC_PERM_EXPR (a, a, mask) > c2 = VEC_PERM_EXPR (b, b, mask) > c3 = c1 op c2 > can be optimized to > c = a op b > c3 = VEC_PERM_EXPR (c

[PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-03 Thread Hongyu Wang via Gcc-patches
Hi, This is a follow-up patch for PR98167. The sequence
  c1 = VEC_PERM_EXPR (a, a, mask)
  c2 = VEC_PERM_EXPR (b, b, mask)
  c3 = c1 op c2
can be optimized to
  c = a op b
  c3 = VEC_PERM_EXPR (c, c, mask)
for all integer vector operations, and for float operations with a full permutation.
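The equivalence behind this transformation can be checked lane by lane. The sketch below is not the match.pd pattern itself; it models a 4-lane unsigned integer vector with std::array, uses addition as the example "op", and all function names are illustrative.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4u = std::array<std::uint32_t, 4>;
using Sel = std::array<std::size_t, 4>;

// Model of VEC_PERM_EXPR (v, v, mask): lane i takes v[mask[i]].
V4u vec_perm(const V4u &v, const Sel &mask) {
  V4u r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = v[mask[i]];
  return r;
}

// Lane-wise addition, standing in for "op".
V4u add(const V4u &a, const V4u &b) {
  V4u r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] + b[i];
  return r;
}

// Original sequence: two permutes followed by one op.
V4u perm_then_op(const V4u &a, const V4u &b, const Sel &mask) {
  return add(vec_perm(a, mask), vec_perm(b, mask));
}

// Optimized sequence: one op followed by one permute.
V4u op_then_perm(const V4u &a, const V4u &b, const Sel &mask) {
  return vec_perm(add(a, b), mask);
}
```

Both functions compute a[mask[i]] + b[mask[i]] in lane i, so they agree for any mask, including one that repeats lanes. The optimized form saves one permute.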