RE: [PATCH] aarch64: Carry over zeroness in aarch64_evpc_reencode

2025-05-21 Thread quic_pzheng
> Pengxuan Zheng writes: > > There was a bug in aarch64_evpc_reencode which could leave zero_op0_p > > and zero_op1_p of the struct "newd" uninitialized. > > r16-701-gd77c3bc1c35e303 fixed the issue by zero initializing "newd." > > This patch provides an alternative fix as suggested by Richard > >
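
The alternative fix described in this thread (copying the zero flags into the re-encoded descriptor rather than relying on blanket zero-initialization) might look roughly like the sketch below. This is not the GCC source: `struct perm_desc`, `reencode`, and the `nelt` halving are hypothetical stand-ins for illustration, and only the field names `zero_op0_p` and `zero_op1_p` come from the messages above.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the permute descriptor.  Only the
   field names zero_op0_p and zero_op1_p are taken from the thread.  */
struct perm_desc
{
  unsigned nelt;     /* number of elements (assumed field) */
  bool zero_op0_p;   /* operand 0 is known to be all zeros */
  bool zero_op1_p;   /* operand 1 is known to be all zeros */
};

/* Build a re-encoded descriptor from D.  Every field of the result is
   assigned explicitly, so the zero flags are carried over from D and
   nothing is left uninitialized.  */
static struct perm_desc
reencode (const struct perm_desc *d)
{
  struct perm_desc newd;

  /* Assumption for illustration: re-encoding uses elements twice as
     wide, so the element count halves.  */
  newd.nelt = d->nelt / 2;
  newd.zero_op0_p = d->zero_op0_p;
  newd.zero_op1_p = d->zero_op1_p;
  return newd;
}
```

Compared with zero-initializing `newd`, copying the flags keeps the information that an operand is all zeros available to whatever inspects the re-encoded descriptor.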

RE: [PUSHED] aarch64: Fix an oversight in aarch64_evpc_reencode

2025-05-20 Thread quic_pzheng
> Pengxuan Zheng writes: > > Some fields (e.g., zero_op0_p and zero_op1_p) of the struct "newd" may > > be left uninitialized in aarch64_evpc_reencode. This can cause reading > > of uninitialized data. I found this oversight when testing my patches > > on and/fmov optimizations. This patch fixes t

RE: [PATCH v2 3/3] aarch64: Add more vector permute tests for the FMOV optimization [PR100165]

2025-05-16 Thread quic_pzheng
> Pengxuan Zheng writes: > > diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c > > b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c > > new file mode 100644 > > index 000..adbf87243f6 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c > > @@ -0,0 +1,130 @@ > > +

RE: [PATCH v2 2/3] aarch64: Optimize AND with certain vector of immediates as FMOV [PR100165]

2025-05-16 Thread quic_pzheng
> Pengxuan Zheng writes: > > diff --git a/gcc/config/aarch64/aarch64.cc > > b/gcc/config/aarch64/aarch64.cc index 15f08cebeb1..98ce85dfdae 100644 > > --- a/gcc/config/aarch64/aarch64.cc > > +++ b/gcc/config/aarch64/aarch64.cc > > @@ -23621,6 +23621,36 @@ aarch64_simd_valid_and_imm (rtx op) > >

RE: [PATCH v2 1/3] aarch64: Recognize vector permute patterns which can be interpreted as AND [PR100165]

2025-05-16 Thread quic_pzheng
> Pengxuan Zheng writes: > **... > **and v0.8b, (?:v0.8b, v[0-9]+.8b|v[0-9]+.8b, v0.8b) > **ret > > Same for other tests that can't use a move immediate. > > Please leave 24 hours for others to comment on the target-independent part, > but otherwise the patch is ok with the chang

RE: [PATCH 1/3] Recognize vector permute patterns which can be interpreted as AND [PR100165]

2025-05-12 Thread quic_pzheng
> Richard Biener writes: > > On Sat, Apr 26, 2025 at 2:42 AM Pengxuan Zheng wrote: > >> > >> Certain permute that blends a vector with zero can be interpreted as > >> an AND of a mask. This idea was suggested by Richard Sandiford when > >> he was reviewing my patch which tries to optimize cert

RE: [PATCH 2/3] aarch64: Optimize AND with certain vector of immediates as FMOV [PR100165]

2025-05-12 Thread quic_pzheng
> Pengxuan Zheng writes: > > We can optimize AND with certain vector of immediates as FMOV if the > > result of the AND is as if the upper lane of the input vector is set > > to zero and the lower lane remains unchanged. > > > > For example, at present: > > > > v4hi > > f_v4hi (v4hi x) > > { > >

RE: [PATCH] Canonicalize vec_merge in simplify_ternary_operation

2025-05-07 Thread quic_pzheng
> Pengxuan Zheng writes: > > Similar to the canonicalization done in combine, we canonicalize > > vec_merge with swap_commutative_operands_p in simplify_ternary_operation too. > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64-protos.h (aarch64_exact_log2_inverse): New. > > * con

RE: [PATCH v3] aarch64: Recognize vector permute patterns suitable for FMOV [PR100165]

2025-04-25 Thread quic_pzheng
> Richard Sandiford writes: > > I think this would also simplify the evpc detection, since the > > requirement for using AND is the same for big-endian and > > little-endian, namely that index I of the result must either come from > > index I of the nonzero vector or from any element of the zero v
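
The condition quoted above (lane I of the result comes either from lane I of the nonzero vector or from somewhere in the zero vector) can be sketched as a small standalone check. This is an illustration, not the GCC implementation: `perm_is_and_mask` is a hypothetical helper, and the selector layout (indices below `nelt` name the nonzero vector, indices from `nelt` up to `2 * nelt` name the zero vector) and the 16-bit lane width are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code.  SEL holds NELT selector indices
   for a two-input permute.  Assumed layout: [0, NELT) selects from the
   nonzero vector, [NELT, 2 * NELT) selects from the all-zero vector.
   Returns true and fills MASK when the permute equals
   "nonzero_vector AND MASK"; returns false otherwise (MASK may then be
   partially written).  */
static bool
perm_is_and_mask (const unsigned *sel, size_t nelt, uint16_t *mask)
{
  for (size_t i = 0; i < nelt; i++)
    {
      if (sel[i] == i)
        mask[i] = 0xffff;   /* lane I kept in place: all-ones mask */
      else if (sel[i] >= nelt && sel[i] < 2 * nelt)
        mask[i] = 0;        /* lane I read from the zero vector */
      else
        return false;       /* a nonzero lane moved: AND cannot do it */
    }
  return true;
}
```

For example, with four lanes the selector `{0, 4, 5, 3}` passes and gives the mask `{0xffff, 0, 0, 0xffff}`, while `{1, 4, 5, 3}` fails because lane 0 would have to come from lane 1. The check only compares each selector against its own lane position, so it does not depend on how lanes are numbered for a given endianness.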

RE: [PATCH] aarch64: Recognize vector permute patterns suitable for FMOV [PR100165]

2025-02-18 Thread quic_pzheng
> > Pengxuan Zheng writes: > > > This patch optimizes certain vector permute expansion with the FMOV > > > instruction when one of the input vectors is a vector of all zeros > > > and the result of the vector permute is as if the upper lane of the > > > non-zero input vector is set to zero and the

RE: [PATCH] aarch64: Recognize vector permute patterns suitable for FMOV [PR100165]

2025-02-18 Thread quic_pzheng
> Pengxuan Zheng writes: > > This patch optimizes certain vector permute expansion with the FMOV > > instruction when one of the input vectors is a vector of all zeros and > > the result of the vector permute is as if the upper lane of the > > non-zero input vector is set to zero and the lower lan