test accordingly.
* gcc.target/aarch64/fmov.c: New test.
* gcc.target/aarch64/fmov-be.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 14 +++
gcc/config/aarch64/aarch64.cc | 74 +++-
gcc/testsuite/gcc.target
My bad. Thanks for fixing this quickly, Andrew!
Thanks,
Pengxuan
>
> After r15-4579-g9ffcf1f193b477, we get the following warning/error while
> bootstrapping on aarch64:
> ```
> ../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def*
> aarch64_ptrue_reg(machine_mode, unsigned int)’:
> ../.
> Pengxuan Zheng writes:
> > This is similar to the recent improvements to the Advanced SIMD
> > popcount expansion by using SVE. We can utilize SVE to generate more
> > efficient code for scalar mode popcount too.
> >
> > Changes since v1:
> > * v2: Add a
.
(vec_pop_mode): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt-sve.c: Update test.
* gcc.target/aarch64/popcnt11.c: New test.
* gcc.target/aarch64/popcnt12.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-protos.h
> Pengxuan Zheng writes:
> > This is similar to the recent improvements to the Advanced SIMD
> > popcount expansion by using SVE. We can utilize SVE to generate more
> > efficient code for scalar mode popcount too.
> >
> > PR target/113860
> >
> >
attribute.
(vec_pop_mode): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt11.c: New test.
* gcc.target/aarch64/popcnt12.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-modes.def| 3 ++
gcc/config/aarch64/aarch64-simd.md
> > Pengxuan Zheng writes:
> > > We can still use SVE's INDEX instruction to construct vectors even
> > > if not all elements are constants. For example, { 0, x, 2, 3 } can
> > > be constructed by first using "INDEX #0, #1" to generate { 0, 1, 2,
>
> > > On 16 Sep 2024, at 16:32, Richard Sandiford
> wrote:
> > >
> > > External email: Use caution opening links or attachments
> > >
> > >
> > > "Pengxuan Zheng (QUIC)" writes:
> > >>> On Thu, Sep 12, 2024 at 2:
> > On 16 Sep 2024, at 16:32, Richard Sandiford
> wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > "Pengxuan Zheng (QUIC)" writes:
> >>> On Thu, Sep 12, 2024 at 2:53 AM Pengxuan Zheng
> >>>
> Pengxuan Zheng writes:
> > We can still use SVE's INDEX instruction to construct vectors even if
> > not all elements are constants. For example, { 0, x, 2, 3 } can be
> > constructed by first using "INDEX #0, #1" to generate { 0, 1, 2, 3 },
> > and the
> "Pengxuan Zheng (QUIC)" writes:
> >> On Thu, Sep 12, 2024 at 2:53 AM Pengxuan Zheng
> >> wrote:
> >> >
> >> > SVE's INDEX instruction can be used to populate vectors by values
> >> > starting from "base" and incr
> > Pengxuan Zheng writes:
> > > SVE's INDEX instruction can be used to populate vectors by values
> > > starting from "base" and incremented by "step" for each subsequent
> > > value. We can take advantage of it to generate vector consta
> On Thu, Sep 12, 2024 at 2:53 AM Pengxuan Zheng
> wrote:
> >
> > SVE's INDEX instruction can be used to populate vectors by values
> > starting from "base" and incremented by "step" for each subsequent
> > value. We can take advantage
* gcc.target/aarch64/sve/vec_init_5.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64.cc | 81 ++-
.../aarch64/sve/acle/general/dupq_1.c | 18 -
.../aarch64/sve/acle/general/dupq_2.c | 18 -
.../aarch64/sve/a
> Pengxuan Zheng writes:
> > SVE's INDEX instruction can be used to populate vectors by values
> > starting from "base" and incremented by "step" for each subsequent
> > value. We can take advantage of it to generate vector constants if
> > TARG
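The base/step behaviour described above can be modelled in portable C. This is only a scalar sketch, not the code from the patch: `index_fill` and `is_linear_series` are hypothetical helper names, and the check for whether a constant vector qualifies is an assumption about what such a transformation needs, not a copy of what aarch64.cc does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of INDEX: lane i receives base + i * step.
   Unsigned arithmetic is used so that wrap-around is well defined. */
static void index_fill(int32_t *lanes, size_t n, int32_t base, int32_t step)
{
    uint32_t v = (uint32_t)base;
    for (size_t i = 0; i < n; i++) {
        lanes[i] = (int32_t)v;
        v += (uint32_t)step;
    }
}

/* Hypothetical check: does a constant vector form a linear series?
   On success, stores the base and step that would reproduce it. */
static int is_linear_series(const int32_t *elts, size_t n,
                            int32_t *base, int32_t *step)
{
    assert(n >= 2);
    uint32_t s = (uint32_t)elts[1] - (uint32_t)elts[0];
    for (size_t i = 2; i < n; i++)
        if ((uint32_t)elts[i] - (uint32_t)elts[i - 1] != s)
            return 0;
    *base = elts[0];
    *step = (int32_t)s;
    return 1;
}
```

Under this model, a constant such as { 3, 5, 7, 9 } is a series with base 3 and step 2, while { 0, 1, 3, 4 } is not a series and would need some other construction.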
* gcc.target/aarch64/sve/vec_init_5.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64.cc | 81 ++-
.../aarch64/sve/acle/general/dupq_1.c | 12 ++-
.../aarch64/sve/acle/general/dupq_2.c | 12 ++-
.../aarch64/sve/a
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: New test.
Signed-off-by: Pengxuan Zheng
attribute.
(vec_pop_mode): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt11.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md | 5 +-
gcc/config/aarch64/aarch64.md | 9
gcc/config/aarch64/iterators.md
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: New test.
Signed-off-by: Pengxuan Zheng
/aarch64/aarch64-sve.md
> (@aarch64_pred_): Use new
> iterator SVE_VDQ_I.
> * config/aarch64/iterators.md (SVE_VDQ_I): New mode iterator.
> (VPRED): Add V8QI, V16QI, V4HI, V8HI and V2SI.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/p
> Sorry for the slow review.
>
> Pengxuan Zheng writes:
> > This patch improves the Advanced SIMD popcount expansion by using SVE
> > if available.
> >
> > For example, GCC currently generates the following code sequence for V2DI:
> > cnt v31.16b
V8QI, V16QI, V4HI, V8HI and V2SI.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt-sve.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 9 ++
gcc/config/aarch64/aarch64-sve.md | 13 +--
gcc/config/aarch64/iterators.md
This has been approved and will be committed if there are no other comments
within a day.
:
* gcc.target/aarch64/popcnt-sve.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 9 ++
gcc/config/aarch64/aarch64-sve.md | 12 +++
gcc/config/aarch64/iterators.md | 1 +
gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> Pengxuan Zheng writes:
> > This patch improves GCC’s vectorization of __builtin_popcount for
> > aarch64 target by adding popcount patterns for vector modes besides
> > QImode, i.e., HImode, SImode and DImode.
> >
> > With this patch, we now generate the followi
> > On 6/28/24 6:18 AM, Pengxuan Zheng wrote:
> > > This patch improves GCC’s vectorization of __builtin_popcount for
> > > aarch64 target by adding popcount patterns for vector modes besides
> > > QImode, i.e., HImode, SImode and DImode.
> > >
> >
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 41 ++-
.../gcc.target/aarch64/popcnt-udot.c | 58
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 167 insertions(+), 1 deletion(-)
create mode
> On 6/28/24 6:18 AM, Pengxuan Zheng wrote:
> > This patch improves GCC’s vectorization of __builtin_popcount for
> > aarch64 target by adding popcount patterns for vector modes besides
> > QImode, i.e., HImode, SImode and DImode.
> >
> > With this patch, we no
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 41 ++-
.../gcc.target/aarch64/popcnt-udot.c | 58
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 167 insertions(+), 1 deletion(-)
create mode
v0.4s, v3.16b, v1.16b
> uaddlp v0.2d, v0.4s
>
> PR target/113859
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd.md (aarch64_addlp):
> Rename to...
> (@aarch64_addlp): ... This.
> (popcount2): New define_expand.
>
> gcc/testsuite/
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 41 ++-
.../gcc.target/aarch64/popcnt-udot.c | 58
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 167 insertions(+), 1 deletion(-)
create mode
Thanks, Richard! I've updated the patch accordingly.
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655912.html
Please let me know if any other changes are needed.
Thanks,
Pengxuan
> Sorry for the slow reply.
>
> Pengxuan Zheng writes:
> > This patch improves GC
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 41 ++-
.../gcc.target/aarch64/popcnt-udot.c | 58
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 167 insertions(+), 1 deletion(-)
create mode
> On Mon, Jun 17, 2024 at 11:25 PM Pengxuan Zheng
> wrote:
> >
> > This patch improves GCC’s vectorization of __builtin_popcount for
> > aarch64 target by adding popcount patterns for vector modes besides
> > QImode, i.e., HImode, SImode and DImode.
> >
>
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 51 +-
.../gcc.target/aarch64/popcnt-udot.c | 58
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 177 insertions(+), 1 deletion(-)
create
> Hi,
>
> > -----Original Message-----
> > From: Pengxuan Zheng
> > Sent: Friday, June 14, 2024 12:57 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Pengxuan Zheng
> > Subject: [PATCH v3] aarch64: Add vector popcount besides QImode
> > [PR113859]
>
.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 52 +-
.../gcc.target/aarch64/popcnt-udot.c | 45
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
3 files changed, 165 insertions(+), 1 deletion(-)
create mode
> Pengxuan Zheng writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is
> > implemented using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2
> > (V4SI->V4HI).
> >
> > PR target/113882
> >
> > gcc/Chan
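The two-step conversion described above (V4SF->V4SI, then V4SI->V4HI) can be sketched lane by lane in plain C. This is an illustrative scalar model under stated assumptions, not the expander from the patch: the function names are hypothetical and merely echo the pattern names, and out-of-range inputs are rejected because C leaves that float-to-integer conversion undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1 (models V4SF -> V4SI): convert each float lane to a 32-bit
   integer, truncating toward zero as a C cast does. */
static void fix_trunc_v4sf_v4si(const float in[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        /* Out-of-range conversion is undefined in C, so reject it here. */
        assert(in[i] > -2147483648.0f && in[i] < 2147483648.0f);
        out[i] = (int32_t)in[i];
    }
}

/* Step 2 (models V4SI -> V4HI): keep the low 16 bits of each lane. */
static void trunc_v4si_v4hi(const int32_t in[4], int16_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint16_t low = (uint16_t)((uint32_t)in[i] & 0xffffu);
        /* Implementation-defined above INT16_MAX; wraps on GCC and Clang. */
        out[i] = (int16_t)low;
    }
}

/* Composition of the two steps: V4SF -> V4HI. */
static void fix_trunc_v4sf_v4hi(const float in[4], int16_t out[4])
{
    int32_t wide[4];
    fix_trunc_v4sf_v4si(in, wide);
    trunc_v4si_v4hi(wide, out);
}
```

For inputs such as { 1.9f, -1.9f, 100.5f, -32768.0f } the fractional part is dropped in the first step and the second step only narrows the lane width.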
> Pengxuan Zheng writes:
> > This patch improves GCC’s vectorization of __builtin_popcount for
> > aarch64 target by adding popcount patterns for vector modes besides
> > QImode, i.e., HImode, SImode and DImode.
> >
> > With this patch, we now generate the followi
/aarch64/popcnt-vec.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 28 +++-
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 69 +++
2 files changed, 96 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64
> Pengxuan Zheng writes:
> > This patch improves GCC’s vectorization of __builtin_popcount for
> > aarch64 target by adding popcount patterns for vector modes besides
> > QImode, i.e., HImode, SImode and DImode.
> >
> > With this patch, we now generate th
This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target
by adding popcount patterns for vector modes besides QImode, i.e., HImode,
SImode and DImode.
With this patch, we now generate the following for V8HI:
cnt v1.16b, v.16b
uaddlp v2.8h, v1.16b
For V4HI, we gene
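The V8HI sequence above pairs a per-byte bit count (cnt) with a pairwise widening add (uaddlp). A scalar C model of that idea is sketched below; it is not taken from the patch, and the helper names `byte_popcount` and `popcount_v8hi_model` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models cnt on a single byte lane: number of set bits. */
static uint8_t byte_popcount(uint8_t b)
{
    uint8_t n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Models cnt on 16 byte lanes followed by uaddlp into 8 halfword lanes. */
static void popcount_v8hi_model(const uint16_t in[8], uint16_t out[8])
{
    uint8_t counts[16];

    /* cnt: view the eight halfwords as sixteen bytes and count each one. */
    for (size_t i = 0; i < 8; i++) {
        counts[2 * i]     = byte_popcount((uint8_t)(in[i] & 0xff));
        counts[2 * i + 1] = byte_popcount((uint8_t)(in[i] >> 8));
    }

    /* uaddlp: add adjacent byte counts into one halfword result each. */
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint16_t)(counts[2 * i] + counts[2 * i + 1]);
}
```

Because the two byte counts of a halfword are simply summed, the result does not depend on which byte of the pair is the low one.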
Ping https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650311.html
> -----Original Message-----
> From: Pengxuan Zheng (QUIC)
> Sent: Tuesday, April 30, 2024 5:32 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Andrew Pinski (QUIC) ; Pengxuan Zheng
> (QUIC)
> Subject: [PATCH
> Pengxuan Zheng writes:
> > This patch is a follow-up of r15-1079-g230d62a2cdd16c to add vector
> > floating point trunc pattern for V2DF->V2SF and V4SF->V4HF conversions
> > by renaming the existing
> > aarch64_float_truncate_lo_ pattern to the standard
> >
trunc2): ... This.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/trunc-vec.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-builtins.cc | 7 +++
gcc/config/aarch64/aarch64-simd.md | 6 +++---
gcc/testsuite/gcc.target/aarch64/trunc-vec.
> Pengxuan Zheng writes:
> > This patch adds vector floating point extend pattern for V2SF->V2DF
> > and V4HF->V4SF conversions by renaming the existing
> > aarch64_float_extend_lo_ pattern to the standard optab one, i.e.,
> > extend2. This
>
testsuite/ChangeLog:
* gcc.target/aarch64/fix_trunc2.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 13 +
gcc/testsuite/gcc.target/aarch64/fix_trunc2.c | 14 ++
2 files changed, 27 insertions(+)
create mode 100644 gcc/
Ping
> -----Original Message-----
> From: Pengxuan Zheng (QUIC)
> Sent: Tuesday, April 30, 2024 5:32 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Andrew Pinski (QUIC) ; Pengxuan Zheng
> (QUIC)
> Subject: [PATCH] aarch64: Add vector popcount besides QImode [PR113859]
>
>
> > Pengxuan Zheng writes:
> > > vget_low_2.c is a test case for little-endian, but we missed the
> > > -mlittle-endian flag in r15-697-ga2e4fe5a53cf75.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/aarch64/vget_low_2.
ChangeLog:
* MAINTAINERS: Add myself to Write After Approval and DCO.
Signed-off-by: Pengxuan Zheng
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index e2870eef2ef..6444e6ea2f1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -743,6 +743,7
> Pengxuan Zheng writes:
> > This patch improves vectorization of certain floating point widening
> > operations for the aarch64 target by adding vector floating point
> > extend patterns for
> > V2SF->V2DF and V4HF->V4SF conversions.
> >
> >
to...
(extend2): ... This.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/extend-vec.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-builtins.cc| 9
gcc/config/aarch64/aarch64-simd.md| 2 +-
gcc/testsuite/gcc.target/aarch64/extend-vec.
> Pengxuan Zheng writes:
> > vget_low_2.c is a test case for little-endian, but we missed the
> > -mlittle-endian flag in r15-697-ga2e4fe5a53cf75.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/vget_low_2.c: Add -mlittle-endian.
>
>
arch64-simd.md (extend2): New expand.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/extend-vec.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 7 +++
gcc/testsuite/gcc.target/aarch64/extend-vec.c | 21 +++
2 files chang
vget_low_2.c is a test case for little-endian, but we missed the -mlittle-endian
flag in r15-697-ga2e4fe5a53cf75.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vget_low_2.c: Add -mlittle-endian.
Signed-off-by: Pengxuan Zheng
---
gcc/testsuite/gcc.target/aarch64/vget_low_2.c | 2 +-
1
.
* gcc.target/aarch64/vget_high_2_be.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-builtins.cc| 59 +++---
gcc/config/aarch64/aarch64-simd-builtins.def | 6 -
gcc/config/aarch64/aarch64-simd.md| 22
gcc/config/aarch64/arm_neon.h
> On Mon, May 20, 2024 at 2:57 AM Richard Sandiford
> wrote:
> >
> > Pengxuan Zheng writes:
> > > This patch folds vget_low_* intrinsics to BIT_FIELD_REF to open up
> > > more optimization opportunities for gimple optimizers.
> > >
> >
.
* gcc.target/aarch64/vget_low_2.c: New test.
* gcc.target/aarch64/vget_low_2_be.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-builtins.cc| 60 ++
gcc/config/aarch64/aarch64-simd-builtins.def | 5 +-
gcc/config/aarch64/aarch64-simd.md
define_expand.
gcc/testsuite/ChangeLog:
PR target/113859
* gcc.target/aarch64/popcnt-vec.c: New test.
Signed-off-by: Pengxuan Zheng
---
gcc/config/aarch64/aarch64-simd.md| 40
gcc/testsuite/gcc.target/aarch64/popcnt-vec.c | 48 +++
2
61 matches