date:20240119

Re: [PATCH] RISC-V: Add the Zihpm and Zicntr extensions

2024-01-19 Thread Kito Cheng

I realized we missed this on trunk, and I need it to add -mcpu support
for SiFive cores, so I'm going to push this to trunk.
Most of the concerns were about the assembler side, so I believe it's
less controversial on the toolchain driver side.

On Wed, Nov 23, 2022 at 6:01 AM Palmer Dabbelt  wrote:
>
> On Tue, 22 Nov 2022 13:50:28 PST (-0800), jeffreya...@gmail.com wrote:
> >
> > On 11/22/22 08:29, Palmer Dabbelt wrote:
> >> On Tue, 22 Nov 2022 07:20:15 PST (-0800), jeffreya...@gmail.com wrote:
> >>>
> >>> On 11/20/22 18:36, Kito Cheng wrote:
> > So the idea here is just to define the extension so that it gets defined
> > in the ISA strings and passed through to the assembler, right?
>  That will also define arch test macro:
> 
>  https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#architecture-extension-test-macro
> 
> >>>
> >>> Sorry I should have been clearer and included the test macro(s) as well.
> >>>
> >>> So a better summary would be that while it doesn't change the codegen
> >>> behavior in the compiler, it does provide the mechanisms to pass along
> >>> isa strings to other tools such as the assembler and signal via the test
> >>> macros that this extension is available.
> >>
> >> IMO the important bit here is that we're not adding any compatibility
> >> flags, like we did when fence.i was removed from the ISA.  That's fine
> >> as long as we never remove these instructions from the base ISA in the
> >> software, but that's what's suggested by Andrew in the post.
> >
> > Right.  IIUC these instructions were never supposed to be in the base
> > ISA, but, in effect, snuck through.  We're retro-actively adding them as
> > an extension, at least in terms of ISA strings & test macros.  We're
> > currently (forever?) going to allow them in the assembler without
> > strictly requiring the extension be on.
>
> That'd be the idea.
>
> >> It's a super weird one, but there's a bunch of cases in RISC-V where
> >> we're told to just ignore words in the ISA manual.  Definitely a trap
> >> for users (and we already had some Linux folks get bit by the counter
> >> changes here), but that's just how RISC-V works.
> >
> > Mistakes happen.  The key is to adjust for them as best as we can.
> > I'd lean towards a stricter enforcement, bringing these
> > instructions/extension in line with how we handle the others. It'd
> > potentially mean source incompatibilities that would need to be fixed,
> > but they shouldn't be difficult and we're still early enough in the game
> > that we *could* take that route.  Andrew's position is more
> > accommodating of existing code and while I may not 100% agree with his
> > position, I understand it.
> >
> >
> > So while I'd lean towards a stricter checking, I can live with this
> > approach.  I wouldn't mind hearing from Kito, Philipp and others though.
>
> That's the sort of thing we've traditionally done: essentially just read
> the actual words in the PDF and produce implementations that match
> those, tagging versions when things change (the fence.i stuff is a good
> example).  After some amount of time we can then move the default spec
> version over to the new one.  That's a little bit of churn for users,
> but it shouldn't be all that bad.
>
> IMO that's the sane way to go, I'd certainly expect to be able to read
> the words in the PDFs and go implement things according to them.  It's
> pretty clearly not what the ISA folks want, though.
>
> There's also the secondary issue of getting ISA strings to match between
> the various bits of the software stack that uses them.  We're trying to
> move away from ISA strings as a stable uABI in Linux for exactly this
> reason, but ISA strings have already ended up all over the place so
> there's only so much we can do.
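
For readers following along, the test-macro side of this discussion can be sketched as below. This is a minimal illustration, assuming the __riscv_<ext> naming from the RISC-V C API document linked above (so __riscv_zicntr and __riscv_zihpm); the helper counter_ext_summary is hypothetical and not part of the patch.

```cpp
#include <cstring>

// True when the compiler defines the Zicntr architecture extension test
// macro. The macro name follows the __riscv_<ext> convention (assumption
// taken from the RISC-V C API document, not from this patch's text).
constexpr bool has_zicntr() {
#if defined(__riscv_zicntr)
  return true;
#else
  return false;
#endif
}

// Same check for Zihpm.
constexpr bool has_zihpm() {
#if defined(__riscv_zihpm)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: a short summary suitable for a build log.
const char *counter_ext_summary() {
  if (has_zicntr() && has_zihpm()) return "zicntr+zihpm";
  if (has_zicntr()) return "zicntr";
  if (has_zihpm()) return "zihpm";
  return "none advertised";
}
```

On a non-RISC-V host, or with an -march string that omits both extensions, the summary is "none advertised". As discussed above, that says nothing about whether the assembler still accepts the counter instructions.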

Re: Re: [PATCH] RISC-V: Bugfix for resolve_overloaded_builtin[PR113420]

2024-01-19 Thread Li Xu

you are right.

vint8mf8_t test_vle8_v_i8mf8_m(vbool64_t vm, const int32_t *rs1, size_t vl) {
  return __riscv_vle8(vm, rs1, vl);
}

This causes an ICE. I tried clang and it ICEs as well.



xu...@eswincomputing.com
 
From: juzhe.zh...@rivai.ai
Date: 2024-01-19 15:53
To: Li Xu; gcc-patches
CC: kito.cheng; palmer; zhengyu; pan2.li; Li Xu
Subject: Re: [PATCH] RISC-V: Bugfix for resolve_overloaded_builtin[PR113420]
Could you add a test for vle with mask?

For example:

__riscv_vle8, which overloads __riscv_vle8_v_i8mf8_m and __riscv_vle8_v_u8mf8_m.

You are using the pointer type and the mask type to resolve it.

So this pointer type is expected to be const int8_t * or const uint8_t *.

Could you add tests:
1. __riscv_vle8 (const int8_t *...)
2. __riscv_vle8 (const uint8_t *...)
3. __riscv_vle8 (const int32_t *...) ---> I worry this will cause an ICE, since
the pointer type doesn't match the expected type.
I wonder whether it will ICE while resolving the API.

Thanks.




juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2024-01-19 15:44
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; zhengyu; pan2.li; xuli
Subject: [PATCH] RISC-V: Bugfix for resolve_overloaded_builtin[PR113420]
From: xuli 
 
Change the hash value of an overloaded intrinsic from considering
all parameter types to:
1. Encoding the vector data types.
2. Encoding the pointer type, in order to distinguish
   vle8_v_i8mf8_m(vbool64_t vm, const int8_t *rs1, size_t vl) from
   vle8_v_u8mf8_m(vbool64_t vm, const uint8_t *rs1, size_t vl).
3. Encoding the number of parameters, in order to distinguish
   vfadd_vv_f32mf2_rm(vfloat32mf2_t vs2, vfloat32mf2_t vs1, unsigned int frm,
   size_t vl) from
   vfadd_vv_f32mf2(vfloat32mf2_t vs2, vfloat32mf2_t vs1, size_t vl).
   The same goes for the vxrm intrinsics.
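
The three rules above can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: ArgType, mix and overloaded_hash_sketch are hypothetical names, the "mode" is a toy integer, and the mixing function is a generic FNV-1a style step rather than GCC's inchash.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of one parameter type.
struct ArgType {
  bool is_pointer;   // e.g. const int8_t *
  bool is_vector;    // e.g. vbool64_t, vfloat32mf2_t
  bool is_unsigned;  // signedness of the (pointed-to) element type
  int mode;          // toy stand-in for the machine mode
};

// FNV-1a style mixing step, used here only for illustration.
static void mix(std::uint32_t &h, std::uint32_t v) {
  h ^= v;
  h *= 16777619u;
}

// Hash only what distinguishes overloads: vector and pointer types, plus the
// parameter count for intrinsics that may take a rounding-mode operand.
// Scalar parameter types are skipped, so an int literal passed where size_t
// or unsigned int is declared does not change the hash.
std::uint32_t overloaded_hash_sketch(const std::vector<ArgType> &args,
                                     bool may_require_rounding_mode) {
  std::uint32_t h = 2166136261u;
  for (const ArgType &a : args) {
    if (a.is_pointer || a.is_vector) {
      mix(h, static_cast<std::uint32_t>(a.is_unsigned));
      mix(h, static_cast<std::uint32_t>(a.mode));
    } else if (may_require_rounding_mode) {
      mix(h, static_cast<std::uint32_t>(args.size()));
      break;
    }
  }
  return h;
}
```

With this shape, the int8_t and uint8_t pointer variants hash differently, and a four-parameter _rm call hashes differently from the three-parameter form, while the signedness of a scalar argument no longer matters.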
 
PR target/113420
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins.cc (has_vxrm_or_frm_p): remove.
(registered_function::overloaded_hash): refactor.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr113420.c: New test.
---
gcc/config/riscv/riscv-vector-builtins.cc | 88 +++
.../gcc.target/riscv/rvv/base/pr113420.c  | 30 +++
2 files changed, 43 insertions(+), 75 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113420.c
 
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 25e0b6e56de..5240f9e1f02 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4271,24 +4271,22 @@ registered_function::overloaded_hash () const
: TYPE_UNSIGNED (type);
   mode_p = POINTER_TYPE_P (type) ? TYPE_MODE (TREE_TYPE (type))
 : TYPE_MODE (type);
-  h.add_int (unsigned_p);
-  h.add_int (mode_p);
+  if (POINTER_TYPE_P (type) || lookup_vector_type_attribute (type))
+ {
+   h.add_int (unsigned_p);
+   h.add_int (mode_p);
+ }
+  else if (instance.base->may_require_vxrm_p ()
+|| instance.base->may_require_frm_p ())
+ {
+   h.add_int (argument_types.length ());
+   break;
+ }
 }
   return h.end ();
}
-bool
-has_vxrm_or_frm_p (function_instance &instance, const vec 
&arglist)
-{
-  if (instance.base->may_require_vxrm_p ()
-  || (instance.base->may_require_frm_p ()
-   && (TREE_CODE (TREE_TYPE (arglist[arglist.length () - 2]))
-   == INTEGER_TYPE)))
-return true;
-  return false;
-}
-
hashval_t
registered_function::overloaded_hash (const vec &arglist)
{
@@ -4296,68 +4294,8 @@ registered_function::overloaded_hash (const vec &arglist)
   unsigned int len = arglist.length ();
   for (unsigned int i = 0; i < len; i++)
-{
-  /* vint8m1_t __riscv_vget_i8m1(vint8m2_t src, size_t index);
-  When the user calls vget intrinsic, the __riscv_vget_i8m1(src, 1)
-   form is used. The compiler recognizes that the parameter index is signed
-   int, which is inconsistent with size_t, so the index is converted to
-   size_t type in order to get correct hash value. vint8m2_t
-   __riscv_vset(vint8m2_t dest, size_t index, vint8m1_t value); The reason
-   is the same as above. */
-  if ((instance.base == bases::vget && (i == (len - 1)))
-   || ((instance.base == bases::vset
-   || instance.shape == shapes::crypto_vi)
- && (i == (len - 2
- argument_types.safe_push (size_type_node);
-  /* Vector fixed-point arithmetic instructions requiring argument vxrm.
-  For example: vuint32m4_t __riscv_vaaddu(vuint32m4_t vs2,
-  vuint32m4_t vs1, unsigned int vxrm, size_t vl); The user calls vaaddu
-  intrinsic in the form of __riscv_vaaddu(vs2, vs1, 2, vl). The compiler
-  recognizes that the parameter vxrm is a signed int, which is inconsistent
-  with the parameter unsigned int vxrm declared by intrinsic, so the
-  parameter vxrm is converted to an unsigned int type in order to get
-  correct hash value.
-
-  Vector Floating-Point Instructions requiring argument frm.
-  DEF_RVV_FUNCTION (vfadd, alu, full_preds, f_vvv_ops)
-

Re: Re: [PATCH] RISC-V: Bugfix for resolve_overloaded_builtin[PR113420]

2024-01-19 Thread juzhe.zh...@rivai.ai

Could you show me the ICE message?
Is it in the front end? If so, it's OK.

I wonder whether it is an "internal compiler error".




juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2024-01-19 16:04
To: juzhe.zhong; gcc-patches
CC: kito.cheng; palmer; zhengyu; pan2.li
Subject: Re: Re: [PATCH] RISC-V: Bugfix for resolve_overloaded_builtin[PR113420]
you are right.

vint8mf8_t test_vle8_v_i8mf8_m(vbool64_t vm, const int32_t *rs1, size_t vl) {
  return __riscv_vle8(vm, rs1, vl);
}

This will cause ICE. I tried clang and it will also cause ICE.



xu...@eswincomputing.com
 

[PATCH] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Juzhe-Zhong

This patch fixes a memory hog found in the SPEC2017 wrf benchmark.  It was
caused by RVV_VLMAX: the macro generates a brand new rtx with
gen_rtx_REG (Pmode, X0_REGNUM) every time it is used, so we were always
generating redundant, garbage (reg:DI 0 zero) rtxes.

With this patch, the memory hog is gone.
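
The difference between the two forms of the macro can be pictured with a toy model. This is a sketch only: Rtx and RtxPool are hypothetical stand-ins, with gen_reg playing the role of gen_rtx_REG (a new node per call) and shared_reg playing the role of regno_reg_rtx[regno] (one pre-built node per hard register).

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for an rtx node; only the register number is modelled.
struct Rtx {
  int regno;
};

// Hypothetical pool that owns every node it hands out, loosely mimicking a
// garbage-collected heap that has not been collected yet.
class RtxPool {
 public:
  explicit RtxPool(int num_hard_regs) {
    for (int r = 0; r < num_hard_regs; ++r)
      shared_regs_.push_back(allocate(r));
  }

  // Analogue of gen_rtx_REG: every call creates a brand new node.
  const Rtx *gen_reg(int regno) { return allocate(regno); }

  // Analogue of regno_reg_rtx[regno]: returns the one shared node.
  const Rtx *shared_reg(int regno) const { return shared_regs_[regno]; }

  std::size_t live_nodes() const { return nodes_.size(); }

 private:
  const Rtx *allocate(int regno) {
    nodes_.push_back(std::unique_ptr<Rtx>(new Rtx{regno}));
    return nodes_.back().get();
  }

  std::vector<std::unique_ptr<Rtx>> nodes_;
  std::vector<const Rtx *> shared_regs_;
};
```

Calling shared_reg in a loop leaves live_nodes unchanged, while calling gen_reg grows it by one per call, which is the pattern behind the GGC column in the figures below.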

Time variable            usr           sys          wall           GGC
 machine dep reorg  :   1.99 (  9%)   0.35 ( 56%)   2.33 ( 10%)   939M ( 80%) [Before this patch]
 machine dep reorg  :   1.71 (  6%)   0.16 ( 27%)   3.77 (  6%)   659k (  0%) [After this patch]

Time variable            usr           sys          wall           GGC
 machine dep reorg  :  75.93 ( 18%)  14.23 ( 88%)  90.15 ( 21%) 33383M ( 95%) [Before this patch]
 machine dep reorg  :  56.00 ( 14%)   7.92 ( 77%)  63.93 ( 15%)  4361k (  0%) [After this patch]

Testing is in progress.  OK for trunk if it passes with no regressions?

gcc/ChangeLog:

* config/riscv/riscv-protos.h (RVV_VLMAX): Change to 
regno_reg_rtx[X0_REGNUM].
(RVV_VUNDEF): Ditto.
* config/riscv/riscv-vsetvl.cc: Add timevar.

---
 gcc/config/riscv/riscv-protos.h  | 5 ++---
 gcc/config/riscv/riscv-vsetvl.cc | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 7853b488838..7fe26fcd939 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -299,10 +299,9 @@ void riscv_run_selftests (void);
 #endif
 
 namespace riscv_vector {
-#define RVV_VLMAX gen_rtx_REG (Pmode, X0_REGNUM)
+#define RVV_VLMAX regno_reg_rtx[X0_REGNUM]
 #define RVV_VUNDEF(MODE)                                                       \
-  gen_rtx_UNSPEC (MODE, gen_rtvec (1, gen_rtx_REG (SImode, X0_REGNUM)),        \
-                  UNSPEC_VUNDEF)
+  gen_rtx_UNSPEC (MODE, gen_rtvec (1, RVV_VLMAX), UNSPEC_VUNDEF)
 
 /* These flags describe how to pass the operands to a rvv insn pattern.
e.g.:
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 2067073185f..54c85ffb7d5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3556,7 +3556,7 @@ const pass_data pass_data_vsetvl = {
   RTL_PASS, /* type */
   "vsetvl", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
-  TV_NONE,  /* tv_id */
+  TV_MACH_DEP,  /* tv_id */
   0,/* properties_required */
   0,/* properties_provided */
   0,/* properties_destroyed */
-- 
2.36.3

Re: [PATCH] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Kito Cheng

LGTM, nice catch, I wasn't aware that would be a problem.


[PATCH V2] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Juzhe-Zhong

This patch fixes a memory hog found in the SPEC2017 wrf benchmark.  It was
caused by RVV_VLMAX: the macro generates a brand new rtx with
gen_rtx_REG (Pmode, X0_REGNUM) every time it is used, so we were always
generating redundant, garbage (reg:DI 0 zero) rtxes.

With this patch, the memory hog is gone.

Time variable            usr           sys          wall           GGC
 machine dep reorg  :   1.99 (  9%)   0.35 ( 56%)   2.33 ( 10%)   939M ( 80%) [Before this patch]
 machine dep reorg  :   1.71 (  6%)   0.16 ( 27%)   3.77 (  6%)   659k (  0%) [After this patch]

Time variable            usr           sys          wall           GGC
 machine dep reorg  :  75.93 ( 18%)  14.23 ( 88%)  90.15 ( 21%) 33383M ( 95%) [Before this patch]
 machine dep reorg  :  56.00 ( 14%)   7.92 ( 77%)  63.93 ( 15%)  4361k (  0%) [After this patch]

Testing is in progress.  OK for trunk if it passes with no regressions?

PR target/113495

gcc/ChangeLog:

* config/riscv/riscv-protos.h (RVV_VLMAX): Change to 
regno_reg_rtx[X0_REGNUM].
(RVV_VUNDEF): Ditto.
* config/riscv/riscv-vsetvl.cc: Add timevar.

---
 gcc/config/riscv/riscv-protos.h  | 5 ++---
 gcc/config/riscv/riscv-vsetvl.cc | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 7853b488838..7fe26fcd939 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -299,10 +299,9 @@ void riscv_run_selftests (void);
 #endif
 
 namespace riscv_vector {
-#define RVV_VLMAX gen_rtx_REG (Pmode, X0_REGNUM)
+#define RVV_VLMAX regno_reg_rtx[X0_REGNUM]
 #define RVV_VUNDEF(MODE)                                                       \
-  gen_rtx_UNSPEC (MODE, gen_rtvec (1, gen_rtx_REG (SImode, X0_REGNUM)),        \
-                  UNSPEC_VUNDEF)
+  gen_rtx_UNSPEC (MODE, gen_rtvec (1, RVV_VLMAX), UNSPEC_VUNDEF)
 
 /* These flags describe how to pass the operands to a rvv insn pattern.
e.g.:
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 2067073185f..54c85ffb7d5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3556,7 +3556,7 @@ const pass_data pass_data_vsetvl = {
   RTL_PASS, /* type */
   "vsetvl", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
-  TV_NONE,  /* tv_id */
+  TV_MACH_DEP,  /* tv_id */
   0,/* properties_required */
   0,/* properties_provided */
   0,/* properties_destroyed */
-- 
2.36.3

Re: Re: [PATCH] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread juzhe.zh...@rivai.ai

Thanks. I will commit the V2 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643420.html
after I finish testing.

V2 has no code differences from V1; it only adds:

PR target/113495



juzhe.zh...@rivai.ai
 

[PATCH] sccvn: Don't use SCALAR_INT_TYPE_MODE on BLKmode BITINT_TYPEs [PR113459]

2024-01-19 Thread Jakub Jelinek

Hi!

sccvn uses GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type)) for INTEGER_TYPEs,
most likely because that is what native_{interpret,encode}_int used.
This obviously doesn't work for larger BITINT_TYPEs which have BLKmode
and the above ICEs on those.  native_{interpret,encode}_int checks whether
the BITINT_TYPE is medium/large/huge (i.e. an array of 2+ ABI limbs)
and uses TYPE_SIZE_UNIT for that case, otherwise SCALAR_INT_TYPE_MODE like
for the INTEGER_TYPE case.

The following patch instead just uses SCALAR_INT_TYPE_MODE for non-BLKmode
TYPE_MODE and TYPE_SIZE_UNIT otherwise.
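
The size selection described above can be sketched in isolation. This is a toy model under stated assumptions: Mode, IntTypeInfo and integral_buffer_size are hypothetical names, and the byte sizes are illustrative rather than taken from any particular ABI.

```cpp
#include <cstdint>

// Toy subset of machine modes; BLK means "no scalar mode", as for large
// _BitInt types that are laid out as an array of limbs.
enum class Mode { BLK, QI, HI, SI, DI, TI };

// Size in bytes of a scalar integer mode (0 for BLK, which has no size).
unsigned mode_size_bytes(Mode m) {
  switch (m) {
    case Mode::QI: return 1;
    case Mode::HI: return 2;
    case Mode::SI: return 4;
    case Mode::DI: return 8;
    case Mode::TI: return 16;
    case Mode::BLK: return 0;
  }
  return 0;
}

// Hypothetical summary of an integral type: its mode and its size in bytes
// (the analogue of TYPE_MODE and TYPE_SIZE_UNIT).
struct IntTypeInfo {
  Mode mode;
  std::uint64_t size_unit;
};

// Use the mode size when the type has a scalar mode, and fall back to the
// type's own size when it is BLKmode.
std::uint64_t integral_buffer_size(const IntTypeInfo &t) {
  if (t.mode != Mode::BLK)
    return mode_size_bytes(t.mode);
  return t.size_unit;
}
```

For a DImode type this yields 8, and for a BLKmode type described as 24 bytes (an assumed figure for something like _BitInt(129) with 64-bit limbs) it yields 24 instead of asking for the size of a mode that does not exist.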

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-19  Jakub Jelinek  

PR tree-optimization/113459
* tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def): Use
TREE_INT_CST_LOW of TYPE_SIZE_UNIT rather than GET_MODE_SIZE
of SCALAR_INT_TYPE_MODE if type has BLKmode.
(vn_reference_lookup_3): Likewise.  Formatting fix.

* gcc.dg/bitint-73.c: New test.

--- gcc/tree-ssa-sccvn.cc.jj   2024-01-03 11:51:42.361580881 +0100
+++ gcc/tree-ssa-sccvn.cc   2024-01-18 12:39:52.789606975 +0100
@@ -2287,7 +2287,12 @@ vn_walk_cb_data::push_partial_def (pd_da
BITS_PER_UNIT
- (maxsizei % BITS_PER_UNIT));
   if (INTEGRAL_TYPE_P (type))
-   sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
+   {
+ if (TYPE_MODE (type) != BLKmode)
+   sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
+ else
+   sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (type));
+   }
   if (sz > needed_len)
{
  memcpy (this_buffer + (sz - needed_len), buffer, needed_len);
@@ -2967,8 +2972,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree
}
  else
{
- unsigned buflen = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vr->type)) + 1;
- if (INTEGRAL_TYPE_P (vr->type))
+ unsigned buflen
+   = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vr->type)) + 1;
+ if (INTEGRAL_TYPE_P (vr->type)
+ && TYPE_MODE (vr->type) != BLKmode)
buflen = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (vr->type)) + 1;
  unsigned char *buf = XALLOCAVEC (unsigned char, buflen);
  memset (buf, TREE_INT_CST_LOW (gimple_call_arg (def_stmt, 1)),
@@ -3165,7 +3172,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree
 offset + maxsize - 1.  */
  HOST_WIDE_INT sz = maxsizei / BITS_PER_UNIT;
  if (INTEGRAL_TYPE_P (type))
-   sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
+   {
+ if (TYPE_MODE (type) != BLKmode)
+   sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
+ else
+   sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (type));
+   }
  amnt = ((unsigned HOST_WIDE_INT) offset2i + size2i
  - offseti - maxsizei) % BITS_PER_UNIT;
  if (amnt)
--- gcc/testsuite/gcc.dg/bitint-73.c.jj 2024-01-18 12:29:07.586634031 +0100
+++ gcc/testsuite/gcc.dg/bitint-73.c   2024-01-18 12:28:42.406986342 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/113459 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=c23 -O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 129
+# define N 129
+#else
+# define N 63
+#endif
+
+_BitInt(N) a;
+
+_BitInt(N)
+foo (void)
+{
+  __builtin_memset (&a, 6, sizeof a);
+  return a;
+}

Jakub

[PATCH] gimple-ssa-warn-restrict: Only use type range from NOP_EXPR for non-narrowing conversions [PR113463]

2024-01-19 Thread Jakub Jelinek

Hi!

When builtin_memref::extend_offset_range sees a NOP_EXPR from an
INTEGRAL_TYPE (to an INTEGRAL_TYPE of sizetype/ptrdifftype precision,
given the callers), it uses wi::to_offset on TYPE_{MIN,MAX}_VALUE
of the rhs1 type.  This ICEs with large BITINT_TYPEs, because to_offset
is only supported for precisions up to the offset_int precision.
But it doesn't even make sense to do such a thing for narrowing
conversions: their range is the whole sizetype/ptrdifftype range,
and so the normal handling done later on (largest sized supported object)
is the way to go in that case.

So, the following patch just restricts this to non-narrowing conversions.
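
The reasoning can be sketched with plain integers. This is an illustration only, assuming a signed source type and a 64-bit signed destination; OffsetRange and offset_range_from_cast are hypothetical names, not the functions touched by the patch.

```cpp
#include <cstdint>
#include <limits>

// Closed interval of possible offset values.
struct OffsetRange {
  std::int64_t min;
  std::int64_t max;
};

// Range of a value converted from a signed source type of SRC_PRECISION bits
// to a destination of DST_PRECISION bits.  Only a non-narrowing conversion
// lets the source type's bounds constrain the result; a narrowing one can
// produce any destination value, so the full range is returned.
OffsetRange offset_range_from_cast(unsigned src_precision,
                                   unsigned dst_precision) {
  const OffsetRange full = {std::numeric_limits<std::int64_t>::min(),
                            std::numeric_limits<std::int64_t>::max()};
  if (src_precision == 0 || src_precision > dst_precision ||
      src_precision >= 64)
    return full;  // narrowing, or nothing tighter to say in this toy model
  const std::int64_t half = std::int64_t{1} << (src_precision - 1);
  return {-half, half - 1};
}
```

An 8-bit source widened to 64 bits gives [-128, 127], whereas a 129-bit source narrowed to 64 bits gives the full range, so there is nothing to gain from looking at the bounds of the wide type.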

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-19  Jakub Jelinek  

PR tree-optimization/113463
* gimple-ssa-warn-restrict.cc (builtin_memref::extend_offset_range):
Only look through NOP_EXPRs if rhs1 doesn't have wider type than
lhs.

* gcc.dg/bitint-74.c: New test.

--- gcc/gimple-ssa-warn-restrict.cc.jj  2024-01-03 11:51:27.705784291 +0100
+++ gcc/gimple-ssa-warn-restrict.cc 2024-01-18 16:00:02.519483821 +0100
@@ -391,7 +391,8 @@ builtin_memref::extend_offset_range (tre
   tree type;
   if (is_gimple_assign (stmt)
  && (type = TREE_TYPE (gimple_assign_rhs1 (stmt)))
- && INTEGRAL_TYPE_P (type))
+ && INTEGRAL_TYPE_P (type)
+ && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (offset)))
{
  tree_code code = gimple_assign_rhs_code (stmt);
  if (code == NOP_EXPR)
--- gcc/testsuite/gcc.dg/bitint-74.c.jj 2024-01-18 16:14:05.523599054 +0100
+++ gcc/testsuite/gcc.dg/bitint-74.c   2024-01-18 16:13:30.150099638 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/113463 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=c23 -O2" } */
+
+extern char *a, *b;
+#if __BITINT_MAXWIDTH__ >= 129
+_BitInt(129) o;
+#else
+_BitInt(63) o;
+#endif
+
+void
+foo (void)
+{
+  __builtin_memcpy (a + o, b, 4);
+}

Jakub

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-19 Thread chenglulu




On 2024/1/19 at 1:46 PM, Xi Ruoyao wrote:

On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:

Virtual register 1479 will be used in insn 2744, but register 1479 was
assigned the REG_UNUSED attribute in the previous instruction.

The attached file is the wrong file.
The compilation command is as follows:

$ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
-dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
-msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
-Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
-Wno-incompatible-pointer-types -version -o regrename.s
-mexplicit-relocs=always -fdump-rtl-all-all

I've seen some "guality" test failures in GCC test suite as well.
Normally I just ignore the guality failures but this time they look very
suspicious.  I'll investigate these issues...


I've also seen this type of failed regression tests and I'll continue to
look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

   [(set (match_operand:DI 0 "register_operand" "=r")
 (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
(set (match_operand:DI 1 "register_operand" "=r")
 (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added a REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.


I'm eliminating redundant la.global directives in my macro implementation.

I will be testing this patch.

Re: [PATCH v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

2024-01-19 Thread chenglulu


Hi, Jiahao:

This patch introduces new FAILs, and the reason needs to be explained.

+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional combines static and invariant" 1
+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate bb" 2
+FAIL: gcc.dg/tree-ssa/update-threading.c scan-tree-dump-times optimized "Invalid sum" 0

On 2024/1/16 at 10:32 AM, Jiahao Xu wrote:

Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0 so that a short-circuit branch keeps the
short-circuit operation instead of being converted to the non-short-circuit operation.

SPEC2017 performance evaluation shows 1% performance improvement for fprate
GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
on 3A6000.
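
As a rough illustration of the two forms this macro chooses between, here is a
minimal sketch in plain C.  It is not part of the patch, and the function
names are hypothetical.  The assumption is that a nonzero
LOGICAL_OP_NON_SHORT_CIRCUIT lets the compiler fold side-effect-free "||"
chains into the second form, while 0 keeps the first form with one branch per
comparison.  Both return the same value.

```c
#include <assert.h>
#include <stddef.h>

/* Short-circuit form: evaluation stops at the first true condition,
   so each comparison is its own conditional branch.  */
int
check_short_circuit (const float *a)
{
  assert (a != NULL);
  if (a[0] > a[3] || a[1] < a[2] || a[0] > a[5])
    return 0;
  return 1;
}

/* Non-short-circuit form: every comparison is evaluated and the
   results are combined with bitwise OR, leaving a single branch.  */
int
check_non_short_circuit (const float *a)
{
  assert (a != NULL);
  int any = (a[0] > a[3]) | (a[1] < a[2]) | (a[0] > a[5]);
  return any ? 0 : 1;
}
```

Which form is faster depends on how expensive and how predictable the branches
are on the target, which is why this is a per-target setting.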

gcc/ChangeLog:

* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/short-circuit.c: New test.

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 4e6ede926d3..8b453ab3140 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -869,6 +869,7 @@ typedef struct {
 1 is the default; other values are interpreted relative to that.  */
  
  #define BRANCH_COST(speed_p, predictable_p) la_branch_cost

+#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
  
  /* Return the asm template for a conditional branch instruction.

 OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
new file mode 100644
index 000..bed585ee172
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
+
+int
+short_circuit (float *a)
+{
+  float t1x = a[0];
+  float t2x = a[1];
+  float t1y = a[2];
+  float t2y = a[3];
+  float t1z = a[4];
+  float t2z = a[5];
+
+  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
+    return 0;
+
+  return 1;
+}
+/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */

[PATCH] lower-bitint: Don't use m_loads for loads used in GIMPLE_ASM [PR113464]

2024-01-19 Thread Jakub Jelinek

Hi!

Like for GIMPLE_PHIs or calls, even for GIMPLE_ASMs we want
a corresponding VAR_DECL assigned for lhs SSA_NAMEs of loads
from memory, as even GIMPLE_ASM relies on those VAR_DECLs to exist.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-19  Jakub Jelinek  

PR tree-optimization/113464
* gimple-lower-bitint.cc (gimple_lower_bitint): Don't try to
optimize loads into GIMPLE_ASM stmts.

* gcc.dg/bitint-75.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-01-18 08:44:08.337270271 +0100
+++ gcc/gimple-lower-bitint.cc  2024-01-18 19:57:11.791976322 +0100
@@ -6249,7 +6249,8 @@ gimple_lower_bitint (void)
  if (is_gimple_debug (use_stmt))
continue;
  if (gimple_code (use_stmt) == GIMPLE_PHI
- || is_gimple_call (use_stmt))
+ || is_gimple_call (use_stmt)
+ || gimple_code (use_stmt) == GIMPLE_ASM)
{
  optimizable_load = false;
  break;
--- gcc/testsuite/gcc.dg/bitint-75.c.jj 2024-01-18 20:08:21.710557536 +0100
+++ gcc/testsuite/gcc.dg/bitint-75.c	2024-01-18 20:07:18.017447734 +0100
@@ -0,0 +1,11 @@
+/* PR tree-optimization/113464 */
+/* { dg-do compile { target bitint65535 } } */
+/* { dg-options "-O2 -w -std=gnu23" } */
+
+_BitInt(65532) i;
+
+void
+foo (void)
+{
+  __asm__ ("" : "+r" (i)); /* { dg-error "impossible constraint" } */
+}

Jakub

Re: [PATCH] gimple-ssa-warn-restrict: Only use type range from NOP_EXPR for non-narrowing conversions [PR113463]

2024-01-19 Thread Richard Biener

On Fri, 19 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> builtin_memref::extend_offset_range when it sees a NOP_EXPR from
> INTEGRAL_TYPE (to INTEGRAL_TYPE of sizetype/ptrdifftype precision
> given the callers) uses wi::to_offset on TYPE_{MIN,MAX}_VALUE
> of the rhs1 type.  This ICEs with large BITINT_TYPEs - to_offset
> is only supported for precisions up to the offset_int precision
> - but it even doesn't make any sense to do such thing for narrowing
> conversions, their range means the whole sizetype/ptrdifftype range
> and so the normal handling done later on (largest sized supported object)
> is the way to go in that case.
> 
> So, the following patch just restricts this to non-narrowing conversions.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.
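
For readers following along, a small stand-alone sketch of why the source
type's range says nothing useful after a narrowing conversion.  This is plain
C, not GCC internals: the helper names are hypothetical, uint32_t/uint8_t
stand in for the wide _BitInt and sizetype, and source_range_is_usable only
mirrors the shape of the precision check added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a narrowing NOP_EXPR: a 32-bit value is
   truncated to 8 bits, so the result can be any 8-bit value no matter
   what range the source type has.  */
uint8_t
narrow_to_u8 (uint32_t wide)
{
  return (uint8_t) wide;
}

/* Hypothetical stand-in for a non-narrowing conversion: the result
   always stays inside the range of the narrower source type.  */
uint32_t
widen_from_u8 (uint8_t narrow)
{
  return (uint32_t) narrow;
}

/* Same shape as the check in the patch: only trust the source type's
   range when the source is not wider than the destination.  */
int
source_range_is_usable (unsigned src_precision, unsigned dst_precision)
{
  assert (src_precision > 0 && dst_precision > 0);
  return src_precision <= dst_precision;
}
```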

> 2024-01-19  Jakub Jelinek  
> 
>   PR tree-optimization/113463
>   * gimple-ssa-warn-restrict.cc (builtin_memref::extend_offset_range):
>   Only look through NOP_EXPRs if rhs1 doesn't have wider type than
>   lhs.
> 
>   * gcc.dg/bitint-74.c: New test.
> 
> --- gcc/gimple-ssa-warn-restrict.cc.jj	2024-01-03 11:51:27.705784291 +0100
> +++ gcc/gimple-ssa-warn-restrict.cc   2024-01-18 16:00:02.519483821 +0100
> @@ -391,7 +391,8 @@ builtin_memref::extend_offset_range (tre
>tree type;
>if (is_gimple_assign (stmt)
> && (type = TREE_TYPE (gimple_assign_rhs1 (stmt)))
> -   && INTEGRAL_TYPE_P (type))
> +   && INTEGRAL_TYPE_P (type)
> +   && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (offset)))
>   {
> tree_code code = gimple_assign_rhs_code (stmt);
> if (code == NOP_EXPR)
> --- gcc/testsuite/gcc.dg/bitint-74.c.jj   2024-01-18 16:14:05.523599054 +0100
> +++ gcc/testsuite/gcc.dg/bitint-74.c  2024-01-18 16:13:30.150099638 +0100
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/113463 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +extern char *a, *b;
> +#if __BITINT_MAXWIDTH__ >= 129
> +_BitInt(129) o;
> +#else
> +_BitInt(63) o;
> +#endif
> +
> +void
> +foo (void)
> +{
> +  __builtin_memcpy (a + o, b, 4);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] sccvn: Don't use SCALAR_INT_TYPE_MODE on BLKmode BITINT_TYPEs [PR113459]

2024-01-19 Thread Richard Biener

On Fri, 19 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> sccvn uses GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type)) for INTEGER_TYPEs,
> most likely because that is what native_{interpret,encode}_int used.
> This obviously doesn't work for larger BITINT_TYPEs which have BLKmode
> and the above ICEs on those.  native_{interpret,encode}_int checks whether
> the BITINT_TYPE is medium/large/huge (i.e. an array of 2+ ABI limbs)
> and uses TYPE_SIZE_UNIT for that case, otherwise SCALAR_INT_TYPE_MODE like
> for the INTEGER_TYPE case.
> 
> The following patch instead just uses SCALAR_INT_TYPE_MODE for non-BLKmode
> TYPE_MODE and TYPE_SIZE_UNIT otherwise.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.
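
A toy model of the size selection described in the quoted text, for
illustration only.  The struct and function below are hypothetical and do not
exist in GCC: has_scalar_mode stands for "TYPE_MODE is not BLKmode", mode_size
for GET_MODE_SIZE of the scalar mode, and size_unit for TYPE_SIZE_UNIT.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a type.  HAS_SCALAR_MODE is 0 for the BLKmode case
   (for example a large _BitInt), MODE_SIZE is the scalar mode size in
   bytes when there is one, and SIZE_UNIT is the type size in bytes.  */
struct toy_type
{
  int has_scalar_mode;
  size_t mode_size;
  size_t size_unit;
};

/* Pick the size the way the patch describes: use the mode size only
   when the type has a scalar mode, otherwise fall back to the size of
   the type in bytes.  */
size_t
toy_encoded_size (const struct toy_type *type)
{
  assert (type != NULL);
  if (type->has_scalar_mode)
    return type->mode_size;
  return type->size_unit;
}
```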

> 2024-01-19  Jakub Jelinek  
> 
>   PR tree-optimization/113459
>   * tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def): Use
>   TREE_INT_CST_LOW of TYPE_SIZE_UNIT rather than GET_MODE_SIZE
>   of SCALAR_INT_TYPE_MODE if type has BLKmode.
>   (vn_reference_lookup_3): Likewise.  Formatting fix.
> 
>   * gcc.dg/bitint-73.c: New test.
> 
> --- gcc/tree-ssa-sccvn.cc.jj  2024-01-03 11:51:42.361580881 +0100
> +++ gcc/tree-ssa-sccvn.cc 2024-01-18 12:39:52.789606975 +0100
> @@ -2287,7 +2287,12 @@ vn_walk_cb_data::push_partial_def (pd_da
>   BITS_PER_UNIT
>   - (maxsizei % BITS_PER_UNIT));
>if (INTEGRAL_TYPE_P (type))
> - sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
> + {
> +   if (TYPE_MODE (type) != BLKmode)
> + sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
> +   else
> + sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (type));
> + }
>if (sz > needed_len)
>   {
> memcpy (this_buffer + (sz - needed_len), buffer, needed_len);
> @@ -2967,8 +2972,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree
>   }
> else
>   {
> -   unsigned buflen = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vr->type)) + 1;
> -   if (INTEGRAL_TYPE_P (vr->type))
> +   unsigned buflen
> + = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vr->type)) + 1;
> +   if (INTEGRAL_TYPE_P (vr->type)
> +   && TYPE_MODE (vr->type) != BLKmode)
>   buflen = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (vr->type)) + 1;
> unsigned char *buf = XALLOCAVEC (unsigned char, buflen);
> memset (buf, TREE_INT_CST_LOW (gimple_call_arg (def_stmt, 1)),
> @@ -3165,7 +3172,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree
>offset + maxsize - 1.  */
> HOST_WIDE_INT sz = maxsizei / BITS_PER_UNIT;
> if (INTEGRAL_TYPE_P (type))
> - sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
> + {
> +   if (TYPE_MODE (type) != BLKmode)
> + sz = GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type));
> +   else
> + sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (type));
> + }
> amnt = ((unsigned HOST_WIDE_INT) offset2i + size2i
> - offseti - maxsizei) % BITS_PER_UNIT;
> if (amnt)
> --- gcc/testsuite/gcc.dg/bitint-73.c.jj   2024-01-18 12:29:07.586634031 +0100
> +++ gcc/testsuite/gcc.dg/bitint-73.c  2024-01-18 12:28:42.406986342 +0100
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/113459 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +# define N 129
> +#else
> +# define N 63
> +#endif
> +
> +_BitInt(N) a;
> +
> +_BitInt(N)
> +foo (void)
> +{
> +  __builtin_memset (&a, 6, sizeof a);
> +  return a;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] lower-bitint: Don't use m_loads for loads used in GIMPLE_ASM [PR113464]

2024-01-19 Thread Richard Biener

On Fri, 19 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> Like for GIMPLE_PHIs or calls, even for GIMPLE_ASMs we want
> a corresponding VAR_DECL assigned for lhs SSA_NAMEs of loads
> from memory, as even GIMPLE_ASM relies on those VAR_DECLs to exist.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2024-01-19  Jakub Jelinek  
> 
>   PR tree-optimization/113464
>   * gimple-lower-bitint.cc (gimple_lower_bitint): Don't try to
>   optimize loads into GIMPLE_ASM stmts.
> 
>   * gcc.dg/bitint-75.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-01-18 08:44:08.337270271 +0100
> +++ gcc/gimple-lower-bitint.cc2024-01-18 19:57:11.791976322 +0100
> @@ -6249,7 +6249,8 @@ gimple_lower_bitint (void)
> if (is_gimple_debug (use_stmt))
>   continue;
> if (gimple_code (use_stmt) == GIMPLE_PHI
> -   || is_gimple_call (use_stmt))
> +   || is_gimple_call (use_stmt)
> +   || gimple_code (use_stmt) == GIMPLE_ASM)
>   {
> optimizable_load = false;
> break;
> --- gcc/testsuite/gcc.dg/bitint-75.c.jj   2024-01-18 20:08:21.710557536 +0100
> +++ gcc/testsuite/gcc.dg/bitint-75.c  2024-01-18 20:07:18.017447734 +0100
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/113464 */
> +/* { dg-do compile { target bitint65535 } } */
> +/* { dg-options "-O2 -w -std=gnu23" } */
> +
> +_BitInt(65532) i;
> +
> +void
> +foo (void)
> +{
> +  __asm__ ("" : "+r" (i));   /* { dg-error "impossible constraint" } */
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH 1/3] rtl-ssa: Provide easier access to debug uses [PR113089]

2024-01-19 Thread Alex Coplan

This patch adds some accessors to set_info and use_info to make it
easier to get at and iterate through uses in debug insns.

It is used by the aarch64 load/store pair fusion pass in a subsequent
patch to fix PR113089, i.e. to update debug uses in the pass.
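
To make the shape of these accessors concrete, here is a toy model in plain C.
It is only a sketch: the toy_* names are hypothetical, and it assumes the use
list of a set is ordered as nondebug insn uses, then debug insn uses, then phi
uses.  The real first_debug_insn_use jumps straight to the use after the last
nondebug use instead of scanning the list as this model does.

```c
#include <assert.h>
#include <stddef.h>

enum toy_use_kind { TOY_NONDEBUG, TOY_DEBUG, TOY_PHI };

struct toy_use
{
  enum toy_use_kind kind;
  struct toy_use *next;
};

/* Return the first debug use in a list ordered as: nondebug insn uses,
   then debug insn uses, then phi uses.  Return NULL if there is none.  */
struct toy_use *
toy_first_debug_use (struct toy_use *first)
{
  struct toy_use *use = first;
  /* Skip the leading run of nondebug uses.  */
  while (use && use->kind == TOY_NONDEBUG)
    use = use->next;
  return (use && use->kind == TOY_DEBUG) ? use : NULL;
}

/* Return the next debug use after USE, or NULL if the debug run ends.
   Only valid when USE is itself a debug use.  */
struct toy_use *
toy_next_debug_use (const struct toy_use *use)
{
  assert (use != NULL && use->kind == TOY_DEBUG);
  struct toy_use *next = use->next;
  return (next && next->kind == TOY_DEBUG) ? next : NULL;
}

/* Count the debug uses by walking from the first one.  */
size_t
toy_count_debug_uses (struct toy_use *first)
{
  size_t n = 0;
  for (struct toy_use *u = toy_first_debug_use (first); u;
       u = toy_next_debug_use (u))
    n++;
  return n;
}
```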

Bootstrapped/regtested as a series on aarch64-linux-gnu (with/without
the load/store pair pass enabled), OK for trunk?

gcc/ChangeLog:

PR target/113089
* rtl-ssa/accesses.h (use_info::next_debug_insn_use): New.
(debug_insn_use_iterator): New.
(set_info::first_debug_insn_use): New.
(set_info::debug_insn_uses): New.
* rtl-ssa/member-fns.inl (use_info::next_debug_insn_use): New.
(set_info::first_debug_insn_use): New.
(set_info::debug_insn_uses): New.
---
 gcc/rtl-ssa/accesses.h | 13 +
 gcc/rtl-ssa/member-fns.inl | 29 +
 2 files changed, 42 insertions(+)

diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
index 6a3ecd32848..c57b8a8b7b5 100644
--- a/gcc/rtl-ssa/accesses.h
+++ b/gcc/rtl-ssa/accesses.h
@@ -357,6 +357,10 @@ public:
   //next_use () && next_use ()->is_in_any_insn () ? next_use () : nullptr
   use_info *next_any_insn_use () const;
 
+  // Return the next use by a debug instruction, or null if none.
+  // This is only valid if is_in_debug_insn ().
+  use_info *next_debug_insn_use () const;
+
   // Return the previous use by a phi node in the list, or null if none.
   //
   // This is only valid if is_in_phi ().  It is equivalent to:
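@@ -458,6 +462,8 @@ using reverse_use_iterator = list_iterator<use_info, &use_info::prev_use>;
 // of use in the same definition.
 using nondebug_insn_use_iterator
   = list_iterator<use_info, &use_info::next_nondebug_insn_use>;
+using debug_insn_use_iterator
+  = list_iterator<use_info, &use_info::next_debug_insn_use>;
 using any_insn_use_iterator
   = list_iterator<use_info, &use_info::next_any_insn_use>;
 using phi_use_iterator = list_iterator<use_info, &use_info::prev_phi_use>;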
@@ -680,6 +686,10 @@ public:
   use_info *first_nondebug_insn_use () const;
   use_info *last_nondebug_insn_use () const;
 
+  // Return the first use of the set by debug instructions, or null if
+  // there is no such use.
+  use_info *first_debug_insn_use () const;
+
   // Return the first use of the set by any kind of instruction, or null
   // if there are no such uses.  The uses are in the order described above.
   use_info *first_any_insn_use () const;
@@ -731,6 +741,9 @@ public:
   // List the uses of the set by nondebug instructions, in reverse postorder.
   iterator_range<nondebug_insn_use_iterator> nondebug_insn_uses () const;
 
+  // List the uses of the set by debug instructions, in reverse postorder.
+  iterator_range<debug_insn_use_iterator> debug_insn_uses () const;
+
   // Return nondebug_insn_uses () in reverse order.
   iterator_range<reverse_use_iterator> reverse_nondebug_insn_uses () const;
 
diff --git a/gcc/rtl-ssa/member-fns.inl b/gcc/rtl-ssa/member-fns.inl
index 8e1c17ced95..e4825ad2a18 100644
--- a/gcc/rtl-ssa/member-fns.inl
+++ b/gcc/rtl-ssa/member-fns.inl
@@ -119,6 +119,15 @@ use_info::next_any_insn_use () const
   return nullptr;
 }
 
+inline use_info *
+use_info::next_debug_insn_use () const
+{
+  if (auto use = next_use ())
+    if (use->is_in_debug_insn ())
+      return use;
+  return nullptr;
+}
+
 inline use_info *
 use_info::prev_phi_use () const
 {
@@ -212,6 +221,20 @@ set_info::last_nondebug_insn_use () const
   return nullptr;
 }
 
+inline use_info *
+set_info::first_debug_insn_use () const
+{
+  use_info *use;
+  if (has_nondebug_insn_uses ())
+    use = last_nondebug_insn_use ()->next_use ();
+  else
+    use = first_use ();
+
+  if (use && use->is_in_debug_insn ())
+    return use;
+  return nullptr;
+}
+
 inline use_info *
 set_info::first_any_insn_use () const
 {
@@ -310,6 +333,12 @@ set_info::nondebug_insn_uses () const
   return { first_nondebug_insn_use (), nullptr };
 }
 
+inline iterator_range<debug_insn_use_iterator>
+set_info::debug_insn_uses () const
+{
+  return { first_debug_insn_use (), nullptr };
+}
+
 inline iterator_range<reverse_use_iterator>
 set_info::reverse_nondebug_insn_uses () const
 {

[PATCH 2/3] aarch64: Re-parent trailing nondebug base reg uses [PR113089]

2024-01-19 Thread Alex Coplan

While working on PR113089, I realised we were missing code to re-parent
trailing nondebug uses of the base register in the case of cancelling
writeback in the load/store pair pass.  This patch fixes that.

Bootstrapped/regtested as a series on aarch64-linux-gnu (with/without
the pass enabled), OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR target/113089
* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
Update trailing nondebug uses of the base register in the case
of cancelling writeback.
---
 gcc/config/aarch64/aarch64-ldp-fusion.cc | 24 
 1 file changed, 24 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 70b75c668ce..4d7fd72c6b1 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -1693,6 +1693,30 @@ ldp_bb_info::fuse_pair (bool load_p,
 
   if (trailing_add)
 changes.safe_push (make_delete (trailing_add));
+  else if ((writeback & 2) && !writeback_effect)
+    {
+      // The second insn initially had writeback but now the pair does not,
+      // need to update any nondebug uses of the base register def in the
+      // second insn.  We'll take care of debug uses later.
+      auto def = find_access (insns[1]->defs (), base_regno);
+      gcc_assert (def);
+      auto set = dyn_cast<set_info *> (def);
+      if (set && set->has_nondebug_uses ())
+        {
+          auto orig_use = find_access (insns[0]->uses (), base_regno);
+          for (auto use : set->nondebug_insn_uses ())
+            {
+              auto change = make_change (use->insn ());
+              change->new_uses = check_remove_regno_access (attempt,
+                                                            change->new_uses,
+                                                            base_regno);
+              change->new_uses = insert_access (attempt,
+                                                orig_use,
+                                                change->new_uses);
+              changes.safe_push (change);
+            }
+        }
+    }
 
   auto is_changing = insn_is_changing (changes);
   for (unsigned i = 0; i < changes.length (); i++)

[PATCH 3/3] aarch64: Fix up debug uses in ldp/stp pass [PR113089]

2024-01-19 Thread Alex Coplan

As the PR shows, we were missing code to update debug uses in the
load/store pair fusion pass.  This patch fixes that.

Note that this patch depends on the following patch to create new uses
in RTL-SSA, submitted as part of the fixes for PR113070:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642919.html

The patch tries to give a complete treatment of the debug uses that will
be affected by the changes we make, and in particular makes an effort to
preserve debug info where possible, e.g. when re-ordering an update of
a base register by a constant over a debug use of that register.  When
re-ordering loads over a debug use of a transfer register, we reset the
debug insn.  Likewise when re-ordering stores over debug uses of mem.

While doing this I noticed that try_promote_writeback used a strange
choice of move_range for the pair insn, in that it chose the previous
nondebug insn instead of the insn itself.  Since the insn is being
changed, these move ranges are equivalent (at least in terms of nondebug
insn placement as far as RTL-SSA is concerned), but I think it is more
natural to choose the pair insn itself.  This is needed to avoid
incorrectly updating some debug uses.

Notes on testing:
 - The series was bootstrapped/regtested on top of the fixes for
   PR113070 and PR113356.  It seemed to make more sense to test with
   correct use/def info, and as mentioned above, this patch depends on
   one of the PR113070 patches.
 - I also ran the testsuite with -g -funroll-loops -mearly-ldp-fusion
   -mlate-ldp-fusion to try and flush out more issues, and worked
   through some examples where writeback updates were triggered to
   make sure it was doing the right thing.
 - The patches also survived an LTO+PGO bootstrap with
   --enable-languages=all (with the passes enabled).

Bootstrapped/regtested as a series on aarch64-linux-gnu (with/without
the pass enabled).  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR target/113089
* config/aarch64/aarch64-ldp-fusion.cc (reset_debug_use): New.
(fixup_debug_use): New.
(fixup_debug_uses_trailing_add): New.
(fixup_debug_uses): New. Use it ...
(ldp_bb_info::fuse_pair): ... here.
(try_promote_writeback): Call fixup_debug_uses_trailing_add to
fix up debug uses of the base register that are affected by
folding in the trailing add insn.

gcc/testsuite/ChangeLog:

PR target/113089
* gcc.c-torture/compile/pr113089.c: New test.
---
 gcc/config/aarch64/aarch64-ldp-fusion.cc  | 332 +-
 .../gcc.c-torture/compile/pr113089.c  |  26 ++
 2 files changed, 351 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr113089.c

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 4d7fd72c6b1..fd0278e7acf 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -1342,6 +1342,309 @@ ldp_bb_info::track_tombstone (int uid)
 gcc_unreachable (); // Bit should have changed.
 }
 
+// Reset the debug insn containing USE (the debug insn has been
+// optimized away).
+static void
+reset_debug_use (use_info *use)
+{
+  auto use_insn = use->insn ();
+  auto use_rtl = use_insn->rtl ();
+  insn_change change (use_insn);
+  change.new_uses = {};
+  INSN_VAR_LOCATION_LOC (use_rtl) = gen_rtx_UNKNOWN_VAR_LOC ();
+  crtl->ssa->change_insn (change);
+}
+
+// USE is a debug use that needs updating because DEF (a def of the same
+// register) is being re-ordered over it.  If BASE is non-null, then DEF
+// is an update of the register BASE by a constant, given by WB_OFFSET,
+// and we can preserve debug info by accounting for the change in side
+// effects.
+static void
+fixup_debug_use (obstack_watermark &attempt,
+		 use_info *use,
+		 def_info *def,
+		 rtx base,
+		 poly_int64 wb_offset)
+{
+  auto use_insn = use->insn ();
+  if (base)
+{
+  auto use_rtl = use_insn->rtl ();
+  insn_change change (use_insn);
+
+  gcc_checking_assert (REG_P (base) && use->regno () == REGNO (base));
+  change.new_uses = check_remove_regno_access (attempt,
+		   change.new_uses,
+		   use->regno ());
+
+  // The effect of the writeback is to add WB_OFFSET to BASE.  If
+  // we're re-ordering DEF below USE, then we update USE by adding
+  // WB_OFFSET to it.  Otherwise, if we're re-ordering DEF above
+  // USE, we update USE by undoing the effect of the writeback
+  // (subtracting WB_OFFSET).
+  use_info *new_use;
+  if (*def->insn () > *use_insn)
+	{
+	  // We now need USE_INSN to consume DEF.  Create a new use of DEF.
+	  //
+	  // N.B. this means until we call change_insns for the main change
+	  // group we will temporarily have a debug use consuming a def that
+	  // comes after it, but RTL-SSA doesn't currently support updating
+	  // debug insns as part of the main change group (together with
+	  // nondebug changes

Re: [PATCH V2] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Robin Dapp

Ah, interesting that this was it.  Thanks for fixing and also
thanks to Andrew for suggesting that fix.

Regards
 Robin

[PATCH] c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as

2024-01-19 Thread Rainer Orth

The new g++.dg/ext/attr-section2*.C tests FAIL on Solaris/SPARC with the
native assembler:

+FAIL: g++.dg/ext/attr-section2.C  -std=c++14  scan-assembler .(section|csect)[ \t]+.foo
+FAIL: g++.dg/ext/attr-section2.C  -std=c++17  scan-assembler .(section|csect)[ \t]+.foo
+FAIL: g++.dg/ext/attr-section2.C  -std=c++20  scan-assembler .(section|csect)[ \t]+.foo

The problem is that the SPARC assembler requires the section name to be
double-quoted, like

	.section	".foo%_Z3varIiE",#alloc,#write,#progbits

This patch allows for that.  At the same time, it quotes literal dots in
the REs.
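
As a quick cross-check of the two patterns outside the testsuite, here is a
minimal sketch using POSIX extended regular expressions.  This is an
assumption: DejaGnu matches with Tcl regexps, which should agree with POSIX
ERE for these simple patterns, and the re_matches helper is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches anywhere in
   TEXT, 0 if it does not, and -1 if PATTERN fails to compile.  */
int
re_matches (const char *pattern, const char *text)
{
  assert (pattern != NULL && text != NULL);
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}

/* The scan-assembler patterns before and after the patch, written as
   C string literals (so each regex backslash is doubled).  */
const char old_re[] = ".(section|csect)[ \t]+.foo";
const char new_re[] = "\\.(section|csect)[ \t]+\"?\\.foo";
```

The old pattern fails on the quoted form because its last "." consumes the
double quote and the following character is then "." rather than "f".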

Tested on sparc-sun-solaris2.11 (as and gas) and i386-pc-solaris2.11 (as
and gas).

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-01-18  Rainer Orth  

gcc/testsuite:
* g++.dg/ext/attr-section2.C (scan-assembler): Quote dots.  Allow
for double-quoted section name.
* g++.dg/ext/attr-section2a.C: Likewise.
* g++.dg/ext/attr-section2b.C: Likewise.

# HG changeset patch
# Parent  7d0c57f448dfceb02583a770e09fa7907720ba7e
c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as

diff --git a/gcc/testsuite/g++.dg/ext/attr-section2.C b/gcc/testsuite/g++.dg/ext/attr-section2.C
--- a/gcc/testsuite/g++.dg/ext/attr-section2.C
+++ b/gcc/testsuite/g++.dg/ext/attr-section2.C
@@ -6,4 +6,4 @@ template
 
 template int var;
 
-// { dg-final { scan-assembler {.(section|csect)[ \t]+.foo} } }
+// { dg-final { scan-assembler {\.(section|csect)[ \t]+"?\.foo} } }
diff --git a/gcc/testsuite/g++.dg/ext/attr-section2a.C b/gcc/testsuite/g++.dg/ext/attr-section2a.C
--- a/gcc/testsuite/g++.dg/ext/attr-section2a.C
+++ b/gcc/testsuite/g++.dg/ext/attr-section2a.C
@@ -11,4 +11,4 @@ int A::var = 42;
 
 template struct A;
 
-// { dg-final { scan-assembler {.(section|csect)[ \t]+.foo} } }
+// { dg-final { scan-assembler {\.(section|csect)[ \t]+"?\.foo} } }
diff --git a/gcc/testsuite/g++.dg/ext/attr-section2b.C b/gcc/testsuite/g++.dg/ext/attr-section2b.C
--- a/gcc/testsuite/g++.dg/ext/attr-section2b.C
+++ b/gcc/testsuite/g++.dg/ext/attr-section2b.C
@@ -9,4 +9,4 @@ int* fun() {
 
 template int* fun();
 
-// { dg-final { scan-assembler {.(section|csect)[ \t]+.foo} } }
+// { dg-final { scan-assembler {\.(section|csect)[ \t]+"?\.foo} } }

Re: HELP: Questions on unshare_expr

2024-01-19 Thread Richard Biener

On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao  wrote:
>
>
>
> > On Jan 17, 2024, at 1:43 AM, Richard Biener  
> > wrote:
> >
> > On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
> >  wrote:
> >>
> >> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao  wrote:
> >>>
> >>>
> >>>
>  On Jan 15, 2024, at 4:31 AM, Richard Biener  
>  wrote:
> 
> > All my questions about unshare_expr relate to an LTO bug that I am currently
> > stuck on when using .ACCESS_WITH_SIZE in the bounds sanitizer (only with
> > -flto; without -flto there is no issue):
> >
> > [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> > during IPA pass: modref
> > t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not 
> > supported in LTO streams
> > 0x14c3993 lto_write_tree
> >   ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> > 0x14c3aeb lto_output_tree_1
> >
> > And the value of the tree node that triggered the ICE is:
> > (gdb) call debug_tree(expr)
> > 
> >   nothrow
> >   def_stmt
> >   version:13 in-free-list>
> >
> > Is there any good way to debug LTO bug?
> 
>  This happens usually when you have a VLA type and its type fields are not
>  properly gimplified which usually happens because the frontend fails to
>  insert a gimplification point for it (a DECL_EXPR).
> >>>
> >>> I found an old gcc bug
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> >>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since 
> >>> r11-3303-g6450f07388f9fe57
> >>>
> >>> Which is very similar to the bug I am having right now.
> >>>
> >>> After further study, I suspect that the issue I am having right now with
> >>> the LTO streaming also relates to “unshare_expr”, “save_expr”, and the
> >>> combination of the two: I suspect that current gcc cannot handle that
> >>> combination correctly for my case.
> >>>
> >>> My testing case is:
> >>>
> >>> #include <stdlib.h>
> >>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, 
> >>> int m)
> >>> {
> >>>   struct foo {
> >>>   int n;
> >>>   int p[][n2][n1] __attribute__((counted_by(n)));
> >>>   } *f;
> >>>
> >>>   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
> >>>   f->n = m;
> >>>   f->p[m][n2][n1]=1;
> >>>   return;
> >>> }
> >>>
> >>> int main(int argc, char *argv[])
> >>> {
> >>>  setup_and_test_vla (10, 11, 20);
> >>>  return 0;
> >>> }
> >>>
> >>> Failed with
> >>> my_gcc -Os -fsanitize=bounds -flto
> >>>
> >>> If changing either n1 or n2 to a constant, the testing passed.
> >>> If deleting -flto, the testing passed too.
> >>>
> >>> I double checked my code per the suggestions provided by you and Jakub in 
> >>> this
> >>> email thread, and I think the code should be fine.
> >>>
> >>> The code is as follows:
> >>>
> >>> =
> >>> /* Instrument array bounds for INDIRECT_REFs whose pointers are
> >>>    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
> >>>    builtins that gets expanded in the sanopt pass, and make an array
> >>>    dimension of it.  ARRAY is the pointer to the base of the array,
> >>>    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
> >>>    beginning of array.
> >>>    Return NULL_TREE if no instrumentation is emitted.  */
> >>>
> >>> tree
> >>> ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
> >>> {
> >>>   if (!is_access_with_size_p (array))
> >>>     return NULL_TREE;
> >>>   tree bound = get_bound_from_access_with_size (array);
> >>>   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
> >>>      the element of the array.  */
> >>>   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
> >>>   gcc_assert (bound);
> >>>
> >>>   /* Given the offset, and the size of each element, the index can be
> >>>      computed as: offset/element_size.  */
> >>>   *offset = save_expr (*offset);
> >>>   tree index = fold_build2 (EXACT_DIV_EXPR,
> >>>                             sizetype, *offset,
> >>>                             unshare_expr (element_size));
> >>>   /* Create a "(T *) 0" tree node to describe the original array type.
> >>>      We get the original array type from the first argument of the call to
> >>>      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
> >>>
> >>>      Originally, REF is a COMPONENT_REF with the original array type,
> >>>      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
> >>>      first operand is the original COMPONENT_REF.  */
> >>>   tree ref = CALL_EXPR_ARG (array, 0);
> >>>   tree array_type
> >>>     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
> >>>   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
> >>>   return build_

[PATCH 0/5] RISC-V: Relax the -march string for accept any order

2024-01-19 Thread juzhe.zh...@rivai.ai

Hi, kito.

I found the following regressions:

FAIL: gcc.target/riscv/arch-27.c   -O0   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O0  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -O1   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O1  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -O2   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O2  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -O3 -g   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -O3 -g  (test for excess errors)
FAIL: gcc.target/riscv/arch-27.c   -Os   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-27.c   -Os  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O0   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O0  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O1   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O1  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O2   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O2  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -O3 -g   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -O3 -g  (test for excess errors)
FAIL: gcc.target/riscv/arch-28.c   -Os   at line 7 (test for errors, line )
FAIL: gcc.target/riscv/arch-28.c   -Os  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O0   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O0  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O1   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O1  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O2   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O2  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -O3 -g   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -O3 -g  (test for excess errors)
FAIL: gcc.target/riscv/attribute-10.c   -Os   at line 8 (test for errors, line )
FAIL: gcc.target/riscv/attribute-10.c   -Os  (test for excess errors)

Could you take a look at it?
I am not sure whether they are caused by this patch, but it is the only 
patch I found that looks related.


juzhe.zh...@rivai.ai

Re: [PATCH 0/5] RISC-V: Relax the -march string for accept any order

2024-01-19 Thread Kito Cheng

Oh, ok, I must have missed something during testing.

On Fri, Jan 19, 2024 at 5:37 PM juzhe.zh...@rivai.ai
 wrote:
>
> Hi, kito.
>
> I found the following regressions:
>
> FAIL: gcc.target/riscv/arch-27.c   -O0   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O0  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -O1   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O1  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -O2   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O2  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -O3 -g   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -O3 -g  (test for excess errors)
> FAIL: gcc.target/riscv/arch-27.c   -Os   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-27.c   -Os  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O0   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O0  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O1   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O1  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O2   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O2  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -O3 -g   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -O3 -g  (test for excess errors)
> FAIL: gcc.target/riscv/arch-28.c   -Os   at line 7 (test for errors, line )
> FAIL: gcc.target/riscv/arch-28.c   -Os  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O0   at line 8 (test for errors, 
> line )
> FAIL: gcc.target/riscv/attribute-10.c   -O0  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O1   at line 8 (test for errors, 
> line )
> FAIL: gcc.target/riscv/attribute-10.c   -O1  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O2   at line 8 (test for errors, 
> line )
> FAIL: gcc.target/riscv/attribute-10.c   -O2  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none   at line 8 (test for errors, line )
> FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects   at line 8 (test for errors, line )
> FAIL: gcc.target/riscv/attribute-10.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -O3 -g   at line 8 (test for errors, 
> line )
> FAIL: gcc.target/riscv/attribute-10.c   -O3 -g  (test for excess errors)
> FAIL: gcc.target/riscv/attribute-10.c   -Os   at line 8 (test for errors, 
> line )
> FAIL: gcc.target/riscv/attribute-10.c   -Os  (test for excess errors)
>
> Could you take a look at it?
> I am not sure whether they are caused by this patch, but it is the only 
> patch I found that looks related.
> 
> juzhe.zh...@rivai.ai

[PATCH v2] RISC-V: Documnet the list of supported extensions

2024-01-19 Thread Kito Cheng

Try to list all supported extensions: name, version and a short description
for each extension.

v2 changes:
 - Fix several typos.
 - Add expansion info for vector crypto extensions.
 - Drop zvl8192b, zvl16384b, zvl32768b and zvl65536b.
 - Add zicntr and zihpm.

gcc/ChangeLog:

* doc/invoke.texi (RISC-V Options): Add list of supported
extensions.
---
 gcc/doc/invoke.texi | 461 
 1 file changed, 461 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c0e513c8f27..313f363f5f2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -30113,6 +30113,467 @@ syntax @samp{p} or @samp{}, 
(e.g.@: @samp{m2p1} or
 @samp{m2}).
 @end table
 
+Supported extensions are listed below:
+@multitable @columnfractions .10 .10 .80
+@headitem Extension Name @tab Supported Version @tab Description
+@item i
+@tab 2.0, 2.1
+@tab Base integer extension.
+
+@item e
+@tab 2.0
+@tab Reduced base integer extension.
+
+@item g
+@tab -
+@tab General-purpose computing base extension, @samp{g} will expand to
+@samp{i}, @samp{m}, @samp{a}, @samp{f}, @samp{d}, @samp{zicsr} and
+@samp{zifencei}.
+
+@item m
+@tab 2.0
+@tab Integer multiplication and division extension.
+
+@item a
+@tab 2.0, 2.1
+@tab Atomic extension.
+
+@item f
+@tab 2.0, 2.2
+@tab Single-precision floating-point extension.
+
+@item d
+@tab 2.0, 2.2
+@tab Double-precision floating-point extension.
+
+@item c
+@tab 2.0
+@tab Compressed extension.
+
+@item h
+@tab 1.0
+@tab Hypervisor extension.
+
+@item v
+@tab 1.0
+@tab Vector extension.
+
+@item zicsr
+@tab 2.0
+@tab Control and status register access extension.
+
+@item zifencei
+@tab 2.0
+@tab Instruction-fetch fence extension.
+
+@item zicond
+@tab 1.0
+@tab Integer conditional operations extension.
+
+@item zawrs
+@tab 1.0
+@tab Wait-on-reservation-set extension.
+
+@item zba
+@tab 1.0
+@tab Address calculation extension.
+
+@item zbb
+@tab 1.0
+@tab Basic bit manipulation extension.
+
+@item zbc
+@tab 1.0
+@tab Carry-less multiplication extension.
+
+@item zbs
+@tab 1.0
+@tab Single-bit operation extension.
+
+@item zfinx
+@tab 1.0
+@tab Single-precision floating-point in integer registers extension.
+
+@item zdinx
+@tab 1.0
+@tab Double-precision floating-point in integer registers extension.
+
+@item zhinx
+@tab 1.0
+@tab Half-precision floating-point in integer registers extension.
+
+@item zhinxmin
+@tab 1.0
+@tab Minimal half-precision floating-point in integer registers extension.
+
+@item zbkb
+@tab 1.0
+@tab Cryptography bit-manipulation extension.
+
+@item zbkc
+@tab 1.0
+@tab Cryptography carry-less multiply extension.
+
+@item zbkx
+@tab 1.0
+@tab Cryptography crossbar permutation extension.
+
+@item zkne
+@tab 1.0
+@tab AES Encryption extension.
+
+@item zknd
+@tab 1.0
+@tab AES Decryption extension.
+
+@item zknh
+@tab 1.0
+@tab Hash function extension.
+
+@item zkr
+@tab 1.0
+@tab Entropy source extension.
+
+@item zksed
+@tab 1.0
+@tab SM4 block cipher extension.
+
+@item zksh
+@tab 1.0
+@tab SM3 hash function extension.
+
+@item zkt
+@tab 1.0
+@tab Data independent execution latency extension.
+
+@item zk
+@tab 1.0
+@tab Standard scalar cryptography extension.
+
+@item zkn
+@tab 1.0
+@tab NIST algorithm suite extension.
+
+@item zks
+@tab 1.0
+@tab ShangMi algorithm suite extension.
+
+@item zihintntl
+@tab 1.0
+@tab Non-temporal locality hints extension.
+
+@item zihintpause
+@tab 1.0
+@tab Pause hint extension.
+
+@item zicboz
+@tab 1.0
+@tab Cache-block zero extension.
+
+@item zicbom
+@tab 1.0
+@tab Cache-block management extension.
+
+@item zicbop
+@tab 1.0
+@tab Cache-block prefetch extension.
+
+@item zicntr
+@tab 2.0
+@tab Standard extension for base counters and timers.
+
+@item zihpm
+@tab 2.0
+@tab Standard extension for hardware performance counters.
+
+@item ztso
+@tab 1.0
+@tab Total store ordering extension.
+
+@item zve32x
+@tab 1.0
+@tab Vector extensions for embedded processors.
+
+@item zve32f
+@tab 1.0
+@tab Vector extensions for embedded processors.
+
+@item zve64x
+@tab 1.0
+@tab Vector extensions for embedded processors.
+
+@item zve64f
+@tab 1.0
+@tab Vector extensions for embedded processors.
+
+@item zve64d
+@tab 1.0
+@tab Vector extensions for embedded processors.
+
+@item zvl32b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl64b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl128b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl256b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl512b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl1024b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl2048b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvl4096b
+@tab 1.0
+@tab Minimum vector length standard extensions
+
+@item zvbb
+@tab 1.0
+@tab Vector basic bit-manipulation extension.
+
+@item zvbc
+@tab 1.0
+@tab Vector carryless multiplication extension.
+

Re: [PATCH v2] RISC-V: Documnet the list of supported extensions

2024-01-19 Thread juzhe.zh...@rivai.ai

LGTM.



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-01-19 17:40
To: gcc-patches; kito.cheng; jim.wilson.gcc; palmer; andrew; jeffreyalaw; 
christoph.muellner; juzhe.zhong; rep.dot.nop
CC: Kito Cheng
Subject: [PATCH v2] RISC-V: Documnet the list of supported extensions

Re: [PATCH] RISC-V: Tweak the wording for the sorry message

2024-01-19 Thread rep . dot . nop

On 19 January 2024 03:41:57 CET, Kito Cheng  wrote:
>Thanks, pushed to trunk :)

Thanks, but don't you have to update the tests too, at least
gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c?

thanks

>
>On Fri, Jan 19, 2024 at 10:36 AM juzhe.zh...@rivai.ai
> wrote:
>>
>> OK
>>
>> 
>> juzhe.zh...@rivai.ai
>>
>>
>> From: Kito Cheng
>> Date: 2024-01-19 10:34
>> To: rep.dot.nop; jeffreyalaw; rdapp.gcc; juzhe.zhong; gcc-patches
>> CC: Kito Cheng
>> Subject: [PATCH] RISC-V: Tweak the wording for the sorry message
>> Use "does not" rather than "cannot", because it's an implementation issue.
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv.cc (riscv_override_options_internal): Tweak
>> sorry message.
>> ---
>> gcc/config/riscv/riscv.cc | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index f1d5129397f..dd6e68a08c2 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -8798,13 +8798,13 @@ riscv_override_options_internal (struct gcc_options 
>> *opts)
>>   We can only allow TARGET_MIN_VLEN * 8 (LMUL) < 65535.  */
>>if (TARGET_MIN_VLEN_OPTS (opts) > 4096)
>> -sorry ("Current RISC-V GCC cannot support VLEN greater than 4096bit for 
>> "
>> +sorry ("Current RISC-V GCC does not support VLEN greater than 4096bit 
>> for "
>>"'V' Extension");
>>/* FIXME: We don't support RVV in big-endian for now, we may enable RVV 
>> with
>>   big-endian after finishing full coverage testing.  */
>>if (TARGET_VECTOR && TARGET_BIG_ENDIAN)
>> -sorry ("Current RISC-V GCC cannot support RVV in big-endian mode");
>> +sorry ("Current RISC-V GCC does not support RVV in big-endian mode");
>>/* Convert -march to a chunks count.  */
>>riscv_vector_chunks = riscv_convert_vector_bits (opts);
>> --
>> 2.34.1
>>
>>
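
The limit mentioned in the quoted comment can be checked with a little
arithmetic.  The sketch below is not GCC code: MAX_LMUL, BOUND,
MAX_MIN_VLEN, fits_bound and vlen_is_supported are hypothetical names,
and it only restates the two conditions visible in the hunk
(TARGET_MIN_VLEN * 8 (LMUL) < 65535, and the "> 4096" check guarding
sorry ()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is in riscv.cc.  */
enum { MAX_LMUL = 8, BOUND = 65535, MAX_MIN_VLEN = 4096 };

/* True when min_vlen * LMUL stays below the bound quoted in the
   comment above the check.  */
static bool
fits_bound (unsigned min_vlen)
{
  return min_vlen * MAX_LMUL < BOUND;
}

/* Mirrors the condition that guards the sorry () call: any minimum
   VLEN above 4096 bits is reported as unsupported.  */
static bool
vlen_is_supported (unsigned min_vlen)
{
  assert (min_vlen > 0);
  return min_vlen <= MAX_MIN_VLEN;
}
```

For power-of-two lengths, 4096 * 8 = 32768 fits below the bound while
8192 * 8 = 65536 does not, which is consistent with rejecting anything
greater than 4096.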

[PATCH] tree-optimization/113494 - Fix two observed regressions with r14-8206

2024-01-19 Thread Richard Biener

The following handles the situation where we lack a loop-closed
PHI for a virtual operand because a loop exit goes to a code
region not having any virtual use (an endless loop).  It also
handles the situation of edge redirection re-allocating a PHI node
in the destination block so we have to re-lookup that before
populating the new PHI argument.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/113494
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Handle endless loop on exit.  Handle re-allocated PHI.
---
 gcc/tree-vect-loop-manip.cc | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 983ed2e9b1f..1477906e96e 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1629,11 +1629,17 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop 
*loop, edge loop_exit,
  alt_loop_exit_block = split_edge (exit);
if (!need_virtual_phi)
  continue;
-   if (vphi_def && !vphi)
- vphi = create_phi_node (copy_ssa_name (vphi_def),
- alt_loop_exit_block);
if (vphi_def)
- add_phi_arg (vphi, vphi_def, exit, UNKNOWN_LOCATION);
+ {
+   if (!vphi)
+ vphi = create_phi_node (copy_ssa_name (vphi_def),
+ alt_loop_exit_block);
+   else
+ /* Edge redirection might re-allocate the PHI node
+so we have to rediscover it.  */
+ vphi = get_virtual_phi (alt_loop_exit_block);
+   add_phi_arg (vphi, vphi_def, exit, UNKNOWN_LOCATION);
+ }
  }
 
  set_immediate_dominator (CDI_DOMINATORS, new_preheader,
@@ -1748,7 +1754,17 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop 
*loop, edge loop_exit,
  if (virtual_operand_p (alt_arg))
{
  gphi *vphi = get_virtual_phi (alt_loop_exit_block);
- alt_arg = gimple_phi_result (vphi);
+ /* ???  When the exit yields to a path without
+any virtual use we can miss a LC PHI for the
+live virtual operand.  Simply choosing the
+one live at the start of the loop header isn't
+correct, but we should get here only with
+early-exit vectorization which will move all
+defs after the main exit, so leave a temporarily
+wrong virtual operand in place.  This happens
+for gcc.c-torture/execute/20150611-1.c  */
+ if (vphi)
+   alt_arg = gimple_phi_result (vphi);
}
  edge main_e = single_succ_edge (alt_loop_exit_block);
  SET_PHI_ARG_DEF_ON_EDGE (to_phi, main_e, alt_arg);
-- 
2.35.3
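
The second fix is an instance of a general hazard: a pointer obtained
before an operation that may reallocate the underlying object has to be
looked up again afterwards.  The sketch below is plain C with
hypothetical names (struct phi, struct block, block_phi, grow_phi,
add_arg); it does not use GCC's real PHI API and only illustrates the
re-lookup pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; GCC's gphi and basic_block are different.  */
struct phi { size_t capacity; size_t nargs; int args[]; };
struct block { struct phi *phi; };

/* Look up the node currently attached to the block.  */
static struct phi *
block_phi (struct block *bb)
{
  return bb->phi;
}

/* Growing may move the node, much like edge redirection may
   re-allocate a PHI.  Returns 0 on success, -1 on allocation failure.  */
static int
grow_phi (struct block *bb)
{
  int fresh = (bb->phi == NULL);
  size_t cap = fresh ? 1 : bb->phi->capacity * 2;
  struct phi *p = realloc (bb->phi, sizeof *p + cap * sizeof p->args[0]);
  if (p == NULL)
    return -1;
  if (fresh)
    p->nargs = 0;
  p->capacity = cap;
  bb->phi = p;
  return 0;
}

/* Append an argument, re-fetching the node after a possible move.  */
static int
add_arg (struct block *bb, int value)
{
  struct phi *phi = block_phi (bb);
  if (phi == NULL || phi->nargs == phi->capacity)
    {
      if (grow_phi (bb) != 0)
	return -1;
      /* The old pointer may be stale now, so rediscover the node.  */
      phi = block_phi (bb);
    }
  phi->args[phi->nargs++] = value;
  return 0;
}
```

Keeping the old pointer across grow_phi would be the analogue of the
bug: it may still appear to work until the allocator actually moves the
node.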

Re: [PATCH] RISC-V: Documnet the list of supported extensions

2024-01-19 Thread Kito Cheng

Hi Bernhard:

Thanks for such a careful review! V2 sent :)

On Tue, Jan 16, 2024 at 4:08 AM Bernhard Reutner-Fischer
 wrote:
>
> Hi Kito!
>
> On Thu, 11 Jan 2024 17:06:09 +0800
> Kito Cheng  wrote:
>
> > Try to list all supported extensions: name, version and few description
> > for each extension.
> >
> > gcc/ChangeLog:
> >
> >   * doc/invoke.texi (RISC-V Options): Add list of supported
> >   extensions.
> > ---
> >  gcc/doc/invoke.texi | 463 
> >  1 file changed, 463 insertions(+)
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 68d1f364ac0..58271f2f28e 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -30037,6 +30037,469 @@ Generate code for given RISC-V ISA (e.g.@: 
> > @samp{rv64im}).  ISA strings must be
> >  lower-case.  Examples include @samp{rv64i}, @samp{rv32g}, @samp{rv32e}, and
> >  @samp{rv32imaf}.
> >
> > +Supported extension are list below:
>
> are listed
>
> > +@multitable @columnfractions .10 .10 .80
> > +@headitem Extension Name @tab Supported Version @tab Description
> > +@item i
> > +@tab 2.0, 2.1
> > +@tab Base integer extension.
> > +
> > +@item e
> > +@tab 2.0
> > +@tab Reduced base integer extension.
> > +
> > +@item g
> > +@tab -
> > +@tab General-purpose computing base extension, @samp{g} will expand to
> > +@samp{i}, @samp{m}, @samp{a}, @samp{f}, @samp{d}, @samp{zicsr} and
> > +@samp{zifencei}.
> > +
> > +@item m
> > +@tab 2.0
> > +@tab Integer multiplication and division extension.
> > +
> > +@item a
> > +@tab 2.0, 2.1
> > +@tab Atomic extension.
> > +
> > +@item f
> > +@tab 2.0, 2.2
> > +@tab Single-precision floating-point extension.
> > +
> > +@item d
> > +@tab 2.0, 2.2
> > +@tab Double-precision floating-point extension.
> > +
> > +@item c
> > +@tab 2.0
> > +@tab Compressed extension.
> > +
> > +@item h
> > +@tab 1.0
> > +@tab Hypervisor extension.
> > +
> > +@item v
> > +@tab 1.0
> > +@tab Vector extension.
> > +
> > +@item zicsr
> > +@tab 2.0
> > +@tab Control and status register access extension.
> > +
> > +@item zifencei
> > +@tab 2.0
> > +@tab Instruction-fetch fence extension.
> > +
> > +@item zicond
> > +@tab 1.0
> > +@tab Integer conditional operations extension.
> > +
> > +@item zawrs
> > +@tab 1.0
> > +@tab Wait-on-reservation-set extension.
> > +
> > +@item zba
> > +@tab 1.0
> > +@tab Address calculation extension.
> > +
> > +@item zbb
> > +@tab 1.0
> > +@tab Basic bit manipulation extension.
> > +
> > +@item zbc
> > +@tab 1.0
> > +@tab Carry-less multiplication extension.
> > +
> > +@item zbs
> > +@tab 1.0
> > +@tab Single-bit operation extension.
> > +
> > +@item zfinx
> > +@tab 1.0
> > +@tab Single-precision floating-ioint in integer registers extension.
>
> s/ioint/point/g
> above and below.
>
> > +
> > +@item zdinx
> > +@tab 1.0
> > +@tab Double-precision floating-ioint in integer registers extension.
> > +
> > +@item zhinx
> > +@tab 1.0
> > +@tab Half-precision floating-ioint in integer registers extension.
> > +
> > +@item zhinxmin
> > +@tab 1.0
> > +@tab Minimal half-precision floating-ioint in integer registers extension.
> > +
> > +@item zbkb
> > +@tab 1.0
> > +@tab Cryptography bit-manipulation extension.
> > +
> > +@item zbkc
> > +@tab 1.0
> > +@tab Cryptography carry-less multiply extension.
> > +
> > +@item zbkx
> > +@tab 1.0
> > +@tab Cryptography crossbar permutation extension.
> > +
> > +@item zkne
> > +@tab 1.0
> > +@tab AES Encryption extension.
> > +
> > +@item zknd
> > +@tab 1.0
> > +@tab AES Decryption extension.
> > +
> > +@item zknh
> > +@tab 1.0
> > +@tab Hash function extension.
> > +
> > +@item zkr
> > +@tab 1.0
> > +@tab Entropy source extension.
> > +
> > +@item zksed
> > +@tab 1.0
> > +@tab SM4 block cipher extension.
> > +
> > +@item zksh
> > +@tab 1.0
> > +@tab SM3 hash function extension.
> > +
> > +@item zkt
> > +@tab 1.0
> > +@tab Data independent execution latency extension.
> > +
> > +@item zk
> > +@tab 1.0
> > +@tab Standard scalar cryptography extension.
> > +
> > +@item zkn
> > +@tab 1.0
> > +@tab NIST algorithm suite extension.
>
> For @item g you document which extensions this will expand to, do you
> want to list the expansions here, too?
>
> ISTM that
> https://riscv.org/blog/2021/09/risc-v-cryptography-extensions-task-group-announces-public-review-of-the-scalar-cryptography-extensions/
> lists
> Zkn – NIST Algorithm Suite (shorthand for Zknd_Zkne_Zknh_Zbkb_Zbkc_Zbkx)
> Zks – ShangMi Algorithm Suite  (shorthand for Zksed_Zksh_Zbkb_Zbkc_Zbkx)
> Zk – Standard scalar cryptography extension (shorthand for Zkn_Zkt_Zkr)
>
> > +
> > +@item zks
> > +@tab 1.0
> > +@tab ShangMi algorithm suite extension.
> > +
> > +@item zihintntl
> > +@tab 1.0
> > +@tab Non-temporal locality hints extension.
> > +
> > +@item zihintpause
> > +@tab 1.0
> > +@tab Pause hint extension.
> > +
> > +@item zicboz
> > +@tab 1.0
> > +@tab Cache-block zero extension.
> > +
> > +@item zicbom
> > +@tab 1.0
> > +@tab Cache-block management extension.
> > +
> > +@i
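
The shorthand expansions discussed above can be written down as a small
lookup table.  This is only a sketch: struct shorthand and
expand_shorthand are hypothetical and not how GCC implements implied
extensions.  The table contents are copied from the "g" entry of the
patch and from the three Zk lines quoted from the blog post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One shorthand extension and the extensions it stands for.  */
struct shorthand
{
  const char *name;
  const char *expansion;
};

static const struct shorthand shorthands[] = {
  { "g",   "i m a f d zicsr zifencei" },
  { "zkn", "zknd zkne zknh zbkb zbkc zbkx" },
  { "zks", "zksed zksh zbkb zbkc zbkx" },
  { "zk",  "zkn zkt zkr" },
};

/* Return the space-separated expansion of NAME, or NULL when NAME is
   not one of the shorthands listed above.  Only one level is
   expanded: "zk" yields "zkn", which needs a second lookup.  */
static const char *
expand_shorthand (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof shorthands / sizeof shorthands[0]; i++)
    if (strcmp (shorthands[i].name, name) == 0)
      return shorthands[i].expansion;
  return NULL;
}
```

A table like this is also roughly the information the documentation
would need to carry if the expansions are listed next to each
shorthand, as is already done for "g".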

Re: Re: [PATCH] RISC-V: Tweak the wording for the sorry message

2024-01-19 Thread juzhe.zh...@rivai.ai

Yeah. There is a regression here:

Executing on host: 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/
  
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-1.c
  -march=rv64gcv -mabi=lp64d -mcmodel=medany   -fdiagnostics-plain-output   
-march=rv64gcv -mabi=lp64d -mbig-endian -O3 -S   -o big_endian-1.s(timeout 
= 60)
spawn -ignore SIGHUP 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/
 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-1.c
 -march=rv64gcv -mabi=lp64d -mcmodel=medany -fdiagnostics-plain-output 
-march=rv64gcv -mabi=lp64d -mbig-endian -O3 -S -o big_endian-1.s^M
cc1: sorry, unimplemented: Current RISC-V GCC does not support RVV in 
big-endian mode^M
compiler exited with status 1
XFAIL: gcc.target/riscv/rvv/base/big_endian-1.c (test for excess errors)
Excess errors:
cc1: sorry, unimplemented: Current RISC-V GCC does not support RVV in 
big-endian mode

Executing on host: 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/
  
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c
  -march=rv64gcv -mabi=lp64d -mcmodel=medany   -fdiagnostics-plain-output   
-march=rv64gc_zve32x -mabi=lp64d -mbig-endian -O3 -S   -o big_endian-2.s
(timeout = 60)
spawn -ignore SIGHUP 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/xgcc
 
-B/work/home/jzzhong/work/docker/riscv-gnu-toolchain/build/dev-rv64gcv-lp64d-medany-newlib-spike-release-m1-scalable/build-gcc-newlib-stage2/gcc/
 
/work/home/jzzhong/work/docker/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c
 -march=rv64gcv -mabi=lp64d -mcmodel=medany -fdiagnostics-plain-output 
-march=rv64gc_zve32x -mabi=lp64d -mbig-endian -O3 -S -o big_endian-2.s^M
cc1: sorry, unimplemented: Current RISC-V GCC does not support RVV in 
big-endian mode^M
compiler exited with status 1
XFAIL: gcc.target/riscv/rvv/base/big_endian-2.c (test for excess errors)
Excess errors:
cc1: sorry, unimplemented: Current RISC-V GCC does not support RVV in 
big-endian mode

I think zvl_unimplemented-1.c and zvl_unimplemented-2.c should also be fixed.



juzhe.zh...@rivai.ai
 
From: rep.dot.nop
Date: 2024-01-19 17:40
To: Kito Cheng; juzhe.zh...@rivai.ai
CC: Kito.cheng; jeffreyalaw; Robin Dapp; gcc-patches
Subject: Re: [PATCH] RISC-V: Tweak the wording for the sorry message

[PATCH] debug/113488 - DW_AT_abstract_origin to self

2024-01-19 Thread Richard Biener

The new sanity check avoiding creation of DIE refs to self triggers
on the PR's testcase when using -g1 and -ffat-lto-objects: while
early DWARF with -g1 doesn't contain any DIEs for LABEL_DECLs, later
cloning will still mark DECLs as if it did, via
dwarf2out_abstract_function calling set_block_origin_self.

Instead of messing with the delicate setup of dwarf2out at this stage,
the following simply rectifies things after the fact during LTO
streaming: when the decl indicates there's an early DIE but there
isn't one, fix up that indication.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR debug/113488
* lto-streamer-in.cc (lto_read_tree_1): When there isn't
an early DIE but there should be, do not pretend there is.
---
 gcc/lto-streamer-in.cc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index bef0743b2cd..ad0ca24007a 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1746,6 +1746,11 @@ lto_read_tree_1 (class lto_input_block *ib, class 
data_in *data_in, tree expr)
  dref_entry e = { expr, str, off };
  dref_queue.safe_push (e);
}
+  /* When there's no early DIE to refer to but dwarf2out set up
+things in a way to expect that fixup.  This tends to happen
+with -g1, see for example PR113488.  */
+  else if (DECL_P (expr) && DECL_ABSTRACT_ORIGIN (expr) == expr)
+   DECL_ABSTRACT_ORIGIN (expr) = NULL_TREE;
 }
 }
 
-- 
2.35.3
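
The fix boils down to clearing a self-reference that has nothing behind
it.  A minimal sketch in plain C, assuming a hypothetical struct decl
with an abstract_origin pointer and a has_early_die flag standing in
for the preceding branch of the real code (GCC's trees and the
DECL_ABSTRACT_ORIGIN macro are not modelled here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration node.  */
struct decl
{
  struct decl *abstract_origin;
};

/* When no early DIE exists but the decl points at itself as its own
   abstract origin, drop that indication.  Anything else is left
   untouched.  */
static void
fixup_missing_early_die (struct decl *d, bool has_early_die)
{
  assert (d != NULL);
  if (!has_early_die && d->abstract_origin == d)
    d->abstract_origin = NULL;
}
```

Only the self-referencing case is reset; a decl whose origin is a
different decl keeps it, matching the narrow condition in the patch.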

Re: [PATCH v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

2024-01-19 Thread Jiahao Xu

The test case gcc.dg/tree-ssa/copy-headers-8.c fails for a target where 
LOGICAL_OP_NON_SHORT_CIRCUIT is defined as 0. It is suggested to add 
`--param logical-op-non-short-circuit=1` to the test case to make it 
target-independent.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111248.

On 2024/1/19 at 4:54 PM, chenglulu wrote:

Hi, Jiahao:

This patch introduces additional FAILs, and the reason needs to be explained.

+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional 
combines static and invariant" 1
+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate 
bb" 2
+FAIL: gcc.dg/tree-ssa/update-threading.c scan-tree-dump-times optimized "Invalid 
sum" 0
On 2024/1/16 at 10:32 AM, Jiahao Xu wrote:

Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0 so that short-circuit conditions are
expanded as short-circuit branches rather than as non-short-circuit operations.

A SPEC2017 evaluation shows a 1% improvement in the fprate GEOMEAN and no
obvious regression elsewhere. Notably, 526.blender_r improves by 10.6%
on the 3A6000.
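
To make the distinction concrete, here is a minimal C sketch of the two
evaluation strategies. The function names are made up for illustration and
do not appear in the patch. As far as I understand it, a nonzero
LOGICAL_OP_NON_SHORT_CIRCUIT allows GCC to turn a chain of side-effect-free
`||` tests into something like the second form, while defining it as 0
keeps the branchy first form.

```c
#include <assert.h>
#include <stdbool.h>

/* Short-circuit form: evaluation stops at the first true comparison,
   which typically maps to one conditional branch per test.  */
static bool
outside_short_circuit (const float *a)
{
  return a[0] > a[3] || a[1] < a[2] || a[0] > a[5];
}

/* Non-short-circuit form: every comparison is evaluated and the 0/1
   results are combined with bitwise OR, leaving a single final test.  */
static bool
outside_non_short_circuit (const float *a)
{
  return (a[0] > a[3]) | (a[1] < a[2]) | (a[0] > a[5]);
}

/* The comparisons have no side effects, so both forms must agree.  */
static void
check_forms_agree (void)
{
  const float none_true[6] = { 0, 1, 0, 1, 0, 1 };
  const float first_true[6] = { 2, 1, 0, 1, 0, 1 };

  assert (!outside_short_circuit (none_true));
  assert (!outside_non_short_circuit (none_true));
  assert (outside_short_circuit (first_true));
  assert (outside_non_short_circuit (first_true));
}
```

Both functions return the same value for any input; which one is faster
depends on the target's branch cost, which is what the SPEC numbers above
are measuring.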

gcc/ChangeLog:

* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/short-circuit.c: New test.

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 4e6ede926d3..8b453ab3140 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -869,6 +869,7 @@ typedef struct {
 1 is the default; other values are interpreted relative to that.  */
  
  #define BRANCH_COST(speed_p, predictable_p) la_branch_cost

+#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
  
  /* Return the asm template for a conditional branch instruction.

 OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c 
b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
new file mode 100644
index 000..bed585ee172
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
+
+int
+short_circuit (float *a)
+{
+  float t1x = a[0];
+  float t2x = a[1];
+  float t1y = a[2];
+  float t2y = a[3];
+  float t1z = a[4];
+  float t2z = a[5];
+
+  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < 
t1z)
+return 0;
+
+  return 1;
+}
+/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */

Re: [PATCH] libgomp: Fix build for -fshort-enums

2024-01-19 Thread Sebastian Huber


On 11.09.23 14:57, Sebastian Huber wrote:

On 04.07.23 08:20, Sebastian Huber wrote:

On 22.05.23 14:51, Sebastian Huber wrote:
Make sure that the API enums have at least the size of int.  Otherwise the
following build error may occur:

In file included from gcc/libgomp/env.c:34:
./libgomp_f.h: In function 'omp_check_defines':
./libgomp_f.h:77:8: error: size of array 'test' is negative
    77 |   char test[(28 != sizeof (omp_lock_t)
   |    ^~~~

libgomp/ChangeLog:

* omp.h.in (omp_alloctrait_key_t):  Add __omp_alloctrait_key_t_max__
with a value of the int type maximum.
---
  libgomp/omp.h.in | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index bd1286c2a3f..3b1612fcb15 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -146,7 +146,8 @@ typedef enum omp_alloctrait_key_t
    omp_atk_fallback = 5,
    omp_atk_fb_data = 6,
    omp_atk_pinned = 7,
-  omp_atk_partition = 8
+  omp_atk_partition = 8,
+  __omp_alloctrait_key_t_max__ = __INT_MAX__
  } omp_alloctrait_key_t;
  typedef enum omp_alloctrait_value_t


Could someone please have a look at this?


Ping.


Any chance to get this integrated for GCC 14?
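
The idea behind the fix can be sketched in isolation. The enum below is a
hypothetical stand-in, not the real omp_alloctrait_key_t: adding an
enumerator whose value is the int maximum forces the enum's underlying type
to be at least as wide as int, even when the translation unit is built with
-fshort-enums.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical API enum.  Without the last enumerator, -fshort-enums
   may shrink this type to a single byte, since all other values fit.  */
typedef enum demo_trait_key_t
{
  demo_key_fallback = 5,
  demo_key_partition = 8,
  demo_key_max__ = INT_MAX	/* sentinel: keeps the type int-sized */
} demo_trait_key_t;

/* Checked at compile time, so a size mismatch is caught by the build.  */
_Static_assert (sizeof (demo_trait_key_t) >= sizeof (int),
		"API enum must be at least as wide as int");

/* Helper so callers can inspect the size at run time as well.  */
static size_t
demo_trait_key_size (void)
{
  return sizeof (demo_trait_key_t);
}
```

Dropping the sentinel and compiling with -fshort-enums should make the
static assertion fail on GCC, which is a quick way to see the effect.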

--
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Commercial register number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

Re: [PATCH v2] RISC-V: Documnet the list of supported extensions

2024-01-19 Thread Kito Cheng

Pushed to trunk, thanks :)

On Fri, Jan 19, 2024 at 5:41 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Kito Cheng
> Date: 2024-01-19 17:40
> To: gcc-patches; kito.cheng; jim.wilson.gcc; palmer; andrew; jeffreyalaw; 
> christoph.muellner; juzhe.zhong; rep.dot.nop
> CC: Kito Cheng
> Subject: [PATCH v2] RISC-V: Documnet the list of supported extensions
> Try to list all supported extensions: name, version and a short description
> for each extension.
>
> v2 changes:
> - Fix several typos.
> - Add expansion info for vector crypto extensions.
> - Drop zvl8192b, zvl16384b, zvl32768b and zvl65536b.
> - Add zicntr and zihpm.
>
> gcc/ChangeLog:
>
> * doc/invoke.texi (RISC-V Options): Add list of supported
> extensions.
> ---
> gcc/doc/invoke.texi | 461 
> 1 file changed, 461 insertions(+)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c0e513c8f27..313f363f5f2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -30113,6 +30113,467 @@ syntax @samp{p} or @samp{}, 
> (e.g.@: @samp{m2p1} or
> @samp{m2}).
> @end table
> +Supported extension are listed below:
> +@multitable @columnfractions .10 .10 .80
> +@headitem Extension Name @tab Supported Version @tab Description
> +@item i
> +@tab 2.0, 2.1
> +@tab Base integer extension.
> +
> +@item e
> +@tab 2.0
> +@tab Reduced base integer extension.
> +
> +@item g
> +@tab -
> +@tab General-purpose computing base extension, @samp{g} will expand to
> +@samp{i}, @samp{m}, @samp{a}, @samp{f}, @samp{d}, @samp{zicsr} and
> +@samp{zifencei}.
> +
> +@item m
> +@tab 2.0
> +@tab Integer multiplication and division extension.
> +
> +@item a
> +@tab 2.0, 2.1
> +@tab Atomic extension.
> +
> +@item f
> +@tab 2.0, 2.2
> +@tab Single-precision floating-point extension.
> +
> +@item d
> +@tab 2.0, 2.2
> +@tab Double-precision floating-point extension.
> +
> +@item c
> +@tab 2.0
> +@tab Compressed extension.
> +
> +@item h
> +@tab 1.0
> +@tab Hypervisor extension.
> +
> +@item v
> +@tab 1.0
> +@tab Vector extension.
> +
> +@item zicsr
> +@tab 2.0
> +@tab Control and status register access extension.
> +
> +@item zifencei
> +@tab 2.0
> +@tab Instruction-fetch fence extension.
> +
> +@item zicond
> +@tab 1.0
> +@tab Integer conditional operations extension.
> +
> +@item zawrs
> +@tab 1.0
> +@tab Wait-on-reservation-set extension.
> +
> +@item zba
> +@tab 1.0
> +@tab Address calculation extension.
> +
> +@item zbb
> +@tab 1.0
> +@tab Basic bit manipulation extension.
> +
> +@item zbc
> +@tab 1.0
> +@tab Carry-less multiplication extension.
> +
> +@item zbs
> +@tab 1.0
> +@tab Single-bit operation extension.
> +
> +@item zfinx
> +@tab 1.0
> +@tab Single-precision floating-point in integer registers extension.
> +
> +@item zdinx
> +@tab 1.0
> +@tab Double-precision floating-point in integer registers extension.
> +
> +@item zhinx
> +@tab 1.0
> +@tab Half-precision floating-point in integer registers extension.
> +
> +@item zhinxmin
> +@tab 1.0
> +@tab Minimal half-precision floating-point in integer registers extension.
> +
> +@item zbkb
> +@tab 1.0
> +@tab Cryptography bit-manipulation extension.
> +
> +@item zbkc
> +@tab 1.0
> +@tab Cryptography carry-less multiply extension.
> +
> +@item zbkx
> +@tab 1.0
> +@tab Cryptography crossbar permutation extension.
> +
> +@item zkne
> +@tab 1.0
> +@tab AES Encryption extension.
> +
> +@item zknd
> +@tab 1.0
> +@tab AES Decryption extension.
> +
> +@item zknh
> +@tab 1.0
> +@tab Hash function extension.
> +
> +@item zkr
> +@tab 1.0
> +@tab Entropy source extension.
> +
> +@item zksed
> +@tab 1.0
> +@tab SM4 block cipher extension.
> +
> +@item zksh
> +@tab 1.0
> +@tab SM3 hash function extension.
> +
> +@item zkt
> +@tab 1.0
> +@tab Data independent execution latency extension.
> +
> +@item zk
> +@tab 1.0
> +@tab Standard scalar cryptography extension.
> +
> +@item zkn
> +@tab 1.0
> +@tab NIST algorithm suite extension.
> +
> +@item zks
> +@tab 1.0
> +@tab ShangMi algorithm suite extension.
> +
> +@item zihintntl
> +@tab 1.0
> +@tab Non-temporal locality hints extension.
> +
> +@item zihintpause
> +@tab 1.0
> +@tab Pause hint extension.
> +
> +@item zicboz
> +@tab 1.0
> +@tab Cache-block zero extension.
> +
> +@item zicbom
> +@tab 1.0
> +@tab Cache-block management extension.
> +
> +@item zicbop
> +@tab 1.0
> +@tab Cache-block prefetch extension.
> +
> +@item zicntr
> +@tab 2.0
> +@tab Standard extension for base counters and timers.
> +
> +@item zihpm
> +@tab 2.0
> +@tab Standard extension for hardware performance counters.
> +
> +@item ztso
> +@tab 1.0
> +@tab Total store ordering extension.
> +
> +@item zve32x
> +@tab 1.0
> +@tab Vector extensions for embedded processors.
> +
> +@item zve32f
> +@tab 1.0
> +@tab Vector extensions for embedded processors.
> +
> +@item zve64x
> +@tab 1.0
> +@tab Vector extensions for embedded processors.
> +
> +@item zve64f
> +@tab 1.0
> +@tab Vector extensions for embedded processor

[committed] RISC-V: Update testcase due to message update

2024-01-19 Thread Kito Cheng

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-27.c: Update scan message.
* gcc.target/riscv/arch-28.c: Ditto.
* gcc.target/riscv/attribute-10.c: Ditto.
* gcc.target/riscv/rvv/base/big_endian-2.c: Ditto.
* gcc.target/riscv/rvv/base/zvl-unimplemented-1.c: Ditto.
* gcc.target/riscv/rvv/base/zvl-unimplemented-2.c: Ditto.
---
 gcc/testsuite/gcc.target/riscv/arch-27.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/arch-28.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/attribute-10.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c| 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-2.c | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/arch-27.c 
b/gcc/testsuite/gcc.target/riscv/arch-27.c
index 03f07deedd1..95cebc1a2da 100644
--- a/gcc/testsuite/gcc.target/riscv/arch-27.c
+++ b/gcc/testsuite/gcc.target/riscv/arch-27.c
@@ -4,4 +4,4 @@ int foo()
 {
 }
 
-/* { dg-error "'i', 'e' or 'g' must be the first extension" "" { target *-*-* 
} 0 } */
+/* { dg-error "i, e or g must be the first extension" "" { target *-*-* } 0 } 
*/
diff --git a/gcc/testsuite/gcc.target/riscv/arch-28.c 
b/gcc/testsuite/gcc.target/riscv/arch-28.c
index 0f83c03ad3d..21c748edf5c 100644
--- a/gcc/testsuite/gcc.target/riscv/arch-28.c
+++ b/gcc/testsuite/gcc.target/riscv/arch-28.c
@@ -4,4 +4,4 @@ int foo()
 {
 }
 
-/* { dg-error "'i', 'e' or 'g' must be the first extension" "" { target *-*-* 
} 0 } */
+/* { dg-error "i, e or g must be the first extension" "" { target *-*-* } 0 } 
*/
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-10.c 
b/gcc/testsuite/gcc.target/riscv/attribute-10.c
index 8a7f0a8ac49..4aaa2bbcd45 100644
--- a/gcc/testsuite/gcc.target/riscv/attribute-10.c
+++ b/gcc/testsuite/gcc.target/riscv/attribute-10.c
@@ -5,4 +5,4 @@ int foo()
 }
 /* { dg-error "extension 'u' is unsupported standard single letter extension" 
"" { target { "riscv*-*-*" } } 0 } */
 /* { dg-error "extension 'n' is unsupported standard single letter extension" 
"" { target { "riscv*-*-*" } } 0 } */
-/* { dg-error "'i', 'e' or 'g' must be the first extension" "" { target { 
"riscv*-*-*" } } 0 } */
+/* { dg-error "i, e or g must be the first extension" "" { target { 
"riscv*-*-*" } } 0 } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c
index 86cf58370bf..45cc97e1f01 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/big_endian-2.c
@@ -2,4 +2,4 @@
 /* { dg-options "-march=rv64gc_zve32x -mabi=lp64d -mbig-endian -O3" } */
 
 #pragma riscv intrinsic "vector"
-vint32m1_t foo (vint32m1_t) {} // { dg-excess-errors "sorry, unimplemented: 
Current RISC-V GCC cannot support RVV in big-endian mode" }
+vint32m1_t foo (vint32m1_t) {} // { dg-excess-errors "sorry, unimplemented: 
Current RISC-V GCC does not support RVV in big-endian mode" }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-1.c
index 03f67035ca4..1912a2457c7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-1.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -march=rv64gcv_zvl8192b -mabi=lp64d --param 
riscv-autovec-preference=fixed-vlmax" } */
 
-void foo () {} // { dg-excess-errors "sorry, unimplemented: Current RISC-V GCC 
can not support VLEN > 4096bit for 'V' Extension" }
+void foo () {} // { dg-excess-errors "sorry, unimplemented: Current RISC-V GCC 
does not support VLEN > 4096bit for 'V' Extension" }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-2.c
index 075112f2f81..884e834fb90 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvl-unimplemented-2.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -march=rv64gcv_zvl8192b -mabi=lp64d --param 
riscv-autovec-preference=scalable" } */
 
-void foo () {} // { dg-excess-errors "sorry, unimplemented: Current RISC-V GCC 
can not support VLEN > 4096bit for 'V' Extension" }
+void foo () {} // { dg-excess-errors "sorry, unimplemented: Current RISC-V GCC 
does not support VLEN > 4096bit for 'V' Extension" }
-- 
2.34.1

Re: [committed] RISC-V: Update testcase due to message update

2024-01-19 Thread juzhe.zh...@rivai.ai

Ok.



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-01-19 18:08
To: rep.dot.nop; jeffreyalaw; rdapp.gcc; juzhe.zhong; gcc-patches
CC: Kito Cheng
Subject: [committed] RISC-V: Update testcase due to message update

[PATCH] MIPS: Accept arguments for -mexplicit-relocs

2024-01-19 Thread YunQiang Su

GAS has supported explicit relocs since 2001, and %pcrel_hi/low were
introduced in 2014.  In the future, we may introduce more.

Let's convert the -mexplicit-relocs option to accept the arguments:
none, base, pcrel.

We also update gcc/configure.ac to set the default value to the level
that gas supports when GCC itself is built.
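
As a rough illustration of the design (not the actual GCC option
machinery), the three arguments can be thought of as ordered levels where
each level implies the ones before it. All names in this sketch are
hypothetical; the patch itself uses an .opt Enum and the macros
TARGET_EXPLICIT_RELOCS and TARGET_EXPLICIT_RELOCS_PCREL.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the levels named in the patch.  The order
   matters: a higher level implies support for the lower ones.  */
enum relocs_level
{
  RELOCS_NONE,
  RELOCS_BASE,
  RELOCS_PCREL
};

/* Map an option argument to a level; returns false for unknown text.  */
static bool
parse_relocs_level (const char *arg, enum relocs_level *out)
{
  static const char *const names[] = { "none", "base", "pcrel" };

  for (int i = 0; i < 3; i++)
    if (strcmp (arg, names[i]) == 0)
      {
	*out = (enum relocs_level) i;
	return true;
      }
  return false;
}

/* Because the levels are ordered, each predicate is one comparison.  */
static bool
uses_explicit_relocs (enum relocs_level level)
{
  return level >= RELOCS_BASE;
}

static bool
uses_pcrel_relocs (enum relocs_level level)
{
  return level >= RELOCS_PCREL;
}
```

Keeping the levels ordered means a future, more capable level could be
appended without touching the existing predicates.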

gcc
* configure.ac: Detect the explicit relocs support for
mips, and define C macro MIPS_EXPLICIT_RELOCS.
* config.in: Regenerated.
* configure: Regenerated.
* doc/invoke.texi(MIPS Options): Add -mexplicit-relocs.
* config/mips/mips-opts.h: Define enum mips_explicit_relocs.
* config/mips/mips.cc(mips_set_compression_mode): Sorry if
!TARGET_EXPLICIT_RELOCS instead of just set it.
* config/mips/mips.h: Define TARGET_EXPLICIT_RELOCS and
TARGET_EXPLICIT_RELOCS_PCREL with mips_opt_explicit_relocs.
* config/mips/mips.opt: Introduce -mexplicit-relocs= option
and define -m(no-)explicit-relocs as aliases.
---
 gcc/config.in   |  6 +
 gcc/config/mips/mips-opts.h |  7 +
 gcc/config/mips/mips.cc |  5 ++--
 gcc/config/mips/mips.h  |  8 ++
 gcc/config/mips/mips.opt| 25 --
 gcc/configure   | 51 -
 gcc/configure.ac| 21 +++
 gcc/doc/invoke.texi | 16 
 8 files changed, 124 insertions(+), 15 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 99fd2d89fe3..ce1d073833f 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -2356,6 +2356,12 @@
 #endif
 
 
+/* Define if assembler supports %reloc. */
+#ifndef USED_FOR_TARGET
+#undef MIPS_EXPLICIT_RELOCS
+#endif
+
+
 /* Define if host mkdir takes a single argument. */
 #ifndef USED_FOR_TARGET
 #undef MKDIR_TAKES_ONE_ARG
diff --git a/gcc/config/mips/mips-opts.h b/gcc/config/mips/mips-opts.h
index 57bdbdfa721..4b0c2c09a3d 100644
--- a/gcc/config/mips/mips-opts.h
+++ b/gcc/config/mips/mips-opts.h
@@ -53,4 +53,11 @@ enum mips_cb_setting {
   MIPS_CB_OPTIMAL,
   MIPS_CB_ALWAYS
 };
+
+/* Enumerates the setting of the -mexplicit-relocs= option.  */
+enum mips_explicit_relocs {
+  MIPS_EXPLICIT_RELOCS_NONE,
+  MIPS_EXPLICIT_RELOCS_BASE,
+  MIPS_EXPLICIT_RELOCS_PCREL
+};
 #endif
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 30e99811ff6..68e2ae8d8fa 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -20033,8 +20033,6 @@ mips_set_compression_mode (unsigned int 
compression_mode)
 call.  */
   flag_move_loop_invariants = 0;
 
-  target_flags |= MASK_EXPLICIT_RELOCS;
-
   /* Experiments suggest we get the best overall section-anchor
 results from using the range of an unextended LW or SW.  Code
 that makes heavy use of byte or short accesses can do better
@@ -20064,6 +20062,9 @@ mips_set_compression_mode (unsigned int 
compression_mode)
 
   if (TARGET_MSA)
sorry ("MSA MIPS16 code");
+
+  if (!TARGET_EXPLICIT_RELOCS)
+   sorry ("MIPS16 requires %<-mexplicit-relocs%>");
 }
   else
 {
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 8768933ba37..7145d23c650 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -145,6 +145,14 @@ struct mips_cpu_info {
  || TARGET_MICROMIPS)  \
 && mips_cb != MIPS_CB_NEVER)
 
+/* True if assembler support %gp_rel etc.  */
+#define TARGET_EXPLICIT_RELOCS \
+  (mips_opt_explicit_relocs >= MIPS_EXPLICIT_RELOCS_BASE)
+
+/* True if assembler support %pcrel_hi/%pcrel_lo.  */
+#define TARGET_EXPLICIT_RELOCS_PCREL \
+  (mips_opt_explicit_relocs >= MIPS_EXPLICIT_RELOCS_PCREL)
+
 /* True if the output file is marked as ".abicalls; .option pic0"
(-call_nonpic).  */
 #define TARGET_ABICALLS_PIC0 \
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index e8b411a8ffe..ce36942aabe 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -145,9 +145,30 @@ meva
 Target Var(TARGET_EVA)
 Use Enhanced Virtual Addressing instructions.
 
+Enum
+Name(mips_explicit_relocs) Type(int)
+The code model option names for -mexplicit-relocs:
+
+EnumValue
+Enum(mips_explicit_relocs) String(none) Value(MIPS_EXPLICIT_RELOCS_NONE)
+
+EnumValue
+Enum(mips_explicit_relocs) String(base) Value(MIPS_EXPLICIT_RELOCS_BASE)
+
+EnumValue
+Enum(mips_explicit_relocs) String(pcrel) Value(MIPS_EXPLICIT_RELOCS_PCREL)
+
+mexplicit-relocs=
+Target RejectNegative Joined Enum(mips_explicit_relocs) 
Var(mips_opt_explicit_relocs) Init(MIPS_EXPLICIT_RELOCS)
+Use %reloc() assembly operators.
+
 mexplicit-relocs
-Target Mask(EXPLICIT_RELOCS)
-Use NewABI-style %reloc() assembly operators.
+Target RejectNegative Alias(mexplicit-relocs=,base)
+Use %reloc() assembly operators (for backward compatibility).
+
+mno-explicit-relocs
+Target RejectNegative Alias(mexplicit-relocs=,none)
+Don't use %reloc() assembly operators (for backward compatibility).
 
 mextern-sdata

Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Richard Biener

On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle  wrote:
>
>
> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> PLUS rather than IOR for disjunctive operations.  During expansion of
> these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
> where the constants C1 and C2 guarantee that bits don't overlap.
> Hence the IOR can be performed by any any_or_plus operation, such as
> IOR, XOR or PLUS; for word-size operations where carry chains aren't
> an issue these should all be equally fast (single-cycle) instructions.
> The benefit of this change is that targets with shift-and-add insns,
> like x86's lea, can benefit from the LSHIFT-ADD form.
>
> An example of a backend that benefits is ARC, which is demonstrated
> by these two simple functions:
>
> unsigned long long foo(unsigned long long x) { return x<<2; }
>
> which with -O2 is currently compiled to:
>
> foo:    lsr     r2,r0,30
>         asl_s   r1,r1,2
>         asl_s   r0,r0,2
>         j_s.d   [blink]
>         or_s    r1,r1,r2
>
> with this patch becomes:
>
> foo:    lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         asl_s   r0,r0,2
>
> unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
>
> which with -O2 is currently compiled to 6 insns + return:
>
> bar:    lsr     r12,r0,30
>         asl_s   r3,r1,2
>         asl_s   r0,r0,2
>         lsr_s   r1,r1,30
>         or_s    r0,r0,r1
>         j_s.d   [blink]
>         or      r1,r12,r3
>
> with this patch becomes 4 insns + return:
>
> bar:    lsr     r3,r1,30
>         lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         add2    r0,r3,r0
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?

For expand_shift_1 you add

+where C is the bitsize of A.  If N cannot be zero,
+use PLUS instead of IOR.

but I don't see a check ensuring this other than maybe CONST_INT_P (op1)
suggesting that we never end up with const0_rtx here.  OTOH why is
N zero a problem and why is it not in the optabs.cc case where I don't
see any such check (at least not obvious)?

Since this doesn't seem to fix a regression it probably has to wait for
stage1 to re-open.

Thanks,
Richard.

>
> 2024-01-18  Roger Sayle  
>
> gcc/ChangeLog
> * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
> to generate PLUS instead of IOR when unioning disjoint bitfields.
> * optabs.cc (expand_subword_shift): Likewise.
> (expand_binop): Likewise for double-word rotate.
>
>
> Thanks in advance,
> Roger
> --
>

Re: [PATCH] libgccjit: Add ability to get CPU features

2024-01-19 Thread Antoni Boucher

David: Ping.

On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote:
> On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds support for getting the CPU features in libgccjit
> > (bug
> > 112466)
> > 
> > There's a TODO in the test:
> > I'm not sure how to test that gcc_jit_target_info_arch returns the
> > correct value since it is dependent on the CPU.
> > Any idea on how to improve this?
> > 
> > Also, I created a CStringHash to be able to have a
> > std::unordered_set. Is there any built-in way of
> > doing
> > this?
> 
> Thanks for the patch.
> 
> Some high-level questions:
> 
> Is this specifically about detecting capabilities of the host that
> libgccjit is currently running on? or how the target was configured
> when libgccjit was built?
> 
> One of the benefits of libgccjit is that, in theory, we support all
> of
> the targets that GCC already supports.  Does this patch change that,
> or
> is this more about giving client code the ability to determine
> capabilities of the specific host being compiled for?
> 
> I'm nervous about having per-target jit code.  Presumably there's a
> reason that we can't reuse existing target logic here - can you
> please
> describe what the problem is.  I see that the ChangeLog has:
> 
> > * config/i386/i386-jit.cc: New file.
> 
> where i386-jit.cc has almost 200 lines of nontrivial code.  Where did
> this come from?  Did you base it on existing code in our source tree,
> making modifications to fit the new internal API, or did you write it
> from scratch?  In either case, how onerous would this be for other
> targets?
> 
> I'm not an expert at target hooks (or at the i386 backend), so if we
> do
> go with this approach I'd want someone else to review those parts of
> the patch.
> 
> Have you verified that GCC builds with this patch with jit *not*
> enabled in the enabled languages?
> 
> [...snip...]
> 
> A nitpick:
> 
> > +.. function:: const char * \
> > +  gcc_jit_target_info_arch (gcc_jit_target_info *info)
> > +
> > +   Get the architecture of the currently running CPU.
> 
> What does this string look like?
> How long does the pointer remain valid?
> 
> Thanks again; hope the above makes sense
> Dave
>

RE: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Roger Sayle

Hi Richard,

Thanks for the speedy review.  I completely agree this patch
can wait for stage1, but it's related to some recent work Andrew
Pinski has been doing in match.pd, so I thought I'd share it.

Hypothetically, recognizing (x<<4)+(x>>60) as a rotation at the
tree-level might lead to a code quality regression, if RTL
expansion doesn't know to lower it back to use PLUS on
those targets with lea but without rotate.

> From: Richard Biener 
> Sent: 19 January 2024 11:04
> On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle 
> wrote:
> >
> > This patch tweaks RTL expansion of multi-word shifts and rotates to
> > use PLUS rather than IOR for disjunctive operations.  During expansion
> > of these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
> > where the constants C1 and C2 guarantee that bits don't overlap.
> > Hence the IOR can be performed by any any_or_plus operation, such as
> > IOR, XOR or PLUS; for word-size operations where carry chains aren't
> > an issue these should all be equally fast (single-cycle) instructions.
> > The benefit of this change is that targets with shift-and-add insns,
> > like x86's lea, can benefit from the LSHIFT-ADD form.
> >
> > An example of a backend that benefits is ARC, which is demonstrated by
> > these two simple functions:
> >
> > unsigned long long foo(unsigned long long x) { return x<<2; }
> >
> > which with -O2 is currently compiled to:
> >
> > foo:    lsr     r2,r0,30
> >         asl_s   r1,r1,2
> >         asl_s   r0,r0,2
> >         j_s.d   [blink]
> >         or_s    r1,r1,r2
> >
> > with this patch becomes:
> >
> > foo:    lsr     r2,r0,30
> >         add2    r1,r2,r1
> >         j_s.d   [blink]
> >         asl_s   r0,r0,2
> >
> > unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62);
> > }
> >
> > which with -O2 is currently compiled to 6 insns + return:
> >
> > bar:    lsr     r12,r0,30
> >         asl_s   r3,r1,2
> >         asl_s   r0,r0,2
> >         lsr_s   r1,r1,30
> >         or_s    r0,r0,r1
> >         j_s.d   [blink]
> >         or      r1,r12,r3
> >
> > with this patch becomes 4 insns + return:
> >
> > bar:    lsr     r3,r1,30
> >         lsr     r2,r0,30
> >         add2    r1,r2,r1
> >         j_s.d   [blink]
> >         add2    r0,r3,r0
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> 
> For expand_shift_1 you add
> 
> +where C is the bitsize of A.  If N cannot be zero,
> +use PLUS instead of IOR.
> 
> but I don't see a check ensuring this other than maybe CONST_INT_P (op1)
> suggesting that we never end up with const0_rtx here.  OTOH why is N zero a
> problem and why is it not in the optabs.cc case where I don't see any such
> check (at least not obvious)?

Excellent question.   A common mistake in writing a rotate function in C
or C++ is to write something like (x>>n)|(x<<(64-n)) or (x<<n)|(x>>(64-n))
which invokes undefined behavior when n == 0.  It's OK to recognize these
as rotates (relying on the undefined behavior), but correct/portable code
(and RTL) needs the correct idiom (x>>n)|(x<<((-n)&63)), which never invokes
undefined behaviour.  One interesting property of this idiom is that a shift
by zero is then calculated as (x>>0)|(x<<0) which is x|x.  This should then
reveal the problem: for all non-zero values the IOR can be replaced by PLUS,
but for zero shifts, X|X isn't the same as X+X or X^X.
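
The point about zero counts can be checked with a small C sketch. The
function names are illustrative only; the sketch assumes a 64-bit operand
and an unsigned count, so that -n is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Portable rotate right: both shift counts are masked into 0..63, so
   no count of n invokes undefined behaviour.  */
static uint64_t
rotr64_ior (uint64_t x, unsigned n)
{
  return (x >> (n & 63)) | (x << ((-n) & 63));
}

/* Same idiom with PLUS in place of IOR.  */
static uint64_t
rotr64_plus (uint64_t x, unsigned n)
{
  return (x >> (n & 63)) + (x << ((-n) & 63));
}

/* For n in 1..63 the two shifted values occupy disjoint bits, so IOR
   and PLUS agree.  For n == 0 both operands are x itself, and x + x
   differs from x | x for any nonzero x.  */
static void
check_rotate_identities (void)
{
  const uint64_t x = 0x0123456789abcdefULL;

  for (unsigned n = 1; n < 64; n++)
    assert (rotr64_plus (x, n) == rotr64_ior (x, n));

  assert (rotr64_ior (x, 0) == x);
  assert (rotr64_plus (x, 0) == x + x);
  assert (rotr64_plus (x, 0) != x);
}
```

This is why a variable count has to keep using IOR unless something proves
the count is nonzero.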

This only applies for single word rotations, and not multi-word shifts
nor multi-word rotates, which explains why this test is only in one place.
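
For the multi-word shift case, a sketch along the following lines shows
why PLUS is always safe there. It models a 64-bit value as two 32-bit
words, roughly what a 32-bit target such as ARC has to do, and assumes a
count in the range 1..31. The names are hypothetical and this is not the
code in optabs.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a 64-bit value, held as two 32-bit words, left by n (1..31).  */
static uint64_t
shl64_by_words (uint32_t lo, uint32_t hi, unsigned n)
{
  uint32_t carry = lo >> (32 - n);	/* occupies bits 0..n-1 only */
  uint32_t new_hi = (hi << n) + carry;	/* bits n..31 plus bits 0..n-1 */
  uint32_t new_lo = lo << n;

  return ((uint64_t) new_hi << 32) | new_lo;
}

/* The shifted high word has its low n bits clear and the carry fits in
   those n bits, so the addition never carries and matches IOR.  */
static void
check_shift_by_words (void)
{
  const uint64_t x = 0x0123456789abcdefULL;

  for (unsigned n = 1; n < 32; n++)
    assert (shl64_by_words ((uint32_t) x, (uint32_t) (x >> 32), n)
	    == x << n);
}
```

The shift-then-add on the high word is the shape that a shift-and-add
instruction (add2 in the ARC listing, lea on x86) can implement directly.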

In theory, we could use ranger to check whether a rotate by a variable
amount can ever be by zero bits, but the simplification used here is to
continue using IOR for variable shifts, and PLUS for fixed/known shift
values.  The last remaining insight is that we only need to check for
CONST_INT_P, as rotations/shifts by const0_rtx are handled earlier in
this function (and eliminated by the tree-optimizers), i.e. rotation by
a known constant is implicitly a rotation by a known non-zero constant.

This is a little clearer if you read/cite more of the comment that was
changed.  Fortunately, this case is also well covered by the testsuite.
I'd be happy to change the code to read:

(CONST_INT_P (op1) && op1 != const0_rtx)
? add_optab
: ior_optab

But the test "if (op1 == const0_rtx)" already appears on line 2570
of expmed.cc.

> Since this doesn't seem to fix a regression it probably has to wait for
> stage1 to re-open.
> 
> Thanks,
> Richard.
> 
> > 2024-01-18  Roger Sayle  
> >
> > gcc/ChangeLog
> > * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
> > to generate PLUS instead of IOR when unioning disjoint bitfields.
> > * optabs.cc (expand_subword_shift): Likewise.
> > (expand_binop): Likewise for double-word rotate.
> >

Thanks again.

Re: [PATCH] testsuite: Disable test for PR113292 on targets without TLS support

2024-01-19 Thread Christophe Lyon

On Fri, 19 Jan 2024 at 08:41, Nathaniel Shead  wrote:
>
> Tested on x86_64-pc-linux-gnu using a cross-compiler to
> arm-unknown-linux-gnueabihf with --enable-threads=0 that the link test
> is correctly skipped. OK for trunk?
>
> -- >8 --
>
> This disables the new test added by r14-8168 on machines that don't have
> TLS support, such as bare-metal ARM.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/modules/pr113292_c.C: Require TLS.
>
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/testsuite/g++.dg/modules/pr113292_c.C | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/testsuite/g++.dg/modules/pr113292_c.C 
> b/gcc/testsuite/g++.dg/modules/pr113292_c.C
> index aa3f32ae818..c117c7cfcd4 100644
> --- a/gcc/testsuite/g++.dg/modules/pr113292_c.C
> +++ b/gcc/testsuite/g++.dg/modules/pr113292_c.C
> @@ -1,6 +1,8 @@
>  // PR c++/113292
>  // { dg-module-do link }
> +// { dg-add-options tls }
>  // { dg-additional-options "-fmodules-ts" }
> +// { dg-require-effective-target tls_runtime }
>
Hi,

Thanks, I think this is OK, although I think we prefer to put
dg-require before dg-add-options (after dg-module-do).

Christophe

>  import "pr113292_a.H";
>
> --
> 2.43.0
>

Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Richard Biener

On Fri, Jan 19, 2024 at 2:26 PM Roger Sayle  wrote:
>
>
> Hi Richard,
>
> Thanks for the speedy review.  I completely agree this patch
> can wait for stage1, but it's related to some recent work Andrew
> Pinski has been doing in match.pd, so I thought I'd share it.
>
> Hypothetically, recognizing (x<<4)+(x>>60) as a rotation at the
> tree-level might lead to a code quality regression, if RTL
> expansion doesn't know to lower it back to use PLUS on
> those targets with lea but without rotate.
>
> > From: Richard Biener 
> > Sent: 19 January 2024 11:04
> > On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle 
> > wrote:
> > >
> > > This patch tweaks RTL expansion of multi-word shifts and rotates to
> > > use PLUS rather than IOR for disjunctive operations.  During expansion
> > > > of these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
> > > where the constants C1 and C2 guarantee that bits don't overlap.
> > > Hence the IOR can be performed by any any_or_plus operation, such as
> > > IOR, XOR or PLUS; for word-size operations where carry chains aren't
> > > an issue these should all be equally fast (single-cycle) instructions.
> > > The benefit of this change is that targets with shift-and-add insns,
> > > like x86's lea, can benefit from the LSHIFT-ADD form.
> > >
> > > An example of a backend that benefits is ARC, which is demonstrated by
> > > these two simple functions:
> > >
> > > unsigned long long foo(unsigned long long x) { return x<<2; }
> > >
> > > which with -O2 is currently compiled to:
> > >
> > > > foo:    lsr     r2,r0,30
> > > >         asl_s   r1,r1,2
> > > >         asl_s   r0,r0,2
> > > >         j_s.d   [blink]
> > > >         or_s    r1,r1,r2
> > >
> > > with this patch becomes:
> > >
> > > > foo:    lsr     r2,r0,30
> > > >         add2    r1,r2,r1
> > > >         j_s.d   [blink]
> > > >         asl_s   r0,r0,2
> > >
> > > > unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
> > >
> > > which with -O2 is currently compiled to 6 insns + return:
> > >
> > > > bar:    lsr     r12,r0,30
> > > >         asl_s   r3,r1,2
> > > >         asl_s   r0,r0,2
> > > >         lsr_s   r1,r1,30
> > > >         or_s    r0,r0,r1
> > > >         j_s.d   [blink]
> > > >         or      r1,r12,r3
> > >
> > > with this patch becomes 4 insns + return:
> > >
> > > > bar:    lsr     r3,r1,30
> > > >         lsr     r2,r0,30
> > > >         add2    r1,r2,r1
> > > >         j_s.d   [blink]
> > > >         add2    r0,r3,r0
> > >
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32}
> > > with no new failures.  Ok for mainline?
> >
> > For expand_shift_1 you add
> >
> > +where C is the bitsize of A.  If N cannot be zero,
> > +use PLUS instead of IOR.
> >
> > but I don't see a check ensuring this other than maybe CONST_INT_P (op1)
> > suggesting that we never end up with const0_rtx here.  OTOH why is N zero a
> > problem and why is it not in the optabs.cc case where I don't see any such
> > check (at least not obvious)?
>
> Excellent question.   A common mistake in writing a rotate function in C
> or C++ is to write something like (x>>n)|(x<<(64-n)) or (x<<n)|(x>>(64-n))
> which invokes undefined behavior when n == 0.  It's OK to recognize these
> as rotates (relying on the undefined behavior), but correct/portable code
> (and RTL) needs the correct idiom (x>>n)|(x<<((-n)&63)), which never invokes
> undefined behaviour.  One interesting property of this idiom is that a shift
> by zero is then calculated as (x>>0)|(x<<0), which is x|x.  This reveals
> the problem: for all non-zero values the IOR can be replaced by PLUS,
> but for zero shifts, X|X isn't the same as X+X or X^X.
>
> This only applies for single word rotations, and not multi-word shifts
> nor multi-word rotates, which explains why this test is only in one place.
>
> In theory, we could use ranger to check whether a rotate by a variable
> amount can ever be by zero bits, but the simplification used here is to
> continue using IOR for variable shifts, and PLUS for fixed/known shift
> values.  The last remaining insight is that we only need to check for
> CONST_INT_P, as rotations/shifts by const0_rtx are handled earlier in
> this function (and eliminated by the tree-optimizers), i.e. rotation by
> a known constant is implicitly a rotation by a known non-zero constant.

Ah, I see.  It wasn't obvious the expmed.cc case was for rotations only.

The patch is OK as-is for stage1 (which also gives others plenty of time
to comment).

I wonder if you can add a testcase though?

Thanks,
Richard.

> This is a little clearer if you read/cite more of the comment that was
> changed.  Fortunately, this case is also well covered by the testsuite.
> I'd be happy to change the code to read:
>
> (CONST_INT_P (op1) && op1 != const0_rtx)
> ? add_optab
> : ior_optab
>
> But the test "if (op1 == const0_rtx)" already appears on line 2570
> of

[PATCHSET] Update of GCC upstream with gccrs development repository

2024-01-19 Thread Arthur Cohen


Hi everyone,

This patchset updates trunk with all of 2023's commits concerning the 
Rust GCC frontend.


We apologize for the large amount of changes - we will change our 
upstreaming process for 2024 and will update upstream on a more regular 
basis.


This patchset contains multiple improvements to our frontend. In no
particular order, this year, we've worked on the following topics:


1. Procedural macro support, which enables users to write complex macros 
reusing the compiler's lexer and parser. This required extensive changes 
to GCC's build system, as well as the development of a binary interface 
so that the Rust frontend could perform dynamic function calls to 
external libraries during macro expansion.


2. Parser improvements, and most notably, parser relaxation! In order to 
accept more "invalid" Rust concepts, we spent a great deal of time 
making the parser accept a lot more constructs, which are then being 
rejected in a later AST Validation pass. A recent example of why this is 
needed is present in the latest stable release of Rust [1], the 
beautifully named "RPITIT" feature - "return position impl trait in traits".


Previously, our parser would fail to parse trait functions returning 
`impl Trait` values, but it now correctly errors out, mentioning that 
they are being parsed but are not valid Rust. When we eventually catch 
up to modern Rust versions, it will allow us to enable their usage only 
if a specific nightly feature is used for example.


3. Closure support
4. The beginnings of a borrow checker framework
5. Iterators
6. More pattern matching
7. A new name resolution algorithm
8. Lots of cleanups
9. Derive macros, including built-in derive macros.
10. Rust error codes, which will help us attempt to pass the `rustc`
testsuite
11. New compiler intrinsics

And a lot more. All of this work was possible thanks to our contributors:

Abdul Rafey
bl7awy
Charalampos Mitrodimas
Dave Evans
David Malcolm
Emanuele Micheletti
goar5670
Guillaume Gomez
Jakub Dupak
Jiakun Fan
liushuyu
Mahmoud Mohamed
Marc Poulhiès
Matthew Jasper
Mohammed Rizan Farooqui
Muhammad Mahad
M V V S Manoj Kumar
mxlol233
Nikos Alexandris
omkar-mohanty
Owen Avery
Parthib
Philip Herron
Pierre-Emmanuel Patry
Raiki Tamura
Sebastian Kirmayer
Sergey Bugaev
Tage Johansson
Thomas Schwinge
TieWay59
vincent
Xiao Ma
Zheyuan Chen

And thanks to the continued support of Open Source Security, Inc. and
Embecosm.


We aim to bring the last missing pieces of the puzzle in order to 
compile `core` in the coming months, and we look forward to being part 
of the GCC 14.1 release.


[1]: https://blog.rust-lang.org/2023/12/28/Rust-1.75.0.html

Here is the list of patches we've added:

[PATCH 001/874] gccrs: Fix bootstrap build
[PATCH 002/874] gccrs: Fix missing build dependency
[PATCH 003/874] gccrs: Parse AltPattern
[PATCH 004/874] gccrs: Add feature gate for "rustc_attri".
[PATCH 005/874] gccrs: parser: Allow parsing of qualified type path
[PATCH 006/874] gccrs: parser: Allow `LEFT_SHIFT` to start
[PATCH 007/874] gccrs: typecheck: Refactor unify_site
[PATCH 008/874] gccrs: typecheck: Refactor coercion_site
[PATCH 009/874] gccrs: parser: Add parsing of auto traits
[PATCH 010/874] gccrs: macro_invoc_lexer: Add `split_current_token`
[PATCH 011/874] gccrs: ast: Add ExternalTypeItem node
[PATCH 012/874] gccrs: ast: Add proper visitors for ExternalTypeItem
[PATCH 013/874] gccrs: typecheck: Refactor cast_site
[PATCH 014/874] gccrs: parser: Parse `default` impl Functions and
[PATCH 015/874] gccrs: Implement and test include_str eager expansion
[PATCH 016/874] gccrs: parser: Parse external type item
[PATCH 017/874] gccrs: testsuite: Add extern type item test
[PATCH 018/874] gccrs: testsuite: Add test with missing semicolon
[PATCH 019/874] gccrs: Fix ICE in ADTType::is_concrete
[PATCH 020/874] gccrs: refactor unify commit as a static function
[PATCH 021/874] gccrs: Generic pointers are coerceable
[PATCH 022/874] gccrs: Allow infer vars on the lhs too
[PATCH 023/874] gccrs: Make coercion sites autoderef cycle optional
[PATCH 024/874] gccrs: Only emit errors during type-bounds checking
[PATCH 025/874] gccrs: autoderef unconstify so we can use in non
[PATCH 026/874] gccrs: bug-fix implicit inference checks
[PATCH 027/874] gccrs: Fix method resolution to use TryCoerce
[PATCH 028/874] gccrs: Remove cmp_autoderef_mode hack from old
[PATCH 029/874] gccrs: ast: Add RestPattern AST node
[PATCH 030/874] gccrs: Fix formatting
[PATCH 031/874] gccrs: Replace gcc_unreachable with rust_sorry_at
[PATCH 032/874] gccrs: Change struct StructPatternElements into class
[PATCH 033/874] gccrs: typecheck: Fix casting error behind generics
[PATCH 034/874] gccrs: Fix assignment operator overloads for AST and
[PATCH 035/874] gccrs: Add feature gate definition for
[PATCH 036/874] gccrs: hir: Refactor ASTLoweringStmt to source file.
[PATCH 037/874] gccrs: add {add,sub,mul}_with_overflow intrinsics
[PATCH 038/874] gccrs: parser: Fix if let parsing
[PATCH 039/874] gccrs: tests

[PATCH] tree-optimization/113373 - add missing LC PHIs for live operations

2024-01-19 Thread Richard Biener

The following makes reduction epilogue code generation happy by properly
adding LC PHIs to the exit blocks for multiple exit vectorized loops.

Some refactoring might make the flow easier to follow but I've refrained
from doing that with this patch.

I've kept some fixes in reduction epilogue generation from the earlier
attempt fixing this PR.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.  I'm
waiting for the Linaro CI and will follow up on Monday with some
refactoring.

Richard.

PR tree-optimization/113373
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Create LC PHIs in the exit blocks where necessary.
* tree-vect-loop.cc (vectorizable_live_operation): Do not try
to handle missing LC PHIs.
(find_connected_edge): Remove.
(vect_create_epilog_for_reduction): Cleanup use of auto_vec.

* gcc.dg/vect/vect-early-break_104-pr113373.c: New testcase.
---
 .../vect/vect-early-break_104-pr113373.c  | 19 
 gcc/tree-vect-loop-manip.cc   | 34 --
 gcc/tree-vect-loop.cc | 46 +--
 3 files changed, 60 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c
new file mode 100644
index 000..1601aafb3e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+
+struct asCArray {
+  unsigned *array;
+  int length;
+};
+unsigned asCReaderTranslateFunction(struct asCArray b, unsigned t)
+{
+  int size = 0;
+  for (unsigned num; num < t; num++)
+  {
+if (num >= b.length)
+  __builtin_abort();
+size += b.array[num];
+  }
+  return size;
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 1477906e96e..eacbc022549 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1696,7 +1696,8 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, 
edge loop_exit,
  /* Check if we've already created a new phi node during edge
 redirection.  If we have, only propagate the value
 downwards in case there is no merge block.  */
- if (tree *res = new_phi_args.get (new_arg))
+ tree *res;
+ if ((res = new_phi_args.get (new_arg)))
{
  if (multiple_exits_p)
new_arg = *res;
@@ -1717,7 +1718,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, 
edge loop_exit,
  /* Similar to the single exit case, If we have an existing
 LCSSA variable thread through the original value otherwise
 skip it and directly use the final value.  */
- if (tree *res = new_phi_args.get (tmp_arg))
+ if ((res = new_phi_args.get (tmp_arg)))
new_arg = *res;
  else if (!virtual_operand_p (new_arg))
new_arg = tmp_arg;
@@ -1728,9 +1729,20 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop 
*loop, edge loop_exit,
 
  /* Otherwise, main loop exit should use the final iter value.  */
  if (multiple_exits_p)
-   SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi,
-single_succ_edge 
(main_loop_exit_block),
-new_arg);
+   {
+ /* Create a LC PHI if it doesn't already exist.  */
+ if (!virtual_operand_p (new_arg) && !res)
+   {
+ tree new_def = copy_ssa_name (new_arg);
+ gphi *lc_phi
+   = create_phi_node (new_def, main_loop_exit_block);
+ SET_PHI_ARG_DEF (lc_phi, 0, new_arg);
+ new_arg = new_def;
+   }
+ SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi,
+  single_succ_edge 
(main_loop_exit_block),
+  new_arg);
+   }
  else
SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, loop_exit, new_arg);
 
@@ -1766,6 +1778,18 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop 
*loop, edge loop_exit,
  if (vphi)
alt_arg = gimple_phi_result (vphi);
}
+ /* For other live args we didn't create LC PHI nodes.
+Do so here.  */
+ else
+   {
+ tree alt_def = copy_ssa_name (alt_arg);
+ gphi *lc_phi
+   = create_phi_node (alt_def, alt_loop_exit_block);
+

[PATCH] aarch64: Don't assert recog success in ldp/stp pass [PR113114]

2024-01-19 Thread Alex Coplan

Hi,

The PR shows two different cases where try_promote_writeback produces an
RTL pattern which isn't recognized.  Currently this leads to an ICE, as
we assert recog success, but I think it's better just to back out of the
changes gracefully if recog fails (as we do in the main fuse_pair case).

In theory since we check the ranges here recog shouldn't fail (which is
why I had the assert in the first place), but the PR shows an edge case
in the patterns where if we form a pre-writeback pair where the
writeback offset is exactly -S, where S is the size in bytes of one
transfer register, we fail to match the expected pattern as the patterns
look explicitly for plus operands in the mems.  I think fixing this
would require adding at least four new special-case patterns to
aarch64.md for what doesn't seem to be a particularly useful variant of
the insns.  Even if we were to do that, I think it would be GCC 15
material, and it's better to just punt for GCC 14.

The ILP32 case in the PR is a bit different, as that shows us trying to
combine a pair with DImode base register operands in the mems together
with an SImode trailing update of the base register.  This leads to us
forming an RTL pattern which references the base register in both SImode
and DImode, which also fails to recog.  Again, I think it's best just to
take the missed optimization for now.  If we really want to make this
(try_promote_writeback) work for ILP32, we can try to do it for GCC 15.

Bootstrapped/regtested on aarch64-linux-gnu (with/without passes
enabled), OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR target/113114
* config/aarch64/aarch64-ldp-fusion.cc (try_promote_writeback):
Don't assert recog success, just punt if the writeback pair
isn't recognized.

gcc/testsuite/ChangeLog:

PR target/113114
* gcc.c-torture/compile/pr113114.c: New test.
* gcc.target/aarch64/pr113114.c: New test.
diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 689a8c884bd..19142153f41 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -2672,7 +2672,15 @@ try_promote_writeback (insn_info *insn)
   for (unsigned i = 0; i < ARRAY_SIZE (changes); i++)
 gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i], 
is_changing));
 
-  gcc_assert (rtl_ssa::recog_ignoring (attempt, pair_change, is_changing));
+  if (!rtl_ssa::recog_ignoring (attempt, pair_change, is_changing))
+{
+  if (dump_file)
+   fprintf (dump_file, "i%d: recog failed on wb pair, bailing out\n",
+insn->uid ());
+  cancel_changes (0);
+  return;
+}
+
   gcc_assert (crtl->ssa->verify_insn_changes (changes));
   confirm_change_group ();
   crtl->ssa->change_insns (changes);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr113114.c 
b/gcc/testsuite/gcc.c-torture/compile/pr113114.c
new file mode 100644
index 000..978e594eb3d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr113114.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-funroll-loops" } */
+float val[128];
+float x;
+void bar() {
+  int i = 55;
+  for (; i >= 0; --i)
+x += val[i];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr113114.c 
b/gcc/testsuite/gcc.target/aarch64/pr113114.c
new file mode 100644
index 000..5b0383c2435
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr113114.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=ilp32 -O -mearly-ldp-fusion -mlate-ldp-fusion" } */
+void foo_n(double *a) {
+  int i = 1;
+  for (; i < (int)foo_n; i++)
+a[i] = a[i - 1] + a[i + 1] * a[i];
+}

[PATCH v3 0/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-19 Thread Andre Vieira

Hi,

Reworked the patches according to Kyrill's comments, made some other
non-functional changes and rebased.

Reposting as v3 so patchwork picks them up and runs the necessary testing.

Andre Vieira (2):
arm: Add define_attr to to create a mapping between MVE predicated and
  unpredicated insns
arm: Add support for MVE Tail-Predicated Low Overhead Loops

-- 
2.17.1

[PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-19 Thread Andre Vieira


Respin after comments from Kyrill and rebase. I also removed an if-then-else
construct in arm_mve_check_reg_origin_is_num_elems, similar to the other
functions Kyrill pointed out.

After an earlier comment from Richard Sandiford I also added comments to the
two tail predication patterns added to explain the need for the unspecs.
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 2cd560c9925..76c6ee95c16 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -65,8 +65,8 @@ extern void arm_emit_speculation_barrier_function (void);
 extern void arm_decompose_di_binop (rtx, rtx, rtx *, rtx *, rtx *, rtx *);
 extern bool arm_q_bit_access (void);
 extern bool arm_ge_bits_access (void);
-extern bool arm_target_insn_ok_for_lob (rtx);
-
+extern bool arm_target_bb_ok_for_lob (basic_block);
+extern rtx arm_attempt_dlstp_transform (rtx);
 #ifdef RTX_CODE
 enum reg_class
 arm_mode_base_reg_class (machine_mode);
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index e5a944486d7..75432f3f73a 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -668,6 +668,12 @@ static const scoped_attribute_specs *const arm_attribute_table[] =
 #undef TARGET_HAVE_CONDITIONAL_EXECUTION
 #define TARGET_HAVE_CONDITIONAL_EXECUTION arm_have_conditional_execution
 
+#undef TARGET_LOOP_UNROLL_ADJUST
+#define TARGET_LOOP_UNROLL_ADJUST arm_loop_unroll_adjust
+
+#undef TARGET_PREDICT_DOLOOP_P
+#define TARGET_PREDICT_DOLOOP_P arm_predict_doloop_p
+
 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P arm_legitimate_constant_p
 
@@ -34483,19 +34489,1135 @@ arm_invalid_within_doloop (const rtx_insn *insn)
 }
 
 bool
-arm_target_insn_ok_for_lob (rtx insn)
+arm_target_bb_ok_for_lob (basic_block bb)
 {
-  basic_block bb = BLOCK_FOR_INSN (insn);
   /* Make sure the basic block of the target insn is a simple latch
  having as single predecessor and successor the body of the loop
  itself.  Only simple loops with a single basic block as body are
  supported for 'low over head loop' making sure that LE target is
  above LE itself in the generated code.  */
-
   return single_succ_p (bb)
-&& single_pred_p (bb)
-&& single_succ_edge (bb)->dest == single_pred_edge (bb)->src
-&& contains_no_active_insn_p (bb);
+	 && single_pred_p (bb)
+	 && single_succ_edge (bb)->dest == single_pred_edge (bb)->src;
+}
+
+/* Utility function: Given a VCTP or a VCTP_M insn, return the number of MVE
+   lanes based on the machine mode being used.  */
+
+static int
+arm_mve_get_vctp_lanes (rtx_insn *insn)
+{
+  rtx insn_set = single_set (insn);
+  if (insn_set
+  && GET_CODE (SET_SRC (insn_set)) == UNSPEC
+  && (XINT (SET_SRC (insn_set), 1) == VCTP
+	  || XINT (SET_SRC (insn_set), 1) == VCTP_M))
+{
+  machine_mode mode = GET_MODE (SET_SRC (insn_set));
+  return (VECTOR_MODE_P (mode) && VALID_MVE_PRED_MODE (mode))
+	 ? GET_MODE_NUNITS (mode) : 0;
+}
+  return 0;
+}
+
+/* Check if INSN requires the use of the VPR reg, if it does, return the
+   sub-rtx of the VPR reg.  The TYPE argument controls whether
+   this function should:
+   * For TYPE == 0, check all operands, including the OUT operands,
+ and return the first occurrence of the VPR reg.
+   * For TYPE == 1, only check the input operands.
+   * For TYPE == 2, only check the output operands.
+   (INOUT operands are considered both as input and output operands)
+*/
+static rtx
+arm_get_required_vpr_reg (rtx_insn *insn, unsigned int type = 0)
+{
+  gcc_assert (type < 3);
+  if (!NONJUMP_INSN_P (insn))
+return NULL_RTX;
+
+  bool requires_vpr;
+  extract_constrain_insn (insn);
+  int n_operands = recog_data.n_operands;
+  if (recog_data.n_alternatives == 0)
+return NULL_RTX;
+
+  /* Fill in recog_op_alt with information about the constraints of
+ this insn.  */
+  preprocess_constraints (insn);
+
+  for (int op = 0; op < n_operands; op++)
+{
+  requires_vpr = true;
+  if (type == 1 && recog_data.operand_type[op] == OP_OUT)
+	continue;
+  else if (type == 2 && recog_data.operand_type[op] == OP_IN)
+	continue;
+
+  /* Iterate through alternatives of operand "op" in recog_op_alt and
+	 identify if the operand is required to be the VPR.  */
+  for (int alt = 0; alt < recog_data.n_alternatives; alt++)
+	{
+	  const operand_alternative *op_alt
+	  = &recog_op_alt[alt * n_operands];
+	  /* Fetch the reg_class for each entry and check it against the
+	 VPR_REG reg_class.  */
+	  if (alternative_class (op_alt, op) != VPR_REG)
+	requires_vpr = false;
+	}
+  /* If all alternatives of the insn require the VPR reg for this operand,
+	 it means that either this is VPR-generating instruction, like a vctp,
+	 vcmp, etc., or it is a VPT-predicated instruction.  Return the subrtx
+	 of the VPR reg operand.  */
+  if (requires_vpr)
+	return recog_data.operand[op];
+}
+  return NULL_RTX;
+}
+
+/* Wrapper function of arm_get_required_vpr_reg

[PATCH v3 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-01-19 Thread Andre Vieira


Reposting for testing purposes, no changes from v2 (other than rebase).
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 2a2207c0ba1..449e6935b32 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2375,6 +2375,21 @@ extern int making_const_table;
   else if (TARGET_THUMB1)\
 thumb1_final_prescan_insn (INSN)
 
+/* These defines are useful to refer to the value of the mve_unpredicated_insn
+   insn attribute.  Note that, because these use the get_attr_* function, these
+   will change recog_data if (INSN) isn't current_insn.  */
+#define MVE_VPT_PREDICABLE_INSN_P(INSN)	\
+  (recog_memoized (INSN) >= 0		\
+   && get_attr_mve_unpredicated_insn (INSN) != CODE_FOR_nothing)
+
+#define MVE_VPT_PREDICATED_INSN_P(INSN)	\
+  (MVE_VPT_PREDICABLE_INSN_P (INSN)	\
+   && recog_memoized (INSN) != get_attr_mve_unpredicated_insn (INSN))
+
+#define MVE_VPT_UNPREDICATED_INSN_P(INSN)\
+  (MVE_VPT_PREDICABLE_INSN_P (INSN)	\
+   && recog_memoized (INSN) == get_attr_mve_unpredicated_insn (INSN))
+
 #define ARM_SIGN_EXTEND(x)  ((HOST_WIDE_INT)			\
   (HOST_BITS_PER_WIDE_INT <= 32 ? (unsigned HOST_WIDE_INT) (x)	\
: unsigned HOST_WIDE_INT)(x)) & (unsigned HOST_WIDE_INT) 0x) |\
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4a98f2d7b62..3f2863adf44 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -124,6 +124,12 @@ (define_attr "fpu" "none,vfp"
 ; and not all ARM insns do.
 (define_attr "predicated" "yes,no" (const_string "no"))
 
+; An attribute that encodes the CODE_FOR_ of the MVE VPT unpredicated
+; version of a VPT-predicated instruction.  For unpredicated instructions
+; that are predicable, encode the same pattern's CODE_FOR_ as a way to
+; encode that it is a predicable instruction.
+(define_attr "mve_unpredicated_insn" "" (symbol_ref "CODE_FOR_nothing"))
+
 ; LENGTH of an instruction (in bytes)
 (define_attr "length" ""
   (const_int 4))
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 547d87f3bc8..7600bf62531 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -2311,6 +2311,7 @@ (define_int_attr simd32_op [(UNSPEC_QADD8 "qadd8") (UNSPEC_QSUB8 "qsub8")
 
 (define_int_attr mmla_sfx [(UNSPEC_MATMUL_S "s8") (UNSPEC_MATMUL_U "u8")
 			   (UNSPEC_MATMUL_US "s8")])
+
 ;;MVE int attribute.
 (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
 		   (VREV16Q_U "u") (VMVNQ_N_S "s") (VMVNQ_N_U "u")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0fabbaa931b..8aa0bded7f0 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -17,7 +17,7 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-(define_insn "*mve_mov"
+(define_insn "mve_mov"
   [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w   , w,   r,Ux,w")
 	(match_operand:MVE_types 1 "general_operand"  " w,r,w,DnDm,UxUi,r,w, Ul"))]
   "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
@@ -81,18 +81,27 @@ (define_insn "*mve_mov"
   return "";
 }
 }
-  [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_store,mve_load")
+   [(set_attr_alternative "mve_unpredicated_insn" [(symbol_ref "CODE_FOR_mve_mov")
+		   (symbol_ref "CODE_FOR_nothing")
+		   (symbol_ref "CODE_FOR_nothing")
+		   (symbol_ref "CODE_FOR_mve_mov")
+		   (symbol_ref "CODE_FOR_mve_mov")
+		   (symbol_ref "CODE_FOR_nothing")
+		   (symbol_ref "CODE_FOR_mve_mov")
+		   (symbol_ref "CODE_FOR_nothing")])
+   (set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_store,mve_load")
(set_attr "length" "4,8,8,4,4,8,4,8")
(set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*")
(set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")])
 
-(define_insn "*mve_vdup"
+(define_insn "mve_vdup"
   [(set (match_operand:MVE_vecs 0 "s_register_operand" "=w")
 	(vec_duplicate:MVE_vecs
 	  (match_operand: 1 "s_register_operand" "r")))]
   "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
   "vdup.\t%q0, %1"
-  [(set_attr "length" "4")
+ [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdup"))
+  (set_attr "length" "4")
(set_attr "type" "mve_move")])
 
 ;;
@@ -145,7 +154,8 @@ (define_insn "@mve_q_f"
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   ".f%#\t%q0, %q1"
-  [(set_attr "type" "mve_move")
+ [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f"))
+  (set_attr "type" "mve_move")
 ])
 
 ;;
@@ -159,7 +169,8 @@ (define_insn "@mve_q_f"
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   ".%#\t%q0, %q1"
-  [(set_attr "type" "mve_move")
+ [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f"))
+  (set_attr "type" "mve_move")
 ])
 
 ;;
@@ -173,7 +184,8 @@ (define_insn "mve_vq_f"
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   "v.f%#\t%q0, %q1"
-  [(set_attr "type" "mve_move")
+ [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vq_f"

[PATCH] libstdc++: Add and to stdc++.h

2024-01-19 Thread Patrick Palka

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

libstdc++-v3/ChangeLog:

* include/precompiled/stdc++.h [_GLIBCXX_HOSTED]: Include
 and  for C++23 and C++26 respectively.
---
 libstdc++-v3/include/precompiled/stdc++.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index a1db96cbb64..3eef20d1645 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -223,6 +223,7 @@
 #if __cplusplus > 202002L
 #include 
 #include 
+#include 
 #include 
 #if __has_include()
 # include 
@@ -231,4 +232,8 @@
 #include 
 #endif
 
+#if __cplusplus > 202302L
+#include 
+#endif
+
 #endif // HOSTED
-- 
2.43.0.367.g186b115d30

Re: [PATCH] libstdc++: Add and to stdc++.h

2024-01-19 Thread Jonathan Wakely

On Fri, 19 Jan 2024 at 15:16, Patrick Palka  wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK.

This was on my TODO list after somebody asked about  on
IRC this week, so thanks for doing it.


>
> -- >8 --
>
> libstdc++-v3/ChangeLog:
>
> * include/precompiled/stdc++.h [_GLIBCXX_HOSTED]: Include
>  and  for C++23 and C++26 respectively.
> ---
>  libstdc++-v3/include/precompiled/stdc++.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
> b/libstdc++-v3/include/precompiled/stdc++.h
> index a1db96cbb64..3eef20d1645 100644
> --- a/libstdc++-v3/include/precompiled/stdc++.h
> +++ b/libstdc++-v3/include/precompiled/stdc++.h
> @@ -223,6 +223,7 @@
>  #if __cplusplus > 202002L
>  #include 
>  #include 
> +#include 
>  #include 
>  #if __has_include()
>  # include 
> @@ -231,4 +232,8 @@
>  #include 
>  #endif
>
> +#if __cplusplus > 202302L
> +#include 
> +#endif
> +
>  #endif // HOSTED
> --
> 2.43.0.367.g186b115d30
>

Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Georg-Johann Lay





On 18.01.24 at 20:54, Roger Sayle wrote:


This patch tweaks RTL expansion of multi-word shifts and rotates to use
PLUS rather than IOR for disjunctive operations.  During expansion of
these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
where the constants C1 and C2 guarantee that bits don't overlap.
Hence the IOR can be performed by any any_or_plus operation, such as
IOR, XOR or PLUS; for word-size operations where carry chains aren't
an issue these should all be equally fast (single-cycle) instructions.
The benefit of this change is that targets with shift-and-add insns,
like x86's lea, can benefit from the LSHIFT-ADD form.
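
To make the disjoint-bits argument concrete, here is a minimal sketch (not taken from the patch) of a 32-bit rotate written three ways. The function names are made up for illustration. Because the two shifted halves never have a set bit in the same position, IOR, XOR and PLUS all give the same word:

```cpp
#include <cstdint>

// Rotate left by c, valid for 0 < c < 32.  The left-shifted part occupies
// bits [c, 31] and the right-shifted part occupies bits [0, c-1], so the
// two operands never overlap.
constexpr std::uint32_t rotl_ior(std::uint32_t x, unsigned c) {
  return (x << c) | (x >> (32 - c));
}

constexpr std::uint32_t rotl_xor(std::uint32_t x, unsigned c) {
  return (x << c) ^ (x >> (32 - c));
}

// With disjoint operands no bit position ever generates a carry, so the
// addition behaves exactly like the bitwise forms above.
constexpr std::uint32_t rotl_plus(std::uint32_t x, unsigned c) {
  return (x << c) + (x >> (32 - c));
}
```

This only covers the single-word case; whether PLUS is as cheap as IOR once the operation spans several words is a separate question.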

An example of a backend that benefits is ARC, which is demonstrated
by these two simple functions:


But there are also back-ends where this is bad.

The reason is that with ORI, the back-end needs only to operate on
those sub-words where the sub-mask is non-zero.  But for PLUS this
is not the case because the back-end does not know that the intermediate
carry will be zero.  Hence, with PLUS, more instructions are needed.
An example is AVR, but maybe many more targets with multi-word operations
are affected in a bad way.

Take for example the case with 2 words and a value of 1.

LO |= 1
HI |= 0

can be optimized to

LO |= 1

but for addition this is not the case:

LO += 1
HI +=c 0 ;; Does not know that always carry = 0.
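
The same point as a rough sketch in C++, modelling a two-word value on a target with 8-bit words. The struct and function names are invented for this illustration and do not correspond to anything in GCC:

```cpp
#include <cstdint>

// A double-word value as seen by a back-end with 8-bit words.
struct Words { std::uint8_t lo, hi; };

// IOR with the constant 1: the high word would be ORed with 0, which is
// a no-op and can be dropped, so only the low word is touched.
constexpr Words ior_one(Words x) {
  return Words{static_cast<std::uint8_t>(x.lo | 1u), x.hi};
}

// PLUS with the constant 1: unless something proves the carry out of the
// low word is always zero, the high word needs an add-with-carry.
constexpr Words plus_one(Words x) {
  unsigned sum = x.lo + 1u;
  unsigned carry = sum >> 8;
  return Words{static_cast<std::uint8_t>(sum),
               static_cast<std::uint8_t>(x.hi + carry)};
}

constexpr bool same(Words a, Words b) {
  return a.lo == b.lo && a.hi == b.hi;
}
```

When bit 0 of the low word is known to be clear, both functions agree and the carry is zero. That knowledge is exactly what gets lost once only the PLUS is visible.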

Johann




unsigned long long foo(unsigned long long x) { return x<<2; }

which with -O2 is currently compiled to:

foo:    lsr     r2,r0,30
        asl_s   r1,r1,2
        asl_s   r0,r0,2
        j_s.d   [blink]
        or_s    r1,r1,r2

with this patch becomes:

foo:    lsr     r2,r0,30
        add2    r1,r2,r1
        j_s.d   [blink]
        asl_s   r0,r0,2

unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }

which with -O2 is currently compiled to 6 insns + return:

bar:    lsr     r12,r0,30
        asl_s   r3,r1,2
        asl_s   r0,r0,2
        lsr_s   r1,r1,30
        or_s    r0,r0,r1
        j_s.d   [blink]
        or      r1,r12,r3

with this patch becomes 4 insns + return:

bar:    lsr     r3,r1,30
        lsr     r2,r0,30
        add2    r1,r2,r1
        j_s.d   [blink]
        add2    r0,r3,r0


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2024-01-18  Roger Sayle  

gcc/ChangeLog
 * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
 to generate PLUS instead of IOR when unioning disjoint bitfields.
 * optabs.cc (expand_subword_shift): Likewise.
 (expand_binop): Likewise for double-word rotate.


Thanks in advance,
Roger
--

[committed] Skip gcc.dg/analyzer/pr94688.c on hppa*64*-*-*

2024-01-19 Thread John David Anglin

Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
---

Skip gcc.dg/analyzer/pr94688.c on hppa*64*-*-*

2024-01-19  John David Anglin  

gcc/testsuite/ChangeLog:

PR analyzer/112705
* gcc.dg/analyzer/pr94688.c: Skip on hppa*64*-*-*.

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94688.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94688.c
index f553b8cfdad..8ea8bc3b288 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr94688.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94688.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "PR112705" { hppa*64*-*-* } } */
 int a, b;
 void d();
 void c()



[committed] Only xfail gcc.dg/pr84877.c on 32-bit hppa*-*-*

2024-01-19 Thread John David Anglin

Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
---

Only xfail gcc.dg/pr84877.c on 32-bit hppa*-*-*

2024-01-19  John David Anglin  

gcc/testsuite/ChangeLog:

* gcc.dg/pr84877.c: Only xfail on 32-bit hppa*-*-*.

diff --git a/gcc/testsuite/gcc.dg/pr84877.c b/gcc/testsuite/gcc.dg/pr84877.c
index d1fb84763c8..68681206e73 100644
--- a/gcc/testsuite/gcc.dg/pr84877.c
+++ b/gcc/testsuite/gcc.dg/pr84877.c
@@ -1,4 +1,4 @@
-/* { dg-do run { xfail { cris-*-* hppa*-*-* sparc*-*-* } } } */
+/* { dg-do run { xfail { cris-*-* sparc*-*-* } || { { ! lp64 } && hppa*-*-* } 
} } */
 /* { dg-options "-O2" } */
 
 #include 



Re: HELP: Questions on unshare_expr

2024-01-19 Thread Qing Zhao



> On Jan 19, 2024, at 4:30 AM, Richard Biener  
> wrote:
> 
> On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Jan 17, 2024, at 1:43 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
>>>  wrote:
 
 On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao  wrote:
> 
> 
> 
>> On Jan 15, 2024, at 4:31 AM, Richard Biener  
>> wrote:
>> 
>>> All my questions for unshare_expr relate to a  LTO bug that I currently 
>>> stuck with
>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, 
>>> without -flto, no issue):
>>> 
>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>>> during IPA pass: modref
>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not 
>>> supported in LTO streams
>>> 0x14c3993 lto_write_tree
>>>  ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>>> 0x14c3aeb lto_output_tree_1
>>> 
>>> And the value of the tree node that triggered the ICE is:
>>> (gdb) call debug_tree(expr)
>>> 
>>>  nothrow
>>>  def_stmt
>>>  version:13 in-free-list>
>>> 
>>> Is there any good way to debug LTO bug?
>> 
>> This happens usually when you have a VLA type and its type fields are not
>> properly gimplified which usually happens because the frontend fails to
>> insert a gimplification point for it (a DECL_EXPR).
> 
> I found an old gcc bug
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> ICE: tree code ‘ssa_name’ is not supported in LTO streams since 
> r11-3303-g6450f07388f9fe57
> 
> Which is very similar to the bug I am having right now.
> 
> After further study, I suspect that the issue I am having right now with 
> the LTO streaming also
> relate to “unshare_expr”, “save_expr”, and the combination of these two, 
> I suspect that
> the current gcc cannot handle the combination of these two correctly for 
> my case.
> 
> My testing case is:
> 
> #include 
> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, 
> int m)
> {
>  struct foo {
>  int n;
>  int p[][n2][n1] __attribute__((counted_by(n)));
>  } *f;
> 
>  f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>  f->n = m;
>  f->p[m][n2][n1]=1;
>  return;
> }
> 
> int main(int argc, char *argv[])
> {
> setup_and_test_vla (10, 11, 20);
> return 0;
> }
> 
> Failed with
> my_gcc -Os -fsanitize=bounds -flto
> 
> If changing either n1 or n2 to a constant, the testing passed.
> If deleting -flto, the testing passed too.
> 
> I double checked my code per the suggestions provided by you and Jakub in 
> this
> email thread, and I think the code should be fine.
> 
> The code is following:
> 
> =
> /* Instrument array bounds for INDIRECT_REFs whose pointers are
>    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
>    builtins that gets expanded in the sanopt pass, and make an array
>    dimension of it.  ARRAY is the pointer to the base of the array,
>    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
>    beginning of array.
>    Return NULL_TREE if no instrumentation is emitted.  */
>
> tree
> ubsan_instrument_bounds_indirect_ref (location_t loc, tree array,
>                                       tree *offset)
> {
>   if (!is_access_with_size_p (array))
>     return NULL_TREE;
>   tree bound = get_bound_from_access_with_size (array);
>   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
>      the element of the array.  */
>   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
>   gcc_assert (bound);
>
>   /* Given the offset, and the size of each element, the index can be
>      computed as: offset/element_size.  */
>   *offset = save_expr (*offset);
>   tree index = fold_build2 (EXACT_DIV_EXPR,
>                             sizetype, *offset,
>                             unshare_expr (element_size));
>   /* Create a "(T *) 0" tree node to describe the original array type.
>      We get the original array type from the first argument of the call to
>      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
>
>      Originally, REF is a COMPONENT_REF with the original array type,
>      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
>      first operand is the original COMPONENT_REF.  */
>   tree ref = CALL_EXPR_ARG (array, 0);
>   tree array_type
>     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
>   tree zero_with_type = build_

Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Jeff Law





On 1/19/24 09:05, Georg-Johann Lay wrote:



Am 18.01.24 um 20:54 schrieb Roger Sayle:


This patch tweaks RTL expansion of multi-word shifts and rotates to use
PLUS rather than IOR for disjunctive operations.  During expansion of
these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
where the constants C1 and C2 guarantee that bits don't overlap.
Hence the IOR can be performed by any any_or_plus operation, such as
IOR, XOR or PLUS; for word-size operations where carry chains aren't
an issue these should all be equally fast (single-cycle) instructions.
The benefit of this change is that targets with shift-and-add insns,
like x86's lea, can benefit from the LSHIFT-ADD form.

An example of a backend that benefits is ARC, which is demonstrated
by these two simple functions:


But there are also back-ends where this is bad.

The reason is that with ORI, the back-end needs only to operate on
those sub-words where the sub-mask is non-zero.  But for PLUS this
is not the case because the back-end does not know that the intermediate
carry will be zero.  Hence, with PLUS, more instructions are needed.
An example is AVR, but maybe many more targets with multi-word operations
are affected in a bad way.

Take for example the case with 2 words and a value of 1.

LO |= 1
HI |= 0

can be optimized to

LO |= 1

but for addition this is not the case:

LO += 1
HI +=c 0 ;; Does not know that always carry = 0.

I think it's clear that the decision is target and possibly uarch
specific within a target.


Which means that expmed is probably the right place and that we're going 
to need to look for a good way for the target to control.  I suspect 
rtx_cost  isn't likely a good fit.


Jeff

[committed] Change dg-options for hpux to define _HPUX_SOURCE in gcc.dg/pthread-init-2.c

2024-01-19 Thread John David Anglin

Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
---

Change dg-options for hpux to define _HPUX_SOURCE in gcc.dg/pthread-init-2.c

Pthreads on hpux needs _HPUX_SOURCE define for id_t and spu_t types.

2024-01-19  John David Anglin  

gcc/testsuite/ChangeLog:

* gcc.dg/pthread-init-2.c: Change dg-options for hpux
to define _HPUX_SOURCE.

diff --git a/gcc/testsuite/gcc.dg/pthread-init-2.c 
b/gcc/testsuite/gcc.dg/pthread-init-2.c
index d7cd66b5c02..c934fb525f9 100644
--- a/gcc/testsuite/gcc.dg/pthread-init-2.c
+++ b/gcc/testsuite/gcc.dg/pthread-init-2.c
@@ -7,7 +7,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target pthread_h } */
 /* { dg-options "-Wextra -Wall -ansi" } */
-/* { dg-options "-Wextra -Wall -ansi -D_POSIX_C_SOURCE=199506L" { target { 
*-*-hpux* } } } */
+/* We need to define _HPUX_SOURCE on hpux11.11 for id_t and spu_t types.  */
+/* { dg-options "-Wextra -Wall -ansi -D_HPUX_SOURCE" { target { *-*-hpux* } } 
} */
 /* { dg-options "-Wextra -Wall -ansi -D_XOPEN_SOURCE=500" { target { 
powerpc-ibm-aix* } } } */
 /* The definition of PTHREAD_MUTEX_INITIALIZER is missing an initializer for
mutexAttr.mutexAttrType in kernel mode for various VxWorks versions.  */



[committed] Limit dg-xfail-run-if for *-*-hpux11.[012]* to -O0

2024-01-19 Thread John David Anglin

Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
---

Limit dg-xfail-run-if for *-*-hpux11.[012]* to -O0

2024-01-19  John David Anglin  

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr47917.c: Limit dg-xfail-run-if for
hpux11.[012]* to -O0.

diff --git a/gcc/testsuite/gcc.dg/torture/pr47917.c 
b/gcc/testsuite/gcc.dg/torture/pr47917.c
index 5724907ba1c..32c99c6a2d2 100644
--- a/gcc/testsuite/gcc.dg/torture/pr47917.c
+++ b/gcc/testsuite/gcc.dg/torture/pr47917.c
@@ -2,7 +2,7 @@
 /* { dg-options "-std=c99" } */
 /* { dg-options "-std=gnu99" { target *-*-hpux* } } */
 /* { dg-additional-options "-D__USE_MINGW_ANSI_STDIO=1" { target *-*-mingw* } 
} */
-/* { dg-xfail-run-if "non-conforming C99 snprintf" { *-*-hpux11.[012]* } } */
+/* { dg-xfail-run-if "non-conforming C99 snprintf" { *-*-hpux11.[012]* } { 
"-O0" } } */
 
 /* PR middle-end/47917 */
 



[PATCH] fortran: Restore current interface info on error [PR111291]

2024-01-19 Thread Mikael Morin

Hello,

I tested this on x86_64-pc-linux-gnu without regression.
There is no new test, as the problem is visible on an 
existing test with valgrind or an asan-instrumented compiler.
OK for master?

-- >8 --

This change is a followup to the fix for PR48776 (namely
r14-3572-gd58150452976c4ca65ddc811fac78ef956fa96b0 AKA
fortran: Restore interface to its previous state on error [PR48776]),
which cleaned up new changes from interfaces upon error.

Unfortunately, there is one case in that fix that is mishandled, visible
on unexpected_interface.f90 with valgrind or an asan-instrumented gfortran.
When an interface statement is found while parsing an interface body (which
is invalid), the current interface is replaced by the one from the new
statement, and as parsing continues, new procedures are added
to the new interface, which has been rejected and freed, instead of the
original one.

This change restores the current interface pointer to its previous value
on each rejected statement.
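
As a rough illustration of the save-and-restore idea, here is a small self-contained model. It is not gfortran code: `Interface`, `Parser` and `parse_statement` are hypothetical stand-ins, with `current` playing the role of the current interface pointer and `scratch` the interface object that ends up rejected:

```cpp
// Hypothetical stand-in for an interface that collects procedures.
struct Interface { int procedures = 0; };

// Hypothetical parser state holding the "current interface" pointer.
struct Parser { Interface *current; };

// Parse one statement.  An invalid nested INTERFACE statement switches
// `current` to a new object before the statement is rejected; with
// restore_on_error the saved pointer is put back first.
constexpr bool parse_statement(Parser &p, Interface &scratch,
                               bool is_nested_interface,
                               bool restore_on_error) {
  Interface *save = p.current;
  if (is_nested_interface) {
    p.current = &scratch;      // speculative switch to the new interface
    if (restore_on_error)
      p.current = save;        // undo the switch before rejecting
    return false;              // statement rejected
  }
  p.current->procedures++;     // valid statement: add a procedure
  return true;
}

// Count how many procedures land in the original interface after one
// rejected nested INTERFACE statement followed by one valid statement.
constexpr int procedures_in_original(bool restore_on_error) {
  Interface original{}, rejected{};
  Parser p{&original};
  parse_statement(p, rejected, true, restore_on_error);
  parse_statement(p, rejected, false, restore_on_error);
  return original.procedures;
}
```

Without the restore, the later procedure is attached to the rejected object instead of the original one. In the real parser that object has already been freed.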

PR fortran/48776
PR fortran/111291

gcc/fortran/ChangeLog:

* parse.cc: Restore current interface to its previous value on error.
---
 gcc/fortran/parse.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc
index abd3a424f38..51e89e10e2d 100644
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -4033,6 +4033,7 @@ loop:
 default:
   gfc_error ("Unexpected %s statement in INTERFACE block at %C",
 gfc_ascii_statement (st));
+  current_interface = save;
   reject_statement ();
   gfc_free_namespace (gfc_current_ns);
   goto loop;
-- 
2.43.0

Re: [PATCH] RISC-V: Add split pattern to generate SFB instructions. [PR113095]

2024-01-19 Thread Jeff Law





On 1/19/24 00:09, Kito Cheng wrote:

Thanks! generally LGTM, but I would wait one more week to see any
other comments :)

Just a note.  113095 isn't marked as a regression, but it most
definitely is a regression.  So this meets the stage4 criteria.




On Fri, Jan 19, 2024 at 3:05 PM Monk Chiang  wrote:


Since match.pd transforms (zero_one == 0) ? y : z <op> y
into ((typeof(y))zero_one * z) <op> y, add splitters to recognize
this expression and generate SFB instructions.

gcc/ChangeLog:
 PR target/113095
 * config/riscv/sfb.md: New splitters to rewrite single bit
 sign extension as the condition to SFB instructions.

gcc/testsuite/ChangeLog:
 * gcc.target/riscv/sfb.c: New test.
I would probably suggest seeing if these still work when the NE nodes do 
not have a mode (ie, replace "ne:X" with just "ne".  Our docs are a bit 
unclear on that topic IIRC and it looks like the RISC-V backend is 
inconsistent.


More importantly, this message doesn't indicate if/how this patch was 
tested.  Given it's conditional on SFB a bug here would be narrow, but 
we should still be doing a regression test.


Jeff

[Patch] xfail libgomp.c/declare-variant-4-{fiji,gfx803}.c

2024-01-19 Thread Tobias Burnus

The problem is as described at 
https://gcc.gnu.org/install/specific.html#amdgcn-x-amdhsa


"Note that support for Fiji devices has been removed in ROCm 4.0 and 
support in LLVM is deprecated and will be removed in LLVM 18."


Therefore, GCC is no longer built with Fiji (gfx803) support by default,
and the -march=fiji testcases now fail because the -lgomp multilib for
Fiji is not available. (That is: they fail unless Fiji support has been
enabled manually.)


Andrew mentioned that there is a PR about this, but I couldn't find it. 
If someone can, I am happy to add it to the changelog.


OK for mainline?

Tobias
xfail libgomp.c/declare-variant-4-{fiji,gfx803}.c

Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f, GCC no longer
builds the Fiji (alias gfx803) libraries by default as support for it was
removed in ROCm 4.0 and will be removed in LLVM 18.

Thus, unless gfx803 is explicitly enabled, the following testcases will
fail to link as libgomp is not available for Fiji. Hence, this commit
xfails those testcases.

libgomp/ChangeLog:

	* testsuite/libgomp.c/declare-variant-4-fiji.c: Xfail as fiji
	support is no longer enabled by default.
	* testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise.

Signed-off-by: Tobias Burnus 

 libgomp/testsuite/libgomp.c/declare-variant-4-fiji.c   | 2 ++
 libgomp/testsuite/libgomp.c/declare-variant-4-gfx803.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c/declare-variant-4-fiji.c b/libgomp/testsuite/libgomp.c/declare-variant-4-fiji.c
index a138fb092f8..654f9bc655c 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-4-fiji.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-4-fiji.c
@@ -3,6 +3,8 @@
 /* { dg-additional-options -foffload=-march=fiji } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
+/* { dg-xfail-if "fiji/gfx803 is no longer enabled by default & deprectated in ROCm/LLVM/GCC" { *-*-* } } */
+
 #define USE_FIJI_FOR_GFX803
 #include "declare-variant-4.h"
 
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-4-gfx803.c b/libgomp/testsuite/libgomp.c/declare-variant-4-gfx803.c
index 03dffddac49..b447631e52e 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-4-gfx803.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-4-gfx803.c
@@ -3,6 +3,8 @@
 /* { dg-additional-options -foffload=-march=fiji } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
+/* { dg-xfail-if "fiji/gfx803 is no longer enabled by default & deprectated in ROCm/LLVM/GCC" { *-*-* } } */
+
 #include "declare-variant-4.h"
 
 /* { dg-final { only_for_offload_target amdgcn-amdhsa scan-offload-tree-dump "= gfx803 \\(\\);" "optimized" } } */

[pushed] c++: alias template argument conversion [PR112632]

2024-01-19 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

We've had a problem with lost conversions to template parameter types for a
while now; looking at this PR, it occurred to me that the problem is really
with alias (and concept) templates, since we do substitution of dependent
arguments into them in a way that we don't for other templates.  And fixing
that specific problem is a lot simpler than adding IMPLICIT_CONV_EXPR around
all dependent template arguments the way I gave up on for 111357.

The other part of the fix was changing tsubst_expr to actually call
convert_nontype_argument instead of assuming it will eventually happen.

I waffled about stripping the forced conversion when !force_conv
vs. skipping them in iterative_hash_template_arg and
template_args_equal (like we already do for some other conversions) and
decided to go with the former, but that isn't a strong preference if it
turns out to be somehow problematic.

PR c++/112632
PR c++/112594
PR c++/111357
PR c++/104594
PR c++/67898

gcc/cp/ChangeLog:

* cp-tree.h (IMPLICIT_CONV_EXPR_FORCED): New.
* pt.cc (expand_integer_pack): Remove 111357 workaround.
(maybe_convert_nontype_argument): Add force parm.
(convert_template_argument): Handle alias template args
specially.
(tsubst_expr): Don't ignore IMPLICIT_CONV_EXPR_NONTYPE_ARG.
* error.cc (dump_expr) [CASE_CONVERT]: Handle null optype.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-nontype1.C: New test.
* g++.dg/cpp2a/concepts-narrowing1.C: New test.
* g++.dg/cpp2a/nontype-class63.C: New test.
* g++.dg/cpp2a/nontype-class63a.C: New test.
---
 gcc/cp/cp-tree.h  |  5 ++
 gcc/cp/error.cc   |  4 +-
 gcc/cp/pt.cc  | 48 +--
 .../g++.dg/cpp0x/alias-decl-nontype1.C|  9 
 .../g++.dg/cpp2a/concepts-narrowing1.C| 16 +++
 gcc/testsuite/g++.dg/cpp2a/nontype-class63.C  | 24 ++
 gcc/testsuite/g++.dg/cpp2a/nontype-class63a.C | 24 ++
 7 files changed, 114 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-nontype1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-narrowing1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class63.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class63a.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index d9b14d7c4f5..60e6dafc549 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4717,6 +4717,11 @@ get_vec_init_expr (tree t)
 #define IMPLICIT_CONV_EXPR_BRACED_INIT(NODE) \
   (TREE_LANG_FLAG_2 (IMPLICIT_CONV_EXPR_CHECK (NODE)))
 
+/* True if NODE represents a conversion forced to be represented in
+   maybe_convert_nontype_argument, i.e. for an alias template.  */
+#define IMPLICIT_CONV_EXPR_FORCED(NODE) \
+  (TREE_LANG_FLAG_3 (IMPLICIT_CONV_EXPR_CHECK (NODE)))
+
 /* Nonzero means that an object of this type cannot be initialized using
an initializer list.  */
 #define CLASSTYPE_NON_AGGREGATE(NODE) \
diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index 52e24fb086c..d3fcac70ea1 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -2673,6 +2673,8 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
 
tree ttype = TREE_TYPE (t);
tree optype = TREE_TYPE (op);
+   if (!optype)
+ optype = unknown_type_node;
 
if (TREE_CODE (ttype) != TREE_CODE (optype)
&& INDIRECT_TYPE_P (ttype)
@@ -2691,7 +2693,7 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
else
  dump_unary_op (pp, "&", t, flags);
  }
-   else if (!same_type_p (TREE_TYPE (op), TREE_TYPE (t)))
+   else if (!same_type_p (optype, ttype))
  {
/* It is a cast, but we cannot tell whether it is a
   reinterpret or static cast. Use the C style notation.  */
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index f82d018c981..fbbca469219 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3760,13 +3760,6 @@ expand_integer_pack (tree call, tree args, 
tsubst_flags_t complain,
 {
   if (hi != ohi)
{
- /* Work around maybe_convert_nontype_argument not doing this for
-dependent arguments.  Don't use IMPLICIT_CONV_EXPR_NONTYPE_ARG
-because that will make tsubst_expr ignore it.  */
- tree type = tsubst (TREE_TYPE (ohi), args, complain, in_decl);
- if (!TREE_TYPE (hi) || !same_type_p (type, TREE_TYPE (hi)))
-   hi = build1 (IMPLICIT_CONV_EXPR, type, hi);
-
  call = copy_node (call);
  CALL_EXPR_ARG (call, 0) = hi;
}
@@ -8457,23 +8450,30 @@ convert_wildcard_argument (tree parm, tree arg)
conversion for the benefit of cp_tree_equal.  */
 
 static tree
-maybe_convert_nontype_argument (tree type, tree arg)
+maybe_convert_nontype_argument (tree type, tree arg, bool force)
 {
   /*

Ping [PATCH 1/6] Add -mcpu=future

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:35:37 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 1/6] Add -mcpu=future
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641961.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:37:17 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 2/6] PowerPC: Make -mcpu=future enable 
-mblock-ops-vector-pair.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641962.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:38:23 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 3/6] PowerPC: Add support for accumulators in DMR 
registers.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641963.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:39:55 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641964.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:40:58 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 5/6] PowerPC: Switch to dense math names for all MMA 
operations.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641965.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2024-01-19 Thread Michael Meissner

Ping

| Date: Fri, 5 Jan 2024 18:42:02 -0500
| From: Michael Meissner 
| Subject: Repost [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641966.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Ping [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-19 Thread Michael Meissner

Ping

| Date: Thu, 11 Jan 2024 12:29:23 -0500
| From: Michael Meissner 
| Subject: [PATCH, V2] PR target/112886, Add %S to print_operand for vector 
pair support.
| Message-ID: 

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642727.html

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Re: c++/modules: Emit definitions of ODR-used static members imported from modules [PR112899]

2024-01-19 Thread Patrick Palka

On Wed, 3 Jan 2024, Nathaniel Shead wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> Static data members marked 'inline' should be emitted in TUs where they
> are ODR-used.  We need to make sure that statics imported from modules
> are correctly added to the 'pending_statics' map so that they get
> emitted if needed, otherwise the attached testcase fails to link.
> 
>   PR c++/112899
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (note_variable_template_instantiation): Rename to...
>   (note_static_storage_variable): ...this.
>   * decl2.cc (note_variable_template_instantiation): Rename to...
>   (note_static_storage_variable): ...this.
>   * pt.cc (instantiate_decl): Rename usage of above function.
>   * module.cc (trees_in::read_var_def): Remember pending statics
>   that we stream in.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/init-4_a.C: New test.
>   * g++.dg/modules/init-4_b.C: New test.
> 
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/cp/cp-tree.h|  2 +-
>  gcc/cp/decl2.cc |  4 ++--
>  gcc/cp/module.cc|  4 
>  gcc/cp/pt.cc|  2 +-
>  gcc/testsuite/g++.dg/modules/init-4_a.C |  9 +
>  gcc/testsuite/g++.dg/modules/init-4_b.C | 11 +++
>  6 files changed, 28 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/init-4_a.C
>  create mode 100644 gcc/testsuite/g++.dg/modules/init-4_b.C
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 1979572c365..ebd2850599a 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7113,7 +7113,7 @@ extern tree maybe_get_tls_wrapper_call  (tree);
>  extern void mark_needed  (tree);
>  extern bool decl_needed_p(tree);
>  extern void note_vague_linkage_fn(tree);
> -extern void note_variable_template_instantiation (tree);
> +extern void note_static_storage_variable (tree);
>  extern tree build_artificial_parm(tree, tree, tree);
>  extern bool possibly_inlined_p   (tree);
>  extern int parm_index   (tree);
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index 0850d3f5bce..241216b0dfe 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -910,10 +910,10 @@ note_vague_linkage_fn (tree decl)
>vec_safe_push (deferred_fns, decl);
>  }
>  
> -/* As above, but for variable template instantiations.  */
> +/* As above, but for variables with static storage duration.  */
>  
>  void
> -note_variable_template_instantiation (tree decl)
> +note_static_storage_variable (tree decl)
>  {
>vec_safe_push (pending_statics, decl);
>  }
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 0bd46414da9..14818131a70 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -11752,6 +11752,10 @@ trees_in::read_var_def (tree decl, tree 
> maybe_template)
> DECL_INITIALIZED_P (decl) = true;
> if (maybe_dup && DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P 
> (maybe_dup))
>   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = true;
> +   if (DECL_CONTEXT (decl)
> +   && RECORD_OR_UNION_TYPE_P (DECL_CONTEXT (decl))
> +   && !DECL_TEMPLATE_INFO (decl))
> + note_static_storage_variable (decl);

It seems this should also handle templated inlines via

   && (!DECL_TEMPLATE_INFO (decl)
   || DECL_IMPLICIT_INSTANTIATION (decl))

otherwise the following fails to link:

  $ cat init-5_a.H
  template
  struct __from_chars_alnum_to_val_table {
static inline int value = 42;
  };

  inline unsigned char
  __from_chars_alnum_to_val() {
return __from_chars_alnum_to_val_table::value;
  }

  $ cat init-6_b.C
  import "init-5_a.H";

  int main() {
__from_chars_alnum_to_val();
  }

  $ g++ -fmodules-ts -std=c++20 init-5_a.H init-5_b.C
  /usr/bin/ld: /tmp/ccNRaads.o: in function `__from_chars_alnum_to_val()':
  
init-6_b.C:(.text._Z25__from_chars_alnum_to_valv[_Z25__from_chars_alnum_to_valv]+0x6):
 undefined reference to `__from_chars_alnum_to_val_table::value'


By the way I ran into this when testing out std::print with modules:

  $ cat std.C
  export module std;
  export import ;

  $ cat hello.C
  import std;

  int main() {
std::print("Hello {}!", "World");
  }

  $ g++ -fmodules-ts -std=c++26 -x c++-system-header bits/stdc++.h
  $ g++ -fmodules-ts -std=c++26 std.C hello.C && ./a.out # before
  /usr/bin/ld: /tmp/ccqNgOM1.o: in function `unsigned char 
std::__detail::__from_chars_alnum_to_val(unsigned char)':
  
hello.C:(.text._ZNSt8__detail25__from_chars_alnum_to_valILb0EEEhh[_ZNSt8__detail25__from_chars_alnum_to_valILb0EEEhh]+0x12):
 undefined reference to 
`std::__detail::__from_chars_alnum_to_val_table::value'
  $ g++ -fmodules-ts -std=c++26 std.C hello.C && ./a.out # after
  Hello World!

It's great that this is so close to working!

>   }
>

Re: [PATCH] fortran: Restore current interface info on error [PR111291]

2024-01-19 Thread Steve Kargl

On Fri, Jan 19, 2024 at 06:47:36PM +0100, Mikael Morin wrote:
> 
> I tested this on x86_64-pc-linux-gnu without regression.
> There is no new test, as the problem is visible on an 
> existing test with valgrind or an asan-instrumented compiler.
> OK for master?
> 

Yes.  After your explanation, the patch looks trivially obvious! :-)
Thanks for the patch.

-- 
Steve

Re: [PATCH 1/2] RISC-V: delete all the vector psabi checking.

2024-01-19 Thread Andreas Schwab

../../gcc/config/riscv/riscv.cc: In function 'void 
riscv_init_cumulative_args(CUMULATIVE_ARGS*, tree, rtx, tree, int)':
../../gcc/config/riscv/riscv.cc:4879:34: error: unused parameter 'fndecl' 
[-Werror=unused-parameter]
 4879 | tree fndecl,
  | ~^~
../../gcc/config/riscv/riscv.cc: In function 'bool 
riscv_vector_mode_supported_any_target_p(machine_mode)':
../../gcc/config/riscv/riscv.cc:10537:56: error: unused parameter 'mode' 
[-Werror=unused-parameter]
10537 | riscv_vector_mode_supported_any_target_p (machine_mode mode)
  |   ~^~~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2559: riscv.o] Error 1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

[committed] libstdc++: Fix P2255R2 dangling checks for std::tuple in C++17 [PR108822]

2024-01-19 Thread Jonathan Wakely

Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

I accidentally used && in a fold-expression instead of || which meant
that in C++17 the tuple(UElements&&...) constructor only failed its
debug assertion if all tuple elements were dangling references. Some
missing tests (noted as "TODO") meant this wasn't tested.

This fixes the fold expression and adds the missing tests.
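
To make the difference concrete, here is a minimal sketch. It is not the
libstdc++ code: the `checks` template and its bool pack are made-up stand-ins
for the per-element `__reference_constructs_from_temporary` results, and it
assumes a C++17 compiler.

```cpp
// Hypothetical stand-in: each bool plays the role of one per-element
// "this reference would dangle" result.
template<bool... Dangling>
struct checks
{
  // && fold: true only if *every* element dangles.
  static constexpr bool all = (Dangling && ...);
  // || fold: true if *any* element dangles.
  static constexpr bool any = (Dangling || ...);
};

// One dangling element out of two: the && fold does not report it.
static_assert(!checks<true, false>::all, "&& fold misses a single dangler");
static_assert(checks<true, false>::any, "|| fold reports a single dangler");

// When every element dangles the two folds agree.
static_assert(checks<true, true>::all, "all dangling");
static_assert(checks<true, true>::any, "all dangling");
```

The two folds only disagree on mixed packs, so tests need at least one tuple
with some dangling and some non-dangling elements to tell them apart.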

libstdc++-v3/ChangeLog:

PR libstdc++/108822
* include/std/tuple (__glibcxx_no_dangling_refs) [C++17]: Fix
wrong fold-operator.
* testsuite/20_util/tuple/dangling_ref.cc: Check tuples with one
element and three elements. Check allocator-extended
constructors.
---
 libstdc++-v3/include/std/tuple|   2 +-
 .../testsuite/20_util/tuple/dangling_ref.cc   | 156 --
 2 files changed, 110 insertions(+), 48 deletions(-)

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 7a045b3e6a1..be92f1eb973 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -1299,7 +1299,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Error if construction from U... would create a dangling ref.
 # if __cpp_fold_expressions
 #  define __glibcxx_dangling_refs(U) \
-  (__reference_constructs_from_temporary(_Elements, U) && ...)
+  (__reference_constructs_from_temporary(_Elements, U) || ...)
 # else
 #  define __glibcxx_dangling_refs(U) \
   __or_<__bool_constant<__reference_constructs_from_temporary(_Elements, U) \
diff --git a/libstdc++-v3/testsuite/20_util/tuple/dangling_ref.cc b/libstdc++-v3/testsuite/20_util/tuple/dangling_ref.cc
index 74fdc242349..b2dcf359438 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/dangling_ref.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/dangling_ref.cc
@@ -7,6 +7,7 @@
 
 #if __cplusplus >= 202002L
 // For C++20 and later, constructors are constrained to disallow dangling.
+static_assert(!std::is_constructible_v, long>);
 static_assert(!std::is_constructible_v, long, int>);
 static_assert(!std::is_constructible_v, int, long>);
 static_assert(!std::is_constructible_v,
@@ -30,76 +31,137 @@ static_assert(!std::is_constructible_v,
 void
 test_ary_ctors()
 {
-  std::tuple t1(1L, 2);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 33 }
-  // { dg-error "use of deleted function" "" { target c++20 } 33 }
+  std::tuple t1(1L);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 34 }
+  // { dg-error "use of deleted function" "" { target c++20 } 34 }
 
-  std::tuple t2(1, 2L);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 37 }
-  // { dg-error "use of deleted function" "" { target c++20 } 37 }
+  std::tuple t2(1L, 2);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 38 }
+  // { dg-error "use of deleted function" "" { target c++20 } 38 }
 
-  std::tuple t3(1L, 2L);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 41 }
-  // { dg-error "use of deleted function" "" { target c++20 } 41 }
+  std::tuple t3(1, 2L);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 42 }
+  // { dg-error "use of deleted function" "" { target c++20 } 42 }
 
-  std::tuple t4(std::pair{});
-  // { dg-error "here" "" { target { c++17_down && hosted } } 45 }
-  // { dg-error "use of deleted function" "" { target c++20 } 45 }
+  std::tuple t4(1L, 2L);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 46 }
+  // { dg-error "use of deleted function" "" { target c++20 } 46 }
 
-  std::pair p;
-  std::tuple t5(p);
+  std::tuple t5(std::pair{});
   // { dg-error "here" "" { target { c++17_down && hosted } } 50 }
   // { dg-error "use of deleted function" "" { target c++20 } 50 }
+
+  std::pair p;
+  std::tuple t6(p);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 55 }
+  // { dg-error "use of deleted function" "" { target c++20 } 55 }
+
+  std::tuple t7(1L, 2, 3);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 59 }
+  // { dg-error "use of deleted function" "" { target c++20 } 59 }
 }
 
 void
 test_converting_ctors()
 {
-  std::tuple t0;
+  std::tuple t10;
 
-  std::tuple t1(t0);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 60 }
-  // { dg-error "use of deleted function" "" { target c++20 } 60 }
+  std::tuple t11(t10);
+  // { dg-error "here" "" { target { c++17_down && hosted } } 69 }
+  // { dg-error "use of deleted function" "" { target c++20 } 69 }
 
-  std::tuple t2(t0);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 64 }
-  // { dg-error "use of deleted function" "" { target c++20 } 64 }
+  std::tuple t12(std::move(t10));
+  // { dg-error "here" "" { target { c++17_down && hosted } } 73 }
+  // { dg-error "use of deleted function" "" { target c++20 } 73 }
 
-  std::tuple t3(t0);
-  // { dg-error "here" "" { target { c++17_down && hosted } } 68 }
-  // { dg-error "use of deleted function" "" { target c++20 } 68 }
+  std::tuple t20;
 
-  std::tuple t4(std::move(t0));
-  // { dg-e

[committed] libstdc++: Do not use CTAD for _Utf32_view alias template (redux)

2024-01-19 Thread Jonathan Wakely

Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

My change in r14-8181-g665a3ff1539ce2 was incomplete as there's a second
place using CTAD with the _Utf32_view alias template. This fixes it.
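
For readers unfamiliar with the issue, a minimal sketch of the two spellings
follows. The names (`view_wrapper`, `alias_view`, `int_range`, `span_of`) are
made up for illustration and are not the libstdc++ types. Class template
argument deduction through an alias template is a C++20 feature that not every
compiler implements, so naming the template argument explicitly is the more
portable form.

```cpp
// Hypothetical class template standing in for the real view type.
template<typename Range>
struct view_wrapper
{
  explicit constexpr view_wrapper(Range r) : base(r) { }
  Range base;
};

// An alias template over the class template.
template<typename Range>
using alias_view = view_wrapper<Range>;

struct int_range { int first; int last; };

constexpr int span_of()
{
  // alias_view v(int_range{1, 4});  // relies on CTAD through the alias (C++20)
  alias_view<int_range> v(int_range{1, 4});  // explicit argument, no CTAD
  return v.base.last - v.base.first;
}
```

With the explicit argument the code does not depend on alias-template CTAD
support at all.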

libstdc++-v3/ChangeLog:

* include/std/format (_Spec::_M_parse_fill_and_align): Do not
use CTAD for _Utf32_view.
---
 libstdc++-v3/include/std/format | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index efc4a17ba36..f4d91517656 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -434,7 +434,7 @@ namespace __format
if constexpr (__literal_encoding_is_unicode<_CharT>())
  {
// Accept any UCS scalar value as fill character.
-   _Utf32_view __uv(ranges::subrange(__first, __last));
+   _Utf32_view> __uv({__first, __last});
if (!__uv.empty())
  {
auto __beg = __uv.begin();
-- 
2.43.0

[pushed] c++: requires and using-decl [PR113498]

2024-01-19 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

get_template_info was crashing because it assumed that any decl with
DECL_LANG_SPECIFIC could use DECL_TEMPLATE_INFO.  It's more complicated than
that.
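
The shape of the fix can be sketched outside GCC as follows. This is a toy
model only: `decl_kind`, `toy_decl` and `toy_decl_template_info` are invented
for illustration and are not GCC's tree structures or accessors. The point is
that the accessor is consulted only for the kinds of declaration known to
carry the field, and everything else yields null. It assumes a C++17 compiler.

```cpp
// Toy model: none of these types exist in GCC.
enum class decl_kind { function, variable, field, type, other };

struct template_info { int id; };

struct toy_decl
{
  decl_kind kind;
  bool has_lang_specific;
  bool is_thunk;
  const template_info *tinfo;  // only meaningful for some kinds
};

// Return the template info if this kind of decl can carry one, else null.
constexpr const template_info *
toy_decl_template_info (const toy_decl &d)
{
  if (!d.has_lang_specific)
    return nullptr;
  switch (d.kind)
    {
    case decl_kind::function:
      if (d.is_thunk)
        break;
      [[fallthrough]];
    case decl_kind::variable:
    case decl_kind::field:
    case decl_kind::type:
      return d.tinfo;

    default:
      break;
    }
  return nullptr;
}

// Sample objects for illustration.
constexpr template_info sample_info{42};
constexpr toy_decl fn_decl{decl_kind::function, true, false, &sample_info};
constexpr toy_decl thunk_decl{decl_kind::function, true, true, &sample_info};
constexpr toy_decl other_decl{decl_kind::other, true, false, &sample_info};
```

Here `fn_decl` yields its info, while `thunk_decl` and `other_decl` yield null
even though the lang-specific flag is set.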

PR c++/113498

gcc/cp/ChangeLog:

* pt.cc (decl_template_info): New fn.
(get_template_info): Use it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-using4.C: New test.
---
 gcc/cp/pt.cc | 30 ++--
 gcc/testsuite/g++.dg/cpp2a/concepts-using4.C | 24 
 2 files changed, 52 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-using4.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index fbbca469219..74013533b0f 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -339,6 +339,32 @@ build_template_info (tree template_decl, tree template_args)
   return result;
 }
 
+/* DECL_TEMPLATE_INFO, if applicable, or NULL_TREE.  */
+
+static tree
+decl_template_info (const_tree decl)
+{
+  /* This needs to match template_info_decl_check.  */
+  if (DECL_LANG_SPECIFIC (decl))
+switch (TREE_CODE (decl))
+  {
+  case FUNCTION_DECL:
+   if (DECL_THUNK_P (decl))
+ break;
+   gcc_fallthrough ();
+  case VAR_DECL:
+  case FIELD_DECL:
+  case TYPE_DECL:
+  case CONCEPT_DECL:
+  case TEMPLATE_DECL:
+   return DECL_TEMPLATE_INFO (decl);
+
+  default:
+   break;
+  }
+  return NULL_TREE;
+}
+
 /* Return the template info node corresponding to T, whatever T is.  */
 
 tree
@@ -353,8 +379,8 @@ get_template_info (const_tree t)
   || TREE_CODE (t) == PARM_DECL)
 return NULL;
 
-  if (DECL_P (t) && DECL_LANG_SPECIFIC (t))
-tinfo = DECL_TEMPLATE_INFO (t);
+  if (DECL_P (t))
+tinfo = decl_template_info (t);
 
   if (!tinfo && DECL_IMPLICIT_TYPEDEF_P (t))
 t = TREE_TYPE (t);
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-using4.C b/gcc/testsuite/g++.dg/cpp2a/concepts-using4.C
new file mode 100644
index 000..a39a7c0a8a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-using4.C
@@ -0,0 +1,24 @@
+// PR c++/113498
+// { dg-do compile { target c++20 } }
+
+template
+struct S_Base
+{
+static constexpr int D = d;
+};
+
+template
+struct S : public S_Base
+{
+using S_Base::D;
+constexpr void f() const
+requires(D > 0) {}
+
+};
+
+int main(int, char**)
+{
+S<1> s;
+s.f();
+return 0;
+}

base-commit: 631a922e5c8578a1c878b69f1651d482b661ef4a
-- 
2.39.3

Re: [PATCH v5] RISC-V: Support XTheadVector extension

2024-01-19 Thread Jeff Law





On 1/18/24 07:43, Christoph Müllner wrote:
> On Fri, Jan 12, 2024 at 4:18 AM Jun Sha (Joshua) wrote:
>>
>> This patch series presents gcc implementation of the XTheadVector
>> extension [1].
>>
>> [1] https://github.com/T-head-Semi/thead-extension-spec/
>>
>> For some vector patterns that cannot be avoided, we use
>> "!TARGET_XTHEADVECTOR" to disable them in order not to
>> generate instructions that xtheadvector does not support,
>> causing 10 changes in vector.md.
>>
>> For the th. prefix issue, we use current_output_insn and
>> the ASM_OUTPUT_OPCODE hook instead of directly modifying
>> patterns in vector.md.
>>
>> We have run the GCC test suite and can confirm that there
>> are no regressions.
>>
>> Furthermore, we have run the tests in
>> https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/main/examples,
>> and all the tests passed.
>>
>> Co-authored-by: Jin Ma
>> Co-authored-by: Xianmiao Qu
>> Co-authored-by: Christoph Müllner
>>
>> [PATCH v4] RISC-V: Introduce XTheadVector as a subset of V1.0.0
>> [PATCH v5] RISC-V: Adds the prefix "th." for the instructions of XTheadVector
>> [PATCH v6] RISC-V: Handle differences between XTheadvector and Vector
>> [PATCH v6] RISC-V: Add support for xtheadvector-specific intrinsics
>> [PATCH v6] RISC-V: Fix register overlap issue for some xtheadvector instructions
>> [PATCH v5] RISC-V: Rewrite some instructions using ASM targethook
>
> All patches of this series got either "LGTM" or "OK":
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643339.html
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642798.html
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642799.html
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642800.html
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642801.html
> * https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642802.html
>
> As mentioned earlier, I have rebased the patches, retested them locally and
> (after ensuring there are no regressions) pushed them.
>
> To all involved people: thank you very much!
> A special 'thank you' goes to Juzhe, who did a great job in reviewing
> the patches and providing suggestions to get the code into shape!

Likewise.  Glad to see we were able to push this through to a reasonable
conclusion and a huge thanks to Juzhe for all his work on the review side.


Jeff

[patch][gcn] mkoffload: Fix linking with "-g"; fix file deletion; improve diagnostic [PR111966]

2024-01-19 Thread Tobias Burnus

This patch fixes PR111966, i.e. when compiling offloaded code with "-g" 
but without "-march=", mkoffload created a file with e_flags set to 
gfx803/fiji as architecture - while all other files used gfx900, which 
the linker did not like.


Reason: When the default was changed, this flag was missed. When passing 
-march=... instead of relying on the default, it works.


Additionally, it fixes a bug with dangling pointers and multiple
deletion attempts for the same file, which normally led only to the
accumulation of /tmp/cc*.mkoffload.dbg.o files.


And, finally, when building with a recent GCC, the compiler warned about
missing %<...%> or %qs quotes. I added a couple to reduce the number of warnings.


OK for mainline? — I think the /tmp/cc*.mkoffload.dbg.o part of the 
patch could also be backported to GCC 13 (and 12) if deemed to be useful.


Tobias
[gcn] mkoffload: Fix linking with "-g"; fix file deletion; improve diagnostic [PR111966]

With debugging enabled, '*.mkoffload.dbg.o' files are generated. The e_flags
header of all *.o files must be the same - otherwise, the linker complains.
Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f the -march= default
is now gfx900. If compiling without any -march= flag, the default value is
used by the compiler but not passed to mkoffload. Hence, mkoffload.cc uses
its own default for march - unfortunately, it still had gfx803/fiji as default,
leading to the linker error: 'incompatible mach'. Solution: Update the
default to gfx900.

While debugging it, I saw that /tmp/cc*.mkoffload.dbg.o kept accumulating;
there were a couple of issues with the handling:
* dbgobj was always added to files_to_cleanup
* If copy_early_debug_info returned true, dbgobj was added again
  -> pointless and in theory a race if the same file was added in the
     fraction of a second.
* If copy_early_debug_info returned false,
  - In exactly one case, it already deleted the file itself
    (same potential race as above)
  - The pointer dbgobj was freed - such that files_to_cleanup contained
    a dangling pointer - probably the reason that stale files remained.
Solution: Only if copy_early_debug_info returns true, dbgobj is added to
files_to_cleanup. If it returns false, the file is unlinked before freeing
the pointer.
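
The intended ownership rule can be sketched as below. This is a simplified
model, not the mkoffload.cc code: `cleanup_state`, `handle_debug_object` and
`cleanup_demo_ok` are hypothetical names, file deletion is only recorded rather
than performed, and `copy_ok` stands in for the result of
copy_early_debug_info.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bookkeeping: which files are deleted at exit and which
// were deleted immediately.
struct cleanup_state
{
  std::vector<std::string> files_to_cleanup;  // removed by the exit handler
  std::vector<std::string> unlinked_now;      // removed right away
};

// Register the debug object for cleanup only when the copy succeeded;
// otherwise remove it now and never register it.
inline void
handle_debug_object (cleanup_state &st, const std::string &dbgobj, bool copy_ok)
{
  assert (!dbgobj.empty ());
  if (copy_ok)
    st.files_to_cleanup.push_back (dbgobj);  // registered exactly once
  else
    st.unlinked_now.push_back (dbgobj);      // no stale entry is left behind
}

// Small self-check: one successful copy, one failed copy.
inline bool
cleanup_demo_ok ()
{
  cleanup_state st;
  handle_debug_object (st, "a.mkoffload.dbg.o", true);
  handle_debug_object (st, "b.mkoffload.dbg.o", false);
  return st.files_to_cleanup.size () == 1
	 && st.files_to_cleanup[0] == "a.mkoffload.dbg.o"
	 && st.unlinked_now.size () == 1
	 && st.unlinked_now[0] == "b.mkoffload.dbg.o";
}
```

Each file ends up on exactly one of the two paths, so no name is deleted twice
and the cleanup list never refers to freed storage.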

When compiling, GCC warned about several fatal_error messages as having
no %<...%> or %qs quotes. This patch now silences several of those warnings
by using those quotes.

gcc/ChangeLog:

	PR other/111966
	* config/gcn/mkoffload.cc (elf_arch): Change default to gfx900
	to match the compiler default.
	(simple_object_copy_lto_debug_sections): Never unlink the outfile
	on error as the caller does so.
	(maybe_unlink, compile_native): Use %<...%> and %qs in fatal_error.
	(main): Likewise. Fix 'mkoffload.dbg.o' cleanup. 

Signed-off-by: Tobias Burnus 

 gcc/config/gcn/mkoffload.cc | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index d4cd509089e..0d0e7bac9b2 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -124,7 +124,7 @@ static const char *gcn_dumpbase;
 static struct obstack files_to_cleanup;
 
 enum offload_abi offload_abi = OFFLOAD_ABI_UNSET;
-uint32_t elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX803;  // Default GPU architecture.
+uint32_t elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX900;  // Default GPU architecture.
 uint32_t elf_flags = EF_AMDGPU_FEATURE_SRAMECC_ANY_V4;
 
 static int gcn_stack_size = 0;  /* Zero means use default.  */
@@ -154,7 +154,7 @@ maybe_unlink (const char *file)
   if (!save_temps)
 {
   if (unlink_if_ordinary (file) && errno != ENOENT)
-	fatal_error (input_location, "deleting file %s: %m", file);
+	fatal_error (input_location, "deleting file %qs: %m", file);
 }
   else if (verbose)
 fprintf (stderr, "[Leaving %s]\n", file);
@@ -320,10 +320,7 @@ copy_early_debug_info (const char *infile, const char *outfile)
 
   errmsg = simple_object_copy_lto_debug_sections (inobj, outfile, &err, true);
   if (errmsg)
-{
-  unlink_if_ordinary (outfile);
-  return false;
-}
+return false;
 
   simple_object_release_read (inobj);
   close (infd);
@@ -804,7 +801,7 @@ compile_native (const char *infile, const char *outfile, const char *compiler,
   const char *collect_gcc_options = getenv ("COLLECT_GCC_OPTIONS");
   if (!collect_gcc_options)
 fatal_error (input_location,
-		 "environment variable COLLECT_GCC_OPTIONS must be set");
+		 "environment variable %<COLLECT_GCC_OPTIONS%> must be set");
 
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
@@ -859,11 +856,11 @@ main (int argc, char **argv)
 
   obstack_init (&files_to_cleanup);
   if (atexit (mkoffload_cleanup) != 0)
-fatal_error (input_location, "atexit failed");
+fatal_error (input_location, "%<atexit%> failed");
 
   char *collect_gcc = getenv ("COLLECT_GCC");
   if (collect_gcc == NULL)
-fatal_error (input_location, "COLLECT_GCC must be set.");
+

Re: [PATCH] libgccjit: Add vector permutation and vector access operations

2024-01-19 Thread Antoni Boucher

David: Ping.

On Thu, 2023-11-30 at 17:16 -0500, Antoni Boucher wrote:
> All of these are fixed in this new patch.
> Thanks for the review.
> 
> On Mon, 2023-11-20 at 18:05 -0500, David Malcolm wrote:
> > On Fri, 2023-11-17 at 17:36 -0500, Antoni Boucher wrote:
> > > Hi.
> > > This patch adds a vector permutation and vector access operations
> > > (bug
> > > 112602).
> > > 
> > > This was split from this patch:
> > > https://gcc.gnu.org/pipermail/jit/2023q1/001606.html
> > > 
> > 
> > Thanks for the patch.
> > 
> > Overall, looks good, but 3 minor nitpicks:
> > 
> > [...snip...]
> > 
> > > diff --git a/gcc/jit/docs/topics/compatibility.rst
> > > b/gcc/jit/docs/topics/compatibility.rst
> > > index ebede440ee4..a764e3968d1 100644
> > > --- a/gcc/jit/docs/topics/compatibility.rst
> > > +++ b/gcc/jit/docs/topics/compatibility.rst
> > > @@ -378,3 +378,13 @@ alignment of a variable:
> > >  
> > >  ``LIBGCCJIT_ABI_25`` covers the addition of
> > >  :func:`gcc_jit_type_get_restrict`
> > > +
> > > +
> > > +.. _LIBGCCJIT_ABI_26:
> > > +
> > > +``LIBGCCJIT_ABI_26``
> > > +
> > > +``LIBGCCJIT_ABI_26`` covers the addition of functions to
> > > manipulate vectors:
> > > +
> > > +  * :func:`gcc_jit_context_new_rvalue_vector_perm`
> > > +  * :func:`gcc_jit_context_new_vector_access`
> > > diff --git a/gcc/jit/docs/topics/expressions.rst
> > > b/gcc/jit/docs/topics/expressions.rst
> > > index 42cfee36302..4a45aa13f5c 100644
> > > --- a/gcc/jit/docs/topics/expressions.rst
> > > +++ b/gcc/jit/docs/topics/expressions.rst
> > > @@ -295,6 +295,35 @@ Vector expressions
> > >  
> > >    #ifdef
> > > LIBGCCJIT_HAVE_gcc_jit_context_new_rvalue_from_vector
> > >  
> > > +.. function:: gcc_jit_rvalue * \
> > > +  gcc_jit_context_new_rvalue_vector_perm
> > > (gcc_jit_context *ctxt, \
> > > + 
> > > gcc_jit_location *loc, \
> > > + 
> > > gcc_jit_rvalue *elements1, \
> > > + 
> > > gcc_jit_rvalue *elements2, \
> > > + 
> > > gcc_jit_rvalue *mask);
> > > +
> > > +   Build a permutation of two vectors.
> > > +
> > > +   "elements1" and "elements2" should have the same type.
> > > +   The length of "mask" and "elements1" should be the same.
> > > +   The element type of "mask" should be integral.
> > > +   The size of the element type of "mask" and "elements1" should
> > > be the same.
> > > +
> > > +   This entrypoint was added in :ref:`LIBGCCJIT_ABI_25`; you can
> > > test for
> >    ^^
> > Should be 26
> > 
> > [...snip...]
> > 
> > >  Unary Operations
> > >  
> > >  
> > > @@ -1020,3 +1049,27 @@ Field access is provided separately for
> > > both
> > > lvalues and rvalues.
> > >    PTR[INDEX]
> > >  
> > >     in C (or, indeed, to ``PTR + INDEX``).
> > > +
> > > +.. function:: gcc_jit_lvalue *\
> > > +  gcc_jit_context_new_vector_access (gcc_jit_context
> > > *ctxt,\
> > > +
> > > gcc_jit_location
> > > *loc,\
> > > + gcc_jit_rvalue
> > > *vector,\
> > > + gcc_jit_rvalue
> > > *index)
> > > +
> > > +   Given an rvalue of vector type ``T __attribute__
> > > ((__vector_size__ (SIZE)))``, get the element `T` at
> > > +   the given index.
> > > +
> > > +   This entrypoint was added in :ref:`LIBGCCJIT_ABI_25`; you can
> > > test for
> >    ^^
> > 
> > Likewise here.
> > 
> > [...snip...]
> > 
> > > @@ -4071,6 +4107,79 @@ gcc_jit_context_new_rvalue_from_vector
> > > (gcc_jit_context *ctxt,
> > >   (gcc::jit::recording::rvalue **)elements);
> > >  }
> > >  
> > > +/* Public entrypoint.  See description in libgccjit.h.
> > > +
> > > +   After error-checking, the real work is done by the
> > > +   gcc::jit::recording::context::new_rvalue_vector_perm method,
> > > in
> > > +   jit-recording.cc.  */
> > > +
> > > +gcc_jit_rvalue *
> > > +gcc_jit_context_new_rvalue_vector_perm (gcc_jit_context *ctxt,
> > > + gcc_jit_location *loc,
> > > + gcc_jit_rvalue
> > > *elements1,
> > > + gcc_jit_rvalue
> > > *elements2,
> > > + gcc_jit_rvalue *mask)
> > > +{
> > > +  RETURN_NULL_IF_FAIL (ctxt, NULL, loc, "NULL ctxt");
> > > +  JIT_LOG_FUNC (ctxt->get_logger ());
> > > +
> > > +  /* LOC can be NULL.  */
> > 
> > ...but "elements1", "elements2", and "mask" must not be NULL, as
> > they're dereferenced below.  So this is going to need something
> > like
> > the following (untested):
> > 
> >   RETURN_NULL_IF_FAIL (elements1, ctxt, loc, "NULL elements1");
> >   RETURN_NULL_IF_FAIL (elements2, ctxt,

[PATCH, committed] Fortran: fix wrong array bounds check [PR113471]

2024-01-19 Thread Harald Anlauf

Dear all,

I've pushed the attached obvious patch for a regression due to a
wrong array bounds check after regtesting on x86_64-pc-linux-gnu
and verification of the fix by the reporter in the PR.

https://gcc.gnu.org/g:94b2e6cb1cc4feb122bf77f19a657c97bffa9b42

Thanks,
Harald

From 94b2e6cb1cc4feb122bf77f19a657c97bffa9b42 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 19 Jan 2024 21:20:44 +0100
Subject: [PATCH] Fortran: fix wrong array bounds check [PR113471]

gcc/fortran/ChangeLog:

	PR fortran/113471
	* trans-array.cc (array_bound_check_elemental): Array bounds check
	shall apply here to elemental dimensions of an array section only.

gcc/testsuite/ChangeLog:

	PR fortran/113471
	* gfortran.dg/bounds_check_24.f90: New test.
---
 gcc/fortran/trans-array.cc|  2 +-
 gcc/testsuite/gfortran.dg/bounds_check_24.f90 | 28 +++
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/bounds_check_24.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 26e7adaa03f..878a92aff18 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -3600,7 +3600,7 @@ array_bound_check_elemental (gfc_se * se, gfc_ss * ss, gfc_expr * expr)
 	  continue;
 	}

-	  if (ref->type == REF_ARRAY && ref->u.ar.dimen > 0)
+	  if (ref->type == REF_ARRAY && ref->u.ar.type == AR_SECTION)
 	{
 	  ar = &ref->u.ar;
 	  for (dim = 0; dim < ar->dimen; dim++)
diff --git a/gcc/testsuite/gfortran.dg/bounds_check_24.f90 b/gcc/testsuite/gfortran.dg/bounds_check_24.f90
new file mode 100644
index 000..d0251e8455b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/bounds_check_24.f90
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-additional-options "-fcheck=bounds -fdump-tree-original" }
+!
+! PR fortran/113471 - wrong array bounds check
+
+program pr113471
+  implicit none
+  type t
+ integer, dimension(2) :: c1 = 0
+  end type t
+  type(t) :: cc(7), bb(7)
+  integer :: kk = 1
+
+  ! no bounds check (can be determined at compile time):
+  call foo (cc(7)% c1)
+
+  ! bounds check involving kk, but no "outside of expected range"
+  call foo (bb(kk)% c1)
+
+contains
+  subroutine foo (c)
+integer, intent(in) :: c(:)
+  end
+end
+
+! { dg-final { scan-tree-dump-times "below lower bound" 2 "original" } }
+! { dg-final { scan-tree-dump-times "above upper bound" 2 "original" } }
+! { dg-final { scan-tree-dump-not "outside of expected range" "original" } }
--
2.35.3

Re: [PATCH] libgccjit: Allow comparing aligned int types

2024-01-19 Thread Antoni Boucher

David: Ping.

On Thu, 2023-12-21 at 08:33 -0500, Antoni Boucher wrote:
> Hi.
> This patch allows comparing aligned integer types as equal.
> There's a TODO in the code about whether we should check that the
> alignment is equal.
> What are your thoughts on this?
> 
> Thanks for the review.

Re: [PATCH] libgccjit: Allow sending a const pointer as argument

2024-01-19 Thread Antoni Boucher

David: Ping.

On Thu, 2023-12-21 at 11:59 -0500, Antoni Boucher wrote:
> Hi.
> This patch adds the ability to send const pointer as argument to a
> function.
> Thanks for the review.

Re: [PATCH] libgccjit: Add convert vector

2024-01-19 Thread Antoni Boucher

David: Ping.

On Thu, 2023-12-21 at 16:01 -0500, Antoni Boucher wrote:
> Hi.
> This patch adds the support for the convert vector internal function.
> I'll need to double-check that making the decl a register is
> necessary.
> Thanks for the review.

Re: [committed] Fix comment typos

2024-01-19 Thread rep . dot . nop

Hi

Just another commentary typo..

On 17 January 2024 11:23:01 CET, Jakub Jelinek  wrote:

>--- gcc/gengtype.cc.jj 2024-01-03 11:51:23.314845233 +0100
>+++ gcc/gengtype.cc2024-01-16 18:56:57.383009291 +0100
>@@ -4718,8 +4718,8 @@ write_roots (pair_p variables, bool emit
> }
> 
> /* Prints not-as-ugly version of a typename of T to OF.  Trades the uniquness
>-   guaranteee for somewhat increased readability.  If name conflicts do 
>happen,

s/uniquness/uniqueness/g

thanks

Re: Fix merging of value predictors

2024-01-19 Thread rep . dot . nop

On 17 January 2024 14:20:49 CET, Jan Hubicka  wrote:

>--- a/gcc/predict.def
>+++ b/gcc/predict.def
>@@ -94,6 +94,16 @@ DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUESSED, "guessed loop 
>iterations",
> DEF_PREDICTOR (PRED_LOOP_ITERATIONS_MAX, "guessed loop iterations",
>  PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
> 
>+/* Prediction which is an outcome of combining multiple value predictions.  */
>+DEF_PREDICTOR (PRED_COMBINED_VALUE_PREDICTIONS,
>+ "combined value predictions", PROB_UNINITIALIZED, 0)
>+
>+/* Prediction which is an outcome of combining multiple value predictions
>+   on PHI statement (this is less accurate since we do not know reverse
>+   edge probabilities at that time).  */
>+DEF_PREDICTOR (PRED_COMBINED_VALUE_PREDICTIONS_PHI,
>+ "combined value predictions", PROB_UNINITIALIZED, 0)
>+

Do you want to add "phi" somewhere to the latter (to distinguish them in the 
dumps)?

thanks

Re: [PATCH] modula2: Many powerpc platforms do _not_ have support for IEEE754 long double [PR111956]

2024-01-19 Thread Gaius Mulley

Richard Biener  writes:

> On Thu, Jan 18, 2024 at 1:58 AM Gaius Mulley  wrote:
>>
>>
>> ok for master ?
>>
>> Bootstrapped on power8 (cfarm135), power9 (cfarm120) and
>> x86_64-linux-gnu.
>
> OK.

many thanks!

> I wonder what this does to the libm2 ABI?

ah yes - I'll open a PR reflecting lack of libm2 ABI compatibility on
powerpc platforms.

[PATCH] libgccjit: Add support for creating temporary variables

2024-01-19 Thread Antoni Boucher

Hi.
This patch adds a new way to create a local variable that won't generate
debug info: it is to be used for compiler-generated variables.
Thanks for the review.
From 6f69e9db77f3c7e019fae74414ba5eed15298514 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Thu, 18 Jan 2024 16:54:59 -0500
Subject: [PATCH] libgccjit: Add support for creating temporary variables

gcc/jit/ChangeLog:

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_27): New ABI tag.
	* docs/topics/functions.rst: Document gcc_jit_function_new_temp.
	* jit-playback.cc (new_local): Add new is_temp parameter.
	* jit-playback.h: Add new is_temp parameter.
	* jit-recording.cc (recording::function::new_temp): New method.
	(recording::local::replay_into): Support temporary variables.
	(recording::local::write_reproducer): Support temporary
	variables.
	* jit-recording.h (new_temp): New method.
	(m_is_temp): New field.
	* libgccjit.cc (gcc_jit_function_new_temp): New function.
	* libgccjit.h (gcc_jit_function_new_temp): New function.
	* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-failing-tests.h: Mention test-temp.c.
	* jit.dg/test-temp.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst|  9 
 gcc/jit/docs/topics/functions.rst| 20 +++
 gcc/jit/jit-playback.cc  | 21 ++--
 gcc/jit/jit-playback.h   |  3 +-
 gcc/jit/jit-recording.cc | 52 +-
 gcc/jit/jit-recording.h  | 17 --
 gcc/jit/libgccjit.cc | 31 +++
 gcc/jit/libgccjit.h  |  7 +++
 gcc/jit/libgccjit.map|  5 ++
 gcc/testsuite/jit.dg/all-non-failing-tests.h |  3 ++
 gcc/testsuite/jit.dg/test-temp.c | 56 
 11 files changed, 205 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-temp.c

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index cbf5b414d8c..5d62e264a00 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -390,3 +390,12 @@ on functions and variables:
   * :func:`gcc_jit_function_add_string_attribute`
   * :func:`gcc_jit_function_add_integer_array_attribute`
   * :func:`gcc_jit_lvalue_add_string_attribute`
+
+.. _LIBGCCJIT_ABI_27:
+
+``LIBGCCJIT_ABI_27``
+
+``LIBGCCJIT_ABI_27`` covers the addition of a function to create a new
+temporary variable:
+
+  * :func:`gcc_jit_function_new_temp`
diff --git a/gcc/jit/docs/topics/functions.rst b/gcc/jit/docs/topics/functions.rst
index 804605ea939..230caf42466 100644
--- a/gcc/jit/docs/topics/functions.rst
+++ b/gcc/jit/docs/topics/functions.rst
@@ -171,6 +171,26 @@ Functions
underlying string, so it is valid to pass in a pointer to an on-stack
buffer.
 
+.. function:: gcc_jit_lvalue *\
+  gcc_jit_function_new_temp (gcc_jit_function *func,\
+ gcc_jit_location *loc,\
+ gcc_jit_type *type)
+
+   Create a new local variable within the function, of the given type.
+   This function is similar to :func:`gcc_jit_function_new_local`, but
+   it is to be used for compiler-generated variables (as opposed to
+   user-defined variables in the language to be compiled) and these
+   variables won't show up in the debug info.
+
+   The parameter ``type`` must be non-`void`.
+
+   This entrypoint was added in :ref:`LIBGCCJIT_ABI_27`; you can test
+   for its presence using
+
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_function_new_temp
+
 .. function::  size_t \
gcc_jit_function_get_param_count (gcc_jit_function *func)
 
diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
index 84df6c100e6..cb6b2f66276 100644
--- a/gcc/jit/jit-playback.cc
+++ b/gcc/jit/jit-playback.cc
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "toplev.h"
 #include "tree-cfg.h"
 #include "convert.h"
+#include "gimple-expr.h"
 #include "stor-layout.h"
 #include "print-tree.h"
 #include "gimplify.h"
@@ -1950,13 +1951,27 @@ new_local (location *loc,
 	   type *type,
 	   const char *name,
 	   const std::vector> &attributes)
+   std::string>> &attributes,
+	   bool is_temp)
 {
   gcc_assert (type);
-  gcc_assert (name);
-  tree inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+  tree inner;
+  if (is_temp)
+  {
+inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+			create_tmp_var_name ("JITTMP"),
+			type->as_tree ());
+DECL_ARTIFICIAL (inner) = 1;
+DECL_IGNORED_P (inner) = 1;
+DECL_NAMELESS (inner) = 1;
+  }
+  else
+  {
+gcc_assert (name);
+inner = build_decl (UNKNOWN_LOCATION, VAR_DECL,
 			   get_identifier (name),
 			   type->as_tree ());
+  }
   DECL_CONTEXT (inner) = this->m_inner_fndecl;
 
   /* Prepend to BIND_EXPR_VARS: */
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
ind

[PATCH] libgccjit: Allow comparing array types

2024-01-19 Thread Antoni Boucher

Hi.
This patch allows comparing different instances of array types as
equal.
Thanks for the review.
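
As a rough illustration of the comparison, here is a standalone sketch. The
classes `ty`, `int_ty` and `array_ty` are toy stand-ins, not the
gcc::jit::recording hierarchy, and the identity-based default comparison in
`ty` is an assumption of the sketch. Two array types compare equal when their
element counts match and their element types compare equal.

```cpp
#include <cassert>

struct array_ty;

// Toy base class: by default two types are the same only if they are
// the same object (an assumption of this sketch).
struct ty
{
  virtual ~ty () = default;
  virtual const array_ty *dyn_cast_array () const { return nullptr; }
  virtual bool is_same_type_as (const ty *other) const { return this == other; }
};

struct int_ty final : ty { };

struct array_ty final : ty
{
  array_ty (const ty *elem, unsigned long n)
    : element (elem), num_elements (n)
  {
    assert (elem);
  }

  const array_ty *dyn_cast_array () const override { return this; }

  // Structural comparison: same length and same element type.
  bool is_same_type_as (const ty *other) const override
  {
    const array_ty *o = other->dyn_cast_array ();
    if (!o)
      return false;
    return num_elements == o->num_elements
	   && element->is_same_type_as (o->element);
  }

  const ty *element;
  unsigned long num_elements;
};

// Two distinct instances of "int[2]" compare equal; "int[3]" and plain
// "int" do not compare equal to "int[2]".
inline bool
arrays_compare_equal ()
{
  int_ty i;
  array_ty a (&i, 2), b (&i, 2), c (&i, 3);
  return a.is_same_type_as (&b)
	 && !a.is_same_type_as (&c)
	 && !a.is_same_type_as (&i);
}
```

The dynamic-cast hook lets a wrapper type forward the query to the type it
wraps, which is what the aligned-type override in the patch does.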
From ef4afd9de440f10502f3cc84b2112cf83cde2610 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Tue, 2 Jan 2024 16:04:10 -0500
Subject: [PATCH] libgccjit: Allow comparing array types

gcc/jit/ChangeLog:

	* jit-common.h: Add array_type class.
	* jit-recording.h (type::dyn_cast_array_type,
	memento_of_get_aligned::dyn_cast_array_type,
	array_type::dyn_cast_array_type, array_type::is_same_type_as):
	New methods.

gcc/testsuite/ChangeLog:

	* jit.dg/test-types.c: Add array type comparison to the test.
---
 gcc/jit/jit-common.h  |  1 +
 gcc/jit/jit-recording.h   | 17 +
 gcc/testsuite/jit.dg/test-types.c |  5 +
 3 files changed, 23 insertions(+)

diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 80c1618da96..57a667e6d12 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -118,6 +118,7 @@ namespace recording {
 class struct_;
 	class union_;
   class vector_type;
+  class array_type;
 class field;
   class bitfield;
 class fields;
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 4a8082991fb..df33ce219fc 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -545,6 +545,7 @@ public:
   virtual function_type *as_a_function_type() { gcc_unreachable (); return NULL; }
   virtual struct_ *dyn_cast_struct () { return NULL; }
   virtual vector_type *dyn_cast_vector_type () { return NULL; }
+  virtual array_type *dyn_cast_array_type () { return NULL; }
 
   /* Is it typesafe to copy to this type from rtype?  */
   virtual bool accepts_writes_from (type *rtype)
@@ -810,6 +811,11 @@ public:
 
   void replay_into (replayer *) final override;
 
+  array_type *dyn_cast_array_type () final override
+  {
+return m_other_type->dyn_cast_array_type ();
+  }
+
 private:
   string * make_debug_string () final override;
   void write_reproducer (reproducer &r) final override;
@@ -868,6 +874,17 @@ class array_type : public type
 
   type *dereference () final override;
 
+  bool is_same_type_as (type *other) final override
+  {
+array_type *other_array_type = other->dyn_cast_array_type ();
+if (!other_array_type)
+  return false;
+return m_num_elements == other_array_type->m_num_elements
+  && m_element_type->is_same_type_as (other_array_type->m_element_type);
+  }
+
+  array_type *dyn_cast_array_type () final override { return this; }
+
   bool is_int () const final override { return false; }
   bool is_float () const final override { return false; }
   bool is_bool () const final override { return false; }
diff --git a/gcc/testsuite/jit.dg/test-types.c b/gcc/testsuite/jit.dg/test-types.c
index a01944e35fa..79f7ea21026 100644
--- a/gcc/testsuite/jit.dg/test-types.c
+++ b/gcc/testsuite/jit.dg/test-types.c
@@ -492,4 +492,9 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
 
   CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_FLOAT)), sizeof (float));
   CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_DOUBLE)), sizeof (double));
+
+  gcc_jit_type *int_type = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+  gcc_jit_type *array_type1 = gcc_jit_context_new_array_type (ctxt, NULL, int_type, 2);
+  gcc_jit_type *array_type2 = gcc_jit_context_new_array_type (ctxt, NULL, int_type, 2);
+  CHECK (gcc_jit_compatible_types (array_type1, array_type2));
 }
-- 
2.43.0

[PATCH] libgccjit: Add gcc_jit_global_set_readonly

2024-01-19 Thread Antoni Boucher

Hi.
This patch adds a new API gcc_jit_global_set_readonly: it's equivalent
to having a const global variable, but it is useful in the case of
complex compilers where it is not convenient to use const.
Thanks for the review.
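
As a rough illustration of the intended semantics (a flag set after the global is created, followed by rejection of later assignments), here is a small stand-alone model. The names `global_model`, `set_readonly` and `add_assignment` are hypothetical and exist only for this sketch; the real entry points are the ones listed in the ChangeLog below.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a global variable; not a libgccjit type.
struct global_model
{
  std::string name;
  bool readonly = false;
  int value = 0;
};

// Models marking an already-created global as read-only.
void set_readonly (global_model &g)
{
  g.readonly = true;
}

// Models the new check on assignment: returns false (an error in the
// real API) when the target has been marked read-only.
bool add_assignment (global_model &g, int new_value)
{
  if (g.readonly)
    return false;
  g.value = new_value;
  return true;
}
```

The point of the design is that the front end does not need to decide on constness when it creates the type; it can mark the variable later.
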
From ff3aa19207a6cdaeff6fcb6521ad2ad92f5448ff Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Tue, 24 May 2022 17:45:01 -0400
Subject: [PATCH] libgccjit: Add gcc_jit_global_set_readonly

gcc/jit/ChangeLog:

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_26): New ABI tag.
	* docs/topics/expressions.rst: Document gcc_jit_global_set_readonly.
	* jit-playback.cc (global_new_decl, new_global,
	new_global_initialized): New parameter readonly.
	* jit-playback.h (global_new_decl, new_global,
	new_global_initialized): New parameter readonly.
	* jit-recording.cc (recording::global::replay_into): Use
	m_readonly.
	(recording::global::write_reproducer): Dump reproducer for
	gcc_jit_global_set_readonly.
	* jit-recording.h (get_readonly, set_readonly): New methods.
	(m_readonly): New attribute.
	* libgccjit.cc (gcc_jit_global_set_readonly): New function.
	(gcc_jit_block_add_assignment): Check that we don't assign to a
	readonly variable.
	* libgccjit.h (gcc_jit_global_set_readonly): New function.
	(LIBGCCJIT_HAVE_gcc_jit_global_set_readonly): New define.
	* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-failing-tests.h: Mention test-readonly.c.
	* jit.dg/test-error-assign-readonly.c: New test.
	* jit.dg/test-readonly.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst |  7 +++
 gcc/jit/docs/topics/expressions.rst   | 12 
 gcc/jit/jit-playback.cc   | 15 +++--
 gcc/jit/jit-playback.h|  9 ++-
 gcc/jit/jit-recording.cc  | 10 ++-
 gcc/jit/jit-recording.h   | 12 
 gcc/jit/libgccjit.cc  | 22 +++
 gcc/jit/libgccjit.h   |  5 ++
 gcc/jit/libgccjit.map |  5 ++
 gcc/testsuite/jit.dg/all-non-failing-tests.h  |  3 +
 .../jit.dg/test-error-assign-readonly.c   | 62 +++
 gcc/testsuite/jit.dg/test-readonly.c  | 38 
 12 files changed, 189 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-error-assign-readonly.c
 create mode 100644 gcc/testsuite/jit.dg/test-readonly.c

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index ebede440ee4..e13581d0685 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -378,3 +378,10 @@ alignment of a variable:
 
 ``LIBGCCJIT_ABI_25`` covers the addition of
 :func:`gcc_jit_type_get_restrict`
+
+.. _LIBGCCJIT_ABI_26:
+
+``LIBGCCJIT_ABI_26``
+
+``LIBGCCJIT_ABI_26`` covers the addition of
+:func:`gcc_jit_global_set_readonly`
diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index 42cfee36302..8d4aa96e64a 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -944,6 +944,18 @@ Global variables
 
   #ifdef LIBGCCJIT_HAVE_CTORS
 
+.. function:: void\
+  gcc_jit_global_set_readonly (gcc_jit_lvalue *global)
+
+   Set the global variable as read-only, meaning you cannot assign to this variable.
+
+   This entrypoint was added in :ref:`LIBGCCJIT_ABI_26`; you can test for its
+   presence using:
+
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_global_set_readonly
+
 Working with pointers, structs and unions
 -
 
diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
index 537f3b1..420f3a843cf 100644
--- a/gcc/jit/jit-playback.cc
+++ b/gcc/jit/jit-playback.cc
@@ -607,7 +607,8 @@ global_new_decl (location *loc,
 		 enum gcc_jit_global_kind kind,
 		 type *type,
 		 const char *name,
-		 enum global_var_flags flags)
+		 enum global_var_flags flags,
+		 bool readonly)
 {
   gcc_assert (type);
   gcc_assert (name);
@@ -646,7 +647,7 @@ global_new_decl (location *loc,
   break;
 }
 
-  if (TYPE_READONLY (type_tree))
+  if (TYPE_READONLY (type_tree) || readonly)
 TREE_READONLY (inner) = 1;
 
   if (loc)
@@ -674,10 +675,11 @@ new_global (location *loc,
 	enum gcc_jit_global_kind kind,
 	type *type,
 	const char *name,
-	enum global_var_flags flags)
+	enum global_var_flags flags,
+	bool readonly)
 {
   tree inner =
-global_new_decl (loc, kind, type, name, flags);
+global_new_decl (loc, kind, type, name, flags, readonly);
 
   return global_finalize_lvalue (inner);
 }
@@ -822,9 +824,10 @@ new_global_initialized (location *loc,
 			size_t initializer_num_elem,
 			const void *initializer,
 			const char *name,
-			enum global_var_flags flags)
+			enum global_var_flags flags,
+			bool readonly)
 {
-  tree inner = global_new_decl (loc, kind, type, name, flags);
+  tree inner = global_new_decl (loc, kind, type

Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-19 Thread Peter Bergner

On 1/11/24 11:29 AM, Michael Meissner wrote:
> This is version 2 of the patch.  The only difference is I made the test case
> simpler to read.
[snip]
> gcc/
> 
>   PR target/112886
>   * config/rs6000/rs6000.cc (print_operand): Add %S output modifier.
>   * doc/md.texi (Modifiers): Mention %S can be used like %x.
> 
> gcc/testsuite/
> 
>   PR target/112886
>   * /gcc.target/powerpc/pr112886.c: New test.

This resolves my issue with the first patch, so LGTM.

Peter

[PATCH] c++/modules: Handle partial specialisations in GMF [PR113405]

2024-01-19 Thread Nathaniel Shead

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

Currently, when exporting names from the GMF, or within header modules,
for a set of constrained partial specialisations we only emit the first
one. This is because the 'type_specialization' list only includes a
single specialization per template+argument list; constraints are not
considered here.

The existing code uses a separate 'partial_specializations' list to
track this instead, but currently it's only used for declarations in the
module purview. This patch makes use of this list for all declarations.

PR c++/113405

gcc/cp/ChangeLog:

* module.cc (set_defining_module): Track partial specialisations
for all declarations.

gcc/testsuite/ChangeLog:

* g++.dg/modules/concept-9.h: New test.
* g++.dg/modules/concept-9_a.C: New test.
* g++.dg/modules/concept-9_b.C: New test.
* g++.dg/modules/concept-10_a.H: New test.
* g++.dg/modules/concept-10_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc|  5 -
 gcc/testsuite/g++.dg/modules/concept-10_a.H | 25 +
 gcc/testsuite/g++.dg/modules/concept-10_b.C |  8 +++
 gcc/testsuite/g++.dg/modules/concept-9.h| 18 +++
 gcc/testsuite/g++.dg/modules/concept-9_a.C  | 13 +++
 gcc/testsuite/g++.dg/modules/concept-9_b.C  |  8 +++
 6 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-10_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-10_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-9.h
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-9_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-9_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 8db662c0267..249d0816169 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -18860,8 +18860,11 @@ set_defining_module (tree decl)
   gcc_checking_assert (!DECL_LANG_SPECIFIC (decl)
   || !DECL_MODULE_IMPORT_P (decl));
 
-  if (module_has_cmi_p ())
+  if (module_p ())
 {
+  /* We need to track all declarations within a module, not just those
+in the module purview, because we don't necessarily know yet if
+this module will require a CMI while in the global fragment.  */
   tree ctx = DECL_CONTEXT (decl);
   if (ctx
  && (TREE_CODE (ctx) == RECORD_TYPE || TREE_CODE (ctx) == UNION_TYPE)
diff --git a/gcc/testsuite/g++.dg/modules/concept-10_a.H 
b/gcc/testsuite/g++.dg/modules/concept-10_a.H
new file mode 100644
index 000..c3a5fa727a7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-10_a.H
@@ -0,0 +1,25 @@
+// Also test header modules
+// PR c++/113405
+// { dg-additional-options "-fmodule-header" }
+// { dg-require-effective-target c++20 }
+// { dg-module-cmi {} }
+
+template 
+concept foo = false;
+
+template 
+concept bar = true;
+
+template 
+struct corge {};
+
+template 
+struct corge {};
+
+template 
+struct corge {
+  using alias = int;
+};
+
+template 
+using corge_alias = corge::alias;
diff --git a/gcc/testsuite/g++.dg/modules/concept-10_b.C 
b/gcc/testsuite/g++.dg/modules/concept-10_b.C
new file mode 100644
index 000..67be13d5995
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-10_b.C
@@ -0,0 +1,8 @@
+// PR c++/113405
+// { dg-additional-options "-fmodules-ts" }
+// { dg-require-effective-target c++20 }
+
+import "concept-10_a.H";
+
+struct test {};
+using quux = corge_alias;
diff --git a/gcc/testsuite/g++.dg/modules/concept-9.h 
b/gcc/testsuite/g++.dg/modules/concept-9.h
new file mode 100644
index 000..1c7f003228c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-9.h
@@ -0,0 +1,18 @@
+// PR c++/113405
+
+template 
+concept foo = false;
+
+template 
+concept bar = true;
+
+template 
+struct corge {};
+
+template 
+struct corge {};
+
+template 
+struct corge {
+  using alias = int;
+};
diff --git a/gcc/testsuite/g++.dg/modules/concept-9_a.C 
b/gcc/testsuite/g++.dg/modules/concept-9_a.C
new file mode 100644
index 000..9a055b6dcc9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-9_a.C
@@ -0,0 +1,13 @@
+// PR c++/113405
+// { dg-additional-options "-fmodules-ts" }
+// { dg-require-effective-target c++20 }
+// { dg-module-cmi M }
+
+module;
+
+#include "concept-9.h"
+
+export module M;
+
+export template
+using corge_alias = corge::alias;
diff --git a/gcc/testsuite/g++.dg/modules/concept-9_b.C 
b/gcc/testsuite/g++.dg/modules/concept-9_b.C
new file mode 100644
index 000..55a64a9a413
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-9_b.C
@@ -0,0 +1,8 @@
+// PR c++/113405
+// { dg-additional-options "-fmodules-ts" }
+// { dg-require-effective-target c++20 }
+
+import M;
+
+struct test {};
+using quux = corge_alias;
-- 
2.43.0

Re: [PATCH] Avoid ICE in single-bit logical RMWs on m68k-uclinux [PR108640]

2024-01-19 Thread Jeff Law

On 1/18/24 09:39, Mikael Pettersson wrote:

When generating RMW logical operations on m68k, the backend
recognizes single-bit operations and rewrites them as bit
instructions on operands adjusted to address the intended byte.
When offsetting the addresses the backend keeps the modes as
SImode, even though the actual access will be in QImode.

The uclinux target defines M68K_OFFSETS_MUST_BE_WITHIN_SECTIONS_P
which adds a check that the adjusted operand is within the bounds
of the original object.  Since the address has been offset it is
not, and the compiler ICEs.

The bug is that the modes of the adjusted operands should have been
narrowed to QImode, which is what this patch does.  Nearby code
which narrows to HImode gets that right.
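
The address adjustment described above can be sketched as follows. This is a stand-alone model, assuming the big-endian byte order of m68k and a 32-bit operand; `bit_access` and `single_bit_access` are hypothetical names for this sketch, and the offset formula is an assumption about how the byte is chosen rather than a quote from m68k.cc.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset from the start of the 32-bit word, and bit number
// within that byte (0 = least significant bit).
struct bit_access
{
  int byte_offset;
  int bit;
};

// For a mask with exactly one bit set, find the byte a bit instruction
// would have to address on a big-endian target.
bit_access single_bit_access (std::uint32_t mask)
{
  // Precondition: exactly one bit is set.
  assert (mask != 0 && (mask & (mask - 1)) == 0);

  int logval = 0;
  while ((mask >> logval) != 1u)
    ++logval;

  // Big-endian: the most significant byte has the lowest address.
  return { 3 - logval / 8, logval % 8 };
}
```

For example, the mask 0x100 (bit 8) lands in the byte at offset 2, bit 0. The access at that adjusted address is one byte wide, which is why the operand mode should be QImode rather than SImode.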

Bootstrapped and regression tested on m68k-linux-gnu.

Ok for master? (Note: I don't have commit rights.)

gcc/

PR target/108640
* config/m68k/m68k.cc (output_andsi3): Use QImode for
address adjusted for 1-byte RMW access.
(output_iorsi3): Likewise.
(output_xorsi3): Likewise.

gcc/testsuite/

PR target/108640
* gcc.target/m68k/pr108640.c: New test.

While not really a regression, this can clearly only affect the m68k 
target and fixes an ICE.  So I went ahead and pushed it to the trunk.


Just a note for the future, the test really isn't m68k specific in that 
it should compile just fine on any target GCC supports.  It just 
happened to ICE on the m68k.  We generally prefer to put such tests in 
target independent directories.  dg-torture, c-torture would be better 
choices for this kind of test in the future.


Thanks.

Jeff

Re: [PATCH] Avoid ICE on m68k -fzero-call-used-regs -fpic [PR110934]

2024-01-19 Thread Jeff Law

On 1/17/24 10:03, Mikael Pettersson wrote:

PR110934 is a problem on m68k where -fzero-call-used-regs -fpic ICEs
when clearing an FP register.

The generic code generates an XFmode move of zero to that register,
which becomes an XFmode load from initialized data, which due to -fpic
uses a non-constant address, which the backend rejects.  The
zero-call-used-regs pass runs very late, after register allocation and
frame layout, and at that point we can't allow new uses of the PIC
register or new pseudos.

To clear an FP register on m68k it's enough to do the move in SFmode,
but the generic code can't be told to do that, so this patch updates
m68k to use its own TARGET_ZERO_CALL_USED_REGS.

Bootstrapped and regression tested on m68k-linux-gnu.

Ok for master? (I don't have commit rights.)

We can certainly have new uses of the PIC register after reload.  What 
we can't do is allocate a new scratch register after reload to hold the 
address of the object from the GOT.  It's a subtle difference.


Because we're zeroing call used registers and we only do this at return 
points, we could (in theory) use one of the call-used address registers 
as a scratch.  Doing that requires (AFAICT) defining the same target 
hook you're using, so it's not any cleaner from that point of view.

+/* Implement TARGET_ZERO_CALL_USED_REGS.  */
+
+static HARD_REG_SET
+m68k_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
+{
+  rtx zero_fpreg = NULL_RTX;
+
+  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
+  {
+   rtx reg, zero;
+
+   if (INT_REGNO_P (regno))
+ {
+   reg = regno_reg_rtx[regno];
+   zero = CONST0_RTX (SImode);
+ }
+   else if (FP_REGNO_P (regno))
+ {
+   reg = gen_raw_REG (SFmode, regno);
+   if (zero_fpreg == NULL_RTX)
+ {
+   /* On the 040/060 clearing an FP reg loads a large
+  immediate.  To reduce code size use the first
+  cleared FP reg to clear remaing ones.  Don't do

Minor typo.  s/remaing/remaining/

I'll fix that and push the patch to the trunk.  It's as clean as other 
approaches I pondered would likely be.


Jeff

[Committed] RISC-V: Suppress warning

2024-01-19 Thread Juzhe-Zhong

../../gcc/config/riscv/riscv.cc: In function 'void 
riscv_init_cumulative_args(CUMULATIVE_ARGS*, tree, rtx, tree, int)':
../../gcc/config/riscv/riscv.cc:4879:34: error: unused parameter 'fndecl' 
[-Werror=unused-parameter]
4879 | tree fndecl,
  | ~^~
../../gcc/config/riscv/riscv.cc: In function 'bool 
riscv_vector_mode_supported_any_target_p(machine_mode)':
../../gcc/config/riscv/riscv.cc:10537:56: error: unused parameter 'mode' 
[-Werror=unused-parameter]
10537 | riscv_vector_mode_supported_any_target_p (machine_mode mode)
  |   ~^~~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2559: riscv.o] Error 1

Suppress these warnings.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_init_cumulative_args): Suppress warning.
(riscv_vector_mode_supported_any_target_p): Ditto.

---
 gcc/config/riscv/riscv.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd6e68a08c2..1f9546f4d3e 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4876,7 +4876,7 @@ void
 riscv_init_cumulative_args (CUMULATIVE_ARGS *cum,
tree fntype ATTRIBUTE_UNUSED,
rtx libname ATTRIBUTE_UNUSED,
-   tree fndecl,
+   tree fndecl ATTRIBUTE_UNUSED,
int caller ATTRIBUTE_UNUSED)
 {
   memset (cum, 0, sizeof (*cum));
@@ -10534,7 +10534,7 @@ extract_base_offset_in_addr (rtx mem, rtx *base, rtx 
*offset)
 /* Implements target hook vector_mode_supported_any_target_p.  */
 
 static bool
-riscv_vector_mode_supported_any_target_p (machine_mode mode)
+riscv_vector_mode_supported_any_target_p (machine_mode)
 {
   if (TARGET_XTHEADVECTOR)
 return false;
-- 
2.36.3

Re: [Committed] RISC-V: Suppress warning

2024-01-19 Thread Jeff Law

On 1/19/24 17:27, Juzhe-Zhong wrote:

../../gcc/config/riscv/riscv.cc: In function 'void 
riscv_init_cumulative_args(CUMULATIVE_ARGS*, tree, rtx, tree, int)':
../../gcc/config/riscv/riscv.cc:4879:34: error: unused parameter 'fndecl' 
[-Werror=unused-parameter]
4879 | tree fndecl,
   | ~^~
../../gcc/config/riscv/riscv.cc: In function 'bool 
riscv_vector_mode_supported_any_target_p(machine_mode)':
../../gcc/config/riscv/riscv.cc:10537:56: error: unused parameter 'mode' 
[-Werror=unused-parameter]
10537 | riscv_vector_mode_supported_any_target_p (machine_mode mode)
   |   ~^~~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2559: riscv.o] Error 1

Suppress these warnings.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_init_cumulative_args): Suppress warning.
(riscv_vector_mode_supported_any_target_p): Ditto.

There's actually more cleanup to do in there ;-) One of the arguments 
currently marked as unused is actually used.  And the better way to 
handle unused arguments is to just drop their name (like you did with 
riscv_vector_mode_supported_any_target_p).
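
A small stand-alone comparison of the two styles, as a sketch only. The function names are hypothetical, and `__attribute__ ((unused))` is written out directly here on the assumption that this is roughly what ATTRIBUTE_UNUSED amounts to under GCC.

```cpp
#include <cassert>

// Style 1: keep the name and mark it unused.  If the parameter later
// becomes used, the stale attribute stays behind without any diagnostic.
int with_attribute (int used, int ignored __attribute__ ((unused)))
{
  return used + 1;
}

// Style 2: drop the name.  The parameter cannot be referenced at all,
// so the declaration can never disagree with the body.
int with_unnamed (int used, int)
{
  return used + 1;
}
```

Both compile cleanly with -Wunused-parameter; the second form is the one that cannot go stale.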


I'm actually in the process of bootstrapping and regression testing the 
additional fixes to riscv_init_cumulative_args.


jeff

Re: Re: [Committed] RISC-V: Suppress warning

2024-01-19 Thread 钟居哲

OK. I saw the other arguments there:

tree fntype ATTRIBUTE_UNUSED,
rtx libname ATTRIBUTE_UNUSED,

So I followed that pattern and added ATTRIBUTE_UNUSED to 'fndecl'.

Maybe it's better to drop the names of all unused arguments of
riscv_init_cumulative_args, as you suggested.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2024-01-20 08:52
To: Juzhe-Zhong; gcc-patches
CC: pan2.li; schwab
Subject: Re: [Committed] RISC-V: Suppress warning
 
 
On 1/19/24 17:27, Juzhe-Zhong wrote:
> ../../gcc/config/riscv/riscv.cc: In function 'void 
> riscv_init_cumulative_args(CUMULATIVE_ARGS*, tree, rtx, tree, int)':
> ../../gcc/config/riscv/riscv.cc:4879:34: error: unused parameter 'fndecl' 
> [-Werror=unused-parameter]
> 4879 | tree fndecl,
>| ~^~
> ../../gcc/config/riscv/riscv.cc: In function 'bool 
> riscv_vector_mode_supported_any_target_p(machine_mode)':
> ../../gcc/config/riscv/riscv.cc:10537:56: error: unused parameter 'mode' 
> [-Werror=unused-parameter]
> 10537 | riscv_vector_mode_supported_any_target_p (machine_mode mode)
>|   ~^~~~
> cc1plus: all warnings being treated as errors
> make[3]: *** [Makefile:2559: riscv.o] Error 1
> 
> Suppress these warnings.
> 
> gcc/ChangeLog:
> 
> * config/riscv/riscv.cc (riscv_init_cumulative_args): Suppress warning.
> (riscv_vector_mode_supported_any_target_p): Ditto.
There's actually more cleanup to do in there ;-) One of the arguments 
currently marked as unused is actually used.  And the better way to 
handle unused arguments is to just drop their name (like you did with 
riscv_vector_mode_supported_any_target_p).
 
I'm actually in the process of bootstrapping and regression testing the 
additional fixes to riscv_init_cumulative_args.
 
jeff

[PATCH] libstdc++: suppress -Wdangling-reference with operator| [PR111410]

2024-01-19 Thread Marek Polacek

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
It seems to me that we should exclude std::ranges::views::__adaptor::operator|
from the -Wdangling-reference warning.  It's commonly used when handling
ranges.

PR c++/111410

libstdc++-v3/ChangeLog:

* include/std/ranges: Add #pragma to disable -Wdangling-reference with
std::ranges::views::__adaptor::operator|.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference17.C: New test.
---
 gcc/testsuite/g++.dg/warn/Wdangling-reference17.C | 15 +++
 libstdc++-v3/include/std/ranges   |  3 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference17.C

diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
new file mode 100644
index 000..223698422c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
@@ -0,0 +1,15 @@
+// PR c++/111410
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wdangling-reference" }
+
+#include 
+#include 
+
+int main()
+{
+  std::vector v{1, 2, 3, 4, 5};
+  for (auto i : std::span{v} | std::views::take(1))
+{
+  (void) i;
+}
+}
diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 7ef835f486a..f2413badd9c 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -942,6 +942,8 @@ namespace views::__adaptor
 concept __is_range_adaptor_closure
   = requires (_Tp __t) { __adaptor::__is_range_adaptor_closure_fn(__t, 
__t); };
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdangling-reference"
   // range | adaptor is equivalent to adaptor(range).
   template
 requires __is_range_adaptor_closure<_Self>
@@ -961,6 +963,7 @@ namespace views::__adaptor
   return _Pipe, decay_t<_Rhs>>{std::forward<_Lhs>(__lhs),
 std::forward<_Rhs>(__rhs)};
 }
+#pragma GCC diagnostic pop
 
   // The base class of every range adaptor non-closure.
   //

base-commit: 615e25c82de97acc17ab438f88d6788cf7ffe1d6
-- 
2.43.0

[PATCH] c++: -Wdangling-reference and lambda false warning [PR109640]

2024-01-19 Thread Marek Polacek

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
-Wdangling-reference checks if a function receives a temporary as its
argument, and only warns if any of the arguments was a temporary.  But
we should not warn when the temporary represents a lambda or we generate
false positives as in the attached testcases.
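
A reduced illustration of the pattern, written for this message rather than taken from the testcases; `pair_model`, `first_getter` and `reference_is_valid` are hypothetical names. The closure object is a temporary, but the reference it returns points into its argument, which outlives the full expression.

```cpp
#include <cassert>

struct pair_model
{
  int first;
  int second;
};

// Returns a closure that hands back a reference into its argument.
inline auto first_getter ()
{
  return [] (const pair_model &p) -> const int & { return p.first; };
}

// Calls the temporary closure immediately.  The closure dies at the
// end of the full expression, but X refers to P.first, not to
// anything owned by the closure.
inline bool reference_is_valid (const pair_model &p)
{
  const int &x = first_getter () (p);
  return &x == &p.first;
}
```

Here the only temporary involved is the lambda itself, so a warning on the initialization of `x` would be a false positive.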

PR c++/113256
PR c++/111607
PR c++/109640

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn if the temporary
is of lambda type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference14.C: New test.
* g++.dg/warn/Wdangling-reference15.C: New test.
* g++.dg/warn/Wdangling-reference16.C: New test.
---
 gcc/cp/call.cc|  9 --
 .../g++.dg/warn/Wdangling-reference14.C   | 22 +
 .../g++.dg/warn/Wdangling-reference15.C   | 31 +++
 .../g++.dg/warn/Wdangling-reference16.C   | 13 
 4 files changed, 72 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference14.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference15.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference16.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 1f5ff417c81..77f51bacce3 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14123,7 +14123,10 @@ do_warn_dangling_reference (tree expr, bool arg_p)
   tree e = expr;
   while (handled_component_p (e))
e = TREE_OPERAND (e, 0);
-  if (!reference_like_class_p (TREE_TYPE (e)))
+  tree type = TREE_TYPE (e);
+  /* If the temporary represents a lambda, we don't really know
+what's going on here.  */
+  if (!reference_like_class_p (type) && !LAMBDA_TYPE_P (type))
return expr;
 }
 
@@ -14180,10 +14183,10 @@ do_warn_dangling_reference (tree expr, bool arg_p)
   initializing this reference parameter.  */
if (do_warn_dangling_reference (arg, /*arg_p=*/true))
  return expr;
- /* Don't warn about member function like:
+ /* Don't warn about member functions like:
  std::any a(...);
  S& s = a.emplace({0}, 0);
-which constructs a new object and returns a reference to it, but
+which construct a new object and return a reference to it, but
 we still want to detect:
   struct S { const S& self () { return *this; } };
   const S& s = S().self();
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference14.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference14.C
new file mode 100644
index 000..92b38a965e0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference14.C
@@ -0,0 +1,22 @@
+// PR c++/113256
+// { dg-do compile { target c++14 } }
+// { dg-options "-Wdangling-reference" }
+
+#include 
+#include 
+
+template auto bind(M T::* pm, A)
+{
+return [=]( auto&& x ) -> M const& { return x.*pm; };
+}
+
+template struct arg {};
+
+arg<1> _1;
+
+int main()
+{
+std::pair pair;
+int const& x = bind( &std::pair::first, _1 )( pair ); // { 
dg-bogus "dangling reference" }
+assert( &x == &pair.first );
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference15.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference15.C
new file mode 100644
index 000..c39577db64a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference15.C
@@ -0,0 +1,31 @@
+// PR c++/111607
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wdangling-reference" }
+
+#include 
+
+struct S {
+   constexpr S(int i_) : i(i_) {}
+   S(S const &) = delete;
+   S & operator=(S const &) = delete;
+   S(S &&) = delete;
+   S & operator=(S &&) = delete;
+   int i;
+};
+
+struct A {
+   S s{0};
+};
+
+using V = std::variant;
+
+consteval auto f(V const & v) {
+  auto const & s = std::visit([](auto const & v) -> S const & { return v.s; }, 
v); // { dg-bogus "dangling reference" }
+  return s.i;
+}
+
+int main() {
+   constexpr V a{std::in_place_type};
+   constexpr auto i = f(a);
+   return i;
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference16.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference16.C
new file mode 100644
index 000..91996922291
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference16.C
@@ -0,0 +1,13 @@
+// PR c++/109640
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+
+bool
+fn0 ()
+{
+int a;
+int&& i = [](int& r) -> int&& { return static_cast(r); }(a); // { 
dg-bogus "dangling reference" }
+auto const l = [](int& r) -> int&& { return static_cast(r); };
+int&& j = l(a);
+return &i == &j;
+}

base-commit: 615e25c82de97acc17ab438f88d6788cf7ffe1d6
-- 
2.43.0

Re: [PATCH] libstdc++: suppress -Wdangling-reference with operator| [PR111410]

2024-01-19 Thread Jonathan Wakely

On Sat, 20 Jan 2024, 03:47 Marek Polacek wrote:

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>

OK, thanks.

The standard ranges have their own protection against dangling via the
opt-in borrowed_range concept, and algorithms that don't allow returning
iterators into rvalue ranges, and the automatic use of ref_view or
owning_view as needed. So I think it's reasonable to assume they are less
prone to the bugs this warning detects, at least when used idiomatically.


> -- >8 --
> It seems to me that we should exclude
> std::ranges::views::__adaptor::operator|
> from the -Wdangling-reference warning.  It's commonly used when handling
> ranges.
>
> PR c++/111410
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges: Add #pragma to disable -Wdangling-reference
> with
> std::ranges::views::__adaptor::operator|.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/warn/Wdangling-reference17.C: New test.
> ---
>  gcc/testsuite/g++.dg/warn/Wdangling-reference17.C | 15 +++
>  libstdc++-v3/include/std/ranges   |  3 +++
>  2 files changed, 18 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
>
> diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
> b/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
> new file mode 100644
> index 000..223698422c2
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference17.C
> @@ -0,0 +1,15 @@
> +// PR c++/111410
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-Wdangling-reference" }
> +
> +#include 
> +#include 
> +
> +int main()
> +{
> +  std::vector v{1, 2, 3, 4, 5};
> +  for (auto i : std::span{v} | std::views::take(1))
> +{
> +  (void) i;
> +}
> +}
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 7ef835f486a..f2413badd9c 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -942,6 +942,8 @@ namespace views::__adaptor
>  concept __is_range_adaptor_closure
>= requires (_Tp __t) {
> __adaptor::__is_range_adaptor_closure_fn(__t, __t); };
>
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wdangling-reference"
>// range | adaptor is equivalent to adaptor(range).
>template
>  requires __is_range_adaptor_closure<_Self>
> @@ -961,6 +963,7 @@ namespace views::__adaptor
>return _Pipe,
> decay_t<_Rhs>>{std::forward<_Lhs>(__lhs),
>
>  std::forward<_Rhs>(__rhs)};
>  }
> +#pragma GCC diagnostic pop
>
>// The base class of every range adaptor non-closure.
>//
>
> base-commit: 615e25c82de97acc17ab438f88d6788cf7ffe1d6
> --
> 2.43.0
>
>
