[PATCH] Fix code_helper unused argument warning for fr30

2023-08-17 Thread Jan-Benedict Glaw
Hi!

fr30 is the only target defining GO_IF_LEGITIMATE_ADDRESS right now, in
which case the `code_helper ch` argument to memory_address_addr_space_p()
is unused and emits a new warning.

gcc/ChangeLog:
* recog.cc (memory_address_addr_space_p): Mark possibly unused
argument as unused.

diff --git a/gcc/recog.cc b/gcc/recog.cc
index 2bff6c03e4d..92f151248a6 100644
--- a/gcc/recog.cc
+++ b/gcc/recog.cc
@@ -1803,7 +1803,7 @@ pop_operand (rtx op, machine_mode mode)
 
 bool
 memory_address_addr_space_p (machine_mode mode ATTRIBUTE_UNUSED, rtx addr,
-addr_space_t as, code_helper ch)
+addr_space_t as, code_helper ch ATTRIBUTE_UNUSED)
 {
 #ifdef GO_IF_LEGITIMATE_ADDRESS
   gcc_assert (ADDR_SPACE_GENERIC_P (as));



Ok for trunk?

Thanks,
  Jan-Benedict
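
For readers unfamiliar with the pattern, here is a minimal sketch of why the annotation is needed. It assumes GCC or Clang; `LEGACY_ADDRESS_CHECK` and `address_ok` are hypothetical stand-ins for GO_IF_LEGITIMATE_ADDRESS and memory_address_addr_space_p(), not real GCC names. When only one preprocessor branch reads a parameter, building the other branch with -Wunused-parameter warns unless the parameter is marked unused.

```cpp
// Assumption: same spelling GCC's ansidecl.h uses for GNU-compatible compilers.
#ifndef ATTRIBUTE_UNUSED
#define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#endif

// Hypothetical switch standing in for a target macro such as
// GO_IF_LEGITIMATE_ADDRESS; remove the define to build the other branch.
#define LEGACY_ADDRESS_CHECK 1

// Hypothetical predicate: `helper` is read only in the non-legacy branch,
// so it carries the attribute to stay warning-free in both configurations.
bool
address_ok (int addr, int helper ATTRIBUTE_UNUSED)
{
#ifdef LEGACY_ADDRESS_CHECK
  return addr >= 0;
#else
  return addr >= 0 && helper != 0;
#endif
}
```

The attribute only suppresses the diagnostic; the parameter may still be used in the configurations that need it.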



Re: [PATCH v4 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-17 Thread Chenghui Pan
Hi! I tried to investigate this problem and modified the testcase to
compile and run on aarch64 for reference, but I got some strange results
(the comments show the register values I saw while stepping through with gdb):

typedef double __attribute__((vector_size(16))) v2df;

void use1(double d) {}

__attribute__((noipa)) v2df use(double d)
{
  //reg v8's value: {1, 2}
  register v2df x asm("v8") = {5, 9};
  //reg v8's value: {5, 9}
  __asm__("" : "+w" (x));
  return x;
}

void test(void)
{
  register v2df x asm("v8") = {1, 2};
  __asm__("" : "+w" (x));
  //reg v8's value: {1, 2}
  use(x[0]);
  //reg v8's value: {1, 0}
  use1(x[1]);
}

int main(int argc, char **argv)
{
  test();
  return 0;
}

The compile command is: gcc -march=armv8-a -Og -g 1.c (gcc
8.3.0+binutils 2.31)

Disassembly of test() and use():
00400558 <use>:
  400558:   fc1f0fe8    str d8, [sp, #-16]!
  40055c:   9000        adrp x0, 40 <_init-0x3e0>
  400560:   3dc19c08    ldr q8, [x0, #1648]
  400564:   4ea81d00    mov v0.16b, v8.16b
  400568:   fc4107e8    ldr d8, [sp], #16
  40056c:   d65f03c0    ret

00400570 <test>:
  400570:   a9be7bfd    stp x29, x30, [sp, #-32]!
  400574:   910003fd    mov x29, sp
  400578:   fd000be8    str d8, [sp, #16]
  40057c:   9000        adrp x0, 40 <_init-0x3e0>
  400580:   3dc1a008    ldr q8, [x0, #1664]
  400584:   5e080500    mov d0, v8.d[0]
  400588:   97f4        bl 400558
  40058c:   fd400be8    ldr d8, [sp, #16]
  400590:   a8c27bfd    ldp x29, x30, [sp], #32
  400594:   d65f03c0    ret

As the register values in the comments show, the compiled output on aarch64
also clobbers the high part of the vector register. I searched for
documentation and found this:
https://developer.arm.com/documentation/den0024/a/The-ABI-for-ARM-64-bit-Architecture/Register-use-in-the-AArch64-Procedure-Call-Standard/Parameters-in-NEON-and-floating-point-registers

It seems ARMv8-A only guarantees to preserve the low 64 bits of a
NEON/floating-point register across a call. I'm not sure whether I modified
the testcase in the right way, and this may need more investigation. Any
ideas or suggestions?
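
The behaviour seen in gdb can be reproduced with a toy model that needs no AArch64 hardware. This is only a sketch: `VReg` and `callee_saving_low_half` are hypothetical names, and the 128-bit register is modelled as two double lanes. It relies on one architectural fact: a scalar load into d8 zeroes the upper 64 bits of v8.

```cpp
#include <array>

// Toy model of one 128-bit vector register as two 64-bit double lanes.
using VReg = std::array<double, 2>;

// Hypothetical callee that saves and restores only the low 64 bits (d8),
// as the disassembly of use() does with "str d8" / "ldr d8".
void callee_saving_low_half(VReg &v8)
{
    const double saved_d8 = v8[0];  // models "str d8, [sp, #-16]!"
    v8 = {5.0, 9.0};                // the callee overwrites the full register
    v8 = {saved_d8, 0.0};           // models "ldr d8, [sp], #16": lane 0 is
                                    // restored, the upper lane reads as zero
}
```

Starting from {1, 2}, the caller sees {1, 0} after the call, which matches the value noted in the comment after use(x[0]) in the testcase.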

On Wed, 2023-08-16 at 11:27 +0800, Xi Ruoyao wrote:
> The implementation fails to handle this test case properly:
> 
> typedef double __attribute__((vector_size(32))) v4df;
> 
> void use1(double);
> 
> __attribute__((noipa)) double use(double)
> {
>   register double x asm("f24") = 114.514;
>   __asm__("" : "+f" (x));
>   return x;
> }
> 
> void test(void)
> {
>   register v4df x asm("f24") = {1, 2, 3, 4};
>   __asm__("" : "+f" (x));
>   use(x[1]);
>   use1(x[3]);
> }
> 
> Here use() attempts to save and restore f24, but it uses fst.d/fld.d,
> clobbering the high 192 bits of xr24.  Now test() passes a wrong value
> of x[3] to use1().
> 
> Note that saving and restoring f24 with xvst/xvld in use() won't really
> fix the issue because in real life use() can be in another translation
> unit (or even a shared library) compiled with -mno-lsx.  So it seems we
> need to tell the compiler "a function call may clobber the high bits of
> a vector register even if the corresponding floating-point register is
> saved".  I'm not sure how to accomplish this...
> 
> On Tue, 2023-08-15 at 09:05 +0800, Chenghui Pan wrote:
> > This is an update of:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626194.html
> > 
> > This version of the patch set only introduces some small simplifications
> > of the implementation. Because I missed the mail size limit, the
> > huge testsuite patches of v2 and v3 were not shown on the mailing list.
> > So the testsuite patches are split from this patch set again and will be
> > submitted independently in the future.
> > 
> > Binutils-gdb has supported LSX/LASX since the 2.41 release:
> > https://lists.gnu.org/archive/html/info-gnu/2023-07/msg9.html
> > 
> > Brief history of patch set version:
> > v1 -> v2:
> > - Reduce usage of "unspec" in RTL template.
> > - Append support of ADDR_REG_REG in LSX and LASX.
> > - Constraint docs are appended in gcc/doc/md.texi and the comment block.
> > - Code related to vecarg is removed.
> > - Testsuite of LSX and LASX is added in v2. (Because of the size
> >   limitation of the mailing list, these patches are not shown.)
> > - Adjust the loongarch_expand_vector_init() function to reduce the
> >   number of instructions output.
> > - Some minor implementation changes of RTL templates.
> > 
> > v2 -> v3:
> > - Revert vabsd/xvabsd RTL templates to unspec impl.

[PATCH v1] RISC-V: Support RVV VFWREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch adds support for the rounding mode API for
VFWREDOSUM.VS, as in the samples below.

* __riscv_vfwredosum_vs_f32m1_f64m1_rm
* __riscv_vfwredosum_vs_f32m1_f64m1_rm_m
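
The test in this patch passes the literals 0 and 1 as the extra rounding-mode argument of the _rm intrinsics. Assuming that argument uses the standard RISC-V frm encoding, the sketch below maps those values to their conventional names. `Frm` and `frm_name` are hypothetical helpers for illustration and are not part of riscv_vector.h.

```cpp
#include <string>

// Static rounding-mode encodings of the RISC-V frm field.
enum class Frm : unsigned { RNE = 0, RTZ = 1, RDN = 2, RUP = 3, RMM = 4 };

// Hypothetical helper: describe a raw frm value as passed to an _rm intrinsic.
std::string frm_name(unsigned value)
{
    switch (static_cast<Frm>(value)) {
    case Frm::RNE: return "RNE: round to nearest, ties to even";
    case Frm::RTZ: return "RTZ: round towards zero";
    case Frm::RDN: return "RDN: round down";
    case Frm::RUP: return "RUP: round up";
    case Frm::RMM: return "RMM: round to nearest, ties to max magnitude";
    }
    return "not a static rounding mode";
}
```

Under that assumption, the first test function requests RNE and the masked one requests RTZ.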

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(widen_freducop): Add frm_opt_type template arg.
(vfwredosum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wredosum.c: New test.
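
The ChangeLog entry for widen_freducop describes adding a template argument that decides whether a builtin takes a rounding-mode operand. A minimal, self-contained sketch of that pattern follows. Only FRM_OP and HAS_FRM appear in the diff; the enum name, NO_FRM, UNSPEC and the `_sketch` types are assumptions made for illustration.

```cpp
// Assumed enum: HAS_FRM is taken from the diff, the rest is hypothetical.
enum frm_op_type { NO_FRM, HAS_FRM };

// Stand-in for the real function_base, reduced to the one hook of interest.
struct function_base_sketch
{
  virtual ~function_base_sketch () = default;
  virtual bool has_rounding_mode_operand_p () const { return false; }
};

// The flag is a template argument with a default, so existing
// instantiations keep their behaviour and only the *_frm objects opt in.
template<int UNSPEC, frm_op_type FRM_OP = NO_FRM>
struct widen_freducop_sketch : function_base_sketch
{
  bool has_rounding_mode_operand_p () const override
  {
    return FRM_OP == HAS_FRM;
  }

  int unspec () const { return UNSPEC; }
};
```

Because the comparison involves only compile-time constants, each instantiation's hook returns a fixed value and no per-object state is needed.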
---
 .../riscv/riscv-vector-builtins-bases.cc  |  9 -
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 ++
 .../riscv/rvv/base/float-point-wredosum.c | 33 +++
 4 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index ef2991359da..abf03bab0da 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1866,10 +1866,15 @@ public:
 };
 
 /* Implements widening floating-point reduction instructions.  */
-template
+template
 class widen_freducop : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   bool apply_mask_policy_p () const override { return false; }
 
   rtx expand (function_expander &e) const override
@@ -2544,6 +2549,7 @@ static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
 static CONSTEXPR const widen_freducop vfwredusum_obj;
 static CONSTEXPR const widen_freducop vfwredosum_obj;
+static CONSTEXPR const widen_freducop 
vfwredosum_frm_obj;
 static CONSTEXPR const vmv vmv_x_obj;
 static CONSTEXPR const vmv_s vmv_s_obj;
 static CONSTEXPR const vmv vfmv_f_obj;
@@ -2802,6 +2808,7 @@ BASE (vfredosum_frm)
 BASE (vfredmax)
 BASE (vfredmin)
 BASE (vfwredosum)
+BASE (vfwredosum_frm)
 BASE (vfwredusum)
 BASE (vmv_x)
 BASE (vmv_s)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index da8412b66df..c1bb164a712 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -245,6 +245,7 @@ extern const function_base *const vfredosum_frm;
 extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
 extern const function_base *const vfwredosum;
+extern const function_base *const vfwredosum_frm;
 extern const function_base *const vfwredusum;
 extern const function_base *const vmv_x;
 extern const function_base *const vmv_s;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 80e65bfb14b..da1157f5a56 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -507,6 +507,8 @@ DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, 
no_mu_preds, f_vs_ops)
 DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
 DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
 
+DEF_RVV_FUNCTION (vfwredosum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
+
 /* 15. Vector Mask Instructions.  */
 
 // 15.1. Vector Mask-Register Logical Instructions
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c
new file mode 100644
index 000..acf79569a22
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m1_t
+test_riscv_vfwredosum_vs_f32m1_f64m1_rm (vfloat32m1_t op1, vfloat64m1_t op2,
+size_t vl) {
+  return __riscv_vfwredosum_vs_f32m1_f64m1_rm (op1, op2, 0, vl);
+}
+
+vfloat64m1_t
+test_vfwredosum_vs_f32m1_f64m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
+vfloat64m1_t op2, size_t vl) {
+  return __riscv_vfwredosum_vs_f32m1_f64m1_rm_m (mask, op1, op2, 1, vl);
+}
+
+vfloat64m1_t
+test_riscv_vfwredosum_vs_f32m1_f64m1 (vfloat32m1_t op1, vfloat64m1_t op2,
+ size_t vl) {
+  return __riscv_vfwredosum_vs_f32m1_f64m1 (op1, op2, vl);
+}
+
+vfloat64m1_t
+test_vfwredosum_vs_f32m1_f64m1_m (vbool32_t mask, vfloat32m1_t op1,
+ vfloat64m1_t op2, size_t vl) {
+  return __riscv_vfwredosum_vs_f32m1_f64m1_m (mask, op1, op2, vl);
+}
+
+/* { dg-final { scan-assembler-times {vfwredosum\.vs\s+v[0-9]+,\s*v[0-9]+} 4 } 
} */
+/* { dg-final { sc

Re: [PATCH v1] RISC-V: Support RVV VFWREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Kito Cheng via Gcc-patches
ok

On Thu, Aug 17, 2023 at 3:26 PM Pan Li via Gcc-patches wrote:
>
> From: Pan Li 
>
> This patch adds support for the rounding mode API for
> VFWREDOSUM.VS, as in the samples below.
>
> * __riscv_vfwredosum_vs_f32m1_f64m1_rm
> * __riscv_vfwredosum_vs_f32m1_f64m1_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (widen_freducop): Add frm_opt_type template arg.
> (vfwredosum_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfwredosum_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-wredosum.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  9 -
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  2 ++
>  .../riscv/rvv/base/float-point-wredosum.c | 33 +++
>  4 files changed, 44 insertions(+), 1 deletion(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index ef2991359da..abf03bab0da 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -1866,10 +1866,15 @@ public:
>  };
>
>  /* Implements widening floating-point reduction instructions.  */
> -template
> +template
>  class widen_freducop : public function_base
>  {
>  public:
> +  bool has_rounding_mode_operand_p () const override
> +  {
> +return FRM_OP == HAS_FRM;
> +  }
> +
>bool apply_mask_policy_p () const override { return false; }
>
>rtx expand (function_expander &e) const override
> @@ -2544,6 +2549,7 @@ static CONSTEXPR const reducop vfredmax_obj;
>  static CONSTEXPR const reducop vfredmin_obj;
>  static CONSTEXPR const widen_freducop vfwredusum_obj;
>  static CONSTEXPR const widen_freducop vfwredosum_obj;
> +static CONSTEXPR const widen_freducop 
> vfwredosum_frm_obj;
>  static CONSTEXPR const vmv vmv_x_obj;
>  static CONSTEXPR const vmv_s vmv_s_obj;
>  static CONSTEXPR const vmv vfmv_f_obj;
> @@ -2802,6 +2808,7 @@ BASE (vfredosum_frm)
>  BASE (vfredmax)
>  BASE (vfredmin)
>  BASE (vfwredosum)
> +BASE (vfwredosum_frm)
>  BASE (vfwredusum)
>  BASE (vmv_x)
>  BASE (vmv_s)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index da8412b66df..c1bb164a712 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -245,6 +245,7 @@ extern const function_base *const vfredosum_frm;
>  extern const function_base *const vfredmax;
>  extern const function_base *const vfredmin;
>  extern const function_base *const vfwredosum;
> +extern const function_base *const vfwredosum_frm;
>  extern const function_base *const vfwredusum;
>  extern const function_base *const vmv_x;
>  extern const function_base *const vmv_s;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 80e65bfb14b..da1157f5a56 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -507,6 +507,8 @@ DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, 
> no_mu_preds, f_vs_ops)
>  DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
>  DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
>
> +DEF_RVV_FUNCTION (vfwredosum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
> +
>  /* 15. Vector Mask Instructions.  */
>
>  // 15.1. Vector Mask-Register Logical Instructions
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c
> new file mode 100644
> index 000..acf79569a22
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredosum.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat64m1_t
> +test_riscv_vfwredosum_vs_f32m1_f64m1_rm (vfloat32m1_t op1, vfloat64m1_t op2,
> +size_t vl) {
> +  return __riscv_vfwredosum_vs_f32m1_f64m1_rm (op1, op2, 0, vl);
> +}
> +
> +vfloat64m1_t
> +test_vfwredosum_vs_f32m1_f64m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
> +vfloat64m1_t op2, size_t vl) {
> +  return __riscv_vfwredosum_vs_f32m1_f64m1_rm_m (mask, op1, op2, 1, vl);
> +}
> +
> +vfloat64m1_t
> +test_riscv_vfwredosum_vs_f32m1_f64m1 (vfloat32m1_t op1, vfloat64m1_t op2,
> + size_t vl) {
> +  return __riscv_vfwredosum_vs_f32m1_f64m1 (op1, op2, vl);
> +}
> +
> +vfloat64m1_t

Re: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Kito Cheng via Gcc-patches
lgtm

On Thu, Aug 17, 2023 at 2:23 PM Pan Li via Gcc-patches wrote:
>
> From: Pan Li 
>
> This patch adds support for the rounding mode API for
> VFREDOSUM.VS, as in the samples below.
>
> * __riscv_vfredosum_vs_f32m1_f32m1_rm
> * __riscv_vfredosum_vs_f32m1_f32m1_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfredosum_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfredosum_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-redosum.c  | 33 +++
>  4 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 65f1d9c8ff7..ef2991359da 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2539,6 +2539,7 @@ static CONSTEXPR const 
> widen_reducop vwredsumu_obj;
>  static CONSTEXPR const freducop vfredusum_obj;
>  static CONSTEXPR const freducop vfredusum_frm_obj;
>  static CONSTEXPR const freducop vfredosum_obj;
> +static CONSTEXPR const freducop vfredosum_frm_obj;
>  static CONSTEXPR const reducop vfredmax_obj;
>  static CONSTEXPR const reducop vfredmin_obj;
>  static CONSTEXPR const widen_freducop vfwredusum_obj;
> @@ -2797,6 +2798,7 @@ BASE (vwredsumu)
>  BASE (vfredusum)
>  BASE (vfredusum_frm)
>  BASE (vfredosum)
> +BASE (vfredosum_frm)
>  BASE (vfredmax)
>  BASE (vfredmin)
>  BASE (vfwredosum)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index fd1a84f3e68..da8412b66df 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -241,6 +241,7 @@ extern const function_base *const vwredsumu;
>  extern const function_base *const vfredusum;
>  extern const function_base *const vfredusum_frm;
>  extern const function_base *const vfredosum;
> +extern const function_base *const vfredosum_frm;
>  extern const function_base *const vfredmax;
>  extern const function_base *const vfredmin;
>  extern const function_base *const vfwredosum;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 90a83c02d52..80e65bfb14b 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -501,6 +501,7 @@ DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, 
> f_vs_ops)
>  DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
>
>  DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
> +DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
>
>  // 14.4. Vector Widening Floating-Point Reduction Instructions
>  DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> new file mode 100644
> index 000..2e6a3c28a89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1_rm (vfloat32m1_t op1, vfloat32m1_t op2,
> +   size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm (op1, op2, 0, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
> +   vfloat32m1_t op2, size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm_m (mask, op1, op2, 1, vl);
> +}
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1 (vfloat32m1_t op1, vfloat32m1_t op2,
> +size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1 (op1, op2, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_m (vbool32_t mask, vfloat32m1_t op1,
> +vfloat32m1_t op2, size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_m (mask, op1, op2, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
> +/* {

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, August 17, 2023 3:30 PM
To: Li, Pan2 
Cc: juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode 
intrinsic API

Yeah, I missed that, LGTM :P

On Thu, Aug 17, 2023 at 2:28 PM Li, Pan2  wrote:
>
> Hi Kito,
>
> In case you missed this one: it is the precondition for committing the rest
> of the rounding mode API patches.
> Thanks in advance; we are close to completing all of the rounding mode APIs 😉.
>
> Pan
>
> -----Original Message-----
> From: Li, Pan2 
> Sent: Wednesday, August 16, 2023 8:54 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang 
> ; kito.ch...@gmail.com
> Subject: [PATCH v1] RISC-V: Support RVV VFNCVT.X.F.W rounding mode intrinsic 
> API
>
> From: Pan Li 
>
> This patch adds support for the rounding mode API for
> VFNCVT.X.F.W, as in the samples below.
>
> * __riscv_vfncvt_x_f_w_i16mf2_rm
> * __riscv_vfncvt_x_f_w_i16mf2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (class vfncvt_x): Add frm_op_type template arg.
> (BASE): New declaration.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfncvt_x_frm): New intrinsic function def.
> * config/riscv/riscv-vector-builtins-shapes.cc
> (struct narrow_alu_frm_def): New shape function for frm.
> (SHAPE): New declaration.
> * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-ncvt-x.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  9 -
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  2 +
>  .../riscv/riscv-vector-builtins-shapes.cc | 39 +++
>  .../riscv/riscv-vector-builtins-shapes.h  |  1 +
>  .../riscv/rvv/base/float-point-ncvt-x.c   | 29 ++
>  6 files changed, 80 insertions(+), 1 deletion(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-x.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 050ecbe780c..2f40eeaeda5 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -1759,10 +1759,15 @@ public:
>  };
>
>  /* Implements vfncvt.x.  */
> -template
> +template
>  class vfncvt_x : public function_base
>  {
>  public:
> +  bool has_rounding_mode_operand_p () const override
> +  {
> +return FRM_OP == HAS_FRM;
> +  }
> +
>rtx expand (function_expander &e) const override
>{
>  return e.use_exact_insn (
> @@ -2502,6 +2507,7 @@ static CONSTEXPR const vfwcvt_rtz_x 
> vfwcvt_rtz_x_obj;
>  static CONSTEXPR const vfwcvt_rtz_x vfwcvt_rtz_xu_obj;
>  static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_obj;
> +static CONSTEXPR const vfncvt_x vfncvt_x_frm_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_xu_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
> @@ -2756,6 +2762,7 @@ BASE (vfwcvt_rtz_x)
>  BASE (vfwcvt_rtz_xu)
>  BASE (vfwcvt_f)
>  BASE (vfncvt_x)
> +BASE (vfncvt_x_frm)
>  BASE (vfncvt_xu)
>  BASE (vfncvt_rtz_x)
>  BASE (vfncvt_rtz_xu)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index 6565740c597..edff0de2715 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -220,6 +220,7 @@ extern const function_base *const vfwcvt_rtz_x;
>  extern const function_base *const vfwcvt_rtz_xu;
>  extern const function_base *const vfwcvt_f;
>  extern const function_base *const vfncvt_x;
> +extern const function_base *const vfncvt_x_frm;
>  extern const function_base *const vfncvt_xu;
>  extern const function_base *const vfncvt_rtz_x;
>  extern const function_base *const vfncvt_rtz_xu;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 22c039c8cbb..5e37bae318a 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -472,6 +472,8 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, 
> u_to_nf_xu_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>
> +DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
> +
>  /* 14. Vector Reduction Operations.  */
>
>  // 14.1. Vector Single-Width Integer Reduction Instructions
> diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
> b/gcc/config/

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 17, 2023 10:18 AM
To: Kito Cheng 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: RE: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode 
intrinsic API

Thanks Kito, I will commit it after the VFNCVT.X.F.W one, i.e. the signed
integer conversion.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, August 17, 2023 9:30 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.XU.F.W rounding mode 
intrinsic API

LGTM

On Thu, Aug 17, 2023 at 9:23 AM Pan Li via Gcc-patches wrote:
>
> From: Pan Li 
>
> This patch adds support for the rounding mode API for
> VFNCVT.XU.F.W, as in the samples below.
>
> * __riscv_vfncvt_xu_f_w_u16mf2_rm
> * __riscv_vfncvt_xu_f_w_u16mf2_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfncvt_xu_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfncvt_xu_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-ncvt-xu.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-ncvt-xu.c  | 29 +++
>  4 files changed, 33 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 2f40eeaeda5..acadec2afca 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2509,6 +2509,7 @@ static CONSTEXPR const vfwcvt_f vfwcvt_f_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_x_frm_obj;
>  static CONSTEXPR const vfncvt_x vfncvt_xu_obj;
> +static CONSTEXPR const vfncvt_x 
> vfncvt_xu_frm_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
>  static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
>  static CONSTEXPR const vfncvt_f vfncvt_f_obj;
> @@ -2764,6 +2765,7 @@ BASE (vfwcvt_f)
>  BASE (vfncvt_x)
>  BASE (vfncvt_x_frm)
>  BASE (vfncvt_xu)
> +BASE (vfncvt_xu_frm)
>  BASE (vfncvt_rtz_x)
>  BASE (vfncvt_rtz_xu)
>  BASE (vfncvt_f)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index edff0de2715..9bd09a41960 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -222,6 +222,7 @@ extern const function_base *const vfwcvt_f;
>  extern const function_base *const vfncvt_x;
>  extern const function_base *const vfncvt_x_frm;
>  extern const function_base *const vfncvt_xu;
> +extern const function_base *const vfncvt_xu_frm;
>  extern const function_base *const vfncvt_rtz_x;
>  extern const function_base *const vfncvt_rtz_xu;
>  extern const function_base *const vfncvt_f;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 5e37bae318a..1e0e989fc2a 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -473,6 +473,7 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, 
> f_to_nf_f_w_ops)
>  DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
>
>  DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
> +DEF_RVV_FUNCTION (vfncvt_xu_frm, narrow_alu_frm, full_preds, f_to_nu_f_w_ops)
>
>  /* 14. Vector Reduction Operations.  */
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> new file mode 100644
> index 000..82c3e1364bf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-xu.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint16mf2_t
> +test_riscv_vfncvt_xu_f_w_u16mf2_rm (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_rm (op1, 0, vl);
> +}
> +
> +vuint16mf2_t
> +test_vfncvt_xu_f_w_u16mf2_rm_m (vbool32_t mask, vfloat32m1_t op1, size_t vl) 
> {
> +  return __riscv_vfncvt_xu_f_w_u16mf2_rm_m (mask, op1, 1, vl);
> +}
> +
> +vuint16mf2_t
> +test_riscv_vfncvt_xu_f_w_u16mf2 (vfloat32m1_t op1, size_t vl) {
> +  return __riscv_vfncvt_xu_f_w_u16mf2 (op1, vl);
> +}
> +
> +vuint16mf2_t
> +test_vfncvt_xu_f_w_u16mf2_m (vbool32_t mask, v

RE: [PATCH v1] RISC-V: Support RVV VFNCVT.F.{X|XU|F}.W rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 17, 2023 11:32 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFNCVT.F.{X|XU|F}.W rounding mode 
intrinsic API

Lgtm

On Thu, Aug 17, 2023 at 10:19, Pan Li via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
From: Pan Li <pan2...@intel.com>

This patch adds support for the rounding mode API for
VFNCVT.F.{X|XU|F}.W, as in the samples below.

* __riscv_vfncvt_f_x_w_f32m1_rm
* __riscv_vfncvt_f_x_w_f32m1_rm_m
* __riscv_vfncvt_f_xu_w_f32m1_rm
* __riscv_vfncvt_f_xu_w_f32m1_rm_m
* __riscv_vfncvt_f_f_w_f32m1_rm
* __riscv_vfncvt_f_f_w_f32m1_rm_m
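
To see why an explicit rounding mode matters for a narrowing conversion, here is a scalar model of two of the modes for a double to float narrowing. It is a sketch under assumptions, not the vector instruction: `narrow_rne` and `narrow_rtz` are hypothetical names, and the code assumes the host runs in its default round-to-nearest floating-point environment.

```cpp
#include <cmath>

// Round to nearest, ties to even: the host's default conversion behaviour.
float narrow_rne(double d)
{
    return static_cast<float>(d);
}

// Round towards zero: start from the nearest float and, if rounding moved
// the value away from zero, step one float back towards zero.
float narrow_rtz(double d)
{
    float f = static_cast<float>(d);
    if (std::fabs(static_cast<double>(f)) > std::fabs(d))
        f = std::nextafter(f, 0.0f);
    return f;
}
```

For an input just above the midpoint between 1.0f and the next float, the two functions return different neighbours, which is the kind of difference the _rm intrinsics let the caller select per call.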

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfncvt_f): Add frm_op_type template arg.
(vfncvt_f_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfncvt_f_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-ncvt-f.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 10 ++-
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  3 +
 .../riscv/rvv/base/float-point-ncvt-f.c   | 69 +++
 4 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index acadec2afca..ad04647f9ba 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1786,9 +1786,15 @@ public:
   }
 };

+template
 class vfncvt_f : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   rtx expand (function_expander &e) const override
   {
 if (e.op_info->op == OP_TYPE_f_w)
@@ -2512,7 +2518,8 @@ static CONSTEXPR const vfncvt_x 
vfncvt_xu_obj;
 static CONSTEXPR const vfncvt_x 
vfncvt_xu_frm_obj;
 static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj;
 static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj;
-static CONSTEXPR const vfncvt_f vfncvt_f_obj;
+static CONSTEXPR const vfncvt_f vfncvt_f_obj;
+static CONSTEXPR const vfncvt_f vfncvt_f_frm_obj;
 static CONSTEXPR const vfncvt_rod_f vfncvt_rod_f_obj;
 static CONSTEXPR const reducop vredsum_obj;
 static CONSTEXPR const reducop vredmaxu_obj;
@@ -2769,6 +2776,7 @@ BASE (vfncvt_xu_frm)
 BASE (vfncvt_rtz_x)
 BASE (vfncvt_rtz_xu)
 BASE (vfncvt_f)
+BASE (vfncvt_f_frm)
 BASE (vfncvt_rod_f)
 BASE (vredsum)
 BASE (vredmaxu)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 9bd09a41960..c8c649c4bb0 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -226,6 +226,7 @@ extern const function_base *const vfncvt_xu_frm;
 extern const function_base *const vfncvt_rtz_x;
 extern const function_base *const vfncvt_rtz_xu;
 extern const function_base *const vfncvt_f;
+extern const function_base *const vfncvt_f_frm;
 extern const function_base *const vfncvt_rod_f;
 extern const function_base *const vredsum;
 extern const function_base *const vredmaxu;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 1e0e989fc2a..cfbc125dcd8 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -474,6 +474,9 @@ DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, 
f_to_nf_f_w_ops)

 DEF_RVV_FUNCTION (vfncvt_x_frm, narrow_alu_frm, full_preds, f_to_ni_f_w_ops)
 DEF_RVV_FUNCTION (vfncvt_xu_frm, narrow_alu_frm, full_preds, f_to_nu_f_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, i_to_nf_x_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, u_to_nf_xu_w_ops)
+DEF_RVV_FUNCTION (vfncvt_f_frm, narrow_alu_frm, full_preds, f_to_nf_f_w_ops)

 /* 14. Vector Reduction Operations.  */

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c
new file mode 100644
index 000..d6d4be5e98e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-ncvt-f.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t
+test_riscv_vfncvt_f_x_w_f32m1_rm (vint64m2_t op1, size_t vl) {
+  return __riscv_vfncvt_f_x_w_f32m1_rm (op1, 0, vl);
+}
+
+vfloat32m1_t
+test_vfncvt_f_x_w_f32m1_rm_m (vbool32_t mask, vint64m2_t op1, size_t vl) {
+  return __riscv_vfncvt_f_x_w_f32m1_rm_m (mask, op1, 1, vl);

RE: [PATCH v1] RISC-V: Support RVV VFREDUSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 17, 2023 11:33 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFREDUSUM.VS rounding mode 
intrinsic API

Lgtm

On Thursday, August 17, 2023 at 11:09, Pan Li via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
From: Pan Li <pan2...@intel.com>

This patch adds rounding mode API support for
VFREDUSUM.VS, as in the samples below.

* __riscv_vfredusum_vs_f32m1_f32m1_rm
* __riscv_vfredusum_vs_f32m1_f32m1_rm_m

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class freducop): Add frm_op_type template arg.
(vfredusum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfredusum_frm): New intrinsic function def.
* config/riscv/riscv-vector-builtins-shapes.cc
(struct reduc_alu_frm_def): New class for frm shape.
(SHAPE): New declaration.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-redusum.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  9 -
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 .../riscv/riscv-vector-builtins-shapes.cc | 39 +++
 .../riscv/riscv-vector-builtins-shapes.h  |  1 +
 .../riscv/rvv/base/float-point-redusum.c  | 33 
 6 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redusum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index ad04647f9ba..65f1d9c8ff7 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1847,10 +1847,15 @@ public:
 };

 /* Implements floating-point reduction instructions.  */
-template
+template
 class freducop : public function_base
 {
 public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
   bool apply_mask_policy_p () const override { return false; }

   rtx expand (function_expander &e) const override
@@ -2532,6 +2537,7 @@ static CONSTEXPR const reducop vredxor_obj;
 static CONSTEXPR const widen_reducop vwredsum_obj;
 static CONSTEXPR const widen_reducop vwredsumu_obj;
 static CONSTEXPR const freducop vfredusum_obj;
+static CONSTEXPR const freducop vfredusum_frm_obj;
 static CONSTEXPR const freducop vfredosum_obj;
 static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
@@ -2789,6 +2795,7 @@ BASE (vredxor)
 BASE (vwredsum)
 BASE (vwredsumu)
 BASE (vfredusum)
+BASE (vfredusum_frm)
 BASE (vfredosum)
 BASE (vfredmax)
 BASE (vfredmin)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index c8c649c4bb0..fd1a84f3e68 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -239,6 +239,7 @@ extern const function_base *const vredxor;
 extern const function_base *const vwredsum;
 extern const function_base *const vwredsumu;
 extern const function_base *const vfredusum;
+extern const function_base *const vfredusum_frm;
 extern const function_base *const vfredosum;
 extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index cfbc125dcd8..90a83c02d52 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -500,6 +500,8 @@ DEF_RVV_FUNCTION (vfredosum, reduc_alu, no_mu_preds, 
f_vs_ops)
 DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, f_vs_ops)
 DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)

+DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
+
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
 DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 80329113af3..f8fdec863e6 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -371,6 +371,44 @@ struct narrow_alu_frm_def : public build_frm_base
   }
 };

+/* reduc_alu_frm_def class.  */
+struct reduc_alu_frm_def : public build_frm_base
+{
+  char *get_name (function_builder &b, const function_instance &instance,
+ bool overloaded_p) const override
+  {
+char base_name[BASE_NAME_MAX_LE

RE: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

-Original Message-
From: Li, Pan2  
Sent: Thursday, August 17, 2023 2:23 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

From: Pan Li 

This patch adds rounding mode API support for
VFREDOSUM.VS, as in the samples below.

* __riscv_vfredosum_vs_f32m1_f32m1_rm
* __riscv_vfredosum_vs_f32m1_f32m1_rm_m

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(vfredosum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  1 +
 .../riscv/rvv/base/float-point-redosum.c  | 33 +++
 4 files changed, 37 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 65f1d9c8ff7..ef2991359da 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2539,6 +2539,7 @@ static CONSTEXPR const widen_reducop 
vwredsumu_obj;
 static CONSTEXPR const freducop vfredusum_obj;
 static CONSTEXPR const freducop vfredusum_frm_obj;
 static CONSTEXPR const freducop vfredosum_obj;
+static CONSTEXPR const freducop vfredosum_frm_obj;
 static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
 static CONSTEXPR const widen_freducop vfwredusum_obj;
@@ -2797,6 +2798,7 @@ BASE (vwredsumu)
 BASE (vfredusum)
 BASE (vfredusum_frm)
 BASE (vfredosum)
+BASE (vfredosum_frm)
 BASE (vfredmax)
 BASE (vfredmin)
 BASE (vfwredosum)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index fd1a84f3e68..da8412b66df 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -241,6 +241,7 @@ extern const function_base *const vwredsumu;
 extern const function_base *const vfredusum;
 extern const function_base *const vfredusum_frm;
 extern const function_base *const vfredosum;
+extern const function_base *const vfredosum_frm;
 extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
 extern const function_base *const vfwredosum;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 90a83c02d52..80e65bfb14b 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -501,6 +501,7 @@ DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, 
f_vs_ops)
 DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
 
 DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
+DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
 
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
new file mode 100644
index 000..2e6a3c28a89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t
+test_riscv_vfredosum_vs_f32m1_f32m1_rm (vfloat32m1_t op1, vfloat32m1_t op2,
+   size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_rm (op1, op2, 0, vl);
+}
+
+vfloat32m1_t
+test_vfredosum_vs_f32m1_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
+   vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_rm_m (mask, op1, op2, 1, vl);
+}
+
+vfloat32m1_t
+test_riscv_vfredosum_vs_f32m1_f32m1 (vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1 (op1, op2, vl);
+}
+
+vfloat32m1_t
+test_vfredosum_vs_f32m1_f32m1_m (vbool32_t mask, vfloat32m1_t op1,
+vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfredosum_vs_f32m1_f32m1_m (mask, op1, op2, vl);
+}
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
+/* { 

RE: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic API

2023-08-17 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 17, 2023 3:30 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support RVV VFREDOSUM.VS rounding mode 
intrinsic API

lgtm

On Thu, Aug 17, 2023 at 2:23 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch adds rounding mode API support for
> VFREDOSUM.VS, as in the samples below.
>
> * __riscv_vfredosum_vs_f32m1_f32m1_rm
> * __riscv_vfredosum_vs_f32m1_f32m1_rm_m
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc
> (vfredosum_frm_obj): New declaration.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def
> (vfredosum_frm): New intrinsic function def.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
>  .../riscv/riscv-vector-builtins-bases.h   |  1 +
>  .../riscv/riscv-vector-builtins-functions.def |  1 +
>  .../riscv/rvv/base/float-point-redosum.c  | 33 +++
>  4 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index 65f1d9c8ff7..ef2991359da 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -2539,6 +2539,7 @@ static CONSTEXPR const 
> widen_reducop vwredsumu_obj;
>  static CONSTEXPR const freducop vfredusum_obj;
>  static CONSTEXPR const freducop vfredusum_frm_obj;
>  static CONSTEXPR const freducop vfredosum_obj;
> +static CONSTEXPR const freducop vfredosum_frm_obj;
>  static CONSTEXPR const reducop vfredmax_obj;
>  static CONSTEXPR const reducop vfredmin_obj;
>  static CONSTEXPR const widen_freducop vfwredusum_obj;
> @@ -2797,6 +2798,7 @@ BASE (vwredsumu)
>  BASE (vfredusum)
>  BASE (vfredusum_frm)
>  BASE (vfredosum)
> +BASE (vfredosum_frm)
>  BASE (vfredmax)
>  BASE (vfredmin)
>  BASE (vfwredosum)
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
> b/gcc/config/riscv/riscv-vector-builtins-bases.h
> index fd1a84f3e68..da8412b66df 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.h
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
> @@ -241,6 +241,7 @@ extern const function_base *const vwredsumu;
>  extern const function_base *const vfredusum;
>  extern const function_base *const vfredusum_frm;
>  extern const function_base *const vfredosum;
> +extern const function_base *const vfredosum_frm;
>  extern const function_base *const vfredmax;
>  extern const function_base *const vfredmin;
>  extern const function_base *const vfwredosum;
> diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
> b/gcc/config/riscv/riscv-vector-builtins-functions.def
> index 90a83c02d52..80e65bfb14b 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-functions.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
> @@ -501,6 +501,7 @@ DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, 
> f_vs_ops)
>  DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
>
>  DEF_RVV_FUNCTION (vfredusum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
> +DEF_RVV_FUNCTION (vfredosum_frm, reduc_alu_frm, no_mu_preds, f_vs_ops)
>
>  // 14.4. Vector Widening Floating-Point Reduction Instructions
>  DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> new file mode 100644
> index 000..2e6a3c28a89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-redosum.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1_rm (vfloat32m1_t op1, vfloat32m1_t op2,
> +   size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm (op1, op2, 0, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
> +   vfloat32m1_t op2, size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1_rm_m (mask, op1, op2, 1, vl);
> +}
> +
> +vfloat32m1_t
> +test_riscv_vfredosum_vs_f32m1_f32m1 (vfloat32m1_t op1, vfloat32m1_t op2,
> +size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f32m1 (op1, op2, vl);
> +}
> +
> +vfloat32m1_t
> +test_vfredosum_vs_f32m1_f32m1_m (vbool32_t mask, vfloat32m1_t op1,
> +vfloat32m1_t op2, size_t vl) {
> +  return __riscv_vfredosum_vs_f32m1_f

Re: [PATCH] libstdc++: fix memory clobbering in std::vector [PR110879]

2023-08-17 Thread Vladimir Palevich via Gcc-patches
On Thu, 17 Aug 2023 at 01:51, Jonathan Wakely  wrote:
>
> On 09/08/23 01:34 +0300, Vladimir Palevich wrote:
> >Because of the recent change in _M_realloc_insert and _M_default_append, the
> >call to deallocate was ordered after the assignment to class members of
> >std::vector (in the guard destructor), which causes said members to be
> >call-clobbered. This prevents further optimization: the compiler is unable
> >to move a memory read out of a hot loop in this case.
> >This patch reorders the call to before the assignments by putting the guard
> >in its own block, and adds a new test for this case.
> >I'm not very happy with the new test, but I don't know how to properly
> >test this.
> >
> >Tested on x86_64-pc-linux-gnu.
> >
> >Maybe something could be done so that the compiler would be able to optimize
> >such cases anyway. Reads could be moved just after the clobbering calls in
> >unlikely branches, for example. This should be a fairly common case with
> >destructors at the end of a function.
> >
> >Note: I don't have write access.
> >
> >-- >8 --
> >
> >Fix ordering to prevent clobbering of class members by a call to deallocate
> >in _M_realloc_insert and _M_default_append.
> >
> >libstdc++-v3/ChangeLog:
> >PR libstdc++/110879
> >* include/bits/vector.tcc: End guard lifetime just before assignment to
> >class members.
> >* testsuite/libstdc++-dg/conformance.exp: Load scantree.exp.
> >* testsuite/23_containers/vector/110879.cc: New test.
> >
> >Signed-off-by: Vladimir Palevich  
> >---
> > libstdc++-v3/include/bits/vector.tcc  | 220 +-
> > .../testsuite/23_containers/vector/110879.cc  |  35 +++
> > .../testsuite/libstdc++-dg/conformance.exp|  13 ++
> > 3 files changed, 163 insertions(+), 105 deletions(-)
> > create mode 100644 libstdc++-v3/testsuite/23_containers/vector/110879.cc
> >
> >diff --git a/libstdc++-v3/include/bits/vector.tcc 
> >b/libstdc++-v3/include/bits/vector.tcc
> >index ada396c9b30..80631d1e2a1 100644
> >--- a/libstdc++-v3/include/bits/vector.tcc
> >+++ b/libstdc++-v3/include/bits/vector.tcc
> >@@ -488,78 +488,83 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >   private:
> >   _Guard(const _Guard&);
> >   };
> >-  _Guard __guard(__new_start, __len, _M_impl);
> >
> >-  // The order of the three operations is dictated by the C++11
> >-  // case, where the moves could alter a new element belonging
> >-  // to the existing vector.  This is an issue only for callers
> >-  // taking the element by lvalue ref (see last bullet of C++11
> >-  // [res.on.arguments]).
> >+  {
> >+  _Guard __guard(__new_start, __len, _M_impl);
> >
> >-  // If this throws, the existing elements are unchanged.
> >+  // The order of the three operations is dictated by the C++11
> >+  // case, where the moves could alter a new element belonging
> >+  // to the existing vector.  This is an issue only for callers
> >+  // taking the element by lvalue ref (see last bullet of C++11
> >+  // [res.on.arguments]).
> >+
> >+  // If this throws, the existing elements are unchanged.
> > #if __cplusplus >= 201103L
> >-  _Alloc_traits::construct(this->_M_impl,
> >- std::__to_address(__new_start + 
> >__elems_before),
> >- std::forward<_Args>(__args)...);
> >+  _Alloc_traits::construct(this->_M_impl,
> >+   std::__to_address(__new_start + 
> >__elems_before),
> >+   std::forward<_Args>(__args)...);
> > #else
> >-  _Alloc_traits::construct(this->_M_impl,
> >- __new_start + __elems_before,
> >- __x);
> >+  _Alloc_traits::construct(this->_M_impl,
> >+   __new_start + __elems_before,
> >+   __x);
> > #endif
> >
> > #if __cplusplus >= 201103L
> >-  if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
> >-  {
> >-// Relocation cannot throw.
> >-__new_finish = _S_relocate(__old_start, __position.base(),
> >-   __new_start, _M_get_Tp_allocator());
> >-++__new_finish;
> >-__new_finish = _S_relocate(__position.base(), __old_finish,
> >-   __new_finish, _M_get_Tp_allocator());
> >-  }
> >-  else
> >+  if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
> >+{
> >+  // Relocation cannot throw.
> >+  __new_finish = _S_relocate(__old_start, __position.base(),
> >+ __new_start, _M_get_Tp_allocator());
> >+  ++__new_finish;
> >+  __new_finish = _S_relocate(__position.base(), __old_finish,
> >+ __new_finish, _M_get_Tp_allocator());
> >+}
> >+  else
> > #endif
> >-  {
> >-// RAII type to destroy initialized elements.
> >-struct _Guard_elts
> > {
>
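
The reordering described in this thread (ending the guard's lifetime before the class members are assigned) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the libstdc++ code: `Guard`, `Buffer`, `grow` and the `events` log are made-up names, and raw `new[]`/`delete[]` stand in for the allocator calls.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Records the order in which the interesting steps happen.
std::vector<std::string> events;

// Hypothetical stand-in for the guard: frees whatever storage it
// currently points at when it goes out of scope.
struct Guard
{
  int* storage;
  explicit Guard(int* p) : storage(p) {}
  ~Guard()
  {
    events.push_back("deallocate");
    delete[] storage;  // deleting a null pointer is a no-op
  }
  Guard(const Guard&) = delete;
  Guard& operator=(const Guard&) = delete;
};

struct Buffer
{
  int* start = nullptr;
  std::size_t capacity = 0;

  ~Buffer() { delete[] start; }

  void grow(std::size_t new_capacity)
  {
    int* new_start = new int[new_capacity]();
    {
      Guard guard(new_start);
      // Copy the old elements. If this threw, the guard would free new_start.
      for (std::size_t i = 0; i < capacity && i < new_capacity; ++i)
        new_start[i] = start[i];
      // Success: retarget the guard at the old storage instead.
      guard.storage = start;
    }  // The guard's destructor (the deallocating call) runs here...
    events.push_back("assign members");
    start = new_start;  // ...so the member writes come after it.
    capacity = new_capacity;
  }
};
```

Because the inner block closes before `start` and `capacity` are written, the opaque deallocation call can no longer be assumed to clobber the freshly stored members, which is the property the patch is after.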

[committed] libstdc++: Fix testsuite no_pch directive

2023-08-17 Thread Jonathan Wakely via Gcc-patches
A new test I added was failing with -std=gnu++23 because that flag was
removed from the test options (but only after checking if it met the
c++20 effective target).

Tested x86_64-linux. Pushed to trunk.

-- >8 --

The { dg-add-options no_pch } directive is supposed to add a macro
definition that invalidates the PCH file, and ensures that the #include
directives in the test file are processed as written. But the proc that
adds the options actually removes all existing options, cancelling out
any previous dg-options directive.

This means that using no_pch will cause FAILs in a file that relies on
other options set by an earlier dg-options.

The no_pch directive was added for PR libstdc++/21769 where Janis
suggested adding it as return "$flags -D__GLIBCXX__=" but what
was actually committed didn't include the $flags, so it replaced them.

Additionally, using no_pch only prevents the precompiled version of
<bits/stdc++.h> from being included; it doesn't prevent the
non-precompiled version being included by -include bits/stdc++.h in the
test flags. Use regsub to filter that out of the options as well.
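
The intended behaviour (drop the forced include, keep every other flag, append the macro) can be sketched outside Tcl. The function below is a hypothetical C++ analogue written for this note, not part of the testsuite; `add_no_pch` is a made-up name, and the macro definition is left exactly as it appears in the text above, with no value after the `=`.

```cpp
#include <regex>
#include <string>

// Hypothetical analogue of the fixed add_options_for_no_pch proc:
// filter out "-include bits/stdc++.h", then append the macro definition
// to the surviving flags instead of replacing them.
std::string add_no_pch(const std::string& flags)
{
  static const std::regex forced_include(R"(-include bits/stdc\+\+\.h)");
  const std::string kept = std::regex_replace(flags, forced_include, "");
  return kept + " -D__GLIBCXX__=";
}
```

The important difference from the old behaviour is the `kept +` part: an earlier option such as `-std=gnu++20` survives the call.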

libstdc++-v3/ChangeLog:

* testsuite/lib/dg-options.exp (add_options_for_no_pch): Remove
any "-include bits/stdc++.h" from options and add the macro to
the existing options instead of replacing them.
---
 libstdc++-v3/testsuite/lib/dg-options.exp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp 
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 73c1552e682..15e34f8a646 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -269,8 +269,10 @@ proc dg-require-target-fs-lwt { args } {
 }
 
 proc add_options_for_no_pch { flags } {
+# Remove any inclusion of bits/stdc++.h from the options.
+regsub -all -- "-include bits/stdc...h" $flags "" flags
 # This forces any generated and possibly included PCH to be invalid.
-return "-D__GLIBCXX__="
+return "$flags -D__GLIBCXX__="
 }
 
 # Add to FLAGS all the target-specific flags needed for networking.
-- 
2.41.0



[committed] libstdc++: Disable PCH for tests that rely on include order

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Now that no_pch works, I can use it to fix this test that was failing
with PCH enabled and run with -std=gnu++23.

Tested x86_64-linux. Pushed to trunk.

-- >8 --

These tests expect to be able to #undef a feature test macro and then
include the header that defines it to get it redefined. But if that header
has already been included by the <bits/stdc++.h> PCH then including it
again does nothing and the macro remains undefined.

libstdc++-v3/ChangeLog:

* testsuite/24_iterators/move_iterator/p2520r0.cc: Add no_pch.
* testsuite/std/format/functions/format.cc: Likewise.
* testsuite/std/format/functions/format_c++23.cc: Likewise.
---
 libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc | 1 +
 libstdc++-v3/testsuite/std/format/functions/format.cc| 1 +
 libstdc++-v3/testsuite/std/format/functions/format_c++23.cc  | 1 +
 3 files changed, 3 insertions(+)

diff --git a/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc 
b/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
index 883d6cc09e0..e36ac574a8e 100644
--- a/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
+++ b/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
@@ -1,5 +1,6 @@
 // { dg-options "-std=gnu++20" }
 // { dg-do compile { target c++20 } }
+// { dg-add-options no_pch }
 
 // Verify P2520R0 changes to move_iterator's iterator_concept, which we treat
 // as a DR against C++20.
diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc 
b/libstdc++-v3/testsuite/std/format/functions/format.cc
index a8d5b652a5e..4db5202815d 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -1,5 +1,6 @@
 // { dg-options "-std=gnu++20" }
 // { dg-do run { target c++20 } }
+// { dg-add-options no_pch }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/std/format/functions/format_c++23.cc 
b/libstdc++-v3/testsuite/std/format/functions/format_c++23.cc
index f20c46cd7e3..3caa70fcdf2 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format_c++23.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format_c++23.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++23 } }
+// { dg-add-options no_pch }
 // This test does not have -std=gnu++20 in dg-options so that format.cc
 // can be tested for e.g. -std=c++26
 #include "format.cc"
-- 
2.41.0



Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-17 Thread Richard Biener via Gcc-patches
On Wed, Aug 16, 2023 at 4:38 AM Kewen.Lin  wrote:
>
> on 2023/8/15 17:13, Richard Sandiford wrote:
> > Richard Biener  writes:
> >>> OK, fair enough.  So the idea is: see where we end up and then try to
> >>> improve/factor the APIs in a less peephole way?
> >>
> >> Yeah, I think that's the only good way forward.
> >
> > OK, no objection from me.  Sorry for holding the patch up.
>
> This hasn't been approved yet (although the patch on VMAT_LOAD_STORE_LANES
> was), so it wasn't held up and thanks for sharing your thoughts and making
> it get attention. :)
>
> From the discussions, it seems this looks good to both of you.  But I could
> be wrong, so may I ask if it's ok for trunk?

OK.

Richard.

> BR,
> Kewen


Re: [PATCH] Add -Wdisabled-optimization warning for not optimizing sibling calls

2023-08-17 Thread Richard Biener via Gcc-patches
On Wed, Aug 16, 2023 at 12:48 AM Bradley Lucier  wrote:
>
> First, if this is no longer the appropriate group for this discussion,
> please tell me where to send it.
>
> I've been working to understand all the comments here.  From them, I think:
>
> 1.  It's OK to have gcc report back to the user whether each particular
> call in tail position is optimized when -foptimize-sibling-calls is set
> as a compiler option; or, to report only those calls that have not been
> optimized.
>
> 2.  Given (1), the question is what form that information should take,
> and which gcc option should cause it to be expressed.
>
>  From comments in this thread and the documentation for today's mainline
> gcc, I configured and built Gambit Scheme with
>
> ./configure CC="/pkgs/gcc-mainline/bin/gcc -fopt-info-missed"
> --enable-single-host
>
> thinking that info about missed optimizations would be a good place to
> export information about non-optimized sibling calls.
>
> This may not have been a good idea, however, as I ended up with 93367
> lines about missed optimizations.
>
> Is this the right direction to proceed in?  The documentation says about
> -fopt-info-missed
>
>   One or more of the following option keywords can be used to
>   describe a group of optimizations:
>
>   'ipa'
>Enable dumps from all interprocedural optimizations.
>   'loop'
>Enable dumps from all loop optimizations.
>   'inline'
>Enable dumps from all inlining optimizations.
>   'omp'
>Enable dumps from all OMP (Offloading and Multi Processing)
>optimizations.
>   'vec'
>Enable dumps from all vectorization optimizations.
>   'optall'
>Enable dumps from all optimizations.  This is a superset of
>the optimization groups listed above.
>
> I'd like to limit the number of missed optimization warnings, but I
> don't know where sibling call optimization would fit into these categories.

I think it needs a new category, 'inline' is probably the "closest" existing one
but that also tends to be noisy.  Maybe 'call' would be a good name?  We could
report things like tail-recursion optimization, tail-calling and sibling calling
optimizations there, possibly also return/argument copy elision.

Richard.

>
> Brad


[PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Lehua Ding
Hi,

This small patch fixes the failing testcase
(gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c)
that appears after applying this patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627121.html).
The cause is a bug in the vsetvl pass; this patch forbids the
fusion in this case. This patch needs to be committed
before that patch to work.

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion):
  Forbidden.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
  Address failure due to uninitialized vtype register. 

---
 gcc/config/riscv/riscv-vsetvl.cc   | 3 +++
 .../riscv/rvv/autovec/gather-scatter/strided_load_run-1.c  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 08c487d82c0..11ef5d628c4 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3312,6 +3312,9 @@ pass_vsetvl::backward_demand_fusion (void)
  else if (block_info.reaching_out.dirty_p ())
{
  /* DIRTY -> DIRTY or VALID -> DIRTY.  */
+ if (block_info.reaching_out.demand_p (DEMAND_NONZERO_AVL)
+ && vlmax_avl_p (prop.get_avl ()))
+   continue;
  vector_insn_info new_info;
 
  if (block_info.reaching_out.compatible_p (prop))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
index 7ffa93bf13f..5080f196601 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
@@ -7,6 +7,7 @@
 int
 main (void)
 {
+asm volatile ("vsetivli x0, 0, e8, m1, ta, ma");
 #define RUN_LOOP(DATA_TYPE, BITS)                                             \
   DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
   DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];             \
-- 
2.36.3



[PATCH v1] RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic API

2023-08-17 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch would like to support the rounding mode API for the
VFWREDUSUM.VS as the below samples

* __riscv_vfwredusum_vs_f32m1_f64m1_rm
* __riscv_vfwredusum_vs_f32m1_f64m1_rm_m

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(vfwredusum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwredusum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wredusum.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  |  2 ++
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  1 +
 .../riscv/rvv/base/float-point-wredusum.c | 33 +++
 4 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index abf03bab0da..5ee7d3119db 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2548,6 +2548,7 @@ static CONSTEXPR const freducop vfredosum_frm_obj;
 static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
 static CONSTEXPR const widen_freducop vfwredusum_obj;
+static CONSTEXPR const widen_freducop vfwredusum_frm_obj;
 static CONSTEXPR const widen_freducop vfwredosum_obj;
 static CONSTEXPR const widen_freducop vfwredosum_frm_obj;
 static CONSTEXPR const vmv vmv_x_obj;
@@ -2810,6 +2811,7 @@ BASE (vfredmin)
 BASE (vfwredosum)
 BASE (vfwredosum_frm)
 BASE (vfwredusum)
+BASE (vfwredusum_frm)
 BASE (vmv_x)
 BASE (vmv_s)
 BASE (vfmv_f)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h
index c1bb164a712..69d4562091f 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -247,6 +247,7 @@ extern const function_base *const vfredmin;
 extern const function_base *const vfwredosum;
 extern const function_base *const vfwredosum_frm;
 extern const function_base *const vfwredusum;
+extern const function_base *const vfwredusum_frm;
 extern const function_base *const vmv_x;
 extern const function_base *const vmv_s;
 extern const function_base *const vfmv_f;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def
index da1157f5a56..3ce06dc60b7 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -508,6 +508,7 @@ DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
 DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
 
 DEF_RVV_FUNCTION (vfwredosum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
+DEF_RVV_FUNCTION (vfwredusum_frm, reduc_alu_frm, no_mu_preds, wf_vs_ops)
 
 /* 15. Vector Mask Instructions.  */
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c
new file mode 100644
index 000..6c888c10c0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-wredusum.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m1_t
+test_riscv_vfwredusum_vs_f32m1_f64m1_rm (vfloat32m1_t op1, vfloat64m1_t op2,
+size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1_rm (op1, op2, 0, vl);
+}
+
+vfloat64m1_t
+test_vfwredusum_vs_f32m1_f64m1_rm_m (vbool32_t mask, vfloat32m1_t op1,
+vfloat64m1_t op2, size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1_rm_m (mask, op1, op2, 1, vl);
+}
+
+vfloat64m1_t
+test_riscv_vfwredusum_vs_f32m1_f64m1 (vfloat32m1_t op1, vfloat64m1_t op2,
+ size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1 (op1, op2, vl);
+}
+
+vfloat64m1_t
+test_vfwredusum_vs_f32m1_f64m1_m (vbool32_t mask, vfloat32m1_t op1,
+ vfloat64m1_t op2, size_t vl) {
+  return __riscv_vfwredusum_vs_f32m1_f64m1_m (mask, op1, op2, vl);
+}
+
+/* { dg-final { scan-assembler-times {vfwredusum\.vs\s+v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 2 } } */
-- 
2.34.1



Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-17 Thread Robin Dapp via Gcc-patches
> I'm not opposed to merging the test change, but I couldn't figure out
> where in C the implicit conversion was coming from: as far as I can
> tell the macros don't introduce any (it's "return _float16 *
> _float16"), I'd had the patch open since last night but couldn't
> figure it out.
> 
> We get a bunch of half->single->half converting in the generated
> assembly that smelled like we had a bug somewhere else, sorry if I'm
> just missing something...

Yes, good point, my explanation was wrong again.

What really (TM) happens is that the equality comparison, in presence
of _Float16 emulation(!), performs an extension to float/double for its
arguments.

So
  if (res != r * q)
is
  if ((float)res (float)!= (float)(r * q))

Now, (r * q) is also implicitly computed in float.  Because the
comparison requires a float argument, there is no intermediate conversion
back to _Float16 and the value is more accurate than it would be in
_Float16.
res, however, despite being calculated in float as well, is converted
to _Float16 for the function return or rather the assignment to "res".
Therefore it is less accurate than (r * q) and the comparison fails.

So, what would also help, even though it's not obvious at first
sight is:

 TYPE res = reduc_plus_##TYPE (a, b);   \
-if (res != r * q)  \
+TYPE ref = r * q;  \
+if (res != ref)\
   __builtin_abort ();  \
   }

This does not happen with proper _zfh because the operations are done
in _Float16 precision then.  BTW such kinds of non-obvious problems
are the reason why I split off _zvfh run tests into separate files
right away.
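
To make the mechanism concrete, here is a minimal sketch in plain C.  It
is only an analogy: float stands in for _Float16 and double for the wider
type that the emulation computes in.  The function names are made up for
illustration and do not come from the test case.

```c
/* Analogy only: float plays the role of _Float16, double the role of
   the wider type used by the emulation.  */
static float
mul_narrow (float r, float q)
{
  /* The product is computed in the wide type and rounded on return.  */
  return (float) ((double) r * (double) q);
}

/* Mirrors "if (res != r * q)": the right-hand side stays wide and is
   never rounded back to the narrow type.  Returns 1 on mismatch.  */
int
compare_direct (float r, float q)
{
  float res = mul_narrow (r, q);
  return (double) res != (double) r * (double) q;
}

/* Mirrors the suggested fix: the reference is rounded to the narrow
   type first, so both sides went through the same rounding.  */
int
compare_via_ref (float r, float q)
{
  float res = mul_narrow (r, q);
  float ref = (float) ((double) r * (double) q);
  return res != ref;
}
```

With inputs such as 1.1f and 1.3f the exact product needs more than 24
significand bits, so the direct comparison reports a mismatch while the
comparison through the rounded reference does not.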

Regards
 Robin



Re: [PATCH] testsuite: Remove unused dg-line in ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83

2023-08-17 Thread Richard Biener via Gcc-patches
On Tue, Aug 15, 2023 at 8:36 PM Benjamin Priour via Gcc-patches
 wrote:
>
> From: benjamin priour 
>
> Yet another blunder.
>
> Successfully regstrapped against ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83
> on x86_64-linux-gnu.
>
> OK to push on trunk ?

OK.

> Sorry,
> Benjamin.
>
> Fixup below.
> ---
>
> Test case g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C
> introduced by patch ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83
> emitted a warning for an unused dg-line variable.
> This fixes up the blunder.
>
> Signed-off-by: benjamin priour 
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C:
> Remove dg-line var declare_a.
> ---
>  .../g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C   | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C b/gcc/testsuite/g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C
> index 4cc93d129f0..aa964f93563 100644
> --- a/gcc/testsuite/g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C
> +++ b/gcc/testsuite/g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C
> @@ -6,7 +6,7 @@
>  struct A {int x; int y;};
>
>  int main () { /* { dg-message "\\(1\\) entry to 'main'" "telltale event that we are going within a deeper frame than 'main'" } */
> -  std::shared_ptr a; /* { dg-line declare_a } */
> +  std::shared_ptr a;
>a->x = 4; /* { dg-line deref_a } */
>/* { dg-warning "dereference of NULL" "" { target *-*-* } deref_a } */
>
> --
> 2.34.1
>


Re: [PATCH] bpf: fix pseudoc w regs for small modes [PR111029]

2023-08-17 Thread Richard Biener via Gcc-patches
On Tue, Aug 15, 2023 at 9:03 PM Jose E. Marchesi via Gcc-patches
 wrote:
>
>
> Hello David.
> Thanks for the patch.
>
> OK.

Picking a random patch/mail for this question - how do we maintain BPF
support for the most recent GCC release which is GCC 13?  I see the
current state in GCC 13 isn't fully able to provide upstream kernel BPF
support but GCC 14 contains some bugfixes and some new features(?).
Is it worthwhile to backport at least bugfixes while GCC 14 is still in
development even if those are not regression fixes?  Or is GCC 13 BPF
too broken to be used anyway?

Thanks,
Richard.

> > In the BPF pseudo-c assembly dialect, registers treated as 32-bits
> > rather than the full 64 in various instructions ought to be printed as
> > "wN" rather than "rN".  But bpf_print_register () was only doing this
> > for specifically SImode registers, meaning smaller modes were printed
> > incorrectly.
> >
> > This caused assembler errors like:
> >
> >   Error: unrecognized instruction `w2 =(s8)r1'
> >
> > for a 32-bit sign-extending register move instruction, where the source
> > register is used in QImode.
> >
> > Fix bpf_print_register () to print the "w" version of register when
> > specified by the template for any mode 32-bits or smaller.
> >
> > Tested on bpf-unknown-none.
> >
> >   PR target/111029
> >
> > gcc/
> >   * config/bpf/bpf.cc (bpf_print_register): Print 'w' registers
> >   for any mode 32-bits or smaller, not just SImode.
> >
> > gcc/testsuite/
> >
> >   * gcc.target/bpf/smov-2.c: New test.
> >   * gcc.target/bpf/smov-pseudoc-2.c: New test.
> > ---
> >  gcc/config/bpf/bpf.cc |  2 +-
> >  gcc/testsuite/gcc.target/bpf/smov-2.c | 15 +++
> >  gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c | 15 +++
> >  3 files changed, 31 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/bpf/smov-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
> >
> > diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
> > index 3516b79bce4..1d0abd7fbb3 100644
> > --- a/gcc/config/bpf/bpf.cc
> > +++ b/gcc/config/bpf/bpf.cc
> > @@ -753,7 +753,7 @@ bpf_print_register (FILE *file, rtx op, int code)
> >  fprintf (file, "%s", reg_names[REGNO (op)]);
> >else
> >  {
> > -  if (code == 'w' && GET_MODE (op) == SImode)
> > +  if (code == 'w' && GET_MODE_SIZE (GET_MODE (op)) <= 4)
> >   {
> > if (REGNO (op) == BPF_FP)
> >   fprintf (file, "w10");
> > diff --git a/gcc/testsuite/gcc.target/bpf/smov-2.c b/gcc/testsuite/gcc.target/bpf/smov-2.c
> > new file mode 100644
> > index 000..6f3516d2385
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/bpf/smov-2.c
> > @@ -0,0 +1,15 @@
> > +/* Check signed 32-bit mov instructions.  */
> > +/* { dg-do compile } */
> > +/* { dg-options "-mcpu=v4 -O2" } */
> > +
> > +int
> > +foo (unsigned char a, unsigned short b)
> > +{
> > +  int x = (char) a;
> > +  int y = (short) b;
> > +
> > +  return x + y;
> > +}
> > +
> > +/* { dg-final { scan-assembler {movs32\t%r.,%r.,8\n} } } */
> > +/* { dg-final { scan-assembler {movs32\t%r.,%r.,16\n} } } */
> > diff --git a/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
> > new file mode 100644
> > index 000..6af6cadf8df
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
> > @@ -0,0 +1,15 @@
> > +/* Check signed 32-bit mov instructions (pseudo-C asm dialect).  */
> > +/* { dg-do compile } */
> > +/* { dg-options "-mcpu=v4 -O2 -masm=pseudoc" } */
> > +
> > +int
> > +foo (unsigned char a, unsigned short b)
> > +{
> > +  int x = (char) a;
> > +  int y = (short) b;
> > +
> > +  return x + y;
> > +}
> > +
> > +/* { dg-final { scan-assembler {w. = \(s8\) w.\n} } } */
> > +/* { dg-final { scan-assembler {w. = \(s16\) w.\n} } } */
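
The register-naming rule described in the quoted patch can be sketched
as a standalone function.  This is a simplified illustration, not the
code in bpf.cc: format_bpf_reg and its parameters are hypothetical, the
mode is reduced to a size in bytes, and the special handling of the
frame pointer is left out.

```c
#include <stdio.h>

/* Hypothetical stand-in for the check in the patch.  MODE_SIZE is the
   operand size in bytes and CODE is the operand modifier taken from
   the insn template.  Writes the register name into BUF.  */
int
format_bpf_reg (char *buf, size_t len, int regno, int mode_size, int code)
{
  /* Use the 32-bit "w" name for any operand of 4 bytes or fewer when
     the template asks for it, otherwise the 64-bit "r" name.  */
  const char *prefix = (code == 'w' && mode_size <= 4) ? "w" : "r";
  return snprintf (buf, len, "%s%d", prefix, regno);
}
```

Under the old condition only a size of exactly 4 bytes (SImode) would
have selected the "w" name, which is how a 1-byte source operand ended
up printed as "r1" in the rejected instruction.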


Re: [PATCH] bpf: fix pseudoc w regs for small modes [PR111029]

2023-08-17 Thread Jose E. Marchesi via Gcc-patches


> On Tue, Aug 15, 2023 at 9:03 PM Jose E. Marchesi via Gcc-patches
>  wrote:
>>
>>
>> Hello David.
>> Thanks for the patch.
>>
>> OK.
>
> Picking a random patch/mail for this question - how do we maintain BPF
> support for the most recent GCC release which is GCC 13?  I see the
> current state in GCC 13 isn't fully able to provide upstream kernel BPF
> support but GCC 14 contains some bugfixes and some new features(?).
> Is it worthwhile to backport at least bugfixes while GCC 14 is still in
> development even if those are not regression fixes?  Or is GCC 13 BPF
> too broken to be used anyway?

Our plan is:

1. Get git GCC and git binutils to compile all the kernel BPF selftests.
   This covers both functionality (builtins, attributes, BTF, CO-RE,
   etc) and consolidation of behavior between the GNU and llvm bpf
   ports.  We are working very hard to achieve this point and we are
   very near: functionality wise we are on-par in all components, but
   there are some bugs we are fixing.  We expect to be done in a couple
   of weeks.

2. Once the above is achieved, we plan to start doing the backports to
   released/maintained versions of both binutils and GCC so distros like
   Debian (that already package gcc-bpf) can use the toolchain.

3. Next step is to make sure the compiler generates code that can
   generally satisfy the many restrictions imposed by the kernel
   verifier, at least to a point that is practical.  This is a difficult
   general problem not specific to GCC and is shared by llvm and other
   optimizing compilers, sort of a moving target, and it is not clear at
   all how to achieve this in a general and practical way.  We have some
   ideas and have submitted a proposal to discuss this topic during this
   year's Cauldron: "The challenge of compiling for verified targets".

> Thanks,
> Richard.
>
>> > In the BPF pseudo-c assembly dialect, registers treated as 32-bits
>> > rather than the full 64 in various instructions ought to be printed as
>> > "wN" rather than "rN".  But bpf_print_register () was only doing this
>> > for specifically SImode registers, meaning smaller modes were printed
>> > incorrectly.
>> >
>> > This caused assembler errors like:
>> >
>> >   Error: unrecognized instruction `w2 =(s8)r1'
>> >
>> > for a 32-bit sign-extending register move instruction, where the source
>> > register is used in QImode.
>> >
>> > Fix bpf_print_register () to print the "w" version of register when
>> > specified by the template for any mode 32-bits or smaller.
>> >
>> > Tested on bpf-unknown-none.
>> >
>> >   PR target/111029
>> >
>> > gcc/
>> >   * config/bpf/bpf.cc (bpf_print_register): Print 'w' registers
>> >   for any mode 32-bits or smaller, not just SImode.
>> >
>> > gcc/testsuite/
>> >
>> >   * gcc.target/bpf/smov-2.c: New test.
>> >   * gcc.target/bpf/smov-pseudoc-2.c: New test.
>> > ---
>> >  gcc/config/bpf/bpf.cc |  2 +-
>> >  gcc/testsuite/gcc.target/bpf/smov-2.c | 15 +++
>> >  gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c | 15 +++
>> >  3 files changed, 31 insertions(+), 1 deletion(-)
>> >  create mode 100644 gcc/testsuite/gcc.target/bpf/smov-2.c
>> >  create mode 100644 gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
>> >
>> > diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
>> > index 3516b79bce4..1d0abd7fbb3 100644
>> > --- a/gcc/config/bpf/bpf.cc
>> > +++ b/gcc/config/bpf/bpf.cc
>> > @@ -753,7 +753,7 @@ bpf_print_register (FILE *file, rtx op, int code)
>> >  fprintf (file, "%s", reg_names[REGNO (op)]);
>> >else
>> >  {
>> > -  if (code == 'w' && GET_MODE (op) == SImode)
>> > +  if (code == 'w' && GET_MODE_SIZE (GET_MODE (op)) <= 4)
>> >   {
>> > if (REGNO (op) == BPF_FP)
>> >   fprintf (file, "w10");
>> > diff --git a/gcc/testsuite/gcc.target/bpf/smov-2.c b/gcc/testsuite/gcc.target/bpf/smov-2.c
>> > new file mode 100644
>> > index 000..6f3516d2385
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/bpf/smov-2.c
>> > @@ -0,0 +1,15 @@
>> > +/* Check signed 32-bit mov instructions.  */
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-mcpu=v4 -O2" } */
>> > +
>> > +int
>> > +foo (unsigned char a, unsigned short b)
>> > +{
>> > +  int x = (char) a;
>> > +  int y = (short) b;
>> > +
>> > +  return x + y;
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler {movs32\t%r.,%r.,8\n} } } */
>> > +/* { dg-final { scan-assembler {movs32\t%r.,%r.,16\n} } } */
>> > diff --git a/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
>> > new file mode 100644
>> > index 000..6af6cadf8df
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/bpf/smov-pseudoc-2.c
>> > @@ -0,0 +1,15 @@
>> > +/* Check signed 32-bit mov instructions (pseudo-C asm dialect).  */
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-mcpu=v4 -O2 -masm=pseudoc" } */
>> > +
>> > +int
>> > +foo (unsigned char a, unsigned 

Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread Guillaume Gomez via Gcc-patches
Hi Dave,

> What kind of testing has the patch had? (e.g. did you run "make check-
> jit" ?  Has this been in use on real Rust code?)

I tested it directly with the Rust GCC backend on this code:

```
pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
    *a += *c;
    *b += *c;
}
```

I ran it with `rustc` (and the GCC backend) with the following flags:
`-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the diff
you can see in the attached file. To explain: the right-hand side of
the diff uses the `__restrict__` attribute, whereas the left-hand side
is the current version, where we don't handle it.

As for C testing, I used this code:

```
void t(int *__restrict__ a, int *__restrict__ b, char *__restrict__ c) {
    *a += *c;
    *b += *c;
}
```

(without the `__restrict__`, of course, when I need a baseline ASM for
comparison). I attached the diff as well; this time the file that uses
`__restrict__` is on the left. I compiled with the following flags:
`-S -O3`.
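
For readers wondering why the assembly differs at all, here is a small
self-contained sketch of the two variants. The function names are only
illustrative and this is not the test from the patch.

```c
/* Without the qualifier, *c may alias *a, so the compiler has to
   read *c again after the first store.  */
void
add_plain (int *a, int *b, const int *c)
{
  *a += *c;
  *b += *c;
}

/* With the qualifier, the caller promises that the three objects do
   not overlap, so the compiler may load *c once and reuse it.  */
void
add_restrict (int *__restrict__ a, int *__restrict__ b,
              const int *__restrict__ c)
{
  *a += *c;
  *b += *c;
}
```

Calling `add_plain (&x, &y, &x)` is well defined and the second addition
sees the updated `x`; the same call to `add_restrict` would break the
no-overlap promise, which is exactly the freedom the optimizer uses.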

> Please add a feature macro:
> #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> (see the similar ones in the header).

I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
documentation as well to mention the ABI change.

> Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than adding this
> to ABI_0.

I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last one.

> This refers to a "cold attribute"; is this a vestige of a copy-and-
> paste from a different test case?

It is a vestige indeed... Missed this one.

> I see that the test scans the generated assembler.  Does the test
> actually verify that restrict has an effect, or was that another
> vestige from a different test case?

No, this time it's what I wanted. Please see the C diff I provided
above to see that the ASM has a small diff that allowed me to confirm
that the `__restrict__` attribute was correctly set.

> If this test is meant to run at -O3 and thus can't be part of test-
> combination.c, please add a comment about it to
> gcc/testsuite/jit.dg/all-non-failing-tests.h (in the alphabetical
> place).

Below `-O3`, this ASM difference doesn't appear unfortunately.

> The patch also needs to add documentation for the new entrypoint (in
> topics/types.rst), and for the new ABI tag (in
> topics/compatibility.rst).

Added!

> Thanks again for the patch; hope the above is constructive

It was incredibly useful! Thanks for taking the time to write down the
explanations.

The new patch is attached to this email.

Cordially.

Le jeu. 17 août 2023 à 01:06, David Malcolm  a écrit :
>
> On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit wrote:
> > My apologies, forgot to run the commit checkers. Here's the commit
> > with the errors fixed.
> >
> > Le mer. 16 août 2023 à 18:32, Guillaume Gomez
> >  a écrit :
> > >
> > > Hi,
>
> Hi Guillaume, thanks for the patch.
>
> > >
> > > This patch adds the possibility to specify the __restrict__
> > > attribute
> > > for function parameters. It is used by the Rust GCC backend.
>
> What kind of testing has the patch had? (e.g. did you run "make check-
> jit" ?  Has this been in use on real Rust code?)
>
> Overall, this patch looks close to being ready, but some nits below...
>
> [...]
>
> > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> > index 60eaf39bff6..2e0d08a06d8 100644
> > --- a/gcc/jit/libgccjit.h
> > +++ b/gcc/jit/libgccjit.h
> > @@ -635,6 +635,10 @@ gcc_jit_type_get_const (gcc_jit_type *type);
> >  extern gcc_jit_type *
> >  gcc_jit_type_get_volatile (gcc_jit_type *type);
> >
> > +/* Given type "T", get type "restrict T".  */
> > +extern gcc_jit_type *
> > +gcc_jit_type_get_restrict (gcc_jit_type *type);
> > +
> >  #define LIBGCCJIT_HAVE_SIZED_INTEGERS
> >
> >  /* Given types LTYPE and RTYPE, return non-zero if they are compatible.
>
> Please add a feature macro:
> #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> (see the similar ones in the header).
>
> > diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> > index e52de0057a5..b7289b13845 100644
> > --- a/gcc/jit/libgccjit.map
> > +++ b/gcc/jit/libgccjit.map
> > @@ -104,6 +104,7 @@ LIBGCCJIT_ABI_0
> >  gcc_jit_type_as_object;
> >  gcc_jit_type_get_const;
> >  gcc_jit_type_get_pointer;
> > +gcc_jit_type_get_restrict;
> >  gcc_jit_type_get_volatile;
>
> Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than adding this
> to ABI_0.
>
> > diff --git a/gcc/testsuite/jit.dg/test-restrict.c b/gcc/testsuite/jit.dg/test-restrict.c
> > new file mode 100644
> > index 000..4c8c4407f91
> > --- /dev/null
> > +++ b/gcc/testsuite/jit.dg/test-restrict.c
> > @@ -0,0 +1,77 @@
> > +/* { dg-do compile { target x86_64-*-* } } */
> > +
> > +#include 
> > +#include 
> > +
> > +#include "libgccjit.h"
> > +
> > +/* We don't want set_options() in harness.h to set -O3 to see that the cold
> > +  attribute affects the optimizations. */
>
> This refers to a "cold attribute"; is this a vestige of a copy-and-
> paste from a different test case?
>
> I see

Re: [PATCH v3][RFC] c-family: Implement __has_feature and __has_extension [PR60512]

2023-08-17 Thread Alex Coplan via Gcc-patches
I'd like to ping this for review from C and C++ maintainers:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626178.html

I probably should have dropped the RFC tag this time round as I think
the patch is nearly ready, I suppose we just need agreement on the
issues below: is there any GCC configuration where __thread can get
rejected (I don't know of one), and should cxx_binary_literals report as
a feature with -std=c2x?
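
For context, this is roughly how user code is expected to consume the
macros. It is a minimal sketch using the usual fallback idiom, so it
also builds with compilers that do not provide them. The feature name
address_sanitizer is the one clang documents, and built_with_asan is a
made-up helper.

```c
/* Fall back to "not available" on compilers that lack the macros.  */
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#ifndef __has_extension
#define __has_extension(x) __has_feature(x)
#endif

/* Returns 1 if this translation unit was built with AddressSanitizer
   enabled according to __has_feature, and 0 otherwise.  */
int
built_with_asan (void)
{
#if __has_feature(address_sanitizer)
  return 1;
#else
  return 0;
#endif
}
```

The open question about cxx_binary_literals is about what such a
query should answer in C mode, not about this usage pattern.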

Thanks,
Alex

On 03/08/2023 10:21, Alex Coplan via Gcc-patches wrote:
> Hi,
> 
> This patch implements clang's __has_feature and __has_extension in GCC.
> This is a v3 which addresses feedback for the v2 patch posted here:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626058.html
> 
> Main changes since v2:
>  - As per Jason's feedback, dropped the langhook in favour of
>a function prototyped in c-family/c-common.h and implemented in
>*-lang.cc for each frontend.
>  - Also dropped the callbacks as suggested, we now compute whether
>features/extensions are available when __has_feature is first invoked,
>and only add available features to the hash table (storing a boolean
>to indicate whether a given identifier names a feature or an extension).
>  - Added many comments to top-level definitions.
>  - Generally polished and tidied up a bit.
> 
> As of this writing, there are still a couple of unresolved issues
> around cxx_binary_literals and TLS, see:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626058.html
> 
> Bootstrapped/regtested on aarch64-linux-gnu and x86_64-apple-darwin.
> How does this version look?
> 
> Thanks,
> Alex
> 
> gcc/c-family/ChangeLog:
> 
>   PR c++/60512
>   * c-common.cc (struct hf_feature_info): New.
>   (c_common_register_feature): New.
>   (init_has_feature): New.
>   (has_feature_p): New.
>   * c-common.h (c_common_has_feature): New.
>   (c_family_register_lang_features): New.
>   (c_common_register_feature): New.
>   (has_feature_p): New.
>   (c_register_features): New.
>   (cp_register_features): New.
>   * c-lex.cc (init_c_lex): Plumb through has_feature callback.
>   (c_common_has_builtin): Generalize and move common part ...
>   (c_common_lex_availability_macro): ... here.
>   (c_common_has_feature): New.
>   * c-ppoutput.cc (init_pp_output): Plumb through has_feature.
> 
> gcc/c/ChangeLog:
> 
>   PR c++/60512
>   * c-lang.cc (c_family_register_lang_features): New.
>   * c-objc-common.cc (struct c_feature_info): New.
>   (c_register_features): New.
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/60512
>   * cp-lang.cc (c_family_register_lang_features): New.
>   * cp-objcp-common.cc (struct cp_feature_selector): New.
>   (cp_feature_selector::has_feature): New.
>   (struct cp_feature_info): New.
>   (cp_register_features): New.
> 
> gcc/ChangeLog:
> 
>   PR c++/60512
>   * doc/cpp.texi: Document __has_{feature,extension}.
> 
> gcc/objc/ChangeLog:
> 
>   PR c++/60512
>   * objc-act.cc (struct objc_feature_info): New.
>   (objc_nonfragile_abi_p): New.
>   (objc_common_register_features): New.
>   * objc-act.h (objc_common_register_features): New.
>   * objc-lang.cc (c_family_register_lang_features): New.
> 
> gcc/objcp/ChangeLog:
> 
>   PR c++/60512
>   * objcp-lang.cc (c_family_register_lang_features): New.
> 
> libcpp/ChangeLog:
> 
>   PR c++/60512
>   * include/cpplib.h (struct cpp_callbacks): Add has_feature.
>   (enum cpp_builtin_type): Add BT_HAS_{FEATURE,EXTENSION}.
>   * init.cc: Add __has_{feature,extension}.
>   * macro.cc (_cpp_builtin_macro_text): Handle
>   BT_HAS_{FEATURE,EXTENSION}.
> 
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/60512
>   * c-c++-common/has-feature-common.c: New test.
>   * g++.dg/ext/has-feature.C: New test.
>   * gcc.dg/asan/has-feature-asan.c: New test.
>   * gcc.dg/has-feature.c: New test.
>   * gcc.dg/ubsan/has-feature-ubsan.c: New test.
>   * obj-c++.dg/has-feature.mm: New test.
>   * objc.dg/has-feature.m: New test.


Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua,

unrelated but I'm seeing a lot of failing gather/scatter tests on
master right now.

> /* DIRTY -> DIRTY or VALID -> DIRTY.  */
> +   if (block_info.reaching_out.demand_p (DEMAND_NONZERO_AVL)
> +   && vlmax_avl_p (prop.get_avl ()))
> + continue;
> vector_insn_info new_info; 

Please add a small comment here explaining exactly which situation
we're trying to prevent.

> +asm volatile ("vsetivli x0, 0, e8, m1, ta, ma");

Why is this necessary or rather why is vtype uninitialized?  Is
this the mentioned bug?  If so, why do we still need it with the
vsetvl fix? 

Regards
 Robin



[PATCH] doc: Fixes to RTL-SSA sample code

2023-08-17 Thread Alex Coplan via Gcc-patches
Hi,

This patch fixes up the code examples in the RTL-SSA documentation (the
sections on making insn changes) to reflect the current API.

The main issues are as follows:
 - rtl_ssa::recog takes an obstack_watermark & as the first parameter.
   Presumably this is intended to be the change attempt, so I've updated
   the examples to pass this through.
 - The variants of recog and restrict_movement that take an ignore
   predicate have been renamed with an _ignoring suffix, so I've
   updated callers to use those names.
 - A couple of minor "obvious" fixes to add a missing address-of
   operator and correct a variable name.

OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

* doc/rtl.texi: Fix up sample code for RTL-SSA insn changes.
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 76aeafb8f15..0ed88f58821 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -4964,7 +4964,7 @@ the pass should check whether the new pattern matches a target
 instruction or satisfies the requirements of an inline asm:
 
 @smallexample
-if (!rtl_ssa::recog (change))
+if (!rtl_ssa::recog (attempt, change))
   return false;
 @end smallexample
 
@@ -5015,7 +5015,7 @@ if (!rtl_ssa::restrict_movement (change))
 insn_change_watermark watermark;
 // Use validate_change etc. to change INSN's pattern.
 @dots{}
-if (!rtl_ssa::recog (change)
+if (!rtl_ssa::recog (attempt, change)
 || !rtl_ssa::change_is_worthwhile (change))
   return false;
 
@@ -5048,7 +5048,7 @@ For example, if a pass is changing exactly two instructions,
 it might do:
 
 @smallexample
-rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
+rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
 @end smallexample
 
 where @code{change1}'s instruction must come before @code{change2}'s.
@@ -5066,7 +5066,7 @@ in the correct order with respect to each other.
 The way to do this is:
 
 @smallexample
-if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
+if (!rtl_ssa::restrict_movement_ignoring (change, insn_is_changing (changes)))
   return false;
 @end smallexample
 
@@ -5078,7 +5078,7 @@ changing instructions (which might, for example, no longer need
 to clobber the flags register).  The way to do this is:
 
 @smallexample
-if (!rtl_ssa::recog (change, insn_is_changing (changes)))
+if (!rtl_ssa::recog_ignoring (attempt, change, insn_is_changing (changes)))
   return false;
 @end smallexample
 
@@ -5118,28 +5118,28 @@ Putting all this together, the process for a two-instruction change is:
 @smallexample
 auto attempt = crtl->ssa->new_change_attempt ();
 
-rtl_ssa::insn_change change (insn1);
+rtl_ssa::insn_change change1 (insn1);
 change1.new_defs = @dots{};
 change1.new_uses = @dots{};
 change1.move_range = @dots{};
 
-rtl_ssa::insn_change change (insn2);
+rtl_ssa::insn_change change2 (insn2);
 change2.new_defs = @dots{};
 change2.new_uses = @dots{};
 change2.move_range = @dots{};
 
-rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
+rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
 
 auto is_changing = insn_is_changing (changes);
-if (!rtl_ssa::restrict_movement (change1, is_changing)
-|| !rtl_ssa::restrict_movement (change2, is_changing))
+if (!rtl_ssa::restrict_movement_ignoring (change1, is_changing)
+|| !rtl_ssa::restrict_movement_ignoring (change2, is_changing))
   return false;
 
 insn_change_watermark watermark;
 // Use validate_change etc. to change INSN1's and INSN2's patterns.
 @dots{}
-if (!rtl_ssa::recog (change1, is_changing)
-|| !rtl_ssa::recog (change2, is_changing)
+if (!rtl_ssa::recog_ignoring (attempt, change1, is_changing)
+|| !rtl_ssa::recog_ignoring (attempt, change2, is_changing)
 || !rtl_ssa::changes_are_worthwhile (changes)
 || !crtl->ssa->verify_insn_changes (changes))
   return false;


Re: [PATCH] doc: Fixes to RTL-SSA sample code

2023-08-17 Thread Richard Sandiford via Gcc-patches
Alex Coplan  writes:
> Hi,
>
> This patch fixes up the code examples in the RTL-SSA documentation (the
> sections on making insn changes) to reflect the current API.
>
> The main issues are as follows:
>  - rtl_ssa::recog takes an obstack_watermark & as the first parameter.
>Presumably this is intended to be the change attempt, so I've updated
>the examples to pass this through.
>  - The variants of recog and restrict_movement that take an ignore
>predicate have been renamed with an _ignoring suffix, so I've
>updated callers to use those names.
>  - A couple of minor "obvious" fixes to add a missing address-of
>operator and correct a variable name.
>
> OK for trunk?

OK.  Thanks for doing this.  I'm pretty sure the examples did
compile with one version of the API, but like you say, I forgot
to update it later. :(

Richard

> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   * doc/rtl.texi: Fix up sample code for RTL-SSA insn changes.
>
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 76aeafb8f15..0ed88f58821 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -4964,7 +4964,7 @@ the pass should check whether the new pattern matches a target
>  instruction or satisfies the requirements of an inline asm:
>  
>  @smallexample
> -if (!rtl_ssa::recog (change))
> +if (!rtl_ssa::recog (attempt, change))
>return false;
>  @end smallexample
>  
> @@ -5015,7 +5015,7 @@ if (!rtl_ssa::restrict_movement (change))
>  insn_change_watermark watermark;
>  // Use validate_change etc. to change INSN's pattern.
>  @dots{}
> -if (!rtl_ssa::recog (change)
> +if (!rtl_ssa::recog (attempt, change)
>  || !rtl_ssa::change_is_worthwhile (change))
>return false;
>  
> @@ -5048,7 +5048,7 @@ For example, if a pass is changing exactly two 
> instructions,
>  it might do:
>  
>  @smallexample
> -rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
> +rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
>  @end smallexample
>  
>  where @code{change1}'s instruction must come before @code{change2}'s.
> @@ -5066,7 +5066,7 @@ in the correct order with respect to each other.
>  The way to do this is:
>  
>  @smallexample
> -if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
> +if (!rtl_ssa::restrict_movement_ignoring (change, insn_is_changing 
> (changes)))
>return false;
>  @end smallexample
>  
> @@ -5078,7 +5078,7 @@ changing instructions (which might, for example, no 
> longer need
>  to clobber the flags register).  The way to do this is:
>  
>  @smallexample
> -if (!rtl_ssa::recog (change, insn_is_changing (changes)))
> +if (!rtl_ssa::recog_ignoring (attempt, change, insn_is_changing (changes)))
>return false;
>  @end smallexample
>  
> @@ -5118,28 +5118,28 @@ Putting all this together, the process for a 
> two-instruction change is:
>  @smallexample
>  auto attempt = crtl->ssa->new_change_attempt ();
>  
> -rtl_ssa::insn_change change (insn1);
> +rtl_ssa::insn_change change1 (insn1);
>  change1.new_defs = @dots{};
>  change1.new_uses = @dots{};
>  change1.move_range = @dots{};
>  
> -rtl_ssa::insn_change change (insn2);
> +rtl_ssa::insn_change change2 (insn2);
>  change2.new_defs = @dots{};
>  change2.new_uses = @dots{};
>  change2.move_range = @dots{};
>  
> -rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
> +rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
>  
>  auto is_changing = insn_is_changing (changes);
> -if (!rtl_ssa::restrict_movement (change1, is_changing)
> -|| !rtl_ssa::restrict_movement (change2, is_changing))
> +if (!rtl_ssa::restrict_movement_ignoring (change1, is_changing)
> +|| !rtl_ssa::restrict_movement_ignoring (change2, is_changing))
>return false;
>  
>  insn_change_watermark watermark;
>  // Use validate_change etc. to change INSN1's and INSN2's patterns.
>  @dots{}
> -if (!rtl_ssa::recog (change1, is_changing)
> -|| !rtl_ssa::recog (change2, is_changing)
> +if (!rtl_ssa::recog_ignoring (attempt, change1, is_changing)
> +|| !rtl_ssa::recog_ignoring (attempt, change2, is_changing)
>  || !rtl_ssa::changes_are_worthwhile (changes)
>  || !crtl->ssa->verify_insn_changes (changes))
>return false;


Re: [PATCH] Fix code_helper unused argument warning for fr30

2023-08-17 Thread Richard Biener via Gcc-patches
On Thu, Aug 17, 2023 at 9:21 AM Jan-Benedict Glaw  wrote:
>
> Hi!
>
> fr30 is the only target defining GO_IF_LEGITIMATE_ADDRESS right now, in
> which case the `code_helper ch` argument to memory_address_addr_space_p()
> is unused and emits a new warning.

OK.

> gcc/ChangeLog:
> * recog.cc (memory_address_addr_space_p): Mark possibly unused
> argument as unused.
>
> diff --git a/gcc/recog.cc b/gcc/recog.cc
> index 2bff6c03e4d..92f151248a6 100644
> --- a/gcc/recog.cc
> +++ b/gcc/recog.cc
> @@ -1803,7 +1803,7 @@ pop_operand (rtx op, machine_mode mode)
>
>  bool
>  memory_address_addr_space_p (machine_mode mode ATTRIBUTE_UNUSED, rtx addr,
> -addr_space_t as, code_helper ch)
> +addr_space_t as, code_helper ch ATTRIBUTE_UNUSED)
>  {
>  #ifdef GO_IF_LEGITIMATE_ADDRESS
>gcc_assert (ADDR_SPACE_GENERIC_P (as));
>
>
>
> Ok for trunk?
>
> Thanks,
>   Jan-Benedict
>
> --
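
For readers unfamiliar with the idiom, here is a minimal sketch of why the attribute is needed. The names DEMO_ATTRIBUTE_UNUSED, DEMO_LEGACY_PATH and demo_address_ok are made up for illustration and are not GCC's; the point is only that a parameter consulted on one preprocessor path and ignored on another triggers -Wunused-parameter on the path that ignores it, unless it is marked as possibly unused.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's ATTRIBUTE_UNUSED macro.
#define DEMO_ATTRIBUTE_UNUSED __attribute__ ((unused))

// Define DEMO_LEGACY_PATH to mimic a target that takes the old code path
// (an assumption for this sketch; it plays the role of a target macro).
// #define DEMO_LEGACY_PATH 1

// Returns true when ADDR is acceptable.  HINT is only consulted on the
// non-legacy path, so a legacy build would warn about it under
// -Wunused-parameter if the attribute were missing.
bool
demo_address_ok (int addr, int hint DEMO_ATTRIBUTE_UNUSED)
{
#ifdef DEMO_LEGACY_PATH
  return addr >= 0;
#else
  return addr >= 0 && hint != 0;
#endif
}
```

The attribute is harmless on the path that does use the parameter, which is why it can be applied unconditionally.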


Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Lehua Ding
Hi Robin,


> unrelated but I'm seeing a lot of failing gather/scatter tests on
> master right now.


Are you talking about FAILs like the ones below? If so, they should be caused by a
recent commit from juzhe, who is looking at it. If not, I didn't see these fails
in my local run.


  XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
  XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
  XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
  XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand



> Please add a small comment here which exact situation we're trying
> to prevent.


OK, will add a comment like the one below:


  /* Forbid fusing in this case because it would change the value of a5.
   bb 1: vsetvl zero, no_zero_avl
 ...
 use a5
 ...
   bb 2: vsetvl a5, zero
 =>
   bb 1: vsetvl a5, zero
 ...
 use a5
 ...
   bb 2:
  */





> Why is this necessary or rather why is vtype uninitialized?  Is
> this the mentioned bug?  If so, why do we still need it with the
> vsetvl fix?


This is because running a testcase with spike+pk results in an
ILLEGAL INSTRUCTION error if the vtype register is not initialized
before a vmv1r.v instruction is executed. This testcase fails for that
reason, so it explicitly executes vsetvl early. We are currently discussing with Kito
whether to constrain this case in the psABI and ask the execution environment (pk) to ensure
that vtype is initialized, but that will take some time. So when a testcase
fails for this reason, I think fixing it this way is OK.




Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Siddhesh Poyarekar

On 2023-08-16 11:59, Qing Zhao wrote:

Jakub and Sid,

During my study, I found an interesting behavior for the following small 
testing case:

#include <stddef.h>
#include <stdio.h>

struct fixed {
   size_t foo;
   char b;
   char array[10];
} q = {};

#define noinline __attribute__((__noinline__))

static void noinline bar ()
{
   struct fixed *p = &q;

   printf("the__bos of MAX p->array sub is %d \n", 
__builtin_object_size(p->array, 1));
   printf("the__bos of MIN p->array sub is %d \n", 
__builtin_object_size(p->array, 3));

   return;
}

int main ()
{
   bar ();
   return 0;
}
[opc@qinzhao-aarch64-ol8 108896]$ sh t
/home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
the__bos of MAX p->array sub is 10
the__bos of MIN p->array sub is 15

I assume that the Minimum size in the sub-object should be 10 too (i.e 
__builtin_object_size(p->array, 3) should be 10 too).

So, first question: Is this correct or wrong behavior for 
__builtin_object_size(p->array, 3)?

The second question is, when I debugged into why 
__builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the 
following:

1. In the “early_objsz” phase, the IR for p->array is:
(gdb) call debug_generic_expr(ptr)
&p_5->array

And the pt_var is:
(gdb) call debug_generic_expr(pt_var)
*p_5

As a result, the following condition in tree-object-size.cc:

  585   if (pt_var != TREE_OPERAND (ptr, 0))

was satisfied, so the algorithm for computing the SUBOBJECT was invoked 
and the size of the subobject, 10, was used.

A MAX_EXPR was then inserted after the __builtin_object_size call:
   _3 = &p_5->array;
   _10 = __builtin_object_size (_3, 3);
   _4 = MAX_EXPR <_10, 10>;

Till now, everything looks fine.

2. Within the “ccp1” phase, when folding the call to __builtin_object_size, the IR 
for p->array is:
(gdb) call debug_generic_expr(ptr)
&MEM  [(void *)&q + 9B]

And the pt_var is:
(gdb) call debug_generic_expr(pt_var)
MEM  [(void *)&q + 9B]

As a result, the following condition in tree-object-size.cc:

  585   if (pt_var != TREE_OPERAND (ptr, 0))

Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was NOT 
invoked at all, as a result, the size in the whole object, 15, was used.

And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the final 
result.
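
As a side note, the two numbers can be reproduced with plain layout arithmetic. The sketch below assumes an LP64 target (8-byte size_t), where the array starts at offset 9, matching the "&q + 9B" in the dump above; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct fixed {
  std::size_t foo;
  char b;
  char array[10];
};

// Bytes in the array sub-object itself: the answer expected for
// __builtin_object_size (p->array, 1) and (p->array, 3).
constexpr std::size_t
subobject_size ()
{
  return sizeof (fixed::array);
}

// Bytes from the start of the array to the end of the enclosing object,
// including tail padding: what a whole-object computation sees once the
// reference has become a MEM_REF at a byte offset.
constexpr std::size_t
whole_object_remaining ()
{
  return sizeof (fixed) - offsetof (fixed, array);
}
```

On LP64, sizeof (fixed) is 24 and the array offset is 9, so the second helper yields 15 while the first yields 10.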

Based on the above, is there any issue with the current algorithm?


So this is a (sort of) known issue, which necessitated the early_objsz 
pass to get an estimate before a subobject reference was optimized to a 
MEM_REF.  However it looks like the MIN/MAX hack doesn't work in this 
case for OST_MINIMUM; it should probably get the minimum of the two 
passes if both passes were successful, or only the result of the pass 
that was successful.
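
To make the suggestion concrete, here is a rough sketch of the combination rule in standalone C++. It is not the tree-object-size.cc code; combine_estimates and unknown_for are hypothetical helpers. The only outside fact relied on is the documented failure value of __builtin_object_size: (size_t) -1 for types 0 and 1, and 0 for types 2 and 3.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Failure sentinel for a given __builtin_object_size type.
constexpr std::size_t
unknown_for (int type)
{
  return (type & 2) ? 0 : static_cast<std::size_t> (-1);
}

// Hypothetical helper: combine the early-pass and late-pass estimates.
// If only one pass succeeded, use its result; if both succeeded, take the
// smaller one, since the late pass may only have seen the whole object.
std::size_t
combine_estimates (std::size_t early, std::size_t late, int type)
{
  const std::size_t unknown = unknown_for (type);
  if (early == unknown)
    return late;
  if (late == unknown)
    return early;
  return std::min (early, late);
}
```

With the numbers from the report, combine_estimates (10, 15, 3) gives 10, whereas a plain MAX of the two gives 15.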


Thanks,
Sid


Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Lehua Ding
I see these failing testcases on trunk:


                === gcc: Unexpected fails for rv64gcv_zfh lp64d medany spike ===
FAIL: gcc.dg/pr42685.c (test for excess errors)
FAIL: gcc.dg/pr45105.c (test for excess errors)
XPASS: gcc.dg/unroll-7.c scan-rtl-dump-not loop2_unroll "Invalid sum"
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 17)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 18)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 21)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 31)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 32)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 35)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 45)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 55)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 63)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 66)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 68)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 69)
FAIL: gcc.dg/plugin/cpython-plugin-test-2.c 
-fplugin=./analyzer_cpython_plugin.so (test for excess errors)
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-16.c scan-tree-dump-times 
optimized ".VEC_PERM" 1
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-17.c scan-tree-dump-times 
optimized ".VEC_PERM" 2
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-18.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-18.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-18.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-18.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-19.c scan-assembler \\tvid\\.v
XPASS: gcc.target/riscv/rvv/autovec/partial/slp-19.c scan-assemble

Re: [pushed][LRA]: Spill pseudos assigned to fp when fp->sp elimination became impossible

2023-08-17 Thread SenthilKumar.Selvaraj--- via Gcc-patches
On Wed, 2023-08-16 at 12:13 -0400, Vladimir Makarov wrote:
> 
> The attached patch fixes recently found wrong insn removal in LRA port
> for AVR.
> 
> The patch was successfully tested and bootstrapped on x86-64 and aarch64.
> 
> 
Hi Vladimir,

  Thanks for working on this. After applying the patch, I'm seeing that the
  pseudo in the frame pointer that got spilled is taking up the same stack
  slot that was already assigned to a spilled pseudo, and that is causing an
  execution failure (it is also causing a crash when building libgcc for avr).

(insn 108 15 3 3 (set (mem/c:SI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [3 %sfp+1 S4 A8])
        (reg:SI 4 r4 [orig:46 f.3_4 ] [46])) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":29:16 119 {*movsi_split}
     (nil))
(insn 3 108 4 3 (set (mem/c:HI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [3 %sfp+1 S2 A8])
        (const_int 0 [0])) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":19:21 101 {*movhi_split}
     (expr_list:REG_EQUAL (const_int 0 [0])
        (nil)))

  Both pseudo 46, and the pseudo spilled off FP (51) get assigned stack slot 0.
  This translates to this obviously wrong assembly code.

lds r4,f
lds r5,f+1
lds r6,f+2
lds r7,f+3
std Y+1,r4
std Y+2,r5
std Y+3,r6
std Y+4,r7
std Y+2,__zero_reg__
std Y+1,__zero_reg__

  I tried a hacky workaround (see patch below) to create a new stack slot and
  assign the spilled pseudo to it, and that works.
  
  Not sure if that's the right way to do it though.

Regards
Senthil

  diff --git gcc/lra-spills.cc gcc/lra-spills.cc
index 7e1d35b5e4e..e985ab56a60 100644
--- gcc/lra-spills.cc
+++ gcc/lra-spills.cc
@@ -359,11 +359,12 @@ add_pseudo_to_slot (int regno, int slot_num)
length N.  Sort pseudos in PSEUDO_REGNOS for subsequent assigning
memory stack slots. */
 static void
-assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, int n)
+assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, int n, bool reset)
 {
   int i, j, regno;
 
-  slots_num = 0;
+  if (reset)
+slots_num = 0;
   /* Assign stack slot numbers to spilled pseudos, use smaller numbers
  for most frequently used pseudos. */
   for (i = 0; i < n; i++)
@@ -628,14 +629,15 @@ lra_spill (void)
   /* Sort regnos according their usage frequencies.  */
   qsort (pseudo_regnos, n, sizeof (int), regno_freq_compare);
   n = assign_spill_hard_regs (pseudo_regnos, n);
-  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
+  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n, true);
   for (i = 0; i < n; i++)
 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
   assign_mem_slot (pseudo_regnos[i]);
-  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
+  if ((n2 = lra_update_fp2sp_elimination (&pseudo_regnos[n])) > 0)
 {
+  assign_stack_slot_num_and_sort_pseudos (&pseudo_regnos[n], n2, false);
   /* Assign stack slots to spilled pseudos assigned to fp.  */
-  for (i = 0; i < n2; i++)
+  for (i = n; i < n + n2; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
  assign_mem_slot (pseudo_regnos[i]);
 }

Regards
Senthil
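
For anyone following along, here is a toy model of the collision described above. It is not LRA code: slot_allocator is a made-up type in which every pseudo simply gets the next free slot number, whereas the real assign_stack_slot_num_and_sort_pseudos also shares slots between non-conflicting pseudos. It only shows what resetting the counter on the second call does.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy model (an assumption for illustration, not LRA itself).
struct slot_allocator
{
  int slots_num = 0;
  std::map<int, int> slot_of;   // pseudo regno -> stack slot number

  // Assign consecutive slot numbers to REGNOS.  When RESET is true the
  // counter restarts at 0, as the unpatched function always did.
  void
  assign (const std::vector<int> &regnos, bool reset)
  {
    if (reset)
      slots_num = 0;
    for (int regno : regnos)
      slot_of[regno] = slots_num++;
  }
};
```

Calling assign ({46}, true) followed by assign ({51}, true) puts both pseudos in slot 0, while passing false on the second call gives pseudo 51 slot 1.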


[PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Joseph Myers  writes:
> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>
>> Would it be OK to add support for:
>> 
>>   [[__extension__ ...]]
>> 
>> to suppress the pedwarn about using [[]] prior to C2X?  Then we can
>
> That seems like a plausible feature to add.

Thanks.  Of course, once I actually tried it, I hit a snag:
:: isn't a single lexing token prior to C2X, and so something like:

  [[__extension__ arm::streaming]]

would not be interpreted as a scoped attribute in C11.  The patch
gets around that by allowing two colons in place of :: when
__extension__ is used.  I realise that's pushing the bounds of
acceptability though...

I wondered about trying to require the two colons to be immediately
adjacent.  But:

(a) There didn't appear to be an existing API to check that, which seemed
like a red flag.  The closest I could find was get_source_text_between.

Similarly to that, it would in principle be possible to compare
two expanded locations.  But...

(b) I had a vague impression that locations were allowed to drop column
information for very large inputs (maybe I'm wrong).

(c) It wouldn't cope with token pasting.

So in the end I just used a simple two-token test, like for [[ and ]].
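
To illustrate what a two-token test amounts to, here is a self-contained toy version. The tok enumeration and scope_separator_length are invented for this sketch and do not correspond to the c-parser.cc API; the real parser works on CPP_SCOPE and CPP_COLON tokens.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token kinds, for illustration only.
enum class tok { name, colon, scope, other };

// Return how many tokens form a scope separator at POS: 1 for a single
// "::" token, 2 for two consecutive ":" tokens when LOOSE_SCOPE_P is set,
// and 0 if there is no separator.  As in the patch description, adjacency
// in the source text is not checked, only token order.
std::size_t
scope_separator_length (const std::vector<tok> &toks, std::size_t pos,
                        bool loose_scope_p)
{
  if (pos < toks.size () && toks[pos] == tok::scope)
    return 1;
  if (loose_scope_p
      && pos + 1 < toks.size ()
      && toks[pos] == tok::colon
      && toks[pos + 1] == tok::colon)
    return 2;
  return 0;
}
```

The caller would consume that many tokens before reading the attribute name that follows the namespace.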

Bootstrapped & regression-tested on aarch64-linux-gnu.

Richard



[[]] attributes are a recent addition to C, but as a GNU extension,
GCC allows them to be used in C11 and earlier.  Normally this use
would trigger a pedwarn (for -pedantic, -Wc11-c2x-compat, etc.).

This patch allows the pedwarn to be suppressed by starting the
attribute-list with __extension__.

Also, :: is not a single lexing token prior to C2X, so it wasn't
possible to use scoped attributes in C11, even as a GNU extension.
The patch allows two colons to be used in place of :: when
__extension__ is used.  No attempt is made to check whether the
two colons are immediately adjacent.

gcc/
* doc/extend.texi: Document the C [[__extension__ ...]] construct.

gcc/c/
* c-parser.cc (c_parser_std_attribute): Conditionally allow
two colons to be used in place of ::.
(c_parser_std_attribute_list): New function, split out from...
(c_parser_std_attribute_specifier): ...here.  Allow the attribute-list
to start with __extension__.  When it does, also allow two colons
to be used in place of ::.

gcc/testsuite/
* gcc.dg/c2x-attr-syntax-6.c: New test.
* gcc.dg/c2x-attr-syntax-7.c: Likewise.
---
 gcc/c/c-parser.cc| 68 ++--
 gcc/doc/extend.texi  | 27 --
 gcc/testsuite/gcc.dg/c2x-attr-syntax-6.c | 50 +
 gcc/testsuite/gcc.dg/c2x-attr-syntax-7.c | 48 +
 4 files changed, 173 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-attr-syntax-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-attr-syntax-7.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 33fe7b115ff..82e56b28446 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -5390,10 +5390,18 @@ c_parser_balanced_token_sequence (c_parser *parser)
  ( balanced-token-sequence[opt] )
 
Keywords are accepted as identifiers for this purpose.
-*/
+
+   As an extension, we permit an attribute-specifier to be:
+
+ [ [ __extension__ attribute-list ] ]
+
+   Two colons are then accepted as a synonym for ::.  No attempt is made
+   to check whether the colons are immediately adjacent.  LOOSE_SCOPE_P
+   indicates whether this relaxation is in effect.  */
 
 static tree
-c_parser_std_attribute (c_parser *parser, bool for_tm)
+c_parser_std_attribute (c_parser *parser, bool for_tm,
+   bool loose_scope_p = false)
 {
   c_token *token = c_parser_peek_token (parser);
   tree ns, name, attribute;
@@ -5406,9 +5414,18 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
 }
   name = canonicalize_attr_name (token->value);
   c_parser_consume_token (parser);
-  if (c_parser_next_token_is (parser, CPP_SCOPE))
+  if (c_parser_next_token_is (parser, CPP_SCOPE)
+  || (loose_scope_p
+ && c_parser_next_token_is (parser, CPP_COLON)
+ && c_parser_peek_token (parser)->type == CPP_COLON))
 {
   ns = name;
+  if (c_parser_next_token_is (parser, CPP_COLON))
+   {
+ c_parser_consume_token (parser);
+ if (!c_parser_next_token_is (parser, CPP_COLON))
+   gcc_unreachable ();
+   }
   c_parser_consume_token (parser);
   token = c_parser_peek_token (parser);
   if (token->type != CPP_NAME && token->type != CPP_KEYWORD)
@@ -5481,19 +5498,9 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
 }
 
 static tree
-c_parser_std_attribute_specifier (c_parser *parser, bool for_tm)
+c_parser_std_attribute_list (c_parser *parser, bool for_tm,
+bool loose_scope_p = false)
 {
-  location_t loc = c_parser_peek_token (parser)->location;
-  if (!c_parser_require (parser, CPP_OPEN_S

Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp

2023-08-17 Thread Fei Gao
Hi Kito

Root cause has been identified.

Here's the frame layout for the TC, please use courier font :)
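+-------------------------------+
|                               |
|  GPR save area  112 B         |
|                               |
+-------------------------------+
|                               |<-- fs0 is beyond sp based 12-bit range
|  FPR save area  96 B          |
|                               |
+-------------------------------+
|                               |
|  local variables              |<-- stack_pointer_rtx after riscv_first_stack_step
|                               |
+-------------------------------+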

During stack frame allocation:
1. cm.push reserves 160 bytes: 112 for ra and the sregs, with 128-bit alignment as
per the ABI, and an additional 48 bytes for the first 6 fprs.
2. riscv_first_stack_step reserves 2032 bytes for the remaining 6 fprs and the local
variables.
3. riscv_for_each_saved_reg tries to save fs0, which is beyond the sp-based 12-bit
range. This trips gcc_assert (can_create_pseudo_p ()) in gen_reg_rtx when
force_reg is called, because reload has already completed.

I tried a solution that saves the first 6 fprs immediately after cm.push. It
seems to work :)
I will fix the epilogue correspondingly as well.
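
The range problem can be checked with a few lines of arithmetic. This is only a sketch: the 160, 2032, 112 and 96 byte figures come from the description above, but the 8-byte slot size and the ordering of FPR slots from the top of the FPR area downwards are assumptions, and fits_simm12 and fpr_slot_offset are made-up helper names.

```cpp
#include <cassert>

// RISC-V load/store offsets are 12-bit signed immediates.
constexpr bool
fits_simm12 (long off)
{
  return off >= -2048 && off <= 2047;
}

constexpr long cm_push_bytes = 160;      // reserved by cm.push
constexpr long first_step_bytes = 2032;  // reserved by riscv_first_stack_step
constexpr long gpr_area_bytes = 112;     // GPR save area at the top of the frame
constexpr long fpr_slot_bytes = 8;       // assumption: 96 B / 12 FPRs

// Offset from the final sp of the Nth FPR slot, counting downwards from
// just below the GPR save area (slot order is an assumption).
constexpr long
fpr_slot_offset (int n)
{
  return cm_push_bytes + first_step_bytes - gpr_area_bytes
         - (n + 1) * fpr_slot_bytes;
}
```

Under these assumptions the topmost FPR slot sits at sp + 2072, which is outside the 12-bit range, while the lowest one sits at sp + 1984 and is reachable.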

Thanks again for your test. 

BR, 
Fei

On 2023-08-16 16:33  Kito Cheng  wrote:
>
>Hi Fei:
>
>Tried to use Jiawei's patch to test this patch and found some issue:
>
>
>> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>>    /* Save the registers.  */
>>    if ((frame->mask | frame->fmask) != 0)
>>  {
>> -  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> -
>> -  insn = gen_add3_insn (stack_pointer_rtx,
>> -   stack_pointer_rtx,
>> -   GEN_INT (-step1));
>> -  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> -  remaining_size -= step1;
>> +  if (known_gt (remaining_size, frame->frame_pointer_offset))
>> +    {
>> +  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, 
>> remaining_size);
>> +  remaining_size -= step1;
>> +  insn = gen_add3_insn (stack_pointer_rtx,
>> +    stack_pointer_rtx,
>> +    GEN_INT (-step1));
>> +  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> +    }
>>    riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, 
>>false);
>>  }
>>
>
>I hit some issue here during building libgcc, I use
>riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>
>And the error message is:
>
>In file included from
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>function '_Unwind_Backtrace':
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
> 330 | }
> | ^
>0x83753a gen_reg_rtx(machine_mode)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>0xf5566f maybe_legitimize_operand
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>int, expand_operand*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>0xf58539 expand_binop_directly
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>0xf5 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>rtx_def*, int, optab_methods)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>0xc8fca1 force_reg(machine_mode, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>0x144b8cd riscv_force_temporary
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>0x144b8cd riscv_force_address
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>0x1af063e gen_movdf(rtx_def*, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>0xcba503 rtx_insn* insn_gen_fn::operator()rtx_def*>(rtx_def*, rtx_def*) const
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>0x143d6c4 riscv_save_reg
>   ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/ris

Re: [WIP RFC v2] analyzer: Add support of placement new and improved operator new [PR105948]

2023-08-17 Thread Benjamin Priour via Gcc-patches
On Thu, Aug 17, 2023 at 12:34 AM David Malcolm  wrote:

> On Wed, 2023-08-16 at 14:19 +0200, priour...@gmail.com wrote:
> > From: benjamin priour 
> >
> > Hi,
> > (s/we/the analyzer/)
>
> Hi Benjamin, thanks for the updated patch.
>
> >
> > I've been continuing my patch of supporting operator new variants
> > in the analyzer, and have added a few more test cases.
> >
> >
> > > > If "y" is null then the allocation failed and dereferencing
> > "y" will
> > > > cause
> > > > a segfault, not a "use-of-uninitialized-value".
> > > > Thus we should stick to 'dereference of NULL 'y'" only.
> > > > If "y" is non-null then the allocation succeeded and "*y" is
> > > > initialized
> > > > since we are calling a default initialization with the empty
> > > > parenthesis.
> > >
> > > I *think* it's possible to have the region_model have y
> > pointing to a
> > > heap_allocated_region of sizeof(int) size that's been
> > initialized, but
> > > still have the malloc state machine part of the program_state
> > say that
> > > the pointer is maybe-null.
> >
> > By maybe-null are you implying a new sm-malloc state ?
>
> Sorry, I was too vague here.
>
> I was referring to the "unchecked" state in sm-malloc.cc, which
> represents a pointer that's been returned from an allocator function,
> where the pointer hasn't yet been checked for being null/non-null.
>
Oh I see then. Unfortunately I don't think initializing the
heap_allocated_region while having the unchecked state is doable here.
I could do it in the kf_operator_new::on_call_{pre,post} hook, but it's
not actually operator new that initializes the allocated region.
For calls such as "A a = new (nothrow) A;", 'a' is actually never
initialized,
therefore we need the heap_allocated_region to reflect that.
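
As a plain C++ illustration of the distinction (independent of the analyzer internals): with empty parentheses the object is value-initialized, so reading a member after a null check is well defined, whereas without them the members of this aggregate are left indeterminate. The function name checked_read is made up for the example.

```cpp
#include <cassert>
#include <new>

struct A
{
  int x;
  int y;
};

// Allocate with the nothrow form, check the result before dereferencing,
// and read a member.  Returns -1 if the allocation failed.
int
checked_read ()
{
  A *y = new (std::nothrow) A ();   // "()" value-initializes: x == 0, y == 0
  if (!y)
    return -1;                      // pointer checked: null branch
  int z = y->x + 2;                 // pointer checked: non-null branch
  delete y;
  return z;
}
```

Writing "new (std::nothrow) A" without the parentheses would leave x and y uninitialized, so the same read would be a genuine use of an uninitialized value.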


> > I am not sure to follow on that front.
> >
> >
> > >
> > > > This led me to consider having "null-dereference" supersedes
> > > > "use-of-uninitialized-value", but
> > > > new PR 110830 made me reexamine it.
> > > >
> > > > I believe fixing PR 110830 is thus required before submitting
> > this
> > > > patch,
> > > > or we would have some extra irrelevant warnings.
> > >
> > > How bad would the problem be?  PR 110830 looks a little
> > involved, so is
> > > there a way to get the current patch in without dragging that
> > extra
> > > complexity in?
> >
> > Having "null-dereference" supersedes "use-of-uninitialized-value"
> > would
> > cause false negative upon conditional return statement (similarly as
> > demonstrated
> > in PR 110830).
> >
> > Since PR 110830 is off for the moment, I have tried solving this
> > differently.
> > I have considered using known NULL constraints on
> > heap_allocated_region
> > as "initialized_value".
> >
> > You can see below in the diff of region_model::get_store_value
> > two versions of this approach. The version commented out proved to
> > solve
> > the issue of the spurious "use-of-unitialized-value" tagging along
> > calls to
> > "new(std::nothrow) ()". However, this version also shortcircuits the
> > diagnostics of the "null-dereference" warning.
> >
> > Given
> > /* { dg-additional-options "-O0 -fno-exceptions -fno-analyzer-
> > suppress-followups" } */
> > #include <new>
> >
> > struct A
> > {
> >   int x;
> >   int y;
> > };
> >
> > void test_nonthrowing ()
> > {
> >   A* y = new(std::nothrow) A();
> >   int z = y->x + 2; /* { dg-warning "dereference of NULL 'y'" }
> > */
> >   /* { dg-bogus "use of uninitialized value '\\*y'" "" { xfail *-
> > *-* } .-1 } */
> >
> >   delete y;
> > }
> >
> > The analyzer sees gimple
> >
> >:
> >   _7 = operator new (8, &nothrow);
> >   if (_7 != 0B)
> > goto ; [INV]
> >   else
> > goto ; [INV]
>
> I would have thought that at each branch of this conditional that
> region_model::add_constraint would be called, and within that we'd
> reach this code:
>
> 4339  /* Notify the context, if any.  This exists so that the state
> machines
> 4340 in a program_state can be notified about the condition, and
> so can
> 4341 set sm-state for e.g. unchecked->checked, both for cfg-edges,
> and
> 4342 when synthesizing constraints as above.  */
> 4343  if (ctxt)
> 4344ctxt->on_condition (lhs, op, rhs);
>
> This ought to call impl_region_model_context::on_condition in
> engine.cc, which ought to call malloc_state_machine::on_condition in
> sm-malloc.cc, and this ought to transition the sm-state of _7.
>
> Is something going wrong somewhere in the things I mentioned above?
>
Nope. Everything's happening as you say.

> >
> >:
> >   MEM[(struct A *)_7].x = 0;
> >   MEM[(struct A *)_7].y = 0;
> >   iftmp.0_11 = _7;
> >   goto ; [INV]
> >
> >:
> >   iftmp.0_8 = _7;
> >
> >:
> >   # iftmp.0_2 = PHI 
> >   y_12 = iftmp.0_2;
> >   _1 = y_12->x;
>
> ...and at this point we have a deref from y_12, which on th

[PATCH] RISC-V: Fix XPASS slp testcases

2023-08-17 Thread Lehua Ding
This patch fixes the XPASS slp testcases on trunk by
making the conditions for xfail stricter.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Fix.
* gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-6.c: Ditto.

---
 .../gcc.target/riscv/rvv/autovec/partial/slp-1.c | 9 +
 .../gcc.target/riscv/rvv/autovec/partial/slp-16.c| 7 ---
 .../gcc.target/riscv/rvv/autovec/partial/slp-17.c| 7 ---
 .../gcc.target/riscv/rvv/autovec/partial/slp-18.c| 7 ---
 .../gcc.target/riscv/rvv/autovec/partial/slp-19.c| 6 --
 .../gcc.target/riscv/rvv/autovec/partial/slp-2.c | 5 +++--
 .../gcc.target/riscv/rvv/autovec/partial/slp-3.c | 5 +++--
 .../gcc.target/riscv/rvv/autovec/partial/slp-4.c | 5 +++--
 .../gcc.target/riscv/rvv/autovec/partial/slp-5.c | 5 +++--
 .../gcc.target/riscv/rvv/autovec/partial/slp-6.c | 5 +++--
 10 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-1.c
index 788e0450b47..3571a325f73 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-1.c
@@ -19,7 +19,8 @@ f (int8_t *restrict a, int8_t *restrict b, int n)
 }
 }
 
-/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen instead of SLP.  */
-/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 1 "optimized" { xfail *-*-* 
} } } */
-/* { dg-final { scan-assembler {\tvid\.v} { xfail *-*-* } } } */
-/* { dg-final { scan-assembler {\tvand} { xfail *-*-* } } } */
+/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen
+   instead of SLP when riscv-autovec-lmul=m1 or m2.  */
+/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 1 "optimized" { xfail { 
any-opts "--param riscv-autovec-lmul=m1" "--param riscv-autovec-lmul=m2" } } } 
} */
+/* { dg-final { scan-assembler {\tvid\.v} { xfail { any-opts "--param 
riscv-autovec-lmul=m1" "--param riscv-autovec-lmul=m2" } } } } */
+/* { dg-final { scan-assembler {\tvand} { xfail { any-opts "--param 
riscv-autovec-lmul=m1" "--param riscv-autovec-lmul=m2" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-16.c
index b58e270eaa4..8c5c65152c8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-16.c
@@ -19,7 +19,8 @@ f (uint8_t *restrict a, uint8_t *restrict b, int n)
 }
 }
 
-/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen instead of SLP.  */
-/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 1 "optimized" { xfail *-*-* 
} } } */
-/* { dg-final { scan-assembler {\tvid\.v} { xfail *-*-* } } } */
+/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen
+   instead of SLP when riscv-autovec-lmul=m1 or m2.  */
+/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 1 "optimized" { xfail { 
any-opts "--param riscv-autovec-lmul=m1" "--param riscv-autovec-lmul=m2" } } } 
} */
+/* { dg-final { scan-assembler {\tvid\.v} { xfail { any-opts "--param 
riscv-autovec-lmul=m1" "--param riscv-autovec-lmul=m2" } } } } */
 /* { dg-final { scan-assembler-not {\tvmul} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-17.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-17.c
index bccf3e6570a..67dbadafc48 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-17.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-17.c
@@ -29,7 +29,8 @@ f (uint8_t *restrict a, uint8_t *restrict b,
 }
 }
 
-/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen instead of SLP.  */
-/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 2 "optimized" { xfail *-*-* 
} } } */
-/* { dg-final { scan-assembler {\tvid\.v} { xfail *-*-* } } } */
+/* FIXME: Since we don't have VECT cost model yet, LOAD_LANES/STORE_LANES are 
chosen
+   instead of SLP when riscv-autovec-lmul=m1.  */
+/* { dg-final { scan-tree-dump-times "\.VEC_PERM" 2 "optimized" { xfail { 
any-opts "--param riscv-autovec-lmul=m1" } } } } */
+/* { dg-final { scan-assembler {\tvid\.v} { xfail { any-opts "--param 
riscv-autovec-lmul=m1" } } } } */
 /* { dg-final { scan-assembler-not {\tvmul} } } */
d

Re: [PATCH] RISC-V: Fix XPASS slp testcases

2023-08-17 Thread juzhe.zh...@rivai.ai
LGTM.

Thanks for fixing my previous mistakes.



juzhe.zh...@rivai.ai
 
From: Lehua Ding
Date: 2023-08-17 19:43
To: gcc-patches
CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw
Subject: [PATCH] RISC-V: Fix XPASS slp testcases

Re: [PATCH] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua,

thanks for fixing this.  Looks like the same reason we have the
separation of zvfh and zvfhmin for vector loads/stores.

> +;; Iterator for hardware-supported load/store floating-point modes.
> +(define_mode_iterator ANYLSF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX")
> +   (DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
> +   (HF "TARGET_ZFHMIN || TARGET_ZHINX")])
> +

I first thought we needed TARGET_ZFH here as well but it appears that
TARGET_ZFH implies TARGET_ZFHMIN via riscv_implied_info.  We're lacking
that on the vector side and this should be addressed separately.
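
To make the "implies" point concrete, here is a minimal sketch of how a table of implied extensions can be expanded transitively.  The table, the function names, and the string keys are all hypothetical; this is not GCC's riscv_implied_info, and only the two implications mentioned in this thread (zfh -> zfhmin, zvfh -> zfhmin) are listed.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative implication table (assumption: NOT the real GCC table).
static const std::multimap<std::string, std::string> implied = {
  {"zfh", "zfhmin"},
  {"zvfh", "zfhmin"},
};

// Return the requested extensions plus everything they transitively imply.
std::set<std::string>
expand_extensions (const std::vector<std::string> &requested)
{
  std::set<std::string> enabled;
  std::vector<std::string> worklist (requested.begin (), requested.end ());
  while (!worklist.empty ())
    {
      std::string ext = worklist.back ();
      worklist.pop_back ();
      if (!enabled.insert (ext).second)
        continue; // Already processed.
      auto range = implied.equal_range (ext);
      for (auto it = range.first; it != range.second; ++it)
        worklist.push_back (it->second);
    }
  return enabled;
}

// Models an HF condition keyed on the "min" extensions only.
bool
hf_load_store_enabled (const std::set<std::string> &enabled)
{
  return enabled.count ("zfhmin") != 0 || enabled.count ("zhinxmin") != 0;
}
```

With such an expansion in place, a condition written against the "min" extension is also satisfied when only the full extension was requested, which is why a separate check for the full extension is redundant.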

You likely want TARGET_ZHINXMIN instead of ZHINX though?  I mean the
hardware support is obviously always there but the patterns should
be available for the min extension already.  Please double check as
I haven't worked with that extension before.
Our test coverage for the *inx extensions is honestly a bit sparse,
maybe you would also want to add a testcase for a similar scenario?

> -;; We can support ANYF loads into X register if there is no double support
> +;; We can support ANYLSF loads into X register if there is no double support
>  ;; or if the target is 64-bit
> -(define_insn "*local_pic_load"
> -  [(set (match_operand:ANYF 0 "register_operand" "=f,*r")
> - (mem:ANYF (match_operand 1 "absolute_symbolic_operand" "")))
> +(define_insn "*local_pic_load"
> +  [(set (match_operand:ANYLSF 0 "register_operand" "=f,*r")
> + (mem:ANYLSF (match_operand 1 "absolute_symbolic_operand" "")))
> (clobber (match_scratch:P 2 "=r,X"))]
>"TARGET_HARD_FLOAT && USE_LOAD_ADDRESS_MACRO (operands[1])
> && (!TARGET_DOUBLE_FLOAT || TARGET_64BIT)"
>"@
> -   \t%0,%1,%2
> +   \t%0,%1,%2
> \t%0,%1"
>[(set (attr "length") (const_int 8))])

Unrelated to your patch - but from a quick glimpse here I didn't see
why we require TARGET_HARD_FLOAT for the softload alternatives.  Aren't
zdinx, zfinx, zhinx a bit of a SOFT_FLOAT thing?  Well probably just
semantics... 

Apart from that LGTM.

Regards
 Robin



Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua,

> XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
> XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
> XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand
> XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand

Thanks for checking, I know about those but have other FAILs.  Probably
due to a recent update or so, need to check.

> This is because running a testcase with spike+pk will result in an
> ILLEGAL INSTRUCTION error if the vtype registers are not initialized
> before executing vmv1r.v instruction. This case fails because of this reason,
> so explicitly execute vsetvl early. We are currently discussing with Kito to
> constrain this case in psABI and ask the execution environment(pk) to ensure
> that vtype is initialized, but not so fast. So when encountering a testcase 
> that
> fails because of this reason, I think use this way to fix it is ok.

Hmm, ok so that has nothing to do with the rest of the patch but just
happened to be the same test case.
So we didn't schedule a vsetvl here because vmv1r doesn't require
one but the simulation doesn't initialize vtype before the first vsetvl?
If this is the only instance, I guess that's OK, but please add a comment
as well.

OK with the two comments added.

Regards
 Robin


Re: [PATCH] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin

2023-08-17 Thread Lehua Ding
Hi Robin,


> You likely want TARGET_ZHINXMIN instead of ZHINX though?  I mean the
> hardware support is obviously always there but the patterns should
> be available for the min extension already.  Please double check as
> I haven't worked with that extension before.
> Our test coverage for the *inx extensions is honestly a bit sparse,
> maybe you would also want to add a testcase for a similar scenario?


Indeed, thanks for the reminder. I'll add the missing ones and send a V2 patch.


> Unrelated to your patch - but from a quick glimpse here I didn't see
> why we require TARGET_HARD_FLOAT for the softload alternatives.  Aren't
> zdinx, zfinx, zhinx a bit of a SOFT_FLOAT thing?  Well probably just
> semantics...


Looking at it closely, this condition seems a bit odd to me too.




Best,
Lehua




Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Lehua Ding
Hi Robin,


> Hmm, ok so that has nothing to do with the rest of the patch but just
> happened to be the same test case.
> So we didn't schedule a vsetvl here because vmv1r doesn't require
> one but the simulation doesn't initialize vtype before the first vsetvl?
> If this is the only instance, I guess that's OK, but please add a comment
> as well.


You understood it exactly right. There should be a harmonized solution to
this problem later. This is an interim solution to reduce unnecessary
failures.


Best,
Lehua




[committed] libstdc++: Regenerate Makefile.in

2023-08-17 Thread Jonathan Wakely via Gcc-patches
This target in include/Makefile.am was supposed to ensure that nobody
building gcc would need autogen to regenerate the bits/version.h header,
but it didn't make it in to include/Makefile.in.

Tested x86_64-linux, pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/Makefile.in: Regenerate.
---
 libstdc++-v3/include/Makefile.in | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index f5b04d3fe8a..d2c95ee0b95 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1940,9 +1940,9 @@ ${pch3_output}: ${pch3_source} ${pch2_output}
$(CXX) $(PCHFLAGS) $(AM_CPPFLAGS) -O2 -g ${pch3_source} -o $@
 
 # AutoGen .
-${bits_srcdir}/version.h: ${bits_srcdir}/version.def \
-   ${bits_srcdir}/version.tpl
-   cd $(@D) && \
+.PHONY: update-version
+update-version:
+   cd ${bits_srcdir} && \
autogen version.def
 
 # The real deal.
-- 
2.41.0



[committed] libstdc++: Fix std::format("{:F}", inf) to use uppercase

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk. Backport to gcc-13 will follow.

-- >8 --

std::format was treating {:f} and {:F} identically on the basis that for
the fixed 1.234567 format there are no alphabetical characters that need
to be in uppercase. But that's wrong for infinities and NaNs, which
should be formatted as "INF" and "NAN" for {:F}.
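
The same case rule can be seen with the C library's printf conversions, which is a convenient way to check the expected behaviour without a fixed libstdc++.  This is only an analogy sketch, assuming std::format's {:f}/{:F} are meant to match %f/%F in this respect; format_with and case_only_matters_for_non_finite are hypothetical helper names, not part of the patch.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>
#include <string>

// Format a double with a single printf conversion such as "%f" or "%F".
std::string
format_with (const char *conversion, double value)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, conversion, value);
  return buf;
}

// True when the lowercase and uppercase conversions give different text
// for infinity but identical text for an ordinary finite value.
bool
case_only_matters_for_non_finite ()
{
  const double inf = std::numeric_limits<double>::infinity ();
  return format_with ("%f", inf) != format_with ("%F", inf)
	 && format_with ("%f", 1.5) == format_with ("%F", 1.5);
}
```

For finite values the two conversions agree, which is why treating them identically went unnoticed; only infinities and NaNs expose the difference.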

libstdc++-v3/ChangeLog:

* include/std/format (__format::_Pres_type): Add _Pres_F.
(__formatter_fp::parse): Use _Pres_F for 'F'.
(__formatter_fp::format): Set __upper for _Pres_F.
* testsuite/std/format/functions/format.cc: Check formatting of
infinity and NaN for each presentation type.
---
 libstdc++-v3/include/std/format  | 10 --
 .../testsuite/std/format/functions/format.cc | 12 
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index a8db10d6460..40c7d6128f6 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -309,7 +309,7 @@ namespace __format
 // Presentation types for integral types (including bool and charT).
 _Pres_d = 1, _Pres_b, _Pres_B, _Pres_o, _Pres_x, _Pres_X, _Pres_c,
 // Presentation types for floating-point types.
-_Pres_a = 1, _Pres_A, _Pres_e, _Pres_E, _Pres_f, _Pres_g, _Pres_G,
+_Pres_a = 1, _Pres_A, _Pres_e, _Pres_E, _Pres_f, _Pres_F, _Pres_g, _Pres_G,
 _Pres_p = 0, _Pres_P,   // For pointers.
 _Pres_s = 0,// For strings and bool.
 _Pres_esc = 0xf,// For strings and charT.
@@ -1382,10 +1382,13 @@ namespace __format
++__first;
break;
  case 'f':
- case 'F':
__spec._M_type = _Pres_f;
++__first;
break;
+ case 'F':
+   __spec._M_type = _Pres_F;
+   ++__first;
+   break;
  case 'g':
__spec._M_type = _Pres_g;
++__first;
@@ -1442,6 +1445,9 @@ namespace __format
  __use_prec = true;
  __fmt = chars_format::scientific;
  break;
+   case _Pres_F:
+ __upper = true;
+ [[fallthrough]];
case _Pres_f:
  __use_prec = true;
  __fmt = chars_format::fixed;
diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc 
b/libstdc++-v3/testsuite/std/format/functions/format.cc
index 4db5202815d..59ed3be8baa 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -159,6 +159,18 @@ test_alternate_forms()
   VERIFY( s == "1.e+01 1.e+01 1.e+01" );
 }
 
+void
+test_infnan()
+{
+  double inf = std::numeric_limits::infinity();
+  double nan = std::numeric_limits::quiet_NaN();
+  std::string s;
+  s = std::format("{0} {0:e} {0:E} {0:f} {0:F} {0:g} {0:G} {0:a} {0:A}", inf);
+  VERIFY( s == "inf inf INF inf INF inf INF inf INF" );
+  s = std::format("{0} {0:e} {0:E} {0:f} {0:F} {0:g} {0:G} {0:a} {0:A}", nan);
+  VERIFY( s == "nan nan NAN nan NAN nan NAN nan NAN" );
+}
+
 struct euro_punc : std::numpunct
 {
   std::string do_grouping() const override { return "\3\3"; }
-- 
2.41.0



[PATCH V2] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Lehua Ding
Hi,

This small patch fixes the failing testcase
(gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c)
that appears after applying this patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627121.html).
The specific reason is that the vsetvl pass has a bug, and this patch
forbids the fusion in this case. This patch needs to be committed
before that patch to work.

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion):
  Forbidden.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
  Address failure due to uninitialized vtype register.
---
 gcc/config/riscv/riscv-vsetvl.cc| 17 +
 .../autovec/gather-scatter/strided_load_run-1.c |  6 ++
 2 files changed, 23 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 79cbac01047..2d8fa754ea0 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3330,6 +3330,23 @@ pass_vsetvl::backward_demand_fusion (void)
  else if (block_info.reaching_out.dirty_p ())
{
  /* DIRTY -> DIRTY or VALID -> DIRTY.  */
+
+ /* Forbidden this case fuse because it change the value of a5.
+  bb 1: vsetvl zero, no_zero_avl
+...
+use a5
+...
+  bb 2: vsetvl a5, zero
+=>
+  bb 1: vsetvl a5, zero
+...
+use a5
+...
+  bb 2:
+ */
+ if (block_info.reaching_out.demand_p (DEMAND_NONZERO_AVL)
+ && vlmax_avl_p (prop.get_avl ()))
+   continue;
  vector_insn_info new_info;
 
  if (block_info.reaching_out.compatible_p (prop))
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
index 7ffa93bf13f..7eeb22aade2 100644
--- 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
+++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
@@ -7,6 +7,12 @@
 int
 main (void)
 {
+  /* FIXME: The purpose of this assembly is to ensure that the vtype register 
is
+ initialized befor instructions such as vmv1r.v are executed. Otherwise you
+ will get illegal instruction errors when running with spike+pk. This is an
+ interim solution for reduce unnecessary failures and a unified solution
+ will come later. */
+  asm volatile("vsetivli x0, 0, e8, m1, ta, ma");
 #define RUN_LOOP(DATA_TYPE, BITS)  
\
   DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];   
\
   DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];  
\
-- 
2.36.3



[COMMITTED] bpf: support `naked' function attributes in BPF targets

2023-08-17 Thread Jose E. Marchesi via Gcc-patches
The kernel selftests and other BPF programs make extensive use of the
`naked' function attribute with bodies written using basic inline
assembly.  This patch adds support for the attribute to
bpf-unknown-none, makes it inhibit warnings due to the lack of an
explicit `return' statement, and updates documentation and testsuite
accordingly.

Tested in x86_64-linux-gnu host and bpf-unknown-none target.

gcc/ChangeLog

PR target/111046
* config/bpf/bpf.cc (bpf_attribute_table): Add entry for the
`naked' function attribute.
(bpf_warn_func_return): New function.
(TARGET_WARN_FUNC_RETURN): Define.
(bpf_expand_prologue): Add preventive comment.
(bpf_expand_epilogue): Likewise.
* doc/extend.texi (BPF Function Attributes): Document the `naked'
function attribute.

gcc/testsuite/ChangeLog

* gcc.target/bpf/naked-1.c: New test.
---
 gcc/config/bpf/bpf.cc  | 25 +
 gcc/doc/extend.texi| 11 +++
 gcc/testsuite/gcc.target/bpf/naked-1.c | 12 
 3 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/bpf/naked-1.c

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 1d0abd7fbb3..437bd652de3 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -154,6 +154,10 @@ static const struct attribute_spec bpf_attribute_table[] =
  { "preserve_access_index", 0, -1, false, true, false, true,
bpf_handle_preserve_access_index_attribute, NULL },
 
+ /* Support for `naked' function attribute.  */
+ { "naked", 0, 1, false, false, false, false,
+   bpf_handle_fndecl_attribute, NULL },
+
  /* The last attribute spec is set to be NULL.  */
  { NULL,   0,  0, false, false, false, false, NULL, NULL }
 };
@@ -335,6 +339,21 @@ bpf_function_value_regno_p (const unsigned int regno)
 #undef TARGET_FUNCTION_VALUE_REGNO_P
 #define TARGET_FUNCTION_VALUE_REGNO_P bpf_function_value_regno_p
 
+
+/* Determine whether to warn about lack of return statement in a
+   function.  */
+
+static bool
+bpf_warn_func_return (tree decl)
+{
+  /* Naked functions are implemented entirely in assembly, including
+ the return instructions.  */
+  return lookup_attribute ("naked", DECL_ATTRIBUTES (decl)) == NULL_TREE;
+}
+
+#undef TARGET_WARN_FUNC_RETURN
+#define TARGET_WARN_FUNC_RETURN bpf_warn_func_return
+
 /* Compute the size of the function's stack frame, including the local
area and the register-save area.  */
 
@@ -388,6 +407,9 @@ bpf_expand_prologue (void)
  dynamically.  This should have been checked already and an error
  emitted.  */
   gcc_assert (!cfun->calls_alloca);
+
+  /* If we ever need to have a proper prologue here, please mind the
+ `naked' function attribute.  */
 }
 
 /* Expand to the instructions in a function epilogue.  This function
@@ -399,6 +421,9 @@ bpf_expand_epilogue (void)
   /* See note in bpf_expand_prologue for an explanation on why we are
  not restoring callee-saved registers in BPF.  */
 
+  /* If we ever need to do anything else than just generating a return
+ instruction here, please mind the `naked' function attribute.  */
+
   emit_jump_insn (gen_exit ());
 }
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b363386df6e..f657032cbef 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5172,6 +5172,17 @@ attribute.  Example:
 int bpf_probe_read (void *dst, int size, const void *unsafe_ptr)
   __attribute__ ((kernel_helper (4)));
 @end smallexample
+
+@cindex @code{naked} function attribute, BPF
+@item naked
+This attribute allows the compiler to construct the requisite function
+declaration, while allowing the body of the function to be assembly
+code.  The specified function will not have prologue/epilogue
+sequences generated by the compiler.  Only basic @code{asm} statements
+can safely be included in naked functions (@pxref{Basic Asm}).  While
+using extended @code{asm} or a mixture of basic @code{asm} and C code
+may appear to work, they cannot be depended upon to work reliably and
+are not supported.
 @end table
 
 @node C-SKY Function Attributes
diff --git a/gcc/testsuite/gcc.target/bpf/naked-1.c 
b/gcc/testsuite/gcc.target/bpf/naked-1.c
new file mode 100644
index 000..cbbc4c51697
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/naked-1.c
@@ -0,0 +1,12 @@
+/* Verify that __attribute__((naked)) is accepted and
+   produces a naked function.  Also, the compiler must not
+   warn for the lack of return statement.  */
+/* { dg-do compile } */
+/* { dg-options "-O0 -Wreturn-type" } */
+
+int __attribute__((naked)) foo()
+{
+  __asm__ volatile ("@ naked");
+}
+/* { dg-final { scan-assembler "\t@ naked" } } */
+/* { dg-final { scan-assembler "\texit\n" } } */
-- 
2.30.2



[PATCH V2] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

2023-08-17 Thread Lehua Ding
Hi,

A RISC-V testcase (testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c)
newly fails on the current trunk when medany is the default cmodel.
The reason is that the load of a half floating-point immediate changes from
RTL 1 to RTL 2 when the cmodel changes from medlow to medany. This change lets
insn 7 be combined with the @pred_broadcast pattern (insn 8) in the combine
pass. For SF and DF modes insn 6 and insn 7 are combined, but for HF mode they
are not, and that failed combination leads to insn 7 and insn 8 being combined
instead. The combination fails because the local_pic_loadhf pattern doesn't
exist when only zfhmin (implied by zvfh) is enabled.

Therefore, when zfhmin but not zfh is enabled, the define_insn of
*local_pic_load must also be able to produce the *local_pic_loadhf
pattern, since the zfhmin extension also includes the half floating-point
load/store instructions. So I added an ANYLSF iterator and applied it to the
local_pic_load/store define_insns. I have checked the other uses of ANYF and
believe this is the only place that needs to be corrected. I may have missed
something; please correct me if so. Thanks.

RTL 1:

(insn 6 3 7 2 (set (reg:DI 137)
(high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) 
"/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1
 discrim 3 179 {*movdi_64bit}
 (nil))
(insn 7 6 8 2 (set (reg:HF 136)
(mem/u/c:HF (lo_sum:DI (reg:DI 137)
(symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0  S2 A16])) 
"/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1
 discrim 3 126 {*movhf_hardfloat}
 (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4])
(nil)))

RTL 2:

(insn 6 3 7 2 (set (reg/f:DI 137)
(symbol_ref/u:DI ("*.LC0") [flags 0x82])) 
"/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1
 discrim 3 179 {*movdi_64bit}
 (nil))
(insn 7 6 8 2 (set (reg:HF 136)
(mem/u/c:HF (reg/f:DI 137) [0  S2 A16])) 
"/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1
 discrim 3 126 {*movhf_hardfloat}
 (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4])
(nil)))
(insn 8 7 9 2 (set (reg:V2HF 135)
(if_then_else:V2HF (unspec:V2BI [
(const_vector:V2BI [
(const_int 1 [0x1]) repeated x2
])
(const_int 2 [0x2]) repeated x3
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(vec_duplicate:V2HF (reg:HF 136))
(unspec:V2HF [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) 
"/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":6:1
 discrim 3 1389 {*pred_broadcastv2hf}
 (nil))

Best,
Lehua

gcc/ChangeLog:

* config/riscv/iterators.md (TARGET_HARD_FLOAT || TARGET_ZFINX): New.
* config/riscv/pic.md (*local_pic_load): Change ANYF.
(*local_pic_load): To ANYLSF.
(*local_pic_load_32d): Ditto.
(*local_pic_load_32d): Ditto.
(*local_pic_store): Ditto.
(*local_pic_store): Ditto.
(*local_pic_store_32d): Ditto.
(*local_pic_store_32d): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/_Float16-zfhmin-4.c: New test.
* gcc.target/riscv/_Float16-zhinxmin-4.c: New test.

---
 gcc/config/riscv/iterators.md | 5 +++
 gcc/config/riscv/pic.md   | 34 +--
 .../gcc.target/riscv/_Float16-zfhmin-4.c  | 11 ++
 .../gcc.target/riscv/_Float16-zhinxmin-4.c| 11 ++
 4 files changed, 44 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zfhmin-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-4.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index d374a10810c..500bbc39a6b 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -67,6 +67,11 @@
(DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
(HF "TARGET_ZFH || TARGET_ZHINX")])
 
+;; Iterator for hardware-supported load/store floating-point modes.
+(define_mode_iterator ANYLSF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX")
+ (DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
+ (HF "TARGET_ZFHMIN || TARGET_ZHINXMIN")])
+
 ;; Iterator for floating-point modes that can be loaded into X registers.
 (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
 
diff --git a/gcc/config/riscv/pic.md b/gcc/config/riscv/pic.md
index 9507850455a..da636e31619 100644
-

[PATCH] tree-optimization/111039 - abnormals and bit test merging

2023-08-17 Thread Richard Biener via Gcc-patches
The following guards the bit test merging code in if-combine against
the appearance of SSA names used in abnormal PHIs.
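
For readers unfamiliar with bit test merging, the following is a minimal sketch of the equivalence it relies on for single-bit tests, as I understand the transformation: two separate tests of one bit each can be replaced by one mask and one comparison.  The function names are hypothetical, and the sketch does not model SSA names or abnormal edges; the patch only declines the rewrite when the tested name occurs in an abnormal PHI.

```cpp
#include <cassert>
#include <cstdint>

// Two separate single-bit tests, as written in the source program.
bool
both_bits_separate (std::uint32_t flags, unsigned bit_a, unsigned bit_b)
{
  bool a = (flags & (std::uint32_t{1} << bit_a)) != 0;
  bool b = (flags & (std::uint32_t{1} << bit_b)) != 0;
  return a && b;
}

// The merged form: build one mask and compare once.
bool
both_bits_merged (std::uint32_t flags, unsigned bit_a, unsigned bit_b)
{
  std::uint32_t mask
    = (std::uint32_t{1} << bit_a) | (std::uint32_t{1} << bit_b);
  return (flags & mask) == mask;
}
```

The constants in the new testcase, 1048576 and 8388608, are bits 20 and 23, so its `nowait && isexpand` condition is exactly this shape.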

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/111039
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for
SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

* gcc.dg/pr111039.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr111039.c | 15 +++
 gcc/tree-ssa-ifcombine.cc   |  7 +++
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr111039.c

diff --git a/gcc/testsuite/gcc.dg/pr111039.c b/gcc/testsuite/gcc.dg/pr111039.c
new file mode 100644
index 000..bec9983b35f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111039.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+int _setjmp ();
+void abcd ();
+void abcde ();
+void compiler_corruption_function(int flags)
+{
+  int nowait = flags & 1048576, isexpand = flags & 8388608;
+  abcd();
+  _setjmp(flags);
+  if (nowait && isexpand)
+flags &= 0;
+  abcde();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 58e19c1508e..d5701e8c407 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -430,6 +430,9 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool 
inner_inv,
 {
   tree t, t2;
 
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name1))
+   return false;
+
   /* Do it.  */
   gsi = gsi_for_stmt (inner_cond);
   t = fold_build2 (LSHIFT_EXPR, TREE_TYPE (name1),
@@ -486,6 +489,10 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool 
inner_inv,
   gimple_stmt_iterator gsi;
   tree t;
 
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name1)
+ || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name2))
+   return false;
+
   /* Find the common name which is bit-tested.  */
   if (name1 == name2)
;
-- 
2.35.3


[committed] libgomp: call numa_available first when using libnuma

2023-08-17 Thread Tobias Burnus

Found when looking at the libnuma/libmemkind test issues discussed
in https://gcc.gnu.org/PR111024
[The fails of the PR are not fully understood but point towards a
 buggy libmemkind or a combination of libmemkind + other libraries;
 they only show up recently as before no libgomp testcase checked
 for interleaved allocation - and on newer Linux systems it does
 pass.]


Calling numa_available() just ensures that the get_mempolicy syscall
is available in the Linux kernel, which is likely the case for Linux
distros in use, but still it makes sense to play by the rules and to
check that that's indeed the case - as recommended/demanded by the
(lib)numa manpage. The call is only done once and also only when
libnuma is supposed to get used.
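
A minimal standalone sketch of the same check, assuming a hypothetical helper name (open_libnuma_if_available) that does not exist in libgomp: the library is loaded with dlopen, numa_available is looked up with dlsym, and the handle is dropped unless the call returns 0. Older glibc versions may need -ldl at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: return a libnuma handle
   that is safe to use, or NULL if libnuma is missing or reports that
   NUMA support is unavailable.  */
static void *
open_libnuma_if_available (void)
{
  void *handle = dlopen ("libnuma.so.1", RTLD_LAZY);
  if (handle == NULL)
    return NULL;

  /* numa_available () must be called before any other libnuma function.  */
  int (*numa_available_fn) (void)
    = (int (*) (void)) dlsym (handle, "numa_available");
  if (numa_available_fn == NULL || numa_available_fn () != 0)
    {
      dlclose (handle);
      return NULL;
    }
  return handle;
}
```

A caller would treat a NULL result as "do not use libnuma" and fall back to the default allocator.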

Committed to mainline as Rev. r14-3287-g8f3c4517b1fff9

Tobias

PS: Crossref - some more doc + libmemkind improvements could/should
eventually be done. See https://gcc.gnu.org/PR111044 for details.
commit 8f3c4517b1fff965f2bdedcf376dcfd91cda422b
Author: Tobias Burnus 
Date:   Thu Aug 17 15:20:55 2023 +0200

libgomp: call numa_available first when using libnuma

The documentation requires that numa_available() is called first; only
if it succeeds may other libnuma functions be called. Internally,
it does a syscall to get_mempolicy with flag=0 (which would return
the default policy if mode were not NULL). If this returns -1 (and
not 0) and errno == ENOSYS, the Linux kernel does not have the
get_mempolicy syscall function; if so, numa_available() returns -1
(otherwise: 0).

libgomp/

PR libgomp/111024
* allocator.c (gomp_init_libnuma): Call numa_available; if
not available or not returning 0, disable libnuma usage.
---
 libgomp/allocator.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 90f2dcb60d6..b4e50e2ad72 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -107,28 +107,39 @@ static pthread_once_t libnuma_data_once = PTHREAD_ONCE_INIT;
 
 static void
 gomp_init_libnuma (void)
 {
   void *handle = dlopen ("libnuma.so.1", RTLD_LAZY);
   struct gomp_libnuma_data *data;
 
   data = calloc (1, sizeof (struct gomp_libnuma_data));
   if (data == NULL)
 {
   if (handle)
 	dlclose (handle);
   return;
 }
+  if (handle)
+{
+  int (*numa_available) (void);
+  numa_available
+	= (__typeof (numa_available)) dlsym (handle, "numa_available");
+  if (!numa_available || numa_available () != 0)
+	{
+	  dlclose (handle);
+	  handle = NULL;
+	}
+}
   if (!handle)
 {
   __atomic_store_n (&libnuma_data, data, MEMMODEL_RELEASE);
   return;
 }
   data->numa_handle = handle;
   data->numa_alloc_local
 = (__typeof (data->numa_alloc_local)) dlsym (handle, "numa_alloc_local");
   data->numa_realloc
 = (__typeof (data->numa_realloc)) dlsym (handle, "numa_realloc");
   data->numa_free
 = (__typeof (data->numa_free)) dlsym (handle, "numa_free");
   __atomic_store_n (&libnuma_data, data, MEMMODEL_RELEASE);
 }


Re: RISC-V: Added support for CRC.

2023-08-17 Thread Alexander Monakov


On Wed, 16 Aug 2023, Philipp Tomsich wrote:

> > > I fully expect that latency to drop within the next 12-18 months.  In that
> > > world, there's not going to be much benefit to using hand-coded libraries 
> > > vs
> > > just letting the compiler do it.
> 
> I would also hope that the hand-coded libraries would eventually have
> a code path for compilers that support the built-in.

You seem to be working with the false assumption that the interface of the
proposed builtin matches how high-performance CRC computation is structured.
It is not. State-of-the-art CRC keeps unreduced intermediate residual, split
over multiple temporaries to allow overlapping CLMULs in the CPU. The
intermediate residuals are reduced only once, when the final CRC value is
needed. In contrast, the proposed builtin has data dependencies between
all adjacent instructions, and cannot allow the CPU to work at IPC > 1.0.
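
To make the dependency chain concrete, here is a plain bit-at-a-time CRC-32 sketch. It is neither the proposed builtin nor a CLMUL implementation, and the function name crc32_serial is invented for this illustration. It only shows the shape of the problem: every step needs the fully reduced value of crc from the previous step, so the work cannot overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.  Each byte
   update reads and writes the single reduced residual CRC, which forms
   one long loop-carried dependency.  */
static uint32_t
crc32_serial (const unsigned char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= buf[i];
      for (int bit = 0; bit < 8; bit++)
        /* Shift out one bit; fold in the polynomial if that bit was set.  */
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  return crc ^ 0xFFFFFFFFu;
}
```

For the standard check string "123456789" this returns 0xCBF43926. A folding implementation would instead keep several unreduced partial residuals in separate temporaries and reduce once at the end.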

Shame how little you apparently understand of the "mindbending math".

Alexander


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-17 Thread Qing Zhao via Gcc-patches
Hi, Kees,

Thanks for the test case.
Yes, I noticed this issue too, and have already fixed it in my private branch.

With the latest patch, the compilation has no issue:
[opc@qinzhao-ol8u3-x86 108896]$ sh t
/home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
[opc@qinzhao-ol8u3-x86 108896]$ 

Qing

> On Aug 17, 2023, at 2:38 AM, Kees Cook  wrote:
> 
> On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
>> On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>> 
>> I've been using Coccinelle to find and annotate[1] structures (193 so
>> far...), and I've encountered 2 cases of GCC internal errors. I'm working
>> on a minimized test case, but just in case these details are immediately
>> helpful, here's what I'm seeing:
> 
> Okay, I got it minimized:
> 
> $ cat poc.c
> struct a {
>  unsigned long c;
>  char d[] __attribute__((__counted_by__(c)));
> } *b;
> 
> void f(long);
> 
> void e(void) {
>  long g = __builtin_dynamic_object_size(b->d, 1);
>  f(g);
> }
> $ gcc -O2 -c -o /dev/null poc.c
> poc.c: In function 'e':
> poc.c:8:6: error: incorrect sharing of tree nodes
>8 | void e(void) {
>  |  ^
> *b.0_1
> _2 = &b.0_1->d;
> during GIMPLE pass: objsz
> poc.c:8:6: internal compiler error: verify_gimple failed
> 0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
>../../../../gcc/gcc/tree-cfg.cc:5646
> 0xe84894 execute_function_todo
>../../../../gcc/gcc/passes.cc:2088
> 0xe84dee execute_todo
>../../../../gcc/gcc/passes.cc:2142
> 
> -- 
> Kees Cook



Re: [PATCH] RISC-V: Deduplicate #error messages in testsuite

2023-08-17 Thread Jeff Law via Gcc-patches




On 8/13/23 20:53, Tsukasa OI via Gcc-patches wrote:

From: Tsukasa OI

"#error Feature macro not defined" is required to test the existence of an
extension through the preprocessor.  However, multiple occurrences of the
exact same error message will confuse the developer once an error is
encountered.

This commit replaces such error messages with
"#error Feature macro for `EXT' not defined" to make clear which
macro is missing.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zvkn.c: Deduplicate #error messages.
* gcc.target/riscv/zvkn-1.c: Ditto.
* gcc.target/riscv/zvknc.c: Ditto.
* gcc.target/riscv/zvknc-1.c: Ditto.
* gcc.target/riscv/zvknc-2.c: Ditto.
* gcc.target/riscv/zvkng.c: Ditto.
* gcc.target/riscv/zvkng-1.c: Ditto.
* gcc.target/riscv/zvkng-2.c: Ditto.
* gcc.target/riscv/zvks.c: Ditto.
* gcc.target/riscv/zvks-1.c: Ditto.
* gcc.target/riscv/zvksc.c: Ditto.
* gcc.target/riscv/zvksc-1.c: Ditto.
* gcc.target/riscv/zvksc-2.c: Ditto.
* gcc.target/riscv/zvksg.c: Ditto.
* gcc.target/riscv/zvksg-1.c: Ditto.
* gcc.target/riscv/zvksg-2.c: Ditto.

Pushed to the trunk.  Thanks.
jeff


Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Qing Zhao via Gcc-patches


> On Aug 17, 2023, at 7:00 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-16 11:59, Qing Zhao wrote:
>> Jakub and Sid,
>> During my study, I found an interesting behavior for the following small 
>> testing case:
>> #include 
>> #include 
>> struct fixed {
>>   size_t foo;
>>   char b;
>>   char array[10];
>> } q = {};
>> #define noinline __attribute__((__noinline__))
>> static void noinline bar ()
>> {
>>   struct fixed *p = &q;
>>   printf("the__bos of MAX p->array sub is %d \n", 
>> __builtin_object_size(p->array, 1));
>>   printf("the__bos of MIN p->array sub is %d \n", 
>> __builtin_object_size(p->array, 3));
>>   return;
>> }
>> int main ()
>> {
>>   bar ();
>>   return 0;
>> }
>> [opc@qinzhao-aarch64-ol8 108896]$ sh t
>> /home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
>> the__bos of MAX p->array sub is 10
>> the__bos of MIN p->array sub is 15
>> I assume that the Minimum size in the sub-object should be 10 too (i.e 
>> __builtin_object_size(p->array, 3) should be 10 too).
>> So, first question: Is this correct or wrong behavior for 
>> __builtin_object_size(p->array, 3)?
>> The second question is, when I debugged into why 
>> __builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the 
>> following:
>> 1. In the “early_objsz” phase, the IR for p->array is:
>> (gdb) call debug_generic_expr(ptr)
>> &p_5->array
>> And the pt_var is:
>> (gdb) call debug_generic_expr(pt_var)
>> *p_5
>> As a result, the following condition in tree-object-size.cc:
>>  585   if (pt_var != TREE_OPERAND (ptr, 0))
>> Was satisfied, and then the algorithm for computing the SUBOBJECT was 
>> invoked and the size of the subobject 10 was used.
>> and then a MAX_EXPR was inserted after the __builtin_object_size call as:
>>   _3 = &p_5->array;
>>   _10 = __builtin_object_size (_3, 3);
>>   _4 = MAX_EXPR <_10, 10>;
>> Till now, everything looks fine.
>> 2. Within the “ccp1” phase, when folding the call to __builtin_object_size, the 
>> IR for p->array is:
>> (gdb) call debug_generic_expr(ptr)
>> &MEM  [(void *)&q + 9B]
>> And the pt_var is:
>> (gdb) call debug_generic_expr(pt_var)
>> MEM  [(void *)&q + 9B]
>> As a result, the following condition in tree-object-size.cc:
>>  585   if (pt_var != TREE_OPERAND (ptr, 0))
>> Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was 
>> NOT invoked at all, as a result, the size in the whole object, 15, was used.
>> And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the 
>> final result.
>> Based on the above, is there any issue with the current algorithm?
> 
> So this is a (sort of) known issue, which necessitated the early_objsz pass 
> to get an estimate before a subobject reference was optimized to a MEM_REF.

Do you mean that after a subobject reference was optimized to a MEM_REF, there 
is no way to compute the size of the subobject anymore?

>  However it looks like the MIN/MAX hack doesn't work in this case for 
> OST_MINIMUM; it should probably get the minimum of the two passes if both 
> passes were successful, or only the result of the pass that was successful.

You mean that the following line:
2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : 
MIN_EXPR;
Might need to be changed to:
2053   enum tree_code code =  MIN_EXPR;

?
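
One way to read Sid's suggestion, as a sketch only: combine the two estimates with a minimum, but skip an estimate whose pass failed. The helper name combine_min_estimates and the explicit unknown parameter are assumptions for this illustration and are not code from tree-object-size.cc. For the minimum modes (types 2 and 3) __builtin_object_size reports 0 when the size is unknown.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC code.  EARLY and LATE are the estimates
   from the two passes; UNKNOWN is the value a pass reports when it could
   not determine a size (0 for the minimum modes).  */
static size_t
combine_min_estimates (size_t early, size_t late, size_t unknown)
{
  if (early == unknown)
    return late;   /* Only the late pass succeeded (or neither did).  */
  if (late == unknown)
    return early;  /* Only the early pass succeeded.  */
  return early < late ? early : late;
}
```

With the numbers from the example above (early estimate 10, late estimate 15), this yields 10 rather than the 15 produced by MAX_EXPR.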

thanks.

Qing
> 
> Thanks,
> Sid



[PATCH V4] Add warning options -W[no-]compare-distinct-pointer-types

2023-08-17 Thread Jose E. Marchesi via Gcc-patches
[Changes from V3:
- Previous thread:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600625.html
- The tests have been augmented to check all six relational
  operators.  In particular it covers both code paths impacted
  by the patch: the equality/inequality and the relational ops.]

GCC emits pedwarns unconditionally when comparing pointers of
different types, for example:

  int xdp_context (struct xdp_md *xdp)
{
void *data = (void *)(long)xdp->data;
__u32 *metadata = (void *)(long)xdp->data_meta;
__u32 ret;

if (metadata + 1 > data)
  return 0;
return 1;
   }

  /home/jemarch/foo.c: In function ‘xdp_context’:
  /home/jemarch/foo.c:15:20: warning: comparison of distinct pointer types 
lacks a cast
 15 |   if (metadata + 1 > data)
 |^

LLVM supports an option -W[no-]compare-distinct-pointer-types that can
be used in order to enable or disable the emission of such warnings.
It is enabled by default.

This patch adds the same options to GCC.
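
For code that prefers to keep the warning enabled, the usual fix is the cast that the diagnostic asks for. The sketch below reworks the comparison from the example above with both operands cast to char *; the function name metadata_fits and the use of uint32_t in place of __u32 are choices made only for this illustration.

```c
#include <stdint.h>

/* Returns 1 when the metadata word ends at or before DATA, 0 otherwise.
   Casting both operands to char * gives them the same pointer type, so
   no "comparison of distinct pointer types" diagnostic is issued.  */
static int
metadata_fits (uint32_t *metadata, void *data)
{
  return (char *) (metadata + 1) <= (char *) data;
}
```

Both pointers are expected to point into the same underlying buffer, as they do in the XDP example.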

Documentation and testsuite updates included.
Regtested on x86_64-linux-gnu.
No regressions observed.

gcc/ChangeLog:

PR c/106537
* doc/invoke.texi (Option Summary): Mention
-Wcompare-distinct-pointer-types under `Warning Options'.
(Warning Options): Document -Wcompare-distinct-pointer-types.

gcc/c-family/ChangeLog:

PR c/106537
* c.opt (Wcompare-distinct-pointer-types): New option.

gcc/c/ChangeLog:

PR c/106537
* c-typeck.cc (build_binary_op): Warning on comparing distinct
pointer types only when -Wcompare-distinct-pointer-types.

gcc/testsuite/ChangeLog:

PR c/106537
* gcc.c-torture/compile/pr106537-1.c: New test.
* gcc.c-torture/compile/pr106537-2.c: Likewise.
* gcc.c-torture/compile/pr106537-3.c: Likewise.
---
 gcc/c-family/c.opt|  4 +++
 gcc/c/c-typeck.cc |  6 ++--
 gcc/doc/invoke.texi   |  6 
 .../gcc.c-torture/compile/pr106537-1.c| 34 +++
 .../gcc.c-torture/compile/pr106537-2.c| 32 +
 .../gcc.c-torture/compile/pr106537-3.c| 32 +
 6 files changed, 111 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-3.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index c7b567ba7ab..2242524cd3e 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1935,6 +1935,10 @@ Winvalid-imported-macros
 C++ ObjC++ Var(warn_imported_macros) Warning
 Warn about macros that have conflicting header units definitions.
 
+Wcompare-distinct-pointer-types
+C ObjC Var(warn_compare_distinct_pointer_types) Warning Init(1)
+Warn if pointers of distinct types are compared without a cast.
+
 flang-info-include-translate
 C++ Var(note_include_translate_yes)
 Note #include directives translated to import declarations.
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 6f2fff51683..e6ddf37d412 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -12772,7 +12772,7 @@ build_binary_op (location_t location, enum tree_code 
code,
  else
/* Avoid warning about the volatile ObjC EH puts on decls.  */
if (!objc_ok)
- pedwarn (location, 0,
+ pedwarn (location, OPT_Wcompare_distinct_pointer_types,
   "comparison of distinct pointer types lacks a cast");
 
  if (result_type == NULL_TREE)
@@ -12912,8 +12912,8 @@ build_binary_op (location_t location, enum tree_code 
code,
  int qual = ENCODE_QUAL_ADDR_SPACE (as_common);
  result_type = build_pointer_type
  (build_qualified_type (void_type_node, qual));
- pedwarn (location, 0,
-  "comparison of distinct pointer types lacks a cast");
+  pedwarn (location, OPT_Wcompare_distinct_pointer_types,
+   "comparison of distinct pointer types lacks a cast");
}
}
   else if (code0 == POINTER_TYPE && null_pointer_constant_p (orig_op1))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3380ed8bd6f..28ee6fb62bb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -345,6 +345,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wcast-align  -Wcast-align=strict  -Wcast-function-type  -Wcast-qual
 -Wchar-subscripts
 -Wclobbered  -Wcomment
+-Wcompare-distinct-pointer-types
 -Wno-complain-wrong-lang
 -Wconversion  -Wno-coverage-mismatch  -Wno-cpp
 -Wdangling-else  -Wdangling-pointer  -Wdangling-pointer=@var{n}
@@ -9106,6 +9107,11 @@ The latter front end diagnoses
 @samp{f951: Warning: command-line option '-fno-rtti' is valid for C++/D/ObjC++ 
but not for Fortran},
 which may be disabled wit

Re: [PATCH V2] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
OK, thanks.

Regards
 Robin


Re: [PATCH] Loongarch: Fix plugin header missing install.

2023-08-17 Thread chenglulu

LGTM!

On 2023/8/16 at 9:48 AM, Guo Jie wrote:

gcc/ChangeLog:

* config/loongarch/t-loongarch: Add loongarch-driver.h into
TM_H. Add loongarch-def.h and loongarch-tune.h into
OPTIONS_H_EXTRA.

Co-authored-by: Lulu Cheng 
---
  gcc/config/loongarch/t-loongarch | 4 
  1 file changed, 4 insertions(+)

diff --git a/gcc/config/loongarch/t-loongarch b/gcc/config/loongarch/t-loongarch
index 6d6e3435d59..e73f4f437ef 100644
--- a/gcc/config/loongarch/t-loongarch
+++ b/gcc/config/loongarch/t-loongarch
@@ -16,6 +16,10 @@
  # along with GCC; see the file COPYING3.  If not see
  # .
  
+TM_H += $(srcdir)/config/loongarch/loongarch-driver.h

+OPTIONS_H_EXTRA += $(srcdir)/config/loongarch/loongarch-def.h \
+  $(srcdir)/config/loongarch/loongarch-tune.h
+
  # Canonical target triplet from config.gcc
  LA_MULTIARCH_TRIPLET = $(patsubst LA_MULTIARCH_TRIPLET=%,%,$\
  $(filter LA_MULTIARCH_TRIPLET=%,$(tm_defines)))




Re: [PING] Re: [PATCH v2] Re: [WIP] Have -Wpointer-sign be enabled by -Wextra, too [PR109836]

2023-08-17 Thread Joseph Myers
On Wed, 16 Aug 2023, Eric Gallager via Gcc-patches wrote:

> PING
> 
> On Tue, Aug 8, 2023 at 8:17 PM Eric Gallager  wrote:
> >
> > On Tue, May 30, 2023 at 5:42 PM Eric Gallager  wrote:
> > >
> > > PR109836 is a request to have -Wpointer-sign enabled by default. There
> > > were points of disagreement raised in the bug report, so I figured
> > > that maybe as a compromise, the warning could just be enabled by
> > > -Wextra, as well (I have in fact seen some projects that enable
> > > -Wextra but not -Wall). This patch would implement my suggestion of
> > > adding it to -Wextra, but it's not ready to commit yet, as it still
> > > needs testing, documentation, and a ChangeLog entry. I'm just posting
> > > it here as an RFC; what do people think?

The documentation for -Wextra says "This enables some extra warning flags 
that are not enabled by @option{-Wall}." (and this patch doesn't change 
that documentation).  I don't see any coherent reason for changing that to 
add a single one of the -Wall warnings (but not any of the others).  (I'm 
*not* suggesting making -Wextra a superset of -Wall, but I don't think 
this change is a sensible one.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v3] LoongArch:Implement 128-bit floating point functions in gcc.

2023-08-17 Thread Joseph Myers
On Thu, 17 Aug 2023, Xi Ruoyao via Gcc-patches wrote:

> So I guess we just need
> 
> builtin_define ("__builtin_fabsq=__builtin_fabsf128");
> builtin_define ("__builtin_nanq=__builtin_nanf128");
> 
> etc. to map the "q" builtins to "f128" builtins if we really need the
> "q" builtins.
> 
> Joseph: the problem here is many customers of LoongArch CPUs wish to
> compile their old code with minimal change.  Is it acceptable to add
> these builtin_define's like rs6000-c.cc?  Note "a new architecture" does
> not mean we'll only compile post-C2x-era programs onto it.

The powerpc support for __float128 started in GCC 6, predating the support 
for _FloatN type names, built-in functions etc. in GCC 7 - that's why 
there's such backwards compatibility support there.  That name only exists 
on a few architectures.

If people really want to compile code using the old __float128 names for 
LoongArch I suppose you could have such #defines, but it would be better 
for people to make their code use the standard names (as supported from 
GCC 7 onwards, though only from GCC 13 in C++) and then put backwards 
compatibility in their code for using the __float128 names if they want to 
support the type with older GCC (GCC 6 or before for C; GCC 12 or before 
for C++) on x86_64 / i386 / powerpc / ia64.  Such backwards compatibility 
in user code is more likely to be relevant for C++ than for C, given how 
the C++ support was added to GCC much more recently.  (Note: I haven't 
checked when other compilers added support for the _Float128 name or 
associated built-in functions, whether for C or for C++, which might also 
affect when user code wants such compatibility.)
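
A user-side compatibility shim along the lines described above might look like the following sketch. The names compat_f128, COMPAT_FABS128 and compat_abs128 are invented here, the version test follows the GCC 7 boundary mentioned above for C, and the middle branch assumes an x86-style target where the "q" builtins exist. The last branch is only a placeholder so that the sketch builds with other compilers.

```c
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7 \
    && defined(__FLT128_MAX__)
/* GCC 7 or later, C: use the standard type and builtin names.  */
typedef _Float128 compat_f128;
#define COMPAT_FABS128(x) __builtin_fabsf128 (x)
#elif defined(__GNUC__) && !defined(__clang__) && defined(__SIZEOF_FLOAT128__)
/* Older GCC on a target that provides the legacy __float128 names.  */
typedef __float128 compat_f128;
#define COMPAT_FABS128(x) __builtin_fabsq (x)
#else
/* Placeholder fallback: no 128-bit type known, use long double.  */
typedef long double compat_f128;
#define COMPAT_FABS128(x) ((x) < 0 ? -(x) : (x))
#endif

/* The rest of the program uses only the compat_* names.  */
static compat_f128
compat_abs128 (compat_f128 x)
{
  return COMPAT_FABS128 (x);
}
```

Keeping the mapping in user code this way means the compiler does not have to carry the legacy names on a new target.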

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH V4] Add warning options -W[no-]compare-distinct-pointer-types

2023-08-17 Thread Joseph Myers
On Thu, 17 Aug 2023, Jose E. Marchesi via Gcc-patches wrote:

> +@opindex Wcompare-distinct-pointer-types
> +@item -Wcompare-distinct-pointer-types

This @item should say @r{(C and Objective-C only)}, since the option isn't 
implemented for C++.  OK with that change.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread Guillaume Gomez via Gcc-patches
Antoni spotted a typo I made:

I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` instead of
`LIBGCCJIT_HAVE_gcc_jit_type_get_restrict`. Fixed in this patch, sorry
for the noise.
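
For readers unfamiliar with what the qualifier buys, here is a small C sketch in the spirit of the C test quoted below; the function names are made up for this illustration. Without the qualifier the compiler must assume that a store through a may change *c and reload it; with __restrict__ it may load *c once.

```c
/* No aliasing promise: *c has to be re-read after the store to *a.  */
static void
add_twice_may_alias (int *a, int *b, const int *c)
{
  *a += *c;
  *b += *c;
}

/* The caller promises that a, b and c do not alias, so the compiler
   is free to load *c a single time.  */
static void
add_twice_restrict (int *__restrict__ a, int *__restrict__ b,
                    const int *__restrict__ c)
{
  *a += *c;
  *b += *c;
}
```

Calling add_twice_may_alias with c pointing at the same object as a is well defined and the second addition sees the updated value; making the same call to add_twice_restrict would be undefined behavior.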

On Thu, 17 Aug 2023 at 11:30, Guillaume Gomez wrote:
>
> Hi Dave,
>
> > What kind of testing has the patch had? (e.g. did you run "make check-
> > jit" ?  Has this been in use on real Rust code?)
>
> I tested it as Rust backend directly on this code:
>
> ```
> pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
> *a += *c;
> *b += *c;
> }
> ```
>
> I ran it with `rustc` (and the GCC backend) with the following flags:
> `-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the diff
> you can see in the attached file. Explanations: the diff on the right
> has the `__restrict__` attribute used whereas on the left it is the
> current version where we don't handle it.
>
> As for C testing, I used this code:
>
> ```
> void t(int *__restrict__ a, int *__restrict__ b, char *__restrict__ c) {
> *a += *c;
> *b += *c;
> }
> ```
>
> (without the `__restrict__` of course when I need to have a witness
> ASM). I attached the diff as well, this time the file with the use of
> `__restrict__` is on the left. I compiled with the following flags:
> `-S -O3`.
>
> > Please add a feature macro:
> > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > (see the similar ones in the header).
>
> I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
> documentation as well to mention the ABI change.
>
> > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than adding this
> > to ABI_0.
>
> I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last one.
>
> > This refers to a "cold attribute"; is this a vestige of a copy-and-
> > paste from a different test case?
>
> It is a vestige indeed... Missed this one.
>
> > I see that the test scans the generated assembler.  Does the test
> > actually verify that restrict has an effect, or was that another
> > vestige from a different test case?
>
> No, this time it's what I wanted. Please see the C diff I provided
> above to see that the ASM has a small diff that allowed me to confirm
> that the `__restrict__` attribute was correctly set.
>
> > If this test is meant to run at -O3 and thus can't be part of test-
> > combination.c, please add a comment about it to
> > gcc/testsuite/jit.dg/all-non-failing-tests.h (in the alphabetical
> > place).
>
> Below `-O3`, this ASM difference doesn't appear unfortunately.
>
> > The patch also needs to add documentation for the new entrypoint (in
> > topics/types.rst), and for the new ABI tag (in
> > topics/compatibility.rst).
>
> Added!
>
> > Thanks again for the patch; hope the above is constructive
>
> It was incredibly useful! Thanks for taking the time to write down the
> explanations.
>
> The new patch is attached to this email.
>
> Cordially.
>
> On Thu, 17 Aug 2023 at 01:06, David Malcolm wrote:
> >
> > On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit wrote:
> > > My apologies, forgot to run the commit checkers. Here's the commit
> > > with the errors fixed.
> > >
> > > On Wed, 16 Aug 2023 at 18:32, Guillaume Gomez wrote:
> > > >
> > > > Hi,
> >
> > Hi Guillaume, thanks for the patch.
> >
> > > >
> > > > This patch adds the possibility to specify the __restrict__
> > > > attribute
> > > > for function parameters. It is used by the Rust GCC backend.
> >
> > What kind of testing has the patch had? (e.g. did you run "make check-
> > jit" ?  Has this been in use on real Rust code?)
> >
> > Overall, this patch looks close to being ready, but some nits below...
> >
> > [...]
> >
> > > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> > > index 60eaf39bff6..2e0d08a06d8 100644
> > > --- a/gcc/jit/libgccjit.h
> > > +++ b/gcc/jit/libgccjit.h
> > > @@ -635,6 +635,10 @@ gcc_jit_type_get_const (gcc_jit_type *type);
> > >  extern gcc_jit_type *
> > >  gcc_jit_type_get_volatile (gcc_jit_type *type);
> > >
> > > +/* Given type "T", get type "restrict T".  */
> > > +extern gcc_jit_type *
> > > +gcc_jit_type_get_restrict (gcc_jit_type *type);
> > > +
> > >  #define LIBGCCJIT_HAVE_SIZED_INTEGERS
> > >
> > >  /* Given types LTYPE and RTYPE, return non-zero if they are
> > compatible.
> >
> > Please add a feature macro:
> > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > (see the similar ones in the header).
> >
> > > diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> > > index e52de0057a5..b7289b13845 100644
> > > --- a/gcc/jit/libgccjit.map
> > > +++ b/gcc/jit/libgccjit.map
> > > @@ -104,6 +104,7 @@ LIBGCCJIT_ABI_0
> > >  gcc_jit_type_as_object;
> > >  gcc_jit_type_get_const;
> > >  gcc_jit_type_get_pointer;
> > > +gcc_jit_type_get_restrict;
> > >  gcc_jit_type_get_volatile;
> >
> > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than adding this
> > to ABI_0.
> >
> > > diff --git a/gcc/testsuite/jit.dg/test-restrict.c
> > b/gcc/testsuite/jit.dg/test-restrict.c
> > > new file mode 100644
> >

Re: [PATCH V4] Add warning options -W[no-]compare-distinct-pointer-types

2023-08-17 Thread Jose E. Marchesi via Gcc-patches


> On Thu, 17 Aug 2023, Jose E. Marchesi via Gcc-patches wrote:
>
>> +@opindex Wcompare-distinct-pointer-types
>> +@item -Wcompare-distinct-pointer-types
>
> This @item should say @r{(C and Objective-C only)}, since the option isn't 
> implemented for C++.  OK with that change.

Pushed with that change.
Thanks for the prompt review!


Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread Guillaume Gomez via Gcc-patches
And now I just discovered that a lot of commits from Antoni's fork
haven't been sent upstream which is why the ABI count is so high in
his repository. Fixed that as well.

On Thu, 17 Aug 2023 at 17:26, Guillaume Gomez wrote:
>
> Antoni spotted a typo I made:
>
> I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` instead of
> `LIBGCCJIT_HAVE_gcc_jit_type_get_restrict`. Fixed in this patch, sorry
> for the noise.
>
> On Thu, 17 Aug 2023 at 11:30, Guillaume Gomez wrote:
> >
> > Hi Dave,
> >
> > > What kind of testing has the patch had? (e.g. did you run "make check-
> > > jit" ?  Has this been in use on real Rust code?)
> >
> > I tested it as Rust backend directly on this code:
> >
> > ```
> > pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
> > *a += *c;
> > *b += *c;
> > }
> > ```
> >
> > I ran it with `rustc` (and the GCC backend) with the following flags:
> > `-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the diff
> > you can see in the attached file. Explanations: the diff on the right
> > has the `__restrict__` attribute used whereas on the left it is the
> > current version where we don't handle it.
> >
> > As for C testing, I used this code:
> >
> > ```
> > void t(int *__restrict__ a, int *__restrict__ b, char *__restrict__ c) {
> > *a += *c;
> > *b += *c;
> > }
> > ```
> >
> > (without the `__restrict__` of course when I need to have a witness
> > ASM). I attached the diff as well, this time the file with the use of
> > `__restrict__` is on the left. I compiled with the following flags:
> > `-S -O3`.
> >
> > > Please add a feature macro:
> > > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > > (see the similar ones in the header).
> >
> > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
> > documentation as well to mention the ABI change.
> >
> > > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than adding this
> > > to ABI_0.
> >
> > I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last one.
> >
> > > This refers to a "cold attribute"; is this a vestige of a copy-and-
> > > paste from a different test case?
> >
> > It is a vestige indeed... Missed this one.
> >
> > > I see that the test scans the generated assembler.  Does the test
> > > actually verify that restrict has an effect, or was that another
> > > vestige from a different test case?
> >
> > No, this time it's what I wanted. Please see the C diff I provided
> > above to see that the ASM has a small diff that allowed me to confirm
> > that the `__restrict__` attribute was correctly set.
> >
> > > If this test is meant to run at -O3 and thus can't be part of test-
> > > combination.c, please add a comment about it to
> > > gcc/testsuite/jit.dg/all-non-failing-tests.h (in the alphabetical
> > > place).
> >
> > Below `-O3`, this ASM difference doesn't appear unfortunately.
> >
> > > The patch also needs to add documentation for the new entrypoint (in
> > > topics/types.rst), and for the new ABI tag (in
> > > topics/compatibility.rst).
> >
> > Added!
> >
> > > Thanks again for the patch; hope the above is constructive
> >
> > It was incredibly useful! Thanks for taking the time to write down the
> > explanations.
> >
> > The new patch is attached to this email.
> >
> > Cordially.
> >
> > On Thu, 17 Aug 2023 at 01:06, David Malcolm wrote:
> > >
> > > On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit wrote:
> > > > My apologies, forgot to run the commit checkers. Here's the commit
> > > > with the errors fixed.
> > > >
> > > > On Wed, 16 Aug 2023 at 18:32, Guillaume Gomez wrote:
> > > > >
> > > > > Hi,
> > >
> > > Hi Guillaume, thanks for the patch.
> > >
> > > > >
> > > > > This patch adds the possibility to specify the __restrict__
> > > > > attribute
> > > > > for function parameters. It is used by the Rust GCC backend.
> > >
> > > What kind of testing has the patch had? (e.g. did you run "make check-
> > > jit" ?  Has this been in use on real Rust code?)
> > >
> > > Overall, this patch looks close to being ready, but some nits below...
> > >
> > > [...]
> > >
> > > > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> > > > index 60eaf39bff6..2e0d08a06d8 100644
> > > > --- a/gcc/jit/libgccjit.h
> > > > +++ b/gcc/jit/libgccjit.h
> > > > @@ -635,6 +635,10 @@ gcc_jit_type_get_const (gcc_jit_type *type);
> > > >  extern gcc_jit_type *
> > > >  gcc_jit_type_get_volatile (gcc_jit_type *type);
> > > >
> > > > +/* Given type "T", get type "restrict T".  */
> > > > +extern gcc_jit_type *
> > > > +gcc_jit_type_get_restrict (gcc_jit_type *type);
> > > > +
> > > >  #define LIBGCCJIT_HAVE_SIZED_INTEGERS
> > > >
> > > >  /* Given types LTYPE and RTYPE, return non-zero if they are
> > > compatible.
> > >
> > > Please add a feature macro:
> > > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > > (see the similar ones in the header).
> > >
> > > > diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> > > > index e52de0057a5..b7289b13845 100644
> > > > --- a/

Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread David Malcolm via Gcc-patches
On Thu, 2023-08-17 at 17:41 +0200, Guillaume Gomez wrote:
> And now I just discovered that a lot of commits from Antoni's fork
> haven't been sent upstream which is why the ABI count is so high in
> his repository. Fixed that as well.

Thanks for the updated patch; I was about to comment on that.

This version is good for gcc trunk.

Dave

> 
> On Thu, 17 Aug 2023 at 17:26, Guillaume Gomez wrote:
> > 
> > Antoni spotted a typo I made:
> > 
> > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` instead of
> > `LIBGCCJIT_HAVE_gcc_jit_type_get_restrict`. Fixed in this patch,
> > sorry
> > for the noise.
> > 
> > On Thu, 17 Aug 2023 at 11:30, Guillaume Gomez wrote:
> > > 
> > > Hi Dave,
> > > 
> > > > What kind of testing has the patch had? (e.g. did you run "make
> > > > check-
> > > > jit" ?  Has this been in use on real Rust code?)
> > > 
> > > I tested it as Rust backend directly on this code:
> > > 
> > > ```
> > > pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
> > >     *a += *c;
> > >     *b += *c;
> > > }
> > > ```
> > > 
> > > I ran it with `rustc` (and the GCC backend) with the following
> > > flags:
> > > `-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the
> > > diff
> > > you can see in the attached file. Explanations: the diff on the
> > > right
> > > has the `__restrict__` attribute used whereas on the left it is
> > > the
> > > current version where we don't handle it.
> > > 
> > > As for C testing, I used this code:
> > > 
> > > ```
> > > void t(int *__restrict__ a, int *__restrict__ b, char
> > > *__restrict__ c) {
> > >     *a += *c;
> > >     *b += *c;
> > > }
> > > ```
> > > 
> > > (without the `__restrict__` of course when I need to have a
> > > witness
> > > ASM). I attached the diff as well, this time the file with the
> > > use of
> > > `__restrict__` is on the left. I compiled with the following
> > > flags:
> > > `-S -O3`.
> > > 
> > > > Please add a feature macro:
> > > > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > > > (see the similar ones in the header).
> > > 
> > > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
> > > documentation as well to mention the ABI change.
> > > 
> > > > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than
> > > > adding this
> > > > to ABI_0.
> > > 
> > > I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last
> > > one.
> > > 
> > > > This refers to a "cold attribute"; is this a vestige of a copy-
> > > > and-
> > > > paste from a different test case?
> > > 
> > > It is a vestige indeed... Missed this one.
> > > 
> > > > I see that the test scans the generated assembler.  Does the
> > > > test
> > > > actually verify that restrict has an effect, or was that
> > > > another
> > > > vestige from a different test case?
> > > 
> > > No, this time it's what I wanted. Please see the C diff I
> > > provided
> > > above to see that the ASM has a small diff that allowed me to
> > > confirm
> > > that the `__restrict__` attribute was correctly set.
> > > 
> > > > If this test is meant to run at -O3 and thus can't be part of
> > > > test-
> > > > combination.c, please add a comment about it to
> > > > gcc/testsuite/jit.dg/all-non-failing-tests.h (in the
> > > > alphabetical
> > > > place).
> > > 
> > > Below `-O3`, this ASM difference doesn't appear unfortunately.
> > > 
> > > > The patch also needs to add documentation for the new
> > > > entrypoint (in
> > > > topics/types.rst), and for the new ABI tag (in
> > > > topics/compatibility.rst).
> > > 
> > > Added!
> > > 
> > > > Thanks again for the patch; hope the above is constructive
> > > 
> > > It was incredibly useful! Thanks for taking the time to write down
> > > the
> > > explanations.
> > > 
> > > The new patch is attached to this email.
> > > 
> > > Cordially.
> > > 
> > > On Thu, 17 Aug 2023 at 01:06, David Malcolm
> > > wrote:
> > > > 
> > > > On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit
> > > > wrote:
> > > > > My apologies, forgot to run the commit checkers. Here's the
> > > > > commit
> > > > > with the errors fixed.
> > > > > 
> > > > > On Wed, 16 Aug 2023 at 18:32, Guillaume Gomez
> > > > >  wrote:
> > > > > > 
> > > > > > Hi,
> > > > 
> > > > Hi Guillaume, thanks for the patch.
> > > > 
> > > > > > 
> > > > > > This patch adds the possibility to specify the __restrict__
> > > > > > attribute
> > > > > > for function parameters. It is used by the Rust GCC
> > > > > > backend.
> > > > 
> > > > What kind of testing has the patch had? (e.g. did you run "make
> > > > check-
> > > > jit" ?  Has this been in use on real Rust code?)
> > > > 
> > > > Overall, this patch looks close to being ready, but some nits
> > > > below...
> > > > 
> > > > [...]
> > > > 
> > > > > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> > > > > index 60eaf39bff6..2e0d08a06d8 100644
> > > > > --- a/gcc/jit/libgccjit.h
> > > > > +++ b/gcc/jit/libgccjit.h
> > > > > @@ -635,6 +635,10 @@ gcc_jit_type_get_const (gcc_jit_type
> > > > > *type);
> >

Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread Guillaume Gomez via Gcc-patches
Thanks for the review!

On Thu, 17 Aug 2023 at 17:50, David Malcolm wrote:
>
> On Thu, 2023-08-17 at 17:41 +0200, Guillaume Gomez wrote:
> > And now I just discovered that a lot of commits from Antoni's fork
> > haven't been sent upstream which is why the ABI count is so high in
> > his repository. Fixed that as well.
>
> Thanks for the updated patch; I was about to comment on that.
>
> This version is good for gcc trunk.
>
> Dave
>
> >
> > On Thu, 17 Aug 2023 at 17:26, Guillaume Gomez
> >  wrote:
> > >
> > > Antoni spotted a typo I made:
> > >
> > > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` instead of
> > > `LIBGCCJIT_HAVE_gcc_jit_type_get_restrict`. Fixed in this patch,
> > > sorry
> > > for the noise.
> > >
> > > On Thu, 17 Aug 2023 at 11:30, Guillaume Gomez
> > >  wrote:
> > > >
> > > > Hi Dave,
> > > >
> > > > > What kind of testing has the patch had? (e.g. did you run "make
> > > > > check-
> > > > > jit" ?  Has this been in use on real Rust code?)
> > > >
> > > > I tested it as Rust backend directly on this code:
> > > >
> > > > ```
> > > > pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
> > > > *a += *c;
> > > > *b += *c;
> > > > }
> > > > ```
> > > >
> > > > I ran it with `rustc` (and the GCC backend) with the following
> > > > flags:
> > > > `-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the
> > > > diff
> > > > you can see in the attached file. Explanations: the diff on the
> > > > right
> > > > has the `__restrict__` attribute used whereas on the left it is
> > > > the
> > > > current version where we don't handle it.
> > > >
> > > > As for C testing, I used this code:
> > > >
> > > > ```
> > > > void t(int *__restrict__ a, int *__restrict__ b, char
> > > > *__restrict__ c) {
> > > > *a += *c;
> > > > *b += *c;
> > > > }
> > > > ```
> > > >
> > > > (without the `__restrict__` of course when I need to have a
> > > > witness
> > > > ASM). I attached the diff as well, this time the file with the
> > > > use of
> > > > `__restrict__` is on the left. I compiled with the following
> > > > flags:
> > > > `-S -O3`.
> > > >
> > > > > Please add a feature macro:
> > > > > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > > > > (see the similar ones in the header).
> > > >
> > > > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
> > > > documentation as well to mention the ABI change.
> > > >
> > > > > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than
> > > > > adding this
> > > > > to ABI_0.
> > > >
> > > > I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last
> > > > one.
> > > >
> > > > > This refers to a "cold attribute"; is this a vestige of a copy-
> > > > > and-
> > > > > paste from a different test case?
> > > >
> > > > It is a vestige indeed... Missed this one.
> > > >
> > > > > I see that the test scans the generated assembler.  Does the
> > > > > test
> > > > > actually verify that restrict has an effect, or was that
> > > > > another
> > > > > vestige from a different test case?
> > > >
> > > > No, this time it's what I wanted. Please see the C diff I
> > > > provided
> > > > above to see that the ASM has a small diff that allowed me to
> > > > confirm
> > > > that the `__restrict__` attribute was correctly set.
> > > >
> > > > > If this test is meant to run at -O3 and thus can't be part of
> > > > > test-
> > > > > combination.c, please add a comment about it to
> > > > > gcc/testsuite/jit.dg/all-non-failing-tests.h (in the
> > > > > alphabetical
> > > > > place).
> > > >
> > > > Below `-O3`, this ASM difference doesn't appear unfortunately.
> > > >
> > > > > The patch also needs to add documentation for the new
> > > > > entrypoint (in
> > > > > topics/types.rst), and for the new ABI tag (in
> > > > > topics/compatibility.rst).
> > > >
> > > > Added!
> > > >
> > > > > Thanks again for the patch; hope the above is constructive
> > > >
> > > > It was incredibly useful! Thanks for taking the time to write down
> > > > the
> > > > explanations.
> > > >
> > > > The new patch is attached to this email.
> > > >
> > > > Cordially.
> > > >
> > > > On Thu, 17 Aug 2023 at 01:06, David Malcolm
> > > > wrote:
> > > > >
> > > > > On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit
> > > > > wrote:
> > > > > > My apologies, forgot to run the commit checkers. Here's the
> > > > > > commit
> > > > > > with the errors fixed.
> > > > > >
> > > > > > On Wed, 16 Aug 2023 at 18:32, Guillaume Gomez
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hi,
> > > > >
> > > > > Hi Guillaume, thanks for the patch.
> > > > >
> > > > > > >
> > > > > > > This patch adds the possibility to specify the __restrict__
> > > > > > > attribute
> > > > > > > for function parameters. It is used by the Rust GCC
> > > > > > > backend.
> > > > >
> > > > > What kind of testing has the patch had? (e.g. did you run "make
> > > > > check-
> > > > > jit" ?  Has this been in use on real Rust code?)
> > > > >
> > > > > Overall, this patch looks close to bein

[pushed][LRA]: When assigning stack slots to pseudos previously assigned to fp consider other spilled pseudos

2023-08-17 Thread Vladimir Makarov via Gcc-patches
The following patch fixes a problem with allocating the same stack slots 
to conflicting pseudos.  The problem exists only for the AVR LRA port.


The patch was successfully bootstrapped and tested on x86-64 and aarch64.

commit c024867d1aa9d465e0236fc9d45d8e1d4bb6bd30
Author: Vladimir N. Makarov 
Date:   Thu Aug 17 11:57:45 2023 -0400

[LRA]: When assigning stack slots to pseudos previously assigned to fp 
consider other spilled pseudos

The previous LRA patch can assign slot of conflicting pseudos to
pseudos spilled after prohibiting fp->sp elimination.  This patch
fixes this problem.

gcc/ChangeLog:

* lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Moving
slots_num initialization from here ...
(lra_spill): ... to here before the 1st call of
assign_stack_slot_num_and_sort_pseudos.  Add the 2nd call after
fp->sp elimination.

diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index 7e1d35b5e4e..a663a1931e3 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -363,7 +363,6 @@ assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, 
int n)
 {
   int i, j, regno;
 
-  slots_num = 0;
   /* Assign stack slot numbers to spilled pseudos, use smaller numbers
  for most frequently used pseudos. */
   for (i = 0; i < n; i++)
@@ -628,6 +627,7 @@ lra_spill (void)
   /* Sort regnos according their usage frequencies.  */
   qsort (pseudo_regnos, n, sizeof (int), regno_freq_compare);
   n = assign_spill_hard_regs (pseudo_regnos, n);
+  slots_num = 0;
   assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
   for (i = 0; i < n; i++)
 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
@@ -635,6 +635,7 @@ lra_spill (void)
   if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
 {
   /* Assign stack slots to spilled pseudos assigned to fp.  */
+  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
   for (i = 0; i < n2; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
  assign_mem_slot (pseudo_regnos[i]);
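
As a rough illustration of why the `slots_num` initialization has to move out of the function once it is called a second time, here is a toy model. It is only a sketch: the names mirror lra-spills.cc, but the data structures, the `toy_` functions and the "every pseudo gets a fresh slot" policy are simplifications invented for this example, not the real LRA logic.

```c
#include <assert.h>

#define MAX_PSEUDOS 8

/* Toy stand-ins for the LRA state; hypothetical and much simpler than
   the real structures in lra-spills.cc.  */
int slots_num;
int slot_of[MAX_PSEUDOS];

/* Give every pseudo in REGNOS a fresh slot number.  RESET_INSIDE models
   a version where the counter is cleared on entry to the function.  */
void
toy_assign_slot_nums (const int *regnos, int n, int reset_inside)
{
  if (reset_inside)
    slots_num = 0;
  for (int i = 0; i < n; i++)
    slot_of[regnos[i]] = slots_num++;
}

/* Make two calls, as lra_spill does after the patch (the main batch,
   then the pseudos spilled from fp), and return 1 if the pseudo of the
   second batch shares a slot number with one of the first.  */
int
toy_second_batch_collides (int reset_inside)
{
  const int first[] = { 1, 2, 3 };
  const int second[] = { 4 };

  slots_num = 0;   /* the initialization the patch moves into lra_spill */
  toy_assign_slot_nums (first, 3, reset_inside);
  toy_assign_slot_nums (second, 1, reset_inside);

  for (int i = 0; i < 3; i++)
    if (slot_of[first[i]] == slot_of[second[0]])
      return 1;
  return 0;
}
```

In this model, clearing the counter inside the function makes the second batch start again at slot 0 and collide with the first batch, while clearing it once in the caller keeps the numbering going.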


Re: [pushed][LRA]: Spill pseudos assigned to fp when fp->sp elimination became impossible

2023-08-17 Thread Vladimir Makarov via Gcc-patches



On 8/17/23 07:19, senthilkumar.selva...@microchip.com wrote:

On Wed, 2023-08-16 at 12:13 -0400, Vladimir Makarov wrote:


The attached patch fixes a recently found wrong insn removal in the LRA port
for AVR.

The patch was successfully tested and bootstrapped on x86-64 and aarch64.



Hi Vladimir,

   Thanks for working on this. After applying the patch, I'm seeing that the
   pseudo in the frame pointer that got spilled is taking up the same stack
   slot that was already assigned to a spilled pseudo, and that is causing
   execution failure (it is also causing a crash when building libgcc for avr)

...
   I tried a hacky workaround (see patch below) to create a new stack slot and
   assign the spilled pseudo to it, and that works.
   
   Not sure if that's the right way to do it though.


The general approach is right, but I've just committed a different
version of the patch.





Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-17 Thread Kees Cook via Gcc-patches
On Thu, Aug 17, 2023 at 01:44:42PM +, Qing Zhao wrote:
> Thanks for the testing case. 
> Yes, I noticed this issue too, and already fixed it in my private branch. 
> 
> With the latest patch, the compilation has no issue:
> [opc@qinzhao-ol8u3-x86 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
> [opc@qinzhao-ol8u3-x86 108896]$ 

Great! Thanks. I look forward to v3. For now I'll leave off these 2
annotations in my kernel builds. :)

-Kees

-- 
Kees Cook


Re: [PATCH V2] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

2023-08-17 Thread Robin Dapp via Gcc-patches
Indeed all ANYLSF patterns have TARGET_HARD_FLOAT (==f extension) which
is incompatible with ZHINX or ZHINXMIN anyway.  That should really be fixed
separately or at least clarified, maybe I'm missing something.

Still we can go forward with the patch itself as it improves things
independently, so LGTM.

Regards
 Robin


Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Biener via Gcc-patches



> On 17.08.2023 at 13:25, Richard Sandiford via Gcc-patches
> wrote:
> 
> Joseph Myers  writes:
>>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>>> 
>>> Would it be OK to add support for:
>>> 
>>>  [[__extension__ ...]]
>>> 
>>> to suppress the pedwarn about using [[]] prior to C2X?  Then we can
>> 
>> That seems like a plausible feature to add.
> 
> Thanks.  Of course, once I actually tried it, I hit a snag:
> :: isn't a single lexing token prior to C2X, and so something like:
> 
>  [[__extension__ arm::streaming]]
> 
> would not be interpreted as a scoped attribute in C11.  The patch
> gets around that by allowing two colons in place of :: when
> __extension__ is used.  I realise that's pushing the bounds of
> acceptability though...
> 
> I wondered about trying to require the two colons to be immediately
> adjacent.  But:
> 
> (a) There didn't appear to be an existing API to check that, which seemed
>like a red flag.  The closest I could find was get_source_text_between.

ISTR a cpp token has ->prev_white or so

>Similarly to that, it would in principle be possible to compare
>two expanded locations.  But...
> 
> (b) I had a vague impression that locations were allowed to drop column
>information for very large inputs (maybe I'm wrong).
> 
> (c) It wouldn't cope with token pasting.
> 
> So in the end I just used a simple two-token test, like for [[ and ]].
> Bootstrapped & regression-tested on aarch64-linux-gnu.
> 
> Richard
> 
> 
> 
> [[]] attributes are a recent addition to C, but as a GNU extension,
> GCC allows them to be used in C11 and earlier.  Normally this use
> would trigger a pedwarn (for -pedantic, -Wc11-c2x-compat, etc.).
> 
> This patch allows the pedwarn to be suppressed by starting the
> attribute-list with __extension__.
> 
> Also, :: is not a single lexing token prior to C2X, so it wasn't
> possible to use scoped attributes in C11, even as a GNU extension.
> The patch allows two colons to be used in place of :: when
> __extension__ is used.  No attempt is made to check whether the
> two colons are immediately adjacent.
> 
> gcc/
>* doc/extend.texi: Document the C [[__extension__ ...]] construct.
> 
> gcc/c/
>* c-parser.cc (c_parser_std_attribute): Conditionally allow
>two colons to be used in place of ::.
>(c_parser_std_attribute_list): New function, split out from...
>(c_parser_std_attribute_specifier): ...here.  Allow the attribute-list
>to start with __extension__.  When it does, also allow two colons
>to be used in place of ::.
> 
> gcc/testsuite/
>* gcc.dg/c2x-attr-syntax-6.c: New test.
>* gcc.dg/c2x-attr-syntax-7.c: Likewise.
> ---
> gcc/c/c-parser.cc| 68 ++--
> gcc/doc/extend.texi  | 27 --
> gcc/testsuite/gcc.dg/c2x-attr-syntax-6.c | 50 +
> gcc/testsuite/gcc.dg/c2x-attr-syntax-7.c | 48 +
> 4 files changed, 173 insertions(+), 20 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/c2x-attr-syntax-6.c
> create mode 100644 gcc/testsuite/gcc.dg/c2x-attr-syntax-7.c
> 
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index 33fe7b115ff..82e56b28446 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -5390,10 +5390,18 @@ c_parser_balanced_token_sequence (c_parser *parser)
>  ( balanced-token-sequence[opt] )
> 
>Keywords are accepted as identifiers for this purpose.
> -*/
> +
> +   As an extension, we permit an attribute-specifier to be:
> +
> + [ [ __extension__ attribute-list ] ]
> +
> +   Two colons are then accepted as a synonym for ::.  No attempt is made
> +   to check whether the colons are immediately adjacent.  LOOSE_SCOPE_P
> +   indicates whether this relaxation is in effect.  */
> 
> static tree
> -c_parser_std_attribute (c_parser *parser, bool for_tm)
> +c_parser_std_attribute (c_parser *parser, bool for_tm,
> +bool loose_scope_p = false)
> {
>   c_token *token = c_parser_peek_token (parser);
>   tree ns, name, attribute;
> @@ -5406,9 +5414,18 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
> }
>   name = canonicalize_attr_name (token->value);
>   c_parser_consume_token (parser);
> -  if (c_parser_next_token_is (parser, CPP_SCOPE))
> +  if (c_parser_next_token_is (parser, CPP_SCOPE)
> +  || (loose_scope_p
> +  && c_parser_next_token_is (parser, CPP_COLON)
> +  && c_parser_peek_token (parser)->type == CPP_COLON))
> {
>   ns = name;
> +  if (c_parser_next_token_is (parser, CPP_COLON))
> +{
> +  c_parser_consume_token (parser);
> +  if (!c_parser_next_token_is (parser, CPP_COLON))
> +gcc_unreachable ();
> +}
>   c_parser_consume_token (parser);
>   token = c_parser_peek_token (parser);
>   if (token->type != CPP_NAME && token->type != CPP_KEYWORD)
> @@ -5481,19 +5498,9 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
> }
> 
> static tree
> -c_parser_std_attribute_specifie
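
To make the proposed syntax concrete, here is a small usage sketch. It is not taken from the patch or its tests. The macro `TOY_NOINLINE` and the function `toy_add_one` are invented for illustration, and the version guard assumes the feature is available from GCC 14 onwards (the patch was posted during GCC 14 development); adjust it if that assumption does not hold for your compiler.

```c
#include <assert.h>

/* Assumption: the [[__extension__ ...]] form is accepted by GCC 14 and
   later; every other compiler takes the traditional GNU spelling.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 14
# define TOY_NOINLINE [[__extension__ gnu::noinline]]
#else
# define TOY_NOINLINE __attribute__ ((noinline))
#endif

/* With the first spelling this is meant to compile in C11 mode without
   a pedantic warning, and "gnu::noinline" is read as a scoped attribute
   even though :: is lexed as two colons before C2X.  */
TOY_NOINLINE int
toy_add_one (int x)
{
  return x + 1;
}
```

The fallback branch keeps the sketch buildable on compilers that predate the extension.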

[Committed] RISCV: Add rotate immediate regression test

2023-08-17 Thread Patrick O'Neill

On 8/16/23 21:36, Jeff Law wrote:




On 8/16/23 19:17, Patrick O'Neill wrote:

This adds new regression tests to ensure half-register rotations are
correctly optimized into rori instructions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-rol-ror-08.c: New test.
* gcc.target/riscv/zbb-rol-ror-09.c: New test.

Co-authored-by: Charlie Jenkins 
Signed-off-by: Patrick O'Neill 

OK
jeff

Committed
Patrick
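
The contents of zbb-rol-ror-08.c and zbb-rol-ror-09.c are not shown in this thread, so the following is only a sketch of what a "half-register rotation" looks like in C. The function names are hypothetical, and whether a given compiler turns these into a single rori depends on the target and the options used.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value by half its width (32 bits).  */
uint64_t
rotate_half_64 (uint64_t x)
{
  return (x >> 32) | (x << 32);
}

/* Rotate a 32-bit value by half its width (16 bits).  */
uint32_t
rotate_half_32 (uint32_t x)
{
  return (x >> 16) | (x << 16);
}

/* Hypothetical self-check of the two helpers.  */
void
check_rotate_half (void)
{
  assert (rotate_half_64 (UINT64_C (0x0123456789abcdef))
          == UINT64_C (0x89abcdef01234567));
  assert (rotate_half_32 (UINT32_C (0x12345678)) == UINT32_C (0x56781234));
}
```

For a rotation by exactly half the width, rotating left and rotating right give the same result, which is why one immediate-rotate instruction can serve both spellings.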


Re: [PATCH][RFC] tree-optimization/92335 - Improve sinking heuristics for vectorization

2023-08-17 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 15 Aug 2023 at 14:28, Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote:
> >> On Mon, 7 Aug 2023 at 13:19, Richard Biener  
> >> wrote:
> >> > It doesn't seem to make a difference for x86.  That said, the "fix" is
> >> > probably sticking the correct target on the dump-check, it seems
> >> > that vect_fold_extract_last is no longer correct here.
> >> Um sorry, I did go thru various checks in target-supports.exp, but not
> >> sure which one will be appropriate for this case,
> >> and am stuck here :/ Could you please suggest how to proceed ?
> >
> > Maybe Richard S. knows the magic thing to test, he originally
> > implemented the direct conversion support.  I suggest to implement
> > such dg-checks if they are not present (I can't find them),
> > possibly quite specific to the modes involved (like we have
> > other checks with _qi_to_hi suffixes, for float modes maybe
> > just _float).
>
> Yeah, can't remember specific selectors for that feature.  TBH I think
> most (all?) of the tests were AArch64-specific.
Hi,
As Richi mentioned above, the test now vectorizes on AArch64 because
it has support for direct conversion
between vectors while x86 doesn't. IIUC this is because
supportable_convert_operation returns true
for V4HI -> V4SI on Aarch64 since it can use extend_v4hiv4si2 for
doing the conversion ?

In the attached patch, I added a new target check vect_extend which
(currently) returns 1 only for aarch64*-*-*,
which makes the test PASS on both the targets, although I am not sure if
this is entirely correct.
Does the patch look OK ?

Thanks,
Prathamesh
>
> Thanks,
> Richard
diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-7.c 
b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
index 16cdcd1c6eb..c8623854af5 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-7.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
@@ -52,5 +52,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target 
vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" { target { ! 
vect_fold_extract_last } } } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target vect_extend } 
} } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 92b6f69730e..29ef64b84f3 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7768,6 +7768,16 @@ proc check_effective_target_vect_unpack { } {
 || [istarget amdgcn*-*-*] }}]
 }
 
+# Return 1 if the target plus current options supports vector
+# conversion of chars (to shorts) and shorts (to ints), 0 otherwise.
+#
+# This won't change for different subtargets so cache the result.
+
+proc check_effective_target_vect_extend { } {
+return [check_cached_effective_target_indexed vect_extend {
+  expr { [istarget aarch64*-*-*]}}]
+}
+
 # Return 1 if the target plus current options does not guarantee
 # that its STACK_BOUNDARY is >= the reguired vector alignment.
 #


Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread François Dumont via Gcc-patches
Another fix to define __cow_string(const std::string&) in 
cxx11-stdexcept.cc even if ! _GLIBCXX_USE_DUAL_ABI.


On 13/08/2023 21:51, François Dumont wrote:


Here is another version with enhanced sizeof/alignof static_assert in 
string-inst.cc for the std::__cow_string definition from . 
The assertions in cow-stdexcept.cc are now checking the definition 
which is in the same file.


On 13/08/2023 15:27, François Dumont wrote:


Here is the fixed patch tested in all 3 modes:

- _GLIBCXX_USE_DUAL_ABI

- !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI

- !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI

I don't know what you have in mind for the change below but I wanted 
to let you know that I tried to put COW std::basic_string into a 
nested __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more 
impact on string-inst.cc so I preferred the macro substitution approach.


There are some tests failing when !_GLIBCXX_USE_CXX11_ABI that are
unrelated to my changes. I'll propose fixes in the coming days.


    libstdc++: [_GLIBCXX_INLINE_VERSION] Use cxx11 abi [PR83077]

    Use cxx11 abi when activating versioned namespace mode. To do so, support
    a new configuration mode where !_GLIBCXX_USE_DUAL_ABI and 
_GLIBCXX_USE_CXX11_ABI.


    The main change is that std::__cow_string is now defined whenever 
_GLIBCXX_USE_DUAL_ABI
    or _GLIBCXX_USE_CXX11_ABI is true. Implementation is using 
available std::string in

    case of dual abi and a subset of it when it's not.

    On the other side std::__sso_string is defined only when 
_GLIBCXX_USE_DUAL_ABI is true
    and _GLIBCXX_USE_CXX11_ABI is false. Meaning that 
std::__sso_string is a typedef for the
    cow std::string implementation when dual abi is disabled and cow 
string is being used.


    libstdcxx-v3/ChangeLog:

    PR libstdc++/83077
    * acinclude.m4 [GLIBCXX_ENABLE_LIBSTDCXX_DUAL_ABI]: 
Default to "new" libstdcxx abi.
    * config/locale/dragonfly/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Define money_base

    members.
    * config/locale/generic/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Likewise.
    * config/locale/gnu/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Likewise.

    * config/locale/gnu/numeric_members.cc
[!_GLIBCXX_USE_DUAL_ABI](__narrow_multibyte_chars): Define.
    * configure: Regenerate.
    * include/bits/c++config
[_GLIBCXX_INLINE_VERSION](_GLIBCXX_NAMESPACE_CXX11, 
_GLIBCXX_BEGIN_NAMESPACE_CXX11):

    Define empty.
[_GLIBCXX_INLINE_VERSION](_GLIBCXX_END_NAMESPACE_CXX11, 
_GLIBCXX_DEFAULT_ABI_TAG):

    Likewise.
    * include/bits/cow_string.h [!_GLIBCXX_USE_CXX11_ABI]: 
Define a light version of COW

    basic_string as __std_cow_string for use in stdexcept.
    * include/std/stdexcept [_GLIBCXX_USE_CXX11_ABI]: Define 
__cow_string.

    (__cow_string(const char*)): New.
    (__cow_string::c_str()): New.
    * python/libstdcxx/v6/printers.py 
(StdStringPrinter::__init__): Set self.new_string to True

    when std::__8::basic_string type is found.
    * src/Makefile.am 
[ENABLE_SYMVERS_GNU_NAMESPACE](ldbl_alt128_compat_sources): Define empty.

    * src/Makefile.in: Regenerate.
    * src/c++11/Makefile.am (cxx11_abi_sources): Rename into...
    (dual_abi_sources): ...this. Also move cow-local_init.cc, 
cxx11-hash_tr1.cc,

    cxx11-ios_failure.cc entries to...
    (sources): ...this.
    (extra_string_inst_sources): Move cow-fstream-inst.cc, 
cow-sstream-inst.cc, cow-string-inst.cc,
    cow-string-io-inst.cc, cow-wtring-inst.cc, 
cow-wstring-io-inst.cc, cxx11-locale-inst.cc,

    cxx11-wlocale-inst.cc entries to...
    (inst_sources): ...this.
    * src/c++11/Makefile.in: Regenerate.
    * src/c++11/cow-fstream-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-locale_init.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-sstream-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-stdexcept.cc [_GLIBCXX_USE_CXX11_ABI]: 
Include .
    [_GLIBCXX_USE_DUAL_ABI || 
_GLIBCXX_USE_CXX11_ABI](__cow_string): Redefine before
    including . Define 
_GLIBCXX_DEFINE_STDEXCEPT_INSTANTIATIONS so that

    __cow_string definition in  is skipped.
    [_GLIBCXX_USE_CXX11_ABI]: Skip Transaction Memory TS 
definitions.

    Move static_assert to check std::_cow_string abi layout to...
    * src/c++11/string-inst.cc: ...here.
    (_GLIBCXX_DEFINING_CXX11_ABI_INSTANTIATIONS): Define 
following _GLIBCXX_USE_CXX11_ABI

    value.
    [_GLIBCXX_USE_CXX11_ABI && 
!_GLIBCXX_DEFINING_CXX11_ABI_INSTANTIATIONS]:
    Define _GLIBCXX_DEFINING_COW_STRING_INSTANTIATIONS. 
Include .
    Define basic_string as __std_cow_string for the current

Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread Jonathan Wakely via Gcc-patches
On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
 wrote:
>
> Here is the fixed patch tested in all 3 modes:
>
> - _GLIBCXX_USE_DUAL_ABI
>
> - !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI
>
> - !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI
>
> I don't know what you have in mind for the change below but I wanted to
> let you know that I tried to put COW std::basic_string into a nested
> __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
> string-inst.cc so I preferred the macro substitution approach.

I was thinking of implementing the necessary special member functions
of __cow_string directly, so they are ABI compatible with the COW
std::basic_string but don't actually reuse the code. That would mean
we don't need to compile and instantiate the whole COW string just to
use a few members from it. But that can be done later, the macro
approach seems OK for now.

>
> There are some tests failing when !_GLIBCXX_USE_CXX11_ABI that are
> unrelated to my changes. I'll propose fixes in the coming days.

Which tests? I run the entire testsuite with
-D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
failures.

I'll review the patch ASAP, thanks for working on it.



[COMMITTED] PR tree-optimization/111009 - Fix range-ops operator_addr.

2023-08-17 Thread Andrew MacLeod via Gcc-patches
operator_addr was simply calling fold_range() to implement op1_range, 
but it turns out op1_range needs to be more restrictive.


Take, for example, this statement from the PR:

   _13 = &dso->maj

When folding, getting a value of 0 for op1 means dso->maj resolved to a
value of [0,0].  fold_using_range::range_of_address will have processed
the symbolics, or at least we know that op1 is 0.  Likewise, if it is
non-zero, we can also conclude the LHS is non-zero.


However, when working from the LHS, we cannot draw the same
conclusions.  GORI has no concept of symbolics, so knowing the expression is


[0,0]  = & 

we cannot conclude that op1 is also 0.  In particular, &dso->maj wouldn't
be unless dso was zero and maj was also at a zero offset.
Likewise, if the LHS is [1,1] we can't be sure op1 is nonzero unless we
know the type cannot wrap.


This patch simply implements op1_range with these rules instead of 
calling fold_range.
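
The rule can be sketched as a tiny stand-alone model. This is not the GCC code: `toy_op1_range`, `toy_addr_of_field` and the enum are names invented here, ranges are reduced to two flags, and the early exit for empty ranges in the real patch is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Possible answers for op1 in this toy model.  */
enum toy_range { TOY_NONZERO, TOY_VARYING };

/* Mirror of the decision in the patch: op1 is known non-zero only when
   the LHS excludes zero and pointer overflow is undefined.  */
enum toy_range
toy_op1_range (int lhs_contains_zero, int overflow_undefined)
{
  if (!lhs_contains_zero && overflow_undefined)
    return TOY_NONZERO;
  return TOY_VARYING;
}

/* Model "&base->field" as wrapping unsigned arithmetic, i.e. the case
   where overflow is not undefined.  */
uintptr_t
toy_addr_of_field (uintptr_t base, uintptr_t offset)
{
  return base + offset;
}

/* Hypothetical self-check: with wrapping, a zero result does not imply
   a zero base, and a non-zero result does not imply a non-zero base.  */
void
check_toy_addr (void)
{
  assert (toy_addr_of_field ((uintptr_t) 0 - 8, 8) == 0);
  assert (toy_addr_of_field (0, 8) != 0);
}
```

The two assertions in `check_toy_addr` are the counterexamples behind the more conservative answer: once the address computation may wrap, the value of the LHS alone says nothing about whether the base pointer is zero.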


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
From dc48d1d1d4458773f89f21b2f019f66ddf88f2e5 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Thu, 17 Aug 2023 11:13:14 -0400
Subject: [PATCH] Fix range-ops operator_addr.

Lack of symbolic information prevents op1_range from being able to draw
the same conclusions as fold_range can.

PR tree-optimization/111009
gcc/
* range-op.cc (operator_addr_expr::op1_range): Be more restrictive.

gcc/testsuite/
* gcc.dg/pr111009.c: New.
---
 gcc/range-op.cc | 12 ++-
 gcc/testsuite/gcc.dg/pr111009.c | 38 +
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr111009.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 086c6c19735..268f6b6f025 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -4325,7 +4325,17 @@ operator_addr_expr::op1_range (irange &r, tree type,
   const irange &op2,
   relation_trio) const
 {
-  return operator_addr_expr::fold_range (r, type, lhs, op2);
+   if (empty_range_varying (r, type, lhs, op2))
+return true;
+
+  // Return a non-null pointer of the LHS type (passed in op2), but only
+  // if we cant overflow, eitherwise a no-zero offset could wrap to zero.
+  // See PR 111009.
+  if (!contains_zero_p (lhs) && TYPE_OVERFLOW_UNDEFINED (type))
+r = range_nonzero (type);
+  else
+r.set_varying (type);
+  return true;
 }
 
 // Initialize any integral operators to the primary table
diff --git a/gcc/testsuite/gcc.dg/pr111009.c b/gcc/testsuite/gcc.dg/pr111009.c
new file mode 100644
index 000..3accd9ac063
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111009.c
@@ -0,0 +1,38 @@
+/* PR tree-optimization/111009 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-strict-overflow" } */
+
+struct dso {
+ struct dso * next;
+ int maj;
+};
+
+__attribute__((noipa)) static void __dso_id__cmp_(void) {}
+
+__attribute__((noipa))
+static int bug(struct dso * d, struct dso *dso)
+{
+ struct dso **p = &d;
+ struct dso *curr = 0;
+
+ while (*p) {
+  curr = *p;
+  // prevent null deref below
+  if (!dso) return 1;
+  if (dso == curr) return 1;
+
+  int *a = &dso->maj;
+  // null deref
+  if (!(a && *a)) __dso_id__cmp_();
+
+  p = &curr->next;
+ }
+ return 0;
+}
+
+__attribute__((noipa))
+int main(void) {
+struct dso d = { 0, 0, };
+bug(&d, 0);
+}
+
-- 
2.41.0



Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread François Dumont via Gcc-patches



On 17/08/2023 19:22, Jonathan Wakely wrote:
> On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
>  wrote:
>> Here is the fixed patch tested in all 3 modes:
>>
>> - _GLIBCXX_USE_DUAL_ABI
>>
>> - !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI
>>
>> - !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI
>>
>> I don't know what you have in mind for the change below but I wanted to
>> let you know that I tried to put COW std::basic_string into a nested
>> __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
>> string-inst.cc so I preferred the macro substitution approach.
> I was thinking of implementing the necessary special members functions
> of __cow_string directly, so they are ABI compatible with the COW
> std::basic_string but don't actually reuse the code. That would mean
> we don't need to compile and instantiate the whole COW string just to
> use a few members from it. But that can be done later, the macro
> approach seems OK for now.

You'll see that when cow_string.h is included while
_GLIBCXX_USE_CXX11_ABI == 1 then I am hiding a big part of the
basic_string definition. Initially it was to avoid to have to include
basic_string.tcc but it is also a lot of useless code indeed.

>> There are some test failing when !_GLIBCXX_USE_CXX11_ABI that are
>> unrelated with my changes. I'll propose fixes in coming days.
> Which tests? I run the entire testsuite with
> -D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
> failures.
>
> I'll review the patch ASAP, thanks for working on it.

So far the only issue I found are in the mode !_GLIBCXX_USE_DUAL_ABI &&
!_GLIBCXX_USE_CXX11_ABI. They are:

23_containers/unordered_map/96088.cc
23_containers/unordered_multimap/96088.cc
23_containers/unordered_multiset/96088.cc
23_containers/unordered_set/96088.cc
ext/debug_allocator/check_new.cc
ext/malloc_allocator/check_new.cc
ext/malloc_allocator/deallocate_local.cc
ext/new_allocator/deallocate_local.cc
ext/pool_allocator/allocate_chunk.cc
ext/throw_allocator/deallocate_local.cc

but not sure I'll try to fix those in this context.

François



Re: [Committed] RISCV: Add rotate immediate regression test

2023-08-17 Thread Palmer Dabbelt

On Thu, 17 Aug 2023 10:10:38 PDT (-0700), Patrick O'Neill wrote:
> On 8/16/23 21:36, Jeff Law wrote:
>> On 8/16/23 19:17, Patrick O'Neill wrote:
>>> This adds new regression tests to ensure half-register rotations are
>>> correctly optimized into rori instructions.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/riscv/zbb-rol-ror-08.c: New test.
>>> * gcc.target/riscv/zbb-rol-ror-09.c: New test.
>>>
>>> Co-authored-by: Charlie Jenkins 
>>> Signed-off-by: Patrick O'Neill 
>> OK
>> jeff
> Committed

IIRC this came up in the context of Linux's TCP checksum code.

> Patrick


Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Siddhesh Poyarekar

On 2023-08-17 09:58, Qing Zhao wrote:
>> So this is a (sort of) known issue, which necessitated the early_objsz pass to
>> get an estimate before a subobject reference was optimized to a MEM_REF.
>
> Do you mean that after a subobject reference was optimized to a MEM_REF, there
> is no way to compute the size of the subobject anymore?

Yes, in cases where the TYPE_SIZE is lost and there's no other
allocation information to fall back on.

>>  However it looks like the MIN/MAX hack doesn't work in this case for
>> OST_MINIMUM; it should probably get the minimum of the two passes if both
>> passes were successful, or only the result of the pass that was successful.
>
> You mean that the following line:
> 2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
> Might need to be changed to:
> 2053   enum tree_code code =  MIN_EXPR;

Yes, that's it.  Maybe it's more correct if, instead of using MAX_EXPR for
OST_MINIMUM, we stick with the early_objsz answer if it's non-zero.  I'm
not sure if that's the case for maximum size though, my gut says it isn't.


Thanks,
Sid


Re: [PATCH V2] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

2023-08-17 Thread Palmer Dabbelt

On Thu, 17 Aug 2023 10:03:04 PDT (-0700), rdapp@gmail.com wrote:
> Indeed all ANYLSF patterns have TARGET_HARD_FLOAT (==f extension) which
> is incompatible with ZHINX or ZHINXMIN anyway.  That should really be fixed
> separately or at least clarified, maybe I'm missing something.

We've also got the broader issue where these PIC patterns are likely not
the way to go long term, they're just papering around some other issues
(and are likely why we flip the implicit-relocs behavior implicitly).
We should probably fix that at some point, but I don't see any reason to
block a fix on a cleanup.

That said, given that folks are poking around in here it's probably
worth putting together test cases for the other patterns in there.

> Still we can go forward with the patch itself as it improves things
> independently, so LGTM.

Ya, IMO it's fine to add these given they fix the issue.

> Regards
>  Robin


Re: [PATCH] libgccjit: Add support for `restrict` attribute on function parameters

2023-08-17 Thread Guillaume Gomez via Gcc-patches
Quick question: do you plan to make the merge or should I ask Antoni?

Le jeu. 17 août 2023 à 17:59, Guillaume Gomez 
a écrit :

> Thanks for the review!
>
> Le jeu. 17 août 2023 à 17:50, David Malcolm  a écrit
> :
> >
> > On Thu, 2023-08-17 at 17:41 +0200, Guillaume Gomez wrote:
> > > And now I just discovered that a lot of commits from Antoni's fork
> > > haven't been sent upstream which is why the ABI count is so high in
> > > his repository. Fixed that as well.
> >
> > Thanks for the updated patch; I was about to comment on that.
> >
> > This version is good for gcc trunk.
> >
> > Dave
> >
> > >
> > > Le jeu. 17 août 2023 à 17:26, Guillaume Gomez
> > >  a écrit :
> > > >
> > > > Antoni spotted a typo I made:
> > > >
> > > > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` instead of
> > > > `LIBGCCJIT_HAVE_gcc_jit_type_get_restrict`. Fixed in this patch,
> > > > sorry
> > > > for the noise.
> > > >
> > > > Le jeu. 17 août 2023 à 11:30, Guillaume Gomez
> > > >  a écrit :
> > > > >
> > > > > Hi Dave,
> > > > >
> > > > > > What kind of testing has the patch had? (e.g. did you run "make
> > > > > > check-
> > > > > > jit" ?  Has this been in use on real Rust code?)
> > > > >
> > > > > I tested it as Rust backend directly on this code:
> > > > >
> > > > > ```
> > > > > pub fn foo(a: &mut i32, b: &mut i32, c: &i32) {
> > > > > *a += *c;
> > > > > *b += *c;
> > > > > }
> > > > > ```
> > > > >
> > > > > I ran it with `rustc` (and the GCC backend) with the following
> > > > > flags:
> > > > > `-C link-args=-lc --emit=asm -O --crate-type=lib` which gave the
> > > > > diff
> > > > > you can see in the attached file. Explanations: the diff on the
> > > > > right
> > > > > has the `__restrict__` attribute used whereas on the left it is
> > > > > the
> > > > > current version where we don't handle it.
> > > > >
> > > > > As for C testing, I used this code:
> > > > >
> > > > > ```
> > > > > void t(int *__restrict__ a, int *__restrict__ b, char
> > > > > *__restrict__ c) {
> > > > > *a += *c;
> > > > > *b += *c;
> > > > > }
> > > > > ```
> > > > >
> > > > > (without the `__restrict__` of course when I need to have a
> > > > > witness
> > > > > ASM). I attached the diff as well, this time the file with the
> > > > > use of
> > > > > > `__restrict__` is on the left. I compiled with the following
> > > > > flags:
> > > > > `-S -O3`.
> > > > >
> > > > > > Please add a feature macro:
> > > > > > #define LIBGCCJIT_HAVE_gcc_jit_type_get_restrict
> > > > > > (see the similar ones in the header).
> > > > >
> > > > > I added `LIBGCCJIT_HAVE_gcc_jit_type_get_size` and extended the
> > > > > documentation as well to mention the ABI change.
> > > > >
> > > > > > Please add a new ABI tag (LIBGCCJIT_ABI_25 ?), rather than
> > > > > > adding this
> > > > > > to ABI_0.
> > > > >
> > > > > I added `LIBGCCJIT_ABI_34` as `LIBGCCJIT_ABI_33` was the last
> > > > > one.
> > > > >
> > > > > > This refers to a "cold attribute"; is this a vestige of a copy-
> > > > > > and-
> > > > > > paste from a different test case?
> > > > >
> > > > > It is a vestige indeed... Missed this one.
> > > > >
> > > > > > I see that the test scans the generated assembler.  Does the
> > > > > > test
> > > > > > actually verify that restrict has an effect, or was that
> > > > > > another
> > > > > > vestige from a different test case?
> > > > >
> > > > > No, this time it's what I wanted. Please see the C diff I
> > > > > provided
> > > > > above to see that the ASM has a small diff that allowed me to
> > > > > confirm
> > > > > that the `__restrict__` attribute was correctly set.
> > > > >
> > > > > > If this test is meant to run at -O3 and thus can't be part of
> > > > > > test-
> > > > > > combination.c, please add a comment about it to
> > > > > > gcc/testsuite/jit.dg/all-non-failing-tests.h (in the
> > > > > > alphabetical
> > > > > > place).
> > > > >
> > > > > Below `-O3`, this ASM difference doesn't appear unfortunately.
> > > > >
> > > > > > The patch also needs to add documentation for the new
> > > > > > entrypoint (in
> > > > > > topics/types.rst), and for the new ABI tag (in
> > > > > > topics/compatibility.rst).
> > > > >
> > > > > Added!
> > > > >
> > > > > > Thanks again for the patch; hope the above is constructive
> > > > >
> > > > > It was incredibly useful! Thanks for taking time to writing down
> > > > > the
> > > > > explanations.
> > > > >
> > > > > The new patch is attached to this email.
> > > > >
> > > > > Cordially.
> > > > >
> > > > > Le jeu. 17 août 2023 à 01:06, David Malcolm 
> > > > > a écrit :
> > > > > >
> > > > > > On Wed, 2023-08-16 at 22:06 +0200, Guillaume Gomez via Jit
> > > > > > wrote:
> > > > > > > My apologies, forgot to run the commit checkers. Here's the
> > > > > > > commit
> > > > > > > with the errors fixed.
> > > > > > >
> > > > > > > Le mer. 16 août 2023 à 18:32, Guillaume Gomez
> > > > > > >  a écrit :
> > > > > > > >
> > > > > > > > Hi,
> > > > > >
> > > > > > Hi Guillaume, thanks for the patch.
> > > > > 

[PATCH] RISC-V: Implement TLS Descriptors.

2023-08-17 Thread Tatsuyuki Ishi via Gcc-patches
This implements TLS Descriptors (TLSDESC) as specified in [1].

In TLSDESC instruction sequence, the first instruction relocates against
the target TLS variable, while subsequent instructions relocates against
the address of the first. Such usage of labels are not well-supported
within GCC. Due to this, the 4-instruction sequence is implemented as a
single RTX insn.

For now, keep defaulting to the traditional TLS model, but this can be
revisited once toolchain and libc support ships.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373
---
No regression in binutils and gcc tests for rv64gc, tested alongside the
binutils and glibc implementation (posted at the same time). During
testing, the default TLS dialect was changed to TLSDESC.

This contribution is made on behalf of Blue Whale Systems, which has
copyright assignment on file with the FSF.

 gcc/config/riscv/riscv-opts.h   |  6 ++
 gcc/config/riscv/riscv-protos.h |  5 +++--
 gcc/config/riscv/riscv.cc   | 34 +
 gcc/config/riscv/riscv.h|  3 +++
 gcc/config/riscv/riscv.md   | 22 -
 gcc/config/riscv/riscv.opt  | 14 ++
 6 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 378a17699cd..db03f35430a 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -319,4 +319,10 @@ enum riscv_entity
 #define TARGET_VECTOR_VLS \
   (TARGET_VECTOR && riscv_autovec_preference == RVV_SCALABLE)
 
+/* TLS types.  */
+enum riscv_tls_type {
+  TLS_TRADITIONAL,
+  TLS_DESCRIPTORS
+};
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 472c00dc439..9b7471f7591 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -33,9 +33,10 @@ enum riscv_symbol_type {
   SYMBOL_TLS,
   SYMBOL_TLS_LE,
   SYMBOL_TLS_IE,
-  SYMBOL_TLS_GD
+  SYMBOL_TLS_GD,
+  SYMBOL_TLSDESC,
 };
-#define NUM_SYMBOL_TYPES (SYMBOL_TLS_GD + 1)
+#define NUM_SYMBOL_TYPES (SYMBOL_TLSDESC + 1)
 
 /* Classifies an address.
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 49062bef9fc..4ff0adbbb1e 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -799,6 +799,7 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
 case SYMBOL_ABSOLUTE: return 2; /* LUI + the reference.  */
 case SYMBOL_PCREL: return 2; /* AUIPC + the reference.  */
 case SYMBOL_TLS_LE: return 3; /* LUI + ADD TP + the reference.  */
+case SYMBOL_TLSDESC: return 6; /* 4-instruction call + ADD TP + the reference.  */
 case SYMBOL_GOT_DISP: return 3; /* AUIPC + LD GOT + the reference.  */
 default: gcc_unreachable ();
 }
@@ -1601,6 +1602,16 @@ static rtx riscv_tls_add_tp_le (rtx dest, rtx base, rtx sym)
 return gen_tls_add_tp_lesi (dest, base, tp, sym);
 }
 
+/* Instruction sequence to call the TLS Descriptor resolver.  */
+
+static rtx riscv_tlsdesc (rtx sym, rtx seqno)
+{
+  if (Pmode == DImode)
+return gen_tlsdescdi (sym, seqno);
+  else
+return gen_tlsdescsi (sym, seqno);
+}
+
 /* If MODE is MAX_MACHINE_MODE, ADDR appears as a move operand, otherwise
it appears in a MEM of that mode.  Return true if ADDR is a legitimate
constant in that context and can be split into high and low parts.
@@ -1734,7 +1745,7 @@ riscv_call_tls_get_addr (rtx sym, rtx result)
 static rtx
 riscv_legitimize_tls_address (rtx loc)
 {
-  rtx dest, tp, tmp;
+  rtx dest, tp, tmp, a0;
   enum tls_model model = SYMBOL_REF_TLS_MODEL (loc);
 
 #if 0
@@ -1750,9 +1761,24 @@ riscv_legitimize_tls_address (rtx loc)
   /* Rely on section anchors for the optimization that LDM TLS
 provides.  The anchor's address is loaded with GD TLS. */
 case TLS_MODEL_GLOBAL_DYNAMIC:
-  tmp = gen_rtx_REG (Pmode, GP_RETURN);
-  dest = gen_reg_rtx (Pmode);
-  emit_libcall_block (riscv_call_tls_get_addr (loc, tmp), dest, tmp, loc);
+  if (TARGET_TLSDESC)
+   {
+ static unsigned seqno;
+ tp = gen_rtx_REG (Pmode, THREAD_POINTER_REGNUM);
+ a0 = gen_rtx_REG (Pmode, GP_ARG_FIRST);
+ dest = gen_reg_rtx (Pmode);
+
+ emit_insn (riscv_tlsdesc (loc, GEN_INT (seqno)));
+ emit_insn (gen_add3_insn (dest, a0, tp));
+ seqno++;
+   }
+  else
+   {
+ tmp = gen_rtx_REG (Pmode, GP_RETURN);
+ dest = gen_reg_rtx (Pmode);
+ emit_libcall_block (riscv_call_tls_get_addr (loc, tmp), dest, tmp,
+ loc);
+   }
   break;
 
 case TLS_MODEL_INITIAL_EXEC:
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index e18a0081297..7cf1365ec08 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1122,4 +1122,7 @@ extern void riscv_remove_unneeded_save_restore_calls (void);
 #define OPTIMIZE_MOD

Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
>> Am 17.08.2023 um 13:25 schrieb Richard Sandiford via Gcc-patches 
>> :
>> 
>> Joseph Myers  writes:
 On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
 
 Would it be OK to add support for:
 
  [[__extension__ ...]]
 
 to suppress the pedwarn about using [[]] prior to C2X?  Then we can
>>> 
>>> That seems like a plausible feature to add.
>> 
>> Thanks.  Of course, once I actually tried it, I hit a snag:
>> :: isn't a single lexing token prior to C2X, and so something like:
>> 
>>  [[__extension__ arm::streaming]]
>> 
>> would not be interpreted as a scoped attribute in C11.  The patch
>> gets around that by allowing two colons in place of :: when
>> __extension__ is used.  I realise that's pushing the bounds of
>> acceptability though...
>> 
>> I wondered about trying to require the two colons to be immediately
>> adjacent.  But:
>> 
>> (a) There didn't appear to be an existing API to check that, which seemed
>>like a red flag.  The closest I could find was get_source_text_between.
>
> ISTR a cpp token has ->prev_white or so

Ah, thanks.

  if (c_parser_next_token_is (parser, CPP_SCOPE)
  || (loose_scope_p
  && c_parser_next_token_is (parser, CPP_COLON)
  && c_parser_peek_2nd_token (parser)->type == CPP_COLON
  && !(c_parser_peek_2nd_token (parser)->flags & PREV_WHITE)))

seems to work for (i.e. reject):

typedef int [[__extension__ gnu : : vector_size (4)]] g3;
typedef int [[__extension__ gnu :/**/: vector_size (4)]] g13;

but not:

#define BAR :
typedef int [[__extension__ gnu BAR BAR vector_size (4)]] g5;

#define JOIN(A, B) A/**/B
typedef int [[__extension__ gnu JOIN(:,:) vector_size (4)]] g14;

I now realise the patch was peeking at the wrong token.  Will fix,
and add more tests.

Richard
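
The adjacency check above can be modelled outside the parser. This is only a sketch under stated assumptions: `Token`, `TokenType`, `PREV_WHITE_FLAG` and `scope_at` are hypothetical stand-ins, not the C front end's real types. It assumes, as the snippet in the message does, that the flag on the second ':' records whether whitespace preceded it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the front end's token kinds and flags.
enum TokenType { TOK_NAME, TOK_COLON, TOK_SCOPE, TOK_OTHER };
constexpr unsigned PREV_WHITE_FLAG = 1u << 0;  // whitespace precedes this token

struct Token {
  TokenType type;
  unsigned flags;
};

// Return true if the tokens starting at POS form a scope operator:
// either a single "::" token, or (when LOOSE_SCOPE_P) two ':' tokens
// where the second one is not preceded by whitespace.
bool scope_at(const std::vector<Token>& toks, std::size_t pos,
              bool loose_scope_p)
{
  if (pos >= toks.size())
    return false;
  if (toks[pos].type == TOK_SCOPE)
    return true;
  return loose_scope_p
         && toks[pos].type == TOK_COLON
         && pos + 1 < toks.size()
         && toks[pos + 1].type == TOK_COLON
         && !(toks[pos + 1].flags & PREV_WHITE_FLAG);
}
```

In this model "gnu : :" is rejected because the second colon carries the flag, and "gnu::" written as two colon tokens is accepted only in loose mode. The macro cases in the message (BAR BAR, JOIN) depend on how the preprocessor sets the flag and are not modelled here.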


Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread Jonathan Wakely via Gcc-patches
On Thu, 17 Aug 2023 at 18:40, François Dumont  wrote:
>
>
> On 17/08/2023 19:22, Jonathan Wakely wrote:
> > On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
> >  wrote:
> >> Here is the fixed patch tested in all 3 modes:
> >>
> >> - _GLIBCXX_USE_DUAL_ABI
> >>
> >> - !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI
> >>
> >> - !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI
> >>
> >> I don't know what you have in mind for the change below but I wanted to
> >> let you know that I tried to put COW std::basic_string into a nested
> >> __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
> >> string-inst.cc so I preferred the macro substitution approach.
> > I was thinking of implementing the necessary special members functions
> > of __cow_string directly, so they are ABI compatible with the COW
> > std::basic_string but don't actually reuse the code. That would mean
> > we don't need to compile and instantiate the whole COW string just to
> > use a few members from it. But that can be done later, the macro
> > approach seems OK for now.
>
> You'll see that when cow_string.h is included while
> _GLIBCXX_USE_CXX11_ABI == 1 then I am hiding a big part of the
> basic_string definition. Initially it was to avoid to have to include
> basic_string.tcc but it is also a lot of useless code indeed.
>
>
> >
> >> There are some test failing when !_GLIBCXX_USE_CXX11_ABI that are
> >> unrelated with my changes. I'll propose fixes in coming days.
> > Which tests? I run the entire testsuite with
> > -D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
> > failures.
> >
> > I'll review the patch ASAP, thanks for working on it.
> >
> So far the only issue I found are in the mode !_GLIBCXX_USE_DUAL_ABI &&
> !_GLIBCXX_USE_CXX11_ABI. They are:
>
> 23_containers/unordered_map/96088.cc
> 23_containers/unordered_multimap/96088.cc
> 23_containers/unordered_multiset/96088.cc
> 23_containers/unordered_set/96088.cc
> ext/debug_allocator/check_new.cc
> ext/malloc_allocator/check_new.cc
> ext/malloc_allocator/deallocate_local.cc
> ext/new_allocator/deallocate_local.cc
> ext/pool_allocator/allocate_chunk.cc
> ext/throw_allocator/deallocate_local.cc

Ah yes, they fail for !USE_DUAL_ABI builds, I wonder why.

/home/test/src/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc:44: void test01(): Assertion '__gnu_test::counter::count() == 3' failed.
FAIL: 23_containers/unordered_map/96088.cc execution test



> but not sure I'll try to fix those in this context.

Right, those seem like something separate.

>
> François
>



[PATCH] Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl standard patterns

2023-08-17 Thread Andrew Pinski via Gcc-patches
When I added `cond_one_cmpl` (and the corresponding IFN) I had noticed cond_neg
standard named pattern was not documented and this adds the documentation for
all 4 named patterns now.

OK? Tested by building the manual.

gcc/ChangeLog:

* doc/md.texi (Standard patterns): Document cond_neg, cond_one_cmpl,
cond_len_neg and cond_len_one_cmpl.
---
 gcc/doc/md.texi | 62 +
 1 file changed, 62 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 70590e68ffe..89562fdb43c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7194,6 +7194,40 @@ move operand 2 or (operands 2 + operand 3) into operand 0 according to the
 comparison in operand 1.  If the comparison is false, operand 2 is moved into
 operand 0, otherwise (operand 2 + operand 3) is moved.
 
+@cindex @code{cond_neg@var{mode}} instruction pattern
+@cindex @code{cond_one_cmpl@var{mode}} instruction pattern
+@item @samp{cond_neg@var{mode}}
+@itemx @samp{cond_one_cmpl@var{mode}}
+When operand 1 is true, perform an operation on operand 2 and
+store the result in operand 0, otherwise store operand 3 in operand 0.
+The operation works elementwise if the operands are vectors.
+
+The scalar case is equivalent to:
+
+@smallexample
+op0 = op1 ? @var{op} op2 : op3;
+@end smallexample
+
+while the vector case is equivalent to:
+
+@smallexample
+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
+  op0[i] = op1[i] ? @var{op} op2[i] : op3[i];
+@end smallexample
+
+where, for example, @var{op} is @code{~} for @samp{cond_one_cmpl@var{mode}}.
+
+When defined for floating-point modes, the contents of @samp{op2[i]}
+are not interpreted if @samp{op1[i]} is false, just like they would not
+be in a normal C @samp{?:} condition.
+
+Operands 0, 2, and 3 all have mode @var{m}.  Operand 1 is a scalar
+integer if @var{m} is scalar, otherwise it has the mode returned by
+@code{TARGET_VECTORIZE_GET_MASK_MODE}.
+
+@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional
+form of @samp{@var{op}@var{mode}2}.
+
 @cindex @code{cond_add@var{mode}} instruction pattern
 @cindex @code{cond_sub@var{mode}} instruction pattern
 @cindex @code{cond_mul@var{mode}} instruction pattern
@@ -7281,6 +7315,34 @@ for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
   op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];
 @end smallexample
 
+@cindex @code{cond_len_neg@var{mode}} instruction pattern
+@cindex @code{cond_len_one_cmpl@var{mode}} instruction pattern
+@item @samp{cond_len_neg@var{mode}}
+@itemx @samp{cond_len_one_cmpl@var{mode}}
+When operand 1 is true and element index < operand 4 + operand 5, perform an operation on operand 2 and
+store the result in operand 0, otherwise store operand 3 in operand 0.
+The operation only works when the operands are vectors.
+
+@smallexample
+for (i = 0; i < ops[4] + ops[5]; i++)
+  op0[i] = op1[i] ? @var{op} op2[i] : op3[i];
+@end smallexample
+
+where, for example, @var{op} is @code{~} for @samp{cond_len_one_cmpl@var{mode}}.
+
+When defined for floating-point modes, the contents of @samp{op2[i]}
+are not interpreted if @samp{op1[i]} is false, just like they would not
+be in a normal C @samp{?:} condition.
+
+Operands 0, 2, and 3 all have mode @var{m}.  Operand 1 is a scalar
+integer if @var{m} is scalar, otherwise it has the mode returned by
+@code{TARGET_VECTORIZE_GET_MASK_MODE}.  Operand 4 has whichever
+integer mode the target prefers.
+
+@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional
+form of @samp{@var{op}@var{mode}2}.
+
+
 @cindex @code{cond_len_add@var{mode}} instruction pattern
 @cindex @code{cond_len_sub@var{mode}} instruction pattern
 @cindex @code{cond_len_mul@var{mode}} instruction pattern
-- 
2.31.1



Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Qing Zhao via Gcc-patches


> On Aug 17, 2023, at 1:49 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-17 09:58, Qing Zhao wrote:
>>> So this is a (sort of) known issue, which necessitated the early_objsz pass 
>>> to get an estimate before a subobject reference was optimized to a MEM_REF.
>> Do you mean that after a subobject reference was optimized to a MEM_REF, 
>> there is no way to compute the size of the subobject anymore?
> 
> Yes, in cases where the TYPE_SIZE is lost and there's no other allocation 
> information to fall back on.

Okay, I see.

> 
>>>  However it looks like the MIN/MAX hack doesn't work in this case for 
>>> OST_MINIMUM; it should probably get the minimum of the two passes if both 
>>> passes were successful, or only the result of the pass that was successful.
>> You mean that the following line:
>> 2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : 
>> MIN_EXPR;
>> Might need to be changed to:
>> 2053   enum tree_code code =  MIN_EXPR;
> 
> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for 
> OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not 
> sure if that's the case for maximum size though, my gut says it isn't.

So, the main purpose of adding the early object size phase is to compute
subobject sizes more precisely before the subobject information is lost?

Then I think that, whether for MIN or MAX, the early phase has more precise
information than the later phase, so we should use its result if it's NOT UNKNOWN?

Qing
> 
> Thanks,
> Sid
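
The "prefer the early result unless it is unknown" rule discussed above can be written down as a tiny model. This is a sketch only, not code from tree-object-size.cc: `ObjSize` and `combine_sizes` are hypothetical names, and "unknown" is modelled as an empty `std::optional` rather than GCC's own sentinel values.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model: an empty optional means the pass could not
// determine a size ("unknown").
using ObjSize = std::optional<std::uint64_t>;

// Prefer the estimate from the early pass when it is known; fall back to
// the late pass only when the early pass had no answer.
ObjSize combine_sizes(ObjSize early, ObjSize late)
{
  return early ? early : late;
}
```

Unlike a MIN_EXPR/MAX_EXPR combination, this never lets a less precise late answer override a known early one. Whether that is right for the maximum-size case is exactly the open question in the thread.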



[PATCH] improve error for when /usr/include isn't found [PR90835]

2023-08-17 Thread Eric Gallager via Gcc-patches
This is a pretty simple patch that ought to help Darwin users understand
better why their build is failing when they forget to pass the
--with-sysroot= flag to configure.

gcc/ChangeLog:

PR target/90835
* Makefile.in: improve error message when /usr/include is
missing


0001-improve-error-for-when-usr-include-isn-t-found.patch
Description: Binary data


Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread Jonathan Wakely via Gcc-patches
On Thu, 17 Aug 2023 at 19:59, Jonathan Wakely  wrote:
>
> On Thu, 17 Aug 2023 at 18:40, François Dumont  wrote:
> >
> >
> > On 17/08/2023 19:22, Jonathan Wakely wrote:
> > > On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
> > >  wrote:
> > >> Here is the fixed patch tested in all 3 modes:
> > >>
> > >> - _GLIBCXX_USE_DUAL_ABI
> > >>
> > >> - !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI
> > >>
> > >> - !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI
> > >>
> > >> I don't know what you have in mind for the change below but I wanted to
> > >> let you know that I tried to put COW std::basic_string into a nested
> > >> __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
> > >> string-inst.cc so I preferred the macro substitution approach.
> > > I was thinking of implementing the necessary special members functions
> > > of __cow_string directly, so they are ABI compatible with the COW
> > > std::basic_string but don't actually reuse the code. That would mean
> > > we don't need to compile and instantiate the whole COW string just to
> > > use a few members from it. But that can be done later, the macro
> > > approach seems OK for now.
> >
> > You'll see that when cow_string.h is included while
> > _GLIBCXX_USE_CXX11_ABI == 1 then I am hiding a big part of the
> > basic_string definition. Initially it was to avoid to have to include
> > basic_string.tcc but it is also a lot of useless code indeed.
> >
> >
> > >
> > >> There are some test failing when !_GLIBCXX_USE_CXX11_ABI that are
> > >> unrelated with my changes. I'll propose fixes in coming days.
> > > Which tests? I run the entire testsuite with
> > > -D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
> > > failures.
> > >
> > > I'll review the patch ASAP, thanks for working on it.
> > >
> > So far the only issue I found are in the mode !_GLIBCXX_USE_DUAL_ABI &&
> > !_GLIBCXX_USE_CXX11_ABI. They are:
> >
> > 23_containers/unordered_map/96088.cc
> > 23_containers/unordered_multimap/96088.cc
> > 23_containers/unordered_multiset/96088.cc
> > 23_containers/unordered_set/96088.cc
> > ext/debug_allocator/check_new.cc
> > ext/malloc_allocator/check_new.cc
> > ext/malloc_allocator/deallocate_local.cc
> > ext/new_allocator/deallocate_local.cc
> > ext/pool_allocator/allocate_chunk.cc
> > ext/throw_allocator/deallocate_local.cc
>
> Ah yes, they fail for !USE_DUAL_ABI builds, I wonder why.
>
> /home/test/src/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/96088.
> cc:44: void test01(): Assertion '__gnu_test::counter::count() == 3' failed.
> FAIL: 23_containers/unordered_map/96088.cc execution test

It's due to this global object in src/c++20/tzdb.cc:
1081    const string tzdata_file = "/tzdata.zi";

When the library uses COW strings that requires an allocation before
main, which uses the replacement operator new in the tests, which
fails to allocate. For example, in 22_locale/locale/cons/12352.cc we
have this function used by operator new:

int times_to_fail = 0;

void* allocate(std::size_t n)
{
  if (!times_to_fail--)
return 0;

The counter is initially zero, so if we try to allocate before it gets
set to a non-zero value in test01() then we fail.

The test should not assume no allocations before main() begins. The
simplest way to do that is with another global that says "we have
started testing" e.g.

--- a/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
+++ b/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
@@ -26,11 +26,12 @@
 #include 
 #include 

+bool tests_started = false;
 int times_to_fail = 0;

 void* allocate(std::size_t n)
 {
-  if (!times_to_fail--)
+  if (tests_started && !times_to_fail--)
 return 0;

   void* ret = std::malloc(n ? n : 1);
@@ -106,6 +107,8 @@ void operator delete[](void* p, const
std::nothrow_t&) throw()
 // libstdc++/12352
 void test01(int iters)
 {
+  tests_started = true;
+
   for (int j = 0; j < iters; ++j)
 {
   for (int i = 0; i < 100; ++i)


This way the replacement operator new doesn't start intentionally
failing until we ask it to do so.
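
To see the gating in isolation, here is a minimal standalone sketch of the same idea. It reuses the names from the diff (`tests_started`, `times_to_fail`, `allocate`), but it is not the testsuite file: it does not replace operator new, and it returns `nullptr` where the test returns 0.

```cpp
#include <cstddef>
#include <cstdlib>

// Globals mirroring the names used in the diff above.
bool tests_started = false;
int times_to_fail = 0;

// Fail (return null) on the call where the countdown is zero, but only
// once the test has opted in.  Before that, every request succeeds and
// the countdown is left untouched thanks to short-circuit evaluation.
void* allocate(std::size_t n)
{
  if (tests_started && !times_to_fail--)
    return nullptr;
  return std::malloc(n ? n : 1);
}
```

Any allocation made during static initialization, such as one from a library global, happens while `tests_started` is still false and therefore cannot trip the injected failure.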



Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-17 Thread Jonathan Wakely via Gcc-patches
On Thu, 17 Aug 2023 at 20:37, Jonathan Wakely  wrote:
>
> On Thu, 17 Aug 2023 at 19:59, Jonathan Wakely  wrote:
> >
> > On Thu, 17 Aug 2023 at 18:40, François Dumont  wrote:
> > >
> > >
> > > On 17/08/2023 19:22, Jonathan Wakely wrote:
> > > > On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
> > > >  wrote:
> > > >> Here is the fixed patch tested in all 3 modes:
> > > >>
> > > >> - _GLIBCXX_USE_DUAL_ABI
> > > >>
> > > >> - !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI
> > > >>
> > > >> - !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI
> > > >>
> > > >> I don't know what you have in mind for the change below but I wanted to
> > > >> let you know that I tried to put COW std::basic_string into a nested
> > > >> __cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
> > > >> string-inst.cc so I preferred the macro substitution approach.
> > > > I was thinking of implementing the necessary special members functions
> > > > of __cow_string directly, so they are ABI compatible with the COW
> > > > std::basic_string but don't actually reuse the code. That would mean
> > > > we don't need to compile and instantiate the whole COW string just to
> > > > use a few members from it. But that can be done later, the macro
> > > > approach seems OK for now.
> > >
> > > You'll see that when cow_string.h is included while
> > > _GLIBCXX_USE_CXX11_ABI == 1 then I am hiding a big part of the
> > > basic_string definition. Initially it was to avoid to have to include
> > > basic_string.tcc but it is also a lot of useless code indeed.
> > >
> > >
> > > >
> > > >> There are some test failing when !_GLIBCXX_USE_CXX11_ABI that are
> > > >> unrelated with my changes. I'll propose fixes in coming days.
> > > > Which tests? I run the entire testsuite with
> > > > -D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
> > > > failures.
> > > >
> > > > I'll review the patch ASAP, thanks for working on it.
> > > >
> > > So far the only issue I found are in the mode !_GLIBCXX_USE_DUAL_ABI &&
> > > !_GLIBCXX_USE_CXX11_ABI. They are:
> > >
> > > 23_containers/unordered_map/96088.cc
> > > 23_containers/unordered_multimap/96088.cc
> > > 23_containers/unordered_multiset/96088.cc
> > > 23_containers/unordered_set/96088.cc
> > > ext/debug_allocator/check_new.cc
> > > ext/malloc_allocator/check_new.cc
> > > ext/malloc_allocator/deallocate_local.cc
> > > ext/new_allocator/deallocate_local.cc
> > > ext/pool_allocator/allocate_chunk.cc
> > > ext/throw_allocator/deallocate_local.cc
> >
> > Ah yes, they fail for !USE_DUAL_ABI builds, I wonder why.
> >
> > /home/test/src/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/96088.
> > cc:44: void test01(): Assertion '__gnu_test::counter::count() == 3' failed.
> > FAIL: 23_containers/unordered_map/96088.cc execution test
>
> It's due to this global object in src/c++20/tzdb.cc:
> 1081const string tzdata_file = "/tzdata.zi";
>
> When the library uses COW strings that requires an allocation before
> main, which uses the replacement operator new in the tests, which
> fails to allocate. For example, in 22_locale/locale/cons/12352.cc we
> have this function used by operator new:
>
> int times_to_fail = 0;
>
> void* allocate(std::size_t n)
> {
>   if (!times_to_fail--)
> return 0;
>
> The counter is initially zero, so if we try to allocate before it gets
> set to a non-zero value in test01() then we fail.
>
> The test should not assume no allocations before main() begins. The
> simplest way to do that is with another global that says "we have
> started testing" e.g.
>
> --- a/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
> +++ b/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
> @@ -26,11 +26,12 @@
>  #include 
>  #include 
>
> +bool tests_started = false;
>  int times_to_fail = 0;
>
>  void* allocate(std::size_t n)
>  {
> -  if (!times_to_fail--)
> +  if (tests_started && !times_to_fail--)
>  return 0;
>
>void* ret = std::malloc(n ? n : 1);
> @@ -106,6 +107,8 @@ void operator delete[](void* p, const
> std::nothrow_t&) throw()
>  // libstdc++/12352
>  void test01(int iters)
>  {
> +  tests_started = true;
> +
>for (int j = 0; j < iters; ++j)
>  {
>for (int i = 0; i < 100; ++i)
>
>
> This way the replacement operator new doesn't start intentionally
> failing until we ask it to do so.

I'll replace the global std::string objects with std::string_view
objects, so that they don't allocate even if the library only uses COW
strings.

We should still fix those tests though.
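
A minimal sketch of the two ideas discussed above, not the actual testsuite or tzdb.cc code: a replacement operator new that only starts failing once a `tests_started` flag is set, and a global `std::string_view` that refers to a string literal and therefore never allocates. The names `allocations_before_tests` and `first_allocation_fails` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string_view>

bool tests_started = false;        // set by the test once it begins
int times_to_fail = 0;             // countdown until the forced failure
int allocations_before_tests = 0;  // hypothetical counter, for illustration

// Replacement operator new: allocations made before the test starts
// (for example by global constructors in the library) always succeed.
void* operator new(std::size_t n)
{
  if (!tests_started)
    ++allocations_before_tests;
  else if (!times_to_fail--)
    throw std::bad_alloc();
  if (void* p = std::malloc(n ? n : 1))
    return p;
  throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// A string_view global points at the literal, so constructing it needs
// no allocation, whichever std::string ABI the library uses.
constexpr std::string_view tzdata_file = "/tzdata.zi";
static_assert(tzdata_file.size() == 10, "no allocation needed to know this");

// Hypothetical helper: returns true if the first allocation made after
// the test starts is the one that fails.
bool first_allocation_fails()
{
  tests_started = true;
  times_to_fail = 0;
  try
    {
      void* p = ::operator new(1);
      ::operator delete(p);
    }
  catch (const std::bad_alloc&)
    {
      return true;
    }
  return false;
}
```

With the flag in place, any allocation the library performs before main() is merely counted, and the intentional failure is delivered to the code under test.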



Re: [PATCH] improve error for when /usr/include isn't found [PR90835]

2023-08-17 Thread Iain Sandoe
Hi Eric,

thanks for working on this.

> On 17 Aug 2023, at 20:35, Eric Gallager  wrote:
> 
> This is a pretty simple patch that ought to help Darwin users understand
> better why their build is failing when they forget to pass the
> --with-sysroot= flag to configure.
> 
> gcc/ChangeLog:
> 
>PR target/90835
>* Makefile.in: improve error message when /usr/include is
>missing

1. The main issue with this approach is that the error does not happen until
   after the user has waited for the whole of the stage 1 build.

   (I had in mind that top-level configure can identify that the platform
   is Darwin, and that there is no sysroot configured;
     then [for bootstrap] complain if there is no /usr/include,
     else [for non-bootstrap] complain always.)

   - this would mean that the failure occurs at initial configure time.

2. if we went with this patch as an incremental improvement:

+ case ${build_os} in \
+   darwin*) \
+ echo "(on darwin this usually means you need to pass the --with-sysroot flag to configure to point it to where the system headers are actually put)" >&2; \

I think we need to put this in terms that relate to the system and things the
user can find, so:
“on Darwin this usually means you need to pass the --with-sysroot= flag to
point to a valid MacOS SDK”

(In practice, the headers cause the first failure, but we also need to find the
libraries when linking.)

Iain





Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Siddhesh Poyarekar

On 2023-08-17 15:27, Qing Zhao wrote:

>> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for
>> OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not
>> sure if that's the case for maximum size though, my gut says it isn't.
>
> So, the major purpose for adding the early object size phase is for computing
> SUBobjects size more precisely before the subobject information lost?

I suppose it's more about being able to do it at all, rather than precision.

> Then, I think whatever MIN or MAX, the early phase has more precise information
> than the later phase, we should use its result if it’s NOT UNKNOWN?


We can't be sure about that though, can we?  For example for something 
like this:


struct S
{
  int a;
  char b[10];
  int c;
};

size_t foo (struct S *s)
{
  return __builtin_object_size (s->b, 1);
}

size_t bar ()
{
  struct S *in = malloc (8);

  return foo (in);
}

returns 10 for __builtin_object_size in early_objsz but when it sees the 
malloc in the later objsz pass, it returns 4:


$ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
...
foo:
.LFB0:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...
bar:
.LFB1:
	.cfi_startproc
	movl	$4, %eax
	ret
	.cfi_endproc
...

In fact, this ends up returning the wrong result for OST_MINIMUM:

$ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
...
foo:
.LFB0:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...
bar:
.LFB1:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...

bar ought to have returned 4 too (and I'm betting the later objsz must 
have seen that) but it got overridden by the earlier estimate of 10.


We probably need smarter heuristics for choosing between the estimates of
the early_objsz and late objsz passes; each by itself isn't good enough
for subobjects.


Thanks,
Sid
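
The two numbers in the example above can be reproduced with plain arithmetic. This is only a sketch of where 10 and 4 come from, not GCC's implementation: `early_estimate` and `late_estimate` are hypothetical names, and `size_for_offset` here is a simplified namesake of the GCC helper quoted elsewhere in this thread. It assumes the usual layout in which `b` directly follows `a`.

```cpp
#include <cstddef>

struct S
{
  int a;
  char b[10];
  int c;
};

// Bytes remaining in an object of object_size bytes, counted from offset.
constexpr std::size_t size_for_offset(std::size_t object_size,
                                      std::size_t offset)
{ return offset < object_size ? object_size - offset : 0; }

// What the early pass can see: only the declared type of the subobject.
constexpr std::size_t early_estimate()
{ return sizeof(S::b); }

// What the late pass can see: the allocation size, minus the offset of
// the subobject within the (possibly under-allocated) struct.
constexpr std::size_t late_estimate(std::size_t alloc_size)
{ return size_for_offset(alloc_size, offsetof(S, b)); }
```

For `malloc (8)` with a 4-byte `int`, `late_estimate(8)` is 8 - 4 = 4, while `early_estimate()` is 10 whatever the allocation size was.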


Re: Another bug for __builtin_object_size? (Or expected behavior)

2023-08-17 Thread Qing Zhao via Gcc-patches


> On Aug 17, 2023, at 3:59 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-17 15:27, Qing Zhao wrote:
>>> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for 
>>> OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not 
>>> sure if that's the case for maximum size though, my gut says it isn't.
>> So, the major purpose for adding the early object size phase is for 
>> computing SUBobjects size more precisely before the subobject information 
>> lost?
> 
> I suppose it's more about being able to do it at all, rather than precision.

Without the subobject information in IR, our later object size phase uses the 
whole object size as an estimation as it currently does. -:)
> 
>> Then, I think whatever MIN or MAX, the early phase has more precise 
>> information than the later phase, we should use its result if it’s NOT 
>> UNKNOWN?
> 
> We can't be sure about that though, can we?  For example for something like 
> this:
> 
> struct S
> {
>  int a;
>  char b[10];
>  int c;
> };
> 
> size_t foo (struct S *s)
> {
>  return __builtin_object_size (s->b, 1);
> }
> 
> size_t bar ()
> {
>  struct S *in = malloc (8);
> 
>  return foo (in);
> }
> 
> returns 10 for __builtin_object_size in early_objsz but when it sees the 
> malloc in the later objsz pass, it returns 4:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
>   .cfi_startproc
>   movl    $10, %eax
>   ret
>   .cfi_endproc
> ...
> bar:
> .LFB1:
>   .cfi_startproc
>   movl    $4, %eax
>   ret
>   .cfi_endproc
> ...
> 
> In fact, this ends up returning the wrong result for OST_MINIMUM:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
>   .cfi_startproc
>   movl    $10, %eax
>   ret
>   .cfi_endproc
> ...
> bar:
> .LFB1:
>   .cfi_startproc
>   movl    $10, %eax
>   ret
>   .cfi_endproc
> ...
> 
> bar ought to have returned 4 too (and I'm betting the later objsz must have 
> seen that) but it got overridden by the earlier estimate of 10.

Okay, I see. 

Then is this similar to the issue we discussed previously?  (As follows:)

"
> Hi, Sid and Jakub,
> I have a question in the following source portion of the routine 
> “addr_object_size” of gcc/tree-object-size.cc:
>  743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>  744   if (bytes != error_mark_node)
>  745 {
>  746   bytes = size_for_offset (var_size, bytes);
>  747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
> MEM_REF)
>  748 {
>  749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
>  750pt_var);
>  751   if (bytes2 != error_mark_node)
>  752 {
>  753   bytes2 = size_for_offset (pt_var_size, bytes2);
>  754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>  755 }
>  756 }
>  757 }
> At line 754, why do we always use “MIN_EXPR”, whether it’s for OST_MINIMUM or
> not?
> Shall we use
> (object_size_type & OST_MINIMUM
> ? MIN_EXPR : MAX_EXPR)

That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like this:

typedef struct
{
 int a;
} A;

size_t f()
{
 A *p = malloc (1);

 return __builtin_object_size (p, 0);
}

where the returned size should be 1 and not sizeof (int).  The mode doesn't 
really matter in this case.
"

If this is the same issue, I think we can use the same solution: always use
MIN_EXPR.
What do you think?

Qing

> 
> We probably need smarter heuristics for choosing between the estimates of the
> early_objsz and late objsz passes; each by itself isn't good enough for
> subobjects.
> 
> Thanks,
> Sid
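
The "always take the minimum" proposal can be modelled as below. This is a hypothetical sketch, not code from tree-object-size.cc: `Mode`, `unknown` and `combine` are names invented here. The one detail it relies on is documented behaviour of `__builtin_object_size`: an unknown size is reported as `(size_t) -1` for the maximum modes (types 0 and 1) and as 0 for the minimum modes (types 2 and 3), so a plain minimum would let "unknown" win in the minimum modes.

```cpp
#include <algorithm>
#include <cstddef>

enum class Mode { Maximum, Minimum };

// Sentinel that __builtin_object_size uses for "don't know" in each mode.
constexpr std::size_t unknown(Mode m)
{ return m == Mode::Maximum ? static_cast<std::size_t>(-1) : 0; }

// Combine the early (subobject-aware) and late (allocation-aware)
// estimates: ignore an unknown estimate, otherwise keep the smaller one.
constexpr std::size_t combine(Mode m, std::size_t early, std::size_t late)
{
  if (early == unknown(m))
    return late;
  if (late == unknown(m))
    return early;
  return std::min(early, late);
}
```

For the `bar` example, `combine` yields 4 in both modes from the estimates 10 and 4. Note that in the minimum modes this model cannot tell a genuine size of 0 from "unknown", which is one reason a real heuristic may need more information than two numbers.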



[committed] libstdc++: Define std::string::resize_and_overwrite for C++11 and COW string

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

There are several places in the library where we can improve performance
using resize_and_overwrite so it's inconvenient only being able to use
it in C++23 mode, and only for cxx11 strings. This adds it for COW
strings, and also adds __resize_and_overwrite as an extension for C++11
mode.

The new __resize_and_overwrite is available for C++11 and later, so
within the library we can use that consistently even in C++23.  In order
to avoid making a copy (which might not be possible for non-copyable,
non-movable types) the callable is passed to resize_and_overwrite as an
lvalue reference.  Unlike wrapping it in std::ref(op) this ensures that
invoking it as std::move(op)(p, n) will use the correct value category.
It also avoids any overhead that would be added by wrapping it in a
lambda like [&op](auto p, auto n) { return std::move(op)(p, n); }.

Adjust std::format to use the new __resize_and_overwrite, which we can
assume exists because we only use std::basic_string<char> and
std::basic_string<wchar_t>, so no program-defined specializations.

The uses in <experimental/internet> cannot be replaced, because those
are type-dependent on an Allocator template parameter, which could mean
they use program-defined specializations of std::basic_string that don't
have the __resize_and_overwrite extension.
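
The value-category point made above can be checked in isolation. The sketch below is not the libstdc++ code: `Op`, `call_like_resize_and_overwrite`, `via_lvalue_ref` and `via_ref_wrapper` are hypothetical names. It shows that instantiating a by-value callable parameter with an lvalue reference type avoids a copy and still lets `std::move(op)(p, n)` pick the rvalue-qualified call operator, whereas `std::ref` always invokes the target as an lvalue.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// A non-copyable, non-movable callable whose result reveals which
// ref-qualified overload was chosen.
struct Op
{
  Op() = default;
  Op(const Op&) = delete;

  int operator()(char*, std::size_t) &  { return 1; }
  int operator()(char*, std::size_t) && { return 2; }
};

// Stand-in for a function that takes the operation by value and invokes
// it as an rvalue, as resize_and_overwrite is specified to do.
template<typename F>
int call_like_resize_and_overwrite(F op)
{ return std::move(op)(static_cast<char*>(nullptr), std::size_t(0)); }

// F = Op&: no copy is made, and std::move(op) is an rvalue of type Op.
int via_lvalue_ref(Op& op)
{ return call_like_resize_and_overwrite<Op&>(op); }

// F = std::reference_wrapper<Op>: the wrapper calls the target as an lvalue.
int via_ref_wrapper(Op& op)
{ return call_like_resize_and_overwrite(std::ref(op)); }
```

Here `via_lvalue_ref` selects the `&&` overload and `via_ref_wrapper` selects the `&` overload, which is the difference the commit message describes.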

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (__resize_and_overwrite): New
function.
* include/bits/basic_string.tcc (__resize_and_overwrite): New
function.
(resize_and_overwrite): Simplify by using reserve instead of
growing the string manually. Adjust for C++11 compatibility.
* include/bits/cow_string.h (resize_and_overwrite): New
function.
(__resize_and_overwrite): New function.
* include/bits/version.def (__cpp_lib_string_resize_and_overwrite):
Do not depend on cxx11abi.
* include/bits/version.h: Regenerate.
* include/std/format (__formatter_fp::_S_resize_and_overwrite):
Remove.
(__formatter_fp::format, __formatter_fp::_M_localize): Use
__resize_and_overwrite instead of _S_resize_and_overwrite.
* 
testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite.cc:
Adjust for C++11 compatibility when included by ...
* 
testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite_ext.cc:
New test.
---
 libstdc++-v3/include/bits/basic_string.h  |   7 ++
 libstdc++-v3/include/bits/basic_string.tcc|  54 ++
 libstdc++-v3/include/bits/cow_string.h|  90 
 libstdc++-v3/include/bits/version.def |   1 -
 libstdc++-v3/include/bits/version.h   |   2 +-
 libstdc++-v3/include/std/format   |  18 +---
 .../capacity/char/resize_and_overwrite.cc | 101 ++
 .../capacity/char/resize_and_overwrite_ext.cc |   6 ++
 8 files changed, 217 insertions(+), 62 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite_ext.cc

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index c68e6171aba..e6f94640150 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -1151,6 +1151,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
resize_and_overwrite(size_type __n, _Operation __op);
 #endif
 
+#if __cplusplus >= 201103L
+  /// Non-standard version of resize_and_overwrite for C++11 and above.
+  template<typename _Operation>
+   _GLIBCXX20_CONSTEXPR void
+   __resize_and_overwrite(size_type __n, _Operation __op);
+#endif
+
   /**
*  Returns the total number of characters that the %string can hold
*  before needing to allocate more memory.
diff --git a/libstdc++-v3/include/bits/basic_string.tcc 
b/libstdc++-v3/include/bits/basic_string.tcc
index c759c2f9525..104a517f794 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -561,44 +561,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __n;
 }
 
-#if __cplusplus > 202002L
+#ifdef __cpp_lib_string_resize_and_overwrite // C++ >= 23
   template<typename _CharT, typename _Traits, typename _Alloc>
   template<typename _Operation>
+[[__gnu__::__always_inline__]]
 constexpr void
 basic_string<_CharT, _Traits, _Alloc>::
-resize_and_overwrite(const size_type __n, _Operation __op)
-{
-  const size_type __capacity = capacity();
-  _CharT* __p;
-  if (__n > __capacity)
-   {
- auto __new_capacity = __n; // Must not allow _M_create to modify __n.
- __p = _M_create(__new_capacity, __capacity);
- this->_S_copy(__p, _M_data(), length()); // exclude trailing null
-#if __cpp_lib_is_constant_evaluated
- if (std::is_constant_evaluated())
-   traits_type::assign(__p + length(), __n - length(), _CharT());
+__resize_and_overwrite(const size_type __n, _Operation __op)
+{ resize_and_overwrite<_Operation&>(__n, __op); }
+#e

[committed] libstdc++: Optimize std::to_string using std::string::resize_and_overwrite

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This uses std::string::__resize_and_overwrite to avoid initializing the
string buffer with characters that are immediately overwritten. This
results in about 6% better performance for the std_to_string case in
int-benchmark.cc from https://github.com/fmtlib/format-benchmark

This requires a change to a testcase. The previous implementation
guaranteed that the string returned from std::to_string(integral-type)
would have no excess capacity, because it was constructed with the
correct length. The new implementation constructs an empty string and
then resizes it with resize_and_overwrite, which over-allocates. This
means that the "no-excess capacity" guarantee no longer holds.

We can also greatly improve the performance of std::to_wstring by using
std::to_string and then widening it with a new helper function, instead
of using std::swprintf to do the formatting.
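
As a rough illustration of the approach described above (not the libstdc++ implementation, which uses internal helpers such as __to_chars_10_impl), the sketch below formats an unsigned value straight into the string's buffer and then widens the result character by character. `unsigned_to_string` and `widen_numeric` are hypothetical names; the code uses the standard `resize_and_overwrite` when the feature-test macro says it is available and falls back to `resize` otherwise.

```cpp
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>

// Format val without first filling the buffer with placeholder characters
// (when resize_and_overwrite is available).
std::string unsigned_to_string(unsigned val)
{
  // Upper bound on the number of decimal digits of an unsigned value.
  constexpr std::size_t max_len = std::numeric_limits<unsigned>::digits10 + 1;
  std::string s;
#ifdef __cpp_lib_string_resize_and_overwrite
  s.resize_and_overwrite(max_len, [val](char* p, std::size_t n) {
    // Return the number of characters actually written.
    return static_cast<std::size_t>(std::to_chars(p, p + n, val).ptr - p);
  });
#else
  s.resize(max_len);
  char* const p = s.data();
  s.resize(static_cast<std::size_t>(std::to_chars(p, p + max_len, val).ptr - p));
#endif
  return s;
}

// Widen a string known to contain only digits, signs and similar basic
// characters; no locale or ctype facet is involved.
std::wstring widen_numeric(const std::string& s)
{
  std::wstring w;
  w.reserve(s.size());
  for (char c : s)
    w.push_back(static_cast<wchar_t>(c));
  return w;
}
```

Because the buffer is requested at the maximum length and then trimmed, the returned string can have spare capacity, which matches the dropped "no excess capacity" guarantee mentioned above.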

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (to_string(integral-type)): Use
resize_and_overwrite when available.
(__to_wstring_numeric): New helper functions.
(to_wstring): Use std::to_string then __to_wstring_numeric.
* 
testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc:
Remove check for no excess capacity.
---
 libstdc++-v3/include/bits/basic_string.h  | 173 +-
 .../numeric_conversions/char/to_string_int.cc |   2 -
 2 files changed, 123 insertions(+), 52 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index e6f94640150..46326d02597 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -4197,8 +4197,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 const bool __neg = __val < 0;
 const unsigned __uval = __neg ? (unsigned)~__val + 1u : __val;
 const auto __len = __detail::__to_chars_len(__uval);
-string __str(__neg + __len, '-');
-__detail::__to_chars_10_impl(&__str[__neg], __len, __uval);
+string __str;
+__str.__resize_and_overwrite(__neg + __len, [=](char* __p, size_t __n) {
+  __p[0] = '-';
+  __detail::__to_chars_10_impl(__p + (int)__neg, __len, __uval);
+  return __n;
+});
 return __str;
   }
 
@@ -4209,8 +4213,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   noexcept // any 32-bit value fits in the SSO buffer
 #endif
   {
-string __str(__detail::__to_chars_len(__val), '\0');
-__detail::__to_chars_10_impl(&__str[0], __str.size(), __val);
+const auto __len = __detail::__to_chars_len(__val);
+string __str;
+__str.__resize_and_overwrite(__len, [__val](char* __p, size_t __n) {
+  __detail::__to_chars_10_impl(__p, __n, __val);
+  return __n;
+});
 return __str;
   }
 
@@ -4224,8 +4232,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 const bool __neg = __val < 0;
 const unsigned long __uval = __neg ? (unsigned long)~__val + 1ul : __val;
 const auto __len = __detail::__to_chars_len(__uval);
-string __str(__neg + __len, '-');
-__detail::__to_chars_10_impl(&__str[__neg], __len, __uval);
+string __str;
+__str.__resize_and_overwrite(__neg + __len, [=](char* __p, size_t __n) {
+  __p[0] = '-';
+  __detail::__to_chars_10_impl(__p + (int)__neg, __len, __uval);
+  return __n;
+});
 return __str;
   }
 
@@ -4236,8 +4248,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   noexcept // any 32-bit value fits in the SSO buffer
 #endif
   {
-string __str(__detail::__to_chars_len(__val), '\0');
-__detail::__to_chars_10_impl(&__str[0], __str.size(), __val);
+const auto __len = __detail::__to_chars_len(__val);
+string __str;
+__str.__resize_and_overwrite(__len, [__val](char* __p, size_t __n) {
+  __detail::__to_chars_10_impl(__p, __n, __val);
+  return __n;
+});
 return __str;
   }
 
@@ -4249,8 +4265,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 const unsigned long long __uval
   = __neg ? (unsigned long long)~__val + 1ull : __val;
 const auto __len = __detail::__to_chars_len(__uval);
-string __str(__neg + __len, '-');
-__detail::__to_chars_10_impl(&__str[__neg], __len, __uval);
+string __str;
+__str.__resize_and_overwrite(__neg + __len, [=](char* __p, size_t __n) {
+  __p[0] = '-';
+  __detail::__to_chars_10_impl(__p + (int)__neg, __len, __uval);
+  return __n;
+});
 return __str;
   }
 
@@ -4258,8 +4278,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   inline string
   to_string(unsigned long long __val)
   {
-string __str(__detail::__to_chars_len(__val), '\0');
-__detail::__to_chars_10_impl(&__str[0], __str.size(), __val);
+const auto __len = __detail::__to_chars_len(__val);
+string __str;
+__str.__resize_and_overwrite(__len, [__val](char* __p, size_t __n) {
+  __detail::__to_chars_10_impl(__p, __n, __val);
+  return __n;
+});
 return __str;
   }
 
@@ -4335,80 +4359,129 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   inline long

[committed] libstdc++: Fix -Wunused-parameter in <experimental/internet>

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/experimental/internet (address_v4::to_string): Remove
unused parameter name.
---
 libstdc++-v3/include/experimental/internet | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index bd9a05f12aa..173913a8cec 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -252,7 +252,7 @@ namespace ip
   __string_with<_Allocator>
   to_string(const _Allocator& __a = _Allocator()) const
   {
-   auto __write = [__addr = to_uint()](char* __p, size_t __n) {
+   auto __write = [__addr = to_uint()](char* __p, size_t) {
  auto __to_chars = [](char* __p, uint8_t __v) {
unsigned __n = __v >= 100u ? 3 : __v >= 10u ? 2 : 1;
std::__detail::__to_chars_10_impl(__p, __n, __v);
-- 
2.41.0



[committed] libstdc++: Implement std::to_string in terms of std::format (P2587R3)

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This change for C++26 affects std::to_string for floating-point
arguments, so that they should be formatted using std::format("{}", v)
instead of using sprintf. The modified specification in the standard
also affects integral arguments, but there's no observable difference
for them, and we already use std::to_chars for them anyway.

To avoid <string> depending on all of <format>, this change actually
just uses std::to_chars directly instead of using std::format. This is
equivalent, because the format spec "{}" doesn't use any of the other
features of std::format.
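
A simplified sketch of the same retry strategy, written with plain `resize` so that it does not need C++23 or C++26: start with a small buffer, call `std::to_chars`, and double the length if the value does not fit. `shortest_to_string` is a hypothetical name, and the sketch assumes a standard library that provides the floating-point overloads of `std::to_chars`.

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Shortest round-trip representation of val, as produced by
// std::to_chars with no format argument.
std::string shortest_to_string(double val)
{
  std::string s;
  std::size_t len = 15;
  for (;;)
    {
      s.resize(len);
      auto [end, ec] = std::to_chars(s.data(), s.data() + s.size(), val);
      if (ec == std::errc{})
        {
          // Trim to the characters actually written.
          s.resize(static_cast<std::size_t>(end - s.data()));
          return s;
        }
      len *= 2;  // buffer too small, try again with twice the space
    }
}
```

The observable difference from the sprintf-based behaviour is that 0.5 becomes "0.5" rather than "0.500000", and values that need many digits are not truncated to six decimal places.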

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (to_string(floating-point-type)):
Implement using std::to_chars for C++26.
* include/bits/version.def (__cpp_lib_to_string): Define.
* include/bits/version.h: Regenerate.
* testsuite/21_strings/basic_string/numeric_conversions/char/dr1261.cc:
Adjust expected result in C++26 mode.
* 
testsuite/21_strings/basic_string/numeric_conversions/char/to_string.cc:
Likewise.
* 
testsuite/21_strings/basic_string/numeric_conversions/wchar_t/dr1261.cc:
Likewise.
* 
testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring.cc:
Likewise.
* 
testsuite/21_strings/basic_string/numeric_conversions/char/to_string_float.cc:
New test.
* 
testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring_float.cc:
New test.
* testsuite/21_strings/basic_string/numeric_conversions/version.cc:
New test.
---
 libstdc++-v3/include/bits/basic_string.h  |  68 +++-
 libstdc++-v3/include/bits/version.def |  11 ++
 libstdc++-v3/include/bits/version.h   |  11 ++
 .../numeric_conversions/char/dr1261.cc|  11 +-
 .../numeric_conversions/char/to_string.cc |   9 +-
 .../char/to_string_float.cc   | 148 ++
 .../numeric_conversions/version.cc|  18 +++
 .../numeric_conversions/wchar_t/dr1261.cc |  11 +-
 .../numeric_conversions/wchar_t/to_wstring.cc |   9 +-
 .../wchar_t/to_wstring_float.cc   | 145 +
 10 files changed, 429 insertions(+), 12 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_float.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/version.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring_float.cc

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index 46326d02597..f4bbf521bba 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -47,9 +47,14 @@
 # include 
 #endif
 
+#if __cplusplus > 202302L
+# include 
+#endif
+
 #define __glibcxx_want_constexpr_string
 #define __glibcxx_want_string_resize_and_overwrite
 #define __glibcxx_want_string_udls
+#define __glibcxx_want_to_string
 #include 
 
 #if ! _GLIBCXX_USE_CXX11_ABI
@@ -4185,6 +4190,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   { return std::stod(__str, __idx); }
 #endif
 
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // DR 1261. Insufficent overloads for to_string / to_wstring
 
   _GLIBCXX_NODISCARD
@@ -4287,7 +4293,65 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 return __str;
   }
 
-#if _GLIBCXX_USE_C99_STDIO
+#if __cpp_lib_to_string >= 202306L
+
+  [[nodiscard]]
+  inline string
+  to_string(float __val)
+  {
+string __str;
+size_t __len = 15;
+do {
+  __str.resize_and_overwrite(__len,
+[__val, &__len] (char* __p, size_t __n) {
+   auto [__end, __err] = std::to_chars(__p, __p + __n, __val);
+   if (__err == errc{}) [[likely]]
+ return __end - __p;
+   __len *= 2;
+   return __p - __p;;
+  });
+} while (__str.empty());
+return __str;
+  }
+
+  [[nodiscard]]
+  inline string
+  to_string(double __val)
+  {
+string __str;
+size_t __len = 15;
+do {
+  __str.resize_and_overwrite(__len,
+[__val, &__len] (char* __p, size_t __n) {
+   auto [__end, __err] = std::to_chars(__p, __p + __n, __val);
+   if (__err == errc{}) [[likely]]
+ return __end - __p;
+   __len *= 2;
+   return __p - __p;;
+  });
+} while (__str.empty());
+return __str;
+  }
+
+  [[nodiscard]]
+  inline string
+  to_string(long double __val)
+  {
+string __str;
+size_t __len = 15;
+do {
+  __str.resize_and_overwrite(__len,
+[__val, &__len] (char* __p, size_t __n) {
+   auto [__end, __err] = std::to_chars(__p, __p + __n, __val);
+   if (__err == errc{}) [[likely]]
+ return __end - __p;
+   __len *= 2;
+   return __p - __p;;
+  });
+} while (__str.empty());
+return __str;
+  }
+#elif _GLIBCXX_USE_C99_STDIO
   // NB: (v)

[committed] libstdc++: Optimize std::string::assign(Iter, Iter) [PR110945]

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Calling string::assign(Iter, Iter) with "foreign" iterators (not the
string's own iterator or pointer types) currently constructs a temporary
string and then calls replace to copy the characters from it. That means
we copy from the iterators twice, and if the replace operation has to
grow the string then we also allocate twice.

By using *this = basic_string(first, last, get_allocator()) we only
perform a single allocation+copy and then do a cheap move assignment
instead of a second copy (and possible allocation). But that alternative
has to be done conditionally, so that we don't pessimize the native
iterator case (the string's own iterator and pointer types) which
currently selects efficient overloads of replace which will not allocate
at all if the string already has sufficient capacity. For C++20 we can
extend that efficient case to work for any contiguous iterator with the
right value type, not just for the string's native iterators.

So the change is to inline the code that decides whether to work in
place or to allocate+copy (instead of deciding that via overload
resolution for replace), and for the allocate+copy case do a move
assignment instead of another call to replace.

For C++98 there is no change, as we can't do an efficient move
assignment anyway, so keep the current code.

We can also simplify assign(initializer_list) because the backing
array for an initializer_list is always disjoint from *this, so most of
the code in _M_replace is not needed.

libstdc++-v3/ChangeLog:

PR libstdc++/110945
* include/bits/basic_string.h (basic_string::assign(Iter, Iter)):
Dispatch to _M_replace or move assignment from a temporary,
based on the iterator type.
---
 libstdc++-v3/include/bits/basic_string.h | 42 +---
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index f4bbf521bba..09fd62afa66 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -1711,15 +1711,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  Sets value of string to characters in the range [__first,__last).
   */
 #if __cplusplus >= 201103L
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions"
   template>
_GLIBCXX20_CONSTEXPR
+   basic_string&
+   assign(_InputIterator __first, _InputIterator __last)
+   {
+#if __cplusplus >= 202002L
+ if constexpr (contiguous_iterator<_InputIterator>
+ && is_same_v, _CharT>)
+#else
+ if constexpr (__is_one_of<_InputIterator, const_iterator, iterator,
+   const _CharT*, _CharT*>::value)
+#endif
+   {
+ __glibcxx_requires_valid_range(__first, __last);
+ return _M_replace(size_type(0), size(),
+   std::__to_address(__first), __last - __first);
+   }
+ else
+   return *this = basic_string(__first, __last, get_allocator());
+   }
+#pragma GCC diagnostic pop
 #else
   template
+   basic_string&
+   assign(_InputIterator __first, _InputIterator __last)
+   { return this->replace(begin(), end(), __first, __last); }
 #endif
-basic_string&
-assign(_InputIterator __first, _InputIterator __last)
-{ return this->replace(begin(), end(), __first, __last); }
 
 #if __cplusplus >= 201103L
   /**
@@ -1730,7 +1751,20 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _GLIBCXX20_CONSTEXPR
   basic_string&
   assign(initializer_list<_CharT> __l)
-  { return this->assign(__l.begin(), __l.size()); }
+  {
+   // The initializer_list array cannot alias the characters in *this
+   // so we don't need to use replace to that case.
+   const size_type __n = __l.size();
+   if (__n > capacity())
+ *this = basic_string(__l.begin(), __l.end(), get_allocator());
+   else
+ {
+   if (__n)
+ _S_copy(_M_data(), __l.begin(), __n);
+   _M_set_length(__n);
+ }
+   return *this;
+  }
 #endif // C++11
 
 #if __cplusplus >= 201703L
-- 
2.41.0



[committed] libstdc++: Rework std::format support for wchar_t

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This changes how std::format creates wide strings, by replacing uses of
std::ctype::widen with the recently-added __to_wstring_numeric
helper function. This removes the dependency on the locale, which should
only be used for locale-specific formats such as {:Ld}.

Also disable all the wide string formatting support if the
_GLIBCXX_USE_WCHAR_T macro is not defined. This is consistent with other
wchar_t support being disabled if the library is built without that
macro defined.

libstdc++-v3/ChangeLog:

* include/std/format [_GLIBCXX_USE_WCHAR_T]: Guard all wide
string formatters with this macro.
(__formatter_int::_M_format_int, __formatter_fp::format)
(formatter::format): Use __to_wstring_numeric
instead of std::ctype::widen.
(__formatter_fp::_M_localize): Use hardcoded wchar_t values
instead of std::ctype::widen.
* testsuite/std/format/functions/format.cc: Add more checks for
wstring formatting of arithmetic types.
---
 libstdc++-v3/include/std/format   | 108 --
 .../testsuite/std/format/functions/format.cc  |  10 ++
 2 files changed, 82 insertions(+), 36 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 0d7d3d16420..79f810acce3 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -79,8 +79,10 @@ namespace __format
 
   using format_context
 = basic_format_context<__format::_Sink_iter, char>;
+#ifdef _GLIBCXX_USE_WCHAR_T
   using wformat_context
 = basic_format_context<__format::_Sink_iter, wchar_t>;
+#endif
 
   // [format.args], class template basic_format_args
   template class basic_format_args;
@@ -118,9 +120,11 @@ namespace __format
   template
 using format_string = basic_format_string...>;
 
+#ifdef _GLIBCXX_USE_WCHAR_T
   template
 using wformat_string
   = basic_format_string...>;
+#endif
 
   // [format.formatter], formatter
 
@@ -181,7 +185,9 @@ namespace __format
   // [format.parse.ctx], class template basic_format_parse_context
   template class basic_format_parse_context;
   using format_parse_context = basic_format_parse_context;
+#ifdef _GLIBCXX_USE_WCHAR_T
   using wformat_parse_context = basic_format_parse_context;
+#endif
 
   template
 class basic_format_parse_context
@@ -745,8 +751,13 @@ namespace __format
 bool _M_hasval = false;
   };
 
+#ifdef _GLIBCXX_USE_WCHAR_T
   template
 concept __char = same_as<_CharT, char> || same_as<_CharT, wchar_t>;
+#else
+  template
+concept __char = same_as<_CharT, char>;
+#endif
 
   template<__char _CharT>
 struct __formatter_str
@@ -1125,26 +1136,20 @@ namespace __format
{
  size_t __width = _M_spec._M_get_width(__fc);
 
- _Optional_locale __loc;
-
  basic_string_view<_CharT> __str;
  if constexpr (is_same_v)
__str = __narrow_str;
  else
{
- __loc = __fc.locale();
- auto& __ct = use_facet>(__loc.value());
  size_t __n = __narrow_str.size();
  auto __p = (_CharT*)__builtin_alloca(__n * sizeof(_CharT));
- __ct.widen(__narrow_str.data(), __narrow_str.data() + __n, __p);
+ __to_wstring_numeric(__narrow_str.data(), __n, __p);
  __str = {__p, __n};
}
 
  if (_M_spec._M_localized)
{
- if constexpr (is_same_v)
-   __loc = __fc.locale();
- const auto& __l = __loc.value();
+ const auto& __l = __fc.locale();
  if (__l.name() != "C")
{
  auto& __np = use_facet>(__l);
@@ -1612,35 +1617,19 @@ namespace __format
}
}
 
- // TODO move everything below to a new member function that
- // doesn't depend on _Fp type.
-
-
- _Optional_locale __loc;
  basic_string<_CharT> __wstr;
  basic_string_view<_CharT> __str;
  if constexpr (is_same_v<_CharT, char>)
__str = __narrow_str;
  else
{
- __loc = __fc.locale();
- auto& __ct = use_facet>(__loc.value());
- const char* __data = __narrow_str.data();
- auto __overwrite = [&__data, &__ct](_CharT* __p, size_t __n)
- {
-   __ct.widen(__data, __data + __n, __p);
-   return __n;
- };
- _S_resize_and_overwrite(__wstr, __narrow_str.size(), __overwrite);
+ __wstr = std::__to_wstring_numeric(__narrow_str);
  __str = __wstr;
}
 
  if (_M_spec._M_localized)
{
- if constexpr (is_same_v)
-   __wstr = _M_localize(__str, __expc, __fc.locale());
- else
-   __wstr = _M_localize(__str, __expc, __loc.value());
+ __wstr = _M_localize(__str, __expc, __fc.locale());
  if (

[committed] libstdc++: Simplify chrono::__units_suffix using std::format

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk. Backport to gcc-13 to follow.

-- >8 --

For std::chrono formatting we can simplify __units_suffix by using
std::format_to to generate the "[n/m]s" suffix with the correct
character type and write directly to the output iterator, so it doesn't
need to be widened using ctype. We can't remove the use of ctype::widen
for formatting a time zone abbreviation as a wide string, because that
can contain arbitrary characters that can't be widened by
__to_wstring_numeric.

This also fixes a bug in the chrono formatter for %Z which created a
dangling wstring_view.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__units_suffix_misc): Remove.
(__units_suffix): Return a known suffix as string view, do not
write unknown suffixes to a buffer.
(__fmt_units_suffix): New function that formats the suffix using
std::format_to.
(operator<<, __chrono_formatter::_M_q): Use __fmt_units_suffix.
(__chrono_formatter::_M_Z): Correct lifetime of wstring.
---
 libstdc++-v3/include/bits/chrono_io.h | 84 +--
 1 file changed, 29 insertions(+), 55 deletions(-)

diff --git a/libstdc++-v3/include/bits/chrono_io.h 
b/libstdc++-v3/include/bits/chrono_io.h
index 84791d41fb1..05caa64fb7c 100644
--- a/libstdc++-v3/include/bits/chrono_io.h
+++ b/libstdc++-v3/include/bits/chrono_io.h
@@ -38,7 +38,7 @@
 #include <iomanip> // setw, setfill
 #include 
 
-#include 
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -69,34 +69,9 @@ namespace __detail
 #define _GLIBCXX_WIDEN_(C, S) ::std::chrono::__detail::_Widen(S, L##S)
 #define _GLIBCXX_WIDEN(S) _GLIBCXX_WIDEN_(_CharT, S)
 
-
-  // Write an arbitrary duration suffix into the buffer.
-  template<typename _Period>
-constexpr const char*
-__units_suffix_misc(char* __buf, size_t /* TODO check length? */) noexcept
-{
-  namespace __tc = std::__detail;
-  char* __p = __buf;
-  __p[0] = '[';
-  unsigned __nlen = __tc::__to_chars_len((uintmax_t)_Period::num);
-  __tc::__to_chars_10_impl(__p + 1, __nlen, (uintmax_t)_Period::num);
-  __p += 1 + __nlen;
-  if constexpr (_Period::den != 1)
-   {
- __p[0] = '/';
- unsigned __dlen = __tc::__to_chars_len((uintmax_t)_Period::den);
- __tc::__to_chars_10_impl(__p + 1, __dlen, (uintmax_t)_Period::den);
- __p += 1 + __dlen;
-   }
-  __p[0] = ']';
-  __p[1] = 's';
-  __p[2] = '\0';
-  return __buf;
-}
-
   template<typename _Period, typename _CharT>
-constexpr auto
-__units_suffix(char* __buf, size_t __n) noexcept
+constexpr basic_string_view<_CharT>
+__units_suffix() noexcept
 {
   // The standard say these are all narrow strings, which would need to
   // be widened at run-time when inserted into a wide stream. We use
@@ -134,7 +109,22 @@ namespace __detail
   _GLIBCXX_UNITS_SUFFIX(ratio<3600>,  "h")
   _GLIBCXX_UNITS_SUFFIX(ratio<86400>, "d")
 #undef _GLIBCXX_UNITS_SUFFIX
-  return __detail::__units_suffix_misc<_Period>(__buf, __n);
+   return {};
+}
+
+  template<typename _Period, typename _CharT, typename _Out>
+inline _Out
+__fmt_units_suffix(_Out __out) noexcept
+{
+  if (auto __s = __detail::__units_suffix<_Period, _CharT>(); __s.size())
+   return __format::__write(std::move(__out), __s);
+  else if constexpr (_Period::den == 1)
+   return std::format_to(std::move(__out), _GLIBCXX_WIDEN("[{}]s"),
+ (uintmax_t)_Period::num);
+  else
+   return std::format_to(std::move(__out), _GLIBCXX_WIDEN("[{}/{}]s"),
+ (uintmax_t)_Period::num,
+ (uintmax_t)_Period::den);
 }
 } // namespace __detail
 /// @endcond
@@ -149,14 +139,14 @@ namespace __detail
 operator<<(std::basic_ostream<_CharT, _Traits>& __os,
   const duration<_Rep, _Period>& __d)
 {
+  using _Out = ostreambuf_iterator<_CharT, _Traits>;
   using period = typename _Period::type;
-  char __buf[sizeof("[/]s") + 2 * numeric_limits<uintmax_t>::digits10];
   std::basic_ostringstream<_CharT, _Traits> __s;
   __s.flags(__os.flags());
   __s.imbue(__os.getloc());
   __s.precision(__os.precision());
   __s << __d.count();
-  __s << __detail::__units_suffix<period, _CharT>(__buf, sizeof(__buf));
+  __detail::__fmt_units_suffix<period, _CharT>(_Out(__s));
   __os << std::move(__s).str();
   return __os;
 }
@@ -1056,32 +1046,16 @@ namespace __format
   template<typename _Tp, typename _FormatContext>
typename _FormatContext::iterator
_M_q(const _Tp&, typename _FormatContext::iterator __out,
-_FormatContext& __ctx) const
+_FormatContext&) const
{
  // %q The duration's unit suffix
  if constexpr (!chrono::__is_duration_v<_Tp>)
__throw_format_error("format error: argument is not a duration");
  else
{
+ namespace __d = chrono::__detail;
  using period = typename _Tp::period;
- char __buf[sizeof("[/]s") + 2 * 
numeric

[committed] libstdc++: Micro-optimize construction of named std::locale

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This shaves about 100ns off the std::locale constructor for named
locales (which is only about 1% of the total time).

Using !*s instead of !strcmp(s, "") doesn't make any difference as GCC
optimizes that already even at -O1. !strcmp(s, "C") is optimized at -O2
so replacing that with s[0] == 'C' && s[1] == '\0' only matters for the
--enable-libstdcxx-debug builds. But !strcmp(s, "POSIX") always makes a
call to strcmp at any optimization level. We make that strcmp call,
maybe several times, for any locale name except for "C" (which will be
matched before we get to the check for "POSIX").

For most targets, locale names begin with a lowercase letter and the
only one that begins with 'P' is "POSIX". Replacing !strcmp(s, "POSIX")
with s[0] == 'P' && !strcmp(s+1, "OSIX") means that we avoid calling
strcmp unless the string really does match "POSIX".

Maybe more importantly, I find is_C_locale(s) easier to read than
strcmp(s, "C") == 0 || strcmp(s, "POSIX") == 0, and !is_C_locale(s)
easier to read than strcmp(s, "C") != 0 && strcmp(s, "POSIX") != 0.
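
To make the equivalence concrete, the sketch below puts the first-character
dispatch next to the plain two-strcmp check it replaces. The function
is_C_locale mirrors the one in the patch; is_C_locale_reference is a
hypothetical name added here only for comparison.

```cpp
#include <cstring>

// Returns true if s names the classic locale, i.e. "C" or "POSIX".
// s must be a non-null, null-terminated string.
inline bool
is_C_locale(const char* s)
{
  switch (s[0])
    {
    case 'C':
      return s[1] == '\0';
    case 'P':
      // strcmp is only reached for names that start with 'P'.
      return std::strcmp(s + 1, "OSIX") == 0;
    default:
      return false;
    }
}

// Hypothetical reference version: the check the patch replaces.
inline bool
is_C_locale_reference(const char* s)
{ return std::strcmp(s, "C") == 0 || std::strcmp(s, "POSIX") == 0; }
```

Both functions should agree for every input; the first one simply avoids the
library call for names such as "en_US.UTF-8" that cannot match.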

libstdc++-v3/ChangeLog:

* src/c++98/localename.cc (is_C_locale): New function.
(locale::locale(const char*)): Use is_C_locale.
---
 libstdc++-v3/src/c++98/localename.cc | 39 
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/src/c++98/localename.cc 
b/libstdc++-v3/src/c++98/localename.cc
index 25e6d966dca..68cb81d0709 100644
--- a/libstdc++-v3/src/c++98/localename.cc
+++ b/libstdc++-v3/src/c++98/localename.cc
@@ -36,24 +36,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using namespace __gnu_cxx;
 
+  static inline bool
+  is_C_locale(const char* s)
+  {
+switch (s[0])
+{
+case 'C':
+  return s[1] == '\0';
+case 'P':
+  return !std::strcmp(s+1, "OSIX");
+default:
+  return false;
+}
+  }
+
   locale::locale(const char* __s) : _M_impl(0)
   {
 if (__s)
   {
_S_initialize();
-   if (std::strcmp(__s, "C") == 0 || std::strcmp(__s, "POSIX") == 0)
+   if (is_C_locale(__s))
  (_M_impl = _S_classic)->_M_add_reference();
-   else if (std::strcmp(__s, "") != 0)
+   else if (*__s)
  _M_impl = new _Impl(__s, 1);
else
  {
// Get it from the environment.
char* __env = std::getenv("LC_ALL");
// If LC_ALL is set we are done.
-   if (__env && std::strcmp(__env, "") != 0)
+   if (__env && *__env)
  {
-   if (std::strcmp(__env, "C") == 0
-   || std::strcmp(__env, "POSIX") == 0)
+   if (is_C_locale(__env))
  (_M_impl = _S_classic)->_M_add_reference();
else
  _M_impl = new _Impl(__env, 1);
@@ -63,9 +76,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// LANG may set a default different from "C".
string __lang;
__env = std::getenv("LANG");
-   if (!__env || std::strcmp(__env, "") == 0
-   || std::strcmp(__env, "C") == 0
-   || std::strcmp(__env, "POSIX") == 0)
+   if (!__env || !*__env || is_C_locale(__env))
  __lang = "C";
else
  __lang = __env;
@@ -77,17 +88,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  for (; __i < _S_categories_size; ++__i)
{
  __env = std::getenv(_S_categories[__i]);
- if (__env && std::strcmp(__env, "") != 0
- && std::strcmp(__env, "C") != 0
- && std::strcmp(__env, "POSIX") != 0)
+ if (__env && *__env && !is_C_locale(__env))
break;
}
else
  for (; __i < _S_categories_size; ++__i)
{
  __env = std::getenv(_S_categories[__i]);
- if (__env && std::strcmp(__env, "") != 0
- && __lang != __env)
+ if (__env && *__env && __lang != __env)
break;
}
 
@@ -113,14 +121,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
__env = std::getenv(_S_categories[__i]);
__str += _S_categories[__i];
-   if (!__env || std::strcmp(__env, "") == 0)
+   if (!__env || !*__env)
  {
__str += '=';
__str += __lang;
__str += ';';
  }
-   else if (std::strcmp(__env, "C") == 0
-|| std::strcmp(__env, "POSIX") == 0)
+   else if (is_C_locale(__env))
  __str += "=C;";
else
  {
-- 

[committed] libstdc++: Make __cmp_cat::__unseq constructor consteval

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk. Probably good to backport.

-- >8 --

This constructor should only ever be used with a literal 0 as the
argument, so we can make it consteval. This has the nice advantage that
it is expanded immediately in the front end, and so GDB will never step
into the __cmp_cat::__unspec::__unspec(__unspec*) constructor that is
uninteresting and probably confusing to users.
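
The idea can be sketched outside the library as follows. All names here
(unspec, ordering_like, equiv_v, less_v, SKETCH_CONSTEVAL) are hypothetical
and exist only for this example; the macro falls back to constexpr on
compilers without consteval, which is an assumption of the sketch rather than
something the patch does.

```cpp
// SKETCH_CONSTEVAL is a hypothetical macro used only in this example.
// It falls back to constexpr when the compiler lacks consteval.
#if defined(__cpp_consteval)
# define SKETCH_CONSTEVAL consteval
#else
# define SKETCH_CONSTEVAL constexpr
#endif

// Stand-in for the library's helper type. It is meant to be constructed
// only from a literal 0, which converts to a null unspec* pointer.
struct unspec
{
  SKETCH_CONSTEVAL unspec(unspec*) noexcept { }
};

// Hypothetical comparison-category-like type for illustration.
struct ordering_like
{
  int value;

  friend constexpr bool
  operator==(ordering_like v, unspec) noexcept
  { return v.value == 0; }
};

constexpr ordering_like equiv_v{0};
constexpr ordering_like less_v{-1};

// The literal 0 is converted to unspec at compile time.
static_assert(equiv_v == 0, "equiv_v compares equal to literal 0");
static_assert(!(less_v == 0), "less_v does not compare equal to literal 0");
```

With consteval in effect, the conversion is an immediate invocation, so no
call to the constructor remains for a debugger to step into.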

libstdc++-v3/ChangeLog:

* libsupc++/compare (__cmp_cat::__unseq): Make ctor consteval.
* testsuite/18_support/comparisons/categories/zero_neg.cc: Prune
excess errors caused by invalid consteval calls.
---
 libstdc++-v3/libsupc++/compare| 2 +-
 .../18_support/comparisons/categories/zero_neg.cc | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/libsupc++/compare b/libstdc++-v3/libsupc++/compare
index b133fdbcf1e..9215f51e94b 100644
--- a/libstdc++-v3/libsupc++/compare
+++ b/libstdc++-v3/libsupc++/compare
@@ -53,7 +53,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 
 struct __unspec
 {
-  constexpr __unspec(__unspec*) noexcept { }
+  consteval __unspec(__unspec*) noexcept { }
 };
   }
 
diff --git 
a/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc 
b/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc
index 7daf799f71d..17a129bcb75 100644
--- a/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc
+++ b/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc
@@ -34,6 +34,11 @@ test01()
   std::weak_ordering::equivalent == 1;// { dg-error "invalid conversion" }
   std::strong_ordering::equivalent == 1;  // { dg-error "invalid conversion" }
 
+  constexpr int z = 0;
+  std::partial_ordering::equivalent == z; // { dg-error "invalid conversion" }
+  std::weak_ordering::equivalent == z;// { dg-error "invalid conversion" }
+  std::strong_ordering::equivalent == z;  // { dg-error "invalid conversion" }
+
   constexpr void* p = nullptr;
   std::partial_ordering::equivalent == p; // { dg-error "invalid conversion" }
   std::weak_ordering::equivalent == p;// { dg-error "invalid conversion" }
@@ -44,3 +49,6 @@ test01()
   std::weak_ordering::equivalent == nullptr;
   std::strong_ordering::equivalent == nullptr;
 }
+
+// { dg-prune-output "reinterpret_cast.* is not a constant expression" }
+// { dg-prune-output "cast from 'void.' is not allowed" }
-- 
2.41.0



[committed] libstdc++: Reuse double overload of __convert_to_v if possible

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

For targets where double and long double have the same representation we
can reuse the same __convert_to_v code for both types. This will
slightly reduce the size of the compiled code in the library.
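
A simplified sketch of the same delegation pattern is shown below. The name
parse_value is a hypothetical stand-in for __convert_to_v, error handling and
locale switching are omitted, and the __DBL_MANT_DIG__ and __LDBL_MANT_DIG__
macros are assumed to be predefined by the compiler, as they are with GCC.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the double overload.
inline void
parse_value(const char* s, double& v)
{ v = std::strtod(s, nullptr); }

// Hypothetical stand-in for the long double overload.
inline void
parse_value(const char* s, long double& v)
{
#if __DBL_MANT_DIG__ == __LDBL_MANT_DIG__
  // Same representation: parse as double and convert, so no separate
  // long double parsing code is compiled for this overload.
  double d;
  parse_value(s, d);
  v = d;
#else
  v = std::strtold(s, nullptr);
#endif
}
```

On targets where the two types differ, the #else branch keeps the dedicated
long double path, so results are unchanged there.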

libstdc++-v3/ChangeLog:

* config/locale/generic/c_locale.cc (__convert_to_v): Reuse
double overload for long double if possible.
---
 libstdc++-v3/config/locale/generic/c_locale.cc | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/config/locale/generic/c_locale.cc 
b/libstdc++-v3/config/locale/generic/c_locale.cc
index 8849d78fdfa..866ba0361dc 100644
--- a/libstdc++-v3/config/locale/generic/c_locale.cc
+++ b/libstdc++-v3/config/locale/generic/c_locale.cc
@@ -187,6 +187,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __convert_to_v(const char* __s, long double& __v,
   ios_base::iostate& __err, const __c_locale&) throw()
 {
+#if __DBL_MANT_DIG__ == __LDBL_MANT_DIG__
+  double __d;
+  __convert_to_v(__s, __d, __err, __c_locale);
+  __v = __d;
+#else
   // Assumes __s formatted for "C" locale.
   const char* __sav = __set_C_locale();
   if (!__sav)
@@ -233,6 +238,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   setlocale(LC_ALL, __sav);
   delete [] __sav;
+#endif // __DBL_MANT_DIG__ == __LDBL_MANT_DIG__
 }
 
   void
-- 
2.41.0



[committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-08-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The extended floating-point types such as _Float32 are supported by GCC
prior to C++23; you just can't use the standard-conforming names from
<stdfloat> to refer to them. This change defines the specializations of
std::numeric_limits for those types for older dialects, not only for
C++23.

libstdc++-v3/ChangeLog:

* include/bits/c++config (__gnu_cxx::__bfloat16_t): Define
whenever __BFLT16_DIG__ is defined, not only for C++23.
* include/std/limits (numeric_limits): Likewise.
(numeric_limits<_Float16>, numeric_limits<_Float32>)
(numeric_limits<_Float64>): Likewise for other extended
floating-point types.
---
 libstdc++-v3/include/bits/c++config |   4 +-
 libstdc++-v3/include/std/limits | 194 +++-
 2 files changed, 103 insertions(+), 95 deletions(-)

diff --git a/libstdc++-v3/include/bits/c++config 
b/libstdc++-v3/include/bits/c++config
index dd47f274d5f..0a41cdd29a9 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -822,10 +822,10 @@ namespace std
 # define _GLIBCXX_LDOUBLE_IS_IEEE_BINARY128 1
 #endif
 
-#ifdef __STDCPP_BFLOAT16_T__
+#if defined __cplusplus && defined __BFLT16_DIG__
 namespace __gnu_cxx
 {
-  using __bfloat16_t = decltype(0.0bf16);
+  typedef __decltype(0.0bf16) __bfloat16_t;
 }
 #endif
 
diff --git a/libstdc++-v3/include/std/limits b/libstdc++-v3/include/std/limits
index 52b19ef8264..7a59e7520eb 100644
--- a/libstdc++-v3/include/std/limits
+++ b/libstdc++-v3/include/std/limits
@@ -1890,189 +1890,197 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #undef __glibcxx_long_double_traps
 #undef __glibcxx_long_double_tinyness_before
 
-#if __cplusplus > 202002L
-
 #define __glibcxx_concat3_(P,M,S) P ## M ## S
 #define __glibcxx_concat3(P,M,S) __glibcxx_concat3_ (P,M,S)
 
+#if __cplusplus >= 201103L
+# define __max_digits10 max_digits10
+#endif
+
 #define __glibcxx_float_n(BITSIZE) \
   __extension__ \
   template<>   \
 struct numeric_limits<_Float##BITSIZE> \
 {  \
-  static constexpr bool is_specialized = true; \
+  static _GLIBCXX_USE_CONSTEXPR bool is_specialized = true; \
\
-  static constexpr _Float##BITSIZE \
-  min() noexcept   \
+  static _GLIBCXX_CONSTEXPR _Float##BITSIZE \
+  min() _GLIBCXX_USE_NOEXCEPT  \
   { return __glibcxx_concat3 (__FLT, BITSIZE, _MIN__); }   \
\
-  static constexpr _Float##BITSIZE \
-  max() noexcept   \
+  static _GLIBCXX_CONSTEXPR _Float##BITSIZE \
+  max() _GLIBCXX_USE_NOEXCEPT  \
   { return __glibcxx_concat3 (__FLT, BITSIZE, _MAX__); }   \
\
-  static constexpr _Float##BITSIZE \
-  lowest() noexcept \
+  static _GLIBCXX_CONSTEXPR _Float##BITSIZE \
+  lowest() _GLIBCXX_USE_NOEXCEPT   \
   { return -__glibcxx_concat3 (__FLT, BITSIZE, _MAX__); }  \
\
-  static constexpr int digits  \
+  static _GLIBCXX_USE_CONSTEXPR int digits \
= __glibcxx_concat3 (__FLT, BITSIZE, _MANT_DIG__);  \
-  static constexpr int digits10\
+  static _GLIBCXX_USE_CONSTEXPR int digits10   \
= __glibcxx_concat3 (__FLT, BITSIZE, _DIG__);   \
-  static constexpr int max_digits10 \
+  static _GLIBCXX_USE_CONSTEXPR int __max_digits10 \
= __glibcxx_max_digits10 (__glibcxx_concat3 (__FLT, BITSIZE,\
 _MANT_DIG__)); \
-  static constexpr bool is_signed = true;  \
-  static constexpr bool is_integer = false; \
-  static constexpr bool is_exact = false;  \
-  static constexpr int radix = __FLT_RADIX__;  \
+  static _GLIBCXX_USE_CONSTE
