[PATCH] tree-optimization/115494 - PRE PHI translation and ranges

2025-01-16 Thread Richard Biener
When we PHI translate dependent expressions we keep SSA defs in
place of the translated expression in case the expression itself
did not change even though its context did and thus the validity
of ranges associated with it.  That eventually leads to simplification
errors given we violate the precondition that used SSA defs fed to
vn_valueize are valid to use (including their associated ranges).
The following makes sure to replace those with new representatives
always, not only when the dependent expression translation changed it.

The fix was originally discovered by Mikael Morin.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.

PR tree-optimization/115494
* tree-ssa-pre.cc (phi_translate_1): Always generate a
representative for translated dependent expressions.

* gcc.dg/torture/pr115494.c: New testcase.

Co-Authored-By: Mikael Morin 
---
 gcc/testsuite/gcc.dg/torture/pr115494.c | 24 
 gcc/tree-ssa-pre.cc |  4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr115494.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr115494.c b/gcc/testsuite/gcc.dg/torture/pr115494.c
new file mode 100644
index 000..a8c614433a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr115494.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+
+unsigned char a;
+int b = 1, c, d;
+int __attribute__((noipa))
+f()
+{
+  char e;
+  c = b - c;
+  a = ~(c || a);
+  e = -(b ^ a);
+  d = e && b;
+  a = ~(b & a);
+  if (a < 2)
+    return 1;
+  return 0;
+}
+
+int main()
+{
+  if (f())
+    __builtin_abort();
+}
+
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index 32bcd9b1b7a..735893bb191 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -1430,7 +1430,7 @@ phi_translate_1 (bitmap_set_t dest,
unsigned int op_val_id = VN_INFO (newnary->op[i])->value_id;
leader = find_leader_in_sets (op_val_id, set1, set2);
result = phi_translate (dest, leader, set1, set2, e);
-   if (result && result != leader)
+   if (result)
  /* If op has a leader in the sets we translate make
 sure to use the value of the translated expression.
 We might need a new representative for that.  */
@@ -1553,7 +1553,7 @@ phi_translate_1 (bitmap_set_t dest,
op_val_id = VN_INFO (op[n])->value_id;
leader = find_leader_in_sets (op_val_id, set1, set2);
opresult = phi_translate (dest, leader, set1, set2, e);
-   if (opresult && opresult != leader)
+   if (opresult)
  {
tree name = get_representative_for (opresult);
changed |= name != op[n];
-- 
2.43.0


Re: [PATCH v4] AArch64: Add LUTI ACLE for SVE2

2025-01-16 Thread Richard Sandiford
Thanks for the update.  Mostly LGTM, but some comments below:

 writes:
> diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
> index f8cfe08f4c0..0a1dc314f94 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -133,6 +133,7 @@
>  ;;  Optional AES extensions
>  ;;  Optional SHA-3 extensions
>  ;;  Optional SM4 extensions
> +;;  Table lookup

This puts it under:

;; == Cryptographic extensions

but it's not a cryptographic extension.  Probably better to put it under:

;; == General

instead.

>  ;; =
>  ;; == Moves
> @@ -4211,3 +4212,35 @@
>"sm4ekey\t%0.s, %1.s, %2.s"
>[(set_attr "type" "crypto_sm4")]
>  )
> +
> +;; -
> +;;  Table lookup
> +;; -
> +;; Includes:
> +;; - LUTI2
> +;; - LUTI4
> +;; -
> +
> +(define_insn "@aarch64_sve_luti"
> +  [(set (match_operand:SVE_FULL_BH 0 "register_operand" "=w")
> + (unspec:SVE_FULL_BH
> +  [(match_operand:SVE_FULL_BH 1 "register_operand" "w")
> +   (match_operand:VNx16QI 2 "register_operand" "w")

This is correct...

> +   (match_operand:DI 3 "const_int_operand")
> +   (const_int LUTI_BITS)]
> +  UNSPEC_SVE_LUTI))]
> +  "TARGET_LUT && TARGET_SVE2_OR_SME2"
> +  "luti\t%0., { %1. }, %2[%3]"
> +)
> +
> +(define_insn "@aarch64_sve_luti"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (unspec:
> +  [(match_operand:SVE_FULL_Hx2 1 "register_operand" "Uw2")

...but this should use aligned_register_operand instead of
register_operand.

> +   (match_operand:VNx16QI 2 "register_operand" "w")
> +   (match_operand:DI 3 "const_int_operand")
> +   (const_int LUTI_BITS)]
> +   UNSPEC_SVE_LUTI))]
> +  "TARGET_LUT && TARGET_SVE2_OR_SME2"
> +  "luti\t%0., %1, %2[%3]"
> +)
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index ff0f34dd043..0fbf96f1ab9 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -553,6 +553,18 @@
>  (define_mode_iterator SVE_FULL_BHS [VNx16QI VNx8HI VNx4SI
>   VNx8BF VNx8HF VNx4SF])
>  
> +;; Fully-packed SVE vector byte modes that have 32-bit or smaller elements.
> +(define_mode_iterator SVE_FULL_BS [VNx16QI VNx4SI VNx4SF])

This is no longer needed.

> +
> +;; Fully-packed SVE vector byte modes that have 16-bit or smaller elements.
> +(define_mode_iterator SVE_FULL_BH [VNx16QI VNx8HI VNx8HF VNx8BF])
> +
> +;; Fully-packed half word SVE vector modes
> +(define_mode_iterator SVE_FULL_H [VNx8HI VNx8HF VNx8BF])

Similarly, SVE_FULL_H is no longer needed.

> +
> +;; Pairs of fully-packed SVE vector modes (half word only)
> +(define_mode_iterator SVE_FULL_Hx2 [VNx16HI VNx16HF VNx16BF])
> +
>  ;; Fully-packed SVE vector modes that have 32-bit elements.
>  (define_mode_iterator SVE_FULL_S [VNx4SI VNx4SF])
>  
> @@ -1186,6 +1198,7 @@
>  UNSPEC_UZPQ2
>  UNSPEC_ZIPQ1
>  UNSPEC_ZIPQ2
> +UNSPEC_SVE_LUTI
>  
>  ;; All used in aarch64-sme.md
>  UNSPEC_SME_ADD
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
> index d3ae707ac49..c0dd89fa924 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
> @@ -780,4 +780,20 @@
>   "w" (z16), "w" (z22), "w" (z29));   \
>}
>  
> +#define TEST_1X2_NARROW(NAME, RTYPE, TTYPE, ZTYPE, CODE1, CODE2) \
> +  PROTO(NAME, void, ())  \
> +  {  \
> +    register RTYPE z0 __asm ("z0");  \
> +    register ZTYPE z5 __asm ("z5");  \
> +    register TTYPE z6 __asm ("z6");  \
> +    register RTYPE z16 __asm ("z16");  \
> +    register ZTYPE z22 __asm ("z22");  \
> +    register TTYPE z29 __asm ("z29");  \
> +    register RTYPE z0_res __asm ("z0");  \
> +    __asm volatile ("" : "=w" (z0), "=w" (z5), "=w" (z6),  \
> +                    "=w" (z16), "=w" (z22), "=w" (z29));  \
> +    INVOKE (CODE1, CODE2);  \
> +    __asm volatile ("" :: "w" (z0_res), "w" (z5), "w" (z22));  \
> +  }
> +
>  #endif
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/lut_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/l

Re: [PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-16 Thread Richard Sandiford
Hongyu Wang  writes:
> From: Lingling Kong 
>
> Hi,
>
> Thanks for Richard's review.  The v5 patch contains the following changes:
>
> 1. Separate the maskload/maskstore emit out from noce_emit_cmove, add
> a new function emit_mask_load_store in optabs.cc.
> 2. Follow the operand order of maskload and maskstore optab and takes
> cond as predicate operand with VOIDmode.
> 3. Cache may_trap_or_fault_p and correct the logic to ensure only one
> of cmove source operand can be a may_trap_or_fault memory.
>
> Bootstrapped & regtested on x86_64-pc-linux-gnu.
>
> OK for trunk?
>
> The APX CFCMOV feature implements conditional faulting, which means
> that all memory faults are suppressed when the condition code
> evaluates to false while loading or storing a memory operand.  We can
> therefore now use a conditional move to load or store a memory
> operand that may trap or fault.
>
> The middle-end currently does not support a conditional move if it
> knows that a load from A or B could trap or fault.  To enable CFCMOV,
> we use mask_load and mask_store as a proxy for the backend expander.
> The predicate of mask_load/mask_store is recognized as a comparison
> rtx in the initial implementation.
>
> For a conditional memory store, the fault-suppressing conditional move
> does not move any arithmetic calculations.  For a conditional memory
> load, only the case with one possibly-trapping memory operand and one
> operand that neither traps nor is a memory is currently supported.
>
> gcc/ChangeLog:
>
>   * ifcvt.cc (can_use_mask_load_store):  New function to check
>   whether conditional faulting load/store can be used.
>   (noce_try_cmove_arith): Relax the condition for operand
>   may_trap_or_fault check, expand with mask_load/mask_store optab
>   for one of the cmove operand may trap or fault.
>   (noce_process_if_block): Allow trap_or_fault dest for
>   "if (...) *x = a; else skip" scenario when mask_store optab is
>   available.
>   * optabs.h (emit_mask_load_store): New declaration.
>   * optabs.cc (emit_mask_load_store): New function to emit
>   conditional move with mask_load/mask_store optab.

Thanks for the update.  This addresses the comments I had about
the use of the maskload/store optabs in previous versions.

I did make several attempts to review the patch beyond that, but I find
it very difficult to understand the flow of noce_try_cmove_arith, and
how all the various special cases fit together.  (Not your fault of
course.)  So I think someone who knows ifcvt should take it from here.

It would be nice if the internal implementation of emit_mask_load_store
could share more code with other routines though.
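
As a reading aid for the quoted patch, the three source shapes named in its
can_use_mask_load_store comment can be written out as plain C++ functions.
This is only an illustrative sketch: the function names are hypothetical and
nothing here shows how ifcvt or the backend actually expands these forms.
The point is that in each shape the pointer may be invalid whenever the
condition selects the other arm, so an unconditional load or store would be
wrong and a fault-suppressing conditional access is needed.

```cpp
#include <cassert>

// Hypothetical names; each function mirrors one form from the patch comment.

// "if (test) *x = a; else skip;"  (candidate for mask_store)
void store_if (bool test, int *x, int a)
{
  if (test)
    *x = a;  // x is only dereferenced when test is true
}

// "if (test) x = *a; else x = b;"  (candidate for mask_load)
int load_then (bool test, const int *a, int b)
{
  int x;
  if (test)
    x = *a;
  else
    x = b;
  return x;
}

// "if (test) x = a; else x = *b;"  (candidate for mask_load)
int load_else (bool test, int a, const int *b)
{
  int x;
  if (test)
    x = a;
  else
    x = *b;
  return x;
}

// Passing a null pointer is fine as long as the guarded arm is not taken.
void self_check ()
{
  int cell = 0;
  store_if (false, nullptr, 7);
  store_if (true, &cell, 7);
  assert (cell == 7);
  assert (load_then (false, nullptr, 3) == 3);
  assert (load_else (true, 5, nullptr) == 5);
}
```

A plain conditional move would have to evaluate both arms first, which is
exactly what these null-pointer calls forbid.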

Thanks (and sorry),
Richard

> ---
>  gcc/ifcvt.cc  | 110 ++
>  gcc/optabs.cc | 103 ++
>  gcc/optabs.h  |   3 ++
>  3 files changed, 200 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index cb5597bc171..51ac398aee1 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -778,6 +778,7 @@ static bool noce_try_store_flag_mask (struct noce_if_info *);
>  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
>   rtx, rtx, rtx, rtx = NULL, rtx = NULL);
>  static bool noce_try_cmove (struct noce_if_info *);
> +static bool can_use_mask_load_store (struct noce_if_info *);
>  static bool noce_try_cmove_arith (struct noce_if_info *);
>  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
>  static bool noce_try_minmax (struct noce_if_info *);
> @@ -2132,6 +2133,39 @@ noce_emit_bb (rtx last_insn, basic_block bb, bool simple)
>return true;
>  }
>  
> +/* Return TRUE if backend supports scalar maskload_optab
> +   or maskstore_optab, who suppresses memory faults when trying to
> +   load or store a memory operand and the condition code evaluates
> +   to false.
> +   Currently the following forms
> +   "if (test) *x = a; else skip;" --> mask_store
> +   "if (test) x = *a; else x = b;" --> mask_load
> +   "if (test) x = a; else x = *b;" --> mask_load
> +   are supported.  */
> +
> +static bool
> +can_use_mask_load_store (struct noce_if_info *if_info)
> +{
> +  rtx b = if_info->b;
> +  rtx x = if_info->x;
> +  rtx cond = if_info->cond;
> +
> +  if (MEM_P (x))
> +{
> +  if (convert_optab_handler (maskstore_optab, GET_MODE (x),
> +  GET_MODE (cond)) == CODE_FOR_nothing)
> + return false;
> +
> +  if (!rtx_equal_p (x, b) || !may_trap_or_fault_p (x))
> + return false;
> +
> +  return true;
> +}
> +  else
> +return convert_optab_handler (maskload_optab, GET_MODE (x),
> +   GET_MODE (cond)) != CODE_FOR_nothing;
> +}
> +
>  /* Try more complex cases involving conditional_move.  */
>  
>  static bool
> @@ -2151,6 +2185,9 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
>enum rtx_code code;
>rtx cond = if_info->cond;
>rtx_insn *ifcvt_seq;
> +  bool a_may_trap_or_fault = may_trap_or_fault_p (a);
> +  boo

Re: [PATCH] LoongArch: Add alsl.wu

2025-01-16 Thread Lulu Cheng

LGTM!

Thanks!

On 2025/1/15 at 6:09 PM, Xi Ruoyao wrote:

On 64-bit capable LoongArch hardware, alsl.wu is similar to alsl.w but
zero-extending the 32-bit result.
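
The difference between the two instructions can be modelled in a few lines of
C++.  This is a sketch of the arithmetic only, under the assumption that the
shift amount is the value written in the assembly (1 to 4); the function
names are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of alsl.w: 32-bit shift-add, result sign-extended to 64 bits.
// `sa` is the shift amount as written in assembly (assumed to be 1..4).
std::int64_t alsl_w (std::uint32_t rj, std::uint32_t rk, unsigned sa)
{
  std::uint32_t sum = (rj << sa) + rk;  // wraps modulo 2^32
  return static_cast<std::int64_t> (static_cast<std::int32_t> (sum));
}

// Model of alsl.wu: same 32-bit shift-add, result zero-extended to 64 bits.
std::uint64_t alsl_wu (std::uint32_t rj, std::uint32_t rk, unsigned sa)
{
  std::uint32_t sum = (rj << sa) + rk;  // wraps modulo 2^32
  return sum;                           // implicit zero-extension
}

// The two only differ when bit 31 of the 32-bit sum is set.
void self_check ()
{
  assert (alsl_w (1u, 2u, 2) == 6);
  assert (alsl_wu (1u, 2u, 2) == 6u);
  assert (alsl_w (0x40000000u, 0u, 1) == -2147483648LL);
  assert (alsl_wu (0x40000000u, 0u, 1) == 2147483648ULL);
}
```

The new test's `(a << 2) + b` returning `unsigned long` corresponds to
`alsl_wu (a, b, 2)` in this model.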

gcc/ChangeLog:

* config/loongarch/loongarch.md (alslsi3_extend): Add alsl.wu.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/alsl_wu.c: New test.
---
  gcc/config/loongarch/loongarch.md| 8 
  gcc/testsuite/gcc.target/loongarch/alsl_wu.c | 9 +
  2 files changed, 13 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/alsl_wu.c

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 59f45770311..1b46e8e4af0 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3143,15 +3143,15 @@ (define_insn "alsl3"
[(set_attr "type" "arith")
 (set_attr "mode" "")])
  
-(define_insn "alslsi3_extend"
+(define_insn "*alslsi3_extend"
[(set (match_operand:DI 0 "register_operand" "=r")
-   (sign_extend:DI
+   (any_extend:DI
  (plus:SI
(ashift:SI (match_operand:SI 1 "register_operand" "r")
   (match_operand 2 "const_immalsl_operand" ""))
(match_operand:SI 3 "register_operand" "r"]
-  ""
-  "alsl.w\t%0,%1,%3,%2"
+  "TARGET_64BIT"
+  "alsl.w\t%0,%1,%3,%2"
[(set_attr "type" "arith")
 (set_attr "mode" "SI")])
  
diff --git a/gcc/testsuite/gcc.target/loongarch/alsl_wu.c b/gcc/testsuite/gcc.target/loongarch/alsl_wu.c
new file mode 100644
index 000..65f55e629dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/alsl_wu.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=loongarch64 -mabi=lp64d -O2" } */
+/* { dg-final { scan-assembler "alsl\\.wu" } } */
+
+unsigned long
+test (unsigned int a, unsigned int b)
+{
+  return (a << 2) + b;
+}




Re: [committed] libstdc++: Implement LWG 2937 for std::filesystem::equivalent [PR118158]

2025-01-16 Thread Jonathan Wakely
On Thu, 16 Jan 2025 at 09:50, Jonathan Wakely  wrote:
>
> Do not report an error for (is_other(s1) && is_other(s2)) as the
> standard originally said, nor for (is_other(s1) || is_other(s2)) as
> libstdc++ was doing. We can compare inode numbers for special files and
> so give sensible answers.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/118158
> * src/c++17/fs_ops.cc (fs::equivalent): Remove error reporting
> for is_other(s1) && is_other(s2) case, as per LWG 2937.
> * testsuite/27_io/filesystem/operations/pr118158.cc: New test.
> ---
>
> Tested x86_64-linux. Pushed to trunk.
>
> I'm expecting the new test's use of mkfifo to fail on some targets, e.g.
> maybe rtems. We can either tweak the #if or add target selectors to
> exclude those targets when we discover which ones FAIL.

Ah, it FAILs on mingw-w64 for a start.
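
For anyone skimming the quoted diff, the new control flow in fs::equivalent
can be modelled roughly as follows.  This is a simplified sketch, not the
library code: file_kind, file_info and equivalent_model are hypothetical
stand-ins, and comparing a device/inode pair is an assumption about what the
equiv_files helper does on POSIX targets.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// Simplified stand-ins for file_status and the stat() result (hypothetical).
enum class file_kind { not_found, regular, directory, fifo };

struct file_info
{
  file_kind kind;
  unsigned long dev;  // assumed analogue of st_dev
  unsigned long ino;  // assumed analogue of st_ino
};

// stat_errno is assumed to be nonzero only for stat() failures other than
// "file not found".
bool equivalent_model (int stat_errno, const file_info &f1,
                       const file_info &f2, std::error_code &ec)
{
  if (stat_errno)
    ec.assign (stat_errno, std::generic_category ());
  else if (f1.kind == file_kind::not_found
           || f2.kind == file_kind::not_found)
    ec = std::make_error_code (std::errc::no_such_file_or_directory);
  else
    {
      ec.clear ();
      // Special files are no longer rejected; only the types must match.
      if (f1.kind == f2.kind)
        return f1.dev == f2.dev && f1.ino == f2.ino;
    }
  return false;
}
```

Two FIFOs therefore compare like any other pair of files, and a FIFO compared
with a regular file gives false without setting an error.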

>
>  libstdc++-v3/src/c++17/fs_ops.cc  | 22 +++
>  .../27_io/filesystem/operations/pr118158.cc   | 62 +++
>  2 files changed, 69 insertions(+), 15 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc
>
> diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
> index 1d75f24f78e8..4f188153ae3a 100644
> --- a/libstdc++-v3/src/c++17/fs_ops.cc
> +++ b/libstdc++-v3/src/c++17/fs_ops.cc
> @@ -914,24 +914,16 @@ fs::equivalent(const path& p1, const path& p2, error_code& ec) noexcept
>else
>  err = errno;
>
> -  if (exists(s1) && exists(s2))
> -{
> -  if (is_other(s1) && is_other(s2))
> -   {
> - ec = std::__unsupported();
> - return false;
> -   }
> -  ec.clear();
> -  if (is_other(s1) || is_other(s2))
> -   return false;
> -  return fs::equiv_files(p1.c_str(), st1, p2.c_str(), st2, ec);
> -}
> +  if (err)
> +ec.assign(err, std::generic_category());
>else if (!exists(s1) || !exists(s2))
>  ec = std::make_error_code(std::errc::no_such_file_or_directory);
> -  else if (err)
> -ec.assign(err, std::generic_category());
>else
> -ec.clear();
> +{
> +  ec.clear();
> +  if (s1.type() == s2.type())
> +   return fs::equiv_files(p1.c_str(), st1, p2.c_str(), st2, ec);
> +}
>return false;
>  #else
>ec = std::make_error_code(std::errc::function_not_supported);
> diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc
> new file mode 100644
> index ..b57a2d184f41
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc
> @@ -0,0 +1,62 @@
> +// { dg-do run { target c++17 } }
> +// { dg-require-filesystem-ts "" }
> +
> +#include 
> +#include 
> +#include 
> +
> +#if defined(_GLIBCXX_HAVE_SYS_STAT_H) && defined(_GLIBCXX_HAVE_SYS_TYPES_H)
> +# include 
> +# include   // mkfifo
> +#endif
> +
> +namespace fs = std::filesystem;
> +
> +void
> +test_pr118158()
> +{
> +#if defined(_GLIBCXX_HAVE_SYS_STAT_H) && defined(_GLIBCXX_HAVE_SYS_TYPES_H) \
> +  && defined(S_IWUSR) && defined(S_IRUSR)
> +  auto p1 = __gnu_test::nonexistent_path();
> +  auto p2 = __gnu_test::nonexistent_path();
> +  auto p3 = __gnu_test::nonexistent_path();
> +  const std::error_code bad_ec = make_error_code(std::errc::invalid_argument);
> +  std::error_code ec;
> +  bool result;
> +
> +  VERIFY( ! ::mkfifo(p1.c_str(), S_IWUSR | S_IRUSR) );
> +  __gnu_test::scoped_file f1(p1, __gnu_test::scoped_file::adopt_file);
> +
> +  // Special file is equivalent to itself.
> +  VERIFY( equivalent(p1, p1) );
> +  VERIFY( equivalent(p1, p1, ec) );
> +  VERIFY( ! ec );
> +
> +  VERIFY( ! ::mkfifo(p2.c_str(), S_IWUSR | S_IRUSR) );
> +  __gnu_test::scoped_file f2(p2, __gnu_test::scoped_file::adopt_file);
> +
> +  ec = bad_ec;
> +  // Distinct special files are not equivalent.
> +  VERIFY( ! equivalent(p1, p2, ec) );
> +  VERIFY( ! ec );
> +
> +  // Non-existent paths are always an error.
> +  VERIFY( ! equivalent(p1, p3, ec) );
> +  VERIFY( ec == std::errc::no_such_file_or_directory );
> +  ec = bad_ec;
> +  VERIFY( ! equivalent(p3, p2, ec) );
> +  VERIFY( ec == std::errc::no_such_file_or_directory );
> +
> +  // Special file is not equivalent to regular file.
> +  __gnu_test::scoped_file f3(p3);
> +  ec = bad_ec;
> +  VERIFY( ! equivalent(p1, p3, ec) );
> +  VERIFY( ! ec );
> +#endif
> +}
> +
> +int
> +main()
> +{
> +  test_pr118158();
> +}
> --
> 2.47.1
>



Re: [PATCH v5 1/4] RISC-V: Add Zicfiss ISA extension.

2025-01-16 Thread Kito Cheng
LGTM for the V5 series :)


On Thu, Jan 16, 2025 at 4:13 PM Monk Chiang  wrote:
>
> This patch is implemented according to the RISC-V CFI specification.
> It supports the generation of shadow stack instructions in the prologue,
> epilogue, non-local gotos, and unwinding.
>
> RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi
>
> gcc/ChangeLog:
> * common/config/riscv/riscv-common.cc: Add ZICFISS ISA string.
> * gcc/config/riscv/predicates.md: New predicate x1x5_operand.
> * gcc/config/riscv/riscv.cc
>   (riscv_expand_prologue): Insert shadow stack instructions.
>   (riscv_expand_epilogue): Likewise.
>   (riscv_for_each_saved_reg): Assign t0 or ra register for
>   sspopchk instruction.
>   (need_shadow_stack_push_pop_p): New function. Omit shadow
>   stack operation on leaf function.
> * gcc/config/riscv/riscv.h
>   (need_shadow_stack_push_pop_p): Define.
> * gcc/config/riscv/riscv.md: Add shadow stack patterns.
>   (save_stack_nonlocal): Add shadow stack instructions for setjmp.
>   (restore_stack_nonlocal): Add shadow stack instructions for longjmp.
>
> libgcc/ChangeLog:
> * gcc/config/riscv/riscv.opt (TARGET_ZICFISS): Define.
> * libgcc/config/riscv/linux-unwind.h: Include shadow-stack-unwind.h.
> * libgcc/config/riscv/shadow-stack-unwind.h
>   (_Unwind_Frames_Extra): Define.
>   (_Unwind_Frames_Increment): Define.
>
> gcc/testsuite/ChangeLog:
> * gcc/testsuite/gcc.target/riscv/ssp-1.c: New test.
> * gcc/testsuite/gcc.target/riscv/ssp-2.c: New test.
>
> Co-Developed-by: Greg McGary ,
>  Kito Cheng  
> ---
>  gcc/common/config/riscv/riscv-common.cc   |   7 ++
>  gcc/config/riscv/predicates.md|   6 ++
>  gcc/config/riscv/riscv.cc |  58 --
>  gcc/config/riscv/riscv.h  |   1 +
>  gcc/config/riscv/riscv.md | 125 +-
>  gcc/config/riscv/riscv.opt|   2 +
>  gcc/testsuite/gcc.target/riscv/ssp-1.c|  41 +++
>  gcc/testsuite/gcc.target/riscv/ssp-2.c|  10 ++
>  libgcc/config/riscv/linux-unwind.h|   5 +
>  libgcc/config/riscv/shadow-stack-unwind.h |  74 +
>  10 files changed, 320 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/ssp-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/ssp-2.c
>  create mode 100644 libgcc/config/riscv/shadow-stack-unwind.h
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index bfc8aa559c5..8e8b6107a6d 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -111,6 +111,9 @@ static const riscv_implied_info_t riscv_implied_info[] =
>{"zfinx", "zicsr"},
>{"zdinx", "zicsr"},
>
> +  {"zicfiss", "zicsr"},
> +  {"zicfiss", "zimop"},
> +
>{"zk", "zkn"},
>{"zk", "zkr"},
>{"zk", "zkt"},
> @@ -325,6 +328,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>{"zicclsm",  ISA_SPEC_CLASS_NONE, 1, 0},
>{"ziccrse",  ISA_SPEC_CLASS_NONE, 1, 0},
>
> +  {"zicfiss", ISA_SPEC_CLASS_NONE, 1, 0},
> +
>{"zimop", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zcmop", ISA_SPEC_CLASS_NONE, 1, 0},
>
> @@ -1647,6 +1652,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
>RISCV_EXT_FLAG_ENTRY ("zicbop", x_riscv_zicmo_subext, MASK_ZICBOP),
>RISCV_EXT_FLAG_ENTRY ("zic64b", x_riscv_zicmo_subext, MASK_ZIC64B),
>
> +  RISCV_EXT_FLAG_ENTRY ("zicfiss", x_riscv_zi_subext, MASK_ZICFISS),
> +
>RISCV_EXT_FLAG_ENTRY ("zimop", x_riscv_mop_subext, MASK_ZIMOP),
>RISCV_EXT_FLAG_ENTRY ("zcmop", x_riscv_mop_subext, MASK_ZCMOP),
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index cda7502a62a..1f67d30be9d 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -679,3 +679,9 @@
>return (riscv_symbolic_constant_p (op, &type)
>   && type == SYMBOL_PCREL);
>  })
> +
> +;; Shadow stack operands only allow x1, x5 registers
> +(define_predicate "x1x5_operand"
> +  (and (match_operand 0 "register_operand")
> +   (match_test "REGNO (op) == RETURN_ADDR_REGNUM
> +   || REGNO (op) == T0_REGNUM")))
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 65e09842fde..cd37b492183 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -7496,6 +7496,9 @@ riscv_save_reg_p (unsigned int regno)
>if (regno == GP_REGNUM || regno == THREAD_POINTER_REGNUM)
> return false;
>
> +  if (regno == RETURN_ADDR_REGNUM && TARGET_ZICFISS)
> +   return true;
> +
>/* We must save every register used in this function.  If this is not a
>  leaf function, then we must save all temporary registers.  */
>if (df_regs_ever_live_p (regno)
> @@ -8049,7 

Re: [PATCH v6 6/6] OpenMP: Update documentation of metadirective implementation status.

2025-01-16 Thread Tobias Burnus

Hi Sandra,

Sandra Loosemore:

libgomp/ChangeLog
* libgomp.texi (OpenMP 5.0): Mark metadirective and declare variant
as implemented.
(OpenMP 5.1): Mark target_device as supported.
Add changed interaction between declare target and OpenMP context
and dynamic selector support.
(OpenMP 5.2): Mark otherwise clause as supported, note that
default is also still accepted.


LGTM, at least after the Fortran patch is in. However, for …


+@item Dynamic selector support in @code{declare variant} @tab P
+  @tab Fortran rejects non-constant expressions in dynamic selectors;
+  C/C++ reject expressions using argument variables.
  @end multitable


It would be IMHO useful to link to a PR, as we already do for a few 
items in the list.


Thanks,

Tobias

PS: The Fortran patch still needs to be reviewed.



[wwwdocs, pushed] Document libstdc++ headers that are deprecated in GCC 15

2025-01-16 Thread Jonathan Wakely
Pushed to wwwdocs.

---
 htdocs/gcc-15/porting_to.html | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/htdocs/gcc-15/porting_to.html b/htdocs/gcc-15/porting_to.html
index c446e309..2bc0d4e5 100644
--- a/htdocs/gcc-15/porting_to.html
+++ b/htdocs/gcc-15/porting_to.html
@@ -155,6 +155,17 @@ may need to be included explicitly when compiling with GCC 15:
 
 
 
+Deprecated headers
+
+Some C++ Standard Library headers now produce deprecation warnings when
+included. The warnings suggest how to adjust the code to avoid the warning,
+for example all uses of  and
+ can simply be removed,
+because they serve no purpose and are unnecessary in C++ programs.
+Most uses of  can be adjusted to use
+ instead, and uses of 
+can use  and/or .
+
 
 
 
-- 
2.47.1



[wwwdocs, pushed] Document dependency changes for libstdc++ header

2025-01-16 Thread Jonathan Wakely
Pushed to wwwdocs.

---
 htdocs/gcc-15/porting_to.html | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/htdocs/gcc-15/porting_to.html b/htdocs/gcc-15/porting_to.html
index 2bc0d4e5..b9b2efc7 100644
--- a/htdocs/gcc-15/porting_to.html
+++ b/htdocs/gcc-15/porting_to.html
@@ -153,6 +153,9 @@ may need to be included explicitly when compiling with GCC 15:
   and 
   (for std::int8_t, std::int32_t etc.)
 
+ 
+(for std::endl, std::flush etc.)
+
 
 
 Deprecated headers
-- 
2.47.1



[PATCH] libstdc++: Handle exceptions in std::ostream::sentry destructor

2025-01-16 Thread Jonathan Wakely
Because basic_ostream::sentry::~sentry is implicitly noexcept, we can't
let any exceptions escape from it, or the program would terminate. If
the streambuf's sync() function throws, or if it returns an error and
setting badbit in the stream state throws, then the program would
terminate.

LWG 835 intended to prevent exceptions from being thrown by the
std::basic_ostream::sentry destructor, but failed to cover the case
where the streambuf's sync() member throws an exception. LWG 4188 is
needed to fix that part. In any case, LWG 835 was never implemented for
libstdc++ so this does so, along with my proposed fix for 4188 too (that
badbit should be set if pubsync exits via an exception).

In order to avoid a second try-catch block to handle an exception that
might be thrown by setting badbit, this introduces an RAII helper class
that temporarily clears the stream's exceptions mask, then restores it
afterwards.

The try-catch block doesn't handle the forced_unwind exception
explicitly, because catching and rethrowing that would just terminate
when it reached the sentry's implicit noexcept(true) anyway.
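
To illustrate the RAII idea outside the library, here is a minimal sketch
that uses only the public basic_ios::exceptions() interface.  The class and
function names are hypothetical.  Unlike the patch's _Disable_exceptions,
which writes the private _M_exception member directly, the public setter
calls clear(rdstate()) and can itself throw when the restored mask covers a
bit that is already set, so this version has to swallow that in its
destructor.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical user-level analogue of the patch's private helper.
class exceptions_guard
{
public:
  explicit exceptions_guard (std::ios &s)
  : stream_ (s), saved_ (s.exceptions ())
  { stream_.exceptions (std::ios_base::goodbit); }

  ~exceptions_guard ()
  {
    // exceptions(mask) calls clear(rdstate()), which throws if an error bit
    // covered by the restored mask is already set; swallow that here.
    try
      { stream_.exceptions (saved_); }
    catch (...)
      { }
  }

  exceptions_guard (const exceptions_guard &) = delete;
  exceptions_guard &operator= (const exceptions_guard &) = delete;

private:
  std::ios &stream_;
  const std::ios_base::iostate saved_;
};

// Sets badbit without letting the stream's exceptions mask turn it into a
// thrown ios_base::failure.
void set_badbit_quietly (std::ios &s)
{
  exceptions_guard guard (s);
  s.setstate (std::ios_base::badbit);
}
```

With a stream whose mask includes badbit, set_badbit_quietly leaves badbit
set and the mask restored, and no exception reaches the caller.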

libstdc++-v3/ChangeLog:

* include/bits/ostream.h (basic_ostream::_Disable_exceptions):
RAII helper type.
(basic_ostream::sentry::~sentry): Use _Disable_exceptions. Add
try-catch block around call to pubsync.
* testsuite/27_io/basic_ostream/exceptions/char/lwg4188.cc: New
test.
* testsuite/27_io/basic_ostream/exceptions/wchar_t/lwg4188.cc:
New test.
---

The new LWG 4188 issue hasn't been seen by LWG yet, so I'll wait until
next week before pushing this. Even if LWG doesn't like my suggestion to
set badbit if pubsync() throws, we still need to prevent the existing
_M_os.setstate(badbit) call from throwing, because that's been required
since LWG 835 was approved for C++11.

Tested x86_64-linux.

 libstdc++-v3/include/bits/ostream.h   | 48 +++
 .../basic_ostream/exceptions/char/lwg4188.cc  | 42 
 .../exceptions/wchar_t/lwg4188.cc | 42 
 3 files changed, 124 insertions(+), 8 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ostream/exceptions/char/lwg4188.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ostream/exceptions/wchar_t/lwg4188.cc

diff --git a/libstdc++-v3/include/bits/ostream.h b/libstdc++-v3/include/bits/ostream.h
index 8ee63d2d66e5..d19a76ab2474 100644
--- a/libstdc++-v3/include/bits/ostream.h
+++ b/libstdc++-v3/include/bits/ostream.h
@@ -507,6 +507,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __d;
}
 #pragma GCC diagnostic pop
+
+  // RAII type to clear and restore an ostream's exceptions mask.
+  struct _Disable_exceptions
+  {
+   _Disable_exceptions(basic_ostream& __os)
+   : _M_os(__os), _M_exception(_M_os._M_exception)
+   { _M_os._M_exception = ios_base::goodbit; }
+
+   ~_Disable_exceptions()
+   { _M_os._M_exception = _M_exception; }
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++11-extensions" // deleted functions
+   _Disable_exceptions(const _Disable_exceptions&) = delete;
+   _Disable_exceptions& operator=(const _Disable_exceptions&) = delete;
+#pragma GCC diagnostic pop
+
+  private:
+   basic_ostream& _M_os;
+   const ios_base::iostate _M_exception;
+  };
 };
 
   /**
@@ -543,18 +564,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
*  @brief  Possibly flushes the stream.
*
-   *  If @c ios_base::unitbuf is set in @c os.flags(), and
-   *  @c std::uncaught_exception() is true, the sentry destructor calls
-   *  @c flush() on the output stream.
+   *  If `ios_base::unitbuf` is set in `os.flags()`, and
+   *  `std::uncaught_exception()` is true, the sentry destructor flushes
+   *  the output stream.
   */
   ~sentry()
   {
-   // XXX MT
-   if (bool(_M_os.flags() & ios_base::unitbuf) && !uncaught_exception())
+   // _GLIBCXX_RESOLVE_LIB_DEFECTS
+   // 397. ostream::sentry dtor throws exceptions
+   // 835. Tying two streams together (correction to DR 581)
+   // 4188. ostream::sentry destructor should handle exceptions
+   if (bool(_M_os.flags() & ios_base::unitbuf) && _M_os.good()
+ && !uncaught_exception()) // XXX MT
  {
-   // Can't call flush directly or else will get into recursive lock.
-   if (_M_os.rdbuf() && _M_os.rdbuf()->pubsync() == -1)
- _M_os.setstate(ios_base::badbit);
+   _Disable_exceptions __noex(_M_os);
+   __try
+ {
+   // Can't call _M_os.flush() directly because that constructs
+   // another sentry.
+   if (_M_os.rdbuf() && _M_os.rdbuf()->pubsync() == -1)
+ _M_os.setstate(ios_base::badbit);
+ }
+   __catch(...)
+ { _M_os.setstate(ios_base::badbit); }
  }
   }

Re: [PATCH] MIPS: Add conditions for use of the -mmips16e2 and -mips16 option.

2025-01-16 Thread Maciej W. Rozycki
On Thu, 16 Jan 2025, Jie Mei wrote:

> Make -mmips16e2 imply -mips16 as the ASE requires, so users won't
> be surprised even if they expect it to. Meanwhile, check if
> mips_isa_rev <= 5 when -mips16 is effective and >= 1 when -mmips16e2
> is effective.

 MIPSr1 is incompatible with MIPS16e2, and the only implementation known 
to me is MIPSr3.

  Maciej


Re: [PATCH 09/11] aarch64: Rewrite architecture strings for assembler

2025-01-16 Thread Richard Sandiford
Andrew Carlotti  writes:
> Add infrastructure to allow rewriting the architecture strings passed to
> the assembler (either as -march options or .arch directives).  There was
> already canonicalisation everywhere except for an -march driver option
> passed directly to the compiler; this patch applies the same
> canonicalisation there as well.
>
> gcc/ChangeLog:
>
>   * common/config/aarch64/aarch64-common.cc
>   (aarch64_get_arch_string_for_assembler): New.
>   (aarch64_rewrite_march): New.
>   (aarch64_rewrite_selected_cpu): Call new function.
>   * config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
>   * config/aarch64/aarch64-protos.h
>   (aarch64_get_arch_string_for_assembler): New.
>   * config/aarch64/aarch64.cc
>   (aarch64_declare_function_name): Call new function.
>   (aarch64_start_file): Ditto.
>   * config/aarch64/aarch64.h
>   (EXTRA_SPEC_FUNCTIONS): Use new macro name.
>   (MCPU_TO_MARCH_SPEC): Rename to...
>   (MARCH_REWRITE_SPEC): ...this, and add new spec rule.
>   (aarch64_rewrite_march): New declaration.
>   (MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
>   (MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
>   (ASM_CPU_SPEC): Use new macro name.

Looks good, but it'll need to be rebased on top of Tamar's fix to
MARCH_REWRITE_SPEC (please wait for that to go in first).  On that:

> @@ -1502,18 +1502,21 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>{"cpu",  "%{!march=*:%{!mcpu=*:-mcpu=%(VALUE)}}" },   \
>CONFIG_TUNE_SPEC
>  
> -#define MCPU_TO_MARCH_SPEC \
> -   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
> +#define MARCH_REWRITE_SPEC \
> +   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}" \
> +   " %{march=*:-march=%:rewrite_march(%{march=*:%*})}"

I suppose the way of handling both Tamar's change and yours would be
something like:

  "%{march=*:-march=%:rewrite_march(%{march=*:%*})" \
";:%{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}}"

(untested).
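
Read as an if/else chain, that spec prefers -march= and only falls back to
-mcpu= when no -march= was given.  A small model of that selection logic,
with hypothetical stub functions standing in for the real rewrite_march and
rewrite_mcpu spec functions (the stubs only tag their input, they do not
canonicalise anything):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for the rewrite_march / rewrite_mcpu spec functions.
std::string rewrite_march_stub (const std::string &arch)
{ return "march(" + arch + ")"; }

std::string rewrite_mcpu_stub (const std::string &cpu)
{ return "mcpu(" + cpu + ")"; }

// Models "%{march=*:A;:%{mcpu=*:B}}": use A if -march= was given, otherwise
// fall back to B if -mcpu= was given, otherwise emit nothing.
std::optional<std::string>
assembler_march (const std::optional<std::string> &march,
                 const std::optional<std::string> &mcpu)
{
  if (march)
    return "-march=" + rewrite_march_stub (*march);
  if (mcpu)
    return "-march=" + rewrite_mcpu_stub (*mcpu);
  return std::nullopt;
}
```

In this model at most one -march= option is produced for the assembler,
whichever combination of options the user passed.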

> +extern const char *aarch64_rewrite_march (int argc, const char **argv);
>  extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
>  extern const char *is_host_cpu_not_armv8_base (int argc, const char **argv);
> -#define MCPU_TO_MARCH_SPEC_FUNCTIONS\
> +#define MARCH_REWRITE_SPEC_FUNCTIONS\

MARCH_REWRITE_SPEC_FUNCTIONS also doesn't quite cover it.  How about
just AARCH64_BASE_SPEC_FUNCTIONS, since its purpose is to define the
functions needed for both cross and native hosts?

OK with those changes if you agree.

Thanks,
Richard


> +  { "rewrite_march", aarch64_rewrite_march },  \
>{ "rewrite_mcpu",aarch64_rewrite_mcpu }, \
>{ "is_local_not_armv8_base", is_host_cpu_not_armv8_base },
>  
>  
>  #define ASM_CPU_SPEC \
> -   MCPU_TO_MARCH_SPEC
> +   MARCH_REWRITE_SPEC
>  
>  #define EXTRA_SPECS  \
>{ "asm_cpu_spec",  ASM_CPU_SPEC }
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 75ba66a979c979fd01948b0a2066a15371df9bfa..95861c1088052cc60d1e02c654ee970cb8bc3bef
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24831,16 +24831,12 @@ aarch64_declare_function_name (FILE *stream, const 
> char* name,
>  targ_options = TREE_TARGET_OPTION (target_option_current_node);
>gcc_assert (targ_options);
>  
> -  const struct processor *this_arch
> -= aarch64_get_arch (targ_options->x_selected_arch);
> -
>auto isa_flags = aarch64_get_asm_isa_flags (targ_options);
> -  std::string extension
> -= aarch64_get_extension_string_for_isa_flags (isa_flags,
> -   this_arch->flags);
> +  aarch64_arch arch = targ_options->x_selected_arch;
> +  std::string to_print
> += aarch64_get_arch_string_for_assembler (arch, isa_flags);
>/* Only update the assembler .arch string if it is distinct from the last
>   such string we printed.  */
> -  std::string to_print = this_arch->name + extension;
>if (to_print != aarch64_last_printed_arch_string)
>  {
>asm_fprintf (asm_out_file, "\t.arch %s\n", to_print.c_str ());
> @@ -24962,19 +24958,16 @@ aarch64_start_file (void)
>struct cl_target_option *default_options
>  = TREE_TARGET_OPTION (target_option_default_node);
>  
> -  const struct processor *default_arch
> -= aarch64_get_arch (default_options->x_selected_arch);
> +  aarch64_arch default_arch = default_options->x_selected_arch;
>auto default_isa_flags = aarch64_get_asm_isa_flags (default_options);
> -  std::string extension
> -= aarch64_get_extension_string_for_isa_flags (default_isa_flags,
> -   default_arch->flags);
> -
> -   aarch64_last_printed_arch_string = default_arch->name + extension;
> -   aarch64_last_printed_tune_string = "";

Re: [PATCH 09/11] aarch64: Rewrite architecture strings for assembler

2025-01-16 Thread Richard Sandiford
Andrew Carlotti  writes:
> @@ -697,6 +697,50 @@ aarch64_get_extension_string_for_isa_flags
> +  const struct arch_info *entry;
> +  for (entry = all_architectures; entry->arch != aarch64_no_arch; entry++)
> +{
> +  if (entry->arch == arch)
> + break;
> +}

Sorry for the nit, but forgot to say: the convention is not to have
braces here.

Thanks,
Richard
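
For reference, a minimal sketch of the brace-free form of such a lookup loop. The arch_entry table and the find_arch helper below are hypothetical stand-ins made up for illustration; they are not the GCC definitions.

```cpp
// Hypothetical stand-ins for a sentinel-terminated lookup table.
enum arch_id { ARCH_NONE, ARCH_A, ARCH_B };

struct arch_entry
{
  const char *name;
  arch_id arch;
};

static const arch_entry all_entries[] = {
  { "a", ARCH_A },
  { "b", ARCH_B },
  { nullptr, ARCH_NONE } // sentinel entry terminates the table
};

// Return the entry for ARCH, or the sentinel entry if there is none.
const arch_entry *
find_arch (arch_id arch)
{
  const arch_entry *entry;
  // The loop body is a single statement, so it is written without braces.
  for (entry = all_entries; entry->arch != ARCH_NONE; entry++)
    if (entry->arch == arch)
      break;
  return entry;
}
```

A caller would compare the returned entry's arch field against ARCH_NONE to detect a failed lookup.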


[PUSHED] [OpenACC/Fortran testsuite] Use relative line numbers for a few DejaGnu directives

2025-01-16 Thread Thomas Schwinge
From: Thomas Schwinge 

For easier maintenance.

gcc/testsuite/
* gfortran.dg/goacc/assumed.f95: Use relative line numbers for a
few DejaGnu directives.
* gfortran.dg/goacc/list.f95: Likewise.
* gfortran.dg/goacc/loop-1-2.f95: Likewise.
* gfortran.dg/goacc/loop-1.f95: Likewise.
* gfortran.dg/goacc/reduction.f95: Likewise.
---
 gcc/testsuite/gfortran.dg/goacc/assumed.f95   |  5 +--
 gcc/testsuite/gfortran.dg/goacc/list.f95  |  8 ++--
 gcc/testsuite/gfortran.dg/goacc/loop-1-2.f95  |  2 +-
 gcc/testsuite/gfortran.dg/goacc/loop-1.f95|  2 +-
 gcc/testsuite/gfortran.dg/goacc/reduction.f95 | 45 +--
 5 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/goacc/assumed.f95 
b/gcc/testsuite/gfortran.dg/goacc/assumed.f95
index 4efe5a2b06e..4e35c1d5960 100644
--- a/gcc/testsuite/gfortran.dg/goacc/assumed.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/assumed.f95
@@ -16,6 +16,7 @@ contains
 !$acc host_data use_device (a) ! { dg-error "Assumed size" }
 !$acc end host_data
 !$acc parallel loop reduction(+:a) ! { dg-error "Assumed size" }
+! { dg-error "Array 'a' is not permitted in reduction" "" { target "*-*-*" 
} .-1 }
 do i = 1,5
 enddo
 !$acc end parallel loop
@@ -37,6 +38,7 @@ contains
 !$acc host_data use_device (a) ! { dg-error "Assumed rank" }
 !$acc end host_data
 !$acc parallel loop reduction(+:a) ! { dg-error "Assumed rank" }
+! { dg-error "Array 'a' is not permitted in reduction" "" { target "*-*-*" 
} .-1 }
 do i = 1,5
 enddo
 !$acc end parallel loop
@@ -45,6 +47,3 @@ contains
 !$acc update self (a) ! { dg-error "Assumed rank" }
   end subroutine assumed_rank
 end module test
-
-! { dg-error "Array 'a' is not permitted in reduction" "" { target "*-*-*" } 
18 }
-! { dg-error "Array 'a' is not permitted in reduction" "" { target "*-*-*" } 
39 }
diff --git a/gcc/testsuite/gfortran.dg/goacc/list.f95 
b/gcc/testsuite/gfortran.dg/goacc/list.f95
index d2f4c5e88be..3d4ebf88720 100644
--- a/gcc/testsuite/gfortran.dg/goacc/list.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/list.f95
@@ -100,14 +100,14 @@ program test
   !$acc host_data use_device(10) ! { dg-error "Syntax error" }
 
   !$acc host_data use_device(/b/, /b/)
+  ! { dg-error "neither a POINTER nor an array" "" { target *-*-* } .-1 }
+  ! { dg-error "present on multiple clauses" "" { target *-*-* } .-2 }
   !$acc end host_data
-  ! { dg-error "neither a POINTER nor an array" "" { target *-*-* } 102 }
-  ! { dg-error "present on multiple clauses" "" { target *-*-* } 102 }
 
   !$acc host_data use_device(i, j, i)
+  ! { dg-error "neither a POINTER nor an array" "" { target *-*-* } .-1 }
+  ! { dg-error "present on multiple clauses" "" { target *-*-* } .-2 }
   !$acc end host_data
-  ! { dg-error "neither a POINTER nor an array" "" { target *-*-* } 107 }
-  ! { dg-error "present on multiple clauses" "" { target *-*-* } 107 }
 
   !$acc host_data use_device(p1)
   !$acc end host_data
diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-1-2.f95 
b/gcc/testsuite/gfortran.dg/goacc/loop-1-2.f95
index e048205d2c3..8846e7d2a1f 100644
--- a/gcc/testsuite/gfortran.dg/goacc/loop-1-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/loop-1-2.f95
@@ -148,8 +148,8 @@ subroutine test1
 !$acc parallel loop collapse(2)
 do i = 1, 3
 do r = 4, 6
+   ! { dg-error "ACC LOOP iteration variable must be of type integer" 
"" { target *-*-* } .-1 }
 end do
-! { dg-error "ACC LOOP iteration variable must be of type integer" "" 
{ target *-*-* } 150 }
 end do
 
   !$acc loop independent seq
diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-1.f95 
b/gcc/testsuite/gfortran.dg/goacc/loop-1.f95
index 776fa482af3..67dc97a3ecd 100644
--- a/gcc/testsuite/gfortran.dg/goacc/loop-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/loop-1.f95
@@ -148,8 +148,8 @@ subroutine test1
 !$acc parallel loop collapse(2)
 do i = 1, 3
 do r = 4, 6
+   ! { dg-error "ACC LOOP iteration variable must be of type integer" 
"" { target *-*-* } .-1 }
 end do
-! { dg-error "ACC LOOP iteration variable must be of type integer" "" 
{ target *-*-* } 150 }
 end do
 
   !$acc loop independent seq
diff --git a/gcc/testsuite/gfortran.dg/goacc/reduction.f95 
b/gcc/testsuite/gfortran.dg/goacc/reduction.f95
index a13574b150c..aaa82980e16 100644
--- a/gcc/testsuite/gfortran.dg/goacc/reduction.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/reduction.f95
@@ -25,14 +25,19 @@ save i2
 common /blk/ i1
 
 !$acc parallel reduction (+:ia2)
+! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } 
.-1 }
 !$acc end parallel
 !$acc parallel reduction (+:ra1)
+! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } 
.-1 }
 !$acc end parallel
 !$acc parallel reduction (+:ca1)
+! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } 
.-1 }
 !$acc

Re: [PATCH 10/11] aarch64: Refactor aarch64_rewrite_mcpu

2025-01-16 Thread Richard Sandiford
Andrew Carlotti  writes:
> Use aarch64_validate_mcpu instead of the existing duplicate (and worse)
> version of the -mcpu parsing code.
>
> The original code used fatal_error; I'm guessing that using error
> instead should be ok.
>
> gcc/ChangeLog:
>
>   * common/config/aarch64/aarch64-common.cc
>   (aarch64_rewrite_selected_cpu): Refactor and inline into...
>   (aarch64_rewrite_mcpu): this.
>   * config/aarch64/aarch64-protos.h
>   (aarch64_rewrite_selected_cpu): Delete.

OK, thanks.

Richard

> diff --git a/gcc/common/config/aarch64/aarch64-common.cc 
> b/gcc/common/config/aarch64/aarch64-common.cc
> index 
> 297210e3809255d51b1aff4c827501534fae9546..1848d31c2c23e053535458044e0fcfd38b8f659b
>  100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -741,60 +741,29 @@ aarch64_rewrite_march (int argc, const char **argv)
>return xstrdup (outstr.c_str ());
>  }
>  
> -/* Attempt to rewrite NAME, which has been passed on the command line
> -   as a -mcpu option to an equivalent -march value.  If we can do so,
> -   return the new string, otherwise return an error.  */
> +/* Called by the driver to rewrite a name passed to the -mcpu argument
> +   to an equivalent -march value to be passed to the assembler.  The
> +   names passed from the commend line will be in ARGV, we want
> +   to use the right-most argument, which should be in
> +   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
>  
>  const char *
> -aarch64_rewrite_selected_cpu (const char *name)
> +aarch64_rewrite_mcpu (int argc, const char **argv)
>  {
> -  std::string original_string (name);
> -  std::string extension_str;
> -  std::string processor;
> -  size_t extension_pos = original_string.find_first_of ('+');
> -
> -  /* Strip and save the extension string.  */
> -  if (extension_pos != std::string::npos)
> -{
> -  processor = original_string.substr (0, extension_pos);
> -  extension_str = original_string.substr (extension_pos,
> -   std::string::npos);
> -}
> -  else
> -{
> -  /* No extensions.  */
> -  processor = original_string;
> -}
> -
> -  const struct processor_info* p_to_a;
> -  for (p_to_a = all_cores;
> -   p_to_a->arch != aarch64_no_arch;
> -   p_to_a++)
> -{
> -  if (p_to_a->name == processor)
> - break;
> -}
> -
> -  const struct arch_info* a_to_an;
> -  for (a_to_an = all_architectures;
> -   a_to_an->arch != aarch64_no_arch;
> -   a_to_an++)
> -{
> -  if (a_to_an->arch == p_to_a->arch)
> - break;
> -}
> +  gcc_assert (argc);
> +  const char *name = argv[argc - 1];
> +  aarch64_cpu cpu;
> +  aarch64_feature_flags flags;
>  
> -  /* We couldn't find that processor name, or the processor name we
> - found does not map to an architecture we understand.  */
> -  if (p_to_a->arch == aarch64_no_arch
> -  || a_to_an->arch == aarch64_no_arch)
> -fatal_error (input_location, "unknown value %qs for %<-mcpu%>", name);
> +  aarch64_validate_mcpu (name, &cpu, &flags);
>  
> -  aarch64_feature_flags extensions = p_to_a->flags;
> -  aarch64_parse_extension (extension_str.c_str (), &extensions, NULL);
> +  const struct processor_info *entry;
> +  for (entry = all_cores; entry->processor != aarch64_no_cpu; entry++)
> +if (entry->processor == cpu)
> +  break;
>  
> -  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
> -   extensions);
> +  std::string outstr = aarch64_get_arch_string_for_assembler (entry->arch,
> +   flags);
>  
>/* We are going to memory leak here, nobody elsewhere
>   in the callchain is going to clean up after us.  The alternative is
> @@ -803,19 +772,6 @@ aarch64_rewrite_selected_cpu (const char *name)
>return xstrdup (outstr.c_str ());
>  }
>  
> -/* Called by the driver to rewrite a name passed to the -mcpu
> -   argument in preparation to be passed to the assembler.  The
> -   names passed from the commend line will be in ARGV, we want
> -   to use the right-most argument, which should be in
> -   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
> -
> -const char *
> -aarch64_rewrite_mcpu (int argc, const char **argv)
> -{
> -  gcc_assert (argc);
> -  return aarch64_rewrite_selected_cpu (argv[argc - 1]);
> -}
> -
>  /* Checks to see if the host CPU may not be Cortex-A53 or an unknown Armv8-a
> baseline CPU.  */
>  
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> b27da1e25720da06712da0eff1d527e23408a59f..4235f4a0ca51af49c2852a420f1056727b24f345
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -1210,7 +1210,6 @@ bool aarch64_validate_march (const char *, aarch64_arch 
> *,
>  bool aarch64_validate_mcpu (const char *, aarch64_cp

Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng



On 2025/1/16 at 8:59 PM, Xi Ruoyao wrote:

On Thu, 2025-01-16 at 20:52 +0800, Xi Ruoyao wrote:

On Thu, 2025-01-16 at 20:30 +0800, Lulu Cheng wrote:

On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote:

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 9d97f0216f0..3a8e1297bd3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
   
     /* If it's an add + mult (which is equivalent to shift left) and

     it's immediate operand satisfies const_immalsl_operand predicate.  */
-  if ((mode == SImode || (TARGET_64BIT && mode == DImode))
-     && GET_CODE (XEXP (x, 0)) == MULT)
+  if (code == PLUS

Hi,

This section of code is already within the "case PLUS" block, so I think
the condition "code == PLUS" is unnecessary.

The reason is "case MINUS:" falls through into "case PLUS:" and we don't
have a "slsl" instruction, so we need the check to reject things like a
- b * 4 here.

Pressed "send" too early: I think I need to mention this in the
changelog indeed.

It's my fault for not reading carefully enough; I missed the "case 
MINUS:". I have no other questions now.
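
A minimal sketch of why the check matters when one case label falls through into another. The enum and the is_shift_add_candidate function are hypothetical names for illustration only; this is not the real loongarch_rtx_costs code.

```cpp
// Hypothetical stand-in for the two RTL codes involved.
enum rtx_code_sketch { SK_PLUS, SK_MINUS };

// Return true if an expression with outer code CODE, whose first operand
// is a multiplication, could be costed as a single shift-and-add.
bool
is_shift_add_candidate (rtx_code_sketch code, bool first_operand_is_mult)
{
  switch (code)
    {
    case SK_MINUS:
      // Nothing specific to MINUS; control falls through to the PLUS case.
    case SK_PLUS:
      // Both codes reach this point, so CODE still has to be tested here.
      // Without the test, "a - b * 4" would be accepted as well.
      if (code == SK_PLUS && first_operand_is_mult)
        return true;
      return false;
    }
  return false;
}
```

With this shape, is_shift_add_candidate (SK_MINUS, true) is rejected even though it executes the code under the PLUS label.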




Re: [PATCH] c++: Make sure fold_sizeof_expr returns the correct type [PR117775]

2025-01-16 Thread Simon Martin
Hi Jakub, Jason,

On 15 Jan 2025, at 22:55, Jakub Jelinek wrote:

> On Wed, Jan 15, 2025 at 04:48:59PM -0500, Jason Merrill wrote:
>>> --- a/gcc/cp/decl.cc
>>> +++ b/gcc/cp/decl.cc
>>> @@ -11686,6 +11686,7 @@ fold_sizeof_expr (tree t)
>>> false, false);
>>> if (r == error_mark_node)
>>>   r = size_one_node;
>>> +  r = cp_fold_convert (TREE_TYPE (t), r);
>>
>> Instead of adding this conversion in all cases, let's change 
>> size_one_node
>> to
>>
>> build_int_cst (size_type_node, 1)
> That would need to be r = build_int_cst (TREE_TYPE (t), 1);
Jason is right: my patch does not need to do TREE_TYPE (t), and can 
simply use size_type_node - oversight on my part.

> I guess, while that is maybe fine, I don't see how it could avoid
> the cp_fold_convert call, because size_one_node can be returned
> also from e.g. c-family c_sizeof_or_alignof_type or its typeck.cc 
> callers,
> or it can be size_int (something) etc.
I took another look at the code paths in fold_sizeof_expr and I believe 
that we’re “good” in all of them except if we hit typeck.cc:2077 
(I have not been able so far to craft a test that does), because the 
code either builds a node with size_type_node as type, or goes through 
c_sizeof_or_alignof_type that fold_convert’s everything (except 
error_mark_node) to size_type_node (in c-common.cc:4028).

I can run a regression test round with a patch that uses 
“build_int_cst (size_type_node, 1)” in decl.cc:11930 and 
typeck.cc:2077, which should cover all the *current* cases for 
fold_sizeof_expr. However, the initial patch has the advantage that:
  - It’s consistent with what c_sizeof_or_alignof_type does
  - It will cover the (unlikely?) possibility that some new code path is 
added some day that does not use the right type
  - It adds no cost to nominal cases, since cp_fold_convert does nothing 
if we already have the right type

Jason, would you still like me to test and submit a patch that uses 
build_int_cst in the two places identified above instead of doing 
cp_fold_convert (size_type_node, r) in decl.cc:11930?

Thanks, Simon




Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Xi Ruoyao
On Thu, 2025-01-16 at 20:52 +0800, Xi Ruoyao wrote:
> On Thu, 2025-01-16 at 20:30 +0800, Lulu Cheng wrote:
> > 
> > On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote:
> > > diff --git a/gcc/config/loongarch/loongarch.cc 
> > > b/gcc/config/loongarch/loongarch.cc
> > > index 9d97f0216f0..3a8e1297bd3 100644
> > > --- a/gcc/config/loongarch/loongarch.cc
> > > +++ b/gcc/config/loongarch/loongarch.cc
> > > @@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine_mode mode, 
> > > int outer_code,
> > >   
> > >     /* If it's an add + mult (which is equivalent to shift left) and
> > >    it's immediate operand satisfies const_immalsl_operand 
> > > predicate.  */
> > > -  if ((mode == SImode || (TARGET_64BIT && mode == DImode))
> > > -   && GET_CODE (XEXP (x, 0)) == MULT)
> > > +  if (code == PLUS
> > 
> > Hi,
> > 
> > This section of code is already within the "case PLUS" block, so I think 
> > the condition "code == PLUS" is unnecessary.
> 
> The reason is "case MINUS:" falls through into "case PLUS:" and we don't
> have a "slsl" instruction, so we need the check to reject things like a
> - b * 4 here.

Pressed "send" too early: I think I need to mention this in the
changelog indeed.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH]middle-end: Add early break conditions to vect-switch-search-line-fast.c [PR118451]

2025-01-16 Thread Tamar Christina
Hi All,

When this test was added initially it didn't add the early break effective
target tests.

This means that the test was "passing" (as in, it was failing to vectorize)
because many targets don't support early break.

But the test should not have been run for these targets.  When the vectorizer
learned PFA the test started passing for 32-bit targets. I had adjusted the
testcase but failed to notice the requirements were wrong.

Thus this adds the extra guards, and on targets that don't support early break
this test will move to UNRESOLVED, which is what it should have been all
along...

Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR testsuite/118451
* gcc.dg/vect/vect-switch-search-line-fast.c: Add early_break guards.

---
diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c 
b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
index 
02ad7a451ca2218cf827d3b6cca4b36950fba555..21c77f49ebd7d99e9cec9a542da2335e588b45ba
 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
@@ -1,6 +1,8 @@
 /* PR116126 -- once this works use this version in libcpp/lex.c.
This also requires working value range propagation for s/end.  */
 /* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
 /* { dg-require-effective-target vect_int } */
 
 const unsigned char *search_line_fast2 (const unsigned char *s,









Re: [PATCH]middle-end: Add early break conditions to vect-switch-search-line-fast.c [PR118451]

2025-01-16 Thread Richard Biener
On Thu, 16 Jan 2025, Tamar Christina wrote:

> Hi All,
> 
> When this test was added initially it didn't add the early break effective
> target tests.
> 
> This means that the test was "passing" (as in, it was failing to vectorize)
> because many targets don't support early break.
> 
> But the test should not have been run for these targets.  When the vectorizer
> learned PFA the test started passing for 32-bit targets. I had adjusted the
> testcase but failed to notice the requirements were wrong.
> 
> Thus this adds the extra guards, and on targets that don't support early break
> this test will move to UNRESOLVED, which is what it should have been all
> along...
> 
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/118451
>   * gcc.dg/vect/vect-switch-search-line-fast.c: Add early_break guards.
> 
> ---
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c 
> b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> index 
> 02ad7a451ca2218cf827d3b6cca4b36950fba555..21c77f49ebd7d99e9cec9a542da2335e588b45ba
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> @@ -1,6 +1,8 @@
>  /* PR116126 -- once this works use this version in libcpp/lex.c.
> This also requires working value range propagation for s/end.  */
>  /* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
>  /* { dg-require-effective-target vect_int } */
>  
>  const unsigned char *search_line_fast2 (const unsigned char *s,
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[committed v2] libstdc++: Move std::basic_ostream to new internal header [PR99995]

2025-01-16 Thread Jonathan Wakely
This adds <bits/ostream.h> so that other headers don't need to include
all of <ostream>, which pulls in all of <format> since C++23 (for the
std::print and std::println overloads in <ostream>). This new header
allows the constrained operator<< in <bits/unique_ptr.h> to be defined
without all of std::format being compiled.

We could also replace <ostream> with <bits/ostream.h> in all of
, , , and . That seems more
likely to cause problems for users who might be expecting  to
define std::endl, for example. Although the standard doesn't guarantee
that, it is more reasonable than expecting  to define it! We can
look into making those changes for GCC 16.
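
A minimal sketch of the portable pattern this implies for user code: include the stream header explicitly rather than relying on some other header to define std::endl. The describe function is a hypothetical example and is not part of the patch.

```cpp
#include <memory>   // std::unique_ptr
#include <ostream>  // std::endl; not guaranteed to be provided by <memory>
#include <sstream>  // std::ostringstream
#include <string>

// Format the value owned by P, followed by a newline.
std::string
describe (const std::unique_ptr<int> &p)
{
  std::ostringstream out;
  out << *p << std::endl;
  return out.str ();
}
```

Code written this way keeps compiling regardless of which internal headers the library's other headers happen to include.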

libstdc++-v3/ChangeLog:

PR libstdc++/99995
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/unique_ptr.h: Include bits/ostream.h instead of
ostream.
* include/std/ostream: Include new header.
* include/bits/ostream.h: New file.
---

Tested x86_64-linux. Pushed to trunk to partially address the PR 99995
regression that our headers just keep pulling in too much stuff with
each new -std mode.

 libstdc++-v3/include/Makefile.am   |   1 +
 libstdc++-v3/include/Makefile.in   |   1 +
 libstdc++-v3/include/bits/ostream.h| 814 +
 libstdc++-v3/include/bits/unique_ptr.h |   2 +-
 libstdc++-v3/include/std/ostream   | 763 +--
 5 files changed, 818 insertions(+), 763 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/ostream.h

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 69f6c33b2955..de25aadd219d 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -137,6 +137,7 @@ bits_freestanding = \
${bits_srcdir}/memoryfwd.h \
${bits_srcdir}/monostate.h \
${bits_srcdir}/move.h \
+   ${bits_srcdir}/ostream.h \
${bits_srcdir}/out_ptr.h \
${bits_srcdir}/predefined_ops.h \
${bits_srcdir}/parse_numbers.h \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 4c461afb7513..5a20dfb69b0e 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -492,6 +492,7 @@ bits_freestanding = \
${bits_srcdir}/memoryfwd.h \
${bits_srcdir}/monostate.h \
${bits_srcdir}/move.h \
+   ${bits_srcdir}/ostream.h \
${bits_srcdir}/out_ptr.h \
${bits_srcdir}/predefined_ops.h \
${bits_srcdir}/parse_numbers.h \
diff --git a/libstdc++-v3/include/bits/ostream.h 
b/libstdc++-v3/include/bits/ostream.h
new file mode 100644
index ..8ee63d2d66e5
--- /dev/null
+++ b/libstdc++-v3/include/bits/ostream.h
@@ -0,0 +1,814 @@
+// Output streams -*- C++ -*-
+
+// Copyright (C) 1997-2024 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file bits/ostream.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{ostream}
+ */
+
+//
+// ISO C++ 14882: 27.6.2  Output streams
+//
+
+#ifndef _GLIBCXX_OSTREAM_H
+#define _GLIBCXX_OSTREAM_H 1
+
+#ifdef _GLIBCXX_SYSHDR
+#pragma GCC system_header
+#endif
+
+#include  // iostreams
+
+#include 
+#include 
+
+# define __glibcxx_want_print
+#include  // __glibcxx_syncbuf
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /**
+   *  @brief  Template class basic_ostream.
+   *  @ingroup io
+   *
+   *  @tparam _CharT  Type of character stream.
+   *  @tparam _Traits  Traits for character type, defaults to
+   *   char_traits<_CharT>.
+   *
+   *  This is the base class for all output streams.  It provides text
+   *  formatting of all builtin types, and communicates with any class
+   *  derived from basic_streambuf to do the actual output.
+  */
+  template<typename _CharT, typename _Traits>
+class basic_ostream : virtual public basic_ios<_CharT, _Traits>
+{
+public:
+  // Types (inherited from basic_ios):
+  typedef _CharT   char_type;
+  typedef typename _

Re: [PATCH 11/11] aarch64: Make AARCH64_FL_CRYPTO always unset

2025-01-16 Thread Richard Sandiford
Andrew Carlotti  writes:
> This feature flag bit only exists to support the +crypto alias.  Outside
> of option processing this bit needs to be set or unset consistently.
> This patch goes with the latter option.
>
> gcc/ChangeLog:
>
>   * common/config/aarch64/aarch64-common.cc: Assert that CRYPTO
>   bit is not set.
>   * config/aarch64/aarch64-feature-deps.h
>   (info.explicit_on): Unset CRYPTO bit.
>   (cpu_##CORE_IDENT): Ditto.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/crypto-alias-1.c: New test.

OK, thanks.  I'd rather get rid of AARCH64_FL_CRYPTO instead (without
changing observable behaviour), but that's a bigger project and not
suitable for this stage.

Richard
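
A minimal sketch of the "always unset" convention for an alias bit, assuming a plain 64-bit mask. The bit assignments and the strip_alias helper are hypothetical and chosen only for illustration; they do not match the real AARCH64_FL_* values.

```cpp
#include <cstdint>

using feature_flags = std::uint64_t;

// Hypothetical bit assignments, for illustration only.
constexpr feature_flags FL_AES    = 1u << 0;
constexpr feature_flags FL_SHA2   = 1u << 1;
constexpr feature_flags FL_CRYPTO = 1u << 2; // alias bit, not a real feature

// Combine two feature sets and clear the alias bit, so that masks used
// outside option processing never carry it.
constexpr feature_flags
strip_alias (feature_flags enable, feature_flags extra)
{
  return (enable | extra) & ~FL_CRYPTO;
}

// Mirrors the kind of invariant the patch asserts at run time.
static_assert ((strip_alias (FL_CRYPTO, FL_AES) & FL_CRYPTO) == 0,
               "alias bit must be clear outside option processing");
```

Clearing the bit at every point where a mask is built is what lets a single assertion on the consumer side check the invariant.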

> diff --git a/gcc/common/config/aarch64/aarch64-common.cc 
> b/gcc/common/config/aarch64/aarch64-common.cc
> index 
> 1848d31c2c23e053535458044e0fcfd38b8f659b..8af3aa71be8a8d56ea3654e194dc58e81345178f
>  100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -620,6 +620,10 @@ aarch64_get_extension_string_for_isa_flags
>  {
>std::string outstr = "";
>  
> +  /* The CRYPTO bit should only be used to support the +crypto alias
> + during option processing, and should be cleared at all other times.
> + Verify this property for the supplied flags bitmask.  */
> +  gcc_assert (!(AARCH64_FL_CRYPTO & aarch64_isa_flags));
>aarch64_feature_flags current_flags = default_arch_flags;
>  
>/* As a special case, do not assume that the assembler will enable CRC
> diff --git a/gcc/config/aarch64/aarch64-feature-deps.h 
> b/gcc/config/aarch64/aarch64-feature-deps.h
> index 
> 67c3a5da8aa3f59607b9eb20fb329c6fdef2d46f..55a0dbfae6107388d97528b637f9120cc6b933a1
>  100644
> --- a/gcc/config/aarch64/aarch64-feature-deps.h
> +++ b/gcc/config/aarch64/aarch64-feature-deps.h
> @@ -56,7 +56,8 @@ get_enable (T1 i, Ts... args)
>  
> - explicit_on: the transitive closure of the features that an
>   explicit +FEATURE enables, including FLAG itself.  This is
> - always a superset of ENABLE
> + always a superset of ENABLE, except that the CRYPTO alias bit is
> + explicitly unset for consistency.
>  
> Also define a function FEATURE () that returns an info
> (which is an empty structure, since all members are static).
> @@ -69,7 +70,8 @@ template struct info;
>template<> struct info {   \
>  static constexpr auto flag = AARCH64_FL_##IDENT; \
>  static constexpr auto enable = flag | get_enable REQUIRES;   
> \
> -static constexpr auto explicit_on = enable | get_enable EXPLICIT_ON; \
> +static constexpr auto explicit_on   \
> +  = (enable | get_enable EXPLICIT_ON) & ~AARCH64_FL_CRYPTO; \
>}; \
>constexpr aarch64_feature_flags info::flag;
> \
>constexpr aarch64_feature_flags info::enable;  
> \
> @@ -114,9 +116,11 @@ get_flags_off (aarch64_feature_flags mask)
>  #include "config/aarch64/aarch64-option-extensions.def"
>  
>  /* Define cpu_ variables for each CPU, giving the transitive
> -   closure of all the features that the CPU supports.  */
> +   closure of all the features that the CPU supports.  The CRYPTO bit is just
> +   an alias, so explicitly unset it for consistency.  */
>  #define AARCH64_CORE(A, CORE_IDENT, C, ARCH_IDENT, FEATURES, F, G, H, I) \
> -  constexpr auto cpu_##CORE_IDENT = ARCH_IDENT ().enable | get_enable 
> FEATURES;
> +  constexpr auto cpu_##CORE_IDENT \
> += (ARCH_IDENT ().enable | get_enable FEATURES) & ~AARCH64_FL_CRYPTO;
>  #include "config/aarch64/aarch64-cores.def"
>  
>  /* Define fmv_deps_ variables for each FMV feature, giving the 
> transitive
> diff --git a/gcc/testsuite/gcc.target/aarch64/crypto-alias-1.c 
> b/gcc/testsuite/gcc.target/aarch64/crypto-alias-1.c
> new file mode 100644
> index 
> ..e2662b44e2db0b26f30c44e62c4b873a12a37972
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/crypto-alias-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mcpu=ampere1" } */
> +
> +__attribute__ ((__always_inline__))
> +__attribute__ ((target ("arch=armv8-a+crypto")))
> +inline int bar()
> +{
> +  return 5;
> +}
> +
> +int foo()
> +{
> +  return bar();
> +}


[PATCH] arm: [MVE] Fix predicates for vec_cmp, vec_vcmpu and vcond_mask (PR 115439)

2025-01-16 Thread Christophe Lyon
When compiling c-c++-common/vector-compare-3.c with
-march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto
(which enables MVE), we fail to match vcond_mask because operand 3 has
s_register_operand as predicate for a MVE_VPRED mode, but we try to
match:
(insn 26 25 27 2 (set (reg:V4SI 137)
 (unspec:V4SI [
 (reg:V4SI 144)
 (reg:V4SI 145)
 (subreg:V4BI (reg:HI 143) 0)
 ] VPSELQ_S)) 
"/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1
  (nil))

The fix is to use the right predicate: vpr_register_operand.

The patch also fixes vec_cmp and vec_cmpu in the same way.

When testing with
-mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto
it fixes the ICEs in c-c++-common/vector-compare-3.c,
g++.dg/opt/pr79734.C, g++.dg/tree-ssa/pr50.C and
gcc.dg/tree-ssa/pr50.c

gcc/ChangeLog

PR target/115439
* config/arm/mve.md (vec_cmp, vec_cmpu, vcond_mask): Use
vpr_register_operand predicate for MVE_VPRED operands.
---
 gcc/config/arm/mve.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0c0337f9ee2..8527bd753e3 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -4587,7 +4587,7 @@ (define_expand "mov"
 ;; Expanders for vec_cmp and vcond
 
 (define_expand "vec_cmp"
-  [(set (match_operand: 0 "s_register_operand")
+  [(set (match_operand: 0 "vpr_register_operand")
(match_operator: 1 "comparison_operator"
  [(match_operand:MVE_VLD_ST 2 "s_register_operand")
   (match_operand:MVE_VLD_ST 3 "reg_or_zero_operand")]))]
@@ -4600,7 +4600,7 @@ (define_expand "vec_cmp"
 })
 
 (define_expand "vec_cmpu"
-  [(set (match_operand: 0 "s_register_operand")
+  [(set (match_operand: 0 "vpr_register_operand")
(match_operator: 1 "comparison_operator"
  [(match_operand:MVE_2 2 "s_register_operand")
   (match_operand:MVE_2 3 "reg_or_zero_operand")]))]
@@ -4614,7 +4614,7 @@ (define_expand "vec_cmpu"
 (define_expand "vcond_mask_"
   [(set (match_operand:MVE_VLD_ST 0 "s_register_operand")
(if_then_else:MVE_VLD_ST
- (match_operand: 3 "s_register_operand")
+ (match_operand: 3 "vpr_register_operand")
  (match_operand:MVE_VLD_ST 1 "s_register_operand")
  (match_operand:MVE_VLD_ST 2 "s_register_operand")))]
   "TARGET_HAVE_MVE"
-- 
2.34.1
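
For readers less familiar with these expanders, here is a minimal C sketch of a source-level vector comparison feeding a lane-wise select, which is the kind of operation vec_cmp and vcond_mask are meant to handle.  It is not the content of c-c++-common/vector-compare-3.c.  The names v4si and select_lt are made up for illustration, and the code assumes a compiler with the GNU vector extensions (GCC or Clang).  Whether a given GCC version maps exactly this code to vcond_mask is also an assumption.

```c
/* Hypothetical illustration; assumes GNU vector extensions.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Return, lane by lane, c where a < b and d elsewhere.  */
v4si
select_lt (v4si a, v4si b, v4si c, v4si d)
{
  /* A vector comparison yields -1 (all bits set) in lanes where it
     holds and 0 in the other lanes.  */
  v4si mask = a < b;

  /* Lane-wise select driven by the mask.  */
  return (c & mask) | (d & ~mask);
}
```

On MVE the comparison result has an MVE_VPRED mode, as the commit message says, so an operand carrying it needs a predicate that accepts the predicate register rather than s_register_operand.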



Re: [PATCH] tailc, v2: Virtually undo IPA-VRP return value optimization for tail calls [PR118430]

2025-01-16 Thread Richard Biener
On Wed, 15 Jan 2025, Jakub Jelinek wrote:

> On Wed, Jan 15, 2025 at 03:16:04PM +0100, Richard Biener wrote:
> > > +  /* If IPA-VRP proves called function always returns a singleton 
> > > range,
> > > +  the return value is replaced by the only value in that range.
> > > +  For tail call purposes, pretend such replacement didn't happen.  */
> > > +  if (ass_var == NULL_TREE
> > > +   && !tail_recursion
> > > +   && TREE_CONSTANT (ret_var))
> > > + if (tree type = gimple_range_type (call))
> > > +   if (tree callee = gimple_call_fndecl (call))
> > > + if ((INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
> > > + && useless_type_conversion_p (TREE_TYPE (TREE_TYPE (callee)),
> > > +   type)
> > > + && useless_type_conversion_p (TREE_TYPE (ret_var), type)
> > > + && ipa_return_value_range (val, callee)
> > > + && val.singleton_p (&valr)
> > 
> > I suppose it's good enough to check
> > 
> >  useless_type_conversion_p (TREE_TYPE (ret_var), TREE_TYPE (valr))?
> 
> One of the checks is for the gimple_fntype to actually match the type of
> the gimple_call_fndecl, one could always have some ugly hacks like
> int foo (void);
> ...
>   long (*fn) (void) = (long (*) (void)) foo;
>   fn ();
> etc., copied over from gimple-range-fold.cc, the other is making sure
> it matches also the caller's return type.
> 
> > but the bigger question is whether RTL expansion does the right thing
> > here without changing the IL at this point back to return the LHS
> > and put that back?  IIRC there are some sanity checks upon tail call
> > expansion, but does this all work when the call itself doesn't have
> > its return value used?  Aka some call expansion checks might be
> > elided in this case.
> 
> It does the right thing, it just relies on the tailc pass to do its job
> properly.
> E.g. when we have
>[local count: 1073741824]:
>   foo (x_2(D));
>   baz (&v);
>   v ={v} {CLOBBER(eos)};
>   bar (x_2(D)); [tail call]
>   return 1;
> when expand_gimple_basic_block handles the bar (x_2(D)); call, it uses
>   if (call_stmt && gimple_call_tail_p (call_stmt))
> {
>   bool can_fallthru;
>   new_bb = expand_gimple_tailcall (bb, call_stmt, &can_fallthru);
>   if (new_bb)
> {
>   if (can_fallthru)
> bb = new_bb;
>   else
> {
>   currently_expanding_gimple_stmt = NULL;
>   return new_bb;
> }
> }
> }
> As it is actually tail callable during expansion of the bar (x_2(D)); call
> stmt, expand_gimple_tailcall returns non-NULL and sets can_fallthru to false,
> plus emits
> ;; bar (x_2(D)); [tail call]
> 
> (insn 11 10 12 2 (set (reg:SI 5 di)
> (reg/v:SI 99 [ x ])) "pr118430.c":35:10 -1
>  (nil))
> 
> (call_insn/j 12 11 13 2 (set (reg:SI 0 ax)
> (call (mem:QI (symbol_ref:DI ("bar") [flags 0x3] <function_decl 0x7fb39020bd00 bar>) [0 bar S1 A8])
> (const_int 0 [0]))) "pr118430.c":35:10 -1
>  (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar") [flags 0x3]  
> )
> (expr_list:REG_EH_REGION (const_int 0 [0])
> (nil)))
> (expr_list:SI (use (reg:SI 5 di))
> (nil)))
> 
> (barrier 13 12 0)
> Because it doesn't fallthru, no further statements in the same bb are
> expanded.  Now, if the bb with return happened to be in some other basic
> block from the [tail call], it could be expanded but because the bb with
> tail call ends with a barrier, it doesn't fall thru there and if nothing
> else could reach it, we'd remove the unreachable bb RSN.
> 
> > So I feel a bit nervous marking sth as tail-call that doesn't
> > actually look like one (unless we make it so again).
> 
> Expansion really counts on tailc to verify all the following statements
> are useless.  Even if it is
>   _1 = bar (...); [tail call]
>   return _1;
> it is treated the same, we also don't expand the return _1; there
> separately, after all, nothing initializes the _1 anywhere when expanding
> the tail call (unless tail call fails and we expand it as normal call of
> course).  And as tailc allows, there could be further statements, copying of
> SSA_NAMEs, debug statements, clobber statements, ...

I see.

The patch is OK then.

Thanks,
Richard.

>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
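
A small C sketch of the situation discussed in this thread may help.  The names bar, foo and calls are made up.  The statement that IPA-VRP replaces the return value with the singleton constant comes from the quoted patch comment; whether a particular compiler version performs the replacement on this exact input is an assumption.

```c
/* Hypothetical example; bar, foo and calls are illustrative names.  */
int calls;

/* Always returns 1, so interprocedural value-range propagation can
   prove that the return range is the singleton {1}.  */
__attribute__ ((noinline)) int
bar (int x)
{
  calls += x;
  return 1;
}

/* In the source this is a plain tail call.  Once the return value has
   been replaced by the known constant, the GIMPLE looks like
       bar (x);
       return 1;
   and the tailc pass has to see through that to still mark the call
   as a tail call.  */
__attribute__ ((noinline)) int
foo (int x)
{
  return bar (x);
}
```

Either form returns the same value to the caller of foo, which is why pretending the replacement did not happen is safe for tail call purposes.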


[PATCH v5 1/4] RISC-V: Add Zicfiss ISA extension.

2025-01-16 Thread Monk Chiang
This patch is implemented according to the RISC-V CFI specification.
It supports the generation of shadow stack instructions in the prologue,
epilogue, non-local gotos, and unwinding.

RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFISS ISA string.
* gcc/config/riscv/predicates.md: New predicate x1x5_operand.
* gcc/config/riscv/riscv.cc
  (riscv_expand_prologue): Insert shadow stack instructions.
  (riscv_expand_epilogue): Likewise.
  (riscv_for_each_saved_reg): Assign t0 or ra register for
  sspopchk instruction.
  (need_shadow_stack_push_pop_p): New function. Omit shadow
  stack operation on leaf function.
* gcc/config/riscv/riscv.h
  (need_shadow_stack_push_pop_p): Define.
* gcc/config/riscv/riscv.md: Add shadow stack patterns.
  (save_stack_nonlocal): Add shadow stack instructions for setjmp.
  (restore_stack_nonlocal): Add shadow stack instructions for longjmp.

libgcc/ChangeLog:
* gcc/config/riscv/riscv.opt (TARGET_ZICFISS): Define.
* libgcc/config/riscv/linux-unwind.h: Include shadow-stack-unwind.h.
* libgcc/config/riscv/shadow-stack-unwind.h
  (_Unwind_Frames_Extra): Define.
  (_Unwind_Frames_Increment): Define.

gcc/testsuite/ChangeLog:
* gcc/testsuite/gcc.target/riscv/ssp-1.c: New test.
* gcc/testsuite/gcc.target/riscv/ssp-2.c: New test.

Co-Developed-by: Greg McGary ,
 Kito Cheng  
---
 gcc/common/config/riscv/riscv-common.cc   |   7 ++
 gcc/config/riscv/predicates.md|   6 ++
 gcc/config/riscv/riscv.cc |  58 --
 gcc/config/riscv/riscv.h  |   1 +
 gcc/config/riscv/riscv.md | 125 +-
 gcc/config/riscv/riscv.opt|   2 +
 gcc/testsuite/gcc.target/riscv/ssp-1.c|  41 +++
 gcc/testsuite/gcc.target/riscv/ssp-2.c|  10 ++
 libgcc/config/riscv/linux-unwind.h|   5 +
 libgcc/config/riscv/shadow-stack-unwind.h |  74 +
 10 files changed, 320 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/ssp-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/ssp-2.c
 create mode 100644 libgcc/config/riscv/shadow-stack-unwind.h
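
To make the prologue and epilogue behaviour concrete, here is a tiny software model of a shadow stack in C.  It is only a sketch under my reading of the CFI specification: the prologue pushes the return address, and the epilogue pops it and compares it with the return address about to be used.  The names shadow_stack, ss_push and ss_popchk are hypothetical and stand in for the hardware instructions; a false return value models the trap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SS_DEPTH 64

/* Model state: the shadow stack and its current depth.  */
static uintptr_t shadow_stack[SS_DEPTH];
static size_t ss_top;

/* Prologue side: record the return address.  Returns false when the
   model runs out of space.  */
bool
ss_push (uintptr_t ra)
{
  if (ss_top == SS_DEPTH)
    return false;
  shadow_stack[ss_top++] = ra;
  return true;
}

/* Epilogue side: pop the saved address and compare it with the one
   about to be used.  A false result models the trap.  */
bool
ss_popchk (uintptr_t ra)
{
  if (ss_top == 0)
    return false;
  return shadow_stack[--ss_top] == ra;
}
```

According to the ChangeLog above, the push and pop are omitted for leaf functions.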

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index bfc8aa559c5..8e8b6107a6d 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -111,6 +111,9 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zfinx", "zicsr"},
   {"zdinx", "zicsr"},
 
+  {"zicfiss", "zicsr"},
+  {"zicfiss", "zimop"},
+
   {"zk", "zkn"},
   {"zk", "zkr"},
   {"zk", "zkt"},
@@ -325,6 +328,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zicclsm",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"ziccrse",  ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zicfiss", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zimop", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zcmop", ISA_SPEC_CLASS_NONE, 1, 0},
 
@@ -1647,6 +1652,8 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   RISCV_EXT_FLAG_ENTRY ("zicbop", x_riscv_zicmo_subext, MASK_ZICBOP),
   RISCV_EXT_FLAG_ENTRY ("zic64b", x_riscv_zicmo_subext, MASK_ZIC64B),
 
+  RISCV_EXT_FLAG_ENTRY ("zicfiss", x_riscv_zi_subext, MASK_ZICFISS),
+
   RISCV_EXT_FLAG_ENTRY ("zimop", x_riscv_mop_subext, MASK_ZIMOP),
   RISCV_EXT_FLAG_ENTRY ("zcmop", x_riscv_mop_subext, MASK_ZCMOP),
 
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index cda7502a62a..1f67d30be9d 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -679,3 +679,9 @@
   return (riscv_symbolic_constant_p (op, &type)
  && type == SYMBOL_PCREL);
 })
+
+;; Shadow stack operands only allow x1, x5 registers
+(define_predicate "x1x5_operand"
+  (and (match_operand 0 "register_operand")
+   (match_test "REGNO (op) == RETURN_ADDR_REGNUM
+   || REGNO (op) == T0_REGNUM")))
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 65e09842fde..cd37b492183 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7496,6 +7496,9 @@ riscv_save_reg_p (unsigned int regno)
   if (regno == GP_REGNUM || regno == THREAD_POINTER_REGNUM)
return false;
 
+  if (regno == RETURN_ADDR_REGNUM && TARGET_ZICFISS)
+   return true;
+
   /* We must save every register used in this function.  If this is not a
 leaf function, then we must save all temporary registers.  */
   if (df_regs_ever_live_p (regno)
@@ -8049,7 +8052,7 @@ riscv_is_eh_return_data_register (unsigned int regno)
 
 static void
 riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn,
- bool epilogue, bool maybe_eh_return)
+ bool epilogue, bool maybe_eh_return, bool sibcall_p)

[PATCH v5 3/4] RISC-V: Add .note.gnu.property for ZICFILP and ZICFISS ISA extension

2025-01-16 Thread Monk Chiang
gcc/ChangeLog:
* gcc/config/riscv/riscv.cc
(riscv_file_end_indicate_exec_stack): Add .note.gnu.property.
* gcc/config/riscv/linux.h (TARGET_ASM_FILE_END): Define.

libgcc/ChangeLog:
* libgcc/config/riscv/crti.S: Add lpad instructions.
* libgcc/config/riscv/crtn.S: Likewise.
* libgcc/config/riscv/save-restore.S: Likewise.
* libgcc/config/riscv/riscv-asm.h: Add GNU_PROPERTY for ZICFILP,
  ZICFISS.

Co-Developed-by: Jesse Huang 
---
 gcc/config/riscv/riscv.cc  | 52 +-
 libgcc/config/riscv/crti.S |  2 +
 libgcc/config/riscv/crtn.S |  2 +
 libgcc/config/riscv/riscv-asm.h| 69 +-
 libgcc/config/riscv/save-restore.S |  5 +++
 5 files changed, 128 insertions(+), 2 deletions(-)
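
The layout that riscv_file_end emits in the diff below can be modelled in C as follows.  This is a sketch, not part of the patch: build_property_note and put_word are hypothetical helpers, the words are written in host byte order, and the pr_type value 0xc0000000 is what I understand GNU_PROPERTY_RISCV_FEATURE_1_AND to be in the RISC-V psABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; see the note above.  */
#define NT_GNU_PROPERTY_TYPE_0 5u
#define RISCV_FEATURE_1_AND_TYPE 0xc0000000u

/* Append a 32-bit word in host byte order and return the new offset.  */
static size_t
put_word (unsigned char *buf, size_t off, uint32_t value)
{
  memcpy (buf + off, &value, sizeof value);
  return off + sizeof value;
}

/* Fill BUF (at least 32 bytes) with a .note.gnu.property note that
   carries one FEATURE_1_AND property.  Returns the bytes written.  */
size_t
build_property_note (unsigned char *buf, uint32_t feature_1_and,
                     int is_64bit)
{
  size_t align = is_64bit ? 8 : 4;
  /* pr_type + pr_datasz + 4 bytes of data, padded to the alignment.  */
  size_t descsz = (12 + align - 1) & ~(align - 1);
  size_t off = 0;

  off = put_word (buf, off, 4);                      /* name length */
  off = put_word (buf, off, (uint32_t) descsz);      /* data length */
  off = put_word (buf, off, NT_GNU_PROPERTY_TYPE_0); /* note type */
  memcpy (buf + off, "GNU", 4);                      /* vendor name */
  off += 4;

  off = put_word (buf, off, RISCV_FEATURE_1_AND_TYPE); /* pr_type */
  off = put_word (buf, off, 4);                        /* pr_datasz */
  off = put_word (buf, off, feature_1_and);            /* feature bits */
  while (off % align != 0)                             /* padding */
    buf[off++] = 0;
  return off;
}
```

Under these assumptions the note is 32 bytes in a 64-bit object and 28 bytes in a 32-bit object.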

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4afb0b95839..cb448aba9c0 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10334,6 +10334,56 @@ riscv_file_start (void)
 riscv_emit_attribute ();
 }
 
+void
+riscv_file_end ()
+{
+  file_end_indicate_exec_stack ();
+  long GNU_PROPERTY_RISCV_FEATURE_1_AND  = 0;
+  unsigned long feature_1_and = 0;
+
+  if (TARGET_ZICFISS)
+feature_1_and |= 0x1 << 0;
+
+  if (TARGET_ZICFILP)
+feature_1_and |= 0x1 << 1;
+
+  if (feature_1_and)
+{
+  /* Generate .note.gnu.property section.  */
+  switch_to_section (get_section (".note.gnu.property",
+ SECTION_NOTYPE, NULL));
+
+  /* The program property descriptor is aligned to 4 bytes in 32-bit
+objects and 8 bytes in 64-bit objects.  */
+  unsigned p2align = TARGET_64BIT ? 3 : 2;
+
+  fprintf (asm_out_file, "\t.p2align\t%u\n", p2align);
+  /* name length.  */
+  fprintf (asm_out_file, "\t.long\t1f - 0f\n");
+  /* data length.  */
+  fprintf (asm_out_file, "\t.long\t5f - 2f\n");
+  /* note type.  */
+  fprintf (asm_out_file, "\t.long\t5\n");
+  fprintf (asm_out_file, "0:\n");
+  /* vendor name: "GNU".  */
+  fprintf (asm_out_file, "\t.asciz\t\"GNU\"\n");
+  fprintf (asm_out_file, "1:\n");
+
+  /* pr_type.  */
+  fprintf (asm_out_file, "\t.p2align\t3\n");
+  fprintf (asm_out_file, "2:\n");
+  fprintf (asm_out_file, "\t.long\t0xc0000000\n");
+  /* pr_datasz.  */
+  fprintf (asm_out_file, "\t.long\t4f - 3f\n");
+  fprintf (asm_out_file, "3:\n");
+  /* zicfiss, zicfilp.  */
+  fprintf (asm_out_file, "\t.long\t%x\n", feature_1_and);
+  fprintf (asm_out_file, "4:\n");
+  fprintf (asm_out_file, "\t.p2align\t%u\n", p2align);
+  fprintf (asm_out_file, "5:\n");
+}
+}
+
 /* Implement TARGET_ASM_OUTPUT_MI_THUNK.  Generate rtl rather than asm text
in order to avoid duplicating too much logic from elsewhere.  */
 
@@ -13975,7 +14025,7 @@ bool need_shadow_stack_push_pop_p ()
 #undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
 #define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
 #undef TARGET_ASM_FILE_END
-#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
+#define TARGET_ASM_FILE_END riscv_file_end
 
 #undef TARGET_EXPAND_BUILTIN_VA_START
 #define TARGET_EXPAND_BUILTIN_VA_START riscv_va_start
diff --git a/libgcc/config/riscv/crti.S b/libgcc/config/riscv/crti.S
index 89bac706c63..3a67fd77156 100644
--- a/libgcc/config/riscv/crti.S
+++ b/libgcc/config/riscv/crti.S
@@ -1 +1,3 @@
 /* crti.S is empty because .init_array/.fini_array are used exclusively. */
+
+#include "riscv-asm.h"
diff --git a/libgcc/config/riscv/crtn.S b/libgcc/config/riscv/crtn.S
index ca6ee7b6fba..cb80782bb55 100644
--- a/libgcc/config/riscv/crtn.S
+++ b/libgcc/config/riscv/crtn.S
@@ -1 +1,3 @@
 /* crtn.S is empty because .init_array/.fini_array are used exclusively. */
+
+#include "riscv-asm.h"
diff --git a/libgcc/config/riscv/riscv-asm.h b/libgcc/config/riscv/riscv-asm.h
index b6dbeaedc20..73bddb3f9e7 100644
--- a/libgcc/config/riscv/riscv-asm.h
+++ b/libgcc/config/riscv/riscv-asm.h
@@ -23,9 +23,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #define FUNC_SIZE(X)   .size X,.-X
 
 #define FUNC_BEGIN(X)  \
+   .align 2;   \
.globl X;   \
FUNC_TYPE (X);  \
-X:
+X: \
+   LPAD
 
 #define FUNC_END(X)\
FUNC_SIZE(X)
@@ -39,3 +41,68 @@ X:
 #define HIDDEN_JUMPTARGET(X)   CONCAT1(__hidden_, X)
 #define HIDDEN_DEF(X)  FUNC_ALIAS(HIDDEN_JUMPTARGET(X), X); \
.hidden HIDDEN_JUMPTARGET(X)
+
+/* GNU_PROPERTY_RISCV64_* macros from elf.h for use in asm code.  */
+#define FEATURE_1_AND 0xc0000000
+#define FEATURE_1_FCFI 1
+#define FEATURE_1_BCFI 2
+
+/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
+#if __riscv_xlen == 32
+#  define GNU_PROPERTY(type, value)\
+.section .note.gnu.property, "a";  \
+.p2align 2;\
+.word 4;   \
+  

[PATCH v5 2/4] RISC-V: Add Zicfilp ISA extension.

2025-01-16 Thread Monk Chiang
This patch only supports a landing pad value of 0.
The next version will implement a function-signature-based labeling
scheme.

RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi

gcc/ChangeLog:
* gcc/common/config/riscv/riscv-common.cc: Add ZICFILP ISA
  string.
* gcc/config.gcc: Add riscv-zicfilp.o
* gcc/config/riscv/riscv-passes.def (INSERT_PASS_BEFORE):
  Insert landing pad instructions.
* gcc/config/riscv/riscv-protos.h (make_pass_insert_landing_pad):
  Declare.
* gcc/config/riscv/riscv-zicfilp.cc: New file.
* gcc/config/riscv/riscv.cc
  (riscv_trampoline_init): Add landing pad instructions.
  (riscv_legitimize_call_address): Likewise.
  (riscv_output_mi_thunk): Likewise.
* gcc/config/riscv/riscv.h: Update.
* gcc/config/riscv/riscv.md: Add landing pad patterns.
* gcc/config/riscv/riscv.opt (TARGET_ZICFILP): Define.
* gcc/config/riscv/t-riscv: Add build rule for
  riscv-zicfilp.o

gcc/testsuite/ChangeLog:
* gcc/testsuite/gcc.target/riscv/interrupt-no-lpad.c: New test.
* gcc/testsuite/gcc.target/riscv/zicfilp-call.c: New test.

Co-Developed-by: Greg McGary 
 Kito Cheng  
---
 gcc/common/config/riscv/riscv-common.cc   |   3 +
 gcc/config.gcc|   2 +-
 gcc/config/riscv/riscv-passes.def |   1 +
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-zicfilp.cc | 169 ++
 gcc/config/riscv/riscv.cc | 145 ---
 gcc/config/riscv/riscv.h  |  13 +-
 gcc/config/riscv/riscv.md |  73 +++-
 gcc/config/riscv/riscv.opt|   2 +
 gcc/config/riscv/t-riscv  |   9 +
 .../gcc.target/riscv/interrupt-no-lpad.c  |   7 +
 gcc/testsuite/gcc.target/riscv/zicfilp-call.c |  14 ++
 12 files changed, 402 insertions(+), 37 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-zicfilp.cc
 create mode 100644 gcc/testsuite/gcc.target/riscv/interrupt-no-lpad.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zicfilp-call.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 8e8b6107a6d..5038f0eb959 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -113,6 +113,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
 
   {"zicfiss", "zicsr"},
   {"zicfiss", "zimop"},
+  {"zicfilp", "zicsr"},
 
   {"zk", "zkn"},
   {"zk", "zkr"},
@@ -329,6 +330,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"ziccrse",  ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"zicfiss", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zicfilp", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"zimop", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zcmop", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1653,6 +1655,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   RISCV_EXT_FLAG_ENTRY ("zic64b", x_riscv_zicmo_subext, MASK_ZIC64B),
 
   RISCV_EXT_FLAG_ENTRY ("zicfiss", x_riscv_zi_subext, MASK_ZICFISS),
+  RISCV_EXT_FLAG_ENTRY ("zicfilp", x_riscv_zi_subext, MASK_ZICFILP),
 
   RISCV_EXT_FLAG_ENTRY ("zimop", x_riscv_mop_subext, MASK_ZIMOP),
   RISCV_EXT_FLAG_ENTRY ("zcmop", x_riscv_mop_subext, MASK_ZCMOP),
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 55e37146ee0..87fed823118 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -553,7 +553,7 @@ riscv*)
extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o 
riscv-shorten-memrefs.o riscv-selftests.o riscv-string.o"
extra_objs="${extra_objs} riscv-v.o riscv-vsetvl.o riscv-vector-costs.o 
riscv-avlprop.o"
extra_objs="${extra_objs} riscv-vector-builtins.o 
riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o 
sifive-vector-builtins-bases.o"
-   extra_objs="${extra_objs} thead.o riscv-target-attr.o"
+   extra_objs="${extra_objs} thead.o riscv-target-attr.o riscv-zicfilp.o"
d_target_objs="riscv-d.o"
extra_headers="riscv_vector.h riscv_crypto.h riscv_bitmanip.h 
riscv_th_vector.h riscv_cmo.h"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.cc"
diff --git a/gcc/config/riscv/riscv-passes.def 
b/gcc/config/riscv/riscv-passes.def
index cbea23c8b44..7e6a2a0e53d 100644
--- a/gcc/config/riscv/riscv-passes.def
+++ b/gcc/config/riscv/riscv-passes.def
@@ -20,3 +20,4 @@
 INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
 INSERT_PASS_AFTER (pass_split_all_insns, 1, pass_avlprop);
 INSERT_PASS_BEFORE (pass_fast_rtl_dce, 1, pass_vsetvl);
+INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_landing_pad);
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index dd3b36d47a6..6362380ec6c 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -200,6 +200,7 @@ extern bool riscv_hard_regno_rename_ok (unsigned, unsigne

[PATCH v5 4/4] RISC-V: Add -fcf-protection=[full|branch|return] to enable zicfiss, zicfilp.

2025-01-16 Thread Monk Chiang
gcc/ChangeLog:
* gcc/config/riscv/riscv.cc
  (is_zicfilp_p): New function.
  (is_zicfiss_p): New function.
* gcc/config/riscv/riscv-zicfilp.cc: Update.
* gcc/config/riscv/riscv.h: Update.
* gcc/config/riscv/riscv.md: Update.

gcc/testsuite/ChangeLog:
* gcc/testsuite/c-c++-common/fcf-protection-1.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-2.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-3.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-4.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-5.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-6.c: Update.
* gcc/testsuite/c-c++-common/fcf-protection-7.c: Update.
* gcc/testsuite/gcc.target/riscv/ssp-1.c: Update.
* gcc/testsuite/gcc.target/riscv/ssp-2.c: Update.
* gcc/testsuite/gcc.target/riscv/zicfilp-call.c: Update.
* gcc/testsuite/gcc.target/riscv/interrupt-no-lpad.c: Update.
---
 gcc/config/riscv/riscv-zicfilp.cc |  2 +-
 gcc/config/riscv/riscv.cc | 52 +++
 gcc/config/riscv/riscv.h  |  8 +--
 gcc/config/riscv/riscv.md | 10 ++--
 gcc/testsuite/c-c++-common/fcf-protection-1.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-2.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-3.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-4.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-5.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-6.c |  1 +
 gcc/testsuite/c-c++-common/fcf-protection-7.c |  1 +
 .../gcc.target/riscv/interrupt-no-lpad.c  |  2 +-
 gcc/testsuite/gcc.target/riscv/ssp-1.c|  2 +-
 gcc/testsuite/gcc.target/riscv/ssp-2.c|  2 +-
 gcc/testsuite/gcc.target/riscv/zicfilp-call.c |  2 +-
 15 files changed, 63 insertions(+), 24 deletions(-)
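
The option check added to riscv_override_options_internal in the diff below can be modelled in plain C as follows.  This is a sketch only: check_cf_protection is a hypothetical helper, the two bool parameters stand in for TARGET_ZICFISS and TARGET_ZICFILP, and the enum values are my assumption about GCC's cf_protection_level.

```c
#include <stdbool.h>

/* Bit values modelled on GCC's enum cf_protection_level; treat the
   exact numbers as an assumption.  */
enum cf_level
{
  CF_NONE = 0,
  CF_BRANCH = 1 << 0,
  CF_RETURN = 1 << 1,
  CF_FULL = CF_BRANCH | CF_RETURN,
  CF_SET = 1 << 2
};

/* Validate *LEVEL against the enabled extensions.  Returns the number
   of diagnostics that would be emitted.  CF_SET is or'ed into *LEVEL
   whenever some protection was requested.  */
int
check_cf_protection (unsigned *level, bool have_zicfiss, bool have_zicfilp)
{
  int errors = 0;

  if (*level == CF_NONE)
    return 0;

  /* Return protection needs the shadow stack extension.  */
  if ((*level & CF_RETURN) == CF_RETURN && !have_zicfiss)
    errors++;

  /* Branch protection needs the landing pad extension.  */
  if ((*level & CF_BRANCH) == CF_BRANCH && !have_zicfilp)
    errors++;

  *level |= CF_SET;
  return errors;
}
```

With this model, -fcf-protection=full on a target with only zicfiss produces one diagnostic, while -fcf-protection=return on the same target produces none.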

diff --git a/gcc/config/riscv/riscv-zicfilp.cc 
b/gcc/config/riscv/riscv-zicfilp.cc
index 42b129920b3..834d6e5c778 100644
--- a/gcc/config/riscv/riscv-zicfilp.cc
+++ b/gcc/config/riscv/riscv-zicfilp.cc
@@ -150,7 +150,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return TARGET_ZICFILP;
+  return is_zicfilp_p ();
 }
 
   virtual unsigned int execute (function *)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index cb448aba9c0..8a4e23851d7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6682,7 +6682,7 @@ riscv_legitimize_call_address (rtx addr)
   rtx reg = RISCV_CALL_ADDRESS_TEMP (Pmode);
   riscv_emit_move (reg, addr);
 
-  if (TARGET_ZICFILP)
+  if (is_zicfilp_p ())
{
  rtx sw_guarded = RISCV_CALL_ADDRESS_LPAD (Pmode);
  emit_insn (gen_set_guarded (Pmode, reg));
@@ -6692,7 +6692,7 @@ riscv_legitimize_call_address (rtx addr)
   return reg;
 }
 
-  if (TARGET_ZICFILP && REG_P (addr))
+  if (is_zicfilp_p () && REG_P (addr))
 emit_insn (gen_set_lpl (Pmode, const0_rtx));
 
   return addr;
@@ -7508,7 +7508,7 @@ riscv_save_reg_p (unsigned int regno)
   if (regno == GP_REGNUM || regno == THREAD_POINTER_REGNUM)
return false;
 
-  if (regno == RETURN_ADDR_REGNUM && TARGET_ZICFISS)
+  if (regno == RETURN_ADDR_REGNUM && is_zicfiss_p ())
return true;
 
   /* We must save every register used in this function.  If this is not a
@@ -10341,10 +10341,10 @@ riscv_file_end ()
   long GNU_PROPERTY_RISCV_FEATURE_1_AND  = 0;
   unsigned long feature_1_and = 0;
 
-  if (TARGET_ZICFISS)
+  if (is_zicfilp_p ())
 feature_1_and |= 0x1 << 0;
 
-  if (TARGET_ZICFILP)
+  if (is_zicfiss_p ())
 feature_1_and |= 0x1 << 1;
 
   if (feature_1_and)
@@ -10404,7 +10404,7 @@ riscv_output_mi_thunk (FILE *file, tree thunk_fndecl 
ATTRIBUTE_UNUSED,
   /* Mark the end of the (empty) prologue.  */
   emit_note (NOTE_INSN_PROLOGUE_END);
 
-  if (TARGET_ZICFILP)
+  if (is_zicfilp_p ())
 emit_insn(gen_lpad (const0_rtx));
 
   /* Determine if we can use a sibcall to call FUNCTION directly.  */
@@ -10631,6 +10631,20 @@ riscv_override_options_internal (struct gcc_options 
*opts)
 
   /* Convert -march and -mrvv-vector-bits to a chunks count.  */
   riscv_vector_chunks = riscv_convert_vector_chunks (opts);
+
+  if (opts->x_flag_cf_protection != CF_NONE)
+{
+  if ((opts->x_flag_cf_protection & CF_RETURN) == CF_RETURN
+ && !TARGET_ZICFISS)
+   error ("%<-fcf-protection%> is not compatible with this target");
+
+  if ((opts->x_flag_cf_protection & CF_BRANCH) == CF_BRANCH
+ && !TARGET_ZICFILP)
+   error ("%<-fcf-protection%> is not compatible with this target");
+
+  opts->x_flag_cf_protection
+  = (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
+}
 }
 
 /* Implement TARGET_OPTION_OVERRIDE.  */
@@ -10925,7 +10939,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx 
chain_value)
 
   /* Work out the offsets of the pointers from the start of th

Re: [PATCH v3] AArch64: Add LUTI ACLE for SVE2

2025-01-16 Thread Richard Sandiford
Saurabh Jha  writes:
> On 1/8/2025 11:13 AM, Richard Sandiford wrote:
>>  writes:
>>> [...]
>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def 
>>> b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
>>> index e726fa1fb68..0c4f8251ac0 100644
>>> --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
>>> @@ -164,6 +164,10 @@ DEF_SVE_FUNCTION (svwhilegt, compare_scalar, while, 
>>> none)
>>>   DEF_SVE_FUNCTION (svwhilerw, compare_ptr, all_data, none)
>>>   DEF_SVE_FUNCTION (svwhilewr, compare_ptr, all_data, none)
>>>   DEF_SVE_FUNCTION (svxar, ternary_shift_right_imm, all_integer, none)
>>> +DEF_SVE_FUNCTION (svluti2_lane, luti2, bhs_data, none)
>>> +DEF_SVE_FUNCTION (svluti4_lane, luti4, bhs_data, none)
>>> +DEF_SVE_FUNCTION_GS (svluti4_lane, luti4, bhs_data, x2, none)
>> 
>> bhs_data looks wrong: there should be no .s versions.  Similarly...
>> 
>>> +
>>>   #undef REQUIRED_EXTENSIONS
>>>   
>>>   #define REQUIRED_EXTENSIONS nonstreaming_sve (AARCH64_FL_SVE2)
>>> [...]
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index f8cfe08f4c0..7dcbc0700da 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -133,6 +133,7 @@
>>>   ;;  Optional AES extensions
>>>   ;;  Optional SHA-3 extensions
>>>   ;;  Optional SM4 extensions
>>> +;;  Table lookup
>>>   
>>>   ;; 
>>> =
>>>   ;; == Moves
>>> @@ -4211,3 +4212,47 @@
>>> "sm4ekey\t%0.s, %1.s, %2.s"
>>> [(set_attr "type" "crypto_sm4")]
>>>   )
>>> +
>>> +;; 
>>> -
>>> +;;  Table lookup
>>> +;; 
>>> -
>>> +;; Includes:
>>> +;; - LUTI2
>>> +;; - LUTI4
>>> +;; 
>>> -
>>> +
>>> +(define_insn "@aarch64_sve_luti"
>>> +  [(set (match_operand:SVE_FULL_BS 0 "register_operand" "=w")
>>> +   (unspec:SVE_FULL_BS
>>> + [(match_operand:SVE_FULL_BS 1 "register_operand" "w")
>>> +  (match_operand:VNx16QI 2 "register_operand" "w")
>>> +  (match_operand:DI 3 "const_int_operand")
>>> +  (const_int LUTI_BITS)]
>>> + UNSPEC_SVE_LUTI))]
>>> +  "TARGET_SVE2"
>>> +  "luti\t%0., { %1. }, %2[%3]"
>>> +)
>>> +
>>> +(define_insn "@aarch64_sve_luti"
>>> +  [(set (match_operand: 0 "register_operand" "=w")
>>> +   (unspec:
>>> +   [(match_operand:SVE_FULL_H 1 "aligned_register_operand" "w")
>>> +   (match_operand:VNx16QI 2 "register_operand" "w")
>>> +   (match_operand:DI 3 "const_int_operand")
>>> +   (const_int LUTI_BITS)]
>>> +   UNSPEC_SVE_LUTI))]
>>> +  "TARGET_SVE2"
>>> +  "luti\t%0., { %1. }, %2[%3]"
>>> +)
>> 
>> ...there should be .S (VNx4) variants here.  Also, the .H variants
>> don't require an aligned register operand.  (FWIW, using "w" with
>> "aligned_register_operand" is wrong in any case, since "w" accepts
>> unaligned registers.)
>
> You meant "..x2" here, right?

I did mean VNx4 (the mode prefix for full .S vectors, as in VNx4SI).
But I also meant "there should be *no* .S" -- sorry about that.

Richard


Re: [PATCH v3 0/4] Hard Register Constraints

2025-01-16 Thread Stefan Schulze Frielinghaus
On Wed, Jan 15, 2025 at 10:29:03PM -0700, Jeff Law wrote:
> 
> 
> On 11/29/24 2:15 AM, Stefan Schulze Frielinghaus wrote:
> > Ping.
> > 
> > On Fri, Oct 25, 2024 at 11:57:16AM +0200, Stefan Schulze Frielinghaus wrote:
> > > This is a follow-up to
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663238.html
> > > 
> > > The primary changes are about error handling and documentation updates.
> > > Now, we error out whenever a hard register constraint is used more than
> > > once across an alternative for outputs or inputs.  For example, the
> > > following is allowed for register asm
> > > 
> > >register int y __asm__ ("0") = x;
> > >__asm__ ("" : "=r" (y) : "0" (y), "r" (y));
> > > 
> > > and the analogue for hard register constraints
> > > 
> > >int y = x;
> > >__asm__ ("" : "={0}" (y) : "0" (y), "{0}" (y));  // invalid
> > > 
> > > is rejected.
> > > 
> > > Furthermore, for hard register constraints we fail if an output object
> > > is used more than once as e.g.
> > > 
> > >int x;
> > >asm ("" : "=r" (x), "={1}" (x));  // rejected
> > > 
> > > although
> > > 
> > >int x;
> > >asm ("" : "=r" (x), "=r" (x));
> > > 
> > > is accepted.
> > > 
> > > Thus, in total the changes make hard register constraints more strict in
> > > order to prevent subtle bugs.
> So this really should have gotten more attention many months ago.
> 
> Conceptually I see the value in being able to specify a
> specific register in an asm.  The single register class constraints found on
> x86 have effectively given that port that capability, but others which have
> truly general purpose register files don't have a good way to do this stuff.
> 
> I think we should look to try and move this forward early in the gcc-16
> cycle.  You're definitely going to need to update the manual for the new
> capability.

I added some documentation to gcc/doc/extend.texi.  Is there some
other place I should document this?

> 
> I wouldn't be surprised if this doesn't work on reload targets.  While I
> think we *can* make it work as the right infrastructure is largely in place,
> I don't think it's worth the time as reload should be on the chopping block
> in a few months.  So I think focusing on LRA only is sensible.
> 
> I'm not aware of any reasonable way to easily solve the fixed register
> problem.  Though conceptually you could defer when fixed registers are
> pruned from the register classes.  Vlad might have better ideas in that
> space.
> 
> Do we detect conflicts between a hard register constraint and another
> constraint which requires a singleton class?  That's going to be an error I
> suspect, but curious if it's handled.

That is a good point.  Currently I suspect no.  I will have a look.

> 
> Anyway, I'm sure we'll have other details to hash through.  Mostly I wanted
> to signal that I can see the value in what you're doing and that we should
> be looking to move forward with it during the gcc-16 cycle.

Sounds good; thanks for letting me know.

Cheers,
Stefan


Re: [EXTERNAL] Re: [PATCH] Fix setting of call graph node AutoFDO count [PR116743]

2025-01-16 Thread Richard Biener
On Thu, Jan 16, 2025 at 3:17 AM Eugene Rozenfeld
 wrote:
>
> I committed the patch to trunk. Is it ok to backport to gcc-12, gcc-13, and 
> gcc-14?

Yes.

> -Original Message-
> From: Richard Biener 
> Sent: Monday, January 13, 2025 11:22 PM
> To: Eugene Rozenfeld 
> Cc: GCC-Patches-ML ; Jan Hubicka ; 
> rvmal...@amazon.com
> Subject: [EXTERNAL] Re: [PATCH] Fix setting of call graph node AutoFDO count 
> [PR116743]
>
> On Mon, Jan 13, 2025 at 10:47 PM Eugene Rozenfeld 
>  wrote:
> >
> > We are initializing both the call graph node count and
> > the entry block count of the function with the head_count value
> > from the profile.
> >
> > Count propagation algorithm may refine the entry block count
> > and we may end up with a case where the call graph node count
> > is set to 0 but the entry block count is non-zero. That becomes
> > a problem because we have this code in execute_fixup_cfg:
> >
> > profile_count num = node->count;
> > profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
> > bool scale = num.initialized_p () && !(num == den);
> >
> > Here if num is 0 but den is not 0, scale becomes true and we
> > lose the counts in
> >
> > if (scale)
> >   bb->count = bb->count.apply_scale (num, den);
> >
> > This is what happened in the issue reported in PR116743
> > (a 10% regression in MySQL HAMMERDB tests).
> > 3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
> > AutoFDO count propagation, which caused the mismatch between
> > the call graph node count (zero) and the entry block count (non-zero)
> > and subsequent loss of counts as described above.
> >
> > The fix is to update the call graph node count once we've done count
> > propagation.
> >
> > Tested on x86_64-pc-linux-gnu.
>
> OK.
>
> Thanks,
> Richard.
>
> >
> >
> > gcc/ChangeLog:
> >
> > PR gcov-profile/116743
> > * auto-profile.cc (afdo_annotate_cfg): Fix mismatch
> > between the call graph node count
> > and the entry block count.
> > ---
> > gcc/auto-profile.cc | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >
> > diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
> > index 5d0e8afb9a1..aa4d1634f01 100644
> > --- a/gcc/auto-profile.cc
> > +++ b/gcc/auto-profile.cc
> > @@ -1538,8 +1538,6 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
> >if (s == NULL)
> >  return;
> > -  cgraph_node::get (current_function_decl)->count
> > - = profile_count::from_gcov_type (s->head_count ()).afdo ();
> >ENTRY_BLOCK_PTR_FOR_FN (cfun)->count
> >   = profile_count::from_gcov_type (s->head_count ()).afdo ();
> >EXIT_BLOCK_PTR_FOR_FN (cfun)->count = profile_count::zero ().afdo ();
> > @@ -1578,6 +1576,8 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
> >/* Calculate, propagate count and probability information on CFG.  */
> >afdo_calculate_branch_prob (&annotated_bb);
> >  }
> > +  cgraph_node::get(current_function_decl)->count
> > +  = ENTRY_BLOCK_PTR_FOR_FN(cfun)->count;
> >update_max_bb_count ();
> >profile_status_for_fn (cfun) = PROFILE_READ;
> >if (flag_value_profile_transformations)
> > --
> > 2.34.1
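
For readers skimming the archive, the failure mode described in the quoted
message can be reproduced with a toy model. This is only a sketch under
stated assumptions: toy_count, needs_scale, apply_scale and fixup_counts
are hypothetical stand-ins written for illustration, not GCC's
profile_count API, and the scaling is plain integer arithmetic.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GCC's profile_count: a value plus an
// "initialized" flag, just enough to mirror the quoted comparison.
struct toy_count
{
  std::uint64_t value;
  bool initialized;
};

// Same shape as the quoted check in execute_fixup_cfg: scale whenever the
// node count is initialized and differs from the entry block count.
inline bool
needs_scale (toy_count num, toy_count den)
{
  return num.initialized && num.value != den.value;
}

// Simplified scaling by num/den (assumption: a zero denominator leaves the
// count untouched in this toy model).
inline toy_count
apply_scale (toy_count c, toy_count num, toy_count den)
{
  if (den.value == 0)
    return c;
  return toy_count{c.value * num.value / den.value, c.initialized};
}

// Rescale every block count, as the fixup loop in the message does.
inline std::vector<toy_count>
fixup_counts (std::vector<toy_count> bbs, toy_count node, toy_count entry)
{
  if (needs_scale (node, entry))
    for (toy_count &bb : bbs)
      bb = apply_scale (bb, node, entry);
  return bbs;
}
```

With a node count of 0 and an entry count of 100, every block count is
multiplied by 0/100 and comes back as zero; once the node count is copied
from the entry block count, as the patch does, no scaling happens.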


[committed] libstdc++: Check feature test macro for associative container node extraction

2025-01-16 Thread Jonathan Wakely
Replace some `__cplusplus > 201402L` preprocessor checks with more
expressive checks for the appropriate feature test macro.

libstdc++-v3/ChangeLog:

* include/bits/stl_map.h: Check __glibcxx_node_extract instead
of __cplusplus.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_tree.h: Likewise.
---

Tested x86_64-linux. Pushed to trunk.
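
The same idea applies on the user side, where the standard macro
__cpp_lib_node_extract plays the role that the internal
__glibcxx_node_extract plays inside the headers. A minimal sketch, assuming
a hypothetical helper move_entry that is not part of libstdc++:

```cpp
#include <map>
#include <string>
#include <utility>

// Move the element with key K from FROM to TO.  Returns false if FROM has
// no such key.  (Sketch only: a key already present in TO is not handled.)
inline bool
move_entry (std::map<int, std::string> &from,
	    std::map<int, std::string> &to, int k)
{
#ifdef __cpp_lib_node_extract // node handles are available
  auto nh = from.extract (k);
  if (nh.empty ())
    return false;
  to.insert (std::move (nh));
  return true;
#else // fall back to copy-and-erase
  std::map<int, std::string>::iterator it = from.find (k);
  if (it == from.end ())
    return false;
  to.insert (*it);
  from.erase (it);
  return true;
#endif
}
```

Testing the feature macro rather than __cplusplus keeps the condition tied
to the facility actually being used.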

 libstdc++-v3/include/bits/stl_map.h  | 4 ++--
 libstdc++-v3/include/bits/stl_multimap.h | 4 ++--
 libstdc++-v3/include/bits/stl_multiset.h | 4 ++--
 libstdc++-v3/include/bits/stl_set.h  | 4 ++--
 libstdc++-v3/include/bits/stl_tree.h | 8 
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_map.h 
b/libstdc++-v3/include/bits/stl_map.h
index c1c0f7577bbd..d2d0b524cceb 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -180,7 +180,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   typedef typename _Rep_type::reverse_iterator  reverse_iterator;
   typedef typename _Rep_type::const_reverse_iterator 
const_reverse_iterator;
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   using node_type = typename _Rep_type::node_type;
   using insert_return_type = typename _Rep_type::insert_return_type;
 #endif
@@ -642,7 +642,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
}
 #endif
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   /// Extract a node.
   node_type
   extract(const_iterator __pos)
diff --git a/libstdc++-v3/include/bits/stl_multimap.h 
b/libstdc++-v3/include/bits/stl_multimap.h
index 426db214c045..661d870fd01f 100644
--- a/libstdc++-v3/include/bits/stl_multimap.h
+++ b/libstdc++-v3/include/bits/stl_multimap.h
@@ -171,7 +171,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   typedef typename _Rep_type::reverse_iterator  reverse_iterator;
   typedef typename _Rep_type::const_reverse_iterator 
const_reverse_iterator;
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   using node_type = typename _Rep_type::node_type;
 #endif
 
@@ -632,7 +632,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { this->insert(__l.begin(), __l.end()); }
 #endif
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   /// Extract a node.
   node_type
   extract(const_iterator __pos)
diff --git a/libstdc++-v3/include/bits/stl_multiset.h 
b/libstdc++-v3/include/bits/stl_multiset.h
index 6c70e12884a8..57caf6e8cc41 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -151,7 +151,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   typedef typename _Rep_type::size_type size_type;
   typedef typename _Rep_type::difference_type   difference_type;
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   using node_type = typename _Rep_type::node_type;
 #endif
 
@@ -566,7 +566,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { this->insert(__l.begin(), __l.end()); }
 #endif
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   /// Extract a node.
   node_type
   extract(const_iterator __pos)
diff --git a/libstdc++-v3/include/bits/stl_set.h 
b/libstdc++-v3/include/bits/stl_set.h
index 975ac521d9a1..f32323db3687 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -154,7 +154,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   typedef typename _Rep_type::difference_type   difference_type;
   ///@}
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   using node_type = typename _Rep_type::node_type;
   using insert_return_type = typename _Rep_type::insert_return_type;
 #endif
@@ -581,7 +581,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { this->insert(__l.begin(), __l.end()); }
 #endif
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   /// Extract a node.
   node_type
   extract(const_iterator __pos)
diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h
index ae9b4469d88c..7285676f427f 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -71,7 +71,7 @@
 #if __cplusplus >= 201103L
 # include 
 #endif
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
 # include 
 #endif
 
@@ -1010,7 +1010,7 @@ namespace __rb_tree
 #endif
 } // namespace __rb_tree
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   template
 struct _Rb_tree_merge_helper { };
 #endif
@@ -1451,7 +1451,7 @@ namespace __rb_tree
   typedef std::reverse_iterator   reverse_iterator;
   typedef std::reverse_iterator const_reverse_iterator;
 
-#if __cplusplus > 201402L
+#ifdef __glibcxx_node_extract // >= C++17
   using node_type 

[committed] libstdc++: Implement LWG 2937 for std::filesystem::equivalent [PR118158]

2025-01-16 Thread Jonathan Wakely
Do not report an error for (is_other(s1) && is_other(s2)) as the
standard originally said, nor for (is_other(s1) || is_other(s2)) as
libstdc++ was doing. We can compare inode numbers for special files and
so give sensible answers.

libstdc++-v3/ChangeLog:

PR libstdc++/118158
* src/c++17/fs_ops.cc (fs::equivalent): Remove error reporting
for is_other(s1) && is_other(s2) case, as per LWG 2937.
* testsuite/27_io/filesystem/operations/pr118158.cc: New test.
---

Tested x86_64-linux. Pushed to trunk.

I'm expecting the new test's use of mkfifo to fail on some targets, e.g.
maybe rtems. We can either tweak the #if or add target selectors to
exclude those targets when we discover which ones FAIL.
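
For reference, the decision order after this change can be modelled as a
small pure function. This is a sketch, not the library code: equiv_outcome
and decide_equivalent are hypothetical names, and the same_file parameter
stands in for the result of the device and inode comparison that the real
code obtains from fs::equiv_files.

```cpp
#include <cerrno>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result type: the return value plus the reported error.
struct equiv_outcome
{
  bool result;
  std::error_code ec;
};

// ERR is the errno value from a failed stat (0 if none), S1 and S2 are the
// two file statuses, SAME_FILE is the assumed outcome of comparing device
// and inode numbers.
inline equiv_outcome
decide_equivalent (int err, fs::file_status s1, fs::file_status s2,
		   bool same_file)
{
  equiv_outcome out{false, std::error_code ()};
  if (err)
    out.ec.assign (err, std::generic_category ());
  else if (!fs::exists (s1) || !fs::exists (s2))
    out.ec = std::make_error_code (std::errc::no_such_file_or_directory);
  else if (s1.type () == s2.type ())
    out.result = same_file;	// special files are compared like any other
  return out;
}
```

Two FIFOs with matching identity are reported as equivalent, a FIFO and a
regular file are simply not equivalent, and neither case sets an error.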

 libstdc++-v3/src/c++17/fs_ops.cc  | 22 +++
 .../27_io/filesystem/operations/pr118158.cc   | 62 +++
 2 files changed, 69 insertions(+), 15 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 1d75f24f78e8..4f188153ae3a 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -914,24 +914,16 @@ fs::equivalent(const path& p1, const path& p2, 
error_code& ec) noexcept
   else
 err = errno;
 
-  if (exists(s1) && exists(s2))
-{
-  if (is_other(s1) && is_other(s2))
-   {
- ec = std::__unsupported();
- return false;
-   }
-  ec.clear();
-  if (is_other(s1) || is_other(s2))
-   return false;
-  return fs::equiv_files(p1.c_str(), st1, p2.c_str(), st2, ec);
-}
+  if (err)
+ec.assign(err, std::generic_category());
   else if (!exists(s1) || !exists(s2))
 ec = std::make_error_code(std::errc::no_such_file_or_directory);
-  else if (err)
-ec.assign(err, std::generic_category());
   else
-ec.clear();
+{
+  ec.clear();
+  if (s1.type() == s2.type())
+   return fs::equiv_files(p1.c_str(), st1, p2.c_str(), st2, ec);
+}
   return false;
 #else
   ec = std::make_error_code(std::errc::function_not_supported);
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc
new file mode 100644
index ..b57a2d184f41
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/pr118158.cc
@@ -0,0 +1,62 @@
+// { dg-do run { target c++17 } }
+// { dg-require-filesystem-ts "" }
+
+#include 
+#include 
+#include 
+
+#if defined(_GLIBCXX_HAVE_SYS_STAT_H) && defined(_GLIBCXX_HAVE_SYS_TYPES_H)
+# include 
+# include   // mkfifo
+#endif
+
+namespace fs = std::filesystem;
+
+void
+test_pr118158()
+{
+#if defined(_GLIBCXX_HAVE_SYS_STAT_H) && defined(_GLIBCXX_HAVE_SYS_TYPES_H) \
+  && defined(S_IWUSR) && defined(S_IRUSR)
+  auto p1 = __gnu_test::nonexistent_path();
+  auto p2 = __gnu_test::nonexistent_path();
+  auto p3 = __gnu_test::nonexistent_path();
+  const std::error_code bad_ec = make_error_code(std::errc::invalid_argument);
+  std::error_code ec;
+  bool result;
+
+  VERIFY( ! ::mkfifo(p1.c_str(), S_IWUSR | S_IRUSR) );
+  __gnu_test::scoped_file f1(p1, __gnu_test::scoped_file::adopt_file);
+
+  // Special file is equivalent to itself.
+  VERIFY( equivalent(p1, p1) );
+  VERIFY( equivalent(p1, p1, ec) );
+  VERIFY( ! ec );
+
+  VERIFY( ! ::mkfifo(p2.c_str(), S_IWUSR | S_IRUSR) );
+  __gnu_test::scoped_file f2(p2, __gnu_test::scoped_file::adopt_file);
+
+  ec = bad_ec;
+  // Distinct special files are not equivalent.
+  VERIFY( ! equivalent(p1, p2, ec) );
+  VERIFY( ! ec );
+
+  // Non-existent paths are always an error.
+  VERIFY( ! equivalent(p1, p3, ec) );
+  VERIFY( ec == std::errc::no_such_file_or_directory );
+  ec = bad_ec;
+  VERIFY( ! equivalent(p3, p2, ec) );
+  VERIFY( ec == std::errc::no_such_file_or_directory );
+
+  // Special file is not equivalent to regular file.
+  __gnu_test::scoped_file f3(p3);
+  ec = bad_ec;
+  VERIFY( ! equivalent(p1, p3, ec) );
+  VERIFY( ! ec );
+#endif
+}
+
+int
+main()
+{
+  test_pr118158();
+}
-- 
2.47.1



[PATCH] libstdc++: fix possible undefined std::timespec in module std

2025-01-16 Thread yxj-github-437
I noticed that std::timespec and std::timespec_get are both guarded by the
preprocessor condition _GLIBCXX_HAVE_TIMESPEC_GET, so the export in module
std should be guarded the same way.

libstdc++-v3/ChangeLog:

* src/c++23/std-clib.cc.in: Move std::timespec into the preprocessor
condition _GLIBCXX_HAVE_TIMESPEC_GET.
---
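
For illustration, user code that wants to tolerate a missing
std::timespec_get can guard on the same condition. A minimal sketch,
assuming the declarations are only visible for C++17 and later when
_GLIBCXX_HAVE_TIMESPEC_GET is defined; current_seconds is a hypothetical
helper, not part of the library:

```cpp
#include <ctime>

// Return the current calendar time in seconds, preferring
// std::timespec_get where the library declares it.
inline std::time_t
current_seconds ()
{
#if __cplusplus >= 201703L && defined(_GLIBCXX_HAVE_TIMESPEC_GET)
  std::timespec ts;
  if (std::timespec_get (&ts, TIME_UTC) == TIME_UTC)
    return ts.tv_sec;
#endif
  // Fallback when timespec_get is unavailable or failed.
  return std::time (nullptr);
}
```
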
 libstdc++-v3/src/c++23/std-clib.cc.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/src/c++23/std-clib.cc.in 
b/libstdc++-v3/src/c++23/std-clib.cc.in
index 6809ad229d7..10fc03e7ce0 100644
--- a/libstdc++-v3/src/c++23/std-clib.cc.in
+++ b/libstdc++-v3/src/c++23/std-clib.cc.in
@@ -585,9 +585,9 @@ export C_LIB_NAMESPACE
   using std::strftime;
   using std::time;
   using std::time_t;
-  using std::timespec;
   using std::tm;
 #ifdef _GLIBCXX_HAVE_TIMESPEC_GET
+  using std::timespec;
   using std::timespec_get;
 #endif
 }
-- 
2.43.0



Re: [PATCH] c++: Make sure fold_sizeof_expr returns the correct type [PR117775]

2025-01-16 Thread Jason Merrill

On 1/16/25 10:25 AM, Simon Martin wrote:

On 16 Jan 2025, at 16:05, Jason Merrill wrote:


On 1/16/25 7:19 AM, Simon Martin wrote:

Hi Jakub, Jason,

On 15 Jan 2025, at 22:55, Jakub Jelinek wrote:


On Wed, Jan 15, 2025 at 04:48:59PM -0500, Jason Merrill wrote:

--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11686,6 +11686,7 @@ fold_sizeof_expr (tree t)
false, false);
  if (r == error_mark_node)
r = size_one_node;
+  r = cp_fold_convert (TREE_TYPE (t), r);


Instead of adding this conversion in all cases, let's change
size_one_node
to

build_int_cst (size_type_node, 1)

That would need to be r = build_int_cst (TREE_TYPE (t), 1);

Jason is right: my patch does not need to do TREE_TYPE (t), and can
simply use size_type_node - oversight on my part.


I guess, while that is maybe fine, I don't see how it could avoid
the cp_fold_convert call, because size_one_node can be returned
also from e.g. c-family c_sizeof_or_alignof_type or its typeck.cc
callers,
or it can be size_int (something) etc.

I took another look at the code paths in fold_sizeof_expr and I
believe
that we’re “good” in all of them except if we hit
typeck.cc:2077
(I have not been able so far to craft a test that does), because the
code either builds a node with size_type_node as type, or goes
through
c_sizeof_or_alignof_type that fold_convert’s everything (except
error_mark_node) to size_type_node (in c-common.cc:4028).

I can run a regression test round with a patch that uses
“build_int_cst (size_type_node, 1)” in decl.cc:11930 and
typeck.cc:2077, which should cover all the *current* cases for
fold_sizeof_expr. However, the initial patch has the advantage that:
- It’s consistent with what c_sizeof_or_alignof_type does
- It will cover the (unlikely?) possibility that some new code
path is
added some day that does not use the right type
- It adds no cost to nominal cases, since cp_fold_convert does
nothing
if we already have the right type

Jason, would you still like me to test and submit a patch that uses
build_int_cst in the two places identified above instead of doing
cp_fold_convert (size_type_node, r) in decl.cc:11930?


No need; your original patch is OK, thanks.

Thanks Jason, I will merge it momentarily. Since it’s a regression
from GCC 12, is it OK for branches as well?


Yes.

Jason



Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Richard Sandiford
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Thursday, January 16, 2025 7:11 AM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; ktkac...@gcc.gnu.org
>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
>> for unknown non-homogenous systems [PR113257]
>> 
>> Richard Sandiford  writes:
>> > Tamar Christina  writes:
>> >> Ok for master? and how do you feel about a backport for the two patches to
>> help
>> >> distros?
>> >
>> > Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
>> > since I think we should be extra cautious with the "most stable" branch,
>> > but let's see what others think.
>> >
>> > OK for trunk, and for GCC 14 & 13 after a grace period, with one
>> > trivial nit below:
>> 
>> Sorry, was concentrating too much on the -mcpu vs. -march preemption
>> thing and forgot to think about other aspects of the patch.  The routine
>> is used for all three of -march=native, -mcpu=native, and -mtune=native,
>> so I think we want something like the following on top of your patch
>> (untested so far).
>> 
>
> Cool, how's this one?
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master? and for backport to GCC 13 and 14?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/113257
>   * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
>   (host_detect_local_cpu): Use it.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113257
>   * gcc.target/aarch64/cpunative/info_34: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>   * gcc.target/aarch64/cpunative/info_35: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.
>
> Co-authored-by: Richard Sandiford 
>
> -- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> b/gcc/config/aarch64/driver-aarch64.cc
> index 
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d
>  100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -60,6 +60,7 @@ struct aarch64_core_data
>  #define ALL_VARIANTS ((unsigned)-1)
>  /* Default architecture to use if -mcpu=native did not detect a known CPU.  
> */
>  #define DEFAULT_ARCH "8A"
> +#define DEFAULT_CPU "generic-armv8-a"
>  
>  #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, 
> PART, VARIANT) \
>{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT },
> @@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
>return NULL;
>  }
>  
> +/* Return an aarch64_core_data for the cpu described
> +   by ID, or NULL if ID describes something we don't know about.  */
> +
> +static const aarch64_core_data *
> +get_cpu_from_id (const char* name)
> +{
> +  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
> +{
> +  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
> + return &aarch64_cpu_data[i];
> +}

Redundant braces, the convention says:

  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
    if (strcmp (name, aarch64_cpu_data[i].name) == 0)
      return &aarch64_cpu_data[i];

OK with that change, thanks, and sorry for back-tracking on the
original ack.

Richard
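
As a standalone illustration of the lookup-with-fallback pattern discussed
here, a minimal sketch follows. The table contents, core_entry, find_core
and flags_for are all hypothetical; only the shape (a NULL-terminated table
searched by name, with a default entry used when detection fails) is taken
from the patch.

```cpp
#include <cstring>

// Hypothetical table entry; the real aarch64_core_data has more fields.
struct core_entry
{
  const char *name;
  unsigned flags;
};

// Made-up table, terminated by a null name like aarch64_cpu_data.
static const core_entry core_table[] = {
  { "generic-armv8-a", 0x1 },
  { "example-core", 0x3 },
  { nullptr, 0 }
};

// Return the entry named NAME, or nullptr if it is not in the table.
inline const core_entry *
find_core (const char *name)
{
  for (unsigned i = 0; core_table[i].name != nullptr; i++)
    if (std::strcmp (name, core_table[i].name) == 0)
      return &core_table[i];
  return nullptr;
}

// Flags for NAME, falling back to the default entry for unknown names.
inline unsigned
flags_for (const char *name)
{
  const core_entry *e = find_core (name);
  if (!e)
    e = find_core ("generic-armv8-a");
  return e->flags;
}
```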

> +
> +  return NULL;
> +}
> +
>  /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
> For an example CORE={0xd08, 0xd03} and
> BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
> @@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv)
>  || variants[0] == aarch64_cpu_data[i].variant))
> break;
>  
> -  if (aarch64_cpu_data[i].name == NULL)
> +  if (arch)
>   {
> -   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> -
> -   gcc_assert (arch_info);
> -
> -   res = concat ("-march=", arch_info->name, NULL);
> -   default_flags = arch_info->flags;
> - }
> -  else if (arch)
> - {
> -   const char *arch_id = aarch64_cpu_data[i].arch;
> +   const char *arch_id = (aarch64_cpu_data[i].name
> +  ? aarch64_cpu_data[i].arch
> +  : DEFAULT_ARCH);
> auto arch_info = get_arch_from_id (arch_id);
>  
> /* We got some arch indentifier that's not in aarch64-arches.def?  */
> @@ -424,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv)
> res = concat ("-march=", arch_info->name, NULL);
> default_flags = arch_info->flags;
>   }
> -  else
> +  else if (cpu || aarch64_cpu_data[i].name)
>   {
> -   default_flags = aarch64_cpu_data[i].flags;
> +   auto cpu_info = (aarch64_cpu_data[i].name
> +? &aarch64_cpu_data[i]
> +: get_cpu_from_id (DEFAULT_CPU));
> +   default_flags = cpu_info->flags;
> res = concat ("-m",
>   cpu ? "cpu" : "tune", "

Re: [PATCH] c++: Change c++2b and gnu++2b to c++23 and gnu++23 in C++ diagnostics

2025-01-16 Thread Jason Merrill

On 1/15/25 4:31 PM, Jakub Jelinek wrote:

Hi!

This is something we should have done when -std=c++23 was made the
primary option and -std=c++2b turned into undocumented alias.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2025-01-15  Jakub Jelinek  

gcc/cp/
* parser.cc (cp_parser_lambda_declarator_opt,
cp_parser_statement, cp_parser_selection_statement,
cp_parser_jump_statement): Use -std=c++23 and -std=gnu++23
in diagnostics rather than -std=c++2b and -std=gnu++2b.
* semantics.cc (finish_compound_literal): Likewise.
* typeck2.cc (build_functional_cast_1): Likewise.
* decl.cc (start_decl): Likewise.
* constexpr.cc (ensure_literal_type_for_constexpr_object,
potential_constant_expression_1): Likewise.
gcc/c-family/
* c-lex.cc (interpret_float): Use -std=c++23 and -std=gnu++23
in diagnostics rather than -std=c++2b and -std=gnu++2b.

--- gcc/cp/parser.cc.jj 2025-01-15 08:43:39.070926235 +0100
+++ gcc/cp/parser.cc2025-01-15 12:16:00.376964834 +0100
@@ -12165,8 +12165,8 @@ cp_parser_lambda_declarator_opt (cp_pars
  {
pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
   "parameter declaration before lambda declaration "
-  "specifiers only optional with %<-std=c++2b%> or "
-  "%<-std=gnu++2b%>");
+  "specifiers only optional with %<-std=c++23%> or "
+  "%<-std=gnu++23%>");
omitted_parms_loc = UNKNOWN_LOCATION;
  }
/* Peek at the params, see if we have an xobj parameter.  */
@@ -12262,8 +12262,8 @@ cp_parser_lambda_declarator_opt (cp_pars
  {
pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
   "parameter declaration before lambda transaction "
-  "qualifier only optional with %<-std=c++2b%> or "
-  "%<-std=gnu++2b%>");
+  "qualifier only optional with %<-std=c++23%> or "
+  "%<-std=gnu++23%>");
omitted_parms_loc = UNKNOWN_LOCATION;
  }
  
@@ -12275,8 +12275,8 @@ cp_parser_lambda_declarator_opt (cp_pars

  {
pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
   "parameter declaration before lambda exception "
-  "specification only optional with %<-std=c++2b%> or "
-  "%<-std=gnu++2b%>");
+  "specification only optional with %<-std=c++23%> or "
+  "%<-std=gnu++23%>");
omitted_parms_loc = UNKNOWN_LOCATION;
  }
  
@@ -12293,8 +12293,8 @@ cp_parser_lambda_declarator_opt (cp_pars

if (omitted_parms_loc)
pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
 "parameter declaration before lambda trailing "
-"return type only optional with %<-std=c++2b%> or "
-"%<-std=gnu++2b%>");
+"return type only optional with %<-std=c++23%> or "
+"%<-std=gnu++23%>");
cp_lexer_consume_token (parser->lexer);
return_type = cp_parser_trailing_type_id (parser);
  }
@@ -13069,7 +13069,7 @@ cp_parser_statement (cp_parser* parser,
  if (cxx_dialect < cxx23)
pedwarn (loc, OPT_Wc__23_extensions,
 "label at end of compound statement only available "
-"with %<-std=c++2b%> or %<-std=gnu++2b%>");
+"with %<-std=c++23%> or %<-std=gnu++23%>");
  return;
}
  in_compound_for_pragma = false;
@@ -13826,7 +13826,7 @@ cp_parser_selection_statement (cp_parser
if (cxx_dialect < cxx23)
  pedwarn (tok->location, OPT_Wc__23_extensions,
   "% only available with "
-  "%<-std=c++2b%> or %<-std=gnu++2b%>");
+  "%<-std=c++23%> or %<-std=gnu++23%>");
  
  	bool save_in_consteval_if_p = in_consteval_if_p;

statement = begin_if_stmt ();
@@ -15225,7 +15225,7 @@ cp_parser_jump_statement (cp_parser* par
  && cxx_dialect < cxx23)
{
  error ("% in % function only available with "
-"%<-std=c++2b%> or %<-std=gnu++2b%>");
+"%<-std=c++23%> or %<-std=gnu++23%>");
  cp_function_chain->invalid_constexpr = true;
}
  
--- gcc/cp/semantics.cc.jj	2025-01-10 10:32:28.723730783 +0100

+++ gcc/cp/semantics.cc 2025-01-15 12:16:14.453769788 +0100
@@ -3751,7 +3751,7 @@ finish_compound_literal (tree type, tree
else if (cxx_dialect < cxx23)
pedwarn (input_location, OPT_Wc__23_extensions,
 "% only available with "
-"%<-std=c++2b%> or %<-std=gnu++2b%>");
+"%<-std=c++23%> or %<-std=gnu++23%>");
type = do_auto_deduction (type, compound_literal, type, complain,
adc_variable_type);
if (type == error_mark_node)
--- gcc/cp/typeck2.cc.jj2025-01-15 08:43:39.071926221 +0100
+++ gcc/cp/typeck2.cc   2

Re: [PATCH] wwwdocs: experiments with a Python postprocessing script

2025-01-16 Thread David Malcolm
On Thu, 2025-01-16 at 22:58 +0800, Gerald Pfeifer wrote:
> On Wed, 15 Jan 2025, David Malcolm wrote:
> > The heading elements in our website contain "id" information,
> > but currently to find them you have to look at the page source,
> > whereas in the generated HTML for the manual we have e.g.:
> > 
> >   ¶
> > 
> > which shows up nicely in the browser in e.g.
> >   https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
> :
> > It's *very* helpful to have easily shareable links to within pages.
> 
> Absolutely agreed.
> 
> > I've never managed to build MetaHTML and have always just crossed
> > my 
> > fingers and hoped when making edits to the GCC website;
> > bin/preprocess 
> > just errors out for me immediately due to not finding mhc.
> 
> Yes, sadly the GNU project let MetaHTML die (though I raised this
> more 
> than once). I still think the concept as such was fine and it served
> us 
> well over the years, but building has been challenging 20 years ago
> and 
> would require some fierce source code editing nowadays. :-(
> 
> > So this patch as written replaces the invocation of mhc with an 
> > invocation of the python script, which of course drops various
> > features.
> 
> Yeah! I was hoping we could return to your script. IIRC I once pinged
> and 
> you were busy; happy to collaborate on finishing this up.
> 
> In any case, a few years ago I spent quite some time and effort to
> prepare 
> the stage, migrating the site to CSS (done) and making individual
> pages 
> self contained (also done), which removed most of the original
> MetaHTML 
> usage.
> 
> This is why things appear somewhat fine, even without MetaHTML
> available.
> 
> > and, for now, the loss of the mhc stuff here:
> >   https://dmalcolm.fedorapeople.org/gcc/2025-01-15/htdocs/
> > 
> > compared to:
> >   https://gcc.gnu.org/
> 
> So it appears the two biggest losses are
>  (1) the default footer on every page, and
>  (2) the navigation bar on the main page?
> Plus 
>  (3) loss of favicon.ico on every page,
>  (4) postprocessing of /install docs,
>  (5) no longer adding "- GNU Project" to every page title.
> 
> Anything else you are aware of?
> 
> > Gerald: if you have mhc working, can you please try adjusting the
> > bin/ so it runs *both*. mhc and the python script.
> 
> I have a 32-bit x86 build on a local machine which probably is 20+
> years 
> old, plus a comparable, though not identical, build on gcc.gnu.org.
> 
> Building newly is something I tried a while ago and gave up. Not 
> infeasible when one patches out code we don't need to some non-
> standard
> things, but painful and not worth it.
> 
> (I'm sorry, I'm not sure what you mean by the above, i.e., what you'd
> like 
> to see adjusted?)

Sorry for being unclear.

What I mean is that I think it's possible to run *both* mhc and my
script on the input files (my script takes a file, rather than stdin,
so it can't be done directly in a shell pipeline though).

But I don't have a working mhc so I can't test that; you do, so I was
hoping you could hack up preprocess so it runs both.

Alternatively I can try to write a version of the patch that does that
(but I can't test it locally).

I'd love to get rid of metahtml, but for now I just want easily
copyable links for the gcc 15 "changes" and porting guide.
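
To make the goal concrete, here is a rough sketch of the transformation,
written in C++ only for illustration (the script under discussion is
Python and is not reproduced here). add_heading_anchors, the "anchor"
class name and the regular expression are assumptions; a real
implementation would want a proper HTML parser.

```cpp
#include <regex>
#include <string>

// Append a self-link to every single-line heading that carries an id
// attribute; headings without an id are left untouched.
inline std::string
add_heading_anchors (const std::string &html)
{
  // Groups: 1 = tag name, 2 = attributes, 3 = id value, 4 = heading text.
  static const std::regex heading (
    R"re(<(h[1-6])([^>]*\bid="([^"]+)"[^>]*)>(.*?)</\1>)re");
  return std::regex_replace (
    html, heading,
    R"fmt(<$1$2>$4 <a class="anchor" href="#$3">&para;</a></$1>)fmt");
}
```

Given a heading with id="cxx", the output keeps the heading as is and adds
a link to #cxx after its text.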

> 
> > --- a/bin/preprocess
> > +++ b/bin/preprocess
> > @@ -33,8 +33,6 @@
> >  #
> >  # By Gerald Pfeifer  1999-12-29.
>    ^^
> Well, talking about old code! Back then MetaHTML built fine IIRC on
> your 
> average GNU/Linux distribution. :-/
> 
> 
> How do we best take it from there?
> 
> I believe at this point, and with MetaHTML unrecoverably dead for 10+
> years, and my website rework there isn't dramatically much left we're
> missing.
> 
> htdocs/style.mhtml actually tells us what, the two bigger items being
> (1) 
> default footer and (2) navigation bar on the main page as listed
> above.
> 
> (The BACKPATH code in style.mhtml was used for the GCJ main page,
> which is 
> gone now. Of course it would be nice to have navigation on every
> page, but 
> that's an enhancement, not a regression to avoid.)
> 
> 
> With those two addressed, and (3) possibly later on, I think we
> should 
> bite the bullet, rip off the bandaid, plunge into the cold water,
> whatever 
> idiom we want to use. :-)
> 
> 
> How can we tackle those?
> 
> Maybe some "macros", best HTML comments that insert text or include a
> text 
> file from a magic subdirectory? We could use that for the navigation 
> aspect.
> 
> And some "include this text before  in every single document"
> magic for the default footer?
> 
> (Disclaimer: these are just some ideas. There may be vastly better
> ways.)

If we're going that way, can we simply use a well-known Python
templating system, such as Jinja:
  https://jinja.palletsprojects.com/en/stable/

or just migrate to a well-known static site generator.  I looked for
ones implemented in Python and found Pelican:
  http

Re: [PATCH] c++, v2: Fix up reshape_* RAW_DATA_CST handling [PR118214]

2025-01-16 Thread Jason Merrill

On 1/15/25 4:29 PM, Jakub Jelinek wrote:

On Wed, Jan 15, 2025 at 10:27:56AM -0500, Jason Merrill wrote:

@@ -7432,12 +7426,18 @@ reshape_init_r (tree type, reshape_iter
{
  vec *v = 0;
  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
- tree raw_init = cp_maybe_split_raw_data (d);
+ bool inc_cur;
+ reshape_iter dsave = *d;
+ tree raw_init = cp_maybe_split_raw_data (d, &inc_cur);
  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
  raw_init ? raw_init : d->cur->value);
  if (has_designator_problem (d, complain))
-   return error_mark_node;
- if (!raw_init)
+   {
+ if (!inc_cur)
+   *d = dsave;


Why restore *d on the error path?  Won't the caller just return
error_mark_node as well?


Here is a patch with a better replacement for that hunk.
As I wrote, the reason I haven't used consume_init there was that
has_designator_problem needs to test the old, not new d->cur, and
it was called after the splitting.
Calling it before that or allocating the va_gc vector is better and
matches what reshape_init_r does at the start of the function.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2025-01-15  Jakub Jelinek  

PR c++/118214
* decl.cc (struct reshape_iter): Add raw_idx member.
(cp_maybe_split_raw_data): Add inc_cur parameter, set *inc_cur,
don't modify original CONSTRUCTOR, use d->raw_idx to track index
into a RAW_DATA_CST d->cur->value.
(consume_init): Adjust cp_maybe_split_raw_data caller, increment
d->cur when inc_cur is true.
(reshape_init_array_1): Don't modify original CONSTRUCTOR when
handling RAW_DATA_CST d->cur->value and !reuse, instead use
d->raw_idx to track index into RAW_DATA_CST.
(reshape_single_init): Initialize iter.raw_idx.
(reshape_init_class): Adjust for introduction of d->raw_idx,
adjust cp_maybe_split_raw_data caller, do d->cur++ if inc_cur
rather than when it returns non-NULL.
(reshape_init_r): Check for has_designator_problem for second
half of _Complex earlier, also check for
error_operand_p (d->cur->value).  Use consume_init instead of
cp_maybe_split_raw_data with later conditional d->cur++.
(reshape_init): Initialize d.raw_idx.

* g++.dg/cpp/embed-17.C: New test.
* g++.dg/cpp0x/pr118214.C: New test.
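
The core idea of the ChangeLog above (track an index into the raw data
instead of modifying the original CONSTRUCTOR) can be sketched outside the
compiler. init_elt, init_iter and consume_one below are hypothetical
stand-ins for constructor_elt, reshape_iter and consume_init; the model
ignores everything except the cursor handling.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element: either a single value (raw == false, one byte)
// or a run of raw bytes standing in for a RAW_DATA_CST.
struct init_elt
{
  bool raw;
  std::vector<unsigned char> bytes;
};

// Cursor over the elements; raw_idx indexes into cur->bytes while cur is
// a raw run, mirroring the new raw_idx member.
struct init_iter
{
  const init_elt *cur;
  const init_elt *end;
  std::size_t raw_idx;
};

// Return the next scalar.  The underlying elements are never modified;
// cur only advances once a raw run has been fully consumed.
inline int
consume_one (init_iter &d)
{
  if (!d.cur->raw)
    {
      int v = d.cur->bytes[0];
      ++d.cur;
      return v;
    }
  int v = d.cur->bytes[d.raw_idx++];
  if (d.raw_idx == d.cur->bytes.size ())
    {
      d.raw_idx = 0;
      ++d.cur;
    }
  return v;
}
```

Because the elements stay const, a caller can save a copy of the iterator
and restore it later without any raw-data pointer or length to undo.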

--- gcc/cp/decl.cc.jj   2024-12-27 16:03:52.622536496 +0100
+++ gcc/cp/decl.cc  2025-01-15 19:26:49.765532110 +0100
@@ -6823,11 +6823,13 @@ check_for_uninitialized_const_var (tree
  
  /* Structure holding the current initializer being processed by reshape_init.
 CUR is a pointer to the current element being processed, END is a pointer
-   after the last element present in the initializer.  */
+   after the last element present in the initializer and RAW_IDX is index into
+   RAW_DATA_CST if that is CUR elt.  */
  struct reshape_iter
  {
constructor_elt *cur;
constructor_elt *end;
+  unsigned raw_idx;
  };
  
  static tree reshape_init_r (tree, reshape_iter *, tree, tsubst_flags_t);

@@ -6895,18 +6897,20 @@ is_direct_enum_init (tree type, tree ini
  }
  
  /* Helper function for reshape_init*.  Split first element of

-   RAW_DATA_CST and save the rest to d->cur->value.  */
+   RAW_DATA_CST or return NULL for other elements.  Set *INC_CUR
+   to true if the whole d->cur has been consumed.  */
  
  static tree

-cp_maybe_split_raw_data (reshape_iter *d)
+cp_maybe_split_raw_data (reshape_iter *d, bool *inc_cur)
  {
+  *inc_cur = true;
if (TREE_CODE (d->cur->value) != RAW_DATA_CST)
  return NULL_TREE;
-  tree ret = *raw_data_iterator (d->cur->value, 0);
-  ++RAW_DATA_POINTER (d->cur->value);
-  --RAW_DATA_LENGTH (d->cur->value);
-  if (RAW_DATA_LENGTH (d->cur->value) == 1)
-d->cur->value = *raw_data_iterator (d->cur->value, 0);
+  tree ret = *raw_data_iterator (d->cur->value, d->raw_idx++);
+  if (d->raw_idx != (unsigned) RAW_DATA_LENGTH (d->cur->value))
+*inc_cur = false;
+  else
+d->raw_idx = 0;
return ret;
  }
  
@@ -6918,9 +6922,11 @@ cp_maybe_split_raw_data (reshape_iter *d

  static tree
  consume_init (tree init, reshape_iter *d)
  {
-  if (tree raw_init = cp_maybe_split_raw_data (d))
-return raw_init;
-  d->cur++;
+  bool inc_cur;
+  if (tree raw_init = cp_maybe_split_raw_data (d, &inc_cur))
+init = raw_init;
+  if (inc_cur)
+d->cur++;
return init;
  }
  
@@ -6979,10 +6985,8 @@ reshape_init_array_1 (tree elt_type, tre

  {
tree elt_init;
constructor_elt *old_cur = d->cur;
-  const char *old_raw_data_ptr = NULL;
-
-  if (TREE_CODE (d->cur->value) == RAW_DATA_CST)
-   old_raw_data_ptr = RAW_DATA_POINTER (d->cur->value);
+  unsigned int old_raw_idx = d->raw_idx;
+  bool old_raw_data_cst = TREE_CODE (d->cur->value) == RAW_DATA_CST;
  
if (d->cur->index

[Patch] Fortran/OpenMP: Fix declare_variant's 'adjust_args' mishandling with return by reference [PR118321]

2025-01-16 Thread Tobias Burnus

For declare_variant's 'adjust_args', the arguments for need_device_ptr
are referenced by argument number. This patch fixes that calculation
if hidden arguments due to return-by-reference calling exist.

Note: It only fixes the Fortran issue; the C++ issue still needs to be fixed.
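
To make the index adjustment concrete, here is a small standalone sketch of the calculation. It is only a model under stated assumptions: ProcInfo, hidden_arg_offset and call_arg_index are hypothetical names for this example, not gfortran interfaces. It assumes one hidden argument for a by-reference result and a second one for the length of a character result, as the patch below does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a procedure, for illustration only.
struct ProcInfo {
  bool returns_by_reference;  // result is passed back via a hidden argument
  bool returns_character;     // character results also carry a hidden length
  std::vector<std::string> dummy_args;  // user-visible dummy arguments
};

// Number of hidden arguments placed before the user-visible ones.
inline int hidden_arg_offset(const ProcInfo &p) {
  if (!p.returns_by_reference)
    return 0;
  return p.returns_character ? 2 : 1;
}

// 0-based position of NAME in the actual call, counting hidden arguments.
inline int call_arg_index(const ProcInfo &p, const std::string &name) {
  for (std::size_t i = 0; i < p.dummy_args.size(); ++i)
    if (p.dummy_args[i] == name)
      return static_cast<int>(i) + hidden_arg_offset(p);
  assert(false && "name is not a dummy argument");
  return -1;
}
```

For a character function with dummy arguments (a,b,c,d,e), 'c' has user-visible index 2 but sits at position 4 in the call, after the hidden result pointer and length.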

Build + regtested on x86-64-gnu-linux.
Any comments before I commit it?

Tobias
Fortran/OpenMP: Fix declare_variant's 'adjust_args' mishandling with return by reference [PR118321]

declare_variant's 'adjust_args' clause references the arguments in the
middle end by the argument position; this has to account for hidden
arguments that are inserted before due to return by reference,
as done in this commit.

	PR fortran/118321

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_declare_variant): Honor hidden
	arguments for adjust_args' need_device_ptr.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/adjust-args-12.f90: New test.

 gcc/fortran/trans-openmp.cc   | 14 ++--
 gcc/testsuite/gfortran.dg/gomp/adjust-args-12.f90 | 40 +++
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 2c6192820cc..d3ebc9b4745 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -8622,7 +8622,7 @@ gfc_trans_omp_declare_variant (gfc_namespace *ns)
 	  if (!search_ns->proc_name->attr.function
 	  && !search_ns->proc_name->attr.subroutine)
 	gfc_error ("The base name for % must be "
-		   "specified at %L ", &odv->where);
+		   "specified at %L", &odv->where);
 	  else
 	error_found = false;
 	}
@@ -8821,6 +8821,13 @@ gfc_trans_omp_declare_variant (gfc_namespace *ns)
 		  // Handle adjust_args
 		  tree need_device_ptr_list = make_node (TREE_LIST);
 		  vec adjust_args_list = vNULL;
+		  int arg_idx_offset = 0;
+		  if (gfc_return_by_reference (ns->proc_name))
+		{
+		  arg_idx_offset++;
+		  if (ns->proc_name->ts.type == BT_CHARACTER)
+			arg_idx_offset++;
+		}
 		  for (gfc_omp_namelist *arg_list = odv->adjust_args_list;
 		   arg_list != NULL; arg_list = arg_list->next)
 		{
@@ -8847,14 +8854,15 @@ gfc_trans_omp_declare_variant (gfc_namespace *ns)
 			if (arg->sym == arg_list->sym)
 			  break;
 			  gcc_assert (arg != NULL);
+			  // Store 0-based argument index,
+			  // as in gimplify_call_expr
 			  need_device_ptr_list = chainon (
 			need_device_ptr_list,
 			build_tree_list (
 			  NULL_TREE,
 			  build_int_cst (
 integer_type_node,
-idx))); // Store 0-based argument index,
-	// as in gimplify_call_expr
+idx + arg_idx_offset)));
 			}
 		}
 
diff --git a/gcc/testsuite/gfortran.dg/gomp/adjust-args-12.f90 b/gcc/testsuite/gfortran.dg/gomp/adjust-args-12.f90
new file mode 100644
index 000..94fdd6c7a62
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/adjust-args-12.f90
@@ -0,0 +1,40 @@
+! { dg-additional-options "-fdump-tree-gimple" }
+
+! PR fortran/118321
+
+! Ensure that hidden arguments (return by reference) do not mess up the
+! argument counting of need_device_ptr
+
+! Here, we want to process the 3rd argument: 'c' as dummy argument = 'y' as actual.
+
+
+! { dg-final { scan-tree-dump-times "__builtin_omp_get_mapped_ptr" 1 "gimple" } }
+! { dg-final { scan-tree-dump "D\.\[0-9\]+ = __builtin_omp_get_mapped_ptr \\(y, D\.\[0-9\]+\\);" "gimple" } }
+
+! { dg-final { scan-tree-dump " \\(&pstr.\[0-9\], &slen.\[0-9\], &\"abc\"\\\[1\\\]\{lb: 1 sz: 1\}, x, D\.\[0-9\]+, z, &\"cde\"\\\[1\\\]\{lb: 1 sz: 1\}, 3, 3\\);" "gimple" } }
+
+module m
+  use iso_c_binding
+  implicit none (type, external)
+contains
+  character(:) function  (a,b,c,d,e)
+allocatable :: 
+character(*) :: a, e
+type(c_ptr), value :: b,c,d
+  end
+  character(:) function  (a,b,c,d,e)
+!$omp declare variant() match(construct={dispatch})  &
+!$omp&  adjust_args(need_device_ptr : c)
+allocatable :: 
+character(*) :: a, e
+type(c_ptr), value :: b,c,d
+  end
+end module m
+
+use m
+implicit none (type, external)
+type(c_ptr) :: x,y,z
+character(len=:), allocatable :: str
+!$omp dispatch
+  str =  ("abc", x, y, z, "cde")
+end


[PATCH] Fix uniqueness of symtab_node::get_dump_name.

2025-01-16 Thread Michal Jires
symtab_node::get_dump_name uses node order to identify nodes.
Order is no longer unique because of Incremental LTO patches.
This patch moves uid from cgraph_node node to symtab_node,
so get_dump_name can use uid instead and get back unique dump names.

In inlining passes, uid is replaced with more appropriate (more compact
for indexing) summary id.
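
A minimal sketch of why a per-node uid gives unique dump names where order does not. Node, Table and dump_name are hypothetical simplifications for this example, not the cgraph.h classes, and the counter starting at 1 is an assumption inferred from the selftest expectation changing to "test_decl/1".

```cpp
#include <cassert>
#include <string>

// Hypothetical, much-simplified symbol table node.
struct Node {
  std::string name;
  int order = -1;  // may end up shared by several nodes
  int uid = -1;    // assigned once at registration, never reused
};

// Hypothetical, much-simplified symbol table.
struct Table {
  int max_uid = 1;  // assumed starting value, see the note above

  // Every registered node gets a fresh uid, whatever its order is.
  void register_node(Node &n, int order) {
    assert(n.uid == -1);
    n.order = order;
    n.uid = max_uid++;
  }
};

// Dump name built from the uid rather than the order.
inline std::string dump_name(const Node &n) {
  return n.name + "/" + std::to_string(n.uid);
}
```

Two nodes with the same name and the same order still get distinct dump names, since the suffix comes from the counter and not from the order.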

Bootstrapped/regtested on x86_64-linux.
Ok for trunk?

gcc/ChangeLog:

* cgraph.cc (symbol_table::create_empty):
Move uid to symtab_node.
(test_symbol_table_test): Change expected dump id.
* cgraph.h (struct cgraph_node):
Move uid to symtab_node.
(symbol_table::register_symbol): Likewise.
* dumpfile.cc (test_capture_of_dump_calls):
Change expected dump id.
* ipa-inline.cc (update_caller_keys):
Use summary id instead of uid.
(update_callee_keys): Likewise.
* symtab.cc (symtab_node::get_dump_name):
Use uid instead of order.

gcc/testsuite/ChangeLog:

* gcc.dg/live-patching-1.c: Change expected dump id.
* gcc.dg/live-patching-4.c: Likewise.
---
 gcc/cgraph.cc  |  4 ++--
 gcc/cgraph.h   | 25 ++---
 gcc/dumpfile.cc|  8 
 gcc/ipa-inline.cc  |  6 +++---
 gcc/symtab.cc  |  2 +-
 gcc/testsuite/gcc.dg/live-patching-1.c |  2 +-
 gcc/testsuite/gcc.dg/live-patching-4.c |  2 +-
 7 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 83a9b59ef30..d0b19ad850e 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -290,7 +290,7 @@ cgraph_node *
 symbol_table::create_empty (void)
 {
   cgraph_count++;
-  return new (ggc_alloc ()) cgraph_node (cgraph_max_uid++);
+  return new (ggc_alloc ()) cgraph_node ();
 }
 
 /* Register HOOK to be called with DATA on each removed edge.  */
@@ -4338,7 +4338,7 @@ test_symbol_table_test ()
   /* Verify that the node has order 0 on both iterations,
 and thus that nodes have predictable dump names in selftests.  */
   ASSERT_EQ (node->order, 0);
-  ASSERT_STREQ (node->dump_name (), "test_decl/0");
+  ASSERT_STREQ (node->dump_name (), "test_decl/1");
 }
 }
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 7856d53c9e9..065fcc742e8 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -124,7 +124,7 @@ public:
   order (-1), next_sharing_asm_name (NULL),
   previous_sharing_asm_name (NULL), same_comdat_group (NULL), ref_list (),
   alias_target (NULL), lto_file_data (NULL), aux (NULL),
-  x_comdat_group (NULL_TREE), x_section (NULL)
+  x_comdat_group (NULL_TREE), x_section (NULL), m_uid (-1)
   {}
 
   /* Return name.  */
@@ -492,6 +492,12 @@ public:
   /* Perform internal consistency checks, if they are enabled.  */
   static inline void checking_verify_symtab_nodes (void);
 
+  /* Get unique identifier of the node.  */
+  inline int get_uid ()
+  {
+return m_uid;
+  }
+
   /* Type of the symbol.  */
   ENUM_BITFIELD (symtab_type) type : 8;
 
@@ -668,6 +674,9 @@ protected:
  void *data,
  bool include_overwrite);
 private:
+  /* Unique id of the node.  */
+  int m_uid;
+
   /* Workers for set_section.  */
   static bool set_section_from_string (symtab_node *n, void *s);
   static bool set_section_from_node (symtab_node *n, void *o);
@@ -882,7 +891,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   friend class symbol_table;
 
   /* Constructor.  */
-  explicit cgraph_node (int uid)
+  explicit cgraph_node ()
 : symtab_node (SYMTAB_FUNCTION), callees (NULL), callers (NULL),
   indirect_calls (NULL),
   next_sibling_clone (NULL), prev_sibling_clone (NULL), clones (NULL),
@@ -903,7 +912,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   redefined_extern_inline (false), tm_may_enter_irr (false),
   ipcp_clone (false), gc_candidate (false),
   called_by_ifunc_resolver (false), has_omp_variant_constructs (false),
-  m_uid (uid), m_summary_id (-1)
+  m_summary_id (-1)
   {}
 
   /* Remove the node from cgraph and all inline clones inlined into it.
@@ -1304,12 +1313,6 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
 dump_cgraph (stderr);
   }
 
-  /* Get unique identifier of the node.  */
-  inline int get_uid ()
-  {
-return m_uid;
-  }
-
   /* Get summary id of the node.  */
   inline int get_summary_id ()
   {
@@ -1503,8 +1506,6 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   unsigned has_omp_variant_constructs : 1;
 
 private:
-  /* Unique id of the node.  */
-  int m_uid;
 
   /* Summary id that is recycled.  */
   int m_summary_id;
@@ -2815,6 +2816,8 @@ symbol_table::register_symbol (symtab_node *node)
 nodes->previous = node;
   nodes = node;
 
+  nodes->m_uid = cgraph_max_uid++;
+

[patch,avr] Use INT_N to built-in define __int24.

2025-01-16 Thread Georg-Johann Lay

This patch uses the INT_N interface to define __int24.

Ok for trunk?

Johann

--

AVR: Use INT_N to built-in define __int24.

This patch uses the INT_N interface to define __int24 in avr-modes.def.

Since the testsuite uses -Wpedantic and __int24 is a C/C++ extension,
uses of __int24 and __uint24 are now marked as __extension__.

PR target/118329
gcc/
* config/avr/avr-modes.def: Add INT_N (PSI, 24).
* config/avr/avr.cc (avr_init_builtin_int24)
<__int24>: Remove definition.
<__uint24>: Adjust definition to INT_N interface.
gcc/testsuite/
* gcc.target/avr/pr115830-add.c (__int24, __uint24): Add __extension__
to respective typedefs.
* gcc.target/avr/pr115830-sub-ext.c: Same.
* gcc.target/avr/pr115830-sub.c: Same.
* gcc.target/avr/torture/get-mem.c: Same.
* gcc.target/avr/torture/set-mem.c: Same.
* gcc.target/avr/torture/ifelse-c.h: Same.
* gcc.target/avr/torture/ifelse-d.h: Same.
* gcc.target/avr/torture/ifelse-q.h: Same.
* gcc.target/avr/torture/ifelse-r.h: Same.
* gcc.target/avr/torture/int24-mul.c: Same.
* gcc.target/avr/torture/pr109907-2.c: Same.
* gcc.target/avr/torture/pr61443.c: Same.
* gcc.target/avr/torture/pr63633-ice-mult.c: Same.
* gcc.target/avr/torture/shift-l-u24.c: Same.
* gcc.target/avr/torture/shift-r-i24.c: Same.
* gcc.target/avr/torture/shift-r-u24.c: Same.
* gcc.target/avr/torture/add-extend.c: Same.
* gcc.target/avr/torture/sub-extend.c: Same.
* gcc.target/avr/torture/sub-zerox.c: Same.
* gcc.target/avr/torture/test-gprs.h: Same.

diff --git a/gcc/config/avr/avr-modes.def b/gcc/config/avr/avr-modes.def
index e69636a6099..de350135f95 100644
--- a/gcc/config/avr/avr-modes.def
+++ b/gcc/config/avr/avr-modes.def
@@ -18,6 +18,7 @@
.  */
 
 FRACTIONAL_INT_MODE (PSI, 24, 3);
+INT_N (PSI, 24);
 
 /* Used when the N (and Z) flag(s) of SREG are set.
The N flag indicates  whether the value is negative.
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 05a6905b5d6..1b2349ea784 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -15336,11 +15336,13 @@ avr_builtin_decl (unsigned id, bool /*initialize_p*/)
 static void
 avr_init_builtin_int24 (void)
 {
-  tree int24_type  = make_signed_type (GET_MODE_BITSIZE (PSImode));
-  tree uint24_type = make_unsigned_type (GET_MODE_BITSIZE (PSImode));
-
-  lang_hooks.types.register_builtin_type (int24_type, "__int24");
-  lang_hooks.types.register_builtin_type (uint24_type, "__uint24");
+  for (int i = 0; i < NUM_INT_N_ENTS; ++i)
+if (int_n_data[i].bitsize == 24)
+  {
+	tree uint24_type = int_n_trees[i].unsigned_type;
+	lang_hooks.types.register_builtin_type (uint24_type, "__uint24");
+	break;
+  }
 }
 
 
diff --git a/gcc/testsuite/gcc.target/avr/pr115830-add.c b/gcc/testsuite/gcc.target/avr/pr115830-add.c
index 99ac89cd0a6..89ce4847141 100644
--- a/gcc/testsuite/gcc.target/avr/pr115830-add.c
+++ b/gcc/testsuite/gcc.target/avr/pr115830-add.c
@@ -5,8 +5,8 @@ typedef __UINT8_TYPE__ u8;
 typedef __INT8_TYPE__  i8;
 typedef __UINT16_TYPE__ u16;
 typedef __INT16_TYPE__  i16;
-typedef __uint24 u24;
-typedef __int24  i24;
+__extension__ typedef __uint24 u24;
+__extension__ typedef __int24  i24;
 typedef __UINT32_TYPE__ u32;
 typedef __INT32_TYPE__  i32;
 
diff --git a/gcc/testsuite/gcc.target/avr/pr115830-sub-ext.c b/gcc/testsuite/gcc.target/avr/pr115830-sub-ext.c
index 3fac6ddd0df..3f959ab2cb9 100644
--- a/gcc/testsuite/gcc.target/avr/pr115830-sub-ext.c
+++ b/gcc/testsuite/gcc.target/avr/pr115830-sub-ext.c
@@ -5,8 +5,8 @@ typedef __UINT8_TYPE__ u8;
 typedef __INT8_TYPE__  i8;
 typedef __UINT16_TYPE__ u16;
 typedef __INT16_TYPE__  i16;
-typedef __uint24 u24;
-typedef __int24  i24;
+__extension__ typedef __uint24 u24;
+__extension__ typedef __int24  i24;
 typedef __UINT32_TYPE__ u32;
 typedef __INT32_TYPE__  i32;
 
diff --git a/gcc/testsuite/gcc.target/avr/pr115830-sub.c b/gcc/testsuite/gcc.target/avr/pr115830-sub.c
index ef24e74752d..5ad5e8918a1 100644
--- a/gcc/testsuite/gcc.target/avr/pr115830-sub.c
+++ b/gcc/testsuite/gcc.target/avr/pr115830-sub.c
@@ -5,8 +5,8 @@ typedef __UINT8_TYPE__ u8;
 typedef __INT8_TYPE__  i8;
 typedef __UINT16_TYPE__ u16;
 typedef __INT16_TYPE__  i16;
-typedef __uint24 u24;
-typedef __int24  i24;
+__extension__ typedef __uint24 u24;
+__extension__ typedef __int24  i24;
 typedef __UINT32_TYPE__ u32;
 typedef __INT32_TYPE__  i32;
 
diff --git a/gcc/testsuite/gcc.target/avr/torture/add-extend.c b/gcc/testsuite/gcc.target/avr/torture/add-extend.c
index 320f510e677..d87077a218b 100644
--- a/gcc/testsuite/gcc.target/avr/torture/add-extend.c
+++ b/gcc/testsuite/gcc.target/avr/torture/add-extend.c
@@ -2,12 +2,12 @@
 
 typedef __UINT8_TYPE__ u8;
 typedef __UINT16_TYPE__ u16;
-typedef __uint24 u24;
+__extensio

Re: Ping: [PATCH] d,ada/spec: only sub nostd{inc,lib} rather than nostd{inc,lib}*

2025-01-16 Thread Iain Buclaw
Excerpts from Arsen Arsenović's message of October 7, 2024 11:31 pm:
> Ping on this patch.
> 

Thanks, this is OK for D.

Note that the gdc driver does try to accommodate mixing C++ and D
code in the same application - so given:

gdc a.d b.cc

It adds -lstdc++ to the link command for convenience.

Iain.


RE: [RFC] PR81358: Enable automatic linking of libatomic

2025-01-16 Thread Prathamesh Kulkarni


> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 10 January 2025 09:48
> To: Thomas Schwinge 
> Cc: Tobias Burnus ; Joseph Myers
> ; Xi Ruoyao ; Matthew
> Malcomson ; gcc-patches@gcc.gnu.org; Tom de
> Vries 
> Subject: RE: [RFC] PR81358: Enable automatic linking of libatomic
> 
> 
> > -Original Message-
> > From: Thomas Schwinge 
> > Sent: 07 January 2025 17:45
> > To: Prathamesh Kulkarni 
> > Cc: Tobias Burnus ; Joseph Myers
> > ; Xi Ruoyao ; Matthew
> > Malcomson ; gcc-patches@gcc.gnu.org; Tom de
> > Vries 
> > Subject: RE: [RFC] PR81358: Enable automatic linking of libatomic
> >
> >
> > Hi Prathamesh!
> Hi Thomas, thanks for the review!
> >
> > Thanks for working on this!
> >
> >
> > Per my understanding, this patch won't automagically resolve the
> need
> > to
> > (occasionally) having to specify '-foffload-options=nvptx-none=-
> > latomic'
> > for nvptx offloading: it doesn't use 'LINK_LIBATOMIC_SPEC',
> currently
> > only used via 'GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC' from
> > 'gcc/config/gnu-user.h' (general issue, affecting a lot of
> > configurations, to be addressed as necessary):
> >
> > > --- a/gcc/config/gnu-user.h
> > > +++ b/gcc/config/gnu-user.h
> >
> > >  #define GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC \
> > > -  "%{static|static-pie:--start-group} %G %{!nolibc:%L} \
> > > +  "%{static|static-pie:--start-group} %G %{!nolibc:"
> > > + LINK_LIBATOMIC_SPEC "%L} \
> > > %{static|static-pie:--end-group}%{!static:%{!static-pie:%G}}"
> >
> > > --- a/gcc/gcc.cc
> > > +++ b/gcc/gcc.cc
> >
> > >  /* Here is the spec for running the linker, after compiling all
> > > files.  */
> > >
> > > +#if defined(TARGET_PROVIDES_LIBATOMIC) &&
> defined(USE_LD_AS_NEEDED)
> > > +#define LINK_LIBATOMIC_SPEC "%{!fno-link-libatomic:"
> > LD_AS_NEEDED_OPTION \
> > > + " -latomic " LD_NO_AS_NEEDED_OPTION "} "
> > > +#else
> > > +#define LINK_LIBATOMIC_SPEC ""
> > > +#endif
> > > +
> > >  /* This is overridable by the target in case they need to specify
> > the
> > > -lgcc and -lc order specially, yet not require them to
> override
> > all
> > > of LINK_COMMAND_SPEC.  */
> >
> > ..., and the nvptx linker doesn't support '--as-needed'.
> >
> > I'll think about it; indeed it'd be good to get that resolved, too.
> >
> >
> > On 2024-12-20T15:37:42+, Prathamesh Kulkarni
> >  wrote:
> > > [...] copying libatomic.a  over to $(gcc_objdir)$(MULTISUBDIR)/,
> and
> > > can confirm that 64-bit libatomic.a is copied to $build/gcc/ and
> 32-
> > bit libatomic.a is copied to $build/gcc/32/.
> >
> > So this:
> >
> > > --- a/libatomic/Makefile.am
> > > +++ b/libatomic/Makefile.am
> >
> > > @@ -162,6 +162,11 @@ libatomic_convenience_la_LIBADD =
> > > $(libatomic_la_LIBADD)  # when it is reloaded during the build of
> > all-multi.
> > >  all-multi: $(libatomic_la_LIBADD)
> > >
> > > +gcc_objdir = $(MULTIBUILDTOP)../../$(host_subdir)/gcc
> > > +all: all-multi libatomic.la libatomic_convenience.la
> > > + $(INSTALL_DATA) .libs/libatomic.a
> $(gcc_objdir)$(MULTISUBDIR)/
> > > + chmod 644 $(gcc_objdir)$(MULTISUBDIR)/libatomic.a
> >
> > ... is conceptually modelled after libgcc, where the libraries get
> > copied into 'gcc/'?  However, here we only copy the static
> > 'libatomic.a'; libgcc does a 'make install-leaf', see
> > 'libgcc/Makefile.in':
> >
> > all: all-multi
> > # Now that we have built all the objects, we need to copy
> > # them back to the GCC directory.  Too many things (other
> > # in-tree libraries, and DejaGNU) know about the layout
> > # of the build tree, for now.
> > $(MAKE) install-leaf DESTDIR=$(gcc_objdir) \
> >   slibdir= libsubdir= MULTIOSDIR=$(MULTIDIR)
> >
> > ..., which also installs dynamic libraries.  Is that difference
> > intentional and/or possibly important?
> Well, I wasn't sure what extension to use for shared libraries, and
> initially avoided copying them.
> libgcc seems to use $(SHLIB_EXT) to specify extension name for shared
> libraries, which can be overridden by targets.
> 
> Matthew pointed out to me that using libtool --mode=install works for
> copying both static and shared libraries (with the numbered version
> libatomic.so.1.2.0), so in the attached patch, I changed Makefile.am
> rule to following:
> gcc_objdir = `pwd`/$(MULTIBUILDTOP)../../gcc/
> all: all-multi libatomic.la libatomic_convenience.la
> $(LIBTOOL) --mode=install $(INSTALL_DATA) libatomic.la
> $(gcc_objdir)$(MULTISUBDIR)/
> 
> Which seems to install libatomic.a, libatomic.so and the numbered
> version in $build/gcc/ and in $build/gcc/32/ (and $build/gcc/mgomp/
> for nvptx build).
> Does it look OK ?
> >
> > Does libatomic even need a switch corresponding to '-static-libgcc'?
> I am not sure, hoping for Joseph to chime in.
> >
> > Given that libatomic libraries get copied

[committed] d: Add testcase for fixed PR116373

2025-01-16 Thread Iain Buclaw
Hi,

This patch adds a testcase for a PR that was fixed in upstream, and
merged in r15-6559-g332cf038fda109.

Regression tested on x86_64-linux-gnu, and committed to mainline.

Regards,
Iain.

---
PR d/116373

gcc/testsuite/ChangeLog:

* gdc.dg/pr116373.d: New test.
---
 gcc/testsuite/gdc.dg/pr116373.d | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/gdc.dg/pr116373.d

diff --git a/gcc/testsuite/gdc.dg/pr116373.d b/gcc/testsuite/gdc.dg/pr116373.d
new file mode 100644
index 000..b58863bacf2
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr116373.d
@@ -0,0 +1,8 @@
+// { dg-do compile }
+int[] x;
+
+void foo (int[] y = x[]) {}
+
+void main () {
+foo();
+}
-- 
2.43.0



Re: [PATCH] rs6000: Fix ICE for invalid constants in built-in functions

2025-01-16 Thread Peter Bergner
On 1/13/25 3:59 PM, Peter Bergner wrote:
> rs6000: Fix ICE for invalid constants in built-in functions
> 
> For invalid constant operand values used in built-in functions, return
> const0_rtx to signify an error occurred during expansion.
> 
> Bootstrapped and retested on powerpc64le-linux with no regressions.
> Ok for trunk and backports after some trunk burn-in time?

Approved offline by Segher, so I pushed the commit.
I'll let it bake on trunk for a bit before backporting.

Peter




Re: [PATCH] rs6000: Fix loop limit for built-in constant checking

2025-01-16 Thread Peter Bergner
On 1/10/25 11:18 AM, Peter Bergner wrote:
> The loop checking for built-in constant operand restrictions was missing
> some operands due to the loop limit being too small.  Fixing that exposed
> a testsuite failure which is caused by a typo in the pmxvi4ger8pp definition
> where we had made the PMASK field too small.
> 
> Bootstrapped and retested on powerpc64le-linux with no regressions.
> Ok for trunk and backports after some trunk burn-in time?

Approved offline by Segher, so I pushed the commit.
I'll let it bake on trunk for a bit before backporting.

Peter




[GCC-14][committed] d: Fix ICE in dmd.expressionsem.resolveLoc

2025-01-16 Thread Iain Buclaw
Hi,

This patch backports the individual fix for PR116373 from the upstream
merge in r15-6559-g332cf038fda109 into the releases/gcc-14 branch.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, and
committed to branch.

Regards,
Iain.

---
PR d/116373

gcc/d/ChangeLog:

* dmd/expressionsem.d (resolveLoc): Check for null pointer before
resolving bounds of slice.

gcc/testsuite/ChangeLog:

* gdc.dg/pr116373.d: New test.
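
The shape of the fix can be sketched outside the D front end as follows. This is a hypothetical C++ model, not the dmd code: Expr and resolve_loc are invented for this example, with null lower/upper pointers standing in for the missing bounds of a slice written as x[].

```cpp
#include <cassert>
#include <memory>

// Hypothetical expression node.  For a slice, LOWER and UPPER are the
// optional bounds; both are null when the slice is written as x[].
struct Expr {
  int line = 0;
  std::unique_ptr<Expr> base;
  std::unique_ptr<Expr> lower;
  std::unique_ptr<Expr> upper;
};

// Propagate a source line through the tree, skipping absent children
// instead of dereferencing them.
inline void resolve_loc(Expr &e, int line) {
  e.line = line;
  if (e.base)
    resolve_loc(*e.base, line);
  if (e.lower)
    resolve_loc(*e.lower, line);
  if (e.upper)
    resolve_loc(*e.upper, line);
}
```

Without the guards on lower and upper, a bound-less slice would dereference a null child, which is the kind of crash the test below exercises through a default argument of the form x[].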

(cherry picked from commit c7dab40d7569c51ac4e62ceea05c7c049da426bb)
---
 gcc/d/dmd/expressionsem.d   | 6 --
 gcc/testsuite/gdc.dg/pr116373.d | 8 
 2 files changed, 12 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr116373.d

diff --git a/gcc/d/dmd/expressionsem.d b/gcc/d/dmd/expressionsem.d
index 7ae7f400d16..a6425d31143 100644
--- a/gcc/d/dmd/expressionsem.d
+++ b/gcc/d/dmd/expressionsem.d
@@ -15331,8 +15331,10 @@ Expression resolveLoc(Expression exp, const ref Loc loc, Scope* sc)
 Expression visitSlice(SliceExp exp)
 {
 exp.e1 = exp.e1.resolveLoc(loc, sc);
-exp.lwr = exp.lwr.resolveLoc(loc, sc);
-exp.upr = exp.upr.resolveLoc(loc, sc);
+if (exp.lwr)
+exp.lwr = exp.lwr.resolveLoc(loc, sc);
+if (exp.upr)
+exp.upr = exp.upr.resolveLoc(loc, sc);
 
 return exp;
 }
diff --git a/gcc/testsuite/gdc.dg/pr116373.d b/gcc/testsuite/gdc.dg/pr116373.d
new file mode 100644
index 000..b58863bacf2
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr116373.d
@@ -0,0 +1,8 @@
+// { dg-do compile }
+int[] x;
+
+void foo (int[] y = x[]) {}
+
+void main () {
+foo();
+}
-- 
2.43.0



Re: [PATCH v6 0/6] Remaining patches for metadirectives/dynamic selectors

2025-01-16 Thread Sandra Loosemore

On 1/14/25 20:38, Sandra Loosemore wrote:

I've incorporated a fix for Tobias's recent comment about recognizing
C_OMP_DIR_META in the "assumes" directive in the C and C++ front ends, but
I've deferred fixing the wording of the existing error message because it
affects a pile of testcases unrelated to metadirectives, and also because
I want to tweak the Fortran message too.  I think I can address this later
using my documentation maintainer superpowers.  ;-)


I've now pushed the attached cleanup for that issue, along with parts 1, 
2, 3, and 5 of this series.  Part 4 (the Fortran front end changes) is 
still pending review, and part 6 depends on part 4.


Besides diagnostic message wording being within my realm as 
documentation maintainer, Tobias approved the attached patch offline as 
well.


-Sandra
From e3514e01c60def48dfbba51805ec8cce0e1949e8 Mon Sep 17 00:00:00 2001
From: Sandra Loosemore 
Date: Wed, 15 Jan 2025 17:22:53 +
Subject: [PATCH] OpenMP: Improve error message for invalid directive in
 "assumes".

gcc/c/ChangeLog
	* c-parser.cc (c_parser_omp_assumption_clauses): Give a more specific
	error message for invalid directives vs unknown names.

gcc/cp/ChangeLog
	* parser.cc (cp_parser_omp_assumption_clauses): Give a more specific
	error message for invalid directives vs unknown names.

gcc/fortran/ChangeLog
	* openmp.cc (gfc_omp_absent_contains_clause): Use an Oxford comma
	in the message.

gcc/testsuite/ChangeLog
	* c-c++-common/gomp/assume-2.c: Adjust expected diagnostics.
	* c-c++-common/gomp/assumes-2.c: Likewise.
	* c-c++-common/gomp/begin-assumes-2.c: Likewise.
	* gfortran.dg/gomp/allocate-6.f90: Likewise.
	* gfortran.dg/gomp/assumes-2.f90: Likewise.
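
The ordering of the two checks can be sketched in isolation like this. It is a simplified model, not the parser code: DirKind, DirLookup and assumption_diagnostic are hypothetical names for this example, the message strings are abbreviated, and the extra conditions on PRAGMA_OMP_END and on second/third name components are left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical directive categories for this example.
enum class DirKind { Executable, Declarative, Informational, Meta };

// Hypothetical result of looking up a directive name; FOUND is false when
// the name is not a known directive at all.
struct DirLookup {
  bool found;
  DirKind kind;
};

// Pick the diagnostic for a directive named in an absent/contains clause.
// Returns an empty string when the directive is acceptable.
inline std::string assumption_diagnostic(const DirLookup &dir) {
  // Known directive of a forbidden kind: say why it is not permitted.
  if (dir.found
      && (dir.kind == DirKind::Declarative
          || dir.kind == DirKind::Informational
          || dir.kind == DirKind::Meta))
    return "invalid OpenMP directive name: declarative, informational, "
           "and meta directives not permitted";
  // Name that does not correspond to any directive.
  if (!dir.found)
    return "unknown OpenMP directive name";
  return "";
}
```

Testing the known-but-forbidden case first is what lets the user see the more specific message; the generic "unknown" message is kept for names that fail the lookup.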
---
 gcc/c/c-parser.cc | 19 ---
 gcc/cp/parser.cc  | 13 +
 gcc/fortran/openmp.cc |  2 +-
 gcc/testsuite/c-c++-common/gomp/assume-2.c| 10 +-
 gcc/testsuite/c-c++-common/gomp/assumes-2.c   | 10 +-
 .../c-c++-common/gomp/begin-assumes-2.c   | 10 +-
 gcc/testsuite/gfortran.dg/gomp/allocate-6.f90 |  4 ++--
 gcc/testsuite/gfortran.dg/gomp/assumes-2.f90  |  2 +-
 8 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index b45c7ef961f..f193329099f 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -28846,13 +28846,18 @@ c_parser_omp_assumption_clauses (c_parser *parser, bool is_assume)
 			= c_omp_categorize_directive (directive[0],
 		  directive[1],
 		  directive[2]);
-		  if (dir == NULL
-			  || dir->kind == C_OMP_DIR_DECLARATIVE
-			  || dir->kind == C_OMP_DIR_INFORMATIONAL
-			  || dir->kind == C_OMP_DIR_META
-			  || dir->id == PRAGMA_OMP_END
-			  || (!dir->second && directive[1])
-			  || (!dir->third && directive[2]))
+		  if (dir
+			  && (dir->kind == C_OMP_DIR_DECLARATIVE
+			  || dir->kind == C_OMP_DIR_INFORMATIONAL
+			  || dir->kind == C_OMP_DIR_META))
+			error_at (dloc, "invalid OpenMP directive name in "
+	"%qs clause argument: declarative, "
+	"informational, and meta directives "
+	"not permitted", p);
+		  else if (dir == NULL
+			   || dir->id == PRAGMA_OMP_END
+			   || (!dir->second && directive[1])
+			   || (!dir->third && directive[2]))
 			error_at (dloc, "unknown OpenMP directive name in "
 	"%qs clause argument", p);
 		  else
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 82fac649d0c..775680a067d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -49887,10 +49887,15 @@ cp_parser_omp_assumption_clauses (cp_parser *parser, cp_token *pragma_tok,
 			= c_omp_categorize_directive (directive[0],
 		  directive[1],
 		  directive[2]);
-		  if (dir == NULL
-			  || dir->kind == C_OMP_DIR_DECLARATIVE
-			  || dir->kind == C_OMP_DIR_INFORMATIONAL
-			  || dir->kind == C_OMP_DIR_META
+		  if (dir
+			  && (dir->kind == C_OMP_DIR_DECLARATIVE
+			  || dir->kind == C_OMP_DIR_INFORMATIONAL
+			  || dir->kind == C_OMP_DIR_META))
+			error_at (dloc, "invalid OpenMP directive name in "
+	"%qs clause argument: declarative, "
+	"informational, and meta directives "
+	"not permitted", p);
+		  else if (dir == NULL
 			  || dir->id == PRAGMA_OMP_END
 			  || (!dir->second && directive[1])
 			  || (!dir->third && directive[2]))
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index c2adefd58d6..b49745a9fe0 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -1642,7 +1642,7 @@ gfc_omp_absent_contains_clause (gfc_omp_assumptions **assume, bool is_absent)
 	  || kind == GFC_OMP_DIR_META)
 	{
 	  gfc_error ("Invalid %qs directive at %L in %s clause: declarative, "
-		 "informational and meta directives not permitted",
+		 "informational, and meta directives not permitted",
 		 gfc_ascii_statement (st, true), &old_loc,
 		 is_absent ? "ABSENT" : "CONTAINS");
 	  return MATCH_ERROR;
d

Re: [PATCH v3 1/2] RISC-V: Update Xsfvfnrclip implementation.

2025-01-16 Thread Kito Cheng
committed, thanks :)

On Fri, Dec 13, 2024 at 8:39 PM Jiawei  wrote:
>
> Update implementation of Xsfvfnrclip, using return type as iterator.
>
> gcc/ChangeLog:
>
> * config/riscv/genrvv-type-indexer.cc (expand_floattype): New func.
> (main): New type.
> * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_XFQF_OPS): 
> New def.
> (vint8mf8_t): Ditto.
> (vint8mf4_t): Ditto.
> (vint8mf2_t): Ditto.
> (vint8m1_t): Ditto.
> (vint8m2_t): Ditto.
> * config/riscv/riscv-vector-builtins.cc (DEF_RVV_XFQF_OPS): Ditto.
> (rvv_arg_type_info::get_xfqf_float_type): Ditto.
> * config/riscv/riscv-vector-builtins.def (xfqf_vector): Ditto.
> (xfqf_float): Ditto.
> * config/riscv/riscv-vector-builtins.h
> *(struct rvv_arg_type_info): New function prototype.
> * config/riscv/sifive-vector.md: Update iterator.
> * config/riscv/vector-iterators.md: Ditto.
>
> ---
>  gcc/config/riscv/genrvv-type-indexer.cc   | 17 ++
>  .../riscv/riscv-vector-builtins-types.def | 13 
>  gcc/config/riscv/riscv-vector-builtins.cc | 33 +++
>  gcc/config/riscv/riscv-vector-builtins.def|  4 ++-
>  gcc/config/riscv/riscv-vector-builtins.h  |  1 +
>  gcc/config/riscv/sifive-vector.md | 10 +++---
>  gcc/config/riscv/vector-iterators.md  | 25 +++---
>  7 files changed, 78 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/config/riscv/genrvv-type-indexer.cc 
> b/gcc/config/riscv/genrvv-type-indexer.cc
> index e1eee34237a..a2974269adc 100644
> --- a/gcc/config/riscv/genrvv-type-indexer.cc
> +++ b/gcc/config/riscv/genrvv-type-indexer.cc
> @@ -164,6 +164,18 @@ floattype (unsigned sew, int lmul_log2)
>return mode.str ();
>  }
>
> +std::string
> +expand_floattype (unsigned sew, int lmul_log2, unsigned nf)
> +{
> +  if (sew != 8 || nf!= 1
> +  || (!valid_type (sew * 4, lmul_log2 + 2, /*float_t*/ true)))
> +return "INVALID";
> +
> +  std::stringstream mode;
> +  mode << "vfloat" << sew * 4 << to_lmul (lmul_log2 + 2) << "_t";
> +  return mode.str ();
> +}
> +
>  std::string
>  floattype (unsigned sew, int lmul_log2, unsigned nf)
>  {
> @@ -276,6 +288,7 @@ main (int argc, const char **argv)
>fprintf (fp, "  /*QLMUL1*/ INVALID,\n");
>fprintf (fp, "  /*QLMUL1_SIGNED*/ INVALID,\n");
>fprintf (fp, "  /*QLMUL1_UNSIGNED*/ INVALID,\n");
> +  fprintf (fp, "  /*XFQF*/ INVALID,\n");
>for (unsigned eew : {8, 16, 32, 64})
> fprintf (fp, "  /*EEW%d_INTERPRET*/ INVALID,\n", eew);
>
> @@ -384,6 +397,8 @@ main (int argc, const char **argv)
>  inttype (8, /*lmul_log2*/ 0, false).c_str ());
> fprintf (fp, "  /*QLMUL1_UNSIGNED*/ %s,\n",
>  inttype (8, /*lmul_log2*/ 0, true).c_str ());
> +   fprintf (fp, "  /*XFQF*/ %s,\n",
> +expand_floattype (sew, lmul_log2, nf).c_str ());
> for (unsigned eew : {8, 16, 32, 64})
>   {
> if (eew == sew)
> @@ -473,6 +488,7 @@ main (int argc, const char **argv)
>  bfloat16_wide_type (/*lmul_log2*/ 0).c_str ());
> fprintf (fp, "  /*QLMUL1_SIGNED*/ INVALID,\n");
> fprintf (fp, "  /*QLMUL1_UNSIGNED*/ INVALID,\n");
> +   fprintf (fp, "  /*XFQF*/ INVALID,\n");
> for (unsigned eew : {8, 16, 32, 64})
>   fprintf (fp, "  /*EEW%d_INTERPRET*/ INVALID,\n", eew);
>
> @@ -558,6 +574,7 @@ main (int argc, const char **argv)
>floattype (sew / 4, /*lmul_log2*/ 0).c_str ());
>   fprintf (fp, "  /*QLMUL1_SIGNED*/ INVALID,\n");
>   fprintf (fp, "  /*QLMUL1_UNSIGNED*/ INVALID,\n");
> + fprintf (fp, "  /*XFQF*/ INVALID,\n");
>   for (unsigned eew : {8, 16, 32, 64})
> fprintf (fp, "  /*EEW%d_INTERPRET*/ INVALID,\n", eew);
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def 
> b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 96412bfd1a5..df55b6a8823 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -363,6 +363,12 @@ along with GCC; see the file COPYING3. If not see
>  #define DEF_RVV_QMACC_OPS(TYPE, REQUIRE)
>  #endif
>
> +/* Use "DEF_RVV_XFQF_OPS" macro include signed integer which will
> +   be iterated and registered as intrinsic functions.  */
> +#ifndef DEF_RVV_XFQF_OPS
> +#define DEF_RVV_XFQF_OPS(TYPE, REQUIRE)
> +#endif
> +
>  DEF_RVV_I_OPS (vint8mf8_t, RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_I_OPS (vint8mf4_t, 0)
>  DEF_RVV_I_OPS (vint8mf2_t, 0)
> @@ -1451,6 +1457,12 @@ DEF_RVV_QMACC_OPS (vint32m2_t, 0)
>  DEF_RVV_QMACC_OPS (vint32m4_t, 0)
>  DEF_RVV_QMACC_OPS (vint32m8_t, 0)
>
> +DEF_RVV_XFQF_OPS (vint8mf8_t, 0)
> +DEF_RVV_XFQF_OPS (vint8mf4_t, 0)
> +DEF_RVV_XFQF_OPS (vint8mf2_t, 0)
> +DEF_RVV_XFQF_OPS (vint8m1_t, 0)
> +DEF_RVV_XFQF_OPS (vint8m2_t, 0)
> +
>  #undef DEF_RVV_I_OPS

Re: [PATCH v3 2/2] RISC-V: Update Xsfvqmacc and Xsfvfnrclip's testcases

2025-01-16 Thread Kito Cheng
committed, thanks :)

On Fri, Dec 13, 2024 at 8:39 PM Jiawei  wrote:
>
> From: Liao Shihua 
>
> Update SiFive Xsfvqmacc and Xsfvfnrclip extensions' testcases.
>
> version log:
> Synchronize LMUL settings with the return type.
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md: New attr set.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c: Add vsetivli checking.
> * gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_xu_f_qf.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmacc_2x8x2.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmacc_4x8x4.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccsu_2x8x2.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccsu_4x8x4.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccu_2x8x2.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccu_4x8x4.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccus_2x8x2.c: Ditto.
> * gcc.target/riscv/rvv/xsfvector/sf_vqmaccus_4x8x4.c: Ditto.
>
> ---
>  gcc/config/riscv/vector.md|  7 ++-
>  .../riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c  | 60 ++
>  .../riscv/rvv/xsfvector/sf_vfnrclip_xu_f_qf.c | 63 ++-
>  .../riscv/rvv/xsfvector/sf_vqmacc_2x8x2.c | 16 +
>  .../riscv/rvv/xsfvector/sf_vqmacc_4x8x4.c | 16 +
>  .../riscv/rvv/xsfvector/sf_vqmaccsu_2x8x2.c   | 17 +
>  .../riscv/rvv/xsfvector/sf_vqmaccsu_4x8x4.c   | 17 +
>  .../riscv/rvv/xsfvector/sf_vqmaccu_2x8x2.c| 16 +
>  .../riscv/rvv/xsfvector/sf_vqmaccu_4x8x4.c| 17 +
>  .../riscv/rvv/xsfvector/sf_vqmaccus_2x8x2.c   | 17 +
>  .../riscv/rvv/xsfvector/sf_vqmaccus_4x8x4.c   | 17 +
>  11 files changed, 259 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index 58406f3d17c..d24916d2caf 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -56,7 +56,8 @@
>   
> vssegtux,vssegtox,vlsegdff,vandn,vbrev,vbrev8,vrev8,vcpop,vclz,vctz,vrol,\
>   
> vror,vwsll,vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\
>   
> vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c,\
> - vfncvtbf16,vfwcvtbf16,vfwmaccbf16")
> + vfncvtbf16,vfwcvtbf16,vfwmaccbf16,\
> + sf_vqmacc,sf_vfnrclip")
>  (const_string "true")]
> (const_string "false")))
>
> @@ -893,7 +894,7 @@
>   
> vfredo,vfwredu,vfwredo,vslideup,vslidedown,vislide1up,\
>   
> vislide1down,vfslide1up,vfslide1down,vgather,viwmuladd,vfwmuladd,\
>   
> vlsegds,vlsegdux,vlsegdox,vandn,vrol,vror,vwsll,vclmul,vclmulh,\
> - vfwmaccbf16")
> + vfwmaccbf16,sf_vqmacc,sf_vfnrclip")
>(symbol_ref "riscv_vector::get_ta(operands[6])")
>
>  (eq_attr "type" "vimuladd,vfmuladd")
> @@ -924,7 +925,7 @@
>   vfwalu,vfwmul,vfsgnj,vfcmp,vslideup,vslidedown,\
>   
> vislide1up,vislide1down,vfslide1up,vfslide1down,vgather,\
>   
> viwmuladd,vfwmuladd,vlsegds,vlsegdux,vlsegdox,vandn,vrol,\
> -  vror,vwsll,vclmul,vclmulh,vfwmaccbf16")
> + 
> vror,vwsll,vclmul,vclmulh,vfwmaccbf16,sf_vqmacc,sf_vfnrclip")
>(symbol_ref "riscv_vector::get_ma(operands[7])")
>
>  (eq_attr "type" "vimuladd,vfmuladd")
> diff --git 
> a/gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c
> index 813f7860f64..a4193b5aea9 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vfnrclip_x_f_qf.c
> @@ -7,6 +7,7 @@
>  /*
>  ** test_sf_vfnrclip_x_f_qf_i8mf8_vint8mf8_t:
>  ** ...
> +** vsetivli\s+zero+,0+,e8+,mf8+,ta+,ma+
>  ** sf\.vfnrclip\.x\.f\.qf\tv[0-9]+,v[0-9]+,fa[0-9]+
>  ** ...
>  */
> @@ -17,6 +18,7 @@ vint8mf8_t 
> test_sf_vfnrclip_x_f_qf_i8mf8_vint8mf8_t(vfloat32mf2_t vs2, float rs1
>  /*
>  ** test_sf_vfnrclip_x_f_qf_i8mf4_vint8mf4_t:
>  ** ...
> +** vsetivli\s+zero+,0+,e8+,mf4+,ta+,ma+
>  ** sf\.vfnrclip\.x\.f\.qf\tv[0-9]+,v[0-9]+,fa[0-9]+
>  ** ...
>  */
> @@ -27,6 +29,7 @@ vint8mf4_t 
> test_sf_vfnrclip_x_f_qf_i8mf4_vint8mf4_t(vfloat32m1_t vs2, float rs1,
>  /*
>  ** test_sf_vfnrclip_x_f_qf_i8mf2_vint8mf2_t:
>  ** ...
> +** vsetivli\s+zero+,0+,e8+,mf2+,ta+,ma+
>  ** sf\.vfnrclip\.x\.f\.qf\tv[0-9]+,v[0-9]+,fa[0-9]+
>  ** ...
>  */
> @@ -37,6 +40,7 @@ vint8mf2_t 
> test_sf_vfnrclip_x_f_qf_i8mf2_vint8mf2_t(vfloat32m2_t vs2, float rs1,
>  /*
>  ** test_sf_vfnrclip_x_f_qf_i8m1_vint8m1_t:
>  ** ...
> +** vsetivli\s+zero+,0+,e8+,m1+,ta+,ma+
>  ** sf\.vfnrclip\.x\.f\.qf\tv[0-9]+,v

Honor dump options for C/C++ '-fdump-tree-original'

2025-01-16 Thread Thomas Schwinge
Hi!

I have noticed that '-fdump-tree-original-lineno' for Fortran (for
example) does dump location information, but for C/C++ it does not.
The reason is that Fortran (and other front ends) use code like:

/* Output the GENERIC tree.  */
dump_function (TDI_original, fndecl);

..., but 'gcc/c-family/c-gimplify.cc:c_genericize' has some special code
to "Dump the C-specific tree IR", and that (unless 'TDF_RAW') calls
'gcc/c-family/c-pretty-print.cc:print_c_tree', and appears to completely
ignore the 'dump_flags_t'.  (Ignores it in 'c_pretty_printer::statement',
and passes 'TDF_NONE' into 'dump_generic_node'.)
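
To make the behaviour described above concrete, here is a minimal, self-contained toy model; it is not GCC code.  The names 'ToyPrinter', 'toy_flags', 'TOY_NONE', 'TOY_LINENO' and 'toy_dump_node' are hypothetical stand-ins for 'c_pretty_printer', 'dump_flags_t', 'TDF_NONE', 'TDF_LINENO' and 'dump_generic_node'.  It only sketches the general idea: a printer that always forwards "no flags" drops the location information, while one that stores the caller's flags and forwards them keeps it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-ins for dump_flags_t, TDF_NONE and TDF_LINENO.
using toy_flags = unsigned;
constexpr toy_flags TOY_NONE = 0;
constexpr toy_flags TOY_LINENO = 1u << 0;

// A "statement" that knows which source line it came from.
struct ToyStmt {
  std::string text;
  int line;
};

// Generic dumper: prints a "[line N] " prefix only when the flag is set.
inline void toy_dump_node(std::ostream &os, const ToyStmt &s, toy_flags flags) {
  if (flags & TOY_LINENO)
    os << "[line " << s.line << "] ";
  os << s.text;
}

// Printer that remembers the flags it was constructed with and forwards
// them, instead of hard-coding TOY_NONE at the call site.
class ToyPrinter {
public:
  explicit ToyPrinter(toy_flags flags = TOY_NONE) : flags_(flags) {}

  std::string statement(const ToyStmt &s) const {
    std::ostringstream os;
    toy_dump_node(os, s, flags_);
    return os.str();
  }

private:
  toy_flags flags_;
};
```

Constructing 'ToyPrinter (TOY_LINENO)' and calling 'statement' yields the prefixed form; the default-constructed printer behaves like the current C/C++ dump.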

See the attached "Honor dump options for C/C++ '-fdump-tree-original'"
for what I have quickly hacked up.  Does that make any sense to do like
this, and if yes, how much more polish does this need, or if no, how
should we approach this issue otherwise?

(I need this, no surprise, for use in test cases.)


Regards
 Thomas


From 5820291e06c3f5ae7002ef1ec735e4e6b8590c1f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 16 Jan 2025 15:32:56 +0100
Subject: [PATCH] Honor dump options for C/C++ '-fdump-tree-original'

---
 gcc/c-family/c-gimplify.cc |  2 +-
 gcc/c-family/c-pretty-print.cc | 29 -
 gcc/c-family/c-pretty-print.h  |  6 --
 3 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index 89a1f5c1e80..54fb912124d 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -701,7 +701,7 @@ c_genericize (tree fndecl)
 	dump_node (DECL_SAVED_TREE (fndecl),
 		   TDF_SLIM | local_dump_flags, dump_orig);
   else
-	print_c_tree (dump_orig, DECL_SAVED_TREE (fndecl));
+	print_c_tree (dump_orig, DECL_SAVED_TREE (fndecl), local_dump_flags);
   fprintf (dump_orig, "\n");
 }
 
diff --git a/gcc/c-family/c-pretty-print.cc b/gcc/c-family/c-pretty-print.cc
index 0b6810e1224..1ce19f54988 100644
--- a/gcc/c-family/c-pretty-print.cc
+++ b/gcc/c-family/c-pretty-print.cc
@@ -2858,6 +2858,9 @@ c_pretty_printer::statement (tree t)
 {
 
 case SWITCH_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_c_ws_string (this, "switch");
   pp_space (this);
   pp_c_left_paren (this);
@@ -2875,6 +2878,9 @@ c_pretty_printer::statement (tree t)
 	for ( expression(opt) ; expression(opt) ; expression(opt) ) statement
 	for ( declaration expression(opt) ; expression(opt) ) statement  */
 case WHILE_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_c_ws_string (this, "while");
   pp_space (this);
   pp_c_left_paren (this);
@@ -2887,6 +2893,9 @@ c_pretty_printer::statement (tree t)
   break;
 
 case DO_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_c_ws_string (this, "do");
   pp_newline_and_indent (this, 3);
   statement (DO_BODY (t));
@@ -2901,6 +2910,9 @@ c_pretty_printer::statement (tree t)
   break;
 
 case FOR_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_c_ws_string (this, "for");
   pp_space (this);
   pp_c_left_paren (this);
@@ -2929,6 +2941,9 @@ c_pretty_printer::statement (tree t)
 	continue ;
 	return expression(opt) ;  */
 case BREAK_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_string (this, "break");
   if (BREAK_NAME (t))
 	{
@@ -2940,6 +2955,9 @@ c_pretty_printer::statement (tree t)
   break;
 
 case CONTINUE_STMT:
+  if (dump_flags != TDF_NONE)
+	internal_error ("dump flags not handled here");
+
   pp_string (this, "continue");
   if (CONTINUE_NAME (t))
 	{
@@ -2953,15 +2971,16 @@ c_pretty_printer::statement (tree t)
 default:
   if (pp_needs_newline (this))
 	pp_newline_and_indent (this, 0);
-  dump_generic_node (this, t, pp_indentation (this), TDF_NONE, true);
+  dump_generic_node (this, t, pp_indentation (this), dump_flags, true);
 }
 }
 
 
 /* Initialize the PRETTY-PRINTER for handling C codes.  */
 
-c_pretty_printer::c_pretty_printer ()
+c_pretty_printer::c_pretty_printer (dump_flags_t dump_flags)
   : pretty_printer (),
+dump_flags (dump_flags),
 offset_list (),
 flags ()
 {
@@ -2981,9 +3000,9 @@ c_pretty_printer::clone () const
 /* Print the tree T in full, on file FILE.  */
 
 void
-print_c_tree (FILE *file, tree t)
+print_c_tree (FILE *file, tree t, dump_flags_t dump_flags)
 {
-  c_pretty_printer pp;
+  c_pretty_printer pp (dump_flags);
 
   pp_needs_newline (&pp) = true;
   pp.set_output_stream (file);
@@ -2996,7 +3015,7 @@ print_c_tree (FILE *file, tree t)
 DEBUG_FUNCTION void
 debug_c_tree (tree t)
 {
-  print_c_tree (stderr, t);
+  print_c_tree (stderr, t, TDF_NONE);
   fputc ('\n', stderr);
 }
 
diff --git a/gcc/c-family/c-pretty-print.h b/gcc/c-family/c-pretty-pr

Re: [PATCH] wwwdocs: experiments with a Python postprocessing script

2025-01-16 Thread Gerald Pfeifer
On Wed, 15 Jan 2025, David Malcolm wrote:
> The heading elements in our website contain "id" information,
> but currently to find them you have to look at the page source,
> whereas in the generated HTML for the manual we have e.g.:
> 
>   ¶
> 
> which shows up nicely in the browser in e.g.
>   https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
:
> It's *very* helpful to have easily shareable links to within pages.

Absolutely agreed.

> I've never managed to build MetaHTML and have always just crossed my 
> fingers and hoped when making edits to the GCC website; bin/preprocess 
> just errors out for me immediately due to not finding mhc.

Yes, sadly the GNU project let MetaHTML die (though I raised this more 
than once). I still think the concept as such was fine and it served us 
well over the years, but building it was already challenging 20 years ago and 
would require some fierce source code editing nowadays. :-(

> So this patch as written replaces the invocation of mhc with an 
> invocation of the python script, which of course drops various features.

Yeah! I was hoping we could return to your script. IIRC I once pinged and 
you were busy; happy to collaborate on finishing this up.

In any case, a few years ago I spent quite some time and effort to prepare 
the stage, migrating the site to CSS (done) and making individual pages 
self contained (also done), which removed most of the original MetaHTML 
usage.

This is why things appear somewhat fine, even without MetaHTML available.

> and, for now, the loss of the mhc stuff here:
>   https://dmalcolm.fedorapeople.org/gcc/2025-01-15/htdocs/
> 
> compared to:
>   https://gcc.gnu.org/

So it appears the two biggest losses are
 (1) the default footer on every page, and
 (2) the navigation bar on the main page?
Plus 
 (3) loss of favicon.ico on every page,
 (4) postprocessing of /install docs,
 (5) no longer adding "- GNU Project" to every page title.

Anything else you are aware of?

> Gerald: if you have mhc working, can you please try adjusting the
> bin/ so it runs *both*. mhc and the python script.

I have a 32-bit x86 build on a local machine which probably is 20+ years 
old, plus a comparable, though not identical, build on gcc.gnu.org.

Building it afresh is something I tried a while ago and gave up on. It is not 
infeasible if one patches out code we don't need that does some non-standard
things, but it is painful and not worth it.

(I'm sorry, I'm not sure what you mean by the above, i.e., what you'd like 
to see adjusted?)

> --- a/bin/preprocess
> +++ b/bin/preprocess
> @@ -33,8 +33,6 @@
>  #
>  # By Gerald Pfeifer  1999-12-29.
   ^^
Well, talking about old code! Back then MetaHTML built fine IIRC on your 
average GNU/Linux distribution. :-/


How do we best take it from there?

I believe at this point, with MetaHTML unrecoverably dead for 10+ years 
and after my website rework, there isn't dramatically much left that we're 
missing.

htdocs/style.mhtml actually tells us what, the two bigger items being (1) 
default footer and (2) navigation bar on the main page as listed above.

(The BACKPATH code in style.mhtml was used for the GCJ main page, which is 
gone now. Of course it would be nice to have navigation on every page, but 
that's an enhancement, not a regression to avoid.)


With those two addressed, and (3) possibly later on, I think we should 
bite the bullet, rip off the band-aid, plunge into the cold water, whatever 
idiom we want to use. :-)


How can we tackle those?

Maybe some "macros", best HTML comments that insert text or include a text 
file from a magic subdirectory? We could use that for the navigation 
aspect.

And some "include this text before  in every single document"
magic for the default footer?

(Disclaimer: these are just some ideas. There may be vastly better ways.)
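
As a rough illustration of the "HTML comments as macros" idea above, here is a small, self-contained sketch.  Everything in it is an assumption made for illustration: the marker syntax `<!-- include: NAME -->`, the function name `expand_includes`, and the use of an in-memory map instead of files in a magic subdirectory.  It is written in C++ only to keep the sketch self-contained; a real postprocessing step could just as well live in the Python script.

```cpp
#include <cassert>
#include <map>
#include <string>

// Expand markers of the (hypothetical) form <!-- include: NAME --> by
// looking NAME up in `snippets`.  Markers with unknown names are copied
// through unchanged so that a missing snippet is easy to spot.
inline std::string
expand_includes(const std::string &html,
                const std::map<std::string, std::string> &snippets) {
  const std::string open = "<!-- include: ";
  const std::string close = " -->";
  std::string out;
  std::size_t pos = 0;
  while (true) {
    std::size_t start = html.find(open, pos);
    if (start == std::string::npos)
      break;
    std::size_t name_begin = start + open.size();
    std::size_t end = html.find(close, name_begin);
    if (end == std::string::npos)
      break;  // unterminated marker: leave the rest untouched
    out.append(html, pos, start - pos);  // text before the marker
    std::string name = html.substr(name_begin, end - name_begin);
    auto it = snippets.find(name);
    if (it != snippets.end())
      out += it->second;  // known snippet: insert its text
    else
      out.append(html, start, end + close.size() - start);  // keep marker
    pos = end + close.size();
  }
  out.append(html, pos, std::string::npos);  // trailing text
  return out;
}
```

With a snippet table containing a "footer" entry, a page that ends in `<!-- include: footer -->` would get the footer text substituted in, and the same mechanism could carry the navigation bar.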


What do you think?

Gerald


Re: [PATCH] c++: Make sure fold_sizeof_expr returns the correct type [PR117775]

2025-01-16 Thread Jason Merrill

On 1/16/25 7:19 AM, Simon Martin wrote:
> Hi Jakub, Jason,
>
> On 15 Jan 2025, at 22:55, Jakub Jelinek wrote:
>
>> On Wed, Jan 15, 2025 at 04:48:59PM -0500, Jason Merrill wrote:
>>>> --- a/gcc/cp/decl.cc
>>>> +++ b/gcc/cp/decl.cc
>>>> @@ -11686,6 +11686,7 @@ fold_sizeof_expr (tree t)
>>>> false, false);
>>>>  if (r == error_mark_node)
>>>>    r = size_one_node;
>>>> +  r = cp_fold_convert (TREE_TYPE (t), r);
>>>
>>> Instead of adding this conversion in all cases, let's change
>>> size_one_node
>>> to
>>>
>>> build_int_cst (size_type_node, 1)
>>
>> That would need to be r = build_int_cst (TREE_TYPE (t), 1);
>
> Jason is right: my patch does not need to do TREE_TYPE (t), and can
> simply use size_type_node - oversight on my part.
>
>> I guess, while that is maybe fine, I don't see how it could avoid
>> the cp_fold_convert call, because size_one_node can be returned
>> also from e.g. c-family c_sizeof_or_alignof_type or its typeck.cc
>> callers,
>> or it can be size_int (something) etc.
>
> I took another look at the code paths in fold_sizeof_expr and I believe
> that we’re “good” in all of them except if we hit typeck.cc:2077
> (I have not been able so far to craft a test that does), because the
> code either builds a node with size_type_node as type, or goes through
> c_sizeof_or_alignof_type that fold_convert’s everything (except
> error_mark_node) to size_type_node (in c-common.cc:4028).
>
> I can run a regression test round with a patch that uses
> “build_int_cst (size_type_node, 1)” in decl.cc:11930 and
> typeck.cc:2077, which should cover all the *current* cases for
> fold_sizeof_expr. However, the initial patch has the advantage that:
>    - It’s consistent with what c_sizeof_or_alignof_type does
>    - It will cover the (unlikely?) possibility that some new code path is
>      added some day that does not use the right type
>    - It adds no cost to nominal cases, since cp_fold_convert does nothing
>      if we already have the right type
>
> Jason, would you still like me to test and submit a patch that uses
> build_int_cst in the two places identified above instead of doing
> cp_fold_convert (size_type_node, r) in decl.cc:11930?


No need; your original patch is OK, thanks.

Jason



RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Thursday, January 16, 2025 7:11 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> Richard Sandiford  writes:
> > Tamar Christina  writes:
> >> Ok for master? and how do you feel about a backport for the two patches
> >> to help distros?
> >
> > Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
> > since I think we should be extra cautious with the "most stable" branch,
> > but let's see what others think.
> >
> > OK for trunk, and for GCC 14 & 13 after a grace period, with one
> > trivial nit below:
> 
> Sorry, was concentrating too much on the -mcpu vs. -march preemption
> thing and forgot to think about other aspects of the patch.  The routine
> is used for all three of -march=native, -mcpu=native, and -mtune=native,
> so I think we want something like the following on top of your patch
> (untested so far).
> 

Cool, how's this one?

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

OK for master, and for backport to GCC 13 and 14?

Thanks,
Tamar

gcc/ChangeLog:

PR target/113257
* config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
(host_detect_local_cpu): Use it.

gcc/testsuite/ChangeLog:

PR target/113257
* gcc.target/aarch64/cpunative/info_34: New test.
* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
* gcc.target/aarch64/cpunative/info_35: New test.
* gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

Co-authored-by: Richard Sandiford 

-- inline copy of patch --

diff --git a/gcc/config/aarch64/driver-aarch64.cc 
b/gcc/config/aarch64/driver-aarch64.cc
index 
45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d
 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -60,6 +60,7 @@ struct aarch64_core_data
 #define ALL_VARIANTS ((unsigned)-1)
 /* Default architecture to use if -mcpu=native did not detect a known CPU.  */
 #define DEFAULT_ARCH "8A"
+#define DEFAULT_CPU "generic-armv8-a"
 
 #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, 
PART, VARIANT) \
   { CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT },
@@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
   return NULL;
 }
 
+/* Return an aarch64_core_data for the cpu described
+   by ID, or NULL if ID describes something we don't know about.  */
+
+static const aarch64_core_data *
+get_cpu_from_id (const char* name)
+{
+  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
+{
+  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
+   return &aarch64_cpu_data[i];
+}
+
+  return NULL;
+}
+
 /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
For an example CORE={0xd08, 0xd03} and
BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
@@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv)
 || variants[0] == aarch64_cpu_data[i].variant))
  break;
 
-  if (aarch64_cpu_data[i].name == NULL)
+  if (arch)
{
- auto arch_info = get_arch_from_id (DEFAULT_ARCH);
-
- gcc_assert (arch_info);
-
- res = concat ("-march=", arch_info->name, NULL);
- default_flags = arch_info->flags;
-   }
-  else if (arch)
-   {
- const char *arch_id = aarch64_cpu_data[i].arch;
+ const char *arch_id = (aarch64_cpu_data[i].name
+? aarch64_cpu_data[i].arch
+: DEFAULT_ARCH);
  auto arch_info = get_arch_from_id (arch_id);
 
  /* We got some arch indentifier that's not in aarch64-arches.def?  */
@@ -424,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv)
  res = concat ("-march=", arch_info->name, NULL);
  default_flags = arch_info->flags;
}
-  else
+  else if (cpu || aarch64_cpu_data[i].name)
{
- default_flags = aarch64_cpu_data[i].flags;
+ auto cpu_info = (aarch64_cpu_data[i].name
+  ? &aarch64_cpu_data[i]
+  : get_cpu_from_id (DEFAULT_CPU));
+ default_flags = cpu_info->flags;
  res = concat ("-m",
cpu ? "cpu" : "tune", "=",
-   aarch64_cpu_data[i].name,
+   cpu_info->name,
NULL);
}
 }
@@ -449,6 +461,20 @@ host_detect_local_cpu (int argc, const char **argv)
  break;
}
}
+
+  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
+features as the cores should have the same features.  So just pick
+the feature flags from any of the cpus.  *

Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng



On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote:

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 9d97f0216f0..3a8e1297bd3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
  
/* If it's an add + mult (which is equivalent to shift left) and

 it's immediate operand satisfies const_immalsl_operand predicate.  */
-  if ((mode == SImode || (TARGET_64BIT && mode == DImode))
- && GET_CODE (XEXP (x, 0)) == MULT)
+  if (code == PLUS


Hi,

This section of code is already within the "case PLUS" block, so I think 
the condition "code == PLUS" is unnecessary.


Otherwise LGTM.

Thanks!


+ && (mode == SImode || (TARGET_64BIT && mode == DImode)))
{
- rtx op2 = XEXP (XEXP (x, 0), 1);
- if (const_immalsl_operand (op2, mode))
+ HOST_WIDE_INT shamt = -1;
+ rtx lhs = XEXP (x, 0);
+ rtx_code code_lhs = GET_CODE (lhs);
+
+ switch (code_lhs)
+   {
+   case ASHIFT:
+ if (CONST_INT_P (XEXP (lhs, 1)))
+   shamt = INTVAL (XEXP (lhs, 1));
+ break;
+   case MULT:
+ if (CONST_INT_P (XEXP (lhs, 1)))
+   shamt = exact_log2 (INTVAL (XEXP (lhs, 1)));
+ break;
+   default:
+ break;
+   }
+
+ if (IN_RANGE (shamt, 1, 4))
{
  *total = (COSTS_N_INSNS (1)
-   + set_src_cost (XEXP (XEXP (x, 0), 0), mode, speed)
+   + set_src_cost (XEXP (lhs, 0), mode, speed)
+ set_src_cost (XEXP (x, 1), mode, speed));
  return true;
}
diff --git a/gcc/testsuite/gcc.target/loongarch/alsl-cost.c 
b/gcc/testsuite/gcc.target/loongarch/alsl-cost.c
new file mode 100644
index 000..a182279015c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/alsl-cost.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=loongarch64" } */
+/* { dg-final { scan-assembler-times "alsl\\\.\[wd\]" 2 } } */
+
+struct P
+{
+  long a, b;
+};
+
+struct P
+t (struct P x, long n)
+{
+  return (struct P){.a = x.a + n * 8, .b = x.b + n * 8};
+}




Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Xi Ruoyao
On Thu, 2025-01-16 at 20:30 +0800, Lulu Cheng wrote:
> 
> On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote:
> > diff --git a/gcc/config/loongarch/loongarch.cc 
> > b/gcc/config/loongarch/loongarch.cc
> > index 9d97f0216f0..3a8e1297bd3 100644
> > --- a/gcc/config/loongarch/loongarch.cc
> > +++ b/gcc/config/loongarch/loongarch.cc
> > @@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >   
> >     /* If it's an add + mult (which is equivalent to shift left) and
> >      it's immediate operand satisfies const_immalsl_operand predicate.  */
> > -  if ((mode == SImode || (TARGET_64BIT && mode == DImode))
> > -     && GET_CODE (XEXP (x, 0)) == MULT)
> > +  if (code == PLUS
> 
> Hi,
> 
> This section of code is already within the "case PLUS" block, so I think 
> the condition "code == PLUS" is unnecessary.

The reason is that "case MINUS:" falls through into "case PLUS:" and we
don't have a "slsl" instruction, so we need the check to reject things
like a - b * 4 here.
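
For readers following along, the fall-through issue can be shown with a small stand-alone toy; it is not the real loongarch_rtx_costs.  The names `toy_code`, `toy_exact_log2` and `toy_is_alsl` are made up for this sketch, and the 1..4 range simply mirrors the IN_RANGE (shamt, 1, 4) test in the patch.  Because the MINUS label shares the PLUS body, the body itself has to re-check the code:

```cpp
#include <cassert>

// Toy stand-ins for the PLUS and MINUS rtx codes.
enum toy_code { TOY_PLUS, TOY_MINUS };

// Returns log2 (v) when v is a positive power of two, otherwise -1.
inline int toy_exact_log2(long v) {
  if (v <= 0 || (v & (v - 1)) != 0)
    return -1;
  int n = 0;
  while ((v >>= 1) != 0)
    ++n;
  return n;
}

// True when "x CODE (y * mult)" could be costed as a single
// add-with-shift instruction in this toy model.
inline bool toy_is_alsl(toy_code code, long mult) {
  switch (code) {
  case TOY_MINUS:
    /* Falls through: MINUS shares the costing code below.  */
  case TOY_PLUS: {
    int shamt = toy_exact_log2(mult);
    // Without the explicit code check, "a - b * 4" would be accepted
    // here as well, although there is no subtracting counterpart.
    return code == TOY_PLUS && shamt >= 1 && shamt <= 4;
  }
  }
  return false;
}
```

In this model `toy_is_alsl (TOY_PLUS, 4)` is accepted, while the same multiplier with `TOY_MINUS` is rejected only because of the explicit check.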

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] c++: Make sure fold_sizeof_expr returns the correct type [PR117775]

2025-01-16 Thread Simon Martin
On 16 Jan 2025, at 16:05, Jason Merrill wrote:

> On 1/16/25 7:19 AM, Simon Martin wrote:
>> Hi Jakub, Jason,
>>
>> On 15 Jan 2025, at 22:55, Jakub Jelinek wrote:
>>
>>> On Wed, Jan 15, 2025 at 04:48:59PM -0500, Jason Merrill wrote:
>>>>> --- a/gcc/cp/decl.cc
>>>>> +++ b/gcc/cp/decl.cc
>>>>> @@ -11686,6 +11686,7 @@ fold_sizeof_expr (tree t)
>>>>>   false, false);
>>>>>  if (r == error_mark_node)
>>>>>    r = size_one_node;
>>>>> +  r = cp_fold_convert (TREE_TYPE (t), r);
>>>>
>>>> Instead of adding this conversion in all cases, let's change
>>>> size_one_node
>>>> to
>>>>
>>>> build_int_cst (size_type_node, 1)
>>> That would need to be r = build_int_cst (TREE_TYPE (t), 1);
>> Jason is right: my patch does not need to do TREE_TYPE (t), and can
>> simply use size_type_node - oversight on my part.
>>
>>> I guess, while that is maybe fine, I don't see how it could avoid
>>> the cp_fold_convert call, because size_one_node can be returned
>>> also from e.g. c-family c_sizeof_or_alignof_type or its typeck.cc
>>> callers,
>>> or it can be size_int (something) etc.
>> I took another look at the code paths in fold_sizeof_expr and I believe
>> that we’re “good” in all of them except if we hit typeck.cc:2077
>> (I have not been able so far to craft a test that does), because the
>> code either builds a node with size_type_node as type, or goes through
>> c_sizeof_or_alignof_type that fold_convert’s everything (except
>> error_mark_node) to size_type_node (in c-common.cc:4028).
>>
>> I can run a regression test round with a patch that uses
>> “build_int_cst (size_type_node, 1)” in decl.cc:11930 and
>> typeck.cc:2077, which should cover all the *current* cases for
>> fold_sizeof_expr. However, the initial patch has the advantage that:
>>    - It’s consistent with what c_sizeof_or_alignof_type does
>>    - It will cover the (unlikely?) possibility that some new code path is
>>      added some day that does not use the right type
>>    - It adds no cost to nominal cases, since cp_fold_convert does nothing
>>      if we already have the right type
>>
>> Jason, would you still like me to test and submit a patch that uses
>> build_int_cst in the two places identified above instead of doing
>> cp_fold_convert (size_type_node, r) in decl.cc:11930?
>
> No need; your original patch is OK, thanks.
Thanks Jason, I will merge it momentarily. Since it’s a regression 
from GCC 12, is it OK for branches as well?

Simon



Re: [PATCH] c++: 'this' capture clobbered during recursive inst [PR116756]

2025-01-16 Thread Patrick Palka
On Mon, 13 Jan 2025, Jason Merrill wrote:

> On 1/10/25 2:20 PM, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > OK for trunk?
> > 
> > The documentation for LAMBDA_EXPR_THIS_CAPTURE seems outdated because
> > it says the field is only used at parse time, but apparently it's also
> > used at instantiation time.
> > 
> > Non-'this' captures don't seem to be affected, because there is no
> > corresponding LAMBDA_EXPR field that gets clobbered, and instead their
> > uses get resolved via the local specialization mechanism which is
> > recursion aware.
> > 
> > The bug also disappears if we explicitly use this in the openSeries call,
> > i.e. this->openSeries(...), because that sidesteps the use of
> > maybe_resolve_dummy / LAMBDA_EXPR_THIS_CAPTURE for resolving the
> > implicit object, and instead gets resolved via the local specialization
> > mechanism.
> > 
> > Maybe this suggests that there's a better way to fix this, but I'm not
> > sure...
> 
> That does sound like an interesting direction.  Maybe for a generic lambda,
> LAMBDA_EXPR_THIS_CAPTURE could just refer to the captured parameter, and we
> use retrieve_local_specialization to find the proxy?

Like so?  Tested on x86_64-pc-linux-gnu, full bootstrap+regtest in
progress.

-- >8 --

Subject: [PATCH v2] c++: 'this' capture clobbered during recursive inst
 [PR116756]

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?

-- >8 --

Here during instantiation of the generic lambda's op() [with I = 0] we
substitute into the call self(self, cst<1>{}) which requires recursive
instantiation of the same op() [with I = 1] (which isn't deferred due to the
lambda's deduced return type).  During this recursive instantiation, the
DECL_EXPR case of tsubst_stmt clobbers LAMBDA_EXPR_THIS_CAPTURE to point
to the child op()'s specialized capture proxy instead of the parent's,
and the original value is never restored.

So later when substituting into the openSeries call in the parent op()
maybe_resolve_dummy uses the 'this' proxy belonging to the child op(),
which leads to a context mismatch ICE during gimplification of the
proxy.

An earlier version of this patch fixed this by making instantiate_body
save/restore LAMBDA_EXPR_THIS_CAPTURE during a lambda op() instantiation.
But it seems cleaner to avoid overwriting LAMBDA_EXPR_THIS_CAPTURE in the
first place by making it point to the non-specialized capture proxy, and
instead call retrieve_local_specialization as needed, which is what this
patch implements.  It's simpler then to not clear LAMBDA_EXPR_THIS_CAPTURE
after parsing/regenerating a lambda.
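
The difference between the two schemes can be pictured with a deliberately simplified toy model; none of it is GCC code.  `ToyLambda`, `LocalSpecs`, `instantiate_clobbering` and `instantiate_with_lookup` are hypothetical names: the string field plays the role of LAMBDA_EXPR_THIS_CAPTURE, and the per-call map is only an assumed stand-in for a local specialization table.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy lambda: one field shared by every instantiation of its op().
struct ToyLambda {
  std::string this_capture;
};

// Toy "local specializations": generic proxy -> specialized proxy.
using LocalSpecs = std::map<std::string, std::string>;

// Clobbering scheme: each instantiation overwrites the shared field and
// nothing restores it, so after a nested instantiation the parent sees
// the child's proxy.
inline std::string instantiate_clobbering(ToyLambda &l, int depth,
                                          int max_depth) {
  l.this_capture = "proxy#" + std::to_string(depth);
  if (depth < max_depth)
    instantiate_clobbering(l, depth + 1, max_depth);  // nested instantiation
  return l.this_capture;  // what this level resolves 'this' to afterwards
}

// Lookup scheme: the field keeps naming the generic proxy, and each
// instantiation maps it to its own proxy in a table that is local to
// that instantiation.
inline std::string instantiate_with_lookup(const ToyLambda &l, int depth,
                                           int max_depth) {
  LocalSpecs specs{{l.this_capture, "proxy#" + std::to_string(depth)}};
  if (depth < max_depth)
    instantiate_with_lookup(l, depth + 1, max_depth);  // nested instantiation
  return specs.at(l.this_capture);  // always this level's own proxy
}
```

In the first function the outermost call ends up with the innermost proxy; in the second it gets its own, and the shared field is never modified.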

PR c++/116756

gcc/cp/ChangeLog:

* lambda.cc (lambda_expr_this_capture): Call
retrieve_local_specialization on the result of
LAMBDA_EXPR_THIS_CAPTURE for a generic lambda.
* parser.cc (cp_parser_lambda_expression): Don't clear
LAMBDA_EXPR_THIS_CAPTURE.
* pt.cc (tsubst_stmt) : Don't overwrite
LAMBDA_EXPR_THIS_CAPTURE.
(tsubst_lambda_expr): Don't clear LAMBDA_EXPR_THIS_CAPTURE
afterward.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda7.C: New test.
---
 gcc/cp/lambda.cc  |  6 +
 gcc/cp/parser.cc  |  3 ---
 gcc/cp/pt.cc  | 11 +
 .../g++.dg/cpp1z/constexpr-if-lambda7.C   | 24 +++
 4 files changed, 31 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda7.C

diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
index be8a0fe01cb..4ee8f6c745d 100644
--- a/gcc/cp/lambda.cc
+++ b/gcc/cp/lambda.cc
@@ -785,6 +785,12 @@ lambda_expr_this_capture (tree lambda, int add_capture_p)
   tree result;
 
   tree this_capture = LAMBDA_EXPR_THIS_CAPTURE (lambda);
+  if (this_capture)
+if (tree spec = retrieve_local_specialization (this_capture))
+  {
+   gcc_checking_assert (generic_lambda_fn_p (lambda_function (lambda)));
+   this_capture = spec;
+  }
 
   /* In unevaluated context this isn't an odr-use, so don't capture.  */
   if (cp_unevaluated_operand)
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 74f4f7cd6d8..16bbb87a815 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11723,9 +11723,6 @@ cp_parser_lambda_expression (cp_parser* parser)
 parser->omp_array_section_p = saved_omp_array_section_p;
   }
 
-  /* This field is only used during parsing of the lambda.  */
-  LAMBDA_EXPR_THIS_CAPTURE (lambda_expr) = NULL_TREE;
-
   /* This lambda shouldn't have any proxies left at this point.  */
   gcc_assert (LAMBDA_EXPR_PENDING_PROXIES (lambda_expr) == NULL);
   /* And now that we're done, push proxies for an enclosing lambda.  */
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 961696f333e..64c7d3da405 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -18938,12 +18938,6 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t 
complain, tree in_dec

Re: PING **(6./7.): [patch, Fortran] -fc-prototypes fixes.

2025-01-16 Thread Thomas Koenig

On 16.01.25 at 01:50, Jerry D wrote:


Yes I think this is OK, a definite improvement.


Committed as r15-6967.

Thanks for the review!

Best regards

Thomas



[pushed][PR118067][LRA]: Use the right mode to evaluate secondary memory reload

2025-01-16 Thread Vladimir Makarov

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067

The patch was successfully tested and bootstrapped on x86_64, aarch64, 
and ppc64le.


commit d9835825b3d7193b3d6669174f4386be2cb1
Author: Vladimir N. Makarov 
Date:   Thu Jan 16 12:17:31 2025 -0500

[PR118067][LRA]: Use the right mode to evaluate secondary memory reload

  In the PR case, LRA made an insn alternative costly.  This happened
because LRA incorrectly concluded that the alternative needs a secondary
memory reload, as the wrong mode was passed to
targetm.secondary_memory_needed.  This resulted in LRA cycling, as an
alternative with mask regs was chosen.  The patch fixes the PR and adds
more debug printing, which could be useful in the future for debugging
function process_alt_operands.

gcc/ChangeLog:

PR rtl-optimization/118067
* lra-constraints.cc (process_alt_operands): Use operand mode not
subreg reg mode.  Add and improve debugging prints for updating
losers.

gcc/testsuite/ChangeLog:

PR rtl-optimization/118067
* gcc.target/i386/pr118067.c: New.

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 8f32e98f1c4..3d5abcfaeb0 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -2465,6 +2465,11 @@ process_alt_operands (int only_alternative)
 			&& (operand_reg[m] == NULL_RTX
 || hard_regno[m] < 0))
 			  {
+			if (lra_dump_file != NULL)
+			  fprintf
+(lra_dump_file,
+ "%d Matched operand reload: "
+ "losers++\n", m);
 			losers++;
 			reload_nregs
 			  += (ira_reg_class_max_nregs[curr_alt[m]]
@@ -2909,6 +2914,10 @@ process_alt_operands (int only_alternative)
 			   "Strict low subreg reload -- refuse\n");
 		  goto fail;
 		}
+		  if (lra_dump_file != NULL)
+		fprintf
+		  (lra_dump_file,
+		   "%d Operand reload: losers++\n", nop);
 		  losers++;
 		}
 	  if (operand_reg[nop] != NULL_RTX
@@ -2945,7 +2954,14 @@ process_alt_operands (int only_alternative)
 		{
 		  const_to_mem = 1;
 		  if (! no_regs_p)
-		losers++;
+		{
+		  if (lra_dump_file != NULL)
+			fprintf
+			  (lra_dump_file,
+			   "%d Constant reload through memory: "
+			   "losers++\n", nop);
+		  losers++;
+		}
 		}
 
 	  /* Alternative loses if it requires a type of reload not
@@ -3127,12 +3143,19 @@ process_alt_operands (int only_alternative)
 	  if (this_alternative != NO_REGS
 		  && REG_P (op) && (cl = get_reg_class (REGNO (op))) != NO_REGS
 		  && ((curr_static_id->operand[nop].type != OP_OUT
-		   && targetm.secondary_memory_needed (GET_MODE (op), cl,
+		   && targetm.secondary_memory_needed (mode, cl,
 			   this_alternative))
 		  || (curr_static_id->operand[nop].type != OP_IN
 			  && (targetm.secondary_memory_needed
-			  (GET_MODE (op), this_alternative, cl)
-		losers++;
+			  (mode, this_alternative, cl)
+		{
+		  if (lra_dump_file != NULL)
+		fprintf
+		  (lra_dump_file,
+		   "%d Secondary memory reload needed: "
+		   "losers++\n", nop);
+		  losers++;
+		}
 
 	  if (MEM_P (op) && offmemok)
 		addr_losers++;
@@ -3346,7 +3369,7 @@ process_alt_operands (int only_alternative)
 	  if (lra_dump_file != NULL)
 		fprintf
 		  (lra_dump_file,
-		   "%d Conflict early clobber reload: reject--\n",
+		   "%d Conflict early clobber reload: losers++\n",
 		   i);
 	}
 	  else
@@ -3358,6 +3381,12 @@ process_alt_operands (int only_alternative)
 		  {
 		curr_alt_match_win[j] = false;
 		losers++;
+		if (lra_dump_file != NULL)
+		  fprintf
+			(lra_dump_file,
+			 "%d Matching conflict early clobber "
+			 "reloads: losers++\n",
+			 j);
 		overall += LRA_LOSER_COST_FACTOR;
 		  }
 	  if (! curr_alt_match_win[i])
@@ -3375,7 +3404,7 @@ process_alt_operands (int only_alternative)
 		fprintf
 		  (lra_dump_file,
 		   "%d Matched conflict early clobber reloads: "
-		   "reject--\n",
+		   "losers++\n",
 		   i);
 	}
 	  /* Early clobber was already reflected in REJECT. */
diff --git a/gcc/testsuite/gcc.target/i386/pr118067.c b/gcc/testsuite/gcc.target/i386/pr118067.c
new file mode 100644
index 000..7a7f072a5d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118067.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O -fno-split-wide-types -mavx512f" } */
+
+typedef unsigned short U __attribute__((__vector_size__(64)));
+typedef int V __attribute__((__vector_size__(64)));
+typedef __int128 W __attribute__((__vector_size__(64)));
+
+W
+foo(U u, V v)
+{
+  W w;
+  /* __asm__ volatile ("" : "=v"(w)); prevents the -Wuninitialized warning */
+  u[0] >>= 1;
+  v %= (V)w;
+  return (W)u + (W)v;
+}


Re: [PATCH] [testsuite] rearrange requirements for dfp bitint run tests

2025-01-16 Thread Alexandre Oliva
On Jan 10, 2025, Alexandre Oliva  wrote:

> dfp.exp sets the default to compile when dfprt is not available, but
> some dfp bitint tests override the default without that requirement,
> and try to run even when dfprt is not available.

> Instead of overriding the default, rewrite the requirements so that
> they apply even when compiling, since the absence of bitint or of
> int128 would presumably cause compile failures.

> Regstrapped on x86_64-linux-gnu.  Also tested with aarch64-elf and
> arm-eabi on gcc-14, with dfp support (implicitly) disabled in libgcc.
> Ok to install?

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673160.html

> for  gcc/testsuite/ChangeLog

>   * gcc.dg/dfp/bitint-1.c: Rewrite requirements to retain dfprt.
>   * gcc.dg/dfp/bitint-2.c: Likewise.
>   * gcc.dg/dfp/bitint-3.c: Likewise.
>   * gcc.dg/dfp/bitint-4.c: Likewise.
>   * gcc.dg/dfp/bitint-5.c: Likewise.
>   * gcc.dg/dfp/bitint-6.c: Likewise.
>   * gcc.dg/dfp/bitint-7.c: Likewise.
>   * gcc.dg/dfp/bitint-8.c: Likewise.
>   * gcc.dg/dfp/int128-1.c: Likewise.
>   * gcc.dg/dfp/int128-2.c: Likewise.
>   * gcc.dg/dfp/int128-3.c: Likewise.
>   * gcc.dg/dfp/int128-4.c: Likewise.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


[PATCH] libfortran: fix conversion of UNSIGNED(kind=16) to decimal in output [PR118406]

2025-01-16 Thread Harald Anlauf

Dear all,

the conversion of (unsigned) integers to decimal in output was designed
to be efficient up to INTEGER(kind=16) and did not handle values larger
than roughly (10^19 * 2^64).

The attached obvious patch fixes this.
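
For readers who want to see the chunking idea in isolation, here is a
minimal standalone sketch.  It is not the libgfortran code: the helper
names (put_u64, put_u64_pad19, u128_to_dec) are made up for this
illustration, and it assumes a compiler that provides unsigned __int128.
The value is split into base-10^19 "digits", each of which fits in a
uint64_t; values above 10^19 * UINT64_MAX need a third chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 10^19 is the largest power of ten that fits in a uint64_t.  */
#define TEN19 ((unsigned __int128) 10000000000000000000ULL)

/* Hypothetical helper: write the decimal digits of V so that they end
   just before P; return a pointer to the first digit written.  */
static char *
put_u64 (uint64_t v, char *p)
{
  do
    *--p = (char) ('0' + v % 10);
  while ((v /= 10) != 0);
  return p;
}

/* Hypothetical helper: like put_u64, but always emit exactly 19
   digits, padding with leading zeros.  */
static char *
put_u64_pad19 (uint64_t v, char *p)
{
  for (int i = 0; i < 19; i++, v /= 10)
    *--p = (char) ('0' + v % 10);
  return p;
}

/* Convert N to decimal inside BUF (LEN must be at least 40) and
   return a pointer to the first digit.  */
static const char *
u128_to_dec (unsigned __int128 n, char *buf, size_t len)
{
  char *p = buf + len - 1;
  *p = '\0';

  if (n <= UINT64_MAX)
    return put_u64 ((uint64_t) n, p);

  if (n <= TEN19 * UINT64_MAX)
    {
      /* Two chunks: the quotient still fits in 64 bits.  */
      p = put_u64_pad19 ((uint64_t) (n % TEN19), p);
      return put_u64 ((uint64_t) (n / TEN19), p);
    }

  /* Three chunks: n > 10^19 * UINT64_MAX, so split off 10^38 first.  */
  unsigned __int128 r1 = n % (TEN19 * TEN19);
  p = put_u64_pad19 ((uint64_t) (r1 % TEN19), p);
  p = put_u64_pad19 ((uint64_t) (r1 / TEN19), p);
  return put_u64 ((uint64_t) (n / (TEN19 * TEN19)), p);
}
```

Calling u128_to_dec with a 40-byte buffer covers the worst case of 39
digits plus the terminating NUL.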

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From f66049d52327242743e7e9ff59d8373fcb333212 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 16 Jan 2025 20:23:06 +0100
Subject: [PATCH] libfortran: fix conversion of UNSIGNED(kind=16) to decimal in
 output [PR118406]

	PR libfortran/118406

libgfortran/ChangeLog:

	* runtime/string.c (gfc_itoa): Handle unsigned integers larger than
	(10^19 * 2^64).

gcc/testsuite/ChangeLog:

	* gfortran.dg/unsigned_write.f90: New test.
---
 gcc/testsuite/gfortran.dg/unsigned_write.f90 | 40 
 libgfortran/runtime/string.c | 35 -
 2 files changed, 66 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/unsigned_write.f90

diff --git a/gcc/testsuite/gfortran.dg/unsigned_write.f90 b/gcc/testsuite/gfortran.dg/unsigned_write.f90
new file mode 100644
index 000..903c212ee3a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unsigned_write.f90
@@ -0,0 +1,40 @@
+! { dg-do  run }
+! This is a libgfortran (runtime library) test, need to run only once!
+!
+! { dg-require-effective-target fortran_integer_16 }
+! { dg-additional-options "-funsigned" }
+!
+! PR libfortran/118406 - printing large UNSIGNED(kind=16) crashes
+
+program print_large_unsigned
+  unsigned(16), parameter :: u16_max =  huge(0U_16)
+  unsigned(16), parameter :: u8_max  = uint(huge(0U_8),16)! UINT64_MAX
+  unsigned(16), parameter :: ten19   = uint(10_8 ** 18,16)*10U_16 ! 10**19
+  character(42) :: s
+
+  ! Reference: signed integer
+  write(s,*)   huge(0_16)
+  if (adjustl (s) /= "170141183460469231731687303715884105727") stop 1
+
+  ! Same value as unsigned
+  write(s,*) uint (huge(0_16),16)
+  if (adjustl (s) /= "170141183460469231731687303715884105727") stop 2
+
+  ! Extreme and intermediate values
+  write(s,*) u16_max
+  if (adjustl (s) /= "340282366920938463463374607431768211455") stop 3
+
+  write(s,*) (u16_max - 3U_16) / 4U_16 * 3U_16
+  if (adjustl (s) /= "255211775190703847597530955573826158589") stop 4
+
+  ! Test branches of implementation in string.c::gfc_itoa
+  write(s,*) u8_max * ten19
+  if (adjustl (s) /= "184467440737095516150000000000000000000") stop 5
+
+  write(s,*) u8_max * ten19 - 1U_16
+  if (adjustl (s) /= "184467440737095516149999999999999999999") stop 6
+
+  write(s,*) u8_max * ten19 + 1U_16
+  if (adjustl (s) /= "184467440737095516150000000000000000001") stop 7
+
+end
diff --git a/libgfortran/runtime/string.c b/libgfortran/runtime/string.c
index 8acc94292c4..a0e2a85e8e2 100644
--- a/libgfortran/runtime/string.c
+++ b/libgfortran/runtime/string.c
@@ -241,18 +241,35 @@ gfc_itoa (GFC_UINTEGER_LARGEST n, char *buffer, size_t len)
 	 the uint64_t function are not sufficient for all 128-bit unsigned
 	 integers (we would need three calls), but they do suffice for all
 	 values up to 2^127, which is the largest that Fortran can produce
-	 (-HUGE(0_16)-1) with its signed integer types.  */
+	 (-HUGE(0_16)-1) with its signed integer types.
+	 With the introduction of UNSIGNED integers, we must treat the case
+	 of unsigned ints larger than (10^19 * 2^64) by adding one step.  */
   _Static_assert (sizeof(GFC_UINTEGER_LARGEST) <= 2 * sizeof(uint64_t),
 		  "integer too large");
 
-  GFC_UINTEGER_LARGEST r;
-  r = n % TEN19;
-  n = n / TEN19;
-  assert (r <= UINT64_MAX);
-  p = itoa64_pad19 (r, p);
-
-  assert(n <= UINT64_MAX);
-  return itoa64 (n, p);
+  if (n <= TEN19 * UINT64_MAX)
+	{
+	  GFC_UINTEGER_LARGEST r;
+	  r = n % TEN19;
+	  n = n / TEN19;
+	  assert (r <= UINT64_MAX);
+	  p = itoa64_pad19 (r, p);
+
+	  assert(n <= UINT64_MAX);
+	  return itoa64 (n, p);
+	}
+  else
+	{
+	  /* Here n > (10^19 * 2^64).  */
+	  GFC_UINTEGER_LARGEST d1, r1, d2, r2;
+	  d1 = n / (TEN19 * TEN19);
+	  r1 = n % (TEN19 * TEN19);
+	  d2 = r1 / TEN19;
+	  r2 = r1 % TEN19;
+	  p = itoa64_pad19 (r2, p);
+	  p = itoa64_pad19 (d2, p);
+	  return itoa64 (d1, p);
+	}
 }
 #else
   /* On targets where the largest integer is 64-bit, just use that.  */
-- 
2.43.0



[PATCH #2/2] [testsuite] drop explicit run overrider in more dfp tests

2025-01-16 Thread Alexandre Oliva


A few more dfp tests that recently got backported to gcc-14 override
dfp.exp's selection of default action depending on dfprt.  Let the
default stand.

This is a followup of the patch I've just pinged.  Regstrapped on
x86_64-linux-gnu, also tested on arm-eabi and aarch64-elf.  Ok to
install?


for  gcc/testsuite/ChangeLog

* gcc.dg/dfp/pr102674.c: Use the default dg-do.
* gcc.dg/dfp/pr43374.c: Likewise.
---
 gcc/testsuite/gcc.dg/dfp/pr102674.c |1 -
 gcc/testsuite/gcc.dg/dfp/pr43374.c  |1 -
 2 files changed, 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/dfp/pr102674.c 
b/gcc/testsuite/gcc.dg/dfp/pr102674.c
index c67ecf5ce71bc..8139353d5ca17 100644
--- a/gcc/testsuite/gcc.dg/dfp/pr102674.c
+++ b/gcc/testsuite/gcc.dg/dfp/pr102674.c
@@ -1,5 +1,4 @@
 /* PR middle-end/102674 */
-/* { dg-do run } */
 /* { dg-options "-O2" } */
 
 #define FP_NAN 0
diff --git a/gcc/testsuite/gcc.dg/dfp/pr43374.c 
b/gcc/testsuite/gcc.dg/dfp/pr43374.c
index 83f3dca1c1f29..3ecadd1bac02a 100644
--- a/gcc/testsuite/gcc.dg/dfp/pr43374.c
+++ b/gcc/testsuite/gcc.dg/dfp/pr43374.c
@@ -1,5 +1,4 @@
 /* PR middle-end/43374 */
-/* { dg-do run } */
 /* { dg-options "-O2" } */
 
 __attribute__((noipa)) int


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] libfortran: fix conversion of UNSIGNED(kind=16) to decimal in output [PR118406]

2025-01-16 Thread Thomas Koenig

Hello Harald,


the conversion of (unsigned) integers to decimal in output was designed
to be efficient up to INTEGER(kind=16) and did not handle values larger
than roughly (10^19 * 2^64).

The attached obvious patch fixes this.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


OK.

Thanks a lot for finding and fixing this!

Best regards

Thomas



[PATCH] [testsuite] skip test on non-hosted libstdc++ [PR113994]

2025-01-16 Thread Alexandre Oliva


Tests that include <string> need to be skipped when libstdc++ is built
in freestanding mode.


for  gcc/testsuite/ChangeLog

PR rtl-optimization/113994
* g++.dg/torture/pr113994.C: Require hosted libstdc++.
---
 gcc/testsuite/g++.dg/torture/pr113994.C |1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/g++.dg/torture/pr113994.C 
b/gcc/testsuite/g++.dg/torture/pr113994.C
index c9c186d45ee7d..bd749c4ada8a6 100644
--- a/gcc/testsuite/g++.dg/torture/pr113994.C
+++ b/gcc/testsuite/g++.dg/torture/pr113994.C
@@ -1,5 +1,6 @@
 // PR rtl-optimization/113994
 // { dg-do run }
+// { dg-skip-if "requires hosted libstdc++ for string" { ! hostedlib } }
 
 #include 
 

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] wwwdocs: experiments with a Python postprocessing script

2025-01-16 Thread David Malcolm
On Thu, 2025-01-16 at 22:58 +0800, Gerald Pfeifer wrote:
> On Wed, 15 Jan 2025, David Malcolm wrote:
> > The heading elements in our website contain "id" information,
> > but currently to find them you have to look at the page source,
> > whereas in the generated HTML for the manual we have e.g.:
> > 
> >   ¶
> > 
> > which shows up nicely in the browser in e.g.
> >   https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
> :
> > It's *very* helpful to have easily shareable links to within pages.
> 
> Absolutely agreed.
> 
> > I've never managed to build MetaHTML and have always just crossed my
> > fingers and hoped when making edits to the GCC website; bin/preprocess
> > just errors out for me immediately due to not finding mhc.
> 
> Yes, sadly the GNU project let MetaHTML die (though I raised this more
> than once). I still think the concept as such was fine and it served us
> well over the years, but building it was already challenging 20 years
> ago and would require some fierce source code editing nowadays. :-(
> 
> > So this patch as written replaces the invocation of mhc with an
> > invocation of the python script, which of course drops various
> > features.
> 
> Yeah! I was hoping we could return to your script. IIRC I once pinged
> and you were busy; happy to collaborate on finishing this up.

As it happens, I had entirely forgotten about this earlier work until
you and Joseph mentioned it.

For reference the old patch is here:
  https://gcc.gnu.org/legacy-ml/gcc-patches/2018-06/msg00176.html

Maybe I can allocate some cycles in stage 4 to fully eliminating mhc
from the website build.

Dave



[PATCH] [testsuite] [arm] adjust wmul expectations [PR113560]

2025-01-16 Thread Alexandre Oliva


Since the machine-independent widening multiply logic was improved for
PR113560, ARM's wmul-[567].c fail.  AFAICT the logic takes advantage
of the fact that, after zero-extending a narrow integral type to a
wider type, further zero- or sign-extending is equivalent, which
enables different instructions to be used for equivalent effect.

Adjust the tests to accept all the equivalent instructions that can be
used.

Regstrapped on x86_64-linux-gnu, also tested on arm-eabi and
aarch64-elf.  Ok to install?
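
To illustrate the equivalence mentioned above, here is a small
standalone sketch in plain C (not part of the patch; all function names
are invented for this note).  Once an 8-bit value has been zero-extended
to 32 bits its sign bit is clear, so a further sign extension and a
further zero extension to 64 bits agree, and so do the signed and
unsigned widening multiply-accumulates built on top of them.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend to 32 bits, then sign-extend to 64 bits.  */
static int64_t
widen_zero_then_sign (unsigned char c)
{
  int32_t mid = (int32_t) (uint32_t) c;
  return (int64_t) mid;
}

/* Zero-extend to 32 bits, then zero-extend to 64 bits.  */
static int64_t
widen_zero_then_zero (unsigned char c)
{
  uint32_t mid = (uint32_t) c;
  return (int64_t) (uint64_t) mid;
}

/* Multiply-accumulate using a signed 32x32->64 widening multiply.  */
static uint64_t
mla_signed (uint64_t a, unsigned char b, unsigned char c)
{
  int64_t prod = widen_zero_then_sign (b) * widen_zero_then_sign (c);
  return a + (uint64_t) prod;
}

/* Multiply-accumulate using an unsigned 32x32->64 widening multiply.  */
static uint64_t
mla_unsigned (uint64_t a, unsigned char b, unsigned char c)
{
  uint64_t prod = (uint64_t) (uint32_t) b * (uint64_t) (uint32_t) c;
  return a + prod;
}

/* Return 1 if both variants agree for every pair of 8-bit inputs.  */
static int
extensions_agree (void)
{
  for (int b = 0; b <= 255; b++)
    for (int c = 0; c <= 255; c++)
      {
	unsigned char ub = (unsigned char) b, uc = (unsigned char) c;
	if (widen_zero_then_sign (ub) != widen_zero_then_zero (ub)
	    || mla_signed (1000, ub, uc) != mla_unsigned (1000, ub, uc))
	  return 0;
      }
  return 1;
}
```

This only models the arithmetic; which instruction the compiler actually
picks is up to the target patterns.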


for  gcc/testsuite/ChangeLog

PR target/113560
* gcc.target/arm/wmul-5.c: Accept other mla instructions.
* gcc.target/arm/wmul-6.c: Likewise.
* gcc.target/arm/wmul-7.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/wmul-5.c |4 +++-
 gcc/testsuite/gcc.target/arm/wmul-6.c |4 +++-
 gcc/testsuite/gcc.target/arm/wmul-7.c |4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/wmul-5.c 
b/gcc/testsuite/gcc.target/arm/wmul-5.c
index 9f29a81c0b8bd..282e007d9be3f 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-5.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -8,4 +8,6 @@ foo (long long a, char *b, char *c)
   return a + *b * *c;
 }
 
-/* { dg-final { scan-assembler "umlal" } } */
+/* smlalbb after zero-extending the chars to HImode, or either signed- or
+   unsigned-widening multiply after extending them to SImode.  */
+/* { dg-final { scan-assembler {(?:smlalbb|[us]mlal)} } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-6.c 
b/gcc/testsuite/gcc.target/arm/wmul-6.c
index babdaab1efd55..05247e00c5ebc 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-6.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -8,4 +8,6 @@ foo (long long a, unsigned char *b, signed char *c)
   return a + (long long)*b * (long long)*c;
 }
 
-/* { dg-final { scan-assembler "smlalbb" } } */
+/* After zero-extending B and sign-extending C to [HS]Imode, either
+   signed-widening multiply will do.  */
+/* { dg-final { scan-assembler {smlal(?:bb)?} } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-7.c 
b/gcc/testsuite/gcc.target/arm/wmul-7.c
index 2db4ad4e10d52..26933c42401a3 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-7.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -8,4 +8,6 @@ foo (unsigned long long a, unsigned char *b, unsigned short *c)
   return a + *b * *c;
 }
 
-/* { dg-final { scan-assembler "umlal" } } */
+/* After zero-extending both to SImode, either signed- or unsigned-widening
+   multiply will do.  */
+/* { dg-final { scan-assembler {[us]mlal} } } */


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] MIPS: Add conditions for use of the -mmips16e2 and -mips16 option.

2025-01-16 Thread Rong Zhang
On Thu, 2025-01-16 at 11:41 +, Maciej W. Rozycki wrote:
> On Thu, 16 Jan 2025, Jie Mei wrote:
> 
> > Make -mmips16e2 imply -mips16 as the ASE requires, so users won't
> > be surprised even if they expect it to. Meanwhile, check if
> > mips_isa_rev <= 5 when -mips16 is effective and >= 1 when -mmips16e2
> > is effective.
> 
>  MIPSr1 is incompatible with MIPS16e2, and the only implementation known 
> to me is MIPSr3.

The revision check adheres to the ASE manual (MIPS16e2 Application-
Specific Extension Technical Reference Manual, MD01172-2B-MIPS16e2-AFP-
01.00):

   1.1 Base Architecture Requirements
   
   The MIPS16e2 ASE requires the following base architecture support:
   
   The MIPS32 or MIPS64 Architecture: The MIPS16e2 ASE requires a
   compliant implementation of the  MIPS32 or MIPS64 Architecture.

That being said, after some brief investigation, I believe you're
right. Some MIPS16e2 instructions do not have MIPS32/64 equivalent
before MIPSr2, so implementing MIPS16e2 upon MIPSr1 is not viable.
Thanks for pointing that out.

Except for the fact that the first implementation of MIPS16e2 is based
on MIPSr3, there seems to be no blocker for implementing MIPS16e2 upon
MIPSr2. We'll send a v2 patch raising the minimal revision to r2.
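
As a rough sketch of the intended rules (standalone C, not the actual
option-handling code in the MIPS backend; the struct and function names
are hypothetical, and the lower bound of r2 is the value proposed for
v2 in this thread):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant command-line state.  */
struct mips_opts
{
  bool mips16;     /* -mips16 */
  bool mips16e2;   /* -mmips16e2 */
  int isa_rev;     /* MIPS32/MIPS64 release number */
};

/* Apply the implication and check the revision bounds.  Return NULL
   if the combination is acceptable, otherwise a diagnostic string.  */
static const char *
check_mips16_opts (struct mips_opts *o)
{
  /* -mmips16e2 implies -mips16, as the ASE requires.  */
  if (o->mips16e2)
    o->mips16 = true;

  if (o->mips16 && o->isa_rev > 5)
    return "-mips16 requires ISA revision 5 or lower";

  /* Assumption: r2 is the minimum, as proposed for the v2 patch.  */
  if (o->mips16e2 && o->isa_rev < 2)
    return "-mmips16e2 requires ISA revision 2 or higher";

  return NULL;
}
```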

>   Maciej

Thanks,
Rong


Re: [PATCH] i386: Fix wrong insn generated by shld/shrd ndd split [PR118510]

2025-01-16 Thread Uros Bizjak
On Fri, Jan 17, 2025 at 4:38 AM Hongyu Wang  wrote:
>
> Hi,
>
> For the shld/shrd_ndd_2 insns, the splitter outputs a wrong pattern that
> mixes the clobber and the set in one parallel. Separate out the set to
> dest from the parallel to fix it.
>
> Bootstrapped & regtested on x86_64-pc-linux-gnu.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/118510
> * config/i386/i386.md (*x86_64_shld_ndd_2): Separate set to
> operand[0] from parallel in output template.
> (*x86_shld_ndd_2): Likewise.
> (*x86_64_shrd_ndd_2): Likewise.
> (*x86_shrd_ndd_2): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> PR target/118510
> * gcc.target/i386/pr118510.c: New test.
> ---
>  gcc/config/i386/i386.md  | 16 
>  gcc/testsuite/gcc.target/i386/pr118510.c | 14 ++
>  2 files changed, 22 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr118510.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 362b0ddcf40..ebfd76593ab 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -15615,8 +15615,8 @@ (define_insn_and_split "*x86_64_shld_ndd_2"
>  (minus:QI (const_int 64)
>(and:QI (match_dup 3)
>(const_int 63 0)))
> - (clobber (reg:CC FLAGS_REG))
> - (set (match_dup 0) (match_dup 4))])]
> + (clobber (reg:CC FLAGS_REG))])
> +   (set (match_dup 0) (match_dup 4))]

Is there a reason to have operand 0 with "nonimmediate_operand"
predicate? If you have to generate a register temporary and then
unconditionally copy it to the output, it is better to use
"register_operand" predicate and leave middle end to do the copy for
you. Please see the patch at comment #3 in the PR.

Uros.

>  {
>operands[4] = gen_reg_rtx (DImode);
>emit_move_insn (operands[4], operands[0]);
> @@ -15851,8 +15851,8 @@ (define_insn_and_split "*x86_shld_ndd_2"
>  (minus:QI (const_int 32)
>(and:QI (match_dup 3)
>(const_int 31 0)))
> - (clobber (reg:CC FLAGS_REG))
> - (set (match_dup 0) (match_dup 4))])]
> + (clobber (reg:CC FLAGS_REG))])
> +   (set (match_dup 0) (match_dup 4))]
>  {
>operands[4] = gen_reg_rtx (SImode);
>emit_move_insn (operands[4], operands[0]);
> @@ -17010,8 +17010,8 @@ (define_insn_and_split "*x86_64_shrd_ndd_2"
>  (minus:QI (const_int 64)
>(and:QI (match_dup 3)
>(const_int 63 0)))
> - (clobber (reg:CC FLAGS_REG))
> - (set (match_dup 0) (match_dup 4))])]
> + (clobber (reg:CC FLAGS_REG))])
> +   (set (match_dup 0) (match_dup 4))]
>  {
>operands[4] = gen_reg_rtx (DImode);
>emit_move_insn (operands[4], operands[0]);
> @@ -17245,8 +17245,8 @@ (define_insn_and_split "*x86_shrd_ndd_2"
>  (minus:QI (const_int 32)
>(and:QI (match_dup 3)
>(const_int 31 0)))
> - (clobber (reg:CC FLAGS_REG))
> - (set (match_dup 0) (match_dup 4))])]
> + (clobber (reg:CC FLAGS_REG))])
> +   (set (match_dup 0) (match_dup 4))]
>  {
>operands[4] = gen_reg_rtx (SImode);
>emit_move_insn (operands[4], operands[0]);
> diff --git a/gcc/testsuite/gcc.target/i386/pr118510.c 
> b/gcc/testsuite/gcc.target/i386/pr118510.c
> new file mode 100644
> index 000..6cfe8182b6f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr118510.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mapxf" } */
> +
> +typedef struct cpp_num cpp_num;
> +struct cpp_num {
> +int high;
> +unsigned low;
> +int overflow;
> +};
> +int num_rshift_n;
> +cpp_num num_lshift(cpp_num num) {
> +num.low = num.low >> num_rshift_n | num.high << (32 - num_rshift_n);
> +return num;
> +}
> --
> 2.31.1
>


RE: [COMMITTED] OpenMP: Fix metadirective test failures on x86_64 with -m32

2025-01-16 Thread Jiang, Haochen
> From: Sandra Loosemore 
> Sent: Friday, January 17, 2025 12:11 PM
> 

Thanks for the quick fix!

Thx,
Haochen

> gcc/testsuite/ChangeLog
>   * c-c++-common/gomp/metadirective-device.c: Don't add extra
> options
>   for target ia32.
>   * c-c++-common/gomp/metadirective-target-device-1.c: Likewise.
> ---
>  gcc/testsuite/c-c++-common/gomp/metadirective-device.c  | 2 +-
>  gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
> b/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
> index 09b795eeabe..380762477b0 100644
> --- a/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
> +++ b/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile }  */
>  /* { dg-additional-options "-foffload=disable -fdump-tree-optimized" } */
> -/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=sse -msse" { target x86_64-*-* } } */
> +/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=sse -msse" { target { x86_64-*-* && { ! ia32 } } } } */
> 
>  #include 
> 
> diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c b/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
> index 6373349d37f..5d3a4c3ff9b 100644
> --- a/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
> +++ b/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile }  */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
> -/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=mmx -mmmx" { target x86_64-*-* } }  */
> +/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=mmx -mmmx" { target { x86_64-*-* && { ! ia32 } } } } */
> 
>  #include 
> 
> --
> 2.34.1



[COMMITTED] OpenMP: Fix metadirective test failures on x86_64 with -m32

2025-01-16 Thread Sandra Loosemore
gcc/testsuite/ChangeLog
* c-c++-common/gomp/metadirective-device.c: Don't add extra options
for target ia32.
* c-c++-common/gomp/metadirective-target-device-1.c: Likewise.
---
 gcc/testsuite/c-c++-common/gomp/metadirective-device.c  | 2 +-
 gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-device.c 
b/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
index 09b795eeabe..380762477b0 100644
--- a/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
+++ b/gcc/testsuite/c-c++-common/gomp/metadirective-device.c
@@ -1,6 +1,6 @@
 /* { dg-do compile }  */
 /* { dg-additional-options "-foffload=disable -fdump-tree-optimized" } */
-/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=sse -msse" { 
target x86_64-*-* } } */
+/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=sse -msse" { 
target { x86_64-*-* && { ! ia32 } } } } */
 
 #include 
 
diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c 
b/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
index 6373349d37f..5d3a4c3ff9b 100644
--- a/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/metadirective-target-device-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile }  */
 /* { dg-additional-options "-fdump-tree-optimized" } */
-/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=mmx -mmmx" { 
target x86_64-*-* } }  */
+/* { dg-additional-options "-DDEVICE_ARCH=x86_64 -DDEVICE_ISA=mmx -mmmx" { 
target { x86_64-*-* && { ! ia32 } } } } */
 
 #include 
 
-- 
2.34.1



Re: [PATCH] c++: Friend classes don't shadow enclosing template class parameter [PR118255]

2025-01-16 Thread Simon Martin
On 17 Jan 2025, at 0:12, Jason Merrill wrote:

> On 1/5/25 11:19 AM, Simon Martin wrote:
>> We currently reject the following code
>>
>> === code here ===
>> template  struct S { friend class non_template; };
>> class non_template {};
>> S<0> s;
>> === code here ===
>>
>> While EDG agrees with the current behaviour, clang and MSVC don't (see
>> https://godbolt.org/z/69TGaabhd), and I believe that this code is valid,
>> since the friend clause does not actually declare a type, so it cannot
>> shadow anything. The fact that we didn't error out if the non_template
>> class was declared before S backs this up as well.
>>
>> This patch fixes this by skipping the call to check_template_shadow for
>> hidden bindings.
>
> OK.
Thanks Jason. Since it’s a regression from GCC 8, OK as well for 
active branches?

Simon

>> Successfully tested on x86_64-pc-linux-gnu.
>>
>>  PR c++/118255
>>
>> gcc/cp/ChangeLog:
>>
>>  * name-lookup.cc (pushdecl): Don't call check_template_shadow
>>  for hidden bindings.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * g++.dg/lookup/pr99116-1.C: Adjust test expectation.
>>  * g++.dg/template/friend84.C: New test.
>>
>> ---
>>   gcc/cp/name-lookup.cc|  5 -
>>   gcc/testsuite/g++.dg/lookup/pr99116-1.C  |  2 +-
>>   gcc/testsuite/g++.dg/template/friend84.C | 26 
>> 
>>   3 files changed, 31 insertions(+), 2 deletions(-)
>>   create mode 100644 gcc/testsuite/g++.dg/template/friend84.C
>>
>> diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
>> index 0e185d3ef42..d1abb205bc7 100644
>> --- a/gcc/cp/name-lookup.cc
>> +++ b/gcc/cp/name-lookup.cc
>> @@ -4040,7 +4040,10 @@ pushdecl (tree decl, bool hiding)
>> if (old && anticipated_builtin_p (old))
>>  old = OVL_CHAIN (old);
>>  -  check_template_shadow (decl);
>> +  if (hiding)
>> +; /* Hidden bindings don't shadow anything.  */
>> +  else
>> +check_template_shadow (decl);
>>  if (DECL_DECLARES_FUNCTION_P (decl))
>>  {
>> diff --git a/gcc/testsuite/g++.dg/lookup/pr99116-1.C 
>> b/gcc/testsuite/g++.dg/lookup/pr99116-1.C
>> index 01b483ea915..efee3e4aca3 100644
>> --- a/gcc/testsuite/g++.dg/lookup/pr99116-1.C
>> +++ b/gcc/testsuite/g++.dg/lookup/pr99116-1.C
>> @@ -2,7 +2,7 @@
>>template struct Z {
>>  -  friend struct T; // { dg-error "shadows template parameter" }
>> +  friend struct T; // { dg-bogus "shadows template parameter" }
>>   };
>>struct Y {
>> diff --git a/gcc/testsuite/g++.dg/template/friend84.C 
>> b/gcc/testsuite/g++.dg/template/friend84.C
>> new file mode 100644
>> index 000..64ea41a552b
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/template/friend84.C
>> @@ -0,0 +1,26 @@
>> +// PR c++/118255
>> +// { dg-do "compile" }
>> +
>> +// The PR's case, that used to error out.
>> +template 
>> +struct S {
>> +  friend class non_template; // { dg-bogus "shadows template 
>> parameter" }
>> +};
>> +
>> +class non_template {};
>> +S<0> s;
>> +
>> +// We already accepted cases where the friend is already declared.
>> +template 
>> +struct T {
>> +  friend class non_template;
>> +};
>> +T<0> t;
>> +
>> +// We should reject (re)declarations.
>> +template 
>> +struct U {
>> +  class non_template {};  // { dg-error "shadows template parameter" 
>> }
>> +  void non_template () {} // { dg-error "shadows template parameter" 
>> }
>> +};
>> +U<0> u;



Re: [PATCH 2/2] LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

2025-01-16 Thread Xi Ruoyao



On January 15, 2025 at 6:12:34 PM GMT+08:00, Xi Ruoyao wrote:
>For things like
>
>(x | 0x101) << 11
>
>It's obvious to write:
>
>ori $r4,$r4,257
>slli.d  $r4,$r4,11
>
>But we are actually generating something insane:
>
>lu12i.w $r12,524288>>12 # 0x8
>ori $r12,$r12,2048
>slli.d  $r4,$r4,11
>or  $r4,$r4,$r12
>jr  $r1

/* snip */

> 
>+/* Test if reassociate (a << shamt) [&|^] mask to
>+   (a [&|^] (mask >> shamt)) << shamt is possible and beneficial.
>+   If true, return (mask >> shamt).  Return NULL_RTX otherwise.  */
>+
>+rtx
>+loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt, rtx mask,
>+   machine_mode mode)
>+{
>+  gcc_checking_assert (CONST_INT_P (shamt));
>+  gcc_checking_assert (CONST_INT_P (mask));
>+  gcc_checking_assert (mode == SImode || mode == DImode);
>+
>+  if (ctz_hwi (INTVAL (mask)) < INTVAL (shamt))
>+return NULL_RTX;
>+
>+  rtx new_mask = gen_int_mode (UINTVAL (mask) >> UINTVAL (shamt), mode);
>+  if (const_uns_arith_operand (new_mask, mode))
>+return new_mask;
>+
>+  if (!is_and)
>+return NULL_RTX;
>+
>+  if (low_bitmask_operand (new_mask, mode))
>+return new_mask;
>+
>+  new_mask = gen_int_mode (INTVAL (mask) >> UINTVAL (shamt), mode);

This is problematic: it relies on right-shifting negative integers, for
which the behavior is unspecified in C++14.  I'll send V2 to fix it.
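
For reference, a standalone sketch of the two ideas involved, written in
plain C with made-up helper names (this is not the code planned for V2).
reassoc_mask mirrors the trailing-zero check quoted above, and sar64
shows one way to get an arithmetic right shift without ever shifting a
negative signed value.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits; X must be non-zero.  */
static int
ctz64 (uint64_t x)
{
  int n = 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* If (a << shamt) OP mask can be rewritten as
   (a OP (mask >> shamt)) << shamt, store mask >> shamt in *NEW_MASK
   and return 1; otherwise return 0.  The rewrite needs the low SHAMT
   bits of MASK to be zero.  */
static int
reassoc_mask (uint64_t mask, unsigned shamt, uint64_t *new_mask)
{
  if (mask == 0 || shamt >= 64 || ctz64 (mask) < (int) shamt)
    return 0;
  *new_mask = mask >> shamt;
  return 1;
}

/* Arithmetic right shift of V by SHAMT (0..63), computed on the
   unsigned representation so no negative value is ever shifted.  */
static int64_t
sar64 (int64_t v, unsigned shamt)
{
  uint64_t u = (uint64_t) v;
  if (v >= 0)
    return (int64_t) (u >> shamt);
  /* For negative V, ~u equals -v - 1, which is non-negative.  */
  uint64_t t = ~u >> shamt;
  return -(int64_t) t - 1;
}
```

With mask 0x80800 and shamt 11 the helper yields 0x101, which is the
constant used by the ori in the two-instruction sequence from the
original mail.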


Re: [PATCH] tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds

2025-01-16 Thread Richard Biener
On Wed, 15 Jan 2025, Richard Biener wrote:

> The following makes niter analysis recognize a loop with an exit
> condition scanning over a STRING_CST.  This is done via enhancing
> the force evaluation code rather than recognizing for example
> strlen (s) as number of iterations because it allows to handle
> some more cases.
> 
> STRING_CSTs are easy to handle since nothing can write to them, also
> processing those should be cheap.  I've refrained from handling
> anything besides char8_t.
> 
> Note to avoid the -Warray-bounds diagnostic we have to either early unroll
> the loop (there's no final value replacement done, there's a PR
> for doing this as part of CD-DCE when possibly eliding a loop),
> or create a canonical IV so we can DCE the loads.  The latter is what
> the patch does, which also avoids repeatedly force-evaluating niters.
> This also makes final value replacement work again since now ivcanon
> is after it.
> 
> There are some testsuite adjustments needed, in particular we now
> unroll some loops early, causing messages to appear in different
> passes and vectorization to no longer happen on
> outer loops.  The changes mitigate that.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.  I'll wait for
> CI to pick this up and still appreciate a look at the STRING_CST
> handling details.

Pushed as r15-6990-g44d21551362f90
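
For readers who want the loop shape in isolation, here is a minimal sketch in
plain C.  The function names scan_length and scan_literal are hypothetical, and
whether a given GCC version and option set actually computes the iteration
count at compile time is not something this sketch demonstrates; it only shows
the kind of loop whose exit condition scans over a string constant.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the `while (*last) ++last;` loop in the new test case:
   the exit condition reads successive characters of a string.  */
static size_t
scan_length (const char *s)
{
  assert (s != NULL);
  const char *last = s;
  while (*last)
    ++last;
  return (size_t) (last - s);
}

/* With a literal argument the scanned object is a string constant that
   nothing can write to, so the number of iterations is fixed.  */
static size_t
scan_literal (void)
{
  return scan_length ("aa");
}
```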

> Thanks,
> Richard.
> 
>   PR tree-optimization/92539
>   * tree-ssa-loop-ivcanon.cc (tree_unroll_loops_completely_1):
>   Also try force-evaluation if ivcanon did not yet run.
>   (canonicalize_loop_induction_variables):
>   When niter was computed constant by force evaluation add a
>   canonical IV if we didn't unroll.
>   * tree-ssa-loop-niter.cc (loop_niter_by_eval): When we
>   don't find a proper PHI try if the exit condition scans
>   over a STRING_CST and simulate that.
> 
>   * g++.dg/warn/Warray-bounds-pr92539.C: New testcase.
>   * gcc.dg/tree-ssa/sccp-16.c: New testcase.
>   * g++.dg/vect/pr87621.cc: Use larger power to avoid
>   inner loop unrolling.
>   * gcc.dg/vect/pr89440.c: Use larger loop bound to avoid
>   inner loop unrolling.
>   * gcc.dg/pr77975.c: Scan cunrolli dump and adjust.
> ---
>  gcc/testsuite/g++.dg/vect/pr87621.cc  |  2 +-
>  .../g++.dg/warn/Warray-bounds-pr92539.C   | 51 +++
>  gcc/testsuite/gcc.dg/pr77975.c|  6 +--
>  gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c   | 16 ++
>  gcc/testsuite/gcc.dg/vect/pr89440.c   |  4 +-
>  gcc/tree-ssa-loop-ivcanon.cc  | 11 ++--
>  gcc/tree-ssa-loop-niter.cc| 47 -
>  7 files changed, 127 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c
> 
> diff --git a/gcc/testsuite/g++.dg/vect/pr87621.cc 
> b/gcc/testsuite/g++.dg/vect/pr87621.cc
> index cfc53be4ee1..bc55116ccbf 100644
> --- a/gcc/testsuite/g++.dg/vect/pr87621.cc
> +++ b/gcc/testsuite/g++.dg/vect/pr87621.cc
> @@ -21,7 +21,7 @@ T pow(T x, unsigned int n)
>  void testVec(int* x)
>  {
>for (int i = 0; i < 8; ++i)
> -x[i] = pow(x[i], 10);
> +x[i] = pow(x[i], 100);
>  }
>  
>  /* { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" { target { 
> vect_double && vect_hw_misalign } } } } */
> diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C 
> b/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C
> new file mode 100644
> index 000..ea506ed1450
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C
> @@ -0,0 +1,51 @@
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O2 -Warray-bounds" }
> +
> +static bool
> +ischar(int ch)
> +{
> +return (0 == (ch & ~0xff) || ~0 == (ch | 0xff)) != 0;
> +}
> +
> +static bool eat(char const*& first, char const* last)
> +{
> +if (first != last && ischar(*first)) { // { dg-bogus "bounds" }
> +++first;
> +return true;
> +}
> +return false;
> +}
> +
> +static bool eat_two(char const*& first, char const* last)
> +{
> +auto save = first;
> +if (eat(first, last) && eat(first, last))
> +return true;
> +first = save;
> +return false;
> +}
> +
> +static bool foo(char const*& first, char const* last)
> +{
> +auto local_iterator = first;
> +int i = 0;
> +for (; i < 3; ++i)
> +if (!eat_two(local_iterator, last))
> +return false;
> +first = local_iterator;
> +return true;
> +}
> +
> +static bool test(char const* in, bool full_match = true)
> +{
> +auto last = in;
> +while (*last)
> +++last;
> +return foo(in, last) && (!full_match || (in == last)); // { dg-bogus 
> "bounds" }
> +}
> +
> +int main()
> +{
> +return test("aa");
> +}
> +
> diff --git a/gcc/testsuite/gcc.dg/pr77975.c b/gcc/testsuite/gcc.dg/pr77975.c
> index a187ce2b50c..9d7aad49841 100644
> --- a/gcc/test

[PATCH v3 1/2] RISC-V: Allocate the initial register in the expand phase for the vl of XTheadVector

2025-01-16 Thread Jin Ma
Since the vl parameter of XTheadVector does not support immediates, we need to
put it in a register in advance.  That generates the initial code correctly.
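
As a reading aid, the decision made in the expander can be modelled in plain C.
This is only a sketch: struct operand and vl_needs_register are hypothetical
stand-ins for the RTL operand and for the condition in
function_expander::add_input_operand, not GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an expanded operand.  */
struct operand
{
  bool is_const_int;
  long value;
};

/* Mirrors the condition in the patch: only a non-zero constant passed as
   the last argument (the vl) of a vl-taking builtin is forced into a
   register, and only when targeting XTheadVector.  */
static bool
vl_needs_register (bool target_xtheadvector, bool applies_vl,
                   unsigned argno, unsigned nargs, struct operand x)
{
  assert (argno < nargs);
  return target_xtheadvector
         && x.is_const_int
         && applies_vl
         && argno == nargs - 1
         && x.value != 0;
}
```

In the new test the call passes 3 as the last argument, which is the case the
model above answers with true.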

PR 116593

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc 
(function_expander::add_input_operand):
Put immediate for vl to GPR for XTheadVector.

gcc/testsuite/ChangeLog:

* g++.target/riscv/xtheadvector/pr116593-1.C: New test.
* g++.target/riscv/xtheadvector/xtheadvector.exp: New test.
---
 gcc/config/riscv/riscv-vector-builtins.cc | 16 +++-
 .../riscv/xtheadvector/pr116593-1.C   | 12 ++
 .../riscv/xtheadvector/xtheadvector.exp   | 37 +++
 3 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/pr116593-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index d2fe849c693e..b77f0b1567c1 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4120,7 +4120,21 @@ function_expander::add_input_operand (unsigned argno)
 {
   tree arg = CALL_EXPR_ARG (exp, argno);
   rtx x = expand_normal (arg);
-  add_input_operand (TYPE_MODE (TREE_TYPE (arg)), x);
+
+  /* Since the parameter vl of XTheadVector does not support
+ immediate numbers, we need to put it in the register
+ in advance.  */
+  if (TARGET_XTHEADVECTOR
+  && CONST_INT_P (x)
+  && base->apply_vl_p ()
+  && argno == (unsigned) (call_expr_nargs (exp) - 1)
+  && x != CONST0_RTX (GET_MODE (x)))
+{
+  x = force_reg (word_mode, x);
+  add_input_operand (TYPE_MODE (TREE_TYPE (arg)), x);
+}
+  else
+add_input_operand (TYPE_MODE (TREE_TYPE (arg)), x);
 }
 
 /* Since we may normalize vop/vop_tu/vop_m/vop_tumu.. into a single patter.
diff --git a/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593-1.C 
b/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593-1.C
new file mode 100644
index ..6590dcbe5030
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/xtheadvector/pr116593-1.C
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector -mabi=ilp32d -O2" { target { rv32 
} } } */
+/* { dg-options "-march=rv64gc_xtheadvector -mabi=lp64d -O2" { target { rv64 } 
} } */
+
+#include <riscv_vector.h>
+
+vint32m1_t foo (vint32m1_t vs2, vint32m1_t vs1)
+{
+   return __riscv_vadd_vv_i32m1(vs2, vs1, 3);
+}
+
+/* { dg-final { scan-assembler-times "li\ta\[0-9\],3\n" 1 } } */
diff --git a/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp 
b/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp
new file mode 100644
index ..40c868b0d805
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp
@@ -0,0 +1,37 @@
+# Copyright (C) 2023-2025 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Test the front-end for C++.
+# We don't need to test back-end code-gen in RV32 system for C++
+# Because it is already tested in C.
+# Exit immediately if this isn't a RISC-V target.
+if ![istarget riscv*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib g++-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
+
+# All done.
+dg-finish
-- 
2.25.1



[PATCH v3 2/2] RISC-V: Add a new constraint to ensure that the vl of XTheadVector does not produce a non-zero immediate.

2025-01-16 Thread Jin Ma
Although we have handled the vl of XTheadVector correctly in the
expand phase and predicates, the results show that the work is
still insufficient.

In the curr_insn_transform function, the insn is transformed from:
(insn 69 67 225 12 (set (mem:RVVM8SF (reg/f:DI 218 [ _77 ]) [0  S[128, 128] 
A32])
(if_then_else:RVVM8SF (unspec:RVVMF4BI [
(const_vector:RVVMF4BI repeat [
(const_int 1 [0x1])
])
(reg:DI 209)
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(reg/v:RVVM8SF 143 [ _xx ])
(mem:RVVM8SF (reg/f:DI 218 [ _77 ]) [0  S[128, 128] A32])))
 (expr_list:REG_DEAD (reg/v:RVVM8SF 143 [ _xx ])
(nil)))
to
(insn 69 284 225 11 (set (mem:RVVM8SF (reg/f:DI 18 s2 [orig:218 _77 ] [218]) [0 
 S[128, 128] A32])
(if_then_else:RVVM8SF (unspec:RVVMF4BI [
(const_vector:RVVMF4BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(reg/v:RVVM8SF 104 v8 [orig:143 _xx ] [143])
(mem:RVVM8SF (reg/f:DI 18 s2 [orig:218 _77 ] [218]) [0  S[128, 128] 
A32])))
 (nil))

The log of the reload pass shows "Changing pseudo 209 in operand 3 of insn 69
on equiv 0x1": it converts the vl operand of the insn from the expected
register (reg:DI 209) to the constant 1 (const_int 1 [0x1]).

This conversion occurs because, although the predicate for the vl operand is
restricted by "vector_length_operand" in the pattern, the constraint is still
"rK", which allows the transformation.

Changing the constraint of the vl operand in the pattern from "rK" to "rJ"
would prevent this conversion, but unfortunately that would conflict with RVV
(RISC-V Vector Extension).

Based on the review's recommendations, the best solution for now is to create
a new constraint to distinguish between RVV and XTheadVector, which is exactly
what this patch does.
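
The intended meaning of the new constraint can be sketched in plain C.  This is
a model, not the machine description: satisfies_K and satisfies_J below are
hypothetical re-implementations based on the constraint's own description
("A uimm5 for vector or zero for XTheadVector"), and satisfies_vl combines them
the way the define_constraint in this patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed models of the existing constraints: K accepts a 5-bit
   unsigned immediate, J accepts integer zero.  */
static bool
satisfies_K (long v)
{
  return v >= 0 && v <= 31;
}

static bool
satisfies_J (long v)
{
  return v == 0;
}

/* Model of "vl": uimm5 for RVV, only zero for XTheadVector.  */
static bool
satisfies_vl (bool target_xtheadvector, long v)
{
  return (!target_xtheadvector && satisfies_K (v))
         || (target_xtheadvector && satisfies_J (v));
}

/* Usage: the constant 1 from the dump above is accepted for RVV but
   rejected for XTheadVector, so it cannot replace the register.  */
static void
vl_demo (void)
{
  assert (satisfies_vl (false, 1));
  assert (!satisfies_vl (true, 1));
}
```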

PR 116593

gcc/ChangeLog:

* config/riscv/constraints.md (vl): New.
* config/riscv/thead-vector.md: Likewise.
* config/riscv/vector.md: Likewise.

gcc/testsuite/ChangeLog:

* g++.target/riscv/xtheadvector/pr116593-2.C: New test.
---
 gcc/config/riscv/constraints.md   |   6 +
 gcc/config/riscv/thead-vector.md  |  18 +-
 gcc/config/riscv/vector.md| 466 +-
 .../riscv/xtheadvector/pr116593-2.C   |  47 ++
 4 files changed, 295 insertions(+), 242 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/pr116593-2.C

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index f25975dc0208..df62491b2edc 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -209,6 +209,12 @@ (define_constraint "vk"
   (and (match_code "const_vector")
(match_test "riscv_vector::const_vec_all_same_in_range_p (op, 0, 31)")))
 
+(define_constraint "vl"
+  "A uimm5 for vector or zero for XTheadVector."
+  (and (match_code "const_int")
+   (ior (match_test "!TARGET_XTHEADVECTOR && satisfies_constraint_K (op)")
+   (match_test "TARGET_XTHEADVECTOR && satisfies_constraint_J (op)"))))
+
 (define_constraint "Wc0"
   "@internal
  A constraint that matches a vector of immediate all zeros."
diff --git a/gcc/config/riscv/thead-vector.md b/gcc/config/riscv/thead-vector.md
index 5fe9ba08c4eb..5a02debdd207 100644
--- a/gcc/config/riscv/thead-vector.md
+++ b/gcc/config/riscv/thead-vector.md
@@ -108,7 +108,7 @@ (define_insn_and_split "@pred_th_whole_mov"
   [(set (match_operand:V_VLS_VT 0 "reg_or_mem_operand"  "=vr,vr, m")
(unspec:V_VLS_VT
  [(match_operand:V_VLS_VT 1 "reg_or_mem_operand" " vr, m,vr")
-  (match_operand 2 "vector_length_operand"   " rK, rK, rK")
+  (match_operand 2 "vector_length_operand"   "rvl,rvl,rvl")
   (match_operand 3 "const_1_operand" "  i, i, i")
   (reg:SI VL_REGNUM)
   (reg:SI VTYPE_REGNUM)]
@@ -133,7 +133,7 @@ (define_insn_and_split "@pred_th_whole_mov"
   [(set (match_operand:VB 0 "reg_or_mem_operand"  "=vr,vr, m")
(unspec:VB
  [(match_operand:VB 1 "reg_or_mem_operand" " vr, m,vr")
-  (match_operand 2 "vector_length_operand"   " rK, rK, rK")
+  (match_operand 2 "vector_length_operand"   "rvl,rvl,rvl")
   (match_operand 3 "const_1_operand" "  i, i, i")
   (reg:SI VL_REGNUM)
   (reg:SI VTYPE_REGNUM)]
@@ -161,7 +161,7 @@ (define_insn_and_split "*pred_th_mov"
(if_then_else:VB_VLS
  (unspec:VB_VLS
[(match_operand:VB_VLS 1 "vector_all_trues_

[PATCH] i386: Fix wrong insn generated by shld/shrd ndd split [PR118510]

2025-01-16 Thread Hongyu Wang
Hi,

For the shld/shrd_ndd_2 insns, the splitter outputs a wrong pattern that mixes
the clobber and the set to dest in one parallel.  Separate the set to dest out
of the parallel to fix it.
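
For context, the operation these patterns implement is a double-word shift.
The following plain C sketch models the 32-bit right variant; it is an
illustration of the arithmetic only, not of the insn patterns, and the names
shrd32 and shrd32_direct are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift the 64-bit value HIGH:LOW right by COUNT (masked to 0..31, as
   the patterns do with `and ... (const_int 31)`) and keep the low half.  */
static uint32_t
shrd32 (uint32_t low, uint32_t high, unsigned count)
{
  uint64_t wide = ((uint64_t) high << 32) | low;
  return (uint32_t) (wide >> (count & 31));
}

/* The source-level form used in the new test, valid for 1 <= count <= 31
   (a shift by 32 - 0 == 32 bits would be undefined in C).  */
static uint32_t
shrd32_direct (uint32_t low, uint32_t high, unsigned count)
{
  assert (count >= 1 && count <= 31);
  return (low >> count) | (high << (32 - count));
}
```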

Bootstrapped & regtested on x86-64-pc-linux-gnu.

Ok for trunk?

gcc/ChangeLog:

PR target/118510
* config/i386/i386.md (*x86_64_shld_ndd_2): Separate set to
operand[0] from parallel in output template.
(*x86_shld_ndd_2): Likewise.
(*x86_64_shrd_ndd_2): Likewise.
(*x86_shrd_ndd_2): Likewise.

gcc/testsuite/ChangeLog:

PR target/118510
* gcc.target/i386/pr118510.c: New test.
---
 gcc/config/i386/i386.md  | 16 
 gcc/testsuite/gcc.target/i386/pr118510.c | 14 ++
 2 files changed, 22 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr118510.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 362b0ddcf40..ebfd76593ab 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15615,8 +15615,8 @@ (define_insn_and_split "*x86_64_shld_ndd_2"
 (minus:QI (const_int 64)
   (and:QI (match_dup 3)
   (const_int 63)))) 0)))
- (clobber (reg:CC FLAGS_REG))
- (set (match_dup 0) (match_dup 4))])]
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 0) (match_dup 4))]
 {
   operands[4] = gen_reg_rtx (DImode);
   emit_move_insn (operands[4], operands[0]);
@@ -15851,8 +15851,8 @@ (define_insn_and_split "*x86_shld_ndd_2"
 (minus:QI (const_int 32)
   (and:QI (match_dup 3)
   (const_int 31)))) 0)))
- (clobber (reg:CC FLAGS_REG))
- (set (match_dup 0) (match_dup 4))])]
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 0) (match_dup 4))]
 {
   operands[4] = gen_reg_rtx (SImode);
   emit_move_insn (operands[4], operands[0]);
@@ -17010,8 +17010,8 @@ (define_insn_and_split "*x86_64_shrd_ndd_2"
 (minus:QI (const_int 64)
   (and:QI (match_dup 3)
   (const_int 63)))) 0)))
- (clobber (reg:CC FLAGS_REG))
- (set (match_dup 0) (match_dup 4))])]
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 0) (match_dup 4))]
 {
   operands[4] = gen_reg_rtx (DImode);
   emit_move_insn (operands[4], operands[0]);
@@ -17245,8 +17245,8 @@ (define_insn_and_split "*x86_shrd_ndd_2"
 (minus:QI (const_int 32)
   (and:QI (match_dup 3)
   (const_int 31)))) 0)))
- (clobber (reg:CC FLAGS_REG))
- (set (match_dup 0) (match_dup 4))])]
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 0) (match_dup 4))]
 {
   operands[4] = gen_reg_rtx (SImode);
   emit_move_insn (operands[4], operands[0]);
diff --git a/gcc/testsuite/gcc.target/i386/pr118510.c 
b/gcc/testsuite/gcc.target/i386/pr118510.c
new file mode 100644
index 000..6cfe8182b6f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118510.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mapxf" } */
+
+typedef struct cpp_num cpp_num;
+struct cpp_num {
+int high;
+unsigned low;
+int overflow;
+};
+int num_rshift_n;
+cpp_num num_lshift(cpp_num num) {
+num.low = num.low >> num_rshift_n | num.high << (32 - num_rshift_n);
+return num;
+}
-- 
2.31.1



Re: [PATCH] Fix uniqueness of symtab_node::get_dump_name.

2025-01-16 Thread Richard Biener
On Thu, 16 Jan 2025, Michal Jires wrote:

> symtab_node::get_dump_name uses node order to identify nodes.
> Order is no longer unique because of Incremental LTO patches.
> This patch moves uid from cgraph_node node to symtab_node,
> so get_dump_name can use uid instead and get back unique dump names.
> 
> In inlining passes, uid is replaced with more appropriate (more compact
> for indexing) summary id.
> 
> Bootstrapped/regtested on x86_64-linux.
> Ok for trunk?

This looks reasonable, but I defer to Honza for a final ack.

Richard.
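
To make the difference between order and uid concrete, here is a small model in
plain C.  It is a sketch under assumptions: struct node, register_node and
dump_name are hypothetical stand-ins for symtab_node and its members, and
starting the counter at 1 is only a guess suggested by the selftest expecting
"test_decl/1".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a symbol table node.  */
struct node
{
  const char *name;
  int order;   /* may repeat across nodes */
  int uid;     /* handed out once per node, never reused */
};

static int next_uid = 1;   /* assumption: first uid is 1 */

static void
register_node (struct node *n, const char *name, int order)
{
  assert (n != NULL && name != NULL);
  n->name = name;
  n->order = order;
  n->uid = next_uid++;
}

/* Build "name/uid"; the same scheme with order would collide.  */
static int
dump_name (const struct node *n, char *buf, size_t size)
{
  return snprintf (buf, size, "%s/%d", n->name, n->uid);
}

static int
same_dump_name (const struct node *a, const struct node *b)
{
  char x[64], y[64];
  dump_name (a, x, sizeof x);
  dump_name (b, y, sizeof y);
  return strcmp (x, y) == 0;
}
```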

> gcc/ChangeLog:
> 
>   * cgraph.cc (symbol_table::create_empty):
>   Move uid to symtab_node.
>   (test_symbol_table_test): Change expected dump id.
>   * cgraph.h (struct cgraph_node):
>   Move uid to symtab_node.
>   (symbol_table::register_symbol): Likewise.
>   * dumpfile.cc (test_capture_of_dump_calls):
>   Change expected dump id.
>   * ipa-inline.cc (update_caller_keys):
>   Use summary id instead of uid.
>   (update_callee_keys): Likewise.
>   * symtab.cc (symtab_node::get_dump_name):
>   Use uid instead of order.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/live-patching-1.c: Change expected dump id.
>   * gcc.dg/live-patching-4.c: Likewise.
> ---
>  gcc/cgraph.cc  |  4 ++--
>  gcc/cgraph.h   | 25 ++---
>  gcc/dumpfile.cc|  8 
>  gcc/ipa-inline.cc  |  6 +++---
>  gcc/symtab.cc  |  2 +-
>  gcc/testsuite/gcc.dg/live-patching-1.c |  2 +-
>  gcc/testsuite/gcc.dg/live-patching-4.c |  2 +-
>  7 files changed, 26 insertions(+), 23 deletions(-)
> 
> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
> index 83a9b59ef30..d0b19ad850e 100644
> --- a/gcc/cgraph.cc
> +++ b/gcc/cgraph.cc
> @@ -290,7 +290,7 @@ cgraph_node *
>  symbol_table::create_empty (void)
>  {
>cgraph_count++;
> -  return new (ggc_alloc<cgraph_node> ()) cgraph_node (cgraph_max_uid++);
> +  return new (ggc_alloc<cgraph_node> ()) cgraph_node ();
>  }
>  
>  /* Register HOOK to be called with DATA on each removed edge.  */
> @@ -4338,7 +4338,7 @@ test_symbol_table_test ()
>/* Verify that the node has order 0 on both iterations,
>and thus that nodes have predictable dump names in selftests.  */
>ASSERT_EQ (node->order, 0);
> -  ASSERT_STREQ (node->dump_name (), "test_decl/0");
> +  ASSERT_STREQ (node->dump_name (), "test_decl/1");
>  }
>  }
>  
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 7856d53c9e9..065fcc742e8 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -124,7 +124,7 @@ public:
>order (-1), next_sharing_asm_name (NULL),
>previous_sharing_asm_name (NULL), same_comdat_group (NULL), ref_list 
> (),
>alias_target (NULL), lto_file_data (NULL), aux (NULL),
> -  x_comdat_group (NULL_TREE), x_section (NULL)
> +  x_comdat_group (NULL_TREE), x_section (NULL), m_uid (-1)
>{}
>  
>/* Return name.  */
> @@ -492,6 +492,12 @@ public:
>/* Perform internal consistency checks, if they are enabled.  */
>static inline void checking_verify_symtab_nodes (void);
>  
> +  /* Get unique identifier of the node.  */
> +  inline int get_uid ()
> +  {
> +return m_uid;
> +  }
> +
>/* Type of the symbol.  */
>ENUM_BITFIELD (symtab_type) type : 8;
>  
> @@ -668,6 +674,9 @@ protected:
> void *data,
> bool include_overwrite);
>  private:
> +  /* Unique id of the node.  */
> +  int m_uid;
> +
>/* Workers for set_section.  */
>static bool set_section_from_string (symtab_node *n, void *s);
>static bool set_section_from_node (symtab_node *n, void *o);
> @@ -882,7 +891,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>friend class symbol_table;
>  
>/* Constructor.  */
> -  explicit cgraph_node (int uid)
> +  explicit cgraph_node ()
>  : symtab_node (SYMTAB_FUNCTION), callees (NULL), callers (NULL),
>indirect_calls (NULL),
>next_sibling_clone (NULL), prev_sibling_clone (NULL), clones (NULL),
> @@ -903,7 +912,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>redefined_extern_inline (false), tm_may_enter_irr (false),
>ipcp_clone (false), gc_candidate (false),
>called_by_ifunc_resolver (false), has_omp_variant_constructs (false),
> -  m_uid (uid), m_summary_id (-1)
> +  m_summary_id (-1)
>{}
>  
>/* Remove the node from cgraph and all inline clones inlined into it.
> @@ -1304,12 +1313,6 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>  dump_cgraph (stderr);
>}
>  
> -  /* Get unique identifier of the node.  */
> -  inline int get_uid ()
> -  {
> -return m_uid;
> -  }
> -
>/* Get summary id of the node.  */
>inline int get_summary_id ()
>{
> @@ -1503,8 +1506,6 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_no

Re: [PATCH 2/2] match: Improve the `x ==/!= ~x` pattern [PR118483]

2025-01-16 Thread Richard Biener
On Fri, Jan 17, 2025 at 1:11 AM Andrew Pinski  wrote:
>
> This improves this pattern in 2 ways:
> * Allow for an optional convert, similar to how the few other
>   `a OP ~a` patterns also allow for an optional convert.
> * Use bitwise_inverted_equal_p/maybe_bit_not instead of directly
>   matching bit_not. Just like the other patterns do too.
>
> Note pr118483-2.c used to be optimized for aarch64-linux-gnu with GCC 4.9.4
> on the RTL level even though the gimple level was missing it.

OK.

Thanks,
Richard.
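
The identities behind the pattern can be checked at the source level.  The
sketch below is plain C with hypothetical function names; it evaluates the
comparisons at run time and says nothing about what a particular compiler
folds.  Note that narrowing an out-of-range int to short is
implementation-defined in C; GCC reduces the value modulo 2^16.

```c
#include <assert.h>

/* `a == ~a` can never hold: the two sides differ in every bit.  */
static int
eq_not (int a)
{
  return a == ~a;
}

/* The same comparison through a narrowing conversion on both sides,
   following pr118483-3.c.  */
static int
eq_not_short (int a)
{
  short b = a;
  int e = ~a;
  short c = e;
  return b == c;
}

/* `a == 0` and `a != 0` are logical opposites, as in pr118483-4.c.  */
static int
eq_opposite (int a)
{
  return (a == 0) == (a != 0);
}
```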

> PR tree-optimization/118483
>
> gcc/ChangeLog:
>
> * match.pd (`x ==/!= ~x`): Allow for an optional convert
> and use bitwise_inverted_equal_p/maybe_bit_not instead of
> directly matching bit_not.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr118483-1.c: New test.
> * gcc.dg/tree-ssa/pr118483-2.c: New test.
> * gcc.dg/tree-ssa/pr118483-3.c: New test.
> * gcc.dg/tree-ssa/pr118483-4.c: New test.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/match.pd   |  7 +--
>  gcc/testsuite/gcc.dg/tree-ssa/pr118483-1.c | 18 ++
>  gcc/testsuite/gcc.dg/tree-ssa/pr118483-2.c | 18 ++
>  gcc/testsuite/gcc.dg/tree-ssa/pr118483-3.c | 14 ++
>  gcc/testsuite/gcc.dg/tree-ssa/pr118483-4.c | 11 +++
>  5 files changed, 66 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr118483-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr118483-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr118483-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr118483-4.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index b6cbb851897..5ac7e7417b1 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6959,8 +6959,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* x != ~x -> true */
>  (for cmp (eq ne)
>   (simplify
> -  (cmp:c @0 (bit_not @0))
> -  { constant_boolean_node (cmp == NE_EXPR, type); }))
> +  (cmp:c (convert? @0) (convert? (maybe_bit_not @1)))
> +  (with { bool wascmp; }
> +   (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1))
> +&& bitwise_inverted_equal_p (@0, @1, wascmp))
> +{ constant_boolean_node (cmp == NE_EXPR, type); }))))
>
>  /* Fold ~X op ~Y as Y op X.  */
>  (for cmp (simple_comparison)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118483-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-1.c
> new file mode 100644
> index 000..e31876c940a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-1.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/118483 */
> +/* { dg-final { scan-tree-dump-not "abort " "optimized" } } */
> +
> +
> +/* The value of `l == e` is always false as it is
> +   `(b == 0) == (b != 0)`. */
> +
> +int d;
> +int f(int b)
> +{
> +  int e = b == 0;
> +  d = e;
> +  int l = b != 0;
> +  if (l == e)
> +__builtin_abort ();
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118483-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-2.c
> new file mode 100644
> index 000..84867719867
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/118483 */
> +/* { dg-final { scan-tree-dump-not "abort " "optimized" } } */
> +
> +
> +/* The value of `l == e` is always false as it is
> +   `(b == 0) == (b != 0)`. */
> +
> +int d;
> +int f(int b)
> +{
> +  int e = b == 0;
> +  d = e;
> +  int l = !e;
> +  if (l == e)
> +__builtin_abort ();
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118483-3.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-3.c
> new file mode 100644
> index 000..65efaf5c30f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-3.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/118483 */
> +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */
> +
> +/* This should optimize down to just `return 0;` */
> +/* as `(short)a == ~(short)a` is always false. */
> +int f(int a)
> +{
> +  short b = a;
> +  int e = ~a;
> +  short c = e;
> +  return b == c;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118483-4.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-4.c
> new file mode 100644
> index 000..c6e389c4674
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118483-4.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/118483 */
> +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */
> +
> +/* This should optimize down to just `return 0;` */
> +/* as `a == 0` and `a != 0` are opposites. */
> +int f(int a)
> +{
> +  return (a == 0) == (a != 0);
> +}
> --
> 2.43.0
>


[GCC16 stage 1][RFC][C][PATCH 1/3] Extend "counted_by" attribute to pointer fields of structures.

2025-01-16 Thread Qing Zhao
For example:

struct PP {
  size_t count2;
  char other1;
  char *array2 __attribute__ ((counted_by (count2)));
  int other2;
} *pp;

specifies that "array2" is an array pointed to by the
pointer field, and that its number of elements is given by the field
"count2" in the same structure.
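
As an illustration of what the annotation expresses, here is a minimal sketch
in plain C.  The COUNTED_BY macro is a hypothetical placeholder that expands to
nothing so the sketch builds on compilers that do not accept the attribute on
pointer fields, and pp_get is a hypothetical accessor that performs by hand the
kind of check the count makes possible.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: expand to nothing; with a compiler that supports the
   extension this could be __attribute__ ((counted_by (member))).  */
#define COUNTED_BY(member)

struct PP
{
  size_t count2;
  char other1;
  char *array2 COUNTED_BY (count2);
  int other2;
};

/* Hypothetical accessor: read element INDEX only if it lies within the
   count2 elements that array2 is declared to point to.
   Returns 1 on success and 0 when INDEX is out of bounds.  */
static int
pp_get (const struct PP *pp, size_t index, char *out)
{
  assert (pp != NULL && out != NULL);
  if (index >= pp->count2)
    return 0;
  *out = pp->array2[index];
  return 1;
}
```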

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_counted_by_attribute): Accept counted_by
attribute for pointer fields.

gcc/c/ChangeLog:

* c-decl.cc (verify_counted_by_attribute): Change the 2nd argument
to a vector of fields with counted_by attribute. Verify all fields
in this vector.
(finish_struct): Collect all the fields with counted_by attribute
to a vector and pass this vector to verify_counted_by_attribute.

gcc/ChangeLog:

* doc/extend.texi: Extend counted_by attribute to pointer fields in
structures. Add one more requirement to pointers with counted_by
attribute.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by.c: Update test.
* gcc.dg/pointer-counted-by-2.c: New test.
* gcc.dg/pointer-counted-by-3.c: New test.
* gcc.dg/pointer-counted-by.c: New test.
---
 gcc/c-family/c-attribs.cc|  15 ++-
 gcc/c/c-decl.cc  |  91 +++--
 gcc/doc/extend.texi  |  37 +-
 gcc/testsuite/gcc.dg/flex-array-counted-by.c |   2 +-
 gcc/testsuite/gcc.dg/pointer-counted-by-2.c  |   8 ++
 gcc/testsuite/gcc.dg/pointer-counted-by-3.c  | 127 +++
 gcc/testsuite/gcc.dg/pointer-counted-by.c|  70 ++
 7 files changed, 298 insertions(+), 52 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index eb76430dd07..6dd23db4f9d 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -2906,16 +2906,18 @@ handle_counted_by_attribute (tree *node, tree name,
" declaration %q+D", name, decl);
   *no_add_attrs = true;
 }
-  /* This attribute only applies to field with array type.  */
-  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+  /* This attribute only applies to field with array type or pointer type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE
+  && TREE_CODE (TREE_TYPE (decl)) != POINTER_TYPE)
 {
   error_at (DECL_SOURCE_LOCATION (decl),
-   "%qE attribute is not allowed for a non-array field",
-   name);
+   "%qE attribute is not allowed for a non-array"
+   " or non-pointer field", name);
   *no_add_attrs = true;
 }
   /* This attribute only applies to a C99 flexible array member type.  */
-  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+  else if (TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE
+  && !c_flexible_array_member_type_p (TREE_TYPE (decl)))
 {
   error_at (DECL_SOURCE_LOCATION (decl),
"%qE attribute is not allowed for a non-flexible"
@@ -2930,7 +2932,8 @@ handle_counted_by_attribute (tree *node, tree name,
   *no_add_attrs = true;
 }
   /* Issue error when there is a counted_by attribute with a different
- field as the argument for the same flexible array member field.  */
+ field as the argument for the same flexible array member or
+ pointer field.  */
   else if (old_counted_by != NULL_TREE)
 {
   tree old_fieldname = TREE_VALUE (TREE_VALUE (old_counted_by));
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f60b2a54a17..2c242ce8593 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9448,56 +9448,62 @@ c_update_type_canonical (tree t)
 }
 }
 
-/* Verify the argument of the counted_by attribute of the flexible array
-   member FIELD_DECL is a valid field of the containing structure,
-   STRUCT_TYPE, Report error and remove this attribute when it's not.  */
+/* Verify the argument of the counted_by attribute of each of the
+   FIELDS_WITH_COUNTED_BY is a valid field of the containing structure,
+   STRUCT_TYPE, Report error and remove the corresponding attribute
+   when it's not.  */
 
 static void
-verify_counted_by_attribute (tree struct_type, tree field_decl)
+verify_counted_by_attribute (tree struct_type,
+auto_vec<tree> *fields_with_counted_by)
 {
-  tree attr_counted_by = lookup_attribute ("counted_by",
-  DECL_ATTRIBUTES (field_decl));
-
-  if (!attr_counted_by)
-return;
+  for (tree field_decl : *fields_with_counted_by)
+{
+  tree attr_counted_by = lookup_attribute ("counted_by",
+   DECL_ATTRIBUTES (field_decl));
 
-  /* If there is an counted_by attribute attached to the field,
- verify it.  */
+  if (!attr_counted_by)
+   continue;
 
-  tr

[GCC16 stage 1][RFC][PATCH 0/3]extend "counted_by" attribute to pointer fields of structures

2025-01-16 Thread Qing Zhao
Hi,

This is the patch set to extend the "counted_by" attribute to pointer fields of
structures.

For example:

struct PP {
  size_t count2;
  char other1;
  char *array2 __attribute__ ((counted_by (count2)));
  int other2;
} *pp;

specifies that "array2" is an array pointed to by the
pointer field, and that its number of elements is given by the field
"count2" in the same structure.

Per the previous discussion with Martin and Bill
(https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669320.html)

there are the following important facts about "counted_by" on pointer fields
compared to "counted_by" on FAM fields:

1. one more new requirement for pointer fields with "counted_by" attribute:
   pp->array2 and pp->count2 can ONLY be changed by changing the whole structure
   at the same time.

2. the following feature for FAM field with "counted_by" attribute is NOT
   valid for the pointer field any more:

" One important feature of the attribute is, a reference to the
 flexible array member field uses the latest value assigned to the
 field that represents the number of the elements before that
 reference.  For example,

p->count = val1;
p->array[20] = 0;  // ref1 to p->array
p->count = val2;
p->array[30] = 0;  // ref2 to p->array

 in the above, 'ref1' uses 'val1' as the number of the elements in
 'p->array', and 'ref2' uses 'val2' as the number of elements in
 'p->array'. "
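
Requirement 1 above can be pictured as follows.  This is a minimal sketch in
plain C with a hypothetical helper pp_reset; it shows one way to change the
pointer and its count together through a single whole-structure assignment,
and it does not use the attribute itself.

```c
#include <assert.h>
#include <stddef.h>

struct PP
{
  size_t count2;
  char other1;
  char *array2;   /* would carry counted_by (count2) */
  int other2;
};

/* Hypothetical helper: install a new buffer and its element count in
   one assignment of the whole structure, so there is no point at which
   array2 and count2 disagree.  The unrelated fields are carried over.  */
static void
pp_reset (struct PP *pp, char *buf, size_t n)
{
  assert (pp != NULL);
  *pp = (struct PP) {
    .count2 = n,
    .other1 = pp->other1,
    .array2 = buf,
    .other2 = pp->other2,
  };
}
```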

Although in the previous discussion I agreed with Martin that we should use the
designator syntax (i.e., counted_by (.n) instead of counted_by (n)) for the
counted_by attribute for pointer fields, after more consideration and discussion
with Bill Wendling (who is working on the same feature for Clang), we decided to
keep the current FAM syntax for pointer fields, and to leave the new syntax (.n)
and more complicated expressions to later work.

This patch set includes 3 parts:

1.Extend "counted_by" attribute to pointer fields of structures. 
2.Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE
and use it in builtinin-object-size.
3.Use the counted_by attribute of pointers in array bound checker.

Patches 1 and 2 are simple and straightforward; patch 3, however,
is a little complicated, for the following reason:

Current array bound checker only instruments ARRAY_REF, and the INDEX
information is the 2nd operand of the ARRAY_REF.

When extending the array bound checker to pointer references with
counted_by attributes, the hardest part is to get the INDEX of the
corresponding array ref from the offset computation expression of
the pointer ref. 

The whole patch set has been bootstrapped and regression tested on both aarch64
and x86.

Let me know if you have any comments or suggestions.
 
Thanks.

Qing

Qing Zhao (3):
  Extend "counted_by" attribute to pointer fields of structures.
  Convert a pointer reference with counted_by attribute to
.ACCESS_WITH_SIZE and use it in builtinin-object-size.
  Use the counted_by attribute of pointers in array bound checker.

 gcc/c-family/c-attribs.cc |  15 +-
 gcc/c-family/c-gimplify.cc|   7 +
 gcc/c-family/c-ubsan.cc   | 264 --
 gcc/c/c-decl.cc   |  91 +++---
 gcc/c/c-typeck.cc |  41 +--
 gcc/doc/extend.texi   |  37 ++-
 gcc/testsuite/gcc.dg/flex-array-counted-by.c  |   2 +-
 gcc/testsuite/gcc.dg/pointer-counted-by-2.c   |   8 +
 gcc/testsuite/gcc.dg/pointer-counted-by-3.c   | 127 +
 gcc/testsuite/gcc.dg/pointer-counted-by-4.c   |  63 +
 gcc/testsuite/gcc.dg/pointer-counted-by-5.c   |  48 
 gcc/testsuite/gcc.dg/pointer-counted-by-6.c   |  47 
 gcc/testsuite/gcc.dg/pointer-counted-by-7.c   |  30 ++
 gcc/testsuite/gcc.dg/pointer-counted-by-8.c   |  30 ++
 gcc/testsuite/gcc.dg/pointer-counted-by.c |  70 +
 .../ubsan/pointer-counted-by-bounds-2.c   |  47 
 .../ubsan/pointer-counted-by-bounds-3.c   |  35 +++
 .../ubsan/pointer-counted-by-bounds-4.c   |  35 +++
 .../ubsan/pointer-counted-by-bounds-5.c   |  46 +++
 .../ubsan/pointer-counted-by-bounds-6.c   |  33 +++
 .../gcc.dg/ubsan/pointer-counted-by-bounds.c  |  46 +++
 gcc/tree-object-size.cc   |  11 +-
 22 files changed, 1045 insertions(+), 88 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-8.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by.c
 create mode 100644 gcc

[GCC16 stage 1][RFC][PATCH 2/3] Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE and use it in builtinin-object-size.

2025-01-16 Thread Qing Zhao
gcc/c/ChangeLog:

* c-typeck.cc (build_counted_by_ref): Handle pointers with counted_by.
(build_access_with_size_for_counted_by): Likewise.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): Handle pointers
with counted_by.
(collect_object_sizes_for): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/pointer-counted-by-4.c: New test.
* gcc.dg/pointer-counted-by-5.c: New test.
* gcc.dg/pointer-counted-by-6.c: New test.
* gcc.dg/pointer-counted-by-7.c: New test.
* gcc.dg/pointer-counted-by-8.c: New test.
---
 gcc/c/c-typeck.cc   | 41 --
 gcc/testsuite/gcc.dg/pointer-counted-by-4.c | 63 +
 gcc/testsuite/gcc.dg/pointer-counted-by-5.c | 48 
 gcc/testsuite/gcc.dg/pointer-counted-by-6.c | 47 +++
 gcc/testsuite/gcc.dg/pointer-counted-by-7.c | 30 ++
 gcc/testsuite/gcc.dg/pointer-counted-by-8.c | 30 ++
 gcc/tree-object-size.cc | 11 +++-
 7 files changed, 251 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-8.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index dbb688cabaa..9302236f126 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2911,8 +2911,8 @@ should_suggest_deref_p (tree datum_type)
 
 /* For a SUBDATUM field of a structure or union DATUM, generate a REF to
the object that represents its counted_by per the attribute counted_by
-   attached to this field if it's a flexible array member field, otherwise
-   return NULL_TREE.
+   attached to this field if it's a flexible array member or a pointer
+   field, otherwise return NULL_TREE.
Set COUNTED_BY_TYPE to the TYPE of the counted_by field.
For example, if:
 
@@ -2933,7 +2933,9 @@ static tree
 build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
 {
   tree type = TREE_TYPE (datum);
-  if (!c_flexible_array_member_type_p (TREE_TYPE (subdatum)))
+  tree sub_type = TREE_TYPE (subdatum);
+  if (!c_flexible_array_member_type_p (sub_type)
+  && TREE_CODE (sub_type) != POINTER_TYPE)
 return NULL_TREE;
 
   tree attr_counted_by = lookup_attribute ("counted_by",
@@ -2964,8 +2966,11 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
 }
 
 /* Given a COMPONENT_REF REF with the location LOC, the corresponding
-   COUNTED_BY_REF, and the COUNTED_BY_TYPE, generate an INDIRECT_REF
-   to a call to the internal function .ACCESS_WITH_SIZE.
+   COUNTED_BY_REF, and the COUNTED_BY_TYPE, generate the corresponding
+   call to the internal function .ACCESS_WITH_SIZE.
+
+   Generate an INDIRECT_REF to a call to the internal function
+   .ACCESS_WITH_SIZE.
 
REF
 
@@ -2975,17 +2980,15 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
(TYPE_OF_ARRAY *)0))
 
NOTE: The return type of this function is the POINTER type pointing
-   to the original flexible array type.
-   Then the type of the INDIRECT_REF is the original flexible array type.
-
-   The type of the first argument of this function is a POINTER type
-   to the original flexible array type.
+   to the original flexible array type or the original pointer type.
+   Then the type of the INDIRECT_REF is the original flexible array type
+   or the original pointer type.
 
The 4th argument of the call is a constant 0 with the TYPE of the
object pointed by COUNTED_BY_REF.
 
-   The 6th argument of the call is a constant 0 with the pointer TYPE
-   to the original flexible array type.
+   The 6th argument of the call is a constant 0 of the same TYPE as
+   the return type of the call.
 
   */
 static tree
@@ -2993,20 +2996,26 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
   tree counted_by_ref,
   tree counted_by_type)
 {
-  gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
-  /* The result type of the call is a pointer to the flexible array type.  */
+  gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))
+ || TREE_CODE (TREE_TYPE (ref)) == POINTER_TYPE);
+  bool is_fam = c_flexible_array_member_type_p (TREE_TYPE (ref));
+  tree first_arg = is_fam ? array_to_pointer_conversion (loc, ref)
+ : build_unary_op (loc, ADDR_EXPR, ref, false);
+
+  /* The result type of the call is a pointer to the original type
+ of the ref.  */
   tree result_type = c_build_pointer_type (TREE_TYPE (ref));
 
   tree call
 = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
result_type, 6,
- 

[GCC16 stage 1][RFC][C][PATCH 3/3] Use the counted_by attribute of pointers in array bound checker.

2025-01-16 Thread Qing Zhao
Current array bound checker only instruments ARRAY_REF, and the INDEX
information is the 2nd operand of the ARRAY_REF.

When extending the array bound checker to pointer references with
counted_by attributes, the hardest part is to get the INDEX of the
corresponding array ref from the offset computation expression of
the pointer ref.  I.e.

Given an OFFSET expression, and the ELEMENT_SIZE,
get the index expression from the OFFSET.
For example:
  OFFSET:
   ((long unsigned int) m * (long unsigned int) SAVE_EXPR ) * 4
  ELEMENT_SIZE:
   (sizetype) SAVE_EXPR  * 4
get the index as (long unsigned int) m.
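
The idea can be sketched with a toy model: treat OFFSET and ELEMENT_SIZE as
flat lists of multiplied factors, cancel every factor of ELEMENT_SIZE out of
OFFSET, and take what is left as the index. This is only an illustration; the
struct product type and the function names are hypothetical, factors are
plain strings rather than GCC trees, and "save_expr" merely stands in for the
SAVE_EXPR operand shown above. It is not the code in c-ubsan.cc.

```c
#include <stddef.h>
#include <string.h>

/* Toy model: a product expression is a flat list of factor names.
   This is NOT GCC's tree representation.  */
struct product {
  const char *factor[8];
  size_t n;
};

/* Remove one occurrence of NAME from P; return 1 on success, 0 if absent.  */
static int
remove_factor (struct product *p, const char *name)
{
  for (size_t i = 0; i < p->n; i++)
    if (strcmp (p->factor[i], name) == 0)
      {
        p->factor[i] = p->factor[p->n - 1];
        p->n--;
        return 1;
      }
  return 0;
}

/* Cancel every factor of ELEMENT_SIZE from OFFSET (passed by value, so the
   caller's copy is untouched).  If exactly one factor remains, store it in
   *INDEX and return 1; otherwise return 0.  */
static int
toy_index_from_offset (struct product offset,
                       const struct product *element_size,
                       const char **index)
{
  for (size_t i = 0; i < element_size->n; i++)
    if (!remove_factor (&offset, element_size->factor[i]))
      return 0;
  if (offset.n != 1)
    return 0;
  *index = offset.factor[0];
  return 1;
}
```

For the example above, an OFFSET of { "m", "save_expr", "4" } and an
ELEMENT_SIZE of { "save_expr", "4" } leave "m" as the index. The toy gives up
(returns 0) when the offset is not an exact multiple of the element size.
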

gcc/c-family/ChangeLog:

* c-gimplify.cc (ubsan_walk_array_refs_r): Instrument INDIRECT_REF
with .ACCESS_WITH_SIZE in its address computation.
* c-ubsan.cc (ubsan_instrument_bounds): Format change.
(ubsan_instrument_bounds_pointer): New function.
(get_factors_from_mul_expr): New function.
(get_index_from_offset): New function.
(get_index_from_pointer_addr_expr): New function.
(is_instrumentable_pointer_array): New function.
(ubsan_array_ref_instrumented_p): Handle INDIRECT_REF.
(ubsan_maybe_instrument_array_ref): Handle INDIRECT_REF.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/pointer-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/pointer-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/pointer-counted-by-bounds-4.c: New test.
* gcc.dg/ubsan/pointer-counted-by-bounds-5.c: New test.
* gcc.dg/ubsan/pointer-counted-by-bounds-6.c: New test.
* gcc.dg/ubsan/pointer-counted-by-bounds.c: New test.
---
 gcc/c-family/c-gimplify.cc|   7 +
 gcc/c-family/c-ubsan.cc   | 264 --
 .../ubsan/pointer-counted-by-bounds-2.c   |  47 
 .../ubsan/pointer-counted-by-bounds-3.c   |  35 +++
 .../ubsan/pointer-counted-by-bounds-4.c   |  35 +++
 .../ubsan/pointer-counted-by-bounds-5.c   |  46 +++
 .../ubsan/pointer-counted-by-bounds-6.c   |  33 +++
 .../gcc.dg/ubsan/pointer-counted-by-bounds.c  |  46 +++
 8 files changed, 496 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds-3.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds-4.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds-5.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds-6.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pointer-counted-by-bounds.c

diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index 89a1f5c1e80..1ca54911249 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -120,6 +120,13 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
   walk_tree (&TREE_OPERAND (*tp, 1), ubsan_walk_array_refs_r, pset, pset);
   walk_tree (&TREE_OPERAND (*tp, 0), ubsan_walk_array_refs_r, pset, pset);
 }
+  else if (TREE_CODE (*tp) == INDIRECT_REF
+  && TREE_CODE (TREE_OPERAND (*tp, 0)) == POINTER_PLUS_EXPR
+  && TREE_CODE (TREE_OPERAND (TREE_OPERAND (*tp, 0), 0))
+   == INDIRECT_REF)
+if (is_access_with_size_p
+   (TREE_OPERAND (TREE_OPERAND (TREE_OPERAND (*tp, 0), 0), 0)))
+ubsan_maybe_instrument_array_ref (tp, false);
   return NULL_TREE;
 }
 
diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 78b78685469..21fb0e312f7 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -420,7 +420,6 @@ get_bound_from_access_with_size (tree call)
   return size;
 }
 
-
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
that gets expanded in the sanopt pass, and make an array dimension
of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -450,8 +449,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
   && is_access_with_size_p ((TREE_OPERAND (array, 0))))
{
  bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
- bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
-  bound,
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound), bound,
   build_int_cst (TREE_TYPE (bound), 1));
}
   else
@@ -554,38 +552,270 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
   *index, bound);
 }
 
-/* Return true iff T is an array that was instrumented by SANITIZE_BOUNDS.  */
+
+/* Instrument array bounds for the pointer array whose base address
+   is a call to .ACCESS_WITH_SIZE.  We create special builtin, that
+   gets expanded in the sanopt pass, and make an array dimension of
+   it.  POINTER is the pointer array's base address, *INDEX is an
+   index to the array.
+   Return NULL_TREE if no instrumentation is emitted.  */
+
+tr

Re: [GCC16 stage 1][RFC][PATCH 0/3]extend "counted_by" attribute to pointer fields of structures

2025-01-16 Thread Qing Zhao


> On Jan 16, 2025, at 17:29, Bill Wendling  wrote:
> 
> On Thu, Jan 16, 2025 at 1:19 PM Qing Zhao  wrote:
>> 
>> Hi,
>> 
>> This is the patch set to extend "counted_by" attribute to pointer fields of 
>> structures.
>> 
>> For example:
>> 
>> struct PP {
>>  size_t count2;
>>  char other1;
>>  char *array2 __attribute__ ((counted_by (count2)));
>>  int other2;
>> } *pp;
>> 
>> specifies that the "array2" is an array that is pointed by the
>> pointer field, and its number of elements is given by the field
>> "count2" in the same structure.
>> 
>> Per the previous discussion with Martin and Bill
>> (https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669320.html)
>> 
>> there are the following importand facts about "counted_by" on pointer fields 
>> compared
>> to the "counted_by" on FAM fields:
>> 
>> 1. one more new requirement for pointer fields with "counted_by" attribute:
>>   pp->array2 and pp->count2 can ONLY be changed by changing the whole 
>> structure
>>   at the same time.
>> 
>> 2. the following feature for FAM field with "counted_by" attribute is NOT
>>   valid for the pointer field any more:
>> 
>>" One important feature of the attribute is, a reference to the
>> flexible array member field uses the latest value assigned to the
>> field that represents the number of the elements before that
>> reference.  For example,
>> 
>>p->count = val1;
>>p->array[20] = 0;  // ref1 to p->array
>>p->count = val2;
>>p->array[30] = 0;  // ref2 to p->array
>> 
>> in the above, 'ref1' uses 'val1' as the number of the elements in
>> 'p->array', and 'ref2' uses 'val2' as the number of elements in
>> 'p->array'. "
>> 
>> Although in the previous discussion, I agreed with Martin that we should use 
>> the
>> designator syntax (i.e, counted_by (.n) instead of counted_by (n)) for the
>> counted_by attribute for pointer fields, after more consideration and 
>> discussion
>> with Bill Wendling (who is working on the same work for CLANG), we decided to
>> keep the current syntax of FAM for pointer fields. And leave the new syntax 
>> (.n)
>> and more complicate expressions to a later work.
>> 
>> This patch set includes 3 parts:
>> 
>> 1.Extend "counted_by" attribute to pointer fields of structures.
>> 2.Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE
>>and use it in builtinin-object-size.
>> 3.Use the counted_by attribute of pointers in array bound checker.
>> 
>> In which, the patch 1 and 2 are simple and straightforward, however, the 
>> patch 3
>> is a little complicate due to the following reason:
>> 
>>Current array bound checker only instruments ARRAY_REF, and the INDEX
>>information is the 2nd operand of the ARRAY_REF.
>> 
>>When extending the array bound checker to pointer references with
>>counted_by attributes, the hardest part is to get the INDEX of the
>>corresponding array ref from the offset computation expression of
>>the pointer ref.
>> 
>> The whole patch set has been bootstrapped and regression tested on both 
>> aarch64
>> and x86.
>> 
>> Let me know any comments and suggestions.
>> 
>> Thanks.
>> 
>> Qing
>> 
>> Qing Zhao (3):
>>  Extend "counted_by" attribute to pointer fields of structures.
>>  Convert a pointer reference with counted_by attribute to
>>.ACCESS_WITH_SIZE and use it in builtinin-object-size.
>>  Use the counted_by attribute of pointers in array bound checker.
>> 
>> gcc/c-family/c-attribs.cc |  15 +-
>> gcc/c-family/c-gimplify.cc|   7 +
>> gcc/c-family/c-ubsan.cc   | 264 --
>> gcc/c/c-decl.cc   |  91 +++---
>> gcc/c/c-typeck.cc |  41 +--
>> gcc/doc/extend.texi   |  37 ++-
>> gcc/testsuite/gcc.dg/flex-array-counted-by.c  |   2 +-
>> gcc/testsuite/gcc.dg/pointer-counted-by-2.c   |   8 +
>> gcc/testsuite/gcc.dg/pointer-counted-by-3.c   | 127 +
>> gcc/testsuite/gcc.dg/pointer-counted-by-4.c   |  63 +
>> gcc/testsuite/gcc.dg/pointer-counted-by-5.c   |  48 
>> gcc/testsuite/gcc.dg/pointer-counted-by-6.c   |  47 
>> gcc/testsuite/gcc.dg/pointer-counted-by-7.c   |  30 ++
>> gcc/testsuite/gcc.dg/pointer-counted-by-8.c   |  30 ++
>> gcc/testsuite/gcc.dg/pointer-counted-by.c |  70 +
> 
> Do you have any tests where the 'count' field is after the pointer field?

Yes.

In /gcc/testsuite/gcc.dg/pointer-counted-by.c  

struct mixed_array_2 {
  float *array_1 __attribute ((counted_by (count1)));
  int count1;
  float *array_2 __attribute ((counted_by (count1)));
  long *array_3 __attribute ((counted_by (count2)));
  int count2;
  long array_4[] __attribute ((counted_by (count2)));
};


count2 is after array_3.

Though I might need to add more such cases. Will do that.

Qing


> -bw
> 
>> .../ubsan/pointer-counted-by-bounds-2.c   |  47 
>> .../ubsan/pointer-counted-by-boun

Re: [GCC16 stage 1][RFC][PATCH 0/3]extend "counted_by" attribute to pointer fields of structures

2025-01-16 Thread Bill Wendling
On Thu, Jan 16, 2025 at 3:06 PM Qing Zhao  wrote:
> > On Jan 16, 2025, at 17:29, Bill Wendling  wrote:
> >
> > On Thu, Jan 16, 2025 at 1:19 PM Qing Zhao  wrote:
> >>
> >> Hi,
> >>
> >> This is the patch set to extend "counted_by" attribute to pointer fields 
> >> of structures.
> >>
> >> For example:
> >>
> >> struct PP {
> >>  size_t count2;
> >>  char other1;
> >>  char *array2 __attribute__ ((counted_by (count2)));
> >>  int other2;
> >> } *pp;
> >>
> >> specifies that the "array2" is an array that is pointed by the
> >> pointer field, and its number of elements is given by the field
> >> "count2" in the same structure.
> >>
> >> Per the previous discussion with Martin and Bill
> >> (https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669320.html)
> >>
> >> there are the following importand facts about "counted_by" on pointer 
> >> fields compared
> >> to the "counted_by" on FAM fields:
> >>
> >> 1. one more new requirement for pointer fields with "counted_by" attribute:
> >>   pp->array2 and pp->count2 can ONLY be changed by changing the whole 
> >> structure
> >>   at the same time.
> >>
> >> 2. the following feature for FAM field with "counted_by" attribute is NOT
> >>   valid for the pointer field any more:
> >>
> >>" One important feature of the attribute is, a reference to the
> >> flexible array member field uses the latest value assigned to the
> >> field that represents the number of the elements before that
> >> reference.  For example,
> >>
> >>p->count = val1;
> >>p->array[20] = 0;  // ref1 to p->array
> >>p->count = val2;
> >>p->array[30] = 0;  // ref2 to p->array
> >>
> >> in the above, 'ref1' uses 'val1' as the number of the elements in
> >> 'p->array', and 'ref2' uses 'val2' as the number of elements in
> >> 'p->array'. "
> >>
> >> Although in the previous discussion, I agreed with Martin that we should 
> >> use the
> >> designator syntax (i.e, counted_by (.n) instead of counted_by (n)) for the
> >> counted_by attribute for pointer fields, after more consideration and 
> >> discussion
> >> with Bill Wendling (who is working on the same work for CLANG), we decided 
> >> to
> >> keep the current syntax of FAM for pointer fields. And leave the new 
> >> syntax (.n)
> >> and more complicate expressions to a later work.
> >>
> >> This patch set includes 3 parts:
> >>
> >> 1.Extend "counted_by" attribute to pointer fields of structures.
> >> 2.Convert a pointer reference with counted_by attribute to 
> >> .ACCESS_WITH_SIZE
> >>and use it in builtinin-object-size.
> >> 3.Use the counted_by attribute of pointers in array bound checker.
> >>
> >> In which, the patch 1 and 2 are simple and straightforward, however, the 
> >> patch 3
> >> is a little complicate due to the following reason:
> >>
> >>Current array bound checker only instruments ARRAY_REF, and the INDEX
> >>information is the 2nd operand of the ARRAY_REF.
> >>
> >>When extending the array bound checker to pointer references with
> >>counted_by attributes, the hardest part is to get the INDEX of the
> >>corresponding array ref from the offset computation expression of
> >>the pointer ref.
> >>
> >> The whole patch set has been bootstrapped and regression tested on both 
> >> aarch64
> >> and x86.
> >>
> >> Let me know any comments and suggestions.
> >>
> >> Thanks.
> >>
> >> Qing
> >>
> >> Qing Zhao (3):
> >>  Extend "counted_by" attribute to pointer fields of structures.
> >>  Convert a pointer reference with counted_by attribute to
> >>.ACCESS_WITH_SIZE and use it in builtinin-object-size.
> >>  Use the counted_by attribute of pointers in array bound checker.
> >>
> >> gcc/c-family/c-attribs.cc |  15 +-
> >> gcc/c-family/c-gimplify.cc|   7 +
> >> gcc/c-family/c-ubsan.cc   | 264 --
> >> gcc/c/c-decl.cc   |  91 +++---
> >> gcc/c/c-typeck.cc |  41 +--
> >> gcc/doc/extend.texi   |  37 ++-
> >> gcc/testsuite/gcc.dg/flex-array-counted-by.c  |   2 +-
> >> gcc/testsuite/gcc.dg/pointer-counted-by-2.c   |   8 +
> >> gcc/testsuite/gcc.dg/pointer-counted-by-3.c   | 127 +
> >> gcc/testsuite/gcc.dg/pointer-counted-by-4.c   |  63 +
> >> gcc/testsuite/gcc.dg/pointer-counted-by-5.c   |  48 
> >> gcc/testsuite/gcc.dg/pointer-counted-by-6.c   |  47 
> >> gcc/testsuite/gcc.dg/pointer-counted-by-7.c   |  30 ++
> >> gcc/testsuite/gcc.dg/pointer-counted-by-8.c   |  30 ++
> >> gcc/testsuite/gcc.dg/pointer-counted-by.c |  70 +
> >
> > Do you have any tests where the 'count' field is after the pointer field?
>
> Yes.
>
> In /gcc/testsuite/gcc.dg/pointer-counted-by.c
>
> struct mixed_array_2 {
>   float *array_1 __attribute ((counted_by (count1)));
>   int count1;
>   float *array_2 __attribute ((counted_by (count1)));
>   long *array_3 __attribute ((counted_by (count2

RE: gcc mode switching issue (was Re: RISC-V round_away () handling of non canonical rounding modes)

2025-01-16 Thread Li, Pan2
Hi Vineet,

Is there any more information describing the issue? For example, steps to
reproduce, and the expected behavior versus the actual result.
It is not easy to start an investigation from the mail thread below. Thanks a lot.

Pan

-----Original Message-----
From: Vineet Gupta  
Sent: Friday, January 17, 2025 9:28 AM
To: Andrew Waterman 
Cc: Joseph Myers ; GNU C Library 
; Jeff Law ; Palmer Dabbelt 
; gnu-toolchain ; Robin Dapp 
; juzhe.zh...@rivai.ai; GCC Patches 

Subject: gcc mode switching issue (was Re: RISC-V round_away () handling of non 
canonical rounding modes)

On 1/16/25 15:07, Vineet Gupta wrote:
> +CC Juzhe, Robin, gcc patches mailing list
>
> On 1/16/25 14:49, Andrew Waterman wrote:
>> On Thu, Jan 16, 2025 at 11:43 AM Vineet Gupta  wrote:
>>> On 1/16/25 11:14, Joseph Myers wrote:
>>>> The simple thing to do is to change sysdeps/riscv/rvf/get-rounding-mode.h
>>>> so it only returns a supported value (so making code using
>>>> get_rounding_mode treat FE_TONEARESTFROMZERO the same as FE_TONEAREST,
>>>> effectively).  That doesn't give you actual support for this rounding
>>>> mode, but should at least avoid aborts if it's set, in the absence of the
>>>> larger changes discussed above to implement full FE_TONEARESTFROMZERO
>>>> support.
>>> The simple approach feels simpler 😉
>>> We can certainly fudge get_round_mode() to return FE_TONEARESTFROMZERO as
>>> FE_TONEAREST.
>>> But is that correct semantically, as in the machine itself is in a different
>>> rounding mode than what glibc thinks it is in and could compute values
>>> numerically differently than is expected.
>>>
>>> I wonder if gcc should even be generating insns with such rounding mode for
>>> the general (not explicit) cases.
>> I was wondering the same thing.  On the scalar side, the FP ops have
>> the static rounding mode field, so there isn't a reason to change the
>> dynamic rounding mode if the compiler wants to use directed rounding
>> for a specific scalar instruction.  The vector instructions mostly do
>> not have static rounding modes, so maybe this is the result of
>> autovectorization of e.g. a loop that invokes `lround`?
> Either autovec or just vec
>
> I notice the following pattern in generated code
>
>    90610:    00225073  fsrmi    zero,4
>    90614:    0d707057  vsetvli    zero,zero,e32,mf2,ta,ma
>    90618:    4a1890d7  vfncvt.x.f.w    v1,v1
>
> I have a feeling it is generated by following
>
> (define_insn "@pred_narrow_fcvt_x_f"
>   [(set (match_operand: 0 "register_operand"        "=vd, vd, vr, vr,  &vr,  &vr")
>     (if_then_else:
>       (unspec:
>         [(match_operand: 1 "vector_mask_operand"    " vm, vm,Wc1,Wc1,vmWc1,vmWc1")
>          (match_operand 4 "vector_length_operand"   " rK, rK, rK, rK,   rK,   rK")
>          (match_operand 5 "const_int_operand"       "  i,  i,  i,  i,    i,    i")
>          (match_operand 6 "const_int_operand"       "  i,  i,  i,  i,    i,    i")
>          (match_operand 7 "const_int_operand"       "  i,  i,  i,  i,    i,    i")
>          (match_operand 8 "const_int_operand"       "  i,  i,  i,  i,    i,    i")
>          (reg:SI VL_REGNUM)
>          (reg:SI VTYPE_REGNUM)
>          (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
>       (unspec:
>          [(match_operand:V_VLSF 3 "register_operand" "  0,  0,  0,  0,   vr,   vr")] VFCVTS)
>       (match_operand: 2 "vector_merge_operand"      " vu,  0, vu,  0,   vu,    0")))]
>   "TARGET_VECTOR"
>   "vfncvt.x.f.w\t%0,%3%p1"
>   [(set_attr "type" "vfncvtftoi")
>    (set_attr "mode" "")
>    (set (attr "frm_mode")
>     (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))
>    (set_attr "spec_restriction" "none,none,thv,thv,none,none")])
>
>
> Although I'm not sure how exactly this generates the FSRM (assuming I'm
> looking at the right thing)
>
> FWIW all the testsuite tests for narrowing conversion from float2int seem to
> be checking rtz variant.
> Will have to reduce Fortran - oh well !

Nope it is not the vfncvt stuff, rather some issue in gcc RISC-V mode switching.
We don't need any glibc changes.

-Vineet
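
For reference, the workaround Joseph describes earlier in the quoted thread
(treating FE_TONEARESTFROMZERO as FE_TONEAREST) could look roughly like the
sketch below. It is shown only to illustrate the idea, since the thread
concludes that no glibc changes are needed. The enum and function names are
hypothetical and this is not the code in
sysdeps/riscv/rvf/get-rounding-mode.h; the numeric values are the RISC-V frm
encodings, where the 4 written by "fsrmi zero,4" above is RMM.

```c
/* RISC-V frm field encodings.  */
enum riscv_frm {
  FRM_RNE = 0,  /* round to nearest, ties to even */
  FRM_RTZ = 1,  /* round towards zero */
  FRM_RDN = 2,  /* round down */
  FRM_RUP = 3,  /* round up */
  FRM_RMM = 4   /* round to nearest, ties to max magnitude */
};

/* Hypothetical helper: fold RMM into RNE so that callers only ever see one
   of the four classic rounding modes.  As a simplifying assumption of this
   sketch, reserved encodings are also reported as RNE.  */
static inline enum riscv_frm
toy_supported_rounding_mode (unsigned frm)
{
  switch (frm)
    {
    case FRM_RTZ:
      return FRM_RTZ;
    case FRM_RDN:
      return FRM_RDN;
    case FRM_RUP:
      return FRM_RUP;
    case FRM_RNE:
    case FRM_RMM:
    default:
      return FRM_RNE;
    }
}
```

As discussed in the thread, this only hides the mode from callers; the
machine would still be rounding ties away from zero.
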


Re: [PATCH 0/5] Add btf_decl_tag and btf_type_tag C attributes

2025-01-16 Thread Yonghong Song





On 10/30/24 11:31 AM, David Faust wrote:

This patch series adds support for the btf_decl_tag and btf_type_tag attributes
to GCC. This entails:

- Two new C-family attributes that allow to associate (to "tag") particular
   declarations and types with arbitrary strings. As explained below, this is
   intended to be used to, for example, characterize certain pointer types.  A
   single declaration or type may have multiple occurrences of these attributes.

- The conveyance of that information in the DWARF output in the form of a new
   DIE: DW_TAG_GNU_annotation, and a new attribute: DW_AT_GNU_annotation.

- The conveyance of that information in the BTF output in the form of two new
   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. These BTF
   kinds are already supported by LLVM and other tools in the BPF ecosystem.

Both of these attributes are already supported by clang, and beginning to be
used in various ways by eBPF users and inside the Linux kernel.

Purpose
=======

1)  Addition of C-family language constructs (attributes) to specify free-text
 tags on certain language elements, such as struct fields.

 The purpose of these annotations is to provide additional information about
 types, variables, and function parameters of interest to the kernel. A
 driving use case is to tag pointer types within the Linux kernel and eBPF
 programs with additional semantic information, such as '__user' or '__rcu'.

 For example, consider the Linux kernel function do_execve with the
 following declaration:

   static int do_execve(struct filename *filename,
  const char __user *const __user *__argv,
  const char __user *const __user *__envp);

 Here, __user could be defined with these annotations to record semantic
 information about the pointer parameters (e.g., they are user-provided) in
 DWARF and BTF information. Other kernel facilities such as the eBPF verifier
 can read the tags and make use of the information.

2)  Conveying the tags in the generated DWARF debug info.

 The main motivation for emitting the tags in DWARF is that the Linux kernel
 generates its BTF information via pahole, using DWARF as a source:

     +--------+  BTF                    BTF  +----------+
     | pahole |-----> vmlinux.btf ---------->| verifier |
     +--------+                              +----------+
         ^                                        ^
         |                                        |
   DWARF |                                    BTF |
         |                                        |
      vmlinux                              +-------------+
      module1.ko                           | BPF program |
      module2.ko                           +-------------+
        ...

 This is because:

 a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

 b)  GCC can generate BTF for whatever target with -gbtf, but there is no
 support for linking/deduplicating BTF in the linker.

 In the scenario above, the verifier needs access to the pointer tags of
 both the kernel types/declarations (conveyed in the DWARF and translated
 to BTF by pahole) and those of the BPF program (available directly in BTF).

 Another motivation for having the tag information in DWARF, unrelated to
 BPF and BTF, is that the drgn project (another DWARF consumer) also wants
 to benefit from these tags in order to differentiate between different
 kinds of pointers in the kernel.

3)  Conveying the tags in the generated BTF debug info.

 This is easy: the main purpose of having this info in BTF is for the
 compiled eBPF programs. The kernel verifier can then access the tags
 of pointers used by the eBPF programs.

For more information about these tags and the motivation behind them, please
refer to the following Linux kernel discussions: [1], [2], [3].
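
The __user tagging described in item 1 above can be sketched as follows. This
is a minimal illustration, assuming the attribute spelling
btf_type_tag ("user"); the count_user_args function is hypothetical and is
not kernel code. The macro falls back to an empty expansion when the compiler
does not report support for the attribute, so the same source builds either
way.

```c
#include <stddef.h>

/* Expand __user to a btf_type_tag only when the compiler reports support
   for the attribute; otherwise expand to nothing.  */
#if defined (__has_attribute)
# if __has_attribute (btf_type_tag)
#  define __user __attribute__ ((btf_type_tag ("user")))
# endif
#endif
#ifndef __user
# define __user
#endif

/* Hypothetical function with the same parameter shape as the __argv
   parameter of do_execve: count the entries of a NULL-terminated vector.  */
size_t
count_user_args (const char __user *const __user *argv)
{
  size_t n = 0;
  while (argv[n] != NULL)
    n++;
  return n;
}
```

The tag does not change what the function computes; it only attaches the
string "user" to the pointer types so that it can be carried into the debug
information.
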

DWARF Representation
====================

Compared to prior iterations of this work, this patch series introduces a new
DWARF representation meant to address issues in the previous format. The format
is detailed below.

New DWARF extension: DW_TAG_GNU_annotation.  These DIEs encode the annotation
information.  They exist near the top level of the DIE tree as children of the
compilation unit DIE.  The user-supplied annotations ("tags") are encoded via
DW_AT_name and DW_AT_const_value.  DW_AT_name holds the name of the attribute
which is the source of the annotation (currently only "btf_type_tag" or
"btf_decl_tag").  DW_AT_const_value holds the arbitrary user string from the
attribute argument.

   DW_TAG_GNU_annotation
 DW_AT_name: "btf_decl_tag" or "btf_type_tag"
 DW_AT_const_value: 
 DW_AT_GNU_annotation: see below.

New DWARF extension: DW_AT_GNU_annotation.  If present, the
DW_AT_GNU_annotation attribute is a reference to a DW_TAG_GNU_annotation DIE
holding annotations for the objec

Re: [PATCH v8] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2025-01-16 Thread Jason Merrill

On 10/16/24 11:43 AM, Simon Martin wrote:

> As you know the patch had to be reverted due to PR117114, that
> highlighted a bunch of issues with comparing DECL_VINDEXes: it might
> give false positives in case of multiple inheritance (the case in
> PR117114), but also if there’s single inheritance but the hierarchy has
> more than two levels (another issue I found while bootstrapping with
> rust enabled).


Yes, relying on DECL_VINDEX equality was wrong, sorry to mislead you.


> The attached updated patch introduces an overrides_p function, based on
> the existing check_final_overrider, and uses it when the signatures match.


That seems unnecessary.  It seems like removing that only breaks 
Woverloaded-virt11.C, and making that work again only requires bringing 
back the check that DECL_VINDEX (fndecl) is set (to any value).  Or 
remembering that fndecl was a template, so it can't really have the same 
signature as a non-template, whatever same_signature_p says.


Jason



[PATCH] c++: fix wrong-code with constexpr prvalue opt [PR118396]

2025-01-16 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
The recent r15-6369 unfortunately caused a bad wrong-code issue.
Here we have

  TARGET_EXPR 

and call cp_fold_r -> maybe_constant_init with object=D.2996.  In
cxx_eval_outermost_constant_expr we now take the type of the object
if present.  An object can't have type 'void' and so we continue to
evaluate the initializer.  That evaluates into a VOID_CST, meaning
we disregard the whole initializer, and terrible things ensue.

I think the new prvalue optimization (r15-6052) should only be enabled
for simple TARGET_EXPRs -- ones that don't initialize the object via
a sequence of statements like the one above.

I also think we want an assert so that this doesn't happen again.

PR c++/118396
PR c++/118523

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_outermost_constant_expr): Assert
that we don't throw away the initializer by evaluating it away.
* cp-gimplify.cc (cp_fold_r): Only attempt to evaluate the initializer
if the TARGET_EXPR is simple.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr78687.C: Revert r15-6052.
* g++.dg/tree-ssa/pr90883.C: Likewise.
* g++.dg/cpp0x/constexpr-prvalue4.C: New test.
* g++.dg/cpp1y/constexpr-prvalue3.C: New test.
---
 gcc/cp/constexpr.cc   | 14 --
 gcc/cp/cp-gimplify.cc |  4 +-
 .../g++.dg/cpp0x/constexpr-prvalue4.C | 33 ++
 .../g++.dg/cpp1y/constexpr-prvalue3.C | 45 +++
 gcc/testsuite/g++.dg/tree-ssa/pr78687.C   |  3 +-
 gcc/testsuite/g++.dg/tree-ssa/pr90883.C   |  3 +-
 6 files changed, 94 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-prvalue4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue3.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index c898e3bfa6e..61b25277419 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -8876,9 +8876,17 @@ cxx_eval_outermost_constant_expr (tree t, bool 
allow_non_constant,
   /* Turn off -frounding-math for manifestly constant evaluation.  */
   warning_sentinel rm (flag_rounding_math,
   ctx.manifestly_const_eval == mce_true);
-  tree type = (object
-  ? cv_unqualified (TREE_TYPE (object))
-  : initialized_type (t));
+  tree type;
+  if (object)
+{
+  type = cv_unqualified (TREE_TYPE (object));
+  /* If there is an object to initialize, make sure we don't throw
+away the initializer.  */
+  gcc_assert (!VOID_TYPE_P (initialized_type (t)) || constexpr_dtor);
+}
+  else
+type = initialized_type (t);
+
   tree r = t;
   bool is_consteval = false;
   if (VOID_TYPE_P (type))
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index c7074b00cef..283c0fa3e26 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1475,7 +1475,9 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
  cp_walk_tree (&init, cp_fold_r, data, NULL);
  cp_walk_tree (&TARGET_EXPR_CLEANUP (stmt), cp_fold_r, data, NULL);
  *walk_subtrees = 0;
- if (!flag_no_inline)
+ /* Only attempt to evaluate the initializer if we're inlining and
+the TARGET_EXPR is simple.  */
+ if (!flag_no_inline && !VOID_TYPE_P (TREE_TYPE (init)))
{
  tree folded = maybe_constant_init (init, TARGET_EXPR_SLOT (stmt));
  if (folded != init && TREE_CONSTANT (folded))
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-prvalue4.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-prvalue4.C
new file mode 100644
index 000..afcee65f880
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-prvalue4.C
@@ -0,0 +1,33 @@
+// PR c++/118396
+// { dg-do run { target c++11 } }
+// { dg-options "-O" }
+
+void *operator new(__SIZE_TYPE__, void *__p) { return __p; }
+
+struct Foo {
+  virtual ~Foo() = default;
+};
+struct Data {
+  int status;
+  Foo data{};
+};
+
+Data *P, *Q;
+
+struct vector {
+  vector (const Data &__value) {
+P = static_cast<Data *>(__builtin_operator_new(0));
+new (P) Data (__value);
+Q = P + 1;
+  }
+  Data *begin() { return P; }
+  Data *end() { return Q; }
+};
+
+int
+main ()
+{
+  vector items_(Data{});
+  for (auto item : items_)
+item.status == 0 ? void() : __builtin_abort ();
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue3.C 
b/gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue3.C
new file mode 100644
index 000..8ea86c60be5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue3.C
@@ -0,0 +1,45 @@
+// PR c++/118523
+// { dg-do compile { target c++14 } }
+// { dg-options "-O2 -Wall" }
+
+struct __new_allocator {
+  constexpr __new_allocator() {}
+  __new_allocator(__new_allocator &) {}
+};
+template  using __allocator_base = __new_allocator;
+template  struct allocator_traits;
+template  struct allocator : __allocator_base {};
+template  struc

gcc mode switching issue (was Re: RISC-V round_away () handling of non canonical rounding modes)

2025-01-16 Thread Vineet Gupta
On 1/16/25 15:07, Vineet Gupta wrote:
> +CC Juzhe, Robin, gcc patches mailing list
>
> On 1/16/25 14:49, Andrew Waterman wrote:
>> On Thu, Jan 16, 2025 at 11:43 AM Vineet Gupta  wrote:
>>> On 1/16/25 11:14, Joseph Myers wrote:
>>>> The simple thing to do is to change sysdeps/riscv/rvf/get-rounding-mode.h
>>>> so it only returns a supported value (so making code using
>>>> get_rounding_mode treat FE_TONEARESTFROMZERO the same as FE_TONEAREST,
>>>> effectively).  That doesn't give you actual support for this rounding
>>>> mode, but should at least avoid aborts if it's set, in the absence of the
>>>> larger changes discussed above to implement full FE_TONEARESTFROMZERO
>>>> support.
>>> The simple approach feels simpler 😉
>>> We can certainly fudge get_round_mode() to return FE_TONEARESTFROMZERO as
>>> FE_TONEAREST.
>>> But is that correct semantically?  The machine itself would be in a
>>> different rounding mode than what glibc thinks it is in, and could compute
>>> values numerically differently than expected.
>>>
>>> I wonder if gcc should even be generating insns with such a rounding mode
>>> for the general (not explicit) cases.
>> I was wondering the same thing.  On the scalar side, the FP ops have
>> the static rounding mode field, so there isn't a reason to change the
>> dynamic rounding mode if the compiler wants to use directed rounding
>> for a specific scalar instruction.  The vector instructions mostly do
>> not have static rounding modes, so maybe this is the result of
>> autovectorization of e.g. a loop that invokes `lround`?
> Either autovec or just vec
>
> I notice the following pattern in generated code
>
>    90610:    00225073  fsrmi    zero,4
>    90614:    0d707057  vsetvli    zero,zero,e32,mf2,ta,ma
>    90618:    4a1890d7  vfncvt.x.f.w    v1,v1
>
> I have a feeling it is generated by following
>
> (define_insn "@pred_narrow_fcvt_x_f"
>   [(set (match_operand: 0 "register_operand"         "=vd, vd, vr, vr,  &vr,  &vr")
>     (if_then_else:
>       (unspec:
>         [(match_operand: 1 "vector_mask_operand"     " vm, vm,Wc1,Wc1,vmWc1,vmWc1")
>          (match_operand 4 "vector_length_operand"    " rK, rK, rK, rK,   rK,   rK")
>          (match_operand 5 "const_int_operand"        "  i,  i,  i,  i,    i,    i")
>          (match_operand 6 "const_int_operand"        "  i,  i,  i,  i,    i,    i")
>          (match_operand 7 "const_int_operand"        "  i,  i,  i,  i,    i,    i")
>          (match_operand 8 "const_int_operand"        "  i,  i,  i,  i,    i,    i")
>          (reg:SI VL_REGNUM)
>          (reg:SI VTYPE_REGNUM)
>          (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
>       (unspec:
>          [(match_operand:V_VLSF 3 "register_operand" "  0,  0,  0,  0,   vr,   vr")] VFCVTS)
>       (match_operand: 2 "vector_merge_operand"       " vu,  0, vu,  0,   vu,    0")))]
>   "TARGET_VECTOR"
>   "vfncvt.x.f.w\t%0,%3%p1"
>   [(set_attr "type" "vfncvtftoi")
>    (set_attr "mode" "")
>    (set (attr "frm_mode")
>     (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))
>    (set_attr "spec_restriction" "none,none,thv,thv,none,none")])
>
>
> Although I'm not sure how exactly this generates the FSRM (assuming I'm
> looking at the right thing)
>
> FWIW all the testsuite tests for narrowing conversion from float2int seem to
> be checking the rtz variant.
> Will have to reduce Fortran - oh well !

Nope it is not the vfncvt stuff, rather some issue in gcc RISC-V mode switching.
We don't need any glibc changes.

-Vineet
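
[Editorial note] The concern raised earlier in this thread, that treating round-to-nearest-ties-away as plain round-to-nearest could change computed values, can be sketched with standard library calls alone. This is only an illustration under assumptions: std::round stands in for ties-away rounding and std::nearbyint for the dynamic mode (FE_TONEAREST, ties to even). FE_TONEARESTFROMZERO itself is not used because many targets do not define it, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Ties-away rounding: std::round ignores the dynamic rounding mode and
// always rounds halfway cases away from zero.
inline double ties_away(double x) { return std::round(x); }

// Rounding under the current dynamic mode: std::nearbyint honours
// fegetround(); under FE_TONEAREST halfway cases go to the even neighbour.
inline double ties_current_mode(double x) { return std::nearbyint(x); }

// Hypothetical helper: true if the two strategies give different results
// for x when the dynamic mode is FE_TONEAREST.
inline bool modes_disagree(double x)
{
  const int saved = std::fegetround();
  std::fesetround(FE_TONEAREST);
  const bool differ = ties_away(x) != ties_current_mode(x);
  std::fesetround(saved);
  return differ;
}
```

Only exact halfway cases whose even neighbour lies toward zero (0.5, 2.5, ...) are affected; every other input rounds the same way under both rules.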


[PATCH] c++: Use mapped reads and writes when munmap and msync are available

2025-01-16 Thread John David Anglin
Tested on hppa64-hp-hpux11.11 and hppa-unknown-linux-gnu.

Okay for trunk?

Dave
---

c++: Use mapped reads and writes when munmap and msync are available

Module support is broken when MAPPED_READING and MAPPED_WRITING
are defined to 0.  This causes internal compiler errors in the
permissive-error-1.C and permissive-error-2.C tests.

HP-UX 11.11 doesn't define _POSIX_MAPPED_FILES but it does have
munmap and msync.  Testing indicates support is sufficient for
c++ modules, so use checks for these functions instead of
_POSIX_MAPPED_FILES check.

2025-01-16  John David Anglin  

gcc/ChangeLog:

PR c++/116524
* configure.ac: Check for munmap and msync.
* configure: Regenerate.
* config.in: Regenerate.

gcc/cp/ChangeLog:
* module.cc: Test HAVE_MUNMAP and HAVE_MSYNC instead of
_POSIX_MAPPED_FILES > 0.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 6c38c4925fb..8fab93c9365 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1574,7 +1574,7 @@ AC_CHECK_FUNCS(times clock kill getrlimit setrlimit atoq \
popen sysconf strsignal getrusage nl_langinfo \
gettimeofday mbstowcs wcswidth mmap posix_fallocate setlocale \
gcc_UNLOCKED_FUNCS madvise mallinfo mallinfo2 fstatat getauxval \
-   clock_gettime)
+   clock_gettime munmap msync)
 
 # At least for glibc, clock_gettime is in librt.  But don't pull that
 # in if it still doesn't give us the function we want.
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 61116fe7669..9b2bbdb2988 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -241,11 +241,11 @@ Classes used:
 #define MAPPED_READING 0
 #define MAPPED_WRITING 0
 #else
-#if HAVE_MMAP_FILE && _POSIX_MAPPED_FILES > 0
-/* mmap, munmap.  */
+#if HAVE_MMAP_FILE && HAVE_MUNMAP && HAVE_MSYNC
+/* mmap, munmap, msync.  */
 #define MAPPED_READING 1
 #if HAVE_SYSCONF && defined (_SC_PAGE_SIZE)
-/* msync, sysconf (_SC_PAGE_SIZE), ftruncate  */
+/* sysconf (_SC_PAGE_SIZE), ftruncate  */
 /* posix_fallocate used if available.  */
 #define MAPPED_WRITING 1
 #else
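
[Editorial note] For readers unfamiliar with the three calls the new configure check looks for, here is a minimal sketch of a mapped write followed by an ordinary read. It is not the code in module.cc; the helper name and the temporary file pattern are hypothetical, and a POSIX system with mmap, msync and munmap is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: writes `text` into a fresh temporary file through a
// shared mapping, then reads it back with pread.  Returns an empty string
// on any failure.
inline std::string mapped_round_trip(const std::string &text)
{
  if (text.empty())
    return std::string();

  char path[] = "/tmp/mapped-demo-XXXXXX";  // illustrative location only
  const int fd = mkstemp(path);
  if (fd < 0)
    return std::string();

  std::string result;
  const std::size_t len = text.size();

  // The file must already have the target size before it is mapped.
  if (ftruncate(fd, static_cast<off_t>(len)) == 0)
    {
      void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (p != MAP_FAILED)
        {
          std::memcpy(p, text.data(), len);
          // Flush the dirty pages to the file, then always drop the mapping.
          bool ok = msync(p, len, MS_SYNC) == 0;
          ok = (munmap(p, len) == 0) && ok;
          if (ok)
            {
              std::string buf(len, '\0');
              if (pread(fd, &buf[0], len, 0) == static_cast<ssize_t>(len))
                result = buf;
            }
        }
    }

  close(fd);
  unlink(path);
  return result;
}
```

The sketch needs only the functions themselves to exist, which is the property the patch tests for instead of relying on _POSIX_MAPPED_FILES.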




Re: [PATCH] c++: bogus error with nested lambdas [PR117602]

2025-01-16 Thread Marek Polacek
On Wed, Jan 15, 2025 at 04:18:36PM -0500, Jason Merrill wrote:
> On 1/15/25 12:55 PM, Marek Polacek wrote:
> > On Wed, Jan 15, 2025 at 09:39:41AM -0500, Jason Merrill wrote:
> > > On 11/15/24 9:08 AM, Marek Polacek wrote:
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > > The error here should also check that we aren't nested in another
> > > > lambda; in it, at_function_scope_p() will be false.
> > > > 
> > > > PR c++/117602
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * parser.cc (cp_parser_lambda_introducer): Check if we're in a
> > > > lambda before emitting the error about a non-local lambda with
> > > > a capture-default.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp2a/lambda-uneval19.C: New test.
> > > > ---
> > > >gcc/cp/parser.cc |  5 -
> > > >gcc/testsuite/g++.dg/cpp2a/lambda-uneval19.C | 14 ++
> > > >2 files changed, 18 insertions(+), 1 deletion(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval19.C
> > > > 
> > > > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > > > index 07b12224615..dc79ff42a3b 100644
> > > > --- a/gcc/cp/parser.cc
> > > > +++ b/gcc/cp/parser.cc
> > > > @@ -11611,7 +11611,10 @@ cp_parser_lambda_introducer (cp_parser* 
> > > > parser, tree lambda_expr)
> > > >  cp_lexer_consume_token (parser->lexer);
> > > >  first = false;
> > > > -  if (!(at_function_scope_p () || parsing_nsdmi ()))
> > > > +  if (!(at_function_scope_p ()
> > > > +   || parsing_nsdmi ()
> > > > +   || (current_class_type
> > > > +   && LAMBDA_TYPE_P (current_class_type
> > > 
> > > How about using current_nonlambda_scope () instead of at_function_scope_p
> > > ()?
> > 
> > I think I remember not using that because current_nonlambda_scope() will
> > give us a namespace_decl :: for non-local stuff so it won't be null.  Do
> > you still prefer that (checking the result of current_nonlambda_scope())
> > to what I did in my patch?
> 
> I think so, your change looks to be true for lambdas outside function scope
> as well.

I think it works correctly for both

  auto x = [&]() { // error
  [&]() { };
  };
  auto x2 = []() {
  [&]() { };
  };

but current_nonlambda_scope () will return '::' for the nested lambdas too.
Am I missing something?

Marek
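
[Editorial note] A compilable variant of the accepted case discussed above may help. It assumes an ordinary (evaluated) context, which is simpler than the unevaluated-operand case the PR is about; the variable names are illustrative.

```cpp
#include <cassert>

// Namespace-scope lambda without a capture-default: accepted.
auto outer = []() {
  int n = 0;
  // Nested lambda with a capture-default.  This is fine because the body of
  // the enclosing lambda provides the local scope that `n` lives in.
  [&]() { ++n; }();
  return n;
};

// The rejected form from the message above, left commented out so the
// example still compiles:
// auto bad = [&]() { };  // non-local lambda with a capture-default
```

Calling outer() runs the nested lambda once, so it returns 1.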



Re: [PATCH v3 3/6] c++: Fix ABI for lambdas declared in alias templates [PR116568]

2025-01-16 Thread Nathaniel Shead
On Thu, Jan 16, 2025 at 07:09:33PM -0500, Jason Merrill wrote:
> On 1/6/25 7:22 AM, Nathaniel Shead wrote:
> > I'm not 100% sure I've handled this properly, any feedback welcome.
> > In particular, maybe should I use `DECL_IMPLICIT_TYPEDEF_P` in the
> > mangling logic instead of `!TYPE_DECL_ALIAS_P`?  They both seem to work
> > in this case but not sure which would be clearer.
> > 
> > I also looked into trying to do a limited form of 'start_decl' before
> > parsing the type but there were too many circular dependencies for me to
> > work through, so I think any such changes would have to wait till GCC16
> > (if they're even possible at all).
> > 
> > -- >8 --
> > 
> > This adds mangling support for lambdas with a mangling context of an
> > alias template, and gives that context when instantiating such a lambda.
> 
> I think this is wrong, an alias is not an entity so it is not a definable
> item.
> 
> The ABI change proposal also doesn't mention aliases.
> 
> Jason
> 

Ah right, I see; I'd treated https://eel.is/c++draft/basic.def.odr#1.5
as being any template, but I see now it's "any templated entity" which
is different (since as you say an alias isn't an entity).

In that case, how do you think we should handle class-scope alias
templates of lambdas?  Such a class is surely a definable item, and so
e.g. 

  struct S {
template 
using X = decltype([]{ return I; });
  };
  using L1 = S::X<1>;
  using L2 = S::X<2>;

should this work and declare L1 to be the same type across TUs?
In which case it would need mangling to include the template arguments.
Or because this is a template instantiation are there different rules?

The alternative would of course be that such lambdas are TU-local, which
is what I believe Clang currently does.

Nathaniel


Re: [PATCH] libstdc++: Use string::push_back instead of string::operator+=

2025-01-16 Thread Aditya K
>> From db5036e40ed7ac43b66ca74c44ec8d0bdc934b07 Mon Sep 17 00:00:00 2001
>> From: AdityaK 
>> <1108430...@users.noreply.github.com>
>> Date: Sun, 29 Dec 2024 18:14:29 -0800
>> Subject: [PATCH] libstdc++: Use string::push_back instead of 
>> string::operator+=
>>
>> operator+= returns string& which is ignored anyways.

>Why does this matter? The compiler can see that the return value isn't used.

>Using += seems more readable to me.


nvm, i see both produce the same code. sorry for the noise.
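
[Editorial note] The two spellings compared in this thread can be put side by side as below. This only demonstrates that the observable result is the same; it says nothing about generated code, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Appends each character with operator+=; the returned string& is unused.
inline std::string build_with_plus_equals(const char *src)
{
  std::string s;
  for (; *src; ++src)
    s += *src;
  return s;
}

// Appends each character with push_back, which returns void.
inline std::string build_with_push_back(const char *src)
{
  std::string s;
  for (; *src; ++src)
    s.push_back(*src);
  return s;
}
```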





From: Aditya K 
Sent: Tuesday, January 14, 2025 12:05 PM
To: gcc-patches@gcc.gnu.org ; libstd...@gcc.gnu.org 

Cc: jwak...@redhat.com 
Subject: Re: [PATCH] libstdc++: Use string::push_back instead of 
string::operator+=

pinging in case this was missed.


From: Aditya K 
Sent: Sunday, December 29, 2024 6:36 PM
To: gcc-patches@gcc.gnu.org ; libstd...@gcc.gnu.org 

Cc: jwak...@redhat.com 
Subject: [PATCH] libstdc++: Use string::push_back instead of string::operator+=

>From db5036e40ed7ac43b66ca74c44ec8d0bdc934b07 Mon Sep 17 00:00:00 2001
From: AdityaK <1108430...@users.noreply.github.com>
Date: Sun, 29 Dec 2024 18:14:29 -0800
Subject: [PATCH] libstdc++: Use string::push_back instead of string::operator+=

operator+= returns string& which is ignored anyways.
---
 libstdc++-v3/ChangeLog | 5 +
 libstdc++-v3/include/bits/basic_string.tcc | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 9ab5eeb55a5..be90bfd47e8 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,8 @@
+2024-12-29  Aditya Kumar  
+   * include/bits/basic_string.tcc (getline): Use string::push_back
+   instead of string::operator+=
+
+
 2024-12-29  Gerald Pfeifer  

 * doc/html/manual/profile_mode_diagnostics.html: Delete.
diff --git a/libstdc++-v3/include/bits/basic_string.tcc 
b/libstdc++-v3/include/bits/basic_string.tcc
index caeddaf2f5b..ddb41c8e7e2 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -935,7 +935,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  && !_Traits::eq_int_type(__c, __eof)
  && !_Traits::eq_int_type(__c, __idelim))
 {
- __str += _Traits::to_char_type(__c);
+ __str.push_back(_Traits::to_char_type(__c));
   ++__extracted;
   __c = __in.rdbuf()->snextc();
 }
--
2.47.1.613.gc27f4b7a9f-goog



[PATCH V2] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2025-01-16 Thread Peter Bergner
Unfortunately I accidentally dropped this patch series without posting V2
after Kewen's patch review. :-(  Here's V2 with the suggested changes made.
Here is the thread from the V1 patch submission:

https://inbox.sourceware.org/gcc-patches/1f32e2bf-83c2-4664-b7f3-4a6996978...@linux.ibm.com/


Changes from V1:
* Delete unneeded UNSPEC_MMA_EXTRACT.
* Use 16 rather than GET_MODE_SIZE (V16QImode) as requested by Kewen.
* Remove unneeded change to rs6000_modes_tieable_p.
* Rebase to current trunk.

PR109116 exposes an issue where using unspecs to access each vector component
of an opaque mode variable leads to unneeded register copies, because our rtl
optimizers cannot handle unspecs.  Instead, use subregs to access each vector
register component of the opaque mode variable, which our optimizers know how
to handle and optimize.

I did not include a test case with the patch, since writing a test case that
attempts to ensure we don't emit unneeded register copies is nearly impossible
since those copies can still be generated for reasons other than the causes
in this patch.  I have verified that this patch does improve code generation
for some unit tests and our AI libraries team has confirmed that performance
of their tests improved when using this patch.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux
and powerpc64-linux.  Ok for trunk?

Peter


gcc/
PR target/109116
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_EXTRACT.
(vsx_disassemble_pair): Expand into a vector register sized subreg.
(mma_disassemble_acc): Likewise.
(*vsx_disassemble_pair): Delete.
(*mma_disassemble_acc): Likewise.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 50e577ab44d..85f3a926682 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -30,7 +30,6 @@ (define_constants [(MAX_MMA_OPERANDS 7)])
 
 (define_c_enum "unspec"
   [UNSPEC_VSX_ASSEMBLE
-   UNSPEC_MMA_EXTRACT
UNSPEC_MMA_PMXVBF16GER2
UNSPEC_MMA_PMXVBF16GER2NN
UNSPEC_MMA_PMXVBF16GER2NP
@@ -398,29 +397,8 @@ (define_expand "vsx_disassemble_pair"
(match_operand 2 "const_0_to_1_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*vsx_disassemble_pair"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
- (match_operand 2 "const_0_to_1_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && vsx_register_operand (operands[1], OOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * 16;
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
@@ -472,29 +450,8 @@ (define_expand "mma_disassemble_acc"
(match_operand 2 "const_0_to_3_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*mma_disassemble_acc"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d")
- (match_operand 2 "const_0_to_3_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && fpr_reg_operand (operands[1], XOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * 16;
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
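
[Editorial note] The `INTVAL (operands[2]) * 16` computation in the patch turns a component index into a byte offset, since each vector register holds 16 bytes. The sketch below models only that arithmetic on a flat byte buffer. It is an assumption-laden illustration: it does not model RTL subregs, register numbering or endianness, and all names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Size of one vector register in bytes (V16QImode holds 16 bytes).
constexpr std::size_t kVectorBytes = 16;
using Vec = std::array<std::uint8_t, kVectorBytes>;

// Hypothetical model: copy component `index` out of an opaque value of N
// bytes (32 for a pair, 64 for an accumulator) at byte offset index * 16.
template <std::size_t N>
Vec extract_component(const std::array<std::uint8_t, N> &opaque,
                      std::size_t index)
{
  static_assert(N % kVectorBytes == 0, "opaque size must be a multiple of 16");
  assert(index < N / kVectorBytes);
  const std::size_t byte_off = index * kVectorBytes;
  Vec out;
  std::memcpy(out.data(), opaque.data() + byte_off, kVectorBytes);
  return out;
}
```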



Re: [PATCH] c++: Relax checking assert about elision to support -fno-elide-constructors [PR114619]

2025-01-16 Thread Jason Merrill

On 10/19/24 5:09 AM, Simon Martin wrote:

We currently ICE in checking mode with cxx_dialect < 17 on the following
valid code

=== cut here ===
struct X {
   X(const X&) {}
};
extern X x;
void foo () {
   new X[1]{x};
}
=== cut here ===

The problem is that cp_gimplify_expr gcc_checking_asserts that a
TARGET_EXPR is not TARGET_EXPR_ELIDING_P (or cannot be elided), while in
this case with cxx_dialect < 17, it is TARGET_EXPR_ELIDING_P but we have
not even tried to elide.

This patch relaxes that gcc_checking_assert to not fail when using
cxx_dialect < 17 and -fno-elide-constructors (I considered being more
clever at setting TARGET_EXPR_ELIDING_P appropriately but it looks more
risky and not worth the extra complexity for a checking assert).


The problem is that in that case we end up with two copy constructor 
calls instead of one: one built in massage_init_elt, and the other in 
expand_default_init.  The result of the first copy is marked 
TARGET_EXPR_ELIDING_P, so when we try to pass it to the second copy we 
hit the assert.  I think the assert is catching a real bug: even with 
-fno-elide-constructors we should only copy once, not twice.


This seems to be because 'digested' has the wrong value in 
build_vec_init; we did just call digest_init in build_new_1, but 
build_vec_init doesn't understand that.


Jason
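
[Editorial note] The "copy once, not twice" expectation can be checked from user code with a counting copy constructor. This is a sketch, not a test from the patch; the type and function names are hypothetical. Built with default options it should report a single copy, and building the same code with -std=c++14 -fno-elide-constructors is one way to look for the double copy described above.

```cpp
#include <cassert>

// Hypothetical type that counts how often its copy constructor runs.
struct Counted {
  static int copies;
  Counted() {}
  Counted(const Counted &) { ++copies; }
};
int Counted::copies = 0;

// Returns the number of copy-constructor calls made while initializing a
// one-element array new-expression from an existing object.
inline int copies_for_array_new()
{
  Counted src;
  Counted::copies = 0;
  Counted *p = new Counted[1]{src};
  const int n = Counted::copies;
  delete[] p;
  return n;
}
```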



Re: [GCC16 stage 1][RFC][PATCH 0/3]extend "counted_by" attribute to pointer fields of structures

2025-01-16 Thread Bill Wendling
On Thu, Jan 16, 2025 at 1:19 PM Qing Zhao  wrote:
>
> Hi,
>
> This is the patch set to extend "counted_by" attribute to pointer fields of 
> structures.
>
> For example:
>
> struct PP {
>   size_t count2;
>   char other1;
>   char *array2 __attribute__ ((counted_by (count2)));
>   int other2;
> } *pp;
>
> specifies that the "array2" is an array that is pointed to by the
> pointer field, and its number of elements is given by the field
> "count2" in the same structure.
>
> Per the previous discussion with Martin and Bill
> (https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669320.html)
>
> there are the following important facts about "counted_by" on pointer fields
> compared to the "counted_by" on FAM fields:
>
> 1. one more new requirement for pointer fields with "counted_by" attribute:
>    pp->array2 and pp->count2 can ONLY be changed by changing the whole
>    structure at the same time.
>
> 2. the following feature for FAM field with "counted_by" attribute is NOT
>valid for the pointer field any more:
>
> " One important feature of the attribute is, a reference to the
>  flexible array member field uses the latest value assigned to the
>  field that represents the number of the elements before that
>  reference.  For example,
>
> p->count = val1;
> p->array[20] = 0;  // ref1 to p->array
> p->count = val2;
> p->array[30] = 0;  // ref2 to p->array
>
>  in the above, 'ref1' uses 'val1' as the number of the elements in
>  'p->array', and 'ref2' uses 'val2' as the number of elements in
>  'p->array'. "
>
> Although in the previous discussion, I agreed with Martin that we should use
> the designator syntax (i.e, counted_by (.n) instead of counted_by (n)) for the
> counted_by attribute for pointer fields, after more consideration and
> discussion with Bill Wendling (who is working on the same work for CLANG), we
> decided to keep the current syntax of FAM for pointer fields, and leave the
> new syntax (.n) and more complicated expressions to later work.
>
> This patch set includes 3 parts:
>
> 1.Extend "counted_by" attribute to pointer fields of structures.
> 2.Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE
> and use it in builtinin-object-size.
> 3.Use the counted_by attribute of pointers in array bound checker.
>
> Of these, patches 1 and 2 are simple and straightforward; patch 3, however,
> is a little complicated for the following reason:
>
> Current array bound checker only instruments ARRAY_REF, and the INDEX
> information is the 2nd operand of the ARRAY_REF.
>
> When extending the array bound checker to pointer references with
> counted_by attributes, the hardest part is to get the INDEX of the
> corresponding array ref from the offset computation expression of
> the pointer ref.
>
> The whole patch set has been bootstrapped and regression tested on both
> aarch64 and x86.
>
> Let me know any comments and suggestions.
>
> Thanks.
>
> Qing
>
> Qing Zhao (3):
>   Extend "counted_by" attribute to pointer fields of structures.
>   Convert a pointer reference with counted_by attribute to
> .ACCESS_WITH_SIZE and use it in builtinin-object-size.
>   Use the counted_by attribute of pointers in array bound checker.
>
>  gcc/c-family/c-attribs.cc |  15 +-
>  gcc/c-family/c-gimplify.cc|   7 +
>  gcc/c-family/c-ubsan.cc   | 264 --
>  gcc/c/c-decl.cc   |  91 +++---
>  gcc/c/c-typeck.cc |  41 +--
>  gcc/doc/extend.texi   |  37 ++-
>  gcc/testsuite/gcc.dg/flex-array-counted-by.c  |   2 +-
>  gcc/testsuite/gcc.dg/pointer-counted-by-2.c   |   8 +
>  gcc/testsuite/gcc.dg/pointer-counted-by-3.c   | 127 +
>  gcc/testsuite/gcc.dg/pointer-counted-by-4.c   |  63 +
>  gcc/testsuite/gcc.dg/pointer-counted-by-5.c   |  48 
>  gcc/testsuite/gcc.dg/pointer-counted-by-6.c   |  47 
>  gcc/testsuite/gcc.dg/pointer-counted-by-7.c   |  30 ++
>  gcc/testsuite/gcc.dg/pointer-counted-by-8.c   |  30 ++
>  gcc/testsuite/gcc.dg/pointer-counted-by.c |  70 +

Do you have any tests where the 'count' field is after the pointer field?

-bw

>  .../ubsan/pointer-counted-by-bounds-2.c   |  47 
>  .../ubsan/pointer-counted-by-bounds-3.c   |  35 +++
>  .../ubsan/pointer-counted-by-bounds-4.c   |  35 +++
>  .../ubsan/pointer-counted-by-bounds-5.c   |  46 +++
>  .../ubsan/pointer-counted-by-bounds-6.c   |  33 +++
>  .../gcc.dg/ubsan/pointer-counted-by-bounds.c  |  46 +++
>  gcc/tree-object-size.cc   |  11 +-
>  22 files changed, 1045 insertions(+), 88 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/pointer-counted-by-4.c

[r15-6963 Regression] FAIL: c-c++-common/gomp/metadirective-target-device-1.c -std=c++98 scan-tree-dump-times optimized "GOMP_error" 0 on Linux/x86_64

2025-01-16 Thread haochen.jiang
On Linux/x86_64,

fdeceba59bee60040fd58203b6fe0239d789eade is the first bad commit
commit fdeceba59bee60040fd58203b6fe0239d789eade
Author: Sandra Loosemore 
Date:   Wed Jan 8 01:55:47 2025 +

OpenMP: Shared metadirective/dynamic selector tests for C and C++

caused

FAIL: c-c++-common/gomp/metadirective-device.c scan-tree-dump-not optimized "__builtin_GOMP_error"
FAIL: c-c++-common/gomp/metadirective-device.c  -std=c++17  scan-tree-dump-not optimized "__builtin_GOMP_error"
FAIL: c-c++-common/gomp/metadirective-device.c  -std=c++26  scan-tree-dump-not optimized "__builtin_GOMP_error"
FAIL: c-c++-common/gomp/metadirective-device.c  -std=c++98  scan-tree-dump-not optimized "__builtin_GOMP_error"
FAIL: c-c++-common/gomp/metadirective-target-device-1.c scan-tree-dump-times optimized "GOMP_error" 0
FAIL: c-c++-common/gomp/metadirective-target-device-1.c  -std=c++17  scan-tree-dump-times optimized "GOMP_error" 0
FAIL: c-c++-common/gomp/metadirective-target-device-1.c  -std=c++26  scan-tree-dump-times optimized "GOMP_error" 0
FAIL: c-c++-common/gomp/metadirective-target-device-1.c  -std=c++98  scan-tree-dump-times optimized "GOMP_error" 0

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r15-6963/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=c-c++-common/gomp/metadirective-device.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=c-c++-common/gomp/metadirective-device.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=c-c++-common/gomp/metadirective-target-device-1.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=c-c++-common/gomp/metadirective-target-device-1.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com.)
(If you run into cascadelake-related problems, disabling AVX512F on the command
line might help.)
(However, please make sure that there are no potential problems with AVX512.)


Re: [PATCH] [testsuite] skip test on non-hosted libstdc++ [PR113994]

2025-01-16 Thread Mike Stump
On Jan 16, 2025, at 11:42 AM, Alexandre Oliva  wrote:
> 
> Tests that include  need to be skipped when libstdc++ is built
> in freestanding mode.

Ok.

> for  gcc/testsuite/ChangeLog
> 
>   PR rtl-optimization/113994
>   * g++.dg/pr113994.C: Require hosted libstdc++.

