date:20230214

[PATCH] asan: Add --param=asan-kernel-mem-intrinsic-prefix= [PR108777]

2023-02-14 Thread Jakub Jelinek via Gcc-patches

Hi!

While in the -fsanitize=address case libasan overloads memcpy, memset,
memmove and many other builtins, such that they are always instrumented,
Linux kernel for -fsanitize=kernel-address recently changed or is changing,
such that memcpy, memset and memmove actually aren't instrumented because
they are often used also from no_sanitize ("kernel-address") functions
and wants __{,hw}asan_{memcpy,memset,memmove} to be used instead
for the instrumented calls.  See e.g. the https://lkml.org/lkml/2023/2/9/1182
thread.  Without appropriate support on the compiler side, that will mean
any time a kernel-address instrumented function (most of them) calls
memcpy/memset/memmove, they will not be instrumented and thus won't catch
kernel bugs.  Apparently clang 15 has a param for this.

The following patch implements the same (except it is a usual GCC --param,
not -mllvm argument) on the GCC side.  I know this isn't a regression
bugfix, but given that -fsanitize=kernel-address has a single project that
uses it which badly wants this I think it would be worthwhile to make an
exception and get this into GCC 13 rather than waiting another year, it
won't affect non-kernel code, nor even the kernel unless the new parameter
is used.
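
For readers unfamiliar with the naming scheme, here is a minimal sketch in plain C of how the redirected symbol names are formed. It is only an illustration under stated assumptions, not code from the patch: prefixed_memfn_name and its hw flag are hypothetical names, with hw standing for the choice between -fsanitize=kernel-hwaddress and -fsanitize=kernel-address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): build the symbol name that an
   instrumented call to memcpy/memset/memmove would be redirected to.
   HW selects the __hwasan_ prefix, otherwise __asan_ is used.
   Returns 0 on success, -1 if FN is not one of the three functions
   or BUF is too small to hold the result.  */
int
prefixed_memfn_name (bool hw, const char *fn, char *buf, size_t size)
{
  static const char *const known[] = { "memcpy", "memset", "memmove" };
  bool found = false;

  assert (fn != NULL && buf != NULL);
  for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
    if (strcmp (fn, known[i]) == 0)
      found = true;
  if (!found)
    return -1;

  /* snprintf returns the length it wanted to write, so a value that
     does not fit in SIZE signals truncation.  */
  int n = snprintf (buf, size, "%s%s", hw ? "__hwasan_" : "__asan_", fn);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

With a 32-byte buffer, calling it with hw false and "memcpy" fills the buffer with "__asan_memcpy"; any other function name is rejected.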

Bootstrapped/regtested on x86_64-linux and i686-linux and Marco has tested
it on the kernel, ok for trunk?

2023-02-14  Jakub Jelinek  

PR sanitizer/108777
* params.opt (-param=asan-kernel-mem-intrinsic-prefix=): New param.
* asan.h (asan_memfn_rtl): Declare.
* asan.cc (asan_memfn_rtls): New variable.
(asan_memfn_rtl): New function.
* builtins.cc (expand_builtin): If
param_asan_kernel_mem_intrinsic_prefix and function is
kernel-{,hw}address sanitized, emit calls to
__{,hw}asan_{memcpy,memmove,memset} rather than
{memcpy,memmove,memset}.  Use sanitize_flags_p (SANITIZE_ADDRESS)
instead of flag_sanitize & SANITIZE_ADDRESS to check if
asan_intercepted_p functions shouldn't be expanded inline.

* gcc.dg/asan/pr108777-1.c: New test.
* gcc.dg/asan/pr108777-2.c: New test.
* gcc.dg/asan/pr108777-3.c: New test.
* gcc.dg/asan/pr108777-4.c: New test.
* gcc.dg/asan/pr108777-5.c: New test.
* gcc.dg/asan/pr108777-6.c: New test.
* gcc.dg/completion-3.c: Adjust expected multiline output.

--- gcc/params.opt.jj   2023-02-10 19:04:58.289014706 +0100
+++ gcc/params.opt  2023-02-13 16:19:50.411101775 +0100
@@ -50,6 +50,10 @@ Enable asan store operations protection.
 Common Joined UInteger Var(param_asan_instrumentation_with_call_threshold) Init(7000) Param Optimization
 Use callbacks instead of inline code if number of accesses in function becomes greater or equal to this number.
 
+-param=asan-kernel-mem-intrinsic-prefix=
+Common Joined UInteger Var(param_asan_kernel_mem_intrinsic_prefix) Init(0) IntegerRange(0, 1) Param Optimization
+Prefix calls to memcpy, memset and memmove with __asan_ or __hwasan_ for -fsanitize=kernel-address or -fsanitize=kernel-hwaddress.
+
 -param=asan-memintrin=
 Common Joined UInteger Var(param_asan_memintrin) Init(1) IntegerRange(0, 1) Param Optimization
 Enable asan builtin functions protection.
--- gcc/asan.h.jj   2023-01-02 09:32:26.721222635 +0100
+++ gcc/asan.h  2023-02-13 16:45:14.475088159 +0100
@@ -33,6 +33,7 @@ extern bool asan_expand_check_ifn (gimpl
 extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
 extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
                                     hash_map<tree, tree> &);
+extern rtx asan_memfn_rtl (tree);
 
 extern void hwasan_record_frame_init ();
 extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
--- gcc/asan.cc.jj  2023-02-02 10:54:44.326473507 +0100
+++ gcc/asan.cc 2023-02-13 16:52:16.711015256 +0100
@@ -391,6 +391,46 @@ asan_memintrin (void)
 }
 
 
+/* Support for --param asan-kernel-mem-intrinsic-prefix=1.  */
+static GTY(()) rtx asan_memfn_rtls[3];
+
+rtx
+asan_memfn_rtl (tree fndecl)
+{
+  int i;
+  const char *f, *p;
+  char buf[sizeof ("__hwasan_memmove")];
+
+  switch (DECL_FUNCTION_CODE (fndecl))
+    {
+    case BUILT_IN_MEMCPY: i = 0; f = "memcpy"; break;
+    case BUILT_IN_MEMSET: i = 1; f = "memset"; break;
+    case BUILT_IN_MEMMOVE: i = 2; f = "memmove"; break;
+    default: gcc_unreachable ();
+    }
+  if (asan_memfn_rtls[i] == NULL_RTX)
+    {
+      tree save_name = DECL_NAME (fndecl);
+      tree save_assembler_name = DECL_ASSEMBLER_NAME (fndecl);
+      rtx save_rtl = DECL_RTL (fndecl);
+      if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
+        p = "__hwasan_";
+      else
+        p = "__asan_";
+      strcpy (buf, p);
+      strcat (buf, f);
+      DECL_NAME (fndecl) = get_identifier (buf);
+      DECL_ASSEMBLER_NAME_RAW (fndecl) = NULL_TREE;
+      SET_DECL_RTL (fndecl, NULL_RTX);
+      asan_memfn_rtls[i] = DECL_RTL (fndecl);
+      DECL_NAME (fndecl) = save_name;
+  DECL_ASSEMBLER_NAME_RAW (fndec

Re: [PATCH 2/2] vect: Make partial trapping ops use predication [PR96373]

2023-02-14 Thread Richard Sandiford via Gcc-patches

"Kewen.Lin"  writes:
> on 2023/2/13 21:57, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> Hi Richard,
>>>
>>> on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote:
>>>> PR96373 points out that a predicated SVE loop currently converts
>>>> trapping unconditional ops into unpredicated vector ops.  Doing
>>>> the operation on inactive lanes can then raise an exception.
>>>>
>>>> As discussed in the PR trail, we aren't 100% consistent about
>>>> whether we preserve traps or not.  But the direction of travel
>>>> is clearly to improve that rather than live with it.  This patch
>>>> tries to do that for the SVE case.
>>>>
>>>> Doing this regresses gcc.target/aarch64/sve/fabd_1.c.  I've added
>>>> -fno-trapping-math for now and filed PR108571 to track it.
>>>> A similar problem applies to fsubr_1.c.
>>>>
>>>> I think this is likely to regress Power 10, since conditional
>>>> operations are only available for masked loops.  I think we'll
>>>> need to add -fno-trapping-math to any affected testcases,
>>>> but I don't have a Power 10 system to test on.  Kewen, would you
>>>> mind giving this a spin and seeing how bad the fallout is?
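
The hazard described in the quoted text can be imitated in scalar C. This is a rough sketch, not how the vectorizer or SVE works: trap_count, checked_div, div_unpredicated and div_predicated are hypothetical names, an integer division by zero stands in for a trapping floating-point operation, and a counter stands in for the hardware trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a hardware trap: counts the divisions by zero that were
   actually executed.  */
int trap_count;

/* Division that records a "trap" instead of faulting.  */
int
checked_div (int a, int b)
{
  if (b == 0)
    {
      trap_count++;
      return 0;
    }
  return a / b;
}

/* Unpredicated form: the operation runs on every lane and the mask is
   only applied when selecting the result, so inactive lanes can trap.  */
void
div_unpredicated (int *dst, const int *a, const int *b,
                  const bool *mask, size_t n)
{
  assert (n == 0 || (dst && a && b && mask));
  for (size_t i = 0; i < n; i++)
    {
      int q = checked_div (a[i], b[i]);
      if (mask[i])
        dst[i] = q;
    }
}

/* Predicated form: the operation itself only runs on active lanes.  */
void
div_predicated (int *dst, const int *a, const int *b,
                const bool *mask, size_t n)
{
  assert (n == 0 || (dst && a && b && mask));
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      dst[i] = checked_div (a[i], b[i]);
}
```

Both loops store the same results to the active lanes; only the unpredicated one touches the zero divisors that sit in inactive lanes.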

>>>
>>> Sorry for the late reply, I'm just back from vacation.
>>>
>>> Thank you for fixing this and caring about Power10!
>>>
>>> I tested your proposed patch on one Power10 machine (ppc64le),
>>> it's bootstrapped but some test failures got exposed as below.
>>>
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times mlxvlM 14
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times mstxvlM 7
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times mlxvlM 20
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times mstxvlM 10
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times mlxvlM 14
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times mstxvlM 7
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times mlxvlM 70
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times mlxvx?M 120
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times mstxvlM 70
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times mstxvx?M 70
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times mlxvlM 21
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times mstxvlM 21
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times mstxvx?M 21
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times mlxvlM 10
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times mlxvx?M 42
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times mstxvlM 10
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times mlxvlM 16
>>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times mstxvlM 7
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not mlxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not mstxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times mlxvlM 20
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times mstxvlM 10
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not mlxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not mstxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times mlxvlM 20
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times mstxvlM 10
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times mlxvlM 14
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times mstxvlM 7
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not mlxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not mstxvM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not mstxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times mlxvlM 70
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times mstxvlM 70
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not mlxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not mstxvM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not mstxvxM
>>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times 
>>>

[PATCH] RISC-V: Finish all integer C/C++ intrinsics

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/predicates.md: Refine codes.
* config/riscv/riscv-protos.h (RVV_VUNDEF): New macro.
* config/riscv/riscv-v.cc: Refine codes.
* config/riscv/riscv-vector-builtins-bases.cc (enum ternop_type): New enum.
(class imac): New class.
(enum widen_ternop_type): New enum.
(class iwmac): New class.
(BASE): New class.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmacc): Ditto.
(vnmsac): Ditto.
(vmadd): Ditto.
(vnmsub): Ditto.
(vwmacc): Ditto.
(vwmaccu): Ditto.
(vwmaccsu): Ditto.
(vwmaccus): Ditto.
* config/riscv/riscv-vector-builtins.cc 
(function_builder::apply_predication): Adjust for multiply-add support.
(function_expander::add_vundef_operand): Refine codes.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv-vector-builtins.h: New function.
* config/riscv/vector.md (@pred_mul_): New pattern.
(pred_mul__undef_merge): Ditto.
(*pred_): Ditto.
(*pred_): Ditto.
(*pred_mul_): Ditto.
(@pred_mul__scalar): Ditto.
(*pred_mul__undef_merge_scalar): Ditto.
(*pred__scalar): Ditto.
(*pred__scalar): Ditto.
(*pred_mul__scalar): Ditto.
(*pred_mul__undef_merge_extended_scalar): Ditto.
(*pred__extended_scalar): Ditto.
(*pred__extended_scalar): Ditto.
(*pred_mul__extended_scalar): Ditto.
(@pred_widen_mul_plus): Ditto.
(@pred_widen_mul_plus_scalar): Ditto.
(@pred_widen_mul_plussu): Ditto.
(@pred_widen_mul_plussu_scalar): Ditto.
(@pred_widen_mul_plusus_scalar): Ditto.

---
 gcc/config/riscv/predicates.md|   3 +-
 gcc/config/riscv/riscv-protos.h   |   2 +
 gcc/config/riscv/riscv-v.cc   |   4 +-
 .../riscv/riscv-vector-builtins-bases.cc  | 142 
 .../riscv/riscv-vector-builtins-bases.h   |   8 +
 .../riscv/riscv-vector-builtins-functions.def |  15 +
 gcc/config/riscv/riscv-vector-builtins.cc | 243 ++-
 gcc/config/riscv/riscv-vector-builtins.h  |   2 +
 gcc/config/riscv/vector.md| 680 +-
 9 files changed, 1079 insertions(+), 20 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index bbbf523d588..7bc7c0b4f4d 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -292,8 +292,7 @@
(match_operand 0 "vector_all_trues_mask_operand")))
 
 (define_predicate "vector_undef_operand"
-  (match_test "GET_CODE (op) == UNSPEC
-   && (XINT (op, 1) == UNSPEC_VUNDEF)"))
+  (match_test "rtx_equal_p (op, RVV_VUNDEF (GET_MODE (op)))"))
 
 (define_predicate "vector_merge_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index a4476e6235f..9d8b0b78a06 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -121,6 +121,8 @@ extern void riscv_run_selftests (void);
 
 namespace riscv_vector {
 #define RVV_VLMAX gen_rtx_REG (Pmode, X0_REGNUM)
+#define RVV_VUNDEF(MODE) \
+  gen_rtx_UNSPEC (MODE, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF)
 enum vlmul_type
 {
   LMUL_1 = 0,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index cc26888d58b..600b2e6ecad 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -67,9 +67,7 @@ public:
   }
   void add_vundef_operand (machine_mode mode)
   {
-add_input_operand (gen_rtx_UNSPEC (mode, gen_rtvec (1, const0_rtx),
-  UNSPEC_VUNDEF),
-  mode);
+add_input_operand (RVV_VUNDEF (mode), mode);
   }
   void add_policy_operand (enum tail_policy vta, enum mask_policy vma)
   {
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 4f3531d4486..ba701482728 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -539,6 +539,132 @@ public:
   }
 };
 
+/* Enumerates types of ternary operations.
+   We have 2 types ternop:
+     - 1. accumulator is vd:
+        vmacc.vv vd,vs1,vs2 # vd = vs1 * vs2 + vd.
+     - 2. accumulator is vs2:
+        vmadd.vv vd,vs1,vs2 # vd = vs1 * vd + vs2.  */
+enum ternop_type
+{
+  TERNOP_VMACC,
+  TERNOP_VNMSAC,
+  TERNOP_VMADD,
+  TERNOP_VNMSUB,
+};
+
+/* Implements vmacc/vnmsac/vmadd/vnmsub.  */
+template
+class imac : public function_base
+{
+public:
+  bool has_merge_operand_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+switch (TERNOP_TYPE)
+  {
+  case TER

Re: [PATCH] asan: Add --param=asan-kernel-mem-intrinsic-prefix= [PR108777]

2023-02-14 Thread Richard Biener via Gcc-patches

On Tue, 14 Feb 2023, Jakub Jelinek wrote:

> Hi!
> 
> While in the -fsanitize=address case libasan overloads memcpy, memset,
> memmove and many other builtins, such that they are always instrumented,
> Linux kernel for -fsanitize=kernel-address recently changed or is changing,
> such that memcpy, memset and memmove actually aren't instrumented because
> they are often used also from no_sanitize ("kernel-address") functions
> and wants __{,hw}asan_{memcpy,memset,memmove} to be used instead
> for the instrumented calls.  See e.g. the https://lkml.org/lkml/2023/2/9/1182
> thread.  Without appropriate support on the compiler side, that will mean
> any time a kernel-address instrumented function (most of them) calls
> memcpy/memset/memmove, they will not be instrumented and thus won't catch
> kernel bugs.  Apparently clang 15 has a param for this.
> 
> The following patch implements the same (except it is a usual GCC --param,
> not -mllvm argument) on the GCC side.  I know this isn't a regression
> bugfix, but given that -fsanitize=kernel-address has a single project that
> uses it which badly wants this I think it would be worthwhile to make an
> exception and get this into GCC 13 rather than waiting another year, it
> won't affect non-kernel code, nor even the kernel unless the new parameter
> is used.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux and Marco has tested
> it on the kernel, ok for trunk?

OK.

Thanks,
Richard.


nvptx: Adjust 'scan-assembler' in 'gfortran.dg/weak-1.f90' (was: Support for NOINLINE attribute)

2023-02-14 Thread Thomas Schwinge

Hi!

On 2023-02-13T18:50:23+0100, Harald Anlauf via Gcc-patches wrote:
> Pushed as:
>
> commit 086a1df4374962787db37c1f0d1bd9beb828f9e3

> On 2/12/23 22:28, Harald Anlauf via Gcc-patches wrote:
>> There is one thing I cannot test, which is the handling of weak symbols
>> on other platforms.  A quick glance at the C testcases suggests that
>> someone with access to either an NVPTX or MinGW target might tell
>> whether that particular target should be excluded.

Indeed nvptx does use a different assembler syntax; I've pushed to
master branch commit 8d8175869ca94c600e64e27b7676787b2a398f6e
"nvptx: Adjust 'scan-assembler' in 'gfortran.dg/weak-1.f90'", see
attached.

And I'm curious, is '!GCC$ ATTRIBUTES weak' meant to be used only for
weak definitions (like in 'gfortran.dg/weak-1.f90'), or also for weak
declarations (which, for example, in the C world then evaluate to
zero-address unless actually defined)?  When I did a quick experiment,
that didn't seem to work?  (But may be my fault, of course.)

And, orthogonally: is '!GCC$ ATTRIBUTES weak' meant to be used only for
subroutines (like in 'gfortran.dg/weak-1.f90') and also functions (I
suppose; test case?), or also for weak "data" in some way (which, for
example, in the C world then evaluates to a zero-address unless actually
defined)?

Could help to at least add a few more test cases, and clarify the
documentation?
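
For comparison, the C behavior referred to above can be sketched as follows. This is only an illustration, assuming GCC or a compatible compiler on an ELF target; optional_hook and call_hook_if_present are hypothetical names, and optional_hook is deliberately left undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Weak declaration: no definition is required at link time.  If nothing
   defines optional_hook, its address evaluates to a null pointer.  */
extern void optional_hook (void) __attribute__ ((weak));

/* Weak definition: a default that a strong definition in another
   object file would override.  */
__attribute__ ((weak)) int
impl (void)
{
  return 1;
}

/* Call the hook only if some object file actually defines it.
   Returns 1 if the hook was called, 0 otherwise.  */
int
call_hook_if_present (void)
{
  if (optional_hook != NULL)
    {
      optional_hook ();
      return 1;
    }
  return 0;
}
```

The weak definition of impl corresponds to what 'gfortran.dg/weak-1.f90' tests; the weak declaration of optional_hook is the case the questions above are about.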

Grüße
 Thomas

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 8d8175869ca94c600e64e27b7676787b2a398f6e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 14 Feb 2023 10:11:19 +0100
Subject: [PATCH] nvptx: Adjust 'scan-assembler' in 'gfortran.dg/weak-1.f90'

Fix-up for recent commit 086a1df4374962787db37c1f0d1bd9beb828f9e3
"Fortran: Add !GCC$ attributes NOINLINE,NORETURN,WEAK".

	gcc/testsuite/
	* gfortran.dg/weak-1.f90: Adjust 'scan-assembler' for nvptx.
---
 gcc/testsuite/gfortran.dg/weak-1.f90 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gfortran.dg/weak-1.f90 b/gcc/testsuite/gfortran.dg/weak-1.f90
index d9aca686775a..9ec1fe74053e 100644
--- a/gcc/testsuite/gfortran.dg/weak-1.f90
+++ b/gcc/testsuite/gfortran.dg/weak-1.f90
@@ -1,6 +1,7 @@
 ! { dg-do compile }
 ! { dg-require-weak "" }
-! { dg-final { scan-assembler "weak\[^ \t\]*\[ \t\]_?impl" } }
+! { dg-final { scan-assembler "weak\[^ \t\]*\[ \t\]_?impl" { target { ! nvptx-*-* } } } }
+! { dg-final { scan-assembler-times "\\.weak \\.func impl" 2 { target nvptx-*-* } } }
 subroutine impl
 !GCC$ ATTRIBUTES weak :: impl
 end subroutine
-- 
2.39.1

Re: OpenMP Patch Ping – including "[13 Regression]" patches

2023-02-14 Thread Tobias Burnus


* The 'loop' patch fixes a long-standing bug exposed by a GCC 13 commit,
  making it a "13 Regression" fix
* The next two are simple bug fixes, relatively obvious and have very
  limited-scope code changes - fixing wrong-code issues.
* The next two are a bit longer but also rather contained.
* Julian's patch set fixes several real bugs (such as PR108624), but it is
  unfortunately more complex; still, it would be good if it could eventually
  be reviewed ...


[Patch][v2] OpenMP/Fortran: Fix loop-iter var privatization with !$OMP LOOP [PR108512]
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611730.html


[Patch] libgomp: Fix 'target enter data' with always pointer
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611920.html

[Patch] libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611429.html


[PATCH] openmp: Add support for 'present' modifier
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611299.html

[Patch] Fortran/OpenMP: Add parsing support for allocators/allocate directive
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608904.html


[PATCH v6 00/11] OpenMP: C/C++ lvalue parsing, C/C++/Fortran "declare mapper" support
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/thread.html#609031

Note: For 10/11 of the set, there was a follow up:
[PATCH v6 10/11] OpenMP: Support OpenMP 5.0 "declare mapper" directives for C
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609566.html


On 07.02.23 17:34, Tobias Burnus wrote:


On 10.01.23 12:37, Tobias Burnus wrote [with FYI comments omitted]:
...

Fortran deep mapping (allocatable components)

[Patch][1/2] OpenMP: Add lang hooks + run-time filled map arrays for
Fortran deep mapping of DT
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609637.html


PS: NOTE to the list above: I have stopped checking older patches. I know
some more are pending review, others need to be revised. I will re-check,
once the below listed patches have been reviewed. Cf. old list.


Thanks,

Tobias


Re: [PATCH] debug: Support "phrs" for dumping a HARD_REG_SET

2023-02-14 Thread Richard Sandiford via Gcc-patches

Hans-Peter Nilsson via Gcc-patches  writes:
> Ok to commit?  It survived both a cris-elf regtest and a
> x86_64-linux-gnu native regtest. :)

OK, thanks.

Richard

>  8< 
> The debug-function in sel-sched-dump.cc that would be
> suitable for a hookup to a command in gdb is guarded by
> #ifdef INSN_SCHEDULING, thus can't be used for all targets.
> Better move the function marked DEBUG_FUNCTION elsewhere,
> here to a file with a suitable static function to call.
>
> There are multiple sets of similar functions dumping
> HARD_REG_SETs, but cleaning that up is better left to a
> separate commit.
>
> gcc:
>   * gdbinit.in (phrs): New command.
>   * sel-sched-dump.cc (debug_hard_reg_set): Remove debug-function.
>   * ira-color.cc (debug_hard_reg_set): New, calling print_hard_reg_set.
> ---
>  gcc/gdbinit.in| 12 
>  gcc/ira-color.cc  |  7 +++
>  gcc/sel-sched-dump.cc | 10 --
>  3 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
> index 1f7592b0e26a..a76079a46af7 100644
> --- a/gcc/gdbinit.in
> +++ b/gcc/gdbinit.in
> @@ -31,6 +31,7 @@ GCC gdbinit file introduces several debugging shorthands:
>  pdd [dw_die_ref],
>  pbm [bitmap],
>  pel [location_t],
> +phrs [HARD_REG_SET]
>  pp, pbs, pcfun
>  
>  They are generally implemented by calling a function that prints to stderr,
> @@ -145,6 +146,17 @@ Print given GENERIC expression in C syntax.
>  See also 'help-gcc-hooks'.
>  end
>  
> +define phrs
> +eval "set $debug_arg = $%s", $argc ? "arg0" : ""
> +call debug_hard_reg_set ($debug_arg)
> +end
> +
> +document phrs
> +GCC hook: debug_hard_reg_set (HARD_REG_SET)
> +Print given HARD_REG_SET.
> +See also 'help-gcc-hooks'.
> +end
> +
>  define pmz
>  eval "set $debug_arg = $%s", $argc ? "arg0" : ""
>  call mpz_out_str(stderr, 10, $debug_arg)
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index fe6dfc6e7692..1fb2958bddd0 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -512,6 +512,13 @@ print_hard_reg_set (FILE *f, HARD_REG_SET set, bool new_line_p)
>  fprintf (f, "\n");
>  }
>  
> +/* Dump a hard reg set SET to stderr.  */
> +DEBUG_FUNCTION void
> +debug_hard_reg_set (HARD_REG_SET set)
> +{
> +  print_hard_reg_set (stderr, set, true);
> +}
> +
>  /* Print allocno hard register subforest given by ROOTS and its LEVEL
> to F.  */
>  static void
> diff --git a/gcc/sel-sched-dump.cc b/gcc/sel-sched-dump.cc
> index b4eef8803df9..05de98409375 100644
> --- a/gcc/sel-sched-dump.cc
> +++ b/gcc/sel-sched-dump.cc
> @@ -986,16 +986,6 @@ debug_blist (blist_t bnds)
>restore_dump ();
>  }
>  
> -/* Dump a hard reg set SET to stderr.  */
> -DEBUG_FUNCTION void
> -debug_hard_reg_set (HARD_REG_SET set)
> -{
> -  switch_dump (stderr);
> -  dump_hard_reg_set ("", set);
> -  sel_print ("\n");
> -  restore_dump ();
> -}
> -
>  /* Debug a cfg region with default flags.  */
>  void
>  sel_debug_cfg (void)
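
As a rough illustration of the kind of helper that 'phrs' hooks up to, here is a self-contained sketch of a register-set dumper. It is not GCC's print_hard_reg_set: toy_reg_set, format_toy_reg_set and debug_toy_reg_set are hypothetical names, a 64-bit mask stands in for the target-dependent HARD_REG_SET, and the output format (runs folded into "first-last") is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for a hard register set: one bit per register.  */
typedef uint64_t toy_reg_set;

/* Write the registers in SET to BUF as space-prefixed numbers, folding
   each run of consecutive registers into "first-last".  BUF must be
   large enough for the whole text (256 bytes always suffices).
   Returns the number of characters written.  */
size_t
format_toy_reg_set (toy_reg_set set, char *buf, size_t size)
{
  size_t len = 0;
  int start = -1;

  assert (buf != NULL && size > 0);
  buf[0] = '\0';
  /* Iterate one past the last bit so a run ending at bit 63 is flushed.  */
  for (int i = 0; i <= 64; i++)
    {
      int in_set = i < 64 && ((set >> i) & 1);
      if (in_set && start < 0)
        start = i;
      if (!in_set && start >= 0)
        {
          int end = i - 1;
          int n;
          if (start == end)
            n = snprintf (buf + len, size - len, " %d", start);
          else
            n = snprintf (buf + len, size - len, " %d-%d", start, end);
          assert (n >= 0 && (size_t) n < size - len);
          len += (size_t) n;
          start = -1;
        }
    }
  return len;
}

/* Debugger-friendly entry point: takes only the set and prints to
   stderr, so it can be invoked with a single "call" from gdb.  */
void
debug_toy_reg_set (toy_reg_set set)
{
  char buf[256];
  size_t len = format_toy_reg_set (set, buf, sizeof buf);
  fwrite (buf, 1, len, stderr);
  fputc ('\n', stderr);
}
```

Keeping the formatting separate from the stderr wrapper mirrors the split in the patch, where debug_hard_reg_set only forwards to an existing printing function.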

[Patch] More LLP64 fixes and PIC values fixes for PE targets

2023-02-14 Thread Jonathan Yong via Gcc-patches


Attached patches OK?

From 616e43ac41879040e73a266065874148553cddcc Mon Sep 17 00:00:00 2001
From: Jonathan Yong <10wa...@gmail.com>
Date: Tue, 14 Feb 2023 10:37:03 +
Subject: [PATCH 2/2] gcc/testsuite/gcc.dg: fix pic test case for PE targets

gcc/testsuite/ChangeLog:

	* pic-2.c: fix expected __PIC__ value.
	* pic-3.c: ditto.
	* pic-4.c: ditto.

Signed-off-by: Jonathan Yong <10wa...@gmail.com>
---
 gcc/testsuite/gcc.dg/pic-2.c | 6 +-
 gcc/testsuite/gcc.dg/pic-3.c | 6 +-
 gcc/testsuite/gcc.dg/pic-4.c | 6 +-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pic-2.c b/gcc/testsuite/gcc.dg/pic-2.c
index 3846ec4ff47..24260538cc0 100644
--- a/gcc/testsuite/gcc.dg/pic-2.c
+++ b/gcc/testsuite/gcc.dg/pic-2.c
@@ -4,7 +4,11 @@
 /* { dg-skip-if "__PIC__ is always 1 for MIPS" { mips*-*-* } } */
 /* { dg-skip-if "__PIE__ is always defined for GCN" { amdgcn*-*-* } } */
 
-#if __PIC__ != 2
+#if defined(__CYGWIN__) || defined(__WIN32__)
+# if __PIC__ != 1
+#  error __PIC__ is not 1!
+# endif
+#elif __PIC__ != 2
 # error __PIC__ is not 2!
 #endif
 
diff --git a/gcc/testsuite/gcc.dg/pic-3.c b/gcc/testsuite/gcc.dg/pic-3.c
index 1397977e7f8..d3eb120652a 100644
--- a/gcc/testsuite/gcc.dg/pic-3.c
+++ b/gcc/testsuite/gcc.dg/pic-3.c
@@ -1,7 +1,11 @@
 /* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* amdgcn*-*-* } } } } */
 /* { dg-options "-fno-pic" } */
 
-#ifdef __PIC__
+#if defined(__CYGWIN__) || defined(__WIN32__)
+# if __PIC__ != 1
+#  error __PIC__ is not 1!
+# endif
+#elif __PIC__
 # error __PIC__ is defined!
 #endif
 
diff --git a/gcc/testsuite/gcc.dg/pic-4.c b/gcc/testsuite/gcc.dg/pic-4.c
index d6d9dc90046..d7acefaf9aa 100644
--- a/gcc/testsuite/gcc.dg/pic-4.c
+++ b/gcc/testsuite/gcc.dg/pic-4.c
@@ -1,7 +1,11 @@
 /* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* amdgcn*-*-* } } } } */
 /* { dg-options "-fno-PIC" } */
 
-#ifdef __PIC__
+#if defined(__CYGWIN__) || defined(__WIN32__)
+# if __PIC__ != 1
+#  error __PIC__ is not 1!
+# endif
+#elif __PIC__
 # error __PIC__ is defined!
 #endif
 
-- 
2.39.1

From a1fafc5a3c70684e843f5f0b6cf392ce349cb6b0 Mon Sep 17 00:00:00 2001
From: Jonathan Yong <10wa...@gmail.com>
Date: Tue, 14 Feb 2023 10:29:05 +
Subject: [PATCH 1/2] gcc/testsuite/gcc.dg: fix LLP64 targets

gcc/testsuite/ChangeLog:

	* gcc.dg/builtins-69.c: Use (long )* regex pattern to
	allow long long instead of just long.
	* gcc.dg/pr80163.c: Use __INTPTR_TYPE__ for LLP64 targets.

Signed-off-by: Jonathan Yong <10wa...@gmail.com>
---
 gcc/testsuite/gcc.dg/builtins-69.c | 2 +-
 gcc/testsuite/gcc.dg/pr80163.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtins-69.c b/gcc/testsuite/gcc.dg/builtins-69.c
index 26dfb3bfc1b..b754b5d26ee 100644
--- a/gcc/testsuite/gcc.dg/builtins-69.c
+++ b/gcc/testsuite/gcc.dg/builtins-69.c
@@ -14,7 +14,7 @@ int test_index (void)
 /* PR middle-end/86202 - ICE in get_range_info calling an invalid memcpy()
declaration */
 
-void *memcpy (void *, void *, __SIZE_TYPE__ *);   /* { dg-warning "conflicting types for built-in function .memcpy.; expected .void \\\*\\\(void \\\*, const void \\\*, \(long \)?unsigned int\\\)." } */
+void *memcpy (void *, void *, __SIZE_TYPE__ *);   /* { dg-warning "conflicting types for built-in function .memcpy.; expected .void \\\*\\\(void \\\*, const void \\\*, \(long \)*unsigned int\\\)." } */
 
 void test_memcpy (void *p, void *q, __SIZE_TYPE__ *r)
 {
diff --git a/gcc/testsuite/gcc.dg/pr80163.c b/gcc/testsuite/gcc.dg/pr80163.c
index 37a7abd1181..f65955c0ec9 100644
--- a/gcc/testsuite/gcc.dg/pr80163.c
+++ b/gcc/testsuite/gcc.dg/pr80163.c
@@ -2,6 +2,7 @@
 /* { dg-do compile { target int128 } } */
 /* { dg-options "-O0" } */
 
+typedef __INTPTR_TYPE__ intptr_t;
 void bar (void);
 
 __int128_t *
@@ -10,7 +11,7 @@ foo (void)
 a:
   bar ();
 b:;
-  static __int128_t d = (long) &&a - (long) &&b;	/* { dg-error "initializer element is not computable at load time" } */
+  static __int128_t d = (intptr_t) &&a - (intptr_t) &&b;	/* { dg-error "initializer element is not computable at load time" } */
   return &d;
 }
 
-- 
2.39.1
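
The reason for the pr80163.c change can be sketched in a few lines of C. On LLP64 targets long is 32 bits while pointers are 64 bits, so a cast from a pointer to long can lose bits, whereas intptr_t is defined to hold a pointer value. The function names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trip a pointer through intptr_t.  Unlike a cast to long, this
   does not assume that long is as wide as a pointer.  */
void *
roundtrip_via_intptr (void *p)
{
  intptr_t bits = (intptr_t) p;
  void *q = (void *) bits;
  assert (q == p);  /* the C standard guarantees this for intptr_t */
  return q;
}

/* Report whether long is narrower than a pointer, as on LLP64.  */
int
long_narrower_than_pointer (void)
{
  return sizeof (long) < sizeof (void *);
}
```

On LP64 targets such as x86_64-linux the second function returns 0, which is why the original test passed there without the typedef.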

[PATCH] c++: Add testcases from some Issaquah DRs

2023-02-14 Thread Jakub Jelinek via Gcc-patches

Hi!

The following patch adds testcases for 5 DRs.  For DR2475, DR2530 and
DR2691 my understanding is that we already implement the desired behavior,
and for DR2478 we do so partially (I've added 2 dg-bogus there; I think we
inherit rather than overwrite DECL_DECLARED_CONSTINIT_P for explicit
specialization somewhere, still far better than clang++).  For DR2673, on
the other hand, the DR codified the clang++ behavior rather than GCC's.

Not 100% sure if it is better to commit the 2 with dg-bogus or just wait
until the actual fixes are implemented.  BTW, I've noticed
register_specialization does:
  FOR_EACH_CLONE (clone, fn)
{
  DECL_DECLARED_INLINE_P (clone)
= DECL_DECLARED_INLINE_P (fn);
  DECL_SOURCE_LOCATION (clone)
= DECL_SOURCE_LOCATION (fn);
  DECL_DELETED_FN (clone)
= DECL_DELETED_FN (fn);
}
but not e.g. constexpr/consteval, have tried to cover that in a testcase
but haven't managed to do so.

Tested on x86_64-linux with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ 
RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} dg.exp=DRs/*.C'
ok for trunk (or ok just for the 1st, 3rd and 5th testcase)?

2023-02-14  Jakub Jelinek  

* g++.dg/DRs/dr2475.C: New test.
* g++.dg/DRs/dr2478.C: New test.
* g++.dg/DRs/dr2530.C: New test.
* g++.dg/DRs/dr2673.C: New test.
* g++.dg/DRs/dr2691.C: New test.

--- gcc/testsuite/g++.dg/DRs/dr2475.C.jj 2023-02-14 10:14:18.300920099 +0100
+++ gcc/testsuite/g++.dg/DRs/dr2475.C   2023-02-14 11:24:38.676314439 +0100
@@ -0,0 +1,6 @@
+// DR 2475 - Object declarations of type cv void
+// { dg-do compile }
+
+int f(), x;
+extern void g(),
+  y;   // { dg-error "variable or field 'y' declared void" }
--- gcc/testsuite/g++.dg/DRs/dr2478.C.jj 2023-02-14 10:23:35.487795016 +0100
+++ gcc/testsuite/g++.dg/DRs/dr2478.C   2023-02-14 11:24:49.092162197 +0100
@@ -0,0 +1,74 @@
+// DR 2478 - Properties of explicit specializations of implicitly-instantiated class templates
+// { dg-do compile { target c++20 } }
+
+template 
+struct S {
+  int foo () { return 0; }
+  constexpr int bar () { return 0; }
+  int baz () { return 0; }
+  consteval int qux () { return 0; }
+  constexpr S () {}
+  static constinit T x;
+  static T y;
+};
+
+template 
+T S::x = S ().foo ();// { dg-error "'constinit' variable 'S::x' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+
+template 
+T S::y = S ().foo ();
+
+template <>
+constexpr int
+S::foo ()
+{
+  return 0;
+}
+
+template <>
+int
+S::bar ()
+{
+  return 0;
+}
+
+template <>
+consteval int
+S::baz ()
+{
+  return 0;
+}
+
+template <>
+int
+S::qux ()
+{
+  return 0;
+}
+
+template <>
+long S::x = S ().foo ();   // { dg-bogus "'constinit' variable 'S::x' does not have a constant initializer" "" { xfail *-*-* } }
+   // { dg-bogus "call to non-'constexpr' function" "" { xfail *-*-* } .-1 }
+
+template <>
+constinit long S::y = S ().foo (); // { dg-error "'constinit' variable 'S::y' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+
+constinit auto a = S ().foo ();  // { dg-error "'constinit' variable 'a' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+constinit auto b = S ().bar ();
+constinit auto c = S ().foo ();
+constinit auto d = S ().bar ();   // { dg-error "'constinit' variable 'd' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+constinit auto e = S ().baz ();
+constinit auto f = S ().qux ();  // { dg-error "'constinit' variable 'f' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+constinit auto g = S ().baz ();   // { dg-error "'constinit' variable 'g' does not have a constant initializer" }
+   // { dg-error "call to non-'constexpr' function" "" { target *-*-* } .-1 }
+constinit auto h = S ().qux ();
+auto i = S::x;
+auto j = S::x;
+auto k = S::x;
+auto l = S::y;
+auto m = S::y;
--- gcc/testsuite/g++.dg/DRs/dr2530.C.jj 2023-02-14 11:23:14.306547587 +0100
+++ gcc/testsuite/g++.dg/DRs/dr2530.C   2023-02-14 11:25:58.557146894 +0100
@@ -0,0 +1,5 @@
+// DR 2530 - Multiple definitions of enumerators
+// { dg-do compile }
+
+enum E { e, e };   // { dg-error "redefinition of 'e'" }
+enum F { f = 0, f = 0 };   // { dg-error "redefinition of 'f'" }
--- gcc/testsuite/g++.dg/DRs/dr2673.C.jj2023-02-14 11:38:15.0303942

Re: [PATCH] c++: Add testcases from some Issaquah DRs

2023-02-14 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 14, 2023 at 12:22:33PM +0100, Jakub Jelinek via Gcc-patches wrote:
> 2023-02-14  Jakub Jelinek  
> 
>   * g++.dg/DRs/dr2691.C: New test.

Actually, this one isn't a DR, so maybe it should go into:
* gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-8.c: New test.
instead.

> --- gcc/testsuite/g++.dg/DRs/dr2691.C.jj  2023-02-14 11:48:35.841335492 +0100
> +++ gcc/testsuite/g++.dg/DRs/dr2691.C 2023-02-14 11:57:21.538669133 +0100
> @@ -0,0 +1,15 @@
> +// DR 2691 - hexadecimal-escape-sequence is too greedy
> +// { dg-do run { target c++11 } }
> +// { dg-require-effective-target wchar }
> +// { dg-options "-pedantic" }
> +
> +extern "C" void abort ();
> +
> +const char32_t *a = U"\x{20}ab";// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
> +
> +int
> +main ()
> +{
> +  if (a[0] != U'\x20' || a[1] != U'a' || a[2] != U'b' || a[3] != U'\0')
> +abort ();
> +}

Jakub

[PATCH] tree-optimization/108782 - nested first order recurrence vectorization

2023-02-14 Thread Richard Biener via Gcc-patches

First order recurrence vectorization isn't possible for nested
loops.
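
For readers unfamiliar with the term, here is a minimal sketch of what a
first-order recurrence looks like in an innermost loop: each iteration
consumes a value produced by the previous one.  The function name and the
data are illustrative assumptions only and are not taken from the PR.

```c
#include <stddef.h>

/* A first-order recurrence: iteration i uses the value that
   iteration i - 1 left behind in PREV.  */
void
adjacent_sums (const int *a, int *out, size_t n, int init)
{
  int prev = init;              /* value carried in from before the loop */
  for (size_t i = 0; i < n; i++)
    {
      out[i] = a[i] + prev;     /* uses the value from iteration i - 1 */
      prev = a[i];              /* becomes the carried value for i + 1 */
    }
}
```

The patch concerns the case where such a carried value belongs to an inner
loop while an outer loop is being vectorized; the sketch only shows the
simple innermost-loop shape.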

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108782
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p):
Make sure we're not vectorizing an inner loop.

* gcc.dg/torture/pr108782.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108782.c | 21 +
 gcc/tree-vect-loop.cc   |  4 
 2 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108782.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108782.c b/gcc/testsuite/gcc.dg/torture/pr108782.c
new file mode 100644
index 000..1eac93db574
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108782.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-copy-prop" } */
+
+int m;
+
+__attribute__ ((simd)) int
+foo (void)
+{
+  unsigned a;
+  int b = 0;
+
+  m = a = 1;
+  while (a != 0)
+{
+  b = m;
+  m = 2;
+  ++a;
+}
+
+  return b;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index becf96bb2b8..8387f7690b2 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -538,6 +538,10 @@ static bool
 vect_phi_first_order_recurrence_p (loop_vec_info loop_vinfo, class loop *loop,
   gphi *phi)
 {
+  /* A nested cycle isn't vectorizable as first order recurrence.  */
+  if (LOOP_VINFO_LOOP (loop_vinfo) != loop)
+return false;
+
   /* Ensure the loop latch definition is from within the loop.  */
   edge latch = loop_latch_edge (loop);
   tree ldef = PHI_ARG_DEF_FROM_EDGE (phi, latch);
-- 
2.35.3

[PATCH] Fix small regression in Ada

2023-02-14 Thread Eric Botcazou via Gcc-patches

It is present on the mainline and 12 branch and comes from Andrew P. and me 
forgetting about the VOID_TYPE_P case of SAVE_EXPRs.

Tested on x86-64/Linux, applied on mainline and 12 branch as obvious.


2023-02-14  Eric Botcazou  

gcc/
* gimplify.cc (gimplify_save_expr): Add missing guard.

gcc/ada/
* gcc-interface/trans.cc (gnat_gimplify_expr): Add missing guard.


2023-02-14  Eric Botcazou  

* gnat.dg/shift2.adb: New test.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/trans.cc b/gcc/ada/gcc-interface/trans.cc
index 28e3867d142..5fc1a26fede 100644
--- a/gcc/ada/gcc-interface/trans.cc
+++ b/gcc/ada/gcc-interface/trans.cc
@@ -9049,7 +9049,9 @@ gnat_gimplify_expr (tree *expr_p, gimple_seq *pre_p,
 
   /* Propagate TREE_NO_WARNING from expression to temporary by using the
 	 SAVE_EXPR itself as an intermediate step.  See gimplify_save_expr.  */
-  if (SAVE_EXPR_RESOLVED_P (expr))
+  if (type == void_type_node)
+	;
+  else if (SAVE_EXPR_RESOLVED_P (expr))
 	TREE_NO_WARNING (op) = TREE_NO_WARNING (expr);
   else
 	TREE_NO_WARNING (expr) = TREE_NO_WARNING (op);
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 1b362dd83e3..96845154a92 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -6441,7 +6441,7 @@ gimplify_save_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
   gcc_assert (TREE_CODE (*expr_p) == SAVE_EXPR);
   val = TREE_OPERAND (*expr_p, 0);
 
-  if (TREE_TYPE (val) == error_mark_node)
+  if (val && TREE_TYPE (val) == error_mark_node)
 return GS_ERROR;
 
   /* If the SAVE_EXPR has not been resolved, then evaluate it once.  */
-- { dg-do compile }

with Interfaces; use Interfaces;

function Shift2 (V : Unsigned_32) return Unsigned_32 is
begin
  return Shift_Left (V, (case V is when 0 => 1, when others => 0));
end;

[og12] In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE' (was: [PATCH] libgomp, OpenMP, nvptx: Low-latency memory allocator)

2023-02-14 Thread Thomas Schwinge

Hi Andrew!

On 2022-01-13T11:13:51+, Andrew Stubbs  wrote:
> Updated patch: this version fixes some missed cases of malloc in the
> realloc implementation.

Right, and as it seems I've run into another issue: a stray 'free'.

> --- a/libgomp/allocator.c
> +++ b/libgomp/allocator.c

Re 'omp_realloc':

> @@ -660,9 +709,10 @@ retry:
>gomp_mutex_unlock (&allocator_data->lock);
>  #endif
>if (prev_size)
> - new_ptr = realloc (data->ptr, new_size);
> + new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
> + data->size, new_size);
>else
> - new_ptr = malloc (new_size);
> + new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
>if (new_ptr == NULL)
>   {
>  #ifdef HAVE_SYNC_BUILTINS
> @@ -690,7 +740,11 @@ retry:
>  && (free_allocator_data == NULL
>  || free_allocator_data->pool_size == ~(uintptr_t) 0))
>  {
> -  new_ptr = realloc (data->ptr, new_size);
> +  omp_memspace_handle_t memspace __attribute__((unused))
> + = (allocator_data
> +? allocator_data->memspace
> +: predefined_alloc_mapping[allocator]);
> +  new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
>if (new_ptr == NULL)
>   goto fail;
>ret = (char *) new_ptr + sizeof (struct omp_mem_header);
> @@ -701,7 +755,11 @@ retry:
>  }
>else
>  {
> -  new_ptr = malloc (new_size);
> +  omp_memspace_handle_t memspace __attribute__((unused))
> + = (allocator_data
> +? allocator_data->memspace
> +: predefined_alloc_mapping[allocator]);
> +  new_ptr = MEMSPACE_ALLOC (memspace, new_size);
>if (new_ptr == NULL)
>   goto fail;
>  }
> @@ -735,32 +793,35 @@ retry:
|free (data->ptr);
>return ret;

I run into a SIGSEGV if a non-'malloc'-based allocation is 'free'd here.

The attached
"In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE'"
appears to resolve my issue, but not yet regression-tested.  Does that
look correct to you?
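
To illustrate the class of bug described above, here is a small sketch in
which memory is released through the same "space" that produced it.  All
names (enum space, space_alloc, space_free, the toy arena) are hypothetical
and are not the libgomp MEMSPACE_* definitions; the arena also ignores
alignment for brevity.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical memory-space tags; NOT the libgomp definitions.  */
enum space { SPACE_HEAP, SPACE_ARENA };

static unsigned char arena[256];    /* toy non-malloc backing store */
static size_t arena_used;

static void *
space_alloc (enum space s, size_t size)
{
  if (s == SPACE_HEAP)
    return malloc (size);
  if (size > sizeof arena - arena_used)
    return NULL;
  void *p = arena + arena_used;
  arena_used += size;
  return p;
}

/* Release through the space that produced the pointer.  Calling plain
   free () on an arena pointer would be undefined behavior.  */
static void
space_free (enum space s, void *p, size_t size)
{
  if (s == SPACE_HEAP)
    free (p);
  else if ((unsigned char *) p + size == arena + arena_used)
    arena_used -= size;             /* only the newest block is reclaimed */
}
```

The point of the sketch is only that the caller has to remember which space
an allocation came from, which is what the attached patch does for the old
block in 'omp_realloc'.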

Or, instead of invoking 'MEMSPACE_FREE', should we scrap the
'used_pool_size' bookkeeping here, and just invoke 'omp_free' instead?

--- libgomp/allocator.c
+++ libgomp/allocator.c
@@ -842,19 +842,7 @@ retry:
   if (old_size - old_alignment < size)
 size = old_size - old_alignment;
   memcpy (ret, ptr, size);
-  if (__builtin_expect (free_allocator_data
-   && free_allocator_data->pool_size < ~(uintptr_t) 0, 0))
-{
-#ifdef HAVE_SYNC_BUILTINS
-  __atomic_add_fetch (&free_allocator_data->used_pool_size, -data->size,
- MEMMODEL_RELAXED);
-#else
-  gomp_mutex_lock (&free_allocator_data->lock);
-  free_allocator_data->used_pool_size -= data->size;
-  gomp_mutex_unlock (&free_allocator_data->lock);
-#endif
-}
-  free (data->ptr);
+  ialias_call (omp_free) (ptr, free_allocator);
   return ret;

(I've not yet analyzed whether that's completely equivalent.)


Note that this likewise applies to the current upstream submission:

"libgomp, nvptx: low-latency memory allocator".


Grüße
 Thomas


From d49d0b9dc4f96c496afb2d5caac4addb382fdf39 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 14 Feb 2023 13:35:03 +0100
Subject: [PATCH] In 'libgomp/allocator.c:omp_realloc', route 'free' through
 'MEMSPACE_FREE'

... to not run into a SIGSEGV if a non-'malloc'-based allocation is 'free'd
here.

Fix-up for og12 commit c5d1d7651297a273321154a5fe1b01eba9dcf604
"libgomp, nvptx: low-latency memory allocator".

	libgomp/
	* allocator.c (omp_realloc): Route 'free' through 'MEMSPACE_FREE'.
---
 libgomp/allocator.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 05b323d458e2..ba9a4e17cc20 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -854,7 +854,17 @@ retry:
   gomp_mutex_unlock (&free_allocator_data->lock);
 #endif
 }
-  free (data->ptr);
+  {
+omp_memspace_handle_t was_memspace __attribute__((unused))
+  = (free_allocator_data
+	 ? free_allocator_data->memspace
+	 : predefined_alloc_mapping[free_allocator]);
+int was_pinned __attribute__((unused))
+  = (free_allocator_data
+	 ? free_allocator_data->pinned
+	 : free_allocator == ompx_pinned_mem_alloc);
+MEMSPACE_FREE (was_memspace, data->ptr, data->size, was_pinned);
+  }
   return ret;
 
 fail:
-- 
2.39.1

Re: [PATCH] target/108738 - optimize bit operations in STV

2023-02-14 Thread Richard Biener via Gcc-patches

On Thu, 9 Feb 2023, Richard Biener wrote:

> The following does low-hanging optimizations, combining bitmap
> test and set and removing redundant operations.
> 
> This shaves off half of the testcase compile time.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Ping - sorry, forgot to CC maintainers.
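
For context, the "combine bitmap test and set" idea from the quoted
description can be sketched on a toy bit set whose set operation reports
whether the bit changed (GCC's bitmap_set_bit has the same kind of return
value).  The toy_bitmap type and both function names are made up for this
illustration; GCC's real bitmap is a different data structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy 64-bit set, for illustration only.  */
typedef struct { uint64_t bits; } toy_bitmap;

/* Set BIT (0..63) and report whether it was previously clear.  */
static bool
toy_set_bit (toy_bitmap *b, unsigned bit)
{
  uint64_t mask = UINT64_C (1) << bit;
  bool changed = (b->bits & mask) == 0;
  b->bits |= mask;
  return changed;
}

/* Returns true if ID was newly queued, false if it was already there.  */
static bool
enqueue_once (toy_bitmap *queue, unsigned id)
{
  if (!toy_set_bit (queue, id))
    return false;   /* already queued: one lookup instead of test + set */
  return true;
}
```

With a list-based bitmap each lookup has a cost, so folding the membership
test into the set operation saves one traversal per call.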

Thanks,
Richard.

> Thanks,
> Richard.
> 
>   PR target/108738
>   * config/i386/i386-features.cc (scalar_chain::add_to_queue):
>   Combine bitmap test and set.
>   (scalar_chain::add_insn): Likewise.
>   (scalar_chain::analyze_register_chain): Remove redundant
>   attempt to add to queue and instead strengthen assert.
>   Sink common attempts to mark the def dual-mode.
>   (scalar_chain::add_to_queue): Remove redundant insn bitmap
>   check.
> ---
>  gcc/config/i386/i386-features.cc | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
> index 9bd6d8677bb..eff91301009 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -314,14 +314,12 @@ scalar_chain::~scalar_chain ()
>  void
>  scalar_chain::add_to_queue (unsigned insn_uid)
>  {
> -  if (bitmap_bit_p (insns, insn_uid)
> -  || bitmap_bit_p (queue, insn_uid))
> +  if (!bitmap_set_bit (queue, insn_uid))
>  return;
>  
>if (dump_file)
>  fprintf (dump_file, "  Adding insn %d into chain's #%d queue\n",
>insn_uid, chain_id);
> -  bitmap_set_bit (queue, insn_uid);
>  }
>  
>  /* For DImode conversion, mark register defined by DEF as requiring
> @@ -362,10 +360,9 @@ void
>  scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
>  {
>df_link *chain;
> +  bool mark_def = false;
>  
> -  gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
> -   || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref)));
> -  add_to_queue (DF_REF_INSN_UID (ref));
> +  gcc_checking_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)));
>  
>for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
>  {
> @@ -398,9 +395,12 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
> if (dump_file)
>   fprintf (dump_file, "  r%d use in insn %d isn't convertible\n",
>DF_REF_REGNO (chain->ref), uid);
> -   mark_dual_mode_def (ref);
> +   mark_def = true;
>   }
>  }
> +
> +  if (mark_def)
> +mark_dual_mode_def (ref);
>  }
>  
>  /* Add instruction into a chain.  */
> @@ -408,14 +408,12 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
>  void
>  scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
>  {
> -  if (bitmap_bit_p (insns, insn_uid))
> +  if (!bitmap_set_bit (insns, insn_uid))
>  return;
>  
>if (dump_file)
>  fprintf (dump_file, "  Adding insn %d to chain #%d\n", insn_uid, chain_id);
>  
> -  bitmap_set_bit (insns, insn_uid);
> -
>rtx_insn *insn = DF_INSN_UID_GET (insn_uid)->insn;
>rtx def_set = single_set (insn);
>if (def_set && REG_P (SET_DEST (def_set))
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH] target/108738 - STV bitmap operations compile-time hog

2023-02-14 Thread Richard Biener via Gcc-patches

On Thu, 9 Feb 2023, Richard Biener wrote:

> When the set of candidates becomes very large then repeated
> bit checks on it during the build of an actual chain can become
> slow because of the O(n) nature of bitmap tests.  The following
> switches the candidates bitmaps to the tree representation before
> building the chains to get O(log n) amortized behavior.
> 
> For the testcase at hand this improves STV time by 50%.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Ping - sorry, forgot to CC maintainers.

Richard.

> Thanks,
> Richard.
> 
>   PR target/108738
>   * config/i386/i386-features.cc (convert_scalars_to_vector):
>   Switch candidates bitmaps to tree view before building the chains.
> ---
>  gcc/config/i386/i386-features.cc | 49 +---
>  1 file changed, 26 insertions(+), 23 deletions(-)
> 
> diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
> index ec13d4e7489..9bd6d8677bb 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -2283,30 +2283,33 @@ convert_scalars_to_vector (bool timode_p)
>fprintf (dump_file, "There are no candidates for optimization.\n");
>  
>for (unsigned i = 0; i <= 2; ++i)
> -while (!bitmap_empty_p (&candidates[i]))
> -  {
> - unsigned uid = bitmap_first_set_bit (&candidates[i]);
> - scalar_chain *chain;
> -
> - if (cand_mode[i] == TImode)
> -   chain = new timode_scalar_chain;
> - else
> -   chain = new general_scalar_chain (cand_mode[i], cand_vmode[i]);
> -
> - /* Find instructions chain we want to convert to vector mode.
> -Check all uses and definitions to estimate all required
> -conversions.  */
> - chain->build (&candidates[i], uid);
> -
> - if (chain->compute_convert_gain () > 0)
> -   converted_insns += chain->convert ();
> - else
> -   if (dump_file)
> - fprintf (dump_file, "Chain #%d conversion is not profitable\n",
> -  chain->chain_id);
> +{
> +  bitmap_tree_view (&candidates[i]);
> +  while (!bitmap_empty_p (&candidates[i]))
> + {
> +   unsigned uid = bitmap_first_set_bit (&candidates[i]);
> +   scalar_chain *chain;
>  
> - delete chain;
> -  }
> +   if (cand_mode[i] == TImode)
> + chain = new timode_scalar_chain;
> +   else
> + chain = new general_scalar_chain (cand_mode[i], cand_vmode[i]);
> +
> +   /* Find instructions chain we want to convert to vector mode.
> +  Check all uses and definitions to estimate all required
> +  conversions.  */
> +   chain->build (&candidates[i], uid);
> +
> +   if (chain->compute_convert_gain () > 0)
> + converted_insns += chain->convert ();
> +   else
> + if (dump_file)
> +   fprintf (dump_file, "Chain #%d conversion is not profitable\n",
> +chain->chain_id);
> +
> +   delete chain;
> + }
> +}
>  
>if (dump_file)
>  fprintf (dump_file, "Total insns converted: %d\n", converted_insns);
> 


[PATCH] RISC-V: Add ternary constraint tests

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c: New test.
* gcc.target/riscv/rvv/base/ternop_vv_constraint-2.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-1.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-2.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-3.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-4.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-5.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-6.c: New test.
* gcc.target/riscv/rvv/base/ternop_vx_constraint-7.c: New test.

---
 .../riscv/rvv/base/ternop_vv_constraint-1.c   |  83 +++
 .../riscv/rvv/base/ternop_vv_constraint-2.c   |  83 +++
 .../riscv/rvv/base/ternop_vx_constraint-1.c   |  71 ++
 .../riscv/rvv/base/ternop_vx_constraint-2.c   |  38 +
 .../riscv/rvv/base/ternop_vx_constraint-3.c   | 125 +
 .../riscv/rvv/base/ternop_vx_constraint-4.c   | 123 +
 .../riscv/rvv/base/ternop_vx_constraint-5.c   | 123 +
 .../riscv/rvv/base/ternop_vx_constraint-6.c   | 130 ++
 .../riscv/rvv/base/ternop_vx_constraint-7.c   | 130 ++
 9 files changed, 906 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vv_constraint-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vx_constraint-7.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c
new file mode 100644
index 000..838776e5c50
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/ternop_vv_constraint-1.c
@@ -0,0 +1,83 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_vector.h"
+
+/*
+** f1:
+** vsetivli\tzero,4,e32,m1,ta,ma
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vse32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (void * in, void * in2, void *out)
+{
+vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4);
+vint32m1_t v3 = __riscv_vmacc_vv_i32m1 (v, v2, v2, 4);
+vint32m1_t v4 = __riscv_vmacc_vv_i32m1(v3, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+__riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f2:
+** vsetivli\tzero,4,e32,m1,tu,ma
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vse32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (void * in, void * in2, void *out)
+{
+vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4);
+vint32m1_t v3 = __riscv_vmacc_vv_i32m1_tu (v, v2, v2, 4);
+vint32m1_t v4 = __riscv_vmacc_vv_i32m1_tu(v3, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+__riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f3:
+** vsetivli\tzero,4,e32,m1,ta,ma
+** vlm\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** vse32\.v\tv[0-9]+,0\([a-x0-9

[PATCH] Speedup DF dataflow solver

2023-02-14 Thread Richard Biener via Gcc-patches

The following makes sure to process blocks that follow the current
block in the iteration order in the same iteration and only postpone
blocks that would be visited earlier to the next iteration.

For the all.i testcase in PR26854 at -O2 this shaves off 50% of
the time to solve the DF RD problem; other problems also improve,
but not as drastically.
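
The iteration scheme described above can be sketched on a toy forward
problem, assuming blocks are numbered in visiting order and using plain
unsigned masks in place of bitmaps.  The CFG, the gen sets and the solve
function are invented for this illustration and are not the df-core.cc
code.

```c
#define NBLOCKS 4

/* Toy CFG with blocks numbered in visiting order:
   0 -> 1, 1 -> 2, 2 -> 3 and a back edge 2 -> 1.
   succ[b] is a bit mask of the successors of block b.  */
static const unsigned succ[NBLOCKS]
  = { 1u << 1, 1u << 2, (1u << 3) | (1u << 1), 0 };
static const unsigned gen[NBLOCKS] = { 0x1, 0x2, 0x4, 0x8 };

/* Solve out[b] = gen[b] | (union of out[p] over predecessors p).
   Returns the number of block visits needed to reach the fixed point.  */
static unsigned
solve (unsigned out[NBLOCKS])
{
  unsigned in[NBLOCKS] = { 0 };
  unsigned pending = (1u << NBLOCKS) - 1;   /* start with every block */
  unsigned visits = 0;

  for (unsigned b = 0; b < NBLOCKS; b++)
    out[b] = 0;

  while (pending)
    {
      unsigned worklist = pending;
      pending = 0;
      while (worklist)
        {
          /* Pick the lowest-numbered queued block.  */
          unsigned b = 0;
          while (!(worklist & (1u << b)))
            b++;
          worklist &= ~(1u << b);
          visits++;

          unsigned new_out = gen[b] | in[b];
          if (new_out == out[b])
            continue;
          out[b] = new_out;

          for (unsigned s = 0; s < NBLOCKS; s++)
            if (succ[b] & (1u << s))
              {
                in[s] |= new_out;
                if (b < s)
                  worklist |= 1u << s;  /* later block: same iteration */
                else
                  pending |= 1u << s;   /* earlier block: next iteration */
              }
        }
    }
  return visits;
}
```

Only the back edge 2 -> 1 defers work to the next outer iteration; every
forward edge is handled in the iteration that is already running.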

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for trunk?

Thanks,
Richard.

PR middle-end/26854
* df-core.cc (df_worklist_propagate_forward): Put later
blocks on worklist and only earlier blocks on pending.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Change the iteration
to process new blocks in the same iteration if that
maintains the iteration order.
---
 gcc/df-core.cc | 54 --
 1 file changed, 35 insertions(+), 19 deletions(-)

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index e5ae9ab9348..38f69ac5743 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -874,7 +874,8 @@ make_pass_df_finish (gcc::context *ctxt)
 /* Helper function for df_worklist_dataflow.
Propagate the dataflow forward.
Given a BB_INDEX, do the dataflow propagation
-   and set bits on for successors in PENDING
+   and set bits on for successors in PENDING for earlier
+   and WORKLIST for later in bbindex_to_postorder
if the out set of the dataflow has changed.
 
AGE specify time when BB was visited last time.
@@ -890,10 +891,11 @@ make_pass_df_finish (gcc::context *ctxt)
 
 static bool
 df_worklist_propagate_forward (struct dataflow *dataflow,
-   unsigned bb_index,
-   unsigned *bbindex_to_postorder,
-   bitmap pending,
-   sbitmap considered,
+  unsigned bb_index,
+  unsigned *bbindex_to_postorder,
+  bitmap worklist,
+  bitmap pending,
+  sbitmap considered,
   vec &last_change_age,
   int age)
 {
@@ -924,7 +926,13 @@ df_worklist_propagate_forward (struct dataflow *dataflow,
   unsigned ob_index = e->dest->index;
 
   if (bitmap_bit_p (considered, ob_index))
-bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
+   {
+ if (bbindex_to_postorder[bb_index]
+ < bbindex_to_postorder[ob_index])
+   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
+ else
+   bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
+   }
 }
   return true;
 }
@@ -937,10 +945,11 @@ df_worklist_propagate_forward (struct dataflow *dataflow,
 
 static bool
 df_worklist_propagate_backward (struct dataflow *dataflow,
-unsigned bb_index,
-unsigned *bbindex_to_postorder,
-bitmap pending,
-sbitmap considered,
+   unsigned bb_index,
+   unsigned *bbindex_to_postorder,
+   bitmap worklist,
+   bitmap pending,
+   sbitmap considered,
vec &last_change_age,
int age)
 {
@@ -971,7 +980,13 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
   unsigned ob_index = e->src->index;
 
   if (bitmap_bit_p (considered, ob_index))
-bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
+   {
+ if (bbindex_to_postorder[bb_index]
+ < bbindex_to_postorder[ob_index])
+   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
+ else
+   bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
+   }
 }
   return true;
 }
@@ -1021,36 +1036,37 @@ df_worklist_dataflow_doublequeue (struct dataflow *dataflow,
  and pending is for the next. */
   while (!bitmap_empty_p (pending))
 {
-  bitmap_iterator bi;
-  unsigned int index;
-
   std::swap (pending, worklist);
 
-  EXECUTE_IF_SET_IN_BITMAP (worklist, 0, index, bi)
+  do
{
+ unsigned index = bitmap_first_set_bit (worklist);
+ bitmap_clear_bit (worklist, index);
+
  unsigned bb_index;
  dcount++;
 
- bitmap_clear_bit (pending, index);
  bb_index = blocks_in_postorder[index];
  prev_age = last_visit_age[index];
  if (dir == DF_FORWARD)
changed = df_worklist_propagate_forward (dataflow, bb_index,
 bbindex_to_postorder,
-

Re: [PATCH] amdgcn: Add instruction patterns for vector operations on complex numbers

2023-02-14 Thread Andrew Stubbs


On 09/02/2023 20:13, Andrew Jenner wrote:
This patch introduces instruction patterns for complex number operations 
in the GCN machine description. These patterns are cmul, cmul_conj, 
vec_addsub, vec_fmaddsub, vec_fmsubadd, cadd90, cadd270, cmla and cmls 
(cmla_conj and cmls_conj were not found to be favorable to implement). 
As a side effect of adding cmls, I also added fms patterns corresponding 
to the existing fma patterns. Tested on CDNA2 GFX90a.


OK to commit?


gcc/ChangeLog:

* config/gcn/gcn-protos.h (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): Declare.
* config/gcn/gcn-valu.md (@dpp_swap_pairs)
(@dpp_distribute_even, @dpp_distribute_odd)
(cmul3, cml4, vec_addsub3)
(cadd3, vec_fmaddsub4, vec_fmsubadd4)
(fms4, fms4_negop2, fms4)
(fms4_negop2): New patterns.
* config/gcn/gcn.cc (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): New functions.
* config/gcn/gcn.md: Add entries to unspec enum.

gcc/testsuite/ChangeLog:

 * gcc.target/gcn/complex.c: New test.


+;; It would be possible to represent these without the UNSPEC as
+;;
+;; (vec_merge
+;;   (fma op1 op2 op3)
+;;   (fma op1 op2 (neg op3))
+;;   (merge-const))
+;;
+;; But this doesn't seem useful in practice.
+
+(define_expand "vec_fmaddsub4"
+  [(set (match_operand:V_noHI 0 "register_operand" "=&v")
+(unspec:V_noHI
+  [(match_operand:V_noHI 1 "register_operand" "v")
+   (match_operand:V_noHI 2 "register_operand" "v")
+   (match_operand:V_noHI 3 "register_operand" "v")]
+  UNSPEC_FMADDSUB))]

This is a define_expand pattern that has a custom-code expansion with an 
unconditional "DONE", so the actual RTL representation is irrelevant 
here: it only needs to have the match_operand entries. The 
UNSPEC_FMADDSUB is therefore dead (as in, it will never appear in the 
IR). We can safely remove those, although I don't hate them for 
readability purposes.


The UNSPEC_CMUL and UNSPEC_CMUL_CONJ are similarly "dead", but since you 
use them for an iterator they're still useful in the machine description.


+(define_insn "fms4"
+  [(set (match_operand:V_FP 0 "register_operand"  "=  v,   v")
+   (fma:V_FP
+ (match_operand:V_FP 1 "gcn_alu_operand" "% vA,  vA")
+   (match_operand:V_FP 2 "gcn_alu_operand" "  vA,vSvA")
+   (neg:V_FP
+ (match_operand:V_FP 3 "gcn_alu_operand" "vSvA,  vA"]
+  ""
+  "v_fma%i0\t%0, %1, %2, -%3"
+  [(set_attr "type" "vop3a")
+   (set_attr "length" "8")])

Please ensure that the alternatives are vertically aligned in the same 
style as the rest of the file.


+/* Generate DPP pairwise swap instruction.
+   The opcode is given by INSN.  */
+
+char *
+gcn_expand_dpp_swap_pairs_insn (machine_mode mode, const char *insn,
+   int ARG_UNUSED (unspec))

+/* Generate DPP distribute even instruction.
+   The opcode is given by INSN.  */
+
+char *
+gcn_expand_dpp_distribute_even_insn (machine_mode mode, const char *insn,
+int ARG_UNUSED (unspec))

+/* Generate DPP distribute odd instruction.
+   The opcode is given by INSN.  */
+
+char *
+gcn_expand_dpp_distribute_odd_insn (machine_mode mode, const char *insn,
+   int ARG_UNUSED (unspec))

Please add a comment that isn't just the function name in words. Explain 
what operation happens here and maybe show an example of what it produces.


+++ b/gcc/testsuite/gcc.target/gcn/complex.c
@@ -0,0 +1,640 @@
+// { dg-do run }
+// { dg-options "-O -fopenmp-simd -ftree-loop-if-convert -fno-ssa-phiopt" }

Does the -fopenmp-simd option do anything here? There are no "omp 
declare simd" directives.


+void cmulF(float *td, float *te, float *tf, float *tg, int tas)
+{
+  typedef _Complex float complexT;
+  int array_size = tas/2;
+  complexT *d = (complexT*)(td);
+  complexT *e = (complexT*)(te);
+  complexT *f = (complexT*)(tf);
+#pragma omp target teams distribute parallel for simd
+  for (int i = 0; i < array_size; i++)
+{
+  d[i] = e[i] * f[i];
+}
+}

Tests in gcc.target/gcn won't do anything with "omp target" directives. 
I would expect the loop to vectorize without, at -O2 or above (or "-O1 
-ftree-vectorize"), but you might find the output easier to read with 
"__restrict" on the parameters as that will avoid emitting the runtime 
alias check and scalar code implementation.


I'd also expect you to have to do something to avoid inlining.
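
For illustration, a minimal sketch of what the two suggestions above could
look like when combined.  This is not the test from the patch: the name
cmul_f and its signature are made up, and it only assumes that "__restrict"
and the noinline attribute are accepted by the compiler (both are GNU
extensions).

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical reduced kernel.  "__restrict" promises that the three
   arrays do not overlap, so no runtime alias check is needed, and
   noinline keeps the loop in its own function when it is called from
   a test driver in the same file.  */
__attribute__ ((noinline)) void
cmul_f (float _Complex *__restrict d, const float _Complex *__restrict e,
        const float _Complex *__restrict f, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    d[i] = e[i] * f[i];
}
```

Whether the loop is then actually vectorized still depends on the options
and the target, so the expected instructions would need to be checked in
the generated assembly.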

+  td = (float*)omp_aligned_alloc(ALIGNMENT, sizeof(float)*array_size, 
omp_default_mem_alloc);
+  te = (float*)omp_aligned_alloc(ALIGNMENT, sizeof(float)*array_size, 
omp_default_mem_alloc);
+  tf = (float*)omp_aligned_alloc(ALIGNMENT, sizeof(float)*array_size, 
omp_default_mem_alloc);
+  tg = (float*)omp_aligned_alloc(ALIGNMENT, sizeof(float)*a

[PATCH] ipa: Avoid IPA confusing scalar values and single-field aggregates (PR 108679)

2023-02-14 Thread Martin Jambor

Hi,

The PR 108679 testcase shows a situation in which IPA-CP is able to track a
scalar constant in a single-field structure that is part of a bigger
structure.  This smaller structure is, however, also passed in a few calls
to other functions.  The two same-but-different entities are originally
placed at the same offset and have the same size, and this confuses the
mechanism that takes care of handling call statements after IPA-SRA.

I think that in stage 4 it is best to revert to GCC 12 behavior in this
particular case (when IPA-CP detects a constant in a single-field
structure or a single element array that is part of a bigger aggregate)
and the patch below does that.  If accepted, I plan to file a
missed-optimization bug to track that we could use the IPA-CP propagated
value to re-construct the small aggregate arguments.
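
To make the "same offset, same size" point concrete, here is a small
sketch using the struct layout from the testcase.  The function name is
hypothetical and the code says nothing about how IPA-CP or IPA-SRA
represent these entities internally; it only shows that the aggregate
member and the scalar inside it occupy the same bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S1 { int f0; };
struct S2 { struct S1 f2; short f8; };

/* Return nonzero if the aggregate S.f2 and the scalar S.f2.f0 hold
   byte-identical contents for the value V.  */
int
scalar_and_aggregate_agree (int v)
{
  struct S2 s = { { v }, 0 };
  struct S1 as_aggregate = s.f2;  /* What a callee taking struct S1 gets.  */
  int as_scalar = s.f2.f0;        /* The scalar stored inside it.  */

  /* Same size; both also start at offset 0 of struct S2.  */
  assert (sizeof as_aggregate == sizeof as_scalar);
  return memcmp (&as_aggregate, &as_scalar, sizeof as_scalar) == 0;
}
```

An offset and a size alone therefore cannot tell the two apart; the type
has to be consulted as well, which is what the AGGREGATE_TYPE_P check in
the patch does.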

Bootstrapped and tested and LTO bootstrapped on x86_64-linux.  OK for
master?

Thanks,

Martin




gcc/ChangeLog:

2023-02-13  Martin Jambor  

PR ipa/108679
* ipa-sra.cc (push_param_adjustments_for_index): Do not omit
creation of non-scalar replacements even if IPA-CP knows their
contents.

gcc/testsuite/ChangeLog:

2023-02-13  Martin Jambor  

PR ipa/108679
* gcc.dg/ipa/pr108679.c: New test.
---
 gcc/ipa-sra.cc  |  2 +-
 gcc/testsuite/gcc.dg/ipa/pr108679.c | 25 +
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr108679.c

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 0495f446bf4..3de7d426b7e 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -3989,7 +3989,7 @@ push_param_adjustments_for_index (isra_func_summary *ifs, 
unsigned base_index,
{
  ipa_argagg_value_list avl (ipcp_ts);
  tree value = avl.get_value (base_index, pa->unit_offset);
- if (value)
+ if (value && !AGGREGATE_TYPE_P (pa->type))
{
  if (dump_file)
fprintf (dump_file, "- omitting component at byte "
diff --git a/gcc/testsuite/gcc.dg/ipa/pr108679.c 
b/gcc/testsuite/gcc.dg/ipa/pr108679.c
new file mode 100644
index 000..b1ed50bb831
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr108679.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct S1 {
+  signed f0;
+};
+struct S2 {
+  struct S1 f2;
+  short f8;
+} g_18;
+void safe_lshift_func_int16_t_s_u();
+void safe_unary_minus_func_uint64_t_u();
+int safe_mul_func_uint8_t_u_u(int, struct S1 p_14);
+int g_732, func_6_l_17;
+static int *func_12();
+static int func_6(struct S2 p_7) { func_12(func_6_l_17, p_7.f2, g_18, 0); }
+static int *func_12(int, struct S1 p_14) {
+  safe_lshift_func_int16_t_s_u();
+  safe_unary_minus_func_uint64_t_u();
+  g_732 = safe_mul_func_uint8_t_u_u(0, p_14);
+}
+int main() {
+  struct S2 l_10 = {3};
+  func_6(l_10);
+}
-- 
2.39.1

[PATCH] More DF worklist solver improvements

2023-02-14 Thread Richard Biener via Gcc-patches

The following replaces the double-queue iteration solver with a
single-queue one that prioritizes backward data flow solving
over forward progress.  That is, it first converges on earlier
cycles before propagating the (possibly again changing) data
to the following blocks.  This improves data locality and
possibly avoids visiting later blocks multiple times, but it
might also cause inner cycles to be iterated multiple times,
so it's possibly not always an improvement.  With the
rev_post_order_and_mark_dfs_back_seme API it would be possible
to iterate only outermost cycles immediately, like we do for
var-tracking.

For the PR26854 all.i testcase it halves DF RD processing time
again (on top of the previous improvement).

I wanted to show off the potential, but I will not push this
now; instead I will look into finding the cycles so this can be
done var-tracking style.
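
The single-queue idea can be sketched outside of GCC as follows.  This is
a toy model, not the df-core.cc code: the struct toy_cfg type, the
function names, the 64-block limit and the use of a plain uint64_t as the
worklist are all assumptions made for the example, and the block index is
used directly as its position in the iteration order.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 64

/* Hypothetical toy CFG.  The block index doubles as the position in
   the iteration order.  */
struct toy_cfg
{
  int n_blocks;
  uint64_t succs[MAX_BLOCKS];  /* Bit S set means an edge B -> S.  */
  uint64_t gen[MAX_BLOCKS];    /* Facts generated in block B.  */
};

/* Return the index of the lowest set bit of a nonzero WORD.  */
static int
lowest_set_bit (uint64_t word)
{
  int i = 0;
  while (!(word & 1))
    {
      word >>= 1;
      i++;
    }
  return i;
}

/* Solve OUT[B] = GEN[B] | union of OUT over the predecessors of B
   using a single worklist.  Return the number of block visits.  */
int
solve_single_queue (const struct toy_cfg *cfg, uint64_t *out)
{
  assert (cfg->n_blocks > 0 && cfg->n_blocks <= MAX_BLOCKS);

  uint64_t in[MAX_BLOCKS] = { 0 };
  uint64_t worklist = (cfg->n_blocks == MAX_BLOCKS
                       ? ~UINT64_C (0)
                       : (UINT64_C (1) << cfg->n_blocks) - 1);
  int visits = 0;

  for (int b = 0; b < cfg->n_blocks; b++)
    out[b] = 0;

  while (worklist)
    {
      /* Always take the earliest pending block, so a change fed back
         over a backedge is processed before any later block.  */
      int b = lowest_set_bit (worklist);
      worklist &= worklist - 1;
      visits++;

      uint64_t new_out = cfg->gen[b] | in[b];
      if (new_out == out[b])
        continue;
      out[b] = new_out;

      /* The out set changed: queue every successor on the same list.  */
      for (int s = 0; s < cfg->n_blocks; s++)
        if (cfg->succs[b] & (UINT64_C (1) << s))
          {
            in[s] |= new_out;
            worklist |= UINT64_C (1) << s;
          }
    }
  return visits;
}
```

On a CFG 0 -> 1 -> 2 -> 3 with a backedge 2 -> 1, the loop formed by
blocks 1 and 2 is iterated until it is stable and block 3 is then visited
once with the final data, which is the locality effect described above.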

* df-core.cc (df_worklist_propagate_forward): Always
add to worklist.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Change to a single
queue worklist implementation.
---
 gcc/df-core.cc | 49 ++---
 1 file changed, 14 insertions(+), 35 deletions(-)

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index 38f69ac5743..30c1bfcc314 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -874,8 +874,7 @@ make_pass_df_finish (gcc::context *ctxt)
 /* Helper function for df_worklist_dataflow.
Propagate the dataflow forward.
Given a BB_INDEX, do the dataflow propagation
-   and set bits on for successors in PENDING for earlier
-   and WORKLIST for later in bbindex_to_postorder
+   and set bits on WORKLIST for successors
if the out set of the dataflow has changed.
 
AGE specify time when BB was visited last time.
@@ -894,7 +893,6 @@ df_worklist_propagate_forward (struct dataflow *dataflow,
   unsigned bb_index,
   unsigned *bbindex_to_postorder,
   bitmap worklist,
-  bitmap pending,
   sbitmap considered,
   vec &last_change_age,
   int age)
@@ -926,13 +924,7 @@ df_worklist_propagate_forward (struct dataflow *dataflow,
   unsigned ob_index = e->dest->index;
 
   if (bitmap_bit_p (considered, ob_index))
-   {
- if (bbindex_to_postorder[bb_index]
- < bbindex_to_postorder[ob_index])
-   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
- else
-   bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
-   }
+   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
 }
   return true;
 }
@@ -948,7 +940,6 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
unsigned bb_index,
unsigned *bbindex_to_postorder,
bitmap worklist,
-   bitmap pending,
sbitmap considered,
vec &last_change_age,
int age)
@@ -980,13 +971,7 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
   unsigned ob_index = e->src->index;
 
   if (bitmap_bit_p (considered, ob_index))
-   {
- if (bbindex_to_postorder[bb_index]
- < bbindex_to_postorder[ob_index])
-   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
- else
-   bitmap_set_bit (pending, bbindex_to_postorder[ob_index]);
-   }
+   bitmap_set_bit (worklist, bbindex_to_postorder[ob_index]);
 }
   return true;
 }
@@ -995,17 +980,17 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
 
 /* Main dataflow solver loop.
 
-   DATAFLOW is problem we are solving, PENDING is worklist of basic blocks we
+   DATAFLOW is problem we are solving, WORKLIST is worklist of basic blocks we
need to visit.
BLOCK_IN_POSTORDER is array of size N_BLOCKS specifying postorder in BBs and
BBINDEX_TO_POSTORDER is array mapping back BB->index to postorder position.
-   PENDING will be freed.
+   WORKLIST will be freed.
 
The worklists are bitmaps indexed by postorder positions.  
 
-   The function implements standard algorithm for dataflow solving with two
-   worklists (we are processing WORKLIST and storing new BBs to visit in
-   PENDING).
+   The function implements standard algorithm for dataflow solving
+   (we are processing WORKLIST, storing new BBs to the same list
+   to visit dataflow changes across backedges first).
 
As an optimization we maintain ages when BB was changed (stored in
last_change_age) and when it was last visited (stored in last_visit_age).
@@ -1014,15 +999,14 @@ df_worklist_propagate_backward (struct dataflow

[PATCH] RISC-V: Replace simm32_p with immediate_operand (Pmode)

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

simm32_p is used to check whether a constant int value fits within 32 bits.
It is used when handling SEW = 64 on rv32 systems, since a constant int
value that fits in 32 bits allows us to use a vx instruction.

The current implementation of simm32_p is quite ugly, and I have now figured
out that immediate_operand (op, Pmode) can check whether op
is a constant value within 32 bits.
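
As a standalone illustration of the property being tested (a value that
survives sign extension from 32 bits), here is a sketch in plain C.  It is
not the GCC helper and does not use rtx or HOST_WIDE_INT; the function
name fits_simm32 and the constants in the unsigned form are chosen for
this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if VAL is representable as a sign-extended 32-bit
   immediate.  */
int
fits_simm32 (int64_t val)
{
  int by_range = val >= INT32_MIN && val <= INT32_MAX;

  /* The same range expressed on the unsigned bit pattern: either the
     upper 33 bits are all zero or they are all one.  */
  uint64_t u = (uint64_t) val;
  int by_unsigned = (u <= UINT64_C (0x7FFFFFFF)
                     || u >= UINT64_C (0xFFFFFFFF80000000));

  assert (by_range == by_unsigned);
  return by_range;
}
```

The signed comparison is the easier one to read; the unsigned form is
shown only because it avoids signed comparisons on the bit pattern.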

I already have a bunch of testcases that test SEW = 64 on rv32 systems, and
all regression tests pass with this patch.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (simm32_p):
* config/riscv/riscv-v.cc (simm32_p):
* config/riscv/vector.md:

---
 gcc/config/riscv/riscv-protos.h |  1 -
 gcc/config/riscv/riscv-v.cc | 10 --
 gcc/config/riscv/vector.md  | 34 -
 3 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 9d8b0b78a06..ee8e903ddf5 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -176,7 +176,6 @@ enum tail_policy get_prefer_tail_policy ();
 enum mask_policy get_prefer_mask_policy ();
 rtx get_avl_type_rtx (enum avl_type);
 opt_machine_mode get_vector_mode (scalar_mode, poly_uint64);
-extern bool simm32_p (rtx);
 extern bool simm5_p (rtx);
 extern bool neg_simm5_p (rtx);
 #ifdef RTX_CODE
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 600b2e6ecad..dd70bf9b541 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -396,16 +396,6 @@ get_vector_mode (scalar_mode inner_mode, poly_uint64 
nunits)
   return opt_machine_mode ();
 }
 
-/* Helper functions for handling sew=64 on RV32 system. */
-bool
-simm32_p (rtx x)
-{
-  if (!CONST_INT_P (x))
-return false;
-  unsigned HOST_WIDE_INT val = UINTVAL (x);
-  return val <= 0x7FFFULL || val >= 0x8000ULL;
-}
-
 bool
 simm5_p (rtx x)
 {
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index dc79aa230bd..b6e67e94f67 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -889,7 +889,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[3]))
+   if (immediate_operand (operands[3], Pmode))
  operands[3] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[3]));
else
@@ -1479,7 +1479,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  {
if (!rtx_equal_p (operands[4], const0_rtx))
  operands[4] = force_reg (Pmode, operands[4]);
@@ -1572,7 +1572,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  {
if (!rtx_equal_p (operands[4], const0_rtx))
  operands[4] = force_reg (Pmode, operands[4]);
@@ -1665,7 +1665,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  operands[4] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[4]));
else
@@ -1820,7 +1820,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  operands[4] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[4]));
else
@@ -1907,7 +1907,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  operands[4] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[4]));
else
@@ -2034,7 +2034,7 @@
   {
rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[4]))
+   if (immediate_operand (operands[4], Pmode))
  {
if (!rtx_equal_p (operands[4], const0_rtx))
  operands[4] = force_reg (Pmode, operands[4]);
@@ -2226,7 +2226,7 @@
   {
 rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[3]))
+   if (immediate_operand (operands[3], Pmode))
  operands[3] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[3]));
 else
@@ -2320,7 +2320,7 @@
   {
 rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[3]))
+   if (immediate_operand (operands[3], Pmode))
  {
if (!rtx_equal_p (operands[3], const0_rtx))
  operands[3] = force_reg (Pmode, operands[3]);
@@ -2497,7 +2497,7 @@
   {
 rtx v = gen_reg_rtx (mode);
 
-   if (riscv_vector::simm32_p (operands[2]))
+   if (immediate_operand (operands[2], Pmode))
  operands[2] = gen_rtx_SIGN_EXTEND (mode,
force_reg (Pmode, operands[2]));

Re: [PATCH] ipa: Avoid IPA confusing scalar values and single-field aggregates (PR 108679)

2023-02-14 Thread Richard Biener via Gcc-patches

On Tue, Feb 14, 2023 at 3:50 PM Martin Jambor  wrote:
>
> Hi,
>
> PR 108679 testcase shows a situation when IPA-CP is able to track a
> scalar constant in a single-field structure that is part of a bigger
> structure.  This smaller struture is however also passed in a few calls
> to other functions, but the two same-but-different entities, originally
> places at the same offset and with the same size, and this confuses the
> mechanism that takes care of handling call statements after IPA-SRA.
>
> I think that in stage 4 it is best to revert to GCC 12 behavior in this
> particular case (when IPA-CP detects a constant in a single-field
> structure or a single element array that is part of a bigger aggregate)
> and the patch below does that.  If accepted, I plan to file a
> missed-optimization bug to track that we could use the IPA-CP propagated
> value to re-construct the small aggregate arguments.
>
> Bootstrapped and tested and LTO bootstrapped on x86_64-linux.  OK for
> master?

OK.

Richard.

> Thanks,
>
> Martin
>
>
>
>
> gcc/ChangeLog:
>
> 2023-02-13  Martin Jambor  
>
> PR ipa/108679
> * ipa-sra.cc (push_param_adjustments_for_index): Do not omit
> creation of non-scalar replacements even if IPA-CP knows their
> contents.
>
> gcc/testsuite/ChangeLog:
>
> 2023-02-13  Martin Jambor  
>
> PR ipa/108679
> * gcc.dg/ipa/pr108679.c: New test.
> ---
>  gcc/ipa-sra.cc  |  2 +-
>  gcc/testsuite/gcc.dg/ipa/pr108679.c | 25 +
>  2 files changed, 26 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr108679.c
>
> diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
> index 0495f446bf4..3de7d426b7e 100644
> --- a/gcc/ipa-sra.cc
> +++ b/gcc/ipa-sra.cc
> @@ -3989,7 +3989,7 @@ push_param_adjustments_for_index (isra_func_summary 
> *ifs, unsigned base_index,
> {
>   ipa_argagg_value_list avl (ipcp_ts);
>   tree value = avl.get_value (base_index, pa->unit_offset);
> - if (value)
> + if (value && !AGGREGATE_TYPE_P (pa->type))
> {
>   if (dump_file)
> fprintf (dump_file, "- omitting component at byte "
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr108679.c 
> b/gcc/testsuite/gcc.dg/ipa/pr108679.c
> new file mode 100644
> index 000..b1ed50bb831
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/pr108679.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +struct S1 {
> +  signed f0;
> +};
> +struct S2 {
> +  struct S1 f2;
> +  short f8;
> +} g_18;
> +void safe_lshift_func_int16_t_s_u();
> +void safe_unary_minus_func_uint64_t_u();
> +int safe_mul_func_uint8_t_u_u(int, struct S1 p_14);
> +int g_732, func_6_l_17;
> +static int *func_12();
> +static int func_6(struct S2 p_7) { func_12(func_6_l_17, p_7.f2, g_18, 0); }
> +static int *func_12(int, struct S1 p_14) {
> +  safe_lshift_func_int16_t_s_u();
> +  safe_unary_minus_func_uint64_t_u();
> +  g_732 = safe_mul_func_uint8_t_u_u(0, p_14);
> +}
> +int main() {
> +  struct S2 l_10 = {3};
> +  func_6(l_10);
> +}
> --
> 2.39.1
>

Re: [og12] In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE' (was: [PATCH] libgomp, OpenMP, nvptx: Low-latency memory allocator)

2023-02-14 Thread Andrew Stubbs


On 14/02/2023 12:54, Thomas Schwinge wrote:

Hi Andrew!

On 2022-01-13T11:13:51+, Andrew Stubbs  wrote:

Updated patch: this version fixes some missed cases of malloc in the
realloc implementation.


Right, and as it seems I've run into another issue: a stray 'free'.


--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c


Re 'omp_realloc':


@@ -660,9 +709,10 @@ retry:
gomp_mutex_unlock (&allocator_data->lock);
  #endif
if (prev_size)
- new_ptr = realloc (data->ptr, new_size);
+ new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+ data->size, new_size);
else
- new_ptr = malloc (new_size);
+ new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
if (new_ptr == NULL)
   {
  #ifdef HAVE_SYNC_BUILTINS
@@ -690,7 +740,11 @@ retry:
  && (free_allocator_data == NULL
  || free_allocator_data->pool_size == ~(uintptr_t) 0))
  {
-  new_ptr = realloc (data->ptr, new_size);
+  omp_memspace_handle_t memspace __attribute__((unused))
+ = (allocator_data
+? allocator_data->memspace
+: predefined_alloc_mapping[allocator]);
+  new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
if (new_ptr == NULL)
   goto fail;
ret = (char *) new_ptr + sizeof (struct omp_mem_header);
@@ -701,7 +755,11 @@ retry:
  }
else
  {
-  new_ptr = malloc (new_size);
+  omp_memspace_handle_t memspace __attribute__((unused))
+ = (allocator_data
+? allocator_data->memspace
+: predefined_alloc_mapping[allocator]);
+  new_ptr = MEMSPACE_ALLOC (memspace, new_size);
if (new_ptr == NULL)
   goto fail;
  }
@@ -735,32 +793,35 @@ retry:

|free (data->ptr);

return ret;


I run into a SIGSEGV if a non-'malloc'-based allocation is 'free'd here.

The attached
"In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE'"
appears to resolve my issue, but not yet regression-tested.  Does that
look correct to you?


That looks correct. The only remaining use of "free" should be the one 
referring to the allocator object itself (i.e. the destructor).



Or, instead of invoking 'MEMSPACE_FREE', should we scrap the
'used_pool_size' bookkeeping here, and just invoke 'omp_free' instead?

 --- libgomp/allocator.c
 +++ libgomp/allocator.c
 @@ -842,19 +842,7 @@ retry:
if (old_size - old_alignment < size)
  size = old_size - old_alignment;
memcpy (ret, ptr, size);
 -  if (__builtin_expect (free_allocator_data
 -   && free_allocator_data->pool_size < ~(uintptr_t) 0, 0))
 -{
 -#ifdef HAVE_SYNC_BUILTINS
 -  __atomic_add_fetch (&free_allocator_data->used_pool_size, 
-data->size,
 - MEMMODEL_RELAXED);
 -#else
 -  gomp_mutex_lock (&free_allocator_data->lock);
 -  free_allocator_data->used_pool_size -= data->size;
 -  gomp_mutex_unlock (&free_allocator_data->lock);
 -#endif
 -}
 -  free (data->ptr);
 +  ialias_call (omp_free) (ptr, free_allocator);
return ret;

(I've not yet analyzed whether that's completely equivalent.)


The used_pool_size code comes from upstream, so if you want to go beyond 
the mechanical substitution of "free" then you're adding a new patch 
(rather than tweaking an old one). I'll leave that for others to comment on.


Andrew

[PATCH] Improve VN PHI hash table handling

2023-02-14 Thread Richard Biener via Gcc-patches

The hash function of PHIs is weak since we want to be able to CSE
them even across basic-blocks in some cases.  The following avoids
weakening the hash for cases we are never going to CSE, reducing
the number of collisions and avoiding redundant work in the
hash and equality functions.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-ssa-sccvn.cc (vn_phi_compute_hash): Key skipping
basic block index hashing on the availability of ->cclhs.
(vn_phi_eq): Avoid re-doing sanity checks for CSE but
rely on ->cclhs availability.
(vn_phi_lookup): Set ->cclhs only when we are eventually
going to CSE the PHI.
(vn_phi_insert): Likewise.
---
 gcc/tree-ssa-sccvn.cc | 77 ---
 1 file changed, 43 insertions(+), 34 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index e5bb278196a..8ee77fd2b78 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -4629,9 +4629,9 @@ vn_phi_compute_hash (vn_phi_t vp1)
 case 1:
   break;
 case 2:
-  if (vp1->block->loop_father->header == vp1->block)
-   ;
-  else
+  /* When this is a PHI node subject to CSE for different blocks
+avoid hashing the block index.  */
+  if (vp1->cclhs)
break;
   /* Fallthru.  */
 default:
@@ -4715,32 +4715,33 @@ vn_phi_eq (const_vn_phi_t const vp1, const_vn_phi_t 
const vp2)
 
case 2:
  {
-   /* Rule out backedges into the PHI.  */
-   if (vp1->block->loop_father->header == vp1->block
-   || vp2->block->loop_father->header == vp2->block)
+   /* Make sure both PHIs are classified as CSEable.  */
+   if (! vp1->cclhs || ! vp2->cclhs)
  return false;
 
+   /* Rule out backedges into the PHI.  */
+   gcc_checking_assert
+ (vp1->block->loop_father->header != vp1->block
+  && vp2->block->loop_father->header != vp2->block);
+
/* If the PHI nodes do not have compatible types
   they are not the same.  */
if (!types_compatible_p (vp1->type, vp2->type))
  return false;
 
+   /* If the immediate dominator end in switch stmts multiple
+  values may end up in the same PHI arg via intermediate
+  CFG merges.  */
basic_block idom1
  = get_immediate_dominator (CDI_DOMINATORS, vp1->block);
basic_block idom2
  = get_immediate_dominator (CDI_DOMINATORS, vp2->block);
-   /* If the immediate dominator end in switch stmts multiple
-  values may end up in the same PHI arg via intermediate
-  CFG merges.  */
-   if (EDGE_COUNT (idom1->succs) != 2
-   || EDGE_COUNT (idom2->succs) != 2)
- return false;
+   gcc_checking_assert (EDGE_COUNT (idom1->succs) == 2
+&& EDGE_COUNT (idom2->succs) == 2);
 
/* Verify the controlling stmt is the same.  */
-   gcond *last1 = safe_dyn_cast  (last_stmt (idom1));
-   gcond *last2 = safe_dyn_cast  (last_stmt (idom2));
-   if (! last1 || ! last2)
- return false;
+   gcond *last1 = as_a  (last_stmt (idom1));
+   gcond *last2 = as_a  (last_stmt (idom2));
bool inverted_p;
if (! cond_stmts_equal_p (last1, vp1->cclhs, vp1->ccrhs,
  last2, vp2->cclhs, vp2->ccrhs,
@@ -4835,15 +4836,19 @@ vn_phi_lookup (gimple *phi, bool backedges_varying_p)
   /* Extract values of the controlling condition.  */
   vp1->cclhs = NULL_TREE;
   vp1->ccrhs = NULL_TREE;
-  basic_block idom1 = get_immediate_dominator (CDI_DOMINATORS, vp1->block);
-  if (EDGE_COUNT (idom1->succs) == 2)
-if (gcond *last1 = safe_dyn_cast  (last_stmt (idom1)))
-  {
-   /* ???  We want to use SSA_VAL here.  But possibly not
-  allow VN_TOP.  */
-   vp1->cclhs = vn_valueize (gimple_cond_lhs (last1));
-   vp1->ccrhs = vn_valueize (gimple_cond_rhs (last1));
-  }
+  if (EDGE_COUNT (vp1->block->preds) == 2
+  && vp1->block->loop_father->header != vp1->block)
+{
+  basic_block idom1 = get_immediate_dominator (CDI_DOMINATORS, vp1->block);
+  if (EDGE_COUNT (idom1->succs) == 2)
+   if (gcond *last1 = safe_dyn_cast  (last_stmt (idom1)))
+ {
+   /* ???  We want to use SSA_VAL here.  But possibly not
+  allow VN_TOP.  */
+   vp1->cclhs = vn_valueize (gimple_cond_lhs (last1));
+   vp1->ccrhs = vn_valueize (gimple_cond_rhs (last1));
+ }
+}
   vp1->hashcode = vn_phi_compute_hash (vp1);
   slot = valid_info->phis->find_slot_with_hash (vp1, vp1->hashcode, NO_INSERT);
   if (!slot)
@@ -4885,15 +4890,19 @@ vn_phi_insert (gimple *phi, tree result, bool 
backedges_varying_p)
   /* Extract values of the controlling condition.  */
   vp1->cclhs = NULL_TREE;
   v

[PATCH] Fix possible sanopt compile-time hog

2023-02-14 Thread Richard Biener via Gcc-patches

While working on bitmap operations I figured sanopt.cc uses
a sbitmap worklist, iterating using bitmap_first_set_bit on it.
That's quadratic since bitmap_first_set_bit for sbitmap is O(n).

The fix is to use regular bitmaps for the worklist and the bitmap
feeding it and to avoid a useless copy.
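
The quadratic behavior can be reproduced with a toy flat bit vector.  This
sketch is not GCC's sbitmap; the function name and the "words examined"
counter are made up for the illustration.  It drains a worklist by
searching for the first set bit from the start each time, as a flat
bit vector has to.

```c
#include <assert.h>
#include <stdint.h>

/* Clear all bits of WORDS[0 .. N_WORDS-1] one at a time, restarting
   the search for the first set bit at word 0 for every bit.  Return
   how many words were examined in total.  */
unsigned long
drain_by_rescanning (uint64_t *words, int n_words)
{
  assert (n_words >= 0);

  unsigned long examined = 0;
  for (;;)
    {
      int w = 0;
      /* Skip over the words that are already empty.  */
      while (w < n_words && words[w] == 0)
        {
          examined++;
          w++;
        }
      if (w == n_words)
        break;
      examined++;
      words[w] &= words[w] - 1;  /* Clear the lowest set bit.  */
    }
  return examined;
}
```

With one bit set in each of n words the function examines
n * (n + 1) / 2 + n words, so the cost of draining the list grows
quadratically with the size of the vector.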

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

* sanopt.cc (sanitize_asan_mark_unpoison): Use bitmap
for with_poison and alias worklist to it.
(sanitize_asan_mark_poison): Likewise.
---
 gcc/sanopt.cc | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/gcc/sanopt.cc b/gcc/sanopt.cc
index aed22850079..b356a21eca3 100644
--- a/gcc/sanopt.cc
+++ b/gcc/sanopt.cc
@@ -987,15 +987,11 @@ static void
 sanitize_asan_mark_unpoison (void)
 {
   /* 1) Find all BBs that contain an ASAN_MARK poison call.  */
-  auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1);
-  bitmap_clear (with_poison);
+  auto_bitmap with_poison;
   basic_block bb;
 
   FOR_EACH_BB_FN (bb, cfun)
 {
-  if (bitmap_bit_p (with_poison, bb->index))
-   continue;
-
   gimple_stmt_iterator gsi;
   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
{
@@ -1010,8 +1006,8 @@ sanitize_asan_mark_unpoison (void)
 
   auto_sbitmap poisoned (last_basic_block_for_fn (cfun) + 1);
   bitmap_clear (poisoned);
-  auto_sbitmap worklist (last_basic_block_for_fn (cfun) + 1);
-  bitmap_copy (worklist, with_poison);
+  /* We now treat with_poison as worklist.  */
+  bitmap worklist = with_poison;
 
   /* 2) Propagate the information to all reachable blocks.  */
   while (!bitmap_empty_p (worklist))
@@ -1088,8 +1084,7 @@ static void
 sanitize_asan_mark_poison (void)
 {
   /* 1) Find all BBs that possibly contain an ASAN_CHECK.  */
-  auto_sbitmap with_check (last_basic_block_for_fn (cfun) + 1);
-  bitmap_clear (with_check);
+  auto_bitmap with_check;
   basic_block bb;
 
   FOR_EACH_BB_FN (bb, cfun)
@@ -1108,8 +1103,8 @@ sanitize_asan_mark_poison (void)
 
   auto_sbitmap can_reach_check (last_basic_block_for_fn (cfun) + 1);
   bitmap_clear (can_reach_check);
-  auto_sbitmap worklist (last_basic_block_for_fn (cfun) + 1);
-  bitmap_copy (worklist, with_check);
+  /* We now treat with_check as worklist.  */
+  bitmap worklist = with_check;
 
   /* 2) Propagate the information to all definitions blocks.  */
   while (!bitmap_empty_p (worklist))
-- 
2.35.3

[PATCH] RISC-V: Remove "extern" for namespace [NFC]

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

This follows other targets: the aarch64_sve namespace in aarch64-protos.h,
arm_mve in arm-protos, and the nds namespace in nds32-protos.h.

None of them use 'extern' inside the namespace.
This is an NFC patch to make RISC-V consistent with other targets.
No functionality change.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_run_selftests): Remove 'extern'.
(init_builtins): Ditto.
(mangle_builtin_type): Ditto.
(verify_type_context): Ditto.
(handle_pragma_vector):  Ditto.
(builtin_decl): Ditto.
(expand_builtin): Ditto.
(const_vec_all_same_in_range_p): Ditto.
(legitimize_move): Ditto.
(emit_vlmax_op): Ditto.
(emit_nonvlmax_op): Ditto.
(get_vlmul): Ditto.
(get_ratio): Ditto.
(get_ta): Ditto.
(get_ma): Ditto.
(get_avl_type): Ditto.
(calculate_ratio): Ditto.
(enum vlmul_type): Ditto.
(simm5_p): Ditto.
(neg_simm5_p): Ditto.
(has_vi_variant_p): Ditto.

---
 gcc/config/riscv/riscv-protos.h | 40 -
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ee8e903ddf5..81ad2eabc00 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -115,7 +115,7 @@ extern const riscv_cpu_info *riscv_find_cpu (const char *);
 /* Routines implemented in riscv-selftests.cc.  */
 #if CHECKING_P
 namespace selftest {
-extern void riscv_run_selftests (void);
+void riscv_run_selftests (void);
 } // namespace selftest
 #endif
 
@@ -141,24 +141,24 @@ enum avl_type
   VLMAX,
 };
 /* Routines implemented in riscv-vector-builtins.cc.  */
-extern void init_builtins (void);
-extern const char *mangle_builtin_type (const_tree);
+void init_builtins (void);
+const char *mangle_builtin_type (const_tree);
 #ifdef GCC_TARGET_H
-extern bool verify_type_context (location_t, type_context_kind, const_tree, 
bool);
+bool verify_type_context (location_t, type_context_kind, const_tree, bool);
 #endif
-extern void handle_pragma_vector (void);
-extern tree builtin_decl (unsigned, bool);
-extern rtx expand_builtin (unsigned int, tree, rtx);
-extern bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
-extern bool legitimize_move (rtx, rtx, machine_mode);
-extern void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
-extern void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
-extern enum vlmul_type get_vlmul (machine_mode);
-extern unsigned int get_ratio (machine_mode);
-extern int get_ta (rtx);
-extern int get_ma (rtx);
-extern int get_avl_type (rtx);
-extern unsigned int calculate_ratio (unsigned int, enum vlmul_type);
+void handle_pragma_vector (void);
+tree builtin_decl (unsigned, bool);
+rtx expand_builtin (unsigned int, tree, rtx);
+bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool legitimize_move (rtx, rtx, machine_mode);
+void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
+void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
+enum vlmul_type get_vlmul (machine_mode);
+unsigned int get_ratio (machine_mode);
+int get_ta (rtx);
+int get_ma (rtx);
+int get_avl_type (rtx);
+unsigned int calculate_ratio (unsigned int, enum vlmul_type);
 enum tail_policy
 {
   TAIL_UNDISTURBED = 0,
@@ -176,10 +176,10 @@ enum tail_policy get_prefer_tail_policy ();
 enum mask_policy get_prefer_mask_policy ();
 rtx get_avl_type_rtx (enum avl_type);
 opt_machine_mode get_vector_mode (scalar_mode, poly_uint64);
-extern bool simm5_p (rtx);
-extern bool neg_simm5_p (rtx);
+bool simm5_p (rtx);
+bool neg_simm5_p (rtx);
 #ifdef RTX_CODE
-extern bool has_vi_variant_p (rtx_code, rtx);
+bool has_vi_variant_p (rtx_code, rtx);
 #endif
 }
 
-- 
2.36.3

Re: [PATCH] Fix possible sanopt compile-time hog

2023-02-14 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 14, 2023 at 04:20:24PM +0100, Richard Biener wrote:
> While working on bitmap operations I figured sanopt.cc uses
> a sbitmap worklist, iterating using bitmap_first_set_bit on it.
> That's quadratic since bitmap_first_set_bit for sbitmap is O(n).
> 
> The fix is to use regular bitmaps for the worklist and the bitmap
> feeding it and to avoid a useless copy.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
>   * sanopt.cc (sanitize_asan_mark_unpoison): Use bitmap
>   for with_poison and alias worklist to it.
>   (sanitize_asan_mark_poison): Likewise.

Ok, thanks.

Jakub

Re: [PATCH] target/108738 - optimize bit operations in STV

2023-02-14 Thread Uros Bizjak via Gcc-patches

On Thu, Feb 9, 2023 at 3:25 PM Richard Biener via Gcc-patches
 wrote:
>
> The following does low-hanging optimizations, combining bitmap
> test and set and removing redundant operations.
>
> This shaves off half of the testcase compile time.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> PR target/108738
> * config/i386/i386-features.cc (scalar_chain::add_to_queue):
> Combine bitmap test and set.
> (scalar_chain::add_insn): Likewise.
> (scalar_chain::analyze_register_chain): Remove redundant
> attempt to add to queue and instead strengthen assert.
> Sink common attempts to mark the def dual-mode.
> (scalar_chain::add_to_queue): Remove redundant insn bitmap
> check.

LGTM.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-features.cc | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/i386/i386-features.cc 
> b/gcc/config/i386/i386-features.cc
> index 9bd6d8677bb..eff91301009 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -314,14 +314,12 @@ scalar_chain::~scalar_chain ()
>  void
>  scalar_chain::add_to_queue (unsigned insn_uid)
>  {
> -  if (bitmap_bit_p (insns, insn_uid)
> -  || bitmap_bit_p (queue, insn_uid))
> +  if (!bitmap_set_bit (queue, insn_uid))
>  return;
>
>if (dump_file)
>  fprintf (dump_file, "  Adding insn %d into chain's #%d queue\n",
>  insn_uid, chain_id);
> -  bitmap_set_bit (queue, insn_uid);
>  }
>
>  /* For DImode conversion, mark register defined by DEF as requiring
> @@ -362,10 +360,9 @@ void
>  scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
>  {
>df_link *chain;
> +  bool mark_def = false;
>
> -  gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
> - || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref)));
> -  add_to_queue (DF_REF_INSN_UID (ref));
> +  gcc_checking_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)));
>
>for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
>  {
> @@ -398,9 +395,12 @@ scalar_chain::analyze_register_chain (bitmap candidates, 
> df_ref ref)
>   if (dump_file)
> fprintf (dump_file, "  r%d use in insn %d isn't convertible\n",
>  DF_REF_REGNO (chain->ref), uid);
> - mark_dual_mode_def (ref);
> + mark_def = true;
> }
>  }
> +
> +  if (mark_def)
> +mark_dual_mode_def (ref);
>  }
>
>  /* Add instruction into a chain.  */
> @@ -408,14 +408,12 @@ scalar_chain::analyze_register_chain (bitmap 
> candidates, df_ref ref)
>  void
>  scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
>  {
> -  if (bitmap_bit_p (insns, insn_uid))
> +  if (!bitmap_set_bit (insns, insn_uid))
>  return;
>
>if (dump_file)
>  fprintf (dump_file, "  Adding insn %d to chain #%d\n", insn_uid, 
> chain_id);
>
> -  bitmap_set_bit (insns, insn_uid);
> -
>rtx_insn *insn = DF_INSN_UID_GET (insn_uid)->insn;
>rtx def_set = single_set (insn);
>if (def_set && REG_P (SET_DEST (def_set))
> --
> 2.35.3
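
The "combine bitmap test and set" change in the patch above can be illustrated outside of GCC. The following is a minimal sketch, assuming a hypothetical `toy_bitmap` class (it is not GCC's bitmap.h API) whose `set` member reports whether the bit was newly set, which is the property the patch relies on when it replaces a test followed by a set with a single call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GCC bitmap; not the real bitmap.h API.
class toy_bitmap
{
public:
  // Return true if BIT is set.
  bool test (unsigned bit) const
  {
    return bit < bits_.size () && bits_[bit];
  }

  // Set BIT and return true if it was previously clear.
  bool set (unsigned bit)
  {
    if (bit >= bits_.size ())
      bits_.resize (bit + 1, false);
    bool changed = !bits_[bit];
    bits_[bit] = true;
    return changed;
  }

private:
  std::vector<bool> bits_;
};

// Before: a membership test followed by a separate set, so the path
// that actually queues the insn looks the bit up twice.
bool add_to_queue_two_step (toy_bitmap &queue, unsigned uid)
{
  if (queue.test (uid))
    return false;
  queue.set (uid);
  assert (queue.test (uid));
  return true;
}

// After: one lookup; the result of set () says whether the insn was
// newly queued.
bool add_to_queue_combined (toy_bitmap &queue, unsigned uid)
{
  return queue.set (uid);
}
```

Both helpers return the same answers for the same sequence of calls; the combined form simply avoids the second lookup, which matters when the underlying structure is not constant-time.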

Re: [PATCH] target/108738 - STV bitmap operations compile-time hog

2023-02-14 Thread Uros Bizjak via Gcc-patches

On Thu, Feb 9, 2023 at 3:25 PM Richard Biener via Gcc-patches
 wrote:
>
> When the set of candidates becomes very large then repeated
> bit checks on it during the build of an actual chain can become
> slow because of the O(n) nature of bitmap tests.  The following
> switches the candidates bitmaps to the tree representation before
> building the chains to get O(log n) amortized behavior.
>
> For the testcase at hand this improves STV time by 50%.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> PR target/108738
> * config/i386/i386-features.cc (convert_scalars_to_vector):
> Switch candidates bitmaps to tree view before building the chains.

'git diff -w' shows the true change ;)

LGTM.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-features.cc | 49 +---
>  1 file changed, 26 insertions(+), 23 deletions(-)
>
> diff --git a/gcc/config/i386/i386-features.cc 
> b/gcc/config/i386/i386-features.cc
> index ec13d4e7489..9bd6d8677bb 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -2283,30 +2283,33 @@ convert_scalars_to_vector (bool timode_p)
>fprintf (dump_file, "There are no candidates for optimization.\n");
>
>for (unsigned i = 0; i <= 2; ++i)
> -while (!bitmap_empty_p (&candidates[i]))
> -  {
> -   unsigned uid = bitmap_first_set_bit (&candidates[i]);
> -   scalar_chain *chain;
> -
> -   if (cand_mode[i] == TImode)
> - chain = new timode_scalar_chain;
> -   else
> - chain = new general_scalar_chain (cand_mode[i], cand_vmode[i]);
> -
> -   /* Find instructions chain we want to convert to vector mode.
> -  Check all uses and definitions to estimate all required
> -  conversions.  */
> -   chain->build (&candidates[i], uid);
> -
> -   if (chain->compute_convert_gain () > 0)
> - converted_insns += chain->convert ();
> -   else
> - if (dump_file)
> -   fprintf (dump_file, "Chain #%d conversion is not profitable\n",
> -chain->chain_id);
> +{
> +  bitmap_tree_view (&candidates[i]);
> +  while (!bitmap_empty_p (&candidates[i]))
> +   {
> + unsigned uid = bitmap_first_set_bit (&candidates[i]);
> + scalar_chain *chain;
>
> -   delete chain;
> -  }
> + if (cand_mode[i] == TImode)
> +   chain = new timode_scalar_chain;
> + else
> +   chain = new general_scalar_chain (cand_mode[i], cand_vmode[i]);
> +
> + /* Find instructions chain we want to convert to vector mode.
> +Check all uses and definitions to estimate all required
> +conversions.  */
> + chain->build (&candidates[i], uid);
> +
> + if (chain->compute_convert_gain () > 0)
> +   converted_insns += chain->convert ();
> + else
> +   if (dump_file)
> + fprintf (dump_file, "Chain #%d conversion is not profitable\n",
> +  chain->chain_id);
> +
> + delete chain;
> +   }
> +}
>
>if (dump_file)
>  fprintf (dump_file, "Total insns converted: %d\n", converted_insns);
> --
> 2.35.3
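
The cost difference described in the message above (linear membership tests versus logarithmic ones) can be sketched with standard containers. This is only an illustration under stated assumptions: `std::list` stands in for a list-shaped bitmap and `std::set` for a tree-shaped one, and neither reflects how GCC's bitmaps are actually implemented. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <set>

// Count the nodes visited when testing membership in a sorted linked
// list.  The walk stops at the key or at the first larger element.
std::size_t list_probe_steps (const std::list<unsigned> &sorted,
                              unsigned key, bool *found)
{
  assert (found != nullptr);
  std::size_t steps = 0;
  *found = false;
  for (unsigned v : sorted)
    {
      ++steps;
      if (v == key)
        {
          *found = true;
          break;
        }
      if (v > key)
        break;
    }
  return steps;
}

// Membership test on a balanced tree: logarithmic in the set size.
bool tree_contains (const std::set<unsigned> &tree, unsigned key)
{
  return tree.count (key) != 0;
}
```

With 1000 consecutive keys, probing for the last key walks all 1000 list nodes, while the tree lookup needs on the order of ten comparisons. Repeating such probes for every instruction in a large candidate set is what makes the list form expensive.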

[PATCH] bpf: fix memory constraint of ldx/stx instructions [PR108790]

2023-02-14 Thread David Faust via Gcc-patches

In some cases where the target memory address for an ldx or stx
instruction could be reduced to a constant, GCC could emit a malformed
instruction like:

ldxdw %r0,0

Rather than the expected form:

ldxdw %rX, [%rY + OFFSET]

This is due to the constraint allowing a const_int operand, which the
output templates do not handle.

Fix it by introducing a new memory constraint for the appropriate
operands of these instructions, which is identical to 'm' except that
it does not accept const_int.

Tested with bpf-unknown-none, no known regressions.
OK?

Thanks.

gcc/

PR target/108790
* config/bpf/constraints.md (q): New memory constraint.
* config/bpf/bpf.md (zero_extendhidi2): Use it here.
(zero_extendqidi2): Likewise.
(zero_extendsidi2): Likewise.
(*mov): Likewise.

gcc/testsuite/

PR target/108790
* gcc.target/bpf/ldxdw.c: New test.
---
 gcc/config/bpf/bpf.md| 10 +-
 gcc/config/bpf/constraints.md| 11 +++
 gcc/testsuite/gcc.target/bpf/ldxdw.c | 12 
 3 files changed, 28 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/ldxdw.c

diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
index d9af98384ef..f6be0a21234 100644
--- a/gcc/config/bpf/bpf.md
+++ b/gcc/config/bpf/bpf.md
@@ -242,7 +242,7 @@ (define_insn "xor3"
 
 (define_insn "zero_extendhidi2"
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
-   (zero_extend:DI (match_operand:HI 1 "nonimmediate_operand" "0,r,m")))]
+   (zero_extend:DI (match_operand:HI 1 "nonimmediate_operand" "0,r,q")))]
   ""
   "@
and\t%0,0x
@@ -252,7 +252,7 @@ (define_insn "zero_extendhidi2"
 
 (define_insn "zero_extendqidi2"
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
-   (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "0,r,m")))]
+   (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "0,r,q")))]
   ""
   "@
and\t%0,0xff
@@ -263,7 +263,7 @@ (define_insn "zero_extendqidi2"
 (define_insn "zero_extendsidi2"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
(zero_extend:DI
- (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
+ (match_operand:SI 1 "nonimmediate_operand" "r,q")))]
   ""
   "@
* return bpf_has_alu32 ? \"mov32\t%0,%1\" : 
\"mov\t%0,%1\;and\t%0,0x\";
@@ -302,8 +302,8 @@ (define_expand "mov"
 }")
 
 (define_insn "*mov"
-  [(set (match_operand:MM 0 "nonimmediate_operand" "=r, r,r,m,m")
-(match_operand:MM 1 "mov_src_operand"  " m,rI,B,r,I"))]
+  [(set (match_operand:MM 0 "nonimmediate_operand" "=r, r,r,q,q")
+(match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
   ""
   "@
ldx\t%0,%1
diff --git a/gcc/config/bpf/constraints.md b/gcc/config/bpf/constraints.md
index c8a65cfcddb..33f9177b8eb 100644
--- a/gcc/config/bpf/constraints.md
+++ b/gcc/config/bpf/constraints.md
@@ -29,3 +29,14 @@ (define_constraint "B"
 (define_constraint "S"
   "A constant call address."
   (match_code "const,symbol_ref,label_ref,const_int"))
+
+;;
+;; Memory constraints.
+;;
+
+; Just like 'm' but disallows const_int.
+; Used for ldx[b,h,w,dw] and stx[b,h,w,dw] instructions.
+(define_memory_constraint "q"
+  "Memory reference which is not a constant integer."
+  (and (match_code "mem")
+   (match_test "GET_CODE(XEXP(op, 0)) != CONST_INT")))
diff --git a/gcc/testsuite/gcc.target/bpf/ldxdw.c 
b/gcc/testsuite/gcc.target/bpf/ldxdw.c
new file mode 100644
index 000..0985ea3e6ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/ldxdw.c
@@ -0,0 +1,12 @@
+/* Verify that we do not generate a malformed ldxdw instruction
+   with a constant instead of register + offset.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* { dg-final { scan-assembler-times "ldxdw\t%r.,\\\[%r.+0\\\]" 1 } } */
+/* { dg-final { scan-assembler-not "ldxdw\t%r.,\[0-9\]+" } } */
+
+unsigned long long test () {
+  return *((unsigned long long *) 0x4000);
+}
-- 
2.39.0
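
The intent of the new 'q' constraint can be modelled in a few lines. The sketch below is an assumption-laden miniature, not GCC code: `toy_rtx`, `toy_code`, `satisfies_m` and `satisfies_q` are hypothetical names, and the model of 'm' is deliberately simplified to "any memory reference" so that the only difference shown is the rejection of a constant-integer address.

```cpp
#include <cassert>

// Hypothetical miniature of RTL codes; not GCC's rtx representation.
enum class toy_code { reg, plus, const_int, mem };

struct toy_rtx
{
  toy_code code;
  const toy_rtx *addr;  // For mem: the address expression; otherwise null.
};

// Simplified model of 'm' for this sketch: any memory reference.
bool satisfies_m (const toy_rtx &op)
{
  return op.code == toy_code::mem;
}

// Model of 'q': a memory reference whose address is not a constant
// integer, mirroring the match_test in the patch.
bool satisfies_q (const toy_rtx &op)
{
  if (op.code != toy_code::mem)
    return false;
  assert (op.addr != nullptr);
  return op.addr->code != toy_code::const_int;
}
```

In this model a load from a fixed numeric address satisfies 'm' but not 'q', which is the case behind the malformed `ldxdw %r0,0` output described above.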

Re: nvptx: Adjust 'scan-assembler' in 'gfortran.dg/weak-1.f90' (was: Support for NOINLINE attribute)

2023-02-14 Thread Harald Anlauf via Gcc-patches


Hi Thomas,

On 2/14/23 10:35, Thomas Schwinge wrote:

Hi!

On 2023-02-13T18:50:23+0100, Harald Anlauf via Gcc-patches 
 wrote:

Pushed as:

commit 086a1df4374962787db37c1f0d1bd9beb828f9e3



On 2/12/23 22:28, Harald Anlauf via Gcc-patches wrote:

There is one thing I cannot test, which is the handling of weak symbols
on other platforms.  A quick glance at the C testcases suggests that
someone with access to either an NVPTX or MinGW target might tell
whether that particular target should be excluded.


Indeed nvptx does use a different assembler syntax; I've pushed to
master branch commit 8d8175869ca94c600e64e27b7676787b2a398f6e
"nvptx: Adjust 'scan-assembler' in 'gfortran.dg/weak-1.f90'", see
attached.


thanks for taking care of this.


And I'm curious, is '!GCC$ ATTRIBUTES weak' meant to be used only for
weak definitions (like in 'gfortran.dg/weak-1.f90'), or also for weak
declarations (which, for example, in the C world then evaluate to
zero-address unless actually defined)?  When I did a quick experiment,
that didn't seem to work?  (But may be my fault, of course.)

And, orthogonally: is '!GCC$ ATTRIBUTES weak' meant to be used only for
subroutines (like in 'gfortran.dg/weak-1.f90') and also functions (I
suppose; test case?), or also for weak "data" in some way (which, for
example, in the C world then evaluates to a zero-address unless actually
defined)?


It also works for functions, e.g.

integer function f ()
!GCC$ ATTRIBUTES weak :: f
  print *, "weak f"
  f = 0
end

Regarding symbols beyond procedures (subroutines, functions),
I had a look at what Crayftn supports.  Its manpage has:

```
WEAK

Syntax and use of the WEAK directive.
!DIR$ WEAK procedure_name[, procedure_name] ...
!DIR$ WEAK procedure_name= stub_name[, procedure_name1= stub_name1] ...

[...]

The WEAK directive supports the following arguments:

procedure_name
A weak object in the form of a variable or procedure.
stub_name
A stub procedure that exists in the code. The stub_name will be
called if a strong reference does not exist for procedure_name. The
stub_name procedure must have the same name and dummy argument list as
procedure_name.
```

However, testing e.g. with a module variable either gave an
error message or assembly that suggests that this does not work,
at least not with version cce/14.0.0.


Could help to at least add a few more test cases, and clarify the
documentation?


I'm not sure whether we need to support weak symbols other than
procedures in gfortran.  Maybe Rimvydas can comment on this.

We could clarify the documentation and reject e.g. variables
using:

diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index ff64588b9a8..75c04ad7ece 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -814,6 +814,13 @@ gfc_finish_var_decl (tree decl, gfc_symbol * sym)
   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
 set_decl_tls_model (decl, decl_default_tls_model (decl));

+  if ((sym->attr.ext_attr & (1 << EXT_ATTR_WEAK))
+  && sym->attr.flavor != FL_PROCEDURE)
+{
+  gfc_error ("Symbol %qs at %L has the WEAK attribute but is not a "
+"procedure", sym->name, &sym->declared_at);
+}
+
   gfc_finish_decl_attrs (decl, &sym->attr);
 }

This would reject code like

module m
  integer :: i, j
!GCC$ ATTRIBUTES weak :: j
end

weak-1.f90:18:17:

   18 |   integer :: i, j
  | 1
Error: Symbol 'j' at (1) has the WEAK attribute but is not a procedure

Comments and thoughts?

Cheers,
Harald



Grüße
  Thomas
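
For readers unfamiliar with the "weak declaration" behaviour in the C world that the question above refers to, here is a minimal sketch. It assumes GCC or Clang on an ELF target; `optional_hook` and `call_hook_or` are hypothetical names. The function is declared weak and never defined, so its address compares equal to null unless some other object file provides a definition.

```cpp
#include <cassert>

// Hypothetical optional hook: declared weak and not defined here.
extern "C" int optional_hook (void) __attribute__ ((weak));

// Return the hook's result, or FALLBACK if no definition was linked in.
int call_hook_or (int fallback)
{
  if (optional_hook)
    return optional_hook ();
  return fallback;
}
```

When nothing defines `optional_hook`, a check such as `assert (call_hook_or (42) == 42);` passes. Whether `!GCC$ ATTRIBUTES weak` is meant to offer the same behaviour for declarations is exactly the open question in this thread.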



[pushed] libstdc++: Update an open-std.org link

2023-02-14 Thread Gerald Pfeifer

I *think* I now consistently got www.open-std.org covered throughout the 
entire tree.

Pushed.
Gerald

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2017.xml: Update an open-std.org link
to www.open-std.org and https.
* doc/html/manual/status.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/status.html   | 2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2017.xml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/doc/html/manual/status.html 
b/libstdc++-v3/doc/html/manual/status.html
index c99d51ff709..d046bd2de47 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -1018,7 +1018,7 @@ since C++14 and the implementation is complete.
 N (components from V1 are still in namespace
 fundamentals_v1)
   Library Fundamentals 2 TS
-   http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/p0013r1.html"; 
target="_top">
+   https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2015/p0013r1.html"; 
target="_top">
  P0013R1

   Logical Operator Type Traits (revision 1)YLibrary Fundamentals 2 TS
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 7ca63cbad12..bea6db929c6 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -2846,7 +2846,7 @@ since C++14 and the implementation is complete.
 
 
   
-   http://www.w3.org/1999/xlink"; 
xlink:href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/p0013r1.html";>
+   http://www.w3.org/1999/xlink"; 
xlink:href="https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2015/p0013r1.html";>
  P0013R1

   
-- 
2.39.1

Re: [committed] wwwdocs: cxx-status: Move www.open-std.org to https

2023-02-14 Thread Gerald Pfeifer

On Mon, 30 Jan 2023, Gerald Pfeifer wrote:
> On Sun, 31 Jul 2022, Jonathan Wakely wrote:
>> https://www.open-std.org/ says "The site www.open-std.org is holding a
>> number of web pages for groups producing open standards:" but I don't
>> think it really matters which we use.
> It's not a biggie, though consistency never hurts (and makes it harder to 
> miss something). :-)
> 
> At this point we only have two or so instances of open-std.org without
> www. left, and I'll be moving those over as part of some broader changes.

I believe I have changed all open-std.org to www.open-std.org throughout
gcc/ (mostly, though not exclusively libstdc++) and wwwdocs/.

Gerald

Re: [PATCH v5 3/5] p1689r5: initial support

2023-02-14 Thread Jason Merrill via Gcc-patches


On 1/25/23 13:06, Ben Boeckel wrote:

This patch implements support for [P1689R5][] to communicate the C++20
module dependencies to build systems so that they may
build `.gcm` files in the proper order.


Thanks again.


Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
   `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdep-output=` specifies the `.o` that will be written for the TU
   that is scanned. This is required so that the build system can
   correlate the dependency output with the actual compilation that will
   occur.


I notice that the actual flags are all -fdep-*, though some of them are 
-fdeps-* here, and the internal variables all seem to be fdeps_*.  I 
lean toward harmonizing on "deps", I think.


I don't love the three separate options, but I suppose it's fine.  I'd 
prefer "target" instead of "output".


It should be possible to omit both -file and -target and get reasonable 
defaults, like the ones for -MD/-MQ in gcc.cc:cpp_unique_options.



CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.


I notice that the cpp dependency generation tries (in open_file_failed) 
to continue after encountering a missing file, is that not sufficient 
for header units?  Or adjustable to be sufficient?



- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously represetable in UTF-8" case).


typo "representable"


- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

libcpp/

* include/cpplib.h: Add cpp_deps_format enum.
(cpp_options): Add format field
(cpp_finish): Add dependency stream parameter.
* include/mkdeps.h (deps_add_module_target): Add new preprocessor
parameter used for C++ module tracking.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.

gcc/

* doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
-fdep-output= flags.

gcc/c-family/

* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdep-file=, and -fdep-output= parsing.
* c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.

gcc/cp/

* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.

gcc/testsuite/

* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.json: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.json: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.json: New test expectation.
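
To show why a build system wants this dependency information, here is a minimal sketch of how a consumer might order compilations once each translation unit's provided and imported modules are known. The `scan_rule` struct and `build_order` function are hypothetical; they only loosely mirror the kind of data the message describes and are not CMake's or GCC's implementation. The ordering uses Kahn's algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical in-memory form of one scanned translation unit.
struct scan_rule
{
  std::string primary_output;         // The object file for this TU.
  std::vector<std::string> provides;  // Logical module names it exports.
  std::vector<std::string> imports;   // Logical module names it needs.
};

// Return primary outputs in an order where every provider precedes its
// importers.  Returns an empty vector if the rules contain a cycle.
std::vector<std::string> build_order (const std::vector<scan_rule> &rules)
{
  std::map<std::string, std::size_t> provider;  // Module name -> rule index.
  for (std::size_t i = 0; i < rules.size (); ++i)
    for (const std::string &m : rules[i].provides)
      provider[m] = i;

  std::vector<std::vector<std::size_t>> users (rules.size ());
  std::vector<std::size_t> pending (rules.size (), 0);
  for (std::size_t i = 0; i < rules.size (); ++i)
    for (const std::string &m : rules[i].imports)
      {
        auto it = provider.find (m);
        if (it == provider.end ())
          continue;  // Provided elsewhere; ignored in this sketch.
        users[it->second].push_back (i);
        ++pending[i];
      }

  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < rules.size (); ++i)
    if (pending[i] == 0)
      ready.push (i);

  std::vector<std::string> order;
  while (!ready.empty ())
    {
      std::size_t i = ready.front ();
      ready.pop ();
      order.push_back (rules[i].primary_output);
      for (std::size_t u : users[i])
        if (--pending[u] == 0)
          ready.push (u);
    }

  assert (order.size () <= rules.size ());
  if (order.size () != rules.size ())
    order.clear ();  // Some rule never became ready: a cycle.
  return order;
}
```

Given rules where `app.o` imports `b`, `b.o` provides `b` and imports `a`, and `a.o` provides `a`, the function yields `a.o`, `b.o`, `app.o`. The object-file name in each rule plays the role the message assigns to the output flag: it lets the build system tie a scan result to the compilation it describes.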

Re: [PATCH] c++: fix ICE in joust_maybe_elide_copy [PR106675]

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/13/23 09:06, Marek Polacek wrote:

joust_maybe_elide_copy checks that the last conversion in the ICS for
the first argument is ck_ref_bind, which is reasonable, because we've
checked that we're dealing with a copy/move constructor.  But it can
also happen that we couldn't figure out which conversion function is
better to convert the argument, as in this testcase: joust couldn't
decide if we should go with

   operator foo &()

or

   operator foo const &()

so we get a ck_ambig, which then upsets joust_maybe_elide_copy.  Since
a ck_ambig can validly occur, I think we should just return early, as
in the patch below.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?


OK.


PR c++/106675

gcc/cp/ChangeLog:

* call.cc (joust_maybe_elide_copy): Return false for ck_ambig.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/overload-conv-5.C: New test.
---
  gcc/cp/call.cc   |  2 ++
  gcc/testsuite/g++.dg/cpp0x/overload-conv-5.C | 21 
  2 files changed, 23 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/overload-conv-5.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index a349d8e79db..048b2b052f8 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -12542,6 +12542,8 @@ joust_maybe_elide_copy (z_candidate *&cand)
if (!DECL_COPY_CONSTRUCTOR_P (fn) && !DECL_MOVE_CONSTRUCTOR_P (fn))
  return false;
conversion *conv = cand->convs[0];
+  if (conv->kind == ck_ambig)
+return false;
gcc_checking_assert (conv->kind == ck_ref_bind);
conv = next_conversion (conv);
if (conv->kind == ck_user && !TYPE_REF_P (conv->type))
diff --git a/gcc/testsuite/g++.dg/cpp0x/overload-conv-5.C 
b/gcc/testsuite/g++.dg/cpp0x/overload-conv-5.C
new file mode 100644
index 000..b1e7766e42b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/overload-conv-5.C
@@ -0,0 +1,21 @@
+// PR c++/106675
+// { dg-do compile { target c++11 } }
+
+struct foo {
+int n_;
+foo(int n) : n_(n) {}
+};
+
+struct bar {
+int n_;
+
+operator foo() const {
+return foo(n_);
+}
+operator foo &() { return *reinterpret_cast<foo *>(n_); }
+operator foo const &() = delete;
+
+void crashgcc() {
+foo tmp(*this); // { dg-error "ambiguous" }
+}
+};

base-commit: 72ae1e5635648bd3f6a5760ca46d531ad1f2c6b1

Re: [PATCH] c++: Add testcases from some Issaquah DRs

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/14/23 03:33, Jakub Jelinek wrote:

On Tue, Feb 14, 2023 at 12:22:33PM +0100, Jakub Jelinek via Gcc-patches wrote:

2023-02-14  Jakub Jelinek  

* g++.dg/DRs/dr2691.C: New test.


Actually, this one isn't a DR, so maybe it should go into:
* gcc/testsuite/c-c++-common/cpp/delimited-escape-seq-8.c: New test.
instead.


Sounds good.  Go ahead and commit with that change, the dg-boguses are 
fine.



--- gcc/testsuite/g++.dg/DRs/dr2691.C.jj2023-02-14 11:48:35.841335492 
+0100
+++ gcc/testsuite/g++.dg/DRs/dr2691.C   2023-02-14 11:57:21.538669133 +0100
@@ -0,0 +1,15 @@
+// DR 2691 - hexadecimal-escape-sequence is too greedy
+// { dg-do run { target c++11 } }
+// { dg-require-effective-target wchar }
+// { dg-options "-pedantic" }
+
+extern "C" void abort ();
+
+const char32_t *a = U"\x{20}ab";// { dg-warning "delimited escape sequences are only 
valid in" "" { target c++20_down } }
+
+int
+main ()
+{
+  if (a[0] != U'\x20' || a[1] != U'a' || a[2] != U'b' || a[3] != U'\0')
+abort ();
+}


Jakub

Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/13/23 09:23, Patrick Palka wrote:

[N.B. this is a corrected version of
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]

This patch factors out the TYPENAME_TYPE case of tsubst into a separate
function tsubst_typename_type.  It also factors out the two tsubst flags
controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
into distinct boolean parameters of this new function (and of
make_typename_type).  Consequently, callers which used to pass tf_tst_ok
to tsubst now instead must directly call tsubst_typename_type when
appropriate.


Hmm, I don't love how that turns 4 lines into 8 more complex lines in 
each caller.  And the previous approach of saying "a CTAD placeholder is 
OK" seems like a better abstraction than repeating the specific 
TYPENAME_TYPE handling in each place.



In a subsequent patch we'll add another flag to
tsubst_typename_type controlling whether we want to ignore non-types
during the qualified lookup.

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): Remove tf_keep_type_decl
and tf_tst_ok.
(make_typename_type): Add two trailing boolean parameters
defaulted to false.
* decl.cc (make_typename_type): Replace uses of
tf_keep_type_decl and tf_tst_ok with the corresponding new
boolean parameters.
* pt.cc (tsubst_typename_type): New, factored out from tsubst
and adjusted after removing tf_keep_type_decl and tf_tst_ok.
(tsubst_decl) : Conditionally call
tsubst_typename_type directly instead of using tf_tst_ok.
(tsubst) : Call tsubst_typename_type.
(tsubst_copy) : Conditionally call
tsubst_typename_type directly instead of using tf_tst_ok.
(tsubst_copy_and_build) : Likewise.
: Likewise.
---
  gcc/cp/cp-tree.h |   9 +-
  gcc/cp/decl.cc   |  17 ++--
  gcc/cp/pt.cc | 223 +--
  3 files changed, 134 insertions(+), 115 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 06bc64a6b8d..a7c5765fc33 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5573,8 +5573,7 @@ enum tsubst_flags {
tf_error = 1 << 0, /* give error messages  */
tf_warning = 1 << 1,   /* give warnings too  */
tf_ignore_bad_quals = 1 << 2,  /* ignore bad cvr qualifiers */
-  tf_keep_type_decl = 1 << 3, /* retain typedef type decls
-   (make_typename_type use) */
+  /* 1 << 3 available */
tf_ptrmem_ok = 1 << 4, /* pointers to member ok (internal
instantiate_type use) */
tf_user = 1 << 5,  /* found template must be a user template
@@ -5594,8 +5593,7 @@ enum tsubst_flags {
(build_target_expr and friends) */
tf_norm = 1 << 11, /* Build diagnostic information during
constraint normalization.  */
-  tf_tst_ok = 1 << 12,/* Allow a typename-specifier to name
-   a template (C++17 or later).  */
+  /* 1 << 12 available */
tf_dguide = 1 << 13,  /* Building a deduction guide from a ctor.  */
/* Convenient substitution flags combinations.  */
tf_warning_or_error = tf_warning | tf_error
@@ -6846,7 +6844,8 @@ extern tree declare_local_label   (tree);
  extern tree define_label  (location_t, tree);
  extern void check_goto(tree);
  extern bool check_omp_return  (void);
-extern tree make_typename_type (tree, tree, enum tag_types, 
tsubst_flags_t);
+extern tree make_typename_type (tree, tree, enum tag_types, 
tsubst_flags_t,
+bool = false, bool = false);
  extern tree build_typename_type   (tree, tree, tree, 
tag_types);
  extern tree make_unbound_class_template   (tree, tree, tree, 
tsubst_flags_t);
  extern tree make_unbound_class_template_raw   (tree, tree, tree);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index d606b31d7a7..430533606b0 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -4228,14 +4228,17 @@ build_typename_type (tree context, tree name, tree 
fullname,
  /* Resolve `typename CONTEXT::NAME'.  TAG_TYPE indicates the tag
 provided to name the type.  Returns an appropriate type, unless an
 error occurs, in which case error_mark_node is returned.  If we
-   locate a non-artificial TYPE_DECL and TF_KEEP_TYPE_DECL is set, we
+   locate a non-artificial TYPE_DECL and KEEP_TYPE_DECL is true, we
 return that, rather than the _TYPE it corresponds to, in other
-   cases we look through the type decl.  If TF_ERROR is set, complain
-   about errors, otherwise be quiet.  */
+   cases we look through the type decl.  If TEMPLATE_OK is true and
+   we found a TEMPLATE_DECL then we return a CTAD placeholder for the
+   TEMPLATE_DECL

Re: [PATCH 2/2] c++: TYPENAME_TYPE lookup ignoring non-types [PR107773]

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/13/23 09:23, Patrick Palka wrote:

[N.B. this is a corrected version of
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]

Currently when resolving a TYPENAME_TYPE for 'typename T::m' via
make_typename_type, we consider only type bindings of 'm' and ignore
non-type ones.  But [temp.res.general]/3 says, in a note, "the usual
qualified name lookup ([basic.lookup.qual]) applies even in the presence
of 'typename'", and qualified name lookup doesn't discriminate between
type and non-type bindings.  So when resolving such a TYPENAME_TYPE
we want the lookup to consider all bindings.

An exception is when we have a TYPENAME_TYPE corresponding to the
qualifying scope appearing before the :: scope resolution operator, such
as 'T::type' in 'typename T::type::m'.  In that case, [basic.lookup.qual]/1
applies, and lookup for such a TYPENAME_TYPE must ignore non-type
bindings.  So in order to correctly handle all cases, make_typename_type
needs an additional flag controlling whether lookup should ignore
non-types or not.

To that end this patch adds a type_only flag to make_typename_type and
defaults it to false (do not ignore non-types).  In contexts where we do
want to ignore non-types (when substituting into the scope of a
TYPENAME_TYPE, SCOPE_REF, USING_DECL) we call tsubst_typename_type
directly with type_only=true.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/107773

gcc/cp/ChangeLog:

* cp-tree.h (make_typename_type): Add another boolean parameter
that defaults to false.
* decl.cc (make_typename_type): Use lookup_member instead of
lookup_field.  Pass want_type=type_only instead of =false to
lookup_member.  Generalize format specifier in diagnostic to
handle both type and non-type bindings.
* pt.cc (tsubst_typename_type): Add another boolean parameter
that defaults to false and pass it to make_typename_type.  If
TYPE_CONTEXT is a TYPENAME_TYPE recurse with type_only=true
instead of substituting it via tsubst.
(tsubst_decl) : If the scope is a TYPENAME_TYPE
call tsubst_typename_type directly with type_only=true instead
of substituting it via tsubst.
(tsubst_qualified_id): Likewise.
* search.cc (lookup_member): Document default argument.

gcc/testsuite/ChangeLog:

* g++.dg/template/typename24.C: New test.
* g++.dg/template/typename25.C: New test.
* g++.dg/template/typename26.C: New test.
---
  gcc/cp/cp-tree.h   |  2 +-
  gcc/cp/decl.cc | 14 -
  gcc/cp/pt.cc   | 24 +++
  gcc/cp/search.cc   |  2 +-
  gcc/testsuite/g++.dg/template/typename24.C | 18 
  gcc/testsuite/g++.dg/template/typename25.C | 34 ++
  gcc/testsuite/g++.dg/template/typename26.C | 20 +
  7 files changed, 100 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/typename24.C
  create mode 100644 gcc/testsuite/g++.dg/template/typename25.C
  create mode 100644 gcc/testsuite/g++.dg/template/typename26.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a7c5765fc33..1241dbf8037 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6845,7 +6845,7 @@ extern tree define_label  (location_t, 
tree);
  extern void check_goto(tree);
  extern bool check_omp_return  (void);
  extern tree make_typename_type(tree, tree, enum 
tag_types, tsubst_flags_t,
-bool = false, bool = false);
+bool = false, bool = false, 
bool = false);
  extern tree build_typename_type   (tree, tree, tree, 
tag_types);
  extern tree make_unbound_class_template   (tree, tree, tree, 
tsubst_flags_t);
  extern tree make_unbound_class_template_raw   (tree, tree, tree);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 430533606b0..c741dc23d99 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -4232,13 +4232,14 @@ build_typename_type (tree context, tree name, tree 
fullname,
 return that, rather than the _TYPE it corresponds to, in other
 cases we look through the type decl.  If TEMPLATE_OK is true and
 we found a TEMPLATE_DECL then we return a CTAD placeholder for the
-   TEMPLATE_DECL.  If TF_ERROR is set, complain about errors, otherwise
-   be quiet.  */
+   TEMPLATE_DECL.  If TYPE_ONLY is true, lookup of NAME in CONTEXT
+   ignores non-type bindings.  If TF_ERROR is set, complain about errors,
+   otherwise be quiet.  */
  
  tree

  make_typename_type (tree context, tree name, enum tag_types tag_type,
tsubst_flags_t complain, bool keep_type_decl /* = false */,
-   bool template_ok /* = false */)
+

[PATCH] RISC-V: Rearrange the organization of declarations of RVV intrinsics [NFC]

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

This patch doesn't change any functionality; it only rearranges the
organization of the declarations to be consistent with the RVV ISA and
adds annotations to make the code more readable.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-functions.def (vsetvlmax): 
Rearrange.
(vsm): Ditto.
(vsse): Ditto.
(vsoxei64): Ditto.
(vsub): Ditto.
(vand): Ditto.
(vor): Ditto.
(vxor): Ditto.
(vsll): Ditto.
(vsra): Ditto.
(vsrl): Ditto.
(vmin): Ditto.
(vmax): Ditto.
(vminu): Ditto.
(vmaxu): Ditto.
(vmul): Ditto.
(vmulh): Ditto.
(vmulhu): Ditto.
(vmulhsu): Ditto.
(vdiv): Ditto.
(vrem): Ditto.
(vdivu): Ditto.
(vremu): Ditto.
(vnot): Ditto.
(vsext): Ditto.
(vzext): Ditto.
(vwadd): Ditto.
(vwsub): Ditto.
(vwmul): Ditto.
(vwmulu): Ditto.
(vwmulsu): Ditto.
(vwaddu): Ditto.
(vwsubu): Ditto.
(vsbc): Ditto.
(vmsbc): Ditto.
(vnsra): Ditto.
(vmerge): Ditto.
(vmv_v): Ditto.
(vmsne): Ditto.
(vmslt): Ditto.
(vmsgt): Ditto.
(vmsle): Ditto.
(vmsge): Ditto.
(vmsltu): Ditto.
(vmsgtu): Ditto.
(vmsleu): Ditto.
(vmsgeu): Ditto.
(vnmsac): Ditto.
(vmadd): Ditto.
(vnmsub): Ditto.
(vwmacc): Ditto.
(vsadd): Ditto.
(vssub): Ditto.
(vssubu): Ditto.
(vaadd): Ditto.
(vasub): Ditto.
(vasubu): Ditto.
(vsmul): Ditto.
(vssra): Ditto.
(vssrl): Ditto.
(vnclip): Ditto.

---
 .../riscv/riscv-vector-builtins-functions.def | 262 +++---
 1 file changed, 161 insertions(+), 101 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 22271273655..9bad1373bfd 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -37,15 +37,23 @@ along with GCC; see the file COPYING3. If not see
 #endif
 
 /* 6. Configuration-Setting Instructions.  */
+
 DEF_RVV_FUNCTION (vsetvl, vsetvl, none_preds, i_none_size_size_ops)
 DEF_RVV_FUNCTION (vsetvlmax, vsetvlmax, none_preds, i_none_size_void_ops)
+
 /* 7. Vector Loads and Stores. */
+
+// 7.4. Vector Unit-Stride Instructions
 DEF_RVV_FUNCTION (vle, loadstore, full_preds, all_v_scalar_const_ptr_ops)
 DEF_RVV_FUNCTION (vse, loadstore, none_m_preds, all_v_scalar_ptr_ops)
 DEF_RVV_FUNCTION (vlm, loadstore, none_preds, b_v_scalar_const_ptr_ops)
 DEF_RVV_FUNCTION (vsm, loadstore, none_preds, b_v_scalar_ptr_ops)
+
+// 7.5. Vector Strided Instructions
 DEF_RVV_FUNCTION (vlse, loadstore, full_preds, 
all_v_scalar_const_ptr_ptrdiff_ops)
 DEF_RVV_FUNCTION (vsse, loadstore, none_m_preds, all_v_scalar_ptr_ptrdiff_ops)
+
+// 7.6. Vector Indexed Instructions
 DEF_RVV_FUNCTION (vluxei8, indexed_loadstore, full_preds, 
all_v_scalar_const_ptr_uint8_index_ops)
 DEF_RVV_FUNCTION (vluxei16, indexed_loadstore, full_preds, 
all_v_scalar_const_ptr_uint16_index_ops)
 DEF_RVV_FUNCTION (vluxei32, indexed_loadstore, full_preds, 
all_v_scalar_const_ptr_uint32_index_ops)
@@ -62,162 +70,214 @@ DEF_RVV_FUNCTION (vsoxei8, indexed_loadstore, 
none_m_preds, all_v_scalar_ptr_uin
 DEF_RVV_FUNCTION (vsoxei16, indexed_loadstore, none_m_preds, 
all_v_scalar_ptr_uint16_index_ops)
 DEF_RVV_FUNCTION (vsoxei32, indexed_loadstore, none_m_preds, 
all_v_scalar_ptr_uint32_index_ops)
 DEF_RVV_FUNCTION (vsoxei64, indexed_loadstore, none_m_preds, 
all_v_scalar_ptr_uint64_index_ops)
+
+// TODO: 7.7. Unit-stride Fault-Only-First Loads
+// TODO: 7.8. Vector Load/Store Segment Instructions
+
 /* 11. Vector Integer Arithmetic Instructions.  */
+
+// 11.1. Vector Single-Width Integer Add and Subtract
 DEF_RVV_FUNCTION (vadd, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vsub, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vand, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vor, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vxor, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vsll, alu, full_preds, iu_shift_vvv_ops)
-DEF_RVV_FUNCTION (vsra, alu, full_preds, i_shift_vvv_ops)
-DEF_RVV_FUNCTION (vsrl, alu, full_preds, u_shift_vvv_ops)
-DEF_RVV_FUNCTION (vmin, alu, full_preds, i_vvv_ops)
-DEF_RVV_FUNCTION (vmax, alu, full_preds, i_vvv_ops)
-DEF_RVV_FUNCTION (vminu, alu, full_preds, u_vvv_ops)
-DEF_RVV_FUNCTION (vmaxu, alu, full_preds, u_vvv_ops)
-DEF_RVV_FUNCTION (vmul, alu, full_preds, iu_vvv_ops)
-DEF_RVV_FUNCTION (vmulh, alu, full_preds, full_v_i_vvv_ops)
-DEF_RVV_FUNCTION (vmulhu, alu, full_preds, full_v_u_vvv_ops)
-DEF_RVV_FUNCTION (vmulhsu, alu, full_preds, full_v_i_su_vvv_ops)
-DEF_RVV_FUNCTION (vdiv, alu, full_preds, i_vvv_ops)
-DEF_RVV_FUNCTION (vrem, alu, full_preds, i_vvv_ops)
-DEF_RVV_FUNCTION (vdivu, alu, full_preds, u_

Re: [PATCH] bpf: fix memory constraint of ldx/stx instructions [PR108790]

2023-02-14 Thread Jose E. Marchesi via Gcc-patches



Hi David.

> In some cases where the target memory address for an ldx or stx
> instruction could be reduced to a constant, GCC could emit a malformed
> instruction like:
>
> ldxdw %r0,0
>
> Rather than the expected form:
>
> ldxdw %rX, [%rY + OFFSET]
>
> This is due to the constraint allowing a const_int operand, which the
> output templates do not handle.
>
> Fix it by introducing a new memory constraint for the appropriate
> operands of these instructions, which is identical to 'm' except that
> it does not accept const_int.
>
> Tested with bpf-unknown-none, no known regressions.
> OK?

OK.  Thanks for the patch.

> Thanks.
>
> gcc/
>
>   PR target/108790
>   * config/bpf/constraints.md (q): New memory constraint.
>   * config/bpf/bpf.md (zero_extendhidi2): Use it here.
>   (zero_extendqidi2): Likewise.
>   (zero_extendsidi2): Likewise.
>   (*mov): Likewise.
>
> gcc/testsuite/
>
>   PR target/108790
>   * gcc.target/bpf/ldxdw.c: New test.
> ---
>  gcc/config/bpf/bpf.md| 10 +-
>  gcc/config/bpf/constraints.md| 11 +++
>  gcc/testsuite/gcc.target/bpf/ldxdw.c | 12 
>  3 files changed, 28 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/bpf/ldxdw.c
>
> diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
> index d9af98384ef..f6be0a21234 100644
> --- a/gcc/config/bpf/bpf.md
> +++ b/gcc/config/bpf/bpf.md
> @@ -242,7 +242,7 @@ (define_insn "xor3"
>  
>  (define_insn "zero_extendhidi2"
>[(set (match_operand:DI 0 "register_operand" "=r,r,r")
> - (zero_extend:DI (match_operand:HI 1 "nonimmediate_operand" "0,r,m")))]
> + (zero_extend:DI (match_operand:HI 1 "nonimmediate_operand" "0,r,q")))]
>""
>"@
> and\t%0,0x
> @@ -252,7 +252,7 @@ (define_insn "zero_extendhidi2"
>  
>  (define_insn "zero_extendqidi2"
>[(set (match_operand:DI 0 "register_operand" "=r,r,r")
> - (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "0,r,m")))]
> + (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "0,r,q")))]
>""
>"@
> and\t%0,0xff
> @@ -263,7 +263,7 @@ (define_insn "zero_extendqidi2"
>  (define_insn "zero_extendsidi2"
>[(set (match_operand:DI 0 "register_operand" "=r,r")
>   (zero_extend:DI
> -   (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
> +   (match_operand:SI 1 "nonimmediate_operand" "r,q")))]
>""
>"@
> * return bpf_has_alu32 ? \"mov32\t%0,%1\" : 
> \"mov\t%0,%1\;and\t%0,0x\";
> @@ -302,8 +302,8 @@ (define_expand "mov"
>  }")
>  
>  (define_insn "*mov"
> -  [(set (match_operand:MM 0 "nonimmediate_operand" "=r, r,r,m,m")
> -(match_operand:MM 1 "mov_src_operand"  " m,rI,B,r,I"))]
> +  [(set (match_operand:MM 0 "nonimmediate_operand" "=r, r,r,q,q")
> +(match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
>""
>"@
> ldx\t%0,%1
> diff --git a/gcc/config/bpf/constraints.md b/gcc/config/bpf/constraints.md
> index c8a65cfcddb..33f9177b8eb 100644
> --- a/gcc/config/bpf/constraints.md
> +++ b/gcc/config/bpf/constraints.md
> @@ -29,3 +29,14 @@ (define_constraint "B"
>  (define_constraint "S"
>"A constant call address."
>(match_code "const,symbol_ref,label_ref,const_int"))
> +
> +;;
> +;; Memory constraints.
> +;;
> +
> +; Just like 'm' but disallows const_int.
> +; Used for ldx[b,h,w,dw] and stx[b,h,w,dw] instructions.
> +(define_memory_constraint "q"
> +  "Memory reference which is not a constant integer."
> +  (and (match_code "mem")
> +   (match_test "GET_CODE(XEXP(op, 0)) != CONST_INT")))
> diff --git a/gcc/testsuite/gcc.target/bpf/ldxdw.c 
> b/gcc/testsuite/gcc.target/bpf/ldxdw.c
> new file mode 100644
> index 000..0985ea3e6ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/bpf/ldxdw.c
> @@ -0,0 +1,12 @@
> +/* Verify that we do not generate a malformed ldxdw instruction
> +   with a constant instead of register + offset.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* { dg-final { scan-assembler-times "ldxdw\t%r.,\\\[%r.+0\\\]" 1 } } */
> +/* { dg-final { scan-assembler-not "ldxdw\t%r.,\[0-9\]+" } } */
> +
> +unsigned long long test () {
> +  return *((unsigned long long *) 0x4000);
> +}
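
The effect of the new 'q' constraint can be illustrated outside of GCC with a
small model.  This is only a sketch: Code, Node, satisfies_m and satisfies_q
are hypothetical stand-ins invented for this illustration, not GCC's rtx API,
and the real 'm' constraint also checks address legitimacy, which is ignored
here.

```cpp
// Toy stand-ins for RTL codes; hypothetical, not GCC's real rtx types.
enum class Code { Mem, Reg, ConstInt };

struct Node {
  Code code;
  const Node *op0;  // for Mem: the address expression, otherwise unused
};

// Loose model of 'm': any memory reference is accepted.
inline bool satisfies_m(const Node &op) {
  return op.code == Code::Mem;
}

// Loose model of 'q': a memory reference whose address is not a bare
// constant integer, mirroring
//   (and (match_code "mem")
//        (match_test "GET_CODE(XEXP(op, 0)) != CONST_INT"))
inline bool satisfies_q(const Node &op) {
  return op.code == Code::Mem
         && op.op0 != nullptr
         && op.op0->code != Code::ConstInt;
}
```

In this model a memory reference to a constant address satisfies 'm' but not
'q', so an alternative using 'q' would not be chosen for it and the address
would have to be loaded into a register first.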

[og12] Address cast to pointer from integer of different size in 'libgomp/target.c:gomp_target_rev' (was: [OG12][committed] openmp: Add support for the 'present' modifier)

2023-02-14 Thread Thomas Schwinge

Hi!

On 2023-02-09T21:17:44+, Kwok Cheung Yeung  wrote:
> I've ported my patch for supporting the OpenMP 5.1 'present' modifier
> and committed it to the devel/omp/gcc-12 development branch:
>
> 229b705862c openmp: Add support for the 'present' modifier
>
> Tested with offloading on amdgcn and nvptx.

I've pushed to devel/omp/gcc-12 branch
commit cd377354c5faa326bdfa5f10e4193c1d1a686801
"Address cast to pointer from integer of different size in 
'libgomp/target.c:gomp_target_rev'",
see attached.


Note that this likewise applies to the current upstream submission:

"openmp: Add support for 'present' modifier".


Grüße
 Thomas


From cd377354c5faa326bdfa5f10e4193c1d1a686801 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 14 Feb 2023 23:34:45 +0100
Subject: [PATCH] Address cast to pointer from integer of different size in
 'libgomp/target.c:gomp_target_rev'
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For example, for '-m32' multilib of x86_64-pc-linux-gnu:

[...]/libgomp/target.c: In function ‘gomp_target_rev’:
[...]/libgomp/target.c:3699:33: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 3699 | (void *) devaddrs[i],
  | ^

Fix-up for recent og12 commit 229b705862c1d7f9634f72272b77c22970baf821
"openmp: Add support for the 'present' modifier".

	libgomp/
	* target.c (gomp_target_rev): Address cast to pointer from integer
	of different size.
---
 libgomp/ChangeLog.omp | 5 +
 libgomp/target.c  | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 484367d9975..67065f59922 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,8 @@
+2023-02-14  Thomas Schwinge  
+
+	* target.c (gomp_target_rev): Address cast to pointer from integer
+	of different size.
+
 2023-02-09  Kwok Cheung Yeung  
 
 	* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
diff --git a/libgomp/target.c b/libgomp/target.c
index 426383a451b..6edfc9214e4 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3696,12 +3696,12 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
 #ifdef HAVE_INTTYPES_H
 		gomp_fatal ("present clause: no corresponding data on "
 "parent device at %p with size %"PRIu64,
-(void *) devaddrs[i],
+(void *) (uintptr_t) devaddrs[i],
 (uint64_t) sizes[i]);
 #else
 		gomp_fatal ("present clause: no corresponding data on "
 "parent device at %p with size %lu",
-(void *) devaddrs[i],
+(void *) (uintptr_t) devaddrs[i],
 (unsigned long) sizes[i]);
 #endif
 		break;
-- 
2.25.1
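
The double cast used in the fix can be shown in isolation.  The following is a
minimal sketch, written in C++ rather than libgomp's C; addr_to_ptr is a
hypothetical helper name and not part of libgomp.

```cpp
#include <cstdint>

// Convert a 64-bit address value to a host pointer.  Going through
// std::uintptr_t first makes any narrowing an integer-to-integer
// conversion, so the final cast is always from a pointer-sized integer
// and does not trigger -Wint-to-pointer-cast on 32-bit targets.
inline void *addr_to_ptr(std::uint64_t addr) {
  return reinterpret_cast<void *>(static_cast<std::uintptr_t>(addr));
}
```

Note that on a target with 32-bit pointers the upper 32 bits of the value are
discarded by the intermediate conversion, which is acceptable here because the
result is only printed with %p in a diagnostic.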

Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/14/23 14:15, Jason Merrill wrote:

On 2/13/23 09:23, Patrick Palka wrote:

[N.B. this is a corrected version of
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]

This patch factors out the TYPENAME_TYPE case of tsubst into a separate
function tsubst_typename_type.  It also factors out the two tsubst flags
controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
into distinct boolean parameters of this new function (and of
make_typename_type).  Consequently, callers which used to pass tf_tst_ok
to tsubst now instead must directly call tsubst_typename_type when
appropriate.


Hmm, I don't love how that turns 4 lines into 8 more complex lines in 
each caller.  And the previous approach of saying "a CTAD placeholder is 
OK" seems like a better abstraction than repeating the specific 
TYPENAME_TYPE handling in each place.


tsubst_maybe_ctad_type?


In a subsequent patch we'll add another flag to
tsubst_typename_type controlling whether we want to ignore non-types
during the qualified lookup.

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): Remove tf_keep_type_decl
and tf_tst_ok.
(make_typename_type): Add two trailing boolean parameters
defaulted to false.
* decl.cc (make_typename_type): Replace uses of
tf_keep_type_decl and tf_tst_ok with the corresponding new
boolean parameters.
* pt.cc (tsubst_typename_type): New, factored out from tsubst
and adjusted after removing tf_keep_type_decl and tf_tst_ok.
(tsubst_decl) : Conditionally call
tsubst_typename_type directly instead of using tf_tst_ok.
(tsubst) : Call tsubst_typename_type.
(tsubst_copy) : Conditionally call
tsubst_typename_type directly instead of using tf_tst_ok.
(tsubst_copy_and_build) : Likewise.
: Likewise.
---
  gcc/cp/cp-tree.h |   9 +-
  gcc/cp/decl.cc   |  17 ++--
  gcc/cp/pt.cc | 223 +--
  3 files changed, 134 insertions(+), 115 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 06bc64a6b8d..a7c5765fc33 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5573,8 +5573,7 @@ enum tsubst_flags {
    tf_error = 1 << 0, /* give error messages  */
    tf_warning = 1 << 1,  /* give warnings too  */
    tf_ignore_bad_quals = 1 << 2, /* ignore bad cvr qualifiers */
-  tf_keep_type_decl = 1 << 3, /* retain typedef type decls
-    (make_typename_type use) */
+  /* 1 << 3 available */
    tf_ptrmem_ok = 1 << 4, /* pointers to member ok (internal
  instantiate_type use) */
    tf_user = 1 << 5, /* found template must be a user template
@@ -5594,8 +5593,7 @@ enum tsubst_flags {
  (build_target_expr and friends) */
    tf_norm = 1 << 11, /* Build diagnostic information during
  constraint normalization.  */
-  tf_tst_ok = 1 << 12, /* Allow a typename-specifier to name
-    a template (C++17 or later).  */
+  /* 1 << 12 available */
    tf_dguide = 1 << 13,    /* Building a deduction guide from a 
ctor.  */

    /* Convenient substitution flags combinations.  */
    tf_warning_or_error = tf_warning | tf_error
@@ -6846,7 +6844,8 @@ extern tree declare_local_label    (tree);
  extern tree define_label    (location_t, tree);
  extern void check_goto    (tree);
  extern bool check_omp_return    (void);
-extern tree make_typename_type    (tree, tree, enum 
tag_types, tsubst_flags_t);
+extern tree make_typename_type    (tree, tree, enum 
tag_types, tsubst_flags_t,

+ bool = false, bool = false);
  extern tree build_typename_type    (tree, tree, tree, 
tag_types);
  extern tree make_unbound_class_template    (tree, tree, tree, 
tsubst_flags_t);

  extern tree make_unbound_class_template_raw    (tree, tree, tree);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index d606b31d7a7..430533606b0 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -4228,14 +4228,17 @@ build_typename_type (tree context, tree name, 
tree fullname,

  /* Resolve `typename CONTEXT::NAME'.  TAG_TYPE indicates the tag
 provided to name the type.  Returns an appropriate type, unless an
 error occurs, in which case error_mark_node is returned.  If we
-   locate a non-artificial TYPE_DECL and TF_KEEP_TYPE_DECL is set, we
+   locate a non-artificial TYPE_DECL and KEEP_TYPE_DECL is true, we
 return that, rather than the _TYPE it corresponds to, in other
-   cases we look through the type decl.  If TF_ERROR is set, complain
-   about errors, otherwise be quiet.  */
+   cases we look through the type decl.  If TEMPLATE_OK is true and
+   we found a TEMPLATE_DECL then we return a CTAD placeholder for the
+   TEMPLATE_DECL.  If TF_ERROR is set, complain about errors, otherwise
+   be quiet.  */
  tree
  make_typename_type (tree context, tree name, enum tag_types tag_type,
-    tsubst_fl

Re: [PATCH 2/2] c++: speculative constexpr and is_constant_evaluated [PR108243]

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/10/23 08:51, Patrick Palka wrote:

On Fri, 10 Feb 2023, Patrick Palka wrote:


On Thu, 9 Feb 2023, Patrick Palka wrote:


On Thu, 9 Feb 2023, Jason Merrill wrote:


On 2/9/23 09:36, Patrick Palka wrote:

On Sun, 5 Feb 2023, Jason Merrill wrote:


On 2/3/23 15:51, Patrick Palka wrote:

On Mon, 30 Jan 2023, Jason Merrill wrote:


On 1/27/23 17:02, Patrick Palka wrote:

This PR illustrates that __builtin_is_constant_evaluated currently acts
as an optimization barrier for our speculative constexpr evaluation,
since we don't want to prematurely fold the builtin to false if the
expression in question would be later manifestly constant evaluated (in
which case it must be folded to true).

This patch fixes this by permitting __builtin_is_constant_evaluated
to get folded as false during cp_fold_function, since at that point
we're sure we're done with manifestly constant evaluation.  To that end
we add a flags parameter to cp_fold that controls what mce_value the
CALL_EXPR case passes to maybe_constant_value.
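
The distinction the folding has to preserve can be seen in a small standalone
example.  This is a sketch only: probe, in_constant_context and
in_runtime_context are names made up for illustration, and the code relies on
the __builtin_is_constant_evaluated builtin discussed above, as provided by
GCC and Clang.

```cpp
// Yields 1 when evaluated in a manifestly constant-evaluated context
// and 2 otherwise.
constexpr int probe() {
  return __builtin_is_constant_evaluated() ? 1 : 2;
}

// The initializer of a constexpr variable is manifestly
// constant-evaluated, so the builtin must be folded to true here.
constexpr int in_constant_context = probe();

// The initializer of an ordinary local variable is not manifestly
// constant-evaluated.  The compiler may still fold the call
// speculatively, but the builtin must then be treated as false.
inline int in_runtime_context() {
  int r = probe();
  return r;
}
```

Folding the call in in_runtime_context is safe only once it is certain that
the expression will not later turn out to be manifestly constant-evaluated,
which is the condition the flags parameter is meant to communicate.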

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/108243

gcc/cp/ChangeLog:

* cp-gimplify.cc (enum fold_flags): Define.
(cp_fold_data::genericize): Replace this data member with ...
(cp_fold_data::fold_flags): ... this.
(cp_fold_r): Adjust cp_fold_data use and cp_fold_calls.
(cp_fold_function): Likewise.
(cp_fold_maybe_rvalue): Likewise.
(cp_fully_fold_init): Likewise.
(cp_fold): Add fold_flags parameter.  Don't cache if flags
isn't empty.
: Pass mce_false to maybe_constant_value
if ff_genericize is set.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr108243.C: New test.
---
 gcc/cp/cp-gimplify.cc   | 76
++---
 gcc/testsuite/g++.dg/opt/pr108243.C | 29 +++
 2 files changed, 76 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr108243.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index a35cedd05cc..d023a63768f 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -43,12 +43,20 @@ along with GCC; see the file COPYING3.  If not
see
 #include "omp-general.h"
 #include "opts.h"
 +/* Flags for cp_fold and cp_fold_r.  */
+
+enum fold_flags {
+  ff_none = 0,
+  /* Whether we're being called from cp_fold_function.  */
+  ff_genericize = 1 << 0,
+};
+
 /* Forward declarations.  */
   static tree cp_genericize_r (tree *, int *, void *);
 static tree cp_fold_r (tree *, int *, void *);
 static void cp_genericize_tree (tree*, bool);
-static tree cp_fold (tree);
+static tree cp_fold (tree, fold_flags);
   /* Genericize a TRY_BLOCK.  */
 @@ -996,9 +1004,8 @@ struct cp_genericize_data
 struct cp_fold_data
 {
   hash_set pset;
-  bool genericize; // called from cp_fold_function?
-
-  cp_fold_data (bool g): genericize (g) {}
+  fold_flags flags;
+  cp_fold_data (fold_flags flags): flags (flags) {}
 };
   static tree
@@ -1039,7 +1046,7 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees,
void
*data_)
   break;
 }
 -  *stmt_p = stmt = cp_fold (*stmt_p);
+  *stmt_p = stmt = cp_fold (*stmt_p, data->flags);
 if (data->pset.add (stmt))
 {
@@ -1119,12 +1126,12 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees,
void
*data_)
 here rather than in cp_genericize to avoid problems with the
invisible
 reference transition.  */
 case INIT_EXPR:
-  if (data->genericize)
+  if (data->flags & ff_genericize)
cp_genericize_init_expr (stmt_p);
   break;
   case TARGET_EXPR:
-  if (data->genericize)
+  if (data->flags & ff_genericize)
cp_genericize_target_expr (stmt_p);
 /* Folding might replace e.g. a COND_EXPR with a
TARGET_EXPR;
in
@@ -1157,7 +1164,7 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees,
void
*data_)
 void
 cp_fold_function (tree fndecl)
 {
-  cp_fold_data data (/*genericize*/true);
+  cp_fold_data data (ff_genericize);
   cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_r, &data,
NULL);
 }
 @@ -2375,7 +2382,7 @@ cp_fold_maybe_rvalue (tree x, bool rval)
 {
   while (true)
 {
-  x = cp_fold (x);
+  x = cp_fold (x, ff_none);
   if (rval)
x = mark_rvalue_use (x);
   if (rval && DECL_P (x)
@@ -2434,7 +2441,7 @@ cp_fully_fold_init (tree x)
   if (processing_template_decl)
 return x;
   x = cp_fully_fold (x);
-  cp_fold_data data (/*genericize*/false);
+  cp_fold_data data (ff_none);
   cp_walk_tree (&x, cp_fold_r, &data, NULL);
   return x;
 }
@@ -2469,7 +2476,7 @@ clear_fold_cache (void)
 Function returns X or its folded variant.  */
   static tree
-cp_fold (tree x)
+cp_fold (tree x, fold_flags flags)
 {
   tree op0, op1, op2, op3;
   tree org_x = x, r = NULL_TREE;
@@ -2490,8 +2497,11 @@ cp_fold (tree x)

Re: [PATCH] c++: sizeof(expr) in non-templated requires-expr [PR108563]

2023-02-14 Thread Jason Merrill via Gcc-patches


On 2/9/23 10:22, Patrick Palka wrote:

On Thu, 9 Feb 2023, Patrick Palka wrote:


When substituting into sizeof(expr), tsubst_copy_and_build elides
substitution into the operand if args is NULL_TREE, and instead
considers the TREE_TYPE of the operand.  But here the (templated)
operand is a TEMPLATE_ID_EXPR with empty TREE_TYPE, so we can't elide
substitution in this case.

Contrary to the associated comment (dating back to r69130) substituting
args=NULL_TREE should generally work since we do exactly that in e.g.
fold_non_dependent_expr, and I don't see why the operand of sizeof would
be an exception.  So this patch just removes this special case.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Diff generated with -w to ignore noisy whitespace changes.


This time with -w actually passed to format-patch:


OK.


-- >8 --

PR c++/108563

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build) : Remove
special case for empty args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires35.C: New test.
---
  gcc/cp/pt.cc | 11 ---
  gcc/testsuite/g++.dg/cpp2a/concepts-requires35.C | 14 ++
  2 files changed, 14 insertions(+), 11 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires35.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 9f3fc1fa089..f21d28263d1 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20652,16 +20652,6 @@ tsubst_copy_and_build (tree t,
  op1 = TREE_TYPE (op1);
bool std_alignof = (TREE_CODE (t) == ALIGNOF_EXPR
&& ALIGNOF_EXPR_STD_P (t));
-if (!args)
- {
-   /* When there are no ARGS, we are trying to evaluate a
-  non-dependent expression from the parser.  Trying to do
-  the substitutions may not work.  */
-   if (!TYPE_P (op1))
- op1 = TREE_TYPE (op1);
- }
-   else
- {
++cp_unevaluated_operand;
++c_inhibit_evaluation_warnings;
if (TYPE_P (op1))
@@ -20670,7 +20660,6 @@ tsubst_copy_and_build (tree t,
  op1 = tsubst_copy_and_build (op1, args, complain, in_decl);
--cp_unevaluated_operand;
--c_inhibit_evaluation_warnings;
- }
  if (TYPE_P (op1))
  r = cxx_sizeof_or_alignof_type (input_location,
  op1, TREE_CODE (t), std_alignof,
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-requires35.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-requires35.C
new file mode 100644
index 000..2bb4b2b0b5d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-requires35.C
@@ -0,0 +1,14 @@
+// PR c++/108563
+// { dg-do compile { target c++20 } }
+
+template
+struct foo {
+  static constexpr T value = 0;
+};
+
+template
+inline constexpr T foo_v = foo::value;
+
+static_assert(requires { sizeof(foo_v); });
+static_assert(requires { requires sizeof(foo_v) == sizeof(int*); });
+static_assert(requires { requires sizeof(foo_v) == sizeof(char); });

[PATCH] RISC-V: Move saturating add/subtract md pattern location [NFC]

2023-02-14 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/vector.md (@pred_): Rearrange.
(@pred__scalar): Ditto.
(*pred__scalar): Ditto.
(*pred__extended_scalar): Ditto.

---
 gcc/config/riscv/vector.md | 490 ++---
 1 file changed, 245 insertions(+), 245 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index b6e67e94f67..764d9316ad9 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1336,7 +1336,6 @@
 ;; - 11.9 Vector Integer Min/Max Instructions
 ;; - 11.10 Vector Single-Width Integer Multiply Instructions
 ;; - 11.11 Vector Integer Divide Instructions
-;; - 12.1 Vector Single-Width Saturating Add and Subtract
 ;; 
---
 
 (define_insn "@pred_"
@@ -1728,248 +1727,6 @@
   [(set_attr "type" "vialu")
(set_attr "mode" "")])
 
-;; Saturating Add and Subtract
-(define_insn "@pred_"
-  [(set (match_operand:VI 0 "register_operand"   "=vd, vr, vd, 
vr")
-   (if_then_else:VI
- (unspec:
-   [(match_operand: 1 "vector_mask_operand" " vm,Wc1, 
vm,Wc1")
-(match_operand 5 "vector_length_operand"" rK, rK, rK, 
rK")
-(match_operand 6 "const_int_operand""  i,  i,  i,  
i")
-(match_operand 7 "const_int_operand""  i,  i,  i,  
i")
-(match_operand 8 "const_int_operand""  i,  i,  i,  
i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (any_sat_int_binop:VI
-   (match_operand:VI 3 "" " vr, vr, vr, 
vr")
-   (match_operand:VI 4 "" 
""))
- (match_operand:VI 2 "vector_merge_operand" 
"0vu,0vu,0vu,0vu")))]
-  "TARGET_VECTOR"
-  "@
-   v.vv\t%0,%3,%4%p1
-   v.vv\t%0,%3,%4%p1
-   v\t%0,%p1
-   v\t%0,%p1"
-  [(set_attr "type" "")
-   (set_attr "mode" "")])
-
-;; Handle GET_MODE_INNER (mode) = QImode, HImode, SImode.
-(define_insn "@pred__scalar"
-  [(set (match_operand:VI_QHS 0 "register_operand"   "=vd, vr")
-   (if_then_else:VI_QHS
- (unspec:
-   [(match_operand: 1 "vector_mask_operand" " vm,Wc1")
-(match_operand 5 "vector_length_operand"" rK, rK")
-(match_operand 6 "const_int_operand""  i,  i")
-(match_operand 7 "const_int_operand""  i,  i")
-(match_operand 8 "const_int_operand""  i,  i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (sat_int_plus_binop:VI_QHS
-   (vec_duplicate:VI_QHS
- (match_operand: 4 "register_operand"  "  r,  r"))
-   (match_operand:VI_QHS 3 "register_operand"   " vr, vr"))
- (match_operand:VI_QHS 2 "vector_merge_operand" "0vu,0vu")))]
-  "TARGET_VECTOR"
-  "v.vx\t%0,%3,%4%p1"
-  [(set_attr "type" "")
-   (set_attr "mode" "")])
-
-(define_insn "@pred__scalar"
-  [(set (match_operand:VI_QHS 0 "register_operand"   "=vd, vr")
-   (if_then_else:VI_QHS
- (unspec:
-   [(match_operand: 1 "vector_mask_operand" " vm,Wc1")
-(match_operand 5 "vector_length_operand"" rK, rK")
-(match_operand 6 "const_int_operand""  i,  i")
-(match_operand 7 "const_int_operand""  i,  i")
-(match_operand 8 "const_int_operand""  i,  i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (sat_int_minus_binop:VI_QHS
-   (match_operand:VI_QHS 3 "register_operand"   " vr, vr")
-   (vec_duplicate:VI_QHS
- (match_operand: 4 "register_operand"  "  r,  r")))
- (match_operand:VI_QHS 2 "vector_merge_operand" "0vu,0vu")))]
-  "TARGET_VECTOR"
-  "v.vx\t%0,%3,%4%p1"
-  [(set_attr "type" "")
-   (set_attr "mode" "")])
-
-(define_expand "@pred__scalar"
-  [(set (match_operand:VI_D 0 "register_operand")
-   (if_then_else:VI_D
- (unspec:
-   [(match_operand: 1 "vector_mask_operand")
-(match_operand 5 "vector_length_operand")
-(match_operand 6 "const_int_operand")
-(match_operand 7 "const_int_operand")
-(match_operand 8 "const_int_operand")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (sat_int_plus_binop:VI_D
-   (vec_duplicate:VI_D
- (match_operand: 4 "reg_or_int_operand"))
-   (match_operand:VI_D 3 "register_operand"))
- (match_operand:VI_D 2 "vector_merge_operand")))]
-  "TARGET_VECTOR"
-  {
-if (riscv_vector::has_vi_variant_p (, operands[4]))
-  operands[4] = force_reg (mode, operands[4]);
-else if (!TARGET_64BIT)
-  {
-   rtx v = gen_reg_rtx (mode);
-
-   if (immediate_operand (operands[4], Pmode))
- operands[4] = gen_rtx_SIGN_EXTEND (mode,
-   force_reg (Pmode

Re: [OG12][committed] openmp: Add support for the 'present' modifier

2023-02-14 Thread Kwok Cheung Yeung


Hi

I have also committed the following patch to devel/omp/gcc-12 to show 
the 'present' modifier in the Fortran parse tree dump.


e7279cc2eda openmp: Add support for 'present' modifier in the Fortran 
parse tree dump


Kwok

From e7279cc2eda2a0c50cff19ee4e02eea3d7808f68 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Tue, 14 Feb 2023 21:24:19 +
Subject: [PATCH] openmp: Add support for 'present' modifier in the Fortran
 parse tree dump

2023-02-14  Kwok Cheung Yeung  

gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
---
 gcc/fortran/ChangeLog.omp  |  7 +++
 gcc/fortran/dump-parse-tree.cc | 15 +++
 2 files changed, 22 insertions(+)

diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp
index 44bc0ea1e2a..579d8ee7c97 100644
--- a/gcc/fortran/ChangeLog.omp
+++ b/gcc/fortran/ChangeLog.omp
@@ -1,3 +1,10 @@
+2023-02-14  Kwok Cheung Yeung  
+
+   * dump-parse-tree.cc (show_omp_namelist): Display 'present' map
+   modifier.
+   (show_omp_clauses): Display 'present' motion modifier for 'to'
+   and 'from' clauses.
+
 2023-02-09  Kwok Cheung Yeung  
 
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 4da4d813d1d..7dad3ac0307 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -1453,9 +1453,20 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
  case OMP_MAP_TO: fputs ("to:", dumpfile); break;
  case OMP_MAP_FROM: fputs ("from:", dumpfile); break;
  case OMP_MAP_TOFROM: fputs ("tofrom:", dumpfile); break;
+ case OMP_MAP_PRESENT_ALLOC: fputs ("present,alloc:", dumpfile); break;
+ case OMP_MAP_PRESENT_TO: fputs ("present,to:", dumpfile); break;
+ case OMP_MAP_PRESENT_FROM: fputs ("present,from:", dumpfile); break;
+ case OMP_MAP_PRESENT_TOFROM:
+   fputs ("present,tofrom:", dumpfile); break;
  case OMP_MAP_ALWAYS_TO: fputs ("always,to:", dumpfile); break;
  case OMP_MAP_ALWAYS_FROM: fputs ("always,from:", dumpfile); break;
  case OMP_MAP_ALWAYS_TOFROM: fputs ("always,tofrom:", dumpfile); break;
+ case OMP_MAP_ALWAYS_PRESENT_TO:
+   fputs ("always,present,to:", dumpfile); break;
+ case OMP_MAP_ALWAYS_PRESENT_FROM:
+   fputs ("always,present,from:", dumpfile); break;
+ case OMP_MAP_ALWAYS_PRESENT_TOFROM:
+   fputs ("always,present,tofrom:", dumpfile); break;
  case OMP_MAP_DELETE: fputs ("delete:", dumpfile); break;
  case OMP_MAP_RELEASE: fputs ("release:", dumpfile); break;
  default: break;
@@ -1793,6 +1804,10 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
  fputs ("inscan, ", dumpfile);
if (list_type == OMP_LIST_REDUCTION_TASK)
  fputs ("task, ", dumpfile);
+   if ((list_type == OMP_LIST_TO || list_type == OMP_LIST_FROM)
+   && omp_clauses->lists[list_type]->u.motion_modifier
+  == OMP_MOTION_PRESENT)
+ fputs ("present:", dumpfile);
show_omp_namelist (list_type, omp_clauses->lists[list_type]);
fputc (')', dumpfile);
   }
-- 
2.34.1
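
The mapping from map kinds to the printed prefix follows a simple pattern:
modifiers are emitted in the order always, present, then the map type.  A
reduced sketch of that pattern, with a hypothetical MapOp enum and map_prefix
function standing in for gfc_omp_map_op and the switch in show_omp_namelist:

```cpp
#include <string>

// Hypothetical subset of map kinds; the names only mirror the patch.
enum class MapOp {
  To,
  PresentTo,
  AlwaysTo,
  AlwaysPresentTo,
  Release
};

// Return the prefix the dump would print before the variable list.
inline std::string map_prefix(MapOp op) {
  switch (op) {
    case MapOp::To:              return "to:";
    case MapOp::PresentTo:       return "present,to:";
    case MapOp::AlwaysTo:        return "always,to:";
    case MapOp::AlwaysPresentTo: return "always,present,to:";
    case MapOp::Release:         return "release:";
  }
  return "";  // unknown kinds print nothing, like the default case
}
```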

[PATCH] gen_reload: Correct parameter for fatal_insn call

2023-02-14 Thread Hans-Peter Nilsson via Gcc-patches

Committed as obvious.  Also, I wrote the neighboring code
- apparently including that line...

-- >8 --
Observed when disabling LEGITIMIZE_RELOAD_ADDRESS for
cris-elf: the current code doesn't handle the post-cc0
parallel-with-clobber-of-cc0 sets, dropping down into the
fatal_insn call.  Following the code, it's obvious that the
variable "set" is always NULL at the call.  The intended
parameter is "in".

* reload1.cc (gen_reload): Correct rtx parameter for fatal_insn
"failure trying to reload" call.
---
 gcc/reload1.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/reload1.cc b/gcc/reload1.cc
index 6fe22d8b81f9..7dcef50437b8 100644
--- a/gcc/reload1.cc
+++ b/gcc/reload1.cc
@@ -8606,7 +8606,7 @@ gen_reload (rtx out, rtx in, int opnum, enum reload_type type)
  return insn;
}
 
-  fatal_insn ("failure trying to reload:", set);
+  fatal_insn ("failure trying to reload:", in);
 }
   /* If IN is a simple operand, use gen_move_insn.  */
   else if (OBJECT_P (in) || GET_CODE (in) == SUBREG)
-- 
2.30.2

[r13-5971 Regression] FAIL: gcc.target/i386/pr108774.c (test for excess errors) on Linux/x86_64

2023-02-14 Thread Jiang, Haochen via Gcc-patches

On Linux/x86_64,

a33e3dcbd15e73603796e30b5eeec11a0c8bacec is the first bad commit
commit a33e3dcbd15e73603796e30b5eeec11a0c8bacec
Author: Vladimir N. Makarov 
Date:   Mon Feb 13 16:05:04 2023 -0500

RA: Clear reg equiv caller_save_p flag when clearing defined_p flag

caused

FAIL: gcc.target/i386/pr108774.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-5971/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr108774.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr108774.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com)

[pushed] wwwdocs: news/profiledriven: Update a link

2023-02-14 Thread Gerald Pfeifer

Pushed.

Gerald
---
 htdocs/news/profiledriven.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/news/profiledriven.html b/htdocs/news/profiledriven.html
index 82febc6d..3acacb78 100644
--- a/htdocs/news/profiledriven.html
+++ b/htdocs/news/profiledriven.html
@@ -276,7 +276,7 @@ Frequency and Program Profile Analysis; Wu and Larus; MICRO-27.
 
 [4] wwwdocs:
 
-http://www.lighterra.com/papers/valuerangeprop/Patterson1995-ValueRangeProp.pdf";>Accurate
+https://www.lighterra.com/papers/valuerangeprop/Patterson1995-ValueRangeProp.pdf";>Accurate
Static Branch Prediction by Value Range Propagation;
Jason R. C. Patterson (jas...@fit.qut.edu.au), 1995
 
-- 
2.39.1

[PATCH] warn-access: wrong -Wdangling-pointer with labels [PR106080]

2023-02-14 Thread Marek Polacek via Gcc-patches

-Wdangling-pointer warns when the address of a label escapes.  This
causes grief in OCaml as well as in the kernel, because the kernel uses

  #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

to get the PC.  -Wdangling-pointer is documented to warn about pointers
to objects.  However, it uses is_auto_decl which checks DECL_P, but DECL_P
is also true for a label/enumerator/function declaration, none of which is
an object.  Rather, it should use auto_var_p which correctly checks VAR_P
and PARM_DECL.
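
To illustrate the difference between the two predicates, here is a toy
model in plain C.  Everything in it (toy_code, toy_tree, toy_decl_p,
toy_is_auto_decl, toy_auto_var_p) is a hypothetical stand-in, not GCC's
tree API, and the real auto_var_p has more cases than this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GCC tree codes.  */
enum toy_code
{
  TOY_VAR_DECL,
  TOY_PARM_DECL,
  TOY_LABEL_DECL,
  TOY_FUNCTION_DECL,
  TOY_INTEGER_CST
};

struct toy_tree
{
  enum toy_code code;
  bool is_external;
  bool is_static;
};

/* Models DECL_P: true for any declaration, including labels.  */
static bool
toy_decl_p (const struct toy_tree *t)
{
  return t->code != TOY_INTEGER_CST;
}

/* Models the removed is_auto_decl predicate.  */
static bool
toy_is_auto_decl (const struct toy_tree *t)
{
  return toy_decl_p (t) && !t->is_external && !t->is_static;
}

/* Models the idea behind auto_var_p: only variables and parameters
   qualify.  This is an assumption-level simplification.  */
static bool
toy_auto_var_p (const struct toy_tree *t)
{
  return ((t->code == TOY_VAR_DECL || t->code == TOY_PARM_DECL)
	  && !t->is_external && !t->is_static);
}

/* Checks that a label passes the old predicate but not the new one,
   while an ordinary local variable passes both.  */
static void
toy_self_check (void)
{
  struct toy_tree label = { TOY_LABEL_DECL, false, false };
  struct toy_tree local = { TOY_VAR_DECL, false, false };

  assert (toy_is_auto_decl (&label));
  assert (!toy_auto_var_p (&label));
  assert (toy_is_auto_decl (&local) && toy_auto_var_p (&local));
}
```

In this model a label is neither external nor static, so the old check
treats it like a local object; the narrower check skips it.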

Bootstrapped/regtested on ppc64le-pc-linux-gnu, ok for trunk and 12?

PR middle-end/106080

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (is_auto_decl): Remove.  Use auto_var_p
instead.

gcc/testsuite/ChangeLog:

* c-c++-common/Wdangling-pointer-10.c: New test.
* c-c++-common/Wdangling-pointer-9.c: New test.
---
 gcc/gimple-ssa-warn-access.cc | 19 +--
 .../c-c++-common/Wdangling-pointer-10.c   | 12 
 .../c-c++-common/Wdangling-pointer-9.c|  9 +
 3 files changed, 26 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wdangling-pointer-10.c
 create mode 100644 gcc/testsuite/c-c++-common/Wdangling-pointer-9.c

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index ad9dac54874..2eab1d59abd 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -4326,15 +4326,6 @@ pass_waccess::check_call (gcall *stmt)
   check_nonstring_args (stmt);
 }
 
-
-/* Return true of X is a DECL with automatic storage duration.  */
-
-static inline bool
-is_auto_decl (tree x)
-{
-  return DECL_P (x) && !DECL_EXTERNAL (x) && !TREE_STATIC (x);
-}
-
 /* Check non-call STMT for invalid accesses.  */
 
 void
@@ -4363,7 +4354,7 @@ pass_waccess::check_stmt (gimple *stmt)
   while (handled_component_p (lhs))
lhs = TREE_OPERAND (lhs, 0);
 
-  if (is_auto_decl (lhs))
+  if (auto_var_p (lhs))
m_clobbers.remove (lhs);
   return;
 }
@@ -4383,7 +4374,7 @@ pass_waccess::check_stmt (gimple *stmt)
   while (handled_component_p (arg))
arg = TREE_OPERAND (arg, 0);
 
-  if (!is_auto_decl (arg))
+  if (!auto_var_p (arg))
return;
 
   gimple **pclobber = m_clobbers.get (arg);
@@ -4467,7 +4458,7 @@ void
 pass_waccess::check_dangling_uses (tree var, tree decl, bool maybe /* = false */,
   bool objref /* = false */)
 {
-  if (!decl || !is_auto_decl (decl))
+  if (!decl || !auto_var_p (decl))
 return;
 
   gimple **pclob = m_clobbers.get (decl);
@@ -4528,7 +4519,7 @@ pass_waccess::check_dangling_stores (basic_block bb,
   if (!m_ptr_qry.get_ref (lhs, stmt, &lhs_ref, 0))
continue;
 
-  if (is_auto_decl (lhs_ref.ref))
+  if (auto_var_p (lhs_ref.ref))
continue;
 
   if (DECL_P (lhs_ref.ref))
@@ -4573,7 +4564,7 @@ pass_waccess::check_dangling_stores (basic_block bb,
  || rhs_ref.deref != -1)
continue;
 
-  if (!is_auto_decl (rhs_ref.ref))
+  if (!auto_var_p (rhs_ref.ref))
continue;
 
   auto_diagnostic_group d;
diff --git a/gcc/testsuite/c-c++-common/Wdangling-pointer-10.c b/gcc/testsuite/c-c++-common/Wdangling-pointer-10.c
new file mode 100644
index 000..ef553bdf2ce
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wdangling-pointer-10.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wdangling-pointer" } */
+
+struct S {
+  int x;
+};
+
+void g (int **p)
+{
+  struct S s = {};
+  *p = &s.x; /* { dg-warning "address of local variable" } */
+}
diff --git a/gcc/testsuite/c-c++-common/Wdangling-pointer-9.c b/gcc/testsuite/c-c++-common/Wdangling-pointer-9.c
new file mode 100644
index 000..2147f733607
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wdangling-pointer-9.c
@@ -0,0 +1,9 @@
+/* PR middle-end/106080 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wdangling-pointer" } */
+
+void
+foo (void **failaddr)
+{
+  *failaddr = ({ __label__ __here; __here: &&__here; });
+}

base-commit: c348a717213b03c6661878934f197f4d261f0e56
-- 
2.39.1

Re: [patch, gfortran.dg] Allow test to pass on mingw

2023-02-14 Thread NightStrike via Gcc-patches

On Fri, Jan 20, 2023 at 10:21 PM Jerry DeLisle via Fortran
 wrote:
>
> Hi all,
>
> Similar to a patch I committed a while ago for Cygwin, the attached
> patch allows it to pass on the mingw version of gfortran.
>
> It is trivial.
>
> Ok for trunk?
>
> Regards,
>
> Jerry

ping
