date:20230215

[PATCH (pushed)] docs: document new --param=asan-kernel-mem-intrinsic-prefix

2023-02-15 Thread Martin Liška

gcc/ChangeLog:

* doc/invoke.texi: Document --param=asan-kernel-mem-intrinsic-prefix.
---
 gcc/doc/invoke.texi | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 26de582e41e..0a43720f614 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15809,6 +15809,10 @@ is greater or equal to this number, use callbacks instead of inline checks.
 E.g. to disable inline code use
 @option{--param asan-instrumentation-with-call-threshold=0}.
 
+@item asan-kernel-mem-intrinsic-prefix
+Prefix calls to memcpy, memset and memmove with __asan_ or __hwasan_
+for -fsanitize=kernel-address or -fsanitize=kernel-hwaddress.
+
 @item hwasan-instrument-stack
 Enable hwasan instrumentation of statically sized stack-allocated variables.
 This kind of instrumentation is enabled by default when using
-- 
2.39.1
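
For illustration only (not part of the patch): with the param enabled, an
instrumented call to memcpy is emitted as a call to __asan_memcpy (or
__hwasan_memcpy), which the runtime, typically the kernel, has to provide.
The sketch below models such a wrapper in user space. The names
demo_check_range and demo_asan_memcpy are hypothetical stand-ins, and the
range check is a placeholder rather than a real shadow-memory lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping: how many bytes the "sanitizer" inspected.  */
static unsigned long checked_bytes;

/* Placeholder for a shadow-memory check; a real runtime would consult
   shadow memory and report invalid accesses.  */
static void
demo_check_range (const void *addr, size_t len)
{
  assert (addr != NULL || len == 0);
  checked_bytes += len;
}

/* Plays the role of __asan_memcpy: check both ranges, then copy.  */
static void *
demo_asan_memcpy (void *dst, const void *src, size_t len)
{
  demo_check_range (src, len);
  demo_check_range (dst, len);
  return memcpy (dst, src, len);
}
```

Copying 4 bytes through demo_asan_memcpy checks 8 bytes in total, 4 for the
source range and 4 for the destination range.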

[PATCH] RISC-V: Normalize SEW = 64 handling into a simplified function

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

Co-authored-by: kito-cheng 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (sew64_scalar_helper): New function.
* config/riscv/riscv-v.cc (has_vi_variant_p): Adjust.
(sew64_scalar_helper): New function.
* config/riscv/vector.md: Normalization.

Co-authored-by: kito-cheng 

---
 gcc/config/riscv/riscv-protos.h |   2 +
 gcc/config/riscv/riscv-v.cc |  46 +-
 gcc/config/riscv/vector.md  | 771 +++-
 3 files changed, 316 insertions(+), 503 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 81ad2eabc00..37c634eca1d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -181,6 +181,8 @@ bool neg_simm5_p (rtx);
 #ifdef RTX_CODE
 bool has_vi_variant_p (rtx_code, rtx);
 #endif
+bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, machine_mode,
+ bool, void (*)(rtx *, rtx));
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index dd70bf9b541..59c25c65cd5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,14 +418,11 @@ has_vi_variant_p (rtx_code code, rtx x)
   switch (code)
 {
 case PLUS:
-case MINUS:
 case AND:
 case IOR:
 case XOR:
 case SS_PLUS:
-case SS_MINUS:
 case US_PLUS:
-case US_MINUS:
 case EQ:
 case NE:
 case LE:
@@ -438,10 +435,53 @@ has_vi_variant_p (rtx_code code, rtx x)
 case LTU:
 case GE:
 case GEU:
+case MINUS:
+case SS_MINUS:
   return neg_simm5_p (x);
+
 default:
   return false;
 }
 }
 
+bool
+sew64_scalar_helper (rtx *operands, rtx *scalar_op, rtx vl,
+machine_mode vector_mode, machine_mode mask_mode,
+bool has_vi_variant_p,
+void (*emit_vector_func) (rtx *, rtx))
+{
+  machine_mode scalar_mode = GET_MODE_INNER (vector_mode);
+  if (has_vi_variant_p)
+{
+  *scalar_op = force_reg (scalar_mode, *scalar_op);
+  return false;
+}
+
+  if (TARGET_64BIT)
+{
+  if (!rtx_equal_p (*scalar_op, const0_rtx))
+   *scalar_op = force_reg (scalar_mode, *scalar_op);
+  return false;
+}
+
+  if (immediate_operand (*scalar_op, Pmode))
+{
+  if (!rtx_equal_p (*scalar_op, const0_rtx))
+   *scalar_op = force_reg (Pmode, *scalar_op);
+
+  *scalar_op = gen_rtx_SIGN_EXTEND (scalar_mode, *scalar_op);
+  return false;
+}
+
+  if (CONST_INT_P (*scalar_op))
+*scalar_op = force_reg (scalar_mode, *scalar_op);
+
+  rtx tmp = gen_reg_rtx (vector_mode);
+  riscv_vector::emit_nonvlmax_op (code_for_pred_broadcast (vector_mode), tmp,
+ *scalar_op, vl, mask_mode);
+  emit_vector_func (operands, tmp);
+
+  return true;
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 764d9316ad9..c897a365819 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -882,32 +882,21 @@
 (match_operand:VI_D 2 "register_operand"))
   (match_operand:VI_D 1 "vector_merge_operand")))]
   "TARGET_VECTOR"
-  {
-if (riscv_vector::simm5_p (operands[3]))
-  operands[3] = force_reg (mode, operands[3]);
-else if (!TARGET_64BIT)
-  {
-   rtx v = gen_reg_rtx (mode);
-
-   if (immediate_operand (operands[3], Pmode))
- operands[3] = gen_rtx_SIGN_EXTEND (mode,
-   force_reg (Pmode, operands[3]));
-   else
- {
-   if (CONST_INT_P (operands[3]))
- operands[3] = force_reg (mode, operands[3]);
-
-   riscv_vector::emit_nonvlmax_op (code_for_pred_broadcast (mode),
-   v, operands[3], operands[5], mode);
-   emit_insn (gen_pred_merge (operands[0], operands[1],
-   operands[2], v, operands[4],operands[5],
-   operands[6], operands[7]));
-   DONE;
- }
-  }
-else
-  operands[3] = force_reg (mode, operands[3]);
-  })
+{
+  if (riscv_vector::sew64_scalar_helper (
+   operands,
+   /* scalar op */&operands[3],
+   /* vl */operands[5],
+   mode,
+   mode,
+   riscv_vector::simm5_p (operands[3]),
+   [] (rtx *operands, rtx boardcast_scalar) {
+ emit_insn (gen_pred_merge (operands[0], operands[1],
+  operands[2], boardcast_scalar, operands[4], operands[5],
+  operands[6], operands[7]));
+}))
+DONE;
+})
 
 (define_insn "*pred_merge_scalar"
   [(set (match_operand:VI_D 0 "register_operand" "=vd")
@@ -1471,38 +1460,21 @@
(match_operand:VI_D 3 "register_operand"))
  (match_operand:VI_D 2 "vector_merge_operand")))]
   "TARGET_VECTOR"
-  {
-if (riscv_vector::has_vi_variant_p (, operands[4]))
-  operands[4] = force_reg (mode, operands[4]);
-else if (!TARGET_64BIT)
-  {
-   rtx

Re: [PATCH (pushed)] docs: document new --param=asan-kernel-mem-intrinsic-prefix

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 15, 2023 at 09:39:11AM +0100, Martin Liška wrote:
> gcc/ChangeLog:
> 
>   * doc/invoke.texi: Document --param=asan-kernel-mem-intrinsic-prefix.

Ok, thanks.

Jakub

Re: [patch, gfortran.dg] Allow test to pass on mingw

2023-02-15 Thread Tobias Burnus


Hi Jerry,

On 21.01.23 04:21, Jerry DeLisle via Fortran wrote:

Similar to a patch I committed a while ago for Cygwin, the attached
patch allows it to pass on the mingw version of gfortran.

It is trivial.
Ok for trunk?


As you wrote, adding '*-ming*' alongside '*-cygwin*' as target selector
is trivial and it is also in a mere testcase and not touching code.

But as you asked for approval: OK. Thanks for the patch.

Tobias


[committed] powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

2023-02-15 Thread Jakub Jelinek via Gcc-patches

Hi!

WIDEN_MULT_PLUS_EXPR as documented has the factor operands with
the same precision and the addend and result another one at least twice
as wide.
Similarly, {,u}maddMN4 is documented as
'maddMN4'
 Multiply operands 1 and 2, sign-extend them to mode N, add operand
 3, and store the result in operand 0.  Operands 1 and 2 have mode M
 and operands 0 and 3 have mode N.  Both modes must be integer or
 fixed-point modes and N must be twice the size of M.

 In other words, 'maddMN4' is like 'mulMN3' except that it also adds
 operand 3.

 These instructions are not allowed to 'FAIL'.

'umaddMN4'
 Like 'maddMN4', but zero-extend the multiplication operands instead
 of sign-extending them.
The PR103109 addition of these expanders to rs6000 didn't handle this
correctly though, it treated the last argument as also having mode M
sign or zero extended into N.  Unfortunately this means incorrect code
generation whenever the last operand isn't really sign or zero extended
from DImode to TImode.

The following patch removes maddditi4 expander altogether from rs6000.md,
because we'd need
maddhd 9,3,4,5
sradi 10,5,63
maddld 3,3,4,5
sub 9,9,10
add 4,9,6
which is longer than
mulld 9,3,4
mulhd 4,3,4
addc 3,9,5
adde 4,4,6
and nothing would be able to optimize the case of last operand already
sign-extended from DImode to TImode into just
mr 9,3
maddld 3,3,4,5
maddhd 4,9,4,5
or so.  And fixes umaddditi4, so that it emits an add at the end to add
the high half of the last operand, fortunately in this case if the high
half of the last operand is known to be zero (i.e. last operand is zero
extended from DImode to TImode) then combine will drop the useless add.
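
A minimal C model of the fixed umaddditi4 expansion may make the operand
widths clearer. It assumes a compiler that provides unsigned __int128; the
model_* functions are illustrative names that mimic maddld and maddhdu, not
anything from rs6000.md. The key point is that the addend is a full 128-bit
value, so its high half has to be added separately at the end.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

static_assert (sizeof (u128) == 16, "need a 128-bit integer type");

/* Model of maddld: low 64 bits of a * b + c.  */
static uint64_t
model_maddld (uint64_t a, uint64_t b, uint64_t c)
{
  return a * b + c;
}

/* Model of maddhdu: high 64 bits of a * b + c, with c zero-extended.  */
static uint64_t
model_maddhdu (uint64_t a, uint64_t b, uint64_t c)
{
  return (uint64_t) (((u128) a * b + c) >> 64);
}

/* Model of the expander: maddld + maddhdu on the low half of the addend,
   then a final add of the addend's high half.  */
static u128
model_umaddditi4 (uint64_t a, uint64_t b, u128 c)
{
  uint64_t c_lo = (uint64_t) c;
  uint64_t c_hi = (uint64_t) (c >> 64);
  uint64_t lo = model_maddld (a, b, c_lo);
  uint64_t hi = model_maddhdu (a, b, c_lo) + c_hi;
  return ((u128) hi << 64) | lo;
}
```

When c_hi is zero, i.e. the addend was zero extended from 64 bits, the last
addition is a no-op, which corresponds to the add that combine can drop.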

If we wanted to get back the signed op1 * op2 + op3 all in the DImode
into TImode op0, we'd need to introduce a new tree code next to
WIDEN_MULT_PLUS_EXPR and maddMN4 expander, because I'm afraid it can't
be done at expansion time in maddMN4 expander to detect whether the
operand is sign extended especially because of SUBREGs and the awkwardness
of looking at earlier emitted instructions, and combine would need 5
instruction combination.

Bootstrapped/regtested on powerpc64-linux (power7, tested -m32/-m64),
powerpc64le-linux (power8 and another on power9 with
--with-cpu-64=power9 --with-tune-64=power9), preapproved by Segher in the
PR, committed to trunk.

2023-02-15  Jakub Jelinek  

PR target/108787
PR target/103109
* config/rs6000/rs6000.md (maddditi4): Change into umaddditi4 only
expander, change operand 3 to be TImode, emit maddlddi4 and
umadddi4_highpart{,_le} with its low half and finally add the high
half to the result.

* gcc.dg/pr108787.c: New test.
* gcc.target/powerpc/pr108787.c: New test.
* gcc.target/powerpc/pr103109-1.c: Adjust expected instruction counts.

--- gcc/config/rs6000/rs6000.md.jj  2023-01-16 11:52:16.036734757 +0100
+++ gcc/config/rs6000/rs6000.md 2023-02-14 21:02:05.637399466 +0100
@@ -3226,25 +3226,40 @@ (define_insn "maddld4"
   "maddld %0,%1,%2,%3"
   [(set_attr "type" "mul")])
 
-(define_expand "maddditi4"
+;; umaddditi4 generally needs maddhdu + maddld + add instructions,
+;; unless last operand is zero extended from DImode, then needs
+;; maddhdu + maddld, which is both faster than mulld + mulhdu + addc + adde
+;; resp. mulld + mulhdu + addc + addze.
+;; We don't define maddditi4, as that one needs
+;; maddhd + sradi + maddld + add + sub and for last operand sign extended
+;; from DImode nothing is able to optimize it into maddhd + maddld, while
+;; without maddditi4 mulld + mulhd + addc + adde or
+;; mulld + mulhd + sradi + addc + adde is needed.  See PR108787.
+(define_expand "umaddditi4"
   [(set (match_operand:TI 0 "gpc_reg_operand")
(plus:TI
- (mult:TI (any_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
-  (any_extend:TI (match_operand:DI 2 "gpc_reg_operand")))
- (any_extend:TI (match_operand:DI 3 "gpc_reg_operand"))))]
+ (mult:TI (zero_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
+  (zero_extend:TI (match_operand:DI 2 "gpc_reg_operand")))
+ (match_operand:TI 3 "gpc_reg_operand")))]
   "TARGET_MADDLD && TARGET_POWERPC64"
 {
   rtx op0_lo = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 8 : 0);
   rtx op0_hi = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 0 : 8);
+  rtx op3_lo = gen_rtx_SUBREG (DImode, operands[3], BYTES_BIG_ENDIAN ? 8 : 0);
+  rtx op3_hi = gen_rtx_SUBREG (DImode, operands[3], BYTES_BIG_ENDIAN ? 0 : 8);
+  rtx hi_temp = gen_reg_rtx (DImode);
 
-  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], operands[3]));
+  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], op3_lo));
 
   if (BYTES_BIG_ENDIAN)
-emit_insn (gen_madddi4_highpart (op0_hi, operands[1], operands[2],
-

Re: [PATCH] LoongArch: Fix multiarch tuple canonization

2023-02-15 Thread Yujie Yang

On Tue, Feb 14, 2023 at 11:32:00AM +0800, Lulu Cheng wrote:
> add yangyujie.
 
Looks good to me. Thanks for the forward!

Re: [PATCH] warn-access: wrong -Wdangling-pointer with labels [PR106080]

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 14, 2023 at 10:48:15PM -0500, Marek Polacek via Gcc-patches wrote:
> -Wdangling-pointer warns when the address of a label escapes.  This
> causes grief in OCaml as well as in the kernel, because it uses
> 
>   #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
> 
> to get the PC.  -Wdangling-pointer is documented to warn about pointers
> to objects.  However, it uses is_auto_decl which checks DECL_P, but DECL_P
> is also true for a label/enumerator/function declaration, none of which is
> an object.  Rather, it should use auto_var_p which correctly checks VAR_P
> and PARM_DECL.

and RESULT_DECL ;)

> Bootstrapped/regtested on ppc64le-pc-linux-gnu, ok for trunk and 12?
> 
>   PR middle-end/106080
> 
> gcc/ChangeLog:
> 
>   * gimple-ssa-warn-access.cc (is_auto_decl): Remove.  Use auto_var_p
>   instead.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/Wdangling-pointer-10.c: New test.
>   * c-c++-common/Wdangling-pointer-9.c: New test.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/Wdangling-pointer-9.c
> @@ -0,0 +1,9 @@
> +/* PR middle-end/106080 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wdangling-pointer" } */
> +
> +void
> +foo (void **failaddr)
> +{
> +  *failaddr = ({ __label__ __here; __here: &&__here; });
> +}

Perhaps add dg-bogus above just to make it more clear what
we are testing in the test?

Otherwise LGTM.

Jakub
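
As a sketch of the suggestion above, the quoted test could carry a dg-bogus
directive on the line that must stay warning-free. The regexp used here is
an assumption for illustration, not necessarily the text of the committed
test; the code relies on the GNU C statement-expression and label-address
extensions, exactly as the original test does.

```c
#include <assert.h>
#include <stddef.h>

/* PR middle-end/106080 */
/* { dg-do compile } */
/* { dg-options "-O2 -Wdangling-pointer" } */

void
foo (void **failaddr)
{
  assert (failaddr != NULL);
  /* The address of a label is not a pointer to an object, so no
     -Wdangling-pointer warning is expected on the next line.  */
  *failaddr = ({ __label__ __here; __here: &&__here; }); /* { dg-bogus "address of local variable" } */
}
```

The directive states explicitly that the absence of the warning is what the
test checks.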

Re: [committed] powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

2023-02-15 Thread Segher Boessenkool

Hi!

On Wed, Feb 15, 2023 at 10:18:29AM +0100, Jakub Jelinek wrote:
> If we wanted to get back the signed op1 * op2 + op3 all in the DImode
> into TImode op0, we'd need to introduce a new tree code next to
> WIDEN_MULT_PLUS_EXPR and maddMN4 expander, because I'm afraid it can't
> be done at expansion time in maddMN4 expander to detect whether the
> operand is sign extended especially because of SUBREGs and the awkwardness
> of looking at earlier emitted instructions, and combine would need 5
> instruction combination.

The machine insns we have are like they are just for symmetry as far as
I can see, they aren't all so handy to use, don't worry about it :-)

Insns that are nicer for software would have two input registers for the
addend (either as 64-bit, or as two addends), but that is not so nice for
hardware.

> Bootstrapped/regtested on powerpc64-linux (power7, tested -m32/-m64),
> powerpc64le-linux (power8 and another on power9 with
> --with-cpu-64=power9 --with-tune-64=power9), preapproved by Segher in the
> PR, committed to trunk.

Thanks again :-)


Segher

[patch] bpf: Fix double whitespace warning

2023-02-15 Thread Jan-Benedict Glaw

Hi!

Since a recent commit, the BPF target produces a new warning due to
two consecutive non-quoted spaces in a message. This'll fix it:

gcc/
* config/bpf/bpf.cc (bpf_option_override): Fix doubled space.


Ok?

MfG, JBG


diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index b268801d00c..d8693f8cfbe 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -258,7 +258,7 @@ bpf_option_override (void)
 {
   inform (input_location,
   "%<-fstack-protector%> does not work "
-  " on this architecture");
+ "on this architecture");
   flag_stack_protect = 0;
 }
 }
-- 



Re: [Patch][v2] OpenMP/Fortran: Fix loop-iter var privatization with !$OMP LOOP [PR108512]

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Fri, Feb 10, 2023 at 12:52:47PM +0100, Tobias Burnus wrote:
> > I'm afraid this is needed but insufficient.
> > I think
> >  case EXEC_OMP_MASKED_TASKLOOP:
> >  case EXEC_OMP_MASKED_TASKLOOP_SIMD:
> >  case EXEC_OMP_MASTER_TASKLOOP:
> >  case EXEC_OMP_MASTER_TASKLOOP_SIMD:
> >   case EXEC_OMP_PARALLEL_LOOP:
> >   case EXEC_OMP_TARGET_PARALLEL_LOOP:
> >   case EXEC_OMP_TARGET_TEAMS_LOOP:
> >   case EXEC_OMP_TARGET_SIMD:
> >   case EXEC_OMP_TEAMS_LOOP:
> > should be in the list above (of course alphabetically sorted in between the
> > others)
> > gfc_resolve_omp_parallel_blocks (code, ns);
> 
> I think 'TARGET_SIMD' shouldn't be resolved through parallel blocks but

You're right, we use gfc_resolve_omp_parallel_blocks for
parallel, teams, task but not for target alone.

> can directly call gfc_resolve_omp_do_blocks (as
> currently/previously implemented). The masked versions were already
> handled inside gfc_resolve_omp_parallel_blocks but missing in
> gfc_resolve_code, while the 'loop' ones had to be added to both.
> 
> (I did not extend the testcase, but I updated two to add additional
> dg-error to the same line.)

> gcc/fortran/ChangeLog:
> 
>   PR fortran/108512
>   * openmp.cc (gfc_resolve_omp_parallel_blocks): Handle combined 'loop'
>   directives.
>   (gfc_resolve_do_iterator): Set a source location for added
>   'private'-clause arguments.
>   * resolve.cc (gfc_resolve_code): Call gfc_resolve_omp_do_blocks
>   also for EXEC_OMP_LOOP and gfc_resolve_omp_parallel_blocks for
>   combined directives with loop + '{masked,master} taskloop (simd)'.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/108512
>   * gfortran.dg/gomp/loop-5.f90: New test.
>   * gfortran.dg/gomp/loop-2.f90: Update dg-error.
>   * gfortran.dg/gomp/taskloop-2.f90: Update dg-error.

LGTM, thanks.

Jakub

Re: [Patch] libgomp: Fix 'target enter data' with always pointer

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Mon, Feb 13, 2023 at 09:28:15PM +0100, Tobias Burnus wrote:
> libgomp: Fix 'target enter data' with always pointer
> 
> As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that
> with 'target enter data' both are passed together to gomp_map_vars_internal.
> 
> libgomp/ChangeLog:
> 
>   * target.c (gomp_map_vars_internal): Add 'i > 0' before doing a
>   kind check.
>   (GOMP_target_enter_exit_data): If the next map item is
>   GOMP_MAP_ALWAYS_POINTER map it together with the current item.
> * testsuite/libgomp.fortran/target-enter-data-3.f90: New test.

8 spaces instead of tab, this won't get through the git pre-commit hook.

Otherwise LGTM.

Jakub

Re: [Patch] libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Thu, Feb 09, 2023 at 10:23:53AM +0100, Tobias Burnus wrote:
> libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
> 
> libgomp/
>   * target.c (gomp_target_rev): Dereference ptr
>   to get device address.
>   * libgomp.fortran/reverse-offload-5.f90: Add test
>   for unallocated allocatable.
> 
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3579,8 +3579,14 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
> }
>   int k;
>   n2 = NULL;
> - cdata[i].present = true;
> + /* Dereference devaddrs[j] to get the device addr.  */
> + assert (devaddrs[j]-sizes[j] == cdata[i].devaddr);

Formatting, there should be spaces around - on both sides.

> + devaddrs[j] = *(uint64_t *) (uintptr_t) (devaddrs[i]
> +  + sizes[j]);
> + cdata[j].present = true;
>   cdata[j].devaddr = devaddrs[j];
> + if (devaddrs[j] == 0)
> +   continue;
>   k = gomp_map_cdata_lookup (cdata, devaddrs, kinds, sizes, j,
>  devaddrs[j],
>  devaddrs[j] + sizeof (void*),

Otherwise LGTM.

Jakub

[PATCH] Update baseline symbols for aarch64-linux

2023-02-15 Thread Andreas Schwab via Gcc-patches

libstdc++-v3/
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
---
 .../aarch64-linux-gnu/baseline_symbols.txt| 90 +++
 1 file changed, 90 insertions(+)

diff --git a/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
index c8ccecb120c..0a0e7b2859b 100644
--- a/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
@@ -498,6 +498,10 @@ FUNC:_ZNKSt11__timepunctIwE8_M_am_pmEPPKw@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIwE9_M_monthsEPPKw@@GLIBCXX_3.4
 FUNC:_ZNKSt11logic_error4whatEv@@GLIBCXX_3.4
 FUNC:_ZNKSt12__basic_fileIcE7is_openEv@@GLIBCXX_3.4
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
 FUNC:_ZNKSt12bad_weak_ptr4whatEv@@GLIBCXX_3.4.15
 FUNC:_ZNKSt12future_error4whatEv@@GLIBCXX_3.4.14
 FUNC:_ZNKSt12strstreambuf6pcountEv@@GLIBCXX_3.4
@@ -668,6 +672,13 @@ FUNC:_ZNKSt5ctypeIwE8do_widenEPKcS2_Pw@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE8do_widenEc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEPKwS2_cPc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEwc@@GLIBCXX_3.4
+FUNC:_ZNKSt6chrono4tzdb11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono4tzdb12current_zoneEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone15_M_get_sys_infoENS_10time_pointINS_3_V212system_clockENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone17_M_get_local_infoENS_10time_pointINS_7local_tENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list14const_iteratordeEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5beginEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5frontEv@@GLIBCXX_3.4.31
 FUNC:_ZNKSt6locale2id5_M_idEv@@GLIBCXX_3.4
 FUNC:_ZNKSt6locale4nameB5cxx11Ev@@GLIBCXX_3.4.21
 FUNC:_ZNKSt6locale4nameEv@@GLIBCXX_3.4
@@ -3095,9 +3106,18 @@ FUNC:_ZNSt6__norm15_List_node_base7_M_hookEPS0_@@GLIBCXX_3.4.14
 FUNC:_ZNSt6__norm15_List_node_base7reverseEv@@GLIBCXX_3.4.9
 FUNC:_ZNSt6__norm15_List_node_base8transferEPS0_S1_@@GLIBCXX_3.4.9
 FUNC:_ZNSt6__norm15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14
+FUNC:_ZNSt6chrono11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono11reload_tzdbEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono12current_zoneEv@@GLIBCXX_3.4.31
 FUNC:_ZNSt6chrono12system_clock3nowEv@@GLIBCXX_3.4.11
+FUNC:_ZNSt6chrono13get_tzdb_listEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono14remote_versionB5cxx11Ev@@GLIBCXX_3.4.31
 FUNC:_ZNSt6chrono3_V212steady_clock3nowEv@@GLIBCXX_3.4.19
 FUNC:_ZNSt6chrono3_V212system_clock3nowEv@@GLIBCXX_3.4.19
+FUNC:_ZNSt6chrono8get_tzdbEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list11erase_afterENS0_14const_iteratorE@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list14const_iteratorppEi@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list14const_iteratorppEv@@GLIBCXX_3.4.31
 FUNC:_ZNSt6gslice8_IndexerC1EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6gslice8_IndexerC2EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6locale11_M_coalesceERKS_S1_i@@GLIBCXX_3.4
@@ -3213,6 +3233,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE13_S_copy_charsEPcPKcS
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE13_S_copy_charsEPcS5_S5_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE13shrink_to_fitEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE14_M_replace_auxEmmmc@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcmPKcmm@@GLIBCXX_3.4.31
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE16_M_get_allocatorEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE17_S_to_string_viewESt17basic_string_viewIcS2_E@@GLIBCXX_3.4.26
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE18_M_construct_aux_2Emc@@GLIBCXX_3.4.21
@@ -3364,6 +3385,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE13_S_copy_charsEPwPKwS
 FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE13_S_copy_charsEPwS5_S5_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE13shrink_to_fitEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE14_M_replace_auxEmmmw@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE15_M_replace_coldEPwmPKwmm@@GLIBCXX_3.4.31
 FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE16_M_get_allocatorEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basi

Re: [PATCH] Speedup DF dataflow solver

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Tue, Feb 14, 2023 at 03:21:53PM +0100, Richard Biener wrote:
> The following makes sure to process blocks that follow the current
> block in the iteration order in the same iteration and only postpone
> blocks that would be visited earlier to the next iteration.
> 
> For the all.i testcase in PR26854 at -O2 this shaves off 50% of
> the time to solve the DF RD problem, other problems also improve
> but not as drastically.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
>   PR middle-end/26854
>   * df-core.cc (df_worklist_propagate_forward): Put later
>   blocks on worklist and only earlier blocks on pending.
>   (df_worklist_propagate_backward): Likewise.
>   (df_worklist_dataflow_doublequeue): Change the iteration
>   to process new blocks in the same iteration if that
>   maintains the iteration order.

LGTM.

Jakub
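
The scheduling idea described in the quoted patch can be sketched on a toy
forward problem. This is an illustration under stated assumptions, not the
df-core.cc code: the CFG, the gen sets and all names are hypothetical, and
blocks are assumed to be numbered in the iteration order (reverse
postorder). A successor whose input changed is queued on the current
worklist if it comes later in the order, and on the pending set for the
next pass only if it comes earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_BLOCKS 4

/* Hypothetical CFG: 0 -> 1, 1 -> 2, 2 -> 1 (back edge), 2 -> 3.  */
static const int succs[N_BLOCKS][2]
  = { { 1, -1 }, { 2, -1 }, { 1, 3 }, { -1, -1 } };
static const uint32_t gen[N_BLOCKS] = { 0x1, 0x2, 0x4, 0x8 };

/* Solve out[b] = gen[b] | (union of out[p] over predecessors p).
   Returns the number of passes over the block order.  */
static int
solve_forward (uint32_t out[N_BLOCKS])
{
  uint32_t in[N_BLOCKS] = { 0 };
  bool worklist[N_BLOCKS];
  bool pending[N_BLOCKS] = { false };
  int passes = 0;
  bool more = true;

  for (int b = 0; b < N_BLOCKS; b++)
    {
      worklist[b] = true;
      out[b] = 0;
    }

  while (more)
    {
      passes++;
      for (int b = 0; b < N_BLOCKS; b++)
        {
          if (!worklist[b])
            continue;
          worklist[b] = false;
          out[b] = in[b] | gen[b];
          for (int e = 0; e < 2; e++)
            {
              int s = succs[b][e];
              if (s < 0 || (in[s] | out[b]) == in[s])
                continue;       /* No edge, or nothing new for S.  */
              assert (s < N_BLOCKS);
              in[s] |= out[b];
              if (s > b)
                worklist[s] = true;     /* Later block: same pass.  */
              else
                pending[s] = true;      /* Earlier block: next pass.  */
            }
        }
      /* The pending set becomes the worklist of the next pass.  */
      more = false;
      for (int b = 0; b < N_BLOCKS; b++)
        {
          worklist[b] = pending[b];
          more |= pending[b];
          pending[b] = false;
        }
    }
  return passes;
}
```

On this CFG the back edge 2 -> 1 puts block 1 on the pending set, and in the
second pass block 2 is revisited in the same pass because it follows block 1
in the order.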

Re: [PATCH] LoongArch: Fix multiarch tuple canonization

2023-02-15 Thread WANG Xuerui


Hi,

On 2023/2/13 18:38, Xi Ruoyao wrote:

Multiarch tuple will be coded in file or directory names in
multiarch-aware distros, so one ABI should have only one multiarch
tuple.  For example, "--target=loongarch64-linux-gnu --with-abi=lp64s"
and "--target=loongarch64-linux-gnusf" should both set multiarch tuple
to "loongarch64-linux-gnusf".  Before this commit,
"--target=loongarch64-linux-gnu --with-abi=lp64s --disable-multilib"
will produce wrong result (loongarch64-linux-gnu).

A recent LoongArch psABI revision mandates "loongarch64-linux-gnu" to be
used for -mabi=lp64d (instead of "loongarch64-linux-gnuf64") for some
non-technical reason [1].  Note that we cannot make
"loongarch64-linux-gnuf64" an alias for "loongarch64-linux-gnu" because
to implement such an alias, we must create thousands of symlinks in the
distro and doing so would be completely impractical.  This commit also
aligns GCC with the revision.

Tested by building cross compilers with --enable-multiarch and multiple
combinations of --target=loongarch64-linux-gnu*, --with-abi=lp64{s,f,d},
and --{enable,disable}-multilib; and run "xgcc --print-multiarch" then
manually verify the result with eyesight.

Ok for trunk and backport to releases/gcc-12?

[1]: https://github.com/loongson/LoongArch-Documentation/pull/80

gcc/ChangeLog:

* config.gcc (triplet_abi): Set its value based on $with_abi,
instead of $target.
(la_canonical_triplet): Set it after $triplet_abi is set
correctly.
* config/loongarch/t-linux (MULTILIB_OSDIRNAMES): Make the
multiarch tuple for lp64d "loongarch64-linux-gnu" (without
"f64" suffix).
---
  gcc/config.gcc   | 14 +++++++-------
  gcc/config/loongarch/t-linux |  2 +-
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 067720ac795..c070e6ecd2e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4889,20 +4889,16 @@ case "${target}" in
case ${target} in
loongarch64-*-*-*f64)
abi_pattern="lp64d"
-   triplet_abi="f64"
;;
loongarch64-*-*-*f32)
abi_pattern="lp64f"
-   triplet_abi="f32"
;;
loongarch64-*-*-*sf)
abi_pattern="lp64s"
-   triplet_abi="sf"
;;
loongarch64-*-*-*)
abi_pattern="lp64[dfs]"
abi_default="lp64d"
-   triplet_abi=""
;;
*)
echo "Unsupported target ${target}." 1>&2
@@ -4923,9 +4919,6 @@ case "${target}" in
  ;;
esac
  
-		la_canonical_triplet="loongarch64-${triplet_os}${triplet_abi}"
-
-
# Perform initial sanity checks on --with-* options.
case ${with_arch} in
"" | loongarch64 | la464) ;; # OK, append here.
@@ -4996,6 +4989,13 @@ case "${target}" in
;;
esac
  
+		case ${with_abi} in
+ "lp64d") triplet_abi="";;
+ "lp64f") triplet_abi="f32";;
+ "lp64s") triplet_abi="sf";;
+   esac
+   la_canonical_triplet="loongarch64-${triplet_os}${triplet_abi}"
+
# Set default value for with_abiext (internal)
case ${with_abiext} in
"")
diff --git a/gcc/config/loongarch/t-linux b/gcc/config/loongarch/t-linux
index 131c45fdced..e40da179203 100644
--- a/gcc/config/loongarch/t-linux
+++ b/gcc/config/loongarch/t-linux
@@ -40,7 +40,7 @@ ifeq ($(filter LA_DISABLE_MULTILIB,$(tm_defines)),)
  
  MULTILIB_OSDIRNAMES = \
mabi.lp64d=../lib64$\
-  $(call if_multiarch,:loongarch64-linux-gnuf64)
+  $(call if_multiarch,:loongarch64-linux-gnu)
  
  MULTILIB_OSDIRNAMES += \
mabi.lp64f=../lib64/f32$\


Thanks for the quick patch; however, Revy told me offline yesterday that
this might conflict with things on the Debian side once it gets merged. He
may have more details to share.


Adding him to CC -- you could keep him CC-ed on future changes that may 
impact distro packaging.
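
For reference, the mapping the patch establishes (one multiarch tuple per
ABI, chosen from the ABI rather than from the target string) can be written
down as a small C helper. This is only a sketch: demo_multiarch_tuple is a
hypothetical name, and the OS part is assumed to be "linux-gnu".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the multiarch tuple for ABI into BUF.  Returns 0 on success and
   -1 for an unknown ABI or a buffer that is too small.  */
static int
demo_multiarch_tuple (const char *abi, char *buf, size_t len)
{
  const char *suffix;
  int n;

  assert (abi != NULL && buf != NULL);
  if (strcmp (abi, "lp64d") == 0)
    suffix = "";        /* No "f64" suffix, as in the psABI revision.  */
  else if (strcmp (abi, "lp64f") == 0)
    suffix = "f32";
  else if (strcmp (abi, "lp64s") == 0)
    suffix = "sf";
  else
    return -1;

  n = snprintf (buf, len, "loongarch64-linux-gnu%s", suffix);
  return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

With this mapping, both "--target=loongarch64-linux-gnu --with-abi=lp64s"
and "--target=loongarch64-linux-gnusf" end up with the same tuple.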

Re: [patch] bpf: Fix double whitespace warning

2023-02-15 Thread Jose E. Marchesi via Gcc-patches



> Hi!
>
> Since a recent commit, the BPF target produces a new warning due to
> two consecutive non-quoted spaces in a message. This'll fix it:
>
> gcc/
>   * config/bpf/bpf.cc (bpf_option_override): Fix doubled space.
>
>
> Ok?

OK.  Thanks for the patch.

(Sorry I didn't fix this when you first reported it.  My TODO list is
long atm :/)

> MfG, JBG
>
> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
> index b268801d00c..d8693f8cfbe 100644
> --- a/gcc/config/bpf/bpf.cc
> +++ b/gcc/config/bpf/bpf.cc
> @@ -258,7 +258,7 @@ bpf_option_override (void)
>  {
>inform (input_location,
>"%<-fstack-protector%> does not work "
> -  " on this architecture");
> +   "on this architecture");
>flag_stack_protect = 0;
>  }
>  }
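
For readers wondering where the doubled space came from: adjacent string
literals are concatenated by the compiler, so a trailing space in one piece
plus a leading space in the next piece yields two spaces in the message.
A minimal sketch (the variable and function names are illustrative only):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The message as written before and after the fix.  */
static const char before_fix[] = "%<-fstack-protector%> does not work "
                                 " on this architecture";
static const char after_fix[] = "%<-fstack-protector%> does not work "
                                "on this architecture";

/* Return nonzero if S contains two consecutive spaces.  */
static int
has_double_space (const char *s)
{
  assert (s != NULL);
  return strstr (s, "  ") != NULL;
}
```

has_double_space is true for before_fix and false for after_fix.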

[pushed] testsuite, objective-c: Fix a testcase on Windows.

2023-02-15 Thread Iain Sandoe via Gcc-patches

tested by 'nightstrike' on Windows, and on x86_64-darwin21,
pushed to master, thanks,
Iain

--- 8< ---

Windows needs to use uintptr_t to represent an integral pointer type (long
is not the right type there).

Patch from 'nightstrike'.

Signed-off-by: Iain Sandoe 

gcc/testsuite/ChangeLog:

* obj-c++.dg/proto-lossage-4.mm: Use uintptr_t for integral pointer
representations.
---
 gcc/testsuite/obj-c++.dg/proto-lossage-4.mm | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/obj-c++.dg/proto-lossage-4.mm b/gcc/testsuite/obj-c++.dg/proto-lossage-4.mm
index 2e753d1f8ba..ff053bec7d0 100644
--- a/gcc/testsuite/obj-c++.dg/proto-lossage-4.mm
+++ b/gcc/testsuite/obj-c++.dg/proto-lossage-4.mm
@@ -6,24 +6,26 @@
 /* One-line substitute for objc/objc.h */
 typedef struct objc_object { struct objc_class *class_pointer; } *id;
 
+typedef __UINTPTR_TYPE__ uintptr_t;
+
 @protocol Proto
-- (long)someValue;
+- (uintptr_t)someValue;
 @end
 
 @interface Obj
-- (long)anotherValue;
+- (uintptr_t)anotherValue;
 @end
 
-long foo(void) {
-  long receiver = 2;
+uintptr_t foo(void) {
+  uintptr_t receiver = 2;
   Obj *objrcvr;
   Obj  *objrcvr2;
 
   /* NB: Since 'receiver' is an invalid ObjC message receiver, the compiler
  should warn but then search for methods as if we were messaging 'id'.  */
 
-  receiver += [receiver someValue]; /* { dg-warning "invalid receiver type .long int." } */
-  receiver += [receiver anotherValue]; /* { dg-warning "invalid receiver type .long int." } */
+  receiver += [receiver someValue]; /* { dg-warning "invalid receiver type .uintptr_t." } */
+  receiver += [receiver anotherValue]; /* { dg-warning "invalid receiver type .uintptr_t." } */
 
   receiver += [(Obj *)receiver someValue]; /* { dg-warning ".Obj. may not respond to .\\-someValue." } */
 /* { dg-error "invalid conversion" "" { target *-*-* } .-1 } */
-- 
2.37.1 (Apple Git-137.1)
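
Background for the type change, as a small C sketch: 64-bit Windows uses the
LLP64 model, where long is 32 bits wide while pointers are 64 bits wide, so
long cannot hold a pointer value there, whereas uintptr_t is defined for
exactly that purpose. demo_round_trip is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

static_assert (sizeof (uintptr_t) >= sizeof (void *),
               "uintptr_t is expected to be wide enough for a pointer");

/* Convert a pointer to an integer and back; returns nonzero if the
   round trip preserves the pointer.  */
static int
demo_round_trip (void *p)
{
  uintptr_t bits = (uintptr_t) p;
  return (void *) bits == p;
}
```

The same round trip through long would truncate the pointer on an LLP64
target.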

[PATCH] RISC-V: Rename tu_preds to none_tu_preds [NFC]

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

For consistency with the naming of the other preds array variables,
rename tu_preds to none_tu_preds, which indicates that these preds
include the vop and vop_tu combinations.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-functions.def (vadc): Rename.
(vsbc): Ditto.
(vmerge): Ditto.
(vmv_v): Ditto.
* config/riscv/riscv-vector-builtins.cc: Ditto.

---
 .../riscv/riscv-vector-builtins-functions.def| 16 ++++++++--------
 gcc/config/riscv/riscv-vector-builtins.cc|  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 9bad1373bfd..e6c19691d17 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -113,14 +113,14 @@ DEF_RVV_FUNCTION (vsext, alu, full_preds, i_vf4_ops)
 DEF_RVV_FUNCTION (vsext, alu, full_preds, i_vf8_ops)
 
 // 11.4. Vector Integer Add-with-Carry/Subtract-with-Borrow Instructions
-DEF_RVV_FUNCTION (vadc, no_mask_policy, tu_preds, iu_vvvm_ops)
-DEF_RVV_FUNCTION (vadc, no_mask_policy, tu_preds, iu_vvxm_ops)
+DEF_RVV_FUNCTION (vadc, no_mask_policy, none_tu_preds, iu_vvvm_ops)
+DEF_RVV_FUNCTION (vadc, no_mask_policy, none_tu_preds, iu_vvxm_ops)
 DEF_RVV_FUNCTION (vmadc, return_mask, none_preds, iu_mvvm_ops)
 DEF_RVV_FUNCTION (vmadc, return_mask, none_preds, iu_mvxm_ops)
 DEF_RVV_FUNCTION (vmadc, return_mask, none_preds, iu_mvv_ops)
 DEF_RVV_FUNCTION (vmadc, return_mask, none_preds, iu_mvx_ops)
-DEF_RVV_FUNCTION (vsbc, no_mask_policy, tu_preds, iu_vvvm_ops)
-DEF_RVV_FUNCTION (vsbc, no_mask_policy, tu_preds, iu_vvxm_ops)
+DEF_RVV_FUNCTION (vsbc, no_mask_policy, none_tu_preds, iu_vvvm_ops)
+DEF_RVV_FUNCTION (vsbc, no_mask_policy, none_tu_preds, iu_vvxm_ops)
 DEF_RVV_FUNCTION (vmsbc, return_mask, none_preds, iu_mvvm_ops)
 DEF_RVV_FUNCTION (vmsbc, return_mask, none_preds, iu_mvxm_ops)
 DEF_RVV_FUNCTION (vmsbc, return_mask, none_preds, iu_mvv_ops)
@@ -230,12 +230,12 @@ DEF_RVV_FUNCTION (vwmaccsu, alu, full_preds, i_su_wwxv_ops)
 DEF_RVV_FUNCTION (vwmaccus, alu, full_preds, i_us_wwxv_ops)
 
 // 11.15. Vector Integer Merge Instructions
-DEF_RVV_FUNCTION (vmerge, no_mask_policy, tu_preds, all_vvvm_ops)
-DEF_RVV_FUNCTION (vmerge, no_mask_policy, tu_preds, iu_vvxm_ops)
+DEF_RVV_FUNCTION (vmerge, no_mask_policy, none_tu_preds, all_vvvm_ops)
+DEF_RVV_FUNCTION (vmerge, no_mask_policy, none_tu_preds, iu_vvxm_ops)
 
 // 11.16 Vector Integer Move Instructions
-DEF_RVV_FUNCTION (vmv_v, move, tu_preds, all_v_ops)
-DEF_RVV_FUNCTION (vmv_v, move, tu_preds, iu_x_ops)
+DEF_RVV_FUNCTION (vmv_v, move, none_tu_preds, all_v_ops)
+DEF_RVV_FUNCTION (vmv_v, move, none_tu_preds, iu_x_ops)
 
 /* 12. Vector Fixed-Point Arithmetic Instructions. */
 
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 54681bab3ea..97ca1f11541 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -481,7 +481,7 @@ static CONSTEXPR const predication_type_index full_preds[]
  PRED_TYPE_tumu, PRED_TYPE_mu, NUM_PRED_TYPES};
 
 /* vop/vop_tu will be registered.  */
-static CONSTEXPR const predication_type_index tu_preds[]
+static CONSTEXPR const predication_type_index none_tu_preds[]
   = {PRED_TYPE_none, PRED_TYPE_tu, NUM_PRED_TYPES};
 
 /* vop/vop_m will be registered.  */
-- 
2.36.3

RE: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-15 Thread Tamar Christina via Gcc-patches

> > >>> In any case, if you disagree I don't really see a way forward
> > >>> aside from making this its own pattern running it before the
> > >>> overwidening pattern.
> > >> I think we should look to see if ranger can be persuaded to provide
> > >> the range of the 16-bit addition, even though the statement that
> > >> produces it isn't part of a BB.  It shouldn't matter that the
> > >> addition originally came from a 32-bit one: the range follows
> > >> directly from the ranges of the operands (i.e. the fact that the
> > >> operands are the results of widening conversions).
> > > I think you can ask ranger on operations on names defined in the IL,
> > > so you can work yourself through the sequence of operations in the
> > > pattern sequence to compute ranges on their defs (and possibly even
> > > store them in the SSA info).  You just need to pick the correct
> > > ranger API for this…. Andrew CCed
> > >
> > >
> > It's not clear to me what's being asked...
> >
> > Expressions don't need to be in the IL to do range calculations.. I
> > believe we support arbitrary tree expressions via range_of_expr.
> >
> > if you have 32 bit ranges that you want to do 16 bit addition on, you
> > can also cast those ranges to a 16bit type,
> >
> > my32bitrange.cast (my16bittype);
> >
> > then invoke range-ops directly via getting the handler:
> >
> > handler = range_op_handler (PLUS_EXPR, 16bittype_tree);
> > if (handler)
> >     handler->fold (result, my16bittype, mycasted32bitrange,
> >                    myothercasted32bitrange)
> >
> > There are higher level APIs if what you have on hand is closer to IL
> > than random ranges
> >
> > Describe exactly what it is you want to do... and I'll try to direct
> > you to the best way to do it.
> 
> The vectorizer has a pattern matcher that runs at startup on the scalar code.
> This pattern matcher can replace one or more statements with alternative
> ones; these can be either existing tree_codes or new internal functions.
> 
> One of the patterns here is an overwidening detection pattern, which reduces
> the precision in which an operation is done during vectorization.
> 
> Another one is widening addition, which replaces PLUS_EXPR with
> WIDEN_PLUS_EXPR.
> 
> These can be chained, so e.g. a widening addition done on ints can be
> reduced to a widening addition done on shorts.
> 
> The question is whether, given the new expression that the vectorizer has
> created, ranger can tell what the precision is.  get_range_query fails,
> presumably because it has no idea about the new operations created and
> also doesn't know about any new IFNs.

Hi,

I have been trying to use ranger as requested. I've tried:

  gimple_ranger ranger;
  int_range_max r;
  /* Check that no overflow will occur.  If we don't have range
     information we can't perform the optimization.  */
  if (ranger.range_of_expr (r, oprnd0, stmt))
    {
      wide_int max = r.upper_bound ();


Which works for non-patterns, but still doesn't work for patterns.
On a stmt:
patt_27 = (_3) w+ (level_15(D));

it gives me a range:

$2 = {
   = {
val = {[0x0] = 0x, [0x1] = 0x7fff95bd8b00, [0x2] = 
0x7fff95bd78b0, [0x3] = 0x3fa1dd0, [0x4] = 0x3fa1dd0, [0x5] = 
0x344a706f832d4f00, [0x6] = 0x7fff95bd7950, [0x7] = 0x1ae7f11, [0x8] = 
0x7fff95bd79f8},
len = 0x1,
precision = 0x10
  },
  members of generic_wide_int:
  static is_sign_extended = 0x1
}

The precision is fine, but range seems to be -1?

Should I use range_op_handler (WIDEN_PLUS_EXPR, ...) in this case?

Thanks,
Tamar

> 
> Thanks,
> Tamar
> 
> >
> > Andrew
> >
> >
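
The point made upthread, that the range of the 16-bit addition follows
directly from the ranges of the widened operands, can be illustrated with
a toy interval model.  This is only a sketch: Interval, add and
fits_unsigned are hypothetical names for illustration and are not part of
ranger or any GCC API.

```cpp
#include <cstdint>

// Toy closed interval [lo, hi].  This is NOT GCC's irange/int_range_max;
// it only models the arithmetic discussed in the thread.
struct Interval {
  std::int64_t lo;
  std::int64_t hi;
};

// Full range of an unsigned integer with the given number of bits
// (bits is assumed to be well below 63).
constexpr Interval full_unsigned_range(unsigned bits) {
  return {0, (std::int64_t{1} << bits) - 1};
}

// Range of a + b computed from the operand ranges, without wrapping.
constexpr Interval add(Interval a, Interval b) {
  return {a.lo + b.lo, a.hi + b.hi};
}

// True if every value in r is representable in an unsigned type of the
// given width, i.e. the addition cannot overflow at that precision.
constexpr bool fits_unsigned(Interval r, unsigned bits) {
  return r.lo >= 0 && r.hi <= full_unsigned_range(bits).hi;
}
```

With two operands that were widened from 8 bits, the sum stays within
[0, 510], so doing the addition in 16 bits is safe; with two full 16-bit
operands it is not.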

[PATCH] doc: Suggest fix for -Woverloaded-virtual warnings

2023-02-15 Thread Jonathan Wakely via Gcc-patches

OK for trunk?

-- >8 --

Users are confused about what this warning means, so add a suggested
solution to the documentation.

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Suggest adding a
using-declaration to unhide functions.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 26de582e41e..6404ed5c4ff 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4282,6 +4282,10 @@ b->f();
 @noindent
 fails to compile.
 
+In cases where the different signatures are not an accident, the
+simplest solution is to add a using-declaration to the derived class
+to un-hide the base function, e.g. add @code{using A::f;} to @code{B}.
+
 The optional level suffix controls the behavior when all the
 declarations in the derived class override virtual functions in the
 base class, even if not all of the base functions are overridden:
-- 
2.39.1
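
For readers of the archive, a minimal sketch of the fix the new
documentation text suggests.  The class names A and B follow the manual's
example; the return values and the helper call_both are assumptions made
up for illustration.

```cpp
struct A {
  virtual ~A() = default;
  virtual int f() { return 1; }
};

struct B : A {
  using A::f;                     // un-hides A::f() alongside B::f(int)
  int f(int x) { return x + 1; }  // different signature, would hide A::f()
};

// The call b.f() compiles only because of the using-declaration in B.
inline int call_both(B &b) { return b.f() + b.f(41); }
```

Without the using-declaration, the call b.f() would be rejected because
B::f(int) hides A::f().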

[Patch] harden-sls-6.c: fix warning on LLP64

2023-02-15 Thread Jonathan Yong via Gcc-patches


gcc/testsuite/ChangeLog:

* gcc.target/i386/harden-sls-6.c: fix warning on LLP64
targets.

Attached patch OK?

From c0572a1e95c6f569980d6b7454c8dc293f07389e Mon Sep 17 00:00:00 2001
From: Jonathan Yong <10wa...@gmail.com>
Date: Wed, 15 Feb 2023 13:42:12 +
Subject: [PATCH] harden-sls-6.c: fix warning on LLP64

gcc/testsuite/ChangeLog:

	* gcc.target/i386/harden-sls-6.c: fix warning on LLP64
	targets.

Signed-off-by: Jonathan Yong <10wa...@gmail.com>
---
 gcc/testsuite/gcc.target/i386/harden-sls-6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/harden-sls-6.c b/gcc/testsuite/gcc.target/i386/harden-sls-6.c
index 9068eb64008..3b270211927 100644
--- a/gcc/testsuite/gcc.target/i386/harden-sls-6.c
+++ b/gcc/testsuite/gcc.target/i386/harden-sls-6.c
@@ -11,7 +11,7 @@ struct _Unwind_Context {
   struct _Unwind_Context cur_contextcur_context =
   _Unwind_Resume_or_Rethrow_this_context;
   offset(0);
-  __builtin_eh_return ((long) offset, 0);
+  __builtin_eh_return ((__INTPTR_TYPE__) offset, 0);
 }
 
 /* { dg-final { scan-assembler "jmp\[ \t\]+\\*%rcx" } } */
-- 
2.39.2
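
As background for the change above: on LLP64 targets long is narrower
than a pointer, so a cast through long can warn, while a pointer-sized
integer type does not.  A small sketch of the portable form follows; it
uses std::intptr_t from <cstdint> rather than the __INTPTR_TYPE__ macro
used in the test, and round_trips and long_holds_pointer are hypothetical
names.

```cpp
#include <cstdint>

// Converts a pointer to a pointer-sized integer and back.  std::intptr_t
// is wide enough on both LP64 and LLP64 targets where it is provided.
inline bool round_trips(const void *p) {
  std::intptr_t as_int = reinterpret_cast<std::intptr_t>(p);
  return reinterpret_cast<const void *>(as_int) == p;
}

// True on LP64 (e.g. 64-bit Linux); expected to be false on LLP64
// (e.g. 64-bit Windows), where long is 32 bits.
constexpr bool long_holds_pointer = sizeof(long) >= sizeof(void *);
```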

[PATCH v2] warn-access: wrong -Wdangling-pointer with labels [PR106080]

2023-02-15 Thread Marek Polacek via Gcc-patches

On Wed, Feb 15, 2023 at 10:50:08AM +0100, Jakub Jelinek wrote:
> On Tue, Feb 14, 2023 at 10:48:15PM -0500, Marek Polacek via Gcc-patches wrote:
> > -Wdangling-pointer warns when the address of a label escapes.  This
> > causes grief in OCaml () as
> > well as in the kernel:
> >  because it uses
> > 
> >   #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; 
> > })
> > 
> > to get the PC.  -Wdangling-pointer is documented to warn about pointers
> > to objects.  However, it uses is_auto_decl which checks DECL_P, but DECL_P
> > is also true for a label/enumerator/function declaration, none of which is
> > an object.  Rather, it should use auto_var_p which correctly checks VAR_P
> > and PARM_DECL.
> 
> and RESULT_DECL ;)
> 
> > Bootstrapped/regtested on ppc64le-pc-linux-gnu, ok for trunk and 12?
> > 
> > PR middle-end/106080
> > 
> > gcc/ChangeLog:
> > 
> > * gimple-ssa-warn-access.cc (is_auto_decl): Remove.  Use auto_var_p
> > instead.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * c-c++-common/Wdangling-pointer-10.c: New test.
> > * c-c++-common/Wdangling-pointer-9.c: New test.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/Wdangling-pointer-9.c
> > @@ -0,0 +1,9 @@
> > +/* PR middle-end/106080 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -Wdangling-pointer" } */
> > +
> > +void
> > +foo (void **failaddr)
> > +{
> > +  *failaddr = ({ __label__ __here; __here: &&__here; });
> > +}
> 
> Perhaps add dg-bogus above just to make it more clear what
> we are testing in the test?

Ok, here it is.  Ok?

-- >8 --
-Wdangling-pointer warns when the address of a label escapes.  This
causes grief in OCaml () as
well as in the kernel:
 because it uses

  #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

to get the PC.  -Wdangling-pointer is documented to warn about pointers
to objects.  However, it uses is_auto_decl which checks DECL_P, but DECL_P
is also true for a label/enumerator/function declaration, none of which is
an object.  Rather, it should use auto_var_p which correctly checks VAR_P
and PARM_DECL.

PR middle-end/106080

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (is_auto_decl): Remove.  Use auto_var_p
instead.

gcc/testsuite/ChangeLog:

* c-c++-common/Wdangling-pointer-10.c: New test.
* c-c++-common/Wdangling-pointer-9.c: New test.
---
 gcc/gimple-ssa-warn-access.cc | 19 +--
 .../c-c++-common/Wdangling-pointer-10.c   | 12 
 .../c-c++-common/Wdangling-pointer-9.c|  9 +
 3 files changed, 26 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wdangling-pointer-10.c
 create mode 100644 gcc/testsuite/c-c++-common/Wdangling-pointer-9.c

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index ad9dac54874..2eab1d59abd 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -4326,15 +4326,6 @@ pass_waccess::check_call (gcall *stmt)
   check_nonstring_args (stmt);
 }
 
-
-/* Return true of X is a DECL with automatic storage duration.  */
-
-static inline bool
-is_auto_decl (tree x)
-{
-  return DECL_P (x) && !DECL_EXTERNAL (x) && !TREE_STATIC (x);
-}
-
 /* Check non-call STMT for invalid accesses.  */
 
 void
@@ -4363,7 +4354,7 @@ pass_waccess::check_stmt (gimple *stmt)
   while (handled_component_p (lhs))
lhs = TREE_OPERAND (lhs, 0);
 
-  if (is_auto_decl (lhs))
+  if (auto_var_p (lhs))
m_clobbers.remove (lhs);
   return;
 }
@@ -4383,7 +4374,7 @@ pass_waccess::check_stmt (gimple *stmt)
   while (handled_component_p (arg))
arg = TREE_OPERAND (arg, 0);
 
-  if (!is_auto_decl (arg))
+  if (!auto_var_p (arg))
return;
 
   gimple **pclobber = m_clobbers.get (arg);
@@ -4467,7 +4458,7 @@ void
 pass_waccess::check_dangling_uses (tree var, tree decl, bool maybe /* = false */,
   bool objref /* = false */)
 {
-  if (!decl || !is_auto_decl (decl))
+  if (!decl || !auto_var_p (decl))
 return;
 
   gimple **pclob = m_clobbers.get (decl);
@@ -4528,7 +4519,7 @@ pass_waccess::check_dangling_stores (basic_block bb,
   if (!m_ptr_qry.get_ref (lhs, stmt, &lhs_ref, 0))
continue;
 
-  if (is_auto_decl (lhs_ref.ref))
+  if (auto_var_p (lhs_ref.ref))
continue;
 
   if (DECL_P (lhs_ref.ref))
@@ -4573,7 +4564,7 @@ pass_waccess::check_dangling_stores (basic_block bb,
  || rhs_ref.deref != -1)
continue;
 
-  if (!is_auto_decl (rhs_ref.ref))
+  if (!auto_var_p (rhs_ref.ref))
continue;
 
   auto_diagnostic_group d;
diff --git a/gcc/testsuite/c-c++-common/Wdangling-pointer-10.c 
b/gcc/testsuite/c-c++-common/Wdangling-pointer

Re: [PATCH v2] warn-access: wrong -Wdangling-pointer with labels [PR106080]

2023-02-15 Thread Jakub Jelinek via Gcc-patches

On Wed, Feb 15, 2023 at 08:46:07AM -0500, Marek Polacek wrote:
> > Perhaps add dg-bogus above just to make it more clear what
> > we are testing in the test?
> 
> Ok, here it is.  Ok?

Sure, thanks.
>   PR middle-end/106080
> 
> gcc/ChangeLog:
> 
>   * gimple-ssa-warn-access.cc (is_auto_decl): Remove.  Use auto_var_p
>   instead.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/Wdangling-pointer-10.c: New test.
>   * c-c++-common/Wdangling-pointer-9.c: New test.

Jakub
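
To make the difference between the two predicates concrete, here is a toy
model.  It is a simplified sketch and not GCC's actual definition of
auto_var_p: Kind, Decl, old_is_auto and new_is_auto are hypothetical
stand-ins for tree codes, trees, is_auto_decl and auto_var_p.

```cpp
// Hypothetical stand-in for the tree code of a declaration.
enum class Kind { Var, Parm, Result, Label, Function, Enumerator };

// Hypothetical stand-in for a DECL node with the two flags the removed
// is_auto_decl looked at.
struct Decl {
  Kind kind;
  bool is_external;
  bool is_static;
};

// Models the removed is_auto_decl: any declaration that is neither
// external nor static counts, including labels.
inline bool old_is_auto(const Decl &d) {
  return !d.is_external && !d.is_static;
}

// Models the intent of auto_var_p as described in the thread: only
// variables, parameters and results are objects.
inline bool new_is_auto(const Decl &d) {
  bool object = d.kind == Kind::Var || d.kind == Kind::Parm
                || d.kind == Kind::Result;
  return object && !d.is_external && !d.is_static;
}
```

In this model a label passes the old check but not the new one, which is
why the address of a label no longer reaches the dangling-pointer logic.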

[PATCH] Fix PR target/90458

2023-02-15 Thread Eric Botcazou via Gcc-patches

Hi,

this is the incompatibility of -fstack-clash-protection with Windows SEH.  Now 
the Windows ports always enable TARGET_STACK_PROBE, which means that the stack 
is always probed (out of line) so -fstack-clash-protection does nothing more.

Tested on x86-64/Windows and Linux, OK for all active branches?


2023-02-15  Eric Botcazou  

* config/i386/i386.cc (ix86_compute_frame_layout): Disable the
effects of -fstack-clash-protection for TARGET_STACK_PROBE.
(ix86_expand_prologue): Likewise.

-- 
Eric Botcazou

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 3cacf738c4a..22f444be23c 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -6876,7 +6876,9 @@ ix86_compute_frame_layout (void)
 	 stack clash protections are enabled and the allocated frame is
 	 larger than the probe interval, then use pushes to save
 	 callee saved registers.  */
-  || (flag_stack_clash_protection && to_allocate > get_probe_interval ()))
+  || (flag_stack_clash_protection
+	  && !ix86_target_stack_probe ()
+	  && to_allocate > get_probe_interval ()))
 frame->save_regs_using_mov = false;
 
   if (ix86_using_red_zone ()
@@ -8761,8 +8763,11 @@ ix86_expand_prologue (void)
   sse_registers_saved = true;
 }
 
-  /* If stack clash protection is requested, then probe the stack.  */
-  if (allocate >= 0 && flag_stack_clash_protection)
+  /* If stack clash protection is requested, then probe the stack, unless it
+ is already probed on the target.  */
+  if (allocate >= 0
+  && flag_stack_clash_protection
+  && !ix86_target_stack_probe ())
 {
   ix86_adjust_stack_and_probe (allocate, int_registers_saved, false);
   allocate = 0;

[PATCH] i386: Rename extr_register_operand to int248_register_operand

2023-02-15 Thread Uros Bizjak via Gcc-patches

No functional changes.

gcc/ChangeLog:

2023-02-15  Uroš Bizjak  

* config/i386/predicates.md (int248_register_operand):
Rename from extr_register_operand.
* config/i386/i386.md (*extv): Update for renamed predicate.
(*extzx): Ditto.
(*ashl3_doubleword_mask): Use int248_register_operand predicate.
(*ashl3_mask): Ditto.
(*3_mask): Ditto.
(*3_doubleword_mask): Ditto.
(*3_mask): Ditto.
(*_mask): Ditto.
(*btr_mask): Ditto.
(*jcc_bt_mask_1): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 5a946beb1c6..e37bc8dca53 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3159,7 +3159,7 @@
 
 (define_insn "*extv"
   [(set (match_operand:SWI24 0 "register_operand" "=R")
-   (sign_extract:SWI24 (match_operand 1 "extr_register_operand" "Q")
+   (sign_extract:SWI24 (match_operand 1 "int248_register_operand" "Q")
(const_int 8)
(const_int 8)))]
   ""
@@ -3202,7 +3202,7 @@
 
 (define_insn "*extzv"
   [(set (match_operand:SWI248 0 "register_operand" "=R")
-   (zero_extract:SWI248 (match_operand 1 "extr_register_operand" "Q")
+   (zero_extract:SWI248 (match_operand 1 "int248_register_operand" "Q")
 (const_int 8)
 (const_int 8)))]
   ""
@@ -12449,15 +12449,12 @@
  (match_operand: 1 "register_operand")
  (subreg:QI
(and
- (match_operand 2 "register_operand" "c")
+ (match_operand 2 "int248_register_operand" "c")
  (match_operand 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
   "((INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0
 || ((INTVAL (operands[3]) & (2 *  * BITS_PER_UNIT - 1))
 == (2 *  * BITS_PER_UNIT - 1)))
-   && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
-   && IN_RANGE (GET_MODE_SIZE (GET_MODE (operands[2])), 2,
-   4 << (TARGET_64BIT ? 1 : 0))
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -12844,15 +12841,12 @@
  (match_operand:SWI48 1 "nonimmediate_operand")
  (subreg:QI
(and
- (match_operand 2 "register_operand" "c,r")
+ (match_operand 2 "int248_register_operand" "c,r")
  (match_operand 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (ASHIFT, mode, operands)
&& (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
   == GET_MODE_BITSIZE (mode)-1
-   && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
-   && IN_RANGE (GET_MODE_SIZE (GET_MODE (operands[2])), 2,
-   4 << (TARGET_64BIT ? 1 : 0))
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -13438,15 +13432,12 @@
  (match_operand:SWI48 1 "nonimmediate_operand")
  (subreg:QI
(and
- (match_operand 2 "register_operand" "c,r")
+ (match_operand 2 "int248_register_operand" "c,r")
  (match_operand 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (, mode, operands)
&& (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
   == GET_MODE_BITSIZE (mode)-1
-   && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
-   && IN_RANGE (GET_MODE_SIZE (GET_MODE (operands[2])), 2,
-   4 << (TARGET_64BIT ? 1 : 0))
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -13489,15 +13480,12 @@
  (match_operand: 1 "register_operand")
  (subreg:QI
(and
- (match_operand 2 "register_operand" "c")
+ (match_operand 2 "int248_register_operand" "c")
  (match_operand 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
   "((INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0
 || ((INTVAL (operands[3]) & (2 *  * BITS_PER_UNIT - 1))
 == (2 *  * BITS_PER_UNIT - 1)))
-   && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
-   && IN_RANGE (GET_MODE_SIZE (GET_MODE (operands[2])), 2,
-   4 << (TARGET_64BIT ? 1 : 0))
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -14384,15 +14372,12 @@
  (match_operand:SWI 1 "nonimmediate_operand")
  (subreg:QI
(and
- (match_operand 2 "register_operand" "c")
+ (match_operand 2 "int248_register_operand" "c")
  (match_operand 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (, mode, operands)
&& (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
   == GET_MODE_BITSIZE (mode)-1
-   && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
-   && IN_RANGE (GET_MODE_SIZE (GET_MODE (operands[2])), 2,
-   4 << (TARGET_64BIT ? 1 : 0))
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -14412,13 +14397,10 @@
  (match_operand:SWI 1 "const_int_operand")
  (subreg:QI

[PATCH] testsuite/i386: Cleanup target selectors in i386 target directory.

2023-02-15 Thread Uros Bizjak via Gcc-patches

gcc/testsuite/ChangeLog:

2023-02-15  Uroš Bizjak  

* g++.target/i386/empty-class2.C (dg-additional-options): Remove.
* gcc.target/i386/avx512fp16-reduce-op-2.c: Ditto.
* gcc.target/i386/pr99464.c: Ditto.
* gcc.target/i386/pr103541.c (dg-do): Compile for !ia32 target.
* gcc.target/i386/pr108774.c (dg-do): Compile for lp64 target.
* gcc.target/i386/pr85593.c (dg-do): Run for *-*-linux* target.
* gcc.target/i386/pr98063.c: Ditto.
* gcc.target/i386/pr90007.c (dg-do): Remove target selector.
* gcc.target/i386/pr92841-2.c (dg-do): Remove unneeded curly braces.
* gcc.target/i386/pr95464.c: Ditto.
* gcc.target/i386/pr99530-1.c (dg-do): Compile for *-*-linux* target.
* gcc.target/i386/pr99530-2.c: Ditto.
* gcc.target/i386/pr99530-3.c: Ditto.
* gcc.target/i386/pr99530-4.c: Ditto.
* gcc.target/i386/pr99530-5.c: Ditto.
* gcc.target/i386/pr99530-6.c: Ditto.
* gcc.target/i386/pr99531.c (dg-do): Compile for !ia32 target.

Tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/testsuite/g++.target/i386/empty-class2.C b/gcc/testsuite/g++.target/i386/empty-class2.C
index b9317c56706..3e4fc4e709c 100644
--- a/gcc/testsuite/g++.target/i386/empty-class2.C
+++ b/gcc/testsuite/g++.target/i386/empty-class2.C
@@ -2,7 +2,6 @@
 // Test passing aligned empty aggregate
 // { dg-do compile }
 // { dg-options "-O2" }
-// { dg-additional-options "-Wno-psabi" { target { { i?86-*-* x86_64-*-* } && ilp32 } } }
 
 struct S { union {} a; } __attribute__((aligned));
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-reduce-op-2.c b/gcc/testsuite/gcc.target/i386/avx512fp16-reduce-op-2.c
index 72e4a814a76..924f1a94138 100644
--- a/gcc/testsuite/gcc.target/i386/avx512fp16-reduce-op-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-reduce-op-2.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mprefer-vector-width=512 -fdump-tree-optimized" } */
-/* { dg-additional-options "-msse2" { target i?86-*-* } } */
+/* { dg-options "-O2 -msse2 -mprefer-vector-width=512 -fdump-tree-optimized" } */
 
 /* { dg-final { scan-tree-dump-times "\.REDUC_PLUS" 3 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\.REDUC_MIN" 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr103541.c b/gcc/testsuite/gcc.target/i386/pr103541.c
index 72b257d42ee..56ecb289fa1 100644
--- a/gcc/testsuite/gcc.target/i386/pr103541.c
+++ b/gcc/testsuite/gcc.target/i386/pr103541.c
@@ -1,5 +1,5 @@
 /* PR rtl-optimization/103541 */
-/* { dg-do compile  { target x86_64-*-* } } */
+/* { dg-do compile  { target { ! ia32 } } } */
 /* { dg-options "-O2" } */
 
 float a;
diff --git a/gcc/testsuite/gcc.target/i386/pr108774.c b/gcc/testsuite/gcc.target/i386/pr108774.c
index 482bc490cde..0bd2aed7327 100644
--- a/gcc/testsuite/gcc.target/i386/pr108774.c
+++ b/gcc/testsuite/gcc.target/i386/pr108774.c
@@ -1,5 +1,5 @@
 /* PR target/108774 */
-/* { dg-do compile  { target x86_64-*-* } } */
+/* { dg-do compile  { target lp64 } } */
 /* { dg-options "-Os -ftrapv -mcmodel=large" } */
 
 int i, j;
diff --git a/gcc/testsuite/gcc.target/i386/pr85593.c b/gcc/testsuite/gcc.target/i386/pr85593.c
index 092f9cbe680..6be6849bb33 100644
--- a/gcc/testsuite/gcc.target/i386/pr85593.c
+++ b/gcc/testsuite/gcc.target/i386/pr85593.c
@@ -1,5 +1,5 @@
 /* PR target/85593 */
-/* { dg-do run { target { { i?86-*-linux* x86_64-*-linux* } && lp64 } } } */
+/* { dg-do run { target { *-*-linux* && lp64 } } } */
 /* { dg-options "-O2" } */
 
 __attribute__((naked)) void
diff --git a/gcc/testsuite/gcc.target/i386/pr90007.c b/gcc/testsuite/gcc.target/i386/pr90007.c
index a16eec308fb..ad099ed0a78 100644
--- a/gcc/testsuite/gcc.target/i386/pr90007.c
+++ b/gcc/testsuite/gcc.target/i386/pr90007.c
@@ -1,5 +1,5 @@
 /* PR rtl-optimization/90007 */
-/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-do compile } */
 /* { dg-options "-march=bdver1 -mfpmath=387 -O1 -fschedule-insns -fselective-scheduling" } */
 
 void
diff --git a/gcc/testsuite/gcc.target/i386/pr92841-2.c b/gcc/testsuite/gcc.target/i386/pr92841-2.c
index b2d5eb86389..7d30028db7a 100644
--- a/gcc/testsuite/gcc.target/i386/pr92841-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr92841-2.c
@@ -1,5 +1,5 @@
 /* PR target/92841 */
-/* { dg-do compile { target { { { *-*-linux* } && lp64 } && fstack_protector } } } */
+/* { dg-do compile { target { { *-*-linux* && lp64 } && fstack_protector } } } */
 /* { dg-options "-O2 -fpic -fstack-protector-strong -masm=att" } */
 /* { dg-final { scan-assembler "leaq\tbuf2\\\(%rip\\\)," } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr95464.c b/gcc/testsuite/gcc.target/i386/pr95464.c
index 33a8290e0cf..300a9085106 100644
--- a/gcc/testsuite/gcc.target/i386/pr95464.c
+++ b/gcc/testsuite/gcc.target/i386/pr95464.c
@@ -1,5 +1,5 @@
 /* { dg-options "-O2" } */
-/* { dg-do run { target { { *-*-linux* } && { ! ia32 } } } } */
+/* { dg-do run { target { *-*-linux* && { ! ia32 } } } } */
 
 struct S { unsigned a:

[PATCH] reload: Handle generating reloads that also clobbers flags

2023-02-15 Thread Hans-Peter Nilsson via Gcc-patches

Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
disabled, where it regresses gcc.target/cris/rld-legit1.c;
as expected, because that test guards proper function of its
LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
decomposed address elements.

LRA also causes a similar decomposition (and worse, in even
smaller bits), but it can create valid insns as-is.
Unfortunately, it doesn't have something equivalent to
LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
cases where that hook helped reload.

I fear reload-related patches these days are treated like a
redheaded stepchild and even worse as this one is intended
for stage 1.  Either way, I need to create a reference to
it, and it's properly tested and has been a help when
working towards LRA, thus might help other targets: ok to
install for the next stage 1?

-- >8 --
When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
this code is now required for reload to generate valid insns
from some reload-decomposed addresses, for example the
(plus:SI
 (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
 (reg/v/f:SI 33 [ y ]))
generated in gcc.target/cris/rld-legit1.c (a valid address
but with two registers needing reload).  Now after decc0:ing,
most SET insns for former cc0 targets need to be a parallel
with a clobber of the flags register.  Such targets
typically have TARGET_FLAGS_REGNUM set to a valid register.

* reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
emit_insn_if_valid_for_reload.
(emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
to be recognized, also try emitting a parallel that clobbers
TARGET_FLAGS_REGNUM, as applicable.
---
 gcc/reload1.cc | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/gcc/reload1.cc b/gcc/reload1.cc
index 7dcef50437b8..9ec2cb9baf4b 100644
--- a/gcc/reload1.cc
+++ b/gcc/reload1.cc
@@ -8377,11 +8377,11 @@ emit_reload_insns (class insn_chain *chain)
   reg_reloaded_dead |= reg_reloaded_died;
 }
 
-/* Go through the motions to emit INSN and test if it is strictly valid.
-   Return the emitted insn if valid, else return NULL.  */
+
+/* Helper for emit_insn_if_valid_for_reload.  */
 
 static rtx_insn *
-emit_insn_if_valid_for_reload (rtx pat)
+emit_insn_if_valid_for_reload_1 (rtx pat)
 {
   rtx_insn *last = get_last_insn ();
   int code;
@@ -8403,6 +8403,29 @@ emit_insn_if_valid_for_reload (rtx pat)
   return NULL;
 }
 
+/* Go through the motions to emit INSN and test if it is strictly valid.
+   Return the emitted insn if valid, else return NULL.  */
+
+static rtx_insn *
+emit_insn_if_valid_for_reload (rtx pat)
+{
+  rtx_insn *insn = emit_insn_if_valid_for_reload_1 (pat);
+
+  if (insn)
+return insn;
+
+  /* If the pattern is a SET, and this target has a single
+ flags-register, try again with a PARALLEL that clobbers that
+ register.  */
+  if (targetm.flags_regnum == INVALID_REGNUM || GET_CODE (pat) != SET)
+return NULL;
+
+  rtx flags_clobber = gen_hard_reg_clobber (CCmode, targetm.flags_regnum);
+  rtx parpat = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, pat, flags_clobber));
+
+  return emit_insn_if_valid_for_reload (parpat);
+}
+
 /* Emit code to perform a reload from IN (which may be a reload register) to
OUT (which may also be a reload register).  IN or OUT is from operand
OPNUM with reload type TYPE.
-- 
2.30.2

Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types

2023-02-15 Thread 盼李 via Gcc-patches

After some investigation, adjusting the mode precision can help to tell
VxN1BI through VxN64BI apart, in addition to the existing mode_size.  Thus I
would like to prepare the patch for the precision adjustment only first.

Unfortunately, there is one selftest failure right now when I try to adjust
the precision of VxN*BI, and I am still working on it.  I will keep you all
posted.

VxN1BI  adjust precision => 1
VxN2BI  adjust precision => 2
VxN4BI  adjust precision => 4
VxN8BI  adjust precision => 8
VxN16BI  adjust precision => 16
VxN32BI  adjust precision => 32
VxN64BI  adjust precision => 64

Pan

From: Richard Biener 
Sent: Monday, February 13, 2023 23:47
To: 盼 李 
Cc: Andrew Stubbs ; juzhe.zh...@rivai.ai 
; gcc-patches ; kito.cheng 
; richard.sandif...@arm.com 
Subject: Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types

On Mon, 13 Feb 2023, 盼 李 wrote:

> Thanks all for your help and comments.
>
> Let me share more information about this patch. Especially for the 
> tree-ssa-sccvn.cc part.
>
> Assume we have the test code below for this issue.
>
> void
> test_1(int8_t * restrict in, int8_t * restrict out) {
> vbool8_t v2 = *(vbool8_t*)in;
> vbool16_t v5 = *(vbool16_t*)in;
>
> *(vbool8_t*)(out + 100) = v2;
> *(vbool16_t*)(out + 200) = v5;
> }
>
> Without the tree-ssa-sccvn.cc file code change.
> 
> void test_1 (int8_t * restrict in, int8_t * restrict out)
> {
>   vbool8_t v2;
>   __rvv_bool16_t _1;
>
>[local count: 1073741824]:
>   v2_4 = MEM[(vbool8_t *)in_3(D)];
>   _1 = VIEW_CONVERT_EXPR<__rvv_bool16_t>(v2_4);  // insert during 039.fre1
>   MEM[(vbool8_t *)out_5(D) + 100B] = v2_4;
>   MEM[(vbool16_t *)out_5(D) + 200B] = _1;
>   return;
> }
>
> With the tree-ssa-sccvn.cc file code change.
> 
> void test_1 (int8_t * restrict in, int8_t * restrict out)
> {
>   vbool16_t v5;
>   vbool8_t v2;
>
>[local count: 1073741824]:
>   v2_3 = MEM[(vbool8_t *)in_2(D)];
>   v5_4 = MEM[(vbool16_t *)in_2(D)];
>   MEM[(vbool8_t *)out_5(D) + 100B] = v2_3;
>   MEM[(vbool16_t *)out_5(D) + 200B] = v5_4;
>   return;
> }
>
> Thus, I figured out that the a-main.c.039t.fre1 pass inserts this
> VIEW_CONVERT_EXPR.
> With some debugging, I traced the difference to expressions_equal_p.
> If GET_MODE_SIZE (mode) is the same for VxN8Bimode and VxN4Bimode,
> expressions_equal_p compares the same tree address, namely
> POLY_INT_CST [8, 8].
>
> visit_reference_op_load
> |- vn_reference_lookup
> |- vn_reference_lookup_2
>  |- find_slot_with_hash
>  |- vn_reference_hasher::equal
>  |- expressions_equal_p
>
> Meanwhile, we also double-checked that setting different MODE_SIZE values for
> VxN8Bimode and VxN4Bimode (for example, [8, 1] and [4, 1], for test only)
> resolves this issue. But they should be [1, 1] according to the ISA
> semantics.
>
> Thus, we tried setting other MODE_XXX values, but that does not seem to work
> at all. For example:
>
> VNx4BIMode NUNITS [0x4, 0x4]
> VNx8BIMode NUNITS [0x8, 0x8]
>
> Finally, I found TARGET_MODES_TIEABLE_P and injected it into the function
> visit_reference_op_load to resolve this issue.
>
> I will keep looking for approaches outside tree-ssa-sccvn.cc if this is not
> the right place to fix the issue.

There are other places like alias analysis which will not be happy
if the mode size/precision do not match reality.  So no, I don't think
modes_tieable is the correct thing to check here.  Instead the existing
check seems to be to the point but the modes are not set up correctly
to carry the info of one having padding at the end and the other not.

Richard.

> Thanks again and will keep you posted.
>
> Pan
>
>
>
> 
> From: Andrew Stubbs 
> Sent: Monday, February 13, 2023 19:00
> To: Richard Biener ; juzhe.zh...@rivai.ai 
> 
> Cc: Pan Li ; gcc-patches 
> ; kito.cheng ; 
> richard.sandif...@arm.com 
> Subject: Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types
>
> I presume I've been CC'd on this conversation because weird vector
> architecture problems have happened to me before. :)
>
> However, I'm not sure I can help much because AMD GCN does not use
> BImode vectors at all. This is partly because loading boolean values
> into a GCN vector would have 31 padding bits for each lane, but mostly
> because the result of comparison instructions is written to a DImode
> scalar register, not into a vector.
>
> I did experiment, long ago, with having a V64BImode that could be stored
> in scalar registers (tieable with DImode), but there wasn't any great
> advantage and it broke VECTOR_MODE_P in most other contexts.
>
> It's possible to store truth values in vectors as integers, and there
> are some cases

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-15 Thread Andrew MacLeod via Gcc-patches




On 2/15/23 07:51, Tamar Christina wrote:

In any case, if you disagree I don't really see a way forward
aside from making this its own pattern and running it before the
overwidening pattern.

I think we should look to see if ranger can be persuaded to provide
the range of the 16-bit addition, even though the statement that
produces it isn't part of a BB.  It shouldn't matter that the
addition originally came from a 32-bit one: the range follows
directly from the ranges of the operands (i.e. the fact that the
operands are the results of widening conversions).

I think you can ask ranger on operations on names defined in the IL,
so you can work yourself through the sequence of operations in the
pattern sequence to compute ranges on their defs (and possibly even
store them in the SSA info).  You just need to pick the correct
ranger API for this…. Andrew CCed



It's not clear to me what's being asked...

Expressions don't need to be in the IL to do range calculations.. I
believe we support arbitrary tree expressions via range_of_expr.

if you have 32 bit ranges that you want to do 16 bit addition on, you
can also cast those ranges to a 16bit type,

my32bitrange.cast (my16bittype);

then invoke range-ops directly via getting the handler:

handler = range_op_handler (PLUS_EXPR, 16bittype_tree);
if (handler)
  handler->fold (result, my16bittype, mycasted32bitrange,
                 myothercasted32bitrange)

There are higher level APIs if what you have on hand is closer to IL
than random ranges

Describe exactly what it is you want to do... and I'll try to direct
you to the best way to do it.

The vectorizer has  a pattern matcher that runs at startup on the scalar code.
This pattern matcher can replace one or more statements with alternative
ones, these can be either existing tree_code or new internal functions.

One of the patterns here is an overwidening detection pattern which reduces
the precision that an operation is to be done in during vectorization.

Another one is widening multiplication, which replaces PLUS_EXPR with
WIDEN_PLUS_EXPR.

These can be chained, so e.g. a widening addition done on ints can be
reduced to a widening addition done on shorts.

The question is whether, given the new expression that the vectorizer has
created, ranger can tell what the precision is.  get_range_query fails
because presumably it has no idea about the new operations created and
also doesn't know about any new IFNs.

Hi,

I have been trying to use ranger as requested. I've tried:

  gimple_ranger ranger;
  int_range_max r;
  /* Check that no overflow will occur.  If we don't have range
     information we can't perform the optimization.  */
  if (ranger.range_of_expr (r, oprnd0, stmt))
    {
      wide_int max = r.upper_bound ();
 

Which works for non-patterns, but still doesn't work for patterns.
On a stmt:
patt_27 = (_3) w+ (level_15(D));

it gives me a range:

$2 = {
= {
 val = {[0x0] = 0x, [0x1] = 0x7fff95bd8b00, [0x2] = 
0x7fff95bd78b0, [0x3] = 0x3fa1dd0, [0x4] = 0x3fa1dd0, [0x5] = 
0x344a706f832d4f00, [0x6] = 0x7fff95bd7950, [0x7] = 0x1ae7f11, [0x8] = 
0x7fff95bd79f8},
 len = 0x1,
 precision = 0x10
   },
   members of generic_wide_int:
   static is_sign_extended = 0x1
}

The precision is fine, but range seems to be -1?

Should I use range_op_handler (WIDEN_PLUS_EXPR, ...) in this case?


It's easier to see the range if you dump it, i.e.:

p r.dump(stderr)

I'm way behind the curve on exactly what's going on.  I'm not sure how the
above 2 things relate.  I presume $2 is 'max'?  I have no context;
what did you expect the range of _3 to be?


We have no entry in range-op.cc for a WIDEN_PLUS_EXPR, so ranger would
no doubt only give back VARYING for that.  However, I doubt it would be
too difficult to write the fold_range() method for it.


It's unclear to me what you mean by it doesn't work on patterns, so let's
do some basics.


You have a stmt  "patt_27 = (_3) w+ (level_15(D));"

I gather that's a WIDEN_PLUS_EXPR, and if I read it right, patt_27 is a
type that is twice as wide as _3, and will contain the value "_3 +
level_15"?


Your query above is asking for the range of _3 at this stmt in the IL.

And you are trying to determine whether the expression "_3 + level_15"
would still fit in the type of _3, and thus you could avoid the WIDEN_*
paradigm and revert to a simple plus?


And you also want to be able to do this for expressions which are not 
currently in the IL?


If that is all true, then I would suggest one of 2 possible routes.
1) We add WIDEN_PLUS_EXPR to range-ops.  This involves writing
fold_range() for it whereby it would create a range of a type double the
precision of _3, then take the 2 ranges for op1 and op2, cast them to
this new type and add them.


2) Manually doing the same thing.  But if you are going to manually do
it, we might as well put that same code into fold_range then t

Re: [PATCH] doc: Suggest fix for -Woverloaded-virtual warnings

2023-02-15 Thread Jason Merrill via Gcc-patches


On 2/15/23 05:37, Jonathan Wakely wrote:

OK for trunk?


OK.


-- >8 --

Users are confused about what this warning means, so add a suggested
solution to the documentation.

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Suggest adding a
using-declaration to unhide functions.
---
  gcc/doc/invoke.texi | 4 
  1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 26de582e41e..6404ed5c4ff 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4282,6 +4282,10 @@ b->f();
  @noindent
  fails to compile.
  
+In cases where the different signatures are not an accident, the
+simplest solution is to add a using-declaration to the derived class
+to un-hide the base function, e.g. add @code{using A::f;} to @code{B}.
+
  The optional level suffix controls the behavior when all the
  declarations in the derived class override virtual functions in the
  base class, even if not all of the base functions are overridden:
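
The documentation change above can be illustrated with a small sketch. The
class names A and B follow the manual's own example; the return values are
invented here purely so the behaviour can be observed.

```cpp
#include <string>

struct A
{
  virtual ~A () = default;
  virtual std::string f () { return "A::f()"; }
};

struct B : A
{
  // Without this using-declaration, B::f(int) hides A::f() and a call
  // such as b->f() on a B* fails to compile.
  using A::f;
  virtual std::string f (int) { return "B::f(int)"; }
};
```

With the using-declaration in place both overloads are visible through a
B object, so `b.f ()` resolves to the base version and `b.f (1)` to the
derived one.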

[PATCH] PR tree-optimization/108697 - Create a lazy ssa_cache

2023-02-15 Thread Andrew MacLeod via Gcc-patches

This patch implements the suggestion that we have an alternative 
ssa-cache which does not zero memory, and instead uses a bitmap to track 
whether a value is currently set or not.  It roughly mimics what 
path_range_query was doing internally.


For sparsely used cases, especially in large programs, this is more 
efficient.  I changed path_range_query to use this, and removed its old 
bitmap (and a hack or two around PHI calculations), and also utilized 
this in the assume_query class.
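
As a rough illustration of the idea (not the patch itself), here is a
standalone sketch of a cache whose slots are never zeroed and whose
validity is tracked by a separate bitmap.  The class name lazy_cache, the
int payload and the use of std::vector<bool> are assumptions made for this
example; GCC's own bitmap is sparse, which std::vector<bool> is not.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for the lazy cache: slot contents are left
// indeterminate and only read once the matching "active" bit is set.
class lazy_cache
{
public:
  explicit lazy_cache (std::size_t n)
    : m_tab (new int[n]), m_active (n, false) {}

  // Store V at INDEX.  Return true if the slot already held a value.
  bool set (std::size_t index, int v)
  {
    bool had_value = m_active[index];
    m_active[index] = true;
    m_tab[index] = v;
    return had_value;
  }

  // Return the value at INDEX, or nothing if it was never set.
  std::optional<int> get (std::size_t index) const
  {
    if (!m_active[index])
      return std::nullopt;
    return m_tab[index];
  }

  // Forget the value at INDEX without touching the slot itself.
  void clear (std::size_t index) { m_active[index] = false; }

private:
  std::unique_ptr<int[]> m_tab;  // new int[n] does not zero the storage
  std::vector<bool> m_active;    // plays the role of the active bitmap
};
```

The point of the design is that clearing or creating the cache costs work
proportional to the entries actually used, not to the number of SSA names.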


Performance wise, the patch doesn't affect VRP (since that still uses 
the original version).  Switching to the lazy version caused a slowdown 
of 2.5% across VRP.


There was a noticeable improvement elsewhere: across 230 GCC source 
files, threading ran over 12% faster!  Overall compilation improved by 
0.3%.  Not sure it makes much difference in compiler.i, but it shouldn't 
hurt.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK for trunk,
or do you want to wait for the next release?


Andrew
From a4736b402d95b184659846ba308ce51f708472d1 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 8 Feb 2023 17:43:45 -0500
Subject: [PATCH 1/2] Create a lazy ssa_cache

Sparsely used ssa name caches can benefit from using a bitmap to
determine if a name already has an entry.  Utilize it in the path query
and remove its private bitmap for tracking the same info.
Also use it in the "assume" query class.

	* gimple-range-cache.cc (ssa_global_cache::clear_global_range): Do
	not clear the vector on an out of range query.
	(ssa_lazy_cache::set_global_range): New.
	* gimple-range-cache.h (class ssa_lazy_cache): New.
	(ssa_lazy_cache::ssa_lazy_cache): New.
	(ssa_lazy_cache::~ssa_lazy_cache): New.
	(ssa_lazy_cache::get_global_range): New.
	(ssa_lazy_cache::clear_global_range): New.
	(ssa_lazy_cache::clear): New.
	(ssa_lazy_cache::dump): New.
	* gimple-range-path.cc (path_range_query::path_range_query): Do
	not allocate a ssa_global_cache object not has_cache bitmap.
	(path_range_query::~path_range_query): Do not free objects.
	(path_range_query::clear_cache): Remove.
	(path_range_query::get_cache): Adjust.
	(path_range_query::set_cache): Remove.
	(path_range_query::dump): Don't call through a pointer.
	(path_range_query::internal_range_of_expr): Set cache directly.
	(path_range_query::reset_path): Clear cache directly.
	(path_range_query::ssa_range_in_phi): Fold with globals only.
	(path_range_query::compute_ranges_in_phis): Simply set range.
	(path_range_query::compute_ranges_in_block): Call cache directly.
	* gimple-range-path.h (class path_range_query): Replace bitmap
	and cache pointer with lazy cache object.
	* gimple-range.h (class assume_query): Use ssa_lazy_cache.
---
 gcc/gimple-range-cache.cc | 24 --
 gcc/gimple-range-cache.h  | 33 +++-
 gcc/gimple-range-path.cc  | 66 +--
 gcc/gimple-range-path.h   |  7 +
 gcc/gimple-range.h|  2 +-
 5 files changed, 70 insertions(+), 62 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 546262c4794..9bfbdb2c9b3 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -525,14 +525,14 @@ ssa_global_cache::set_global_range (tree name, const vrange &r)
   return m != NULL;
 }
 
-// Set the range for NAME to R in the glonbal cache.
+// Set the range for NAME to R in the global cache.
 
 void
 ssa_global_cache::clear_global_range (tree name)
 {
   unsigned v = SSA_NAME_VERSION (name);
   if (v >= m_tab.length ())
-m_tab.safe_grow_cleared (num_ssa_names + 1);
+return;
   m_tab[v] = NULL;
 }
 
@@ -579,6 +579,26 @@ ssa_global_cache::dump (FILE *f)
 fputc ('\n', f);
 }
 
+
+// Set range of NAME to R in a lazy cache.  Return FALSE if it did not already
+// have a range.
+
+bool
+ssa_lazy_cache::set_global_range (tree name, const vrange &r)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  if (!bitmap_set_bit (active_p, v))
+{
+  // There is already an entry, simply set it.
+  gcc_checking_assert (v < m_tab.length ());
+  return ssa_global_cache::set_global_range (name, r);
+}
+  if (v >= m_tab.length ())
+m_tab.safe_grow (num_ssa_names + 1);
+  m_tab[v] = m_range_allocator->clone (r);
+  return false;
+}
+
 // --
 
 
diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h
index 4ff435dc5c1..f1799b45738 100644
--- a/gcc/gimple-range-cache.h
+++ b/gcc/gimple-range-cache.h
@@ -62,11 +62,42 @@ public:
   void clear_global_range (tree name);
   void clear ();
   void dump (FILE *f = stderr);
-private:
+protected:
   vec m_tab;
   vrange_allocator *m_range_allocator;
 };
 
+// This is the same as global cache, except it maintains an active bitmap
+// rather than depending on a zero'd out vector of pointers.  This is better
+// for sparsely/lightly used caches.
+// It could be made a fully derived class, but at this point there doesnt seem
+// to be a

RE: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-15 Thread Tamar Christina via Gcc-patches

> On 2/15/23 07:51, Tamar Christina wrote:
> >> In any case, if you disagree I don't really see a way forward
> >> aside from making this its own pattern and running it before the
> >> overwidening pattern.
> > I think we should look to see if ranger can be persuaded to
> > provide the range of the 16-bit addition, even though the
> > statement that produces it isn't part of a BB.  It shouldn't
> > matter that the addition originally came from a 32-bit one: the
> > range follows directly from the ranges of the operands (i.e. the
> > fact that the operands are the results of widening conversions).
>  I think you can ask ranger on operations on names defined in the
>  IL, so you can work yourself through the sequence of operations in
>  the pattern sequence to compute ranges on their defs (and possibly
>  even store them in the SSA info).  You just need to pick the
>  correct ranger API for this…. Andrew CCed
> 
> 
> >>> Its not clear to me whats being asked...
> >>>
> >>> Expressions don't need to be in the IL to do range calculations.. I
> >>> believe we support arbitrary tree expressions via range_of_expr.
> >>>
> >>> if you have 32 bit ranges that you want to do 16 bit addition on,
> >>> you can also cast those ranges to a 16bit type,
> >>>
> >>> my32bitrange.cast (my16bittype);
> >>>
> >>> then invoke range-ops directly via getting the handler:
> >>>
> >>> handler = range_op_handler (PLUS_EXPR, 16bittype_tree);
> >>> if (handler)
> >>>   handler->fold (result, my16bittype, mycasted32bitrange,
> >>>                  myothercasted32bitrange)
> >>>
> >>> There are higher level APIs if what you have on hand is closer to IL
> >>> than random ranges
> >>>
> >>> Describe exactly what it is you want to do... and I'll try to direct
> >>> you to the best way to do it.
> >> The vectorizer has a pattern matcher that runs at startup on the
> >> scalar code.  This pattern matcher can replace one or more statements
> >> with alternative ones, these can be either existing tree_code or new
> >> internal functions.
> >>
> >> One of the patterns here is an overwidening detection pattern which
> >> reduces the precision that an operation is to be done in during
> >> vectorization.
> >>
> >> Another one is widening multiplication, which replaces PLUS_EXPR with
> >> WIDEN_PLUS_EXPR.
> >>
> >> These can be chained, so e.g. a widening addition done on ints can be
> >> reduced to a widening addition done on shorts.
> >>
> >> The question is whether, given the new expression that the vectorizer
> >> has created, ranger can tell what the precision is.
> >> get_range_query fails because presumably it has no idea about the new
> >> operations created and also doesn't know about any new IFNs.
> > Hi,
> >
> > I have been trying to use ranger as requested. I've tried:
> >
> >   gimple_ranger ranger;
> >   int_range_max r;
> >   /* Check that no overflow will occur.  If we don't have range
> >      information we can't perform the optimization.  */
> >   if (ranger.range_of_expr (r, oprnd0, stmt))
> >     {
> >       wide_int max = r.upper_bound ();
> >  
> >
> > Which works for non-patterns, but still doesn't work for patterns.
> > On a stmt:
> > patt_27 = (_3) w+ (level_15(D));
> >
> > it gives me a range:
> >
> > $2 = {
> > = {
> >  val = {[0x0] = 0x, [0x1] = 0x7fff95bd8b00, [0x2] =
> 0x7fff95bd78b0, [0x3] = 0x3fa1dd0, [0x4] = 0x3fa1dd0, [0x5] =
> 0x344a706f832d4f00, [0x6] = 0x7fff95bd7950, [0x7] = 0x1ae7f11, [0x8] =
> 0x7fff95bd79f8},
> >  len = 0x1,
> >  precision = 0x10
> >},
> >members of generic_wide_int:
> >static is_sign_extended = 0x1
> > }
> >
> > The precision is fine, but range seems to be -1?
> >
> > Should I use range_op_handler (WIDEN_PLUS_EXPR, ...) in this case?
> 
> It's easier to see the range if you dump it, i.e.:
> 
> p r.dump(stderr)
> 
> I'm way behind the curve on exactly what's going on.  I'm not sure how the
> above 2 things relate.  I presume $2 is 'max'?  I have no context; what
> did you expect the range of _3 to be?

Yes, $2 is max, and the expected upper bound is 0x1fe (0xff + 0xff), as it's
an unsigned addition.  I'll expand below.

> 
> We have no entry in range-op.cc for a WIDEN_PLUS_EXPR, so ranger would
> no doubt only give back VARYING for that.  However, I doubt it would be
> too difficult to write the fold_range() method for it.
> 
> It's unclear to me what you mean by it doesn't work on patterns, so let's do
> some basics.
> 
> You have a stmt  "patt_27 = (_3) w+ (level_15(D));"
> 
> I gather that's a WIDEN_PLUS_EXPR, and if I read it right, patt_27 is a type
> that is twice as wide as _3, and will contain the value "_3 + level_15"?
> 
> Your query above is asking for the range of _3 at this stmt in the IL.
> 
> And you are trying to determine whether the expression "_3 + level_15"
> would still fit in the type of _3, and thus you could avoid the WIDEN_*
> paradigm and revert to

Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-15 Thread Patrick Palka via Gcc-patches

On Tue, 14 Feb 2023, Jason Merrill wrote:

> On 2/13/23 09:23, Patrick Palka wrote:
> > [N.B. this is a corrected version of
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]
> > 
> > This patch factors out the TYPENAME_TYPE case of tsubst into a separate
> > function tsubst_typename_type.  It also factors out the two tsubst flags
> > controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
> > into distinct boolean parameters of this new function (and of
> > make_typename_type).  Consequently, callers which used to pass tf_tst_ok
> > to tsubst now instead must directly call tsubst_typename_type when
> > appropriate.
> 
> Hmm, I don't love how that turns 4 lines into 8 more complex lines in each
> caller.  And the previous approach of saying "a CTAD placeholder is OK" seems
> like a better abstraction than repeating the specific TYPENAME_TYPE handling
> in each place.

Ah yeah, I see what you mean.  I was thinking since tf_tst_ok is
specific to TYPENAME_TYPE handling and isn't propagated (i.e. it only
affects top-level TYPENAME_TYPEs), it seemed cleaner to encode the flag
as a bool parameter "template_ok" of tsubst_typename_type instead of as
a global tsubst_flag that gets propagated freely.

> 
> > In a subsequent patch we'll add another flag to
> > tsubst_typename_type controlling whether we want to ignore non-types
> > during the qualified lookup.

As mentioned above, the second patch in this series would just add
another flag "type_only" alongside "template_ok", since this flag will
also only affect top-level TYPENAME_TYPEs and doesn't need to propagate
like tsubst_flags.

Except, it turns out, this new flag _does_ need to propagate, namely when
expanding a variadic using:

  using typename Ts::type::m...; // from typename25a.C below

Here we have a USING_DECL whose USING_DECL_SCOPE is a
TYPE_PACK_EXPANSION over TYPENAME_TYPE.  In order to correctly
substitute this TYPENAME_TYPE, the USING_DECL case of tsubst_decl needs
to pass an appropriate tsubst_flag to tsubst_pack_expansion to be
propagated to tsubst (to be propagated to make_typename_type).

So in light of this case it seems adding a new tsubst_flag is the
way to go, which means we can avoid this refactoring patch entirely.

Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.

-- >8 --

Subject: [PATCH] c++: TYPENAME_TYPE lookup ignoring non-types [PR107773]

Currently when resolving a TYPENAME_TYPE for 'typename T::m' via
make_typename_type, we consider only type bindings of 'm' and ignore
non-type ones.  But [temp.res.general]/3 says, in a note, "the usual
qualified name lookup ([basic.lookup.qual]) applies even in the presence
of 'typename'", and qualified name lookup doesn't discriminate between
type and non-type bindings.  So when resolving such a TYPENAME_TYPE
we want the lookup to consider all bindings.

An exception is when we have a TYPENAME_TYPE corresponding to the
qualifying scope of the :: scope resolution operator, such as
'T::type' in 'typename T::type::m'.  In that case, [basic.lookup.qual]/1
applies, and lookup for such a TYPENAME_TYPE must ignore non-type bindings.
So in order to correctly handle all cases, make_typename_type needs an
additional flag controlling whether lookup should ignore non-types or not.

To that end this patch adds a new tsubst flag tf_qualifying_scope to
communicate to make_typename_type whether we want to ignore non-type
bindings during the lookup (by default we don't want to ignore them).
In contexts where we do want to ignore non-types (when substituting
into the scope of TYPENAME_TYPE, SCOPE_REF or USING_DECL) we simply
pass tf_qualifying_scope to the relevant tsubst / tsubst_copy call.
This flag is intended to apply only to top-level TYPENAME_TYPEs so
we must be careful to clear the flag to avoid propagating it during
substitution of sub-trees.
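
The qualifying-scope case described above can be sketched with a small
example.  The names S and inner are made up for illustration and are not
taken from the patch's testsuite; the example relies on a class name being
hidden by a data member of the same name, which the language allows.

```cpp
#include <type_traits>

struct S
{
  struct type { using m = int; };
  int type;  // non-type member that hides the class name in ordinary lookup
};

// T::type is the qualifying scope of the second ::, so its lookup ignores
// the data member 'type' and finds the nested struct instead.
template<typename T>
struct inner
{
  using result = typename T::type::m;
};

static_assert (std::is_same<inner<S>::result, int>::value,
	       "lookup of the qualifying scope finds the nested struct");
```

By contrast, a plain 'typename T::type' with T = S names the data member
under ordinary qualified lookup, which is the case the commit message says
should no longer be resolved by silently preferring the type binding.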

PR c++/107773

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): New flag tf_qualifying_scope.
* decl.cc (make_typename_type): Use lookup_member instead of
lookup_field.  If tf_qualifying_scope is set, pass want_type=true
instead of =false to lookup_member.  Generalize format specifier
in diagnostic to handle both type and non-type bindings.
* pt.cc (tsubst_aggr_type_1): Clear tf_qualifying_scope.  Tidy
the function.
(tsubst_decl) : Set tf_qualifying_scope when
substituting USING_DECL_SCOPE.
(tsubst): Clear tf_qualifying_scope right away and remember if
it was set.  Do the same for tf_tst_ok sooner.
: Set tf_qualifying_scope when substituting
TYPE_CONTEXT.  Pass tf_qualifying_scope to make_typename_type
if it was set.
(tsubst_qualified_id): Set tf_qualifying_scope when substituting
the scope.
(tsubst_copy): Clear tf_qualifying_scope and remember if it was
set.
: Set tf_qualifying_scope when substituting the
scope.
: Pass tf_qualifying_scope

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-15 Thread Andrew MacLeod via Gcc-patches



On 2/15/23 12:13, Tamar Christina wrote:

On 2/15/23 07:51, Tamar Christina wrote:


Thanks, lots of useful context there.



This second pattern replaces the above into:

   _6 = _3 +w level_14(D);
   _7 = _6 / 255;
   _8 = (unsigned char) _7;

Thus removing the need to promote before the addition.  What I'm working on
is an optimization for division.  So I am after what the range of _6 is.
oprnd0 in my example is the first operand of the division.

I need to know the range of _6 because based on the range we can optimize this
division into something much more efficient.
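
For readers without the PR at hand, a scalar loop of roughly the following
shape would give rise to the statements above.  This is an assumed
reconstruction for illustration (the function name scale and its
parameters are hypothetical), not the testcase from the PR.

```cpp
#include <cstddef>

// Both addends are 8-bit, so although the addition is done in int by the
// usual promotions, the sum is at most 255 + 255 = 510 = 0x1fe and fits in
// 16 bits.
void
scale (unsigned char *pixel, unsigned char level, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    pixel[i] = (pixel[i] + level) / 0xff;
}
```

Knowing that the dividend never exceeds 0x1fe is exactly the range fact
the division optimization needs.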


If that is all true, then I would suggest one of 2 possible routes.
1) We add WIDEN_PLUS_EXPR to range-ops.  This involves writing
fold_range() for it whereby it would create a range of a type double the
precision of _3, then take the 2 ranges for op1 and op2, cast them to this new
type and add them.


Right, so I guess none of the widening operations are currently there.  Can you
point me in the right direction of where I need to add them?


sure, details below



2) Manually doing the same thing.  But if you are going to manually do it, we
might as well put that same code into fold_range; then the entire ecosystem
will benefit.

Once the operation can be performed in range ops, you can cast the new
range back to the type of _3 and see if it's fully represented, i.e.

int_range_max r1, r2;
if (ranger.range_of_stmt (r1, stmt))
    {
      r2 = r1;
      r2.cast (TREE_TYPE (_3));
      r2.cast (TREE_TYPE (patt_27));
      if (r1 == r2)
        // No info was lost casting back and forth, so r1 must fit into
        // the type of _3

That should work for within the IL.  And if you want to do the same thing
outside of the IL, you have to come up with the values you want to use for
op1 and op2, and replace the ranger query with a direct range-op fold:

range_op_handler handler (WIDEN_PLUS_EXPR, TREE_TYPE (patt_27));
if (handler && handler->fold_range (r1, range_of__3, range_of_level_15))
    {
      // same casting song and dance
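
The "cast back and forth and compare" check can be shown with a toy model
that does not depend on GCC at all.  Everything here (urange, cast_to,
fits_after_round_trip) is a hypothetical stand-in, and the cast is cruder
than a real range cast: a range that does not fit simply becomes the full
range of the narrower type.

```cpp
#include <cstdint>

// Toy unsigned range [lo, hi]; a stand-in for a ranger range, not GCC's class.
struct urange
{
  std::uint64_t lo, hi;
};

// Convert R to an unsigned type of BITS precision (BITS < 64).  If R does
// not fit, fall back to the full range of that type.
inline urange
cast_to (urange r, unsigned bits)
{
  const std::uint64_t max = (std::uint64_t{1} << bits) - 1;
  return r.hi > max ? urange{0, max} : r;
}

// True if casting R to NARROW_BITS and back to WIDE_BITS loses nothing,
// i.e. every value in R is representable in the narrow type.
inline bool
fits_after_round_trip (urange r, unsigned narrow_bits, unsigned wide_bits)
{
  urange r2 = cast_to (cast_to (r, narrow_bits), wide_bits);
  return r2.lo == r.lo && r2.hi == r.hi;
}
```

With 8-bit operands, a sum range of [0, 510] fails the check against an
8-bit type, while [0, 38] passes.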



Just for my own understanding, does the fold_range here update the information
in the IL? Or is it just for this computation? So when I hit this pattern
again it recomputes it?


fold_range does not update anything.  It just performs the calculation,
and passes like VRP etc. are responsible for if, and when, that is
reflected in some way/transformation in the IL.  The IL is primarily used
for context to look back and try to determine the range of the inputs to
the statement.  That's why, if you aren't using an expression in the IL,
you need to provide the ranges yourself.  By default, you end up with
the full range for the type, i.e. VARYING, but if ranger can determine
through branches and such that it's something different, it will.  I.e.,
if your case is preceded by


if (_3 < 20 && level_15 < 20)
  //  the range of _3 will be [0, 19] and level_15 will be [0, 19], and the
  //  addition will end up with a range of [0, 38]


In your case, I see the ranges are the range of the 8 bit type:
[irange] int [0, 255] NONZERO 0xff





If you dont want to go thru this process, in theory, you could try simply
adding _3 and level_15 in their own precision, and if max/min aren't +INF/-
INF then you can probably assume there is no overflow?
in which case, all you do is the path you are on above for within a stmt should
work:

  gimple_ranger ranger;
  int_range_max r0, r1, def;
  /* Check that no overflow will occur.  If we don't have range
     information we can't perform the optimization.  */
  if (ranger.range_of_expr (r0, oprnd0, stmt)
      && ranger.range_of_expr (r1, oprnd1, stmt))
    {
      range_op_handler handler (PLUS_EXPR, TREE_TYPE (_3));
      if (handler && handler->fold_range (def, r0, r1))
        // examine def.upper_bound() and def.lower_bound()

Am I grasping some of the issue here?

You are, and this was helpful.  I would imagine that Richard wouldn't accept my
doing it locally though.  So I guess if it's safe to do for this PR fix, I can
add the basic widening operations to range-ops if you can show me where.



All the range-op integer code is in gcc/range-op.cc.  As this is a basic
binary operation, you should be able to get away with implementing a
single routine, wi_fold (), which adds 2 wide_int bounds together and
returns a result.  This is the implementation for operator_plus.


void
operator_plus::wi_fold (irange &r, tree type,
    const wide_int &lh_lb, const wide_int &lh_ub,
    const wide_int &rh_lb, const wide_int &rh_ub) const
{
  wi::overflow_type ov_lb, ov_ub;
  signop s = TYPE_SIGN (type);
  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
  value_range_with_overflow (r, type, new_lb, new_ub, ov_lb, ov_ub);
}


you shouldn't have to do any of the overflow stuff at the end, just take 
the 2 sets of wide in

Re: Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-02-15 Thread Jason Merrill via Gcc-patches


On 9/26/22 15:27, Lewis Hyatt wrote:

On Wed, Jun 15, 2022 at 03:06:16PM -0400, Lewis Hyatt wrote:

On Tue, Jun 14, 2022 at 05:26:49PM -0400, Lewis Hyatt wrote:

Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

The attached patch resolves PR preprocessor/103902 as described in the patch
message inline below. bootstrap + regtest all languages was successful on
x86-64 Linux, with no new failures:

FAIL 103 103
PASS 542338 542371
UNSUPPORTED 15247 15250
UNTESTED 136 136
XFAIL 4166 4166
XPASS 17 17

Please let me know if it looks OK?

A few questions I have:

- A difference introduced with this patch is that after lexing something
like `operator ""_abc', then `_abc' is added to the identifier hash map,
whereas previously it was not. I feel like this must be OK because with the
optional space as in `operator "" _abc', it would be added with or without the
patch.

- The behavior of `#pragma GCC poison' is not consistent (including prior to
   my patch). I tried to make it more so but there is still one thing I want to
   ask about. Leaving aside extended characters for now, the inconsistency is
   that currently the poison is only checked, when the suffix appears as a
   standalone token.

   #pragma GCC poison _X
   bool operator ""_X (unsigned long long);   //accepted before the patch,
  //rejected after it
   bool operator "" _X (unsigned long long);  //rejected either before or after
   const char * operator ""_X (const char *, unsigned long); //accepted before,
 //rejected after
   const char * operator "" _X (const char *, unsigned long); //rejected either

   const char * s = ""_X; //accepted before the patch, rejected after it
   const bool b = 1_X; //accepted before or after 

I feel like after the patch, the behavior is the expected behavior for all
cases but the last one. Here, we allow the poisoned identifier because it's
not lexed as an identifier, it's lexed as part of a pp-number. Does it seem OK
like this or does it need to be addressed?
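
For reference, the two spellings being compared look like this in a
complete program.  The suffix names _twice and _thrice are invented for
this sketch; the form with a space is the one later discussed as
deprecated, so some compilers may warn about it.

```cpp
// No space: the suffix is lexed as part of the string-literal token.
constexpr unsigned long long
operator ""_twice (unsigned long long v)
{
  return 2 * v;
}

// With a space: the suffix is lexed as a separate identifier token.
constexpr unsigned long long
operator "" _thrice (unsigned long long v)
{
  return 3 * v;
}

// At the use site, 21_twice is lexed as a single pp-number.
static_assert (21_twice == 42, "suffix applied to an integer literal");
```

Both declarations define usable literal operators; the difference under
discussion is only in how the preprocessor tokenizes the declaration and
therefore where identifier-level checks such as poisoning can apply.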


Sorry, that version actually did not handle the case of -Wc++11-compat in
c++98 mode correctly. This updated version fixes that and adds the missing
test coverage for that, if you could please review this one instead?

By the way, the pipermail archive seems to permanently mangle UTF-8 in inline
attachments. I attached the patch also gzipped to address that for the
archive, since the new testcases do use non-ASCII characters.

Thanks for taking a look!


Hello-

May I please ping this patch again? Joseph suggested that it would be best if
a C++ maintainer has a look at it. This is one of just a few places left where
we don't handle UTF-8 properly in libcpp, it would be really nice to get them
fixed up if there is time to review this patch. Thanks!

https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596704.html

I re-attached it here as it required some trivial rebasing on top of recently
pushed changes. As before, I also attached the gzipped version so that the
UTF-8 testcases show up OK in the online archive, in case that's still an
issue. Thanks for taking a look!


Thank you for the patch, sorry it slipped off my radar.


This patch fixes it by adding a new function scan_cur_identifier() that can be
used to lex an identifier while in the middle of lexing another token. It is
somewhat duplicative of the code in lex_identifier(), which handles the normal
case, but I think there's no good way to avoid that without pessimizing the
usual case, since lex_identifier() takes advantage of the fact that the first
character of the identifier has already been analyzed.


So could you analyze the first character and then call lex_identifier?


With scan_cur_identifier(), we do also correctly warn about bidi and
normalization issues in the extended identifiers comprising the suffix, and we
check for poisoned identifiers there as well.


Hmm, I don't think we want the check for poisoned identifiers; a suffix 
is not a name.  That goes for the other diagnostics in 
identifier_diagnostics_on_lex, as well.  At the meeting last week the 
committee decided to deprecate the declaration with a space to clarify 
this distinction.



+ if (!accum.accum)
+   create_literal2 (pfile, token, base,
+suffix_begin - base,
+NODE_NAME (sr.node),
+NODE_LEN (sr.node),
+type);
+ else
+   {
+ accum.create_literal2 (pfile, token, base,
+suffix_begin - base,
+NODE_NAME (sr.node),
+NODE_LEN (sr.node),
+type);
+ _cpp_release_buff (pfile, accum.first);
+   }


How about always calling accum.create_literal2?

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-15 Thread Andrew MacLeod via Gcc-patches




On 2/15/23 12:50, Andrew MacLeod wrote:


On 2/15/23 12:13, Tamar Christina wrote:

On 2/15/23 07:51, Tamar Christina wrote:

void
operator_plus::wi_fold (irange &r, tree type,
			const wide_int &lh_lb, const wide_int &lh_ub,
			const wide_int &rh_lb, const wide_int &rh_ub) const
{
  wi::overflow_type ov_lb, ov_ub;
  signop s = TYPE_SIGN (type);

  // Do whatever wideint magic is required to do these adds in higher
  // precision.
  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);

  r = int_range<2> (type, new_lb, new_ub);
}


The operator needs to be registered; I've attached the skeleton for 
it.  You should just have to finish implementing wi_fold().


in theory :-)
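
As a rough illustration of what a finished fold for addition has to deal with, here is a self-contained sketch that does not use any GCC types. `Range` and `fold_plus` are hypothetical names, and the overflow policy (widen to the full type range) is an assumption made for this sketch; the snippet above computes `ov_lb` and `ov_ub` but does not show how they would be used.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a single-pair integer range [lb, ub] over a
// 32-bit signed type; this is not GCC's irange.
struct Range {
    std::int32_t lb;
    std::int32_t ub;
};

// Fold lh + rh. The bounds are added in 64 bits so the sums cannot wrap.
// If either sum does not fit in 32 bits, conservatively return the full
// range of the type (an assumption of this sketch).
inline Range fold_plus(Range lh, Range rh) {
    const std::int64_t type_min = std::numeric_limits<std::int32_t>::min();
    const std::int64_t type_max = std::numeric_limits<std::int32_t>::max();
    const std::int64_t new_lb = std::int64_t{lh.lb} + rh.lb;
    const std::int64_t new_ub = std::int64_t{lh.ub} + rh.ub;
    if (new_lb < type_min || new_ub > type_max)
        return {std::numeric_limits<std::int32_t>::min(),
                std::numeric_limits<std::int32_t>::max()};
    return {static_cast<std::int32_t>(new_lb),
            static_cast<std::int32_t>(new_ub)};
}

inline void self_check() {
    const Range sum = fold_plus({1, 5}, {10, 20});
    assert(sum.lb == 11 && sum.ub == 25);
}
```

For example, `fold_plus({1, 5}, {10, 20})` yields the range [11, 25], while adding anything positive to a range whose upper bound is already the type maximum falls back to the full range.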

You also mentioned earlier that some were tree codes, some were internal 
function calls?  We have some initial support for built in functions, 
but I am not familiar with all the various forms they can take.  We 
currently support CFN_ functions in


  gimple-range-op.cc, gimple_range_op_handler::maybe_builtin_call ()

Basically this is part of a "gimple_range_op_handler" wrapper for 
range-ops, which can provide a range-ops class for stmts that don't map 
to a binary or unary form, such as built-in functions.


If you get to the point where you need this for a builtin function, I 
can help you through that too.  Although someone may also have to help 
me through what differentiates the different kinds of internal functions 
:-)    I presume they are all similar in some way.


Andrew

[og12] Fix 'libgomp.{c-c++-common,fortran}/target-present-*' test cases (was: [OG12][committed] openmp: Add support for the 'present' modifier)

2023-02-15 Thread Thomas Schwinge

Hi!

On 2023-02-09T21:17:44+, Kwok Cheung Yeung  wrote:
> I've ported my patch for supporting the OpenMP 5.1 'present' modifier
> and committed it to the devel/omp/gcc-12 development branch:
>
> 229b705862c openmp: Add support for the 'present' modifier
>
> Tested with offloading on amdgcn and nvptx.

I've pushed to devel/omp/gcc-12 branch
commit bbda035ee62ba4db21356136c97e9d83a97ba7d1
"Fix 'libgomp.{c-c++-common,fortran}/target-present-*' test cases",
see attached.


Note that this likewise applies to the current upstream submission:

"openmp: Add support for 'present' modifier".


Grüße
 Thomas


From bbda035ee62ba4db21356136c97e9d83a97ba7d1 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 15 Feb 2023 12:39:19 +0100
Subject: [PATCH] Fix 'libgomp.{c-c++-common,fortran}/target-present-*' test
 cases

Their execution isn't expected to error out if we've been *compiling for any
offload target*, but rather if they're *executing on a non-shared memory
offload device*.  For example, if (any) offloading is configured but not
effective (no device available, for example), you'd get:

PASS: libgomp.c/../libgomp.c-c++-common/target-present-1.c (test for excess errors)
FAIL: libgomp.c/../libgomp.c-c++-common/target-present-1.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/target-present-2.c (test for excess errors)
FAIL: libgomp.c/../libgomp.c-c++-common/target-present-2.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/target-present-3.c (test for excess errors)
FAIL: libgomp.c/../libgomp.c-c++-common/target-present-3.c execution test

PASS: libgomp.c++/../libgomp.c-c++-common/target-present-1.c (test for excess errors)
FAIL: libgomp.c++/../libgomp.c-c++-common/target-present-1.c execution test
PASS: libgomp.c++/../libgomp.c-c++-common/target-present-2.c (test for excess errors)
FAIL: libgomp.c++/../libgomp.c-c++-common/target-present-2.c execution test
PASS: libgomp.c++/../libgomp.c-c++-common/target-present-3.c (test for excess errors)
FAIL: libgomp.c++/../libgomp.c-c++-common/target-present-3.c execution test

PASS: libgomp.fortran/target-present-1.f90   -O0  (test for excess errors)
FAIL: libgomp.fortran/target-present-1.f90   -O0  execution test
[...]
PASS: libgomp.fortran/target-present-2.f90   -O0  (test for excess errors)
FAIL: libgomp.fortran/target-present-2.f90   -O0  execution test
[...]
PASS: libgomp.fortran/target-present-3.f90   -O0  (test for excess errors)
FAIL: libgomp.fortran/target-present-3.f90   -O0  execution test
[...]

Also, verify reaching a checkpoint before the expected error condition -- and
fix up one case where that didn't happen; missing OpenMP 'map' clauses
('libgomp.fortran/target-present-2.f90').

Fix-up for recent og12 commit 229b705862c1d7f9634f72272b77c22970baf821
"openmp: Add support for the 'present' modifier"

	libgomp/
	* testsuite/libgomp.c-c++-common/target-present-1.c: Fix.
	* testsuite/libgomp.c-c++-common/target-present-2.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
	* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
---
 libgomp/ChangeLog.omp   |  9 +
 .../libgomp.c-c++-common/target-present-1.c |  9 ++---
 .../libgomp.c-c++-common/target-present-2.c | 11 +++
 .../libgomp.c-c++-common/target-present-3.c |  9 +
 .../testsuite/libgomp.fortran/target-present-1.f90  |  9 ++---
 .../testsuite/libgomp.fortran/target-present-2.f90  | 13 -
 .../testsuite/libgomp.fortran/target-present-3.f90  |  9 ++---
 7 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index b638cdbb41e..5257ee00e0c 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,12 @@
+2023-02-15  Thomas Schwinge  
+
+	* testsuite/libgomp.c-c++-common/target-present-1.c: Fix.
+	* testsuite/libgomp.c-c++-common/target-present-2.c: Likewise.
+	* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
+	* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
+	* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
+	* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
+
 2023-02-15  Tobias Burnus  
 
 	Backported from master:
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-present-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-present-1.c
index bbc4559b12e..55aecd1c8d1 100644
--- a/l

Re: [PATCH] Fix PR target/90458

2023-02-15 Thread Jeff Law via Gcc-patches





On 2/15/23 08:24, Eric Botcazou via Gcc-patches wrote:

Hi,

this is the incompatibility of -fstack-clash-protection with Windows SEH.  Now
the Windows ports always enable TARGET_STACK_PROBE, which means that the stack
is always probed (out of line) so -fstack-clash-protection does nothing more.

Tested on x86-64/Windows and Linux, OK for all active branches?


2023-02-15  Eric Botcazou  

* config/i386/i386.cc (ix86_compute_frame_layout): Disable the
effects of -fstack-clash-protection for TARGET_STACK_PROBE.
(ix86_expand_prologue): Likewise.


OK.  Thanks for taking care of this.  I let it languish far too long.

jeff

[PATCH] testsuite: Handle "packed" targets in c-c++-common/auto-init-7.c and -8.c

2023-02-15 Thread Hans-Peter Nilsson via Gcc-patches

Tested for cris-elf.  Ok to commit?

-- >8 --
Looks like there's a failed assumption that
sizeof (union U { char u1[5]; int u2; float u3; }) == 8.
However, for "packed" targets like cris-elf, it's 5.
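
The size difference can be reproduced on an ordinary host with a small sketch. The `PackedU` type below is hypothetical and uses the GNU `packed` attribute (accepted by GCC and Clang) only to approximate what a "packed" target does by default; it is not part of the testcase.

```cpp
#include <cassert>
#include <cstddef>

// The union from the testcase, laid out with the target's default alignment.
// On targets where int and float are 4-byte aligned, its size rounds up to 8.
union U { char u1[5]; int u2; float u3; };

// Same members, but packed via the GNU attribute, so no tail padding is
// added and the size is that of the largest member.
union __attribute__((packed)) PackedU { char u1[5]; int u2; float u3; };

inline void self_check() {
    assert(sizeof(PackedU) == 5);
    assert(sizeof(U) >= sizeof(PackedU));
    assert(sizeof(U) % alignof(U) == 0);
}
```

The largest member is the 5-byte array in both cases; only the alignment requirement, and therefore the tail padding, differs.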

These two tests have always failed for cris-elf.  I see from
https://gcc.gnu.org/pipermail/gcc-testresults/2023-February/777912.html
that they fail on pru-elf too, but I don't know if the cause
(and/or remedy) is the same.

IMHO this is preferred over the alternative; splitting up
that last line into two lines, like:
/* { dg-final { scan-tree-dump "temp4 = \
 .DEFERRED_INIT \\(8, 2, \&\"temp4\"" "gimple" { target { ! default_packed } } } } */
/* { dg-final { scan-tree-dump "temp4 = \
 .DEFERRED_INIT \\(5, 2, \&\"temp4\"" "gimple" { target default_packed } } } */

gcc/testsuite:
* c-c++-common/auto-init-7.c, c-c++-common/auto-init-8.c: Also
match targets where sizeof (union U) == 5, like "packed" targets.
---
 gcc/testsuite/c-c++-common/auto-init-7.c | 2 +-
 gcc/testsuite/c-c++-common/auto-init-8.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/auto-init-7.c b/gcc/testsuite/c-c++-common/auto-init-7.c
index b44dd5e68ed1..dd48d691596f 100644
--- a/gcc/testsuite/c-c++-common/auto-init-7.c
+++ b/gcc/testsuite/c-c++-common/auto-init-7.c
@@ -32,4 +32,4 @@ double foo()
 /* { dg-final { scan-tree-dump "temp1 = .DEFERRED_INIT \\(12, 2, \&\"temp1\"" "gimple" } } */
 /* { dg-final { scan-tree-dump "temp2 = .DEFERRED_INIT \\(24, 2, \&\"temp2\"" "gimple" } } */
 /* { dg-final { scan-tree-dump "temp3 = .DEFERRED_INIT \\(28, 2, \&\"temp3\"" "gimple" } } */
-/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\(8, 2, \&\"temp4\"" "gimple" } } */
+/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\((8|5), 2, \&\"temp4\"" "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/auto-init-8.c b/gcc/testsuite/c-c++-common/auto-init-8.c
index 739ac0289315..863f2ba87d7d 100644
--- a/gcc/testsuite/c-c++-common/auto-init-8.c
+++ b/gcc/testsuite/c-c++-common/auto-init-8.c
@@ -32,4 +32,4 @@ double foo()
 /* { dg-final { scan-tree-dump "temp1 = .DEFERRED_INIT \\(12, 1, \&\"temp1\"" "gimple" } } */
 /* { dg-final { scan-tree-dump "temp2 = .DEFERRED_INIT \\(24, 1, \&\"temp2\"" "gimple" } } */
 /* { dg-final { scan-tree-dump "temp3 = .DEFERRED_INIT \\(28, 1, \&\"temp3\"" "gimple" } } */
-/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\(8, 1, \&\"temp4\"" "gimple" } } */
+/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\((8|5), 1, \&\"temp4\"" "gimple" } } */
-- 
2.30.2

Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-15 Thread Jason Merrill via Gcc-patches


On 2/15/23 09:21, Patrick Palka wrote:

On Tue, 14 Feb 2023, Jason Merrill wrote:


On 2/13/23 09:23, Patrick Palka wrote:

[N.B. this is a corrected version of
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]

This patch factors out the TYPENAME_TYPE case of tsubst into a separate
function tsubst_typename_type.  It also factors out the two tsubst flags
controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
into distinct boolean parameters of this new function (and of
make_typename_type).  Consequently, callers which used to pass tf_tst_ok
to tsubst now instead must directly call tsubst_typename_type when
appropriate.


Hmm, I don't love how that turns 4 lines into 8 more complex lines in each
caller.  And the previous approach of saying "a CTAD placeholder is OK" seems
like a better abstraction than repeating the specific TYPENAME_TYPE handling in
each place.


Ah yeah, I see what you mean.  I was thinking since tf_tst_ok is
specific to TYPENAME_TYPE handling and isn't propagated (i.e. it only
affects top-level TYPENAME_TYPEs), it seemed cleaner to encode the flag
as a bool parameter "template_ok" of tsubst_typename_type instead of as
a global tsubst_flag that gets propagated freely.




In a subsequent patch we'll add another flag to
tsubst_typename_type controlling whether we want to ignore non-types
during the qualified lookup.


As mentioned above, the second patch in this series would just add
another flag "type_only" alongside "template_ok", since this flag
also only affects top-level TYPENAME_TYPEs and doesn't need to propagate
like tsubst_flags.

Except, it turns out, this new flag _does_ need to propagate, namely when
expanding a variadic using:

   using typename Ts::type::m...; // from typename25a.C below

Here we have a USING_DECL whose USING_DECL_SCOPE is a
TYPE_PACK_EXPANSION over TYPENAME_TYPE.  In order to correctly
substitute this TYPENAME_TYPE, the USING_DECL case of tsubst_decl needs
to pass an appropriate tsubst_flag to tsubst_pack_expansion to be
propagated to tsubst (to be propagated to make_typename_type).

So in light of this case it seems adding a new tsubst_flag is the
way to go, which means we can avoid this refactoring patch entirely.

Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.


OK, though I still wonder about adding a tsubst_scope function that 
would add the tf_qualifying_scope.



-- >8 --

Subject: [PATCH] c++: TYPENAME_TYPE lookup ignoring non-types [PR107773]

Currently when resolving a TYPENAME_TYPE for 'typename T::m' via
make_typename_type, we consider only type bindings of 'm' and ignore
non-type ones.  But [temp.res.general]/3 says, in a note, "the usual
qualified name lookup ([basic.lookup.qual]) applies even in the presence
of 'typename'", and qualified name lookup doesn't discriminate between
type and non-type bindings.  So when resolving such a TYPENAME_TYPE
we want the lookup to consider all bindings.

An exception is when we have a TYPENAME_TYPE corresponding to the
qualifying scope of the :: scope resolution operator, such as
'T::type' in 'typename T::type::m'.  In that case, [basic.lookup.qual]/1
applies, and lookup for such a TYPENAME_TYPE must ignore non-type bindings.
So in order to correctly handle all cases, make_typename_type needs an
additional flag controlling whether lookup should ignore non-types or not.

To that end this patch adds a new tsubst flag tf_qualifying_scope to
communicate to make_typename_type whether we want to ignore non-type
bindings during the lookup (by default we don't want to ignore them).
In contexts where we do want to ignore non-types (when substituting
into the scope of TYPENAME_TYPE, SCOPE_REF or USING_DECL) we simply
pass tf_qualifying_scope to the relevant tsubst / tsubst_copy call.
This flag is intended to apply only to top-level TYPENAME_TYPEs so
we must be careful to clear the flag to avoid propagating it during
substitution of sub-trees.
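
To make the qualifying-scope case concrete, here is a small sketch. The names `S` and `Pick` are hypothetical and this is not one of the testcases from the patch. The class deliberately gives `type` both a type binding (a nested class) and a non-type binding (a data member that hides the class name), so the result depends on `T::type` being looked up with non-types ignored. Acceptance may vary between compilers and versions.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class: "type" names both a nested class and a data member.
// The data member hides the class name for ordinary lookup.
struct S {
    struct type { using m = long; };
    int type;
};

template <class T>
struct Pick {
    // T::type is the qualifying scope of ::m here, so its lookup considers
    // only types and finds the nested class rather than the data member.
    using result = typename T::type::m;
};

inline void self_check() {
    assert((std::is_same<Pick<S>::result, long>::value));
}
```

By contrast, the final name `m` in `typename T::type::m` is not a qualifying scope, which is the position where the message above says all bindings should be considered.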

PR c++/107773

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): New flag tf_qualifying_scope.
* decl.cc (make_typename_type): Use lookup_member instead of
lookup_field.  If tf_qualifying_scope is set, pass want_type=true
instead of =false to lookup_member.  Generalize format specifier
in diagnostic to handle both type and non-type bindings.
* pt.cc (tsubst_aggr_type_1): Clear tf_qualifying_scope.  Tidy
the function.
(tsubst_decl) : Set tf_qualifying_scope when
substituting USING_DECL_SCOPE.
(tsubst): Clear tf_qualifying_scope right away and remember if
it was set.  Do the same for tf_tst_ok sooner.
: Set tf_qualifying_scope when substituting
TYPE_CONTEXT.  Pass tf_qualifying_scope to make_typename_type
if it was set.
(tsubst_qualified_id): Set tf_qualifying_scope when substituting
the scope.
(tsubst_copy): Clear tf_qualifying_scope and remember if it was
set.
: Set

Re: [PATCH] testsuite: Handle "packed" targets in c-c++-common/auto-init-7.c and -8.c

2023-02-15 Thread Qing Zhao via Gcc-patches

Thank you for fixing this issue.

Qing

> On Feb 15, 2023, at 2:19 PM, Hans-Peter Nilsson  wrote:
> 
> Tested for cris-elf.  Ok to commit?
> 
> -- >8 --
> Looks like there's a failed assumption that
> sizeof (union U { char u1[5]; int u2; float u3; }) == 8.
> However, for "packed" targets like cris-elf, it's 5.
> 
> These two tests have always failed for cris-elf.  I see from
> https://gcc.gnu.org/pipermail/gcc-testresults/2023-February/777912.html
> that they fail on pru-elf too, but I don't know if the cause
> (and/or remedy) is the same.
> 
> IMHO this is preferred over the alternative; splitting up
> that last line into two lines, like:
> /* { dg-final { scan-tree-dump "temp4 = \
> .DEFERRED_INIT \\(8, 2, \&\"temp4\"" "gimple" { target { ! default_packed } } 
> } } */
> /* { dg-final { scan-tree-dump "temp4 = \
> .DEFERRED_INIT \\(5, 2, \&\"temp4\"" "gimple" { target default_packed } } } */
> 
> gcc/testsuite:
>   * c-c++-common/auto-init-7.c, c-c++-common/auto-init-8.c: Also
>   match targets where sizeof (union U) == 5, like "packed" targets.
> ---
> gcc/testsuite/c-c++-common/auto-init-7.c | 2 +-
> gcc/testsuite/c-c++-common/auto-init-8.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/c-c++-common/auto-init-7.c 
> b/gcc/testsuite/c-c++-common/auto-init-7.c
> index b44dd5e68ed1..dd48d691596f 100644
> --- a/gcc/testsuite/c-c++-common/auto-init-7.c
> +++ b/gcc/testsuite/c-c++-common/auto-init-7.c
> @@ -32,4 +32,4 @@ double foo()
> /* { dg-final { scan-tree-dump "temp1 = .DEFERRED_INIT \\(12, 2, \&\"temp1\"" 
> "gimple" } } */
> /* { dg-final { scan-tree-dump "temp2 = .DEFERRED_INIT \\(24, 2, \&\"temp2\"" 
> "gimple" } } */
> /* { dg-final { scan-tree-dump "temp3 = .DEFERRED_INIT \\(28, 2, \&\"temp3\"" 
> "gimple" } } */
> -/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\(8, 2, \&\"temp4\"" 
> "gimple" } } */
> +/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\((8|5), 2, 
> \&\"temp4\"" "gimple" } } */
> diff --git a/gcc/testsuite/c-c++-common/auto-init-8.c 
> b/gcc/testsuite/c-c++-common/auto-init-8.c
> index 739ac0289315..863f2ba87d7d 100644
> --- a/gcc/testsuite/c-c++-common/auto-init-8.c
> +++ b/gcc/testsuite/c-c++-common/auto-init-8.c
> @@ -32,4 +32,4 @@ double foo()
> /* { dg-final { scan-tree-dump "temp1 = .DEFERRED_INIT \\(12, 1, \&\"temp1\"" 
> "gimple" } } */
> /* { dg-final { scan-tree-dump "temp2 = .DEFERRED_INIT \\(24, 1, \&\"temp2\"" 
> "gimple" } } */
> /* { dg-final { scan-tree-dump "temp3 = .DEFERRED_INIT \\(28, 1, \&\"temp3\"" 
> "gimple" } } */
> -/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\(8, 1, \&\"temp4\"" 
> "gimple" } } */
> +/* { dg-final { scan-tree-dump "temp4 = .DEFERRED_INIT \\((8|5), 1, 
> \&\"temp4\"" "gimple" } } */
> -- 
> 2.30.2
>

Re: [PATCH] c++: ICE with -fno-elide-constructors and trivial fn [PR101073]

2023-02-15 Thread Jason Merrill via Gcc-patches


On 2/9/23 09:39, Marek Polacek wrote:

In constexpr-nsdmi3.C, with -fno-elide-constructors, we don't elide
the Y::Y(const Y&) call used to initialize o.c.  So store_init_value
-> cxx_constant_init must constexpr-evaluate the call to Y::Y(const Y&)
in cxx_eval_call_expression.  It's a trivial function, so we do the
"Shortcut trivial constructor/op=" code and rather than evaluating
the function, we just create an assignment

   o.c = *(const struct Y &) (const struct Y *) &(&)->b

which is a MODIFY_EXPR, so the preeval code in cxx_eval_store_expression
clears .ctor and .object, therefore we can't replace the PLACEHOLDER_EXPR
whereupon we crash at

   /* A placeholder without a referent.  We can get here when
  checking whether NSDMIs are noexcept, or in massage_init_elt;
  just say it's non-constant for now.  */
   gcc_assert (ctx->quiet);

The PLACEHOLDER_EXPR can also be on the LHS as in constexpr-nsdmi10.C.
I don't think we can do much here, but I noticed that the whole
trivial_fn_p (fun) block is only entered when -fno-elide-constructors.
This is true since GCC 9; it wasn't easy to bisect what changes made it
so, but r240845 is probably one of them.  -fno-elide-constructors is an
option for experiments only so it's not clear to me why we'd still want
to shortcut trivial constructor/op=.  I propose to remove the code and
add a checking assert to make sure we're not getting a trivial_fn_p
unless -fno-elide-constructors.


Hmm, trivial op= doesn't ever hit this code?


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?  I don't
think I want to backport this.

PR c++/101073

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Replace shortcutting trivial
constructor/op= with a checking assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-nsdmi3.C: New test.
* g++.dg/cpp1y/constexpr-nsdmi10.C: New test.
---
  gcc/cp/constexpr.cc   | 25 +++
  gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C | 17 +
  .../g++.dg/cpp1y/constexpr-nsdmi10.C  | 18 +
  3 files changed, 38 insertions(+), 22 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 564766c8a00..1d53dcf0f20 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2865,28 +2865,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
ctx = &new_ctx;
  }
  
-  /* Shortcut trivial constructor/op=.  */

-  if (trivial_fn_p (fun))
-{
-  tree init = NULL_TREE;
-  if (call_expr_nargs (t) == 2)
-   init = convert_from_reference (get_nth_callarg (t, 1));
-  else if (TREE_CODE (t) == AGGR_INIT_EXPR
-  && AGGR_INIT_ZERO_FIRST (t))
-   init = build_zero_init (DECL_CONTEXT (fun), NULL_TREE, false);
-  if (init)
-   {
- tree op = get_nth_callarg (t, 0);
- if (is_dummy_object (op))
-   op = ctx->object;
- else
-   op = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (op)), op);
- tree set = build2 (MODIFY_EXPR, TREE_TYPE (op), op, init);


I think the problem is using MODIFY_EXPR instead of INIT_EXPR to 
represent a constructor; that's why cxx_eval_store_expression thinks 
it's OK to preevaluate.  This should properly use those two tree codes 
for op= and ctor, respectively.



- new_ctx.call = &new_call;
- return cxx_eval_constant_expression (&new_ctx, set, lval,
-  non_constant_p, overflow_p);
-   }
-}
+  /* We used to shortcut trivial constructor/op= here, but nowadays
+ we can only get a trivial function here with -fno-elide-constructors.  */
+  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);


...but if this optimization is so rarely triggered, this simplification 
is OK too.



bool non_constant_args = false;
new_call.bindings
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C
new file mode 100644
index 000..ec1c4e53387
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C
@@ -0,0 +1,17 @@
+// PR c++/101073
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-fno-elide-constructors" }
+
+struct Y
+{
+  int a;
+};
+
+struct X
+{
+  Y b = Y{1};
+  Y c = this->b;
+};
+
+constexpr X o = { };
+static_assert(o.b.a == 1 && o.c.a == 1, "");
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C 
b/gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C
new file mode 100644
index 000..35cb8acc15b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C
@@ -0,0 +1,18 @@
+// PR c++/101073
+// { dg-do compile { target c++14 } }
+// { dg-additional-options "-fno-elide-constructors" }
+// A copy of constexpr-nsdmi9.C.
+
+struct Y
+{
+  int a;
+};
+
+s

[pushed] analyzer: fix uninit false +ves [PR108664, PR108666, PR108725]

2023-02-15 Thread David Malcolm via Gcc-patches

This patch updates poisoned_value_diagnostic so that, where possible,
it checks to see if the value is still poisoned along the execution
path seen during feasibility analysis, rather than just that seen
in the exploded graph.
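
As a purely hypothetical illustration (not one of the testcases added by this patch, and with no claim about which GCC versions warn on it), the following shows the general kind of code where path feasibility matters for an uninitialized-value diagnostic: the read of `value` is only reachable on paths where it was assigned, so a path that skips the assignment but reaches the read is infeasible.

```cpp
#include <cassert>

// Hypothetical example. "value" is assigned and read under the same
// condition, so no feasible execution path reads it uninitialized.
inline int doubled_or_default(bool have_input, int input) {
    int value;  // intentionally not initialized at the declaration
    if (have_input)
        value = input * 2;
    if (have_input)
        return value;  // reached only on paths where value was assigned
    return -1;
}

inline void self_check() {
    assert(doubled_or_default(true, 21) == 42);
    assert(doubled_or_default(false, 21) == -1);
}
```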

Integration testing shows this reduction in the number of
false positives:
  -Wanalyzer-use-of-uninitialized-value: 191 -> 153 (-38)
where the changes happen in:
  coreutils-9.1: 34 -> 20 (-14)
 qemu-7.2.0: 78 -> 54 (-24)

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-6064-gb03a10b0b25cef.

gcc/analyzer/ChangeLog:
PR analyzer/108664
PR analyzer/108666
PR analyzer/108725
* diagnostic-manager.cc (epath_finder::get_best_epath): Add
"target_stmt" param.
(epath_finder::explore_feasible_paths): Likewise.
(epath_finder::process_worklist_item): Likewise.
(saved_diagnostic::calc_best_epath): Pass m_stmt to
epath_finder::get_best_epath.
* engine.cc (feasibility_state::maybe_update_for_edge): Move
per-stmt logic to...
(feasibility_state::update_for_stmt): ...this new function.
* exploded-graph.h (feasibility_state::update_for_stmt): New decl.
* feasible-graph.cc (feasible_node::get_state_at_stmt): New.
* feasible-graph.h: Include "analyzer/exploded-graph.h".
 (feasible_node::get_state_at_stmt): New decl.
* infinite-recursion.cc
(infinite_recursion_diagnostic::check_valid_fpath_p): Update for
vfunc signature change.
* pending-diagnostic.h (pending_diagnostic::check_valid_fpath_p):
Convert first param to a reference.  Add stmt param.
* region-model.cc: Include "analyzer/feasible-graph.h".
(poisoned_value_diagnostic::poisoned_value_diagnostic): Add
"check_expr" param.
(poisoned_value_diagnostic::check_valid_fpath_p): New.
(poisoned_value_diagnostic::m_check_expr): New field.
(region_model::check_for_poison): Attempt to supply a check_expr
to the diagnostic
(region_model::deref_rvalue): Add NULL for new check_expr param
of poisoned_value_diagnostic.
(region_model::get_or_create_region_for_heap_alloc): Don't reuse
regions that are marked as TOUCHED.

gcc/testsuite/ChangeLog:
PR analyzer/108664
PR analyzer/108666
PR analyzer/108725
* gcc.dg/analyzer/coreutils-cksum-pr108664.c: New test.
* gcc.dg/analyzer/coreutils-sum-pr108666.c: New test.
* gcc.dg/analyzer/torture/uninit-pr108725.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/diagnostic-manager.cc| 23 -
 gcc/analyzer/engine.cc| 30 +++---
 gcc/analyzer/exploded-graph.h |  1 +
 gcc/analyzer/feasible-graph.cc| 30 ++
 gcc/analyzer/feasible-graph.h |  5 +
 gcc/analyzer/infinite-recursion.cc|  7 +-
 gcc/analyzer/pending-diagnostic.h |  3 +-
 gcc/analyzer/region-model.cc  | 71 +-
 .../analyzer/coreutils-cksum-pr108664.c   | 80 +++
 .../gcc.dg/analyzer/coreutils-sum-pr108666.c  | 98 +++
 .../gcc.dg/analyzer/torture/uninit-pr108725.c | 19 
 11 files changed, 343 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/coreutils-cksum-pr108664.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/coreutils-sum-pr108666.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-pr108725.c

diff --git a/gcc/analyzer/diagnostic-manager.cc 
b/gcc/analyzer/diagnostic-manager.cc
index 4f036a6c28a..0a447f7ba26 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -88,6 +88,7 @@ public:
 
   std::unique_ptr
   get_best_epath (const exploded_node *target_enode,
+ const gimple *target_stmt,
  const pending_diagnostic &pd,
  const char *desc, unsigned diag_idx,
  std::unique_ptr *out_problem);
@@ -97,6 +98,7 @@ private:
 
   std::unique_ptr
   explore_feasible_paths (const exploded_node *target_enode,
+ const gimple *target_stmt,
  const pending_diagnostic &pd,
  const char *desc, unsigned diag_idx);
   bool
@@ -104,6 +106,7 @@ private:
 const trimmed_graph &tg,
 feasible_graph *fg,
 const exploded_node *target_enode,
+const gimple *target_stmt,
 const pending_diagnostic &pd,
 unsigned diag_idx,
 std::unique_ptr *out_best_path) const;
@@ -128,6 +131,9 @@ private:
 /* Get the "best" exploded_path for reaching ENODE from the origin,
returning ownership of it to the caller.
 
+   If TARGET_STMT is non-NULL, then check for reaching that stmt
+   within ENODE.
+

Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-15 Thread Patrick Palka via Gcc-patches

On Wed, 15 Feb 2023, Jason Merrill wrote:

> On 2/15/23 09:21, Patrick Palka wrote:
> > On Tue, 14 Feb 2023, Jason Merrill wrote:
> > 
> > > On 2/13/23 09:23, Patrick Palka wrote:
> > > > [N.B. this is a corrected version of
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]
> > > > 
> > > > This patch factors out the TYPENAME_TYPE case of tsubst into a separate
> > > > function tsubst_typename_type.  It also factors out the two tsubst flags
> > > > controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
> > > > into distinct boolean parameters of this new function (and of
> > > > make_typename_type).  Consequently, callers which used to pass tf_tst_ok
> > > > to tsubst now instead must directly call tsubst_typename_type when
> > > > appropriate.
> > > 
> > > Hmm, I don't love how that turns 4 lines into 8 more complex lines in each
> > > caller.  And the previous approach of saying "a CTAD placeholder is OK"
> > > seems like a better abstraction than repeating the specific TYPENAME_TYPE
> > > handling in each place.
> > 
> > Ah yeah, I see what you mean.  I was thinking since tf_tst_ok is
> > specific to TYPENAME_TYPE handling and isn't propagated (i.e. it only
> > affects top-level TYPENAME_TYPEs), it seemed cleaner to encode the flag
> > as a bool parameter "template_ok" of tsubst_typename_type instead of as
> > a global tsubst_flag that gets propagated freely.
> > 
> > > 
> > > > In a subsequent patch we'll add another flag to
> > > > tsubst_typename_type controlling whether we want to ignore non-types
> > > > during the qualified lookup.
> > 
> > As mentioned above, the second patch in this series would just add
> > another flag "type_only" alongside "template_ok", since this flag
> > also only affects top-level TYPENAME_TYPEs and doesn't need to propagate
> > like tsubst_flags.
> > 
> > Except, it turns out, this new flag _does_ need to propagate, namely when
> > expanding a variadic using:
> > 
> >using typename Ts::type::m...; // from typename25a.C below
> > 
> > Here we have a USING_DECL whose USING_DECL_SCOPE is a
> > TYPE_PACK_EXPANSION over TYPENAME_TYPE.  In order to correctly
> > substitute this TYPENAME_TYPE, the USING_DECL case of tsubst_decl needs
> > to pass an appropriate tsubst_flag to tsubst_pack_expansion to be
> > propagated to tsubst (to be propagated to make_typename_type).
> > 
> > So in light of this case it seems adding a new tsubst_flag is the
> > way to go, which means we can avoid this refactoring patch entirely.
> > 
> > Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.
> 
> OK, though I still wonder about adding a tsubst_scope function that would add
> the tf_qualifying_scope.

Hmm, but we need to add tf_qualifying_scope to two tsubst_copy calls,
one tsubst call and one tsubst_aggr_type call (with entering_scope=true).
Would tsubst_scope call tsubst, tsubst_copy or tsubst_aggr_type?

> 
> > -- >8 --
> > 
> > Subject: [PATCH] c++: TYPENAME_TYPE lookup ignoring non-types [PR107773]
> > 
> > Currently when resolving a TYPENAME_TYPE for 'typename T::m' via
> > make_typename_type, we consider only type bindings of 'm' and ignore
> > non-type ones.  But [temp.res.general]/3 says, in a note, "the usual
> > qualified name lookup ([basic.lookup.qual]) applies even in the presence
> > of 'typename'", and qualified name lookup doesn't discriminate between
> > type and non-type bindings.  So when resolving such a TYPENAME_TYPE
> > we want the lookup to consider all bindings.
> > 
> > An exception is when we have a TYPENAME_TYPE corresponding to the
> > qualifying scope of the :: scope resolution operator, such as
> > 'T::type' in 'typename T::type::m'.  In that case, [basic.lookup.qual]/1
> > applies, and lookup for such a TYPENAME_TYPE must ignore non-type bindings.
> > So in order to correctly handle all cases, make_typename_type needs an
> > additional flag controlling whether lookup should ignore non-types or not.
> > 
> > To that end this patch adds a new tsubst flag tf_qualifying_scope to
> > communicate to make_typename_type whether we want to ignore non-type
> > bindings during the lookup (by default we don't want to ignore them).
> > In contexts where we do want to ignore non-types (when substituting
> > into the scope of TYPENAME_TYPE, SCOPE_REF or USING_DECL) we simply
> > pass tf_qualifying_scope to the relevant tsubst / tsubst_copy call.
> > This flag is intended to apply only to top-level TYPENAME_TYPEs so
> > we must be careful to clear the flag to avoid propagating it during
> > substitution of sub-trees.
> > 
> > PR c++/107773
> > 
> > gcc/cp/ChangeLog:
> > 
> > * cp-tree.h (enum tsubst_flags): New flag tf_qualifying_scope.
> > * decl.cc (make_typename_type): Use lookup_member instead of
> > lookup_field.  If tf_qualifying_scope is set, pass want_type=true
> > instead of =false to lookup_member.  Generalize format specifier
> > in diagnostic to handle both type and non-type

[PATCH] i386: Relax extract location operand mode requirements

2023-02-15 Thread Uros Bizjak via Gcc-patches

There is no requirement on the mode of the location operand, so any
supported integer mode is valid.  We can relax the extract location
operand mode requirements of other patterns involving zero_extract RTX.

2023-02-15  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/i386.md (*cmpqi_ext_1): Use
int248_register_operand predicate in zero_extract sub-RTX.
(*cmpqi_ext_2): Ditto.
(*cmpqi_ext_3): Ditto.
(*cmpqi_ext_4): Ditto.
(*extzvqi_mem_rex64): Ditto.
(*extzvqi): Ditto.
(*insvqi_1_mem_rex64): Ditto.
(@insv_1): Ditto.
(*insvqi_1): Ditto.
(*insvqi_2): Ditto.
(*insvqi_3): Ditto.
(*extendqi_ext_1): Ditto.
(*addqi_ext_1): Ditto.
(*addqi_ext_2): Ditto.
(*subqi_ext_2): Ditto.
(*testqi_ext_1): Ditto.
(*testqi_ext_2): Ditto.
(*andqi_ext_1): Ditto.
(*andqi_ext_1_cc): Ditto.
(*andqi_ext_2): Ditto.
(*qi_ext_1): Ditto.
(*qi_ext_2): Ditto.
(*xorqi_ext_1_cc): Ditto.
(*negqi_ext_2): Ditto.
(*ashlqi_ext_2): Ditto.
(*qi_ext_2): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e37bc8dca53..198f06e0769 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1459,7 +1459,7 @@
  (match_operand:QI 0 "nonimmediate_operand" "QBc,m")
  (subreg:QI
(zero_extract:SWI248
- (match_operand:SWI248 1 "register_operand" "Q,Q")
+ (match_operand 1 "int248_register_operand" "Q,Q")
  (const_int 8)
  (const_int 8)) 0)))]
   "ix86_match_ccmode (insn, CCmode)"
@@ -1473,7 +1473,7 @@
(compare
  (subreg:QI
(zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "Q")
+ (match_operand 0 "int248_register_operand" "Q")
  (const_int 8)
  (const_int 8)) 0)
  (match_operand:QI 1 "const0_operand")))]
@@ -1498,7 +1498,7 @@
(compare
  (subreg:QI
(zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "Q,Q")
+ (match_operand 0 "int248_register_operand" "Q,Q")
  (const_int 8)
  (const_int 8)) 0)
  (match_operand:QI 1 "general_operand" "QnBc,m")))]
@@ -1513,12 +1513,12 @@
(compare
  (subreg:QI
(zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "Q")
+ (match_operand 0 "int248_register_operand" "Q")
  (const_int 8)
  (const_int 8)) 0)
  (subreg:QI
(zero_extract:SWI248
- (match_operand:SWI248 1 "register_operand" "Q")
+ (match_operand 1 "int248_register_operand" "Q")
  (const_int 8)
  (const_int 8)) 0)))]
   "ix86_match_ccmode (insn, CCmode)"
@@ -3192,7 +3192,7 @@
   [(set (match_operand:QI 0 "norex_memory_operand" "=Bn")
(subreg:QI
  (zero_extract:SWI248
-   (match_operand:SWI248 1 "register_operand" "Q")
+   (match_operand 1 "int248_register_operand" "Q")
(const_int 8)
(const_int 8)) 0))]
   "TARGET_64BIT && reload_completed"
@@ -3214,7 +3214,7 @@
   [(set (match_operand:QI 0 "nonimmediate_operand" "=QBc,?R,m")
(subreg:QI
  (zero_extract:SWI248
-   (match_operand:SWI248 1 "register_operand" "Q,Q,Q")
+   (match_operand 1 "int248_register_operand" "Q,Q,Q")
(const_int 8)
(const_int 8)) 0))]
   ""
@@ -3242,7 +3242,7 @@
 (define_peephole2
   [(set (match_operand:QI 0 "register_operand")
(subreg:QI
- (zero_extract:SWI248 (match_operand:SWI248 1 "register_operand")
+ (zero_extract:SWI248 (match_operand 1 "int248_register_operand")
   (const_int 8)
   (const_int 8)) 0))
(set (match_operand:QI 2 "norex_memory_operand") (match_dup 0))]
@@ -3289,7 +3289,7 @@
 
 (define_insn "*insvqi_1_mem_rex64"
   [(set (zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "+Q")
+ (match_operand 0 "int248_register_operand" "+Q")
  (const_int 8)
  (const_int 8))
(subreg:SWI248
@@ -3301,7 +3301,7 @@
 
 (define_insn "@insv_1"
   [(set (zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "+Q,Q")
+ (match_operand 0 "int248_register_operand" "+Q,Q")
  (const_int 8)
  (const_int 8))
(match_operand:SWI248 1 "general_operand" "QnBc,m"))]
@@ -3317,7 +3317,7 @@
 
 (define_insn "*insvqi_1"
   [(set (zero_extract:SWI248
- (match_operand:SWI248 0 "register_operand" "+Q,Q")
+ (match_operand 0 "int248_register_operand" "+Q,Q")
  (const_int 8)
  (const_int 8))
(subreg:SWI248
@@ -3331,7 +3331,7 @@
 (define_peephole2
   [(set (match_operand:QI 0 "register_operand")
(match_operand:QI 1 "norex_memory_operand"))
-   (set (zero_e

[PATCH 0/7] Work on PR108030 and several simd bugfixes and testsuite improvements

2023-02-15 Thread Matthias Kretz via Gcc-patches

As suggested in PR108030, I used __attribute__ syntax to annotate lambdas
as always_inline. In a few cases the lambda was meant to be a function
boundary and the attribute was omitted.

PR108030 mentions a few more functions as problematic. But ideally these 
should not be inline in some fixed_size_simd cases. This needs further 
verification.

This fix is not simply an optimization. If the user hits this bug then 
using simd makes the code significantly slower than without using simd. 
That defeats the whole purpose of the type.

While doing verification I found a few more issues and implemented the use 
of PCH to speed up the test suite.

Matthias Kretz (7):
  libstdc++: Ensure __builtin_constant_p isn't lost on the way
  libstdc++: Annotate most lambdas with always_inline
  libstdc++: Document timeout and timeout-factor of simd tests
  libstdc++: Use a PCH to speed up check-simd
  libstdc++: printf format string fix in testsuite
  libstdc++: Fix incorrect __builtin_is_constant_evaluated calls
  libstdc++: Fix incorrect function call in -ffast-math optimization

 libstdc++-v3/include/experimental/bits/simd.h | 245 ++--
 .../include/experimental/bits/simd_builtin.h  | 351 ++
 .../experimental/bits/simd_converter.h|  22 +-
 .../include/experimental/bits/simd_detail.h   |   3 +
 .../experimental/bits/simd_fixed_size.h   | 265 ++---
 .../include/experimental/bits/simd_math.h |  56 +--
 .../include/experimental/bits/simd_neon.h |  14 +-
 .../include/experimental/bits/simd_x86.h  | 143 +++
 .../testsuite/experimental/simd/README.md |  10 +-
 .../experimental/simd/generate_makefile.sh|  24 +-
 .../testsuite/experimental/simd/tests/abs.cc  |   4 +-
 .../experimental/simd/tests/algorithms.cc |   3 +-
 .../simd/tests/bits/conversions.h |  25 +-
 .../experimental/simd/tests/bits/main.h   |  87 +
 .../experimental/simd/tests/bits/make_vec.h   |  10 +
 .../simd/tests/bits/mathreference.h   |   3 +
 .../simd/tests/bits/test_values.h |   6 +
 .../experimental/simd/tests/bits/verify.h |  66 +---
 .../experimental/simd/tests/broadcast.cc  |   3 +-
 .../experimental/simd/tests/casts.cc  |   4 +-
 .../experimental/simd/tests/fpclassify.cc |   4 +-
 .../experimental/simd/tests/frexp.cc  |   4 +-
 .../experimental/simd/tests/generator.cc  |   3 +-
 .../experimental/simd/tests/hypot3_fma.cc |   4 +-
 .../simd/tests/integer_operators.cc   |   5 +-
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |   4 +-
 .../experimental/simd/tests/loadstore.cc  |   4 +-
 .../experimental/simd/tests/logarithm.cc  |   5 +-
 .../experimental/simd/tests/mask_broadcast.cc |   3 +-
 .../simd/tests/mask_conversions.cc|   2 +-
 .../simd/tests/mask_implicit_cvt.cc   |   3 +-
 .../experimental/simd/tests/mask_loadstore.cc |  29 +-
 .../simd/tests/mask_operator_cvt.cc   |   3 +-
 .../experimental/simd/tests/mask_operators.cc |   3 +-
 .../simd/tests/mask_reductions.cc |  30 +-
 .../experimental/simd/tests/math_1arg.cc  |   3 +-
 .../experimental/simd/tests/math_2arg.cc  |   4 +-
 .../experimental/simd/tests/operator_cvt.cc   |   3 +-
 .../experimental/simd/tests/operators.cc  |  14 +-
 .../experimental/simd/tests/reductions.cc |   4 +-
 .../experimental/simd/tests/remqo.cc  |   4 +-
 .../testsuite/experimental/simd/tests/simd.cc |   2 +-
 .../experimental/simd/tests/sincos.cc |   6 +-
 .../experimental/simd/tests/split_concat.cc   |   4 +-
 .../experimental/simd/tests/splits.cc |   2 +-
 .../experimental/simd/tests/trigonometric.cc  |   4 +-
 .../simd/tests/trunc_ceil_floor.cc|   3 +-
 .../experimental/simd/tests/where.cc  |   4 +-
 48 files changed, 772 insertions(+), 735 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/tests/bits/main.h

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──

[PATCH 1/7] libstdc++: Ensure __builtin_constant_p isn't lost on the way

2023-02-15 Thread Matthias Kretz via Gcc-patches



The more expensive code path should only be taken if it can be optimized
away.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h
(_SimdWrapper::_M_is_constprop_none_of)
(_SimdWrapper::_M_is_constprop_all_of): Return false unless the
computed result still satisfies __builtin_constant_p.
---
 libstdc++-v3/include/experimental/bits/simd.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index e76f4781fa6..3de966bbf22 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2673,7 +2673,8 @@ template 
 	  else
 	__execute_n_times<_Width>(
 	  [&](auto __i) { __r &= _M_data[__i.value] == _Tp(); });
-	  return __r;
+	  if (__builtin_constant_p(__r))
+	return __r;
 	}
   return false;
 }
@@ -2693,7 +2694,8 @@ template 
 	  else
 	__execute_n_times<_Width>(
 	  [&](auto __i) { __r &= _M_data[__i.value] == ~_Tp(); });
-	  return __r;
+	  if (__builtin_constant_p(__r))
+	return __r;
 	}
   return false;
 }
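
The guard added in this patch can be sketched outside libstdc++ as follows. This is only an illustration under stated assumptions: `constprop_all_zero` is a hypothetical free function, not the `_SimdWrapper` member, and it relies on the GCC/Clang builtin `__builtin_constant_p`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true only if the compiler can prove, at
// compile time, that every element is zero.  A false result means
// "not known to be all zero", not "some element is non-zero".
template <std::size_t N>
inline bool constprop_all_zero(const std::array<int, N>& a)
{
  bool r = true;
  for (std::size_t i = 0; i < N; ++i)
    r &= (a[i] == 0);
  // Only trust r if it was folded to a constant; otherwise callers would
  // pay for the loop at run time, which is what the patch avoids.
  if (__builtin_constant_p(r))
    return r;
  return false;
}
```

Whether the all-zero case actually yields true depends on the optimization level, so callers should treat the result as a hint and keep a general fallback path.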

[PATCH 3/7] libstdc++: Document timeout and timeout-factor of simd tests

2023-02-15 Thread Matthias Kretz via Gcc-patches



Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/README.md: Document the timeout
and timeout-factor directives. Minor typo fixed.
---
 libstdc++-v3/testsuite/experimental/simd/README.md | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/testsuite/experimental/simd/README.md b/libstdc++-v3/testsuite/experimental/simd/README.md
index b82453df403..ef8b7c33de7 100644
--- a/libstdc++-v3/testsuite/experimental/simd/README.md
+++ b/libstdc++-v3/testsuite/experimental/simd/README.md
@@ -139,7 +139,13 @@ allowed_distance)` macros.
   test then shows as "XFAIL: ...". If the test passes, the test shows "XPASS: 
   ...".
 
-All patterns are matched via
+* `timeout: `
+  Set the timeout of this test to `` seconds.
+
+* `timeout-factor: `
+  Multiply the default timeout with ``.
+
+All patterns except `timeout` and `timeout-factor` are matched via
 ```sh
 case '' in
   )
@@ -147,7 +153,7 @@ case '' in
   ;;
 esac
 ```
-The `` is implicitly adds a `*` wildcard before and after the 
+The `` implicitly adds a `*` wildcard before and after the 
 pattern. Thus, the `CXXFLAGS` pattern matches a substring and all other 
 patterns require a full match.

[PATCH 7/7] libstdc++: Fix incorrect function call in -ffast-math optimization

2023-02-15 Thread Matthias Kretz via Gcc-patches



Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_math.h (__hypot): Bitcasting
between scalars requires the __bit_cast helper function instead
of simd_bit_cast.
---
 libstdc++-v3/include/experimental/bits/simd_math.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index c20315e4e30..c91f05fceb3 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -1010,7 +1010,7 @@ template 
 	using _IV = rebind_simd_t<_Ip, _V>;
 	const auto __as_int = simd_bit_cast<_IV>(__hi_exp);
 	const _V __scale
-	  = simd_bit_cast<_V>(2 * simd_bit_cast<_Ip>(_Tp(1)) - __as_int);
+	  = simd_bit_cast<_V>(2 * __bit_cast<_Ip>(_Tp(1)) - __as_int);
 #else
 	const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
 #endif
@@ -1181,7 +1181,7 @@ _GLIBCXX_SIMD_CVTING2(hypot)
 		using _IV = rebind_simd_t<_Ip, _V>;
 		const auto __as_int = simd_bit_cast<_IV>(__hi_exp);
 		const _V __scale
-		  = simd_bit_cast<_V>(2 * simd_bit_cast<_Ip>(_Tp(1)) - __as_int);
+		  = simd_bit_cast<_V>(2 * __bit_cast<_Ip>(_Tp(1)) - __as_int);
 #else
 		const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
 #endif
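
The scalar half of the expression fixed above can be illustrated with plain `float` values. This is a sketch, assuming IEEE-754 binary32 `float` and a 32-bit `std::uint32_t`; `scalar_bit_cast` and `reciprocal_of_pow2` are hypothetical names standing in for the internal `__bit_cast` helper and the `__scale` computation, not the libstdc++ code itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a scalar bit cast: copies the object
// representation of 'from' into a value of type To.
template <typename To, typename From>
inline To scalar_bit_cast(const From& from)
{
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// For a power of two 2^e, the bit pattern is (e + 127) << 23, so
// 2 * bits(1.0f) - bits(2^e) is (127 - e) << 23, the pattern of 2^-e.
inline float reciprocal_of_pow2(float hi_exp)
{
  const std::uint32_t one = scalar_bit_cast<std::uint32_t>(1.0f);
  const std::uint32_t as_int = scalar_bit_cast<std::uint32_t>(hi_exp);
  return scalar_bit_cast<float>(2 * one - as_int);
}
```

The identity only holds when the argument is an exact power of two, which matches the use on an extracted exponent in the hunk above.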

[PATCH 2/7] libstdc++: Annotate most lambdas with always_inline

2023-02-15 Thread Matthias Kretz via Gcc-patches



All of the annotated lambdas are simply a necessary means for
implementing these functions and should never result in an actual
function call. Many of these lambdas would go away if C++ had better
language support for packs.
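
A reduced sketch of the annotation pattern, for illustration only: `ALWAYS_INLINE_LAMBDA`, `execute_n_times` and `sum_of` are hypothetical names, and the macro definition is an assumption modelled on the `__attribute__` syntax mentioned in the cover letter, not a copy of `_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA`.

```cpp
#include <cassert>
#include <cstddef>

// Assumed definition: GNU attribute syntax placed between the lambda's
// parameter list and its body.
#if defined(__GNUC__)
#  define ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
#else
#  define ALWAYS_INLINE_LAMBDA
#endif

// Hypothetical helper that calls f(0), f(1), ..., f(N - 1).
template <std::size_t N, typename F>
inline void execute_n_times(F&& f)
{
  for (std::size_t i = 0; i < N; ++i)
    f(i);
}

inline int sum_of(const int (&data)[4])
{
  int sum = 0;
  // The lambda is only a means to express the loop body; the attribute
  // asks the compiler never to emit a real call for it.
  execute_n_times<4>(
    [&](std::size_t i) ALWAYS_INLINE_LAMBDA { sum += data[i]; });
  return sum;
}
```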

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_detail.h: Define
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd.h: Annotate lambdas with
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd_builtin.h: Ditto.
* include/experimental/bits/simd_converter.h: Ditto.
* include/experimental/bits/simd_fixed_size.h: Ditto.
* include/experimental/bits/simd_math.h: Ditto.
* include/experimental/bits/simd_neon.h: Ditto.
* include/experimental/bits/simd_x86.h: Ditto.
---
 libstdc++-v3/include/experimental/bits/simd.h | 239 ++--
 .../include/experimental/bits/simd_builtin.h  | 351 ++
 .../experimental/bits/simd_converter.h|  22 +-
 .../include/experimental/bits/simd_detail.h   |   3 +
 .../experimental/bits/simd_fixed_size.h   | 265 ++---
 .../include/experimental/bits/simd_math.h |  52 +--
 .../include/experimental/bits/simd_neon.h |  14 +-
 .../include/experimental/bits/simd_x86.h  | 122 +++---
 8 files changed, 575 insertions(+), 493 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 3de966bbf22..ffe72fa6ccf 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -609,28 +609,34 @@ template 
 	  operator&(_Ip __rhs) const
 	  {
 	return __generate_from_n_evaluations<_Np, _Ip>(
-	  [&](auto __i) { return __rhs._M_data[__i] & _M_data[__i]; });
+	  [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+		return __rhs._M_data[__i] & _M_data[__i];
+	  });
 	  }
 
 	  _GLIBCXX_SIMD_INTRINSIC constexpr _Ip
 	  operator|(_Ip __rhs) const
 	  {
 	return __generate_from_n_evaluations<_Np, _Ip>(
-	  [&](auto __i) { return __rhs._M_data[__i] | _M_data[__i]; });
+	  [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+		return __rhs._M_data[__i] | _M_data[__i];
+	  });
 	  }
 
 	  _GLIBCXX_SIMD_INTRINSIC constexpr _Ip
 	  operator^(_Ip __rhs) const
 	  {
 	return __generate_from_n_evaluations<_Np, _Ip>(
-	  [&](auto __i) { return __rhs._M_data[__i] ^ _M_data[__i]; });
+	  [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+		return __rhs._M_data[__i] ^ _M_data[__i];
+	  });
 	  }
 
 	  _GLIBCXX_SIMD_INTRINSIC constexpr _Ip
 	  operator~() const
 	  {
 	return __generate_from_n_evaluations<_Np, _Ip>(
-	  [&](auto __i) { return ~_M_data[__i]; });
+	  [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { return ~_M_data[__i]; });
 	  }
 	};
 	return _Ip{};
@@ -1391,7 +1397,7 @@ template 
 operator^=(const _BitMask& __b) & noexcept
 {
   __execute_n_times<_S_array_size>(
-	[&](auto __i) { _M_bits[__i] ^= __b._M_bits[__i]; });
+	[&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { _M_bits[__i] ^= __b._M_bits[__i]; });
   return *this;
 }
 
@@ -1399,7 +1405,7 @@ template 
 operator|=(const _BitMask& __b) & noexcept
 {
   __execute_n_times<_S_array_size>(
-	[&](auto __i) { _M_bits[__i] |= __b._M_bits[__i]; });
+	[&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { _M_bits[__i] |= __b._M_bits[__i]; });
   return *this;
 }
 
@@ -1407,7 +1413,7 @@ template 
 operator&=(const _BitMask& __b) & noexcept
 {
   __execute_n_times<_S_array_size>(
-	[&](auto __i) { _M_bits[__i] &= __b._M_bits[__i]; });
+	[&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { _M_bits[__i] &= __b._M_bits[__i]; });
   return *this;
 }
 
@@ -1797,8 +1803,9 @@ template 
   __vector_broadcast(_Tp __x)
   {
 return __call_with_n_evaluations<_Np>(
-  [](auto... __xx) { return __vector_type_t<_Tp, _Np>{__xx...}; },
-  [&__x](int) { return __x; });
+  [](auto... __xx) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+	return __vector_type_t<_Tp, _Np>{__xx...};
+  }, [&__x](int) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { return __x; });
   }
 
 // }}}
@@ -2205,7 +2212,7 @@ template (
-	  __x, [](auto... __entries) {
+	  __x, [](auto... __entries) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	return reinterpret_cast<_R>(_Up{__entries...});
 	  });
   }
@@ -2607,7 +2614,7 @@ template 
 
 _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(initializer_list<_Tp> __init)
   : _Base(__generate_from_n_evaluations<_Width, _BuiltinType>(
-	[&](auto __i) { return __init.begin

[PATCH 5/7] libstdc++: printf format string fix in testsuite

2023-02-15 Thread Matthias Kretz via Gcc-patches



Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/bits/verify.h
(verify::verify): Use %zx for size_t in format string.
---
 libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
index 2ab3ad3fa8c..01ad50bd01a 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
@@ -137,7 +137,7 @@ public:
 {
   if (m_failed)
 	[&] {
-	  __builtin_fprintf(stderr, "%s:%d: (%s):\nInstruction Pointer: %x\n"
+	  __builtin_fprintf(stderr, "%s:%d: (%s):\nInstruction Pointer: %zx\n"
 "Assertion '%s' failed.\n",
 			file, line, func, m_ip, cond);
 	  (print(extra_info, int()), ...);

[PATCH 4/7] libstdc++: Use a PCH to speed up check-simd

2023-02-15 Thread Matthias Kretz via Gcc-patches


Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/generate_makefile.sh: Generate and
pre-compile pch.h, which includes all headers that do not depend
on command-line macros.
* testsuite/experimental/simd/tests/bits/conversions.h: Add
include guard.
(genHalfBits): Simplify.
* testsuite/experimental/simd/tests/bits/make_vec.h: Add include
guard.
(make_alternating_mask): Moved from mask_loadstore.
* testsuite/experimental/simd/tests/bits/mathreference.h: Add
include guard.
* testsuite/experimental/simd/tests/bits/test_values.h: Ditto.
* testsuite/experimental/simd/tests/mask_loadstore.cc
(make_mask, make_alternating_mask): Removed.
* testsuite/experimental/simd/tests/mask_reductions.cc: Ditto.
* testsuite/experimental/simd/tests/operators.cc (genHalfBits):
Removed.
* testsuite/experimental/simd/tests/abs.cc: Only include
bits/main.h.
* testsuite/experimental/simd/tests/algorithms.cc: Ditto.
* testsuite/experimental/simd/tests/broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/casts.cc: Ditto.
* testsuite/experimental/simd/tests/fpclassify.cc: Ditto.
* testsuite/experimental/simd/tests/frexp.cc: Ditto.
* testsuite/experimental/simd/tests/generator.cc: Ditto.
* testsuite/experimental/simd/tests/hypot3_fma.cc: Ditto.
* testsuite/experimental/simd/tests/integer_operators.cc: Ditto.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
Ditto.
* testsuite/experimental/simd/tests/loadstore.cc: Ditto.
* testsuite/experimental/simd/tests/logarithm.cc: Ditto.
* testsuite/experimental/simd/tests/mask_broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/mask_implicit_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operators.cc: Ditto.
* testsuite/experimental/simd/tests/math_1arg.cc: Ditto.
* testsuite/experimental/simd/tests/math_2arg.cc: Ditto.
* testsuite/experimental/simd/tests/operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/reductions.cc: Ditto.
* testsuite/experimental/simd/tests/remqo.cc: Ditto.
* testsuite/experimental/simd/tests/sincos.cc: Ditto.
* testsuite/experimental/simd/tests/split_concat.cc: Ditto.
* testsuite/experimental/simd/tests/trigonometric.cc: Ditto.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc: Ditto.
* testsuite/experimental/simd/tests/where.cc: Ditto.
---
 .../experimental/simd/generate_makefile.sh| 24 -
 .../testsuite/experimental/simd/tests/abs.cc  |  4 +-
 .../experimental/simd/tests/algorithms.cc |  3 +-
 .../simd/tests/bits/conversions.h | 25 ++
 .../experimental/simd/tests/bits/main.h   | 87 +++
 .../experimental/simd/tests/bits/make_vec.h   | 10 +++
 .../simd/tests/bits/mathreference.h   |  3 +
 .../simd/tests/bits/test_values.h |  6 ++
 .../experimental/simd/tests/bits/verify.h | 64 --
 .../experimental/simd/tests/broadcast.cc  |  3 +-
 .../experimental/simd/tests/casts.cc  |  4 +-
 .../experimental/simd/tests/fpclassify.cc |  4 +-
 .../experimental/simd/tests/frexp.cc  |  4 +-
 .../experimental/simd/tests/generator.cc  |  3 +-
 .../experimental/simd/tests/hypot3_fma.cc |  4 +-
 .../simd/tests/integer_operators.cc   |  5 +-
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |  4 +-
 .../experimental/simd/tests/loadstore.cc  |  4 +-
 .../experimental/simd/tests/logarithm.cc  |  5 +-
 .../experimental/simd/tests/mask_broadcast.cc |  3 +-
 .../simd/tests/mask_conversions.cc|  2 +-
 .../simd/tests/mask_implicit_cvt.cc   |  3 +-
 .../experimental/simd/tests/mask_loadstore.cc | 29 +--
 .../simd/tests/mask_operator_cvt.cc   |  3 +-
 .../experimental/simd/tests/mask_operators.cc |  3 +-
 .../simd/tests/mask_reductions.cc | 30 +--
 .../experimental/simd/tests/math_1arg.cc  |  3 +-
 .../experimental/simd/tests/math_2arg.cc  |  4 +-
 .../experimental/simd/tests/operator_cvt.cc   |  3 +-
 .../experimental/simd/tests/operators.cc  | 14 +--
 .../experimental/simd/tests/reductions.cc |  4 +-
 .../experimental/simd/tests/remqo.cc  |  4 +-
 .../testsuite/experimental/simd/tests/simd.cc |  2 +-
 .../experimental/simd/tests/sincos.cc |  6 +-
 .../experimental/simd/tests/split_concat.cc   |  4 +-
 .../experimental/simd/tests/splits.cc |  2 +-
 .../experimental/simd/tests/trigonometric.cc  |  4 +-
 .../simd/tests/trunc_ceil_floor.cc|  3 +-
 .../experimental/simd/tests/where.cc  |  4 +-
 39 files changed, 170 insertions(+), 226 deletions(-)
 create mode 100644 libstdc++-

[PATCH 6/7] libstdc++: Fix incorrect __builtin_is_constant_evaluated calls

2023-02-15 Thread Matthias Kretz via Gcc-patches



Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_not_equal_to, _SimdImplX86::_S_less)
(_SimdImplX86::_S_less_equal): Do not call
__builtin_is_constant_evaluated in constexpr-if.
---
 .../include/experimental/bits/simd_x86.h  | 21 +++
 1 file changed, 12 insertions(+), 9 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index 60e80d394ba..dcfdc2a9496 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -2344,15 +2344,16 @@ template 
 	else
 	  __assert_unreachable<_Tp>();
 	  }   // }}}
-	else if constexpr (!__builtin_is_constant_evaluated() // {{{
-			   && sizeof(__x) == 8)
+	else if (__builtin_is_constant_evaluated())
+	  return _Base::_S_not_equal_to(__x, __y);
+	else if constexpr (sizeof(__x) == 8)
 	  {
 	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 != __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
 	_MaskMember<_Tp> __r64;
 	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	return __r64;
-	  } // }}}
+	  }
 	else
 	  return _Base::_S_not_equal_to(__x, __y);
   }
@@ -2451,15 +2452,16 @@ template 
 	else
 	  __assert_unreachable<_Tp>();
 	  }   // }}}
-	else if constexpr (!__builtin_is_constant_evaluated() // {{{
-			   && sizeof(__x) == 8)
+	else if (__builtin_is_constant_evaluated())
+	  return _Base::_S_less(__x, __y);
+	else if constexpr (sizeof(__x) == 8)
 	  {
 	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 < __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
 	_MaskMember<_Tp> __r64;
 	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	return __r64;
-	  } // }}}
+	  }
 	else
 	  return _Base::_S_less(__x, __y);
   }
@@ -2558,15 +2560,16 @@ template 
 	else
 	  __assert_unreachable<_Tp>();
 	  }   // }}}
-	else if constexpr (!__builtin_is_constant_evaluated() // {{{
-			   && sizeof(__x) == 8)
+	else if (__builtin_is_constant_evaluated())
+	  return _Base::_S_less_equal(__x, __y);
+	else if constexpr (sizeof(__x) == 8)
 	  {
 	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 <= __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
 	_MaskMember<_Tp> __r64;
 	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	return __r64;
-	  } // }}}
+	  }
 	else
 	  return _Base::_S_less_equal(__x, __y);
   }
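
The reason for the change, as far as the language rules go, is that the condition of `if constexpr` is itself a constant expression, so `__builtin_is_constant_evaluated()` always yields true there and `!__builtin_is_constant_evaluated() && ...` can never select the branch. A plain `if` is the form that distinguishes the two evaluation modes. A minimal sketch, assuming C++14 or later and a compiler providing the builtin; `twice` is a hypothetical function:

```cpp
#include <cassert>

constexpr int twice(int x)
{
  // Plain 'if': true during constant evaluation, false at run time.
  if (__builtin_is_constant_evaluated())
    return x + x;   // path suitable for constant evaluation
  else
    return x * 2;   // path free to use run-time-only code
}

// Exercises the constant-evaluation path.
static_assert(twice(21) == 42, "constant-evaluation path");
```

Both paths are written to return the same value, which is the property the patched functions rely on when they fall back to `_Base`.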

[PATCH, committed] Fortran: error recovery on invalid assumed size reference [PR104554]

2023-02-15 Thread Harald Anlauf via Gcc-patches

Dear all,

I've committed the attached obvious and trivial patch for a NULL
pointer dereference on behalf of Steve and after regtesting on
x86_64-pc-linux-gnu as r13-6066-ga418129273725fd02e881e6fb5e0877287a1356c

Thanks,
Harald

From a418129273725fd02e881e6fb5e0877287a1356c Mon Sep 17 00:00:00 2001
From: Steve Kargl 
Date: Wed, 15 Feb 2023 22:20:22 +0100
Subject: [PATCH] Fortran: error recovery on invalid assumed size reference
 [PR104554]

gcc/fortran/ChangeLog:

	PR fortran/104554
	* resolve.cc (check_assumed_size_reference): Avoid NULL pointer
	dereference.

gcc/testsuite/ChangeLog:

	PR fortran/104554
	* gfortran.dg/pr104554.f90: New test.
---
 gcc/fortran/resolve.cc |  8 +---
 gcc/testsuite/gfortran.dg/pr104554.f90 | 11 +++
 2 files changed, 16 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr104554.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 96c34065691..fb0745927ac 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1670,9 +1670,11 @@ check_assumed_size_reference (gfc_symbol *sym, gfc_expr *e)

   /* FIXME: The comparison "e->ref->u.ar.type == AR_FULL" is wrong.
  What should it be?  */
-  if (e->ref && (e->ref->u.ar.end[e->ref->u.ar.as->rank - 1] == NULL)
-	  && (e->ref->u.ar.as->type == AS_ASSUMED_SIZE)
-	   && (e->ref->u.ar.type == AR_FULL))
+  if (e->ref
+  && e->ref->u.ar.as
+  && (e->ref->u.ar.end[e->ref->u.ar.as->rank - 1] == NULL)
+  && (e->ref->u.ar.as->type == AS_ASSUMED_SIZE)
+  && (e->ref->u.ar.type == AR_FULL))
 {
   gfc_error ("The upper bound in the last dimension must "
 		 "appear in the reference to the assumed size "
diff --git a/gcc/testsuite/gfortran.dg/pr104554.f90 b/gcc/testsuite/gfortran.dg/pr104554.f90
new file mode 100644
index 000..099f219c85d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr104554.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/104554 - ICE in check_assumed_size_reference
+! Contributed by G.Steinmetz
+
+program p
+  type t
+ integer :: a
+  end type
+  class(t) :: x(*) ! { dg-error "Assumed size array" }
+  x%a = 3
+end
--
2.35.3
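
The shape of the fix, reduced to a standalone sketch: every pointer in the chain is tested before it is dereferenced, relying on the left-to-right short-circuit evaluation of `&&`. The structs below are hypothetical simplifications and do not mirror the real `gfc_expr`, `gfc_ref` or `gfc_array_spec` layouts.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the gfortran structures.
struct ArraySpec { int rank; bool assumed_size; };
struct ArrayRef
{
  const ArraySpec* as;   // may be null after an earlier error
  const void* end[7];    // upper bound expression per dimension
  bool full;             // whole-array reference
};

inline bool missing_last_upper_bound(const ArrayRef* ref)
{
  return ref
         && ref->as                                  // the added check
         && ref->end[ref->as->rank - 1] == nullptr
         && ref->as->assumed_size
         && ref->full;
}
```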

Re: [PATCH] c++: ICE with -fno-elide-constructors and trivial fn [PR101073]

2023-02-15 Thread Marek Polacek via Gcc-patches

On Wed, Feb 15, 2023 at 02:39:16PM -0500, Jason Merrill wrote:
> On 2/9/23 09:39, Marek Polacek wrote:
> > In constexpr-nsdmi3.C, with -fno-elide-constructors, we don't elide
> > the Y::Y(const Y&) call used to initialize o.c.  So store_init_value
> > -> cxx_constant_init must constexpr-evaluate the call to Y::Y(const Y&)
> > in cxx_eval_call_expression.  It's a trivial function, so we do the
> > "Shortcut trivial constructor/op=" code and rather than evaluating
> > the function, we just create an assignment
> > 
> > >o.c = *(const struct Y &) (const struct Y *) &(&<PLACEHOLDER_EXPR struct X>)->b
> > 
> > which is a MODIFY_EXPR, so the preeval code in cxx_eval_store_expression
> > clears .ctor and .object, therefore we can't replace the PLACEHOLDER_EXPR
> > whereupon we crash at
> > 
> >/* A placeholder without a referent.  We can get here when
> >   checking whether NSDMIs are noexcept, or in massage_init_elt;
> >   just say it's non-constant for now.  */
> >gcc_assert (ctx->quiet);
> > 
> > The PLACEHOLDER_EXPR can also be on the LHS as in constexpr-nsdmi10.C.
> > I don't think we can do much here, but I noticed that the whole
> > trivial_fn_p (fun) block is only entered when -fno-elide-constructors.
> > This is true since GCC 9; it wasn't easy to bisect what changes made it
> > so, but r240845 is probably one of them.  -fno-elide-constructors is an
> > option for experiments only so it's not clear to me why we'd still want
> > to shortcut trivial constructor/op=.  I propose to remove the code and
> > add a checking assert to make sure we're not getting a trivial_fn_p
> > unless -fno-elide-constructors.
> 
> Hmm, trivial op= doesn't ever hit this code?

With -fno-elide-constructors we hit the trivial_fn_p block twice in
constexpr-nsdmi9.C, once for "constexpr Y::Y(const Y&)" and then for
"constexpr Y& Y::operator=(Y&&)".  So it does hit the code, but only
with -fno-elide-constructors.
 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?  I don't
> > think I want to backport this.
> > 
> > PR c++/101073
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constexpr.cc (cxx_eval_call_expression): Replace shortcutting trivial
> > constructor/op= with a checking assert.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/constexpr-nsdmi3.C: New test.
> > * g++.dg/cpp1y/constexpr-nsdmi10.C: New test.
> > ---
> >   gcc/cp/constexpr.cc   | 25 +++
> >   gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C | 17 +
> >   .../g++.dg/cpp1y/constexpr-nsdmi10.C  | 18 +
> >   3 files changed, 38 insertions(+), 22 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C
> > 
> > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > index 564766c8a00..1d53dcf0f20 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -2865,28 +2865,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
> > ctx = &new_ctx;
> >   }
> > -  /* Shortcut trivial constructor/op=.  */
> > -  if (trivial_fn_p (fun))
> > -{
> > -  tree init = NULL_TREE;
> > -  if (call_expr_nargs (t) == 2)
> > -   init = convert_from_reference (get_nth_callarg (t, 1));
> > -  else if (TREE_CODE (t) == AGGR_INIT_EXPR
> > -  && AGGR_INIT_ZERO_FIRST (t))
> > -   init = build_zero_init (DECL_CONTEXT (fun), NULL_TREE, false);
> > -  if (init)
> > -   {
> > - tree op = get_nth_callarg (t, 0);
> > - if (is_dummy_object (op))
> > -   op = ctx->object;
> > - else
> > -   op = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (op)), op);
> > - tree set = build2 (MODIFY_EXPR, TREE_TYPE (op), op, init);
> 
> I think the problem is using MODIFY_EXPR instead of INIT_EXPR to represent a
> constructor; that's why cxx_eval_store_expression thinks it's OK to
> preevaluate.  This should properly use those two tree codes for op= and
> ctor, respectively.

Maybe it was so that the RHS in SET could refer to the op in the LHS?
 
> > - new_ctx.call = &new_call;
> > - return cxx_eval_constant_expression (&new_ctx, set, lval,
> > -  non_constant_p, overflow_p);
> > -   }
> > -}
> > > +  /* We used to shortcut trivial constructor/op= here, but nowadays
> > > + we can only get a trivial function here with -fno-elide-constructors.  */
> > +  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);
> 
> ...but if this optimization is so rarely triggered, this simplification is
> OK too.

I'd say that's better so that we don't have to update the code (like
r234345 did).


m~

[PATCH, committed] Fortran: error recovery on checking procedure argument intent [PR103608]

2023-02-15 Thread Harald Anlauf via Gcc-patches

Dear all,

I've committed the attached obvious and trivial patch for another
NULL pointer dereference on behalf of Steve and after regtesting on
x86_64-pc-linux-gnu as r13-6067-gc75cbeba81e5b4737a9ab7dd28cce650965535a9

Thanks,
Harald

From c75cbeba81e5b4737a9ab7dd28cce650965535a9 Mon Sep 17 00:00:00 2001
From: Steve Kargl 
Date: Wed, 15 Feb 2023 22:40:37 +0100
Subject: [PATCH] Fortran: error recovery on checking procedure argument intent
 [PR103608]

gcc/fortran/ChangeLog:

	PR fortran/103608
	* frontend-passes.cc (do_intent): Catch NULL pointer dereference on
	reference to invalid formal argument.

gcc/testsuite/ChangeLog:

	PR fortran/103608
	* gfortran.dg/pr103608.f90: New test.
---
 gcc/fortran/frontend-passes.cc |  3 ++-
 gcc/testsuite/gfortran.dg/pr103608.f90 | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr103608.f90

diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc
index 1cbc63016da..02fcb41dbc4 100644
--- a/gcc/fortran/frontend-passes.cc
+++ b/gcc/fortran/frontend-passes.cc
@@ -3049,7 +3049,8 @@ do_intent (gfc_expr **e)
 	  do_sym = dl->ext.iterator->var->symtree->n.sym;

 	  if (a->expr && a->expr->symtree
-	  && a->expr->symtree->n.sym == do_sym)
+	  && a->expr->symtree->n.sym == do_sym
+	  && f->sym)
 	{
 	  if (f->sym->attr.intent == INTENT_OUT)
 		gfc_error_now ("Variable %qs at %L set to undefined value "
diff --git a/gcc/testsuite/gfortran.dg/pr103608.f90 b/gcc/testsuite/gfortran.dg/pr103608.f90
new file mode 100644
index 000..5c37cb78dc6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103608.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options "-w" }
+! PR fortran/103608 - ICE in do_intent
+! Contributed by G.Steinmetz
+
+program p
+  implicit none
+  integer :: i
+  integer :: x ! { dg-error "Alternate return specifier" }
+  x(*) = 0
+  do i = 1, 2
+ print *, x(i) ! { dg-error "Missing alternate return specifier" }
+  end do
+end
--
2.35.3

Re: [PATCH, committed] Fortran: error recovery on invalid assumed size reference [PR104554]

2023-02-15 Thread Steve Kargl via Gcc-patches

On Wed, Feb 15, 2023 at 10:28:00PM +0100, Harald Anlauf via Fortran wrote:
> Dear all,
> 
> I've committed the attached obvious and trivial patch for a NULL
> pointer dereference on behalf of Steve and after regtesting on
> x86_64-pc-linux-gnu as r13-6066-ga418129273725fd02e881e6fb5e0877287a1356c
> 

Thanks Harald!

-- 
Steve

Re: [PATCH v6] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-02-15 Thread Max Filippov via Gcc-patches

Hi Suwa-san,

On Thu, Jan 26, 2023 at 7:17 PM Takayuki 'January June' Suwa
 wrote:
>
> In the case of the CALL0 ABI, values that must be retained before and
> after function calls are placed in the callee-saved registers (A12
> through A15) and referenced later.  However, it is often the case that
> the save and the reference each occur only once, and each is a simple
> register-register move (with two exceptions: i. the register saved
> to/restored from is the stack pointer; ii. the function needs an
> additional stack pointer adjustment to grow the stack).
>
> e.g. in the following example, if there are no other occurrences of
> register A14:
>
> ;; before
> ; prologue {
>   ...
> s32i.n  a14, sp, 16
>   ...   ;; no frame pointer needed
> ;; no additional stack growth
> ; } prologue
>   ...
> mov.n   a14, a6 ;; A6 is not SP
>   ...
> call0   foo
>   ...
> mov.n   a8, a14 ;; A8 is not SP
>   ...
> ; epilogue {
>   ...
> l32i.n  a14, sp, 16
>   ...
> ; } epilogue
>
> It can be transformed like this:
>
> ;; after
> ; prologue {
>   ...
> (no save needed)
>   ...
> ; } prologue
>   ...
> s32i.n  a6, sp, 16  ;; replaced with A14's slot
>   ...
> call0   foo
>   ...
> l32i.n  a8, sp, 16  ;; through SP
>   ...
> ; epilogue {
>   ...
> (no restoration needed)
>   ...
> ; } epilogue
>
> This patch adds the abovementioned logic to the function prologue/epilogue
> RTL expander code.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.cc (machine_function): Add new member
> 'eliminated_callee_saved_bmp'.
> (xtensa_can_eliminate_callee_saved_reg_p): New function to
> determine whether the register can be eliminated or not.
> (xtensa_expand_prologue): Invoke the above function and
> eliminate the use of the callee-saved register by using its stack
> slot through the stack pointer (or the frame pointer if needed)
> directly.
> (xtensa_expand_epilogue): Modify to not emit register restoration
> insn from its stack slot if the register is already eliminated.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/xtensa/elim_callee_saved.c: New.
> ---
>  gcc/config/xtensa/xtensa.cc   | 132 ++
>  .../gcc.target/xtensa/elim_callee_saved.c |  38 +
>  2 files changed, 145 insertions(+), 25 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/xtensa/elim_callee_saved.c

This version passes regression tests, but I still have a couple questions.

> diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
> index 3e2e22d4cbe..ff59c933d4d 100644
> --- a/gcc/config/xtensa/xtensa.cc
> +++ b/gcc/config/xtensa/xtensa.cc
> @@ -105,6 +105,7 @@ struct GTY(()) machine_function
>bool epilogue_done;
>bool inhibit_logues_a1_adjusts;
>rtx last_logues_a9_content;
> +  HOST_WIDE_INT eliminated_callee_saved_bmp;
>  };
>
>  static void xtensa_option_override (void);
> @@ -3343,6 +3344,66 @@ xtensa_emit_adjust_stack_ptr (HOST_WIDE_INT offset, int flags)
>  cfun->machine->last_logues_a9_content = GEN_INT (offset);
>  }
>
> +static bool
> +xtensa_can_eliminate_callee_saved_reg_p (unsigned int regno,
> +rtx_insn **p_insnS,
> +rtx_insn **p_insnR)
> +{
> +  df_ref ref;
> +  rtx_insn *insn, *insnS = NULL, *insnR = NULL;
> +  rtx pattern;
> +
> +  if (!optimize || !df || call_used_or_fixed_reg_p (regno))
> +return false;
> +
> +  for (ref = DF_REG_DEF_CHAIN (regno);
> +   ref; ref = DF_REF_NEXT_REG (ref))
> +if (DF_REF_CLASS (ref) != DF_REF_REGULAR
> +   || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
> +  continue;
> +else if (GET_CODE (pattern = PATTERN (insn)) == SET
> +&& REG_P (SET_DEST (pattern))
> +&& REGNO (SET_DEST (pattern)) == regno
> +&& REG_NREGS (SET_DEST (pattern)) == 1
> +&& REG_P (SET_SRC (pattern))
> +&& REGNO (SET_SRC (pattern)) != A1_REG)

Do I understand correctly that the check for A1 here and below is
for the case when regno is a hard frame pointer and the function
needs the frame pointer? If so, wouldn't it be better to check
for it explicitly in the beginning?

> +  {
> +   if (insnS)
> + return false;
> +   insnS = insn;
> +   continue;
> +  }
> +else
> +  return false;
> +
> +  for (ref = DF_REG_USE_CHAIN (regno);
> +   ref; ref = DF_REF_NEXT_REG (ref))
> +if (DF_REF_CLASS (ref) != DF_REF_REGULAR
> +   || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
> +  continue;
> +else if (GET_CODE (pattern = PATTERN (insn)) == SET
> +&& REG_P (SET_SRC (pattern))
> +&& REGNO (SET_SRC (pattern)) == regno
> +&& REG_NREGS (SET_SRC (pattern)

Re: [PATCH 5/7] libstdc++: printf format string fix in testsuite

2023-02-15 Thread Jonathan Wakely via Gcc-patches

On Wed, 15 Feb 2023 at 20:52, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * testsuite/experimental/simd/tests/bits/verify.h
> (verify::verify): Use %zx for size_t in format string.
> ---
>  libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

I think this one would be OK to commit without approval, as "obvious".

OK for all relevant branches anyway.
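
As a side note on why `%zx` is the portable choice: the `z` length modifier matches `size_t` on every ABI, whereas `%x` or `%lx` only work where `size_t` happens to have that width. The following is a minimal sketch; `hex_size` and `check_hex_size` are hypothetical helpers, not code from verify.h, and the thread does not show the old format string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a size_t as hexadecimal.  The "z" length modifier tells printf
// that the argument is a size_t, independent of the target's data model.
std::string
hex_size (std::size_t n)
{
  char buf[2 * sizeof (std::size_t) + 1];   // two hex digits per byte + NUL
  std::snprintf (buf, sizeof buf, "%zx", n);
  return buf;
}

// Small self-check, callable from any test driver.
void
check_hex_size ()
{
  assert (hex_size (255) == "ff");
  assert (hex_size (0) == "0");
}
```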

Re: [PATCH 3/7] libstdc++: Document timeout and timeout-factor of simd tests

2023-02-15 Thread Jonathan Wakely via Gcc-patches

On Wed, 15 Feb 2023 at 20:50, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * testsuite/experimental/simd/README.md: Document the timeout
> and timeout-factor directives. Minor typo fixed.


OK for all relevant branches (trunk/12/11).

Re: Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-02-15 Thread Lewis Hyatt via Gcc-patches

On Wed, Feb 15, 2023 at 1:39 PM Jason Merrill  wrote:
>
> On 9/26/22 15:27, Lewis Hyatt wrote:
> > On Wed, Jun 15, 2022 at 03:06:16PM -0400, Lewis Hyatt wrote:
> >> On Tue, Jun 14, 2022 at 05:26:49PM -0400, Lewis Hyatt wrote:
> >>> Hello-
> >>>
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902
> >>>
> > >>> The attached patch resolves PR preprocessor/103902 as described in the
> > >>> patch message inline below. bootstrap + regtest all languages was
> > >>> successful on x86-64 Linux, with no new failures:
> >>>
> >>> FAIL 103 103
> >>> PASS 542338 542371
> >>> UNSUPPORTED 15247 15250
> >>> UNTESTED 136 136
> >>> XFAIL 4166 4166
> >>> XPASS 17 17
> >>>
> >>> Please let me know if it looks OK?
> >>>
> >>> A few questions I have:
> >>>
> > >>> - A difference introduced with this patch is that after lexing something
> > >>> like `operator ""_abc', then `_abc' is added to the identifier hash map,
> > >>> whereas previously it was not. I feel like this must be OK because with
> > >>> the optional space as in `operator "" _abc', it would be added with or
> > >>> without the patch.
> >>>
> > >>> - The behavior of `#pragma GCC poison' is not consistent (including
> > >>>   prior to my patch). I tried to make it more so but there is still
> > >>>   one thing I want to ask about. Leaving aside extended characters
> > >>>   for now, the inconsistency is that currently the poison is only
> > >>>   checked when the suffix appears as a standalone token.
> > >>>
> > >>>   #pragma GCC poison _X
> > >>>   //accepted before the patch, rejected after it
> > >>>   bool operator ""_X (unsigned long long);
> > >>>   //rejected either before or after
> > >>>   bool operator "" _X (unsigned long long);
> > >>>   //accepted before, rejected after
> > >>>   const char * operator ""_X (const char *, unsigned long);
> > >>>   //rejected either
> > >>>   const char * operator "" _X (const char *, unsigned long);
> > >>>
> > >>>   const char * s = ""_X; //accepted before the patch, rejected after it
> > >>>   const bool b = 1_X; //accepted before or after
> >>>
> > >>> I feel like after the patch, the behavior is the expected behavior for all
> > >>> cases but the last one. Here, we allow the poisoned identifier because
> > >>> it's not lexed as an identifier, it's lexed as part of a pp-number. Does
> > >>> it seem OK like this or does it need to be addressed?
> >>
> >> Sorry, that version actually did not handle the case of -Wc++11-compat in
> >> c++98 mode correctly. This updated version fixes that and adds the missing
> >> test coverage for that, if you could please review this one instead?
> >>
> > >> By the way, the pipermail archive seems to permanently mangle UTF-8 in
> > >> inline attachments. I attached the patch also gzipped to address that for
> > >> the archive, since the new testcases do use non-ASCII characters.
> >>
> >> Thanks for taking a look!
> >
> > Hello-
> >
> > > May I please ping this patch again? Joseph suggested that it would be best
> > > if a C++ maintainer has a look at it. This is one of just a few places left
> > > where we don't handle UTF-8 properly in libcpp, it would be really nice to
> > > get them fixed up if there is time to review this patch. Thanks!
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596704.html
> >
> > > I re-attached it here as it required some trivial rebasing on top of
> > > recently pushed changes. As before, I also attached the gzipped version so
> > > that the UTF-8 testcases show up OK in the online archive, in case that's
> > > still an issue. Thanks for taking a look!
>
> Thank you for the patch, sorry it slipped off my radar.
>

Thanks for taking a look at it. It's certainly an edge case that is
not bothering anyone too much, so no rush with it.

> > This patch fixes it by adding a new function scan_cur_identifier() that can
> > be used to lex an identifier while in the middle of lexing another token. It
> > is somewhat duplicative of the code in lex_identifier(), which handles the
> > normal case, but I think there's no good way to avoid that without
> > pessimizing the usual case, since lex_identifier() takes advantage of the
> > fact that the first character of the identifier has already been analyzed.
>
> So could you analyze the first character and then call lex_identifier?
>

Yes, it can be done this way. lex_identifier may need some adaptations
though, since it does some other work like tracking the original
spelling of the identifier. Plus per your comments below, it would
need to avoid the poison and other checks too.
I think it's pretty straightforward to refactor a bit so that it works
out. I kinda thought it may not be desirable to touch lex_identifier,
which is called on everything, just to handle this rare case, however
I am happy to do it this way after confirming it won't hurt
performan

Re: [PATCH 7/7] libstdc++: Fix incorrect function call in -ffast-math optimization

2023-02-15 Thread Jonathan Wakely via Gcc-patches

On Wed, 15 Feb 2023 at 20:52, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * include/experimental/bits/simd_math.h (__hypot): Bitcasting
> between scalars requires the __bit_cast helper function instead
> of simd_bit_cast.

OK for trunk/12/11.

Re: [PATCH 1/7] libstdc++: Ensure __builtin_constant_p isn't lost on the way

2023-02-15 Thread Jonathan Wakely via Gcc-patches

On Wed, 15 Feb 2023 at 20:49, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> The more expensive code path should only be taken if it can be optimized
> away.
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * include/experimental/bits/simd.h
> (_SimdWrapper::_M_is_constprop_none_of)
> (_SimdWrapper::_M_is_constprop_all_of): Return false unless the
> computed result still satisfies __builtin_constant_p.

OK for trunk/12/11.
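
The idea in the ChangeLog entry above can be sketched on a scalar. This is only an illustration under stated assumptions: `is_constprop_none_of` below is a hypothetical free function standing in for the `_SimdWrapper` member, and it relies on the GCC/Clang builtin `__builtin_constant_p` and on C++14 constexpr rules.

```cpp
#include <cassert>

// Hypothetical scalar stand-in for _SimdWrapper::_M_is_constprop_none_of.
// Returns true only when the compiler can prove at compile time that no
// bit is set; in every other case it answers "don't know" (false).
template <typename T>
constexpr bool
is_constprop_none_of (T bits)
{
  if (__builtin_constant_p (bits))
    {
      const bool r = bits == 0;
      // Re-check the computed result: constant-ness may be lost on the way.
      if (__builtin_constant_p (r))
        return r;
    }
  return false;
}

// A value read from a volatile object is opaque to constant propagation,
// so the helper must answer false even though the value is in fact zero.
void
check_runtime_value ()
{
  volatile unsigned v = 0;
  assert (!is_constprop_none_of<unsigned> (v));
}
```

Callers would treat a `false` result as "take the cheap run-time path", so the more expensive code is only reached when it can be folded away.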

Re: [PATCH 6/7] libstdc++: Fix incorrect __builtin_is_constant_evaluated calls

2023-02-15 Thread Jonathan Wakely via Gcc-patches

On Wed, 15 Feb 2023 at 20:51, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * include/experimental/bits/simd_x86.h
> (_SimdImplX86::_S_not_equal_to, _SimdImplX86::_S_less)
> (_SimdImplX86::_S_less_equal): Do not call
> __builtin_is_constant_evaluated in constexpr-if.

OK for trunk/12/11.
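
For context, a minimal sketch of why the builtin must not sit in a constexpr-if. The condition of `if constexpr` is itself a constant expression, so `__builtin_is_constant_evaluated ()` always yields true there and the run-time branch is discarded. The function names below are hypothetical and not taken from simd_x86.h; GCC 9 or later (or a compatible compiler) and C++14 are assumed.

```cpp
#include <cassert>

// Plain `if`: the builtin is true during constant evaluation and false in
// ordinary run-time evaluation.
constexpr int
selected_path ()
{
  if (__builtin_is_constant_evaluated ())
    return 1;   // compile-time path
  else
    return 2;   // run-time path (e.g. target intrinsics)
}

// By contrast, `if constexpr (__builtin_is_constant_evaluated ())` would
// always take the first branch, because its condition is manifestly
// constant-evaluated.

static_assert (selected_path () == 1, "manifestly constant-evaluated");

int
selected_path_at_run_time ()
{
  int r = selected_path ();   // not a constant-evaluated context
  return r;
}

void
check_selected_path ()
{
  assert (selected_path_at_run_time () == 2);
}
```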

[PATCH] testsuite: Add CRIS to check_effective_target_lra non-LRA list

2023-02-15 Thread Hans-Peter Nilsson via Gcc-patches

I'd much rather install
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611531.html
than this one, because obviously a general solution is
better than a target list.  But, that would require
approval, and I got NAK.  This change however, piling on to
the target list, is within target maintainer rights.
Committed.

Also, asking for maintainer reconsideration of the more
general solution in the above patch that gets rid of the
target list, as reload will stay in gcc-13 (IIUC).

-- >8 --
gcc/testsuite:

* lib/target-supports.exp (check_effective_target_lra): Add CRIS
as a non-LRA target.
---
 gcc/testsuite/lib/target-supports.exp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 227e3004077a..f808b4f63714 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12192,7 +12192,7 @@ proc check_effective_target_o_flag_in_section { } {
 # return 1 if LRA is supported.
 
 proc check_effective_target_lra { } {
-if { [istarget hppa*-*-*] } {
+if { [istarget hppa*-*-*] || [istarget cris-*-*] } {
return 0
 }
 return 1
-- 
2.30.2

[PATCH] objs-gcc.sh: Only bootstrap if source-directory contains gcc

2023-02-15 Thread Hans-Peter Nilsson via Gcc-patches

TL;DR: committed as obvious.
-- >8 --
I use objs-gcc.sh as a preparatory step before calling
btest-gcc.sh in my scripts, for example my cris-elf
autotester.  I thought, why not use it for native builds
too.  Except that use, with binutils release-style tarballs
and an x86_64-pc-linux-gnu host, was broken.  Now that I look
at it, the script seems to have aged poorly...  Still,
there's a need for such a script to install stuff needed for
btest-gcc.sh (and to fix up stuff if needed), and this can
still be that script.  So, I prefer to fix show-stoppers for
common uses, while taking care to retain compatibility for
use that could possibly still work, with current sources.

A long time ago (before 2011, but after this script was
created in 2002, and used for a few years), the binutils
(and gdb and gcc) toplevel Makefile may have had a bootstrap
target that worked with binutils but didn't require gcc
sources to be present.  Now, you'll get an error (see
configure.ac line 1366 and on).  Let's just build the
default make-target when "bootstrap" is known to fail.
An alternative would be to fold this native
non-i686-pc-linux-gnu clause into the native
i686-pc-linux-gnu clause, as that seems to have been
originally intended as *the* single native clause, but
that'd require further edits (e.g. to remove install-dejagnu
and make gdb build conditional on gdb sources presence, to
work with binutils tarballs, and I'd also then prefer to
build not just ld, but also gas and binutils).

As it's a minimal obvious change required for current native
use with release-tarballs and git-checkout use(*), I'm
installing this as obvious.

*) Native i686-pc-linux-gnu remains broken for other use
than specially constructed combined trees where dejagnu is
included at the toplevel (i.e. historic Cygnus devo-type).

contrib/regression:
* objs-gcc.sh: Only bootstrap if source-directory contains gcc.
---
 contrib/regression/objs-gcc.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/contrib/regression/objs-gcc.sh b/contrib/regression/objs-gcc.sh
index ea7820f33fac..d205bab17368 100755
--- a/contrib/regression/objs-gcc.sh
+++ b/contrib/regression/objs-gcc.sh
@@ -106,7 +106,9 @@ if [ $H_REAL_TARGET = $H_REAL_HOST -a $H_REAL_TARGET = i686-pc-linux-gnu ]
   make all-gdb all-dejagnu all-ld || exit 1
   make install-gdb install-dejagnu install-ld || exit 1
 elif [ $H_REAL_TARGET = $H_REAL_HOST ] ; then
-  make bootstrap || exit 1
+  H_MAKE_TARGET=
+  test -f $SOURCE/gcc/configure && H_MAKE_TARGET=bootstrap
+  make $H_MAKE_TARGET || exit 1
   make install || exit 1
 else
   make || exit 1
-- 
2.30.2

Re: [PATCH] Fix PR target/90458

2023-02-15 Thread NightStrike via Gcc-patches

On Wed, Feb 15, 2023 at 10:24 AM Eric Botcazou via Gcc-patches
 wrote:
>
> Hi,
>
> this is the incompatibility of -fstack-clash-protection with Windows SEH.  Now
> the Windows ports always enable TARGET_STACK_PROBE, which means that the stack
> is always probed (out of line) so -fstack-clash-protection does nothing more.
>
> Tested on x86-64/Windows and Linux, OK for all active branches?
>
>
> 2023-02-15  Eric Botcazou  
>
> * config/i386/i386.cc (ix86_compute_frame_layout): Disable the
> effects of -fstack-clash-protection for TARGET_STACK_PROBE.
> (ix86_expand_prologue): Likewise.

This fixes dg.exp/stack-check-2.c, -7, -8, and -16.c, which is great!

-6 no longer ICEs, but still fails.

-3, -4, -5, and -9 didn't ICE, and still fail:

gcc.dg/stack-check-10.c: pattern found 0 times
FAIL: gcc.dg/stack-check-10.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no probe" 2
gcc.dg/stack-check-10.c: pattern found 0 times
FAIL: gcc.dg/stack-check-10.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 2
gcc.dg/stack-check-10.c: pattern found 0 times
FAIL: gcc.dg/stack-check-10.c scan-rtl-dump-times pro_and_epilogue
"Stack clash residual allocation in prologue" 2
gcc.dg/stack-check-10.c: pattern found 0 times
FAIL: gcc.dg/stack-check-10.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no frame pointer needed" 2
gcc.dg/stack-check-3.c: pattern found 0 times
FAIL: gcc.dg/stack-check-3.c scan-rtl-dump-times expand "allocation
and probing residuals" 7
gcc.dg/stack-check-3.c: pattern found 0 times
FAIL: gcc.dg/stack-check-3.c scan-rtl-dump-times expand "allocation
and probing in loop" 7
gcc.dg/stack-check-4.c: pattern found 0 times
FAIL: gcc.dg/stack-check-4.c scan-rtl-dump-times pro_and_epilogue
"Stack clash noreturn" 1
gcc.dg/stack-check-4.c: pattern found 0 times
FAIL: gcc.dg/stack-check-4.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 1
gcc.dg/stack-check-5.c: pattern found 0 times
FAIL: gcc.dg/stack-check-5.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no probe" 4
gcc.dg/stack-check-5.c: pattern found 0 times
FAIL: gcc.dg/stack-check-5.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 4
gcc.dg/stack-check-5.c: pattern found 0 times
FAIL: gcc.dg/stack-check-5.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no frame pointer needed" 4
gcc.dg/stack-check-5.c: pattern found 0 times
FAIL: gcc.dg/stack-check-5.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no residual allocation in prologue" 1
gcc.dg/stack-check-5.c: pattern found 0 times
FAIL: gcc.dg/stack-check-5.c scan-rtl-dump-times pro_and_epilogue
"Stack clash residual allocation in prologue" 3
gcc.dg/stack-check-6.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6.c scan-rtl-dump-times pro_and_epilogue
"Stack clash inline probes" 2
gcc.dg/stack-check-6.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6.c scan-rtl-dump-times pro_and_epilogue
"Stack clash probe loop" 2
gcc.dg/stack-check-6.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6.c scan-rtl-dump-times pro_and_epilogue
"Stack clash residual allocation in prologue" 4
gcc.dg/stack-check-6.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 4
gcc.dg/stack-check-6.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no frame pointer needed" 4
gcc.dg/stack-check-6a.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6a.c scan-rtl-dump-times pro_and_epilogue
"Stack clash residual allocation in prologue" 4
gcc.dg/stack-check-6a.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6a.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 4
gcc.dg/stack-check-6a.c: pattern found 0 times
FAIL: gcc.dg/stack-check-6a.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no frame pointer needed" 4
gcc.dg/stack-check-9.c: pattern found 0 times
FAIL: gcc.dg/stack-check-9.c scan-rtl-dump-times pro_and_epilogue
"Stack clash inline probes" 1
gcc.dg/stack-check-9.c: pattern found 0 times
FAIL: gcc.dg/stack-check-9.c scan-rtl-dump-times pro_and_epilogue
"Stack clash residual allocation in prologue" 1
gcc.dg/stack-check-9.c: pattern found 0 times
FAIL: gcc.dg/stack-check-9.c scan-rtl-dump-times pro_and_epilogue
"Stack clash not noreturn" 1
gcc.dg/stack-check-9.c: pattern found 0 times
FAIL: gcc.dg/stack-check-9.c scan-rtl-dump-times pro_and_epilogue
"Stack clash no frame pointer needed" 1


target/i386.exp/stack-check-12.c no longer ICEs but still fails.

-11, -18 and -19 still fail:

gcc.target/i386/stack-check-11.c: sub[ql] found 1 times
FAIL: gcc.target/i386/stack-check-11.c scan-assembler-times sub[ql] 4
gcc.target/i386/stack-check-11.c: or[ql] found 0 times
FAIL: gcc.target/i386/stack-check-11.c scan-assembler-times or[ql] 3

ia32882847.c:2:13: error: size of array 'dummy' is negative
ia32882847.c:4:55: error: '__i386__' undeclared here (not in a function)
compiler exited with status 1
FAIL: gcc.target/i38

[PATCH] RISC-V: Add RVV all mask C/C++ intrinsics support

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (class mask_logic): New class.
(class mask_nlogic): Ditto.
(class mask_notlogic): Ditto.
(class vmmv): Ditto.
(class vmclr): Ditto.
(class vmset): Ditto.
(class vmnot): Ditto.
(class vcpop): Ditto.
(class vfirst): Ditto.
(class mask_misc): Ditto.
(class viota): Ditto.
(class vid): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmand): Ditto.
(vmnand): Ditto.
(vmandn): Ditto.
(vmxor): Ditto.
(vmor): Ditto.
(vmnor): Ditto.
(vmorn): Ditto.
(vmxnor): Ditto.
(vmmv): Ditto.
(vmclr): Ditto.
(vmset): Ditto.
(vmnot): Ditto.
(vcpop): Ditto.
(vfirst): Ditto.
(vmsbf): Ditto.
(vmsif): Ditto.
(vmsof): Ditto.
(viota): Ditto.
(vid): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Ditto.
(struct mask_alu_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc: Ditto.
* config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Fix bug when the dest is scalar for RVV intrinsics.
* config/riscv/vector-iterators.md (sof): New iterator.
* config/riscv/vector.md (@pred_n): New pattern.
(@pred_not): New pattern.
(@pred_popcount): New pattern.
(@pred_ffs): New pattern.
(@pred_): New pattern.
(@pred_iota): New pattern.
(@pred_series): New pattern.
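
For readers unfamiliar with the mask instructions listed above, the following is a scalar model of a few of them on an 8-element mask (bit i stands for mask element i). It is a sketch of the semantics as I understand them from the RVV specification, not the intrinsic API added by this patch; the type `mask8` and the fixed length of 8 elements are assumptions made for illustration, and C++14 is required.

```cpp
#include <cstdint>

// Scalar model: one bit per mask element, 8 elements, all of them active.
using mask8 = std::uint8_t;
constexpr mask8 all_ones = 0xFF;

// vmandn.mm vd, vs2, vs1: vd = vs2 & ~vs1
constexpr mask8 vmandn (mask8 vs2, mask8 vs1)
{ return static_cast<mask8> (vs2 & ~vs1); }

// vmorn.mm vd, vs2, vs1: vd = vs2 | ~vs1
constexpr mask8 vmorn (mask8 vs2, mask8 vs1)
{ return static_cast<mask8> (vs2 | ~vs1); }

// vcpop.m: number of set mask elements.
constexpr int vcpop (mask8 m)
{
  int n = 0;
  for (int i = 0; i < 8; ++i)
    n += (m >> i) & 1;
  return n;
}

// vfirst.m: index of the first set element, or -1 if there is none.
constexpr int vfirst (mask8 m)
{
  for (int i = 0; i < 8; ++i)
    if ((m >> i) & 1)
      return i;
  return -1;
}

// vmsof.m: set-only-first, i.e. isolate the lowest set bit.
constexpr mask8 vmsof (mask8 m)
{ return static_cast<mask8> (m & (0u - m)); }

// vmsbf.m: set-before-first; all ones if no element is set.
constexpr mask8 vmsbf (mask8 m)
{ return m == 0 ? all_ones : static_cast<mask8> (vmsof (m) - 1); }

// vmsif.m: set-including-first; all ones if no element is set.
constexpr mask8 vmsif (mask8 m)
{ return m == 0 ? all_ones : static_cast<mask8> (vmsbf (m) | vmsof (m)); }

static_assert (vmandn (0b1100, 0b1010) == 0b0100, "vs2 & ~vs1");
static_assert (vmorn (0b1100, 0b1010) == 0xFD, "vs2 | ~vs1");
static_assert (vcpop (0b00010100) == 2, "two elements set");
static_assert (vfirst (0b00010100) == 2, "first set element is 2");
static_assert (vfirst (0) == -1, "no element set");
static_assert (vmsbf (0b00010100) == 0b00000011, "elements before first");
static_assert (vmsif (0b00010100) == 0b00000111, "up to and including first");
static_assert (vmsof (0b00010100) == 0b00000100, "only the first");
```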

---
 .../riscv/riscv-vector-builtins-bases.cc  | 217 ++
 .../riscv/riscv-vector-builtins-bases.h   |  19 ++
 .../riscv/riscv-vector-builtins-functions.def |  32 ++-
 .../riscv/riscv-vector-builtins-shapes.cc |  38 +++
 .../riscv/riscv-vector-builtins-shapes.h  |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc |  64 ++
 gcc/config/riscv/riscv-vsetvl.cc  |   6 +-
 gcc/config/riscv/vector-iterators.md  |   9 +
 gcc/config/riscv/vector.md| 145 ++--
 9 files changed, 511 insertions(+), 20 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index ba701482728..88142217e45 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -665,6 +665,185 @@ public:
   }
 };
 
+/* Implements vmand/vmnand/vmandn/vmxor/vmor/vmnor/vmorn/vmxnor  */
+template<rtx_code CODE>
+class mask_logic : public function_base
+{
+public:
+  bool apply_tail_policy_p () const override { return false; }
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred (CODE, e.vector_mode ()));
+  }
+};
+template<rtx_code CODE>
+class mask_nlogic : public function_base
+{
+public:
+  bool apply_tail_policy_p () const override { return false; }
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_n (CODE, e.vector_mode ()));
+  }
+};
+template<rtx_code CODE>
+class mask_notlogic : public function_base
+{
+public:
+  bool apply_tail_policy_p () const override { return false; }
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_not (CODE, e.vector_mode ()));
+  }
+};
+
+/* Implements vmmv.  */
+class vmmv : public function_base
+{
+public:
+  bool apply_tail_policy_p () const override { return false; }
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_mov (e.vector_mode ()));
+  }
+};
+
+/* Implements vmclr.  */
+class vmclr : public function_base
+{
+public:
+  bool can_be_overloaded_p (enum predication_type_index) const override
+  {
+return false;
+  }
+
+  rtx expand (function_expander &e) const override
+  {
+machine_mode mode = TYPE_MODE (TREE_TYPE (e.exp));
+e.add_all_one_mask_operand (mode);
+e.add_vundef_operand (mode);
+e.add_input_operand (mode, CONST0_RTX (mode));
+e.add_input_operand (call_expr_nargs (e.exp) - 1);
+e.add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+return e.generate_insn (code_for_pred_mov (e.vector_mode ()));
+  }
+};
+
+/* Implements vmset.  */
+class vmset : public function_base
+{
+public:
+  bool can_be_overloaded_p (enum predication_type_index) const override
+  {
+return false;
+  }
+
+  rtx expand (function_expander &e) const override
+  {
+machine_mode mode = TYPE_MODE (TREE_TYPE (e.exp));
+e.add_all_on

[PATCH] RISC-V: Fix vmnot asm check (Should check vmnot.m instead of vmnot.mm)

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/binop_vx_constraint-148.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-149.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-151.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-152.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-153.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-156.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-157.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-159.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-160.c: Change vmnot.mm 
to vmnot.m.
* gcc.target/riscv/rvv/base/binop_vx_constraint-161.c: Change vmnot.mm 
to vmnot.m.
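
To see why the old pattern could never match, here is a small sketch that runs both regular expressions against one assembly line. Assumptions: `std::regex` is used as a stand-in for the Tcl regexps that DejaGnu's scan-assembler-times uses, and the sample line is hypothetical compiler output rather than a captured one.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return true if PATTERN matches somewhere in LINE.
bool
asm_line_matches (const std::string &line, const std::string &pattern)
{
  return std::regex_search (line, std::regex (pattern));
}

void
check_vmnot_patterns ()
{
  // Hypothetical assembler output; vmnot.m takes a single mask source.
  const std::string line = "\tvmnot.m\tv0,v8";

  // The corrected pattern matches the one-"m" mnemonic.
  assert (asm_line_matches (line, R"(vmnot\.m\s+v[0-9]+,\s*v[0-9]+)"));
  // The old pattern requires "vmnot.mm" and therefore finds nothing.
  assert (!asm_line_matches (line, R"(vmnot\.mm\s+v[0-9]+,\s*v[0-9]+)"));
}
```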

---
 .../riscv/rvv/base/binop_vx_constraint-148.c   |  2 +-
 .../riscv/rvv/base/binop_vx_constraint-149.c   |  2 +-
 .../riscv/rvv/base/binop_vx_constraint-150.c   |  2 +-
 .../riscv/rvv/base/binop_vx_constraint-151.c   |  2 +-
 .../riscv/rvv/base/binop_vx_constraint-152.c   |  2 +-
 .../riscv/rvv/base/binop_vx_constraint-153.c   |  6 +++---
 .../riscv/rvv/base/binop_vx_constraint-156.c   |  6 +++---
 .../riscv/rvv/base/binop_vx_constraint-157.c   | 10 +-
 .../riscv/rvv/base/binop_vx_constraint-159.c   |  6 +++---
 .../riscv/rvv/base/binop_vx_constraint-160.c   | 10 +-
 .../riscv/rvv/base/binop_vx_constraint-161.c   |  4 ++--
 11 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-148.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-148.c
index c48134bc553..0c66a60ce74 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-148.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-148.c
@@ -16,5 +16,5 @@ void f1 (void * in, void *out, int32_t x)
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+\s+} 1 } } */
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t} 1 } } */
 /* { dg-final { scan-assembler-times 
{vmxor\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
-/* { dg-final { scan-assembler-times {vmnot\.mm\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 1 } } */
 /* { dg-final { scan-assembler-not {vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-149.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-149.c
index 7ba1a14aab6..f745b967c11 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-149.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-149.c
@@ -15,5 +15,5 @@ void f1 (void * in, void *out, int32_t x)
 
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+\s+} 2 } } */
 /* { dg-final { scan-assembler-times 
{vmandn\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
-/* { dg-final { scan-assembler-times {vmnot\.mm\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 1 } } */
 /* { dg-final { scan-assembler-not {vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c
index 6282fb48105..55a222f47ea 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c
@@ -17,5 +17,5 @@ void f1 (void * in, void *out, int32_t x)
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+\s+} 1 } } */
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t} 1 } } */
 /* { dg-final { scan-assembler-times 
{vmxor\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
-/* { dg-final { scan-assembler-times {vmnot\.mm\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 1 } } */
 /* { dg-final { scan-assembler-times {vmv} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-151.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-151.c
index a2aa633aef7..49f697d8c35 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-151.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-151.c
@@ -16,5 +16,5 @@ void f1 (void * in, void *out, int32_t x)
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+\s+} 1 } } */
 /* { dg-final { scan-assembler-times 
{vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v

[PATCH] RISC-V: Add the rest of all mask C api tests

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vcpop_m_m-1.c: New test.
* gcc.target/riscv/rvv/base/vcpop_m_m-2.c: New test.
* gcc.target/riscv/rvv/base/vcpop_m_m-3.c: New test.
* gcc.target/riscv/rvv/base/vfirst_m_m-1.c: New test.
* gcc.target/riscv/rvv/base/vfirst_m_m-2.c: New test.
* gcc.target/riscv/rvv/base/vfirst_m_m-3.c: New test.
* gcc.target/riscv/rvv/base/vlm_v-1.c: New test.
* gcc.target/riscv/rvv/base/vlm_v-2.c: New test.
* gcc.target/riscv/rvv/base/vlm_v-3.c: New test.
* gcc.target/riscv/rvv/base/vsm_v-1.c: New test.
* gcc.target/riscv/rvv/base/vsm_v-2.c: New test.
* gcc.target/riscv/rvv/base/vsm_v-3.c: New test.

---
 .../gcc.target/riscv/rvv/base/vcpop_m_m-1.c   | 104 ++
 .../gcc.target/riscv/rvv/base/vcpop_m_m-2.c   | 104 ++
 .../gcc.target/riscv/rvv/base/vcpop_m_m-3.c   | 104 ++
 .../gcc.target/riscv/rvv/base/vfirst_m_m-1.c  | 104 ++
 .../gcc.target/riscv/rvv/base/vfirst_m_m-2.c  | 104 ++
 .../gcc.target/riscv/rvv/base/vfirst_m_m-3.c  | 104 ++
 .../gcc.target/riscv/rvv/base/vlm_v-1.c   |  55 +
 .../gcc.target/riscv/rvv/base/vlm_v-2.c   |  55 +
 .../gcc.target/riscv/rvv/base/vlm_v-3.c   |  55 +
 .../gcc.target/riscv/rvv/base/vsm_v-1.c   |  55 +
 .../gcc.target/riscv/rvv/base/vsm_v-2.c   |  55 +
 .../gcc.target/riscv/rvv/base/vsm_v-3.c   |  55 +
 12 files changed, 954 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vfirst_m_m-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vfirst_m_m-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vfirst_m_m-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_v-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_v-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_v-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vsm_v-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vsm_v-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vsm_v-3.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-1.c
new file mode 100644
index 000..5ac335ac1ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vcpop_m_m-1.c
@@ -0,0 +1,104 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns 
-fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+uint64_t test___riscv_vcpop_m_b1(vbool1_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b1(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b2(vbool2_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b2(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b4(vbool4_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b4(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b8(vbool8_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b8(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b16(vbool16_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b16(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b32(vbool32_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b32(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b64(vbool64_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b64(op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b1_m(vbool1_t mask,vbool1_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b1_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b2_m(vbool2_t mask,vbool2_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b2_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b4_m(vbool4_t mask,vbool4_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b4_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b8_m(vbool8_t mask,vbool8_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b8_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b16_m(vbool16_t mask,vbool16_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b16_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b32_m(vbool32_t mask,vbool32_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b32_m(mask,op1,vl);
+}
+
+
+uint64_t test___riscv_vcpop_m_b64_m(vbool64_t mask,vbool64_t op1,size_t vl)
+{
+return __riscv_vcpop_m_b64_m(mask,op1,vl);
+}
+
+
+
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m8,\s*t[au],\s*m[au]\s+vcpop\.m\s+[a-x0-9]+,\s*v[0-9]+\s+}
 1 } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m4,\s*t[au],\s*m[au]\s+vcpop\.m\s+[a-x0-9]+,\s*v[0-9]+\s+}
 1 } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m2,\s*t[au],\s*m[au]\s+vcpop\.m\s+[

[PATCH] RISC-V: Add vm* C++ api tests

2023-02-15 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/vmand_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmand_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmand_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmandn_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmandn_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmandn_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmclr_m_m-1.C: New test.
* g++.target/riscv/rvv/base/vmclr_m_m-2.C: New test.
* g++.target/riscv/rvv/base/vmclr_m_m-3.C: New test.
* g++.target/riscv/rvv/base/vmmv_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmmv_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmmv_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmnand_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmnand_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmnand_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmnor_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmnor_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmnor_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmnot_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmnot_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmnot_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmor_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmor_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmor_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmorn_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmorn_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmorn_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m-1.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m-2.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m-3.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m_mu-1.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m_mu-2.C: New test.
* g++.target/riscv/rvv/base/vmsbf_m_mu-3.C: New test.
* g++.target/riscv/rvv/base/vmset_m_m-1.C: New test.
* g++.target/riscv/rvv/base/vmset_m_m-2.C: New test.
* g++.target/riscv/rvv/base/vmset_m_m-3.C: New test.
* g++.target/riscv/rvv/base/vmsif_m-1.C: New test.
* g++.target/riscv/rvv/base/vmsif_m-2.C: New test.
* g++.target/riscv/rvv/base/vmsif_m-3.C: New test.
* g++.target/riscv/rvv/base/vmsif_m_mu-1.C: New test.
* g++.target/riscv/rvv/base/vmsif_m_mu-2.C: New test.
* g++.target/riscv/rvv/base/vmsif_m_mu-3.C: New test.
* g++.target/riscv/rvv/base/vmsof_m-1.C: New test.
* g++.target/riscv/rvv/base/vmsof_m-2.C: New test.
* g++.target/riscv/rvv/base/vmsof_m-3.C: New test.
* g++.target/riscv/rvv/base/vmsof_m_mu-1.C: New test.
* g++.target/riscv/rvv/base/vmsof_m_mu-2.C: New test.
* g++.target/riscv/rvv/base/vmsof_m_mu-3.C: New test.
* g++.target/riscv/rvv/base/vmxnor_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmxnor_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmxnor_mm-3.C: New test.
* g++.target/riscv/rvv/base/vmxor_mm-1.C: New test.
* g++.target/riscv/rvv/base/vmxor_mm-2.C: New test.
* g++.target/riscv/rvv/base/vmxor_mm-3.C: New test.

---
 .../g++.target/riscv/rvv/base/vmand_mm-1.C|  55 +
 .../g++.target/riscv/rvv/base/vmand_mm-2.C|  55 +
 .../g++.target/riscv/rvv/base/vmand_mm-3.C|  55 +
 .../g++.target/riscv/rvv/base/vmandn_mm-1.C   |  55 +
 .../g++.target/riscv/rvv/base/vmandn_mm-2.C   |  55 +
 .../g++.target/riscv/rvv/base/vmandn_mm-3.C   |  55 +
 .../g++.target/riscv/rvv/base/vmclr_m_m-1.C   |  55 +
 .../g++.target/riscv/rvv/base/vmclr_m_m-2.C   |  55 +
 .../g++.target/riscv/rvv/base/vmclr_m_m-3.C   |  55 +
 .../g++.target/riscv/rvv/base/vmmv_mm-1.C |  55 +
 .../g++.target/riscv/rvv/base/vmmv_mm-2.C |  55 +
 .../g++.target/riscv/rvv/base/vmmv_mm-3.C |  55 +
 .../g++.target/riscv/rvv/base/vmnand_mm-1.C   |  55 +
 .../g++.target/riscv/rvv/base/vmnand_mm-2.C   |  55 +
 .../g++.target/riscv/rvv/base/vmnand_mm-3.C   |  55 +
 .../g++.target/riscv/rvv/base/vmnor_mm-1.C|  55 +
 .../g++.target/riscv/rvv/base/vmnor_mm-2.C|  55 +
 .../g++.target/riscv/rvv/base/vmnor_mm-3.C|  55 +
 .../g++.target/riscv/rvv/base/vmnot_mm-1.C|  55 +
 .../g++.target/riscv/rvv/base/vmnot_mm-2.C|  55 +
 .../g++.target/riscv/rvv/base/vmnot_mm-3.C|  55 +
 .../g++.target/riscv/rvv/base/vmor_mm-1.C |  55 +
 .../g++.target/riscv/rvv/base/vmor_mm-2.C |  55 +
 .../g++.target/riscv/rvv/base/vmor_mm-3.C |  55 +
 .../g++.target/riscv/rvv/base/vmorn_mm-1.C|  55 +
 .../g++.target/riscv/rvv/base/vmorn_mm-2.C|  55 +
 .../g++.target/riscv/rvv/base/vmorn_mm-3.C|  5

[PATCH 1/2, GCC12] AArch64: Update transitive closures of aes, sha2 and sha3 extensions

2023-02-15 Thread Tejas Belagod via Gcc-patches

Transitive closures of architectural extensions have to be maintained manually
for the AARCH64_OPT_EXTENSION list.  Currently the aes, sha2 and sha3 extensions
add AARCH64_FL_SIMD as their dependency, but this does not automatically pull in
the transitive dependence on AARCH64_FL_FP from AARCH64_FL_SIMD's definition.
This patch therefore adds AARCH64_FL_FP explicitly to the dependence set of each
of these crypto extensions.  Automatic transitive closure maintenance is fixed
on trunk in commit 11a113d501ff64fa4843e28d0a21b3f4e9d0d3de.
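
As an illustration of the transitive dependence described above, the following
is a minimal sketch of computing the closure of a feature set by fixed-point
iteration.  The FL_* names and the direct_deps table are hypothetical; they
only mirror the AARCH64_FL_* flags named in this message and are not GCC's
actual definitions or its trunk implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, modelled loosely on the AARCH64_FL_* flags.  */
enum
{
  FL_FP = 1u << 0,
  FL_SIMD = 1u << 1,
  FL_AES = 1u << 2,
  FL_SHA2 = 1u << 3,
  FL_SHA3 = 1u << 4,
  FL_COUNT = 5
};

/* direct_deps[i] lists only the DIRECT requirements of feature bit i.  */
static const uint32_t direct_deps[FL_COUNT] = {
  0,                 /* fp requires nothing.  */
  FL_FP,             /* simd requires fp.  */
  FL_SIMD,           /* aes requires simd.  */
  FL_SIMD,           /* sha2 requires simd.  */
  FL_SIMD | FL_SHA2  /* sha3 requires simd and sha2.  */
};

/* Return FLAGS together with everything reachable through direct_deps.
   The loop repeats until no new bit is added, i.e. a fixed point.  */
static uint32_t
feature_closure (uint32_t flags)
{
  uint32_t prev;

  assert ((flags >> FL_COUNT) == 0);
  do
    {
      prev = flags;
      for (unsigned i = 0; i < FL_COUNT; i++)
        if (flags & (1u << i))
          flags |= direct_deps[i];
    }
  while (flags != prev);
  return flags;
}
```

With this table, feature_closure (FL_AES) yields aes, simd and fp, which is the
set that has to be spelled out by hand when no such closure is computed.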

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def (aes, sha2, sha3):
Add AARCH64_FL_FP to the architectural dependence set in the
AARCH64_OPT_EXTENSION definitions of aes, sha2 and sha3.
---
 gcc/config/aarch64/aarch64-option-extensions.def | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
b/gcc/config/aarch64/aarch64-option-extensions.def
index b4d0ac8b600..88cefc20022 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -118,19 +118,19 @@ AARCH64_OPT_EXTENSION("dotprod", AARCH64_FL_DOTPROD, 
AARCH64_FL_SIMD, 0, \
 
 /* Enabling "aes" also enables "simd".
Disabling "aes" disables "aes" and "sve2-aes'.  */
-AARCH64_OPT_EXTENSION("aes", AARCH64_FL_AES, AARCH64_FL_SIMD, \
- AARCH64_FL_SVE2_AES, false, "aes")
+AARCH64_OPT_EXTENSION("aes", AARCH64_FL_AES, AARCH64_FL_SIMD | \
+ AARCH64_FL_FP, AARCH64_FL_SVE2_AES, false, "aes")
 
 /* Enabling "sha2" also enables "simd".
Disabling "sha2" just disables "sha2".  */
-AARCH64_OPT_EXTENSION("sha2", AARCH64_FL_SHA2, AARCH64_FL_SIMD, 0, false, \
- "sha1 sha2")
+AARCH64_OPT_EXTENSION("sha2", AARCH64_FL_SHA2, AARCH64_FL_SIMD | \
+ AARCH64_FL_FP, 0, false, "sha1 sha2")
 
 /* Enabling "sha3" enables "simd" and "sha2".
Disabling "sha3" disables "sha3" and "sve2-sha3".  */
 AARCH64_OPT_EXTENSION("sha3", AARCH64_FL_SHA3, AARCH64_FL_SIMD | \
- AARCH64_FL_SHA2, AARCH64_FL_SVE2_SHA3, false, \
- "sha3 sha512")
+ AARCH64_FL_SHA2 | AARCH64_FL_FP, AARCH64_FL_SVE2_SHA3, \
+ false, "sha3 sha512")
 
 /* Enabling "sm4" also enables "simd".
Disabling "sm4" disables "sm4" and "sve2-sm4".  */
-- 
2.17.1

[PATCH 2/2, GCC12] AArch64: Gate various crypto intrinsics availability based on features

2023-02-15 Thread Tejas Belagod via Gcc-patches

The 64-bit variants of the PMULL{2} instructions and the AES instructions are
available if FEAT_AES is implemented, according to the Arm ARM [1].  Similarly,
FEAT_SHA1 and FEAT_SHA256 enable the use of the SHA1 and SHA256 instruction
variants.  This patch fixes arm_neon.h to reflect the feature availability
correctly, based on '+aes' and '+sha2' rather than the ambiguous catch-all
'+crypto'.

[1] Section D17.2.61, C7.2.215
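
For reference, the 64-bit PMULL variant behind vmull_p64 is a carry-less
(polynomial, GF(2)) multiplication of two 64-bit operands into a 128-bit
result.  The following is a portable software sketch of that operation only;
the clmul128 type and the clmul64 and clmul64_selfcheck names are hypothetical
and are not part of arm_neon.h or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit result split into two 64-bit halves.  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} clmul128;

/* Carry-less multiply: for every set bit i of B, XOR (A << i) into the
   128-bit accumulator.  XOR replaces addition, so no carries propagate.  */
static clmul128
clmul64 (uint64_t a, uint64_t b)
{
  clmul128 r = { 0, 0 };

  for (unsigned i = 0; i < 64; i++)
    if ((b >> i) & 1)
      {
        r.lo ^= a << i;
        /* Bits shifted out of the low half land in the high half.
           Skip i == 0 to avoid an undefined shift by 64.  */
        if (i != 0)
          r.hi ^= a >> (64 - i);
      }
  return r;
}

/* (x + 1) * (x + 1) = x^2 + 1 over GF(2), i.e. 3 * 3 = 5.  */
static void
clmul64_selfcheck (void)
{
  clmul128 r = clmul64 (3, 3);
  assert (r.lo == 5 && r.hi == 0);
}
```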

2022-01-11  Tejas Belagod  

gcc/ChangeLog:

* config/aarch64/arm_neon.h (vmull_p64, vmull_high_p64, vaeseq_u8,
vaesdq_u8, vaesmcq_u8, vaesimcq_u8): Gate under "nothing+aes".
(vsha1*_u32, vsha256*_u32): Gate under "nothing+sha2".

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/pmull64.c: New.
* gcc.target/aarch64/aes-fuse-1.c: Replace '+crypto' with corresponding
feature flag based on the intrinsic.
* gcc.target/aarch64/aes-fuse-2.c: Likewise.
* gcc.target/aarch64/aes_1.c: Likewise.
* gcc.target/aarch64/aes_2.c: Likewise.
* gcc.target/aarch64/aes_xor_combine.c: Likewise.
* gcc.target/aarch64/sha1_1.c: Likewise.
* gcc.target/aarch64/sha256_1.c: Likewise.
* gcc.target/aarch64/target_attr_crypto_ice_1.c: Likewise.
---
 gcc/config/aarch64/arm_neon.h | 35 ++-
 .../gcc.target/aarch64/acle/pmull64.c | 14 
 gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c |  4 +--
 gcc/testsuite/gcc.target/aarch64/aes-fuse-2.c |  4 +--
 gcc/testsuite/gcc.target/aarch64/aes_1.c  |  2 +-
 gcc/testsuite/gcc.target/aarch64/aes_2.c  |  4 ++-
 .../gcc.target/aarch64/aes_xor_combine.c  |  2 +-
 gcc/testsuite/gcc.target/aarch64/sha1_1.c |  2 +-
 gcc/testsuite/gcc.target/aarch64/sha256_1.c   |  2 +-
 .../aarch64/target_attr_crypto_ice_1.c|  2 +-
 10 files changed, 44 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/pmull64.c

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 85d03c58d2a..695aafd9a5e 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -10243,7 +10243,7 @@ vqrdmlshs_laneq_s32 (int32_t __a, int32_t __b, 
int32x4_t __c, const int __d)
 #pragma GCC pop_options
 
 #pragma GCC push_options
-#pragma GCC target ("+nothing+crypto")
+#pragma GCC target ("+nothing+aes")
 /* vaes  */
 
 __extension__ extern __inline uint8x16_t
@@ -10273,6 +10273,22 @@ vaesimcq_u8 (uint8x16_t data)
 {
   return __builtin_aarch64_crypto_aesimcv16qi_uu (data);
 }
+
+__extension__ extern __inline poly128_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vmull_p64 (poly64_t __a, poly64_t __b)
+{
+  return
+__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
+}
+
+__extension__ extern __inline poly128_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
+{
+  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
+}
+
 #pragma GCC pop_options
 
 /* vcage  */
@@ -23519,7 +23535,7 @@ vrsrad_n_u64 (uint64_t __a, uint64_t __b, const int __c)
 }
 
 #pragma GCC push_options
-#pragma GCC target ("+nothing+crypto")
+#pragma GCC target ("+nothing+sha2")
 
 /* vsha1  */
 
@@ -23596,21 +23612,6 @@ vsha256su1q_u32 (uint32x4_t __tw0_3, uint32x4_t 
__w8_11, uint32x4_t __w12_15)
   __w12_15);
 }
 
-__extension__ extern __inline poly128_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vmull_p64 (poly64_t __a, poly64_t __b)
-{
-  return
-__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
-}
-
-__extension__ extern __inline poly128_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
-{
-  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
-}
-
 #pragma GCC pop_options
 
 /* vshl */
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c 
b/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
new file mode 100644
index 000..6a1e99e2d0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=armv8.2-a" } */
+
+#pragma push_options
+#pragma GCC target ("+aes")
+
+#include "arm_neon.h"
+
+int foo (poly64_t a, poly64_t b)
+{
+  return vgetq_lane_s32 (vreinterpretq_s32_p128 (vmull_p64 (a, b)), 0);
+}
+
+/* { dg-final { scan-assembler "\tpmull\tv" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c 
b/gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c
index d7b4f89919d..1b4e10f78db 100644
--- a/gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -mcpu=cortex-a72+crypto -dp" } */
-/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } 
}*/
+/* { dg-options "-O3 -mcpu=cortex-a72+aes -dp" } */
+/* { dg-additional-options "-march=armv8-a+aes" {

Re: [PING 2] [PATCH 0/3] RISC-V: optimize stack manipulation in save-restore

2023-02-15 Thread Fei Gao

ping.

BR, 
Fei

On 2023-02-03 16:52  Fei Gao  wrote:
>
>
>Gentle ping.
>
>The patch I previously submitted:
>| Date: Wed, 30 Nov 2022 00:38:08 -0800
>| Subject: [PATCH] RISC-V: optimize stack manipulation in save-restore
>| Message-ID: 
>
>I split the patches as per Palmer's review comment.
>
>BR
>Fei
>
>On 2022-12-01 18:03  Fei Gao  wrote:
>>
>>The patches allow fewer instructions to be used in stack allocation
>>and deallocation if save-restore is enabled, and also make the stack
>>manipulation code more readable.
>>
>>Fei Gao (3):
>>  RISC-V: add a new parameter in riscv_first_stack_step.
>>  RISC-V: optimize stack manipulation in save-restore
>>  RISC-V: make the stack manipulation codes more readable.
>>
>> gcc/config/riscv/riscv.cc | 105 +-
>> .../gcc.target/riscv/stack_save_restore.c |  40 +++
>> 2 files changed, 95 insertions(+), 50 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c
>>
>>--
>>2.17.1

[PATCH] tree-optimization/108791 - checking ICE with sloppy ADDR_EXPR

2023-02-15 Thread Richard Biener via Gcc-patches

The following fixes a checking ICE by choosing a more appropriate
type for an ADDR_EXPR built by forwprop.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108791
* tree-ssa-forwprop.cc (optimize_vector_load): Build
the ADDR_EXPR of a TARGET_MEM_REF using a more meaningful
type.

* gcc.dg/torture/pr108791.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108791.c | 9 +
 gcc/tree-ssa-forwprop.cc| 3 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108791.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108791.c 
b/gcc/testsuite/gcc.dg/torture/pr108791.c
new file mode 100644
index 000..c1c1cb22aad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108791.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+
+int f (int *a(), int *b, int *c, int *d)
+{
+  int s = 0;
+  for (int *i = (int *)a; i < b; ++i, ++c)
+s += *c * d[*i];
+  return s;
+}
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 0841a739fe1..03fe0a3f6df 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3299,7 +3299,8 @@ optimize_vector_load (gimple_stmt_iterator *gsi)
 {
   if (TREE_CODE (TREE_OPERAND (load_rhs, 0)) == ADDR_EXPR)
mark_addressable (TREE_OPERAND (TREE_OPERAND (load_rhs, 0), 0));
-  tree tem = make_ssa_name (TREE_TYPE (TREE_OPERAND (load_rhs, 0)));
+  tree ptrtype = build_pointer_type (TREE_TYPE (load_rhs));
+  tree tem = make_ssa_name (ptrtype);
   gimple *new_stmt
= gimple_build_assign (tem, build1 (ADDR_EXPR, TREE_TYPE (tem),
unshare_expr (load_rhs)));
-- 
2.35.3

[PATCH 1/5] Add prototypes for RISC-V Crypto built-in functions

2023-02-15 Thread Liao Shihua

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/riscv-builtins.cc |  8 
 gcc/config/riscv/riscv-ftypes.def  | 10 ++
 2 files changed, 18 insertions(+)

diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 25ca407f9a9..ded91e17554 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -42,6 +42,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE
 #define RISCV_FTYPE_NAME1(A, B) RISCV_##A##_FTYPE_##B
+#define RISCV_FTYPE_NAME2(A, B, C) RISCV_##A##_FTYPE_##B##_##C
+#define RISCV_FTYPE_NAME3(A, B, C, D) RISCV_##A##_FTYPE_##B##_##C##_##D
 
 /* Classifies the prototype of a built-in function.  */
 enum riscv_function_type {
@@ -132,6 +134,8 @@ AVAIL (always, (!0))
 /* Argument types.  */
 #define RISCV_ATYPE_VOID void_type_node
 #define RISCV_ATYPE_USI unsigned_intSI_type_node
+#define RISCV_ATYPE_QI intQI_type_node
+#define RISCV_ATYPE_HI intHI_type_node
 #define RISCV_ATYPE_SI intSI_type_node
 #define RISCV_ATYPE_DI intDI_type_node
 #define RISCV_ATYPE_VOID_PTR ptr_type_node
@@ -142,6 +146,10 @@ AVAIL (always, (!0))
   RISCV_ATYPE_##A
 #define RISCV_FTYPE_ATYPES1(A, B) \
   RISCV_ATYPE_##A, RISCV_ATYPE_##B
+#define RISCV_FTYPE_ATYPES2(A, B, C) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C
+#define RISCV_FTYPE_ATYPES3(A, B, C, D) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C, RISCV_ATYPE_##D
 
 static const struct riscv_builtin_description riscv_builtins[] = {
   #include "riscv-cmo.def"
diff --git a/gcc/config/riscv/riscv-ftypes.def 
b/gcc/config/riscv/riscv-ftypes.def
index 3a40c33e7c2..3b518195a29 100644
--- a/gcc/config/riscv/riscv-ftypes.def
+++ b/gcc/config/riscv/riscv-ftypes.def
@@ -32,3 +32,13 @@ DEF_RISCV_FTYPE (1, (VOID, USI))
 DEF_RISCV_FTYPE (1, (VOID, VOID_PTR))
 DEF_RISCV_FTYPE (1, (SI, SI))
 DEF_RISCV_FTYPE (1, (DI, DI))
+DEF_RISCV_FTYPE (2, (SI, QI, QI))
+DEF_RISCV_FTYPE (2, (SI, HI, HI))
+DEF_RISCV_FTYPE (2, (SI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, QI, QI))
+DEF_RISCV_FTYPE (2, (DI, HI, HI))
+DEF_RISCV_FTYPE (2, (DI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, DI))
+DEF_RISCV_FTYPE (3, (SI, SI, SI, SI))
+DEF_RISCV_FTYPE (3, (DI, DI, DI, SI))
-- 
2.38.1.windows.1

[PATCH V2 3/5] Implement ZKND and ZKNE extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zkne and Zknd extensions.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.
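
The immediate operands of these instructions are range-limited: the constraints
added by this patch accept 0 to 3 for bs (D03) and 0 to 10 for rnum (DsA).  A
minimal sketch of the same checks in plain C follows; imm_in_range,
valid_bs_imm and valid_rnum_imm are hypothetical helper names, not GCC
functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive range test, analogous in spirit to the IN_RANGE tests used by
   the D03 and DsA constraints.  */
static bool
imm_in_range (long value, long lo, long hi)
{
  assert (lo <= hi);
  return value >= lo && value <= hi;
}

/* bs (byte select) immediate of the aes32* instructions: 0, 1, 2 or 3.  */
static bool
valid_bs_imm (long value)
{
  return imm_in_range (value, 0, 3);
}

/* rnum (round number) immediate of aes64ks1i: 0 to 10.  */
static bool
valid_rnum_imm (long value)
{
  return imm_in_range (value, 0, 10);
}
```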

gcc/ChangeLog:

* config/riscv/constraints.md (D03): New constraint for bs.
(DsA): New constraint for rnum.
* config/riscv/crypto.md (riscv_aes32dsi): Add ZKND and ZKNE instructions.
(riscv_aes32dsmi): Likewise.
(riscv_aes64ds): Likewise.
(riscv_aes64dsm): Likewise.
(riscv_aes64im): Likewise.
(riscv_aes64ks1i): Likewise.
(riscv_aes64ks2): Likewise.
(riscv_aes32esi): Likewise.
(riscv_aes32esmi): Likewise.
(riscv_aes64es): Likewise.
(riscv_aes64esm): Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKND's and ZKNE's AVAIL.
* config/riscv/riscv-crypto.def (DIRECT_BUILTIN): Add ZKND's and ZKNE's
built-in functions.
* config/riscv/riscv_scalar_crypto.h (__riscv_aes32dsi): Add ZKND's and
ZKNE's intrinsics.
(__riscv_aes32dsmi): Likewise.
(__riscv_aes64ds): Likewise.
(__riscv_aes64dsm): Likewise.
(__riscv_aes64im): Likewise.
(__riscv_aes64ks1i): Likewise.
(__riscv_aes64ks2): Likewise.
(__riscv_aes32esi): Likewise.
(__riscv_aes32esmi): Likewise.
(__riscv_aes64es): Likewise.
(__riscv_aes64esm): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknd32.c: New test.
* gcc.target/riscv/zknd64.c: New test.
* gcc.target/riscv/zkne32.c: New test.
* gcc.target/riscv/zkne64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/constraints.md |   8 ++
 gcc/config/riscv/crypto.md  | 121 +++-
 gcc/config/riscv/riscv-builtins.cc  |   5 +
 gcc/config/riscv/riscv-crypto.def   |  15 +++
 gcc/config/riscv/riscv_scalar_crypto.h  |  46 +
 gcc/testsuite/gcc.target/riscv/zknd32.c |  18 
 gcc/testsuite/gcc.target/riscv/zknd64.c |  36 +++
 gcc/testsuite/gcc.target/riscv/zkne32.c |  18 
 gcc/testsuite/gcc.target/riscv/zkne64.c |  30 ++
 9 files changed, 296 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3637380ee47..3f46f14b10f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -83,6 +83,14 @@
   (and (match_code "const_int")
(match_test "SINGLE_BIT_MASK_OPERAND (~ival)")))
 
+(define_constraint "D03"
+  "0, 1, 2 or 3 immediate"
+  (match_test "IN_RANGE (ival, 0, 3)"))
+
+(define_constraint "DsA"
+  "0 - 10 immediate"
+  (match_test "IN_RANGE (ival, 0, 10)"))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index 6792f19ed68..d76a872775f 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -34,7 +34,20 @@
 UNSPEC_XPERM8
 UNSPEC_XPERM4
 
-
+;; ZKND unspecs
+UNSPEC_AES_DSI
+UNSPEC_AES_DSMI
+UNSPEC_AES_DS
+UNSPEC_AES_DSM
+UNSPEC_AES_IM
+UNSPEC_AES_KS1I
+UNSPEC_AES_KS2
+
+;; ZKNE unspecs
+UNSPEC_AES_ES
+UNSPEC_AES_ESM
+UNSPEC_AES_ESI
+UNSPEC_AES_ESMI
 ])
 
 ;; ZBKB extension
@@ -128,3 +141,109 @@
   "TARGET_ZBKX"
   "xperm8\t%0,%1,%2"
   [(set_attr "type" "crypto")])
+
+;; ZKND extension
+
+(define_insn "riscv_aes32dsi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes32dsmi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSMI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsmi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64ds"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2 "register_operand" "r")]
+   UNSPEC_AES_DS))]
+  "TARGET_ZKND && TARGET_64BIT"
+  "aes64ds\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64dsm"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2

[PATCH V2 5/5] Implement ZKSH and ZKSED extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zksh and Zksed extensions.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sm3p0_): Add ZKSH's and ZKSED's 
instructions.
(riscv_sm3p1_): Likewise.
(riscv_sm4ed_): Likewise.
(riscv_sm4ks_): Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKSH's and ZKSED's AVAIL.
* config/riscv/riscv-crypto.def (RISCV_BUILTIN): Add ZKSH's and ZKSED's 
built-in functions.
* config/riscv/riscv_scalar_crypto.h (__riscv_sm4ks): Add ZKSH's and 
ZKSED's intrinsics.
(__riscv_sm4ed): Likewise.
(__riscv_sm3p0): Likewise.
(__riscv_sm3p1): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zksed.c: New test.
* gcc.target/riscv/zksh.c: New test.


Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/crypto.md | 50 +-
 gcc/config/riscv/riscv-builtins.cc |  4 +++
 gcc/config/riscv/riscv-crypto.def  | 12 +++
 gcc/config/riscv/riscv_scalar_crypto.h | 20 +++
 gcc/testsuite/gcc.target/riscv/zksed.c | 20 +++
 gcc/testsuite/gcc.target/riscv/zksh.c  | 19 ++
 6 files changed, 124 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh.c

diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index 063a8025f20..e28bdd91078 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -64,6 +64,14 @@
 UNSPEC_SHA_512_SUM0R
 UNSPEC_SHA_512_SUM1
 UNSPEC_SHA_512_SUM1R
+
+;; ZKSH unspecs
+UNSPEC_SM3_P0
+UNSPEC_SM3_P1
+
+;; ZKSED unspecs
+UNSPEC_SM4_ED
+UNSPEC_SM4_KS
 ])
 
 ;; ZBKB extension
@@ -384,4 +392,44 @@
UNSPEC_SHA_512_SUM1))]
   "TARGET_ZKNH && TARGET_64BIT"
   "sha512sum1\t%0,%1"
-  [(set_attr "type" "crypto")])
\ No newline at end of file
+  [(set_attr "type" "crypto")])
+
+ ;; ZKSH
+
+(define_insn "riscv_sm3p0_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SM3_P0))]
+  "TARGET_ZKSH"
+  "sm3p0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sm3p1_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SM3_P1))]
+  "TARGET_ZKSH"
+  "sm3p1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+;; ZKSED
+
+(define_insn "riscv_sm4ed_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "register_operand" "r")
+  (match_operand:SI 3 "register_operand" "D03")]
+  UNSPEC_SM4_ED))]
+  "TARGET_ZKSED"
+  "sm4ed\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sm4ks_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "register_operand" "r")
+  (match_operand:SI 3 "register_operand" "D03")]
+  UNSPEC_SM4_KS))]
+  "TARGET_ZKSED"
+  "sm4ks\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 2a35167e6fb..18c0cce6b8b 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -113,6 +113,10 @@ AVAIL (crypto_zkne64, TARGET_ZKNE && TARGET_64BIT)
 AVAIL (crypto_zkne_or_zknd, (TARGET_ZKNE || TARGET_ZKND) && TARGET_64BIT)
 AVAIL (crypto_zknh32, TARGET_ZKNH && !TARGET_64BIT)
 AVAIL (crypto_zknh64, TARGET_ZKNH && TARGET_64BIT)
+AVAIL (crypto_zksh32, TARGET_ZKSH && !TARGET_64BIT)
+AVAIL (crypto_zksh64, TARGET_ZKSH && TARGET_64BIT)
+AVAIL (crypto_zksed32, TARGET_ZKSED && !TARGET_64BIT)
+AVAIL (crypto_zksed64, TARGET_ZKSED && TARGET_64BIT)
 AVAIL (always, (!0))
 
 /* Construct a riscv_builtin_description from the given arguments.
diff --git a/gcc/config/riscv/riscv-crypto.def 
b/gcc/config/riscv/riscv-crypto.def
index 831ab8c0d01..7774b801aec 100644
--- a/gcc/config/riscv/riscv-crypto.def
+++ b/gcc/config/riscv/riscv-crypto.def
@@ -80,3 +80,15 @@ DIRECT_BUILTIN (sha512sig0, RISCV_DI_FTYPE_DI, 
crypto_zknh64),
 DIRECT_BUILTIN (sha512sig1, RISCV_DI_FTYPE_DI, crypto_zknh64),
 DIRECT_BUILTIN (sha512sum0, RISCV_DI_FTYPE_DI, crypto_zknh64),
 DIRECT_BUILTIN (sha512sum1, RISCV_DI_FTYPE_DI, crypto_zknh64),
+
+// ZKSH
+RISCV_BUILTIN (sm3p0_si, "sm3p0", RISCV_BUILTIN_DIRECT, RISCV_SI_FTYPE_SI, 
crypto_zksh32),
+RISCV_BUILTIN (sm3p0_di, "sm3p0", RISCV_BUILTIN_DIRECT, RISCV_DI_FTYPE_DI, 
crypto_zksh64),
+RISCV_BUILTIN (sm3p1_si, "sm3p1", RISCV_BUILTIN_DIRECT, RISCV_SI_FTYPE_SI, 
crypto_zksh32),
+RISCV_BUILTIN (sm3p1_di, "sm3p1", RISCV_BUILTIN_DIRECT, RISCV_DI_FTYPE_DI, 
crypto_zksh64),
+
+// ZKSED
+RISCV_BUILTIN (sm4ed_si, "sm4ed", RISCV_BUILTIN

[PATCH V2 0/5] RISC-V: Implement Scalar Cryptography Extension

2023-02-15 Thread Liao Shihua

This series adds basic support for the Scalar Cryptography extensions:
* Zbkb
* Zbkc
* Zbkx
* Zknd
* Zkne
* Zknh
* Zksed
* Zksh

The implementation follows version 1.0.0 of the Scalar Cryptography
specification, and the intrinsics follow the riscv-c-api proposal.
Both can be found here:
https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/31

This is joint work by Wu Siyu and Liao Shihua.

Liao Shihua (5):
  Add prototypes for RISC-V Crypto built-in functions
  Implement ZBKB, ZBKC and ZBKX extensions
  Implement ZKND and ZKNE extensions
  Implement ZKNH extensions
  Implement ZKSH and ZKSED extensions

 gcc/config.gcc|   2 +-
 gcc/config/riscv/bitmanip.md  |  20 +-
 gcc/config/riscv/constraints.md   |   8 +
 gcc/config/riscv/crypto.md| 435 ++
 gcc/config/riscv/riscv-builtins.cc|  26 ++
 gcc/config/riscv/riscv-crypto.def |  94 
 gcc/config/riscv/riscv-ftypes.def |  10 +
 gcc/config/riscv/riscv.md |   4 +-
 gcc/config/riscv/riscv_scalar_crypto.h| 218 +
 gcc/testsuite/gcc.target/riscv/zbkb32.c   |  36 ++
 gcc/testsuite/gcc.target/riscv/zbkb64.c   |  28 ++
 gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 +
 gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 +
 gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zknd32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zknd64.c   |  36 ++
 gcc/testsuite/gcc.target/riscv/zkne32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zkne64.c   |  30 ++
 gcc/testsuite/gcc.target/riscv/zknh-sha256.c  |  29 ++
 .../gcc.target/riscv/zknh-sha512-32.c |  43 ++
 .../gcc.target/riscv/zknh-sha512-64.c |  31 ++
 gcc/testsuite/gcc.target/riscv/zksed.c|  20 +
 gcc/testsuite/gcc.target/riscv/zksh.c |  19 +
 24 files changed, 1183 insertions(+), 12 deletions(-)
 create mode 100644 gcc/config/riscv/crypto.md
 create mode 100644 gcc/config/riscv/riscv-crypto.def
 create mode 100644 gcc/config/riscv/riscv_scalar_crypto.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha256.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh.c

-- 
2.38.1.windows.1

[PATCH V2 4/5] Implement ZKNH extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zknh extension.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.
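
For readers unfamiliar with the SHA-256 half of Zknh, the following is a minimal
software sketch of what the four sha256* instructions compute on a 32-bit word,
based on the sigma and sum functions of SHA-256. It is not code from this patch:
the model_* names and the ror32 helper are hypothetical and exist only for
illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits; callers pass 0 < n < 32. */
static inline uint32_t ror32(uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Software models (hypothetical names) of the SHA-256 helper functions
   that the sha256sig0/sig1/sum0/sum1 instructions implement.  */
uint32_t model_sha256sig0(uint32_t x)
{
  return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
}

uint32_t model_sha256sig1(uint32_t x)
{
  return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10);
}

uint32_t model_sha256sum0(uint32_t x)
{
  return ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22);
}

uint32_t model_sha256sum1(uint32_t x)
{
  return ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25);
}
```

A model like this can serve as a reference when checking the results of the
built-ins on real hardware or in a simulator.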

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sha256sig0_<mode>): Add ZKNH's 
instructions.
(riscv_sha256sig1_<mode>): Likewise.
(riscv_sha256sum0_<mode>): Likewise.
(riscv_sha256sum1_<mode>): Likewise.
(riscv_sha512sig0h): Likewise.
(riscv_sha512sig0l): Likewise.
(riscv_sha512sig1h): Likewise.
(riscv_sha512sig1l): Likewise.
(riscv_sha512sum0r): Likewise.
(riscv_sha512sum1r): Likewise.
(riscv_sha512sig0): Likewise.
(riscv_sha512sig1): Likewise.
(riscv_sha512sum0): Likewise.
(riscv_sha512sum1): Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKNH's AVAIL.
* config/riscv/riscv-crypto.def (RISCV_BUILTIN): Add ZKNH's built-in 
functions.
(DIRECT_BUILTIN): Likewise.
* config/riscv/riscv_scalar_crypto.h (__riscv_sha256sig0): Add ZKNH's 
intrinsics.
(__riscv_sha256sig1): Likewise.
(__riscv_sha256sum0): Likewise.
(__riscv_sha256sum1): Likewise.
(__riscv_sha512sig0h): Likewise.
(__riscv_sha512sig0l): Likewise.
(__riscv_sha512sig1h): Likewise.
(__riscv_sha512sig1l): Likewise.
(__riscv_sha512sum0r): Likewise.
(__riscv_sha512sum1r): Likewise.
(__riscv_sha512sig0): Likewise.
(__riscv_sha512sig1): Likewise.
(__riscv_sha512sum0): Likewise.
(__riscv_sha512sum1): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknh-sha256.c: New test.
* gcc.target/riscv/zknh-sha512-32.c: New test.
* gcc.target/riscv/zknh-sha512-64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/crypto.md| 138 ++
 gcc/config/riscv/riscv-builtins.cc|   2 +
 gcc/config/riscv/riscv-crypto.def |  22 +++
 gcc/config/riscv/riscv_scalar_crypto.h|  48 ++
 gcc/testsuite/gcc.target/riscv/zknh-sha256.c  |  29 
 .../gcc.target/riscv/zknh-sha512-32.c |  43 ++
 .../gcc.target/riscv/zknh-sha512-64.c |  31 
 7 files changed, 313 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha256.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-64.c

diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index d76a872775f..063a8025f20 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -48,6 +48,22 @@
 UNSPEC_AES_ESM
 UNSPEC_AES_ESI
 UNSPEC_AES_ESMI
+
+;; ZKNH unspecs
+UNSPEC_SHA_256_SIG0
+UNSPEC_SHA_256_SIG1
+UNSPEC_SHA_256_SUM0
+UNSPEC_SHA_256_SUM1
+UNSPEC_SHA_512_SIG0
+UNSPEC_SHA_512_SIG0H
+UNSPEC_SHA_512_SIG0L
+UNSPEC_SHA_512_SIG1
+UNSPEC_SHA_512_SIG1H
+UNSPEC_SHA_512_SIG1L
+UNSPEC_SHA_512_SUM0
+UNSPEC_SHA_512_SUM0R
+UNSPEC_SHA_512_SUM1
+UNSPEC_SHA_512_SUM1R
 ])
 
 ;; ZBKB extension
@@ -247,3 +263,125 @@
   "TARGET_ZKNE && TARGET_64BIT"
   "aes64esm\t%0,%1,%2"
   [(set_attr "type" "crypto")])
+
+;; ZKNH - SHA256
+
+(define_insn "riscv_sha256sig0_<mode>"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SIG0))]
+  "TARGET_ZKNH"
+  "sha256sig0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sig1_<mode>"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SIG1))]
+  "TARGET_ZKNH"
+  "sha256sig1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sum0_<mode>"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SUM0))]
+  "TARGET_ZKNH"
+  "sha256sum0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sum1_<mode>"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SUM1))]
+  "TARGET_ZKNH"
+  "sha256sum1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+;; ZKNH - SHA512
+
+(define_insn "riscv_sha512sig0h"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG0H))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig0h\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha512sig0l"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG0L))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig0l\t%0,%1,%2"
+  [(set_attr "type" "crypto")]

[PATCH V2 2/5] Implement ZBKB, ZBKC and ZBKX extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zbkb, Zbkc and Zbkx extensions.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.
Note that this patch only adds built-ins for instructions that are in Zbkb but
not in Zbb.
Instructions that are in both Zbb and Zbkb are generated by the code
generator rather than through built-in functions and intrinsics.
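
To illustrate the point about the shared Zbb/Zbkb instructions, here is a small
sketch of plain C idioms for the operations touched in bitmanip.md (rotate
right, and-not, xnor). The function names are hypothetical and not part of the
patch. Whether a given compiler actually selects ror, andn or xnor for them
depends on the -march string and the optimization level, so treat this as an
assumption to verify rather than a guarantee.

```c
#include <stdint.h>

/* Rotate right; masking the count keeps every shift amount below 32. */
uint32_t rotr32(uint32_t x, unsigned n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* a AND (NOT b): the shape matched by the and-not pattern. */
uint32_t andn32(uint32_t a, uint32_t b)
{
  return a & ~b;
}

/* NOT (a XOR b): the shape matched by the xnor pattern. */
uint32_t xnor32(uint32_t a, uint32_t b)
{
  return ~(a ^ b);
}
```

Because these are ordinary C expressions, the same source builds for targets
without either extension; no intrinsic header is needed.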

gcc/ChangeLog:

* config.gcc: Add intrinsics header in extra_headers.
* config/riscv/bitmanip.md: Add TARGET_ZBKB if these instructions are 
included in ZBKB extension.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZBKB's, ZBKC's and ZBKX's 
AVAIL.
* config/riscv/riscv.md: Include crypto.md.
* config/riscv/crypto.md: Scalar Cryptography Machine description file.
* config/riscv/riscv-crypto.def: Scalar Cryptography built-in function 
file.
* config/riscv/riscv_scalar_crypto.h: Scalar Cryptography intrinsics 
header.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbkb32.c: New test.
* gcc.target/riscv/zbkb64.c: New test.
* gcc.target/riscv/zbkc32.c: New test.
* gcc.target/riscv/zbkc64.c: New test.
* gcc.target/riscv/zbkx32.c: New test.
* gcc.target/riscv/zbkx64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config.gcc  |   2 +-
 gcc/config/riscv/bitmanip.md|  20 ++--
 gcc/config/riscv/crypto.md  | 130 
 gcc/config/riscv/riscv-builtins.cc  |   7 ++
 gcc/config/riscv/riscv-crypto.def   |  45 
 gcc/config/riscv/riscv.md   |   4 +-
 gcc/config/riscv/riscv_scalar_crypto.h  | 104 +++
 gcc/testsuite/gcc.target/riscv/zbkb32.c |  36 +++
 gcc/testsuite/gcc.target/riscv/zbkb64.c |  28 +
 gcc/testsuite/gcc.target/riscv/zbkc32.c |  17 
 gcc/testsuite/gcc.target/riscv/zbkc64.c |  17 
 gcc/testsuite/gcc.target/riscv/zbkx32.c |  18 
 gcc/testsuite/gcc.target/riscv/zbkx64.c |  18 
 13 files changed, 434 insertions(+), 12 deletions(-)
 create mode 100644 gcc/config/riscv/crypto.md
 create mode 100644 gcc/config/riscv/riscv-crypto.def
 create mode 100644 gcc/config/riscv/riscv_scalar_crypto.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f0958e1c959..951b92b2028 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -532,7 +532,7 @@ riscv*)
extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o 
riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
extra_objs="${extra_objs} riscv-vector-builtins.o 
riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
d_target_objs="riscv-d.o"
-   extra_headers="riscv_vector.h"
+   extra_headers="riscv_vector.h riscv_scalar_crypto.h"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.cc"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.h"
;;
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 14d18edbe62..f076ba35832 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -189,7 +189,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r"))
 (match_operand:X 2 "register_operand" "r")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "n\t%0,%2,%1"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -203,7 +203,7 @@
   (const_int 0)))
(match_operand:DI 2 "register_operand")))
(clobber (match_operand:DI 3 "register_operand"))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   [(set (match_dup 3) (ashiftrt:DI (match_dup 1) (const_int 63)))
(set (match_dup 0) (and:DI (not:DI (match_dup 3)) (match_dup 2)))])
 
@@ -211,7 +211,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (not:X (xor:X (match_operand:X 1 "register_operand" "r")
   (match_operand:X 2 "register_operand" "r"))))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "xnor\t%0,%1,%2"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -277,7 +277,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(rotatert:SI (match_operand:SI 1 "register_operand" "r")
 (match_operand:QI 2 "arith_operand" "rI")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -285,7 +285,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(rotatert:DI (match_operand:DI 1 "register_operand" "r")
 (matc

[PATCH V2 1/5] Add prototypes for RISC-V Crypto built-in functions

2023-02-15 Thread Liao Shihua

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc (RISCV_FTYPE_NAME2): New enumeration 
identifier.
(RISCV_FTYPE_NAME3): Likewise.
(RISCV_ATYPE_QI): New Argument types.
(RISCV_ATYPE_HI): Likewise.
(RISCV_FTYPE_ATYPES2): New RISCV_ATYPE.
(RISCV_FTYPE_ATYPES3): Likewise.
* config/riscv/riscv-ftypes.def (2): New definitions of prototypes.
(3): Likewise.
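
As a rough illustration of how the new RISCV_FTYPE_NAME2 and RISCV_FTYPE_NAME3
macros are meant to be used, the sketch below reproduces the token-pasting
scheme in a self-contained file. The DEF_RISCV_FTYPE definition and the enum
layout are assumptions modelled on riscv-builtins.cc; the real file pulls its
entries from riscv-ftypes.def via #include, whereas the three entries here are
written inline.

```c
/* Build an enumerator name from a return type and argument types. */
#define RISCV_FTYPE_NAME1(A, B) RISCV_##A##_FTYPE_##B
#define RISCV_FTYPE_NAME2(A, B, C) RISCV_##A##_FTYPE_##B##_##C
#define RISCV_FTYPE_NAME3(A, B, C, D) RISCV_##A##_FTYPE_##B##_##C##_##D

/* Assumed shape of the dispatcher: paste the argument count onto the
   macro name, then apply it to the parenthesized type list.  */
#define DEF_RISCV_FTYPE(NARGS, LIST) RISCV_FTYPE_NAME##NARGS LIST,

enum riscv_function_type {
  DEF_RISCV_FTYPE (1, (SI, SI))          /* RISCV_SI_FTYPE_SI */
  DEF_RISCV_FTYPE (2, (SI, QI, QI))      /* RISCV_SI_FTYPE_QI_QI */
  DEF_RISCV_FTYPE (3, (DI, DI, DI, SI))  /* RISCV_DI_FTYPE_DI_DI_SI */
  RISCV_MAX_FTYPE_MAX
};
```

Each entry becomes one enumerator, so a two-argument prototype such as
(SI, QI, QI) only works once RISCV_FTYPE_NAME2 exists.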

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/riscv-builtins.cc |  8 
 gcc/config/riscv/riscv-ftypes.def  | 10 ++
 2 files changed, 18 insertions(+)

diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 25ca407f9a9..ded91e17554 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -42,6 +42,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE
 #define RISCV_FTYPE_NAME1(A, B) RISCV_##A##_FTYPE_##B
+#define RISCV_FTYPE_NAME2(A, B, C) RISCV_##A##_FTYPE_##B##_##C
+#define RISCV_FTYPE_NAME3(A, B, C, D) RISCV_##A##_FTYPE_##B##_##C##_##D
 
 /* Classifies the prototype of a built-in function.  */
 enum riscv_function_type {
@@ -132,6 +134,8 @@ AVAIL (always, (!0))
 /* Argument types.  */
 #define RISCV_ATYPE_VOID void_type_node
 #define RISCV_ATYPE_USI unsigned_intSI_type_node
+#define RISCV_ATYPE_QI intQI_type_node
+#define RISCV_ATYPE_HI intHI_type_node
 #define RISCV_ATYPE_SI intSI_type_node
 #define RISCV_ATYPE_DI intDI_type_node
 #define RISCV_ATYPE_VOID_PTR ptr_type_node
@@ -142,6 +146,10 @@ AVAIL (always, (!0))
   RISCV_ATYPE_##A
 #define RISCV_FTYPE_ATYPES1(A, B) \
   RISCV_ATYPE_##A, RISCV_ATYPE_##B
+#define RISCV_FTYPE_ATYPES2(A, B, C) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C
+#define RISCV_FTYPE_ATYPES3(A, B, C, D) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C, RISCV_ATYPE_##D
 
 static const struct riscv_builtin_description riscv_builtins[] = {
   #include "riscv-cmo.def"
diff --git a/gcc/config/riscv/riscv-ftypes.def 
b/gcc/config/riscv/riscv-ftypes.def
index 3a40c33e7c2..3b518195a29 100644
--- a/gcc/config/riscv/riscv-ftypes.def
+++ b/gcc/config/riscv/riscv-ftypes.def
@@ -32,3 +32,13 @@ DEF_RISCV_FTYPE (1, (VOID, USI))
 DEF_RISCV_FTYPE (1, (VOID, VOID_PTR))
 DEF_RISCV_FTYPE (1, (SI, SI))
 DEF_RISCV_FTYPE (1, (DI, DI))
+DEF_RISCV_FTYPE (2, (SI, QI, QI))
+DEF_RISCV_FTYPE (2, (SI, HI, HI))
+DEF_RISCV_FTYPE (2, (SI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, QI, QI))
+DEF_RISCV_FTYPE (2, (DI, HI, HI))
+DEF_RISCV_FTYPE (2, (DI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, DI))
+DEF_RISCV_FTYPE (3, (SI, SI, SI, SI))
+DEF_RISCV_FTYPE (3, (DI, DI, DI, SI))
-- 
2.38.1.windows.1

[PATCH V2 3/5] Implement ZKND and ZKNE extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zkne and Zknd extensions.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.
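
The AES instructions take small immediates: a byte select (bs) in the range
0 to 3 and a round number (rnum) in the range 0 to 10, which this patch models
with the D03 and DsA constraints. The following sketch shows the same range
checks in plain C. The IN_RANGE macro is written here in the style of GCC's
macro of the same name, and is_valid_bs and is_valid_rnum are hypothetical
helpers, not functions from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single-comparison range check: values below LOWER wrap to a large
   unsigned number and therefore fail the test.  */
#define IN_RANGE(VALUE, LOWER, UPPER) \
  ((uint64_t) (VALUE) - (uint64_t) (LOWER) \
   <= (uint64_t) (UPPER) - (uint64_t) (LOWER))

/* Mirrors the D03 constraint: 0, 1, 2 or 3. */
bool is_valid_bs(int64_t ival)
{
  return IN_RANGE(ival, 0, 3);
}

/* Mirrors the DsA constraint: 0 through 10. */
bool is_valid_rnum(int64_t ival)
{
  return IN_RANGE(ival, 0, 10);
}
```

An operand that fails such a check cannot satisfy the constraint, so a
non-constant or out-of-range value is expected to be rejected at compile time.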

gcc/ChangeLog:

* config/riscv/constraints.md (D03): New constraint for bs.
(DsA): New constraint for rnum.
* config/riscv/crypto.md (riscv_aes32dsi): Add ZKND and ZKNE instructions.
(riscv_aes32dsmi): Likewise.
(riscv_aes64ds): Likewise.
(riscv_aes64dsm): Likewise.
(riscv_aes64im): Likewise.
(riscv_aes64ks1i): Likewise.
(riscv_aes64ks2): Likewise.
(riscv_aes32esi): Likewise.
(riscv_aes32esmi): Likewise.
(riscv_aes64es): Likewise.
(riscv_aes64esm): Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKND's and ZKNE's AVAIL. 
* config/riscv/riscv-crypto.def (DIRECT_BUILTIN): Add ZKND's and ZKNE's 
built-in functions.
* config/riscv/riscv_scalar_crypto.h (__riscv_aes32dsi): Add ZKND's and 
ZKNE's intrinsics.
(__riscv_aes32dsmi): Likewise.
(__riscv_aes64ds): Likewise.
(__riscv_aes64dsm): Likewise.
(__riscv_aes64im): Likewise.
(__riscv_aes64ks1i): Likewise.
(__riscv_aes64ks2): Likewise.
(__riscv_aes32esi): Likewise.
(__riscv_aes32esmi): Likewise.
(__riscv_aes64es): Likewise.
(__riscv_aes64esm): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknd32.c: New test.
* gcc.target/riscv/zknd64.c: New test.
* gcc.target/riscv/zkne32.c: New test.
* gcc.target/riscv/zkne64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/constraints.md |   8 ++
 gcc/config/riscv/crypto.md  | 121 +++-
 gcc/config/riscv/riscv-builtins.cc  |   5 +
 gcc/config/riscv/riscv-crypto.def   |  15 +++
 gcc/config/riscv/riscv_scalar_crypto.h  |  46 +
 gcc/testsuite/gcc.target/riscv/zknd32.c |  18 
 gcc/testsuite/gcc.target/riscv/zknd64.c |  36 +++
 gcc/testsuite/gcc.target/riscv/zkne32.c |  18 
 gcc/testsuite/gcc.target/riscv/zkne64.c |  30 ++
 9 files changed, 296 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3637380ee47..3f46f14b10f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -83,6 +83,14 @@
   (and (match_code "const_int")
(match_test "SINGLE_BIT_MASK_OPERAND (~ival)")))
 
+(define_constraint "D03"
+  "0, 1, 2 or 3 immediate"
+  (match_test "IN_RANGE (ival, 0, 3)"))
+
+(define_constraint "DsA"
+  "0 - 10 immediate"
+  (match_test "IN_RANGE (ival, 0, 10)"))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index 6792f19ed68..d76a872775f 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -34,7 +34,20 @@
 UNSPEC_XPERM8
 UNSPEC_XPERM4
 
-
+;; ZKND unspecs
+UNSPEC_AES_DSI
+UNSPEC_AES_DSMI
+UNSPEC_AES_DS
+UNSPEC_AES_DSM
+UNSPEC_AES_IM
+UNSPEC_AES_KS1I
+UNSPEC_AES_KS2
+
+;; ZKNE unspecs
+UNSPEC_AES_ES
+UNSPEC_AES_ESM
+UNSPEC_AES_ESI
+UNSPEC_AES_ESMI
 ])
 
 ;; ZBKB extension
@@ -128,3 +141,109 @@
   "TARGET_ZBKX"
   "xperm8\t%0,%1,%2"
   [(set_attr "type" "crypto")])
+
+;; ZKND extension
+
+(define_insn "riscv_aes32dsi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes32dsmi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSMI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsmi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64ds"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2 "register_operand" "r")]
+   UNSPEC_AES_DS))]
+  "TARGET_ZKND && TARGET_64BIT"
+  "aes64ds\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64dsm"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2

[PATCH V2 2/5] Implement ZBKB, ZBKC and ZBKX extensions

2023-02-15 Thread Liao Shihua

This patch adds support for the Zbkb, Zbkc and Zbkx extensions.
It includes the instructions' machine descriptions, built-in functions, and
intrinsics.
Note that this patch only adds built-ins for the instructions that are in Zbkb
but not in Zbb.
Instructions that are in both Zbb and Zbkb are generated by the code
generator instead of through built-in functions and intrinsics.
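
As an illustration of the last point, here is a minimal sketch of a rotate
written as plain C++, with no built-in or intrinsic involved. The helper name
rotr32 is hypothetical and not part of this patch. The bitmanip.md hunks in
this patch enable the rotatert patterns under TARGET_ZBKB, so with a Zbkb
-march string (for example rv64gc_zbkb) the compiler may select a rotate
instruction for this idiom; whether it does depends on the GCC version and
optimization level, and has not been verified here.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): rotate a 32-bit value right
// by n bits using only shifts and an OR.  Both shift counts are masked to
// 0..31, so n == 0 and n >= 32 are well defined.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n) {
  n &= 31u;
  return (x >> n) | (x << ((32u - n) & 31u));
}
```

The point of the design is that source code needs no Zbkb-specific header for
these shared operations: the same function still compiles to shifts and an OR
on targets without Zbb or Zbkb.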

gcc/ChangeLog:

* config.gcc: Add intrinsics header in extra_headers.
* config/riscv/bitmanip.md: Add TARGET_ZBKB if these instructions are 
included in ZBKB extension.
* config/riscv/riscv-builtins.cc (AVAIL): Add ZBKB's,ZBKC's,ZBKX's 
AVAIL. 
* config/riscv/riscv.md: include crypto.md.
* config/riscv/crypto.md: Scalar Cryptography Machine description file.
* config/riscv/riscv-crypto.def: Scalar Cryptography built-in function 
file.
* config/riscv/riscv_scalar_crypto.h: Scalar Cryptography intrinsics 
header.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbkb32.c: New test.
* gcc.target/riscv/zbkb64.c: New test.
* gcc.target/riscv/zbkc32.c: New test.
* gcc.target/riscv/zbkc64.c: New test.
* gcc.target/riscv/zbkx32.c: New test.
* gcc.target/riscv/zbkx64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config.gcc  |   2 +-
 gcc/config/riscv/bitmanip.md|  20 ++--
 gcc/config/riscv/crypto.md  | 130 
 gcc/config/riscv/riscv-builtins.cc  |   7 ++
 gcc/config/riscv/riscv-crypto.def   |  45 
 gcc/config/riscv/riscv.md   |   4 +-
 gcc/config/riscv/riscv_scalar_crypto.h  | 104 +++
 gcc/testsuite/gcc.target/riscv/zbkb32.c |  36 +++
 gcc/testsuite/gcc.target/riscv/zbkb64.c |  28 +
 gcc/testsuite/gcc.target/riscv/zbkc32.c |  17 
 gcc/testsuite/gcc.target/riscv/zbkc64.c |  17 
 gcc/testsuite/gcc.target/riscv/zbkx32.c |  18 
 gcc/testsuite/gcc.target/riscv/zbkx64.c |  18 
 13 files changed, 434 insertions(+), 12 deletions(-)
 create mode 100644 gcc/config/riscv/crypto.md
 create mode 100644 gcc/config/riscv/riscv-crypto.def
 create mode 100644 gcc/config/riscv/riscv_scalar_crypto.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f0958e1c959..951b92b2028 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -532,7 +532,7 @@ riscv*)
extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o 
riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
extra_objs="${extra_objs} riscv-vector-builtins.o 
riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
d_target_objs="riscv-d.o"
-   extra_headers="riscv_vector.h"
+   extra_headers="riscv_vector.h riscv_scalar_crypto.h"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.cc"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.h"
;;
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 14d18edbe62..f076ba35832 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -189,7 +189,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r"))
 (match_operand:X 2 "register_operand" "r")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "n\t%0,%2,%1"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -203,7 +203,7 @@
   (const_int 0)))
(match_operand:DI 2 "register_operand")))
(clobber (match_operand:DI 3 "register_operand"))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   [(set (match_dup 3) (ashiftrt:DI (match_dup 1) (const_int 63)))
(set (match_dup 0) (and:DI (not:DI (match_dup 3)) (match_dup 2)))])
 
@@ -211,7 +211,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (not:X (xor:X (match_operand:X 1 "register_operand" "r")
   (match_operand:X 2 "register_operand" "r"]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "xnor\t%0,%1,%2"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -277,7 +277,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(rotatert:SI (match_operand:SI 1 "register_operand" "r")
 (match_operand:QI 2 "arith_operand" "rI")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -285,7 +285,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(rotatert:DI (match_operand:DI 1 "register_operand" "r")
 (matc

Re: [PATCH] PR tree-optimization/108697 - Create a lazy ssa_cache

2023-02-15 Thread Richard Biener via Gcc-patches

On Wed, Feb 15, 2023 at 6:07 PM Andrew MacLeod via Gcc-patches
 wrote:
>
> This patch implements the suggestion that we have an alternative
> ssa-cache which does not zero memory, and instead uses a bitmap to track
> whether a value is currently set or not.  It roughly mimics what
> path_range_query was doing internally.
>
> For sparsely used cases, especially in large programs, this is more
> efficient.  I changed path_range_query to use this, and removed its old
> bitmap (and a hack or two around PHI calculations), and also utilized
> this in the assume_query class.
>
> Performance wise, the patch doesn't affect VRP (since that still uses
> the original version).  Switching to the lazy version caused a slowdown
> of 2.5% across VRP.
>
> There was a noticeable improvement elsewhere: across 230 GCC source
> files, threading ran over 12% faster!  Overall compilation improved by
> 0.3%.  Not sure it makes much difference in compiler.i, but it shouldn't
> hurt.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions.   OK for trunk?
> Or do you want to wait for the next release...

I see

@@ -365,16 +335,8 @@ path_range_query::compute_ranges_in_phis (basic_block bb)

   Value_Range r (TREE_TYPE (name));
   if (range_defined_in_block (r, name, bb))
-   {
- unsigned v = SSA_NAME_VERSION (name);
- set_cache (r, name);
- bitmap_set_bit (phi_set, v);
- // Pretend we don't have a cache entry for this name until
- // we're done with all PHIs.
- bitmap_clear_bit (m_has_cache_entry, v);
-   }
+   m_cache.set_global_range (name, r);
 }
-  bitmap_ior_into (m_has_cache_entry, phi_set);
 }

 // Return TRUE if relations may be invalidated after crossing edge E.

which I think is not correct - if we have

 # _1 = PHI <..., _2>
 # _2 = PHI <..., _1>

then their effects are supposed to be executed in parallel, that is,
both PHI argument _2 and _1 are supposed to see the "old" version.
The previous code tried to make sure the range of the new _1 doesn't
get seen when processing the argument _1 in the definition of _2.

The new version drops this, possibly resulting in wrong-code.
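
To make the difference concrete, here is a small standalone sketch (not GCC
code; the Phi struct and both function names are hypothetical, and a plain int
stands in for a range).  It contrasts updating values one PHI at a time with
reading every argument from the "old" state first.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of "dest = PHI <..., src>": only the argument that
// refers to another PHI result is tracked.
struct Phi {
  std::size_t dest;
  std::size_t src;
};

// Sequential update: a value written by an earlier PHI is visible to later
// PHIs.  This is the behavior being warned about above.
std::vector<int> apply_sequential(std::vector<int> vals,
                                  const std::vector<Phi>& phis) {
  for (const Phi& p : phis)
    vals[p.dest] = vals[p.src];
  return vals;
}

// Parallel update: every PHI reads its argument from the old snapshot, and
// the results only become visible once all PHIs have been processed.
std::vector<int> apply_parallel(const std::vector<int>& old_vals,
                                const std::vector<Phi>& phis) {
  std::vector<int> next = old_vals;
  for (const Phi& p : phis)
    next[p.dest] = old_vals[p.src];
  return next;
}
```

With index 0 standing for _1 and index 1 for _2, the two PHIs above form a
swap.  Starting from {10, 20}, apply_parallel returns {20, 10}, while
apply_sequential returns {20, 20} because _2 sees the new _1.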

While I think it's appropriate to sort out compile-time issues like this
during stage4, at least the above makes me think it should be deferred
to next stage1.

Richard.

>
> Andrew
