Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Hi, Richard and Richi.
This patch passes bootstrap on x86, and the regression run shows no unexpected changes.
OK for trunk?

Thanks.


juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2023-05-22 10:08
To: gcc-patches
CC: richard.sandiford; rguenther; pan2.li; Ju-Zhe Zhong
Subject: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting 
elements
From: Ju-Zhe Zhong 
 
This addresses Richard's comment by splitting out the fix for multiple-rgroup
handling when the length counts elements.
 
This patch fixes the handling of multiple rgroups when the length counts
elements.
 
Before this patch, the multiple-rgroup run tests fail:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
 
After this patch, all of these tests pass.
 
gcc/ChangeLog:
 
* tree-vect-loop.cc (vect_get_loop_len): Fix issue for multiple-rgroup 
of length.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_get_loop_len): Ditto.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.
 
---
.../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
.../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
.../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
.../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
.../autovec/partial/multiple_rgroup_run-1.c   |  19 +
.../autovec/partial/multiple_rgroup_run-2.c   |  19 +
gcc/tree-vect-loop.cc |  26 +-
gcc/tree-vect-stmts.cc|  28 +-
gcc/tree-vectorizer.h |   5 +-
9 files changed, 944 insertions(+), 15 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
 
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)                                                   \
+  void __attribute__ ((noinline, noclone))                                     \
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   \
+                          TYPE1 x2, TYPE2 y, int n)                            \
+  {                                                                            \
+    for (int i = 0; i < n; ++i)                                                \
+  {   

[PATCH 2/2] xtensa: Merge '*addx' and '*subx' insn patterns into one

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
By making use of the 'addsub_operator' added in the last patch.

gcc/ChangeLog:

* config/xtensa/xtensa.md (*addsubx): Rename from '*addx',
and change to also accept '*subx' pattern.
(*subx): Remove.
---
 gcc/config/xtensa/xtensa.md | 31 +--
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index bd4614e4be0..f3313266645 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -170,15 +170,24 @@
(set_attr "mode""SI")
(set_attr "length"  "2,2,3,3,3")])
 
-(define_insn "*addx"
+(define_insn "*addsubx"
   [(set (match_operand:SI 0 "register_operand" "=a")
-   (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" "r")
+   (match_operator:SI 4 "addsub_operator"
+   [(ashift:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 3 "addsubx_operand" "i"))
-(match_operand:SI 2 "register_operand" "r")))]
+(match_operand:SI 2 "register_operand" "r")]))]
   "TARGET_ADDX"
 {
   operands[3] = GEN_INT (1 << INTVAL (operands[3]));
-  return "addx%3\t%0, %1, %2";
+  switch (GET_CODE (operands[4]))
+{
+case PLUS:
+  return "addx%3\t%0, %1, %2";
+case MINUS:
+  return "subx%3\t%0, %1, %2";
+default:
+  gcc_unreachable ();
+}
 }
   [(set_attr "type""arith")
(set_attr "mode""SI")
@@ -207,20 +216,6 @@
(set_attr "mode""SI")
(set_attr "length"  "3")])
 
-(define_insn "*subx"
-  [(set (match_operand:SI 0 "register_operand" "=a")
-   (minus:SI (ashift:SI (match_operand:SI 1 "register_operand" "r")
-(match_operand:SI 3 "addsubx_operand" "i"))
- (match_operand:SI 2 "register_operand" "r")))]
-  "TARGET_ADDX"
-{
-  operands[3] = GEN_INT (1 << INTVAL (operands[3]));
-  return "subx%3\t%0, %1, %2";
-}
-  [(set_attr "type""arith")
-   (set_attr "mode""SI")
-   (set_attr "length"  "3")])
-
 (define_insn "subsf3"
   [(set (match_operand:SF 0 "register_operand" "=f")
(minus:SF (match_operand:SF 1 "register_operand" "f")
-- 
2.30.2


[PATCH 1/2] xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch removes one machine instruction from the "single bit extraction
with shifting" operation, and tries to eliminate the conditional
branch when CST2_POW2 doesn't fit into signed 12 bits, with the help
of the ifcvt optimization.

/* example #1 */
int test0(int x) {
  return (x & 1048576) != 0 ? 1024 : 0;
}
extern int foo(void);
int test1(void) {
  return (foo() & 1048576) != 0 ? 16777216 : 0;
}

;; before
test0:
movi    a9, 0x400
srai    a2, a2, 10
and     a2, a2, a9
ret.n
test1:
addi    sp, sp, -16
s32i.n  a0, sp, 12
call0   foo
extui   a2, a2, 20, 1
slli    a2, a2, 20
beqz.n  a2, .L2
movi.n  a2, 1
slli    a2, a2, 24
.L2:
l32i.n  a0, sp, 12
addi    sp, sp, 16
ret.n

;; after
test0:
extui   a2, a2, 20, 1
slli    a2, a2, 10
ret.n
test1:
addi    sp, sp, -16
s32i.n  a0, sp, 12
call0   foo
l32i.n  a0, sp, 12
extui   a2, a2, 20, 1
slli    a2, a2, 24
addi    sp, sp, 16
ret.n
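
For readers following along, the rewrite in example #1 rests on a simple identity, sketched here in plain C. This is only an illustration under assumptions: it uses uint32_t rather than int to sidestep signed-shift questions, and log2_pow2 is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Source-level form from example #1: yield cst2 when bit cst1 of x is set.
   Both constants are assumed to be powers of two.  */
uint32_t
select_form (uint32_t x, uint32_t cst1, uint32_t cst2)
{
  return (x & cst1) != 0 ? cst2 : 0;
}

/* Hypothetical helper: index of the single set bit of a power of two.  */
unsigned
log2_pow2 (uint32_t v)
{
  assert (v != 0 && (v & (v - 1)) == 0);
  unsigned n = 0;
  while ((v >>= 1) != 0)
    ++n;
  return n;
}

/* Branch-free form: extract one bit (the job of EXTUI with width 1),
   then shift it into place (the job of SLLI).  */
uint32_t
extract_shift_form (uint32_t x, uint32_t cst1, uint32_t cst2)
{
  uint32_t bit = (x >> log2_pow2 (cst1)) & 1u;
  return bit << log2_pow2 (cst2);
}
```

Both functions agree for any x as long as the two constants are powers of two, which is why no constant load or branch is needed for the second operand.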

In addition, if the left shift amount ('exact_log2(CST2_POW2)') is
between 1 and 3 and either an addition or a subtraction with another
register follows, emit an ADDX[248] or SUBX[248] machine instruction
instead of separate left-shift and add/subtract instructions.

/* example #2 */
int test2(int x, int y) {
  return ((x & 1048576) != 0 ? 4 : 0) + y;
}
int test3(int x, int y) {
  return ((x & 2) != 0 ? 8 : 0) - y;
}

;; before
test2:
movi.n  a9, 4
srai    a2, a2, 18
and     a2, a2, a9
add.n   a2, a2, a3
ret.n
test3:
movi.n  a9, 8
slli    a2, a2, 2
and     a2, a2, a9
sub     a2, a2, a3
ret.n

;; after
test2:
extui   a2, a2, 20, 1
addx4   a2, a2, a3
ret.n
test3:
extui   a2, a2, 1, 1
subx8   a2, a2, a3
ret.n
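
As a rough cross-check of example #2, the following C sketch models the two "after" sequences. It assumes the usual Xtensa semantics, where ADDX4 computes (a << 2) + b and SUBX8 computes (a << 3) - b, and uses uint32_t so that wraparound is well defined; the function names are made up for illustration.

```c
#include <stdint.h>

/* test2 as written in the source.  */
uint32_t
test2_select (uint32_t x, uint32_t y)
{
  return ((x & 1048576u) != 0 ? 4u : 0u) + y;
}

/* test2 as EXTUI (bit 20, width 1) followed by ADDX4.  */
uint32_t
test2_addx4 (uint32_t x, uint32_t y)
{
  uint32_t bit = (x >> 20) & 1u;
  return (bit << 2) + y;
}

/* test3 as written in the source.  */
uint32_t
test3_select (uint32_t x, uint32_t y)
{
  return ((x & 2u) != 0 ? 8u : 0u) - y;
}

/* test3 as EXTUI (bit 1, width 1) followed by SUBX8.  */
uint32_t
test3_subx8 (uint32_t x, uint32_t y)
{
  uint32_t bit = (x >> 1) & 1u;
  return (bit << 3) - y;
}
```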

gcc/ChangeLog:

* config/xtensa/predicates.md (addsub_operator): New.
* config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3,
*extzvsi-1bit_addsubx): New insn_and_split patterns.
* config/xtensa/xtensa.cc (xtensa_rtx_costs):
Add a special case about ifcvt 'noce_try_cmove()' to handle
constant loads that do not fit into signed 12 bits in the
patterns added above.
---
 gcc/config/xtensa/predicates.md |  3 ++
 gcc/config/xtensa/xtensa.cc |  3 +-
 gcc/config/xtensa/xtensa.md | 75 +
 3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/predicates.md b/gcc/config/xtensa/predicates.md
index 2dac193373a..5faf1be8c15 100644
--- a/gcc/config/xtensa/predicates.md
+++ b/gcc/config/xtensa/predicates.md
@@ -191,6 +191,9 @@
 (define_predicate "logical_shift_operator"
   (match_code "ashift,lshiftrt"))
 
+(define_predicate "addsub_operator"
+  (match_code "plus,minus"))
+
 (define_predicate "xtensa_cstoresi_operator"
   (match_code "eq,ne,gt,ge,lt,le"))
 
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index bb1444c44b6..e3af78cd228 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -4355,7 +4355,8 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
   switch (outer_code)
{
case SET:
- if (xtensa_simm12b (INTVAL (x)))
+ if (xtensa_simm12b (INTVAL (x))
+ || (current_pass && current_pass->tv_id == TV_IFCVT))
{
  *total = speed ? COSTS_N_INSNS (1) : 0;
  return true;
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 3521fa33b47..bd4614e4be0 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -997,6 +997,81 @@
(set_attr "mode""SI")
(set_attr "length"  "3")])
 
+(define_insn_and_split "*extzvsi-1bit_ashlsi3"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+   (and:SI (match_operator:SI 4 "logical_shift_operator"
+   [(match_operand:SI 1 "register_operand" "r")
+(match_operand:SI 2 "const_int_operand" "i")])
+   (match_operand:SI 3 "const_int_operand" "i")))]
+  "exact_log2 (INTVAL (operands[3])) > 0"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+   (zero_extract:SI (match_dup 1)
+(const_int 1)
+(match_dup 2)))
+   (set (match_dup 0)
+   (ashift:SI (match_dup 0)
+  (match_dup 3)))]
+{
+  int shift = floor_log2 (INTVAL (operands[3]));
+  switch (GET_CODE (operands[4]))
+{
+case ASHIFT:
+  operands[2] = GEN_INT (shift - INTVAL (operands[2]));
+  break;
+case LSHIFTRT:
+  operands[2] = GEN_INT (shift + INTVAL (operands[2]));
+  break;
+default:
+  gcc_unreachable ();
+}
+  operands[3] = GEN_INT (shift);
+}
+  [(set_attr "type""arith")
+ 
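
The operand arithmetic in the split above (shift - n for ASHIFT, shift + n for LSHIFTRT) can be sanity-checked with a small C model. This is a sketch under assumptions: k stands for floor_log2 of the mask, n for the original shift count, and the extract position is assumed to stay within 0..31; the function names are hypothetical.

```c
#include <stdint.h>

/* Matched form with a left shift: (x << n) & (1 << k).  */
uint32_t
ashift_and (uint32_t x, unsigned n, unsigned k)
{
  return (x << n) & (UINT32_C (1) << k);
}

/* Matched form with a logical right shift: (x >> n) & (1 << k).  */
uint32_t
lshiftrt_and (uint32_t x, unsigned n, unsigned k)
{
  return (x >> n) & (UINT32_C (1) << k);
}

/* Split form: one-bit zero_extract at position pos, then a left shift
   by k.  For ASHIFT pos is k - n, for LSHIFTRT pos is k + n.  */
uint32_t
extract_then_shift (uint32_t x, unsigned pos, unsigned k)
{
  return ((x >> pos) & 1u) << k;
}
```

With n = 2 and k = 3 (test3 above) the extract position is 1, and with n = 10 and k = 10 (test0) it is 20, matching the extui operands in the listings.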

[PATCH] RISC-V: Reorganize the code of CONST_VECTOR handling in riscv.cc

2023-05-22 Thread juzhe . zhong
From: Juzhe-Zhong 

satisfies_constraint_vi (x) only applies to RVV modes, so move the check
inside the riscv_v_ext_vector_mode_p block to make the code easier to
follow.
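
For context, the check being moved is about splat constants that fit the 5-bit signed immediate of vmv.v.i. A minimal C model of that idea, assuming the constraint means "all elements equal and within [-16, 15]"; both helpers are hypothetical and are not GCC functions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if value fits a 5-bit signed immediate, i.e. lies in [-16, 15].  */
bool
fits_simm5 (int64_t value)
{
  return value >= -16 && value <= 15;
}

/* True if all n elements are equal and the shared value lies in
   [lo, hi].  An empty vector is rejected.  */
bool
all_same_in_range (const int64_t *elts, size_t n, int64_t lo, int64_t hi)
{
  if (n == 0)
    return false;
  for (size_t i = 1; i < n; ++i)
    if (elts[i] != elts[0])
      return false;
  return elts[0] >= lo && elts[0] <= hi;
}
```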

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_const_insns): Reorganize the codes.

---
 gcc/config/riscv/riscv.cc | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7bb38978261..5ac187c1b1b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1295,13 +1295,13 @@ riscv_const_insns (rtx x)
 * accurately according to BASE && STEP.  */
return 1;
  }
+   /* Constants from -16 to 15 can be loaded with vmv.v.i.
+  The Wc0, Wc1 constraints are already covered by the
+  vi constraint so we do not need to check them here
+  separately.  */
+   if (satisfies_constraint_vi (x))
+ return 1;
  }
-   /* Constants from -16 to 15 can be loaded with vmv.v.i.
-  The Wc0, Wc1 constraints are already covered by the
-  vi constraint so we do not need to check them here
-  separately.  */
-   if (TARGET_VECTOR && satisfies_constraint_vi (x))
- return 1;
 
/* TODO: We may support more const vector in the future.  */
return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
-- 
2.36.3



[PATCH] Fold _mm{, 256, 512}_abs_{epi8, epi16, epi32, epi64} into gimple ABS_EXPR.

2023-05-22 Thread liuhongt via Gcc-patches
Also for 64-bit vector abs intrinsics _mm_abs_{pi8,pi16,pi32}.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/109900
* config/i386/i386.cc (ix86_gimple_fold_builtin): Fold
_mm{,256,512}_abs_{epi8,epi16,epi32,epi64} and
_mm_abs_{pi8,pi16,pi32} into gimple ABS_EXPR.
(ix86_masked_all_ones): Handle 64-bit mask.
* config/i386/i386-builtin.def: Replace icode of related
non-mask simd abs builtins with CODE_FOR_nothing.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr109900.c: New test.
---
 gcc/config/i386/i386-builtin.def | 18 ++---
 gcc/config/i386/i386.cc  | 86 +++--
 gcc/testsuite/gcc.target/i386/pr109900.c | 95 
 3 files changed, 166 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109900.c

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index f7b10a6ab1e..c91e3809c75 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -899,12 +899,12 @@ BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_hsubv4sf3, 
"__builtin_ia32_hsubps"
 BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_hsubv2df3, 
"__builtin_ia32_hsubpd", IX86_BUILTIN_HSUBPD, UNKNOWN, (int) 
V2DF_FTYPE_V2DF_V2DF)
 
 /* SSSE3 */
-BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_absv16qi2, 
"__builtin_ia32_pabsb128", IX86_BUILTIN_PABSB128, UNKNOWN, (int) 
V16QI_FTYPE_V16QI)
-BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_absv8qi2, "__builtin_ia32_pabsb", IX86_BUILTIN_PABSB, UNKNOWN, 
(int) V8QI_FTYPE_V8QI)
-BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_absv8hi2, "__builtin_ia32_pabsw128", 
IX86_BUILTIN_PABSW128, UNKNOWN, (int) V8HI_FTYPE_V8HI)
-BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_absv4hi2, "__builtin_ia32_pabsw", IX86_BUILTIN_PABSW, UNKNOWN, 
(int) V4HI_FTYPE_V4HI)
-BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_absv4si2, "__builtin_ia32_pabsd128", 
IX86_BUILTIN_PABSD128, UNKNOWN, (int) V4SI_FTYPE_V4SI)
-BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_absv2si2, "__builtin_ia32_pabsd", IX86_BUILTIN_PABSD, UNKNOWN, 
(int) V2SI_FTYPE_V2SI)
+BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_nothing, "__builtin_ia32_pabsb128", 
IX86_BUILTIN_PABSB128, UNKNOWN, (int) V16QI_FTYPE_V16QI)
+BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, 
"__builtin_ia32_pabsb", IX86_BUILTIN_PABSB, UNKNOWN, (int) V8QI_FTYPE_V8QI)
+BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_nothing, "__builtin_ia32_pabsw128", 
IX86_BUILTIN_PABSW128, UNKNOWN, (int) V8HI_FTYPE_V8HI)
+BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, 
"__builtin_ia32_pabsw", IX86_BUILTIN_PABSW, UNKNOWN, (int) V4HI_FTYPE_V4HI)
+BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_nothing, "__builtin_ia32_pabsd128", 
IX86_BUILTIN_PABSD128, UNKNOWN, (int) V4SI_FTYPE_V4SI)
+BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, 
"__builtin_ia32_pabsd", IX86_BUILTIN_PABSD, UNKNOWN, (int) V2SI_FTYPE_V2SI)
 
 BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_ssse3_phaddwv8hi3, 
"__builtin_ia32_phaddw128", IX86_BUILTIN_PHADDW128, UNKNOWN, (int) 
V8HI_FTYPE_V8HI_V8HI)
 BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_phaddwv4hi3, "__builtin_ia32_phaddw", IX86_BUILTIN_PHADDW, 
UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
@@ -1178,9 +1178,9 @@ BDESC (OPTION_MASK_ISA_AVX, 0, 
CODE_FOR_vec_pack_sfix_v4df, "__builtin_ia32_vec_
 
 /* AVX2 */
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_mpsadbw, 
"__builtin_ia32_mpsadbw256", IX86_BUILTIN_MPSADBW256, UNKNOWN, (int) 
V32QI_FTYPE_V32QI_V32QI_INT)
-BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_absv32qi2, "__builtin_ia32_pabsb256", 
IX86_BUILTIN_PABSB256, UNKNOWN, (int) V32QI_FTYPE_V32QI)
-BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_absv16hi2, "__builtin_ia32_pabsw256", 
IX86_BUILTIN_PABSW256, UNKNOWN, (int) V16HI_FTYPE_V16HI)
-BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_absv8si2, "__builtin_ia32_pabsd256", 
IX86_BUILTIN_PABSD256, UNKNOWN, (int) V8SI_FTYPE_V8SI)
+BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_nothing, "__builtin_ia32_pabsb256", 
IX86_BUILTIN_PABSB256, UNKNOWN, (int) V32QI_FTYPE_V32QI)
+BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_nothing, "__builtin_ia32_pabsw256", 
IX86_BUILTIN_PABSW256, UNKNOWN, (int) V16HI_FTYPE_V16HI)
+BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_nothing, "__builtin_ia32_pabsd256", 
IX86_BUILTIN_PABSD256, UNKNOWN, (int) V8SI_FTYPE_V8SI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_packssdw, 
"__builtin_ia32_packssdw256",  IX86_BUILTIN_PACKSSDW256, UNKNOWN, (int) 
V16HI_FTYPE_V8SI_V8SI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_packsswb, 
"__builtin_ia32_packsswb256",  IX86_BUILTIN_PACKSSWB256, UNKNOWN, (int) 
V32QI_FTYPE_V16HI_V16HI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_packusdw, 
"__builtin_ia32_packusdw256",  IX86_BUILTIN_PACKUSDW256, UNKNOWN, (int) 
V16HI_FTYPE_V8SI_V8SI)

Re: [PATCH] RISC-V: Reorganize the code of CONST_VECTOR handling in riscv.cc

2023-05-22 Thread Kito Cheng via Gcc-patches
LGTM

On Mon, May 22, 2023 at 3:20 PM  wrote:
>
> From: Juzhe-Zhong 
>
> Since satisfies_constraint_vi (x) belongs to RVV region.
> We make this condition inside riscv_v_ext_vector_mode_p to make codes
> more reasonable.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_const_insns): Reorganize the codes.
>
> ---
>  gcc/config/riscv/riscv.cc | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 7bb38978261..5ac187c1b1b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1295,13 +1295,13 @@ riscv_const_insns (rtx x)
>  * accurately according to BASE && STEP.  */
> return 1;
>   }
> +   /* Constants from -16 to 15 can be loaded with vmv.v.i.
> +  The Wc0, Wc1 constraints are already covered by the
> +  vi constraint so we do not need to check them here
> +  separately.  */
> +   if (satisfies_constraint_vi (x))
> + return 1;
>   }
> -   /* Constants from -16 to 15 can be loaded with vmv.v.i.
> -  The Wc0, Wc1 constraints are already covered by the
> -  vi constraint so we do not need to check them here
> -  separately.  */
> -   if (TARGET_VECTOR && satisfies_constraint_vi (x))
> - return 1;
>
> /* TODO: We may support more const vector in the future.  */
> return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> --
> 2.36.3
>


Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.

Thanks, looks great.  A couple of minor comments below:

> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  

The new parameters need to be documented.  How about:

/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multiplied by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */

>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];

This looks right, but I think it'd be clearer to rearrange things slightly:

  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;

  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, nunits2).to_constant ();
  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
  gimple_seq seq = NULL;
  loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
   build_int_cst (iv_type, factor));
  if (seq)
gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
}
}
  return loop_len;

There's no change to the individual statements here, just the
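
To make the reuse rule in the quoted comment concrete, here is a scalar C sketch. It is only a model under assumptions: plain unsigned values stand in for poly_int64, the stored length is assumed to be a multiple of the factor, and reuse_loop_len is a hypothetical name.

```c
#include <assert.h>

/* Given a length counted in elements of a vector type with nunits1
   lanes, return the equivalent length for a type with nunits2 lanes
   whose elements are nunits1 / nunits2 times wider.  */
unsigned
reuse_loop_len (unsigned loop_len, unsigned nunits1, unsigned nunits2)
{
  if (nunits1 == nunits2)
    return loop_len;

  /* Mirrors the multiple_p assertion in the quoted code.  */
  assert (nunits2 != 0 && nunits1 % nunits2 == 0);
  unsigned factor = nunits1 / nunits2;
  return loop_len / factor;
}
```

For example, a length of 12 recorded for a 16-lane byte vector corresponds to 3 elements of a 4-lane vector with 4-byte elements.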

Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread Robin Dapp via Gcc-patches
Hi Juzhe,

thanks.  Some remarks inline.

> +;; Integer (signed) vcond.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcond"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VI 4 "register_operand")
> +  (match_operand:VI 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> + GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)
> +
> +;; Integer vcondu.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcondu"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VI 4 "register_operand")
> +  (match_operand:VI 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> + GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)

These two do exactly the same thing (as do their aarch64 counterparts).  Since
you usually like iterators, I guess you avoided one here for clarity?  Also, I
didn't see much immediate-range enforcement in expand_vcond.

> +
> +;; Floating-point vcond.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcond"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VF 4 "register_operand")
> +  (match_operand:VF 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> + GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)

It is a bit of a surprise to see float comparisons added before any other
float autovec patterns are in.  I'm not against it but would wait for
other comments here.  If the tests are sourced from aarch64, they have
been reviewed often enough that we can be fairly sure we are doing the
right thing, though.  I haven't checked the expander and inversion logic
closely yet.

> +
> +;; -
> +;;  [INT,FP] Select based on masks
> +;; -
> +;; Includes merging patterns for:
> +;; - vmerge.vv
> +;; - vmerge.vx
> +;; - vfmerge.vf
> +;; -
> +
> +(define_expand "vcond_mask_"
> +  [(match_operand:V 0 "register_operand")
> +   (match_operand: 3 "register_operand")
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "register_operand")]
> +  "TARGET_VECTOR"
> +  {
> +riscv_vector::emit_merge_op (operands[0], operands[2],
> +  operands[1], operands[3]);
> +DONE;
> +  }
> +)

Order of operands is a bit surprising, see below.
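
One plausible reading of the operand order, sketched per lane in C. This assumes that vcond_mask computes op0 = op3 ? op1 : op2, that vmerge.vvm vd, vs2, vs1, v0 computes vd[i] = v0[i] ? vs1[i] : vs2[i], and that emit_merge_op takes its sources in vmerge's assembly order; the last point is an assumption that the quoted hunk does not confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/* One lane of vcond_mask, arguments in optab operand order 1, 2, 3:
   the result is if_true when the mask bit is set, else if_false.  */
int32_t
vcond_mask_lane (int32_t if_true, int32_t if_false, bool mask)
{
  return mask ? if_true : if_false;
}

/* One lane of vmerge.vvm in assembly order (vs2, vs1, mask): vs2
   supplies the lanes whose mask bit is clear.  */
int32_t
vmerge_lane (int32_t vs2, int32_t vs1, bool mask)
{
  return mask ? vs1 : vs2;
}
```

Under these assumptions, passing operands[2] before operands[1] gives the same result as the optab definition, since vmerge_lane (f, t, m) equals vcond_mask_lane (t, f, m).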

> +  void add_fixed_operand (rtx x)
> +  {
> +create_fixed_operand (&m_ops[m_opno++], x);
> +gcc_assert (m_opno <= MAX_OPERANDS);
> +  }
> +  void add_integer_operand (rtx x)
> +  {
> +create_integer_operand (&m_ops[m_opno++], INTVAL (x));
> +gcc_assert (m_opno <= MAX_OPERANDS);
> +  }
>void add_all_one_mask_operand (machine_mode mode)
>{
>  add_input_operand (CONSTM1_RTX (mode), mode);
> @@ -85,11 +95,14 @@ public:
>{
>  add_input_operand (RVV_VUNDEF (mode), mode);
>}
> -  void add_policy_operand (enum tail_policy vta, enum mask_policy vma)
> +  void add_policy_operand (enum tail_policy vta)
>{
>  rtx tail_policy_rtx = gen_int_mode (vta, Pmode);
> -rtx mask_policy_rtx = gen_int_mode (vma, Pmode);
>  add_input_operand (tail_policy_rtx, Pmode);
> +  }
> +  void add_policy_operand (enum mask_policy vma)
> +  {
> +rtx mask_policy_rtx = gen_int_mode (vma, Pmode);
>  add_input_operand (mask_policy_rtx, Pmode);
>}
>void add_avl_type_operand (avl_type type)
> @@ -97,7 +110,8 @@ public:
>  add_input_operand (gen_int_mode (type, Pmode), Pmode);
>}

My idea would be to have the policy operands hidden a bit more, as
in my last patch.  It comes down to a matter of taste.  We can discuss
this once this patch is in and I have rebased my suggestion.

> -  void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
> +  void set_dest_and_mask (rtx 

Re: [PATCH V2] Provide -fcf-protection=branch,return.

2023-05-22 Thread Hongtao Liu via Gcc-patches
ping.

On Sat, May 13, 2023 at 5:20 PM liuhongt  wrote:
>
> > I think this could be simplified if you use either EnumSet or
> > EnumBitSet instead in common.opt for `-fcf-protection=`.
>
> Use EnumSet instead of EnumBitSet since CF_FULL is not a power of 2.
> Classifying the sets is a bit tricky: cf_branch and cf_return
> should be in different sets, but they both conflict with cf_full and
> cf_none, and the current EnumSet doesn't handle this well.
>
> So in the current implementation, only cf_full and cf_none are exclusive
> to each other, but they can be combined with any of cf_branch, cf_return,
> cf_check. It's not perfect, but it is still an improvement over the
> original one.
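
The set behaviour described above can be modelled in a few lines of C. This is a simplified sketch, not the option machinery itself: the SK_ values are assumed to mirror enum cf_protection_level, the set numbers are taken from the Set(N) annotations in the patch, and the parser only models "values from different sets are ORed together, two values from one set are rejected".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror GCC's enum cf_protection_level.  */
enum sk_cf_level
{
  SK_CF_NONE = 0,
  SK_CF_BRANCH = 1 << 0,
  SK_CF_RETURN = 1 << 1,
  SK_CF_FULL = SK_CF_BRANCH | SK_CF_RETURN,
  SK_CF_CHECK = 1 << 3
};

struct sk_cf_entry { const char *name; int value; int set; };

/* Set numbers as in the patch: full and none share set 1.  */
static const struct sk_cf_entry sk_cf_table[] = {
  { "full", SK_CF_FULL, 1 },
  { "branch", SK_CF_BRANCH, 2 },
  { "return", SK_CF_RETURN, 3 },
  { "check", SK_CF_CHECK, 4 },
  { "none", SK_CF_NONE, 1 },
};

/* Parse a comma-separated argument such as "branch,return".  Returns
   the ORed value, or -1 for an unknown keyword or a set conflict.  */
int
sk_cf_protection_value (const char *arg)
{
  unsigned seen_sets = 0;
  int result = 0;
  const char *p = arg;
  for (;;)
    {
      size_t len = strcspn (p, ",");
      const struct sk_cf_entry *match = NULL;
      for (size_t i = 0; i < sizeof sk_cf_table / sizeof sk_cf_table[0]; ++i)
        if (strlen (sk_cf_table[i].name) == len
            && strncmp (sk_cf_table[i].name, p, len) == 0)
          match = &sk_cf_table[i];
      if (match == NULL)
        return -1;                      /* Unknown keyword.  */
      if (seen_sets & (1u << match->set))
        return -1;                      /* Two values from one set.  */
      seen_sets |= 1u << match->set;
      result |= match->value;
      if (p[len] == '\0')
        return result;
      p += len + 1;
    }
}
```

In this model "branch,return" yields the same value as "full", while "full,none" is rejected because both keywords live in set 1.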
>
> gcc/ChangeLog:
>
> * common.opt: (fcf-protection=): Add EnumSet attribute to
> support combination of params.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/fcf-protection-10.c: New test.
> * c-c++-common/fcf-protection-11.c: New test.
> * c-c++-common/fcf-protection-12.c: New test.
> * c-c++-common/fcf-protection-8.c: New test.
> * c-c++-common/fcf-protection-9.c: New test.
> * gcc.target/i386/pr89701-1.c: New test.
> * gcc.target/i386/pr89701-2.c: New test.
> * gcc.target/i386/pr89701-3.c: New test.
> ---
>  gcc/common.opt | 12 ++--
>  gcc/testsuite/c-c++-common/fcf-protection-10.c |  2 ++
>  gcc/testsuite/c-c++-common/fcf-protection-11.c |  2 ++
>  gcc/testsuite/c-c++-common/fcf-protection-12.c |  2 ++
>  gcc/testsuite/c-c++-common/fcf-protection-8.c  |  2 ++
>  gcc/testsuite/c-c++-common/fcf-protection-9.c  |  2 ++
>  gcc/testsuite/gcc.target/i386/pr89701-1.c  |  4 
>  gcc/testsuite/gcc.target/i386/pr89701-2.c  |  4 
>  gcc/testsuite/gcc.target/i386/pr89701-3.c  |  4 
>  9 files changed, 28 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/fcf-protection-10.c
>  create mode 100644 gcc/testsuite/c-c++-common/fcf-protection-11.c
>  create mode 100644 gcc/testsuite/c-c++-common/fcf-protection-12.c
>  create mode 100644 gcc/testsuite/c-c++-common/fcf-protection-8.c
>  create mode 100644 gcc/testsuite/c-c++-common/fcf-protection-9.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89701-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89701-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89701-3.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index a28ca13385a..02f2472959a 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1886,7 +1886,7 @@ fcf-protection
>  Common RejectNegative Alias(fcf-protection=,full)
>
>  fcf-protection=
> -Common Joined RejectNegative Enum(cf_protection_level) 
> Var(flag_cf_protection) Init(CF_NONE)
> +Common Joined RejectNegative Enum(cf_protection_level) EnumSet 
> Var(flag_cf_protection) Init(CF_NONE)
>  -fcf-protection=[full|branch|return|none|check]Instrument functions 
> with checks to verify jump/call/return control-flow transfer
>  instructions have valid targets.
>
> @@ -1894,19 +1894,19 @@ Enum
>  Name(cf_protection_level) Type(enum cf_protection_level) 
> UnknownError(unknown Control-Flow Protection Level %qs)
>
>  EnumValue
> -Enum(cf_protection_level) String(full) Value(CF_FULL)
> +Enum(cf_protection_level) String(full) Value(CF_FULL) Set(1)
>
>  EnumValue
> -Enum(cf_protection_level) String(branch) Value(CF_BRANCH)
> +Enum(cf_protection_level) String(branch) Value(CF_BRANCH) Set(2)
>
>  EnumValue
> -Enum(cf_protection_level) String(return) Value(CF_RETURN)
> +Enum(cf_protection_level) String(return) Value(CF_RETURN) Set(3)
>
>  EnumValue
> -Enum(cf_protection_level) String(check) Value(CF_CHECK)
> +Enum(cf_protection_level) String(check) Value(CF_CHECK) Set(4)
>
>  EnumValue
> -Enum(cf_protection_level) String(none) Value(CF_NONE)
> +Enum(cf_protection_level) String(none) Value(CF_NONE) Set(1)
>
>  finstrument-functions
>  Common Var(flag_instrument_function_entry_exit,1)
> diff --git a/gcc/testsuite/c-c++-common/fcf-protection-10.c 
> b/gcc/testsuite/c-c++-common/fcf-protection-10.c
> new file mode 100644
> index 000..b271d134e52
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/fcf-protection-10.c
> @@ -0,0 +1,2 @@
> +/* { dg-do compile { target { "i?86-*-* x86_64-*-*" } } } */
> +/* { dg-options "-fcf-protection=branch,check" } */
> diff --git a/gcc/testsuite/c-c++-common/fcf-protection-11.c 
> b/gcc/testsuite/c-c++-common/fcf-protection-11.c
> new file mode 100644
> index 000..2e566350ccd
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/fcf-protection-11.c
> @@ -0,0 +1,2 @@
> +/* { dg-do compile { target { "i?86-*-* x86_64-*-*" } } } */
> +/* { dg-options "-fcf-protection=branch,return" } */
> diff --git a/gcc/testsuite/c-c++-common/fcf-protection-12.c 
> b/gcc/testsuite/c-c++-common/fcf-protection-12.c
> new file mode 100644
> index 000..b39c2f8e25d
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/fcf-protection-12.c
> @@ -0,0 +1,2 @@
> +/* { dg-do compile { t

[PATCH] Fix handling of non-integral bit-fields in native_encode_initializer

2023-05-22 Thread Eric Botcazou via Gcc-patches
Hi,

the encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD) have
integral types, but that's not the case in Ada where they may have pretty much
any type, resulting in a wrong encoding for them.

The attached fix filters out non-integral bit-fields, except if they start and
end on a byte boundary because they are correctly handled in this case.
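
As a rough illustration of the filter described above, the decision can be modelled in plain C. This is only a sketch under stated assumptions: `field_model`, `encode_path` and `classify_field` are hypothetical names that do not exist in GCC, and the real code works on `tree` nodes through DECL_BIT_FIELD, INTEGRAL_TYPE_P, DECL_FIELD_BIT_OFFSET and DECL_SIZE.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_UNIT 8 /* assumption: 8-bit bytes */

/* Hypothetical, simplified stand-in for a FIELD_DECL.  */
struct field_model
{
  bool is_bit_field;      /* models DECL_BIT_FIELD (field) */
  bool is_integral;       /* models INTEGRAL_TYPE_P (TREE_TYPE (field)) */
  unsigned long bit_pos;  /* models DECL_FIELD_BIT_OFFSET (field) */
  unsigned long bit_size; /* models DECL_SIZE (field) */
};

enum encode_path
{
  ENCODE_GENERIC,   /* encoded like any other field */
  ENCODE_BIT_FIELD, /* integral bit-field treatment */
  ENCODE_GIVE_UP    /* the encoder returns 0 */
};

/* Decide how a field would be handled under the rule described above.  */
enum encode_path
classify_field (const struct field_model *f)
{
  assert (f != 0);
  if (!f->is_bit_field)
    return ENCODE_GENERIC;
  if (f->is_integral)
    return ENCODE_BIT_FIELD;
  /* Non-integral bit-field: only acceptable if it starts and ends on a
     byte boundary.  */
  if (f->bit_pos % BITS_PER_UNIT != 0 || f->bit_size % BITS_PER_UNIT != 0)
    return ENCODE_GIVE_UP;
  return ENCODE_GENERIC;
}
```

In this model a 24-bit record placed at bit offset 8 takes the generic path, while the same record at bit offset 4 makes the encoder give up.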

Bootstrapped/regtested on x86-64/Linux, OK for mainline and 13 branch?


2023-05-22  Eric Botcazou  

* fold-const.cc (native_encode_initializer) : Apply the
specific treatment for bit-fields only if they have an integral type
and filter out non-integral bit-fields that do not start and end on
a byte boundary.


2023-05-22  Eric Botcazou  

* gnat.dg/opt101.adb: New test.
* gnat.dg/opt101_pkg.ads: New helper.

-- 
Eric Botcazou
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 25466e97220..57521501fff 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8360,20 +8360,26 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
 	  if (fieldsize == 0)
 		continue;
 
+	  /* Prepare to deal with integral bit-fields and filter out other
+		 bit-fields that do not start and end on a byte boundary.  */
 	  if (DECL_BIT_FIELD (field))
 		{
 		  if (!tree_fits_uhwi_p (DECL_FIELD_BIT_OFFSET (field)))
 		return 0;
-		  fieldsize = TYPE_PRECISION (TREE_TYPE (field));
 		  bpos = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field));
-		  if (bpos % BITS_PER_UNIT)
-		bpos %= BITS_PER_UNIT;
-		  else
-		bpos = 0;
-		  fieldsize += bpos;
-		  epos = fieldsize % BITS_PER_UNIT;
-		  fieldsize += BITS_PER_UNIT - 1;
-		  fieldsize /= BITS_PER_UNIT;
+		  if (INTEGRAL_TYPE_P (TREE_TYPE (field)))
+		{
+		  bpos %= BITS_PER_UNIT;
+		  fieldsize = TYPE_PRECISION (TREE_TYPE (field)) + bpos;
+		  epos = fieldsize % BITS_PER_UNIT;
+		  fieldsize += BITS_PER_UNIT - 1;
+		  fieldsize /= BITS_PER_UNIT;
+		}
+		  else if (bpos % BITS_PER_UNIT
+			   || DECL_SIZE (field) == NULL_TREE
+			   || !tree_fits_shwi_p (DECL_SIZE (field))
+			   || tree_to_shwi (DECL_SIZE (field)) % BITS_PER_UNIT)
+		return 0;
 		}
 
 	  if (off != -1 && pos + fieldsize <= off)
@@ -8382,7 +8388,8 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
 	  if (val == NULL_TREE)
 		continue;
 
-	  if (DECL_BIT_FIELD (field))
+	  if (DECL_BIT_FIELD (field)
+		  && INTEGRAL_TYPE_P (TREE_TYPE (field)))
 		{
 		  /* FIXME: Handle PDP endian.  */
 		  if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
-- { dg-do run }
-- { dg-options "-O" }

pragma Optimize_Alignment (Space);

with Opt101_Pkg; use Opt101_Pkg;

procedure Opt101 is

  C1 : Cont1;
  C2 : Cont2;

begin
  C1 := ((1234, 1, 2), 1, 2);
  if C1.R.I1 /= 1 or C1.I2 /= 2 then
raise Program_Error;
  end if;

  C2 := (1, (1234, 1, 2), 2);
  if C2.R.I1 /= 1 or C2.I2 /= 2 then
raise Program_Error;
  end if;
end;
package Opt101_Pkg is

  type Int is mod 16;

  type Rec is record
S : Short_Integer;
I1, I2 : Int;
  end record;
  pragma Pack (Rec);
  for Rec'Alignment use 4;

  type Cont1 is record
R : Rec;
I1, I2 : Int;
  end record;
  pragma Pack (Cont1);

  type Cont2 is record
I1 : Int;
R  : Rec;
I2 : Int;
  end record;
  pragma Pack (Cont2);
  pragma No_Component_Reordering (Cont2);

end Opt101_Pkg;


[wwwdocs, committed] git.html: Move OG12 to OG13, briefly mention old branches

2023-05-22 Thread Tobias Burnus

Committed as  6196747803d192744097590d6703a94def0030f4 →
https://gcc.gnu.org/git.html

Tobias
commit 6196747803d192744097590d6703a94def0030f4
Author: Tobias Burnus 
Date:   Mon May 22 10:08:14 2023 +0200

git.html: Move OG12 to OG13, briefly mention old branches

diff --git a/htdocs/git.html b/htdocs/git.html
index 701773ef..22c0eec1 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -280,16 +280,17 @@ in Git.
   Makarov vmaka...@redhat.com.
   
 
-  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-12";>devel/omp/gcc-12
+  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-13";>devel/omp/gcc-13
   This branch is for collaborative development of
   https://gcc.gnu.org/wiki/OpenACC";>OpenACC and
   https://gcc.gnu.org/wiki/openmp";>OpenMP support and related
   functionality, such
   as https://gcc.gnu.org/wiki/Offloading";>offloading support (OMP:
   offloading and multi processing).
-  The branch is based on releases/gcc-12.
-  Please send patch emails with a short-hand [og12] tag in the
-  subject line, and use ChangeLog.omp files.
+  The branch is based on releases/gcc-13.
+  Please send patch emails with a short-hand [og13] tag in the
+  subject line, and use ChangeLog.omp files. (Likewise but now
+  stale branches exists for the prior GCC releases 9 to 12.)
 
   unified-autovect
   This branch is for work on improving effectiveness and generality of GCC's


[PATCH V13] VECT: Fix bug of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Address Richard's comments by splitting out the fix for multiple-rgroup
handling when the length counts elements.

This patch fixes the handling of multiple rgroups when the length counts
elements.

Before this patch, the multiple-rgroup run tests fail:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test

After this patch, all of these tests pass.
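
The length reuse at the heart of this fix can be sketched in plain C. This is a toy model, not the vectorizer code: `reuse_loop_len` is a hypothetical helper, and the integers stand in for the poly_int64 values and gimple statements that vect_get_loop_len actually manipulates. It assumes a length recorded for vectors of NUNITS1 elements is needed for vectors of NUNITS2 elements, where NUNITS1 is a multiple of NUNITS2.

```c
#include <assert.h>

/* Toy model: LOOP_LEN was computed for vectors with NUNITS1 elements and
   is needed for vectors with NUNITS2 elements.  A length for data type X
   can be reused for data type Y if X has N times more elements than Y,
   in which case the length is divided by N.  */
unsigned long
reuse_loop_len (unsigned long loop_len, unsigned int nunits1,
                unsigned int nunits2)
{
  if (nunits1 == nunits2)
    return loop_len;
  /* Mirrors the multiple_p assertion in the patch.  */
  assert (nunits2 != 0 && nunits1 % nunits2 == 0);
  unsigned int factor = nunits1 / nunits2;
  return loop_len / factor;
}
```

For example, a length of 8 recorded for 8-element vectors becomes 4 when consumed by 4-element vectors.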

gcc/ChangeLog:

* tree-vect-loop.cc (vect_get_loop_len): Fix issue for multiple-rgroup 
of length.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_get_loop_len): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop.cc |  36 +-
 gcc/tree-vect-stmts.cc|  28 +-
 gcc/tree-vectorizer.h |   5 +-
 9 files changed, 950 insertions(+), 19 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)   
\
+  void __attribute__ ((noinline, noclone)) 
\
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   
\
+ TYPE1 x2, TYPE2 y, int n)\
+  {
\
+for (int i = 0; i < n; ++i)
\
+  {
\
+   f[i * 2 + 0] = x;  \
+   f[i * 2 + 1] = x2; \
+   d[i] = y;  \
+  }   

Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Thanks, Richard.
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619111.html
Would you mind taking another look at this patch?
I copied the code from your comments and tested it.
They all passed.
OK for trunk?

>> The patch is OK for trunk with those changes, thanks.  Once it's pushed,
>> could you post the updated decrementing IV patch?
Sure, I am working on it.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multiplied by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, n

Re: [PATCH V13] VECT: Fix bug of multiple-rgroup for length is counting elements

2023-05-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.

OK, thanks.

Richard


RE: [PATCH V13] VECT: Fix bug of multiple-rgroup for length is counting elements

2023-05-22 Thread Li, Pan2 via Gcc-patches
Committed, thanks Richard.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Monday, May 22, 2023 4:21 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de; Li, Pan2 
Subject: Re: [PATCH V13] VECT: Fix bug of multiple-rgroup for length is 
counting elements

juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing 
> multiple-rgroup handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is 
> counting elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c 
> execution test
>
> After this patch, These tests are all passed.

OK, thanks.

Richard


[PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-22 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens_control): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc: Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
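
The scheme named in this ChangeLog can be illustrated with a small C sketch. It is only a model under stated assumptions: `split_lengths` and `count_iterations` are hypothetical helpers operating on plain integers, whereas the patch builds MIN_EXPR and MINUS_EXPR gimple statements. The IV starts at the total element count and is decremented each iteration by MIN (remaining, VF); for a non-SLP multiple rgroup that step is then split into N lengths of at most VF/N each.

```c
#include <assert.h>
#include <stddef.h>

static size_t
min_size (size_t a, size_t b)
{
  return a < b ? a : b;
}

/* Split STEP (the elements handled in one iteration, at most VF) into N
   lengths, each capped at VF/N.  The last length is whatever remains.  */
void
split_lengths (size_t step, size_t vf, size_t n, size_t *lens)
{
  assert (n > 0 && vf % n == 0 && step <= vf);
  size_t limit = vf / n;
  size_t remain = step;
  for (size_t i = 0; i < n; ++i)
    {
      lens[i] = (i + 1 == n) ? remain : min_size (remain, limit);
      remain -= lens[i];
    }
}

/* Count the iterations of a loop controlled by a decrementing IV that
   starts at TOTAL and drops by MIN (remaining, VF) each time.  */
size_t
count_iterations (size_t total, size_t vf)
{
  assert (vf > 0);
  size_t iters = 0;
  for (size_t remaining = total; remaining != 0;)
    {
      size_t step = min_size (remaining, vf);
      remaining -= step;
      ++iters;
    }
  return iters;
}
```

With VF = 16 and N = 4, a step of 10 elements splits into the lengths 4, 4, 2 and 0.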

---
 gcc/tree-vect-loop-manip.cc | 184 +++-
 gcc/tree-vect-loop.cc   |  10 ++
 gcc/tree-vectorizer.h   |   8 ++
 3 files changed, 199 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index ff6159e08d5..94b38d1e0fb 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -385,6 +385,66 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, 
rgroup_controls *dest_rgm,
   return false;
 }
 
+/* Try to use adjust loop lens for non-SLP multiple-rgroups.
+
+ _36 = MIN_EXPR ;
+
+ First length (MIN (X, VF/N)):
+   loop_len_15 = MIN_EXPR <_36, VF/N>;
+
+ Second length:
+   tmp = _36 - loop_len_15;
+   loop_len_16 = MIN (tmp, VF/N);
+
+ Third length:
+   tmp2 = tmp - loop_len_16;
+   loop_len_17 = MIN (tmp2, VF/N);
+
+ Last length:
+   loop_len_18 = tmp2 - loop_len_17;
+*/
+
+static void
+vect_adjust_loop_lens_control (tree iv_type, gimple_seq *seq,
+  rgroup_controls *dest_rgm,
+  rgroup_controls *src_rgm, tree step)
+{
+  tree ctrl_type = dest_rgm->type;
+  poly_uint64 nitems_per_ctrl
+= TYPE_VECTOR_SUBPARTS (ctrl_type) * dest_rgm->factor;
+  tree length_limit = build_int_cst (iv_type, nitems_per_ctrl);
+
+  for (unsigned int i = 0; i < dest_rgm->controls.length (); ++i)
+{
+  if (!step)
+   step = src_rgm->controls[i / dest_rgm->controls.length ()];
+  tree ctrl = dest_rgm->controls[i];
+  if (i == 0)
+   {
+ /* First iteration: MIN (X, VF/N) capped to the range [0, VF/N].  */
+ gassign *assign
+   = gimple_build_assign (ctrl, MIN_EXPR, step, length_limit);
+ gimple_seq_add_stmt (seq, assign);
+   }
+  else if (i == dest_rgm->controls.length () - 1)
+   {
+ /* Last iteration: Remain capped to the range [0, VF/N].  */
+ gassign *assign = gimple_build_assign (ctrl, MINUS_EXPR, step,
+dest_rgm->controls[i - 1]);
+ gimple_seq_add_stmt (seq, assign);
+   }
+  else
+   {
+ /* (MIN (remain, VF*I/N)) capped to the range [0, VF/N].  */
+ step = gimple_build (seq, MINUS_EXPR, iv_type, step,
+  dest_rgm->controls[i - 1]);
+ gassign *assign
+   = gimple_build_assign (ctrl, MIN_EXPR, step, length_limit);
+ gimple_seq_add_stmt (seq, assign);
+   }
+}
+}
+
 /* Helper for vect_set_loop_condition_partial_vectors.  Generate definitions
for all the rgroup controls in RGC and return a control that is nonzero
when the loop needs to iterate.  Add any new preheader statements to
@@ -468,9 +528,78 @@ vect_set_loop_controls_directly (class loop *loop, 
loop_vec_info loop_vinfo,
   gimple_stmt_iterator incr_gsi;
   bool insert_after;
   standard_iv_increment_position (loop, &incr_gsi, &insert_after);
-  create_iv (build_int_cst (iv_type, 0), PLUS_EXPR, nitems_step, NULL_TREE,
-loop, &incr_gsi, insert_after, &index_before_incr,
-&index_after_incr);
+  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
+{
+  nitems_total = gimple_convert (preheader_seq, iv_type, nitems_total);
+  tree step = make_ssa_name (iv_type);
+  /* Create decrement IV.  */
+  create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, &incr_gsi,
+insert_after, &index_before_incr, &index_after_incr);
+  tree temp = gimple_build (header_seq, MIN_EXPR, iv_type,
+   index_before_incr, nitems_step);
+  gimple_seq_add_stmt (header_seq, gimple_build_assign (step, temp));
+
+  if (rgc->max_nscalars_per_iter == 1)
+   {
+ /* single rgroup:
+...
+_10 = (unsigned long) count_12(D);
+...
+# ivtmp_9 = PHI 
+_36 = MIN_EXPR ;
+...
+vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
+...
+ivtmp_35 = ivtmp_9 - _36;
+...
+if (ivtmp_35 != 0)
+  goto ; [83.33%]
+else
+  goto ; [16.67%]
+ */
+ gassign *assign = gimple_build_assign (rgc->controls[0], step);
+ gimple_seq_add_stmt (header_seq, assign);
+   }
+  else
+   {
+ /* Multiple rgroup (SLP):
+...
+_38 = (unsigned long) bnd.7_29;
+_39 = _38 * 2;
+...
+# ivtmp_41 = PHI 
+...
+_43 = MIN_EXPR ;
+loop_len_26 = MIN_EXPR <_43, 16>;
+ 

Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Hi, Richard.
I have rebased onto trunk and sent the updated patch for "decrement IV support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619115.html 

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multiplied by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, nunits2).to_constant ();
  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
  gimple_seq seq = NULL;
  loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
   build_int_cst (iv_type, factor));

[COMMITTED] ada: prevent infinite recursion in Collect_Types_In_Hierarchy

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Bob Duff 

In (illegal) mutually-dependent type declarations, it is possible for
Etype (Etype (Typ)) to point back to Typ. This patch stops the recursion
in such cases.
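
The guard can be pictured with a small C model. This is a sketch, not the GNAT code: `type_model` and `process_type` are hypothetical, the `etype` pointer stands in for Etype, and the model simply returns false where the Ada code either returns silently or raises Program_Error. As in the patch, only the 2-type cycle is detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a type entity.  A root type is its own
   parent, mirroring Etype (Typ) = Typ.  */
struct type_model
{
  struct type_model *etype;
  bool visited;
};

/* Walk up the parent chain.  Return false if a 2-type cycle is found,
   i.e. the parent's parent points back to TYP.  */
bool
process_type (struct type_model *typ)
{
  assert (typ != NULL && typ->etype != NULL);
  typ->visited = true;
  if (typ->etype != typ)
    {
      /* Stop before recursing into a mutually-dependent pair.  */
      if (typ->etype->etype == typ)
        return false;
      return process_type (typ->etype);
    }
  return true;
}
```

Longer cycles are not caught by this check, which matches the limitation stated in the patch comment.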

gcc/ada/

* sem_util.adb (Process_Type): Stop the recursion.
* exp_aggr.adb (Build_Record_Aggr_Code): Add assertion.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb |  1 +
 gcc/ada/sem_util.adb | 13 +
 2 files changed, 14 insertions(+)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index fe61e0ec90b..58831bd51ca 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -3837,6 +3837,7 @@ package body Exp_Aggr is
   Comp := First (Component_Associations (N));
   while Present (Comp) loop
  Selector := Entity (First (Choices (Comp)));
+ pragma Assert (Present (Selector));
 
  --  C++ constructors
 
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 1d8d4fc30f8..9cf21953fea 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -6235,6 +6235,19 @@ package body Sem_Util is
  --  Examine parent type
 
  if Etype (Typ) /= Typ then
+--  Prevent infinite recursion, which can happen in illegal
+--  programs. Silently return if illegal. For now, just deal
+--  with the 2-type cycle case. Larger cycles will get
+--  SIGSEGV at compile time from running out of stack.
+
+if Etype (Etype (Typ)) = Typ then
+   if Total_Errors_Detected = 0 then
+  raise Program_Error;
+   else
+  return;
+   end if;
+end if;
+
 Process_Type (Etype (Typ));
  end if;
 
-- 
2.40.0



[COMMITTED] ada: update Ada_Version_Type in fe.h to match opt.ads

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Bob Duff 

Remove Ada_With_Extensions, which is not used on the C side.
Do not add Ada_With_Core_Extensions and Ada_With_All_Extensions,
which are also not used on the C side, and on the Ada side
are always used via functions All_Extensions_Allowed and
Core_Extensions_Allowed. Explain this in comments.
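
To make the ordering argument concrete, here is a hypothetical C model of the Ada-side enumeration. It is an illustration only: after this patch fe.h declares values up to Ada_2022 and nothing more, and the names `ada_version_model`, `core_extensions_allowed` and `all_extensions_allowed` below are invented for the sketch. The real functions in opt.ads take no parameter and read Ada_Version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ordered enumeration in opt.ads.  The two
   extension levels come last so that ">=" comparisons stay meaningful.  */
enum ada_version_model
{
  ADA_83,
  ADA_95,
  ADA_2005,
  ADA_2012,
  ADA_2022,
  ADA_WITH_CORE_EXTENSIONS, /* penultimate */
  ADA_WITH_ALL_EXTENSIONS   /* always last */
};

/* Models Core_Extensions_Allowed: true for both extension levels.  */
bool
core_extensions_allowed (enum ada_version_model v)
{
  assert (v <= ADA_WITH_ALL_EXTENSIONS);
  return v >= ADA_WITH_CORE_EXTENSIONS;
}

/* Models All_Extensions_Allowed: true only for the last level.  */
bool
all_extensions_allowed (enum ada_version_model v)
{
  assert (v <= ADA_WITH_ALL_EXTENSIONS);
  return v == ADA_WITH_ALL_EXTENSIONS;
}
```

Going through the two predicates rather than comparing against the enumerators directly keeps callers independent of how many extension levels exist.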

Move the functions closer to the type declaration,
so the usage style is clearer.

Cleanup only -- no change in compiler behavior.

gcc/ada/

* fe.h: Remove Ada_With_Extensions and add commentary.
* opt.ads: Rearrange code and add commentary.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/fe.h|  5 -
 gcc/ada/opt.ads | 27 ---
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
index dd1ee51aadc..2d8f299903d 100644
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -220,8 +220,11 @@ extern Boolean In_Extended_Main_Code_Unit  (Entity_Id);
 #define Unnest_Subprogram_Mode opt__unnest_subprogram_mode
 
 typedef enum {
-  Ada_83, Ada_95, Ada_2005, Ada_2012, Ada_2022, Ada_With_Extensions
+  Ada_83, Ada_95, Ada_2005, Ada_2012, Ada_2022
 } Ada_Version_Type;
+// Ada_With_Core_Extensions and Ada_With_All_Extensions (see opt.ads) are not
+// used on the C side for now. If we decide to use them, we should import
+// All_Extensions_Allowed and Core_Extensions_Allowed functions.
 
 extern Ada_Version_Type Ada_Version;
 extern Boolean Back_End_Inlining;
diff --git a/gcc/ada/opt.ads b/gcc/ada/opt.ads
index 7e5919d4635..bcafba9e57d 100644
--- a/gcc/ada/opt.ads
+++ b/gcc/ada/opt.ads
@@ -81,8 +81,13 @@ package Opt is
--  so that tests like Ada_Version >= Ada_95 are legitimate and useful.
--  Think twice before using "="; Ada_Version >= Ada_2012 is more likely
--  what you want, because it will apply to future versions of the language.
+   --
--  Note that Ada_With_All_Extensions should always be last since it should
-   --  always be a superset of the other Ada versions.
+   --  always be a superset of the other Ada versions. Likewise, the
+   --  penultimate one should be Ada_With_Core_Extensions.
+   --
+   --  Use the ..._Extensions_Allowed functions below instead of referring
+   --  directly to Ada_With_..._Extensions.
 
--  WARNING: There is a matching C declaration of this type in fe.h
 
@@ -100,6 +105,16 @@ package Opt is
 
--  WARNING: There is a matching C declaration of this variable in fe.h
 
+   function All_Extensions_Allowed return Boolean is
+ (Ada_Version = Ada_With_All_Extensions);
+   --  True if GNAT specific language extensions are allowed. See GNAT RM for
+   --  details.
+
+   function Core_Extensions_Allowed return Boolean is
+ (Ada_Version >= Ada_With_Core_Extensions);
+   --  True if some but not all GNAT specific language extensions are allowed.
+   --  See GNAT RM for details.
+
Ada_Version_Pragma : Node_Id := Empty;
--  Reflects the Ada_xxx pragma that resulted in setting Ada_Version. Used
--  to specialize error messages complaining about the Ada version in use.
@@ -594,16 +609,6 @@ package Opt is
--  Set to True to convert nonbinary modular additions into code
--  that relies on the front-end expansion of operator Mod.
 
-   function All_Extensions_Allowed return Boolean is
- (Ada_Version = Ada_With_All_Extensions);
-   --  True if GNAT specific language extensions are allowed. See GNAT RM for
-   --  details.
-
-   function Core_Extensions_Allowed return Boolean is
- (Ada_Version >= Ada_With_Core_Extensions);
-   --  True if some but not all GNAT specific language extensions are allowed.
-   --  See GNAT RM for details.
-
type External_Casing_Type is (
  As_Is,   -- External names cased as they appear in the Ada source
  Uppercase,   -- External names forced to all uppercase letters
-- 
2.40.0
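
The ordering rule described in the opt.ads comment can be sketched on the C side. This is only an illustration: the enum below is hypothetical (the real fe.h deliberately omits the two extension levels after this patch), and every Sketch_/sketch_ name is an assumption made for the example, not part of GCC.

```c
#include <stdbool.h>

/* Hypothetical mirror of the Ada-side version type.  The real fe.h stops
   at Ada_2022; the two extension levels are added here only to show why
   their position in the enumeration matters.  */
typedef enum {
  Sketch_Ada_83, Sketch_Ada_95, Sketch_Ada_2005, Sketch_Ada_2012,
  Sketch_Ada_2022,
  Sketch_Ada_With_Core_Extensions,  /* must stay penultimate */
  Sketch_Ada_With_All_Extensions    /* must stay last */
} Sketch_Ada_Version_Type;

/* Core extensions are allowed at the core level and above, so ">=" is
   the right test.  */
bool
sketch_core_extensions_allowed (Sketch_Ada_Version_Type v)
{
  return v >= Sketch_Ada_With_Core_Extensions;
}

/* All extensions are allowed only at the last level, so "==" suffices.  */
bool
sketch_all_extensions_allowed (Sketch_Ada_Version_Type v)
{
  return v == Sketch_Ada_With_All_Extensions;
}
```

Keeping the comparisons inside two small predicates means callers never depend on the exact position of the extension levels, which is the usage style the patch asks for.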



[COMMITTED] ada: Update Controlling_Argument when copying trees

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When copying the AST we need to update fields that carry semantic
meaning and not just copy them. We already updated some of them,
e.g. the First/Next_Named_Association chain, but failed to update
the Controlling_Argument.

This fix doesn't appear to change anything for the compiler, but it is
needed for GNATprove, where we no longer want to expand expression
functions and instead we want to copy their preanalyzed expressions.

gcc/ada/

* sem_util.ads (New_Copy_Tree): Update comment.
* sem_util.adb (New_Copy_Tree): Update Controlling_Argument, very
much like we update the First/Next_Named_Association.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 73 +---
 gcc/ada/sem_util.ads |  1 +
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 9cf21953fea..cb0cbf2cf3a 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -23136,6 +23136,13 @@ package body Sem_Util is
   pragma Inline (Update_CFS_Sloc);
   --  Update the Comes_From_Source and Sloc attributes of node or entity N
 
+  procedure Update_Controlling_Argument
+(Old_Call : Node_Id;
+ New_Call : Node_Id);
+  pragma Inline (Update_Controlling_Argument);
+  --  Update Controlling_Argument of New_Call base on Old_Call to make it
+  --  points to the corresponding newly copied actual parameter.
+
   procedure Update_Named_Associations
 (Old_Call : Node_Id;
  New_Call : Node_Id);
@@ -23574,17 +23581,22 @@ package body Sem_Util is
   (Old_Assoc => N,
New_Assoc => Result);
 
---  Update the First/Next_Named_Association chain for a replicated
---  call.
+--  Update the First/Next_Named_Association chain and the
+--  Controlling_Argument for a replicated call.
 
 if Nkind (N) in N_Entry_Call_Statement
-  | N_Function_Call
-  | N_Procedure_Call_Statement
+  | N_Subprogram_Call
 then
Update_Named_Associations
  (Old_Call => N,
   New_Call => Result);
 
+   if Nkind (N) in N_Subprogram_Call then
+  Update_Controlling_Argument
+(Old_Call => N,
+ New_Call => Result);
+   end if;
+
 --  Update the Renamed_Object attribute of a replicated object
 --  declaration.
 
@@ -23694,6 +23706,59 @@ package body Sem_Util is
  end if;
   end Update_CFS_Sloc;
 
+  ---------------------------------
+  -- Update_Controlling_Argument --
+  ---------------------------------
+
+  procedure Update_Controlling_Argument
+(Old_Call : Node_Id;
+ New_Call : Node_Id)
+  is
+ New_Act : Node_Id;
+ Old_Act : Node_Id;
+
+ Old_Ctrl_Arg : constant Node_Id := Controlling_Argument (Old_Call);
+ --  Controlling argument of the old call node
+
+ Replaced : Boolean := False;
+ --  Flag to make sure that replacement works as expected
+
+  begin
+ if No (Old_Ctrl_Arg) then
+return;
+ end if;
+
+ --  Recreate the Controlling_Argument of a call by traversing both the
+ --  old and new actual parameters in parallel.
+
+ New_Act := First (Parameter_Associations (New_Call));
+ Old_Act := First (Parameter_Associations (Old_Call));
+ while Present (Old_Act) loop
+
+--  Actual parameter appears either in a named parameter
+--  association or directly.
+
+if Nkind (Old_Act) = N_Parameter_Association then
+   if Explicit_Actual_Parameter (Old_Act) = Old_Ctrl_Arg then
+  Set_Controlling_Argument
+(New_Call, Explicit_Actual_Parameter (New_Act));
+  Replaced := True;
+  exit;
+   end if;
+
+elsif Old_Act = Old_Ctrl_Arg then
+   Set_Controlling_Argument (New_Call, New_Act);
+   Replaced := True;
+   exit;
+end if;
+
+Next (New_Act);
+Next (Old_Act);
+ end loop;
+
+ pragma Assert (Replaced);
+  end Update_Controlling_Argument;
+
   -------------------------------
   -- Update_Named_Associations --
   -------------------------------
diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
index 42c6d249e2f..060d04241d3 100644
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -2646,6 +2646,7 @@ package Sem_Util is
--
--First_Named_Actual
--Next_Named_Actual
+   --Controlling_Argument
--
--  If applicable, the Etype field (if any) is updated to refer to a
--  local itype or type (see below).
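
The lockstep walk performed by Update_Controlling_Argument can be sketched in C as follows. The node type and function name are hypothetical stand-ins, not GNAT data structures, and the sketch ignores the named-association case that the Ada code handles separately.

```c
#include <stddef.h>

/* Hypothetical stand-in for an actual parameter in a call's list.  */
struct sketch_actual {
  struct sketch_actual *next;
};

/* Walk the old and new actual-parameter lists in parallel.  When the old
   controlling argument OLD_CTRL is found, return the node at the same
   position in the new list; return NULL if it is never found.  */
struct sketch_actual *
sketch_remap_controlling_arg (struct sketch_actual *old_act,
                              struct sketch_actual *new_act,
                              const struct sketch_actual *old_ctrl)
{
  while (old_act != NULL && new_act != NULL)
    {
      if (old_act == old_ctrl)
        return new_act;

      old_act = old_act->next;
      new_act = new_act->next;
    }

  return NULL;
}
```

A NULL result here corresponds to the situation that the Ada code guards against with pragma Assert (Replaced).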

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-22 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> Hi Richard,
> Thanks for the suggestions. Does the attached patch look OK ?
> Bootstrap+test in progress on aarch64-linux-gnu.

Like I say, please wait for the tests to complete before sending an RFA.
It saves a review cycle if the tests don't in fact pass.

> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 29dbacfa917..e611a7cca25 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -22332,6 +22332,43 @@ aarch64_unzip_vector_init (machine_mode mode, rtx vals, bool even_p)
>return gen_rtx_PARALLEL (new_mode, vec);
>  }
>  
> +/* Return true if INSN is a scalar move.  */
> +
> +static bool
> +scalar_move_insn_p (const rtx_insn *insn)
> +{
> +  rtx set = single_set (insn);
> +  if (!set)
> +return false;
> +  rtx src = SET_SRC (set);
> +  rtx dest = SET_DEST (set);
> +  return is_a(GET_MODE (dest))
> +  && aarch64_mov_operand_p (src, GET_MODE (src));

Formatting:

  return (is_a(GET_MODE (dest))
  && aarch64_mov_operand_p (src, GET_MODE (src)));

OK with that change if the tests pass, thanks.

Richard


[COMMITTED] ada: Restrict expression pretty-printer to subexpressions

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When pretty-printing expressions with CASE alternatives, we can qualify
the call to Nkind using N_Subexpr, so that we will get compile-time
errors when new node kinds are added (e.g. Ada 2022 case expressions).

gcc/ada/

* pprint.adb (Expr_Name): Qualify CASE expression with N_Subexpr; add
missing alternative for N_Raise_Storage_Error; remove dead alternatives;
explicitly list unsupported alternatives.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/pprint.adb | 42 +++---
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/gcc/ada/pprint.adb b/gcc/ada/pprint.adb
index 526b70f7996..2a86bd58cd8 100644
--- a/gcc/ada/pprint.adb
+++ b/gcc/ada/pprint.adb
@@ -185,10 +185,8 @@ package body Pprint is
 return "...";
  end if;
 
- case Nkind (Expr) is
-when N_Defining_Identifier
-   | N_Identifier
-=>
+ case N_Subexpr'(Nkind (Expr)) is
+when N_Identifier =>
return Ident_Image (Expr, Expression_Image.Expr, Expand_Type);
 
 when N_Character_Literal =>
@@ -379,14 +377,6 @@ package body Pprint is
   return "." & Expr_Name (Selector_Name (Expr));
end if;
 
-when N_Component_Association =>
-   return "("
- & List_Name
- (List  => First (Choices (Expr)),
-  Add_Space => False,
-  Add_Paren => False)
- & " => " & Expr_Name (Expression (Expr)) & ")";
-
 when N_If_Expression =>
declare
   Cond_Expr : constant Node_Id := First (Expressions (Expr));
@@ -436,6 +426,15 @@ package body Pprint is
   return "[program_error]";
end if;
 
+when N_Raise_Storage_Error =>
+   if Present (Condition (Expr)) then
+  return
+"[storage_error when "
+  & Expr_Name (Condition (Expr)) & "]";
+   else
+  return "[storage_error]";
+   end if;
+
 when N_Range =>
return
  Expr_Name (Low_Bound (Expr)) & ".." &
@@ -573,9 +572,6 @@ package body Pprint is
 when N_Op_Not =>
return "not (" & Expr_Name (Right_Opnd (Expr)) & ")";
 
-when N_Parameter_Association =>
-   return Expr_Name (Explicit_Actual_Parameter (Expr));
-
 when N_Type_Conversion =>
 
--  Most conversions are not very interesting (used inside
@@ -627,10 +623,18 @@ package body Pprint is
 when N_Null =>
return "null";
 
-when N_Others_Choice =>
-   return "others";
-
-when others =>
+when N_Case_Expression
+   | N_Delta_Aggregate
+   | N_Interpolated_String_Literal
+   | N_Op_Rotate_Left
+   | N_Op_Rotate_Right
+   | N_Operator_Symbol
+   | N_Procedure_Call_Statement
+   | N_Quantified_Expression
+   | N_Raise_Expression
+   | N_Reference
+   | N_Target_Name
+=>
return "...";
  end case;
   end Expr_Name;
-- 
2.40.0



[COMMITTED] ada: Don't pretty-print DEL within expression images

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When printing expression images, e.g. for GNATprove counterexamples,
it seems better to print DEL not directly but with its numeric code.

gcc/ada/

* pprint.adb (Expr_Name): Exclude DEL from printable range.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/pprint.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/pprint.adb b/gcc/ada/pprint.adb
index 8cc92445080..526b70f7996 100644
--- a/gcc/ada/pprint.adb
+++ b/gcc/ada/pprint.adb
@@ -195,7 +195,7 @@ package body Pprint is
declare
   Char : constant Int := UI_To_Int (Char_Literal_Value (Expr));
begin
-  if Char in 32 .. 127 then
+  if Char in 32 .. 126 then
  return "'" & Character'Val (Char) & "'";
   else
  UI_Image (Char_Literal_Value (Expr));
-- 
2.40.0
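
The effect of narrowing the range from 32 .. 127 to 32 .. 126 can be sketched in C. The function name is hypothetical, and the plain decimal output for non-printable codes is an assumption made for the example; the exact image produced by pprint.adb is not shown in the patch context.

```c
#include <stdio.h>

/* Write an image of character code CH into BUF (capacity SIZE).
   Printable ASCII (32 .. 126) is shown quoted; anything else, including
   DEL (127), is shown by its numeric code.  Returns what snprintf
   returns.  */
int
sketch_char_image (int ch, char *buf, size_t size)
{
  if (ch >= 32 && ch <= 126)
    return snprintf (buf, size, "'%c'", ch);

  return snprintf (buf, size, "%d", ch);
}
```

With the old upper bound of 127, DEL would have taken the first branch and been emitted as a raw control character inside quotes.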



[COMMITTED] ada: Fix handling of constrained array declarations in declare-expression

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

They need to go through Constrain_Array or else they do not really work.

gcc/ada/

* sem_ch3.adb (Find_Type_Of_Object): In a spec expression, also set
the Scope of the type, and call Constrain_Array for array subtypes.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch3.adb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index 66013ca0134..7596a59edb9 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -18417,10 +18417,10 @@ package body Sem_Ch3 is
Set_Etype  (T, Base_T);
Mutate_Ekind  (T, Subtype_Kind (Ekind (Base_T)));
Set_Parent (T, Obj_Def);
+   Set_Scope (T, Current_Scope);
 
if Ekind (T) = E_Array_Subtype then
-  Set_First_Index (T, First_Index (Base_T));
-  Set_Is_Constrained (T);
+  Constrain_Array (T, Obj_Def, Related_Nod, T, 'P');
 
elsif Ekind (T) = E_Record_Subtype then
   Set_First_Entity (T, First_Entity (Base_T));
-- 
2.40.0



[COMMITTED] ada: Better error message if non-Ada2022 code declares No_Return function

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

When a feature that is legal in Ada2022 but not in earlier Ada versions
is used, we typically want to call Error_Msg_Ada_2022_Feature in order to
generate an informative message in the error case. Specifying No_Return
for a function (as opposed to a procedure) is no exception to this rule.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): In Check_No_Return, call
Error_Msg_Ada_2022_Feature in the case of a function. Remove code
outside of Check_No_Return that was querying Ada_Version.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 6b1f9263f9d..36c1add5ea4 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -20035,7 +20035,11 @@ package body Sem_Prag is
 N : Node_Id) return Boolean
 is
 begin
-   if Ekind (E) = E_Procedure then
+   if Ekind (E) in E_Function | E_Generic_Function then
+  Error_Msg_Ada_2022_Feature ("No_Return function", Sloc (N));
+  return Ada_Version >= Ada_2022;
+
+   elsif Ekind (E) = E_Procedure then
 
   --  If E is a generic instance, marking it with No_Return
   --  is forbidden, but having it inherit the No_Return of
@@ -20106,9 +20110,7 @@ package body Sem_Prag is
   --  Ada 2022 (AI12-0269): A function can be No_Return
 
   if Ekind (E) in E_Generic_Procedure | E_Procedure
-or else (Ada_Version >= Ada_2022
-  and then
- Ekind (E) in E_Generic_Function | E_Function)
+   | E_Generic_Function | E_Function
   then
  --  Check that the pragma is not applied to a body.
  --  First check the specless body case, to give a
-- 
2.40.0



[COMMITTED] ada: Add contracts to Ada.Strings.Unbounded library

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Joffrey Huguet 

This patch adds contracts to the conversions between
Unbounded_String and String, the Element function and the
equality between two Unbounded_String, or between
Unbounded_String and String.
This patch also disallows the use of a function in SPARK, because
it returns an uninitialized Unbounded_String.

gcc/ada/

* libgnat/a-strunb.ads, libgnat/a-strunb__shared.ads
(To_Unbounded_String): Add postcondition. Add aspect SPARK_Mode
Off on the version that takes a Natural as parameter.
(To_String): Complete postcondition.
(Set_Unbounded_String): Add postcondition.
(Element): Likewise.
("="): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-strunb.ads | 16 +++-
 gcc/ada/libgnat/a-strunb__shared.ads | 16 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/libgnat/a-strunb.ads b/gcc/ada/libgnat/a-strunb.ads
index 0b0085a41b1..d3e88d0b4b6 100644
--- a/gcc/ada/libgnat/a-strunb.ads
+++ b/gcc/ada/libgnat/a-strunb.ads
@@ -86,21 +86,22 @@ is
function To_Unbounded_String
  (Source : String)  return Unbounded_String
with
- Post   => Length (To_Unbounded_String'Result) = Source'Length,
+ Post   => To_String (To_Unbounded_String'Result) = Source,
  Global => null;
--  Returns an Unbounded_String that represents Source
 
function To_Unbounded_String
  (Length : Natural) return Unbounded_String
with
- Post   =>
-   Ada.Strings.Unbounded.Length (To_Unbounded_String'Result) = Length,
- Global => null;
+ SPARK_Mode => Off,
+ Global => null;
--  Returns an Unbounded_String that represents an uninitialized String
--  whose length is Length.
 
function To_String (Source : Unbounded_String) return String with
- Post   => To_String'Result'Length = Length (Source),
+ Post   =>
+   To_String'Result'First = 1
+ and then To_String'Result'Length = Length (Source),
  Global => null;
--  Returns the String with lower bound 1 represented by Source
 
@@ -115,6 +116,7 @@ is
  (Target : out Unbounded_String;
   Source : String)
with
+ Post   => To_String (Target) = Source,
  Global => null;
pragma Ada_05 (Set_Unbounded_String);
--  Sets Target to an Unbounded_String that represents Source
@@ -198,6 +200,7 @@ is
   Index  : Positive) return Character
with
  Pre=> Index <= Length (Source),
+ Post   => Element'Result = To_String (Source) (Index),
  Global => null;
--  Returns the character at position Index in the string represented by
--  Source; propagates Index_Error if Index > Length (Source).
@@ -259,18 +262,21 @@ is
  (Left  : Unbounded_String;
   Right : Unbounded_String) return Boolean
with
+ Post   => "="'Result = (To_String (Left) = To_String (Right)),
  Global => null;
 
function "="
  (Left  : Unbounded_String;
   Right : String) return Boolean
with
+ Post   => "="'Result = (To_String (Left) = Right),
  Global => null;
 
function "="
  (Left  : String;
   Right : Unbounded_String) return Boolean
with
+ Post   => "="'Result = (Left = To_String (Right)),
  Global => null;
 
function "<"
diff --git a/gcc/ada/libgnat/a-strunb__shared.ads b/gcc/ada/libgnat/a-strunb__shared.ads
index bb69056299f..3f5d56e0a8c 100644
--- a/gcc/ada/libgnat/a-strunb__shared.ads
+++ b/gcc/ada/libgnat/a-strunb__shared.ads
@@ -108,24 +108,26 @@ is
function To_Unbounded_String
  (Source : String)  return Unbounded_String
with
- Post   => Length (To_Unbounded_String'Result) = Source'Length,
+ Post   => To_String (To_Unbounded_String'Result) = Source,
  Global => null;
 
function To_Unbounded_String
  (Length : Natural) return Unbounded_String
with
- Post   =>
-   Ada.Strings.Unbounded.Length (To_Unbounded_String'Result) = Length,
- Global => null;
+ SPARK_Mode => Off,
+ Global => null;
 
function To_String (Source : Unbounded_String) return String with
- Post   => To_String'Result'Length = Length (Source),
+ Post   =>
+   To_String'Result'First = 1
+ and then To_String'Result'Length = Length (Source),
  Global => null;
 
procedure Set_Unbounded_String
  (Target : out Unbounded_String;
   Source : String)
with
+ Post   => To_String (Target) = Source,
  Global => null;
pragma Ada_05 (Set_Unbounded_String);
 
@@ -198,6 +200,7 @@ is
   Index  : Positive) return Character
with
  Pre=> Index <= Length (Source),
+ Post   => Element'Result = To_String (Source) (Index),
  Global => null;
 
procedure Replace_Element
@@ -244,18 +247,21 @@ is
  (Left  : Unbounded_String;
   Right : Unbounded_String) return Boolean
with
+ Post   => "="'Result = (To_String (Left) = To_String (Right)),
  Global => null;
 
function

[COMMITTED] ada: Fix traversal for the rightmost node of a pretty-printed expression

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When getting the rightmost node of a pretty-printed expression we
incorrectly traversed some composite nodes, which caused the expression
image to be chopped.

gcc/ada/

* pprint.adb (Expression_Image): Reduce scope of local variables; inline
local uncommented constant From_Source; concatenate string with a single
character, as it is likely to execute faster; add missing cases to
traversal for the rightmost node and assertion to demonstrate that the
??? comment is no longer relevant.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/pprint.adb | 147 +++--
 1 file changed, 101 insertions(+), 46 deletions(-)

diff --git a/gcc/ada/pprint.adb b/gcc/ada/pprint.adb
index 2a86bd58cd8..8fdb5d6916e 100644
--- a/gcc/ada/pprint.adb
+++ b/gcc/ada/pprint.adb
@@ -53,13 +53,6 @@ package body Pprint is
  (Expr: Node_Id;
   Default : String) return String
is
-  From_Source  : constant Boolean :=
-   Comes_From_Source (Expr)
- and then not Opt.Debug_Generated_Code;
-  Append_Paren : Natural := 0;
-  Left : Node_Id := Original_Node (Expr);
-  Right: Node_Id := Original_Node (Expr);
-
   function Expr_Name
 (Expr: Node_Id;
  Take_Prefix : Boolean := True;
@@ -302,7 +295,7 @@ package body Pprint is
  return Str;
   end;
else
-  return "'" & Get_Name_String (Attribute_Name (Expr));
+  return ''' & Get_Name_String (Attribute_Name (Expr));
end if;
 
 when N_Explicit_Dereference =>
@@ -639,10 +632,20 @@ package body Pprint is
  end case;
   end Expr_Name;
 
+  --  Local variables
+
+  Append_Paren : Natural := 0;
+  Left : Node_Id := Original_Node (Expr);
+  Right: Node_Id := Original_Node (Expr);
+
+  Left_Sloc, Right_Sloc : Source_Ptr;
+
--  Start of processing for Expression_Image
 
begin
-  if not From_Source then
+  if not Comes_From_Source (Expr)
+or else Opt.Debug_Generated_Code
+  then
  declare
 S : constant String := Expr_Name (Expr);
  begin
@@ -661,8 +664,6 @@ package body Pprint is
   end if;
 
   --  Compute left (start) and right (end) slocs for the expression
-  --  Consider using Sinput.Sloc_Range instead, except that it does not
-  --  work properly currently???
 
   loop
  case Nkind (Left) is
@@ -706,13 +707,24 @@ package body Pprint is
 
   loop
  case Nkind (Right) is
-when N_And_Then
-   | N_Membership_Test
+when N_Membership_Test
| N_Op
-   | N_Or_Else
+   | N_Short_Circuit
 =>
Right := Original_Node (Right_Opnd (Right));
 
+when N_Attribute_Reference =>
+   declare
+  Exprs : constant List_Id := Expressions (Right);
+   begin
+  if Present (Exprs) then
+ Right := Original_Node (Last (Expressions (Right)));
+ Append_Paren := Append_Paren + 1;
+  else
+ exit;
+  end if;
+   end;
+
 when N_Expanded_Name
| N_Selected_Component
 =>
@@ -755,40 +767,37 @@ package body Pprint is
Append_Paren := Append_Paren + 1;
 
 when N_Function_Call =>
-   if Present (Parameter_Associations (Right)) then
-  declare
- Rover : Node_Id;
- Found : Boolean;
-
-  begin
- --  Avoid source position confusion associated with
- --  parameters for which Comes_From_Source is False.
-
- Rover := First (Parameter_Associations (Right));
- Found := False;
- while Present (Rover) loop
-if Comes_From_Source (Original_Node (Rover)) then
-   Right := Original_Node (Rover);
-   Found := True;
-end if;
+   declare
+  Has_Source_Param : Boolean := False;
+  --  True iff function call has a parameter coming from source
 
-Next (Rover);
- end loop;
+  Param : Node_Id;
 
- if Found then
-Append_Paren := Append_Paren + 1;
+   begin
+  --  Avoid source position confusion associated with
+  --  parameters for which Comes_From_Source is False.
+
+  Param := First (Parameter_Associations (Right));
+  while Present (Param) loop
+

[COMMITTED] ada: Reject illegal declarations in expression functions

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

gcc/ada/

* sem_ch4.adb (Analyze_Expression_With_Actions.Check_Action_Ok):
If Comes_From_Source (A) is False, then look at Original_Node (A)
instead of A. In particular, if an (illegal) expression function
is transformed into a "vanilla" function, we don't want to allow
it just because Comes_From_Source is now False.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch4.adb | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index 153a63586ca..7e8da9f2d5a 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -2368,6 +2368,16 @@ package body Sem_Ch4 is
   procedure Check_Action_OK (A : Node_Id) is
   begin
  if not Comes_From_Source (N) or else not Comes_From_Source (A) then
+
+--  If, for example, an (illegal) expression function is
+--  transformed into a"vanilla" function then we don't want to
+--  allow it just because Comes_From_Source is now False. So look
+--  at the Original_Node.
+
+if A /= Original_Node (A) then
+   Check_Action_OK (Original_Node (A));
+end if;
+
 return; -- Allow anything in generated code
  end if;
 
-- 
2.40.0



[COMMITTED] ada: Implement conversions from Big_Integer to large types

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This implements the conversion from Big_Integer to Long_Long_Unsigned on
32-bit platforms and to Long_Long_Long_{Integer,Unsigned} on 64-bit ones.

gcc/ada/

* libgnat/s-genbig.ads (From_Bignum): New overloaded declarations.
* libgnat/s-genbig.adb (LLLI): New subtype.
(LLLI_Is_128): New boolean constant.
(From_Bignum): Change the return type of the signed implementation
to Long_Long_Long_Integer and add support for the case where its
size is 128 bits.  Add a wrapper around it for Long_Long_Integer.
Add an unsigned implementation returning Unsigned_128 and a wrapper
around it for Unsigned_64.
(To_Bignum): Test LLLI_Is_128 instead of its size.
(To_String.Image): Add qualification to calls to From_Bignum.
* libgnat/a-nbnbin.adb (To_Big_Integer): Likewise.
(Signed_Conversions.From_Big_Integer): Likewise.
(Unsigned_Conversions): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-nbnbin.adb |   6 +--
 gcc/ada/libgnat/s-genbig.adb | 100 +--
 gcc/ada/libgnat/s-genbig.ads |  12 +
 3 files changed, 98 insertions(+), 20 deletions(-)

diff --git a/gcc/ada/libgnat/a-nbnbin.adb b/gcc/ada/libgnat/a-nbnbin.adb
index edfd04e1ca3..090f408f2d7 100644
--- a/gcc/ada/libgnat/a-nbnbin.adb
+++ b/gcc/ada/libgnat/a-nbnbin.adb
@@ -160,7 +160,7 @@ package body Ada.Numerics.Big_Numbers.Big_Integers is
 
function To_Integer (Arg : Valid_Big_Integer) return Integer is
begin
-  return Integer (From_Bignum (Get_Bignum (Arg)));
+  return Integer (Long_Long_Integer'(From_Bignum (Get_Bignum (Arg))));
end To_Integer;
 

@@ -186,7 +186,7 @@ package body Ada.Numerics.Big_Numbers.Big_Integers is
 
   function From_Big_Integer (Arg : Valid_Big_Integer) return Int is
   begin
- return Int (From_Bignum (Get_Bignum (Arg)));
+ return Int (Long_Long_Long_Integer'(From_Bignum (Get_Bignum (Arg))));
   end From_Big_Integer;
 
end Signed_Conversions;
@@ -214,7 +214,7 @@ package body Ada.Numerics.Big_Numbers.Big_Integers is
 
   function From_Big_Integer (Arg : Valid_Big_Integer) return Int is
   begin
- return Int (From_Bignum (Get_Bignum (Arg)));
+ return Int (Unsigned_128'(From_Bignum (Get_Bignum (Arg))));
   end From_Big_Integer;
 
end Unsigned_Conversions;
diff --git a/gcc/ada/libgnat/s-genbig.adb b/gcc/ada/libgnat/s-genbig.adb
index 85dc40b87d3..183ce3262f0 100644
--- a/gcc/ada/libgnat/s-genbig.adb
+++ b/gcc/ada/libgnat/s-genbig.adb
@@ -49,6 +49,10 @@ package body System.Generic_Bignums is
--  Compose double digit value from two single digit values
 
subtype LLI is Long_Long_Integer;
+   subtype LLLI is Long_Long_Long_Integer;
+
+   LLLI_Is_128 : constant Boolean := Long_Long_Long_Integer'Size = 128;
+   --  True if Long_Long_Long_Integer is 128-bit large
 
One_Data : constant Digit_Vector (1 .. 1) := [1];
--  Constant one
@@ -1041,22 +1045,48 @@ package body System.Generic_Bignums is
   -- From_Bignum --
   -----------------
 
-   function From_Bignum (X : Bignum) return Long_Long_Integer is
+   function From_Bignum (X : Bignum) return Long_Long_Long_Integer is
begin
   if X.Len = 0 then
  return 0;
 
   elsif X.Len = 1 then
- return (if X.Neg then -LLI (X.D (1)) else LLI (X.D (1)));
+ return (if X.Neg then -LLLI (X.D (1)) else LLLI (X.D (1)));
 
   elsif X.Len = 2 then
  declare
 Mag : constant DD := X.D (1) & X.D (2);
  begin
-if X.Neg and then Mag <= 2 ** 63 then
-   return -LLI (Mag);
-elsif Mag < 2 ** 63 then
-   return LLI (Mag);
+if X.Neg and then (Mag <= 2 ** 63 or else LLLI_Is_128) then
+   return -LLLI (Mag);
+elsif Mag < 2 ** 63 or else LLLI_Is_128 then
+   return LLLI (Mag);
+end if;
+ end;
+
+  elsif X.Len = 3 and then LLLI_Is_128 then
+ declare
+Hi  : constant SD := X.D (1);
+Lo  : constant DD := X.D (2) & X.D (3);
+Mag : constant Unsigned_128 :=
+Shift_Left (Unsigned_128 (Hi), 64) + Unsigned_128 (Lo);
+ begin
+return (if X.Neg then -LLLI (Mag) else LLLI (Mag));
+ end;
+
+  elsif X.Len = 4 and then LLLI_Is_128 then
+ declare
+Hi  : constant DD := X.D (1) & X.D (2);
+Lo  : constant DD := X.D (3) & X.D (4);
+Mag : constant Unsigned_128 :=
+Shift_Left (Unsigned_128 (Hi), 64) + Unsigned_128 (Lo);
+ begin
+if X.Neg
+  and then (Hi < 2 ** 63 or else (Hi = 2 ** 63 and then Lo = 0))
+then
+   return -LLLI (Mag);
+elsif Hi < 2 ** 63 then
+   return LLLI (Mag);
 end if;
  end;
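
The two-digit case of From_Bignum can be sketched in C for a 64-bit signed result. The sketch assumes 32-bit digits stored most significant first, and it reports an out-of-range value by returning false, where the Ada code is assumed to raise an exception in the part of the function not shown above. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a sign-magnitude value made of two 32-bit digits (HI first)
   to int64_t.  Returns false if the value does not fit.  */
bool
sketch_from_two_digits (bool neg, uint32_t hi, uint32_t lo, int64_t *out)
{
  const uint64_t mag = ((uint64_t) hi << 32) | lo;
  const uint64_t limit = (uint64_t) 1 << 63;  /* 2 ** 63 */

  if (neg && mag <= limit)
    {
      /* -2 ** 63 is representable, but negating (int64_t) mag would
         overflow, so that one value is handled separately.  */
      *out = (mag == limit) ? INT64_MIN : -(int64_t) mag;
      return true;
    }

  if (!neg && mag < limit)
    {
      *out = (int64_t) mag;
      return true;
    }

  return false;
}
```

The asymmetric bounds (<= for negative values, < for positive ones) mirror the two's-complement range, which holds one more negative value than positive ones.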
   

[COMMITTED] ada: Fix double finalization in conditional exit statement

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The temporary is first finalized through its enclosing block.

gcc/ada/

* exp_ch4.adb (Expand_N_Expression_With_Actions.Process_Action): Do
not look into nested blocks.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch4.adb | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index 95b81fb8e53..b63e47335be 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -5653,14 +5653,17 @@ package body Exp_Ch4 is
 return Skip;
 
  --  Avoid processing temporary function results multiple times when
- --  dealing with nested expression_with_actions.
+ --  dealing with nested expression_with_actions or nested blocks.
  --  Similarly, do not process temporary function results in loops.
  --  This is done by Expand_N_Loop_Statement and Build_Finalizer.
  --  Note that we used to wrongly return Abandon instead of Skip here:
  --  this is wrong since it means that we were ignoring lots of
  --  relevant subsequent statements.
 
- elsif Nkind (Act) in N_Expression_With_Actions | N_Loop_Statement then
+ elsif Nkind (Act) in N_Expression_With_Actions
+| N_Block_Statement
+| N_Loop_Statement
+ then
 return Skip;
  end if;
 
-- 
2.40.0



[COMMITTED] ada: Fix error and crash on imported function with precondition and 'Base

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This fixes a spurious error on an imported function with a precondition
and a parameter declared with a 'Base formal type, and even a crash in
the case where this function is declared in a generic package.

gcc/ada/

* freeze.adb (Wrap_Imported_Subprogram): Use Copy_Subprogram_Spec
to copy the spec from the subprogram to the generated subprogram
body.
(Freeze_Entity): Do not wrap imported subprograms inside generics.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/freeze.adb | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
index f54ae0503a1..df3b5ec944e 100644
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -6127,8 +6127,7 @@ package body Freeze is
 
 Bod :=
   Make_Subprogram_Body (Loc,
-Specification  =>
-  Copy_Separate_Tree (Spec),
+Specification  => Copy_Subprogram_Spec (Spec),
 Declarations   => New_List (
   Make_Subprogram_Declaration (Loc,
 Specification => Copy_Separate_Tree (Spec)),
@@ -6438,7 +6437,9 @@ package body Freeze is
 
 --  Check for needing to wrap imported subprogram
 
-Wrap_Imported_Subprogram (E);
+if not Inside_A_Generic then
+   Wrap_Imported_Subprogram (E);
+end if;
 
 --  Freeze all parameter types and the return type (RM 13.14(14)).
 --  However skip this for internal subprograms. This is also where
-- 
2.40.0



[COMMITTED] ada: Remove outdated part of comment

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Ronan Desplanques 

The concept of extended nodes was retired with the introduction of
variable-sized node types, but a reference to that concept was left
over in a comment. This change removes that reference.

gcc/ada/

* atree.ads: Remove outdated part of comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/atree.ads | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/ada/atree.ads b/gcc/ada/atree.ads
index 50f75cf4d59..329e41954dd 100644
--- a/gcc/ada/atree.ads
+++ b/gcc/ada/atree.ads
@@ -261,8 +261,7 @@ package Atree is
function New_Entity
  (New_Node_Kind : Node_Kind;
   New_Sloc  : Source_Ptr) return Entity_Id;
-   --  Similar to New_Node, except that it is used only for entity nodes
-   --  and returns an extended node.
+   --  Similar to New_Node, except that it is used only for entity nodes.
 
procedure Set_Comes_From_Source_Default (Default : Boolean);
--  Sets value of Comes_From_Source flag to be used in all subsequent
-- 
2.40.0



[COMMITTED] ada: Remove a remaining reference to ?

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

We should no longer use ? anywhere when emitting warnings.

gcc/ada/

* sem_aggr.adb (Get_Value): Use ?? instead of ?.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_aggr.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
index 2ccfe6dcaef..6405fe4c2d4 100644
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -4676,7 +4676,7 @@ package body Sem_Aggr is
 then
Error_Msg_Node_2 := Typ;
Error_Msg_NE
- ("component&? of type& is uninitialized",
+ ("??component& of type& is uninitialized",
   Assoc, Selector_Name);
 
--  An additional reminder if the component type
-- 
2.40.0



[COMMITTED] ada: Fix crash on Ada.Containers with No_Dispatching_Calls restriction

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This makes it so that the compiler does not crash and flags the underlying
violation of the restriction instead.

gcc/ada/

* exp_ch3.adb (Freeze_Type): Do not associate the Finalize_Address
routine for a class-wide type if restriction No_Dispatching_Calls
is in effect.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 363186565f6..33e96a0ff90 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -9251,9 +9251,13 @@ package body Exp_Ch3 is
  --  this is indeed the case, associate the Finalize_Address routine
  --  of the full view with the finalization masters of all pending
  --  access types. This scenario applies to anonymous access types as
- --  well.
+ --  well. But the Finalize_Address routine is missing if the type is
+ --  class-wide and we are under restriction No_Dispatching_Calls, see
+ --  Expand_Freeze_Class_Wide_Type above for the rationale.
 
  elsif Needs_Finalization (Typ)
+   and then (not Is_Class_Wide_Type (Typ)
+  or else not Restriction_Active (No_Dispatching_Calls))
and then Present (Pending_Access_Types (Typ))
  then
 E := First_Elmt (Pending_Access_Types (Typ));
-- 
2.40.0



[COMMITTED] ada: Remove unreferenced utility routine Is_Actual_Tagged_Parameter

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Routine Is_Actual_Tagged_Parameter was added to detect unsupported SPARK
2005 constructs, but this feature was deconstructed in favor of SPARK
2014 and its SPARK_Mode aspects.

gcc/ada/

* sem_util.ads (Is_Actual_Tagged_Parameter): Remove spec.
* sem_util.adb (Is_Actual_Tagged_Parameter): Remove body.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 12 
 gcc/ada/sem_util.ads |  4 
 2 files changed, 16 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index cb0cbf2cf3a..ef591c935eb 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -15235,18 +15235,6 @@ package body Sem_Util is
   end case;
end Is_Actual_Parameter;
 
-   
-   -- Is_Actual_Tagged_Parameter --
-   
-
-   function Is_Actual_Tagged_Parameter (N : Node_Id) return Boolean is
-  Formal : Entity_Id;
-  Call   : Node_Id;
-   begin
-  Find_Actual (N, Formal, Call);
-  return Present (Formal) and then Is_Tagged_Type (Etype (Formal));
-   end Is_Actual_Tagged_Parameter;
-
-
-- Is_Aliased_View --
-
diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
index 060d04241d3..7bb8cdbe3f3 100644
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -1759,10 +1759,6 @@ package Sem_Util is
function Is_Actual_Parameter (N : Node_Id) return Boolean;
--  Determines if N is an actual parameter in a subprogram or entry call
 
-   function Is_Actual_Tagged_Parameter (N : Node_Id) return Boolean;
-   --  Determines if N is an actual parameter of a formal of tagged type in a
-   --  subprogram call.
-
function Is_Aliased_View (Obj : Node_Id) return Boolean;
--  Determine if Obj is an aliased view, i.e. the name of an object to which
--  'Access or 'Unchecked_Access can apply. Note that this routine uses the
-- 
2.40.0



[COMMITTED] ada: Fix source location for crashes in expanded Loop_Entry attributes

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Historically, Loop_Entry attributes were expanded while expanding their
corresponding loops, so it was easier to use location of these loops for
expanded code. Now, these attributes are expanded where they appear, so
we can easily use the location of the attribute reference for expanded
code.

This matters when there is a crash in the expanded code, e.g. because of
a stack overflow in the declaration of a constant object that captures
the Loop_Entry prefix. Now the backtrace will point to the source location
of the attribute, which is more helpful than the location of the loop.
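
As a rough illustration (not taken from the patch), the attribute reference
below is the kind of construct whose expansion saves the prefix in a constant
at loop entry. The procedure and array names are invented for the example;
Loop_Entry and pragma Loop_Invariant are GNAT-specific, and the pragma is only
checked when assertions are enabled (e.g. with -gnata).

```ada
--  Hypothetical example; all names are illustrative only.
procedure Loop_Entry_Demo is
   type Int_Array is array (1 .. 4) of Integer;
   A : Int_Array := (1, 2, 3, 4);
begin
   for I in A'Range loop
      A (I) := A (I) + 1;

      --  A'Loop_Entry denotes the value A had when the loop was entered.
      --  The expansion keeps that value in a constant object; for a very
      --  large array, that saved copy is where a stack overflow could occur.
      pragma Loop_Invariant (A (I) = A'Loop_Entry (I) + 1);
   end loop;
end Loop_Entry_Demo;
```

With the change described above, a crash in that saved copy would presumably
be reported at the line of the pragma rather than at the "for" line.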

gcc/ada/

* exp_attr.adb (Expand_Loop_Entry_Attribute): Use location of the
attribute reference, not of the loop statement.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_attr.adb | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
index 7e71422eba3..a5791adf7dd 100644
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -1354,14 +1354,14 @@ package body Exp_Attr is
 
   --  Local variables
 
-  Pref  : constant Node_Id   := Prefix (N);
-  Base_Typ  : constant Entity_Id := Base_Type (Etype (Pref));
-  Exprs : constant List_Id   := Expressions (N);
+  Pref  : constant Node_Id:= Prefix (N);
+  Base_Typ  : constant Entity_Id  := Base_Type (Etype (Pref));
+  Exprs : constant List_Id:= Expressions (N);
+  Loc   : constant Source_Ptr := Sloc (N);
   Aux_Decl  : Node_Id;
   Blk   : Node_Id := Empty;
   Decls : List_Id;
   Installed : Boolean;
-  Loc   : Source_Ptr;
   Loop_Id   : Entity_Id;
   Loop_Stmt : Node_Id;
   Result: Node_Id := Empty;
@@ -1402,8 +1402,6 @@ package body Exp_Attr is
  Loop_Id := Entity (Identifier (Loop_Stmt));
   end if;
 
-  Loc := Sloc (Loop_Stmt);
-
   --  Step 2: Transform the loop
 
   --  The loop has already been transformed during the expansion of a prior
-- 
2.40.0



[COMMITTED] ada: Remove extra parentheses

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

In preparation for enhancing -gnatyx to check for these automatically.

gcc/ada/

* ali-util.adb, par-endh.adb, par-prag.adb, par-ch2.adb,
checks.adb, fmap.adb, libgnat/a-nbnbig.ads, libgnat/g-dynhta.adb,
libgnat/s-carun8.adb, libgnat/s-strcom.adb, libgnat/a-dhfina.adb,
libgnat/a-direct.adb, libgnat/a-rbtgbo.adb, libgnat/a-strsea.adb,
libgnat/a-ststio.adb, libgnat/a-suenco.adb, libgnat/a-costso.adb,
libgnat/a-strmap.adb, libgnat/g-alleve.adb,
libgnat/g-debpoo.adb, libgnat/g-sercom__linux.adb,
libgnat/s-genbig.adb, libgnat/s-mmap.adb, libgnat/s-regpat.adb,
par-ch5.adb, sem_case.adb, sem_ch12.adb, sem_ch13.adb,
sem_ch8.adb, sem_eval.adb, sem_prag.adb, sem_type.adb,
exp_ch11.adb, exp_ch2.adb, exp_ch3.adb, exp_ch4.adb, exp_ch5.adb,
exp_ch6.adb, exp_ch9.adb, exp_put_image.adb, freeze.adb, live.adb,
sem_aggr.adb, sem_cat.adb, sem_ch10.adb, sem_ch3.adb, sem_ch6.adb,
sem_ch9.adb, sem_disp.adb, sem_elab.adb, sem_res.adb,
sem_util.adb, sinput.adb, uintp.adb, bcheck.adb, binde.adb,
binderr.adb, einfo-utils.adb, clean.adb, sem_ch4.adb, gnatls.adb,
gprep.adb, sem_ch11.adb: Remove extra parentheses.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/ali-util.adb|  2 +-
 gcc/ada/bcheck.adb  |  6 ++---
 gcc/ada/binde.adb   |  4 +--
 gcc/ada/binderr.adb |  4 +--
 gcc/ada/checks.adb  | 10 
 gcc/ada/clean.adb   |  2 +-
 gcc/ada/einfo-utils.adb |  2 +-
 gcc/ada/exp_ch11.adb|  2 +-
 gcc/ada/exp_ch2.adb |  4 +--
 gcc/ada/exp_ch3.adb | 12 -
 gcc/ada/exp_ch4.adb | 26 ++--
 gcc/ada/exp_ch5.adb |  2 +-
 gcc/ada/exp_ch6.adb |  4 +--
 gcc/ada/exp_ch9.adb |  4 +--
 gcc/ada/exp_put_image.adb   |  2 +-
 gcc/ada/fmap.adb|  2 +-
 gcc/ada/freeze.adb  |  8 +++---
 gcc/ada/gnatls.adb  |  4 +--
 gcc/ada/gprep.adb   |  2 +-
 gcc/ada/libgnat/a-costso.adb|  2 +-
 gcc/ada/libgnat/a-dhfina.adb|  2 +-
 gcc/ada/libgnat/a-direct.adb|  4 +--
 gcc/ada/libgnat/a-nbnbig.ads| 12 -
 gcc/ada/libgnat/a-rbtgbo.adb| 18 +++---
 gcc/ada/libgnat/a-strmap.adb|  2 +-
 gcc/ada/libgnat/a-strsea.adb|  2 +-
 gcc/ada/libgnat/a-ststio.adb|  2 +-
 gcc/ada/libgnat/a-suenco.adb|  2 +-
 gcc/ada/libgnat/g-alleve.adb| 10 
 gcc/ada/libgnat/g-debpoo.adb|  2 +-
 gcc/ada/libgnat/g-dynhta.adb|  4 +--
 gcc/ada/libgnat/g-sercom__linux.adb |  2 +-
 gcc/ada/libgnat/s-carun8.adb|  2 +-
 gcc/ada/libgnat/s-genbig.adb|  6 ++---
 gcc/ada/libgnat/s-mmap.adb  |  5 ++--
 gcc/ada/libgnat/s-regpat.adb|  2 +-
 gcc/ada/libgnat/s-strcom.adb|  2 +-
 gcc/ada/live.adb|  2 +-
 gcc/ada/par-ch2.adb |  2 +-
 gcc/ada/par-ch5.adb |  2 +-
 gcc/ada/par-endh.adb| 12 -
 gcc/ada/par-prag.adb|  4 +--
 gcc/ada/sem_aggr.adb|  4 +--
 gcc/ada/sem_case.adb| 10 
 gcc/ada/sem_cat.adb |  2 +-
 gcc/ada/sem_ch10.adb| 10 
 gcc/ada/sem_ch11.adb|  2 +-
 gcc/ada/sem_ch12.adb|  2 +-
 gcc/ada/sem_ch13.adb|  8 +++---
 gcc/ada/sem_ch3.adb |  2 +-
 gcc/ada/sem_ch4.adb |  5 ++--
 gcc/ada/sem_ch6.adb | 10 
 gcc/ada/sem_ch8.adb | 12 -
 gcc/ada/sem_ch9.adb |  4 +--
 gcc/ada/sem_disp.adb|  6 ++---
 gcc/ada/sem_elab.adb|  2 +-
 gcc/ada/sem_eval.adb|  8 +++---
 gcc/ada/sem_prag.adb|  6 ++---
 gcc/ada/sem_res.adb | 38 ++---
 gcc/ada/sem_type.adb|  6 ++---
 gcc/ada/sem_util.adb| 24 +-
 gcc/ada/sinput.adb  |  2 +-
 gcc/ada/uintp.adb   |  2 +-
 63 files changed, 183 insertions(+), 187 deletions(-)

diff --git a/gcc/ada/ali-util.adb b/gcc/ada/ali-util.adb
index c0b8ad60623..2bd5bcac184 100644
--- a/gcc/ada/ali-util.adb
+++ b/gcc/ada/ali-util.adb
@@ -447,7 +447,7 @@ package body ALI.Util is
 Stringt.Release;
  end if;
 
- if (not Read_Only) or else Source.Table (Src).Source_Found then
+ if not Read_Only or else Source.Table (Src).Source_Found then
 if not Source.Table (Src).Source_Found
   or else Sdep.Table (D).Stamp /= Source.Table (Src).Stamp
 then
diff --git a/gcc/ada/bcheck.adb b/gcc/ada/bcheck.adb
index f09de1bef3b..86ed92080bb 

[COMMITTED] ada: Support calls through dereferences in Find_Actual

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Claire Dross 

Return the corresponding formal in the designated subprogram profile in
that case.
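
For readers unfamiliar with the construct, here is a minimal sketch of a call
through a dereference. It is not part of the patch; the type and subprogram
names are made up. In the call P.all (V), V is the actual and X, from the
profile designated by Proc_Access, is the corresponding formal.

```ada
--  Hypothetical example; all names are illustrative only.
procedure Deref_Call_Demo is
   type Proc_Access is access procedure (X : in out Integer);

   procedure Increment (X : in out Integer) is
   begin
      X := X + 1;
   end Increment;

   P : constant Proc_Access := Increment'Access;
   V : Integer := 0;
begin
   --  Call through an explicit dereference: the name of the call is
   --  P.all, not an entity name, so the formal must be looked up in the
   --  access-to-subprogram type's designated profile.
   P.all (V);
   pragma Assert (V = 1);
end Deref_Call_Demo;
```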

gcc/ada/

* sem_util.adb (Find_Actual): On calls through dereferences,
return the corresponding formal in the designated subprogram
profile.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 46 
 1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index ef591c935eb..3ea7ef506df 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -8604,6 +8604,7 @@ package body Sem_Util is
   Context  : constant Node_Id := Parent (N);
   Actual   : Node_Id;
   Call_Nam : Node_Id;
+  Call_Ent : Node_Id := Empty;
 
begin
   if Nkind (Context) in N_Indexed_Component | N_Selected_Component
@@ -8652,13 +8653,42 @@ package body Sem_Util is
 Call_Nam := Selector_Name (Call_Nam);
  end if;
 
- if Is_Entity_Name (Call_Nam)
-   and then Present (Entity (Call_Nam))
-   and then (Is_Generic_Subprogram (Entity (Call_Nam))
-  or else Is_Overloadable (Entity (Call_Nam))
-  or else Ekind (Entity (Call_Nam)) in E_Entry_Family
- | E_Subprogram_Body
- | E_Subprogram_Type)
+ --  If Call_Nam is an entity name, get its entity
+
+ if Is_Entity_Name (Call_Nam) then
+Call_Ent := Entity (Call_Nam);
+
+ --  If it is a dereference, get the designated subprogram type
+
+ elsif Nkind (Call_Nam) = N_Explicit_Dereference then
+declare
+   Typ : Entity_Id := Etype (Prefix (Call_Nam));
+begin
+   if Present (Full_View (Typ)) then
+  Typ := Full_View (Typ);
+   elsif Is_Private_Type (Typ)
+ and then Present (Underlying_Full_View (Typ))
+   then
+  Typ := Underlying_Full_View (Typ);
+   end if;
+
+   if Is_Access_Type (Typ) then
+  Call_Ent := Directly_Designated_Type (Typ);
+   else
+  pragma Assert (Has_Implicit_Dereference (Typ));
+  Formal := Empty;
+  Call   := Empty;
+  return;
+   end if;
+end;
+ end if;
+
+ if Present (Call_Ent)
+   and then (Is_Generic_Subprogram (Call_Ent)
+  or else Is_Overloadable (Call_Ent)
+  or else Ekind (Call_Ent) in E_Entry_Family
+| E_Subprogram_Body
+| E_Subprogram_Type)
and then not Is_Overloaded (Call_Nam)
  then
 --  If node is name in call it is not an actual
@@ -8672,7 +8702,7 @@ package body Sem_Util is
 --  Fall here if we are definitely a parameter
 
 Actual := First_Actual (Call);
-Formal := First_Formal (Entity (Call_Nam));
+Formal := First_Formal (Call_Ent);
 while Present (Formal) and then Present (Actual) loop
if Actual = N then
   return;
-- 
2.40.0



[COMMITTED] ada: Accept Assert pragmas in expression functions

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

gcc/ada/

* sem_ch4.adb (Analyze_Expression_With_Actions.Check_Action_Ok):
Accept an executable pragma occurring in a declare expression as
per AI22-0045. This means Assert and Inspection_Point pragmas as
well as any implementation-defined pragmas that the implementation
chooses to categorize as executable. Currently Assume and Debug
are the only such pragmas.
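
A small sketch of what this is meant to accept, assuming Ada 2022 mode and a
compiler that includes this change (older compilers are expected to reject the
pragma with the "object renaming or constant declaration expected" message).
The function and unit names are invented for the example.

```ada
pragma Ada_2022;

--  Hypothetical example; all names are illustrative only.
procedure Declare_Expr_Demo is

   --  Expression function whose body is a declare expression that
   --  contains an executable pragma between its declare items.
   function Half (X : Natural) return Natural is
     (declare
         Q : constant Natural := X / 2;
         pragma Assert (Q <= X);
      begin
         Q);

begin
   pragma Assert (Half (10) = 5);
   null;
end Declare_Expr_Demo;
```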

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch4.adb | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index e9c5b9f8a33..5b013dfb63d 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -2411,10 +2411,27 @@ package body Sem_Ch4 is
   return; -- ???For now; the RM rule is a bit more complicated
end if;
 
+when N_Pragma =>
+   declare
+  --  See AI22-0045 pragma categorization.
+  subtype Executable_Pragma_Id is Pragma_Id
+with Predicate => Executable_Pragma_Id in
+--  language-defined executable pragmas
+  Pragma_Assert | Pragma_Inspection_Point
+
+--  GNAT-defined executable pragmas
+| Pragma_Assume | Pragma_Debug;
+   begin
+  if Get_Pragma_Id (A) in Executable_Pragma_Id then
+ return;
+  end if;
+   end;
+
 when others =>
-   null; -- Nothing else allowed, not even pragmas
+   null; -- Nothing else allowed
  end case;
 
+ --  We could mention pragmas in the message text; let's not.
  Error_Msg_N ("object renaming or constant declaration expected", A);
   end Check_Action_OK;
 
-- 
2.40.0



[COMMITTED] ada: Improve -gnatyx style check

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

Check redundant parentheses in many more places, for now only under
-gnatdQ, while pending violations are fixed.
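
To make "redundant parentheses" concrete, the sketch below contrasts the two
spellings, modeled on the ali-util.adb cleanup. It is an illustration only;
the unit and parameter names are made up, and whether a particular compiler
reports the first form depends on the -gnatdQ gating mentioned above.

```ada
--  Hypothetical example; all names are illustrative only.
procedure Parens_Demo is

   --  Parenthesized operand: "not" already binds more tightly than
   --  "or else", so the parentheses add nothing.
   function With_Parens (Read_Only, Found : Boolean) return Boolean is
   begin
      if (not Read_Only) or else Found then
         return True;
      end if;

      return False;
   end With_Parens;

   --  Same condition without the extra parentheses.
   function Without_Parens (Read_Only, Found : Boolean) return Boolean is
   begin
      if not Read_Only or else Found then
         return True;
      end if;

      return False;
   end Without_Parens;

begin
   --  Both spellings compute the same result for every input.
   for R in Boolean loop
      for F in Boolean loop
         pragma Assert (With_Parens (R, F) = Without_Parens (R, F));
      end loop;
   end loop;
end Parens_Demo;
```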

gcc/ada/

* par-ch3.adb, sem_ch4.adb (P_Discrete_Range, Analyze_Logical_Op,
Analyze_Short_Circuit): Add calls to Check_Xtra_Parens.
* par-ch5.adb (P_Condition): Move logic to Check_Xtra_Parens.
* style.ads, styleg.adb, styleg.ads (Check_Xtra_Parens): Move logic
related to expressions requiring parentheses here.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch3.adb | 10 ++
 gcc/ada/par-ch5.adb | 17 +++--
 gcc/ada/sem_ch4.adb | 27 +++
 gcc/ada/style.ads   |  7 ---
 gcc/ada/styleg.adb  | 20 +---
 gcc/ada/styleg.ads  |  7 ---
 6 files changed, 65 insertions(+), 23 deletions(-)

diff --git a/gcc/ada/par-ch3.adb b/gcc/ada/par-ch3.adb
index b763d414763..7126afbfbeb 100644
--- a/gcc/ada/par-ch3.adb
+++ b/gcc/ada/par-ch3.adb
@@ -3064,10 +3064,20 @@ package body Ch3 is
   elsif Token = Tok_Dot_Dot then
  Range_Node := New_Node (N_Range, Token_Ptr);
  Set_Low_Bound (Range_Node, Expr_Node);
+
+ if Style_Check then
+Style.Check_Xtra_Parens (Expr_Node);
+ end if;
+
  Scan; -- past ..
  Expr_Node := P_Expression;
  Check_Simple_Expression (Expr_Node);
  Set_High_Bound (Range_Node, Expr_Node);
+
+ if Style_Check then
+Style.Check_Xtra_Parens (Expr_Node);
+ end if;
+
  return Range_Node;
 
   --  Otherwise we must have a subtype mark, or an Ada 2012 iterator
diff --git a/gcc/ada/par-ch5.adb b/gcc/ada/par-ch5.adb
index 8f7224517bc..6099a78effb 100644
--- a/gcc/ada/par-ch5.adb
+++ b/gcc/ada/par-ch5.adb
@@ -1355,22 +1355,11 @@ package body Ch5 is
 
  return Cond;
 
-  --  Otherwise check for redundant parentheses but do not emit messages
-  --  about expressions that require parentheses (e.g. conditional,
-  --  quantified or declaration expressions).
+  --  Otherwise check for redundant parentheses
 
   else
- if Style_Check
-   and then
- Paren_Count (Cond) >
-   (if Nkind (Cond) in N_Case_Expression
- | N_Expression_With_Actions
- | N_If_Expression
- | N_Quantified_Expression
-then 1
-else 0)
- then
-Style.Check_Xtra_Parens (First_Sloc (Cond));
+ if Style_Check then
+Style.Check_Xtra_Parens (Cond, Enable => True);
  end if;
 
  --  And return the result
diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index 03737db90d4..e9c5b9f8a33 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -65,6 +65,7 @@ with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
 with Sinfo.Utils;use Sinfo.Utils;
 with Snames; use Snames;
+with Style;  use Style;
 with Tbuild; use Tbuild;
 with Uintp;  use Uintp;
 with Warnsw; use Warnsw;
@@ -3134,6 +3135,20 @@ package body Sem_Ch4 is
 
   Operator_Check (N);
   Check_Function_Writable_Actuals (N);
+
+  if Style_Check then
+ if Nkind (L) not in N_Short_Circuit | N_Op_And | N_Op_Or | N_Op_Xor
+   and then Is_Boolean_Type (Etype (L))
+ then
+Check_Xtra_Parens (L);
+ end if;
+
+ if Nkind (R) not in N_Short_Circuit | N_Op_And | N_Op_Or | N_Op_Xor
+   and then Is_Boolean_Type (Etype (R))
+ then
+Check_Xtra_Parens (R);
+ end if;
+  end if;
end Analyze_Logical_Op;
 
---
@@ -6006,6 +6021,18 @@ package body Sem_Ch4 is
  Resolve (R, Standard_Boolean);
  Set_Etype (N, Standard_Boolean);
   end if;
+
+  if Style_Check then
+ if Nkind (L) not in N_Short_Circuit | N_Op_And | N_Op_Or | N_Op_Xor
+ then
+Check_Xtra_Parens (L);
+ end if;
+
+ if Nkind (R) not in N_Short_Circuit | N_Op_And | N_Op_Or | N_Op_Xor
+ then
+Check_Xtra_Parens (R);
+ end if;
+  end if;
end Analyze_Short_Circuit;
 
---
diff --git a/gcc/ada/style.ads b/gcc/ada/style.ads
index 35118f4d094..4a7faff31e3 100644
--- a/gcc/ada/style.ads
+++ b/gcc/ada/style.ads
@@ -28,6 +28,7 @@
 --  gathered in a separate package so that they can more easily be customized.
 --  Calls to these subprograms are only made if Opt.Style_Check is set True.
 
+with Debug; use Debug;
 with Errout;
 with Styleg;
 with Types;use Types;
@@ -192,10 +193,10 @@ package Style is
  renames Style_Inst.Check_Vertical_Bar;
--  Called after scanning a vertical bar to check spacing
 
-   procedure Check_Xtra_Parens (Loc : Source_Ptr)
+   procedure Check_Xtra_Parens (N : Node_Id; Enable : Boolean := Debug_Flag_

[COMMITTED] ada: Remove redundant protection against empty lists

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Calls to List_Length on No_List intentionally return 0 (and likewise
calls to First on No_List intentionally return Empty), so explicit guards
against No_List are unnecessary. Code cleanup; semantics are unaffected.
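
The reasoning can be sketched with a toy model. This is not the compiler's
Nlists package: the List type, No_List constant and List_Length function below
are simplified stand-ins that only mimic the "length of no list is 0"
convention, to show why the guarded and unguarded forms agree.

```ada
--  Toy model only; it mimics the convention, not the real Nlists API.
procedure No_List_Demo is
   type Node;
   type List is access Node;
   type Node is record
      Value : Integer;
      Next  : List;
   end record;

   No_List : constant List := null;

   --  Returns 0 for No_List, like the convention described above
   function List_Length (L : List) return Natural is
      Count : Natural := 0;
      Cur   : List    := L;
   begin
      while Cur /= null loop
         Count := Count + 1;
         Cur   := Cur.Next;
      end loop;

      return Count;
   end List_Length;

   L       : constant List := No_List;
   Guarded : Natural := 0;
begin
   --  Guarded form, as in the code being cleaned up
   if L /= No_List then
      Guarded := List_Length (L);
   end if;

   --  Unguarded form gives the same answer
   pragma Assert (Guarded = List_Length (L));
end No_List_Demo;
```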

gcc/ada/

* exp_aggr.adb (Aggregate_Size): Remove redundant calls to
Present.
* exp_ch5.adb (Expand_N_If_Statement): Likewise.
* sem_prag.adb (Analyze_Pragma): Likewise.
* sem_warn.adb (Find_Var): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb |  8 +++-
 gcc/ada/exp_ch5.adb  |  1 -
 gcc/ada/sem_prag.adb | 25 ++---
 gcc/ada/sem_warn.adb |  2 +-
 4 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index 58831bd51ca..e4b1991f410 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -7397,7 +7397,7 @@ package body Exp_Aggr is
  Comp   : Node_Id;
  Choice : Node_Id;
  Lo, Hi : Node_Id;
- Siz : Int := 0;
+ Siz: Int;
 
  procedure Add_Range_Size;
  --  Compute number of components specified by a component association
@@ -7422,11 +7422,9 @@ package body Exp_Aggr is
  end Add_Range_Size;
 
   begin
- --  Aggregate is either all positional or all named.
+ --  Aggregate is either all positional or all named
 
- if Present (Expressions (N)) then
-Siz := List_Length (Expressions (N));
- end if;
+ Siz := List_Length (Expressions (N));
 
  if Present (Component_Associations (N)) then
 Comp := First (Component_Associations (N));
diff --git a/gcc/ada/exp_ch5.adb b/gcc/ada/exp_ch5.adb
index 0c89856b58b..dfe1112f341 100644
--- a/gcc/ada/exp_ch5.adb
+++ b/gcc/ada/exp_ch5.adb
@@ -4743,7 +4743,6 @@ package body Exp_Ch5 is
 and then not Opt.Suppress_Control_Flow_Optimizations
 and then Nkind (N) = N_If_Statement
 and then No (Elsif_Parts (N))
-and then Present (Else_Statements (N))
 and then List_Length (Then_Statements (N)) = 1
 and then List_Length (Else_Statements (N)) = 1
   then
diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 36c1add5ea4..5fe5d6a2d0f 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -11699,29 +11699,24 @@ package body Sem_Prag is
 
   --  Preset arguments
 
-  Arg_Count := 0;
-  Arg1  := Empty;
+  Arg_Count := List_Length (Pragma_Argument_Associations (N));
+  Arg1  := First (Pragma_Argument_Associations (N));
   Arg2  := Empty;
   Arg3  := Empty;
   Arg4  := Empty;
   Arg5  := Empty;
 
-  if Present (Pragma_Argument_Associations (N)) then
- Arg_Count := List_Length (Pragma_Argument_Associations (N));
- Arg1 := First (Pragma_Argument_Associations (N));
-
- if Present (Arg1) then
-Arg2 := Next (Arg1);
+  if Present (Arg1) then
+ Arg2 := Next (Arg1);
 
-if Present (Arg2) then
-   Arg3 := Next (Arg2);
+ if Present (Arg2) then
+Arg3 := Next (Arg2);
 
-   if Present (Arg3) then
-  Arg4 := Next (Arg3);
+if Present (Arg3) then
+   Arg4 := Next (Arg3);
 
-  if Present (Arg4) then
- Arg5 := Next (Arg4);
-  end if;
+   if Present (Arg4) then
+  Arg5 := Next (Arg4);
end if;
 end if;
  end if;
diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index 834d48d311c..5dd7c17d4e2 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -353,7 +353,7 @@ package body Sem_Warn is
 begin
--  One argument, so check the argument
 
-   if Present (PA) and then List_Length (PA) = 1 then
+   if List_Length (PA) = 1 then
   if Nkind (First (PA)) = N_Parameter_Association then
  Find_Var (Explicit_Actual_Parameter (First (PA)));
   else
-- 
2.40.0



[COMMITTED] ada: Fix spurious freezing error on nonabstract null extension

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This prevents the wrapper function created for each nonoverridden inherited
function with a controlling result of nonabstract null extensions of tagged
types from causing premature freezing of types referenced in its profile.
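
For context, a minimal sketch of the situation in which such a wrapper is
created. It is not the reproducer for the spurious error (none is given here);
the package and type names are invented. Child is a nonabstract null extension
that inherits Make, a function with a controlling result, without overriding
it.

```ada
--  Hypothetical example; all names are illustrative only.
procedure Null_Ext_Demo is

   package Shapes is
      type Root is tagged record
         Id : Integer := 0;
      end record;

      --  Primitive function with a controlling result
      function Make return Root is (Root'(Id => 1));

      --  Nonabstract null extension: Make is inherited and need not be
      --  overridden, so the compiler supplies a wrapper for Child.
      type Child is new Root with null record;
   end Shapes;

   --  Resolves to the Make inherited by Child
   C : constant Shapes.Child := Shapes.Make;

begin
   pragma Assert (C.Id = 1);
   null;
end Null_Ext_Demo;
```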

gcc/ada/

* exp_ch3.adb (Make_Controlling_Function_Wrappers): Create the body
as the expanded body of an expression function.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index b8ab549c0fc..3a023092532 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -11109,9 +11109,10 @@ package body Exp_Ch3 is
 Null_Record_Present => True);
 
 --  GNATprove will use expression of an expression function as an
---  implicit postcondition. GNAT will not benefit from expression
---  function (and would struggle if we add an expression function
---  to freezing actions).
+--  implicit postcondition. GNAT will also benefit from expression
+--  function to avoid premature freezing, but would struggle if we
+--  added an expression function to freezing actions, so we create
+--  the expanded form directly.
 
 if GNATprove_Mode then
Func_Body :=
@@ -11130,6 +11131,7 @@ package body Exp_Ch3 is
Statements => New_List (
  Make_Simple_Return_Statement (Loc,
Expression => Ext_Aggr;
+   Set_Was_Expression_Function (Func_Body);
 end if;
 
 Append_To (Body_List, Func_Body);
-- 
2.40.0



[COMMITTED] ada: Fix spurious warning on Inline_Always and contracts

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Warnings about pre/postconditions being ignored with Inline_Always were
only true for the obsolete frontend inlining. With the current backend
pre/postconditions work fine with Inline_Always.
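
The combination in question looks roughly like the sketch below. The names are
made up, and the example makes no claim about which diagnostics a given
compiler prints; per the patch, the warning is now tied to frontend inlining
in addition to the existing conditions.

```ada
--  Hypothetical example; all names are illustrative only.
procedure Inline_Contract_Demo is

   --  Subprogram that carries contracts and is subject to Inline_Always
   function Half (X : Natural) return Natural
     with Pre  => X mod 2 = 0,
          Post => Half'Result * 2 = X;
   pragma Inline_Always (Half);

   function Half (X : Natural) return Natural is
   begin
      return X / 2;
   end Half;

   R : constant Natural := Half (10);

begin
   pragma Assert (R = 5);
   null;
end Inline_Contract_Demo;
```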

gcc/ada/

* sem_prag.adb (Check_Postcondition_Use_In_Inlined_Subprogram): Only
emit warning when frontend inlining is enabled.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index b6c78dbd559..dbc8584e211 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -210,7 +210,7 @@ package body Sem_Prag is
--  Subsidiary to the analysis of pragmas Contract_Cases, Postcondition,
--  Precondition, Refined_Post, and Test_Case. Emit a warning when pragma
--  Prag is associated with subprogram Spec_Id subject to Inline_Always,
-   --  and assertions are enabled.
+   --  assertions are enabled and inling is done in the frontend.
 
procedure Check_State_And_Constituent_Use
  (States   : Elist_Id;
@@ -30304,6 +30304,7 @@ package body Sem_Prag is
   if Warn_On_Redundant_Constructs
 and then Has_Pragma_Inline_Always (Spec_Id)
 and then Assertions_Enabled
+and then not Back_End_Inlining
   then
  Error_Msg_Name_1 := Original_Aspect_Pragma_Name (Prag);
 
-- 
2.40.0



[COMMITTED] ada: Fix missing finalization in library-unit instance spec

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This fixes the missing finalization of objects declared in the spec of
package instances that are library units (and only them, i.e. not all
library-level package instances) when the instances have a package body.

The finalization is done when there is no package body, and supporting
this case precisely broke the other case because of a thinko or a typo.
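
A sketch of the affected configuration, under the assumption that the four
compilation units below are split into separate files (for instance with
gnatchop). This is not the original reproducer and the names are invented. Gen
has a body, so the library-unit instance Inst has one too, and Obj is an
object in the instance spec that needs finalization.

```ada
--  Hypothetical example; all names are illustrative only.
with Ada.Finalization;

generic
package Gen is
   type Tracker is
     new Ada.Finalization.Limited_Controlled with null record;
   overriding procedure Finalize (T : in out Tracker);

   Obj : Tracker;
   --  Declared in the spec; must be finalized at program exit

   procedure Touch;
   --  Forces the generic, and thus every instance, to have a body
end Gen;

package body Gen is
   overriding procedure Finalize (T : in out Tracker) is
   begin
      null;  --  an observable side effect would go here
   end Finalize;

   procedure Touch is
   begin
      null;
   end Touch;
end Gen;

--  Package instance that is itself a library unit
with Gen;
package Inst is new Gen;

with Inst;
procedure Main is
begin
   Inst.Touch;
end Main;
```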

This also requires a small adjustment to the routine writing ALI files.

gcc/ada/

* exp_ch7.adb (Build_Finalizer): Reverse the test comparing the
instantiation and declaration nodes of a package instance, and
therefore bail out only when they are equal.  Adjust comments.
(Expand_N_Package_Declaration): Do not clear the Finalizer field.
* lib-writ.adb: Add with and use clauses for Sem_Util.
(Write_Unit_Information): Look at unit nodes to find finalizers.
* sem_ch12.adb (Analyze_Package_Instantiation): Beef up the comment
about the rewriting of the instantiation node into a declaration.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb  | 18 +-
 gcc/ada/lib-writ.adb | 19 +++
 gcc/ada/sem_ch12.adb | 10 ++
 3 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 7ea39f7ba16..a02e28e4b34 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -3534,15 +3534,21 @@ package body Exp_Ch7 is
 and then
   (not Is_Library_Level_Entity (Spec_Id)
 
---  Nested packages are library level entities, but do not need to
+--  Nested packages are library-level entities, but do not need to
 --  be processed separately.
 
 or else Scope_Depth (Spec_Id) /= Uint_1
+
+--  Do not build two finalizers for an instance without body that
+--  is a library unit (see Analyze_Package_Instantiation).
+
 or else (Is_Generic_Instance (Spec_Id)
-  and then Package_Instantiation (Spec_Id) /= N))
+  and then Package_Instantiation (Spec_Id) = N))
 
- --  Still need to process package body instantiations which may
- --  contain objects requiring finalization.
+ --  Still need to process library-level package body instances, whose
+ --  instantiation was deferred and thus could not be seen during the
+ --  processing of the enclosing scope, and which may contain objects
+ --  requiring finalization.
 
 and then not
   (For_Package_Body
@@ -5376,7 +5382,9 @@ package body Exp_Ch7 is
 Defer_Abort => False,
 Fin_Id  => Fin_Id);
 
- Set_Finalizer (Id, Fin_Id);
+ if Present (Fin_Id) then
+Set_Finalizer (Id, Fin_Id);
+ end if;
   end if;
 
   --  If this is a library-level package and unnesting is enabled,
diff --git a/gcc/ada/lib-writ.adb b/gcc/ada/lib-writ.adb
index deecfc067c5..23b6266bb41 100644
--- a/gcc/ada/lib-writ.adb
+++ b/gcc/ada/lib-writ.adb
@@ -50,6 +50,7 @@ with Rident; use Rident;
 with Stand;  use Stand;
 with Scn;use Scn;
 with Sem_Eval;   use Sem_Eval;
+with Sem_Util;   use Sem_Util;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
 with Sinfo.Utils;use Sinfo.Utils;
@@ -524,10 +525,20 @@ package body Lib.Writ is
  Write_Info_Str (" O");
  Write_Info_Char (OA_Setting (Unit_Num));
 
- if Ekind (Uent) in E_Package | E_Package_Body
-   and then Present (Finalizer (Uent))
- then
-Write_Info_Str (" PF");
+ --  For a package instance with a body that is a library unit, the two
+ --  compilation units share Cunit_Entity so we cannot rely on Uent.
+
+ if Ukind in N_Package_Declaration | N_Package_Body then
+declare
+   E : constant Entity_Id := Defining_Entity (Unit (Unode));
+
+begin
+   if Ekind (E) in E_Package | E_Package_Body
+ and then Present (Finalizer (E))
+   then
+  Write_Info_Str (" PF");
+   end if;
+end;
  end if;
 
  if Is_Preelaborated (Uent) then
diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index 181392c2132..c31d0c62faa 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -5007,10 +5007,12 @@ package body Sem_Ch12 is
  Set_First_Private_Entity (Defining_Unit_Name (Unit_Renaming),
First_Private_Entity (Act_Decl_Id));
 
- --  If the instantiation will receive a body, the unit will be
- --  transformed into a package body, and receive its own elaboration
- --  entity. Otherwise, the nature of the unit is now a package
- --  declaration.
+ --  If the instantiation needs a body, the unit will be turned into
+ --  a package body and receive it

[COMMITTED] ada: Add warning on frontend inlining of Subprogram_Variant

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

We already warn when contracts such as pre/postconditions appear together
with pragma Inline_Always and are ignored by frontend inlining.

For consistency we now also warn for Subprogram_Variant, which is
similarly ignored, even though this contract is only meaningful for
recursive subprograms and those can't be inlined anyway (but an error about
this might only be emitted when full compilation is done).

gcc/ada/

* sem_prag.adb
(Check_Postcondition_Use_In_Inlined_Subprogram): Mention
Subprogram_Variant in the comment.
(Analyze_Subprogram_Variant_In_Decl_Part): Warn when contract is
ignored because of pragma Inline_Always and frontend inlining.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index dbc8584e211..feaf486c348 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -208,9 +208,10 @@ package body Sem_Prag is
  (Prag: Node_Id;
   Spec_Id : Entity_Id);
--  Subsidiary to the analysis of pragmas Contract_Cases, Postcondition,
-   --  Precondition, Refined_Post, and Test_Case. Emit a warning when pragma
-   --  Prag is associated with subprogram Spec_Id subject to Inline_Always,
-   --  assertions are enabled and inling is done in the frontend.
+   --  Precondition, Refined_Post, Subprogram_Variant, and Test_Case. Emit a
+   --  warning when pragma Prag is associated with subprogram Spec_Id subject
+   --  to Inline_Always, assertions are enabled and inling is done in the
+   --  frontend.
 
procedure Check_State_And_Constituent_Use
  (States   : Elist_Id;
@@ -29627,6 +29628,11 @@ package body Sem_Prag is
 End_Scope;
  end if;
 
+ --  Currently it is not possible to inline Subprogram_Variant on a
+ --  subprogram subject to pragma Inline_Always.
+
+ Check_Postcondition_Use_In_Inlined_Subprogram (N, Spec_Id);
+
   --  Otherwise the pragma is illegal
 
   else
-- 
2.40.0



[COMMITTED] ada: Add Is_Past_Self_Hiding_Point flag

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Bob Duff 

This patch adds a flag Is_Past_Self_Hiding_Point. When False,
this will replace E_Void as the indicator for a premature use of
a declaration within itself -- for example, "X : T := X;".

One might think this flag should be called something like
Is_Hidden_From_All_Visibility, reversing the sense of
Is_Past_Self_Hiding_Point. We don't do that because we want
Is_Past_Self_Hiding_Point to be initially False by default (and we have
no mechanism for defaulting to True), and because it doesn't exactly
match the RM definition of "hidden from all visibility" (for
example, for record components).

This is work in progress; more changes are needed before we
can remove all Mutate_Ekind(..., E_Void).
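
For readers less familiar with the rule, here is a minimal sketch of the
premature-use case the flag is meant to detect. It is not part of the patch;
the names Self_Hiding_Demo and Inner are made up for illustration.

```ada
--  Illustrative only: Self_Hiding_Demo and Inner are hypothetical names
--  and do not appear in the patch.
with Ada.Text_IO; use Ada.Text_IO;

procedure Self_Hiding_Demo is
   X : constant Integer := 1;
begin
   Inner : declare
      --  Illegal if uncommented: within its own declaration the inner X
      --  is hidden from all visibility, and it already hides the outer X
      --  from direct visibility.
      --  X : constant Integer := X;

      --  Legal: the expanded name denotes the outer X
      X : constant Integer := Self_Hiding_Demo.X + 1;
   begin
      Put_Line (Integer'Image (X));
   end Inner;
end Self_Hiding_Demo;
```

The commented-out line has the same shape as the "X : T := X;" example
above; the expanded name is only one way to make the declaration legal.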

gcc/ada/

* einfo.ads (Is_Past_Self_Hiding_Point): Document.
* gen_il-fields.ads (Is_Past_Self_Hiding_Point): Add to list of
fields.
* gen_il-gen-gen_entities.adb (Is_Past_Self_Hiding_Point): Declare
in all entities.
* exp_aggr.adb: Set Is_Past_Self_Hiding_Point as appropriate.
* sem.adb: Likewise.
* sem_aggr.adb: Likewise.
* sem_ch11.adb: Likewise.
* sem_ch12.adb: Likewise.
* sem_ch5.adb: Likewise.
* sem_ch7.adb: Likewise.
* sem_prag.adb: Likewise.
* sem_ch6.adb: Likewise.
(Set_Formal_Mode): Minor cleanup: Move from spec.
* sem_ch6.ads:
(Set_Formal_Mode): Minor cleanup: Move to body.
* cstand.adb: Call Set_Is_Past_Self_Hiding_Point on all entities
as soon as they are created.
* comperr.adb (Compiler_Abort): Minor cleanup -- use 'in' instead
of 'or else'.
* debug.adb: Minor comment cleanups.
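
The comperr.adb cleanup relies on Ada 2012 set membership tests. A small
standalone sketch of the same idiom follows; Membership_Demo and
Strip_Final_Slash are hypothetical names, and the empty-string guard is an
addition for the sketch, not something the patch does.

```ada
--  Illustrative only: Membership_Demo and Strip_Final_Slash are
--  hypothetical and are not taken from comperr.adb.
with Ada.Text_IO; use Ada.Text_IO;

procedure Membership_Demo is

   function Strip_Final_Slash (Name : String) return String is
   begin
      --  One membership test replaces two comparisons joined by "or else"
      if Name'Length > 0 and then Name (Name'Last) in '/' | '\' then
         return Name (Name'First .. Name'Last - 1);
      else
         return Name;
      end if;
   end Strip_Final_Slash;

begin
   Put_Line (Strip_Final_Slash ("x86_64-pc-linux-gnu/"));
   Put_Line (Strip_Final_Slash ("x86_64-pc-linux-gnu"));
end Membership_Demo;
```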

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/comperr.adb |  6 ++
 gcc/ada/cstand.adb  |  4 +++-
 gcc/ada/debug.adb   | 23 +--
 gcc/ada/einfo.ads   | 13 +
 gcc/ada/exp_aggr.adb|  1 +
 gcc/ada/gen_il-fields.ads   |  1 +
 gcc/ada/gen_il-gen-gen_entities.adb |  1 +
 gcc/ada/sem.adb | 23 +++
 gcc/ada/sem_aggr.adb|  3 +++
 gcc/ada/sem_ch11.adb|  1 +
 gcc/ada/sem_ch12.adb|  5 +
 gcc/ada/sem_ch5.adb |  4 
 gcc/ada/sem_ch6.adb |  8 
 gcc/ada/sem_ch6.ads |  3 ---
 gcc/ada/sem_ch7.adb |  9 ++---
 gcc/ada/sem_prag.adb|  9 +
 16 files changed, 89 insertions(+), 25 deletions(-)

diff --git a/gcc/ada/comperr.adb b/gcc/ada/comperr.adb
index 4fc0e5d3baa..c52db7b0c23 100644
--- a/gcc/ada/comperr.adb
+++ b/gcc/ada/comperr.adb
@@ -177,10 +177,8 @@ package body Comperr is
 
  --  Output target name, deleting junk final reverse slash
 
- if Target_Name.all (Target_Name.all'Last) = '\'
-   or else Target_Name.all (Target_Name.all'Last) = '/'
- then
-Write_Str (Target_Name.all (1 .. Target_Name.all'Last - 1));
+ if Target_Name (Target_Name'Last) in '/' | '\' then
+Write_Str (Target_Name (1 .. Target_Name'Last - 1));
  else
 Write_Str (Target_Name.all);
  end if;
diff --git a/gcc/ada/cstand.adb b/gcc/ada/cstand.adb
index 72c287a8739..f53015d1e0c 100644
--- a/gcc/ada/cstand.adb
+++ b/gcc/ada/cstand.adb
@@ -1784,6 +1784,7 @@ package body CStand is
 
   Set_Is_Immediately_Visible  (Ident_Node, True);
   Set_Is_Intrinsic_Subprogram (Ident_Node, True);
+  Set_Is_Past_Self_Hiding_Point (Ident_Node);
 
   Set_Name_Entity_Id (Op, Ident_Node);
   Append_Entity (Ident_Node, Standard_Standard);
@@ -1806,9 +1807,10 @@ package body CStand is
   Set_Is_Public (E);
 
   --  All standard entity names are analyzed manually, and are thus
-  --  frozen as soon as they are created.
+  --  frozen and not self-hidden as soon as they are created.
 
   Set_Is_Frozen (E);
+  Set_Is_Past_Self_Hiding_Point (E);
 
   --  Set debug information required for all standard types
 
diff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
index 7497fa04076..9566e095d1a 100644
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -41,7 +41,7 @@ package body Debug is
--  dh   Generate listing showing loading of name table hash chains
--  di   Generate messages for visibility linking/delinking
--  dj   Suppress "junk null check" for access parameter values
-   --  dk   Generate GNATBUG message on abort, even if previous errors
+   --  dk   Generate "GNAT BUG" message on abort, even if previous errors
--  dl   Generate unit load trace messages
--  dm   Prevent special frontend inlining in GNATprove mode
--  dn   Generate messages for node/list allocation
@@ -113,7 +113,7 @@ package body Debug is
--  d.z  Restore previous support for frontend handling of Inline_Always
 
--  d.A  Enable statistics printing in Atree

Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-22 Thread Robin Dapp via Gcc-patches
As discussed with Juzhe off-list, I will rebase this patch against
Juzhe's vec_cmp/vcond patch once that hits the trunk.

Regards
 Robin


[COMMITTED] ada: Add missing word in comment

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Ronan Desplanques 

gcc/ada/

* par-ch3.adb: Add missing word in comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch3.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/par-ch3.adb b/gcc/ada/par-ch3.adb
index 7126afbfbeb..a71056b20a0 100644
--- a/gcc/ada/par-ch3.adb
+++ b/gcc/ada/par-ch3.adb
@@ -1466,7 +1466,7 @@ package body Ch3 is
  Save_Scan_State (Scan_State); -- at colon
  T_Colon;
 
-  --  If we have identifier followed by := then we assume that what is
+  --  If we have an identifier followed by := then we assume that what is
   --  really meant is an assignment statement. The assignment statement
   --  is scanned out and added to the list of declarations. An exception
   --  occurs if the := is followed by the keyword constant, in which case
-- 
2.40.0



[COMMITTED] ada: Cleanup redundant condition in resolution of entity names

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Code cleanup related to new contract for SPARK; semantics is unaffected.

gcc/ada/

* sem_res.adb (Resolve_Entity_Name): Combine two IF statements that
execute code only for references that come from source.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_res.adb | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
index 3eb13de38df..365c75041a9 100644
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -8022,7 +8022,7 @@ package body Sem_Res is
 
   if Comes_From_Source (N) then
 
- --  The following checks are only relevant when SPARK_Mode is on as
+ --  The following checks are only relevant when SPARK_Mode is On as
  --  they are not standard Ada legality rules.
 
  if SPARK_Mode = On then
@@ -8067,13 +8067,11 @@ package body Sem_Res is
  if Is_Ghost_Entity (E) then
 Check_Ghost_Context (E, N);
  end if;
-  end if;
 
-  --  We may be resolving an entity within expanded code, so a reference to
-  --  an entity should be ignored when calculating effective use clauses to
-  --  avoid inappropriate marking.
+ --  We may be resolving an entity within expanded code, so a reference
+ --  to an entity should be ignored when calculating effective use
+ --  clauses to avoid inappropriate marking.
 
-  if Comes_From_Source (N) then
  Mark_Use_Clauses (E);
   end if;
end Resolve_Entity_Name;
-- 
2.40.0



[COMMITTED] ada: Fix missing finalization in separate package body

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This directly comes from a loophole in the implementation.

gcc/ada/

* exp_ch7.adb (Process_Package_Body): New procedure taken from...
(Build_Finalizer.Process_Declarations): ...here.  Call the above
procedure to deal with both package bodies and package body stubs.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb | 59 -
 1 file changed, 37 insertions(+), 22 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index a02e28e4b34..9ec03b7e4cd 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -2138,6 +2138,9 @@ package body Exp_Ch7 is
  --  This variable is used to determine whether a nested package or
  --  instance contains at least one controlled object.
 
+ procedure Process_Package_Body (Decl : Node_Id);
+ --  Process an N_Package_Body node
+
  procedure Processing_Actions
(Has_No_Init  : Boolean := False;
 Is_Protected : Boolean := False);
@@ -2149,6 +2152,35 @@ package body Exp_Ch7 is
  --  Is_Protected should be set when the current declaration denotes a
  --  simple protected object.
 
+ --------------------------
+ -- Process_Package_Body --
+ --------------------------
+
+ procedure Process_Package_Body (Decl : Node_Id) is
+ begin
+--  Do not inspect an ignored Ghost package body because all
+--  code found within will not appear in the final tree.
+
+if Is_Ignored_Ghost_Entity (Defining_Entity (Decl)) then
+   null;
+
+elsif Ekind (Corresponding_Spec (Decl)) /= E_Generic_Package then
+   Old_Counter_Val := Counter_Val;
+   Process_Declarations (Declarations (Decl), Preprocess);
+
+   --  The nested package body is the last construct to contain
+   --  a controlled object.
+
+   if Preprocess
+ and then Top_Level
+ and then No (Last_Top_Level_Ctrl_Construct)
+ and then Counter_Val > Old_Counter_Val
+   then
+  Last_Top_Level_Ctrl_Construct := Decl;
+   end if;
+end if;
+ end Process_Package_Body;
+
  
  -- Processing_Actions --
  
@@ -2536,29 +2568,12 @@ package body Exp_Ch7 is
 --  Nested package bodies, avoid generics
 
 elsif Nkind (Decl) = N_Package_Body then
+   Process_Package_Body (Decl);
 
-   --  Do not inspect an ignored Ghost package body because all
-   --  code found within will not appear in the final tree.
-
-   if Is_Ignored_Ghost_Entity (Defining_Entity (Decl)) then
-  null;
-
-   elsif Ekind (Corresponding_Spec (Decl)) /= E_Generic_Package
-   then
-  Old_Counter_Val := Counter_Val;
-  Process_Declarations (Declarations (Decl), Preprocess);
-
-  --  The nested package body is the last construct to contain
-  --  a controlled object.
-
-  if Preprocess
-and then Top_Level
-and then No (Last_Top_Level_Ctrl_Construct)
-and then Counter_Val > Old_Counter_Val
-  then
- Last_Top_Level_Ctrl_Construct := Decl;
-  end if;
-   end if;
+elsif Nkind (Decl) = N_Package_Body_Stub
+  and then Present (Library_Unit (Decl))
+then
+   Process_Package_Body (Proper_Body (Unit (Library_Unit (Decl))));
 
 --  Handle a rare case caused by a controlled transient object
 --  created as part of a record init proc. The variable is wrapped
-- 
2.40.0



[COMMITTED] ada: Avoid repeated calls when looking for first/last slocs of a node

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

gcc/ada/

* errout.adb (First_Sloc): Avoid repeated calls.
(Last_Sloc): Likewise.
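
As a rough illustration of the pattern, the sketch below evaluates a lookup
once into a constant and reuses it inside a scanning loop. Hoist_Demo,
Lookup_Text and Lookups are hypothetical names; the patch itself caches the
result of Source_Text (SI) in a Source_Buffer_Ptr rather than copying a
String.

```ada
--  Illustrative only: Hoist_Demo, Lookup_Text and Lookups are
--  hypothetical and are not taken from errout.adb.
with Ada.Text_IO; use Ada.Text_IO;

procedure Hoist_Demo is
   Lookups : Natural := 0;

   function Lookup_Text return String is
   begin
      Lookups := Lookups + 1;
      return "call (arg)";
   end Lookup_Text;

   --  Evaluate the call once and reuse the result inside the loop
   Src : constant String := Lookup_Text;
   S   : Positive := Src'Last;
begin
   --  Scan backwards until the character before S is an opening paren
   while S > Src'First and then Src (S - 1) /= '(' loop
      S := S - 1;
   end loop;

   Put_Line ("Stopped at" & Positive'Image (S));
   Put_Line ("Lookups:" & Natural'Image (Lookups));
end Hoist_Demo;
```

Had the loop condition called Lookup_Text directly, the counter would grow
with every iteration instead of staying at one.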

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/errout.adb | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
index 49281fdb05f..a82aff5266b 100644
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -1845,11 +1845,12 @@ package body Errout is

 
function First_Sloc (N : Node_Id) return Source_Ptr is
-  SI : constant Source_File_Index := Source_Index (Get_Source_Unit (N));
-  SF : constant Source_Ptr:= Source_First (SI);
-  SL : constant Source_Ptr:= Source_Last (SI);
-  F  : Node_Id;
-  S  : Source_Ptr;
+  SI  : constant Source_File_Index := Source_Index (Get_Source_Unit (N));
+  SF  : constant Source_Ptr:= Source_First (SI);
+  SL  : constant Source_Ptr:= Source_Last (SI);
+  Src : constant Source_Buffer_Ptr := Source_Text (SI);
+  F   : Node_Id;
+  S   : Source_Ptr;
 
begin
   F := First_Node (N);
@@ -1876,11 +1877,11 @@ package body Errout is
 Search_Loop : for K in 1 .. 12 loop
exit Search_Loop when S = SF;
 
-   if Source_Text (SI) (S - 1) = '(' then
+   if Src (S - 1) = '(' then
   S := S - 1;
   exit Search_Loop;
 
-   elsif Source_Text (SI) (S - 1) <= ' ' then
+   elsif Src (S - 1) <= ' ' then
   S := S - 1;
 
else
@@ -1963,11 +1964,12 @@ package body Errout is
---
 
function Last_Sloc (N : Node_Id) return Source_Ptr is
-  SI : constant Source_File_Index := Source_Index (Get_Source_Unit (N));
-  SF : constant Source_Ptr:= Source_First (SI);
-  SL : constant Source_Ptr:= Source_Last (SI);
-  F  : Node_Id;
-  S  : Source_Ptr;
+  SI  : constant Source_File_Index := Source_Index (Get_Source_Unit (N));
+  SF  : constant Source_Ptr:= Source_First (SI);
+  SL  : constant Source_Ptr:= Source_Last (SI);
+  Src : constant Source_Buffer_Ptr := Source_Text (SI);
+  F   : Node_Id;
+  S   : Source_Ptr;
 
begin
   F := Last_Node (N);
@@ -1980,7 +1982,7 @@ package body Errout is
   --  Skip past an identifier
 
   while S in SF .. SL - 1
-and then Source_Text (SI) (S + 1)
+and then Src (S + 1)
   in
 '0' .. '9' | 'a' .. 'z' | 'A' .. 'Z' | '.' | '_'
   loop
@@ -2000,11 +2002,11 @@ package body Errout is
 Search_Loop : for K in 1 .. 12 loop
exit Node_Loop when S = SL;
 
-   if Source_Text (SI) (S + 1) = ')' then
+   if Src (S + 1) = ')' then
   S := S + 1;
   exit Search_Loop;
 
-   elsif Source_Text (SI) (S + 1) <= ' ' then
+   elsif Src (S + 1) <= ' ' then
   S := S + 1;
 
else
@@ -2021,7 +2023,7 @@ package body Errout is
   --  Remove any trailing space
 
   while S in SF + 1 .. SL
-and then Source_Text (SI) (S) = ' '
+and then Src (S) = ' '
   loop
  S := S - 1;
   end loop;
-- 
2.40.0



[COMMITTED] ada: Further fixes to GNATprove and CodePeer expression pretty-printer

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

The expression pretty-printer still crashes on several tests, but
already gives much better outputs for many previously unsupported
constructs.

gcc/ada/

* pprint.adb (Expression_Image): Handle several previously unsupported
constructs.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/pprint.adb | 326 +++--
 1 file changed, 198 insertions(+), 128 deletions(-)

diff --git a/gcc/ada/pprint.adb b/gcc/ada/pprint.adb
index 8fdb5d6916e..1b97630179b 100644
--- a/gcc/ada/pprint.adb
+++ b/gcc/ada/pprint.adb
@@ -27,6 +27,7 @@ with Atree;  use Atree;
 with Einfo;  use Einfo;
 with Einfo.Entities; use Einfo.Entities;
 with Einfo.Utils;use Einfo.Utils;
+with Errout; use Errout;
 with Namet;  use Namet;
 with Nlists; use Nlists;
 with Opt;use Opt;
@@ -63,8 +64,11 @@ package body Pprint is
   --  Expand_Type is True and Expr is a type, try to expand Expr (an
   --  internally generated type) into a user understandable name.
 
-  Max_List : constant := 3;
-  --  Limit number of list elements to dump
+  Max_List_Depth : constant := 3;
+  --  Limit number of nested lists to print
+
+  Max_List_Length : constant := 3;
+  --  Limit number of list elements to print
 
   Max_Expr_Elements : constant := 24;
   --  Limit number of elements in an expression for use by Expr_Name
@@ -72,94 +76,82 @@ package body Pprint is
   Num_Elements : Natural := 0;
   --  Current number of elements processed by Expr_Name
 
-  function List_Name
-(List  : Node_Id;
- Add_Space : Boolean := True;
- Add_Paren : Boolean := True) return String;
+  function List_Name (List : List_Id) return String;
   --  Return a string corresponding to List
 
   ---
   -- List_Name --
   ---
 
-  function List_Name
-(List  : Node_Id;
- Add_Space : Boolean := True;
- Add_Paren : Boolean := True) return String
-  is
- function Internal_List_Name
-   (List  : Node_Id;
-First : Boolean := True;
-Add_Space : Boolean := True;
-Add_Paren : Boolean := True;
-Num   : Natural := 1) return String;
- --  Created for purposes of recursing on embedded lists
-
- 
- -- Internal_List_Name --
- 
-
- function Internal_List_Name
-   (List  : Node_Id;
-First : Boolean := True;
-Add_Space : Boolean := True;
-Add_Paren : Boolean := True;
-Num   : Natural := 1) return String
- is
- begin
-if No (List) then
-   if First or else not Add_Paren then
-  return "";
-   else
-  return ")";
-   end if;
-elsif Num > Max_List then
-   if Add_Paren then
-  return ", ...)";
-   else
-  return ", ...";
-   end if;
-end if;
+  function List_Name (List : List_Id) return String is
+ Buf  : Bounded_String;
+ Elmt : Node_Id;
 
---  Continue recursing on the list - handling the first element
---  in a special way.
-
-return
-  (if First then
-  (if Add_Space and Add_Paren then " ("
-   elsif Add_Paren then "("
-   elsif Add_Space then " "
-   else "")
-   else ", ")
-   & Expr_Name (List)
-   & Internal_List_Name
-   (List  => Next (List),
-First => False,
-Add_Paren => Add_Paren,
-Num   => Num + 1);
- end Internal_List_Name;
-
-  --  Start of processing for List_Name
+ Printed_Elmts : Natural := 0;
 
   begin
- --  Prevent infinite recursion by limiting depth to 3
+ --  Give up if the printed list is too deep
 
- if List_Name_Count > 3 then
+ if List_Name_Count > Max_List_Depth then
 return "...";
  end if;
 
  List_Name_Count := List_Name_Count + 1;
 
- declare
-Result : constant String :=
-   Internal_List_Name
- (List  => List,
-  Add_Space => Add_Space,
-  Add_Paren => Add_Paren);
- begin
-List_Name_Count := List_Name_Count - 1;
-return Result;
- end;
+ Elmt := First (List);
+ while Present (Elmt) loop
+
+--  Print component_association as "x | y | z => 12345"
+
+if Nkind (Elmt) = N_Component_Association then
+   declare
+  Choice

[COMMITTED] ada: Use idiomatic construct in Expand_N_Package_Body

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

gcc/ada/

* exp_ch7.adb (Expand_N_Package_Body): Call Defining_Entity to get
the entity of the body.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 9ec03b7e4cd..db2644fb287 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -5262,16 +5262,7 @@ package body Exp_Ch7 is
 Fin_Id  => Fin_Id);
 
  if Present (Fin_Id) then
-declare
-   Body_Ent : Node_Id := Defining_Unit_Name (N);
-
-begin
-   if Nkind (Body_Ent) = N_Defining_Program_Unit_Name then
-  Body_Ent := Defining_Identifier (Body_Ent);
-   end if;
-
-   Set_Finalizer (Body_Ent, Fin_Id);
-end;
+Set_Finalizer (Defining_Entity (N), Fin_Id);
  end if;
   end if;
end Expand_N_Package_Body;
-- 
2.40.0



[COMMITTED] ada: Small cleanup in support for protected subprograms

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This moves the propagation of the Uses_Sec_Stack flag, from the original to
the rewritten subprogram, to the point where the latter is expanded, along
with the propagation of the Has_Nested_Subprogram flag, as well as addresses
a ??? comment in the same block of code.  No functional changes.

gcc/ada/

* inline.adb (Cleanup_Scopes): Do not propagate the Uses_Sec_Stack
flag from original to rewritten protected subprograms here...
* exp_ch9.adb (Expand_N_Protected_Body) :
...but here instead. Add local variables and remove a useless
test.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch9.adb | 97 +++--
 gcc/ada/inline.adb  | 11 -
 2 files changed, 49 insertions(+), 59 deletions(-)

diff --git a/gcc/ada/exp_ch9.adb b/gcc/ada/exp_ch9.adb
index 50b9d072d84..b51c60ea506 100644
--- a/gcc/ada/exp_ch9.adb
+++ b/gcc/ada/exp_ch9.adb
@@ -8393,9 +8393,11 @@ package body Exp_Ch9 is
   Current_Node : Node_Id;
   Disp_Op_Body : Node_Id;
   New_Op_Body  : Node_Id;
+  New_Op_Spec  : Node_Id;
   Op_Body  : Node_Id;
   Op_Decl  : Node_Id;
   Op_Id: Entity_Id;
+  Op_Spec  : Entity_Id;
 
   function Build_Dispatching_Subprogram_Body
 (N: Node_Id;
@@ -8512,11 +8514,12 @@ package body Exp_Ch9 is
null;
 
 when N_Subprogram_Body =>
+   Op_Spec := Corresponding_Spec (Op_Body);
 
--  Do not create bodies for eliminated operations
 
if not Is_Eliminated (Defining_Entity (Op_Body))
- and then not Is_Eliminated (Corresponding_Spec (Op_Body))
+ and then not Is_Eliminated (Op_Spec)
then
   if Lock_Free_Active then
  New_Op_Body :=
@@ -8531,7 +8534,9 @@ package body Exp_Ch9 is
   Current_Node := New_Op_Body;
   Analyze (New_Op_Body);
 
-  --  When the original protected body has nested subprograms,
+  New_Op_Spec := Corresponding_Spec (New_Op_Body);
+
+  --  When the original subprogram body has nested subprograms,
   --  the new body also has them, so set the flag accordingly
   --  and reset the scopes of the top-level nested subprograms
   --  and other declaration entities so that they now refer to
@@ -8541,58 +8546,54 @@ package body Exp_Ch9 is
   --  subprogram entity isn't available via Corresponding_Spec
   --  until after the above Analyze call.)
 
-  if Has_Nested_Subprogram (Corresponding_Spec (Op_Body)) then
- Set_Has_Nested_Subprogram
-   (Corresponding_Spec (New_Op_Body));
-
- Reset_Scopes_To
-   (New_Op_Body, Corresponding_Spec (New_Op_Body));
+  if Has_Nested_Subprogram (Op_Spec) then
+ Set_Has_Nested_Subprogram (New_Op_Spec);
+ Reset_Scopes_To (New_Op_Body, New_Op_Spec);
   end if;
 
+  --  Similarly, when the original subprogram body uses the
+  --  secondary stack, the new body also does. This is needed
+  --  when the cleanup actions of the subprogram are delayed
+  --  because it contains a package instance with a body.
+
+  Set_Uses_Sec_Stack (New_Op_Spec, Uses_Sec_Stack (Op_Spec));
+
   --  Build the corresponding protected operation. This is
   --  needed only if this is a public or private operation of
   --  the type.
 
-  --  Why do we need to test for Corresponding_Spec being
-  --  present here when it's assumed to be set further above
-  --  in the Is_Eliminated test???
-
-  if Present (Corresponding_Spec (Op_Body)) then
- Op_Decl :=
-   Unit_Declaration_Node (Corresponding_Spec (Op_Body));
-
- if Nkind (Parent (Op_Decl)) = N_Protected_Definition then
-if Lock_Free_Active then
-   New_Op_Body :=
- Build_Lock_Free_Protected_Subprogram_Body
-   (Op_Body, Pid, Specification (New_Op_Body));
-else
-   New_Op_Body :=
- Build_Protected_Subprogram_Body (
-   Op_Body, Pid, Specification (New_Op_Body));
-end if;
-
-Insert_After (Current_Node, New_Op_Body);
-Analyze (New_Op_Body);
-Current_Node := New_Op_Body;
-
---  Generate an overriding primitive operation body for
- 

[COMMITTED] ada: Rename Is_Past_Self_Hiding_Point flag to be Is_Not_Self_Hidden

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Bob Duff 

...which seems clearer.

Still work in progress.

gcc/ada/

* cstand.adb (Is_Past_Self_Hiding_Point): Rename to be
Is_Not_Self_Hidden.
* einfo.ads: Likewise.
* exp_aggr.adb: Likewise.
* gen_il-fields.ads: Likewise.
* gen_il-gen-gen_entities.adb: Likewise.
* sem.adb: Likewise.
* sem_aggr.adb: Likewise.
* sem_ch11.adb: Likewise.
* sem_ch12.adb: Likewise.
* sem_ch5.adb: Likewise.
* sem_ch6.adb: Likewise.
* sem_ch7.adb: Likewise.
* sem_prag.adb: Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/cstand.adb  | 4 ++--
 gcc/ada/einfo.ads   | 4 ++--
 gcc/ada/exp_aggr.adb| 2 +-
 gcc/ada/gen_il-fields.ads   | 2 +-
 gcc/ada/gen_il-gen-gen_entities.adb | 2 +-
 gcc/ada/sem.adb | 6 +++---
 gcc/ada/sem_aggr.adb| 6 +++---
 gcc/ada/sem_ch11.adb| 2 +-
 gcc/ada/sem_ch12.adb| 8 
 gcc/ada/sem_ch5.adb | 8 
 gcc/ada/sem_ch6.adb | 4 ++--
 gcc/ada/sem_ch7.adb | 4 ++--
 gcc/ada/sem_prag.adb| 2 +-
 13 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/gcc/ada/cstand.adb b/gcc/ada/cstand.adb
index f53015d1e0c..3646003b330 100644
--- a/gcc/ada/cstand.adb
+++ b/gcc/ada/cstand.adb
@@ -1784,7 +1784,7 @@ package body CStand is
 
   Set_Is_Immediately_Visible  (Ident_Node, True);
   Set_Is_Intrinsic_Subprogram (Ident_Node, True);
-  Set_Is_Past_Self_Hiding_Point (Ident_Node);
+  Set_Is_Not_Self_Hidden (Ident_Node);
 
   Set_Name_Entity_Id (Op, Ident_Node);
   Append_Entity (Ident_Node, Standard_Standard);
@@ -1810,7 +1810,7 @@ package body CStand is
   --  frozen and not self-hidden as soon as they are created.
 
   Set_Is_Frozen (E);
-  Set_Is_Past_Self_Hiding_Point (E);
+  Set_Is_Not_Self_Hidden (E);
 
   --  Set debug information required for all standard types
 
diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
index c67731c1298..0cc4b495bd9 100644
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -3104,7 +3104,7 @@ package Einfo is
 --   procedure which verifies the invariants of the partial view of a
 --   private type or private extension.
 
---Is_Past_Self_Hiding_Point
+--Is_Not_Self_Hidden
 --   Defined in all entities. Roughly speaking, this is False if the
 --   declaration of the entity is hidden from all visibility because
 --   we are within its declaration, as defined by 8.3(16-18). When
@@ -4957,7 +4957,7 @@ package Einfo is
--Is_Obsolescent
--Is_Package_Body_Entity
--Is_Packed_Array_Impl_Type
-   --Is_Past_Self_Hiding_Point
+   --Is_Not_Self_Hidden
--Is_Potentially_Use_Visible
--Is_Preelaborated
--Is_Primitive_Wrapper
diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index e2f0ccdb34a..40dd1c4d41b 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -2057,7 +2057,7 @@ package body Exp_Aggr is
 Set_Etype (L_J, Any_Type);
 
 Mutate_Ekind (L_J, E_Variable);
-Set_Is_Past_Self_Hiding_Point (L_J);
+Set_Is_Not_Self_Hidden (L_J);
 Set_Scope (L_J, Ent);
  else
 L_J := Make_Temporary (Loc, 'J', L);
diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index 19ebf6744d0..fd89fac869d 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -752,7 +752,7 @@ package Gen_IL.Fields is
   Is_Package_Body_Entity,
   Is_Packed,
   Is_Packed_Array_Impl_Type,
-  Is_Past_Self_Hiding_Point,
+  Is_Not_Self_Hidden,
   Is_Param_Block_Component_Type,
   Is_Partial_Invariant_Procedure,
   Is_Potentially_Use_Visible,
diff --git a/gcc/ada/gen_il-gen-gen_entities.adb b/gcc/ada/gen_il-gen-gen_entities.adb
index 6356de0ee2e..d531e4a8efa 100644
--- a/gcc/ada/gen_il-gen-gen_entities.adb
+++ b/gcc/ada/gen_il-gen-gen_entities.adb
@@ -177,7 +177,7 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Is_Package_Body_Entity, Flag),
 Sm (Is_Packed, Flag, Impl_Base_Type_Only),
 Sm (Is_Packed_Array_Impl_Type, Flag),
-Sm (Is_Past_Self_Hiding_Point, Flag),
+Sm (Is_Not_Self_Hidden, Flag),
 Sm (Is_Potentially_Use_Visible, Flag),
 Sm (Is_Preelaborated, Flag),
 Sm (Is_Private_Descendant, Flag),
diff --git a/gcc/ada/sem.adb b/gcc/ada/sem.adb
index b0b492b0099..3bff8d26a0d 100644
--- a/gcc/ada/sem.adb
+++ b/gcc/ada/sem.adb
@@ -760,7 +760,7 @@ package body Sem is
 
   Debug_A_Exit ("analyzing  ", N, "  (done)");
 
-  --  Set Is_Past_Self_Hiding_Point flag. RM-8.3(16) says a declaration
+  --  Set Is_Not_Self_Hidden flag. RM-8.3(16) says a declaration
   --  is no longer hidden from all visibility after "the end of the
   --  declaration", so we set the f

Re: Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread juzhe.zh...@rivai.ai
Thanks, Robin. I have addressed the comments.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-05-22 16:07
To: juzhe.zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; palmer; jeffreyalaw; Richard Sandiford
Subject: Re: [PATCH] RISC-V: Add RVV comparison autovectorization
Hi Juzhe,
 
thanks.  Some remarks inline.
 
> +;; Integer (signed) vcond.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcond"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VI 4 "register_operand")
> +  (match_operand:VI 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> +  GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)
> +
> +;; Integer vcondu.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcondu"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VI 4 "register_operand")
> +  (match_operand:VI 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> +  GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)
 
These do exactly the same (as do their aarch64 heirs).  As you are a friend
of iterators usually I guess you didn't use one for clarity here?  Also, I
didn't see that we do much of immediate-range enforcement in expand_vcond.
 
> +
> +;; Floating-point vcond.  Don't enforce an immediate range here, since it
> +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
> +(define_expand "vcond"
> +  [(set (match_operand:V 0 "register_operand")
> + (if_then_else:V
> +   (match_operator 3 "comparison_operator"
> + [(match_operand:VF 4 "register_operand")
> +  (match_operand:VF 5 "nonmemory_operand")])
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "nonmemory_operand")))]
> +  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
> +  GET_MODE_NUNITS (mode))"
> +  {
> +riscv_vector::expand_vcond (mode, operands);
> +DONE;
> +  }
> +)
 
It comes as a bit of a surprise to add float comparisons before any other
float autovec patterns are in.  I'm not against it but would wait for
other comments here.  If the tests are sourced from aarch64, they have
been reviewed often enough that we can be fairly sure they do the right
thing, though.  I haven't checked the expander and inversion parts
closely yet.
 
> +
> +;; -
> +;;  [INT,FP] Select based on masks
> +;; -
> +;; Includes merging patterns for:
> +;; - vmerge.vv
> +;; - vmerge.vx
> +;; - vfmerge.vf
> +;; -
> +
> +(define_expand "vcond_mask_"
> +  [(match_operand:V 0 "register_operand")
> +   (match_operand: 3 "register_operand")
> +   (match_operand:V 1 "nonmemory_operand")
> +   (match_operand:V 2 "register_operand")]
> +  "TARGET_VECTOR"
> +  {
> +riscv_vector::emit_merge_op (operands[0], operands[2],
> +operands[1], operands[3]);
> +DONE;
> +  }
> +)
 
Order of operands is a bit surprising, see below.
 
> +  void add_fixed_operand (rtx x)
> +  {
> +create_fixed_operand (&m_ops[m_opno++], x);
> +gcc_assert (m_opno <= MAX_OPERANDS);
> +  }
> +  void add_integer_operand (rtx x)
> +  {
> +create_integer_operand (&m_ops[m_opno++], INTVAL (x));
> +gcc_assert (m_opno <= MAX_OPERANDS);
> +  }
>void add_all_one_mask_operand (machine_mode mode)
>{
>  add_input_operand (CONSTM1_RTX (mode), mode);
> @@ -85,11 +95,14 @@ public:
>{
>  add_input_operand (RVV_VUNDEF (mode), mode);
>}
> -  void add_policy_operand (enum tail_policy vta, enum mask_policy vma)
> +  void add_policy_operand (enum tail_policy vta)
>{
>  rtx tail_policy_rtx = gen_int_mode (vta, Pmode);
> -rtx mask_policy_rtx = gen_int_mode (vma, Pmode);
>  add_input_operand (tail_policy_rtx, Pmode);
> +  }
> +  void add_policy_operand (enum mask_policy vma)
> +  {
> +rtx mask_policy_rtx = gen_int_mode (vma, Pmode);
>  add_input_operand (mask_policy_rtx, Pmode);
>}
>void add_avl_type_operand (avl_type type)
> @@ -97,7 +110,8 @@ public:
>  add_input_operand (gen_int_mode (type, Pmode), Pmode);
>}
 
My idea would be to have the policy operands hidden a bit more as
in my last patch.  It comes down to a matter of taste.  We can discuss
once this is in an

[COMMITTED] ada: Fix crash caused by incorrect expansion of iterated component

2023-05-22 Thread Marc Poulhiès via Gcc-patches
The way iterated components are expanded could lead to an inconsistent tree.

This change fixes 2 issues:

- in an early step during Pre_Analyze, the loop variable still has
Any_Type and the compiler must not emit an error. A later full Analyze
is supposed to correctly set the Etype, and only then should the
compiler emit an error if Any_Type is still used.

- when expanding into a loop with assignment statements, the expression
is analyzed in an early context (where the loop variable still has
Any_Type as its Etype) and then copied. The compiler would crash because
this Any_Type is never changed, as the expression node has its Analyzed
flag set. Resetting the flag ensures the later Analyze call also
analyzes these nodes and sets Etype correctly.
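
For context, the construct in question is the Ada 2022 iterated component
association. The sketch below only shows its general shape and is not a
reproducer for the crash; Iterated_Demo and Squares are hypothetical names,
and it assumes a compiler mode that accepts Ada 2022 (for example
-gnat2022).

```ada
--  Illustrative only: Iterated_Demo and Squares are hypothetical names.
--  Assumes an Ada 2022 compiler mode (for example -gnat2022).
with Ada.Text_IO; use Ada.Text_IO;

procedure Iterated_Demo is
   type Squares is array (1 .. 5) of Natural;

   --  I is the loop parameter of the iterated component association;
   --  the expression I * I is evaluated once per index value.
   S : constant Squares := (for I in 1 .. 5 => I * I);
begin
   for E of S loop
      Put (Natural'Image (E));
   end loop;

   New_Line;
end Iterated_Demo;
```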

gcc/ada/

* exp_aggr.adb (Process_Transient_Component): Reset Analyzed flag
for the copy of the initialization expression.
* sem_attr.adb (Validate_Non_Static_Attribute_Function_Call): Skip
error emission during Pre_Analyze.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb | 11 ++-
 gcc/ada/sem_attr.adb |  4 +++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index 40dd1c4d41b..f3ad8a9e1ae 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -9840,6 +9840,7 @@ package body Exp_Aggr is
   Res_Decl: Node_Id;
   Res_Id  : Entity_Id;
   Res_Typ : Entity_Id;
+  Copy_Init_Expr : constant Node_Id := New_Copy_Tree (Init_Expr);
 
--  Start of processing for Process_Transient_Component
 
@@ -9890,7 +9891,15 @@ package body Exp_Aggr is
   Constant_Present=> True,
   Object_Definition   => New_Occurrence_Of (Res_Typ, Loc),
   Expression  =>
-Make_Reference (Loc, New_Copy_Tree (Init_Expr)));
+Make_Reference (Loc, Copy_Init_Expr));
+
+  --  In some cases, like iterated component, the Init_Expr may have been
+  --  analyzed in a context where all the Etype fields are not correct yet
+  --  and a later call to Analyze is expected to set them.
+  --  Resetting the Analyzed flag ensures this later call doesn't skip this
+  --  node.
+
+  Reset_Analyzed_Flags (Copy_Init_Expr);
 
   Add_Item (Res_Decl);
 
diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index a07e91b839d..bc4e3cf019e 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -3319,7 +3319,9 @@ package body Sem_Attr is
 
 --  Check for missing/bad expression (result of previous error)
 
-if No (E1) or else Etype (E1) = Any_Type then
+if No (E1)
+  or else (Etype (E1) = Any_Type and then Full_Analysis)
+then
Check_Error_Detected;
raise Bad_Attribute;
 end if;
-- 
2.40.0



Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition

2023-05-22 Thread Kewen.Lin via Gcc-patches
Hi Carl,

on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> The following patch fixes errors in the arguments in the
> __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx builtin
> definitions.  Note, these builtins are used by the overloaded
> __builtin_vec_xst_trunc builtin.
> 
> The patch adds a new overloaded builtin definition for
> __builtin_vec_xst_trunc for the third argument to be unsigned and
> signed long int.
> 
> A new testcase is added for the various overloaded versions of
> __builtin_vec_xst_trunc.
> 
> The patch has been tested on Power 10 with no new regressions.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
> Carl
> 
> ---
> rs6000: Fix __builtin_vec_xst_trunc definition
> 
> Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
> and __builtin_altivec_tr_stxvrwx to handle the short and word cases.  The
> arguments for these two builtins are wrong.  This patch fixes the wrong
> arguments for the builtins.
> 
> Additionally, the patch adds a new __builtin_vec_xst_trunc overloaded
> version for the destination being signed or unsigned long int.
> 
> A runnable test case is added to test each of the overloaded definitions
> of __builtin_vec_xst_tru
> 
> gcc/
>   * config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
>   __builtin_altivec_tr_stxvrwx): Fix type of second argument.
>   Add, definition for send argument to be signed long.
>   * config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
>   add definition with third argument signed and unsigned long.
>   * doc/extend.texi (__builtin_vec_xst_trunc): Add documentation for
>   new unsigned long and signed long versions.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
>   for __builtin_vec_xst_trunc builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def |   7 +-
>  gcc/config/rs6000/rs6000-overload.def |   4 +
>  gcc/doc/extend.texi   |   2 +
>  .../powerpc/vsx-builtin-vec_xst_trunc.c   | 217 ++
>  4 files changed, 228 insertions(+), 2 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..a378491b358 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3161,12 +3161,15 @@
>void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>  TR_STXVRBX vsx_stxvrbx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>  TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>  TR_STXVRWX vsx_stxvrwx {stvec}

Good catch!

>  
> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
> +TR_STXVRLX vsx_stxvrdx {stvec}
> +

This is mapped to the pattern used for type long long, and it's a hard
mapping.  IMHO it's wrong and not consistent with what users expect: on Power
the size of type long int is 4 bytes at -m32 but 8 bytes at -m64, so binding
this implementation to 8 bytes can cause trouble in 32-bit.  I wonder if
it's a good idea to add one overloaded version for type long int.  For now
openxl also emits an error message for a long int type pointer (see its
doc [1]); users can use a cast to turn it into one of the acceptable pointer
types (long long or int, depending on its size).

[1] 
https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc
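
To make the size concern concrete, here is a minimal portable sketch. It
does not use the builtin at all; store_trunc_to_long is a hypothetical
helper that only shows how the width of a truncating store to a `long`
destination depends on the ABI, and why a fixed 8-byte store would be wrong
when long is 4 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a GCC builtin: store the low-order part of a
   64-bit value into a `long`, picking the width from sizeof (long).  */
static void store_trunc_to_long (uint64_t value, long *dst)
{
  /* The sketch only handles the two cases discussed: 4- or 8-byte long.  */
  assert (sizeof (long) == 4 || sizeof (long) == 8);

  if (sizeof (long) == 8)
    {
      /* 64-bit style ABI (e.g. -m64): long is as wide as long long,
         so a doubleword store is correct.  */
      memcpy (dst, &value, sizeof value);
    }
  else
    {
      /* 32-bit style ABI (e.g. -m32): long is as wide as int, so only
         a word may be stored; 8 bytes would overrun the object.  */
      uint32_t low = (uint32_t) value;
      memcpy (dst, &low, sizeof low);
    }
}
```

With the real builtin the equivalent choice would be made by the caller,
by casting the pointer to long long * or int * as suggested above.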


>void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
>  TR_STXVRDX vsx_stxvrdx {stvec}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def 
> b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..54b7ae5e51b 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4872,6 +4872,10 @@
>  TR_STXVRWX  TR_STXVRWX_S
>void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
>  TR_STXVRWX  TR_STXVRWX_U
> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
> +TR_STXVRLX  TR_STXVRLX_S
> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
> +TR_STXVRLX  TR_STXVRLX_U
>void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
>  TR_STXVRDX  TR_STXVRDX_S
>void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d8..7e2ae790ab3 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18570,10 +18570,12 @@ instructions.
>  @defbuiltin{{void} vec_xst_tr

[COMMITTED] ada: Incorrect constant folding in postcondition involving 'Old

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Justin Squirek 

The following patch fixes an issue in the compiler whereby certain flavors of
access comparisons may be incorrectly constant-folded out of contract
expressions - notably in postcondition expressions featuring a reference to
'Old.

gcc/ada/

* checks.adb (Install_Null_Excluding_Check): Avoid non-null
optimizations when assertions are enabled.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/checks.adb | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/checks.adb b/gcc/ada/checks.adb
index 9f3c679ed7e..0d472964ff5 100644
--- a/gcc/ada/checks.adb
+++ b/gcc/ada/checks.adb
@@ -8437,7 +8437,18 @@ package body Checks is
   Right_Opnd => Make_Null (Loc)),
   Reason => CE_Access_Check_Failed));
 
-  Mark_Non_Null;
+  --  Mark the entity of N "non-null" except when assertions are enabled -
+  --  since expansion becomes much more complicated (especially when it
+  --  comes to contracts) due to the generation of wrappers and wholesale
+  --  moving of declarations and statements which may happen.
+
+  --  Additionally, it is assumed that extra checks will exist with
+  --  assertions enabled so some potentially redundant checks are
+  --  acceptable.
+
+  if not Assertions_Enabled then
+ Mark_Non_Null;
+  end if;
end Install_Null_Excluding_Check;
 
-
-- 
2.40.0



Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-22 Thread Kito Cheng via Gcc-patches
So I expect you will also apply those refactorings to Juzhe's new changes?
If so I would like to have a separate NFC refactor patch if possible.

e.g.
Juzhe's vec_cmp/vcond -> NFC refactor patch -> abs, vneg, vnot

On Mon, May 22, 2023 at 4:59 PM Robin Dapp  wrote:
>
> As discussed with Juzhe off-list, I will rebase this patch against
> Juzhe's vec_cmp/vcond patch once that hits the trunk.
>
> Regards
>  Robin


[COMMITTED] ada: Reuse idiomatic procedure in CStand

2023-05-22 Thread Marc Poulhiès via Gcc-patches
From: Ronan Desplanques 

This change replaces a call to Set_Name_Entity_Id with a call to
the higher-level Set_Current_Entity.

gcc/ada/

* cstand.adb: Use more idiomatic procedure.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/cstand.adb | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/ada/cstand.adb b/gcc/ada/cstand.adb
index 3646003b330..fbd5888b198 100644
--- a/gcc/ada/cstand.adb
+++ b/gcc/ada/cstand.adb
@@ -1642,8 +1642,7 @@ package body CStand is
 
   for E in Standard_Entity_Type loop
  if Ekind (Standard_Entity (E)) /= E_Operator then
-Set_Name_Entity_Id
-  (Chars (Standard_Entity (E)), Standard_Entity (E));
+Set_Current_Entity (Standard_Entity (E));
 Set_Homonym (Standard_Entity (E), Empty);
  end if;
 
-- 
2.40.0



Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-22 Thread Robin Dapp via Gcc-patches
> So I expect you will also apply those refactorings to Juzhe's new changes?
> If so I would like to have a separate NFC refactor patch if possible.

What's NFC? :)  Do you mean to just have the refactor part as a separate
patch?  If yes, I agree.

> e.g.
> Juzhe's vec_cmp/vcond -> NFC refactor patch -> abs, vneg, vnot



Re: Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-22 Thread juzhe.zh...@rivai.ai
Yeah, I agree with Kito.
For example, I see you have renamed "get_prefer_***" to "get_preferred_**".
I think this NFC change should be a separate patch.

Thanks.


juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-05-22 17:05
To: Robin Dapp
CC: 钟居哲; gcc-patches; palmer; Michael Collison; Jeff Law
Subject: Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.
So I expect you will also apply those refactorings to Juzhe's new changes?
If so I would like to have a separate NFC refactor patch if possible.
 
e.g.
Juzhe's vec_cmp/vcond -> NFC refactor patch -> abs, vneg, vnot
 
On Mon, May 22, 2023 at 4:59 PM Robin Dapp  wrote:
>
> As discussed with Juzhe off-list, I will rebase this patch against
> Juzhe's vec_cmp/vcond patch once that hits the trunk.
>
> Regards
>  Robin
 


Re: [PATCH] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-05-22 Thread Kito Cheng via Gcc-patches
Oops, it seems there is still some issue around here, but I found something
that might be related to this issue:

https://github.com/gcc-mirror/gcc/commit/d6654a4be3ba44c0d57be7c8a51d76d9721345e1
https://github.com/gcc-mirror/gcc/commit/23c49bb8d09bc3bfce9a08be637cf32ac014de56

On Mon, May 22, 2023 at 2:42 PM Kito Cheng  wrote:
>
> Hi Vineet:
>
> Could you help test this patch?  It resolves that issue on our
> machine, but I would like it to work in other environments as well.
>
> Thanks :)
>
> ---
>
> We got a bunch of the following error messages for a multi-lib run:
>
> ERROR: torture-init: torture_without_loops is not empty as expected
> ERROR: tcl error code NONE
>
> It seems we need torture-init and torture-finish around the test
> loop.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/rvv.exp: Add torture-init and
> torture-finish.
> ---
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp 
> b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> index bc99cc0c3cf4..19179564361a 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> @@ -39,6 +39,7 @@ if [istarget riscv32-*-*] then {
>
>  # Initialize `dg'.
>  dg-init
> +torture-init
>
>  # Main loop.
>  set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
> @@ -69,5 +70,7 @@ foreach op $AUTOVEC_TEST_OPTS {
>  dg-runtest [lsort [glob -nocomplain 
> $srcdir/$subdir/autovec/vls-vlmax/*.\[cS\]]] \
> "-std=c99 -O3 -ftree-vectorize --param 
> riscv-autovec-preference=fixed-vlmax" $CFLAGS
>
> +torture-finish
> +
>  # All done.
>  dg-finish
> --
> 2.40.1
>


Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-22 Thread Kito Cheng via Gcc-patches
> > So I expect you will also apply those refactorings to Juzhe's new changes?
> > If so I would like to have a separate NFC refactor patch if possible.
>
> What's NFC? :)  Do you mean to just have the refactor part as a separate
> patch?  If yes, I agree.

NFC: no functional change.  That's a term used in LLVM; I just forgot
that it's a rather rare term here,


Re: [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-05-22 Thread Kewen.Lin via Gcc-patches
Hi Haochen,

on 2023/2/8 13:08, HAO CHEN GUI wrote:
> Hi,
>   The logical operations for TImode are split after the reload pass right now.
> Some potential optimizations are missed because the split is too late. This
> patch removes TImode from the "AND", "IOR", "XOR" and "NOT" expanders so that
> these logical operations can be split at the expand pass. The new test case
> illustrates the optimization.
> 
>   Two test cases of pr92398 are merged into one as all sub-targets generate
> the same sequence of instructions with the patch.

IIUC, this can also help PR target/93123.  Add it to the PR marker too if so.

This patch aligns with what the other ports do and I think it's good, but note
that it can regress some cases like:

```
vector unsigned __int128 test(unsigned __int128 *a, unsigned __int128 *b,
  unsigned __int128 *c, unsigned __int128 *d) {

  unsigned __int128 t1 = *a | *b;
  unsigned __int128 t2 = *c & *d;
  unsigned __int128 t3 = t1 ^ t2;

  return (vector unsigned __int128)t3;
}
```

w/o the proposed patch:

lxv 32,0(5)
lxv 0,0(6)
lxv 45,0(3)
lxv 33,0(4)
xxland 32,32,0
vor 2,1,13
vxor 2,2,0

vs.

w/ this patch:

ld 9,8(6)
ld 8,0(5)
ld 10,8(5)
ld 0,0(6)
ld 11,0(3)
ld 6,8(3)
ld 5,0(4)
ld 7,8(4)
and 8,8,0
and 10,10,9
or 9,5,11
xor 9,9,8
or 8,7,6
xor 8,8,10
mtvsrdd 34,8,9

We could get the optimal insn sequence before, but fail to with the proposed
patch.  Apparently, for now we don't have any support for moving the operation
back to vector registers when that's beneficial.
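
For readers following along, the GPR sequence above corresponds to doing
the 128-bit logic on two independent 64-bit halves. A minimal sketch of that
split in portable C (struct u128_halves and the function names are
hypothetical, chosen only for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-register view of a 128-bit integer.  */
struct u128_halves { uint64_t hi; uint64_t lo; };

/* (a | b) ^ (c & d), computed on each 64-bit half separately.  Bitwise
   logic never carries between bits, so the halves are independent.  */
static struct u128_halves
or_and_xor_split (struct u128_halves a, struct u128_halves b,
                  struct u128_halves c, struct u128_halves d)
{
  struct u128_halves r;
  r.lo = (a.lo | b.lo) ^ (c.lo & d.lo);
  r.hi = (a.hi | b.hi) ^ (c.hi & d.hi);
  return r;
}

/* Small self-check with hand-computed values.  */
static void self_check (void)
{
  struct u128_halves a = { 0xF0, 0x0F };
  struct u128_halves b = { 0x0F, 0xF0 };
  struct u128_halves c = { 0xFF, 0xFF };
  struct u128_halves d = { 0x3C, 0xC3 };
  struct u128_halves r = or_and_xor_split (a, b, c, d);
  assert (r.hi == 0xC3);   /* (0xF0 | 0x0F) ^ (0xFF & 0x3C) */
  assert (r.lo == 0x3C);   /* (0x0F | 0xF0) ^ (0xFF & 0xC3) */
}
```

The split costs two scalar operations per logical operation, which is why
the vector sequence is shorter when the result is needed in a vector
register anyway.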

I guess the cases in PR100694 and PR93123 dominate and the regressed
case is a corner case.  So we can probably install this patch first and open
a bug for further enhancement.

Segher, what do you think of this?

BR,
Kewen

> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> 2023-02-08  Haochen Gui 
> 
> gcc/
>   PR target/100694
>   * config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
>   vector types.
>   (and3): Replace BOOL_128 with BOOL_128_V.
>   (ior3): Likewise.
>   (xor3): Likewise.
>   (one_cmpl2 expander): New expander with BOOL_128_V.
>   (one_cmpl2 insn_and_split): Rename to ...
>   (*one_cmpl2): ... this.
> 
> gcc/testsuite/
>   PR target/100694
>   * gcc.target/powerpc/pr100694.c: New.
>   * gcc.target/powerpc/pr92398.c: New.
>   * gcc.target/powerpc/pr92398.h: Remove.
>   * gcc.target/powerpc/pr92398.p9-.c: Remove.
>   * gcc.target/powerpc/pr92398.p9+.c: Remove.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 4bd1dfd3da9..455b7329643 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128   [TI
>(V2DF  "TARGET_ALTIVEC")
>(V1TI  "TARGET_ALTIVEC")])
> 
> +;; Mode iterator for logical operations on 128-bit vector types
> +(define_mode_iterator BOOL_128_V [(V16QI "TARGET_ALTIVEC")
> +  (V8HI  "TARGET_ALTIVEC")
> +  (V4SI  "TARGET_ALTIVEC")
> +  (V4SF  "TARGET_ALTIVEC")
> +  (V2DI  "TARGET_ALTIVEC")
> +  (V2DF  "TARGET_ALTIVEC")
> +  (V1TI  "TARGET_ALTIVEC")])
> +
>  ;; For the GPRs we use 3 constraints for register outputs, two that are the
>  ;; same as the output register, and a third where the output register is an
>  ;; early clobber, so we don't have to deal with register overlaps.  For the
> @@ -7135,23 +7144,23 @@ (define_expand "subti3"
>  ;; 128-bit logical operations expanders
> 
>  (define_expand "and3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> - (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -   (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> + (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> + (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>""
>"")
> 
>  (define_expand "ior3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -(ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -   (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> + (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> + (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>""
>"")
> 
>  (define_expand "xor3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -(xor:BOOL_128 (match_oper

[PATCH] RISC-V: Fix typo of multiple_rgroup-2.h

2023-05-22 Thread juzhe . zhong
From: Juzhe-Zhong 

I just noticed the following failures in the regression run:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c (test for excess errors)

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: Fix typo

---
 .../gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
index 7b12c656779..045a76de45f 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
@@ -487,7 +487,7 @@
__builtin_abort ();\
 }
 
-#defitree-vect-loop.ccne run_10(TYPE1, TYPE2, TYPE3)   
  \
+#define run_10(TYPE1, TYPE2, TYPE3)
 \
   int n_10_##TYPE1_##TYPE2_##TYPE3 = 777;  
 \
   TYPE1 x_10_##TYPE1 = 222;
 \
   TYPE1 x2_10_##TYPE1 = 111;   
 \
-- 
2.36.3



Re: [PATCH] c-family: implement -ffp-contract=on

2023-05-22 Thread Richard Biener via Gcc-patches
On Thu, May 18, 2023 at 11:04 PM Alexander Monakov via Gcc-patches
 wrote:
>
> Implement -ffp-contract=on for C and C++ without changing default
> behavior (=off for -std=cNN, =fast for C++ and -std=gnuNN).

The documentation changes mention that the defaults are changed for
standard modes; I suppose you want to remove that hunk.

> gcc/c-family/ChangeLog:
>
> * c-gimplify.cc (fma_supported_p): New helper.
> (c_gimplify_expr) [PLUS_EXPR, MINUS_EXPR]: Implement FMA
> contraction.
>
> gcc/ChangeLog:
>
> * common.opt (fp_contract_mode) [on]: Remove fallback.
> * config/sh/sh.md (*fmasf4): Correct flag_fp_contract_mode test.
> * doc/invoke.texi (-ffp-contract): Update.
> * trans-mem.cc (diagnose_tm_1): Skip internal function calls.
> ---
>  gcc/c-family/c-gimplify.cc | 78 ++
>  gcc/common.opt |  3 +-
>  gcc/config/sh/sh.md|  2 +-
>  gcc/doc/invoke.texi|  8 ++--
>  gcc/trans-mem.cc   |  3 ++
>  5 files changed, 88 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
> index ef5c7d919f..f7635d3b0c 100644
> --- a/gcc/c-family/c-gimplify.cc
> +++ b/gcc/c-family/c-gimplify.cc
> @@ -41,6 +41,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "c-ubsan.h"
>  #include "tree-nested.h"
>  #include "context.h"
> +#include "tree-pass.h"
> +#include "internal-fn.h"
>
>  /*  The gimplification pass converts the language-dependent trees
>  (ld-trees) emitted by the parser into language-independent trees
> @@ -686,6 +688,14 @@ c_build_bind_expr (location_t loc, tree block, tree body)
>return bind;
>  }
>
> +/* Helper for c_gimplify_expr: test if target supports fma-like FN.  */
> +
> +static bool
> +fma_supported_p (enum internal_fn fn, tree type)
> +{
> +  return direct_internal_fn_supported_p (fn, type, OPTIMIZE_FOR_BOTH);
> +}
> +
>  /* Gimplification of expression trees.  */
>
>  /* Do C-specific gimplification on *EXPR_P.  PRE_P and POST_P are as in
> @@ -739,6 +749,74 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p 
> ATTRIBUTE_UNUSED,
> break;
>}
>
> +case PLUS_EXPR:
> +case MINUS_EXPR:
> +  {
> +   tree type = TREE_TYPE (*expr_p);
> +   /* For -ffp-contract=on we need to attempt FMA contraction only
> +  during initial gimplification.  Late contraction across statement
> +  boundaries would violate language semantics.  */
> +   if (SCALAR_FLOAT_TYPE_P (type)
> +   && flag_fp_contract_mode == FP_CONTRACT_ON
> +   && cfun && !(cfun->curr_properties & PROP_gimple_any)
> +   && fma_supported_p (IFN_FMA, type))
> + {
> +   bool neg_mul = false, neg_add = code == MINUS_EXPR;
> +
> +   tree *op0_p = &TREE_OPERAND (*expr_p, 0);
> +   tree *op1_p = &TREE_OPERAND (*expr_p, 1);
> +
> +   /* Look for ±(x * y) ± z, swapping operands if necessary.  */
> +   if (TREE_CODE (*op0_p) == NEGATE_EXPR
> +   && TREE_CODE (TREE_OPERAND (*op0_p, 0)) == MULT_EXPR)
> + /* '*EXPR_P' is '-(x * y) ± z'.  This is fine.  */;
> +   else if (TREE_CODE (*op0_p) != MULT_EXPR)
> + {
> +   std::swap (op0_p, op1_p);
> +   std::swap (neg_mul, neg_add);
> + }
> +   if (TREE_CODE (*op0_p) == NEGATE_EXPR)
> + {
> +   op0_p = &TREE_OPERAND (*op0_p, 0);
> +   neg_mul = !neg_mul;
> + }
> +   if (TREE_CODE (*op0_p) != MULT_EXPR)
> + break;
> +   auto_vec ops (3);
> +   ops.quick_push (TREE_OPERAND (*op0_p, 0));
> +   ops.quick_push (TREE_OPERAND (*op0_p, 1));
> +   ops.quick_push (*op1_p);
> +
> +   enum internal_fn ifn = IFN_FMA;
> +   if (neg_mul)
> + {
> +   if (fma_supported_p (IFN_FNMA, type))
> + ifn = IFN_FNMA;
> +   else
> + ops[0] = build1 (NEGATE_EXPR, type, ops[0]);
> + }
> +   if (neg_add)
> + {
> +   enum internal_fn ifn2 = ifn == IFN_FMA ? IFN_FMS : IFN_FNMS;
> +   if (fma_supported_p (ifn2, type))
> + ifn = ifn2;
> +   else
> + ops[2] = build1 (NEGATE_EXPR, type, ops[2]);
> + }
> +   for (auto &&op : ops)
> + if (gimplify_expr (&op, pre_p, post_p, is_gimple_val, fb_rvalue)
> + == GS_ERROR)
> +   return GS_ERROR;
> +
> +   gcall *call = gimple_build_call_internal_vec (ifn, ops);
> +   gimple_seq_add_stmt_without_update (pre_p, call);
> +   *expr_p = create_tmp_var (type);
> +   gimple_call_set_lhs (call, *expr_p);

it would be possible to do

  *expr_p = build_call_expr_internal (ifn, type, ops[0], ops[1], ops[2]);
  return GS_OK;

and not worry about temporary creation a
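
For reference, the sign bookkeeping in the quoted hunk can be sketched
outside the compiler. The version below assumes the target supports all four
fused variants, so none of the NEGATE_EXPR fallbacks are modelled; the enum
and both functions are hypothetical names that merely mirror IFN_FMA,
IFN_FMS, IFN_FNMA and IFN_FNMS. It uses ordinary unfused arithmetic, so it
illustrates the signs only, not the single-rounding behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the four fused multiply-add variants.  */
enum fma_kind { K_FMA, K_FMS, K_FNMA, K_FNMS };

/* Pick the variant for ±(x * y) ± z, following the patch's logic:
   NEG_MUL means the product is negated, NEG_ADD means the addend is.  */
static enum fma_kind select_kind (bool neg_mul, bool neg_add)
{
  enum fma_kind kind = neg_mul ? K_FNMA : K_FMA;
  if (neg_add)
    kind = (kind == K_FMA) ? K_FMS : K_FNMS;
  return kind;
}

/* Reference semantics of each variant, written with plain arithmetic.  */
static double eval_kind (enum fma_kind kind, double x, double y, double z)
{
  switch (kind)
    {
    case K_FMA:
      return x * y + z;
    case K_FMS:
      return x * y - z;
    case K_FNMA:
      return -(x * y) + z;
    default:
      assert (kind == K_FNMS);
      return -(x * y) - z;
    }
}
```

For example, `z - x * y` has neg_mul set after the operand swap and maps to
the FNMA form, while `x * y - z` has neg_add set and maps to FMS.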

Re: [PATCH] RISC-V: Fix typo of multiple_rgroup-2.h

2023-05-22 Thread Kito Cheng via Gcc-patches
ok

On Mon, May 22, 2023 at 6:02 PM  wrote:
>
> From: Juzhe-Zhong 
>
> I just noticed the following failures in the regression run:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c (test for excess errors)
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: Fix typo
>
> ---
>  .../gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h| 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git 
> a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> index 7b12c656779..045a76de45f 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> @@ -487,7 +487,7 @@
> __builtin_abort ();   
>  \
>  }
>
> -#defitree-vect-loop.ccne run_10(TYPE1, TYPE2, TYPE3) 
> \
> +#define run_10(TYPE1, TYPE2, TYPE3)  
>\
>int n_10_##TYPE1_##TYPE2_##TYPE3 = 777;
>\
>TYPE1 x_10_##TYPE1 = 222;  
>\
>TYPE1 x2_10_##TYPE1 = 111; 
>\
> --
> 2.36.3
>


Re: [PATCH V11] VECT: Add decrement IV support in Loop Vectorizer

2023-05-22 Thread Richard Biener via Gcc-patches
On Fri, May 19, 2023 at 12:59 PM Richard Sandiford via Gcc-patches
 wrote:
>
> "juzhe.zh...@rivai.ai"  writes:
> >>> I don't think this is a property of decrementing IVs.  IIUC it's really
> >>> a property of rgl->factor == 1 && factor == 1, where factor would need
> >>> to be passed in by the caller.  Because of that, it should probably be
> >>> a separate patch.
> > Is it right that I just post this part of the code as a separate patch
> > and then merge it?
>
> No, not in its current form.  Like I say, the test should be based on
> factors rather than TYPE_VECTOR_SUBPARTS.  But a fix for this problem
> should come before the changes to IVs.
>
> >>> That is, current LOAD_LEN targets have two properties (IIRC):
> >>> (1) all vectors used in a given piece of vector code have the same byte 
> >>> size
> >>> (2) lengths are measured in bytes rather than elements
> >>> For all cases, including SVE, the number of controls needed for a scalar
> >>> statement is equal to the number of vectors needed for that scalar
> >>> statement.
> >>> Because of (1), on current LOAD_LEN targets, the number of controls
> >>> needed for a scalar statement is also proportional to the total number
> >>> of bytes occupied by the vectors generated for that scalar statement.
> >>> And because of (2), the total number of bytes is the only thing that
> >>> matters, so all users of a particular control can use the same control
> >>> value.
> >>> E.g. on current LOAD_LEN targets, 2xV16QI and 2xV8HI would use the same
> >>> control (with no adjustment).  2xV16QI means 32 elements, while 2xV8HI
> >>> means 16 elements.  V16QI's nscalars_per_iter would therefore be double
> >>> V8HI's, but V8HI's factor would be double V16QI's (2 vs 1), so things
> >>> even out.
> >>> The code structurally supports targets that count in elements rather
> >>> than bytes, so that factor==1 for all element types.  See the
> >>> "rgl->factor == 1 && factor == 1" case in:
> >>>   if (rgl->max_nscalars_per_iter < nscalars_per_iter)
> >>>     {
> >>>       /* For now, we only support cases in which all loads and stores
> >>>          fall back to VnQI or none do.  */
> >>>       gcc_assert (!rgl->max_nscalars_per_iter
> >>>                   || (rgl->factor == 1 && factor == 1)
> >>>                   || (rgl->max_nscalars_per_iter * rgl->factor
> >>>                       == nscalars_per_iter * factor));
> >>>       rgl->max_nscalars_per_iter = nscalars_per_iter;
> >>>       rgl->type = vectype;
> >>>       rgl->factor = factor;
> >>>     }
> >>> But it hasn't been tested, since no current target uses it.
> >>> I think the above part of the patch shows that the current "factor is
> >>> always 1" path is in fact broken, and the patch is a correctness fix on
> >>> targets that measure in elements rather than bytes.
> >>> So I think the above part of the patch should go in ahead of the IV 
> >>> changes.
> >>> But the test should be based on factor rather than TYPE_VECTOR_SUBPARTS.
> > Since length control measured in bytes instead of elements is not
> > appropriate for RVV, you mean I can't support RVV auto-vectorization in
> > upstream GCC middle-end and I can only support it in my downstream, is
> > that right?
>
> No.  I haven't said in this or previous reviews that something cannot be
> supported in upstream GCC.
>
> I'm saying that the code in theory supports counting in bytes *or*
> counting in elements.  But only the first one has actually been tested.
> And so, perhaps not surprisingly, the support for counting elements
> needs a fix.
>
> The fix in your patch looks like it's on the right lines, but it should be
> based on factor rather than TYPE_VECTOR_SUBPARTS.
>
> See get_len_load_store_mode for how this selection happens:
>
> (1) IFN_LOAD_LEN itself always counts in elements rather than bytes.
>
> (2) If a target has instructions that count in elements, it should
> define load_len patterns for all vector modes that it supports.
>
> (3) If a target has instructions that count in bytes, it should define
> load_len patterns only for byte modes.  The vectoriser will then
> use byte loads for all vector types (even things like V8HI).

Not sure if you've covered this already in another thread but IIRC
RVV uses "with-len" not only for loads and stores but for arithmetic
instructions as well which is where (3) fails.  Fortunately RVV uses
element counts(?)
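
To make the quoted 2xV16QI versus 2xV8HI example concrete, here is a small
arithmetic sketch. All function names are hypothetical, and the
vectorization factor of 16 used in the usage note is an assumption picked
only so that the numbers come out as in the example. It computes the factor
for byte-counting and element-counting targets and checks the product
condition from the quoted gcc_assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Factor recorded for a length control: 1 if the target's length counts
   elements, otherwise the element size in bytes.  */
static unsigned len_factor (bool counts_elements, unsigned unit_size)
{
  return counts_elements ? 1 : unit_size;
}

/* Scalars covered per scalar iteration by NVECTORS vectors of NUNITS
   elements each, for vectorization factor VF.  */
static unsigned nscalars_per_iter (unsigned nvectors, unsigned nunits,
                                   unsigned vf)
{
  assert (vf != 0 && (nvectors * nunits) % vf == 0);
  return nvectors * nunits / vf;
}

/* The product condition from the quoted assert: two users can share one
   control value unadjusted when nscalars_per_iter * factor matches.  */
static bool same_control_p (unsigned nscalars_a, unsigned factor_a,
                            unsigned nscalars_b, unsigned factor_b)
{
  return nscalars_a * factor_a == nscalars_b * factor_b;
}
```

With byte counting, 2xV16QI gives 2 scalars per iteration with factor 1 and
2xV8HI gives 1 with factor 2, so the products agree and the control can be
shared. With element counting both factors are 1, the products differ, and
only the `rgl->factor == 1 && factor == 1` arm of the assert applies; that
is the path under discussion here.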

> For (2), the loop controls will always have a factor of 1.
> For (3), the loop controls will have a factor equal to the element
> size in bytes.  See:
>
>   machine_mode vmode;
>   if (get_len_load_store_mode (vecmode, is_load).exists (&vmode))
> {
>   nvectors = group_memory_nvectors (group_size * vf, nunits);
>   vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
>   unsigned factor = (vecmode == vmode) ? 1 : GET_MODE_UNIT_SIZE (vecmode);
>   vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, factor);
>   using_partial_vectors_p = true;
> }
>
> This part should work correctly for RVV an

Re: [committed] Enable LRA on several ports

2023-05-22 Thread Richard Biener via Gcc-patches
On Fri, May 19, 2023 at 1:45 PM Maciej W. Rozycki  wrote:
>
> On Tue, 2 May 2023, Jeff Law via Gcc-patches wrote:
>
> > Well, I'd say that my plan would be to deprecate any target that is not
> > converted by the end of this development cycle.  So the change keeps cris 
> > from
> > falling into that bucket.
>
>  As I noted in the other thread it is highly unlikely I will make it with
> the VAX target in this release cycle, owing to the catastrophic breakage
> of the exception unwinder, recently discovered, which I consider higher
> priority as a show-stopper for important software such as current GDB.  I
> will appreciate your taking this into consideration.

You might end up with VAX working fine with reload for GCC 14 but
marked as deprecated.  You then have the full next cycle to GCC 15
to improve the code quality with LRA - note that reload will likely be
removed early in that development cycle.

>  That written the VAX target does build its target libraries with `-mlra',
> but there are ICE regressions in the test suite and overall code produced
> is brown paperbag quality.  And removing `-mno-lra' before that has been
> sorted will make making LRA match old reload quality much tougher.

You can always compare to GCC 14 then or even work based off the
release branch.

Richard.

>   Maciej


Re: Re: [PATCH V11] VECT: Add decrement IV support in Loop Vectorizer

2023-05-22 Thread juzhe.zh...@rivai.ai

>> Not sure if you've covered this already in another thread but IIRC
>> RVV uses "with-len" not only for loads and stores but for arithmetic
>> instructions as well which is where (3) fails.  Fortunately RVV uses
>> element counts(?)

Yes, RVV uses element counts. But I did discover we have bugs in some
arithmetic operations.
For example, for division we definitely need len_div (...), like cond_div in
ARM SVE.
But this is another story. I have supported the full feature set of RVV in my
downstream GCC, and it has worked well for a year (I think it fixes all
potential issues for RVV).
So you can imagine I will post more middle-end patches for RVV
auto-vectorization in the future.

Thanks. 


juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-05-22 18:12
To: Richard Sandiford; juzhe.zh...@rivai.ai; gcc-patches; rguenther
Subject: Re: [PATCH V11] VECT: Add decrement IV support in Loop Vectorizer
On Fri, May 19, 2023 at 12:59 PM Richard Sandiford via Gcc-patches
 wrote:
>
> "juzhe.zh...@rivai.ai"  writes:
> >>> I don't think this is a property of decrementing IVs.  IIUC it's really
> >>> a property of rgl->factor == 1 && factor == 1, where factor would need
> >>> to be passed in by the caller.  Because of that, it should probably be
> >>> a separate patch.
> > Is it right that I just post this part of the code as a separate patch
> > and then merge it?
>
> No, not in its current form.  Like I say, the test should be based on
> factors rather than TYPE_VECTOR_SUBPARTS.  But a fix for this problem
> should come before the changes to IVs.
>
> >>> That is, current LOAD_LEN targets have two properties (IIRC):
> >>> (1) all vectors used in a given piece of vector code have the same byte 
> >>> size
> >>> (2) lengths are measured in bytes rather than elements
> >>> For all cases, including SVE, the number of controls needed for a scalar
> >>> statement is equal to the number of vectors needed for that scalar
> >>> statement.
> >>> Because of (1), on current LOAD_LEN targets, the number of controls
> >>> needed for a scalar statement is also proportional to the total number
> >>> of bytes occupied by the vectors generated for that scalar statement.
> >>> And because of (2), the total number of bytes is the only thing that
> >>> matters, so all users of a particular control can use the same control
> >>> value.
> >>> E.g. on current LOAD_LEN targets, 2xV16QI and 2xV8HI would use the same
> >>> control (with no adjustment).  2xV16QI means 32 elements, while 2xV8HI
> >>> means 16 elements.  V16QI's nscalars_per_iter would therefore be double
> >>> V8HI's, but V8HI's factor would be double V16QI's (2 vs 1), so things
> >>> even out.
> >>> The code structurally supports targets that count in elements rather
> >>> than bytes, so that factor==1 for all element types.  See the
> >>> "rgl->factor == 1 && factor == 1" case in:
> >>>   if (rgl->max_nscalars_per_iter < nscalars_per_iter)
> >>>     {
> >>>       /* For now, we only support cases in which all loads and stores
> >>>          fall back to VnQI or none do.  */
> >>>       gcc_assert (!rgl->max_nscalars_per_iter
> >>>                   || (rgl->factor == 1 && factor == 1)
> >>>                   || (rgl->max_nscalars_per_iter * rgl->factor
> >>>                       == nscalars_per_iter * factor));
> >>>       rgl->max_nscalars_per_iter = nscalars_per_iter;
> >>>       rgl->type = vectype;
> >>>       rgl->factor = factor;
> >>>     }
> >>> But it hasn't been tested, since no current target uses it.
> >>> I think the above part of the patch shows that the current "factor is
> >>> always 1" path is in fact broken, and the patch is a correctness fix on
> >>> targets that measure in elements rather than bytes.
> >>> So I think the above part of the patch should go in ahead of the IV 
> >>> changes.
> >>> But the test should be based on factor rather than TYPE_VECTOR_SUBPARTS.
> > Since length control measured in bytes instead of elements is not
> > appropriate for RVV, do you mean that I can't support RVV
> > auto-vectorization in the upstream GCC middle-end and can only support it
> > in my downstream?
>
> No.  I haven't said in this or previous reviews that something cannot be
> supported in upstream GCC.
>
> I'm saying that the code in theory supports counting in bytes *or*
> counting in elements.  But only the first one has actually been tested.
> And so, perhaps not surprisingly, the support for counting elements
> needs a fix.
>
> The fix in your patch looks like it's on the right lines, but it should be
> based on factor rather than TYPE_VECTOR_SUBPARTS.
>
> See get_len_load_store_mode for how this selection happens:
>
> (1) IFN_LOAD_LEN itself always counts in elements rather than bytes.
>
> (2) If a target has instructions that count in elements, it should
> define load_len patterns for all vector modes that it supports.
>
> (3) If a target has instructions that count in bytes, it should define
> load_len patterns only for byte modes.  The vectoriser will then
> use byte loads for all vector ty
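
The factor bookkeeping in the quoted review can be illustrated with plain integers. The sketch below is a toy model, not GCC code: `struct toy_rgroup`, `can_share_control` and `same_length_value` are hypothetical names, and the numbers are the 2xV16QI / 2xV8HI example from the review. `can_share_control` mirrors the condition inside the quoted gcc_assert, while `same_length_value` checks whether both users would see the same length without adjustment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two rgroup fields used in the quoted
   assertion; this is not the real GCC structure.  */
struct toy_rgroup
{
  unsigned max_nscalars_per_iter;
  unsigned factor;
};

/* Mirrors the condition inside the quoted gcc_assert.  */
static bool
can_share_control (const struct toy_rgroup *rgl,
                   unsigned nscalars_per_iter, unsigned factor)
{
  return (!rgl->max_nscalars_per_iter
          || (rgl->factor == 1 && factor == 1)
          || (rgl->max_nscalars_per_iter * rgl->factor
              == nscalars_per_iter * factor));
}

/* True if the existing user and the new user cover the same number of
   length units per iteration, so one control value fits both as-is.  */
static bool
same_length_value (const struct toy_rgroup *rgl,
                   unsigned nscalars_per_iter, unsigned factor)
{
  return (rgl->max_nscalars_per_iter * rgl->factor
          == nscalars_per_iter * factor);
}
```

With byte counting, 2xV8HI (16 scalars, factor 2) and 2xV16QI (32 scalars, factor 1) both give 32 units, so the control is shared unchanged. With element counting both factors are 1, the assertion still passes, but the products are 16 and 32, which appears to be the case the review says needs a fix.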

RE: [PATCH] RISC-V: Fix typo of multiple_rgroup-2.h

2023-05-22 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito and Juzhe, and sorry for the inconvenience.

Pan

-Original Message-
From: Kito Cheng  
Sent: Monday, May 22, 2023 6:05 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com; pal...@dabbelt.com; 
pal...@rivosinc.com; jeffreya...@gmail.com; rdapp@gmail.com; Li, Pan2 

Subject: Re: [PATCH] RISC-V: Fix typo of multiple_rgroup-2.h

ok

On Mon, May 22, 2023 at 6:02 PM  wrote:
>
> From: Juzhe-Zhong 
>
> I just noticed the following failures in the regression run:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c (test for 
> excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c (test for 
> excess errors)
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: Fix typo
>
> ---
>  .../gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h| 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git 
> a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> index 7b12c656779..045a76de45f 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> @@ -487,7 +487,7 @@
> __builtin_abort ();   
>  \
>  }
>
> -#defitree-vect-loop.ccne run_10(TYPE1, TYPE2, TYPE3) 
> \
> +#define run_10(TYPE1, TYPE2, TYPE3)  
>\
>int n_10_##TYPE1_##TYPE2_##TYPE3 = 777;
>\
>TYPE1 x_10_##TYPE1 = 222;  
>\
>TYPE1 x2_10_##TYPE1 = 111; 
>\
> --
> 2.36.3
>


Re: [PATCH 1/2] Improve do_store_flag for single bit comparison against 0

2023-05-22 Thread Richard Biener via Gcc-patches
On Fri, May 19, 2023 at 4:15 AM Andrew Pinski via Gcc-patches
 wrote:
>
> While working on something else, I noticed we could improve
> the code generated for the following function:
> ```
> unsigned f(unsigned t)
> {
>   if (t & ~(1<<30)) __builtin_unreachable();
>   return t != 0;
> }
> ```
> Right now we just emit a comparison against 0 instead
> of just a shift right by 30.
> There is code in do_store_flag which already optimizes
> `(t & 1<<30) != 0` to `(t >> 30) & 1`. This patch
> extends it to handle the case where we know that only
> one bit of t can be nonzero.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> gcc/ChangeLog:
>
> * expr.cc (do_store_flag): Extend the one bit checking case
> to handle the case where we don't have an and but rather still
> one bit is known to be non-zero.
> ---
>  gcc/expr.cc | 27 +--
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 5ede094e705..91528e734e7 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -13083,15 +13083,30 @@ do_store_flag (sepops ops, rtx target, machine_mode 
> mode)
>&& integer_zerop (arg1)
>&& (TYPE_PRECISION (ops->type) != 1 || TYPE_UNSIGNED (ops->type)))
>  {
> -  gimple *srcstmt = get_def_for_expr (arg0, BIT_AND_EXPR);
> -  if (srcstmt
> - && integer_pow2p (gimple_assign_rhs2 (srcstmt)))
> +  wide_int nz = tree_nonzero_bits (arg0);
> +
> +  if (wi::popcount (nz) == 1)
> {
> + tree op0;
> + tree op1;
> + gimple *srcstmt = get_def_for_expr (arg0, BIT_AND_EXPR);
> + /* If the defining statement was (x & POW2), then remove the and
> +as we are going to add it back. */
> + if (srcstmt
> + && integer_pow2p (gimple_assign_rhs2 (srcstmt)))
> +   {
> + op0 = gimple_assign_rhs1 (srcstmt);
> + op1 = gimple_assign_rhs2 (srcstmt);
> +   }
> + else
> +   {
> + op0 = arg0;
> + op1 = wide_int_to_tree (TREE_TYPE (op0), nz);
> +   }
>   enum tree_code tcode = code == NE ? NE_EXPR : EQ_EXPR;
>   type = lang_hooks.types.type_for_mode (mode, unsignedp);
> - tree temp = fold_build2_loc (loc, BIT_AND_EXPR, TREE_TYPE (arg1),
> -  gimple_assign_rhs1 (srcstmt),
> -  gimple_assign_rhs2 (srcstmt));
> + tree temp = fold_build2_loc (loc, BIT_AND_EXPR, TREE_TYPE (op0),
> +  op0, op1);
>   temp = fold_single_bit_test (loc, tcode, temp, arg1, type);
>   if (temp)
> return expand_expr (temp, target, VOIDmode, EXPAND_NORMAL);

I wonder if, instead of extending expand with these kinds of tricks, we
want to add to ISEL and use direct optab IFNs for things we matched?  In
particular I think we do want to get rid of TER, but the above adds another
use of get_def_for_expr.

As Jeff says, the above doesn't look like it includes costing, so that would
be an argument to make it a generic match.pd transform (it appears to be
"simpler")?

Richard.

> --
> 2.31.1
>
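
The equivalence the patch relies on can be checked at the source level. This is a minimal sketch, assuming a 32-bit `unsigned`; `compare_form` and `shift_form` are illustrative names and say nothing about how do_store_flag is implemented.

```c
#include <assert.h>

/* The comparison as written in the example function.  */
static unsigned
compare_form (unsigned t)
{
  return t != 0;
}

/* The shift form.  It only matches compare_form when bit 30 is the sole
   bit of T that can be set, which the assertion makes explicit.  */
static unsigned
shift_form (unsigned t)
{
  assert ((t & ~(1u << 30)) == 0);
  return (t >> 30) & 1;
}
```

Under that precondition the only possible inputs are 0 and 1<<30, and both forms agree on each of them.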


Re: [PATCH] avr: Set param_min_pagesize to 0 [PR105523]

2023-05-22 Thread Richard Biener via Gcc-patches
On Fri, May 19, 2023 at 7:58 AM  wrote:
>
> On 26/04/23, 5:51 PM, "Richard Biener" wrote:
> > On Wed, Apr 26, 2023 at 12:56 PM wrote:
> > >
> > > On Wed, Apr 26, 2023 at 3:15 PM Richard Biener via Gcc-patches wrote:
> > > >
> > > > On Wed, Apr 26, 2023 at 11:42 AM Richard Biener wrote:
> > > > >
> > > > > On Wed, Apr 26, 2023 at 11:01 AM SenthilKumar.Selvaraj--- via
> > > > > Gcc-patches wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > This patch fixes PR 105523 by setting param_min_pagesize to 0 for 
> > > > > > the
> > > > > > avr target. For this target, zero and offsets from zero are 
> > > > > > perfectly
> > > > > > valid addresses, and the default value of param_min_pagesize ends up
> > > > > > triggering warnings on valid memory accesses.
> > > > >
> > > > > I think the proper configuration is to have
> > > > > DEFAULT_ADDR_SPACE_ZERO_ADDRESS_VALID
> > > >
> > > > Err, TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
> > >
> > > That worked. Ok for trunk and backporting to 13 and 12 branches
> > > (pending regression testing)?
> >
> >
> > OK, but please let Denis time to comment.
>
> Didn't hear from Denis. When running regression tests with this patch,
> I found that some tests with -fdelete-null-pointer-checks were
> failing. Commit 19416210b37db0584cd0b3f3b3961324b8973d25 made
> -fdelete-null-pointer-checks false by default, while still allowing it
> to be overridden from the command line (it was previously
> unconditionally false).
>
> To keep the same behavior, I modified the hook to report zero
> addresses as valid only if -fdelete-null-pointer-checks is not set.
> With this change, all regression tests pass.
>
> Ok for trunk and backporting to 13 and 12 branches?

I think that's a bit backwards - this hook conveys more precise information
(it's address-space specific) and it is also more specific.  Instead I'd
suggest to set the flag to zero in the target like nios2 or msp430 do.
In fact we should probably initialize it using this hook (and using the
default address space).

Richard.

> Regards
> Senthil
>
> PR 105523
>
> gcc/ChangeLog:
>
> * config/avr/avr.cc (avr_addr_space_zero_address_valid):
> (TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID): Return true if
> flag_delete_null_pointer_checks is not set.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/avr/pr105523.c: New test.
>
>
> diff --git gcc/config/avr/avr.cc gcc/config/avr/avr.cc
> index d5af40f..4c9eb84 100644
> --- gcc/config/avr/avr.cc
> +++ gcc/config/avr/avr.cc
> @@ -9787,6 +9787,18 @@ avr_addr_space_diagnose_usage (addr_space_t as, 
> location_t loc)
>(void) avr_addr_space_supported_p (as, loc);
>  }
>
> +/* Implement `TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID. Zero is a valid
> +   address in all address spaces. Even in ADDR_SPACE_FLASH1 etc..,
> +   a zero address is valid and means 0x, where RAMPZ is
> +   set to the appropriate segment value.
> +   If the user explicitly passes in -fdelete-null-pointer-checks though,
> +   assume zero addresses are invalid.*/
> +
> +static bool
> +avr_addr_space_zero_address_valid (addr_space_t as ATTRIBUTE_UNUSED)
> +{
> +  return flag_delete_null_pointer_checks == 0;
> +}
>
>  /* Look if DECL shall be placed in program memory space by
> means of attribute `progmem' or some address-space qualifier.
> @@ -14687,6 +14699,9 @@ avr_float_lib_compare_returns_bool (machine_mode 
> mode, enum rtx_code)
>  #undef  TARGET_ADDR_SPACE_DIAGNOSE_USAGE
>  #define TARGET_ADDR_SPACE_DIAGNOSE_USAGE avr_addr_space_diagnose_usage
>
> +#undef  TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
> +#define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID 
> avr_addr_space_zero_address_valid
> +
>  #undef  TARGET_MODE_DEPENDENT_ADDRESS_P
>  #define TARGET_MODE_DEPENDENT_ADDRESS_P avr_mode_dependent_address_p
>
> diff --git gcc/testsuite/gcc.target/avr/pr105523.c 
> gcc/testsuite/gcc.target/avr/pr105523.c
> new file mode 100644
> index 000..fbbf7bf
> --- /dev/null
> +++ gcc/testsuite/gcc.target/avr/pr105523.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os -Wall" } */
> +
> +/* Verify no "array subscript 0 is outside array bounds of" is generated
> +   for accessing memory addresses in the 0-4096 range. */
> +
> +typedef __UINT8_TYPE__ uint8_t;
> +
> +#define SREG (*(volatile uint8_t*) (0x3F + __AVR_SFR_OFFSET__ ))
> +
> +void bar (void)
> +{
> +SREG = 0;
> +}
>


[PATCH] RISC-V: Add "m_" prefix for private member

2023-05-22 Thread juzhe . zhong
From: Juzhe-Zhong 

The current framework is hard to maintain and hard to use for possible
future auto-vectorization patterns.

We will need to keep adding more helpers and arguments as auto-vectorization
support grows. We should refactor the framework now for future use, since we
don't support many auto-vectorization patterns yet.

Start with this simple patch, which adds the "m_" prefix to the private
members.

gcc/ChangeLog:

* config/riscv/riscv-v.cc: Add "m_" prefix.

---
 gcc/config/riscv/riscv-v.cc | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index d65e7300303..e0b19bc1754 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -66,7 +66,7 @@ const_vlmax_p (machine_mode mode)
 template <int MAX_OPERANDS> class insn_expander
 {
 public:
-  insn_expander () : m_opno (0), has_dest(false) {}
+  insn_expander () : m_opno (0), m_has_dest_p(false) {}
   void add_output_operand (rtx x, machine_mode mode)
   {
 create_output_operand (&m_ops[m_opno++], x, mode);
@@ -99,41 +99,41 @@ public:
 
   void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
   {
-dest_mode = GET_MODE (dest);
-has_dest = true;
+m_dest_mode = GET_MODE (dest);
+m_has_dest_p = true;
 
-add_output_operand (dest, dest_mode);
+add_output_operand (dest, m_dest_mode);
 
 if (mask)
   add_input_operand (mask, GET_MODE (mask));
 else
   add_all_one_mask_operand (mask_mode);
 
-add_vundef_operand (dest_mode);
+add_vundef_operand (m_dest_mode);
   }
 
   void set_len_and_policy (rtx len, bool force_vlmax = false)
 {
   bool vlmax_p = force_vlmax || !len;
-  gcc_assert (has_dest);
+  gcc_assert (m_has_dest_p);
 
-  if (vlmax_p && const_vlmax_p (dest_mode))
+  if (vlmax_p && const_vlmax_p (m_dest_mode))
{
  /* Optimize VLS-VLMAX code gen, we can use vsetivli instead of the
 vsetvli to obtain the value of vlmax.  */
- poly_uint64 nunits = GET_MODE_NUNITS (dest_mode);
+ poly_uint64 nunits = GET_MODE_NUNITS (m_dest_mode);
  len = gen_int_mode (nunits, Pmode);
  vlmax_p = false; /* It has became NONVLMAX now.  */
}
   else if (!len)
{
  len = gen_reg_rtx (Pmode);
- emit_vlmax_vsetvl (dest_mode, len);
+ emit_vlmax_vsetvl (m_dest_mode, len);
}
 
   add_input_operand (len, Pmode);
 
-  if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
+  if (GET_MODE_CLASS (m_dest_mode) != MODE_VECTOR_BOOL)
add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy 
());
 
   add_avl_type_operand (vlmax_p ? avl_type::VLMAX : avl_type::NONVLMAX);
@@ -152,8 +152,8 @@ public:
 
 private:
   int m_opno;
-  bool has_dest;
-  machine_mode dest_mode;
+  bool m_has_dest_p;
+  machine_mode m_dest_mode;
   expand_operand m_ops[MAX_OPERANDS];
 };
 
-- 
2.36.3



Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread Robin Dapp via Gcc-patches
> Thanks Robin. Address comment.

Did you intend to send an update here already or are you working
on it?  Just wondering because you just sent another refactoring
patch.

Regards
 Robin


Re: [PATCH] Fix handling of non-integral bit-fields in native_encode_initializer

2023-05-22 Thread Richard Biener via Gcc-patches
On Mon, May 22, 2023 at 10:10 AM Eric Botcazou via Gcc-patches
 wrote:
>
> Hi,
>
> the encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD) have
> integral types, but that's not the case in Ada where they may have pretty much
> any type, resulting in a wrong encoding for them.
>
> The attached fix filters out non-integral bit-fields, except if they start and
> end on a byte boundary because they are correctly handled in this case.
>
> Bootstrapped/regtested on x86-64/Linux, OK for mainline and 13 branch?

OK.

Can we handle non-integer bitfields by recursing with a temporary buffer to
encode it byte-aligned and then apply shifting and masking to get it in place?
Or is that not worth it?

Thanks,
Richard.

>
>
> 2023-05-22  Eric Botcazou  
>
> * fold-const.cc (native_encode_initializer) : Apply the
> specific treatment for bit-fields only if they have an integral type
> and filter out non-integral bit-fields that do not start and end on
> a byte boundary.
>
>
> 2023-05-22  Eric Botcazou  
>
> * gnat.dg/opt101.adb: New test.
> * gnat.dg/opt101_pkg.ads: New helper.
>
> --
> Eric Botcazou
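
The "encode byte-aligned into a temporary buffer, then shift and mask into place" idea can be sketched on raw byte arrays. This is only an illustration under stated assumptions: `insert_bits` is a hypothetical helper, bits are numbered little-endian (bit 0 is the least significant bit of byte 0), and none of it reflects how native_encode_initializer handles target endianness.

```c
#include <assert.h>
#include <stddef.h>

/* Copy NBITS bits from the byte-aligned temporary SRC into DST, starting
   at bit position BITPOS of DST and leaving all other DST bits intact.  */
static void
insert_bits (unsigned char *dst, size_t dst_len,
             const unsigned char *src, size_t bitpos, size_t nbits)
{
  assert (bitpos + nbits <= dst_len * 8);
  for (size_t i = 0; i < nbits; i++)
    {
      /* Extract bit I of the temporary encoding.  */
      unsigned bit = (src[i / 8] >> (i % 8)) & 1u;
      /* Mask it into its final, possibly unaligned, position.  */
      size_t d = bitpos + i;
      unsigned char mask = (unsigned char) (1u << (d % 8));
      dst[d / 8] = (unsigned char) ((dst[d / 8] & ~mask) | (bit ? mask : 0));
    }
}
```

A real implementation would likely move whole bytes at a time; the bit-by-bit loop is chosen here only because it is easy to verify.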


Re: Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread juzhe.zh...@rivai.ai
Yes, I am working on it, but I noticed that the current framework is really
ugly and bad, so I am going to refactor it before I send the comparison
support.

I am refactoring because we are going to have many different
auto-vectorization patterns, for example cond_add, etc.

I should make the current framework suitable for all of them to simplify
future work.

Thanks. 


juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-05-22 20:14
To: juzhe.zh...@rivai.ai; gcc-patches
CC: rdapp.gcc; Kito.cheng; palmer; jeffreyalaw; richard.sandiford
Subject: Re: [PATCH] RISC-V: Add RVV comparison autovectorization
> Thanks Robin. Address comment.
 
Did you intend to send an update here already or are you working
on it?  Just wondering because you just sent another refactoring
patch.
 
Regards
Robin
 


Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread Robin Dapp via Gcc-patches
> I do refactoring since we are going to have many different
> auto-vectorization patterns, for example: cond_add, etc.
> 
> I should make the current framework suitable for all of them to
> simplify the future work.

That's good in general but can't it wait until the respective
changes go in?  I don't know how much you intend to change but
it will be easier to review as well if we don't change parts now
that might be used differently in the future. On top, we won't
get everything right with the first shot anyway.

Regards
 Robin


Re: Re: [PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-22 Thread juzhe.zh...@rivai.ai
I will send the refactoring patch first, soon, and then the comparison patch.
The refactoring patch will be applicable to all future use, and it should come
first, since I have implemented all of the RVV auto-vectorization patterns and
I know what we will need in the future.

Thanks.


juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-05-22 20:26
To: juzhe.zh...@rivai.ai; gcc-patches
CC: rdapp.gcc; Kito.cheng; palmer; jeffreyalaw; richard.sandiford
Subject: Re: [PATCH] RISC-V: Add RVV comparison autovectorization
> I do refactoring since we are going to have many different
> auto-vectorization patterns, for example: cond_add, etc.
> 
> I should make the current framework suitable for all of them to
> simplify the future work.
 
That's good in general but can't it wait until the respective
changes go in?  I don't know how much you intend to change but
it will be easier to review as well if we don't change parts now
that might be used differently in the future. On top, we won't
get everything right with the first shot anyway.
 
Regards
Robin
 


[PATCH] libgomp: Fix build for -fshort-enums

2023-05-22 Thread Sebastian Huber
Make sure that the API enums have at least the size of int.  Otherwise the
following build error may occur:

In file included from gcc/libgomp/env.c:34:
./libgomp_f.h: In function 'omp_check_defines':
./libgomp_f.h:77:8: error: size of array 'test' is negative
   77 |   char test[(28 != sizeof (omp_lock_t)
  |^~~~

libgomp/ChangeLog:

* omp.h.in (omp_alloctrait_key_t):  Add __omp_alloctrait_key_t_max__
with a value of the int type maximum.
---
 libgomp/omp.h.in | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index bd1286c2a3f..3b1612fcb15 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -146,7 +146,8 @@ typedef enum omp_alloctrait_key_t
   omp_atk_fallback = 5,
   omp_atk_fb_data = 6,
   omp_atk_pinned = 7,
-  omp_atk_partition = 8
+  omp_atk_partition = 8,
+  __omp_alloctrait_key_t_max__ = __INT_MAX__
 } omp_alloctrait_key_t;
 
 typedef enum omp_alloctrait_value_t
-- 
2.35.3
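
The effect of the sentinel enumerator can be reproduced outside libgomp. In this minimal sketch `toy_key_t` is a hypothetical enum, not the real omp_alloctrait_key_t; the typedef uses the same negative-array-size trick as the failing check in libgomp_f.h.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical enum with small values plus a sentinel.  The sentinel
   forces a value range that needs a full int, so the type cannot be
   shrunk below the size of int, even with -fshort-enums.  */
typedef enum toy_key_t
{
  toy_key_a = 1,
  toy_key_b = 8,
  toy_key_max__ = INT_MAX
} toy_key_t;

/* Compile-time check: the array size becomes -1, and compilation fails,
   if the enum is smaller than int.  */
typedef char toy_key_is_int_sized[sizeof (toy_key_t) >= sizeof (int) ? 1 : -1];

/* The same check at run time.  */
static void
check_toy_key_size (void)
{
  assert (sizeof (toy_key_t) >= sizeof (int));
}
```

Removing `toy_key_max__` and compiling with -fshort-enums would be expected to trip the typedef on targets where the remaining values fit in a char.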



Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass.

2023-05-22 Thread Richard Biener via Gcc-patches
On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal  wrote:
>
> Hello All:
>
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.
> Review comments are incorporated.
>
> Bootstrapped and regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>
> tree-ssa-sink: Improve code sinking pass.
>
> Code sinking currently sinks statements into blocks after a call. This
> increases register pressure for callee-saved registers. Improve code
> sinking by sinking statements before the call, into the use blocks or
> the immediate dominator of the use blocks.
>
> 2023-05-18  Ajit Kumar Agarwal  
>
> gcc/ChangeLog:
>
> * tree-ssa-sink.cc (statement_sink_location): Modified to
> move statements before calls.
> (block_call_p): New function.
> (def_use_same_block): New function.
> (select_best_block): Add heuristics to select the best
> blocks in the immediate post dominator.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
> * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
>  gcc/tree-ssa-sink.cc| 159 ++--
>  3 files changed, 185 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> new file mode 100644
> index 000..716bc1f9257
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized 
> -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +{
> +  bar();
> +  j = l;
> +}
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

this doesn't verify the place we sink to?

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> new file mode 100644
> index 000..ff41e2ea8ae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j, x;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +{
> +  bar();
> +  if (b != 3)
> +x = 3;
> +  else
> +x = 5;
> +  j = l;
> +}
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

likewise.  So both tests already pass before the patch?

> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index 87b1d40c174..76556e7795b 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, 
> bool *debug_stmts)
>return commondom;
>  }
>
> +/* Return TRUE if immediate uses of the defs in
> +   USE occur in the same block as USE, FALSE otherwise.  */
> +
> +bool
> +def_use_same_block (gimple *stmt)
> +{
> +  use_operand_p use_p;
> +  def_operand_p def_p;
> +  imm_use_iterator imm_iter;
> +  ssa_op_iter iter;
> +
> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
> +{
> +  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
> +   {
> + if (is_gimple_debug (USE_STMT (use_p)))
> +   continue;
> +
> + if (use_p

use_p is never null

> + && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
> +   return true;

the function behavior is obviously odd ...

> +   }
> + }
> +  return false;
> +}
> +
> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> +
> +bool
> +block_call_p (basic_block bb)
> +{
> +  int i = 0;
> +  bool is_call = false;
> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +  gimple *last_stmt = gsi_stmt (gsi);
> +
> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> +{
> +  if (!gsi_end_p (gsi))
> +   gsi_prev (&gsi);
> +
> +   for (; !gsi_end_p (gsi);)
> +{
> +  gimple *stmt = gsi_stmt (gsi);
> +
> +  /* We have already seen a call.  */
> +  if (is_call)
> +return false;

Likewise.  Do you want to check whether a block has
a single stmt and that is a call and that is followed by
a condition?  It looks like a very convoluted way to write this.
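
If that is the intent, the check can be stated much more directly. The following is a toy model only: `enum toy_stmt_kind` and `block_is_call_then_cond` are hypothetical and stand in for walking the real GIMPLE statement list of a basic block.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds; real code would inspect gimple stmts.  */
enum toy_stmt_kind { TOY_ASSIGN, TOY_CALL, TOY_COND };

/* True iff the block consists of exactly one call followed by a
   condition, which is what the quoted loop appears to compute.  */
static bool
block_is_call_then_cond (const enum toy_stmt_kind *stmts, size_t n)
{
  return n == 2 && stmts[0] == TOY_CALL && stmts[1] == TOY_COND;
}
```
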

> +
> +  if (is_gimple_call (stmt))
> +is_call = true;
> +  else
> +return false;
> +
> +  if (!gsi_end_p (gsi))
> +gsi_prev (&gsi);
> +
> +   ++i;
> +   }
> + }
> +  if (is_call && i == 1)
> +return true;
> +
> +  retu

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-22 Thread Richard Biener via Gcc-patches
On Thu, 18 May 2023, Andre Vieira (lists) wrote:

> How about this?
> 
> Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def,
> was struggling to word these, so improvements welcome!

The even/odd variant optabs are also commutative_optab_p, so is
the vec_widen_sadd without hi/lo or even/odd.

+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */

do you really want -all?  I think you want -details

+  else if (widening_fn_p (ifn)
+  || narrowing_fn_p (ifn))
+   {
+ tree lhs = gimple_get_lhs (stmt);
+ if (!lhs)
+   {
+ error ("vector IFN call with no lhs");
+ debug_generic_stmt (fn);

that's an error because ...?  Maybe we want to verify this
for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal
function calls, but I wouldn't add any verification as part
of this patch (not special to widening/narrowing fns either).

if (gimple_call_internal_p (stmt))
- return 0;
+ {
+   internal_fn fn = gimple_call_internal_fn (stmt);
+   switch (fn)
+ {
+ case IFN_VEC_WIDEN_PLUS_HI:
+ case IFN_VEC_WIDEN_PLUS_LO:
+ case IFN_VEC_WIDEN_MINUS_HI:
+ case IFN_VEC_WIDEN_MINUS_LO:
+   return 1;

this now looks incomplete.  I think that we want instead to
have a default: returning 1 and then special-cases we want
to cost as zero.  Not sure which - maybe blame tells why
this was added?  I think we can deal with this as followup
(likewise the ranger additions).

Otherwise looks good to me.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 2023-04-25  Andre Vieira  
> Joel Hutton  
> Tamar Christina  
> 
> * config/aarch64/aarch64-simd.md (vec_widen_addl_lo_):
> Rename
> this ...
> (vec_widen_add_lo_): ... to this.
> (vec_widen_addl_hi_): Rename this ...
> (vec_widen_add_hi_): ... to this.
> (vec_widen_subl_lo_): Rename this ...
> (vec_widen_sub_lo_): ... to this.
> (vec_widen_subl_hi_): Rename this ...
> (vec_widen_sub_hi_): ...to this.
> * doc/generic.texi: Document new IFN codes.
>   * internal-fn.cc (ifn_cmp): Function to compare ifn's for
> sorting/searching.
>   (lookup_hilo_internal_fn): Add lookup function.
>   (commutative_binary_fn_p): Add widen_plus fn's.
>   (widening_fn_p): New function.
>   (narrowing_fn_p): New function.
>(direct_internal_fn_optab): Change visibility.
>   * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
> internal_fn that expands into multiple internal_fns for widening.
> (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
> (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
>  IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
>  IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, 
> IFN_VEC_WIDEN_MINUS_LO,
>  IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
>plus,minus functions.
>   * internal-fn.h (direct_internal_fn_optab): Declare new prototype.
>   (lookup_hilo_internal_fn): Likewise.
>   (widening_fn_p): Likewise.
>   (narrowing_fn_p): Likewise.
>   * optabs.cc (commutative_optab_p): Add widening plus optabs.
>   * optabs.def (OPTAB_D): Define widen add, sub optabs.
> * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns.
> * tree-inline.cc (estimate_num_insns): Return same
> cost for widen add and sub IFNs as previous tree_codes.
>   * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> patterns with a hi/lo or even/odd split.
> (vect_recog_sad_pattern): Refactor to use new IFN codes.
> (vect_recog_widen_plus_pattern): Likewise.
> (vect_recog_widen_minus_pattern): Likewise.
> (vect_recog_average_pattern): Likewise.
>   * tree-vect-stmts.cc (vectorizable_conversion): Add support for
>_HILO IFNs.
>   (supportable_widening_operation): Likewise.
> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/vect-widen-add.c: Test that new
> IFN_VEC_WIDEN_PLUS is being used.
>   * gcc.target/aarch64/vect-widen-sub.c: Test that new
> IFN_VEC_WIDEN_MINUS is being used.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] add glibc-stdint.h to vax and lm32 linux target (PR target/105525)

2023-05-22 Thread Maciej W. Rozycki
On Fri, 19 May 2023, Mikael Pettersson wrote:

> >  Hmm, I find it quite interesting and indeed encouraging that someone
> > actually verifies our VAX/Linux target.
> >
> >  Mikael, how do you actually verify it however?
> 
> My vax builds are only cross-compilers without kernel headers or libc.

 Hmm, interesting, I wasn't aware you could actually build stage 1 GCC 
without target headers nowadays.

 When I tried it previously, it failed, and I had to come up with a hack 
to make glibc's `make install-headers' work, as ordinarily it requires a 
target compiler, making it a chicken-and-egg problem.

> The background is that I maintain a script to build GCC-based crosses to
> as many targets as I can, currently it supports 78 distinct processors and
> 82 triplets (four processors have multiple triplets). I only check that I can
> build the toolchains (full linux-gnu ones where possible).

 Great work, thanks!

  Maciej


Re: [PATCH 1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-22 Thread Richard Biener via Gcc-patches
On Wed, May 17, 2023 at 3:05 PM Cui, Lili  wrote:
>
> > I think to make a difference you need to hit the number of parallel 
> > fadd/fmul
> > the pipeline can perform.  I don't think issue width is ever a problem for
> > chains w/o fma and throughput of fma vs fadd + fmul should be similar.
> >
>
> Yes, for the x86 backend, fadd, fmul and fma have the same throughput (TP),
> meaning they should have the same width.
> The current implementation is reasonable  /* reassoc int, fp, vec_int,
> vec_fp.  */.
>
> > That said, I think iff then we should try to improve
> > rewrite_expr_tree_parallel rather than adding a new function.  For example
> > for the case with equal rank operands we can try to sort adds first.  I 
> > can't
> > convince myself that rewrite_expr_tree_parallel honors ranks properly
> > quickly.
> >
>
> I rewrote this patch; there are mainly two changes:
> 1. I made some changes to rewrite_expr_tree_parallel_for_fma and used it
> instead of rewrite_expr_tree_parallel. The following example shows that the
> sequence generated by this patch is better.
> 2. Put non-mult ops and mult ops alternately at the end of the queue, which is
> conducive to generating more FMAs and reducing the loss of FMAs when breaking
> the chain.
>
> With these two changes, GCC can break the chain with width = 2 and generates 
> 6 FMAs for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98350  without any 
> params.
>
> --
> Source code: g + h + j + s + m + n + a + b + e  (https://godbolt.org/z/G8sb86n84)
> Compile options: -Ofast -mfpmath=sse -mfma
> Width = 3 was chosen for reassociation
> -
> Old rewrite_expr_tree_parallel generates:
>   _6 = g_8(D) + h_9(D);   --> parallel 0
>   _3 = s_11(D) + m_12(D);  --> parallel 1
>   _5 = _3 + j_10(D);
>   _2 = n_13(D) + a_14(D);   --> parallel 2
>   _1 = b_15(D) + e_16(D);  --> parallel 3; this is not necessary, and it
>   is not friendly to FMA.
>   _4 = _1 + _2;
>   _7 = _4 + _5;
>   _17 = _6 + _7;
>   return _17;
>
> When the width = 3, we need 5 cycles here.
> ------ first end ------
> The rewritten old rewrite_expr_tree_parallel (3 sets in parallel) generates:
>
>   _3 = s_11(D) + m_12(D);  --> parallel 0
>   _5 = _3 + j_10(D);
>   _2 = n_13(D) + a_14(D);   --> parallel 1
>   _1 = b_15(D) + e_16(D);   --> parallel 2
>   _4 = _1 + _2;
>   _6 = _4 + _5;
>   _7 = _6 + h_9(D);
>   _17 = _7 + g_8(D);
>   return _17;
>
> When the width = 3, we need 5 cycles here.
> ------ second end ------
> Using rewrite_expr_tree_parallel_for_fma instead of rewrite_expr_tree_parallel
> generates:
>
>   _3 = s_11(D) + m_12(D);
>   _6 = _3 + g_8(D);
>   _2 = n_13(D) + a_14(D);
>   _5 = _2 + h_9(D);
>   _1 = b_15(D) + e_16(D);
>   _4 = _1 + j_10(D);
>   _7 = _4 + _5;
>   _17 = _7 + _6;
>   return _17;
>
> When the width = 3, we need 4 cycles here.
> ------ third end ------

Yes, so what I was saying is that I doubt rewrite_expr_tree_parallel
is optimal - you show that for the specific example
rewrite_expr_tree_parallel_for_fma is better.  I was arguing we want
a single function, whether we single out leaves with multiplications
or not.

And we want documentation that shows the strategy will result in optimal latency
(I think we should not sacrifice latency just for the sake of forming
more FMAs).
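
To make the cycle counts quoted above easy to re-check, here is a rough
sketch of one model that reproduces them.  The model is an assumption, not
something stated in the thread: every addition has unit latency, at most
WIDTH additions issue per cycle, and ready statements are picked in program
order.  The names count_cycles, old_seq, second_seq and fma_seq are
hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define MAX_OPS 16

/* One binary addition.  dep0/dep1 index earlier statements in the same
   sequence, or are -1 when the operand is a leaf such as g_8(D).  */
struct op { int dep0, dep1; };

/* Return nonzero if operand DEP is available at the start of CYCLE.  */
static int
operand_ready (const int *issue_cycle, int dep, int cycle)
{
  return dep < 0 || (issue_cycle[dep] != 0 && issue_cycle[dep] < cycle);
}

/* Assumed model: unit latency, at most WIDTH statements issued per cycle,
   ready statements chosen in program order.  Returns the cycle count.  */
int
count_cycles (const struct op *ops, int n, int width)
{
  int issue_cycle[MAX_OPS] = { 0 };  /* 0 means "not issued yet".  */
  int remaining = n, cycle = 0;

  assert (n <= MAX_OPS && width > 0);
  while (remaining > 0)
    {
      int issued = 0;
      cycle++;
      for (int i = 0; i < n && issued < width; i++)
        if (issue_cycle[i] == 0
            && operand_ready (issue_cycle, ops[i].dep0, cycle)
            && operand_ready (issue_cycle, ops[i].dep1, cycle))
          {
            issue_cycle[i] = cycle;
            issued++;
            remaining--;
          }
    }
  return cycle;
}

/* Statements in the order printed in the three dumps above.  */
const struct op old_seq[8] = {
  { -1, -1 },  /* _6  = g + h   */
  { -1, -1 },  /* _3  = s + m   */
  {  1, -1 },  /* _5  = _3 + j  */
  { -1, -1 },  /* _2  = n + a   */
  { -1, -1 },  /* _1  = b + e   */
  {  4,  3 },  /* _4  = _1 + _2 */
  {  5,  2 },  /* _7  = _4 + _5 */
  {  0,  6 },  /* _17 = _6 + _7 */
};

const struct op second_seq[8] = {
  { -1, -1 },  /* _3  = s + m   */
  {  0, -1 },  /* _5  = _3 + j  */
  { -1, -1 },  /* _2  = n + a   */
  { -1, -1 },  /* _1  = b + e   */
  {  3,  2 },  /* _4  = _1 + _2 */
  {  4,  1 },  /* _6  = _4 + _5 */
  {  5, -1 },  /* _7  = _6 + h  */
  {  6, -1 },  /* _17 = _7 + g  */
};

const struct op fma_seq[8] = {
  { -1, -1 },  /* _3  = s + m   */
  {  0, -1 },  /* _6  = _3 + g  */
  { -1, -1 },  /* _2  = n + a   */
  {  2, -1 },  /* _5  = _2 + h  */
  { -1, -1 },  /* _1  = b + e   */
  {  4, -1 },  /* _4  = _1 + j  */
  {  5,  3 },  /* _7  = _4 + _5 */
  {  6,  1 },  /* _17 = _7 + _6 */
};
```

Under these assumptions count_cycles returns 5, 5 and 4 for the three
sequences with width 3.  The result depends on the issue order: a scheduler
that is free to reorder the first sequence can start the three additions
feeding the longest chain first, so this is only a way to compare statement
orders, not a proof of optimal latency.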

Richard.

>
> Thanks,
> Lili.
>


[PATCH] libiberty: On Windows pass a >32k cmdline through a response file.

2023-05-22 Thread Costas Argyris via Gcc-patches
Currently on Windows, when CreateProcess is called with a command-line
that exceeds the 32k Windows limit, we get a very bad error:

"CreateProcess: No such file or directory"

This patch detects the case where this would happen and writes the
long command-line to a temporary response file and calls CreateProcess
with @file instead.
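
The idea can be sketched in portable C, independent of libiberty.  This is
only an illustration under stated assumptions: needs_response_file,
write_response_file and response_cmdline are hypothetical helpers, not
functions from pex-win32.c, and the one-argument-per-line file format does
no quoting, unlike libiberty's writeargv which the patch itself uses.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Limit used by the patch: longer command lines make CreateProcess fail.  */
#define CMDLINE_LIMIT 32767

/* Return 1 if CMDLINE is too long to pass to CreateProcess directly.  */
int
needs_response_file (const char *cmdline)
{
  return strlen (cmdline) > CMDLINE_LIMIT;
}

/* Write argv[1..] to PATH, one argument per line, skipping the program
   name.  Simplification: arguments containing whitespace are not quoted.
   Returns 0 on success, -1 on failure.  */
int
write_response_file (const char *path, char *const *argv)
{
  FILE *f = fopen (path, "w");
  if (!f)
    return -1;
  for (int i = 1; argv[i] != NULL; i++)
    if (fprintf (f, "%s\n", argv[i]) < 0)
      {
        fclose (f);
        return -1;
      }
  return fclose (f) == 0 ? 0 : -1;
}

/* Build the short replacement command line "ARGV0 @PATH".  The caller
   frees the result; NULL is returned if allocation fails.  */
char *
response_cmdline (const char *argv0, const char *path)
{
  assert (argv0 != NULL && path != NULL);
  size_t len = strlen (argv0) + strlen (path) + 3;  /* ' ', '@', NUL.  */
  char *s = malloc (len);
  if (s)
    snprintf (s, len, "%s @%s", argv0, path);
  return s;
}
```

A caller would check needs_response_file on the joined command line and,
only when it returns 1, write the arguments out and spawn the child with
the short "@file" form instead.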
From 5c7237c102cdaca34e5907cd25c31610bda51919 Mon Sep 17 00:00:00 2001
From: Costas Argyris 
Date: Mon, 22 May 2023 13:55:56 +0100
Subject: [PATCH] libiberty: On Windows, pass a >32k cmdline through a response
 file.

pex-win32.c (win32_spawn): If the command line for CreateProcess
exceeds the 32k Windows limit, try to store it in a temporary
response file and call CreateProcess with @file instead (PR71850).

Signed-off-by: Costas Argyris 
---
 libiberty/pex-win32.c | 57 +--
 1 file changed, 44 insertions(+), 13 deletions(-)

diff --git a/libiberty/pex-win32.c b/libiberty/pex-win32.c
index 23c6c190a2c..0fd8b38734c 100644
--- a/libiberty/pex-win32.c
+++ b/libiberty/pex-win32.c
@@ -569,7 +569,8 @@ env_compare (const void *a_ptr, const void *b_ptr)
  * target is not actually an executable, such as if it is a shell script. */
 
 static pid_t
-win32_spawn (const char *executable,
+win32_spawn (struct pex_obj *obj,
+ const char *executable,
 	 BOOL search,
 	 char *const *argv,
  char *const *env, /* array of strings of the form: VAR=VALUE */
@@ -624,8 +625,37 @@ win32_spawn (const char *executable,
   cmdline = argv_to_cmdline (argv);
   if (!cmdline)
 goto exit;
-
-  /* Create the child process.  */  
+  /* If cmdline is too large, CreateProcess will fail with a bad
+ 'No such file or directory' error. Try passing it through a
+ temporary response file instead.  */
+  if (strlen (cmdline) > 32767)
+{
+  char *response_file = make_temp_file ("");
+  /* Register the file for deletion by pex_free.  */
+  ++obj->remove_count;
+  obj->remove = XRESIZEVEC (char *, obj->remove, obj->remove_count);
+  obj->remove[obj->remove_count - 1] = response_file;
+  int fd = pex_win32_open_write (obj, response_file, 0, 0);
+  if (fd == -1)
+goto exit;
+  FILE *f = pex_win32_fdopenw (obj, fd, 0);
+  /* Don't write argv[0] (program name) to the response file.  */
+  if (writeargv (&argv[1], f))
+{
+  fclose (f);
+  goto exit;
+}
+  fclose (f); /* Also closes fd and the underlying OS handle.  */
+  char *response_arg = concat ("@", response_file, NULL);
+  char *response_argv[3] = {argv[0], response_arg, NULL};
+  free (cmdline);
+  cmdline = argv_to_cmdline (response_argv);
+  free (response_arg);
+  if (!cmdline)
+goto exit;
+}
+  
+  /* Create the child process.  */
   if (CreateProcess (full_executable, cmdline,
 		  /*lpProcessAttributes=*/NULL,
 		  /*lpThreadAttributes=*/NULL,
@@ -645,7 +675,7 @@ win32_spawn (const char *executable,
   free (env_block);
   free (cmdline);
   free (full_executable);
-
+  
   return pid;
 }
 
@@ -653,7 +683,8 @@ win32_spawn (const char *executable,
This function is called as a fallback if win32_spawn fails. */
 
 static pid_t
-spawn_script (const char *executable, char *const *argv,
+spawn_script (struct pex_obj *obj,
+  const char *executable, char *const *argv,
   char* const *env,
 	  DWORD dwCreationFlags,
 	  LPSTARTUPINFO si,
@@ -703,20 +734,20 @@ spawn_script (const char *executable, char *const *argv,
 	  executable = strrchr (executable1, '\\') + 1;
 	  if (!executable)
 		executable = executable1;
-	  pid = win32_spawn (executable, TRUE, argv, env,
+	  pid = win32_spawn (obj, executable, TRUE, argv, env,
  dwCreationFlags, si, pi);
 #else
 	  if (strchr (executable1, '\\') == NULL)
-		pid = win32_spawn (executable1, TRUE, argv, env,
+		pid = win32_spawn (obj, executable1, TRUE, argv, env,
    dwCreationFlags, si, pi);
 	  else if (executable1[0] != '\\')
-		pid = win32_spawn (executable1, FALSE, argv, env,
+		pid = win32_spawn (obj, executable1, FALSE, argv, env,
    dwCreationFlags, si, pi);
 	  else
 		{
 		  const char *newex = mingw_rootify (executable1);
 		  *avhere = newex;
-		  pid = win32_spawn (newex, FALSE, argv, env,
+		  pid = win32_spawn (obj, newex, FALSE, argv, env,
  dwCreationFlags, si, pi);
 		  if (executable1 != newex)
 		free ((char *) newex);
@@ -726,7 +757,7 @@ spawn_script (const char *executable, char *const *argv,
 		  if (newex != executable1)
 			{
 			  *avhere = newex;
-			  pid = win32_spawn (newex, FALSE, argv, env,
+			  pid = win32_spawn (obj, newex, FALSE, argv, env,
 	 dwCreationFlags, si, pi);
 			  free ((char *) newex);
 			}
@@ -745,7 +776,7 @@ spawn_script (const char *executable, char *const *argv,
 /* Execute a child.  */
 
 static pid_t
-pex_win32_exec_child (struct pex_obj *obj ATTRIBUTE_UNUSED, int 

Re: [PATCH] PR gcc/98350:Handle FMA friendly in reassoc pass

2023-05-22 Thread Richard Biener via Gcc-patches
On Wed, May 17, 2023 at 3:02 PM Cui, Lili  wrote:
>
> From: Lili Cui 
>
> Make some changes in reassoc pass to make it more friendly to fma pass later.
> Using FMA instead of mult + add reduces register pressure and the number of
> instructions retired.
>
> There are mainly two changes
> 1. Put no-mult ops and mult ops alternately at the end of the queue, which is
> conducive to generating more fma and reducing the loss of FMA when breaking
> the chain.
> 2. Rewrite the rewrite_expr_tree_parallel function to try to build parallel
> chains according to the given correlation width, keeping the FMA chance as
> much as possible.
>
> TEST1:
>
> float
> foo (float a, float b, float c, float d, float *e)
> {
>   return *e + a * b + c * d;
> }
>
> For "-Ofast -mfpmath=sse -mfma" GCC generates:
> vmulss  %xmm3, %xmm2, %xmm2
> vfmadd132ss %xmm1, %xmm2, %xmm0
> vaddss  (%rdi), %xmm0, %xmm0
> ret
>
> With this patch GCC generates:
> vfmadd213ss   (%rdi), %xmm1, %xmm0
> vfmadd231ss   %xmm2, %xmm3, %xmm0
> ret
>
> TEST2:
>
> for (int i = 0; i < N; i++)
> {
>   a[i] += b[i] * c[i] + d[i] * e[i] + f[i] * g[i] + h[i] * j[i] + k[i] * l[i]
>           + m[i] * o[i] + p[i];
> }
>
> For "-Ofast -mfpmath=sse -mfma"  GCC generates:
> vmovapd e(%rax), %ymm4
> vmulpd  d(%rax), %ymm4, %ymm3
> addq    $32, %rax
> vmovapd c-32(%rax), %ymm5
> vmovapd j-32(%rax), %ymm6
> vmulpd  h-32(%rax), %ymm6, %ymm2
> vmovapd a-32(%rax), %ymm6
> vaddpd  p-32(%rax), %ymm6, %ymm0
> vmovapd g-32(%rax), %ymm7
> vfmadd231pd b-32(%rax), %ymm5, %ymm3
> vmovapd o-32(%rax), %ymm4
> vmulpd  m-32(%rax), %ymm4, %ymm1
> vmovapd l-32(%rax), %ymm5
> vfmadd231pd f-32(%rax), %ymm7, %ymm2
> vfmadd231pd k-32(%rax), %ymm5, %ymm1
> vaddpd  %ymm3, %ymm0, %ymm0
> vaddpd  %ymm2, %ymm0, %ymm0
> vaddpd  %ymm1, %ymm0, %ymm0
> vmovapd %ymm0, a-32(%rax)
> cmpq    $8192, %rax
> jne .L4
> vzeroupper
> ret
>
> With this patch applied, GCC breaks the chain with width = 2 and generates 6
> FMAs:
>
> vmovapd a(%rax), %ymm2
> vmovapd c(%rax), %ymm0
> addq    $32, %rax
> vmovapd e-32(%rax), %ymm1
> vmovapd p-32(%rax), %ymm5
> vmovapd g-32(%rax), %ymm3
> vmovapd j-32(%rax), %ymm6
> vmovapd l-32(%rax), %ymm4
> vmovapd o-32(%rax), %ymm7
> vfmadd132pd b-32(%rax), %ymm2, %ymm0
> vfmadd132pd d-32(%rax), %ymm5, %ymm1
> vfmadd231pd f-32(%rax), %ymm3, %ymm0
> vfmadd231pd h-32(%rax), %ymm6, %ymm1
> vfmadd231pd k-32(%rax), %ymm4, %ymm0
> vfmadd231pd m-32(%rax), %ymm7, %ymm1
> vaddpd  %ymm1, %ymm0, %ymm0
> vmovapd %ymm0, a-32(%rax)
> cmpq    $8192, %rax
> jne .L2
> vzeroupper
> ret
>
> gcc/ChangeLog:
>
> PR gcc/98350
> * tree-ssa-reassoc.cc
> (rewrite_expr_tree_parallel): Rewrite this function.
> (rank_ops_for_fma): New.
> (reassociate_bb): Handle new function.
>
> gcc/testsuite/ChangeLog:
>
> PR gcc/98350
> * gcc.dg/pr98350-1.c: New test.
> * gcc.dg/pr98350-2.c: Ditto.
> ---
>  gcc/testsuite/gcc.dg/pr98350-1.c |  31 
>  gcc/testsuite/gcc.dg/pr98350-2.c |  11 ++
>  gcc/tree-ssa-reassoc.cc  | 256 +--
>  3 files changed, 215 insertions(+), 83 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr98350-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr98350-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/pr98350-1.c 
> b/gcc/testsuite/gcc.dg/pr98350-1.c
> new file mode 100644
> index 000..185511c5e0a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr98350-1.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -mfpmath=sse -mfma -Wno-attributes " } */
> +
> +/* Test that the compiler properly optimizes multiply and add
> +   to generate more FMA instructions.  */
> +#define N 1024
> +double a[N];
> +double b[N];
> +double c[N];
> +double d[N];
> +double e[N];
> +double f[N];
> +double g[N];
> +double h[N];
> +double j[N];
> +double k[N];
> +double l[N];
> +double m[N];
> +double o[N];
> +double p[N];
> +
> +
> +void
> +foo (void)
> +{
> +  for (int i = 0; i < N; i++)
> +  {
> +a[i] += b[i] * c[i] + d[i] * e[i] + f[i] * g[i] + h[i] * j[i] + k[i] * 
> l[i] + m[i]* o[i] + p[i];
> +  }
> +}
> +/* { dg-final { scan-assembler-times "vfm" 6  } } */
> diff --git a/gcc/testsuite/gcc.dg/pr98350-2.c 
> b/gcc/testsuite/gcc.dg/pr98350-2.c
> new file mode 100644
> index 000..b35d88aead9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr98350-2.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -mfpmath=sse -mfma -Wno-attributes " } */
> +
> +/* Test that the compiler rearrange the ops to generate more FMA.  */
> +
> +float
> +foo1 (

Re: [PATCH 1/4] Missed opportunity to use [SU]ABD

2023-05-22 Thread Richard Biener via Gcc-patches
On Thu, May 18, 2023 at 7:59 PM Richard Sandiford
 wrote:
>
> Thanks for the update.  Some of these comments would have applied
> to the first version, so sorry for not catching them first time.
>
>  writes:
> > From: oluade01 
> >
> > This adds a recognition pattern for the non-widening
> > absolute difference (ABD).
> >
> > gcc/ChangeLog:
> >
> >   * doc/md.texi (sabd, uabd): Document them.
> >   * internal-fn.def (ABD): Use new optab.
> >   * optabs.def (sabd_optab, uabd_optab): New optabs,
> >   * tree-vect-patterns.cc (vect_recog_absolute_difference):
> >   Recognize the following idiom abs (a - b).
> >   (vect_recog_sad_pattern): Refactor to use
> >   vect_recog_absolute_difference.
> >   (vect_recog_abd_pattern): Use patterns found by
> >   vect_recog_absolute_difference to build a new ABD
> >   internal call.
> > ---
> >  gcc/doc/md.texi   |  10 ++
> >  gcc/internal-fn.def   |   3 +
> >  gcc/optabs.def|   2 +
> >  gcc/tree-vect-patterns.cc | 255 +-
> >  4 files changed, 239 insertions(+), 31 deletions(-)
> >
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 
> > 07bf8bdebffb2e523f25a41f2b57e43c0276b745..3e65584d7efcd301f2c96a40edd82d30b84462b8
> >  100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5778,6 +5778,16 @@ Other shift and rotate instructions, analogous to the
> >  Vector shift and rotate instructions that take vectors as operand 2
> >  instead of a scalar type.
> >
> > +@cindex @code{uabd@var{m}} instruction pattern
> > +@cindex @code{sabd@var{m}} instruction pattern
> > +@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
> > +Signed and unsigned absolute difference instructions.  These
> > +instructions find the difference between operands 1 and 2
> > +then return the absolute value.  A C code equivalent would be:
> > +@smallexample
> > +op0 = op0 > op1 ? op0 - op1 : op1 - op0;
>
> Should be:
>
>   op0 = op1 > op2 ? op1 - op2 : op2 - op1;
>
> since op0 is the output.
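
As a scalar illustration of the corrected formula, a minimal sketch
follows.  The function names uabd and sabd are borrowed from the optab
names only for readability; these are hypothetical stand-alone helpers on
int-sized values, not the vector patterns themselves.  Doing the signed
subtraction in unsigned arithmetic is an assumption of this sketch to avoid
signed overflow in C.

```c
/* Unsigned absolute difference: op0 = |op1 - op2| on unsigned values.  */
unsigned int
uabd (unsigned int op1, unsigned int op2)
{
  return op1 > op2 ? op1 - op2 : op2 - op1;
}

/* Signed absolute difference: the operands are compared as signed
   values; the subtraction is carried out in unsigned arithmetic so the
   sketch never overflows a signed int.  */
unsigned int
sabd (int op1, int op2)
{
  return op1 > op2 ? (unsigned int) op1 - (unsigned int) op2
                   : (unsigned int) op2 - (unsigned int) op1;
}
```

For example, uabd (3, 10) and uabd (10, 3) both give 7, and sabd (-5, 4)
gives 9.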
>
> > +@end smallexample
> > +
> >  @cindex @code{avg@var{m}3_floor} instruction pattern
> >  @cindex @code{uavg@var{m}3_floor} instruction pattern
> >  @item @samp{avg@var{m}3_floor}
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 
> > 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..0f1724ecf37a31c231572edf90b5577e2d82f468
> >  100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -167,6 +167,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
> >  DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
> >  DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
> >
> > +DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
> > +   sabd, uabd, binary)
> > +
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
> > savg_floor, uavg_floor, binary)
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index 
> > 695f5911b300c9ca5737de9be809fa01aabe5e01..29bc92281a2175f898634cbe6af63c18021e5268
> >  100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, 
> > "mask_fold_left_plus_$a")
> >  OPTAB_D (extract_last_optab, "extract_last_$a")
> >  OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
> >
> > +OPTAB_D (uabd_optab, "uabd$a3")
> > +OPTAB_D (sabd_optab, "sabd$a3")
> >  OPTAB_D (savg_floor_optab, "avg$a3_floor")
> >  OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
> >  OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index 
> > a49b09539776c0056e77f99b10365d0a8747fbc5..50f1822f220c023027f4b0f777965f3757842fa2
> >  100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -770,6 +770,93 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info 
> > stmt2_info, tree new_rhs,
> >  }
> >  }
> >
> > +/* Look for the following pattern
> > + X = x[i]
> > + Y = y[i]
> > + DIFF = X - Y
> > + DAD = ABS_EXPR
> > +
> > +   ABS_STMT should point to a statement of code ABS_EXPR or ABSU_EXPR.
> > +   If REJECT_UNSIGNED is true it aborts if the type of ABS_STMT is 
> > unsigned.
> > +   HALF_TYPE and UNPROM will be set should the statement be found to
> > +   be a widened operation.
> > +   DIFF_OPRNDS will be set to the two inputs of the MINUS_EXPR preceding
> > +   ABS_STMT, otherwise it will be set the operations found by
> > +   vect_widened_op_tree.
> > + */
> > +static bool
> > +vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
> > + tree *half_type, bool reject_unsigned,
> > + vect_unpromoted_value unprom[2],
> > + tree diff_oprnds[2])
> > +{
> > +  if (!abs_stmt)
> > +return false;
> > +
> > +  /* FORNOW.  Can continue analyzing the d

Re: [PATCH v2] tree-ssa-sink: Improve code sinking pass

2023-05-22 Thread Richard Biener via Gcc-patches
On Fri, May 19, 2023 at 11:43 AM Ajit Agarwal  wrote:
>
> Hello All:
>
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.
> Review comments are incorporated.
>
> For example :
>
> void bar();
> int j;
> void foo(int a, int b, int c, int d, int e, int f)
> {
>   int l;
>   l = a + b + c + d +e + f;
>   if (a != 5)
> {
>   bar();
>   j = l;
> }
> }
>
> Code Sinking does the following:
>
> void bar();
> int j;
> void foo(int a, int b, int c, int d, int e, int f)
> {
>   int l;
>
>   if (a != 5)
> {
>   l = a + b + c + d +e + f;
>   bar();
>   j = l;
> }
> }
>
> Bootstrapped regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>
> tree-ssa-sink: Improve code sinking pass
>
> Code Sinking sinks the blocks after call. This increases register pressure
> for callee-saved registers. Improve code sinking to sink before the call in
> the use blocks or the immediate dominator of the use blocks.

Saw this update too late but I think all comments still apply.

> 2023-05-18  Ajit Kumar Agarwal  
>
> gcc/ChangeLog:
>
> * tree-ssa-sink.cc (statement_sink_location): Move statements before
> calls.
> (block_call_p): New function.
> (def_use_same_block): New function.
> (select_best_block): Add heuristics to select the best blocks in the
> immediate post dominator.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
> * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  15 ++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  19 +++
>  gcc/tree-ssa-sink.cc| 160 ++--
>  3 files changed, 183 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> new file mode 100644
> index 000..69fa6d32e7c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> @@ -0,0 +1,15 @@
> +/* { dg-options "-O2 -fdump-tree-optimized -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +{
> +  bar();
> +  j = l;
> +}
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> new file mode 100644
> index 000..b34959c8a4d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,19 @@
> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j, x;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +{
> +  bar();
> +  if (b != 3)
> +x = 3;
> +  else
> +x = 5;
> +  j = l;
> +}
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index b1ba7a2ad6c..091aa90d289 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -171,6 +171,71 @@ nearest_common_dominator_of_uses (def_operand_p def_p, 
> bool *debug_stmts)
>return commondom;
>  }
>
> +/* Return TRUE if immediate uses of the defs in
> +   STMT occur in the same block as STMT, FALSE otherwise.  */
> +
> +bool
> +def_use_same_block (gimple *stmt)
> +{
> +  use_operand_p use;
> +  def_operand_p def;
> +  imm_use_iterator imm_iter;
> +  ssa_op_iter iter;
> +
> +  FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
> +{
> +  FOR_EACH_IMM_USE_FAST (use, imm_iter, DEF_FROM_PTR (def))
> +   {
> + if (is_gimple_debug (USE_STMT (use)))
> +   continue;
> +
> + if (use && (gimple_bb (USE_STMT (use)) == gimple_bb (stmt)))
> +   return true;
> +   }
> + }
> +  return false;
> +}
> +
> +/* Return TRUE if the block has only one call statement, FALSE otherwise. */
> +
> +bool
> +block_call_p (basic_block bb)
> +{
> +  int i = 0;
> +  bool is_call = false;
> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +  gimple *last_stmt = gsi_stmt (gsi);
> +
> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> +{
> +  if (!gsi_end_p (gsi))
> +   gsi_prev (&gsi);
> +
> +   for (; !gsi_end_p (gsi);)
> +{
> +  gimple *stmt = gsi_stmt (gsi);
> +
> +  /* We have already seen a call.  */
> +  if (is_call)
> +return false;
> +
> +  if (is_gimple_call (stmt))
> +is_call = true;
> +  else
> +return false;
> +
> +  if (!gsi_end_p (gsi))
> +gsi_prev (&gsi);
> +
> +   ++i;
> +   }
> + }
> +  if (is_call && i == 1)
> +r

Re: [PATCH 2/2] vect: Enhance cost evaluation in vect_transform_slp_perm_load_1

2023-05-22 Thread Richard Biener via Gcc-patches
On Wed, May 17, 2023 at 8:15 AM Kewen.Lin  wrote:
>
> Hi,
>
> Following Richi's suggestion in [1], I'm working on deferring
> cost evaluation next to the transformation, this patch is
> to enhance function vect_transform_slp_perm_load_1 which
> could under-cost for vector permutation, since the costing
> doesn't try to consider nvectors_per_build, it's inconsistent
> with the transformation part.
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>
> Is it ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
> * tree-vect-slp.cc (vect_transform_slp_perm_load_1): Adjust the
> calculation on n_perms by considering nvectors_per_build.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c: New test.
> ---
>  .../vect/costmodel/ppc/costmodel-slp-perm.c   | 23 +++
>  gcc/tree-vect-slp.cc  | 66 ++-
>  2 files changed, 57 insertions(+), 32 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c 
> b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
> new file mode 100644
> index 000..e5c4dceddfb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* Specify power9 to ensure the vectorization is profitable
> +   and test point stands, otherwise it could be not profitable
> +   to vectorize.  */
> +/* { dg-additional-options "-mdejagnu-cpu=power9 -mpower9-vector" } */
> +
> +/* Verify we cost the exact count for required vec_perm.  */
> +
> +int x[1024], y[1024];
> +
> +void
> +foo ()
> +{
> +  for (int i = 0; i < 512; ++i)
> +{
> +  x[2 * i] = y[1023 - (2 * i)];
> +  x[2 * i + 1] = y[1023 - (2 * i + 1)];
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-times "2 times vec_perm" 1 "vect" } } */
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index e5c9d7e766e..af9a6dd4fa9 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -8115,12 +8115,12 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, 
> slp_tree node,
>
>mode = TYPE_MODE (vectype);
>poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
> +  unsigned int nstmts = SLP_TREE_NUMBER_OF_VEC_STMTS (node);
>
>/* Initialize the vect stmts of NODE to properly insert the generated
>   stmts later.  */
>if (! analyze_only)
> -for (unsigned i = SLP_TREE_VEC_STMTS (node).length ();
> -i < SLP_TREE_NUMBER_OF_VEC_STMTS (node); i++)
> +for (unsigned i = SLP_TREE_VEC_STMTS (node).length (); i < nstmts; i++)
>SLP_TREE_VEC_STMTS (node).quick_push (NULL);
>
>/* Generate permutation masks for every NODE. Number of masks for each NODE
> @@ -8161,7 +8161,10 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, 
> slp_tree node,
>  (b) the permutes only need a single vector input.  */
>mask.new_vector (nunits, group_size, 3);
>nelts_to_build = mask.encoded_nelts ();
> -  nvectors_per_build = SLP_TREE_VEC_STMTS (node).length ();
> +  /* It's possible to obtain zero nstmts during analyze_only, so make
> +it at least one to ensure the later computation for n_perms
> +proceed.  */
> +  nvectors_per_build = nstmts > 0 ? nstmts : 1;
>in_nlanes = DR_GROUP_SIZE (stmt_info) * 3;
>  }
>else
> @@ -8252,40 +8255,39 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, 
> slp_tree node,
>   return false;
> }
>
> - ++*n_perms;
> -
> + tree mask_vec = NULL_TREE;
>   if (!analyze_only)
> -   {
> - tree mask_vec = vect_gen_perm_mask_checked (vectype, 
> indices);
> +   mask_vec = vect_gen_perm_mask_checked (vectype, indices);
>
> - if (second_vec_index == -1)
> -   second_vec_index = first_vec_index;
> + if (second_vec_index == -1)
> +   second_vec_index = first_vec_index;
>
> - for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
> + for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
> +   {
> + ++*n_perms;

So the "real" change is doing

  *n_perms += nvectors_per_build;

and *n_perms was unused when !analyze_only?  And since at
analysis time we (sometimes?) have zero nvectors you have to
fixup above?  Which cases are that?
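
A toy model of the counting, to make the equivalence concrete.  This is a
simplified sketch, not the vectorizer code: vectors_per_build,
count_perms_loop and count_perms_direct are hypothetical names, and n_masks
stands for however many permutation masks get built.

```c
/* Clamp from the hunk above: nstmts can be zero at analysis time, so use
   at least one vector per build.  */
unsigned int
vectors_per_build (unsigned int nstmts)
{
  return nstmts > 0 ? nstmts : 1;
}

/* Count as the patched loop does: one increment per generated vector,
   repeated for each of N_MASKS permutation masks.  */
unsigned int
count_perms_loop (unsigned int n_masks, unsigned int nstmts)
{
  unsigned int n_perms = 0;
  for (unsigned int m = 0; m < n_masks; ++m)
    for (unsigned int ri = 0; ri < vectors_per_build (nstmts); ++ri)
      ++n_perms;
  return n_perms;
}

/* Same count written as "n_perms += nvectors_per_build" per mask.  */
unsigned int
count_perms_direct (unsigned int n_masks, unsigned int nstmts)
{
  unsigned int n_perms = 0;
  for (unsigned int m = 0; m < n_masks; ++m)
    n_perms += vectors_per_build (nstmts);
  return n_perms;
}
```

Both functions agree for every input; with nstmts == 0 the clamp makes each
mask count once, which matches the old "++*n_perms" behaviour.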

In principle the patch looks good to me.

Richard.

> + if (analyze_only)
> +   continue;
> + /* Generate the permute statement if necessary.  */
> + tree first_vec = dr_chain[fir

Re: [PATCH] RISC-V: Add "m_" prefix for private member

2023-05-22 Thread Kito Cheng via Gcc-patches
LGTM

On Mon, May 22, 2023 at 8:10 PM  wrote:
>
> From: Juzhe-Zhong 
>
> The current framework is hard to maintain and hard to reuse for
> possible future auto-vectorization patterns.
>
> We will need to keep adding more helpers and arguments as
> auto-vectorization support grows. We should refactor the framework
> now, while we still support only a few auto-vectorization
> patterns.
>
> Start with this simple patch, which adds the "m_" prefix to the private
> members.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-v.cc: Add "m_" prefix.
>
> ---
>  gcc/config/riscv/riscv-v.cc | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index d65e7300303..e0b19bc1754 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -66,7 +66,7 @@ const_vlmax_p (machine_mode mode)
>  template <int MAX_OPERANDS> class insn_expander
>  {
>  public:
> -  insn_expander () : m_opno (0), has_dest(false) {}
> +  insn_expander () : m_opno (0), m_has_dest_p(false) {}
>void add_output_operand (rtx x, machine_mode mode)
>{
>  create_output_operand (&m_ops[m_opno++], x, mode);
> @@ -99,41 +99,41 @@ public:
>
>void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
>{
> -dest_mode = GET_MODE (dest);
> -has_dest = true;
> +m_dest_mode = GET_MODE (dest);
> +m_has_dest_p = true;
>
> -add_output_operand (dest, dest_mode);
> +add_output_operand (dest, m_dest_mode);
>
>  if (mask)
>add_input_operand (mask, GET_MODE (mask));
>  else
>add_all_one_mask_operand (mask_mode);
>
> -add_vundef_operand (dest_mode);
> +add_vundef_operand (m_dest_mode);
>}
>
>void set_len_and_policy (rtx len, bool force_vlmax = false)
>  {
>bool vlmax_p = force_vlmax || !len;
> -  gcc_assert (has_dest);
> +  gcc_assert (m_has_dest_p);
>
> -  if (vlmax_p && const_vlmax_p (dest_mode))
> +  if (vlmax_p && const_vlmax_p (m_dest_mode))
> {
>   /* Optimize VLS-VLMAX code gen, we can use vsetivli instead of the
>  vsetvli to obtain the value of vlmax.  */
> - poly_uint64 nunits = GET_MODE_NUNITS (dest_mode);
> + poly_uint64 nunits = GET_MODE_NUNITS (m_dest_mode);
>   len = gen_int_mode (nunits, Pmode);
>   vlmax_p = false; /* It has became NONVLMAX now.  */
> }
>else if (!len)
> {
>   len = gen_reg_rtx (Pmode);
> - emit_vlmax_vsetvl (dest_mode, len);
> + emit_vlmax_vsetvl (m_dest_mode, len);
> }
>
>add_input_operand (len, Pmode);
>
> -  if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
> +  if (GET_MODE_CLASS (m_dest_mode) != MODE_VECTOR_BOOL)
> add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy 
> ());
>
>add_avl_type_operand (vlmax_p ? avl_type::VLMAX : avl_type::NONVLMAX);
> @@ -152,8 +152,8 @@ public:
>
>  private:
>int m_opno;
> -  bool has_dest;
> -  machine_mode dest_mode;
> +  bool m_has_dest_p;
> +  machine_mode m_dest_mode;
>expand_operand m_ops[MAX_OPERANDS];
>  };
>
> --
> 2.36.3
>


Re: [PATCH] add glibc-stdint.h to vax and lm32 linux target (PR target/105525)

2023-05-22 Thread Jan-Benedict Glaw
Hi!

On Mon, 2023-05-22 14:10:48 +0100, Maciej W. Rozycki  wrote:
> On Fri, 19 May 2023, Mikael Pettersson wrote:
> > The background is that I maintain a script to build GCC-based crosses to
> > as many targets as I can, currently it supports 78 distinct processors and
> > 82 triplets (four processors have multiple triplets). I only check that I
> > can build the toolchains (full linux-gnu ones where possible).
> 
>  Great work, thanks!

I'd be very much interested in running your script as one build
variant for my http://toolchain.lug-owl.de/ efforts. Is it available
somewhere? That would be nice!

MfG, JBG



RE: [PATCH] RISC-V: Add "m_" prefix for private member

2023-05-22 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Monday, May 22, 2023 9:49 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; pal...@dabbelt.com; 
pal...@rivosinc.com; jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Add "m_" prefix for private member

LGTM

On Mon, May 22, 2023 at 8:10 PM  wrote:
>
> From: Juzhe-Zhong 
>
> Since the current framework is hard to maintain and hard to be used in 
> the future possible auto-vectorization patterns.
>
> We will need to keep adding more helpers and arguments during the 
> auto-vectorization supporting. We should refactor the framework now 
> for the future use since the we don't support too much 
> auto-vectorization patterns for now.
>
> Start with this simple patch, this patch is adding "m_" prefix for private 
> the members.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-v.cc: Add "m_" prefix.
>
> ---
>  gcc/config/riscv/riscv-v.cc | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc 
> index d65e7300303..e0b19bc1754 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -66,7 +66,7 @@ const_vlmax_p (machine_mode mode)  template  MAX_OPERANDS> class insn_expander  {
>  public:
> -  insn_expander () : m_opno (0), has_dest(false) {}
> +  insn_expander () : m_opno (0), m_has_dest_p(false) {}
>void add_output_operand (rtx x, machine_mode mode)
>{
>  create_output_operand (&m_ops[m_opno++], x, mode); @@ -99,41 
> +99,41 @@ public:
>
>void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
>{
> -dest_mode = GET_MODE (dest);
> -has_dest = true;
> +m_dest_mode = GET_MODE (dest);
> +m_has_dest_p = true;
>
> -add_output_operand (dest, dest_mode);
> +add_output_operand (dest, m_dest_mode);
>
>  if (mask)
>add_input_operand (mask, GET_MODE (mask));
>  else
>add_all_one_mask_operand (mask_mode);
>
> -add_vundef_operand (dest_mode);
> +add_vundef_operand (m_dest_mode);
>}
>
>void set_len_and_policy (rtx len, bool force_vlmax = false)
>  {
>bool vlmax_p = force_vlmax || !len;
> -  gcc_assert (has_dest);
> +  gcc_assert (m_has_dest_p);
>
> -  if (vlmax_p && const_vlmax_p (dest_mode))
> +  if (vlmax_p && const_vlmax_p (m_dest_mode))
> {
>   /* Optimize VLS-VLMAX code gen, we can use vsetivli instead of the
>  vsetvli to obtain the value of vlmax.  */
> - poly_uint64 nunits = GET_MODE_NUNITS (dest_mode);
> + poly_uint64 nunits = GET_MODE_NUNITS (m_dest_mode);
>   len = gen_int_mode (nunits, Pmode);
>   vlmax_p = false; /* It has became NONVLMAX now.  */
> }
>else if (!len)
> {
>   len = gen_reg_rtx (Pmode);
> - emit_vlmax_vsetvl (dest_mode, len);
> + emit_vlmax_vsetvl (m_dest_mode, len);
> }
>
>add_input_operand (len, Pmode);
>
> -  if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
> +  if (GET_MODE_CLASS (m_dest_mode) != MODE_VECTOR_BOOL)
> add_policy_operand (get_prefer_tail_policy (), 
> get_prefer_mask_policy ());
>
>add_avl_type_operand (vlmax_p ? avl_type::VLMAX : avl_type::NONVLMAX);
> @@ -152,8 +152,8 @@ public:
>
>  private:
>int m_opno;
> -  bool has_dest;
> -  machine_mode dest_mode;
> +  bool m_has_dest_p;
> +  machine_mode m_dest_mode;
>expand_operand m_ops[MAX_OPERANDS];
>  };
>
> --
> 2.36.3
>


Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-22 Thread Jakub Jelinek via Gcc-patches
On Wed, May 17, 2023 at 01:55:00PM +0200, Frederik Harwath wrote:
> Thanks for the explanation. But actually doing this would require a
> complete rewrite which would almost certainly imply that mainline GCC
> would not support the loop transformations for a long time.

I don't think it needs a complete rewrite. The change to use
OMP_UNROLL/OMP_TILE should actually simplify things when you already have
some other extra construct to handle the clauses if it isn't nested into
something else, so I wouldn't expect it to need more than 2-3 hours of work.
It is true that doing the transformation on trees rather than high gimple
is something different, but again it doesn't require everything to be
rewritten and we have code to do code copying both on trees and high and low
gimple in tree-inline.cc, so the unrolling can just use different APIs
to perform it.

I'd still prefer to do it like that; I think it will pay off in
maintenance costs.

If you don't get to this within say 2 weeks, I'll try to do the conversion
myself.

Jakub



[COMMITTED] i386: Account for the memory read in V*QImode multiplication sequences

2023-05-22 Thread Uros Bizjak via Gcc-patches
Add the cost of a memory read to the cost of V*QImode vector mult sequences.
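
As a rough illustration of the idea (not the actual i386 code), the sketch below models the generic V16QImode fallback before and after the change. The struct, the function names and any numbers fed to them are invented for this example, and the ix86_vec_cost wrapper used in the real patch is deliberately left out.

```c
/* Toy cost table.  The field names echo those in the patch; the struct
   itself and any values passed in are hypothetical.  */
struct toy_costs
{
  int mulss;     /* cost of one multiply */
  int sse_op;    /* cost of one simple vector operation */
  int sse_load;  /* cost of one vector load from memory */
};

/* Generic V16QImode fallback before the change: instructions only.  */
static int
v16qi_cost_before (const struct toy_costs *c)
{
  return c->mulss * 2 + c->sse_op * 7;
}

/* After the change: the same instructions plus one memory read.  */
static int
v16qi_cost_after (const struct toy_costs *c)
{
  return v16qi_cost_before (c) + c->sse_load;
}
```

With made-up costs of 4, 1 and 6, the modelled cost rises from 15 to 21, i.e. by exactly the load cost.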

gcc/ChangeLog:

* config/i386/i386.cc (ix86_multiplication_cost): Add
the cost of a memory read to the cost of V?QImode sequences.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 6a4b3326219..a36e625342d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20463,27 +20463,42 @@ ix86_multiplication_cost (const struct processor_costs *cost,
   {
   case V4QImode:
   case V8QImode:
-   /* Partial V*QImode is emulated with 4-5 insns.  */
-   if ((TARGET_AVX512BW && TARGET_AVX512VL) || TARGET_XOP)
+   /* Partial V*QImode is emulated with 4-6 insns.  */
+   if (TARGET_AVX512BW && TARGET_AVX512VL)
  return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+   else if (TARGET_AVX2)
+ return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 5);
+   else if (TARGET_XOP)
+ return (ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3)
+ + cost->sse_load[2]);
else
- return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 4);
+ return (ix86_vec_cost (mode, cost->mulss + cost->sse_op * 4)
+ + cost->sse_load[2]);
 
   case V16QImode:
/* V*QImode is emulated with 4-11 insns.  */
if (TARGET_AVX512BW && TARGET_AVX512VL)
  return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+   else if (TARGET_AVX2)
+ return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 8);
else if (TARGET_XOP)
- return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5);
-   /* FALLTHRU */
+ return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5)
+ + cost->sse_load[2]);
+   else
+ return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7)
+ + cost->sse_load[2]);
+
   case V32QImode:
-   if (TARGET_AVX512BW && mode == V32QImode)
+   if (TARGET_AVX512BW)
  return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
else
- return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7);
+ return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7)
+ + cost->sse_load[3] * 2);
 
   case V64QImode:
-   return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 9);
+   return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 9)
+   + cost->sse_load[3] * 2
+   + cost->sse_load[4] * 2);
 
   case V4SImode:
/* pmulld is used in this case. No emulation is needed.  */


[avr,testsuite,committed] Skip test that fail for avr for this or that reason.

2023-05-22 Thread Georg-Johann Lay

This annotates some tests that won't work for AVR like:

* asm goto with output reload (AVR is not lra).

* Using a program address as a ram address.

* Float related stuff: AVR double is 32-bit, and long double
  is incomplete (some functions missing, no signed zeros, etc.)

Applied as obvious.

Johann

--

Skip some tests that won't work for target AVR.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_lra) 
[avr]: Return 0.

* gcc.dg/pr19402-2.c: Skip for avr.
* gcc.dg/pr86124.c: Same.
* gcc.dg/pr94291.c: Same.
* gcc.dg/torture/builtin-complex-1.c: Same.
* gcc.dg/torture/fp-int-convert-float32x-timode.c: Same.
* gcc.dg/torture/fp-int-convert-float32x.c: Same.
* gcc.dg/torture/fp-int-convert-float64-timode.c: Same.
* gcc.dg/torture/fp-int-convert-float64.c: Same.
* gcc.dg/torture/fp-int-convert-long-double.c: Same.
* gcc.dg/torture/fp-int-convert-timode.c: Same.
* c-c++-common/torture/builtin-convertvector-1.c: Same.
* c-c++-common/torture/complex-sign-add.c: Same.
* c-c++-common/torture/complex-sign-mixed-add.c: Same.
* c-c++-common/torture/complex-sign-mixed-div.c: Same.
* c-c++-common/torture/complex-sign-mixed-mul.c: Same.
* c-c++-common/torture/complex-sign-mixed-sub.c: Same.
* c-c++-common/torture/complex-sign-mul-minus-one.c: Same.
* c-c++-common/torture/complex-sign-mul-one.c: Same.
* c-c++-common/torture/complex-sign-mul.c: Same.
* c-c++-common/torture/complex-sign-sub.c: Same.

diff --git 
a/gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c 
b/gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c

index 347dda7692d..fababf1a9eb 100644
--- a/gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c
+++ b/gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c
@@ -1,3 +1,5 @@
+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */
+
 extern
 #ifdef __cplusplus
 "C"
diff --git a/gcc/testsuite/c-c++-common/torture/complex-sign-add.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-add.c

index e81223224dc..c1e7886a0df 100644
--- a/gcc/testsuite/c-c++-common/torture/complex-sign-add.c
+++ b/gcc/testsuite/c-c++-common/torture/complex-sign-add.c
@@ -2,6 +2,7 @@
addition.  */
 /* { dg-do run } */
 /* { dg-options "-std=gnu99" { target c } } */
+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */

 #include "complex-sign.h"

diff --git a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-add.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-add.c

index a209161e157..36d305baf53 100644
--- a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-add.c
+++ b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-add.c
@@ -3,6 +3,7 @@
 /* { dg-do run } */
 /* { dg-options "-std=gnu99" { target c } } */
 /* { dg-skip-if "ptx can elide zero additions" { "nvptx-*-*" } { "-O0" 
} { "" } } */

+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */

 #include "complex-sign.h"

diff --git a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-div.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-div.c

index f7ee48341c0..a37074bb3b9 100644
--- a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-div.c
+++ b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-div.c
@@ -2,6 +2,7 @@
division.  */
 /* { dg-do run } */
 /* { dg-options "-std=gnu99" { target c } } */
+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */

 #include "complex-sign.h"

diff --git a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-mul.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-mul.c

index 02f936b75bd..1e528b986c5 100644
--- a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-mul.c
+++ b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-mul.c
@@ -2,6 +2,7 @@
multiplication.  */
 /* { dg-do run } */
 /* { dg-options "-std=gnu99" { target c } } */
+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */

 #include "complex-sign.h"

diff --git a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-sub.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-sub.c

index 02ab4db247c..63c75dfdff2 100644
--- a/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-sub.c
+++ b/gcc/testsuite/c-c++-common/torture/complex-sign-mixed-sub.c
@@ -3,6 +3,7 @@
 /* { dg-do run } */
 /* { dg-options "-std=gnu99" { target c } } */
 /* { dg-skip-if "ptx can elide zero additions" { "nvptx-*-*" } { "-O0" 
} { "" } } */

+/* { dg-skip-if "double support is incomplete" { "avr-*-*" } } */

 #include "complex-sign.h"

diff --git 
a/gcc/testsuite/c-c++-common/torture/complex-sign-mul-minus-one.c 
b/gcc/testsuite/c-c++-common/torture/complex-sign-mul-minus-one.c

index 05cc4fabea4..f8abdd00e2e 100644
--- a/gcc/testsuite/c-c++-common/torture/co

[testsuite,committed] PR testsuite/52641

2023-05-22 Thread Georg-Johann Lay
Applied more annotations to reduce testsuite fallout for 16-bit int / 
pointer targets.


https://gcc.gnu.org/r14-1074

Most of the affected tests use constants not suitable for 16-bit int, 
bit-fields wider than 16 bits, etc.
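
The pr19807-2.c change below is a typical case: a hard-coded 4 stands in for sizeof (int). A minimal sketch of the same comparison, wrapped in an invented helper function, shows that the __SIZEOF_INT__ form holds regardless of the width of int:

```c
/* Returns 1 if stepping by __SIZEOF_INT__ bytes reaches the same address
   as array indexing.  Only valid for i in 0..2, so that every pointer
   stays within the array or one past its end.  offsets_agree is a
   hypothetical name; the expression is the one used in the test.  */
static int
offsets_agree (int i)
{
  int a[4];
  return (char *) &a[1] + __SIZEOF_INT__ * i + __SIZEOF_INT__
	 == (char *) &a[i + 2];
}
```

With the literal 4 instead, the equality would only hold on targets where int is 4 bytes wide.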


Johann

--

commit 9f5065094c9632a50bea604d5896a139609e50cf
Author: Georg-Johann Lay 
Date:   Mon May 22 16:47:56 2023 +0200

testsuite/52641: Fix tests that fail for 16-bit int / pointer targets.

gcc/testsuite/
PR testsuite/52641
* c-c++-common/pr19807-2.c: Use __SIZEOF_INT__ instead of 4.
* gcc.c-torture/compile/pr103813.c: Require size32plus.
* gcc.c-torture/execute/pr108498-2.c: Same.
* gcc.c-torture/compile/pr96426.c: Condition on
__SIZEOF_LONG_LONG__ == __SIZEOF_DOUBLE__.
* gcc.c-torture/execute/pr103417.c: Require int32plus.
* gcc.dg/pr104198.c: Same.
* gcc.dg/pr21137.c: Same.
* gcc.dg/pr88905.c: Same.
* gcc.dg/pr90838.c: Same.
* gcc.dg/pr97317.c: Same.
* gcc.dg/pr100292.c: Require int32.
* gcc.dg/pr101008.c: Same.
* gcc.dg/pr96542.c: Same.
* gcc.dg/pr96674.c: Same.
* gcc.dg/pr97750.c: Require ptr_eq_long.

diff --git a/gcc/testsuite/c-c++-common/pr19807-2.c 
b/gcc/testsuite/c-c++-common/pr19807-2.c

index 529b9c97322..29a370304d3 100644
--- a/gcc/testsuite/c-c++-common/pr19807-2.c
+++ b/gcc/testsuite/c-c++-common/pr19807-2.c
@@ -6,7 +6,7 @@ int i;
 int main()
 {
   int a[4];
-  if ((char*)&a[1] + 4*i + 4 != (char*)&a[i+2])
+  if ((char*)&a[1] + __SIZEOF_INT__*i + __SIZEOF_INT__ != (char*)&a[i+2])
 link_error();
   return 0;
 }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103813.c 
b/gcc/testsuite/gcc.c-torture/compile/pr103813.c

index b3fc066beed..0aa64fb3152 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr103813.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr103813.c
@@ -1,4 +1,5 @@
 /* PR middle-end/103813 */
+/* { dg-require-effective-target size32plus } */

 struct A { char b; char c[0x2100]; };
 struct A d;
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr96426.c 
b/gcc/testsuite/gcc.c-torture/compile/pr96426.c

index bd573fe5366..fdb441efc10 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr96426.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr96426.c
@@ -1,5 +1,7 @@
 /* PR middle-end/96426 */

+#if __SIZEOF_LONG_LONG__ == __SIZEOF_DOUBLE__
+
 typedef long long V __attribute__((vector_size(16)));
 typedef double W __attribute__((vector_size(16)));

@@ -8,3 +10,5 @@ foo (V *v)
 {
   __builtin_convertvector (*v, W);
 }
+
+#endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr103417.c 
b/gcc/testsuite/gcc.c-torture/execute/pr103417.c

index 0fef8908036..ea4b99030a5 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr103417.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr103417.c
@@ -1,4 +1,5 @@
 /* PR tree-optimization/103417 */
+/* { dg-require-effective-target int32plus } */

 struct { int a : 8; int b : 24; } c = { 0, 1 };

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr108498-2.c 
b/gcc/testsuite/gcc.c-torture/execute/pr108498-2.c

index ad930488c33..fdd628cbc86 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr108498-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr108498-2.c
@@ -1,4 +1,5 @@
 /* PR tree-optimization/108498 */
+/* { dg-require-effective-target int32plus } */

 struct U { char c[16]; };
 struct V { char c[16]; };
diff --git a/gcc/testsuite/gcc.dg/pr100292.c 
b/gcc/testsuite/gcc.dg/pr100292.c

index 675a60c3412..147c9324d81 100644
--- a/gcc/testsuite/gcc.dg/pr100292.c
+++ b/gcc/testsuite/gcc.dg/pr100292.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target int32 } */

 typedef unsigned char __attribute__((__vector_size__ (4))) V;

diff --git a/gcc/testsuite/gcc.dg/pr101008.c 
b/gcc/testsuite/gcc.dg/pr101008.c

index c06208d3425..8229769c6ac 100644
--- a/gcc/testsuite/gcc.dg/pr101008.c
+++ b/gcc/testsuite/gcc.dg/pr101008.c
@@ -1,6 +1,7 @@
 /* PR rtl-optimization/101008 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -g" } */
+/* { dg-require-effective-target int32 } */

 typedef unsigned __attribute__((__vector_size__(32))) U;
 typedef unsigned __attribute__((__vector_size__(16))) V;
diff --git a/gcc/testsuite/gcc.dg/pr104198.c 
b/gcc/testsuite/gcc.dg/pr104198.c

index bfc7a777184..de86f49c9dc 100644
--- a/gcc/testsuite/gcc.dg/pr104198.c
+++ b/gcc/testsuite/gcc.dg/pr104198.c
@@ -3,6 +3,7 @@

 /* { dg-do run } */
 /* { dg-options "-O2 -std=c99" } */
+/* { dg-require-effective-target int32plus } */

 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/pr21137.c b/gcc/testsuite/gcc.dg/pr21137.c
index 6d73deaee6c..199555a5017 100644
--- a/gcc/testsuite/gcc.dg/pr21137.c
+++ b/gcc/testsuite/gcc.dg/pr21137.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-require-effective-target int32plus } */

 void foo();

diff --git a/gcc/testsuite/gcc.dg/pr88905

Re: [PATCH] c-family: implement -ffp-contract=on

2023-05-22 Thread Alexander Monakov via Gcc-patches


On Mon, 22 May 2023, Richard Biener wrote:

> On Thu, May 18, 2023 at 11:04 PM Alexander Monakov via Gcc-patches
>  wrote:
> >
> > Implement -ffp-contract=on for C and C++ without changing default
> > behavior (=off for -std=cNN, =fast for C++ and -std=gnuNN).
> 
> The documentation changes mention the defaults are changed for
> standard modes, I suppose you want to remove that hunk.

No, the current documentation is incomplete, and that hunk extends it
to match the current GCC behavior. Should I break it out to a separate
patch? I see this drive-by fix could look confusing — sorry about that.
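
For readers unfamiliar with the option, a small sketch of what contraction means, assuming the usual reading of "on" as "contract only within a single expression". The function names are invented, and the comments describe the general idea rather than the output of any particular GCC revision.

```c
/* One expression: a mode that contracts within expressions may turn the
   multiply and add into a single fused multiply-add.  */
static double
mul_add_one_expr (double a, double b, double c)
{
  return a * b + c;
}

/* Two statements: the product is a separate expression, so contraction
   limited to single expressions would leave it alone; a mode that also
   contracts across statements could still fuse it.  */
static double
mul_add_two_stmts (double a, double b, double c)
{
  double t = a * b;
  return t + c;
}
```

For inputs where the product is exactly representable, both functions return the same value whether or not contraction happens; the difference only shows up when the intermediate rounding matters.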

> it would be possible to do
> 
>   *expr_p = build_call_expr_internal (ifn, type, ops[0], ops[1], ops[2]);
>   return GS_OK;
> 
> and not worry about temporary creation and gimplifying of the operands.
> That would in theory also leave the possibility to do this during
> genericization instead (and avoid the guard against late invocation of
> the hook).

Ah, no, I deliberately decided against that, because that way we would go
via gimplify_arg, which would emit all side effects in *pre_p. That seems
wrong if arguments had side-effects that should go in *post_p.

Thanks.
Alexander

> Otherwise it looks OK, but I'll let frontend maintainers have a chance to look
> as well.
> 
> Thanks for tackling this long-standing issue.
> Richard.


Re: [PATCH 1/2] Improve do_store_flag for single bit comparison against 0

2023-05-22 Thread Andrew Pinski via Gcc-patches
On Mon, May 22, 2023 at 4:56 AM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, May 19, 2023 at 4:15 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > While working on something else, I noticed we could improve
> > the code generation for the following function:
> > ```
> > unsigned f(unsigned t)
> > {
> >   if (t & ~(1<<30)) __builtin_unreachable();
> >   return t != 0;
> > }
> > ```
> > Right now we just emit a comparison against 0 instead
> > of just a shift right by 30.
> > There is code in do_store_flag which already optimizes
> > `(t & 1<<30) != 0` to `(t >> 30) & 1`. This patch
> > extends it to handle the case where we know t has
> > only one bit that can be nonzero.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > gcc/ChangeLog:
> >
> > * expr.cc (do_store_flag): Extend the one bit checking case
> > to handle the case where we don't have an and but rather still
> > one bit is known to be non-zero.
> > ---
> >  gcc/expr.cc | 27 +--
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/expr.cc b/gcc/expr.cc
> > index 5ede094e705..91528e734e7 100644
> > --- a/gcc/expr.cc
> > +++ b/gcc/expr.cc
> > @@ -13083,15 +13083,30 @@ do_store_flag (sepops ops, rtx target, 
> > machine_mode mode)
> >&& integer_zerop (arg1)
> >&& (TYPE_PRECISION (ops->type) != 1 || TYPE_UNSIGNED (ops->type)))
> >  {
> > -  gimple *srcstmt = get_def_for_expr (arg0, BIT_AND_EXPR);
> > -  if (srcstmt
> > - && integer_pow2p (gimple_assign_rhs2 (srcstmt)))
> > +  wide_int nz = tree_nonzero_bits (arg0);
> > +
> > +  if (wi::popcount (nz) == 1)
> > {
> > + tree op0;
> > + tree op1;
> > + gimple *srcstmt = get_def_for_expr (arg0, BIT_AND_EXPR);
> > + /* If the defining statement was (x & POW2), then remove the and
> > +as we are going to add it back. */
> > + if (srcstmt
> > + && integer_pow2p (gimple_assign_rhs2 (srcstmt)))
> > +   {
> > + op0 = gimple_assign_rhs1 (srcstmt);
> > + op1 = gimple_assign_rhs2 (srcstmt);
> > +   }
> > + else
> > +   {
> > + op0 = arg0;
> > + op1 = wide_int_to_tree (TREE_TYPE (op0), nz);
> > +   }
> >   enum tree_code tcode = code == NE ? NE_EXPR : EQ_EXPR;
> >   type = lang_hooks.types.type_for_mode (mode, unsignedp);
> > - tree temp = fold_build2_loc (loc, BIT_AND_EXPR, TREE_TYPE (arg1),
> > -  gimple_assign_rhs1 (srcstmt),
> > -  gimple_assign_rhs2 (srcstmt));
> > + tree temp = fold_build2_loc (loc, BIT_AND_EXPR, TREE_TYPE (op0),
> > +  op0, op1);
> >   temp = fold_single_bit_test (loc, tcode, temp, arg1, type);
> >   if (temp)
> > return expand_expr (temp, target, VOIDmode, EXPAND_NORMAL);
>
> I wonder if, instead of extending expand with these kinds of tricks, we
> want to add to ISEL and use direct optab IFNs for things we matched?
> In particular I think we do want to get rid of TER, but the above adds
> another use of get_def_for_expr.

The above does not add another use at all. It was there before; the patch
just moves it around slightly. Instead we depend on the nonzero bits being
correct before even trying get_def_for_expr.
The get_def_for_expr is there to remove the & if it can be TER'ed.

>
> As Jeff says, the above doesn't look like it includes costing, so that would
> be an argument to make it a generic match.pd transform (it appears to be
> "simpler")?

For the TER case, it would be the same number of gimple instructions, so
that can happen if we want:
t = a & CST
result = t != 0
vs:
t1 = BIT_FIELD_REF 
result = (bool)t1

For the non-TER case (which is what this patch is trying to solve),
we just have `t != 0` (where t has a nonzero-bits value of CST), so it might
increase the number of gimple instructions by 1.

Is that OK? Or should that still happen in expand only?

The cost issue between a != 0 vs bit_extraction (for the non-ter case)
is something which I will be solving next weekend.
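
As a small aside, the equivalence the patch relies on can be checked directly. The sketch below uses invented function names and assumes unsigned int has at least 31 value bits; it compares the two forms on the only values allowed by the precondition `(t & ~(1u << 30)) == 0`.

```c
#include <assert.h>

/* Comparison form, as written in the source.  */
static unsigned
ne_zero (unsigned t)
{
  return t != 0;
}

/* Shift form: equivalent only when bit 30 is the sole bit that can be
   set in t.  */
static unsigned
shift_bit30 (unsigned t)
{
  return (t >> 30) & 1u;
}

/* Under the precondition, t is either 0 or 1u << 30; check both.  */
static void
check_equivalence (void)
{
  assert (ne_zero (0u) == shift_bit30 (0u));
  assert (ne_zero (1u << 30) == shift_bit30 (1u << 30));
}
```

For any other value of t the two forms can differ (for example t == 1), which is why the transformation needs the nonzero-bits information to be correct.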

>
> Richard.
>
> > --
> > 2.31.1
> >

