Fix update_bb_profile_for_threading

2023-07-01 Thread Jan Hubicka via Gcc-patches
Hi,
this patch fixes some of the profile mismatches caused by profile updating.
It seems that I broke update_bb_profile_for_threading in 2017, which
results in invalid updates from RTL threading and threadbackwards.
update_bb_profile_for_threading knows that some paths to BB are being
redirected elsewhere and that those paths will exit from BB via E.  So it needs to
determine the probability of the duplicated path and redistribute the probabilities.
For some reason, however, the conditional probability of the redirected path is
computed after its count is subtracted, which is wrong and often results in
a probability greater than 100%.
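
The ordering problem described above can be sketched with plain integer arithmetic. This is only an illustration under stated assumptions, not the code in cfg.cc: the function names are hypothetical, probabilities are whole percents, and the rescaling formula is the usual conditional-probability update rather than a quote of GCC's implementation.

```c
#include <assert.h>

/* Probability, in percent, that an execution of BB arrived over the
   redirected path.  Both counts are taken BEFORE the path is removed.  */
static long path_probability_percent(long bb_count, long path_count)
{
    assert(bb_count > 0 && path_count >= 0 && path_count <= bb_count);
    return 100 * path_count / bb_count;
}

/* The ordering mistake: BB's count is reduced first, so the same path
   count is divided by a smaller number and may exceed 100%.  */
static long path_probability_after_subtraction(long bb_count, long path_count)
{
    long remaining = bb_count - path_count;
    assert(remaining > 0);
    return 100 * path_count / remaining;
}

/* Rescale an outgoing edge probability once paths with probability
   REMOVED_PERCENT are gone.  For edge E, pass its old probability
   minus REMOVED_PERCENT; for every other edge, pass it unchanged.  */
static long rescale_percent(long edge_percent, long removed_percent)
{
    assert(edge_percent >= 0 && removed_percent >= 0 && removed_percent < 100);
    return 100 * edge_percent / (100 - removed_percent);
}
```

With a BB count of 100 and a redirected path count of 60, the first function gives 60% while the second gives 150%. If E originally had 70% and the other edge 30%, rescaling yields 25% and 75%, which again sum to 100%.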

I also fixed the error message.  Compiling tramp3d I now get the following passes
producing mismatches:
Pass dump id and name|static mismat|dynamic mismatch
 |in count |in count  
113t fre |  2+2|0 
114t mergephi|  2  |0 
115t threadfull  |  2  |0 
116t vrp |  2  |0 
127t ch  |307  +305|347194302   +347194302
130t thread  |313+6|347221478   +27176
131t dom |321+8|346841121  -380357
134t reassoc |323+2|346841121 
136t forwprop|327+4|347026371  +185250
144t pre |326-1|347040926   +14555
172t ifcvt   |338+2|347218249  +156280
173t vect|409   +71|356357418 +9139169
176t cunroll |377   -32|126071925   -230285493
183t loopdone|376-1|126015489   -56436
194t tracer  |379+3|127258199 +1242710
197t dom |375-4|128352165 +1093966
199t threadfull  |379+4|128526112  +173947
200t vrp |381+2|128724673  +198561
204t dce |374-7|128632495   -92178
206t sink|370-4|128618043   -14452
211t cddce   |372+2|128632495   +14452
248t ehcleanup   |370-2|128618755   -13740
255t optimized   |362-8|128576810   -41945
256r expand  |356-6|128899768  +322958
258r into_cfglayout  |353-3|129051765  +151997
259r jump|354+1|129051765 
262r cse1|353-1|129051765 
275r loop2_unroll|355+2|132182110 +3130345
277r loop2_done  |354-1|132182109   -1
312r pro_and_epilogue|371   +17|132222324   +40215
323r bbro|375+4|132095926  -126398

Without the patch at jump2 time we get over 432 mismatches, so this is a 15%
improvement. Some of the mismatches are unavoidable.

I think the ch mismatches are mostly due to loop header copying where the header
condition constant-propagates.  The most common case should be threadable in early
optimizations, and we could also do better on profile updating here.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

PR tree-optimization/103680
* cfg.cc (update_bb_profile_for_threading): Fix profile update;
make message clearer.

gcc/testsuite/ChangeLog:

PR tree-optimization/103680
* gcc.dg/tree-ssa/pr103680.c: New test.
* gcc.dg/tree-prof/cmpsf-1.c: Un-xfail.

Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-07-01 Thread Andrew Waterman via Gcc-patches
On Fri, Jun 30, 2023 at 5:36 PM Palmer Dabbelt  wrote:
>
> On Fri, 30 Jun 2023 17:25:54 PDT (-0700), Andrew Waterman wrote:
> > On Fri, Jun 30, 2023 at 5:13 PM Vineet Gupta  wrote:
> >>
> >>
> >>
> >> On 6/30/23 16:50, Andrew Waterman wrote:
> >> > I don't believe this is correct; the subtraction is needed to account
> >> > for the fact that the low part might be negative, resulting in a
> >> > borrow from the high part.  See the output for your test case below:
> >> >
> >> > $ cat test.c
> >> > #include <stdio.h>
> >> >
> >> > int main()
> >> > {
> >> >unsigned long result, tmp;
> >> >
> >> > asm (
> >> >"li   %1,-252645376\n"
> >> >"addi %1,%1,240\n"
> >> >"slli %0,%1,32\n"
> >> >"add  %0,%0,%1"
> >> >  : "=r" (result), "=r" (tmp));
> >> >
> >> >printf("%lx\n", result);
> >> >
> >> >return 0;
> >> > }
> >> > $ riscv64-unknown-elf-gcc -O2 test.c
> >> > $ spike pk a.out
> >> > bbl loader
> >> > f0f0f0eff0f0f0f0
> >> > $
> >>
> >> Thx for the quick feedback Andrew. I'm clearly lacking in signed math :-(
> >> So is it possible to have a better code seq for the testcase at all ?
> >
> > You're welcome!
> >
> > When Zba is implemented, then inserting a zext.w would do the trick;
> > see below.  (The generalization is that the zext.w is needed if the
> > 32-bit constant is negative.)  When Zba is not implemented, I think
> > the original sequence is optimal.
> >
> > li      a5, -252645376
> > addi    a5, a5, 240
> > slli    a0, a5, 32
> > zext.w  a5, a5
> > add     a0, a0, a5
>
> For the non-Zba case, I think we can leverage the two high parts
> starting out the same to save an instruction generating the constant.
> So for the original code sequence of
>
> li      a5,-252645376
> addi    a5,a5,241
> li      a0,-252645376
> slli    a5,a5,32
> addi    a0,a0,240
> add     a0,a5,a0
> ret
>
> we could instead generate
>
> li      a5,-252645376
> addi    a0,a5,240
> addi    a5,a5,241
> slli    a5,a5,32
> add     a0,a5,a0
> ret
>
> which IIUC produces the same result.  I think something along the lines
> of this (with the corresponding cost function updates) would do it
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index de578b5b899..32b6033a966 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -704,7 +704,13 @@ riscv_split_integer (HOST_WIDE_INT val, machine_mode mode)
>rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode);
>
>riscv_move_integer (hi, hi, hival, mode);
> -  riscv_move_integer (lo, lo, loval, mode);
> +  if (riscv_integer_cost (loval - hival) + 1 < riscv_integer_cost (loval)) {
> +rtx delta = gen_reg_rtx (mode);
> +riscv_move_integer (delta, delta, loval - hival, mode);
> +lo = gen_rtx_fmt_ee (PLUS, mode, hi, delta);
> +  } else {
> +riscv_move_integer (lo, lo, loval, mode);
> +  }
>
>hi = gen_rtx_fmt_ee (ASHIFT, mode, hi, GEN_INT (32));
>hi = force_reg (mode, hi);
>
> though I suppose that would produce a slightly different sequence that has the
> same number of instructions but a slightly longer dependency chain, something
> more like
>
> li      a5,-252645376
> addi    a5,a5,241
> addi    a0,a5,-1
> slli    a5,a5,32
> add     a0,a5,a0
> ret
>
> Take that all with a grain of salt, though, as I just ate some very spicy
> chicken and can barely see straight :)

Yeah, that might end up being a false economy for superscalars.

In general, I wouldn't recommend spending too many cleverness beans on
non-Zba+Zbb implementations.  Going forward, we should expect that
even very simple cores provide those extensions.
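
The borrow discussed earlier in this thread can be checked with ordinary 64-bit arithmetic. The sketch below is an illustration only: the helper names are hypothetical, and `sext32` merely mirrors what a 32-bit sign extension such as `sext_hwi (v, 32)` computes. It uses the constant from the test case, 0xF0F0F0F0F0F0F0F0.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits, using only unsigned
   arithmetic so the result is well defined in C.  */
static uint64_t sext32(uint64_t v)
{
    return ((v & 0xffffffffu) ^ 0x80000000u) - 0x80000000u;
}

/* Low and high halves as the splitting code in the thread computes them.  */
static uint64_t split_lo(uint64_t val) { return sext32(val); }
static uint64_t split_hi(uint64_t val) { return sext32((val - sext32(val)) >> 32); }

/* slli + add: what the final two instructions of the sequence do.  */
static uint64_t recombine(uint64_t hi, uint64_t lo) { return (hi << 32) + lo; }

/* Reusing one register for both halves, without and with zext.w.  */
static uint64_t repeat_without_zext(uint64_t x) { return (x << 32) + x; }
static uint64_t repeat_with_zext(uint64_t x) { return (x << 32) + (x & 0xffffffffu); }

/* The split must always recombine to the original value.  */
static void check_split(uint64_t val)
{
    assert(recombine(split_hi(val), split_lo(val)) == val);
}
```

For this constant the low half sign-extends to 0xFFFFFFFFF0F0F0F0, so the high half has to be 0xF0F0F0F1 (one more than the low half) to absorb the borrow. Adding the shifted and unshifted negative low half gives 0xF0F0F0EFF0F0F0F0, the value printed by the spike run above, while zero-extending the low half first gives the intended constant.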

>
>
> >
> >
> >>
> >> -Vineet
> >>
> >> >
> >> >
> >> > On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta  
> >> > wrote:
> >> >>
> >> >>
> >> >> On 6/30/23 16:33, Vineet Gupta wrote:
> >> >>> Ran into a minor snafu in const splitting code when playing with test
> >> >>> case from an old PR/23813.
> >> >>>
> >> >>>long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; }
> >> >>>
> >> >>> This currently generates
> >> >>>
> >> >>>li      a5,-252645376
> >> >>>addi    a5,a5,241
> >> >>>li      a0,-252645376
> >> >>>slli    a5,a5,32
> >> >>>addi    a0,a0,240
> >> >>>add     a0,a5,a0
> >> >>>ret
> >> >>>
> >> >>> The signed math in hival extraction introduces an additional bit,
> >> >>> causing loval == hival check to fail.
> >> >>>
> >> >>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702
> >> >>> | 702   unsigned HOST_WIDE_INT loval = sext_hwi (val, 32);
> >> >>> | (gdb)n
> >> >>> | 703   unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32);
> >> >>> | (gdb)
> >> >> FWIW (and I missed adding this observation to the change

[PATCH 1/2] Fix PR 110487: invalid signed boolean value

2023-07-01 Thread Andrew Pinski via Gcc-patches
This fixes the first part of this bug, where `a ? -1 : 0`
would put a value of 1 into a signed boolean.
It fixes the problem by casting to an integer type of
the same size/signedness before doing the negation and
then casting to the type of the expression.
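
The idea can be shown in C terms with a small sketch. It is an assumption-laden illustration, not the match.pd pattern: C has no signed 1-bit boolean, so plain `int` stands in for the integer type of matching precision, and the function names are hypothetical. The point is only the order of operations: convert the boolean to an integer type first, then negate there, so the intermediate value 1 never has to be represented in a type whose only valid values are 0 and -1.

```c
#include <assert.h>
#include <stdbool.h>

/* a ? -1 : 0, written as a negation performed in int.  */
static int minus_one_if(bool a)
{
    return -(int)a;
}

/* a ? 0 : -1, written as -(!a), again negating in int.  */
static int minus_one_unless(bool a)
{
    return -(int)!a;
}

/* Both forms must agree with the conditional expressions they replace.  */
static void check_negate_forms(void)
{
    for (int v = 0; v <= 1; v++) {
        bool a = (v != 0);
        assert(minus_one_if(a) == (a ? -1 : 0));
        assert(minus_one_unless(a) == (a ? 0 : -1));
    }
}
```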

OK? Bootstrapped and tested on x86_64.

gcc/ChangeLog:

* match.pd (a?-1:0): Cast to an integer type
rather than the type before the negation.
(a?0:-1): Likewise.
---
 gcc/match.pd | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 45c72e733a5..a0d114f6a16 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4703,7 +4703,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* a ? -1 : 0 -> -a.  No need to check the TYPE_PRECISION not being 1
here as the powerof2cst case above will handle that case correctly.  */
 (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
- (negate (convert (convert:boolean_type_node @0))
+ (with {
+   auto prec = TYPE_PRECISION (type);
+   auto unsign = TYPE_UNSIGNED (type);
+   tree inttype = build_nonstandard_integer_type (prec, unsign);
+  }
+  (convert (negate (convert:inttype (convert:boolean_type_node @0
   (if (integer_zerop (@1))
(with {
   tree booltrue = constant_boolean_node (true, boolean_type_node);
@@ -4722,7 +4727,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* a ? -1 : 0 -> -(!a).  No need to check the TYPE_PRECISION not being 1
here as the powerof2cst case above will handle that case correctly.  */
  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
-  (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } 

+  (with {
+   auto prec = TYPE_PRECISION (type);
+   auto unsign = TYPE_UNSIGNED (type);
+   tree inttype = build_nonstandard_integer_type (prec, unsign);
+   }
+   (convert
+   (negate
+ (convert:inttype
+ (bit_xor (convert:boolean_type_node @0) { booltrue; } )
+)
+   )
+   )
+  )
+ )
 )
)
   )
-- 
2.31.1



[PATCH 2/2] PR 110487: `(a !=/== CST1 ? CST2 : CST3)` pattern for type safety

2023-07-01 Thread Andrew Pinski via Gcc-patches
The problem here is we might produce some values out of the type's
min/max (and/or valid values, e.g. signed booleans). The fix is to
use an integer type which has the same precision and signedness
as the original type.

Note two_value_replacement in phiopt had the same issue in previous
versions; though I don't know if a problem will show up there.
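
For readers unfamiliar with the pattern, here is a rough sketch of the arithmetic it produces, based on the comments visible in the diff below (the value is known to be in [min, min+1] and the two constants differ by one). The function names are hypothetical and unsigned arithmetic is used so that wraparound is well defined; this is not the match.pd code itself.

```c
#include <assert.h>

/* x is known to be MIN or MIN + 1.
   Replaces  x == min ? a : a + 1  with an addition.  */
static unsigned select_up(unsigned x, unsigned min, unsigned a)
{
    assert(x == min || x == min + 1);
    return x + (a - min);
}

/* Replaces  x == min ? a : a - 1  with a subtraction.  */
static unsigned select_down(unsigned x, unsigned min, unsigned a)
{
    assert(x == min || x == min + 1);
    return (a + min) - x;
}
```

The constant folded into the addition or subtraction (`a - min` or `a + min`) need not be a valid value of the original type, for example when that type is a boolean, which is why the arithmetic is done in a plain integer type of the same precision and signedness.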

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/110487
* match.pd (a !=/== CST1 ? CST2 : CST3): Always
build a nonstandard integer and use that.
---
 gcc/match.pd | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index a0d114f6a16..9748ad8466e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4797,24 +4797,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
tree type1;
if ((eqne == EQ_EXPR) ^ (wi::to_wide (@1) == min))
  std::swap (arg0, arg1);
-   if (TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
-{
-  /* Avoid performing the arithmetics in bool type which has different
- semantics, otherwise prefer unsigned types from the two with
-the same precision.  */
-  if (TREE_CODE (TREE_TYPE (arg0)) == BOOLEAN_TYPE
-  || !TYPE_UNSIGNED (type))
-type1 = TREE_TYPE (@0);
-  else
-type1 = TREE_TYPE (arg0);
-}
-   else if (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type))
+   if (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type))
 type1 = TREE_TYPE (@0);
else
 type1 = type;
-   min = wide_int::from (min, TYPE_PRECISION (type1),
+   auto prec = TYPE_PRECISION (type1);
+   auto unsign = TYPE_UNSIGNED (type1);
+   type1 = build_nonstandard_integer_type (prec, unsign);
+   min = wide_int::from (min, prec,
 TYPE_SIGN (TREE_TYPE (@0)));
-   wide_int a = wide_int::from (wi::to_wide (arg0), TYPE_PRECISION (type1),
+   wide_int a = wide_int::from (wi::to_wide (arg0), prec,
TYPE_SIGN (type));
enum tree_code code;
wi::overflow_type ovf;
@@ -4822,7 +4814,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 {
   code = PLUS_EXPR;
   a -= min;
-  if (!TYPE_UNSIGNED (type1))
+  if (!unsign)
 {
   /* lhs is known to be in range [min, min+1] and we want to add a
  to it.  Check if that operation can overflow for those 2 
values
@@ -4836,7 +4828,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 {
   code = MINUS_EXPR;
   a += min;
-  if (!TYPE_UNSIGNED (type1))
+  if (!unsign)
 {
   /* lhs is known to be in range [min, min+1] and we want to 
subtract
  it from a.  Check if that operation can overflow for those 2
-- 
2.31.1



[PATCH 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2023-07-01 Thread Manolis Tsamis


noce_convert_multiple_sets has been introduced and extended over time to handle
if conversion for blocks with multiple sets. Currently this is focused on
register moves and rejects any sort of arithmetic operations.

This series is an extension to allow more sequences to take part in if
conversion. The first patch is a required change to emit correct code and the
second patch whitelists a larger number of operations through
bb_ok_for_noce_convert_multiple_sets.

For targets that have a rich selection of conditional instructions,
like aarch64, I have seen a ~5x increase in profitable if conversions for
multiple set blocks in SPEC benchmarks. I also tested with a wide variety of
benchmarks and have not seen performance regressions on either x64 or aarch64.

Some samples that previously resulted in a branch but now better use these
instructions can be seen in the provided test case.

Tested on aarch64 and x64; on x64 some tests that use __builtin_rint are
failing with an ICE, but I believe that it's not an issue with this change.
force_operand crashes when (and:DF (not:DF (reg:DF 88)) (reg/v:DF 83 [ x ]))
is provided through emit_conditional_move.



Manolis Tsamis (2):
  ifcvt: handle sequences that clobber flags in
noce_convert_multiple_sets
  ifcvt: Allow more operations in multiple set if conversion

 gcc/ifcvt.cc  | 109 ++
 .../aarch64/ifcvt_multiple_sets_arithm.c  |  67 +++
 2 files changed, 127 insertions(+), 49 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c

-- 
2.34.1



[PATCH 1/2] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-07-01 Thread Manolis Tsamis
This is an extension of what was done in PR106590.

Currently, if a sequence generated in noce_convert_multiple_sets clobbers the
condition rtx (cc_cmp or rev_cc_cmp), then only seq1 is used afterwards
(sequences that emit the comparison itself). Since this applies only from the
next iteration, it assumes that the sequences generated (in particular seq2)
don't clobber the condition rtx itself before using it in the if_then_else,
which is only true in specific cases (currently only register/subregister moves
are allowed).

This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
the current iteration. This makes it possible to include arithmetic operations
in noce_convert_multiple_sets.
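
Why a clobbered condition matters can be seen in a toy model. The sketch below is purely illustrative and makes several assumptions: `toy_cpu`, `adds`, and `csel` are hypothetical stand-ins for a flags register, a flag-setting add, and a conditional select, and they do not correspond to GCC's RTL or to any real instruction semantics. The first function reuses an earlier comparison (in the spirit of seq2), the second emits its own comparison right before the select (in the spirit of seq1).

```c
#include <assert.h>
#include <stdbool.h>

struct toy_cpu { bool flag_gt; };

/* Compare: sets the flag to x > y.  */
static void cmp_gt(struct toy_cpu *cpu, int x, int y) { cpu->flag_gt = x > y; }

/* Flag-setting add: returns a + b and overwrites the flag.  */
static int adds(struct toy_cpu *cpu, int a, int b)
{
    int r = a + b;
    cpu->flag_gt = r > 0;
    return r;
}

/* Conditional select on the current flag.  */
static int csel(const struct toy_cpu *cpu, int t, int f) { return cpu->flag_gt ? t : f; }

/* Reuses the earlier comparison; the add clobbers it before the select.  */
static int convert_reusing_flags(int x, int y, int a, int b, int i)
{
    struct toy_cpu cpu = { false };
    cmp_gt(&cpu, x, y);
    int tmp = adds(&cpu, a, b);
    return csel(&cpu, tmp, i);
}

/* Emits the comparison after the arithmetic, immediately before the select.  */
static int convert_with_own_compare(int x, int y, int a, int b, int i)
{
    struct toy_cpu cpu = { false };
    int tmp = adds(&cpu, a, b);
    cmp_gt(&cpu, x, y);
    return csel(&cpu, tmp, i);
}

/* The source-level meaning: if (x > y) i = a + b;  */
static int reference(int x, int y, int a, int b, int i)
{
    return (x > y) ? a + b : i;
}

static void check_toy_sequences(void)
{
    assert(convert_with_own_compare(1, 2, 3, 4, 9) == reference(1, 2, 3, 4, 9));
    assert(convert_reusing_flags(1, 2, 3, 4, 9) != reference(1, 2, 3, 4, 9));
}
```

With x = 1 and y = 2 the condition is false, so the result should stay 9; the flag-reusing variant returns 7 because the add overwrote the flag.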

gcc/ChangeLog:

* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.

Signed-off-by: Manolis Tsamis 
---

 gcc/ifcvt.cc | 49 +++--
 1 file changed, 19 insertions(+), 30 deletions(-)

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 0b180b4568f..fd1ce8a1049 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3373,20 +3373,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   return TRUE;
 }
 
-/* Helper function for noce_convert_multiple_sets_1.  If store to
-   DEST can affect P[0] or P[1], clear P[0].  Called via note_stores.  */
-
-static void
-check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0)
-{
-  rtx *p = (rtx *) p0;
-  if (p[0] == NULL_RTX)
-return;
-  if (reg_overlap_mentioned_p (dest, p[0])
-  || (p[1] && reg_overlap_mentioned_p (dest, p[1])))
-p[0] = NULL_RTX;
-}
-
 /* This goes through all relevant insns of IF_INFO->then_bb and tries to
create conditional moves.  In case a simple move sufficis the insn
should be listed in NEED_NO_CMOV.  The rewired-src cases should be
@@ -3550,9 +3536,17 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
 creating an additional compare for each.  If successful, costing
 is easier and this sequence is usually preferred.  */
   if (cc_cmp)
-   seq2 = try_emit_cmove_seq (if_info, temp, cond,
-  new_val, old_val, need_cmov,
-  &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
+   {
+ seq2 = try_emit_cmove_seq (if_info, temp, cond,
+new_val, old_val, need_cmov,
+&cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
+
+ /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is
+clobbered.  We can't safely use the sequence in this case.  */
+ if (seq2 && (modified_in_p (cc_cmp, seq2)
+ || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2
+   seq2 = NULL;
+   }
 
   /* The backend might have created a sequence that uses the
 condition.  Check this.  */
@@ -3607,21 +3601,16 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
  return FALSE;
}
 
-  if (cc_cmp)
+  if (cc_cmp && seq == seq1)
{
- /* Check if SEQ can clobber registers mentioned in
-cc_cmp and/or rev_cc_cmp.  If yes, we need to use
-only seq1 from that point on.  */
- rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
- for (walk = seq; walk; walk = NEXT_INSN (walk))
+ /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp.
+If yes, we need to use only seq1 from that point on.
+Only check when we use seq1 since we have already tested seq2.  */
+ if (modified_in_p (cc_cmp, seq)
+ || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq)))
{
- note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
- if (cc_cmp_pair[0] == NULL_RTX)
-   {
- cc_cmp = NULL_RTX;
- rev_cc_cmp = NULL_RTX;
- break;
-   }
+ cc_cmp = NULL_RTX;
+ rev_cc_cmp = NULL_RTX;
}
}
 
-- 
2.34.1



[PATCH 2/2] ifcvt: Allow more operations in multiple set if conversion

2023-07-01 Thread Manolis Tsamis
Currently the operations allowed for if conversion of a basic block with
multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by
bb_ok_for_noce_convert_multiple_sets).

This commit allows more operations (arithmetic, compare, etc) to participate
in if conversion. The target's profitability hook and ifcvt's costing are
expected to reject sequences that are unprofitable.

This is especially useful for targets which provide a rich selection of
conditional instructions (like aarch64 which has cinc, csneg, csinv, ccmp, ...)
which are currently not used in basic blocks with more than a single set.
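
As a source-level picture of what such a conversion amounts to, here is a small sketch with arithmetic on the right-hand sides. It is only an analogy written in C with hypothetical names; the real transformation works on RTL and is subject to the costing mentioned above.

```c
#include <assert.h>

struct vals { int i, j; };

/* Branchy form: a block with two sets guarded by one condition.  */
static struct vals with_branch(int x, int y, int a, int b, struct vals v)
{
    if (x > y) {
        v.i = a + 1;
        v.j = b - a;
    }
    return v;
}

/* If-converted form: each set becomes a conditional select into a
   temporary, followed by unconditional copies.  */
static struct vals with_selects(int x, int y, int a, int b, struct vals v)
{
    int tmp_i = (x > y) ? a + 1 : v.i;
    int tmp_j = (x > y) ? b - a : v.j;
    v.i = tmp_i;
    v.j = tmp_j;
    return v;
}

/* Returns nonzero when both forms agree for the given inputs.  */
static int same_result(int x, int y, int a, int b)
{
    struct vals start = { 7, 8 };
    struct vals p = with_branch(x, y, a, b, start);
    struct vals q = with_selects(x, y, a, b, start);
    assert(p.i == q.i && p.j == q.j);
    return p.i == q.i && p.j == q.j;
}
```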

gcc/ChangeLog:

* ifcvt.cc (try_emit_cmove_seq): Modify comments.
(noce_convert_multiple_sets_1): Modify comments.
(bb_ok_for_noce_convert_multiple_sets): Allow more operations.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.

Signed-off-by: Manolis Tsamis 
---

 gcc/ifcvt.cc  | 60 +++--
 .../aarch64/ifcvt_multiple_sets_arithm.c  | 67 +++
 2 files changed, 108 insertions(+), 19 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index fd1ce8a1049..a9e5352a0a0 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3213,13 +3213,13 @@ try_emit_cmove_seq (struct noce_if_info *if_info, rtx 
temp,
 /* We have something like:
 
  if (x > y)
-   { i = a; j = b; k = c; }
+   { i = EXPR_A; j = EXPR_B; k = EXPR_C; }
 
Make it:
 
- tmp_i = (x > y) ? a : i;
- tmp_j = (x > y) ? b : j;
- tmp_k = (x > y) ? c : k;
+ tmp_i = (x > y) ? EXPR_A : i;
+ tmp_j = (x > y) ? EXPR_B : j;
+ tmp_k = (x > y) ? EXPR_C : k;
  i = tmp_i;
  j = tmp_j;
  k = tmp_k;
@@ -3635,11 +3635,10 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
 
 
 
-/* Return true iff basic block TEST_BB is comprised of only
-   (SET (REG) (REG)) insns suitable for conversion to a series
-   of conditional moves.  Also check that we have more than one set
-   (other routines can handle a single set better than we would), and
-   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
+/* Return true iff basic block TEST_BB is suitable for conversion to a
+   series of conditional moves.  Also check that we have more than one
+   set (other routines can handle a single set better than we would),
+   and fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
through the insns store the sum of their potential costs in COST.  */
 
 static bool
@@ -3665,20 +3664,43 @@ bb_ok_for_noce_convert_multiple_sets (basic_block 
test_bb, unsigned *cost)
   rtx dest = SET_DEST (set);
   rtx src = SET_SRC (set);
 
-  /* We can possibly relax this, but for now only handle REG to REG
-(including subreg) moves.  This avoids any issues that might come
-from introducing loads/stores that might violate data-race-freedom
-guarantees.  */
-  if (!REG_P (dest))
+  /* Do not handle anything involving memory loads/stores since it might
+violate data-race-freedom guarantees.  */
+  if (!REG_P (dest) || contains_mem_rtx_p (src))
return false;
 
-  if (!((REG_P (src) || CONSTANT_P (src))
-   || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
- && subreg_lowpart_p (src
+  /* Allow a wide range of operations and let the costing function decide
+if the conversion is worth it later.  */
+  enum rtx_code code = GET_CODE (src);
+  if (!(CONSTANT_P (src)
+   || code == REG
+   || code == SUBREG
+   || code == ZERO_EXTEND
+   || code == SIGN_EXTEND
+   || code == NOT
+   || code == NEG
+   || code == PLUS
+   || code == MINUS
+   || code == AND
+   || code == IOR
+   || code == MULT
+   || code == ASHIFT
+   || code == ASHIFTRT
+   || code == NE
+   || code == EQ
+   || code == GE
+   || code == GT
+   || code == LE
+   || code == LT
+   || code == GEU
+   || code == GTU
+   || code == LEU
+   || code == LTU
+   || code == COMPARE))
return false;
 
-  /* Destination must be appropriate for a conditional write.  */
-  if (!noce_operand_ok (dest))
+  /* Destination and source must be appropriate.  */
+  if (!noce_operand_ok (dest) || !noce_operand_ok (src))
return false;
 
   /* We must be able to conditionally move in this mode.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c 
b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
new file mode 100644
index 000..f29cc72263a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
@@ -0,0 +1,67 @@
+/* { dg-do compile } */
+/* { dg-options "-O2

Fix profile updates in copy-header

2023-07-01 Thread Jan Hubicka via Gcc-patches
Hi,
the most common source of profile mismatches is now the copy-header pass.  The reason is that
in the common case the duplicated header condition becomes constant true, and that needs
changes in the loop exit condition probability.

While this can be done by jump threading, it is not, since jump threading gives up on loops.
The copy-header pass now has logic to prove that the first exit will become true, so this
patch adds the necessary plumbing to the profile updating.
This is done in gimple_duplicate_sese_region in a way that is specific to this
particular case.  I think the general case is kind of unsolvable and loop-ch is the
only user of the infrastructure.  If we later invent some new users, maybe we
can export the region and region_copy arrays and let the user do the update.
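
A small worked example may help show why the exit probability has to change. The numbers and the helper name below are made up for illustration and say nothing about how gimple_duplicate_sese_region computes the update. Assume a `while` loop that is entered 100 times and runs its body 4 times per entry: the header test then runs 500 times and exits 100 times. After header copying, if the copied test is known to be true, it runs 100 times and never exits, and the test left in the loop runs 400 times while still accounting for all 100 exits.

```c
#include <assert.h>

/* Exit probability, in percent, of a test that executes TEST_COUNT
   times and leaves the loop EXIT_COUNT times.  */
static long exit_percent(long exit_count, long test_count)
{
    assert(test_count > 0 && exit_count >= 0 && exit_count <= test_count);
    return 100 * exit_count / test_count;
}
```

Before copying, the header exits with probability exit_percent(100, 500), which is 20%. Afterwards the copied header exits with 0% and the in-loop test with exit_percent(100, 400), which is 25%. Leaving 20% on the in-loop test would put only 80 executions on the exit edge, which is exactly the kind of mismatch the dump reports.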

With the patch we now get:

Pass dump id and name|static mismat|dynamic mismatch  
 |in count |in count  
107t cunrolli|  3+3|19237   +19237
127t ch  | 13   +10|19237 
131t dom | 39   +26|19237 
133t isolate-paths   | 47+8|19237 
134t reassoc | 49+2|19237 
136t forwprop| 53+4|   226943  +207706
159t cddce   | 61+8|   242222   +15279
161t ldist   | 62+1|   242222 
172t ifcvt   | 66+4|   415472  +173250
173t vect|143   +77| 10859784+10444312
176t cunroll |294  +151|150357763   +139497979
183t loopdone|291-3|150289533   -68230
194t tracer  |322   +31|153230990 +2941457
195t fre |317-5|153230990 
197t dom |286   -31|154448079 +1217089
199t threadfull  |293+7|154724763  +276684
200t vrp |297+4|155042448  +317685
204t dce |294-3|155017073   -25375
206t sink|292-2|155017073 
211t cddce   |298+6|155018657+1584
255t optimized   |296-2|155018657 
256r expand  |273   -23|154592622  -426035
258r into_cfglayout  |268-5|154592661  +39
275r loop2_unroll|272+4|159701866 +5109205
291r ce2 |270-2|159723509 
312r pro_and_epilogue|290   +20|159792505   +68996
315r jump2   |296+6|164234016 +4441511
323r bbro|294-2|159385430 -4848586

So ch introduces 10 new mismatches while originally it introduced 308.  At bbro the
number of mismatches dropped from 432 to 294.
The worst offender is now the cunroll pass. I think it is the case where a loop has
multiple exits and one of the exits becomes false in all but the last peeled iteration.

This is another case where a non-trivial loop update is needed.

Honza

gcc/ChangeLog:

* tree-cfg.cc (gimple_duplicate_sese_region): Add elliminated_edge
parameter; update profile.
* tree-cfg.h (gimple_duplicate_sese_region): Update prototype.
* tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Rename to ...
(static_loop_exit): ... this; return the edge to be eliminated.
(ch_base::copy_headers): Handle profile updating for eliminated exits.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ifc-20040816-1.c: Reduce number of mismatches
from 2 to 1.
* gcc.dg/tree-ssa/loop-ch-profile-1.c: New test.
* gcc.dg/tree-ssa/loop-ch-profile-2.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
index f8a6495cbaa..b55a533e374 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
@@ -39,4 +39,4 @@ int main1 ()
which is folded by vectorizer.  Both outgoing edges must have probability
100% so the resulting profile match after folding.  */
 /* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 
200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 2 
"ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 
"ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-ch-profile-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/loop-ch-profile-1.c
new file mode 100644
index 000..e8bab62b0d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-ch-profile-1.c
@@ -0

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-01 Thread Robin Dapp via Gcc-patches
> There has to be some kind of mismatch between the patch or testcase
> or what we're looking at to judge success.

Yeah I think the initially posted example was misleading because it
contained an already working example.

> While I really don't see the need to have the bridge pattern, I'm
> still willing to believe that I've missed something, which is why I
> wanted to dive into it myself.  For example, we have heuristics to
> avoid trying too many 4->n combine patterns and we might be tripping
> over that or who knows what.
> 
> So my suggestion is that if both of you are getting the desired code,
> then Robin handle the review side of the two patches that introduce
> the helper patterns.

I went over both patches again and given the context they seem
reasonable to me.  I'd propose going with both of them for now and - in
the meantime - I'm going to brush up on my combine knowledge some
time in the next weeks and get back to this then, hopefully with a
better explanation than my last one.

Regards
 Robin


Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-07-01 Thread Jeff Law via Gcc-patches




On 7/1/23 02:00, Andrew Waterman wrote:

> Yeah, that might end up being a false economy for superscalars.
>
> In general, I wouldn't recommend spending too many cleverness beans on
> non-Zba+Zbb implementations.  Going forward, we should expect that
> even very simple cores provide those extensions.

I suspect you under-estimate how difficult it is to get the distros to
move forward on baseline ISAs.


jeff


[GCC 11][committed] d: Fix ICE in setValue, at d/dmd/dinterpret.c:7013

2023-07-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports an ICE fix from upstream which is already part of
GCC-12 and later.  When casting null to integer or real, instead of
painting the type on the NullExp, we emplace an IntegerExp/RealExp with
the value zero.  Same as when casting from NullExp to bool.

Bootstrapped and regression tested on x86_64-linux-gnu, committed to
releases/gcc-11, and backported to releases/gcc-10.

Regards,
Iain.

---
Reviewed-on: https://github.com/dlang/dmd/pull/13172

PR d/110511

gcc/d/ChangeLog:

* dmd/dinterpret.c (Interpreter::visit (CastExp *)): Handle casting
null to int or float.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/test21794.d: New test.
---
 gcc/d/dmd/dinterpret.c| 12 -
 gcc/testsuite/gdc.test/compilable/test21794.d | 52 +++
 2 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gdc.test/compilable/test21794.d

diff --git a/gcc/d/dmd/dinterpret.c b/gcc/d/dmd/dinterpret.c
index ab9d88c660c..d4cfb0caacb 100644
--- a/gcc/d/dmd/dinterpret.c
+++ b/gcc/d/dmd/dinterpret.c
@@ -5792,12 +5792,22 @@ public:
 }
 if (e->to->ty == Tsarray)
 e1 = resolveSlice(e1);
-if (e->to->toBasetype()->ty == Tbool && e1->type->ty == Tpointer)
+Type *tobt = e->to->toBasetype();
+if (tobt->ty == Tbool && e1->type->ty == Tpointer)
 {
 new(pue) IntegerExp(e->loc, e1->op != TOKnull, e->to);
 result = pue->exp();
 return;
 }
+else if (tobt->isTypeBasic() && e1->op == TOKnull)
+{
+if (tobt->isintegral())
+new(pue) IntegerExp(e->loc, 0, e->to);
+else if (tobt->isreal())
+new(pue) RealExp(e->loc, CTFloat::zero, e->to);
+result = pue->exp();
+return;
+}
 result = ctfeCast(pue, e->loc, e->type, e->to, e1);
 }
 
diff --git a/gcc/testsuite/gdc.test/compilable/test21794.d 
b/gcc/testsuite/gdc.test/compilable/test21794.d
new file mode 100644
index 000..68e504bce56
--- /dev/null
+++ b/gcc/testsuite/gdc.test/compilable/test21794.d
@@ -0,0 +1,52 @@
+// https://issues.dlang.org/show_bug.cgi?id=21794
+/*
+TEST_OUTPUT:
+---
+0
+0u
+0L
+0LU
+0.0F
+0.0
+0.0L
+---
+*/
+
+bool fun(void* p) {
+const x = cast(ulong)p;
+return 1;
+}
+
+static assert(fun(null));
+
+T fun2(T)(void* p) {
+const x = cast(T)p;
+return x;
+}
+
+// These were an error before, they were returning a NullExp instead of IntegerExp/RealExp
+
+static assert(fun2!int(null)== 0);
+static assert(fun2!uint(null)   == 0);
+static assert(fun2!long(null)   == 0);
+static assert(fun2!ulong(null)  == 0);
+static assert(fun2!float(null)  == 0);
+static assert(fun2!double(null) == 0);
+static assert(fun2!real(null)   == 0);
+
+// These were printing 'null' instead of the corresponding number
+
+const i = cast(int)null;
+const ui = cast(uint)null;
+const l = cast(long)null;
+const ul = cast(ulong)null;
+const f = cast(float)null;
+const d = cast(double)null;
+const r = cast(real)null;
+pragma(msg, i);
+pragma(msg, ui);
+pragma(msg, l);
+pragma(msg, ul);
+pragma(msg, f);
+pragma(msg, d);
+pragma(msg, r);
-- 
2.39.2



Re: [pushed] wwwdocs: Add GCC Code of Conduct

2023-07-01 Thread Gerald Pfeifer
On Tue, 20 Jun 2023, Jason Merrill via Gcc-patches wrote:
> As announced on gcc@.

Here is a minor follow-up that I just pushed.

Gerald


From f87deaa12cccb4b7398a8ec3b306cb4185aae012 Mon Sep 17 00:00:00 2001
From: Gerald Pfeifer 
Date: Fri, 30 Jun 2023 14:59:27 +0200
Subject: [PATCH] conduct: Fix nested lists

---
 htdocs/conduct.html | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/htdocs/conduct.html b/htdocs/conduct.html
index 8fb62e86..da940a47 100644
--- a/htdocs/conduct.html
+++ b/htdocs/conduct.html
@@ -61,7 +61,7 @@ affect a person's ability to participate within them.
   Be careful in the words that you choose. Be kind to
   others. Do not insult or put down other participants. Harassment and other
   exclusionary behavior aren't acceptable. This includes, but is not limited
-  to:
+  to:
 
   
 Violent threats or language directed against another person.
@@ -73,6 +73,7 @@ affect a person's ability to participate within them.
 Advocating for, or encouraging, any of the above behavior.
 Repeated harassment of others. In general, if someone asks you to stop, then stop.
   
+  
 
   When we disagree, try to understand why. Disagreements,
   both social and technical, happen all the time and the GCC community is no
-- 
2.41.0



Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-07-01 Thread Palmer Dabbelt

On Sat, 01 Jul 2023 07:04:16 PDT (-0700), jeffreya...@gmail.com wrote:



On 7/1/23 02:00, Andrew Waterman wrote:



Yeah, that might end up being a false economy for superscalars.

In general, I wouldn't recommend spending too many cleverness beans on
non-Zba+Zbb implementations.  Going forward, we should expect that
even very simple cores provide those extensions.

I suspect you under-estimate how difficult it is to get the distros to
move forward on baseline ISAs.


Ya, we haven't even gotten to the point where most implementations are 
shipping with the B extensions, much less to the point where we can 
start ignoring all the pre-B hardware.


Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-07-01 Thread Andrew Waterman via Gcc-patches
On Sat, Jul 1, 2023 at 7:04 AM Jeff Law  wrote:
>
>
>
> On 7/1/23 02:00, Andrew Waterman wrote:
>
> >
> > Yeah, that might end up being a false economy for superscalars.
> >
> > In general, I wouldn't recommend spending too many cleverness beans on
> > non-Zba+Zbb implementations.  Going forward, we should expect that
> > even very simple cores provide those extensions.
> I suspect you under-estimate how difficult it is to get the distros to
> move forward on baseline ISAs.

Yeah, true.

>
> jeff


[committed] d: Don't generate code that throws exceptions when compiling with `-fno-exceptions'

2023-07-01 Thread Iain Buclaw via Gcc-patches
Hi,

The version flags for RTMI, RTTI, and exceptions were unconditionally
predefined.  These are now only predefined if the feature flag is
enabled.  It was noticed that there was no `-fexceptions' definition
inside d/lang.opt, so the detection of the exceptions option flag was
only partially working.  Once that was fixed, a few places in the
front-end implementation were found to fall foul of `nothrow' rules;
these have been fixed upstream and backported here as well.

Bootstrapped and regression tested on x86_64-linux-gnu{-m64,-m32},
committed to mainline, and backported to releases/gcc-13.

Regards,
Iain.

---
Reviewed-on: https://github.com/dlang/dmd/pull/15357
 https://github.com/dlang/dmd/pull/15360

PR d/110471

gcc/d/ChangeLog:

* d-builtins.cc (d_init_versions): Predefine D_ModuleInfo,
D_Exceptions, and D_TypeInfo only if feature is enabled.
* lang.opt: Add -fexceptions.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110471a.d: New test.
* gdc.dg/pr110471b.d: New test.
* gdc.dg/pr110471c.d: New test.

(cherry picked from commit da108c75ad386b3f1f47abb2265296e4b61d578a)
---
 gcc/d/d-builtins.cc  | 9 ++---
 gcc/d/dmd/root/array.d   | 2 +-
 gcc/d/dmd/semantic2.d| 3 +--
 gcc/d/dmd/semantic3.d| 2 +-
 gcc/d/lang.opt   | 4 
 gcc/testsuite/gdc.dg/pr110471a.d | 5 +
 gcc/testsuite/gdc.dg/pr110471b.d | 5 +
 gcc/testsuite/gdc.dg/pr110471c.d | 5 +
 8 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr110471a.d
 create mode 100644 gcc/testsuite/gdc.dg/pr110471b.d
 create mode 100644 gcc/testsuite/gdc.dg/pr110471c.d

diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index f40888019ce..60f76fc694c 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -500,9 +500,12 @@ d_init_versions (void)
 VersionCondition::addPredefinedGlobalIdent ("D_BetterC");
   else
 {
-  VersionCondition::addPredefinedGlobalIdent ("D_ModuleInfo");
-  VersionCondition::addPredefinedGlobalIdent ("D_Exceptions");
-  VersionCondition::addPredefinedGlobalIdent ("D_TypeInfo");
+  if (global.params.useModuleInfo)
+   VersionCondition::addPredefinedGlobalIdent ("D_ModuleInfo");
+  if (global.params.useExceptions)
+   VersionCondition::addPredefinedGlobalIdent ("D_Exceptions");
+  if (global.params.useTypeInfo)
+   VersionCondition::addPredefinedGlobalIdent ("D_TypeInfo");
 }
 
   if (optimize)
diff --git a/gcc/d/dmd/root/array.d b/gcc/d/dmd/root/array.d
index 541a12d9e1d..d1c61be7344 100644
--- a/gcc/d/dmd/root/array.d
+++ b/gcc/d/dmd/root/array.d
@@ -574,7 +574,7 @@ unittest
 private template arraySortWrapper(T, alias fn)
 {
 pragma(mangle, "arraySortWrapper_" ~ T.mangleof ~ "_" ~ fn.mangleof)
-extern(C) int arraySortWrapper(scope const void* e1, scope const void* e2) nothrow
+extern(C) int arraySortWrapper(scope const void* e1, scope const void* e2)
 {
 return fn(cast(const(T*))e1, cast(const(T*))e2);
 }
diff --git a/gcc/d/dmd/semantic2.d b/gcc/d/dmd/semantic2.d
index 440e4cbc8e7..ee268d95251 100644
--- a/gcc/d/dmd/semantic2.d
+++ b/gcc/d/dmd/semantic2.d
@@ -807,9 +807,8 @@ private void doGNUABITagSemantic(ref Expression e, ref Expression* lastTag)
 // but it's a concession to practicality.
 // Casts are unfortunately necessary as `implicitConvTo` is not
 // `const` (and nor is `StringExp`, by extension).
-static int predicate(const scope Expression* e1, const scope Expression* e2) nothrow
+static int predicate(const scope Expression* e1, const scope Expression* e2)
 {
-scope(failure) assert(0, "An exception was thrown");
 return (cast(Expression*)e1).toStringExp().compare((cast(Expression*)e2).toStringExp());
 }
 ale.elements.sort!predicate;
diff --git a/gcc/d/dmd/semantic3.d b/gcc/d/dmd/semantic3.d
index 33a43187fa8..a912e768f0c 100644
--- a/gcc/d/dmd/semantic3.d
+++ b/gcc/d/dmd/semantic3.d
@@ -1420,7 +1420,7 @@ private extern(C++) final class Semantic3Visitor : Visitor
  * https://issues.dlang.org/show_bug.cgi?id=14246
  */
 AggregateDeclaration ad = ctor.isMemberDecl();
-if (!ctor.fbody || !ad || !ad.fieldDtor || !global.params.dtorFields || global.params.betterC || ctor.type.toTypeFunction.isnothrow)
+if (!ctor.fbody || !ad || !ad.fieldDtor || !global.params.dtorFields || !global.params.useExceptions || ctor.type.toTypeFunction.isnothrow)
 return visit(cast(FuncDeclaration)ctor);
 
 /* Generate:
diff --git a/gcc/d/lang.opt b/gcc/d/lang.opt
index 26ca92c4c17..98a95c1dc38 100644
--- a/gcc/d/lang.opt
+++ b/gcc/d/lang.opt
@@ -291,6 +291,10 @@ fdump-d-original
 D
 Display the frontend AST after parsing and semantic passes.
 
+fexceptions
+D
+; Documented in common.opt
+
 fextern-std=
 D Joined RejectNegative Enum(extern_stdcpp) Var(flag_extern_stdcpp)
 -fextern-std=   

[pushed] libphobos, testsuite: Disable forkgc2 on Darwin [PR103944]

2023-07-01 Thread Iain Sandoe via Gcc-patches
From: Iain Sandoe 

This has been in use for some time across all the Darwin versions supported
by D.  It has also been tested on x86_64-linux-gnu.

Approved on irc by Iain Buclaw, pushed to main (and will be backported).
thanks
Iain

--- 8< ---

It hangs the testsuite (requiring manual intervention to kill the
spawned processes) which breaks CI.  The reason for the hang is not
clear.  This skips the test for now (xfail does not work).

Signed-off-by: Iain Sandoe 

PR d/103944

libphobos/ChangeLog:

* testsuite/libphobos.gc/forkgc2.d: Skip for Darwin.
---
 libphobos/testsuite/libphobos.gc/forkgc2.d | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libphobos/testsuite/libphobos.gc/forkgc2.d b/libphobos/testsuite/libphobos.gc/forkgc2.d
index de7796ced72..38d0d0c2f93 100644
--- a/libphobos/testsuite/libphobos.gc/forkgc2.d
+++ b/libphobos/testsuite/libphobos.gc/forkgc2.d
@@ -1,3 +1,4 @@
+// { dg-skip-if "test hangs the testsuite PR103944" { *-*-darwin* } }
 import core.stdc.stdlib : exit;
 import core.sys.posix.sys.wait : waitpid;
 import core.sys.posix.unistd : fork;
-- 
2.39.2 (Apple Git-143)



[PATCH 2/2] xtensa: The use of CLAMPS instruction also requires TARGET_MINMAX, as well as TARGET_CLAMPS

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
Both smin and smax, which require TARGET_MINMAX, are essential to the
RTL representation of CLAMPS.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p):
Simplify.
* config/xtensa/xtensa.md (*xtensa_clamps):
Add TARGET_MINMAX to the condition.
---
 gcc/config/xtensa/xtensa.cc | 7 ++-
 gcc/config/xtensa/xtensa.md | 4 ++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index dd35e63c094..3298d53493c 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -2649,11 +2649,8 @@ xtensa_emit_add_imm (rtx dst, rtx src, HOST_WIDE_INT imm, rtx scratch,
 bool
 xtensa_match_CLAMPS_imms_p (rtx cst_max, rtx cst_min)
 {
-  int max, min;
-
-  return IN_RANGE (max = exact_log2 (-INTVAL (cst_max)), 7, 22)
-&& IN_RANGE (min = exact_log2 (INTVAL (cst_min) + 1), 7, 22)
-&& max == min;
+  return IN_RANGE (exact_log2 (-INTVAL (cst_max)), 7, 22)
+&& (INTVAL (cst_max) + INTVAL (cst_min)) == -1;
 }
 
 
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index b1af08eba8a..664424f1239 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -522,7 +522,7 @@
(smax:SI (smin:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 2 "const_int_operand" "i"))
 (match_operand:SI 3 "const_int_operand" "i")))]
-  "TARGET_CLAMPS
+  "TARGET_MINMAX && TARGET_CLAMPS
&& xtensa_match_CLAMPS_imms_p (operands[3], operands[2])"
   "#"
   "&& 1"
@@ -540,7 +540,7 @@
(smin:SI (smax:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 2 "const_int_operand" "i"))
 (match_operand:SI 3 "const_int_operand" "i")))]
-  "TARGET_CLAMPS
+  "TARGET_MINMAX && TARGET_CLAMPS
&& xtensa_match_CLAMPS_imms_p (operands[2], operands[3])"
 {
   static char result[64];
-- 
2.30.2
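
The simplification of xtensa_match_CLAMPS_imms_p in the patch above rests on
a small identity: if -cst_max is 2^k and cst_min + 1 is the same 2^k, then
cst_max + cst_min == -1, and the converse holds once -cst_max is known to be
a power of two.  The following is a minimal sketch that compares the two
forms on plain integers.  exact_log2_sketch and in_range are stand-ins
written for this example, not GCC's exact_log2 and IN_RANGE, and int64_t
replaces HOST_WIDE_INT.

```cpp
#include <cstdint>

// Stand-in for exact_log2: log2 of x if x is a positive power of two, else -1.
int exact_log2_sketch(std::int64_t x) {
    if (x <= 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        ++n;
    return n;
}

// Stand-in for IN_RANGE: inclusive bounds check.
bool in_range(int v, int lo, int hi) { return v >= lo && v <= hi; }

// The predicate as it was before the patch.
bool clamps_old(std::int64_t cst_max, std::int64_t cst_min) {
    int max = exact_log2_sketch(-cst_max);
    int min = exact_log2_sketch(cst_min + 1);
    return in_range(max, 7, 22) && in_range(min, 7, 22) && max == min;
}

// The predicate as it is after the patch.
bool clamps_new(std::int64_t cst_max, std::int64_t cst_min) {
    return in_range(exact_log2_sketch(-cst_max), 7, 22)
           && cst_max + cst_min == -1;
}

// Compare both forms on values at and around every power-of-two pair,
// including mismatched exponents and exponents outside [7, 22].
bool forms_agree_near_powers_of_two() {
    for (int k1 = 5; k1 <= 24; ++k1)
        for (int k2 = 5; k2 <= 24; ++k2)
            for (int d1 = -2; d1 <= 2; ++d1)
                for (int d2 = -2; d2 <= 2; ++d2) {
                    std::int64_t cst_max = -(std::int64_t{1} << k1) + d1;
                    std::int64_t cst_min = (std::int64_t{1} << k2) - 1 + d2;
                    if (clamps_old(cst_max, cst_min) != clamps_new(cst_max, cst_min))
                        return false;
                }
    return true;
}
```

For instance, the pair (-128, 127) is accepted by both forms, while
(-128, 255) and (-64, 63) are rejected by both.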


[PATCH 1/2] xtensa: Fix missing mode warning in "*eqne_INT_MIN"

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog:

* config/xtensa/xtensa.md (*eqne_INT_MIN):
Add missing ":SI" to the match_operator.
---
 gcc/config/xtensa/xtensa.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 4b4ab3f5f37..b1af08eba8a 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3191,7 +3191,7 @@
 
 (define_insn_and_split "*eqne_INT_MIN"
   [(set (match_operand:SI 0 "register_operand" "=a")
-   (match_operator 2 "boolean_operator"
+   (match_operator:SI 2 "boolean_operator"
[(match_operand:SI 1 "register_operand" "r")
 (const_int -2147483648)]))]
   "TARGET_ABS"
-- 
2.30.2


New Croatian PO file for 'gcc' (version 13.1.0)

2023-07-01 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Croatian team of translators.  The file is available at:

https://translationproject.org/latest/gcc/hr.po

(This file, 'gcc-13.1.0.hr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH] Use chain_next on eh_landing_pad_d for GTY (PR middle-end/110510)

2023-07-01 Thread Andrew Pinski via Gcc-patches
The backtrace in the bug report suggests that GC collection runs out of
stack because of a long chain of eh_landing_pad_d.
Adding chain_next to eh_landing_pad_d's GTY marker might fix that.
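
Why walking the chain in a loop helps can be sketched outside of GCC.  The
following is a minimal illustration, not gengtype output: LandingPad is a
hypothetical stand-in for eh_landing_pad_d that keeps only the link and a
mark bit, and the two mark functions contrast one stack frame per list node
with a constant-depth loop along next_lp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for eh_landing_pad_d: just the link and a mark bit.
struct LandingPad {
    LandingPad* next_lp = nullptr;
    bool marked = false;
};

// Link the elements of a vector into one singly linked chain.
void link_chain(std::vector<LandingPad>& nodes) {
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i)
        nodes[i].next_lp = &nodes[i + 1];
}

// Recursive marking: call depth grows with the length of the chain,
// so a very long chain can exhaust the stack.
void mark_recursive(LandingPad* lp) {
    if (lp == nullptr || lp->marked)
        return;
    lp->marked = true;
    mark_recursive(lp->next_lp);
}

// Iterative marking: follows next_lp in a loop, so stack use stays constant.
void mark_iterative(LandingPad* lp) {
    for (; lp != nullptr && !lp->marked; lp = lp->next_lp)
        lp->marked = true;
}

// Count how many nodes have been marked.
std::size_t count_marked(const std::vector<LandingPad>& nodes) {
    std::size_t n = 0;
    for (const LandingPad& lp : nodes)
        if (lp.marked)
            ++n;
    return n;
}
```

With a chain of a million nodes, mark_iterative marks every node without
deepening the call stack; whether mark_recursive survives the same chain
depends on the stack limit and on whether the compiler turns the tail call
into a loop.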

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR middle-end/110510
* except.h (struct eh_landing_pad_d): Add chain_next GTY.
---
 gcc/except.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/except.h b/gcc/except.h
index 378a9e4cb77..173b0f026db 100644
--- a/gcc/except.h
+++ b/gcc/except.h
@@ -66,7 +66,7 @@ enum eh_region_type
 /* A landing pad for a given exception region.  Any transfer of control
from the EH runtime to the function happens at a landing pad.  */
 
-struct GTY(()) eh_landing_pad_d
+struct GTY((chain_next("%h.next_lp"))) eh_landing_pad_d
 {
   /* The linked list of all landing pads associated with the region.  */
   struct eh_landing_pad_d *next_lp;
-- 
2.31.1



[PATCH] gcc-ar: Handle response files properly [PR77576]

2023-07-01 Thread Costas Argyris via Gcc-patches
Basically implementing what Andrew said in the PR:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77576

If @file has been passed to gcc-ar, do the following:

1) Expand it to get an argv without any @files.
2) Then apply the plugin modifications to argv.
3) Create temporary response file.
4) Put the modified argv in the temporary file.
5) Call ar with @tmp.
6) Delete the temporary response file.
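
The steps above can be sketched in standard C++.  This is only an
illustration under simplifying assumptions, not the code in the attached
patch: the function names are hypothetical, arguments are split on
whitespace with no quoting or nested @file handling, and the temporary
file path is passed in by the caller rather than generated.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Step 1: replace each "@file" argument with the whitespace-separated
// tokens read from that file.  An unreadable @file is kept unchanged.
std::vector<std::string>
expand_response_files(const std::vector<std::string>& args) {
    std::vector<std::string> out;
    for (const std::string& arg : args) {
        if (arg.size() > 1 && arg[0] == '@') {
            std::ifstream in(arg.substr(1));
            if (in) {
                std::string tok;
                while (in >> tok)
                    out.push_back(tok);
                continue;
            }
        }
        out.push_back(arg);
    }
    return out;
}

// Step 2: apply the plugin modification.  Here that is assumed to mean
// prepending "--plugin <path>" to the expanded arguments.
std::vector<std::string>
add_plugin_args(const std::vector<std::string>& args,
                const std::string& plugin_path) {
    std::vector<std::string> out;
    out.push_back("--plugin");
    out.push_back(plugin_path);
    out.insert(out.end(), args.begin(), args.end());
    return out;
}

// Steps 3 and 4: write the modified arguments to a response file, one per
// line.  Arguments containing whitespace are not handled by this sketch.
bool write_response_file(const std::string& path,
                         const std::vector<std::string>& args) {
    std::ofstream out(path);
    for (const std::string& arg : args)
        out << arg << '\n';
    return static_cast<bool>(out);
}

// Step 5: the command line that would be run, with a single @tmp argument.
std::vector<std::string> ar_command(const std::string& tmp_path) {
    return {"ar", "@" + tmp_path};
}
```

Step 6 then amounts to removing the file at tmp_path once ar has exited,
for example with std::remove.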


0001-gcc-ar-Handle-response-files-properly-PR77576.patch
Description: Binary data


[committed] d: Fix accesses of immutable arrays using constant index still bounds checked

2023-07-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch sets TREE_READONLY on all non-static const and immutable
variables in D, as well as all static immutable variables that aren't
initialized by a module constructor.  This allows more aggressive
constant folding of D code which makes use of `immutable' or `const'.
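
The patch treats `const' and `immutable' differently.  A comment in its
decl.cc hunk gives the reason: data seen through a const reference may still
be changed through another reference to the same data.  A rough C++ analogue
of that property is sketched below.  C++ const is not the same as D's const,
and the function name is made up for this example; the sketch only shows why
a read through a const view cannot always be folded to an earlier value.

```cpp
// Reads `view`, writes through `alias`, then reads `view` again.
// If both refer to the same object, the second read sees the write,
// even though `view` is a reference to const.
int read_after_write(const int& view, int& alias) {
    int before = view;
    alias = before + 1;
    return view;
}
```

Called with the same variable for both parameters, the function returns the
incremented value; called with two distinct variables, it returns the
original value of the first.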

Bootstrapped and regression tested on x86_64-linux-gnu, committed to
mainline, and backported to releases/gcc-13 and releases/gcc-12.

Regards,
Iain.

---
PR d/110514

gcc/d/ChangeLog:

* decl.cc (get_symbol_decl): Set TREE_READONLY on certain kinds of
const and immutable variables.
* expr.cc (ExprVisitor::visit (ArrayLiteralExp *)): Set TREE_READONLY
on immutable dynamic array literals.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110514a.d: New test.
* gdc.dg/pr110514b.d: New test.
* gdc.dg/pr110514c.d: New test.
* gdc.dg/pr110514d.d: New test.
---
 gcc/d/decl.cc| 14 ++
 gcc/d/expr.cc|  4 
 gcc/testsuite/gdc.dg/pr110514a.d |  9 +
 gcc/testsuite/gdc.dg/pr110514b.d |  8 
 gcc/testsuite/gdc.dg/pr110514c.d |  8 
 gcc/testsuite/gdc.dg/pr110514d.d |  8 
 6 files changed, 51 insertions(+)
 create mode 100644 gcc/testsuite/gdc.dg/pr110514a.d
 create mode 100644 gcc/testsuite/gdc.dg/pr110514b.d
 create mode 100644 gcc/testsuite/gdc.dg/pr110514c.d
 create mode 100644 gcc/testsuite/gdc.dg/pr110514d.d

diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc
index 78c4ab554dc..3f980851259 100644
--- a/gcc/d/decl.cc
+++ b/gcc/d/decl.cc
@@ -1277,6 +1277,20 @@ get_symbol_decl (Declaration *decl)
DECL_INITIAL (decl->csym) = build_expr (ie, true);
}
}
+
+  /* [type-qualifiers/const-and-immutable]
+
+`immutable` applies to data that cannot change. Immutable data values,
+once constructed, remain the same for the duration of the program's
+execution.  */
+  if (vd->isImmutable () && !vd->setInCtorOnly ())
+   TREE_READONLY (decl->csym) = 1;
+
+  /* `const` applies to data that cannot be changed by the const reference
+to that data. It may, however, be changed by another reference to that
+same data.  */
+  if (vd->isConst () && !vd->isDataseg ())
+   TREE_READONLY (decl->csym) = 1;
 }
 
   /* Set the declaration mangled identifier if static.  */
diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index c6245ff5fc1..b7cec1327fd 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -2701,6 +2701,10 @@ public:
if (tb->ty == TY::Tarray)
  ctor = d_array_value (type, size_int (e->elements->length), ctor);
 
+   /* Immutable data can be placed in rodata.  */
+   if (tb->isImmutable ())
+ TREE_READONLY (decl) = 1;
+
d_pushdecl (decl);
rest_of_decl_compilation (decl, 1, 0);
  }
diff --git a/gcc/testsuite/gdc.dg/pr110514a.d b/gcc/testsuite/gdc.dg/pr110514a.d
new file mode 100644
index 000..46e370527d3
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr110514a.d
@@ -0,0 +1,9 @@
+// { dg-do "compile" }
+// { dg-options "-O -fdump-tree-optimized" }
+immutable uint[] imm_arr = [1,2,3];
+int test_imm(immutable uint[] ptr)
+{
+return imm_arr[2] == 3 ? 123 : 456;
+}
+// { dg-final { scan-assembler-not "_d_arraybounds_indexp" } }
+// { dg-final { scan-tree-dump "return 123;" optimized } }
diff --git a/gcc/testsuite/gdc.dg/pr110514b.d b/gcc/testsuite/gdc.dg/pr110514b.d
new file mode 100644
index 000..86aeb485c34
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr110514b.d
@@ -0,0 +1,8 @@
+// { dg-do "compile" }
+// { dg-options "-O" }
+immutable uint[] imm_ctor_arr;
+int test_imm_ctor(immutable uint[] ptr)
+{
+return imm_ctor_arr[2] == 3;
+}
+// { dg-final { scan-assembler "_d_arraybounds_indexp" } }
diff --git a/gcc/testsuite/gdc.dg/pr110514c.d b/gcc/testsuite/gdc.dg/pr110514c.d
new file mode 100644
index 000..94779e123a4
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr110514c.d
@@ -0,0 +1,8 @@
+// { dg-do "compile" }
+// { dg-options "-O" }
+const uint[] cst_arr = [1,2,3];
+int test_cst(const uint[] ptr)
+{
+return cst_arr[2] == 3;
+}
+// { dg-final { scan-assembler "_d_arraybounds_indexp" } }
diff --git a/gcc/testsuite/gdc.dg/pr110514d.d b/gcc/testsuite/gdc.dg/pr110514d.d
new file mode 100644
index 000..56e9a3139ea
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr110514d.d
@@ -0,0 +1,8 @@
+// { dg-do "compile" }
+// { dg-options "-O" }
+const uint[] cst_ctor_arr;
+int test_cst_ctor(const uint[] ptr)
+{
+return cst_ctor_arr[2] == 3;
+}
+// { dg-final { scan-assembler "_d_arraybounds_indexp" } }
-- 
2.39.2



[committed] d: Fix core.volatile.volatileLoad discarded if result is unused

2023-07-01 Thread Iain Buclaw via Gcc-patches
Hi,

The first pass of code generation in the D front-end splits up all
compound expressions and discards expressions that have no side effects.
This included calls to the `volatileLoad' intrinsic if its result was
not used, causing such calls to be eliminated from the program.

We already set TREE_THIS_VOLATILE on the expression, however the
tree documentation says if this bit is set in an expression, so is
TREE_SIDE_EFFECTS.  So set TREE_SIDE_EFFECTS on the expression too.
This prevents any early discarding from occurring.
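
The effect can be pictured with a toy model.  It does not use GCC's tree
data structures: Expr and both functions are hypothetical, and the two flags
mirror TREE_THIS_VOLATILE and TREE_SIDE_EFFECTS in name only.  The model
assumes a first pass that keeps a statement with an unused result only when
its side-effects flag is set.

```cpp
#include <cstddef>
#include <vector>

// Toy expression node with two independent flags.
struct Expr {
    bool this_volatile = false;
    bool side_effects = false;
};

// Build a toy volatile load; the parameter selects the behaviour before
// (false) or after (true) the fix described above.
Expr make_volatile_load(bool also_set_side_effects) {
    Expr e;
    e.this_volatile = true;
    e.side_effects = also_set_side_effects;
    return e;
}

// Toy first pass: of the statements whose results are unused, count those
// that would be kept, that is, those flagged as having side effects.
std::size_t count_kept_statements(const std::vector<Expr>& stmts) {
    std::size_t kept = 0;
    for (const Expr& e : stmts)
        if (e.side_effects)
            ++kept;
    return kept;
}
```

In this model, four unused loads built without the side-effects flag are all
dropped, while four built with it are all kept, which is the count the new
tests look for in the optimized dump.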

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
to mainline, and backported to releases/gcc-13, gcc-12, and gcc-11.

Regards,
Iain.

---
PR d/110516

gcc/d/ChangeLog:

* intrinsics.cc (expand_volatile_load): Set TREE_SIDE_EFFECTS on the
expanded expression.
(expand_volatile_store): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/pr110516a.d: New test.
* gdc.dg/torture/pr110516b.d: New test.
---
 gcc/d/intrinsics.cc  |  2 ++
 gcc/testsuite/gdc.dg/torture/pr110516a.d | 12 
 gcc/testsuite/gdc.dg/torture/pr110516b.d | 12 
 3 files changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gdc.dg/torture/pr110516a.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/pr110516b.d

diff --git a/gcc/d/intrinsics.cc b/gcc/d/intrinsics.cc
index 0121d81eb14..aaf04e50baa 100644
--- a/gcc/d/intrinsics.cc
+++ b/gcc/d/intrinsics.cc
@@ -1007,6 +1007,7 @@ expand_volatile_load (tree callexp)
   tree type = build_qualified_type (TREE_TYPE (ptrtype), TYPE_QUAL_VOLATILE);
   tree result = indirect_ref (type, ptr);
   TREE_THIS_VOLATILE (result) = 1;
+  TREE_SIDE_EFFECTS (result) = 1;
 
   return result;
 }
@@ -1034,6 +1035,7 @@ expand_volatile_store (tree callexp)
   tree type = build_qualified_type (TREE_TYPE (ptrtype), TYPE_QUAL_VOLATILE);
   tree result = indirect_ref (type, ptr);
   TREE_THIS_VOLATILE (result) = 1;
+  TREE_SIDE_EFFECTS (result) = 1;
 
   /* (*(volatile T *) ptr) = value;  */
   tree value = CALL_EXPR_ARG (callexp, 1);
diff --git a/gcc/testsuite/gdc.dg/torture/pr110516a.d b/gcc/testsuite/gdc.dg/torture/pr110516a.d
new file mode 100644
index 000..276455ae408
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/torture/pr110516a.d
@@ -0,0 +1,12 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110516
+// { dg-do compile }
+// { dg-options "-fno-moduleinfo -fdump-tree-optimized" }
+void fn110516(ubyte* ptr)
+{
+import core.volatile : volatileLoad;
+volatileLoad(ptr);
+volatileLoad(ptr);
+volatileLoad(ptr);
+volatileLoad(ptr);
+}
+// { dg-final { scan-tree-dump-times " ={v} " 4 "optimized" } }
diff --git a/gcc/testsuite/gdc.dg/torture/pr110516b.d b/gcc/testsuite/gdc.dg/torture/pr110516b.d
new file mode 100644
index 000..b7a67e716a5
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/torture/pr110516b.d
@@ -0,0 +1,12 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110516
+// { dg-do compile }
+// { dg-options "-fno-moduleinfo -fdump-tree-optimized" }
+void fn110516(ubyte* ptr)
+{
+import core.volatile : volatileStore;
+volatileStore(ptr, 0);
+volatileStore(ptr, 0);
+volatileStore(ptr, 0);
+volatileStore(ptr, 0);
+}
+// { dg-final { scan-tree-dump-times " ={v} " 4 "optimized" } }
-- 
2.39.2