Re: [PATCH v3] Consider frequency in cost estimation when converting scalar to vector.

2025-05-11 Thread Hongtao Liu
On Thu, May 8, 2025 at 2:40 PM liuhongt  wrote:
>
> The only part I changed is related to size_cost of sse_to_integer, as below
>
> +  /* Under TARGET_SSE4_1, it's vmovd + vpextrd/vpinsrd.
> +     W/o it, it's movd + psrlq/unpckldq + movd.  */
> +  else if (!TARGET_64BIT && smode != SImode)
> +    cost *= TARGET_SSE4_1 ? 2 : 3;
> +
Sorry, I missed posting a paragraph.
> There is rtl_ssa::changes_are_worthwhile which solves a similar problem.
> Logic there seems OK to me.  It computes unscaled cost in integers and
> weighted cost that is updated only for hot BBs.
>
> After both sets of costs are computed, if the weighted cost indicates
> that replacement is worthwhile, it will be done.
>
> This works well for chains that contain some portions in cold BBs with
> frequency of 0 as well as for those that have everything in cold BBs.
I haven't converted STV to rtl-ssa, but I introduced estimated_cost_{gain,
weighted_cost_sse_integer), which follow the same logic as
rtl_ssa::changes_are_worthwhile; meanwhile optimized_for_size is
replaced with !speed_p.
Some costs use magic numbers (mainly the for_size part); I didn't change
those, except for the part mentioned above.
Costs that use ix86_cost->XXX are replaced with
ix86_size_cost->XXX when !speed_p.
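
As a rough illustration of the weighting described above (a sketch only, not the code in the patch): ChainInsn and conversion_is_worthwhile are hypothetical names, and falling back to the unscaled totals when the weighted totals tie is my assumption about the intended behaviour. Gains and conversion costs are accumulated twice, once unscaled and once scaled by block frequency for blocks optimized for speed.

```cpp
#include <vector>

// Hypothetical per-instruction record; not a GCC data structure.
struct ChainInsn {
  int gain;        // unscaled gain from converting this insn
  int conv_cost;   // unscaled cost of sse<->integer moves it requires
  double bb_freq;  // block frequency relative to the function entry
  bool speed_p;    // block is optimized for speed
};

// Returns true when converting the whole chain looks worthwhile.
bool conversion_is_worthwhile(const std::vector<ChainInsn>& chain) {
  int gain = 0, cost = 0;
  double weighted_gain = 0.0, weighted_cost = 0.0;
  for (const ChainInsn& insn : chain) {
    gain += insn.gain;
    cost += insn.conv_cost;
    // Only blocks optimized for speed contribute to the weighted totals.
    if (insn.speed_p) {
      weighted_gain += insn.gain * insn.bb_freq;
      weighted_cost += insn.conv_cost * insn.bb_freq;
    }
  }
  // Prefer the frequency-weighted comparison when it is decisive.
  if (weighted_gain != weighted_cost)
    return weighted_gain > weighted_cost;
  // Everything was cold (or the weights tie): use the unscaled totals.
  return gain > cost;
}
```

With a gain of 1 inside a loop of relative frequency 10 and a conversion cost of 4 outside the loop, the unscaled comparison rejects the chain (1 < 4) while the weighted one accepts it (10 > 4), which is the situation the patch description mentions.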

Looks like you've updated the patch in
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/683218.html, I'll
rebase my patch against that and post the patch again.

>I also realize it would need more work, but what the pass does is quite
>close to the job of late combine pass.  Since late combine has all the
>substitution logic nicely separated into rtl-ssa framework, converting
>STV to rtl-ssa as well will likely make it easier and faster...
>
> Ok for trunk?
>
>
> In some benchmarks, I noticed STV failed because the conversion was
> deemed unprofitable, even though the igain is inside the loop while the
> sse<->integer conversion is outside the loop; the current cost model
> doesn't consider the frequency of those gains/costs.
> The patch weights those costs with frequency.
>
> The patch regresses gcc.target/i386/minmax-6.c under -m32: the
> integer<->sse conversion is placed before the branch and the conversion
> to min/max is inside the branch, so with a static profile the cost model
> no longer finds the conversion profitable, which is exactly what the
> patch is trying to do.
> Since the original testcase guards against an RA issue, restricting
> the testcase to ! ia32 should still be OK.
>
> gcc/ChangeLog:
>
> * config/i386/i386-features.cc
> (scalar_chain::mark_dual_mode_def): Weight
> n_integer_to_sse/n_sse_to_integer with bb frequency.
> (general_scalar_chain::compute_convert_gain): Ditto, and
> adjust function prototype to return true/false when cost model
> is profitable or not.
> (timode_scalar_chain::compute_convert_gain): Ditto.
> (convert_scalars_to_vector): Adjust after the upper two
> function prototype are changed.
> * config/i386/i386-features.h (class scalar_chain): Change
> n_integer_to_sse/n_sse_to_integer to cost_sse_integer, and add
> weighted_cost_sse_integer.
> (class general_scalar_chain): Adjust prototype to return bool
> instead of int.
> (class timode_scalar_chain): Ditto.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/minmax-6.c: Adjust testcase.
> ---
>  gcc/config/i386/i386-features.cc | 216 ---
>  gcc/config/i386/i386-features.h  |  11 +-
>  gcc/testsuite/gcc.target/i386/minmax-6.c |   2 +-
>  3 files changed, 124 insertions(+), 105 deletions(-)
>
> diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
> index c35ac24fd8a..5f21130db58 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -296,8 +296,8 @@ scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode vmode_)
>insns_conv = BITMAP_ALLOC (NULL);
>queue = NULL;
>
> -  n_sse_to_integer = 0;
> -  n_integer_to_sse = 0;
> +  cost_sse_integer = 0;
> +  weighted_cost_sse_integer = 0 ;
>
>max_visits = x86_stv_max_visits;
>  }
> @@ -337,20 +337,40 @@ scalar_chain::mark_dual_mode_def (df_ref def)
>/* Record the def/insn pair so we can later efficiently iterate over
>   the defs to convert on insns not in the chain.  */
>bool reg_new = bitmap_set_bit (defs_conv, DF_REF_REGNO (def));
> +  basic_block bb = BLOCK_FOR_INSN (DF_REF_INSN (def));
> +  profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
> +  bool speed_p = optimize_bb_for_speed_p (bb);
> +  sreal bb_freq = bb->count.to_sreal_scale (entry_count);
> +  int cost = 0;
> +
>if (!bitmap_bit_p (insns, DF_REF_INSN_UID (def)))
>  {
>if (!bitmap_set_bit (insns_conv, DF_REF_INSN_UID (def))
>   && !reg_new)
> return;
> -  n_integer_to_sse++;
> +
> +  /* ???  integer_to_sse but we only have that in the RA cost table.
> +

Re: [PATCH 0/6] RISC-V: frm state-machine improvements

2025-05-11 Thread 钟居哲
Hi, Vineet.

>> I have a feeling this has to do with following:
>> https://godbolt.org/z/Px9es7j1r

I see there are 2 fsrm instructions inside the main loop in the
Clang-generated ASM there, so I think GCC does better here.

Correct me if I am wrong. Thanks.



juzhe.zh...@rivai.ai
 
From: Vineet Gupta
Date: 2025-05-10 04:27
To: gcc-patches
CC: gnu-toolchain; Jeff Law; Robin Dapp; Juzhe Zhong; Pan Li; Kito Cheng; 
Vineet Gupta
Subject: [PATCH 0/6] RISC-V: frm state-machine improvements
Hi,
 
This came out of Rivos perf team reporting (shoutout to Siavash) that
some of the SPEC2017 workloads had unnecessary FRM wiggles, when
none were needed. The writes in particular could be expensive.
 
I started with a reduced test for PR/119164 from blender:node_texture_util.c.
 
However, in trying to understand it (and after a botched rewrite of the
whole thing), it turned out that a lot of the code was simply unnecessary,
leading to more complexity than warranted. As a result there are more
deletions here, and the actual improvements come from just a few lines of
actual changes.
 
I've verified each patch incrementally with
- Testsuite run (unchanged, 1 unexpected pass 
gcc.target/riscv/rvv/autovec/pr119114.c)
- SPEC build
- Static analysis of FRM read/write insns emitted in all of SPEC binaries.
- There's BPI data for some of this too, but the delta there is not
   significant as this could really be uarch specific.
 
Here's the result for static analysis.
 
 
                1. revert-confluence   2. remove-edge-insert  4-fewer-frm-restore    5-call-backtrack
                                       3. remove-mode-after
                ---------------        ---------------        ---------------        ---------------
                frrm fsrmi fsrm        frrm fsrmi fsrm        frrm fsrmi fsrm        frrm fsrmi fsrm
   perlbench_r    42     0    4          42     0    4          17     0    1          17     0    1
      cpugcc_r   167     0   17         167     0   17          11     0    0          11     0    0
      bwaves_r    16     0    1          16     0    1          16     0    1          16     0    1
         mcf_r    11     0    0          11     0    0          11     0    0          11     0    0
  cactusBSSN_r    79     0   27          76     0   27          19     0    1          19     0    1
        namd_r   119     0   63         119     0   63          14     0    1          14     0    1
      parest_r   218     0  114         168     0  114          24     0    1          24     0    1
      povray_r   123     1   17         123     1   17          26     1    6          26     1    6
         lbm_r     6     0    0           6     0    0           6     0    0           6     0    0
     omnetpp_r    17     0    1          17     0    1          17     0    1          17     0    1
         wrf_r  2287    13 1956        2287    13 1956        1268    13 1603         613    13   82
    cpuxalan_r    17     0    1          17     0    1          17     0    1          17     0    1
      ldecod_r    11     0    0          11     0    0          11     0    0          11     0    0
        x264_r    14     0    1          14     0    1          11     0    0          11     0    0
     blender_r   724    12  182         724    12  182          61    12   42          39    12   16
        cam4_r   324    13  169         324    13  169          45    13   20          40    13   17
   deepsjeng_r    11     0    0          11     0    0          11     0    0          11     0    0
     imagick_r   265    16   34         265    16   34         132    16   25          33    16   18
       leela_r    12     0    0          12     0    0          12     0    0          12     0    0
         nab_r    13     0    1          13     0    1          13     0    1          13     0    1
   exchange2_r    16     0    1          16     0    1          16     0    1          16     0    1
   fotonik3d_r    20     0   11          20     0   11          19     0    1          19     0    1
        roms_r    33     0   23          33     0   23          21     0    1          21     0    1
          xz_r     6     0    0           6     0    0           6     0    0           6     0    0
                ---------------        ---------------        ---------------        ---------------
                4551    55 2623        4498    55 2623        1804    55 1707        1023    55  150
                ---------------        ---------------        ---------------        ---------------
                           7729                   7176                   3566                   1228
                ---------------        ---------------        ---------------        ---------------

 
It seems wrf still has half of all read/writes
 613   13   82
 
with one function having the bulk of them
  solve_em_   555    1   50
 
This is 1 static RM so ideally needs 1 save and 1 restore.
 
I have a feeling this has to do with following:
https://godbolt.org/z/Px9es7j1r
 
The function call code path need not bother with frm save/restore at
all. This is currently being investigated but could take more ti

Re: [PATCH v1 1/3] Match: Support form 7 for unsigned integer SAT_ADD

2025-05-11 Thread Richard Biener
On Mon, Apr 28, 2025 at 3:35 PM  wrote:
>
> From: Pan Li 
>
> This patch adds support for form 7 of the unsigned
> integer SAT_ADD, as in the example below.
>
>   #define DEF_SAT_U_ADD_FMT_7(WT, T) \
>   T __attribute__((noinline))\
>   sat_u_add_##WT##_##T##_fmt_7(T x, T y) \
>   {  \
> T max = -1;  \
> WT val = (WT)x + (WT)y;  \
> return val > max ? max : (T)val; \
>   }
>
>   DEF_SAT_U_ADD_FMT_7(uint64_t, uint32_t)
>
> If we build with -O3 and -fdump-tree-optimized, we get
>
> Before this patch:
>   __attribute__((noinline))
>   uint32_t sat_u_add_uint64_t_uint32_t_fmt_7 (uint32_t x, uint32_t y)
>   {
>     uint64_t val;
>     long unsigned int _1;
>     long unsigned int _2;
>     uint32_t _3;
>     uint32_t _7;
>
>     <bb 2> [local count: 1073741824]:
>     _1 = (long unsigned int) x_4(D);
>     _2 = (long unsigned int) y_5(D);
>     val_6 = _1 + _2;
>     if (val_6 <= 4294967295)
>       goto <bb 3>; [65.00%]
>     else
>       goto <bb 4>; [35.00%]
>
>     <bb 3> [local count: 697932184]:
>     _7 = x_4(D) + y_5(D);
>
>     <bb 4> [local count: 1073741824]:
>     # _3 = PHI <4294967295(2), _7(3)>
>     return _3;
>
>   }
>
> After this patch:
>   __attribute__((noinline))
>   uint32_t sat_u_add_uint64_t_uint32_t_fmt_7 (uint32_t x, uint32_t y)
>   {
>     uint32_t _3;
>
>     <bb 2> [local count: 1073741824]:
>     _3 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
>     return _3;
>
>   }
>
> This change also affects vector modes.
>
> The test suites below pass with this patch.
> * The rv64gcv full regression test.
> * The x86 bootstrap test.
> * The x86 full regression test.
>
> gcc/ChangeLog:
>
> * match.pd: Add form 7 matching pattern for unsigned integer
> SAT_ADD.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index ba036e52837..e63e2783d79 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3241,7 +3241,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   SAT_U_ADD = IMAGPART (SUM) != 0 ? -1 : REALPART (SUM)  */
>(cond^ (ne (imagpart (IFN_ADD_OVERFLOW@2 @0 INTEGER_CST@1)) integer_zerop)
>  integer_minus_onep (realpart @2))
> -  (if (types_match (type, @0) && int_fits_type_p (@1, type)
> +  (if (types_match (type, @0) && int_fits_type_p (@1, type
> + (match (unsigned_integer_sat_add @0 @1)
> +  /* WIDEN_SUM = (WT)X + (WT)Y
> + SAT_U_ADD = WIDEN_SUM > MAX ? MAX : (NT)WIDEN_SUM  */
> +  (cond^ (le (plus:c (convert@2 @0) (convert@3 @1)) INTEGER_CST@4)
> +(plus:c @0 @1) integer_minus_onep)

I think it's enough to put :c on one of the (plus

OK with that change.

> +  (if (types_match (type, @0, @1) && types_match (@2, @3))
> +   (with
> +{
> + unsigned precision = TYPE_PRECISION (type);
> + unsigned widen_precision = TYPE_PRECISION (TREE_TYPE (@2));
> + wide_int max = wi::mask (precision, false, widen_precision);
> + wide_int c4 = wi::to_wide (@4);
> +}
> +(if (wi::eq_p (c4, max) && widen_precision > precision))
>
>  /* Saturation sub for unsigned integer.  */
>  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
> --
> 2.43.0
>
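
For readers following along, here is a self-contained C++ rendering of form 7 for WT = uint64_t and T = uint32_t, next to a branchless reference it should agree with. This is only a sketch: sat_u_add_ref and low_bits_mask are names made up for illustration, and low_bits_mask merely mimics what wi::mask (precision, false, widen_precision) evaluates to (the low `precision` bits set), which is the constant the pattern compares @4 against.

```cpp
#include <cstdint>

// Form 7 as written in the patch description (WT = uint64_t, T = uint32_t).
uint32_t sat_u_add_fmt_7(uint32_t x, uint32_t y) {
  uint32_t max = -1;                         // all ones, i.e. UINT32_MAX
  uint64_t val = (uint64_t)x + (uint64_t)y;  // widened sum cannot wrap
  return val > max ? max : (uint32_t)val;
}

// Hypothetical reference: (X + Y) | -(X + Y < X), the branchless form.
uint32_t sat_u_add_ref(uint32_t x, uint32_t y) {
  uint32_t sum = x + y;                // wraps on overflow
  return sum | -(uint32_t)(sum < x);   // all ones when the sum wrapped
}

// Hypothetical stand-in for the wide_int mask: low `precision` bits set.
constexpr uint64_t low_bits_mask(unsigned precision) {
  return precision >= 64 ? ~0ULL : ((1ULL << precision) - 1);
}
```

For precision 32 the mask is 4294967295, the same constant that appears in the `val_6 <= 4294967295` comparison of the dump before the patch.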


Re: [PATCH] libstdc++: Make dg-require-namedlocale work for more targets [PR65909]

2025-05-11 Thread Tomasz Kaminski
On Thu, May 8, 2025 at 4:22 PM Jonathan Wakely  wrote:

> As noted in the PR, some embedded targets do not support command-line
> arguments, which means that the dg-require-namedlocale check always
> fails. Use Sandra's suggestion of hardcoding the argument into the
> executable instead of passing it as a command-line argument.
>
> Realistically, those embedded targets probably don't support the named
> locales anyway, but at least now the tests will be UNSUPPORTED for the
> right reason.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/65909
> * testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
> Hardcode the locale name instead of passing it to the
> executable. Do not hardcode buffer size for string.
> ---
>
> Tested x86_64-linux.
>
LGTM. Learned some TCL today.

>
>  libstdc++-v3/testsuite/lib/libstdc++.exp | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp
> b/libstdc++-v3/testsuite/lib/libstdc++.exp
> index 5e958d159de2..fbc9f7f13e64 100644
> --- a/libstdc++-v3/testsuite/lib/libstdc++.exp
> +++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
> @@ -1034,7 +1034,7 @@ proc check_v3_target_namedlocale { args } {
> puts $f "using namespace std;"
> puts $f "char *transform_locale(const char *name)"
> puts $f "{"
> -   puts $f "char *result = new char\[50\];"
> +   puts $f "char *result = new char\[strlen(name)+6\];"
> puts $f "strcpy(result, name);"
> puts $f "#if defined __FreeBSD__ || defined __DragonFly__ || defined __NetBSD__"
> puts $f "/* fall-through */"
> @@ -1045,14 +1045,9 @@ proc check_v3_target_namedlocale { args } {
> puts $f "#endif"
> puts $f "return result;"
> puts $f "}"
> -   puts $f "int main (int argc, char** argv)"
> +   puts $f "int main ()"
> puts $f "{"
> -   puts $f "  if (argc < 2)"
> -   puts $f "  {"
> -   puts $f "printf(\"locale support test not supported\\n\");"
> -   puts $f "return 1;"
> -   puts $f "  }"
> -   puts $f "  const char *namedloc = transform_locale(*(argv + 1));"
> +   puts $f "  const char *namedloc = transform_locale(\"$args\");"
> puts $f "  try"
> puts $f "  {"
> puts $f "locale((const char*)namedloc);"
> @@ -1076,7 +1071,7 @@ proc check_v3_target_namedlocale { args } {
>   return 0
> }
>
> -   set result [${tool}_load "./$exe" "$args" ""]
> +   set result [${tool}_load "./$exe" "" ""]
> set status [lindex $result 0]
>
> verbose "check_v3_target_namedlocale <$args>: status is <$status>" 2
> --
> 2.49.0
>
>
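
To make the idea concrete, the check boils down to something like the C++ below once the locale name is baked into the program instead of being read from argv. This is a simplified sketch, not the program the Tcl proc generates: named_locale_supported is a hypothetical helper, and the target-specific name rewriting done by transform_locale is left out.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: true if the runtime can construct the named locale.
bool named_locale_supported(const char* name) {
  try {
    std::locale loc(name);  // throws std::runtime_error for unknown names
    (void)loc;
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

A generated test would then call it with a string literal, for example named_locale_supported("de_DE.UTF-8"), so nothing depends on the target being able to pass command-line arguments.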


[PATCH] x86: Remove df_insn_rescan after emit_insn_*

2025-05-11 Thread H.J. Lu
Since df_insn_rescan has been called by emit_insn_*, there is no need
to call it after calling emit_insn_*.  Remove its unnecessary usages.

PR target/120228
* config/i386/i386-features.cc (ix86_place_single_vector_set):
Remove df_insn_rescan after emit_insn_*.
(remove_partial_avx_dependency): Likewise.
(replace_vector_const): Likewise.

OK for master?

-- 
H.J.
From 6fbdc43bfc32ed6c88891f84bd367696cca1e247 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 12 May 2025 10:02:24 +0800
Subject: [PATCH] x86: Remove df_insn_rescan after emit_insn_*

Since df_insn_rescan has been called by emit_insn_*, there is no need
to call it after calling emit_insn_*.  Remove its unnecessary usages.

	PR target/120228
	* config/i386/i386-features.cc (ix86_place_single_vector_set):
	Remove df_insn_rescan after emit_insn_*.
	(remove_partial_avx_dependency): Likewise.
	(replace_vector_const): Likewise.

Signed-off-by: H.J. Lu 
---
 gcc/config/i386/i386-features.cc | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 13e6c2a8abd..cc8313bd292 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -3095,13 +3095,10 @@ ix86_place_single_vector_set (rtx dest, rtx src, bitmap bbs)
   insn = NEXT_INSN (insn);
 }
 
-  rtx_insn *set_insn;
   if (insn == BB_HEAD (bb))
-set_insn = emit_insn_before (set, insn);
+emit_insn_before (set, insn);
   else
-set_insn = emit_insn_after (set,
-insn ? PREV_INSN (insn) : BB_END (bb));
-  df_insn_rescan (set_insn);
+emit_insn_after (set, insn ? PREV_INSN (insn) : BB_END (bb));
 }
 
 /* At entry of the nearest common dominator for basic blocks with
@@ -3225,7 +3222,6 @@ remove_partial_avx_dependency (void)
 	  /* Generate an XMM vector SET.  */
 	  set = gen_rtx_SET (vec, src);
 	  set_insn = emit_insn_before (set, insn);
-	  df_insn_rescan (set_insn);
 
 	  if (cfun->can_throw_non_call_exceptions)
 	{
@@ -3396,8 +3392,7 @@ replace_vector_const (machine_mode vector_mode, rtx vector_const,
 		  vreg = gen_reg_rtx (vmode);
 		  rtx vsubreg = gen_rtx_SUBREG (vmode, vector_const, 0);
 		  rtx pat = gen_rtx_SET (vreg, vsubreg);
-		  rtx_insn *vinsn = emit_insn_before (pat, insn);
-		  df_insn_rescan (vinsn);
+		  emit_insn_before (pat, insn);
 		}
 	  replace = gen_rtx_SUBREG (mode, vreg, 0);
 	}
-- 
2.49.0



[PATCH] fortran, v2: Fix up minloc/maxloc lowering [PR120191]

2025-05-11 Thread Jakub Jelinek
On Sat, May 10, 2025 at 11:21:19AM +0200, Tobias Burnus wrote:
> Namely: Similar to above, we should be able to just do:
> 
>    if (dim_arg->expr)
> 
> I think the comment should be also updated and we
> can also get rid of the 'actual' variable for cleanup.
> 
> Namely, something like the following on top of your patch (untested):

Here is an updated patch including your incremental changes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

While trying to write a testcase I ran into further issues, but they seem
to be on the library side, so I'll post that incrementally.

2025-05-12  Jakub Jelinek  
Daniil Kochergin  
Tobias Burnus  

PR fortran/120191
* trans-intrinsic.cc (strip_kind_from_actual): Remove.
(gfc_conv_intrinsic_minmaxloc): Don't call strip_kind_from_actual.
Free and clear kind_arg->expr if non-NULL.  Set back_arg->name to
"%VAL" instead of a loop looking for last argument.  Remove actual
variable, use array_arg instead.  Free and clear dim_arg->expr if
non-NULL for BT_CHARACTER cases instead of using a loop.

* gfortran.dg/pr120191_1.f90: New test.

--- gcc/fortran/trans-intrinsic.cc.jj   2025-04-22 21:26:15.772920190 +0200
+++ gcc/fortran/trans-intrinsic.cc  2025-05-10 21:25:24.541308686 +0200
@@ -4715,22 +4715,6 @@ maybe_absent_optional_variable (gfc_expr
 }
 
 
-/* Remove unneeded kind= argument from actual argument list when the
-   result conversion is dealt with in a different place.  */
-
-static void
-strip_kind_from_actual (gfc_actual_arglist * actual)
-{
-  for (gfc_actual_arglist *a = actual; a; a = a->next)
-{
-  if (a && a->name && strcmp (a->name, "kind") == 0)
-   {
- gfc_free_expr (a->expr);
- a->expr = NULL;
-   }
-}
-}
-
 /* Emit code for minloc or maxloc intrinsic.  There are many different cases
we need to handle.  For performance reasons we sometimes create two
loops instead of one, where the second one is much simpler.
@@ -4925,7 +4909,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
   tree b_if, b_else;
   tree back;
   gfc_loopinfo loop, *ploop;
-  gfc_actual_arglist *actual, *array_arg, *dim_arg, *mask_arg, *kind_arg;
+  gfc_actual_arglist *array_arg, *dim_arg, *mask_arg, *kind_arg;
   gfc_actual_arglist *back_arg;
   gfc_ss *arrayss = nullptr;
   gfc_ss *maskss = nullptr;
@@ -4944,8 +4928,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
   int n;
   bool optional_mask;
 
-  actual = expr->value.function.actual;
-  array_arg = actual;
+  array_arg = expr->value.function.actual;
   dim_arg = array_arg->next;
   mask_arg = dim_arg->next;
   kind_arg = mask_arg->next;
@@ -4954,14 +4937,16 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
   bool dim_present = dim_arg->expr != nullptr;
   bool nested_loop = dim_present && expr->rank > 0;
 
-  /* The last argument, BACK, is passed by value. Ensure that
- by setting its name to %VAL. */
-  for (gfc_actual_arglist *a = actual; a; a = a->next)
+  /* Remove kind.  */
+  if (kind_arg->expr)
 {
-  if (a->next == NULL)
-   a->name = "%VAL";
+  gfc_free_expr (kind_arg->expr);
+  kind_arg->expr = NULL;
 }
 
+  /* Pass BACK argument by value.  */
+  back_arg->name = "%VAL";
+
   if (se->ss)
 {
   if (se->ss->info->useflags)
@@ -4983,25 +4968,19 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
}
 }
 
-  arrayexpr = actual->expr;
+  arrayexpr = array_arg->expr;
 
-  /* Special case for character maxloc.  Remove unneeded actual
- arguments, then call a library function.  */
+  /* Special case for character maxloc.  Remove unneeded "dim" actual
+ argument, then call a library function.  */
 
   if (arrayexpr->ts.type == BT_CHARACTER)
 {
   gcc_assert (expr->rank == 0);
 
-  gfc_actual_arglist *a = actual;
-  strip_kind_from_actual (a);
-  while (a)
+  if (dim_arg->expr)
{
- if (a->name && strcmp (a->name, "dim") == 0)
-   {
- gfc_free_expr (a->expr);
- a->expr = NULL;
-   }
- a = a->next;
+ gfc_free_expr (dim_arg->expr);
+ dim_arg->expr = NULL;
}
   gfc_conv_intrinsic_funcall (se, expr);
   return;
--- gcc/testsuite/gfortran.dg/pr120191_1.f90.jj 2025-05-09 17:19:07.905018604 +0200
+++ gcc/testsuite/gfortran.dg/pr120191_1.f90    2025-05-09 17:19:07.905018604 +0200
@@ -0,0 +1,614 @@
+! PR fortran/120191
+! { dg-do run }
+
+  integer(kind=1) :: a1(10, 10, 10), b1(10)
+  integer(kind=2) :: a2(10, 10, 10), b2(10)
+  integer(kind=4) :: a4(10, 10, 10), b4(10)
+  integer(kind=8) :: a8(10, 10, 10), b8(10)
+  real(kind=4) :: r4(10, 10, 10), s4(10)
+  real(kind=8) :: r8(10, 10, 10), s8(10)
+  logical :: l1(10, 10, 10), l2(10), l3
+  l1 = .true.
+  l2 = .true.
+  l3 = .true.
+  a1 = 0
+  if (any (maxloc (a1) .ne. 1)) stop 1
+  if (any (maxloc (a1, back=.false.) .ne. 1)) stop 2
+  if (any (maxloc (a1, back=.true.) .ne. 10)) stop 3
+  if (a

Re: [PATCH v1 2/4] Match: Refactor the signed SAT_TRUNC match patterns [NFC]

2025-05-11 Thread Richard Biener
On Thu, Dec 12, 2024 at 9:45 AM  wrote:
>
> From: Pan Li 
>
> This patch refactors all the signed SAT_TRUNC patterns,
> namely:
> * Extract the type check to the outside.
> * Group the related match pattern forms together.
>
> The test suites below pass with this patch.
> * The rv64gcv full regression test.
> * The x86 bootstrap test.
> * The x86 full regression test.

OK.

> gcc/ChangeLog:
>
> * match.pd: Refactor sorts of signed SAT_TRUNC match patterns
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 65 ++--
>  1 file changed, 32 insertions(+), 33 deletions(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 1ef504f141f..5b30a1e9990 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3415,6 +3415,38 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (realpart @2))
>(if (types_match (type, @0, @1)
>
> +(if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
> + (match (signed_integer_sat_trunc @0)
> +  /* SAT_S_TRUNC(X) = (unsigned)X + NT_MAX + 1  > Unsigned_MAX ? (NT)X  */
> +  (cond^ (gt (plus:c (convert@4 @0) INTEGER_CST@1) INTEGER_CST@2)
> +(bit_xor:c (nop_convert?
> +(negate (nop_convert? (convert (lt @0 integer_zerop)
> +   INTEGER_CST@3)
> +(convert @0))
> +  (if (!TYPE_UNSIGNED (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@4)))
> +   (with
> +{
> + unsigned itype_prec = TYPE_PRECISION (TREE_TYPE (@0));
> + unsigned otype_prec = TYPE_PRECISION (type);
> + wide_int offset = wi::uhwi (HOST_WIDE_INT_1U << (otype_prec - 1),
> +itype_prec); // Aka 128 for int8_t
> + wide_int limit_0 = wi::mask (otype_prec, false, itype_prec); // Aka 255
> + wide_int limit_1 = wi::uhwi ((HOST_WIDE_INT_1U << otype_prec) - 3,
> + itype_prec); // Aka 253
> + wide_int limit_2 = wi::uhwi ((HOST_WIDE_INT_1U << otype_prec) - 2,
> + itype_prec); // Aka 254
> + wide_int otype_max = wi::mask (otype_prec - 1, false, otype_prec);
> + wide_int itype_max = wi::mask (otype_prec - 1, false, itype_prec);
> + wide_int int_cst_1 = wi::to_wide (@1);
> + wide_int int_cst_2 = wi::to_wide (@2);
> + wide_int int_cst_3 = wi::to_wide (@3);
> +}
> +(if (((wi::eq_p (int_cst_1, offset) && wi::eq_p (int_cst_2, limit_0))
> +|| (wi::eq_p (int_cst_1, itype_max) && wi::eq_p (int_cst_2, limit_2))
> +|| (wi::eq_p (int_cst_1, offset) && wi::eq_p (int_cst_2, limit_2))
> +|| (wi::eq_p (int_cst_1, itype_max) && wi::eq_p (int_cst_2, 
> limit_1)))
> +&& wi::eq_p (int_cst_3, otype_max)))
> +
>  /* The boundary condition for case 10: IMM = 1:
> SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> simplify (X != 0 ? X + ~0 : 0) to X - (X != 0).  */
> @@ -3426,39 +3458,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (with { tree itype = TREE_TYPE (@2); }
>  (convert (minus @2 (convert:itype @1))
>
> -/* Signed saturation truncate, case 1 and case 2, sizeof (WT) > sizeof (NT).
> -   SAT_S_TRUNC(X) = (unsigned)X + NT_MAX + 1  > Unsigned_MAX ? (NT)X.  */
> -(match (signed_integer_sat_trunc @0)
> - (cond^ (gt (plus:c (convert@4 @0) INTEGER_CST@1) INTEGER_CST@2)
> -   (bit_xor:c (nop_convert?
> -   (negate (nop_convert? (convert (lt @0 integer_zerop)
> -  INTEGER_CST@3)
> -   (convert @0))
> - (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
> -  && !TYPE_UNSIGNED (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@4)))
> - (with
> -  {
> -   unsigned itype_prec = TYPE_PRECISION (TREE_TYPE (@0));
> -   unsigned otype_prec = TYPE_PRECISION (type);
> -   wide_int offset = wi::uhwi (HOST_WIDE_INT_1U << (otype_prec - 1),
> -  itype_prec); // Aka 128 for int8_t
> -   wide_int limit_0 = wi::mask (otype_prec, false, itype_prec); // Aka 255
> -   wide_int limit_1 = wi::uhwi ((HOST_WIDE_INT_1U << otype_prec) - 3,
> -   itype_prec); // Aka 253
> -   wide_int limit_2 = wi::uhwi ((HOST_WIDE_INT_1U << otype_prec) - 2,
> -   itype_prec); // Aka 254
> -   wide_int otype_max = wi::mask (otype_prec - 1, false, otype_prec);
> -   wide_int itype_max = wi::mask (otype_prec - 1, false, itype_prec);
> -   wide_int int_cst_1 = wi::to_wide (@1);
> -   wide_int int_cst_2 = wi::to_wide (@2);
> -   wide_int int_cst_3 = wi::to_wide (@3);
> -  }
> -  (if (((wi::eq_p (int_cst_1, offset) && wi::eq_p (int_cst_2, limit_0))
> -|| (wi::eq_p (int_cst_1, itype_max) && wi::eq_p (int_cst_2, limit_2))
> -|| (wi::eq_p (int_cst_1, offset) && wi::eq_p (int_cst_2, limit_2))
> -|| (wi::eq_p (int_cst_1, itype_max) && wi::eq_p (int_cst_2, 
> limit_1)))
> -   && wi::eq_p (int_cst_3, otype_max))
> -
>  /* x >  y  &&  x != XXX_MIN  -->  x > y
> x >  y  &&  x == XXX_MIN  -->  false . */
>  (for eqne (eq ne)
> --
> 2.43.0
>
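
As a worked instance of the first constant pair checked above (offset = 128 and limit_0 = 255, with otype_max = 127), the pattern corresponds to the following C++ for a 16-bit to 8-bit signed saturating truncate. It is a sketch for illustration; sat_s_trunc_i16_to_i8 is a hypothetical name and only this one of the four accepted constant combinations is shown.

```cpp
#include <cstdint>

// Signed saturating truncate, WT = int16_t, NT = int8_t.
int8_t sat_s_trunc_i16_to_i8(int16_t x) {
  // (unsigned)X + NT_MAX + 1 maps the in-range values [-128, 127] onto
  // [0, 255]; anything else lands above 255 after wrapping to 16 bits.
  uint16_t shifted = (uint16_t)((uint16_t)x + 128u);
  if (shifted > 255u) {
    // -(T)(X < 0) ^ MAX: all ones ^ 127 = -128 when x is negative,
    // 0 ^ 127 = 127 otherwise.
    return (int8_t)(-(int8_t)(x < 0) ^ INT8_MAX);
  }
  return (int8_t)x;  // value already fits in int8_t
}
```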


Re: [PATCH] x86: Remove df_insn_rescan after emit_insn_*

2025-05-11 Thread Uros Bizjak
On Mon, May 12, 2025 at 8:19 AM H.J. Lu  wrote:
>
> Since df_insn_rescan has been called by emit_insn_*, there is no need
> to call it after calling emit_insn_*.  Remove its unnecessary usages.
>
> PR target/120228
> * config/i386/i386-features.cc (ix86_place_single_vector_set):
> Remove df_insn_rescan after emit_insn_*.
> (remove_partial_avx_dependency): Likewise.
> (replace_vector_const): Likewise.

LGTM.

Thanks,
Uros.

>
> OK for master?
>
> --
> H.J.


Re: [PATCH v1 4/4] Match: Update the comments for indicating SAT_* pattern

2025-05-11 Thread Richard Biener
On Thu, Dec 12, 2024 at 9:45 AM  wrote:
>
> From: Pan Li 
>
> Given the SAT_* patterns are grouped for each alu and signed or not,
> add leading comments to indicate the beginning of the pattern.

OK.

> gcc/ChangeLog:
>
> * match.pd: Update comments for sat_* pattern.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 18098920007..aa006b9e282 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3099,6 +3099,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> || POINTER_TYPE_P (itype))
>&& wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))
>
> +/* Saturation add for unsigned integer.  */
>  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
>   (match (usadd_overflow_mask @0 @1)
>/* SAT_U_ADD = (X + Y) | -(X > (X + Y)).
> @@ -3173,6 +3174,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  integer_minus_onep (realpart @2))
>(if (types_match (type, @0) && int_fits_type_p (@1, type)
>
> +/* Saturation sub for unsigned integer.  */
>  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
>   (match (unsigned_integer_sat_sub @0 @1)
>/* SAT_U_SUB = X > Y ? X - Y : 0  */
> @@ -3262,6 +3264,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  }
>  (if (wi::eq_p (sum, wi::uhwi (0, precision
>
> +/* Saturation truncate for unsigned integer.  */
>  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
>   (match (unsigned_integer_sat_trunc @0)
>/* SAT_U_TRUNC = (NT)x | (NT)(-(X > (WT)(NT)(-1)))  */
> @@ -3321,6 +3324,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (nop_convert? (convert (lt @0 integer_zerop)
>  max_value)))
>
> +/* Saturation add for signed integer.  */
>  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
>   (match (signed_integer_sat_add @0 @1)
>/* T SUM = (T)((UT)X + (UT)Y)
> @@ -3375,6 +3379,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  @2)
>(if (wi::bit_and (wi::to_wide (@1), wi::to_wide (@3)) == 0
>
> +/* Saturation sub for signed integer.  */
>  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
>   (match (signed_integer_sat_sub @0 @1)
>/* T Z = (T)((UT)X - (UT)Y);
> @@ -3411,6 +3416,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (realpart @2))
>(if (types_match (type, @0, @1)
>
> +/* Saturation truncate for signed integer.  */
>  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
>   (match (signed_integer_sat_trunc @0)
>/* SAT_S_TRUNC(X) = (unsigned)X + NT_MAX + 1  > Unsigned_MAX ? (NT)X  */
> --
> 2.43.0
>


Re: [PATCH v1 3/4] Match: Refactor the signed SAT_* match for saturated value [NFC]

2025-05-11 Thread Richard Biener
On Thu, Dec 12, 2024 at 9:45 AM  wrote:
>
> From: Pan Li 
>
> This patch refactors all the signed SAT_* patterns to share the
> saturated value, that is, overflow to INT_MAX when > 0 and underflow
> to INT_MIN when < 0.  This lets us remove several duplicated
> expressions across the patterns.
>
> The test suites below pass with this patch.
> * The rv64gcv full regression test.
> * The x86 bootstrap test.
> * The x86 full regression test.

Note there's currently an implementation detail of how genmatch treats
(match ...) that makes this factoring less efficient in some cases (not
specifically this one where there is enough of a "pattern" left) - genmatch
does not "inline" (match ...), even when there's only a single variant,
causing a function call to match such "tail" instead of integrating the
sub-pattern in the decision tree.

That's something to keep in mind (though it is a genmatch limitation).

The patch is OK.

Richard.
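
For reference, the factored-out SAT_VAL = (-(T)(X < 0) ^ MAX) and the first signed SAT_ADD form in the quoted patch correspond roughly to the C++ below for T = int32_t. This is a sketch: sat_val and sat_s_add are illustrative names, and the conversion from uint32_t back to int32_t assumes the usual two's-complement wrap (guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cstdint>

// SAT_VAL = (-(T)(X < 0) ^ MAX): INT32_MIN for negative x, else INT32_MAX.
int32_t sat_val(int32_t x) {
  return -(int32_t)(x < 0) ^ INT32_MAX;
}

// T SUM = (T)((UT)X + (UT)Y);
// SAT_S_ADD = (X ^ SUM) & ~(X ^ Y) < 0 ? SAT_VAL : SUM
int32_t sat_s_add(int32_t x, int32_t y) {
  int32_t sum = (int32_t)((uint32_t)x + (uint32_t)y);  // wrapping add
  // Overflow iff x and y share a sign and the sum's sign differs from x's.
  bool overflow = ((x ^ sum) & ~(x ^ y)) < 0;
  return overflow ? sat_val(x) : sum;
}
```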

> gcc/ChangeLog:
>
> * match.pd: Extract saturated value match for signed SAT_*.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 38 +-
>  1 file changed, 17 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 5b30a1e9990..18098920007 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3314,6 +3314,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  }
>  (if (wi::eq_p (trunc_max, int_cst_1) && wi::eq_p (max, int_cst_2)))
>
> +(if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
> + /* SAT_VAL = (-(T)(X < 0) ^ MAX)  */
> + (match (signed_integer_sat_val @0)
> +  (bit_xor:c (nop_convert? (negate
> +   (nop_convert? (convert (lt @0 integer_zerop)
> +max_value)))
> +
>  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
>   (match (signed_integer_sat_add @0 @1)
>/* T SUM = (T)((UT)X + (UT)Y)
> @@ -3322,7 +3329,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(nop_convert @1
> (bit_not (bit_xor:c @0 @1)))
>  integer_zerop)
> -(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
> +(signed_integer_sat_val @0)
>  @2))
>   (match (signed_integer_sat_add @0 @1)
>/* T SUM = (T)((UT)X + (UT)Y)
> @@ -3340,17 +3347,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (nop_convert @1
> integer_zerop)
> (ge (bit_xor:c @0 @1) integer_zerop))
> -(bit_xor:c (nop_convert (negate (nop_convert (convert
> - (lt @0 
> integer_zerop)
> -   max_value)
> +(signed_integer_sat_val @0)
>  @2))
>   (match (signed_integer_sat_add @0 @1)
> /* SUM = .ADD_OVERFLOW (X, Y)
>SAT_S_ADD = IMAGPART_EXPR (SUM) != 0 ? (-(T)(X < 0) ^ MAX) : SUM  */
>(cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) integer_zerop)
> -(bit_xor:c (nop_convert?
> -(negate (nop_convert? (convert (lt @0 integer_zerop)
> -   max_value)
> +(signed_integer_sat_val @0)
>  (realpart @2)))
>   (match (signed_integer_sat_add @0 @1)
>/* T SUM = (T)((UT)X + (UT)Y)
> @@ -3359,9 +3362,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (nop_convert @1
> integer_zerop)
> (bit_not (lt (bit_xor:c @0 @1) integer_zerop)))
> -(bit_xor:c (nop_convert (negate (nop_convert (convert
> -  (lt @0 
> integer_zerop)
> -   max_value)
> +(signed_integer_sat_val @0)
>  @2))
>   (match (signed_integer_sat_add @0 @1)
>/* T SUM = (T)((UT)X + (UT)IMM);
> @@ -3370,10 +3371,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cond^ (lt (bit_and:c (bit_xor:c @0 (nop_convert@2 (plus (nop_convert @0)
>INTEGER_CST@1)))
> (bit_xor:c @0 INTEGER_CST@3)) integer_zerop)
> -(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
> +(signed_integer_sat_val @0)
>  @2)
> -  (if (wi::bit_and (wi::to_wide (@1), wi::to_wide (@3)) == 0)))
> -)
> +  (if (wi::bit_and (wi::to_wide (@1), wi::to_wide (@3)) == 0
>
>  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
>   (match (signed_integer_sat_sub @0 @1)
> @@ -3383,7 +3383,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
>   (nop_convert @1)
>  integer_zerop)
> -(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
> +(signed_integer_sat_val @0)
>  @2))
>   (match (signed_integer_sat_sub @0 @1)
>/* T Z = (T)((UT)X - (U

Re: [PATCH] c++: Add attribute handles_virtual_move_assign

2025-05-11 Thread Owen Avery

Yeah, that looks way simpler. Should I add you as co-author on the
patch?

On 4/28/25 22:13, Jason Merrill wrote:

On 4/28/25 5:07 PM, Owen Avery wrote:
As far as I can tell, that would need to be applied to every class 
which virtually inherits from such a base class, rather than just the 
base class's move assignment operator.


Ah, sure.  But if you replace the lookup_attribute with 
warning_enabled_at (DECL_SOURCE_LOCATION (fn), OPT_Wvirtual_move_assign)

then disabling the warning around the base op= would be enough.

Jason


On 4/28/25 08:16, Jason Merrill wrote:

On 4/27/25 5:57 PM, Owen Avery wrote:

This patch should make it easier to selectively disable
-Wvirtual-move-assign errors by adding an attribute
for move assignment operators which marks them as handling
duplicate calls.


Thanks, but this sort of situation seems like a good fit for
#pragma GCC diagnostic ignored "-Wvirtual-move-assign"
It's not clear to me that the attribute is a significant usability 
improvement.
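
A minimal sketch of the pragma-based approach, for illustration. The class names are hypothetical. As noted elsewhere in this thread, the suppression is assumed to be needed around every class that virtually inherits from the base, which is why all three derived classes sit inside the push/pop region here.

```cpp
#include <utility>

// Hypothetical base whose move assignment tolerates being run more than
// once on the same object: it copies the value and leaves the source intact.
struct Base {
  int value = 0;
  int move_assign_calls = 0;
  Base() = default;
  Base& operator=(Base&& other) noexcept {
    if (this != &other)
      value = other.value;
    ++move_assign_calls;
    return *this;
  }
};

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wvirtual-move-assign"
struct Left : virtual Base {};
struct Right : virtual Base {};
struct Diamond : Left, Right {};
#pragma GCC diagnostic pop

// Returns {value after the move, number of Base::operator= calls}.
// The call count may be 1 or 2: whether a virtual base is assigned more
// than once by a defaulted move assignment is unspecified.
std::pair<int, int> move_assign_demo() {
  Diamond a, b;
  b.value = 42;
  a = std::move(b);
  return {a.value, a.move_assign_calls};
}
```

Because Base::operator= does not steal from its source, a second call is harmless, which is the situation the warning cannot detect on its own.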



gcc/cp/ChangeLog:

* method.cc: Include "attribs.h".
(synthesized_method_walk): Avoid outputting
-Wvirtual_move_assign when the base class' move assignment
operator has the handles_virtual_move_assign attribute.
* tree.cc
(handle_handles_virtual_move_assign): Add.
(cxx_gnu_attributes): Add handles_virtual_move_assign to the
attribute list.

gcc/ChangeLog:

* doc/extend.texi (C++-Specific Variable, Function, and Type
Attributes): Document handles_virtual_move_assign.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wvirtual-move-assign-1.C: New test.

Signed-off-by: Owen Avery 
---
  gcc/cp/method.cc  |  5 ++-
  gcc/cp/tree.cc    | 28 
  gcc/doc/extend.texi   | 13 
  .../g++.dg/warn/Wvirtual-move-assign-1.C  | 32 
+++

  4 files changed, 77 insertions(+), 1 deletion(-)
  create mode 100644 
gcc/testsuite/g++.dg/warn/Wvirtual-move-assign-1.C


diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 05c19cf0661..898f05c9b7d 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "toplev.h"
  #include "intl.h"
  #include "common/common-target.h"
+#include "attribs.h"
    static void do_build_copy_assign (tree);
  static void do_build_copy_constructor (tree);
@@ -2949,7 +2950,9 @@ synthesized_method_walk (tree ctype, 
special_function_kind sfk, bool const_p,

    && BINFO_VIRTUAL_P (base_binfo)
    && fn && TREE_CODE (fn) == FUNCTION_DECL
    && move_fn_p (fn) && !trivial_fn_p (fn)
-  && vbase_has_user_provided_move_assign (BINFO_TYPE (base_binfo)))
+  && vbase_has_user_provided_move_assign (BINFO_TYPE (base_binfo))
+  && !lookup_attribute ("handles_virtual_move_assign",
+    DECL_ATTRIBUTES (fn)))
  warning (OPT_Wvirtual_move_assign,
   "defaulted move assignment for %qT calls a non-trivial "
   "move assignment operator for virtual base %qT",
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 5863b6878f0..4efd5121319 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -48,6 +48,8 @@ static tree handle_init_priority_attribute (tree 
*, tree, tree, int, bool *);
  static tree handle_abi_tag_attribute (tree *, tree, tree, int, 
bool *);
  static tree handle_contract_attribute (tree *, tree, tree, int, 
bool *);
  static tree handle_no_dangling_attribute (tree *, tree, tree, 
int, bool *);
+static tree handle_handles_virtual_move_assign (tree *, tree, 
tree, int,

+    bool *);
    /* If REF is an lvalue, returns the kind of lvalue that REF is.
 Otherwise, returns clk_none.  */
@@ -5234,6 +5236,8 @@ static const attribute_spec 
cxx_gnu_attributes[] =

  handle_abi_tag_attribute, NULL },
    { "no_dangling", 0, 1, false, true, false, false,
  handle_no_dangling_attribute, NULL },
+  { "handles_virtual_move_assign", 0, 0, false, false, false, false,
+    handle_handles_virtual_move_assign, NULL },
  };
    const scoped_attribute_specs cxx_gnu_attribute_table =
@@ -5565,6 +5569,30 @@ handle_no_dangling_attribute (tree *node, 
tree name, tree args, int,

    return NULL_TREE;
  }

+/* Handle a "handles_virtual_move_assign" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+tree
+handle_handles_virtual_move_assign (tree *node, tree name, tree /*args*/,
+    int /*flags*/, bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL || DECL_CONSTRUCTOR_P (*node)
+  || !move_fn_p (*node))
+    {
+  warning (
+    OPT_Wattributes,
+    "%qE attribute ignored; valid only for move assignment operators",
+    name);
+  *no_add_attrs = true;
+    }
+  else
+    {
+  *no_add_attrs = false;
+    }
+
+  return NULL_TREE;
+}
+
  /* Return a new PTRMEM_CST of the indicated TYPE.  The MEMBER is the
 thing pointed to by the constant.  */
  diff --git a/gcc/doc/

Re: [PATCH v3] xtensa: Fix up unwanted spills of SFmode hard registers holding function arguments/returns

2025-05-11 Thread Max Filippov
On Sat, May 10, 2025 at 12:51 PM Takayuki 'January June' Suwa
 wrote:
>
> Until now (presumably after transition to LRA), hard registers storing
> function arguments or return values were spilling undesirably when
> TARGET_HARD_FLOAT is enabled.
>
>  /* example */
>  float test0(float a, float b) {
>return a + b;
>  }
>  extern float foo(void);
>  float test1(void) {
>return foo() * 3.14f;
>  }
>
>  ;; before
>  test0:
> entry   sp, 48
> wfr f0, a2
> wfr f1, a3
> add.s   f0, f0, f1
> s32i.n  a2, sp, 0   ;; unwanted spilling-out
> s32i.n  a3, sp, 4   ;;
> rfr a2, f0
> retw.n
> .literal .LC1, 1078523331
>  test1:
> entry   sp, 48
> call8   foo
> l32r    a8, .LC1
> wfr f0, a10
> wfr f1, a8
> mul.s   f0, f0, f1
> s32i.n  a10, sp, 0  ;; unwanted spilling-out
> rfr a2, f0
> retw.n
>
> Ultimately, that is because the costs of moving between integer and
> floating-point hard registers are undefined and the default (large value)
> is used.  This patch fixes this.
>
>  ;; after
>  test0:
> entry   sp, 32
> wfr f1, a2
> wfr f0, a3
> add.s   f0, f1, f0
> rfr a2, f0
> retw.n
> .literal .LC1, 1078523331
>  test1:
> entry   sp, 32
> call8   foo
> l32r    a8, .LC1
> wfr f1, a10
> wfr f0, a8
> mul.s   f0, f1, f0
> rfr a2, f0
> retw.n
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.cc (xtensa_register_move_cost):
> Add appropriate move costs between AR_REGS and FP_REGS.
> ---
>   gcc/config/xtensa/xtensa.cc | 28 +++-
>   1 file changed, 19 insertions(+), 9 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.
That's a nice fix, thank you Suwa-san!
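
To illustrate why an overly large cross-class move cost can lead to the spills shown in the "before" listing, here is a toy model. Everything in it is hypothetical: the type names, the numbers, and the decision rule are not taken from xtensa.cc or from the register allocator; it only shows the cost comparison in the abstract.

```cpp
// Toy model only: not GCC code. AR stands for the integer register class,
// FP for the floating-point register class.
enum class RegClass { AR, FP };

struct MoveCosts {
  int same_class;   // move within one register class
  int cross_class;  // move between AR and FP
  int memory;       // one store or one load
};

// Cost of moving a value directly from one class to another.
int register_move_cost(const MoveCosts& c, RegClass from, RegClass to) {
  return from == to ? c.same_class : c.cross_class;
}

// A value can change class directly, or by a store followed by a load.
// The memory route wins whenever the direct move is costed higher.
bool goes_through_memory(const MoveCosts& c, RegClass from, RegClass to) {
  return register_move_cost(c, from, to) > 2 * c.memory;
}
```

With a made-up cross-class cost of 100 the memory route looks cheaper; with a small made-up cost such as 3 the direct wfr/rfr-style move wins.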

--
Thanks.
-- Max


[PATCH v1 5/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c | 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u8.c | 8 
 8 files changed, 64 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c
new file mode 100644
index 000..40e99baa30a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=0" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int16_t, -, VX_BINARY_BODY_X16)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c
new file mode 100644
index 000..b515cd9dcf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=0" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int32_t, -, VX_BINARY_BODY_X4)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c
new file mode 100644
index 000..967ede95753
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=0" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int64_t, -, VX_BINARY_BODY)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c
new file mode 100644
index 000..35811c46796
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=0" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int8_t, -, VX_BINARY_BODY_X16)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c
new file mode 100644
index 000..2f0ea3a775e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=0" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(uint16_t, -, VX_BINARY_BODY_X16)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c
new file mode 100644
index 000..2d2153abf62
--- /dev/null
+++ 

[PATCH v1 2/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check and run test for vec_duplicate + vsub.vv
combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vsub.vx.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h  | 392 ++
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u8.c|   8 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i16.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i32.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i64.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i8.c  |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u16.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u32.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u64.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u8.c  |  14 +
 17 files changed, 568 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h
index 11a32cbbf0f..c9ea22800c2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h
@@ -398,4 +398,396 @@ uint64_t TEST_BINARY_DATA(uint64_t, vadd)[][3][N] =
   },
 };
 
+int8_t TEST_BINARY_DATA(int8_t, vsub)[][3][N] =
+{
+  {
+{ 1 },
+{
+   1,  1,  1,  1,
+   2,  2,  2,  2,
+   0,  0,  0,  0,
+  -1, -1, -1, -1,
+},
+{
+   0,  0,  0,  0,
+   1,  1,  1,  1,
+  -1, -1, -1, -1,
+  -2, -2, -2, -2,
+},
+  },
+  {
+{ 127 },
+{
+   127,  127,  127,  127,
+   126,  126,  126,  126,
+-1,   -1,   -1,   -1,
+   125,  125,  125,  125,
+},
+{
+ 0,0,0,0,
+-1,   -1,   -1,   -1,
+  -128, -128, -128, -128,
+-2,   -2,   -2,   -2,
+},
+  },
+  {
+{ -128 },
+{
+  -128, -128, -128, -128,
+  -127, -127, -127,

[PATCH v1 7/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c | 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u8.c | 8 
 8 files changed, 64 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c
new file mode 100644
index 000..f3f7bb643b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=2" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int16_t, -, VX_BINARY_BODY_X8)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c
new file mode 100644
index 000..4fa3093d4f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=2" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int32_t, -, VX_BINARY_BODY_X4)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c
new file mode 100644
index 000..f112018ba3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=2" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int64_t, -, VX_BINARY_BODY)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c
new file mode 100644
index 000..ef480c21fb0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=2" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int8_t, -, VX_BINARY_BODY_X16)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c
new file mode 100644
index 000..af0c2dabff5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=2" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(uint16_t, -, VX_BINARY_BODY_X8)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c
new file mode 100644
index 000..5b056bb9e17
--- /dev/null
++

[PATCH v1 4/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c | 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u8.c | 8 
 8 files changed, 64 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c
new file mode 100644
index 000..7a252a922aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=15" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int16_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c
new file mode 100644
index 000..02543446720
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=15" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int32_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c
new file mode 100644
index 000..fc01c5479b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=15" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int64_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c
new file mode 100644
index 000..6da427130f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=15" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int8_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c
new file mode 100644
index 000..aab9b9b8177
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=15" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(uint16_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c
new file mode 100644
index 000..01e159e98d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/

[PATCH] fortran: map atand(y, x) to atan2d(y, x)

2025-05-11 Thread Yuao Ma
Hi all,

According to the Fortran standard, atand(y, x) is equivalent to atan2d(y, x).
However, the current atand(y, x) function produces an error. This patch
includes the necessary intrinsic mapping, related test, and intrinsic
documentation.
The minor comment change in intrinsic.cc is cherry-picked from Steve's previous
work.

Best regards,
Yuao



0001-fortran-map-atand-y-x-to-atan2d-y-x.patch
Description: 0001-fortran-map-atand-y-x-to-atan2d-y-x.patch


Re: [PATCH] fortran: map atand(y, x) to atan2d(y, x)

2025-05-11 Thread Tobias Burnus

Hi all, hi Yuao,

first, thanks for your patch - you are awesome! I believe it fixes the
issue reported by Steven in problem report (PR) 113413,
https://gcc.gnu.org/PR113413

Thus:

* * *

[Linking PR numbers]

In order to correlate commits to issues (and get them automatically
linked), the commit log should contain a reference.

The syntax is <tab> + "PR" + space + <component> + "/" + number,
i.e. here: (tab) PR fortran/113413.
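
As an illustration of that format (a tab, then PR, a space, the component, a slash, and the number), here is a small C++ check. The function name is_pr_line and the exact character set allowed for the component are assumptions made for this sketch; GCC's own commit verification may accept or reject more than this does.

```cpp
#include <regex>
#include <string>

// Returns true if the line looks like "<tab>PR <component>/<number>".
// The component is assumed to consist of lowercase letters, '+', and '-'
// (e.g. "fortran", "c++", "tree-optimization").
bool is_pr_line(const std::string& line) {
  static const std::regex pattern("\tPR [-a-z+]+/[0-9]+");
  return std::regex_match(line, pattern);
}
```

For example, is_pr_line("\tPR fortran/113413") holds, while the same text without the leading tab does not match.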

However, when using mklog.py, it already takes care of the syntax:

* Use '-b' to specify the bug number or
* If in the first few lines of a (test) file, a PRxxx or
  PR comp/xxx appears, mklog.py assumes that's a PR number.

Unless you already use 'PR comp/123', this doesn't automatically
add the component to the commit changelog. However, if you invoke
mklog.py with '-p', it queries Bugzilla – and fills in the component.
Additionally, '-p' includes the bug title(s) in the template; that's a
good way to cross-check the number as it is easy to mistype a longer
number.

[PR number in the commit/email subject]

Consider adding the PR number to the mail subject – most common is to append
[PR1234] at the end (i.e. without component). However, there are no firm
rules about this. Sometimes, leaving it out completely makes the most sense
(e.g. if several bugs are fixed or if the subject line is already very long);
additionally, different people have different styles.

* * *

(Pre-existing documentation issue, but should be fixed alongside:)

Looking at the documentation and comparing ATAND to ATAN,
https://gcc.gnu.org/onlinedocs/gfortran/ATAND.html and
https://gcc.gnu.org/onlinedocs/gfortran/ATAN.html

* Synopsis should also show the two-argument version
* "if Y is present, X shall be REAL" does not make sense as ATAND
  (contrary to ATAN) does not permit complex values
* Something like "If Y is present, the result is identical to
  ATAN2D(Y,X). Otherwise," is missing.
* "range -90 \leq \Re \atand(x) \leq 90": the 'Re' IMHO doesn't
  make sense as the argument must always be real.
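
To make the difference between the one- and two-argument forms concrete, here is a small sketch in C++ rather than Fortran. The functions in namespace sketch are illustrative stand-ins built on the C++ standard library, not gfortran's implementation.

```cpp
#include <cmath>

namespace sketch {

// Conversion factor from radians to degrees.
inline double degrees_per_radian() { return 180.0 / std::acos(-1.0); }

// One-argument form: arctangent in degrees, result in [-90, 90].
inline double atand(double x) { return std::atan(x) * degrees_per_radian(); }

// Two-argument form: quadrant-aware arctangent of y/x in degrees,
// result in [-180, 180]. This is what ATAND(Y, X) is meant to map to.
inline double atan2d(double y, double x) {
  return std::atan2(y, x) * degrees_per_radian();
}

}  // namespace sketch
```

For y = 1 and x = -1 the two-argument form gives 135 degrees, whereas feeding the quotient y/x to the one-argument form gives -45 degrees, since the quadrant information is lost.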

* * *

[I still have to look at the rest of the patch, but at a glance it
looks fine.]

Thanks again for your patch!

Tobias



[committed] cobol: Eliminate padding bytes from cbl_declarative_t.

2025-05-11 Thread Robert Dubner
From 1e4dee2dae0ad08fecb50dcced3d00c6cfffd932 Mon Sep 17 00:00:00 2001
From: Robert Dubner mailto:rdub...@symas.com
Date: Sun, 11 May 2025 13:43:32 -0400
Subject: [PATCH] cobol: Eliminate padding bytes from cbl_declarative_t.
 [PR119377]

By changing the type of a variable in the cbl_declarative_t structure from
"bool" to "uint32_t", three uninitialized padding bytes were turned into
initialized bytes.  This eliminates the valgrind error caused by those
uninitialized values.

This is an interim fix, which expediently eliminates the valgrind problem.
The underlying design flaw, which involves turning a host-side C++ structure
into a run-time data block, is slated for complete replacement in the next
few weeks.

libgcobol/ChangeLog:

PR cobol/119377
* common-defs.h: (struct cbl_declaratives_t): Change "bool global"
to "uint32_t global".
---
 libgcobol/common-defs.h | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/libgcobol/common-defs.h b/libgcobol/common-defs.h
index 026f377e74b2..e3471c5ccc3d 100644
--- a/libgcobol/common-defs.h
+++ b/libgcobol/common-defs.h
@@ -458,11 +458,25 @@ struct cbl_enabled_exception_t {
 struct cbl_declarative_t {
   enum { files_max = 16 };
   size_t section; // implies program
-  bool global;
+  uint32_t global;  // See the note below
   ec_type_t type;
   uint32_t nfile, files[files_max];
   cbl_file_mode_t mode;
 
+/*  The ::global member originally was "bool global".  A bool, however, occupies
+only one byte of storage.  The structure, in turn, is constructed on
+four-byte boundaries for members, so there were three padding bytes between
+the single byte of global and the ::type member.
+
+When used to create a "blob", where the structure was treated as a stream
+of bytes that were used to create a constructor for an array of bytes,
+valgrind noticed that those three padding bytes were not initialized, and
+generated the appropriate error message.  This made it hard to find other
+problems.
+
+Changing the declaration from "bool" to "uint32_t" seems to have eliminated
+the valgrind error without affecting overall performance.  */
+
   cbl_declarative_t( cbl_file_mode_t mode = file_mode_none_e )
 : section(0), global(false)
 , type(ec_none_e)
-- 
2.34.1
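
The padding described in the commit message above can be measured with a small sketch. The struct names below are hypothetical stand-ins, not the real cbl_declarative_t, and uint32_t stands in for ec_type_t on the assumption that the enum is four bytes wide. The exact layout depends on the ABI, so the numbers are what a typical target with a one-byte bool would give.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the layout before the change.
struct declarative_before {
  std::size_t section;
  bool global;          // usually 1 byte, so padding follows it
  std::uint32_t type;   // assumption: stands in for ec_type_t
};

// Hypothetical stand-in for the layout after the change.
struct declarative_after {
  std::size_t section;
  std::uint32_t global; // now covers the bytes that used to be padding
  std::uint32_t type;
};

// Number of bytes between the end of `global` and the start of `type`.
constexpr std::size_t pad_before =
    offsetof(declarative_before, type)
    - offsetof(declarative_before, global) - sizeof(bool);

constexpr std::size_t pad_after =
    offsetof(declarative_after, type)
    - offsetof(declarative_after, global) - sizeof(std::uint32_t);

// Two adjacent uint32_t members never need padding between them.
static_assert(pad_after == 0, "no padding expected after the change");
```

On targets where bool is one byte and uint32_t is four-byte aligned, pad_before evaluates to 3, which matches the three uninitialized bytes that valgrind reported.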



Re: i386: Fix some problems in stv cost model

2025-05-11 Thread Richard Biener



> On 10.05.2025 at 22:28, Jan Hubicka wrote:
> 
> Hi,
> this patch fixes some of the problems with costing in the scalar to vector pass.
> In particular
> 1) the pass uses optimize_insn_for_size which is intended to be used by
>    expanders and splitters and requires the optimization pass to use
>    set_rtl_profile (bb) for the currently processed bb.
>    This is not done, so we get random stale info about hotness of insn.
> 2) register allocator move costs are all relative to integer reg-reg move
>    which has cost of 2, so it is (except for size tables and i386)
>    a latency of instruction multiplied by 2.
>    These costs have been duplicated and are now used in combination with
>    rtx costs which are all based on COSTS_N_INSNS that multiplies latency
>    by 4.
>    Some of vectorizer costing contains COSTS_N_INSNS (move_cost) / 2
>    to compensate, but some new code does not.  This patch adds compensation.
>
>    Perhaps we should update the cost tables to use COSTS_N_INSNS everywhere
>    but I think we want to first fix inconsistencies.  Also the tables will
>    get optically much longer, since we have many move costs and COSTS_N_INSNS
>    is a lot of characters.
> 3) variable m which decides how much to multiply integer variant (to account
>    that with -m32 all 64bit computations need 2 instructions) is declared
>    unsigned which makes the signed computation of instruction gain to be
>    done in unsigned type and breaks e.g. for division.
> 4) I added integer_to_sse costs which are currently all duplicates of
>    sse_to_integer. AMD chips are asymmetric and moving in one direction is
>    faster than in the other.  I will change costs incrementally once the
>    vectorizer part is fixed up, too.
> 
> There are two failures gcc.target/i386/minmax-6.c and
> gcc.target/i386/minmax-7.c.
> Both test stv on Haswell which no longer happens since SSE->INT and INT->SSE
> moves are now more expensive.
>
> There is only one instruction to convert:
>
> Computing gain for chain #1...
>  Instruction gain 8 for 11: {r110:SI=smax(r116:SI,0);clobber flags:CC;}
>  Instruction conversion gain: 8
>  Registers conversion cost: 8   <- this is integer_to_sse and sse_to_integer
>  Total gain: 0
>
> total gain used to be 4 since the patch doubles the conversion costs.
> According to Agner Fog's tables the costs should be 1 cycle which is correct
> here.
>
> Final code generated is:
>
>    vmovd   %esi, %xmm0              * latency 1
>    cmpl    %edx, %esi
>    je      .L2
>    vpxor   %xmm1, %xmm1, %xmm1      * latency 1
>    vpmaxsd %xmm1, %xmm0, %xmm0      * latency 1
>    vmovd   %xmm0, %eax              * latency 1
>    imull   %edx, %eax
>    cltq
>    movzwl  (%rdi,%rax,2), %eax
>    ret
>
>    cmpl    %edx, %esi
>    je      .L2
>    xorl    %eax, %eax               * latency 1
>    testl   %esi, %esi               * latency 1
>    cmovs   %eax, %esi               * latency 2
>    imull   %edx, %esi
>    movslq  %esi, %rsi
>    movzwl  (%rdi,%rsi,2), %eax
>    ret
>
> Instructions with latency info are those really different.
> So the unconverted code has sum of latencies 4 and real latency 3.
> Converted code has sum of latencies 4 and real latency 3 (vmovd+vpmaxsd+vmovd).
> So I do not quite see why it should be a win.

Note this was historically done because cmov performance behaves erratically
at least on some uarchs compared to SSE min/max, esp. if there are
back-to-back cmovs (the latter, aka throughput, is not modeled at all in the
cost tables nor the pass).  IIRC it was hmmer from SPEC 2006 exhibiting such a
back-to-back case.

Richard 
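
Points 2) and 3) of the quoted message can be illustrated with a short sketch. It reproduces GCC's COSTS_N_INSNS scaling (N * 4) as a constexpr function. The helper names move_cost_to_rtx_scale, gain_unsigned_m and gain_signed_m are hypothetical and only serve the illustration, and the example assumes a 32-bit unsigned int.

```cpp
// Mirrors GCC's COSTS_N_INSNS (N), which is defined as (N) * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Register allocator move costs are relative to a reg-reg move costing 2.
// Hypothetical helper: rescale such a cost to the COSTS_N_INSNS scale,
// i.e. the "COSTS_N_INSNS (move_cost) / 2" compensation from point 2).
constexpr int move_cost_to_rtx_scale(int move_cost) {
  return costs_n_insns(move_cost) / 2;
}

// Point 3): with an unsigned multiplier the whole expression is evaluated
// in unsigned arithmetic, so a negative gain wraps around before the
// division and comes out as a large positive value.
inline int gain_unsigned_m(int per_insn_gain, unsigned m) {
  return static_cast<int>((per_insn_gain * m) / 2);
}

// Same computation with a signed multiplier keeps the sign intact.
inline int gain_signed_m(int per_insn_gain, int m) {
  return (per_insn_gain * m) / 2;
}
```

With these definitions a move cost of 2 maps to costs_n_insns(1), and a gain of -8 with m = 2 stays -8 in the signed variant while the unsigned variant yields a large positive number.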

> There is also a bug in costing MIN/MAX
> 
>    case ABS:
>    case SMAX:
>    case SMIN:
>    case UMAX:
>    case UMIN:
>      /* We do not have any conditional move cost, estimate it as a
>         reg-reg move.  Comparisons are costed as adds.  */
>      igain += m * (COSTS_N_INSNS (2) + ix86_cost->add);
>      /* Integer SSE ops are all costed the same.  */
>      igain -= ix86_cost->sse_op;
>      break;
> 
> Now COSTS_N_INSNS (2) is not quite right since reg-reg move should be 1 or
> perhaps 0.
> For Haswell cmov really is 2 cycles, but I guess we want to have that in cost
> vectors like all other instructions.
>
> I am not sure if this is really a win in this case (other minmax testcases
> seem to make sense).  I have xfailed it for now and will check if that
> affects specs on LNT testers.
>
> Bootstrapped/regtested x86_64-linux, committed.
>
> I will proceed with similar fixes on vectorizer cost side. Sadly those
> introduce quite some differences in the testsuite (partly triggered by other
> costing problems, such as one of scatter/gather)
> 
> gcc/ChangeLog:
> 
>* config/i386/i386-features.cc
>(general_scalar_chain::vector_const_cost): Add BB parameter; handle
>size costs; use COSTS_N_INSNS to compute move costs.
>(general_scalar_chain::compute_convert_gain): Use optimize_bb_for_size
>instead of optimi
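
The MIN/MAX costing formula and the gain dump quoted in this message can be tied together with a worked sketch. The cost_table struct and both helper functions are hypothetical. The values add = COSTS_N_INSNS (1) and sse_op = COSTS_N_INSNS (1) are assumptions chosen because they reproduce the "Instruction gain 8" line of the dump, not values read from the actual cost tables.

```cpp
// Mirrors GCC's COSTS_N_INSNS (N), which is defined as (N) * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical, reduced cost table with only the fields used here.
struct cost_table {
  int add;     // cost of an integer add (also used for comparisons)
  int sse_op;  // cost of an integer SSE operation
};

// Gain of converting one MIN/MAX/ABS insn, following the quoted snippet:
//   igain += m * (COSTS_N_INSNS (2) + add);  igain -= sse_op;
constexpr int minmax_insn_gain(const cost_table &c, int m) {
  return m * (costs_n_insns(2) + c.add) - c.sse_op;
}

// Total gain of a chain: instruction gain minus register conversion cost.
constexpr int total_gain(int insn_gain, int conversion_cost) {
  return insn_gain - conversion_cost;
}
```

Under these assumptions, m = 1 gives an instruction gain of 8. A conversion cost of 8 then gives a total gain of 0, and the earlier, halved conversion cost of 4 gives the total gain of 4 mentioned in the message.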

Re: [PATCH v1] libstdc++: More efficient weekday from year_month_day.

2025-05-11 Thread Cassio Neri
Hi all,

After reflecting on my previous message and Andrew's, I now believe this
patch is not the best solution to optimise the day of the week. Instead,
the optimisation for n % 7 should be done by the compiler depending on the
platform.

I'll open a missing optimisation opportunity bug report against GCC with my
suggestions.

Therefore, expecting a better solution from the compiler, I'd ask you to
disregard this patch.

Best wishes,
Cassio.

On Sun, 11 May 2025 at 01:59, Cassio Neri  wrote:

> Thanks Andrew for your prompt reply.
>
> The results below regard my PoC which is as close to the proposed patch as
> I could make. This is because I can't have chrono with my patch on godbolt
> for a comparison between current chrono and patched chrono.
>
> I tried on all platforms that I could get it to compile on. Please double
> check everything because I might be misreading some results, especially on
> the platforms that I'm not familiar with. Sometimes godbolt seems to have
> issues and cuts pieces of the generated assembly from the output. I've
> marked these cases with (X).
>
> I suspect we want to disable this for -Os
>>
>
> Below are the sizes with -Os. Most of the time the new code is shorter
> than the old one with a few exceptions where they are the same size
> (because the platform doesn't seem to support [[assume]]). The new code is
> never longer. On each link, the middle panel shows the result for the old
> code and the right panel for the new code. These panels have tabs for
> different platforms.
>
>                Old   New
>
> https://godbolt.org/z/hfz9szEWf
>
> x86-64         0x81  0x69
> ARM32          0x78  0x68
> ARM64          0x81  0x71
> ARM64 Morello  0x48  0x48
> HPPA           0xf8  0xc8
> KVX ACB        0xec  0xcc
> loongarch64    0x94  0x8c  (X)
>
> https://godbolt.org/z/eMfzoPhT5
>
> M68K           0xb6  0xa6
> MinGW          0xa0  0x80  (X)
> mips           0xdc  0xac
> mips64         0xcc  0xb8
> mips64 (el)    0xbc  0xa8
> mipsel         0xe0  0xb0
> MRISC32        0xa4  0x74
> power          0xb8  0x80
>
> https://godbolt.org/z/PjqbTqK6b
>
> power64        0xa8  0x8c
> power64le      0xa4  0x88
> RISC-V (32)    0x90  0x7e  (X)
> RISC-V (64)    0x86  0x86  (X)
> s390x          0xf0  0x90
> sh             0xc2  0xb2
> SPARC          0xc0  0x98
> SPARC LEON     0xbc  0x94
>
> https://godbolt.org/z/7oebGMYTM
>
> SPARC64        0xac  0x94
> TI C6x         0xc4  0x98
> Tricore        0xb0  0xb0
> VAX            0xc8  0xc5
>
>
>> And plus I am not 100% convinced it is best for all micro-architectures.
>> Especially on say aarch64.
>> Can you do more benchmarking and provide which exact core is being used?
>>
>
> I don't have access to any platform other than x86-64 to do benchmarks :-(
>
> And mention the size difference too?
>>
>
> Same exercise explained above but with -O2:
>
>                Old    New
> https://godbolt.org/z/eqGo9xnz3
>
> x86-64         0x a4  0x 72
> ARM32          0x a8  0x 74
> ARM64          0x 98  0x 80
> ARM64 Morello  0x14c  0x14c
> HPPA           0x134  0x c8
> KVX ACB        0x f4  0x 98
> loongarch64    0x ac  0x 9c  (X)
>
> https://godbolt.org/z/7qh94zGMK
>
> M68K           0x13a  0x a2
> MinGW          0x d0  0x 80  (X)
> mips           0x11c  0x e4
> mips64         0x130  0x f0
> mips64 (el)    0x120  0x e0
> mipsel         0x120  0x e8
> MRISC32        0x a0  0x 74
> power          0x dc  0x 88
>
> https://godbolt.org/z/Y11Trnqc1
>
> power64        0x d0  0x 94
> power64le      0x d0  0x 90
> RISC-V (32)    0x bc  0x 84  (X)
> RISC-V (64)    0x be  0x 94  (X)
> s390x          0x f0  0x a8
> sh             0x c6  0x cc  (*)
> SPARC          0x108  0x 9c
> SPARC LEON     0x f4  0x 94
>
> https://godbolt.org/z/h456PTEWh
>
> SPARC64        0x c0  0x a0
> TI C6x         0x108  0x b0
> Tricore        0x e8  0x ea  (*)
> VAX            0x dc  0x dc
>
> (*) These are the only cases where the new code is larger than the old one.
>
> Plus gcc knows how to do %7 via multiplication; is that being used, or is it
>> using the div instruction due to generic x86 tuning?
>>
>
> Yes and no. In x86-64 (and probably many other platforms) the current
> optimisation for n % 7 is a byproduct of the optimisation for /, that is,
> to calculate n % 7, the generated code evaluates n - (n / 7) * 7. The
> quotient q = n / 7 is optimised to avoid div and uses a multiplication and
> other cheaper operations. In total it evaluates 2 multiplications +
> shifts + add + subs and movs. (One multiplication is q*7 which is performed
> with LEA + sub.) The algorithm that I'm suggesting, performs only one
> multiplication and one. Below are the comparisons of n % 7 and the proposed
> algorithm.
>
> https://godbolt.org/z/o7dazs4Gc
> https://godbolt.org/z/zP79736WK
> https://godbolt.org/z/65x7naMfq
> https://godbolt.org/z/z9ofaMzex
>
> I hope this helps.
> Cassio.
>
>
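
The conventional lowering described in the quoted reply, n % 7 computed as n - (n / 7) * 7 with the quotient obtained by multiplication, can be sketched as follows for 32-bit unsigned values. This is the well-known multiply-high scheme for division by 7, not the new algorithm proposed in the patch. The function names div7 and mod7 are hypothetical.

```cpp
#include <cstdint>

// Quotient n / 7 without a div instruction: take the high 32 bits of
// n * 0x24924925 and apply the usual add-and-shift fixup for divisor 7.
inline std::uint32_t div7(std::uint32_t n) {
  std::uint32_t t =
      static_cast<std::uint32_t>((std::uint64_t{n} * 0x24924925u) >> 32);
  return (t + ((n - t) >> 1)) >> 2;
}

// Remainder as n - q * 7, with q * 7 formed as (q << 3) - q, which is
// roughly what an LEA + sub sequence computes on x86-64.
inline std::uint32_t mod7(std::uint32_t n) {
  std::uint32_t q = div7(n);
  return n - ((q << 3) - q);
}
```

The sketch makes the operation count visible: one widening multiplication for the quotient, a few shifts and adds for the fixup, then a second multiplication by 7 (strength-reduced here) and a subtraction for the remainder.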


[PATCH] tree-optimization/120211 - constrain LOOP_VINFO_EARLY_BREAKS_LIVE_IVS more

2025-05-11 Thread Richard Biener
The PR120089 fix added more PHIs to LOOP_VINFO_EARLY_BREAKS_LIVE_IVS
without checking that we only add PHIs with a latch argument.  The
following adds this missing check.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/120211
* tree-vect-stmts.cc (vect_stmt_relevant_p): Only add PHIs
from the loop header to LOOP_VINFO_EARLY_BREAKS_LIVE_IVS.

* gcc.dg/vect/vect-early-break_135-pr120211.c: New testcase.
* gcc.dg/torture/pr120211-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr120211-1.c | 20 +++
 .../vect/vect-early-break_135-pr120211.c  | 12 +++
 gcc/tree-vect-stmts.cc|  1 +
 3 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr120211-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_135-pr120211.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr120211-1.c 
b/gcc/testsuite/gcc.dg/torture/pr120211-1.c
new file mode 100644
index 000..f9bc97cab5b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr120211-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int a, b, d;
+void e() {
+  do {
+int f = 0;
+while (1) {
+  int c = a;
+  for (; (c & 1) == 0; c = 1)
+for (; c & 1;)
+  ;
+  if (a)
+break;
+  f++;
+}
+b = f & 5;
+if (b)
+  break;
+  } while (d++);
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_135-pr120211.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_135-pr120211.c
new file mode 100644
index 000..664b60de2aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_135-pr120211.c
@@ -0,0 +1,12 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-additional-options "-O3 -fno-tree-copy-prop -fno-tree-dominator-opts 
-fno-tree-loop-ivcanon -fno-tree-pre -fno-code-hoisting" } */
+
+int a, b[1];
+int main() {
+  int c = 0;
+  for (; c < 1; c++) {
+while (a)
+  c++;
+b[c] = 0;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index efe6a2c9c42..bd390b26e0a 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -424,6 +424,7 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info, 
loop_vec_info loop_vinfo,
  alternate exit.  */
   if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
   && is_a <gphi *> (stmt)
+  && gimple_bb (stmt) == LOOP_VINFO_LOOP (loop_vinfo)->header
   && ((! VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info))
  && ! *live_p)
  || STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def))
-- 
2.43.0


[committed] cobol: New testcases.

2025-05-11 Thread Robert Dubner
cobol: New testcases.

Eighty-six testcases extracted from the run_move and run_misc COBOLworx
testsuite.

gcc/testsuite/ChangeLog:

* cobol.dg/group2/258_Nested_PERFORM.cob: New testcase.
* cobol.dg/group2/259_PERFORM_VARYING_BY_-0.2.cob: Likewise.
* cobol.dg/group2/338_Default_Arithmetic__1_.cob: Likewise.
* cobol.dg/group2/access_to_OPTIONAL_LINKAGE_item_not_passed.cob: Likewise.
* cobol.dg/group2/ALLOCATE___FREE_basic_default_versions.cob: Likewise.
* cobol.dg/group2/ALLOCATE___FREE_with_BASED_item__1_.cob: Likewise.
* cobol.dg/group2/ALLOCATE___FREE_with_BASED_item__2_.cob: Likewise.
* cobol.dg/group2/ALLOCATE_Rule_8_OPTION_INITIALIZE_with_figconst.cob: Likewise.
* cobol.dg/group2/Alphanumeric_and_binary_numeric.cob: Likewise.
* cobol.dg/group2/Alphanumeric_MOVE_with_truncation.cob: Likewise.
* cobol.dg/group2/ANY_LENGTH__1_.cob: Likewise.
* cobol.dg/group2/ANY_LENGTH__2_.cob: Likewise.
* cobol.dg/group2/ANY_LENGTH__3_.cob: Likewise.
* cobol.dg/group2/ANY_LENGTH__4_.cob: Likewise.
* cobol.dg/group2/ANY_LENGTH__5_.cob: Likewise.
* cobol.dg/group2/CALL_with_OMITTED_parameter.cob: Likewise.
* cobol.dg/group2/Class_check_with_reference_modification.cob: Likewise.
* cobol.dg/group2/Complex_HEX__VALUE_and_MOVE.cob: Likewise.
* cobol.dg/group2/Complex_IF.cob: Likewise.
* cobol.dg/group2/Concatenation_operator.cob: Likewise.
* cobol.dg/group2/CONTINUE_AFTER_1_SECONDS.cob: Likewise.
* cobol.dg/group2/CURRENCY_SIGN.cob: Likewise.
* cobol.dg/group2/CURRENCY_SIGN_WITH_PICTURE_SYMBOL.cob: Likewise.
* cobol.dg/group2/DECIMAL-POINT_is_COMMA__1_.cob: Likewise.
* cobol.dg/group2/DECIMAL-POINT_is_COMMA__2_.cob: Likewise.
* cobol.dg/group2/DECIMAL-POINT_is_COMMA__3_.cob: Likewise.
* cobol.dg/group2/DECIMAL-POINT_is_COMMA__4_.cob: Likewise.
* cobol.dg/group2/DECIMAL-POINT_is_COMMA__5_.cob: Likewise.
* cobol.dg/group2/EC-SIZE-TRUNCATION_EC-SIZE-OVERFLOW.cob: Likewise.
* cobol.dg/group2/EC-SIZE-ZERO-DIVIDE__fixed_and_float.cob: Likewise.
* cobol.dg/group2/EXIT_PARAGRAPH.cob: Likewise.
* cobol.dg/group2/EXIT_PERFORM.cob: Likewise.
* cobol.dg/group2/EXIT_PERFORM_CYCLE.cob: Likewise.
* cobol.dg/group2/EXIT_SECTION.cob: Likewise.
* cobol.dg/group2/Fixed_continuation_indicator.cob: Likewise.
* cobol.dg/group2/FLOAT-LONG_with_SIZE_ERROR.cob: Likewise.
* cobol.dg/group2/FLOAT-SHORT___FLOAT-LONG_w_o_SIZE_ERROR.cob: Likewise.
* cobol.dg/group2/FLOAT-SHORT_with_SIZE_ERROR.cob: Likewise.
* cobol.dg/group2/Index_and_parenthesized_expression.cob: Likewise.
* cobol.dg/group2/LENGTH_OF_omnibus.cob: Likewise.
* cobol.dg/group2/LOCAL-STORAGE__3__with_recursive_PROGRAM-ID.cob: Likewise.
* cobol.dg/group2/LOCAL-STORAGE__4__with_recursive_PROGRAM-ID_..._USING.cob: Likewise.
* cobol.dg/group2/MOVE_indexes.cob: Likewise.
* cobol.dg/group2/MOVE_integer_literal_to_alphanumeric.cob: Likewise.
* cobol.dg/group2/MOVE_to_edited_item__1_.cob: Likewise.
* cobol.dg/group2/MOVE_to_edited_item__2_.cob: Likewise.
* cobol.dg/group2/MOVE_to_item_with_simple_and_floating_insertion.cob: Likewise.
* cobol.dg/group2/MOVE_to_itself.cob: Likewise.
* cobol.dg/group2/MOVE_to_JUSTIFIED_item.cob: Likewise.
* cobol.dg/group2/MOVE_with_group_refmod.cob: Likewise.
* cobol.dg/group2/MOVE_with_refmod.cob: Likewise.
* cobol.dg/group2/MOVE_with_refmod__variable_.cob: Likewise.
* cobol.dg/group2/MOVE_Z_literal_.cob: Likewise.
* cobol.dg/group2/Multi-target_MOVE_with_subscript_re-evaluation.cob: Likewise.
* cobol.dg/group2/Non-numeric_data_in_numeric_items__1_.cob: Likewise.
* cobol.dg/group2/Non-numeric_data_in_numeric_items__2_.cob: Likewise.
* cobol.dg/group2/Non-overflow_after_overflow.cob: Likewise.
* cobol.dg/group2/OCCURS_clause_with_1_entry.cob: Likewise.
* cobol.dg/group2/OSVS_Arithmetic_Test__2_.cob: Likewise.
* cobol.dg/group2/PERFORM_..._CONTINUE.cob: Likewise.
* cobol.dg/group2/PERFORM_inline__1_.cob: Likewise.
* cobol.dg/group2/PERFORM_inline__2_.cob: Likewise.
* cobol.dg/group2/PERFORM_type_OSVS.cob: Likewise.
* cobol.dg/group2/PIC_ZZZ-__ZZZ_.cob: Likewise.
* cobol.dg/group2/Quick_check_of_PIC_XX_COMP-5.cob: Likewise.
* cobol.dg/group2/Quote_marks_in_comment_paragraphs.cob: Likewise.
* cobol.dg/group2/Recursive_PERFORM_paragraph.cob: Likewise.
*

[PATCH v20 0/4] c: Add _Countof and <stdcountof.h>

2025-05-11 Thread Alejandro Colomar
Hi,

Here's the list of changes in v20:

-  Drop changes to support Cc tags in commit messages (but keep the
   patch to add support for Link tags).
-  Drop the Cc tags from commit 2 (but keep Link tags).
-  Remove one _Static_assert() from tests.  I think the test is more
   readable without it.
-  Add space in macro definition.
-  Fix changelog for the makefile changes.
-  Add patch 4, which adds the pedantic warning for <= C23.

I think this is feature-complete.  I have run `make bootstrap`.  I
haven't run `make check` yet this time; I'll do that in the following
days.


Have a lovely day!
Alex


Alejandro Colomar (4):
  contrib/: Add support for Link: tags
  c: Add _Countof operator
  c: Add <stdcountof.h>
  c: Add -Wpedantic diagnostic for _Countof

 contrib/gcc-changelog/git_commit.py|   3 +
 gcc/Makefile.in|   1 +
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  63 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 115 +-
 gcc/doc/extend.texi|  30 +
 gcc/ginclude/stdcountof.h  |  31 +
 gcc/testsuite/gcc.dg/countof-compile.c | 130 +
 gcc/testsuite/gcc.dg/countof-vla.c |  51 
 gcc/testsuite/gcc.dg/countof.c | 154 +
 14 files changed, 611 insertions(+), 24 deletions(-)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff against v19:
1:  796c82b0cba ! 1:  0a752a02dd0 contrib/: Add support for Cc: and Link: tags
@@ Metadata
 Author: Alejandro Colomar 
 
  ## Commit message ##
-contrib/: Add support for Cc: and Link: tags
+contrib/: Add support for Link: tags
 
 contrib/ChangeLog:
 
 * gcc-changelog/git_commit.py (GitCommit):
-Add support for 'Cc: ' and 'Link: ' tags.
+Add support for 'Link:' tags.
 
 Cc: Jason Merrill 
 Signed-off-by: Alejandro Colomar 
 
  ## contrib/gcc-changelog/git_commit.py ##
 @@ contrib/gcc-changelog/git_commit.py: CO_AUTHORED_BY_PREFIX = 
'co-authored-by: '
- 
  REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
 'acked-by: ', 'tested-by: ', 'reported-by: ',
--   'suggested-by: ')
-+   'suggested-by: ', 'cc: ')
+'suggested-by: ')
 +LINK_PREFIXES = ('link: ')
  DATE_FORMAT = '%Y-%m-%d'
  
2:  ae4691c8b45 ! 2:  c28c880f609 c: Add _Countof operator
@@ Commit message
 Suggested-by: Xavier Del Campo Romero 
 Co-authored-by: Martin Uecker 
 Acked-by: "James K. Lowden" 
-Cc: Joseph Myers 
-Cc: Gabriel Ravier 
-Cc: Jakub Jelinek 
-Cc: Kees Cook 
-Cc: Qing Zhao 
-Cc: Jens Gustedt 
-Cc: David Brown 
-Cc: Florian Weimer 
-Cc: Andreas Schwab 
-Cc: Timm Baeder 
-Cc: Daniel Plakosh 
-Cc: "A. Jiang" 
-Cc: Eugene Zelenko 
-Cc: Aaron Ballman 
-Cc: Paul Koning 
-Cc: Daniel Lundin 
-Cc: Nikolaos Strimpas 
-Cc: JeanHeyd Meneide 
-Cc: Fernando Borretti 
-Cc: Jonathan Protzenko 
-Cc: Chris Bazley 
-Cc: Ville Voutilainen 
-Cc: Alex Celeste 
-Cc: Jakub Łukasiewicz 
-Cc: Douglas McIlroy 
-Cc: Jason Merrill 
-Cc: "Gustavo A. R. Silva" 
-Cc: Patrizia Kaye 
-Cc: Ori Bernstein 
-Cc: Robert Seacord 
-Cc: Marek Polacek 
-Cc: Sam James 
-Cc: Richard Biener 
 Signed-off-by: Alejandro Colomar 
 
  ## gcc/c-family/c-common.cc ##
@@ gcc/testsuite/gcc.dg/countof-compile.c (new)
 +{
 +  int b[n][n];
 +
-+  _Static_assert (_Countof (a[f1()]) == 9);
++  _Countof (a[f1()]);
 +  _Countof (b[f2()]);
 +}
 +
3:  f4700c6d7dc ! 3:  f6ff1f130de c: Add <stdcountof.h>
@@ Commit message
 
 gcc/ChangeLog:
 
-* Makefile.in
+* Makefile.in (USER_H): Add <stdcountof.h>.
 * ginclude/stdcountof.h: Add countof macro.
 
 Signed-off-by: Alejandro Colomar 
@@ gcc/ginclude/stdcountof.h (new)
 +#ifndef _STDCOUNTOF_H
 +#define _STDCOUNTOF_H
 +
-+#define countof _Countof
++#define countof  _Countof
 +
 +#endif/* stdcountof.h */
-:  --- > 4:  94bc203a406 c: Add -Wpedantic diagnostic for _Countof
-- 
2.49.0



[PATCH v1 6/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check tests for combining vec_duplicate + vsub.vv into vsub.vx.

The following test suite passed for this patch.
* The full rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c | 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u8.c | 8 
 8 files changed, 64 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c
new file mode 100644
index 000..ac209e7f884
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int16_t, -, VX_BINARY_BODY_X8)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c
new file mode 100644
index 000..afb852b18f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int32_t, -, VX_BINARY_BODY_X4)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c
new file mode 100644
index 000..713f746123c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int64_t, -, VX_BINARY_BODY)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c
new file mode 100644
index 000..18a098cdda1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(int8_t, -, VX_BINARY_BODY_X16)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c
new file mode 100644
index 000..8da3ac356ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_1(uint16_t, -, VX_BINARY_BODY_X8)
+
+/* { dg-final { scan-assembler {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c
new file mode 100644
index 000..f8fdcaad34a
--- /dev/null
+++ b/

[PATCH v1 1/7] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-11 Thread pan2 . li
From: Pan Li 

This patch combines vec_duplicate + vsub.vv into vsub.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination if
the GR2VR cost is zero and rejects it if the GR2VR cost is greater
than zero.

Assume we have example code like below, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, -)

Before this patch:
  test_binary_vx_sub:
    beq     a3,zero,.L8
    vsetvli a5,zero,e32,m1,ta,ma   // Deleted if GR2VR cost zero
    vmv.v.x v2,a2                  // Ditto.
    slli    a3,a3,32
    srli    a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli    a4,a5,2
    sub     a3,a3,a5
    add     a1,a1,a4
    vsub.vv v1,v2,v1
    vse32.v v1,0(a0)
    add     a0,a0,a4
    bne     a3,zero,.L3

After this patch:
  test_binary_vx_sub:
    beq     a3,zero,.L8
    slli    a3,a3,32
    srli    a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli    a4,a5,2
    sub     a3,a3,a5
    add     a1,a1,a4
    vsub.vx v1,v1,a2
    vse32.v v1,0(a0)
    add     a0,a0,a4
    bne     a3,zero,.L3

The following test suite passed for this patch.
* The full rv64gcv regression test.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*_vx_): >): Add new
pattern to convert vec_duplicate + vsub.vv to vsub.vx.
* config/riscv/riscv.cc (riscv_rtx_costs): Add minus as plus op.
* config/riscv/vector-iterators.md: Add minus to iterator
any_int_binop_no_shift_vx.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec-opt.md  | 17 +
 gcc/config/riscv/riscv.cc|  1 +
 gcc/config/riscv/vector-iterators.md |  2 +-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 7cf7e8a92ba..9c6bf06c3a9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1696,3 +1696,20 @@ (define_insn_and_split "*_vx_"
   riscv_vector::BINARY_OP, ops);
   }
   [(set_attr "type" "vialu")])
+
+(define_insn_and_split "*_vx_"
+ [(set (match_operand:V_VLSI0 "register_operand")
+   (any_int_binop_no_shift_vx:V_VLSI
+(match_operand:V_VLSI  2 "")
+(vec_duplicate:V_VLSI
+  (match_operand: 1 "register_operand"]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+rtx ops[] = {operands[0], operands[2], operands[1]};
+riscv_vector::emit_vlmax_insn (code_for_pred_scalar (, mode),
+  riscv_vector::BINARY_OP, ops);
+  }
+  [(set_attr "type" "vialu")])
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3ee88db24fa..23a47978a18 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3875,6 +3875,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
*total = gr2vr_cost * COSTS_N_INSNS (1);
break;
  case PLUS:
+ case MINUS:
{
  rtx op_0 = XEXP (x, 0);
  rtx op_1 = XEXP (x, 1);
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index eae33409cb0..23cb940310f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4042,7 +4042,7 @@ (define_code_iterator any_int_binop [plus minus and ior 
xor ashift ashiftrt lshi
 ])
 
 (define_code_iterator any_int_binop_no_shift_vx [
-  plus
+  plus minus
 ])
 
 (define_code_iterator any_int_unop [neg not])
-- 
2.43.0
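
The scalar loop from the commit message of this patch can be instantiated and exercised on any host to see what the vectorized code has to compute. The sketch below is a variation on the macro in the message: it adds a hypothetical NAME parameter so that several instantiations can coexist, and it uses the __restrict extension because standard C++ has no restrict keyword. Both changes are assumptions made for this illustration and are not part of the patch.

```cpp
#include <cstdint>

// Variation of DEF_VX_BINARY from the commit message.  NAME is a
// hypothetical extra parameter; __restrict is a compiler extension
// (GCC, Clang, MSVC) standing in for C's restrict.
#define DEF_VX_BINARY(T, NAME, OP)                                   \
  void NAME(T *__restrict out, T *__restrict in, T x, unsigned n) {  \
    for (unsigned i = 0; i < n; i++)                                 \
      out[i] = in[i] OP x;                                           \
  }

// Each element is the vector operand minus the scalar x, which is the
// operation that a single vsub.vx instruction performs per element.
DEF_VX_BINARY(std::int32_t, test_vx_binary_sub_i32, -)
```

Calling test_vx_binary_sub_i32 with in = {5, 6, 7, 8}, x = 3 and n = 4 stores {2, 3, 4, 5} in out. The scalar x is loop-invariant, which is why it can stay in a general-purpose register instead of being broadcast with vmv.v.x.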



[PATCH v1 3/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1

2025-05-11 Thread pan2 . li
From: Pan Li 

Add asm dump check tests for combining vec_duplicate + vsub.vv into vsub.vx.

The following test suite passed for this patch.
* The full rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c | 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u64.c| 8 
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u8.c | 8 
 8 files changed, 64 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u8.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c
new file mode 100644
index 000..fdbc7fc2c4c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int16_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c
new file mode 100644
index 000..92fcad5c8d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int32_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c
new file mode 100644
index 000..10d4378668b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int64_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c
new file mode 100644
index 000..63626f96573
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(int8_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c
new file mode 100644
index 000..c0d853a9068
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=1" } */
+
+#include "vx_binary.h"
+
+DEF_VX_BINARY_CASE_0(uint16_t, -)
+
+/* { dg-final { scan-assembler-not {vsub.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c
new file mode 100644
index 000..f0710e17d39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg

[PATCH v1 0/7] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-11 Thread pan2 . li
From: Pan Li 

This patch series introduces the combine of vec_dup + vsub.vv into
vsub.vx, based on the cost value of GR2VR.  Late-combine performs the
combine if the cost of GR2VR is zero, and rejects it if the cost is
non-zero, such as the values 1 and 15 used in the tests.  There are two
cases for the combine:

Case 0:
 |   ...
 |   vmv.v.x
 | L1:
 |   vsub.vv
 |   J L1
 |   ...

Case 1:
 |   ...
 | L1:
 |   vmv.v.x
 |   vsub.vv
 |   J L1
 |   ...

Both are combined into the following if the cost of GR2VR is zero.
 |   ...
 | L1:
 |   vsub.vx
 |   J L1
 |   ...
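
For orientation, the kind of source loop that gives rise to this pattern
might look like the sketch below.  It is only an illustration: the real
tests in this series are generated by DEF_VX_BINARY_CASE_0 from
vx_binary.h, whose body is not shown in this posting, so the function
name sub_scalar_i32 and its signature are hypothetical.  When such a loop
is auto-vectorized for RVV, the loop-invariant x is typically broadcast
(vmv.v.x) and then subtracted with vsub.vv, which is the pair the combine
targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example, not taken from the testsuite: subtract the
   loop-invariant scalar X from each of the N elements of IN.  */
void
sub_scalar_i32 (int32_t *out, const int32_t *in, int32_t x, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = in[i] - x;
}
```

Whether vsub.vx or vmv.v.x + vsub.vv is emitted for such a loop depends
on the options in use, for example the --param=gpr2vr-cost value passed
by the tests.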

The following test suite passed for this patch series.
* The full rv64gcv regression test.

Pan Li (7):
  RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1
  RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

 gcc/config/riscv/autovec-opt.md   |  17 +
 gcc/config/riscv/riscv.cc |   1 +
 gcc/config/riscv/vector-iterators.md  |   2 +-
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h  | 392 ++
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-1-u8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-2-u8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-3-u8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-4-u8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-5-u8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-i32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-i64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-i8.c|   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-u16.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-u32.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-u64.c   |   8 +
 .../riscv/rvv/autovec/vx_vf/vx_vsub-6-u8.c|   8 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i16.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i32.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i64.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-i8.c  |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u16.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u32.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u64.c |  14 +
 .../rvv/autovec/vx_vf/vx_vsub-run-1-u8.c  |  14 +
 60 files changed, 907 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-i8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-1-u16.c
 create mo

[PATCH v20 2/4] c: Add _Countof operator

2025-05-11 Thread Alejandro Colomar
This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.
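
As a rough illustration of what the operator computes, the classic
sizeof-division idiom below gives the same count for ordinary arrays.
The macro name NELEMS and the two helper functions are hypothetical and
not part of the patch.  Unlike _Countof as described above, the macro
also compiles when handed a pointer and then silently yields a wrong
value, which is the pitfall behind the array-parameter item in FUTURE
DIRECTIONS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for _Countof, usable on compilers without the
   patch.  It yields the element count of the outermost array dimension.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

/* Element count of a one-dimensional array of 5 ints.  */
size_t
count_flat (void)
{
  int a[5];
  return NELEMS (a);
}

/* Element count of a 3x7 array: only the top-level dimension counts.  */
size_t
count_nested (void)
{
  int m[3][7];
  return NELEMS (m);
}
```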

gcc/ChangeLog:

* doc/extend.texi: Document _Countof operator.

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add _Countof operator.

gcc/c/ChangeLog:

* c-tree.h
(c_expr_countof_expr, c_expr_countof_type)
* c-decl.cc
(start_struct, finish_struct)
(start_enum, finish_enum)
* c-parser.cc
(c_parser_sizeof_expression)
(c_parser_countof_expression)
(c_parser_sizeof_or_countof_expression)
(c_parser_unary_expression)
* c-typeck.cc
(build_external_ref)
(record_maybe_used_decl)
(pop_maybe_used)
(is_top_array_vla)
(c_expr_countof_expr, c_expr_countof_type):
Add _Countof operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c
* gcc.dg/countof-vla.c
* gcc.dg/countof.c: Add tests for _Countof operator.

Link: 
Link: 
Link: 
Link: 

Link: 
Link: 
Link: 
Link: 
Link: 
Link: 
Link: 
Link: 
Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  59 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 115 +-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 130 +
 gcc/testsuite/gcc.dg/countof-vla.c |  51 
 gcc/testsuite/gcc.dg/countof.c | 154 +
 11 files changed, 572 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 587d76461e9..f71cb2652d5 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -394,6 +394,7 @@ const struct c_common_resword c_common_reswords[] =
 {
   { "_Alignas",RID_ALIGNAS,   D_CONLY },
   { "_Alignof",RID_ALIGNOF,   D_CONLY },
+  { "_Countof",RID_COUNTOF,   D_CONLY },
   { "_Atomic", RID_ATOMIC,D_CONLY },
   { "_BitInt", RID_BITINT,D_CONLY },
   { "_Bool",   RID_BOOL,  D_CONLY },
@@ -4080,6 +4081,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the _Countof keyword:
+   Return the number of elements of an array.  */
+
+tree
+c_countof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (type_code != ARRAY_TYPE)
+{
+  error_at (loc, "invalid application of %<_Countof%> to type %qT", type);
+  return error_mark_node;
+}
+  if (!COMPLETE_TYPE_P (type))
+{
+  error_at (loc,
+   "invalid application of %<_Countof%> to incomplete type %qT",
+   type);
+  return error_mark_node;
+}
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index cf2228201fa..0bcc4998afe 100644
--- a/gcc/c-family/c-common.def
+++ b/gcc/c-family/c-common.def
@@ -50,6 +50,9 @@ DEFTREECODE (EXCESS_PRECISION_EXPR, "excess_precision_expr", 
tcc_expression, 1)
number.  */
 DEFTREECODE (USERDEF_LITERAL, "userdef_literal", tcc_exceptional, 3)
 
+/* Represents a 'countof' expression.  */
+DEFTREECODE (COUNTOF_EXPR, "countof_expr", tcc_expression, 1)
+
 /* Represents a 'sizeof' expression during C++ template expansion,
or for the purpose of -Wsizeof-pointer-memaccess warning.  */
 DEFTREECODE (SIZEOF_EXPR, "size

[PATCH v20 1/4] contrib/: Add support for Link: tags

2025-05-11 Thread Alejandro Colomar
contrib/ChangeLog:

* gcc-changelog/git_commit.py (GitCommit):
Add support for 'Link:' tags.

Cc: Jason Merrill 
Signed-off-by: Alejandro Colomar 
---
 contrib/gcc-changelog/git_commit.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 5645f80ebb9..5f5f3b9110a 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -188,6 +188,7 @@ CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
 REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
'acked-by: ', 'tested-by: ', 'reported-by: ',
'suggested-by: ')
+LINK_PREFIXES = ('link: ',)
 DATE_FORMAT = '%Y-%m-%d'
 
 
@@ -529,6 +530,8 @@ class GitCommit:
 continue
 elif lowered_line.startswith(REVIEW_PREFIXES):
 continue
+elif lowered_line.startswith(LINK_PREFIXES):
+continue
 else:
 m = cherry_pick_regex.search(line)
 if m:
-- 
2.49.0



[PATCH v20 3/4] c: Add <stdcountof.h>

2025-05-11 Thread Alejandro Colomar
gcc/ChangeLog:

* Makefile.in (USER_H): Add <stdcountof.h>.
* ginclude/stdcountof.h: Add countof macro.

Signed-off-by: Alejandro Colomar 
---
 gcc/Makefile.in   |  1 +
 gcc/ginclude/stdcountof.h | 31 +++
 2 files changed, 32 insertions(+)
 create mode 100644 gcc/ginclude/stdcountof.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e3af923e0e0..8d5d357632e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -481,6 +481,7 @@ USER_H = $(srcdir)/ginclude/float.h \
 $(srcdir)/ginclude/stdalign.h \
 $(srcdir)/ginclude/stdatomic.h \
 $(srcdir)/ginclude/stdckdint.h \
+$(srcdir)/ginclude/stdcountof.h \
 $(EXTRA_HEADERS)
 
 USER_H_INC_NEXT_PRE = @user_headers_inc_next_pre@
diff --git a/gcc/ginclude/stdcountof.h b/gcc/ginclude/stdcountof.h
new file mode 100644
index 000..1d914f40e5d
--- /dev/null
+++ b/gcc/ginclude/stdcountof.h
@@ -0,0 +1,31 @@
+/* Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* ISO C2Y: 7.21 Array count <stdcountof.h>.  */
+
+#ifndef _STDCOUNTOF_H
+#define _STDCOUNTOF_H
+
+#define countof  _Countof
+
+#endif /* stdcountof.h */
-- 
2.49.0
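
A sketch of how user code might pick up the new header while still
building with compilers that lack it.  The fallback macro, the names
array and names_count are assumptions for illustration and are not part
of the patch; the __has_include guard rests on the assumption that a
compiler shipping <stdcountof.h> also implements _Countof.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real header when the compiler provides it.  The nested #if
   keeps compilers without __has_include from choking on the test.  */
#if defined (__has_include)
# if __has_include (<stdcountof.h>)
#  include <stdcountof.h>   /* Provides: #define countof _Countof  */
# endif
#endif

#ifndef countof
/* Fallback (assumption, not from the patch): sizeof-division idiom.  */
# define countof(a) (sizeof (a) / sizeof ((a)[0]))
#endif

static const char *const names[] = { "i8", "i16", "i32", "i64" };

/* Number of entries in NAMES, computed at compile time.  */
size_t
names_count (void)
{
  return countof (names);
}
```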



[PATCH v20 4/4] c: Add -Wpedantic diagnostic for _Countof

2025-05-11 Thread Alejandro Colomar
It is not supported in <= C23 mode.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_sizeof_or_countof_expression):
Add -Wpedantic diagnostic for _Countof in <= C23 mode.

Signed-off-by: Alejandro Colomar 
---
 gcc/c/c-parser.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 87700339394..d2193ad2f34 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10637,6 +10637,10 @@ c_parser_sizeof_or_countof_expression (c_parser 
*parser, enum rid rid)
 
   start = c_parser_peek_token (parser)->location;
 
+  if (rid == RID_COUNTOF)
+pedwarn_c23 (start, OPT_Wpedantic,
+"ISO C does not support %qs before C2Y", op_name);
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   if (rid == RID_COUNTOF)
-- 
2.49.0



Re: [PATCH] hurd: Add OPTION_GLIBC_P and OPTION_GLIBC

2025-05-11 Thread Samuel Thibault
Hello,

Is there any news on this?

Samuel

Samuel Thibault, on Mon 10 Feb 2025 23:08:35 +0100, wrote:
> Hello,
> 
> Is there any news on this?
> 
> Samuel
> 
> Samuel Thibault, on Thu 02 Jan 2025 16:33:43 +0100, wrote:
> > From: Svante Signell 
> > 
> > GNU/Hurd uses glibc just like GNU/Linux.
> > 
> > This is needed for gcc to notice that glibc supports split stack in
> > finish_options.
> > 
> > gcc/ChangeLog:
> > * config/gnu.h (OPTION_GLIBC_P, OPTION_GLIBC): Define.
> > 
> > Patch from Svante Signell for PR go/104290.
> > ---
> >  gcc/config/gnu.h | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/gcc/config/gnu.h b/gcc/config/gnu.h
> > index e2a33baf040..4e921e0d51e 100644
> > --- a/gcc/config/gnu.h
> > +++ b/gcc/config/gnu.h
> > @@ -19,6 +19,10 @@ You should have received a copy of the GNU General 
> > Public License
> >  along with GCC.  If not, see .
> >  */
> >  
> > +/* C libraries used on GNU/Hurd.  */
> > +#define OPTION_GLIBC_P(opts)   (DEFAULT_LIBC == LIBC_GLIBC)
> > +#define OPTION_GLIBC   OPTION_GLIBC_P (&global_options)
> > +
> >  #undef GNU_USER_TARGET_OS_CPP_BUILTINS
> >  #define GNU_USER_TARGET_OS_CPP_BUILTINS()  \
> >  do {   \
> > -- 
> > 2.43.0

-- 
Samuel
 the power supply, I know where it is, it's right next to the sleeping lady
 B: clearly, priorities in life need rethinking
 B: a lady can be found again, an uptime...


[COMMITTED] testsuite: xtensa: add support for effective_target_sync_*

2025-05-11 Thread Max Filippov
Add a new function, check_effective_target_xtensa_atomic, and use it in
check_effective_target_sync_int_long and
check_effective_target_sync_char_short.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_xtensa_atomic): New function.
(check_effective_target_sync_int_long)
(check_effective_target_sync_char_short): Add test for xtensa.
---
 gcc/testsuite/lib/target-supports.exp | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 287e51bbfc66..24d0b3d08e34 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -10145,6 +10145,7 @@ proc check_effective_target_sync_int_long { } {
 || ([istarget arc*-*-*] && [check_effective_target_arc_atomic])
 || [check_effective_target_mips_llsc]
 || [istarget nvptx*-*-*]
+|| ([istarget xtensa*-*-*] && 
[check_effective_target_xtensa_atomic])
 }}]
 }
 
@@ -10182,7 +10183,9 @@ proc check_effective_target_sync_char_short { } {
 || ([istarget riscv*-*-*]
 && ([check_effective_target_riscv_zalrsc]
 || [check_effective_target_riscv_zabha]))
-|| [check_effective_target_mips_llsc] }}]
+|| [check_effective_target_mips_llsc]
+|| ([istarget xtensa*-*-*] && 
[check_effective_target_xtensa_atomic])
+}}]
 }
 
 # Return 1 if thread_fence does not rely on __sync_synchronize
@@ -14407,3 +14410,12 @@ proc 
check_effective_target_speculation_barrier_defined { } {
}
}]
 }
+
+# Return 1 if this is a compiler supporting Xtensa atomic operations
+proc check_effective_target_xtensa_atomic { } {
+return [check_no_compiler_messages xtensa_atomic assembly {
+   #if __XCHAL_HAVE_S32C1I != 1 && __XCHAL_HAVE_EXCLUSIVE != 1
+   #error FOO
+   #endif
+}]
+}
-- 
2.39.5
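
For context, tests gated on sync_int_long and sync_char_short typically
exercise GCC's atomic builtins on the corresponding integer widths.  The
sketch below is not taken from the testsuite; the function names are
hypothetical, and it only illustrates the kind of operation whose
availability the new Xtensa check is meant to signal.

```c
#include <assert.h>

/* Atomically add DELTA to *P and return the previous value (int width).  */
int
fetch_add_int (int *p, int delta)
{
  return __sync_fetch_and_add (p, delta);
}

/* Compare-and-swap on a short: store NEWVAL if *P equals OLDVAL.
   Returns nonzero on success.  */
int
cas_short (short *p, short oldval, short newval)
{
  return __sync_bool_compare_and_swap (p, oldval, newval);
}
```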



Re: [PATCH 61/61] Fix pr54240

2025-05-11 Thread Andrew Pinski
On Mon, Feb 3, 2025 at 1:46 AM Richard Biener
 wrote:
>
> On Fri, Jan 31, 2025 at 7:18 PM Aleksandar Rakic
>  wrote:
> >
> > From: Chao-ying Fu 
>
> OK

Pushed as r16-533 with a slightly reworded commit message that
references the fix for the powerpc testcase:
```
Fix mips pr54240 testcase

Like r9-5152-gd1409ea5a2f759 but for the mips testcase.
```

Thanks,
Andrew Pinski

>
> > gcc/testsuite/
> > * gcc.target/mips/pr54240.c: Scan phiopt2.
> >
> > Cherry-picked 02dd052d4822ca187af075f1fb5301c954844144
> > from https://github.com/MIPS/gcc
> >
> > Signed-off-by: Chao-ying Fu 
> > Signed-off-by: Aleksandar Rakic 
> > ---
> >  gcc/testsuite/gcc.target/mips/pr54240.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/testsuite/gcc.target/mips/pr54240.c 
> > b/gcc/testsuite/gcc.target/mips/pr54240.c
> > index d3976f6cfef..31b793bb8c6 100644
> > --- a/gcc/testsuite/gcc.target/mips/pr54240.c
> > +++ b/gcc/testsuite/gcc.target/mips/pr54240.c
> > @@ -27,4 +27,4 @@ NOMIPS16 int foo(S *s)
> >return next->v;
> >  }
> >
> > -/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt2" } } */
> > --
> > 2.34.1