Re: [PATCH 1/2] c++: add -Wdeprecated-literal-operator [CWG2521]

2024-11-10 Thread Andrew Pinski
On Thu, Oct 3, 2024 at 9:42 AM Jason Merrill  wrote:
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> -- 8< --
>
> C++23 CWG issue 2521 (https://wg21.link/cwg2521) deprecates user-defined
> literal operators declared with the optional space between "" and the
> suffix.
>
> Many testcases used that syntax; I removed the space from most of them, and
> added C++23 warning tests to a few.

I noticed that clang turns this warning on for all language levels.
Should we follow clang here too?
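
For readers who have not looked at CWG2521, here is a minimal sketch of
the two spellings under discussion.  The suffix _kib is a made-up example
name, not something from the patch or the testsuite, and the warning
behaviour described in the comment is taken from the ChangeLog above
rather than verified here.

```cpp
#include <cassert>

// Deprecated by CWG2521: whitespace between "" and the suffix, e.g.
//   constexpr unsigned long long operator "" _kib (unsigned long long n);
// Per the ChangeLog above, -Wdeprecated-literal-operator is meant to flag
// that form and defaults to on in C++23.

// Preferred spelling: the suffix follows "" directly.
constexpr unsigned long long operator""_kib (unsigned long long n)
{
  return n * 1024;
}

// The literal operator is usable in constant expressions.
static_assert (4_kib == 4096, "4_kib should be 4096");
```

Both spellings declare the same function; only the declaration syntax is
affected, so call sites such as 4_kib do not change.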

Thanks,
Andrew

>
> CWG 2521
>
> gcc/ChangeLog:
>
> * doc/invoke.texi: Document -Wdeprecated-literal-operator.
>
> gcc/c-family/ChangeLog:
>
> * c.opt: Add -Wdeprecated-literal-operator.
> * c-opts.cc (c_common_post_options): Default on in C++23.
> * c.opt.urls: Regenerate.
>
> gcc/cp/ChangeLog:
>
> * parser.cc (location_between): New.
> (cp_parser_operator): Handle -Wdeprecated-literal-operator.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp0x/udlit-string-literal.h
> * g++.dg/cpp0x/Wliteral-suffix2.C
> * g++.dg/cpp0x/constexpr-55708.C
> * g++.dg/cpp0x/gnu_fext-numeric-literals.C
> * g++.dg/cpp0x/gnu_fno-ext-numeric-literals.C
> * g++.dg/cpp0x/pr51420.C
> * g++.dg/cpp0x/pr60209-neg.C
> * g++.dg/cpp0x/pr60209.C
> * g++.dg/cpp0x/pr61038.C
> * g++.dg/cpp0x/std_fext-numeric-literals.C
> * g++.dg/cpp0x/std_fno-ext-numeric-literals.C
> * g++.dg/cpp0x/udlit-addr.C
> * g++.dg/cpp0x/udlit-args-neg.C
> * g++.dg/cpp0x/udlit-args.C
> * g++.dg/cpp0x/udlit-args2.C
> * g++.dg/cpp0x/udlit-clink-neg.C
> * g++.dg/cpp0x/udlit-concat-neg.C
> * g++.dg/cpp0x/udlit-concat.C
> * g++.dg/cpp0x/udlit-constexpr.C
> * g++.dg/cpp0x/udlit-cpp98-neg.C
> * g++.dg/cpp0x/udlit-declare-neg.C
> * g++.dg/cpp0x/udlit-embed-quote.C
> * g++.dg/cpp0x/udlit-extended-id-1.C
> * g++.dg/cpp0x/udlit-extended-id-3.C
> * g++.dg/cpp0x/udlit-extern-c.C
> * g++.dg/cpp0x/udlit-friend.C
> * g++.dg/cpp0x/udlit-general.C
> * g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C
> * g++.dg/cpp0x/udlit-implicit-conv-neg.C
> * g++.dg/cpp0x/udlit-inline.C
> * g++.dg/cpp0x/udlit-mangle.C
> * g++.dg/cpp0x/udlit-member-neg.C
> * g++.dg/cpp0x/udlit-namespace.C
> * g++.dg/cpp0x/udlit-nofunc-neg.C
> * g++.dg/cpp0x/udlit-nonempty-str-neg.C
> * g++.dg/cpp0x/udlit-nosuffix-neg.C
> * g++.dg/cpp0x/udlit-nounder-neg.C
> * g++.dg/cpp0x/udlit-operator-neg.C
> * g++.dg/cpp0x/udlit-overflow-neg.C
> * g++.dg/cpp0x/udlit-overflow.C
> * g++.dg/cpp0x/udlit-preproc-neg.C
> * g++.dg/cpp0x/udlit-raw-length.C
> * g++.dg/cpp0x/udlit-raw-op-string-neg.C
> * g++.dg/cpp0x/udlit-raw-op.C
> * g++.dg/cpp0x/udlit-raw-str.C
> * g++.dg/cpp0x/udlit-resolve-char8_t.C
> * g++.dg/cpp0x/udlit-resolve.C
> * g++.dg/cpp0x/udlit-shadow-neg.C
> * g++.dg/cpp0x/udlit-string-length.C
> * g++.dg/cpp0x/udlit-suffix-neg.C
> * g++.dg/cpp0x/udlit-template.C
> * g++.dg/cpp0x/udlit-tmpl-arg-neg.C
> * g++.dg/cpp0x/udlit-tmpl-arg-neg2.C
> * g++.dg/cpp0x/udlit-tmpl-arg.C
> * g++.dg/cpp0x/udlit-tmpl-parms-neg.C
> * g++.dg/cpp0x/udlit-tmpl-parms.C
> * g++.dg/cpp1y/pr57640.C
> * g++.dg/cpp1y/pr88872.C
> * g++.dg/cpp26/unevalstr1.C
> * g++.dg/cpp2a/concepts-pr60391.C
> * g++.dg/cpp2a/consteval-prop21.C
> * g++.dg/cpp2a/nontype-class6.C
> * g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C
> * g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C
> * g++.dg/cpp2a/udlit-class-nttp-ctad.C
> * g++.dg/cpp2a/udlit-class-nttp-neg.C
> * g++.dg/cpp2a/udlit-class-nttp-neg2.C
> * g++.dg/cpp2a/udlit-class-nttp.C
> * g++.dg/ext/is_convertible2.C
> * g++.dg/lookup/pr87269.C
> * g++.dg/cpp0x/udlit_system_header: Adjust for C++23 deprecated
> operator "" _suffix.
> * g++.dg/DRs/dr2521.C: New test.
> ---
>  gcc/doc/invoke.texi   | 12 ++
>  gcc/c-family/c.opt|  4 ++
>  .../g++.dg/cpp0x/udlit-string-literal.h   | 10 ++---
>  gcc/c-family/c-opts.cc|  5 +++
>  gcc/cp/parser.cc  | 33 +--
>  gcc/testsuite/g++.dg/DRs/dr2521.C |  5 +++
>  gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix2.C |  5 ++-
>  gcc/testsuite/g++.dg/cpp0x/constexpr-55708.C  |  2 +-
>  .../g++.dg/cpp0x/gnu_fext-numeric-literals.C  | 32 +++
>  .../cpp0x/gnu_fno-ext-numeric-literals.C  | 32 +++
>  gcc/testsuite/g++.dg/cpp0x/pr51420.C  |  4 +-
>  gcc/testsuite/g++.dg/cpp0x/pr60209-neg.C  | 16 
>  gcc/testsuite/g++.dg/c

[PATCH v3] i386: Zero extend 32-bit address to 64-bit with option -mx32 -maddress-mode=long. [PR 117418]

2024-11-10 Thread Hu, Lin1


OK, added check for target.

Bootstrapped and regtested on x86_64-pc-linux-gnu. OK for trunk?

BRs,
Lin

-maddress-mode=long makes Pmode DImode, so zero-extend the 32-bit address to
64 bits and use a 64-bit register as the pointer, which avoids an ICE.
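
As a rough illustration of what the conversion means at the value level:
the last argument of convert_to_mode in the hunks below is unsignedp, and
passing 1 requests zero extension.  The sketch below models that on plain
integers; the function names are invented for illustration and are not
GCC internals.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: the upper 32 bits become 0 (what unsignedp = 1 asks for).
std::uint64_t zero_extend_addr (std::uint32_t addr32)
{
  return static_cast<std::uint64_t> (addr32);
}

// Sign extension, for contrast: bit 31 is copied into the upper half.
std::uint64_t sign_extend_addr (std::uint32_t addr32)
{
  return static_cast<std::uint64_t> (
    static_cast<std::int64_t> (static_cast<std::int32_t> (addr32)));
}
```

The two only differ for addresses with bit 31 set, which is where a wrong
extension would yield a bad 64-bit pointer.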

gcc/ChangeLog:

PR target/117418
* config/i386/i386-expand.cc (ix86_expand_builtin): Convert
pointer's mode according to Pmode.

gcc/testsuite/ChangeLog:

PR target/117418
* gcc.target/i386/pr117418-1.c: New test.
---
 gcc/config/i386/i386-expand.cc | 12 +++
 gcc/testsuite/gcc.target/i386/pr117418-1.c | 24 ++
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr117418-1.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 6eef27f3fcd..a99ef9613f5 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -14064,6 +14064,9 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
   op1 = expand_normal (arg1);
   op2 = expand_normal (arg2);
 
+  if (GET_MODE (op1) != Pmode)
+   op1 = convert_to_mode (Pmode, op1, 1);
+
   if (!address_operand (op2, VOIDmode))
{
  op2 = convert_memory_address (Pmode, op2);
@@ -14099,6 +14102,9 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
   emit_label (ok_label);
   emit_insn (gen_rtx_SET (target, pat));
 
+  if (GET_MODE (op0) != Pmode)
+   op0 = convert_to_mode (Pmode, op0, 1);
+
   for (i = 0; i < 8; i++)
{
  op = gen_rtx_MEM (V2DImode,
@@ -14123,6 +14129,9 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
if (!REG_P (op0))
  op0 = copy_to_mode_reg (SImode, op0);
 
+   if (GET_MODE (op2) != Pmode)
+ op2 = convert_to_mode (Pmode, op2, 1);
+
op = gen_rtx_REG (V2DImode, GET_SSE_REGNO (0));
emit_move_insn (op, op1);
 
@@ -14160,6 +14169,9 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
if (!REG_P (op0))
  op0 = copy_to_mode_reg (SImode, op0);
 
+   if (GET_MODE (op3) != Pmode)
+ op3 = convert_to_mode (Pmode, op3, 1);
+
/* Force to use xmm0, xmm1 for keylow, keyhi*/
op = gen_rtx_REG (V2DImode, GET_SSE_REGNO (0));
emit_move_insn (op, op1);
diff --git a/gcc/testsuite/gcc.target/i386/pr117418-1.c b/gcc/testsuite/gcc.target/i386/pr117418-1.c
new file mode 100644
index 000..4839b139b79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr117418-1.c
@@ -0,0 +1,24 @@
+/* PR target/117418 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-maddress-mode=long -mwidekl -mx32" } */
+/* { dg-require-effective-target maybe_x32  } */
+/* { dg-final { scan-assembler-times "aesdec128kl" 1 } } */
+/* { dg-final { scan-assembler-times "aesdec256kl" 1 } } */
+/* { dg-final { scan-assembler-times "aesenc128kl" 1 } } */
+/* { dg-final { scan-assembler-times "aesenc256kl" 1 } } */
+/* { dg-final { scan-assembler-times "encodekey128" 1 } } */
+/* { dg-final { scan-assembler-times "encodekey256" 1 } } */
+
+typedef __attribute__((__vector_size__(16))) long long V;
+V a;
+
+void
+foo()
+{
+__builtin_ia32_aesdec128kl_u8 (&a, a, &a);
+__builtin_ia32_aesdec256kl_u8 (&a, a, &a);
+__builtin_ia32_aesenc128kl_u8 (&a, a, &a);
+__builtin_ia32_aesenc256kl_u8 (&a, a, &a);
+__builtin_ia32_encodekey128_u32 (0, a, &a); 
+__builtin_ia32_encodekey256_u32 (0, a, a, &a); 
+}
-- 
2.31.1



[WIP][PATCH] c++: Fix ABI for lambdas declared in alias templates [PR116568]

2024-11-10 Thread Nathaniel Shead
FWIW, here's my WIP patch to fix the lambda in type alias case
"properly".  I've gotten stuck with trying to work out how to set
LAMBDA_EXPR_EXTRA_CONTEXT on the uninstantiated declaration; any
thoughts or suggestions here?

I also found that the hunk 

-   /* Substituting the type might have recursively instantiated this
-  same alias (c++/86171).  */
-   if (use_spec_table && gen_tmpl && DECL_ALIAS_TEMPLATE_P (gen_tmpl)
-   && (spec = retrieve_specialization (gen_tmpl, argvec, hash)))
- {
-   r = spec;
-   break;
- }

is no longer needed for the original testcase and doesn't appear to be
used anymore, but I imagine we might need something similar in a
different way to handle this testcase (which errors since GCC14 but
succeeds before then):

  template  struct A;
  template  using B = decltype([]() -> A::X { return 0; });
  template  struct A {
typedef int X;
typedef B U;
  };
  B b;

I wasn't able to find an exact dup for this so I've created a new issue
for it (PR117530).

-- >8 --

This adds mangling support for lambdas with a mangling context of an
alias template, and gives that context when instantiating such a lambda.

This only currently works for class-scope alias templates, however, due
to the

  if (LAMBDA_EXPR_EXTRA_SCOPE (t))
record_lambda_scope (r);

condition in 'tsubst_lambda_scope'.

For namespace-scope alias templates, we can't easily add the mangling
context: we can't build the TYPE_DECL to record against until after
we've parsed the type (and already recorded lambda scope), as
`start_decl` relies on the type being passed in correctly, and setting
the mangling scope after parsing is too late because e.g.
'template_class_depth' (called from grokfndecl when building the lambda
functions while parsing the type) relies on the LAMBDA_EXPR_EXTRA_SCOPE
already being properly set.  This will also likely matter for
'determine_visibility'.  I'm not sure what a good way to break this
recursive dependency is.

PR c++/116568

gcc/cp/ChangeLog:

* mangle.cc (maybe_template_info): Support getting template info
of alias templates.
(canonicalize_for_substitution): Don't canonicalise aliases.
(decl_mangling_context): Don't treat aliases as lambda closure
types.
(write_unqualified_name): Likewise.
* pt.cc (tsubst_decl): Start lambda scope for alias templates.
(instantiate_template): No longer need to special case alias
templates here.

gcc/testsuite/ChangeLog:

* g++.dg/abi/lambda-ctx4.C: Adjust mangling, include namespace
scope alias templates (XFAILed for now).

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/mangle.cc   | 13 -
 gcc/cp/pt.cc   | 23 ++-
 gcc/testsuite/g++.dg/abi/lambda-ctx4.C | 21 ++---
 3 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 42fcdc34353..a6d1830397c 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -293,7 +293,7 @@ abi_check (int ver)
 static tree
 maybe_template_info (const tree decl)
 {
-  if (TREE_CODE (decl) == TYPE_DECL)
+  if (TREE_CODE (decl) == TYPE_DECL && !TYPE_DECL_ALIAS_P (decl))
 {
   /* TYPE_DECLs are handled specially.  Look at its type to decide
 if this is a template instantiation.  */
@@ -306,7 +306,7 @@ maybe_template_info (const tree decl)
 {
   /* Check if the template is a primary template.  */
   if (DECL_LANG_SPECIFIC (decl) != NULL
- && VAR_OR_FUNCTION_DECL_P (decl)
+ && (VAR_OR_FUNCTION_DECL_P (decl) || TREE_CODE (decl) == TYPE_DECL)
  && DECL_TEMPLATE_INFO (decl)
  && PRIMARY_TEMPLATE_P (DECL_TI_TEMPLATE (decl)))
return DECL_TEMPLATE_INFO (decl);
@@ -403,8 +403,8 @@ write_exception_spec (tree spec)
 static inline tree
 canonicalize_for_substitution (tree node)
 {
-  /* For a TYPE_DECL, use the type instead.  */
-  if (TREE_CODE (node) == TYPE_DECL)
+  /* For a non-alias TYPE_DECL, use the type instead.  */
+  if (TREE_CODE (node) == TYPE_DECL && !TYPE_DECL_ALIAS_P (node))
 node = TREE_TYPE (node);
   if (TYPE_P (node)
   && TYPE_CANONICAL (node) != node
@@ -1045,6 +1045,7 @@ decl_mangling_context (tree decl)
 decl = DECL_TEMPLATE_RESULT (decl);
 
   if (TREE_CODE (decl) == TYPE_DECL
+  && !TYPE_DECL_ALIAS_P (decl)
   && LAMBDA_TYPE_P (TREE_TYPE (decl)))
 {
   tree extra = LAMBDA_TYPE_EXTRA_SCOPE (TREE_TYPE (decl));
@@ -1589,7 +1590,9 @@ write_unqualified_name (tree decl)
   if (TREE_CODE (decl) == TYPE_DECL
   && TYPE_UNNAMED_P (type))
 write_unnamed_type_name (type);
-  else if (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (type))
+  else if (TREE_CODE (decl) == TYPE_DECL
+  && !TYPE_DECL_ALIAS_P (decl)
+  && LAMBDA_TYPE_P (type))
 write_closure_type_na

Re: [PATCH v2] xtensa: Fix the issue in "*extzvsi-1bit_addsubx"

2024-11-10 Thread Max Filippov
On Sat, Nov 9, 2024 at 10:39 PM Takayuki 'January June' Suwa
 wrote:
>
> The second source register of insn "*extzvsi-1bit_addsubx" cannot be the
> same as the destination register, because that register will be overwritten
> with an intermediate value after insn splitting.
>
>  /* example #1 */
>  int test1(int b, int a) {
>return ((a & 1024) ? 4 : 0) + b;
>  }
>
>  ;; result #1 (incorrect)
>  test1:
> extui   a2, a3, 10, 1   ;; overwrites A2 before used
> addx4   a2, a2, a2
> ret.n

Interestingly I couldn't reproduce it with the current gcc mainline.
For me it produces the following for the above source:

test1:
   mov.n   a9, a2
   extui   a2, a3, 10, 1
   addx4   a2, a2, a9
   ret.n

With this change the generated code is one instruction shorter,
so applying it still makes sense.
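
To make the clobber concrete, here is a small C++ model of the sequences
quoted in this thread.  It is only a sketch: the local variables a2 and a3
stand in for the registers, and the helper names are mine, not from the
patch.

```cpp
#include <cassert>

// Example #1 as written in the source.
int test1 (int b, int a)
{
  return ((a & 1024) ? 4 : 0) + b;
}

// Model of the corrected split: the extracted bit lands in a3, so b
// (held in a2) is still intact when addx4 reads it.
int test1_split_ok (int b, int a)
{
  unsigned a2 = b, a3 = a;
  a3 = (a3 >> 10) & 1u;     // extui a3, a3, 10, 1
  a2 = (a3 << 2) + a2;      // addx4 a2, a3, a2
  return static_cast<int> (a2);
}

// Model of the incorrect split: extui overwrites a2 before b is used.
int test1_split_clobbered (int b, int a)
{
  unsigned a2 = b, a3 = a;
  a2 = (a3 >> 10) & 1u;     // extui a2, a3, 10, 1  (b is lost here)
  a2 = (a2 << 2) + a2;      // addx4 a2, a2, a2
  return static_cast<int> (a2);
}
```

With b = 7 and bit 10 of a set, the correct sequence yields 11 while the
clobbered one yields 5, since b never reaches the add.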

> This patch fixes that.
>
>  ;; result #1 (correct)
>  test1:
> extui   a3, a3, 10, 1   ;; uses A3 and then overwrites
> addx4   a2, a3, a2
> ret.n
>
> However, it should be noted that the first source register can be the same
> as the destination without any problems.
>
>  /* example #2 */
>  int test2(int a, int b) {
>return ((a & 1024) ? 4 : 0) + b;
>  }
>
>  ;; result (correct)
>  test2:
> extui   a2, a2, 10, 1   ;; uses A2 and then overwrites
> addx4   a2, a2, a3
> ret.n
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (*extzvsi-1bit_addsubx):
> Add '&' to the destination register constraint to indicate that
> it is 'earlyclobber', append '0' to the first source register
> constraint to indicate that it can be the same as the destination
> register, and change the split condition from 1 to reload_completed
> so that the insn will be split only after RA in order to obtain
>  allocated registers that satisfy the above constraints.
> ---
>   gcc/config/xtensa/xtensa.md | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master

-- 
Thanks.
-- Max


[PATCH] doc: mention STAGE1_CFLAGS

2024-11-10 Thread Sam James
STAGE1_CFLAGS can be used to accelerate the just-built stage1 compiler
which especially improves its performance on some of the large generated
files during bootstrap. It defaults to nothing (i.e. -O0).

The downside is that if the native compiler is buggy, there's a greater
risk of a failed bootstrap. Those with a modern native compiler, ideally
a recent version of GCC, should be able to use -O1 or -O2 without issue
to get a faster build.

PR rtl-optimization/111619
* doc/install.texi (Building a native compiler): Discuss STAGE1_CFLAGS.
---
This came out of a discussion between mjw and me a little while ago while
working on the buildbots. OK?

 gcc/doc/install.texi | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 705440ffd330..4bd60555af9b 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3017,7 +3017,11 @@ bootstrapped, you can use @code{CFLAGS_FOR_TARGET} to modify their
 compilation flags, as for non-bootstrapped target libraries.
 Again, if the native compiler miscompiles the stage1 compiler, you may
 need to work around this by avoiding non-working parts of the stage1
-compiler.  Use @code{STAGE1_TFLAGS} to this end.
+compiler.  Use @code{STAGE1_CFLAGS} and @code{STAGE1_TFLAGS} (for target
+libraries) to this end.  The default value for @code{STAGE1_CFLAGS} is
+@samp{STAGE1_CFLAGS='-O0'} to increase the chances of a successful bootstrap
+with a buggy native compiler.  Changing this to @code{-O1} or @code{-O2}
+can improve bootstrap times, with some greater risk of a failed bootstrap.
 
 If you used the flag @option{--enable-languages=@dots{}} to restrict
 the compilers to be built, only those you've actually enabled will be

base-commit: 00448f9b5a123b4b6b3e6f45d2fecf0a5dca66b3
-- 
2.47.0



Re: [PATCH] Guard truncate from vector float to vector __bf16 with !flag_rounding_math && HONOR_NANS (BFmode).

2024-11-10 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 10:33 AM liuhongt  wrote:
>
> The hardware instruction doesn't raise exceptions, quietly turns sNaN into
> qNaN, and always rounds to nearest (even). Output denormals are always
> flushed to zero and input denormals are always treated as zero. MXCSR
> is neither consulted nor updated.
> Without native instructions, flag_unsafe_math_optimizations is needed for
> the permutation instructions.
> Similarly, guard the extension from vector __bf16 to vector float with
> !HONOR_NANS (BFmode).
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Any comments?
Pushed to trunk.
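
For reference, the inline expansion named in the i386.md comment further
down, (fromi + 0x7fff + ((fromi >> 16) & 1)) >> 16, can be sketched in
plain C++ as below.  This is only an illustration of the arithmetic under
the same assumptions as the patch (no NaNs, round to nearest even); the
function names are invented here and nothing in it is GCC code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern, and back.
std::uint32_t float_bits (float f)
{
  std::uint32_t u;
  std::memcpy (&u, &f, sizeof u);
  return u;
}

float bits_float (std::uint32_t u)
{
  float f;
  std::memcpy (&f, &u, sizeof f);
  return f;
}

// float -> bf16 with round-to-nearest-even.  NaN inputs are not handled,
// matching the !HONOR_NANS (BFmode) condition.
std::uint16_t float_to_bf16 (float f)
{
  std::uint32_t fromi = float_bits (f);
  return static_cast<std::uint16_t> (
    (fromi + 0x7fffu + ((fromi >> 16) & 1u)) >> 16);
}

// bf16 -> float is exact: the 16 bits become the high half of the float.
float bf16_to_float (std::uint16_t h)
{
  return bits_float (static_cast<std::uint32_t> (h) << 16);
}
```

The (fromi >> 16) & 1 term is what breaks ties towards the even result:
an exact halfway value rounds up only when the kept low bit is already 1.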
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (truncsfbf2): Add !flag_rounding_math
> to the condition, require flag_unsafe_math_optimizations when
> native instruction is not available.
> * config/i386/mmx.md (truncv2sfv2bf2): Ditto.
> (extendv2bfv2sf2): Add !HONOR_NANS (BFmode) to the condition.
> * config/i386/sse.md (truncv4sfv4bf2): Add
> !flag_rounding_math to the condition, require
> flag_unsafe_math_optimizations when native instruction is not
> available.
> (truncv8sfv8bf2): Ditto.
> (truncv16sfv16bf2): Ditto.
> (extendv4bfv4sf2): Add !HONOR_NANS (BFmode) to the condition.
> (extendv8bfv8sf2): Ditto.
> (extendv16bfv16sf2): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512bf16-truncsfbf.c: Add -ffast-math.
> * gcc.target/i386/avx512bw-extendbf2sf.c: Ditto.
> * gcc.target/i386/avx512bw-truncsfbf.c: Ditto.
> * gcc.target/i386/sse2-extendbf2sf.c: Ditto.
> * gcc.target/i386/ssse3-truncsfbf.c: Ditto.
> ---
>  gcc/config/i386/i386.md  | 11 ++-
>  gcc/config/i386/mmx.md   |  8 ++--
>  gcc/config/i386/sse.md   | 16 
>  .../gcc.target/i386/avx512bf16-truncsfbf.c   |  2 +-
>  .../gcc.target/i386/avx512bw-extendbf2sf.c   |  2 +-
>  .../gcc.target/i386/avx512bw-truncsfbf.c |  2 +-
>  gcc/testsuite/gcc.target/i386/sse2-extendbf2sf.c |  2 +-
>  gcc/testsuite/gcc.target/i386/ssse3-truncsfbf.c  |  2 +-
>  8 files changed, 33 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c492fe55881..96d5420d9de 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -5694,11 +5694,20 @@ (define_insn "*trunchf2"
> (set_attr "prefix" "evex")
> (set_attr "mode" "HF")])
>
> +/* vcvtneps2bf16 doesn't honor sNaN: it quietly turns sNaN into qNaN,
> +   and it always rounds to even.
> +   flag_unsafe_math_optimizations is needed for psrld.
> +   If we don't expect qNaNs nor sNaNs and can assume rounding
> +   to nearest, we can expand the conversion inline as
> +   (fromi + 0x7fff + ((fromi >> 16) & 1)) >> 16.  */
>  (define_insn "truncsfbf2"
>[(set (match_operand:BF 0 "register_operand" "=x,x,v,Yv")
> (float_truncate:BF
>   (match_operand:SF 1 "register_operand" "0,x,v,Yv")))]
> -  "TARGET_SSE2 && flag_unsafe_math_optimizations && !HONOR_NANS (BFmode)"
> +  "TARGET_SSE2 && !HONOR_NANS (BFmode) && !flag_rounding_math
> +   && (flag_unsafe_math_optimizations
> +   || TARGET_AVXNECONVERT
> +   || (TARGET_AVX512BF16 && TARGET_AVX512VL))"
>"@
>psrld\t{$16, %0|%0, 16}
>%{vex%} vcvtneps2bf16\t{%1, %0|%0, %1}
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 021ac90ae2a..61a4f4d21ea 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -2998,7 +2998,11 @@ (define_expand "truncv2sfv2bf2"
>[(set (match_operand:V2BF 0 "register_operand")
> (float_truncate:V2BF
>   (match_operand:V2SF 1 "nonimmediate_operand")))]
> -  "TARGET_SSSE3 && TARGET_MMX_WITH_SSE"
> +  "TARGET_SSSE3 && TARGET_MMX_WITH_SSE
> +  && !HONOR_NANS (BFmode) && !flag_rounding_math
> +  && (flag_unsafe_math_optimizations
> +  || TARGET_AVXNECONVERT
> +  || (TARGET_AVX512BF16 && TARGET_AVX512VL))"
>  {
>rtx op1 = gen_reg_rtx (V4SFmode);
>rtx op0 = gen_reg_rtx (V4BFmode);
> @@ -3016,7 +3020,7 @@ (define_expand "extendv2bfv2sf2"
>[(set (match_operand:V2SF 0 "register_operand")
> (float_extend:V2SF
>   (match_operand:V2BF 1 "nonimmediate_operand")))]
> -  "TARGET_SSE2 && TARGET_MMX_WITH_SSE"
> +  "TARGET_SSE2 && TARGET_MMX_WITH_SSE && !HONOR_NANS (BFmode)"
>  {
>rtx op0 = gen_reg_rtx (V4SFmode);
>rtx op1 = gen_reg_rtx (V4BFmode);
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 5eeb3ab221a..efe32e5149f 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -30995,7 +30995,10 @@ (define_expand "truncv4sfv4bf2"
>[(set (match_operand:V4BF 0 "register_operand")
>   (float_truncate:V4BF
> (match_operand:V4SF 1 "nonimmediate_operand")))]
> -  "TARGET_SSSE3"
> +  "TARGET_SSSE3 && !HONOR_NANS (BFmode) && !flag_rounding_math
> +   && (flag_unsafe_math_optim

[gcc-wwwdocs PATCH] gcc-15: Mention new ISA and Diamond Rapids support for x86_64 backend

2024-11-10 Thread Haochen Jiang
Hi all,

This patch documents the recently added ISA and arch support for the x86_64
backend in gcc-wwwdocs.

Ok for gcc-wwwdocs?

Thx,
Haochen

---
 htdocs/gcc-15/changes.html | 37 +
 1 file changed, 37 insertions(+)

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index 46dad391..d138942c 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -191,12 +191,49 @@ a work-in-progress.
 IA-32/x86-64
 
 
+  New ISA extension support for Intel AMX-AVX512 was added.
+  AMX-AVX512 intrinsics are available via the -mamx-avx512
+  compiler switch.
+  
+  New ISA extension support for Intel AMX-FP8 was added.
+  AMX-FP8 intrinsics are available via the -mamx-fp8
+  compiler switch.
+  
+  New ISA extension support for Intel AMX-MOVRS was added.
+  AMX-MOVRS intrinsics are available via the -mamx-movrs
+  compiler switch.
+  
+  New ISA extension support for Intel AMX-TF32 was added.
+  AMX-TF32 intrinsics are available via the -mamx-tf32
+  compiler switch.
+  
+  New ISA extension support for Intel AMX-TRANSPOSE was added.
+  AMX-TRANSPOSE intrinsics are available via the -mamx-transpose
+  compiler switch.
+  
   New ISA extension support for Intel AVX10.2 was added.
   AVX10.2 intrinsics are available via the -mavx10.2 or
   -mavx10.2-256 compiler switch with 256-bit vector size
   support. 512-bit vector size support for AVX10.2 intrinsics are
   available via the -mavx10.2-512 compiler switch.
   
+  New ISA extension support for Intel MOVRS was added.
+  MOVRS intrinsics are available via the -mmovrs
+  compiler switch. 128 and 256 bit MOVRS intrinsics are available via the
+  -mmovrs -mavx10.2 compiler switch. 512 bit MOVRS intrinsics
+  are available via the -mmovrs -mavx10.2-512 compiler switch.
+  
+  The EVEX version support for Intel SM4 was added.
+  New 512-bit SM4 intrinsics are available via the
+  -msm4 -mavx10.2-512 compiler switch.
+  
+  GCC now supports the Intel CPU named Diamond Rapids through
+-march=diamondrapids.
+Based on Granite Rapids, the switch further enables the AMX-AVX512,
+AMX-FP8, AMX-MOVRS, AMX-TF32, AMX-TRANSPOSE, APX_F, AVX10.2 with 512 bit
+support, AVX-IFMA, AVX-NE-CONVERT, AVX-VNNI-INT16, AVX-VNNI-INT8,
+CMPccXADD, MOVRS, SHA512, SM3, SM4 and USER_MSR ISA extensions.
+  
   Support for Xeon Phi CPUs (a.k.a. Knight Landing and Knight Mill) were
   removed in GCC 15. GCC will no longer accept -march=knl,
   -march=knm,-mavx5124fmaps,
-- 
2.31.1



[PATCH htdocs] bugs: mention ASAN too

2024-11-10 Thread Sam James
Request that reporters try `-fsanitize=address,undefined` rather than
just `-fsanitize=undefined` when reporting bugs. We sometimes get invalid
bug reports that ASAN would have caught, even if that happens less often
than with issues UBSAN would have caught.
---
OK?

 htdocs/bugs/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/bugs/index.html b/htdocs/bugs/index.html
index c7d2f310..d6556b26 100644
--- a/htdocs/bugs/index.html
+++ b/htdocs/bugs/index.html
@@ -52,7 +52,7 @@ try a current release or development snapshot.
 with gcc -Wall -Wextra and see whether this shows anything
 wrong with your code.  Similarly, if compiling with
 -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations
-makes a difference, or if compiling with -fsanitize=undefined
+makes a difference, or if compiling with -fsanitize=address,undefined
 produces any run-time errors, then your code is probably not correct.
 
 

base-commit: 96aaafdcdba21aad22fb1b745c75a01855dc5f0c
-- 
2.47.0



[PATCH v3] testsuite: arm: Use effective-target for attr-neon* tests

2024-11-10 Thread Torbjörn SVENSSON
Changes since v1:

- Changed from arm_neon to arm_arch_v7a for the required effective target.

Changes since v2:

- Added arm_libc_fp_abi as a required effective target.
- Removed arm_neon and arm_vfp from the required effective targets.


With v3, the tests are now tested in armv7-a context in either hard or softfp
mode, depending on how libc was built.

Ok for trunk and releases/gcc-14?

--

Force armv7-a as the tests require a neon compatible architecture.

gcc/testsuite/ChangeLog:

* gcc.target/arm/attr-neon-builtin-fail.c: Use effective-target
arm_arch_v7a.
* gcc.target/arm/attr-neon-builtin-fail2.c: Likewise.
* gcc.target/arm/attr-neon-fp16.c: Likewise.
* gcc.target/arm/attr-neon2.c: Likewise.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c  | 7 ---
 gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c | 6 --
 gcc/testsuite/gcc.target/arm/attr-neon-fp16.c  | 6 --
 gcc/testsuite/gcc.target/arm/attr-neon2.c  | 7 ---
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
index fb6e0b9cd66..143ad9c4908 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
@@ -1,9 +1,10 @@
 /* Check that calling a neon builtin from a function compiled with vfp fails.  */
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
-/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
 /* { dg-options "-O2" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */
 
 #include <arm_neon.h>
 
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c
index 9cb5a2ebb90..39689b7c3c7 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c
@@ -1,8 +1,10 @@
 /* Check that calling a neon builtin from a function compiled with vfp fails.  */
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
 /* { dg-options "-O2" } */
-/* { dg-add-options arm_vfp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */
 
 extern __simd64_int8_t a, b;
 
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c b/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c
index d7b75645bc4..9bc6ce635e2 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c
@@ -1,8 +1,10 @@
 /* { dg-do compile } */
 /* { dg-skip-if "-mpure-code supports M-profile only and without Neon" { *-*-* } { "-mpure-code" } } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
 /* { dg-options "-mfp16-format=ieee" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon2.c b/gcc/testsuite/gcc.target/arm/attr-neon2.c
index a7a72dac379..db10cfa4928 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon2.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon2.c
@@ -1,8 +1,9 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_neon_ok } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
 /* { dg-options "-Ofast" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */
 
 /* Reset fpu to a value compatible with the next pragmas.  */
 #pragma GCC target ("fpu=vfp")
-- 
2.25.1



Adjust 'libgomp.c/max_vf-*.c' (was: [PATCH 4/4] openmp: Add testcases for omp_max_vf)

2024-11-10 Thread Thomas Schwinge
Hi!

On 2024-11-06T16:18:50+, Andrew Stubbs  wrote:
> On 06/11/2024 15:41, Jakub Jelinek wrote:
>> On Wed, Nov 06, 2024 at 03:27:22PM +, Andrew Stubbs wrote:
>>> [...] requires enabling the offload-dump scanning features previously only 
>>> used
>>> in the libgomp testsuite.  The automake scheme used there isn't a good fit
>>> here, so we probe the known devices manually.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.dg/gomp/gomp.exp: Load scanoffload.exp and scanoffloadtree.exp.
>>> Set offload_targets when available.
>>> * gcc.dg/gomp/max_vf-1.c: New test.
>>> * gcc.dg/gomp/max_vf-2.c: New test.
>>> * gcc.dg/gomp/max_vf-3.c: New test.
>> 
>> I don't see how this can work.  gomp.exp isn't prepared to find the libgomp
>> directory nor add -B options etc.

ACK.

>> Perhaps it appears to work if you have
>> your system gcc's libgomp installed, but that isn't the library that should
>> be used.
>> So, max_vf-1.c test can stay where it is, but the gomp.exp changes shouldn't
>> be done and max_vf-{2,3}.c should move to libgomp/testsuite/libgomp.c/
>
> It worked for me, but I might have an unusual configuration that allows 
> me to test installed toolchains with remote devices, rather than build 
> trees with local devices.

Right.  (And, of course, both ways should work.)


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/max_vf-1.c
> @@ -0,0 +1,47 @@
> +/* Test that omp parallel simd schedule uses the correct max_vf for the
> +   host system, when target directives are present.  */
> +
> +/* { dg-require-effective-target offloading_enabled } */
> +
> +[...]
> +
> +/* Make sure that the max_vf is used as an IFN.
> +{ dg-final { scan-tree-dump-times {GOMP_MAX_VF} 2 "ompexp" { target { x86_64-*-* i?86-*-* } } } } */
> +
> +/* Make sure the max_vf is passed as a temporary variable.
> +{ dg-final { scan-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, D\.[0-9]*, 0\);} 1 "ompexp" { target { x86_64-*-* i?86-*-* } } } } */
> +
> +/* Test SIMD offload devices
> +{ dg-final { scan-offload-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, 64, 0\);} 1 "optimized" { target { offload_gcn } } } }
> +{ dg-final { scan-offload-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, 7, 0\);} 1 "optimized" { target { offload_nvptx } } } } */
> +
> +int main() {}

For configurations where both GCN and nvptx offloading are enabled, we
get FAILs here.  Avoid these via 'only_for_offload_target [...]'.
Pushed to trunk branch commit 730f28b081bea4a749f9b82902446731ec8faa93
"Adjust 'libgomp.c/max_vf-*.c'", see attached.


Grüße
 Thomas


>From 730f28b081bea4a749f9b82902446731ec8faa93 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 9 Nov 2024 13:37:53 +0100
Subject: [PATCH] Adjust 'libgomp.c/max_vf-*.c'

For configurations where both GCN and nvptx offloading are enabled, we get:

PASS: libgomp.c/max_vf-1.c (test for excess errors)
PASS: libgomp.c/max_vf-1.c scan-tree-dump-times ompexp "GOMP_MAX_VF" 2
PASS: libgomp.c/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, D\\.[0-9]*, 0\\);" 1
PASS: libgomp.c/max_vf-1.c scan-amdgcn-amdhsa-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 64, 0\\);" 1
FAIL: libgomp.c/max_vf-1.c scan-nvptx-none-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 64, 0\\);" 1
FAIL: libgomp.c/max_vf-1.c scan-amdgcn-amdhsa-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 7, 0\\);" 1
PASS: libgomp.c/max_vf-1.c scan-nvptx-none-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 7, 0\\);" 1

Avoid these FAILs via 'only_for_offload_target [...]'.  Also, for consistency
with other libgomp test cases, use effective-target specifiers of the libgomp
test suite.  Fix-up for recent commit d334f729e53867b838e867375b3f475ba793d96e
"openmp: Add testcases for omp_max_vf".

	libgomp/
	* testsuite/libgomp.c/max_vf-1.c: Adjust.
	* testsuite/libgomp.c/max_vf-2.c: Likewise.
---
 libgomp/testsuite/libgomp.c/max_vf-1.c | 6 +++---
 libgomp/testsuite/libgomp.c/max_vf-2.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/libgomp/testsuite/libgomp.c/max_vf-1.c b/libgomp/testsuite/libgomp.c/max_vf-1.c
index 9c8d5dc0af9..70f6b86c614 100644
--- a/libgomp/testsuite/libgomp.c/max_vf-1.c
+++ b/libgomp/testsuite/libgomp.c/max_vf-1.c
@@ -1,7 +1,7 @@
 /* Test that omp parallel simd schedule uses the correct max_vf for the
host system, when target directives are present.  */
 
-/* { dg-require-effective-target offloading_enabled } */
+/* { dg-require-effective-target offload_target_any } */
 
 /* { dg-do link } */
 /* { dg-options "-fopenmp -O2 -fdump-tree-ompexp -foffload=-fdump-tree-optimized" } */
@@ -41,7 +41,7 @@ f3 (int *a, int *b, int *c)
 { dg-final { s

[PATCH v18 1/2] contrib/: Add support for Cc: and Link: tags

2024-11-10 Thread Alejandro Colomar
contrib/ChangeLog:

* gcc-changelog/git_commit.py (GitCommit):
Add support for 'Cc: ' and 'Link: ' tags.

Cc: Jason Merrill 
Signed-off-by: Alejandro Colomar 
---
 contrib/gcc-changelog/git_commit.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 87ecb9e1a17..64fb986b74c 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -182,7 +182,8 @@ CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
 
 REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
'acked-by: ', 'tested-by: ', 'reported-by: ',
-   'suggested-by: ')
+   'suggested-by: ', 'cc: ')
+LINK_PREFIXES = ('link: ')
 DATE_FORMAT = '%Y-%m-%d'
 
 
@@ -524,6 +525,8 @@ class GitCommit:
 continue
 elif lowered_line.startswith(REVIEW_PREFIXES):
 continue
+elif lowered_line.startswith(LINK_PREFIXES):
+continue
 else:
 m = cherry_pick_regex.search(line)
 if m:
-- 
2.45.2





[PATCH v18 0/2] c: Add __countof__ operator

2024-11-10 Thread Alejandro Colomar
Hi!

Your favourite operator with the most controversial name comes back with
support for [0], thanks to Martin Uecker.  In movie theaters, and
probably in GCC 16.

For those who have picked a side in the name wars, here's a reminder of a
fair survey (by JeanHeyd) which might end the war with a peace treaty:
.


Changes since v17:

-  Rebase after the recent patches added by Martin, which made [0][n]
   and [*][n] have distinct representation, and thus allowed making
   __countof__(int [0][n]) be a constant expression.

-  Make __countof__(int [0][n]) a constant expression.  Thanks, Martin!
   Update the testsuite to reflect this too, of course.

-  Rename small function in the testsuite (automatic => completed).

See the range-diff below for the exact differences since v17.


Martin, this worked out of the box.  I'll reply to this email with the
regression-test session results; they all passed.  [0] works like a
charm.


Have a lovely day!
Alex


Alejandro Colomar (2):
  contrib/: Add support for Cc: and Link: tags
  c: Add __countof__ operator

 contrib/gcc-changelog/git_commit.py|   5 +-
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 115 ++-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 125 +
 gcc/testsuite/gcc.dg/countof-vla.c |  45 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 12 files changed, 564 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff against v17:
1:  d847dc4a795 = 1:  82100c813c3 contrib/: Add support for Cc: and Link: tags
2:  936f7945fae ! 2:  f8336e4646a c: Add __countof__ operator
@@ Commit message
and somehow magically return the number of elements of the array,
regardless of it being really a pointer.
 
--  Fix support for [0].
-
 gcc/ChangeLog:
 
 * doc/extend.texi: Document __countof__ operator.
@@ gcc/c/c-decl.cc: finish_enum (tree enumtype, tree values, tree 
attributes)
 
  ## gcc/c/c-parser.cc ##
 @@ gcc/c/c-parser.cc: along with GCC; see the file COPYING3.  If not see
- #include "bitmap.h"
- #include "analyzer/analyzer-language.h"
  #include "toplev.h"
+ #include "asan.h"
+ #include "c-family/c-ubsan.h"
 +
 +#define c_parser_sizeof_expression(parser)
\
 +( 
\
@@ gcc/c/c-typeck.cc: c_expr_sizeof_type (location_t loc, struct 
c_type_name *t)
 +static bool
 +is_top_array_vla (tree type)
 +{
-+  bool zero, star, var;
++  bool zero, var;
 +  tree d;
 +
 +  if (TREE_CODE (type) != ARRAY_TYPE)
@@ gcc/c/c-typeck.cc: c_expr_sizeof_type (location_t loc, struct 
c_type_name *t)
 +
 +  d = TYPE_DOMAIN (type);
 +  zero = !TYPE_MAX_VALUE (d);
-+  star = (zero && C_TYPE_VARIABLE_SIZE (type));
-+  if (star)
-+return true;
 +  if (zero)
 +return false;
 +
@@ gcc/testsuite/gcc.dg/countof-compile.c (new)
 +  _Static_assert (__countof__ (int [n][3]) == 7); /* { dg-error "not 
constant" } */
 +  _Static_assert (__countof__ (int [0][3]) == 0);
 +  _Static_assert (__countof__ (int [0]) == 0);
-+
-+  /* FIXME: countof(int [0][n]) should result in a constant expression.  
*/
-+  _Static_assert (__countof__ (int [0][n]) == 0); /* { dg-error "not 
constant" } */
++  _Static_assert (__countof__ (int [0][n]) == 0);
 +}
 
  ## gcc/testsuite/gcc.dg/countof-vla.c (new) ##
@@ gcc/testsuite/gcc.dg/countof-vla.c (new)
 +char (*a)[*][*],
 +int (*x)[__countof__ (*a)]);
 +
-+// Can't test due to bug: 

-+//static int z2[0];
-+//static int y2[__countof__(z2)];
++static int z2[0];
++static int y2[__countof__(z2)];
 
  ## gcc/testsuite/gcc.dg/countof.c (new) ##
 @@
@@ gcc/testsuite/gcc.dg/countof.c (new)
 +}
 +
 +void
-+automatic(void)
++completed (void)
 +{
 +  int a[] = {1, 2, 3};
 +  int z[] = {};
@@ gcc/testsuite/gcc.dg/countof.c (new)
 +main (void)
 +{
 +  array ();
-+  automatic ();
++  completed ();
 +  vla ();
 +  member ();
 +  vla_eval ();

base-commit: 9cbcf8d1de159e6113fafb5dc2feb4a7e467a302
-- 
2.45.2




[PATCH v18 2/2] c: Add __countof__ operator

2024-11-10 Thread Alejandro Colomar
This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.
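
For readers who have not followed the earlier versions of this series, the
behaviour described above can be approximated with the classic sizeof idiom.
The sketch below is not part of the patch: COUNTOF, sum and countof_demo are
hypothetical names used only for illustration.  Unlike the proposed operator,
the macro does not reject pointers, which is the array-parameter problem noted
under FUTURE DIRECTIONS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __countof__: number of elements of an array.
   Unlike the proposed operator, it does not reject pointers.  */
#define COUNTOF(a)  (sizeof (a) / sizeof ((a)[0]))

/* An array parameter is adjusted to a pointer, so the callee cannot
   recover the element count; the caller passes it explicitly.  */
static int
sum (const int *p, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    total += p[i];
  return total;
}

/* Exercises the macro on a one-dimensional and a two-dimensional array.  */
static int
countof_demo (void)
{
  int a[] = {1, 2, 3};
  int m[4][7];

  assert (COUNTOF (a) == 3);
  assert (COUNTOF (m) == 4);      /* top-level dimension only */
  assert (COUNTOF (m[0]) == 7);
  return sum (a, COUNTOF (a));
}
```

Applying COUNTOF to the parameter p inside sum would compile but divide the
size of a pointer by the size of an int, which is why the count is passed in.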

gcc/ChangeLog:

* doc/extend.texi: Document __countof__ operator.

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add __countof__ operator.

gcc/c/ChangeLog:

* c-tree.h
(c_expr_countof_expr, c_expr_countof_type)
* c-decl.cc
(start_struct, finish_struct)
(start_enum, finish_enum)
* c-parser.cc
(c_parser_sizeof_expression)
(c_parser_countof_expression)
(c_parser_sizeof_or_countof_expression)
(c_parser_unary_expression)
* c-typeck.cc
(build_external_ref)
(record_maybe_used_decl)
(pop_maybe_used)
(is_top_array_vla)
(c_expr_countof_expr, c_expr_countof_type):
Add __countof__ operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c
* gcc.dg/countof-vla.c
* gcc.dg/countof.c: Add tests for __countof__ operator.

Link: 
Link: 
Link: 

Link: 
Link: 
Link: 
Link: 
Link: 
Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Cc: Joseph Myers 
Cc: Gabriel Ravier 
Cc: Jakub Jelinek 
Cc: Kees Cook 
Cc: Qing Zhao 
Cc: Jens Gustedt 
Cc: David Brown 
Cc: Florian Weimer 
Cc: Andreas Schwab 
Cc: Timm Baeder 
Cc: Daniel Plakosh 
Cc: "A. Jiang" 
Cc: Eugene Zelenko 
Cc: Aaron Ballman 
Cc: Paul Koning 
Cc: Daniel Lundin 
Cc: Nikolaos Strimpas 
Cc: JeanHeyd Meneide 
Cc: Fernando Borretti 
Cc: Jonathan Protzenko 
Cc: Chris Bazley 
Cc: Ville Voutilainen 
Cc: Alex Celeste 
Cc: Jakub Łukasiewicz 
Cc: Douglas McIlroy 
Cc: Jason Merrill 
Cc: "Gustavo A. R. Silva" 
Cc: Patrizia Kaye 
Cc: Ori Bernstein 
Cc: Robert Seacord 
Cc: Marek Polacek 
Cc: Sam James 
Cc: Richard Biener 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 115 ++-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 125 +
 gcc/testsuite/gcc.dg/countof-vla.c |  45 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 11 files changed, 560 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 06be2a37b4f..77bf8e84847 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -469,6 +469,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__inline",RID_INLINE, 0 },
   { "__inline__",  RID_INLINE, 0 },
   { "__label__",   RID_LABEL,  0 },
+  { "__countof__", RID_COUNTOF,0 },
   { "__null",  RID_NULL,   0 },
   { "__real",  RID_REALPART,   0 },
   { "__real__",RID_REALPART,   0 },
@@ -4074,6 +4075,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the countof keyword:
+   Return the number of elements of an array.  */
+
+tree
+c_countof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (type_code != ARRAY_TYPE)
+{
+  error_at (loc, "invalid application of %<__countof__%> to type %qT", 
type);
+  return error_mark_node;
+}
+  if (!COMPLETE_TYPE_P (type))
+{
+  error_at (loc,
+   "invalid application of %<__countof__%> to incomplete type %qT",
+   type);
+  return error_mark_node;
+}
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index dc49ad09e2f..f2ae784cefe 100644
--- a/gcc/c-family/c-common.def
+++ b/gcc/c-family/c-common.def
@@ -50,6 +50,9 @@ DEFTREECODE (EXCESS_PRECISION_EXPR, "

Re: [PATCH v18 0/2] c: Add __countof__ operator

2024-11-10 Thread Alejandro Colomar
On Sun, Nov 10, 2024 at 11:32:54AM GMT, Alejandro Colomar wrote:
> Hi!
> 
> Your favourite operator with the most controversial name comes back with
> support for [0], thanks to Martin Uecker.  In movie theaters, and
> probably in GCC 16.
> 
> For those who have picked a side in the name wars, here's a reminder of a
> fair survey (by JeanHeyd) which might end the war with a peace treaty:
> .
> 
> 
> Changes since v17:
> 
> -  Rebase after the recent patches added by Martin, which made [0][n]
>and [*][n] have distinct representation, and thus allowed making
>__countof__(int [0][n]) be a constant expression.
> 
> -  Make __countof__(int [0][n]) a constant expression.  Thanks, Martin!
>Update the testsuite to reflect this too, of course.
> 
> -  Rename small function in the testsuite (automatic => completed).
> 
> See the range-diff below for the exact differences since v17.
> 
> 
> Martin, this worked out of the box.  I'll reply to this email with the
> regression-test session results; they all passed.  [0] works like a
> charm.

Regression tests say ok:

alx@debian:~/src/gnu/gcc/len$ git tag len18
alx@debian:~/src/gnu/gcc/len$ git log --oneline gnu/master^..len18
f8336e4646a (HEAD -> len, tag: len18) c: Add __countof__ operator
82100c813c3 contrib/: Add support for Cc: and Link: tags
114abf075c1 (gnu/trunk, gnu/master) c: minor fixes related to arrays of 
unspecified size
alx@debian:~/src/gnu/gcc/len$ git reset gnu/master --h
HEAD is now at 114abf075c1 c: minor fixes related to arrays of unspecified size
alx@debian:~/src/gnu/gcc/len$ mkdir ../len18
alx@debian:~/src/gnu/gcc/len$ cd ../len18
alx@debian:~/src/gnu/gcc/len18$ /bin/time ../len/configure --disable-multilib 
--prefix=/opt/local/gnu/gcc/countof18 |& ts -s | ovr -n 3; echo $?
00:00:04 config.status: creating Makefile
00:00:04 2.74user 1.51system 0:03.82elapsed 111%CPU (0avgtext+0avgdata 
26588maxresident)k
00:00:04 91760inputs+8000outputs (275major+276906minor)pagefaults 0swaps
0
alx@debian:~/src/gnu/gcc/len18$ /bin/time make -j24 bootstrap |& ts -s | ovr -n 
3; echo $?
00:20:34 make[1]: Leaving directory '/home/alx/src/gnu/gcc/len18'
00:20:34 14990.13user 437.20system 20:34.11elapsed 1250%CPU (0avgtext+0avgdata 
1558756maxresident)k
00:20:34 1555888inputs+30810152outputs (19314major+119705948minor)pagefaults 
0swaps
0
alx@debian:~/src/gnu/gcc/len18$ /bin/time make check |& ts -s | ovr -n 3; echo 
$?
06:54:58 make[1]: Leaving directory '/home/alx/src/gnu/gcc/len18'
06:54:58 21595.28user 3410.52system 6:54:57elapsed 100%CPU (0avgtext+0avgdata 
2327800maxresident)k
06:54:58 728800inputs+21757040outputs (2918major+1010304995minor)pagefaults 
0swaps
0
alx@debian:~/src/gnu/gcc/len18$ cd ../len
alx@debian:~/src/gnu/gcc/len$ git merge --ff-only len18
Updating 114abf075c1..f8336e4646a
Fast-forward
 contrib/gcc-changelog/git_commit.py|   5 +-
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 115 ++-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 125 +
 gcc/testsuite/gcc.dg/countof-vla.c |  45 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 12 files changed, 564 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c
alx@debian:~/src/gnu/gcc/len$ cd ..
alx@debian:~/src/gnu/gcc$ mv len18 len18_b4
alx@debian:~/src/gnu/gcc$ mkdir len18
alx@debian:~/src/gnu/gcc$ cd len18
alx@debian:~/src/gnu/gcc/len18$ /bin/time ../len/configure --disable-multilib 
--prefix=/opt/local/gnu/gcc/countof18 |& ts -s | ovr -n 3; echo $?
00:00:03 config.status: creating Makefile
00:00:04 2.91user 1.55system 0:03.89elapsed 114%CPU (0avgtext+0avgdata 
26556maxresident)k
00:00:04 0inputs+8000outputs (0major+280759minor)pagefaults 0swaps
0
alx@debian:~/src/gnu/gcc/len18$ /bin/time make -j24 bootstrap |& ts -s | ovr -n 
3; echo $?
00:20:36 make[1]: Leaving directory '/home/alx/src/gnu/gcc/len18'
00:20:36 15194.00user 433.58system 20:35.69elapsed 1264%CPU (0avgtext+0avgdata 
1565564maxresident)k
00:20:36 3688inputs+30817928outputs (168major+119818585minor)pagefaults 0swaps
0
alx@debian:~/src/gnu/gcc/len18$ /bin/time make check |& ts -s | ovr -n 3; echo 
$?
06:55:35 make[1]: Leaving directory '/home/alx/src/gnu/gcc/len18'
06:55:35 21613.87user 3477.00system 6:55:35elapsed 100%CPU (0avgtext+0avgdata 
2327144maxresident)k
06:55:35 304inputs+21758376outputs (2758major+1012005133minor)pagefaults 0swaps
0
alx@debian:~/src/gnu/gcc/len18$ find -type f | grep '\.sum$' | while read f; do 
diff

[PATCH] testsuite: arm: Update expected assembler for pr43920-2.c test

2024-11-10 Thread Torbjörn SVENSSON
Ok for trunk, releases/gcc-12, releases/gcc-13 and releases/gcc-14?

--

In version 6-2017-q1-update of the "GNU Arm Embedded Toolchain" build,
there are 2 pop instructions. In version 7-2018-q2-update, the next
version that still has a binary build available on launchpad, there is
only a single pop instruction.
When I try to build vanilla GCC in the same version range, I always end
up with a single pop instruction.

Since r12-5301-g04520645038, the generated assembler contains one more
register move, and it's requested in PR103298 to allow it.

gcc/testsuite/ChangeLog:

PR testsuite/103298
* gcc.target/arm/pr43920-2.c: Increase allowed text size and
lower number of expected pop instructions.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.target/arm/pr43920-2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr43920-2.c 
b/gcc/testsuite/gcc.target/arm/pr43920-2.c
index c367d6bc15d..80cc0b7d260 100644
--- a/gcc/testsuite/gcc.target/arm/pr43920-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr43920-2.c
@@ -27,6 +27,6 @@ int getFileStartAndLength (int fd, int *start_, size_t 
*length_)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "pop" 2 } } */
+/* { dg-final { scan-assembler-times "pop" 1 } } */
 /* { dg-final { scan-assembler-times "beq" 3 } } */
-/* { dg-final { object-size text <= 54 { target { ! arm*-*-uclinuxfdpiceabi } 
} } } */
+/* { dg-final { object-size text <= 56 { target { ! arm*-*-uclinuxfdpiceabi } 
} } } */
-- 
2.25.1



[PING^1][PATCH] testsuite: Simplify target test and dg-options for AMO tests

2024-11-10 Thread jeevitha
Ping!

Please review.

Thanks & Regards
Jeevitha

On 15/10/24 12:49 pm, jeevitha wrote:
> Hi All,
> 
> Removed powerpc*-*-* from the target test as it is always true. Simplified
> options by removing -mpower9-misc and -mvsx, which are enabled by default with
> -mdejagnu-cpu=power9. The has_arch_pwr9 check is also true with
> -mdejagnu-cpu=power9, so it has been removed.
> 
> 2024-10-15 Jeevitha Palanisamy 
> 
> gcc/testsuite/
> 
>   * gcc.target/powerpc/amo1.c: Removed powerpc*-*-* from the target and
>   simplified dg-options.
>   * gcc.target/powerpc/amo2.c: Simplified dg-options and added powerpc_vsx
>   target check.
> 
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/amo1.c 
> b/gcc/testsuite/gcc.target/powerpc/amo1.c
> index c5af373b4e9..9a981cd4219 100644
> --- a/gcc/testsuite/gcc.target/powerpc/amo1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/amo1.c
> @@ -1,6 +1,5 @@
> -/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> -/* { dg-options "-mvsx -mpower9-misc -O2" } */
> -/* { dg-additional-options "-mdejagnu-cpu=power9" { target { ! has_arch_pwr9 
> } } } */
> +/* { dg-do compile { target { lp64 } } } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
>  /* { dg-require-effective-target powerpc_vsx } */
>  
>  /* Verify P9 atomic memory operations.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/amo2.c 
> b/gcc/testsuite/gcc.target/powerpc/amo2.c
> index 592f0fb3f92..9e4ff0ce064 100644
> --- a/gcc/testsuite/gcc.target/powerpc/amo2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/amo2.c
> @@ -1,6 +1,6 @@
>  /* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } 
> */
> -/* { dg-options "-O2 -mvsx -mpower9-misc" } */
> -/* { dg-additional-options "-mdejagnu-cpu=power9" { target { ! has_arch_pwr9 
> } } } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx } */
>  
>  #include 
>  #include 
> 
> 
> 



[PING^1][PATCH v3] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-11-10 Thread jeevitha
Ping!

Please review.

Thanks & Regards
Jeevitha

On 14/10/24 5:16 pm, jeevitha wrote:
> Hi All,
> 
> The following patch has been bootstrapped and regtested on powerpc64le-linux.
> 
> PTImode assists in generating even/odd register pairs for 128-bit values.
> When the user specifies PTImode as an attribute, it breaks because there is
> no internal type to handle this mode. To fix this, we have created an
> intPTI_type_internal_node to handle PTImode. We are not documenting this
> __pti_internal type, since users are not encouraged to use this type
> externally.
> 
> 2024-10-14  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/106895
>   * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add
>   RS6000_BTI_INTPTI.
>   * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node for
>   PTImode type.
> 
> gcc/testsuite/
>   PR target/106895
>   * gcc.target/powerpc/pr106895.c: New testcase.
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
> b/gcc/config/rs6000/rs6000-builtin.cc
> index 9bdbae1ecf9..baf17f3b28a 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -756,6 +756,15 @@ rs6000_init_builtins (void)
>else
>  ieee128_float_type_node = NULL_TREE;
>  
> +  /* PTImode to get even/odd register pairs.  */
> +  intPTI_type_internal_node = make_node(INTEGER_TYPE);
> +  TYPE_PRECISION (intPTI_type_internal_node) = GET_MODE_BITSIZE (PTImode);
> +  layout_type (intPTI_type_internal_node);
> +  SET_TYPE_MODE (intPTI_type_internal_node, PTImode);
> +  t = build_qualified_type (intPTI_type_internal_node, TYPE_QUAL_CONST);
> +  lang_hooks.types.register_builtin_type (intPTI_type_internal_node,
> +   "__pti_internal");
> +
>/* Vector pair and vector quad support.  */
>vector_pair_type_node = make_node (OPAQUE_TYPE);
>SET_TYPE_MODE (vector_pair_type_node, OOmode);
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index d460eb06544..1612b3e2fcd 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -2288,6 +2288,7 @@ enum rs6000_builtin_type_index
>RS6000_BTI_ptr_vector_quad,
>RS6000_BTI_ptr_long_long,
>RS6000_BTI_ptr_long_long_unsigned,
> +  RS6000_BTI_INTPTI,
>RS6000_BTI_MAX
>  };
>  
> @@ -2332,6 +2333,7 @@ enum rs6000_builtin_type_index
>  #define uintDI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_UINTDI])
>  #define intTI_type_internal_node  
> (rs6000_builtin_types[RS6000_BTI_INTTI])
>  #define uintTI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_UINTTI])
> +#define intPTI_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_INTPTI])
>  #define float_type_internal_node  
> (rs6000_builtin_types[RS6000_BTI_float])
>  #define double_type_internal_node 
> (rs6000_builtin_types[RS6000_BTI_double])
>  #define long_double_type_internal_node
> (rs6000_builtin_types[RS6000_BTI_long_double])
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106895.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106895.c
> new file mode 100644
> index 000..88516c5a426
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106895.c
> @@ -0,0 +1,17 @@
> +/* PR target/106895 */
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2 -save-temps" } */
> +
> +/* Verify the following generates even/odd register pairs.  */
> +
> +typedef __int128 pti __attribute__((mode(PTI)));
> +
> +void
> +set128 (pti val, pti *mem)
> +{
> +asm("stq %1,%0" : "=m"(*mem) : "r"(val));
> +}
> +
> +/* { dg-final { scan-assembler {\mstq\M} } } */
> +
> 
> 
> 



[PATCH] [v2] Add missing SLP discovery for CFN[_MASK][_LEN]_SCATTER_STORE

2024-11-10 Thread Richard Biener
This was responsible for a bunch of SVE FAILs with --param vect-force-slp=1.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

* tree-vect-slp.cc (arg1_arg3_map): New.
(arg1_arg3_arg4_map): Likewise.
(vect_get_operand_map): Handle IFN_SCATTER_STORE,
IFN_MASK_SCATTER_STORE and IFN_MASK_LEN_SCATTER_STORE.
(vect_build_slp_tree_1): Likewise.
* tree-vect-stmts.cc (vectorizable_store): For SLP masked
gather/scatter record the mask with proper number of copies.
---
 gcc/tree-vect-slp.cc   | 17 -
 gcc/tree-vect-stmts.cc |  6 --
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 8e4ad05e2a4..eebac1955de 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -512,7 +512,9 @@ static const int no_arg_map[] = { 0 };
 static const int arg0_map[] = { 1, 0 };
 static const int arg1_map[] = { 1, 1 };
 static const int arg2_map[] = { 1, 2 };
+static const int arg1_arg3_map[] = { 2, 1, 3 };
 static const int arg1_arg4_map[] = { 2, 1, 4 };
+static const int arg1_arg3_arg4_map[] = { 3, 1, 3, 4 };
 static const int arg3_arg2_map[] = { 2, 3, 2 };
 static const int op1_op0_map[] = { 2, 1, 0 };
 static const int off_map[] = { 1, -3 };
@@ -573,6 +575,13 @@ vect_get_operand_map (const gimple *stmt, bool 
gather_scatter_p = false,
  case IFN_MASK_LEN_GATHER_LOAD:
return arg1_arg4_map;
 
+ case IFN_SCATTER_STORE:
+   return arg1_arg3_map;
+
+ case IFN_MASK_SCATTER_STORE:
+ case IFN_MASK_LEN_SCATTER_STORE:
+   return arg1_arg3_arg4_map;
+
  case IFN_MASK_STORE:
return gather_scatter_p ? off_arg3_arg2_map : arg3_arg2_map;
 
@@ -1187,7 +1196,10 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
*swap,
  if (cfn == CFN_MASK_LOAD
  || cfn == CFN_GATHER_LOAD
  || cfn == CFN_MASK_GATHER_LOAD
- || cfn == CFN_MASK_LEN_GATHER_LOAD)
+ || cfn == CFN_MASK_LEN_GATHER_LOAD
+ || cfn == CFN_SCATTER_STORE
+ || cfn == CFN_MASK_SCATTER_STORE
+ || cfn == CFN_MASK_LEN_SCATTER_STORE)
ldst_p = true;
  else if (cfn == CFN_MASK_STORE)
{
@@ -1473,6 +1485,9 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
*swap,
  && rhs_code != CFN_GATHER_LOAD
  && rhs_code != CFN_MASK_GATHER_LOAD
  && rhs_code != CFN_MASK_LEN_GATHER_LOAD
+ && rhs_code != CFN_SCATTER_STORE
+ && rhs_code != CFN_MASK_SCATTER_STORE
+ && rhs_code != CFN_MASK_LEN_SCATTER_STORE
  && !STMT_VINFO_GATHER_SCATTER_P (stmt_info)
  /* Not grouped loads are handled as externals for BB
 vectorization.  For loop vectorization we can handle
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 28bfd8f4e28..f77a223b0c4 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9163,7 +9163,8 @@ vectorizable_store (vec_info *vinfo,
{
  if (loop_masks)
final_mask = vect_get_loop_mask (loop_vinfo, gsi,
-loop_masks, ncopies,
+loop_masks,
+ncopies * vec_num,
 vectype, j);
  if (vec_mask)
final_mask = prepare_vec_mask (loop_vinfo, mask_vectype,
@@ -9189,7 +9190,8 @@ vectorizable_store (vec_info *vinfo,
{
  if (loop_lens)
final_len = vect_get_loop_len (loop_vinfo, gsi,
-  loop_lens, ncopies,
+  loop_lens,
+  ncopies * vec_num,
   vectype, j, 1);
  else
final_len = size_int (TYPE_VECTOR_SUBPARTS (vectype));
-- 
2.43.0
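
The new map arrays in the patch above follow the layout of the existing ones:
assuming, as those arrays suggest, that the first entry is the number of SLP
operands and the remaining entries are call-argument indices, the layout can be
sketched stand-alone as below.  This is an illustration only, not GCC code:
collect_operands, operand_map_demo and the integer "arguments" are
hypothetical, and the negative entries used by maps such as off_map are
ignored.

```c
#include <assert.h>

/* Same layout as the arrays in the patch: { count, index... }.  */
static const int arg1_arg3_map[] = { 2, 1, 3 };
static const int arg1_arg3_arg4_map[] = { 3, 1, 3, 4 };

/* Hypothetical helper: copy the call arguments selected by MAP into OUT
   and return how many were selected.  ARGS stands in for the argument
   list of an internal-function call.  */
static int
collect_operands (const int *map, const int *args, int *out)
{
  int n = map[0];
  for (int i = 0; i < n; i++)
    out[i] = args[map[i + 1]];
  return n;
}

static int
operand_map_demo (void)
{
  /* Stand-in arguments 0..4 of a call, encoded as plain integers.  */
  const int args[] = { 100, 101, 102, 103, 104 };
  int out[3];

  int n = collect_operands (arg1_arg3_map, args, out);
  assert (n == 2 && out[0] == 101 && out[1] == 103);

  n = collect_operands (arg1_arg3_arg4_map, args, out);
  assert (n == 3 && out[0] == 101 && out[1] == 103 && out[2] == 104);
  return n;
}
```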


Re: [PATCH v18 0/2] c: Add __countof__ operator

2024-11-10 Thread Martin Uecker
Am Sonntag, dem 10.11.2024 um 11:32 +0100 schrieb Alejandro Colomar:
> Hi!
> 
> Your favourite operator with the most controversial name comes back with
> support for [0], thanks to Martin Uecker.  In movie theaters, and
> probably in GCC 16.
> 
> For those who have picked a side in the name wars, here's a reminder of a
> fair survey (by JeanHeyd) which might end the war with a peace treaty:
> .
> 
> 
> Changes since v17:
> 
> -  Rebase after the recent patches added by Martin, which made [0][n]
>and [*][n] have distinct representation, and thus allowed making
>__countof__(int [0][n]) be a constant expression.
> 
> -  Make __countof__(int [0][n]) a constant expression.  Thanks, Martin!
>Update the testsuite to reflect this too, of course.
> 
> -  Rename small function in the testsuite (automatic => completed).
> 
> See the range-diff below for the exact differences since v17.
> 
> 
> Martin, this worked out of the box.  I'll reply to this email with the
> regression-test session results; they all passed.  [0] works like a
> charm.

Yes, it is nice how everything starts to fall into place
once you remove enough complexity and special cases ...

Martin

> 
> 
> Have a lovely day!
> Alex
> 
> 
> Alejandro Colomar (2):
>   contrib/: Add support for Cc: and Link: tags
>   c: Add __countof__ operator
> 
>  contrib/gcc-changelog/git_commit.py|   5 +-
>  gcc/c-family/c-common.cc   |  26 +
>  gcc/c-family/c-common.def  |   3 +
>  gcc/c-family/c-common.h|   2 +
>  gcc/c/c-decl.cc|  22 +++-
>  gcc/c/c-parser.cc  |  62 +++---
>  gcc/c/c-tree.h |   4 +
>  gcc/c/c-typeck.cc  | 115 ++-
>  gcc/doc/extend.texi|  30 +
>  gcc/testsuite/gcc.dg/countof-compile.c | 125 +
>  gcc/testsuite/gcc.dg/countof-vla.c |  45 
>  gcc/testsuite/gcc.dg/countof.c | 150 +
>  12 files changed, 564 insertions(+), 25 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
>  create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
>  create mode 100644 gcc/testsuite/gcc.dg/countof.c
> 
> Range-diff against v17:
> 1:  d847dc4a795 = 1:  82100c813c3 contrib/: Add support for Cc: and Link: tags
> 2:  936f7945fae ! 2:  f8336e4646a c: Add __countof__ operator
> @@ Commit message
> and somehow magically return the number of elements of the array,
> regardless of it being really a pointer.
>  
> --  Fix support for [0].
> -
>  gcc/ChangeLog:
>  
>  * doc/extend.texi: Document __countof__ operator.
> @@ gcc/c/c-decl.cc: finish_enum (tree enumtype, tree values, tree 
> attributes)
>  
>   ## gcc/c/c-parser.cc ##
>  @@ gcc/c/c-parser.cc: along with GCC; see the file COPYING3.  If not see
> - #include "bitmap.h"
> - #include "analyzer/analyzer-language.h"
>   #include "toplev.h"
> + #include "asan.h"
> + #include "c-family/c-ubsan.h"
>  +
>  +#define c_parser_sizeof_expression(parser)  
>   \
>  +(   
>   \
> @@ gcc/c/c-typeck.cc: c_expr_sizeof_type (location_t loc, struct 
> c_type_name *t)
>  +static bool
>  +is_top_array_vla (tree type)
>  +{
> -+  bool zero, star, var;
> ++  bool zero, var;
>  +  tree d;
>  +
>  +  if (TREE_CODE (type) != ARRAY_TYPE)
> @@ gcc/c/c-typeck.cc: c_expr_sizeof_type (location_t loc, struct 
> c_type_name *t)
>  +
>  +  d = TYPE_DOMAIN (type);
>  +  zero = !TYPE_MAX_VALUE (d);
> -+  star = (zero && C_TYPE_VARIABLE_SIZE (type));
> -+  if (star)
> -+return true;
>  +  if (zero)
>  +return false;
>  +
> @@ gcc/testsuite/gcc.dg/countof-compile.c (new)
>  +  _Static_assert (__countof__ (int [n][3]) == 7); /* { dg-error "not 
> constant" } */
>  +  _Static_assert (__countof__ (int [0][3]) == 0);
>  +  _Static_assert (__countof__ (int [0]) == 0);
> -+
> -+  /* FIXME: countof(int [0][n]) should result in a constant expression. 
>  */
> -+  _Static_assert (__countof__ (int [0][n]) == 0); /* { dg-error "not 
> constant" } */
> ++  _Static_assert (__countof__ (int [0][n]) == 0);
>  +}
>  
>   ## gcc/testsuite/gcc.dg/countof-vla.c (new) ##
> @@ gcc/testsuite/gcc.dg/countof-vla.c (new)
>  +  char (*a)[*][*],
>  +  int (*x)[__countof__ (*a)]);
>  +
> -+// Can't test due to bug: 
> 
> -+//static int z2[0];
> -+//static int y2[__countof__(z2)];
> ++static int z2[0];
> ++static int y2[__countof__(z2)];
>  
>   ## gcc/testsuite/gcc.dg/countof.c (new) ##
>  @@
>   

[Patch, fortran] PR109345 - [12/13/14/15 Regression] class(*) variable that is a string array is not handled correctly

2024-11-10 Thread Paul Richard Thomas
Hi All,

The failing testcase came about because the array reference in the TYPE IS
block required the correct value of the span. The fix separates out
unlimited polymorphic expressions in gfc_get_array_span and ensures that
the value returned is the originating array span, rather than the element
size. This is done by extracting the class container and then the class
data.

The other tweak in gfc_get_array_span makes the logic rather clearer by
identifying class dummy references as being the only cases where 'desc' is
not a component of a class container.

OK for mainline and backporting to the affected, active branches after a
couple of weeks?

Paul
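
The difference between the span and the element size can be shown outside the
compiler.  The C sketch below is an analogy only, with hypothetical names
(struct mini_desc, desc_get, span_demo): a descriptor pointing at one member of
an array of structs has an element size of sizeof (int) but a span of
sizeof (struct pair), and indexing has to step by the span.  In the PR the
mismatch arises from an unlimited polymorphic string array rather than from a
struct member.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int key; double payload; };

/* Hypothetical, heavily simplified array descriptor.  */
struct mini_desc
{
  const char *base;   /* address of the first element */
  size_t elem_len;    /* size of one element in bytes */
  size_t span;        /* distance in bytes between consecutive elements */
};

/* Fetch element I, stepping by the span rather than the element size.  */
static int
desc_get (const struct mini_desc *d, size_t i)
{
  return *(const int *) (d->base + i * d->span);
}

static int
span_demo (void)
{
  struct pair v[3] = { {10, 0.5}, {20, 1.5}, {30, 2.5} };
  struct mini_desc d
    = { (const char *) &v[0].key, sizeof (int), sizeof (struct pair) };

  /* Stepping by elem_len instead would land inside the first struct.  */
  assert (d.span != d.elem_len);
  return desc_get (&d, 0) + desc_get (&d, 1) + desc_get (&d, 2);
}
```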
diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index a52bde90bd2..e888b737bec 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -962,6 +962,8 @@ tree
 gfc_get_array_span (tree desc, gfc_expr *expr)
 {
   tree tmp;
+  gfc_symbol *sym = expr->expr_type == EXPR_VARIABLE
+		? expr->symtree->n.sym : NULL;
 
   if (is_pointer_array (desc)
   || (get_CFI_desc (NULL, expr, &desc, NULL)
@@ -983,25 +985,43 @@ gfc_get_array_span (tree desc, gfc_expr *expr)
 	desc = build_fold_indirect_ref_loc (input_location, desc);
   tmp = gfc_conv_descriptor_span_get (desc);
 }
+  else if (UNLIMITED_POLY (expr)
+	   || (sym && UNLIMITED_POLY (sym)))
+{
+  /* Treat unlimited polymorphic expressions separately because
+	 the element size need not be the same as the span.  Obtain
+	 the class container, which is simplified here by there being
+	 no component references.  */
+  if (sym && sym->attr.dummy)
+	{
+	  tmp = gfc_get_symbol_decl (sym);
+	  tmp = GFC_DECL_SAVED_DESCRIPTOR (tmp);
+	  if (INDIRECT_REF_P (tmp))
+	tmp = TREE_OPERAND (tmp, 0);
+	}
+  else
+	{
+	  gcc_assert (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc)));
+	  tmp = TREE_OPERAND (desc, 0);
+	}
+  tmp = gfc_class_data_get (tmp);
+  tmp = gfc_conv_descriptor_span_get (tmp);
+}
   else if (TREE_CODE (desc) == COMPONENT_REF
 	   && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))
 	   && GFC_CLASS_TYPE_P (TREE_TYPE (TREE_OPERAND (desc, 0
 {
-  /* The descriptor is a class _data field and so use the vtable
-	 size for the receiving span field.  */
-  tmp = gfc_get_vptr_from_expr (desc);
+  /* The descriptor is a class _data field. Use the vtable size
+	 since it is guaranteed to have been set and is always OK for
+	 class array descriptors that are not unlimited.  */
+  tmp = gfc_class_vptr_get (TREE_OPERAND (desc, 0));
   tmp = gfc_vptr_size_get (tmp);
 }
-  else if (expr && expr->expr_type == EXPR_VARIABLE
-	   && expr->symtree->n.sym->ts.type == BT_CLASS
-	   && expr->ref->type == REF_COMPONENT
-	   && expr->ref->next->type == REF_ARRAY
-	   && expr->ref->next->next == NULL
-	   && CLASS_DATA (expr->symtree->n.sym)->attr.dimension)
+  else if (sym && sym->ts.type == BT_CLASS && sym->attr.dummy)
 {
-  /* Dummys come in sometimes with the descriptor detached from
-	 the class field or declaration.  */
-  tmp = gfc_class_vptr_get (expr->symtree->n.sym->backend_decl);
+  /* Class dummies usually require extraction from the saved
+	 descriptor, which gfc_class_vptr_get does for us.  */
+  tmp = gfc_class_vptr_get (sym->backend_decl);
   tmp = gfc_vptr_size_get (tmp);
 }
   else
diff --git a/gcc/testsuite/gfortran.dg/character_workout_1.f90 b/gcc/testsuite/gfortran.dg/character_workout_1.f90
index 98133b48960..8f8bdbf0069 100644
--- a/gcc/testsuite/gfortran.dg/character_workout_1.f90
+++ b/gcc/testsuite/gfortran.dg/character_workout_1.f90
@@ -1,7 +1,7 @@
 ! { dg-do run }
 !
 ! Tests fix for PR100120/100816/100818/100819/100821
-! 
+!
 
 program main_p
 
@@ -27,10 +27,10 @@ program main_p
   character(len=m, kind=k), pointer :: pm(:)
   character(len=e, kind=k), pointer :: pe(:)
   character(len=:, kind=k), pointer :: pd(:)
-  
+
   class(*), pointer :: su
   class(*), pointer :: pu(:)
-  
+
   integer :: i, j
 
   nullify(s1, sm, se, sd, su)
@@ -41,7 +41,7 @@ program main_p
   cm(i)(j:j) = char(i*m+j+c-m, kind=k)
 end do
   end do
-  
+
   s1 => c1(n)
   if(.not.associated(s1))  stop 1
   if(.not.associated(s1, c1(n)))   stop 2
diff --git a/gcc/testsuite/gfortran.dg/pr109435.f90 b/gcc/testsuite/gfortran.dg/pr109435.f90
new file mode 100644
index 000..7326c2e71a5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr109435.f90
@@ -0,0 +1,77 @@
+! { dg-do run }
+!
+! Test the fix for PR109435 in which array references in the SELECT TYPE
+! block below failed because the descriptor span was not set correctly.
+!
+! Contributed by Lauren Chilutti  
+!
+program test
+  implicit none
+  type :: t
+character(len=12, kind=4) :: str_array(4)
+integer :: i
+  end type
+  character(len=12, kind=1), target :: str_array(4)
+  character(len=12, kind=4), target :: str_array4(4)
+  type(t) :: str_t (4)
+  integer :: i
+
+  str_array(:) 

[PATCH] testsuite: arm: fast-math-complex-add-half-float.c test should not xfail

2024-11-10 Thread Torbjörn SVENSSON
Ok for trunk?

--

With the change in 15-3128-gde1923f9f4d, this test case no longer xfails.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Remove
xfail from test.

Signed-off-by: Torbjörn SVENSSON 
---
 .../gcc.dg/vect/complex/fast-math-complex-add-half-float.c  | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-complex-add-half-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-complex-add-half-float.c
index 1fa914916ee..a773e796ddc 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-complex-add-half-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-complex-add-half-float.c
@@ -8,7 +8,5 @@
 #define N 200
 #include "complex-add-template.c"
 
-/* Vectorization is failing for these cases.  They should work but for now ignore.  */
-
-/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "vect" } } */
-- 
2.25.1



Re: [committed] contrib: Add 2 further ignored commits

2024-11-10 Thread Jakub Jelinek
On Sun, Nov 10, 2024 at 01:30:06PM -0300, Alexandre Oliva wrote:
> On Nov  9, 2024, Jakub Jelinek  wrote:
> 
> > r15-4998 and r15-5004 had wrong commit message, add those to
> > ignored commits.
> 
> Ugh, sorry and thanks.
> Was that .c vs .cc only, or was there anything else?

I think so.

> I'm surprised the commit-time checker didn't catch them.

I'm surprised too, but don't want to try to push further broken commits just
to double check that. ;)

> It used to, and that was very helpful to avoid typos in filenames.

Yes.
And I think it usually still does, I had one commit rejected because of such
a reason recently.

Jakub



[PATCH] testsuite: arm: Prune incremental link warning

2024-11-10 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14?

--

When the feature "needs_status_wrapper" in dejagnu is used, the
resulting gcc_tg.o file is a regular object file and thus the following
warning will be emitted if doing an incremental link:

.../ld: warning: incremental linking of LTO and non-LTO objects; using -flinker-output=nolto-rel which will bypass whole program optimization

Since the warning causes test cases, like pr61123-enum-size, to fail,
prune it.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lto/lto.exp: Prune incremental link warning if
status wrapper is used.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.target/arm/lto/lto.exp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/lto/lto.exp b/gcc/testsuite/gcc.target/arm/lto/lto.exp
index 4ccb0737253..3f8377bdd3e 100644
--- a/gcc/testsuite/gcc.target/arm/lto/lto.exp
+++ b/gcc/testsuite/gcc.target/arm/lto/lto.exp
@@ -43,6 +43,14 @@ if { ![check_effective_target_lto] } {
 return
 }
 
+# This variable should only apply to tests called in this exp file.
+global dg_runtest_extra_prunes
+set dg_runtest_extra_prunes ""
+if { ![check_effective_target_unwrapped] } {
+# The status wrapper is a regular object file
+lappend dg_runtest_extra_prunes "warning: incremental linking of LTO and non-LTO objects"
+}
+
 gcc_init
 lto_init no-mathlib
 
@@ -60,4 +68,5 @@ foreach src [lsort [find $srcdir/$subdir *_0.c]] {
 lto-execute $src $sid
 }
 
+set dg_runtest_extra_prunes ""
 lto_finish
-- 
2.25.1



Re: [committed] contrib: Add 2 further ignored commits

2024-11-10 Thread Alexandre Oliva
On Nov  9, 2024, Jakub Jelinek  wrote:

> r15-4998 and r15-5004 had wrong commit message, add those to
> ignored commits.

Ugh, sorry and thanks.
Was that .c vs .cc only, or was there anything else?

I'm surprised the commit-time checker didn't catch them.
It used to, and that was very helpful to avoid typos in filenames.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH v2] gccjit: support dynamic alloca stub

2024-11-10 Thread Schrodinger ZHU Yifan
Sorry, I just noticed that the check_value was written incorrectly on the last
line of this patch. The other parts should be fine. Should I correct it now in
a new revision, or is it okay to wait for further reviews first?

On Sun, Nov 10, 2024 at 13:17, Schrodinger ZHU Yifan wrote:
> This patch adds dynamic alloca stubs support to GCCJIT.




Re: [patch,avr] Fix PR117500: Don't ICE on invalid asm operands.

2024-11-10 Thread Denis Chertykov
On Sat, Nov 9, 2024 at 15:51, Georg-Johann Lay wrote:
>
> This patch avoids an internal compiler error when a %i gets an operand
> that's not valid for %i.  It uses output_operand_lossage that outputs
> an ordinary error.
>
> Ok to apply?

Ok, please apply.

Denis.
>
> Johann
>
> --
>
> AVR: target/117500 - Use output_operand_lossage in avr_print_operand.
>
>  PR target/117500
> gcc/
>  * config/avr/avr.cc (avr_print_operand) [code = 'i']: Use
>  output_operand_lossage on bad operands instead of fatal_insn.


[PATCH] testsuite: arm: Mark pr81812.C as xfail for thumb1

2024-11-10 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14?

--

Test fails for Cortex-M0 with:

.../pr81812.C:6:8: error: generic thunk code fails for method 'virtual void ChildNode::_ZTv0_n12_NK9ChildNode5errorEz(...) const' which uses '...'

According to PR108277, it's expected that thumb1 targets do not
support empty virtual functions with ellipsis.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr81812.C: Add xfail for thumb1.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/g++.dg/torture/pr81812.C | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/g++.dg/torture/pr81812.C b/gcc/testsuite/g++.dg/torture/pr81812.C
index d235e237588..b5c621d2beb 100644
--- a/gcc/testsuite/g++.dg/torture/pr81812.C
+++ b/gcc/testsuite/g++.dg/torture/pr81812.C
@@ -1,3 +1,5 @@
+// { dg-xfail-if "PR108277" { arm_thumb1 } }
+
 struct Error {
   virtual void error(... ) const;
 };
-- 
2.25.1



[PATCH v3] [GCCJIT] support dynamic alloca stub

2024-11-10 Thread Schrodinger ZHU Yifan
This patch adds dynamic alloca stubs support to GCCJIT.

DEF_BUILTIN_STUB only defines the enum for builtins instead of
providing the type. Therefore, builtins with stub will lead to
ICE before this patch. This applies to `alloca_with_align`,
`stack_save` and `stack_restore`.

This patch adds special handling for builtins defined by
DEF_BUILTIN_STUB.

Additionally, it fixes that supercontext is not
set for blocks emitted by gccjit. This triggers a SEGV error inside
`fold_builtin_with_align`.

This is the third roll of the patch:

- Fix wrong test cases mistakenly introduced in V2.
- Undo the removal of `gcc_assert` inside get_attrs_tree for non-stub builtin
  functions.
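
The lookup added by this patch caches a type per stub lazily and needs a way to
remember "already tried, nothing available" so that the creation step is not
repeated. The following is a minimal standalone sketch of that caching idea,
not the patch's code: `stub_id`, `fn_type` and `stub_type_cache` are
hypothetical stand-ins, and the sentinel here is the address of a private
static object rather than a cast integer constant.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the builtin enum and the recorded type.
enum stub_id
{
  STUB_ALLOCA_WITH_ALIGN,
  STUB_STACK_SAVE,
  STUB_UNSUPPORTED,
  STUB_COUNT
};

struct fn_type
{
  int num_params;
};

class stub_type_cache
{
public:
  // Return the cached type for ID, creating it on first use;
  // nullptr means that no type is known for this stub.
  const fn_type *
  get (stub_id id)
  {
    assert (id < STUB_COUNT);
    if (m_types[id] == nullptr)
      {
        m_types[id] = make (id);
        ++m_make_calls;
      }
    return m_types[id] == unsupported () ? nullptr : m_types[id];
  }

  // Number of times the creation step actually ran.
  int make_calls () const { return m_make_calls; }

private:
  // Sentinel marking "already tried, nothing available".
  static const fn_type *
  unsupported ()
  {
    static const fn_type marker = { -1 };
    return &marker;
  }

  // Build the type for ID, or return the sentinel for unknown stubs.
  static const fn_type *
  make (stub_id id)
  {
    static const fn_type alloca_with_align = { 2 };
    static const fn_type stack_save = { 0 };
    switch (id)
      {
      case STUB_ALLOCA_WITH_ALIGN:
        return &alloca_with_align;
      case STUB_STACK_SAVE:
        return &stack_save;
      default:
        return unsupported ();
      }
  }

  std::array<const fn_type *, STUB_COUNT> m_types = {};
  int m_make_calls = 0;
};
```

Storing a distinct sentinel instead of leaving the slot null is what lets a
repeated query for an unsupported stub return immediately.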

gcc/jit/ChangeLog:

* jit-builtins.cc (builtins_manager::make_builtin_function): Add stub
type handling.
(builtins_manager::make_type_for_stub): Add stub type handling.
(builtins_manager::get_type_for_stub): Add stub type handling.
(builtins_manager::get_attrs_tree): Add stub attribute handling.
(builtins_manager::get_attrs_tree_for_stub): Add stub attribute
handling.
* jit-builtins.h: Add new functions for stubs.
* jit-playback.cc (postprocess): Always set supercontext.

gcc/testsuite/ChangeLog:

* jit.dg/test-aligned-alloca.c: New test.
* jit.dg/test-stack-save-restore.c: New test.
---
 gcc/jit/jit-builtins.cc   |  68 +-
 gcc/jit/jit-builtins.h|   7 +
 gcc/jit/jit-playback.cc   |   1 +
 gcc/testsuite/jit.dg/test-aligned-alloca.c| 121 ++
 .../jit.dg/test-stack-save-restore.c  | 114 +
 5 files changed, 309 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-aligned-alloca.c
 create mode 100644 gcc/testsuite/jit.dg/test-stack-save-restore.c

diff --git a/gcc/jit/jit-builtins.cc b/gcc/jit/jit-builtins.cc
index 0c13c8db586..5f61775beb7 100644
--- a/gcc/jit/jit-builtins.cc
+++ b/gcc/jit/jit-builtins.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "jit-playback.h"
 #include "stringpool.h"
+#include "tree-core.h"
 
 #include "jit-builtins.h"
 
@@ -185,7 +186,8 @@ builtins_manager::make_builtin_function (enum built_in_function builtin_id)
 {
   const struct builtin_data& bd = builtin_data[builtin_id];
   enum jit_builtin_type type_id = bd.type;
-  recording::type *t = get_type (type_id);
+  recording::type *t = type_id == BT_LAST ? get_type_for_stub (builtin_id)
+: get_type (type_id);
   if (!t)
 return NULL;
   recording::function_type *func_type = t->as_a_function_type ();
@@ -333,6 +335,52 @@ builtins_manager::get_type (enum jit_builtin_type type_id)
   return m_types[type_id];
 }
 
+/* Create the recording::type for special builtins whose types are not defined
+   in builtin-types.def.  */
+
+recording::type *
+builtins_manager::make_type_for_stub (enum built_in_function builtin_id)
+{
+  switch (builtin_id)
+{
+default:
+  return reinterpret_cast<recording::type *> (-1);
+case BUILT_IN_ALLOCA_WITH_ALIGN:
+  {
+   recording::type *p = m_ctxt->get_type (GCC_JIT_TYPE_SIZE_T);
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   recording::type *params[2] = { p, p };
+   return m_ctxt->new_function_type (r, 2, params, false);
+  }
+case BUILT_IN_STACK_SAVE:
+  {
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   return m_ctxt->new_function_type (r, 0, nullptr, false);
+  }
+case BUILT_IN_STACK_RESTORE:
+  {
+   recording::type *p = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID);
+   recording::type *params[1] = { p };
+   return m_ctxt->new_function_type (r, 1, params, false);
+  }
+}
+}
+
+/* Get the recording::type for a given type of builtin function,
+   by ID, creating it if it doesn't already exist.  */
+
+recording::type *
+builtins_manager::get_type_for_stub (enum built_in_function type_id)
+{
+  if (m_types[type_id] == nullptr)
+m_types[type_id] = make_type_for_stub (type_id);
+  recording::type *t = m_types[type_id];
+  if (reinterpret_cast (t) == -1)
+return nullptr;
+  return t;
+}
+
 /* Create the recording::type for a given type of builtin function.  */
 
 recording::type *
@@ -661,10 +709,26 @@ tree
 builtins_manager::get_attrs_tree (enum built_in_function builtin_id)
 {
   enum built_in_attribute attr = builtin_data[builtin_id].attr;
+  if (attr == ATTR_LAST)
+return get_attrs_tree_for_stub (builtin_id);
   return get_attrs_tree (attr);
 }
 
-/* As above, but for an enum built_in_attribute.  */
+/* Get attributes for builtin stubs.  */
+
+tree
+builtins_manager::get_attrs_tree_for_stub (enum built_in_function builtin_id)
+{
+  switch (builtin_id)
+{
+default:
+  return NULL_TREE;
+case BUILT_IN_ALLOCA_WITH_ALIGN:
+  return get_attrs_tree (BUILT_IN_ALLOCA);

[PATCH v2] gccjit: support dynamic alloca stub

2024-11-10 Thread Schrodinger ZHU Yifan
This patch adds dynamic alloca stubs support to GCCJIT.

DEF_BUILTIN_STUB only defines the enum for builtins instead of
providing the type. Therefore, builtins with stub will lead to
ICE before this patch. This applies to `alloca_with_align`,
`stack_save` and `stack_restore`.

This patch adds special handling for builtins defined by
DEF_BUILTIN_STUB.

Additionally, it fixes that supercontext is not
set for blocks emitted by gccjit. This triggers a SEGV error inside
`fold_builtin_with_align`.

gcc/jit/ChangeLog:

* jit-builtins.cc (builtins_manager::make_builtin_function):
(builtins_manager::make_type_for_stub):
(builtins_manager::get_type_for_stub):
(builtins_manager::get_attrs_tree):
(builtins_manager::get_attrs_tree_for_stub):
* jit-builtins.h:
* jit-playback.cc (postprocess):

gcc/testsuite/ChangeLog:

* jit.dg/test-aligned-alloca.c: New test.
* jit.dg/test-stack-save-restore.c: New test.

---
 gcc/jit/jit-builtins.cc   |  69 +-
 gcc/jit/jit-builtins.h|   7 +
 gcc/jit/jit-playback.cc   |   1 +
 gcc/testsuite/jit.dg/test-aligned-alloca.c| 121 ++
 .../jit.dg/test-stack-save-restore.c  | 114 +
 5 files changed, 309 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-aligned-alloca.c
 create mode 100644 gcc/testsuite/jit.dg/test-stack-save-restore.c

diff --git a/gcc/jit/jit-builtins.cc b/gcc/jit/jit-builtins.cc
index 0c13c8db586..06affd66634 100644
--- a/gcc/jit/jit-builtins.cc
+++ b/gcc/jit/jit-builtins.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "jit-playback.h"
 #include "stringpool.h"
+#include "tree-core.h"
 
 #include "jit-builtins.h"
 
@@ -185,7 +186,8 @@ builtins_manager::make_builtin_function (enum built_in_function builtin_id)
 {
   const struct builtin_data& bd = builtin_data[builtin_id];
   enum jit_builtin_type type_id = bd.type;
-  recording::type *t = get_type (type_id);
+  recording::type *t = type_id == BT_LAST ? get_type_for_stub (builtin_id)
+: get_type (type_id);
   if (!t)
 return NULL;
   recording::function_type *func_type = t->as_a_function_type ();
@@ -333,6 +335,52 @@ builtins_manager::get_type (enum jit_builtin_type type_id)
   return m_types[type_id];
 }
 
+/* Create the recording::type for special builtins whose types are not defined
+   in builtin-types.def.  */
+
+recording::type *
+builtins_manager::make_type_for_stub (enum built_in_function builtin_id)
+{
+  switch (builtin_id)
+{
+default:
+  return reinterpret_cast<recording::type *> (-1);
+case BUILT_IN_ALLOCA_WITH_ALIGN:
+  {
+   recording::type *p = m_ctxt->get_type (GCC_JIT_TYPE_SIZE_T);
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   recording::type *params[2] = { p, p };
+   return m_ctxt->new_function_type (r, 2, params, false);
+  }
+case BUILT_IN_STACK_SAVE:
+  {
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   return m_ctxt->new_function_type (r, 0, nullptr, false);
+  }
+case BUILT_IN_STACK_RESTORE:
+  {
+   recording::type *p = m_ctxt->get_type (GCC_JIT_TYPE_VOID_PTR);
+   recording::type *r = m_ctxt->get_type (GCC_JIT_TYPE_VOID);
+   recording::type *params[1] = { p };
+   return m_ctxt->new_function_type (r, 1, params, false);
+  }
+}
+}
+
+/* Get the recording::type for a given type of builtin function,
+   by ID, creating it if it doesn't already exist.  */
+
+recording::type *
+builtins_manager::get_type_for_stub (enum built_in_function type_id)
+{
+  if (m_types[type_id] == nullptr)
+m_types[type_id] = make_type_for_stub (type_id);
+  recording::type *t = m_types[type_id];
+  if (reinterpret_cast (t) == -1)
+return nullptr;
+  return t;
+}
+
 /* Create the recording::type for a given type of builtin function.  */

 
 recording::type *
@@ -661,15 +709,30 @@ tree
 builtins_manager::get_attrs_tree (enum built_in_function builtin_id)
 {
   enum built_in_attribute attr = builtin_data[builtin_id].attr;
+  if (attr == ATTR_LAST)
+return get_attrs_tree_for_stub (builtin_id);
   return get_attrs_tree (attr);
 }
 
-/* As above, but for an enum built_in_attribute.  */
+/* Get attributes for builtin stubs.  */
+
+tree
+builtins_manager::get_attrs_tree_for_stub (enum built_in_function builtin_id)
+{
+  switch (builtin_id)
+{
+default:
+  return NULL_TREE;
+case BUILT_IN_ALLOCA_WITH_ALIGN:
+  return get_attrs_tree (BUILT_IN_ALLOCA);
+}
+}
+
+/* As get_attrs_tree, but for an enum built_in_attribute.  */
 
 tree
 builtins_manager::get_attrs_tree (enum built_in_attribute attr)
 {
-  gcc_assert (attr < ATTR_LAST);
   if (!m_attributes [attr])
 m_attributes [attr] = make_attrs_tree (attr);
   return m_attributes [attr];
diff --git a/gcc/jit/jit-builtins.h b/gcc/jit/jit-builtins.h
index 17e1

Re: [Patch, fortran] PR109345 - [12/13/14/15 Regression] class(*) variable that is a string array is not handled correctly

2024-11-10 Thread Harald Anlauf

Hi Paul,

this looks good to me for mainline as well as backports ...

... except that the PR number should be corrected (109345 instead of
109435) in the testcase and the commit message (ChangeLogs).

Thanks for the patch!

Harald

On 10.11.24 at 14:52, Paul Richard Thomas wrote:

> Hi All,
>
> The failing testcase came about because the array reference in the TYPE IS
> block required the correct value of the span. The fix separates out
> unlimited polymorphic expressions in gfc_get_array_span and ensures that
> the value returned is the originating array span, rather than the element
> size. This is done by extracting the class container and then the class
> data.
>
> The other tweak in gfc_get_array_span makes the logic rather clearer by
> identifying class dummy references as being the only cases where 'desc' is
> not a component of a class container.
>
> OK for mainline and backporting to the affected, active branches after a
> couple of weeks?
>
> Paul





Re: [patch, Fortran] Reject UNSIGNED for COMPLEX

2024-11-10 Thread Harald Anlauf

Hi Thomas,

the patch is basically fine.

I am wondering if we should create a new helper function that is
the opposite of type_check ("type_cannot_be"), so that we avoid
redundant code at the source level.  It may not be worth it yet,
so your choice.

Furthermore, if you planned to list intrinsics alphabetically,

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 429d8461f8f..00276b5b45d 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi

this part needs to be corrected (in my counting, M comes before S):

@@ -2780,6 +2781,7 @@ The following intrinsics take unsigned arguments:
 @item @code{BLE}, @pxref{BLE}
 @item @code{BLT}, @pxref{BLT}
 @item @code{CSHIFT}, @pxref{CSHIFT}
+@item @code{CMPLX}, @pxref{CMPLX}
 @item @code{DIGITS}, @pxref{DIGITS}
 @item @code{DOT_PRODUCT}, @pxref{DOT_PRODUCT}
 @item @code{DSHIFTL}, @pxref{DSHIFTL}

Not being a native speaker, I stumbled over this:

diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 9d0b752670b..d11d37761d9 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi

@@ -3637,9 +3638,9 @@ Elemental function
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
 @item @var{X} @tab The type may be @code{INTEGER}, @code{REAL},
-or @code{COMPLEX}.
+@code{COMPLEX} or @code{UNSIGNED}.
 @item @var{Y} @tab (Optional; only allowed if @var{X} is not
-@code{COMPLEX}.)  May be @code{INTEGER} or @code{REAL}.
+@code{COMPLEX}.)  May be @code{INTEGER}, @code{REAL} or @code{UNSIGNED}.
   ^^^ Shouldn't one add "The type" before "may be"?

 @item @var{KIND} @tab (Optional) A scalar @code{INTEGER} constant
 expression indicating the kind parameter of the result.
 @end multitable


OK for mainline after considering the above comments.

Thanks for the patch!

Harald

On 09.11.24 at 17:53, Thomas Koenig wrote:

> Hello world,
>
> the attached patch rejects UNSIGNED arguments for the COMPLEX
> function, which is an extension.  It also documents CMPLX,
> INT and REAL as taking UNSIGNED arguments.
>
> Regression-tested. OK for trunk?
>
> Best regards
>
>  Thomas
>
> gcc/fortran/ChangeLog:
>
>  * check.cc (gfc_check_complex): Reject UNSIGNED.
>  * gfortran.texi: Update example program.  Note that
>  CMPLX, INT and REAL also take unsigned arguments.
>  * intrinsic.texi (CMPLX): Document UNSIGNED.
>  (INT): Likewise.
>  (REAL): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>  * gfortran.dg/unsigned_41.f90: New test.





[PATCH v3 4/7] i386: Adjust apx-ndd.c for frontend promotion removal

2024-11-10 Thread H.J. Lu
Since the C frontend no longer promotes integer arguments smaller than int,
the apx-ndd.c codegen is slightly different:

--- apx-ndd.s (original)2024-11-10 06:07:09.894876973 +0800
+++ apx-ndd.s (updated) 2024-11-10 06:06:59.371860565 +0800
@@ -17,7 +17,7 @@ foo_add_char:
 foo1_add_char:
 .LFB1:
.cfi_startproc
-   leal    (%rsi,%rdi), %eax
+   leal    (%rdi,%rsi), %eax
ret
.cfi_endproc
 .LFE1:
@@ -50,7 +50,7 @@ foo_add_short:
 foo1_add_short:
 .LFB4:
.cfi_startproc
-   leal    (%rsi,%rdi), %eax
+   leal    (%rdi,%rsi), %eax
ret
.cfi_endproc
 .LFE4:
@@ -413,7 +413,7 @@ foo_and_char:
 foo1_and_char:
 .LFB37:
.cfi_startproc
-   andl    %edi, %esi, %eax
+   andl    %esi, %edi, %eax
ret
.cfi_endproc
 .LFE37:
@@ -435,7 +435,7 @@ foo_and_short:
 foo1_and_short:
 .LFB39:
.cfi_startproc
-   andl    %edi, %esi, %eax
+   andl    %esi, %edi, %eax
ret
.cfi_endproc
 .LFE39:
@@ -501,7 +501,7 @@ foo_or_char:
 foo1_or_char:
 .LFB45:
.cfi_startproc
-   orl %edi, %esi, %eax
+   orl %esi, %edi, %eax
ret
.cfi_endproc
 .LFE45:
@@ -523,7 +523,7 @@ foo_or_short:
 foo1_or_short:
 .LFB47:
.cfi_startproc
-   orl %edi, %esi, %eax
+   orl %esi, %edi, %eax
ret
.cfi_endproc
 .LFE47:
@@ -589,7 +589,7 @@ foo_xor_char:
 foo1_xor_char:
 .LFB53:
.cfi_startproc
-   xorl    %edi, %esi, %eax
+   xorl    %esi, %edi, %eax
ret
.cfi_endproc
 .LFE53:
@@ -611,7 +611,7 @@ foo_xor_short:
 foo1_xor_short:
 .LFB55:
.cfi_startproc
-   xorl    %edi, %esi, %eax
+   xorl    %esi, %edi, %eax
ret
.cfi_endproc
 .LFE55:
@@ -1018,7 +1018,7 @@ foo4_rol_uint64_t:
 foo1_imul_short:
 .LFB92:
.cfi_startproc
-   imull   %edi, %esi, %eax
+   imull   %esi, %edi, %eax
ret
.cfi_endproc
 .LFE92:

Adjust the assembler scans.

PR middle-end/14907
* gcc.target/i386/apx-ndd.c: Adjusted.

Signed-off-by: H.J. Lu 
---
 gcc/testsuite/gcc.target/i386/apx-ndd.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c
index ce77630a47c..2b2f4fc4b0f 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -188,16 +188,13 @@ FOO2 (int64_t, imul, *)
 /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%(?:r|e)di\\), %al" 1 } } */
 /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%(?:r|e)di\\), %(?:|r|e)ax" 3 } } */
-/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */
-/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */
+/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "orb\[^\n\r]*1, \\(%(?:r|e)di\\), %al" 2} } */
 /* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]*1, \\(%(?:r|e)di\\), %(?:|r|e)ax" 6 } } */
-/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 4 } } */
-/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
+/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 8 } } */
 /* { dg-final { scan-assembler-times "xorb\[^\n\r]*1, \\(%(?:r|e)di\\), %al" 1 } } */
 /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%(?:r|e)di\\), %(?:|r|e)ax" 3 } } */
-/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */
-/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */
+/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "sal(?:b|l|w|q)\[^\n\r]*1, \\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*1, \\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
-- 
2.47.0



[PATCH v3 2/7] Add expand_promote_outgoing_argument

2024-11-10 Thread H.J. Lu
Since the C/C++/Ada frontends no longer promote integer arguments smaller
than int, add expand_promote_outgoing_argument to promote them when expanding
builtin functions.
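
As a rough illustration of what promoting an argument smaller than int means
at the value level, the plain C sketch below widens a char-sized value to a
32-bit slot the way a sign- or zero-extending move would.  The helper names
are hypothetical; this is not the body of expand_promote_outgoing_argument,
which operates on RTL and is not quoted in this message.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only: widen an 8-bit argument
   to a 32-bit slot, as happens on a target that passes small integers
   as int.  */

/* Signed source type: sign-extend (movsbl on x86).  */
uint32_t
widen_signed_char (signed char c)
{
  return (uint32_t) (int32_t) c;
}

/* Unsigned source type: zero-extend (movzbl on x86).  */
uint32_t
widen_unsigned_char (unsigned char c)
{
  return (uint32_t) c;
}
```

The two helpers differ only in how the upper 24 bits are filled, which is
why the signedness of the argument type matters when the promotion is done.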

PR middle-end/14907
* expr.cc (expand_promote_outgoing_argument): New function.
* expr.h (expand_promote_outgoing_argument): New prototype.
* config/i386/i386-expand.cc (ix86_expand_binop_builtin): Call
expand_promote_outgoing_argument to expand the outgoing
argument.
(ix86_expand_multi_arg_builtin): Likewise.
(ix86_expand_unop_vec_merge_builtin): Likewise.
(ix86_expand_sse_compare): Likewise.
(ix86_expand_sse_comi): Likewise.
(ix86_expand_sse_round): Likewise.
(ix86_expand_sse_round_vec_pack_sfix): Likewise.
(ix86_expand_sse_ptest): Likewise.
(ix86_expand_sse_pcmpestr): Likewise.
(ix86_expand_sse_pcmpistr): Likewise.
(ix86_expand_args_builtin): Likewise.
(ix86_expand_sse_comi_round): Likewise.
(ix86_expand_round_builtin): Likewise.
(ix86_expand_special_args_builtin): Likewise.
(ix86_expand_vec_init_builtin): Likewise.
(ix86_expand_vec_ext_builtin): Likewise.
(ix86_expand_builtin): Likewise.

Signed-off-by: H.J. Lu 
---
 gcc/config/i386/i386-expand.cc | 244 -
 gcc/expr.cc                    |  18 +++
 gcc/expr.h                     |   1 +
 3 files changed, 141 insertions(+), 122 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 5c4a8e07d62..ce887d96f6a 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -10415,8 +10415,8 @@ ix86_expand_binop_builtin (enum insn_code icode, tree 
exp, rtx target)
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
   tree arg1 = CALL_EXPR_ARG (exp, 1);
-  rtx op0 = expand_normal (arg0);
-  rtx op1 = expand_normal (arg1);
+  rtx op0 = expand_promote_outgoing_argument (arg0);
+  rtx op1 = expand_promote_outgoing_argument (arg1);
   machine_mode tmode = insn_data[icode].operand[0].mode;
   machine_mode mode0 = insn_data[icode].operand[1].mode;
   machine_mode mode1 = insn_data[icode].operand[2].mode;
@@ -10564,7 +10564,7 @@ ix86_expand_multi_arg_builtin (enum insn_code icode, 
tree exp, rtx target,
   for (i = 0; i < nargs; i++)
 {
   tree arg = CALL_EXPR_ARG (exp, i);
-  rtx op = expand_normal (arg);
+  rtx op = expand_promote_outgoing_argument (arg);
   int adjust = (comparison_p) ? 1 : 0;
   machine_mode mode = insn_data[icode].operand[i+adjust+1].mode;
 
@@ -10691,7 +10691,7 @@ ix86_expand_unop_vec_merge_builtin (enum insn_code 
icode, tree exp,
 {
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
-  rtx op1, op0 = expand_normal (arg0);
+  rtx op1, op0 = expand_promote_outgoing_argument (arg0);
   machine_mode tmode = insn_data[icode].operand[0].mode;
   machine_mode mode0 = insn_data[icode].operand[1].mode;
 
@@ -10727,8 +10727,8 @@ ix86_expand_sse_compare (const struct 
builtin_description *d,
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
   tree arg1 = CALL_EXPR_ARG (exp, 1);
-  rtx op0 = expand_normal (arg0);
-  rtx op1 = expand_normal (arg1);
+  rtx op0 = expand_promote_outgoing_argument (arg0);
+  rtx op1 = expand_promote_outgoing_argument (arg1);
   rtx op2;
   machine_mode tmode = insn_data[d->icode].operand[0].mode;
   machine_mode mode0 = insn_data[d->icode].operand[1].mode;
@@ -10823,8 +10823,8 @@ ix86_expand_sse_comi (const struct builtin_description 
*d, tree exp,
   rtx pat, set_dst;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
   tree arg1 = CALL_EXPR_ARG (exp, 1);
-  rtx op0 = expand_normal (arg0);
-  rtx op1 = expand_normal (arg1);
+  rtx op0 = expand_promote_outgoing_argument (arg0);
+  rtx op1 = expand_promote_outgoing_argument (arg1);
   enum insn_code icode = d->icode;
   const struct insn_data_d *insn_p = &insn_data[icode];
   machine_mode mode0 = insn_p->operand[0].mode;
@@ -10916,7 +10916,7 @@ ix86_expand_sse_round (const struct builtin_description 
*d, tree exp,
 {
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
-  rtx op1, op0 = expand_normal (arg0);
+  rtx op1, op0 = expand_promote_outgoing_argument (arg0);
   machine_mode tmode = insn_data[d->icode].operand[0].mode;
   machine_mode mode0 = insn_data[d->icode].operand[1].mode;
 
@@ -10948,8 +10948,8 @@ ix86_expand_sse_round_vec_pack_sfix (const struct 
builtin_description *d,
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
   tree arg1 = CALL_EXPR_ARG (exp, 1);
-  rtx op0 = expand_normal (arg0);
-  rtx op1 = expand_normal (arg1);
+  rtx op0 = expand_promote_outgoing_argument (arg0);
+  rtx op1 = expand_promote_outgoing_argument (arg1);
   rtx op2;
   machine_mode tmode = insn_data[d->icode].operand[0].mode;
   machine_mode mode0 = insn_data[d->icode].operand[1].mode;
@@ -10988,8 +10988,8 @@ ix86_expand_sse_ptest (const struct builtin_description 
*d, tree exp,
   rtx pat;
   tree arg0 = CALL_EXPR_ARG (exp, 0);
   tree ar

[PATCH v3 0/7] Improve outgoing integer argument promotion

2024-11-10 Thread H.J. Lu
For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to return
true, all integer arguments smaller than int are passed as int:

[hjl@gnu-tgl-3 pr14907]$ cat x.c
extern int baz (char c1);

int
foo (char c1)
{
  return baz (c1);
}
[hjl@gnu-tgl-3 pr14907]$ gcc -S -O2 -m32 x.c
[hjl@gnu-tgl-3 pr14907]$ cat x.s
        .file   "x.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movsbl  4(%esp), %eax
        movl    %eax, 4(%esp)
        jmp     baz
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 14.2.1 20240912 (Red Hat 14.2.1-3)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 pr14907]$

But integer promotion:

        movsbl  4(%esp), %eax
        movl    %eax, 4(%esp)

isn't necessary if incoming arguments and outgoing arguments are the
same.
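
The distinction can be sketched in plain C.  In foo below the outgoing
argument is the unmodified incoming argument, so, assuming the caller of foo
already promoted it, the value can be forwarded as is.  In foo_modified the
value is recomputed in char and has to be widened again before the call.
The definition of baz is a stand-in so that the example links, and
foo_modified is a made-up name.

```c
/* Stand-in callee so the example is self-contained.  */
int
baz (char c1)
{
  return c1 + 1;
}

/* Outgoing argument == incoming argument: if the caller of foo already
   promoted c1, no further extension is needed before calling baz.  */
int
foo (char c1)
{
  return baz (c1);
}

/* Outgoing argument differs from the incoming one: the result is
   truncated to char and must be promoted again for the call.  */
int
foo_modified (char c1)
{
  char c2 = (char) (c1 * 2);
  return baz (c2);
}
```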

1. Drop targetm.promote_prototypes from C, C++ and Ada frontends and apply
targetm.promote_prototypes during RTL call expansion.
2. Add expand_promote_outgoing_argument for TARGET_PROMOTE_PROTOTYPES
targets to promote outgoing integer arguments when expanding builtin
functions.
3. Adjust tests for the C frontend promotion removal.
4. gcc.dg/tree-ssa/pr108357.c fails with the C frontend promotion removal.
This test passes for aarch64-linux.  Comparing tree dumps between x86-64
and aarch64, differences are

diff -upr x86/pr108357.c.098t.fixup_cfg3 aarch64/pr108357.c.098t.fixup_cfg3
--- x86/pr108357.c.098t.fixup_cfg3  2024-11-10 18:23:40.418419777 +0800
+++ aarch64/pr108357.c.098t.fixup_cfg3  2024-11-10 18:20:55.410902153 +0800
...
@@ -19,7 +19,7 @@ short int a (short int d, short int e)



-;; Function main (main, funcdef_no=2, decl_uid=2806, cgraph_uid=3, 
symbol_order=4) (executed once)
+;; Function main (main, funcdef_no=2, decl_uid=4429, cgraph_uid=3, 
symbol_order=4) (executed once)

 int main ()
 {
@@ -43,34 +43,28 @@ int main ()
   _7 = e.1_6 * 5;
   _8 = (short int) _7;
   b = 0;
-  if (_8 != 0)
-goto ; [67.00%]
-  else
-goto ; [33.00%]
-
-   [local count: 719407024]:
   c.4_9 = 0;
   _10 = c.4_9 == 0;
   _11 = (int) _10;
   _12 = (int) _8;
   if (_11 < _12)
-goto ; [50.00%]
-  else
 goto ; [50.00%]
+  else
+goto ; [50.00%]
...
diff -upr x86/pr108357.c.107t.ccp2 aarch64/pr108357.c.107t.ccp2
--- x86/pr108357.c.107t.ccp2  2024-11-10 18:23:40.419419775 +0800
+++ aarch64/pr108357.c.107t.ccp2  2024-11-10 18:20:55.411902150 +0800
...
@@ -59,7 +59,7 @@ short int a (short int d, short int e)



-;; Function main (main, funcdef_no=2, decl_uid=2806, cgraph_uid=3, 
symbol_order=4) (executed once)
+;; Function main (main, funcdef_no=2, decl_uid=4429, cgraph_uid=3, 
symbol_order=4) (executed once)

 Adding destination of edge (0 -> 2) to worklist

@@ -73,35 +73,32 @@ Lattice value changed to VARYING.  Addin
 Visiting statement:
 _2 = (short int) b.2_1;
 which is likely CONSTANT
-Lattice value changed to VARYING.  Adding SSA edges to worklist.
+Lattice value changed to CONSTANT 0x0 (0xff).  Adding SSA edges to worklist.
+marking stmt to be not simulated again

 Visiting statement:
 _3 = _2 ^ 9854;
 which is likely CONSTANT
-Lattice value changed to VARYING.  Adding SSA edges to worklist.
+Lattice value changed to CONSTANT 0x2600 (0xff).  Adding SSA edges to worklist.
+marking stmt to be not simulated again

 Visiting statement:
 e.1_6 = (unsigned short) _3;
 which is likely CONSTANT
-Lattice value changed to VARYING.  Adding SSA edges to worklist.
+Lattice value changed to CONSTANT 0x2600 (0xff).  Adding SSA edges to worklist.
+marking stmt to be not simulated again

 Visiting statement:
 _7 = e.1_6 * 5;
 which is likely CONSTANT
-Lattice value changed to VARYING.  Adding SSA edges to worklist.
+Lattice value changed to CONSTANT 0x8000 (0x7fff).  Adding SSA edges to 
worklist.
+marking stmt to be not simulated again
...

Removing the C frontend promotion changes the inlining strategy, which
leads to this test failure.
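
The aarch64 lattice values above can be checked by hand.  Reading
"CONSTANT 0x0 (0xff)" as "all bits known to be zero except the low eight"
(an interpretation of the dump, not something stated in the patch), every
possible value of _2 lies in 0..255.  The sketch below enumerates them and
confirms that bit 15 of _7 is always set, so _8 can never be zero, which is
consistent with the aarch64 dump having no "_8 != 0" test.  The function
name is hypothetical.

```c
#include <stdint.h>

/* Return 1 if, for every value of _2 in 0..255, the 16-bit product
   _7 = (unsigned short) (_2 ^ 9854) * 5 has bit 15 set.  */
int
bit15_always_set (void)
{
  for (unsigned int low = 0; low <= 0xff; low++)
    {
      uint16_t e = (uint16_t) (low ^ 9854u);   /* 0x2600..0x26ff */
      uint16_t prod = (uint16_t) (e * 5u);     /* 0xbe00..0xc2fb */
      if ((prod & 0x8000u) == 0)
        return 0;
    }
  return 1;
}
```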

H.J. Lu (7):
  Improve outgoing integer argument promotion
  Add expand_promote_outgoing_argument
  i386: Use GET_MODE with lowpart_subreg
  i386: Adjust apx-ndd.c for frontend promotion removal
  vect-simd-clone-1[6-8][cd].c: Expect in-branch clones for x86
  scev-cast.c: Adjusted
  ssa-fre-4.c: Skip for all targets

 gcc/ada/gcc-interface/utils.cc            |  24 --
 gcc/c/c-decl.cc                           |  40 ---
 gcc/c/c-typeck.cc                         |  19 +-
 gcc/calls.cc                              |  81 ++
 gcc/config/i386/i386-expand.cc            | 247 +-
 gcc/cp/call.cc                            |  10 -
 gcc/cp/typeck.cc                          |  13 +-
 gcc/expr.cc                               |  18 ++
 gcc/expr.h                                |   1 +
 gcc/gimple.cc                             |  10 +-
 gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c |   

[PATCH v3 6/7] scev-cast.c: Adjusted

2024-11-10 Thread H.J. Lu
Since the C frontend no longer promotes char arguments, adjust scev-cast.c.

PR middle-end/14907
* gcc.dg/tree-ssa/scev-cast.c: Adjusted.

Signed-off-by: H.J. Lu 
---
 gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c 
b/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
index c569523ffa7..1a3c150a884 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-cast.c
@@ -22,6 +22,6 @@ void tst(void)
 blau ((unsigned char) i);
 }
 
-/* { dg-final { scan-tree-dump-times "& 255" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "= \\(signed char\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(unsigned char\\)" 2 "optimized" } } 
*/
+/* { dg-final { scan-tree-dump-times "= \\(signed char\\)" 3 "optimized" } } */
 
-- 
2.47.0



[PATCH v3 5/7] vect-simd-clone-1[6-8][cd].c: Expect in-branch clones for x86

2024-11-10 Thread H.J. Lu
Since the C frontend no longer promotes char and short arguments, expect
in-branch clones for x86.

PR middle-end/14907
* gcc.dg/vect/vect-simd-clone-16c.c: Expect in-branch clones for
x86.
* gcc.dg/vect/vect-simd-clone-16d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18d.c: Likewise.

Signed-off-by: H.J. Lu 
---
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-16c.c | 5 +
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-16d.c | 4 +---
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-17c.c | 5 +
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-17d.c | 5 +
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-18c.c | 5 +
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-18d.c | 5 +
 6 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16c.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16c.c
index 4fdf25d06c6..628d4575673 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16c.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16c.c
@@ -7,11 +7,8 @@
 
 /* Ensure the the in-branch simd clones are used on targets that support them.
Some targets use another call for the epilogue loops.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { ! { x86_64-*-* || { i?86-*-* || aarch64*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { !aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 3 "vect" 
{ target { aarch64*-*-* } } } } */
 
-/* x86_64 fails to use in-branch clones for TYPE=short.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 0 "vect" 
{ target x86_64-*-* i?86-*-* } } } */
-
 /* The LTO test produces two dump files and we scan the wrong one.  */
 /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16d.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16d.c
index 55d3c0afae5..d1f85b0703e 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16d.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16d.c
@@ -7,11 +7,9 @@
 
 /* Ensure the the in-branch simd clones are used on targets that support them.
Some targets use another call for the epilogue loops.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { ! { x86_64-*-* || { i?86-*-* || aarch64*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { !aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 3 "vect" 
{ target { aarch64*-*-* } } } } */
 
-/* x86_64 fails to use in-branch clones for TYPE=char.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 0 "vect" 
{ target x86_64-*-* i?86-*-* } } } */
 
 /* The LTO test produces two dump files and we scan the wrong one.  */
 /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17c.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17c.c
index 6afa2fd595e..6148abee806 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17c.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17c.c
@@ -7,11 +7,8 @@
  
 /* Ensure the the in-branch simd clones are used on targets that support them.
Some targets use another call for the epilogue loops.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { ! { x86_64-*-* || { i?86-*-* || aarch64*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { !aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 3 "vect" 
{ target { aarch64*-*-* } } } } */
 
-/* x86_64 fails to use in-branch clones for TYPE=short.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 0 "vect" 
{ target x86_64-*-* i?86-*-* } } } */
-
 /* The LTO test produces two dump files and we scan the wrong one.  */
 /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17d.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17d.c
index 56177880b6b..63687984598 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17d.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17d.c
@@ -7,11 +7,8 @@
 
 /* Ensure the the in-branch simd clones are used on targets that support them.
Some targets use another call for the epilogue loops.  */
-/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { ! { x86_64-*-* || { i?86-*-* || aarch64*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { !aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-times {[\n\r] 

[PATCH v3 3/7] i386: Use GET_MODE with lowpart_subreg

2024-11-10 Thread H.J. Lu
With expand_promote_outgoing_argument, op3 in

  op3 = lowpart_subreg (QImode, op3, HImode);

may not be in HImode.  Call lowpart_subreg on op3 only if it isn't an integer
constant, and pass GET_MODE (op3) to lowpart_subreg instead of HImode.

PR middle-end/14907
* config/i386/i386-expand.cc (ix86_expand_builtin): Use GET_MODE
with lowpart_subreg.

Signed-off-by: H.J. Lu 
---
 gcc/config/i386/i386-expand.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index ce887d96f6a..4c7aa316ae5 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -15532,7 +15532,8 @@ rdseed_step:
op0 = copy_to_mode_reg (GET_MODE (op0), op0);
  emit_insn (gen (half, op0));
  op0 = half;
- op3 = lowpart_subreg (QImode, op3, HImode);
+ if (!CONST_INT_P (op3))
+   op3 = lowpart_subreg (QImode, op3, GET_MODE (op3));
  break;
case IX86_BUILTIN_GATHER3ALTDIV8SF:
case IX86_BUILTIN_GATHER3ALTDIV8SI:
-- 
2.47.0



[PATCH v3 7/7] ssa-fre-4.c: Skip for all targets

2024-11-10 Thread H.J. Lu
Since the C frontend no longer promotes char arguments, ssa-fre-4.c will
fail for all targets.  Skip it for all targets.

PR middle-end/14907
* gcc.dg/tree-ssa/ssa-fre-4.c: Skip for all targets.

Signed-off-by: H.J. Lu 
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
index 5a7588febaa..07d4d81996a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
@@ -1,6 +1,6 @@
-/* If the target returns false for TARGET_PROMOTE_PROTOTYPES, then there
-   will be no casts for FRE to eliminate and the test will fail.  */
-/* { dg-do compile { target i?86-*-* x86_64-*-* hppa*-*-* m68k*-*-* } } */
+/* Since the C frontend no longer promotes char argument, there will be
+   no casts for FRE to eliminate and the test will fail.  */
+/* { dg-do compile { target !*-*-* } } */
 /* { dg-options "-O -fno-tree-ccp -fno-tree-forwprop -fdump-tree-fre1-details" 
} */
 
 /* From PR21608.  */
-- 
2.47.0



[PATCH] testsuite: arm: Update expected RTL for reg_equal_test.c test

2024-11-10 Thread Torbjörn SVENSSON
Hi Richard,

I'm not sure if I'm doing something wrong here, or if it was an oversight
when doing the update in r12-8108-g62082d278d1.
Anyway, the commit message suggests that it's only the constant that is of
interest, so I updated the test to only check the constant. Do you think
this is enough, or should the test case also verify that it's used in
a "set" expression?

Ok for trunk and releases/gcc-14?

--

The test case was rewritten in r12-8108-g62082d278d1, but the expected
RTL was not updated.

The diff for the generated reg_equal_test.c.*r.expand files produced by
r12-8108-g62082d278d1 and r15-5047-g7e1d9f58858 is:

--- reg_equal_test.c.253r.expand-r12-8108-g62082d278d1  2024-11-10 
14:24:54.957438394 +0100
+++ reg_equal_test.c.268r.expand-r15-5047-g7e1d9f58858  2024-11-10 
14:30:13.633437178 +0100
@@ -1,5 +1,5 @@

-;; Function x (x, funcdef_no=0, decl_uid=4195, cgraph_uid=1, symbol_order=0)
+;; Function x (x, funcdef_no=0, decl_uid=4590, cgraph_uid=1, symbol_order=0)

 ;; Generating RTL for gimple basic block 2
@@ -25,6 +25,6 @@
 (note 1 0 3 NOTE_INSN_DELETED)
 (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
 (note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
-(insn 5 2 0 2 (set (reg/v:SI 113 [ d ])
+(insn 5 2 0 2 (set (reg/v:SI 114 [ d ])
 (const_int -942519458 [0xc7d24b5e])) -1
  (nil))

In both versions, the constant is simply assigned, thus I updated the
expected RTL accordingly.
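
The effect of relaxing the pattern can be sketched with a plain substring
check in C.  The testsuite itself matches Tcl regular expressions through
scan-rtl-dump, so this is only an analogy.  The insn text is copied from the
dump diff above, and the helper name is hypothetical.

```c
#include <string.h>

/* Insn text as it appears in the expand dump quoted above.  */
static const char dump_insn[] =
  "(insn 5 2 0 2 (set (reg/v:SI 114 [ d ])\n"
  "        (const_int -942519458 [0xc7d24b5e])) -1\n"
  "     (nil))\n";

/* Return nonzero if NEEDLE occurs in the dump fragment.  */
int
dump_contains (const char *needle)
{
  return strstr (dump_insn, needle) != NULL;
}
```

With this fragment, the bare "(const_int -942519458" string is found, while
the old string that also requires "expr_list:REG_EQUAL" is not.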

gcc/testsuite/ChangeLog:

* gcc.target/arm/reg_equal_test.c: Update expected RTL.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.target/arm/reg_equal_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/reg_equal_test.c 
b/gcc/testsuite/gcc.target/arm/reg_equal_test.c
index d87c75cc27c..4337e3f0af5 100644
--- a/gcc/testsuite/gcc.target/arm/reg_equal_test.c
+++ b/gcc/testsuite/gcc.target/arm/reg_equal_test.c
@@ -12,4 +12,4 @@ x ()
   return;
 }
 
-/* { dg-final { scan-rtl-dump "expr_list:REG_EQUAL \\(const_int -942519458" 
"expand" } } */
+/* { dg-final { scan-rtl-dump "\\(const_int -942519458" "expand" } } */
-- 
2.25.1



[PATCH v3 1/7] Improve outgoing integer argument promotion

2024-11-10 Thread H.J. Lu
For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to return
true, all integer arguments smaller than int are passed as int:

[hjl@gnu-tgl-3 pr14907]$ cat x.c
extern int baz (char c1);

int
foo (char c1)
{
  return baz (c1);
}
[hjl@gnu-tgl-3 pr14907]$ gcc -S -O2 -m32 x.c
[hjl@gnu-tgl-3 pr14907]$ cat x.s
        .file   "x.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movsbl  4(%esp), %eax
        movl    %eax, 4(%esp)
        jmp     baz
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 14.2.1 20240912 (Red Hat 14.2.1-3)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 pr14907]$

But integer promotion:

        movsbl  4(%esp), %eax
        movl    %eax, 4(%esp)

isn't necessary if incoming arguments and outgoing arguments are the
same.  Drop targetm.promote_prototypes from C, C++ and Ada frontends
and apply targetm.promote_prototypes during RTL call expansion.

gcc/

PR middle-end/14907
* calls.cc: Include "ssa.h", "tree-ssa-live.h" and
"tree-outof-ssa.h".
(get_promoted_int_value_from_ssa_name): New function.
(get_promoted_int_value): Likewise.
(initialize_argument_information): Call get_promoted_int_value
to promote integer function argument.
* gimple.cc (gimple_builtin_call_types_compatible_p): Remove the
targetm.calls.promote_prototypes call.
* tree.cc (tree_builtin_call_types_compatible_p): Likewise.

gcc/ada/

PR middle-end/14907
* gcc-interface/utils.cc (create_param_decl): Remove the
targetm.calls.promote_prototypes call.

gcc/c/

PR middle-end/14907
* c-decl.cc (start_decl): Remove the
targetm.calls.promote_prototypes call.
(store_parm_decls_oldstyle): Likewise.
(finish_function): Likewise.
* c-typeck.cc (convert_argument): Likewise.
(c_safe_arg_type_equiv_p): Likewise.

gcc/cp/

PR middle-end/14907
* call.cc (type_passed_as): Remove the
targetm.calls.promote_prototypes call.
(convert_for_arg_passing): Likewise.
* typeck.cc (cxx_safe_arg_type_equiv_p): Likewise.

gcc/testsuite/

PR middle-end/14907
* gcc.target/i386/pr14907-1.c: New test.
* gcc.target/i386/pr14907-2.c: Likewise.
* gcc.target/i386/pr14907-3.c: Likewise.
* gcc.target/i386/pr14907-4.c: Likewise.
* gcc.target/i386/pr14907-5.c: Likewise.
* gcc.target/i386/pr14907-6.c: Likewise.
* gcc.target/i386/pr14907-7.c: Likewise.
* gcc.target/i386/pr14907-8.c: Likewise.
* gcc.target/i386/pr14907-9.c: Likewise.
* gcc.target/i386/pr14907-10.c: Likewise.
* gcc.target/i386/pr14907-11.c: Likewise.
* gcc.target/i386/pr14907-12.c: Likewise.
* gcc.target/i386/pr14907-13.c: Likewise.
* gcc.target/i386/pr14907-14.c: Likewise.
* gcc.target/i386/pr14907-15.c: Likewise.
* gcc.target/i386/pr14907-16.c: Likewise.
* gfortran.dg/pr14907-1.f90: Likewise.

Signed-off-by: H.J. Lu 
---
 gcc/ada/gcc-interface/utils.cc             | 24 ---
 gcc/c/c-decl.cc                            | 40 ---
 gcc/c/c-typeck.cc                          | 19 ++---
 gcc/calls.cc                               | 81 ++
 gcc/cp/call.cc                             | 10 ---
 gcc/cp/typeck.cc                           | 13 ++--
 gcc/gimple.cc                              | 10 +--
 gcc/testsuite/gcc.target/i386/pr14907-1.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-10.c | 23 ++
 gcc/testsuite/gcc.target/i386/pr14907-11.c | 12 
 gcc/testsuite/gcc.target/i386/pr14907-12.c | 17 +
 gcc/testsuite/gcc.target/i386/pr14907-13.c | 12 
 gcc/testsuite/gcc.target/i386/pr14907-14.c | 17 +
 gcc/testsuite/gcc.target/i386/pr14907-15.c | 26 +++
 gcc/testsuite/gcc.target/i386/pr14907-16.c | 24 +++
 gcc/testsuite/gcc.target/i386/pr14907-2.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-3.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-4.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-5.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-6.c  | 21 ++
 gcc/testsuite/gcc.target/i386/pr14907-7.c  | 22 ++
 gcc/testsuite/gcc.target/i386/pr14907-8.c  | 23 ++
 gcc/testsuite/gcc.target/i386/pr14907-9.c  | 22 ++
 gcc/testsuite/gfortran.dg/pr14907-1.f90    | 17 +
 gcc/tree.cc                                | 14 
 25 files changed, 431 insertions(+), 121 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14907-14