[PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).
Hi all, This patch targets PR94790, which change the instruction selection under the following circumstance. Regtested on x86_64-pc-linux-gnu. Ok for trunk? BRs, Haochen >From the perspective of the pipeline, `andn + and + ior` version take 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor will take 3 cycles. - xorl%edi, %esi andl%edx, %esi - movl%esi, %eax - xorl%edi, %eax + andn%edi, %edx, %eax + orl %esi, %eax gcc/ChangeLog: PR taeget/94790 * config/i386/i386.md (*xor2andn): New define_insn_and_split. gcc/testsuite/ChangeLog: PR taeget/94790 * gcc.target/i386/pr94790-1.c: New test. * gcc.target/i386/pr94790-2.c: Ditto. --- gcc/config/i386/i386.md | 39 +++ gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 gcc/testsuite/gcc.target/i386/pr94790-2.c | 9 ++ 3 files changed, 62 insertions(+) create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 9b424a3935b..38efc6d5837 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -10452,6 +10452,45 @@ (set_attr "znver1_decode" "double") (set_attr "mode" "DI")]) +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask) +(define_insn_and_split "*xor2andn" + [(set (match_operand:SWI248 0 "nonimmediate_operand") + (xor:SWI248 + (and:SWI248 + (xor:SWI248 + (match_operand:SWI248 1 "nonimmediate_operand") + (match_operand:SWI248 2 "nonimmediate_operand")) + (match_operand:SWI248 3 "nonimmediate_operand")) + (match_dup 1))) +(clobber (reg:CC FLAGS_REG))] + "(TARGET_BMI || TARGET_AVX512BW) + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (and:SWI248 + (not:SWI248 + (match_dup 3)) + (match_dup 1))) + (clobber (reg:CC FLAGS_REG))]) + (parallel [(set (match_dup 5) + (and:SWI248 + (match_dup 2) + (match_dup 3))) + (clobber (reg:CC FLAGS_REG))]) + (parallel [(set (match_dup 0) + (ior:SWI248 + (match_dup 4) + (match_dup 5))) + (clobber (reg:CC FLAGS_REG))])] + { +operands[1] = force_reg (mode, operands[1]); +operands[3] = force_reg (mode, operands[3]); +operands[4] = gen_reg_rtx (mode); +operands[5] = gen_reg_rtx (mode); + } +) + ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*si_1_zext" [(set (match_operand:DI 0 "register_operand" "=r") diff --git a/gcc/testsuite/gcc.target/i386/pr94790-1.c b/gcc/testsuite/gcc.target/i386/pr94790-1.c new file mode 100755 index 000..6ebbec15cfd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mbmi" } */ +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */ + +unsigned r1(unsigned a, unsigned b, unsigned mask) +{ + return a ^ ((a ^ b) & mask); +} + +unsigned r2(unsigned a, unsigned b, unsigned mask) +{ + return (~mask & a) | (b & mask); +} diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c b/gcc/testsuite/gcc.target/i386/pr94790-2.c new file mode 100755 index 000..d7b0eec5bef --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mbmi" } */ +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */ +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */ + +unsigned r1(unsigned a, unsigned b, unsigned mask) +{ + return a ^ ((a ^ b) & mask) + (a ^ b); +} -- 2.18.1
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches wrote: > > > On 2022-01-11 12:42, Richard Sandiford wrote: > > The new IRA heuristics would need more work on old-reload targets, > > since flattening needs to be able to undo the cost propagation. > > It's doable, but hardly seems worth it. > Agree. It is not worth to spend your time for work for reload. > > This patch therefore makes all the new calls to > > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. > > The color_pass code that predated the new function (and that was > > the source of ira_subloop_allocnos_can_differ_p) continues to > > behave as before. > > > > It's a hack, but at least it has the advantage that the new parameter > > would become obviously unused if reload and (!)ira_use_lra_p were > > removed. The hack should therefore disappear alongside reload. > I have a feeling that it will stay for a long time if not forever. We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ At some point Eric wrote a nice summary for how to transition targets away from CC0, I wonder if there's something similar for transitioning a port away from reload to LRA? In those 34 targets there must be some for which that's a relatively easy task? I suppose it depends on how much of the reload target hooks are put to use and how those translate to LRA. Richard.
Re: [PATCH] c++: Silence -Wuseless-cast warnings during move [PR103480]
On Tue, Jan 11, 2022 at 11:50:42AM -0500, Jason Merrill wrote: > On 12/31/21 04:30, Jakub Jelinek wrote: > > Hi! > > > > This is maybe just a shot in the dark, but IMHO we shouldn't be diagnosing > > -Wuseless-cast on casts the compiler adds on its own when calling its move > > function. We don't seem to warn when user calls std::move either. > > We call move on elinit (*NON_LVALUE_EXPR <(struct C[2] &&) &D.2497->b>)[0] > > so it is already an xvalue_p and try to static_cast it to struct C &&. > > But we don't warn e.g. on std::move (std::move (whatever)). > > > > Bootstrapped/regtested on x86_64-linux and i686-linux. > > Maybe we should > > if (xvalue_p (expr)) > return expr; > > instead? OK either way. That worked well, so I've committed following instead: 2022-01-11 Jakub Jelinek Jason Merrill PR c++/103480 * tree.c (move): If expr is xvalue_p, just return expr without build_static_cast. * g++.dg/warn/Wuseless-cast2.C: New test. --- gcc/cp/tree.c.jj2022-01-10 10:53:48.983925610 +0100 +++ gcc/cp/tree.c 2022-01-11 19:41:12.787433061 +0100 @@ -1304,6 +1304,8 @@ move (tree expr) { tree type = TREE_TYPE (expr); gcc_assert (!TYPE_REF_P (type)); + if (xvalue_p (expr)) +return expr; type = cp_build_reference_type (type, /*rval*/true); return build_static_cast (input_location, type, expr, tf_warning_or_error); --- gcc/testsuite/g++.dg/warn/Wuseless-cast2.C.jj 2022-01-11 19:38:08.552039028 +0100 +++ gcc/testsuite/g++.dg/warn/Wuseless-cast2.C 2022-01-11 19:38:08.552039028 +0100 @@ -0,0 +1,24 @@ +// PR c++/103480 +// { dg-do compile { target c++14 } } +// { dg-options "-Wuseless-cast" } + +template +struct A { typedef T t[N]; }; +template +struct B { typename A::t b; }; +struct C { + constexpr C (C &&) {} + template + static auto bar () + { +B r; +return r; // { dg-bogus "useless cast to type" } + } + C () = default; +}; + +void +foo () +{ + C::bar<2> (); +} Jakub
[committed] testsuite: Fix up c-c++-common/builtin-shufflevector-3.c testcase [PR101530]
Hi! This fixes on i686-linux: FAIL: c-c++-common/builtin-shufflevector-3.c -Wc++-compat (test for excess errors) Excess errors: .../gcc/testsuite/c-c++-common/builtin-shufflevector-3.c:6:1: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi] Committed as obvious. 2022-01-12 Jakub Jelinek PR middle-end/101530 * c-c++-common/builtin-shufflevector-3.c: Add -Wno-psabi to dg-options. --- gcc/testsuite/c-c++-common/builtin-shufflevector-3.c.jj 2022-01-11 23:11:22.830283971 +0100 +++ gcc/testsuite/c-c++-common/builtin-shufflevector-3.c2022-01-12 09:37:43.013565187 +0100 @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-options "-Wno-psabi" } */ typedef int __attribute__((__vector_size__ (sizeof(int)*4))) V; Jakub
Re: libgfortran bootstrap failure
On Tue, Jan 11, 2022 at 10:30:26PM -0500, David Edelsohn wrote: > The recent patch to support Power IEEE128 causes a bootstrap failure > on AIX and possibly all non-GLIBC systems. > > +#if defined(__powerpc64__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \ > +&& defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32) > +#define POWER_IEEE128 1 > +#endif > > __GLIBC_PREREQ is tested on all systems. > > /nasfarm/edelsohn/src/src/libgfortran/libgfortran.h:107:49: error: > missing binary operator before token "(" > 107 | && defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32) > | ^ This is what I've committed as obvious: 2022-01-12 Jakub Jelinek * libgfortran.h (POWER_IEEE128): Use __GLIBC_PREREQ in a separate #if directive inside of #if ... && defined __GLIBC_PREREQ. --- libgfortran/libgfortran.h.jj2022-01-11 23:49:51.759830402 +0100 +++ libgfortran/libgfortran.h 2022-01-12 09:41:48.779107854 +0100 @@ -104,9 +104,11 @@ typedef off_t gfc_offset; #endif #if defined(__powerpc64__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \ -&& defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32) +&& defined __GLIBC_PREREQ +#if __GLIBC_PREREQ (2, 32) #define POWER_IEEE128 1 #endif +#endif /* These functions from should only be used on values that can be represented as unsigned char, otherwise the behavior is undefined. Jakub
Re: [PATCH] Mass rename of C++ .c files to .cc suffix
On 1/11/22 17:16, Jakub Jelinek wrote: On Tue, Jan 11, 2022 at 05:03:34PM +0100, Martin Liška wrote: On 1/11/22 16:56, Jakub Jelinek wrote: While e.g. libcpp or gcc are in C++. Which means I should rename .c files under libcpp, right? Is there any other folder from gcc/ and libcpp/ that would need that as well? From the directories that use .c files for C++, I think that is all. c++tools/, libcody/, libitm/, libsanitizer/, libstdc++-v3/ already use .cc or .cpp. All right, I've included libcpp folder in the renaming effort. Please take a look at V2 of the series here: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=log;h=refs/users/marxin/heads/cc-renaming Now I filled up complete ChangeLog entries and I'm asking for a patchset approval. Cheers, Martin Jakub
[PATCH] Fix -Wformat-diag for aarch64 target.
Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for aarch64 target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_parse_boolean_options): Use %qs where possible. (aarch64_parse_sve_width_string): Likewise. (aarch64_override_options_internal): Likewise. (aarch64_print_hint_for_extensions): Likewise. (aarch64_validate_sls_mitigation): Likewise. (aarch64_handle_attr_arch): Likewise. (aarch64_handle_attr_cpu): Likewise. (aarch64_handle_attr_tune): Likewise. (aarch64_handle_attr_isa_flags): Likewise. --- gcc/config/aarch64/aarch64.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5920b29880a..1bca2a31a1b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -16337,7 +16337,7 @@ aarch64_parse_boolean_options (const char *option, /* We ended with a comma, print something. */ if (!(*specs)) { - error ("%s string ill-formed\n", option_name); + error ("%qs string ill-formed", option_name); return 0; } @@ -16393,7 +16393,7 @@ aarch64_parse_sve_width_string (const char *tune_string, int n = sscanf (tune_string, "%d", &width); if (n == EOF) { - error ("invalid format for sve_width"); + error ("invalid format for %"); return; } switch (width) @@ -16405,7 +16405,7 @@ aarch64_parse_sve_width_string (const char *tune_string, case SVE_2048: break; default: - error ("invalid sve_width value: %d", width); + error ("invalid % value: %d", width); } tune->sve_width = (enum aarch64_sve_vector_bits_enum) width; } @@ -16628,7 +16628,7 @@ aarch64_override_options_internal (struct gcc_options *opts) if (opts->x_aarch64_stack_protector_guard_reg_str) { if (strlen (opts->x_aarch64_stack_protector_guard_reg_str) > 100) - error ("specify a system register with a small string length."); + error ("specify a system register with a small string length"); } if (opts->x_aarch64_stack_protector_guard_offset_str) @@ -16832,7 +16832,7 @@ aarch64_print_hint_for_extensions (const std::string &str) inform (input_location, "valid arguments are: %s;" " did you mean %qs?", s, hint); else -inform (input_location, "valid arguments are: %s;", s); +inform (input_location, "valid arguments are: %s", s); XDELETEVEC (s); } @@ -16933,7 +16933,7 @@ aarch64_validate_sls_mitigation (const char *const_str) temp |= SLS_RETBR; else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0) { - error ("%<%s%> must be by itself for %<-mharden-sls=%>", str); + error ("%qs must be by itself for %<-mharden-sls=%>", str); break; } else @@ -17572,11 +17572,11 @@ aarch64_handle_attr_arch (const char *str) error ("missing name in % pragma or attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in % pragma or attribute", str); + error ("invalid name (%qs) in % pragma or attribute", str); aarch64_print_hint_for_arch (str); break; case AARCH64_PARSE_INVALID_FEATURE: - error ("invalid feature modifier %s of value (\"%s\") in " + error ("invalid feature modifier %s of value (%qs) in " "% pragma or attribute", invalid_extension.c_str (), str); aarch64_print_hint_for_extensions (invalid_extension); break; @@ -17614,11 +17614,11 @@ aarch64_handle_attr_cpu (const char *str) error ("missing name in % pragma or attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in % pragma or attribute", str); + error ("invalid name (%qs) in % pragma or attribute", str); aarch64_print_hint_for_core (str); break; case AARCH64_PARSE_INVALID_FEATURE: - error ("invalid feature modifier %s of value (\"%s\") in " + error ("invalid feature modifier %s of value (%qs) in " "% pragma or attribute", invalid_extension.c_str (), str); aarch64_print_hint_for_extensions (invalid_extension); break; @@ -17645,7 +17645,7 @@ aarch64_handle_attr_cpu (const char *str) " attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid protection type (\"%s\") in % pragma or attribute", err_str); break; case AARCH64_PARSE_OK: @@ -17680,7 +17680,7 @@ aarch64_handle_attr_tune (const char *str) switch (parse_res) { case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in % pragma or attribute", str); + error ("invalid name (
Re: [PATCH] tree-optimization/103551 - Always set EDGE_EXECUTABLE in ranger vrp.
On Tue, Jan 11, 2022 at 8:49 PM Andrew MacLeod via Gcc-patches wrote: > > The simplify_and_fold engine uses the EDGE_EXECUTABLE flag to eliminate > branches: > > substitute_and_fold_dom_walker::before_dom_children: > > if (gimple_code (stmt) == GIMPLE_COND) > { >if ((EDGE_SUCC (bb, 0)->flags & EDGE_EXECUTABLE) >^ (EDGE_SUCC (bb, 1)->flags & EDGE_EXECUTABLE)) > > ... but if certain passes before VRP2 are turned off, the flag can be in > an uninitialized state upon entry to VRP2. This patch simply always > sets EDGE_EXECUTABLE on all edges before invoking ranger and the S&F engine. > > Bootstraps on x86_64-pc-linux-gnu with no regressions. > > OK for trunk? OK. > Andrew > > > > >
[PATCH] sh-linux fix target cpu
sh-linux not supported any SH1 and SH2a little-endian. Add exceptios it. gcc/ChangeLog: * config/sh/t-linux (MULTILIB_EXCEPTIONS): Add m1, mb/m1 and m2a. --- gcc/config/sh/t-linux | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/config/sh/t-linux b/gcc/config/sh/t-linux index d33c6383915..ca323ca25a9 100644 --- a/gcc/config/sh/t-linux +++ b/gcc/config/sh/t-linux @@ -1,2 +1,3 @@ MULTILIB_DIRNAMES= -MULTILIB_MATCHES = +MULTILIB_MATCHES= +MULTILIB_EXCEPTIONS=m1 mb/m1 m2a -- 2.30.2
[PATCH] Fix -Wformat-diag for rs6000 target.
Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for rs6000 target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap keywords and use %qs instead of %<%s%>. (rs6000_expand_builtin): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in testcases. * gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise. * gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise. --- gcc/config/rs6000/rs6000-call.c | 8 .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index c78b8b08c40..becdad73812 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode) "-mvsx"); break; case ENB_IEEE128_HW: - error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name); + error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name); break; case ENB_DFP: error ("%qs requires the %qs option", name, "-mhard-dfp"); @@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */, if (bif_is_nosoft (*bifaddr) && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT) { - error ("%<%s%> not supported with %<-msoft-float%>", + error ("%qs not supported with %<-msoft-float%>", bifaddr->bifname); return const0_rtx; } if (bif_is_no32bit (*bifaddr) && TARGET_32BIT) { - error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname); + error ("%qs is not supported in 32-bit mode", bifaddr->bifname); return const0_rtx; } if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode)) { - error ("%<%s%> requires % to be IBM 128-bit format", + error ("%qs requires % to be IBM 128-bit format", bifaddr->bifname); return const0_rtx; } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c index 34184812dc5..1225613b275 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c @@ -14,7 +14,7 @@ get_exponent (__ieee128 *p) { __ieee128 source = *p; - return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c index 13c64fc3acf..adf0e4f99df 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c @@ -12,5 +12,5 @@ get_significand (__ieee128 *p) { __ieee128 source = *p; - return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c index a5dd852e60f..6787a67950b 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c @@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p, __ieee128 significand = *significand_p; unsigned long long int exponent = *exponent_p; - return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } -- 2.34.1
[PATCH] Fix -Wformat-diag for s390x target.
Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for s390x target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap keyword in quotes. (s390_resolve_overloaded_builtin): Remove trailing dot. * config/s390/s390.c (s390_const_operand_ok): Use - for range. (s390_expand_builtin): Remove trailing dot. (s390_emit_prologue): Likewise, use semicolon. (s390_option_override_internal): Update keyword. * varasm.c (do_assemble_alias): Wrap keyword in quotes. --- gcc/config/s390/s390-c.c | 9 + gcc/config/s390/s390.c | 28 ++-- gcc/varasm.c | 2 +- 3 files changed, 20 insertions(+), 19 deletions(-) diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c index 600018421df..10bc6ac8900 100644 --- a/gcc/config/s390/s390-c.c +++ b/gcc/config/s390/s390-c.c @@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc, case S390_OVERLOADED_BUILTIN_s390_vec_step: if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE) { - error_at (loc, "builtin vec_step can only be used on vector types."); + error_at (loc, "builtin %qs can only be used on vector types", + "vec_step "); return error_mark_node; } return build_int_cst (NULL_TREE, @@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc, if (ob_flags & B_INT) { error_at (loc, - "builtin %qF is for GCC internal use only.", + "builtin %qF is for GCC internal use only", ob_fndecl); return error_mark_node; } @@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc, } if (ob_flags & B_DEP) -warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl); +warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl); if (!TARGET_VX && (ob_flags & B_VX)) { @@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc, } if (bflags_overloaded_builtin_var[last_match_index] & B_DEP) -warning_at (loc, 0, "%qs matching variant is deprecated.", +warning_at (loc, 0, "%qs matching variant is deprecated", IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); /* Overloaded variants which have MAX set as low level builtin are diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 056002e4a4a..bf96cbf7588 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, tree decl) argnum, decl, values); } else - error ("constant argument %d for builtin %qF is out of range (0..%wu)", + error ("constant argument %d for builtin %qF is out of range (0-%wu)", argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1); return false; @@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, tree decl) || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1)) { error ("constant argument %d for builtin %qF is out of range " -"(%wd..%wd)", argnum, decl, +"(%wd-%wd)", argnum, decl, -(HOST_WIDE_INT_1 << (bitwidth - 1)), (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1); return false; @@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, if ((bflags & B_HTM) && !TARGET_HTM) { error ("builtin %qF is not supported without %<-mhtm%> " -"(default with %<-march=zEC12%> and higher).", fndecl); +"(default with %<-march=zEC12%> and higher)", fndecl); return const0_rtx; } if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX) { error ("builtin %qF requires %<-mvx%> " -"(default with %<-march=z13%> and higher).", fndecl); +"(default with %<-march=z13%> and higher)", fndecl); return const0_rtx; } if ((bflags & B_VXE) && !TARGET_VXE) { - error ("Builtin %qF requires z14 or higher.", fndecl); + error ("Builtin %qF requires z14 or higher", fndecl); return const0_rtx; } if ((bflags & B_VXE2) && !TARGET_VXE2) { - error ("Builtin %qF requires z15 or higher.", fndecl); + error ("Builtin %qF requires z15 or higher", fndecl); return const0_rtx; } @@ -11464,8 +11464,8 @@ s390_emit_prologue (void) { warning (0, "frame size of function %qs is %wd" " bytes exceeding user provided stack limit of " - "
[PATCH] libstdc++: Add attribute to features deprecated in C++17 [PR91260]
This passes testing (with -std=gnu++98/11/17/20) but is quite a large patch for this late in stage 3. Does anybody object to doing this now? The bugs it fixes were closed as INVALID because we're not actually *required* to remove these or deprecate them. But users are right to complain about us silently accepting use of things like std::bind1st in C++20 mode. Probably WONTFIX would have been better, and in fact we can fix them (as this patch does). If we don't do this, users will keep reporting bugs about it, but it could wait for stage 1 if needed. === >8 === >8 === >8 === There are a lot of things in the C++ standard library which were deprecated in C++11, and more in C++17. Some of them were removed after deprecation and are no longer present in the standard at all. We have not removed these from libstdc++ because keeping them as non-standard extensions is conforming, and avoids gratuitously breaking user code, and in some cases we need to keep using them to avoid ABI changes. But we should at least give a warning for using them. That has not been done previously because of the library's own uses of them (e.g. the std::iterator class template used as a base class). This adds deprecated attributes to the relevant components, and then goes through the whole library to add diagnostic pragmas where needed to suppress warnings about our internal uses of them. The tests are updated to either expect the additional warnings, or to suppress them where we aren't interested in them. libstdc++-v3/ChangeLog: PR libstdc++/91260 PR libstdc++/91383 PR libstdc++/95065 * include/backward/binders.h (bind1st, bind2nd): Add deprecated attribute. * include/bits/refwrap.h (_Maybe_unary_or_binary_function): Disable deprecated warnings for base classes. (_Reference_wrapper_base): Likewise. * include/bits/shared_ptr_base.h (_Sp_owner_less): Likewise. * include/bits/stl_bvector.h (_Bit_iterator_base): Likewise. * include/bits/stl_function.h (unary_function, binary_function): Add deprecated attribute. (unary_negate, not1, binary_negate, not2, ptr_fun) (pointer_to_unary_function, pointer_to_binary_function) (mem_fun_t, const_mem_fun_t, mem_fun_ref_t, const_mem_fun_ref_t) (mem_fun1_t, const_mem_fun1_t, mem_fun_ref1_t) (const_mem_fun1_ref_t, mem_fun, mem_fun_ref): Add deprecated attributes. * include/bits/stl_iterator.h: Disable deprecated warnings for std::iterator base classes. * include/bits/stl_iterator_base_types.h (iterator): Add deprecated attribute. * include/bits/stl_map.h (map::value_compare): Disable deprecated warnings for base class. * include/bits/stl_multimap.h (multimap::value_compare): Likewise. * include/bits/stl_raw_storage_iter.h (raw_storage_iterator): Add deprecated attribute. * include/bits/stl_tempbuf.h (get_temporary_buffer): Likewise. * include/bits/stream_iterator.h: Disable deprecated warnings. * include/bits/streambuf_iterator.h: Likewise. * include/ext/bitmap_allocator.h: Remove unary_function base classes. * include/ext/functional: Disable deprecated warnings. * include/ext/rope: Likewise. * include/ext/throw_allocator.h: Likewise. * include/std/type_traits (result_of): Add deprecated attribute. * include/tr1/functional: Disable deprecated warnings. * include/tr1/functional_hash.h: Likewise. * testsuite/20_util/function_objects/binders/1.cc: Add -Wno-disable-deprecations. * testsuite/20_util/function_objects/binders/3113.cc: Likewise. * testsuite/20_util/function_objects/constexpr.cc: Add dg-warning. * testsuite/20_util/raw_storage_iterator/base.cc: Likewise. * testsuite/20_util/raw_storage_iterator/dr2127.cc: Likewise. * testsuite/20_util/raw_storage_iterator/requirements/base_classes.cc: Likewise. * testsuite/20_util/raw_storage_iterator/requirements/explicit_instantiation/1.cc: Likewise. * testsuite/20_util/raw_storage_iterator/requirements/typedefs.cc: Likewise. * testsuite/20_util/reference_wrapper/24803.cc: Likewise. * testsuite/20_util/reference_wrapper/typedefs.cc: Enable for C++20 and check for absence of nested types. * testsuite/20_util/shared_ptr/comparison/less.cc: Remove std::binary_function base class. * testsuite/20_util/temporary_buffer.cc: Add dg-warning. * testsuite/21_strings/basic_string/cons/char/69092.cc: Remove std::iterator base class. * testsuite/24_iterators/back_insert_iterator/requirements/base_classes.cc: Likewise. * testsuite/24_iterators/front_insert_iterator/requirements/base_classes.cc: Likewise. * testsuite/24_iterators/insert_iterator/requir
[PATCH] libgomp, OpenMP: Fix issue for omp_get_device_num on gfx targets.
Hi, Currently omp_get_device_num does not work on gcn targets with more than one offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in icv-device.c and thus "__gomp_device_num" is not visible in the offload image. This patch removes "static" such that "__gomp_device_num" is now part of the offload image and can now be found in GOMP_OFFLOAD_load_image in the plugin. This is not an issue for nvptx. There, "__gomp_device_num" is in the offload image even with "static". The patch was tested on x86_64-linux with amdgcn offloading with no regressions. Marcel - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 libgomp, OpenMP: Fix issue for omp_get_device_num on gcn targets. Currently omp_get_device_num does not work on gcn targets with more than one offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in icv-device.c and thus "__gomp_device_num" is not visible in the offload image. This patch removes "static" such that "__gomp_device_num" is now part of the offload image and can now be found in GOMP_OFFLOAD_load_image in the plugin. This is not an issue for nvptx. There, "__gomp_device_num" is in the offload image even with "static". libgomp/ChangeLog: * config/gcn/icv-device.c: Make GOMP_DEVICE_NUM_VAR public (remove "static") to make the device num available in the offload image. diff --git a/libgomp/config/gcn/icv-device.c b/libgomp/config/gcn/icv-device.c index fcfa0f3..f70b7e6 100644 --- a/libgomp/config/gcn/icv-device.c +++ b/libgomp/config/gcn/icv-device.c @@ -60,7 +60,7 @@ omp_is_initial_device (void) /* This is set to the device number of current GPU during device initialization, when the offload image containing this libgomp portion is loaded. */ -static volatile int GOMP_DEVICE_NUM_VAR; +volatile int GOMP_DEVICE_NUM_VAR; int omp_get_device_num (void)
Re: [patch] Fix reverse SSO issues in IPA-SRA
> Thanks for the fixes, the forgotten duplication and streaming were quite > embarrassing, but one reason is that there are few reverse SSO testcases > in the suite. Would it be difficult to add some covering the issues you > fixed? No, I think I should be able to cover 3 out of the 4 changes, see below. > ...is this strictly necessary? I know it is inconsistent but the > certain flag is fairly special. More importantly... No strong opinion, but that's really inconsistent... > > @@ -3349,6 +3346,7 @@ pull_accesses_from_callee (cgraph_node *caller, > > isra_param_desc *param_desc,> > > copy->type = argacc->type; > > copy->alias_ptr_type = argacc->alias_ptr_type; > > copy->certain = true; > > > > + copy->reverse = argacc->reverse; > > > > vec_safe_push (param_desc->accesses, copy); > > > > } > > > >else if (prop_kinds[j] == ACC_PROP_CERTAIN) > > ...earlier in the function, there is a check doing: > > if (argacc->alias_ptr_type != pacc->alias_ptr_type > > || !types_compatible_p (argacc->type, pacc->type)) > > return "propagated access types would not match existing ones"; > > Will the types_compatible_p catch the case when one type is a > reverse-SSO and the other is not? If not, we probably need to check > that the flags are the same too. That's the change I don't know how to cover and I agree that the check looks in order, but I presume that the flag still needs to be copied onto "copy"? -- Eric Botcazou
[PATCH] Fix redefined macro for epiphany.
The following warning is emitted gazillion times. Fixes: In file included from ./tm.h:23, from gcc/genconfig.c:25: gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined 209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata" | In file included from ./tm.h:21, from gcc/genconfig.c:25: gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition 671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata" Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if not defined. --- gcc/config/elfos.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h index 2e0c709e585..add949ec940 100644 --- a/gcc/config/elfos.h +++ b/gcc/config/elfos.h @@ -206,7 +206,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define ASCII_DATA_ASM_OP "\t.ascii\t" /* Support a read-only data section. */ +#ifndef READONLY_DATA_SECTION_ASM_OP #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" +#endif /* On svr4, we *do* have support for the .init and .fini sections, and we can put stuff in there to be executed before and after `main'. We let -- 2.34.1
Re: [PATCH] Fix -Wformat-diag for s390x target.
On Wed, Jan 12, 2022 at 10:05 AM Martin Liška wrote: > > Hello. > > We've got -Wformat-diag for some time and I think we should start using it > in -Werror for GCC bootstrap. The following patch removes last pieces of the > warning > for s390x target. > > Ready to be installed? > Thanks, > Martin > > > gcc/ChangeLog: > > * config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap > keyword in quotes. > (s390_resolve_overloaded_builtin): Remove trailing dot. > * config/s390/s390.c (s390_const_operand_ok): Use - for range. > (s390_expand_builtin): Remove trailing dot. > (s390_emit_prologue): Likewise, use semicolon. > (s390_option_override_internal): Update keyword. > * varasm.c (do_assemble_alias): Wrap keyword in quotes. > --- > gcc/config/s390/s390-c.c | 9 + > gcc/config/s390/s390.c | 28 ++-- > gcc/varasm.c | 2 +- > 3 files changed, 20 insertions(+), 19 deletions(-) > > diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c > index 600018421df..10bc6ac8900 100644 > --- a/gcc/config/s390/s390-c.c > +++ b/gcc/config/s390/s390-c.c > @@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc, > case S390_OVERLOADED_BUILTIN_s390_vec_step: > if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE) > { > - error_at (loc, "builtin vec_step can only be used on vector > types."); > + error_at (loc, "builtin %qs can only be used on vector types", > + "vec_step "); extra space in "vec_step "? > return error_mark_node; > } > return build_int_cst (NULL_TREE, > @@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc, > if (ob_flags & B_INT) > { > error_at (loc, > - "builtin %qF is for GCC internal use only.", > + "builtin %qF is for GCC internal use only", > ob_fndecl); > return error_mark_node; > } > @@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc, > } > > if (ob_flags & B_DEP) > -warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl); > +warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl); > > if (!TARGET_VX && (ob_flags & B_VX)) > { > @@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc, > } > > if (bflags_overloaded_builtin_var[last_match_index] & B_DEP) > -warning_at (loc, 0, "%qs matching variant is deprecated.", > +warning_at (loc, 0, "%qs matching variant is deprecated", > IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); > > /* Overloaded variants which have MAX set as low level builtin are > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > index 056002e4a4a..bf96cbf7588 100644 > --- a/gcc/config/s390/s390.c > +++ b/gcc/config/s390/s390.c > @@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int > op_flags, tree decl) > argnum, decl, values); > } > else > - error ("constant argument %d for builtin %qF is out of range > (0..%wu)", > + error ("constant argument %d for builtin %qF is out of range > (0-%wu)", >argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1); > > return false; > @@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int > op_flags, tree decl) > || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1)) > { > error ("constant argument %d for builtin %qF is out of range " > -"(%wd..%wd)", argnum, decl, > +"(%wd-%wd)", argnum, decl, > -(HOST_WIDE_INT_1 << (bitwidth - 1)), > (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1); > return false; > @@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx > subtarget ATTRIBUTE_UNUSED, > if ((bflags & B_HTM) && !TARGET_HTM) > { > error ("builtin %qF is not supported without %<-mhtm%> " > -"(default with %<-march=zEC12%> and higher).", fndecl); > +"(default with %<-march=zEC12%> and higher)", fndecl); > return const0_rtx; > } > if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX) > { > error ("builtin %qF requires %<-mvx%> " > -"(default with %<-march=z13%> and higher).", fndecl); > +"(default with %<-march=z13%> and higher)", fndecl); > return const0_rtx; > } > > if ((bflags & B_VXE) && !TARGET_VXE) > { > - error ("Builtin %qF requires z14 or higher.", fndecl); > + error ("Builtin %qF requires z14 or higher", fndecl); > return const0_rtx; > } > > if ((bflags & B_VXE2) && !TARGET_VXE2) > { > - error ("Builtin %qF requires z15 or higher.", fndecl)
[PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.
This is RFC. Cheers, Martin gcc/ChangeLog: * config/epiphany/epiphany.c (epiphany_mode_priority): Use gcc_unreachable for not handled cases. --- gcc/config/epiphany/epiphany.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c index a2743436f38..88283419dc1 100644 --- a/gcc/config/epiphany/epiphany.c +++ b/gcc/config/epiphany/epiphany.c @@ -2369,6 +2369,7 @@ epiphany_mode_priority (int entity, int priority) case 2: return (epiphany_normal_fp_rounding == FP_MODE_ROUND_NEAREST ? FP_MODE_ROUND_TRUNC : FP_MODE_ROUND_NEAREST); case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_NEAREST: case FP_MODE_CALLER: @@ -2378,6 +2379,7 @@ epiphany_mode_priority (int entity, int priority) case 1: return FP_MODE_ROUND_TRUNC; case 2: return FP_MODE_INT; case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_TRUNC: switch (priority) @@ -2386,6 +2388,7 @@ epiphany_mode_priority (int entity, int priority) case 1: return FP_MODE_ROUND_NEAREST; case 2: return FP_MODE_INT; case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_UNKNOWN: case FP_MODE_NONE: -- 2.34.1
[PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.
This fixes -Wformat-diag warnings. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/epiphany/epiphany.c (epiphany_mode_priority): Use gcc_unreachable for not handled cases. --- gcc/config/epiphany/epiphany.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c index a2743436f38..88283419dc1 100644 --- a/gcc/config/epiphany/epiphany.c +++ b/gcc/config/epiphany/epiphany.c @@ -2369,6 +2369,7 @@ epiphany_mode_priority (int entity, int priority) case 2: return (epiphany_normal_fp_rounding == FP_MODE_ROUND_NEAREST ? FP_MODE_ROUND_TRUNC : FP_MODE_ROUND_NEAREST); case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_NEAREST: case FP_MODE_CALLER: @@ -2378,6 +2379,7 @@ epiphany_mode_priority (int entity, int priority) case 1: return FP_MODE_ROUND_TRUNC; case 2: return FP_MODE_INT; case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_TRUNC: switch (priority) @@ -2386,6 +2388,7 @@ epiphany_mode_priority (int entity, int priority) case 1: return FP_MODE_ROUND_NEAREST; case 2: return FP_MODE_INT; case 3: return FP_MODE_CALLER; + default: gcc_unreachable (); } case FP_MODE_ROUND_UNKNOWN: case FP_MODE_NONE: -- 2.34.1
[PATCH] epiphany: fir -Wformat-diag.
This fixes -Wformat-diag warnings. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/epiphany/epiphany.c (epiphany_handle_interrupt_attribute): Use %qs format specifier. (epiphany_override_options): Wrap keyword in %<, %>. --- gcc/config/epiphany/epiphany.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c index 88283419dc1..c9c321560de 100644 --- a/gcc/config/epiphany/epiphany.c +++ b/gcc/config/epiphany/epiphany.c @@ -520,8 +520,10 @@ epiphany_handle_interrupt_attribute (tree *node, tree name, tree args, && strcmp (TREE_STRING_POINTER (value), "swi")) { warning (OPT_Wattributes, - "argument of %qE attribute is not \"reset\", \"software_exception\", \"page_miss\", \"timer0\", \"timer1\", \"message\", \"dma0\", \"dma1\", \"wand\" or \"swi\"", - name); + "argument of %qE attribute is not %qs, %qs %qs, %qs, %qs, " + "%qs, %qs, %qs, %qs or %qs", name, + "reset", "software_exception", "page_miss", "timer0", "timer1", + "message", "dma0", "dma1", "wand", "swi"); *no_add_attrs = true; return NULL_TREE; } @@ -1538,9 +1540,9 @@ static void epiphany_override_options (void) { if (epiphany_stack_offset < 4) -error ("stack_offset must be at least 4"); +error ("% must be at least 4"); if (epiphany_stack_offset & 3) -error ("stack_offset must be a multiple of 4"); +error ("% must be a multiple of 4"); epiphany_stack_offset = (epiphany_stack_offset + 3) & -4; if (!TARGET_SOFT_CMPSF) flag_finite_math_only = 1; -- 2.34.1
Re: [PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.
On 1/12/22 10:56, Martin Liška wrote: This fixes -Wformat-diag warnings. Ready to be installed? Thanks, Martin Please forget about this email, it's a dup. Martin
Re: [PATCH] Fix redefined macro for epiphany.
On Wed, Jan 12, 2022 at 10:47 AM Martin Liška wrote: > > The following warning is emitted gazillion times. > > Fixes: > > In file included from ./tm.h:23, > from gcc/genconfig.c:25: > gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined >209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata" >| > In file included from ./tm.h:21, > from gcc/genconfig.c:25: > gcc/config/epiphany/epiphany.h:671: note: this is the location of the > previous definition >671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata" > > Ready to be installed? Most other targets #undef READONLY_DATA_SECION_ASM_OP before defining, isn't the issue that the target doesn't do that and that it has uncommon order of tm_file in config.gcc? (${tm_file} first, before elfos.h) Richard. > Thanks, > Martin > > gcc/ChangeLog: > > * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if > not defined. > --- > gcc/config/elfos.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h > index 2e0c709e585..add949ec940 100644 > --- a/gcc/config/elfos.h > +++ b/gcc/config/elfos.h > @@ -206,7 +206,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively. > If not, see > #define ASCII_DATA_ASM_OP "\t.ascii\t" > > /* Support a read-only data section. */ > +#ifndef READONLY_DATA_SECTION_ASM_OP > #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" > +#endif > > /* On svr4, we *do* have support for the .init and .fini sections, and we > can put stuff in there to be executed before and after `main'. We let > -- > 2.34.1 >
Re: [PATCH] Fix redefined macro for epiphany.
On 1/12/22 11:03, Richard Biener wrote: tm_file in config.gcc? (${tm_file} first, before elfos.h) Heh, yes, that's the case. The following patch fixes that. May I install it? Cheers, MartinFrom 14fa27ba17cf9aa1a971ae93f8a5ef6aa857c67e Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 12 Jan 2022 11:09:20 +0100 Subject: [PATCH] Include elfos.h before ${tm_file}. Fixes: In file included from ./tm.h:23, from gcc/genconfig.c:25: gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined 209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata" | In file included from ./tm.h:21, from gcc/genconfig.c:25: gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition 671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata" gcc/ChangeLog: * config.gcc: Include elfos.h before ${tm_file}. --- gcc/config.gcc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index 89e0992fda5..f1b8d832c6f 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1620,7 +1620,7 @@ csky-*-*) esac ;; epiphany-*-elf | epiphany-*-rtems*) - tm_file="${tm_file} dbxelf.h elfos.h" + tm_file="dbxelf.h elfos.h ${tm_file}" tmake_file="${tmake_file} epiphany/t-epiphany" case ${target} in epiphany-*-rtems*) -- 2.34.1
[vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
Hi, This a fix for the regression caused by '[vect] Re-analyze all modes for epilogues'. The earlier patch assumed there was always at least one other mode than VOIDmode, but that does not need to be the case. If we are dealing with a target that does not define more modes for 'autovectorize_vector_modes', the behaviour before the patch would be to try to create an epilogue for the same autodetected_vector_mode, which unless the target supported partial vectors would always fail. So as a fix I suggest trying to vectorize the epilogue with the preferred_simd_mode for QI, mimicking autovectorize_vector_mode, which will be skipped if it is not a vector_mode (since that already should indicate partial vectors aren't possible) or if no partial vectors are supported and its pessimistic NUNITS is larger than the main loop's VF. Currently bootstrapping and regression testing, otherwise OK for trunk? Can someone verify this fixes the issue for PR103971 on powerpc? gcc/ChangeLog: * tree-vect-loop.c (vect-analyze-loop): Handle scenario where target does add autovectorize_vector_modes. diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..c81ebc411312e649f9cd954895244c60c928fee1 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -3024,6 +3024,18 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared) ordering is not guaranteed, so we could end up picking a mode for the main loop that is after the epilogue's optimal mode. */ mode_i = 1; + /* If we only had VOIDmode then push the AUTODETECTED_VECTOR_MODE to see if + an epilogue can be created with that mode. */ + if (vector_modes.length () == 1) +{ + machine_mode preferred_mode + = targetm.vectorize.preferred_simd_mode (QImode); + /* If the preferred mode isn't a vector mode we will not be needing an + epilogue. */ + if (!VECTOR_MODE_P (preferred_mode)) + return first_loop_vinfo; + vector_modes.safe_push (preferred_mode); +} bool supports_partial_vectors = partial_vectors_supported_p (); poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
Re: [patch] Fix reverse SSO issues in IPA-SRA
Hi, On Wed, Jan 12 2022, Eric Botcazou wrote: >> Thanks for the fixes, the forgotten duplication and streaming were quite >> embarrassing, but one reason is that there are few reverse SSO testcases >> in the suite. Would it be difficult to add some covering the issues you >> fixed? > > No, I think I should be able to cover 3 out of the 4 changes, see below. > >> ...is this strictly necessary? I know it is inconsistent but the >> certain flag is fairly special. More importantly... > > No strong opinion, but that's really inconsistent... > >> > @@ -3349,6 +3346,7 @@ pull_accesses_from_callee (cgraph_node *caller, >> > isra_param_desc *param_desc,> >> > copy->type = argacc->type; >> > copy->alias_ptr_type = argacc->alias_ptr_type; >> > copy->certain = true; >> > >> > +copy->reverse = argacc->reverse; >> > >> > vec_safe_push (param_desc->accesses, copy); >> > >> >} >> > >> >else if (prop_kinds[j] == ACC_PROP_CERTAIN) >> >> ...earlier in the function, there is a check doing: >> >>if (argacc->alias_ptr_type != pacc->alias_ptr_type >> >>|| !types_compatible_p (argacc->type, pacc->type)) >> >> return "propagated access types would not match existing > ones"; >> >> Will the types_compatible_p catch the case when one type is a >> reverse-SSO and the other is not? If not, we probably need to check >> that the flags are the same too. > > That's the change I don't know how to cover and I agree that the check looks > in order, It might be indeed difficult to trigger the inconsistency without some rather unsavory type casting. Basically it would mean we would have something like... f (struct with_reverse s) { use (s.field); } g (struct without_reverse_buth_otherwise_identical s) { f ((struct with_reverse) s); } However with LTO, in incorrect programs, this might happen even without the typecast and so IPA might encounter the situation even without an explicit typecast. > but I presume that the flag still needs to be copied onto "copy"? > Yes, it must still be copied. Thanks, Martin
Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs
Hi, I can confirm that I don’t see this failure on a Debian bullseye/sid (Linux 5.11.0-46, glibc 2.31-0ubuntu9.2) with a fresh bootstrap of master: $ grep signaling testsuite/gfortran/gfortran.sum PASS: gfortran.dg/ieee/signaling_1.f90 -O0 (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -O0 execution test PASS: gfortran.dg/ieee/signaling_1.f90 -O1 (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -O1 execution test PASS: gfortran.dg/ieee/signaling_1.f90 -O2 (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -O2 execution test PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -g (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -g execution test PASS: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) PASS: gfortran.dg/ieee/signaling_1.f90 -Os execution test It’s showing on some gcc-testresults for x86_64 and aarch64 linux, so I’ll need someone (HJ, Andreas are in CC) to show me what the error is. It may be related to the use of dg-options instead of dg-additional-options in that testcase, so I pushed the attached fix. I’ll check the test results mailing-list to see if it improved things (or confirm when the error message is posted). FX test.patch Description: Binary data
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > Hi, > > This a fix for the regression caused by '[vect] Re-analyze all modes for > epilogues'. The earlier patch assumed there was always at least one other mode > than VOIDmode, but that does not need to be the case. > If we are dealing with a target that does not define more modes for > 'autovectorize_vector_modes', the behaviour before the patch would be to try > to create an epilogue for the same autodetected_vector_mode, which unless the > target supported partial vectors would always fail. So as a fix I suggest > trying to vectorize the epilogue with the preferred_simd_mode for QI, > mimicking autovectorize_vector_mode, which will be skipped if it is not a > vector_mode (since that already should indicate partial vectors aren't > possible) or if no partial vectors are supported and its pessimistic NUNITS is > larger than the main loop's VF. > > Currently bootstrapping and regression testing, otherwise OK for trunk? Can > someone verify this fixes the issue for PR103971 on powerpc? Why not simply start at mode_i = 0 which means autodetecting the mode to use for the epilogue? That appears to be a much simpler solution to me, including for targets where there are more than one element in the vector. Richard.
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
Richard Biener writes: > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> Hi, >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for >> epilogues'. The earlier patch assumed there was always at least one other >> mode >> than VOIDmode, but that does not need to be the case. >> If we are dealing with a target that does not define more modes for >> 'autovectorize_vector_modes', the behaviour before the patch would be to try >> to create an epilogue for the same autodetected_vector_mode, which unless the >> target supported partial vectors would always fail. So as a fix I suggest >> trying to vectorize the epilogue with the preferred_simd_mode for QI, >> mimicking autovectorize_vector_mode, which will be skipped if it is not a >> vector_mode (since that already should indicate partial vectors aren't >> possible) or if no partial vectors are supported and its pessimistic NUNITS >> is >> larger than the main loop's VF. >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can >> someone verify this fixes the issue for PR103971 on powerpc? > > Why not simply start at mode_i = 0 which means autodetecting the mode > to use for the epilogue? That appears to be a much simpler solution to > me, including for targets where there are more than one element in the > vector. VOIDmode doesn't tell us anything about what the autodetected mode will be, so current short-circuit: /* If the target does not support partial vectors we can shorten the number of modes to analyze for the epilogue as we know we can't pick a mode that has at least as many NUNITS as the main loop's vectorization factor, since that would imply the epilogue's vectorization factor would be at least as high as the main loop's and we would be vectorizing for more scalar iterations than there would be left. */ if (!supports_partial_vectors && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) { mode_i++; if (mode_i == vector_modes.length ()) break; continue; } wouldn't be effective. Thanks, Richard
Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs
On Wed, Jan 12, 2022 at 11:23:43AM +0100, FX via Gcc-patches wrote: > I can confirm that I don’t see this failure on a Debian bullseye/sid (Linux > 5.11.0-46, glibc 2.31-0ubuntu9.2) with a fresh bootstrap of master: > > $ grep signaling testsuite/gfortran/gfortran.sum > PASS: gfortran.dg/ieee/signaling_1.f90 -O0 (test for excess errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -O0 execution test > PASS: gfortran.dg/ieee/signaling_1.f90 -O1 (test for excess errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -O1 execution test > PASS: gfortran.dg/ieee/signaling_1.f90 -O2 (test for excess errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -O2 execution test > PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess > errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution test > PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -g (test for excess errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -O3 -g execution test > PASS: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) > PASS: gfortran.dg/ieee/signaling_1.f90 -Os execution test The error I was getting was: /home/jakub/src/gcc/obj46/gcc/testsuite/gfortran2/../../gfortran -B/home/jakub/src/gcc/obj46/gcc/testsuite/gfortran2/../../ -B/home/jakub/src/gcc/obj46/x86_64-pc -linux-gnu/./libgfortran/ /home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90 -fdiagnostics-plain-output -fdiagnostics-plain-output -O1 -fsignaling-nans /home/jakub/sr c/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1_c.c -dumpbase -B/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/.libs -L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./ libgfortran/.libs -L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/.libs -L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libatomic/.libs -B/home/jakub/src/gcc/obj46/x8 6_64-pc-linux-gnu/./libquadmath/.libs -L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libquadmath/.libs -L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libquadmath/.libs -lm -o . /signaling_1.exe /home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90:8:20: Fatal Error: Cannot find an intrinsic module named 'ieee_arithmetic' at (1) compilation terminated. compiler exited with status 1 FAIL: gfortran.dg/ieee/signaling_1.f90 -O1 (test for excess errors) Excess errors: /home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90:8:20: Fatal Error: Cannot find an intrinsic module named 'ieee_arithmetic' at (1) compilation terminated. UNRESOLVED: gfortran.dg/ieee/signaling_1.f90 -O1 compilation failed to produce executable And -! { dg-options "-fsignaling-nans" } +! { dg-additional-options "-fsignaling-nans" } doesn't fix it, it changes the FAIL into: cc1: warning: command-line option '-fintrinsic-modules-path /home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/' is valid for Fortran but not for C FAIL: gfortran.dg/ieee/signaling_1.f90 -O1 (test for excess errors) Excess errors: cc1: warning: command-line option '-fintrinsic-modules-path /home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/' is valid for Fortran but not for C We need -fintrinsic-modules-path option for the signalling_1.f90 compilation but need to make sure it isn't used when the *.c file is compiled, so they need to be compiled by separate compiler invocations probably. Jakub
Re: [PATCH] Fix redefined macro for epiphany.
On Wed, Jan 12, 2022 at 11:10 AM Martin Liška wrote: > > On 1/12/22 11:03, Richard Biener wrote: > > tm_file in config.gcc? (${tm_file} first, before elfos.h) > > Heh, yes, that's the case. The following patch fixes that. OK if the target builds with that. > May I install it? > Cheers, > Martin
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > > > >> Hi, > >> > >> This a fix for the regression caused by '[vect] Re-analyze all modes for > >> epilogues'. The earlier patch assumed there was always at least one other > >> mode > >> than VOIDmode, but that does not need to be the case. > >> If we are dealing with a target that does not define more modes for > >> 'autovectorize_vector_modes', the behaviour before the patch would be to > >> try > >> to create an epilogue for the same autodetected_vector_mode, which unless > >> the > >> target supported partial vectors would always fail. So as a fix I suggest > >> trying to vectorize the epilogue with the preferred_simd_mode for QI, > >> mimicking autovectorize_vector_mode, which will be skipped if it is not a > >> vector_mode (since that already should indicate partial vectors aren't > >> possible) or if no partial vectors are supported and its pessimistic > >> NUNITS is > >> larger than the main loop's VF. > >> > >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can > >> someone verify this fixes the issue for PR103971 on powerpc? > > > > Why not simply start at mode_i = 0 which means autodetecting the mode > > to use for the epilogue? That appears to be a much simpler solution to > > me, including for targets where there are more than one element in the > > vector. > > VOIDmode doesn't tell us anything about what the autodetected mode > will be, so current short-circuit: > > /* If the target does not support partial vectors we can shorten the >number of modes to analyze for the epilogue as we know we can't pick a >mode that has at least as many NUNITS as the main loop's vectorization >factor, since that would imply the epilogue's vectorization factor >would be at least as high as the main loop's and we would be >vectorizing for more scalar iterations than there would be left. */ > if (!supports_partial_vectors > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) > { > mode_i++; > if (mode_i == vector_modes.length ()) > break; > continue; > } > > wouldn't be effective. Well, before this change we simply did - /* Handle the case that the original loop can use partial - vectorization, but want to only adopt it for the epilogue. - The retry should be in the same mode as original. */ - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) ... - else -{ - mode_i = first_loop_next_i; - if (mode_i == vector_modes.length ()) - return first_loop_vinfo; -} and thus didn't bother with epilogue vectorization. I think we should then just restore this behavior, not doing epilogue vectorization if vector_modes.length () == 1? Richard.
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
Richard Biener writes: > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: >> > >> >> Hi, >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for >> >> epilogues'. The earlier patch assumed there was always at least one other >> >> mode >> >> than VOIDmode, but that does not need to be the case. >> >> If we are dealing with a target that does not define more modes for >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to >> >> try >> >> to create an epilogue for the same autodetected_vector_mode, which unless >> >> the >> >> target supported partial vectors would always fail. So as a fix I suggest >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI, >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a >> >> vector_mode (since that already should indicate partial vectors aren't >> >> possible) or if no partial vectors are supported and its pessimistic >> >> NUNITS is >> >> larger than the main loop's VF. >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? >> >> Can >> >> someone verify this fixes the issue for PR103971 on powerpc? >> > >> > Why not simply start at mode_i = 0 which means autodetecting the mode >> > to use for the epilogue? That appears to be a much simpler solution to >> > me, including for targets where there are more than one element in the >> > vector. >> >> VOIDmode doesn't tell us anything about what the autodetected mode >> will be, so current short-circuit: >> >> /* If the target does not support partial vectors we can shorten the >> number of modes to analyze for the epilogue as we know we can't pick a >> mode that has at least as many NUNITS as the main loop's vectorization >> factor, since that would imply the epilogue's vectorization factor >> would be at least as high as the main loop's and we would be >> vectorizing for more scalar iterations than there would be left. */ >> if (!supports_partial_vectors >>&& maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) >> { >>mode_i++; >>if (mode_i == vector_modes.length ()) >> break; >>continue; >> } >> >> wouldn't be effective. > > Well, before this change we simply did > > - /* Handle the case that the original loop can use partial > - vectorization, but want to only adopt it for the epilogue. > - The retry should be in the same mode as original. */ > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > ... > - else > -{ > - mode_i = first_loop_next_i; > - if (mode_i == vector_modes.length ()) > - return first_loop_vinfo; > -} > > and thus didn't bother with epilogue vectorization. I think we should > then just restore this behavior, not doing epilogue vectorization > if vector_modes.length () == 1? Yeah, but that case didn't need epilogue vectorisation before. This series is adding support for unrolling, and targets with a single vector size will benefit from epilogues in that case. Thanks, Richard
Re: [Patch][V2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables
On Tue, Jan 11, 2022 at 5:32 PM Qing Zhao wrote: > > Hi, Richard, > > > On Jan 11, 2022, at 7:43 AM, Richard Biener > > wrote: > > > > > 1. Add some meaningful temporaries to break the long expression to make > it > Readable. And also add comments to explain the purpose of the > statement; > > 2. Resolve the memory leakage of the dynamically created string. > > The patch has been bootstrapped and regressing tested on both x86 and > aarch64, no issues. > Okay for commit? > >>> > >>> tree decl_name > >>> += build_string_literal (IDENTIFIER_LENGTH (DECL_NAME (decl)) + 1, > >>> + IDENTIFIER_POINTER (DECL_NAME (decl))); > >>> > >>> you need to deal with DECL_NAME being NULL. > >> > >> Okay. > >> Usually under what situation, the decl_name will be NULL? > > > > I don't know but it definitely happens. > > > >>> It's also a bit awkward > >>> to build another > >>> copy of all decl names :/ > >> > >> Yes, this is awkward. But it might be unavoidable for address taken > >> variables since the original variable might be completely deleted by > >> optimizations. > >> See the details at: > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html > >> > >> We had a previous discussion on this issue, and the idea of adding this > >> 3rd argument with the name of the variable was proposed by you at that > >> time. -:) > > > > I know ... I didn't have a better idea. > > I think that adding the name string of the auto variable as one parameter to > the function .DEFERRED_INIT might be the only solution to this issue? (In the > very beginning of the implementation, we added the VAR itself as one > parameter to the function .DEFERRED_INIT, but that design didn’t work out) > > > >> > >> > >>> > >>> + /* The LHS of the call is a temporary variable, we use it as a > >>> +placeholder to record the information on whether the warning > >>> +has been issued or not. */ > >>> + repl_var = gimple_call_lhs (def_stmt); > >>> > >>> this seems to be a change that can be done independently? > >> > >> The major purpose of this “repl_var” is used to record the info whether > >> the warning has been issued for the variable or not, then avoid emitting > >> it again later. > >> Since the original variable has been completely deleted by optimization, > >> we have to use this “repl_var” for a placeholder to record such info. > > > > But the ... = .DEFERRED_INIT stmt itself could be used to record this > > since its location is > > already used to indicate the original location, like with > > suppress_warning/warning_suppressed_p? > > Ah, I will check on this. Thanks a lot. > > > >>> > >>> + /* Ignore the call to .DEFERRED_INIT that define the original > >>> +var itself. */ > >>> + if (is_gimple_assign (context)) > >>> + { > >>> + if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL) > >>> + lhs_var = gimple_assign_lhs (context); > >>> + else if (TREE_CODE (gimple_assign_lhs (context)) == > >>> SSA_NAME) > >>> + lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context)); > >>> + } > >>> + if (lhs_var > >>> + && (lhs_var_name = DECL_NAME (lhs_var)) > >>> + && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name)) > >>> + && (strcmp (lhs_var_name_str, var_name_str) == 0)) > >>> + return; > >>> > >>> likewise but I don't really understand what you are doing here. > >> > >> The above is to exclude the following case: > >> > >> temp = .DEFERRED_INIT (4, 2, “alt_reloc"); > >> alt_reloc = temp; > >> > >> i.e, a call to .DEFERRED_INIT that define the original variable itself. > > > > How can this happen? It looks like a bug to me. Do you have a testcase? > With -ftrivial-auto-var-init, During gimplification phase, almost all address > taken variables that do not have an explicit initializer will have the above > IR pattern. > > For example, gcc.dg/auto-init-uninit-16.c: > > [opc@qinzhao-ol8u3-x86 gcc]$ cat > /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c > /* { dg-do compile } */ > /* { dg-options "-O2 -Wuninitialized -ftrivial-auto-var-init=zero" } */ > > int foo, bar; > > static > void decode_reloc(int reloc, int *is_alt) > { > if (reloc >= 20) > *is_alt = 1; > else if (reloc >= 10) > *is_alt = 0; > } > > void testfunc() > { > int alt_reloc; > > decode_reloc(foo, &alt_reloc); > > if (alt_reloc) /* { dg-warning "may be used uninitialized" "" } */ > bar = 42; > } > > With -ftrivial-auto-var-init=zero, the IR after gimplification is: > > _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); > alt_reloc = _1; > > And the IR after SSA is similar as the above: > > _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); > alt_reloc = _1; > > During the early uninitialized analysis
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 12 Jan 2022, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> > > >> >> Hi, > >> >> > >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for > >> >> epilogues'. The earlier patch assumed there was always at least one > >> >> other mode > >> >> than VOIDmode, but that does not need to be the case. > >> >> If we are dealing with a target that does not define more modes for > >> >> 'autovectorize_vector_modes', the behaviour before the patch would be > >> >> to try > >> >> to create an epilogue for the same autodetected_vector_mode, which > >> >> unless the > >> >> target supported partial vectors would always fail. So as a fix I > >> >> suggest > >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI, > >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not > >> >> a > >> >> vector_mode (since that already should indicate partial vectors aren't > >> >> possible) or if no partial vectors are supported and its pessimistic > >> >> NUNITS is > >> >> larger than the main loop's VF. > >> >> > >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? > >> >> Can > >> >> someone verify this fixes the issue for PR103971 on powerpc? > >> > > >> > Why not simply start at mode_i = 0 which means autodetecting the mode > >> > to use for the epilogue? That appears to be a much simpler solution to > >> > me, including for targets where there are more than one element in the > >> > vector. > >> > >> VOIDmode doesn't tell us anything about what the autodetected mode > >> will be, so current short-circuit: > >> > >> /* If the target does not support partial vectors we can shorten the > >> number of modes to analyze for the epilogue as we know we can't pick a > >> mode that has at least as many NUNITS as the main loop's vectorization > >> factor, since that would imply the epilogue's vectorization factor > >> would be at least as high as the main loop's and we would be > >> vectorizing for more scalar iterations than there would be left. */ > >> if (!supports_partial_vectors > >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) > >>{ > >> mode_i++; > >> if (mode_i == vector_modes.length ()) > >>break; > >> continue; > >>} > >> > >> wouldn't be effective. > > > > Well, before this change we simply did > > > > - /* Handle the case that the original loop can use partial > > - vectorization, but want to only adopt it for the epilogue. > > - The retry should be in the same mode as original. */ > > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > > ... > > - else > > -{ > > - mode_i = first_loop_next_i; > > - if (mode_i == vector_modes.length ()) > > - return first_loop_vinfo; > > -} > > > > and thus didn't bother with epilogue vectorization. I think we should > > then just restore this behavior, not doing epilogue vectorization > > if vector_modes.length () == 1? > > Yeah, but that case didn't need epilogue vectorisation before. This > series is adding support for unrolling, and targets with a single vector > size will benefit from epilogues in that case. But in that case (which we could detect), we could then just use autodetected_vector_mode? Like if we do before epilogue vect vector_modes[0] = autodetected_vector_mode; mode_i = 0; thus replace VOIDmode with what we detected and then start at 0? That is, the proposed patch looks very much like a hack to me. I suppose the VECTOR_MODE_P check should be added to if (!supports_partial_vectors && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) { mode_i++; instead. > Thanks, > Richard > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
Richard Biener writes: > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: >> > >> >> Richard Biener writes: >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: >> >> > >> >> >> Hi, >> >> >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes >> >> >> for >> >> >> epilogues'. The earlier patch assumed there was always at least one >> >> >> other mode >> >> >> than VOIDmode, but that does not need to be the case. >> >> >> If we are dealing with a target that does not define more modes for >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be >> >> >> to try >> >> >> to create an epilogue for the same autodetected_vector_mode, which >> >> >> unless the >> >> >> target supported partial vectors would always fail. So as a fix I >> >> >> suggest >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI, >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is >> >> >> not a >> >> >> vector_mode (since that already should indicate partial vectors aren't >> >> >> possible) or if no partial vectors are supported and its pessimistic >> >> >> NUNITS is >> >> >> larger than the main loop's VF. >> >> >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for >> >> >> trunk? Can >> >> >> someone verify this fixes the issue for PR103971 on powerpc? >> >> > >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode >> >> > to use for the epilogue? That appears to be a much simpler solution to >> >> > me, including for targets where there are more than one element in the >> >> > vector. >> >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode >> >> will be, so current short-circuit: >> >> >> >> /* If the target does not support partial vectors we can shorten the >> >>number of modes to analyze for the epilogue as we know we can't pick a >> >>mode that has at least as many NUNITS as the main loop's vectorization >> >>factor, since that would imply the epilogue's vectorization factor >> >>would be at least as high as the main loop's and we would be >> >>vectorizing for more scalar iterations than there would be left. */ >> >> if (!supports_partial_vectors >> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) >> >> { >> >> mode_i++; >> >> if (mode_i == vector_modes.length ()) >> >> break; >> >> continue; >> >> } >> >> >> >> wouldn't be effective. >> > >> > Well, before this change we simply did >> > >> > - /* Handle the case that the original loop can use partial >> > - vectorization, but want to only adopt it for the epilogue. >> > - The retry should be in the same mode as original. */ >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) >> > ... >> > - else >> > -{ >> > - mode_i = first_loop_next_i; >> > - if (mode_i == vector_modes.length ()) >> > - return first_loop_vinfo; >> > -} >> > >> > and thus didn't bother with epilogue vectorization. I think we should >> > then just restore this behavior, not doing epilogue vectorization >> > if vector_modes.length () == 1? >> >> Yeah, but that case didn't need epilogue vectorisation before. This >> series is adding support for unrolling, and targets with a single vector >> size will benefit from epilogues in that case. > > But in that case (which we could detect), we could then just use > autodetected_vector_mode? Like if we do before epilogue vect > > vector_modes[0] = autodetected_vector_mode; > mode_i = 0; > > thus replace VOIDmode with what we detected and then start at 0? > That is, the proposed patch looks very much like a hack to me. You mean check whether the loop is unrolled? If so, that's what feels like a hack to me :-) The question is whether there are enough elements for epilogue vectorisation to make sense. The VF is what tells us that. Unrolling is just one of the things that influences that VF and I don't think we should check for the individual influences. It's just the end result that matters. > I suppose the VECTOR_MODE_P check should be added to > > if (!supports_partial_vectors > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > first_vinfo_vf)) > { > mode_i++; > > instead. You mean: if (!supports_partial_vectors && VECTOR_MODE_P (vector_modes[mode_i]) && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf)) { mode_i++; ? If so, the skip won't be effective the first time round. Thanks, Richard
Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs
Hi Jakub, > We need -fintrinsic-modules-path option for the signalling_1.f90 compilation > but need to make sure it isn't used when the *.c file is compiled, so they > need to be compiled by separate compiler invocations probably. Thanks for posting the errors! So I wasn’t seeing it because I had run “make install” before the testsuite. The patch I pushed at https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6b14100b9504800768da726dcb81f1857db3b493 should fix the failure, then. FX
Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs
On Wed, Jan 12, 2022 at 12:03:40PM +0100, FX wrote: > > We need -fintrinsic-modules-path option for the signalling_1.f90 compilation > > but need to make sure it isn't used when the *.c file is compiled, so they > > need to be compiled by separate compiler invocations probably. > > Thanks for posting the errors! So I wasn’t seeing it because I had run > “make install” before the testsuite. The patch I pushed at > https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6b14100b9504800768da726dcb81f1857db3b493 > should fix the failure, then. Thanks. Just a nit, it is cc1 that reports the warning, not f951. Jakub
Re: [PATCH] Fix -Wformat-diag for aarch64 target.
On 12/01/2022 09:02, Martin Liška wrote: Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for aarch64 target. Ready to be installed? Thanks, Martin OK. R. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_parse_boolean_options): Use %qs where possible. (aarch64_parse_sve_width_string): Likewise. (aarch64_override_options_internal): Likewise. (aarch64_print_hint_for_extensions): Likewise. (aarch64_validate_sls_mitigation): Likewise. (aarch64_handle_attr_arch): Likewise. (aarch64_handle_attr_cpu): Likewise. (aarch64_handle_attr_tune): Likewise. (aarch64_handle_attr_isa_flags): Likewise. --- gcc/config/aarch64/aarch64.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5920b29880a..1bca2a31a1b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -16337,7 +16337,7 @@ aarch64_parse_boolean_options (const char *option, /* We ended with a comma, print something. */ if (!(*specs)) { - error ("%s string ill-formed\n", option_name); + error ("%qs string ill-formed", option_name); return 0; } @@ -16393,7 +16393,7 @@ aarch64_parse_sve_width_string (const char *tune_string, int n = sscanf (tune_string, "%d", &width); if (n == EOF) { - error ("invalid format for sve_width"); + error ("invalid format for %"); return; } switch (width) @@ -16405,7 +16405,7 @@ aarch64_parse_sve_width_string (const char *tune_string, case SVE_2048: break; default: - error ("invalid sve_width value: %d", width); + error ("invalid % value: %d", width); } tune->sve_width = (enum aarch64_sve_vector_bits_enum) width; } @@ -16628,7 +16628,7 @@ aarch64_override_options_internal (struct gcc_options *opts) if (opts->x_aarch64_stack_protector_guard_reg_str) { if (strlen (opts->x_aarch64_stack_protector_guard_reg_str) > 100) - error ("specify a system register with a small string length."); + error ("specify a system register with a small string length"); } if (opts->x_aarch64_stack_protector_guard_offset_str) @@ -16832,7 +16832,7 @@ aarch64_print_hint_for_extensions (const std::string &str) inform (input_location, "valid arguments are: %s;" " did you mean %qs?", s, hint); else - inform (input_location, "valid arguments are: %s;", s); + inform (input_location, "valid arguments are: %s", s); XDELETEVEC (s); } @@ -16933,7 +16933,7 @@ aarch64_validate_sls_mitigation (const char *const_str) temp |= SLS_RETBR; else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0) { - error ("%<%s%> must be by itself for %<-mharden-sls=%>", str); + error ("%qs must be by itself for %<-mharden-sls=%>", str); break; } else @@ -17572,11 +17572,11 @@ aarch64_handle_attr_arch (const char *str) error ("missing name in % pragma or attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in % pragma or attribute", str); + error ("invalid name (%qs) in % pragma or attribute", str); aarch64_print_hint_for_arch (str); break; case AARCH64_PARSE_INVALID_FEATURE: - error ("invalid feature modifier %s of value (\"%s\") in " + error ("invalid feature modifier %s of value (%qs) in " "% pragma or attribute", invalid_extension.c_str (), str); aarch64_print_hint_for_extensions (invalid_extension); break; @@ -17614,11 +17614,11 @@ aarch64_handle_attr_cpu (const char *str) error ("missing name in % pragma or attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in % pragma or attribute", str); + error ("invalid name (%qs) in % pragma or attribute", str); aarch64_print_hint_for_core (str); break; case AARCH64_PARSE_INVALID_FEATURE: - error ("invalid feature modifier %s of value (\"%s\") in " + error ("invalid feature modifier %s of value (%qs) in " "% pragma or attribute", invalid_extension.c_str (), str); aarch64_print_hint_for_extensions (invalid_extension); break; @@ -17645,7 +17645,7 @@ aarch64_handle_attr_cpu (const char *str) " attribute"); break; case AARCH64_PARSE_INVALID_ARG: - error ("invalid protection type (\"%s\") in %+ error ("invalid protection type (%qs) in % "=\")%> pragma or attribute", err_str); break; case AARCH64_PARSE_OK: @@ -17680,7 +17680,7 @@ aarch64_handle_attr_tune (const char *str) switch (parse_res) { case AARCH64_PARSE_INVALID_ARG: - error ("invalid name (\"%s\") in %
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 12 Jan 2022, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> > > >> >> Richard Biener writes: > >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> >> > > >> >> >> Hi, > >> >> >> > >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes > >> >> >> for > >> >> >> epilogues'. The earlier patch assumed there was always at least one > >> >> >> other mode > >> >> >> than VOIDmode, but that does not need to be the case. > >> >> >> If we are dealing with a target that does not define more modes for > >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would > >> >> >> be to try > >> >> >> to create an epilogue for the same autodetected_vector_mode, which > >> >> >> unless the > >> >> >> target supported partial vectors would always fail. So as a fix I > >> >> >> suggest > >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI, > >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is > >> >> >> not a > >> >> >> vector_mode (since that already should indicate partial vectors > >> >> >> aren't > >> >> >> possible) or if no partial vectors are supported and its pessimistic > >> >> >> NUNITS is > >> >> >> larger than the main loop's VF. > >> >> >> > >> >> >> Currently bootstrapping and regression testing, otherwise OK for > >> >> >> trunk? Can > >> >> >> someone verify this fixes the issue for PR103971 on powerpc? > >> >> > > >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode > >> >> > to use for the epilogue? That appears to be a much simpler solution > >> >> > to > >> >> > me, including for targets where there are more than one element in the > >> >> > vector. > >> >> > >> >> VOIDmode doesn't tell us anything about what the autodetected mode > >> >> will be, so current short-circuit: > >> >> > >> >> /* If the target does not support partial vectors we can shorten > >> >> the > >> >> number of modes to analyze for the epilogue as we know we > >> >> can't pick a > >> >> mode that has at least as many NUNITS as the main loop's > >> >> vectorization > >> >> factor, since that would imply the epilogue's vectorization > >> >> factor > >> >> would be at least as high as the main loop's and we would be > >> >> vectorizing for more scalar iterations than there would be > >> >> left. */ > >> >> if (!supports_partial_vectors > >> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > >> >> first_vinfo_vf)) > >> >> { > >> >> mode_i++; > >> >> if (mode_i == vector_modes.length ()) > >> >> break; > >> >> continue; > >> >> } > >> >> > >> >> wouldn't be effective. > >> > > >> > Well, before this change we simply did > >> > > >> > - /* Handle the case that the original loop can use partial > >> > - vectorization, but want to only adopt it for the epilogue. > >> > - The retry should be in the same mode as original. */ > >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > >> > ... > >> > - else > >> > -{ > >> > - mode_i = first_loop_next_i; > >> > - if (mode_i == vector_modes.length ()) > >> > - return first_loop_vinfo; > >> > -} > >> > > >> > and thus didn't bother with epilogue vectorization. I think we should > >> > then just restore this behavior, not doing epilogue vectorization > >> > if vector_modes.length () == 1? > >> > >> Yeah, but that case didn't need epilogue vectorisation before. This > >> series is adding support for unrolling, and targets with a single vector > >> size will benefit from epilogues in that case. > > > > But in that case (which we could detect), we could then just use > > autodetected_vector_mode? Like if we do before epilogue vect > > > > vector_modes[0] = autodetected_vector_mode; > > mode_i = 0; > > > > thus replace VOIDmode with what we detected and then start at 0? > > That is, the proposed patch looks very much like a hack to me. > > You mean check whether the loop is unrolled? If so, that's what feels > like a hack to me :-) The question is whether there are enough elements > for epilogue vectorisation to make sense. The VF is what tells us that. > Unrolling is just one of the things that influences that VF and I don't > think we should check for the individual influences. It's just the end > result that matters. > > > I suppose the VECTOR_MODE_P check should be added to > > > > if (!supports_partial_vectors > > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > > first_vinfo_vf)) > > { > > mode_i++; > > > > instead. > > You mean: > > if (!supports_partial_vectors > && VECTOR_MODE_P (vector_modes[mode_i]) > && maybe_ge (GET_MODE_NUNITS (vector_modes[
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
Richard Biener writes: > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: >> > >> >> Richard Biener writes: >> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: >> >> > >> >> >> Richard Biener writes: >> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: >> >> >> > >> >> >> >> Hi, >> >> >> >> >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all >> >> >> >> modes for >> >> >> >> epilogues'. The earlier patch assumed there was always at least one >> >> >> >> other mode >> >> >> >> than VOIDmode, but that does not need to be the case. >> >> >> >> If we are dealing with a target that does not define more modes for >> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would >> >> >> >> be to try >> >> >> >> to create an epilogue for the same autodetected_vector_mode, which >> >> >> >> unless the >> >> >> >> target supported partial vectors would always fail. So as a fix I >> >> >> >> suggest >> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for >> >> >> >> QI, >> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is >> >> >> >> not a >> >> >> >> vector_mode (since that already should indicate partial vectors >> >> >> >> aren't >> >> >> >> possible) or if no partial vectors are supported and its >> >> >> >> pessimistic NUNITS is >> >> >> >> larger than the main loop's VF. >> >> >> >> >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for >> >> >> >> trunk? Can >> >> >> >> someone verify this fixes the issue for PR103971 on powerpc? >> >> >> > >> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode >> >> >> > to use for the epilogue? That appears to be a much simpler solution >> >> >> > to >> >> >> > me, including for targets where there are more than one element in >> >> >> > the >> >> >> > vector. >> >> >> >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode >> >> >> will be, so current short-circuit: >> >> >> >> >> >> /* If the target does not support partial vectors we can shorten >> >> >> the >> >> >> number of modes to analyze for the epilogue as we know we >> >> >> can't pick a >> >> >> mode that has at least as many NUNITS as the main loop's >> >> >> vectorization >> >> >> factor, since that would imply the epilogue's vectorization >> >> >> factor >> >> >> would be at least as high as the main loop's and we would be >> >> >> vectorizing for more scalar iterations than there would be >> >> >> left. */ >> >> >> if (!supports_partial_vectors >> >> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), >> >> >> first_vinfo_vf)) >> >> >>{ >> >> >> mode_i++; >> >> >> if (mode_i == vector_modes.length ()) >> >> >>break; >> >> >> continue; >> >> >>} >> >> >> >> >> >> wouldn't be effective. >> >> > >> >> > Well, before this change we simply did >> >> > >> >> > - /* Handle the case that the original loop can use partial >> >> > - vectorization, but want to only adopt it for the epilogue. >> >> > - The retry should be in the same mode as original. */ >> >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) >> >> > ... >> >> > - else >> >> > -{ >> >> > - mode_i = first_loop_next_i; >> >> > - if (mode_i == vector_modes.length ()) >> >> > - return first_loop_vinfo; >> >> > -} >> >> > >> >> > and thus didn't bother with epilogue vectorization. I think we should >> >> > then just restore this behavior, not doing epilogue vectorization >> >> > if vector_modes.length () == 1? >> >> >> >> Yeah, but that case didn't need epilogue vectorisation before. This >> >> series is adding support for unrolling, and targets with a single vector >> >> size will benefit from epilogues in that case. >> > >> > But in that case (which we could detect), we could then just use >> > autodetected_vector_mode? Like if we do before epilogue vect >> > >> > vector_modes[0] = autodetected_vector_mode; >> > mode_i = 0; >> > >> > thus replace VOIDmode with what we detected and then start at 0? >> > That is, the proposed patch looks very much like a hack to me. >> >> You mean check whether the loop is unrolled? If so, that's what feels >> like a hack to me :-) The question is whether there are enough elements >> for epilogue vectorisation to make sense. The VF is what tells us that. >> Unrolling is just one of the things that influences that VF and I don't >> think we should check for the individual influences. It's just the end >> result that matters. >> >> > I suppose the VECTOR_MODE_P check should be added to >> > >> > if (!supports_partial_vectors >> > && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), >> > first_vinfo_vf)) >> > { >> > mode_i++; >> > >> > instead. >>
Re: [PATCH] opts: do not do sanity check when an error is seen
On 12/16/21 17:44, Jeff Law wrote: On 12/16/2021 7:37 AM, Martin Liška wrote: Do not check global options modification when an error is seen in parsing of options (pragmas or attributes). Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin PR target/103709 gcc/c-family/ChangeLog: * c-pragma.c (handle_pragma_pop_options): Do not check global options modification when an error is seen in parsing of options (pragmas or attributes). OK jeff The following patch handled the same for attributes. Pushed as obvious. MartinFrom 98b5359b474e4de89ebc1ea5203ca907738f7d7f Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 12 Jan 2022 12:48:33 +0100 Subject: [PATCH] opts: do not do sanity check when an error is seen PR target/103804 gcc/c-family/ChangeLog: * c-attribs.c (handle_optimize_attribute): Do not call cl_optimization_compare if we seen an error. --- gcc/c-family/c-attribs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index dbb892e0ec6..bdf72ce385c 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -5516,7 +5516,8 @@ handle_optimize_attribute (tree *node, tree name, tree args, if (saved_global_options != NULL) { - cl_optimization_compare (saved_global_options, &global_options); + if (!seen_error ()) + cl_optimization_compare (saved_global_options, &global_options); free (saved_global_options); } } -- 2.34.1
Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs
> Thanks. Just a nit, it is cc1 that reports the warning, not f951. I confirm the patch fixes the testcase failure. And I fixed the comment in a follow-up commit. Thanks! FX
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On 12/01/2022 11:44, Richard Sandiford wrote: Another alternative would be to push autodetected_vector_mode when the length is 1 and keep 1 as the starting point. Richard I'm guessing we would still want to skip epilogue vectorization if !VECTOR_MODE_P (autodetected_vector_mode) in that case?
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 12 Jan 2022, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> > > >> >> Richard Biener writes: > >> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote: > >> >> > > >> >> >> Richard Biener writes: > >> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> >> >> > > >> >> >> >> Hi, > >> >> >> >> > >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all > >> >> >> >> modes for > >> >> >> >> epilogues'. The earlier patch assumed there was always at least > >> >> >> >> one other mode > >> >> >> >> than VOIDmode, but that does not need to be the case. > >> >> >> >> If we are dealing with a target that does not define more modes > >> >> >> >> for > >> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch > >> >> >> >> would be to try > >> >> >> >> to create an epilogue for the same autodetected_vector_mode, > >> >> >> >> which unless the > >> >> >> >> target supported partial vectors would always fail. So as a fix I > >> >> >> >> suggest > >> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for > >> >> >> >> QI, > >> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it > >> >> >> >> is not a > >> >> >> >> vector_mode (since that already should indicate partial vectors > >> >> >> >> aren't > >> >> >> >> possible) or if no partial vectors are supported and its > >> >> >> >> pessimistic NUNITS is > >> >> >> >> larger than the main loop's VF. > >> >> >> >> > >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for > >> >> >> >> trunk? Can > >> >> >> >> someone verify this fixes the issue for PR103971 on powerpc? > >> >> >> > > >> >> >> > Why not simply start at mode_i = 0 which means autodetecting the > >> >> >> > mode > >> >> >> > to use for the epilogue? That appears to be a much simpler > >> >> >> > solution to > >> >> >> > me, including for targets where there are more than one element in > >> >> >> > the > >> >> >> > vector. > >> >> >> > >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode > >> >> >> will be, so current short-circuit: > >> >> >> > >> >> >> /* If the target does not support partial vectors we can > >> >> >> shorten the > >> >> >> number of modes to analyze for the epilogue as we know we > >> >> >> can't pick a > >> >> >> mode that has at least as many NUNITS as the main loop's > >> >> >> vectorization > >> >> >> factor, since that would imply the epilogue's vectorization > >> >> >> factor > >> >> >> would be at least as high as the main loop's and we would be > >> >> >> vectorizing for more scalar iterations than there would be > >> >> >> left. */ > >> >> >> if (!supports_partial_vectors > >> >> >>&& maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), > >> >> >> first_vinfo_vf)) > >> >> >> { > >> >> >>mode_i++; > >> >> >>if (mode_i == vector_modes.length ()) > >> >> >> break; > >> >> >>continue; > >> >> >> } > >> >> >> > >> >> >> wouldn't be effective. > >> >> > > >> >> > Well, before this change we simply did > >> >> > > >> >> > - /* Handle the case that the original loop can use partial > >> >> > - vectorization, but want to only adopt it for the epilogue. > >> >> > - The retry should be in the same mode as original. */ > >> >> > - if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > >> >> > ... > >> >> > - else > >> >> > -{ > >> >> > - mode_i = first_loop_next_i; > >> >> > - if (mode_i == vector_modes.length ()) > >> >> > - return first_loop_vinfo; > >> >> > -} > >> >> > > >> >> > and thus didn't bother with epilogue vectorization. I think we should > >> >> > then just restore this behavior, not doing epilogue vectorization > >> >> > if vector_modes.length () == 1? > >> >> > >> >> Yeah, but that case didn't need epilogue vectorisation before. This > >> >> series is adding support for unrolling, and targets with a single vector > >> >> size will benefit from epilogues in that case. > >> > > >> > But in that case (which we could detect), we could then just use > >> > autodetected_vector_mode? Like if we do before epilogue vect > >> > > >> > vector_modes[0] = autodetected_vector_mode; > >> > mode_i = 0; > >> > > >> > thus replace VOIDmode with what we detected and then start at 0? > >> > That is, the proposed patch looks very much like a hack to me. > >> > >> You mean check whether the loop is unrolled? If so, that's what feels > >> like a hack to me :-) The question is whether there are enough elements > >> for epilogue vectorisation to make sense. The VF is what tells us that. > >> Unrolling is just one of the things that influences that VF and I don't > >> think we should check for the individual influences. It's just the end > >> result that
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > > On 12/01/2022 11:44, Richard Sandiford wrote: > > Another alternative would be to push autodetected_vector_mode when the > > length is 1 and keep 1 as the starting point. > > > > Richard > > I'm guessing we would still want to skip epilogue vectorization if > !VECTOR_MODE_P (autodetected_vector_mode) in that case? Practically we currently only support fixed width word_mode there, but eventually one could end up with 64bit DImode for the main loop and 32bit V4QImode in the epilogue ... so not sure if it's worth special-casing. But I don't mind adding that skip. Richard.
[PATCH, rs6000] Enable absolute jump table by default
Hi, This patch enables absolute jump table by default on rs6000. The relative jump tables are used when it's explicit set by "rs6000_relative_jumptables", or jump tables are placed in text section but global relocation is required. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2022-01-12 Haochen Gui gcc/ * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define. * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return true when relative jump table is explicit required or jump tables have to be put in text section but global relocation is also required. * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable. patch.diff diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h index d617f346f81..2e257c60f8c 100644 --- a/gcc/config/rs6000/linux64.h +++ b/gcc/config/rs6000/linux64.h @@ -239,7 +239,7 @@ extern int dot_symbols; /* Indicate that jump tables go in the text section. */ #undef JUMP_TABLES_IN_TEXT_SECTION -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT +#define JUMP_TABLES_IN_TEXT_SECTION 0 /* The linux ppc64 ABI isn't explicit on whether aggregates smaller than a doubleword should be padded upward or downward. You could diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 319182e94d9..9fba893a27a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value) static bool rs6000_gen_pic_addr_diff_vec (void) { - return rs6000_relative_jumptables; + return rs6000_relative_jumptables +|| (JUMP_TABLES_IN_TEXT_SECTION +&& targetm.asm_out.reloc_rw_mask () == 3); } void diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index c2a77182a9e..75e3fa86829 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags) Generate (do not generate) MMA instructions. mrelative-jumptables -Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save +Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save mrop-protect Target Var(rs6000_rop_protect) Init(0)
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On 12/01/2022 11:59, Richard Biener wrote: On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: On 12/01/2022 11:44, Richard Sandiford wrote: Another alternative would be to push autodetected_vector_mode when the length is 1 and keep 1 as the starting point. Richard I'm guessing we would still want to skip epilogue vectorization if !VECTOR_MODE_P (autodetected_vector_mode) in that case? Practically we currently only support fixed width word_mode there, but eventually one could end up with 64bit DImode for the main loop and 32bit V4QImode in the epilogue ... so not sure if it's worth special-casing. But I don't mind adding that skip. Richard. I left out the skip, it shouldn't break anything as it would try that same mode before anyway. Just to clarify what I meant though was to skip if autodetected_vector_mode wasn't a vector AND the target didn't define autovectorize_vector_modes, so in that scenario it wouldn't ever try V4QImode for the epilogue if the mainloop was autodetected DImode, I think... Either way, this is less code, less complicated and doesn't analyze more than it did before the original patch, so I'm happy with that too. Is this what you had in mind? Andre diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..03459363afa48f0e2753bc7aa18cbf2771d2a4e5 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -3023,7 +3023,16 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared) array may contain length-agnostic and length-specific modes. Their ordering is not guaranteed, so we could end up picking a mode for the main loop that is after the epilogue's optimal mode. */ - mode_i = 1; + if (vector_modes.length () == 1) +{ + /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if +an epilogue can be created with that mode. */ + vector_modes[0] = autodetected_vector_mode; + mode_i = 0; +} + else +mode_i = 1; + bool supports_partial_vectors = partial_vectors_supported_p (); poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > > On 12/01/2022 11:59, Richard Biener wrote: > > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > > > >> On 12/01/2022 11:44, Richard Sandiford wrote: > >>> Another alternative would be to push autodetected_vector_mode when the > >>> length is 1 and keep 1 as the starting point. > >>> > >>> Richard > >> I'm guessing we would still want to skip epilogue vectorization if > >> !VECTOR_MODE_P (autodetected_vector_mode) in that case? > > Practically we currently only support fixed width word_mode there, > > but eventually one could end up with 64bit DImode for the main loop > > and 32bit V4QImode in the epilogue ... so not sure if it's worth > > special-casing. But I don't mind adding that skip. > > > > Richard. > > I left out the skip, it shouldn't break anything as it would try that same > mode before anyway. > Just to clarify what I meant though was to skip if autodetected_vector_mode > wasn't a vector AND the target didn't define autovectorize_vector_modes, so in > that scenario it wouldn't ever try V4QImode for the epilogue if the mainloop > was autodetected DImode, I think... > Either way, this is less code, less complicated and doesn't analyze more than > it did before the original patch, so I'm happy with that too. > > Is this what you had in mind? - mode_i = 1; + if (vector_modes.length () == 1) +{ + /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if +an epilogue can be created with that mode. */ + vector_modes[0] = autodetected_vector_mode; + mode_i = 0; +} + else +mode_i = 1; + I would have left out the condition and unconditionally do vector_modes[0] = autodetected_vector_mode; mode_i = 0; but OK if you think it makes sense to special case length == 1. Richard.
[PATCH] Fix -Wformat-diag for ARM target.
Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for ARM target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * common/config/arm/arm-common.c (arm_target_mode): Wrap keywords with %<, %> and remove trailing punctuation char. (arm_canon_arch_option_1): Likewise. (arm_asm_auto_mfpu): Likewise. * config/arm/arm-builtins.c (arm_expand_builtin): Likewise. * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Likewise. (use_vfp_abi): Likewise. (aapcs_vfp_is_call_or_return_candidate): Likewise. (arm_handle_cmse_nonsecure_entry): Likewise. (arm_handle_cmse_nonsecure_call): Likewise. (thumb1_md_asm_adjust): Likewise. --- gcc/common/config/arm/arm-common.c | 12 +++ gcc/config/arm/arm-builtins.c | 50 +++--- gcc/config/arm/arm.c | 12 +++ 3 files changed, 37 insertions(+), 37 deletions(-) diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c index e7e19400263..6a898d8554b 100644 --- a/gcc/common/config/arm/arm-common.c +++ b/gcc/common/config/arm/arm-common.c @@ -286,7 +286,7 @@ arm_target_mode (int argc, const char **argv) if (argc % 2 != 0) fatal_error (input_location, -"%%:target_mode_check takes an even number of parameters"); +"%%:% takes an even number of parameters"); while (argc) { @@ -295,8 +295,8 @@ arm_target_mode (int argc, const char **argv) else if (strcmp (argv[0], "cpu") == 0) cpu = argv[1]; else - fatal_error (input_location, -"unrecognized option passed to %%:target_mode_check"); + fatal_error (input_location, "unrecognized option passed to %%:" +"%>"); argc -= 2; argv += 2; } @@ -662,7 +662,7 @@ arm_canon_arch_option_1 (int argc, const char **argv, bool arch_for_multilib) if (argc & 1) fatal_error (input_location, -"%%:canon_for_mlib takes 1 or more pairs of parameters"); +"%%:% takes 1 or more pairs of parameters"); while (argc) { @@ -676,7 +676,7 @@ arm_canon_arch_option_1 (int argc, const char **argv, bool arch_for_multilib) abi = argv[1]; else fatal_error (input_location, -"unrecognized operand to %%:canon_for_mlib"); +"unrecognized operand to %%:%"); argc -= 2; argv += 2; @@ -1032,7 +1032,7 @@ arm_asm_auto_mfpu (int argc, const char **argv) arch = argv[1]; else fatal_error (input_location, -"unrecognized operand to %%:asm_auto_mfpu"); +"unrecognized operand to %%:%"); argc -= 2; argv += 2; } diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 9c645722230..ab5c469b1ba 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -3013,7 +3013,7 @@ constant_arg: else error_at (EXPR_LOCATION (exp), "coproc must be a constant immediate in " - "range [0-%d] enabled with +cdecp", + "range [0-%d] enabled with +cdecp%", ARM_CDE_CONST_COPROC); } else @@ -3860,60 +3860,60 @@ arm_expand_builtin (tree exp, && (imm < 0 || imm > 32)) { if (fcode == ARM_BUILTIN_WRORHI) - error ("the range of count should be in 0 to 32. please check the intrinsic _mm_rori_pi16 in code."); + error ("the range of count should be in 0 to 32; please check the intrinsic %<_mm_rori_pi16%> in code"); else if (fcode == ARM_BUILTIN_WRORWI) - error ("the range of count should be in 0 to 32. please check the intrinsic _mm_rori_pi32 in code."); + error ("the range of count should be in 0 to 32; please check the intrinsic %<_mm_rori_pi32%> in code"); else if (fcode == ARM_BUILTIN_WRORH) - error ("the range of count should be in 0 to 32. please check the intrinsic _mm_ror_pi16 in code."); + error ("the range of count should be in 0 to 32; please check the intrinsic %<_mm_ror_pi16%> in code"); else - error ("the range of count should be in 0 to 32. please check the intrinsic _mm_ror_pi32 in code."); + error ("the range of count should be in 0 to 32; please check the intrinsic %<_mm_ror_pi32%> in code"); } else if ((fcode == ARM_BUILTIN_WRORDI || fcode == ARM_BUILTIN_WRORD) && (imm < 0 || imm > 64)) { if (fcode == AR
[PATCH] Fix -Wformat-diag for s390x-ibm-tpf.
This one is related to s390x-ibm-tpf -Wformat-diag warnings. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/s390/s390.c: Fix -Wformat-diag warnings. --- gcc/config/s390/s390.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index bf96cbf7588..968672bb42b 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -15606,16 +15606,16 @@ s390_option_override_internal (struct gcc_options *opts, #if TARGET_TPF != 0 if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_prologue_check)) -error ("-mtpf-trace-hook-prologue-check requires integer in range 0..4095"); +error ("%<-mtpf-trace-hook-prologue-check%> requires integer in range 0-4095"); if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_prologue_target)) -error ("-mtpf-trace-hook-prologue-target requires integer in range 0..4095"); +error ("%<-mtpf-trace-hook-prologue-target%> requires integer in range 0-4095"); if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_epilogue_check)) -error ("-mtpf-trace-hook-epilogue-check requires integer in range 0..4095"); +error ("%<-mtpf-trace-hook-epilogue-check%> requires integer in range 0-4095"); if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_epilogue_target)) -error ("-mtpf-trace-hook-epilogue-target requires integer in range 0..4095"); +error ("%<-mtpf-trace-hook-epilogue-target%> requires integer in range 0-4095"); if (s390_tpf_trace_skip) { -- 2.34.1
Re: [PATCH] [12/11/10] Fix invalid format warnings on Windows
On 1/11/22 2:37 PM, Martin Liška wrote: Hello. I do support the patch, but I would ... Thanks, Martin, that makes the patch simpler and easier to maintain. Would the attached version do? Thanks Tomas On 1/7/22 19:33, Tomas Kalibera wrote: + if (is_attribute_p ("format", get_attribute_name (aa)) && + fndecl && fndecl_built_in_p (fndecl, BUILT_IN_NORMAL)) +{ + switch (DECL_FUNCTION_CODE (fndecl)) +{ +case BUILT_IN_FSCANF: +case BUILT_IN_PRINTF: +case BUILT_IN_SCANF: +case BUILT_IN_SNPRINTF: +case BUILT_IN_SSCANF: +case BUILT_IN_VFSCANF: +case BUILT_IN_VPRINTF: +case BUILT_IN_VSCANF: +case BUILT_IN_VSNPRINTF: +case BUILT_IN_VSSCANF: +case BUILT_IN_DCGETTEXT: +case BUILT_IN_DGETTEXT: +case BUILT_IN_GETTEXT: +case BUILT_IN_STRFMON: +case BUILT_IN_STRFTIME: +case BUILT_IN_SNPRINTF_CHK: +case BUILT_IN_VSNPRINTF_CHK: +case BUILT_IN_PRINTF_CHK: +case BUILT_IN_VPRINTF_CHK: + skipped_default_format = 1; + break; +default: + break; +} +} ... skip this as the listed functions are only these that have defined ATTR_FORMAT_*: $ grep ATTR_FORMAT gcc/builtins.def DEF_LIB_BUILTIN (BUILT_IN_FSCANF, "fscanf", BT_FN_INT_FILEPTR_CONST_STRING_VAR, ATTR_FORMAT_SCANF_2_3) DEF_LIB_BUILTIN (BUILT_IN_PRINTF, "printf", BT_FN_INT_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_1_2) DEF_LIB_BUILTIN (BUILT_IN_SCANF, "scanf", BT_FN_INT_CONST_STRING_VAR, ATTR_FORMAT_SCANF_1_2) DEF_C99_BUILTIN (BUILT_IN_SNPRINTF, "snprintf", BT_FN_INT_STRING_SIZE_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_NOTHROW_3_4) DEF_LIB_BUILTIN (BUILT_IN_SSCANF, "sscanf", BT_FN_INT_CONST_STRING_CONST_STRING_VAR, ATTR_FORMAT_SCANF_NOTHROW_2_3) DEF_C99_BUILTIN (BUILT_IN_VFSCANF, "vfscanf", BT_FN_INT_FILEPTR_CONST_STRING_VALIST_ARG, ATTR_FORMAT_SCANF_2_0) DEF_LIB_BUILTIN (BUILT_IN_VPRINTF, "vprintf", BT_FN_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_1_0) DEF_C99_BUILTIN (BUILT_IN_VSCANF, "vscanf", BT_FN_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_SCANF_1_0) DEF_C99_BUILTIN (BUILT_IN_VSNPRINTF, "vsnprintf", BT_FN_INT_STRING_SIZE_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_NOTHROW_3_0) DEF_C99_BUILTIN (BUILT_IN_VSSCANF, "vsscanf", BT_FN_INT_CONST_STRING_CONST_STRING_VALIST_ARG, ATTR_FORMAT_SCANF_NOTHROW_2_0) DEF_EXT_LIB_BUILTIN (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2) DEF_EXT_LIB_BUILTIN (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2) DEF_EXT_LIB_BUILTIN (BUILT_IN_GETTEXT, "gettext", BT_FN_STRING_CONST_STRING, ATTR_FORMAT_ARG_1) DEF_EXT_LIB_BUILTIN (BUILT_IN_STRFMON, "strfmon", BT_FN_SSIZE_STRING_SIZE_CONST_STRING_VAR, ATTR_FORMAT_STRFMON_NOTHROW_3_4) DEF_LIB_BUILTIN (BUILT_IN_STRFTIME, "strftime", BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_TM_PTR, ATTR_FORMAT_STRFTIME_NOTHROW_3_0) DEF_EXT_LIB_BUILTIN (BUILT_IN_SNPRINTF_CHK, "__snprintf_chk", BT_FN_INT_STRING_SIZE_INT_SIZE_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_NOTHROW_5_6) DEF_EXT_LIB_BUILTIN (BUILT_IN_VSNPRINTF_CHK, "__vsnprintf_chk", BT_FN_INT_STRING_SIZE_INT_SIZE_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_NOTHROW_5_0) DEF_EXT_LIB_BUILTIN (BUILT_IN_PRINTF_CHK, "__printf_chk", BT_FN_INT_INT_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_2_3) DEF_EXT_LIB_BUILTIN (BUILT_IN_VPRINTF_CHK, "__vprintf_chk", BT_FN_INT_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_2_0) Martin >From 82a659c7e5b24bbd39ac567dff3f79cc4c1e083f Mon Sep 17 00:00:00 2001 From: Tomas Kalibera Date: Wed, 12 Jan 2022 08:17:21 -0500 Subject: [PATCH] Mingw32 targets use ms_printf format for printf, but mingw-w64 when configured for UCRT uses gnu_format (via stdio.h). GCC then checks both formats, which means that one cannot print a 64-bit integer without a warning. All these lines issue a warning: printf("Hello %"PRIu64"\n", x); printf("Hello %I64u\n", x); printf("Hello %llu\n", x); because each of them violates one of the formats. Also, one gets a warning twice if the format string violates both formats. Fixed by disabling the built in format in case there are additional ones. gcc/c-family/ChangeLog: PR c/95130 PR c/92292 * c-common.c (check_function_arguments): Pass also function declaration to check_function_format. * c-common.h (check_function_format): Extra argument - function declaration. * c-format.c (check_function_format): For builtin functions with a built in format and at least one more, do not check the first one. --- gcc/c-family/c-common.c | 2 +- gcc/c-family/c-common.h | 2 +- gcc/c-family/c-format.c | 31 +--
Re: [PATCH v2][GCC] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.
Ping!! From: Srinath Parvathaneni Sent: 13 December 2021 10:44 To: gcc-patches@gcc.gnu.org Cc: Kyrylo Tkachov ; Richard Earnshaw ; Tejas Belagod Subject: Re: [PATCH v2][GCC] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature. Ping!! From: Srinath Parvathaneni Sent: 12 November 2021 18:03 To: gcc-patches@gcc.gnu.org Cc: Kyrylo Tkachov ; Richard Earnshaw ; Tejas Belagod Subject: [PATCH v2][GCC] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature. Hello, This patch teaches the DWARF support in gcc about RA_AUTH_CODE pseudo hard-register and also .save {ra_auth_code} and .cfi_offset ra_auth_code dwarf directives for the PAC feature in Armv8.1-M architecture. RA_AUTH_CODE register number is 107 and it's dwarf register number is 143. When compiled with "arm-none-eabi-gcc -O2 -mthumb -march=armv8.1-m.main+pacbti -S -fasynchronous-unwind-tables -g" command line options, the directives supported in this patch looks like below: ... push{ip} .save {ra_auth_code} .cfi_def_cfa_offset 8 .cfi_offset 143, -8 ... This patch can be committed after the patch at https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583407.html is committed. Regression tested on arm-none-eabi target and found no regressions. Ok for master? Regards, Srinath. gcc/ChangeLog: 2021-11-12 Srinath Parvathaneni * config/arm/aout.h (ra_auth_code): Add to enum. * config/arm/arm.c (emit_multi_reg_push): Add RA_AUTH_CODE register to dwarf frame expression instead of IP_REGNUM. (arm_expand_prologue): Mark as frame related insn. (arm_regno_class): Check for pac pseudo reigster. (arm_dbx_register_number): Assign ra_auth_code register number in dwarf. (arm_unwind_emit_sequence): Print .save directive with ra_auth_code register. (arm_conditional_register_usage): Mark ra_auth_code in fixed reigsters. * config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify. (IS_PAC_Pseudo_REGNUM): Define. (enum reg_class): Add PAC_REG entry. * config/arm/arm.md (RA_AUTH_CODE): Define. gcc/testsuite/ChangeLog: 2021-11-12 Srinath Parvathaneni * g++.target/arm/pac-1.C: New test. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/aout.h b/gcc/config/arm/aout.h index 25a2812a663742893b928398b0d3948e97f1905b..c69e299e012f46c8d0711830125dbf2f6b2e93d7 100644 --- a/gcc/config/arm/aout.h +++ b/gcc/config/arm/aout.h @@ -74,7 +74,8 @@ "wr8", "wr9", "wr10", "wr11", \ "wr12", "wr13", "wr14", "wr15", \ "wcgr0", "wcgr1", "wcgr2", "wcgr3", \ - "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge", "p0" \ + "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge", "p0",\ + "ra_auth_code" \ } #endif diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 8e6ef41f6b065217d1af3f4f1cb85b2d8fbd0dc0..f31944e85c9ab83501f156d138e2aea1bcb5b79d 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -815,7 +815,8 @@ extern const int arm_arch_cde_coproc_bits[]; s16-s31 S VFP variable (aka d8-d15). vfpcc Not a real register. Represents the VFP condition code flags. - vpr Used to represent MVE VPR predication. */ + vpr Used to represent MVE VPR predication. + ra_auth_codePseudo register to save PAC. */ /* The stack backtrace structure is as follows: fp points to here: | save code pointer | [fp] @@ -856,7 +857,7 @@ extern const int arm_arch_cde_coproc_bits[]; 1,1,1,1,1,1,1,1, \ 1,1,1,1, \ /* Specials. */ \ - 1,1,1,1,1,1,1\ + 1,1,1,1,1,1,1,1 \ } /* 1 for registers not available across function calls. @@ -886,7 +887,7 @@ extern const int arm_arch_cde_coproc_bits[]; 1,1,1,1,1,1,1,1, \ 1,1,1,1, \ /* Specials. */ \ - 1,1,1,1,1,1,1\ + 1,1,1,1,1,1,1,1 \ } #ifndef SUBTARGET_CONDITIONAL_REGISTER_USAGE @@ -1062,10 +1063,10 @@ extern const int arm_arch_cde_coproc_bits[]; && (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1)) /* The number of hard registers is 16 ARM + 1 CC + 1 SFP + 1 AFP - + 1 APSRQ + 1 APSRGE + 1 VPR. */ + + 1 APSRQ + 1 APSRGE + 1 VPR + 1 Pseudo register to save PAC. */ /* Intel Wireless MMX Technology registers add 16 + 4 more. */ /* VFP (VFP3) adds 32 (64) + 1 VFPCC. */ -#define FIRST_PSEUDO_REGISTER 107 +#define FIRST_PSEUDO_REGISTER 108 #define DBX_REGISTER_NUMBER(REGNO) arm_dbx_register_number (REGNO) @@ -1248,12 +1249,
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
I think we need a fix or a revert for this today, please. Bootstrap has been broken for a couple of days during the last week of stage 3, which is really problematic. Thanks, Bill On 1/12/22 6:57 AM, Richard Biener via Gcc-patches wrote: > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: > >> On 12/01/2022 11:59, Richard Biener wrote: >>> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: >>> On 12/01/2022 11:44, Richard Sandiford wrote: > Another alternative would be to push autodetected_vector_mode when the > length is 1 and keep 1 as the starting point. > > Richard I'm guessing we would still want to skip epilogue vectorization if !VECTOR_MODE_P (autodetected_vector_mode) in that case? >>> Practically we currently only support fixed width word_mode there, >>> but eventually one could end up with 64bit DImode for the main loop >>> and 32bit V4QImode in the epilogue ... so not sure if it's worth >>> special-casing. But I don't mind adding that skip. >>> >>> Richard. >> I left out the skip, it shouldn't break anything as it would try that same >> mode before anyway. >> Just to clarify what I meant though was to skip if autodetected_vector_mode >> wasn't a vector AND the target didn't define autovectorize_vector_modes, so >> in >> that scenario it wouldn't ever try V4QImode for the epilogue if the mainloop >> was autodetected DImode, I think... >> Either way, this is less code, less complicated and doesn't analyze more than >> it did before the original patch, so I'm happy with that too. >> >> Is this what you had in mind? > - mode_i = 1; > + if (vector_modes.length () == 1) > +{ > + /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see > if > +an epilogue can be created with that mode. */ > + vector_modes[0] = autodetected_vector_mode; > + mode_i = 0; > +} > + else > +mode_i = 1; > + > > I would have left out the condition and unconditionally do > > vector_modes[0] = autodetected_vector_mode; > mode_i = 0; > > but OK if you think it makes sense to special case length == 1. > > Richard.
Re: [PATCH] Fix redefined macro for epiphany.
On 1/12/2022 2:47 AM, Martin Liška wrote: The following warning is emitted gazillion times. Fixes: In file included from ./tm.h:23, from gcc/genconfig.c:25: gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined 209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" | In file included from ./tm.h:21, from gcc/genconfig.c:25: gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition 671 | #define READONLY_DATA_SECTION_ASM_OP "\t.section .rodata" Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if not defined. Please do. However, I think we should deprecate the epiphany. It's been broken for ~2 years randomly failing in reload. I tried to fix it once and quickly gave up. It currently ignores all testing failures in my tester as it's been horribly unstable. jeff
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On 2022-01-12 03:47, Richard Biener wrote: On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches wrote: On 2022-01-11 12:42, Richard Sandiford wrote: The new IRA heuristics would need more work on old-reload targets, since flattening needs to be able to undo the cost propagation. It's doable, but hardly seems worth it. Agree. It is not worth to spend your time for work for reload. This patch therefore makes all the new calls to ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. The color_pass code that predated the new function (and that was the source of ira_subloop_allocnos_can_differ_p) continues to behave as before. It's a hack, but at least it has the advantage that the new parameter would become obviously unused if reload and (!)ira_use_lra_p were removed. The hack should therefore disappear alongside reload. I have a feeling that it will stay for a long time if not forever. We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ At some point Eric wrote a nice summary for how to transition targets away from CC0, I wonder if there's something similar for transitioning a port away from reload to LRA? In those 34 targets there must be some for which that's a relatively easy task? I suppose it depends on how much of the reload target hooks are put to use and how those translate to LRA. First of all the target should be rid of using CC0. Then theoretically :) the target should work with LRA as LRA uses existing reload hooks (more accurately a subset of them). In practice some work should be done for switching to LRA. I did first 4 major targets to work with LRA and unfortunately did not find some repeating patterns in this work. The problems for the first targets were mostly unique and required a lot of LRA code modifications. After that people did other target switching pretty easily and spent few time for this as I remember. So based on my experience of porting targets to LRA I can not formalize this work. But probably it can be done by examining all LRA targets code (mostly looking at machine dependent code related to use lra_in_progress_p) or by collecting information what was done from other people who did porting to LRA.
Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
On 12/01/2022 12:57, Richard Biener wrote: On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: On 12/01/2022 11:59, Richard Biener wrote: On Wed, 12 Jan 2022, Andre Vieira (lists) wrote: On 12/01/2022 11:44, Richard Sandiford wrote: Another alternative would be to push autodetected_vector_mode when the length is 1 and keep 1 as the starting point. Richard I'm guessing we would still want to skip epilogue vectorization if !VECTOR_MODE_P (autodetected_vector_mode) in that case? Practically we currently only support fixed width word_mode there, but eventually one could end up with 64bit DImode for the main loop and 32bit V4QImode in the epilogue ... so not sure if it's worth special-casing. But I don't mind adding that skip. Richard. I left out the skip, it shouldn't break anything as it would try that same mode before anyway. Just to clarify what I meant though was to skip if autodetected_vector_mode wasn't a vector AND the target didn't define autovectorize_vector_modes, so in that scenario it wouldn't ever try V4QImode for the epilogue if the mainloop was autodetected DImode, I think... Either way, this is less code, less complicated and doesn't analyze more than it did before the original patch, so I'm happy with that too. Is this what you had in mind? - mode_i = 1; + if (vector_modes.length () == 1) +{ + /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if +an epilogue can be created with that mode. */ + vector_modes[0] = autodetected_vector_mode; + mode_i = 0; +} + else +mode_i = 1; + I would have left out the condition and unconditionally do vector_modes[0] = autodetected_vector_mode; mode_i = 0; but OK if you think it makes sense to special case length == 1. Richard. Tested without the special casing, all good, only have performance regressions left (which I'm working on), will commit this to fix the ICEs. Thanks, Andre
Re: [PATCH, rs6000] Enable absolute jump table by default
On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI wrote: > > Hi, >This patch enables absolute jump table by default on rs6000. The relative > jump tables are used when >it's explicit set by "rs6000_relative_jumptables", >or jump tables are placed in text section but global relocation is > required. > >Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > Is this okay for trunk? > Any recommendations? Thanks a lot. > > ChangeLog > 2022-01-12 Haochen Gui > > gcc/ > * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define. > * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return > true when relative jump table is explicit required or jump tables have > to be put in text section but global relocation is also required. > * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable. > > patch.diff > diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h > index d617f346f81..2e257c60f8c 100644 > --- a/gcc/config/rs6000/linux64.h > +++ b/gcc/config/rs6000/linux64.h > @@ -239,7 +239,7 @@ extern int dot_symbols; > > /* Indicate that jump tables go in the text section. */ > #undef JUMP_TABLES_IN_TEXT_SECTION > -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT > +#define JUMP_TABLES_IN_TEXT_SECTION 0 > > /* The linux ppc64 ABI isn't explicit on whether aggregates smaller > than a doubleword should be padded upward or downward. You could > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 319182e94d9..9fba893a27a 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value) > static bool > rs6000_gen_pic_addr_diff_vec (void) > { > - return rs6000_relative_jumptables; > + return rs6000_relative_jumptables > +|| (JUMP_TABLES_IN_TEXT_SECTION > +&& targetm.asm_out.reloc_rw_mask () == 3); > } This seems like contorted logic and overriding the rs6000_relative_jumptables option change. The later part of the patch overrides rs6000_relative_jumptables for all rs6000 configurations, and then changes this one use of rs6000_relative_jumptables to add more logic to revert to the old meaning for some targets. What about all of the other uses of rs6000_relative_jumptables in the target? What about rs6000.md? I highly doubt that this patch is correct. Why not override rs6000_relative_jumptables for PPC64 Linux instead of changing its value globally and then trying to fix it up? Thanks, David > > void > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > index c2a77182a9e..75e3fa86829 100644 > --- a/gcc/config/rs6000/rs6000.opt > +++ b/gcc/config/rs6000/rs6000.opt > @@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags) > Generate (do not generate) MMA instructions. > > mrelative-jumptables > -Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save > +Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save > > mrop-protect > Target Var(rs6000_rop_protect) Init(0)
[PATCH] tree-optimization/103990 - fix CFG cleanup regression from PRE change
This adjusts the CFG cleanup flow back to what it was before the last change which fixes the observed regression of 541.leela_r with LTO and FDO. Boostrapped on x86_64-unknown-linux-gnu, testing in progress. 2022-01-12 Richard Biener PR tree-optimization/103990 * tree-pass.h (tail_merge_optimize): Drop unused argument. * tree-ssa-tail-merge.c (tail_merge_optimize): Likewise. * tree-ssa-pre.c (pass_pre::execute): Retain TODO_cleanup_cfg and adjust call to tail_merge_optimize. --- gcc/tree-pass.h | 2 +- gcc/tree-ssa-pre.c| 6 ++ gcc/tree-ssa-tail-merge.c | 4 ++-- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 36097cf2736..606d1d60b85 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -412,7 +412,7 @@ extern gimple_opt_pass *make_pass_early_thread_jumps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_split_crit_edges (gcc::context *ctxt); extern gimple_opt_pass *make_pass_laddress (gcc::context *ctxt); extern gimple_opt_pass *make_pass_pre (gcc::context *ctxt); -extern unsigned int tail_merge_optimize (unsigned int, bool); +extern unsigned int tail_merge_optimize (bool); extern gimple_opt_pass *make_pass_profile (gcc::context *ctxt); extern gimple_opt_pass *make_pass_strip_predict_hints (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_complex_O0 (gcc::context *ctxt); diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c index ab24fa98a1f..5113256802e 100644 --- a/gcc/tree-ssa-pre.c +++ b/gcc/tree-ssa-pre.c @@ -4442,7 +4442,6 @@ pass_pre::execute (function *fun) if (todo & TODO_cleanup_cfg) { cleanup_tree_cfg (); - todo &= ~TODO_cleanup_cfg; need_crit_edge_split = true; } @@ -4458,9 +4457,8 @@ pass_pre::execute (function *fun) It should either: - call merge_blocks after each tail merge iteration - call merge_blocks after all tail merge iterations - - mark TODO_cleanup_cfg when necessary - - share the cfg cleanup with fini_pre. */ - todo |= tail_merge_optimize (todo, need_crit_edge_split); + - mark TODO_cleanup_cfg when necessary. */ + todo |= tail_merge_optimize (need_crit_edge_split); free_rpo_vn (); diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c index fd333800f0f..8e1ea1a982e 100644 --- a/gcc/tree-ssa-tail-merge.c +++ b/gcc/tree-ssa-tail-merge.c @@ -1724,7 +1724,7 @@ update_debug_stmts (void) /* Runs tail merge optimization. */ unsigned int -tail_merge_optimize (unsigned int todo, bool need_crit_edge_split) +tail_merge_optimize (bool need_crit_edge_split) { int nr_bbs_removed_total = 0; int nr_bbs_removed; @@ -1814,5 +1814,5 @@ tail_merge_optimize (unsigned int todo, bool need_crit_edge_split) timevar_pop (TV_TREE_TAIL_MERGE); - return todo; + return 0; } -- 2.31.1
Re: [PATCH] Fix redefined macro for epiphany.
On Wed, Jan 12, 2022 at 3:23 PM Jeff Law via Gcc-patches wrote: > > > > On 1/12/2022 2:47 AM, Martin Liška wrote: > > The following warning is emitted gazillion times. > > > > Fixes: > > > > In file included from ./tm.h:23, > > from gcc/genconfig.c:25: > > gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined > > 209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" > > | > > In file included from ./tm.h:21, > > from gcc/genconfig.c:25: > > gcc/config/epiphany/epiphany.h:671: note: this is the location of the > > previous definition > > 671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata" > > > > Ready to be installed? > > Thanks, > > Martin > > > > gcc/ChangeLog: > > > > * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if > > not defined. > Please do. > > However, I think we should deprecate the epiphany. It's been broken for > ~2 years randomly failing in reload. I tried to fix it once and quickly > gave up. It currently ignores all testing failures in my tester as it's > been horribly unstable. Joern is still listed as maintainer so let's CC him before doing anything. That said, if it builds it's good enough for GCC 12 - we can discuss any target deprecations during stage1 (or deprecate those that don't even build right now, though I'm not aware of any). Richard. > > jeff
[committed][nvptx] Improve gcc.target/nvptx/atomic_fetch-*.c test-cases
Hi, Fix a few issues in test-cases gcc.target/nvptx/atomic_fetch-*.c: - atomic_fetch-1.c uses scan-assembler instead of scan-assembler-times, which is less accurate - atomic_fetch-2.c only contains negative testing using scan-assembler-not - the test-cases use stack variables to generate generic addresses, while stack atomics are not natively supported - the test-cases only test (64-bit) x (generic), instead of (32-bit, 64-bit) x (generic, global, shared) - the test-cases use a hardcoded '0' instead of the clearer MEMMODEL_RELAXED Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx] Improve gcc.target/nvptx/atomic_fetch-*.c test-cases gcc/testsuite/ChangeLog: 2022-01-12 Tom de Vries * gcc.target/nvptx/atomic_fetch-1.c: Rewrite. * gcc.target/nvptx/atomic_fetch-2.c: Rewrite. --- gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c | 98 ++--- gcc/testsuite/gcc.target/nvptx/atomic_fetch-2.c | 92 --- 2 files changed, 168 insertions(+), 22 deletions(-) diff --git a/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c index c637caa79a0..941cf3a2ab4 100644 --- a/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c +++ b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c @@ -4,21 +4,97 @@ /* { dg-do compile } */ /* { dg-options "-O2 -misa=sm_35" } */ +enum memmodel +{ + MEMMODEL_RELAXED = 0 +}; + +unsigned long long int *p64; +unsigned int *p32; + +unsigned long long int g64; +unsigned int g32; + +unsigned int s32 __attribute__((shared)); +unsigned long long int s64 __attribute__((shared)); + +unsigned long long int v64; +unsigned int v32; + int main() { - unsigned long long a = ~0; - unsigned b = 0xa; + /* Generic. */ - __atomic_fetch_add (&a, b, 0); - __atomic_fetch_and (&a, b, 0); - __atomic_fetch_or (&a, b, 0); - __atomic_fetch_xor (&a, b, 0); + __atomic_fetch_add (p64, v64, MEMMODEL_RELAXED); + __atomic_fetch_and (p64, v64, MEMMODEL_RELAXED); + __atomic_fetch_or (p64, v64, MEMMODEL_RELAXED); + __atomic_fetch_xor (p64, v64, MEMMODEL_RELAXED); - return a; + __atomic_fetch_add (p32, v32, MEMMODEL_RELAXED); + __atomic_fetch_and (p32, v32, MEMMODEL_RELAXED); + __atomic_fetch_or (p32, v32, MEMMODEL_RELAXED); + __atomic_fetch_xor (p32, v32, MEMMODEL_RELAXED); + + /* Global. */ + + __atomic_fetch_add (&g64, v64, MEMMODEL_RELAXED); + __atomic_fetch_and (&g64, v64, MEMMODEL_RELAXED); + __atomic_fetch_or (&g64, v64, MEMMODEL_RELAXED); + __atomic_fetch_xor (&g64, v64, MEMMODEL_RELAXED); + + __atomic_fetch_add (&g32, v32, MEMMODEL_RELAXED); + __atomic_fetch_and (&g32, v32, MEMMODEL_RELAXED); + __atomic_fetch_or (&g32, v32, MEMMODEL_RELAXED); + __atomic_fetch_xor (&g32, v32, MEMMODEL_RELAXED); + + /* Shared. */ + + __atomic_fetch_add (&s64, v64, MEMMODEL_RELAXED); + __atomic_fetch_and (&s64, v64, MEMMODEL_RELAXED); + __atomic_fetch_or (&s64, v64, MEMMODEL_RELAXED); + __atomic_fetch_xor (&s64, v64, MEMMODEL_RELAXED); + + __atomic_fetch_add (&s32, v32, MEMMODEL_RELAXED); + __atomic_fetch_and (&s32, v32, MEMMODEL_RELAXED); + __atomic_fetch_or (&s32, v32, MEMMODEL_RELAXED); + __atomic_fetch_xor (&s32, v32, MEMMODEL_RELAXED); + + return 0; } -/* { dg-final { scan-assembler "atom.add.u64" } } */ -/* { dg-final { scan-assembler "atom.b64.and" } } */ -/* { dg-final { scan-assembler "atom.b64.or" } } */ -/* { dg-final { scan-assembler "atom.b64.xor" } } */ +/* Generic. */ + +/* { dg-final { scan-assembler-times "atom.add.u64" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b64.and" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b64.or" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b64.xor" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.add.u32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b32.and" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b32.or" 1 } } */ +/* { dg-final { scan-assembler-times "atom.b32.xor" 1 } } */ + +/* Global. */ + +/* { dg-final { scan-assembler-times "atom.global.add.u64" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b64.and" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b64.or" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b64.xor" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.global.add.u32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b32.and" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b32.or" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.b32.xor" 1 } } */ + +/* Shared. */ + +/* { dg-final { scan-assembler-times "atom.shared.add.u64" 1 } } */ +/* { dg-final { scan-assembler-times "atom.shared.b64.and" 1 } } */ +/* { dg-final { scan-assembler-times "atom.shared.b64.or" 1 } } */ +/* { dg-final { scan-assembler-times "atom.shared.b64.xor" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.shared.add.u32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.shared.b32.and" 1 } } */ +/* { d
[committed][nvptx] Add gcc.target/nvptx/atomic-exchange-*.c test-cases
Hi, Add a few test-cases that test expansion of __atomic_exchange. Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx] Add gcc.target/nvptx/atomic-exchange-*.c test-cases gcc/testsuite/ChangeLog: 2022-01-12 Tom de Vries * gcc.target/nvptx/atomic-exchange-1.c: New test. * gcc.target/nvptx/atomic-exchange-2.c: New test. * gcc.target/nvptx/atomic-exchange-3.c: New test. * gcc.target/nvptx/atomic-exchange-4.c: New test. --- gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c | 39 gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c | 33 ++ gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c | 33 ++ gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c | 74 ++ 4 files changed, 179 insertions(+) diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c new file mode 100644 index 000..c63f52b168c --- /dev/null +++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c @@ -0,0 +1,39 @@ +/* Test the atomic exchange expansion, shared state space. */ + +/* { dg-do compile } */ +/* { dg-options "-Wno-long-long" } */ + +enum memmodel +{ + MEMMODEL_SEQ_CST = 5 +}; + +unsigned char u8 __attribute__((shared)); +unsigned short u16 __attribute__((shared)); +unsigned int u32 __attribute__((shared)); +unsigned long long int u64 __attribute__((shared)); + +int +main() +{ + __atomic_exchange_n (&u8, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u16, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u32, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u64, 0, MEMMODEL_SEQ_CST); + + return 0; +} + + +/* Not ptx-native, fallback to libatomic. + Libatomic uses generic addressing with a global lock and membar.sys barriers. + We could implement these more efficiently by cloning libatomic for .shared, + using a per-CTA lock and membar.cta barrier. But we'd expect + performance-critical code to use the ptx-native atomic sizes 32 and 64 bit, + so that doesn't seem to be worth the trouble. */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.shared.exch.b32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.shared.exch.b64" 1 } } */ +/* { dg-final { scan-assembler-times "membar.cta" 4 } } */ diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c new file mode 100644 index 000..4301e74e94e --- /dev/null +++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c @@ -0,0 +1,33 @@ +/* Test the atomic exchange expansion, global state space. */ + +/* { dg-do compile } */ +/* { dg-options "-Wno-long-long" } */ + +enum memmodel +{ + MEMMODEL_SEQ_CST = 5 +}; + +unsigned char u8; +unsigned short u16; +unsigned int u32; +unsigned long long int u64; + +int +main() +{ + __atomic_exchange_n (&u8, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u16, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u32, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (&u64, 0, MEMMODEL_SEQ_CST); + + return 0; +} + +/* Not ptx-native, fallback to libatomic. */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.global.exch.b32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.global.exch.b64" 1 } } */ +/* { dg-final { scan-assembler-times "membar.sys" 4 } } */ diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c new file mode 100644 index 000..2f8232f25eb --- /dev/null +++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c @@ -0,0 +1,33 @@ +/* Test the atomic exchange expansion, generic addressing. */ + +/* { dg-do compile } */ +/* { dg-options "-Wno-long-long" } */ + +enum memmodel +{ + MEMMODEL_SEQ_CST = 5 +}; + +unsigned char *u8; +unsigned short *u16; +unsigned int *u32; +unsigned long long int *u64; + +int +main() +{ + __atomic_exchange_n (u8, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (u16, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (u32, 0, MEMMODEL_SEQ_CST); + __atomic_exchange_n (u64, 0, MEMMODEL_SEQ_CST); + + return 0; +} + +/* Not ptx-native, fallback to libatomic. */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */ +/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */ + +/* { dg-final { scan-assembler-times "atom.exch.b32" 1 } } */ +/* { dg-final { scan-assembler-times "atom.exch.b64" 1 } } */ +/* { dg-final { scan-assembler-times "membar.sys" 4 } } */ diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c new file mode 100644 index 000..de1d395cccf --- /dev/null +++ b/gcc/testsuite/gcc.targ
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov wrote: > > > On 2022-01-12 03:47, Richard Biener wrote: > > On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches > > wrote: > >> > >> On 2022-01-11 12:42, Richard Sandiford wrote: > >>> The new IRA heuristics would need more work on old-reload targets, > >>> since flattening needs to be able to undo the cost propagation. > >>> It's doable, but hardly seems worth it. > >> Agree. It is not worth to spend your time for work for reload. > >>> This patch therefore makes all the new calls to > >>> ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. > >>> The color_pass code that predated the new function (and that was > >>> the source of ira_subloop_allocnos_can_differ_p) continues to > >>> behave as before. > >>> > >>> It's a hack, but at least it has the advantage that the new parameter > >>> would become obviously unused if reload and (!)ira_use_lra_p were > >>> removed. The hack should therefore disappear alongside reload. > >> I have a feeling that it will stay for a long time if not forever. > > We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ > > > > At some point Eric wrote a nice summary for how to transition targets > > away from CC0, I wonder if there's something similar for transitioning > > a port away from reload to LRA? In those 34 targets there must be some > > for which that's a relatively easy task? I suppose it depends on how > > much of the reload target hooks are put to use and how those translate > > to LRA. > > First of all the target should be rid of using CC0. Then theoretically > :) the target should work with LRA as LRA uses existing reload hooks > (more accurately a subset of them). CC0 is no more, we've accomplished that feat for GCC 12! > In practice some work should be done for switching to LRA. I did first > 4 major targets to work with LRA and unfortunately did not find some > repeating patterns in this work. The problems for the first targets > were mostly unique and required a lot of LRA code modifications. After > that people did other target switching pretty easily and spent few time > for this as I remember. > > So based on my experience of porting targets to LRA I can not formalize > this work. But probably it can be done by examining all LRA targets > code (mostly looking at machine dependent code related to use > lra_in_progress_p) or by collecting information what was done from other > people who did porting to LRA. So in theory it might be just pulling the switch for some? That is, removing their definition of TARGET_LRA_P which then defaults to true? Jeff might be able to test this for (all) targets on his harness. Richard.
[committed] analyzer: complain about tainted sizes with "access" attribute [PR103940]
GCC 10 gained the "access" function and type attribute, which optionally can take a size-index param: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html -fanalyzer in trunk (for GCC 12) has gained a -Wanalyzer-tainted-size to complain about attacker-controlled size values, but this was only being used deep inside the region-model code when handling the hardcoded known behavior of certain functions (memset, IIRC). This patch extends -Wanalyzer-tainted-size to also complain about unsanitized attacker-controlled values being passed to function parameters marked as a size via the "access" attribute. Note that -fanalyzer-checker=taint is currently required in addition to -fanalyzer to use this warning, due to scaling issues (see bug 103533). Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-6526-g2c16dfe6268eeeb4b7924ff423e274fa00894a4d. gcc/analyzer/ChangeLog: PR analyzer/103940 * engine.cc (impl_sm_context::impl_sm_context): Add "unknown_side_effects" param and use it to initialize new m_unknown_side_effects field. (impl_sm_context::unknown_side_effects_p): New. (impl_sm_context::m_unknown_side_effects): New. (exploded_node::on_stmt): Pass unknown_side_effects to sm_ctxt ctor. * sm-taint.cc: Include "stringpool.h" and "attribs.h". (tainted_size::tainted_size): Drop "dir" param. (tainted_size::get_kind): Drop "FINAL". (tainted_size::emit): Likewise. (tainted_size::m_dir): Drop unused field. (class tainted_access_attrib_size): New subclass. (taint_state_machine::on_stmt): Call check_for_tainted_size_arg on external functions with unknown side effects. (taint_state_machine::check_for_tainted_size_arg): New. (region_model::check_region_for_taint): Drop "dir" param from tainted_size ctor. * sm.h (sm_context::unknown_side_effects_p): New. gcc/testsuite/ChangeLog: PR analyzer/103940 * gcc.dg/analyzer/taint-size-access-attr-1.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/engine.cc| 17 ++- gcc/analyzer/sm-taint.cc | 116 -- gcc/analyzer/sm.h | 3 + .../analyzer/taint-size-access-attr-1.c | 63 ++ 4 files changed, 187 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-size-access-attr-1.c diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc index 18fc00d3857..f2a14f52caa 100644 --- a/gcc/analyzer/engine.cc +++ b/gcc/analyzer/engine.cc @@ -293,14 +293,16 @@ public: const sm_state_map *old_smap, sm_state_map *new_smap, path_context *path_ctxt, - stmt_finder *stmt_finder = NULL) + stmt_finder *stmt_finder = NULL, + bool unknown_side_effects = false) : sm_context (sm_idx, sm), m_logger (eg.get_logger ()), m_eg (eg), m_enode_for_diag (enode_for_diag), m_old_state (old_state), m_new_state (new_state), m_old_smap (old_smap), m_new_smap (new_smap), m_path_ctxt (path_ctxt), -m_stmt_finder (stmt_finder) +m_stmt_finder (stmt_finder), +m_unknown_side_effects (unknown_side_effects) { } @@ -490,6 +492,11 @@ public: return m_path_ctxt; } + bool unknown_side_effects_p () const FINAL OVERRIDE + { +return m_unknown_side_effects; + } + log_user m_logger; exploded_graph &m_eg; exploded_node *m_enode_for_diag; @@ -499,6 +506,9 @@ public: sm_state_map *m_new_smap; path_context *m_path_ctxt; stmt_finder *m_stmt_finder; + + /* Are we handling an external function with unknown side effects? */ + bool m_unknown_side_effects; }; /* Subclass of stmt_finder for finding the best stmt to report the leak at, @@ -1325,7 +1335,8 @@ exploded_node::on_stmt (exploded_graph &eg, = old_state.m_checker_states[sm_idx]; sm_state_map *new_smap = state->m_checker_states[sm_idx]; impl_sm_context sm_ctxt (eg, sm_idx, sm, this, &old_state, state, - old_smap, new_smap, path_ctxt); + old_smap, new_smap, path_ctxt, NULL, + unknown_side_effects); /* Allow the state_machine to handle the stmt. */ if (sm.on_stmt (&sm_ctxt, snode, stmt)) diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc index 8f274df997a..54c7e6015ab 100644 --- a/gcc/analyzer/sm-taint.cc +++ b/gcc/analyzer/sm-taint.cc @@ -41,6 +41,8 @@ along with GCC; see the file COPYING3. If not see #include "cgraph.h" #include "cfg.h" #include "digraph.h" +#include "stringpool.h" +#include "attribs.h" #include "analyzer/supergraph.h" #include "analyzer/call-string.h" #include "analyzer/program-point.h" @@ -102,6 +104,13 @@ public: state_t combine_states (state_t s0, state_t s1)
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On Wed, Jan 12, 2022 at 4:00 PM Richard Biener wrote: > > On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov wrote: > > > > > > On 2022-01-12 03:47, Richard Biener wrote: > > > On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches > > > wrote: > > >> > > >> On 2022-01-11 12:42, Richard Sandiford wrote: > > >>> The new IRA heuristics would need more work on old-reload targets, > > >>> since flattening needs to be able to undo the cost propagation. > > >>> It's doable, but hardly seems worth it. > > >> Agree. It is not worth to spend your time for work for reload. > > >>> This patch therefore makes all the new calls to > > >>> ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. > > >>> The color_pass code that predated the new function (and that was > > >>> the source of ira_subloop_allocnos_can_differ_p) continues to > > >>> behave as before. > > >>> > > >>> It's a hack, but at least it has the advantage that the new parameter > > >>> would become obviously unused if reload and (!)ira_use_lra_p were > > >>> removed. The hack should therefore disappear alongside reload. > > >> I have a feeling that it will stay for a long time if not forever. > > > We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ > > > > > > At some point Eric wrote a nice summary for how to transition targets > > > away from CC0, I wonder if there's something similar for transitioning > > > a port away from reload to LRA? In those 34 targets there must be some > > > for which that's a relatively easy task? I suppose it depends on how > > > much of the reload target hooks are put to use and how those translate > > > to LRA. > > > > First of all the target should be rid of using CC0. Then theoretically > > :) the target should work with LRA as LRA uses existing reload hooks > > (more accurately a subset of them). > > CC0 is no more, we've accomplished that feat for GCC 12! > > > In practice some work should be done for switching to LRA. I did first > > 4 major targets to work with LRA and unfortunately did not find some > > repeating patterns in this work. The problems for the first targets > > were mostly unique and required a lot of LRA code modifications. After > > that people did other target switching pretty easily and spent few time > > for this as I remember. > > > > So based on my experience of porting targets to LRA I can not formalize > > this work. But probably it can be done by examining all LRA targets > > code (mostly looking at machine dependent code related to use > > lra_in_progress_p) or by collecting information what was done from other > > people who did porting to LRA. > > So in theory it might be just pulling the switch for some? That is, > removing their definition of TARGET_LRA_P which then defaults > to true? I can confirm it works for visium which builds after such change and can compile a hello world without crashing. So it might be a viable strathegy for the less maintained odd architectures we still have in our tree. > Jeff might be able to test this for (all) targets on his harness. > > Richard.
Re: PING^2 (C/C++): Re: [PATCH 6/6] Add __attribute__ ((tainted))
On Tue, 2022-01-11 at 23:36 -0500, Jason Merrill wrote: > On 1/10/22 16:36, David Malcolm via Gcc-patches wrote: > > On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote: > > > On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote: > > > > This patch adds a new __attribute__ ((tainted)) to the C/C++ > > > > frontends. > > > > > > Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of > > > this > > > patch (attribute registration, documentation, the name of the > > > attribute, etc). > > > > > > (I believe it's independent of the rest of the patch kit, in that > > > it > > > could go into trunk without needing the prior patches) > > > > > > Thanks > > > Dave > > > > Getting close to end of stage 3 for GCC 12, so pinging this patch > > again... > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html > > The c-family change is OK. Thanks. I'm retesting the patch now, but it now seems to me that __attribute__((tainted_args)) would lead to more readable code than: __attribute__((tainted)) in that the name "tainted_args" better conveys the idea that all arguments are under attacker-control (as opposed to the body of the function or the function pointer being under attacker-control). Looking at https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html we already have some attributes with underscores in their names. Does this sound good? Dave > > > Thanks > > Dave > > > > > > > > > > > > > > > > It can be used on function decls: the analyzer will treat as > > > > tainted > > > > all parameters to the function and all buffers pointed to by > > > > parameters > > > > to the function. Adding this in one place to the Linux kernel's > > > > __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls > > > > as > > > > having tainted inputs. This gives additional testing beyond e.g. > > > > __user > > > > pointers added by earlier patches - an example of the use of this > > > > can > > > > be > > > > seen in CVE-2011-2210, where given: > > > > > > > > SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user > > > > *, > > > > buffer, > > > > unsigned long, nbytes, int __user *, start, > > > > void > > > > __user *, arg) > > > > > > > > the analyzer will treat the nbytes param as under attacker > > > > control, > > > > and > > > > can complain accordingly: > > > > > > > > taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’: > > > > taint-CVE-2011-2210-1.c:69:21: warning: use of attacker- > > > > controlled > > > > value > > > > ‘nbytes’ as size without upper-bounds checking [CWE-129] [- > > > > Wanalyzer-tainted-size] > > > > 69 | if (copy_to_user(buffer, hwrpb, nbytes) > > > > != 0) > > > > | ^~~ > > > > > > > > Additionally, the patch allows the attribute to be used on field > > > > decls: > > > > specifically function pointers. Any function used as an > > > > initializer > > > > for such a field gets treated as tainted. An example can be seen > > > > in > > > > CVE-2020-13143, where adding __attribute__((tainted)) to the > > > > "store" > > > > callback of configfs_attribute: > > > > > > > > struct configfs_attribute { > > > > /* [...snip...] */ > > > > ssize_t (*store)(struct config_item *, const char *, > > > > size_t) > > > > __attribute__((tainted)); > > > > /* [...snip...] */ > > > > }; > > > > > > > > allows the analyzer to see: > > > > > > > > CONFIGFS_ATTR(gadget_dev_desc_, UDC); > > > > > > > > and treat gadget_dev_desc_UDC_store as tainted, so that it > > > > complains: > > > > > > > > taint-CVE-2020-13143-1.c: In function > > > > ‘gadget_dev_desc_UDC_store’: > > > > taint-CVE-2020-13143-1.c:33:17: warning: use of attacker- > > > > controlled > > > > value > > > > ‘len + 18446744073709551615’ as offset without upper-bounds > > > > checking [CWE-823] [-Wanalyzer-tainted-offset] > > > > 33 | if (name[len - 1] == '\n') > > > > | ^ > > > > > > > > Similarly, the attribute could be used on the ioctl callback > > > > field, > > > > USB device callbacks, network-handling callbacks etc. This > > > > potentially > > > > gives a lot of test coverage with relatively little code > > > > annotation, > > > > and > > > > without necessarily needing link-time analysis (which -fanalyzer > > > > can > > > > only do at present on trivial examples). > > > > > > > > I believe this is the first time we've had an attribute on a > > > > field. > > > > If that's an issue, I could prepare a version of the patch that > > > > merely allowed it on functions themselves. > > > > > > > > As before this currently still needs -fanalyzer-checker=taint (in > > > > addition to -fanalyzer). > > > > > > > > gcc/analyzer/ChangeLog: > > > > * engine.cc: Include "stringpool.h", "attribs.h", and > > > > "tree-dfa.h". > > > > (mark_params_as_tai
Re: [PATCH] Fix -Wformat-diag for rs6000 target.
On 1/12/22 02:02, Martin Liška wrote: Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for rs6000 target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap keywords and use %qs instead of %<%s%>. (rs6000_expand_builtin): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in testcases. * gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise. * gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise. --- gcc/config/rs6000/rs6000-call.c | 8 .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index c78b8b08c40..becdad73812 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode) "-mvsx"); break; case ENB_IEEE128_HW: - error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name); + error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name); The instances of the warning where floating point is at the end of a message aren't correct. The warning should be relaxed to allow unhyphenated floating point as a noun (as discussed briefly last March: https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566881.html) Martin break; case ENB_DFP: error ("%qs requires the %qs option", name, "-mhard-dfp"); @@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */, if (bif_is_nosoft (*bifaddr) && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT) { - error ("%<%s%> not supported with %<-msoft-float%>", + error ("%qs not supported with %<-msoft-float%>", bifaddr->bifname); return const0_rtx; } if (bif_is_no32bit (*bifaddr) && TARGET_32BIT) { - error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname); + error ("%qs is not supported in 32-bit mode", bifaddr->bifname); return const0_rtx; } if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode)) { - error ("%<%s%> requires % to be IBM 128-bit format", + error ("%qs requires % to be IBM 128-bit format", bifaddr->bifname); return const0_rtx; } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c index 34184812dc5..1225613b275 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c @@ -14,7 +14,7 @@ get_exponent (__ieee128 *p) { __ieee128 source = *p; - return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c index 13c64fc3acf..adf0e4f99df 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c @@ -12,5 +12,5 @@ get_significand (__ieee128 *p) { __ieee128 source = *p; - return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c index a5dd852e60f..6787a67950b 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c @@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p, __ieee128 significand = *significand_p; unsigned long long int exponent = *exponent_p; - return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ }
[PATCH] git-backport: support renamed .cc files in commit message.
Hi. There's a patch that enhances git-backport so that it updates commit messages for files which name ends now with .cc and is still .c on a branch. Example usage: $ git show test commit 8ed4b2cb9aa158c0ef418fd1ac66271664904604 (test) Author: Martin Liska Date: Wed Jan 12 16:08:13 2022 +0100 Fix file. gcc/ChangeLog: * ipa-icf.cc: Test me. gcc/ada/ChangeLog: * cal.c: aaa. libcpp/ChangeLog: * expr.cc: aaa. $ git gcc-backport test Auto-merging gcc/ada/cal.c Auto-merging gcc/ipa-icf.c Auto-merging libcpp/expr.c [test2 ee40feb077e] Fix file. Date: Wed Jan 12 16:08:13 2022 +0100 3 files changed, 3 insertions(+) Commit message updated: 2 .cc file(s) changed. $ git show test2 commit f59a1e736c1b68f07d83388a994df8d043e8aa6e (test2) Author: Martin Liska Date: Wed Jan 12 16:08:13 2022 +0100 Fix file. gcc/ChangeLog: * ipa-icf.c: Test me. gcc/ada/ChangeLog: * cal.c: aaa. libcpp/ChangeLog: * expr.c: aaa. (cherry picked from commit 8ed4b2cb9aa158c0ef418fd1ac66271664904604) MartinFrom 647a6dbaf8cde4ee07b95c4530a03f7774500914 Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 12 Jan 2022 16:35:41 +0100 Subject: [PATCH] git-backport: support renamed .cc files in commit message. The change can automatically update names for *.cc files that are part of a backport. contrib/ChangeLog: * git-backport.py: Support renaming of .cc files. --- contrib/git-backport.py | 50 + 1 file changed, 50 insertions(+) diff --git a/contrib/git-backport.py b/contrib/git-backport.py index 2b8e4686719..627bbd9ee66 100755 --- a/contrib/git-backport.py +++ b/contrib/git-backport.py @@ -20,7 +20,32 @@ # Boston, MA 02110-1301, USA. import argparse +import os import subprocess +import sys +import tempfile + +script_folder = os.path.dirname(os.path.abspath(__file__)) +verify_script = os.path.join(script_folder, + 'gcc-changelog/git_check_commit.py') + + +def replace_file_in_changelog(lines, filename): +if not filename.endswith('.cc'): +return + +# consider all componenets of a path: gcc/ipa-icf.cc +while filename: +for i, line in enumerate(lines): +if filename in line: +line = line.replace(filename, filename[:-1]) +lines[i] = line +return +parts = filename.split('/') +if len(parts) == 1: +return +filename = '/'.join(parts[1:]) + if __name__ == '__main__': parser = argparse.ArgumentParser(description='Backport a git revision and ' @@ -63,3 +88,28 @@ if __name__ == '__main__': subprocess.check_output(cmd, shell=True) else: print('Please resolve all remaining file conflicts.') +sys.exit(1) + +# Update commit message if change for a .cc file was taken +r = subprocess.run(f'{verify_script} HEAD', shell=True, encoding='utf8', + stdout=subprocess.PIPE, stderr=subprocess.PIPE) +if r.returncode != 0: +lines = r.stdout.splitlines() +cmd = 'git show -s --format=%B' +commit_message = subprocess.check_output(cmd, shell=True, + encoding='utf8').strip() +commit_message = commit_message.splitlines() + +todo = [line for line in lines if 'unchanged file mentioned' in line] +for item in todo: +filename = item.split()[-1].strip('"') +replace_file_in_changelog(commit_message, filename) + +with tempfile.NamedTemporaryFile('w', encoding='utf8', + delete=False) as w: +w.write('\n'.join(commit_message)) +w.close() +subprocess.check_output(f'git commit --amend -F {w.name}', +shell=True, encoding='utf8') +os.unlink(w.name) +print(f'Commit message updated: {len(todo)} .cc file(s) changed.') -- 2.34.1
Re: [PATCH] Fix redefined macro for epiphany.
On 1/12/2022 7:56 AM, Richard Biener wrote: On Wed, Jan 12, 2022 at 3:23 PM Jeff Law via Gcc-patches wrote: On 1/12/2022 2:47 AM, Martin Liška wrote: The following warning is emitted gazillion times. Fixes: In file included from ./tm.h:23, from gcc/genconfig.c:25: gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined 209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" | In file included from ./tm.h:21, from gcc/genconfig.c:25: gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition 671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata" Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if not defined. Please do. However, I think we should deprecate the epiphany. It's been broken for ~2 years randomly failing in reload. I tried to fix it once and quickly gave up. It currently ignores all testing failures in my tester as it's been horribly unstable. Joern is still listed as maintainer so let's CC him before doing anything. That said, if it builds it's good enough for GCC 12 - we can discuss any target deprecations during stage1 (or deprecate those that don't even build right now, though I'm not aware of any). cr16 still uses cc0. It's marked as deprecated in gcc-12 and assuming it doesn't get fixed, removal in gcc-13. m32c should be deprecated. While it'll build libgcc, it'll fault (due to register allocation/reloading issues) building newlib or anything relatively complex. This has been the case for at least 2 years. epiphany will build newlib consistently, so I guess keeping it another release is OK to see if Joern wants to step up and fix its reload issues. Jeff
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On 1/12/2022 8:00 AM, Richard Biener wrote: On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov wrote: On 2022-01-12 03:47, Richard Biener wrote: On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches wrote: On 2022-01-11 12:42, Richard Sandiford wrote: The new IRA heuristics would need more work on old-reload targets, since flattening needs to be able to undo the cost propagation. It's doable, but hardly seems worth it. Agree. It is not worth to spend your time for work for reload. This patch therefore makes all the new calls to ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. The color_pass code that predated the new function (and that was the source of ira_subloop_allocnos_can_differ_p) continues to behave as before. It's a hack, but at least it has the advantage that the new parameter would become obviously unused if reload and (!)ira_use_lra_p were removed. The hack should therefore disappear alongside reload. I have a feeling that it will stay for a long time if not forever. We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ At some point Eric wrote a nice summary for how to transition targets away from CC0, I wonder if there's something similar for transitioning a port away from reload to LRA? In those 34 targets there must be some for which that's a relatively easy task? I suppose it depends on how much of the reload target hooks are put to use and how those translate to LRA. First of all the target should be rid of using CC0. Then theoretically :) the target should work with LRA as LRA uses existing reload hooks (more accurately a subset of them). CC0 is no more, we've accomplished that feat for GCC 12! In practice some work should be done for switching to LRA. I did first 4 major targets to work with LRA and unfortunately did not find some repeating patterns in this work. The problems for the first targets were mostly unique and required a lot of LRA code modifications. After that people did other target switching pretty easily and spent few time for this as I remember. So based on my experience of porting targets to LRA I can not formalize this work. But probably it can be done by examining all LRA targets code (mostly looking at machine dependent code related to use lra_in_progress_p) or by collecting information what was done from other people who did porting to LRA. So in theory it might be just pulling the switch for some? That is, removing their definition of TARGET_LRA_P which then defaults to true? Jeff might be able to test this for (all) targets on his harness. Given a patch, it's trivial do throw it in and see what the fallout is. jeff
Re: [PATCH] epiphany: fir -Wformat-diag.
On 1/12/2022 2:57 AM, Martin Liška wrote: This fixes -Wformat-diag warnings. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/epiphany/epiphany.c (epiphany_handle_interrupt_attribute): Use %qs format specifier. (epiphany_override_options): Wrap keyword in %<, %>. OK. And similar changes for other ports are pre-approved. jeff
Re: [PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.
On 1/12/2022 2:56 AM, Martin Liška wrote: This is RFC. Cheers, Martin gcc/ChangeLog: * config/epiphany/epiphany.c (epiphany_mode_priority): Use gcc_unreachable for not handled cases. OK. As are similar changes in other ports. jeff
Re: [Patch][V2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables
> On Jan 12, 2022, at 4:46 AM, Richard Biener > wrote: > > On Tue, Jan 11, 2022 at 5:32 PM Qing Zhao wrote: >> > + /* Ignore the call to .DEFERRED_INIT that define the original > +var itself. */ > + if (is_gimple_assign (context)) > + { > + if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL) > + lhs_var = gimple_assign_lhs (context); > + else if (TREE_CODE (gimple_assign_lhs (context)) == > SSA_NAME) > + lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context)); > + } > + if (lhs_var > + && (lhs_var_name = DECL_NAME (lhs_var)) > + && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name)) > + && (strcmp (lhs_var_name_str, var_name_str) == 0)) > + return; > > likewise but I don't really understand what you are doing here. The above is to exclude the following case: temp = .DEFERRED_INIT (4, 2, “alt_reloc"); alt_reloc = temp; i.e, a call to .DEFERRED_INIT that define the original variable itself. >>> >>> How can this happen? It looks like a bug to me. Do you have a testcase? >> With -ftrivial-auto-var-init, During gimplification phase, almost all >> address taken variables that do not have an explicit initializer will have >> the above IR pattern. >> >> For example, gcc.dg/auto-init-uninit-16.c: >> >> [opc@qinzhao-ol8u3-x86 gcc]$ cat >> /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c >> >> With -ftrivial-auto-var-init=zero, the IR after gimplification is: >> >> _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); >> alt_reloc = _1; >> >> And the IR after SSA is similar as the above: >> >> _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); >> alt_reloc = _1; >> >> During the early uninitialized analysis phase, the above IR will feed to the >> analyzer, we should exclude such >> IR from issuing fake warnings. > > Yes, but how do we get to a fake warning here? We should eventually > run into the _1 def being used > by alt_reloc = ...; and then by means of using the string "alt_reloc" > warn about an uninitialized use of > alt_reloc? Is it the point of the use that is "misdiagnosed"? Yes, that’s the issue, the use point would be misdiagnosed without excluding this self definition chain. For the following: cat /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c 1 /* { dg-do compile } */ 2 /* { dg-options "-O2 -Wuninitialized -ftrivial-auto-var-init=zero" } */ 3 4 int foo, bar; 5 6 static 7 void decode_reloc(int reloc, int *is_alt) 8 { 9 if (reloc >= 20) 10 *is_alt = 1; 11 else if (reloc >= 10) 12 *is_alt = 0; 13 } 14 15 void testfunc() 16 { 17 int alt_reloc; 18 19 decode_reloc(foo, &alt_reloc); 20 21 if (alt_reloc) /* { dg-warning "may be used uninitialized" "" } */ 22 bar = 42; 23 } If I deleted the following from tree-ssa-uninit.c: + /* Ignore the call to .DEFERRED_INIT that define the original +var itself. */ + if (is_gimple_assign (context)) + { + if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL) + lhs_var = gimple_assign_lhs (context); + else if (TREE_CODE (gimple_assign_lhs (context)) == SSA_NAME) + lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context)); + } + if (lhs_var + && (lhs_var_name = DECL_NAME (lhs_var)) + && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name)) + && (strcmp (lhs_var_name_str, var_name_str) == 0)) + return; We will get the following warning: /home/opc/Work/GCC/latest_gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c: In function ‘testfunc’: /home/opc/Work/GCC/latest_gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c:17:7: warning: ‘alt_reloc’ is used uninitialized [-Wuninitialized] In which the line number of the usage point “17:7” is wrong, which is the declaration point of the variable. This warning is issued during “pass_early_warn_uninitialized” for the following IR: _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); alt_reloc = _1; The above usage point “alt_reloc = _1” actually is a fake usage, we should exclude this usage from issuing warning. The correct warning should be issued during “pass_late_warn_uninitialized” and for line 21 “if alt_reloc” for the following IR (the use point is “if (_1 != 0), which is for line 21: _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); if (_1 != 0) Hope this is clear now. > I was > expecting the first patch to fix this. > > I'll wait for an update to the second patch. I will update it today. thanks. Qing > > Richard. > >> >>> > I'm > also not sure > I understand the case we try to fix with passing the name - is that > for VLA decls > that
Re: [PATCH] Fix -Wformat-diag for s390x target.
On 1/12/22 02:03, Martin Liška wrote: Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for s390x target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap keyword in quotes. (s390_resolve_overloaded_builtin): Remove trailing dot. * config/s390/s390.c (s390_const_operand_ok): Use - for range. (s390_expand_builtin): Remove trailing dot. (s390_emit_prologue): Likewise, use semicolon. (s390_option_override_internal): Update keyword. * varasm.c (do_assemble_alias): Wrap keyword in quotes. --- gcc/config/s390/s390-c.c | 9 + gcc/config/s390/s390.c | 28 ++-- gcc/varasm.c | 2 +- 3 files changed, 20 insertions(+), 19 deletions(-) diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c index 600018421df..10bc6ac8900 100644 --- a/gcc/config/s390/s390-c.c +++ b/gcc/config/s390/s390-c.c @@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc, case S390_OVERLOADED_BUILTIN_s390_vec_step: if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE) { - error_at (loc, "builtin vec_step can only be used on vector types."); + error_at (loc, "builtin %qs can only be used on vector types", + "vec_step "); I'd have expected the warning to also trigger for the missing hyphen in "builtin" (as per the GCC coding conventions) but it looks like the code only looks for "builtin function". I must have done that because of the large number of misspellings. Regardless, if the name of the function is __builtin_vec_step (or __builtin_s390_vec_step?) then quoting the entire name of the function as is conventionally done in the rest of GCC would be one solution. Rather than hardcoding the name as a string there should be a way to do that by passing the right tree to %qD (or %qF like below). Based on my reading of the rest of the file I wonder if this might be the decl: s390_builtin_decls[bt_for_overloaded_builtin_var[S390_OVERLOADED_BUILTIN_s390_vec_step]] Alternatively, if the preferred name to call the function with is vec_step then simply vec_step without the "builtin" would suffice. (As an aside, there's a spurious space at the end of "vec_step " above.) return error_mark_node; } return build_int_cst (NULL_TREE, @@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc, if (ob_flags & B_INT) { error_at (loc, - "builtin %qF is for GCC internal use only.", + "builtin %qF is for GCC internal use only", ob_fndecl); return error_mark_node; } @@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc, } if (ob_flags & B_DEP) - warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl); + warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl); if (!TARGET_VX && (ob_flags & B_VX)) { @@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc, } if (bflags_overloaded_builtin_var[last_match_index] & B_DEP) - warning_at (loc, 0, "%qs matching variant is deprecated.", + warning_at (loc, 0, "%qs matching variant is deprecated", IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); /* Overloaded variants which have MAX set as low level builtin are diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 056002e4a4a..bf96cbf7588 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, tree decl) argnum, decl, values); } else - error ("constant argument %d for builtin %qF is out of range (0..%wu)", + error ("constant argument %d for builtin %qF is out of range (0-%wu)", argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1); return false; @@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, tree decl) || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1)) { error ("constant argument %d for builtin %qF is out of range " - "(%wd..%wd)", argnum, decl, + "(%wd-%wd)", argnum, decl, -(HOST_WIDE_INT_1 << (bitwidth - 1)), (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1); return false; @@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, if ((bflags & B_HTM) && !TARGET_HTM) { error ("builtin %qF is not supported without %<-mhtm%> " - "(default with %<-march=zEC12%> and higher).", fndecl); + "(default with %<-march=zEC12%> and higher)", fndecl); return const0_rtx; } if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX) { error ("builtin %qF requires %
[committed] libstdc++: Add explicit dg-do directive to .../103955.cc
libstdc++-v3/ChangeLog: * testsuite/20_util/to_chars/103955.cc: Add explicit dg-do directive. --- libstdc++-v3/testsuite/20_util/to_chars/103955.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/libstdc++-v3/testsuite/20_util/to_chars/103955.cc b/libstdc++-v3/testsuite/20_util/to_chars/103955.cc index 3753c87ad8b..e7f17f7a9e8 100644 --- a/libstdc++-v3/testsuite/20_util/to_chars/103955.cc +++ b/libstdc++-v3/testsuite/20_util/to_chars/103955.cc @@ -1,6 +1,7 @@ // PR libstdc++/103955 // Verify we don't crash when the floating-point to_chars overloads are passed // a large precision argument. +// { dg-do run { target c++17 } } #include -- 2.35.0.rc0
Re: [PATCH] Fix -Wformat-diag for rs6000 target.
On 1/12/22 02:02, Martin Liška wrote: Hello. We've got -Wformat-diag for some time and I think we should start using it in -Werror for GCC bootstrap. The following patch removes last pieces of the warning for rs6000 target. Ready to be installed? Thanks, Martin gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap keywords and use %qs instead of %<%s%>. (rs6000_expand_builtin): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in testcases. * gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise. * gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise. --- gcc/config/rs6000/rs6000-call.c | 8 .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index c78b8b08c40..becdad73812 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode) "-mvsx"); break; case ENB_IEEE128_HW: - error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name); + error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name); break; case ENB_DFP: error ("%qs requires the %qs option", name, "-mhard-dfp"); @@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */, if (bif_is_nosoft (*bifaddr) && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT) { - error ("%<%s%> not supported with %<-msoft-float%>", + error ("%qs not supported with %<-msoft-float%>", bifaddr->bifname); return const0_rtx; } if (bif_is_no32bit (*bifaddr) && TARGET_32BIT) { - error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname); + error ("%qs is not supported in 32-bit mode", bifaddr->bifname); return const0_rtx; } if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode)) { - error ("%<%s%> requires % to be IBM 128-bit format", + error ("%qs requires % to be IBM 128-bit format", bifaddr->bifname); return const0_rtx; } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c index 34184812dc5..1225613b275 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c @@ -14,7 +14,7 @@ get_exponent (__ieee128 *p) { __ieee128 source = *p; - return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_extract_exp (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ This instance of the warning isn't correct. It should be relaxed to allow unhyphenated floating point as a noun at the end of a message, as discussed briefly last March: https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566881.html Martin } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c index 13c64fc3acf..adf0e4f99df 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c @@ -12,5 +12,5 @@ get_significand (__ieee128 *p) { __ieee128 source = *p; - return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c index a5dd852e60f..6787a67950b 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c @@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p, __ieee128 significand = *significand_p; unsigned long long int exponent = *exponent_p; - return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */ + return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */ }
[PATCH] Allow other languages to change long double format on PowerPC
Allow other languages to change long double format on PowerPC. With Fortran adding support for changing the long double format, this patch removes the code that only allowed C/C++ to change the long double format for GLIBC 2.32 and later without a warning. I have bootstraped the compiler with this change and there where no regressesion. In addition, I have applied it to the power-ieee128 branch where the work is being done to add support for both types of 128-bit floating point on PowerPC. If I build a compiler from that branch, I can change the 128-bit floating point format at compile time. Can I check this patch into the trunk GCC compiler? gcc/ 2022-01-12 Michael Meissner * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove checks for only C/C++ front ends before allowing the long double format to change without a warning. --- gcc/config/rs6000/rs6000.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 319182e94d9..0e3481c8327 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -4221,13 +4221,11 @@ rs6000_option_override_internal (bool global_init_p) if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && TARGET_LONG_DOUBLE_128) { /* Determine if the user can change the default long double type at -compilation time. Only C and C++ support this, and you need GLIBC -2.32 or newer. Only issue one warning. */ +compilation time. You need GLIBC 2.32 or newer to be able to +change the long double type. Only issue one warning. */ static bool warned_change_long_double; - if (!warned_change_long_double - && (!glibc_supports_ieee_128bit () - || (!lang_GNU_C () && !lang_GNU_CXX ( + if (!warned_change_long_double && !glibc_supports_ieee_128bit ()) { warned_change_long_double = true; if (TARGET_IEEEQUAD) -- 2.33.1 -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
Re: [PATCH] Allow other languages to change long double format on PowerPC
On Wed, Jan 12, 2022 at 12:46:16PM -0500, Michael Meissner via Gcc-patches wrote: > Allow other languages to change long double format on PowerPC. > > With Fortran adding support for changing the long double format, this > patch removes the code that only allowed C/C++ to change the long double > format for GLIBC 2.32 and later without a warning. > > I have bootstraped the compiler with this change and there where no > regressesion. In addition, I have applied it to the power-ieee128 branch > where > the work is being done to add support for both types of 128-bit floating point > on PowerPC. If I build a compiler from that branch, I can change the 128-bit > floating point format at compile time. > > Can I check this patch into the trunk GCC compiler? > > gcc/ > 2022-01-12 Michael Meissner > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove > checks for only C/C++ front ends before allowing the long double > format to change without a warning. This is already on the trunk, I've merged the power-ieee128 branch changes into trunk yesterday (r12-6491 through r12-6508 commits inclusive). This patch in particular as r12-6503-g7d8011fa00fca283003c84e23a8ca66286f83dfa. Jakub
Re: [PATCH] Allow other languages to change long double format on PowerPC
On Wed, Jan 12, 2022 at 06:50:04PM +0100, Jakub Jelinek wrote: > On Wed, Jan 12, 2022 at 12:46:16PM -0500, Michael Meissner via Gcc-patches > wrote: > > Allow other languages to change long double format on PowerPC. > > > > With Fortran adding support for changing the long double format, this > > patch removes the code that only allowed C/C++ to change the long double > > format for GLIBC 2.32 and later without a warning. > > > > I have bootstraped the compiler with this change and there where no > > regressesion. In addition, I have applied it to the power-ieee128 branch > > where > > the work is being done to add support for both types of 128-bit floating > > point > > on PowerPC. If I build a compiler from that branch, I can change the > > 128-bit > > floating point format at compile time. > > > > Can I check this patch into the trunk GCC compiler? > > > > gcc/ > > 2022-01-12 Michael Meissner > > > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove > > checks for only C/C++ front ends before allowing the long double > > format to change without a warning. > > This is already on the trunk, I've merged the power-ieee128 branch changes > into trunk yesterday (r12-6491 through r12-6508 commits inclusive). > This patch in particular as > r12-6503-g7d8011fa00fca283003c84e23a8ca66286f83dfa. Thanks. I didn't notice that the patches had been merged into the trunk. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
[PATCH] Use system default for long double if not specified on PowerPC.
Use system default for long double if not specified on PowerPC. If the user did not specify a default long double format, use the long double default for the build compiler for the long double default. This patch will allow compilers built on a distribution that has changed the 128-bit floating point format to use the default used on the system. I did a normal normal bootstrap and make check regression on a little endian power9 system and there were no regressions. In addition, I built a compiler where I configured the default to use IEEE 128-bit floating point for long double. I then used that compiler to build a bootstrap with this patch applied and I did not set the floating point format. I verified that the compiler built with this patch defaults long double to be IEEE 128-bit. Can I apply this patch to the trunk for GCC 12? gcc/ 2022-01-12 Michael Meissner * config/rs6000/rs6000.c (TARGET_IEEEQUAD_DEFAULT): If the compiler used to build this compiler defaults long double to IEEE 128-bit, then make it the default for the target. --- gcc/config/rs6000/rs6000.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 0e3481c8327..e59f7aa49c8 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -91,14 +91,22 @@ explicitly redefine TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so those systems will not pick up this default. This needs to be after all of the include files, so that POWERPC_LINUX and POWERPC_FREEBSD are - properly defined. */ + properly defined. + + If we are being built by a compiler that uses IEEE 128-bit as the default + long double and no explicit long double format was selected, then also + default long double to IEEE 128-bit. */ #ifndef TARGET_IEEEQUAD_DEFAULT #if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD) #define TARGET_IEEEQUAD_DEFAULT 1 #else +#ifdef __LONG_DOUBLE_IEEE128__ +#define TARGET_IEEEQUAD_DEFAULT 1 +#else #define TARGET_IEEEQUAD_DEFAULT 0 #endif #endif +#endif /* Don't enable PC-relative addressing if the target does not support it. */ #ifndef PCREL_SUPPORTED_BY_OS -- 2.34.1 -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
Re: [PATCH] ira: Fix old-reload targets [PR103974]
> From: Jeff Law via Gcc-patches > Date: Wed, 12 Jan 2022 16:58:50 +0100 > On 1/12/2022 8:00 AM, Richard Biener wrote: > > On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov > > wrote: > >> > >> On 2022-01-12 03:47, Richard Biener wrote: > >>> On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches > >>> wrote: > On 2022-01-11 12:42, Richard Sandiford wrote: > > The new IRA heuristics would need more work on old-reload targets, > > since flattening needs to be able to undo the cost propagation. > > It's doable, but hardly seems worth it. > Agree. It is not worth to spend your time for work for reload. > > This patch therefore makes all the new calls to > > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. > > The color_pass code that predated the new function (and that was > > the source of ira_subloop_allocnos_can_differ_p) continues to > > behave as before. > > > > It's a hack, but at least it has the advantage that the new parameter > > would become obviously unused if reload and (!)ira_use_lra_p were > > removed. The hack should therefore disappear alongside reload. > I have a feeling that it will stay for a long time if not forever. > >>> We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ > >>> > >>> At some point Eric wrote a nice summary for how to transition targets > >>> away from CC0, I wonder if there's something similar for transitioning > >>> a port away from reload to LRA? In those 34 targets there must be some > >>> for which that's a relatively easy task? I suppose it depends on how > >>> much of the reload target hooks are put to use and how those translate > >>> to LRA. > >> First of all the target should be rid of using CC0. Then theoretically > >> :) the target should work with LRA as LRA uses existing reload hooks > >> (more accurately a subset of them). > > CC0 is no more, we've accomplished that feat for GCC 12! > > > >> In practice some work should be done for switching to LRA. I did first > >> 4 major targets to work with LRA and unfortunately did not find some > >> repeating patterns in this work. The problems for the first targets > >> were mostly unique and required a lot of LRA code modifications. After > >> that people did other target switching pretty easily and spent few time > >> for this as I remember. > >> > >> So based on my experience of porting targets to LRA I can not formalize > >> this work. But probably it can be done by examining all LRA targets > >> code (mostly looking at machine dependent code related to use > >> lra_in_progress_p) or by collecting information what was done from other (About lra_in_progress_p, I see mixes of reload_in_progress and lra_in_progress (no "_p") in supposed-switched-targets. And shouldn't reload_completed be renamed (actually: some kind or alias for) lra_completed to avoid confusion?) I recall comments about code quality regressions. Are there actual numbers? (Preferably from around the transition time, because I bet targets still supporting "-mlra" have regressed on the reload side since then.) > >> people who did porting to LRA. > > So in theory it might be just pulling the switch for some? That is, > > removing their definition of TARGET_LRA_P which then defaults > > to true? > > > > Jeff might be able to test this for (all) targets on his harness. > Given a patch, it's trivial do throw it in and see what the fallout is. Again there's talk about LRA and comparing it to CC0, so again I'll remind of the lack of documentation for LRA (in contrast to CC0). I'm not just referring to guides to use for switching over a target to LRA, but sure, that'll help too. For starters, for each constraint and register-class macro and hook, what's the difference between reload and LRA; which ones are unused and which ones are new? brgds, H-P
Re: [PATCH] ira: Fix old-reload targets [PR103974]
> From: Jeff Law via Gcc-patches > Date: Wed, 12 Jan 2022 16:58:50 +0100 > On 1/12/2022 8:00 AM, Richard Biener wrote: > > On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov > > wrote: > >> > >> On 2022-01-12 03:47, Richard Biener wrote: > >>> On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches > >>> wrote: > On 2022-01-11 12:42, Richard Sandiford wrote: > > The new IRA heuristics would need more work on old-reload targets, > > since flattening needs to be able to undo the cost propagation. > > It's doable, but hardly seems worth it. > Agree. It is not worth to spend your time for work for reload. > > This patch therefore makes all the new calls to > > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p. > > The color_pass code that predated the new function (and that was > > the source of ira_subloop_allocnos_can_differ_p) continues to > > behave as before. > > > > It's a hack, but at least it has the advantage that the new parameter > > would become obviously unused if reload and (!)ira_use_lra_p were > > removed. The hack should therefore disappear alongside reload. > I have a feeling that it will stay for a long time if not forever. > >>> We indeed seem to have 34 targets w/o LRA by default and only 15 with :/ > >>> > >>> At some point Eric wrote a nice summary for how to transition targets > >>> away from CC0, I wonder if there's something similar for transitioning > >>> a port away from reload to LRA? In those 34 targets there must be some > >>> for which that's a relatively easy task? I suppose it depends on how > >>> much of the reload target hooks are put to use and how those translate > >>> to LRA. > >> First of all the target should be rid of using CC0. Then theoretically > >> :) the target should work with LRA as LRA uses existing reload hooks > >> (more accurately a subset of them). > > CC0 is no more, we've accomplished that feat for GCC 12! > > > >> In practice some work should be done for switching to LRA. I did first > >> 4 major targets to work with LRA and unfortunately did not find some > >> repeating patterns in this work. The problems for the first targets > >> were mostly unique and required a lot of LRA code modifications. After > >> that people did other target switching pretty easily and spent few time > >> for this as I remember. > >> > >> So based on my experience of porting targets to LRA I can not formalize > >> this work. But probably it can be done by examining all LRA targets > >> code (mostly looking at machine dependent code related to use > >> lra_in_progress_p) or by collecting information what was done from other (About lra_in_progress_p, I see mixes of reload_in_progress and lra_in_progress (no "_p") in supposed-switched-targets. And shouldn't reload_completed be renamed (actually: some kind or alias for) lra_completed to avoid confusion?) I recall comments about code quality regressions. Are there actual numbers? (Preferably from around the transition time, because I bet targets still supporting "-mlra" have regressed on the reload side since then.) > >> people who did porting to LRA. > > So in theory it might be just pulling the switch for some? That is, > > removing their definition of TARGET_LRA_P which then defaults > > to true? > > > > Jeff might be able to test this for (all) targets on his harness. > Given a patch, it's trivial do throw it in and see what the fallout is. Again there's talk about LRA and comparing it to CC0, so again I'll remind of the lack of documentation for LRA (in contrast to CC0). I'm not just referring to guides to use for switching over a target to LRA, but sure, that'll help too. For starters, for each constraint and register-class macro and hook, what's the difference between reload and LRA; which ones are unused and which ones are new? brgds, H-P
Re: [PATCH v5 1/1] [ARM] Add support for TLS register based stack protector canary access
On Wed, 17 Nov 2021 at 18:12, Ard Biesheuvel wrote: > > (+ Ramana) > Ping? > On Mon, 15 Nov 2021 at 19:04, Ard Biesheuvel wrote: > > > > Add support for accessing the stack canary value via the TLS register, > > so that multiple threads running in the same address space can use > > distinct canary values. This is intended for the Linux kernel running in > > SMP mode, where processes entering the kernel are essentially threads > > running the same program concurrently: using a global variable for the > > canary in that context is problematic because it can never be rotated, > > and so the OS is forced to use the same value as long as it remains up. > > > > Using the TLS register to index the stack canary helps with this, as it > > allows each CPU to context switch the TLS register along with the rest > > of the process, permitting each process to use its own value for the > > stack canary. > > > > 2021-11-15 Ard Biesheuvel > > > > * config/arm/arm-opts.h (enum stack_protector_guard): New > > * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem): > > New > > * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define > > (arm_option_override_internal): Handle and put in error checks > > for stack protector guard options. > > (arm_option_reconfigure_globals): Likewise > > (arm_stack_protect_tls_canary_mem): New > > (arm_stack_protect_guard): New > > * config/arm/arm.md (stack_protect_set): New > > (stack_protect_set_tls): Likewise > > (stack_protect_test): Likewise > > (stack_protect_test_tls): Likewise > > (reload_tp_hard): Likewise > > * config/arm/arm.opt (-mstack-protector-guard): New > > (-mstack-protector-guard-offset): New. > > * doc/invoke.texi: Document new options > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/arm/stack-protector-7.c: New test. > > * gcc.target/arm/stack-protector-8.c: New test. > > > > Signed-off-by: Ard Biesheuvel > > --- > > gcc/config/arm/arm-opts.h| 6 ++ > > gcc/config/arm/arm-protos.h | 2 + > > gcc/config/arm/arm.c | 55 +++ > > gcc/config/arm/arm.md| 71 +++- > > gcc/config/arm/arm.opt | 22 ++ > > gcc/doc/invoke.texi | 11 +++ > > gcc/testsuite/gcc.target/arm/stack-protector-7.c | 10 +++ > > gcc/testsuite/gcc.target/arm/stack-protector-8.c | 5 ++ > > 8 files changed, 180 insertions(+), 2 deletions(-) > > > > diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h > > index 5c4b62f404f7..581ba3c4fbbb 100644 > > --- a/gcc/config/arm/arm-opts.h > > +++ b/gcc/config/arm/arm-opts.h > > @@ -69,4 +69,10 @@ enum arm_tls_type { > >TLS_GNU, > >TLS_GNU2 > > }; > > + > > +/* Where to get the canary for the stack protector. */ > > +enum stack_protector_guard { > > + SSP_TLSREG, /* per-thread canary in TLS register */ > > + SSP_GLOBAL /* global canary */ > > +}; > > #endif > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > > index 9b1f61394ad7..d8d605920c97 100644 > > --- a/gcc/config/arm/arm-protos.h > > +++ b/gcc/config/arm/arm-protos.h > > @@ -195,6 +195,8 @@ extern void arm_split_atomic_op (enum rtx_code, rtx, > > rtx, rtx, rtx, rtx, rtx); > > extern rtx arm_load_tp (rtx); > > extern bool arm_coproc_builtin_available (enum unspecv); > > extern bool arm_coproc_ldc_stc_legitimate_address (rtx); > > +extern rtx arm_stack_protect_tls_canary_mem (bool); > > + > > > > #if defined TREE_CODE > > extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree); > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > index a5b403eb3e49..e5077348ce07 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -829,6 +829,9 @@ static const struct attribute_spec > > arm_attribute_table[] = > > > > #undef TARGET_MD_ASM_ADJUST > > #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust > > + > > +#undef TARGET_STACK_PROTECT_GUARD > > +#define TARGET_STACK_PROTECT_GUARD arm_stack_protect_guard > > > > /* Obstack for minipool constant handling. */ > > static struct obstack minipool_obstack; > > @@ -3176,6 +3179,26 @@ arm_option_override_internal (struct gcc_options > > *opts, > >if (TARGET_THUMB2_P (opts->x_target_flags)) > > opts->x_inline_asm_unified = true; > > > > + if (arm_stack_protector_guard == SSP_GLOBAL > > + && opts->x_arm_stack_protector_guard_offset_str) > > +{ > > + error ("incompatible options %'-mstack-protector-guard=global%' and" > > +"%'-mstack-protector-guard-offset=%qs%'", > > +arm_stack_protector_guard_offset_str); > > +} > > + > > + if (opts->x_arm_stack_protector_guard_offset_str) > > +{ > > + char *end; > > + const char *str = arm_st
Re: [PATCH] ira: Fix old-reload targets [PR103974]
On Wed, 12 Jan 2022, Richard Biener via Gcc-patches wrote: > > So in theory it might be just pulling the switch for some? That is, > > removing their definition of TARGET_LRA_P which then defaults > > to true? > > I can confirm it works for visium which builds after such change and > can compile a hello world without crashing. So it might be a viable > strathegy for the less maintained odd architectures we still have in > our tree. FWIW I have massaged the VAX backend sufficiently for the `vax-netbsdelf' target to successfully build with `-mlra' as the default. There are still several ICE regressions throughout the testsuite however (none for old reload), to say nothing about severe code quality regression where it does build. So there seems to be a lot of work to do there yet even if we let code quality regress with LRA. I plan to do it, but it won't be quick. Maciej
Re: [PATCH] ira: Fix old-reload targets [PR103974]
> On Jan 12, 2022, at 1:13 PM, Hans-Peter Nilsson via Gcc-patches > wrote: > >> ... > I recall comments about code quality regressions. Are there > actual numbers? (Preferably from around the transition > time, because I bet targets still supporting "-mlra" have > regressed on the reload side since then.) I haven't looked in a while, but it is certainly the case that the -mlra code out of pdp11 is not as good as that coming out of the old reload. My understanding is that LRA isn't as friendly to memory-centric targets like pdp11 (and vax?). In particular, from what I understood there is no support, or at least no significant support, for the pre-decrement and post-increment register indirect references that those targets like so much. There was some suggestion along the lines of "please feel free to add it to LRA" but that's a seriously hairy undertaking for a programmer with no current knowledge of LRA at all. people who did porting to LRA. >>> So in theory it might be just pulling the switch for some? That is, >>> removing their definition of TARGET_LRA_P which then defaults >>> to true? >>> >>> Jeff might be able to test this for (all) targets on his harness. >> Given a patch, it's trivial do throw it in and see what the fallout is. > > Again there's talk about LRA and comparing it to CC0, so > again I'll remind of the lack of documentation for LRA (in > contrast to CC0). I'm not just referring to guides to use > for switching over a target to LRA, but sure, that'll help > too. > > For starters, for each constraint and register-class macro > and hook, what's the difference between reload and LRA; > which ones are unused and which ones are new? What I found interesting is that apparently, to first approximation, supporting LRA amounted to "just turn it on". I think there were some issues to fix in register classes or constraints, more along the lines of "these are things you should not do for either system but the old reload usually lets you get away with it". So those were handled by cleaning up the issue in question generally, not as an LRA-specific change. Compared to CC0 work the effort was vastly smaller, it's a major rewrite of the back end vs. a few small changes in a few spots. But it's quite possible that the true picture is different and that there should be more changes. And yes, there really should be documentation saying so. GCCint has traditionally been quite excellent; it would be distressing if the creation of new technology like LRA causes it to regress. paul
[PATCH] rs6000: Convert built-in constraints to form
Hi! When introducing the new built-in support, I tried to match as many existing error messages as possible. One common form was "argument X must be a Y-bit unsigned literal". Another was "argument X must be a literal between X' and Y', inclusive". During reviews, Segher requested that I eventually convert all messages of the first form into the second form for consistency. That's what this patch does, replacing all -form constraints (first form) with -form constraints (second form). For the moment, the parser will still accept arguments, but I've added a note in rs6000-builtins.def that this form is deprecated in favor of . I think it's harmless to leave it in, in case a desire for the distinction comes up in the future. Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this okay for trunk? Thanks! Bill 2022-01-12 Bill Schmidt gcc/ * config/rs6000/rs6000-builtins.def (MTFSB0): Replace -form constraints with -form constraints. (MTFSB1): Likewise. (MTFSF): Likewise. (UNPACK_IF): Likewise. (UNPACK_TF): Likewise. (DSS): Likewise. (DST): Likewise. (DSTST): Likewise. (DSTSTT): Likewise. (DSTT): Likewise. (VCFSX): Likewise. (VCFUX): Likewise. (VCTSXS): Likewise. (VCTUXS): Likewise. (VSLDOI_16QI): Likewise. (VSLDOI_4SF): Likewise. (VSLDOI_4SI): Likewise. (VSLDOI_8HI): Likewise. (VSPLTB): Likewise. (VSPLTH): Likewise. (VSPLTW): Likewise. (VEC_SET_V16QI): Likewise. (VEC_SET_V4SF): Likewise. (VEC_SET_V4SI): Likewise. (VEC_SET_V8HI): Likewise. (VSLDOI_2DF): Likewise. (VSLDOI_2DI): Likewise. (VEC_SET_V2DF): Likewise. (VEC_SET_V2DI): Likewise. (XVCVSXDDP_SCALE): Likewise. (XVCVUXDDP_SCALE): Likewise. (XXPERMDI_16QI): Likewise. (XXPERMDI_1TI): Likewise. (XXPERMDI_2DF): Likewise. (XXPERMDI_2DI): Likewise. (XXPERMDI_4SF): Likewise. (XXPERMDI_4SI): Likewise. (XXPERMDI_8HI): Likewise. (XXSLDWI_16QI): Likewise. (XXSLDWI_2DF): Likewise. (XXSLDWI_2DI): Likewise. (XXSLDWI_4SF): Likewise. (XXSLDWI_4SI): Likewise. (XXSLDWI_8HI): Likewise. (XXSPLTD_V2DF): Likewise. (XXSPLTD_V2DI): Likewise. (UNPACK_V1TI): Likewise. (BCDADD_V1TI): Likewise. (BCDADD_V16QI): Likewise. (BCDADD_EQ_V1TI): Likewise. (BCDADD_EQ_V16QI): Likewise. (BCDADD_GT_V1TI): Likewise. (BCDADD_GT_V16QI): Likewise. (BCDADD_LT_V1TI): Likewise. (BCDADD_LT_V16QI): Likewise. (BCDADD_OV_V1TI): Likewise. (BCDADD_OV_V16QI): Likewise. (BCDSUB_V1TI): Likewise. (BCDSUB_V16QI): Likewise. (BCDSUB_EQ_V1TI): Likewise. (BCDSUB_EQ_V16QI): Likewise. (BCDSUB_GT_V1TI): Likewise. (BCDSUB_GT_V16QI): Likewise. (BCDSUB_LT_V1TI): Likewise. (BCDSUB_LT_V16QI): Likewise. (BCDSUB_OV_V1TI): Likewise. (BCDSUB_OV_V16QI): Likewise. (VSTDCDP): Likewise. (VSTDCSP): Likewise. (VTDCDP): Likewise. (VTDCSP): Likewise. (TSTSFI_EQ_DD): Likewise. (TSTSFI_EQ_TD): Likewise. (TSTSFI_GT_DD): Likewise. (TSTSFI_GT_TD): Likewise. (TSTSFI_LT_DD): Likewise. (TSTSFI_LT_TD): Likewise. (TSTSFI_OV_DD): Likewise. (TSTSFI_OV_TD): Likewise. (VSTDCQP): Likewise. (DDEDPD): Likewise. (DDEDPDQ): Likewise. (DENBCD): Likewise. (DENBCDQ): Likewise. (DSCLI): Likewise. (DSCLIQ): Likewise. (DSCRI): Likewise. (DSCRIQ): Likewise. (UNPACK_TD): Likewise. (VSHASIGMAD): Likewise. (VSHASIGMAW): Likewise. (VCNTMBB): Likewise. (VCNTMBD): Likewise. (VCNTMBH): Likewise. (VCNTMBW): Likewise. (VREPLACE_UN_UV2DI): Likewise. (VREPLACE_UN_UV4SI): Likewise. (VREPLACE_UN_V2DF): Likewise. (VREPLACE_UN_V2DI): Likewise. (VREPLACE_UN_V4SF): Likewise. (VREPLACE_UN_V4SI): Likewise. (VREPLACE_ELT_UV2DI): Likewise. (VREPLACE_ELT_UV4SI): Likewise. (VREPLACE_ELT_V2DF): Likewise. (VREPLACE_ELT_V2DI): Likewise. (VREPLACE_ELT_V4SF): Likewise. (VREPLACE_ELT_V4SI): Likewise. (VSLDB_V16QI): Likewise. (VSLDB_V2DI): Likewise. (VSLDB_V4SI): Likewise. (VSLDB_V8HI): Likewise. (VSRDB_V16QI): Likewise. (VSRDB_V2DI): Likewise. (VSRDB_V4SI): Likewise. (VSRDB_V8HI): Likewise. (VXXSPLTI32DX_V4SF): Likewise. (VXXSPLTI32DX_V4SI): Likewise. (XXEVAL): Likewise. (XXGENPCVM_V16QI): Likewise. (XXGENPCVM_V2DI): Likewise. (XXGENPCVM_V4SI): Likewise. (XXGENPCVM_V8HI): Likewise.
[PATCH] i386: Add CC clobber and splits for 32-bit vector mode logic insns [PR100673, PR103861]
Add CC clobber to 32-bit vector mode logic insns to allow variants with general-purpose registers. Also improve ix86_sse_movcc to emit insn with CC clobber for narrow vector modes in order to re-enable conditional moves for 16-bit and 32-bit narrow vector modes with -msse2. 2022-01-12 Uroš Bizjak gcc/ChangeLog: PR target/100637 PR target/103861 * config/i386/i386-expand.c (ix86_emit_vec_binop): New static function. (ix86_expand_sse_movcc): Use ix86_emit_vec_binop instead of gen_rtx_X when constructing vector logic RTXes. (expand_vec_perm_pshufb2): Ditto. * config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit. (v2qi3): Ditto. (vcond): Re-enable for TARGET_SSE2. (vcondu): Ditto. (vcond_mask_): Ditto. (one_cmpl2): Remove expander. (one_cmpl2): Rename from one_cmplv2qi. Use VI_16_32 mode iterator. (one_cmpl2 splitters): Use VI_16_32 mode iterator. Use lowpart_subreg instead of gen_lowpart to create subreg. (*andnot3): Merge from "*andnot" and "*andnotv2qi3" insn patterns using VI_16_32 mode iterator. Disparage GPR alternative a bit. Add CC clobber. (*andnot3 splitters): Use VI_16_32 mode iterator. Use lowpart_subreg instead of gen_lowpart to create subreg. (*3): Merge from "*" and "*v2qi3 insn patterns" using VI_16_32 mode iterator. Disparage GPR alternative a bit. Add CC clobber. (*3 splitters):Use VI_16_32 mode iterator. Use lowpart_subreg instead of gen_lowpart to create subreg. gcc/testsuite/ChangeLog: PR target/100637 PR target/103861 * g++.target/i386/pr100637-1b.C (dg-options): Use -msse2 instead of -msse4.1. * g++.target/i386/pr100637-1w.C (dg-options): Ditto. * g++.target/i386/pr103861-1.C (dg-options): Ditto. * gcc.target/i386/pr100637-4b.c (dg-options): Ditto. * gcc.target/i386/pr103861-4.c (dg-options): Ditto. * gcc.target/i386/pr100637-1b.c: Remove scan-assembler directives for logic instructions. * gcc.target/i386/pr100637-1w.c: Ditto. * gcc.target/i386/warn-vect-op-2.c: Update dg-warning for vector logic operation. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 8b1266fb9f1..0318f126785 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -3752,6 +3752,27 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1, return dest; } +/* Emit x86 binary operand CODE in mode MODE for SSE vector + instructions that can be performed using GP registers. */ + +static void +ix86_emit_vec_binop (enum rtx_code code, machine_mode mode, +rtx dst, rtx src1, rtx src2) +{ + rtx tmp; + + tmp = gen_rtx_SET (dst, gen_rtx_fmt_ee (code, mode, src1, src2)); + + if (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (SImode) + && GET_MODE_CLASS (mode) == MODE_VECTOR_INT) +{ + rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG)); + tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob)); +} + + emit_insn (tmp); +} + /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical operations. This is used for both scalar and vector conditional moves. */ @@ -3820,23 +3841,20 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) else if (op_false == CONST0_RTX (mode)) { op_true = force_reg (mode, op_true); - x = gen_rtx_AND (mode, cmp, op_true); - emit_insn (gen_rtx_SET (dest, x)); + ix86_emit_vec_binop (AND, mode, dest, cmp, op_true); return; } else if (op_true == CONST0_RTX (mode)) { op_false = force_reg (mode, op_false); x = gen_rtx_NOT (mode, cmp); - x = gen_rtx_AND (mode, x, op_false); - emit_insn (gen_rtx_SET (dest, x)); + ix86_emit_vec_binop (AND, mode, dest, x, op_false); return; } else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode)) { op_false = force_reg (mode, op_false); - x = gen_rtx_IOR (mode, cmp, op_false); - emit_insn (gen_rtx_SET (dest, x)); + ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false); return; } else if (TARGET_XOP) @@ -4010,15 +4028,12 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) else t3 = dest; - x = gen_rtx_AND (mode, op_true, cmp); - emit_insn (gen_rtx_SET (t2, x)); + ix86_emit_vec_binop (AND, mode, t2, op_true, cmp); x = gen_rtx_NOT (mode, cmp); - x = gen_rtx_AND (mode, x, op_false); - emit_insn (gen_rtx_SET (t3, x)); + ix86_emit_vec_binop (AND, mode, t3, x, op_false); - x = gen_rtx_IOR (mode, t3, t2); - emit_insn (gen_rtx_SET (dest, x)); + ix86_emit_vec_binop (IOR, mode, dest, t3, t2); } } @@ -20733,7 +20748,7 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d) op = d->target; if (d-
Re: [PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.
On Tue, Jan 11, 2022 at 2:26 PM Roger Sayle wrote: > > > This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides > improved (shorter) implementations of V1TI mode arithmetic right shifts > for constant amounts between 111 and 126 bits. The significance of > this range is that this functionality is useful for (eventually) > providing sign extension from HImode and QImode to V1TImode. > > For example, x>>112 (to sign extend a 16-bit value), was previously > generated as a four operation sequence: > > movdqa %xmm0, %xmm1// word7 6 5 4 3 2 1 0 > psrad $31, %xmm0 // V8HI = [S,S,?,?,?,?,?,?] > psrad $16, %xmm1 // V8HI = [S,X,?,?,?,?,?,?] > punpckhqdq %xmm0, %xmm1// V8HI = [S,S,?,?,S,X,?,?] > pshufd $253, %xmm1, %xmm0 // V8HI = [S,S,S,S,S,S,S,X] > > with this patch, we now generates a three operation sequence: > > psrad $16, %xmm0 // V8HI = [S,X,?,?,?,?,?,?] > pshufhw $254, %xmm0, %xmm0 // V8HI = [S,S,S,X,?,?,?,?] > pshufd $254, %xmm0, %xmm0 // V8HI = [S,S,S,S,S,S,S,X] > > The correctness of generated code is confirmed by the existing > run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite. > This idiom is safe to use for shifts by 127, but that case gets handled > by a two operation sequence earlier in this function. > > > This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap > and make -k check with no new failures. OK for mainline? > > > 2022-01-11 Roger Sayle > > gcc/ChangeLog > * config/i386/i386-expand.c (ix86_expand_v1ti_ashiftrt): Provide > new three operation implementations for shifts by 111..126 bits. + if (bits >= 111) +{ + /* Three operations. */ + rtx tmp1 = gen_reg_rtx (V4SImode); + rtx tmp2 = gen_reg_rtx (V4SImode); + emit_move_insn (tmp1, gen_lowpart (V4SImode, op1)); + emit_insn (gen_ashrv4si3 (tmp2, tmp1, GEN_INT (bits - 96))); This can be written as: rtx tmp1 = force_reg (V4SImode, gen_lowpart (V4SImode, op1)); emit_insn (gen_ashrv4i3 (tmp2, tmp1, GEN_INT ...)); + rtx tmp3 = gen_reg_rtx (V8HImode); + rtx tmp4 = gen_reg_rtx (V8HImode); + emit_move_insn (tmp3, gen_lowpart (V8HImode, tmp2)); + emit_insn (gen_sse2_pshufhw (tmp4, tmp3, GEN_INT (0xfe))); Here in a similar way... + rtx tmp5 = gen_reg_rtx (V4SImode); + rtx tmp6 = gen_reg_rtx (V4SImode); + emit_move_insn (tmp5, gen_lowpart (V4SImode, tmp4)); + emit_insn (gen_sse2_pshufd (tmp6, tmp5, GEN_INT (0xfe))); ... also here. + rtx tmp7 = gen_reg_rtx (V1TImode); + emit_move_insn (tmp7, gen_lowpart (V1TImode, tmp6)); + emit_move_insn (operands[0], tmp7); And here a simple: emit_move_insn (operands[0], gen_lowpart (V1TImode, tmp6); + return; +} + Uros.
Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).
On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang wrote: > > Hi all, > > This patch targets PR94790, which change the instruction selection under the > following circumstance. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? Please also test with -m32, e.g.: make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}" OK (with an it below), if new testcases do not FAIL with -m32. Thanks, Uros. > > BRs, > Haochen > > From the perspective of the pipeline, `andn + and + ior` version take > 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor > will take 3 cycles. > > - xorl%edi, %esi > andl%edx, %esi > - movl%esi, %eax > - xorl%edi, %eax > + andn%edi, %edx, %eax > + orl %esi, %eax > > gcc/ChangeLog: > > PR taeget/94790 > * config/i386/i386.md (*xor2andn): New define_insn_and_split. > > gcc/testsuite/ChangeLog: > > PR taeget/94790 > * gcc.target/i386/pr94790-1.c: New test. > * gcc.target/i386/pr94790-2.c: Ditto. > --- > gcc/config/i386/i386.md | 39 +++ > gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 > gcc/testsuite/gcc.target/i386/pr94790-2.c | 9 ++ > 3 files changed, 62 insertions(+) > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 9b424a3935b..38efc6d5837 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -10452,6 +10452,45 @@ > (set_attr "znver1_decode" "double") > (set_attr "mode" "DI")]) > > +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask) > +(define_insn_and_split "*xor2andn" > + [(set (match_operand:SWI248 0 "nonimmediate_operand") > + (xor:SWI248 > + (and:SWI248 > + (xor:SWI248 > + (match_operand:SWI248 1 "nonimmediate_operand") > + (match_operand:SWI248 2 "nonimmediate_operand")) > + (match_operand:SWI248 3 "nonimmediate_operand")) > + (match_dup 1))) > +(clobber (reg:CC FLAGS_REG))] > + "(TARGET_BMI || TARGET_AVX512BW) > + && ix86_pre_reload_split ()" > + "#" > + "&& 1" > + [(parallel [(set (match_dup 4) > + (and:SWI248 > + (not:SWI248 > + (match_dup 3)) > + (match_dup 1))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 5) > + (and:SWI248 > + (match_dup 2) > + (match_dup 3))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 0) > + (ior:SWI248 > + (match_dup 4) > + (match_dup 5))) > + (clobber (reg:CC FLAGS_REG))])] > + { > +operands[1] = force_reg (mode, operands[1]); > +operands[3] = force_reg (mode, operands[3]); > +operands[4] = gen_reg_rtx (mode); > +operands[5] = gen_reg_rtx (mode); > + } > +) Please put brace just after the curved brace, see numerous examples in .md files. > + > ;; See comment for addsi_1_zext why we do use nonimmediate_operand > (define_insn "*si_1_zext" >[(set (match_operand:DI 0 "register_operand" "=r") > diff --git a/gcc/testsuite/gcc.target/i386/pr94790-1.c > b/gcc/testsuite/gcc.target/i386/pr94790-1.c > new file mode 100755 > index 000..6ebbec15cfd > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */ > +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) > +{ > + return a ^ ((a ^ b) & mask); > +} > + > +unsigned r2(unsigned a, unsigned b, unsigned mask) > +{ > + return (~mask & a) | (b & mask); > +} > diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c > b/gcc/testsuite/gcc.target/i386/pr94790-2.c > new file mode 100755 > index 000..d7b0eec5bef > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */ > +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) > +{ > + return a ^ ((a ^ b) & mask) + (a ^ b); > +} > -- > 2.18.1 >
[PATCH] testsuite: Compile g++.dg/vect/slp-pr98855.cc only for x86 targets [PR103935]
The testcase is x86 specific, other targets have different costs defined. 2022-01-12 Uroš Bizjak gcc/testsuite/ChangeLog: PR target/103935 * g++.dg/vect/slp-pr98855.cc: Compile only for x86 targets. Tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/testsuite/g++.dg/vect/slp-pr98855.cc b/gcc/testsuite/g++.dg/vect/slp-pr98855.cc index ff59eb95aca..e0527c492d5 100644 --- a/gcc/testsuite/g++.dg/vect/slp-pr98855.cc +++ b/gcc/testsuite/g++.dg/vect/slp-pr98855.cc @@ -1,6 +1,5 @@ -// { dg-do compile } -// { dg-additional-options "-fvect-cost-model=cheap" } -// { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } } +// { dg-do compile { target i?86-*-* x86_64-*-* } } +// { dg-additional-options "-fvect-cost-model=cheap -mavx2" } #include #include
[PATCH] PR fortran/67804 - ICE on data initialization of type(character) with wrong data
Dear Fortranners, the attached patch improves error recovery after an invalid structure constructor has been detected in a DATA statement. Testcase by Gerhard. Regtested on x86_64-pc-linux-gnu. OK for mainline? This should be a rather safe patch which I would like to backport to 11-branch after a suitable waiting period. Thanks, Harald From 31436189cb2859040703ec6baff816cd63ef Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Wed, 12 Jan 2022 21:24:49 +0100 Subject: [PATCH] Fortran: fix error recovery on bad structure constructor in DATA statement gcc/fortran/ChangeLog: PR fortran/67804 * primary.c (gfc_match_structure_constructor): Recover from errors that occurred while checking for a valid structure constructor in a DATA statement. gcc/testsuite/ChangeLog: PR fortran/67804 * gfortran.dg/pr93604.f90: Adjust to changed diagnostics. * gfortran.dg/pr67804.f90: New test. --- gcc/fortran/primary.c | 15 --- gcc/testsuite/gfortran.dg/pr67804.f90 | 25 + gcc/testsuite/gfortran.dg/pr93604.f90 | 2 +- 3 files changed, 38 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr67804.f90 diff --git a/gcc/fortran/primary.c b/gcc/fortran/primary.c index fd4d6af50c0..3f01f67cd49 100644 --- a/gcc/fortran/primary.c +++ b/gcc/fortran/primary.c @@ -3364,6 +3364,7 @@ gfc_match_structure_constructor (gfc_symbol *sym, gfc_expr **result) match m; gfc_expr *e; gfc_symtree *symtree; + bool t = true; gfc_get_ha_sym_tree (sym->name, &symtree); @@ -3394,10 +3395,18 @@ gfc_match_structure_constructor (gfc_symbol *sym, gfc_expr **result) in the structure constructor must be a constant. Try to reduce the expression here. */ if (gfc_in_match_data ()) -gfc_reduce_init_expr (e); +t = gfc_reduce_init_expr (e); - *result = e; - return MATCH_YES; + if (t) +{ + *result = e; + return MATCH_YES; +} + else +{ + gfc_free_expr (e); + return MATCH_ERROR; +} } diff --git a/gcc/testsuite/gfortran.dg/pr67804.f90 b/gcc/testsuite/gfortran.dg/pr67804.f90 new file mode 100644 index 000..e2009a5bfdb --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr67804.f90 @@ -0,0 +1,25 @@ +! { dg-do compile } +! PR fortran/67804 - ICE on bad type in structure constructor in DATA statement +! Contributed by G.Steinmetz + +program p + type t + character :: c + end type + type u + character, pointer :: c + end type + type(t) :: x0, x1, x2, x3, x4, x5, x6, x7, x8, x9 + type(u) :: y6 + data x0 /t('a')/ ! OK + data x1 /t(1)/ ! { dg-error "Cannot convert" } + data x2 /t(1.)/ ! { dg-error "Cannot convert" } + data x3 /t(1d1)/ ! { dg-error "Cannot convert" } + data x4 /t((0.,1.))/ ! { dg-error "Cannot convert" } + data x5 /t(.true.)/ ! { dg-error "Cannot convert" } + data x6 /t(null())/ ! { dg-error "neither a POINTER nor ALLOCATABLE" } + data x7 /t(['1'])/ ! { dg-error "The rank of the element" } + data x8 /t([1])/ ! { dg-error "Cannot convert" } + data x9 /t(z'0')/! { dg-error "Cannot convert" } + data y6 /u(null())/ ! OK +end diff --git a/gcc/testsuite/gfortran.dg/pr93604.f90 b/gcc/testsuite/gfortran.dg/pr93604.f90 index 2c695d37829..4040155120c 100644 --- a/gcc/testsuite/gfortran.dg/pr93604.f90 +++ b/gcc/testsuite/gfortran.dg/pr93604.f90 @@ -5,6 +5,6 @@ program p integer :: a end type type(t) :: x - data x /t(z'1')/ ! { dg-error "cannot appear in a structure constructor" } + data x /t(z'1')/ ! { dg-error "BOZ" } end -- 2.31.1
RE: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).
Hi Uros, I have also tested on -m32. They do not fail. Thx, Haochen -Original Message- From: Uros Bizjak Sent: Thursday, January 13, 2022 3:22 AM To: Jiang, Haochen Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao Subject: Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask). On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang wrote: > > Hi all, > > This patch targets PR94790, which change the instruction selection under the > following circumstance. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? Please also test with -m32, e.g.: make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}" OK (with an it below), if new testcases do not FAIL with -m32. Thanks, Uros. > > BRs, > Haochen > > From the perspective of the pipeline, `andn + and + ior` version take > 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor > will take 3 cycles. > > - xorl%edi, %esi > andl%edx, %esi > - movl%esi, %eax > - xorl%edi, %eax > + andn%edi, %edx, %eax > + orl %esi, %eax > > gcc/ChangeLog: > > PR taeget/94790 > * config/i386/i386.md (*xor2andn): New define_insn_and_split. > > gcc/testsuite/ChangeLog: > > PR taeget/94790 > * gcc.target/i386/pr94790-1.c: New test. > * gcc.target/i386/pr94790-2.c: Ditto. > --- > gcc/config/i386/i386.md | 39 +++ > gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 > gcc/testsuite/gcc.target/i386/pr94790-2.c | 9 ++ > 3 files changed, 62 insertions(+) > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index > 9b424a3935b..38efc6d5837 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -10452,6 +10452,45 @@ > (set_attr "znver1_decode" "double") > (set_attr "mode" "DI")]) > > +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b > +& mask) (define_insn_and_split "*xor2andn" > + [(set (match_operand:SWI248 0 "nonimmediate_operand") > + (xor:SWI248 > + (and:SWI248 > + (xor:SWI248 > + (match_operand:SWI248 1 "nonimmediate_operand") > + (match_operand:SWI248 2 "nonimmediate_operand")) > + (match_operand:SWI248 3 "nonimmediate_operand")) > + (match_dup 1))) > +(clobber (reg:CC FLAGS_REG))] > + "(TARGET_BMI || TARGET_AVX512BW) > + && ix86_pre_reload_split ()" > + "#" > + "&& 1" > + [(parallel [(set (match_dup 4) > + (and:SWI248 > + (not:SWI248 > + (match_dup 3)) > + (match_dup 1))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 5) > + (and:SWI248 > + (match_dup 2) > + (match_dup 3))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 0) > + (ior:SWI248 > + (match_dup 4) > + (match_dup 5))) > + (clobber (reg:CC FLAGS_REG))])] > + { > +operands[1] = force_reg (mode, operands[1]); > +operands[3] = force_reg (mode, operands[3]); > +operands[4] = gen_reg_rtx (mode); > +operands[5] = gen_reg_rtx (mode); > + } > +) Please put brace just after the curved brace, see numerous examples in .md files. > + > ;; See comment for addsi_1_zext why we do use nonimmediate_operand > (define_insn "*si_1_zext" >[(set (match_operand:DI 0 "register_operand" "=r") diff --git > a/gcc/testsuite/gcc.target/i386/pr94790-1.c > b/gcc/testsuite/gcc.target/i386/pr94790-1.c > new file mode 100755 > index 000..6ebbec15cfd > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */ > +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) { > + return a ^ ((a ^ b) & mask); > +} > + > +unsigned r2(unsigned a, unsigned b, unsigned mask) { > + return (~mask & a) | (b & mask); > +} > diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c > b/gcc/testsuite/gcc.target/i386/pr94790-2.c > new file mode 100755 > index 000..d7b0eec5bef > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */ > +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) { > + return a ^ ((a ^ b) & mask) + (a ^ b); } > -- > 2.18.1 >
Re: [PATCH, rs6000] Enable absolute jump table by default
Hi David, On 12/1/2022 下午 10:44, David Edelsohn wrote: > On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI wrote: >> >> Hi, >>This patch enables absolute jump table by default on rs6000. The relative >> jump tables are used when >>it's explicit set by "rs6000_relative_jumptables", >>or jump tables are placed in text section but global relocation is >> required. >> >>Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. >> Is this okay for trunk? >> Any recommendations? Thanks a lot. >> >> ChangeLog >> 2022-01-12 Haochen Gui >> >> gcc/ >> * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define. >> * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return >> true when relative jump table is explicit required or jump tables >> have >> to be put in text section but global relocation is also required. >> * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable. >> >> patch.diff >> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h >> index d617f346f81..2e257c60f8c 100644 >> --- a/gcc/config/rs6000/linux64.h >> +++ b/gcc/config/rs6000/linux64.h >> @@ -239,7 +239,7 @@ extern int dot_symbols; >> >> /* Indicate that jump tables go in the text section. */ >> #undef JUMP_TABLES_IN_TEXT_SECTION >> -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT >> +#define JUMP_TABLES_IN_TEXT_SECTION 0 >> >> /* The linux ppc64 ABI isn't explicit on whether aggregates smaller >> than a doubleword should be padded upward or downward. You could >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c >> index 319182e94d9..9fba893a27a 100644 >> --- a/gcc/config/rs6000/rs6000.c >> +++ b/gcc/config/rs6000/rs6000.c >> @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value) >> static bool >> rs6000_gen_pic_addr_diff_vec (void) >> { >> - return rs6000_relative_jumptables; >> + return rs6000_relative_jumptables >> +|| (JUMP_TABLES_IN_TEXT_SECTION >> +&& targetm.asm_out.reloc_rw_mask () == 3); >> } > > This seems like contorted logic and overriding the > rs6000_relative_jumptables option change. The later part of the patch > overrides rs6000_relative_jumptables for all rs6000 configurations, > and then changes this one use of rs6000_relative_jumptables to add > more logic to revert to the old meaning for some targets. > > What about all of the other uses of rs6000_relative_jumptables in the > target? What about rs6000.md? > > I highly doubt that this patch is correct. > > Why not override rs6000_relative_jumptables for PPC64 Linux instead of > changing its value globally and then trying to fix it up? > > Thanks, David Thanks for your comments. In this patch, I tried to enable absolute jump table on all rs6000 targets. For PPC64 Linux, it supports RELRO section (e.g. .data.rel.ro.local) as "JUMP_TABLES_IN_TEXT_SECTION" is set to 0. So, absolute jump tables could be placed in RELRO section whatever global relocation is required or not. The absolute jump table can't be placed in text section when global relocation is required as text section is not writable. So for other rs6000 targets, absolute jump table can't be used if the target doesn't support RELRO and global relocation is required also. Looking forward to your advice. Thanks again. > >> >> void >> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt >> index c2a77182a9e..75e3fa86829 100644 >> --- a/gcc/config/rs6000/rs6000.opt >> +++ b/gcc/config/rs6000/rs6000.opt >> @@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags) >> Generate (do not generate) MMA instructions. >> >> mrelative-jumptables >> -Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save >> +Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save >> >> mrop-protect >> Target Var(rs6000_rop_protect) Init(0)
[Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.
Hi, Richard, This is the updated version for the second patch, which is mainly the change for "Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables”. In this update, I mainly made the following change: 1. Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT to record the warning suppressed info. 2. Add and change the comments in multiple places to make the change more readable. Now, for the deleted variable, we will get the necessary info to report warning from the call to .DEFERRED_INIT. A. the name string of DECL from the 3rd parameter of the call; B. the location of the DECL from the location of the call; C. the call can also be used to hold the information on whether the warning has been issued or not to suppress warning messages when needed; Please see the detailed description below for the problem and solution of this patch. This patch has been bootstrapped and regressing tested on both X86 and aarch64. Okay for GCC12? thanks. Qing. Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables. With -ftrivial-auto-var-init, the address taken auto variable is replaced with a temporary variable during gimplification, and the original auto variable might be eliminated by compiler optimization completely. As a result, the current uninitialized warning analysis cannot get enough information from the IR, therefore the uninitialized warnings for address taken variable cannot be issued based on the current implemenation of -ftrival-auto-var-init. For more info please refer to: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html In order to improve this situation, we can improve uninitialized analysis for address taken auto variables with -ftrivial-auto-var-init as following: for the following stmt: _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); if (_1 != 0) The original variable DECL has been eliminated from the IR, all the necessary information that is needed for reporting the warnings for DECL can be acquired from the call to .DEFERRED_INIT. A. the name string of DECL from the 3rd parameter of the call; B. the location of the DECL from the location of the call; C. the call can also be used to hold the information on whether the warning has been issued or not to suppress warning messages when needed; The current testing cases for uninitialized warnings + -ftrivial-auto-var-init are adjusted to reflect the fact that we can issue warnings for address taken variables. gcc/ChangeLog: 2022-01-12 qing zhao * tree-ssa-uninit.c (warn_uninit): Handle .DEFERRED_INIT call with an anonymous SSA_NAME specially. (check_defs): Likewise. (warn_uninit_phi_uses): Adjust the message format for warn_uninit. (warn_uninitialized_vars): Likewise. (warn_uninitialized_phi): Likewise gcc/testsuite/ChangeLog: 2022-01-12 qing zhao * gcc.dg/auto-init-uninit-16.c (testfunc): Delete xfail to reflect the fact that address taken variable can be warned. * gcc.dg/auto-init-uninit-34.c (warn_scalar_1): Likewise. (warn_scalar_2): Likewise. * gcc.dg/auto-init-uninit-37.c (T1): Likewise. (T2): Likewise. * gcc.dg/auto-init-uninit-B.c (baz): Likewise. The complete patch is attached: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch Description: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch
[PATCH 1/2] Check negative combined step
Hi, Previously, there is discussion in: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586460.html I seperate it as two patches. This first patch is to avoid negative step when combining two ivs. The second patch is adding more accurate assumptions. This patch pass bootstrap and regtest on ppc64, ppc64le and x86_64. Is this ok for trunk? BR, Jiufu PR tree-optimization/100740 gcc/ChangeLog: * tree-ssa-loop-niter.c (number_of_iterations_cond): Check sign of combined step. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr100740.c: New test. --- gcc/tree-ssa-loop-niter.c | 6 -- gcc/testsuite/gcc.c-torture/execute/pr100740.c | 13 + 2 files changed, 17 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr100740.c diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index b767056aeb0..439d595a79f 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -1890,8 +1890,10 @@ number_of_iterations_cond (class loop *loop, tree step = fold_binary_to_constant (MINUS_EXPR, step_type, iv0->step, iv1->step); - /* No need to check sign of the new step since below code takes care -of this well. */ + /* Like cases shown in PR100740/102131, negtive step is not safe. */ + if (tree_int_cst_sign_bit (step)) + return false; + if (code != NE_EXPR && (TREE_CODE (step) != INTEGER_CST || !iv0->no_overflow || !iv1->no_overflow)) diff --git a/gcc/testsuite/gcc.c-torture/execute/pr100740.c b/gcc/testsuite/gcc.c-torture/execute/pr100740.c new file mode 100644 index 000..381cdeb947a --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr100740.c @@ -0,0 +1,13 @@ +/* PR tree-optimization/100740 */ + +unsigned a, b; +int +main () +{ + unsigned c = 0; + for (a = 0; a < 2; a++) +for (b = 0; b < 2; b++) + if (++c < a) + __builtin_abort (); + return 0; +} -- 2.17.1
[PATCH 2/2] Add assumption combining iv
This is the second patch for two IVs combining. When one IV is chasing another one, to make it safe, we should check if there is wrap/overflow for either IV. With the assumption, which computed as this patch, the number of iterations can be caculated, even the no_overflow flag is not updated for unsigned IVs, like the test case of this patch. This patch pass bootstrap and regtest on ppc64le and x86_64. Is this ok for trunk, or it may more suitable for stage1. BR, Jiufu PR tree-optimization/102131 gcc/ChangeLog: * tree-ssa-loop-niter.c (get_step_count): New function. (iv_chase_assumption): New function. (number_of_iterations_cond): Call iv_chase_assumption. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr102131.c: New test. --- gcc/tree-ssa-loop-niter.c| 73 gcc/testsuite/gcc.dg/vect/pr102131.c | 47 ++ 2 files changed, 110 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/vect/pr102131.c diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 439d595a79f..56261164f28 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -1788,6 +1788,63 @@ dump_affine_iv (FILE *file, affine_iv *iv) } } +/* Generate expr: (HIGH - LOW) / STEP, under UTYPE. */ + +static tree +get_step_count (tree high, tree low, tree step, tree utype, + bool end_inclusive = false) +{ + tree delta = fold_build2 (MINUS_EXPR, TREE_TYPE (low), high, low); + delta = fold_convert (utype, delta); + if (end_inclusive) +delta = fold_build2 (PLUS_EXPR, utype, delta, build_one_cst (utype)); + + if (tree_int_cst_sign_bit (step)) +step = fold_build1 (NEGATE_EXPR, TREE_TYPE (step), step); + step = fold_convert (utype, step); + + return fold_build2 (FLOOR_DIV_EXPR, utype, delta, step); +} + +/* Get the additional assumption if both two steps are not zero. +Assumptions satisfy that there is no overflow or wrap during +v0 and v1 chasing. */ + +static tree +iv_chase_assumption (affine_iv *iv0, affine_iv *iv1, tree step, +enum tree_code code) +{ + /* Here, no need additional assumptions for NE. */ + if (code == NE_EXPR) +return boolean_true_node; + + /* No need addition assumption for pointer. */ + tree type = TREE_TYPE (iv0->base); + if (POINTER_TYPE_P (type)) +return boolean_true_node; + + bool positive0 = !tree_int_cst_sign_bit (iv0->step); + bool positive1 = !tree_int_cst_sign_bit (iv1->step); + tree utype = unsigned_type_for (type); + bool add1 = code == LE_EXPR; + tree niter = get_step_count (iv1->base, iv0->base, step, utype, add1); + + int prec = TYPE_PRECISION (type); + signop sgn = TYPE_SIGN (type); + tree max = wide_int_to_tree (type, wi::max_value (prec, sgn)); + tree min = wide_int_to_tree (type, wi::min_value (prec, sgn)); + tree valid_niter0, valid_niter1; + + valid_niter0 = positive0 ? get_step_count (max, iv0->base, iv0->step, utype) + : get_step_count (iv0->base, min, iv0->step, utype); + valid_niter1 = positive1 ? get_step_count (max, iv1->base, iv1->step, utype) + : get_step_count (iv1->base, min, iv1->step, utype); + + tree e0 = fold_build2 (LT_EXPR, boolean_type_node, niter, valid_niter0); + tree e1 = fold_build2 (LT_EXPR, boolean_type_node, niter, valid_niter1); + return fold_build2 (TRUTH_AND_EXPR, boolean_type_node, e0, e1); +} + /* Determine the number of iterations according to condition (for staying inside loop) which compares two induction variables using comparison operator CODE. The induction variable on left side of the comparison @@ -1879,11 +1936,10 @@ number_of_iterations_cond (class loop *loop, {iv0.base, iv0.step - iv1.step} cmp_code {iv1.base, 0} provided that either below condition is satisfied: + a. iv0.step and iv1.step are integer. + b. Additional condition: before iv0 chase up v1, iv0 and iv1 should not + step over min or max of the type. */ - a) the test is NE_EXPR; - b) iv0.step - iv1.step is integer and iv0/iv1 don't overflow. - - This rarely occurs in practice, but it is simple enough to manage. */ if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step)) { tree step_type = POINTER_TYPE_P (type) ? sizetype : type; @@ -1894,15 +1950,12 @@ number_of_iterations_cond (class loop *loop, if (tree_int_cst_sign_bit (step)) return false; - if (code != NE_EXPR - && (TREE_CODE (step) != INTEGER_CST - || !iv0->no_overflow || !iv1->no_overflow)) + niter->assumptions = iv_chase_assumption (iv0, iv1, step, code); + if (integer_zerop (niter->assumptions)) return false; iv0->step = step; - if (!POINTER_TYPE_P (type)) - iv0->no_overflow = false; - + iv0->no_overflow = true; iv1->step = build_int_cst (step_type, 0); iv1->no_overflow = true; } diff --git a
RE: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).
Hi Uros, Has fixed that format issue with this new patch. Ok for trunk? Thx, Haochen -Original Message- From: Uros Bizjak Sent: Thursday, January 13, 2022 3:22 AM To: Jiang, Haochen Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao Subject: Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask). On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang wrote: > > Hi all, > > This patch targets PR94790, which change the instruction selection under the > following circumstance. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? Please also test with -m32, e.g.: make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}" OK (with an it below), if new testcases do not FAIL with -m32. Thanks, Uros. > > BRs, > Haochen > > From the perspective of the pipeline, `andn + and + ior` version take > 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor > will take 3 cycles. > > - xorl%edi, %esi > andl%edx, %esi > - movl%esi, %eax > - xorl%edi, %eax > + andn%edi, %edx, %eax > + orl %esi, %eax > > gcc/ChangeLog: > > PR taeget/94790 > * config/i386/i386.md (*xor2andn): New define_insn_and_split. > > gcc/testsuite/ChangeLog: > > PR taeget/94790 > * gcc.target/i386/pr94790-1.c: New test. > * gcc.target/i386/pr94790-2.c: Ditto. > --- > gcc/config/i386/i386.md | 39 +++ > gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 > gcc/testsuite/gcc.target/i386/pr94790-2.c | 9 ++ > 3 files changed, 62 insertions(+) > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c > create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index > 9b424a3935b..38efc6d5837 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -10452,6 +10452,45 @@ > (set_attr "znver1_decode" "double") > (set_attr "mode" "DI")]) > > +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b > +& mask) (define_insn_and_split "*xor2andn" > + [(set (match_operand:SWI248 0 "nonimmediate_operand") > + (xor:SWI248 > + (and:SWI248 > + (xor:SWI248 > + (match_operand:SWI248 1 "nonimmediate_operand") > + (match_operand:SWI248 2 "nonimmediate_operand")) > + (match_operand:SWI248 3 "nonimmediate_operand")) > + (match_dup 1))) > +(clobber (reg:CC FLAGS_REG))] > + "(TARGET_BMI || TARGET_AVX512BW) > + && ix86_pre_reload_split ()" > + "#" > + "&& 1" > + [(parallel [(set (match_dup 4) > + (and:SWI248 > + (not:SWI248 > + (match_dup 3)) > + (match_dup 1))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 5) > + (and:SWI248 > + (match_dup 2) > + (match_dup 3))) > + (clobber (reg:CC FLAGS_REG))]) > + (parallel [(set (match_dup 0) > + (ior:SWI248 > + (match_dup 4) > + (match_dup 5))) > + (clobber (reg:CC FLAGS_REG))])] > + { > +operands[1] = force_reg (mode, operands[1]); > +operands[3] = force_reg (mode, operands[3]); > +operands[4] = gen_reg_rtx (mode); > +operands[5] = gen_reg_rtx (mode); > + } > +) Please put brace just after the curved brace, see numerous examples in .md files. > + > ;; See comment for addsi_1_zext why we do use nonimmediate_operand > (define_insn "*si_1_zext" >[(set (match_operand:DI 0 "register_operand" "=r") diff --git > a/gcc/testsuite/gcc.target/i386/pr94790-1.c > b/gcc/testsuite/gcc.target/i386/pr94790-1.c > new file mode 100755 > index 000..6ebbec15cfd > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */ > +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) { > + return a ^ ((a ^ b) & mask); > +} > + > +unsigned r2(unsigned a, unsigned b, unsigned mask) { > + return (~mask & a) | (b & mask); > +} > diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c > b/gcc/testsuite/gcc.target/i386/pr94790-2.c > new file mode 100755 > index 000..d7b0eec5bef > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mbmi" } */ > +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */ > +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */ > + > +unsigned r1(unsigned a, unsigned b, unsigned mask) { > + return a ^ ((a ^ b) & mask) + (a ^ b); } > -- > 2.18.1 > 0001-i386-Optimize-a-a-b-mask-to-mask-a-b-mask.patch Description: 0001-i386-Optimize-a-a-b-mask-to-mask-a-b-mask.patch
Re: [PATCH, rs6000] Enable absolute jump table by default
On Wed, Jan 12, 2022 at 8:40 PM HAO CHEN GUI wrote: > > Hi David, > > On 12/1/2022 下午 10:44, David Edelsohn wrote: > > On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI wrote: > >> > >> Hi, > >>This patch enables absolute jump table by default on rs6000. The > >> relative jump tables are used when > >>it's explicit set by "rs6000_relative_jumptables", > >>or jump tables are placed in text section but global relocation is > >> required. > >> > >>Bootstrapped and tested on powerpc64-linux BE and LE with no > >> regressions. Is this okay for trunk? > >> Any recommendations? Thanks a lot. > >> > >> ChangeLog > >> 2022-01-12 Haochen Gui > >> > >> gcc/ > >> * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define. > >> * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return > >> true when relative jump table is explicit required or jump tables > >> have > >> to be put in text section but global relocation is also required. > >> * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable. > >> > >> patch.diff > >> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h > >> index d617f346f81..2e257c60f8c 100644 > >> --- a/gcc/config/rs6000/linux64.h > >> +++ b/gcc/config/rs6000/linux64.h > >> @@ -239,7 +239,7 @@ extern int dot_symbols; > >> > >> /* Indicate that jump tables go in the text section. */ > >> #undef JUMP_TABLES_IN_TEXT_SECTION > >> -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT > >> +#define JUMP_TABLES_IN_TEXT_SECTION 0 > >> > >> /* The linux ppc64 ABI isn't explicit on whether aggregates smaller > >> than a doubleword should be padded upward or downward. You could > >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > >> index 319182e94d9..9fba893a27a 100644 > >> --- a/gcc/config/rs6000/rs6000.c > >> +++ b/gcc/config/rs6000/rs6000.c > >> @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value) > >> static bool > >> rs6000_gen_pic_addr_diff_vec (void) > >> { > >> - return rs6000_relative_jumptables; > >> + return rs6000_relative_jumptables > >> +|| (JUMP_TABLES_IN_TEXT_SECTION > >> +&& targetm.asm_out.reloc_rw_mask () == 3); > >> } > > > > This seems like contorted logic and overriding the > > rs6000_relative_jumptables option change. The later part of the patch > > overrides rs6000_relative_jumptables for all rs6000 configurations, > > and then changes this one use of rs6000_relative_jumptables to add > > more logic to revert to the old meaning for some targets. > > > > What about all of the other uses of rs6000_relative_jumptables in the > > target? What about rs6000.md? > > > > I highly doubt that this patch is correct. > > > > Why not override rs6000_relative_jumptables for PPC64 Linux instead of > > changing its value globally and then trying to fix it up? > > > > Thanks, David > Thanks for your comments. > > In this patch, I tried to enable absolute jump table on all rs6000 targets. > For PPC64 Linux, it supports RELRO section (e.g. .data.rel.ro.local) as > "JUMP_TABLES_IN_TEXT_SECTION" is set to 0. So, absolute jump tables could be > placed > in RELRO section whatever global relocation is required or not. The absolute > jump > table can't be placed in text section when global relocation is required as > text > section is not writable. So for other rs6000 targets, absolute jump table > can't be > used if the target doesn't support RELRO and global relocation is required > also. > > Looking forward to your advice. Thanks again. Why not override rs6000_relative_jumptables in rs6000_option_override_internal() for PPC64 Linux subtarget instead of changing the default value and then trying to fix the behavior for other configurations in rs6000_gen_pic_addr_diff_vec()? Or override it in SUBSUBTARGET_OVERRIDE_OPTIONS in linux64.h, or whichever header file(s) is appropriate for the subtarget variants. Your initial patch also changed rs6000_gen_pic_addr_diff_vec but didn't address the use of rs6000_relative_jumptables in the definition of CASE_VECTOR_MODE in rs6000.h. That cannot have been correct. At least without the change to the default value of rs6000_relative_jumptables you don't need to add kludges to all of the places where that variable is used for other subtarget variants of the rs6000 target. Thanks, David
[PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]
Hi, This patch is to fix register constraint v with rs6000_constraints[RS6000_CONSTRAINT_v] instead of ALTIVEC_REGS, just like some other existing register constraints with RS6000_CONSTRAINT_*. I happened to see this and hope it's not intentional and just got neglected. Bootstrapped and regtested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8. Is it ok for trunk? BR, Kewen - gcc/ChangeLog: * config/rs6000/constraints.md (register constraint v): Use rs6000_constraints[RS6000_CONSTRAINT_v] instead of ALTIVEC_REGS. --- gcc/config/rs6000/constraints.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index a4b05837fa6..c01dcbbc3a3 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -37,7 +37,7 @@ (define_register_constraint "d" "rs6000_constraints[RS6000_CONSTRAINT_d]" historically @code{f} was for single-precision and @code{d} was for double-precision floating point.") -(define_register_constraint "v" "ALTIVEC_REGS" +(define_register_constraint "v" "rs6000_constraints[RS6000_CONSTRAINT_v]" "An Altivec vector register (VR), @code{v0}@dots{}@code{v31}.") (define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]" -- 2.27.0
[PATCH] rs6000: Use known constant for GET_MODE_NUNITS and similar
Hi, This patch is to clean up some codes with GET_MODE_UNIT_SIZE or GET_MODE_NUNITS, which can use known constant instead. Bootstrapped and regtested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8. Is it ok for trunk? BR, Kewen - gcc/ChangeLog: * config/rs6000/altivec.md (altivec_vreveti2): Use known constant values to simplify code. * config/rs6000/vsx.md (*vsx_extract_si, *vsx_extract_si_float_df, *vsx_extract_si_float_, *vsx_insert_extract_v4sf_p9): Likewise. --- gcc/config/rs6000/altivec.md | 11 +++ gcc/config/rs6000/vsx.md | 8 2 files changed, 7 insertions(+), 12 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index c2312cc1e0f..d5c4ecfa9b7 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -3957,17 +3957,12 @@ (define_expand "altivec_vreveti2" UNSPEC_VREVEV))] "TARGET_ALTIVEC" { - int i, j, size, num_elements; + int i; rtvec v = rtvec_alloc (16); rtx mask = gen_reg_rtx (V16QImode); - size = GET_MODE_UNIT_SIZE (TImode); - num_elements = GET_MODE_NUNITS (TImode); - - for (j = 0; j < num_elements; j++) -for (i = 0; i < size; i++) - RTVEC_ELT (v, i + j * size) - = GEN_INT (i + (num_elements - 1 - j) * size); + for (i = 0; i < 16; i++) +RTVEC_ELT (v, i) = GEN_INT (i); emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v))); emit_insn (gen_altivec_vperm_ti (operands[0], operands[1], diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 802db0d112b..892b99c6d6b 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -3855,7 +3855,7 @@ (define_insn_and_split "*vsx_extract_si" int value; if (!BYTES_BIG_ENDIAN) -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element)); +element = GEN_INT (3 - INTVAL (element)); /* If the value is in the correct position, we can avoid doing the VSPLT instruction. */ @@ -4231,7 +4231,7 @@ (define_insn_and_split "*vsx_extract_si_float_df" int value; if (!BYTES_BIG_ENDIAN) -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element)); +element = GEN_INT (3 - INTVAL (element)); /* If the value is in the correct position, we can avoid doing the VSPLT instruction. */ @@ -4274,7 +4274,7 @@ (define_insn_and_split "*vsx_extract_si_float_" int value; if (!BYTES_BIG_ENDIAN) -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element)); +element = GEN_INT (3 - INTVAL (element)); /* If the value is in the correct position, we can avoid doing the VSPLT instruction. */ @@ -4467,7 +4467,7 @@ (define_insn "*vsx_insert_extract_v4sf_p9" int ele = INTVAL (operands[4]); if (!BYTES_BIG_ENDIAN) -ele = GET_MODE_NUNITS (V4SFmode) - 1 - ele; +ele = 3 - ele; operands[4] = GEN_INT (GET_MODE_SIZE (SFmode) * ele); return "xxinsertw %x0,%x2,%4"; -- 2.27.0
PING^1 [PATCH] rs6000: Fix an assertion in update_target_cost_per_stmt [PR103702]
Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587309.html on 2021/12/23 上午10:06, Kewen.Lin via Gcc-patches wrote: > Hi, > > This patch is to fix one wrong assertion which is too aggressive. > Vectorizer can do vec_construct costing for the vector type which > only has one unit. For the failed case, the passed-in vector type > is "vector(1) int", though it doesn't end up with any construction > eventually. We have to handle this kind of input in function > rs6000_cost_data::update_target_cost_per_stmt. > > Bootstrapped and regtested on powerpc64le-linux-gnu P9 and > powerpc64-linux-gnu P8. > > Is it ok for trunk? > > BR, > Kewen > - > gcc/ChangeLog: > > PR target/103702 > * config/rs6000/rs6000.c > (rs6000_cost_data::update_target_cost_per_stmt): Fix one wrong > assertion with early return. > > gcc/testsuite/ChangeLog: > > PR target/103702 > * gcc.target/powerpc/pr103702.c: New test. > --- > gcc/config/rs6000/rs6000.c | 7 -- > gcc/testsuite/gcc.target/powerpc/pr103702.c | 24 + > 2 files changed, 29 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103702.c > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 0b09713b2f5..37f07fe5358 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -5461,8 +5461,11 @@ rs6000_cost_data::update_target_cost_per_stmt > (vect_cost_for_stmt kind, > { > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > unsigned int nunits = vect_nunits_for_cost (vectype); > - /* We don't expect strided/elementwise loads for just 1 nunit. */ > - gcc_assert (nunits > 1); > + /* As PR103702 shows, it's possible that vectorizer wants to do > + costings for only one unit here, it's no need to do any > + penalization for it, so simply early return here. */ > + if (nunits == 1) > + return; > /* i386 port adopts nunits * stmt_cost as the penalized cost >for this kind of penalization, we used to follow it but >found it could result in an unreliable body cost especially > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103702.c > b/gcc/testsuite/gcc.target/powerpc/pr103702.c > new file mode 100644 > index 000..585946fd64b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103702.c > @@ -0,0 +1,24 @@ > +/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively. > */ > +/* { dg-require-effective-target powerpc_altivec_ok } */ > +/* { dg-options "-mdejagnu-cpu=power6 -O2 -ftree-loop-vectorize > -fno-tree-scev-cprop" } */ > + > +/* Verify there is no ICE. */ > + > +unsigned short a, e; > +int *b, *d; > +int c; > +extern int fn2 (); > +void > +fn1 () > +{ > + void *f; > + for (;;) > +{ > + fn2 (); > + b = f; > + e = 0; > + for (; e < a; ++e) > + b[e] = d[e * c]; > +} > +} > + > -- > 2.27.0 >
PING^1 [PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]
Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html on 2021/12/23 上午10:09, Kewen.Lin via Gcc-patches wrote: > Hi, > > As PR103627 shows, there is an unexpected case where !TARGET_VSX > and TARGET_MMA co-exist. As ISA3.1 claims, SIMD is a requirement > for MMA. By looking into the ICE, I noticed that the current > MMA implementation depends on vector pairs load/store, but since > we don't have a separated option to control Power10 vector, this > patch is to check for Power9 vector instead. > > Bootstrapped and regtested on powerpc64le-linux-gnu P9 and > powerpc64-linux-gnu P8. > > Is it ok for trunk? > > BR, > Kewen > - > gcc/ChangeLog: > > PR target/103627 > * config/rs6000/rs6000.c (rs6000_option_override_internal): Disable > MMA if !TARGET_P9_VECTOR. > > gcc/testsuite/ChangeLog: > > PR target/103627 > * gcc.target/powerpc/pr103627-1.c: New test. > * gcc.target/powerpc/pr103627-2.c: New test. > --- > gcc/config/rs6000/rs6000.c| 11 +++ > gcc/testsuite/gcc.target/powerpc/pr103627-1.c | 16 > gcc/testsuite/gcc.target/powerpc/pr103627-2.c | 16 > 3 files changed, 43 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-1.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-2.c > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index c020947abc8..ec3b46682a7 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -4505,6 +4505,17 @@ rs6000_option_override_internal (bool global_init_p) >rs6000_isa_flags &= ~OPTION_MASK_MMA; > } > > + /* MMA requires SIMD support as ISA 3.1 claims and our implementation > + such as "*movoo" uses vector pair access which are only supported > + from ISA 3.1. But since we don't have one separated option to > + control Power10 vector, check for Power9 vector instead. */ > + if (TARGET_MMA && !TARGET_P9_VECTOR) > +{ > + if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0) > + error ("%qs requires %qs", "-mmma", "-mpower9-vector"); > + rs6000_isa_flags &= ~OPTION_MASK_MMA; > +} > + >if (!TARGET_PCREL && TARGET_PCREL_OPT) > rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT; > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-1.c > b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c > new file mode 100644 > index 000..6c6c16188fb > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c > @@ -0,0 +1,16 @@ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */ > + > +/* Verify compiler emits error message instead of ICE. */ > + > +extern float *dest; > +extern __vector_quad src; > + > +int > +foo () > +{ > + __builtin_mma_disassemble_acc (dest, &src); > + /* { dg-error "'__builtin_mma_disassemble_acc' requires the '-mmma' > option" "" { target *-*-* } .-1 } */ > + return 0; > +} > + > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-2.c > b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c > new file mode 100644 > index 000..6604872c0e8 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c > @@ -0,0 +1,16 @@ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -mmma -mno-power9-vector" } */ > + > +/* Verify the emitted error message. */ > + > +extern float *dest; > +extern __vector_quad src; > + > +int > +foo () > +{ > + __builtin_mma_disassemble_acc (dest, &src); > + /* { dg-error "'-mmma' requires '-mpower9-vector'" "mma" { target *-*-* } > 0 } */ > + return 0; > +} > + > -- > 2.27.0 >