Re: [PATCH] ctf: Do not warn for CTF not supported for GNU GIMPLE
On Wed, Sep 29, 2021 at 5:55 PM Indu Bhagat wrote:
>
> On 9/29/21 12:14 AM, Richard Biener wrote:
> > On Tue, Sep 28, 2021 at 8:52 PM Indu Bhagat via Gcc-patches
> > wrote:
> >>
> >> CTF is supported for C only. Currently, a warning is emitted if the -gctf
> >> command line option is specified for a non-C frontend. This warning is also
> >> used by the GCC testsuite framework - it skips adding -gctf to the list of
> >> debug flags for automated testing, if CTF is not supported for the frontend.
> >>
> >> The following warning, however, is not useful in case of LTO:
> >>
> >> "lto1: note: CTF debug info requested, but not supported for ‘GNU GIMPLE’ frontend"
> >>
> >> This patch disables the generation of the above warning for GNU GIMPLE.
> >>
> >> Bootstrapped and regression tested on x86_64.
> >>
> >> gcc/ChangeLog:
> >>
> >> * toplev.c (process_options): Do not warn for GNU GIMPLE.
> >> ---
> >> gcc/toplev.c | 12 +++-
> >> 1 file changed, 7 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/gcc/toplev.c b/gcc/toplev.c
> >> index e1688aa..511a343 100644
> >> --- a/gcc/toplev.c
> >> +++ b/gcc/toplev.c
> >> @@ -1416,14 +1416,16 @@ process_options (void)
> >> debug_info_level = DINFO_LEVEL_NONE;
> >> }
> >>
> >> - /* CTF is supported for only C at this time.
> >> - Compiling with -flto results in frontend language of GNU GIMPLE. */
> >> + /* CTF is supported for only C at this time. */
> >> if (!lang_GNU_C ()
> >> && ctf_debug_info_level > CTFINFO_LEVEL_NONE)
> >> {
> >> - inform (UNKNOWN_LOCATION,
> >> - "CTF debug info requested, but not supported for %qs frontend",
> >> - language_string);
> >> + /* Compiling with -flto results in frontend language of GNU GIMPLE. It
> >> + is not useful to warn in that case. */
> >> + if (!startswith (lang_hooks.name, "GNU GIMPLE"))
> >
> > please use in_lto_p instead
> >
> > OK with that change.
> >
>
> in_lto_p is set later in lto_init () (when it's time for do_compile ()).
>
> in_lto_p's updated value is not available at this point in
> process_options ().

I see - a bit ugly IMHO but I guess the patch is OK then.

Thanks,
Richard.

> >> + inform (UNKNOWN_LOCATION,
> >> + "CTF debug info requested, but not supported for %qs frontend",
> >> + language_string);
> >> ctf_debug_info_level = CTFINFO_LEVEL_NONE;
> >> }
> >>
> >> --
> >> 1.8.3.1
> >>
>
Re: [PATCH] Loop unswitching: support gswitch statements.
On Wed, Sep 29, 2021 at 5:28 PM Jeff Law wrote:
>
>
>
> On 9/29/2021 9:20 AM, Andrew MacLeod via Gcc-patches wrote:
> > On 9/29/21 4:43 AM, Richard Biener wrote:
> >> On Tue, Sep 28, 2021 at 10:39 PM Andrew MacLeod
> >> wrote:
> >>> On 9/28/21 7:50 AM, Richard Biener wrote:
> On Wed, Sep 15, 2021 at 10:46 AM Martin Liška wrote:
> > /* Unswitch single LOOP. NUM is number of unswitchings done;
> > we do not allow
> > @@ -269,6 +311,7 @@ tree_unswitch_single_loop (class loop *loop,
> > int num)
> > class loop *nloop;
> > unsigned i, found;
> > tree cond = NULL_TREE;
> > + edge cond_edge = NULL;
> > gimple *stmt;
> > bool changed = false;
> > HOST_WIDE_INT iterations;
> > @@ -311,11 +354,12 @@ tree_unswitch_single_loop (class loop *loop,
> > int num)
> > bbs = get_loop_body (loop);
> > found = loop->num_nodes;
> >
> > + gimple_ranger ranger;
ISTR constructing/destructing ranger has a non-negligible overhead -
is it possible
to keep it live for a longer time (note we're heavily modifying the
CFG)?
> >>>
> >>> There is some overhead.. right now we determine all the imports and
> >>> exports for each block ahead of time, but that's about it. We can make
> >>> adjustments for true on demand clients like this so that even that
> >>> doesn't happen. we only do that so we know ahead of time which ssa-names
> >>> are never used in outgoing edges, and never even have to check those.
> >>> That's mostly an optimization for heavy users like EVRP. If you want, I
> >>> can make that an option so there's virtually no overhead
> >>>
> >>> More importantly, the longer it remains alive, the more "reuse" of
> >>> ranges you will get.. If there is not a pattern of using variables
> >>> from earlier in the program it wouldn't really matter much.
> >>>
> >>> In theory, modifying the IL should be fine, it happens already in
> >>> places, but it's not extensively tested under those conditions yet.
> >> Note it's modifying the CFG as well.
> > bah, that's what I meant. as long as the IL is changed and CFG
> > updated to match, it should in theory work. And as long as existing
> > SSA_NAMEs don't have their meaning changed.. ie reusing an SSA_NAME to
> > have a different definition is likely to cause problems without
> > telling ranger that an SSA_NAME is now different.
> There is an API somewhere which resets the global information that we
> call for these scenarios. But that assumes we know when a name is
> going to be reused or moved to a point where the global information
> isn't necessarily accurate anymore. It's not heavily used as these cases
> are relatively rare.
>
> The nastier case is when we release an SSA_NAME because it was dead
> code, then later re-cycle it. If we're going to have Ranger instances
> live across passes, then that becomes a much bigger concern.

For the case at hand there shouldn't be any transforms invalidating
ranges on existing SSA names - we possibly place additional conditions
and thus refine existing ranges though, and we rely on that info to be
used. Since we repeatedly transform a single loop, re-setting ranger
like Martin did might be necessary for those to be picked up.

Richard.

> Jeff
>
Re: [PATCH] Plug possible snprintf overflow in lto-wrapper.
On Thu, Sep 30, 2021 at 8:17 AM Aldy Hernandez via Gcc-patches wrote:
>
> My upcoming improvements to the DOM threader triggered a warning in
> this code. It looks like the format string is ".ltrans%u.ltrans", but
> we're only writing a max of ".ltrans" + whatever the MAX_INT is here.
>
> Tested on x86-64 Linux.
>
> OK?

OK. Note that %u is max 127 by default (--param lto-partitions).

Richard.

> gcc/ChangeLog:
>
> * lto-wrapper.c (run_gcc): Plug snprintf overflow.
> ---
> gcc/lto-wrapper.c | 10 +++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> index 903c258a03a..7b9e4883f38 100644
> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c
> @@ -1983,7 +1983,9 @@ cont:
> output_name = XOBFINISH (&env_obstack, char *);
>
> /* Adjust the dumpbase if the linker output file was seen. */
> - int dumpbase_len = (strlen (dumppfx) + sizeof (DUMPBASE_SUFFIX));
> + int dumpbase_len = (strlen (dumppfx)
> + + sizeof (DUMPBASE_SUFFIX)
> + + sizeof (".ltrans"));
> char *dumpbase = (char *) xmalloc (dumpbase_len + 1);
> snprintf (dumpbase, dumpbase_len, "%sltrans%u.ltrans", dumppfx, i);
> argv_ptr[0] = dumpbase;
> @@ -2009,9 +2011,11 @@ cont:
> }
> else
> {
> - char argsuffix[sizeof (DUMPBASE_SUFFIX) + 1];
> + char argsuffix[sizeof (DUMPBASE_SUFFIX)
> + + sizeof (".ltrans_args") + 1];
> if (save_temps)
> - snprintf (argsuffix, sizeof (DUMPBASE_SUFFIX),
> + snprintf (argsuffix,
> + sizeof (DUMPBASE_SUFFIX) + sizeof (".ltrans_args"),
> "ltrans%u.ltrans_args", i);
> fork_execute (new_argv[0], CONST_CAST (char **, new_argv),
> true, save_temps ? argsuffix : NULL);
> --
> 2.31.1
>
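The sizing issue discussed in this thread can be sketched in isolation. The following is a minimal illustration and not the lto-wrapper code: the helper name make_ltrans_name and the macro UINT_DECIMAL_DIGITS are assumptions introduced here. It sizes the buffer from the literal text of the format string plus a worst-case width for %u, and then checks the value returned by snprintf, which is the length that would have been written had the buffer been large enough.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Conservative upper bound on the decimal digits of an unsigned int:
   every 3 bits contribute at most one digit.  (Assumed helper macro.)  */
#define UINT_DECIMAL_DIGITS (sizeof (unsigned) * CHAR_BIT / 3 + 1)

/* Hypothetical helper: build "<prefix>ltrans<i>.ltrans" in a freshly
   allocated buffer.  Returns NULL on allocation or formatting failure.  */
static char *
make_ltrans_name (const char *prefix, unsigned i)
{
  /* Literal characters of the format, without the conversions.  */
  static const char fixed_text[] = "ltrans.ltrans";
  size_t size = strlen (prefix) + (sizeof fixed_text - 1)
                + UINT_DECIMAL_DIGITS + 1 /* terminating NUL */;
  char *buf = malloc (size);
  if (buf == NULL)
    return NULL;

  /* snprintf returns the length it wanted to write; a value >= size
     would mean the output was truncated.  */
  int n = snprintf (buf, size, "%sltrans%u.ltrans", prefix, i);
  if (n < 0 || (size_t) n >= size)
    {
      free (buf);
      return NULL;
    }
  return buf;
}
```

With the default --param lto-partitions mentioned in the reply the index stays small, so the worst-case digit bound is deliberately generous; the return-value check is what actually guards against truncation.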
[committed] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc
Hi!

This patch adds new OpenMP 5.1 allocator entrypoints and in addition to
that fixes an omp_alloc bug which is hard to test for - if the first
allocator fails but has a larger alignment trait and has a fallback
allocator, either the default behavior or a user fallback, then the extra
alignment will be used even in the fallback allocation, rather than just
starting with whatever alignment has been requested (in GOMP_alloc or the
minimum one in omp_alloc).

Jonathan's comment on IRC this morning made me realize that I should add
alloc_align attributes to 2 of the prototypes and I still need to add
testsuite coverage for omp_realloc, will do that in a follow-up.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-09-30  Jakub Jelinek

	* omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc,
	omp_realloc): New prototypes.
	(omp_alloc): Move after omp_free prototype, add
	__malloc__ (omp_free) attribute.
	* allocator.c: Include string.h.
	(omp_aligned_alloc): No longer static, add ialias.  Add new_alignment
	variable and use it instead of alignment so that when retrying the
	old alignment is used again.  Don't retry if new alignment is the
	same as old alignment, unless allocator had pool size.
	(omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call.
	(omp_aligned_calloc, omp_calloc, omp_realloc): New functions.
	* libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc,
	omp_aligned_calloc and omp_realloc.
	* testsuite/libgomp.c-c++-common/alloc-4.c (main): Add
	omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests.
	* testsuite/libgomp.c-c++-common/alloc-5.c: New test.
	* testsuite/libgomp.c-c++-common/alloc-6.c: New test.
	* testsuite/libgomp.c-c++-common/alloc-7.c: New test.
	* testsuite/libgomp.c-c++-common/alloc-8.c: New test.

--- libgomp/omp.h.in.jj 2021-08-12 22:36:53.281885443 +0200 +++ libgomp/omp.h.in2021-09-29 19:50:48.642110522 +0200 @@ -295,12 +295,31 @@ extern omp_allocator_handle_t omp_init_a extern void omp_destroy_allocator (omp_allocator_handle_t) __GOMP_NOTHROW; extern void omp_set_default_allocator (omp_allocator_handle_t) __GOMP_NOTHROW; extern omp_allocator_handle_t omp_get_default_allocator (void) __GOMP_NOTHROW; -extern void *omp_alloc (__SIZE_TYPE__, - omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR) - __GOMP_NOTHROW __attribute__((__malloc__, __alloc_size__ (1))); extern void omp_free (void *, omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR) __GOMP_NOTHROW; +extern void *omp_alloc (__SIZE_TYPE__, + omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR) + __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free), + __alloc_size__ (1))); +extern void *omp_aligned_alloc (__SIZE_TYPE__, __SIZE_TYPE__, + omp_allocator_handle_t + __GOMP_DEFAULT_NULL_ALLOCATOR) + __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free), + __alloc_size__ (2))); +extern void *omp_calloc (__SIZE_TYPE__, __SIZE_TYPE__, +omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR) + __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free), + __alloc_size__ (1, 2))); +extern void *omp_aligned_calloc (__SIZE_TYPE__, __SIZE_TYPE__, __SIZE_TYPE__, +omp_allocator_handle_t +__GOMP_DEFAULT_NULL_ALLOCATOR) + __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free), + __alloc_size__ (2, 3))); +extern void *omp_realloc (void *, __SIZE_TYPE__, + omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR, + omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR) + __GOMP_NOTHROW __attribute__((__malloc__ (omp_free), __alloc_size__ (2))); extern void omp_display_env (int) __GOMP_NOTHROW; --- libgomp/allocator.c.jj 2021-08-12 18:14:29.731846863 +0200 +++ libgomp/allocator.c 2021-09-29 15:28:08.121095372 +0200 @@ -30,6 +30,7 @@ #define _GNU_SOURCE #include "libgomp.h" #include +#include 
#define omp_max_predefined_alloc omp_thread_mem_alloc @@ -205,18 +206,19 @@ omp_destroy_allocator (omp_allocator_han ialias (omp_init_allocator) ialias (omp_destroy_allocator) -static void * +void * omp_aligned_alloc (size_t alignment, size_t size, omp_allocator_handle_t allocator) { struct omp_allocator_data *allocator_data; - size_t new_size; + size_t new_size, new_alignment; void *ptr, *ret; if (__builtin_expect (size == 0, 0)) return NULL; retry: + new_alignment = alignment; if (allocator == omp_null_allocator) { struct gomp_thread *thr = gomp_thread (); @@ -228,19 +23
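
The alignment bug described in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_allocator and alignment_used are names invented here and do not correspond to libgomp's data structures or code. The point it shows is the one the ChangeLog makes about new_alignment: the effective alignment is recomputed from the caller's request on every retry, so a failed allocator's larger alignment trait does not carry over into the fallback allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an allocator with an alignment trait and a fallback.
   These names are illustrative and are not libgomp's data structures.  */
struct toy_allocator
{
  size_t alignment;                     /* alignment trait */
  int always_fails;                     /* simulate an exhausted pool */
  const struct toy_allocator *fallback; /* NULL: no further fallback */
};

/* Return the alignment the successful allocation would use, or 0 if every
   allocator in the chain fails.  */
static size_t
alignment_used (size_t requested, const struct toy_allocator *a)
{
  while (a != NULL)
    {
      /* Start again from the caller's request on every attempt, so a failed
         allocator's larger trait does not leak into the fallback.  */
      size_t new_alignment = requested;
      if (a->alignment > new_alignment)
        new_alignment = a->alignment;
      if (!a->always_fails)
        return new_alignment;
      a = a->fallback;
    }
  return 0;
}
```

In this model a request for 16-byte alignment that fails on an allocator with a 64-byte trait and falls back to one with an 8-byte trait ends up with 16, whereas carrying the raised value across the retry would have produced 64.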
RE: [PATCH 1/7]AArch64 Add combine patterns for right shift and narrow
Hi Tamar,

> -----Original Message-----
> From: Tamar Christina
> Sent: Wednesday, September 29, 2021 5:19 PM
> To: gcc-patches@gcc.gnu.org
> Cc: nd; Richard Earnshaw; Marcus Shawcroft; Kyrylo Tkachov;
> Richard Sandiford
> Subject: [PATCH 1/7]AArch64 Add combine patterns for right shift and
> narrow
>
> Hi All,
>
> This adds a simple pattern for combining right shifts and narrows into
> shifted narrows.
>
> i.e.
>
> typedef short int16_t;
> typedef unsigned short uint16_t;
>
> void foo (uint16_t * restrict a, int16_t * restrict d, int n)
> {
>     for( int i = 0; i < n; i++ )
>       d[i] = (a[i] * a[i]) >> 10;
> }
>
> now generates:
>
> .L4:
>         ldr     q0, [x0, x3]
>         umull   v1.4s, v0.4h, v0.4h
>         umull2  v0.4s, v0.8h, v0.8h
>         shrn    v1.4h, v1.4s, 10
>         shrn2   v1.8h, v0.4s, 10
>         str     q1, [x1, x3]
>         add     x3, x3, 16
>         cmp     x4, x3
>         bne     .L4
>
> instead of:
>
> .L4:
>         ldr     q0, [x0, x3]
>         umull   v1.4s, v0.4h, v0.4h
>         umull2  v0.4s, v0.8h, v0.8h
>         sshr    v1.4s, v1.4s, 10
>         sshr    v0.4s, v0.4s, 10
>         xtn     v1.4h, v1.4s
>         xtn2    v1.8h, v0.4s
>         str     q1, [x1, x3]
>         add     x3, x3, 16
>         cmp     x4, x3
>         bne     .L4
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd.md
> (*aarch64_shrn_vect,
> *aarch64_shrn2_vect): New.
> * config/aarch64/iterators.md (srn_op): New.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/shrn-combine.c: New test.
> > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 48eddf64e05afe3788abfa05141f6544a9323ea1..d7b6cae424622d259f97a3d5 > fa9093c0fb0bd5ce 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -1818,6 +1818,28 @@ (define_insn "aarch64_shrn_insn_be" >[(set_attr "type" "neon_shift_imm_narrow_q")] > ) > > +(define_insn "*aarch64_shrn_vect" > + [(set (match_operand: 0 "register_operand" "=w") > +(truncate: > + (SHIFTRT:VQN (match_operand:VQN 1 "register_operand" "w") > +(match_operand:VQN 2 > "aarch64_simd_shift_imm_vec_"] > + "TARGET_SIMD" > + "shrn\\t%0., %1., %2" > + [(set_attr "type" "neon_shift_imm_narrow_q")] > +) > + > +(define_insn "*aarch64_shrn2_vect" > + [(set (match_operand: 0 "register_operand" "=w") > + (vec_concat: > + (match_operand: 1 "register_operand" "0") > + (truncate: > + (SHIFTRT:VQN (match_operand:VQN 2 "register_operand" "w") > + (match_operand:VQN 3 > "aarch64_simd_shift_imm_vec_")] > + "TARGET_SIMD" > + "shrn2\\t%0., %2., %3" > + [(set_attr "type" "neon_shift_imm_narrow_q")] > +) I think this needs to be guarded on !BYTES_BIG_ENDIAN and a similar pattern added for BYTES_BIG_ENDIAN with the vec_concat operands swapped around. This is similar to the aarch64_xtn2_insn_be pattern, for example. Thanks, Kyrill > + > (define_expand "aarch64_shrn" >[(set (match_operand: 0 "register_operand") > (truncate: > diff --git a/gcc/config/aarch64/iterators.md > b/gcc/config/aarch64/iterators.md > index > caa42f8f169fbf2cf46a90cf73dee05619acc300..8dbeed3b0d4a44cdc17dd333e > d397b39a33f386a 100644 > --- a/gcc/config/aarch64/iterators.md > +++ b/gcc/config/aarch64/iterators.md > @@ -2003,6 +2003,9 @@ (define_code_attr shift [(ashift "lsl") (ashiftrt "asr") > ;; Op prefix for shift right and accumulate. > (define_code_attr sra_op [(ashiftrt "s") (lshiftrt "u")]) > > +;; op prefix for shift right and narrow. 
> +(define_code_attr srn_op [(ashiftrt "r") (lshiftrt "")]) > + > ;; Map shift operators onto underlying bit-field instructions > (define_code_attr bfshift [(ashift "ubfiz") (ashiftrt "sbfx") > (lshiftrt "ubfx") (rotatert "extr")]) > diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine.c > b/gcc/testsuite/gcc.target/aarch64/shrn-combine.c > new file mode 100644 > index > ..0187f49f4dcc76182c90366c > aaf00d294e835707 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine.c > @@ -0,0 +1,14 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ > + > +typedef short int16_t; > +typedef unsigned short uint16_t; > + > +void foo (uint16_t * restrict a, int16_t * restrict d, int n) > +{ > +for( int i = 0; i < n; i++ ) > + d[i] = (a[i] * a[i]) >> 10; > +} > + > +/* { dg-final { scan-assembler-times {\tshrn\t} 1 } } */ > +/* { dg-final { scan-assembler-times {\tshrn2\t} 1 } } */ > > > --
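
Why a single narrowing shift can stand in for either an arithmetic or a logical shift followed by a narrow can be checked with a scalar model. The sketch below is an illustration only, assuming the shift amount is limited to the width of the narrow element (16 here); the function names are invented for this example and are not GCC or ACLE names. For such shift amounts the sign-fill bits of an arithmetic shift land entirely in the upper half of the 32-bit lane, which the narrowing step discards.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic shift right on a 32-bit pattern, for 0 <= s <= 31.  */
static uint32_t
asr32 (uint32_t bits, unsigned s)
{
  uint32_t sign_fill = (bits & 0x80000000u) ? ~(UINT32_MAX >> s) : 0;
  return (bits >> s) | sign_fill;
}

/* Scalar model of "sshr then xtn": arithmetic shift, keep the low 16 bits.  */
static uint16_t
model_sshr_xtn (uint32_t lane, unsigned s)
{
  return (uint16_t) asr32 (lane, s);
}

/* Scalar model of "shrn": shift right, keep the low 16 bits.  */
static uint16_t
model_shrn (uint32_t lane, unsigned s)
{
  return (uint16_t) (lane >> s);
}

/* Return 1 if both models agree on LANE for every shift amount 1..16.  */
static int
shifts_agree (uint32_t lane)
{
  for (unsigned s = 1; s <= 16; s++)
    if (model_sshr_xtn (lane, s) != model_shrn (lane, s))
      return 0;
  return 1;
}
```

Beyond 16 the two models can differ, because sign-fill bits then reach the low half; this is why the equivalence only holds for shift amounts up to the narrow element width.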
[PATCH gcc-11 0/2] Backport kpatch changes
Hi,

This series contains a backport of kpatch changes needed to support
https://github.com/dynup/kpatch/pull/1203 so that it could be used in
RHEL 9. The patches have been in master for 4 months now without
issues.

Bootstrapped and regtested on s390x-redhat-linux. Ok for gcc-11?

Best regards,
Ilya

Ilya Leoshkevich (2):
  IBM Z: Define NO_PROFILE_COUNTERS
  IBM Z: Use @PLT symbols for local functions in 64-bit mode

 gcc/config/s390/predicates.md                 |   9 +-
 gcc/config/s390/s390.c                        | 115 +++---
 gcc/config/s390/s390.h                        |   2 +
 gcc/config/s390/s390.md                       |  32 ++---
 gcc/testsuite/g++.dg/ext/visibility/noPLT.C   |   2 +-
 gcc/testsuite/g++.target/s390/mi-thunk.C      |  23
 .../gcc.target/s390/call-z10-pic-nodatarel.c  |  20 +++
 gcc/testsuite/gcc.target/s390/call-z10-pic.c  |  20 +++
 gcc/testsuite/gcc.target/s390/call-z10.c      |  20 +++
 .../gcc.target/s390/call-z9-pic-nodatarel.c   |  18 +++
 gcc/testsuite/gcc.target/s390/call-z9-pic.c   |  18 +++
 gcc/testsuite/gcc.target/s390/call-z9.c       |  20 +++
 gcc/testsuite/gcc.target/s390/call.h          |  40 ++
 .../gcc.target/s390/mfentry-m64-pic.c         |   9 ++
 .../gcc.target/s390/mnop-mcount-m31-mzarch.c  |   2 +-
 .../gcc.target/s390/mnop-mcount-m64.c         |   2 +-
 gcc/testsuite/gcc.target/s390/nodatarel-1.c   |  26 +---
 gcc/testsuite/gcc.target/s390/pr80080-4.c     |   2 +-
 gcc/testsuite/gcc.target/s390/risbg-ll-3.c    |   6 +-
 gcc/testsuite/gcc.target/s390/tls-pic.c       |  14 +++
 gcc/testsuite/gcc.target/s390/tls.c           |  10 ++
 gcc/testsuite/gcc.target/s390/tls.h           |  23
 22 files changed, 336 insertions(+), 97 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/mi-thunk.C
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic-nodatarel.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic-nodatarel.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call.h
 create mode 100644 gcc/testsuite/gcc.target/s390/mfentry-m64-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/tls-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/tls.c
 create mode 100644 gcc/testsuite/gcc.target/s390/tls.h

--
2.31.1
[PATCH gcc-11 1/2] IBM Z: Define NO_PROFILE_COUNTERS
s390 glibc does not need counters in the .data section, since it stores edge hits in its own data structure. Therefore counters only waste space and confuse diffing tools (e.g. kpatch), so don't generate them. gcc/ChangeLog: * config/s390/s390.c (s390_function_profiler): Ignore labelno parameter. * config/s390/s390.h (NO_PROFILE_COUNTERS): Define. gcc/testsuite/ChangeLog: * gcc.target/s390/mnop-mcount-m31-mzarch.c: Adapt to the new prologue size. * gcc.target/s390/mnop-mcount-m64.c: Likewise. (cherry picked from commit a1c1b7a888a) --- gcc/config/s390/s390.c| 42 +++ gcc/config/s390/s390.h| 2 + .../gcc.target/s390/mnop-mcount-m31-mzarch.c | 2 +- .../gcc.target/s390/mnop-mcount-m64.c | 2 +- 4 files changed, 20 insertions(+), 28 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index c5d4c439bcc..a863dfce9a2 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -13120,33 +13120,25 @@ output_asm_nops (const char *user, int hw) } } -/* Output assembler code to FILE to increment profiler label # LABELNO - for profiling a function entry. */ +/* Output assembler code to FILE to call a profiler hook. */ void -s390_function_profiler (FILE *file, int labelno) +s390_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) { - rtx op[8]; - - char label[128]; - ASM_GENERATE_INTERNAL_LABEL (label, "LP", labelno); + rtx op[4]; fprintf (file, "# function profiler \n"); op[0] = gen_rtx_REG (Pmode, RETURN_REGNUM); op[1] = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); op[1] = gen_rtx_MEM (Pmode, plus_constant (Pmode, op[1], UNITS_PER_LONG)); - op[7] = GEN_INT (UNITS_PER_LONG); - - op[2] = gen_rtx_REG (Pmode, 1); - op[3] = gen_rtx_SYMBOL_REF (Pmode, label); - SYMBOL_REF_FLAGS (op[3]) = SYMBOL_FLAG_LOCAL; + op[3] = GEN_INT (UNITS_PER_LONG); - op[4] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? "__fentry__" : "_mcount"); + op[2] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? 
"__fentry__" : "_mcount"); if (flag_pic) { - op[4] = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op[4]), UNSPEC_PLT); - op[4] = gen_rtx_CONST (Pmode, op[4]); + op[2] = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op[2]), UNSPEC_PLT); + op[2] = gen_rtx_CONST (Pmode, op[2]); } if (flag_record_mcount) @@ -13160,20 +13152,19 @@ s390_function_profiler (FILE *file, int labelno) warning (OPT_Wcannot_profile, "nested functions cannot be profiled " "with %<-mfentry%> on s390"); else - output_asm_insn ("brasl\t0,%4", op); + output_asm_insn ("brasl\t0,%2", op); } else if (TARGET_64BIT) { if (flag_nop_mcount) - output_asm_nops ("-mnop-mcount", /* stg */ 3 + /* larl */ 3 + -/* brasl */ 3 + /* lg */ 3); + output_asm_nops ("-mnop-mcount", /* stg */ 3 + /* brasl */ 3 + +/* lg */ 3); else { output_asm_insn ("stg\t%0,%1", op); if (flag_dwarf2_cfi_asm) - output_asm_insn (".cfi_rel_offset\t%0,%7", op); - output_asm_insn ("larl\t%2,%3", op); - output_asm_insn ("brasl\t%0,%4", op); + output_asm_insn (".cfi_rel_offset\t%0,%3", op); + output_asm_insn ("brasl\t%0,%2", op); output_asm_insn ("lg\t%0,%1", op); if (flag_dwarf2_cfi_asm) output_asm_insn (".cfi_restore\t%0", op); @@ -13182,15 +13173,14 @@ s390_function_profiler (FILE *file, int labelno) else { if (flag_nop_mcount) - output_asm_nops ("-mnop-mcount", /* st */ 2 + /* larl */ 3 + -/* brasl */ 3 + /* l */ 2); + output_asm_nops ("-mnop-mcount", /* st */ 2 + /* brasl */ 3 + +/* l */ 2); else { output_asm_insn ("st\t%0,%1", op); if (flag_dwarf2_cfi_asm) - output_asm_insn (".cfi_rel_offset\t%0,%7", op); - output_asm_insn ("larl\t%2,%3", op); - output_asm_insn ("brasl\t%0,%4", op); + output_asm_insn (".cfi_rel_offset\t%0,%3", op); + output_asm_insn ("brasl\t%0,%2", op); output_asm_insn ("l\t%0,%1", op); if (flag_dwarf2_cfi_asm) output_asm_insn (".cfi_restore\t%0", op); diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index 3b876160420..fb16a455a03 100644 --- a/gcc/config/s390/s390.h +++ b/gcc/config/s390/s390.h @@ -787,6 +787,8 @@ 
CUMULATIVE_ARGS; #define PROFILE_BEFORE_PROLOGUE 1 +#define NO_PROFILE_COUNTERS 1 + /* Trampolines for nested functions. */ diff --git a/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c b/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c index b2ad9f5bced..874ceb96fe8 100644 --- a/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c +++ b/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c @@ -4,5 +4,5 @@ void profileme
[PATCH gcc-11 2/2] IBM Z: Use @PLT symbols for local functions in 64-bit mode
This helps with generating code for kernel hotpatches, which contain individual functions and are loaded more than 2G away from vmlinux. This should not create performance regressions for the normal use cases, because for local functions ld replaces @PLT calls with direct calls. gcc/ChangeLog: * config/s390/predicates.md (bras_sym_operand): Accept all functions in 64-bit mode, use UNSPEC_PLT31. (larl_operand): Use UNSPEC_PLT31. * config/s390/s390.c (s390_loadrelative_operand_p): Likewise. (legitimize_pic_address): Likewise. (s390_emit_tls_call_insn): Mark __tls_get_offset as function, use UNSPEC_PLT31. (s390_delegitimize_address): Use UNSPEC_PLT31. (s390_output_addr_const_extra): Likewise. (print_operand): Add @PLT to TLS calls, handle %K. (s390_function_profiler): Mark __fentry__/_mcount as function, use %K, use UNSPEC_PLT31. (s390_output_mi_thunk): Use only UNSPEC_GOT, use %K. (s390_emit_call): Use UNSPEC_PLT31. (s390_emit_tpf_eh_return): Mark __tpf_eh_return as function. * config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT. (*movdi_64): Use %K. (reload_base_64): Likewise. (*sibcall_brc): Likewise. (*sibcall_brcl): Likewise. (*sibcall_value_brc): Likewise. (*sibcall_value_brcl): Likewise. (*bras): Likewise. (*brasl): Likewise. (*bras_r): Likewise. (*brasl_r): Likewise. (*bras_tls): Likewise. (*brasl_tls): Likewise. (main_base_64): Likewise. (reload_base_64): Likewise. (@split_stack_call): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/visibility/noPLT.C: Skip on s390x. * g++.target/s390/mi-thunk.C: New test. * gcc.target/s390/nodatarel-1.c: Move foostatic to the new tests. * gcc.target/s390/pr80080-4.c: Allow @PLT suffix. * gcc.target/s390/risbg-ll-3.c: Likewise. * gcc.target/s390/call.h: Common code for the new tests. * gcc.target/s390/call-z10-pic-nodatarel.c: New test. * gcc.target/s390/call-z10-pic.c: New test. * gcc.target/s390/call-z10.c: New test. * gcc.target/s390/call-z9-pic-nodatarel.c: New test. * gcc.target/s390/call-z9-pic.c: New test. 
* gcc.target/s390/call-z9.c: New test. * gcc.target/s390/mfentry-m64-pic.c: New test. * gcc.target/s390/tls.h: Common code for the new TLS tests. * gcc.target/s390/tls-pic.c: New test. * gcc.target/s390/tls.c: New test. (cherry picked from commit 0990d93dd8a) --- gcc/config/s390/predicates.md | 9 ++- gcc/config/s390/s390.c| 81 +-- gcc/config/s390/s390.md | 32 gcc/testsuite/g++.dg/ext/visibility/noPLT.C | 2 +- gcc/testsuite/g++.target/s390/mi-thunk.C | 23 ++ .../gcc.target/s390/call-z10-pic-nodatarel.c | 20 + gcc/testsuite/gcc.target/s390/call-z10-pic.c | 20 + gcc/testsuite/gcc.target/s390/call-z10.c | 20 + .../gcc.target/s390/call-z9-pic-nodatarel.c | 18 + gcc/testsuite/gcc.target/s390/call-z9-pic.c | 18 + gcc/testsuite/gcc.target/s390/call-z9.c | 20 + gcc/testsuite/gcc.target/s390/call.h | 40 + .../gcc.target/s390/mfentry-m64-pic.c | 9 +++ gcc/testsuite/gcc.target/s390/nodatarel-1.c | 26 +- gcc/testsuite/gcc.target/s390/pr80080-4.c | 2 +- gcc/testsuite/gcc.target/s390/risbg-ll-3.c| 6 +- gcc/testsuite/gcc.target/s390/tls-pic.c | 14 gcc/testsuite/gcc.target/s390/tls.c | 10 +++ gcc/testsuite/gcc.target/s390/tls.h | 23 ++ 19 files changed, 320 insertions(+), 73 deletions(-) create mode 100644 gcc/testsuite/g++.target/s390/mi-thunk.C create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic-nodatarel.c create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic.c create mode 100644 gcc/testsuite/gcc.target/s390/call-z10.c create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic-nodatarel.c create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic.c create mode 100644 gcc/testsuite/gcc.target/s390/call-z9.c create mode 100644 gcc/testsuite/gcc.target/s390/call.h create mode 100644 gcc/testsuite/gcc.target/s390/mfentry-m64-pic.c create mode 100644 gcc/testsuite/gcc.target/s390/tls-pic.c create mode 100644 gcc/testsuite/gcc.target/s390/tls.c create mode 100644 gcc/testsuite/gcc.target/s390/tls.h diff --git a/gcc/config/s390/predicates.md 
b/gcc/config/s390/predicates.md index 15093cb4b30..99c343aa32c 100644 --- a/gcc/config/s390/predicates.md +++ b/gcc/config/s390/predicates.md @@ -101,10 +101,13 @@ (define_special_predicate "bras_sym_operand" (ior (and (match_code "symbol_ref") - (match_test "!flag_pic || SYMBOL_REF_LOCAL_P (op)")) + (ior (match_test "!flag_pic") +(match_test
Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling
Hi,

That just forces trying the vector modes we've tried before. Though I
might need to revisit this now I think about it. I'm afraid it might be
possible for this to generate an epilogue with a vf that is not lower
than that of the main loop, but I'd need to think about this again.
Either way I don't think this changes the vector modes used for the
epilogue. But maybe I'm just missing your point here.

Yes, I was referring to the above which suggests that when we vectorize
the main loop with V4SF but unroll then we try vectorizing the epilogue
with V4SF as well (but not unrolled). I think that's premature (not sure
if you try V8SF if the main loop was V4SF but unrolled 4 times).

My main motivation for this was because I had an SVE loop that
vectorized with both VNx8HI, then V8HI which beat VNx8HI on cost, then
it decided to unroll V8HI by two and skipped using VNx8HI as a
predicated epilogue which would've been the best choice. So that is why
I decided to just 'reset' the vector_mode selection. In a scenario where
you only have the traditional vector modes it might make less sense.

Just realized I still didn't add any check to make sure the epilogue has
a lower VF than the previous loop, though I'm still not sure that could
happen. I'll go look at where to add that if you agree with this.

I can move it there, it would indeed remove the need for the change to
vect_update_vf_for_slp, the change to
vect_determine_partial_vectors_and_peeling would still be required I
think. It is meant to disable using partial vectors in an unrolled loop.

Why would we disable the use of partial vectors in an unrolled loop?

The motivation behind that is that the overhead caused by generating
predicates for each iteration will likely be too much for it to be
profitable to unroll. On top of that, when dealing with low iteration
count loops, if executing one predicated iteration would be enough we
now still need to execute all other unrolled predicated iterations,
whereas if we keep them unrolled we skip the unrolled loops.

Sure but I'm suggesting you keep the not unrolled body as one way of
costed vectorization but then if the target says "try unrolling" re-do
the analysis with the same mode but a larger VF. Just like we iterate
over vector modes you'll now iterate over pairs of vector mode + VF
(unroll factor). It's not about re-using the costing it's about using
costing that is actually relevant and also to avoid targets inventing
two distinct separate costings - a target (powerpc) might already
compute load/store density and other stuff for the main costing so it
should have an idea whether doubling or triplicating is OK.

Richard.

Sounds good! I changed the patch to determine the unrolling factor
later, after all analysis has been done and retry analysis if an
unrolling factor larger than 1 has been chosen for this loop and
vector_mode.

gcc/ChangeLog:

	* doc/tm.texi: Document TARGET_VECTORIZE_UNROLL_FACTOR.
	* doc/tm.texi.in: Add entries for TARGET_VECTORIZE_UNROLL_FACTOR.
	* params.opt: Add vect-unroll and vect-unroll-reductions parameters.
	* target.def: Define hook TARGET_VECTORIZE_UNROLL_FACTOR.
	* targhooks.c (default_unroll_factor): New.
	* targhooks.h (default_unroll_factor): Likewise.
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
	par_unrolling_factor.
	(vect_determine_partial_vectors_and_peeling): Account for unrolling.
	(vect_determine_unroll_factor): New.
	(vect_try_unrolling): New.
	(vect_reanalyze_as_main_loop): Call vect_try_unrolling when retrying
	a loop_vinfo as a main loop.
	(vect_analyze_loop): Call vect_try_unrolling when vectorizing main
	loops.
	(vect_analyze_loop): Allow for epilogue vectorization when unrolling
	and rewalk vector_mode array for the epilogues.
(vectorizable_reduction): Disable single_defuse_cycle when unrolling. * tree-vectorizer.h (vect_unroll_value): Declare par_unrolling_factor as a member of loop_vec_info. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index be8148583d8571b0d035b1938db9d056bfd213a8..71ee33a200fcbd37ccd5380321df507ae1e8961f 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6289,6 +6289,12 @@ allocated by TARGET_VECTORIZE_INIT_COST. The default releases the accumulator. @end deftypefn +@deftypefn {Target Hook} unsigned TARGET_VECTORIZE_UNROLL_FACTOR (class vec_info *@var{vinfo}) +This hook should return the desired vector unrolling factor for a loop with +@var{vinfo}. The default returns one, which means no unrolling will be +performed. +@end deftypefn + @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_GATHER (const_tree @var{mem_vectype}, const_tree @var{index_type}, int @var{scale}) Target builtin that implements vector gather operation. @var{mem_vectype} is the vector type of the load and @var{index_type} is scalar type of diff --git a/gcc/doc
RE: [PATCH 2/7]AArch64 Add combine patterns for narrowing shift of half top bits (shuffle)
> -Original Message- > From: Tamar Christina > Sent: Wednesday, September 29, 2021 5:20 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 2/7]AArch64 Add combine patterns for narrowing shift of > half top bits (shuffle) > > Hi All, > > When doing a (narrowing) right shift by half the width of the original type > then > we are essentially shuffling the top bits from the first number down. > > If we have a hi/lo pair we can just use a single shuffle instead of needing > two > shifts. > > i.e. > > typedef short int16_t; > typedef unsigned short uint16_t; > > void foo (uint16_t * restrict a, int16_t * restrict d, int n) > { > for( int i = 0; i < n; i++ ) > d[i] = (a[i] * a[i]) >> 16; > } > > now generates: > > .L4: > ldr q0, [x0, x3] > umull v1.4s, v0.4h, v0.4h > umull2 v0.4s, v0.8h, v0.8h > uzp2v0.8h, v1.8h, v0.8h > str q0, [x1, x3] > add x3, x3, 16 > cmp x4, x3 > bne .L4 > > instead of > > .L4: > ldr q0, [x0, x3] > umull v1.4s, v0.4h, v0.4h > umull2 v0.4s, v0.8h, v0.8h > sshrv1.4s, v1.4s, 16 > sshrv0.4s, v0.4s, 16 > xtn v1.4h, v1.4s > xtn2v1.8h, v0.4s > str q1, [x1, x3] > add x3, x3, 16 > cmp x4, x3 > bne .L4 > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > Ok. Thanks, Kyrill > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_topbits_shuffle, > *aarch64_topbits_shuffle): New. > * config/aarch64/predicates.md > (aarch64_simd_shift_imm_vec_exact_top): New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/shrn-combine-2.c: New test. > * gcc.target/aarch64/shrn-combine-3.c: New test. 
> > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > d7b6cae424622d259f97a3d5fa9093c0fb0bd5ce..300bf001b59ca7fa197c580b > 10adb7f70f20d1e0 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -1840,6 +1840,36 @@ (define_insn > "*aarch64_shrn2_vect" >[(set_attr "type" "neon_shift_imm_narrow_q")] > ) > > +(define_insn "*aarch64_topbits_shuffle" > + [(set (match_operand: 0 "register_operand" "=w") > + (vec_concat: > + (truncate: > +(SHIFTRT:VQN (match_operand:VQN 1 "register_operand" "w") > + (match_operand:VQN 2 > "aarch64_simd_shift_imm_vec_exact_top"))) > + (truncate: > + (SHIFTRT:VQN (match_operand:VQN 3 "register_operand" "w") > + (match_dup 2)] > + "TARGET_SIMD" > + "uzp2\\t%0., %1., %3." > + [(set_attr "type" "neon_permute")] > +) > + > +(define_insn "*aarch64_topbits_shuffle" > + [(set (match_operand: 0 "register_operand" "=w") > + (vec_concat: > + (unspec: [ > + (match_operand:VQN 1 "register_operand" "w") > + (match_operand:VQN 2 > "aarch64_simd_shift_imm_vec_exact_top") > + ] UNSPEC_RSHRN) > + (unspec: [ > + (match_operand:VQN 3 "register_operand" "w") > + (match_dup 2) > + ] UNSPEC_RSHRN)))] > + "TARGET_SIMD" > + "uzp2\\t%0., %1., %3." 
> + [(set_attr "type" "neon_permute")] > +) > + > (define_expand "aarch64_shrn" >[(set (match_operand: 0 "register_operand") > (truncate: > diff --git a/gcc/config/aarch64/predicates.md > b/gcc/config/aarch64/predicates.md > index > 49f02ae0381359174fed80c2a2264295c75bc189..7fd4f9e7d06d3082d6f30472 > 90f0446789e1d0d2 100644 > --- a/gcc/config/aarch64/predicates.md > +++ b/gcc/config/aarch64/predicates.md > @@ -545,6 +545,12 @@ (define_predicate > "aarch64_simd_shift_imm_offset_di" >(and (match_code "const_int") > (match_test "IN_RANGE (INTVAL (op), 1, 64)"))) > > +(define_predicate "aarch64_simd_shift_imm_vec_exact_top" > + (and (match_code "const_vector") > + (match_test "aarch64_const_vec_all_same_in_range_p (op, > + GET_MODE_UNIT_BITSIZE (GET_MODE (op)) / 2, > + GET_MODE_UNIT_BITSIZE (GET_MODE (op)) / 2)"))) > + > (define_predicate "aarch64_simd_shift_imm_vec_qi" >(and (match_code "const_vector") > (match_test "aarch64_const_vec_all_same_in_range_p (op, 1, 8)"))) > diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-2.c > b/gcc/testsuite/gcc.target/aarch64/shrn-combine-2.c > new file mode 100644 > index > ..924b3b849e449082b8c0b7 > dc6b955a2bad8d0911 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-2.c > @@ -0,0 +1,15 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ > + > +typedef short int16_t; > +typedef unsigned short uint16_t; > + > +void fo
[PATCH] Fix thinko in previous alignment peeling change
I was mistaken in that npeel is -1 for variable peeling - it is 0.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-09-30  Richard Biener

	* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Fix
	npeel check for variable amount of peeling.
---
 gcc/tree-vect-data-refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 1c6fc4a8f0f..bece58df3bf 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1265,7 +1265,7 @@ vect_update_misalignment_for_peel (dr_vec_info *dr_info,
   tree vectype = STMT_VINFO_VECTYPE (dr_info->stmt);
   if (DR_TARGET_ALIGNMENT (dr_info).is_constant (&alignment)
       && known_alignment_for_access_p (dr_info, vectype)
-      && npeel != -1)
+      && npeel != 0)
     {
       int misal = dr_info->misalignment;
       misal += npeel * TREE_INT_CST_LOW (DR_STEP (dr_info->dr));
--
2.31.1
RE: [PATCH 3/7]AArch64 Add pattern for sshr to cmlt
> -Original Message- > From: Tamar Christina > Sent: Wednesday, September 29, 2021 5:20 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 3/7]AArch64 Add pattern for sshr to cmlt > > Hi All, > > This optimizes signed right shift by BITSIZE-1 into a cmlt operation which is > more optimal because generally compares have a higher throughput than > shifts. > > On AArch64 the result of the shift would have been either -1 or 0 which is the > results of the compare. > > i.e. > > void e (int * restrict a, int *b, int n) > { > for (int i = 0; i < n; i++) > b[i] = a[i] >> 31; > } > > now generates: > > .L4: > ldr q0, [x0, x3] > cmltv0.4s, v0.4s, #0 > str q0, [x1, x3] > add x3, x3, 16 > cmp x4, x3 > bne .L4 > > instead of: > > .L4: > ldr q0, [x0, x3] > sshrv0.4s, v0.4s, 31 > str q0, [x1, x3] > add x3, x3, 16 > cmp x4, x3 > bne .L4 > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? This should be okay (either a win or neutral) for Arm Cortex and Neoverse cores so I'm tempted to not ask for a CPU-specific tunable to guard it to keep the code clean. Andrew, would this change be okay from a Thunder X line perspective? Thanks, Kyrill > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md (aarch64_simd_ashr): > Add case cmp > case. > * config/aarch64/constraints.md (D1): New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/shl-combine-2.c: New test. 
> > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 300bf001b59ca7fa197c580b10adb7f70f20d1e0..19b2d0ad4dab4d574269829 > 7ded861228ee22007 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -1127,12 +1127,14 @@ (define_insn "aarch64_simd_lshr" > ) > > (define_insn "aarch64_simd_ashr" > - [(set (match_operand:VDQ_I 0 "register_operand" "=w") > - (ashiftrt:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w") > - (match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" > "Dr")))] > + [(set (match_operand:VDQ_I 0 "register_operand" "=w,w") > + (ashiftrt:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w,w") > + (match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" > "D1,Dr")))] > "TARGET_SIMD" > - "sshr\t%0., %1., %2" > - [(set_attr "type" "neon_shift_imm")] > + "@ > + cmlt\t%0., %1., #0 > + sshr\t%0., %1., %2" > + [(set_attr "type" "neon_compare,neon_shift_imm")] > ) > > (define_insn "*aarch64_simd_sra" > diff --git a/gcc/config/aarch64/constraints.md > b/gcc/config/aarch64/constraints.md > index > 3b49b452119c49320020fa9183314d9a25b92491..18630815ffc13f2168300a89 > 9db69fd428dfb0d6 100644 > --- a/gcc/config/aarch64/constraints.md > +++ b/gcc/config/aarch64/constraints.md > @@ -437,6 +437,14 @@ (define_constraint "Dl" >(match_test "aarch64_simd_shift_imm_p (op, GET_MODE (op), >true)"))) > > +(define_constraint "D1" > + "@internal > + A constraint that matches vector of immediates that is bits(mode)-1." > + (and (match_code "const,const_vector") > + (match_test "aarch64_const_vec_all_same_in_range_p (op, > + GET_MODE_UNIT_BITSIZE (mode) - 1, > + GET_MODE_UNIT_BITSIZE (mode) - 1)"))) > + > (define_constraint "Dr" >"@internal > A constraint that matches vector of immediates for right shifts." 
> diff --git a/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c > b/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c > new file mode 100644 > index > ..bdfe35d09ffccc7928947c9e > 57f1034f7ca2c798 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c > @@ -0,0 +1,12 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ > + > +void e (int * restrict a, int *b, int n) > +{ > +for (int i = 0; i < n; i++) > + b[i] = a[i] >> 31; > +} > + > +/* { dg-final { scan-assembler-times {\tcmlt\t} 1 } } */ > +/* { dg-final { scan-assembler-not {\tsshr\t} } } */ > + > > > --
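For readers following the thread, here is a minimal scalar sketch (not part of the patch) of why a signed right shift by BITSIZE-1 and a compare-less-than-zero give the same lane value. The helper names are hypothetical, and the sketch assumes that `>>` on a negative signed value is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <stdint.h>

/* One lane of "sshr v0.4s, v0.4s, 31".  Assumes >> on a negative signed
   value is an arithmetic shift (implementation-defined in ISO C).  */
int32_t lane_sshr31 (int32_t x)
{
  return x >> 31;
}

/* One lane of "cmlt v0.4s, v0.4s, #0": all ones if negative, else zero.  */
int32_t lane_cmlt0 (int32_t x)
{
  return x < 0 ? -1 : 0;
}

/* Returns 1 if both lane models agree on a few boundary values.  */
int sshr31_matches_cmlt0 (void)
{
  static const int32_t samples[] = { INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (lane_sshr31 (samples[i]) != lane_cmlt0 (samples[i]))
      return 0;
  return 1;
}
```

The shift replicates the sign bit into every bit of the lane, so the result can only be 0 or -1, which is exactly the mask a vector compare produces.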
RE: [PATCH 4/7]AArch64 Add pattern xtn+xtn2 to uzp2
> -Original Message- > From: Tamar Christina > Sent: Wednesday, September 29, 2021 5:20 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 4/7]AArch64 Add pattern xtn+xtn2 to uzp2 > > Hi All, > > This turns truncate operations with a hi/lo pair into a single permute of half > the bit size of the input and just ignoring the top bits (which are truncated > out). > > i.e. > > void d2 (short * restrict a, int *b, int n) > { > for (int i = 0; i < n; i++) > a[i] = b[i]; > } > > now generates: > > .L4: > ldp q0, q1, [x3] > add x3, x3, 32 > uzp1v0.8h, v0.8h, v1.8h > str q0, [x5], 16 > cmp x4, x3 > bne .L4 > > instead of > > .L4: > ldp q0, q1, [x3] > add x3, x3, 32 > xtn v0.4h, v0.4s > xtn2v0.8h, v1.4s > str q0, [x5], 16 > cmp x4, x3 > bne .L4 > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > Ok. Thanks, Kyrill > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_narrow_trunc): New. > * config/aarch64/iterators.md (VNARROWSIMD, Vnarrowsimd): > New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/xtn-combine-1.c: New test. > * gcc.target/aarch64/narrow_high_combine.c: Update case. > > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 36396ef236e8c476d5e2f1acee80dc54ec5ebe4e..33e3301d229366022a5b9481 > b6c3ae8f8d93f9e2 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -1753,6 +1753,18 @@ (define_expand "aarch64_xtn2" >} > ) > > +(define_insn "*aarch64_narrow_trunc" > + [(set (match_operand: 0 "register_operand" "=w") > + (vec_concat: > + (truncate: > +(match_operand:VQN 1 "register_operand" "w")) > + (truncate: > + (match_operand:VQN 2 "register_operand" "w"] > + "TARGET_SIMD" > + "uzp1\\t%0., %1., %2." > + [(set_attr "type" "neon_permute")] > +) > + > ;; Packing doubles. 
> > (define_expand "vec_pack_trunc_" > diff --git a/gcc/config/aarch64/iterators.md > b/gcc/config/aarch64/iterators.md > index > 8dbeed3b0d4a44cdc17dd333ed397b39a33f386a..95b385c0c9405fe95fcd072 > 62a9471ab13d5488e 100644 > --- a/gcc/config/aarch64/iterators.md > +++ b/gcc/config/aarch64/iterators.md > @@ -270,6 +270,14 @@ (define_mode_iterator VDQHS [V4HI V8HI V2SI > V4SI]) > ;; Advanced SIMD modes for H, S and D types. > (define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI]) > > +;; Modes for which we can narrow the element and increase the lane counts > +;; to preserve the same register size. > +(define_mode_attr VNARROWSIMD [(V4HI "V8QI") (V8HI "V16QI") (V4SI > "V8HI") > +(V2SI "V4HI") (V2DI "V4SI")]) > + > +(define_mode_attr Vnarrowsimd [(V4HI "v8qi") (V8HI "v16qi") (V4SI "v8hi") > +(V2SI "v4hi") (V2DI "v4si")]) > + > ;; Advanced SIMD and scalar integer modes for H and S. > (define_mode_iterator VSDQ_HSI [V4HI V8HI V2SI V4SI HI SI]) > > diff --git a/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c > b/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c > index > 50ecab002a3552d37a5cc0d8921f42f6c3dba195..fa61196d3644caa48b12151e > 12b15dfeab8c7e71 100644 > --- a/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c > +++ b/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c > @@ -225,7 +225,8 @@ TEST_2_UNARY (vqmovun, uint32x4_t, int64x2_t, > s64, u32) > /* { dg-final { scan-assembler-times "\\tuqshrn2\\tv" 6} } */ > /* { dg-final { scan-assembler-times "\\tsqrshrn2\\tv" 6} } */ > /* { dg-final { scan-assembler-times "\\tuqrshrn2\\tv" 6} } */ > -/* { dg-final { scan-assembler-times "\\txtn2\\tv" 12} } */ > +/* { dg-final { scan-assembler-times "\\txtn2\\tv" 6} } */ > +/* { dg-final { scan-assembler-times "\\tuzp1\\tv" 6} } */ > /* { dg-final { scan-assembler-times "\\tuqxtn2\\tv" 6} } */ > /* { dg-final { scan-assembler-times "\\tsqxtn2\\tv" 6} } */ > /* { dg-final { scan-assembler-times "\\tsqxtun2\\tv" 6} } */ > diff --git 
a/gcc/testsuite/gcc.target/aarch64/xtn-combine-1.c > b/gcc/testsuite/gcc.target/aarch64/xtn-combine-1.c > new file mode 100644 > index > ..ed655cc970a602da4ace78d > c8dbd64ab18b0d4ab > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/xtn-combine-1.c > @@ -0,0 +1,12 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ > + > +void d2 (short * restrict a, int *b, int n) > +{ > +for (int i = 0; i < n; i++) > + a[i] = b[i]; > +} > + > +/* { dg-final { scan-assembler-times {\tuzp1\t} 1 } } */ > +/* { dg-final { scan-assembler-not {\txtn\t} } } */ > +/* { dg-final { scan-assem
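As an illustration only (not taken from the patch), the xtn+xtn2 to uzp1 rewrite can be modelled in scalar C. The sketch assumes the little-endian lane layout, where each 32-bit lane viewed as two 16-bit lanes has its low half at the even index; the function names are made up for this example.

```c
#include <stdint.h>

/* View 4 x 32-bit lanes as 8 x 16-bit lanes.  Assumption of this sketch:
   little-endian lane order, so the low half lands at the even index.  */
static void as_halfwords (const uint32_t in[4], uint16_t out[8])
{
  for (int i = 0; i < 4; i++)
    {
      out[2 * i] = (uint16_t) (in[i] & 0xffffu);
      out[2 * i + 1] = (uint16_t) (in[i] >> 16);
    }
}

/* Model of "xtn v0.4h, lo.4s" followed by "xtn2 v0.8h, hi.4s":
   truncate every 32-bit lane to its low 16 bits.  */
void narrow_xtn (const uint32_t lo[4], const uint32_t hi[4], uint16_t out[8])
{
  for (int i = 0; i < 4; i++)
    {
      out[i] = (uint16_t) lo[i];
      out[i + 4] = (uint16_t) hi[i];
    }
}

/* Model of "uzp1 v0.8h, lo.8h, hi.8h": even-indexed 16-bit elements of
   the first source, then even-indexed elements of the second.  */
void narrow_uzp1 (const uint32_t lo[4], const uint32_t hi[4], uint16_t out[8])
{
  uint16_t n[8], m[8];
  as_halfwords (lo, n);
  as_halfwords (hi, m);
  for (int i = 0; i < 4; i++)
    {
      out[i] = n[2 * i];
      out[i + 4] = m[2 * i];
    }
}

/* Returns 1 if both models produce the same 8 halfwords.  */
int narrow_models_agree (void)
{
  const uint32_t lo[4] = { 0x00010002u, 0xdeadbeefu, 0u, 0xffff8000u };
  const uint32_t hi[4] = { 0x12345678u, 0x0000ffffu, 0x80000000u, 0x7fff7fffu };
  uint16_t a[8], b[8];
  narrow_xtn (lo, hi, a);
  narrow_uzp1 (lo, hi, b);
  for (int i = 0; i < 8; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

Under the same layout assumption, taking the odd-indexed elements instead (uzp2) would select the top halves, which is the idea behind the narrowing-shift pattern in patch 2/7.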
RE: [PATCH 6/7]AArch64 Add neg + cmle into cmgt
> -Original Message- > From: Tamar Christina > Sent: Wednesday, September 29, 2021 5:22 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 6/7]AArch64 Add neg + cmle into cmgt > > Hi All, > > This turns an inversion of the sign bit + arithmetic right shift into a > comparison with 0. > > i.e. > > void fun1(int32_t *x, int n) > { > for (int i = 0; i < (n & -16); i++) > x[i] = (-x[i]) >> 31; > } > > now generates: > > .L3: > ldr q0, [x0] > cmgtv0.4s, v0.4s, #0 > str q0, [x0], 16 > cmp x0, x1 > bne .L3 > > instead of: > > .L3: > ldr q0, [x0] > neg v0.4s, v0.4s > sshrv0.4s, v0.4s, 31 > str q0, [x0], 16 > cmp x0, x1 > bne .L3 > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_simd_neg_ashr): New. > * config/aarch64/predicates.md > (aarch64_simd_shift_imm_vec_signbit): New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/signbit-1.c: New test. 
> > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 0045b100c6af1c007293ee26506199868be90e9f..9d936428b438c95b56614c94 > 081d7e2ebc47d89f 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -1137,6 +1137,18 @@ (define_insn "aarch64_simd_ashr" >[(set_attr "type" "neon_compare,neon_shift_imm")] > ) > > +;; Additional opt when we negate the sign bit and then shift right > +(define_insn "*aarch64_simd_neg_ashr" > + [(set (match_operand:VDQ_I 0 "register_operand" "=w") > + (ashiftrt:VDQ_I > + (neg:VDQ_I > +(match_operand:VDQ_I 1 "register_operand" "w")) > +(match_operand:VDQ_I 2 "aarch64_simd_shift_imm_vec_signbit" > "D1")))] > + "TARGET_SIMD" > + "cmgt\t%0., %1., #0" > + [(set_attr "type" "neon_compare_zero")] > +) > + > (define_insn "*aarch64_simd_sra" > [(set (match_operand:VDQ_I 0 "register_operand" "=w") > (plus:VDQ_I > diff --git a/gcc/config/aarch64/predicates.md > b/gcc/config/aarch64/predicates.md > index > 7fd4f9e7d06d3082d6f3047290f0446789e1d0d2..12e7d35da154b10f0190274 > d0279cab313563455 100644 > --- a/gcc/config/aarch64/predicates.md > +++ b/gcc/config/aarch64/predicates.md > @@ -545,6 +545,12 @@ (define_predicate > "aarch64_simd_shift_imm_offset_di" >(and (match_code "const_int") > (match_test "IN_RANGE (INTVAL (op), 1, 64)"))) > > +(define_predicate "aarch64_simd_shift_imm_vec_signbit" > + (and (match_code "const_vector") > + (match_test "aarch64_const_vec_all_same_in_range_p (op, > + GET_MODE_UNIT_BITSIZE (mode) - 1, > + GET_MODE_UNIT_BITSIZE (mode) - 1)"))) > + > (define_predicate "aarch64_simd_shift_imm_vec_exact_top" >(and (match_code "const_vector") > (match_test "aarch64_const_vec_all_same_in_range_p (op, Ok but > diff --git a/gcc/testsuite/gcc.target/aarch64/signbit-1.c > b/gcc/testsuite/gcc.target/aarch64/signbit-1.c > new file mode 100644 > index > ..3ebfb0586f37de29cf58635 > b27fe48503714447e > --- /dev/null > +++ 
b/gcc/testsuite/gcc.target/aarch64/signbit-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps" } */ > + > +#include > + > +void fun1(int32_t *x, int n) > +{ > +for (int i = 0; i < (n & -16); i++) > + x[i] = (-x[i]) >> 31; > +} > + > +void fun2(int32_t *x, int n) > +{ > +for (int i = 0; i < (n & -16); i++) > + x[i] = (-x[i]) >> 30; > +} > + > +/* { dg-final { scan-assembler-times {\tcmgt\t} 1 } } */ ... as discussed offline can we also add test coverage for the other modes used in the iterators in this patch series. The extra tests can be added as separate follow up patches. Also, I'd appreciate a comment in the test for why only one of the functions is expected to generate a cmgt here (or remove the one that's irrelevant here) Thanks, Kyrill > > > --
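On the question of why only fun1 is expected to produce a cmgt, a small scalar sketch may help (illustrative only, with hypothetical helper names; it assumes arithmetic `>>` on signed values and deliberately avoids INT32_MIN, whose negation overflows in C). With a shift of 31 the negate-and-shift result is always 0 or -1 and matches a compare-greater-than-zero; with a shift of 30 it is not a mask in general.

```c
#include <stdint.h>

/* One lane of "neg" followed by "sshr #shift".  Callers must not pass
   INT32_MIN, since -x would overflow.  Assumes arithmetic >>.  */
int32_t lane_neg_sshr (int32_t x, int shift)
{
  return (-x) >> shift;
}

/* One lane of "cmgt v0.4s, v0.4s, #0": all ones if positive, else zero.  */
int32_t lane_cmgt0 (int32_t x)
{
  return x > 0 ? -1 : 0;
}
```

For example, with x = -1073741824 (that is, -(1 << 30)), `lane_neg_sshr (x, 30)` yields 1 while `lane_cmgt0 (x)` yields 0, so the shift-by-30 loop cannot be rewritten to a compare.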
RE: [PATCH 7/7]AArch64 Combine cmeq 0 + not into cmtst
> -Original Message- > From: Tamar Christina > Sent: Wednesday, September 29, 2021 5:22 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 7/7]AArch64 Combine cmeq 0 + not into cmtst > > Hi All, > > This turns a bitwise inverse of an equality comparison with 0 into a compare > of > bitwise nonzero (cmtst). > > We already have one pattern for cmsts, this adds an additional one which > does > not require an additional bitwise and. > > i.e. > > #include > > uint8x8_t bar(int16x8_t abs_row0, int16x8_t row0) { > uint16x8_t row0_diff = > vreinterpretq_u16_s16(veorq_s16(abs_row0, vshrq_n_s16(row0, 15))); > uint8x8_t abs_row0_gt0 = > vmovn_u16(vcgtq_u16(vreinterpretq_u16_s16(abs_row0), > vdupq_n_u16(0))); > return abs_row0_gt0; > } > > now generates: > > bar: > cmtst v0.8h, v0.8h, v0.8h > xtn v0.8b, v0.8h > ret > > instead of: > > bar: > cmeqv0.8h, v0.8h, #0 > not v0.16b, v0.16b > xtn v0.8b, v0.8h > ret > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_cmtst_same_): New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/mvn-cmeq0-1.c: New test. 
> > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 9d936428b438c95b56614c94081d7e2ebc47d89f..bce01c36386074bf475b8b7 > e5c69a1959a13fef3 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -6585,6 +6585,23 @@ (define_insn "aarch64_cmtst" >[(set_attr "type" "neon_tst")] > ) > > +;; One can also get a cmtsts by having to combine a > +;; not (neq (eq x 0)) in which case you rewrite it to > +;; a comparison against itself > + > +(define_insn "*aarch64_cmtst_same_" > + [(set (match_operand: 0 "register_operand" "=w") > + (plus: > + (eq: > + (match_operand:VDQ_I 1 "register_operand" "w") > + (match_operand:VDQ_I 2 "aarch64_simd_imm_zero")) > + (match_operand: 3 > "aarch64_simd_imm_minus_one"))) > + ] > + "TARGET_SIMD" > + "cmtst\t%0, %1, %1" > + [(set_attr "type" "neon_tst")] > +) > + > (define_insn_and_split "aarch64_cmtstdi" >[(set (match_operand:DI 0 "register_operand" "=w,r") > (neg:DI > diff --git a/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c > b/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c > new file mode 100644 > index > ..59f3a230271c70d3bb51d03 > 38d9ec2613bd4394b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/mvn-cmeq0-1.c > @@ -0,0 +1,17 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ I don't think we need the param here (or even anything higher than -O really). Ok otherwise. 
Thanks, Kyrill > + > +#include > + > +uint8x8_t bar(int16x8_t abs_row0, int16x8_t row0) { > + uint16x8_t row0_diff = > +vreinterpretq_u16_s16(veorq_s16(abs_row0, vshrq_n_s16(row0, 15))); > + uint8x8_t abs_row0_gt0 = > +vmovn_u16(vcgtq_u16(vreinterpretq_u16_s16(abs_row0), > vdupq_n_u16(0))); > + return abs_row0_gt0; > +} > + > + > +/* { dg-final { scan-assembler-times {\tcmtst\t} 1 } } */ > +/* { dg-final { scan-assembler-not {\tcmeq\t} } } */ > +/* { dg-final { scan-assembler-not {\tnot\t} } } */ > > > --
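The identity behind the new pattern can be checked exhaustively on one 16-bit lane. The following is a sketch with made-up helper names, not code from the patch: inverting the mask produced by a compare-equal-zero gives the same value as a bitwise test of the lane against itself.

```c
#include <stdint.h>

/* One lane of "cmeq v, v, #0" followed by "not": the inverted
   equal-to-zero mask.  */
uint16_t lane_not_cmeq0 (uint16_t x)
{
  uint16_t mask = (x == 0) ? 0xffffu : 0u;
  return (uint16_t) ~mask;
}

/* One lane of "cmtst vd, va, vb": all ones if (a & b) is nonzero.  */
uint16_t lane_cmtst (uint16_t a, uint16_t b)
{
  return (a & b) != 0 ? 0xffffu : 0u;
}

/* Returns 1 if cmtst x, x equals not (cmeq x, 0) for every 16-bit x.  */
int cmtst_same_matches_all (void)
{
  for (uint32_t v = 0; v <= 0xffffu; v++)
    if (lane_cmtst ((uint16_t) v, (uint16_t) v)
	!= lane_not_cmeq0 ((uint16_t) v))
      return 0;
  return 1;
}
```

Since x & x is just x, the test against itself asks "is x nonzero", which is the negation of the compare against zero.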
PING: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction
Thanks, Di

-----Original Message-----
From: Gcc-patches On Behalf Of Di Zhao OS via Gcc-patches
Sent: Friday, September 17, 2021 2:13 AM
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

Sorry for the long delay in updating this. It took me a while to work out a new plan and get the tests passing.

The new idea is to use one variable to represent a set of equal variables at some basic block. This variable is called an "equivalence head" or "equiv-head" in the code. (There is no longer an "equivalence map".)

- Initially an SSA_NAME's "equivalence head" is its value number. Temporary equivalence heads are recorded as unary NOP_EXPR results in the vn_nary_op_t map. In addition, when inserting into the vn_nary_op_t map, the new result is placed at the front of the vn_pval list, so that when searching for a variable's equivalence head, the first result represents the largest equivalence set at the current location.
- In vn_ssa_aux_t, maintain a list of references to valid_info->nary entries. For recorded equivalences, the reference is result->entry; for normal N-ary operations, the reference is operand->entry.
- When recording equivalences, if one side A is constant or has more refs, make it the new equivalence head of the other side B. Traverse B's ref-list; if a variable C's previous equiv-head is B, update it to A. Then re-insert B's n-ary operations with B replaced by A.
- When inserting and looking up the results of n-ary operations, insert and look up by the operands' equiv-heads.

So, apart from the refs in vn_ssa_aux_t, this scheme makes the most of the existing infrastructure. Quadratic search time is avoided at the cost of some re-insertions.

Test results on SPEC2017 intrate (counts and percentages):

                |more bb |more bb |more stmt|more stmt|more       |more
                |removed |removed |removed  |removed  |nv_nary_ops|nv_nary_ops
                |at fre1 |at fre1 |at fre1  |at fre1  |inserted   |inserted
                |(count) |(%)     |(count)  |(%)      |(count)    |(%)
-------------------------------------------------------------------------------
500.perlbench_r | 64     | 1.98%  | 103     | 0.19%   | 11260     | 12.16%
502.gcc_r       | 671    | 4.80%  | 317     | 0.23%   | 13964     | 6.09%
505.mcf_r       | 5      | 35.71% | 9       | 1.40%   | 32        | 2.52%
520.omnetpp     | 132    | 5.45%  | 39      | 0.11%   | 1895      | 3.91%
523.xalancbmk_r | 238    | 3.26%  | 313     | 0.36%   | 1417      | 1.27%
525.x264_r      | 4      | 1.36%  | 27      | 0.11%   | 1752      | 6.78%
531.deepsjeng_r | 1      | 3.45%  | 2       | 0.14%   | 228       | 8.67%
541.leela_r     | 2      | 0.63%  | 0       | 0.00%   | 92        | 1.14%
548.exchange2_r | 0      | 0.00%  | 3       | 0.04%   | 49        | 1.03%
557.xz_r        | 0      | 0.00%  | 3       | 0.07%   | 272       | 7.55%

More basic blocks and statements are removed than with the previous implementation, for two reasons:

1) "CONST op CONST" simplification is now included. It was missed in the previous patch.
2) By inserting the RHS of statements on equiv-heads, more N-ary operations can be simplified. One example is 'ssa-fre-97.c' in the patch file.

While jump-threading and vrp also use temporary equivalences (so some of the newly removed blocks and statements can also be covered by them), I think this patch is a supplement for cases where jump threading cannot take place (the original example), or where value number info needs to be involved (the 'ssa-fre-97.c' example).

Fixed the earlier issue with non-iterate mode.

About recording the temporary equivalences generated by PHIs (i.e. the 'record_equiv_from_previous_cond' stuff): I have to admit it looks strange and the code size is large, but I haven't found a better way yet. Consider a piece of CFG like the one below. If we want to record x==x2 on the true edge when processing bb1, the location (following current practice) will be bb2. But that is not useful at bb5 or bb6, because bb2 doesn't dominate them. And I can't find a place to record x==x1 when processing bb1.
If we can record things on edges rather than blocks, say x==x1 on 1->3 and x==x2 on 1->2, then perhaps with an extra check for "a!=0", x2 can be a valid equiv-head for x since bb5. But I think it lacks efficiency and is not persuasive. It is more efficient to find a valid previous predicate when processing bb4, because the vn_nary_op_t will be fetched anyway. -- | if (a != 0) | bb1 -- f | \ t |--- || bb2 | |--- | / - | x = PHI | bb3 - | | -- | if (a != 0) | bb4 -- |f \t - --- bb7 | where | | bb5 | ==> where "x==x2" is recorded now | "x==x1" is| --- | recorded |\ | now |
RE: [PATCH 5/7]middle-end Convert bitclear + cmp #0 into cm
> -Original Message- > From: Richard Biener > Sent: Thursday, September 30, 2021 7:18 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 5/7]middle-end Convert bitclear + cmp #0 > into cm > > On Wed, 29 Sep 2021, Tamar Christina wrote: > > > Hi All, > > > > This optimizes the case where a mask Y which fulfills ~Y + 1 == pow2 > > is used to clear a some bits and then compared against 0 into one > > without the masking and a compare against a different bit immediate. > > > > We can do this for all unsigned compares and for signed we can do it > > for comparisons of EQ and NE: > > > > (x & (~255)) == 0 becomes x <= 255. Which for leaves it to the target > > to optimally deal with the comparison. > > > > This transformation has to be done in the mid-end because in RTL you > > don't have the signs of the comparison operands and if the target > > needs an immediate this should be floated outside of the loop. > > > > The RTL loop invariant hoisting is done before split1. > > > > i.e. > > > > void fun1(int32_t *x, int n) > > { > > for (int i = 0; i < (n & -16); i++) > > x[i] = (x[i]&(~255)) == 0; > > } > > > > now generates: > > > > .L3: > > ldr q0, [x0] > > cmhsv0.4s, v2.4s, v0.4s > > and v0.16b, v1.16b, v0.16b > > str q0, [x0], 16 > > cmp x0, x1 > > bne .L3 > > > > and floats the immediate out of the loop. > > > > instead of: > > > > .L3: > > ldr q0, [x0] > > bic v0.4s, #255 > > cmeqv0.4s, v0.4s, #0 > > and v0.16b, v1.16b, v0.16b > > str q0, [x0], 16 > > cmp x0, x1 > > bne .L3 > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu > > and no issues. > > > > Ok for master? > > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > * match.pd: New bitmask compare pattern. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/bic-bitmask-10.c: New test. > > * gcc.dg/bic-bitmask-11.c: New test. > > * gcc.dg/bic-bitmask-12.c: New test. > > * gcc.dg/bic-bitmask-2.c: New test. > > * gcc.dg/bic-bitmask-3.c: New test. 
> > * gcc.dg/bic-bitmask-4.c: New test. > > * gcc.dg/bic-bitmask-5.c: New test. > > * gcc.dg/bic-bitmask-6.c: New test. > > * gcc.dg/bic-bitmask-7.c: New test. > > * gcc.dg/bic-bitmask-8.c: New test. > > * gcc.dg/bic-bitmask-9.c: New test. > > * gcc.dg/bic-bitmask.h: New test. > > * gcc.target/aarch64/bic-bitmask-1.c: New test. > > > > --- inline copy of patch -- > > diff --git a/gcc/match.pd b/gcc/match.pd index > > > 0fcfd0ea62c043dc217d0d560ce5b7e569b70e7d..df9212cb27d172856b9d43b08 > 752 > > 62f96e8993c4 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -4288,6 +4288,56 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > (if (ic == ncmp) > > (ncmp @0 @1)) > > > > +/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z > > + where ~Y + 1 == pow2 and Z = ~Y. */ (for cmp (simple_comparison) > > +(simplify > > + (cmp (bit_and:c @0 VECTOR_CST@1) integer_zerop) > > Why not for INTEGER_CST as well? We do have a related folding (only for > INTEGER_CST) that does > Because of a slight concern to de-optimize what targets currently generate for the flag setting variants. So for example AArch64 generates worse code for foo than it does bar int foo (int x) { if (x <= 0x) return 1; return 0; } int bar (int x) { if (x & ~0x) return 1; return 0; } Because the flag setting bitmask was optimized more. I can of course do this and fix AArch64 but other targets may have the same issue. For vectors this was less of a concern since there's not flag setting there. Do you still want the scalar version? Thanks, Tamar > /* A & (2**N - 1) <= 2**K - 1 -> A & (2**N - 2**K) == 0 >A & (2**N - 1) > 2**K - 1 -> A & (2**N - 2**K) != 0 > > which could be extended for integer vectors. That said, can you please place > the pattern next to the above? > > Why does the transform only work for uniform vector constants? 
(I see that > the implementation becomes simpler, but then you should also handle the > INTEGER_CST case at least) > > > + (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@1)) > > + && uniform_vector_p (@1)) > > +(with { tree elt = vector_cst_elt (@1, 0); } > > + (switch > > + (if (TYPE_UNSIGNED (TREE_TYPE (@1)) && tree_fits_uhwi_p (elt)) > > avoid tree_fits_uhwi_p and use wide_int here > > > + (with { unsigned HOST_WIDE_INT diff = tree_to_uhwi (elt); > > + tree tdiff = wide_int_to_tree (TREE_TYPE (elt), (~diff) + 1); > > + tree newval = wide_int_to_tree (TREE_TYPE (elt), ~diff); > > + tree newmask = build_uniform_cst (TREE_TYPE (@1), > newval); } > > +(if (integer_pow2p (tdiff)) > > You don't seem to use 'tdiff' so please do this check in wide_int > > > + (switch > > + /* ((mask & x) < 0) -> 0. */ > > +
RE: [PATCH 5/7]middle-end Convert bitclear + cmp #0 into cm
On Thu, 30 Sep 2021, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Thursday, September 30, 2021 7:18 AM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd > > Subject: Re: [PATCH 5/7]middle-end Convert bitclear + cmp #0 > > into cm > > > > On Wed, 29 Sep 2021, Tamar Christina wrote: > > > > > Hi All, > > > > > > This optimizes the case where a mask Y which fulfills ~Y + 1 == pow2 > > > is used to clear a some bits and then compared against 0 into one > > > without the masking and a compare against a different bit immediate. > > > > > > We can do this for all unsigned compares and for signed we can do it > > > for comparisons of EQ and NE: > > > > > > (x & (~255)) == 0 becomes x <= 255. Which for leaves it to the target > > > to optimally deal with the comparison. > > > > > > This transformation has to be done in the mid-end because in RTL you > > > don't have the signs of the comparison operands and if the target > > > needs an immediate this should be floated outside of the loop. > > > > > > The RTL loop invariant hoisting is done before split1. > > > > > > i.e. > > > > > > void fun1(int32_t *x, int n) > > > { > > > for (int i = 0; i < (n & -16); i++) > > > x[i] = (x[i]&(~255)) == 0; > > > } > > > > > > now generates: > > > > > > .L3: > > > ldr q0, [x0] > > > cmhsv0.4s, v2.4s, v0.4s > > > and v0.16b, v1.16b, v0.16b > > > str q0, [x0], 16 > > > cmp x0, x1 > > > bne .L3 > > > > > > and floats the immediate out of the loop. > > > > > > instead of: > > > > > > .L3: > > > ldr q0, [x0] > > > bic v0.4s, #255 > > > cmeqv0.4s, v0.4s, #0 > > > and v0.16b, v1.16b, v0.16b > > > str q0, [x0], 16 > > > cmp x0, x1 > > > bne .L3 > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu > > > and no issues. > > > > > > Ok for master? > > > > > > Thanks, > > > Tamar > > > > > > gcc/ChangeLog: > > > > > > * match.pd: New bitmask compare pattern. 
> > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.dg/bic-bitmask-10.c: New test. > > > * gcc.dg/bic-bitmask-11.c: New test. > > > * gcc.dg/bic-bitmask-12.c: New test. > > > * gcc.dg/bic-bitmask-2.c: New test. > > > * gcc.dg/bic-bitmask-3.c: New test. > > > * gcc.dg/bic-bitmask-4.c: New test. > > > * gcc.dg/bic-bitmask-5.c: New test. > > > * gcc.dg/bic-bitmask-6.c: New test. > > > * gcc.dg/bic-bitmask-7.c: New test. > > > * gcc.dg/bic-bitmask-8.c: New test. > > > * gcc.dg/bic-bitmask-9.c: New test. > > > * gcc.dg/bic-bitmask.h: New test. > > > * gcc.target/aarch64/bic-bitmask-1.c: New test. > > > > > > --- inline copy of patch -- > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > > > > 0fcfd0ea62c043dc217d0d560ce5b7e569b70e7d..df9212cb27d172856b9d43b08 > > 752 > > > 62f96e8993c4 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -4288,6 +4288,56 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > (if (ic == ncmp) > > > (ncmp @0 @1)) > > > > > > +/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z > > > + where ~Y + 1 == pow2 and Z = ~Y. */ (for cmp (simple_comparison) > > > +(simplify > > > + (cmp (bit_and:c @0 VECTOR_CST@1) integer_zerop) > > > > Why not for INTEGER_CST as well? We do have a related folding (only for > > INTEGER_CST) that does > > > > Because of a slight concern to de-optimize what targets currently generate > for the flag setting variants. > So for example AArch64 generates worse code for foo than it does bar > > int foo (int x) > { > if (x <= 0x) > return 1; > > return 0; > } > > int bar (int x) > { > if (x & ~0x) > return 1; > > return 0; > } > > Because the flag setting bitmask was optimized more. I can of course do this > and fix > AArch64 but other targets may have the same issue. For vectors this was less > of a concern since > there's not flag setting there. > > Do you still want the scalar version? Yes, the simplification result is simpler and thus more canonical on GIMPLE. 
On x86 we generate xorl%eax, %eax cmpl$65535, %edi setle %al ret vs xorl%eax, %eax andl$-65536, %edi setne %al ret which are equivalent I think (and would be easily be transformed using a peephole if required). Richard. > Thanks, > Tamar > > > /* A & (2**N - 1) <= 2**K - 1 -> A & (2**N - 2**K) == 0 > >A & (2**N - 1) > 2**K - 1 -> A & (2**N - 2**K) != 0 > > > > which could be extended for integer vectors. That said, can you please > > place > > the pattern next to the above? > > > > Why does the transform only work for uniform vector constants? (I see that > > the implementation becomes simpler, but then you should also handle the > > INTEGER_CST case at least) > > > > > + (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@
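
Editorial sketch of the scalar equivalence discussed in this thread: for an unsigned x and a mask Y where ~Y + 1 is a power of two, (x & Y) == 0 holds exactly when x <= ~Y. The helper names below are hypothetical and are not part of match.pd; this only checks the arithmetic, not the GIMPLE pattern.

```cpp
#include <cassert>
#include <cstdint>

// True when ~mask + 1 is a power of two, the precondition named in the
// patch description.  A mask of 0 wraps to 0 and is rejected.
constexpr bool mask_is_foldable(std::uint32_t mask) {
    std::uint32_t neg = ~mask + 1u;
    return neg != 0 && (neg & (neg - 1u)) == 0;
}

// Original form: clear the low bits, then compare against zero.
constexpr bool masked_eq_zero(std::uint32_t x, std::uint32_t mask) {
    return (x & mask) == 0;
}

// Rewritten form: a single unsigned compare against ~mask.
constexpr bool folded_le(std::uint32_t x, std::uint32_t mask) {
    return x <= ~mask;
}

// Compare both forms at a handful of boundary values (not exhaustive).
bool forms_agree(std::uint32_t mask) {
    const std::uint32_t probes[] = {0u, 1u, 254u, 255u, 256u, 257u,
                                    0xffffu, 0x10000u,
                                    0x7fffffffu, 0x80000000u, 0xffffffffu};
    for (std::uint32_t x : probes)
        if (masked_eq_zero(x, mask) != folded_le(x, mask))
            return false;
    return true;
}
```

With mask = ~255u this is the (x & (~255)) == 0 versus x <= 255 case from the original posting; the signed EQ/NE variants mentioned there are not modelled here.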
[PATCH] Refine alignment peeling fix
This refines the previous fix further by reverting to the original code
since the API is a bit of a mess.  It also fixes the vector type used to
query the misalignment - that was what triggered the original bogus change.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-09-30  Richard Biener

    * tree-vect-data-refs.c (vect_update_misalignment_for_peel):
    Restore and fix condition under which we apply npeel to
    the DRs misalignment value.
---
 gcc/tree-vect-data-refs.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index bece58df3bf..07b5ba11f31 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1262,10 +1262,11 @@ vect_update_misalignment_for_peel (dr_vec_info *dr_info,
     }

   unsigned HOST_WIDE_INT alignment;
-  tree vectype = STMT_VINFO_VECTYPE (dr_info->stmt);
   if (DR_TARGET_ALIGNMENT (dr_info).is_constant (&alignment)
-      && known_alignment_for_access_p (dr_info, vectype)
-      && npeel != 0)
+      && known_alignment_for_access_p (dr_info,
+                                       STMT_VINFO_VECTYPE (dr_info->stmt))
+      && known_alignment_for_access_p (dr_peel_info,
+                                       STMT_VINFO_VECTYPE (dr_peel_info->stmt)))
     {
       int misal = dr_info->misalignment;
       misal += npeel * TREE_INT_CST_LOW (DR_STEP (dr_info->dr));
@@ -1515,7 +1516,8 @@ vect_peeling_hash_get_most_frequent (_vect_peel_info **slot,
 /* Get the costs of peeling NPEEL iterations for LOOP_VINFO, checking
    data access costs for all data refs.  If UNKNOWN_MISALIGNMENT is true,
-   we assume DR0_INFO's misalignment will be zero after peeling.  */
+   npeel is computed at runtime but DR0_INFO's misalignment will be zero
+   after peeling.  */

 static void
 vect_get_peeling_costs_all_drs (loop_vec_info loop_vinfo,
--
2.31.1
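
Editorial sketch of the arithmetic in the first hunk: peeling npeel scalar iterations moves a data reference by npeel * step bytes, so its misalignment shifts by that amount. The function name is hypothetical, and the final wrap to the target alignment is an assumption, since the excerpt ends before showing what is done with misal; only non-negative steps are modelled.

```cpp
#include <cassert>

// Hypothetical helper, not GCC code: misalignment (in bytes) of a data
// reference after peeling `npeel` iterations that each advance the address
// by `step` bytes.  `alignment` is assumed to be a power of two.
unsigned misalignment_after_peel(unsigned misal, unsigned npeel,
                                 unsigned step, unsigned alignment) {
    misal += npeel * step;           // same update as in the quoted hunk
    return misal & (alignment - 1);  // assumed wrap to the target alignment
}
```

For instance, a 4-byte-step access that starts 4 bytes past a 16-byte boundary becomes aligned after peeling 3 iterations, which is the situation the peeling cost code reasons about.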
[Patch] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran (was: [committed] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc)
On 30.09.21 09:45, Jakub Jelinek wrote: This patch adds new OpenMP 5.1 allocator entrypoints ... ... and this patch adds the Fortran support for it, using the C→Fortran converted testcases. Additionally, it fixes and updated the list of API routine names. We now can also tick off one item in the OpenMP 5.1 implementation status list. OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran gcc/ChangeLog: * omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and omp_{c,re}alloc, fix omp_alloc/omp_free. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Set implementation status to Y for omp_aligned_{,c}alloc and omp_{c,re}alloc routines. * omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * testsuite/libgomp.fortran/alloc-10.f90: New test. * testsuite/libgomp.fortran/alloc-6.f90: New test. * testsuite/libgomp.fortran/alloc-7.c: New test. * testsuite/libgomp.fortran/alloc-7.f90: New test. * testsuite/libgomp.fortran/alloc-8.f90: New test. * testsuite/libgomp.fortran/alloc-9.f90: New test. 
gcc/omp-low.c | 8 +- libgomp/libgomp.texi | 2 +- libgomp/omp_lib.f90.in | 43 +- libgomp/omp_lib.h.in | 46 +- libgomp/testsuite/libgomp.fortran/alloc-10.f90 | 198 + libgomp/testsuite/libgomp.fortran/alloc-6.f90 | 45 ++ libgomp/testsuite/libgomp.fortran/alloc-7.c| 5 + libgomp/testsuite/libgomp.fortran/alloc-7.f90 | 174 ++ libgomp/testsuite/libgomp.fortran/alloc-8.f90 | 58 libgomp/testsuite/libgomp.fortran/alloc-9.f90 | 196 10 files changed, 770 insertions(+), 5 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 26c5c0261e9..f7242dfbbca 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -3921,8 +3921,12 @@ omp_runtime_api_call (const_tree fndecl) { /* This array has 3 sections. First omp_* calls that don't have any suffixes. */ - "omp_alloc", - "omp_free", + "aligned_alloc", + "aligned_calloc", + "alloc", + "calloc", + "free", + "realloc", "target_alloc", "target_associate_ptr", "target_disassociate_ptr", diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi index b3bab8feddf..02160f81562 100644 --- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -315,7 +315,7 @@ The OpenMP 4.5 specification is fully supported. 
runtime routines @tab N @tab @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab @item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and - @code{omp_aligned_calloc} runtime routines @tab N @tab + @code{omp_aligned_calloc} runtime routines @tab Y @tab @item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added, @code{omp_atv_default} changed @tab Y @tab @item @code{omp_display_env} runtime routine @tab P diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in index a36a5626123..1063eee0c94 100644 --- a/libgomp/omp_lib.f90.in +++ b/libgomp/omp_lib.f90.in @@ -680,13 +680,54 @@ end function omp_alloc end interface +interface + function omp_aligned_alloc (alignment, size, allocator) bind(c) +use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t +import :: omp_allocator_handle_kind +type(c_ptr) :: omp_aligned_alloc +integer(c_size_t), value :: alignment, size +integer(omp_allocator_handle_kind), value :: allocator + end function omp_aligned_alloc +end interface + interface subroutine omp_free(ptr, allocator) bind(c) use, intrinsic :: iso_c_binding, only : c_ptr import :: omp_allocator_handle_kind type(c_ptr), value :: ptr integer(omp_allocator_handle_kind), value :: allocator - end subroutine + end subroutine omp_free +end interface + +interface + function omp_calloc (nmemb, size, allocator) bind(c) +use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t +import :: omp_allocator_handle_kind +type(c_ptr) :: omp_calloc +integer(c_size_t), value :: nmemb, size +integer(omp_allocator_handle_kind), value :: allocator + end function omp_calloc +end interface + +interface + function omp_aligned_calloc (alignment, nmemb, size, allocator) bind(c) +use, intrinsic :: iso_c_
Re: [Patch] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran (was: [committed] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc)
On Thu, Sep 30, 2021 at 01:14:49PM +0200, Tobias Burnus wrote: > On 30.09.21 09:45, Jakub Jelinek wrote: > > > This patch adds new OpenMP 5.1 allocator entrypoints ... > > ... and this patch adds the Fortran support for it, using the C→Fortran > converted testcases. Additionally, it fixes and updated the list of API > routine names. We now can also tick off one item in the OpenMP 5.1 > implementation status list. > > OK for mainline? > > Tobias > > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 > München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas > Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht > München, HRB 106955 > openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran > > gcc/ChangeLog: > > * omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and > omp_{c,re}alloc, fix omp_alloc/omp_free. > > libgomp/ChangeLog: > > * libgomp.texi (OpenMP 5.1): Set implementation status to Y for > omp_aligned_{,c}alloc and omp_{c,re}alloc routines. > * omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, > omp_realloc): Add. > * omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, > omp_realloc): Add. > * testsuite/libgomp.fortran/alloc-10.f90: New test. > * testsuite/libgomp.fortran/alloc-6.f90: New test. > * testsuite/libgomp.fortran/alloc-7.c: New test. > * testsuite/libgomp.fortran/alloc-7.f90: New test. > * testsuite/libgomp.fortran/alloc-8.f90: New test. > * testsuite/libgomp.fortran/alloc-9.f90: New test. Ok, thanks. Jakub
Re: [PATCH 2/N] Do not hide asm_out_file in ASM_OUTPUT_ASCII.
On 9/22/21 11:44, Richard Biener wrote:
> On Thu, Sep 16, 2021 at 12:01 PM Martin Liška wrote:
>>
>> Again a preparation patch that was tested on all cross compilers.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>
> I think you want to retain
>
> -FILE *_hide_asm_out_file = (MYFILE);

Oh, oh!

> and use _hide_asm_out_file to preserve MYFILE execution counts in case
> it contains side-effects.

I do always forget about the fact that macros can have side-effects.

Thanks for review,
Martin

>
> OK with that change.
>
> Richard.
>
>> Thanks,
>> Martin
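
Editorial sketch of the point raised in the review: a macro that names its argument several times evaluates it several times, while copying it into a local first (as the retained `_hide_asm_out_file` line does) evaluates it once. The macros and the `next_file` helper below are hypothetical stand-ins, not the real ASM_OUTPUT_ASCII.

```cpp
#include <cassert>
#include <cstdio>

static int calls = 0;

// Stand-in for a MYFILE argument whose evaluation has a side effect.
std::FILE *next_file() {
    ++calls;
    return stdout;
}

// Names the argument twice, so the argument expression runs twice.
#define EMIT_TWICE_NAIVE(FILE_EXPR)      \
    do {                                 \
        std::fputs("", (FILE_EXPR));     \
        std::fputs("", (FILE_EXPR));     \
    } while (0)

// Captures the argument in a local, so the expression runs exactly once.
#define EMIT_TWICE_SAFE(FILE_EXPR)               \
    do {                                         \
        std::FILE *hide_file_ = (FILE_EXPR);     \
        std::fputs("", hide_file_);              \
        std::fputs("", hide_file_);              \
    } while (0)

int count_evaluations_naive() {
    calls = 0;
    EMIT_TWICE_NAIVE(next_file());
    return calls;
}

int count_evaluations_safe() {
    calls = 0;
    EMIT_TWICE_SAFE(next_file());
    return calls;
}
```

The empty strings keep the example silent; only the evaluation count of the argument matters here.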
Re: [Ada] Add DWARF 5 support to System.Dwarf_Line
> The encoding of the debugging line information has substantially changed
> in DWARF 5, so this adds the support for it alongside the existing code.
>
> Tested on x86_64-pc-linux-gnu, committed on trunk

I have backported it, as well as the subsequent adjustments, onto the 11
branch since DWARF-5 is the default there.

--
Eric Botcazou
Re: [Ada] Add support for PE-COFF PIE to System.Dwarf_Line
> This makes it possible for System.Dwarf_Line to handle Position-Independent
> Executables on Windows systems by translating the run-time addresses it is
> provided with into addresses in the executable.
>
> Tested on x86_64-pc-linux-gnu, committed on trunk

I have backported it, as well as the subsequent adjustments, onto the 11
branch since PIE is the default for recent binutils.

--
Eric Botcazou
Re: [PATCH] c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
On 9/29/21 04:36, Jakub Jelinek wrote: Hi! The introduction of push_local_extern_decl_alias in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a broke tls vars, while the decl they are created for has the tls model set properly, nothing sets it for the alias that is actually used, so accesses to it are done as if they were normal variables. This is then diagnosed at link time if the definition of the extern vars is __thread/thread_local. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/11.3? 2021-09-28 Jakub Jelinek PR c++/102496 * name-lookup.c: Include varasm.h. (push_local_extern_decl_alias): For CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias. * g++.dg/tls/pr102496-1.C: New test. * g++.dg/tls/pr102496-2.C: New test. --- gcc/cp/name-lookup.c.jj 2021-09-28 15:51:54.035601004 +0200 +++ gcc/cp/name-lookup.c2021-09-28 16:22:51.169954638 +0200 @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see #include "c-family/known-headers.h" #include "c-family/c-spellcheck.h" #include "bitmap.h" +#include "varasm.h" static cxx_binding *cxx_binding_make (tree value, tree type); static cp_binding_level *innermost_nonclass_level (void); @@ -3471,6 +3472,10 @@ push_local_extern_decl_alias (tree decl) push_nested_namespace (ns); alias = do_pushdecl (alias, /* hiding= */true); pop_nested_namespace (ns); + if (VAR_P (decl) && CP_DECL_THREAD_LOCAL_P (decl)) + set_decl_tls_model (alias, processing_template_decl + ? decl_default_tls_model (decl) + : DECL_TLS_MODEL (decl)); Hmm, what if decl has the tls_model attribute? We could decide not to push the alias for a thread-local variable when processing_template_decl, like we don't if the type is dependent; in either case we'll push it at instantiation time. 
} } --- gcc/testsuite/g++.dg/tls/pr102496-1.C.jj 2021-09-28 16:25:47.330522476 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-1.C 2021-09-28 16:28:44.888071020 +0200 @@ -0,0 +1,20 @@ +// PR c++/102496 +// { dg-do link { target c++11 } } +// { dg-require-effective-target tls } +// { dg-add-options tls } +// { dg-additional-sources pr102496-2.C } + +template +int +foo () +{ + extern __thread int t1; + return t1; +} + +int +main () +{ + extern __thread int t2; + return foo <0> () + t2; +} --- gcc/testsuite/g++.dg/tls/pr102496-2.C.jj2021-09-28 16:25:43.815571005 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-2.C 2021-09-28 16:28:54.132943380 +0200 @@ -0,0 +1,6 @@ +// PR c++/102496 +// { dg-do compile { target c++11 } } +// { dg-require-effective-target tls } + +__thread int t1; +__thread int t2; Jakub
[PATCH][PUSHED] testsuite: Skip a test-case when LTO is used [PR102509]
Remove 2 unresolved tests.

Pushed to master,
Martin

    PR testsuite/102509

gcc/testsuite/ChangeLog:

    * gcc.c-torture/compile/attr-complex-method.c: Skip if LTO is used.
    * gcc.c-torture/compile/attr-complex-method-2.c: Likewise.
---
 gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c | 1 +
 gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
index a3dc9c1ba91..121ae17f64b 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-fcx-limited-range -fdump-tree-optimized" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 #pragma GCC optimize "-fno-cx-limited-range"
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
index f08b72e273f..046de7efeb9 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-fdump-tree-optimized" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 #pragma GCC optimize "-fcx-limited-range"
--
2.33.0
Re: [PATCH][PUSHED] testsuite: Skip a test-case when LTO is used [PR102509]
On Thu, Sep 30, 2021 at 02:17:08PM +0200, Martin Liška wrote: > Remove 2 unresolved tests. > > Pushed to master, > Martin > > PR testsuite/102509 > > gcc/testsuite/ChangeLog: > > * gcc.c-torture/compile/attr-complex-method.c: Skip if LTO is > used. > * gcc.c-torture/compile/attr-complex-method-2.c: Likewise. The other solution would be to add -ffat-lto-objects. I guess not a big difference... > diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c > b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c > index a3dc9c1ba91..121ae17f64b 100644 > --- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c > +++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c > @@ -1,4 +1,5 @@ > /* { dg-additional-options "-fcx-limited-range -fdump-tree-optimized" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ > #pragma GCC optimize "-fno-cx-limited-range" > diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c > b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c > index f08b72e273f..046de7efeb9 100644 > --- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c > +++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c > @@ -1,4 +1,5 @@ > /* { dg-additional-options "-fdump-tree-optimized" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ > #pragma GCC optimize "-fcx-limited-range" > -- > 2.33.0 Jakub
Re: [Patch] openmp: Add omp_aligned_{, c}alloc and omp_{c, re}alloc for Fortran (was: [committed] openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc)
Forgot to do a "git add" after modifying three testcases ... This silences the warning for a flag only valid for f951 but not for cc1. Committed the follow-up as r12-3984. Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 commit ef37ddf477ac4b21ec4d1be9260cfd3b431fd4a9 Author: Tobias Burnus Date: Thu Sep 30 14:44:06 2021 +0200 libgomp.fortran/alloc-*.f90: Add missing dg-prune-output libgomp/ * testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output for -fintrinsic-modules-path= warning of the C compiler. * testsuite/libgomp.fortran/alloc-9.f90: Likewise. * testsuite/libgomp.fortran/alloc-10.f90: Likewise. diff --git a/libgomp/testsuite/libgomp.fortran/alloc-10.f90 b/libgomp/testsuite/libgomp.fortran/alloc-10.f90 index d26a83b216a..060c16f312b 100644 --- a/libgomp/testsuite/libgomp.fortran/alloc-10.f90 +++ b/libgomp/testsuite/libgomp.fortran/alloc-10.f90 @@ -1,4 +1,5 @@ ! { dg-additional-sources alloc-7.c } +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" } module m use omp_lib use iso_c_binding diff --git a/libgomp/testsuite/libgomp.fortran/alloc-7.f90 b/libgomp/testsuite/libgomp.fortran/alloc-7.f90 index b047b0e4d10..d8c7eee8c25 100644 --- a/libgomp/testsuite/libgomp.fortran/alloc-7.f90 +++ b/libgomp/testsuite/libgomp.fortran/alloc-7.f90 @@ -1,4 +1,5 @@ ! { dg-additional-sources alloc-7.c } +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" } module m use omp_lib use iso_c_binding diff --git a/libgomp/testsuite/libgomp.fortran/alloc-9.f90 b/libgomp/testsuite/libgomp.fortran/alloc-9.f90 index 6458f35fd1f..1da141631bc 100644 --- a/libgomp/testsuite/libgomp.fortran/alloc-9.f90 +++ b/libgomp/testsuite/libgomp.fortran/alloc-9.f90 @@ -1,4 +1,5 @@ ! 
{ dg-additional-sources alloc-7.c } +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" } module m use omp_lib use iso_c_binding
Re: [PATCH] aarch64: Improve size heuristic for cpymem expansion
On 29/09/2021 12:20, Kyrylo Tkachov via Gcc-patches wrote:
> Hi all,
>
> Similar to my previous patch for setmem this one does the same for the
> cpymem expansion.
> We count the number of ops emitted and compare it against the alternative
> of just calling the library function when optimising for size.
> For the code:
> void
> cpy_127 (char *out, char *in)
> {
>   __builtin_memcpy (out, in, 127);
> }
>
> void
> cpy_128 (char *out, char *in)
> {
>   __builtin_memcpy (out, in, 128);
> }
>
> we now emit a call to memcpy (with an extra MOV-immediate instruction for
> the size) instead of:
> cpy_127(char*, char*):
>         ldp     q0, q1, [x1]
>         stp     q0, q1, [x0]
>         ldp     q0, q1, [x1, 32]
>         stp     q0, q1, [x0, 32]
>         ldp     q0, q1, [x1, 64]
>         stp     q0, q1, [x0, 64]
>         ldr     q0, [x1, 96]
>         str     q0, [x0, 96]
>         ldr     q0, [x1, 111]
>         str     q0, [x0, 111]
>         ret
> cpy_128(char*, char*):
>         ldp     q0, q1, [x1]
>         stp     q0, q1, [x0]
>         ldp     q0, q1, [x1, 32]
>         stp     q0, q1, [x0, 32]
>         ldp     q0, q1, [x1, 64]
>         stp     q0, q1, [x0, 64]
>         ldp     q0, q1, [x1, 96]
>         stp     q0, q1, [x0, 96]
>         ret
>
> which is a clear code size win. Speed optimisation heuristics remain
> unchanged.
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Pushing to trunk.
> Thanks,
> Kyrill
>
> 2021-09-29  Kyrylo Tkachov
>
>     * config/aarch64/aarch64.c (aarch64_expand_cpymem): Count number of
>     emitted operations and adjust heuristic for code size.
>
> 2021-09-29  Kyrylo Tkachov
>
>     * gcc.target/aarch64/cpymem-size.c: New test.

Hi Kyrill,

Just to mention that the new test fails with -mabi=ilp32...

Thanks,

Christophe
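
Editorial sketch of the kind of counting the description mentions. This is a deliberately simplified model that only reproduces the instruction counts of the two listings above (32-byte LDP/STP pairs, then a 16-byte LDR/STR, then one overlapping 16-byte LDR/STR for any tail); it is not the logic of aarch64_expand_cpymem. The function names and the cost of 2 for the library call (MOV-immediate plus the call) are assumptions.

```cpp
#include <cassert>

// Hypothetical model: number of load/store instructions an inline expansion
// would need to copy `n` bytes (the trailing `ret` is not counted).
unsigned inline_copy_insns(unsigned n) {
    unsigned insns = 0;
    while (n >= 32) {   // one LDP + one STP of two Q registers
        insns += 2;
        n -= 32;
    }
    if (n >= 16) {      // one LDR + one STR of a single Q register
        insns += 2;
        n -= 16;
    }
    if (n > 0)          // tail handled by one overlapping 16-byte copy
        insns += 2;
    return insns;
}

// Assumed size-optimisation rule: prefer the memcpy call when the inline
// sequence is longer than the call sequence (size MOV + call = 2).
bool prefer_libcall_for_size(unsigned n) {
    const unsigned libcall_insns = 2;
    return inline_copy_insns(n) > libcall_insns;
}
```

Under this model the 127-byte copy costs 10 instructions and the 128-byte copy costs 8, matching the listings, so both would go to the library call when optimising for size.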
[PATCH] c++: defaulted comparisons and vptr fields [PR95567]
We need to skip over vptr fields when synthesizing a defaulted comparison
operator.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/11?

    PR c++/95567

gcc/cp/ChangeLog:

    * method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields.

gcc/testsuite/ChangeLog:

    * g++.dg/cpp2a/spaceship-virtual1.C: New test.
---
 gcc/cp/method.c                               |  4 ++++
 .../g++.dg/cpp2a/spaceship-virtual1.C         | 20 ++++++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 32f7186a774..3c3495227ce 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1426,6 +1426,10 @@ build_comparison_op (tree fndecl, tsubst_flags_t complain)
            field;
            field = next_initializable_field (DECL_CHAIN (field)))
         {
+          if (DECL_VIRTUAL_P (field))
+            /* Don't compare vptr fields.  */
+            continue;
+
           tree expr_type = TREE_TYPE (field);
           location_t field_loc = DECL_SOURCE_LOCATION (field);

diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C b/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C
new file mode 100644
index 000..8067d3cd9d1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C
@@ -0,0 +1,20 @@
+// PR c++/95567
+// { dg-do run { target c++20 } }
+
+struct B {
+  B(int i) : i(i) {}
+  virtual ~B() = default;
+
+  bool operator==(B const&) const = default;
+  int i;
+};
+
+struct D : B {
+  D(int i, int j) : B(i), j(j) {}
+  int j;
+};
+
+int main() {
+  if (B(2) != D(2, 3))
+    __builtin_abort();
+}
--
2.33.0.610.gcefe983a32
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
> On Sep 30, 2021, at 1:54 AM, Richard Biener wrote: > > On Thu, 30 Sep 2021, Jason Merrill wrote: > >> On 9/29/21 17:30, Qing Zhao wrote: >>> Hi, >>> >>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a) >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359 >>> >>> Is due to -ftrivial-auto-var-init adding initialization for READONLY >>> variable “this” in the following routine: (t.cpp.005t.original) >>> >>> === >>> >>> ;; Function A::foo():: (null) >>> ;; enabled by -tree-original >>> >>> { >>> const struct A * const this [value-expr: &__closure->__this]; >>> const struct A * const this [value-expr: &__closure->__this]; >>> return = (double) ((const struct A *) this)->a; >>> } >>> === >>> >>> However, in the above routine, “this” is NOT marked as READONLY, but its >>> value-expr "&__closure->__this” is marked as READONLY. >>> >>> There are two major issues: >>> >>> 1. In the routine “is_var_need_auto_init”, we should exclude “decl” that is >>> marked as READONLY; >>> 2. In the C++ FE, “this” should be marked as READONLY. >>> >>> The idea solution will be: >>> >>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl); >>> 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true; >>> >>> Not sure whether it’s hard for C++ FE to fix the 2nd issue or not? >>> >>> In the case it’s not a quick fix in C++FE, I proposed the following fix in >>> middle end: >>> >>> Let me know your comments or suggestions on this. >>> >>> Thanks a lot for the help. >> >> I'd think is_var_need_auto_init should be false for any variable with >> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of naming >> objects that are initialized elsewhere. > > IIRC handing variables with DECL_HAS_VALUE_EXPR_P is necessary to > auto-init VLAs, Yes, that’s correct. i.e, when adding call to .DEFFERED_INIT to auto variables, DECL for a VLA already has a DECL_VALUE_EXPR. > otherwise I tend to agree - would we handle those > when we see a DECL_EXPR then? 
You mean, for VLA DECL? YES, we added a call to .DEFFERED_INIT for VLA DECL right now. Qing > >> >>> Qing >>> >>> == >>> From 0a5982cd61bc4610655d3df00ae8d2fbcb3c8e9b Mon Sep 17 00:00:00 2001 >>> From: Qing Zhao >>> Date: Wed, 29 Sep 2021 20:49:59 + >>> Subject: [PATCH] Fix PR102359 >>> >>> --- >>> gcc/gimplify.c | 15 +++ >>> gcc/testsuite/g++.dg/pr102359.C | 13 + >>> 2 files changed, 28 insertions(+) >>> create mode 100644 gcc/testsuite/g++.dg/pr102359.C >>> >>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >>> index 1067113b1639..a2587869b35d 100644 >>> --- a/gcc/gimplify.c >>> +++ b/gcc/gimplify.c >>> @@ -1819,12 +1819,27 @@ gimple_add_padding_init_for_auto_var (tree decl, >>> bool is_vla, >>>gimplify_seq_add_stmt (seq_p, call); >>> } >>> >>> +/* Return true if the DECL is READONLY. >>> + This is to workaround a C++ FE bug that only mark the value_expr of >>> "this" >>> + as readonly but does not mark "this" as readonly. >>> + C++ FE should fix this issue before replacing this routine with >>> + TREE_READONLY (decl). */ >>> + >>> +static bool >>> +is_decl_readonly (tree decl) >>> +{ >>> + return (TREE_READONLY (decl) >>> + || (DECL_HAS_VALUE_EXPR_P (decl) >>> +&& TREE_READONLY (DECL_VALUE_EXPR (decl; >>> +} >>> + >>> /* Return true if the DECL need to be automaticly initialized by the >>> compiler. */ >>> static bool >>> is_var_need_auto_init (tree decl) >>> { >>>if (auto_var_p (decl) >>> + && !is_decl_readonly (decl) >>>&& (flag_auto_var_init > AUTO_INIT_UNINITIALIZED) >>>&& (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl))) >>>&& !is_empty_type (TREE_TYPE (decl))) >>> diff --git a/gcc/testsuite/g++.dg/pr102359.C >>> b/gcc/testsuite/g++.dg/pr102359.C >>> new file mode 100644 >>> index ..da643cde7bed >>> --- /dev/null >>> +++ b/gcc/testsuite/g++.dg/pr102359.C >>> @@ -0,0 +1,13 @@ >>> +/* PR middle-end/102359 ICE gimplification failed since >>> + r12-3433-ga25e0b5e6ac8a77a. 
*/ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-ftrivial-auto-var-init=zero" } */ >>> +/* { dg-require-effective-target c++17 } */ >>> + >>> +struct A { >>> + double a = 111; >>> + auto foo() { >>> +return [*this] { return a; }; >>> + } >>> +}; >>> +int X = A{}.foo()(); >>> >> >> > > -- > Richard Biener > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Re: [PATCH gcc-11 0/2] Backport kpatch changes
On 9/30/21 10:50, Ilya Leoshkevich wrote:
> Hi,
>
> This series contains a backport of kpatch changes needed to support
> https://github.com/dynup/kpatch/pull/1203 so that it could be used in
> RHEL 9. The patches have been in master for 4 months now without
> issues.
>
> Bootstrapped and regtested on s390x-redhat-linux.
>
> Ok for gcc-11?

Ok for both. Thanks!

Andreas
[PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
This patch addresses one of my leftovers from GCC 11. C++20 introduced [depr.array.comp]: "Equality and relational comparisons between two operands of array type are deprecated." so this patch adds -Wdeprecated-array-compare (enabled by default in C++20). Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/97573 gcc/c-family/ChangeLog: * c-opts.c (c_common_post_options): In C++20, turn on -Wdeprecated-array-compare. * c.opt (Wdeprecated-array-compare): New option. gcc/cp/ChangeLog: * typeck.c (do_warn_deprecated_array_compare): New. (cp_build_binary_op): Call it for equality and relational comparisons. gcc/ChangeLog: * doc/invoke.texi: Document -Wdeprecated-array-compare. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr15791-1.C: Add dg-warning. * g++.dg/cpp2a/array-comp1.C: New test. * g++.dg/cpp2a/array-comp2.C: New test. * g++.dg/cpp2a/array-comp3.C: New test. --- gcc/c-family/c-opts.c | 5 gcc/c-family/c.opt| 4 +++ gcc/cp/typeck.c | 28 +++ gcc/doc/invoke.texi | 19 - gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- 8 files changed, 150 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 3eaab5e1530..00b52cc5e12 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) warn_deprecated_enum_float_conv, cxx_dialect >= cxx20 && warn_deprecated); + /* -Wdeprecated-array-compare is enabled by default in C++20. */ + SET_OPTION_IF_UNSET (&global_options, &global_options_set, + warn_deprecated_array_compare, + cxx_dialect >= cxx20 && warn_deprecated); + /* Declone C++ 'structors if -Os. 
*/ if (flag_declone_ctor_dtor == -1) flag_declone_ctor_dtor = optimize_size; diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9c151d19870..a4f0ea68594 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -540,6 +540,10 @@ Wdeprecated C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) ; Documented in common.opt +Wdeprecated-array-compare +C++ ObjC++ Var(warn_deprecated_array_compare) Warning +Warn about deprecated comparisons between two operands of array type. + Wdeprecated-copy C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, Wextra) Mark implicitly-declared copy operations as deprecated if the class has a diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index a2398dbe660..1e3a41104d6 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see #include "attribs.h" #include "asan.h" #include "gimplify.h" +#include "tree-pretty-print.h" static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); static tree cp_build_function_call (tree, tree, tsubst_flags_t); @@ -4725,6 +4726,21 @@ do_warn_enum_conversions (location_t loc, enum tree_code code, tree type0, } } +/* Warn about C++20 [depr.array.comp] array comparisons: "Equality + and relational comparisons between two operands of array type are + deprecated." */ + +static inline void +do_warn_deprecated_array_compare (location_t location, tree_code code, + tree op0, tree op1) +{ + if (warning_at (location, OPT_Wdeprecated_array_compare, + "comparison between two arrays is deprecated")) +inform (location, "use unary %<+%> which decays operands to pointers " + "or %<&%D[0] %s &%D[0]%> to compare the addresses", + op0, op_symbol_code (code), op1); +} + /* Build a binary-operation expression without default conversions. CODE is the kind of expression to build. LOCATION is the location_t of the operator in the source code. 
@@ -5289,6 +5305,11 @@ cp_build_binary_op (const op_location_t &location, warning_at (location, OPT_Waddress, "comparison with string literal results in " "unspecified behavior"); + else if (TREE_CODE (TREE_TYPE (orig_op0)) == ARRAY_TYPE + && TREE_CODE (TREE_TYPE (orig_op1)) == ARRAY_TYPE) + do_warn_deprecated_array_compare (location, code, + stripped_orig_op0, + stripped_orig_op1); } build_type = boolean
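
Editorial sketch of why the comparison is flagged: with two array operands, == compares the addresses of the first elements, not the contents. The two helpers below (hypothetical names) spell out each intent explicitly; the first uses the &a[0] form that the patch's note suggests, the second compares element by element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Address comparison with explicit decay: true only for the same array object.
template <typename T, std::size_t N, std::size_t M>
bool same_array_object(const T (&a)[N], const T (&b)[M]) {
    return &a[0] == &b[0];
}

// Element-wise comparison of two arrays of equal length.
template <typename T, std::size_t N>
bool same_contents(const T (&a)[N], const T (&b)[N]) {
    return std::equal(std::begin(a), std::end(a), std::begin(b));
}
```

Two distinct arrays holding the same values are equal under same_contents but not under same_array_object, which is the distinction a plain a == b hides.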
Re: [PATCH] c-format: Add -Wformat-same-precision option [PR80060]
On 9/26/21 3:52 PM, Daniil Stas via Gcc-patches wrote:

This option is enabled by default when -Wformat option is enabled. A user can specify -Wno-format-same-precision to disable emitting warnings about an argument passed to printf-like function having a different type from the one specified in the format string if the types precisions are the same.

Having an option to control this -Wformat aspect seems useful so just a few comments mostly on the wording/naming choices.

Coming up with good names is tricky but I wonder if we can find one that's clearer than "-Wformat-same-precision".

Precision can mean a few different things in this context: in the representation of integers it refers to the number of value bits. In that of floating types, it refers to the number of significand bits. And in printf directives, it refers to what comes after the optional period and what controls the minimum number of digits to format (or maximum number of characters in a string). So "same precision" seems rather vague (and the proposed manual entry doesn't make it clear).

IIUC, the option is specifically for directives that take integer arguments and controls whether using an argument of an incompatible integer type to a conversion specifier like i or x is diagnosed when the argument has the same precision as the expected type. With that in mind, would mentioning the word integer (or just int for short) be an improvement? E.g., -Wformat-int-precision?

Some more comments on the documentation text are below.

Signed-off-by: Daniil Stas

gcc/c-family/ChangeLog:

	* c-format.c (check_format_types): Don't emit warnings about type differences with the format string if -Wno-format-same-precision is specified and the types have the same precision.
	* c.opt: Add -Wformat-same-precision option.

gcc/ChangeLog:

	* doc/invoke.texi: Add -Wformat-same-precision option description.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wformat-same-precision-1.c: New test.
* c-c++-common/Wformat-same-precision-2.c: New test. --- gcc/c-family/c-format.c | 2 +- gcc/c-family/c.opt| 5 + gcc/doc/invoke.texi | 8 +++- gcc/testsuite/c-c++-common/Wformat-same-precision-1.c | 7 +++ gcc/testsuite/c-c++-common/Wformat-same-precision-2.c | 7 +++ 5 files changed, 27 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/Wformat-same-precision-1.c create mode 100644 gcc/testsuite/c-c++-common/Wformat-same-precision-2.c diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c index b4cb765a9d3..07cdcefbef8 100644 --- a/gcc/c-family/c-format.c +++ b/gcc/c-family/c-format.c @@ -4243,7 +4243,7 @@ check_format_types (const substring_loc &fmt_loc, && (!pedantic || i < 2) && char_type_flag) continue; - if (types->scalar_identity_flag + if ((types->scalar_identity_flag || !warn_format_same_precision) && (TREE_CODE (cur_type) == TREE_CODE (wanted_type) || (INTEGRAL_TYPE_P (cur_type) && INTEGRAL_TYPE_P (wanted_type))) diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9c151d19870..e7af7365c91 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -656,6 +656,11 @@ C ObjC C++ LTO ObjC++ Warning Alias(Wformat-overflow=, 1, 0) IntegerRange(0, 2) Warn about function calls with format strings that write past the end of the destination region. Same as -Wformat-overflow=1. +Wformat-same-precision +C ObjC C++ ObjC++ Var(warn_format_same_precision) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=,warn_format >= 1, 0) +Warn about type differences with the format string even if the types +precision is the same. The grammar doesn't seem quite right here (I recommend to adjust the text as well along similar lines as the manual, except more brief as is customary here). + Wformat-security C ObjC C++ ObjC++ Var(warn_format_security) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=, warn_format >= 2, 0) Warn about possible security problems with format functions. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ba98eab68a5..8833f257d75 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -347,7 +347,7 @@ Objective-C and Objective-C++ Dialects}. -Werror -Werror=* -Wexpansion-to-defined -Wfatal-errors @gol -Wfloat-conversion -Wfloat-equal -Wformat -Wformat=2 @gol -Wno-format-contains-nul -Wno-format-extra-args @gol --Wformat-nonliteral -Wformat-overflow=@var{n} @gol +-Wformat-nonliteral -Wformat-overflow=@var{n} -Wformat-same-precision @gol -Wformat-security -Wformat-signedness -Wformat-truncation=@var{n} @gol -Wformat-y2k -Wframe-address @gol -Wframe-larger-than=@var{byte-size} -Wno-free-nonheap-object @gol @@ -6054,6 +6054,12 @@ If @option{-Wformat} is specified, also warn if the format string is not
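To make the "same precision" wording discussed in the review a bit more concrete, here is a rough sketch of what the comparison means for integer types. It is an assumption-laden illustration, not the c-format.c logic: the helper names are invented, and the bit width computed from std::numeric_limits only approximates what GCC calls TYPE_PRECISION.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: width in bits of an integer type, counting the
// sign bit.  This is only an approximation of GCC's TYPE_PRECISION.
template <typename T>
constexpr int integer_width ()
{
  return std::numeric_limits<T>::digits
	 + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// True when the type passed to a printf-like function and the type the
// directive expects have the same width, which is the case the proposed
// option would let users silence.
template <typename Passed, typename Wanted>
constexpr bool same_integer_precision ()
{
  return integer_width<Passed> () == integer_width<Wanted> ();
}

// Small usage example.
void example_usage ()
{
  assert ((same_integer_precision<int, unsigned int> ()));
  assert ((!same_integer_precision<signed char, long long> ()));
}
```

Whether, say, long and long long have the same width depends on the target (they do on LP64 but not on ILP32), which is why a mismatch such as passing a long long to %ld may or may not be a same-precision case.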
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
> On Sep 30, 2021, at 1:54 AM, Richard Biener wrote: > > On Thu, 30 Sep 2021, Jason Merrill wrote: > >> On 9/29/21 17:30, Qing Zhao wrote: >>> Hi, >>> >>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a) >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359 >>> >>> Is due to -ftrivial-auto-var-init adding initialization for READONLY >>> variable “this” in the following routine: (t.cpp.005t.original) >>> >>> === >>> >>> ;; Function A::foo():: (null) >>> ;; enabled by -tree-original >>> >>> { >>> const struct A * const this [value-expr: &__closure->__this]; >>>const struct A * const this [value-expr: &__closure->__this]; >>> return = (double) ((const struct A *) this)->a; >>> } >>> === >>> >>> However, in the above routine, “this” is NOT marked as READONLY, but its >>> value-expr "&__closure->__this” is marked as READONLY. >>> >>> There are two major issues: >>> >>> 1. In the routine “is_var_need_auto_init”, we should exclude “decl” that is >>> marked as READONLY; >>> 2. In the C++ FE, “this” should be marked as READONLY. >>> >>> The idea solution will be: >>> >>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl); >>> 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true; >>> >>> Not sure whether it’s hard for C++ FE to fix the 2nd issue or not? >>> >>> In the case it’s not a quick fix in C++FE, I proposed the following fix in >>> middle end: >>> >>> Let me know your comments or suggestions on this. >>> >>> Thanks a lot for the help. >> >> I'd think is_var_need_auto_init should be false for any variable with >> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of naming >> objects that are initialized elsewhere. > > IIRC handing variables with DECL_HAS_VALUE_EXPR_P is necessary to > auto-init VLAs, otherwise I tend to agree - would we handle those > when we see a DECL_EXPR then? 
The current implementation is:

gimplify_decl_expr:

For each DECL_EXPR “decl”
  if (VAR_P (decl) && !DECL_EXTERNAL (decl))
    {
      if (is_vla (decl))
        gimplify_vla_decl (decl, …);  /* existing handling: create a VALUE_EXPR for this vla decl */
      …
      if (has_explicit_init (decl))
        {
          …;  /* existing handling.  */
        }
      else if (is_var_need_auto_init (decl))  /* new code.  */
        {
          gimple_add_init_for_auto_var (….);  /* new code.  */
          ...
        }
    }

Since the “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it will not be scanned, so no initialization will be added for it. If we do not add initialization for a decl that has a DECL_VALUE_EXPR, then no initialization will be added for the “DECL_VALUE_EXPR (decl)” either, and we will miss initializing these decls.

So, I think that the current implementation is correct.

And if the C++ FE does not mark “this” as READONLY and only marks DECL_VALUE_EXPR (this) as READONLY, the proposed fix is correct too.

Let me know your opinion on this.

Thanks.

Qing

>
>>
>>> Qing
>>>
>>> ==
>>> From 0a5982cd61bc4610655d3df00ae8d2fbcb3c8e9b Mon Sep 17 00:00:00 2001
>>> From: Qing Zhao
>>> Date: Wed, 29 Sep 2021 20:49:59 +0000
>>> Subject: [PATCH] Fix PR102359
>>>
>>> ---
>>>  gcc/gimplify.c                  | 15 +++++++++++++++
>>>  gcc/testsuite/g++.dg/pr102359.C | 13 +++++++++++++
>>>  2 files changed, 28 insertions(+)
>>>  create mode 100644 gcc/testsuite/g++.dg/pr102359.C
>>>
>>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>>> index 1067113b1639..a2587869b35d 100644
>>> --- a/gcc/gimplify.c
>>> +++ b/gcc/gimplify.c
>>> @@ -1819,12 +1819,27 @@ gimple_add_padding_init_for_auto_var (tree decl,
>>> bool is_vla,
>>>    gimplify_seq_add_stmt (seq_p, call);
>>>  }
>>>
>>> +/* Return true if the DECL is READONLY.
>>> +   This is to workaround a C++ FE bug that only mark the value_expr of
>>> "this"
>>> +   as readonly but does not mark "this" as readonly.
>>> +   C++ FE should fix this issue before replacing this routine with
>>> +   TREE_READONLY (decl).  */
>>> +
>>> +static bool
>>> +is_decl_readonly (tree decl)
>>> +{
>>> +  return (TREE_READONLY (decl)
>>> +          || (DECL_HAS_VALUE_EXPR_P (decl)
>>> +              && TREE_READONLY (DECL_VALUE_EXPR (decl))));
>>> +}
>>> +
>>>  /* Return true if the DECL need to be automaticly initialized by the
>>>     compiler.  */
>>>  static bool
>>>  is_var_need_auto_init (tree decl)
>>>  {
>>>    if (auto_var_p (decl)
>>> +      && !is_decl_readonly (decl)
>>>        && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
>>>        && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
>>>        && !is_empty_type (TREE_TYPE (decl)))
>>> diff --git a/gcc/testsuite/g++.dg/pr102359.C
>>> b/gcc/testsuite/g++.dg/pr102359.C
>>> new file mode 100644
>>> index 000000000000..da643cde7bed
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/pr102359.C
>>> @@ -0,0 +1,13 @@
>>> +/* PR middle-end/102359 ICE gimplification failed since
>>> +   r12-3433-ga25e0b5
Re: [PATCH] c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
On Thu, Sep 30, 2021 at 08:06:52AM -0400, Jason Merrill wrote: > Hmm, what if decl has the tls_model attribute? > > We could decide not to push the alias for a thread-local variable when > processing_template_decl, like we don't if the type is dependent; in either > case we'll push it at instantiation time. So like this (assuming it passes full bootstrap/regtest, so far it passed tls.exp)? 2021-09-28 Jakub Jelinek PR c++/102496 * name-lookup.c (push_local_extern_decl_alias): Return early even for tls vars with non-dependent type when processing_template_decl. For CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias. * g++.dg/tls/pr102496-1.C: New test. * g++.dg/tls/pr102496-2.C: New test. --- gcc/cp/name-lookup.c.jj 2021-09-29 10:07:28.838061585 +0200 +++ gcc/cp/name-lookup.c2021-09-30 17:30:46.010100552 +0200 @@ -3375,7 +3375,10 @@ set_decl_context_in_fn (tree ctx, tree d void push_local_extern_decl_alias (tree decl) { - if (dependent_type_p (TREE_TYPE (decl))) + if (dependent_type_p (TREE_TYPE (decl)) + || (processing_template_decl + && VAR_P (decl) + && CP_DECL_THREAD_LOCAL_P (decl))) return; /* EH specs were not part of the function type prior to c++17, but we still can't go pushing dependent eh specs into the namespace. 
*/ @@ -3471,6 +3474,8 @@ push_local_extern_decl_alias (tree decl) push_nested_namespace (ns); alias = do_pushdecl (alias, /* hiding= */true); pop_nested_namespace (ns); + if (VAR_P (decl) && CP_DECL_THREAD_LOCAL_P (decl)) + set_decl_tls_model (alias, DECL_TLS_MODEL (decl)); } } --- gcc/testsuite/g++.dg/tls/pr102496-1.C.jj2021-09-30 17:22:47.867769063 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-1.C 2021-09-30 17:22:47.867769063 +0200 @@ -0,0 +1,20 @@ +// PR c++/102496 +// { dg-do link { target c++11 } } +// { dg-require-effective-target tls } +// { dg-add-options tls } +// { dg-additional-sources pr102496-2.C } + +template +int +foo () +{ + extern __thread int t1; + return t1; +} + +int +main () +{ + extern __thread int t2; + return foo <0> () + t2; +} --- gcc/testsuite/g++.dg/tls/pr102496-2.C.jj2021-09-30 17:22:47.867769063 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-2.C 2021-09-30 17:22:47.867769063 +0200 @@ -0,0 +1,6 @@ +// PR c++/102496 +// { dg-do compile { target c++11 } } +// { dg-require-effective-target tls } + +__thread int t1; +__thread int t2; Jakub
Re: [PATCH] c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
On 9/30/21 11:45, Jakub Jelinek wrote: On Thu, Sep 30, 2021 at 08:06:52AM -0400, Jason Merrill wrote: Hmm, what if decl has the tls_model attribute? We could decide not to push the alias for a thread-local variable when processing_template_decl, like we don't if the type is dependent; in either case we'll push it at instantiation time. So like this (assuming it passes full bootstrap/regtest, so far it passed tls.exp)? OK. 2021-09-28 Jakub Jelinek PR c++/102496 * name-lookup.c (push_local_extern_decl_alias): Return early even for tls vars with non-dependent type when processing_template_decl. For CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias. * g++.dg/tls/pr102496-1.C: New test. * g++.dg/tls/pr102496-2.C: New test. --- gcc/cp/name-lookup.c.jj 2021-09-29 10:07:28.838061585 +0200 +++ gcc/cp/name-lookup.c2021-09-30 17:30:46.010100552 +0200 @@ -3375,7 +3375,10 @@ set_decl_context_in_fn (tree ctx, tree d void push_local_extern_decl_alias (tree decl) { - if (dependent_type_p (TREE_TYPE (decl))) + if (dependent_type_p (TREE_TYPE (decl)) + || (processing_template_decl + && VAR_P (decl) + && CP_DECL_THREAD_LOCAL_P (decl))) return; /* EH specs were not part of the function type prior to c++17, but we still can't go pushing dependent eh specs into the namespace. 
*/ @@ -3471,6 +3474,8 @@ push_local_extern_decl_alias (tree decl) push_nested_namespace (ns); alias = do_pushdecl (alias, /* hiding= */true); pop_nested_namespace (ns); + if (VAR_P (decl) && CP_DECL_THREAD_LOCAL_P (decl)) + set_decl_tls_model (alias, DECL_TLS_MODEL (decl)); } } --- gcc/testsuite/g++.dg/tls/pr102496-1.C.jj 2021-09-30 17:22:47.867769063 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-1.C 2021-09-30 17:22:47.867769063 +0200 @@ -0,0 +1,20 @@ +// PR c++/102496 +// { dg-do link { target c++11 } } +// { dg-require-effective-target tls } +// { dg-add-options tls } +// { dg-additional-sources pr102496-2.C } + +template +int +foo () +{ + extern __thread int t1; + return t1; +} + +int +main () +{ + extern __thread int t2; + return foo <0> () + t2; +} --- gcc/testsuite/g++.dg/tls/pr102496-2.C.jj2021-09-30 17:22:47.867769063 +0200 +++ gcc/testsuite/g++.dg/tls/pr102496-2.C 2021-09-30 17:22:47.867769063 +0200 @@ -0,0 +1,6 @@ +// PR c++/102496 +// { dg-do compile { target c++11 } } +// { dg-require-effective-target tls } + +__thread int t1; +__thread int t2; Jakub
PING #2 [PATCH] warn for more impossible null pointer tests [PR102103]
Jason, since you approved the C++ changes, would you mind looking over the C bits and if they look good to you giving me the green light to commit the patch? https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html Thanks in advance for your help! On 9/24/21 8:31 AM, Martin Sebor wrote: Ping: Jeff, with the C++ part approved, can you please confirm your approval with the C parts of the patch? https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html On 9/21/21 6:34 PM, Martin Sebor wrote: On 9/21/21 3:40 PM, Jason Merrill wrote: The C++ changes are OK. Jeff, should I take your previous "Generally OK" as an approval for the rest of the patch as well? (It has not changed in v2.) I have just submitted a Glibc patch to suppress the new instances there. Martin
[committed] libphobos: Define main function as extern(C) when compiling without D runtime (PR102476)
Hi,

This patch defines the default main function supplied when compiling with `-fmain' as extern(C) when compiling without the D runtime. The default linkage is extern(D); however, this does not work when mixing `-fmain' together with `-fno-druntime'.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline.

Regards
Iain

---
gcc/testsuite/ChangeLog:

	PR d/102476
	* gdc.dg/pr102476.d: New test.

libphobos/ChangeLog:

	PR d/102476
	* libdruntime/__main.di: Define main function as extern(C) when compiling without D runtime.
---
 gcc/testsuite/gdc.dg/pr102476.d |  3 +++
 libphobos/libdruntime/__main.di | 14 ++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr102476.d

diff --git a/gcc/testsuite/gdc.dg/pr102476.d b/gcc/testsuite/gdc.dg/pr102476.d
new file mode 100644
index 00000000000..543716eb37d
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr102476.d
@@ -0,0 +1,3 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102476
+// { dg-do link }
+// { dg-options "-fmain -fno-druntime" }
diff --git a/libphobos/libdruntime/__main.di b/libphobos/libdruntime/__main.di
index 8062bf4d1e8..ab1264b98f1 100644
--- a/libphobos/libdruntime/__main.di
+++ b/libphobos/libdruntime/__main.di
@@ -20,7 +20,17 @@
 
 module __main;
 
-int main(char[][])
+version (D_BetterC)
 {
-    return 0;
+    extern (C) int main(int, char**)
+    {
+        return 0;
+    }
+}
+else
+{
+    int main(char[][])
+    {
+        return 0;
+    }
 }
-- 
2.30.2
[committed] libphobos: Give _Unwind_Exception an alignment that best resembles __attribute__((aligned))
Hi, This patch gives the definition of _Unwind_Exception on the D side a suitable alignment. For interoperability with C++ EH, the alignment should match, otherwise D may not be able to intercept exceptions thrown from C++. Ideally the correct alignment should be exposed by the compiler, but for now this is good enough for current supported targets. Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline. Regards Iain --- libphobos/ChangeLog: * libdruntime/gcc/unwind/generic.d (__aligned__): Define. (_Unwind_Exception): Align struct to __aligned__. --- libphobos/libdruntime/gcc/unwind/generic.d | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/libphobos/libdruntime/gcc/unwind/generic.d b/libphobos/libdruntime/gcc/unwind/generic.d index 592b3afcb71..68ddd1d5410 100644 --- a/libphobos/libdruntime/gcc/unwind/generic.d +++ b/libphobos/libdruntime/gcc/unwind/generic.d @@ -123,7 +123,27 @@ enum : _Unwind_Reason_Code // @@@ The IA-64 ABI says that this structure must be double-word aligned. // Taking that literally does not make much sense generically. Instead we // provide the maximum alignment required by any type for the machine. 
-struct _Unwind_Exception + version (ARM) private enum __aligned__ = 8; +else version (AArch64) private enum __aligned__ = 16; +else version (HPPA) private enum __aligned__ = 8; +else version (HPPA64) private enum __aligned__ = 16; +else version (MIPS_N32) private enum __aligned__ = 16; +else version (MIPS_N64) private enum __aligned__ = 16; +else version (MIPS32) private enum __aligned__ = 8; +else version (MIPS64) private enum __aligned__ = 8; +else version (PPC) private enum __aligned__ = 16; +else version (PPC64)private enum __aligned__ = 16; +else version (RISCV32) private enum __aligned__ = 16; +else version (RISCV64) private enum __aligned__ = 16; +else version (S390) private enum __aligned__ = 8; +else version (SPARC)private enum __aligned__ = 8; +else version (SPARC64) private enum __aligned__ = 16; +else version (SystemZ) private enum __aligned__ = 8; +else version (X86) private enum __aligned__ = 16; +else version (X86_64) private enum __aligned__ = 16; +else static assert( false, "Platform not supported."); + +align(__aligned__) struct _Unwind_Exception { _Unwind_Exception_Class exception_class; _Unwind_Exception_Cleanup_Fn exception_cleanup; -- 2.30.2
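For comparison, the C/C++ side gets this alignment from `__attribute__((aligned))` on the struct. A portable approximation can be sketched as below. Note the assumptions: the struct is a hypothetical stand-in (not the real unwind.h definition), and alignas (std::max_align_t) is used here only as a stand-in for the target's maximum alignment, which may differ from what the attribute yields on a given target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the unwinder's exception header.  The real
// layout lives in unwind.h; this only mirrors its general shape.
struct alignas (std::max_align_t) UnwindExceptionSketch
{
  std::uint64_t exception_class;
  void (*exception_cleanup) (int, UnwindExceptionSketch *);
  std::uintptr_t private_1;
  std::uintptr_t private_2;
};

// Returns true when P satisfies the alignment of the sketch struct; a
// mismatch between two languages' views of this alignment is the kind of
// problem the patch is guarding against.
bool is_aligned_for_unwinder (const void *p)
{
  return reinterpret_cast<std::uintptr_t> (p)
	 % alignof (UnwindExceptionSketch) == 0;
}

// Small usage example.
void example_usage ()
{
  UnwindExceptionSketch e{};
  assert (is_aligned_for_unwinder (&e));
}
```

The per-target table in the patch hard-codes the same kind of value on the D side because, as the message says, the compiler does not yet expose it.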
[committed] libphobos: Remove unused variables in gcc.backtrace.
Hi, This patch simplifies how core runtime constructs the LibBacktrace object in the event of a segfault during unittests. The core.runtime module always overrides the default parameter value for constructor calls. MaxAlignment is not required because a class can be created on the stack with the `scope' keyword. Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline. Regards Iain --- libphobos/ChangeLog: * libdruntime/core/runtime.d (runModuleUnitTests): Use scope to new LibBacktrace on the stack. * libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove. (LibBacktrace.MaxAlignment): Remove. (LibBacktrace.this): Remove default initialization of firstFrame. (UnwindBacktrace.this): Likewise. --- libphobos/libdruntime/core/runtime.d | 14 +++--- libphobos/libdruntime/gcc/backtrace.d | 24 ++-- 2 files changed, 5 insertions(+), 33 deletions(-) diff --git a/libphobos/libdruntime/core/runtime.d b/libphobos/libdruntime/core/runtime.d index 848b607ae69..5fc99046d23 100644 --- a/libphobos/libdruntime/core/runtime.d +++ b/libphobos/libdruntime/core/runtime.d @@ -483,17 +483,9 @@ extern (C) bool runModuleUnitTests() fprintf(stderr, "Segmentation fault while running unittests:\n"); fprintf(stderr, "\n"); -enum alignment = LibBacktrace.MaxAlignment; -enum classSize = __traits(classInstanceSize, LibBacktrace); - -void[classSize + alignment] bt_store = void; -void* alignedAddress = cast(byte*)((cast(size_t)(bt_store.ptr + alignment - 1)) -& ~(alignment - 1)); - -(alignedAddress[0 .. classSize]) = typeid(LibBacktrace).initializer[]; -auto bt = cast(LibBacktrace)(alignedAddress); -// First frame is LibBacktrace ctor. Second is signal handler, but include that for now -bt.__ctor(1); +// First frame is LibBacktrace ctor. Second is signal handler, +// but include that for now +scope bt = new LibBacktrace(1); foreach (size_t i, const(char[]) msg; bt) fprintf(stderr, "%s\n", msg.ptr ? 
msg.ptr : "???"); diff --git a/libphobos/libdruntime/gcc/backtrace.d b/libphobos/libdruntime/gcc/backtrace.d index 8f5582d7469..3c4d65f417f 100644 --- a/libphobos/libdruntime/gcc/backtrace.d +++ b/libphobos/libdruntime/gcc/backtrace.d @@ -24,24 +24,6 @@ module gcc.backtrace; import gcc.libbacktrace; -version (Posix) -{ -// NOTE: The first 5 frames with the current implementation are -// inside core.runtime and the object code, so eliminate -// these for readability. The alternative would be to -// exclude the first N frames that are in a list of -// mangled function names. -private enum FIRSTFRAME = 5; -} -else -{ -// NOTE: On Windows, the number of frames to exclude is based on -// whether the exception is user or system-generated, so -// it may be necessary to exclude a list of function names -// instead. -private enum FIRSTFRAME = 0; -} - // Max size per line of the traceback. private enum MAX_BUFSIZE = 1536; @@ -205,8 +187,6 @@ static if (BACKTRACE_SUPPORTED && !BACKTRACE_USES_MALLOC) // FIXME: state is never freed as libbacktrace doesn't provide a free function... public class LibBacktrace : Throwable.TraceInfo { -enum MaxAlignment = (void*).alignof; - static void initLibBacktrace() { if (!initialized) @@ -216,7 +196,7 @@ static if (BACKTRACE_SUPPORTED && !BACKTRACE_USES_MALLOC) } } -this(int firstFrame = FIRSTFRAME) +this(int firstFrame) { _firstFrame = firstFrame; @@ -365,7 +345,7 @@ else */ public class UnwindBacktrace : Throwable.TraceInfo { -this(int firstFrame = FIRSTFRAME) +this(int firstFrame) { _firstFrame = firstFrame; _callstack = getBacktrace(); -- 2.30.2
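The manual alignment code removed from runModuleUnitTests rounded the buffer address up to the next multiple of the alignment. As a reference for that arithmetic, here is a small sketch; the function name is invented for this example, and the formula is only valid when the alignment is a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the removed expression
//   (ptr + alignment - 1) & ~(alignment - 1)
// ALIGNMENT must be a power of two for the mask to be correct.
constexpr std::uintptr_t
align_up (std::uintptr_t address, std::uintptr_t alignment)
{
  return (address + alignment - 1) & ~(alignment - 1);
}

// Small usage example.
void example_usage ()
{
  assert (align_up (13, 8) == 16);  // rounded up to the next multiple of 8
  assert (align_up (16, 8) == 16);  // already aligned, unchanged
}
```

This is also why the old buffer was declared with classSize + alignment bytes: rounding up can skip as many as alignment - 1 bytes at the front. With `scope', the compiler takes care of both the storage and its alignment.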
[committed] libphobos: Print stacktrace before terminating program due to uncaught exception.
Hi,

This patch adjusts the `throw' entrypoint to print the stacktrace of an uncaught exception before terminating. By default, the D run-time has a top-level exception handler to catch anything that was uncaught by user code. However, when the `rt_trapExceptions' flag is cleared, this handler is not enabled, so the program would be aborted at this termination point without any information about the exception.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline.

Regards
Iain

---
libphobos/ChangeLog:

	* libdruntime/gcc/deh.d (_d_print_throwable): Declare.
	(_d_throw): Print stacktrace before terminating program due to uncaught exception.
---
 libphobos/libdruntime/gcc/deh.d | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/libphobos/libdruntime/gcc/deh.d b/libphobos/libdruntime/gcc/deh.d
index eb83751c59d..a7eb37cfd9e 100644
--- a/libphobos/libdruntime/gcc/deh.d
+++ b/libphobos/libdruntime/gcc/deh.d
@@ -34,6 +34,7 @@ extern(C)
 {
     int _d_isbaseof(ClassInfo, ClassInfo);
     void _d_createTrace(Object, void*);
+    void _d_print_throwable(Throwable t);
 }
 
 /**
@@ -510,7 +511,11 @@ extern(C) void _d_throw(Throwable object)
     // things, almost certainly we will have crashed before now, rather than
     // actually being able to diagnose the problem.
     if (r == _URC_END_OF_STACK)
+    {
+        __gdc_begin_catch(&eh.unwindHeader);
+        _d_print_throwable(object);
         terminate("uncaught exception", __LINE__);
+    }
 
     terminate("unwind error", __LINE__);
 }
-- 
2.30.2
[committed] libphobos: Select the appropriate exception handler in getClassInfo
Hi,

This patch makes getClassInfo analogous to __gdc_personality, which ignores in-flight exceptions that we haven't collided with yet.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline.

Regards
Iain

---
libphobos/ChangeLog:

	* libdruntime/gcc/deh.d (ExceptionHeader.getClassInfo): Move to...
	(getClassInfo): ...here as free function. Add lsda parameter.
	(scanLSDA): Pass lsda to actionTableLookup.
	(actionTableLookup): Add lsda parameter, pass to getClassInfo.
	(__gdc_personality): Remove currentCfa variable.
---
libphobos/libdruntime/gcc/deh.d | 74 - 1 file changed, 44 insertions(+), 30 deletions(-) diff --git a/libphobos/libdruntime/gcc/deh.d b/libphobos/libdruntime/gcc/deh.d index a7eb37cfd9e..ba57fed38dc 100644 --- a/libphobos/libdruntime/gcc/deh.d +++ b/libphobos/libdruntime/gcc/deh.d @@ -279,26 +279,6 @@ struct ExceptionHeader } } -/** - * Look at the chain of inflight exceptions and pick the class type that'll - * be looked for in catch clauses. - */ -static ClassInfo getClassInfo(_Unwind_Exception* unwindHeader) @nogc -{ -ExceptionHeader* eh = toExceptionHeader(unwindHeader); -// The first thrown Exception at the top of the stack takes precedence -// over others that are inflight, unless an Error was thrown, in which -// case, we search for error handlers instead. -Throwable ehobject = eh.object; -for (ExceptionHeader* ehn = eh.next; ehn; ehn = ehn.next) -{ -Error e = cast(Error)ehobject; -if (e is null || (cast(Error)ehn.object) !is null) -ehobject = ehn.object; -} -return ehobject.classinfo; -} - /** * Convert from pointer to unwindHeader to pointer to ExceptionHeader * that it is embedded inside of. @@ -666,7 +646,7 @@ _Unwind_Reason_Code scanLSDA(const(ubyte)* lsda, _Unwind_Exception_Class excepti { // Otherwise we have a catch handler or exception specification.
handler = actionTableLookup(actions, unwindHeader, actionRecord, -exceptionClass, TTypeBase, +lsda, exceptionClass, TTypeBase, TType, TTypeEncoding, saw_handler, saw_cleanup); } @@ -694,7 +674,8 @@ _Unwind_Reason_Code scanLSDA(const(ubyte)* lsda, _Unwind_Exception_Class excepti * Look up and return the handler index of the classType in Action Table. */ int actionTableLookup(_Unwind_Action actions, _Unwind_Exception* unwindHeader, - const(ubyte)* actionRecord, _Unwind_Exception_Class exceptionClass, + const(ubyte)* actionRecord, const(ubyte)* lsda, + _Unwind_Exception_Class exceptionClass, _Unwind_Ptr TTypeBase, const(ubyte)* TType, ubyte TTypeEncoding, out bool saw_handler, out bool saw_cleanup) @@ -702,7 +683,7 @@ int actionTableLookup(_Unwind_Action actions, _Unwind_Exception* unwindHeader, ClassInfo thrownType; if (isGdcExceptionClass(exceptionClass)) { -thrownType = ExceptionHeader.getClassInfo(unwindHeader); +thrownType = getClassInfo(unwindHeader, lsda); } while (1) @@ -778,6 +759,41 @@ int actionTableLookup(_Unwind_Action actions, _Unwind_Exception* unwindHeader, return 0; } +/** + * Look at the chain of inflight exceptions and pick the class type that'll + * be looked for in catch clauses. + */ +ClassInfo getClassInfo(_Unwind_Exception* unwindHeader, + const(ubyte)* currentLsd) @nogc +{ +ExceptionHeader* eh = ExceptionHeader.toExceptionHeader(unwindHeader); +// The first thrown Exception at the top of the stack takes precedence +// over others that are inflight, unless an Error was thrown, in which +// case, we search for error handlers instead. +Throwable ehobject = eh.object; +for (ExceptionHeader* ehn = eh.next; ehn; ehn = ehn.next) +{ +const(ubyte)* nextLsd = void; +_Unwind_Ptr nextLandingPad = void; +_Unwind_Word nextCfa = void; +int nextHandler = void; + +ExceptionHeader.restore(&ehn.unwindHeader, nextHandler, nextLsd, nextLandingPad, nextCfa); + +// Don't combine when the exceptions are from different functions. 
+if (currentLsd != nextLsd) +break; + +Error e = cast(Error)ehobject; +if (e is null || (cast(Error)ehn.object) !is null) +{ +currentLsd = nextLsd; +ehobject = ehn.object; +} +} +return ehobject.classinfo; +} + /** * Called when the personality function has found neither a cleanup or handler. * To support ARM EABI personality routines, that must also unwind the stack. @@ -934,
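The selection rule in the new getClassInfo can be hard to follow in flattened diff form, so here is a small model of it. This is a sketch under stated assumptions, not the druntime code: the record type and function name are invented, a plain bool stands in for the `cast(Error)' test, and the LSDA is passed in directly instead of being restored from the unwind header.

```cpp
#include <cassert>

// Hypothetical model of one in-flight exception record.
struct InFlight
{
  bool is_error;         // stands in for "object is a D Error"
  const void *lsda;      // LSDA of the function the exception is passing through
  const InFlight *next;  // next (older) in-flight exception, or nullptr
};

// Walk the chain starting at EH and return the record whose class would be
// matched against catch clauses.  The walk stops at the first record that
// belongs to a different function (different LSDA); within one function the
// older exception wins unless an Error has already been selected and the
// older one is not an Error.
const InFlight *
select_in_flight (const InFlight *eh, const void *current_lsda)
{
  const InFlight *chosen = eh;
  for (const InFlight *ehn = eh->next; ehn; ehn = ehn->next)
    {
      // Don't combine when the exceptions are from different functions.
      if (current_lsda != ehn->lsda)
	break;

      if (!chosen->is_error || ehn->is_error)
	{
	  current_lsda = ehn->lsda;
	  chosen = ehn;
	}
    }
  return chosen;
}

// Small usage example: an Error on top is not displaced by an older
// non-Error exception from the same function.
void example_usage ()
{
  int function_a = 0;
  InFlight older{false, &function_a, nullptr};
  InFlight newer{true, &function_a, &older};
  assert (select_in_flight (&newer, &function_a) == &newer);
}
```

The early break is the behavioral change: before the patch, the loop considered every in-flight exception regardless of which function it belonged to.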
Re: [PATCH] c++: defaulted comparisons and vptr fields [PR95567]
On 9/30/21 10:03, Patrick Palka wrote: We need to skip over vptr fields when synthesizing a defaulted comparison operator. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/11? OK. PR c++/95567 gcc/cp/ChangeLog: * method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/spaceship-virtual1.C: New test. --- gcc/cp/method.c | 4 .../g++.dg/cpp2a/spaceship-virtual1.C | 20 +++ 2 files changed, 24 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C diff --git a/gcc/cp/method.c b/gcc/cp/method.c index 32f7186a774..3c3495227ce 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -1426,6 +1426,10 @@ build_comparison_op (tree fndecl, tsubst_flags_t complain) field; field = next_initializable_field (DECL_CHAIN (field))) { + if (DECL_VIRTUAL_P (field)) + /* Don't compare vptr fields. */ + continue; + tree expr_type = TREE_TYPE (field); location_t field_loc = DECL_SOURCE_LOCATION (field); diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C b/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C new file mode 100644 index 000..8067d3cd9d1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-virtual1.C @@ -0,0 +1,20 @@ +// PR c++/95567 +// { dg-do run { target c++20 } } + +struct B { + B(int i) : i(i) {} + virtual ~B() = default; + + bool operator==(B const&) const = default; + int i; +}; + +struct D : B { + D(int i, int j) : B(i), j(j) {} + int j; +}; + +int main() { + if (B(2) != D(2, 3)) +__builtin_abort(); +}
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
On Sep 30, 2021, at 1:54 AM, Richard Biener wrote:
> On Thu, 30 Sep 2021, Jason Merrill wrote:
>> On 9/29/21 17:30, Qing Zhao wrote:
>>> Hi,
>>>
>>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a)
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359
>>>
>>> is due to -ftrivial-auto-var-init adding initialization for the READONLY
>>> variable “this” in the following routine (t.cpp.005t.original):
>>>
>>> ===
>>> ;; Function A::foo()::<lambda()> (null)
>>> ;; enabled by -tree-original
>>>
>>> {
>>>   const struct A * const this [value-expr: &__closure->__this];
>>>   const struct A * const this [value-expr: &__closure->__this];
>>>   return <retval> = (double) ((const struct A *) this)->a;
>>> }
>>> ===
>>>
>>> However, in the above routine, “this” is NOT marked as READONLY, but its
>>> value-expr “&__closure->__this” is marked as READONLY.
>>>
>>> There are two major issues:
>>>
>>> 1. In the routine “is_var_need_auto_init”, we should exclude a “decl” that
>>>    is marked as READONLY;
>>> 2. In the C++ FE, “this” should be marked as READONLY.
>>>
>>> The ideal solution would be:
>>>
>>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl);
>>> 2. Fix the C++ FE to mark “this” as TREE_READONLY (decl)==true;
>>>
>>> I am not sure whether the 2nd issue is hard to fix in the C++ FE. In case
>>> it is not a quick fix there, I propose the following fix in the middle end:
>>>
>>> Let me know your comments or suggestions on this.
>>>
>>> Thanks a lot for the help.
>>
>> I'd think is_var_need_auto_init should be false for any variable with
>> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of naming
>> objects that are initialized elsewhere.
>
> IIRC handling variables with DECL_HAS_VALUE_EXPR_P is necessary to
> auto-init VLAs, otherwise I tend to agree - would we handle those when
> we see a DECL_EXPR then?

The current implementation is:

gimplify_decl_expr:

For each DECL_EXPR “decl”
  if (VAR_P (decl) && !DECL_EXTERNAL (decl))
    {
      if (is_vla (decl))
        gimplify_vla_decl (decl, …); /* existing handling: create a VALUE_EXPR for this vla decl */
      …
      if (has_explicit_init (decl))
        {
          …; /* existing handling.  */
        }
      else if (is_var_need_auto_init (decl)) /* New code.  */
        {
          gimple_add_init_for_auto_var (….); /* new code.  */
          ...
        }
    }

Since “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it is not scanned and no initialization is added for it. If we do not add initialization for a decl that has a DECL_VALUE_EXPR, then “DECL_VALUE_EXPR (decl)” will not get an initialization either, and we will miss initializing these decls.

So, I think that the current implementation is correct.

And if the C++ FE will not mark “this” as READONLY, and only marks DECL_VALUE_EXPR (this) as READONLY, the proposed fix is correct too.

Let me know your opinion on this.

Thanks.

Qing

>>> Qing
>>>
>>> ==
>>> From 0a5982cd61bc4610655d3df00ae8d2fbcb3c8e9b Mon Sep 17 00:00:00 2001
>>> From: Qing Zhao
>>> Date: Wed, 29 Sep 2021 20:49:59 +
>>> Subject: [PATCH] Fix PR102359
>>>
>>> ---
>>>  gcc/gimplify.c                  | 15 +++
>>>  gcc/testsuite/g++.dg/pr102359.C | 13 +
>>>  2 files changed, 28 insertions(+)
>>>  create mode 100644 gcc/testsuite/g++.dg/pr102359.C
>>>
>>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>>> index 1067113b1639..a2587869b35d 100644
>>> --- a/gcc/gimplify.c
>>> +++ b/gcc/gimplify.c
>>> @@ -1819,12 +1819,27 @@ gimple_add_padding_init_for_auto_var (tree decl, bool is_vla,
>>>    gimplify_seq_add_stmt (seq_p, call);
>>>  }
>>>
>>> +/* Return true if the DECL is READONLY.
>>> +   This is to workaround a C++ FE bug that only mark the value_expr of "this"
>>> +   as readonly but does not mark "this" as readonly.
>>> +   C++ FE should fix this issue before replacing this routine with
>>> +   TREE_READONLY (decl).  */
>>> +
>>> +static bool
>>> +is_decl_readonly (tree decl)
>>> +{
>>> +  return (TREE_READONLY (decl)
>>> +          || (DECL_HAS_VALUE_EXPR_P (decl)
>>> +              && TREE_READONLY (DECL_VALUE_EXPR (decl))));
>>> +}
>>> +
>>>  /* Return true if the DECL need to be automaticly initialized by the
>>>     compiler.  */
>>>  static bool
>>>  is_var_need_auto_init (tree decl)
>>>  {
>>>    if (auto_var_p (decl)
>>> +      && !is_decl_readonly (decl)
>>>        && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
>>>        && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
>>>        && !is_empty_type (TREE_TYPE (decl)))
>>> diff --git a/gcc/testsuite/g++.dg/pr102359.C b/gcc/testsuite/g++.dg/pr102359.C
>>> new file mode 100644
>>> index ..da643cde7bed
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/pr102359.C
>>> @@ -0,0 +1,13 @@
>>> +/* PR middle-end/102359 ICE gimplification failed since
>>> +   r12-3433-ga25e0b5e6ac8a77a.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-ftrivial-auto-var-init=zero" } */
>>> +/* { dg-require-effective-target c++17 } */
>>> +
>>> +struct A {
>>> +  double a = 111;
>>> +  auto foo() {
>>> +    return [*this] { return a; };
>>> +  }
>>> +};
>>> +int X = A{}.foo()();
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
[PATCH 1/2] c++: order of trailing arguments in a trait expr
When parsing a variadic trait expression, we build up the list of trailing
arguments in reverse, but we're neglecting to reverse the list to its true
order afterwards.  This causes us to confuse the meaning of e.g.
__is_xible(x, y, z) and __is_xible(x, z, y).

Note that this bug isn't exposed in the standard type traits within libstdc++
because there we pass a pack expansion as the single trailing argument
to __is_xible, which indeed gets expanded correctly by tsubst_tree_list.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  What about backports?  This isn't a regression AFAICT.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_trait_expr): Call nreverse on the list
	of trailing arguments.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/is_constructible6.C: New test.
---
 gcc/cp/parser.c                              |  1 +
 gcc/testsuite/g++.dg/ext/is_constructible6.C | 10 ++
 2 files changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_constructible6.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 8430445ef8c..04f5a24cc03 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10832,6 +10832,7 @@ cp_parser_trait_expr (cp_parser* parser, enum rid keyword)
             return error_mark_node;
           type2 = tree_cons (NULL_TREE, elt, type2);
         }
+      type2 = nreverse (type2);
     }

   location_t finish_loc = cp_lexer_peek_token (parser->lexer)->location;
diff --git a/gcc/testsuite/g++.dg/ext/is_constructible6.C b/gcc/testsuite/g++.dg/ext/is_constructible6.C
new file mode 100644
index 000..7fce153fa75
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_constructible6.C
@@ -0,0 +1,10 @@
+// Verify we respect the order of trailing arguments passed to
+// __is_constructible.
+
+struct A { };
+struct B { };
+struct C { C(A, B); };
+
+extern int n[true];
+extern int n[ __is_constructible(C, A, B)];
+extern int n[!__is_constructible(C, B, A)];
--
2.33.0.610.gcefe983a32
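For readers unfamiliar with the idiom, here is a minimal sketch of why the missing reversal matters. It is not GCC code: `Node`, `cons`, `reverse_in_place`, `to_vector` and `free_list` are hypothetical stand-ins for a TREE_LIST node, tree_cons and nreverse, chosen only to show that prepending builds the list back to front and that one in-place reversal restores source order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a TREE_LIST node; this is not GCC's `tree` type.
struct Node {
  std::string value;
  Node* chain;
};

// Analogue of tree_cons: prepend in O(1) and return the new head.
inline Node* cons(const std::string& value, Node* chain) {
  return new Node{value, chain};
}

// Analogue of nreverse: reverse the chain in place and return the new head.
inline Node* reverse_in_place(Node* head) {
  Node* prev = nullptr;
  while (head != nullptr) {
    Node* next = head->chain;
    head->chain = prev;
    prev = head;
    head = next;
  }
  return prev;
}

// Copy the values out in list order so they are easy to inspect.
inline std::vector<std::string> to_vector(const Node* head) {
  std::vector<std::string> out;
  for (; head != nullptr; head = head->chain)
    out.push_back(head->value);
  return out;
}

// Release every node of the list.
inline void free_list(Node* head) {
  while (head != nullptr) {
    Node* next = head->chain;
    delete head;
    head = next;
  }
}
```

Consing "A" and then "B" onto an empty list yields B, A; only after `reverse_in_place` does the list read A, B, which is the order the trait arguments were written in.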
[PATCH 2/2] c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]
is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/11?

	PR c++/102535

gcc/cp/ChangeLog:

	* method.c (is_xible_helper): Don't exit early for multi-arg
	ctors in C++20.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/is_trivially_constructible7.C: New test.
---
 gcc/cp/method.c                                 |  4 +++-
 .../g++.dg/ext/is_trivially_constructible7.C    | 17 +
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 3c3495227ce..c38912a7ce9 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -2094,8 +2094,10 @@ is_xible_helper (enum tree_code code, tree to, tree from, bool trivial)
   tree expr;
   if (code == MODIFY_EXPR)
     expr = assignable_expr (to, from);
-  else if (trivial && from && TREE_CHAIN (from))
+  else if (trivial && from && TREE_CHAIN (from)
+           && cxx_dialect < cxx20)
     return error_mark_node; // only 0- and 1-argument ctors can be trivial
+                            // before C++20 aggregate paren init
   else if (TREE_CODE (to) == ARRAY_TYPE && !TYPE_DOMAIN (to))
     return error_mark_node; // can't construct an array of unknown bound
   else
diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C
new file mode 100644
index 000..f6fbf8f2d9e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C
@@ -0,0 +1,17 @@
+// PR c++/102535
+// Verify __is_trivially_constructible works with multi-arg paren init of
+// aggrs.
+
+struct A { int x; };
+struct B { float y; };
+struct C { char z; };
+struct D { A a; B b; C c; };
+
+extern int n[1 + __is_trivially_constructible(D, A)];
+extern int n[1 + __is_trivially_constructible(D, A, B)];
+extern int n[1 + __is_trivially_constructible(D, A, B, C)];
+#if __cpp_aggregate_paren_init
+extern int n[1 + true];
+#else
+extern int n[1 + false];
+#endif
--
2.33.0.610.gcefe983a32
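As a small illustration of the language feature behind this patch, the sketch below reuses the aggregate types from the test case and initializes `D` from several arguments with parentheses when the compiler advertises `__cpp_aggregate_paren_init`. The helper `make_d` is hypothetical and exists only for this example. The sketch deliberately makes no claim about what `__is_trivially_constructible` reports, since that depends on whether the compiler in use has this fix.

```cpp
#include <cassert>

// Same aggregate shapes as in the test case above.
struct A { int x; };
struct B { float y; };
struct C { char z; };
struct D { A a; B b; C c; };

// Hypothetical helper: build a D from three arguments.  No constructor is
// declared for D, so with C++20 the parenthesized form below is aggregate
// initialization and no user-provided code runs.
inline D make_d(A a, B b, C c) {
#if __cpp_aggregate_paren_init
  return D(a, b, c);   // C++20 aggregate paren init
#else
  return D{a, b, c};   // pre-C++20 spelling with braces
#endif
}
```

Because the parenthesized form calls no user-provided constructor, a multi-argument initialization of `D` can be as trivial as a braced one, which is the case the early exit in is_xible_helper used to rule out.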
Re: [PATCH, part 2] PR 102458 - issues with simplification of SIZE intrinsic applied to automatic arrays
Dear Harald, dear all,

On 29.09.21 21:20, Harald Anlauf via Fortran wrote:

I think I have solved the remaining issue in PR 102458 that prevented the simplification of an expression involving a static initialization and the evaluation of the SIZE of an automatic array which has provable constant size. My previous related query to the ML has thus become obsolete.

My solution is to attempt the resolution of the array specification within simplify_size so that the simplification actually works.

Regtested on x86_64-pc-linux-gnu. OK for mainline and same branches as the patch for part1?

Thanks.

I wonder whether that should be placed at some more generic place, but for now add it to another intrinsic-simplify function ...

I note that this resolution is also used for (u,l)(,co)bounds/this_image (via simplify_bound_dim) and in gfc_resolve_formal_arglist, resolve_component, resolve_symbol.

OK.

Thanks,

Tobias

Fortran: resolve expressions during SIZE simplification

gcc/fortran/ChangeLog:

	PR fortran/102458
	* simplify.c (simplify_size): Resolve expressions used in array
	specifications so that SIZE can be simplified.

gcc/testsuite/ChangeLog:

	PR fortran/102458
	* gfortran.dg/pr102458b.f90: New test.

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
[PATCH, v3] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]
On Wed, Sep 29, 2021 at 03:38:45PM -0400, Jason Merrill wrote: > It ought to be possible to defer check_final_overrider, but it sounds > awkward. > > Or maybe_instantiate_noexcept could use the non-defining path in > build_comparison_op. > > Maybe we want a maybe_synthesize_method that actually builds the function if > the type is complete, or takes the non-defining path if not. So something like this? spaceship-synth8.C (apparently added by me, so how valid/invalid it is is unclear) now doesn't ICE anymore, but without the change I've put there is now rejected because std::strong_ordering::less etc. aren't found. Previously when we were synthesizing it we did that before the FIELD_DECLs for bases have been added, so those weren't compared, but now they actually are compared. After fixing the incomplete std::strong_ordering spaceship-synth8.C is now accepted, but I'm afraid I'm getting lost in this - clang++ rejects that testcase instead complaining that D has <=> operator, but has it pure virtual. And, when maybe_instantiate_noexcept tries to find out if the defaulted method would be implicitly deleted or not, when it does so before the class is complete, seems it can check whether there are errors when comparing the direct members of the class, but not errors about bases... 2021-09-30 Jakub Jelinek PR c++/102490 * cp-tree.h (build_comparison_op): Declare. * method.c (comp_info): Remove defining member. (comp_info::comp_info): Remove complain argument, don't initialize defining. (build_comparison_op): No longer static. Add defining argument. Adjust comp_info construction. Use defining instead of info.defining. Assert that if defining, ctype is a complete type. (synthesize_method, maybe_explain_implicit_delete, explain_implicit_non_constexpr): Adjust build_comparison_op callers. * class.c (check_bases_and_members): Don't call defaulted_late_check for sfk_comparison. (finish_struct_1): Call it here instead after class has been completed. 
* pt.c (maybe_instantiate_noexcept): For sfk_comparison of still incomplete classes, call build_comparison_op in non-defining mode instead of calling synthesize_method. * g++.dg/cpp2a/spaceship-synth8.C (std::strong_ordering): Provide more complete definition. (std::strong_ordering::less, std::strong_ordering::equal, std::strong_ordering::greater): Define. * g++.dg/cpp2a/spaceship-eq11.C: New test. * g++.dg/cpp2a/spaceship-eq12.C: New test. --- gcc/cp/cp-tree.h.jj 2021-09-18 09:44:31.728743713 +0200 +++ gcc/cp/cp-tree.h2021-09-30 18:39:10.416847290 +0200 @@ -7012,6 +7012,7 @@ extern bool maybe_explain_implicit_delet extern void explain_implicit_non_constexpr (tree); extern bool deduce_inheriting_ctor (tree); extern bool decl_remember_implicit_trigger_p (tree); +extern void build_comparison_op(tree, bool, tsubst_flags_t); extern void synthesize_method (tree); extern tree lazily_declare_fn (special_function_kind, tree); --- gcc/cp/method.c.jj 2021-09-30 09:22:46.323918164 +0200 +++ gcc/cp/method.c 2021-09-30 18:51:14.510744549 +0200 @@ -1288,21 +1288,16 @@ struct comp_info { tree fndecl; location_t loc; - bool defining; bool first_time; bool constexp; bool was_constexp; bool noex; - comp_info (tree fndecl, tsubst_flags_t &complain) + comp_info (tree fndecl) : fndecl (fndecl) { loc = DECL_SOURCE_LOCATION (fndecl); -/* We only have tf_error set when we're called from - explain_invalid_constexpr_fn or maybe_explain_implicit_delete. */ -defining = !(complain & tf_error); - first_time = DECL_MAYBE_DELETED (fndecl); DECL_MAYBE_DELETED (fndecl) = false; @@ -1364,12 +1359,12 @@ struct comp_info to use synthesize_method at the earliest opportunity and bail out if the function ends up being deleted. 
*/ -static void -build_comparison_op (tree fndecl, tsubst_flags_t complain) +void +build_comparison_op (tree fndecl, bool defining, tsubst_flags_t complain) { - comp_info info (fndecl, complain); + comp_info info (fndecl); - if (!info.defining && !(complain & tf_error) && !DECL_MAYBE_DELETED (fndecl)) + if (!defining && !(complain & tf_error) && !DECL_MAYBE_DELETED (fndecl)) return; int flags = LOOKUP_NORMAL; @@ -1384,6 +1379,7 @@ build_comparison_op (tree fndecl, tsubst lhs = convert_from_reference (lhs); rhs = convert_from_reference (rhs); tree ctype = TYPE_MAIN_VARIANT (TREE_TYPE (lhs)); + gcc_assert (!defining || COMPLETE_TYPE_P (ctype)); iloc_sentinel ils (info.loc); @@ -1399,7 +1395,7 @@ build_comparison_op (tree fndecl, tsubst } tree compound_stmt = NULL_TREE; - if (info.defining) + if (defining) compound_stmt = begin_compound_stmt (0
[PATCH] i386: Eliminate sign extension after logic operation [PR89954]
Convert (sign_extend:WIDE (any_logic:NARROW (memory, immediate))) to
(any_logic:WIDE (sign_extend (memory)), (sign_extend (immediate))).
This eliminates sign extension after logic operation.

2021-09-30  Uroš Bizjak

gcc/
	PR target/89954
	* config/i386/i386.md
	(sign_extend:WIDE (any_logic:NARROW (memory, immediate)) splitters):
	New splitters.

gcc/testsuite/
	PR target/89954
	* gcc.target/i386/pr89954.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 758d7d1e3c0..04cb3bf6a33 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10210,6 +10210,40 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "<MODE>")])

+;; convert (sign_extend:WIDE (any_logic:NARROW (memory, immediate)))
+;; to (any_logic:WIDE (sign_extend (memory)), (sign_extend (immediate))).
+;; This eliminates sign extension after logic operation.
+
+(define_split
+  [(set (match_operand:SWI248 0 "register_operand")
+        (sign_extend:SWI248
+          (any_logic:QI (match_operand:QI 1 "memory_operand")
+                        (match_operand:QI 2 "const_int_operand"))))]
+  ""
+  [(set (match_dup 3) (sign_extend:SWI248 (match_dup 1)))
+   (set (match_dup 0) (any_logic:SWI248 (match_dup 3) (match_dup 2)))]
+  "operands[3] = gen_reg_rtx (<MODE>mode);")
+
+(define_split
+  [(set (match_operand:SWI48 0 "register_operand")
+        (sign_extend:SWI48
+          (any_logic:HI (match_operand:HI 1 "memory_operand")
+                        (match_operand:HI 2 "const_int_operand"))))]
+  ""
+  [(set (match_dup 3) (sign_extend:SWI48 (match_dup 1)))
+   (set (match_dup 0) (any_logic:SWI48 (match_dup 3) (match_dup 2)))]
+  "operands[3] = gen_reg_rtx (<MODE>mode);")
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (sign_extend:DI
+          (any_logic:SI (match_operand:SI 1 "memory_operand")
+                        (match_operand:SI 2 "const_int_operand"))))]
+  "TARGET_64BIT"
+  [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
+   (set (match_dup 0) (any_logic:DI (match_dup 3) (match_dup 2)))]
+  "operands[3] = gen_reg_rtx (DImode);")
+
 (define_insn "*<code><mode>_2"
   [(set (reg FLAGS_REG)
         (compare (any_or:SWI
diff --git a/gcc/testsuite/gcc.target/i386/pr89954.c b/gcc/testsuite/gcc.target/i386/pr89954.c
new file mode 100644
index 000..c1e9f3a9562
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89954.c
@@ -0,0 +1,45 @@
+/* PR target/89954 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+signed char ab;
+
+short aw;
+
+int al;
+
+short sext_andbw (void) { return ab & -2; }
+short sext_orbw (void) { return ab | -3; }
+short sext_xorbw (void) { return ab ^ -4; }
+
+int sext_andbl (void) { return ab & -2; }
+int sext_orbl (void) { return ab | -3; }
+int sext_xorbl (void) { return ab ^ -4; }
+
+int sext_andwl (void) { return aw & -2; }
+int sext_orwl (void) { return aw | -3; }
+int sext_xorwl (void) { return aw ^ -4; }
+
+#ifdef __x86_64__
+
+long long sext_andbq (void) { return ab & -2; }
+long long sext_orbq (void) { return ab | -3; }
+long long sext_xorbq (void) { return ab ^ -4; }
+
+long long sext_andwq (void) { return aw & -2; }
+long long sext_orwq (void) { return aw | -3; }
+long long sext_xorwq (void) { return aw ^ -4; }
+
+long long sext_andlq (void) { return al & -2; }
+long long sext_orlq (void) { return al | -3; }
+long long sext_xorlq (void) { return al ^ -4; }
+
+#endif
+
+/* { dg-final { scan-assembler-times "movsbw" 3 } } */
+/* { dg-final { scan-assembler-times "movsbl" 3 } } */
+/* { dg-final { scan-assembler-times "movswl" 3 } } */
+
+/* { dg-final { scan-assembler-times "movsbq" 3 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "movswq" 3 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "movslq" 3 { target { ! ia32 } } } } */
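The splitters rely on an arithmetic identity: for AND, OR and XOR, sign-extending the narrow result gives the same value as applying the operation to the sign-extended operands. The sketch below checks that identity exhaustively for the 8-bit to 32-bit case only. It models the arithmetic, not the RTL patterns; all names in it (`sign_extend8`, `and_op`, `or_op`, `xor_op`, `split_is_equivalent`) are made up for this illustration, and it assumes a two's complement `int32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit pattern to 32 bits without relying on
// implementation-defined narrowing conversions.
inline int32_t sign_extend8(uint8_t bits) {
  return (bits & 0x80) ? static_cast<int32_t>(bits) - 256
                       : static_cast<int32_t>(bits);
}

// The three logic operations covered by the any_logic iterator.
inline int32_t and_op(int32_t a, int32_t b) { return a & b; }
inline int32_t or_op(int32_t a, int32_t b) { return a | b; }
inline int32_t xor_op(int32_t a, int32_t b) { return a ^ b; }

// Compare "operate on 8 bits, then sign-extend" with "sign-extend both
// operands, then operate on 32 bits" for every pair of byte values.
inline bool split_is_equivalent(int32_t (*op)(int32_t, int32_t)) {
  for (int32_t mem = 0; mem < 256; ++mem) {
    for (int32_t imm = 0; imm < 256; ++imm) {
      // Narrow form: the inputs are raw bit patterns in 0..255, so the
      // result also fits in 8 bits before it is sign-extended.
      int32_t narrow_first =
          sign_extend8(static_cast<uint8_t>(op(mem, imm)));
      // Wide form: extend each operand first, then apply the operation.
      int32_t extend_first = op(sign_extend8(static_cast<uint8_t>(mem)),
                                sign_extend8(static_cast<uint8_t>(imm)));
      if (narrow_first != extend_first)
        return false;
    }
  }
  return true;
}
```

The identity holds because each result bit of a bitwise operation depends only on the corresponding operand bits, so the replicated sign bits of the operands combine into the replicated sign bit of the result.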
Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
>>You may need to run contrib/gcc-git-customization.sh to get the git >>gcc-verify command. I re-setup and can use git gcc-verify. Now I can see it rejects because I forgot to add a description of modified file. Now that it passes gcc-verify and I attach the changelog as attachment. Thank you again for your patient explanation and help! On 9/26/21 21:31, nick huang via Gcc-patches wrote: > Hi Jason, > > 1. Thank you very much for your detailed comments for my patch and I really > appreciate it! Here is my revised patch: > > The root cause of this bug is that it considers reference with > cv-qualifiers as an error by generating value for variable "bad_quals". > However, this is not correct for case of typedef. Here I quote spec: > "Cv-qualified references are ill-formed except when the cv-qualifiers > are introduced through the use of a typedef-name ([dcl.typedef], > [temp.param]) or decltype-specifier ([dcl.type.decltype]), > in which case the cv-qualifiers are ignored." > > 2021-09-25 qingzhe huang > > gcc/cp/ > PR c++/101783 > * tree.c (cp_build_qualified_type_real): git gcc-verify still rejects this line with ERR: missing description of a change: " * tree.c (cp_build_qualified_type_real):" You may need to run contrib/gcc-git-customization.sh to get the git gcc-verify command. > gcc/testsuite/ > PR c++/101783 > * g++.dg/parse/pr101783.C: New test. > -- next part -- Please drop this line, it breaks git gcc-verify when I apply the patch with git am. The patch should start immediately after the ChangeLog entries. > diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c > index 8840932dba2..d5c8daeb340 100644 > --- a/gcc/cp/tree.c > +++ b/gcc/cp/tree.c > @@ -1356,11 +1356,18 @@ cp_build_qualified_type_real (tree type, > /* A reference or method type shall not be cv-qualified. > [dcl.ref], [dcl.fct]. This used to be an error, but as of DR 295 > (in CD1) we always ignore extra cv-quals on functions. 
*/ > + > + /* Cv-qualified references are ill-formed except when the cv-qualifiers In my previous reply I meant please add "[dcl.ref]/1" at the beginning of this comment. > + are introduced through the use of a typedef-name ([dcl.typedef], > + [temp.param]) or decltype-specifier ([dcl.type.decltype]), > + in which case the cv-qualifiers are ignored. > + */ > if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE) > && (TYPE_REF_P (type) > || FUNC_OR_METHOD_TYPE_P (type))) > { > - if (TYPE_REF_P (type)) > + if (TYPE_REF_P (type) > + && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type))) > bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE); > type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE); > } > diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C > b/gcc/testsuite/g++.dg/parse/pr101783.C > new file mode 100644 > index 000..4e0a435dd0b > --- /dev/null > +++ b/gcc/testsuite/g++.dg/parse/pr101783.C > @@ -0,0 +1,5 @@ > +template struct A{ > + typedef T& Type; > +}; > +template void f(const typename A::Type){} > +template <> void f(const typename A::Type){} > > > > 2. >> In Jonathan's earlier reply he asked how you tested the patch; this >> message still doesn't say anything about that. > I communicated with Mr. Jonathan in private email, worrying my naive question > might pollute the public maillist. The following is major part of this > communication and I attached original part in attachment. > How has this patch been tested? Have you bootstrapped the compiler and run the full testsuite? > Here is how I am doing: > a) build original 10.2.0 from scratch and make check to get both > "testsuite/gcc/gcc.sum" > and "testsuite/g++/g++.sum". > b) apply my patch and build from scratch and make check to get both two files > above. > c) compare two run's *.sum files to see if there is any difference. > > (Later I realized there is tool "contrib/compare_tests" is a good help of >doing so.) > > 3. >> What is the legal status of your contributions? 
> I thought small patch didn't require assignment. However, I just sent email > to ass...@gnu.org to request assignment. > Alternatively, I am not sure if adding this "signoff" tag in submission will > help? > Signed-off-by: qingzhe huang > > > Thank you again! > > >> On 8/28/21 07:54, nick huang via Gcc-patches wrote: >>> Reference with cv-qualifiers should be ignored instead of causing an error >>> because standard accepts cv-qualified references introduced by typedef which >>> is ignored. >>> Therefore, the fix prevents GCC from reporting error by not setting variable >>> "bad_quals" in case the reference is introduced by typedef. Still the >>> cv-qualifier is silently ignored. >>> Here I quote spec (https://timsong-cpp.github.io/cppwp/dcl.ref#1): >>> "Cv-qualified references are ill-formed except when the cv-qualifiers >>> are introduced thro
Re: [PATCH] Improve jump threading dump output.
On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
> On 9/28/21 3:47 PM, Jeff Law wrote:
>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>> In analyzing PR102511, it has become abundantly clear that we need
>>> better debugging aids for the jump threader solver.  Currently
>>> debugging these issues is a nightmare if you're not intimately
>>> familiar with the code.  This patch attempts to improve this.
>>>
>>> First, I'm enabling path solver dumps with TDF_THREADING.  None of the
>>> available TDF_* flags are a good match, and using TDF_DETAILS would
>>> blow up the dump file, since both threaders continually call the
>>> solver to try out candidates.  This will allow dumping path solver
>>> details without having to resort to hacking the source.
>>>
>>> I am also dumping the current registered_jump_thread dbg counter used
>>> by the registry, in the solver.  That way narrowing down a problematic
>>> thread can then be examined by -fdump-*-threading and looking at the
>>> solver details surrounding the appropriate counter (which the dbgcnt
>>> also dumps to the dump file).
>>>
>>> You still need knowledge of the solver to debug these issues, but at
>>> least now it's not entirely opaque.
>>>
>>> OK?
>>>
>>> gcc/ChangeLog:
>>>
>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>     debug counter.
>>
>> OK.
>>
>> Note we've got massive failures in the tester starting sometime
>> yesterday and I suspect all the threader work.  So I'm going to slow
>> down on reviews of that code as we stabilize stuff.
>
> Fair enough.  Let's knock those out then.

So I'm really wondering if these were caused by that patch you'd sent me privately for the visium issue. Right now we're regressing in a few places, but it's not bad.

visium & bfin are the only embedded targets failing.

visium fails:

Tests that now fail, but worked before (9 tests):

visium-sim: gcc.c-torture/execute/960218-1.c -Os (test for excess errors)
visium-sim: gcc.c-torture/execute/961125-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
visium-sim: gcc.c-torture/execute/961125-1.c -O3 -g (test for excess errors)
visium-sim: gcc.c-torture/execute/pending-4.c -O1 (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c -O2 (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c -O3 -g (test for excess errors)
visium-sim: gcc.c-torture/execute/pr68911.c -O1 (test for excess errors)

We've already discussed 960218-1 a bit. I wouldn't be surprised if they're all the same problem in the end.

These started with:

commit 4a960d548b7d7d942f316c5295f6d849b74214f5 (HEAD, refs/bisect/bad)
Author: Aldy Hernandez
Date:   Thu Sep 23 10:59:24 2021 +0200

    Avoid invalid loop transformations in jump threading registry.
Go patch committed: Avoid calling Expression::type before lowering
For some future work on the Go frontend it will be annoying to have to make Expression::type work before the lowering pass, so this patch changes the frontend so that that doesn't happen. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian patch.txt 19b5a1147764bc867f2743fafa2f05da239618d6 diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE index edfbe46d8f4..7eea97765c3 100644 --- a/gcc/go/gofrontend/MERGE +++ b/gcc/go/gofrontend/MERGE @@ -1,4 +1,4 @@ -e3bfc0889237a5bb8aa7ae30e1cff14f90a5f941 +bbc1effb1a8a757a38011074f1d4477fae3936f5 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. diff --git a/gcc/go/gofrontend/expressions.h b/gcc/go/gofrontend/expressions.h index 9f8f4e9255b..93483544e46 100644 --- a/gcc/go/gofrontend/expressions.h +++ b/gcc/go/gofrontend/expressions.h @@ -999,7 +999,9 @@ class Expression determine_type_no_context(); // Return the current type of the expression. This may be changed - // by determine_type. + // by determine_type. This should not be called before the lowering + // pass, unless the is_type_expression method returns true (i.e., + // this is an EXPRESSION_TYPE). 
Type* type() { return this->do_type(); } diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc index cd692506efc..c6ce6230c58 100644 --- a/gcc/go/gofrontend/types.cc +++ b/gcc/go/gofrontend/types.cc @@ -11789,8 +11789,9 @@ Type::build_stub_methods(Gogo* gogo, const Type* type, const Methods* methods, { stub = gogo->start_function(stub_name, stub_type, false, fntype->location()); - Type::build_one_stub_method(gogo, m, buf, stub_params, - fntype->is_varargs(), location); + Type::build_one_stub_method(gogo, m, buf, receiver_type, stub_params, + fntype->is_varargs(), stub_results, + location); gogo->finish_function(fntype->location()); if (type->named_type() == NULL && stub->is_function()) @@ -11810,16 +11811,20 @@ Type::build_stub_methods(Gogo* gogo, const Type* type, const Methods* methods, void Type::build_one_stub_method(Gogo* gogo, Method* method, const char* receiver_name, + const Type* receiver_type, const Typed_identifier_list* params, bool is_varargs, + const Typed_identifier_list* results, Location location) { Named_object* receiver_object = gogo->lookup(receiver_name, NULL); go_assert(receiver_object != NULL); Expression* expr = Expression::make_var_reference(receiver_object, location); - expr = Type::apply_field_indexes(expr, method->field_indexes(), location); - if (expr->type()->points_to() == NULL) + const Type* expr_type = receiver_type; + expr = Type::apply_field_indexes(expr, method->field_indexes(), location, + &expr_type); + if (expr_type->points_to() == NULL) expr = Expression::make_unary(OPERATOR_AND, expr, location); Expression_list* arguments; @@ -11844,8 +11849,7 @@ Type::build_one_stub_method(Gogo* gogo, Method* method, go_assert(func != NULL); Call_expression* call = Expression::make_call(func, arguments, is_varargs, location); - - gogo->add_statement(Statement::make_return_from_call(call, location)); + Type::add_return_from_results(gogo, call, results, location); } // Build direct interface stub methods for TYPE as needed. 
METHODS @@ -11954,8 +11958,9 @@ Type::build_direct_iface_stub_methods(Gogo* gogo, const Type* type, { stub = gogo->start_function(stub_name, stub_type, false, fntype->location()); - Type::build_one_iface_stub_method(gogo, m, buf, stub_params, -fntype->is_varargs(), loc); + Type::build_one_iface_stub_method(gogo, m, buf, stub_params, + fntype->is_varargs(), stub_results, + loc); gogo->finish_function(fntype->location()); if (type->named_type() == NULL && stub->is_function()) @@ -11982,7 +11987,9 @@ void Type::build_one_iface_stub_method(Gogo* gogo, Method* method, const char* receiver_name, const Typed_identifier_list* params, - bool is_varargs, Location loc) + bool is_varargs, + const Typed_identifier_list* results, + Location loc) { Named_object* receiver_object = gogo->lookup(receiver_name, NULL); go_assert(recei
Re: [PATCH, v3] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]
On 9/30/21 13:24, Jakub Jelinek wrote: On Wed, Sep 29, 2021 at 03:38:45PM -0400, Jason Merrill wrote: It ought to be possible to defer check_final_overrider, but it sounds awkward. Or maybe_instantiate_noexcept could use the non-defining path in build_comparison_op. Maybe we want a maybe_synthesize_method that actually builds the function if the type is complete, or takes the non-defining path if not. So something like this? Much like, thanks. spaceship-synth8.C (apparently added by me, so how valid/invalid it is is unclear) now doesn't ICE anymore, but without the change I've put there is now rejected because std::strong_ordering::less etc. aren't found. Previously when we were synthesizing it we did that before the FIELD_DECLs for bases have been added, so those weren't compared, but now they actually are compared. Ah, good to have that fixed. After fixing the incomplete std::strong_ordering spaceship-synth8.C is now accepted, but I'm afraid I'm getting lost in this - clang++ rejects that testcase instead complaining that D has <=> operator, but has it pure virtual. Ah, I think we need to add LOOKUP_NO_VIRTUAL to the flags variable, as we do in do_build_copy_assign. I suppose it wouldn't hurt to add LOOKUP_DEFAULTED as well. One more comment in the patch below. And, when maybe_instantiate_noexcept tries to find out if the defaulted method would be implicitly deleted or not, when it does so before the class is complete, seems it can check whether there are errors when comparing the direct members of the class, but not errors about bases... 2021-09-30 Jakub Jelinek PR c++/102490 * cp-tree.h (build_comparison_op): Declare. * method.c (comp_info): Remove defining member. (comp_info::comp_info): Remove complain argument, don't initialize defining. (build_comparison_op): No longer static. Add defining argument. Adjust comp_info construction. Use defining instead of info.defining. Assert that if defining, ctype is a complete type. 
(synthesize_method, maybe_explain_implicit_delete, explain_implicit_non_constexpr): Adjust build_comparison_op callers. * class.c (check_bases_and_members): Don't call defaulted_late_check for sfk_comparison. (finish_struct_1): Call it here instead after class has been completed. * pt.c (maybe_instantiate_noexcept): For sfk_comparison of still incomplete classes, call build_comparison_op in non-defining mode instead of calling synthesize_method. * g++.dg/cpp2a/spaceship-synth8.C (std::strong_ordering): Provide more complete definition. (std::strong_ordering::less, std::strong_ordering::equal, std::strong_ordering::greater): Define. * g++.dg/cpp2a/spaceship-eq11.C: New test. * g++.dg/cpp2a/spaceship-eq12.C: New test. --- gcc/cp/cp-tree.h.jj 2021-09-18 09:44:31.728743713 +0200 +++ gcc/cp/cp-tree.h2021-09-30 18:39:10.416847290 +0200 @@ -7012,6 +7012,7 @@ extern bool maybe_explain_implicit_delet extern void explain_implicit_non_constexpr(tree); extern bool deduce_inheriting_ctor(tree); extern bool decl_remember_implicit_trigger_p (tree); +extern void build_comparison_op(tree, bool, tsubst_flags_t); extern void synthesize_method (tree); extern tree lazily_declare_fn (special_function_kind, tree); --- gcc/cp/method.c.jj 2021-09-30 09:22:46.323918164 +0200 +++ gcc/cp/method.c 2021-09-30 18:51:14.510744549 +0200 @@ -1288,21 +1288,16 @@ struct comp_info { tree fndecl; location_t loc; - bool defining; bool first_time; bool constexp; bool was_constexp; bool noex; - comp_info (tree fndecl, tsubst_flags_t &complain) + comp_info (tree fndecl) : fndecl (fndecl) { loc = DECL_SOURCE_LOCATION (fndecl); -/* We only have tf_error set when we're called from - explain_invalid_constexpr_fn or maybe_explain_implicit_delete. */ -defining = !(complain & tf_error); - first_time = DECL_MAYBE_DELETED (fndecl); DECL_MAYBE_DELETED (fndecl) = false; @@ -1364,12 +1359,12 @@ struct comp_info to use synthesize_method at the earliest opportunity and bail out if the function ends up being deleted. 
*/ -static void -build_comparison_op (tree fndecl, tsubst_flags_t complain) +void +build_comparison_op (tree fndecl, bool defining, tsubst_flags_t complain) { - comp_info info (fndecl, complain); + comp_info info (fndecl); - if (!info.defining && !(complain & tf_error) && !DECL_MAYBE_DELETED (fndecl)) + if (!defining && !(complain & tf_error) && !DECL_MAYBE_DELETED (fndecl)) return; int flags = LOOKUP_NORMAL; @@ -1384,6 +1379,7 @@ build_comparison_op (tree fndecl, tsubst lhs = convert_from_reference (lhs); rhs = convert_from_reference (rhs); tree ct
[AVR] Fix unused argument warning
Hi!

Configuring GCC with --target=avr-elf --enable-werror-always, I see this
warning that's easy to fix. The options are parsed with a lot of #ifdefs
and it may actually just be unused. Let's just mark it as such.

[all 2021-09-30 00:43:46] /usr/lib/gcc-snapshot/bin/g++ -fno-PIE -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc/gcc/../libbacktrace -o avr-common.o -MT avr-common.o -MMD -MP -MF ./.deps/avr-common.TPo ../../gcc/gcc/common/config/avr/avr-common.c
[all 2021-09-30 00:43:47] ../../gcc/gcc/common/config/avr/avr-common.c: In function 'bool avr_handle_option(gcc_options*, gcc_options*, const cl_decoded_option*, location_t)':
[all 2021-09-30 00:43:47] ../../gcc/gcc/common/config/avr/avr-common.c:80:72: error: unused parameter 'loc' [-Werror=unused-parameter]
[all 2021-09-30 00:43:47]    80 | const struct cl_decoded_option *decoded, location_t loc)
[all 2021-09-30 00:43:47]       | ~~~^~~
[all 2021-09-30 00:43:47] cc1plus: all warnings being treated as errors
[all 2021-09-30 00:43:47] make[1]: *** [Makefile:2420: avr-common.o] Error 1

gcc/ChangeLog:

	* common/config/avr/avr-common.c (avr_handle_option): Mark
	argument as ATTRIBUTE_UNUSED.

diff --git a/gcc/common/config/avr/avr-common.c b/gcc/common/config/avr/avr-common.c
index 6486659d27c..a6939ad03d3 100644
--- a/gcc/common/config/avr/avr-common.c
+++ b/gcc/common/config/avr/avr-common.c
@@ -77,7 +77,8 @@ static const struct default_options avr_option_optimization_table[] =
 static bool
 avr_handle_option (struct gcc_options *opts, struct gcc_options*,
-		   const struct cl_decoded_option *decoded, location_t loc)
+		   const struct cl_decoded_option *decoded,
+		   location_t loc ATTRIBUTE_UNUSED)
 {
   int value = decoded->value;

Ok for trunk?

Thanks,
  Jan-Benedict
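[Editor's sketch] The idiom in the patch above can be tried outside of GCC roughly as follows. This is only an illustration under assumptions, not the AVR code: MY_ATTRIBUTE_UNUSED, HAVE_LOCATION_DIAGNOSTICS and handle_option are hypothetical names, and the macro is assumed to expand to the GNU `unused` attribute, which is what ATTRIBUTE_UNUSED does on GNU-compatible compilers as far as I know.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's ATTRIBUTE_UNUSED; assumes a GNU-compatible
// compiler (g++ or clang++).
#define MY_ATTRIBUTE_UNUSED __attribute__ ((__unused__))

// 'loc' is only read when the (hypothetical) HAVE_LOCATION_DIAGNOSTICS macro
// is defined.  Without the attribute, -Wunused-parameter would fire in every
// configuration where the macro is absent.
bool
handle_option (int value, int loc MY_ATTRIBUTE_UNUSED)
{
#ifdef HAVE_LOCATION_DIAGNOSTICS
  if (loc < 0)
    return false;
#endif
  return value >= 0;
}

// Small self-check so the sketch can be exercised.
void
check_handle_option ()
{
  assert (handle_option (1, 0));
  assert (!handle_option (-1, 0));
}
```

The attribute only silences the warning; it does not forbid using the parameter, which is why it suits arguments that are used in some configurations only.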
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
On 9/30/21 11:42, Qing Zhao wrote:
> On Sep 30, 2021, at 1:54 AM, Richard Biener wrote:
> > On Thu, 30 Sep 2021, Jason Merrill wrote:
> > > On 9/29/21 17:30, Qing Zhao wrote:
> > > > Hi,
> > > >
> > > > PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a)
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359
> > > >
> > > > is due to -ftrivial-auto-var-init adding initialization for the READONLY
> > > > variable “this” in the following routine (t.cpp.005t.original):
> > > >
> > > > ===
> > > > ;; Function A::foo():: (null)
> > > > ;; enabled by -tree-original
> > > >
> > > > {
> > > >   const struct A * const this [value-expr: &__closure->__this];
> > > >   const struct A * const this [value-expr: &__closure->__this];
> > > >   return = (double) ((const struct A *) this)->a;
> > > > }
> > > > ===
> > > >
> > > > However, in the above routine, “this” is NOT marked as READONLY, but its
> > > > value-expr "&__closure->__this” is marked as READONLY.
> > > >
> > > > There are two major issues:
> > > >
> > > > 1. In the routine “is_var_need_auto_init”, we should exclude “decl” that
> > > >    is marked as READONLY;
> > > > 2. In the C++ FE, “this” should be marked as READONLY.
> > > >
> > > > The ideal solution will be:
> > > >
> > > > 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl);
> > > > 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true;
> > > >
> > > > Not sure whether it’s hard for C++ FE to fix the 2nd issue or not?
> > > >
> > > > In the case it’s not a quick fix in C++FE, I proposed the following fix
> > > > in middle end:
> > > >
> > > > Let me know your comments or suggestions on this.
> > > >
> > > > Thanks a lot for the help.
> > >
> > > I'd think is_var_need_auto_init should be false for any variable with
> > > DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of
> > > naming objects that are initialized elsewhere.
> >
> > IIRC handling variables with DECL_HAS_VALUE_EXPR_P is necessary to
> > auto-init VLAs, otherwise I tend to agree - would we handle those
> > when we see a DECL_EXPR then?
>
> The current implementation is:
>
> gimplify_decl_expr:
>
> For each DECL_EXPR “decl”
>   If (VAR_P (decl) && !DECL_EXTERNAL (decl))
>     {
>       if (is_vla (decl))
>         gimplify_vla_decl (decl, …);  /* existing handling: create a VALUE_EXPR for this vla decl */
>       …
>       if (has_explicit_init (decl))
>         {
>           …;  /* existing handling. */
>         }
>       else if (is_var_need_auto_init (decl))  /* New code. */
>         {
>           gimple_add_init_for_auto_var (….);  /* new code. */
>           ...
>         }
>     }
>
> Since the “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it will not be
> scanned and no initialization will be added for it.
>
> If we do not add initialization for a decl that has DECL_VALUE_EXPR, then
> no initialization will be added for the “DECL_VALUE_EXPR (decl)” either.
> We will miss adding initializations for these decls.
>
> So, I think that the current implementation is correct.
>
> And if C++ FE will not mark “this” as READONLY, only mark
> DECL_VALUE_EXPR(this) as READONLY, the proposed fix is correct too.
>
> Let me know your opinion on this.

The problem with this test is not whether the 'this' proxy is marked
READONLY, the problem is that you're trying to initialize lambda capture
proxies at all; the lambda capture objects were already initialized when
forming the closure object. So this test currently aborts with
-ftrivial-auto-var-init=zero because you "initialize" the i capture field
to 0 after it was previously initialized to 42:

int main()
{
  int i = 42;
  auto l = [=]() mutable { return i; };
  if (l() != i)
    __builtin_abort ();
}

I believe the same issue applies to the proxy variables in coroutines that
work much like lambdas.

You can't just assume that a VAR_DECL with DECL_VALUE_EXPR is
uninitialized. Since there's already VLA handling in gimplify_decl_expr,
you could remember whether you added DECL_VALUE_EXPR in that function, and
only then do the initialization.

Jason
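[Editor's sketch] To make the point about capture proxies concrete, here is a small model of the testcase above. The `Closure` struct is a hypothetical, hand-written approximation of what the compiler generates (the real closure type is unnamed and differs in detail); all names are illustrative. It shows that the by-copy capture gets its value when the closure object is formed, so there is nothing left to "auto-initialize" inside the call operator.

```cpp
#include <cassert>

// Hand-written approximation of the closure type for
//   [=]() mutable { return i; }
// The member is initialized once, when the closure object is created.
struct Closure
{
  int i;                          // by-copy capture of the enclosing 'i'
  int operator() () { return i; } // body refers to the member, not a fresh local
};

// Uses a real lambda: the captured value is fixed at closure creation.
int
call_lambda ()
{
  int i = 42;
  auto l = [=]() mutable { return i; };
  return l ();
}

// Same computation spelled out with the hand-written closure type.
int
call_closure ()
{
  int i = 42;
  Closure c = { i };
  return c ();
}

// Self-check: both forms must observe the value captured at creation time.
void
check_captures ()
{
  assert (call_lambda () == 42);
  assert (call_closure () == 42);
}
```

In this model, zeroing `i` on entry to `operator()` would be the analogue of the bug described above.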
Re: [PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
On 9/30/21 10:50, Marek Polacek wrote: This patch addresses one of my leftovers from GCC 11. C++20 introduced [depr.array.comp]: "Equality and relational comparisons between two operands of array type are deprecated." so this patch adds -Wdeprecated-array-compare (enabled by default in C++20). Why not enable it by default in all modes? It was always pretty dubious code. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/97573 gcc/c-family/ChangeLog: * c-opts.c (c_common_post_options): In C++20, turn on -Wdeprecated-array-compare. * c.opt (Wdeprecated-array-compare): New option. gcc/cp/ChangeLog: * typeck.c (do_warn_deprecated_array_compare): New. (cp_build_binary_op): Call it for equality and relational comparisons. gcc/ChangeLog: * doc/invoke.texi: Document -Wdeprecated-array-compare. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr15791-1.C: Add dg-warning. * g++.dg/cpp2a/array-comp1.C: New test. * g++.dg/cpp2a/array-comp2.C: New test. * g++.dg/cpp2a/array-comp3.C: New test. --- gcc/c-family/c-opts.c | 5 gcc/c-family/c.opt| 4 +++ gcc/cp/typeck.c | 28 +++ gcc/doc/invoke.texi | 19 - gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- 8 files changed, 150 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 3eaab5e1530..00b52cc5e12 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) warn_deprecated_enum_float_conv, cxx_dialect >= cxx20 && warn_deprecated); + /* -Wdeprecated-array-compare is enabled by default in C++20. 
*/ + SET_OPTION_IF_UNSET (&global_options, &global_options_set, + warn_deprecated_array_compare, + cxx_dialect >= cxx20 && warn_deprecated); + /* Declone C++ 'structors if -Os. */ if (flag_declone_ctor_dtor == -1) flag_declone_ctor_dtor = optimize_size; diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9c151d19870..a4f0ea68594 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -540,6 +540,10 @@ Wdeprecated C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) ; Documented in common.opt +Wdeprecated-array-compare +C++ ObjC++ Var(warn_deprecated_array_compare) Warning +Warn about deprecated comparisons between two operands of array type. + Wdeprecated-copy C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, Wextra) Mark implicitly-declared copy operations as deprecated if the class has a diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index a2398dbe660..1e3a41104d6 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see #include "attribs.h" #include "asan.h" #include "gimplify.h" +#include "tree-pretty-print.h" static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); static tree cp_build_function_call (tree, tree, tsubst_flags_t); @@ -4725,6 +4726,21 @@ do_warn_enum_conversions (location_t loc, enum tree_code code, tree type0, } } +/* Warn about C++20 [depr.array.comp] array comparisons: "Equality + and relational comparisons between two operands of array type are + deprecated." */ + +static inline void +do_warn_deprecated_array_compare (location_t location, tree_code code, + tree op0, tree op1) +{ + if (warning_at (location, OPT_Wdeprecated_array_compare, + "comparison between two arrays is deprecated")) +inform (location, "use unary %<+%> which decays operands to pointers " + "or %<&%D[0] %s &%D[0]%> to compare the addresses", + op0, op_symbol_code (code), op1); +} + /* Build a binary-operation expression without default conversions. 
CODE is the kind of expression to build. LOCATION is the location_t of the operator in the source code. @@ -5289,6 +5305,11 @@ cp_build_binary_op (const op_location_t &location, warning_at (location, OPT_Waddress, "comparison with string literal results in " "unspecified behavior"); + else if (TREE_CODE (TREE_TYPE (orig_op0)) == ARRAY_TYPE + && TREE_CODE (TREE_TYPE (orig_op1)) == ARRAY_TYPE) + do_warn_deprecated_array_compare (location, code, +
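[Editor's sketch] For readers wondering what the warning is about: `a == b` on two arrays compares the addresses of their first elements, never the contents. The sketch below shows the two non-deprecated spellings the thread mentions (address comparison made explicit) plus an element-wise comparison via the standard `std::equal`. The helper names `same_storage` and `same_elements` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// True if both arrays start at the same address.  This is what 'a == b' on
// two arrays computes, written without the deprecated form; '+a == +b'
// would be an equivalent spelling.
template <typename T, std::size_t N, std::size_t M>
bool
same_storage (const T (&a)[N], const T (&b)[M])
{
  return &a[0] == &b[0];
}

// True if both arrays hold equal elements, which is usually what was meant.
template <typename T, std::size_t N, std::size_t M>
bool
same_elements (const T (&a)[N], const T (&b)[M])
{
  return std::equal (std::begin (a), std::end (a),
                     std::begin (b), std::end (b));
}

// Self-check on two distinct arrays with identical contents.
void
check_array_compare ()
{
  int a[3] = { 1, 2, 3 };
  int b[3] = { 1, 2, 3 };
  assert (!same_storage (a, b));
  assert (same_storage (a, a));
  assert (same_elements (a, b));
}
```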
[LM32] Fix '"LINK_GCC_C_SEQUENCE_SPEC" redefined' warning
Hi!

When configuring GCC for --target=lm32-uclinux --enable-werror-always, it
breaks here:

[all 2021-09-30 08:55:55] /usr/lib/gcc-snapshot/bin/g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../gcc/gcc -I../../gcc/gcc/build -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include \
[all 2021-09-30 08:55:55] -o build/genpreds.o ../../gcc/gcc/genpreds.c
[all 2021-09-30 08:55:55] In file included from ./tm.h:29,
[all 2021-09-30 08:55:55]                  from ../../gcc/gcc/genpreds.c:26:
[all 2021-09-30 08:55:55] ../../gcc/gcc/config/lm32/uclinux-elf.h:70: error: "LINK_GCC_C_SEQUENCE_SPEC" redefined [-Werror]
[all 2021-09-30 08:55:55]    70 | #define LINK_GCC_C_SEQUENCE_SPEC \
[all 2021-09-30 08:55:55]       |
[all 2021-09-30 08:55:55] In file included from ./tm.h:27,
[all 2021-09-30 08:55:55]                  from ../../gcc/gcc/genpreds.c:26:
[all 2021-09-30 08:55:55] ../../gcc/gcc/config/gnu-user.h:117: note: this is the location of the previous definition
[all 2021-09-30 08:55:55]   117 | #define LINK_GCC_C_SEQUENCE_SPEC GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC
[all 2021-09-30 08:55:55]       |
[all 2021-09-30 08:55:58] cc1plus: all warnings being treated as errors
[all 2021-09-30 08:55:58] make[1]: *** [Makefile:2825: build/genpreds.o] Error 1

Easy fix, just undefine LINK_GCC_C_SEQUENCE_SPEC beforehand:

gcc/ChangeLog:

	* config/lm32/uclinux-elf.h (LINK_GCC_C_SEQUENCE_SPEC): Undefine
	before redefinition.

diff --git a/gcc/config/lm32/uclinux-elf.h b/gcc/config/lm32/uclinux-elf.h
index 370df4c55dd..5b638fa5db2 100644
--- a/gcc/config/lm32/uclinux-elf.h
+++ b/gcc/config/lm32/uclinux-elf.h
@@ -67,6 +67,7 @@
 #define TARGET_OS_CPP_BUILTINS() GNU_USER_TARGET_OS_CPP_BUILTINS()
 
+#undef LINK_GCC_C_SEQUENCE_SPEC
 #define LINK_GCC_C_SEQUENCE_SPEC \
   "%{static|static-pie:--start-group} %G %{!nolibc:%L} \
    %{static|static-pie:--end-group}%{!static:%{!static-pie:%G}}"

Ok for trunk?

Thanks,
  Jan-Benedict
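[Editor's sketch] The same pattern in miniature, for anyone who wants to reproduce the diagnostic outside the GCC tree. LINK_SEQUENCE_SPEC and its two values are hypothetical stand-ins; the first #define plays the role of the generic header (gnu-user.h above) and the second the target header.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for a definition coming from an earlier, generic header.
#define LINK_SEQUENCE_SPEC "generic"

// Redefining the macro with different replacement text draws a "redefined"
// diagnostic unless the old definition is removed first; with -Werror that
// diagnostic stops the build.
#undef LINK_SEQUENCE_SPEC
#define LINK_SEQUENCE_SPEC "target-specific"

// Returns whatever definition is in effect at this point in the file.
const char *
link_sequence_spec ()
{
  return LINK_SEQUENCE_SPEC;
}

// Self-check: the later, target-specific definition wins.
void
check_link_sequence_spec ()
{
  assert (std::strcmp (link_sequence_spec (), "target-specific") == 0);
}
```

Removing the `#undef` line and compiling with -Werror should reproduce the failure mode from the log, assuming a GNU-compatible preprocessor.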
Re: PING #2 [PATCH] warn for more impossible null pointer tests [PR102103]
On Thu, 30 Sep 2021, Martin Sebor via Gcc-patches wrote:

> Jason, since you approved the C++ changes, would you mind looking
> over the C bits and if they look good to you giving me the green
> light to commit the patch?
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html

The C changes are OK, with two instances of "for the address %qE will
never be NULL" fixed to refer to the address *of* %qE as elsewhere (those
are for IMAGPART_EXPR and REALPART_EXPR; C++ also has one "the address %qE
will never be NULL"), and the "pr??" in the tests filled in with an
actual PR number for the XFAILed cases.

--
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH 1/2] c++: order of trailing arguments in a trait expr
On 9/30/21 12:52, Patrick Palka wrote:
> When parsing a variadic trait expression, we build up the list of
> trailing arguments in reverse, but we're neglecting to reverse the
> list to its true order afterwards. This causes us to confuse the
> meaning of e.g. __is_xible(x, y, z) and __is_xible(x, z, y).
>
> Note that bug isn't exposed in the standard type traits within libstdc++
> because there we pass a pack expansion as the single trailing argument
> to __is_xible, which indeed gets expanded correctly by tsubst_tree_list.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk? What about backports? This isn't a regression AFAICT.

OK for trunk. I wouldn't bother backporting, since it doesn't affect the
library traits.

> gcc/cp/ChangeLog:
>
> 	* parser.c (cp_parser_trait_expr): Call nreverse on the list of
> 	trailing arguments.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.dg/ext/is_constructible6.C: New test.
> ---
>  gcc/cp/parser.c                              |  1 +
>  gcc/testsuite/g++.dg/ext/is_constructible6.C | 10 ++
>  2 files changed, 11 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_constructible6.C
>
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 8430445ef8c..04f5a24cc03 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -10832,6 +10832,7 @@ cp_parser_trait_expr (cp_parser* parser, enum rid keyword)
>  	    return error_mark_node;
>  	  type2 = tree_cons (NULL_TREE, elt, type2);
>  	}
> +      type2 = nreverse (type2);
>      }
>  
>    location_t finish_loc = cp_lexer_peek_token (parser->lexer)->location;
> diff --git a/gcc/testsuite/g++.dg/ext/is_constructible6.C b/gcc/testsuite/g++.dg/ext/is_constructible6.C
> new file mode 100644
> index 000..7fce153fa75
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_constructible6.C
> @@ -0,0 +1,10 @@
> +// Verify we respect the order of trailing arguments passed to
> +// __is_constructible.
> +
> +struct A { };
> +struct B { };
> +struct C { C(A, B); };
> +
> +extern int n[true];
> +extern int n[ __is_constructible(C, A, B)];
> +extern int n[!__is_constructible(C, B, A)];
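[Editor's sketch] The build-by-prepending-then-reverse pattern that the patch completes can be illustrated with a plain singly linked list. This is not GCC's tree_cons/nreverse; `Node`, `cons`, `reverse_in_place` and `build_in_order` are hypothetical names for a stand-alone model of the same idea.

```cpp
#include <cassert>
#include <vector>

struct Node
{
  int value;
  Node *next;
};

// Prepend in O(1), in the spirit of tree_cons: cheap, but the resulting
// list holds the elements in reverse order of insertion.
Node *
cons (int value, Node *tail)
{
  return new Node{ value, tail };
}

// In-place reversal in the spirit of nreverse: relinks the existing nodes
// and allocates nothing.
Node *
reverse_in_place (Node *head)
{
  Node *prev = nullptr;
  while (head)
    {
      Node *next = head->next;
      head->next = prev;
      prev = head;
      head = next;
    }
  return prev;
}

// Builds a list from ARGS by prepending, restores source order, then
// flattens it back into a vector (freeing the nodes along the way).
std::vector<int>
build_in_order (const std::vector<int> &args)
{
  Node *list = nullptr;
  for (int a : args)
    list = cons (a, list);          // list is now in reverse order
  list = reverse_in_place (list);   // the step the parser was missing

  std::vector<int> out;
  while (list)
    {
      out.push_back (list->value);
      Node *dead = list;
      list = list->next;
      delete dead;
    }
  return out;
}
```

Dropping the `reverse_in_place` call turns {1, 2, 3} into {3, 2, 1}, which is the kind of argument-order confusion described for __is_xible(x, y, z) versus __is_xible(x, z, y).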
Re: [PATCH 2/2] c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]
On 9/30/21 12:52, Patrick Palka wrote:
> is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
> C++20 aggregate paren init means multi-arg ctors can now be trivial too.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk/11?

OK.

> 	PR c++/102535
>
> gcc/cp/ChangeLog:
>
> 	* method.c (is_xible_helper): Don't exit early for multi-arg
> 	ctors in C++20.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.dg/ext/is_trivially_constructible7.C: New test.
> ---
>  gcc/cp/method.c                                 |  4 +++-
>  .../g++.dg/ext/is_trivially_constructible7.C    | 17 +
>  2 files changed, 20 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C
>
> diff --git a/gcc/cp/method.c b/gcc/cp/method.c
> index 3c3495227ce..c38912a7ce9 100644
> --- a/gcc/cp/method.c
> +++ b/gcc/cp/method.c
> @@ -2094,8 +2094,10 @@ is_xible_helper (enum tree_code code, tree to, tree from, bool trivial)
>    tree expr;
>    if (code == MODIFY_EXPR)
>      expr = assignable_expr (to, from);
> -  else if (trivial && from && TREE_CHAIN (from))
> +  else if (trivial && from && TREE_CHAIN (from)
> +	   && cxx_dialect < cxx20)
>      return error_mark_node; // only 0- and 1-argument ctors can be trivial
> +			    // before C++20 aggregate paren init
>    else if (TREE_CODE (to) == ARRAY_TYPE && !TYPE_DOMAIN (to))
>      return error_mark_node; // can't construct an array of unknown bound
>    else
> diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C
> new file mode 100644
> index 000..f6fbf8f2d9e
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible7.C
> @@ -0,0 +1,17 @@
> +// PR c++/102535
> +// Verify __is_trivially_constructible works with multi-arg paren init of
> +// aggrs.
> +
> +struct A { int x; };
> +struct B { float y; };
> +struct C { char z; };
> +struct D { A a; B b; C c; };
> +
> +extern int n[1 + __is_trivially_constructible(D, A)];
> +extern int n[1 + __is_trivially_constructible(D, A, B)];
> +extern int n[1 + __is_trivially_constructible(D, A, B, C)];
> +#if __cpp_aggregate_paren_init
> +extern int n[1 + true];
> +#else
> +extern int n[1 + false];
> +#endif
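[Editor's sketch] For context, this is what "aggregate paren init" looks like from the user's side, reusing the aggregate layout from the testcase. `make_d` is a hypothetical helper. The sketch deliberately checks only the initialization itself: whether `__is_trivially_constructible(D, A, B, C)` reports true depends on the compiler carrying the fix discussed here, so no claim about the trait is made.

```cpp
#include <cassert>

struct A { int x; };
struct B { float y; };
struct C { char z; };
struct D { A a; B b; C c; };   // aggregate: no user-declared constructors

// Builds a D from three arguments.  With C++20 aggregate paren init the
// parenthesized form initializes the members one by one, so a "multi-arg
// construction" exists even though D declares no constructor at all.
D
make_d ()
{
#if __cpp_aggregate_paren_init
  return D (A{ 1 }, B{ 2.0f }, C{ 'c' });
#else
  // Before C++20 only brace initialization is available for aggregates.
  return D{ A{ 1 }, B{ 2.0f }, C{ 'c' } };
#endif
}

// Self-check: both spellings must produce the same member values.
void
check_make_d ()
{
  D d = make_d ();
  assert (d.a.x == 1);
  assert (d.b.y == 2.0f);
  assert (d.c.z == 'c');
}
```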
Re: [committed] libstdc++: Specialize std::pointer_traits<__normal_iterator>
Here is the _Safe_iterator one.

While doing so I noticed that the pointer_traits rebind for
__normal_iterator was wrong and added tests for it.

For _Safe_iterator, maybe I should specialize only when it is instantiated
with __normal_iterator? Or maybe limit it to random_access_iterator_tag?

In any case the pointer_to implementation is problematic: we can only
produce a singular iterator, as I did for now.

François

On 28/09/21 9:25 pm, Jonathan Wakely via Libstdc++ wrote:
> This allows std::__to_address to be used with __normal_iterator in
> C++11/14/17 modes. Without the partial specialization the deduced
> pointer_traits::element_type is incorrect, and so the return type of
> __to_address is wrong.
>
> A similar partial specialization is probably needed for
> __gnu_debug::_Safe_iterator.
>
> Signed-off-by: Jonathan Wakely
>
> libstdc++-v3/ChangeLog:
>
> 	* include/bits/stl_iterator.h (pointer_traits): Define partial
> 	specialization for __normal_iterator.
> 	* testsuite/24_iterators/normal_iterator/to_address.cc: New test.
>
> Tested x86_64-linux. Committed to trunk.
diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h index 004d767224d..f7e851718c1 100644 --- a/libstdc++-v3/include/bits/stl_iterator.h +++ b/libstdc++-v3/include/bits/stl_iterator.h @@ -1294,13 +1294,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION private: using _Base = pointer_traits<_Iterator>; + template + using __base_rebind = typename _Base::template rebind<_Tp>; + public: using element_type = typename _Base::element_type; using pointer = __gnu_cxx::__normal_iterator<_Iterator, _Container>; using difference_type = typename _Base::difference_type; template - using rebind = __gnu_cxx::__normal_iterator<_Tp, _Container>; + using rebind = + __gnu_cxx::__normal_iterator<__base_rebind<_Tp>, _Container>; static pointer pointer_to(element_type& __e) noexcept diff --git a/libstdc++-v3/include/debug/safe_iterator.h b/libstdc++-v3/include/debug/safe_iterator.h index 5584d06de5a..5461d2b342f 100644 --- a/libstdc++-v3/include/debug/safe_iterator.h +++ b/libstdc++-v3/include/debug/safe_iterator.h @@ -1013,6 +1013,44 @@ namespace __gnu_debug } // namespace __gnu_debug +#if __cplusplus >= 201103L +namespace std _GLIBCXX_VISIBILITY(default) +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + template +struct pointer_traits<__gnu_debug::_Safe_iterator<_Iterator, _Sequence, + _Category>> +{ +private: + using _Base = pointer_traits<_Iterator>; + + template + using __base_rebind = typename _Base::template rebind<_Tp>; + +public: + using element_type = typename _Base::element_type; + using pointer = + __gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>; + using difference_type = typename _Base::difference_type; + + template + using rebind = + __gnu_debug::_Safe_iterator<__base_rebind<_Tp>, _Sequence, _Category>; + + static pointer + pointer_to(element_type& __e) noexcept + { return pointer(_Base::pointer_to(__e), nullptr); } + +#if __cplusplus >= 202002L + static element_type* + to_address(pointer __p) noexcept + { return __p.base(); } +#endif +}; 
+_GLIBCXX_END_NAMESPACE_VERSION +} // namespace +#endif + #undef _GLIBCXX_DEBUG_VERIFY_DIST_OPERANDS #undef _GLIBCXX_DEBUG_VERIFY_REL_OPERANDS #undef _GLIBCXX_DEBUG_VERIFY_EQ_OPERANDS diff --git a/libstdc++-v3/testsuite/20_util/pointer_traits/rebind.cc b/libstdc++-v3/testsuite/20_util/pointer_traits/rebind.cc index 159ea8f5294..b78e974d777 100644 --- a/libstdc++-v3/testsuite/20_util/pointer_traits/rebind.cc +++ b/libstdc++-v3/testsuite/20_util/pointer_traits/rebind.cc @@ -18,6 +18,7 @@ // { dg-do compile { target c++11 } } #include +#include using std::is_same; @@ -66,3 +67,13 @@ template }; // PR libstdc++/72793 specialization of pointer_traits is still well-formed: std::pointer_traits>::element_type e; + +static_assert(is_same::iterator, long>>::element_type, + long>::value, + "iterator rebind"); + +static_assert(is_same::const_iterator, long>>::element_type, + long>::value, + "const_iterator rebind"); diff --git a/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc b/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc index 510d627435f..433c803beb1 100644 --- a/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc +++ b/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc @@ -1,6 +1,9 @@ // { dg-do compile { target { c++11 } } } #include +#include #include char* p = std::__to_address(std::string("1").begin()); const char* q = std::__to_address(std::string("2").cbegin()); +int* r = std::__to_address(std::vector(1, 1).begin()); +const int* s = std::__to_address(std::vector(1, 1).cbegin());
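[Editor's sketch] The rebind fix in the patch above (rebinding through the base pointer's own traits instead of wrapping the target type directly) can be modelled with a toy wrapper. `Wrapper` is a hypothetical program-defined type standing in for the iterator adaptors; only the standard `std::pointer_traits` interface is assumed, and the sketch is only meant for wrappers around non-void object pointers.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Toy iterator-like wrapper around a pointer type.
template <typename Ptr>
struct Wrapper
{
  Ptr p;
  Ptr base () const { return p; }
};

namespace std
{
  // Partial specialization for the program-defined Wrapper, delegating
  // everything to the traits of the wrapped pointer.
  template <typename Ptr>
  struct pointer_traits<Wrapper<Ptr>>
  {
  private:
    using Base = pointer_traits<Ptr>;

  public:
    using pointer = Wrapper<Ptr>;
    using element_type = typename Base::element_type;
    using difference_type = typename Base::difference_type;

    // Rebind the *wrapped pointer* to T, then wrap the result.  Writing
    // Wrapper<T> here instead would yield Wrapper<long> rather than
    // Wrapper<long*>, which is the kind of mistake the patch corrects.
    template <typename T>
    using rebind = Wrapper<typename Base::template rebind<T>>;

    static pointer
    pointer_to (element_type &e) noexcept
    { return pointer{ Base::pointer_to (e) }; }
  };
}

static_assert (std::is_same<std::pointer_traits<Wrapper<int *>>::element_type,
                            int>::value,
               "element_type comes from the wrapped pointer");
static_assert (std::is_same<std::pointer_traits<Wrapper<int *>>::rebind<long>,
                            Wrapper<long *>>::value,
               "rebind goes through the wrapped pointer's traits");

// Self-check: pointer_to yields a wrapper around the object's address.
void
check_pointer_to ()
{
  int x = 0;
  assert (std::pointer_traits<Wrapper<int *>>::pointer_to (x).base () == &x);
}
```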
RE: [PATCH][GCC] arm: Add Cortex-R52+ multilib
> Subject: Re: [PATCH][GCC] arm: Add Cortex-R52+ multilib
>
> I think the RTEMS multilibs are based on the products that RTEMS supports,
> so this is really the RTEMS maintainers' call.
>
> Joel?

Ping :)

> On 22/09/2021 09:46, Przemyslaw Wirkus via Gcc-patches wrote:
> > Patch is adding multilib entries for `cortex-r52plus` CPU.
> >
> > See:
> > https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r52-plus
> >
> > OK for master?
> >
> > gcc/ChangeLog:
> >
> > 2021-09-16  Przemyslaw Wirkus
> >
> > 	* config/arm/t-rtems: Add "-mthumb -mcpu=cortex-r52plus
> > 	-mfloat-abi=hard" multilib.
Re: [PATCH][GCC] arm: Add Cortex-R52+ multilib
On Thu, Sep 30, 2021, 3:37 PM Przemyslaw Wirkus wrote:

> > Subject: Re: [PATCH][GCC] arm: Add Cortex-R52+ multilib
> >
> > I think the RTEMS multilibs are based on the products that RTEMS supports,
> > so this is really the RTEMS maintainers' call.
> >
> > Joel?
>
> Ping :)

I'm ok deferring it since Sebastian doesn't think there is a user right
now. But I'm actually rather ambivalent. If it makes it easier to maintain
versus the other embedded arm targets then I'm all for it. Maintaining
these configurations is a pain.

--joel

> > On 22/09/2021 09:46, Przemyslaw Wirkus via Gcc-patches wrote:
> > > Patch is adding multilib entries for `cortex-r52plus` CPU.
> > >
> > > See:
> > > https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r52-plus
> > >
> > > OK for master?
> > >
> > > gcc/ChangeLog:
> > >
> > > 2021-09-16  Przemyslaw Wirkus
> > >
> > > 	* config/arm/t-rtems: Add "-mthumb -mcpu=cortex-r52plus
> > > 	-mfloat-abi=hard" multilib.
Re: [PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
On Thu, Sep 30, 2021 at 03:34:24PM -0400, Jason Merrill wrote: > On 9/30/21 10:50, Marek Polacek wrote: > > This patch addresses one of my leftovers from GCC 11. C++20 introduced > > [depr.array.comp]: > > "Equality and relational comparisons between two operands of array type are > > deprecated." > > so this patch adds -Wdeprecated-array-compare (enabled by default in C++20). > > Why not enable it by default in all modes? It was always pretty dubious > code. Sure, it could be done, but it kind of complicates things: we'd probably need a different option and a different message because it seems incorrect to say "deprecated" in e.g. C++17 when this was only deprecated in C++20. I'd rather not add another option; if it stays -Wdeprecated-array-compare but -Wno-deprecated doesn't turn it off that also seems weird. I could rename it to -Warray-compare, enable by -Wall, and only append "is deprecated" to the warning message in C++20. Does that seem better? > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > PR c++/97573 > > > > gcc/c-family/ChangeLog: > > > > * c-opts.c (c_common_post_options): In C++20, turn on > > -Wdeprecated-array-compare. > > * c.opt (Wdeprecated-array-compare): New option. > > > > gcc/cp/ChangeLog: > > > > * typeck.c (do_warn_deprecated_array_compare): New. > > (cp_build_binary_op): Call it for equality and relational comparisons. > > > > gcc/ChangeLog: > > > > * doc/invoke.texi: Document -Wdeprecated-array-compare. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/tree-ssa/pr15791-1.C: Add dg-warning. > > * g++.dg/cpp2a/array-comp1.C: New test. > > * g++.dg/cpp2a/array-comp2.C: New test. > > * g++.dg/cpp2a/array-comp3.C: New test. 
> > --- > > gcc/c-family/c-opts.c | 5 > > gcc/c-family/c.opt| 4 +++ > > gcc/cp/typeck.c | 28 +++ > > gcc/doc/invoke.texi | 19 - > > gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ > > gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + > > gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ > > gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- > > 8 files changed, 150 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C > > > > diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c > > index 3eaab5e1530..00b52cc5e12 100644 > > --- a/gcc/c-family/c-opts.c > > +++ b/gcc/c-family/c-opts.c > > @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) > >warn_deprecated_enum_float_conv, > >cxx_dialect >= cxx20 && warn_deprecated); > > + /* -Wdeprecated-array-compare is enabled by default in C++20. */ > > + SET_OPTION_IF_UNSET (&global_options, &global_options_set, > > + warn_deprecated_array_compare, > > + cxx_dialect >= cxx20 && warn_deprecated); > > + > > /* Declone C++ 'structors if -Os. */ > > if (flag_declone_ctor_dtor == -1) > > flag_declone_ctor_dtor = optimize_size; > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > > index 9c151d19870..a4f0ea68594 100644 > > --- a/gcc/c-family/c.opt > > +++ b/gcc/c-family/c.opt > > @@ -540,6 +540,10 @@ Wdeprecated > > C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) > > ; Documented in common.opt > > +Wdeprecated-array-compare > > +C++ ObjC++ Var(warn_deprecated_array_compare) Warning > > +Warn about deprecated comparisons between two operands of array type. 
> > + > > Wdeprecated-copy > > C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, > > Wextra) > > Mark implicitly-declared copy operations as deprecated if the class has a > > diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c > > index a2398dbe660..1e3a41104d6 100644 > > --- a/gcc/cp/typeck.c > > +++ b/gcc/cp/typeck.c > > @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see > > #include "attribs.h" > > #include "asan.h" > > #include "gimplify.h" > > +#include "tree-pretty-print.h" > > static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); > > static tree cp_build_function_call (tree, tree, tsubst_flags_t); > > @@ -4725,6 +4726,21 @@ do_warn_enum_conversions (location_t loc, enum > > tree_code code, tree type0, > > } > > } > > +/* Warn about C++20 [depr.array.comp] array comparisons: "Equality > > + and relational comparisons between two operands of array type are > > + deprecated." */ > > + > > +static inline void > > +do_warn_deprecated_array_compare (location_t location, tree_code code, > > + tree op0, tree op1) > > +{ > > + if (warning_at (location, OPT_Wdeprecated_array_compare, >
Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]
Hi!

On Thu, Sep 30, 2021 at 11:06:50AM +0800, Kewen.Lin wrote:

[ huge snip ]

> Based on the understanding and testing, I think it's safe to adopt this patch.
> Do both Peter and you agree the rs6000_expand_builtin will catch the invalid
> built-in?
> Is there some special case which probably escapes out?

The function rs6000_builtin_decl has a terribly generic name. Where all
is it called from? Do all such places allow the change in semantics?
Do any comments or other documentation need to change? Is the function
name still good?

> By the way, I tested the bif rewriting patch series V5, it couldn't make the
> original
> case in PR (S5) pass, I may miss something or the used series isn't
> up-to-date. Could
> you help to have a try? I agree with Peter, if the rewriting can fix this
> issue, then
> we don't need this patch for trunk any more, I'm happy to abandon this. :)

(Mail lines are 70 or so chars max, so that they can be quoted a few
levels).

If we do need a band-aid for 10 and 11 (and we do as far as I can see),
I'd like to see one for just MMA there, and let all other badness fade
into history. Unless you can convince me (in the patch / commit message)
that this is safe :-)

Whichever way you choose, it is likely best to do the same on 10 and 11
as on trunk, since it will all be replaced on trunk soon anyway.


Segher
RE: [PATCH][GCC] arm: Enable Cortex-R52+ CPU
> Subject: Re: [PATCH][GCC] arm: Enable Cortex-R52+ CPU
>
> This is OK

Applying as r52+ is now in Binutils.

commit cd08eae26ed23497ace5f4ee6f3a41eb5bd36c38

> Ramana
>
> On 22/09/2021, 09:45, "Przemyslaw Wirkus" wrote:
>
> Patch is adding Cortex-R52+ as 'cortex-r52plus' command line
> flag for -mcpu option.
>
> See: https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r52-plus
>
> OK for master?
>
> gcc/ChangeLog:
>
> 2021-09-22  Przemyslaw Wirkus
>
> 	* config/arm/arm-cpus.in: Add Cortex-R52+ CPU.
> 	* config/arm/arm-tables.opt: Regenerate.
> 	* config/arm/arm-tune.md: Regenerate.
> 	* doc/invoke.texi: Update docs.
[Patch] Fortran: Various CLASS + assumed-rank fixed [PR102541]
Hi all, this patch fixes a bunch of issues with CLASS. * * * Side remark: I disliked the way CLASS is represented when it was introduced; when writing the testcase for this PR and kept fixing the testcase fallout, I started to hate it! I am sure that there are more issues – but I tried hard not too look closer at surrounding code to avoid hitting more issues. (If you look for a project, I think if you put attributes on separate lines, like a separate "POINTER :: var" line, you have a high chance to hit the error.) * * * What I found rather puzzling is that the 'optional' argument could be either on sym->attr.optional or on CLASS_DATA (sym)->attr.optional. I think one occurs for 'foo2' and the other for 'foo4' - not that I understand why it differs. I think it is otherwise straight forward. Regarding the original issue: In order to detect an assumed-size argument to an assumed-rank array, the last dimension has 'ubound = -1' to indicate an assume-size array; for those size(x, dim=rank(x)-1) == -1 and size(x) < 0 However, when the dummy argument (and hence: actual argument) is either a pointer or an allocatable, the bound is passed as is (in particular, "-1" is a valid ubound and size(x) >= 0). – However, if the actual argument is unallocated/not associated, rank(var) still is supposed to work - hence, it has to be set. The last two items did work before - but not for CLASS -> CLASS. Additionally, the ubound = -1 had an issue for CLASS -> TYPE as the code assumed that expr->ref is the whole array ("var(full-array-ref)") but for CLASS the expr->ref is a component and only expr->ref->next is the array ref. ("var%_data(full-array-ref)"). OK for mainline? 
Tobias

Fortran: Various CLASS + assumed-rank fixed [PR102541]

Starting point was PR102541, where a previous patch caused an invalid e->ref access for class. When testing, it turned out that for CLASS to CLASS the code was never executed - additionally, issues appeared for optional and a bogus error for -fcheck=all. In particular: There were a bunch of issues related to optional CLASS, which can have the 'attr.dummy' set in CLASS_DATA (sym) - but sometimes also in 'sym'!?! Additionally, gfc_variable_attr could return pointer = 1 for nonpointers when the expr is no longer "var" but "var%_data".

	PR fortran/102541

gcc/fortran/ChangeLog:

	* check.c (gfc_check_present): Handle optional CLASS.
	* interface.c (gfc_compare_actual_formal): Likewise.
	* trans-array.c (gfc_trans_g77_array): Likewise.
	* trans-decl.c (gfc_build_dummy_array_decl): Likewise.
	* trans-types.c (gfc_sym_type): Likewise.
	* primary.c (gfc_variable_attr): Fixes for dummy and pointer when 'class%_data' is passed.
	* trans-expr.c (set_dtype_for_unallocated, gfc_conv_procedure_call): For assumed-rank dummy, fix setting rank for dealloc/notassoc actual and setting ubound to -1 for assumed-size actuals.

gcc/testsuite/ChangeLog:

	* gfortran.dg/assumed_rank_24.f90: New test.

gcc/fortran/check.c | 4 +- gcc/fortran/interface.c | 9 +- gcc/fortran/primary.c | 20 +++- gcc/fortran/trans-array.c | 4 +- gcc/fortran/trans-decl.c | 3 +- gcc/fortran/trans-expr.c | 81 --- gcc/fortran/trans-types.c | 3 +- gcc/testsuite/gfortran.dg/assumed_rank_24.f90 | 137 ++ 8 files changed, 213 insertions(+), 48 deletions(-) diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index f31ad68053b..677209ee95e 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -4530,7 +4530,9 @@ gfc_check_present (gfc_expr *a) return false; } - if (!sym->attr.optional) + /* For CLASS, the optional attribute might be set at either location. */ + if ((sym->ts.type != BT_CLASS || !CLASS_DATA (sym)->attr.optional) + && !sym->attr.optional) { gfc_error ("%qs argument of %qs intrinsic at %L must be of " "an OPTIONAL dummy variable", diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c index a2fea0e97b8..34a0fddffe2 100644 --- a/gcc/fortran/interface.c +++ b/gcc/fortran/interface.c @@ -3546,8 +3546,13 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal, "at %L", where); return false; } - if (!f->sym->attr.optional - || (in_statement_function && f->sym->attr.optional)) + /* For CLASS, the optional attribute might be set at either location. */ + if (((f->sym->ts.type != BT_CLASS || !CLASS_DATA (f->sym)->attr.optional) + && !f->sym->attr.optional) + || (in_statement_function + && (f->sym->attr.optional + || (f->sym->ts.type == BT_CLASS +
[Ada] Switch to SR0660
Change to 64-bit time to avoid the Unix Epochalypse.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* libgnat/s-parame__vxworks.ads (time_t_bits): Change to Long_Long_Integer'Size.

diff --git a/gcc/ada/libgnat/s-parame__vxworks.ads b/gcc/ada/libgnat/s-parame__vxworks.ads
--- a/gcc/ada/libgnat/s-parame__vxworks.ads
+++ b/gcc/ada/libgnat/s-parame__vxworks.ads
@@ -108,11 +108,11 @@ package System.Parameters is
    -- Select the appropriate time_t_bits for the VSB in use, then rebuild
    -- the runtime using instructions in adainclude/libada.gpr.
 
-   time_t_bits : constant := Long_Integer'Size;
+   -- time_t_bits : constant := Long_Integer'Size;
    -- Number of bits in type time_t for SR0650 and before and SR0660 with
    -- non-default configuration.
 
-   -- time_t_bits : constant := Long_Long_Integer'Size;
+   time_t_bits : constant := Long_Long_Integer'Size;
    -- Number of bits in type time_t for SR0660 with default configuration.
 
    --
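As a side illustration of why the width of time_t matters, here is a minimal Python sketch (not part of the patch) that computes the last instant a signed count of seconds since 1970 can express. The helper names `max_seconds` and `last_representable` are made up for this example, and it assumes time_t is a signed count of seconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch, the origin of a seconds-based time_t (assumption of this sketch).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def max_seconds(bits: int) -> int:
    """Largest value of a signed counter that is `bits` wide."""
    return 2 ** (bits - 1) - 1


def last_representable(bits: int) -> datetime:
    """Latest UTC instant a signed `bits`-wide count of seconds can express."""
    return EPOCH + timedelta(seconds=max_seconds(bits))


# A 32-bit signed counter runs out early in 2038.
print(last_representable(32).isoformat())  # 2038-01-19T03:14:07+00:00
# A 64-bit counter has vastly more headroom.
print(max_seconds(64) > max_seconds(32))   # True
```

Only the 32-bit case is converted to a date here, because the 64-bit limit lies far beyond what Python's datetime can represent.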
[Ada] Fix CodePeer warnings
This commit fixes warnings emitted by the CodePeer static analyzer.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* sem_aggr.adb (Resolve_Iterated_Component_Association): Initialize Id_Typ to Any_Type by default.

diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -1605,7 +1605,7 @@ package body Sem_Aggr is
 
       Loc : constant Source_Ptr := Sloc (N);
       Id : constant Entity_Id := Defining_Identifier (N);
-      Id_Typ : Entity_Id;
+      Id_Typ : Entity_Id := Any_Type;
 
       -----------------------
       -- Remove_References --
[Ada] No ABE check needed for an expression function call.
If -gnatE is specified, then in some cases a call to a subprogram includes a check that the body of the subprogram has been elaborated. No such check is needed in the case of an expression function that is not a completion; the function has no prior declaration. However, in some cases the compiler was incorrectly treating an expression function declared in the visible part of a package as though it had a function body declared within the package body. Not only could this result in an unnecessary check, but that check could fail and raise Program_Error. The ABE check is now correctly omitted.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* sem_elab.adb (Is_Safe_Call): Return True in the case of a (possibly rewritten) call to an expression function.

diff --git a/gcc/ada/sem_elab.adb b/gcc/ada/sem_elab.adb
--- a/gcc/ada/sem_elab.adb
+++ b/gcc/ada/sem_elab.adb
@@ -13621,6 +13621,13 @@ package body Sem_Elab is
          then
             return True;
 
+         -- A call to an expression function that is not a completion cannot
+         -- cause an ABE because it has no prior declaration; this remains
+         -- true even if the FE transforms the callee into something else.
+
+         elsif Nkind (Original_Node (Spec_Decl)) = N_Expression_Function then
+            return True;
+
          -- Subprogram bodies which wrap attribute references used as actuals
          -- in instantiations are always ABE-safe. These bodies are artifacts
          -- of expansion.
[Ada] Improve error message for .ali file version mismatch
When the binder detects a mismatch between the versions of two .ali files, include the version information in the resulting message. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * bcheck.adb (Check_Versions): In the case of an ali file version mismatch, if distinct integer values can be extracted from the two version strings then include those values in the generated error message.diff --git a/gcc/ada/bcheck.adb b/gcc/ada/bcheck.adb --- a/gcc/ada/bcheck.adb +++ b/gcc/ada/bcheck.adb @@ -29,6 +29,7 @@ with Binderr; use Binderr; with Butil;use Butil; with Casing; use Casing; with Fname;use Fname; +with Gnatvsn; with Namet;use Namet; with Opt; use Opt; with Osint; @@ -1324,11 +1325,78 @@ package body Bcheck is or else ALIs.Table (A).Ver (1 .. VL) /= ALIs.Table (ALIs.First).Ver (1 .. VL) then -Error_Msg_File_1 := ALIs.Table (A).Sfile; -Error_Msg_File_2 := ALIs.Table (ALIs.First).Sfile; +declare + No_Version : constant Int := -1; -Consistency_Error_Msg - ("{ and { compiled with different GNAT versions"); + function Extract_Version (S : String) return Int; + -- Attempts to extract and return a nonnegative library + -- version number from the given string; if unsuccessful, + -- then returns No_Version. + + - + -- Extract_Version -- + - + + function Extract_Version (S : String) return Int is + use Gnatvsn; + + Prefix : constant String := + Verbose_Library_Version + (1 .. Verbose_Library_Version'Length + - Library_Version'Length); + begin + pragma Assert (S'First = 1); + + if S'Length > Prefix'Length + and then S (1 .. Prefix'Length) = Prefix + then + declare +Suffix : constant String := + S (1 + Prefix'Length .. S'Last); + +Result : Nat := 0; + begin +if Suffix'Length < 10 + and then (for all C of Suffix => C in '0' .. '9') +then + -- Using Int'Value leads to complications in + -- building the binder, so DIY. 
+ + for C of Suffix loop + Result := (10 * Result) + +(Character'Pos (C) - Character'Pos ('0')); + end loop; + return Result; +end if; + end; + end if; + return No_Version; + end Extract_Version; + + V1_Text : constant String := + ALIs.Table (A).Ver (1 .. ALIs.Table (A).Ver_Len); + V2_Text : constant String := + ALIs.Table (ALIs.First).Ver (1 .. VL); + V1 : constant Int := Extract_Version (V1_Text); + V2 : constant Int := Extract_Version (V2_Text); + + Include_Version_Numbers_In_Message : constant Boolean := + (V1 /= V2) and (V1 /= No_Version) and (V2 /= No_Version); +begin + Error_Msg_File_1 := ALIs.Table (A).Sfile; + Error_Msg_File_2 := ALIs.Table (ALIs.First).Sfile; + + if Include_Version_Numbers_In_Message then + Error_Msg_Nat_1 := V1; + Error_Msg_Nat_2 := V2; + Consistency_Error_Msg +("{ and { compiled with different GNAT versions" + & ", v# and v#"); + else + Consistency_Error_Msg +("{ and { compiled with different GNAT versions"); + end if; +end; end if; end loop; end Check_Versions;
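The version extraction described in this message can be sketched outside the binder as follows. This is a rough Python approximation of the logic in the quoted diff, not the actual implementation: the prefix string "GNAT Lib v" is an assumption about what Verbose_Library_Version minus Library_Version yields, and `mismatch_message` is a hypothetical helper that only mimics the shape of the new error text.

```python
PREFIX = "GNAT Lib v"  # assumed prefix; the patch derives it from Gnatvsn
NO_VERSION = -1


def extract_version(text: str) -> int:
    """Return the nonnegative number following PREFIX, or NO_VERSION."""
    if len(text) > len(PREFIX) and text.startswith(PREFIX):
        suffix = text[len(PREFIX):]
        # Fewer than 10 digits, mirroring the check in the patch.
        if len(suffix) < 10 and all("0" <= c <= "9" for c in suffix):
            result = 0
            for c in suffix:
                # Manual digit accumulation, as the patch avoids Int'Value.
                result = 10 * result + (ord(c) - ord("0"))
            return result
    return NO_VERSION


def mismatch_message(v1_text: str, v2_text: str) -> str:
    """Build the message, adding the numbers only when both are usable."""
    v1, v2 = extract_version(v1_text), extract_version(v2_text)
    message = "compiled with different GNAT versions"
    if v1 != v2 and v1 != NO_VERSION and v2 != NO_VERSION:
        message += f", v{v1} and v{v2}"
    return message


print(mismatch_message("GNAT Lib v11", "GNAT Lib v12"))
```

When either string cannot be parsed, the sketch falls back to the original wording, which matches the two branches at the end of the diff.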
[Ada] Spurious range checks on aggregate with non-static bounds
Refine predicate Must_Slide to ensure that no spurious range checks are generated when context subtype and aggregate subtype are non-static. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_aggr.adb (Must_Slide): If the aggregate only contains an others_clause no sliding id involved. Otherwise sliding is required if any bound of the aggregate or the context subtype is non-static.diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb --- a/gcc/ada/exp_aggr.adb +++ b/gcc/ada/exp_aggr.adb @@ -124,7 +124,8 @@ package body Exp_Aggr is -- constants that are done in place. function Must_Slide - (Obj_Type : Entity_Id; + (Aggr : Node_Id; + Obj_Type : Entity_Id; Typ : Entity_Id) return Boolean; -- A static array aggregate in an object declaration can in most cases be -- expanded in place. The one exception is when the aggregate is given @@ -1776,7 +1777,7 @@ package body Exp_Aggr is if Nkind (Parent (N)) = N_Assignment_Statement and then Is_Array_Type (Comp_Typ) and then Present (Component_Associations (Expr_Q)) - and then Must_Slide (Comp_Typ, Etype (Expr_Q)) + and then Must_Slide (N, Comp_Typ, Etype (Expr_Q)) then Set_Expansion_Delayed (Expr_Q, False); Set_Analyzed (Expr_Q, False); @@ -6855,7 +6856,7 @@ package body Exp_Aggr is and then Parent_Kind = N_Object_Declaration and then Present (Expression (Parent_Node)) and then not - Must_Slide (Etype (Defining_Identifier (Parent_Node)), Typ) + Must_Slide (N, Etype (Defining_Identifier (Parent_Node)), Typ) and then not Is_Bit_Packed_Array (Typ) then In_Place_Assign_OK_For_Declaration := True; @@ -9616,13 +9617,16 @@ package body Exp_Aggr is function Must_Slide - (Obj_Type : Entity_Id; + (Aggr : Node_Id; + Obj_Type : Entity_Id; Typ : Entity_Id) return Boolean is begin -- No sliding if the type of the object is not established yet, if it is -- an unconstrained type whose actual subtype comes from the aggregate, - -- or if the two types are identical. + -- or if the two types are identical. 
If the aggregate contains only + -- an Others_Clause it gets its type from the context and no sliding + -- is involved either. if not Is_Array_Type (Obj_Type) then return False; @@ -9633,8 +9637,13 @@ package body Exp_Aggr is elsif Typ = Obj_Type then return False; + elsif Is_Others_Aggregate (Aggr) then + return False; + else -- Sliding can only occur along the first dimension + -- If any the bounds of non-static sliding is required + -- to force potential range checks. declare Bounds1 : constant Range_Nodes := @@ -9648,7 +9657,8 @@ package body Exp_Aggr is not Is_OK_Static_Expression (Bounds1.Last) or else not Is_OK_Static_Expression (Bounds2.Last) then - return False; + return True; + else return Expr_Value (Bounds1.First) /= Expr_Value (Bounds2.First) or else
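The refined decision in Must_Slide can be summarised by a small model. The Python sketch below is only an approximation under stated assumptions: a bound of `None` stands for a non-static bound, the earlier checks for non-array and unconstrained types are left out, and the final comparison of the upper bounds is assumed because the quoted diff is cut off before it.

```python
from typing import Optional, Tuple

# (first, last) bounds of the first dimension; None models a non-static bound.
Bounds = Tuple[Optional[int], Optional[int]]


def must_slide(is_others_aggregate: bool,
               obj_bounds: Bounds,
               aggr_bounds: Bounds,
               same_type: bool = False) -> bool:
    """Hypothetical model of the predicate after the patch."""
    # Identical types, or an aggregate with only an others clause: no sliding.
    if same_type or is_others_aggregate:
        return False
    # Any non-static bound now forces sliding, so range checks can be emitted.
    if any(bound is None for bound in obj_bounds + aggr_bounds):
        return True
    # Static bounds: slide only when they differ (upper-bound test assumed).
    return obj_bounds != aggr_bounds


print(must_slide(False, (1, None), (1, 10)))  # True
print(must_slide(True, (1, None), (1, 10)))   # False
```

The behavioural change in the patch corresponds to the middle branch, which returned False before.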
[Ada] Support gmem.out longer than 2G on 32 bit platforms
C stream fopen function opens file for size no more than 2G by default on 32 bit platforms. Use fopen and other stream functions from System.CRTL to overcome this limit. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * libgnat/memtrack.adb (Putc): New routine wrapped around fputc with error check. (Write): New routine wrapped around fwrite with error check. Remove bound functions fopen, fwrite, fputs, fclose, OS_Exit. Use the similar routines from System.CRTL and System.OS_Lib.diff --git a/gcc/ada/libgnat/memtrack.adb b/gcc/ada/libgnat/memtrack.adb --- a/gcc/ada/libgnat/memtrack.adb +++ b/gcc/ada/libgnat/memtrack.adb @@ -69,10 +69,13 @@ pragma Source_File_Name (System.Memory, Body_File_Name => "memtrack.adb"); with Ada.Exceptions; +with GNAT.IO; + with System.Soft_Links; with System.Traceback; with System.Traceback_Entries; -with GNAT.IO; +with System.CRTL; +with System.OS_Lib; with System.OS_Primitives; package body System.Memory is @@ -93,30 +96,14 @@ package body System.Memory is (Ptr : System.Address; Size : size_t) return System.Address; pragma Import (C, c_realloc, "realloc"); - subtype File_Ptr is System.Address; - - function fopen (Path : String; Mode : String) return File_Ptr; - pragma Import (C, fopen); - - procedure OS_Exit (Status : Integer); - pragma Import (C, OS_Exit, "__gnat_os_exit"); - pragma No_Return (OS_Exit); - In_Child_After_Fork : Integer; pragma Import (C, In_Child_After_Fork, "__gnat_in_child_after_fork"); - procedure fwrite - (Ptr: System.Address; - Size : size_t; - Nmemb : size_t; - Stream : File_Ptr); - pragma Import (C, fwrite); + subtype File_Ptr is CRTL.FILEs; - procedure fputc (C : Integer; Stream : File_Ptr); - pragma Import (C, fputc); + procedure Write (Ptr : System.Address; Size : size_t); - procedure fclose (Stream : File_Ptr); - pragma Import (C, fclose); + procedure Putc (Char : Character); procedure Finalize; pragma Export (C, Finalize, "__gnat_finalize"); @@ -210,20 +197,17 @@ package body System.Memory is 
Timestamp := System.OS_Primitives.Clock; Call_Chain (Tracebk, Max_Call_Stack, Num_Calls, Skip_Frames => 2); - fputc (Character'Pos ('A'), Gmemfile); - fwrite (Result'Address, Address_Size, 1, Gmemfile); - fwrite (Actual_Size'Address, size_t'Max_Size_In_Storage_Elements, 1, - Gmemfile); - fwrite (Timestamp'Address, Duration'Max_Size_In_Storage_Elements, 1, - Gmemfile); - fwrite (Num_Calls'Address, Integer'Max_Size_In_Storage_Elements, 1, - Gmemfile); + Putc ('A'); + Write (Result'Address, Address_Size); + Write (Actual_Size'Address, size_t'Max_Size_In_Storage_Elements); + Write (Timestamp'Address, Duration'Max_Size_In_Storage_Elements); + Write (Num_Calls'Address, Integer'Max_Size_In_Storage_Elements); for J in Tracebk'First .. Tracebk'First + Num_Calls - 1 loop declare Ptr : System.Address := PC_For (Tracebk (J)); begin - fwrite (Ptr'Address, Address_Size, 1, Gmemfile); + Write (Ptr'Address, Address_Size); end; end loop; @@ -246,8 +230,8 @@ package body System.Memory is procedure Finalize is begin - if not Needs_Init then - fclose (Gmemfile); + if not Needs_Init and then CRTL.fclose (Gmemfile) /= 0 then + Put_Line ("gmem close error: " & OS_Lib.Errno_Message); end if; end Finalize; @@ -275,18 +259,16 @@ package body System.Memory is Call_Chain (Tracebk, Max_Call_Stack, Num_Calls, Skip_Frames => 2); Timestamp := System.OS_Primitives.Clock; - fputc (Character'Pos ('D'), Gmemfile); - fwrite (Addr'Address, Address_Size, 1, Gmemfile); - fwrite (Timestamp'Address, Duration'Max_Size_In_Storage_Elements, 1, - Gmemfile); - fwrite (Num_Calls'Address, Integer'Max_Size_In_Storage_Elements, 1, - Gmemfile); + Putc ('D'); + Write (Addr'Address, Address_Size); + Write (Timestamp'Address, Duration'Max_Size_In_Storage_Elements); + Write (Num_Calls'Address, Integer'Max_Size_In_Storage_Elements); for J in Tracebk'First .. 
Tracebk'First + Num_Calls - 1 loop declare Ptr : System.Address := PC_For (Tracebk (J)); begin - fwrite (Ptr'Address, Address_Size, 1, Gmemfile); + Write (Ptr'Address, Address_Size); end; end loop; @@ -304,29 +286,41 @@ package body System.Memory is procedure Gmem_Initialize is Timestamp : aliased Duration; - + File_Mode : constant String := "wb" & ASCII.NUL; begin if Needs_Init then Needs_Init := False; System.OS_Primiti
[Ada] Info. gathering in preparation for more efficiency improvements
Information gathering in preparation for more efficiency improvements. We gather statistics from the running compiler, and we also have gen_il generate information in the form of comments. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * atree.adb: Gather and print statistics about frequency of getter and setter calls. * atree.ads (Print_Statistics): New procedure for printing statistics. * debug.adb: Document -gnatd.A switch. * gen_il-gen.adb: Generate code for statistics gathering. Choose the offset of Homonym early. Misc cleanup. Put more comments in the generated code. * gen_il-internals.ads (Unknown_Offset): New value to indicate that the offset has not yet been chosen. * gnat1drv.adb: Call Print_Statistics. * libgnat/s-imglli.ads: Minor comment fix. * output.ads (Write_Int_64): New procedure to write a 64-bit value. Needed for new statistics, and could come in handy elsewhere. * output.adb (Write_Int_64): Likewise. * sinfo.ads: Remove obsolete comment. The xtreeprs program no longer exists. * types.ads: New 64-bit types needed for new statistics.diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb --- a/gcc/ada/atree.adb +++ b/gcc/ada/atree.adb @@ -211,6 +211,10 @@ package body Atree is (Old_N : Entity_Id; New_Kind : Entity_Kind); -- Above are the same as the ones for nodes, but for entities + procedure Update_Kind_Statistics (Field : Node_Or_Entity_Field); + -- Increment Set_Count (Field). This is in a procedure so we can put it in + -- pragma Debug for efficiency. + procedure Init_Nkind (N : Node_Id; Val : Node_Kind); -- Initialize the Nkind field, which must not have been set already. This -- cannot be used to modify an already-initialized Nkind field. See also @@ -905,7 +909,7 @@ package body Atree is Old_Kind : constant Node_Kind := Nkind (Old_N); -- If this fails, it means you need to call Reinit_Field_To_Zero before - -- calling Set_Nkind. + -- calling Mutate_Nkind. 
begin for J in Node_Field_Table (Old_Kind)'Range loop @@ -970,11 +974,17 @@ package body Atree is Nkind_Offset : constant Field_Offset := Field_Descriptors (F_Nkind).Offset; + procedure Update_Kind_Statistics (Field : Node_Or_Entity_Field) is + begin + Set_Count (Field) := Set_Count (Field) + 1; + end Update_Kind_Statistics; + procedure Set_Node_Kind_Type is new Set_8_Bit_Field (Node_Kind) with Inline; procedure Init_Nkind (N : Node_Id; Val : Node_Kind) is pragma Assert (Field_Is_Initial_Zero (N, F_Nkind)); begin + pragma Debug (Update_Kind_Statistics (F_Nkind)); Set_Node_Kind_Type (N, Nkind_Offset, Val); end Init_Nkind; @@ -1017,6 +1027,7 @@ package body Atree is Zero_Dynamic_Slots (Off_F (N) + Old_Size, Slots.Last); end if; + pragma Debug (Update_Kind_Statistics (F_Nkind)); Set_Node_Kind_Type (N, Nkind_Offset, Val); pragma Debug (Validate_Node_Write (N)); @@ -1049,6 +1060,7 @@ package body Atree is -- For now, we are allocating all entities with the same size, so we -- don't need to reallocate slots here. 
+ pragma Debug (Update_Kind_Statistics (F_Ekind)); Set_Entity_Kind_Type (N, Ekind_Offset, Val); pragma Debug (Validate_Node_Write (N)); @@ -1535,8 +1547,7 @@ package body Atree is for J in Fields'Range loop declare use Seinfo; -Desc : Field_Descriptor renames - Field_Descriptors (Fields (J)); +Desc : Field_Descriptor renames Field_Descriptors (Fields (J)); begin if Desc.Kind in Node_Id_Field | List_Id_Field then Fix_Parent (Get_Node_Field_Union (Fix_Node, Desc.Offset)); @@ -2477,4 +2488,60 @@ package body Atree is Zero_Header_Slots (N); end Zero_Slots; + -- + -- Print_Statistics -- + -- + + procedure Print_Statistics is + Total, G_Total, S_Total : Call_Count := 0; + begin + Write_Line ("Frequency of field getter and setter calls:"); + + for Field in Node_Or_Entity_Field loop + G_Total := G_Total + Get_Count (Field); + S_Total := S_Total + Set_Count (Field); + Total := G_Total + S_Total; + end loop; + + Write_Int_64 (Total); + Write_Str (" (100%) = "); + Write_Int_64 (G_Total); + Write_Str (" + "); + Write_Int_64 (S_Total); + Write_Line (" total getter and setter calls"); + + for Field in Node_Or_Entity_Field loop + declare +G : constant Call_Count := Get_Count (Field); +S : constant Call_Count := Set_Count (Field); +GS : constant Call_Count := G + S; + +Percent : constant Int := + Int ((Long_Float (GS) / Long_Float (Total)) * 100.0); + +
[Ada] Fix bug in inherited user-defined-literal aspects for tagged types
In some cases, an integer literal of a tagged type whose Integer_Literal aspect is inherited from an ancestor type was not handled correctly by the compiler. In particular, Ada RM 13.1(15.5) was not correctly implemented, resulting in the incorrect rejection of legal uses of integer literals with (incorrect) semantic error messages about illegal downward conversions. The same problem also affected the other two user-defined literal aspects, Real_Literal and String_Literal. These bugs are corrected. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_res.adb (Resolve): Two separate fixes. In the case where Find_Aspect for a literal aspect returns the aspect for a different (ancestor) type, call Corresponding_Primitive_Op to get the right callee. In the case where a downward tagged type conversion appears to be needed, generate a null extension aggregate instead, as per Ada RM 3.4(27). * sem_util.ads, sem_util.adb: Add new Corresponding_Primitive_Op function. It maps a primitive op of a tagged type and a descendant type of that tagged type to the corresponding primitive op of the descendant type. The body of this function was written by Javier Miranda.diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb --- a/gcc/ada/sem_res.adb +++ b/gcc/ada/sem_res.adb @@ -2920,6 +2920,16 @@ package body Sem_Res is Expr : Node_Id; begin + if Is_Derived_Type (Typ) +and then Is_Tagged_Type (Typ) +and then Base_Type (Etype (Callee)) /= Base_Type (Typ) + then + Callee := + Corresponding_Primitive_Op + (Ancestor_Op => Callee, + Descendant_Type => Base_Type (Typ)); + end if; + if Nkind (N) = N_Identifier then Expr := Expression (Declaration_Node (Entity (N))); @@ -2990,16 +3000,23 @@ package body Sem_Res is Set_Etype (Call, Etype (Callee)); - -- Conversion needed in case of an inherited aspect - -- of a derived type. - -- - -- ??? 
Need to do something different here for downward - -- tagged conversion case (which is only possible in the - -- case of a null extension); the current call to - -- Convert_To results in an error message about an illegal - -- downward conversion. + if Base_Type (Etype (Call)) /= Base_Type (Typ) then + -- Conversion may be needed in case of an inherited + -- aspect of a derived type. For a null extension, we + -- use a null extension aggregate instead because the + -- downward type conversion would be illegal. - Call := Convert_To (Typ, Call); + if Is_Null_Extension_Of + (Descendant => Typ, + Ancestor => Etype (Call)) + then +Call := Make_Extension_Aggregate (Loc, + Ancestor_Part => Call, + Null_Record_Present => True); + else +Call := Convert_To (Typ, Call); + end if; + end if; Rewrite (N, Call); end; diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -7073,6 +7073,79 @@ package body Sem_Util is end if; end Corresponding_Generic_Type; + + -- Corresponding_Primitive_Op -- + + + function Corresponding_Primitive_Op + (Ancestor_Op : Entity_Id; + Descendant_Type : Entity_Id) return Entity_Id + is + Typ : constant Entity_Id := Find_Dispatching_Type (Ancestor_Op); + Elmt : Elmt_Id; + Subp : Entity_Id; + Prim : Entity_Id; + begin + pragma Assert (Is_Dispatching_Operation (Ancestor_Op)); + pragma Assert (Is_Ancestor (Typ, Descendant_Type) + or else Is_Progenitor (Typ, Descendant_Type)); + + Elmt := First_Elmt (Primitive_Operations (Descendant_Type)); + + while Present (Elmt) loop + Subp := Node (Elmt); + + -- For regular primitives we only need to traverse the chain of + -- ancestors when the name matches the name of Ancestor_Op, but + -- for predefined dispatching operations we cannot rely on the + -- name of the primitive to identify a candidate since their name + -- is internally built adding a suffix to the name of the tagged + -- type. + + if Chars (Subp) = Char
[Ada] Improve error message for .ali file version mismatch
When the binder detects a mismatch between the versions of two .ali files, include the version information in the resulting message. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * bcheck.adb (Check_Versions): Add support for the case where the .ali file contains both a primary and a secondary version number, as in "GNAT Lib v22.20210809".diff --git a/gcc/ada/bcheck.adb b/gcc/ada/bcheck.adb --- a/gcc/ada/bcheck.adb +++ b/gcc/ada/bcheck.adb @@ -1325,60 +1325,105 @@ package body Bcheck is or else ALIs.Table (A).Ver (1 .. VL) /= ALIs.Table (ALIs.First).Ver (1 .. VL) then +-- Version mismatch found; generate error message. + declare - No_Version : constant Int := -1; + use Gnatvsn; + + Prefix : constant String := + Verbose_Library_Version + (1 .. Verbose_Library_Version'Length + - Library_Version'Length); + + type ALI_Version is record + Primary, Secondary : Int range -1 .. Int'Last; + end record; + + No_Version : constant ALI_Version := (-1, -1); - function Extract_Version (S : String) return Int; - -- Attempts to extract and return a nonnegative library - -- version number from the given string; if unsuccessful, + function Remove_Prefix (S : String) return String is + (S (S'First + Prefix'Length .. S'Last)); + + function Extract_Version (S : String) return ALI_Version; + -- Attempts to extract and return a pair of nonnegative library + -- version numbers from the given string; if unsuccessful, -- then returns No_Version. - -- Extract_Version -- - - function Extract_Version (S : String) return Int is - use Gnatvsn; - - Prefix : constant String := - Verbose_Library_Version - (1 .. Verbose_Library_Version'Length - - Library_Version'Length); - begin + function Extract_Version (S : String) return ALI_Version is pragma Assert (S'First = 1); + function Int_Value (Img : String) return Int; + -- Using Int'Value leads to complications in + -- building the binder, so DIY. 
+ + --- + -- Int_Value -- + --- + + function Int_Value (Img : String) return Int is + Result : Nat := 0; + begin + if Img'Length in 1 .. 9 + and then (for all C of Img => C in '0' .. '9') + then +for C of Img loop + Result := (10 * Result) + + (Character'Pos (C) - Character'Pos ('0')); +end loop; +return Result; + else +return -1; + end if; + end Int_Value; + + begin if S'Length > Prefix'Length - and then S (1 .. Prefix'Length) = Prefix +and then S (1 .. Prefix'Length) = Prefix then declare -Suffix : constant String := - S (1 + Prefix'Length .. S'Last); - -Result : Nat := 0; +Suffix: constant String := Remove_Prefix (S); +Dot_Found : Boolean := False; +Primary, Secondary : Int; begin -if Suffix'Length < 10 - and then (for all C of Suffix => C in '0' .. '9') -then - -- Using Int'Value leads to complications in - -- building the binder, so DIY. +for Dot_Index in Suffix'Range loop + if Suffix (Dot_Index) = '.' then + Dot_Found := True; + Primary := +Int_Value (Suffix (Suffix'First + .. Dot_Index - 1)); + Secondary := +Int_Value (Suffix (Dot_Index + 1 + .. Suffix'Last)); + exit; + end if; +end loop; - for C of Suffix loop -
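For readers who want to see the two-part version parsing in isolation, here is a hedged Python sketch of the idea. It is not the binder code: the prefix "GNAT Lib v" is taken from the example string in the ChangeLog, and the handling of a string without a dot is a guess, since the quoted diff is truncated before that case.

```python
from typing import Tuple

PREFIX = "GNAT Lib v"      # assumed from the example "GNAT Lib v22.20210809"
NO_VERSION = (-1, -1)


def int_value(img: str) -> int:
    """Parse 1 to 9 decimal digits by hand; return -1 for anything else."""
    if 1 <= len(img) <= 9 and all("0" <= c <= "9" for c in img):
        result = 0
        for c in img:
            result = 10 * result + (ord(c) - ord("0"))
        return result
    return -1


def extract_version(text: str) -> Tuple[int, int]:
    """Return (primary, secondary); -1 marks a part that could not be read."""
    if not (len(text) > len(PREFIX) and text.startswith(PREFIX)):
        return NO_VERSION
    suffix = text[len(PREFIX):]
    dot = suffix.find(".")
    if dot >= 0:
        # Split at the first dot, as the loop in the patch does.
        return int_value(suffix[:dot]), int_value(suffix[dot + 1:])
    # Assumption of this sketch: without a dot only a primary number exists.
    return int_value(suffix), -1


print(extract_version("GNAT Lib v22.20210809"))  # (22, 20210809)
```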
[Ada] Fix deleting CodePeer files for non-ordinary units
A routine for deleting SCIL files generated by previous CodePeer runs didn't expect compilation units that are subprogram renamings, generic renamings and generic subprogram declarations.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* comperr.adb (Delete_SCIL_Files): Handle generic subprogram declarations and renamings just like generic package declarations and renamings, respectively; handle N_Subprogram_Renaming_Declaration.

diff --git a/gcc/ada/comperr.adb b/gcc/ada/comperr.adb
--- a/gcc/ada/comperr.adb
+++ b/gcc/ada/comperr.adb
@@ -478,6 +478,7 @@ package body Comperr is
             when N_Package_Declaration
                | N_Subprogram_Body
                | N_Subprogram_Declaration
+               | N_Subprogram_Renaming_Declaration
             =>
                Unit_Name := Defining_Unit_Name (Specification (Main));
 
@@ -489,10 +490,10 @@ package body Comperr is
             =>
                Unit_Name := Defining_Unit_Name (Main);
 
-            -- No SCIL file generated for generic package declarations
+            -- No SCIL file generated for generic unit declarations
 
-            when N_Generic_Package_Declaration
-               | N_Generic_Package_Renaming_Declaration
+            when N_Generic_Declaration
+               | N_Generic_Renaming_Declaration
             =>
                return;
 
[Ada] Implementation of AI12-0212: iterator specs in array aggregates (II)
This patch adds a guard to the code generated in the second pass of the two-pass expansion for array aggregates described in AI12-0212. The guard is needed to prevent a spurious constraint error when incrementing the index used for aggregate insertion, before exiting the loop. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_aggr.adb (Expand_Array_Aggregate, Two_Pass_Aggregate_Expansion): Increment index for element insertion within the loop, only if upper bound has not been reached.diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb --- a/gcc/ada/exp_aggr.adb +++ b/gcc/ada/exp_aggr.adb @@ -6504,6 +6504,18 @@ package body Exp_Aggr is Expressions => New_List (New_Occurrence_Of (Index_Id, Loc; +-- Add guard to skip last increment when upper bound is reached. + +Incr := Make_If_Statement (Loc, + Condition => + Make_Op_Ne (Loc, + Left_Opnd => New_Occurrence_Of (Index_Id, Loc), + Right_Opnd => +Make_Attribute_Reference (Loc, + Prefix => New_Occurrence_Of (Index_Type, Loc), + Attribute_Name => Name_Last)), + Then_Statements => New_List (Incr)); + One_Loop := Make_Loop_Statement (Loc, Iteration_Scheme => Make_Iteration_Scheme (Loc, @@ -6561,11 +6573,10 @@ package body Exp_Aggr is return; elsif Present (Component_Associations (N)) - and then -Nkind (First (Component_Associations (N))) - = N_Iterated_Component_Association - and then Present - (Iterator_Specification (First (Component_Associations (N +and then Nkind (First (Component_Associations (N))) = + N_Iterated_Component_Association +and then + Present (Iterator_Specification (First (Component_Associations (N then Two_Pass_Aggregate_Expansion (N); return; @@ -7389,7 +7400,7 @@ package body Exp_Aggr is elsif Nkind (Comp) = N_Iterated_Element_Association then return -1; --- TBD : Create code for a loop and add to generated code, +-- ??? Need to create code for a loop and add to generated code, -- as is done for array aggregates with iterated element -- associations, instead of using Append operations.
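The effect of the guard can be reproduced in miniature. The following Python sketch is an analogy rather than the expander's generated code: `BoundedIndex` and `ConstraintError` are made-up stand-ins for an Ada index subtype and Constraint_Error, and the sketch assumes the aggregate fills the index range up to its last value.

```python
class ConstraintError(Exception):
    """Stand-in for Ada's Constraint_Error."""


class BoundedIndex:
    """Integer restricted to first .. last, like an Ada index subtype."""

    def __init__(self, first: int, last: int) -> None:
        self.first, self.last, self.value = first, last, first

    def increment(self) -> None:
        if self.value == self.last:
            raise ConstraintError("index incremented past upper bound")
        self.value += 1


def fill(items, first: int, guarded: bool) -> dict:
    """Insert items at consecutive indexes, incrementing after each insert."""
    index = BoundedIndex(first, first + len(items) - 1)
    result = {}
    for item in items:
        result[index.value] = item
        # The guard skips the final increment once the upper bound is reached.
        if not guarded or index.value != index.last:
            index.increment()
    return result


print(fill(["a", "b", "c"], 1, guarded=True))  # {1: 'a', 2: 'b', 3: 'c'}
```

Calling `fill` with `guarded=False` raises `ConstraintError` on the last element, which corresponds to the spurious error the patch removes.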
[Ada] Ada2022: AI12-0195 overriding class-wide pre/postconditions
New implementation of class-wide pre/postconditions that relies on helpers to move the corresponding runtime checks to the caller side. This implementation also adds indirect call wrappers and dispatch table wrappers that facilitate combining class-wide conditions with access-to-subprogram types with preconditions (AI12-0220), and provides full support for inheriting body but overriding preconditions or postconditions (AI12-0195). Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * contracts.ads (Make_Class_Precondition_Subps): New subprogram. (Merge_Class_Conditions): New subprogram. (Process_Class_Conditions_At_Freeze_Point): New subprogram. * contracts.adb (Check_Class_Condition): New subprogram. (Set_Class_Condition): New subprogram. (Analyze_Contracts): Remove code analyzing class-wide-clone subprogram since it is no longer built. (Process_Spec_Postconditions): Avoid processing twice seen subprograms. (Process_Preconditions): Simplify its functionality to non-class-wide preconditions. (Process_Preconditions_For): No action needed for wrappers and helpers. (Make_Class_Precondition_Subps): New subprogram. (Process_Class_Conditions_At_Freeze_Point): New subprogram. (Merge_Class_Conditions): New subprogram. * exp_ch6.ads (Install_Class_Preconditions_Check): New subprogram. * exp_ch6.adb (Expand_Call_Helper): Install class-wide preconditions check on dispatching primitives that have or inherit class-wide preconditions. (Freeze_Subprogram): Remove code for null procedures with preconditions. (Install_Class_Preconditions_Check): New subprogram. * exp_util.ads (Build_Class_Wide_Expression): Lower the complexity of this subprogram; out-mode formal Needs_Wrapper since this functionality is now provided by a new subprogram. (Get_Mapped_Entity): New subprogram. (Map_Formals): New subprogram. * exp_util.adb (Build_Class_Wide_Expression): Lower the complexity of this subprogram. 
Its previous functionality is now provided by subprograms Needs_Wrapper and Check_Class_Condition. (Add_Parent_DICs): Map the overridden primitive to the overriding one. (Get_Mapped_Entity): New subprogram. (Map_Formals): New subprogram. (Update_Primitives_Mapping): Adding assertion. * freeze.ads (Check_Inherited_Conditions): Subprogram made public with added formal to support late overriding. * freeze.adb (Check_Inherited_Conditions): New implementation; builds the dispatch table wrapper required for class-wide pre/postconditions; added support for late overriding. (Needs_Wrapper): New subprogram. * sem.ads (Inside_Class_Condition_Preanalysis): New global variable. * sem_disp.ads (Covered_Interface_Primitives): New subprogram. * sem_disp.adb (Covered_Interface_Primitives): New subprogram. (Check_Dispatching_Context): Skip checking context of dispatching calls during preanalysis of class-wide conditions since at that stage the expression is not installed yet on its definite context. (Check_Dispatching_Call): Skip checking 6.1.1(18.2/5) by AI12-0412 on helpers and wrappers internally built for supporting class-wide conditions; for late-overriding subprograms call Check_Inherited_Conditions to build the dispatch-table wrapper (if required). (Propagate_Tag): Adding call to Install_Class_Preconditions_Check. * sem_util.ads (Build_Class_Wide_Clone_Body): Removed. (Build_Class_Wide_Clone_Call): Removed. (Build_Class_Wide_Clone_Decl): Removed. (Class_Condition): New subprogram. (Nearest_Class_Condition_Subprogram): New subprogram. * sem_util.adb (Build_Class_Wide_Clone_Body): Removed. (Build_Class_Wide_Clone_Call): Removed. (Build_Class_Wide_Clone_Decl): Removed. (Class_Condition): New subprogram. (Nearest_Class_Condition_Subprogram): New subprogram. (Eligible_For_Conditional_Evaluation): No need to evaluate class-wide conditions during preanalysis since the expression is not installed on its definite context. * einfo.ads (Class_Wide_Clone): Removed. 
(Class_Postconditions): New attribute. (Class_Preconditions): New attribute. (Class_Preconditions_Subprogram): New attribute. (Dynamic_Call_Helper): New attribute. (Ignored_Class_Postconditions): New attribute. (Ignored_Class_Preconditions): New attribute. (Indirect_Call_Wrapper): New attribute. (Is_Dispatch_Table_Wrapper): New attribute. (Static_Call_Helper): New attribute. * exp_attr.adb (Expand_N_Attribute_Reference): When the prefix
[Ada] Assert_Failure on derived type with inherited Default_Initial_Condition
A type derived from a private type that specifies Default_Initial_Condition can lead to an assertion failure when the compiler builds the body of the derived type's DIC procedure. Some code inherited from type invariants doesn't apply in the DIC case, and performed an incorrect assertion testing for the presence of a full type on such a derived type. There was also additional unneeded and ineffective code related to full types that is not needed or appropriate for the DIC aspect (which can only be applied to private types, not full types, unlike the Type_Invariant aspect). The problematic assertion and dead code is removed. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_util.adb (Build_DIC_Procedure_Body): Remove inappropriate Assert pragma. Remove unneeded and dead code related to derived private types.diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb --- a/gcc/ada/exp_util.adb +++ b/gcc/ada/exp_util.adb @@ -2035,14 +2035,11 @@ package body Exp_Util is Stmts=> Stmts); end if; - -- Otherwise the "full" DIC procedure verifies the DICs of the full - -- view, well as DICs inherited from parent types. In addition, it - -- indirectly verifies the DICs of the partial view by calling the - -- "partial" DIC procedure. + -- Otherwise, the "full" DIC procedure verifies the DICs inherited from + -- parent types, as well as indirectly verifying the DICs of the partial + -- view by calling the "partial" DIC procedure. else - pragma Assert (Present (Full_Typ)); - -- Check the DIC of the partial view by calling the "partial" DIC -- procedure, unless the partial DIC body is empty. Generate: @@ -2056,44 +2053,6 @@ package body Exp_Util is New_Occurrence_Of (Obj_Id, Loc; end if; - -- Derived subtypes do not have a partial view - - if Present (Priv_Typ) then - --- The processing of the "full" DIC procedure intentionally --- skips the partial view because a) this may result in changes of --- visibility and b) lead to duplicate checks. 
However, when the --- full view is the underlying full view of an untagged derived --- type whose parent type is private, partial DICs appear on --- the rep item chain of the partial view only. - ---package Pack_1 is --- type Root ... is private; ---private --- ---end Pack_1; - ---with Pack_1; ---package Pack_2 is --- type Child is new Pack_1.Root with Type_DIC => ...; --- ---end Pack_2; - --- As a result, the processing of the full view must also consider --- all DICs of the partial view. - -if Is_Untagged_Private_Derivation (Priv_Typ, Full_Typ) then - null; - --- Otherwise the DICs of the partial view are ignored - -else - -- Ignore the DICs of the partial view by eliminating the view - - Priv_Typ := Empty; -end if; - end if; - -- Process inherited Default_Initial_Conditions for all parent types Add_Parent_DICs (Work_Typ, Obj_Id, Stmts);
[Ada] Crash on renaming within declare expression
This patch corrects an issue in the compiler whereby a renaming within a
declare expression may result in a crash in some systems.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* exp_dbug.adb (Debug_Renaming_Declaration): Add check for Entity
	present for Ren to prevent looking at unanalyzed nodes.

diff --git a/gcc/ada/exp_dbug.adb b/gcc/ada/exp_dbug.adb
--- a/gcc/ada/exp_dbug.adb
+++ b/gcc/ada/exp_dbug.adb
@@ -409,7 +409,9 @@ package body Exp_Dbug is
             when N_Expanded_Name
                | N_Identifier
             =>
-               if not Present (Renamed_Object (Entity (Ren))) then
+               if No (Entity (Ren))
+                 or else not Present (Renamed_Object (Entity (Ren)))
+               then
                   exit;
                end if;
[Ada] Add more node unions
Also fix [se]info.h output to include both declarations and definitions of the functions for unions so that unions can refer to other unions without being order-dependent. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * gen_il-gen-gen_nodes.adb (N_Alternative, N_Is_Case_Choice): Add. (N_Is_Exception_Choice, N_Is_Range): Likewise. * gen_il-types.ads: Add above names. * gen_il-gen.adb (Put_Union_Membership): Write both declarations and definitions of union functions.diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb --- a/gcc/ada/gen_il-gen-gen_nodes.adb +++ b/gcc/ada/gen_il-gen-gen_nodes.adb @@ -1686,4 +1686,31 @@ begin -- Gen_IL.Gen.Gen_Nodes N_Subprogram_Specification)); -- Nodes that can be returned by Declaration_Node + Union (N_Is_Range, + Children => +(N_Character_Literal, + N_Entity_Name, + N_Has_Bounds, + N_Integer_Literal, + N_Subtype_Indication)); + -- Nodes that can be used to specify a range + + Union (N_Is_Case_Choice, + Children => +(N_Is_Range, + N_Others_Choice)); + -- Nodes that can be in the choices of a case statement + + Union (N_Is_Exception_Choice, + Children => +(N_Entity_Name, + N_Others_Choice)); + -- Nodes that can be in the choices of an exception handler + + Union (N_Alternative, + Children => +(N_Case_Statement_Alternative, + N_Variant)); + -- Nodes that can be alternatives in case contructs + end Gen_IL.Gen.Gen_Nodes; diff --git a/gcc/ada/gen_il-gen.adb b/gcc/ada/gen_il-gen.adb --- a/gcc/ada/gen_il-gen.adb +++ b/gcc/ada/gen_il-gen.adb @@ -652,7 +652,7 @@ package body Gen_IL.Gen is -- Used by Put_C_Getters to print out one high-level getter. procedure Put_Union_Membership -(S : in out Sink; Root : Root_Type); +(S : in out Sink; Root : Root_Type; Only_Prototypes : Boolean); -- Used by Put_Sinfo_Dot_H and Put_Einfo_Dot_H to print out functions to -- test membership in a union type. 
@@ -3175,6 +3175,8 @@ package body Gen_IL.Gen is end Put_Kind_Subtype; begin + Put_Union_Membership (S, Root, Only_Prototypes => True); + Iterate_Types (Root, Pre => Put_Enum_Lit'Access); Put (S, "#define Number_" & Node_Or_Entity (Root) & "_Kinds " & @@ -3182,7 +3184,7 @@ package body Gen_IL.Gen is Iterate_Types (Root, Pre => Put_Kind_Subtype'Access); - Put_Union_Membership (S, Root); + Put_Union_Membership (S, Root, Only_Prototypes => False); end Put_C_Type_And_Subtypes; -- @@ -3269,7 +3271,7 @@ package body Gen_IL.Gen is -- procedure Put_Union_Membership -(S : in out Sink; Root : Root_Type) is +(S : in out Sink; Root : Root_Type; Only_Prototypes : Boolean) is procedure Put_Ors (T : Abstract_Type); -- Print the "or" (i.e. "||") of tests whether kind is in each child @@ -3303,22 +3305,27 @@ package body Gen_IL.Gen is end Put_Ors; begin - Put (S, LF & "// Membership tests for union types" & LF & LF); + if not Only_Prototypes then +Put (S, LF & "// Membership tests for union types" & LF & LF); + end if; for T in First_Abstract (Root) .. 
Last_Abstract (Root) loop if Type_Table (T) /= null and then Type_Table (T).Is_Union then Put (S, "INLINE Boolean" & LF); Put (S, "Is_In_" & Image (T) & " (" & -Node_Or_Entity (Root) & "_Kind kind)" & LF); +Node_Or_Entity (Root) & "_Kind kind)" & +(if Only_Prototypes then ";" else "") & LF); - Put (S, "{" & LF); - Increase_Indent (S, 3); - Put (S, "return" & LF); - Increase_Indent (S, 3); - Put_Ors (T); - Decrease_Indent (S, 3); - Decrease_Indent (S, 3); - Put (S, ";" & LF & "}" & LF); + if not Only_Prototypes then + Put (S, "{" & LF); + Increase_Indent (S, 3); + Put (S, "return" & LF); + Increase_Indent (S, 3); + Put_Ors (T); + Decrease_Indent (S, 3); + Decrease_Indent (S, 3); + Put (S, ";" & LF & "}" & LF); + end if; Put (S, "" & LF); end if; diff --git a/gcc/ada/gen_il-types.ads b/gcc/ada/gen_il-types.ads --- a/gcc/ada/gen_il-types.ads +++ b/gcc/ada/gen_il-types.ads @@ -77,6 +77,7 @@ package Gen_IL.Types is Node_Kind, -- root of node type hierarchy N_Access_To_Subprogram_Definition, + N_Alternative, N_Array_Type_Definition,
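The effect of emitting prototypes before the definitions can be sketched as follows. This is a simplified, hand-written approximation of what the generated [se]info.h might contain, not its actual output: the Node_Kind enumeration is a hypothetical four-value subset, and plain `static inline bool` stands in for the INLINE and Boolean names used by the generator. It assumes that a union's test may call the test of a child union.

```c
#include <stdbool.h>

/* Hypothetical subset of the node kinds. */
typedef enum {
    N_Character_Literal,
    N_Integer_Literal,
    N_Others_Choice,
    N_Variant
} Node_Kind;

/* Prototypes first (the Only_Prototypes => True pass). */
static inline bool Is_In_N_Is_Case_Choice(Node_Kind kind);
static inline bool Is_In_N_Is_Range(Node_Kind kind);

/* Definitions afterwards (the Only_Prototypes => False pass).
   N_Is_Case_Choice refers to N_Is_Range even though the latter is
   defined later; the prototypes above make that order irrelevant. */
static inline bool Is_In_N_Is_Case_Choice(Node_Kind kind)
{
    return
        Is_In_N_Is_Range(kind)
        || kind == N_Others_Choice;
}

static inline bool Is_In_N_Is_Range(Node_Kind kind)
{
    return
        kind == N_Character_Literal
        || kind == N_Integer_Literal;
}
```

Without the prototypes, the call to Is_In_N_Is_Range inside Is_In_N_Is_Case_Choice would refer to a function that has not been declared yet, so the generator would have to sort unions by dependency.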
[Ada] Stub CUDA_Device aspect
The CUDA_Device aspect allows specifying an entity as being device (GPU)-only. This commit implements the basic machinery to detect the aspect/pragma, but does not implement any functionality (a warning is issued instead). Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * aspects.ads: Add CUDA_Device aspect. * gnat_cuda.ads (Add_CUDA_Device_Entity): New subprogram. * gnat_cuda.adb: (Add_CUDA_Device_Entity): New subprogram. (CUDA_Device_Entities_Table): New hashmap for CUDA_Device entities. (Get_CUDA_Device_Entities): New internal subprogram. (Set_CUDA_Device_Entities): New internal subprogram. * par-prag.adb (Prag): Handle pragma id Pragma_CUDA_Device. * sem_prag.ads (Aspect_Specifying_Pragma): Mark CUDA_Device as being both aspect and pragma. * sem_prag.adb (Analyze_Pragma): Add CUDA_Device entities to list of CUDA_Entities belonging to package N. (Sig_Flags): Signal CUDA_Device entities as referenced. * snames.ads-tmpl: Create CUDA_Device names and pragmas.diff --git a/gcc/ada/aspects.ads b/gcc/ada/aspects.ads --- a/gcc/ada/aspects.ads +++ b/gcc/ada/aspects.ads @@ -187,6 +187,7 @@ package Aspects is Aspect_Atomic_Components, Aspect_Disable_Controlled,-- GNAT Aspect_Discard_Names, + Aspect_CUDA_Device, -- GNAT Aspect_CUDA_Global, -- GNAT Aspect_Exclusive_Functions, Aspect_Export, @@ -476,6 +477,7 @@ package Aspects is Aspect_Contract_Cases => False, Aspect_Convention => True, Aspect_CPU => False, + Aspect_CUDA_Device => False, Aspect_CUDA_Global => False, Aspect_Default_Component_Value => True, Aspect_Default_Initial_Condition=> False, @@ -627,6 +629,7 @@ package Aspects is Aspect_Contract_Cases => Name_Contract_Cases, Aspect_Convention => Name_Convention, Aspect_CPU => Name_CPU, + Aspect_CUDA_Device => Name_CUDA_Device, Aspect_CUDA_Global => Name_CUDA_Global, Aspect_Default_Component_Value => Name_Default_Component_Value, Aspect_Default_Initial_Condition=> Name_Default_Initial_Condition, @@ -872,6 +875,7 @@ package Aspects is Aspect_Attach_Handler => 
Always_Delay, Aspect_Constant_Indexing=> Always_Delay, Aspect_CPU => Always_Delay, + Aspect_CUDA_Device => Always_Delay, Aspect_CUDA_Global => Always_Delay, Aspect_Default_Iterator => Always_Delay, Aspect_Default_Storage_Pool => Always_Delay, diff --git a/gcc/ada/gnat_cuda.adb b/gcc/ada/gnat_cuda.adb --- a/gcc/ada/gnat_cuda.adb +++ b/gcc/ada/gnat_cuda.adb @@ -54,6 +54,18 @@ package body GNAT_CUDA is function Hash (F : Entity_Id) return Hash_Range; -- Hash function for hash table + package CUDA_Device_Entities_Table is new + GNAT.HTable.Simple_HTable + (Header_Num => Hash_Range, +Element=> Elist_Id, +No_Element => No_Elist, +Key=> Entity_Id, +Hash => Hash, +Equal => "="); + -- The keys of this table are package entities whose bodies contain at + -- least one procedure marked with aspect CUDA_Device. The values are + -- Elists of the marked entities. + package CUDA_Kernels_Table is new GNAT.HTable.Simple_HTable (Header_Num => Hash_Range, @@ -85,17 +97,45 @@ package body GNAT_CUDA is --* A procedure that takes care of calling CUDA functions that register -- CUDA_Global procedures with the runtime. + function Get_CUDA_Device_Entities (Pack_Id : Entity_Id) return Elist_Id; + -- Returns an Elist of all entities marked with pragma CUDA_Device that + -- are declared within package body Pack_Body. Returns No_Elist if Pack_Id + -- does not contain such entities. + function Get_CUDA_Kernels (Pack_Id : Entity_Id) return Elist_Id; -- Returns an Elist of all procedures marked with pragma CUDA_Global that -- are declared within package body Pack_Body. Returns No_Elist if Pack_Id -- does not contain such procedures. + procedure Set_CUDA_Device_Entities + (Pack_Id : Entity_Id; + E : Elist_Id); + -- Stores E as the list of CUDA_Device entities belonging to the package + -- entity Pack_Id. Pack_Id must not have a list of device entities. 
+ procedure Set_CUDA_Kernels (Pack_Id : Entity_Id; Kernels : Elist_Id); -- Stores Kernels as the list of kernels belonging to the package entity -- Pack_Id. Pack_Id must not have a list of kernels. + + -- Add_CUDA_Device_Entity -- +
[Ada] Spurious warning about hiding in generic instantiation
Warnings about declarations being hidden can be issued in some cases when
compiling a generic instantiation, but such warnings aren't correct (hiding
can be flagged in a generic, but shouldn't be in an instance). The warning
is now suppressed within instances.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* sem_util.adb (Enter_Name): Suppress hiding warning when in an
	instance.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -8656,6 +8656,10 @@ package body Sem_Util is
         and then Comes_From_Source (C)
         and then Comes_From_Source (Def_Id)

+        --  Don't warn within a generic instantiation
+
+        and then not In_Instance
+
         --  Don't warn unless entity in question is in extended main source

         and then In_Extended_Main_Source_Unit (Def_Id)
[Ada] Add new debug switch -gnatd.8
It will be used to tame the inlining of expression functions.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* debug.adb (d.8): Document usage.
	* fe.h (Debug_Flag_Dot_8): Declare.

diff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -210,7 +210,7 @@ package body Debug is
    --  d.5  Do not generate imported subprogram definitions in C code
    --  d.6  Do not avoid declaring unreferenced types in C code
    --  d.7  Disable unsound heuristics in gnat2scil (for CP as SPARK prover)
-   --  d.8
+   --  d.8  Disable unconditional inlining of expression functions
    --  d.9  Disable build-in-place for nonlimited types

    --  d_1
@@ -1105,6 +1105,10 @@ package body Debug is
    --       issues (e.g., assuming that a low bound of an array parameter
    --       of an unconstrained subtype belongs to the index subtype).

+   --  d.8  By default calls to expression functions are always inlined.
+   --       This debug flag turns off this behavior, making them subject
+   --       to the usual inlining heuristics of the code generator.
+
    --  d.9  Disable build-in-place for function calls returning nonlimited
    --       types.

diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -61,10 +61,12 @@ extern void Compiler_Abort (String_Pointer, String_Pointer, Boolean) ATTRIBUTE_N

 #define Debug_Flag_Dot_KK debug__debug_flag_dot_kk
 #define Debug_Flag_Dot_R  debug__debug_flag_dot_r
+#define Debug_Flag_Dot_8  debug__debug_flag_dot_8
 #define Debug_Flag_NN     debug__debug_flag_nn

 extern Boolean Debug_Flag_Dot_KK;
 extern Boolean Debug_Flag_Dot_R;
+extern Boolean Debug_Flag_Dot_8;
 extern Boolean Debug_Flag_NN;

 /* einfo: */
[Ada] Add missing guard before call to Interface_Present_In_Ancestor
Calling the function on an unspecified type may trigger the failure of the
precondition of the Interfaces accessor.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* sem_type.adb (Specific_Type): Check that the type is tagged before
	calling Interface_Present_In_Ancestor on it.

diff --git a/gcc/ada/sem_type.adb b/gcc/ada/sem_type.adb
--- a/gcc/ada/sem_type.adb
+++ b/gcc/ada/sem_type.adb
@@ -3424,7 +3424,8 @@ package body Sem_Type is
       --  Ada 2005 (AI-251): T1 is a concrete type that implements the
       --  class-wide interface T2

-      elsif Is_Class_Wide_Type (T2)
+      elsif Is_Tagged_Type (T1)
+        and then Is_Class_Wide_Type (T2)
         and then Is_Interface (Etype (T2))
         and then Interface_Present_In_Ancestor (Typ   => T1,
                                                 Iface => Etype (T2))
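The shape of the fix (a short-circuit test placed ahead of a call whose precondition it protects) can be sketched in C. This is only an analogy under stated assumptions: struct type_desc and both functions are hypothetical stand-ins, an assert models the accessor's precondition, and `&&` plays the role of Ada's `and then`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified type descriptor. */
struct type_desc {
    bool is_tagged;
    bool is_class_wide;
    bool is_interface;
    const struct type_desc *etype;              /* stands in for Etype */
    const struct type_desc *const *interfaces;  /* meaningful for tagged types only */
    size_t n_interfaces;
};

/* Models a query whose precondition requires a tagged type. */
static bool interface_present_in_ancestor(const struct type_desc *typ,
                                          const struct type_desc *iface)
{
    assert(typ->is_tagged);   /* the precondition the patch protects */
    for (size_t i = 0; i < typ->n_interfaces; i++)
        if (typ->interfaces[i] == iface)
            return true;
    return false;
}

/* The tagged test comes first, so the guarded call is never reached
   for an untagged T1. */
static bool implements_class_wide_interface(const struct type_desc *t1,
                                            const struct type_desc *t2)
{
    return t1->is_tagged
        && t2->is_class_wide
        && t2->etype->is_interface
        && interface_present_in_ancestor(t1, t2->etype);
}
```

With the first conjunct removed, an untagged t1 would reach the assert and abort, which is the analogue of the precondition failure described above.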
[Ada] Crash on improper use of GNAT attribute Type_Key
This patch fixes a crash on a case statement whose expression is an
attribute reference Type_Key, which yields a String. The attribute reference
may have been fully analyzed, and the resolution against Any_Discrete fails
to detect the error.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* sem_attr.adb (Analyze_Attribute, case Type_Key): Attribute can be
	applied to a formal type.
	* sem_ch5.adb (Analyze_Case_Statement): If Extensions_Allowed is not
	enabled, verify that the type of the expression is discrete.

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -6643,7 +6643,9 @@ package body Sem_Attr is
             Initialize (CRC);
             Compute_Type_Key (Entity (P));

-            if not Is_Frozen (Entity (P)) then
+            if not Is_Frozen (Entity (P))
+              and then not Is_Generic_Type (Entity (P))
+            then
                Error_Msg_N ("premature usage of Type_Key?", N);
             end if;

diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -1681,6 +1681,13 @@ package body Sem_Ch5 is
          Error_Msg_N
            ("(Ada 83) case expression cannot be of a generic type", Exp);
          return;
+
+      elsif not Extensions_Allowed
+        and then not Is_Discrete_Type (Exp_Type)
+      then
+         Error_Msg_N
+           ("expression in case statement must be of a discrete_Type", Exp);
+         return;
       end if;

       --  If the case expression is a formal object of mode in out, then treat
[Ada] Document rounding mode assumed for dynamic floating-point computations
It is only documented for static computations currently.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* doc/gnat_rm/implementation_defined_characteristics.rst: Document
	the rounding mode assumed for dynamic computations as per 3.5.7(16).
	* gnat_rm.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_characteristics.rst b/gcc/ada/doc/gnat_rm/implementation_defined_characteristics.rst
--- a/gcc/ada/doc/gnat_rm/implementation_defined_characteristics.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_characteristics.rst
@@ -147,12 +147,12 @@ Type Representation
 IEEE 80-bit Extended on x86 architecture
 == ===

-The default rounding mode specified by the IEEE 754 Standard is assumed for
-static computations, i.e. round to nearest, ties to even. The input routines
-yield correctly rounded values for Short_Float, Float and Long_Float at least.
-The output routines can compute up to twice as many exact digits as the value
-of ``T'Digits`` for any type, for example 30 digits for Long_Float; if more
-digits are requested, zeros are printed.
+The default rounding mode specified by the IEEE 754 Standard is assumed both
+for static and dynamic computations (that is, round to nearest, ties to even).
+The input routines yield correctly rounded values for Short_Float, Float, and
+Long_Float at least. The output routines can compute up to twice as many exact
+digits as the value of ``T'Digits`` for any type, for example 30 digits for
+Long_Float; if more digits are requested, zeros are printed.

 * "The small of an ordinary fixed point type. See 3.5.9(8)."

diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -21,7 +21,7 @@
 @copying

 @quotation
-GNAT Reference Manual , Aug 03, 2021
+GNAT Reference Manual , Sep 28, 2021

 AdaCore

@@ -15939,12 +15939,12 @@ IEEE 80-bit Extended on x86 architecture
 @end multitable


-The default rounding mode specified by the IEEE 754 Standard is assumed for
-static computations, i.e. round to nearest, ties to even. The input routines
-yield correctly rounded values for Short_Float, Float and Long_Float at least.
-The output routines can compute up to twice as many exact digits as the value
-of @code{T'Digits} for any type, for example 30 digits for Long_Float; if more
-digits are requested, zeros are printed.
+The default rounding mode specified by the IEEE 754 Standard is assumed both
+for static and dynamic computations (that is, round to nearest, ties to even).
+The input routines yield correctly rounded values for Short_Float, Float, and
+Long_Float at least. The output routines can compute up to twice as many exact
+digits as the value of @code{T'Digits} for any type, for example 30 digits for
+Long_Float; if more digits are requested, zeros are printed.


 @itemize *
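As an illustration of what "round to nearest, ties to even" means for a computation performed at run time, here is a small sketch in C. It is not GNAT code and says nothing about how GNAT implements this; it assumes `double` is IEEE 754 binary64 and that the default rounding mode is in effect. The function name is hypothetical, and `volatile` is used only to keep the compiler from folding the additions at compile time.

```c
#include <stdbool.h>

/* Returns true if two exactly-halfway sums round to the neighbour
   whose last significand bit is zero (the "even" one). */
static bool ties_round_to_even(void)
{
    volatile double one      = 1.0;
    volatile double half_ulp = 0x1p-53;          /* half the spacing of doubles just above 1.0 */
    volatile double next     = 1.0 + 0x1p-52;    /* smallest double above 1.0; odd significand */

    /* 1.0 + 2^-53 lies exactly between 1.0 and next: 1.0 is the even one. */
    volatile double a = one + half_ulp;

    /* next + 2^-53 lies exactly between next and 1.0 + 2^-51:
       the latter has the even significand. */
    volatile double b = next + half_ulp;

    return a == 1.0 && b == 1.0 + 0x1p-51;
}
```

In the first sum the tie is resolved downward and in the second upward, which is what distinguishes ties-to-even from rounding ties in a fixed direction.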
[Ada] More work on efficiency improvements
Gather more statistics, and make some minor efficiency improvements.

Adjust the heuristic for the order in which we choose field offsets. This
reduces the maximum size of a node from 10 slots to 9 slots, and makes the
compiler a little bit faster.

Add more special cases for fields whose offsets should be chosen early.
This is a substantial efficiency win.

Adjustments to statistics gathering code.

Misc cleanup.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

	* table.ads (Table_Type): Remove "aliased"; no longer needed by
	Atree. Besides it contradicted the comment a few lines above,
	"-- Note: We do not make the table components aliased...".
	* types.ads: Move type Slot to Atree.
	* atree.ads: Move type Slot from Types to here. Move type
	Node_Header from Seinfo to here.
	* atree.adb: Avoid the need for aliased components of the Slots
	table. Instead of 'Access, use a getter and setter. Misc
	cleanups.
	(Print_Statistics): Print statistics about node and entity kind
	frequencies. Give 3 digit fractions instead of percentages.
	* (Get_Original_Node_Count, Set_Original_Node_Count): Statistics
	for calls to Original_Node and Set_Original_Node.
	(Original_Node, Set_Original_Node): Gather statistics by calling
	the above.
	(Print_Field_Statistics): Print Original_Node statistics.
	(Update_Kind_Statistics): Remove, and put all statistics
	gathering under "if Atree_Statistics_Enabled", which is a flag
	generated in Seinfo by Gen_IL.
	* gen_il-gen.adb (Compute_Field_Offsets): Choose offsets of
	Nkind, Ekind, and Homonym first. This causes a slight efficiency
	improvement. Misc cleanups. Do not generate Node_Header; it is
	now hand-written in Atree. When choosing the order in which to
	assign offsets, weight by the frequency of the node type, so the
	more common nodes get their field offsets assigned earlier. Add
	more special cases.
	(Compute_Type_Sizes): Remove this and related things. There was
	a comment: "At some point we can instrument Atree to print out
	accurate size statistics, and remove this code." We have Atree
	statistics, so we now remove this code.
	(Put_Seinfo): Generate Atree_Statistics_Enabled, which is equal
	to Statistics_Enabled. This allows Atree to say "if
	Atree_Statistics_Enabled then " for efficiency. When
	Atree_Statistics_Enabled is False, the "if ..." will be
	optimized away.
	* gen_il-internals.ads (Type_Frequency): New table of kind
	frequencies.
	* gen_il-internals.adb: Minor comment improvement.
	* gen_il-fields.ads: Remove unused subtypes. Suppress style
	checks in the Type_Frequency table. If we regenerate this
	table (see -gnatd.A) we don't want to have to fiddle with
	casing.
	* impunit.adb: Minor.
	* sinfo-utils.adb: Minor.
	* debug.adb: Minor comment improvement.

patch.diff.gz Description: application/gzip
[Ada] Improved checking for invalid index values when accessing array elements
Ensure that the consequences of indexing into an array with the value of an uninitialized variable are consistent with Ada RM 13.9.1(11) by generating additional validity checks in some array indexing cases. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * checks.ads: Define a type Dimension_Set. Add an out-mode parameter of this new type to Generate_Index_Checks so that callers can know for which dimensions a check was generated. Add an in-mode parameter of this new type to Apply_Subscript_Validity_Checks so that callers can indicate that no check is needed for certain dimensions. * checks.adb (Generate_Index_Checks): Implement new Checks_Generated parameter. (Apply_Subscript_Validity_Checks): Implement new No_Check_Needed parameter. * exp_ch4.adb (Expand_N_Indexed_Component): Call Apply_Subscript_Validity_Checks in more cases than before. This includes declaring two new local functions, (Is_Renamed_Variable_Name, Type_Requires_Subscript_Validity_Checks_For_Reads): To help in deciding whether to call Apply_Subscript_Validity_Checks. Adjust to parameter profile changes in Generate_Index_Checks and Apply_Subscript_Validity_Checks.diff --git a/gcc/ada/checks.adb b/gcc/ada/checks.adb --- a/gcc/ada/checks.adb +++ b/gcc/ada/checks.adb @@ -3552,9 +3552,12 @@ package body Checks is -- Apply_Subscript_Validity_Checks -- - - procedure Apply_Subscript_Validity_Checks (Expr : Node_Id) is + procedure Apply_Subscript_Validity_Checks + (Expr: Node_Id; + No_Check_Needed : Dimension_Set := Empty_Dimension_Set) is Sub : Node_Id; + Dimension : Pos := 1; begin pragma Assert (Nkind (Expr) = N_Indexed_Component); @@ -3568,11 +3571,16 @@ package body Checks is -- for the subscript, and that convert will do the necessary validity -- check. 
- Ensure_Valid (Sub, Holes_OK => True); + if (No_Check_Needed = Empty_Dimension_Set) + or else not No_Check_Needed.Elements (Dimension) + then +Ensure_Valid (Sub, Holes_OK => True); + end if; -- Move to next subscript Next (Sub); + Dimension := Dimension + 1; end loop; end Apply_Subscript_Validity_Checks; @@ -7233,7 +7241,10 @@ package body Checks is -- Generate_Index_Checks -- --- - procedure Generate_Index_Checks (N : Node_Id) is + procedure Generate_Index_Checks + (N: Node_Id; + Checks_Generated : out Dimension_Set) + is function Entity_Of_Prefix return Entity_Id; -- Returns the entity of the prefix of N (or Empty if not found) @@ -7268,6 +7279,8 @@ package body Checks is -- Start of processing for Generate_Index_Checks begin + Checks_Generated.Elements := (others => False); + -- Ignore call if the prefix is not an array since we have a serious -- error in the sources. Ignore it also if index checks are suppressed -- for array object or type. @@ -7330,6 +7343,8 @@ package body Checks is Prefix => New_Occurrence_Of (Etype (A), Loc), Attribute_Name => Name_Range)), Reason => CE_Index_Check_Failed)); + +Checks_Generated.Elements (1) := True; end if; -- General case @@ -7416,6 +7431,8 @@ package body Checks is Duplicate_Subexpr_Move_Checks (Sub)), Right_Opnd => Range_N), Reason => CE_Index_Check_Failed)); + + Checks_Generated.Elements (Ind) := True; end if; Next_Index (A_Idx); diff --git a/gcc/ada/checks.ads b/gcc/ada/checks.ads --- a/gcc/ada/checks.ads +++ b/gcc/ada/checks.ads @@ -44,6 +44,14 @@ with Urealp; use Urealp; package Checks is + type Bit_Vector is array (Pos range <>) of Boolean; + type Dimension_Set (Dimensions : Nat) is + record + Elements : Bit_Vector (1 .. Dimensions); + end record; + Empty_Dimension_Set : constant Dimension_Set + := (Dimensions => 0, Elements => (others => <>)); + procedure Initialize; -- Called for each new main source program, to initialize internal -- variables used in the package body of the Checks unit. 
@@ -721,11 +729,16 @@ package Checks is -- Do_Range_Check flag, and if it is set, this routine is called, which -- turns the flag off in code-generation mode. - procedure Generate_Index_Checks (N : Node_Id); + procedure Generate_Index_Checks + (N: Node_Id; + Checks_Generated : out Dimension_Set); -- This procedure is called to generate index checks on the subscripts for -- the indexed component node N. Each subscript expression
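The role of the new Dimension_Set parameter in Apply_Subscript_Validity_Checks can be sketched in C as below. This is a rough model, not a translation of checks.adb: MAX_DIMS, struct dimension_set and the function are hypothetical, a fixed-size array replaces the discriminated record, and a counter stands in for the calls to Ensure_Valid.

```c
#include <stdbool.h>

#define MAX_DIMS 8   /* hypothetical upper limit, for the sketch only */

/* Rough analogue of Dimension_Set: one flag per array dimension.
   dimensions == 0 plays the role of Empty_Dimension_Set. */
struct dimension_set {
    int  dimensions;
    bool elements[MAX_DIMS];   /* elements[d - 1] describes dimension d */
};

/* Walk the subscripts of an indexed component with `dims` dimensions
   and count how many validity checks would be emitted. A dimension is
   skipped only when the caller flagged it in no_check_needed. */
static int apply_subscript_validity_checks(int dims,
                                           const struct dimension_set *no_check_needed)
{
    int emitted = 0;

    for (int dimension = 1; dimension <= dims; dimension++) {
        if (no_check_needed->dimensions == 0
            || !no_check_needed->elements[dimension - 1])
            emitted++;   /* stands in for Ensure_Valid (Sub, Holes_OK => True) */
    }
    return emitted;
}
```

Passing an empty set keeps the previous behavior of checking every subscript, which matches the default value given to No_Check_Needed in the patch.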
[Ada] Empty CUDA_Global procedures when compiling for host
This commit makes GNAT empty CUDA_Global procedures when compiling for the host. This is required because CUDA_Global procedures could reference entities that exist only on the GPU, which would result in link-time errors. We empty the procedures rather than deleting them completely because a symbol representing each procedure must remain in order to register kernels with the CUDA runtime.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

        * gnat_cuda.adb (Empty_CUDA_Global_Subprograms): New procedure.
        (Expand_CUDA_Package): Call Empty_CUDA_Global_Subprograms.

diff --git a/gcc/ada/gnat_cuda.adb b/gcc/ada/gnat_cuda.adb
--- a/gcc/ada/gnat_cuda.adb
+++ b/gcc/ada/gnat_cuda.adb
@@ -25,20 +25,22 @@
 
 --  This package defines CUDA-specific datastructures and functions.
 
-with Debug; use Debug;
-with Elists; use Elists;
-with Namet; use Namet;
-with Nlists; use Nlists;
-with Nmake; use Nmake;
-with Rtsfind;use Rtsfind;
-with Sinfo; use Sinfo;
-with Sinfo.Nodes;use Sinfo.Nodes;
-with Stringt;use Stringt;
-with Tbuild; use Tbuild;
-with Uintp; use Uintp;
-with Sem;use Sem;
-with Sem_Util; use Sem_Util;
-with Snames; use Snames;
+with Atree; use Atree;
+with Debug; use Debug;
+with Elists; use Elists;
+with Namet; use Namet;
+with Nlists; use Nlists;
+with Nmake; use Nmake;
+with Rtsfind; use Rtsfind;
+with Sinfo; use Sinfo;
+with Sinfo.Nodes; use Sinfo.Nodes;
+with Stringt; use Stringt;
+with Tbuild; use Tbuild;
+with Uintp; use Uintp;
+with Sem; use Sem;
+with Sem_Aux; use Sem_Aux;
+with Sem_Util;use Sem_Util;
+with Snames; use Snames;
 
 with GNAT.HTable;
 
@@ -97,6 +99,17 @@ package body GNAT_CUDA is
    --    * A procedure that takes care of calling CUDA functions that register
    --      CUDA_Global procedures with the runtime.
 
+   procedure Empty_CUDA_Global_Subprograms (Pack_Id : Entity_Id);
+   --  For all subprograms marked CUDA_Global in Pack_Id, remove declarations
+   --  and replace statements with a single null statement.
+   --  This is required because CUDA_Global subprograms could be referring to
+   --  device-only symbols, which would result in unknown symbols at link time
+   --  if kept around.
+   --  We choose to empty CUDA_Global subprograms rather than completely
+   --  removing them from the package because registering CUDA_Global
+   --  subprograms with the CUDA runtime on the host requires knowing the
+   --  subprogram's host-side address.
+
    function Get_CUDA_Device_Entities (Pack_Id : Entity_Id) return Elist_Id;
    --  Returns an Elist of all entities marked with pragma CUDA_Device that
    --  are declared within package body Pack_Body. Returns No_Elist if Pack_Id
@@ -153,6 +166,50 @@ package body GNAT_CUDA is
       Append_Elmt (Kernel, Kernels);
    end Add_CUDA_Kernel;
 
+   -----------------------------------
+   -- Empty_CUDA_Global_Subprograms --
+   -----------------------------------
+
+   procedure Empty_CUDA_Global_Subprograms (Pack_Id : Entity_Id) is
+      Spec_Id     : constant Node_Id := Corresponding_Spec (Pack_Id);
+      Kernels     : constant Elist_Id := Get_CUDA_Kernels (Spec_Id);
+      Kernel_Elm  : Elmt_Id;
+      Kernel      : Entity_Id;
+      Kernel_Body : Node_Id;
+      Null_Body   : Entity_Id;
+      Loc         : Source_Ptr;
+   begin
+      --  It is an error to empty CUDA_Global subprograms when not compiling
+      --  for the host.
+      pragma Assert (Debug_Flag_Underscore_C);
+
+      if No (Kernels) then
+         return;
+      end if;
+
+      Kernel_Elm := First_Elmt (Kernels);
+      while Present (Kernel_Elm) loop
+         Kernel := Node (Kernel_Elm);
+         Kernel_Body := Subprogram_Body (Kernel);
+         Loc := Sloc (Kernel_Body);
+
+         Null_Body := Make_Subprogram_Body (Loc,
+           Specification              => Subprogram_Specification (Kernel),
+           Declarations               => New_List,
+           Handled_Statement_Sequence =>
+             Make_Handled_Sequence_Of_Statements (Loc,
+               Statements => New_List (Make_Null_Statement (Loc))));
+
+         Rewrite (Kernel_Body, Null_Body);
+
+         Next_Elmt (Kernel_Elm);
+      end loop;
+   end Empty_CUDA_Global_Subprograms;
+
+   -------------------------
+   -- Expand_CUDA_Package --
+   -------------------------
+
    procedure Expand_CUDA_Package (N : Node_Id) is
    begin
 
@@ -162,6 +219,13 @@ package body GNAT_CUDA is
          return;
       end if;
 
+      --  Remove the content (both declarations and statements) of CUDA_Global
+      --  procedures. This is required because CUDA_Global functions could be
+      --  referencing entities available only on the device, which would result
+      --  in unknown symbol errors at link time.
+
+      Empty_CUDA_Global_Subprograms (N);
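
For readers less familiar with the GNAT tree API, the idea of the patch can be sketched as a toy model in C. This is only an illustration under stated assumptions, not GNAT code: `struct subprogram`, its fields, and `empty_cuda_global_subprograms` are hypothetical stand-ins for the node rewriting done in gnat_cuda.adb. The point it shows is that each CUDA_Global entry stays in the package (so its symbol survives for kernel registration) while its declarations are dropped and its statements collapse to a single null statement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a subprogram body.  GNAT's real
   representation is a tree of nodes, not a flat struct.  */
struct subprogram {
    const char *name;       /* symbol that must survive on the host */
    bool cuda_global;       /* models the CUDA_Global marking */
    size_t n_declarations;  /* local declarations in the body */
    size_t n_statements;    /* statements in the body */
};

/* Empty every CUDA_Global subprogram in place: drop its declarations and
   leave a single null statement.  The entry itself (and so its symbol) is
   kept.  Returns the number of subprograms that were emptied.  */
size_t empty_cuda_global_subprograms(struct subprogram *subps, size_t count)
{
    size_t emptied = 0;

    assert(subps != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (!subps[i].cuda_global)
            continue;               /* host-only code is left untouched */
        subps[i].n_declarations = 0;
        subps[i].n_statements = 1;  /* the single null statement */
        emptied++;
    }
    return emptied;
}
```

Calling this on a package with one kernel and one host helper empties only the kernel; the helper keeps its body and both names remain available.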
[Ada] Subprogram_Variant in ignored ghost code
If a Subprogram_Variant aspect is given in ghost code, the assertion policy is set to Ghost => Ignore, and the -gnata switch is used, the compiler gives spurious error messages.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

        * exp_ch6.adb (Expand_Call_Helper): Do not call
        Check_Subprogram_Variant if the subprogram is an ignored ghost
        entity. Otherwise the compiler crashes (in debug builds) or
        gives strange error messages (in production builds).

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -4392,6 +4392,7 @@ package body Exp_Ch6 is
       --  the current subprogram is called.
 
       if Is_Subprogram (Subp)
+        and then not Is_Ignored_Ghost_Entity (Subp)
         and then Same_Or_Aliased_Subprograms (Subp, Current_Scope)
       then
          Check_Subprogram_Variant;
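
The guard this patch extends can be restated as a small predicate. The C sketch below is a simplified model, not the Exp_Ch6 code: `struct callee_info` and `should_check_subprogram_variant` are hypothetical names, and each boolean field merely stands in for the corresponding GNAT query named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for the queries made on Subp.  */
struct callee_info {
    bool is_subprogram;          /* models Is_Subprogram (Subp) */
    bool is_ignored_ghost;       /* models Is_Ignored_Ghost_Entity (Subp) */
    bool same_as_current_scope;  /* models Same_Or_Aliased_Subprograms
                                    (Subp, Current_Scope) */
};

/* True when the variant check should be emitted for this call: the callee
   is a subprogram, is not ignored ghost code, and is the subprogram that
   is currently being expanded (i.e. the call is recursive).  */
bool should_check_subprogram_variant(const struct callee_info *callee)
{
    assert(callee != NULL);
    return callee->is_subprogram
        && !callee->is_ignored_ghost
        && callee->same_as_current_scope;
}
```

With the new middle condition, a recursive call inside an ignored ghost subprogram no longer triggers the check, which is the case that previously produced the spurious errors.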
[PATCH] middle-end/102518 - avoid invalid GIMPLE during inlining
When inlining we have to avoid mapping a non-lvalue parameter value into a context that prevents the parameter from being a register. Formerly such a parameter was TREE_ADDRESSABLE, but now it can be just DECL_NOT_GIMPLE_REG_P.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-09-30  Richard Biener

        PR middle-end/102518
        * tree-inline.c (setup_one_parameter): Avoid substituting an
        invariant into contexts where a GIMPLE register is not valid.

        * gcc.dg/torture/pr102518.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr102518.c | 12 ++++++++++++
 gcc/tree-inline.c                       |  6 +++++-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr102518.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr102518.c b/gcc/testsuite/gcc.dg/torture/pr102518.c
new file mode 100644
index 000..bd181ec9d99
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr102518.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+struct A {
+  int *x;
+};
+int i;
+int f(int *const c) {
+  struct A * b = (struct A *)(&c);
+  return b->x != 0;
+}
+void g() { f(&i); }
+
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 5e50e8013e2..e292a144967 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3490,7 +3490,11 @@ setup_one_parameter (copy_body_data *id, tree p, tree value, tree fn,
   /* We may produce non-gimple trees by adding NOPs or introduce invalid
      sharing when the value is not constant or DECL.  And we need to make
      sure that it cannot be modified from another path in the callee.  */
-  if ((is_gimple_min_invariant (value)
+  if (((is_gimple_min_invariant (value)
+        /* When the parameter is used in a context that forces it to
+           not be a GIMPLE register avoid substituting something that
+           is not a decl there.  */
+        && ! DECL_NOT_GIMPLE_REG_P (p))
        || (DECL_P (value) && TREE_READONLY (value))
        || (auto_var_in_fn_p (value, id->dst_fn)
            && !TREE_ADDRESSABLE (value)))
-- 
2.31.1
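
To make the changed condition easier to follow, here is a rough truth-table model of the disjunction visible in the hunk. It is a sketch under assumptions, not the tree-inline.c code: `struct value_info` and `may_substitute_directly` are hypothetical, each boolean stands in for the GCC predicate named in its comment, and the hunk is cut off after the disjunction, so any further conjuncts of the real `if` are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for the GCC predicates on the argument
   VALUE being mapped onto parameter P.  */
struct value_info {
    bool is_min_invariant;    /* models is_gimple_min_invariant (value) */
    bool is_decl;             /* models DECL_P (value) */
    bool is_readonly;         /* models TREE_READONLY (value) */
    bool is_auto_var_in_dst;  /* models auto_var_in_fn_p (value, id->dst_fn) */
    bool is_addressable;      /* models TREE_ADDRESSABLE (value) */
};

/* Models only the disjunction shown in the patch.  PARAM_NOT_GIMPLE_REG
   stands in for DECL_NOT_GIMPLE_REG_P (p): when it is set, an invariant
   such as &i must not be substituted for the parameter.  */
bool may_substitute_directly(const struct value_info *value,
                             bool param_not_gimple_reg)
{
    assert(value != NULL);
    return (value->is_min_invariant && !param_not_gimple_reg)
        || (value->is_decl && value->is_readonly)
        || (value->is_auto_var_in_dst && !value->is_addressable);
}
```

In the testcase, `f(&i)` passes an invariant address while `f` takes `&c`, so in this model the first disjunct no longer holds and the value is not substituted through that path.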