Re: [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct
Hi Jakub, +case OMP_CLAUSE_HAS_DEVICE_ADDR: + t = OMP_CLAUSE_DECL (c); + if (TREE_CODE (t) == TREE_LIST) +{ + if (handle_omp_array_sections (c, ort)) +remove = true; + else +{ + t = OMP_CLAUSE_DECL (c); + while (TREE_CODE (t) == ARRAY_REF) +t = TREE_OPERAND (t, 0); +} +} + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR) +bitmap_set_bit (&is_on_device_head, DECL_UID (t)); Why the OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR check? There is no goto into this block nor fallthru into it, and handle_omp_array_sections better shouldn't change OMP_CLAUSE_CODE. Good point. Removed. goto check_dup_generic; +case OMP_CLAUSE_HAS_DEVICE_ADDR: + t = OMP_CLAUSE_DECL (c); + if (TREE_CODE (t) == TREE_LIST) +if (handle_omp_array_sections (c, ort)) + remove = true; +else + { +t = OMP_CLAUSE_DECL (c); +while (TREE_CODE (t) == ARRAY_REF) + t = TREE_OPERAND (t, 0); + } + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR) +bitmap_set_bit (&is_on_device_head, DECL_UID (t)); Likewise. Removed. + if (VAR_P (t) || TREE_CODE (t) == PARM_DECL) +cxx_mark_addressable (t); + goto check_dup_generic_t; + case OMP_CLAUSE_USE_DEVICE_ADDR: field_ok = true; t = OMP_CLAUSE_DECL (c); --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -1391,7 +1391,8 @@ enum OMP_LIST_USE_DEVICE_PTR, OMP_LIST_USE_DEVICE_ADDR, OMP_LIST_NONTEMPORAL, - OMP_LIST_NUM + OMP_LIST_HAS_DEVICE_ADDR, + OMP_LIST_NUM /* must be the last */ Capital M and . at the end. Changed. @@ -2077,6 +2078,12 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, } break; case 'h': + if ((mask & OMP_CLAUSE_HAS_DEVICE_ADDR) + && gfc_match_omp_variable_list + ("has_device_addr (", +&c->lists[OMP_LIST_HAS_DEVICE_ADDR], false, NULL, NULL, + true) == MATCH_YES) Formatting, true should be IMO below &c->lists. Corrected the formatting. +continue; if ((mask & OMP_CLAUSE_HINT) && (m = gfc_match_dupl_check (!c->hint, "hint", true, &c->hint)) != MATCH_NO) @@ -2850,7 +2857,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_USE_DEVICE_ADDR) && gfc_match_omp_variable_list ("use_device_addr (", -&c->lists[OMP_LIST_USE_DEVICE_ADDR], false) == MATCH_YES) +&c->lists[OMP_LIST_USE_DEVICE_ADDR], false, NULL, NULL, + true) == MATCH_YES) Likewise. Corrected. --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -1910,7 +1910,17 @@ gfc_trans_omp_variable_list (enum omp_clause_code code, tree t = gfc_trans_omp_variable (namelist->sym, declare_simd); if (t != error_mark_node) { -tree node = build_omp_clause (input_location, code); +tree node; +/* For HAS_DEVICE_ADDR of an array descriptor, firstprivatize the + descriptor such that the bounds are available; its data component + is unmodified; it is handled as device address inside target. */ +if (code == OMP_CLAUSE_HAS_DEVICE_ADDR +&& (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (t)) +|| (POINTER_TYPE_P (TREE_TYPE (t)) +&& GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (t)) + node = build_omp_clause (input_location, OMP_CLAUSE_FIRSTPRIVATE); Not sure about the above, This is needed for allocatable arrays and array pointers to ensure that not only the (array) data is (already) present on the device but also the array descriptor. Otherwise the test cases target-has-device-addr-2.f90, target-has-device-addr-3.f90 (because of variable "c") and target-has-device-addr-4.f90 (also because of variable "c") won't work. --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -10024,6 +10024,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, flags = GOVD_EXPLICIT; goto do_add; +case OMP_CLAUSE_HAS_DEVICE_ADDR: + decl = OMP_CLAUSE_DECL (c); + if (TREE_CODE (decl) == ARRAY_REF) +{ + flags = GOVD_FIRSTPRIVATE | GOVD_EXPLICIT; + while (TREE_CODE (decl) == ARRAY_REF) +decl = TREE_OPERAND (decl, 0); + goto do_add_decl; but this looks weird. If decl after stripping the ARRAY_REFs is a var with pointer type, sure, firstprivatizing it is the way to go. But it can be also a variable with ARRAY_TYPE, can't it? Something like: int a[64]; #pragma omp target data map(a) use_device_addr(a) { #pragma omp target has_device_addr(a[3:16]) a[3] = 1; } and
[Patch][wwwdocs + gcc] nvptx – update for -mptx change – gcc-12/changes.html + gcc/docs/invoke.texi
This patch updates the documentation for Tom's change of the default -mptx= version - mentioning also -mptx=7.0. I forgot whether ptx = 7.0 was working fine or whether there was a reason not to mention it. At some point, we also have to update -misa=... Currently, only sm_30 and sm_35 are documented but sm_53, sm_75 and sm_80 are supported. Can they now be documented are are there still issues? OK to commit the wwwdocs + gcc invoke.texi patches? Tobias PS: I wonder whether any of the recent changes by chance did fix https://gcc.gnu.org/PR102429 – nvptx: ICE with expand_GOMP_SIMT_XCHG_BFLY : in expand_insn, at optabs.c:7947 for DCmode (complex double) Probably not, but it would have been nice nonetheless. - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 invoke.texi: nvptx – update for -mptx change gcc/ChangeLog: * doc/invoke.texi (-mptx): Document new default, add 7.0. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7af5c51cc3c..61d7b4f077a 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -27250,9 +27250,9 @@ strings must be lower-case. Valid ISA strings include @samp{sm_30} and @item -mptx=@var{version-string} @opindex mptx -Generate code for given the specified PTX version (e.g.@: @samp{6.3}). -Valid version strings include @samp{3.1} and @samp{6.3}. The default PTX -version is 3.1. +Generate code for given the specified PTX version (e.g.@: @samp{7.0}). +Valid version strings include @samp{3.1}, @samp{6.3}, and @samp{7.0}. +The default PTX version is 6.3. @item -mmainkernel @opindex mmainkernel gcc-12/changes.html: nvptx – update for -mptx change diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index 2719b9d5..bd0babfb 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -404,7 +404,8 @@ a work-in-progress. The -mptx flag has been added to specify the PTX ISA version for the generated code; permitted values are 3.1 - (default, matches previous GCC versions) and 6.3. + (matches previous GCC versions), 6.3 (new default) and + 7.0. The new __PTX_SM__ predefined macro allows code to check the compute model being targeted by the compiler.
Re: [PATCH] declare std::array members attribute const [PR101831]
On Wed, 2 Feb 2022 at 00:23, Martin Sebor wrote: > > On 2/1/22 17:15, Jonathan Wakely wrote: > > On Wed, 2 Feb 2022 at 00:13, Martin Sebor wrote: > >> > >> On 2/1/22 12:48, Jonathan Wakely wrote: > >>> On Tue, 1 Feb 2022 at 18:54, Martin Sebor via Libstdc++ > >>> wrote: > > Passing an uninitialized object to a function that takes its argument > by const reference is diagnosed by -Wmaybe-uninitialized because most > such functions read the argument. The exceptions are functions that > don't access the object but instead use its address to compute > a result. This includes a number of std::array member functions such > as std::array::size() which returns the template argument N. Such > functions may be candidates for attribute const which also avoids > the warning. The attribute typically only benefits extern functions > that IPA cannot infer the property from, but in this case it helps > avoid the warning which runs very early on, even without optimization > or inlining. The attached patch adds the attribute to a subset of > those member functions of std::array. (It doesn't add it to const > member functions like cbegin() or front() that return a const_iterator > or const reference to the internal data.) > > It might be possible to infer this property from inline functions > earlier on than during IPA and avoid having to annotate them explicitly. > That seems like an enhancement worth considering in the future. > > Tested on x86_64-linux. > > Martin > >>> > >>> new file mode 100644 > >>> index 000..b7743adf3c9 > >>> --- /dev/null > >>> +++ b/libstdc++-v3/testsuite/23_containers/array/iterators/begin_end.cc > >>> @@ -0,0 +1,56 @@ > >>> +// { dg-do compile { target c++11 } } > >>> +// > >>> +// Copyright (C) 2011-2022 Free Software Foundation, Inc. > >>> > >>> Those dates look wrong. I no longer bother putting a license text and > >>> copyright notice on simple tests like this. It's meaningless to assert > >>> copyright on something so trivial that doesn't do anything. > >>> > >> > >> Should I take to mean that you're okay with the rest of the change > >> (i.e., with the notice removed)? > > > > Yes, OK for trunk either with the notice entirely removed, or just fix > > the dates (I don't think it is copied from an existing test dating > > from 2011, right?) > > I copied it from 23_containers/array/iterators/end_is_one_past.cc > without even looking at the dates. Yeah, we all do that, and we all forget to fix the dates. It's just one more reason not to bother putting 16 lines of licence notices in tests, when the notice is often bigger than the test itself! > > > > > Whichever you prefer. > > > > Okay, pushed in r12-6992. Thanks.
Re: [PATCH] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons
On Tue, Feb 1, 2022 at 4:21 PM Arjun Shankar wrote: > > > +/* As a special case, X + C < Y + C is the same as X < Y even with wrapping > > + overflow if X and Y are signed integers of the same size, and C is an > > + unsigned constant with all bits except MSB set to 0 and size >= that of > > + X/Y. */ > > +(for op (lt le ge gt) > > + (simplify > > + (op (plus:c (convert@0 @1) @4) (plus:c (convert@2 @3) @4)) > > + (if (CONSTANT_CLASS_P (@4) > > + && TYPE_UNSIGNED (TREE_TYPE (@4)) > > > > why include (convert ..) here? It looks like you could do without, > > merging the special case with the preceding pattern and let a followup > > pattern simplify (lt (convert @1) (convert @2)) instead? > > Thanks for taking a look at this patch. > > It looks like the convert and plus need to be taken into account > together when applying this simplification. > > 1. 0x8000 is *just* large enough to be interpreted as an unsigned. > > 2. So, an expression like... > > x + 0x8000 < y + 0x8000; > > ...where x and y are signed actually gets interpreted as: > > (unsigned) x + 0x8000 < (unsigned) y + 0x8000 > > 3. Now, adding 0x8000 to (unsigned) INT_MIN gives us 0, > and adding it to (unsigned) INT_MAX gives us UINT_MAX. > > 4. So, if x < y is true when they are compared as signed integers, then... > (unsigned) x + 0x8000 < (unsigned) y + 0x8000 > ...will also be true. > > 5. i.e. the unsigned comparison must be replaced by a signed > comparison when we remove the constant, and so the constant and > convert need to be matched and removed together. Oh, I see - that's very special then and the pattern in the comment does not include this conversion. I think you can simplify the checking done by requiring types_match (TREE_TYPE (@1), TREE_TYPE (@3)) and by noting that the types of @0, @2 and @4 are the same (you don't seem to use @2). I wonder how relevant these kind of patterns are? Probably clang catches this simplification while we don't? Btw, you fail to check for INTEGRAL_TYPE_P, the whole thing would also match floats as-is, I think the easiest thing would be to change the match to + (op (plus:c (convert@0 @1) INTEGER_CST@4) (plus:c (convert@2 @3) INTEGER_CST@4)) where you then also can elide the CONSTANT_CLASS_P (@4) check. Btw, for unsigned x, y; if (x + 0x8000 < y + 0x8000) it would be valid to transform this into if ((int)x < (int)y) which is a simplification that's worthwhile as well I think? So we might not actually need the (convert ...) but rely on (convert (convert @0)) being simplfiied? You'd then use (with { stype = signed_type_for (TREE_TYPE (@1)); } (op (convert:stype @1) (convert:stype @3))) as transform. Thanks, Richard.
Re: [GCC 11 PATCH 0/5] x86: Backport straight-line-speculation mitigation
On Tue, Feb 1, 2022 at 5:59 PM H.J. Lu wrote: > > On Mon, Jan 31, 2022 at 11:21 PM Richard Biener > wrote: > > > > On Mon, Jan 31, 2022 at 7:56 PM H.J. Lu via Gcc-patches > > wrote: > > > > > > Backport -mindirect-branch-cs-prefix: > > > > LGTM in case a x86 maintainer also acks this. Can you amend > > the 10.3 release gcc-11/changes.html notes accordingly? > > Did you mean 11.3? Yes, of course. > Here is the patch for gcc-12/changes.html: > > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589600.html > > > Thanks, > > Richard. > > > > > commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a > > > Author: H.J. Lu > > > Date: Wed Oct 27 06:27:15 2021 -0700 > > > > > > x86: Add -mindirect-branch-cs-prefix > > > > > > Add -mindirect-branch-cs-prefix to add CS prefix to call and jmp to > > > indirect thunk with branch target in r8-r15 registers so that the call > > > and jmp instruction length is 6 bytes to allow them to be replaced > > > with > > > "lfence; call *%r8-r15" or "lfence; jmp *%r8-r15" at run-time. > > > > > > commit 63738e176726d31953deb03f7e32cf8b760735ac > > > Author: H.J. Lu > > > Date: Wed Oct 27 07:48:54 2021 -0700 > > > > > > x86: Add -mharden-sls=[none|all|return|indirect-branch] > > > > > > Add -mharden-sls= to mitigate against straight line speculation (SLS) > > > for function return and indirect branch by adding an INT3 instruction > > > after function return and indirect branch. > > > > > > and followup commits to support Linux kernel commits: > > > > > > commit e463a09af2f0677b9485a7e8e4e70b396b2ffb6f > > > Author: Peter Zijlstra > > > Date: Sat Dec 4 14:43:44 2021 +0100 > > > > > > x86: Add straight-line-speculation mitigation > > > > > > commit 68cf4f2a72ef8786e6b7af6fd9a89f27ac0f520d > > > Author: Peter Zijlstra > > > Date: Fri Nov 19 17:50:25 2021 +0100 > > > > > > x86: Use -mindirect-branch-cs-prefix for RETPOLINE builds > > > > > > H.J. Lu (5): > > > x86: Remove "%!" before ret > > > x86: Add -mharden-sls=[none|all|return|indirect-branch] > > > x86: Add -mindirect-branch-cs-prefix > > > x86: Rename -harden-sls=indirect-branch to -harden-sls=indirect-jmp > > > x86: Generate INT3 for __builtin_eh_return > > > > > > gcc/config/i386/i386-opts.h | 7 > > > gcc/config/i386/i386.c| 38 +-- > > > gcc/config/i386/i386.md | 2 +- > > > gcc/config/i386/i386.opt | 24 > > > gcc/doc/invoke.texi | 18 - > > > gcc/testsuite/gcc.target/i386/harden-sls-1.c | 14 +++ > > > gcc/testsuite/gcc.target/i386/harden-sls-2.c | 14 +++ > > > gcc/testsuite/gcc.target/i386/harden-sls-3.c | 14 +++ > > > gcc/testsuite/gcc.target/i386/harden-sls-4.c | 16 > > > gcc/testsuite/gcc.target/i386/harden-sls-5.c | 17 + > > > gcc/testsuite/gcc.target/i386/harden-sls-6.c | 18 + > > > .../i386/indirect-thunk-cs-prefix-1.c | 14 +++ > > > .../i386/indirect-thunk-cs-prefix-2.c | 15 > > > 13 files changed, 198 insertions(+), 13 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-1.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-2.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-3.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-4.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-5.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-6.c > > > create mode 100644 > > > gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c > > > create mode 100644 > > > gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c > > > > > > -- > > > 2.34.1 > > > > > > > -- > H.J.
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/31/22 08:26, Richard Sandiford wrote: Thanks for the discussion and sorry for the slow reply, was out most of last week. Dan Li writes: Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) So omitting the load of X30 from the ordinary stack seems fine to me. OK, thanks. Let's go with that for now then. There would still be time to change our minds before GCC 12 is released, if anyone feels that patching SCS code would be useful. Reading it back, I think my previous message came across as sounding like a complaint against binary patching, which wasn't the case at all. I think it would be fine to support patching, even if it was just for a single vendor rather than expected to be a common case. It's just that, if we did want to support it, we'd need to document it as a requirement (at least within GCC) and change the implementation accordingly. Got it, then I'll implement this feature as discussed above and see if we could add additional options for SCS later. Thanks, Dan
Re: [PATCH] Reset relations when crossing backedges.
On Tue, Feb 1, 2022 at 7:41 PM Aldy Hernandez wrote: > > Ping I didn't quite get Jeffs comment, so I waited (sorry). I've meanwhile added extern bool mark_dfs_back_edges (struct function *); so please make verify_mark_backedges take a struct function * and replace 'cfun' with the explicit argument. + gcc_unreachable (); should be sth like internal_error ("% failed"); OK with those changes. Thanks, Richard. > On Mon, Jan 24, 2022, 20:20 Aldy Hernandez wrote: >> >> On Mon, Jan 24, 2022 at 9:51 AM Richard Biener >> wrote: >> > >> > On Fri, Jan 21, 2022 at 1:12 PM Aldy Hernandez wrote: >> > > >> > > On Fri, Jan 21, 2022 at 11:56 AM Richard Biener >> > > wrote: >> > > > >> > > > On Fri, Jan 21, 2022 at 11:30 AM Aldy Hernandez >> > > > wrote: >> > > > > >> > > > > On Fri, Jan 21, 2022 at 10:43 AM Richard Biener >> > > > > wrote: >> > > > > > >> > > > > > On Fri, Jan 21, 2022 at 9:30 AM Aldy Hernandez via Gcc-patches >> > > > > > wrote: >> > > > > > > >> > > > > > > As discussed in PR103721, the problem here is that we are >> > > > > > > crossing a >> > > > > > > backedge and causing us to use relations from a previous >> > > > > > > iteration of a >> > > > > > > loop. >> > > > > > > >> > > > > > > This handles the testcases in both PR103721 and PR104067 which >> > > > > > > are variants >> > > > > > > of the same thing. >> > > > > > > >> > > > > > > Tested on x86-64 Linux with the usual regstrap as well as >> > > > > > > verifying the >> > > > > > > thread count before and after the patch. The number of threads >> > > > > > > is >> > > > > > > reduced by a miniscule amount. >> > > > > > > >> > > > > > > I assume we need release manager approval at this point? OK for >> > > > > > > trunk? >> > > > > > >> > > > > > Not for regression fixes. >> > > > > >> > > > > OK, I've pushed it to fix the P1s. We can continue refining the >> > > > > solution in a follow-up patch. >> > > > > >> > > > > > >> > > > > > Btw, I wonder whether you have to treat irreducible regions in the >> > > > > > same >> > > > > > way more generally - which edges are marked as backedge there >> > > > > > depends >> > > > > > on which edge into the region was visited first. I also wonder >> > > > > > how we >> > > > > >> > > > > Jeff, Andrew?? >> > > > > >> > > > > > I also wonder how we guarantee that all users of the resolve mode >> > > > > > have backedges marked >> > > > > > properly? Also note that CFG manipulation routines in general do >> > > > > > not >> > > > > > keep backedge markings up-to-date so incremental modification and >> > > > > > resolving queries might not mix. >> > > > > > >> > > > > > It's a bit unfortunate that we can't query the CFG on whether >> > > > > > backedges >> > > > > > are marked. >> > > > > >> > > > > Ughh. The call to mark_dfs_back_edges is currently in the backward >> > > > > threader. Perhaps we could put it in the path_range_query >> > > > > constructor? That way other users of path_range_query can benefit >> > > > > (loop_ch for instance, though it doesn't use it in a way that crosses >> > > > > backedges so perhaps it's unaffected). Does that sound reasonable? >> > > > >> > > > Hmm, I'd rather keep the burden on the callers because many already >> > > > should have backedges marked. What you could instead do is >> > > > add sth like >> > > > >> > > > if (flag_checking) >> > > > { >> > > >auto_edge_flag saved_dfs_back; >> > > >for-each-edge-in-cfg () set saved_dfs_back flag if dfs_back is >> > > > set, clear dfs_back >> > > >mark_dfs_back_edges (); >> > > >for-each-edge-in-cfg () verify the flags are set on the same >> > > > edges and clear saved_dfs_back >> > > > } >> > > > >> > > > to the path_range_query constructor. That way we at least notice when >> > > > passes >> > > > do _not_ have the backedges marked properly. >> > > >> > > Sounds good. Thanks. >> > > >> > > I've put the verifier by mark_dfs_back_edges(), since it really has >> > > nothing to do with the path solver. Perhaps it's useful for someone >> > > else. >> > > >> > > The patch triggered with the loop-ch use, so I've added a >> > > corresponding mark_dfs_back_edges there. >> > > >> > > OK pending tests? >> > >> > Please rename dfs_back_edges_available_p to sth not suggesting >> > it's a simple predicate check, like maybe verify_marked_backedges. >> > >> > + >> > + gcc_checking_assert (!m_resolve || dfs_back_edges_available_p ()); >> > >> > I'd prefer the following which allows even release checking builds >> > to verify this with -fchecking. >> > >> > if (!m_resolve) >> > if (flag_checking) >> > verify_marked_backedges (); >> > >> > and then ideally verify_marked_backedges () should raise an >> > internal_error () for the edge not marked properly, best by >> > also specifying it. Just like other CFG verifiers do. >> > >> > The loop ch and backwards threader changes are OK. You >> > can post the verification independen
Re: [PATCH] adjust warn-access pass placement [PR104260]
On Wed, Feb 2, 2022 at 12:28 AM Martin Sebor wrote: > > The attached patch adjusts the placement of the warn-access pass > as the two of you suggested in the bug. Please let me know if > this is good to commit or if you want me to make some other tweaks. > > The patch passes tests in an x86_64-linux bootstrap. OK. Thanks, Richard. > Martin
Re: [PATCH] [PATCH,v3,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/31/22 09:00, Richard Sandiford wrote: Dan Li writes: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. /* This file should be included last. */ #include "target-def.h" @@ -7478,10 +7479,31 @@ aarch64_layout_frame (void) frame.sve_callee_adjust = 0; frame.callee_offset = 0; + /* Shadow call stack only deal with functions where the LR is pushed Typo: s/deal/deals/ Sorry for my non-standard English expression :) + onto the stack and without specifying the "no_sanitize" attribute + with the argument "shadow-call-stack". */ + frame.is_scs_enabled += (!crtl->calls_eh_return + && (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); Nit, but normal GCC style would be to use a single chain of &&s here: frame.is_scs_enabled = (!crtl->calls_eh_return && sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0)); Got it. + + /* When shadow call stack is enabled, the scs_pop in the epilogue will + restore x30, and we don't need to pop x30 again in the traditional + way. At this time, if candidate2 is x30, we need to adjust + max_push_offset to 256 to ensure that the offset meets the requirements + of emit_move_insn. Similarly, if candidate1 is x30, we need to set + max_push_offset to 0, because x30 is not popped up at this time, so + callee_adjust cannot be adjusted. */ HOST_WIDE_INT max_push_offset = 0; if (frame.wb_candidate2 != INVALID_REGNUM) -max_push_offset = 512; - else if (frame.wb_candidate1 != INVALID_REGNUM) +{ + if (frame.is_scs_enabled && frame.wb_candidate2 == R30_REGNUM) + max_push_offset = 256; + else + max_push_offset = 512; +} + else if ((frame.wb_candidate1 != INVALID_REGNUM) + && !(frame.is_scs_enabled && frame.wb_candidate1 == R30_REGNUM)) max_push_offset = 256; HOST_WIDE_INT const_size, const_outgoing_args_size, const_fp_offset; Maybe we should instead add separate fields for wb_push_candidate[12] and wb_pop_candidate[12]. The pop candidates would start out the same as the push candidates, but R30_REGNUM would get replaced with INVALID_REGNUM for SCS. This looks more reasonable, I'll change it in the next version. Admittedly, suppressing the restore of x30 is turning out to be a bit more difficult than I'd realised :-/ […] diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 2792bb29adb..1610a4fd74c 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -916,6 +916,10 @@ struct GTY (()) aarch64_frame unsigned spare_pred_reg; bool laid_out; + + /* Nonzero if shadow call stack should be enabled for the current + function, otherwise return FALSE. */ “True” seems better than “Nonzero” given that this is a bool. (A lot of GCC bools were originally ints, which is why “nonzero” still appears in non-obvious places.) I think we can just drop “otherwise return FALSE”: “return” doesn't seem appropriate here, given that it's a variable. Got it, thanks for the explanation. Looks great otherwise. Thanks especially for testing the corner cases. :-) One minor thing: +/* { dg-final { scan-assembler-times "str\tx30, \\\[x18\\\], \[#|$\]?8" 2 } } */ +/* { dg-final { scan-assembler-times "ldr\tx30, \\\[x18, \[#|$\]?-8\\\]!" 2 } } */ This sort of regexp can be easier to write if you quote them using {…} rather than "…", since it reduces the number of backslashes needed. E.g.: /* { dg-final { scan-assembler-times {str\tx30, \[x18\], [#|$]?8} 2 } } */ The current version is fine too though, and is widely used. Just mentioning it in case it's useful in future. Oh, thanks Richard, I didn't notice it before. Also, [#|$]? can be written #?. Ok. Thanks, Richard
Re: ifcvt: Fix PR104153 and PR104198
Hi Robin, > Bootstrapped and reg-tested on s390x, Power 9, x86 and SPARC. I've now also tested the patch on sparcv9-sun-solaris2.11: no regressions. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH][GCC11] IBM Z: fix `section type conflict` with -mindirect-branch-table
Bootstrapped and regtested on s390x-redhat-linux. Ok for releases/gcc-11? s390_code_end () puts indirect branch tables into separate sections and tries to switch back to wherever it was in the beginning by calling switch_to_section (current_function_section ()). First of all, this is unnecessary - the other backends don't do it. Furthermore, at this time there is no current function, but if the last processed function was cold, in_cold_section_p remains set. This causes targetm.asm_out.function_section () to call targetm.section_type_flags (), which in absence of current function decl classifies the section as SECTION_WRITE. This causes a section type conflict with the existing SECTION_CODE. gcc/ChangeLog: * config/s390/s390.c (s390_code_end): Do not switch back to code section. gcc/testsuite/ChangeLog: * gcc.target/s390/nobp-section-type-conflict.c: New test. (cherry picked from commit 8753b13a31c777cdab0265dae0b68534247908f7) --- gcc/config/s390/s390.c| 1 - .../s390/nobp-section-type-conflict.c | 22 +++ 2 files changed, 22 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 8895dd7cc76..2d2e6522eb4 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -16700,7 +16700,6 @@ s390_code_end (void) assemble_name_raw (asm_out_file, label_start); fputs ("-.\n", asm_out_file); } - switch_to_section (current_function_section ()); } } } diff --git a/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c new file mode 100644 index 000..5d78bc99bb5 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c @@ -0,0 +1,22 @@ +/* Checks that we don't get error: section type conflict with ‘put_page’. */ + +/* { dg-do compile } */ +/* { dg-options "-mindirect-branch=thunk-extern -mfunction-return=thunk-extern -mindirect-branch-table -O2" } */ + +int a; +int b (void); +void c (int); + +static void +put_page (void) +{ + if (b ()) +c (a); +} + +__attribute__ ((__section__ (".init.text"), __cold__)) void +d (void) +{ + put_page (); + put_page (); +} -- 2.34.1
Re: [PATCH][GCC11] IBM Z: fix `section type conflict` with -mindirect-branch-table
On 2/2/22 12:57, Ilya Leoshkevich wrote: > Bootstrapped and regtested on s390x-redhat-linux. Ok for > releases/gcc-11? > > > > s390_code_end () puts indirect branch tables into separate sections and > tries to switch back to wherever it was in the beginning by calling > switch_to_section (current_function_section ()). > > First of all, this is unnecessary - the other backends don't do it. > > Furthermore, at this time there is no current function, but if the > last processed function was cold, in_cold_section_p remains set. This > causes targetm.asm_out.function_section () to call > targetm.section_type_flags (), which in absence of current function > decl classifies the section as SECTION_WRITE. This causes a section > type conflict with the existing SECTION_CODE. > > gcc/ChangeLog: > > * config/s390/s390.c (s390_code_end): Do not switch back to > code section. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/nobp-section-type-conflict.c: New test. Ok. Thanks! Andreas > > (cherry picked from commit 8753b13a31c777cdab0265dae0b68534247908f7) > --- > gcc/config/s390/s390.c| 1 - > .../s390/nobp-section-type-conflict.c | 22 +++ > 2 files changed, 22 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c > > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > index 8895dd7cc76..2d2e6522eb4 100644 > --- a/gcc/config/s390/s390.c > +++ b/gcc/config/s390/s390.c > @@ -16700,7 +16700,6 @@ s390_code_end (void) > assemble_name_raw (asm_out_file, label_start); > fputs ("-.\n", asm_out_file); > } > - switch_to_section (current_function_section ()); > } > } > } > diff --git a/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c > b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c > new file mode 100644 > index 000..5d78bc99bb5 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c > @@ -0,0 +1,22 @@ > +/* Checks that we don't get error: section type conflict with ‘put_page’. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-mindirect-branch=thunk-extern > -mfunction-return=thunk-extern -mindirect-branch-table -O2" } */ > + > +int a; > +int b (void); > +void c (int); > + > +static void > +put_page (void) > +{ > + if (b ()) > +c (a); > +} > + > +__attribute__ ((__section__ (".init.text"), __cold__)) void > +d (void) > +{ > + put_page (); > + put_page (); > +}
[PING] PR100499 patches
I'd like to ping the remaining two of the series to fix PR100499: Refactor LSHIFT_EXPR handling of multiple_of_p to use the correct type https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589172.html Adjust multiple_of_p to work correctly for wrapping operations (in some cases) https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589267.html Thanks, Richard.
Re: [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct
On Wed, Feb 02, 2022 at 09:19:03AM +0100, Marcel Vollweiler wrote: > gcc/c-family/ChangeLog: > > * c-omp.cc (c_omp_split_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. > * c-pragma.h (enum pragma_kind): Added 5.1 in comment. > (enum pragma_omp_clause): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR. > > gcc/c/ChangeLog: > > * c-parser.cc (c_parser_omp_clause_name): Parse 'has_device_addr' > clause. > (c_parser_omp_variable_list): Handle array sections. > (c_parser_omp_clause_has_device_addr): Added. > (c_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR > case. > (c_parser_omp_target_exit_data): Added HAS_DEVICE_ADDR to > OMP_CLAUSE_MASK. > * c-typeck.cc (handle_omp_array_sections): Handle clause restrictions. > (c_finish_omp_clauses): Handle array sections. > > gcc/cp/ChangeLog: > > * parser.cc (cp_parser_omp_clause_name): Parse 'has_device_addr' clause. > (cp_parser_omp_var_list_no_open): Handle array sections. > (cp_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR > case. > (cp_parser_omp_target_update): Added HAS_DEVICE_ADDR to OMP_CLAUSE_MASK. > * pt.c (tsubst_omp_clauses): Added cases for OMP_CLAUSE_HAS_DEVICE_ADDR. > * semantics.cc (handle_omp_array_sections): Handle clause restrictions. > (finish_omp_clauses): Handle array sections. > > gcc/fortran/ChangeLog: > > * dump-parse-tree.cc (show_omp_clauses): Added OMP_LIST_HAS_DEVICE_ADDR > case. > * gfortran.h: Added OMP_LIST_HAS_DEVICE_ADDR. > * openmp.cc (enum omp_mask2): Added OMP_CLAUSE_HAS_DEVICE_ADDR. > (gfc_match_omp_clauses): Parse HAS_DEVICE_ADDR clause. > (resolve_omp_clauses): Same. > * trans-openmp.cc (gfc_trans_omp_variable_list): Added > OMP_LIST_HAS_DEVICE_ADDR case. > (gfc_trans_omp_clauses): Firstprivatize of array descriptors. > > gcc/ChangeLog: > > * gimplify.cc (gimplify_scan_omp_clauses): Added cases for > OMP_CLAUSE_HAS_DEVICE_ADDR > and handle array sections. > (gimplify_adjust_omp_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. > * omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_HAS_DEVICE_ADDR. > (lower_omp_target): Same. > * tree-core.h (enum omp_clause_code): Same. > * tree-nested.cc (convert_nonlocal_omp_clauses): Same. > (convert_local_omp_clauses): Same. > * tree-pretty-print.cc (dump_omp_clause): Same. > * tree.cc: Same. > > libgomp/ChangeLog: > > * libgomp.texi: Updated entry for HAS_DEVICE_ADDR. > * target.c (copy_firstprivate_data): Copy only if host address is not > NULL. > * testsuite/libgomp.c++/target-has-device-addr-2.C: New test. > * testsuite/libgomp.c++/target-has-device-addr-4.C: New test. > * testsuite/libgomp.c++/target-has-device-addr-5.C: New test. > * testsuite/libgomp.c++/target-has-device-addr-6.C: New test. > * testsuite/libgomp.c-c++-common/target-has-device-addr-1.c: New test. > * testsuite/libgomp.c/target-has-device-addr-3.c: New test. > * testsuite/libgomp.fortran/target-has-device-addr-1.f90: New test. > * testsuite/libgomp.fortran/target-has-device-addr-2.f90: New test. > * testsuite/libgomp.fortran/target-has-device-addr-3.f90: New test. > * testsuite/libgomp.fortran/target-has-device-addr-4.f90: New test. > > gcc/testsuite/ChangeLog: > > * c-c++-common/gomp/clauses-1.c: Added has_device_addr to test cases. > * g++.dg/gomp/attrs-1.C: Added has_device_addr to test cases. > * g++.dg/gomp/attrs-2.C: Added has_device_addr to test cases. > * c-c++-common/gomp/target-has-device-addr-1.c: New test. > * c-c++-common/gomp/target-has-device-addr-2.c: New test. > * c-c++-common/gomp/target-is-device-ptr-1.c: New test. > * c-c++-common/gomp/target-is-device-ptr-2.c: New test. > * gfortran.dg/gomp/is_device_ptr-3.f90: New test. > * gfortran.dg/gomp/target-has-device-addr-1.f90: New test. > * gfortran.dg/gomp/target-has-device-addr-2.f90: New test. Ok, thanks. Jakub
[PATCH] lto: fix error handling for -Wl,-plugin-opt=debug
When one uses something like: -Wl,-plugin-opt=debug, we end up with lto1 WPA invocation that has 'debug' on command line. We interpret that as input filename. The patch moves resolution checking later so that we end up with a reasonable error message: lto1: error: open debug failed: No such file or directory lto1: fatal error: errors during merging of translation units Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin PR lto/104333 gcc/lto/ChangeLog: * lto-common.cc (read_cgraph_and_symbols): Move resolution checking for number of files later and report a reasonable error message. --- gcc/lto/lto-common.cc | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc index 4d6686b0b99..fbcd4742ee4 100644 --- a/gcc/lto/lto-common.cc +++ b/gcc/lto/lto-common.cc @@ -2704,6 +2704,7 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) { unsigned int i, last_file_ix; FILE *resolution; + unsigned resolution_objects = 0; int count = 0; struct lto_file_decl_data **decl_data; symtab_node *snode; @@ -2726,18 +2727,14 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) if (resolution_file_name) { int t; - unsigned num_objects; resolution = fopen (resolution_file_name, "r"); if (resolution == NULL) fatal_error (input_location, "could not open symbol resolution file: %m"); - t = fscanf (resolution, "%u", &num_objects); + t = fscanf (resolution, "%u", &resolution_objects); gcc_assert (t == 1); - - /* True, since the plugin splits the archives. */ - gcc_assert (num_objects == nfiles); } symtab->state = LTO_STREAMING; @@ -2806,7 +2803,12 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) lto_register_canonical_types_for_odr_types (); if (resolution_file_name) -fclose (resolution); +{ + /* True, since the plugin splits the archives. */ + if (!seen_error ()) + gcc_assert (resolution_objects == nfiles); + fclose (resolution); +} /* Show the LTO report before launching LTRANS. */ if (flag_lto_report || (flag_wpa && flag_lto_report_wpa)) -- 2.34.1
Re: [PATCH] lto: fix error handling for -Wl,-plugin-opt=debug
On Wed, Feb 2, 2022 at 3:31 PM Martin Liška wrote: > > When one uses something like: -Wl,-plugin-opt=debug, > we end up with lto1 WPA invocation that has 'debug' > on command line. We interpret that as input filename. > > The patch moves resolution checking later so that we end up with > a reasonable error message: > > lto1: error: open debug failed: No such file or directory I think almost all of these errors should be fatal_error, not error () since we cannot really recover. That would make .. > lto1: fatal error: errors during merging of translation units > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > PR lto/104333 > > gcc/lto/ChangeLog: > > * lto-common.cc (read_cgraph_and_symbols): Move resolution > checking for number of files later and report a reasonable > error message. > --- > gcc/lto/lto-common.cc | 14 -- > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc > index 4d6686b0b99..fbcd4742ee4 100644 > --- a/gcc/lto/lto-common.cc > +++ b/gcc/lto/lto-common.cc > @@ -2704,6 +2704,7 @@ read_cgraph_and_symbols (unsigned nfiles, const char > **fnames) > { > unsigned int i, last_file_ix; > FILE *resolution; > + unsigned resolution_objects = 0; > int count = 0; > struct lto_file_decl_data **decl_data; > symtab_node *snode; > @@ -2726,18 +2727,14 @@ read_cgraph_and_symbols (unsigned nfiles, const char > **fnames) > if (resolution_file_name) > { > int t; > - unsigned num_objects; > > resolution = fopen (resolution_file_name, "r"); > if (resolution == NULL) > fatal_error (input_location, > "could not open symbol resolution file: %m"); > > - t = fscanf (resolution, "%u", &num_objects); > + t = fscanf (resolution, "%u", &resolution_objects); > gcc_assert (t == 1); > - > - /* True, since the plugin splits the archives. */ > - gcc_assert (num_objects == nfiles); > } > symtab->state = LTO_STREAMING; > > @@ -2806,7 +2803,12 @@ read_cgraph_and_symbols (unsigned nfiles, const char > **fnames) > lto_register_canonical_types_for_odr_types (); > > if (resolution_file_name) > -fclose (resolution); > +{ > + /* True, since the plugin splits the archives. */ > + if (!seen_error ()) ... checking for seen_error () unnecessary. > + gcc_assert (resolution_objects == nfiles); > + fclose (resolution); > +} > > /* Show the LTO report before launching LTRANS. */ > if (flag_lto_report || (flag_wpa && flag_lto_report_wpa)) > -- > 2.34.1 >
[PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
This adds a flag to CONSTRUCTOR nodes indicating that for clobbers this marks the end-of-life of storage as opposed to just ending the lifetime of the object that occupied it. The dangling pointer diagnostics uses CLOBBERs but is confused by those emitted by the C++ frontend for example which emits them for the second purpose at the start of CTORs. The issue is also appearant for aarch64 in PR104092. Distinguishing the two cases is also necessary for the PR90348 fix. Bootstrapped and tested on x86_64-unknown-linux-gnu. I verified the bogus diagnostic in PR104092 is gone with a cross. OK for trunk? Thanks, Richard. 2022-02-02 Richard Biener PR middle-end/90348 PR middle-end/104092 * tree-core.h: Document protected_flag use on CONSTRUCTOR nodes. * tree.h (CLOBBER_MARKS_EOL): Add. * tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs. * gimplify.c (gimplify_bind_expr): Mark storage end-of-life clobbers with CLOBBER_MARKS_EOL. (gimplify_target_expr): Likewise. * tree-inline.cc (expand_call_inline): Likewise. * tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise. * gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat CLOBBER_MARKS_EOL clobbers as ending lifetime of storage. * gcc.dg/pr87052.c: Adjust. --- gcc/gimple-ssa-warn-access.cc | 4 +++- gcc/gimplify.cc| 2 ++ gcc/testsuite/gcc.dg/pr87052.c | 2 +- gcc/tree-core.h| 3 +++ gcc/tree-inline.cc | 2 ++ gcc/tree-pretty-print.cc | 6 +- gcc/tree-ssa-ccp.cc| 1 + gcc/tree.h | 3 +++ 8 files changed, 20 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index ad5e2f4458e..4890737587c 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -4330,7 +4330,9 @@ is_auto_decl (tree x) void pass_waccess::check_stmt (gimple *stmt) { - if (m_check_dangling_p && gimple_clobber_p (stmt)) + if (m_check_dangling_p + && gimple_clobber_p (stmt) + && CLOBBER_MARKS_EOL (gimple_assign_rhs1 (stmt))) { /* Ignore clobber statemts in blocks with exceptional edges. */ basic_block bb = gimple_bb (stmt); diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index cd4b61362b4..253f2c9af64 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -1476,6 +1476,7 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p) && flag_stack_reuse != SR_NONE) { tree clobber = build_clobber (TREE_TYPE (t)); + CLOBBER_MARKS_EOL (clobber) = 1; gimple *clobber_stmt; clobber_stmt = gimple_build_assign (t, clobber); gimple_set_location (clobber_stmt, end_locus); @@ -6982,6 +6983,7 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) if (flag_stack_reuse == SR_ALL) { tree clobber = build_clobber (TREE_TYPE (temp)); + CLOBBER_MARKS_EOL (clobber) = 1; clobber = build2 (MODIFY_EXPR, TREE_TYPE (temp), temp, clobber); gimple_push_cleanup (temp, clobber, false, pre_p, true); } diff --git a/gcc/testsuite/gcc.dg/pr87052.c b/gcc/testsuite/gcc.dg/pr87052.c index 2affc480f4e..18e092c4674 100644 --- a/gcc/testsuite/gcc.dg/pr87052.c +++ b/gcc/testsuite/gcc.dg/pr87052.c @@ -38,4 +38,4 @@ void test (void) { dg-final { scan-tree-dump-times "c = \"\";" 1 "gimple" } } { dg-final { scan-tree-dump-times "d = { *};" 1 "gimple" } } { dg-final { scan-tree-dump-times "e = " 1 "gimple" } } - { dg-final { scan-tree-dump-times "e = {CLOBBER}" 1 "gimple" } } */ + { dg-final { scan-tree-dump-times "e = {CLOBBER\\(eol\\)}" 1 "gimple" } } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index e83669f20d8..924a6a2b2cf 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -1281,6 +1281,9 @@ struct GTY(()) tree_base { ASM_INLINE_P in ASM_EXPR + CLOBBER_MARKS_EOL in + CONSTRUCTOR + side_effects_flag: TREE_SIDE_EFFECTS in diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 497aa667eb9..47e895dad68 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -5139,6 +5139,7 @@ expand_call_inline (basic_block bb, gimple *stmt, copy_body_data *id, && !(id->debug_map && id->debug_map->get (p))) { tree clobber = build_clobber (TREE_TYPE (*varp)); + CLOBBER_MARKS_EOL (clobber) = 1; gimple *clobber_stmt; clobber_stmt = gimple_build_assign (*varp, clobber); gimple_set_location (clobber_stmt, gimple_location (stmt)); @@ -5208,6 +5209,7 @@ expand_call_inline (basic_block bb, gimple *stmt, copy_body_data *id, && !stmt_ends_bb_p (stmt)) { tree clobber = build_clobber (TREE_TYPE (id->retvar)); + CLOBBER_MAR
Re: [PATCH] lto: fix error handling for -Wl,-plugin-opt=debug
On 2/2/22 15:38, Richard Biener wrote: ... checking for seen_error () unnecessary. Sure, so something like this? Ready to be installed? Thanks, MartinFrom fd7385e495acfced416b37320b4bb7475bf2f9ff Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 2 Feb 2022 14:21:51 +0100 Subject: [PATCH] lto: fix error handling for -Wl,-plugin-opt=debug When one uses something like: -Wl,-plugin-opt=debug, we end up with lto1 WPA invocation that has 'debug' on command line. We interpret that as input filename. The patch moves resolution checking later so that we end up with a reasonable error message: lto1: fatal error: open debug failed: No such file or directory compilation terminated. PR lto/104333 gcc/lto/ChangeLog: * lto-common.cc (read_cgraph_and_symbols): Move resolution checking for number of files later and report a reasonable error message. * lto-object.cc (lto_obj_file_open): Make error fatal. --- gcc/lto/lto-common.cc | 13 +++-- gcc/lto/lto-object.cc | 8 ++-- 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc index 4d6686b0b99..11fde671162 100644 --- a/gcc/lto/lto-common.cc +++ b/gcc/lto/lto-common.cc @@ -2704,6 +2704,7 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) { unsigned int i, last_file_ix; FILE *resolution; + unsigned resolution_objects = 0; int count = 0; struct lto_file_decl_data **decl_data; symtab_node *snode; @@ -2726,18 +2727,14 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) if (resolution_file_name) { int t; - unsigned num_objects; resolution = fopen (resolution_file_name, "r"); if (resolution == NULL) fatal_error (input_location, "could not open symbol resolution file: %m"); - t = fscanf (resolution, "%u", &num_objects); + t = fscanf (resolution, "%u", &resolution_objects); gcc_assert (t == 1); - - /* True, since the plugin splits the archives. */ - gcc_assert (num_objects == nfiles); } symtab->state = LTO_STREAMING; @@ -2806,7 +2803,11 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) lto_register_canonical_types_for_odr_types (); if (resolution_file_name) -fclose (resolution); +{ + /* True, since the plugin splits the archives. */ + gcc_assert (resolution_objects == nfiles); + fclose (resolution); +} /* Show the LTO report before launching LTRANS. */ if (flag_lto_report || (flag_wpa && flag_lto_report_wpa)) diff --git a/gcc/lto/lto-object.cc b/gcc/lto/lto-object.cc index 6f7f96f14d9..329bbc71fd6 100644 --- a/gcc/lto/lto-object.cc +++ b/gcc/lto/lto-object.cc @@ -103,10 +103,7 @@ lto_obj_file_open (const char *filename, bool writable) : O_RDONLY | O_BINARY), 0666); if (lo->fd == -1) -{ - error ("open %s failed: %s", fname, xstrerror (errno)); - goto fail; -} +fatal_error (input_location, "open %s failed: %s", fname, xstrerror (errno)); if (!writable) { @@ -146,13 +143,12 @@ lto_obj_file_open (const char *filename, bool writable) return &lo->base; - fail_errmsg: +fail_errmsg: if (err == 0) error ("%s: %s", fname, errmsg); else error ("%s: %s: %s", fname, errmsg, xstrerror (err)); - fail: if (lo->fd != -1) lto_obj_file_close ((lto_file *) lo); free (lo); -- 2.34.1
Re: [PATCH] lto: fix error handling for -Wl,-plugin-opt=debug
On Wed, Feb 2, 2022 at 3:48 PM Martin Liška wrote: > > On 2/2/22 15:38, Richard Biener wrote: > > ... checking for seen_error () unnecessary. > > Sure, so something like this? yes, I think so. > Ready to be installed? OK. > Thanks, > Martin
Re: [PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
On Wed, Feb 02, 2022 at 03:42:48PM +0100, Richard Biener via Gcc-patches wrote: > This adds a flag to CONSTRUCTOR nodes indicating that for > clobbers this marks the end-of-life of storage as opposed to > just ending the lifetime of the object that occupied it. > The dangling pointer diagnostics uses CLOBBERs but is confused > by those emitted by the C++ frontend for example which emits > them for the second purpose at the start of CTORs. The issue > is also appearant for aarch64 in PR104092. I think many further build_clobber calls actually should build those EOL clobbers, I think we'd need to go through them one by one and see what kind of clobber it is. E.g. I think all of omp-low.cc build_clobber calls are like that. C++ FE indeed emits some clobbers at the start of ctors with -flifetime-dse=2, but others e.g. at the end of dtors, dunno about those, though maybe the object doesn't go out of scope at that point yet. What about expansion and tree-ssa-live.cc, should those keep doing what it does and not ignore clobbers that aren't CLOBBER_MARKS_EOL? Jakub
[PATCH][pushed] Remove dead macro: TEXT_SECTION_NAME
The macro is unused. I'm going to push the patch. Martin gcc/ChangeLog: * dwarf2out.cc (TEXT_SECTION_NAME): Remove unused macro. --- gcc/dwarf2out.cc | 5 - 1 file changed, 5 deletions(-) diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc index ad1d804dcaf..e60575b1398 100644 --- a/gcc/dwarf2out.cc +++ b/gcc/dwarf2out.cc @@ -4186,11 +4186,6 @@ new_addr_loc_descr (rtx addr, enum dtprel_bool dtprel) #define DEBUG_LTO_LINE_STR_SECTION ".gnu.debuglto_.debug_line_str" #endif -/* Standard ELF section names for compiled code and data. */ -#ifndef TEXT_SECTION_NAME -#define TEXT_SECTION_NAME ".text" -#endif - /* Section flags for .debug_str section. */ #define DEBUG_STR_SECTION_FLAGS \ (HAVE_GAS_SHF_MERGE && flag_merge_debug_strings \ -- 2.34.1
[committed] analyzer: stop -ftrivial-auto-var-init from suppressing uninit warnings [PR104270]
GCC 12 has gained two features for dealing with uninitialized variables: (a) a new -Wanalyzer-use-of-uninitialized-value warning within -fanalyzer for interprocedural path-sensitive detection of ununit uses, and (b) a new -ftrivial-auto-var-init option for mitigating some uses of uninit variables It turns out that using (b) was thwarting (a), as it led to -fanalyzer seeing calls to IFN_DEFERRED_INIT, which -fanalyzer wasn't special-casing, thus treating it as initializing the variables in question, and thus silencing -Wanalyzer-use-of-uninitialized-value on them. invoke.texi says: "GCC still considers an automatic variable that doesn't have an explicit initializer as uninitialized, @option{-Wuninitialized} will still report warning messages on such automatic variables." and thus -Wanalyzer-use-of-uninitialized-value ought to as well. This patch adds special-case handling to -fanalyzer for IFN_DEFERRED_INIT, so that -fanalyzer will warn on uninit uses of variables that are mitigated by -ftrivial-auto-var-init. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Not strictly a regression, but as this affects two new GCC 12 features it seems appropriate to fix in stage 4. Pushed to trunk as r12-6997-g9b4eee5fd158c4ee75d1f1000debbf5082fb9b56. gcc/analyzer/ChangeLog: PR analyzer/104270 * region-model.cc (region_model::on_call_pre): Handle IFN_DEFERRED_INIT. gcc/testsuite/ChangeLog: PR analyzer/104270 * gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: New test. * gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c: New test. * gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/region-model.cc | 10 ++ .../analyzer/uninit-trivial-auto-var-init-pattern.c| 7 +++ .../uninit-trivial-auto-var-init-uninitialized.c | 7 +++ .../analyzer/uninit-trivial-auto-var-init-zero.c | 7 +++ 4 files changed, 31 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 6810cf508d9..4c312b053f8 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1109,6 +1109,16 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt, bool unknown_side_effects = false; + /* Special-case for IFN_DEFERRED_INIT. + We want to report uninitialized variables with -fanalyzer (treating + -ftrivial-auto-var-init= as purely a mitigation feature). + Handle IFN_DEFERRED_INIT by treating it as no-op: don't touch the + lhs of the call, so that it is still uninitialized from the point of + view of the analyzer. */ + if (gimple_call_internal_p (call) + && gimple_call_internal_fn (call) == IFN_DEFERRED_INIT) +return false; + /* Some of the cases below update the lhs of the call based on the return value, but not all. Provide a default value, which may get overwritten below. */ diff --git a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c new file mode 100644 index 000..0b78dc65267 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c @@ -0,0 +1,7 @@ +/* { dg-additional-options "-ftrivial-auto-var-init=pattern" } */ + +int test_1 (void) +{ + int i; /* { dg-message "region created on stack here" } */ + return i; /* { dg-warning "use of uninitialized value 'i'" } */ +} diff --git a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c new file mode 100644 index 000..124d3a327b8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c @@ -0,0 +1,7 @@ +/* { dg-additional-options "-ftrivial-auto-var-init=uninitialized" } */ + +int test_1 (void) +{ + int i; /* { dg-message "region created on stack here" } */ + return i; /* { dg-warning "use of uninitialized value 'i'" } */ +} diff --git a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c new file mode 100644 index 000..ef7dc674867 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c @@ -0,0 +1,7 @@ +/* { dg-additional-options "-ftrivial-auto-var-init=zero" } */ + +int test_1 (void) +{ + int i; /* { dg-message "region created on stack here" } */ + return i; /* { dg-warning "use of uninitialized value 'i'" } */ +} -- 2.26.3
Re: [PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
On Wed, 2 Feb 2022, Jakub Jelinek wrote: > On Wed, Feb 02, 2022 at 03:42:48PM +0100, Richard Biener via Gcc-patches > wrote: > > This adds a flag to CONSTRUCTOR nodes indicating that for > > clobbers this marks the end-of-life of storage as opposed to > > just ending the lifetime of the object that occupied it. > > The dangling pointer diagnostics uses CLOBBERs but is confused > > by those emitted by the C++ frontend for example which emits > > them for the second purpose at the start of CTORs. The issue > > is also appearant for aarch64 in PR104092. > > I think many further build_clobber calls actually should build > those EOL clobbers, I think we'd need to go through them one by one and see > what kind of clobber it is. > E.g. I think all of omp-low.cc build_clobber calls are like that. Yeah, there may be others that also end lifetime of the storage. It should be pretty safe since the only consumer after this patch is the access warning. > C++ FE indeed emits some clobbers at the start of ctors with > -flifetime-dse=2, but others e.g. at the end of dtors, dunno about those, > though maybe the object doesn't go out of scope at that point yet. > What about expansion and tree-ssa-live.cc, should those keep doing > what it does and not ignore clobbers that aren't CLOBBER_MARKS_EOL? Yes. Basically EOL clobbers act as regular clobbers _plus_ they also note the end of life of the underlying storage, so treating those like regular clobbers is safe, likewise it's not a regression to not ignore !CLOBBER_MARKS_EOL clobbers elsewhere and I'd rather have testcases showing the current behavior is wrong for those than changing other places. The inliner and CCP cases I ran into when fixing PR90348, I did not run into any others sofar. Richard.
[committed] analyzer: implement bit_range_region
GCC 12 has gained -Wanalyzer-use-of-uninitialized-value, and I'm seeing various false positives from it due to region_model::get_lvalue not properly handling BIT_FIELD_REF, and falling back to using an UNKNOWN_REGION for them. This patch fixes these false positives by implementing a new bit_range_region region subclass for handling BIT_FIELD_REF. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. This is a larger patch than I'd prefer in stage 4, but it prevents false positives from a new warning, so it seems appropriate to fix in stage 4. Pushed to trunk as r12-6998-g93e759fc18a1a4. gcc/analyzer/ChangeLog: * analyzer.h (class bit_range_region): New forward decl. * region-model-manager.cc (region_model_manager::get_bit_range): New. (region_model_manager::log_stats): Handle m_bit_range_regions. * region-model.cc (region_model::get_lvalue_1): Handle BIT_FIELD_REF. * region-model.h (region_model_manager::get_bit_range): New decl. (region_model_manager::m_bit_range_regions): New field. * region.cc (region::get_base_region): Handle RK_BIT_RANGE. (region::base_region_p): Likewise. (region::calc_offset): Likewise. (bit_range_region::dump_to_pp): New. (bit_range_region::get_byte_size): New. (bit_range_region::get_bit_size): New. (bit_range_region::get_byte_size_sval): New. (bit_range_region::get_relative_concrete_offset): New. * region.h (enum region_kind): Add RK_BIT_RANGE. (region::dyn_cast_bit_range_region): New vfunc. (class bit_range_region): New. (is_a_helper ::test): New. (default_hash_traits): New. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/torture/uninit-bit-field-ref.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/analyzer.h | 1 + gcc/analyzer/region-model-manager.cc | 20 + gcc/analyzer/region-model.cc | 14 +++ gcc/analyzer/region-model.h | 4 + gcc/analyzer/region.cc| 84 + gcc/analyzer/region.h | 89 +++ .../analyzer/torture/uninit-bit-field-ref.c | 31 +++ 7 files changed, 243 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-bit-field-ref.c diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h index c5bca2dec64..7e58bcd5d70 100644 --- a/gcc/analyzer/analyzer.h +++ b/gcc/analyzer/analyzer.h @@ -67,6 +67,7 @@ class region; class cast_region; class field_region; class string_region; + class bit_range_region; class region_model_manager; struct model_merger; class store_manager; diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc index ba835cba22c..010ad078849 100644 --- a/gcc/analyzer/region-model-manager.cc +++ b/gcc/analyzer/region-model-manager.cc @@ -1494,6 +1494,25 @@ region_model_manager::get_region_for_string (tree string_cst) return reg; } +/* Return the region that describes accessing BITS within PARENT as TYPE, + creating it if necessary. */ + +const region * +region_model_manager::get_bit_range (const region *parent, tree type, +const bit_range &bits) +{ + gcc_assert (parent); + + bit_range_region::key_t key (parent, type, bits); + if (bit_range_region *reg = m_bit_range_regions.get (key)) +return reg; + + bit_range_region *bit_range_reg += new bit_range_region (alloc_region_id (), parent, type, bits); + m_bit_range_regions.put (key, bit_range_reg); + return bit_range_reg; +} + /* If we see a tree code we don't know how to handle, rather than ICE or generate bogus results, create a dummy region, and notify CTXT so that it can mark the new state as being not properly @@ -1663,6 +1682,7 @@ region_model_manager::log_stats (logger *logger, bool show_objs) const log_uniq_map (logger, show_objs, "frame_region", m_frame_regions); log_uniq_map (logger, show_objs, "symbolic_region", m_symbolic_regions); log_uniq_map (logger, show_objs, "string_region", m_string_map); + log_uniq_map (logger, show_objs, "bit_range_region", m_bit_range_regions); logger->log (" # managed dynamic regions: %i", m_managed_dynamic_regions.length ()); m_store_mgr.log_stats (logger, show_objs); diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 4c312b053f8..58c7028fc9c 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1707,6 +1707,20 @@ region_model::get_lvalue_1 (path_var pv, region_model_context *ctxt) const } break; +case BIT_FIELD_REF: + { + tree inner_expr = TREE_OPERAND (expr, 0); + const region *inner_reg = get_lvalue (inner_expr, ctxt); + tree num_bits = TREE_OPERAND (expr, 1); + tree first_bit_offset = TREE_OPERAND (expr, 2); + gcc_assert (TREE_CODE (num_bi
Re: [PATCH] reload: Adjust comment in find_reloads about subset, not intersection
Hans-Peter Nilsson writes: >> From: Richard Sandiford >> Hans-Peter Nilsson via Gcc-patches writes: >> > The mystery isn't so much that there's code mismatching comments or >> > intent, but that this code has been there "forever". There has been a >> > function reg_classes_intersect_p, in gcc since r0-54, even *before* >> > there was reload.c; r0-361, so why the "we don't have a way of forming >> > the intersection" in the actual patch and why wasn't this fixed >> > (perhaps not even noticed) when reload was state-of-the-art? >> >> But doesn't the comment > > (the second, patched comment; removed in the patch) > >> mean that we have/had no way of getting >> a *class* that is the intersection of preferred_class[i] and >> this_alternative[i], to store as the new >> this_alternative[i]? > > Yes, that's likely what's going on in the (second) comment > and code; needing a s/intersection/a class for the > intersection/, but the *first* comment is pretty clear that > the intent is exactly to "override" this_alternative[i]: "If > not (a subclass), but it intersects that class, use the > preferred class instead". But of course, that doesn't > actually have to make sense as code! And indeed it doesn't, > as you say. > >> If the alternatives were register sets rather than classes, >> I think the intended effect would be: >> >> this_alternative[i] &= preferred_class[i]; >> >> (i.e. AND_HARD_REG_SET in old money). >> >> It looks like the patch would instead make this_alternative[i] include >> registers that the alternative doesn't actually accept, which feels odd. > > Perhaps I put too much trust in the sanity of old comments. > > How about I actually commit this one instead? Better get it > right before reload is removed. :-) LGTM, but I'd like to hear Jeff's opinion. Thanks, Richard > 8< --- >8 > "reload: Adjust comment in find_reloads about subset, not intersection" > gcc: > > * reload.cc (find_reloads): Align comment with code where > considering the intersection of register classes then tweaking the > regclass for the current alternative or rejecting it. > --- > gcc/reload.cc | 15 +-- > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/gcc/reload.cc b/gcc/reload.cc > index 664082a533d9..3ed901e39447 100644 > --- a/gcc/reload.cc > +++ b/gcc/reload.cc > @@ -3635,9 +3635,11 @@ find_reloads (rtx_insn *insn, int replace, int > ind_levels, int live_known, >a hard reg and this alternative accepts some >register, see if the class that we want is a subset >of the preferred class for this register. If not, > - but it intersects that class, use the preferred class > - instead. If it does not intersect the preferred > - class, show that usage of this alternative should be > + but it intersects that class, we'd like to use the > + intersection, but the best we can do is to use the > + preferred class, if it is instead a subset of the > + class we want in this alternative. If we can't use > + it, show that usage of this alternative should be >discouraged; it will be discouraged more still if the >register is `preferred or nothing'. We do this >because it increases the chance of reusing our spill > @@ -3664,9 +3666,10 @@ find_reloads (rtx_insn *insn, int replace, int > ind_levels, int live_known, > if (! reg_class_subset_p (this_alternative[i], > preferred_class[i])) > { > - /* Since we don't have a way of forming the intersection, > - we just do something special if the preferred class > - is a subset of the class we have; that's the most > + /* Since we don't have a way of forming a register > + class for the intersection, we just do > + something special if the preferred class is a > + subset of the class we have; that's the most >common case anyway. */ > if (reg_class_subset_p (preferred_class[i], > this_alternative[i]))
Re: [PATCH] reload: Adjust comment in find_reloads about subset, not intersection
Richard Sandiford writes: > Hans-Peter Nilsson writes: >>> From: Richard Sandiford >>> Hans-Peter Nilsson via Gcc-patches writes: >>> > The mystery isn't so much that there's code mismatching comments or >>> > intent, but that this code has been there "forever". There has been a >>> > function reg_classes_intersect_p, in gcc since r0-54, even *before* >>> > there was reload.c; r0-361, so why the "we don't have a way of forming >>> > the intersection" in the actual patch and why wasn't this fixed >>> > (perhaps not even noticed) when reload was state-of-the-art? >>> >>> But doesn't the comment >> >> (the second, patched comment; removed in the patch) >> >>> mean that we have/had no way of getting >>> a *class* that is the intersection of preferred_class[i] and >>> this_alternative[i], to store as the new >>> this_alternative[i]? >> >> Yes, that's likely what's going on in the (second) comment >> and code; needing a s/intersection/a class for the >> intersection/, but the *first* comment is pretty clear that >> the intent is exactly to "override" this_alternative[i]: "If >> not (a subclass), but it intersects that class, use the >> preferred class instead". But of course, that doesn't >> actually have to make sense as code! And indeed it doesn't, >> as you say. >> >>> If the alternatives were register sets rather than classes, >>> I think the intended effect would be: >>> >>> this_alternative[i] &= preferred_class[i]; >>> >>> (i.e. AND_HARD_REG_SET in old money). >>> >>> It looks like the patch would instead make this_alternative[i] include >>> registers that the alternative doesn't actually accept, which feels odd. >> >> Perhaps I put too much trust in the sanity of old comments. >> >> How about I actually commit this one instead? Better get it >> right before reload is removed. > > :-) LGTM, but I'd like to hear Jeff's opinion. So it would be a good idea if I used the right email address. > > Thanks, > Richard > >> 8< --- >8 >> "reload: Adjust comment in find_reloads about subset, not intersection" >> gcc: >> >> * reload.cc (find_reloads): Align comment with code where >> considering the intersection of register classes then tweaking the >> regclass for the current alternative or rejecting it. >> --- >> gcc/reload.cc | 15 +-- >> 1 file changed, 9 insertions(+), 6 deletions(-) >> >> diff --git a/gcc/reload.cc b/gcc/reload.cc >> index 664082a533d9..3ed901e39447 100644 >> --- a/gcc/reload.cc >> +++ b/gcc/reload.cc >> @@ -3635,9 +3635,11 @@ find_reloads (rtx_insn *insn, int replace, int >> ind_levels, int live_known, >> a hard reg and this alternative accepts some >> register, see if the class that we want is a subset >> of the preferred class for this register. If not, >> - but it intersects that class, use the preferred class >> - instead. If it does not intersect the preferred >> - class, show that usage of this alternative should be >> + but it intersects that class, we'd like to use the >> + intersection, but the best we can do is to use the >> + preferred class, if it is instead a subset of the >> + class we want in this alternative. If we can't use >> + it, show that usage of this alternative should be >> discouraged; it will be discouraged more still if the >> register is `preferred or nothing'. We do this >> because it increases the chance of reusing our spill >> @@ -3664,9 +3666,10 @@ find_reloads (rtx_insn *insn, int replace, int >> ind_levels, int live_known, >>if (! reg_class_subset_p (this_alternative[i], >> preferred_class[i])) >> { >> - /* Since we don't have a way of forming the intersection, >> - we just do something special if the preferred class >> - is a subset of the class we have; that's the most >> + /* Since we don't have a way of forming a register >> + class for the intersection, we just do >> + something special if the preferred class is a >> + subset of the class we have; that's the most >> common case anyway. */ >>if (reg_class_subset_p (preferred_class[i], >>this_alternative[i]))
[committed] analyzer: consolidate duplicate code in region::calc_offset
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. I'm taking a liberty by committing this cleanup in stage 4, but it's confined to the analyzer and seems low-risk. Pushed to trunk as r12-6999-gea3e1915954371. gcc/analyzer/ChangeLog: * region.cc (region::calc_offset): Consolidate effectively identical cases. Signed-off-by: David Malcolm --- gcc/analyzer/region.cc | 48 +- 1 file changed, 5 insertions(+), 43 deletions(-) diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc index 9d8fdb22271..77554b86143 100644 --- a/gcc/analyzer/region.cc +++ b/gcc/analyzer/region.cc @@ -499,41 +499,16 @@ region::calc_offset () const switch (iter_region->get_kind ()) { case RK_FIELD: - { - const field_region *field_reg - = (const field_region *)iter_region; - iter_region = iter_region->get_parent_region (); - - bit_offset_t rel_bit_offset; - if (!field_reg->get_relative_concrete_offset (&rel_bit_offset)) - return region_offset::make_symbolic (iter_region); - accum_bit_offset += rel_bit_offset; - } - continue; - case RK_ELEMENT: - { - const element_region *element_reg - = (const element_region *)iter_region; - iter_region = iter_region->get_parent_region (); - - bit_offset_t rel_bit_offset; - if (!element_reg->get_relative_concrete_offset (&rel_bit_offset)) - return region_offset::make_symbolic (iter_region); - accum_bit_offset += rel_bit_offset; - } - continue; - case RK_OFFSET: + case RK_BIT_RANGE: { - const offset_region *offset_reg - = (const offset_region *)iter_region; - iter_region = iter_region->get_parent_region (); - bit_offset_t rel_bit_offset; - if (!offset_reg->get_relative_concrete_offset (&rel_bit_offset)) - return region_offset::make_symbolic (iter_region); + if (!iter_region->get_relative_concrete_offset (&rel_bit_offset)) + return region_offset::make_symbolic + (iter_region->get_parent_region ()); accum_bit_offset += rel_bit_offset; + iter_region = iter_region->get_parent_region (); } continue; @@ -549,19 +524,6 @@ region::calc_offset () const } continue; - case RK_BIT_RANGE: - { - const bit_range_region *bit_range_reg - = (const bit_range_region *)iter_region; - iter_region = iter_region->get_parent_region (); - - bit_offset_t rel_bit_offset; - if (!bit_range_reg->get_relative_concrete_offset (&rel_bit_offset)) - return region_offset::make_symbolic (iter_region); - accum_bit_offset += rel_bit_offset; - } - continue; - default: return region_offset::make_concrete (iter_region, accum_bit_offset); } -- 2.26.3
[committed] analyzer: fix missing check for uninit of return values
When moving the -fanalyzer tests for -ftrivial-auto-var-init to the "torture" subdirectory of gcc.dg/analyzer I noticed that -fanalyzer wasn't always properly checking for initialization of return values. The issue was that some "return" handling was using region_model::copy_region to copy to the RESULT_DECL, and copy_region wasn't checking for poisoned svalues. This patch eliminates region_model::copy_region in favor of simply doing a get_ravlue/set_value pair, fixing the issue. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-7000-g13ad6d9f50e3f1. gcc/analyzer/ChangeLog: * region-model.cc (region_model::on_return): Replace usage of copy_region with get_rvalue/set_value pair. (region_model::pop_frame): Likewise. (selftest::test_compound_assignment): Likewise. * region-model.h (region_model::copy_region): Delete decl. * region.cc (region_model::copy_region): Delete. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/torture/ubsan-1.c: Add missing return stmts. * gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: Move to... * gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-pattern.c: ...here. * gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c: Move to... * gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-uninitialized.c: ...here. * gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: Move to... * gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-zero.c: ...here. Signed-off-by: David Malcolm --- gcc/analyzer/region-model.cc | 21 ++- gcc/analyzer/region-model.h | 2 -- gcc/analyzer/region.cc| 15 - .../gcc.dg/analyzer/torture/ubsan-1.c | 2 ++ .../uninit-trivial-auto-var-init-pattern.c| 10 + ...init-trivial-auto-var-init-uninitialized.c | 10 + .../uninit-trivial-auto-var-init-zero.c | 10 + .../uninit-trivial-auto-var-init-pattern.c| 7 --- ...init-trivial-auto-var-init-uninitialized.c | 7 --- .../uninit-trivial-auto-var-init-zero.c | 7 --- 10 files changed, 43 insertions(+), 48 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-pattern.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-uninitialized.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-zero.c delete mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c delete mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c delete mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 58c7028fc9c..6e7a21d0f9c 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1559,7 +1559,11 @@ region_model::on_return (const greturn *return_stmt, region_model_context *ctxt) tree rhs = gimple_return_retval (return_stmt); if (lhs && rhs) -copy_region (get_lvalue (lhs, ctxt), get_lvalue (rhs, ctxt), ctxt); +{ + const svalue *sval = get_rvalue (rhs, ctxt); + const region *ret_reg = get_lvalue (lhs, ctxt); + set_value (ret_reg, sval, ctxt); +} } /* Update this model for a call and return of setjmp/sigsetjmp at CALL within @@ -3618,15 +3622,11 @@ region_model::pop_frame (const region *result_dst_reg, tree result = DECL_RESULT (fndecl); if (result && TREE_TYPE (result) != void_type_node) { + const svalue *retval = get_rvalue (result, ctxt); if (result_dst_reg) - { - /* Copy the result to RESULT_DST_REG. */ - copy_region (result_dst_reg, - get_lvalue (result, ctxt), - ctxt); - } + set_value (result_dst_reg, retval, ctxt); if (out_result) - *out_result = get_rvalue (result, ctxt); + *out_result = retval; } /* Pop the frame. */ @@ -4758,8 +4758,9 @@ test_compound_assignment () model.set_value (c_y, int_m3, NULL); /* Copy c to d. */ - model.copy_region (model.get_lvalue (d, NULL), model.get_lvalue (c, NULL), -NULL); + const svalue *sval = model.get_rvalue (c, NULL); + model.set_value (model.get_lvalue (d, NULL), sval, NULL); + /* Check that the fields have the same svalues. */ ASSERT_EQ (model.get_rvalue (c_x, NULL), model.get_rvalue (d_x, NULL)); ASSERT_EQ (model.get_rvalue (c_y, NULL), model.get_rvalue (d_y, NULL)); diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index 3fa090d771e..46cf37e6b26 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -676,8 +676,6 @@ class region_model void zero_fill_region (const region *reg); void mark_region_as_u
Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
On 09.08.21 15:55, Tobias Burnus wrote: Now that the GCN/OpenACC patches for this have been committed today, I think it makes sense to add it to the documentation. (I was told that some follow-up items are still pending, but as the feature does work ...) I think the follow-up patches have now been committed. How about the attached patch? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 gcc-12/changes.html (GCN): >1 workers per gang diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index 479bd6c5..738f4c73 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -360,6 +360,10 @@ a work-in-progress. Debug experience with ROCGDB has been improved. Support for the type __int128_t/integer(kind=16) was added. + For offloading, the limitation of using only one wavefront per compute + unit (CU) has been lifted; up to 40 workgroup per CU and 16 wavefronts + per workgroup are supported. Additionally, the number of used wavefronts + and workgroups was tuned for performance.
Re: [PATCH] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons
Hi Richard, > Oh, I see - that's very special then and the pattern in the comment > does not include this conversion. I think you can simplify the checking > done by requiring types_match (TREE_TYPE (@1), TREE_TYPE (@3)) > and by noting that the types of @0, @2 and @4 are the same > (you don't seem to use @2). > > I wonder how relevant these kind of patterns are? Probably clang > catches this simplification while we don't? Yes. The bug report was based on a comparison with clang. > Btw, you fail to check for INTEGRAL_TYPE_P, the whole thing > would also match floats as-is, I think the easiest thing would be to > change the match to > > + (op (plus:c (convert@0 @1) INTEGER_CST@4) (plus:c (convert@2 @3) > INTEGER_CST@4)) > > where you then also can elide the CONSTANT_CLASS_P (@4) check. Thanks. I did miss checking for INTEGRAL_TYPE_P, and I didn't think to use INTEGER_CST. > Btw, for > > unsigned x, y; > if (x + 0x8000 < y + 0x8000) > > it would be valid to transform this into > > if ((int)x < (int)y) > > which is a simplification that's worthwhile as well I think? So we > might not actually > need the (convert ...) but rely on (convert (convert @0)) being > simplfiied? You'd > then use > > (with { stype = signed_type_for (TREE_TYPE (@1)); } > (op (convert:stype @1) (convert:stype @3))) > > as transform. Thank you for all the hints! I'm going to work a v2 based on these. Cheers, Arjun
Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
On 02/02/2022 15:39, Tobias Burnus wrote: On 09.08.21 15:55, Tobias Burnus wrote: Now that the GCN/OpenACC patches for this have been committed today, I think it makes sense to add it to the documentation. (I was told that some follow-up items are still pending, but as the feature does work ...) I think the follow-up patches have now been committed. How about the attached patch? We should probably add a qualification "(up to a limit of 40 wavefronts in total, per CU)" or else were suggesting that there can be 40 workgroups of 16 wavefronts, which the hardware will not do. Andrew
s390: Fix bootstrap -Wformat-diag errors
Hi, this fixes the s390 bootstrap errors caused by -Werror=format-diag. It simply splits the problematic format strings. Bootstrapped and regtested with -march=z15. Is it OK? Regards Robin -- gcc/ChangeLog: * config/s390/s390.cc (s390_valid_target_attribute_inner_p): Split format strings.commit 41270791b1d1235d580b6d81c315c74ad07c1807 Author: Robin Dapp Date: Tue Feb 1 10:13:52 2022 +0100 s390: Fix bootstrap by splitting format string. Recently -Werror=format-diag was turned on for bootstrap which broke s390 bootstrap. This patch splits the problematic format strings and changes quoting from explicit \" to %qs. Bootstrapped and regtested on s390x. diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index 58064bfb525..d2faa1371eb 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -15879,6 +15879,9 @@ s390_valid_target_attribute_inner_p (tree args, /* Handle multiple arguments separated by commas. */ next_optstr = ASTRDUP (TREE_STRING_POINTER (args)); + const char err_prefix[] = "attribute(target("; + const char err_suffix[] = "))"; + while (next_optstr && *next_optstr != '\0') { char *p = next_optstr; @@ -15938,7 +15941,7 @@ s390_valid_target_attribute_inner_p (tree args, /* Process the option. */ if (!found) { - error ("attribute(target(\"%s\")) is unknown", orig_p); + error ("%s%qs%s is unknown", err_prefix, orig_p, err_suffix); return false; } else if (attrs[i].only_as_pragma && !force_pragma) @@ -15988,7 +15991,7 @@ s390_valid_target_attribute_inner_p (tree args, } else { - error ("attribute(target(\"%s\")) is unknown", orig_p); + error ("%s%qs%s is unknown", err_prefix, orig_p, err_suffix); ret = false; } } @@ -16005,7 +16008,7 @@ s390_valid_target_attribute_inner_p (tree args, global_dc); else { - error ("attribute(target(\"%s\")) is unknown", orig_p); + error ("%s%qs%s is unknown", err_prefix, orig_p, err_suffix); ret = false; } }
Re: [PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
Hello, On Wed, 2 Feb 2022, Richard Biener via Gcc-patches wrote: > This adds a flag to CONSTRUCTOR nodes indicating that for clobbers this > marks the end-of-life of storage as opposed to just ending the lifetime > of the object that occupied it. The dangling pointer diagnostics uses > CLOBBERs but is confused by those emitted by the C++ frontend for > example which emits them for the second purpose at the start of CTORs. > The issue is also appearant for aarch64 in PR104092. > > Distinguishing the two cases is also necessary for the PR90348 fix. (Just FWIW: I agree with the plan to have different types of CLOBBERs, in particular those for marking births) A style nit: > tree clobber = build_clobber (TREE_TYPE (t)); > + CLOBBER_MARKS_EOL (clobber) = 1; I think it really were better if build_clobber() gained an additional argument (non-optional!) that sets the type of it. Ciao, Michael.
Re: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of booleans
On Tue, Feb 1, 2022 at 4:42 AM Richard Sandiford wrote: > Christophe Lyon via Gcc-patches writes: > > On Mon, Jan 31, 2022 at 7:01 PM Richard Sandiford via Gcc-patches < > > gcc-patches@gcc.gnu.org> wrote: > > > >> Sorry for the slow response, was out last week. > >> > >> Christophe Lyon via Gcc-patches writes: > >> > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > >> > index f16d320..5f559f8fd93 100644 > >> > --- a/gcc/emit-rtl.c > >> > +++ b/gcc/emit-rtl.c > >> > @@ -6239,9 +6239,14 @@ init_emit_once (void) > >> > > >> >/* For BImode, 1 and -1 are unsigned and signed interpretations > >> > of the same value. */ > >> > - const_tiny_rtx[0][(int) BImode] = const0_rtx; > >> > - const_tiny_rtx[1][(int) BImode] = const_true_rtx; > >> > - const_tiny_rtx[3][(int) BImode] = const_true_rtx; > >> > + for (mode = MIN_MODE_BOOL; > >> > + mode <= MAX_MODE_BOOL; > >> > + mode = (machine_mode)((int)(mode) + 1)) > >> > +{ > >> > + const_tiny_rtx[0][(int) mode] = const0_rtx; > >> > + const_tiny_rtx[1][(int) mode] = const_true_rtx; > >> > + const_tiny_rtx[3][(int) mode] = const_true_rtx; > >> > +} > >> > > >> >for (mode = MIN_MODE_PARTIAL_INT; > >> > mode <= MAX_MODE_PARTIAL_INT; > >> > >> Does this do the right thing for: > >> > >> gen_int_mode (-1, B2Imode) > >> > >> (which is used e.g. in native_decode_vector_rtx)? It looks like it > >> would give 0b01 rather than 0b11. > >> > >> Maybe for non-BImode we should use const1_rtx and constm1_rtx, like with > >> MODE_INT. > >> > > > > debug_rtx ( gen_int_mode (-1, B2Imode) says: > > (const_int -1 [0x]) > > so that looks right? > > Ah, right, I forgot that the mode is unused for the small constant lookup. > But it looks like CONSTM1_RTX (B2Imode) would be (const_int 1) instead, > even though the two should be equal. > Indeed! So I changed the above loop into: /* For BImode, 1 and -1 are unsigned and signed interpretations of the same value. */ for (mode = MIN_MODE_BOOL; mode <= MAX_MODE_BOOL; mode = (machine_mode)((int)(mode) + 1)) { const_tiny_rtx[0][(int) mode] = const0_rtx; const_tiny_rtx[1][(int) mode] = const_true_rtx; - const_tiny_rtx[3][(int) mode] = const_true_rtx; + const_tiny_rtx[3][(int) mode] = constm1_rtx; } which works, both constants are now equal and the validation still passes. > >> > @@ -1679,15 +1708,25 @@ emit_class_narrowest_mode (void) > >> >print_decl ("unsigned char", "class_narrowest_mode", > >> "MAX_MODE_CLASS"); > >> > > >> >for (c = 0; c < MAX_MODE_CLASS; c++) > >> > -/* Bleah, all this to get the comment right for MIN_MODE_INT. */ > >> > -tagged_printf ("MIN_%s", mode_class_names[c], > >> > -modes[c] > >> > -? ((c != MODE_INT || modes[c]->precision != 1) > >> > - ? modes[c]->name > >> > - : (modes[c]->next > >> > - ? modes[c]->next->name > >> > - : void_mode->name)) > >> > -: void_mode->name); > >> > +{ > >> > + /* Bleah, all this to get the comment right for MIN_MODE_INT. > */ > >> > + const char *comment_name = void_mode->name; > >> > + > >> > + if (modes[c]) > >> > + if (c != MODE_INT || !modes[c]->boolean) > >> > + comment_name = modes[c]->name; > >> > + else > >> > + { > >> > + struct mode_data *m = modes[c]; > >> > + while (m->boolean) > >> > + m = m->next; > >> > + if (m) > >> > + comment_name = m->name; > >> > + else > >> > + comment_name = void_mode->name; > >> > + } > >> > >> Have you tried bootstrapping the patch on a host of your choice? > >> I would expect a warning/Werror about an ambiguous else here. > >> > > No I hadn't and indeed the build fails > > > >> > >> I guess this reduces to: > >> > >> struct mode_data *m = modes[c]; > >> while (m && m->boolean) > >> m = m->next; > >> const char *comment_name = (m ? m : void_mode)->name; > >> > >> but I don't know if that's more readable. > >> > > but to my understanding the problem is that the ambiguous else > > is the first one, and the code should read: > > if (modes[c]) > > + { > > if (c != MODE_INT || !modes[c]->boolean) > > comment_name = modes[c]->name; > > else > > { > > struct mode_data *m = modes[c]; > > while (m->boolean) > > m = m->next; > > if (m) > > comment_name = m->name; > > else > > comment_name = void_mode->name; > > } > > +} > > Yeah. I just meant that the alternative loop was probably simpler, > as a replacement for the outer “if”. > > It looks like that the outer “if” is effectively a peeled iteration of > the while loop in the outer “else”. And the “c != MODE_INT” part ought > to be redundant: as it stands, the bo
Re: [PATCH] powerc: Fix asm machine directive for some CPUs
On 19/01/2022 07:54, Sebastian Huber wrote: Okay with those changes. Thanks! Thanks for having a look at this. I would like to back port this patch also to the GCC 10 and 11 branches. The default is to ask for back ports after a break. Can I back port the patch (with the default: break) to GCC 10 and 11 now? -- embedded brains GmbH Herr Sebastian HUBER Dornierstr. 4 82178 Puchheim Germany email: sebastian.hu...@embedded-brains.de phone: +49-89-18 94 741 - 16 fax: +49-89-18 94 741 - 08 Registergericht: Amtsgericht München Registernummer: HRB 157899 Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler Unsere Datenschutzerklärung finden Sie hier: https://embedded-brains.de/datenschutzerklaerung/
[PATCH] c++: constrained auto in lambda using outer tparms [PR103706]
Here we're crashing during satisfaction of the lambda's placeholder type constraints because the constraints depend on the template arguments from the enclosing scope, which aren't a part of the lambda's DECL_TI_ARGS. So when inside a lambda, do_auto_deduction needs to add the "regenerating" template arguments to the set of outer template arguments, like we already do in satisfy_declaration_constraints. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 11? PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (satisfy_declaration_constraints): Use lambda_regenerating_args instead. * cp-tree.h (lambda_regenerating_args): Declare. * pt.cc (add_to_template_args): Handle NULL_TREE extra_args gracefully. (lambda_regenerating_args): Define, split out from satisfy_declaration_constraints. (do_auto_deduction): Use lambda_regenerating_args to obtain the full set of outer template arguments when checking the constraint on a placeholder type within a lambda. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-lambda18.C: New test. --- gcc/cp/constraint.cc | 9 + gcc/cp/cp-tree.h | 1 + gcc/cp/pt.cc | 37 --- .../g++.dg/cpp2a/concepts-lambda18.C | 14 +++ 4 files changed, 48 insertions(+), 13 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda18.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index cc4df4216ef..c41ee02b910 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3148,18 +3148,13 @@ satisfy_declaration_constraints (tree t, sat_info info) args = add_outermost_template_args (args, inh_ctor_targs); } - if (regenerated_lambda_fn_p (t)) + if (LAMBDA_FUNCTION_P (t)) { /* The TI_ARGS of a regenerated lambda contains only the innermost set of template arguments. Augment this with the outer template arguments that were used to regenerate the lambda. */ gcc_assert (!args || TMPL_ARGS_DEPTH (args) == 1); - tree lambda = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (t)); - tree outer_args = TI_ARGS (LAMBDA_EXPR_REGEN_INFO (lambda)); - if (args) - args = add_to_template_args (outer_args, args); - else - args = outer_args; + args = add_to_template_args (lambda_regenerating_args (t), args); } /* If any arguments depend on template parameters, we can't diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index b9eb71fbc3a..7a7fe5ba599 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7715,6 +7715,7 @@ extern void finish_lambda_scope (void); extern tree start_lambda_function (tree fn, tree lambda_expr); extern void finish_lambda_function (tree body); extern bool regenerated_lambda_fn_p(tree); +extern tree lambda_regenerating_args (tree); extern tree most_general_lambda(tree); extern tree finish_omp_target (location_t, tree, tree, bool); extern void finish_omp_target_clauses (location_t, tree, tree *); diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 6e129da1d05..c919d2de68f 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -573,6 +573,9 @@ add_to_template_args (tree args, tree extra_args) if (args == NULL_TREE || extra_args == error_mark_node) return extra_args; + if (extra_args == NULL_TREE) +return args; + extra_depth = TMPL_ARGS_DEPTH (extra_args); new_args = make_tree_vec (TMPL_ARGS_DEPTH (args) + extra_depth); @@ -14442,6 +14445,21 @@ most_general_lambda (tree t) return t; } +/* Return the set of template arguments used to regenerate the lambda T + from its most general lambda. */ + +tree +lambda_regenerating_args (tree t) +{ + if (LAMBDA_FUNCTION_P (t)) +t = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (t)); + gcc_assert (TREE_CODE (t) == LAMBDA_EXPR); + if (tree ti = LAMBDA_EXPR_REGEN_INFO (t)) +return TI_ARGS (ti); + else +return NULL_TREE; +} + /* We're instantiating a variable from template function TCTX. Return the corresponding current enclosing scope. We can match them up using DECL_SOURCE_LOCATION because lambdas only ever have one source location, and @@ -30100,12 +30118,19 @@ do_auto_deduction (tree type, tree init, tree auto_node, return type; } - if ((context == adc_return_type - || context == adc_variable_type - || context == adc_decomp_type) - && current_function_decl - && DECL_TEMPLATE_INFO (current_function_decl)) - outer_targs = DECL_TI_ARGS (current_function_decl); + if (context == adc_return_type + || context == adc_variable_type + || context == adc_decomp_type) + if (tree fn = current_function_decl) + if (DECL_TEMPLATE_INFO (fn) || LAMBD
[PATCH] c++: lambda in pack expansion using outer pack in constraint [PR103706]
The satisfaction cache needs to look through ARGUMENT_PACK_SELECT template arguments before calling iterative_hash_template_arg and template_args_equal, which would otherwise crash. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 11? PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (sat_hasher::hash): Look through ARGUMENT_PACK_SELECT. (sat_hasher::equal) Likewise. * cp-tree.h (argument_pack_select_arg): Declare. * pt.cc (argument_pack_select_arg): No longer static. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-lambda19.C: New test. --- gcc/cp/constraint.cc | 6 ++ gcc/cp/cp-tree.h | 1 + gcc/cp/pt.cc | 2 +- gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C | 11 +++ 4 files changed, 19 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index c41ee02b910..ab45988e8bd 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -2484,6 +2484,8 @@ struct sat_hasher : ggc_ptr_hash tree parm = TREE_VALUE (target_parms); template_parm_level_and_index (parm, &level, &index); tree arg = TMPL_ARG (e->args, level, index); + if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT) + arg = argument_pack_select_arg (arg); value = iterative_hash_template_arg (arg, value); } return value; @@ -2515,6 +2517,10 @@ struct sat_hasher : ggc_ptr_hash template_parm_level_and_index (parm, &level, &index); tree arg1 = TMPL_ARG (e1->args, level, index); tree arg2 = TMPL_ARG (e2->args, level, index); + if (TREE_CODE (arg1) == ARGUMENT_PACK_SELECT) + arg1 = argument_pack_select_arg (arg1); + if (TREE_CODE (arg2) == ARGUMENT_PACK_SELECT) + arg2 = argument_pack_select_arg (arg2); if (!template_args_equal (arg1, arg2)) return false; } diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 7a7fe5ba599..0bb4d8eb5df 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7375,6 +7375,7 @@ extern bool primary_template_specialization_p (const_tree); extern tree get_primary_template_innermost_parameters (const_tree); extern tree get_template_innermost_arguments (const_tree); extern tree get_template_argument_pack_elems (const_tree); +extern tree argument_pack_select_arg (tree); extern tree get_function_template_decl (const_tree); extern tree resolve_nondeduced_context (tree, tsubst_flags_t); extern tree resolve_nondeduced_context_or_error(tree, tsubst_flags_t); diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index c919d2de68f..1ddd36b4413 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -3718,7 +3718,7 @@ get_template_argument_pack_elems (const_tree t) /* In an ARGUMENT_PACK_SELECT, the actual underlying argument that the ARGUMENT_PACK_SELECT represents. */ -static tree +tree argument_pack_select_arg (tree t) { tree args = ARGUMENT_PACK_ARGS (ARGUMENT_PACK_SELECT_FROM_PACK (t)); diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C new file mode 100644 index 000..fcf086248f5 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C @@ -0,0 +1,11 @@ +// PR c++/103706 +// { dg-do compile { target c++20 } } + +template concept C = __is_same(T, int); + +template void f() { + ([] () requires C { return T(); }(), ...); // { dg-error "no match" } +} + +template void f(); // { dg-bogus "" } +template void f(); // { dg-message "required from here" } -- 2.35.0
Re: [committed] analyzer: stop -ftrivial-auto-var-init from suppressing uninit warnings [PR104270]
Hi, David, Thank you for fixing this issue! > On Feb 2, 2022, at 9:06 AM, David Malcolm via Gcc-patches > wrote: > > GCC 12 has gained two features for dealing with uninitialized variables: > > (a) a new -Wanalyzer-use-of-uninitialized-value warning within -fanalyzer > for interprocedural path-sensitive detection of ununit uses, and > > (b) a new -ftrivial-auto-var-init option for mitigating some uses of > uninit variables > > It turns out that using (b) was thwarting (a), as it led to -fanalyzer > seeing calls to IFN_DEFERRED_INIT, which -fanalyzer wasn't > special-casing, thus treating it as initializing the variables in > question, and thus silencing -Wanalyzer-use-of-uninitialized-value on > them. > > invoke.texi says: > > "GCC still considers an automatic variable that doesn't have an explicit > initializer as uninitialized, @option{-Wuninitialized} will still report > warning messages on such automatic variables." > > and thus -Wanalyzer-use-of-uninitialized-value ought to as well. Then should we updated the invoke.texi to include this as well? thanks. Qing > > This patch adds special-case handling to -fanalyzer for > IFN_DEFERRED_INIT, so that -fanalyzer will warn on uninit uses of > variables that are mitigated by -ftrivial-auto-var-init. > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > Not strictly a regression, but as this affects two new GCC 12 features > it seems appropriate to fix in stage 4. > > Pushed to trunk as r12-6997-g9b4eee5fd158c4ee75d1f1000debbf5082fb9b56. > > gcc/analyzer/ChangeLog: > PR analyzer/104270 > * region-model.cc (region_model::on_call_pre): Handle > IFN_DEFERRED_INIT. > > gcc/testsuite/ChangeLog: > PR analyzer/104270 > * gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: New > test. > * gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c: > New test. > * gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: New test. > > Signed-off-by: David Malcolm > --- > gcc/analyzer/region-model.cc | 10 ++ > .../analyzer/uninit-trivial-auto-var-init-pattern.c| 7 +++ > .../uninit-trivial-auto-var-init-uninitialized.c | 7 +++ > .../analyzer/uninit-trivial-auto-var-init-zero.c | 7 +++ > 4 files changed, 31 insertions(+) > create mode 100644 > gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c > create mode 100644 > gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c > create mode 100644 > gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c > > diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc > index 6810cf508d9..4c312b053f8 100644 > --- a/gcc/analyzer/region-model.cc > +++ b/gcc/analyzer/region-model.cc > @@ -1109,6 +1109,16 @@ region_model::on_call_pre (const gcall *call, > region_model_context *ctxt, > > bool unknown_side_effects = false; > > + /* Special-case for IFN_DEFERRED_INIT. > + We want to report uninitialized variables with -fanalyzer (treating > + -ftrivial-auto-var-init= as purely a mitigation feature). > + Handle IFN_DEFERRED_INIT by treating it as no-op: don't touch the > + lhs of the call, so that it is still uninitialized from the point of > + view of the analyzer. */ > + if (gimple_call_internal_p (call) > + && gimple_call_internal_fn (call) == IFN_DEFERRED_INIT) > +return false; > + > /* Some of the cases below update the lhs of the call based on the > return value, but not all. Provide a default value, which may > get overwritten below. */ > diff --git > a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c > b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c > new file mode 100644 > index 000..0b78dc65267 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c > @@ -0,0 +1,7 @@ > +/* { dg-additional-options "-ftrivial-auto-var-init=pattern" } */ > + > +int test_1 (void) > +{ > + int i; /* { dg-message "region created on stack here" } */ > + return i; /* { dg-warning "use of uninitialized value 'i'" } */ > +} > diff --git > a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c > b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c > new file mode 100644 > index 000..124d3a327b8 > --- /dev/null > +++ > b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c > @@ -0,0 +1,7 @@ > +/* { dg-additional-options "-ftrivial-auto-var-init=uninitialized" } */ > + > +int test_1 (void) > +{ > + int i; /* { dg-message "region created on stack here" } */ > + return i; /* { dg-warning "use of uninitialized value 'i'" } */ > +} > diff --git > a/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c > b/gcc/testsuite/gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c > new file mode 100644 > index 00
[committed] libstdc++: Fix link failure in _OutputIteratorConcept
Tested x86_64-linux and by building x86_64-w64-mingw, pushed to trunk. The C++98-style concept check for output iterators causes a link failure on mingw-w64, because the __val() member function isn't defined. Change it to use a function pointer instead. That pointer is never set to anything meaningful, but it doesn't matter as the __constraints() function only has to be instantiated, it's never called. We could refactor all of these to use unevaluated contexts (e.g. sizeof of __decltype) so that we only check the expressions are well-formed, without any codegen at all. Any improvements to these are very low priority though. libstdc++-v3/ChangeLog: * include/bits/boost_concept_check.h (_OutputIteratorConcept): Change member function to data member of function pointer type. --- libstdc++-v3/include/bits/boost_concept_check.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libstdc++-v3/include/bits/boost_concept_check.h b/libstdc++-v3/include/bits/boost_concept_check.h index 23dba067693..3d884eb440f 100644 --- a/libstdc++-v3/include/bits/boost_concept_check.h +++ b/libstdc++-v3/include/bits/boost_concept_check.h @@ -483,7 +483,9 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; }; *__i++ = __val(); // require postincrement and assignment } _Tp __i; -_ValueT __val() const; +// Use a function pointer here so no definition of the function needed. +// Just need something that returns a _ValueT (which might be a reference). +_ValueT (*__val)(); }; template -- 2.34.1
[committed] libstdc++: Fix invalid instantiations in tests
Tested x86_64-linux, pushed to trunk. These tests instantiate std::multiset and std::set with a type that has no operator< so they should use a custom comparison function. libstdc++-v3/ChangeLog: * testsuite/23_containers/multiset/operators/cmp_c++20.cc: Use custom comparison function for multiset. * testsuite/23_containers/set/operators/cmp_c++20.cc: Use custom comparison function for set. --- .../23_containers/multiset/operators/cmp_c++20.cc | 8 ++-- .../testsuite/23_containers/set/operators/cmp_c++20.cc| 8 ++-- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/libstdc++-v3/testsuite/23_containers/multiset/operators/cmp_c++20.cc b/libstdc++-v3/testsuite/23_containers/multiset/operators/cmp_c++20.cc index 3972f756af9..ad3ab787946 100644 --- a/libstdc++-v3/testsuite/23_containers/multiset/operators/cmp_c++20.cc +++ b/libstdc++-v3/testsuite/23_containers/multiset/operators/cmp_c++20.cc @@ -49,9 +49,13 @@ test01() { bool operator==(E) { return true; } }; - static_assert( ! std::totally_ordered> ); + struct Cmp + { +bool operator()(E, E) const { return false; } + }; + static_assert( ! std::totally_ordered> ); static_assert( ! std::three_way_comparable ); - static_assert( ! std::three_way_comparable> ); + static_assert( ! std::three_way_comparable> ); } void diff --git a/libstdc++-v3/testsuite/23_containers/set/operators/cmp_c++20.cc b/libstdc++-v3/testsuite/23_containers/set/operators/cmp_c++20.cc index fdcb5db699d..58698bf426f 100644 --- a/libstdc++-v3/testsuite/23_containers/set/operators/cmp_c++20.cc +++ b/libstdc++-v3/testsuite/23_containers/set/operators/cmp_c++20.cc @@ -49,9 +49,13 @@ test01() { bool operator==(E) { return true; } }; - static_assert( ! std::totally_ordered> ); + struct Cmp + { +bool operator()(E, E) const { return false; } + }; + static_assert( ! std::totally_ordered> ); static_assert( ! std::three_way_comparable ); - static_assert( ! std::three_way_comparable> ); + static_assert( ! std::three_way_comparable> ); } void -- 2.34.1
Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
Now committed, taking Andrew's comments into account: https://gcc.gnu.org/gcc-12/changes.html#amdgcn Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 commit 4b4a85b696b3bb6700eea8f687bc45f1fd0b1f8c Author: Tobias Burnus Date: Wed Feb 2 18:51:32 2022 +0100 gcc-12/changes.html (GCN): >1 workers per gang diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index 479bd6c5..b6341fda 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -360,6 +360,11 @@ a work-in-progress. Debug experience with ROCGDB has been improved. Support for the type __int128_t/integer(kind=16) was added. + For offloading, the limitation of using only one wavefront per compute + unit (CU) has been lifted. Up to 40 workgroups per CU and 16 wavefronts + per workgroup are supported (up to a limit of 40 wavefronts in total, + per CU). Additionally, the number of used wavefronts and workgroups was + tuned for performance.
Re: [PATCH] powerc: Fix asm machine directive for some CPUs
On Wed, Feb 02, 2022 at 06:07:38PM +0100, Sebastian Huber wrote: > On 19/01/2022 07:54, Sebastian Huber wrote: > >>Okay with those changes. Thanks! > > > >Thanks for having a look at this. I would like to back port this patch > >also to the GCC 10 and 11 branches. > > The default is to ask for back ports after a break. Can I back port the > patch (with the default: break) to GCC 10 and 11 now? I have many more changes, but I'll deal with that. Okay for 11 and 10. Thanks! Segher
Re: [PATCH] c++: constrained auto in lambda using outer tparms [PR103706]
On Wed, 2 Feb 2022, Patrick Palka wrote: > Here we're crashing during satisfaction of the lambda's placeholder type > constraints because the constraints depend on the template arguments > from the enclosing scope, which aren't a part of the lambda's > DECL_TI_ARGS. So when inside a lambda, do_auto_deduction needs to add > the "regenerating" template arguments to the set of outer template > arguments, like we already do in satisfy_declaration_constraints. > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > trunk and perhaps 11? > > PR c++/103706 > > gcc/cp/ChangeLog: > > * constraint.cc (satisfy_declaration_constraints): Use > lambda_regenerating_args instead. > * cp-tree.h (lambda_regenerating_args): Declare. > * pt.cc (add_to_template_args): Handle NULL_TREE extra_args > gracefully. > (lambda_regenerating_args): Define, split out from > satisfy_declaration_constraints. > (do_auto_deduction): Use lambda_regenerating_args to obtain the > full set of outer template arguments when checking the > constraint on a placeholder type within a lambda. > > gcc/testsuite/ChangeLog: > > * g++.dg/cpp2a/concepts-lambda18.C: New test. > --- > gcc/cp/constraint.cc | 9 + > gcc/cp/cp-tree.h | 1 + > gcc/cp/pt.cc | 37 --- > .../g++.dg/cpp2a/concepts-lambda18.C | 14 +++ > 4 files changed, 48 insertions(+), 13 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda18.C > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index cc4df4216ef..c41ee02b910 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3148,18 +3148,13 @@ satisfy_declaration_constraints (tree t, sat_info > info) > args = add_outermost_template_args (args, inh_ctor_targs); > } > > - if (regenerated_lambda_fn_p (t)) > + if (LAMBDA_FUNCTION_P (t)) > { >/* The TI_ARGS of a regenerated lambda contains only the innermost >set of template arguments. Augment this with the outer template >arguments that were used to regenerate the lambda. */ >gcc_assert (!args || TMPL_ARGS_DEPTH (args) == 1); > - tree lambda = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (t)); > - tree outer_args = TI_ARGS (LAMBDA_EXPR_REGEN_INFO (lambda)); > - if (args) > - args = add_to_template_args (outer_args, args); > - else > - args = outer_args; > + args = add_to_template_args (lambda_regenerating_args (t), args); > } > >/* If any arguments depend on template parameters, we can't > diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h > index b9eb71fbc3a..7a7fe5ba599 100644 > --- a/gcc/cp/cp-tree.h > +++ b/gcc/cp/cp-tree.h > @@ -7715,6 +7715,7 @@ extern void finish_lambda_scope (void); > extern tree start_lambda_function(tree fn, tree lambda_expr); > extern void finish_lambda_function (tree body); > extern bool regenerated_lambda_fn_p (tree); > +extern tree lambda_regenerating_args (tree); > extern tree most_general_lambda (tree); > extern tree finish_omp_target(location_t, tree, > tree, bool); > extern void finish_omp_target_clauses(location_t, tree, tree > *); > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc > index 6e129da1d05..c919d2de68f 100644 > --- a/gcc/cp/pt.cc > +++ b/gcc/cp/pt.cc > @@ -573,6 +573,9 @@ add_to_template_args (tree args, tree extra_args) >if (args == NULL_TREE || extra_args == error_mark_node) > return extra_args; > > + if (extra_args == NULL_TREE) > +return args; > + Whoops, this last-minute change to add_to_template_args turned out to be a bad idea as some callers apparently rely on add_to_template_args treating NULL_TREE extra_args as a depth 1 targ vector (whereas for the calls in satisfy_declaration_constraints and do_auto_deduction we want NULL_TREE extra_args to be treated as a depth 0 targ vector). Please consider the below patch instead, which doesn't attempt to change the behavior of add_to_template_args: -- >8 -- Subject: [PATCH] c++: constrained auto in lambda using outer tparms [PR103706] Here we're crashing during satisfaction of the lambda's placeholder type constraints because the constraints depend on the template arguments from the enclosing scope, which aren't part of the lambda's DECL_TI_ARGS. So when inside a lambda, do_auto_deduction needs to add its "regenerating" template arguments to the set of outer template arguments, like we already do in satisfy_declaration_constraints. PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (satisfy_declaration_constraints): Use lambda_regenerating_args instead. * cp-tree.h (lambda_regenerating_args): Declare. * pt.cc (lambda_regenerating_args): Define, split out from
Re: [PATCH v2 1/8] rs6000: More factoring of overload processing
Hi! On 2/1/22 3:48 PM, Segher Boessenkool wrote: > On Tue, Feb 01, 2022 at 08:49:34AM -0600, Bill Schmidt wrote: >> I've modified the previous patch to add more explanatory commentary about >> the number-of-arguments test that was previously confusing, and to convert >> the switch into an if-then-else chain. The rest of the patch is unchanged. >> Bootstrapped and tested on powerpc64le-linux-gnu. Is this okay for trunk? >> gcc/ >> * config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types >> parameters instead of arglist and nargs. Simplify accordingly. Remove >> unnecessary test for argument count mismatch. >> (resolve_vec_cmpne): Likewise. >> (resolve_vec_adde_sube): Likewise. >> (resolve_vec_addec_subec): Likewise. >> (altivec_resolve_overloaded_builtin): Move overload special handling >> after the gathering of arguments into args[] and types[] and the test >> for correct number of arguments. Don't perform the test for correct >> number of arguments for certain special cases. Call the other special >> cases with args and types instead of arglist and nargs. >> + if (fcode != RS6000_OVLD_VEC_PROMOTE >> + && fcode != RS6000_OVLD_VEC_SPLATS >> + && fcode != RS6000_OVLD_VEC_EXTRACT >> + && fcode != RS6000_OVLD_VEC_INSERT >> + && fcode != RS6000_OVLD_VEC_STEP >> + && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs)) >> return NULL; > Please don't do De Morgan manually, let the compiler deal with it? > Although even with that the logic is as clear as mud. This matters if > someone (maybe even you) will have to debug this later, or modify this. > Maybe adding some suitably named variables can clarify things here? I can de-deMorgan this. Do you want to see the patch again, or is it okay with that change? Thanks! Bill > >> + if (fcode == RS6000_OVLD_VEC_MUL) >> +returned_expr = resolve_vec_mul (&res, args, types, loc); >> + else if (fcode == RS6000_OVLD_VEC_CMPNE) >> +returned_expr = resolve_vec_cmpne (&res, args, types, loc); >> + else if (fcode == RS6000_OVLD_VEC_ADDE || fcode == RS6000_OVLD_VEC_SUBE) >> +returned_expr = resolve_vec_adde_sube (&res, fcode, args, types, loc); >> + else if (fcode == RS6000_OVLD_VEC_ADDEC || fcode == RS6000_OVLD_VEC_SUBEC) >> +returned_expr = resolve_vec_addec_subec (&res, fcode, args, types, loc); >> + else if (fcode == RS6000_OVLD_VEC_SPLATS || fcode == >> RS6000_OVLD_VEC_PROMOTE) >> +returned_expr = resolve_vec_splats (&res, fcode, arglist, nargs); >> + else if (fcode == RS6000_OVLD_VEC_EXTRACT) >> +returned_expr = resolve_vec_extract (&res, arglist, nargs, loc); >> + else if (fcode == RS6000_OVLD_VEC_INSERT) >> +returned_expr = resolve_vec_insert (&res, arglist, nargs, loc); >> + else if (fcode == RS6000_OVLD_VEC_STEP) >> +returned_expr = resolve_vec_step (&res, arglist, nargs); >> + >> + if (res == resolved) >> +return returned_expr; > This is so convoluted because the functions do two things, and have two > return values (res and returned_expr). > > > Segher
Re: s390: Fix bootstrap -Wformat-diag errors
On 2/2/22 09:35, Robin Dapp via Gcc-patches wrote: Hi, this fixes the s390 bootstrap errors caused by -Werror=format-diag. It simply splits the problematic format strings. Either this: error ("% is unknown", orig_p); or this would be better: error ("attribute % is unknown", orig_p); The %< %> directives will render it in single quotes like keywords and identifiers. Using %qs would render it in double quotes like a string, which wouldn't be quite right. Martin Bootstrapped and regtested with -march=z15. Is it OK? Regards Robin -- gcc/ChangeLog: * config/s390/s390.cc (s390_valid_target_attribute_inner_p): Split format strings.
Re: [PATCH] fortran: Unshare associate var charlen [PR104228]
Hi Mikael, Am 29.01.22 um 15:24 schrieb Mikael Morin: Hello, the attached patch is a fix for PR104228. Even if simple, I wouldn’t call it obvious, as it’s involving character length and associate, so I don’t mind some extra review eyes. I am probably not experienced enough to review this. Paul? Nevertheless, I looked at the commits that introduced the code you touched. It appears that these did not cover, maybe even avoid the current case where the associate target is not a dummy. Your changes seem to make sense and fix the issue. Tested on x86_64-pc-linux-gnu. Ok for master/11/10/9? OK, and thanks for the patch! Harald
Re: [PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
On 2/2/2022 7:42 AM, Richard Biener via Gcc-patches wrote: This adds a flag to CONSTRUCTOR nodes indicating that for clobbers this marks the end-of-life of storage as opposed to just ending the lifetime of the object that occupied it. The dangling pointer diagnostics uses CLOBBERs but is confused by those emitted by the C++ frontend for example which emits them for the second purpose at the start of CTORs. The issue is also appearant for aarch64 in PR104092. Distinguishing the two cases is also necessary for the PR90348 fix. Bootstrapped and tested on x86_64-unknown-linux-gnu. I verified the bogus diagnostic in PR104092 is gone with a cross. OK for trunk? Thanks, Richard. 2022-02-02 Richard Biener PR middle-end/90348 PR middle-end/104092 * tree-core.h: Document protected_flag use on CONSTRUCTOR nodes. * tree.h (CLOBBER_MARKS_EOL): Add. * tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs. * gimplify.c (gimplify_bind_expr): Mark storage end-of-life clobbers with CLOBBER_MARKS_EOL. (gimplify_target_expr): Likewise. * tree-inline.cc (expand_call_inline): Likewise. * tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise. * gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat CLOBBER_MARKS_EOL clobbers as ending lifetime of storage. * gcc.dg/pr87052.c: Adjust. OK. Note that I think something similar may be needed to mark EOL for the pointer passed to realloc to fix a related set of false positives for code like this bool something = p != q; whatever = realloc (p, newsize) if (something) We forward propagate the p != q test generating something like this; whatever - realloc (p, newsize); if (p != q) Which triggers a warning even though the original source was reasonable. I think a well placed clobber of p should expose the dataflow necessary to prevent the propagation and ultimately avoid the false positive. IIRC something like this came up in glibc and/or gnulib. I realize it's not exactly the same, but they're reasonably closely related. jeff
[PATCH] rs6000/testsuite: Return 0 for powerpc_altivec_ok on other targets
Tested on powerpc64-linux {-m32,-m64}. Committed. Segher 2022-02-02 Segher Boessenkool gcc/testsuite/ * lib/target-supports.exp (check_effective_target_powerpc_altivec_ok): Return 0 if the target is not Power. Restructure and add some comments. --- gcc/testsuite/lib/target-supports.exp | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index cffcdb5f049f..4463cc8d7ed6 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6181,21 +6181,21 @@ proc check_effective_target_powerpc_sqrt { } { # Return 1 if this is a PowerPC target supporting -maltivec. proc check_effective_target_powerpc_altivec_ok { } { -if { ([istarget powerpc*-*-*] - && ![istarget powerpc-*-linux*paired*]) -|| [istarget rs6000-*-*] } { - # AltiVec is not supported on AIX before 5.3. - if { [istarget powerpc*-*-aix4*] -|| [istarget powerpc*-*-aix5.1*] -|| [istarget powerpc*-*-aix5.2*] } { - return 0 - } - return [check_no_compiler_messages powerpc_altivec_ok object { - int dummy; - } "-maltivec"] -} else { - return 0 -} +# Not PowerPC, then not ok +if { !([istarget powerpc*-*-*] || [istarget rs6000-*-*]) } { return 0 } + +# Paired Single, then not ok +if { [istarget powerpc-*-linux*paired*] } { return 0 } + +# AltiVec is not supported on AIX before 5.3. +if { [istarget powerpc*-*-aix4*] +|| [istarget powerpc*-*-aix5.1*] +|| [istarget powerpc*-*-aix5.2*] } { return 0 } + +# Return true iff compiling with -maltivec does not error. +return [check_no_compiler_messages powerpc_altivec_ok object { + int dummy; +} "-maltivec"] } # Return 1 if this is a PowerPC target supporting -mpower8-vector -- 1.8.3.1
Re: [PATCH] Add CLOBBER_MARKS_EOL to mark storage end-of-life clobbers
On Wed, Feb 02, 2022 at 01:23:44PM -0700, Jeff Law via Gcc-patches wrote: > Note that I think something similar may be needed to mark EOL for the > pointer passed to realloc to fix a related set of false positives for code > like this > > bool something = p != q; > whatever = realloc (p, newsize) > if (something) > > We forward propagate the p != q test generating something like this; > > whatever - realloc (p, newsize); > if (p != q) IMNSHO we shouldn't warn for pointer passed to realloc being used in equality comparisons at all, it is just unnecessarily pedantic, it works just fine and a lot of programs use it to find out if the memory was actually reallocated or was just extended or shrunk in place; in the former case they often need some extra work, such as adjust something in the memory block. Yes, one can probably do uintptr_t pint = (uintptr_t) p; whatever = realloc (p, newsize); if ((uintptr_t) whatever != pint) but I think most people don't bother. Jakub
Re: ifcvt: Fix PR104153 and PR104198
On 2/1/2022 7:45 AM, Robin Dapp wrote: Hi, this is a bugfix for aa8cfe785953a0e87d2472311e1260cd98c605c0 which broke an or1k test case (PR104153) as well as SPARC bootstrap (PR104198). cond_exec_get_condition () returns the jump condition directly and we now it to the backend. The or1k backend modified the condition in-place but this modification is not reverted when the sequence in question is discarded. Therefore this patch copies the RTX instead of using it directly. The SPARC problem is due to the backend recreating the initial condition when being passed a CC comparison. This causes the sequence to read from an already overwritten condition operand. Generally, this could also happen on other targets. The workaround is to always first emit to a temporary. In a second run of noce_convert_multiple_1 we know which sequences actually require the comparison and use no temporaries if all sequences after the current one do not require it. Before, I used reg_overlap_mentioned_p () to check the generated instructions against the condition. The problem with this is that reg_overlap... only handles a set of rtx_codes while a backend can theoretically emit everything in an expander. Is reg_mentioned_p () the "right thing" to do? Maybe it is overly conservative but as soon as we have more than let's say three insns, we are unlikely to succeed anyway. Bootstrapped and reg-tested on s390x, Power 9, x86 and SPARC. Regards Robin -- PR 104198 PR 104153 gcc/ChangeLog: * ifcvt.cc (noce_convert_multiple_sets_1): Copy rtx instead of using it directly. Rework comparison handling and always perform a second pass. gcc/testsuite/ChangeLog: * gcc.dg/pr104198.c: New test. Also note this fixed the or1k issues and it's been tested on a variety of the embedded targets. ifcvt-pr.patch commit 68489d5729b4879bf2df540753fc7ea8ba1565a5 Author: Robin Dapp Date: Mon Jan 24 10:28:05 2022 +0100 ifcvt: Fix PR104153 and PR104198. This is a bugfix for aa8cfe785953a0e87d2472311e1260cd98c605c0 which broke an or1k test case (PR104153) as well as SPARC bootstrap (PR104198). cond_exec_get_condition () returns the jump condition directly and we now pass it to the backend. The or1k backend modified the condition in-place (other backends do that as well) but this modification is not reverted when the sequence in question is discarded. Therefore we copy the RTX instead of using it directly. The SPARC problem is due to the SPARC backend recreating the initial condition when being passed a CC comparison. This causes the sequence to read from an already overwritten condition operand. Generally, this could also happen on other targets. The workaround is to always first emit to a temporary. In a second run of noce_convert_multiple_1 we know which sequences actually require the comparison and will use no temporaries if all sequences after the current one do not require it. diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index fe250d508e1..92c2b40a45a 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -3391,7 +3391,11 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, rtx cond = noce_get_condition (jump, &cond_earliest, false); rtx cc_cmp = cond_exec_get_condition (jump); + if (cc_cmp) +cc_cmp = copy_rtx (cc_cmp); rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true); + if (rev_cc_cmp) +rev_cc_cmp = copy_rtx (rev_cc_cmp); rtx_insn *insn; int count = 0; @@ -3515,6 +3519,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, unsigned cost1 = 0, cost2 = 0; rtx_insn *seq, *seq1, *seq2; rtx temp_dest = NULL_RTX, temp_dest1 = NULL_RTX, temp_dest2 = NULL_RTX; + bool read_comparison = false; seq1 = try_emit_cmove_seq (if_info, temp, cond, new_val, old_val, need_cmov, @@ -3524,10 +3529,38 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, as well. This allows the backend to emit a cmov directly without creating an additional compare for each. If successful, costing is easier and this sequence is usually preferred. */ - seq2 = try_emit_cmove_seq (if_info, target, cond, + seq2 = try_emit_cmove_seq (if_info, temp, cond, Do you need to adjust anything now that this is emitting into TEMP rather than TARGET? jeff
Re: [PATCH] c++: lambda in pack expansion using outer pack in constraint [PR103706]
On 2/2/22 12:09, Patrick Palka wrote: The satisfaction cache needs to look through ARGUMENT_PACK_SELECT template arguments before calling iterative_hash_template_arg and template_args_equal, which would otherwise crash. Maybe we should handle ARGUMENT_PACK_SELECT in iterative_hash_template_arg and template_args_equal, rather than their callers? Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 11? PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (sat_hasher::hash): Look through ARGUMENT_PACK_SELECT. (sat_hasher::equal) Likewise. * cp-tree.h (argument_pack_select_arg): Declare. * pt.cc (argument_pack_select_arg): No longer static. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-lambda19.C: New test. --- gcc/cp/constraint.cc | 6 ++ gcc/cp/cp-tree.h | 1 + gcc/cp/pt.cc | 2 +- gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C | 11 +++ 4 files changed, 19 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index c41ee02b910..ab45988e8bd 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -2484,6 +2484,8 @@ struct sat_hasher : ggc_ptr_hash tree parm = TREE_VALUE (target_parms); template_parm_level_and_index (parm, &level, &index); tree arg = TMPL_ARG (e->args, level, index); + if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT) + arg = argument_pack_select_arg (arg); value = iterative_hash_template_arg (arg, value); } return value; @@ -2515,6 +2517,10 @@ struct sat_hasher : ggc_ptr_hash template_parm_level_and_index (parm, &level, &index); tree arg1 = TMPL_ARG (e1->args, level, index); tree arg2 = TMPL_ARG (e2->args, level, index); + if (TREE_CODE (arg1) == ARGUMENT_PACK_SELECT) + arg1 = argument_pack_select_arg (arg1); + if (TREE_CODE (arg2) == ARGUMENT_PACK_SELECT) + arg2 = argument_pack_select_arg (arg2); if (!template_args_equal (arg1, arg2)) return false; } diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 7a7fe5ba599..0bb4d8eb5df 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7375,6 +7375,7 @@ extern bool primary_template_specialization_p (const_tree); extern tree get_primary_template_innermost_parameters (const_tree); extern tree get_template_innermost_arguments (const_tree); extern tree get_template_argument_pack_elems (const_tree); +extern tree argument_pack_select_arg (tree); extern tree get_function_template_decl(const_tree); extern tree resolve_nondeduced_context(tree, tsubst_flags_t); extern tree resolve_nondeduced_context_or_error (tree, tsubst_flags_t); diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index c919d2de68f..1ddd36b4413 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -3718,7 +3718,7 @@ get_template_argument_pack_elems (const_tree t) /* In an ARGUMENT_PACK_SELECT, the actual underlying argument that the ARGUMENT_PACK_SELECT represents. */ -static tree +tree argument_pack_select_arg (tree t) { tree args = ARGUMENT_PACK_ARGS (ARGUMENT_PACK_SELECT_FROM_PACK (t)); diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C new file mode 100644 index 000..fcf086248f5 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C @@ -0,0 +1,11 @@ +// PR c++/103706 +// { dg-do compile { target c++20 } } + +template concept C = __is_same(T, int); + +template void f() { + ([] () requires C { return T(); }(), ...); // { dg-error "no match" } +} + +template void f(); // { dg-bogus "" } +template void f(); // { dg-message "required from here" }
Re: [committed] libstdc++: Reset filesystem::recursive_directory_iterator on error
On Tue, 1 Feb 2022 at 22:00, Jonathan Wakely via Libstdc++ wrote: > > Tested powerpc64le-linux, pushed to trunk. > > > The standard requires directory iterators to become equal to the end > iterator value if they report an error. Some members functions of > filesystem::recursive_directory_iterator fail to do that. > > libstdc++-v3/ChangeLog: > > * src/c++17/fs_dir.cc (recursive_directory_iterator::increment): > Reset state to past-the-end iterator on error. > (fs::recursive_directory_iterator::pop(error_code&)): Likewise. > (fs::recursive_directory_iterator::pop()): Check _M_dirs before > it might get reset. > * src/filesystem/dir.cc (recursive_directory_iterator): Likewise, > for the TS implementation. > * testsuite/27_io/filesystem/iterators/error_reporting.cc: New test. > * testsuite/experimental/filesystem/iterators/error_reporting.cc: New > test. > --- > libstdc++-v3/src/c++17/fs_dir.cc | 12 +- > libstdc++-v3/src/filesystem/dir.cc| 12 +- > .../filesystem/iterators/error_reporting.cc | 135 + > .../filesystem/iterators/error_reporting.cc | 136 ++ > 4 files changed, 291 insertions(+), 4 deletions(-) > create mode 100644 > libstdc++-v3/testsuite/27_io/filesystem/iterators/error_reporting.cc > create mode 100644 > libstdc++-v3/testsuite/experimental/filesystem/iterators/error_reporting.cc > > diff --git a/libstdc++-v3/src/c++17/fs_dir.cc > b/libstdc++-v3/src/c++17/fs_dir.cc > index e050304c19a..149a8b0740c 100644 > --- a/libstdc++-v3/src/c++17/fs_dir.cc > +++ b/libstdc++-v3/src/c++17/fs_dir.cc > @@ -311,6 +311,10 @@ fs::recursive_directory_iterator::increment(error_code& > ec) > return *this; > } > } > + > + if (ec) > +_M_dirs.reset(); > + >return *this; > } > > @@ -334,16 +338,20 @@ fs::recursive_directory_iterator::pop(error_code& ec) > ec.clear(); > return; >} > - } while (!_M_dirs->top().advance(skip_permission_denied, ec)); > + } while (!_M_dirs->top().advance(skip_permission_denied, ec) && !ec); > + > + if (ec) > +_M_dirs.reset(); > } > > void > fs::recursive_directory_iterator::pop() > { > + const bool dereferenceable = _M_dirs != nullptr; >error_code ec; >pop(ec); >if (ec) > -_GLIBCXX_THROW_OR_ABORT(filesystem_error(_M_dirs > +_GLIBCXX_THROW_OR_ABORT(filesystem_error(dereferenceable > ? "recursive directory iterator cannot pop" > : "non-dereferenceable recursive directory iterator cannot pop", > ec)); This gives -Wunused warnings when libstdc++ is built with exceptions disabled. Fixed by the attached, pushed to trunk. commit 2905e1af94519b7ba3c43a57af8a7d5e10815950 Author: Jonathan Wakely Date: Wed Feb 2 11:40:28 2022 libstdc++: Fix -Wunused-variable warning for -fno-exceptions build If _GLIBCXX_THROW_OR_ABORT expands to just __builtin_abort() then the bool variable used in the filesystem_error constructor is unused. Mark it as maybe_unused to there's no warning for -fno-exceptions builds. libstdc++-v3/ChangeLog: * src/c++17/fs_dir.cc (fs::recursive_directory_iterator::pop): Add [[maybe_unused]] attribute. * src/filesystem/dir.cc (fs::recursive_directory_iterator::pop): Likewise. diff --git a/libstdc++-v3/src/c++17/fs_dir.cc b/libstdc++-v3/src/c++17/fs_dir.cc index 149a8b0740c..a77aabb6dcc 100644 --- a/libstdc++-v3/src/c++17/fs_dir.cc +++ b/libstdc++-v3/src/c++17/fs_dir.cc @@ -347,7 +347,7 @@ fs::recursive_directory_iterator::pop(error_code& ec) void fs::recursive_directory_iterator::pop() { - const bool dereferenceable = _M_dirs != nullptr; + [[maybe_unused]] const bool dereferenceable = _M_dirs != nullptr; error_code ec; pop(ec); if (ec) diff --git a/libstdc++-v3/src/filesystem/dir.cc b/libstdc++-v3/src/filesystem/dir.cc index ac9e70da516..7cf8e62b5e6 100644 --- a/libstdc++-v3/src/filesystem/dir.cc +++ b/libstdc++-v3/src/filesystem/dir.cc @@ -334,7 +334,7 @@ fs::recursive_directory_iterator::pop(error_code& ec) void fs::recursive_directory_iterator::pop() { - const bool dereferenceable = _M_dirs != nullptr; + [[maybe_unused]] const bool dereferenceable = _M_dirs != nullptr; error_code ec; pop(ec); if (ec)
Re: [PATCH] c++: constrained auto in lambda using outer tparms [PR103706]
On 2/2/22 13:21, Patrick Palka wrote: On Wed, 2 Feb 2022, Patrick Palka wrote: Here we're crashing during satisfaction of the lambda's placeholder type constraints because the constraints depend on the template arguments from the enclosing scope, which aren't a part of the lambda's DECL_TI_ARGS. So when inside a lambda, do_auto_deduction needs to add the "regenerating" template arguments to the set of outer template arguments, like we already do in satisfy_declaration_constraints. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 11? PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (satisfy_declaration_constraints): Use lambda_regenerating_args instead. * cp-tree.h (lambda_regenerating_args): Declare. * pt.cc (add_to_template_args): Handle NULL_TREE extra_args gracefully. (lambda_regenerating_args): Define, split out from satisfy_declaration_constraints. (do_auto_deduction): Use lambda_regenerating_args to obtain the full set of outer template arguments when checking the constraint on a placeholder type within a lambda. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-lambda18.C: New test. --- gcc/cp/constraint.cc | 9 + gcc/cp/cp-tree.h | 1 + gcc/cp/pt.cc | 37 --- .../g++.dg/cpp2a/concepts-lambda18.C | 14 +++ 4 files changed, 48 insertions(+), 13 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda18.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index cc4df4216ef..c41ee02b910 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3148,18 +3148,13 @@ satisfy_declaration_constraints (tree t, sat_info info) args = add_outermost_template_args (args, inh_ctor_targs); } - if (regenerated_lambda_fn_p (t)) + if (LAMBDA_FUNCTION_P (t)) { /* The TI_ARGS of a regenerated lambda contains only the innermost set of template arguments. Augment this with the outer template arguments that were used to regenerate the lambda. */ gcc_assert (!args || TMPL_ARGS_DEPTH (args) == 1); - tree lambda = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (t)); - tree outer_args = TI_ARGS (LAMBDA_EXPR_REGEN_INFO (lambda)); - if (args) - args = add_to_template_args (outer_args, args); - else - args = outer_args; + args = add_to_template_args (lambda_regenerating_args (t), args); } /* If any arguments depend on template parameters, we can't diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index b9eb71fbc3a..7a7fe5ba599 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7715,6 +7715,7 @@ extern void finish_lambda_scope (void); extern tree start_lambda_function (tree fn, tree lambda_expr); extern void finish_lambda_function(tree body); extern bool regenerated_lambda_fn_p (tree); +extern tree lambda_regenerating_args (tree); extern tree most_general_lambda (tree); extern tree finish_omp_target (location_t, tree, tree, bool); extern void finish_omp_target_clauses (location_t, tree, tree *); diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 6e129da1d05..c919d2de68f 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -573,6 +573,9 @@ add_to_template_args (tree args, tree extra_args) if (args == NULL_TREE || extra_args == error_mark_node) return extra_args; + if (extra_args == NULL_TREE) +return args; + Whoops, this last-minute change to add_to_template_args turned out to be a bad idea as some callers apparently rely on add_to_template_args treating NULL_TREE extra_args as a depth 1 targ vector (whereas for the calls in satisfy_declaration_constraints and do_auto_deduction we want NULL_TREE extra_args to be treated as a depth 0 targ vector). Please consider the below patch instead, which doesn't attempt to change the behavior of add_to_template_args: -- >8 -- Subject: [PATCH] c++: constrained auto in lambda using outer tparms [PR103706] Here we're crashing during satisfaction of the lambda's placeholder type constraints because the constraints depend on the template arguments from the enclosing scope, which aren't part of the lambda's DECL_TI_ARGS. So when inside a lambda, do_auto_deduction needs to add its "regenerating" template arguments to the set of outer template arguments, like we already do in satisfy_declaration_constraints. PR c++/103706 gcc/cp/ChangeLog: * constraint.cc (satisfy_declaration_constraints): Use lambda_regenerating_args instead. * cp-tree.h (lambda_regenerating_args): Declare. * pt.cc (lambda_regenerating_args): Define, split out from satisfy_declaration_constraints. (do_auto_deduction): Use lambda_regene
Re: [PATCH] libstdc++: Add missing free functions for atomic_flag [PR103934]
>+ inline void >+ atomic_flag_wait_explicit(const atomic_flag* __a, bool __old, >+ std::memory_order __m) noexcept No need for the std:: qualification, and check the indentation. > libstdc++-v3/ChangeLog: > >PR103934 This needs to include the component: PR libstdc++/103934 >* include/std/atomic: Add missing free functions. Please name the new functions in the changelog, in the usual format. Just the names is fine, no need for the full signatures with parameters. OK for trunk with those changes.
Re: [PATCH] docs: remove --disable-stage1-checking from requirements
On Tue, 1 Feb 2022, Martin Liška wrote: > As the minimal GCC version that can build the current master is 4.8, it > does not make sense mentioning something for older versions. > > Ready to be installed? Yep, looks good. Thank you, Gerald
Re: [PATCH] docs: replace http:// with https://
On Wed, 22 Dec 2021, Martin Liška wrote: > I replaced and verified http:// links for various domains. Thank you, and apologies for not acking this right away back then. (Did you ping, and I missed that? Not that you should have to, just missing a ping would be even worse.) In any case this is a good reminder for me to do some general link maintenance again, which I'll look into once your patch is in. Gerald
[committed] docs: mention analyzer interaction with -ftrivial-auto-var-init [PR104270]
On Wed, 2022-02-02 at 17:14 +, Qing Zhao wrote: > Hi, David, > > Thank you for fixing this issue! > > > On Feb 2, 2022, at 9:06 AM, David Malcolm via Gcc-patches < > > gcc-patches@gcc.gnu.org> wrote: > > > > GCC 12 has gained two features for dealing with uninitialized > > variables: > > > > (a) a new -Wanalyzer-use-of-uninitialized-value warning within > > -fanalyzer > > for interprocedural path-sensitive detection of ununit uses, and > > > > (b) a new -ftrivial-auto-var-init option for mitigating some uses > > of > > uninit variables > > > > It turns out that using (b) was thwarting (a), as it led to > > -fanalyzer > > seeing calls to IFN_DEFERRED_INIT, which -fanalyzer wasn't > > special-casing, thus treating it as initializing the variables in > > question, and thus silencing -Wanalyzer-use-of-uninitialized-value > > on > > them. > > > > invoke.texi says: > > > > "GCC still considers an automatic variable that doesn't have an > > explicit > > initializer as uninitialized, @option{-Wuninitialized} will still > > report > > warning messages on such automatic variables." > > > > and thus -Wanalyzer-use-of-uninitialized-value ought to as well. > > Then should we updated the invoke.texi to include this as well? Good point; I've taken the liberty of pushing the following to trunk as a followup (as r12-7007-gfb45d8e692d41d) Thanks Dave gcc/ChangeLog: PR analyzer/104270 * doc/invoke.texi (-ftrivial-auto-var-init=): Add reference to -Wanalyzer-use-of-uninitialized-value to paragraph documenting that -ftrivial-auto-var-init= doesn't suppress warnings. Signed-off-by: David Malcolm --- gcc/doc/invoke.texi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cfd415110cd..b1b8cc806c7 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -12304,7 +12304,8 @@ Initialize automatic variables with either a pattern or with zeroes to increase the security and predictability of a program by preventing uninitialized memory disclosure and use. GCC still considers an automatic variable that doesn't have an explicit -initializer as uninitialized, @option{-Wuninitialized} will still report +initializer as uninitialized, @option{-Wuninitialized} and +@option{-Wanalyzer-use-of-uninitialized-value} will still report warning messages on such automatic variables. With this option, GCC will also initialize any padding of automatic variables that have structure or union types to zeroes. -- 2.26.3
Re: [PATCH] docs: replace http:// with https://
On Wed, Feb 2, 2022 at 1:46 PM Gerald Pfeifer wrote: > > On Wed, 22 Dec 2021, Martin Liška wrote: > > I replaced and verified http:// links for various domains. > > Thank you, and apologies for not acking this right away back then. > > (Did you ping, and I missed that? Not that you should have to, just > missing a ping would be even worse.) > > In any case this is a good reminder for me to do some general link > maintenance again, which I'll look into once your patch is in. It went in with r12-6130-g786973ce33dfbbd1fe64e1 . I also committed one extra move to https r12-6983-g6bc732eba9a7dc that was pointed out to the gcc@ mailing list . Thanks, Andrew Pinski > > Gerald
Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
On Mon, Jan 31, 2022 at 12:17 AM Eric Botcazou via Gcc-patches wrote: > > > Unfortunately this breaks quite a lot of things. > > Right, for example in Ada where we now happily turn a division by zero, which > should raise an exception with -gnatp, into nonsense. Do we really need this > rather useless optimization in GCC? Blindly mimicing LLVM is not a reason... Since I didn't see anyone responding to this problem, I filed PR 104356 to record the regression. And yes this should be handled correctly. Thanks, Andrew Pinski > > I have installed the attached testcase, which now fails because of the change. > > * gnat.dg/div_zero.adb: New test. > > -- > Eric Botcazou
Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
> Since I didn't see anyone responding to this problem, I filed PR > 104356 to record the regression. > And yes this should be handled correctly. Thanks. Note that we have an example of this in libgcc/libgcc2.c too. -- Eric Botcazou
Re: [PATCH 6/8] rs6000: Remove -m[no-]fold-gimple flag [PR103686]
On Fri, Jan 28, 2022 at 11:50:24AM -0600, Bill Schmidt wrote: > The -m[no-]fold-gimple flag was really intended primarily for internal > testing while implementing GIMPLE folding for rs6000 vector built-in > functions. It ended up leaking into other places, causing problems such > as PR103686 identifies. Let's remove it. Okay. BUT: > gcc.target/powerpc/builtins-1.c was more problematic. It was written in > such a way as to be extremely fragile. For this one, I rewrote the whole > test in a different style, using individual functions to test each > built-in function. These same tests are also largely covered by > builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to > explicitly make this test -mbig for simplicity, and use -O2 for clean code > generation. I made some slight modifications to the expected instruction > counts as a result, and tested on both 32- and 64-bit. This made the testsuite part very hard to review again. I gave up. In the future, please do *one* logical change per patch. That way at least the patches are readable (they were not now, a lot of mixed-up context). So first a patch rewriting this testcase, and then a separate patch that is the meat of *this* patch. > Most instruction > count tests now use the {\m ... \M} style, but I wasn't able to figure out > how to get this right for vcmpequd. and vcmpgtud. Using \. didn't do the > trick, and I got tired of messing with it. I can change those if you > suggest the proper incantation for an opcode ending with a period. {\madd\.} does the trick. \.\M does not make any sense (a word cannot end in a dot, it cannot contain one in the first place). \M\. is valid, but add\M\. is a bit silly: it is obvious the word ends there, there is no need to assert that :-) Okay for trunk like that. Thanks! Segher
[PATCH 0/3] Enable pointer_query caching throughout.
Richard, as we discussed(*), this patch series enables the pointer_query cache in the remaining two passes where it's currently disabled. Since not using the cache is not an option anymore, the first patch in the series makes it a private member of the pointer_query class and its use unconditional. It also simplifies the two passes that use it to avoid having to install it. Tested on x86_64-linux. Martin [*] For reference: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589243.html
[PATCH 1/3] Make pointer_query cache a private member.
The first patch in the series make the pointer_query cache a private member of the class and removes the ability to create an object of it without one, or one with an external cache. It also simplifies existing clients of the class that provide an external cache to avoid doing so. gcc/ChangeLog: * gimple-ssa-warn-access.cc (pass_waccess::pass_waccess): Remove pointer_query cache. * pointer-query.cc (pointer_query::pointer_query): Remove cache argument. Zero-initialize new cache member. (pointer_query::get_ref): Replace cache pointer with direct access. (pointer_query::put_ref): Same. (pointer_query::flush_cache): Same. (pointer_query::dump): Same. * pointer-query.h (class pointer_query): Remove cache argument from ctor. Change cache pointer to cache subobject member. * tree-ssa-strlen.cc: Remove pointer_query cache. --- gcc/gimple-ssa-warn-access.cc | 8 ++-- gcc/pointer-query.cc | 74 +++ gcc/pointer-query.h | 12 +++--- gcc/tree-ssa-strlen.cc| 8 ++-- 4 files changed, 44 insertions(+), 58 deletions(-) diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index ad5e2f4458e..4b3d2c00b03 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -2137,10 +2137,9 @@ private: /* Return true if use follows an invalidating statement. */ bool use_after_inval_p (gimple *, gimple *, bool = false); - /* A pointer_query object and its cache to store information about - pointers and their targets in. */ + /* A pointer_query object to store information about pointers and + their targets in. */ pointer_query m_ptr_qry; - pointer_query::cache_type m_var_cache; /* Mapping from DECLs and their clobber statements in the function. */ hash_map m_clobbers; /* A bit is set for each basic block whose statements have been assigned @@ -2158,8 +2157,7 @@ private: pass_waccess::pass_waccess (gcc::context *ctxt) : gimple_opt_pass (pass_data_waccess, ctxt), -m_ptr_qry (NULL, &m_var_cache), -m_var_cache (), +m_ptr_qry (NULL), m_clobbers (), m_bb_uids_set (), m_func (), diff --git a/gcc/pointer-query.cc b/gcc/pointer-query.cc index b092ef4fbdc..afbcd0a15ea 100644 --- a/gcc/pointer-query.cc +++ b/gcc/pointer-query.cc @@ -1433,12 +1433,11 @@ ssa_name_limit_t::~ssa_name_limit_t () } /* Default ctor. Initialize object with pointers to the range_query - and cache_type instances to use or null. */ + instance to use or null. */ -pointer_query::pointer_query (range_query *qry /* = NULL */, - cache_type *cache /* = NULL */) -: rvals (qry), var_cache (cache), hits (), misses (), - failures (), depth (), max_depth () +pointer_query::pointer_query (range_query *qry /* = NULL */) + : rvals (qry), hits (), misses (), failures (), depth (), max_depth (), +var_cache () { /* No op. */ } @@ -1449,28 +1448,22 @@ pointer_query::pointer_query (range_query *qry /* = NULL */, const access_ref * pointer_query::get_ref (tree ptr, int ostype /* = 1 */) const { - if (!var_cache) -{ - ++misses; - return NULL; -} - unsigned version = SSA_NAME_VERSION (ptr); unsigned idx = version << 1 | (ostype & 1); - if (var_cache->indices.length () <= idx) + if (var_cache.indices.length () <= idx) { ++misses; return NULL; } - unsigned cache_idx = var_cache->indices[idx]; - if (var_cache->access_refs.length () <= cache_idx) + unsigned cache_idx = var_cache.indices[idx]; + if (var_cache.access_refs.length () <= cache_idx) { ++misses; return NULL; } - access_ref &cache_ref = var_cache->access_refs[cache_idx]; + const access_ref &cache_ref = var_cache.access_refs[cache_idx]; if (cache_ref.ref) { ++hits; @@ -1491,17 +1484,17 @@ pointer_query::get_ref (tree ptr, gimple *stmt, access_ref *pref, const unsigned version = TREE_CODE (ptr) == SSA_NAME ? SSA_NAME_VERSION (ptr) : 0; - if (var_cache && version) + if (version) { unsigned idx = version << 1 | (ostype & 1); - if (idx < var_cache->indices.length ()) + if (idx < var_cache.indices.length ()) { - unsigned cache_idx = var_cache->indices[idx] - 1; - if (cache_idx < var_cache->access_refs.length () - && var_cache->access_refs[cache_idx].ref) + unsigned cache_idx = var_cache.indices[idx] - 1; + if (cache_idx < var_cache.access_refs.length () + && var_cache.access_refs[cache_idx].ref) { ++hits; - *pref = var_cache->access_refs[cache_idx]; + *pref = var_cache.access_refs[cache_idx]; return true; } } @@ -1525,7 +1518,7 @@ void pointer_query::put_ref (tree ptr, const access_ref &ref, int ostype /* = 1 */) { /* Only add populated/valid entr
[PATCH 2/3] Enable pointer_query caching for -Warray-bounds.
The second patch in the series adds a pointer_query instance to the array bounds checker object and uses it for each invocation to check a function. gcc/ChangeLog: * gimple-array-bounds.cc (array_bounds_checker::array_bounds_checker): Define ctor. (array_bounds_checker::get_value_range): Use new member. (array_bounds_checker::check_mem_ref): Same. * gimple-array-bounds.h (array_bounds_checker::array_bounds_checker): Outline ctor. (array_bounds_checker::m_ptr_query): New member. --- gcc/gimple-array-bounds.cc | 13 ++--- gcc/gimple-array-bounds.h | 10 ++ 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc index 80c70b49607..7ec8b08c8d2 100644 --- a/gcc/gimple-array-bounds.cc +++ b/gcc/gimple-array-bounds.cc @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see #include "tree.h" #include "gimple.h" #include "ssa.h" +#include "pointer-query.h" #include "gimple-array-bounds.h" #include "gimple-iterator.h" #include "gimple-walk.h" @@ -37,7 +38,13 @@ along with GCC; see the file COPYING3. If not see #include "domwalk.h" #include "tree-cfg.h" #include "attribs.h" -#include "pointer-query.h" + +array_bounds_checker::array_bounds_checker (struct function *func, + range_query *qry) + : fun (func), m_ptr_qry (qry) +{ + /* No-op. */ +} // This purposely returns a value_range, not a value_range_equiv, to // break the dependency on equivalences for this pass. @@ -45,7 +52,7 @@ along with GCC; see the file COPYING3. If not see const value_range * array_bounds_checker::get_value_range (const_tree op, gimple *stmt) { - return ranges->get_value_range (op, stmt); + return m_ptr_qry.rvals->get_value_range (op, stmt); } /* Try to determine the DECL that REF refers to. Return the DECL or @@ -401,7 +408,7 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref, axssize = wi::to_offset (access_size); access_ref aref; - if (!compute_objsize (ref, m_stmt, 0, &aref, ranges)) + if (!m_ptr_qry.get_ref (ref, m_stmt, &aref, 0)) return false; if (aref.offset_in_range (axssize)) diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h index d42146b87c8..eb399271da7 100644 --- a/gcc/gimple-array-bounds.h +++ b/gcc/gimple-array-bounds.h @@ -20,13 +20,14 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_GIMPLE_ARRAY_BOUNDS_H #define GCC_GIMPLE_ARRAY_BOUNDS_H +#include "pointer-query.h" + class array_bounds_checker { friend class check_array_bounds_dom_walker; public: - array_bounds_checker (struct function *fun, range_query *v) -: fun (fun), ranges (v) { } + array_bounds_checker (struct function *, range_query *); void check (); private: @@ -38,8 +39,9 @@ private: /* Current function. */ struct function *fun; - /* Ranger instance. */ - range_query *ranges; + /* A pointer_query object to store information about pointers and + their targets in. */ + pointer_query m_ptr_qry; /* Current statement. */ gimple *m_stmt; }; -- 2.34.1
[PATCH 3/3] Enable pointer_query caching for -Wrestrict.
The third patch in the series adds a pointer_query instance to the wrestrict pass and uses it for each invocation to check a function. gcc/ChangeLog: * gimple-ssa-warn-restrict.cc (class pass_wrestrict): Outline ctor. (pass_wrestrict::m_ptr_qry): New member. (wrestrict_walk): Rename... (pass_wrestrict::check_block): ...to this. (pass_wrestrict::execute): Set up and tear down pointer_query and ranger. (builtin_memref::builtin_memref): Change ctor argument. Simplify. (builtin_access::builtin_access): Same. (builtin_access::m_ptr_qry): New member. (check_call): Rename... (pass_wrestrict::check_call): ...to this. (check_bounds_or_overlap): Change argument. * gimple-ssa-warn-restrict.h (check_bounds_or_overlap): Same. --- gcc/gimple-ssa-warn-restrict.cc | 126 gcc/gimple-ssa-warn-restrict.h | 2 +- 2 files changed, 62 insertions(+), 66 deletions(-) diff --git a/gcc/gimple-ssa-warn-restrict.cc b/gcc/gimple-ssa-warn-restrict.cc index d1beb01fe89..72418398dad 100644 --- a/gcc/gimple-ssa-warn-restrict.cc +++ b/gcc/gimple-ssa-warn-restrict.cc @@ -62,46 +62,63 @@ const pass_data pass_data_wrestrict = { class pass_wrestrict : public gimple_opt_pass { public: - pass_wrestrict (gcc::context *ctxt) -: gimple_opt_pass (pass_data_wrestrict, ctxt) -{ } + pass_wrestrict (gcc::context *); - opt_pass *clone () { return new pass_wrestrict (m_ctxt); } + // opt_pass *clone () { return new pass_wrestrict (m_ctxt); } virtual bool gate (function *); virtual unsigned int execute (function *); + + void check_call (gimple *); + + void check_block (basic_block); + + /* A pointer_query object to store information about pointers and + their targets in. */ + pointer_query m_ptr_qry; }; +pass_wrestrict::pass_wrestrict (gcc::context *ctxt) + : gimple_opt_pass (pass_data_wrestrict, ctxt), +m_ptr_qry () +{ } + bool pass_wrestrict::gate (function *fun ATTRIBUTE_UNUSED) { return warn_array_bounds || warn_restrict || warn_stringop_overflow; } -static void check_call (range_query *, gimple *); - -static void -wrestrict_walk (range_query *query, basic_block bb) +void +pass_wrestrict::check_block (basic_block bb) { /* Iterate over statements, looking for function calls. */ - for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si); - gsi_next (&si)) + for (auto si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si)) { gimple *stmt = gsi_stmt (si); if (!is_gimple_call (stmt)) continue; - check_call (query, stmt); + check_call (stmt); } } unsigned pass_wrestrict::execute (function *fun) { - gimple_ranger ranger; + /* Create a new ranger instance and associate it with FUN. */ + m_ptr_qry.rvals = enable_ranger (fun); + basic_block bb; FOR_EACH_BB_FN (bb, fun) -wrestrict_walk (&ranger, bb); +check_block (bb); + + m_ptr_qry.flush_cache (); + + /* Release the ranger instance and replace it with a global ranger. + Also reset the pointer since calling disable_ranger() deletes it. */ + disable_ranger (fun); + m_ptr_qry.rvals = NULL; return 0; } @@ -145,7 +162,7 @@ public: only the destination reference is. */ bool strbounded_p; - builtin_memref (range_query *, gimple *, tree, tree); + builtin_memref (pointer_query &, gimple *, tree, tree); tree offset_out_of_bounds (int, offset_int[3]) const; @@ -153,7 +170,7 @@ private: /* Call statement to the built-in. */ gimple *stmt; - range_query *query; + pointer_query &m_ptr_qry; /* Ctor helper to set or extend OFFRANGE based on argument. */ void extend_offset_range (tree); @@ -187,7 +204,8 @@ class builtin_access && detect_overlap != &builtin_access::no_overlap); } - builtin_access (range_query *, gimple *, builtin_memref &, builtin_memref &); + builtin_access (pointer_query &, gimple *, + builtin_memref &, builtin_memref &); /* Entry point to determine overlap. */ bool overlap (); @@ -225,7 +243,7 @@ class builtin_access a size SIZE in bytes. If SIZE is NULL_TREE then the size is assumed to be unknown. STMT is the statement in which expr appears in. */ -builtin_memref::builtin_memref (range_query *query, gimple *stmt, tree expr, +builtin_memref::builtin_memref (pointer_query &ptrqry, gimple *stmt, tree expr, tree size) : ptr (expr), ref (), @@ -238,7 +256,7 @@ builtin_memref::builtin_memref (range_query *query, gimple *stmt, tree expr, maxobjsize (tree_to_shwi (max_object_size ())), strbounded_p (), stmt (stmt), - query (query) + m_ptr_qry (ptrqry) { /* Unfortunately, wide_int default ctor is a no-op so array members of the type must be set individually. */ @@ -257,7 +275,7 @@ builtin_memref::builtin_memref (range_query *query, gimple *stmt, tree expr, tree
[committed] wwwdocs: gcc-4.5: Update link to MPC
This is on top of Martin's changes. Pushed. Gerald --- htdocs/gcc-4.5/changes.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/gcc-4.5/changes.html b/htdocs/gcc-4.5/changes.html index c141a4d9..5181bae8 100644 --- a/htdocs/gcc-4.5/changes.html +++ b/htdocs/gcc-4.5/changes.html @@ -18,7 +18,7 @@ GCC now requires the http://www.multiprecision.org/mpc/";>MPC library in order to +href="https://www.multiprecision.org/mpc/";>MPC library in order to build. See the https://gcc.gnu.org/install/prerequisites.html";>prerequisites page for version requirements. -- 2.34.1
[committed] Correct typos in -Wuse-after-free description.
I committed r12-7009 to fix a couple of typos in the description of the option. https://gcc.gnu.org/pipermail/gcc-cvs/2022-February/360083.html Martin
[PATCH v2] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons
Expressions of the form "X + CST < Y + CST" where: * CST is an unsigned integer constant with only the MSB set, and * X and Y's types have integer conversion ranks <= CST's can be simplified to "(signed) X < (signed) Y". This is because, assuming a 32-bit signed numbers, (unsigned) INT_MIN + 0x8000 is 0, and (unsigned) INT_MAX + 0x8000 is UINT_MAX. i.e. the result increases monotonically with signed input. This means: ((signed) X < (signed) Y) iff (X + 0x8000 < Y + 0x8000) gcc/ * match.pd (X + C < Y + C -> (signed) X < (signed) Y, if C is 0x8000): New simplification. gcc/testsuite/ * gcc.dg/pr94899.c: New test. --- gcc/match.pd | 13 + gcc/testsuite/gcc.dg/pr94899.c | 48 ++ 2 files changed, 61 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr94899.c --- v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589557.html Notes on v2, based on Richard's review comments: 1. I removed matching on "convert", and therefore also replaced the removal of convert upon simplification with an explicit cast to signed. I originally thought this simplification only applies to signed operands that have been cast to unsigned, but thinking about it, it became clear that they do not necessarily have to be signed originally. The simplification is now a bit more general. 2. Removed checks for operands' types as it seems to be unnecessary. I hope this is correct. 3. Added unsigned types and mismatched sizes of operands to the test. These are now simplified. diff --git a/gcc/match.pd b/gcc/match.pd index b942cb2930a..fabb06c6808 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1975,6 +1975,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) (op @0 @1 + +/* As a special case, X + C < Y + C is the same as (signed) X < (signed) Y + when C is an unsigned integer constant with only the MSB set, and X and + Y have types of equal or lower integer conversion rank than C's. */ +(for op (lt le ge gt) + (simplify + (op (plus:c INTEGER_CST@0 @1) (plus:c INTEGER_CST@0 @2)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && wi::only_sign_bit_p (wi::to_wide (@0))) + (with { tree stype = signed_type_for (TREE_TYPE (@0)); } +(op (convert:stype @1) (convert:stype @2)) + /* For equality and subtraction, this is also true with wrapping overflow. */ (for op (eq ne minus) (simplify diff --git a/gcc/testsuite/gcc.dg/pr94899.c b/gcc/testsuite/gcc.dg/pr94899.c new file mode 100644 index 000..f47120988b7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr94899.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +typedef __INT16_TYPE__ int16_t; +typedef __INT32_TYPE__ int32_t; +typedef __UINT16_TYPE__ uint16_t; +typedef __UINT32_TYPE__ uint32_t; + +#define MAGIC 0x8000 + +int +f_i16_i16 (int16_t x, int16_t y) +{ + return x + MAGIC < y + MAGIC; +} + +int +f_i16_i32 (int16_t x, int32_t y) +{ + return x + MAGIC < y + MAGIC; +} + +int +f_i32_i32 (int32_t x, int32_t y) +{ + return x + MAGIC < y + MAGIC; +} + +int +f_u32_i32 (uint32_t x, int32_t y) +{ + return x + MAGIC < y + MAGIC; +} + +int +f_u32_u32 (uint32_t x, uint32_t y) +{ + return x + MAGIC < y + MAGIC; +} + +int +f_i32_i32_sub (int32_t x, int32_t y) +{ + return x - MAGIC < y - MAGIC; +} + +/* The constants above should have been optimized away. */ +/* { dg-final { scan-tree-dump-times "2147483648" 0 "optimized"} } */ -- 2.34.1
[PATCH] c++: dependent array bounds completion [PR104302]
The patch for PR55227 changed the minimal init-list handling in cp_complete_array_type to a call to reshape_init, which broke on the dependent initializer. It occurred to me that trying to deduce the array size from a dependent init-list is wrong in general, as we can see with the second testcase, so let's just not. I also limited the reshape_init call to the case of a char array, as before the patch for 55227; that's the only case where we want to strip a level of braces from an array. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/104302 gcc/cp/ChangeLog: * decl.cc (maybe_deduce_size_from_array_init): Give up on type-dependent init. (cp_complete_array_type): Only call reshape_init for character array. gcc/testsuite/ChangeLog: * g++.dg/template/array35.C: New test. * g++.dg/template/array36.C: New test. --- gcc/cp/decl.cc | 10 -- gcc/testsuite/g++.dg/template/array35.C | 11 +++ gcc/testsuite/g++.dg/template/array36.C | 15 +++ 3 files changed, 34 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/template/array35.C create mode 100644 gcc/testsuite/g++.dg/template/array36.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 09eed9ceba6..7b48b56231b 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -6017,6 +6017,11 @@ maybe_deduce_size_from_array_init (tree decl, tree init) return; if (!check_array_designated_initializer (ce, i)) failure = 1; + /* If an un-designated initializer is type-dependent, we can't +check brace elision yet. */ + if (ce->index == NULL_TREE + && type_dependent_expression_p (ce->value)) + return; } } @@ -8113,7 +8118,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, else { gcc_assert (!DECL_PRETTY_FUNCTION_P (decl)); - /* Deduce array size even if the initializer is dependent. */ + /* Try to deduce array size. */ maybe_deduce_size_from_array_init (decl, init); /* And complain about multiple initializers. */ if (init && TREE_CODE (init) == TREE_LIST && TREE_CHAIN (init) @@ -9570,7 +9575,8 @@ cp_complete_array_type (tree *ptype, tree initial_value, bool do_default) /* An array of character type can be initialized from a brace-enclosed string constant so call reshape_init to remove the optional braces from a braced string literal. */ - if (BRACE_ENCLOSED_INITIALIZER_P (initial_value)) + if (char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (*ptype))) + && BRACE_ENCLOSED_INITIALIZER_P (initial_value)) initial_value = reshape_init (*ptype, initial_value, tf_warning_or_error); diff --git a/gcc/testsuite/g++.dg/template/array35.C b/gcc/testsuite/g++.dg/template/array35.C new file mode 100644 index 000..9fd02633523 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/array35.C @@ -0,0 +1,11 @@ +// PR c++/104302 + +struct ss {}; +static ss ff(void* const v); +template +void f1(void) { +int mem[mem_size]; +ss StateRegs[] = { +ff(mem) +}; +} diff --git a/gcc/testsuite/g++.dg/template/array36.C b/gcc/testsuite/g++.dg/template/array36.C new file mode 100644 index 000..1511da7020d --- /dev/null +++ b/gcc/testsuite/g++.dg/template/array36.C @@ -0,0 +1,15 @@ +// Don't try to deduce array bounds from a dependent initializer. +// { dg-do compile { target c++11 } } + +struct A { int i,j; }; + +template void f(T t) +{ + A ar[] = { t, t }; + static_assert (sizeof(ar)/sizeof(A) == 1, ""); +} + +int main() +{ + f(42); +} base-commit: dc898b2ba5c50a7311bc3137f0987a7831362ed8 -- 2.27.0
Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
On Thu, 3 Feb 2022, Eric Botcazou wrote: > > Since I didn't see anyone responding to this problem, I filed PR > > 104356 to record the regression. > > And yes this should be handled correctly. > > Thanks. Note that we have an example of this in libgcc/libgcc2.c too. I assumed this was handled by the followups to the patch ... Well, yes, we have to fix it. I will share my thoughts when coming along the bugreport. Richard.
[PATCH] ranger: Fix up wi_fold_in_parts for small precision types [PR104334]
Hi! The wide-int.h templates expect that when an int/long etc. operand is used it will be sign-extended based on the types precision. wi_fold_in_parts passes 3 such non-zero constants to wi::lt_p, wi::gt_p and wi::eq_p - 1, 3 and 4, which means it was doing weird things if either some of 1, 3 or 4 weren't representable in type, or if type was unsigned 3 bit type 4 should be written as -4. The following patch promotes the subtraction operands to widest_int and uses that as the type for ?h_range variables and compares them as such. We don't need the overflow handling because there is never an overflow. Bootstrapped/regtested on x86_64-linux and i686-linux (on the former also with lto bootstrap), ok for trunk? 2022-02-02 Jakub Jelinek PR tree-optimization/104334 * range-op.cc (range_operator::wi_fold_in_parts): Change lh_range and rh_range type to widest_int and subtract in widest_int. Remove ov_rh, ov_lh and sign vars, always perform comparisons as signed and use >, < and == operators for it. * g++.dg/opt/pr104334.C: New test. --- gcc/range-op.cc.jj 2022-01-13 22:29:15.345831749 +0100 +++ gcc/range-op.cc 2022-02-02 20:20:22.02000 +0100 @@ -144,22 +144,21 @@ range_operator::wi_fold_in_parts (irange const wide_int &rh_lb, const wide_int &rh_ub) const { - wi::overflow_type ov_rh, ov_lh; int_range_max tmp; - wide_int rh_range = wi::sub (rh_ub, rh_lb, TYPE_SIGN (type), &ov_rh); - wide_int lh_range = wi::sub (lh_ub, lh_lb, TYPE_SIGN (type), &ov_lh); - signop sign = TYPE_SIGN (type);; + widest_int rh_range = wi::sub (widest_int::from (rh_ub, TYPE_SIGN (type)), +widest_int::from (rh_lb, TYPE_SIGN (type))); + widest_int lh_range = wi::sub (widest_int::from (lh_ub, TYPE_SIGN (type)), +widest_int::from (lh_lb, TYPE_SIGN (type))); // If there are 2, 3, or 4 values in the RH range, do them separately. // Call wi_fold_in_parts to check the RH side. - if (wi::gt_p (rh_range, 0, sign) && wi::lt_p (rh_range, 4, sign) - && ov_rh == wi::OVF_NONE) + if (rh_range > 0 && rh_range < 4) { wi_fold_in_parts (r, type, lh_lb, lh_ub, rh_lb, rh_lb); - if (wi::gt_p (rh_range, 1, sign)) + if (rh_range > 1) { wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_lb + 1, rh_lb + 1); r.union_ (tmp); - if (wi::eq_p (rh_range, 3)) + if (rh_range == 3) { wi_fold_in_parts (tmp, type, lh_lb, lh_ub, rh_lb + 2, rh_lb + 2); r.union_ (tmp); @@ -170,15 +169,14 @@ range_operator::wi_fold_in_parts (irange } // Otherise check for 2, 3, or 4 values in the LH range and split them up. // The RH side has been checked, so no recursion needed. - else if (wi::gt_p (lh_range, 0, sign) && wi::lt_p (lh_range, 4, sign) - && ov_lh == wi::OVF_NONE) + else if (lh_range > 0 && lh_range < 4) { wi_fold (r, type, lh_lb, lh_lb, rh_lb, rh_ub); - if (wi::gt_p (lh_range, 1, sign)) + if (lh_range > 1) { wi_fold (tmp, type, lh_lb + 1, lh_lb + 1, rh_lb, rh_ub); r.union_ (tmp); - if (wi::eq_p (lh_range, 3)) + if (lh_range == 3) { wi_fold (tmp, type, lh_lb + 2, lh_lb + 2, rh_lb, rh_ub); r.union_ (tmp); --- gcc/testsuite/g++.dg/opt/pr104334.C.jj 2022-02-02 14:35:51.184657968 +0100 +++ gcc/testsuite/g++.dg/opt/pr104334.C 2022-02-02 14:37:14.888478594 +0100 @@ -0,0 +1,40 @@ +// PR tree-optimization/104334 +// { dg-do run { target c++11 } } +// { dg-options "-O2 --param logical-op-non-short-circuit=0" } + +enum class A { A0, A1, A2, A3 }; +int x; + +__attribute__((noipa)) void +baz () +{ + x = 1; +} + +struct B { + unsigned b : 2; + + A + foo () const + { +return static_cast (b); + } + + __attribute__((noinline)) void + bar () + { +if (foo () == A::A2 || foo () == A::A3) + baz (); + } +}; + +int +main () +{ + B c; + c.b = 2; + c.bar (); + if (x != 1) +__builtin_abort (); + return 0; +} Jakub