Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On 13 April 2016 at 22:12, Jason Merrill wrote: > On 04/13/2016 03:18 PM, Jakub Jelinek wrote: >> >> On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote: >>> >>> commit 761983a023b5217ef831a43f423779940c788ecf >>> Author: Jason Merrill >>> Date: Tue Apr 12 13:16:50 2016 -0400 >>> >>> gcc/ >>> * cfgexpand.c (pass_expand::execute): Handle attribute >>> abi_warning. >>> * expr.c (expand_expr_real_1): Likewise. >>> gcc/cp/ >>> * call.c (empty_class_msg, mark_for_abi_warning): New. >>> (build_call_a): Use them. >>> * decl.c (store_parm_decls): Use mark_for_abi_warning. >>> * error.c (pp_format_to_string): New. >> >> >> I think you should put a space into the attribute name instead of _ >> to make it clear that it is not an attribute users can use directly in >> their >> code through __attribute__. >> >> Otherwise it looks reasonable to me. > > > Thanks, applied with that change. > Hi, I'm seeing g++.dg/abi/empty13.C failing at execution on aarch64-none-linux-gnu (using qemu): qemu: uncaught target signal 11 (Segmentation fault) - core dumped and aarch64_be-none-elf (using the Foundation Model): Terminated by exception. But it passes on aarch64-none-elf (using the Foundation Model too) Am I the only one? Christophe. > Jason >
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On 14/04/16 09:02, Christophe Lyon wrote: On 13 April 2016 at 22:12, Jason Merrill wrote: On 04/13/2016 03:18 PM, Jakub Jelinek wrote: On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote: commit 761983a023b5217ef831a43f423779940c788ecf Author: Jason Merrill Date: Tue Apr 12 13:16:50 2016 -0400 gcc/ * cfgexpand.c (pass_expand::execute): Handle attribute abi_warning. * expr.c (expand_expr_real_1): Likewise. gcc/cp/ * call.c (empty_class_msg, mark_for_abi_warning): New. (build_call_a): Use them. * decl.c (store_parm_decls): Use mark_for_abi_warning. * error.c (pp_format_to_string): New. I think you should put a space into the attribute name instead of _ to make it clear that it is not an attribute users can use directly in their code through __attribute__. Otherwise it looks reasonable to me. Thanks, applied with that change. Hi, I'm seeing g++.dg/abi/empty13.C failing at execution on aarch64-none-linux-gnu (using qemu): qemu: uncaught target signal 11 (Segmentation fault) - core dumped and aarch64_be-none-elf (using the Foundation Model): Terminated by exception. But it passes on aarch64-none-elf (using the Foundation Model too) Am I the only one? Hi Christophe, I see the test failing on aarch64-none-linux-gnu (native) with no output, just: spawn [open ...] FAIL: g++.dg/abi/empty13.C -std=gnu++98 execution test And I see it passing on aarch64-none-elf. I haven't tried aarch64_be-none-elf yet. Kyrill Christophe. Jason
[C++ Patch] PR 70540 ("[4.9/5/6 Regression] ICE on invalid code in cxx_incomplete_type_diagnostic...")
Hi, in this regression we ICE during error recovery after an additional redundant error message. I think it's one of those cases (we have got quite a few elsewhere, in semantics.c too) when it's better to immediately return error_mark_node when mark_used returns false, even if we aren't in a SFINAE context. To be sure, I double checked that in those cases mark_used certainly issues an error, thus we aren't risking creating accepts-invalid bugs, it's only matter of fine tuning error recovery. Tested x86_64-linux. Thanks, Paolo. / /cp 2016-04-14 Paolo Carlini PR c++/70540 * semantics.c (process_outer_var_ref): Unconditionally return error_mark_node when mark_used returns false. /testsuite 2016-04-14 Paolo Carlini PR c++/70540 * g++.dg/cpp0x/auto47.C: New. Index: cp/semantics.c === --- cp/semantics.c (revision 234970) +++ cp/semantics.c (working copy) @@ -3276,7 +3276,7 @@ process_outer_var_ref (tree decl, tsubst_flags_t c tree initializer = convert_from_reference (decl); /* Mark it as used now even if the use is ill-formed. */ - if (!mark_used (decl, complain) && !(complain & tf_error)) + if (!mark_used (decl, complain)) return error_mark_node; bool saw_generic_lambda = false; Index: testsuite/g++.dg/cpp0x/auto47.C === --- testsuite/g++.dg/cpp0x/auto47.C (revision 0) +++ testsuite/g++.dg/cpp0x/auto47.C (working copy) @@ -0,0 +1,8 @@ +// PR c++/70540 +// { dg-do compile { target c++11 } } + +void +foo () +{ + auto f = [&] { return f; }; // { dg-error "before deduction" } +}
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On Thu, Apr 14, 2016 at 10:41 AM, Kyrill Tkachov wrote: > > On 14/04/16 09:02, Christophe Lyon wrote: >> >> On 13 April 2016 at 22:12, Jason Merrill wrote: >>> >>> On 04/13/2016 03:18 PM, Jakub Jelinek wrote: On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote: > > commit 761983a023b5217ef831a43f423779940c788ecf > Author: Jason Merrill > Date: Tue Apr 12 13:16:50 2016 -0400 > > gcc/ > * cfgexpand.c (pass_expand::execute): Handle attribute > abi_warning. > * expr.c (expand_expr_real_1): Likewise. > gcc/cp/ > * call.c (empty_class_msg, mark_for_abi_warning): New. > (build_call_a): Use them. > * decl.c (store_parm_decls): Use mark_for_abi_warning. > * error.c (pp_format_to_string): New. I think you should put a space into the attribute name instead of _ to make it clear that it is not an attribute users can use directly in their code through __attribute__. Otherwise it looks reasonable to me. >>> >>> >>> Thanks, applied with that change. >>> >> Hi, >> >> I'm seeing g++.dg/abi/empty13.C failing at execution on >> aarch64-none-linux-gnu (using qemu): >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped >> >> and aarch64_be-none-elf (using the Foundation Model): >> Terminated by exception. >> >> But it passes on aarch64-none-elf (using the Foundation Model too) >> >> Am I the only one? > > Hi Christophe, > > I see the test failing on aarch64-none-linux-gnu (native) > with no output, just: > spawn [open ...] > FAIL: g++.dg/abi/empty13.C -std=gnu++98 execution test > > And I see it passing on aarch64-none-elf. IIRC on AArch64 we document that empty classes and structs take up a slot as per the PCS . http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf See section 3.1 For the purposes of parameter passing in [ AAPCS64 ], a parameter whose type is an empty class shall be treated as if its type were an aggregate with a single member of type unsigned byte. regards Ramana > > I haven't tried aarch64_be-none-elf yet. > > Kyrill > > >> Christophe. >> >> >> >>> Jason >>> >
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On Thu, Apr 14, 2016 at 11:25:07AM +0100, Ramana Radhakrishnan wrote: > > I see the test failing on aarch64-none-linux-gnu (native) > > with no output, just: > > spawn [open ...] > > FAIL: g++.dg/abi/empty13.C -std=gnu++98 execution test > > > > And I see it passing on aarch64-none-elf. > > IIRC on AArch64 we document that empty classes and structs take up a > slot as per the PCS . > > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf > > See section 3.1 > > For the purposes of parameter passing in > [ > AAPCS64 > ], a parameter whose type is an empty class shall be treated as if its > type were an aggregate with a > single > member of type unsigned byte. One thing is passing such arguments in registers, another is how they are passed on the stack (e.g. if there are several other arguments that are passed in registers). Also, note that the (new) definition of C++ empty class I believe includes even empty classes that are normally very large, i.e. don't pretend they are just single byte, but can be e.g. 1024 bytes or longer. Do you want to still pass them in registers and/or on the stack as if it is 1 byte, 0 bytes or more bytes? How do you pass empty C structures? E.g. looking at sparc-solaris2.12 cross, C for empty structures passes both in registers and on the stack as if there is one argument (and C++ did too for the empty classes that pretended to have size 1), so appart from the ICE in emit_move_insn there is no ABI change, only perhaps for the larger structs. Guess we need to figure out what do the various targets do for C empty structures and decide what we do want to do with C++ empty classes. Perhaps the C++ FE should just set some flag on the empty class parameters, and let the backends decide what to do with those? Jakub
Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements
Martin Sebor writes: > diff --git a/gcc/testsuite/g++.dg/cpp1y/vla11.C > b/gcc/testsuite/g++.dg/cpp1y/vla11.C > new file mode 100644 > index 000..af9624a > --- /dev/null > +++ b/gcc/testsuite/g++.dg/cpp1y/vla11.C > @@ -0,0 +1,711 @@ > +// PR c++/69517 - [5/6 regression] SEGV on a VLA with excess initializer > +// elements > +// PR c++/70019 - VLA size overflow not detected > +// > +// Runtime test to verify that attempting to either construct a VLA with > +// erroneous bounds, or initialize one with an initializer-list that > +// contains more elements than the VLA's non-constant (runtime) bounds > +// causes an exception to be thrown. Test also verifies that valid > +// VLAs and their initializers don't cause such an exception. > + > +// { dg-do run { target c++11 } } > +// { dg-additional-options "-Wno-vla" } On m68k: /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In instantiation of 'struct TestType<32u>': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1: required from here /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: requested alignment 32 is larger than 16 [-Wattributes] /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 'int testcase201(const char*)': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1: note: in expansion of macro 'TEST' /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In instantiation of 'struct TestType<64u>': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:202:1: required from here /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: requested alignment 64 is larger than 16 [-Wattributes] /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 'int testcase202(const char*)': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:202:1: note: in expansion of macro 'TEST' /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 'int testcase704(const char*)': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:704:1: note: in expansion of macro 'TEST' /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 'int testcase705(const char*)': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:705:1: note: in expansion of macro 'TEST' FAIL: g++.dg/cpp1y/vla11.C -std=c++11 (test for excess errors) Excess errors: /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: requested alignment 32 is larger than 16 [-Wattributes] /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: requested alignment 64 is larger than 16 [-Wattributes] /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: static assertion failed: wrong test type size Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
[PATCH] Another small PRE speedup / cleanup
As partial antic compute ignores back-edges (it's a union DF problem and thus would blow up when working over back-edges) we can avoid iterating if we use the correct processing order (postorder). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Queued for stage1. Richard. 2016-04-14 Richard Biener * tree-ssa-pre.c (postorder, postorder_num): Make locals ... (compute_antic): ... here. For partial antic use regular postorder and scrap iteration. (compute_partial_antic_aux): Remove unused return value. (init_pre): Do not allocate postorder. (fini_pre): Do not free postorder. Index: gcc/tree-ssa-pre.c === *** gcc/tree-ssa-pre.c (revision 234970) --- gcc/tree-ssa-pre.c (working copy) *** typedef struct bb_bitmap_sets *** 432,441 #define BB_LIVE_VOP_ON_EXIT(BB) ((bb_value_sets_t) ((BB)->aux))->vop_on_exit - /* Basic block list in postorder. */ - static int *postorder; - static int postorder_num; - /* This structure is used to keep track of statistics on what optimization PRE was able to perform. */ static struct --- 432,437 *** compute_antic_aux (basic_block block, bo *** 2209,2219 - ANTIC_IN[BLOCK]) */ ! static bool compute_partial_antic_aux (basic_block block, bool block_has_abnormal_pred_edge) { - bool changed = false; bitmap_set_t old_PA_IN; bitmap_set_t PA_OUT; edge e; --- 2228,2237 - ANTIC_IN[BLOCK]) */ ! static void compute_partial_antic_aux (basic_block block, bool block_has_abnormal_pred_edge) { bitmap_set_t old_PA_IN; bitmap_set_t PA_OUT; edge e; *** compute_partial_antic_aux (basic_block b *** 2312,2320 dependent_clean (PA_IN (block), ANTIC_IN (block)); - if (!bitmap_set_equal (old_PA_IN, PA_IN (block))) - changed = true; - maybe_dump_sets: if (dump_file && (dump_flags & TDF_DETAILS)) { --- 2330,2335 *** compute_partial_antic_aux (basic_block b *** 2327,2333 bitmap_set_free (old_PA_IN); if (PA_OUT) bitmap_set_free (PA_OUT); - return changed; } /* Compute ANTIC and partial ANTIC sets. */ --- 2342,2347 *** compute_antic (void) *** 2349,2371 FOR_ALL_BB_FN (block, cfun) { FOR_EACH_EDGE (e, ei, block->preds) if (e->flags & EDGE_ABNORMAL) { bitmap_set_bit (has_abnormal_preds, block->index); break; } - BB_VISITED (block) = 0; - /* While we are here, give empty ANTIC_IN sets to each block. */ ANTIC_IN (block) = bitmap_set_new (); ! PA_IN (block) = bitmap_set_new (); } /* At the exit block we anticipate nothing. */ BB_VISITED (EXIT_BLOCK_PTR_FOR_FN (cfun)) = 1; sbitmap worklist = sbitmap_alloc (last_basic_block_for_fn (cfun) + 1); bitmap_ones (worklist); while (changed) --- 2363,2395 FOR_ALL_BB_FN (block, cfun) { + BB_VISITED (block) = 0; + FOR_EACH_EDGE (e, ei, block->preds) if (e->flags & EDGE_ABNORMAL) { bitmap_set_bit (has_abnormal_preds, block->index); + + /* We also anticipate nothing. */ + BB_VISITED (block) = 1; break; } /* While we are here, give empty ANTIC_IN sets to each block. */ ANTIC_IN (block) = bitmap_set_new (); ! if (do_partial_partial) ! PA_IN (block) = bitmap_set_new (); } /* At the exit block we anticipate nothing. */ BB_VISITED (EXIT_BLOCK_PTR_FOR_FN (cfun)) = 1; + /* For ANTIC computation we need a postorder that also guarantees that + a block with a single successor is visited after its successor. + RPO on the inverted CFG has this property. */ + int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); + int postorder_num = inverted_post_order_compute (postorder); + sbitmap worklist = sbitmap_alloc (last_basic_block_for_fn (cfun) + 1); bitmap_ones (worklist); while (changed) *** compute_antic (void) *** 2403,2441 if (do_partial_partial) { ! bitmap_ones (worklist); ! num_iterations = 0; ! changed = true; ! while (changed) { ! if (dump_file && (dump_flags & TDF_DETAILS)) ! fprintf (dump_file, "Starting iteration %d\n", num_iterations); ! num_iterations++; ! changed = false; ! for (i = postorder_num - 1 ; i >= 0; i--) ! { ! if (bitmap_bit_p (worklist, postorder[i])) ! { ! basic_block block = BASIC_BLOCK_FOR_FN (cfun, postorder[i]); ! bitmap_clear_bit (worklist, block->index); ! if (compute_partial_a
[PATCH] Fix PR70614
The following patch (the 2nd hunk actually) brings down compile-time for the testcase in the PR from to 0.5s. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2016-04-14 Richard Biener PR tree-optimization/70614 * tree-scalar-evolution.c (analyze_evolution_in_loop): Terminate loop if the evolution dropped to chrec_dont_know. (interpret_condition_phi): Likewise. Index: gcc/tree-scalar-evolution.c === *** gcc/tree-scalar-evolution.c (revision 234970) --- gcc/tree-scalar-evolution.c (working copy) *** analyze_evolution_in_loop (gphi *loop_ph *** 1510,1515 --- 1510,1518 /* When there are multiple back edges of the loop (which in fact never happens currently, but nevertheless), merge their evolutions. */ evolution_function = chrec_merge (evolution_function, ev_fn); + + if (evolution_function == chrec_dont_know) + break; } if (dump_file && (dump_flags & TDF_SCEV)) *** interpret_condition_phi (struct loop *lo *** 1687,1692 --- 1690,1697 (loop, PHI_ARG_DEF (condition_phi, i)); res = chrec_merge (res, branch_chrec); + if (res == chrec_dont_know) + break; } return res;
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On Thu, Apr 14, 2016 at 11:36 AM, Jakub Jelinek wrote: > On Thu, Apr 14, 2016 at 11:25:07AM +0100, Ramana Radhakrishnan wrote: >> > I see the test failing on aarch64-none-linux-gnu (native) >> > with no output, just: >> > spawn [open ...] >> > FAIL: g++.dg/abi/empty13.C -std=gnu++98 execution test >> > >> > And I see it passing on aarch64-none-elf. >> I don't understand why this passes on aarch64-none-elf btw. >> IIRC on AArch64 we document that empty classes and structs take up a >> slot as per the PCS . >> >> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf >> >> See section 3.1 >> >> For the purposes of parameter passing in >> [ >> AAPCS64 >> ], a parameter whose type is an empty class shall be treated as if its >> type were an aggregate with a >> single >> member of type unsigned byte. > > One thing is passing such arguments in registers, another is how they are > passed on the stack (e.g. if there are several other arguments that are > passed in registers). It would just be treated like a structure with an unsigned byte and treated the same way for parameter passing on the stack - i.e. IIRC take a slot of 8 bytes. > Also, note that the (new) definition of C++ empty class I believe includes > even empty classes that are normally very large, i.e. don't pretend they > are just single byte, but can be e.g. 1024 bytes or longer. > Do you want to still pass them in registers and/or on the stack as if > it is 1 byte, 0 bytes or more bytes? I don't know off-hand. Is this a valid example for what you have in mind ? struct baz { char a[1024]; }; struct foo : baz { }; int bar (struct foo b, int x) if so, that would get treated as though it is just a 1 byte quantity for C++. as per current behaviour by both gcc and llvm for aarch64. > How do you pass empty C structures? I don't think we do anything special for empty C structures which IIRC means they don't take a slot. > > E.g. looking at sparc-solaris2.12 cross, C for empty structures passes > both in registers and on the stack as if there is one argument (and C++ did > too for the empty classes that pretended to have size 1), so appart from the > ICE in emit_move_insn there is no ABI change, only perhaps for the larger > structs. > > Guess we need to figure out what do the various targets do for C empty > structures and decide what we do want to do with C++ empty classes. > Perhaps the C++ FE should just set some flag on the empty class parameters, > and let the backends decide what to do with those. Ramana > > Jakub
[PATCH] Fix PR70130
The following fixes PR70130 - improved SLP capabilities now run into the realignment code on ppc which doesn't properly verify that all vector loads emitted by vectorizable_load share the same alignment. Bootstrap / regtest pending on x86_64-unknown-linux-gnu. Bootstrapped / tested on ppc64le by Alan. Richard. 2016-04-14 Richard Biener Alan Modra PR tree-optimization/70130 * tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect when alignment stays not the same and no not use the realign scheme then. * gcc.dg/vect/O3-pr70130.c: New testcase. Index: gcc/tree-vect-data-refs.c === *** gcc/tree-vect-data-refs.c (revision 234970) --- gcc/tree-vect-data-refs.c (working copy) *** vect_supportable_dr_alignment (struct da *** 5983,5992 || targetm.vectorize.builtin_mask_for_load ())) { tree vectype = STMT_VINFO_VECTYPE (stmt_info); ! if ((nested_in_vect_loop ! && (TREE_INT_CST_LOW (DR_STEP (dr)) ! != GET_MODE_SIZE (TYPE_MODE (vectype ! || !loop_vinfo) return dr_explicit_realign; else return dr_explicit_realign_optimized; --- 5983,6001 || targetm.vectorize.builtin_mask_for_load ())) { tree vectype = STMT_VINFO_VECTYPE (stmt_info); ! ! /* If we are doing SLP then the accesses need not have the !same alignment, instead it depends on the SLP group size. */ ! if (loop_vinfo ! && STMT_SLP_TYPE (stmt_info) ! && (LOOP_VINFO_VECT_FACTOR (loop_vinfo) ! * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT (stmt_info ! % TYPE_VECTOR_SUBPARTS (vectype) != 0) ! ; ! else if (!loop_vinfo ! || (nested_in_vect_loop ! && (TREE_INT_CST_LOW (DR_STEP (dr)) ! != GET_MODE_SIZE (TYPE_MODE (vectype) return dr_explicit_realign; else return dr_explicit_realign_optimized; Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c === *** gcc/testsuite/gcc.dg/vect/O3-pr70130.c (revision 0) --- gcc/testsuite/gcc.dg/vect/O3-pr70130.c (working copy) *** *** 0 --- 1,94 + /* { dg-do run } */ + /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */ + /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */ + + struct foo + { + short a[3][16][16]; + short pad; + } images[8]; + + void __attribute__ ((noinline, noclone)) + Loop_err (struct foo *img, const int s[16][2], int s0) + { + int i, j; + + for (j = 0; j < 16; j++) + { + for (i=0; i < 16; i++) + { + img->a[0][j][i] = s[i][0]; + img->a[1][j][i] = s[j][1]; + img->a[2][j][i] = s0; + } + } + } + + const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 }, + { 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 }, + { 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 }, + { 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } }; + const struct foo expected + = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } }, + { { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 }, + { 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15 }, + { 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14 }, + { 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13 }, + { 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12 }, + { 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11 }, + { 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 }, + { 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On Thu, Apr 14, 2016 at 12:19:26PM +0100, Ramana Radhakrishnan wrote: > Is this a valid example for what you have in mind ? > > struct baz > { > char a[1024]; > }; > struct foo : baz > { > }; > > int bar (struct foo b, int x) No, I meant say: struct A {}; struct B { A a[1024]; }; int bar (struct B b, int c) { return c; } int baz (int a, int b, int c, int d, int e, int f, int g, struct B h, int i) { return g + i; } Strangely, we warn about this with -Wabi=9 even on x86_64-linux on both testcases, while only on baz it actually changes code generation. > if so, that would get treated as though it is just a 1 byte quantity > for C++. as per current behaviour by both gcc and llvm for aarch64. > > > How do you pass empty C structures? > > I don't think we do anything special for empty C structures which IIRC > means they don't take a slot. I think the intent of the C++ changes was to make them roughly match (at least for the empty classes where C++ says sizeof (...) == 1 while C says sizeof (...) == 0) what C does. Jakub
Re: RFC: C++ PATCH to adjust empty class parameter passing ABI
On Thu, Apr 14, 2016 at 12:25 PM, Jakub Jelinek wrote: > On Thu, Apr 14, 2016 at 12:19:26PM +0100, Ramana Radhakrishnan wrote: >> Is this a valid example for what you have in mind ? >> >> struct baz >> { >> char a[1024]; >> }; >> struct foo : baz >> { >> }; >> >> int bar (struct foo b, int x) > > No, I meant say: > struct A {}; > struct B { A a[1024]; }; > int bar (struct B b, int c) > { > return c; > } > int baz (int a, int b, int c, int d, int e, int f, int g, struct B h, int i) > { > return g + i; > } > > Strangely, we warn about this with -Wabi=9 even on x86_64-linux on both > testcases, while only on baz it actually changes code generation. > Seems to be treated as passing a 1 byte quantity before the change. Note you need one more parameter on aarch64 to force struct B into the stack. IIRC the same problem should happen on AArch32 as well but I haven't tried that yet. regards Ramana
Re: [PATCH] [AArch64] support -mfentry feature for arm64
On Mar 14, 2016, at 11:14 AM, Li Bin wrote: > > As ARM64 is entering enterprise world, machines can not be stopped for > some critical enterprise production environment, that is, live patch as > one of the RAS features is increasing more important for ARM64 arch now. > > Now, the mainstream live patch implementation which has been merged in > Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and > this feature needs the help of gcc. > > This patch proposes a generic solution for arm64 gcc which called mfentry, > following the example of x86, mips, s390, etc. and on these archs, this > feature has been used to implement the ftrace feature 'ftrace with regs' > to support live patch. > > By now, there is an another solution from linaro [1], which proposes to > implement a new option -fprolog-pad=N that generate a pad of N nops at the > beginning of each function. This solution is a arch-independent way for gcc, > but there may be some limitations which have not been recognized for Linux > kernel to adapt to this solution besides the discussion on [2] It appears that implementing -fprolog-pad=N option in GCC will not enable kernel live-patching support for AArch64. The proposal for the option was to make GCC output a given number of NOPs at the beginning of each function, and then the kernel could use that NOP pad to insert whatever instructions it needs. The modification of kernel instruction stream needs to be done atomically, and, unfortunately, it seems the kernel can use only architecture-provided atomicity primitives -- i.e., changing at most 8 bytes at a time. >From the kernel discussion thread it appears that the pad needs to be more >than 8 bytes, and that the kernel can't update that atomically. However if >-mfentry approach is used, then we need to update only 4 (or 8) bytes of the >pad, and we avoid the atomicity problem. Therefore, [unless there is a clever multi-stage update process to atomically change NOPs to whatever we need,] I think we have to go with Li's -mfentry approach. Comments? -- Maxim Kuvyrkov www.linaro.org > , typically > for powerpc archs. Furthermore I think there are no good reasons to promote > the other archs (such as x86) which have implemented the feature 'ftrace with > regs' > to replace the current method with the new option, which may bring heavily > target-dependent code adaption, as a result it becomes a arm64 dedicated > solution, leaving kernel with two different forms of implementation. > > [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html > [2] > http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
Re: [PATCH] Fix PR70130
On Thu, 2016-04-14 at 13:22 +0200, Richard Biener wrote: > The following fixes PR70130 - improved SLP capabilities now run into > the realignment code on ppc which doesn't properly verify that all > vector loads emitted by vectorizable_load share the same alignment. > > Bootstrap / regtest pending on x86_64-unknown-linux-gnu. > > Bootstrapped / tested on ppc64le by Alan. > > Richard. > > 2016-04-14 Richard Biener > Alan Modra > > PR tree-optimization/70130 > * tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect > when alignment stays not the same and no not use the realign > scheme then. > > * gcc.dg/vect/O3-pr70130.c: New testcase. > > Index: gcc/tree-vect-data-refs.c > === > *** gcc/tree-vect-data-refs.c (revision 234970) > --- gcc/tree-vect-data-refs.c (working copy) > *** vect_supportable_dr_alignment (struct da > *** 5983,5992 > || targetm.vectorize.builtin_mask_for_load ())) > { > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > ! if ((nested_in_vect_loop > !&& (TREE_INT_CST_LOW (DR_STEP (dr)) > !!= GET_MODE_SIZE (TYPE_MODE (vectype > ! || !loop_vinfo) > return dr_explicit_realign; > else > return dr_explicit_realign_optimized; > --- 5983,6001 > || targetm.vectorize.builtin_mask_for_load ())) > { > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > ! > ! /* If we are doing SLP then the accesses need not have the > ! same alignment, instead it depends on the SLP group size. */ > ! if (loop_vinfo > ! && STMT_SLP_TYPE (stmt_info) > ! && (LOOP_VINFO_VECT_FACTOR (loop_vinfo) > ! * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT > (stmt_info > ! % TYPE_VECTOR_SUBPARTS (vectype) != 0) Parentheses here look wrong. Should be one fewer ending paren on the "* GROUP_SIZE" line and one more on the following line, right? Bill > ! ; > ! else if (!loop_vinfo > !|| (nested_in_vect_loop > !&& (TREE_INT_CST_LOW (DR_STEP (dr)) > !!= GET_MODE_SIZE (TYPE_MODE (vectype) > return dr_explicit_realign; > else > return dr_explicit_realign_optimized; > Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c > === > *** gcc/testsuite/gcc.dg/vect/O3-pr70130.c(revision 0) > --- gcc/testsuite/gcc.dg/vect/O3-pr70130.c(working copy) > *** > *** 0 > --- 1,94 > + /* { dg-do run } */ > + /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */ > + /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */ > + > + struct foo > + { > + short a[3][16][16]; > + short pad; > + } images[8]; > + > + void __attribute__ ((noinline, noclone)) > + Loop_err (struct foo *img, const int s[16][2], int s0) > + { > + int i, j; > + > + for (j = 0; j < 16; j++) > + { > + for (i=0; i < 16; i++) > + { > + img->a[0][j][i] = s[i][0]; > + img->a[1][j][i] = s[j][1]; > + img->a[2][j][i] = s0; > + } > + } > + } > + > + const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 }, > +{ 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 }, > +{ 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 }, > +{ 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } }; > + const struct foo expected > + = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } }, > + { { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 }, > + { 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15 }, > + { 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14 }, > + { 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13
Re: [PATCH] [AArch64] support -mfentry feature for arm64
On Thu, Apr 14, 2016 at 9:08 PM, Maxim Kuvyrkov wrote: > On Mar 14, 2016, at 11:14 AM, Li Bin wrote: >> >> As ARM64 is entering enterprise world, machines can not be stopped for >> some critical enterprise production environment, that is, live patch as >> one of the RAS features is increasing more important for ARM64 arch now. >> >> Now, the mainstream live patch implementation which has been merged in >> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and >> this feature needs the help of gcc. >> >> This patch proposes a generic solution for arm64 gcc which called mfentry, >> following the example of x86, mips, s390, etc. and on these archs, this >> feature has been used to implement the ftrace feature 'ftrace with regs' >> to support live patch. >> >> By now, there is an another solution from linaro [1], which proposes to >> implement a new option -fprolog-pad=N that generate a pad of N nops at the >> beginning of each function. This solution is a arch-independent way for gcc, >> but there may be some limitations which have not been recognized for Linux >> kernel to adapt to this solution besides the discussion on [2] > > It appears that implementing -fprolog-pad=N option in GCC will not enable > kernel live-patching support for AArch64. The proposal for the option was to > make GCC output a given number of NOPs at the beginning of each function, and > then the kernel could use that NOP pad to insert whatever instructions it > needs. The modification of kernel instruction stream needs to be done > atomically, and, unfortunately, it seems the kernel can use only > architecture-provided atomicity primitives -- i.e., changing at most 8 bytes > at a time. > Can't we add a 16byte atomic primitive for ARM64 to the kernel? Though you need to align all functions to a 16 byte boundary if the -fprolog-pag=N needs to happen. Do you know what the size that needs to be modified? It does seem to be either 12 or 16 bytes. > From the kernel discussion thread it appears that the pad needs to be more > than 8 bytes, and that the kernel can't update that atomically. However if > -mfentry approach is used, then we need to update only 4 (or 8) bytes of the > pad, and we avoid the atomicity problem. I think you are incorrect, you could add a 16 byte atomic primitive if needed. > > Therefore, [unless there is a clever multi-stage update process to atomically > change NOPs to whatever we need,] I think we have to go with Li's -mfentry > approach. Please consider the above of having a 16 byte (128bit) atomic instructions be available would that be enough? Thanks, Andrew > > Comments? > > -- > Maxim Kuvyrkov > www.linaro.org > > >> , typically >> for powerpc archs. Furthermore I think there are no good reasons to promote >> the other archs (such as x86) which have implemented the feature 'ftrace >> with regs' >> to replace the current method with the new option, which may bring heavily >> target-dependent code adaption, as a result it becomes a arm64 dedicated >> solution, leaving kernel with two different forms of implementation. >> >> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html >> [2] >> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html >
Re: [PATCH] Fix PR70130
On Thu, 14 Apr 2016, Bill Schmidt wrote: > On Thu, 2016-04-14 at 13:22 +0200, Richard Biener wrote: > > The following fixes PR70130 - improved SLP capabilities now run into > > the realignment code on ppc which doesn't properly verify that all > > vector loads emitted by vectorizable_load share the same alignment. > > > > Bootstrap / regtest pending on x86_64-unknown-linux-gnu. > > > > Bootstrapped / tested on ppc64le by Alan. > > > > Richard. > > > > 2016-04-14 Richard Biener > > Alan Modra > > > > PR tree-optimization/70130 > > * tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect > > when alignment stays not the same and no not use the realign > > scheme then. > > > > * gcc.dg/vect/O3-pr70130.c: New testcase. > > > > Index: gcc/tree-vect-data-refs.c > > === > > *** gcc/tree-vect-data-refs.c (revision 234970) > > --- gcc/tree-vect-data-refs.c (working copy) > > *** vect_supportable_dr_alignment (struct da > > *** 5983,5992 > > || targetm.vectorize.builtin_mask_for_load ())) > > { > > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > > ! if ((nested_in_vect_loop > > ! && (TREE_INT_CST_LOW (DR_STEP (dr)) > > ! != GET_MODE_SIZE (TYPE_MODE (vectype > > ! || !loop_vinfo) > > return dr_explicit_realign; > > else > > return dr_explicit_realign_optimized; > > --- 5983,6001 > > || targetm.vectorize.builtin_mask_for_load ())) > > { > > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > > ! > > ! /* If we are doing SLP then the accesses need not have the > > !same alignment, instead it depends on the SLP group size. */ > > ! if (loop_vinfo > > ! && STMT_SLP_TYPE (stmt_info) > > ! && (LOOP_VINFO_VECT_FACTOR (loop_vinfo) > > ! * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT > > (stmt_info > > ! % TYPE_VECTOR_SUBPARTS (vectype) != 0) > > Parentheses here look wrong. Should be one fewer ending paren on the "* > GROUP_SIZE" line and one more on the following line, right? Yeah, though it shouldn't matter in practice. I'll fix things up before committing. Richard. > Bill > > > ! ; > > ! else if (!loop_vinfo > > ! || (nested_in_vect_loop > > ! && (TREE_INT_CST_LOW (DR_STEP (dr)) > > ! != GET_MODE_SIZE (TYPE_MODE (vectype) > > return dr_explicit_realign; > > else > > return dr_explicit_realign_optimized; > > Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c > > === > > *** gcc/testsuite/gcc.dg/vect/O3-pr70130.c (revision 0) > > --- gcc/testsuite/gcc.dg/vect/O3-pr70130.c (working copy) > > *** > > *** 0 > > --- 1,94 > > + /* { dg-do run } */ > > + /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */ > > + /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */ > > + > > + struct foo > > + { > > + short a[3][16][16]; > > + short pad; > > + } images[8]; > > + > > + void __attribute__ ((noinline, noclone)) > > + Loop_err (struct foo *img, const int s[16][2], int s0) > > + { > > + int i, j; > > + > > + for (j = 0; j < 16; j++) > > + { > > + for (i=0; i < 16; i++) > > + { > > + img->a[0][j][i] = s[i][0]; > > + img->a[1][j][i] = s[j][1]; > > + img->a[2][j][i] = s0; > > + } > > + } > > + } > > + > > + const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 }, > > + { 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 }, > > + { 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 }, > > + { 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } }; > > + const struct foo expected > > + = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }, > > + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } }, > > + { { 16, 16,
Re: [PATCH][AArch64] Backport of PR 70044 fix to GCC 5
Ping. https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00142.html Thanks, Kyrill On 04/04/16 10:10, Kyrill Tkachov wrote: Hi all, I'd like to backport Nicks' patch for PR 70044 to the GCC 5 branch. The patch doesn't apply cleanly because the aarch64_override_options_after_change and associated machinery was reworked for GCC 6. This is the (simple) backport of that patch to GCC 5. Bootstrapped and tested on aarch64-none-linux-gnu. Confirmed that the test fails before the patch and passes with it. Ok to commit? Thanks, Kyrill 2016-04-04 Nick Clifton Kyrylo Tkachov PR target/70044 * config/aarch64/aarch64.c (aarch64_override_options_after_change): When forcing flag_omit_frame_pointer to be true, use a special value that can be detected if this function is called again, thus preventing flag_omit_leaf_frame_pointer from being forced to be false. 2016-04-04 Kyrylo Tkachov Backport from mainline 2016-03-31 Nick Clifton PR target/70044 * gcc.target/aarch64/pr70044.c: New test.
Re: [PATCH] c++/70594 debug info differences
On 04/13/16 15:41, Jason Merrill wrote: The fini_constexpr stuff is OK immediately. As those two objects are currently GTY((deletable)) I don't think that's necessary. Have I missed something? The rest of the patch is OK if Jakub thinks it should go in, but if his approach addresses the problem, we might as well continue to GC the tables. Jakub? I see you've obscured the UIDs in the dump, addressed a LABELs issue, and found another problem. Do we still want DECL_UID stability? nathan
Re: [PATCH][AArch64] Backport of PR 70044 fix to GCC 5
On Thu, Apr 14, 2016 at 02:24:48PM +0100, Kyrill Tkachov wrote: > Ping. > https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00142.html > > Thanks, > Kyrill > On 04/04/16 10:10, Kyrill Tkachov wrote: > >Hi all, > > > >I'd like to backport Nicks' patch for PR 70044 to the GCC 5 branch. > >The patch doesn't apply cleanly because the > >aarch64_override_options_after_change and > >associated machinery was reworked for GCC 6. This is the (simple) backport > >of that > >patch to GCC 5. > > > >Bootstrapped and tested on aarch64-none-linux-gnu. Confirmed that the test > >fails before the patch > >and passes with it. > > > >Ok to commit? OK. Thanks, James
Re: Splitting up gcc/omp-low.c?
Hi! On Wed, 13 Apr 2016 20:20:19 +0200, Bernd Schmidt wrote: > On 04/13/2016 07:56 PM, Thomas Schwinge wrote: > > 0001-Split-up-gcc-omp-low.c-plain.patch.xz attached. > > That looks much better. However, the //OMPWHATEVER comments are not > really all that helpful. They're helpful in the word-diff patch, as they show where I moved stuff. I agree that they can't serve this purpose in the line-diff patch. > I think the next step would be to clear out all > this stuff including the OMPCUT markers I think the first step would be to get agreement on the several questions that I posted yesterday. That is, which stuff to move, into which files, how to structure it all, and so on. That's what I hoped the word-diff patch would help with. Everything else sounds a bit like busywork to me; that is: > start with an > initial patch that includes everything that actually modifies code > rather than moving it (omp_is_reference seems to be the major thing here). I don't know how meaningful it is to create/discuss/review/commit the following patch until the overall approach has been agreed upon? commit 8ac6b83350022c74bb2af8006e51f3620889e7c1 Author: Thomas Schwinge Date: Thu Apr 14 15:25:24 2016 +0200 gcc/omp-low.c:is_reference -> gcc/omp-low.h:omp_is_reference gcc/ * omp-low.h: Include "langhooks.h". Define omp_is_reference here, and... * omp-low.c: ... don't define is_reference here. Adjust all users. --- gcc/omp-low.c | 102 +++--- gcc/omp-low.h | 10 ++ 2 files changed, 57 insertions(+), 55 deletions(-) diff --git gcc/omp-low.c gcc/omp-low.c index 7335abc..bb9c61e 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -1039,21 +1039,13 @@ is_variable_sized (const_tree expr) return !TREE_CONSTANT (TYPE_SIZE_UNIT (TREE_TYPE (expr))); } -/* Return true if DECL is a reference type. */ - -static inline bool -is_reference (tree decl) -{ - return lang_hooks.decls.omp_privatize_by_reference (decl); -} - /* Return the type of a decl. If the decl is reference type, return its base type. */ static inline tree get_base_type (tree decl) { tree type = TREE_TYPE (decl); - if (is_reference (decl)) + if (omp_is_reference (decl)) type = TREE_TYPE (type); return type; } @@ -1348,7 +1340,7 @@ build_outer_var_ref (tree var, omp_context *ctx, bool lastprivate = false) } x = lookup_decl (var, outer); } - else if (is_reference (var)) + else if (omp_is_reference (var)) /* This can happen with orphaned constructs. If var is reference, it is possible it is shared and as such valid. */ x = var; @@ -1371,7 +1363,7 @@ build_outer_var_ref (tree var, omp_context *ctx, bool lastprivate = false) } } - if (is_reference (var)) + if (omp_is_reference (var)) x = build_simple_mem_ref (x); return x; @@ -1433,7 +1425,7 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx, if (base_pointers_restrict) type = build_qualified_type (type, TYPE_QUAL_RESTRICT); } - else if ((mask & 3) == 1 && is_reference (var)) + else if ((mask & 3) == 1 && omp_is_reference (var)) type = TREE_TYPE (type); field = build_decl (DECL_SOURCE_LOCATION (var), @@ -1904,7 +1896,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, if ((! TREE_READONLY (decl) && !OMP_CLAUSE_SHARED_READONLY (c)) || TREE_ADDRESSABLE (decl) || by_ref - || is_reference (decl)) + || omp_is_reference (decl)) { by_ref = use_pointer_for_field (decl, ctx); install_var_field (decl, by_ref, 3, ctx); @@ -1954,7 +1946,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, && is_gimple_omp_offloaded (ctx->stmt)) { if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE) - install_var_field (decl, !is_reference (decl), 3, ctx); + install_var_field (decl, !omp_is_reference (decl), 3, ctx); else if (TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE) install_var_field (decl, true, 3, ctx); else @@ -1973,7 +1965,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, by_ref = use_pointer_for_field (decl, NULL); if (is_task_ctx (ctx) - && (global || by_ref || is_reference (decl))) + && (global || by_ref || omp_is_reference (decl))) { install_var_field (decl, false, 1, ctx); if (!global) @@ -4673,10 +4665,10 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, tree ref = build_outer_var_ref (var, ctx); /* For ref build_outer_var_ref already performs this. */ if (TREE_CODE (d) == INDIRECT_REF) - gcc_assert (is_reference (var)); +
Re: [PATCH] c++/70594 debug info differences
On 04/14/2016 09:25 AM, Nathan Sidwell wrote: On 04/13/16 15:41, Jason Merrill wrote: The fini_constexpr stuff is OK immediately. As those two objects are currently GTY((deletable)) I don't think that's necessary. Have I missed something? True, I suppose it doesn't make much difference. Jason
Re: [PATCH] c++/70594 debug info differences
On Thu, Apr 14, 2016 at 09:43:26AM -0400, Jason Merrill wrote: > On 04/14/2016 09:25 AM, Nathan Sidwell wrote: > >On 04/13/16 15:41, Jason Merrill wrote: > > > >>The fini_constexpr stuff is OK immediately. > > > >As those two objects are currently GTY((deletable)) I don't think that's > >necessary. Have I missed something? > > True, I suppose it doesn't make much difference. Right now our debugging shows that the remaining issue is just an unrelated ipa-devirt bug (where the aggressive pruning of BLOCK trees with -g0 results in different devirt decisions from non-pruned one). So we hopefully can get away with deletable. Jakub
PATCH to disable the canonical types check in verify_type (PR c++/70029)
Looking at this PR again, seems we have reached conclusion that the way forward for GCC 6 is to temporarily disable the check, so I'm posting a patch for that, so as to finally resolve this PR. The problem is that the C++ FE violates the check when it sets FUNCTION_*_QUALIFIED flags. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2016-04-14 Marek Polacek Jan Hubicka PR c++/70029 * tree.c (verify_type): Disable the canonical type of main variant check. * g++.dg/torture/pr70029.C: New test. diff --git gcc/testsuite/g++.dg/torture/pr70029.C gcc/testsuite/g++.dg/torture/pr70029.C index e69de29..9592f0c 100644 --- gcc/testsuite/g++.dg/torture/pr70029.C +++ gcc/testsuite/g++.dg/torture/pr70029.C @@ -0,0 +1,12 @@ +// PR c++/70029 +// { dg-do compile } +// { dg-options "-std=c++11 -g -flto" } +// { dg-require-effective-target lto } + +struct A +{ + A(); + int foo() && __attribute__ ((__warn_unused_result__)) { return 0; } +}; + +A a; diff --git gcc/tree.c gcc/tree.c index ed28429..c64d720 100644 --- gcc/tree.c +++ gcc/tree.c @@ -13584,7 +13584,9 @@ verify_type (const_tree t) debug_tree (ct); error_found = true; } - if (TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) + /* FIXME: this is violated by the C++ FE as discussed in PR70029, when + FUNCTION_*_QUALIFIED flags are set. */ + if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) { error ("TYPE_CANONICAL of main variant is not main variant"); debug_tree (ct); Marek
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On 04/14/2016 10:30 AM, Marek Polacek wrote: + /* FIXME: this is violated by the C++ FE as discussed in PR70029, when + FUNCTION_*_QUALIFIED flags are set. */ + if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) How about guarding this check with flag_checking rather than disabling it entirely? That way it won't affect released compilers, and we can downgrade the PR from P1, but doesn't hide the bug. Jason
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote: > On 04/14/2016 10:30 AM, Marek Polacek wrote: > >+ /* FIXME: this is violated by the C++ FE as discussed in PR70029, when > >+ FUNCTION_*_QUALIFIED flags are set. */ > >+ if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) > > How about guarding this check with flag_checking rather than disabling it > entirely? That way it won't affect released compilers, and we can downgrade > the PR from P1, but doesn't hide the bug. That will still mean people who use -fchecking will keep reporting such ICEs. I think it is better to disable it and reenable after GCC 6 branches. Jakub
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On 04/14/2016 11:05 AM, Jakub Jelinek wrote: On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote: On 04/14/2016 10:30 AM, Marek Polacek wrote: + /* FIXME: this is violated by the C++ FE as discussed in PR70029, when + FUNCTION_*_QUALIFIED flags are set. */ + if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) How about guarding this check with flag_checking rather than disabling it entirely? That way it won't affect released compilers, and we can downgrade the PR from P1, but doesn't hide the bug. That will still mean people who use -fchecking will keep reporting such ICEs. I think it is better to disable it and reenable after GCC 6 branches. OK. Jason
Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements
On 04/14/2016 04:39 AM, Andreas Schwab wrote: Martin Sebor writes: diff --git a/gcc/testsuite/g++.dg/cpp1y/vla11.C b/gcc/testsuite/g++.dg/cpp1y/vla11.C new file mode 100644 index 000..af9624a --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/vla11.C @@ -0,0 +1,711 @@ +// PR c++/69517 - [5/6 regression] SEGV on a VLA with excess initializer +// elements +// PR c++/70019 - VLA size overflow not detected +// +// Runtime test to verify that attempting to either construct a VLA with +// erroneous bounds, or initialize one with an initializer-list that +// contains more elements than the VLA's non-constant (runtime) bounds +// causes an exception to be thrown. Test also verifies that valid +// VLAs and their initializers don't cause such an exception. + +// { dg-do run { target c++11 } } +// { dg-additional-options "-Wno-vla" } On m68k: /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In instantiation of 'struct TestType<32u>': /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1: required from here /daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: requested alignment 32 is larger than 16 [-Wattributes] Thank you for the heads up (and sorry about the breakage). I've committed r234976 to fix that. Martin
Re: [PATCH] [CLEANUP] Mark locally-used functions static
Hi! On Tue, 15 Apr 2014 09:58:41 +0200, Richard Biener wrote: > On Mon, Apr 14, 2014 at 4:48 PM, Patrick Palka wrote: > > This patch marks "static" a bunch of locally-used, non-debug functions > > within the GCC sources. Doing so addresses a subset of the warnings emitted > > when compiling the GCC sources with -Wmissing-declarations. > > > > I bootstrapped and regtested this change on x86_64-unknown-linux-gnu. > > The gcc/ parts are ok. It seems this hasn't been committed. I just re-discovered that gcc/omp-low.c:simd_clone_vector_of_formal_parm_types could/should be static but isn't. Instead of now resolving that one individually: Patrick, are you going to resolve the whole lot (in next stage 1, I suppose)? For reference: > > 2014-04-13 Patrick Palka > > > > gcc/c/ > > * c-array-notation.c (replace_invariant_exprs): Make static. > > > > gcc/cp/ > > * class.c (inherit_targ_abi_tags): Make static. > > * cp-array-notation.c (create_cmp_incr): Likewise. > > * mangle.c (tree_string_cmp): Likewise. > > * pt.c (fixed_parameter_pack_p): Likewise. > > * vtable-class-hierarchy.c > > (vtv_register_class_hierarchy_information): Likewise. > > > > gcc/fortran/ > > * class.c (gfc_intrinsic_hash_value): Make static. > > * trans-expr.c (gfc_conv_intrinsic_to_class): Likewise. > > > > gcc/ > > * asan.c (asan_mem_ref_get_end, replace_invariant_exprs): Make > > static. > > * cgraphclones.c (redirect_edge_duplicating_thunks): Likewise. > > * cgraphunit.c (decide_is_symbol_needed): Likewise. > > * config/i386/i386.c (make_pass_insert_vzeroupper, > > ix86_avx_emit_vzeroupper): Likewise. > > * dwarf2out.c (init_addr_table_entry): Likewise. > > * gengtype.c (get_string_option, already_seen_tag, > > mark_tag_as_seen): Likewise. > > * gimple-ssa-isolate-paths (isolate_path): Likewise. > > * graphite.c (graphite_transform_loops): Likewise. > > * internal-fn.c > > (ubsan_expand_si_overflow_{addsub,neg,mul}_check): Likewise. > > * ipa-devirt.c (hash_type_name, likely_target_p): Likewise. > > * ipa-inline-analysis.c (simple_edge_hints): Likewise. > > * ipa-profile.c (cmp_counts, contains_hot_call_p): Likewise. > > * ipa-prop.c (ipa_alloc_node_params, > > write_agg_replacement_chain): Likewise. > > * ipa.c (can_replace_by_local_alias): Likewise. > > * lto-streamer-out.c (output_spmbol_p): Likewise. > > * omp-low.c (simd_clone_vector_of_formal_parm_types): Likewise. > > * tree-inline.c (redirect_all_calls, freqs_to_counts): Likewise. > > * tree-predcom.c (tree_predictive_commoning): Likewise. > > * tree-sra.c (ipa_sra_modify_function_body): Likewise. > > * tree-ssa-loop-im.c (movement_possibilyt, tree_ssa_lim): > > Likewise. > > * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables, > > tree_unroll_loops_completely): Likewise. > > * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise. > > * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise. > > * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): > > Likewise. > > * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): > > Likewise. > > * ubsan.c (tree_type_map_hash): Likewise. > > * varpool.c (varpool_call_variable_insertion_hooks): Likewise. > > > > libiberty/ > > * make-temp-file.c (choose_tmpdir): Make static. > > --- > > gcc/asan.c | 4 ++-- > > gcc/c/c-array-notation.c| 2 +- > > gcc/cgraphclones.c | 2 +- > > gcc/cgraphunit.c| 2 +- > > gcc/config/i386/i386.c | 4 ++-- > > gcc/cp/class.c | 2 +- > > gcc/cp/cp-array-notation.c | 2 +- > > gcc/cp/mangle.c | 2 +- > > gcc/cp/pt.c | 2 +- > > gcc/cp/vtable-class-hierarchy.c | 2 +- > > gcc/dwarf2out.c | 2 +- > > gcc/fortran/class.c | 2 +- > > gcc/fortran/trans-expr.c| 2 +- > > gcc/gengtype.c | 6 +++--- > > gcc/gimple-ssa-isolate-paths.c | 2 +- > > gcc/graphite.c | 2 +- > > gcc/internal-fn.c | 6 +++--- > > gcc/ipa-devirt.c| 4 ++-- > > gcc/ipa-inline-analysis.c | 2 +- > > gcc/ipa-profile.c | 4 ++-- > > gcc/ipa-prop.c | 4 ++-- > > gcc/ipa.c | 2 +- > > gcc/lto-streamer-out.c | 2 +- > > gcc/omp-low.c | 2 +- > > gcc/tree-inline.c | 4 ++-- > > gcc/tree-predcom.c | 2 +- > > gcc/tree-sra.c | 2 +- > > gcc/tree-ssa-loop-im.c | 4 ++-- > > gcc/tree-ssa-loop-ivcanon.c | 4 ++-- > > gcc/tree-ssa-loop-prefetch.c| 2 +- > > gcc/tree-ssa-loop-unswitch.c
Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements
It caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70652 Sorry about that (I missed java among the languages when configuring my build for regression testing). I'm testing a libjava patch that I expect will eliminate the linker error. Martin
Re: rs6000 stack_tie mishap again
On 04/11/2016 12:15 PM, Olivier Hainque wrote: > >> On Mar 28, 2016, at 19:41 , Richard Henderson wrote: >> >> But I expect for stage4, the best solution is to strengthen the stack_tie >> pattern to block all memory. Early scheduling of the stack frame >> deallocation (a simple logic insn) can't really be that important to >> performance. > > Something like the attached patch, at least for next stage1 until the > more general issue is sorted out ? > > Only basic testing at this point. I can do more if considered appropriate. > > Thanks for your feedback, > > With Kind Regards, > > Olivier > > * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Add a > (clobber mem:BLK (scratch)) to the set of effects, blocking > all memory accesses. We have the same problem on S/390 and I'm also looking for a way to solve that. I'm not sure about the (clobber (mem)) approach. Wouldn't it be theoretically possible to move a memory write over a (clobber (mem))? The patch I'm currently testing follows a proposal from Joseph Myers in 2011: https://gcc.gnu.org/ml/gcc-patches/2011-11/msg00977.html + [(set (mem:BLK (scratch)) + (unspec:BLK [(match_operand:BLK 0 "memory_operand" "+m")] UNSPEC_TIE))] What I'm wondering is whether the memory read is actually needed here?! Wouldn't the following have the same effect? [(set (mem:BLK (scratch)) (unspec:BLK [(const_int 0)] UNSPEC_TIE))] To my understanding this should block memory reads and writes from being moved across. The MEM probably should be set up to be in the frame alias set? Bye, -Andreas-
Re: [PATCH] [AArch64] support -mfentry feature for arm64
On 14/04/16 14:15, Andrew Pinski wrote: > On Thu, Apr 14, 2016 at 9:08 PM, Maxim Kuvyrkov > wrote: >> On Mar 14, 2016, at 11:14 AM, Li Bin wrote: >>> >>> As ARM64 is entering enterprise world, machines can not be stopped for >>> some critical enterprise production environment, that is, live patch as >>> one of the RAS features is increasing more important for ARM64 arch now. >>> >>> Now, the mainstream live patch implementation which has been merged in >>> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and >>> this feature needs the help of gcc. >>> >>> This patch proposes a generic solution for arm64 gcc which called mfentry, >>> following the example of x86, mips, s390, etc. and on these archs, this >>> feature has been used to implement the ftrace feature 'ftrace with regs' >>> to support live patch. >>> >>> By now, there is an another solution from linaro [1], which proposes to >>> implement a new option -fprolog-pad=N that generate a pad of N nops at the >>> beginning of each function. This solution is a arch-independent way for gcc, >>> but there may be some limitations which have not been recognized for Linux >>> kernel to adapt to this solution besides the discussion on [2] >> >> It appears that implementing -fprolog-pad=N option in GCC will not enable >> kernel live-patching support for AArch64. The proposal for the option was >> to make GCC output a given number of NOPs at the beginning of each function, >> and then the kernel could use that NOP pad to insert whatever instructions >> it needs. The modification of kernel instruction stream needs to be done >> atomically, and, unfortunately, it seems the kernel can use only >> architecture-provided atomicity primitives -- i.e., changing at most 8 bytes >> at a time. >> > > Can't we add a 16byte atomic primitive for ARM64 to the kernel? > Though you need to align all functions to a 16 byte boundary if the > -fprolog-pag=N needs to happen. Do you know what the size that needs > to be modified? It does seem to be either 12 or 16 bytes. > looking at [2] i don't see why func: mov x9, x30 bl _tracefunc is not good for the kernel. mov x9, x30 is a nop at function entry, so in theory 4 byte atomic write should be enough to enable/disable tracing. >> From the kernel discussion thread it appears that the pad needs to be more >> than 8 bytes, and that the kernel can't update that atomically. However if >> -mfentry approach is used, then we need to update only 4 (or 8) bytes of the >> pad, and we avoid the atomicity problem. > > I think you are incorrect, you could add a 16 byte atomic primitive if needed. > >> >> Therefore, [unless there is a clever multi-stage update process to >> atomically change NOPs to whatever we need,] I think we have to go with Li's >> -mfentry approach. > > Please consider the above of having a 16 byte (128bit) atomic > instructions be available would that be enough? > > Thanks, > Andrew > >> >> Comments? >> >> -- >> Maxim Kuvyrkov >> www.linaro.org >> >> >>> , typically >>> for powerpc archs. Furthermore I think there are no good reasons to promote >>> the other archs (such as x86) which have implemented the feature 'ftrace >>> with regs' >>> to replace the current method with the new option, which may bring heavily >>> target-dependent code adaption, as a result it becomes a arm64 dedicated >>> solution, leaving kernel with two different forms of implementation. >>> >>> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html >>> [2] >>> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html >> >
Re: Splitting up gcc/omp-low.c?
On 04/14/2016 03:36 PM, Thomas Schwinge wrote: I don't know how meaningful it is to create/discuss/review/commit the following patch until the overall approach has been agreed upon? Why not? We agree the file should be split, and this makes it easier. So I'll approve it for stage1, or earlier if RMs want this split in gcc-6. You could however probably also remove the #include langhooks from the C file. Bernd
Re: Splitting up gcc/omp-low.c?
Hi! On Wed, 13 Apr 2016 18:01:09 +0200, I wrote: > On Fri, 08 Apr 2016 11:36:03 +0200, I wrote: > > On Thu, 10 Dec 2015 09:08:35 +0100, Jakub Jelinek wrote: > > > On Wed, Dec 09, 2015 at 06:23:22PM +0100, Bernd Schmidt wrote: > > > > On 12/09/2015 05:24 PM, Thomas Schwinge wrote: > > > > >how about we split up gcc/omp-low.c into several > > > > >files? Would it make sense (I have not yet looked in detail) to do so > > > > >along the borders of the several passes defined therein? > > > > > I suspect a split along the ompexp/omplow boundary would be quite easy > > > > to > > > > achieve. > > And possibly some kind of omp-simd.c, and omp-checking.c, and so > > on, if feasible. > > Not yet looked into these. ..., and here they are: word-diff patches creating omp-diagnostics.c and omp-simd.c, 0001-new-file-gcc-omp-diagnostics.c.patch.xz and 0002-new-file-gcc-omp-simd.c.patch.xz. The former contains the "diagnose_omp_blocks" pass and the latter the "simdclone" pass, with the respective supporting code. I will certainly submit line-diff patches if we agree that this is sound -- these two may actually be good candidates to do first, individually, and do that now, because they're completely self-contained. Makes sense? Should possibly rename omp-simd.c to omp-simd-clone.c to make it clear that's the only thing it does, the "simdclone" pass? Should we maybe rename omp-diagnostics.c to omp-structured-blocks.c, and really only have it contain that "diagnose_omp_blocks" pass, or should we move other diagnostic stuff like check_omp_nesting_restrictions into that file, too? One //OMPTODO: --- gcc/omp-low.h +++ gcc/omp-low.h @@ -95,0 +96,11 @@ extern gimple *build_omp_barrier (tree); +//OMPTODO: moved from omp-low.c. Renamed from WALK_SUBSTMTS to OMP_WALK_SUBSTMTS because of the very generic name. Used in omp-low.c and omp-diagnostics.c. Alternatively, WALK_SUBSTMTS should perhaps simply be duplicated in the two files? +#define OMP_WALK_SUBSTMTS \ +case GIMPLE_BIND: \ +case GIMPLE_TRY: \ +case GIMPLE_CATCH: \ +case GIMPLE_EH_FILTER: \ +case GIMPLE_TRANSACTION: \ + /* The sub-statements for these should be walked. */ \ + *handled_ops_p = false; \ + break; Instead of that, duplicate WALK_SUBSTMTS into both omp-low.c and omp-diagnostics.c? Grüße Thomas 0001-new-file-gcc-omp-diagnostics.c.patch.xz Description: application/xz 0002-new-file-gcc-omp-simd.c.patch.xz Description: application/xz
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On Thu, Apr 14, 2016 at 11:18:59AM -0400, Jason Merrill wrote: > On 04/14/2016 11:05 AM, Jakub Jelinek wrote: > >On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote: > >>On 04/14/2016 10:30 AM, Marek Polacek wrote: > >>>+ /* FIXME: this is violated by the C++ FE as discussed in PR70029, when > >>>+ FUNCTION_*_QUALIFIED flags are set. */ > >>>+ if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != > >>>ct) > >> > >>How about guarding this check with flag_checking rather than disabling it > >>entirely? That way it won't affect released compilers, and we can downgrade > >>the PR from P1, but doesn't hide the bug. > > > >That will still mean people who use -fchecking will keep reporting such > >ICEs. I think it is better to disable it and reenable after GCC 6 branches. > > OK. Should I consider the patch approved or do we want to hear from Honza? Marek
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On 04/14/2016 10:12 AM, Marek Polacek wrote: On Thu, Apr 14, 2016 at 11:18:59AM -0400, Jason Merrill wrote: On 04/14/2016 11:05 AM, Jakub Jelinek wrote: On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote: On 04/14/2016 10:30 AM, Marek Polacek wrote: + /* FIXME: this is violated by the C++ FE as discussed in PR70029, when + FUNCTION_*_QUALIFIED flags are set. */ + if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct) How about guarding this check with flag_checking rather than disabling it entirely? That way it won't affect released compilers, and we can downgrade the PR from P1, but doesn't hide the bug. That will still mean people who use -fchecking will keep reporting such ICEs. I think it is better to disable it and reenable after GCC 6 branches. OK. Should I consider the patch approved or do we want to hear from Honza? Given the BZ & list discussion, I'd consider the patch approved. I *think* the way to deal with the BZ is to change the regression marker to 7 and the target milestone as well. I think leaving it as a P1 would be fine as that'll force revisiting in the gcc-7 timeframe. jeff
Re: rs6000 stack_tie mishap again
On 04/14/2016 09:47 AM, Andreas Krebbel wrote: On 04/11/2016 12:15 PM, Olivier Hainque wrote: On Mar 28, 2016, at 19:41 , Richard Henderson wrote: But I expect for stage4, the best solution is to strengthen the stack_tie pattern to block all memory. Early scheduling of the stack frame deallocation (a simple logic insn) can't really be that important to performance. Something like the attached patch, at least for next stage1 until the more general issue is sorted out ? Only basic testing at this point. I can do more if considered appropriate. Thanks for your feedback, With Kind Regards, Olivier * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Add a (clobber mem:BLK (scratch)) to the set of effects, blocking all memory accesses. We have the same problem on S/390 and I'm also looking for a way to solve that. I'm not sure about the (clobber (mem)) approach. Wouldn't it be theoretically possible to move a memory write over a (clobber (mem))? I thought we had code to handle this case specially, but I can't immediately find it in sched-deps.c. Some ports also use an unspec_volatile to achieve the same effect: ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and ;; all of memory. This blocks insns from being moved across this point. (define_insn "blockage" [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)] "" "" [(set_attr "length" "0")]) ;; Do not schedule instructions accessing memory across this point. (define_expand "memory_blockage" [(set (match_dup 0) (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] "" { operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); MEM_VOLATILE_P (operands[0]) = 1; }) (define_insn "*memory_blockage" [(set (match_operand:BLK 0) (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] "" "" [(set_attr "length" "0")]) Jeff
Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)
On Thu, Apr 14, 2016 at 10:40:43AM -0600, Jeff Law wrote: > Given the BZ & list discussion, I'd consider the patch approved. > > I *think* the way to deal with the BZ is to change the regression marker to > 7 and the target milestone as well. I think leaving it as a P1 would be > fine as that'll force revisiting in the gcc-7 timeframe. Thanks, I've done so now. Marek
Re: rs6000 stack_tie mishap again
> On 14 Apr 2016, at 18:50, Jeff Law wrote: > > I thought we had code to handle this case specially, but I can't immediately > find it in sched-deps.c. Unless I misunderstood what you were exactly looking for, the special code is in alias.c. E.g. write_dependence_p: /* (mem:BLK (scratch)) is a special mechanism to conflict with everything. This is used in epilogue deallocation functions. */ ... > Some ports also use an unspec_volatile to achieve the same effect: > > ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and > ;; all of memory. This blocks insns from being moved across this point. > > (define_insn "blockage" > [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)] > "" > "" > [(set_attr "length" "0")]) Yes, I pondered this one and thought that a memory barrier would be less aggressive, while good enough. > ;; Do not schedule instructions accessing memory across this point. > > (define_expand "memory_blockage" > [(set (match_dup 0) >(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] > "" > { > operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); > MEM_VOLATILE_P (operands[0]) = 1; > }) > > (define_insn "*memory_blockage" > [(set (match_operand:BLK 0) >(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] > "" > "" > [(set_attr "length" "0")]) Emitting just that alone isn't enough though. The first attempt I made did that and failed because the whole reg-restore reg-restore tie (mem blockage only) sequence was allowed to move past the stack pointer reset when it's performed as a mere register to register move. So I ended up adding the clobber mem:blk scratch to the existing tie parallel, which references the stack pointer register as well so is forbidden to move past it's update. Olivier
Re: Fix some x86 peepholes vs. the red-zone
On 04/12/2016 05:57 AM, Bernd Schmidt wrote: With some changes I was working on, I produced a situation where one of the x86 peepholes made an incorrect transformation. In the prologue expansion code, we have /* When using red zone we may start register saving before allocating the stack frame saving one cycle of the prologue. However, avoid doing this if we have to probe the stack; at least on x86_64 the stack probe can turn into a call that clobbers a red zone location. So, we can in some situations produce something like: mov %ebx, -8(%rsp) mov %edi, -16(%rsp) subl $16, %rsp which is fine if using a redzone. The problem is that there are peepholes which convert sp subtractions into pushes, clobbering the saved registers. I've made these peepholes conditional on !x86_using_red_zone. Bootstrapped and tested on x86_64-linux, ok (now or later)? OK for stage1. Uros's call whether or not to OK for the trunk right now. jeff
[PATCH] Fix -fcompare-debug due to incorrect block pruning (PR c++/70594)
Hi! On a testcase Tobias provided privately there is a -fcompare-debug failure due to different cgraph node->order values in the printout. The problem is in different result from the noncall_stmt_may_be_vtbl_ptr_store function on a store, which is caused by too aggressive block pruning with -g0. The noncall_stmt_may_be_vtbl_ptr_store relies on enough BLOCKS to be preserved, such that if block_ultimate_origin of some BLOCK (gimple_block or up through its BLOCK_SUPERCONTEXTs) is a FUNCTION_DECL, then it is a cdtor if the stmt is originally inlined from a cdtor, or is some other FUNCTION_DECL if it is not originally inlined from a cdtor. To achieve that, tree-ssa-live.c attempts to keep the blocks with block_ultimate_origin of FUNCTION_DECL which is cdtor, and nearest block with block_ultimate_origin of FUNCTION_DECL that is not cdtor below the cdtor one. There is: /* For ipa-polymorphic-call.c purposes, preserve blocks: 1) with BLOCK_ABSTRACT_ORIGIN of a ctor/dtor or their clones */ if (inlined_polymorphic_ctor_dtor_block_p (scope, true)) { in_ctor_dtor_block = true; unused = false; } /* 2) inside such blocks, the outermost block with BLOCK_ABSTRACT_ORIGIN being a FUNCTION_DECL. */ else if (in_ctor_dtor_block && BLOCK_ABSTRACT_ORIGIN (scope) && TREE_CODE (BLOCK_ABSTRACT_ORIGIN (scope)) == FUNCTION_DECL) { in_ctor_dtor_block = false; unused = false; } which is roughly right, but there are 2 bugs, both of which affect the testcase I've looked at. 1) the above works well if a non-cdtor function is inlined into cdtor and the cdtor is then inlined into some other function and pruning happens only afterwards, but if a non-cdtor function is inlined into cdtor, then pruning happens on the cdtor, the block with block_ultimate_origin of FUNCTION_DECL can be pruned, and then finally the cdtor function is inlined into some other one and noncall_stmt_may_be_vtbl_ptr_store depends on -g 2) another issue is that noncall_stmt_may_be_vtbl_ptr_store uses block_ultimate_origin, and for the cdtor blocks we use it in remove_unused_scope_block_p too (because inlined_polymorphic_ctor_dtor_block_p calls it), but for the other blocks we just check BLOCK_ABSTRACT_ORIGIN if it is FUNCTION_DECL. If it isn't a FUNCTION_DECL, but block_ultimate_origin is FUNCTION_DECL, we still prune it, even when we shouldn't. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-04-14 Jakub Jelinek PR c++/70594 * ipa-utils.h (polymorphic_ctor_dtor_p): New prototype. * ipa-polymorphic-call.c (polymorphic_ctor_dtor_p): New function. (inlined_polymorphic_ctor_dtor_block_p): Use it. * tree-ssa-live.c (remove_unused_scope_block_p): When in_ctor_dtor_block, avoid discarding not just BLOCKs with BLOCK_ABSTRACT_ORIGIN being FUNCTION_DECL, but even when block_ultimate_origin is FUNCTION_DECL. (remove_unused_locals): If current_function_decl is polymorphic_ctor_dtor_p, pass initial true to remove_unused_scope_block_p' is_ctor_dtor_block. --- gcc/ipa-utils.h.jj 2016-01-04 14:55:51.0 +0100 +++ gcc/ipa-utils.h 2016-04-14 16:46:08.828444152 +0200 @@ -70,6 +70,7 @@ void dump_possible_polymorphic_call_targ bool possible_polymorphic_call_target_p (tree, HOST_WIDE_INT, const ipa_polymorphic_call_context &, struct cgraph_node *); +tree polymorphic_ctor_dtor_p (tree, bool); tree inlined_polymorphic_ctor_dtor_block_p (tree, bool); bool decl_maybe_in_construction_p (tree, tree, gimple *, tree); tree vtable_pointer_value_to_binfo (const_tree); --- gcc/ipa-polymorphic-call.c.jj 2016-03-30 16:00:17.0 +0200 +++ gcc/ipa-polymorphic-call.c 2016-04-14 16:45:45.407754387 +0200 @@ -479,16 +479,12 @@ contains_type_p (tree outer_type, HOST_W } -/* Return a FUNCTION_DECL if BLOCK represents a constructor or destructor. +/* Return a FUNCTION_DECL if FN represent a constructor or destructor. If CHECK_CLONES is true, also check for clones of ctor/dtors. */ tree -inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones) +polymorphic_ctor_dtor_p (tree fn, bool check_clones) { - tree fn = block_ultimate_origin (block); - if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL) -return NULL_TREE; - if (TREE_CODE (TREE_TYPE (fn)) != METHOD_TYPE || (!DECL_CXX_CONSTRUCTOR_P (fn) && !DECL_CXX_DESTRUCTOR_P (fn))) { @@ -510,6 +506,19 @@ inlined_polymorphic_ctor_dtor_block_p (t return fn; } +/* Return a FUNCTION_DECL if BLOCK represents a constructor or destructor. + If CHECK_CLONES is true, also check for clones of ctor/dtors. */ + +tree +inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones) +{ + tree fn = block_ultimate_origin (block); + if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL) +return NULL_TREE; + + retur
Re: [PATCH] Fix -fcompare-debug due to incorrect block pruning (PR c++/70594)
On April 14, 2016 8:31:32 PM GMT+02:00, Jakub Jelinek wrote: >Hi! > >On a testcase Tobias provided privately there is a -fcompare-debug >failure due to different cgraph node->order values in the printout. >The problem is in different result from the >noncall_stmt_may_be_vtbl_ptr_store function on a store, which is caused >by too aggressive block pruning with -g0. > >The noncall_stmt_may_be_vtbl_ptr_store relies on enough BLOCKS to be >preserved, such that if block_ultimate_origin of some BLOCK >(gimple_block or >up through its BLOCK_SUPERCONTEXTs) is a FUNCTION_DECL, then it is >a cdtor if the stmt is originally inlined from a cdtor, or is some >other FUNCTION_DECL if it is not originally inlined from a cdtor. >To achieve that, tree-ssa-live.c attempts to keep the blocks >with block_ultimate_origin of FUNCTION_DECL which is cdtor, >and nearest block with block_ultimate_origin of FUNCTION_DECL >that is not cdtor below the cdtor one. > >There is: > /* For ipa-polymorphic-call.c purposes, preserve blocks: > 1) with BLOCK_ABSTRACT_ORIGIN of a ctor/dtor or their clones */ > if (inlined_polymorphic_ctor_dtor_block_p (scope, true)) >{ > in_ctor_dtor_block = true; > unused = false; >} >/* 2) inside such blocks, the outermost block with >BLOCK_ABSTRACT_ORIGIN > being a FUNCTION_DECL. */ > else if (in_ctor_dtor_block > && BLOCK_ABSTRACT_ORIGIN (scope) > && TREE_CODE (BLOCK_ABSTRACT_ORIGIN (scope)) == FUNCTION_DECL) >{ > in_ctor_dtor_block = false; > unused = false; >} >which is roughly right, but there are 2 bugs, both of which affect >the testcase I've looked at. > >1) the above works well if a non-cdtor function is inlined into cdtor >and the cdtor is then inlined into some other function and pruning >happens >only afterwards, but if a non-cdtor function is inlined into cdtor, >then >pruning happens on the cdtor, the block with block_ultimate_origin >of FUNCTION_DECL can be pruned, and then finally the cdtor function is >inlined into some other one and noncall_stmt_may_be_vtbl_ptr_store >depends >on -g > >2) another issue is that noncall_stmt_may_be_vtbl_ptr_store uses >block_ultimate_origin, and for the cdtor blocks we use it in >remove_unused_scope_block_p too (because >inlined_polymorphic_ctor_dtor_block_p calls it), but for the other >blocks we just check BLOCK_ABSTRACT_ORIGIN if it is FUNCTION_DECL. >If it isn't a FUNCTION_DECL, but block_ultimate_origin is >FUNCTION_DECL, >we still prune it, even when we shouldn't. > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Thanks, Richard. >2016-04-14 Jakub Jelinek > > PR c++/70594 > * ipa-utils.h (polymorphic_ctor_dtor_p): New prototype. > * ipa-polymorphic-call.c (polymorphic_ctor_dtor_p): New function. > (inlined_polymorphic_ctor_dtor_block_p): Use it. > * tree-ssa-live.c (remove_unused_scope_block_p): When > in_ctor_dtor_block, avoid discarding not just BLOCKs with > BLOCK_ABSTRACT_ORIGIN being FUNCTION_DECL, but even when > block_ultimate_origin is FUNCTION_DECL. > (remove_unused_locals): If current_function_decl is > polymorphic_ctor_dtor_p, pass initial true to > remove_unused_scope_block_p' is_ctor_dtor_block. > >--- gcc/ipa-utils.h.jj 2016-01-04 14:55:51.0 +0100 >+++ gcc/ipa-utils.h2016-04-14 16:46:08.828444152 +0200 >@@ -70,6 +70,7 @@ void dump_possible_polymorphic_call_targ > bool possible_polymorphic_call_target_p (tree, HOST_WIDE_INT, >const ipa_polymorphic_call_context &, >struct cgraph_node *); >+tree polymorphic_ctor_dtor_p (tree, bool); > tree inlined_polymorphic_ctor_dtor_block_p (tree, bool); > bool decl_maybe_in_construction_p (tree, tree, gimple *, tree); > tree vtable_pointer_value_to_binfo (const_tree); >--- gcc/ipa-polymorphic-call.c.jj 2016-03-30 16:00:17.0 +0200 >+++ gcc/ipa-polymorphic-call.c 2016-04-14 16:45:45.407754387 +0200 >@@ -479,16 +479,12 @@ contains_type_p (tree outer_type, HOST_W > } > > >-/* Return a FUNCTION_DECL if BLOCK represents a constructor or >destructor. >+/* Return a FUNCTION_DECL if FN represent a constructor or destructor. >If CHECK_CLONES is true, also check for clones of ctor/dtors. */ > > tree >-inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones) >+polymorphic_ctor_dtor_p (tree fn, bool check_clones) > { >- tree fn = block_ultimate_origin (block); >- if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL) >-return NULL_TREE; >- > if (TREE_CODE (TREE_TYPE (fn)) != METHOD_TYPE > || (!DECL_CXX_CONSTRUCTOR_P (fn) && !DECL_CXX_DESTRUCTOR_P (fn))) > { >@@ -510,6 +506,19 @@ inlined_polymorphic_ctor_dtor_block_p (t > return fn; > } > >+/* Return a FUNCTION_DECL if BLOCK represents a constructor or >destructor. >+ If CHECK_CLONES is true, also check for clones of ctor/dtors. */ >+ >+tree >+inlined_polymorphic_ctor
C++ PATCH for c++/70648 (wrong error with constexpr constructor)
When we overwrite one CONSTRUCTOR with another, we need to transfer CONSTRUCTOR_NO_IMPLICIT_ZERO as well. Tested x86_64-pc-linux-gnu, applying to trunk. commit aa2f91a9e517f9351ae9624d35b27f6a25fc1b97 Author: Jason Merrill Date: Thu Apr 14 13:08:20 2016 -0400 PR c++/70648 * constexpr.c (cxx_eval_store_expression): Also copy CONSTRUCTOR_NO_IMPLICIT_ZERO. diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index 37cc336..4abff20 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -3149,6 +3149,8 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t, CONSTRUCTOR_ELTS (*valp) = CONSTRUCTOR_ELTS (init); TREE_CONSTANT (*valp) = TREE_CONSTANT (init); TREE_SIDE_EFFECTS (*valp) = TREE_SIDE_EFFECTS (init); + CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) + = CONSTRUCTOR_NO_IMPLICIT_ZERO (init); } else *valp = init; diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C new file mode 100644 index 000..c12347d --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C @@ -0,0 +1,11 @@ +// PR c++/70648 +// { dg-do compile { target c++11 } } + +struct C +{ + template + constexpr C (...) : c { static_cast(0)... } {} + int c[1]; +}; + +static constexpr int b = C{}.c[0];
C++ PATCH for c++/70543 (wrong error with static const as array bound)
When we consider whether the initializer for a constant variable makes it value-dependent, we should check type-dependence as well as value-dependence. Tested x86_64-pc-linux-gnu, applying to trunk. commit 4af96532b224029a20b192be2e7cedc60b9d12d0 Author: Jason Merrill Date: Thu Apr 14 13:23:44 2016 -0400 PR c++/70543 * pt.c (value_dependent_expression_p) [VAR_DECL]: A type-dependent initializer also makes the variable value-dependent. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d066e55..4a00530 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -22670,6 +22670,7 @@ value_dependent_expression_p (tree expression) && (TREE_CODE (DECL_INITIAL (expression)) == TREE_LIST /* cp_finish_decl doesn't fold reference initializers. */ || TREE_CODE (TREE_TYPE (expression)) == REFERENCE_TYPE + || type_dependent_expression_p (DECL_INITIAL (expression)) || value_dependent_expression_p (DECL_INITIAL (expression return true; return false; diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C new file mode 100644 index 000..2ca641d --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C @@ -0,0 +1,17 @@ +// PR c++/70543 +// { dg-do compile { target c++11 } } + +template +struct X +{ + template + static constexpr int + calc (void) + { +return 0; + } + + static constexpr unsigned int value = calc (); + + char foo[value]; +};
Re: [PATCH #2], PR target/70640, Fix thinkos in PowerPC IEEE 128-bit floating point emulation
After I applied the fix for PR 70640, one the regression testers reported the test as an error. This system is a power7 system that does not have an asembler that understands the power8/power9 instructions. This means that -mcpu=power8 will only generate power7 code. The difference for this test is that on power7, it generates a vspltisw instruciton to create a register with all 1's set, while on power8 it generates xxlorc instead (vspltisw is restricted to the Altivec registers, while xxlorc can target all VSX registers, so it is preferred for power8 code). Since it was in a new test and obvious, I went ahead and commited the following patch: 2016-04-14 Michael Meissner PR target/70640 * gcc.target/powerpc/pr70640.c: Fix test so it correctly works on a power7 system that does not have an assembler that supports power8. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/testsuite/gcc.target/powerpc/pr70640.c === --- gcc/testsuite/gcc.target/powerpc/pr70640.c (revision 234980) +++ gcc/testsuite/gcc.target/powerpc/pr70640.c (working copy) @@ -5,7 +5,7 @@ __float128 foo (__float128 a) { return -a; } -/* { dg-final { scan-assembler "xxlorc" } } */ +/* { dg-final { scan-assembler "xxlorc\|vspltisw" } } */ /* { dg-final { scan-assembler "xxlxor" } } */ /* { dg-final { scan-assembler "vslb" } } */ /* { dg-final { scan-assembler "vsldoi" } } */
C++ PATCH for c++/70622 (wrong error with multiple auto)
My change to how we check for consistency of multiple declarators for the same auto type-specifier failed to consider declarators of different forms; we need to compare only the part of the declared type that corresponds to the type-specifier. Tested x86_64-pc-linux-gnu, applying to trunk. commit 824d6696c29b11ff8015f632c58272685c0bbfe6 Author: Jason Merrill Date: Thu Apr 14 15:05:44 2016 -0400 PR c++/70622 * parser.c (cp_parser_init_declarator): Add auto_result parm. (cp_parser_simple_declaration): Pass it. (strip_declarator_types): New. diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 00e211e..cba2d65 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -2193,7 +2193,7 @@ static tree cp_parser_decltype static tree cp_parser_init_declarator (cp_parser *, cp_decl_specifier_seq *, vec *, - bool, bool, int, bool *, tree *, location_t *); + bool, bool, int, bool *, tree *, location_t *, tree *); static cp_declarator *cp_parser_declarator (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool, bool); static cp_declarator *cp_parser_direct_declarator @@ -12337,10 +12337,9 @@ cp_parser_simple_declaration (cp_parser* parser, && !cp_parser_error_occurred (parser)) cp_parser_commit_to_tentative_parse (parser); - tree last_type, auto_node; + tree last_type; last_type = NULL_TREE; - auto_node = type_uses_auto (decl_specifiers.type); /* Keep going until we hit the `;' at the end of the simple declaration. */ @@ -12351,6 +12350,7 @@ cp_parser_simple_declaration (cp_parser* parser, cp_token *token; bool function_definition_p; tree decl; + tree auto_result = NULL_TREE; if (saw_declarator) { @@ -12376,7 +12376,8 @@ cp_parser_simple_declaration (cp_parser* parser, declares_class_or_enum, &function_definition_p, maybe_range_for_decl, - &init_loc); + &init_loc, + &auto_result); /* If an error occurred while parsing tentatively, exit quickly. (That usually happens when in the body of a function; each statement is treated as a declaration-statement until proven @@ -12384,10 +12385,10 @@ cp_parser_simple_declaration (cp_parser* parser, if (cp_parser_error_occurred (parser)) goto done; - if (auto_node) + if (auto_result) { - tree type = TREE_TYPE (decl); - if (last_type && !same_type_p (type, last_type)) + if (last_type && last_type != error_mark_node + && !same_type_p (auto_result, last_type)) { /* If the list of declarators contains more than one declarator, the type of each declared variable is determined as described @@ -12395,10 +12396,11 @@ cp_parser_simple_declaration (cp_parser* parser, the same in each deduction, the program is ill-formed. */ error_at (decl_specifiers.locations[ds_type_spec], "inconsistent deduction for %qT: %qT and then %qT", - decl_specifiers.type, last_type, type); - auto_node = NULL_TREE; + decl_specifiers.type, last_type, auto_result); + last_type = error_mark_node; } - last_type = type; + else + last_type = auto_result; } /* Handle function definitions specially. */ @@ -18221,6 +18223,31 @@ cp_parser_asm_definition (cp_parser* parser) } } +/* Given the type TYPE of a declaration with declarator DECLARATOR, return the + type that comes from the decl-specifier-seq. */ + +static tree +strip_declarator_types (tree type, cp_declarator *declarator) +{ + for (cp_declarator *d = declarator; d;) +switch (d->kind) + { + case cdk_id: + case cdk_error: + d = NULL; + break; + + default: + if (TYPE_PTRMEMFUNC_P (type)) + type = TYPE_PTRMEMFUNC_FN_TYPE (type); + type = TREE_TYPE (type); + d = d->declarator; + break; + } + + return type; +} + /* Declarators [gram.dcl.decl] */ /* Parse an init-declarator. @@ -18286,7 +18313,8 @@ cp_parser_init_declarator (cp_parser* parser, int declares_class_or_enum, bool* function_definition_p, tree* maybe_range_for_decl, - location_t* init_loc) + location_t* init_loc, + tree* auto_result) { cp_token *token = NULL, *asm_spec_start_token = NULL, *attributes_start_token = NULL; @@ -18677,6 +18705,10 @@ cp_parser_init_declarator (cp_parser* parser, finish_fully_implicit_template (parser, /*member_decl_opt=*/0); } + if (auto_result && is_initialized && decl_specifiers->type + && type_uses_auto (decl_specifiers->type)) +*auto_result = strip_declarator_types (TREE_TYPE (decl), declarator); + return decl; } @@ -25808,7 +25840,7 @@ cp_parser_single_declaration (cp_parser* parser, member_p, declares_class_or_enum, &function_definition_p, - NULL, NULL); + NULL, NULL, NULL); /* 7.1.1-1 [dcl.stc] diff --git a/gcc/testsuite/g++.dg/cpp0x/auto47.C b/gcc/testsuite/g++.dg/cpp0x/auto47.C new file mode 100644 index 000..0d80be6 --- /d
[patch] fix an openacc test case
This patch fixes a segfault in libgomp.oacc-fortran/non-scalar-data.f90. The problem here is that 'n' is a parameter, and the kernels region implicitly adds a copy clause to n. Naturally, the test segfaults when it comes time to write the value back to the host as the kernels region terminates. This problem only occurs on nvptx targets. I'll apply this patch to trunk as obvious. Cesar 2016-04-14 Cesar Philippidis libgomp/ * testsuite/libgomp.oacc-fortran/non-scalar-data.f90: Don't pass parameter variables to subroutines. diff --git a/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 b/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 index 4afb562..94e4228 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 @@ -6,9 +6,11 @@ program main implicit none - integer, parameter :: n = 100 - integer :: array(n), i - + integer,parameter :: size = 100 + integer :: array(size), i, n + + n = size + !$acc data copy(array) call kernels(array, n)
Re: rs6000 stack_tie mishap again
On Mon, Apr 11, 2016 at 12:15:10PM +0200, Olivier Hainque wrote: > > > On Mar 28, 2016, at 19:41 , Richard Henderson wrote: > > > > But I expect for stage4, the best solution is to strengthen the stack_tie > > pattern to block all memory. Early scheduling of the stack frame > > deallocation (a simple logic insn) can't really be that important to > > performance. > > Something like the attached patch, at least for next stage1 until the > more general issue is sorted out ? It's a bit heavy; as a workaround, we may want this clobber in the stack deallocation in the epilogue, but not in all other places stack_tie is used. And for stage 1 we do not really want a workaround, we want to fix the actual problem ;-) Segher
[PATCH] Fix PR target/70669 (allow __float128 to use direct move)
When adding the basic __float128 support, I forgot to enable direct move support for moving __float128 between VSX registers and GPR registers. This patch enables using direct move for __float128 variables on Power8 systems. I bootstrapped the compiler and found no regressions with this patch. Is it ok to apply to the GCC trunk? [gcc] 2016-04-14 Michael Meissner PR target/70669 * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Add direct move handlers for KFmode. Change TFmode handlers test from FLOAT128_IEEE_P to FLOAT128_VECTOR_P. [gcc/testsuite] 2016-04-14 Michael Meissner PR target/70669 * gcc.target/powerpc/pr70669.c: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 234910) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -3132,8 +3132,6 @@ rs6000_init_hard_regno_mode_ok (bool glo reg_addr[V4SFmode].reload_load = CODE_FOR_reload_v4sf_di_load; reg_addr[V2DFmode].reload_store = CODE_FOR_reload_v2df_di_store; reg_addr[V2DFmode].reload_load = CODE_FOR_reload_v2df_di_load; - reg_addr[KFmode].reload_store= CODE_FOR_reload_kf_di_store; - reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_di_load; reg_addr[DFmode].reload_store= CODE_FOR_reload_df_di_store; reg_addr[DFmode].reload_load = CODE_FOR_reload_df_di_load; reg_addr[DDmode].reload_store= CODE_FOR_reload_dd_di_store; @@ -3141,7 +3139,13 @@ rs6000_init_hard_regno_mode_ok (bool glo reg_addr[SFmode].reload_store= CODE_FOR_reload_sf_di_store; reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_di_load; - if (FLOAT128_IEEE_P (TFmode)) + if (FLOAT128_VECTOR_P (KFmode)) + { + reg_addr[KFmode].reload_store = CODE_FOR_reload_kf_di_store; + reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_di_load; + } + + if (FLOAT128_VECTOR_P (TFmode)) { reg_addr[TFmode].reload_store = CODE_FOR_reload_tf_di_store; reg_addr[TFmode].reload_load = CODE_FOR_reload_tf_di_load; @@ -3182,6 +3186,18 @@ rs6000_init_hard_regno_mode_ok (bool glo reg_addr[V8HImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv8hi; reg_addr[V16QImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv16qi; reg_addr[SFmode].reload_vsx_gpr= CODE_FOR_reload_vsx_from_gprsf; + + if (FLOAT128_VECTOR_P (KFmode)) + { + reg_addr[KFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxkf; + reg_addr[KFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprkf; + } + + if (FLOAT128_VECTOR_P (TFmode)) + { + reg_addr[TFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxtf; + reg_addr[TFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprtf; + } } } else @@ -3200,8 +3216,6 @@ rs6000_init_hard_regno_mode_ok (bool glo reg_addr[V4SFmode].reload_load = CODE_FOR_reload_v4sf_si_load; reg_addr[V2DFmode].reload_store = CODE_FOR_reload_v2df_si_store; reg_addr[V2DFmode].reload_load = CODE_FOR_reload_v2df_si_load; - reg_addr[KFmode].reload_store= CODE_FOR_reload_kf_si_store; - reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_si_load; reg_addr[DFmode].reload_store= CODE_FOR_reload_df_si_store; reg_addr[DFmode].reload_load = CODE_FOR_reload_df_si_load; reg_addr[DDmode].reload_store= CODE_FOR_reload_dd_si_store; @@ -3209,6 +3223,12 @@ rs6000_init_hard_regno_mode_ok (bool glo reg_addr[SFmode].reload_store= CODE_FOR_reload_sf_si_store; reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_si_load; + if (FLOAT128_VECTOR_P (KFmode)) + { + reg_addr[KFmode].reload_store = CODE_FOR_reload_kf_si_store; + reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_si_load; + } + if (FLOAT128_IEEE_P (TFmode)) { reg_addr[TFmode].reload_store = CODE_FOR_reload_tf_si_store; Index: gcc/testsuite/gcc.target/powerpc/pr70669.c === --- gcc/testsuite/gcc.target/powerpc/pr70669.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr70669.c (working copy) @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-O2 -mcpu=power8 -
Re: [PATCH] Fix PR target/70669 (allow __float128 to use direct move)
On Thu, Apr 14, 2016 at 6:43 PM, Michael Meissner wrote: > When adding the basic __float128 support, I forgot to enable direct move > support for moving __float128 between VSX registers and GPR registers. > > This patch enables using direct move for __float128 variables on Power8 > systems. I bootstrapped the compiler and found no regressions with this > patch. Is it ok to apply to the GCC trunk? > > [gcc] > 2016-04-14 Michael Meissner > > PR target/70669 > * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Add > direct move handlers for KFmode. Change TFmode handlers test from > FLOAT128_IEEE_P to FLOAT128_VECTOR_P. > > [gcc/testsuite] > 2016-04-14 Michael Meissner > > PR target/70669 > * gcc.target/powerpc/pr70669.c: New test. Okay. Thanks, David
Re: [PATCH] rtlanal: Fix bits/bytes confusion in set_noop_p (PR68814)
Hi, This now also shows up on GCC 5, see PR70672. It there happens in the jump1 pass already. Is this okay to backport to 5 and 4.9? Segher On Wed, Dec 09, 2015 at 08:17:51PM +, Segher Boessenkool wrote: > The meaning of ZERO_EXTRACT depends on BITS_BIG_ENDIAN, not on > BYTES_BIG_ENDIAN. This caused PR68814. > > Testing in progress on powerpc64le-linux; if it passes, is this > okay for trunk? > > > Segher > > > 2015-12-09 Segher Boessenkool > > PR rtl-optimization/68814 > * rtlanal.c (set_noop_p): Use BITS_BIG_ENDIAN instead of > BYTES_BIG_ENDIAN. > --- > gcc/rtlanal.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c > index 67098e5..f893bca 100644 > --- a/gcc/rtlanal.c > +++ b/gcc/rtlanal.c > @@ -1534,7 +1534,7 @@ set_noop_p (const_rtx set) > >if (GET_CODE (dst) == ZERO_EXTRACT) > return rtx_equal_p (XEXP (dst, 0), src) > -&& ! BYTES_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx > +&& !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx > && !side_effects_p (src); > >if (GET_CODE (dst) == STRICT_LOW_PART) > -- > 1.9.3
C++ PATCH for c++/70528 (mutual dependence error)
The problem with this testcase was that in order to determine whether J is a literal type, we needed to implicitly declare its default constructor, which calls the H default constructor, and instantiating the template arguments for H() requires the K default constructor, which needs the First default mem-initializer, which we haven't parsed yet. We can break this chain by just treating the implicitly-declared default constructor as constexpr. This doesn't break anything in the regression testsuite: tests for non-literal type diagnostics still pass, just because of a member/base type being non-literal rather than because of the default constructor, which actually produces better diagnostics. This is also a direction that Core has been discussing; the notion of literal type is only a convenience for diagnostics, it's not important to get it right at the cost of excessive instantiation. Tested x86_64-pc-linux-gnu, applying to trunk. commit 743489d435da4976c6ecdd66468b85b3ef9a1968 Author: Jason Merrill Date: Thu Apr 7 07:45:03 2016 -0400 PR c++/70528 * class.c (type_has_constexpr_default_constructor): Return true for an implicitly declared constructor. diff --git a/gcc/cp/class.c b/gcc/cp/class.c index 02a992f..e6d5bb0 100644 --- a/gcc/cp/class.c +++ b/gcc/cp/class.c @@ -3346,7 +3346,6 @@ add_implicitly_declared_members (tree t, tree* access_decls, CLASSTYPE_LAZY_DEFAULT_CTOR (t) = 1; if (cxx_dialect >= cxx11) TYPE_HAS_CONSTEXPR_CTOR (t) - /* This might force the declaration. */ = type_has_constexpr_default_constructor (t); } @@ -5349,8 +5348,11 @@ type_has_constexpr_default_constructor (tree t) { if (!TYPE_HAS_COMPLEX_DFLT (t)) return trivial_default_constructor_is_constexpr (t); - /* Non-trivial, we need to check subobject constructors. */ - lazily_declare_fn (sfk_constructor, t); + /* Assume it's constexpr to avoid unnecessary instantiation; if the + definition would have made the class non-literal, it will still be + non-literal because of the base or member in question, and that + gives a better diagnostic. */ + return true; } fns = locate_ctor (t); return (fns && DECL_DECLARED_CONSTEXPR_P (fns)); diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C index 45e1ed2..42ca30a 100644 --- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C @@ -3,6 +3,7 @@ template struct C { + C() = default; constexpr C(const Tp& r) { } }; diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C index a67505d..8d352d0 100644 --- a/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C @@ -7,6 +7,6 @@ struct A { struct B: A { }; constexpr int f(B b) { return b.i; } -struct C { C(); }; // { dg-message "calls non-constexpr" } -struct D: C { }; // { dg-message "no constexpr constructor" } +struct C { C(); }; // { dg-message "" } +struct D: C { }; // { dg-message "" } constexpr int g(D d) { return 42; } // { dg-error "invalid type" } diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C index 30e0a64..bf95b24 100644 --- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C @@ -6,6 +6,6 @@ struct A A(int); }; -struct B : A {}; // { dg-error "no matching" } +struct B : A {}; // { dg-message "" } constexpr int foo(B) { return 0; } // { dg-error "invalid type" } diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C b/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C index 7a22f88..7e2d58b 100644 --- a/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C +++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C @@ -8,7 +8,7 @@ struct A struct B : A { - using A::A; // { dg-error "inherited" } + using A::A; }; constexpr B b; // { dg-error "literal" } diff --git a/gcc/testsuite/g++.dg/cpp0x/pr70528.C b/gcc/testsuite/g++.dg/cpp0x/pr70528.C new file mode 100644 index 000..af1c84e --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/pr70528.C @@ -0,0 +1,16 @@ +// PR c++/70258 +// { dg-do compile { target c++11 } } + +template +struct H +{ + template + H (); +}; + +struct J { + struct K { +int First = 0; + }; + H FunctionMDInfo; +};
C++ PATCH for c++/70494 (ICE with lambda capture of array)
split_nonconstant_init_1 didn't know how to deal with arrays when building a cleanup. Rather than add that logic there, we can use cxx_maybe_build_cleanup, which does understand arrays, once we adjust it to handle non-VAR_DECLs. Tested x86_64-pc-linux-gnu, applying to trunk. commit 4c385da85bf8f9a1c0f26b25441aa2d182a2c02d Author: Jason Merrill Date: Thu Apr 14 23:58:39 2016 -0400 PR c++/70494 * decl.c (cxx_maybe_build_cleanup): Handle non-decls. * typeck2.c (split_nonconstant_init_1): Use it. diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 380bc79..38e6bd8 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -15021,7 +15021,8 @@ complete_vars (tree type) /* If DECL is of a type which needs a cleanup, build and return an expression to perform that cleanup here. Return NULL_TREE if no - cleanup need be done. */ + cleanup need be done. DECL can also be a _REF when called from + split_nonconstant_init_1. */ tree cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain) @@ -15039,7 +15040,10 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain) /* Handle "__attribute__((cleanup))". We run the cleanup function before the destructor since the destructor is what actually terminates the lifetime of the object. */ - attr = lookup_attribute ("cleanup", DECL_ATTRIBUTES (decl)); + if (DECL_P (decl)) +attr = lookup_attribute ("cleanup", DECL_ATTRIBUTES (decl)); + else +attr = NULL_TREE; if (attr) { tree id; @@ -15098,6 +15102,7 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain) protected_set_expr_location (cleanup, UNKNOWN_LOCATION); if (cleanup + && DECL_P (decl) && !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TREE_TYPE (decl))) /* Treat objects with destructors as used; the destructor may do something substantive. */ diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c index b921689..e59ad51 100644 --- a/gcc/cp/typeck2.c +++ b/gcc/cp/typeck2.c @@ -688,14 +688,9 @@ split_nonconstant_init_1 (tree dest, tree init) code = build_stmt (input_location, EXPR_STMT, code); code = maybe_cleanup_point_expr_void (code); add_stmt (code); - if (type_build_dtor_call (inner_type)) - { - code = (build_special_member_call - (sub, complete_dtor_identifier, NULL, inner_type, - LOOKUP_NORMAL, tf_warning_or_error)); - if (!TYPE_HAS_TRIVIAL_DESTRUCTOR (inner_type)) - finish_eh_cleanup (code); - } + if (tree cleanup + = cxx_maybe_build_cleanup (sub, tf_warning_or_error)) + finish_eh_cleanup (cleanup); } num_split_elts++; diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C new file mode 100644 index 000..d0063e1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C @@ -0,0 +1,10 @@ +// PR c++/70494 +// { dg-do compile { target c++11 } } + +struct A { ~A(); }; + +int main() +{ + A v[] = { A(), A() }; + auto lambda = [v]{}; +}
Re: rs6000 stack_tie mishap again
On 04/14/2016 07:10 PM, Olivier Hainque wrote: > >> On 14 Apr 2016, at 18:50, Jeff Law wrote: >> >> I thought we had code to handle this case specially, but I can't immediately >> find it in sched-deps.c. > > Unless I misunderstood what you were exactly looking for, > the special code is in alias.c. E.g. write_dependence_p: > > /* (mem:BLK (scratch)) is a special mechanism to conflict with everything. > This is used in epilogue deallocation functions. */ > ... Ok thanks. I've verified that the dependencies are also generated when using this expression in a clobber. > Emitting just that alone isn't enough though. The first attempt > I made did that and failed because the whole > > reg-restore > reg-restore > tie (mem blockage only) > > sequence was allowed to move past the stack pointer reset when > it's performed as a mere register to register move. > > So I ended up adding the clobber mem:blk scratch to the existing > tie parallel, which references the stack pointer register as > well so is forbidden to move past it's update. I've seen the same on S/390 when restoring the stack pointer from a floating point reg. But I think it is not limited to reg-reg stack pointer restores. Potentially this could also happen with the stack pointer decrement in the prologue which could also get moved across a mem only barrier. Bye, -Andreas-
Re: Split out OMP constructs' SIMD clone supporting code
Hi! On Thu, 14 Apr 2016 22:27:40 +0200, I wrote: > On Thu, 14 Apr 2016 18:01:13 +0200, I wrote: > > "simdclone" pass, with the > > respective supporting code. I will certainly submit line-diff patches if > > we agree that this is sound -- these two may actually be good candidates > > to do first, individually, and do that now, because they're completely > > self-contained. Makes sense? > > ;-) Made enough sense to me, so that I prepared the attached patch. I'm > also attaching a "-C" variant that I created using Git's -C5% option, and > which shows how the new file gcc/omp-simd-clone.c can be created from the > original gcc/omp-low.c. (Is that useful for review, or are you manually > doing something like that anyway?) > > > Should possibly rename omp-simd.c to omp-simd-clone.c to make it clear > > that's the only thing it does, the "simdclone" pass? > > I did that. > > I manually determined a reduced #include list for the new file > gcc/omp-simd-clone.c. Hope that's alright. > > OK to commit once bootstrap testing succeeded? Bootstrap/testing look good. OK to commit? > commit 8f33dc59ad24a995694d42ee9013e0853426e190 > Author: Thomas Schwinge > Date: Thu Apr 14 21:56:31 2016 +0200 > > Split out OMP constructs' SIMD clone supporting code > > gcc/ > * omp-low.c (simd_clone_struct_alloc, simd_clone_struct_copy) > (simd_clone_vector_of_formal_parm_types) > (simd_clone_clauses_extract, simd_clone_compute_base_data_type) > (simd_clone_mangle, simd_clone_create) > (simd_clone_adjust_return_type, create_tmp_simd_array) > (simd_clone_adjust_argument_types, simd_clone_init_simd_arrays) > (struct modify_stmt_info, ipa_simd_modify_stmt_ops) > (ipa_simd_modify_function_body, simd_clone_linear_addend) > (simd_clone_adjust, expand_simd_clones, ipa_omp_simd_clone) > (pass_data_omp_simd_clone, class pass_omp_simd_clone) > (pass_omp_simd_clone::gate, make_pass_omp_simd_clone): Move > into... > * omp-simd-clone.c: ... this new file. > (simd_clone_vector_of_formal_parm_types): Make it static. > * Makefile.in (OBJS): Add omp-simd-clone.o. > > gcc/Makefile.in |1 + > gcc/omp-low.c| 1606 > gcc/omp-simd-clone.c | 1654 > ++ > 3 files changed, 1655 insertions(+), 1606 deletions(-) Grüße Thomas