[libstdc++] Assertion in optional
Hello,

this patch adds two simple __glibcxx_assert checks to optional that match
the precondition stated in the comment above them. I am not sure if there
was a reason the author wrote that comment instead of the assertion, but
constexpr use still seems to work.

I hesitated about having the assertion in operator*, etc, so that the
error message would be clearer, but we would still be missing the key
information of where this function was called from (in user code), so the
real solution would be for __glibcxx_assert to print (a few lines of) a
stack trace, an unrelated issue.

Bootstrap+testsuite on powerpc64le-unknown-linux-gnu.

2017-05-15  Marc Glisse

	* include/std/optional (_Optional_base::_M_get): Check precondition.
	* testsuite/20_util/optional/cons/value_neg.cc: Update line numbers.

-- 
Marc Glisse

Index: include/std/optional
===
--- include/std/optional (revision 248008)
+++ include/std/optional (working copy)
@@ -379,25 +379,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	}
 
       // The following functionality is also needed by optional, hence the
       // protected accessibility.
     protected:
       constexpr bool _M_is_engaged() const noexcept
       { return this->_M_payload._M_engaged; }
 
       // The _M_get operations have _M_engaged as a precondition.
       constexpr _Tp&
       _M_get() noexcept
-      { return this->_M_payload._M_payload; }
+      {
+	__glibcxx_assert(_M_is_engaged());
+	return this->_M_payload._M_payload;
+      }
 
       constexpr const _Tp&
       _M_get() const noexcept
-      { return this->_M_payload._M_payload; }
+      {
+	__glibcxx_assert(_M_is_engaged());
+	return this->_M_payload._M_payload;
+      }
 
       // The _M_construct operation has !_M_engaged as a precondition
       // while _M_destruct has _M_engaged as a precondition.
       template<typename... _Args>
 	void
 	_M_construct(_Args&&... __args)
 	noexcept(is_nothrow_constructible<_Stored_type, _Args...>())
 	{
 	  ::new (std::__addressof(this->_M_payload._M_payload))
 	    _Stored_type(std::forward<_Args>(__args)...);
Index: testsuite/20_util/optional/cons/value_neg.cc
===
--- testsuite/20_util/optional/cons/value_neg.cc (revision 248008)
+++ testsuite/20_util/optional/cons/value_neg.cc (working copy)
@@ -30,15 +30,15 @@ int main()
     struct X
     {
       explicit X(int) {}
     };
     std::optional<X> ox{42};
     std::optional<X> ox2 = 42; // { dg-error "conversion" }
     std::optional<std::unique_ptr<int>> oup{new int};
     std::optional<std::unique_ptr<int>> oup2 = new int; // { dg-error "conversion" }
     struct U { explicit U(std::in_place_t); };
     std::optional<U> ou(std::in_place); // { dg-error "no matching" }
-// { dg-error "no type" "" { target { *-*-* } } 487 }
-// { dg-error "no type" "" { target { *-*-* } } 497 }
-// { dg-error "no type" "" { target { *-*-* } } 554 }
+// { dg-error "no type" "" { target { *-*-* } } 493 }
+// { dg-error "no type" "" { target { *-*-* } } 503 }
+// { dg-error "no type" "" { target { *-*-* } } 560 }
   }
 }
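To make the idea concrete outside libstdc++, here is a minimal sketch of the same pattern: an accessor whose documented precondition is turned into an assertion while staying usable in constant expressions. MiniOptional and its members are hypothetical names invented for this illustration, plain assert stands in for __glibcxx_assert, and the sketch assumes C++14 or later (an assert statement inside a constexpr function is not valid C++11).

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the optional storage base; not libstdc++ code.
// Assumes T is default-constructible so the disengaged state is easy to model.
template <typename T>
class MiniOptional {
public:
    constexpr MiniOptional() noexcept : engaged_(false), value_() {}
    constexpr explicit MiniOptional(T v) : engaged_(true), value_(std::move(v)) {}

    constexpr bool is_engaged() const noexcept { return engaged_; }

    // Precondition: is_engaged(). The comment alone documents it;
    // the assertion also checks it when assertions are enabled.
    constexpr const T& get() const noexcept {
        assert(is_engaged());
        return value_;
    }

private:
    bool engaged_;
    T value_;
};

// Constant evaluation still works as long as the precondition holds.
constexpr MiniOptional<int> kAnswer{42};
static_assert(kAnswer.get() == 42, "engaged access is usable in constexpr");
```

Calling get() on a disengaged object would trip the assertion at run time, or be rejected by the compiler if it happened during constant evaluation. As the message notes, the report would still point at the accessor rather than at the calling user code.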
Re: [PATCH][X86] Add missing xgetbv xsetbv intrinsics
On Fri, May 12, 2017 at 12:29 PM, Koval, Julia wrote:
> Hi,
>
> This patch adds these missing intrinsics:
> _xsetbv
> _xgetbv
>
> gcc/
> 	* config/i386/i386-builtin-types.def (VOID_FTYPE_INT_INT64): New type.
> 	* config/i386/i386-builtin.def (__builtin_ia32_xgetbv,
> 	__builtin_ia32_xsetbv): New builtins.
> 	* config/i386/i386.c (ix86_expand_special_args_builtin): Process new
> 	type.
> 	(ix86_expand_builtin): Special expand for new intrinsics.
> 	* config/i386/i386.md (UNSPECV_XGETBV, UNSPECV_XSETBV): New.
> 	(xsetbv, xsetbv_rex64, xgetbv, xgetbv_rex64): New patterns.
> 	* config/i386/xsaveintrin.h (_xsetbv, _xgetbv): New intrinsics.
>
> gcc/testsuite
> 	* gcc.target/i386/xgetsetbv.c: New test.
>
> Ok for trunk?

Approved and committed to mainline SVN.

Thanks,
Uros.
[PATCH] [i386] Recompute the frame layout less often
Hi,

this patch uses the new TARGET_COMPUTE_FRAME_LAYOUT hook in the i386
backend to avoid re-computing the frame layout when not really
necessary.

It simplifies the logic in ix86_compute_frame_layout by removing
the use_fast_prologue_epilogue_nregs, which is no longer necessary,
because the frame layout can no longer change spontaneously.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?

Thanks
Bernd.

2017-05-14  Bernd Edlinger

	* config/i386/i386.c (ix86_can_use_return_insn_p): Use the ix86_frame
	data structure directly.
	(ix86_initial_elimination_offset): Likewise.
	(ix86_expand_prologue): Likewise.
	(ix86_expand_epilogue): Likewise.
	(ix86_expand_split_stack_prologue): Likewise.
	(ix86_compute_frame_layout): Remove frame parameter ...
	(TARGET_COMPUTE_FRAME_LAYOUT): ... and export it as a target hook.
	(ix86_finalize_stack_realign_flags): Call ix86_compute_frame_layout
	only if necessary.
	(ix86_init_machine_status): Don't set use_fast_prologue_epilogue_nregs.
	(ix86_frame): Move from here ...
	* config/i386/i386.h (ix86_frame): ... to here.
	(machine_function): Remove use_fast_prologue_epilogue_nregs, cache the
	complete ix86_frame data structure instead.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c (revision 248005)
+++ gcc/config/i386/i386.c (working copy)
@@ -2441,53 +2441,6 @@ struct GTY(()) stack_local_entry {
   struct stack_local_entry *next;
 };
 
-/* Structure describing stack frame layout.
-   Stack grows downward:
-
-   [arguments]
-                                        <- ARG_POINTER
-   saved pc
-
-   saved static chain                   if ix86_static_chain_on_stack
-
-   saved frame pointer                  if frame_pointer_needed
-                                        <- HARD_FRAME_POINTER
-   [saved regs]
-                                        <- regs_save_offset
-   [padding0]
-
-   [saved SSE regs]
-                                        <- sse_regs_save_offset
-   [padding1]          |
-                       |                <- FRAME_POINTER
-   [va_arg registers]  |
-                       |
-   [frame]             |
-                       |
-   [padding2]          | = to_allocate
-                                        <- STACK_POINTER
-  */
-struct ix86_frame
-{
-  int nsseregs;
-  int nregs;
-  int va_arg_size;
-  int red_zone_size;
-  int outgoing_arguments_size;
-
-  /* The offsets relative to ARG_POINTER. */
-  HOST_WIDE_INT frame_pointer_offset;
-  HOST_WIDE_INT hard_frame_pointer_offset;
-  HOST_WIDE_INT stack_pointer_offset;
-  HOST_WIDE_INT hfp_save_offset;
-  HOST_WIDE_INT reg_save_offset;
-  HOST_WIDE_INT sse_reg_save_offset;
-
-  /* When save_regs_using_mov is set, emit prologue using
-     move instead of push instructions. */
-  bool save_regs_using_mov;
-};
-
 /* Which cpu are we scheduling for. */
 enum attr_cpu ix86_schedule;
 
@@ -2579,7 +2532,7 @@ static unsigned int ix86_function_arg_boundary (ma
                                                 const_tree);
 static rtx ix86_static_chain (const_tree, bool);
 static int ix86_function_regparm (const_tree, const_tree);
-static void ix86_compute_frame_layout (struct ix86_frame *);
+static void ix86_compute_frame_layout (void);
 static bool ix86_expand_vector_init_one_nonzero (bool, machine_mode,
                                                  rtx, rtx, int);
 static void ix86_add_new_builtins (HOST_WIDE_INT, HOST_WIDE_INT);
@@ -12007,7 +11960,7 @@ ix86_can_use_return_insn_p (void)
   if (crtl->args.pops_args && crtl->args.size >= 32768)
     return 0;
 
-  ix86_compute_frame_layout (&frame);
+  frame = cfun->machine->frame;
   return (frame.stack_pointer_offset == UNITS_PER_WORD
           && (frame.nregs + frame.nsseregs) == 0);
 }
@@ -12493,8 +12446,7 @@ ix86_can_eliminate (const int from, const int to)
 HOST_WIDE_INT
 ix86_initial_elimination_offset (int from, int to)
 {
-  struct ix86_frame frame;
-  ix86_compute_frame_layout (&frame);
+  struct ix86_frame frame = cfun->machine->frame;
 
   if (from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
     return frame.hard_frame_pointer_offset;
@@ -12533,8 +12485,9 @@ ix86_builtin_setjmp_frame_value (void)
 /* Fill structure ix86_frame about frame of currently computed function. */
 
 static void
-ix86_compute_frame_layout (struct ix86_frame *frame)
+ix86_compute_frame_layout (void)
 {
+  struct ix86_frame *frame = &cfun->machine->frame;
   unsigned HOST_WIDE_INT stack_alignment_needed;
   HOST_WIDE_INT offset;
   unsigned HOST_WIDE_INT preferred_alignment;
@@ -12570,19 +12523,11 @@ static void
      in doing anything except PUSHs. */
   if (TARGET_SEH)
     cfun->machine->use_fast_prologue_epilogue = false;
-
-  /* During reload iteration the amount of registers saved can change.
-     Recompute the value as needed. Do not recompute when amount of registers
-     didn't change as reload does multiple calls to the function and does not
-     expect the decision to change within single iteration. */
-  else if (!optimize_bb_for_size_p (ENTRY_BLOCK_PTR_FOR_FN (cfun))
-           && cfun->machine->use_fast_prologue_epilogue_nregs != frame->nregs)
+  else if (!optimize_bb_for_size
Re: [PATCH] [i386] Recompute the frame layout less often
On 05/14/2017 02:42 AM, Bernd Edlinger wrote:
> Hi,
>
> this patch uses the new TARGET_COMPUTE_FRAME_LAYOUT hook in the i386
> backend to avoid re-computing the frame layout when not really
> necessary.
>
> It simplifies the logic in ix86_compute_frame_layout by removing
> the use_fast_prologue_epilogue_nregs, which is no longer necessary,
> because the frame layout can no longer change spontaneously.
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
>
> Thanks
> Bernd.

I think Uros is about to commit my improvements to ms to sysv abi calls,
which is a large change and will conflict with your patch. I've added
several new fields to struct ix86_frame that will need to be merged (and
moved to i386.h). I believe that my only explicit check of
crtl->stack_realign_finalized is during pro/epilogue expand, and not in
ix86_compute_frame_layout. A former incarnation of my patches needed
ix86_compute_frame_layout to be called *after* it was set, but I believe
that is no longer the case, and so shouldn't conflict, but retesting should
certainly be done.

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01338.html

Thanks,
Daniel
[PATCH] objc-runtime-shared-support.c - Identical code for different branches
Hello,

Now that Coverity is up and running, I am trying to fix some errors.
Let's start with a trivial one (same code in different branches).

S

From 50248decd02bfac52ad64b64c972750489e2ffa0 Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 10:55:24 +0200
Subject: [PATCH 1/5] 2017-05-14 Sylvestre Ledru

	* objc-runtime-shared-support.c (build_module_descriptor):
	Identical code for different branches (since 2012) CID 1406758
---
 gcc/objc/objc-runtime-shared-support.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/objc/objc-runtime-shared-support.c b/gcc/objc/objc-runtime-shared-support.c
index 8d35d27c031..5ead87078c6 100644
--- a/gcc/objc/objc-runtime-shared-support.c
+++ b/gcc/objc/objc-runtime-shared-support.c
@@ -500,11 +500,7 @@ build_module_descriptor (long vers, tree attr)
   objc_finish_struct (objc_module_template, decls);
 
   /* Create an instance of "_objc_module". */
-  UOBJC_MODULES_decl = start_var_decl (objc_module_template,
-                                       /* FIXME - why the conditional
-                                          if the symbol is the
-                                          same. */
-                                       flag_next_runtime ? "_OBJC_Module" : "_OBJC_Module");
+  UOBJC_MODULES_decl = start_var_decl (objc_module_template, "_OBJC_Module");
 
   /* This is the root of the metadata for defined classes and categories, it
      is referenced by the runtime and, therefore, needed. */
-- 
2.11.0
[PATCH] plugin.c (try_init_one_plugin): Fix resource leaks (CID 726637)
Add missing dlclose()

S

From d0926b84047f281a29dc51bbd0a4bdda01a5c63f Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:28:38 +0200
Subject: [PATCH 4/5] 2017-05-14 Sylvestre Ledru

	* plugin.c (try_init_one_plugin): Fix resource leaks (CID 726637)
---
 gcc/plugin.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/plugin.c b/gcc/plugin.c
index cfd6ef25036..903a197b78b 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
 
   if ((err = dlerror ()) != NULL)
     {
+      dlclose(dl_handle);
       error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name,
              plugin->full_name, err);
       return false;
@@ -625,10 +626,12 @@ try_init_one_plugin (struct plugin_name_args *plugin)
   /* Call the plugin-provided initialization routine with the arguments. */
   if ((*plugin_init) (plugin, &gcc_version))
     {
+      dlclose(dl_handle);
       error ("fail to initialize plugin %s", plugin->full_name);
       return false;
     }
 
+  dlclose(dl_handle);
   return true;
 }
 
-- 
2.11.0
[PATCH] lto-wrapper.c (copy_file): Fix resource leaks
Add missing fclose
CID 1407987, 1407986

S

From d255827a64012fb81937d6baa8534eabecf9b735 Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:37:37 +0200
Subject: [PATCH 5/5] 2017-05-14 Sylvestre Ledru

	* lto-wrapper.c (copy_file): Fix resource leaks CID 1407987, 1407986
---
 gcc/lto-wrapper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 4b86f939ca2..832ffde3e40 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -838,6 +838,8 @@ copy_file (const char *dest, const char *src)
           fatal_error (input_location, "writing output file");
         }
     }
+  fclose (d);
+  fclose (s);
 }
 
 /* Find the crtoffloadtable.o file in LIBRARY_PATH, make copy and pass name of
-- 
2.11.0
Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues
On Sun, May 14, 2017 at 12:34 AM, Daniel Santos wrote:
> On 05/13/2017 11:52 AM, Uros Bizjak wrote:
>>
>> On Sat, May 13, 2017 at 1:01 AM, Daniel Santos wrote:
>>>
>>> Ping? I have posted revisions of the following in patch set:
>>>
>>> 05/12 - https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01442.html
>>> 09/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00348.html
>>> 11/12 - https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00350.html
>>>
>>> I have retested them on Linux x86-64 in addition to a Wine testsuite
>>> comparison resulting in fewer failed tests (31) than when using
>>> unpatched 7.1.0 (78) and 5.4.0 (78). A cursory examination of the now
>>> working failures with 7.1.0 seemed to be due to race conditions in
>>> Wine that are incidentally hidden after the patches.
>>>
>>> Is there anything else needed before we can commit these? They still
>>> rebase cleanly onto the HEAD, but I can repost as "v5" if you prefer.
>>
>> Please go ahead and commit the patches.
>>
>> However, please stay around to fix possible fallout. As said - you are
>> touching quite complex part of the compiler ...
>>
>> Thanks,
>> Uros.
>
>
> Thanks! I'll definitely be around, I have a lot more that I'm working on
> with C generics/pseudo-templates (all middle-end stuff). I also want to
> examine more ways that SSE saves/restores can be omitted in these ms to sysv
> calls through static analysis and such.
>
> Anyway, I don't yet have SVN write access, will you sponsor my request?

The patchset was committed to mainline SVN as r248029.

Uros.
Re: [PATCH] [i386] Recompute the frame layout less often
On Sun, May 14, 2017 at 11:16 AM, Daniel Santos wrote:
> On 05/14/2017 02:42 AM, Bernd Edlinger wrote:
>>
>> Hi,
>>
>>
>> this patch uses the new TARGET_COMPUTE_FRAME_LAYOUT hook in the i386
>> backend to avoid re-computing the frame layout when not really
>> necessary.
>>
>> It simplifies the logic in ix86_compute_frame_layout by removing
>> the use_fast_prologue_epilogue_nregs, which is no longer necessary,
>> because the frame layout can no longer change spontaneously.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>
>
> I think Uros is about to commit my improvements to ms to sysv abi calls,
> which is a large change and will conflict with your patch. I've added
> several new fields to struct ix86_frame that will need to be merged (and
> moved to i386.h). I believe that my only explicit check of
> crtl->stack_realign_finalized is during pro/epilogue expand, and not in
> ix86_compute_frame_layout. A former incarnation of my patches needed
> ix86_compute_frame_layout to be called *after* it was set, but I believe
> that is no longer the case, and so shouldn't conflict, but retesting should
> certainly be done.

Yes, the mcall-ms2sysv-xlogues patch was committed to mainline, please
re-test and re-send the patch.

Uros.
Re: Runtime checking of OpenACC parallelism dimensions clauses
Hi!

On Thu, 11 May 2017 14:24:05 +0200, I wrote:
> OK for trunk?

> Runtime checking of OpenACC parallelism dimensions clauses

For now, committed to gomp-4_0-branch in r248030:

commit 59e5204e0ec16c0f14ec68148f856fd307ef8d51
Author: tschwinge
Date: Sun May 14 10:25:46 2017 +

    Runtime checking of OpenACC parallelism dimensions clauses

    libgomp/
    * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Rewrite.
    * testsuite/libgomp.oacc-c++/c++.exp (check_effective_target_c)
    (check_effective_target_c++): New procs.
    * testsuite/libgomp.oacc-c/c.exp (check_effective_target_c)
    (check_effective_target_c++): Likewise.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248030 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp | 8 +
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 7 +
 .../libgomp.oacc-c-c++-common/parallel-dims.c | 526 -
 libgomp/testsuite/libgomp.oacc-c/c.exp | 7 +
 4 files changed, 536 insertions(+), 12 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index def0feb..a1627a8 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,11 @@
+2017-05-14 Thomas Schwinge
+
+	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Rewrite.
+	* testsuite/libgomp.oacc-c++/c++.exp (check_effective_target_c)
+	(check_effective_target_c++): New procs.
+	* testsuite/libgomp.oacc-c/c.exp (check_effective_target_c)
+	(check_effective_target_c++): Likewise.
+
 2017-05-12 Cesar Philippidis
 
 	* testsuite/libgomp.oacc-c-c++-common/par-reduction-3.c: New test.
diff --git libgomp/testsuite/libgomp.oacc-c++/c++.exp libgomp/testsuite/libgomp.oacc-c++/c++.exp
index ba1a28e..695b96d 100644
--- libgomp/testsuite/libgomp.oacc-c++/c++.exp
+++ libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -4,6 +4,13 @@ load_lib libgomp-dg.exp
 load_gcc_lib gcc-dg.exp
 load_gcc_lib torture-options.exp
 
+proc check_effective_target_c { } {
+    return 0
+}
+proc check_effective_target_c++ { } {
+    return 1
+}
+
 global shlib_ext
 
 set shlib_ext [get_shlib_extension]
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index f5766a4..3458757 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -1,25 +1,527 @@
-/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+   vector_length. */
+
+/* { dg-additional-options "-foffload-force" } */
+
+#include <limits.h>
+#include <openacc.h>
+
+/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
+   not behaving as expected for -O0. */
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    {
+      unsigned int r;
+      asm volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (r));
+      return r;
+    }
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    {
+      unsigned int r;
+      asm volatile ("mov.u32 %0,%%tid.y;" : "=r" (r));
+      return r;
+    }
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    {
+      unsigned int r;
+      asm volatile ("mov.u32 %0,%%tid.x;" : "=r" (r));
+      return r;
+    }
+  else
+    __builtin_abort ();
+}
 
-/* Worker and vector size checks. Picked an outrageously large
-   value. */
 
 int main ()
 {
-  int dummy[10];
+  acc_init (acc_device_default);
 
-#pragma acc parallel num_workers (2<<20) /* { dg-error "using num_workers" } */
+  /* Non-positive value. */
+
+  /* GR, WS, VS. */
+  {
+#define GANGS 0 /* { dg-warning "'num_gangs' value must be positive" "" { target c } } */
+    int gangs_actual = GANGS;
+    int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc parallel copy (gangs_actual) \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) \
+  num_gangs (GANGS) /* { dg-warning "'num_gangs' value must be positive" "" { target c++ } } */
+    {
+      /* We're actually executing with num_gangs (1). */
+      gangs_actual = 1;
+      for (int i = 100 * gangs_actual; i > -100 * gangs_
Re: OpenACC 2.5 kernels construct: num_gangs, num_workers, vector_length clauses
Hi!

On Thu, 11 May 2017 14:26:51 +0200, I wrote:
> Building on the other pending patches (I'll soon commit the approved
> ones), we can then support the num_gangs, num_workers, vector_length
> clauses for the OpenACC 2.5 kernels construct. OK for trunk?

> OpenACC 2.5 kernels construct: num_gangs, num_workers, vector_length
> clauses

For now, committed to gomp-4_0-branch in r248031:

commit cc2a61ba48e84268e37c53874cb3eef27f5ede1d
Author: tschwinge
Date: Sun May 14 10:26:07 2017 +

    OpenACC 2.5 kernels construct: num_gangs, num_workers, vector_length clauses

    gcc/c/
    * c-parser.c (OACC_KERNELS_CLAUSE_MASK)
    (OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK): Add
    "PRAGMA_OACC_CLAUSE_NUM_GANGS", "PRAGMA_OACC_CLAUSE_NUM_WORKERS",
    "VECTOR_LENGTH".
    gcc/cp/
    * parser.c (OACC_KERNELS_CLAUSE_MASK)
    (OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK): Add
    "PRAGMA_OACC_CLAUSE_NUM_GANGS", "PRAGMA_OACC_CLAUSE_NUM_WORKERS",
    "VECTOR_LENGTH".
    gcc/fortran/
    * openmp.c (OACC_KERNELS_CLAUSES)
    (OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK): Add "OMP_CLAUSE_NUM_GANGS",
    "OMP_CLAUSE_NUM_WORKERS", "OMP_CLAUSE_VECTOR_LENGTH".
    gcc/
    * omp-low.c (execute_oacc_device_lower): Remove the parallelism
    dimensions function attributes for unparallelized OpenACC kernels
    constructs.
    gcc/testsuite/
    * c-c++-common/goacc/parallel-dims-1.c: Update.
    * c-c++-common/goacc/parallel-dims-2.c: Likewise.
    * c-c++-common/goacc/routine-1.c: Likewise.
    * c-c++-common/goacc/uninit-dim-clause.c: Likewise.
    * g++.dg/goacc/template.C: Likewise.
    * gfortran.dg/goacc/kernels-tree.f95: Likewise.
    * gfortran.dg/goacc/routine-3.f90: Likewise.
    * gfortran.dg/goacc/sie.f95: Likewise.
    * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.
    libgomp/
    * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: New
    file.
    * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Update.
    * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
    * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248031 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 6 +
 gcc/c/ChangeLog.gomp | 7 +
 gcc/c/c-parser.c | 6 +
 gcc/cp/ChangeLog.gomp | 7 +
 gcc/cp/parser.c | 6 +
 gcc/fortran/ChangeLog.gomp | 7 +
 gcc/fortran/openmp.c | 6 +-
 gcc/omp-low.c | 9 +
 gcc/testsuite/ChangeLog.gomp | 12 +
 gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c | 4 +
 gcc/testsuite/c-c++-common/goacc/parallel-dims-2.c | 152 +++--
 gcc/testsuite/c-c++-common/goacc/routine-1.c | 13 ++
 .../c-c++-common/goacc/uninit-dim-clause.c | 17 +-
 gcc/testsuite/g++.dg/goacc/template.C | 4 +
 gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 | 6 +-
 gcc/testsuite/gfortran.dg/goacc/routine-3.f90 | 6 +
 gcc/testsuite/gfortran.dg/goacc/sie.f95 | 84 +++
 .../gfortran.dg/goacc/uninit-dim-clause.f95 | 18 +-
 libgomp/ChangeLog.gomp | 6 +
 .../libgomp.oacc-c-c++-common/acc_prof-kernels-1.c | 244 +
 .../libgomp.oacc-c-c++-common/kernels-loop-2.c | 21 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c | 35 +++
 .../libgomp.oacc-fortran/kernels-loop-2.f95 | 13 +-
 23 files changed, 661 insertions(+), 28 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a754647..a4720c3 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2017-05-14 Thomas Schwinge
+
+	* omp-low.c (execute_oacc_device_lower): Remove the parallelism
+	dimensions function attributes for unparallelized OpenACC kernels
+	constructs.
+
 2017-05-12 Cesar Philippidis
 
 	* config/nvptx/nvptx.c (nvptx_goacc_reduction_init): Don't update
diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp
index 3efcc8b..baedcf8 100644
--- gcc/c/ChangeLog.gomp
+++ gcc/c/ChangeLog.gomp
@@ -1,3 +1,10 @@
+2017-05-14 Thomas Schwinge
+
+	* c-parser.c (OACC_KERNELS_CLAUSE_MASK)
+	(OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK): Add
+	"PRAGMA_OACC_CLAUSE_NUM_GANGS", "PRAGMA_OACC_CLAUSE_NUM_WORKERS",
+	"VECTOR_LENGTH".
+
 2017-05-12 Thomas Schwinge
 
 	* c-parser.c (c_parser_omp_clause_num_gangs)
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index ef61c5f..afc467d 100644
--- gcc/c/c-parser.c
++
Re: [PATCH] plugin.c (try_init_one_plugin): Fix resource leaks (CID 726637)
On Sun, May 14, 2017 at 11:59:40AM +0200, Sylvestre Ledru wrote:
> Add missing dlclose()
>
> S
>
>
> From d0926b84047f281a29dc51bbd0a4bdda01a5c63f Mon Sep 17 00:00:00 2001
> From: Sylvestre Ledru
> Date: Sun, 14 May 2017 11:28:38 +0200
> Subject: [PATCH 4/5] 2017-05-14 Sylvestre Ledru
>
> 	* plugin.c (try_init_one_plugin): Fix resource leaks (CID 726637)
> ---
>  gcc/plugin.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/plugin.c b/gcc/plugin.c
> index cfd6ef25036..903a197b78b 100644
> --- a/gcc/plugin.c
> +++ b/gcc/plugin.c
> @@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
>
>    if ((err = dlerror ()) != NULL)
>      {
> +      dlclose(dl_handle);
>        error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name,
>               plugin->full_name, err);
>        return false;
> @@ -625,10 +626,12 @@ try_init_one_plugin (struct plugin_name_args *plugin)
>    /* Call the plugin-provided initialization routine with the arguments. */
>    if ((*plugin_init) (plugin, &gcc_version))
>      {
> +      dlclose(dl_handle);

These seem like unimportant, but real leaks so they seem correct.

>        error ("fail to initialize plugin %s", plugin->full_name);
>        return false;
>      }
>
> +  dlclose(dl_handle);

Does this part pass the plugin tests? It seems suspicious: if the
plugin's init function registered any callbacks, which it almost
certainly did, then we'd be holding function pointers into the plugin
after we dlclosed our only reference to it. We don't need to call any
more functions with the handle, but I think we want to morally leak it
here to ensure the plugin is loaded for the entire run of the compiler.

Trev
Re: [PATCH] New C++ warning -Wcatch-value
On 7 May, Martin Sebor wrote:
> On 05/07/2017 02:03 PM, Volker Reichelt wrote:
>> On 2 May, Martin Sebor wrote:
>>> On 05/01/2017 02:38 AM, Volker Reichelt wrote:
>>>> Hi,
>>>>
>>>> catching exceptions by value is a bad thing, as it may cause slicing, i.e.
>>>> a) a superfluous copy
>>>> b) which is only partial.
>>>> See also
>>>> https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#e15-catch-exceptions-from-a-hierarchy-by-reference
>>>>
>>>> To warn the user about catch handlers of non-reference type,
>>>> the following patch adds a new C++/ObjC++ warning option "-Wcatch-value".
>>>
>>> I think the problems related to catching exceptions by value
>>> apply to (a subset of) class types but not so much to fundamental
>>> types. I would expect indiscriminately warning on every type to
>>> be overly restrictive.
>>>
>>> The Enforcement section of the C++ guideline suggests to
>>>
>>>    Flag by-value exceptions if their types are part of a hierarchy
>>>    (could require whole-program analysis to be perfect).
>>>
>>> The corresponding CERT C++ Coding Standard guideline offers
>>> a similar suggestion here:
>>>
>>>    https://www.securecoding.cert.org/confluence/x/TAD5CQ
>>>
>>> so I would suggest to model the warning on that approach (within
>>> limits of a single translation unit, of course). I.e., warn only
>>> for catching by value objects of non-trivial types, or perhaps even
>>> only polymorphic types?
>>>
>>> Martin
>>
>> I've never seen anybody throw integers in real-world code, so I didn't
>> want to complicate things for this case. But maybe I should only warn
>> about class-types.
>>
>> IMHO it makes sense to warn about non-polymorphic class types
>> (although slicing is not a problem there), because you still have to
>> deal with redundant copies.
>>
>> Another thing would be pointers. I've never seen pointers in catch
>> handlers (except some 'catch (const char*)' which I would consider
>> bad practice). Therefore I'd like to warn about 'catch (const A*)'
>> which might be a typo that should read 'catch (const A&)' instead.
>>
>> Would that be OK?
>
> To my knowledge, catch by value of non-polymorphic types (and
> certainly fundamental types) is not a cause of common bugs.
> It's part of the recommended practice to throw by value, catch
> by reference, which is grounded in avoiding the slicing problem.
> It's also sometimes recommended for non-trivial class types to
> avoid creating a copy of the object (which, for non-trivial types,
> may need to allocate resource and could throw). Otherwise, it's
> not dissimilar to pass-by value vs pass-by-reference (or even
> pass-by-pointer). Both may be good practices for some types or
> in some situations but neither is necessary to avoid bugs or
> universally applicable to achieve superior performance.
>
> The pointer case is interesting. In C++ Coding Standards,
> Sutter and Alexandrescu recommend to throw (and catch) smart
> pointers over plain pointers because it obviates having to deal
> with memory management issues. That's sound advice but it seems
> more like a design guideline than a coding rule aimed at directly
> preventing bugs. I also think that the memory management bugs
> that it might find might be more easily detected at the throw
> site instead. E.g., warning on the throw expression below:
>
>    {
>      Exception e;
>      throw &e;
>    }
>
> or perhaps even on
>
>    {
>      throw *new Exception ();
>    }
>
> A more sophisticated (and less restrictive) checker could detect
> and warn on "throw " if it found a catch (T) or catch (T&)
> in the same file and no catch (T*) (but not warn otherwise).
>
> Martin
>
> PS After re-reading some of the coding guidelines on this topic
> it occurs to me that (if your patch doesn't handle this case yet)
> it might be worth considering to enhance it to also warn on
> rethrowing caught polymorphic objects (i.e., warn on
>
>    catch (E &e) { throw e; }
>
> and suggest to use "throw;" instead, for the same reason: to
> help avoid slicing.
>
> PPS It may be a useful feature to implement some of other ideas
> you mentioned (e.g., throw by value rather than pointer) but it
> feels like a separate and more ambitious project than detecting
> the relatively common and narrow slicing problem.

So how about the following then? I stayed with the catch part and added
a parameter to the warning to let the user decide on the warnings she/he
wants to get: -Wcatch-value=n.
-Wcatch-value=1 only warns for polymorphic classes that are caught by
value (to avoid slicing), -Wcatch-value=2 warns for all classes that
are caught by value (to avoid copies). And finally -Wcatch-value=3
warns for everything not caught by reference to find typos (like pointer
instead of reference) and bad coding practices.

Bootstrapped and regtested on x86_64-pc-linux-gnu.
OK for trunk?

If so, would it make sense to add -Wcatch-value=1 to -Wextra or even -Wall?
I would do this in a separate pat
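As a small illustration of the slicing problem this thread is about, the sketch below throws a derived exception and catches it once by value and once by reference. Base, Derived and the two helper functions are hypothetical names made up for this example; the comments relate the handlers to the warning levels as described in the message above, not to any particular compiler release.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Catch by value: the handler parameter is a Base copy-constructed from the
// thrown Derived, so the Derived part is sliced away. This is the case the
// proposed -Wcatch-value=1 targets (polymorphic class caught by value).
std::string caught_by_value() {
    try {
        throw Derived();
    } catch (Base e) {
        return e.name();
    }
}

// Catch by reference: no copy is made and the dynamic type is preserved.
std::string caught_by_reference() {
    try {
        throw Derived();
    } catch (const Base& e) {
        return e.name();
    }
}
```

Here caught_by_value() yields "Base" while caught_by_reference() yields "Derived", which is the partial copy the original proposal describes.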
Re: [PATCH] plugin.c (try_init_one_plugin): Fix resource leaks (CID 726637)
Le 14/05/2017 à 12:40, Trevor Saunders a écrit : > On Sun, May 14, 2017 at 11:59:40AM +0200, Sylvestre Ledru wrote: >> Add missing dlclose() >> >> S >> >> >> From d0926b84047f281a29dc51bbd0a4bdda01a5c63f Mon Sep 17 00:00:00 2001 >> From: Sylvestre Ledru >> Date: Sun, 14 May 2017 11:28:38 +0200 >> Subject: [PATCH 4/5] 2017-05-14 Sylvestre Ledru >> >> * plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637) >> --- >> gcc/plugin.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/gcc/plugin.c b/gcc/plugin.c >> index cfd6ef25036..903a197b78b 100644 >> --- a/gcc/plugin.c >> +++ b/gcc/plugin.c >> @@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin) >> >>if ((err = dlerror ()) != NULL) >> { >> + dlclose(dl_handle); >>error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name, >> plugin->full_name, err); >>return false; >> @@ -625,10 +626,12 @@ try_init_one_plugin (struct plugin_name_args *plugin) >>/* Call the plugin-provided initialization routine with the arguments. */ >>if ((*plugin_init) (plugin, &gcc_version)) >> { >> + dlclose(dl_handle); > These seem like unimportant, but real leaks so they seem correct. > >>error ("fail to initialize plugin %s", plugin->full_name); >>return false; >> } >> >> + dlclose(dl_handle); > Does this part pass the plugin tests? because it seems suspicious, if > the plugin's init function registered any callbacks which it almost > certainly did, then we'd be holding function pointers into the plugin > after we dlclosed our only reference to it. We don't need to call any > more functions with the handle, but I think we want to morally leak it > here to ensure the plugin is loaded for the entire run of the compiler. > Indeed, false positive marked in the coverity interface. 
New patch attached

S

From 08f3fb989f6b6ee56e1d4d9674e743dd563a0904 Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:28:38 +0200
Subject: [PATCH 1/2] 2017-05-14 Sylvestre Ledru

* plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)
---
 gcc/plugin.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/plugin.c b/gcc/plugin.c
index cfd6ef25036..60d037c2b83 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
 
   if ((err = dlerror ()) != NULL)
     {
+      dlclose(dl_handle);
       error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name,
              plugin->full_name, err);
       return false;
@@ -625,6 +626,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
   /* Call the plugin-provided initialization routine with the arguments. */
   if ((*plugin_init) (plugin, &gcc_version))
     {
+      dlclose(dl_handle);
       error ("fail to initialize plugin %s", plugin->full_name);
       return false;
     }
-- 
2.11.0
Re: [PATCH] [i386] Recompute the frame layout less often
Hi Daniel,

there is one thing I don't understand in your patch:
That is, it introduces a static value:

/* Registers who's save & restore will be managed by stubs called from
   pro/epilogue. */
static HARD_REG_SET GTY(()) stub_managed_regs;

This seems to be set as a side effect of ix86_compute_frame_layout,
and depends on the register usage of the current function.
But values that depend on the current function need usually be
attached to cfun->machine, because the passes can run in parallel
unless I am completely mistaken, and the stub_managed_regs may
therefore be computed from a different function.

Bernd.

On 05/14/17 12:25, Uros Bizjak wrote:
> On Sun, May 14, 2017 at 11:16 AM, Daniel Santos wrote:
>> On 05/14/2017 02:42 AM, Bernd Edlinger wrote:
>>>
>>> Hi,
>>>
>>>
>>> this patch uses the new TARGET_COMPUTE_FRAME_LAYOUT hook in the i386
>>> backend to avoid re-computing the frame layout when not really
>>> necessary.
>>>
>>> It simplifies the logic in ix86_compute_frame_layout by removing
>>> the use_fast_prologue_epilogue_nregs, which is no longer necessary,
>>> because the frame layout can no longer change spontaneously.
>>>
>>>
>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>
>>
>> I think Uros is about to commit my improvements to ms to sysv abi calls,
>> which is a large change and will conflict with your patch. I've added
>> several new fields to struct ix86_frame that will need to be merged (and
>> moved to i386.h). I believe that my only explicit check of
>> crtl->stack_realign_finalized is during pro/epilogue expand, and not in
>> ix86_compute_frame_layout. A former incarnation of my patches needed
>> ix86_compute_frame_layout to be called *after* it was set, but I believe
>> that is no longer the case, and so shouldn't conflict, but retesting should
>> certainly be done.
>
> Yes, the mcall-ms2sysv-xlogues patch was committed to mainline, please
> re-test and re-send the patch.
>
> Uros.
>
[PATCH, i386]: Make CCNOmode compatible with CCGOCmode and with CCZmode
Hello!

Attached patch makes CCNOmode compatible with CCGOCmode and with CCZmode, allowing post-reload compare elimination to eliminate:

        testl   %r15d, %r15d
        movq    %rax, -552(%rbp)
        cmovns  %r15, %r13
        movq    %rsi, -568(%rbp)
-       testl   %r15d, %r15d
        jle     .L12207

CCNOmode means that we know that OF bit is zero, so we can combine it with CCGOCmode, which expects garbage in OF. The resulting mode is CCNOmode, which is the more constrained mode of the pair.

CCZmode looks only at the ZF, and can be trivially combined into CCNOmode, in the same way it is already combined into CCGCmode and CCGOCmode.

2017-05-14  Uros Bizjak

	* config/i386/i386.c (ix86_cc_modes_compatible): CCNOmode is
	compatible with CCGOCmode and with CCZmode.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 248033)
+++ config/i386/i386.c (working copy)
@@ -23251,9 +23251,15 @@ ix86_cc_modes_compatible (machine_mode m1, machine
       || (m1 == CCGOCmode && m2 == CCGCmode))
     return CCGCmode;
 
-  if (m1 == CCZmode && (m2 == CCGCmode || m2 == CCGOCmode))
+  if ((m1 == CCNOmode && m2 == CCGOCmode)
+      || (m1 == CCGOCmode && m2 == CCNOmode))
+    return CCNOmode;
+
+  if (m1 == CCZmode
+      && (m2 == CCGCmode || m2 == CCGOCmode || m2 == CCNOmode))
     return m2;
-  else if (m2 == CCZmode && (m1 == CCGCmode || m1 == CCGOCmode))
+  else if (m2 == CCZmode
+	   && (m1 == CCGCmode || m1 == CCGOCmode || m1 == CCNOmode))
     return m1;
 
   switch (m1)
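To make the pairing rules in the hunk above easier to follow, here is a small stand-alone model of them. It is only a sketch: the `CCMode` enum, the `Incompatible` sentinel and the function name `cc_modes_compatible` are hypothetical, the early `m1 == m2` shortcut is an assumption about the code outside the shown context, and the `switch (m1)` tail of the real `ix86_cc_modes_compatible` is not modelled at all.

```cpp
#include <cassert>

// Toy stand-ins for the i386 condition-code modes named in the patch.
// "Incompatible" is a hypothetical sentinel used by this sketch only.
enum class CCMode { Z, GC, GOC, NO, Incompatible };

// Simplified model of the pairing rules visible in the diff.
CCMode cc_modes_compatible(CCMode m1, CCMode m2) {
  // Assumption: identical modes are trivially compatible.
  if (m1 == m2)
    return m1;

  // Existing rule shown as context in the hunk: GC and GOC combine to GC.
  if ((m1 == CCMode::GC && m2 == CCMode::GOC)
      || (m1 == CCMode::GOC && m2 == CCMode::GC))
    return CCMode::GC;

  // New rule: NO (OF known to be zero) and GOC (garbage in OF) combine
  // to NO, the more constrained of the two.
  if ((m1 == CCMode::NO && m2 == CCMode::GOC)
      || (m1 == CCMode::GOC && m2 == CCMode::NO))
    return CCMode::NO;

  // Z only looks at ZF, so it folds into GC, GOC and now also NO.
  if (m1 == CCMode::Z
      && (m2 == CCMode::GC || m2 == CCMode::GOC || m2 == CCMode::NO))
    return m2;
  if (m2 == CCMode::Z
      && (m1 == CCMode::GC || m1 == CCMode::GOC || m1 == CCMode::NO))
    return m1;

  // Everything else is left to the part of the real function that this
  // sketch does not model.
  return CCMode::Incompatible;
}
```

For example, `cc_modes_compatible(CCMode::Z, CCMode::NO)` yields `CCMode::NO` in this model, which mirrors the case the patch adds.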
Re: [PATCH] [i386] Recompute the frame layout less often
On 05/14/2017 11:31 AM, Bernd Edlinger wrote:
> Hi Daniel,
>
> there is one thing I don't understand in your patch:
> That is, it introduces a static value:
>
> /* Registers who's save & restore will be managed by stubs called from
>    pro/epilogue. */
> static HARD_REG_SET GTY(()) stub_managed_regs;
>
> This seems to be set as a side effect of ix86_compute_frame_layout,
> and depends on the register usage of the current function.
> But values that depend on the current function need usually be
> attached to cfun->machine, because the passes can run in parallel
> unless I am completely mistaken, and the stub_managed_regs may
> therefore be computed from a different function.
>
> Bernd.

I'm relatively new to GCC and still learning. However, there are quite a lot of static TU variables in i386.c like this. I am not aware of gcc having parallelism support, but if it were to be added then all of these TU variables should probably be moved to some class or struct (like cfun->machine) to reduce the number of TLS lookups required (which I presume is a little more expensive than a this/offset calculation). Having this (as well as other variables) in such a struct is better design IMO, but as I said, I'm still learning GCC's architecture, idioms and patterns. (I should add that I don't really understand the GTY memory management either. :)

To be clear on class xlogue_layout, the only instances of this class are const and could be shared across multiple threads. It is dependent upon the cfun->machine as well as the global struct rtl_data crtl, but is not so entangled that, were these proper C++ classes (with private data), it would need to be a friend -- it only needs read-access to their data members.

To be honest, it's a strange feeling programming in a mixture of C and C++ idioms, but I know it was only recently converted to C++ so I think it's better to try to use only one or the other in a given function. But if I were going to do this all OO, then ix86_compute_frame_layout would be a member function of ix86_frame (which would be a specialization of some generic "frame" class), machine_function would be class ix86_machine_function with its own compute_frame_layout that called ix86_frame::compute_frame_layout, etc. If I really wanted to go nuts, I would consider making class function, et al. template classes with machine_function and machine_function_state part of the object instead of pointers to separate objects to reduce accesses down to a single this/offset, but now I'm *really* digressing...

Please feel free to move it.

Thanks,
Daniel
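As a rough illustration of the point under discussion (keeping function-dependent results with the function instead of in a translation-unit static), here is a minimal sketch. Every name in it (`reg_set`, `kStubCandidates`, `machine_state`, `function_ctx`, `compute_frame_layout`) is a hypothetical stand-in and the mask value is made up; none of this is GCC's real data layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; these are not GCC's real types.
using reg_set = std::uint32_t;
constexpr reg_set kStubCandidates = 0x0000ff00;  // made-up register mask

// Per-function machine-specific state, playing the role that
// cfun->machine plays in the discussion.
struct machine_state {
  reg_set stub_managed_regs = 0;
};

// A function being compiled, with its register usage and its own state.
struct function_ctx {
  reg_set used_regs = 0;
  machine_state machine;
};

// The layout computation stores its result in the function it was asked
// about, so results for different functions cannot overwrite each other.
void compute_frame_layout(function_ctx &fn) {
  fn.machine.stub_managed_regs = fn.used_regs & kStubCandidates;
}
```

With the result stored per function, computing the layout for one `function_ctx` leaves the value held by another untouched, which is the property a single shared static cannot give.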
Re: Avoid _Rb_tree_rotate_[left,right] symbols export
On 12/05/2017 13:03, Jonathan Wakely wrote:
> On 11/05/17 22:06 +0200, François Dumont wrote:
>> Hi
>>
>> When versioned namespace is active we can avoid export of
>> _Rb_tree_rotate_[left,right] symbols. I also took the opportunity to
>> put static functions in the anonymous namespace rather than using
>> static. Is this usage of static still planned to be deprecated ?
>
> No, I don't think so.

So not interested in replacing it with anonymous namespace?

> A much simpler (but equivalent) change would be:
>
> --- a/libstdc++-v3/src/c++98/tree.cc
> +++ b/libstdc++-v3/src/c++98/tree.cc
> @@ -153,6 +153,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   /* Static keyword was missing on _Rb_tree_rotate_left.
>      Export the symbol for backward compatibility until
>      next ABI change. */
> +#if _GLIBCXX_INLINE_VERSION
> +  static
> +#endif

Surely simpler, but why keep this function if it is not used?

>   void
>   _Rb_tree_rotate_left(_Rb_tree_node_base* const __x,
>                        _Rb_tree_node_base*& __root)
> @@ -184,6 +187,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   /* Static keyword was missing on _Rb_tree_rotate_right
>      Export the symbol for backward compatibility until
>      next ABI change. */
> +#if _GLIBCXX_INLINE_VERSION
> +  static
> +#endif
>   void
>   _Rb_tree_rotate_right(_Rb_tree_node_base* const __x,
>                         _Rb_tree_node_base*& __root)
>
>> * src/c++98/tree.cc [_GLIBCXX_INLINE_VERSION] (_Rb_tree_rotate_left,
>> _Rb_tree_rotate_right): Remove.
>> * src/c++98/tree.cc (local_Rb_tree_increment,
>> local_Rb_tree_decrement): Move to anonymous namespace.
>> (local_Rb_tree_rotate_left, local_Rb_tree_rotate_right): Likewise.
>>
>> Tested under Linux x86_64 with versioned namespace.
>
> What about the normal configuration?
>
> It's much more important that the default configuration works. The
> versioned namespace that nobody uses doesn't matter.

Normal mode now tested too, success.

François
[wwwdocs] projects/prefetch.html - remove Itanium manual
The usual theme, but at least a redirect to a most generic page, so let's just remove this.

Applied.

Gerald

Index: htdocs/projects/prefetch.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/prefetch.html,v
retrieving revision 1.28
diff -u -r1.28 prefetch.html
--- htdocs/projects/prefetch.html 23 Aug 2016 08:23:36 - 1.28
+++ htdocs/projects/prefetch.html 14 May 2017 20:45:26 -
@@ -767,10 +767,7 @@
 [8] Intel Itanium[tm] Architecture Software Developer's Manual Vol. 3
-rev. 2.1: Instruction Set Reference;
-an 10 Mbyte PDF file with a link from
-http://www.intel.com/design/itanium/manuals/iiasdmanual.htm";>
-http://www.intel.com/design/itanium/manuals/iiasdmanual.htm.
+rev. 2.1: Instruction Set Reference..
 [9] MIPS32[tm] Architecture for Programmers; Volume II:
 The MIPS32[tm]
Re: dejagnu version update?
On Sat, May 13, 2017 at 4:39 PM, Jeff Law wrote:
> On 05/13/2017 04:38 AM, Jakub Jelinek wrote:
>>
>> On Sat, May 13, 2017 at 12:24:12PM +0200, Bernhard Reutner-Fischer wrote:
>>>
>>> I guess neither redhat
>>> (https://access.redhat.com/downloads/content/dejagnu/ redirects to a
>>> login page but there seem to be 1.5.1 packages) nor SuSE did update
>>> dejagnu in the meantime.
>>
>>
>> Fedora has dejagnu-1.6 in Fedora 25 and later, dejagnu-1.5.3 in Fedora 24,
>> older Fedora versions are EOL. RHEL 7 has dejagnu-1.5.1, RHEL 6 as well
>> as RHEL 5 has dejagnu-1.4.4, older RHEL versions are EOL.
>
> RHEL-5 is old enough that IMHO it ought not figure into this discussion.
> RHEL-6 is probably close to if not past that same point as well.

FWIW, I still run the testsuite on RHEL 6.
committed: Fix NetBSD problem PR80600
I have committed the attached patch to make NetBSD handle -lgcc correctly for shared libraries.

gcc/ChangeLog:

	PR target/80600
	* config/netbsd.h (NETBSD_LIBGCC_SPEC): Always add -lgcc.

libgcc/ChangeLog:

	PR target/80600
	* config.host (*-*-netbsd*): Add t-slibgcc-libgcc to tmake_file.

   /Krister

Index: gcc/config/netbsd.h
===
--- gcc/config/netbsd.h (revision 248036)
+++ gcc/config/netbsd.h (working copy)
@@ -120,8 +120,7 @@
 #undef LIB_SPEC
 #define LIB_SPEC NETBSD_LIB_SPEC
 
-/* Provide a LIBGCC_SPEC appropriate for NetBSD. We also want to exclude
-   libgcc with -symbolic. */
+/* Provide a LIBGCC_SPEC appropriate for NetBSD. */
 
 #ifdef NETBSD_NATIVE
 #define NETBSD_LIBGCC_SPEC \
@@ -133,7 +132,7 @@
        %{p: -lgcc_p} \
        %{pg: -lgcc_p}}"
 #else
-#define NETBSD_LIBGCC_SPEC "%{!shared:%{!symbolic: -lgcc}}"
+#define NETBSD_LIBGCC_SPEC "-lgcc"
 #endif
 
 #undef LIBGCC_SPEC
Index: libgcc/config.host
===
--- libgcc/config.host (revision 248036)
+++ libgcc/config.host (working copy)
@@ -249,6 +249,7 @@
 *-*-netbsd*)
   tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic t-eh-dw2-dip"
   tmake_file="$tmake_file t-slibgcc t-slibgcc-gld t-slibgcc-elf-ver"
+  tmake_file="$tmake_file t-slibgcc-libgcc"
   # NetBSD 1.7 and later are set up to use GCC's crtstuff for
   # ELF configurations. We will clear extra_parts in the
   # a.out configurations.
Re: committed: Fix NetBSD problem PR80600
On Mon, 15 May 2017, Krister Walfridsson wrote:

> I have committed the attached patch to make NetBSD handle -lgcc correctly
> for shared libraries.
>
> gcc/ChangeLog:
> PR target/80600
> * config/netbsd.h (NETBSD_LIBGCC_SPEC): Always add -lgcc.
>
> libgcc/ChangeLog:
> PR target/80600
> * config.host (*-*-netbsd*): Add t-slibgcc-libgcc to tmake_file.

Forgot to say: bootstrapped and tested on i386-unknown-netbsdelf6.1 and x86_64-unknown-netbsd6.1.

   /Krister
[patch] build xz (instead of bz2) compressed tarballs and diffs
As discussed on IRC with Jakub and Richard, here is a small patch which builds xz compressed tarballs and diff files. Tested with

  maintainer-scripts/gcc_release \
    -s snap:trunk -p diffs sources tarfiles

  maintainer-scripts/gcc_release \
    -s snap:trunk -p diffs sources tarfiles

and checked that the new tarball and diff files are compressed using xz.

Ok for the trunk and the gcc-7-branch?

Matthias

maintainer-scripts/

2017-05-14  Matthias Klose

	* gcc_release (build_gzip): Build xz tarball instead of bz2 tarball.
	(build_diffs): Handle building diffs from either bz2 or xz tarballs,
	compress diffs using xz instead of bz2.
	(build_diff): Likewise.
	(upload_files): Check for *.xz files instead of *.bz2 files.
	(announce_snapshot): Announce xz tarball instead of bz2 tarball.
	(XZ): New definition.
	(): Look for both bz2 and xz compressed old tarballs.

Index: maintainer-scripts/gcc_release
===
--- maintainer-scripts/gcc_release (revision 248041)
+++ maintainer-scripts/gcc_release (working copy)
@@ -221,7 +221,7 @@
     # Create a "MD5SUMS" file to use for checking the validity of the release.
     echo \
 "# This file contains the MD5 checksums of the files in the
-# gcc-"${RELEASE}".tar.bz2 tarball.
+# gcc-"${RELEASE}".tar.xz tarball.
 #
 # Besides verifying that all files in the tarball were correctly expanded,
 # it also can be used to determine if any files have changed since the
@@ -244,11 +244,11 @@
 build_tarfile() {
   # Get the name of the destination tar file.
-  TARFILE="$1.tar.bz2"
+  TARFILE="$1.tar.xz"
   shift
 
   # Build the tar file itself.
-  (${TAR} cf - "$@" | ${BZIP2} > ${TARFILE}) || \
+  (${TAR} cf - "$@" | ${XZ} > ${TARFILE}) || \
     error "Could not build tarfile"
   FILE_LIST="${FILE_LIST} ${TARFILE}"
 }
@@ -273,8 +273,8 @@
 # Build .gz files.
 build_gzip() {
   for f in ${FILE_LIST}; do
-    target=${f%.bz2}.gz
-    (${BZIP2} -d -c $f | ${GZIP} > ${target}) || error "Could not create ${target}"
+    target=${f%.xz}.gz
+    (${XZ} -d -c $f | ${GZIP} > ${target}) || error "Could not create ${target}"
   done
 }
 
@@ -282,12 +282,19 @@
 build_diffs() {
   old_dir=${1%/*}
   old_file=${1##*/}
-  old_vers=${old_file%.tar.bz2}
+  case "$old_file" in
+    *.tar.xz) old_vers=${old_file%.tar.xz};;
+    *) old_vers=${old_file%.tar.bz2};;
+  esac
   old_vers=${old_vers#gcc-}
   inform "Building diffs against version $old_vers"
   for f in gcc; do
-    old_tar=${old_dir}/${f}-${old_vers}.tar.bz2
-    new_tar=${WORKING_DIRECTORY}/${f}-${RELEASE}.tar.bz2
+    if [ -e ${old_dir}/${f}-${old_vers}.tar.xz ]; then
+      old_tar=${old_dir}/${f}-${old_vers}.tar.xz
+    else
+      old_tar=${old_dir}/${f}-${old_vers}.tar.bz2
+    fi
+    new_tar=${WORKING_DIRECTORY}/${f}-${RELEASE}.tar.xz
     if [ ! -e $old_tar ]; then
       inform "$old_tar not found; not generating diff file"
     elif [ ! -e $new_tar ]; then
@@ -294,7 +301,7 @@
       inform "$new_tar not found; not generating diff file"
     else
       build_diff $old_tar gcc-${old_vers} $new_tar gcc-${RELEASE} \
-        ${f}-${old_vers}-${RELEASE}.diff.bz2
+        ${f}-${old_vers}-${RELEASE}.diff.xz
     fi
   done
 }
@@ -305,13 +312,20 @@
   tmpdir=gccdiff.$$
   mkdir $tmpdir || error "Could not create directory $tmpdir"
   changedir $tmpdir
-  (${BZIP2} -d -c $1 | ${TAR} xf - ) || error "Could not unpack $1 for diffs"
-  (${BZIP2} -d -c $3 | ${TAR} xf - ) || error "Could not unpack $3 for diffs"
-  ${DIFF} $2 $4 > ../${5%.bz2}
+  case "$1" in
+    *.tar.bz2)
+      (${BZIP2} -d -c $1 | ${TAR} xf - ) || error "Could not unpack $1 for diffs"
+      ;;
+    *.tar.xz)
+      (${XZ} -d -c $1 | ${TAR} xf - ) || error "Could not unpack $1 for diffs"
+      ;;
+  esac
+  (${XZ} -d -c $3 | ${TAR} xf - ) || error "Could not unpack $3 for diffs"
+  ${DIFF} $2 $4 > ../${5%.xz}
   if [ $? -eq 2 ]; then
     error "Trouble making diffs from $1 to $3"
   fi
-  ${BZIP2} ../${5%.bz2} || error "Could not generate ../$5"
+  ${XZ} ../${5%.xz} || error "Could not generate ../$5"
   changedir ..
   rm -rf $tmpdir
   FILE_LIST="${FILE_LIST} $5"
@@ -335,7 +349,7 @@
   fi
 
   # Then copy files to their respective (sub)directories.
-  for x in gcc*.gz gcc*.bz2; do
+  for x in gcc*.gz gcc*.xz; do
     if [ -e ${x} ]; then
       # Make sure the file will be readable on the server.
       chmod a+r ${x}
@@ -410,7 +424,7 @@
 " > ${SNAPSHOT_INDEX}
 
-  snapshot_print gcc-${RELEASE}.tar.bz2 "Complete GCC"
+  snapshot_print gcc-${RELEASE}.tar.xz "Complete GCC"
 
   echo \
 "Diffs from "${BRANCH}"-"${LAST_DATE}" are available in the
 diffs/ subdirectory.
@@ -528,12 +542,13 @@
 MODE_TARFILES=0
 MODE_UPLOAD=0
 
-# List of archive files generated; used to create .gz files from .bz2.
+# List of archive files generated; used to create .gz files from .xz.
 FILE_LIST=""
 
 # Programs we use.
 
 BZIP2="${BZ
Re: [PATCH] [i386] Recompute the frame layout less often
On 05/14/2017 11:31 AM, Bernd Edlinger wrote:
> Hi Daniel,
>
> there is one thing I don't understand in your patch:
> That is, it introduces a static value:
>
> /* Registers who's save & restore will be managed by stubs called from
>    pro/epilogue. */
> static HARD_REG_SET GTY(()) stub_managed_regs;
>
> This seems to be set as a side effect of ix86_compute_frame_layout,
> and depends on the register usage of the current function.
> But values that depend on the current function need usually be
> attached to cfun->machine, because the passes can run in parallel
> unless I am completely mistaken, and the stub_managed_regs may
> therefore be computed from a different function.
>
> Bernd.

I should add that if you want to run faster tests just on the ms to sysv abi code, you can use

  make RUNTESTFLAGS="ms-sysv.exp" check

and then if that succeeds run the full testsuite.

Daniel
Re: [PATCH 2/N] Add dump_flags_type for handling of suboptions.
On 05/12/2017 11:42 AM, Martin Sebor wrote:
> On 05/05/2017 04:44 AM, Martin Liška wrote:
>> Hi.
>>
>> This one is more interesting as it implements hierarchical option
>> parsing and as a first step I implemented that for optgroup suboptions.
>
> I haven't gone through the rest of the patches so I could be missing
> some context. But I have a few observations about and suggestions for
> the new dump_option_node (and to a smaller extent, dump_flags_type)
> class template. (Feel free to disregard whatever doesn't make sense.)
>
> First, since dump_option_node's public member functions (including
> the only ctor) are defined in dumpfile.c it seems that the template
> definition doesn't really need to be provided in the header. If that's
> correct, it would reduce coupling and might avoid bloat to only declare
> the bare minimum needed to use the type in other .c files, and define
> the rest in the one .c file where the complete type is actually needed.
>
> As far as I can see, there is just one instantiation of the template
> in the patch, in dumpfile.c:
>
> + typedef dump_option_node node;
>
> If there are no other existing instantiations and no others are
> anticipated, it could even be a plain class (further reducing
> complexity and bloat).
>
> Finally, the template ctor takes a const char* and stores it in
> a member, and the implicit copy ctor and assignment operator copy
> the underlying vec class. That means that the string argument must
> outlive the constructed object, which is not typically expected.
> IIUC, vec is not safely copy-assignable or copy-constructible (i.e.,
> has undefined behavior). At a minimum, it would be appropriate to
> document these constraints. Removing them or preventing copy
> construction and assignment to avoid getting bit by it would be even
> better.
>
> For dump_flags_type, none of the members need to be explicitly
> declared inline or mention the template argument list.
> operator&(dump_flags_type) should be declared const.
> Compound assignment operators should return a reference to *this
> (to behave as closely as possible to the native operators). The
> exclusive OR operator and compound assignment are missing and should
> probably be provided for completeness, even if they aren't needed.
> The m_mask member should probably be private for better encapsulation
> (and operator uint64_t() const provided to provide implicit conversion
> to that type, or perhaps one converting to E's underlying type; that
> should also obviate the need for operator bool).

One other comment/suggestion I meant to add but forgot:

It's best to make binary operators like operator| non-members. That way, they are symmetric in terms of implicit conversions of their operands. E.g., with dump_flags_type which defines a conversion ctor from uint64_t (whether or not that's a good approach is a separate topic), it's expected that if

  dump_flags_type () | uint64_t (1)

is valid, then

  uint64_t (1) | dump_flags_type ()

is valid as well. With operator| being a dump_flags_type member the former would be valid but the latter would not be.

In other words, the general rule is to only define member functions for the smallest set of operations that cannot be non-members (like compound assignments).

Martin
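The symmetry argument above can be sketched with a small stand-alone class. The class `flags` below is a hypothetical stand-in and not the dump_flags_type from the patch; it only shows compound assignments as members returning `*this` and the binary operators as non-members built on top of them. One deliberate deviation from the suggestions in this thread: the sketch exposes the mask through a named `value()` accessor, because combining a converting constructor from `uint64_t` with an implicit `operator uint64_t()` would make mixed expressions such as `flags () | uint64_t (1)` ambiguous with the built-in `|`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag wrapper used only to illustrate the operator layout.
class flags {
public:
  // Converting constructor from the underlying integer type.
  flags(std::uint64_t mask = 0) : m_mask(mask) {}

  // Compound assignments must be members; they return *this so that
  // they chain like the built-in operators.
  flags &operator|=(const flags &rhs) { m_mask |= rhs.m_mask; return *this; }
  flags &operator&=(const flags &rhs) { m_mask &= rhs.m_mask; return *this; }
  flags &operator^=(const flags &rhs) { m_mask ^= rhs.m_mask; return *this; }

  // Named accessor instead of an implicit conversion (see lead-in).
  std::uint64_t value() const { return m_mask; }

private:
  std::uint64_t m_mask;
};

// Binary operators as non-members: both operands get the same implicit
// conversions, so (flags | integer) and (integer | flags) both compile.
inline flags operator|(flags lhs, const flags &rhs) { lhs |= rhs; return lhs; }
inline flags operator&(flags lhs, const flags &rhs) { lhs &= rhs; return lhs; }
inline flags operator^(flags lhs, const flags &rhs) { lhs ^= rhs; return lhs; }
```

Had `operator|` been a member of `flags`, the expression with the integer on the left-hand side would not compile, which is the asymmetry the message warns about.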
More OpenACC 2.5 Profiling Interface (was: OpenACC 2.5 Profiling Interface (incomplete))
Hi!

On Tue, 28 Feb 2017 18:43:36 +0100, I wrote:
> The 2.5 versions of the OpenACC standard added a new chapter "Profiling
> Interface". In r245784, I committed incomplete support to
> gomp-4_0-branch. I plan to continue working on this, but wanted to
> synchronize at this point.
>
> commit b22a85fe7f3daeb48460e7aa28606d0cdb799f69
> Author: tschwinge
> Date: Tue Feb 28 17:36:03 2017 +
>
> OpenACC 2.5 Profiling Interface (incomplete)

Committed to gomp-4_0-branch in r248042:

commit e3720963a1f494b2a0a1b6c28d5eb8bfb7c0d546
Author: tschwinge
Date: Mon May 15 06:50:17 2017 +

    More OpenACC 2.5 Profiling Interface

    libgomp/
    * oacc-async.c (acc_async_test, acc_async_test_all, acc_wait)
    (acc_wait_async, acc_wait_all, acc_wait_all_async): Set up
    profiling.
    * oacc-cuda.c (acc_get_current_cuda_device)
    (acc_get_current_cuda_context, acc_get_cuda_stream)
    (acc_set_cuda_stream): Likewise.
    * oacc-init.c (acc_set_device_type, acc_get_device_type)
    (acc_get_device_num): Likewise.
    * oacc-mem.c (acc_malloc, acc_free, memcpy_tofrom_device)
    (acc_map_data, acc_unmap_data, present_create_copy)
    (delete_copyout, update_dev_host): Likewise.
    * oacc-parallel.c (GOACC_data_start, GOACC_data_end)
    (GOACC_enter_exit_data, GOACC_update, GOACC_wait): Likewise.
    * oacc-profiling.c (goacc_profiling_setup_p): New function.
    (goacc_profiling_dispatch_p): Add a "bool" formal parameter.
    Adjust all users.
    * oacc-int.h (goacc_profiling_setup_p)
    (goacc_profiling_dispatch_p): Update.
    * plugin/plugin-nvptx.c (nvptx_exec, nvptx_wait, nvptx_wait_all):
    Generate more profiling events.
    * libgomp.texi (OpenACC Profiling Interface): Update.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248042 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp        |  24 +++
 libgomp/libgomp.texi          |  74 +++--
 libgomp/oacc-async.c          | 110 -
 libgomp/oacc-cuda.c           |  82 --
 libgomp/oacc-init.c           | 102 +++-
 libgomp/oacc-int.h            |   4 +-
 libgomp/oacc-mem.c            | 154 +-
 libgomp/oacc-parallel.c       | 357 +++---
 libgomp/oacc-profiling.c      | 100 +++-
 libgomp/plugin/plugin-nvptx.c | 113 -
 10 files changed, 1056 insertions(+), 64 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 5dc0889..23882cf 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,27 @@
+2017-05-15 Thomas Schwinge
+
+	* oacc-async.c (acc_async_test, acc_async_test_all, acc_wait)
+	(acc_wait_async, acc_wait_all, acc_wait_all_async): Set up
+	profiling.
+	* oacc-cuda.c (acc_get_current_cuda_device)
+	(acc_get_current_cuda_context, acc_get_cuda_stream)
+	(acc_set_cuda_stream): Likewise.
+	* oacc-init.c (acc_set_device_type, acc_get_device_type)
+	(acc_get_device_num): Likewise.
+	* oacc-mem.c (acc_malloc, acc_free, memcpy_tofrom_device)
+	(acc_map_data, acc_unmap_data, present_create_copy)
+	(delete_copyout, update_dev_host): Likewise.
+	* oacc-parallel.c (GOACC_data_start, GOACC_data_end)
+	(GOACC_enter_exit_data, GOACC_update, GOACC_wait): Likewise.
+	* oacc-profiling.c (goacc_profiling_setup_p): New function.
+	(goacc_profiling_dispatch_p): Add a "bool" formal parameter.
+	Adjust all users.
+	* oacc-int.h (goacc_profiling_setup_p)
+	(goacc_profiling_dispatch_p): Update.
+	* plugin/plugin-nvptx.c (nvptx_exec, nvptx_wait, nvptx_wait_all):
+	Generate more profiling events.
+	* libgomp.texi (OpenACC Profiling Interface): Update.
+
 2017-05-14 Thomas Schwinge
 
 	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: New
diff --git libgomp/libgomp.texi libgomp/libgomp.texi
index 93365cd..b3fa139 100644
--- libgomp/libgomp.texi
+++ libgomp/libgomp.texi
@@ -3207,12 +3207,19 @@ Will be @code{acc_construct_parallel} for OpenACC kernels constructs; should be
 @code{acc_construct_kernels}.
 
 @item
+Will be @code{acc_construct_enter_data} or
+@code{acc_construct_exit_data} when processing variable mappings
+specified in OpenACC declare directives; should be
+@code{acc_construct_declare}.
+
+@item
 For implicit @code{acc_ev_device_init_start},
 @code{acc_ev_device_init_end}, and explicit as well as implicit
 @code{acc_ev_alloc}, @code{acc_ev_free},
 @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
-@code{acc_ev_enqueue_download_start}, and
-@code{acc_ev_enqueue_download_end}, will be
+@code{acc_ev_enqueue_download_start},
+@code{acc_ev_enqueue_download_end}, @code{acc_ev_wait_start}, and
+@code{acc_ev_wait_end}, will be
 @code{acc