Re: RFC - Refactor tree.h
Mike Stump wrote:
>On Aug 10, 2013, at 3:03 AM, Richard Biener wrote:
>> Mike Stump wrote:
>>> On Aug 9, 2013, at 3:36 PM, Diego Novillo wrote:
>>>> This patch is still WIP. It builds stage1, but I'm getting ICEs
>>>> during stage 2. The patch splits tree.h into three files:
>>>> - tree-core.h: All data structures, enums and typedefs from tree.h
>>>> - tree-api.h: All extern function definitions from tree.h
>>>> - tree-macros.h: All macro accessors, tree checks and other inline
>>>>   functions.
>>>
>>> I don't like this split. You focus in on the details and sort code by
>>> detail. I think this is wrong.
>
>> I mostly agree - tree-macros.h is a red herring. It should be
>> tree-core.h and tree.h only.
>
>I disagree. core isn't a concept that should be binned into. control
>flow, call graph, register, arm, alias, allocation, attribute, builtin,
>type, eval, jit, symbol, file, floating point, pass, block, stack,
>constant, hash, map, range, memory, debug, dump, elf, dwarf, operator,
>value, vector, declarations, int, statements, object, storage,
>expressions, frame, error, values, mapping, the list is endless. core
>is like a bin for important, functions that begin with a, functions I
>wrote, big functions, functions implemented with templates, trivial
>functions, hard to grasp concepts, simple things, things added in the
>last year, old things, fun things, extra things, useful thing, unsorted
>things, often used things, and so on… core goes in exactly the wrong
>long term direction.

Fact is that we need to separate the internal details of tree.h into something shareable from exactly two places. Tree-core.h and tree.h are both 'tree.h' in some way. Call it tree-internal.h or tree1.h.

The goal is to have two distinct and conflicting APIs to trees, one exposed from gimple.h and one from tree.h.

And yes, that's a transitional thing - but possibly a very long living one...

Richard.
Re: [Patch] Regex back-reference support
Hi,

>I have to use a vector, because I need to iterate it while
>manipulating it as a stack.

Ok. Strictly speaking, you couldn't do that with a stack. Maybe you should say something like "we add and remove elements FILO". But don't bother for now.

Paolo
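The point about iterating a container that is otherwise used as a stack can be illustrated with a small sketch. This is not code from the regex patch: the OpenGroup type and the helper names are hypothetical and exist only for illustration. It shows std::vector giving FILO push/pop at the back while still allowing a bottom-to-top walk, which std::stack does not expose.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical element type; the name is an assumption made for this
// sketch and is not taken from the regex patch.
struct OpenGroup {
    int index;
};

// Pushing at the back ...
void push_group(std::vector<OpenGroup>& groups, int index) {
    groups.push_back(OpenGroup{index});
}

// ... and popping from the back gives FILO order, as std::stack would.
OpenGroup pop_group(std::vector<OpenGroup>& groups) {
    assert(!groups.empty());
    OpenGroup top = groups.back();
    groups.pop_back();
    return top;
}

// Unlike std::stack, the vector can also be walked from bottom to top
// without popping anything.
std::string describe(const std::vector<OpenGroup>& groups) {
    std::string out;
    for (const OpenGroup& g : groups) {
        if (!out.empty())
            out += ",";
        out += std::to_string(g.index);
    }
    return out;
}
```

std::stack deliberately hides iteration, so a container that needs both behaviours has to be the underlying sequence itself.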
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
> I see, yes LTO can deal with this better since it has global
> information. In non-LTO mode (including LIPO) we have the issue.

Either Martin or I will implement merging of the multiple copies at LTO link time. This is needed for Martin's code unification patch anyway.

Theoretically, the gcov runtime can also keep symbol names and CFG checksums of comdats in the static data and, at exit, produce buckets based on matching names + checksums + counter counts, merge all data in each bucket into one representative using the existing merging routines, and then memcpy the result to all the original copies. This way all compilation units will receive the same results.

I am not very keen on making the gcov runtime bigger and more complex than it needs to be, but having a sane profile for comdats seems quite important. Perhaps, in the GNU toolchain, ordered subsections can be used to make the linker produce an ordered list of comdats, so the runtime won't need to do hashing + lookups.

Honza

> I take it gimp is built with LTO and therefore shouldn't be hitting
> this comdat issue?
>
> Let me do a couple things:
> - port over my comdat inlining fix from the google branch to trunk and
> send it for review. If you or Martin could try it to see if it helps
> with function splitting to avoid the hits from the cold code that
> would be great
> - I'll add some new sanity checking to try to detect non-zero blocks
> in the cold section, or 0 blocks reached by non-zero edges and see if
> I can flush out any problems with my tests or a profiledbootstrap or
> gimp.
> - I'll try building and profiling gimp myself to see if I can
> reproduce the issue with code executing out of the cold section.
>
> Thanks,
> Teresa
>
>>> Also, can you send me reproduction instructions for gimp? I don't
>>> think I need Martin's patch, but which version of gimp and what is the
>>> equivalent way for me to train it? I have some scripts to generate a
>>> similar type of instruction heat map graph that I have been using to
>>> tune partitioning and function reordering. Essentially it uses linux
>>> perf to sample on instructions_retired and then munge the data in
>>> several ways to produce various stats and graphs. One thing that has
>>> been useful has been to combine the perf data with nm output to
>>> determine which cold functions are being executed at runtime.
>>
>> Martin?
>>
>>> However, for this to tell me which split cold bbs are being executed I
>>> need to use a patch that Sri sent for review several months back that
>>> gives the split cold section its own name:
>>> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01571.html
>>> Steven had some follow up comments that Sri hasn't had a chance to address
>>> yet:
>>> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00798.html
>>> (cc'ing Sri as we should probably revive this patch soon to address
>>> gdb and other issues with detecting split functions properly)
>>
>> Interesting, I used a linker script for this purpose, but that is GNU ld
>> only...
>>
>> Honza
>>>
>>> Thanks!
>>> Teresa
>>>
>>>> Honza
>>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>>
>>>>>> I think we are really looking primarily for dead parts of the
>>>>>> functions (sanity checks/error handling)
>>>>>> that should not be visited by train run. We can then see how to make
>>>>>> the heuristic more aggressive?
>>>>>>
>>>>>> Honza
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
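The bucket-merging idea in this message (group comdat copies by name, CFG checksum and counter count, merge each bucket into one representative, copy the result back to every copy) can be sketched roughly as follows. This is not libgcov code: the ComdatCopy record and merge_comdat_copies are hypothetical names, and plain summation stands in for "the existing merging routines", which is an assumption that would only fit additive counters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one copy of a comdat function's profile data.
struct ComdatCopy {
    std::string name;                    // symbol name
    std::uint32_t cfg_checksum;          // checksum of the function's CFG
    std::vector<std::int64_t> counters;  // profile counters of this copy
};

// Copies with equal name, checksum and counter count share a bucket.
using BucketKey = std::tuple<std::string, std::uint32_t, std::size_t>;

BucketKey bucket_key(const ComdatCopy& c) {
    return std::make_tuple(c.name, c.cfg_checksum, c.counters.size());
}

void merge_comdat_copies(std::vector<ComdatCopy>& copies) {
    std::map<BucketKey, std::vector<std::int64_t>> merged;

    // Pass 1: merge every copy into its bucket's representative.
    // Summation is an assumption standing in for the real merge routines.
    for (const ComdatCopy& c : copies) {
        std::vector<std::int64_t>& sum = merged[bucket_key(c)];
        sum.resize(c.counters.size(), 0);
        for (std::size_t i = 0; i < c.counters.size(); ++i)
            sum[i] += c.counters[i];
    }

    // Pass 2: copy the representative back to all original copies.
    for (ComdatCopy& c : copies) {
        auto it = merged.find(bucket_key(c));
        assert(it != merged.end());
        c.counters = it->second;
    }
}
```

The std::map lookup here is exactly the "hashing + lookups" cost the message hopes to avoid with an ordered list of comdats produced by the linker.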
ping^3: [patch] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407)
Hi, [patch update] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407) http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00775.html Message-ID: <20130514191244.ga12...@host2.jankratochvil.net> Thanks, Jan
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
Cc'ing Rong since he is also working on trying to address the comdat profile issue. Rong, you may need to see an earlier message for more context:
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html

Teresa

On Sun, Aug 11, 2013 at 5:21 AM, Jan Hubicka wrote:
>> I see, yes LTO can deal with this better since it has global
>> information. In non-LTO mode (including LIPO) we have the issue.
>
> Either Martin or I will implement merging of the multiple copies at
> LTO link time. This is needed for Martin's code unification patch anyway.
>
> Theoretically, the gcov runtime can also keep symbol names and CFG
> checksums of comdats in the static data and, at exit, produce buckets
> based on matching names + checksums + counter counts, merge all data in
> each bucket into one representative using the existing merging routines,
> and then memcpy the result to all the original copies. This way all
> compilation units will receive the same results.
>
> I am not very keen on making the gcov runtime bigger and more complex
> than it needs to be, but having a sane profile for comdats seems quite
> important. Perhaps, in the GNU toolchain, ordered subsections can be used
> to make the linker produce an ordered list of comdats, so the runtime
> won't need to do hashing + lookups.
>
> Honza

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets
Given that I did not receive any feedback on my earlier email on this topic, I would like to send this patch for RFC. I'm no expert at this configury stuff, so please try to comment on both the test proposed and my actual implementation :)

The idea is to find a patch which both catches probable issues early on for most x86_64-linux users, yet does not make the build more complex for our power users. So, I propose to include a specific check in the toplevel configure.

The cumulative conditions I suggest, in order to make it as unobtrusive as possible for current users, are:

  1. if we build a native compiler,
  2. on x86_64-linux (and possibly other x86_64 targets whose maintainers want to opt in),
  3. and neither --enable-multilib nor --disable-multilib was passed

then:

  a. we check that the native compiler can handle 32-bit, by compiling a test executable with the "-m32" option
  b. if we fail, we error out of the configure process, indicating that this can be overridden with --{enable,disable}-multilib

I suspect this might catch (at configure time) the large majority of users who currently get stuck at stage 2 with the "gnu/stubs-32.h" error, while being invisible to a large majority of the power users.

So, what do you think?

FX

Index: configure.ac
===
--- configure.ac	(revision 201292)
+++ configure.ac	(working copy)
@@ -2861,6 +2861,26 @@ case "${target}" in
     ;;
 esac
 
+# Special user-friendly check for native x86_64-linux build, if
+# multilib is not explicitly enabled.
+case "$target:$have_compiler:$host:$target:$enable_multilib" in
+  x86_64-*linux*:yes:$build:$build:)
+    # Make sure we have a development environment that handles 32-bit
+    dev64=no
+    echo "int main () { return 0; }" > conftest.c
+    ${CC} -m32 -o conftest ${CFLAGS} ${CPPFLAGS} ${LDFLAGS} conftest.c
+    if test $? = 0 ; then
+      if test -s conftest || test -s conftest.exe ; then
+        dev64=yes
+      fi
+    fi
+    rm -f conftest*
+    if test x${dev64} != xyes ; then
+      AC_MSG_ERROR([I suspect your system does not have 32-bit development libraries (libc and headers). If you have them, rerun configure with --enable-multilib. If you do not have them, and want to build a 64-bit-only compiler, rerun configure with --disable-multilib.])
+    fi
+    ;;
+esac
+
 # Default to --enable-multilib.
 if test x${enable_multilib} = x ; then
   target_configargs="--enable-multilib ${target_configargs}"
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
> Hello,
> I did a collection of systemtap graphs for GIMP.
>
> All these graphs were created with enabled LTO, profiling and -O2.
>
> 1) gimp-reordered.pdf - function are reordered according to my newly
> created profile that utilizes LTO infrastructure
> 2) gimp-no-top-level-reorder.pdf - (GCC rev. 201648) -fno-top-level-reorder
> 3) gimp-top-level-reorder.pdf - (GCC rev. 201648) -ftop-level-reorder

Thanks for the graphs! gimp-top-level-reorder seems to be bogus (it shows accesses into dynstr only).

To catch the -freorder-blocks-and-partition problem, perhaps you can modify Martin's linker script to make the .text.unlikely section non-executable. This way the application will crash every time we jump into it.

Honza

> Honza has an idea how to minimize hot text section and I will send new
> graphs for the proposed patch.
> Moreover, I will send graphs for Inkscape which is written in C++.
>
> Have a nice day,
> Martin
>
> On 11 August 2013 15:25, Teresa Johnson wrote:
> > Cc'ing Rong since he is also working on trying to address the comdat
> > profile issue. Rong, you may need to see an earlier message for more
> > context:
> > http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html
> >
> > Teresa
Lambda templates and implicit function templates.
Hi Jason,

I decided to go ahead and submit the latest cleaned up version of the generic lambda and implicit function template patches. I think all review comments have been addressed. As well as the cleanup there are a few enhancements; generic lambda instantiations in diagnostics now show template argument bindings and the appropriate -std=xxx guards have been put in place. I've also fixed a couple of cases where invalid user code caused an ICE.

I have yet to write tests for individual features to be included in the source tree, but intend to when I get some more time. In the meantime I've included the two main tests I've used for trials.

Cheers,
Adam

Patch summary (3):
  Support lambda templates.
  Support using 'auto' in a function parameter list to introduce an implicit template parameter.
  Support dumping type bindings in lambda diagnostics.

 gcc/cp/cp-tree.h | 11
 gcc/cp/decl.c    | 19 +-
 gcc/cp/decl2.c   | 5 +-
 gcc/cp/error.c   | 22 +++
 gcc/cp/lambda.c  | 87 +++--
 gcc/cp/parser.c  | 98 +---
 gcc/cp/pt.c      | 193 ++-
 7 files changed, 375 insertions(+), 60 deletions(-)

Basic compile and runtime check:

/* Implicit function templates at namespace scope.  */

auto f1 (auto& a, auto const& b) { return a += b; }

template <typename A>
auto f2 (A& a, auto const& b) { return a += b; }

template <typename B>
auto f3 (auto& a, B const& b) { return a += b; }

template <typename A, typename B>
auto f4 (A& a, B const& b) { return a += b; }

struct S
{
  /* Implicit non-static member function templates.  */

  auto mf1 (auto& a, auto const& b) { return a += b; }

  template <typename A>
  auto mf2 (A& a, auto const& b) { return a += b; }

  template <typename B>
  auto mf3 (auto& a, B const& b) { return a += b; }

  template <typename A, typename B>
  auto mf4 (A& a, B const& b) { return a += b; }

  /* Implicit static member function templates.  */

  static auto smf1 (auto& a, auto const& b) { return a += b; }

  template <typename A>
  static auto smf2 (A& a, auto const& b) { return a += b; }

  template <typename B>
  static auto smf3 (auto& a, B const& b) { return a += b; }

  template <typename A, typename B>
  static auto smf4 (A& a, B const& b) { return a += b; }
};

#undef NDEBUG
#include <cassert>

#define CHECK(A, b, f) do {      \
   A a1 = 5, a2 = 12;            \
   auto r1 = f (a1, b);          \
   auto r2 = f (a2, b);          \
   assert ((#f, a1 == 5 + b));   \
   assert ((#f, a2 == 12 + b));  \
   assert ((#f, r1 == a1));      \
   assert ((#f, r2 == a2));      \
} while (0)

#define INVOKEi(f, A, b, i) do { CHECK (A, b, f ## i); } while (0)
#define INVOKE4(f, A, b) do { INVOKEi (f, A, b, 1); \
                              INVOKEi (f, A, b, 2); \
                              INVOKEi (f, A, b, 3); \
                              INVOKEi (f, A, b, 4); } while (0)

#define AS_FUNi(f, A, b, i) do { CHECK (A, b, f ## i._FUN); } while (0)
#define AS_FUN4(f, A, b) do { AS_FUNi (f, A, b, 1); \
                              AS_FUNi (f, A, b, 2); \
                              AS_FUNi (f, A, b, 3); \
                              AS_FUNi (f, A, b, 4); } while (0)

#define AS_PTRi(f, A, B, b, i) do { A (*pfn) (A&, B const&) = f ## i; \
                                    CHECK (A, b, pfn); } while (0)
#define AS_PTR4(f, A, B, b) do { AS_PTRi (f, A, B, b, 1); \
                                 AS_PTRi (f, A, B, b, 2); \
                                 AS_PTRi (f, A, B, b, 3); \
                                 AS_PTRi (f, A, B, b, 4); } while (0)

int main()
{
  /* Check namespace templates.  */

  INVOKE4 (f, float, 7);
  AS_PTR4 (f, float, int, 7);

  /* Check member templates.  */

  S s;
  INVOKE4 (s.mf, float, 7);
  INVOKE4 (s.smf, float, 7);
  INVOKE4 (S::smf, float, 7);
  AS_PTR4 (s.smf, float, int, 7);
  AS_PTR4 (S::smf, float, int, 7);

  /* Regression check non-template stateless lambda and its conversion
     to function pointer.  */

  auto lf0 = [] (float& a, int const& b) { return a += b; };

  INVOKEi (lf, float, 7, 0);
  AS_FUNi (lf, float, 7, 0);
  AS_PTRi (lf, float, int, 7, 0);

  /* Check stateless lambda templates.  */

  auto lf1 = [] (auto& a, auto const& b) { return a += b; };
  auto lf2 = [] <typename A> (A& a, auto const& b) { return a += b; };
  auto lf3 = [] <typename B> (auto& a, B const& b) { return a += b; };
  auto lf4 = [] <typename A, typename B> (A& a, B const& b) { return a += b; };

  INVOKE4 (lf, float, 7);
  AS_FUN4 (lf, float, 7);
  AS_PTR4 (lf, float, int, 7);

  /* Check capturing lambda templates.  */

  int i;

  auto lc1 = [i] (auto& a, auto const& b) { return a += b; };
  auto lc2 = [i] <typename A> (A& a,
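For readers who want to try the core idea without the patch applied, here is a minimal sketch that stays within standard C++14. It assumes a C++14 compiler and does not use the GNU extensions from the patch (`auto` in ordinary function parameter lists, explicit template parameter lists on lambdas). The function names add_assign and demo_generic_lambda are made up for this sketch; add_assign spells out explicitly the template that, per the patch description, `auto f1 (auto& a, auto const& b)` would introduce implicitly.

```cpp
#include <cassert>

// Explicit function template with two type parameters. Under the patch,
// 'auto f1 (auto& a, auto const& b)' is meant to behave like this.
template <typename A, typename B>
auto add_assign(A& a, B const& b) {
    return a += b;
}

// Standard C++14 generic lambda: each 'auto' parameter introduces an
// invented template parameter on the closure's call operator.
float demo_generic_lambda() {
    auto lf1 = [](auto& a, auto const& b) { return a += b; };
    float x = 5;
    float r = lf1(x, 7);  // deduces A = float, B = int
    assert(r == x);       // the returned value equals the updated argument
    return x;
}
```

Both forms deduce a fresh type per `auto`, so `a` and `b` may have different types, which is what the CHECK macro in the test relies on when it passes a float and an int.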
[PATCH 2/3] Support using 'auto' in a function parameter list to introduce an implicit template parameter.
	* cp-tree.h (struct saved_scope): Add x_fully_implicit_template bit ...
	(fully_implicit_template): ... and provide conventional access to it.
	(type_uses_auto_or_concept): Declare.
	(is_auto_or_concept): Declare.
	(add_implicit_template_parms): Declare.
	(finish_fully_implicit_template): Declare.
	* decl.c (grokdeclarator): Allow 'auto' parameters with -std=gnu++1y,
	or, in lambda parameter lists, with at least -std=c++1y.
	* parser.c (cp_parser_parameter_declaration_list): Count generic
	parameters and call add_implicit_template_parms to synthesize them.
	(cp_parser_direct_declarator): Account for implicit template
	parameters.
	(cp_parser_lambda_declarator_opt): Finish fully implicit template if
	necessary.
	(cp_parser_member_declaration): Likewise.
	(cp_parser_function_definition_after_declarator): Likewise.
	* pt.c (type_uses_auto): Reimplement with ...
	(find_type_usage): ... this new static function.
	(is_auto_or_concept): New function.
	(type_uses_auto_or_concept): New function.
	(make_generic_type_name): New static function.
	(tree_type_is_auto_or_concept): New static function.
	(add_implicit_template_parms): New function.
	(finish_fully_implicit_template): New function.
---
 gcc/cp/cp-tree.h | 11
 gcc/cp/decl.c    | 19 +-
 gcc/cp/parser.c  | 58 ++---
 gcc/cp/pt.c      | 189 ++-
 4 files changed, 254 insertions(+), 23 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8672739..8e5247c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1034,6 +1034,7 @@ struct GTY(()) saved_scope {
   int x_processing_template_decl;
   int x_processing_specialization;
   BOOL_BITFIELD x_processing_explicit_instantiation : 1;
+  BOOL_BITFIELD x_fully_implicit_template : 1;
   BOOL_BITFIELD need_pop_function_context : 1;
 
   int unevaluated_operand;
@@ -1088,6 +1089,12 @@ struct GTY(()) saved_scope {
 #define processing_specialization scope_chain->x_processing_specialization
 #define processing_explicit_instantiation scope_chain->x_processing_explicit_instantiation
 
+/* Nonzero if the function being declared was made a template due to its
+   parameter list containing generic type specifiers (`auto' or concept
+   identifiers) rather than an explicit template parameter list.  */
+
+#define fully_implicit_template scope_chain->x_fully_implicit_template
+
 /* The cached class binding level, from the most recently exited
    class, or NULL if none.  */
@@ -5454,12 +5461,16 @@ extern tree make_auto (void);
 extern tree make_decltype_auto (void);
 extern tree do_auto_deduction (tree, tree, tree);
 extern tree type_uses_auto (tree);
+extern tree type_uses_auto_or_concept (tree);
 extern void append_type_to_template_for_access_check (tree, tree, tree,
                                                       location_t);
 extern tree splice_late_return_type (tree, tree);
 extern bool is_auto (const_tree);
+extern bool is_auto_or_concept (const_tree);
 extern tree process_template_parm (tree, location_t, tree,
                                    bool, bool);
+extern tree add_implicit_template_parms (size_t, tree);
+extern tree finish_fully_implicit_template (tree);
 extern tree end_template_parm_list (tree);
 extern void end_template_decl (void);
 extern tree maybe_update_decl_type (tree, tree);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d49ed29..223bfd5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10331,8 +10331,23 @@ grokdeclarator (const cp_declarator *declarator,
 
       if (type_uses_auto (type))
         {
-          error ("parameter declared %<auto%>");
-          type = error_mark_node;
+          bool lambda_p = (current_class_type
+                           && LAMBDA_TYPE_P (current_class_type));
+
+          static const char * const lambda_error
+            = "use of %<auto%> in lambda parameter declaration "
+              "only available with "
+              "-std=c++1y or -std=gnu++1y";
+          static const char * const nonlambda_error
+            = "use of %<auto%> in parameter declaration "
+              "only available with "
+              "-std=gnu++1y";
+
+          if (!(cxx_dialect >= cxx1y && (!flag_iso || lambda_p)))
+            {
+              error (lambda_p ? lambda_error : nonlambda_error);
+              type = error_mark_node;
+            }
         }
 
       /* A parameter declared as an array of T is really a pointer to T.
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d32608c..19182bd 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -8931,6 +8931,11 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
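The dialect gate in the grokdeclarator hunk above, `!(cxx_dialect >= cxx1y && (!flag_iso || lambda_p))`, is easier to read as a small truth table. The sketch below uses simplified stand-ins: the Dialect enum and the function auto_parm_allowed are illustrative only and are not GCC's definitions. It assumes, as in GCC, that flag_iso is set for the strict -std=c++NN modes and clear for -std=gnu++NN.

```cpp
// Simplified stand-in for GCC's dialect enumeration; illustrative only.
enum Dialect { cxx98, cxx0x, cxx1y };

// True when an 'auto' parameter is accepted, i.e. the negation of the
// condition that triggers the error in the patch.
constexpr bool auto_parm_allowed(Dialect dialect, bool flag_iso,
                                 bool lambda_p) {
    return dialect >= cxx1y && (!flag_iso || lambda_p);
}

// -std=gnu++1y: accepted for lambdas and for ordinary functions.
static_assert(auto_parm_allowed(cxx1y, false, true), "gnu++1y lambda");
static_assert(auto_parm_allowed(cxx1y, false, false), "gnu++1y function");

// -std=c++1y: accepted in lambda parameter lists only.
static_assert(auto_parm_allowed(cxx1y, true, true), "c++1y lambda");
static_assert(!auto_parm_allowed(cxx1y, true, false), "c++1y function");

// Older dialects: always rejected.
static_assert(!auto_parm_allowed(cxx0x, false, true), "gnu++0x lambda");
```

This matches the ChangeLog entry: generic lambdas are treated as part of the draft standard, while `auto` in ordinary function parameter lists is gated as a GNU extension.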
[PATCH 1/3] Support lambda templates.
	* parser.c (cp_parser_lambda_declarator_opt): Accept template parameter
	list with std=gnu++1y.
	(cp_parser_lambda_body): Don't call 'expand_or_defer_fn' for lambda
	call operator template to avoid adding template result to symbol
	table.
	* lambda.c (lambda_function): Return template result if call operator
	is a template.
	(maybe_add_lambda_conv_op): Support conversion of a non-capturing
	lambda template to a function pointer.
	* decl2.c (check_member_template): Don't reject lambda call operator
	template in local [lambda] class.
	* pt.c (instantiate_class_template_1): Don't instantiate lambda call
	operator template when instantiating lambda class.
---
 gcc/cp/decl2.c  | 5 ++--
 gcc/cp/lambda.c | 87 -
 gcc/cp/parser.c | 40 --
 gcc/cp/pt.c     | 4 ++-
 4 files changed, 110 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index d5d2912..ac9dbd7 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -507,8 +507,9 @@ check_member_template (tree tmpl)
       || (TREE_CODE (decl) == TYPE_DECL
           && MAYBE_CLASS_TYPE_P (TREE_TYPE (decl))))
     {
-      /* The parser rejects template declarations in local classes.  */
-      gcc_assert (!current_function_decl);
+      /* The parser rejects template declarations in local classes
+         (with the exception of generic lambdas).  */
+      gcc_assert (!current_function_decl || LAMBDA_FUNCTION_P (decl));
       /* The parser rejects any use of virtual in a function
          template.  */
       gcc_assert (!(TREE_CODE (decl) == FUNCTION_DECL
                     && DECL_VIRTUAL_P (decl)));
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index a53e692..e9bc7c5 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -196,7 +196,7 @@ lambda_function (tree lambda)
                           /*protect=*/0, /*want_type=*/false,
                           tf_warning_or_error);
   if (lambda)
-    lambda = BASELINK_FUNCTIONS (lambda);
+    lambda = STRIP_TEMPLATE (get_first_fn (lambda));
 
   return lambda;
 }
@@ -759,6 +759,10 @@ maybe_add_lambda_conv_op (tree type)
   if (processing_template_decl)
     return;
 
+  bool generic_lambda_p
+    = (DECL_TEMPLATE_INFO (callop)
+       && DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (callop)) == callop);
+
   if (DECL_INITIAL (callop) == NULL_TREE)
     {
       /* If the op() wasn't instantiated due to errors, give up.  */
@@ -766,7 +770,54 @@ maybe_add_lambda_conv_op (tree type)
       return;
     }
 
-  stattype = build_function_type (TREE_TYPE (TREE_TYPE (callop)),
+  tree fn_result = TREE_TYPE (TREE_TYPE (callop));
+  tree fn_args = copy_list (DECL_CHAIN (DECL_ARGUMENTS (callop)));
+
+  if (generic_lambda_p)
+    {
+      /* Construct the dependent member call for the static member function
+         '_FUN' and remove 'auto' from its return type to allow for simple
+         implementation of the conversion operator.  */
+
+      tree instance = build_nop (type, null_pointer_node);
+      argvec = make_tree_vector ();
+      for (arg = fn_args; arg; arg = DECL_CHAIN (arg))
+        {
+          mark_exp_read (arg);
+          vec_safe_push (argvec, convert_from_reference (arg));
+        }
+
+      tree objfn = build_min (COMPONENT_REF, NULL_TREE,
+                              instance, DECL_NAME (callop), NULL_TREE);
+      call = build_nt_call_vec (objfn, argvec);
+
+      if (type_uses_auto (fn_result))
+        {
+          ++processing_template_decl;
+          fn_result = finish_decltype_type
+            (call, /*id_expression_or_member_access_p=*/false,
+             tf_warning_or_error);
+          --processing_template_decl;
+        }
+    }
+  else
+    {
+      arg = build1 (NOP_EXPR, TREE_TYPE (DECL_ARGUMENTS (callop)),
+                    null_pointer_node);
+      argvec = make_tree_vector ();
+      argvec->quick_push (arg);
+      for (arg = fn_args; arg; arg = DECL_CHAIN (arg))
+        {
+          mark_exp_read (arg);
+          vec_safe_push (argvec, arg);
+        }
+      call = build_call_a (callop, argvec->length (), argvec->address ());
+      CALL_FROM_THUNK_P (call) = 1;
+      if (MAYBE_CLASS_TYPE_P (TREE_TYPE (call)))
+        call = build_cplus_new (TREE_TYPE (call), call, tf_warning_or_error);
+    }
+
+  stattype = build_function_type (fn_result,
                                   FUNCTION_ARG_CHAIN (callop));
 
   /* First build up the conversion op.  */
@@ -794,6 +845,9 @@ maybe_add_lambda_conv_op (tree type)
   if (nested)
     DECL_INTERFACE_KNOWN (fn) = 1;
 
+  if (generic_lambda_p)
+    fn = add_inherited_template_parms (fn, DECL_TI_TEMPLATE (callop));
+
   add_method (type, fn, NULL_TREE);
 
   /* Generic thunk code fails for varargs; we'll complain in mark_used if
@@ -820,8 +874,8 @@ maybe_add_lambda_conv_op (tree type)
   DECL_NOT_REALLY_EXTERN (fn) = 1;
   DECL_DECLARED_INLINE_P (fn) = 1;
   DECL_STATIC_FUNCTION_P (fn) = 1;
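What maybe_add_lambda_conv_op enables for generic lambdas is visible from user code as the conversion of a non-capturing generic lambda to a plain function pointer. The sketch below uses only standard C++14; the alias AddFn and the function as_function_pointer are names invented for this illustration and do not appear in the patch.

```cpp
#include <cassert>

// Target function pointer type; converting to it selects the call
// operator specialization with A = float and B = int.
using AddFn = float (*)(float&, int const&);

AddFn as_function_pointer() {
    // Non-capturing generic lambda: only such lambdas convert to a
    // function pointer.
    auto lf1 = [](auto& a, auto const& b) { return a += b; };

    // The conversion function is itself a template; its arguments are
    // deduced from the target pointer type.
    AddFn pfn = lf1;
    assert(pfn != nullptr);
    return pfn;  // remains valid after the closure object goes away
}
```

The returned pointer refers to a static function generated alongside the closure type, which is why the patch has to build the '_FUN' call as a dependent expression and strip `auto` from its return type first.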
[PATCH 3/3] Support dumping type bindings in lambda diagnostics.
	* error.c (dump_function_decl): Use standard diagnostic flow to dump a
	lambda diagnostic, albeit without stating the function name or
	duplicating the parameter spec (which is dumped as part of the type).
---
 gcc/cp/error.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 440169a..bd5f8cc 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1374,14 +1374,7 @@ dump_function_decl (tree t, int flags)
   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
   vec<tree, va_gc> *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-    {
-      /* A lambda's signature is essentially its "type", so defer.  */
-      gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-      dump_type (DECL_CONTEXT (t), flags);
-      return;
-    }
+  bool lambda_p = false;
 
   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1443,16 +1436,23 @@ dump_function_decl (tree t, int flags)
   else if (cname)
     {
       dump_type (cname, flags);
-      pp_cxx_colon_colon (cxx_pp);
+      if (LAMBDA_TYPE_P (cname))
+        lambda_p = true;
+      else
+        pp_cxx_colon_colon (cxx_pp);
     }
   else
     dump_scope (CP_DECL_CONTEXT (t), flags);
 
-  dump_function_name (t, flags);
+  /* A lambda's signature is essentially its "type", which has already been
+     dumped.  */
+  if (!lambda_p)
+    dump_function_name (t, flags);
 
   if (!(flags & TFF_NO_FUNCTION_ARGUMENTS))
     {
-      dump_parameters (parmtypes, flags);
+      if (!lambda_p)
+        dump_parameters (parmtypes, flags);
 
       if (TREE_CODE (fntype) == METHOD_TYPE)
         {
--
1.8.3
[C++ PATCH] Grammar fix in pt.c comments.
* pt.c: Grammar fix in comments ("it's" to "its"). --- gcc/cp/pt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index ce899ef..78b7a97 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -1986,7 +1986,7 @@ determine_specialization (tree template_id, tree decl_arg_types; /* This is an ordinary member function. However, since -we're here, we can assume it's enclosing class is a +we're here, we can assume its enclosing class is a template class. For example, template struct S { void f(); }; @@ -4337,7 +4337,7 @@ check_default_tmpl_args (tree decl, tree parms, bool is_primary, || DECL_INITIALIZED_IN_CLASS_P (decl))) /* We already checked these parameters when the template was declared, so there's no need to do it again now. This function - was defined in class scope, but we're processing it's body now + was defined in class scope, but we're processing its body now that the class is complete. */ return true; @@ -7555,7 +7555,7 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context, the one of #0. When we encounter #1, we want to store the partial instantiation - of M (template S::M) in it's CLASSTYPE_TI_TEMPLATE. + of M (template S::M) in its CLASSTYPE_TI_TEMPLATE. For all cases other than this "explicit specialization of member of a class template", we just want to store the most general template into -- 1.8.3
Re: [PATCH, vtv update] Fix /tmp directory issues in libvtv
On 08/11/2013 01:08 AM, Caroline Tice wrote:

> OK, I have removed the attempt to use $HOME for the logs; they will now either go into the directory specified by the environment variable VTV_LOGS_DIR, or they will go into the current directory. I also added code to use secure_getenv, rather than getenv, if it is available. Is this patch ok to commit?

> + logs_prefix = secure_getenv ("VTV_LOGS_DIR");
> + if (!logs_prefix || strlen (logs_prefix) == 0)
> +   logs_prefix = (char *) ".";

Hmm. If you fall back to the current directory, using secure_getenv doesn't have the intended security effect. I wonder if we can simply label this functionality as unsafe for SUID/SGID programs, like we (hopefully) do for profiling.

Also, logs_prefix should be declared const char *, then the cast can go away (I hope).

-- Florian Weimer / Red Hat Product Security Team
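A minimal sketch of the const char * suggestion, for illustration only. The helper name choose_logs_prefix is hypothetical and is not part of libvtv; it takes the value already fetched from the environment (by secure_getenv or getenv) so the fallback logic can be shown on its own. With a const-qualified result, the string literal "." needs no cast.

```c
#include <stddef.h>

/* Hypothetical helper (not libvtv code): choose the log directory from an
   already-fetched environment value, falling back to the current
   directory when the variable is unset or empty.  */
const char *
choose_logs_prefix (const char *env_value)
{
  if (env_value == NULL || env_value[0] == '\0')
    return ".";   /* no cast needed: the result is const char * */
  return env_value;
}
```

A caller would write something like `const char *logs_prefix = choose_logs_prefix (secure_getenv ("VTV_LOGS_DIR"));`. Note that this does nothing about the security concern raised above; the fallback to the current directory is still there.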
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
Hi, thinking about it a bit more, I suppose the easiest way is to
1) make separate sets of counters for each comdat and place them into a comdat section named as DECL_COMDAT_GROUP (node) + cfg_checksum + individual_counter_counts. This will make the linker unify the sections for us.
2) extend the API of libgcov initialization so multiple counters can be recorded per file.
3) at merging time, gcov needs to merge all comdat section counters into temporary memory, so multiple merging won't produce bad results
4) counter streaming will need to be updated to deal with separate comdat sections...
5) probably we will want to update histogram production to avoid counting the same comdat many times (this can be done by adding a "processed" flag into the per-function sections).
I don't see any obvious problems with this plan, just that it is quite some work. If you had a chance to implement something along these lines, I think it would help ;)) Honza
> Cc'ing Rong since he is also working on trying to address the comdat > profile issue. Rong, you may need to see an earlier message for more > context: > http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html > > Teresa > > On Sun, Aug 11, 2013 at 5:21 AM, Jan Hubicka wrote: > >> > >> I see, yes LTO can deal with this better since it has global > >> information. In non-LTO mode (including LIPO) we have the issue. > > > > Either Martin or me will implement merging of the multiple copies at > > LTO link time. This is needed for Martin's code unification patch anyway. > > > > Theoretically gcov runtime can also have symbol names and cfg checksums of > > comdats in the static data and at exit produce buckets based on matching > > names+checksums+counter counts, merge all data in each bucket into one > > representative by the existing merging routines and then memcpy them to > > all the original copies. This way all compilation units will receive same > > results. 
> > > > I am not very keen about making gcov runtime bigger and more complex than it > > needs to be, but having sane profile for comdats seems quite important. > > Perhaps, in GNU toolchain, ordered subsections can be used to make linker to > > produce ordered list of comdats, so the runtime won't need to do hashing + > > lookups. > > > > Honza > >> > >> I take it gimp is built with LTO and therefore shouldn't be hitting > >> this comdat issue? > >> > >> Let me do a couple things: > >> - port over my comdat inlining fix from the google branch to trunk and > >> send it for review. If you or Martin could try it to see if it helps > >> with function splitting to avoid the hits from the cold code that > >> would be great > >> - I'll add some new sanity checking to try to detect non-zero blocks > >> in the cold section, or 0 blocks reached by non-zero edges and see if > >> I can flush out any problems with my tests or a profiledbootstrap or > >> gimp. > >> - I'll try building and profiling gimp myself to see if I can > >> reproduce the issue with code executing out of the cold section. > >> > >> Thanks, > >> Teresa > >> > >> >> > >> >> Also, can you send me reproduction instructions for gimp? I don't > >> >> think I need Martin's patch, but which version of gimp and what is the > >> >> equivalent way for me to train it? I have some scripts to generate a > >> >> similar type of instruction heat map graph that I have been using to > >> >> tune partitioning and function reordering. Essentially it uses linux > >> >> perf to sample on instructions_retired and then munge the data in > >> >> several ways to produce various stats and graphs. One thing that has > >> >> been useful has been to combine the perf data with nm output to > >> >> determine which cold functions are being executed at runtime. > >> > > >> > Martin? 
> >> > > >> >> > >> >> However, for this to tell me which split cold bbs are being executed I > >> >> need to use a patch that Sri sent for review several months back that > >> >> gives the split cold section its own name: > >> >> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01571.html > >> >> Steven had some follow up comments that Sri hasn't had a chance to > >> >> address yet: > >> >> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00798.html > >> >> (cc'ing Sri as we should probably revive this patch soon to address > >> >> gdb and other issues with detecting split functions properly) > >> > > >> > Interesting, I used a linker script for this purpose, but that is GNU ld > >> > only... > >> > > >> > Honza > >> >> > >> >> Thanks! > >> >> Teresa > >> >> > >> >> > > >> >> > Honza > >> >> >> > >> >> >> Thanks, > >> >> >> Teresa > >> >> >> > >> >> >> > I think we are really looking primarily for dead parts of the > >> >> >> > functions (sanity checks/error handling) > >> >> >> > that should not be visited by train run. We can then see how to > >> >> >> > make the heuristic more aggressive? > >> >> >> > > >> >> >> > Honza > >> >> >> > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> Teresa Johnson
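The runtime merging idea quoted in this thread (bucket the comdat copies by name + cfg checksum + counter count, merge each bucket into one representative, then copy the result back to every copy) could look roughly like the sketch below. Everything here is an assumption made for illustration: struct comdat_copy and merge_comdat_copies are hypothetical names that do not exist in libgcov, plain summation stands in for the existing per-counter merging routines, and a quadratic scan replaces the hashing or ordered subsections discussed in the mail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one copy of a comdat function's counters.
   None of these names come from libgcov.  */
struct comdat_copy
{
  const char *name;        /* comdat group name */
  unsigned cfg_checksum;   /* checksum of the control-flow graph */
  size_t n_counters;       /* number of counters in this copy */
  long long *counters;     /* counter values, distinct storage per copy */
};

/* Two copies belong to the same bucket when name, checksum and counter
   count all match.  */
static int
same_bucket (const struct comdat_copy *a, const struct comdat_copy *b)
{
  return a->cfg_checksum == b->cfg_checksum
         && a->n_counters == b->n_counters
         && strcmp (a->name, b->name) == 0;
}

/* Sum the counters of all matching copies into the first copy of each
   bucket, then copy the merged values back to every other member.
   Summation is an assumption; real counter kinds may merge differently.  */
void
merge_comdat_copies (struct comdat_copy *copies, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Skip copies whose bucket was already handled via an earlier one.  */
      int seen = 0;
      for (size_t k = 0; k < i && !seen; k++)
        seen = same_bucket (&copies[k], &copies[i]);
      if (seen)
        continue;

      /* Merge every later member into the representative copies[i].  */
      for (size_t j = i + 1; j < n; j++)
        if (same_bucket (&copies[i], &copies[j]))
          for (size_t c = 0; c < copies[i].n_counters; c++)
            copies[i].counters[c] += copies[j].counters[c];

      /* Give every member the merged result.  */
      for (size_t j = i + 1; j < n; j++)
        if (same_bucket (&copies[i], &copies[j]))
          memcpy (copies[j].counters, copies[i].counters,
                  copies[i].n_counters * sizeof (long long));
    }
}
```

After the call, every compilation unit's copy of a given comdat holds the same counters, which is the property the mail is after.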
Re: [patch, fortran] RFD: PR 56666 Allow suppression of zero-trip DO loop warning
Hi Janus, > So: Ok for trunk from my side. > > However, I would prefer to disable the warning by default, but include > it in -Wall. Here's a patch to do just that. Regression-tested (hence the changes to the existing test cases :-) OK for trunk? Once it goes in, I will also update the WWW changes. Thomas 2013-08-03 Thomas Koenig PR fortran/5 * gfortran.h (gfc_option_t): Add warn_zerotrip. * invoke.texi (-Wzerotrip): Document option. * lang.opt (Wzerotrip): Add. * options.c (gfc_init_options): Initialize warn_zerotrip. (set_Wall): Add handling of warn_zerotrip. (gfc_handle_option): Handle OPT_Wzerotrip. * resolve.c (gfc_resolve_iterator): Honor gfc_option.warn_zerotrip; update error message to show how to suppress the warning. 2013-08-03 Thomas Koenig PR fortran/5 * gfortran.dg/do_check_10.f90: New test. * gfortran.dg/array_constructor_11.f90: Add -Wzerotrip to dg-options. * gfortran.dg/array_constructor_18.f90: Likewise. * gfortran.dg/array_constructor_22.f90: Likewise. * gfortran.dg/coarray_15.f90: Likewise. * gfortran.dg/do_1.f90: Add -Wall to dg-options. * gfortran.dg/do_3.F90: Add -Wzerotrip to dg-options. * gfortran.dg/do_check_5.f90: Add -Wall to gd-options. Index: fortran/gfortran.h === --- fortran/gfortran.h (Revision 201448) +++ fortran/gfortran.h (Arbeitskopie) @@ -2252,6 +2252,7 @@ typedef struct int warn_align_commons; int warn_real_q_constant; int warn_unused_dummy_argument; + int warn_zerotrip; int warn_realloc_lhs; int warn_realloc_lhs_all; int warn_compare_reals; Index: fortran/invoke.texi === --- fortran/invoke.texi (Revision 201448) +++ fortran/invoke.texi (Arbeitskopie) @@ -954,6 +954,11 @@ This option is implied by @option{-Wextra}. Warn if the pointer in a pointer assignment might be longer than the its target. This option is implied by @option{-Wall}. +@item -Wzerotrip +@opindex @code{Wzerotrip} +Warn if a @code{DO} loop is known to execute zero times at compile +time. This option is implied by @option{-Wall}. 
+ @item -Werror @opindex @code{Werror} @cindex warnings, to errors Index: fortran/lang.opt === --- fortran/lang.opt (Revision 201448) +++ fortran/lang.opt (Arbeitskopie) @@ -293,6 +293,10 @@ Wunused-dummy-argument Fortran Warning Warn about unused dummy arguments. +Wzerotrip +Fortran Warning +Warn about zero-trip DO loops + cpp Fortran Negative(nocpp) Enable preprocessing Index: fortran/options.c === --- fortran/options.c (Revision 201448) +++ fortran/options.c (Arbeitskopie) @@ -109,6 +109,7 @@ gfc_init_options (unsigned int decoded_options_cou gfc_option.warn_align_commons = 1; gfc_option.warn_real_q_constant = 0; gfc_option.warn_unused_dummy_argument = 0; + gfc_option.warn_zerotrip = 0; gfc_option.warn_realloc_lhs = 0; gfc_option.warn_realloc_lhs_all = 0; gfc_option.warn_compare_reals = 0; @@ -466,6 +467,7 @@ set_Wall (int setting) gfc_option.warn_real_q_constant = setting; gfc_option.warn_unused_dummy_argument = setting; gfc_option.warn_target_lifetime = setting; + gfc_option.warn_zerotrip = setting; warn_return_type = setting; warn_uninitialized = setting; @@ -747,6 +749,10 @@ gfc_handle_option (size_t scode, const char *arg, gfc_option.warn_unused_dummy_argument = value; break; +case OPT_Wzerotrip: + gfc_option.warn_zerotrip = value; + break; + case OPT_fall_intrinsics: gfc_option.flag_all_intrinsics = 1; break; Index: fortran/resolve.c === --- fortran/resolve.c (Revision 201448) +++ fortran/resolve.c (Arbeitskopie) @@ -6282,8 +6282,10 @@ gfc_resolve_iterator (gfc_iterator *iter, bool rea sgn = mpfr_sgn (iter->step->value.real); cmp = mpfr_cmp (iter->end->value.real, iter->start->value.real); } - if ((sgn > 0 && cmp < 0) || (sgn < 0 && cmp > 0)) - gfc_warning ("DO loop at %L will be executed zero times", + if (gfc_option.warn_zerotrip && + ((sgn > 0 && cmp < 0) || (sgn < 0 && cmp > 0))) + gfc_warning ("DO loop at %L will be executed zero times" + " (use -Wno-zerotrip to suppress)", &iter->step->where); } Index: testsuite/gfortran.dg/array_constructor_11.f90 
=== --- testsuite/gfortran.dg/array_constructor_11.f90 (Revision 201448) +++ testsuite/gfortran.dg/array_constructor_11.f90 (Arbeitskopie) @@ -1,6 +1,7 @@ ! Like array_constructor_6.f90, but check iterators with non-default stride, ! including combinations which lead to zero-length vec
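For readers unfamiliar with the check that the new option guards, here is a small stand-alone sketch of the zero-trip condition from the resolve.c hunk, restricted to integer bounds. The function name do_loop_is_zero_trip is hypothetical; only the final boolean expression is taken from the patch, where sgn is the sign of the step and cmp compares the end value with the start value.

```c
/* Return nonzero if a DO loop with constant bounds START and END and a
   nonzero constant STEP executes zero times.  Illustrative only; the
   patch performs the same test on GMP/MPFR values.  */
int
do_loop_is_zero_trip (long start, long end, long step)
{
  int sgn = (step > 0) - (step < 0);        /* sign of the step */
  int cmp = (end > start) - (end < start);  /* end compared with start */
  return (sgn > 0 && cmp < 0) || (sgn < 0 && cmp > 0);
}
```

So `DO i = 5, 1` (step 1) is a zero-trip loop, while `DO i = 5, 1, -1` and `DO i = 3, 3` are not.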
[C++ testcase, committed] PR 53349
Hi, committed to mainline. Thanks, Paolo. 2013-08-11 Paolo Carlini PR c++/53349 * g++.dg/cpp0x/constexpr-ice8.C: New. Index: g++.dg/cpp0x/constexpr-ice8.C === --- g++.dg/cpp0x/constexpr-ice8.C (revision 0) +++ g++.dg/cpp0x/constexpr-ice8.C (working copy) @@ -0,0 +1,17 @@ +// PR c++/53349 +// { dg-do compile { target c++11 } } + +template +struct Foo { + constexpr Foo(const Foo a) : m_a(a) {} + constexpr Foo(const Foo &a) : m_a(a.m_a) {} + + Foo m_a; +}; + +template <> struct Foo<0> {}; + +constexpr Foo<1> catty1(Foo<1> x) { return x; } +constexpr Foo<2> catty2(Foo<1> x) { return Foo<2>(catty1(x)); } + +constexpr auto res = catty2(Foo<1>(Foo<0>()));
Re: [patch, fortran] RFD: PR 56666 Allow suppression of zero-trip DO loop warning
2013/8/11 Thomas Koenig : > Hi Janus, > >> So: Ok for trunk from my side. >> >> However, I would prefer to disable the warning by default, but include >> it in -Wall. > > Here's a patch to do just that. > > Regression-tested (hence the changes to the existing test cases :-) > > OK for trunk? Looks good to me ... Thanks, Janus > 2013-08-03 Thomas Koenig > > PR fortran/5 > * gfortran.h (gfc_option_t): Add warn_zerotrip. > * invoke.texi (-Wzerotrip): Document option. > * lang.opt (Wzerotrip): Add. > * options.c (gfc_init_options): Initialize warn_zerotrip. > (set_Wall): Add handling of warn_zerotrip. > (gfc_handle_option): Handle OPT_Wzerotrip. > * resolve.c (gfc_resolve_iterator): Honor > gfc_option.warn_zerotrip; update error message to show > how to suppress the warning. > > 2013-08-03 Thomas Koenig > > PR fortran/5 > * gfortran.dg/do_check_10.f90: New test. > * gfortran.dg/array_constructor_11.f90: Add -Wzerotrip to > dg-options. > * gfortran.dg/array_constructor_18.f90: Likewise. > * gfortran.dg/array_constructor_22.f90: Likewise. > * gfortran.dg/coarray_15.f90: Likewise. > * gfortran.dg/do_1.f90: Add -Wall to dg-options. > * gfortran.dg/do_3.F90: Add -Wzerotrip to dg-options. > * gfortran.dg/do_check_5.f90: Add -Wall to gd-options. >
Cost model for indirect call speculation
Hi, this patch adds a simple cost model to indirect call speculation. First, we do not turn calls into speculative calls when it seems a bad idea (i.e. the call is cold), and during inlining we remove speculations that do not seem beneficial. On a modern chip, a speculative call sequence without inlining is not really going to fare better than an indirect call, because of the indirect call predictor. So we keep them only if the call was inlined, or if the callee is turned into a clone, or CONST/PURE flags are propagated to them.

We may want to add a target hook specifying whether the target supports an indirect call predictor, but I am not sure how important this is in practice.

To enable the cost model everywhere, the old unit-local transform code now does nothing but sanity checking and debug output dumping.

On GCC there are 2600 indirect calls executed during training. For 1600 we find histograms (I have no clue what happens to the others - will debug it tomorrow) and out of them 500 are not cold and thus turned into speculative calls. After inlining, about 60 speculative calls survive into the final binary (as opposed to 2000 before).

Bootstrapped/regtested x86_64-linux, will commit it after the testers pick up the previous change.

Honza

* cgraph.c (cgraph_turn_edge_to_speculative): Return newly introduced edge; fix typo in sanity check. (cgraph_resolve_speculation): Export; improve diagnostic. (cgraph_redirect_edge_call_stmt_to_callee): Better diagnostic; cancel speculation at type mismatch.
* cgraph.h (cgraph_turn_edge_to_speculative): Update. (cgraph_resolve_speculation): Declare. (symtab_can_be_discarded): New function.
* value-prof.c (gimple_ic_transform): Remove actual transform code.
* ipa-inline-transform.c (speculation_removed): New global var. (clone_inlined_nodes): See if speculation can be removed. (inline_call): If speculation was removed, growths may not match.
* ipa-inline.c (can_inline_edge_p): Add DISREGARD_LIMITS parameter. (speculation_useful_p): New function. 
(resolve_noninline_speculation): New function. (inline_small_functions): Resolve useless speculations. * ipa-inline.h (speculation_useful_p): Declare * ipa.c (can_replace_by_local_alias): Simplify. (ipa_profile): Produce speculative calls in non-lto, too; add simple cost model; produce local aliases. Index: cgraph.c === *** cgraph.c(revision 201646) --- cgraph.c(working copy) *** cgraph_set_edge_callee (struct cgraph_ed *** 1040,1048 At this time the function just creates the direct call, the referencd representing the if conditional and attaches !them all to the orginal indirect call statement. */ ! void cgraph_turn_edge_to_speculative (struct cgraph_edge *e, struct cgraph_node *n2, gcov_type direct_count, --- 1040,1050 At this time the function just creates the direct call, the referencd representing the if conditional and attaches !them all to the orginal indirect call statement. !Return direct edge created. */ ! ! struct cgraph_edge * cgraph_turn_edge_to_speculative (struct cgraph_edge *e, struct cgraph_node *n2, gcov_type direct_count, *** cgraph_turn_edge_to_speculative (struct *** 1073,1078 --- 1075,1081 IPA_REF_ADDR, e->call_stmt); ref->lto_stmt_uid = e->lto_stmt_uid; ref->speculative = e->speculative; + return e2; } /* Speculative call consist of three components: *** cgraph_speculative_call_info (struct cgr *** 1107,1113 if (e2->call_stmt) { e = cgraph_edge (e->caller, e2->call_stmt); ! gcc_assert (!e->speculative && !e->indirect_unknown_callee); } else for (e = e->caller->callees; --- 1110,1116 if (e2->call_stmt) { e = cgraph_edge (e->caller, e2->call_stmt); ! gcc_assert (e->speculative && !e->indirect_unknown_callee); } else for (e = e->caller->callees; *** cgraph_redirect_edge_callee (struct cgra *** 1147,1153 Remove the speculative call sequence and return edge representing the call. It is up to caller to redirect the call as appropriate. */ ! 
static struct cgraph_edge * cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl) { struct cgraph_edge *e2; --- 1150,1156 Remove the speculative call sequence and return edge representing the call. It is up to caller to redirect the call as appropriate. */ ! struct cgraph_edge * cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl) { struct cgraph_edge *e2; *** cgraph_re
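The decision rule described in the prose of this mail can be summarized in a few lines. This is only a sketch under stated assumptions: struct spec_edge and keep_speculation are hypothetical names, the fields do not correspond to struct cgraph_edge, and the real speculation_useful_p in the patch is not shown in this excerpt and may weigh things differently. The sketch also folds the two stages (not speculating on cold calls, and dropping useless speculations during inlining) into one predicate.

```c
#include <stdbool.h>

/* Hypothetical summary of one speculative call edge; illustrative field
   names, not the cgraph data structures.  */
struct spec_edge
{
  bool call_is_cold;          /* profile says the call is rarely executed */
  bool callee_inlined;        /* the direct call was inlined */
  bool callee_is_clone;       /* the callee was turned into a clone */
  bool callee_const_or_pure;  /* CONST/PURE flags were propagated */
};

/* Keep the speculation only when it buys something beyond what the
   hardware indirect call predictor already provides.  */
bool
keep_speculation (const struct spec_edge *e)
{
  if (e->call_is_cold)
    return false;
  return e->callee_inlined || e->callee_is_clone || e->callee_const_or_pure;
}
```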
[wwwdocs] Rotate news
Committed. Gerald Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.887 diff -u -3 -p -r1.887 index.html --- index.html 31 May 2013 11:41:08 - 1.887 +++ index.html 12 Aug 2013 00:01:12 - @@ -84,36 +84,6 @@ mission statement. at IITB is providing documentation, tutorials and videos about GCC internals with support from the Government of India. -ARM AArch64 support -[2012-10-24] -A port for AArch64, the 64-bit execution state in the ARMv8 architecture, -has been contributed by ARM Ltd. - -IBM zEnterprise EC12 support -[2012-10-10] -Support for the latest release of the System z mainframe zEC12 has -been added to the architecture back end. This work was contributed by -Andreas Krebbel of IBM. - -GCC 4.7.2 released -[2012-09-20] - - -GCC now uses C++ as its implementation language -[2012-08-14] -The http://gcc.gnu.org/wiki/cxx-conversion";>cxx-conversion -branch has been merged into trunk. This switches GCC's implementation -language from C to C++. -Additionally, some data structures have been re-implemented in C++ -(more details in the http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html";>merge -announcement). This work was contributed by Lawrence Crowl and -Diego Novillo of Google. - -GCC 4.5.4 released -[2012-07-02] - - Index: news.html === RCS file: /cvs/gcc/wwwdocs/htdocs/news.html,v retrieving revision 1.136 diff -u -3 -p -r1.136 news.html --- news.html 30 Mar 2013 18:22:02 - 1.136 +++ news.html 12 Aug 2013 00:01:13 - @@ -14,6 +14,36 @@ +ARM AArch64 support +[2012-10-24] +A port for AArch64, the 64-bit execution state in the ARMv8 architecture, +has been contributed by ARM Ltd. + +IBM zEnterprise EC12 support +[2012-10-10] +Support for the latest release of the System z mainframe zEC12 has +been added to the architecture back end. This work was contributed by +Andreas Krebbel of IBM. 
+ +GCC 4.7.2 released +[2012-09-20] + + +GCC now uses C++ as its implementation language +[2012-08-14] +The http://gcc.gnu.org/wiki/cxx-conversion";>cxx-conversion +branch has been merged into trunk. This switches GCC's implementation +language from C to C++. +Additionally, some data structures have been re-implemented in C++ +(more details in the http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html";>merge +announcement). This work was contributed by Lawrence Crowl and +Diego Novillo of Google. + +GCC 4.5.4 released +[2012-07-02] + + GCC 4.7.1 released [2012-06-14]
[wwwdocs] Add link to @gnutools on Twitter
David suggested adding this link, and I think it fits nicely. Applied. Gerald Index: style.mhtml === RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v retrieving revision 1.119 diff -r1.119 style.mhtml 160a161,164 > http://twitter.com/gnutools";> >height="42" width="42" align="middle" alt="@gnutools on Twitter" > style="border:0px" />@gnutools
[PATCH] Fix PR57451 (Incorrect debug ranges emitted for -freorder-blocks-and-partition -g)
This patch fixes PR rtl-optimizations/57451 by preventing scopes and therefore lexical blocks from crossing split section boundaries. This will prevent debug info generation from using DW_AT_low_pc/high_pc pairs across the section boundary. Bootstrapped and tested on x86_64-unknown-linux-gnu. With this patch, a profilebootstrap with -freorder-blocks-and-partition force-enabled also passes. Ok for trunk? Thanks, Teresa 2013-08-11 Teresa Johnson PR rtl-optimizations/57451 * final.c (reemit_insn_block_notes): Prevent lexical blocks from crossing split section boundaries. Index: final.c === --- final.c (revision 201644) +++ final.c (working copy) @@ -1650,12 +1650,26 @@ reemit_insn_block_notes (void) rtx insn, note; insn = get_insns (); - if (!active_insn_p (insn)) -insn = next_active_insn (insn); - for (; insn; insn = next_active_insn (insn)) + for (; insn; insn = next_insn (insn)) { tree this_block; + /* Prevent lexical blocks from straddling section boundaries. */ + if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) +{ + for (tree s = cur_block; s != DECL_INITIAL (cfun->decl); + s = BLOCK_SUPERCONTEXT (s)) +{ + rtx note = emit_note_before (NOTE_INSN_BLOCK_END, insn); + NOTE_BLOCK (note) = s; + note = emit_note_after (NOTE_INSN_BLOCK_BEG, insn); + NOTE_BLOCK (note) = s; +} +} + + if (!active_insn_p (insn)) +continue; + /* Avoid putting scope notes between jump table and its label. */ if (JUMP_TABLE_DATA_P (insn)) continue; -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
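To make the effect of the new loop easier to see, here is a toy model of what happens at a NOTE_INSN_SWITCH_TEXT_SECTIONS note: every lexical block that is open at that point is closed just before the switch and reopened just after it, so no block spans the boundary. The types and the function split_scopes_at_switch are hypothetical and only mimic the resulting note order; they are not GCC's RTL API.

```c
#include <stddef.h>

/* Toy stand-ins for the three note kinds involved.  */
enum note_kind { NOTE_END, NOTE_SWITCH, NOTE_BEG };

struct note
{
  enum note_kind kind;
  int block;   /* block id, or -1 for the section switch itself */
};

/* SCOPES lists the currently open lexical blocks, outermost first.
   Write the notes surrounding a section switch into OUT, which must have
   room for 2 * N_SCOPES + 1 entries, and return how many were written.  */
size_t
split_scopes_at_switch (const int *scopes, size_t n_scopes, struct note *out)
{
  size_t n = 0;

  /* Close the innermost block first so the END notes nest properly.  */
  for (size_t i = n_scopes; i-- > 0; )
    out[n++] = (struct note) { NOTE_END, scopes[i] };

  out[n++] = (struct note) { NOTE_SWITCH, -1 };

  /* Reopen the outermost block first on the other side.  */
  for (size_t i = 0; i < n_scopes; i++)
    out[n++] = (struct note) { NOTE_BEG, scopes[i] };

  return n;
}
```

With blocks 1 (outer) and 2 (inner) open, the result is END 2, END 1, SWITCH, BEG 1, BEG 2, which matches the order produced by repeatedly calling emit_note_before and emit_note_after on the same insn while walking from cur_block outward.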
Re: Cost model for indirect call speculation
I like the approach in general -- in the past, indirect call promotion and function inlining heuristics are disconnected -- which can lead to either missing promotions or useless ones. This approach solves the problem. On Sun, Aug 11, 2013 at 4:11 PM, Jan Hubicka wrote: > Hi, > this patch adds simple cost model into indirect call speculation. First we > do not > turn calls into speculative calls when it seems bad idea (i.e. call is cold) > and during inlining we remove speculations that do not seem benefical. > On modern chip speculative call sequence without inlining is not really going > to fare better than indirect call because of indirect call predictor. > So we keep them only if the call was inlined or if the callee is turned to > clone > or CONST/PURE flags are propagated to them. > > We may want to add target hook specifying if target support indirect call > predictor, > but I am not sure how important this is in practice. It might be also useful to introduce a parameter to control the behavior so that people can do better experiment. The capability of indirect branch predictions varies a lot depending on the target. I only noticed a couple of indentation problems. thanks, David > > To enable cost model everywhere, the old unit-local transform code now does > nothing > but does sanity checking and debug output dumping. > /* Speculative call consist of three components: > *** cgraph_speculative_call_info (struct cgr > *** 1107,1113 > if (e2->call_stmt) > { indentation? > e = cgraph_edge (e->caller, e2->call_stmt); > ! gcc_assert (!e->speculative && !e->indirect_unknown_callee); > } > else > for (e = e->caller->callees; > --- 1110,1116 > if (e2->call_stmt) > { > e = cgraph_edge (e->caller, e2->call_stmt); > ! gcc_assert (e->speculative && !e->indirect_unknown_callee); > } > else > for (e = e->caller->callees; > *** cgraph_redirect_edge_callee (struct cgra > *** 1147,1153 > Remove the speculative call sequence and return edge representing the > call. 
> It is up to caller to redirect the call as appropriate. */ > > ! static struct cgraph_edge * > cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl) > { > struct cgraph_edge *e2; > --- 1150,1156 > Remove the speculative call sequence and return edge representing the > call. > It is up to caller to redirect the call as appropriate. */ > > ! struct cgraph_edge * > cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl) > { > struct cgraph_edge *e2; > *** cgraph_resolve_speculation (struct cgrap > *** 1159,1170 > { > if (dump_file) > { > ! fprintf (dump_file, "Speculative indirect call %s/%i => %s/%i has " > ! "turned out to have contradicitng known target ", > ! xstrdup (cgraph_node_name (edge->caller)), > edge->caller->symbol.order, > ! xstrdup (cgraph_node_name (e2->callee)), > e2->callee->symbol.order); > ! print_generic_expr (dump_file, callee_decl, 0); > ! fprintf (dump_file, "\n"); > } > } > else > --- 1162,1182 > { > if (dump_file) > { > ! if (callee_decl) > ! { > ! fprintf (dump_file, "Speculative indirect call %s/%i => %s/%i > has " > ! "turned out to have contradicitng known target ", > ! xstrdup (cgraph_node_name (edge->caller)), > edge->caller->symbol.order, > ! xstrdup (cgraph_node_name (e2->callee)), > e2->callee->symbol.order); > ! print_generic_expr (dump_file, callee_decl, 0); > ! fprintf (dump_file, "\n"); > ! } > ! else > ! { > ! fprintf (dump_file, "Removing speculative call %s/%i => > %s/%i\n", > ! xstrdup (cgraph_node_name (edge->caller)), > edge->caller->symbol.order, > ! xstrdup (cgraph_node_name (e2->callee)), > e2->callee->symbol.order); > ! } > } > } > else > *** cgraph_redirect_edge_call_stmt_to_callee > *** 1264,1275 > cgraph_speculative_call_info (e, e, e2, ref); > if (gimple_call_fndecl (e->call_stmt)) > e = cgraph_resolve_speculation (e, gimple_call_fndecl (e->call_stmt)); > ! else > { > if (dump_file) > ! 
fprintf (dump_file, "Expanding speculative call of %s/%i -> > %s/%i\n", > xstrdup (cgraph_node_name (e->caller)), > e->caller->symbol.order, > xstrdup (cgraph_node_name (e->callee)), > e->callee->symbol.order); > gcc_assert (e2->speculative); > push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl)); > new_stmt = gimple_ic (e->call_stmt, cg
Re: [patch, fortran] RFD: PR 56666 Allow suppression of zero-trip DO loop warning
Hi Janus, >> OK for trunk? > Looks good to me ... Committed as rev. 201658; also committed a snippet to the documentation. Thanks for the review! Thomas
[PATCH] TREE-SSA remove redundant condition checks in get_default_value
In function get_default_value of tree-ssa-ccp.c:

  else if (is_gimple_assign (stmt)
           /* Value-returning GIMPLE_CALL statements assign to
              a variable, and are treated similarly to GIMPLE_ASSIGN.  */
           || (is_gimple_call (stmt)
               && gimple_call_lhs (stmt) != NULL_TREE)
           || gimple_code (stmt) == GIMPLE_PHI)
    {
      tree cst;
      if (gimple_assign_single_p (stmt)
          && DECL_P (gimple_assign_rhs1 (stmt))
          && (cst = get_symbol_constant_value (gimple_assign_rhs1 (stmt))))
        {
          val.lattice_val = CONSTANT;
          val.value = cst;
        }
      else
        /* Any other variable defined by an assignment or a PHI node
           is considered UNDEFINED.  */
        val.lattice_val = UNDEFINED;

If stmt is a GIMPLE call or a GIMPLE PHI node, it can never satisfy gimple_assign_single_p (stmt), so the condition checks are redundant for those two cases. The attached patch removes them. Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

ChangeLog: 2013-08-13 Zhouyi Zhou * tree-ssa-ccp.c (get_default_value): Remove redundant condition checks.

-- Zhouyi Zhou

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c index 6472f48..7fbb687 100644 --- a/gcc/tree-ssa-ccp.c +++ b/gcc/tree-ssa-ccp.c @@ -258,12 +258,7 @@ get_default_value (tree var) val.mask = double_int_minus_one; } } - else if (is_gimple_assign (stmt) - /* Value-returning GIMPLE_CALL statements assign to - a variable, and are treated similarly to GIMPLE_ASSIGN. */ - || (is_gimple_call (stmt) - && gimple_call_lhs (stmt) != NULL_TREE) - || gimple_code (stmt) == GIMPLE_PHI) + else if (is_gimple_assign (stmt)) { tree cst; if (gimple_assign_single_p (stmt) @@ -274,10 +269,18 @@ get_default_value (tree var) val.value = cst; } else - /* Any other variable defined by an assignment or a PHI node + /* Any other variable defined by an assignment is considered UNDEFINED. 
*/ val.lattice_val = UNDEFINED; } + else if ((is_gimple_call (stmt) + && gimple_call_lhs (stmt) != NULL_TREE) + || gimple_code (stmt) == GIMPLE_PHI) +{ + /*Variable defined by a call or a PHI node + is considered UNDEFINED. */ + val.lattice_val = UNDEFINED; +} else { /* Otherwise, VAR will never take on a constant value. */
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
Hello! > movabs is incorrectly translated into "mov [rax], -1", and causes > compile error "Error: ambiguous operand size for `mov' ". > It should be "mov QWORD PTR [rax], -1" > > Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc). > > 2013-08-10 Perez Read > > * config/i386/i386.md (*movabs_1) : Add PTR before > operand 0 for intel asm alternative. > > * testsuite/gcc.target/i386/movabs-1.c : New test. You should mention the PR number in the ChangeLog. Looks OK, but I think that for consistency this decoration should also be added to the *movabs_2 pattern. Uros.