Re: [Patch, fortran] PR46328 - [OOP] type-bound operator call with non-trivial polymorphic operand
Dear Tobias, Many thanks for the review and your replies to redux(1)/(2). I will take account of your comments and commit tomorrow night. I have a fix for the allocation of class temporaries, which cures the problem with index vectors It will see the light of day on Monday. Happy New Year to one and all. Paul On Sat, Dec 31, 2011 at 12:14 AM, Tobias Burnus wrote: > Dear Paul, > > Paul Richard Thomas wrote: >> >> Bootstrapped and regtested on i686/Ubuntu 11.1 - OK for trunk? >> >> 2011-12-30 Paul Thomas >> >> * trans-array.c (gfc_array_allocate): Null allocated memory of >> newly allocted class arrays. > > > PR fortran/51529 > > >> PR fortran/46328 >> * interface.c(build_compcall_for_operator): Add a type to the >> expression. > > > You might want to quote additionally PR fortran/51052 and PR fortran/51052. > >> *** gcc/fortran/interface.c (revision 182566) >> --- gcc/fortran/interface.c (working copy) >> *** >> *** 3256,3261 >> --- 3256,3269 > > > I would have liked a diff with "-p" flag which shows the function name in > the diff (for instance "svn diff -x -p" or "svn diff -x '-p -u'"). > >> >> + if (expr->ts.type == BT_CLASS&& expr3) >> + { >> + tmp = build_int_cst (unsigned_char_type_node, 0); >> + /* With class objects, it is best to play safe and null the >> + memory because we cannot know if dynamic types have allocatable >> + components or not. */ > > > I don't like the comment; how about something along the following: "For > class objects we need to nullify the memory in case they have allocatable > components; the reason is that _copy, which is used for initialization, > first frees the destination." > > >> + gfc_trans_class_init_assign (gfc_code *code) >> + { >> + stmtblock_t block; >> + tree tmp; >> + gfc_se dst,src,memsz; > > > Space after each comma. > >> + gfc_expr *lhs,*rhs,*sz; > > > Ditto. > > >> *** >> *** 3301,3306 >> --- 3502,3514 >> { >> gfc_conv_expr_reference (&parmse, e); >> >> + /* Catch base objects that are not variables. */ >> + if (e->ts.type == BT_CLASS >> + && e->expr_type != EXPR_VARIABLE >> + && expr&& e == expr->base_expr) > > > The indentation looks wrong. > > >> + for (args= e->value.function.actual; args; args = args->next) >> + { >> + if (expr == args->expr) >> + expr = args->expr; >> + } > > > Space before the equal sign in "args=". If you want, you can also remove the > curly braces as they are not required. > >> + args= code->expr1->value.function.actual; >> + for (; args; args = args->next) >> + { >> + if (expr == args->expr) >> + expr = args->expr; >> + } > > > Ditto. > >> get_declared_from_expr (gfc_ref **class_ref, gfc_ref **new_ref, >> ! gfc_expr *e) >> --- 5623,5629 >> get_declared_from_expr (gfc_ref **class_ref, gfc_ref **new_ref, >> ! gfc_expr *e, bool types) > > > I think "types" deserves a comment line in the comment block before the > function; additionally - and related - I wonder whether the name is well > chosen. "types" reminds me of "bt" rather than of "bool". In the changelog, > you use: "Add 'types' argument to switch checking of derived types on or > off." Thus, "check_types" would be a possible choice. > > Otherwise, the patch looks okay to me. > > Happy new year to every one! > > Tobias -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
Re: [google] change dump_inline_decisions to make it print more useful and better looking info
I see. The patch is updated: Thanks, Dehao Index: ipa-inline.c === --- ipa-inline.c(revision 182739) +++ ipa-inline.c(working copy) @@ -303,26 +303,19 @@ char *buf; size_t buf_size; const char *bfd_name = lang_hooks.dwarf_name (node->decl, 0); - const char *count_text = "count="; - const char *max_count_text = "max_count="; if (!bfd_name) bfd_name = "unknown"; buf_size = strlen (bfd_name) + 1; - if (flag_opt_info >= OPT_INFO_MED) -buf_size += (strlen (count_text) + MAX_INT_LENGTH + 1); - if (flag_opt_info >= OPT_INFO_MAX) -buf_size += (strlen (max_count_text) + MAX_INT_LENGTH + 1); + if (flag_opt_info >= OPT_INFO_MED && profile_info) +buf_size += (2 * MAX_INT_LENGTH + 5); buf = (char *) xmalloc (buf_size); strcpy (buf, bfd_name); - if (flag_opt_info >= OPT_INFO_MED) -sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, -buf, count_text, node->count); - if (flag_opt_info >= OPT_INFO_MAX) -sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, -buf, max_count_text, node->max_bb_count); + if (flag_opt_info >= OPT_INFO_MED && profile_info) +sprintf (buf, "("HOST_WIDEST_INT_PRINT_DEC", "HOST_WIDEST_INT_PRINT_DEC")", +node->count, node->max_bb_count); return buf; } On Fri, Dec 30, 2011 at 4:59 PM, Xinliang David Li wrote: > the early inline decisions are still good to dump out. However the > opt-info should check 'if (profile_info)' to decide if count and max > count info need to be dumped. > > David > > On Fri, Dec 30, 2011 at 12:31 AM, Dehao Chen wrote: >> Hi, >> >> This patch makes the -fopt-info print more concise info: >> 1. only inline decisions after einline are printed >> 2. print in a more compact format >> >> Bootstrapped with no problem. >> >> Is it okay for google-4_6 and google-main branch? >> >> Thanks, >> Dehao >> >> gcc/ChangeLog.google-4_6 >> 2011-12-30 Dehao Chen >> >> * ipa-inline.c (dump_inline_decision): Print more concise info when >> dumping inline decisions. >> >> Index: gcc/ipa-inline.c >> === >> --- gcc/ipa-inline.c (revision 182739) >> +++ gcc/ipa-inline.c (working copy) >> @@ -303,26 +303,19 @@ >> char *buf; >> size_t buf_size; >> const char *bfd_name = lang_hooks.dwarf_name (node->decl, 0); >> - const char *count_text = "count="; >> - const char *max_count_text = "max_count="; >> >> if (!bfd_name) >> bfd_name = "unknown"; >> >> buf_size = strlen (bfd_name) + 1; >> if (flag_opt_info >= OPT_INFO_MED) >> - buf_size += (strlen (count_text) + MAX_INT_LENGTH + 1); >> - if (flag_opt_info >= OPT_INFO_MAX) >> - buf_size += (strlen (max_count_text) + MAX_INT_LENGTH + 1); >> + buf_size += (2 * MAX_INT_LENGTH + 5); >> buf = (char *) xmalloc (buf_size); >> >> strcpy (buf, bfd_name); >> if (flag_opt_info >= OPT_INFO_MED) >> - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, >> - buf, count_text, node->count); >> - if (flag_opt_info >= OPT_INFO_MAX) >> - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, >> - buf, max_count_text, node->max_bb_count); >> + sprintf (buf, "("HOST_WIDEST_INT_PRINT_DEC", >> "HOST_WIDEST_INT_PRINT_DEC")", >> + node->count, node->max_bb_count); >> return buf; >> } >> >> @@ -369,6 +362,14 @@ >> const char *inline_chain_text; >> const char *call_count_text; >> struct cgraph_node *final_caller = edge->caller; >> + tree decl = edge->caller->decl; >> + >> + if (decl) >> + { >> + struct function *fn = DECL_STRUCT_FUNCTION (decl); >> + if (!fn || !fn->always_inline_functions_inlined) >> + return; >> + } >> >> if (final_caller->global.inlined_to != NULL) >> inline_chain_text = cgraph_node_call_chain (final_caller, &final_caller);
[C++ Patch] PR 51397
Hi, this issue, reported by Dave (and Jon), is about the error message printed for a static_assert like: static_assert('X' != '\130', "'X' has the wrong value"); where, due to the use of %E, we go through pp_c_string_literal and thus pp_c_char, and we escape the single quotes around X. Dave also recommends not using *any* quotes around the string. Thus, just using %s on the TREE_STRING_POINTER seems a straightforward fix to me (but in fact %qs also passes the testsuite). Tested x86_64-linux. Thanks, Paolo. // /cp 2011-12-31 Paolo Carlini PR c++/51397 * semantics.c (finish_static_assert): Use %s instead of %E for the error message. /testsuite 2011-12-31 Paolo Carlini PR c++/51397 * g++.dg/cpp0x/static_assert6.C: New. Index: testsuite/g++.dg/cpp0x/static_assert6.C === --- testsuite/g++.dg/cpp0x/static_assert6.C (revision 0) +++ testsuite/g++.dg/cpp0x/static_assert6.C (revision 0) @@ -0,0 +1,4 @@ +// PR c++/51397 +// { dg-options "-std=c++0x" } + +static_assert('X' != '\130', "'X' has the wrong value"); // { dg-error "'X' has the wrong value" } Index: cp/semantics.c === --- cp/semantics.c (revision 182754) +++ cp/semantics.c (working copy) @@ -5127,7 +5127,7 @@ finish_static_assert (tree condition, tree message if (TREE_CODE (condition) == INTEGER_CST && integer_zerop (condition)) /* Report the error. */ -error ("static assertion failed: %E", message); +error ("static assertion failed: %s", TREE_STRING_POINTER (message)); else if (condition && condition != error_mark_node) { error ("non-constant condition for static assertion");
[Patch, Fortran] PR 51682 - Fix coarray issues with -fdefault-integer-8
Dear all, first, I hope you will have (or already had) a good start into the new year! This patch is rather obvious. Thanks to Dominique for reporting the issue. Build and regtested on x86-64-linux. OK for the trunk? Happy New Year to all, Tobias caf-i8.diff Description: application/unknown
Re: RE :Re: RE :Re: hashtable local iterator
On 29 December 2011 18:00, François Dumont wrote: > Attached patch applied. Thanks! > (_Local_iterator_base<>, _Locale_iterator<>, > _Locale_const_iterator<>): I've fixed these two instances of "Locale" in the ChangeLog I might also rename _Ebo_helper to something more specific such as _Hashtable_ebo_helper, as the name _Ebo_helper is a bit generic.
Re: [C++ Patch] deprecation of access declarations
On 12/30/2011 04:34 PM, Mike Stump wrote: So, I'm wondering why it was done this way originally: - static int S1; // ERROR - uses same name 9.3 + static int S1; // { dg-error "" } uses same name 9.3 and not: - static int S1; // ERROR - uses same name 9.3 + static int S1; // { dg-error "uses same name 9.3" } "uses same name 9.3" sounds like a comment, not error text. Jason
Re: [C++ Patch] PR 51397
OK. Jason
[Patch, fortran] PR fortran/50981 segmentation fault when trying to access absent elemental actual arg
Hello, as promised, here is a fix for pr50981. Currently, for a call to an elemental procedure, every scalar actual argument is evaluated before the loop containing the function call. The bug is, we can't evaluate the actual argument if it is a reference to an absent optional dummy argument, as it will result in a NULL pointer dereference. We must pass the reference directly in that case. To fix this, the call to gfc_conv_expr in gfc_add_loop_ss_code, must be changed to a call to gfc_conv_expr_reference. Such a change is basically a revert of PR43841's fix, so we are back with a missed optimization bug. To avoid this we have to do the change only when it is necessary, i.e. when the dummy argument is optional and the actual argument is a reference to an optional dummy. This information is not available in gfc_add_loop_ss_code, so I make for it a new field can_be_null_ref in the gfc_ss_info struct: this is the second patch. Then, the third patch is about setting that field: as the dummy argument information isn't either available in gfc_walk_elemental_function_args, a new argument, proc_expr, is added, which holds the reference to the procedure. It is of type gfc_expr* so that it can handle direct calls and type-bound calls equally well. The first patch is for consistency: gfc_conv_expr should return values, not references, so the address taking is moved where it is actually requested (in gfc_conv_expr_reference). Regression tested on x86_64-unknown-linux-gnu. OK for 4.7/4.6/4.5[/4.4] ? Mikael. PS: Greetings for the new year. 2011-12-29 Mikael Morin * trans-expr.c (gfc_conv_expr): Move address taking... (gfc_conv_expr_reference): ... here. diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 83d8087..20da730 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -5120,8 +5120,6 @@ gfc_conv_expr (gfc_se * se, gfc_expr * expr) /* Substitute a scalar expression evaluated outside the scalarization loop. */ se->expr = ss_info->data.scalar.value; - if (ss_info->type == GFC_SS_REFERENCE) - se->expr = gfc_build_addr_expr (NULL_TREE, se->expr); se->string_length = ss_info->string_length; gfc_advance_se_ss_chain (se); return; @@ -5254,6 +5252,7 @@ gfc_conv_expr_reference (gfc_se * se, gfc_expr * expr) /* Returns a reference to the scalar evaluated outside the loop for this case. */ gfc_conv_expr (se, expr); + se->expr = gfc_build_addr_expr (NULL_TREE, se->expr); return; } 2011-12-29 Mikael Morin PR fortran/50981 * trans.h (struct gfc_ss_info): New field data::scalar::can_be_null_ref * trans-array.c: If the reference can be NULL, save the reference instead of the value. * trans-expr.c (gfc_conv_expr): If we have saved a reference, dereference it. diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index a644312..19e081b 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -2422,10 +2422,21 @@ gfc_add_loop_ss_code (gfc_loopinfo * loop, gfc_ss * ss, bool subscript, break; case GFC_SS_REFERENCE: - /* Scalar argument to elemental procedure. Evaluate this - now. */ + /* Scalar argument to elemental procedure. */ gfc_init_se (&se, NULL); - gfc_conv_expr (&se, expr); + if (ss_info->data.scalar.can_be_null_ref) + { + /* If the actual argument can be absent (in other words, it can + be a NULL reference), don't try to evaluate it; pass instead + the reference directly. */ + gfc_conv_expr_reference (&se, expr); + } + else + { + /* Otherwise, evaluate the argument out of the loop and pass + a reference to the value. */ + gfc_conv_expr (&se, expr); + } gfc_add_block_to_block (&outer_loop->pre, &se.pre); gfc_add_block_to_block (&outer_loop->post, &se.post); if (gfc_is_class_scalar_expr (expr)) diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 20da730..00f38e1 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -5120,6 +5120,11 @@ gfc_conv_expr (gfc_se * se, gfc_expr * expr) /* Substitute a scalar expression evaluated outside the scalarization loop. */ se->expr = ss_info->data.scalar.value; + /* If the reference can be NULL, the value field contains the reference, + not the value the reference points to (see gfc_add_loop_ss_code). */ + if (ss_info->data.scalar.can_be_null_ref) + se->expr = build_fold_indirect_ref_loc (input_location, se->expr); + se->string_length = ss_info->string_length; gfc_advance_se_ss_chain (se); return; diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index 259a08a..61a4817 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -145,8 +145,9 @@ typedef enum GFC_SS_SCALAR, /* Like GFC_SS_SCALAR it evaluates the expression outside the - loop. I
[v3] proposed fix for libstdc++/49204 causes abi_check failure
I want to commit the attached patch: PR libstdc++/49204 * include/std/future (__future_base::_State_base::wait()): Call _M_complete_async instead of _M_run_deferred. (__future_base::_State_base::wait_for()): Call _M_has_deferred and return future_status. (__future_base::_State_base::wait_until()): Likewise. (__future_base::_State_base::_M_run_deferred()): Rename to ... (__future_base::_State_base::_M_complete_async()): This. (__future_base::_State_base::_M_has_deferred()): New. (__basic_future::wait_for(), __basic_future::wait_until()): Return future_status. (__future_base::_Deferred_state::_M_run_deferred()): Rename to ... (__future_base::_Deferred_state::_M_complete_async()): This. (__future_base::_Async_state::_M_join()): New. (__future_base::_Async_state::~_Async_state()): Call _M_join. (__future_base::_Async_state::_M_complete_async()): Likewise. * testsuite/30_threads/packaged_task/members/get_future.cc: Expect future_status return instead of bool. * testsuite/30_threads/shared_future/members/wait_until.cc: Likewise. * testsuite/30_threads/shared_future/members/wait_for.cc: Likewise. * testsuite/30_threads/future/members/wait_until.cc: Likewise. * testsuite/30_threads/future/members/wait_for.cc: Likewise. * testsuite/30_threads/promise/members/set_value2.cc: Likewise. * testsuite/30_threads/promise/members/set_value3.cc: Likewise. * testsuite/30_threads/promise/members/swap.cc: Likewise. * testsuite/30_threads/async/sync.cc: Likewise, and test 'deferred' status. But it causes this abi_check error: 1 incompatible symbols 0 _ZTVNSt13__future_base11_State_baseE vtable for std::__future_base::_State_base version status: unversioned type: object type size: 48 status: incompatible sizes 40 48 I've renamed _M_run_deferred to _M_complete_async but it has the same signature, semantics and position in the vtable. I've also added _M_has_deferred() at the end of the vtable, for an internal implementation type that users use directly or refer to. Is that OK and the testsuite can be adjusted to know about the new vtable size? Index: include/std/future === --- include/std/future (revision 182658) +++ include/std/future (working copy) @@ -326,29 +326,38 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Result_base& wait() { - _M_run_deferred(); + _M_complete_async(); unique_lock __lock(_M_mutex); - if (!_M_ready()) - _M_cond.wait(__lock, std::bind(&_State_base::_M_ready, this)); + _M_cond.wait(__lock, [&] { return _M_ready(); }); return *_M_result; } template -bool +future_status wait_for(const chrono::duration<_Rep, _Period>& __rel) { unique_lock __lock(_M_mutex); - auto __bound = std::bind(&_State_base::_M_ready, this); - return _M_ready() || _M_cond.wait_for(__lock, __rel, __bound); + if (_M_ready()) + return future_status::ready; + if (_M_has_deferred()) + return future_status::deferred; + if (_M_cond.wait_for(__lock, __rel, [&] { return _M_ready(); })) + return future_status::ready; + return future_status::timeout; } template -bool +future_status wait_until(const chrono::time_point<_Clock, _Duration>& __abs) { unique_lock __lock(_M_mutex); - auto __bound = std::bind(&_State_base::_M_ready, this); - return _M_ready() || _M_cond.wait_until(__lock, __abs, __bound); + if (_M_ready()) + return future_status::ready; + if (_M_has_deferred()) + return future_status::deferred; + if (_M_cond.wait_until(__lock, __abs, [&] { return _M_ready(); })) + return future_status::ready; + return future_status::timeout; } void @@ -480,7 +489,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool _M_ready() const noexcept { return static_cast(_M_result); } - virtual void _M_run_deferred() { } + // Wait for completion of async function. + virtual void _M_complete_async() { } + + // Return true if state contains a deferred function. + virtual bool _M_has_deferred() { return false; } }; template @@ -573,7 +586,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } template -bool +future_status wait_for(const chrono::duration<_Rep, _Period>& __rel) const { _State_base::_S_check(_M_state); @@ -581,7 +594,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } template -bool +future_status wait_until(const chrono::time_point<_Clock, _Duration>& __abs) const { _S
Re: [C++ Patch] deprecation of access declarations
On Dec 31, 2011, at 9:14 AM, Jason Merrill wrote: > On 12/30/2011 04:34 PM, Mike Stump wrote: >> So, I'm wondering why it was done this way originally: >> >> - static int S1; // ERROR - uses same name 9.3 >> + static int S1; // { dg-error "" } uses same name 9.3 >> >> and not: >> >> - static int S1; // ERROR - uses same name 9.3 >> + static int S1; // { dg-error "uses same name 9.3" } > > "uses same name 9.3" sounds like a comment, not error text. Ah, yes, we never had text matching, that would do it. Thanks.
Re: Keep static VTA locs in cselib tables only
On Nov 25, 2011, Jakub Jelinek wrote: > The numbers I got with your patch (RTL checking) are below, seems > the cumulative numbers other than 100% are all bigger with patched stage2, > which means unfortunately debug info quality degradation. Not really. I found some actual degradation after finally getting back to it. In some cases, I failed to reset NO_LOC_P, and this caused expressions that depended on it to resort to alternate values or end up unset. In other cases, we created different cselib values for debug temps and implicit ptrs, and merging them at dataflow confluences no longer found a common value because the common value was in cselib's static equivalence table. I've fixed (and added an assertion to catch) left-over NO_LOC_Ps, and arranged for values created for debug exprs, implicit ptr, entry values and parameter refs to be preserved across basic blocks as constants within cselib. With that, the debug info we get is a strict improvement in terms of coverage, even though a bunch of .o files still display a decrease in 100% coverage. In the handful files I examined, the patched compiler was emitting a loc list without full coverage, while the original compiler was emitting a single loc expr, that implicitly got full coverage even though AFAICT it should really cover a narrower range. Full coverage was a false positive, and less-than-100% coverage in these cases is not a degradation, but rather an improvement. Now, the reason why we emit additional expressions now is that the new algorithm is more prone to emitting different (and better) expressions when entering basic block, because we don't try as hard as before to keep on with the same location expression. Instead we recompute all the potentially-changed expressions, which will tend to select better expressions if available. > Otherwise the patch looks good to me. Thanks. After the updated comparison data below, you can find the patch I'm checking in, followed by the small interdiff from the previous patch. Happy GNU Year! :-) The results below can be reproduced with r182723. stage1 sources are patched, stage2 and stage3 aren't, so stage2 is built with a patched compiler, stage3 isn't. $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.ev 100784 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.ev 102406 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.ev 33275 obj-i686-linux-gnu/stage2-gcc/cc1plus.ev 33944 obj-i686-linux-gnu/stage3-gcc/cc1plus.ev $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.csv 523647 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.csv 523536 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.csv 521276 obj-i686-linux-gnu/stage2-gcc/cc1plus.csv 521907 obj-i686-linux-gnu/stage3-gcc/cc1plus.csv $ diff -yW80 obj-x86_64-linux-gnu/stage[23]-gcc/cc1plus.ls cov%samples cumul cov%samples cumul 0.0 150949/30% 150949/30%| 0.0 150980/30% 150980/30% 0..56234/1% 157183/31%| 0..56254/1% 157234/31% 6..10 5630/1% 162813/32%| 6..10 5641/1% 162875/32% 11..15 4675/0% 167488/33%| 11..15 4703/0% 167578/33% 16..20 5041/1% 172529/34%| 16..20 5044/1% 172622/34% 21..25 5435/1% 177964/35%| 21..25 5466/1% 178088/35% 26..30 4249/0% 182213/36%| 26..30 4269/0% 182357/36% 31..35 4666/0% 186879/37%| 31..35 4674/0% 187031/37% 36..40 6939/1% 193818/38%| 36..40 6982/1% 194013/38% 41..45 7824/1% 201642/40%| 41..45 7859/1% 201872/40% 46..50 8538/1% 210180/42%| 46..50 8536/1% 210408/42% 51..55 7585/1% 217765/43%| 51..55 7611/1% 218019/43% 56..60 6088/1% 223853/44%| 56..60 6108/1% 224127/44% 61..65 5545/1% 229398/45%| 61..65 5574/1% 229701/46% 66..70 7151/1% 236549/47%| 66..70 7195/1% 236896/47% 71..75 8068/1% 244617/49%| 71..75 8104/1% 245000/49% 76..80 18852/3%263469/52%| 76..80 18879/3%263879/52% 81..85 11958/2%275427/55%| 81..85 11954/2%275833/55% 86..90 15201/3%290628/58%| 86..90 15145/3%290978/58% 91..95 16814/3%307442/61%| 91..95 16727/3%307705/61% 96..99 17121/3%324563/65%| 96..99 16991/3%324696/65% 100 174515/34% 499078/100% | 100 173994/34% 498690/100% $ diff -yW80 obj-i686-linux-gnu/stage[23]-gcc/cc1plus.ls cov%samples cumul cov%samples cumul 0.0 145453/27% 145453/27%| 0.0 145480/27% 145480/27% 0..56594/1% 152047/29%| 0..56603/1% 152083/29% 6..10 5664/1% 157711/30%| 6..10 5671/1% 157754/30% 11..15 4982/0% 162693/31%| 11..15 4997/0% 162751/31% 16..20 6155/1% 168848/32%| 16..20 6169/1% 168920/32% 21..25 5038/0% 173886/33%| 21..25 5057/0% 173977/33% 26..30 4925/0% 178811/34%| 26..30 4918/0% 178895/34%
[C++ Patch] PR 51379
Hi, this implements in a straightforward way the resolution of DR 799 (which went in CD2): basically a reinterpret_cast to the type of the expression itself is fine for integral, enumeration, pointer, or pointer-to-member type. Tested x86_64-linux. Thanks, Paolo. / /cp 2011-12-31 Paolo Carlini PR c++/51379 * typeck.c (build_reinterpret_cast_1): Implement resolution of DR 799. /testsuite 2011-12-31 Paolo Carlini PR c++/51379 * g++.dg/cpp0x/reinterpret1.C: New. * g++.dg/conversion/reinterpret1.C: Adjust, run in C++98 mode. Index: testsuite/g++.dg/conversion/reinterpret1.C === --- testsuite/g++.dg/conversion/reinterpret1.C (revision 182759) +++ testsuite/g++.dg/conversion/reinterpret1.C (working copy) @@ -1,4 +1,5 @@ // PR c++/15076 +// { dg-options -std=c++98 } struct Y { Y(int &); }; Index: testsuite/g++.dg/cpp0x/reinterpret1.C === --- testsuite/g++.dg/cpp0x/reinterpret1.C (revision 0) +++ testsuite/g++.dg/cpp0x/reinterpret1.C (revision 0) @@ -0,0 +1,5 @@ +// PR c++/51379 +// { dg-options -std=c++0x } + +unsigned long t1 = 1; +unsigned long t2 = reinterpret_cast(t1); Index: cp/typeck.c === --- cp/typeck.c (revision 182759) +++ cp/typeck.c (working copy) @@ -6203,6 +6203,11 @@ build_reinterpret_cast_1 (tree type, tree expr, bo else if (TYPE_PTR_P (type) && INTEGRAL_OR_ENUMERATION_TYPE_P (intype)) /* OK */ ; + else if ((cxx_dialect != cxx98) + && (INTEGRAL_OR_ENUMERATION_TYPE_P (type) + || TYPE_PTR_P (type) || TYPE_PTR_TO_MEMBER_P (type)) + && same_type_p (type, intype)) +return fold_if_not_in_template (build_nop (type, expr)); else if ((TYPE_PTRFN_P (type) && TYPE_PTRFN_P (intype)) || (TYPE_PTRMEMFUNC_P (type) && TYPE_PTRMEMFUNC_P (intype))) return fold_if_not_in_template (build_nop (type, expr));
Re: [C++ Patch] PR 51379
OK. Jason
Re: [C++ Patch] PR 51379
On 12/31/2011 03:08 PM, Paolo Carlini wrote: this implements in a straightforward way the resolution of DR 799 (which went in CD2): basically a reinterpret_cast to the type of the expression itself is fine for integral, enumeration, pointer, or pointer-to-member type. Tested x86_64-linux. + else if ((cxx_dialect != cxx98) This DR should apply to C++98 mode as well. Jason
Re: [C++ Patch] PR 51379
Hi, On 12/31/2011 03:08 PM, Paolo Carlini wrote: this implements in a straightforward way the resolution of DR 799 (which went in CD2): basically a reinterpret_cast to the type of the expression itself is fine for integral, enumeration, pointer, or pointer-to-member type. Tested x86_64-linux. + else if ((cxx_dialect != cxx98) This DR should apply to C++98 mode as well. I suspected that. Thus I'm finishing testing the below instead. Ok? Thanks, Paolo. / /cp 2011-12-31 Paolo Carlini PR c++/51379 * typeck.c (build_reinterpret_cast_1): Implement resolution of DR 799. /testsuite 2011-12-31 Paolo Carlini PR c++/51379 * g++.dg/conversion/reinterpret4.C: New. * g++.dg/conversion/reinterpret1.C: Adjust. Index: testsuite/g++.dg/conversion/reinterpret1.C === --- testsuite/g++.dg/conversion/reinterpret1.C (revision 182759) +++ testsuite/g++.dg/conversion/reinterpret1.C (working copy) @@ -3,4 +3,4 @@ struct Y { Y(int &); }; int v; -Y y1(reinterpret_cast(v)); // { dg-error "" } +Y y1(reinterpret_cast(v)); Index: testsuite/g++.dg/conversion/reinterpret4.C === --- testsuite/g++.dg/conversion/reinterpret4.C (revision 0) +++ testsuite/g++.dg/conversion/reinterpret4.C (revision 0) @@ -0,0 +1,4 @@ +// PR c++/51379 + +unsigned long t1 = 1; +unsigned long t2 = reinterpret_cast(t1); Index: cp/typeck.c === --- cp/typeck.c (revision 182759) +++ cp/typeck.c (working copy) @@ -6203,6 +6203,11 @@ build_reinterpret_cast_1 (tree type, tree expr, bo else if (TYPE_PTR_P (type) && INTEGRAL_OR_ENUMERATION_TYPE_P (intype)) /* OK */ ; + else if ((INTEGRAL_OR_ENUMERATION_TYPE_P (type) + || TYPE_PTR_P (type) || TYPE_PTR_TO_MEMBER_P (type)) + && same_type_p (type, intype)) +/* DR 799 */ +return fold_if_not_in_template (build_nop (type, expr)); else if ((TYPE_PTRFN_P (type) && TYPE_PTRFN_P (intype)) || (TYPE_PTRMEMFUNC_P (type) && TYPE_PTRMEMFUNC_P (intype))) return fold_if_not_in_template (build_nop (type, expr));
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
On Mon, Dec 19, 2011 at 4:28 PM, Richard Sandiford wrote: > Hi Revital, > > Revital Eres writes: >> The attached patch is a resubmission following comments made by Ayal >> and Richard. >> >> Tested and bootstrap with the other patches in the series on >> ppc64-redhat-linux, enabling SMS on loops with SC 1. > The modulo-sched parts are approved, with some comments provided below. Moving get_regno_pressure_class() from loop-invariant.c to ira.c and other ira.h/ira-costs.c changes are not for me to approve, but they make sense to me. > Looks really good to me. I'll leave any approval to Ayal though. > Just one suggestion: > >> + /* Handle register moves. */ while we're at it, the comment above should be assigned to the "then" part below: >> + if (ps_i->id >= ps->g->num_nodes) >> + { >> + int old_regno = REGNO (ps_reg_move (ps, ps_i->id)->old_reg); >> + int new_regno = REGNO (ps_reg_move (ps, ps_i->id)->new_reg); >> + >> + change_pressure (old_regno, true); >> + change_pressure (new_regno, false); >> + change_pressure (old_regno, true); >> + } >> + else >> + { >> + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) >> + change_pressure (DF_REF_REGNO (*use_rec), true); >> + >> + for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++) >> + change_pressure (DF_REF_REGNO (*def_rec), false); >> + >> + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) >> + change_pressure (DF_REF_REGNO (*use_rec), true); >> + } > > It might be worth adding a comment to say why the code is doing it this way. Indeed, this does raise an eyebrow, and your explanation below is great. Suggest to start by saying that "The last two steps" are the natural way one would go about updating live registers in a bottom-up scan, except that in some cases (iiuc) the same physical register cannot be assigned to both use and def on same insn, so the first step is added conservatively. > E.g.: > > /* Process all uses, all defs, and then all uses again. The first > two steps give us an estimate of the pressure when dying inputs > cannot be tied to outputs (which is the worst case). The second > two steps update the set of live registers ready for the next > instruction. */ > > Also, as a general "future direction" comment, I think we should seriously > consider turning both this and -fmodulo-sched-allow-regmoves on by default, > so that -fmodulo-sched uses them unless explicitly told not to. The results > for ARM seemed to suggest that it is now the best SMS mode (although the > results can't be shared, unfortunately). It'd be unfortunate if users > had to write: > > -fmodulo-sched -fmodulo-sched-allow-regmoves -fmodulo-sched-reg-pressure > > instead of plain: > > -fmodulo-sched > > It might make sense as a follow-on patch rather than part of the main commit. > I'd be in favor, provided this works well for other platforms as well of-course. Ignoring the potential increase of register pressure inside a loop is being incautiously optimistic... It would be great to document testcases where spillage is detected; the original motivation of SMS is to reduce size of register live ranges so as to prevent spillage. There may be other means of preventing or mitigating spills other than bumping the II (e.g., "Minimum register requirements for a module schedule" Micro'94), which could be devised if concrete examples are analyzed. > There's also the separate debate about whether SMS should now be enabled > by default for ARM at -O3, but that's for another day... > Guess so... ;-) Thanks. > Richard Here are some more comments: +change_pressure (int regno, bool incr_p) ^^ I realize this was taken from elsewhere, but maybe "update_pressure" or "update_current_pressure" would better fit here. +{ + int nregs; + enum reg_class pressure_class; + + if (regno < FIRST_PSEUDO_REGISTER + && (TEST_HARD_REG_BIT (ira_no_alloc_regs, regno) + || TEST_HARD_REG_BIT (eliminable_regset, regno))) +return; + + /* Update the current set of live registers. Exit early if nothing + has changed. */ please clarify: we want to increase curr_reg_pressure as we scan upwards and encounter a last use; if regno is already live, this use is not last. likewise, we want to decrease curr_reg_pressure as we encounter a def; regno might not be live when !incr_p, if the def feeds only uses in next iteration. + if (incr_p + ? !bitmap_set_bit (&curr_regs_live, regno) + : !bitmap_clear_bit (&curr_regs_live, regno)) +return; + + pressure_class = get_regno_pressure_class (regno, &nregs); + + if (!incr_p) +curr_reg_pressure[pressure_class] -= nregs; + else +curr_reg_pressure[pressure_class] += nregs; + sounds a bit more natural to ask "if (incr_p)"...; and only if so, check also if the increased current pressu
Re: [google] change dump_inline_decisions to make it print more useful and better looking info
Ok for google branches. David On Sat, Dec 31, 2011 at 12:45 AM, Dehao Chen wrote: > I see. The patch is updated: > > Thanks, > Dehao > > Index: ipa-inline.c > === > --- ipa-inline.c (revision 182739) > +++ ipa-inline.c (working copy) > @@ -303,26 +303,19 @@ > char *buf; > size_t buf_size; > const char *bfd_name = lang_hooks.dwarf_name (node->decl, 0); > - const char *count_text = "count="; > - const char *max_count_text = "max_count="; > > if (!bfd_name) > bfd_name = "unknown"; > > buf_size = strlen (bfd_name) + 1; > - if (flag_opt_info >= OPT_INFO_MED) > - buf_size += (strlen (count_text) + MAX_INT_LENGTH + 1); > - if (flag_opt_info >= OPT_INFO_MAX) > - buf_size += (strlen (max_count_text) + MAX_INT_LENGTH + 1); > + if (flag_opt_info >= OPT_INFO_MED && profile_info) > + buf_size += (2 * MAX_INT_LENGTH + 5); > buf = (char *) xmalloc (buf_size); > > strcpy (buf, bfd_name); > - if (flag_opt_info >= OPT_INFO_MED) > - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, > - buf, count_text, node->count); > - if (flag_opt_info >= OPT_INFO_MAX) > - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, > - buf, max_count_text, node->max_bb_count); > + if (flag_opt_info >= OPT_INFO_MED && profile_info) > + sprintf (buf, "("HOST_WIDEST_INT_PRINT_DEC", > "HOST_WIDEST_INT_PRINT_DEC")", > + node->count, node->max_bb_count); > return buf; > } > > On Fri, Dec 30, 2011 at 4:59 PM, Xinliang David Li wrote: >> the early inline decisions are still good to dump out. However the >> opt-info should check 'if (profile_info)' to decide if count and max >> count info need to be dumped. >> >> David >> >> On Fri, Dec 30, 2011 at 12:31 AM, Dehao Chen wrote: >>> Hi, >>> >>> This patch makes the -fopt-info print more concise info: >>> 1. only inline decisions after einline are printed >>> 2. print in a more compact format >>> >>> Bootstrapped with no problem. >>> >>> Is it okay for google-4_6 and google-main branch? >>> >>> Thanks, >>> Dehao >>> >>> gcc/ChangeLog.google-4_6 >>> 2011-12-30 Dehao Chen >>> >>> * ipa-inline.c (dump_inline_decision): Print more concise info when >>> dumping inline decisions. >>> >>> Index: gcc/ipa-inline.c >>> === >>> --- gcc/ipa-inline.c (revision 182739) >>> +++ gcc/ipa-inline.c (working copy) >>> @@ -303,26 +303,19 @@ >>> char *buf; >>> size_t buf_size; >>> const char *bfd_name = lang_hooks.dwarf_name (node->decl, 0); >>> - const char *count_text = "count="; >>> - const char *max_count_text = "max_count="; >>> >>> if (!bfd_name) >>> bfd_name = "unknown"; >>> >>> buf_size = strlen (bfd_name) + 1; >>> if (flag_opt_info >= OPT_INFO_MED) >>> - buf_size += (strlen (count_text) + MAX_INT_LENGTH + 1); >>> - if (flag_opt_info >= OPT_INFO_MAX) >>> - buf_size += (strlen (max_count_text) + MAX_INT_LENGTH + 1); >>> + buf_size += (2 * MAX_INT_LENGTH + 5); >>> buf = (char *) xmalloc (buf_size); >>> >>> strcpy (buf, bfd_name); >>> if (flag_opt_info >= OPT_INFO_MED) >>> - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, >>> - buf, count_text, node->count); >>> - if (flag_opt_info >= OPT_INFO_MAX) >>> - sprintf (buf, "%s,%s"HOST_WIDEST_INT_PRINT_DEC, >>> - buf, max_count_text, node->max_bb_count); >>> + sprintf (buf, "("HOST_WIDEST_INT_PRINT_DEC", >>> "HOST_WIDEST_INT_PRINT_DEC")", >>> + node->count, node->max_bb_count); >>> return buf; >>> } >>> >>> @@ -369,6 +362,14 @@ >>> const char *inline_chain_text; >>> const char *call_count_text; >>> struct cgraph_node *final_caller = edge->caller; >>> + tree decl = edge->caller->decl; >>> + >>> + if (decl) >>> + { >>> + struct function *fn = DECL_STRUCT_FUNCTION (decl); >>> + if (!fn || !fn->always_inline_functions_inlined) >>> + return; >>> + } >>> >>> if (final_caller->global.inlined_to != NULL) >>> inline_chain_text = cgraph_node_call_chain (final_caller, >>> &final_caller);