On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška <mli...@suse.cz> wrote: > On 10/28/2015 04:23 PM, Richard Biener wrote: >> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška <mli...@suse.cz> wrote: >>> On 10/27/2015 03:49 PM, Richard Biener wrote: >>>> On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška <mli...@suse.cz> wrote: >>>>> On 10/26/2015 02:48 PM, Richard Biener wrote: >>>>>> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška <mli...@suse.cz> wrote: >>>>>>> On 10/21/2015 04:06 PM, Richard Biener wrote: >>>>>>>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška <mli...@suse.cz> wrote: >>>>>>>>> On 10/21/2015 11:59 AM, Richard Biener wrote: >>>>>>>>>> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška <mli...@suse.cz> >>>>>>>>>> wrote: >>>>>>>>>>> On 10/20/2015 03:39 PM, Richard Biener wrote: >>>>>>>>>>>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška <mli...@suse.cz> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Hello. >>>>>>>>>>>>> >>>>>>>>>>>>> As part of upcoming merge of HSA branch, we would like to have >>>>>>>>>>>>> possibility to terminate >>>>>>>>>>>>> pass manager after execution of the HSA generation pass. The HSA >>>>>>>>>>>>> back-end is implemented >>>>>>>>>>>>> as a tree pass that directly emits HSAIL from gimple tree >>>>>>>>>>>>> representation. The pass operates >>>>>>>>>>>>> on clones created by HSA IPA pass and the pass manager should >>>>>>>>>>>>> stop execution of further >>>>>>>>>>>>> RTL passes. >>>>>>>>>>>>> >>>>>>>>>>>>> Suggested patch survives bootstrap and regression tests on >>>>>>>>>>>>> x86_64-linux-pc. >>>>>>>>>>>>> >>>>>>>>>>>>> What do you think about it? >>>>>>>>>>>> >>>>>>>>>>>> Are you sure it works this way? >>>>>>>>>>>> >>>>>>>>>>>> Btw, you will miss executing of all the cleanup passes that will >>>>>>>>>>>> eventually free memory >>>>>>>>>>>> associated with the function. So I'd rather support a >>>>>>>>>>>> TODO_discard_function which >>>>>>>>>>>> should basically release the body from the cgraph. >>>>>>>>>>> >>>>>>>>>>> Hi. >>>>>>>>>>> >>>>>>>>>>> Agree with you that I should execute all TODOs, which can be easily >>>>>>>>>>> done. >>>>>>>>>>> However, if I just try to introduce the suggested TODO and handle >>>>>>>>>>> it properly >>>>>>>>>>> by calling cgraph_node::release_body, then for instance >>>>>>>>>>> fn->gimple_df, fn->cfg are >>>>>>>>>>> released and I hit ICEs on many places. >>>>>>>>>>> >>>>>>>>>>> Stopping the pass manager looks necessary, or do I miss something? >>>>>>>>>> >>>>>>>>>> "Stopping the pass manager" is necessary after >>>>>>>>>> TODO_discard_function, yes. >>>>>>>>>> But that may be simply done via a has_body () check then? >>>>>>>>> >>>>>>>>> Thanks, there's second version of the patch. I'm going to start >>>>>>>>> regression tests. >>>>>>>> >>>>>>>> As release_body () will free cfun you should pop_cfun () before >>>>>>>> calling it (and thus >>>>>>> >>>>>>> Well, release_function_body calls both push & pop, so I think calling >>>>>>> pop >>>>>>> before cgraph_node::release_body is not necessary. >>>>>> >>>>>> (ugh). >>>>>> >>>>>>> If tried to call pop_cfun before cgraph_node::release_body, I have cfun >>>>>>> still >>>>>>> pointing to the same (function *) (which is gcc_freed, but cfun != >>>>>>> NULL). >>>>>> >>>>>> Hmm, I meant to call pop_cfun then after it (unless you want to fix the >>>>>> above, >>>>>> none of the freeing functions should techincally need 'cfun', just add >>>>>> 'fn' parameters ...). >>>>>> >>>>>> I expected pop_cfun to eventually set cfun to NULL if it popped the >>>>>> "last" cfun. Why >>>>>> doesn't it do that? >>>>>> >>>>>>>> drop its modification). Also TODO_discard_functiuon should be only >>>>>>>> set for >>>>>>>> local passes thus you should probably add a gcc_assert (cfun). >>>>>>>> I'd move its handling earlier, definitely before the ggc_collect, >>>>>>>> eventually >>>>>>>> before the pass_fini_dump_file () (do we want a last dump of the >>>>>>>> function or not?). >>>>>>> >>>>>>> Fully agree, moved here. >>>>>>> >>>>>>>> >>>>>>>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass) >>>>>>>> { >>>>>>>> gcc_assert (pass->type == GIMPLE_PASS >>>>>>>> || pass->type == RTL_PASS); >>>>>>>> + >>>>>>>> + >>>>>>>> + if (!gimple_has_body_p (current_function_decl)) >>>>>>>> + return; >>>>>>>> >>>>>>>> too much vertical space. With popping cfun before releasing the body >>>>>>>> the check >>>>>>>> might just become if (!cfun) and >>>>>>> >>>>>>> As mentioned above, as release body is symmetric (calling push & pop), >>>>>>> the suggested >>>>>>> guard will not work. >>>>>> >>>>>> I suggest to fix it. If it calls push/pop it should leave with the >>>>>> original cfun, popping again >>>>>> should make it NULL? >>>>>> >>>>>>>> >>>>>>>> @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass) >>>>>>>> { >>>>>>>> push_cfun (fn); >>>>>>>> execute_pass_list_1 (pass); >>>>>>>> - if (fn->cfg) >>>>>>>> + if (gimple_has_body_p (current_function_decl) && fn->cfg) >>>>>>>> { >>>>>>>> free_dominance_info (CDI_DOMINATORS); >>>>>>>> free_dominance_info (CDI_POST_DOMINATORS); >>>>>>>> >>>>>>>> here you'd need to guard the pop_cfun call on cfun != NULL. IMHO it's >>>>>>>> better >>>>>>>> to not let cfun point to deallocated data. >>>>>>> >>>>>>> As I've read the code, since we gcc_free 'struct function', one can >>>>>>> just rely on >>>>>>> gimple_has_body_p (current_function_decl) as it correctly realizes that >>>>>>> DECL_STRUCT_FUNCTION (current_function_decl) == NULL. >>>>>> >>>>>> I'm sure that with some GC checking ggc_free makes them #deadbeef or so: >>>>>> >>>>>> void >>>>>> ggc_free (void *p) >>>>>> { >>>>>> ... >>>>>> #ifdef ENABLE_GC_CHECKING >>>>>> /* Poison the data, to indicate the data is garbage. */ >>>>>> VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (p, size)); >>>>>> memset (p, 0xa5, size); >>>>>> #endif >>>>>> >>>>>> so I don't think that's a good thing to rely on ;) >>>>>> >>>>>> Richard. >>>>> >>>>> Hi Richi, fully agree that relying of 0xdeadbeaf is not good. >>>>> I'm sending quite simple patch v4 where I enable popping up >>>>> the stack to eventually set cfun = current_function_decl = NULL. >>>>> >>>>> Question of using push & pop in cgraph_node::release_body should >>>>> be orthogonal as there are other places where the function is used. >>>>> >>>>> What do you think about the patch? >>>> >>>> @@ -4763,6 +4763,13 @@ push_cfun (struct function *new_cfun) >>>> void >>>> pop_cfun (void) >>>> { >>>> + if (cfun_stack.is_empty ()) >>>> + { >>>> + set_cfun (NULL); >>>> + current_function_decl = NULL_TREE; >>>> + return; >>>> + } >>>> + >>>> >>>> I'd like to avoid this by... >>>> >>>> @@ -2405,11 +2428,12 @@ execute_pass_list (function *fn, opt_pass *pass) >>>> { >>>> push_cfun (fn); >>>> execute_pass_list_1 (pass); >>>> - if (fn->cfg) >>>> + if (current_function_decl && fn->cfg) >>>> { >>>> free_dominance_info (CDI_DOMINATORS); >>>> free_dominance_info (CDI_POST_DOMINATORS); >>>> } >>>> + >>>> pop_cfun (); >>>> >>>> making this conditional on if (cfun). Btw, please check cfun against NULL, >>>> not current_function_decl. >>> >>> Done. >>> >>>> >>>> + if (dom_info_available_p (CDI_POST_DOMINATORS)) >>>> + free_dominance_info (CDI_POST_DOMINATORS); >>>> + >>>> + /* Pop function inserted in execute_pass_list. */ >>>> + pop_cfun (); >>>> + >>>> + cgraph_node::get (cfun->decl)->release_body (); >>>> + >>>> + /* Set cfun and current_function_decl to be NULL. */ >>>> + pop_cfun (); >>>> + } >>>> >>>> this also looks weird. Because you pop_cfun and then access cfun and >>>> then you pop cfun again? I'd say most clean would be >>>> >>>> + if (dom_info_available_p (CDI_POST_DOMINATORS)) >>>> + free_dominance_info (CDI_POST_DOMINATORS); >>>> + tree fn = cfun->decl; >>>> pop_cfun (); >>>> cgraph_node::get (fn)->release_body (); >>>> } >>>> >>>> or does the comment say that the current function is on the stack >>>> twice somehow? How can that happen? >>> >>> You are right, your change applied. >>> >>> Sending V5. If OK, I'll create corresponding changelog entry >>> and run regression tests. >> >> You still have >> >> @@ -4763,6 +4763,13 @@ push_cfun (struct function *new_cfun) >> void >> pop_cfun (void) >> { >> + if (cfun_stack.is_empty ()) >> + { >> + set_cfun (NULL); >> + current_function_decl = NULL_TREE; >> + return; >> + } >> >> and popping of cfun twice in execute_one_pass. >> >> Richard. > > Right and that's the way how I set current_function_decl = cfun = NULL > (both pop_cfun are commented what they do). > As the popping is done at the end of execute_pass_list and having empty > stack just sets to NULL, it should be fine.
popping it once should have that effect already. If not then go figure why it does not. Richard. > Martin > >> >>> Thanks, >>> Martin >>> >>>> >>>> Richard. >>>> >>>> >>>>> Thanks, >>>>> Martin >>>>> >>>>>> >>>>>>> I'm attaching v3. >>>>>>> >>>>>>> Thanks, >>>>>>> Martin >>>>>>> >>>>>>>> >>>>>>>> Otherwise looks good to me. >>>>>>>> >>>>>>>> Richard. >>>>>>>> >>>>>>>> >>>>>>>>> Martin >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Martin >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Richard. >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Martin >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> >