Hi Richard, Thanks again for your review!
on 2020/11/2 下午6:23, Richard Sandiford wrote: > "Kewen.Lin" <li...@linux.ibm.com> writes: >> diff --git a/gcc/function.c b/gcc/function.c >> index 2c8fa217f1f..3e92ee9c665 100644 >> --- a/gcc/function.c >> +++ b/gcc/function.c >> @@ -4841,6 +4841,8 @@ allocate_struct_function (tree fndecl, bool abstract_p) >> binding annotations among them. */ >> cfun->debug_nonbind_markers = lang_hooks.emits_begin_stmt >> && MAY_HAVE_DEBUG_MARKER_STMTS; >> + >> + cfun->pending_TODOs = 0; > > The field is cleared on allocation. I think it would be better > to drop this, to avoid questions about why other fields aren't > similarly zero-initialised. > >> } >> >> /* This is like allocate_struct_function, but pushes a new cfun for FNDECL >> diff --git a/gcc/function.h b/gcc/function.h >> index d55cbddd0b5..ffed6520bf9 100644 >> --- a/gcc/function.h >> +++ b/gcc/function.h >> @@ -269,6 +269,13 @@ struct GTY(()) function { >> /* Value histograms attached to particular statements. */ >> htab_t GTY((skip)) value_histograms; >> >> + /* Different from normal TODO_flags which are handled right at the >> + begin or the end of one pass execution, the pending_TODOs are > > beginning > >> + passed down in the pipeline until one of its consumers can >> + perform the requested action. Consumers should then clear the >> + flags for the actions that they have taken. */ >> + unsigned int pending_TODOs; >> + >> /* For function.c. */ >> >> /* Points to the FUNCTION_DECL of this function. */ >> […] >> diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c >> index 298ab215530..9a9076cee67 100644 >> --- a/gcc/tree-ssa-loop-ivcanon.c >> +++ b/gcc/tree-ssa-loop-ivcanon.c >> @@ -1411,6 +1411,13 @@ tree_unroll_loops_completely_1 (bool >> may_increase_size, bool unroll_outer, >> bitmap_clear (father_bbs); >> bitmap_set_bit (father_bbs, loop_father->header->index); >> } >> + else if (unroll_outer >> + && !(cfun->pending_TODOs >> + & PENDING_TODO_force_next_scalar_cleanup)) >> + { >> + /* Trigger scalar cleanup once any outermost loop gets unrolled. */ >> + cfun->pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup; >> + } > > I can see it would make sense to test whether the flag is already set > if we were worried about polluting the cache line. But this test and > set isn't performance-sensitive, so I think it would be clearer to > remove the “&& …” part of the condition. > > Nit: there should be no braces around the block, since it's a single > statement. > > OK with those changes, thanks. The patch was updated as your comments above, re-tested on Power8 and committed in r11-4637. BR, Kewen