Re: Compile-time improvement for if conversion.

Yuri Rumyantsev Fri, 14 Oct 2016 05:07:54 -0700

Richard,

Here is updated patch with the changes proposed by you.


Bootstrapping and regression testing did not show any new failures.
Is it OK for trunk?

ChangeLog:
2016-10-14  Yuri Rumyantsev  <ysrum...@gmail.com>

* dominance.c (dom_info::dom_info): Add new constructor for region
presented by vector of basic blocks.
(dom_init): New method to initialize members common for both
constructors.
(dom_info::dom_info): Invoke dom_init for partial initialization.
(dom_info::get_idom): Add check to corner cases on basic blocks which
are not in region.
(dom_info::calc_dfs_tree): Check M_FAKE_EXIT_EDGE instead of M_REVERSE
to detect unreachable bbs.
(dom_info::calc_idoms): Likewise.
(compute_dom_fast_query_in_region): New function.
(calculate_dominance_info_for_region): Likewise.
(free_dominance_info_for_region): Add couple functions to free
dominance info for region.
* dominance.h: Add prototypes for introduced region-based functions
tree-if-conv.c: (build_region): New function.
(if_convertible_loop_p_1): Invoke local version of post-dominators
calculation before basic block predication with subsequent freeing
post-dominator info.
(tree_if_conversion): Remove free of post-dominator info
(pass_if_conversion::execute): Delete detection of infinite loops
and fake edges to exit block since post-dominator calculation is
performed per if-converted loop only.

2016-10-13 13:15 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Richard,
>>
>> Here is updated patch . I avoided creation of new entry/exit blocks
>> but instead add check to border cases - do not consider also blocks
>> which are out of region.
>>
>> Any comments will be appreciated.
>
> I mostly like it now.  Can you split out all the common parts from the
> dom_info constructor
> to a helper method please and just compute m_n_basic_blocks and a max_index 
> and
> do all the memory allocation in that common function?
>
> @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
>               bn = e->src;
>
>               /* If the next node BN is either already visited or a border
> -                block the current edge is useless, and simply overwritten
> -                with the next edge out of the current node.  */
> -             if (bn == m_end_block || m_dfs_order[bn->index])
> +                block or out of region the current edge is useless, and 
> simply
> +                overwritten with the next edge out of the current node.  */
> +             if (bn == m_end_block || bn->dom[d_i] == NULL
>
> clever idea ;)  Just thinking if this means we support single entry,
> multiple exit
> regions for CDI_DOMINATORs and multiple entry, single exit regions for
> CDI_POST_DOMINATORs.  I think so.  Please update the function comments
> accordingly.
>
>
> +  m_dfsnum = 1;
> +  m_nodes = 0;
> +  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
>
> you mean SESE rather than SCC?
>
> @@ -511,7 +573,7 @@ dom_info::calc_idoms ()
>                                    : ei_start (bb->preds);
>        edge_iterator einext;
>
> -      if (m_reverse)
> +      if (m_reverse && !in_region)
>         {
>           /* If this block has a fake edge to exit, process that first.  */
>           if (bitmap_bit_p (m_fake_exit_edge, bb->index))
>
> I think it's better to simply change the if (m_reverse) test to a if
> (m_fake_exit_edge) test.
>
> I noticed you do not initialize n_bbs_in_dom_tree (just set it to
> zero), it's not really used
> anywhere (but in an assert).
>
> free_dominance_info_for_region needs a function level comment (per
> coding guidelines).
>
> Please make free_dominance_info_for_region take a struct function * pointer.
>
>  @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
> vec<data_reference_p> *refs)
>  {
>    unsigned int i;
>    basic_block exit_bb = NULL;
> +  vec<basic_block> region;
>
>    if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
>      return false;
>
> -  calculate_dominance_info (CDI_DOMINATORS);
> +  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
> +    calculate_dominance_info (CDI_DOMINATORS);
>
> This is a premature (unwanted) change.
>
> Other than that the only O(function-size) part left is the zeroing in
>
> +  /* Determine max basic block index in region.  */
> +  int max_index = region[0]->index;
> +  for (size_t i = 1; i < num; i++)
> +    if (region[i]->index > max_index)
> +      max_index = region[i]->index;
> +  max_index += 1;
> +  m_dfs_order = new_zero_array <TBB> (max_index + 1);
> +  m_dfs_last = &m_dfs_order[max_index];
>
> I suppose we can try addressing this as followup.
>
> Thanks a lot for doing this.
> Richard.
>
>> 2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>>> Richard,
>>>>
>>>> I implemented this by passing callback function in_region which
>>>> returns true if block belongs to region.
>>>> I am testing it now
>>>>
>>>> I attach modified patch for your quick review.
>>>
>>> +  FOR_EACH_VEC_ELT (region, i, bb)
>>> +    {
>>> +      bb->dom[dir_index] = et_new_tree (bb);
>>> +    }
>>> +  /* Check that region is SESE region.  */
>>> +  entry = region[0];
>>> +  if ( EDGE_COUNT (entry->succs) > 1)
>>> +    {
>>> +      /* Create new entry block with the only successor.  */
>>> +      basic_block succ = NULL;
>>> +      bool found = false;
>>> +      FOR_EACH_EDGE (e, ei, entry->succs)
>>> +       if (in_region (region, e->dest))
>>> +         {
>>> +           gcc_assert (!found);
>>>
>>> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
>>> shouldn't need the callback.
>>>
>>> +    new_entry = create_empty_bb (entry);
>>> +    unchecked_make_edge (new_entry, succ, 0);
>>>
>>> I still think this is somewhat gross and we should try to avoid it
>>> somehow - at least it's well-hidden now ;)
>>>
>>> +    /* Put it to region as entry node.  */
>>> +    region[0] = new_entry;
>>>
>>> so region[0] is overwritten now?
>>>
>>> As far as I understand calc_dfs_tree it should be possible to compute
>>> the region on-the-fly
>>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>>> avoiding some of the still
>>> "large" complexity of zeroing arrays in the constructor).
>>>
>>> And if we use that DFS walk directly we should be able to avoid
>>> creating those fake entry/exit
>>> blocks by using entry/exit edges instead... (somehow).
>>>
>>> Richard.
>>>
>>>
>>>
>>>> Thanks.
>>>>
>>>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>> wrote:
>>>>>> Richard,
>>>>>>
>>>>>> If "fake" exit or entry block is created in dominance how we can
>>>>>> determine what is its the only  predecessor or successor without using
>>>>>> a notion of loop?
>>>>>
>>>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>> wrote:
>>>>>>>> Thanks Richard for your comments.
>>>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>>>> Other changes look reasonable and will fix them.
>>>>>>>
>>>>>>> Ah.  In this case can you investigate what it takes to make the 
>>>>>>> entry/exit
>>>>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>>>>> in dominance.c?
>>>>>>>
>>>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>>>> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>>>
>>>>>>>>>> < For general infrastructure it would be nice to expose a 
>>>>>>>>>> (post-)dominator
>>>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I 
>>>>>>>>>> believe
>>>>>>>>>> < what makes if-conversion expensive is the post-dom compute which 
>>>>>>>>>> happens
>>>>>>>>>> < for each loop for the whole function.  It shouldn't be very 
>>>>>>>>>> difficult
>>>>>>>>>> < to write this,
>>>>>>>>>> < sharing as much as possible code with the current DOM code might 
>>>>>>>>>> need
>>>>>>>>>> < quite some refactoring though.
>>>>>>>>>>
>>>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>>>> predication completion.
>>>>>>>>>>
>>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>>
>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>
>>>>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>>>> !dom_info_available_p (dir).
>>>>>>>>>
>>>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>>>
>>>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>>>
>>>>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>>>
>>>>>>>>> In theory a SESE region is fully specified by its entry end exit 
>>>>>>>>> _edge_,
>>>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>>>
>>>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> ChangeLog:
>>>>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>
>>>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>>>> computation for loop region.
>>>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>>>> handle unvisited blocks.
>>>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>>>> false argument.
>>>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>>>> argument.
>>>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>>>> free_dominance_info_for_region.
>>>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & 
>>>>>>>>>> fake_postheader.
>>>>>>>>>> (build_sese_region): New function.
>>>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>>>> fake post-header block if any.
>>>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>>>> performed per if-converted loop only.

patch.4
Description: Binary data

Re: Compile-time improvement for if conversion.

Reply via email to