On 9/29/20 10:46 AM, Richard Biener wrote:
On Fri, Sep 25, 2020 at 4:05 PM Martin Liška <mli...@suse.cz> wrote:

On 9/24/20 2:41 PM, Richard Biener wrote:
On Wed, Sep 2, 2020 at 1:53 PM Martin Liška <mli...@suse.cz> wrote:

On 9/1/20 4:50 PM, David Malcolm wrote:
Hope this is constructive
Dave

Thank you David. All of them very very useful!

There's updated version of the patch.

Hey.

What a juicy patch review!


I noticed several functions without a function-level comment.

Yep, but several of them are documented in a class declaration. Anyway, I will
improve for the next time.


-  cluster (tree case_label_expr, basic_block case_bb, profile_probability prob,
-          profile_probability subtree_prob);
+  inline cluster (tree case_label_expr, basic_block case_bb,
+                 profile_probability prob, profile_probability subtree_prob);

I thought we generally leave this to the compiler ...

+@item -fconvert-if-to-switch
+@opindex fconvert-if-to-switch
+Perform conversion of an if cascade into a switch statement.
+Do so if the switch can be later transformed using a jump table
+or a bit test.  The transformation can help to produce faster code for
+the switch statement.  This flag is enabled by default
+at @option{-O2} and higher.

this mentions we do this only when we later can convert the
switch again but both passes (we still have two :/) have
independent guards.

Yes, we have the option for jump tables (-jump-tables), but we miss one for a 
bit-test.
Moreover, as mentioned in the cover email, one can see it beneficial to convert 
a if-chain
to switch as the expansion (without any BT and JT) can benefit from balanced 
tree.


+  /* For now, just wipe the dominator information.  */
+  free_dominance_info (CDI_DOMINATORS);

could at least be conditional on the vop renaming condition...

+  if (!all_candidates.is_empty ())
+    mark_virtual_operands_for_renaming (fun);

Yep.


+      if (bitmap_bit_p (*visited_bbs, bb->index))
+       break;
+      bitmap_set_bit (*visited_bbs, bb->index);

since you are using a bitmap and not a sbitmap (why?)
you can combine those into

New to me, thanks.


     if (!bitmap_set_bit (*visited_bbs, bb->index))
      break;

+      /* Current we support following patterns (situations):
+
+        1) if condition with equal operation:
+
...

did you see whether using

     register_edge_assert_for (lhs, true_edge, code, lhs, rhs, asserts);

works equally well?  It fills the 'asserts' vector with relations
derived from 'lhs'.  There's also
vr_values::extract_range_for_var_from_comparison_expr
to compute the case_range

Good point! I must admit that my patch doesn't properly handle negative 
conditions:

    if (argc != 11111)
    {
      if (argc == 1)
        global = 222;
      ...
    }

which can VRP correctly identify as anti-range:
int ~[11111, 11111]  EQUIVALENCES: { argc_8(D) } (1 elements)$1 = void

I have question about OR and AND conditions:

    <bb 2> :
    _1 = aChar_8(D) == 1;
    _2 = aChar_8(D) == 10;
    _3 = _1 | _2;
    if (_3 != 0)
      goto <bb 6>; [INV]
    else
      goto <bb 3>; [INV]

    <bb 2> :
    _1 = aChar_8(D) != 1;
    _2 = aChar_8(D) != 10;
    _3 = _1 & _2;
    if (_3 != 0)
      goto <bb 6>; [INV]
    else
      goto <bb 3>; [INV]

Can I somehow get that from VRP (as I ask register_edge_assert_for only for LHS
of a condition)?

Yes, you simply get all sorts of conditions that hold when a condition is
true, not just those based on the SSA name you put in.  But it occured
to me that the use-case is somewhat different - for switch-conversion
you want to know whether the test _exactly_ matches a range test,
the VRP worker will not tell you that.  For example if you had
if (x &&  a > 3 && a < 7) then it will give you 'a in [4, 6]' and it might
not give you 'x in [1, 1]' (for example if x is float).  But that's required
for correctness.

Hello.

Adding Ranger guys. Is it something that can be handled by the upcoming changes 
in VRP?


So we're back to your custom crafted code unless we manage to somehow
refactor the VRP condition analysis to handle both cases (should be
possible I think, but have not looked too closely).  Maybe the actual
matching pieces can be shared at least.


+      /* If it's not the first condition, then we need a BB without
+        any statements.  */
+      if (!first)
+       {
+         unsigned stmt_count = 0;
+         for (gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb);
+              !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
+           ++stmt_count;
+
+         if (stmt_count - visited_stmt_count != 0)
+           break;

hmm, OK, this might be a bit iffy to get correct then, still it's a lot
of pattern maching code that is there elsewhere already.
ifcombine simply hoists any stmts without side-effects up the
dominator tree and thus only requires BBs without side-effects
(IIRC there's a predicate fn for that).

Yes, I completely miss support for code hoisting (expect first BB where we put 
gswitch).
If I'm correct hoisting should be possible where case destination should be a 
new BB
that will contain original statements and then it will jump to a case 
destination block.


+      /* Prevent loosing information for a PHI node where 2 edges will
+        be folded into one.  Note that we must do the same also for false_edge
+        (for last BB in a if-elseif chain).  */
+      if (!chain->record_phi_arguments (true_edge)
+         || !chain->record_phi_arguments (false_edge))

I don't really get this - looking at record_phi_arguments it seems
we're requiring that all edges into the same PHI from inside the case
(irrespective of from which case label) have the same value for the
PHI arg?

I guess so, I'll refresh the functionality.


+             if (arg != *v)
+               return false;

should use operand_equal_p at least, REAL_CSTs are for example
not shared tree nodes.  I'll also notice that if record_phi_arguments
fails we still may have altered its hash-map even though the particular
edge will not participate in the current chain, so it will affect other
chains ending in the same BB.  Overall this looks a bit too conservative
(and random, based on visiting order).

Oh, yes, it's not properly cleared once we bail out for a particular chain.


+    expanded_location loc
+    = expand_location (gimple_location (chain->m_first_condition));
+      if (dump_file)
+       {
+         fprintf (dump_file, "Condition chain (at %s:%d) with %d conditions "
+                  "(%d BBs) transformed into a switch statement.\n",
+                  loc.file, loc.line, total_case_values,
+                  chain->m_entries.length ());

Use dump_printf_loc and you can pass a gimple * stmt as location.

+      /* Follow if-elseif-elseif chain.  */
+      bb = false_edge->dest;

so that means the code doesn't handle a tree, right?  But what
makes us sure the chain doesn't continue on the true_edge instead,
guess this degenerate tree isn't handled either.

As mentioned earlier, I didn't consider VAR != CST type of conditions that
makes it more complicated.


I was thinking on whether doing the switch discovery in a reverse
CFG walk, recording for each BB what case_range(s) it represents
for a particular variable(s) so when visiting a dominator you
can quickly figure what's the relevant children (true, false or both).

Sounds promising. Note that right now we do not support overlapping cases like:

    if (5 <= argc && argc <= 10)
      foo ();
    else if (6 <= argc && argc <= 100)
      foo ();

So I'm wondering if we can support 2 children?

It would also make the matching a BB-local operation where you'd
do the case_label discovery based on the single-pred BBs gimple-cond.

Can you please describe more how will the walk work?


+  output = bit_test_cluster::find_bit_tests (filtered_clusters);
+  r = output.length () < filtered_clusters.length ();
+  if (r)
+    dump_clusters (&output, "BT can be built");

so as of the very above comment - this might be guarded with
flag_tree_switch_conversion?

flag_tree_switch_conversion isn't connected to the if-chain pass (yet).


As mentioned previously I would have liked to see if-to-switch
integrated with switch-conversion, having separate passes is
somewhat awkward (for example the redundant and possibly
expensive find_bit_tests).

Well, the CFG transformation for BT and JT is not trivial and I would like
to go in the first iteration through gswitch statements.
I have a massive speed up for the find_bit_tests/find_jump_tables.


+         /* Move all statements from the BB to the BB with gswitch.  */
+         auto_vec<gimple *> stmts;
+         for (gimple_stmt_iterator gsi = gsi_start_bb (entry.m_bb);
+              !gsi_end_p (gsi); gsi_next (&gsi))
+           {
+             gimple *stmt = gsi_stmt (gsi);
+             if (gimple_code (stmt) != GIMPLE_COND)
+               stmts.safe_push (stmt);
+           }
+
+         for (unsigned i = 0; i < stmts.length (); i++)
+           {
+             gimple_stmt_iterator gsi_from = gsi_for_stmt (stmts[i]);
+             gsi_move_before (&gsi_from, &gsi);
+           }

so you are already hoisting all stmts ...

As mentioned, it's not supported right now. This moves all these kind of "temp" 
statements:

    _1 = aChar_8(D) == 1;
    _2 = aChar_8(D) == 10;
    _3 = _1 | _2;

Sure, but that _is_ hoisting of all stmts.

No, hoisting for a real statement would break chain recognition by this code:

       /* If it's not the first condition, then we need a BB without
         any statements.  */
      if (!first)
        {
          unsigned stmt_count = 0;
          for (gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb);
               !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
            ++stmt_count;

          if (stmt_count - visited_stmt_count != 0)
            break;
        }

Martin


Martin


+      make_edge (first_cond.m_bb, case_bb, 0);

and if this doesn't create a new edge you need equivalent PHI
args in the case_bb.  To remove this restriction you "only"
have to add a forwarder.  Sth like

     edge e = make_edge (...);
     if (!e)
       {
          bb = create_basic_block ();
          make_edge (first_cond.m_bb, bb, 0);
          e = make_edge (bb, case_bb, 0);
       }
    fill PHI arg of 'e' from original value (no need to create the hash-map 
then)

Richard.


Martin


Reply via email to