Re: [PATCH] Add one more pass_convert_switch late.

Martin Liška Fri, 27 Mar 2020 08:56:00 -0700

On 11/19/19 3:48 PM, Richard Biener wrote:

Sure.  As said in the if-to-switch pass "review" all these (now three)
passes should share an intermediate representation and costing they
then generate code from (either switch-converted lookups, switches
or optimized if-chains).  There's no reason to actually generate a switch
from an if-chain if we just want to optimize the if-chain (it might be
convenient for the code generator though, not sure).  Likewise if the
switch ends up switch-converted there's no reason to produce the
switch first.


But first and foremost sharing costing and transform decision code
which means having some unified representation of the IL semantics
is what is needed for this.


Hi.

I'm slowly but surely planning to continue working on that in next stage1.
There are concerns that should be addressed:

1) if-chain to switch conversion pass made conversion unconditionally without
analysis if a switch can be transformed in a clever way (cswitch, jump-table 
(JT), bit-test (BT)).

That can be solved by having an abstract analysis model. One problematic thing 
is cswitch
conversion that requires various CFG constraints and I'm not sure that early 
tree pass
can prove that a switch can eventually become CSWITCH.

2) we observed that current switch expansion can't support tree reassoc tricks:
optimize_range_tests_xor and optimize_range_tests_diff

I can add that to switch expansion but I see one problem. The switch expansion 
builds
top-level decision tree that contains JT, BT or a simple equality comparison 
nodes.
And each node handles only a consecutive interval. On the contrary the XOR and 
DIFF tricks
can handle any 2 comparisons. Moreover, these tests generate a if-chain that 
goes at the beginning
of a comparison.

There's one counter-example which presents the problem:

For foo we get:

Optimizing range tests a_20(D) -[33, 33] and -[41, 41]
 into (a_20(D) & -9) != 33
Optimizing range tests a_20(D) -[80, 80] and -[88, 88]
 into (a_20(D) & -9) != 80
Optimizing range tests a_20(D) -[99, 99] and -[115, 115]
 into (a_20(D) & -17) != 99
Optimizing range tests a_20(D) -[130, 130] and -[162, 162]
 into (a_20(D) & -33) != 130
Optimizing range tests a_20(D) -[200, 200] and -[216, 216]
 into (a_20(D) & -17) != 200
Optimizing range tests a_20(D) -[3, 3] and -[5, 5]
 into ((unsigned int) a_20(D) + 4294967293 & 4294967293) != 0

so a chain of 6 comparisons.

For bar we can generate a 3 BTs with the following balanced tree on the top:

;;               BT:3-41 21.7% (guessed) subtree: 21.7% (guessed))
;;        BT:80-130 27.1% (guessed) subtree: 65.0% (guessed))
;;               BT:162-216 16.3% (guessed) subtree: 16.3% (guessed))

There can be also situations where a simple balanced tree can be faster than a 
series of
the XOR and DIFF tests.

So it's not an easy task to integrate XOR and DIFF tricks into switch 
conversion with a reasonable
costing model.

3) I can imagine a huge number if-else chain can benefit from simple balanced 
tree expansion. It can
be even better with PGO.

So maybe we can transform these huge if-chains into switch unconditionally? 
Examples:

gcc/c-family/c-common.c:2302
gcc/c-family/c-common.c:2871

Thoughts?
Thanks,
Martin

#define T(v, step) a == v || a == (v + step)
#define C(v, step) case v: case (v + step)

int foo(int a)
{
  if (T(3, 2) || T(33, 8) || T(80, 8) || T(99, 16) || T(130, 32) || T(200, 16))
    return 1;
  else
    return 4;
}

int bar(int a)
{
  switch (a)
  {
    C(3, 2):
    C(33, 8):
    C(80, 8):
    C(99, 16):
    C(130, 32):
    C(200, 16):
      return 1;
    default:
    return 4;
  }
}

Re: [PATCH] Add one more pass_convert_switch late.

Reply via email to