On 09/18/2015 05:21 PM, Jiong Wang wrote:

Current conditional compare (CCMP) support in GCC aims to optimize
short-circuit evaluation of cascaded comparisons. Given a simple conditional
compare candidate:

   if (a == 17 || a == 32)
[...]
The problem is that the current implementation always expands t0 first, then
t1. The expand order needs to consider the rtx costs, because "cmp"
and "ccmp" may have different restrictions, so the expand order can
result in performance differences.

For example on AArch64, "ccmp" only accepts immediates within -31 ~ 31
while "cmp" accepts a wider range, so if we expand "a == 32" in the second
step, it will use "ccmp", and thus an extra "mov reg, 32"
instruction is generated because "32" is out of range. If we
expand "a == 32" first, it's fine as "cmp" can encode it.
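
As a rough illustration of the ordering problem described above, here is a minimal sketch in C of the decision being asked for. It is not the patch and not GCC's rtx_cost machinery: fits_ccmp_imm, ccmp_cost and should_swap are hypothetical names, the cost units are made up, and it assumes both constants are equally cheap for a plain "cmp".

```c
#include <stdbool.h>

/* Hypothetical helper: true if VAL is inside the "ccmp" immediate
   range quoted above (-31 ~ 31).  */
static bool
fits_ccmp_imm (long val)
{
  return val >= -31 && val <= 31;
}

/* Hypothetical cost of "a == VAL" when it is expanded second, i.e. as
   a "ccmp": one unit for the ccmp itself, plus one unit for the extra
   "mov reg, VAL" when the constant is out of range.  */
static int
ccmp_cost (long val)
{
  return fits_ccmp_imm (val) ? 1 : 2;
}

/* Given "a == C0 || a == C1", return true if the operands should be
   swapped so that "a == C1" is expanded first (as "cmp") and
   "a == C0" second (as "ccmp").  Assumption: "cmp" can encode both
   constants at the same cost, so only the "ccmp" side matters.  */
static bool
should_swap (long c0, long c1)
{
  return ccmp_cost (c1) > ccmp_cost (c0);
}
```

With the example from the description, should_swap (17, 32) is true: expanding "a == 32" second would need the extra mov, while expanding "a == 17" second would not.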

I've played with this patch a bit with an aarch64 cross compiler. First of all, it doesn't seem to work: I get identical costs and the swapping doesn't happen. Did you forget to include an rtx_cost patch?

I was a little worried about whether this would be expensive for longer sequences of conditions, but it seems like it looks only at leaves where we have two comparisons, so that cost should be minimal. However, it's possible there's room for improvement in code generation. I was curious and looked at a slightly more complex testcase:

int
foo (int a)
{
  if (a == 17 || a == 32 || a == 47 || a == 53 || a == 66 || a == 72)
    return 1;
  else
    return 2;
}

and this doesn't generate a sequence of ccmps as might have been expected; we only get pairs of comparisons merged with a bit_ior:

  D.2699 = a == 17;
  D.2700 = a == 32;
  D.2701 = D.2699 | D.2700;
  if (D.2701 != 0) goto <D.2697>; else goto <D.2702>;
  <D.2702>:
  D.2703 = a == 47;
  D.2704 = a == 53;
  D.2705 = D.2703 | D.2704;
  if (D.2705 != 0) goto <D.2697>; else goto <D.2706>;

and the ccmp expander doesn't see the entire thing. I found that a little surprising TBH.
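
To make the grouping easier to see, the dump can be written back as C, roughly as below. This is a hand-written sketch, not compiler output: foo_pairs and the temporaries t0..t2 are made-up names, and the third pair (66 and 72) is an assumption that it follows the same pattern, since the dump shown above stops after the second pair.

```c
/* The original testcase.  */
int
foo (int a)
{
  if (a == 17 || a == 32 || a == 47 || a == 53 || a == 66 || a == 72)
    return 1;
  else
    return 2;
}

/* Hand-written equivalent of the dump: each "if" sees only one pair
   of comparisons merged with a bit_ior.  */
int
foo_pairs (int a)
{
  int t0, t1, t2;

  t0 = a == 17;
  t1 = a == 32;
  t2 = t0 | t1;
  if (t2 != 0)
    return 1;

  t0 = a == 47;
  t1 = a == 53;
  t2 = t0 | t1;
  if (t2 != 0)
    return 1;

  /* Assumed: not shown in the dump above.  */
  t0 = a == 66;
  t1 = a == 72;
  t2 = t0 | t1;
  if (t2 != 0)
    return 1;

  return 2;
}
```

Each branch condition has exactly two leaves, so a ccmp chain built from any one of them can be at most two comparisons long.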


Bernd
