Re: [PATCH][RTL ifcvt] Transform (X == CST) ? -CST : Y into (X == CST) ? -X : Y when conditional negation is available

Jeff Law Thu, 29 Sep 2016 10:35:32 -0700

On 09/28/2016 10:33 AM, Kyrill Tkachov wrote:

FWIW, the simpler form of the transformation is already done just
prior to leaving SSA form in tree-ssa-uncprop.c.  So there may not be
a ton of opportunities for the simpler form in the RTL optimizers.


Indeed there weren't that many places, but they were some. They helped
in avoiding materialising the floating point 0.0
in particular in some cases.

Yea, that makes perfect sense. I'm pretty sure we don't do thetransformation on any floating point constants.

My only concern here is this transformation isn't significantly
different than the simpler form, which is apparently causing some kind
of combine problem.

I'd like to understand the combine problem better before giving final
approval to this patch.

So it's performance/codesize issue, not a correctness issue. For somereason I thought it was a correctness issue and thus neededinvestigation prior to this patch going forward.


Sure, I'm attaching the ifcvt patch that does the aforementioned simple
transform.
The testcase it regresses is:
int
foo (float a, float b, float c, int d)
{
  return a > 5 && a < 10 ? 6 : 0;
}

As far as I got is:
The relevant pre-combine (aarch64) RTL without the patch is:
(insn 21 19 45 2 (set (reg:SI 87)
        (zero_extend:SI (subreg:QI (reg:SI 86) 0))) fbad.c:4 90
{*zero_extendqisi2_aarch64}
     (expr_list:REG_DEAD (reg:SI 86)
        (nil)))
(insn 47 45 48 2 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 87)
            (const_int 0 [0]))) fbad.c:4 392 {cmpsi}
     (expr_list:REG_DEAD (reg:SI 87)
        (nil)))
(insn 48 47 30 2 (set (reg:SI 76 [ <retval> ])
        (if_then_else:SI (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg:SI 89)
            (const_int 0 [0]))) fbad.c:4 444 {*cmovsi_insn}
     (expr_list:REG_DEAD (reg:SI 89)
        (expr_list:REG_DEAD (reg:CC 66 cc)
            (nil))))


and with the patch it is:
(insn 21 19 45 2 (set (reg:SI 87)
        (zero_extend:SI (subreg:QI (reg:SI 86) 0))) fbad.c:4 90
{*zero_extendqisi2_aarch64}
     (expr_list:REG_DEAD (reg:SI 86)
        (nil)))
(insn 46 45 47 2 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 87)
            (const_int 0 [0]))) fbad.c:4 392 {cmpsi}
     (nil))
(insn 47 46 30 2 (set (reg:SI 76 [ <retval> ])
        (if_then_else:SI (eq (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg:SI 87)
            (reg:SI 89))) fbad.c:4 444 {*cmovsi_insn}
     (expr_list:REG_DEAD (reg:SI 89)
        (expr_list:REG_DEAD (reg:SI 87)
            (expr_list:REG_DEAD (reg:CC 66 cc)
                (nil)))))

Combine is trying to combine 21 -> 47 in the first case (21 -> 46 in the
second) and without the
transformation ends up looking into the uses of CC as well, the
IF_THEN_ELSE in insn 48.
But I think it only does that if reg 87 is dead in insn 47.

Yea, I see what you're saying.

In the end what happens is it fails to merge the comparison and the use
of the comparison
and ends up emitting extra code.
So before:
    fmov    s2, 5.0e+0
    fmov    s1, 1.0e+1
    mov    w0, 6
    fcmpe    s0, s2
    fccmpe    s0, s1, 0, gt
    csel    w0, w0, wzr, mi //conditional select
    ret

after:
foo:
    fmov    s2, 5.0e+0
    fmov    s1, 1.0e+1
    mov    w1, 6
    fcmpe    s0, s2
    fccmpe    s0, s1, 0, gt
    cset    w0, mi // conditional set
    cmp    w0, 0
    csel    w0, w0, w1, eq
    ret

I think it has to do with the logic to set and use added_sets_2 in
try_combine
but I got lost trying to debug it, so I sort of attributed it to
"extending live ranges
of registers is bad"

Well, extending the live range restricts what combine can do in somecases. Essentially combine has to preserve the assignment to (reg 87).It'll end up trying to match a more complex parallel pattern in thatcase -- which likely fails to be recognized as a valid pattern.


These can sometimes be addressed with 3->2 splitters.

Anyway, since the combine issue is strictly a performance questionrather than a correctness issue, your patch is OK for the trunk.


Jeff

Re: [PATCH][RTL ifcvt] Transform (X == CST) ? -CST : Y into (X == CST) ? -X : Y when conditional negation is available

Reply via email to