On 28/11/18 6:52 pm, Thomas Helland wrote:
Den ons. 28. nov. 2018 kl. 04:26 skrev Timothy Arceri <tarc...@itsqueeze.com>:
This adds allows loop analysis to detect inductions varibales that
are incremented in both branches of an if rather than in a main
loop block. For example:
loop {
block block_1:
/* preds: block_0 block_7 */
vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
vec1 32 ssa_14 = ige ssa_8, ssa_5
/* succs: block_2 block_3 */
if ssa_14 {
block block_2:
/* preds: block_1 */
break
/* succs: block_8 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
vec1 32 ssa_15 = ilt ssa_6, ssa_8
/* succs: block_5 block_6 */
if ssa_15 {
block block_5:
/* preds: block_4 */
vec1 32 ssa_16 = iadd ssa_8, ssa_7
vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
/* succs: block_7 */
} else {
block block_6:
/* preds: block_4 */
vec1 32 ssa_18 = iadd ssa_8, ssa_7
vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
/* succs: block_7 */
}
block block_7:
/* preds: block_5 block_6 */
vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
/* succs: block_1 */
}
Unfortunatly GCM could move the addition out of the if for us
(making this patch unrequired) but we still cannot enable the GCM
pass without regressions.
Just some questions / suggestions from my side for now.
I'll try to take a closer look at the patch later today.
While GCM would be nice, to me it seems that adding an
if-opt instead, that pulls common code from both branches
of an if out of the if on a more general basis, would get us
this, plus a bunch of other benefits? As far as I can see there
should never be negative impacts from pulling common code
out like that, but I might be wrong. Did you look into that?
I bet out did, I'm just interested in how that worked out.
I didn't attempt this because pulling code out of the ifs can increase
register pressure. This is one of the problems we have with the GCM pass
currently, so for now I chose a more conservative approach.
Since GCM is not yet where we want it to be, maybe we'd
want to implement LICM? That obviously does not come
into play with what this patch adresses, but it might help
get a more accurate estimate of the cost/benefit of unrolling?
(Invariant computations that will be CSE'd will not be
counted multiple times). This might already be accounted
for by counting the invariant computations only once?
No we don't do anything like this currently. The GCM pass can pull
things out of loops also, but again we hit register pressure issues with
that pass.
As far as I'm aware reducing invariants is not where we get most of our
wins from with unrolling. Removing indirect array access, improving
opportunities for constant folding (and a bunch of other passes), being
able to evaluate the unfolded loop with the surrounding code etc all
result in greater benefits.
With the limits we place on making sure we don't unroll large loops that
are going to cause register use issues, nobody has yet been able to show
that always unrolling loops is causing any harm, and it's certainly been
shown to help :)
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev