On Mon, Jan 8, 2018 at 7:25 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
> In order to reduce moves when coalescing multiple registers into a
> larger register, RA will try to coalesce MERGE instructions with their
> definitions. For example, for something like this in GLSL:
>
> uint a = ...;
> uint b = ...;
> uint64 x = packUint2x32(a, b);
>
> The compiler will try to coalesce x with a and b, in the same way as
> something like:
>
> uint a = ...;
> uint b = ...;
> ...
> uint x = phi(a, b);
>
> with the crucial difference that the definitions of a and b only clobber
> part of the register, instead of the whole thing. This information is
> carried through the compound flag and compMask bitmask. If compound is
> set, then the value has been coalesced in such a way that not all the
> defs clobber the entire register. The compMask bitmask describes which
> subregister each def clobbers, although it does it in a slightly
> convoluted way. It's an invariant that once compound is set on one def,
> it must be set for all the defs in a given coalesced value.

A sorta related thing (ie, same problem, but implemented in a
different way in ir3_ra.c:get_definer()), dealing w/ live ranges when
different instructions write parts of a merged register, is I think
the most annoying/painful thing about ir3_ra.. the simple case is
simple, but it gets annoying quickly when you combine splits (fanout)
/ merges (fanin).  I think I do a sufficient job of preventing too
many mov's from being eliminated pre-ra to keep it all working, but
seems super fragile.  I'd be curious if you'd come across any papers
(preferably not paywalled) on how to deal with that in a better way?

BR,
-R

> In more detail, the constraints pass will first create extra moves:
>
> uint a = ...;
> uint b = ...;
> uint a' = a;
> uint b' = b;
> uint64 x = packUint2x32(a', b');
>
> and then RA will merge values involved in MERGE/SPLIT instructions,
> merging x with a' and b' and making the combined value compound -- this
> is relatively simple, and will always succeed since we just created a'
> and b', so they never interfere with x, and x has no other definitions,
> since we haven't started coalescing moves yet. Basically, we just replaced
> the MERGE instruction with an equivalent sequence of partial writes to the
> destination. The tricky part comes when we try to merge a' with a
> and b' with b. We need to transfer the compound information from a' to a
> and b' to b, which copyCompound() does, but we also need to transfer it
> to any defs coalesced with a and b, which the code failed to do. Similarly,
> if x is the argument to a phi instruction, then when we try to merge it
> with other arguments to the same phi by coalescing moves, we'd have
> problems guaranteeing that all the other merged defs stay up-to-date.
>
> One tricky part of fixing this is that in order to properly propagate
> the information from a' to a, we need to do it before the defs for a and
> a' are merged in coalesceValues(), since we need to know which defs are
> merged with a but not a' -- after coalesceValues() returns, all the defs
> have been combined, so we don't know which is which. I took the approach
> of calling copyCompound() inside coalesceValues(), instead of
> afterwards.
>
> Cc: Ilia Mirkin <imir...@alum.mit.edu>
> Cc: Karol Herbst <kher...@redhat.com>
> ---
> So, I guess curiosity got the best of me :). Of course, I have no actual
> way to test if this fixes the problem, but hopefully this at least helps
> someone get further...
>
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 56 
> ++++++++++++++--------
>  1 file changed, 36 insertions(+), 20 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index b33d7b4010..2664c0678f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -890,6 +890,34 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue 
> *lval)
>     livei.insert(lval->livei);
>  }
>
> +// Used when coalescing moves. The non-compound value will become one, e.g.:
> +// mov b32 $r0 $r2            / merge b64 $r0d { $r0 $r1 }
> +// split b64 { $r0 $r1 } $r0d / mov b64 $r0d f64 $r2d
> +static inline void copyCompound(Value *dst, Value *src)
> +{
> +   LValue *ldst = dst->asLValue();
> +   LValue *lsrc = src->asLValue();
> +
> +   if (ldst->compound && !lsrc->compound) {
> +      LValue *swap = lsrc;
> +      lsrc = ldst;
> +      ldst = swap;
> +   }
> +
> +   assert(!ldst->compound);
> +
> +   if (lsrc->compound) {
> +      Value *dstRep = ldst->join;
> +      for (Value::DefIterator d = dstRep->defs.begin(); d != 
> dstRep->defs.end();
> +           ++d) {
> +         LValue *ldst = (*d)->get()->asLValue();
> +         assert(!ldst->compound)
> +         ldst->compound = 1;
> +         ldst->compMask = lsrc->compMask;
> +      }
> +   }
> +}
> +
>  bool
>  GCRA::coalesceValues(Value *dst, Value *src, bool force)
>  {
> @@ -932,9 +960,16 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
>     if (!force && nRep->livei.overlaps(nVal->livei))
>        return false;
>
> +   // TODO: Handle this case properly.
> +   if (!force && rep->compound && val->compound)
> +      return false;
> +
>     INFO_DBG(prog->dbgFlags, REG_ALLOC, "joining %%%i($%i) <- %%%i\n",
>              rep->id, rep->reg.data.id, val->id);
>
> +   if (!force)
> +      copyCompound(dst, src);
> +
>     // set join pointer of all values joined with val
>     for (Value::DefIterator def = val->defs.begin(); def != val->defs.end();
>          ++def)
> @@ -997,24 +1032,6 @@ static inline uint8_t makeCompMask(int compSize, int 
> base, int size)
>     }
>  }
>
> -// Used when coalescing moves. The non-compound value will become one, e.g.:
> -// mov b32 $r0 $r2            / merge b64 $r0d { $r0 $r1 }
> -// split b64 { $r0 $r1 } $r0d / mov b64 $r0d f64 $r2d
> -static inline void copyCompound(Value *dst, Value *src)
> -{
> -   LValue *ldst = dst->asLValue();
> -   LValue *lsrc = src->asLValue();
> -
> -   if (ldst->compound && !lsrc->compound) {
> -      LValue *swap = lsrc;
> -      lsrc = ldst;
> -      ldst = swap;
> -   }
> -
> -   ldst->compound = lsrc->compound;
> -   ldst->compMask = lsrc->compMask;
> -}
> -
>  void
>  GCRA::makeCompound(Instruction *insn, bool split)
>  {
> @@ -1100,8 +1117,7 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
>              break;
>           i = insn->getSrc(0)->getUniqueInsn();
>           if (i && !i->constrainedDefs()) {
> -            if (coalesceValues(insn->getDef(0), insn->getSrc(0), false))
> -               copyCompound(insn->getSrc(0), insn->getDef(0));
> +            coalesceValues(insn->getDef(0), insn->getSrc(0), false);
>           }
>           break;
>        case OP_TEX:
> --
> 2.14.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to