On Mon, Jan 8, 2018 at 7:25 PM, Connor Abbott <cwabbo...@gmail.com> wrote: > In order to reduce moves when coalescing multiple registers into a > larger register, RA will try to coalesce MERGE instructions with their > definitions. For example, for something like this in GLSL: > > uint a = ...; > uint b = ...; > uint64 x = packUint2x32(a, b); > > The compiler will try to coalesce x with a and b, in the same way as > something like: > > uint a = ...; > uint b = ...; > ... > uint x = phi(a, b); > > with the crucial difference that the definitions of a and b only clobber > part of the register, instead of the whole thing. This information is > carried through the compound flag and compMask bitmask. If compound is > set, then the value has been coalesced in such a way that not all the > defs clobber the entire register. The compMask bitmask describes which > subregister each def clobbers, although it does it in a slightly > convoluted way. It's an invariant that once compound is set on one def, > it must be set for all the defs in a given coalesced value.
A sorta related thing (ie, same problem, but implemented in a different way in ir3_ra.c:get_definer()), dealing w/ live ranges when different instructions write parts of a merged register, is I think the most annoying/painful thing about ir3_ra.. the simple case is simple, but it gets annoying quickly when you combine splits (fanout) / merges (fanin). I think I do a sufficient job of preventing too many mov's from being eliminated pre-ra to keep it all working, but seems super fragile. I'd be curious if you'd come across any papers (preferably not paywalled) on how to deal with that in a better way? BR, -R > In more detail, the constraints pass will first create extra moves: > > uint a = ...; > uint b = ...; > uint a' = a; > uint b' = b; > uint64 x = packUint2x32(a', b'); > > and then RA will merge values involved in MERGE/SPLIT instructions, > merging x with a' and b' and making the combined value compound -- this > is relatively simple, and will always succeed since we just created a' > and b', so they never interfere with x, and x has no other definitions, > since we haven't started coalescing moves yet. Basically, we just replaced > the MERGE instruction with an equivalent sequence of partial writes to the > destination. The tricky part comes when we try to merge a' with a > and b' with b. We need to transfer the compound information from a' to a > and b' to b, which copyCompound() does, but we also need to transfer it > to any defs coalesced with a and b, which the code failed to do. Similarly, > if x is the argument to a phi instruction, then when we try to merge it > with other arguments to the same phi by coalescing moves, we'd have > problems guaranteeing that all the other merged defs stay up-to-date. > > One tricky part of fixing this is that in order to properly propagate > the information from a' to a, we need to do it before the defs for a and > a' are merged in coalesceValues(), since we need to know which defs are > merged with a but not a' -- after coalesceValues() returns, all the defs > have been combined, so we don't know which is which. I took the approach > of calling copyCompound() inside coalesceValues(), instead of > afterwards. > > Cc: Ilia Mirkin <imir...@alum.mit.edu> > Cc: Karol Herbst <kher...@redhat.com> > --- > So, I guess curiosity got the best of me :). Of course, I have no actual > way to test if this fixes the problem, but hopefully this at least helps > someone get further... > > src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 56 > ++++++++++++++-------- > 1 file changed, 36 insertions(+), 20 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp > index b33d7b4010..2664c0678f 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp > @@ -890,6 +890,34 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue > *lval) > livei.insert(lval->livei); > } > > +// Used when coalescing moves. The non-compound value will become one, e.g.: > +// mov b32 $r0 $r2 / merge b64 $r0d { $r0 $r1 } > +// split b64 { $r0 $r1 } $r0d / mov b64 $r0d f64 $r2d > +static inline void copyCompound(Value *dst, Value *src) > +{ > + LValue *ldst = dst->asLValue(); > + LValue *lsrc = src->asLValue(); > + > + if (ldst->compound && !lsrc->compound) { > + LValue *swap = lsrc; > + lsrc = ldst; > + ldst = swap; > + } > + > + assert(!ldst->compound); > + > + if (lsrc->compound) { > + Value *dstRep = ldst->join; > + for (Value::DefIterator d = dstRep->defs.begin(); d != > dstRep->defs.end(); > + ++d) { > + LValue *ldst = (*d)->get()->asLValue(); > + assert(!ldst->compound) > + ldst->compound = 1; > + ldst->compMask = lsrc->compMask; > + } > + } > +} > + > bool > GCRA::coalesceValues(Value *dst, Value *src, bool force) > { > @@ -932,9 +960,16 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force) > if (!force && nRep->livei.overlaps(nVal->livei)) > return false; > > + // TODO: Handle this case properly. > + if (!force && rep->compound && val->compound) > + return false; > + > INFO_DBG(prog->dbgFlags, REG_ALLOC, "joining %%%i($%i) <- %%%i\n", > rep->id, rep->reg.data.id, val->id); > > + if (!force) > + copyCompound(dst, src); > + > // set join pointer of all values joined with val > for (Value::DefIterator def = val->defs.begin(); def != val->defs.end(); > ++def) > @@ -997,24 +1032,6 @@ static inline uint8_t makeCompMask(int compSize, int > base, int size) > } > } > > -// Used when coalescing moves. The non-compound value will become one, e.g.: > -// mov b32 $r0 $r2 / merge b64 $r0d { $r0 $r1 } > -// split b64 { $r0 $r1 } $r0d / mov b64 $r0d f64 $r2d > -static inline void copyCompound(Value *dst, Value *src) > -{ > - LValue *ldst = dst->asLValue(); > - LValue *lsrc = src->asLValue(); > - > - if (ldst->compound && !lsrc->compound) { > - LValue *swap = lsrc; > - lsrc = ldst; > - ldst = swap; > - } > - > - ldst->compound = lsrc->compound; > - ldst->compMask = lsrc->compMask; > -} > - > void > GCRA::makeCompound(Instruction *insn, bool split) > { > @@ -1100,8 +1117,7 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask) > break; > i = insn->getSrc(0)->getUniqueInsn(); > if (i && !i->constrainedDefs()) { > - if (coalesceValues(insn->getDef(0), insn->getSrc(0), false)) > - copyCompound(insn->getSrc(0), insn->getDef(0)); > + coalesceValues(insn->getDef(0), insn->getSrc(0), false); > } > break; > case OP_TEX: > -- > 2.14.3 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev