If the TGSI writemask is .xzw and the initial dmask is 0xf, the first application of the function sees 4 VGPRs and correctly sets the dmask to 13 (xzw) because the second VGPR is unused, but the second application of the function now sees 3 VGPRs, ignores the fact that dmask is 13 meaning xzw, and sets the writemask to 7 meaning xyz, which breaks the texture instruction. I thought it was obvious from the comments I added.
Marek On Tue, Sep 24, 2013 at 9:00 AM, Christian König <deathsim...@vodafone.de> wrote: > Why should that be necessary? > > Christian. > > Am 23.09.2013 21:09, schrieb mar...@gmail.com: > >> From: Marek Olšák <marek.ol...@amd.com> >> >> This fixes piglit: >> - shaders/glsl-fs-texture2d-masked >> - shaders/glsl-fs-texture2d-masked-4 >> >> Signed-off-by: Marek Olšák <marek.ol...@amd.com> >> --- >> lib/Target/R600/SIISelLowering.cpp | 27 +++++++++++++++++++++------ >> 1 file changed, 21 insertions(+), 6 deletions(-) >> >> diff --git a/lib/Target/R600/SIISelLowering.cpp >> b/lib/Target/R600/SIISelLowering.cpp >> index 2174753..891a51b 100644 >> --- a/lib/Target/R600/SIISelLowering.cpp >> +++ b/lib/Target/R600/SIISelLowering.cpp >> @@ -1065,7 +1065,9 @@ static unsigned SubIdx2Lane(unsigned Idx) { >> void SITargetLowering::adjustWritemask(MachineSDNode *&Node, >> SelectionDAG &DAG) const { >> SDNode *Users[4] = { }; >> - unsigned Writemask = 0, Lane = 0; >> + unsigned Lane = 0; >> + unsigned OldDmask = Node->getConstantOperandVal(0); >> + unsigned NewDmask = 0; >> // Try to figure out the used register components >> for (SDNode::use_iterator I = Node->use_begin(), E = Node->use_end(); >> @@ -1076,29 +1078,42 @@ void >> SITargetLowering::adjustWritemask(MachineSDNode *&Node, >> I->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG) >> return; >> + /* Lane means which subreg of %VGPRa_VGPRb_VGPRc_VGPRd is used. >> + * Note that subregs are packed, i.e. Lane==0 is the first bit set >> + * in OldDmask, so it can be any of X,Y,Z,W; Lane==1 is the second >> bit >> + * set, etc. */ >> Lane = SubIdx2Lane(I->getConstantOperandVal(1)); >> + // Set which texture component corresponds to the lane. >> + unsigned Comp; >> + for (unsigned i = 0, Dmask = OldDmask; i <= Lane; i++) { >> + assert(Dmask); >> + Comp = ffs(Dmask)-1; >> + Dmask &= ~(1 << Comp); >> + } >> + >> // Abort if we have more than one user per component >> if (Users[Lane]) >> return; >> Users[Lane] = *I; >> - Writemask |= 1 << Lane; >> + NewDmask |= 1 << Comp; >> } >> - // Abort if all components are used >> - if (Writemask == 0xf) >> + // Abort if there's no change >> + if (NewDmask == OldDmask) >> return; >> // Adjust the writemask in the node >> std::vector<SDValue> Ops; >> - Ops.push_back(DAG.getTargetConstant(Writemask, MVT::i32)); >> + Ops.push_back(DAG.getTargetConstant(NewDmask, MVT::i32)); >> for (unsigned i = 1, e = Node->getNumOperands(); i != e; ++i) >> Ops.push_back(Node->getOperand(i)); >> Node = (MachineSDNode*)DAG.UpdateNodeOperands(Node, Ops.data(), >> Ops.size()); >> // If we only got one lane, replace it with a copy >> - if (Writemask == (1U << Lane)) { >> + // (if NewDmask has only one bit set...) >> + if (NewDmask && (NewDmask & (NewDmask-1)) == 0) { >> SDValue RC = DAG.getTargetConstant(AMDGPU::VReg_32RegClassID, >> MVT::i32); >> SDNode *Copy = DAG.getMachineNode(TargetOpcode::COPY_TO_REGCLASS, >> SDLoc(), >> Users[Lane]->getValueType(0), > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev