On Tue, Feb 12, 2013 at 10:24:24AM +0100, Michel Dänzer wrote: > On Mon, 2013-02-11 at 19:39 +0100, Vincent Lejeune wrote: > > --- > > lib/Target/R600/AMDILISelDAGToDAG.cpp | 24 ++++++++++++++++++++++++ > > 1 file changed, 24 insertions(+) > > > > diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp > > b/lib/Target/R600/AMDILISelDAGToDAG.cpp > > index b125ba8..2f34fe3 100644 > > --- a/lib/Target/R600/AMDILISelDAGToDAG.cpp > > +++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp > > @@ -160,6 +160,30 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) { > > } > > switch (Opc) { > > default: break; > > + case ISD::BUILD_VECTOR: { > > + // BUILD_VECTOR is usually lowered into an IMPLICIT_DEF + 4 > > INSERT_SUBREG > > + // that adds a 128 bits reg copy when going through > > TwoAddressInstructions > > + // pass. We want to avoid 128 bits copies as much as possible because > > they > > + // can't be bundled by our scheduler. > > + SDValue RegSeqArgs[9] = { > > + CurDAG->getTargetConstant(AMDGPU::R600_Reg128RegClassID, MVT::i32), > > + SDValue(), CurDAG->getTargetConstant(AMDGPU::sub0, MVT::i32), > > + SDValue(), CurDAG->getTargetConstant(AMDGPU::sub1, MVT::i32), > > + SDValue(), CurDAG->getTargetConstant(AMDGPU::sub2, MVT::i32), > > + SDValue(), CurDAG->getTargetConstant(AMDGPU::sub3, MVT::i32) > > + }; > > + bool IsRegSeq = true; > > + for (unsigned i = 0; i < N->getNumOperands(); i++) { > > + if (dyn_cast<RegisterSDNode>(N->getOperand(i))) { > > + IsRegSeq = false; > > + break; > > + } > > + RegSeqArgs[2 * i + 1] = N->getOperand(i); > > + } > > + if (!IsRegSeq) > > + break; > > + return CurDAG->SelectNodeTo(N, AMDGPU::REG_SEQUENCE, N->getVTList(), > > RegSeqArgs, 2 * N->getNumOperands() + 1); > > + } > > This breaks radeonsi for almost anything non-trivial with > > /.../lib/Target/R600/SIInstrInfo.cpp:77: virtual void > llvm::SIInstrInfo::copyPhysReg(llvm::MachineBasicBlock&, > llvm::MachineBasicBlock::iterator, llvm::DebugLoc, unsigned int, unsigned > int, bool) const: Assertion `AMDGPU::SReg_32RegClass.contains(DestReg)' > failed. > > An example instruction is > > %T0_X<def> = COPY %VGPR3<kill>, %T0_XYZW<imp-def> > > where %T0_* makes no sense for SI. > > > Looking at the code, I see two main problems: The SI code uses vectors > of 1, 2, 4, 8 and in the future potentially 16 components for passing > texture sampling addressing parameters, and > AMDGPU::R600_Reg128RegClassID is non-SI specific. >
This code should be disabled on SI, the same way the propagation of modifiers is disabled. > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Debian, X and DRI developer > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev