On Fri, Aug 16, 2013 at 03:36:38PM +0200, Michel Dänzer wrote: > On Don, 2013-08-15 at 13:50 -0700, Tom Stellard wrote: > > On Thu, Aug 15, 2013 at 07:50:10PM +0200, Michel Dänzer wrote: > > > On Don, 2013-08-15 at 09:16 -0700, Tom Stellard wrote: > > > > On Thu, Aug 15, 2013 at 08:22:39AM -0700, Tom Stellard wrote: > > > > > On Thu, Aug 15, 2013 at 11:55:36AM +0200, Michel Dänzer wrote: > > > > > > On Fre, 2013-08-02 at 17:58 +0200, Michel Dänzer wrote: > > > > > > > On Mit, 2013-07-31 at 08:42 -0700, Tom Stellard wrote: > > > > > > > > On Wed, Jul 31, 2013 at 01:04:01PM +0200, Michel Dänzer wrote: > > > > > > > > > > > > > > > > > > LLVM revision 187139 ('Allocate local registers in order for > > > > > > > > > optimal > > > > > > > > > coloring.') broke some derivative related piglit tests with > > > > > > > > > the radeonsi > > > > > > > > > driver. > > > > > > > > > > > > > > > > > > I'm attaching a diff between the bad and good generated code > > > > > > > > > (as printed > > > > > > > > > with RADEON_DUMP_SHADERS=1) for the glsl-derivs test. The only > > > > > > > > > difference I can see is in which registers are used in which > > > > > > > > > order. > > > > > > > > > > > > > > > > > > I wonder if we might be missing S_WAITCNT after DS_READ/WRITE > > > > > > > > > instructions in some cases, but I haven't spotted any > > > > > > > > > candidates for > > > > > > > > > that in the bad code which aren't there in the good code as > > > > > > > > > well. Can > > > > > > > > > anyone else spot something I've missed? > > > > > > > > > > > > > > > > Shouldn't we be using the S_BARRIER instruction to keep the > > > > > > > > threads in sync? > > > > > > > > > > > > > > Doesn't seem to help unfortunately, but thanks for the good > > > > > > > suggestion. > > > > > > > > > > > > I found one thing going wrong: DS_WRITE_B32 ends up using a VGPR > > > > > > register number instead of the $gds operand for encoding the GDS > > > > > > field > > > > > > (the asm output from llc even shows the VGPR name). If the VGPR > > > > > > number > > > > > > happens to be odd (i.e. to have the least significant bit set), the > > > > > > shader ends up writing to GDS instead of LDS. > > > > > > > > > > > > > > > > Ouch, that's a pretty bad bug. > > > > > > > > > > > But I have no idea why this is happening, or how to fix it. :( > > > > > > > > > > > > > > > > > > > > > > I can take a look at it. > > > > > > > > The attached patch should fix the problem, can you test? > > > > > > Thanks for finding my silly mistake. > > > > > > However, I'd like to preserve the ability to use these instructions for > > > GDS access, and the logic in SIInsertWaits::getHwCounts() only really > > > makes sense for SMRD anyway. > > > > > > How about this patch instead? It fixes the piglit regressions that > > > prompted me to start this thread. > > [...] > > > > diff --git a/lib/Target/R600/SIInsertWaits.cpp > > > b/lib/Target/R600/SIInsertWaits.cpp > > > index ba202e3..200e064 100644 > > > --- a/lib/Target/R600/SIInsertWaits.cpp > > > +++ b/lib/Target/R600/SIInsertWaits.cpp > > > @@ -134,14 +134,20 @@ Counters SIInsertWaits::getHwCounts(MachineInstr > > > &MI) { > > > // LGKM may uses larger values > > > if (TSFlags & SIInstrFlags::LGKM_CNT) { > > > > > > - MachineOperand &Op = MI.getOperand(0); > > > - if (!Op.isReg()) > > > - Op = MI.getOperand(1); > > > - assert(Op.isReg() && "First LGKM operand must be a register!"); > > > - > > > - unsigned Reg = Op.getReg(); > > > - unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize(); > > > - Result.Named.LGKM = Size > 4 ? 2 : 1; > > > + if (MI.getNumOperands() == 3) { > > > > We should add a TSFlag for SMRD like we do for MIMG and add a helper > > function isSMRD to SIInstrInfo and use it here. The number of operands > > for instructions tends to change from time to time. > > Like this? > > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Debian, X and DRI developer
> From c1d6b0f3d9cfcf2255257fdc87c748a46f799935 Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?Michel=20D=C3=A4nzer?= <michel.daen...@amd.com> > Date: Thu, 15 Aug 2013 19:43:02 +0200 > Subject: [PATCH v2] R600/SI: Fix broken encoding of DS_WRITE_B32 > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD > instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused > it to corrupt the encoding of that by clobbering the first operand with > the second one. > > Undo that damage and only apply the SMRD logic to that. > > Fixes some derivates related piglit regressions with radeonsi. > > Signed-off-by: Michel Dänzer <michel.daen...@amd.com> Reviewed-by: Tom Stellard <thomas.stell...@amd.com> > --- > > v2: Use SIInstrFlags to test for SMRD instructions. > > lib/Target/R600/SIDefines.h | 3 ++- > lib/Target/R600/SIInsertWaits.cpp | 21 +++++++++++++-------- > lib/Target/R600/SIInstrFormats.td | 3 +++ > lib/Target/R600/SIInstrInfo.cpp | 4 ++++ > lib/Target/R600/SIInstrInfo.h | 1 + > test/CodeGen/R600/local-memory.ll | 4 ++-- > 6 files changed, 25 insertions(+), 11 deletions(-) > > diff --git a/lib/Target/R600/SIDefines.h b/lib/Target/R600/SIDefines.h > index 572ed6a..f5445ad 100644 > --- a/lib/Target/R600/SIDefines.h > +++ b/lib/Target/R600/SIDefines.h > @@ -13,7 +13,8 @@ > > namespace SIInstrFlags { > enum { > - MIMG = 1 << 3 > + MIMG = 1 << 3, > + SMRD = 1 << 4 > }; > } > > diff --git a/lib/Target/R600/SIInsertWaits.cpp > b/lib/Target/R600/SIInsertWaits.cpp > index ba202e3..7e42fb7 100644 > --- a/lib/Target/R600/SIInsertWaits.cpp > +++ b/lib/Target/R600/SIInsertWaits.cpp > @@ -134,14 +134,19 @@ Counters SIInsertWaits::getHwCounts(MachineInstr &MI) { > // LGKM may uses larger values > if (TSFlags & SIInstrFlags::LGKM_CNT) { > > - MachineOperand &Op = MI.getOperand(0); > - if (!Op.isReg()) > - Op = MI.getOperand(1); > - assert(Op.isReg() && "First LGKM operand must be a register!"); > - > - unsigned Reg = Op.getReg(); > - unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize(); > - Result.Named.LGKM = Size > 4 ? 2 : 1; > + if (TII->isSMRD(MI.getOpcode())) { > + > + MachineOperand &Op = MI.getOperand(0); > + assert(Op.isReg() && "First LGKM operand must be a register!"); > + > + unsigned Reg = Op.getReg(); > + unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize(); > + Result.Named.LGKM = Size > 4 ? 2 : 1; > + > + } else { > + // DS > + Result.Named.LGKM = 1; > + } > > } else { > Result.Named.LGKM = 0; > diff --git a/lib/Target/R600/SIInstrFormats.td > b/lib/Target/R600/SIInstrFormats.td > index cd1bbcd..9576c05 100644 > --- a/lib/Target/R600/SIInstrFormats.td > +++ b/lib/Target/R600/SIInstrFormats.td > @@ -18,11 +18,13 @@ class InstSI <dag outs, dag ins, string asm, list<dag> > pattern> : > field bits<1> EXP_CNT = 0; > field bits<1> LGKM_CNT = 0; > field bits<1> MIMG = 0; > + field bits<1> SMRD = 0; > > let TSFlags{0} = VM_CNT; > let TSFlags{1} = EXP_CNT; > let TSFlags{2} = LGKM_CNT; > let TSFlags{3} = MIMG; > + let TSFlags{4} = SMRD; > } > > class Enc32 <dag outs, dag ins, string asm, list<dag> pattern> : > @@ -142,6 +144,7 @@ class SMRD <bits<5> op, bits<1> imm, dag outs, dag ins, > string asm, > let Inst{31-27} = 0x18; //encoding > > let LGKM_CNT = 1; > + let SMRD = 1; > } > > > //===----------------------------------------------------------------------===// > diff --git a/lib/Target/R600/SIInstrInfo.cpp b/lib/Target/R600/SIInstrInfo.cpp > index 9bb4ad9..2719ea2 100644 > --- a/lib/Target/R600/SIInstrInfo.cpp > +++ b/lib/Target/R600/SIInstrInfo.cpp > @@ -229,6 +229,10 @@ int SIInstrInfo::isMIMG(uint16_t Opcode) const { > return get(Opcode).TSFlags & SIInstrFlags::MIMG; > } > > +int SIInstrInfo::isSMRD(uint16_t Opcode) const { > + return get(Opcode).TSFlags & SIInstrFlags::SMRD; > +} > + > > //===----------------------------------------------------------------------===// > // Indirect addressing callbacks > > //===----------------------------------------------------------------------===// > diff --git a/lib/Target/R600/SIInstrInfo.h b/lib/Target/R600/SIInstrInfo.h > index 8d24ab4..87b8063 100644 > --- a/lib/Target/R600/SIInstrInfo.h > +++ b/lib/Target/R600/SIInstrInfo.h > @@ -48,6 +48,7 @@ public: > > virtual bool isSafeToMoveRegClassDefs(const TargetRegisterClass *RC) const; > int isMIMG(uint16_t Opcode) const; > + int isSMRD(uint16_t Opcode) const; > > virtual int getIndirectIndexBegin(const MachineFunction &MF) const; > > diff --git a/test/CodeGen/R600/local-memory.ll > b/test/CodeGen/R600/local-memory.ll > index 5458fb9..9ebb769 100644 > --- a/test/CodeGen/R600/local-memory.ll > +++ b/test/CodeGen/R600/local-memory.ll > @@ -13,7 +13,7 @@ > ; SI-CHECK-NEXT: .long 32768 > > ; EG-CHECK: LDS_WRITE > -; SI-CHECK: DS_WRITE_B32 > +; SI-CHECK: DS_WRITE_B32 0 > > ; GROUP_BARRIER must be the last instruction in a clause > ; EG-CHECK: GROUP_BARRIER > @@ -21,7 +21,7 @@ > ; SI-CHECK: S_BARRIER > > ; EG-CHECK: LDS_READ_RET > -; SI-CHECK: DS_READ_B32 > +; SI-CHECK: DS_READ_B32 {{VGPR[0-9]+}}, 0 > > define void @local_memory(i32 addrspace(1)* %out) { > entry: > -- > 1.8.4.rc2 > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev