On Don, 2013-08-15 at 13:50 -0700, Tom Stellard wrote:
> On Thu, Aug 15, 2013 at 07:50:10PM +0200, Michel Dänzer wrote:
> > On Don, 2013-08-15 at 09:16 -0700, Tom Stellard wrote:
> > > On Thu, Aug 15, 2013 at 08:22:39AM -0700, Tom Stellard wrote:
> > > > On Thu, Aug 15, 2013 at 11:55:36AM +0200, Michel Dänzer wrote:
> > > > > On Fre, 2013-08-02 at 17:58 +0200, Michel Dänzer wrote:
> > > > > > On Mit, 2013-07-31 at 08:42 -0700, Tom Stellard wrote:
> > > > > > > On Wed, Jul 31, 2013 at 01:04:01PM +0200, Michel Dänzer wrote:
> > > > > > > > 
> > > > > > > > LLVM revision 187139 ('Allocate local registers in order for 
> > > > > > > > optimal
> > > > > > > > coloring.') broke some derivative related piglit tests with the 
> > > > > > > > radeonsi
> > > > > > > > driver. 
> > > > > > > > 
> > > > > > > > I'm attaching a diff between the bad and good generated code 
> > > > > > > > (as printed
> > > > > > > > with RADEON_DUMP_SHADERS=1) for the glsl-derivs test. The only
> > > > > > > > difference I can see is in which registers are used in which 
> > > > > > > > order.
> > > > > > > > 
> > > > > > > > I wonder if we might be missing S_WAITCNT after DS_READ/WRITE
> > > > > > > > instructions in some cases, but I haven't spotted any 
> > > > > > > > candidates for
> > > > > > > > that in the bad code which aren't there in the good code as 
> > > > > > > > well. Can
> > > > > > > > anyone else spot something I've missed?
> > > > > > > 
> > > > > > > Shouldn't we be using the S_BARRIER instruction to keep the 
> > > > > > > threads in sync?
> > > > > > 
> > > > > > Doesn't seem to help unfortunately, but thanks for the good 
> > > > > > suggestion.
> > > > > 
> > > > > I found one thing going wrong: DS_WRITE_B32 ends up using a VGPR
> > > > > register number instead of the $gds operand for encoding the GDS field
> > > > > (the asm output from llc even shows the VGPR name). If the VGPR number
> > > > > happens to be odd (i.e. to have the least significant bit set), the
> > > > > shader ends up writing to GDS instead of LDS.
> > > > >
> > > > 
> > > > Ouch, that's a pretty bad bug.
> > > > 
> > > > > But I have no idea why this is happening, or how to fix it. :(
> > > > > 
> > > > > 
> > > > 
> > > > I can take a look at it.
> > > 
> > > The attached patch should fix the problem, can you test?
> > 
> > Thanks for finding my silly mistake.
> > 
> > However, I'd like to preserve the ability to use these instructions for
> > GDS access, and the logic in SIInsertWaits::getHwCounts() only really
> > makes sense for SMRD anyway.
> > 
> > How about this patch instead? It fixes the piglit regressions that
> > prompted me to start this thread.

[...]

> > diff --git a/lib/Target/R600/SIInsertWaits.cpp 
> > b/lib/Target/R600/SIInsertWaits.cpp
> > index ba202e3..200e064 100644
> > --- a/lib/Target/R600/SIInsertWaits.cpp
> > +++ b/lib/Target/R600/SIInsertWaits.cpp
> > @@ -134,14 +134,20 @@ Counters SIInsertWaits::getHwCounts(MachineInstr &MI) 
> > {
> >    // LGKM may uses larger values
> >    if (TSFlags & SIInstrFlags::LGKM_CNT) {
> >  
> > -    MachineOperand &Op = MI.getOperand(0);
> > -    if (!Op.isReg())
> > -      Op = MI.getOperand(1);
> > -    assert(Op.isReg() && "First LGKM operand must be a register!");
> > -
> > -    unsigned Reg = Op.getReg();
> > -    unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize();
> > -    Result.Named.LGKM = Size > 4 ? 2 : 1;
> > +    if (MI.getNumOperands() == 3) {
> 
> We should add a TSFlag for SMRD like we do for MIMG and add a helper
> function isSMRD to SIInstrInfo and use it here.  The number of operands
> for instructions tends to change from time to time.

Like this?


-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
>From c1d6b0f3d9cfcf2255257fdc87c748a46f799935 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= <michel.daen...@amd.com>
Date: Thu, 15 Aug 2013 19:43:02 +0200
Subject: [PATCH v2] R600/SI: Fix broken encoding of DS_WRITE_B32
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
it to corrupt the encoding of that by clobbering the first operand with
the second one.

Undo that damage and only apply the SMRD logic to that.

Fixes some derivates related piglit regressions with radeonsi.

Signed-off-by: Michel Dänzer <michel.daen...@amd.com>
---

v2: Use SIInstrFlags to test for SMRD instructions.

 lib/Target/R600/SIDefines.h       |  3 ++-
 lib/Target/R600/SIInsertWaits.cpp | 21 +++++++++++++--------
 lib/Target/R600/SIInstrFormats.td |  3 +++
 lib/Target/R600/SIInstrInfo.cpp   |  4 ++++
 lib/Target/R600/SIInstrInfo.h     |  1 +
 test/CodeGen/R600/local-memory.ll |  4 ++--
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/lib/Target/R600/SIDefines.h b/lib/Target/R600/SIDefines.h
index 572ed6a..f5445ad 100644
--- a/lib/Target/R600/SIDefines.h
+++ b/lib/Target/R600/SIDefines.h
@@ -13,7 +13,8 @@
 
 namespace SIInstrFlags {
 enum {
-  MIMG = 1 << 3
+  MIMG = 1 << 3,
+  SMRD = 1 << 4
 };
 }
 
diff --git a/lib/Target/R600/SIInsertWaits.cpp b/lib/Target/R600/SIInsertWaits.cpp
index ba202e3..7e42fb7 100644
--- a/lib/Target/R600/SIInsertWaits.cpp
+++ b/lib/Target/R600/SIInsertWaits.cpp
@@ -134,14 +134,19 @@ Counters SIInsertWaits::getHwCounts(MachineInstr &MI) {
   // LGKM may uses larger values
   if (TSFlags & SIInstrFlags::LGKM_CNT) {
 
-    MachineOperand &Op = MI.getOperand(0);
-    if (!Op.isReg())
-      Op = MI.getOperand(1);
-    assert(Op.isReg() && "First LGKM operand must be a register!");
-
-    unsigned Reg = Op.getReg();
-    unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize();
-    Result.Named.LGKM = Size > 4 ? 2 : 1;
+    if (TII->isSMRD(MI.getOpcode())) {
+
+      MachineOperand &Op = MI.getOperand(0);
+      assert(Op.isReg() && "First LGKM operand must be a register!");
+
+      unsigned Reg = Op.getReg();
+      unsigned Size = TRI->getMinimalPhysRegClass(Reg)->getSize();
+      Result.Named.LGKM = Size > 4 ? 2 : 1;
+
+    } else {
+      // DS
+      Result.Named.LGKM = 1;
+    }
 
   } else {
     Result.Named.LGKM = 0;
diff --git a/lib/Target/R600/SIInstrFormats.td b/lib/Target/R600/SIInstrFormats.td
index cd1bbcd..9576c05 100644
--- a/lib/Target/R600/SIInstrFormats.td
+++ b/lib/Target/R600/SIInstrFormats.td
@@ -18,11 +18,13 @@ class InstSI <dag outs, dag ins, string asm, list<dag> pattern> :
   field bits<1> EXP_CNT = 0;
   field bits<1> LGKM_CNT = 0;
   field bits<1> MIMG = 0;
+  field bits<1> SMRD = 0;
 
   let TSFlags{0} = VM_CNT;
   let TSFlags{1} = EXP_CNT;
   let TSFlags{2} = LGKM_CNT;
   let TSFlags{3} = MIMG;
+  let TSFlags{4} = SMRD;
 }
 
 class Enc32 <dag outs, dag ins, string asm, list<dag> pattern> :
@@ -142,6 +144,7 @@ class SMRD <bits<5> op, bits<1> imm, dag outs, dag ins, string asm,
   let Inst{31-27} = 0x18; //encoding
 
   let LGKM_CNT = 1;
+  let SMRD = 1;
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/lib/Target/R600/SIInstrInfo.cpp b/lib/Target/R600/SIInstrInfo.cpp
index 9bb4ad9..2719ea2 100644
--- a/lib/Target/R600/SIInstrInfo.cpp
+++ b/lib/Target/R600/SIInstrInfo.cpp
@@ -229,6 +229,10 @@ int SIInstrInfo::isMIMG(uint16_t Opcode) const {
   return get(Opcode).TSFlags & SIInstrFlags::MIMG;
 }
 
+int SIInstrInfo::isSMRD(uint16_t Opcode) const {
+  return get(Opcode).TSFlags & SIInstrFlags::SMRD;
+}
+
 //===----------------------------------------------------------------------===//
 // Indirect addressing callbacks
 //===----------------------------------------------------------------------===//
diff --git a/lib/Target/R600/SIInstrInfo.h b/lib/Target/R600/SIInstrInfo.h
index 8d24ab4..87b8063 100644
--- a/lib/Target/R600/SIInstrInfo.h
+++ b/lib/Target/R600/SIInstrInfo.h
@@ -48,6 +48,7 @@ public:
 
   virtual bool isSafeToMoveRegClassDefs(const TargetRegisterClass *RC) const;
   int isMIMG(uint16_t Opcode) const;
+  int isSMRD(uint16_t Opcode) const;
 
   virtual int getIndirectIndexBegin(const MachineFunction &MF) const;
 
diff --git a/test/CodeGen/R600/local-memory.ll b/test/CodeGen/R600/local-memory.ll
index 5458fb9..9ebb769 100644
--- a/test/CodeGen/R600/local-memory.ll
+++ b/test/CodeGen/R600/local-memory.ll
@@ -13,7 +13,7 @@
 ; SI-CHECK-NEXT: .long 32768
 
 ; EG-CHECK: LDS_WRITE
-; SI-CHECK: DS_WRITE_B32
+; SI-CHECK: DS_WRITE_B32 0
 
 ; GROUP_BARRIER must be the last instruction in a clause
 ; EG-CHECK: GROUP_BARRIER
@@ -21,7 +21,7 @@
 ; SI-CHECK: S_BARRIER
 
 ; EG-CHECK: LDS_READ_RET
-; SI-CHECK: DS_READ_B32
+; SI-CHECK: DS_READ_B32 {{VGPR[0-9]+}}, 0
 
 define void @local_memory(i32 addrspace(1)* %out) {
 entry:
-- 
1.8.4.rc2

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to