On 04/20/2016 07:49 PM, Ilia Mirkin wrote:
The SM20/SM30 logic does this for exch || cas. Are you *sure* this
shouldn't be the same? It's pretty silly to do a CAS without a dst,
but it's definitely possible (through DCE).
Yeah, I'm sure. The logic is definitely not the same between gk104 and gk110.
On gk104, there are separate opcodes for EXCH and CAS, plus one for all
other atomic operations (ADD, MIN, etc.). This logic is only valid
for EXCH and CAS there.
On gk110, there are only two opcodes: one for CAS, and one shared by
EXCH and the other atomic operations (ADD, MIN, etc.). On gk110, *only*
EXCH is invalid without a dst.
For your information, it is possible to do a CAS without a dst on GK110,
as you can see below: :-)
.headerflags @"EF_CUDA_SM35 EF_CUDA_PTX_SM(EF_CUDA_SM35)"
/*0000*/ @P0 ATOM.CAS RZ, [R0], R0, R1; /* 0x77800000000003fe */
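As a reading aid, the GK110 destination rule described above can be sketched as a small standalone function. This is only an illustration under assumptions, not the Mesa code: `AtomOp`, `kRegZero` and `atomDstField` are hypothetical names, and the field position (bit 2 of the low word) is taken from the `defId(i->def(0), 2)` and `255 << 2` lines in the patch. The low bits 0x3fe of the encoding shown above are consistent with 0x2 | (255 << 2).

```cpp
#include <cstdint>

// Hypothetical stand-ins for the NV50_IR_SUBOP_ATOM_* values (illustration only).
enum class AtomOp { Add, Min, Exch, Cas };

// RZ, the zero register, is assumed to be register id 255 as in the patch.
constexpr uint32_t kRegZero = 255;

// Returns the bits to OR into the low instruction word for the dst field:
//  - a real destination: its register id at bit 2
//  - no destination, and not EXCH (CAS, ADD, MIN, ...): RZ at bit 2
//  - no destination, EXCH: the field is left untouched (0 here)
constexpr uint32_t atomDstField(AtomOp op, bool hasDst, uint32_t dstId)
{
    return hasDst ? (dstId << 2)
         : (op != AtomOp::Exch) ? (kRegZero << 2)
         : 0u;
}
```

For example, `0x00000002 | atomDstField(AtomOp::Cas, false, 0)` gives 0x3fe, the low bits of the ATOM.CAS RZ encoding above.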
On Wed, Apr 20, 2016 at 1:47 PM, Samuel Pitoiset
<samuel.pitoi...@gmail.com> wrote:
This is only valid for other atomic operations (including CAS). This
fixes an invalid opcode error from dmesg. While we are at it, make sure
to initialize global addr to 0 for other atomic operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com>
Cc: "11.1 11.2" <mesa-sta...@lists.freedesktop.org>
---
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index 70f3c3f..e2c3b8e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -1808,6 +1808,9 @@ uses64bitAddress(const Instruction *ldst)
void
CodeEmitterGK110::emitATOM(const Instruction *i)
{
+ const bool hasDst = i->defExists(0);
+ const bool exch = i->subOp == NV50_IR_SUBOP_ATOM_EXCH;
+
code[0] = 0x00000002;
if (i->subOp == NV50_IR_SUBOP_ATOM_CAS)
code[1] = 0x77800000;
@@ -1836,15 +1839,21 @@ CodeEmitterGK110::emitATOM(const Instruction *i)
/* TODO: cas: flip bits if $r255 is used */
srcId(i->src(1), 23);
- if (i->defExists(0))
+ if (hasDst) {
defId(i->def(0), 2);
- else
+ } else
+ if (!exch) {
code[0] |= 255 << 2;
+ }
- const int32_t offset = SDATA(i->src(0)).offset;
- assert(offset < 0x80000 && offset >= -0x80000);
- code[0] |= (offset & 1) << 31;
- code[1] |= (offset & 0xffffe) >> 1;
+ if (hasDst || !exch) {
+ const int32_t offset = SDATA(i->src(0)).offset;
+ assert(offset < 0x80000 && offset >= -0x80000);
+ code[0] |= (offset & 1) << 31;
+ code[1] |= (offset & 0xffffe) >> 1;
+ } else {
+ srcAddr32(i->src(0), 31);
+ }
if (i->getIndirect(0, 0)) {
srcId(i->getIndirect(0, 0), 10);
--
2.8.0
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev