After playing around a bit with nvdisasm, there seem to be some complications with certain sched opcodes. I think we should figure out whether this is simply nvdisasm crashing or a real hardware thing.

On Tue, Jul 17, 2018 at 9:26 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
> After some testing and looking at traces of the blob or nvcc output,
> it seems the only system value CS2R is useful for is SV_CLOCK.
>
> On Tue, Jul 17, 2018 at 1:09 PM, Karol Herbst <kher...@redhat.com> wrote:
>> that seems like a good enough improvement. I think looking onto other
>> sysvals would be worthwhile as SV_CLOCK isn't used that often. The
>> invocation ID and related ones would be interesting to look into as
>> they are much more common.
>>
>> On Tue, Jul 17, 2018 at 1:59 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
>>> I'm getting ~28 cycles for the S2R and ~6 cycles (unsurprisingly) for the
>>> CS2R.
>>>
>>> nvcc with SM30 seems to use the same instruction as the nvc0 emission code.
>>>
>>> The SV_LANE* system values don't work with CS2R and I haven't looked
>>> too deeply into the others.
>>>
>>> On Tue, Jul 17, 2018 at 12:13 PM, Karol Herbst <kher...@redhat.com> wrote:
>>>> interesting, do you have some numbers on that? Wondering if we could
>>>> switch more sysvals over to it and what about older gens?
>>>>
>>>> On Tue, Jul 17, 2018 at 12:46 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
>>>>> This instruction seems to be faster than S2R and requires no barrier,
>>>>> though the range of special registers it can read from is limited.
>>>>>
>>>>> Signed-off-by: Rhys Perry <pendingchao...@gmail.com>
>>>>> ---
>>>>>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 14 +++++++++++++-
>>>>>  .../drivers/nouveau/codegen/nv50_ir_target_gm107.cpp       |  4 +++-
>>>>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>> index 694d1b10a3..c306a4680b 100644
>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>> @@ -124,6 +124,7 @@ private:
>>>>>
>>>>>     void emitMOV();
>>>>>     void emitS2R();
>>>>> +   void emitCS2R();
>>>>>     void emitF2F();
>>>>>     void emitF2I();
>>>>>     void emitI2F();
>>>>> @@ -749,6 +750,14 @@ CodeEmitterGM107::emitS2R()
>>>>>     emitGPR (0x00, insn->def(0));
>>>>>  }
>>>>>
>>>>> +void
>>>>> +CodeEmitterGM107::emitCS2R()
>>>>> +{
>>>>> +   emitInsn(0x50c80000);
>>>>> +   emitSYS (0x14, insn->src(0));
>>>>> +   emitGPR (0x00, insn->def(0));
>>>>> +}
>>>>> +
>>>>>  void
>>>>>  CodeEmitterGM107::emitF2F()
>>>>>  {
>>>>> @@ -3192,7 +3201,10 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>>>>        emitMOV();
>>>>>        break;
>>>>>     case OP_RDSV:
>>>>> -      emitS2R();
>>>>> +      if (insn->getSrc(0)->reg.data.id == SV_CLOCK)
>>>>> +         emitCS2R();
>>>>> +      else
>>>>> +         emitS2R();
>>>>>        break;
>>>>>     case OP_ABS:
>>>>>     case OP_NEG:
>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>> index 04cbd402a1..009470fb93 100644
>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>> @@ -153,9 +153,10 @@ TargetGM107::isBarrierRequired(const Instruction *insn) const
>>>>>     case OP_AFETCH:
>>>>>     case OP_PFETCH:
>>>>>     case OP_PIXLD:
>>>>> -   case OP_RDSV:
>>>>>     case OP_SHFL:
>>>>>        return true;
>>>>> +   case OP_RDSV:
>>>>> +      return insn->getSrc(0)->reg.data.id != SV_CLOCK;
>>>>>     default:
>>>>>        break;
>>>>>     }
>>>>> @@ -229,6 +230,7 @@ TargetGM107::getLatency(const Instruction *insn) const
>>>>>     case OP_SUB:
>>>>>     case OP_VOTE:
>>>>>     case OP_XOR:
>>>>> +   case OP_RDSV:
>>>>>        if (insn->dType != TYPE_F64)
>>>>>           return 6;
>>>>>        break;
>>>>> --
>>>>> 2.14.4
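
For anyone skimming the thread, the opcode selection and barrier rule from the quoted patch can be sketched standalone as below. This is only an illustration, not the Mesa code: `SysVal`, `Opcode`, `selectRdsvOpcode` and `rdsvNeedsBarrier` are made-up names standing in for the nv50_ir types, and the sketch assumes (per the testing reported above) that SV_CLOCK is the only system value worth routing through CS2R.

```cpp
// Hypothetical stand-ins for the nv50_ir enums; these names are illustrative
// and do not exist in Mesa.
enum class SysVal { Clock, LaneId, InvocationId };
enum class Opcode { S2R, CS2R };

// Mirrors the OP_RDSV case in emitInstruction(): only SV_CLOCK is read
// with CS2R, every other system value keeps using S2R.
constexpr Opcode selectRdsvOpcode(SysVal sv)
{
   return sv == SysVal::Clock ? Opcode::CS2R : Opcode::S2R;
}

// Mirrors the isBarrierRequired() change: a read that still goes through
// S2R needs a barrier, a read that goes through CS2R does not.
constexpr bool rdsvNeedsBarrier(SysVal sv)
{
   return selectRdsvOpcode(sv) == Opcode::S2R;
}

// Compile-time checks that the two rules agree with each other.
static_assert(selectRdsvOpcode(SysVal::Clock) == Opcode::CS2R, "clock uses CS2R");
static_assert(!rdsvNeedsBarrier(SysVal::Clock), "CS2R read needs no barrier");
static_assert(rdsvNeedsBarrier(SysVal::LaneId), "S2R read needs a barrier");
```

Deriving the barrier rule from the opcode choice keeps the two in sync if more system values turn out to work with CS2R later; the patch itself checks SV_CLOCK separately in both places.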