so yeah, I am a bit blind today as this was already handled in the patch. Moving the SvSemantic check inside something like isCS2RSV(SvSemantic) might make it simpler for future system values to be used with cs2r, but that is not really required right now. In either case:

Reviewed-by: Karol Herbst <kher...@redhat.com>
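
To make the isCS2RSV() suggestion concrete, here is a rough, standalone sketch of what such a helper might look like. It is not part of the patch: the enum below is a cut-down, hypothetical stand-in for the real system-value enum in nv50_ir (with made-up numbering), and both isCS2RSV() and rdsvNeedsBarrier() are illustrative names only.

```cpp
#include <cassert>

// Hypothetical stand-in for the nv50_ir system-value enum. Only a few
// values are listed and the numbering is not the real one.
enum SVSemantic {
   SV_LANEID,
   SV_INVOCATION_ID,
   SV_CLOCK
};

// Hypothetical helper from the review comment: returns true if the system
// value can be read with CS2R. Per the testing reported in this thread,
// only SV_CLOCK qualifies so far; new values would get a case here.
static inline bool
isCS2RSV(SVSemantic sv)
{
   switch (sv) {
   case SV_CLOCK:
      return true;
   default:
      return false;
   }
}

// Illustrative only: mirrors the OP_RDSV case the patch adds to
// isBarrierRequired(). S2R reads need a barrier, CS2R reads do not.
static inline bool
rdsvNeedsBarrier(SVSemantic sv)
{
   return !isCS2RSV(sv);
}
```

With a helper along these lines, the emitter and isBarrierRequired() would both ask the same question instead of each comparing against SV_CLOCK directly, so adding another CS2R-capable system value later would be a one-line change.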

On Thu, Jul 19, 2018 at 5:28 PM, Karol Herbst <kher...@redhat.com> wrote:
> playing around a bit with nvdisasm, there seem to be some
> complications with certain sched opcodes. I think we should figure out
> if this is simply nvdisasm crashing or if this is a real hardware
> thing.
>
> On Tue, Jul 17, 2018 at 9:26 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
>> After some testing and looking at traces of the blob or nvcc output,
>> it seems the only system value CS2R is useful for is SV_CLOCK.
>>
>> On Tue, Jul 17, 2018 at 1:09 PM, Karol Herbst <kher...@redhat.com> wrote:
>>> that seems like a good enough improvement. I think looking into other
>>> sysvals would be worthwhile as SV_CLOCK isn't used that often. The
>>> invocation ID and related ones would be interesting to look into as
>>> they are much more common.
>>>
>>> On Tue, Jul 17, 2018 at 1:59 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
>>>> I'm getting ~28 cycles for the S2R and ~6 cycles (unsurprisingly) for the
>>>> CS2R.
>>>>
>>>> nvcc with SM30 seems to use the same instruction as the nvc0 emission code.
>>>>
>>>> The SV_LANE* system values don't work with CS2R and I haven't looked
>>>> too deeply into the others.
>>>>
>>>> On Tue, Jul 17, 2018 at 12:13 PM, Karol Herbst <kher...@redhat.com> wrote:
>>>>> interesting, do you have some numbers on that? Wondering if we could
>>>>> switch more sysvals over to it and what about older gens?
>>>>>
>>>>> On Tue, Jul 17, 2018 at 12:46 PM, Rhys Perry <pendingchao...@gmail.com> wrote:
>>>>>> This instruction seems to be faster than S2R and requires no barrier,
>>>>>> though the range of special registers it can read from is limited.
>>>>>>
>>>>>> Signed-off-by: Rhys Perry <pendingchao...@gmail.com>
>>>>>> ---
>>>>>>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 14 +++++++++++++-
>>>>>>  .../drivers/nouveau/codegen/nv50_ir_target_gm107.cpp | 4 +++-
>>>>>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>>> index 694d1b10a3..c306a4680b 100644
>>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>>>>> @@ -124,6 +124,7 @@ private:
>>>>>>
>>>>>>     void emitMOV();
>>>>>>     void emitS2R();
>>>>>> +   void emitCS2R();
>>>>>>     void emitF2F();
>>>>>>     void emitF2I();
>>>>>>     void emitI2F();
>>>>>> @@ -749,6 +750,14 @@ CodeEmitterGM107::emitS2R()
>>>>>>     emitGPR (0x00, insn->def(0));
>>>>>>  }
>>>>>>
>>>>>> +void
>>>>>> +CodeEmitterGM107::emitCS2R()
>>>>>> +{
>>>>>> +   emitInsn(0x50c80000);
>>>>>> +   emitSYS (0x14, insn->src(0));
>>>>>> +   emitGPR (0x00, insn->def(0));
>>>>>> +}
>>>>>> +
>>>>>>  void
>>>>>>  CodeEmitterGM107::emitF2F()
>>>>>>  {
>>>>>> @@ -3192,7 +3201,10 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>>>>>        emitMOV();
>>>>>>        break;
>>>>>>     case OP_RDSV:
>>>>>> -      emitS2R();
>>>>>> +      if (insn->getSrc(0)->reg.data.id == SV_CLOCK)
>>>>>> +         emitCS2R();
>>>>>> +      else
>>>>>> +         emitS2R();
>>>>>>        break;
>>>>>>     case OP_ABS:
>>>>>>     case OP_NEG:
>>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>>> index 04cbd402a1..009470fb93 100644
>>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
>>>>>> @@ -153,9 +153,10 @@ TargetGM107::isBarrierRequired(const Instruction *insn) const
>>>>>>        case OP_AFETCH:
>>>>>>        case OP_PFETCH:
>>>>>>        case OP_PIXLD:
>>>>>> -      case OP_RDSV:
>>>>>>        case OP_SHFL:
>>>>>>           return true;
>>>>>> +      case OP_RDSV:
>>>>>> +         return insn->getSrc(0)->reg.data.id != SV_CLOCK;
>>>>>>        default:
>>>>>>           break;
>>>>>>        }
>>>>>> @@ -229,6 +230,7 @@ TargetGM107::getLatency(const Instruction *insn) const
>>>>>>     case OP_SUB:
>>>>>>     case OP_VOTE:
>>>>>>     case OP_XOR:
>>>>>> +   case OP_RDSV:
>>>>>>        if (insn->dType != TYPE_F64)
>>>>>>           return 6;
>>>>>>        break;
>>>>>> --
>>>>>> 2.14.4
>>>>>>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev