I've updated the comments a bit and pushed to master. Thanks for all your debugging!
On Wed, Sep 19, 2018 at 11:21 AM Sergii Romantsov <sergii.romant...@gmail.com> wrote:

> On Skylake enabling of ForceThreadDispatchEnable causes gpu-hang.
>
> -v2: enabling of ForceThreadDispatchEnable is only for gen8, for
>      gen9 and higher reverted enabling of PixelShaderHasUAV.
>
> CC: Jason Ekstrand <jason.ekstr...@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941
> Fixes: 79270d2140ec (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV)
> Signed-off-by: Sergii Romantsov <sergii.romant...@globallogic.com>
> ---
>  src/intel/vulkan/genX_pipeline.c | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
> index 9595a71..b469270 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1445,7 +1445,7 @@ emit_3dstate_wm(struct anv_pipeline *pipeline, struct anv_subpass *subpass,
>        wm.EarlyDepthStencilControl = EDSC_NORMAL;
>     }
>
> -#if GEN_GEN >= 8
> +#if GEN_GEN == 8
>     /* Gen8 hardware tries to compute ThreadDispatchEnable for us but
>      * doesn't take into account KillPixels when no depth or stencil
>      * writes are enabled. In order for occlusion queries to work
> @@ -1663,6 +1663,37 @@ emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
>                          wm_prog_data->uses_kill;
>
>  #if GEN_GEN >= 9
> +   /* The stricter cross-primitive coherency guarantees that the hardware
> +    * gives us with the "Accesses UAV" bit set for at least one shader stage
> +    * and the "UAV coherency required" bit set on the 3DPRIMITIVE command are
> +    * redundant within the current image, atomic counter and SSBO GL APIs,
> +    * which all have very loose ordering and coherency requirements and
> +    * generally rely on the application to insert explicit barriers when a
> +    * shader invocation is expected to see the memory writes performed by the
> +    * invocations of some previous primitive. Regardless of the value of
> +    * "UAV coherency required", the "Accesses UAV" bits will implicitly cause
> +    * an in most cases useless DC flush when the lowermost stage with the bit
> +    * set finishes execution.
> +    *
> +    * It would be nice to disable it, but in some cases we can't because on
> +    * Gen8+ it also has an influence on rasterization via the PS UAV-only
> +    * signal (which could be set independently from the coherency mechanism
> +    * in the 3DSTATE_WM command on Gen7), and because in some cases it will
> +    * determine whether the hardware skips execution of the fragment shader
> +    * or not via the ThreadDispatchEnable signal. However if we know that
> +    * GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
> +    * GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make any
> +    * difference so we may just disable it here.
> +    *
> +    * Gen8 hardware tries to compute ThreadDispatchEnable for us but doesn't
> +    * take into account KillPixels when no depth or stencil writes are
> +    * enabled. In order for occlusion queries to work correctly with no
> +    * attachments, we need to force-enable here.
> +    */
> +   if ((wm_prog_data->has_side_effects || wm_prog_data->uses_kill) &&
> +       !has_color_buffer_write_enabled(pipeline, blend))
> +      ps.PixelShaderHasUAV = true;
> +
>     ps.PixelShaderComputesStencil = wm_prog_data->computed_stencil;
>     ps.PixelShaderPullsBary = wm_prog_data->pulls_bary;
>
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
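For readers skimming the thread, here is a rough standalone sketch of the per-generation choice the v2 patch makes. It is not the driver code: `struct fs_info`, `struct ps_workaround` and `choose_ps_workaround()` are hypothetical names invented for illustration, and the sketch assumes the Gen8 path in emit_3dstate_wm() tests the same condition as the Gen9+ hunk, which the quoted diff does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state the patch consults
 * (wm_prog_data and has_color_buffer_write_enabled() in the driver). */
struct fs_info {
   bool has_side_effects;    /* shader has side effects, e.g. image/SSBO writes */
   bool uses_kill;           /* shader may discard pixels */
   bool color_write_enabled; /* some color attachment is actually written */
};

/* Hypothetical result: which of the two workaround bits gets set. */
struct ps_workaround {
   bool force_thread_dispatch_enable; /* 3DSTATE_WM, Gen8 only */
   bool pixel_shader_has_uav;         /* 3DSTATE_PS_EXTRA, Gen9 and later */
};

static struct ps_workaround
choose_ps_workaround(int gen, const struct fs_info *fs)
{
   struct ps_workaround w = { false, false };

   /* The patch only touches the Gen8+ code paths. */
   assert(gen >= 8);

   /* The fragment shader must run even though nothing is written to a
    * color buffer. Assumption: Gen8 uses this same test. */
   bool needs_dispatch = (fs->has_side_effects || fs->uses_kill) &&
                         !fs->color_write_enabled;

   if (gen == 8)
      w.force_thread_dispatch_enable = needs_dispatch;
   else
      w.pixel_shader_has_uav = needs_dispatch;

   return w;
}
```

With a shader that uses discard and no color writes, this picks the ForceThreadDispatchEnable path for gen 8 and the PixelShaderHasUAV path for gen 9, and sets neither once a color buffer write is enabled.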