Hi,
On 28/06/16 18:00, Ilia Mirkin wrote: > On Tue, Jun 28, 2016 at 11:46 AM, Alejandro Piñeiro > <apinhe...@igalia.com> wrote: >> Fixes: >> GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass >> >> On Haswell, Broadwell and Skylake (note that in order to execute >> that test, it is needed to override GL and GLSL versions). >> >> I was not able to find a documentation reference that justifies it. >> --- >> >> Having said, I didn't find a documentation reference explicitly >> mention that this is needed. >> >> Initially I thought that a flag was missing when calling >> emit_pipe_control_flush at brw_emit_mi_flush, but it was not the case >> as far as I saw. Then I noted that there is a gen6 workaround on that >> code: >> >> if (brw->gen == 6) { >> /* Hardware workaround: SNB B-Spec says: >> * >> * [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache >> * Flush Enable =1, a PIPE_CONTROL with any non-zero >> * post-sync-op is required. >> */ >> brw_emit_post_sync_nonzero_flush(brw); >> } >> >> I tested calling that method for any gen, guessing if the workaround >> was needed also for other gens, and the test got fixed. But looking at >> the documentation of other gens, I didn't find the need for this >> workaround. For that reason I moved to use gen7_emit_cs_stall, that is >> less agressive and get the test fixed too. It seems that in order to >> get a complete flush you need a cs stall flush with a >> pipe_control_write. But again, I didn't find any reference at the PRMs >> confirming it. >> >> Intuitively, this would be needed on brw_emit_mi_flush or even at >> brw_emit_pipe_control_flush (this one already include some >> gen-specific workarounds), but I prefered to keep it on the only place >> that seems to need it for now. >> >> In addition to solve that CTS test, it also gets it passing for the >> test I recently sent to the piglit list, and not included on master >> yet (acked for now): >> https://lists.freedesktop.org/archives/piglit/2016-June/020055.html >> >> That piglit patch adds 48 parameter combination for the basic >> test. Without this mesa patch 5-6 subtests fails. With this patch all >> of them passes. Tested on Haswell, Broadwell and Skylake too. >> >> src/mesa/drivers/dri/i965/intel_tex.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/src/mesa/drivers/dri/i965/intel_tex.c >> b/src/mesa/drivers/dri/i965/intel_tex.c >> index cac33ac..e7459cd 100644 >> --- a/src/mesa/drivers/dri/i965/intel_tex.c >> +++ b/src/mesa/drivers/dri/i965/intel_tex.c >> @@ -362,6 +362,7 @@ intel_texture_barrier(struct gl_context *ctx) >> { >> struct brw_context *brw = brw_context(ctx); >> >> + gen7_emit_cs_stall_flush(brw); >> brw_emit_mi_flush(brw); > Without commenting on exactly what these do, what texture barrier *should* do > is > > (1) wait for all previous draws to complete (since they may be in the > process of filling caches with "old" data) > (2) flush texture caches > > If you flush caches without waiting first, then a draw currently in > progress may continue dirtying them with the "bad" data. Thanks for the detailed answer. It is true that I was forgetting (1) at all. I totally focused on the cache flush, and assumed that there was something missing there. > As I said, however, I have no idea what either of the above functions > *really* do, or what forms of parallelism are possible on intel hw. > Hopefully the above comments will help someone with the proper > knowledge evaluate whether this or a different change is necessary. I really think that brw_emit_mi_flush totally fits on your (2) (and should not be modified as I suggested on my previous email). gen7_emit_cs_stall_flush as the better option for (1) is debatable. Tomorrow I will keep checking to confirm it, or in order to find a better option. Obviously if something with previous knowledge appears it will be welcome. Again, thanks for all the feedback, and the patience. Best regards _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev