Just a comment: in truth, MEDIA_VFE_STATE -was- programmed correctly without this patch. It turns out that the PerThreadScratchSpace bits are the first (lowest) bits in the bytes holding the scratch base pointer. The HW uses those first bits to stash state, and the GENX pack code knows this and accounts for it.
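To make that equivalence concrete, here is a minimal sketch in C. It is a simplified model and not the actual GENX pack code: struct vfe_dw1_fields and pack_vfe_dw1() are hypothetical names, and the 4-bit field width and 1 kB base alignment are assumptions made purely for illustration. It only shows why passing the value as the rw_bo() offset and setting PerThreadScratchSpace explicitly produce the same packed bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of the fields that share one packed value. */
struct vfe_dw1_fields {
   uint64_t scratch_bo_address;  /* assumed GPU address of the scratch BO */
   uint32_t address_offset;      /* models the second argument of rw_bo() */
   uint32_t per_thread_scratch;  /* models vfe.PerThreadScratchSpace */
};

/* Combine the address, its offset and the low-bit field into one value.
 * Assumption: the base is 1 kB aligned and the field is 4 bits wide, so
 * the low bits of the address are free to carry the field.
 */
static uint64_t
pack_vfe_dw1(struct vfe_dw1_fields f)
{
   assert((f.scratch_bo_address & 0x3ff) == 0);
   assert(f.address_offset <= 0xf);
   assert(f.per_thread_scratch <= 0xf);
   return f.scratch_bo_address + f.address_offset + f.per_thread_scratch;
}
```

With a base of 0x10000 and an encoded value of 5, both { 0x10000, 5, 0 } (value passed as the offset, as before the patch) and { 0x10000, 0, 5 } (explicit field, as after the patch) pack to the same result in this model.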
In all honesty, this patch is not necessary to fix car-chase; it is just a readability patch. My apologies for jumping the gun and not checking whether the bits for PerThreadScratchSpace were the first bits of the BO address for scratch space. Sighs. Even though it is just a readability patch, I think it should land, since it makes the code easier to read.

-Kevin

-----Original Message-----
From: Rogovin, Kevin
Sent: Tuesday, December 12, 2017 12:05 PM
To: mesa-dev@lists.freedesktop.org
Cc: Rogovin, Kevin <kevin.rogo...@intel.com>
Subject: [PATCH 1/2] i965: correctly program MEDIA_VFE_STATE for compute shading

From: Kevin Rogovin <kevin.rogo...@intel.com>

Signed-off-by: Kevin Rogovin <kevin.rogo...@intel.com>
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 04a492539a..50ac5bc59f 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4183,28 +4183,35 @@ genX(upload_cs_state)(struct brw_context *brw)
 
    brw_batch_emit(brw, GENX(MEDIA_VFE_STATE), vfe) {
       if (prog_data->total_scratch) {
-         uint32_t bo_offset;
+         uint32_t per_thread_scratch_value;
 
          if (GEN_GEN >= 8) {
             /* Broadwell's Per Thread Scratch Space is in the range [0, 11]
              * where 0 = 1k, 1 = 2k, 2 = 4k, ..., 11 = 2M.
              */
-            bo_offset = ffs(stage_state->per_thread_scratch) - 11;
+            per_thread_scratch_value = ffs(stage_state->per_thread_scratch) - 11;
          } else if (GEN_IS_HASWELL) {
             /* Haswell's Per Thread Scratch Space is in the range [0, 10]
              * where 0 = 2k, 1 = 4k, 2 = 8k, ..., 10 = 2M.
              */
-            bo_offset = ffs(stage_state->per_thread_scratch) - 12;
+            per_thread_scratch_value = ffs(stage_state->per_thread_scratch) - 12;
          } else {
             /* Earlier platforms use the range [0, 11] to mean [1kB, 12kB]
              * where 0 = 1kB, 1 = 2kB, 2 = 3kB, ..., 11 = 12kB.
              */
-            bo_offset = stage_state->per_thread_scratch / 1024 - 1;
+            per_thread_scratch_value = stage_state->per_thread_scratch / 1024 - 1;
          }
-         vfe.ScratchSpaceBasePointer =
-            rw_bo(stage_state->scratch_bo, bo_offset);
+         vfe.ScratchSpaceBasePointer = rw_bo(stage_state->scratch_bo, 0);
+         vfe.PerThreadScratchSpace = per_thread_scratch_value;
       }
 
+      /* If brw->screen->subslice_total is greater than one, then
+       * devinfo->max_cs_threads stores number of threads per sub-slice;
+       * thus we need to multiply by that number by subslices to get
+       * the actual maximum number of threads; the -1 is because the HW
+       * has a bias of 1 (would not make sense to say the maximum number
+       * of threads is 0).
+       */
       const uint32_t subslices = MAX2(brw->screen->subslice_total, 1);
       vfe.MaximumNumberofThreads = devinfo->max_cs_threads * subslices - 1;
       vfe.NumberofURBEntries = GEN_GEN >= 8 ? 2 : 0;
-- 
2.15.0
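
For reference, the field encodings described in the patch comments can be checked with a small standalone sketch. This is not driver code: enum hw_gen, encode_per_thread_scratch() and encode_max_threads() are hypothetical names standing in for the GEN_GEN / GEN_IS_HASWELL checks and the brw_context fields, and ffs() is taken from POSIX <strings.h>. The arithmetic itself is copied from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Hypothetical stand-in for the GEN_GEN / GEN_IS_HASWELL checks. */
enum hw_gen { HW_PRE_HASWELL, HW_HASWELL, HW_GEN8_PLUS };

/* Encode a per-thread scratch size in bytes into the HW field value. */
static uint32_t
encode_per_thread_scratch(enum hw_gen gen, uint32_t bytes)
{
   switch (gen) {
   case HW_GEN8_PLUS:
      /* Power-of-two sizes: 0 = 1k, 1 = 2k, ..., 11 = 2M. */
      assert(bytes >= 1024 && (bytes & (bytes - 1)) == 0);
      return (uint32_t)ffs((int)bytes) - 11;
   case HW_HASWELL:
      /* Power-of-two sizes: 0 = 2k, 1 = 4k, ..., 10 = 2M. */
      assert(bytes >= 2048 && (bytes & (bytes - 1)) == 0);
      return (uint32_t)ffs((int)bytes) - 12;
   default:
      /* Linear 1kB steps: 0 = 1kB, 1 = 2kB, ..., 11 = 12kB. */
      assert(bytes >= 1024);
      return bytes / 1024 - 1;
   }
}

/* The HW field is biased by one: a value of N means N + 1 threads. */
static uint32_t
encode_max_threads(uint32_t threads_per_subslice, uint32_t subslice_total)
{
   const uint32_t subslices = subslice_total > 1 ? subslice_total : 1;
   return threads_per_subslice * subslices - 1;
}
```

For example, a 2M scratch size encodes to 11 on Gen8+ and to 10 on Haswell, and 12kB encodes to 11 on earlier platforms, which matches the ranges quoted in the comments.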