On Mon, Nov 30, 2015 at 1:30 PM, Marek Olšák <mar...@gmail.com> wrote:
> On Mon, Nov 30, 2015 at 7:20 AM, Dave Airlie <airl...@gmail.com> wrote:
>> From: Dave Airlie <airl...@redhat.com>
>>
>> This creates a constant buffer with the information about
>> the layout of the LDS memory that is given to the vertex, tess
>> control and tess evaluation shaders.
>>
>> This also programs the LDS size and the LS_HS_CONFIG registers,
>> on evergreen only.
>>
>> Signed-off-by: Dave Airlie <airl...@redhat.com>
>> ---
>> src/gallium/drivers/r600/evergreen_state.c | 128
>> +++++++++++++++++++++++++++
>> src/gallium/drivers/r600/r600_pipe.h | 24 ++++-
>> src/gallium/drivers/r600/r600_state_common.c | 13 +++
>> 3 files changed, 162 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/evergreen_state.c
>> b/src/gallium/drivers/r600/evergreen_state.c
>> index c01e8e3..edc6f28 100644
>> --- a/src/gallium/drivers/r600/evergreen_state.c
>> +++ b/src/gallium/drivers/r600/evergreen_state.c
>> @@ -3763,3 +3763,131 @@ void evergreen_init_state_functions(struct
>> r600_context *rctx)
>>
>>         evergreen_init_compute_state_functions(rctx);
>> }
>> +
>> +/**
>> + * This calculates the LDS size for tessellation shaders (VS, TCS, TES).
>> + *
>> + * The information about LDS and other non-compile-time parameters is then
>> + * written to the const buffer.
>> +
>> + * const buffer contains -
>> + * uint32_t input_patch_size
>> + * uint32_t input_vertex_size
>> + * uint32_t num_tcs_input_cp
>> + * uint32_t num_tcs_output_cp;
>> + * uint32_t output_patch_size
>> + * uint32_t output_vertex_size
>> + * uint32_t output_patch0_offset
>> + * uint32_t perpatch_output_offset
>> + * and the same constbuf is bound to LS/HS/VS(ES).
>> + */
>> +void evergreen_setup_tess_constants(struct r600_context *rctx, const struct
>> pipe_draw_info *info, unsigned *num_patches, uint32_t *lds_alloc)
>> +{
>> +        struct pipe_constant_buffer constbuf = {0};
>> +        struct r600_pipe_shader_selector *tcs = rctx->tcs_shader ?
>> rctx->tcs_shader : rctx->tes_shader;
>> +        struct r600_pipe_shader_selector *ls = rctx->vs_shader;
>> +        unsigned num_tcs_input_cp = info->vertices_per_patch;
>> +        unsigned num_tcs_outputs;
>> +        unsigned num_tcs_output_cp;
>> +        unsigned num_tcs_patch_outputs;
>> +        unsigned num_tcs_inputs;
>> +        unsigned input_vertex_size, output_vertex_size;
>> +        unsigned input_patch_size, pervertex_output_patch_size,
>> output_patch_size;
>> +        unsigned output_patch0_offset, perpatch_output_offset, lds_size;
>> +        uint32_t values[16];
>> +        uint32_t tmp;
>> +
>> +        if (!rctx->tes_shader)
>> +                return;
>> +
>> +        *num_patches = 1;
>
> num_patches should be set before returning.
>
>> +
>> +        num_tcs_inputs = util_last_bit64(ls->lds_outputs_written_mask);
>> +
>> +        if (rctx->tcs_shader) {
>> +                num_tcs_outputs =
>> util_last_bit64(tcs->lds_outputs_written_mask);
>> +                num_tcs_output_cp =
>> tcs->info.properties[TGSI_PROPERTY_TCS_VERTICES_OUT];
>> +                num_tcs_patch_outputs =
>> util_last_bit64(tcs->lds_patch_outputs_written_mask);
>> +        } else {
>> +                num_tcs_outputs = num_tcs_inputs;
>> +                num_tcs_output_cp = num_tcs_input_cp;
>> +                num_tcs_patch_outputs = 2; /* TESSINNER + TESSOUTER */
>> +        }
>> +
>> +        /* size in bytes */
>> +        input_vertex_size = num_tcs_inputs * 16;
>> +        output_vertex_size = num_tcs_outputs * 16;
>> +
>> +        input_patch_size = num_tcs_input_cp * input_vertex_size;
>> +
>> +        pervertex_output_patch_size = num_tcs_output_cp * output_vertex_size;
>> +        output_patch_size = pervertex_output_patch_size +
>> num_tcs_patch_outputs * 16;
>> +
>> +        output_patch0_offset = rctx->tcs_shader ? input_patch_size *
>> *num_patches : 0;
>> +        perpatch_output_offset = output_patch0_offset +
>> pervertex_output_patch_size;
>> +
>> +        lds_size = output_patch0_offset + output_patch_size * *num_patches;
>> +
>> +        values[0] = input_patch_size;
>> +        values[1] = input_vertex_size;
>> +        values[2] = num_tcs_input_cp;
>> +        values[3] = num_tcs_output_cp;
>> +
>> +        values[4] = output_patch_size;
>> +        values[5] = output_vertex_size;
>> +        values[6] = output_patch0_offset;
>> +        values[7] = perpatch_output_offset;
>> +
>> +        /* docs say HS_NUM_WAVES - CEIL((LS_HS_CONFIG.NUM_PATCHES *
>> +           LS_HS_CONFIG.HS_NUM_OUTPUT_CP) / (NUM_GOOD_PIPES * 16)) */
>> +        tmp = (lds_size | (1 << 14)); /* TODO */
>
> If I understand this correctly, num_good_pipes can be between 1 and 4.
> Assume the worst case, which is 1. This gives us:
> ceil(NUM_PATCHES * NUM_OUTPUT_CP / 16)
>
> That equals 2 if NUM_OUTPUT_CP > 16 and NUM_PATCHES = 1.

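For reference, a minimal sketch of the wave-count arithmetic from the quoted "docs say" comment, assuming plain unsigned integer inputs. The names div_round_up and hs_num_waves are hypothetical helpers for illustration only; they are not part of the patch or of the r600 driver, and how the result would be packed into the register is not shown here.

```c
/* Hypothetical helper (not in the patch): integer ceil(a / b), b must be > 0. */
static unsigned div_round_up(unsigned a, unsigned b)
{
    return (a + b - 1) / b;
}

/*
 * Sketch of the formula quoted from the docs:
 *   HS_NUM_WAVES = CEIL((NUM_PATCHES * HS_NUM_OUTPUT_CP) / (NUM_GOOD_PIPES * 16))
 * num_good_pipes is assumed to be in the range 1..4, as stated in the review.
 */
static unsigned hs_num_waves(unsigned num_patches,
                             unsigned num_output_cp,
                             unsigned num_good_pipes)
{
    return div_round_up(num_patches * num_output_cp, num_good_pipes * 16);
}
```

With num_patches = 1 and num_good_pipes = 1 (the worst case), 16 output control points still fit in one wave, while 17 or more need two.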
BTW, HS_NUM_WAVES is the number of waves that share the same LDS memory.
1 pipe = 16 threads per wave (GCN always has 4 pipes = 64 threads per
wave). That's where the "16" in the equation comes from. The equation only
ensures that all vertices within a patch are assigned the same LDS memory.
That's why you need at least 2 waves on 1-pipe chips when NUM_OUTPUT_CP > 16.

Marek