On Sun, Jul 23, 2017 at 12:27 PM, Kyriazis, George <george.kyria...@intel.com> wrote: > > On Jul 23, 2017, at 11:21 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote: > > On Sun, Jul 23, 2017 at 12:08 PM, George Kyriazis > <george.kyria...@intel.com> wrote: > > The shader that is used to copy vertex data out of the vs/gs shaders to > the user-specified buffer (streamout os SO shader) was not using the > correct offsets. > > Adjust the offsets that are used just for the SO shader: > - Make sure that position is handled in the same special way > as in the vs/gs shaders > - Use the correct offset to be passed in the core > - consolidate register slot mapping logic into one function, since it's > been calculated in 2 different places (one for calcuating the slot mask, > and one for the register offsets themselves > > Also make room for all attibutes in the backend vertex area. > > Fixes: > - all vtk GL2PS tests > - 18 piglit tests (16 ext_transform_feedback tests, > arb-quads-follow-provoking-vertex and primitive-type gl_points > --- > src/gallium/drivers/swr/swr_draw.cpp | 11 ++++++++--- > src/gallium/drivers/swr/swr_state.cpp | 31 +++++++++++++++++++++++++++++-- > src/gallium/drivers/swr/swr_state.h | 3 +++ > 3 files changed, 40 insertions(+), 5 deletions(-) > > diff --git a/src/gallium/drivers/swr/swr_draw.cpp > b/src/gallium/drivers/swr/swr_draw.cpp > index 62ad3f7..218de0f 100644 > --- a/src/gallium/drivers/swr/swr_draw.cpp > +++ b/src/gallium/drivers/swr/swr_draw.cpp > @@ -26,6 +26,7 @@ > #include "swr_resource.h" > #include "swr_fence.h" > #include "swr_query.h" > +#include "swr_state.h" > #include "jit_api.h" > > #include "util/u_draw.h" > @@ -81,8 +82,11 @@ swr_draw_vbo(struct pipe_context *pipe, const struct > pipe_draw_info *info) > offsets[output_buffer] = so->output[i].dst_offset; > } > > + unsigned attrib_slot = so->output[i].register_index; > + attrib_slot = swr_so_adjust_attrib(attrib_slot, ctx->vs); > + > state.stream.decl[num].bufferIndex = output_buffer; > - state.stream.decl[num].attribSlot = > so->output[i].register_index - 1; > + state.stream.decl[num].attribSlot = attrib_slot; > state.stream.decl[num].componentMask = > ((1 << so->output[i].num_components) - 1) > << so->output[i].start_component; > @@ -130,9 +134,10 @@ swr_draw_vbo(struct pipe_context *pipe, const struct > pipe_draw_info *info) > SWR_FRONTEND_STATE feState = {0}; > > feState.vsVertexSize = > - VERTEX_ATTRIB_START_SLOT + > + VERTEX_ATTRIB_START_SLOT > + ctx->vs->info.base.num_outputs > - - (ctx->vs->info.base.writes_position ? 1 : 0); > + - (ctx->vs->info.base.writes_position ? 1 : 0) > + + ctx->fs->info.base.num_outputs;
Why do you care about the number of fs outputs in the vertex size? > > if (ctx->rasterizer->flatshade_first) { > feState.provokingVertex = {1, 0, 0}; > diff --git a/src/gallium/drivers/swr/swr_state.cpp > b/src/gallium/drivers/swr/swr_state.cpp > index 501fdea..3e07929 100644 > --- a/src/gallium/drivers/swr/swr_state.cpp > +++ b/src/gallium/drivers/swr/swr_state.cpp > @@ -345,13 +345,15 @@ swr_create_vs_state(struct pipe_context *pipe, > // soState.streamToRasterizer not used > > for (uint32_t i = 0; i < stream_output->num_outputs; i++) { > + unsigned attrib_slot = stream_output->output[i].register_index; > + attrib_slot = swr_so_adjust_attrib(attrib_slot, swr_vs); > swr_vs->soState.streamMasks[stream_output->output[i].stream] |= > - 1 << (stream_output->output[i].register_index - 1); > + (1 << attrib_slot); > } > for (uint32_t i = 0; i < MAX_SO_STREAMS; i++) { > swr_vs->soState.streamNumEntries[i] = > _mm_popcnt_u32(swr_vs->soState.streamMasks[i]); > - swr_vs->soState.vertexAttribOffset[i] = VERTEX_ATTRIB_START_SLOT; > // TODO: optimize > + swr_vs->soState.vertexAttribOffset[i] = 0; > } > } > > @@ -1777,6 +1779,31 @@ swr_update_derived(struct pipe_context *pipe, > ctx->dirty = post_update_dirty_flags; > } > > +unsigned > +swr_so_adjust_attrib(unsigned in_attrib, > + swr_vertex_shader *swr_vs) > +{ > + ubyte semantic_name; > + unsigned attrib; > + > + attrib = in_attrib + VERTEX_ATTRIB_START_SLOT; > + > + if (swr_vs) { > + semantic_name = swr_vs->info.base.output_semantic_name[in_attrib]; > + if (semantic_name == TGSI_SEMANTIC_POSITION) { > > > But it's not just the position... there are plenty of other attributes > that need similar treatment. e.g. pointsize, layer, viewport index. > > > Yes, we want to deal with the other attributes, too, but lack the testing > time, due to the 17.2 schedule. Since position fixes the vtk tests (that we > are mostly concerned about), that is what this patch covers. It’s in our > plan to deal with the rest of the attributes in the near future (before we > get distracted). OK, well your guys's call on how to tackle all this. I somewhat hate partial solutions, but I realize you have more tangible goals than 'perfection'. Either way ... doesn't this all break if there's a geometry shader? You're only ever looking at the VS, not at the last pre-rast stage (which might be GS, TES, or VS). I believe that TF currently works for GS, and you're going to be adjusting GS output attrib positions based on VS output indices. [Note that my change was written before GS was a thing in swr, so it also doesn't take this into account.] > > I tried to fix this a while back but IIRC my solution was left wanting: > > https://github.com/imirkin/mesa/commits/swr > > (see the WIP commits at the top). > > I ended up getting distracted and never completed the work. Feel free > to copy/ignore/whatever the above commits. I also tried to provide a > ARB_tf2 impl as well (i.e. pause/resume). > > + attrib = VERTEX_POSITION_SLOT; > + } else { > + for (int i = 0; i < PIPE_MAX_SHADER_OUTPUTS; i++) { > + if (swr_vs->info.base.output_semantic_name[i] == > TGSI_SEMANTIC_POSITION) { > + attrib--; > + break; > + } > + } > + } > + } > + > + return attrib; > +} > > static struct pipe_stream_output_target * > swr_create_so_target(struct pipe_context *pipe, > diff --git a/src/gallium/drivers/swr/swr_state.h > b/src/gallium/drivers/swr/swr_state.h > index 7940a96..8cbd463 100644 > --- a/src/gallium/drivers/swr/swr_state.h > +++ b/src/gallium/drivers/swr/swr_state.h > @@ -110,6 +110,9 @@ struct swr_derived_state { > void swr_update_derived(struct pipe_context *, > const struct pipe_draw_info * = nullptr); > > +unsigned swr_so_adjust_attrib(unsigned in_attrib, > + swr_vertex_shader *swr_vs); > + > /* > * Conversion functions: Convert mesa state defines to SWR. > */ > -- > 2.7.4 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev