I'm not sure I fully understand, but it seems the problem is that start/index_bias is copied to SQ_VTX_BASE_VTX rather than VGT_INDX_OFFSET. I guess you can avoid using BASE_VTX by setting fetch_type=NO_INDEX_OFFSET, right? If so, there is a simple solution: the COPY_DW packet. It copies one dword between memory and registers (mem->mem, reg->reg, reg->mem and mem->reg are all supported), so it can be used to copy start/index_bias into VGT_INDX_OFFSET. All you have to do is disable the use of SQ_VTX_BASE_VTX in the shader.
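To make the idea concrete, here is a toy CPU-side model in C. It is only a sketch: it does not use the real COPY_DW packet encoding, and every name in it (gpu_model, copy_dw, REG_VGT_INDX_OFFSET as a slot number, load_index_offset) is hypothetical. The only thing taken as given is the GL indirect command layout, where the start/bias value is dword 2 (first) for DrawArraysIndirect and dword 3 (baseVertex) for DrawElementsIndirect.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a "copy one dword" operation between two address spaces.
 * This is NOT the real COPY_DW packet layout; all names are hypothetical. */
enum space { SPACE_MEM, SPACE_REG };

#define MODEL_DWORDS 16

struct gpu_model {
    uint32_t mem[MODEL_DWORDS]; /* stands in for GPU-visible memory */
    uint32_t reg[MODEL_DWORDS]; /* stands in for the register file */
};

/* Hypothetical slot number standing in for VGT_INDX_OFFSET. */
#define REG_VGT_INDX_OFFSET 3

/* Returns a pointer to one dword in the chosen space. */
static uint32_t *slot(struct gpu_model *g, enum space s, unsigned idx)
{
    assert(idx < MODEL_DWORDS);
    return s == SPACE_MEM ? &g->mem[idx] : &g->reg[idx];
}

/* Copies one dword; mem->mem, reg->reg, reg->mem and mem->reg all work. */
static void copy_dw(struct gpu_model *g,
                    enum space src, unsigned src_idx,
                    enum space dst, unsigned dst_idx)
{
    *slot(g, dst, dst_idx) = *slot(g, src, src_idx);
}

/* GL indirect command layouts:
 *   DrawArraysIndirect:   count, instanceCount, first, baseInstance
 *   DrawElementsIndirect: count, instanceCount, firstIndex, baseVertex,
 *                         baseInstance
 * Returns the dword index of the start/bias field. */
static unsigned bias_dword(int indexed)
{
    return indexed ? 3 : 2;
}

/* Copies start/index_bias from the indirect command in memory into the
 * modelled VGT_INDX_OFFSET register, without any CPU readback. */
static void load_index_offset(struct gpu_model *g, unsigned cmd_base,
                              int indexed)
{
    copy_dw(g, SPACE_MEM, cmd_base + bias_dword(indexed),
            SPACE_REG, REG_VGT_INDX_OFFSET);
}
```

Calling load_index_offset(&g, cmd_base, 1) on a model whose memory holds a DrawElementsIndirect command at cmd_base leaves that command's baseVertex in the modelled register. In the driver this would be a packet emitted before the draw, so the value never has to pass through the CPU.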
What do you think?

Marek

On Sat, Feb 7, 2015 at 1:42 AM, Glenn Kennard <glenn.kenn...@gmail.com> wrote:
> On Fri, 06 Feb 2015 17:08:46 +0100, Marek Olšák <mar...@gmail.com> wrote:
>
>> Please bump the size of vgt_state for the SQ_VTX_BASE_VTX_LOC
>> register. It's set by r600_init_atom in r600_state.c and
>> evergreen_state.c
>>
>> Please bump R600_MAX_DRAW_CS_DWORDS. It's an upper bound of how many
>> dwords draw_vbo can emit.
>>
>
> Thanks, will fix.
>
>> I don't understand what get_vfetch_type is good for. Could you please
>> explain it in the code? Also, I don't understand what constant buffer
>> fetches have to do with VertexID.
>>
>
> Will add some more blurb to get_vfetch_type; in particular I can point at
> the appropriate parts of the GPU documentation.
>
> As for the interaction of buffer fetches and VertexID, I'll attempt to
> explain:
>
> The way R_03CFF0_SQ_VTX_BASE_VTX_LOC is delivered to the vertex shader is
> basically, it isn't. Instead, the hardware pokes the 64 unique values (one
> per wavefront thread, "64 state" in the documentation) into a hidden state
> register in the fetch units, which the shader cannot read, at least not in
> any way that I've been able to find.
>
> Setting FETCH_MODE=SQ_VTX_FETCH_VERTEX_DATA (=0) on a VFETCH instruction
> then tells the fetch unit to add the BASE_VTX and start instance offsets
> before reading the value - see r600_asm.c:r600_create_vertex_fetch_shader(),
> which open-codes 0 as the fetch mode for vertex fetches.
>
> This creates a problem for GLSL gl_VertexID, since the shader cannot apply
> the offset. Let's look at the shader for the
> tests/spec/arb_draw_indirect/vertexid.c piglit test case:
>
> "#version 140\n"
> "\n"
> "in vec4 piglit_vertex;\n"
> "out vec3 c;\n"
> "\n"
> "const vec3 colors[] = vec3[](\n"
> " vec3(1, 0, 0),\n"
> " vec3(1, 0, 0),\n"
> " vec3(1, 0, 0),\n"
> " vec3(1, 0, 0),\n"
> "\n"
> ...
> " vec3(1, 0, 1),\n"
> " vec3(1, 0, 1),\n"
> " vec3(1, 0, 1),\n"
> " vec3(1, 0, 1)\n"
> ");\n"
> "void main() {\n"
> " c = colors[gl_VertexID];\n"
> " gl_Position = piglit_vertex;\n"
> "}\n"
>
> Colors here is a constant array, and the base offset needs to be applied to
> look up the correct color value - the GL 4.5 spec is quite clear that it
> should be applied to gl_VertexID. Since the hardware offers no way to add
> base instance to gl_VertexID, I do the next best thing and enable the offset
> on the array fetch operation instead.
>
> The detection logic is quite hacky, since really it needs to check whether
> the array expression depends in any way on gl_VertexID. That requires
> looking at def-use chains, which aren't available in r600_asm.c. SB could
> probably compute the bit instead, but that sort of violates its "don't
> change program meaning" principle, not to mention that behavior would
> differ with SB disabled.
>
> All the actual shaders that I've found using gl_VertexID in conjunction with
> indirect draws only use one constant array. I figure partial support at
> least approximately matches what the binary driver supports, which doesn't
> produce the correct value for gl_VertexID either for indirect draws in
> various cases - in particular, if the shader tries to compare gl_VertexID
> against some other expression you get an incorrect value.
>
>
> The driver does something totally different for direct draws: it adds the
> base offset and start offset manually and feeds that to the hardware, with
> BASE_VTX always set to 0, which allows it to work for all cases. Not an
> option for indirect draws if you want any sort of performance out of them.
>
>
> So to sum up, I don't see the hardware being fully capable of following the
> spec for gl_VertexID in conjunction with indirect drawing in all cases, at
> least not without some very slow fallbacks that read the draw parameters
> back to the CPU, which is useless.
> One option would be to just drop the attempt at supporting gl_VertexID
> from this patch if it's deemed too hacky.
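As a rough illustration of the behaviour described in the quoted explanation, here is a small C model. It is a sketch under stated assumptions, not driver code: all function names are hypothetical, the color array is reduced to integers, and the hidden fetch-unit state is modelled as a plain parameter that only the fetch helpers receive. It shows why enabling the offset on the array fetch fixes colors[gl_VertexID], and why a comparison done in the shader still sees the un-offset value.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COLORS 8

/* Hypothetical model of what the shader can read as gl_VertexID when the
 * base offset lives only in hidden fetch-unit state: the raw index. */
static uint32_t shader_visible_vertex_id(uint32_t raw_index)
{
    return raw_index; /* the base offset is not visible to shader code */
}

/* Models a fetch with the offset enabled: the fetch unit, not the shader,
 * adds the hidden base before reading the array. */
static int fetch_with_offset(const int *array, uint32_t index,
                             uint32_t hidden_base)
{
    assert(index + hidden_base < NUM_COLORS);
    return array[index + hidden_base];
}

/* Models a fetch with the offset disabled: the index is used as-is. */
static int fetch_without_offset(const int *array, uint32_t index)
{
    assert(index < NUM_COLORS);
    return array[index];
}

/* A comparison done in shader ALU code never goes through the fetch unit,
 * so it only ever sees the un-offset value. */
static int shader_compare(uint32_t raw_index, uint32_t expected)
{
    return shader_visible_vertex_id(raw_index) == expected;
}
```

With a base of 4 and a raw index of 1, the spec-mandated gl_VertexID is 5. In this model fetch_with_offset reads element 5, fetch_without_offset reads element 1, and shader_compare(1, 5) is false, which mirrors the partial support described above.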