You're right, that's a bad idea and doesn't work.
On Fri, Jun 27, 2014 at 6:09 PM, Iago Toral <ito...@igalia.com> wrote:
> If by not doing anything you mean not processing or removing the
> ir_emit_vertex instructions for that stream this would have two problems
> at least:
>
> 1) We won't get correct results for GL_PRIMITIVES_GENERATED in that
> stream (it will always be 0). This may be a minor problem.
>
> 2) If that stream is stream 0 and rendering is enabled then we lose
> rendering output, which would be a major problem.
>
> So I think this is not a good thing to do.
>
> Iago
>
> On Fri, 2014-06-27 at 08:08 +1200, Chris Forbes wrote:
>> As an alternative -- we know if we have this scenario at link time --
>> could we perhaps just not do anything in EmitStreamVertex if there are
>> no varyings captured to that stream?
>> On Thu, Jun 26, 2014 at 10:26 PM, Iago Toral <ito...@igalia.com> wrote:
>> > Hello,
>> >
>> > while testing various scenarios for multi-stream support in geometry
>> > shaders I came across one that I think might be a hardware bug, or at
>> > the very least, a hardware limitation that creates a problem to
>> > implement correct behavior according to ARB_transform_feedback3.
>> >
>> > The conflictive scenario is activated with this setup:
>> > - Enable transform feedback.
>> > - Do not associate any varyings with one particular stream (let's call
>> > this stream X).
>> > - Have the GS emit a vertex to stream X.
>> >
>> > ARB_transform_feedback3 clarifies expected behavior in this case:
>> >
>> > "If the set of varyings selected for transform feedback does not include
>> > any belonging to the specified stream, nothing will be recorded when
>> > primitives are emitted to that stream, and the corresponding vertex
>> > count will be zero."
>> >
>> > However, we get two possible outcomes with this setup:
>> >
>> > 1) If the vertex emitted to that stream is not the last vertex emitted
>> > by the GS, then primitive count for that stream is incorrect (returns
>> > 0), but everything else works ok.
>> >
>> > I think this behavior is expected as per the IvyBridge documentation:
>> >
>> > "8.3 Stream Output Function:
>> > ...
>> > If a stream has no SO_DECL state defined (NumEntries is 0), incoming
>> > objects targeting that stream are effectively ignored. As there is no
>> > attempt to perform stream output, overflow detection is neither required
>> > nor performed."
>> >
>> > Which means that we can't use SO_PRIMITIVE_STORAGE_NEEDED for the
>> > primitive count in this case. We could still use CL_INVOCATION_COUNT for
>> > stream 0, but that would not fix the problem for other streams.
>> >
>> > 2) If the vertex emitted to that stream is the last vertex emitted by
>> > the GS, then transform feedback does not work for any stream (no values
>> > are recorded in the TF buffers) and primitive queries for all streams
>> > return 0. Rendering is okay though: stream 0 outputs are rendered
>> > properly and outputs from other streams are discarded. This, I think, is
>> > a hardware problem.
>> >
>> > With this setup, we are configuring the 3DSTATE_SO_DECL_LIST command for
>> > stream X like this:
>> >
>> > Buffer Selects (Stream X) = 0
>> > Num Entries (Stream X) = 0
>> >
>> > that is, that stream writes to no buffers and has no declarations to
>> > write, which is correct.
>> >
>> > Now comes the funny part: simply forcing Num Entries (Stream X) = 1, even
>> > if there are no declarations, makes TF and primitive queries work again
>> > for all streams but X, and for stream X, primitive count is ok, but TF
>> > is not (but that is kind of expected since we are not configuring it
>> > properly). Moreover, if I also force Buffer Selects (Stream X) = N (so
>> > that N is the index of a disabled TF buffer), then TF also works as
>> > expected for Stream X (primitives generated is okay, TF primitives
>> > written is 0, and no TF data for that stream is written).
>> >
>> > It looks like the hardware does not like setups where there are streams
>> > that have 0 varyings to record after all, even less so if the last
>> > vertex we emit is sent to such a stream.
>> >
>> > Based on the above, there is a workaround for this but I think it is
>> > pretty ugly so I would like to know other people's thoughts on whether
>> > it is worth implementing. It would involve the following:
>> >
>> > In upload_3dstate_streamout() we make sure we disable all transform
>> > feedback buffers that are not going to record information (currently a
>> > TF buffer is activated as long as the user has called
>> > glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, index, bufferName)). We
>> > can know if a buffer is not going to be written by inspecting its
>> > BufferStride: it should be 0 for buffers that won't get written. I think
>> > this is probably good to do in any case.
>> >
>> > Then the ugly part: in gen7_upload_3dstate_so_decl_list(), if we detect
>> > a stream with no varyings bound to it (so num decls is 0) *and* there
>> > are disabled TF buffers, we silently set num decls for that stream to 1
>> > and set its buffer_mask to write to one of the disabled buffers (it
>> > won't actually write because they are disabled).
>> >
>> > I have a patch for this [1] and it seems to fix the problem (although it
>> > only works as long as we have disabled TF buffers available).
>> >
>> > Opinions? Is there any other alternative to work around this issue?
>> >
>> > The problem is particularly annoying because I think it hits a very
>> > likely scenario: an application using stream 0 for rendering only (no
>> > TF) and using other streams to capture TF.
>> >
>> > Iago
>> >
>> > [1] Patch:
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
>> > b/src/mesa/drivers/dri/i965/gen7_sol_state.c
>> > index d2c3ae3..1450dde 100644
>> > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
>> > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
>> > @@ -189,6 +189,27 @@ gen7_upload_3dstate_so_decl_list(struct brw_context
>> > *brw,
>> >           max_decls = decls[stream_id];
>> >     }
>> >
>> > +   /* We need to inspect if we have streams for which we don't have any
>> > +    * varyings to record. The hardware does not handle this scenario well
>> > +    * and for TF to work in this case we need to configure such streams to
>> > +    * have at least one decl and write to some disabled buffer.
>> > +    */
>> > +   int disabled_buffer = -1;
>> > +   for (int i = 0; i < 4; i++) {
>> > +      if (linked_xfb_info->BufferStride[i] == 0) {
>> > +         disabled_buffer = i;
>> > +         break;
>> > +      }
>> > +   }
>> > +   if (disabled_buffer >= 0) {
>> > +      for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
>> > +         if (decls[i] == 0) {
>> > +            decls[i] = 1;
>> > +            buffer_mask[i] = 1 << disabled_buffer;
>> > +         }
>> > +      }
>> > +   }
>> > +
>> >     BEGIN_BATCH(max_decls * 2 + 3);
>> >     OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (max_decls * 2 + 1));
>> >
>> > @@ -250,9 +271,10 @@ upload_3dstate_streamout(struct brw_context *brw,
>> > bool active,
>> >        dw1 |= SO_REORDER_TRAILING;
>> >
>> >        for (i = 0; i < 4; i++) {
>> > -         if (xfb_obj->Buffers[i]) {
>> > -            dw1 |= SO_BUFFER_ENABLE(i);
>> > -         }
>> > +         if (xfb_obj->Buffers[i] &&
>> > +
>> > xfb_obj->shader_program->LinkedTransformFeedback.BufferStride[i] > 0) {
>> > +            dw1 |= SO_BUFFER_ENABLE(i);
>> > +         }
>> >        }
>> >
>> >
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
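
For anyone who wants to play with the decl-patching logic from the quoted patch outside the driver, here is a minimal standalone C sketch. It is not the driver code: find_disabled_buffer() and patch_empty_streams() are hypothetical helper names, MAX_TF_BUFFERS is an assumed constant (the patch hard-codes 4), and plain arrays stand in for linked_xfb_info->BufferStride, decls and buffer_mask. It only models the bookkeeping, it does not touch any hardware state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERTEX_STREAMS 4
#define MAX_TF_BUFFERS 4   /* assumption: the quoted patch hard-codes 4 */

/* Hypothetical helper: return the index of the first TF buffer with a
 * zero stride (i.e. one that will never be written), or -1 if every
 * buffer is in use.
 */
static int
find_disabled_buffer(const unsigned stride[MAX_TF_BUFFERS])
{
   assert(stride != NULL);
   for (int i = 0; i < MAX_TF_BUFFERS; i++) {
      if (stride[i] == 0)
         return i;
   }
   return -1;
}

/* Hypothetical helper modelling the workaround: every stream with zero
 * decls gets one dummy decl pointed at a disabled buffer. Returns the
 * number of streams that were patched; 0 means either nothing needed
 * patching or no disabled buffer was available.
 */
static int
patch_empty_streams(const unsigned stride[MAX_TF_BUFFERS],
                    int decls[MAX_VERTEX_STREAMS],
                    unsigned buffer_mask[MAX_VERTEX_STREAMS])
{
   assert(decls != NULL && buffer_mask != NULL);

   int disabled_buffer = find_disabled_buffer(stride);
   if (disabled_buffer < 0)
      return 0;   /* the limitation noted in the thread: no spare buffer */

   int patched = 0;
   for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
      if (decls[i] == 0) {
         decls[i] = 1;
         buffer_mask[i] = 1u << disabled_buffer;
         patched++;
      }
   }
   return patched;
}
```

As an example of the "stream 0 renders, stream 1 captures" case: with strides {16, 0, 0, 0} and decls {0, 1, 0, 0}, the first disabled buffer is buffer 1, so streams 0, 2 and 3 each end up with one decl and a buffer mask of 1 << 1, while stream 1 is left alone. If all four strides are non-zero the function changes nothing, which is the case the patch cannot handle.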