If by not doing anything you mean not processing, or removing, the ir_emit_vertex instructions for that stream, this would have at least two problems:
1) We won't get correct results for GL_PRIMITIVES_GENERATED in that stream (it will always be 0). This may be a minor problem.

2) If that stream is stream 0 and rendering is enabled then we lose rendering output, which would be a major problem.

So I think this is not a good thing to do.

Iago

On Fri, 2014-06-27 at 08:08 +1200, Chris Forbes wrote:
> As an alternative -- we know if we have this scenario at link time --
> could we perhaps just not do anything in EmitStreamVertex if there are
> no varyings captured to that stream?
>
> On Thu, Jun 26, 2014 at 10:26 PM, Iago Toral <ito...@igalia.com> wrote:
> > Hello,
> >
> > while testing various scenarios for multi-stream support in geometry
> > shaders I came across one that I think might be a hardware bug, or at
> > the very least, a hardware limitation that creates a problem to
> > implement correct behavior according to ARB_transform_feedback3.
> >
> > The conflictive scenario is activated with this setup:
> > - Enable transform feedback.
> > - Do not associate any varyings with one particular stream (let's call
> >   this stream X).
> > - Have the GS emit a vertex to stream X.
> >
> > ARB_transform_feedback3 clarifies expected behavior in this case:
> >
> > "If the set of varyings selected for transform feedback does not include
> > any belonging to the specified stream, nothing will be recorded when
> > primitives are emitted to that stream, and the corresponding vertex
> > count will be zero."
> >
> > However, we get two possible outcomes with this setup:
> >
> > 1) If the vertex emitted to that stream is not the last vertex emitted
> > by the GS, then primitive count for that stream is incorrect (returns
> > 0), but everything else works ok.
> >
> > I think this behavior is expected as per the IvyBridge documentation:
> >
> > "8.3 Stream Output Function:
> > ...
> > If a stream has no SO_DECL state defined (NumEntries is 0), incoming
> > objects targeting that stream are effectively ignored.
> > As there is no attempt to perform stream output, overflow detection is
> > neither required nor performed."
> >
> > Which means that we can't use SO_PRIMITIVE_STORAGE_NEEDED for the
> > primitive count in this case. We could still use CL_INVOCATION_COUNT for
> > stream 0, but that would not fix the problem for other streams.
> >
> > 2) If the vertex emitted to that stream is the last vertex emitted by
> > the GS, then transform feedback does not work for any stream (no values
> > are recorded in the TF buffers) and primitive queries for all streams
> > return 0. Rendering is okay though: stream 0 outputs are rendered
> > properly and outputs from other streams are discarded. This, I think, is
> > a hardware problem.
> >
> > With this setup, we are configuring the 3DSTATE_SO_DECL_LIST command for
> > stream X like this:
> >
> > Buffer Selects (Stream X) = 0
> > Num Entries (Stream X) = 0
> >
> > that is, that stream writes to no buffers and has no declarations to
> > write, which is correct.
> >
> > Now comes the funny part: simply forcing Num Entries (Stream X) = 1, even
> > if there are no declarations, makes TF and primitive queries work again
> > for all streams but X, and for stream X, primitive count is ok, but TF
> > is not (but that is kind of expected since we are not configuring it
> > properly). Moreover, if I also force Buffer Selects (Stream X) = N (so
> > that N is the index of a disabled TF buffer), then TF also works as
> > expected for Stream X (primitives generated is okay, TF primitives
> > written is 0, and no TF data for that stream is written).
> >
> > It looks like the hardware does not like setups where there are streams
> > that have 0 varyings to record after all, even less so if the last
> > vertex we emit is sent to such a stream.
> >
> > Based on the above, there is a workaround for this but I think it is
> > pretty ugly so I would like to know other people's thoughts on whether
> > it is worth implementing.
> > It would involve the following:
> >
> > In upload_3dstate_streamout() we make sure we disable all transform
> > feedback buffers that are not going to record information (currently a
> > TF buffer is activated as long as the user has called
> > glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, index, bufferName)). We
> > can know if a buffer is not going to be written by inspecting its
> > BufferStride: it should be 0 for buffers that won't get written. I think
> > this is probably good to do in any case.
> >
> > Then the ugly part: in gen7_upload_3dstate_so_decl_list(), if we detect
> > a stream with no varyings bound to it (so num decls is 0) *and* there
> > are disabled TF buffers, we silently set num decls for that stream to 1
> > and set its buffer_mask to write to one of the disabled buffers (it
> > won't actually write because they are disabled).
> >
> > I have a patch for this [1] and it seems to fix the problem (although it
> > only works as long as we have disabled TF buffers available).
> >
> > Opinions? Is there any other alternative to work around this issue?
> >
> > The problem is particularly annoying because I think it hits a very
> > likely scenario: an application using stream 0 for rendering only (no
> > TF) and using other streams to capture TF.
> >
> > Iago
> >
> > [1] Patch:
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > index d2c3ae3..1450dde 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > @@ -189,6 +189,27 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
> >           max_decls = decls[stream_id];
> >     }
> >
> > +   /* We need to inspect if we have streams for which we don't have any
> > +    * varyings to record. The hardware does not handle this scenario well
> > +    * and for TF to work in this case we need to configure such streams to
> > +    * have at least one decl and write to some disabled buffer.
> > +    */
> > +   int disabled_buffer = -1;
> > +   for (int i = 0; i < 4; i++) {
> > +      if (linked_xfb_info->BufferStride[i] == 0) {
> > +         disabled_buffer = i;
> > +         break;
> > +      }
> > +   }
> > +   if (disabled_buffer >= 0) {
> > +      for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
> > +         if (decls[i] == 0) {
> > +            decls[i] = 1;
> > +            buffer_mask[i] = 1 << disabled_buffer;
> > +         }
> > +      }
> > +   }
> > +
> >     BEGIN_BATCH(max_decls * 2 + 3);
> >     OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (max_decls * 2 + 1));
> >
> > @@ -250,9 +271,10 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
> >        dw1 |= SO_REORDER_TRAILING;
> >
> >        for (i = 0; i < 4; i++) {
> > -         if (xfb_obj->Buffers[i]) {
> > -            dw1 |= SO_BUFFER_ENABLE(i);
> > -         }
> > +         if (xfb_obj->Buffers[i] &&
> > +             xfb_obj->shader_program->LinkedTransformFeedback.BufferStride[i] > 0) {
> > +            dw1 |= SO_BUFFER_ENABLE(i);
> > +         }
> >        }
> >
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
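
For anyone who wants to play with the stream-patching step from the first hunk of the quoted patch outside the driver, here is a minimal standalone sketch in C. It is not Mesa code and only mirrors the logic of that hunk: find_disabled_buffer(), patch_empty_streams() and MAX_SO_BUFFERS are hypothetical names made up for illustration, the plain arrays stand in for BufferStride[], decls[] and buffer_mask[], and MAX_VERTEX_STREAMS is assumed to be 4. The uint8_t type for the mask is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch only; the driver has its own definitions. */
#define MAX_VERTEX_STREAMS 4
#define MAX_SO_BUFFERS     4   /* hypothetical name for the 4 TF buffers */

/* Return the index of the first TF buffer whose stride is 0 (that is, a
 * buffer that will never be written), or -1 if every buffer is in use.
 */
static int
find_disabled_buffer(const unsigned stride[MAX_SO_BUFFERS])
{
   for (int i = 0; i < MAX_SO_BUFFERS; i++) {
      if (stride[i] == 0)
         return i;
   }
   return -1;
}

/* For every stream with no declarations, pretend it has one declaration
 * and point it at a disabled buffer, as the quoted patch does.
 * Returns the number of streams that were patched (0 if no disabled
 * buffer is available, in which case nothing is modified).
 */
static int
patch_empty_streams(const unsigned stride[MAX_SO_BUFFERS],
                    int decls[MAX_VERTEX_STREAMS],
                    uint8_t buffer_mask[MAX_VERTEX_STREAMS])
{
   int disabled_buffer = find_disabled_buffer(stride);
   int patched = 0;

   if (disabled_buffer < 0)
      return 0;
   assert(disabled_buffer < MAX_SO_BUFFERS);

   for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
      if (decls[i] == 0) {
         decls[i] = 1;
         buffer_mask[i] = (uint8_t)(1u << disabled_buffer);
         patched++;
      }
   }
   return patched;
}
```

With strides {16, 0, 0, 0} and decls {0, 2, 0, 0} (stream 0 used for rendering only, stream 1 captured into buffer 0), streams 0, 2 and 3 each end up with one decl and a buffer mask selecting buffer 1, while stream 1 is left alone. If no buffer has a stride of 0 the function changes nothing, which is the same limitation mentioned in the mail.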