Oh, I just noticed there was some problem with the first reply I sent and Chris and Steven got out of the CC in my reply, so adding them back now.
Steven: the summary is that on IvyBridge Kenneth's second patch produces a GPU hang in some cases so if you are going to test it try this version better: https://bugs.freedesktop.org/attachment.cgi?id=102118 Iago On Wed, 2014-07-02 at 10:09 +0200, Iago Toral wrote: > On Wed, 2014-07-02 at 08:49 +0200, Iago Toral wrote: > > Hello Kenneth, Steven: > > > > On Tue, 2014-07-01 at 17:25 -0700, Kenneth Graunke wrote: > > > Previously, we only emitted 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST > > > when transform feedback was active. When it was inactive, we disabled > > > the SOL stage in 3DSTATE_SOL so the other state would be ignored. > > > > > > In commit 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5, Iago enabled the SOL > > > stage universally, so we could implement the GL_PRIMITIVES_GENERATED > > > statistics counter. This caused every Piglit test to trigger assertions > > > in the simulator, and apparently caused GPU hangs for some users. > > > > > > Apparently, we're supposed to program 3DSTATE_SO_DECL_LIST to zero when > > > output streams are inactive, but we hadn't been doing so. Now that SOL > > > is on, we need to do that properly. > > > > If that is the case then I guess the workaround I was trying to use for > > that hardware issue I mentioned in another e-mail may not be any good, > > since it required to configure a non-zero 3DSTATE_SO_DECL_LIST while > > outputting to a disabled stream... :( > > My workaround patch works here in IvyBridge and keeps working with your > patch since it only activates when TF is active, but since it relies on > this stuff I think it would be best if someone tests it on other > hardware when it gets reviewed. > > > I never get any GPU hangs though... > > > > > Experimentally, it looks like we also need to program 3DSTATE_SO_BUFFER > > > to zero as well, or else I get many GPU hangs on Haswell. > > > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > > Cc: Iago Toral Quiroga <ito...@igalia.com> > > > Cc: Chris Forbes <chr...@ijw.co.nz> > > > Cc: Steven Newbury <st...@snewbury.org.uk> > > > --- > > > src/mesa/drivers/dri/i965/brw_state.h | 3 ++- > > > src/mesa/drivers/dri/i965/gen7_sol_state.c | 34 > > > ++++++++++++++++++++++-------- > > > src/mesa/drivers/dri/i965/gen8_sol_state.c | 2 +- > > > 3 files changed, 28 insertions(+), 11 deletions(-) > > > > > > Iago, > > > > > > I noticed that pretty much every Piglit test broke in our simulation > > > environment after your patch to turn on SOL by default. Something about > > > SO_DECLs not being right. This patch seems to fix the issue. Does it > > > look reasonable to you? > > > > The patch is producing GPU hangs for me in the case where I have a > > geometry shader and no TF, so it seems that it introduces other > > problems. I'll see if I can find the reason for this. > > It looks like we have to send at least one decl per stream even if we > are configuring the number of declarations per stream to 0, otherwise we > get this GPU hang on IvyBridge at least. > > This seems to contradict clearly the documentation for IvyBridge. From > 8.5 3DSTATE_SO_DECL_LIST Command, dw2, bits 7:0, Num Entries[0]: > > "It is legal to specify Num Entries = 0 for all four streams > simultaneously. In this case there will be no SO_DECLs included in the > command (only DW 0-2). Note that all Stream to Buffer Selects bits must > be zero in this case (as no streams produce output). > > Which is exactly what Kenneth's patch does, but that produces a GPU hang > here. > > At the same time, there is this other statement at the beginning of that > same chapter: > > "Errata: All 128 decls for all four streams must be included whenever > this command is issued. The “Num Entries [n]” fields still contain the > actual numbers of valid decls." > > Not sure if that sentence is the fix to a previous errata or the errata > itself... but in any case, we are never sending all 128 decls but it > looks like at least on IvyBridge we have to send one or we get a GPU > hang. > > Kenneth, following the above, this change to your patch fixes the GPU > hang I was running into: > > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c > b/src/mesa/drivers/dri/i965/gen7_sol_state.c > index 2013406..6d1b93c 100644 > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c > @@ -108,8 +108,10 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw, > * all zero (no buffers written to) and the corresponding Num Entries > * field to zero (no valid SO_DECLs)." > */ > - BEGIN_BATCH(3); > - OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (3 - 2)); > + BEGIN_BATCH(5); > + OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (5 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); > OUT_BATCH(0); > OUT_BATCH(0); > ADVANCE_BATCH(); > > Steven, can you try Kenneth's patch with this change too? > > Iago > > > > Steven, > > > > > > Does this fix your GPU hangs on Ivybridge? > > > > Steven, can you provide more info on the use case that is triggering > > your GPU hang (specifically the setup for transform feedback and whether > > you are using a geometry shader or just a vertex shader, etc)? > > > > I have IvyBridge so I can try to reproduce your setup and see if I get > > the GPU hang here too. > > > > Iago > > > > > Thanks! > > > --Ken > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_state.h > > > b/src/mesa/drivers/dri/i965/brw_state.h > > > index c52a977..2c1fc87 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_state.h > > > +++ b/src/mesa/drivers/dri/i965/brw_state.h > > > @@ -237,7 +237,8 @@ void gen7_init_vtable_surface_functions(struct > > > brw_context *brw); > > > > > > /* gen7_sol_state.c */ > > > void gen7_upload_3dstate_so_decl_list(struct brw_context *brw, > > > - const struct brw_vue_map *vue_map); > > > + const struct brw_vue_map *vue_map, > > > + bool active); > > > > > > /* gen8_surface_state.c */ > > > void gen8_init_vtable_surface_functions(struct brw_context *brw); > > > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c > > > b/src/mesa/drivers/dri/i965/gen7_sol_state.c > > > index eccd5a5..e2eb334 100644 > > > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c > > > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c > > > @@ -36,7 +36,7 @@ > > > #include "main/transformfeedback.h" > > > > > > static void > > > -upload_3dstate_so_buffers(struct brw_context *brw) > > > +upload_3dstate_so_buffers(struct brw_context *brw, bool active) > > > { > > > struct gl_context *ctx = &brw->ctx; > > > /* BRW_NEW_TRANSFORM_FEEDBACK */ > > > @@ -56,7 +56,7 @@ upload_3dstate_so_buffers(struct brw_context *brw) > > > uint32_t start, end; > > > uint32_t stride; > > > > > > - if (!xfb_obj->Buffers[i]) { > > > + if (!xfb_obj->Buffers[i] || !active) { > > > /* The pitch of 0 in this command indicates that the buffer is > > > * unbound and won't be written to. > > > */ > > > @@ -96,10 +96,28 @@ upload_3dstate_so_buffers(struct brw_context *brw) > > > */ > > > void > > > gen7_upload_3dstate_so_decl_list(struct brw_context *brw, > > > - const struct brw_vue_map *vue_map) > > > + const struct brw_vue_map *vue_map, > > > + bool active) > > > { > > > - struct gl_context *ctx = &brw->ctx; > > > + if (!active) { > > > + /* If inactive, disable all SO outputs. > > > + * > > > + * From the Ivybridge PRM, Volume 2, Part 1, page 202 > > > + * (3DSTATE_SO_DECL_LIST packet definition), DWord 1, bits 3:0: > > > + * "Note: For 'inactive' streams, software must program this field > > > to > > > + * all zero (no buffers written to) and the corresponding Num > > > Entries > > > + * field to zero (no valid SO_DECLs)." > > > + */ > > > + BEGIN_BATCH(3); > > > + OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (3 - 2)); > > > + OUT_BATCH(0); > > > + OUT_BATCH(0); > > > + ADVANCE_BATCH(); > > > + return; > > > + } > > > + > > > /* BRW_NEW_TRANSFORM_FEEDBACK */ > > > + struct gl_context *ctx = &brw->ctx; > > > struct gl_transform_feedback_object *xfb_obj = > > > ctx->TransformFeedback.CurrentObject; > > > const struct gl_transform_feedback_info *linked_xfb_info = > > > @@ -289,11 +307,9 @@ upload_sol_state(struct brw_context *brw) > > > /* BRW_NEW_TRANSFORM_FEEDBACK */ > > > bool active = _mesa_is_xfb_active_and_unpaused(ctx); > > > > > > - if (active) { > > > - upload_3dstate_so_buffers(brw); > > > - /* BRW_NEW_VUE_MAP_GEOM_OUT */ > > > - gen7_upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out); > > > - } > > > + upload_3dstate_so_buffers(brw, active); > > > + /* BRW_NEW_VUE_MAP_GEOM_OUT */ > > > + gen7_upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out, active); > > > > > > /* Finally, set up the SOL stage. This command must always follow > > > updates to > > > * the nonpipelined SOL state (3DSTATE_SO_BUFFER, > > > 3DSTATE_SO_DECL_LIST) or > > > diff --git a/src/mesa/drivers/dri/i965/gen8_sol_state.c > > > b/src/mesa/drivers/dri/i965/gen8_sol_state.c > > > index ebcdaf8..ff85bf2 100644 > > > --- a/src/mesa/drivers/dri/i965/gen8_sol_state.c > > > +++ b/src/mesa/drivers/dri/i965/gen8_sol_state.c > > > @@ -157,7 +157,7 @@ upload_sol_state(struct brw_context *brw) > > > if (active) { > > > gen8_upload_3dstate_so_buffers(brw); > > > /* BRW_NEW_VUE_MAP_GEOM_OUT */ > > > - gen7_upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out); > > > + gen7_upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out, > > > active); > > > } > > > > > > gen8_upload_3dstate_streamout(brw, active, &brw->vue_map_geom_out); > > > > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev