On Wednesday, July 02, 2014 10:09:49 AM Iago Toral wrote: > On Wed, 2014-07-02 at 08:49 +0200, Iago Toral wrote: > > Hello Kenneth, Steven: > > > > On Tue, 2014-07-01 at 17:25 -0700, Kenneth Graunke wrote: > > > Previously, we only emitted 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST > > > when transform feedback was active. When it was inactive, we disabled > > > the SOL stage in 3DSTATE_SOL so the other state would be ignored. > > > > > > In commit 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5, Iago enabled the SOL > > > stage universally, so we could implement the GL_PRIMITIVES_GENERATED > > > statistics counter. This caused every Piglit test to trigger assertions > > > in the simulator, and apparently caused GPU hangs for some users. > > > > > > Apparently, we're supposed to program 3DSTATE_SO_DECL_LIST to zero when > > > output streams are inactive, but we hadn't been doing so. Now that SOL > > > is on, we need to do that properly. > > > > If that is the case then I guess the workaround I was trying to use for > > that hardware issue I mentioned in another e-mail may not be any good, > > since it required to configure a non-zero 3DSTATE_SO_DECL_LIST while > > outputting to a disabled stream... :( > > My workaround patch works here in IvyBridge and keeps working with your > patch since it only activates when TF is active, but since it relies on > this stuff I think it would be best if someone tests it on other > hardware when it gets reviewed. > > > I never get any GPU hangs though... > > > > > Experimentally, it looks like we also need to program 3DSTATE_SO_BUFFER > > > to zero as well, or else I get many GPU hangs on Haswell. > > > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > > Cc: Iago Toral Quiroga <ito...@igalia.com> > > > Cc: Chris Forbes <chr...@ijw.co.nz> > > > Cc: Steven Newbury <st...@snewbury.org.uk> > > > --- > > > src/mesa/drivers/dri/i965/brw_state.h | 3 ++- > > > src/mesa/drivers/dri/i965/gen7_sol_state.c | 34 ++++++++++++++++++++++-------- > > > src/mesa/drivers/dri/i965/gen8_sol_state.c | 2 +- > > > 3 files changed, 28 insertions(+), 11 deletions(-) > > > > > > Iago, > > > > > > I noticed that pretty much every Piglit test broke in our simulation > > > environment after your patch to turn on SOL by default. Something about > > > SO_DECLs not being right. This patch seems to fix the issue. Does it > > > look reasonable to you? > > > > The patch is producing GPU hangs for me in the case where I have a > > geometry shader and no TF, so it seems that it introduces other > > problems. I'll see if I can find the reason for this. > > It looks like we have to send at least one decl per stream even if we > are configuring the number of declarations per stream to 0, otherwise we > get this GPU hang on IvyBridge at least.
Interesting. I'm not seeing any documentation indicating that we should have to, but it seems like a fairly reasonable thing to do... > This seems to contradict clearly the documentation for IvyBridge. From > 8.5 3DSTATE_SO_DECL_LIST Command, dw2, bits 7:0, Num Entries[0]: > > "It is legal to specify Num Entries = 0 for all four streams > simultaneously. In this case there will be no SO_DECLs included in the > command (only DW 0-2). Note that all Stream to Buffer Selects bits must > be zero in this case (as no streams produce output). Yeah - that's what I was looking at. It /ought/ to work (and seems to on Haswell). > Which is exactly what Kenneth's patch does, but that produces a GPU hang > here. > > At the same time, there is this other statement at the beginning of that > same chapter: > > "Errata: All 128 decls for all four streams must be included whenever > this command is issued. The “Num Entries [n]” fields still contain the > actual numbers of valid decls." > > Not sure if that sentence is the fix to a previous errata or the errata > itself... but in any case, we are never sending all 128 decls but it > looks like at least on IvyBridge we have to send one or we get a GPU > hang. This is a documentation mistake. Looking at the latest docs, that errata clearly only applies to early pre-production hardware. Unfortunately, when they published the PRM, they stripped out the stepping information, and didn't strip out the entire errata. :( > Kenneth, following the above, this change to your patch fixes the GPU > hang I was running into: > > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c > index 2013406..6d1b93c 100644 > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c > @@ -108,8 +108,10 @@ gen7_upload_3dstate_so_decl_list(struct brw_context *brw, > * all zero (no buffers written to) and the corresponding Num Entries > * field to zero (no valid SO_DECLs)." > */ > - BEGIN_BATCH(3); > - OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (3 - 2)); > + BEGIN_BATCH(5); > + OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (5 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); > OUT_BATCH(0); > OUT_BATCH(0); > ADVANCE_BATCH(); > > Steven, can you try Kenneth's patch with this change too? > > Iago Cool, I can squash that in. Thanks!
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev