On Fri, Aug 09, 2013 at 07:54:11AM +0200, Michel Dänzer wrote: > On Don, 2013-08-08 at 11:32 -0700, Tom Stellard wrote: > > On Thu, Aug 08, 2013 at 05:36:09PM +0200, Michel Dänzer wrote: > > > On Don, 2013-08-08 at 08:00 -0700, Tom Stellard wrote: > > > > On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote: > > > > > > > > > > diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c > > > > > b/src/gallium/drivers/radeonsi/si_state_draw.c > > > > > index 746ace6..4208fa7 100644 > > > > > --- a/src/gallium/drivers/radeonsi/si_state_draw.c > > > > > +++ b/src/gallium/drivers/radeonsi/si_state_draw.c > > > > > @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context > > > > > *ctx, struct si_pipe_shader *s > > > > > /* Last 2 reserved SGPRs are used for VCC */ > > > > > num_sgprs = num_user_sgprs + 2; > > > > > } > > > > > + num_sgprs += 1; /* XXX this fixes VM faults */ > > > > > > > > One problem is that the compiler is under reporting the number of SGPRs, > > > > when there are unused USER_SGPRs in the shader. It should always be > > > > reporting a number greater than or equal to the number of USER_SGPRs. > > > > > > That itself shouldn't be a problem, as the radeonsi code ensures this. > > > > > > > The way radeonsi checks for this on pixel shaders won't work in all > > cases. For example, if there are 8 USER_SGPR and the shader uses VCC, > > then the backend should report 10 SGPRs used. This value is eventually > > rounded up to the next multiple of 8, so what is actually stored in > > shader->num_sgprs is 16. > > > > Now lets say that the shader doesn't see 2 of the USER_SGPRs, it will > > report 8 SGPRs (6 USER_SGPRs + 2 for VCC). This value is rounded up to > > the nearest multiple of 8, which is still 8, and stored in > > shader->num_sgprs. > > > > Since the number of SGPRs reported matches the number of USER_SGPRs, > > the code in si_pipe_shader_ps() will accepted the reported number of > > SGPRs as valid, even though it isn't. > > Why isn't it valid? In your example, the shader doesn't use more than 8 > SGPRs, and doesn't use SGPR6 and SGPR7 for anything but VCC, right? >
My understanding is that USER_SGPRs count towards the total number of SGPRs even if they are not used, so VCC would go into SGPR8 and SGPR9. > I guess we could change the tests to ((num_user_sgprs + 2) > num_sgprs) > for extra safety, but I haven't seen or been able to come up with any > specific scenario where the current tests wouldn't be good enough. > The checks as they are now don't actually check anything, since num_sgprs is always greater than or equal to num_user_sgprs,. I think the best think to do is move all this logic into the compiler backend. > And I'm worried that changing the tests might just paper over the > problem for which Marek added the line above, which we don't really > understand yet. > My hypothesis is that that hangs are caused by the problem I described above. -Tom > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Debian, X and DRI developer > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev