On Thu, 2015-10-22 at 09:09 -0700, Jason Ekstrand wrote: > On Thu, Oct 22, 2015 at 4:21 AM, Iago Toral Quiroga <ito...@igalia.com> wrote: > > I implemented this first as a separate optimization pass in GLSL IR [1], but > > Curro pointed out that this being pretty much a restricted form of a CSE > > pass > > it would probably make more sense to do it inside CSE (and we no longer have > > a CSE pass in GLSL IR). > > > > Unlike other things we CSE in NIR, in the case of SSBO loads we need to make > > sure that we invalidate previous entries in the set in the presence of > > conflicting instructions (i.e. SSBO writes to the same block and offset) or > > in the presence of memory barriers. > > > > If this is accepted I intend to extend this to also cover image reads, which > > follow similar behavior. > > > > No regressions observed in piglit or dEQP's SSBO functional tests. > > > > [1] http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html > > I think you've gotten enough NAK's that I don't need to chime in > there. Unfortunately, solving this in general is something of a > research project that both Connor and I have been thinking about for > quite some time. I've been thinking off-and-on about how to add a > proper memory model to lower_vars_to_ssa for almost a year now and > still haven't come up with a good way to do it. I don't know whether > SSBO's would be simpler or not. We need a proper memory model for > both lower_vars_to_ssa and SSBO load/stores (and shared local > variables) but it's a substantial research project. > > This isn't to say that you couldn't do it. Just know what you're taking on. > ;-)
Yeah, it does not make sense that I try to do this, you guys have clearly given this much more thought than me and know much better how a solution for this would fit in NIR than me. > That said, here's a suggestion for something that we *could* write > today, wouldn't be very hard, and wold solve a decent number of cases. > > For each block: > > 1) Create a new instruction set (don't use anything from any previous blocks) > 2) call add_or_rewrite on all ssbo load operations > 3) If you ever see a barrier or ssbo store, destroy the entire > instruction set and start again. Yep, this is what I was thinking for the load-combine pass that Connor suggested. However, I think that in this case we do not need to destroy the entire set when we find a store, only for memory barriers, right? I mean, there should be nothing preventing us from checking the offset/block of the store and compare it with the offset/block of the loads in the set to decide which ones we need to remove (like I was doing in my last patch) > This is something you could put together fairly quickly and would > handle a fair number of cases. With a little special casing, you may > also be able to handle store and then an immediate load of the same > value or duplicate stores. Anything much more complex than that is > going to take a lot more thought. Yes, I'll give this a try next. Thanks for all the comments and suggestions! Iago _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev