On Mon, Dec 14, 2015 at 8:57 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Mon, Dec 14, 2015 at 11:44 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>> On Mon, Dec 14, 2015 at 8:03 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>> On Mon, Dec 14, 2015 at 10:50 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>>>> On Mon, Dec 14, 2015 at 4:10 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>>>> Hello,
>>>>>
>>>>> I was going to add support for the various volatile/etc annotations
>>>>> in my gallium impl (which is fairly far along), but it seems like
>>>>> those are currently being dropped on the floor by
>>>>> lower_ubo_references, and NIR has no support for them either. Both
>>>>> in GLSL IR and NIR, the variable is what holds the access
>>>>> annotation (volatile, etc). However, the ssbo intrinsics are all
>>>>> about a particular binding and offset, without any reference to
>>>>> the variable.
>>>>>
>>>>> What's the right way of handling this? (And do any tests exist
>>>>> that would be sensitive to getting such things wrong?)
>>>>
>>>> First off, why is it that way? Well, because most of the IRs in
>>>> mesa don't have a memory model capable of handling these sorts of
>>>> things. Those that do (LLVM is the only one I'm aware of) can't
>>>> handle the GL packing rules. The result is that we basically have
>>>> to lower everything away to byte-offset load/store intrinsics.
>>>>
>>>> What do we do about it? My inclination would be to either add two
>>>> new intrinsics for load/store_ssbo_volatile or to add a new
>>>> constant boolean "volatile" parameter to the current intrinsics.
>>>> If a load/store happens on a volatile thing, you get the volatile
>>>> version, otherwise you get the plain version. Then backends can
>>>> know that they are free to reorder, combine, etc. non-volatile
>>>> load/store operations as per their memory model and the provided
>>>> barriers. If they encounter a volatile load/store, they can't do
>>>> anything with it and just have to do the memory op.
>>>>
>>>> Is that reasonable?
>>>
>>> Well, I don't really care much about the NIR side of it, but I think
>>> at the very least coherent and volatile need to be exposed. I don't
>>> much care about restrict, that's a bit too fancy for me, but no harm
>>> in passing it through. TBH I don't get the point of
>>> readonly/writeonly. FWIW this is what the nvidia hw is capable of:
>>> http://docs.nvidia.com/cuda/parallel-thread-execution/#cache-operators
>>> (I know it's PTX, but the actual hw is basically the same.)
>>>
>>> I think that volatile = cv, coherent = cg, and everything else =
>>> ca/cs based on some clever heuristic (aka I'll always pick one of
>>> those until a situation where that doesn't work well arises).
>>
>> Right. I don't actually know what we can do with our HW. I was
>> mostly thinking about what optimization passes can or cannot do with
>> a variable. It sounds like, whatever we do, you want it passed
>> through somehow. How about we just add a little well-defined
>> bitfield as an extra argument to the SSBO load/store intrinsics? The
>> fact that it gets applied to a variable or a struct or a member or
>> whatever is kind of a moot point to the backend compiler.
>
> An extra argument with a mask or enum is precisely what I was
> expecting to find when looking at the lower_ubo_references pass, and I
> was rather surprised to see that the whole thing had been overlooked.
Seems reasonable to me.  However, Intel hardware doesn't really give
you that fine-grained control, so we didn't care.  Sounds like someone
should add it.  Have fun!

--Jason
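
For concreteness, here is a rough C sketch of the kind of access-qualifier
bitfield being discussed. Every name in it (the enum, the flags, the helper
functions) is a hypothetical placeholder made up for illustration; none of
this is an existing NIR or GLSL IR API.

#include <stdbool.h>

/* Hypothetical access-qualifier flags.  The idea is that a lowering pass
 * such as lower_ubo_references would read the qualifiers off the variable
 * and attach the resulting mask to the SSBO load/store intrinsic as an
 * extra constant argument.
 */
enum ssbo_access_flags {
   SSBO_ACCESS_VOLATILE  = 1 << 0,
   SSBO_ACCESS_COHERENT  = 1 << 1,
   SSBO_ACCESS_RESTRICT  = 1 << 2,
   SSBO_ACCESS_READONLY  = 1 << 3,
   SSBO_ACCESS_WRITEONLY = 1 << 4,
};

/* Fold the per-variable qualifiers into a constant mask for the intrinsic. */
static unsigned
ssbo_access_mask(bool is_volatile, bool coherent, bool restrict_qual,
                 bool readonly, bool writeonly)
{
   unsigned mask = 0;
   if (is_volatile)
      mask |= SSBO_ACCESS_VOLATILE;
   if (coherent)
      mask |= SSBO_ACCESS_COHERENT;
   if (restrict_qual)
      mask |= SSBO_ACCESS_RESTRICT;
   if (readonly)
      mask |= SSBO_ACCESS_READONLY;
   if (writeonly)
      mask |= SSBO_ACCESS_WRITEONLY;
   return mask;
}

/* A backend could then map the mask onto its cache controls, e.g. the
 * PTX-style mapping suggested above: volatile -> cv, coherent -> cg,
 * everything else -> ca (or cs, per whatever heuristic the driver likes).
 */
static const char *
pick_load_cache_op(unsigned mask)
{
   if (mask & SSBO_ACCESS_VOLATILE)
      return "cv";
   if (mask & SSBO_ACCESS_COHERENT)
      return "cg";
   return "ca";
}

Since the mask would be a compile-time constant on the intrinsic, a backend
with no use for the distinction (as noted above for Intel) can simply ignore
it, while optimization passes only need to check the volatile bit before
reordering or combining non-volatile loads/stores.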