On Sat, Mar 12, 2016 at 08:44:54AM -0800, Jason Ekstrand wrote: > On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" > <[1]topi.pohjolai...@intel.com> wrote: > > > > On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote: > > > On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen > > > <[1][2]topi.pohjolai...@intel.com> wrote: > > > > > > The logic iterates over param[] which contains pointers to > > > uniform storage set during linking (see > > > link_assign_uniform_locations()). > > > The pointers are unique and it should be impossible to find > > > matching entries. > > > I couldn't find any regressions with test system. In addition > > > I tried several benchmarks on HSW and none hit this. > > > I'm hoping to remove this optimization attempt. This is the > only > > > bit that depends on knowing about the actual storage during > > > compilation. All the rest deal with just relative push and > pull > > > locations once the actual filling of pull_param[] is moved > > > outside the compiler just as param[]. (Filling pull_param is > > > based on the pull locations and doesn't need to be inside the > > > compiler). > > > Any thoughts? > > > > > > I'm not 100% sure what you're trying to do, but I have a branch > that > > > may be of interest: > > > > [2][3]https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-unif > orm > > > s > > > The branch enables support for pushing small uniform arrays. > Among > > > other things, it redoes the way we do push constants and gets > rid of > > > some of the data tracking in the backend compiler. The big > reason why > > > I haven't tried too hard to get it merged is because it > regresses Sandy > > > Bridge just a bit. I know I've seen and fixed the bug before in > an > > > alternate attempt, but I don't remember how. > > > I'm going to be refreshing it soon because we need indirect push > > > constants for the Vulkan driver. (The branch is already merged > into > > > the Vulkan branch.) > > > > I'd like to stop filling param[] before compilation. This is really > not > > needed by the compiler as it deals with pull and push constant > locations, > > i.e., positions in the push and pull files. Actual uniform values and > their > > location in the uniform storage are not needed until actual pipeline > upload. > > > > My plan is to move the iteration over the core uniform storage to > pipeline > > upload time. We can fill push and pull buffers directly without the > need of > > storing pointers to param[] in the middle. Not only makes this things > simpler > > and more flexible in my mind, does it give us the possibility to > upload > > floats with 16-bit precision instead of 32-bits. Current upload logic > only > > gets pointers to 32-bit values without knowing if they should be > represented > > with 16 bits let alone whether the values are floats or integers to > begin > > with. > > Right. Kristian and I have talked about some related things that we > need for pipeline caching and the Vulkan driver. In Vulkan, they > aren't actual pointers at all but are, instead, offsets into a push > constant block. Fortunately, the back-end compiler never dereferences > them so you can shove whatever you want in there and it's OK. We've > talked about turning the pull and push params into just a set of > integers that means whatever the api and state setup code want. One of > the problems with pointers is that you can't easily put them into an > on-disk shader cache (which we have for Vulkan). > > When you talk about 16 or 64-bit values, what is your intention? Are > 64-bit values still going to take up two slots or are they now one > 64-bit slot? Are there two 16-bit values per slot or just one? Are > 16-bit uniforms converted before they get uploaded or consumed directly > by the shader? I'm still a little confused as to what problem you're > trying to solve.
I'm seeing the 16-bit float as two-fold. First, the uniform storage always represents them as normal 32-bit floats for the gl-api to work correctly (even if they are marked as low/mediump I don't think the api for setting and querying them is allowed operate with reduced precision. On the other hand, such conversion back and forth in the core gl-api doesn't sound appealing at all just from implementation point of view). Therefore after the compiler has chosen to represent a particular uniform with reduced precision and set the operand types accordingly, the upload logic has to convert the 32-bit float into equivalent 16-bit value before uplaoding. Second is the question on how to pack the 16-bit values. I'm seeing this as second step of the optimization where we allow more individual components in the push file, two 16-bits components in one 32-bit slot. But as you are saying yourself this maybe a lot of work/complexity with unknown gains in performance. With 64-bit doubles we don't need to anything particular as both the upload logic and the core uniform storage agree on required space - both take two 32-bit slots. > > One thing to think about as you work on this is that Vulkan doesn't > have individual uniforms but instead has a block of explicit push > constants. In the shader, the push constants are specified with > explicit offsets into that block similar to a UBO. The result is that > it's very difficult for the state setup code to know what size anything > is. Just chopping the push constant space into 32-bit hunks that the > compiler is free to rearrange is terribly convenient. We could use 16 > or 8-bit chunks just as easily but having some be 32-bit, others > 64-bit, and others 16-bit has the potential to get very painful. > > Food for thought. Maybe I'm completely missing what your trying to do. Thanks Jason, I've been worrying if I'm conflicting with vulkan, so really good for you to give me a heads up. I've been hoping to keep my changes in the glsl-specific parts in the backend compiler. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev