On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" <topi.pohjolai...@intel.com> wrote: > > On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote: > > On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen > > <[1]topi.pohjolai...@intel.com> wrote: > > > > The logic iterates over param[] which contains pointers to > > uniform storage set during linking (see > > link_assign_uniform_locations()). > > The pointers are unique and it should be impossible to find > > matching entries. > > I couldn't find any regressions with test system. In addition > > I tried several benchmarks on HSW and none hit this. > > I'm hoping to remove this optimization attempt. This is the only > > bit that depends on knowing about the actual storage during > > compilation. All the rest deal with just relative push and pull > > locations once the actual filling of pull_param[] is moved > > outside the compiler just as param[]. (Filling pull_param is > > based on the pull locations and doesn't need to be inside the > > compiler). > > Any thoughts? > > > > I'm not 100% sure what you're trying to do, but I have a branch that > > may be of interest: > > [2] https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniform > > s > > The branch enables support for pushing small uniform arrays. Among > > other things, it redoes the way we do push constants and gets rid of > > some of the data tracking in the backend compiler. The big reason why > > I haven't tried too hard to get it merged is because it regresses Sandy > > Bridge just a bit. I know I've seen and fixed the bug before in an > > alternate attempt, but I don't remember how. > > I'm going to be refreshing it soon because we need indirect push > > constants for the Vulkan driver. (The branch is already merged into > > the Vulkan branch.) > > I'd like to stop filling param[] before compilation. This is really not > needed by the compiler as it deals with pull and push constant locations, > i.e., positions in the push and pull files. Actual uniform values and their > location in the uniform storage are not needed until actual pipeline upload. > > My plan is to move the iteration over the core uniform storage to pipeline > upload time. We can fill push and pull buffers directly without the need of > storing pointers to param[] in the middle. Not only makes this things simpler > and more flexible in my mind, does it give us the possibility to upload > floats with 16-bit precision instead of 32-bits. Current upload logic only > gets pointers to 32-bit values without knowing if they should be represented > with 16 bits let alone whether the values are floats or integers to begin > with.
Right. Kristian and I have talked about some related things that we need for pipeline caching and the Vulkan driver. In Vulkan, they aren't actual pointers at all but are, instead, offsets into a push constant block. Fortunately, the back-end compiler never dereferences them so you can shove whatever you want in there and it's OK. We've talked about turning the pull and push params into just a set of integers that means whatever the api and state setup code want. One of the problems with pointers is that you can't easily put them into an on-disk shader cache (which we have for Vulkan). When you talk about 16 or 64-bit values, what is your intention? Are 64-bit values still going to take up two slots or are they now one 64-bit slot? Are there two 16-bit values per slot or just one? Are 16-bit uniforms converted before they get uploaded or consumed directly by the shader? I'm still a little confused as to what problem you're trying to solve. One thing to think about as you work on this is that Vulkan doesn't have individual uniforms but instead has a block of explicit push constants. In the shader, the push constants are specified with explicit offsets into that block similar to a UBO. The result is that it's very difficult for the state setup code to know what size anything is. Just chopping the push constant space into 32-bit hunks that the compiler is free to rearrange is terribly convenient. We could use 16 or 8-bit chunks just as easily but having some be 32-bit, others 64-bit, and others 16-bit has the potential to get very painful. Food for thought. Maybe I'm completely missing what your trying to do. --Jason
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev