On 10.10.2016 05:45, Dave Airlie wrote:
On 10 October 2016 at 13:25, Dave Airlie <airl...@gmail.com> wrote:
From: Dave Airlie <airl...@redhat.com>

This is a bit of a hack due to how llvm currently handles
spilling in it's shader ABI. Currently llvm amdgpu backend
uses relocations to patch the shader with the address of
the tmpring. The driver loads the shader and patches the
relocations.

However for vulkan this doesn't work so well for a few reasons
a) when we build/load the shaders we aren't constructing the
command stream yet, and the same shader could be using in multiple
command streams.

b) multiple command execution engines for compute shaders.

So ideally we'd fix LLVM to understand the ABI convention, possibly
we'd fix it so user sgpr 0,1 are used (this hack uses 10/11).

This patch when it gets the shader back from llvm it patches
the relocation dword to a nop, and patches to previous mov command
to move from SGPR 10 and 11. This works usually as it seems the
SGPR loading of the spill stuff is at the start of shaders always
so the 10/11 user sgprs haven't been trashed yet. I'm not 100%
sure this will work all the time, but for now this should allow
us to pass a bunch more CTS tests and make the Sascha computeshader
demo to work.

So I found a shader that this doesn't work so well with unfortunately, so
while I'd like this as a temporary solution I probably need to start
digging into llvm.

My current plan is to add a flag to llvm to denote ability to spill to
the address
in userdata sgpr 0/1, have llvm preload those and use them instead.

Now the other question I have is, should I be killing two user data sgprs for
this purpose, or should we define a better ABI, so that the first descriptor
in the buffer pointed to by these is the scratch buffer, and other things
could be queued after it, (like push constants and dynamic descriptor).

Bas, nha? not sure if Matt is on this list.

Getting rid of the relocations would be nice for OpenGL as well. It always feels like a bit of a hack.

For OpenGL, it's important that non-monolithic shaders work. In practice, this means that it must be possible to pass the scratch buffer pointer through the prolog shader into the main part.

It seems to me the simplest way to ensure this is to add an explicit function argument. Something like:

define amdgpu_XX TYPE @prolog(i64 inreg %spillptr, ....) "amdgpu-spill-ptr"="0" {
   ...
   %retval = insertvalue TYPE %prev, i64 %spillptr, 0
   ret TYPE %retval
}

amdgpu-spill-ptr would be a function attribute indicating which function argument (by position) holds the base pointer for spilling.

I also think that loading the spill pointer indirectly makes a lot of sense. After all, user data is scarce and most shaders don't spill. This could easily fit in the function attribute scheme. Something like "amdgpu-spill-ptr=1:16" to indicate that the second function argument is a 64-bit pointer, and the spill pointer should be loaded from there with a 16 byte offset.

FWIW, even with the indirect load, I think we should only load a 64-bit pointer instead of a full descriptor. This is because we should really set the size limit of the spill descriptor correctly to implement robustness semantics (this is relevant because the spill descriptor is also used for temporary arrays; not sure if Vulkan cares, but OpenGL definitely does).

Cheers,
Nicolai
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to