On 04/21/2016 04:46 PM, Hans de Goede wrote:
Hi,

On 21-04-16 16:28, Ilia Mirkin wrote:
On Thu, Apr 21, 2016 at 9:55 AM, Hans de Goede <hdego...@redhat.com>
wrote:
combineLd/St would combine, i.e. :

st  u32 # g[$r2+0x0] $r2
st  u32 # g[$r2+0x4] $r3

into:

st  u64 # g[$r2+0x0] $r2d

But this is only valid if r2 contains an 8 byte aligned address,
which is unknown.

This commit checks for src0 dim 0 not being indirect when combining
loads / stores as combining indirect loads / stores may break alignment
rules.

I believe the assumption is that all indirect addresses are 16-byte
aligned. This works out for GL, I think. Although hm... I wonder what
happens if you have a

layout (std430) buffer foo {
   int x[16];
}

And you access x[i], x[i+1], and i == 1. I think we end up doing a ton
of size-based validation which might avoid the problem.

My concern is that now constbufs will get the same treatment, and for
constbufs the alignment is always 16 :(

What do you think? Just drop those, or add extra conditionals to allow
it for constbufs?

I'm not sure we've the alignment guarantee for constbufs, IIRC we lower
const buf accesses to be indirect because we want to provide more then 8
UBO-s,
right ? So we read the offset from NVC0_CB_AUX_BUF_INFO and then end up
with e.g.:

Right. This is because the launch descriptor used for compute shaders on kepler only allows to set up 8 CBs. But OpenGL requires at least 14 UBOs, so the logic is to stick UBOs' information into the driver constant buffer.

As you can, we do this dance for all UBOs because it's simpler that testing if an UBO has been described in the launch descriptor or not (so if it's mapped as c1[], c2[], etc).

The lowering pass should properly handle indirect UBO accesses (I did write a piglit test for that and looked at blob). But I'm not sure if we can break alignment here.

Do you have a simple shader that might hit the issue?


ld u64  r2d c7[r1+0x0]

Where r1 contains the offset of the user-buf. But what if the user is
somehow
indirectly accessing the userbuf, then we will have added that indirect
offset
to r1, and we can no longer assume that we can safely merge the loads
without
breaking alignment rules.

I hope I'm making sense here, I'm still a bit unsure about the details how
this all works.

Regards,

Hans

--
-Samuel
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to