Yeah, so it looks like the *implementation* tests do something like:
    vec2 32 ssa_4 = intrinsic load_ubo (ssa_3, ssa_1) () ()
    vec4 32 ssa_5 = tg4 ssa_2 (coord), ssa_4 (offset), 3 (gather_component), 0 (texture) 0 (sampler)
The offset (ssa_4) comes from a load_ubo rather than a constant, which
means that we don't hit the immediate path. I'll hack so
Ah OK. But presumably the gather4_po variant is chosen when the
constant offsets are outside the -8..7 range, since I didn't get any
failures, and the test suite has checks around impl-defined
min/maxes... Could be that all those were the non-const variants.
Looking through the brw_fs code, I indee
The HW limits here are -8/7 when using the gather4 message. [gather4_po
allows -32/31, with the offsets specified per channel.]
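
To make the selection discussed above concrete, here is a minimal sketch of how a backend might pick between the two messages. It assumes, as the thread suggests but does not confirm, that gather4 is used only for constant offsets within -8..7 and that gather4_po covers everything else (non-constant offsets, or constants up to -32..31). All names below are hypothetical and are not the identifiers used in brw_fs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not the i965 backend's own. */
enum gather_msg {
    GATHER_MSG_GATHER4,    /* offset encoded as an immediate in the message */
    GATHER_MSG_GATHER4_PO  /* offset supplied per channel in the payload */
};

/* Ranges as quoted in the thread. */
#define GATHER4_IMM_MIN (-8)
#define GATHER4_IMM_MAX 7
#define GATHER4_PO_MIN  (-32)
#define GATHER4_PO_MAX  31

static bool
in_range(int v, int lo, int hi)
{
    return v >= lo && v <= hi;
}

/*
 * Pick the message for a gather with a 2D offset.
 * offset_is_const is false when the offset comes from, e.g., a UBO load,
 * in which case off_x/off_y are unknown at compile time and ignored.
 */
static enum gather_msg
choose_gather_msg(bool offset_is_const, int off_x, int off_y)
{
    if (!offset_is_const)
        return GATHER_MSG_GATHER4_PO;

    /* Assumption: a constant offset must at least fit the gather4_po range. */
    assert(in_range(off_x, GATHER4_PO_MIN, GATHER4_PO_MAX));
    assert(in_range(off_y, GATHER4_PO_MIN, GATHER4_PO_MAX));

    if (in_range(off_x, GATHER4_IMM_MIN, GATHER4_IMM_MAX) &&
        in_range(off_y, GATHER4_IMM_MIN, GATHER4_IMM_MAX))
        return GATHER_MSG_GATHER4;

    return GATHER_MSG_GATHER4_PO;
}
```

Under these assumptions, a test that feeds the offset through a UBO always lands on the gather4_po path, which would explain why no failures showed up around the -8..7 boundary.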
On Mon, Nov 28, 2016 at 10:49 AM, Ilia Mirkin wrote:
> This matches what NVIDIA and AMD hardware expose.
>
> Signed-off-by: Ilia Mirkin
> ---
>
> Not sure what the true HW limit is
This matches what NVIDIA and AMD hardware expose.
Signed-off-by: Ilia Mirkin
---
Not sure what the true HW limit is here. On NVIDIA, the true HW limit really
is -32/31 though. As an aside, according to vulkan.gpuinfo.org, the Intel
Windows driver also exposes -32/31.
With the updated limits on