On 18.09.2017 17:02, Roland Scheidegger wrote:
This looks like a horrendous solution which will break the world - well
for us :-). Because integers simply will cease to work, always flushed
to zero (bye bye loop counter...).
The reason is that when you translate from something with a untyped
register file to something typed, the obvious solution is to store
everything as floats, and cast to int/uint as needed (if you'd translate
from tgsi back to glsl, you'd probably do it that way as well).
Hence, you must not flush denorms on float to int/uint casts - which,
btw, is also illegal by glsl as far as I can tell ("Returns a signed or
unsigned integer value representing the encoding of a floating-point
value. The floating-point value's bit-level representation is preserved.")
How could you possibly know that the value was a denorm to begin with?
"Any denormalized value input into a shader or potentially generated by
any operation in a shader can be flushed to 0.", which presumably
includes intBitsToFloat :-)
(As a side note, for the same reasons we rely on i2f/u2f "doing the
right thing", not messing with bits, albeit this one isn't guaranteed by
glsl. But I'm quite sure we're not the only ones relying on this, and
this is quite common.)
Yeah, fair enough. You've convinced me not to take this approach.
We could just flush to zero after min/max. Although this should lead to
a different result for min() vs. open-coding the comparison and select.
That feels pretty dirty as well...
I don't know what a proper solution would look like though. FWIW d3d10
permits this min/max behavior (the comparison must use denorm flush, but
the result may be denorm flushed or not, so it's completely ok if you
get the "wrong" result. And given the way glsl is specced wrt denorms
(read: mostly undefined) are you sure it's actually illegal there?
It's one of those borderline cases for GLSL. The GLSL language is
basically what I quoted above, which makes our behavior slightly icky
because we flush the input to zero in one place but not another.
The GLSL ES 3.10 spec has this bit though:
"Should subnormal numbers (also known as 'denorms') be supported?
RESOLUTION: No, subnormal numbers maybe flushed to zero at any time."
... which gives much more leeway. Together with the D3D10 behavior, this
is a good argument for changing the test. I'm going to look into that.
Cheers,
Nicolai
Roland
Am 18.09.2017 um 11:44 schrieb Nicolai Hähnle:
Hi all,
This series fixes an issue we have on GCN due to the arguably inconsistent
handling of denormal floating point values. We disable denormals for 32-bit
floating point numbers, because enabling them would cost a lot of
performance. By and large, GCN floating point instructions flush denormals
to zero. An annoying exception are v_min_f32 and v_max_f32, whose behavior
can be described as:
cc = compare(flushed(src0), flushed(src1))
dst = cc ? src0 : src1;
When both sources are denormals (or 0.0), the comparison will consider the
numbers equal. The instruction will return one of them *without* flushing
to zero, which means that we might end up with the wrong one (depending on
how the operands are ordered).
The way this series fixes the issue is to explicitly cause flush-to-zero
behavior whenever the application may be able to observe a difference, which
is when floats are bit-cast to integers or stored in memory (SSBOs and
32f images, since those images can be bitcast r32i/ui).
Since the backend cannot reliably detect bitcasts, we add a new CANON opcode
which is modeled after LLVM's llvm.canonicalize.* intrinsic. If the driver
requests it, CANON is emitted before bitcasts and stores, which means most
shaders are unaffected (the alternative would be to emit it for bitcasts
*from* integers and for loads, which has a much wider impact).
For GCN, LLVM turns the canonicalize into a multiplication by 1.0. We may
want to teach LLVM about cases where the additional instruction is
unnecessary.
Please review!
Thanks,
Nicolai
--
src/gallium/auxiliary/gallivm/lp_bld_limits.h | 1 +
src/gallium/auxiliary/tgsi/tgsi_exec.h | 1 +
src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h | 2 +-
src/gallium/docs/source/screen.rst | 2 ++
src/gallium/docs/source/tgsi.rst | 18 ++++++++++++++++++
src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
src/gallium/drivers/nouveau/nv30/nv30_screen.c | 2 ++
src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 +
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 +
src/gallium/drivers/r300/r300_screen.c | 2 ++
src/gallium/drivers/r600/r600_pipe.c | 1 +
src/gallium/drivers/radeonsi/si_pipe.c | 1 +
src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 2 ++
src/gallium/drivers/svga/svga_screen.c | 3 +++
src/gallium/drivers/vc4/vc4_screen.c | 1 +
src/gallium/drivers/virgl/virgl_screen.c | 1 +
src/gallium/include/pipe/p_defines.h | 1 +
src/gallium/include/pipe/p_shader_tokens.h | 2 +-
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 31
+++++++++++++++++++++++++++++--
20 files changed, 71 insertions(+), 4 deletions(-)
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=DwIGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0&m=GL1ia61sD0zG05zPlj1qW9ra1gTZ2KBkTNoSBG1cakM&s=o3M_-RsUz-5prZLmL33iu5AaeEChWHU6kf-FtWMliKU&e=
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev