On 2/23/20 5:57 PM, Ilia Mirkin wrote:
> ---
>
> We talked about something like this a while back, but the end result
> was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> But it'd be nice for wine to be able to control this too.
>
> I couldn't actually find any evidence of the discussion from 2017 or so,
> so ... let's have another one.
>
>  docs/specs/MESA_ieee_fp_alu_mode.spec | 136 ++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>  create mode 100644 docs/specs/MESA_ieee_fp_alu_mode.spec
>
> diff --git a/docs/specs/MESA_ieee_fp_alu_mode.spec b/docs/specs/MESA_ieee_fp_alu_mode.spec
> new file mode 100644
> index 00000000000..cb274f06571
> --- /dev/null
> +++ b/docs/specs/MESA_ieee_fp_alu_mode.spec
> @@ -0,0 +1,136 @@
> +Name
> +
> +    MESA_ieee_fp_alu_mode
> +
> +Name Strings
> +
> +    GL_MESA_ieee_fp_alu_mode
> +
> +Contact
> +
> +    Ilia Mirkin, ilia 'at' x.org
> +
> +IP Status
> +
> +    No known IP issues.
> +
> +Status
> +
> +    Proposed
> +
> +Version
> +
> +Number
> +
> +    TBD
> +
> +Dependencies
> +
> +    OpenGL 3.0 or OpenGL ES 3.0 is required.
> +
> +    The extension is written against the OpenGL GL 3.0 and OpenGL ES 3.0
> +    specifications.
> +
> +Overview
> +
> +    Pre-GL3 hardware did not generally have full IEEE floating point operation
> +    support. Among other things, 0 * Infinity would work out to 0, and NaN's
> +    might not be generated, or otherwise be treated improperly. GL3-class and
> +    later hardware introduced full IEEE FP support, including NaN, Infinity,
> +    and the proper generation of these.
> +
> +    Some software targeted at older hardware makes assumptions about how the
> +    shader ALU works. And to accomodate these, GL3-class hardware has a way to
> +    change how the shader ALU behaves. There are no standards around this, and
> +    different hardware has different ways of dealing with it. However these
> +    modes were designed specifically with such older software in mind.
> +
> +    This extension introduces a way to configure a context to be in non-IEEE
> +    ALU mode. This extension does not specify precisely what this means, as
> +    each vendor has something different. Generally it means non-IEEE compliant
> +    handling of multiplication, as well as any other unspecified changes.
I think many of the other things are specified. They're the non-IEEE
behaviors of GL_ARB_vertex_program and GL_ARB_fragment_program, and
those mimic the required behavior of early DX shader models. There are
a bunch of cases that specify that zero is generated when IEEE would
require NaN.

If there's just a small handful of things like this, we'd probably be
better off adding a couple of new built-in functions to do the job. The
problem on Intel hardware is... we really, really don't want to switch
to non-IEEE mode because it changes how a bunch of things work, and we
haven't tested any of that in many years. I'd much rather put in some
kind of work-arounds for things that don't want multiplication or pow()
to generate NaN.

As for the mechanism, I'm very strongly in favor of something that
would be locked in when the shader is compiled. I really want to avoid
any potential that an external glEnable could trigger a recompile.

The more I think about it... having an extension that adds a handful of
built-in functions that give old shader model behavior would be a good
idea. We could even test it. :) I've looked at a lot of shaders, and
I've seen a lot of not-quite-what-they-wanted methods for avoiding NaN
behavior in a bunch of these functions. Having a special version of
inversesqrt() that returns FLT_MAX for 0 would be useful to a lot of
users. As part of the spec we could even provide canonical versions of
the functions so that users could copy-and-paste

    #ifndef GL_MESA_foo
    float inversesqrt_nonIEEE(float x)
    {
        ...
    }
    #endif

> +
> +New Tokens
> +
> +    Accepted by the <cap> parameter of Enable, Disable, and IsEnabled, by
> +    the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv, and
> +    GetDoublev:
> +
> +        IEEE_FP_ALU_MODE_MESA                      0x????
> +
> +
> +Changes to GLSL Section 4.1.4 Floats:
> +
> +    Add the following paragraph:
> +
> +    In case that the shader is being executed in a context with
> +    IEEE_FP_ALU_MODE_MESA disabled, multiplication shall produce the following
> +    (non-IEEE-complaint) result:
> +
> +        float a = 0;
> +        float b = Infinity;
> +        float c = a * b; // c == 0
> +
> +    There may be other implications from this mode being enabled, including
> +    clamping of non-finite values, or anything else the hardware mode happens
> +    to enable to achieve compatibility.
> +
> +New State
> +
> +    (add to table 6.52, Miscellaneous, p.392)
> +
> +                                               Initial
> +    Get Value              Type   Get Command  Value    Description         Sec.    Attribute
> +    ---------------------  -----  -----------  -------  ------------------  ------  ---------
> +    IEEE_FP_ALU_MODE_MESA  B      IsEnabled    TRUE     Whether shader ALU          enable
> +                                                        is in IEEE FP mode
> +
> +
> +Issues
> +
> +    (1) This specification does not precisely specify what non-IEEE FP mode is.
> +
> +      RESOLVED. Shipping hardware has different ways of dealing with it. For
> +      example, Intel clamps all values. NVIDIA Tesla series has a
> +      context-wide mode for controlling whether zero wins in multiplication
> +      or follows IEEE rules. NVIDIA Fermi+ series as well as ATI/AMD Radeon
> +      R600+ has separate opcodes which control this (but again, a different
> +      set of operations are covered).
> +
> +      A single extension which is going to be easy to use for emulation
> +      software is thus much harder to write if it's to precisely specify
> +      this.
> +
> +      The applications that want these have already been written and tested
> +      against these approaches, so we know they all work with whatever the
> +      hardware has to offer.
> +
> +    (2) Why use an Enable instead of a shader layout token?
> +
> +      RESOLVED. Because some hardware implementations don't allow
> +      controlling this on a per-stage level.
> +      While one could come up with
> +      rules requiring linked program stages to have the same setting, this
> +      is going to be extra validation for the implementations to
> +      implement. Furthermore, one would want these rules to also apply to
> +      fixed-function-generated shaders equally. Instead a simple mode should
> +      be able to flip this on and off.
> +
> +    (3) What about FP denorms?
> +
> +      RESOLVED. The same hardware tends to also have a way to control
> +      whether denorm FP values are flushed to zero. GLSL does not specify
> +      this explicitly, but some software relies on denorms being
> +      flushed. Should there be a desire to allow denorms to work, this can
> +      be done by another extension.
> +
> +    (4) What is the expected usage for this?
> +
> +      RESOLVED. Software which enables older games to operate,
> +      e.g. emulators, will now be able to do shader translation without
> +      copious checks for these "error" conditions.
> +
> +
> +Revision History
> +
> +    Revision 1, ilia, 2020-02-23
> +    - Initial draft