On Thu, Jan 12, 2017 at 4:56 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Thu, Jan 12, 2017 at 7:46 PM, Matt Turner <matts...@gmail.com> wrote:
>> On Thu, Jan 12, 2017 at 3:20 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>> On Thu, Jan 12, 2017 at 6:04 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
>>>> On 12.01.2017 23:46, Ilia Mirkin wrote:
>>>>>
>>>>> On Thu, Jan 12, 2017 at 4:03 PM, Matteo Bruni <matteo.myst...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> So, what would be really nice to have is a GLSL extension for some
>>>>>> kind of switch to select the requested behavior WRT NaN. For example a
>>>>>> three-way option with "don't generate NaN in arithmetic operations",
>>>>>> "do generate NaN" and "don't care". It could also be a GL state if
>>>>>> that's easier to implement with the existing hardware, since an
>>>>>> individual application isn't supposed to require different behavior
>>>>>> from one shader to the next.
>>>>>>
>>>>>> Is anyone interested in / favorable to something like this? It would
>>>>>> solve the issue with defining NaN behavior in GLSL while making things
>>>>>> a bit more compatible with "other API a lot of games are ported from
>>>>>> which happens to be supported by all the desktop GPUs".
>>>>>
>>>>>
>>>>> Not that I'm biased, but on the NVIDIA Tesla series (G80-GT21x), this
>>>>> enable is handled via a global flag, not in the shader binary, so this
>>>>> is all-or-nothing for a whole pipeline. On GF100+, I believe there is
>>>>> also an enable via a global flag, but there are also a FMUL.FMZ (and
>>>>> FFMA.FMZ) flag, which I *think* has the same effect. So for GF100+ hw,
>>>>> this could be done at the instruction level.
>>>>
>>>>
>>>> Well, I would also have advocated for what is effectively a
>>>> per-program/pipeline flag anyway, even though GCN hardware can theoretically
>>>> do it per-instruction. Tracking a per-instruction bit in the compiler
>>>> quickly becomes fragile (e.g. there's no good way for us to model this
>>>> information per-instruction in LLVM IR). Per-shader isn't any better than
>>>> per-instruction due to linking, and per-shader-stage is awkward if we ever
>>>> want to do fancier cross-stage optimizations.
>>>>
>>>> It's really quite simple. Introduce an extension with a name like
>>>> MESA_shader_float_dx9. The behavior I'd suggest is:
>>>>
>>>> Enabling/requiring the extension in a shader causes various semantics
>>>> changes to bring floating point behavior in line with DX9 in that shader's
>>>> code:
>>>>
>>>> - 0*x = 0
>>>
>>> Yes. But only for fp32, not for fp64.
>>>
>>>> - sqrt/rsqrt are guaranteed to take the absolute value of their argument
>>>
>>> Is that necessary? If the software knows about the ext, it also knows
>>> to stick the abs() in.
>>
>> Is there a compelling reason to make the extension offer just one of
>> these many behavior differences?
>>
>> FWIW, i965 has IEEE and "ALT" floating-point modes. ALT, I think
>> corresponds to d3d9 behavior, and its description says
>>
>> A floating-point execution mode that maps +/- inf to +/- fmax, +/-
>> denorm to +/-0, and NaN to +0 at the FPU inputs and never produces
>> infinities, denormals, or NaN values as outputs.
>
> Interesting. I believe on NVIDIA hardware, it's just float multiply
> that's affected.
>
>>
>> Also: Extended mathematics functions of log(), rsq() and sqrt() take
>> the absolute value of the sources before computation to avoid
>> generating INF and NaN results.
>>
>> If those two behaviors correspond to d3d9 behavior, I wouldn't want an
>> extension that offered only the "zero wins" behavior and expected
>> applications to insert abs().
>
> Really? That creates ARB_gpu_shader5-style extensions which do 75
> different things and that you can't expose if you can only do 74 of
> them. I think in the past we've avoided things like having "d3d9 mode"
> in gallium API's - it's nice for these things to be individually
> enumerated. I like the direction that e.g. ARB_clip_control went in -
> make it all configurable individually instead of bundling unrelated
> things together. This has allowed e.g. dolphin to do things in OpenGL
> that are impossible on DX. And whether 0 * x = 0 or not seems rather
> unrelated from whether rsq takes abs of its args.

Definitely agree. Sorry about i965 :)

I think we should figure out what behaviors D3D9 actually wants. i965's
ALT mode maps ±inf to ±fmax on input. If D3D9 wants that, we should
probably include it in the spec.

Also, if the extension is written in a way that isn't doable on i965, I
think we're just wasting time. i965 is the only driver that cannot use
st/nine. :)