On 14.04.2013 23:44, Alex Deucher wrote:
> On Sun, Apr 14, 2013 at 2:36 PM, Marek Olšák <mar...@gmail.com> wrote:
>> The R600 ISA documentation only says that the DX10 variants of MIN and MAX
>> use DX10 handling of NaNs. It does not say anything about the non-DX10
>> variants.
>
> The difference is the NaN behavior. The dx10 versions of MIN/MAX are
> NaN safe.
Yes, but what does that mean for the non-dx10 versions? What do they return
if one argument is a NaN? Obviously it can't just be random, otherwise you
could always use the dx10 version...
Roland

> There are also DX10 and non-DX10 versions of the SET*
> opcodes. The difference there is in the result:
>
> SETE       A == B ? 1.0 : 0.0
> SETE_DX10  A == B ? -1 : 0
> etc.
>
> Alex
>
>>
>> Marek
>>
>>
>> On Sun, Apr 14, 2013 at 8:16 PM, Roland Scheidegger <srol...@vmware.com>
>> wrote:
>>>
>>> On 14.04.2013 18:39, Marek Olšák wrote:
>>>> On Sun, Apr 14, 2013 at 5:24 PM, Roland Scheidegger <srol...@vmware.com>
>>>> wrote:
>>>>
>>>>     On 14.04.2013 10:12, jfons...@vmware.com wrote:
>>>>     > - TBD
>>>>     > + Start an IF ... ELSE .. ENDIF block. Condition evaluates to
>>>>     > + true if
>>>>     > +
>>>>     > +   src0.x != 0.0
>>>>     > +
>>>>     > + where src0.x is interpreted as a floating point register.
>>>>     Maybe this should say something wrt evaluation of NaNs? I know we
>>>>     haven't really established rules for comparisons etc. wrt NaNs but
>>>>     those bools-as-float make me cry. I guess it is no different though
>>>>     than other float opcodes, if we now really have a definition saying
>>>>     IF takes _any_ float, not just a bool-as-float, which was loosely
>>>>     implied before.
>>>>
>>>>
>>>> I don't know where the term "bool-as-float" came from, but I'd rather
>>>> not use it unless it's properly defined somewhere, and TGSI doesn't have
>>>> bools anyway, so why bother? The GLSL compiler or glsl-to-tgsi is
>>>> responsible for converting bools to either floats or ints and TGSI
>>>> shouldn't need to care. Both r300g and r600g use (src0.x != 0.0) for IF
>>>> and (src0.x != 0) for UIF (r600-only), so there is always the
>>>> "not-equal-to" operator, which is also well defined for NaNs.
>>> That depends on your definition of "well defined". llvm for instance has
>>> both "ordered not equal" and "unordered not equal" operators for
>>> precisely this reason. But yes, I guess ieee-754 has some defined
>>> behavior there.
>>> That "bool-as-float" essentially comes from state trackers, because the
>>> languages they are translating from require bools as "if" inputs - hence
>>> the input value always should have been the result of some comparison
>>> (or similar) operation (which in turn returns these fake bools).
>>> But I agree this was never really documented, so just clearly stating
>>> you can pass in any float is just fine (it means that state trackers now
>>> are explicitly allowed to omit the comparison for simple cases like this
>>> one, "if(a != 0)...", well, if they can detect it; it was not really
>>> obvious without documentation before if that would be ok). So in that
>>> sense nothing more needs to be said about NaNs, since they just adhere
>>> to the same rules as in other places (meaning pretty much undefined for
>>> most things, currently).
>>>
>>>>
>>>> Also if you care about NaNs, we should start by defining how
>>>> instructions should handle them, e.g. how relational operators handle
>>>> NaNs, whether the multiplication operator follows the rule 0*anything =
>>>> 0 (MUL, MAD, DP4, ...), etc.
>>>>
>>>> R600 has separate opcodes depending on what behavior you want, for
>>>> example:
>>>> - The MUL opcode follows the rule 0*anything = 0. (DX9)
>>>> - The MUL_IEEE opcode follows the IEEE behavior.
>>>>
>>>> The other opcodes with both the DX9 and IEEE behavior are: MAD, DP4,
>>>> EX2, LG2, RCP, RSQ. There are also separate MIN and MAX opcodes for DX9
>>>> and DX10. We should choose our opcodes carefully depending on whether we
>>>> are implementing a DX9, DX10, OpenGL, or OpenCL state tracker.
>>>
>>> Yes indeed. d3d10 has quite strict rules which are mostly ieee754 (or
>>> ieee754r) but with some deviations. Other specs tend to be more lenient,
>>> and requiring strict rules could add quite some overhead, so we might
>>> want to introduce additional opcodes. How does MIN/MAX work for dx9 btw?
>>> DX10 will require you to give back the non-NaN value if only one
>>> argument is NaN (which seems to be ieee754r behavior). Unfortunately,
>>> that doesn't translate well to sse2 code: sse2 will just give you the
>>> second source if there's a NaN in either src, which means you have to
>>> use cmp/select instead and be careful about which comparison you use
>>> there, since the cpu doesn't support the full set of "ordered" and
>>> "unordered" comparisons unless you've got avx (though presumably llvm
>>> would take care of that if you use the right comparison ops there).
>>>
>>> Roland
>>
>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
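
For illustration, the two MIN behaviours contrasted in the thread can be
sketched in scalar C. This is only a sketch under stated assumptions: the
function names min_sse_style, min_dx10_style and check_min_variants are
hypothetical and are not Mesa, gallivm or R600 API, and plain C comparisons
stand in for the real SSE instructions and the cmp/select sequence. It does
not answer what the non-DX10 R600 MIN returns, which the thread leaves open.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: mimics the SSE min rule described in the thread.
 * "a < b" is false whenever either input is NaN, so the second source
 * is returned in that case. */
static float min_sse_style(float a, float b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: DX10-style MIN as described in the thread.
 * If exactly one input is NaN, the other (non-NaN) input is returned.
 * If both are NaN, the result is NaN. */
static float min_dx10_style(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}

/* Small self-check showing where the two variants differ. */
static void check_min_variants(void)
{
    const float n = nanf("");

    /* NaN in the first source: both variants return the second source. */
    assert(min_sse_style(n, 1.0f) == 1.0f);
    assert(min_dx10_style(n, 1.0f) == 1.0f);

    /* NaN in the second source: the SSE-style variant propagates the NaN,
     * the DX10-style variant returns the non-NaN value. */
    assert(isnan(min_sse_style(1.0f, n)));
    assert(min_dx10_style(1.0f, n) == 1.0f);
}
```

The DX10-style variant needs the explicit NaN tests because a single ordered
comparison cannot tell "b is smaller" apart from "one input is NaN". For the
one-NaN case, C99 fminf() gives the same result as min_dx10_style.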