On Mon, Aug 14, 2017 at 12:45 PM, Pekka Jääskeläinen <pe...@parmance.com> wrote:
> Hi Richard,
>
> The base idea of the patch is to optimize for the (common) situation
> where FTZ/DAZ is controlled by a CPU-wide flag and we then need to only
> avoid compile-time optimizations that assume semantics where denorm
> handling is on to support the ‘forced FTZ/DAZ semantics’.
>
>> This suggests only outputs are flushed to zero?  OTOH documentation
>> for X * 1 -> X suggests otherwise.  This simplification also suggests to
>> make FTZ operations explicit instead of adding a flag?  Thus the BRIG
>> FE would emit FTZ (X) * 1 which we can optimize to FTZ (X), and we
>> could eventually add a pass optimizing FTZ operations?
>
> Both the inputs and outputs must be flushed to zero in the HSAIL’s
> ‘ftz’ semantics.
> FTZ operations were previously always “explicit” in the BRIG FE output,
> like you propose here; there were builtin calls injected for all inputs
> and the output of ‘ftz’-marked float HSAIL instructions.  This is still
> provided as a fallback for targets which do not support a CPU mode flag.

I see.  But how does making them implicit fix cases in the conformance
testsuite?  That is, isn't the error in the runtime implementation of
__hsail_ftz_*?  I'd have used a "simple"

  if (fpclassify (x) == FP_SUBNORMAL)
    return copysign (0, x);

> The problem with a special FTZ ‘operation’ of some kind in the generic
> output is that the basic optimizations get confused by a new operation
> and we’d need to add knowledge of the ‘FTZ’ operation to a bunch of
> existing optimizer code, which seems unnecessary to support this case
> as the optimizations typically apply also for the ‘FTZ semantics’ when
> the FTZ/DAZ flag is on.

Apart from the exceptions you needed to guard ... do you have an example
of a transform that is confused by explicit FTZ and that would be valid
if that FTZ were implicit?  An explicit FTZ should be much safer.

I think the builtins should also be CONST and not only PURE.

Richard.

> Thanks,
> Pekka
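
For reference, the "simple" flush suggested above could be sketched roughly as follows. This is only an illustration, not the actual __hsail_ftz_* runtime code: the names ftz_f32 and ftz_mul_f32 are hypothetical, and the wrapper just shows the "flush all inputs and the output" scheme described for the explicit fallback.

```c
#include <math.h>
#include <float.h>

/* Hypothetical helper (not the real __hsail_ftz_* builtin): flush a
   subnormal float to a zero of the same sign, pass everything else
   through unchanged.  */
static float
ftz_f32 (float x)
{
  if (fpclassify (x) == FP_SUBNORMAL)
    return copysignf (0.0f, x);
  return x;
}

/* Hypothetical example of an 'ftz'-marked multiply with explicit
   flushes: both inputs and the output are flushed.  */
static float
ftz_mul_f32 (float a, float b)
{
  return ftz_f32 (ftz_f32 (a) * ftz_f32 (b));
}
```

A helper of this shape reads no memory and depends only on its argument, which is the property the CONST suggestion relies on.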