Thank you for your comments.

On Fri, 7 Aug 2020, Richard Biener wrote:

Conversions look like
.FENV_CONVERT (arg, (target_type*)0, 0)
The pointer is there so we know the target type, even if the lhs
disappears at some point. The last 0 is the same as for all the others: a
place to store options about the operation (do we care about rounding,
about exceptions, etc.); it is just a placeholder for now. I could rename
it to .FENV_NOP since we seem to generate NOP usually, but that looked
strange to me.

You could carry the info in the existing flags operand if you make that a
pointer ...

Ah, true, I forgot that some other trees already use this kind of trick.
Not super pretty, but probably better than an extra argument.

Adding some info missing above from reading the patch.

The idea seems to be to turn FP operations like PLUS_EXPR, FLOAT_EXPR
but also (only?) calls to BUILT_IN_SQRT to internal functions named
IFN_FENV_* where the internal function presumably has some extra
information.

Sqrt does seem to have a special place in IEEE 754, and in practice some
targets have instructions (with rounding) for it.

You have

+/* float operations with rounding / exception flags.  */
+DEF_INTERNAL_FN (FENV_PLUS, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_MINUS, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_MULT, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_RDIV, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_FLOAT, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_CONVERT, ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (FENV_SQRT, ECF_LEAF | ECF_NOTHROW, NULL)

so with -fnon-call-exceptions they will not be throwing (but regular
FP PLUS_EXPR would).

Hmm, ok, I guess I should remove ECF_NOTHROW then; the priority should
be correctness, and we can carefully reintroduce optimizations later.

They will appear to alter memory state - that's probably to have the
extra dependence on FENV changing/querying operations but then why do you
still need to emit asm()s?

The IFNs are for GIMPLE and represent the operations, while the asms are simple pass-throughs for RTL; I replace the former with the latter (plus the regular operation) at expansion.

I suppose the (currently unused) flags parameter could be populated with
some known FP ENV state and then limited optimization across stmts
with the same non-zero state could be done?

I was mostly thinking of storing information like:
* don't care about the rounding mode for this operation
* may drop exceptions produced by this operation
* may produce extra exceptions
* don't care about signed zero
* may contract into FMA
* don't care about errno (for sqrt?)
etc

With fenv_round, we would actually have to store the rounding mode of
the operation (upward, towards-zero, dynamic, don't-care, etc), a bit
less nice because 0 is not a safe fallback anymore. We could also store
it when we detect a call to fesetround before, but we have to be careful
that this doesn't result in even more calls to fesetround at expansion
for targets that do not have statically rounded operations.

If there are other, better things to store there, great.

Using internal function calls paints us a bit into a corner since they are still
subject to the single-SSA def restriction in case we'd want to make FENV
dataflow more explicit.  What's the advantage of internal functions compared
to using asms for the operations themselves if we wrap this class into
a set of "nicer" helpers?

I wanted the representation in GIMPLE to look reasonably nice, so it would be both easy to read in the dumps and not too hard to write optimizations for, and a function call looked good enough. Making FENV dataflow explicit would mean having PHIs for FENV, etc.? At most, I thought FENV would be represented by one specific memory region, which in particular would not alias user variables of type float or double.

I don't really see what it would look like with asms and helpers. In some sense, the IFNs are already wrappers that we unwrap at expansion. Your asms would take some FENV as input and output, so we would have to track which FENV to use where, similar to .MEM.

One complication with tracking data-flow is "unknown" stuff, I'd suggest
to invent a mediator between memory state and FP state which would
semantically be load and store operations of the FP state from/to memory.

All I can think of is to make the FP state a particular variable in memory, and to teach alias analysis that those functions only read/write that variable. What do you have in mind, splitting operations as:

fenv0 = read_fenv()
(res, fenv1) = oper(arg0, arg1, fenv0)
store_fenv(fenv1)

so that "oper" itself is const? (and hopefully simplifying consecutive read_fenv/store_fenv pairs so there are fewer of them) I wonder if lying about the constness of the operation may be problematic.

(and asm would be abused as a way to return a pair, with hopefully some marker so we know it isn't a real asm)

That said, you're the one doing the work and going with internal functions
is reasonable - I'm not sure to what extent optimization for FENV access
code will ever be possible (or wanted/expected).  So going more precise
might not have any advantage.

I think some optimizations are expected. For instance, not having to re-read the same number from memory many times just because there was an addition in between (which could write to fenv but that's it). Some may still want FMA (with a consistent rounding direction). For those (like me) who usually only care about rounding and not exceptions, making the operations pure would be great, and nothing says we cannot vectorize those rounded operations!

I am trying to be realistic with what I can achieve, but if you think the IFNs would paint us into a corner, then we can drop this approach.

You needed to guard SQRT - will you need to guard other math functions?
(round, etc.)

Maybe, but probably not many. I thought I might have to guard all of them (sin, cos, etc), but IIRC Joseph's comment seemed to imply that this wouldn't be necessary. I am likely missing FMA now...

If we need to keep the IFNs use memory state they will count towards
walk limits of the alias oracle even if they can be disambiguated against.
This will affect both compile-time and optimizations.

Yes...

+  /* Careful not to end up with something like X - X, which could get
+     simplified.  */
+  if (!skip0 && already_protected (op1))

we're already relying on RTL not optimizing (x + 0.5) - 0.5, but since
that would involve association, the simple X - X case might indeed
be optimized (but wouldn't that be a bug if it is not correct?)

Indeed we do not currently simplify X-X without -ffinite-math-only. However, I am trying to be safe, and whether we can simplify or not is something that depends on each operation (what the pragma said at that point in the source code), while flag_finite_math_only is at best per function.

--
Marc Glisse